CN118153931A - Data processing method, device, equipment and storage medium


Publication number
CN118153931A
CN118153931A CN202410257003.7A CN202410257003A CN118153931A CN 118153931 A CN118153931 A CN 118153931A CN 202410257003 A CN202410257003 A CN 202410257003A CN 118153931 A CN118153931 A CN 118153931A
Authority
CN
China
Prior art keywords
node
job
nodes
scheduling
flow chart
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410257003.7A
Other languages
Chinese (zh)
Inventor
王颖
马冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd
Priority to CN202410257003.7A
Publication of CN118153931A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0633 Workflow analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a data processing method, apparatus, device and storage medium, which can be applied in the technical field of computer applications. The method comprises: obtaining a batch graph file, wherein the batch graph file comprises M job nodes and M is a positive integer greater than or equal to 1; for each of the M job nodes, parsing a target attribute of the job node to obtain node data corresponding to the job node; generating a directed acyclic graph according to the node data of the M job nodes; and, in response to an operation requesting an upstream scheduling flow chart and/or a downstream scheduling flow chart of a target node, generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to the node data of the target node and the directed acyclic graph.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of computer application technologies, and in particular, to a data processing method, apparatus, device, medium, and program product.
Background
Control-M is cross-platform batch job scheduling management software. Its functions include automatically scheduling and submitting jobs according to business logic; monitoring and analyzing the running state and results of jobs in real time; and automatically performing post-processing of a job based on its result.
In the related art, enterprise-level centralized job scheduling management is typified by the batch job scheduling tool Control-M. However, the upstream and downstream scheduling flow chart function that Control-M provides for a given node requires the operator to manually page up and down from the current node, or to manually jump to the next or previous node along the link direction. This makes it hard for operators to see the job scheduling process at a glance, which hinders production and daily operation and maintenance in batch processing scenarios.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a data processing method, apparatus, device, medium, and program product.
According to a first aspect of the present disclosure, there is provided a data processing method comprising: obtaining a batch graph file, wherein the batch graph file comprises M job nodes and M is a positive integer greater than or equal to 1; for each of the M job nodes, parsing a target attribute of the job node to obtain node data corresponding to the job node; generating a directed acyclic graph according to the node data of the M job nodes; and, in response to an operation requesting an upstream scheduling flow chart and/or a downstream scheduling flow chart of a target node, generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to the node data of the target node and the directed acyclic graph.
According to an embodiment of the present disclosure, node data of a job node includes an input condition and an output condition; generating the directed acyclic graph from the node data of the M job nodes includes: and generating a directed acyclic graph according to the association relation between the input condition and the output condition of each of the M job nodes.
According to an embodiment of the present disclosure, the above method further includes: simplifying the operation nodes and links in the directed acyclic graph according to a preset rule to obtain a simplified directed acyclic graph, wherein the links comprise physical connecting lines between the two operation nodes; the generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to the node data of the target node and the directed acyclic graph comprises the following steps: and generating an upstream scheduling flow chart and/or a downstream scheduling flow chart of the target node according to the node data of the target node and the simplified directed acyclic graph.
According to an embodiment of the present disclosure, simplifying the job nodes and links in the directed acyclic graph according to a preset rule to obtain the simplified directed acyclic graph includes: for each job node in the directed acyclic graph, when it is determined that the job node contains at least two concurrently scheduled jobs, selecting a representative node from the at least two job nodes that execute the concurrently scheduled jobs; hiding the non-representative nodes, that is, those of the at least two job nodes that are not the representative node; detecting duplicate conditions of each job node in the directed acyclic graph; selecting a representative link from at least two links that carry a duplicate condition; and hiding the non-representative links, that is, those of the at least two links that are not the representative link.
According to an embodiment of the present disclosure, the above method further includes: in response to a scheduling time display operation for a target node, calculating the time taken to execute from the starting node to the target node in the directed acyclic graph, to obtain the scheduling time of the target node; and displaying the scheduling time of the target node on a display interface.
According to an embodiment of the present disclosure, calculating the time taken to execute from the starting node to the target node in the directed acyclic graph to obtain the scheduling time of the target node includes: calculating the link weight values of all links between the starting node and the target node; and determining the scheduling time of the target node according to the link weight values of all the links.
According to an embodiment of the present disclosure, determining the scheduling time of the target node according to the link weight values of all the links includes: when at least two job flows exist between the starting node and the target node, determining, for each job flow, a job flow weight value according to the links contained in that job flow, to obtain at least two job flow weight values; and determining the maximum of the at least two job flow weight values as the scheduling time of the target node.
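The maximum-over-job-flows rule can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the function name `scheduling_time`, the `links` dictionary layout, and the idea that a job flow's weight is the sum of its link weights are all assumptions made for illustration, and the graph is assumed to be acyclic.

```python
from functools import lru_cache

def scheduling_time(links, start, target):
    """Largest total link weight over all job flows from start to target.

    links: dict mapping (upstream, downstream) -> weight (hypothetical layout).
    Returns None when target cannot be reached from start.
    """
    # Group incoming links by their downstream node.
    incoming = {}
    for (upstream, downstream), weight in links.items():
        incoming.setdefault(downstream, []).append((upstream, weight))

    @lru_cache(maxsize=None)
    def longest(node):
        if node == start:
            return 0
        best = None
        for upstream, weight in incoming.get(node, []):
            partial = longest(upstream)
            if partial is not None:
                candidate = partial + weight
                if best is None or candidate > best:
                    best = candidate
        return best

    return longest(target)

# Two job flows reach D: A->B->D (3 + 4) and A->C->D (5 + 1).
example_links = {("A", "B"): 3, ("A", "C"): 5, ("B", "D"): 4, ("C", "D"): 1}
print(scheduling_time(example_links, "A", "D"))  # 7
```

Memoising on the node keeps the work proportional to the number of links even when many job flows share a common prefix.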
According to an embodiment of the present disclosure, the above method further includes: when the number of job nodes in the upstream scheduling flow chart and/or the downstream scheduling flow chart is greater than a preset node display number, hiding the job nodes that are child nodes to obtain an optimized upstream scheduling flow chart and/or downstream scheduling flow chart; and displaying the optimized upstream scheduling flow chart and/or the optimized downstream scheduling flow chart on a display interface.
According to an embodiment of the present disclosure, the above method further includes: when the number of levels of the upstream scheduling flow chart and/or the downstream scheduling flow chart is greater than a preset number of levels, optimizing the job nodes in the upstream scheduling flow chart and/or the downstream scheduling flow chart according to their level relationship in a preset optimization manner, to obtain an optimized upstream scheduling flow chart and/or downstream scheduling flow chart; and displaying the optimized upstream scheduling flow chart and/or the optimized downstream scheduling flow chart on a display interface.
According to an embodiment of the present disclosure, the above method further includes: in response to a loop detection operation for the directed acyclic graph, traversing the output conditions and input conditions of all job nodes starting from the starting node of the directed acyclic graph; determining that a loop exists in the directed acyclic graph when the output condition of the current job node is found to be the input condition of a job node that precedes the current job node; and displaying the current job node on a display interface.
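One possible reading of this loop check is sketched below in Python. The depth-first traversal, the name `find_loop_node`, and the `{"in": ..., "out": ...}` node layout are assumptions for illustration only; the disclosure does not prescribe a traversal strategy here.

```python
def find_loop_node(nodes, start):
    """Return the job node whose output condition feeds a node already on the
    current traversal path, or None when no loop is reachable from start.

    nodes: dict name -> {"in": set of conditions, "out": set of conditions}
    (hypothetical layout).
    """
    def successors(name):
        outs = nodes[name]["out"]
        # A node is a successor when it consumes any condition produced here.
        return [other for other, data in nodes.items() if outs & data["in"]]

    path, finished = [], set()

    def visit(name):
        path.append(name)
        for nxt in successors(name):
            if nxt in path:
                return name  # output of `name` is the input of an earlier node
            if nxt not in finished:
                found = visit(nxt)
                if found is not None:
                    return found
        path.pop()
        finished.add(name)
        return None

    return visit(start)

looped = {
    "J1": {"in": {"c3"}, "out": {"c1"}},
    "J2": {"in": {"c1"}, "out": {"c2"}},
    "J3": {"in": {"c2"}, "out": {"c3"}},
}
print(find_loop_node(looped, "J1"))  # J3
```

The returned node corresponds to the "current job node" that would be shown on the display interface.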
According to an embodiment of the present disclosure, the above method further includes: when the batch graph file is not in a preset file format, acquiring a format file corresponding to the format of the batch graph file, wherein the format file defines the format of the batch graph file; and parsing the target attribute of the job node to obtain the node data corresponding to the job node includes: parsing the target attribute of the job node according to the format file to obtain the node data corresponding to the job node.
A second aspect of the present disclosure provides a data processing apparatus comprising: the first acquisition module is used for acquiring a batch graph file, wherein the batch graph file comprises M operation nodes, and M is a positive integer more than or equal to 1; the analysis module is used for analyzing the target attribute of the operation node aiming at each operation node in the M operation nodes to obtain node data corresponding to the operation node; the first generation module is used for generating a directed acyclic graph according to node data of M operation nodes; and the second generation module is used for responding to the operation of generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node, and generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to the node data and the directed acyclic graph of the target node.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory storing one or more computer programs, wherein the one or more processors execute the one or more computer programs to implement the steps of the above method.
A fourth aspect of the present disclosure also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the above method.
According to the data processing method, apparatus, device, medium and program product provided by the present disclosure, a batch graph file comprising M job nodes is obtained, and for each of the M job nodes the target attribute of the job node is parsed to obtain the corresponding node data; a directed acyclic graph is then generated from the node data of the M job nodes; and, in response to an operation requesting the upstream scheduling flow chart and/or the downstream scheduling flow chart of a target node, the requested flow chart is generated from the node data of the target node and the directed acyclic graph. Because the upstream and downstream scheduling flow charts of the target node can be generated quickly from the node data of the target node and the directed acyclic graph, the technical problem in the related art, namely that the operator must manually page up and down from the current node or manually jump to the next or previous node along the link direction and therefore cannot see the job scheduling process at a glance, is at least partially solved. This achieves the technical effect of letting operators see the job scheduling process intuitively, which benefits production and daily operation and maintenance in batch processing scenarios.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a data processing method, apparatus, device, medium and program product according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a data processing method according to an embodiment of the disclosure;
FIG. 3 schematically illustrates a schematic diagram of a directed acyclic graph, according to an embodiment of the disclosure;
FIG. 4A schematically illustrates a link schematic of a directed acyclic graph according to an embodiment of the disclosure;
FIG. 4B schematically illustrates a link mapping diagram of a directed acyclic graph according to an embodiment of the disclosure;
FIG. 5A schematically illustrates a schematic diagram of an upstream scheduling flow diagram and/or a downstream scheduling flow diagram in accordance with an embodiment of the present disclosure;
FIG. 5B schematically illustrates a simplified schematic diagram of an upstream scheduling flow diagram and/or a downstream scheduling flow diagram in accordance with an embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow chart of a data processing method according to another embodiment of the present disclosure;
FIG. 7 schematically illustrates a schematic diagram of a presentation interface according to an embodiment of the present disclosure;
FIG. 8 schematically illustrates a block diagram of a data processing apparatus according to an embodiment of the present disclosure;
FIG. 9 schematically illustrates a block diagram of a data processing apparatus according to another embodiment of the present disclosure; and
Fig. 10 schematically illustrates a block diagram of an electronic device adapted to implement a data processing method according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where a convention analogous to "at least one of A, B and C, etc." is used, such a convention should generally be interpreted as one of skill in the art would understand it (e.g., "a system having at least one of A, B and C" would include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and application of the data involved (including but not limited to users' personal information) all comply with relevant laws and regulations, necessary security measures are taken, and public order and good customs are not violated.
Existing batch job scheduling tools cannot produce an intuitive scheduling flow chart for a specific node; the operator has to page up and down manually from the current node.
In view of this, an embodiment of the present disclosure provides a data processing method, including: obtaining a batch graph file, wherein the batch graph file comprises M job nodes and M is a positive integer greater than or equal to 1; for each of the M job nodes, parsing a target attribute of the job node to obtain node data corresponding to the job node; generating a directed acyclic graph according to the node data of the M job nodes; and, in response to an operation requesting an upstream scheduling flow chart and/or a downstream scheduling flow chart of a target node, generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to the node data of the target node and the directed acyclic graph.
Fig. 1 schematically illustrates an application scenario diagram of a data processing method, apparatus, device, medium and program product according to an embodiment of the present disclosure.
As shown in fig. 1, an application scenario 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is a medium used to provide a communication link between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 through the network 104 using at least one of the first terminal device 101, the second terminal device 102, the third terminal device 103, to receive or send messages, etc. Various communication client applications, such as a shopping class application, a web browser application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only) may be installed on the first terminal device 101, the second terminal device 102, and the third terminal device 103.
The first terminal device 101, the second terminal device 102, the third terminal device 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The first terminal device 101, the second terminal device 102, and the third terminal device 103 are configured to upload the batch graph file and to display the upstream scheduling flow chart and/or the downstream scheduling flow chart.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by the user using the first terminal device 101, the second terminal device 102, and the third terminal device 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
The server 105 is configured to obtain and parse the batch graph file, generate the upstream scheduling flow chart and/or the downstream scheduling flow chart, and feed the generated flow chart back to the first terminal device 101, the second terminal device 102, and the third terminal device 103.
It should be noted that the data processing method provided in the embodiments of the present disclosure may generally be performed by the server 105. Accordingly, the data processing apparatus provided by the embodiments of the present disclosure may generally be provided in the server 105. The data processing method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105. Accordingly, the data processing apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The data processing method of the disclosed embodiment will be described in detail below with reference to fig. 2 to 7 based on the scenario described in fig. 1.
Fig. 2 schematically illustrates a flow chart of a data processing method according to an embodiment of the present disclosure.
As shown in fig. 2, the data processing method of this embodiment includes operations S210 to S240.
In operation S210, a batch graph file is obtained, wherein the batch graph file includes M job nodes, and M is a positive integer greater than or equal to 1.
In operation S220, for each of the M job nodes, the target attribute of the job node is analyzed to obtain node data corresponding to the job node.
In operation S230, a directed acyclic graph is generated from node data of the M job nodes.
In operation S240, in response to an operation of generating the upstream and/or downstream schedule flowcharts of the target node, the upstream and/or downstream schedule flowcharts of the target node are generated from the node data and the directed acyclic graph of the target node.
According to an embodiment of the present disclosure, the batch graph file may be an input file of a batch job scheduling management tool such as Control-M. The batch graph file defines all batch tasks to be executed, and each job node it contains is ultimately scheduled to a specific program or script that executes the corresponding business logic.
According to an embodiment of the present disclosure, a batch scheduling job refers to a processing mode in which a computer program executes a series of tasks on batched input without manual intervention. In this mode the input consists of multiple records, and there is no manual interaction during processing. The counterpart is the online processing mode, in which one piece of input data is processed at a time and the result is fed back directly to the caller. Batch processing can handle a large amount of data in a single execution, runs for a long time, does not need to keep a connection to the caller, and generally notifies the caller of the result in the form of a report or the like; it is usually run during periods when computing resources are not under pressure so that system resources are better utilized. It is commonly used in scenarios with low real-time and interactivity requirements and large data volumes. The two key elements of batch processing are batch tasks and task scheduling. Batch tasks standardize how jobs are defined, orchestrated and executed; a good job model hides internal complexity, simplifies the development of individual jobs, and better supports scheduling. Task scheduling controls when or under which conditions a job is triggered and which system resources (server nodes, CPU resources, etc.) are used to execute it, and also covers handling and alerting after a job fails.
According to embodiments of the present disclosure, job nodes may be logical function nodes, such as table nodes, subTable nodes, and job nodes, that are ultimately scheduled to specific programs or scripts to execute the corresponding business. Each node in the batch graph file that completes an independent piece of business logic is defined as a table node; all child nodes contained under it are defined as subTable nodes, and a subTable node may itself nest multiple subTable nodes; the leaf node that finally points to a specific program or script to execute is a job node.
According to embodiments of the present disclosure, each job node may be a node that specifically performs a business logic function, and the job nodes are associated with one another. The M job nodes together constitute a complete or partial business function.
According to embodiments of the present disclosure, the target attribute may be configured according to the type of the batch graph file. For example, in some embodiments the target attribute may include the input condition, output condition, etc. of the job node. In other embodiments, the target attribute may also include the scheduling time, scheduling frequency, and the like.
According to an embodiment of the present disclosure, the node data may be data of the job node resolved based on the target attribute, and the node data of the job node may include attribute data such as an absolute path name, an input condition, an output condition, and a weight, for example.
According to embodiments of the present disclosure, node data may be persisted to a database or saved to a memory model for later use.
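As a rough picture of what an in-memory model of node data might look like, the following Python sketch collects the attributes listed above into a small record type. The class name `JobNode`, its field names, and the raw dictionary keys are hypothetical; the disclosure does not define a concrete schema.

```python
from dataclasses import dataclass, field

@dataclass
class JobNode:
    """Hypothetical in-memory record for the node data of one job node."""
    path: str                                        # absolute path name
    in_conditions: set = field(default_factory=set)   # input conditions
    out_conditions: set = field(default_factory=set)  # output conditions
    weight: float = 0.0                               # e.g. an estimated duration

def parse_node(raw):
    """Build a JobNode from a raw attribute mapping (keys are assumptions)."""
    return JobNode(
        path=raw["path"],
        in_conditions=set(raw.get("in", [])),
        out_conditions=set(raw.get("out", [])),
        weight=float(raw.get("weight", 0)),
    )

node = parse_node({"path": "/tableA/job1", "in": ["c0"], "out": ["c1"], "weight": 2})
print(node.path, sorted(node.in_conditions), sorted(node.out_conditions), node.weight)
```

Keeping conditions as sets makes the later matching of output conditions against input conditions a simple intersection test.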
Fig. 3 schematically illustrates a schematic diagram of a directed acyclic graph, according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the directed acyclic graph may be a loop-free directed graph that describes the dependency relationship between each job node.
As shown in FIG. 3, the input condition of job node 3-2 depends on the output condition of job node 3-1, while the output condition of job node 3-2 is a dependency of the input condition of job node 3-4.
According to embodiments of the present disclosure, the target node may be any node in the directed acyclic graph input by the operator at the front-end interface. For example, the target node is a job node a in the directed acyclic graph, and then an upstream scheduling flowchart and/or a downstream scheduling flowchart of the job node a are generated according to the node data of the job node a and the directed acyclic graph.
According to embodiments of the present disclosure, the upstream scheduling flow chart may be a flow chart that includes the target node and some or all of the job nodes preceding the target node; the downstream scheduling flow chart may be a flow chart that includes the target node and some or all of the job nodes following the target node.
In one embodiment, as shown in FIG. 3, where job node 3-4 is the target node, the upstream scheduling flow chart of job node 3-4 may be a flow chart including job node 3-1, job node 3-2, job node 3-3, and job node 3-4; the downstream scheduling flow chart of job node 3-4 may be a flow chart including job node 3-4, job node 3-5, job node 3-6, and job node 3-7.
According to an embodiment of the present disclosure, generating an upstream scheduling flowchart and/or a downstream scheduling flowchart of a target node from node data of the target node and a directed acyclic graph may include: and constructing an upstream scheduling flow chart and/or a downstream scheduling flow chart of the target node based on the breadth-first traversal algorithm according to the node data of the target node and the directed acyclic graph.
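A breadth-first traversal of this kind can be sketched as follows in Python. The function name `reachable` and the edge-list representation are assumptions for illustration. The example edges loosely follow FIG. 3, but the links 3-1 to 3-3 and 3-5 to 3-7 are guesses, since the text does not state how job nodes 3-3 and 3-7 are connected.

```python
from collections import deque

def reachable(edges, target, direction="downstream"):
    """Return the target node and every node reachable from it, in BFS order.

    edges: iterable of (upstream, downstream) pairs (hypothetical layout).
    direction: "downstream" follows links forward, "upstream" follows them back.
    """
    adjacency = {}
    for upstream, downstream in edges:
        if direction == "downstream":
            adjacency.setdefault(upstream, []).append(downstream)
        else:
            adjacency.setdefault(downstream, []).append(upstream)

    order, seen, queue = [], {target}, deque([target])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in adjacency.get(node, []):
            if nxt not in seen:  # each node is shown once even with several paths
                seen.add(nxt)
                queue.append(nxt)
    return order

fig3_edges = [
    ("3-1", "3-2"), ("3-1", "3-3"),  # 3-1 -> 3-3 is an assumption
    ("3-2", "3-4"), ("3-3", "3-4"),
    ("3-4", "3-5"), ("3-4", "3-6"),
    ("3-5", "3-7"),                  # 3-5 -> 3-7 is an assumption
]
print(reachable(fig3_edges, "3-4", "upstream"))    # ['3-4', '3-2', '3-3', '3-1']
print(reachable(fig3_edges, "3-4", "downstream"))  # ['3-4', '3-5', '3-6', '3-7']
```

Because breadth-first order visits nodes level by level, the result maps naturally onto the layered layout of a flow chart.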
According to the data processing method provided by the present disclosure, a batch graph file comprising M job nodes is obtained, and for each of the M job nodes the target attribute of the job node is parsed to obtain the corresponding node data; a directed acyclic graph is then generated from the node data of the M job nodes; and, in response to an operation requesting the upstream scheduling flow chart and/or the downstream scheduling flow chart of a target node, the requested flow chart is generated from the node data of the target node and the directed acyclic graph. Because the upstream and downstream scheduling flow charts of the target node can be generated quickly from the node data of the target node and the directed acyclic graph, the technical problem in the related art, namely that the operator must manually page up and down from the current node or manually jump to the next or previous node along the link direction and therefore cannot see the job scheduling process at a glance, is at least partially solved. This achieves the technical effect of letting operators see the job scheduling process intuitively, which benefits production and daily operation and maintenance in batch processing scenarios.
According to an embodiment of the present disclosure, node data of a job node includes an input condition and an output condition; generating the directed acyclic graph from the node data of the M job nodes includes: and generating a directed acyclic graph according to the association relation between the input condition and the output condition of each of the M job nodes.
According to the embodiment of the disclosure, according to an actual service scene, the dependency relationship among the operation nodes is predefined, and the dependency relationship among the operation nodes is intuitively embodied through the node data. The dependency relationship among a plurality of job nodes is constructed in the form of arrow pointing among the job nodes.
According to an embodiment of the present disclosure, as shown in fig. 3, each job node may have 0, 1 or more job nodes of upper-lower relationship such as job node 3-1 to job node 3-7 depending thereon, wherein the input condition of job node 3-4 depends on the output conditions of job node 3-2 and job node 3-3, while the output condition of job node 3-4 is a dependency of the input conditions of job node 3-5 and job node 3-6, and a directed acyclic graph is constructed based on the dependency of the input conditions and the output conditions.
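One way to derive the arrows of the graph from input and output conditions is to index every output condition by the job nodes that produce it, then link each producer to every job node that lists the condition as an input. The sketch below assumes a dictionary layout for the node data and invents the condition names `c1` to `c6`; neither comes from the disclosure.

```python
def build_edges(node_data):
    """Derive (upper, lower) edges from matching output and input conditions.

    node_data: {node_name: {"inputs": set_of_conditions, "outputs": set_of_conditions}}
    """
    # Index: condition name -> job nodes that output it.
    producers = {}
    for name, data in node_data.items():
        for condition in data["outputs"]:
            producers.setdefault(condition, set()).add(name)

    # A node that consumes a condition is a lower node of every producer of it.
    edges = set()
    for name, data in node_data.items():
        for condition in data["inputs"]:
            for upper in producers.get(condition, ()):
                if upper != name:
                    edges.add((upper, name))
    return sorted(edges)

# Hypothetical node data; condition names c1..c6 are placeholders.
NODE_DATA = {
    "3-1": {"inputs": set(), "outputs": {"c1"}},
    "3-2": {"inputs": {"c1"}, "outputs": {"c2"}},
    "3-3": {"inputs": {"c1"}, "outputs": {"c3"}},
    "3-4": {"inputs": {"c2", "c3"}, "outputs": {"c4"}},
    "3-5": {"inputs": {"c4"}, "outputs": {"c5"}},
    "3-6": {"inputs": {"c4"}, "outputs": {"c6"}},
}

dag_edges = build_edges(NODE_DATA)
```

Here job node 3-4 ends up with job nodes 3-2 and 3-3 as upper nodes and job nodes 3-5 and 3-6 as lower nodes, as in the FIG. 3 description.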
According to an embodiment of the present disclosure, the plurality of job nodes may be further topologically ordered based on the plurality of job nodes and the dependency relationship between the plurality of job nodes, where the topologically ordered is used to represent the upper-lower relationship (i.e., the dependency relationship described above) between the plurality of job nodes. For example, if the input condition of the job node 1 is A1, the output condition is B1, the input condition of the job node 2 is A2, and the output condition is B2, and the output condition B1 and the input condition A2 have a dependency relationship, the job node 1 is a higher node of the job node 2.
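The disclosure does not name a particular sorting algorithm. As an illustration, the sketch below uses Kahn's algorithm, a standard choice; the function name and the node labels `job1` and `job2` are assumptions.

```python
from collections import deque

def topological_order(nodes, edges):
    """Return the nodes so that every upper node precedes its lower nodes (Kahn's algorithm)."""
    indegree = {node: 0 for node in nodes}
    lower_nodes = {node: [] for node in nodes}
    for upper, lower in edges:
        lower_nodes[upper].append(lower)
        indegree[lower] += 1

    # Start with the nodes that depend on nothing.
    queue = deque(node for node in nodes if indegree[node] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for lower in lower_nodes[node]:
            indegree[lower] -= 1
            if indegree[lower] == 0:
                queue.append(lower)

    if len(order) != len(nodes):
        raise ValueError("dependency loop detected; no topological order exists")
    return order

# Job node 2 consumes output condition B1 of job node 1, so job1 -> job2.
order = topological_order(["job2", "job1"], [("job1", "job2")])
```

In this example `job1` is placed before `job2`, reflecting that job node 1 is the upper node.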
According to embodiments of the present disclosure, the construction of the directed acyclic graph may also employ breadth-first searching to traverse from any one job node. And firstly confirming the upper node or the lower node of the operation node, then continuing to search the upper node of the upper node or the lower node of the operation node, and the like until the upper and lower relationships of all the operation nodes are confirmed. Further, the upper and lower relationships among the operation nodes are arranged in a carding way, and the upper and lower relationships among the operation nodes are indicated by pointing through arrows, so that the directed acyclic graph is obtained.
According to an embodiment of the present disclosure, the above method further includes: simplifying the operation nodes and links in the directed acyclic graph according to a preset rule to obtain a simplified directed acyclic graph, wherein the links comprise physical connecting lines between the two operation nodes; the generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to the node data of the target node and the directed acyclic graph comprises the following steps: and generating an upstream scheduling flow chart and/or a downstream scheduling flow chart of the target node according to the node data of the target node and the simplified directed acyclic graph.
According to embodiments of the present disclosure, a link is a segment of a physical line from one job node to an adjacent job node without any other job node in between.
According to embodiments of the present disclosure, for example, during batch job scheduling, multiple databases may be involved to schedule jobs simultaneously. In the process of scheduling the job in the databases, a plurality of job nodes with the same input condition and output condition are generated. For example, when the job node 1 executes 3 database parallel scheduling job tasks, three job nodes of the job node 1-1, the job node 1-2, and the job node 1-3 are generated, respectively, and the input conditions and the output conditions of the job node 1-1, the job node 1-2, and the job node 1-3 are the same, the job node 1-1, the job node 1-2, and the job node 1-3 need to be simplified.
According to an embodiment of the present disclosure, simplifying the job nodes and links of the directed acyclic graph according to a preset rule to obtain the simplified directed acyclic graph includes: for each job node in the directed acyclic graph, in a case where it is determined that the job node contains at least two concurrent scheduling jobs, selecting a representative node from the at least two job nodes executing the concurrent scheduling jobs; hiding the non-representative nodes among the at least two job nodes, wherein the non-representative nodes are the job nodes, among the at least two job nodes executing the concurrent scheduling jobs, that are not the representative node; detecting a repetition condition for each job node in the directed acyclic graph; selecting a representative link from at least two links containing the repetition condition; and hiding the non-representative links among the at least two links, wherein the non-representative links are the links, among the at least two links containing the repetition condition, that are not the representative link.
According to an embodiment of the present disclosure, if job node 1-1, job node 1-2, and job node 1-3 above need to be simplified, the simplification process may include: determining a representative node from job node 1-1, job node 1-2, and job node 1-3, for example job node 1-1, and then hiding the non-representative nodes, for example job node 1-2 and job node 1-3. Only job node 1-1 is then displayed, which achieves the purpose of simplifying the directed acyclic graph.
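The node-hiding step can be illustrated by grouping job nodes whose input and output conditions are identical and keeping one member of each group. Which member becomes the representative is not specified in the text, so picking the first name in sorted order is an assumption, as are the function name and the condition names.

```python
def pick_representatives(node_data):
    """Split job nodes into (visible, hidden) by identical input and output conditions.

    node_data: {node_name: {"inputs": set_of_conditions, "outputs": set_of_conditions}}
    The first node name in sorted order represents each group (an assumed rule).
    """
    groups = {}
    for name in sorted(node_data):
        data = node_data[name]
        key = (frozenset(data["inputs"]), frozenset(data["outputs"]))
        groups.setdefault(key, []).append(name)

    visible, hidden = [], []
    for members in groups.values():
        visible.append(members[0])      # representative node
        hidden.extend(members[1:])      # non-representative nodes
    return visible, hidden

# Hypothetical node data: three parallel copies of job node 1 plus one other node.
PARALLEL = {
    "1-1": {"inputs": {"A1"}, "outputs": {"B1"}},
    "1-2": {"inputs": {"A1"}, "outputs": {"B1"}},
    "1-3": {"inputs": {"A1"}, "outputs": {"B1"}},
    "2": {"inputs": {"B1"}, "outputs": {"B2"}},
}

visible, hidden = pick_representatives(PARALLEL)
```

With this input, job node 1-1 stays visible and job nodes 1-2 and 1-3 are hidden, as in the example above.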
FIG. 4A schematically illustrates a link schematic of a directed acyclic graph according to an embodiment of the disclosure; fig. 4B schematically illustrates a link mapping diagram of a directed acyclic graph according to an embodiment of the disclosure.
According to an embodiment of the present disclosure, when the batch graph file is initialized and constructed, two nodes, namely a start node and an end node, are constructed for each subTable node. The start node represents the original subTable node, contains all of its input conditions, and outputs conditions to all nodes (including subTable nodes and job nodes) below it; the end node contains all output conditions of the subTable node and receives the output conditions of all nodes (including subTable nodes and job nodes) below it. The batch graph below the subTable node is constructed in this way, but the construction introduces redundant conditions. For example, FIG. 4A shows the original connection relationship of the batch graph file: the subTableA node contains a subTableB node and a jobC node, with the condition associations shown in the figure. FIG. 4B shows the mapping in the program: a startTable node and an endTable node are constructed for the subTableA node; the startTable node inherits all input conditions of the subTableA node and outputs conditions to the subTableB and jobC nodes; the endTable node inherits all output conditions of the subTableA node and receives all output conditions of the subTableB and jobC nodes. As a result, the connection from the subTableA (startTable) node to jobC and the connection from the subTableB node to endTable carry duplicate conditions that need to be optimized.
According to an embodiment of the present disclosure, detecting a repetition condition for each job node in the directed acyclic graph, selecting a representative link from at least two links containing the repetition condition, and hiding the non-representative links among the at least two links may proceed as follows. For example, in FIG. 4B, the link from subTableA to jobC and the link from subTableA to jobB contain duplicate conditions; the link from subTableA to jobB may be selected as the representative link, and the link from subTableA to jobC is then a non-representative link and is hidden, which achieves the purpose of simplifying the links.
According to an embodiment of the present disclosure, the above method further includes: in response to a scheduling time display operation for the target node, calculating the time consumed in executing from the starting node to the target node in the directed acyclic graph to obtain the scheduling time of the target node; and displaying the scheduling time of the target node on a display interface.
According to embodiments of the present disclosure, the scheduling time may be the time required for execution to proceed from one job node to the next job node.
According to an embodiment of the present disclosure, calculating the time consumed in executing from the starting node to the target node in the directed acyclic graph to obtain the scheduling time of the target node includes: calculating the link weight values of all links between the starting node and the target node; and determining the scheduling time of the target node according to the link weight values of all links.
According to an embodiment of the present disclosure, a starting node is defined as a job node whose input conditions do not depend on the output condition of any other job node; an end node is defined as a job node none of whose output conditions serves as an input condition of another job node. A link is a segment of a physical line from one job node to an adjacent job node without any other job node in between.
In one embodiment, the path from the starting node to the target node includes, in sequence, link a, link b, and link c. The link weight values of all links between the starting node and the target node then include link weight value a of link a, link weight value b of link b, and link weight value c of link c, and the scheduling time of the target node is determined according to link weight value a, link weight value b, and link weight value c.
According to an embodiment of the present disclosure, determining a scheduling time of a target node according to link weight values of all links includes: under the condition that at least two operation flows are included between the starting node and the target node, for each operation flow, determining the operation flow weight value of the operation flow according to the links contained in the operation flow, and obtaining at least two operation flow weight values; and determining the maximum value of the at least two job flow weight values as the scheduling time of the target node.
According to an embodiment of the present disclosure, as shown in FIG. 3, for example, the starting node is job node 3-1, the target node is job node 3-4, and two job flows exist between job node 3-1 and job node 3-4, namely job flow 1 (job node 3-1, job node 3-2, job node 3-4) and job flow 2 (job node 3-1, job node 3-3, job node 3-4). Suppose link a connects job node 3-1 and job node 3-2, link b connects job node 3-2 and job node 3-4, link c connects job node 3-1 and job node 3-3, and link d connects job node 3-3 and job node 3-4. Job flow 1 then includes link a and link b, and job flow 2 includes link c and link d. Job flow weight value 1 of job flow 1 is the sum of link weight value a of link a and link weight value b of link b; job flow weight value 2 of job flow 2 is the sum of link weight value c of link c and link weight value d of link d. Job flow weight value 1 and job flow weight value 2 are then compared, and if job flow weight value 1 is larger than job flow weight value 2, job flow weight value 1 is used as the scheduling time of the target node, namely job node 3-4.
According to an embodiment of the present disclosure, the above method further includes: in a case where it is determined that the number of job nodes in the upstream scheduling flowchart and/or the downstream scheduling flowchart is larger than a preset node display number, hiding the job nodes that are child nodes to obtain an optimized upstream scheduling flowchart and/or downstream scheduling flowchart; and displaying the optimized upstream scheduling flowchart and/or the optimized downstream scheduling flowchart on a display interface.
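Taking the maximum job flow weight value is equivalent to finding the longest weighted path from the starting node to the target node. The sketch below computes it with memoized recursion over incoming links; the numeric weights are invented for illustration, and the function name and data layout are assumptions.

```python
from functools import lru_cache

def scheduling_time(weighted_edges, start, target):
    """Return the largest sum of link weights over all job flows from start to target.

    weighted_edges: {(upper, lower): link_weight}. Assumes the graph has no loop.
    Returns None if the target cannot be reached from the start.
    """
    incoming = {}
    for (upper, lower), weight in weighted_edges.items():
        incoming.setdefault(lower, []).append((upper, weight))

    @lru_cache(maxsize=None)
    def longest(node):
        if node == start:
            return 0
        best = None
        for upper, weight in incoming.get(node, []):
            upstream = longest(upper)
            # Skip upper nodes that are not reachable from the starting node.
            if upstream is not None and (best is None or upstream + weight > best):
                best = upstream + weight
        return best

    return longest(target)

# Hypothetical link weights for links a, b, c, d of the FIG. 3 example.
WEIGHTS = {
    ("3-1", "3-2"): 5,   # link a
    ("3-2", "3-4"): 7,   # link b
    ("3-1", "3-3"): 4,   # link c
    ("3-3", "3-4"): 3,   # link d
}

time_3_4 = scheduling_time(WEIGHTS, "3-1", "3-4")
```

With these assumed weights, job flow 1 sums to 12 and job flow 2 sums to 7, so the scheduling time of job node 3-4 is 12.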
According to an embodiment of the present disclosure, if execution of job node 1 needs to depend on completion of job node 2, there is a parent-child relationship between the two jobs, where job node 1 is a child node of job node 2.
According to embodiments of the present disclosure, a preset node display number is set so that the upstream scheduling flowchart and/or the downstream scheduling flowchart can be displayed intuitively. If the current number of displayed nodes is larger than the preset node display number, the parent-child relationships among the job nodes are determined according to the dependency relationships among the nodes, the child nodes are hidden, and only their parent nodes are retained.
FIG. 5A schematically illustrates a schematic diagram of an upstream scheduling flow diagram and/or a downstream scheduling flow diagram in accordance with an embodiment of the present disclosure; fig. 5B schematically illustrates a simplified schematic diagram of an upstream scheduling flow diagram and/or a downstream scheduling flow diagram according to an embodiment of the present disclosure.
In one embodiment, as shown in FIG. 5A, for example, if the preset node display number is 3, the child nodes from job node 5-2 to job node 5-4 may be hidden to simplify the directed acyclic graph, so as to obtain the directed acyclic graph shown in FIG. 5B.
According to an embodiment of the present disclosure, the above method further includes: in a case where it is determined that the number of levels of the upstream scheduling flowchart and/or the downstream scheduling flowchart is larger than a preset number of levels, optimizing the job nodes in the upstream scheduling flowchart and/or the downstream scheduling flowchart according to the level relationship in a preset optimization mode to obtain an optimized upstream scheduling flowchart and/or downstream scheduling flowchart; and displaying the optimized upstream scheduling flowchart and/or the optimized downstream scheduling flowchart on a display interface.
According to embodiments of the present disclosure, the number of levels is preset such that the upstream and/or downstream scheduling flowcharts are intuitive. If the current level number is greater than the preset level number, determining an upper level relationship and a lower level relationship among a plurality of operation nodes according to the level relationship among the nodes, hiding the lower level nodes, and only displaying the upper level nodes.
In one embodiment, as shown in FIG. 5A, the first level includes job node 5-1, the second level includes job node 5-2 and job node 5-3, and the third level includes job node 5-4 to job node 5-8. The preset number of levels is 2, so the current number of levels is larger than the preset number of levels. Job node 5-4 to job node 5-8 are lower-level nodes relative to job node 5-1 to job node 5-3. Job node 5-4 to job node 5-8 may therefore be hidden, resulting in the directed acyclic graph shown in FIG. 5B.
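The level-based hiding can be illustrated by assigning each job node a level through breadth-first traversal from the first-level node and hiding everything deeper than the preset number of levels. The text does not say which third-level nodes hang under job node 5-2 and which under job node 5-3, so the edge list below is assumed, as are the function and variable names.

```python
from collections import deque

def split_by_level(edges, root, max_levels):
    """Return (shown, hidden) node lists; nodes deeper than max_levels are hidden.

    Levels are counted from 1 at `root` and assigned by breadth-first traversal.
    """
    children = {}
    for upper, lower in edges:
        children.setdefault(upper, []).append(lower)

    level = {root: 1}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in level:
                level[child] = level[node] + 1
                queue.append(child)

    shown = [node for node, depth in level.items() if depth <= max_levels]
    hidden = [node for node, depth in level.items() if depth > max_levels]
    return shown, hidden

# Hypothetical parent-child edges for FIG. 5A; the assignment of third-level
# nodes to job nodes 5-2 and 5-3 is assumed for illustration.
FIG_5A_EDGES = [
    ("5-1", "5-2"), ("5-1", "5-3"),
    ("5-2", "5-4"), ("5-2", "5-5"), ("5-2", "5-6"),
    ("5-3", "5-7"), ("5-3", "5-8"),
]

shown, hidden = split_by_level(FIG_5A_EDGES, "5-1", max_levels=2)
```

With a preset number of levels of 2, job nodes 5-1 to 5-3 remain shown and job nodes 5-4 to 5-8 are hidden.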
According to an embodiment of the present disclosure, the above method further includes: traversing output conditions and input conditions of all job nodes from a starting node of the directed acyclic graph in response to a loop detection operation for the directed acyclic graph; determining that a loop exists in the directed acyclic graph under the condition that the output condition of the current operation node is determined as the input condition of the operation node before the current operation node; and displaying the current operation node on a display interface.
According to an embodiment of the present disclosure, determining that a loop exists in the directed acyclic graph in a case where the output condition of the current job node is determined to be an input condition of a job node preceding it may be illustrated as follows: the input condition of job node 2 depends on the output condition of job node 1, the input condition of job node 3 depends on the output condition of job node 2, and no other output condition serves as an input condition of another job node; in this case the directed graph has no loop. If, in addition, the input condition of job node 1 depends on the output condition of job node 3, then job node 1, job node 2, and job node 3 form a loop, that is, a loop exists in the directed acyclic graph.
According to an embodiment of the present disclosure, all job nodes of the directed acyclic graph are traversed by a breadth-first traversal algorithm. If the output condition of the current job node is not an input condition of any job node, or the input condition of the current job node is not an output condition of any other job node, the condition is a redundant condition. In a depth-first recursive algorithm, traversal starts from the starting job node, and if the output condition of the traversed current job node serves as an input condition relied on by one of its ancestor nodes, it is determined that a loop exists in the directed acyclic graph.
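The depth-first check can be sketched with the usual three-state marking (unvisited, on the current recursion path, finished): an arrow that points back to a node still on the recursion path means the current node's output feeds one of its ancestors. The function name and the node labels are assumptions for the example.

```python
def find_loop_node(edges):
    """Return a job node whose output feeds one of its ancestors, or None if no loop exists.

    edges: list of (upper, lower) pairs derived from output and input conditions.
    """
    lower_nodes = {}
    nodes = []
    for upper, lower in edges:
        lower_nodes.setdefault(upper, []).append(lower)
        for node in (upper, lower):
            if node not in nodes:
                nodes.append(node)

    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state = {node: UNVISITED for node in nodes}

    def visit(node):
        state[node] = ON_PATH
        for lower in lower_nodes.get(node, []):
            if state[lower] == ON_PATH:
                return node                 # arrow back to an ancestor: loop found
            if state[lower] == UNVISITED:
                found = visit(lower)
                if found is not None:
                    return found
        state[node] = DONE
        return None

    for node in nodes:
        if state[node] == UNVISITED:
            found = visit(node)
            if found is not None:
                return found
    return None

no_loop = find_loop_node([("job1", "job2"), ("job2", "job3")])
with_loop = find_loop_node([("job1", "job2"), ("job2", "job3"), ("job3", "job1")])
```

In the second call, job node 3 is reported because its output condition is an input condition of job node 1, which is its ancestor in the traversal. The recursion depth grows with the length of the longest chain, so very deep graphs would need an iterative variant.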
According to an embodiment of the present disclosure, the above method further includes: in a case where it is determined that the batch graph file is not in a preset file format, acquiring a format file corresponding to the format of the batch graph file, wherein the format file is used for defining the format of the batch graph file; parsing the target attribute of the job node to obtain the node data corresponding to the job node includes: parsing the target attribute of the job node according to the format file to obtain the node data corresponding to the job node.
According to an embodiment of the present disclosure, the preset file format may be an xml format.
According to an embodiment of the present disclosure, the format file may specify format attributes such as text or number, which describe the format of target attributes of each node such as its input condition, scheduling time and frequency, and the output condition after scheduling is completed.
According to the embodiment of the disclosure, the target attribute may be an attribute such as a scheduling condition and scheduling time of each job node, for example, a key attribute in a table node, a subTable node and a job node, and the target attribute of the node is determined through a format file, so as to obtain node data corresponding to the job node.
According to an embodiment of the present disclosure, the format file defines the format of the batch graph file, including the node elements and condition definitions contained in the batch graph file. The entire batch graph file is connected through the association, via the output conditions and input conditions of the nodes, between a root node (which has no input condition) and each child node below it. The node element definitions of the batch graph file (that is, the relationships among nodes in the xml file) can be obtained from the format file, and the batch graph file is then parsed based on these node element definitions.
In the related art, a call flowchart formed from an input batch graph file with a fixed format becomes blurred and unintuitive as the batch graph file grows and the number of nodes defined in it increases. Diversified, configurable format files make it possible to parse batch graph files in different formats, which provides strong flexibility and extensibility.
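Format-driven parsing can be illustrated with Python's standard `xml.etree.ElementTree` module. The disclosure does not give the actual layout of the batch graph file or of the format file, so the element names (`table`, `job`, `in`, `out`), the `name` attribute, and the dictionary standing in for the format file are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in for a format file: tells the parser which tags and attributes to read.
# All names here are hypothetical.
FORMAT = {
    "node_tag": "job",
    "name_attr": "name",
    "input_tag": "in",
    "output_tag": "out",
}

# Hypothetical batch graph file content.
SAMPLE = """<table name="root">
  <job name="job1"><out>B1</out></job>
  <job name="job2"><in>B1</in><out>B2</out></job>
</table>"""

def parse_nodes(xml_text, fmt):
    """Parse job nodes from xml_text using the tag and attribute names given in fmt."""
    root = ET.fromstring(xml_text)
    node_data = {}
    for element in root.iter(fmt["node_tag"]):
        name = element.get(fmt["name_attr"])
        node_data[name] = {
            "inputs": {e.text.strip() for e in element.findall(fmt["input_tag"]) if e.text},
            "outputs": {e.text.strip() for e in element.findall(fmt["output_tag"]) if e.text},
        }
    return node_data

parsed = parse_nodes(SAMPLE, FORMAT)
```

Because the tag and attribute names come from `FORMAT` rather than being hard-coded, a batch graph file with a different layout could be handled by supplying a different mapping, which is the flexibility the format file is meant to provide.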
Fig. 6 schematically illustrates a flow chart of a data processing method according to another embodiment of the present disclosure.
As shown in fig. 6, the data processing method of this embodiment includes operations S601 to S608.
In operation S601, a batch graph file is acquired.
In operation S602, a format file corresponding to the batch graph file is acquired, where the format file is used to define the format of the batch graph file.
In operation S603, the batch graph file is parsed according to the format file, and node data corresponding to each job node of the batch graph file is obtained.
In operation S604, the association relationship of each job node is determined according to the node data obtained by the analysis, using the input condition and the output condition in the node data, so as to generate the directed acyclic graph.
In operation S605, the above obtained directed acyclic graph is subjected to a simplified process to conceal duplicate job nodes and links.
In operation S606, the time consumed by the start node to execute to the target node in the directed acyclic graph is confirmed, and the scheduling time of the target node is obtained.
In operation S607, loop detection is performed on the directed acyclic graph to avoid the existence of loops in the directed acyclic graph.
In operation S608, an upstream scheduling flowchart and a downstream scheduling flowchart of the target node are generated according to the directed acyclic graph.
Fig. 7 schematically illustrates a schematic diagram of a presentation interface according to an embodiment of the present disclosure.
As shown in FIG. 7, on the presentation interface, the batch graph file is submitted through "select batch map file to upload"; the name of the job node to be used as the target node is entered in "node name"; clicking "upstream scheduling flow chart" or "downstream scheduling flow chart" then produces the upstream scheduling flowchart or the downstream scheduling flowchart of the target node based on the input; further, clicking "loop detection" performs loop detection on the directed acyclic graph of the target node based on the input and outputs the detection result.
According to an embodiment of the present disclosure, batch graph files in different formats are parsed based on diversified, configurable format files, which provides high flexibility and extensibility. The upstream scheduling flowchart and the downstream scheduling flowchart associated with the target node can be obtained quickly, so that the service logic function corresponding to the target node and its whole calling link relationship can be grasped intuitively. By calculating the scheduling time of the target node, the influence of the target node's scheduling time on the timing and scope of other job nodes can be determined more accurately, which assists production and daily operation and maintenance in batch processing scenarios.
Based on the data processing method, the disclosure also provides a data processing device. The device will be described in detail below in connection with fig. 8.
Fig. 8 schematically shows a block diagram of a data processing apparatus according to an embodiment of the present disclosure.
As shown in fig. 8, the data processing apparatus 800 of this embodiment includes a first acquisition module 810, a parsing module 820, a first generation module 830, and a second generation module 840.
The first acquisition module 810 is configured to obtain a batch graph file, where the batch graph file includes M job nodes, and M is a positive integer greater than or equal to 1. In an embodiment, the first acquisition module 810 may be configured to perform operation S210 described above, which is not repeated here.
The parsing module 820 is configured to parse, for each of the M job nodes, the target attribute of the job node, and obtain node data corresponding to the job node. In an embodiment, the parsing module 820 may be used to perform the operation S220 described above, which is not described herein.
A first generating module 830, configured to generate a directed acyclic graph according to node data of the M job nodes. In an embodiment, the first generating module 830 may be configured to perform the operation S230 described above, which is not described herein.
The second generating module 840 is configured to generate the upstream scheduling flowchart and/or the downstream scheduling flowchart of the target node according to the node data and the directed acyclic graph of the target node in response to an operation of generating the upstream scheduling flowchart and/or the downstream scheduling flowchart of the target node. In an embodiment, the second generating module 840 may be configured to perform the operation S240 described above, which is not described herein.
According to the data processing device provided by the present disclosure, the first acquisition module obtains a batch graph file including M job nodes, and the parsing module parses, for each of the M job nodes, the target attribute of the job node to obtain node data corresponding to the job node; the first generation module then generates a directed acyclic graph according to the node data of the M job nodes; and the second generation module, in response to an operation of generating an upstream scheduling flowchart and/or a downstream scheduling flowchart of a target node, generates the upstream scheduling flowchart and/or the downstream scheduling flowchart of the target node according to the node data of the target node and the directed acyclic graph. Because the upstream and downstream scheduling flowcharts of the target node can be generated quickly from the node data of the target node and the directed acyclic graph, the technical problem in the related art, namely that an operator must manually page up and down from the current node, or manually jump to the next node or the previous node along the link direction, so that the job scheduling process is inconvenient to view intuitively, is at least partially solved. The technical effects of making the job scheduling process easy for operators to view intuitively and of benefiting production and daily operation and maintenance in batch processing scenarios are thereby achieved.
According to an embodiment of the present disclosure, the first generation module 830 includes a first generation sub-module.
The first generation sub-module is used for generating a directed acyclic graph according to the association relation between the input condition and the output condition of each of the M operation nodes.
According to an embodiment of the present disclosure, the data processing apparatus 800 further comprises a simplification module.
And the simplifying module is used for simplifying the operation nodes and the links in the directed acyclic graph according to a preset rule to obtain the simplified directed acyclic graph, wherein the links comprise physical connection lines between the two operation nodes.
According to an embodiment of the present disclosure, the second generation module 840 includes a second generation sub-module.
The second generation sub-module is used for generating the upstream scheduling flowchart and/or the downstream scheduling flowchart of the target node according to the node data of the target node and the simplified directed acyclic graph.
According to an embodiment of the disclosure, the simplification module includes a first selection sub-module, a first concealment sub-module, a detection sub-module, a second selection sub-module, and a second concealment sub-module.
The first selection sub-module is used for, for each job node in the directed acyclic graph, selecting a representative node from at least two job nodes executing concurrent scheduling jobs in a case where it is determined that the job node contains at least two concurrent scheduling jobs.
The first hiding sub-module is used for hiding the non-representative nodes among the at least two job nodes, wherein the non-representative nodes are the job nodes, among the at least two job nodes executing the concurrent scheduling jobs, that are not the representative node.
And the detection submodule is used for detecting the repetition condition of each operation node in the directed acyclic graph.
A second selection sub-module for selecting a representative link from at least two links containing a repetition condition.
And a second hiding submodule for hiding a non-representative link of the at least two links, wherein the non-representative link includes a link of the at least two links that is not a representative link that includes the repetition condition.
According to an embodiment of the present disclosure, the data processing apparatus 800 further includes a first obtaining module and a first displaying module.
The first obtaining module is used for responding to the scheduling time display operation aiming at the target node, calculating the time consumed by the starting node to execute to the target node in the directed acyclic graph, and obtaining the scheduling time of the target node.
And the first display module is used for displaying the scheduling time of the target node on the display interface.
According to an embodiment of the present disclosure, the first obtaining module includes a first computing sub-module and a first determining sub-module.
And the first computing sub-module is used for computing the link weight values of all links between the starting node and the target node.
And the first determining submodule is used for determining the scheduling time of the target node according to the link weight values of all links.
According to an embodiment of the present disclosure, the first determination submodule includes a first obtaining unit and a first determination unit.
The first obtaining unit is configured to determine, for each of the job flows, a job flow weight value of the job flow according to a link included in the job flow, and obtain at least two job flow weight values, when it is determined that at least two job flows are included between the start node and the target node.
And the first determining unit is used for determining the maximum value of the at least two job flow weight values as the scheduling time of the target node.
According to an embodiment of the present disclosure, the data processing apparatus 800 further includes a second obtaining module and a second displaying module.
And the second obtaining module is used for hiding the operation nodes serving as the child nodes to obtain the optimized upstream scheduling flow chart and/or downstream scheduling flow chart under the condition that the number of the operation nodes in the upstream scheduling flow chart and/or the downstream scheduling flow chart is determined to be larger than the preset node display number.
The second display module is used for displaying the optimized upstream scheduling flowchart and/or the optimized downstream scheduling flowchart on the display interface.
According to an embodiment of the present disclosure, the data processing apparatus 800 further includes a third obtaining module and a third display module.
The third obtaining module is configured to, when the number of levels of the upstream scheduling flow chart and/or the downstream scheduling flow chart is greater than a preset number of levels, optimize the job nodes in the upstream scheduling flow chart and/or the downstream scheduling flow chart according to the hierarchical relationship and in a preset optimization manner, to obtain an optimized upstream scheduling flow chart and/or downstream scheduling flow chart.
The third display module is configured to display the optimized upstream scheduling flow chart and/or the optimized downstream scheduling flow chart on the display interface.
According to an embodiment of the present disclosure, the data processing apparatus 800 further includes a traversal module, a determination module, and a fourth display module.
The traversal module is configured to, in response to a loop detection operation for the directed acyclic graph, traverse the output conditions and input conditions of all job nodes, starting from the start node of the directed acyclic graph.
The determination module is configured to determine that the directed acyclic graph contains a loop when the output condition of the current job node is determined to be the input condition of a job node preceding the current job node.
The fourth display module is configured to display the current job node on the display interface.
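The loop check described above can be sketched as a depth-first walk that remembers which job nodes lie on the current path. This is only one possible reading of the embodiment: the function name `find_loop_node`, the `inputs`/`outputs` keys, and the use of depth-first search are assumptions made for this example.

```python
def find_loop_node(nodes, start):
    """Return the first job node whose output condition is an input condition
    of a node already on the current path, or None if no loop is found.

    nodes: dict mapping a node name to {"inputs": set, "outputs": set}
    (a hypothetical layout chosen for this sketch).
    """
    # consumers[c] lists the nodes that wait on condition c.
    consumers = {}
    for name, data in nodes.items():
        for condition in data["inputs"]:
            consumers.setdefault(condition, []).append(name)

    on_path, done = set(), set()

    def visit(name):
        on_path.add(name)
        for condition in nodes[name]["outputs"]:
            for successor in consumers.get(condition, []):
                if successor in on_path:
                    return name  # this node closes the loop
                if successor not in done:
                    found = visit(successor)
                    if found is not None:
                        return found
        on_path.discard(name)
        done.add(name)
        return None

    return visit(start)


# C outputs "c_done", which B (an earlier node on the path A -> B -> C) waits on.
looped = {
    "A": {"inputs": set(), "outputs": {"a_done"}},
    "B": {"inputs": {"a_done", "c_done"}, "outputs": {"b_done"}},
    "C": {"inputs": {"b_done"}, "outputs": {"c_done"}},
}
print(find_loop_node(looped, "A"))  # prints C
```

The returned node corresponds to the "current job node" that the fourth display module would show on the display interface.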
According to an embodiment of the present disclosure, the data processing apparatus 800 further includes a second acquisition module.
The second acquisition module is configured to, when the batch file is not in a preset file format, acquire a format file corresponding to the format of the batch file, where the format file is used to define the format of the batch file.
According to an embodiment of the present disclosure, the parsing module 820 includes a parsing submodule.
The parsing submodule is configured to parse the target attribute of the job node according to the format file, to obtain node data corresponding to the job node.
According to an embodiment of the present disclosure, any number of the first acquisition module 810, the parsing module 820, the first generation module 830, and the second generation module 840 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the first acquisition module 810, the parsing module 820, the first generation module 830, and the second generation module 840 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of, or a suitable combination of, software, hardware, and firmware. Alternatively, at least one of the first acquisition module 810, the parsing module 820, the first generation module 830, and the second generation module 840 may be at least partially implemented as a computer program module which, when executed, performs the corresponding function.
Fig. 9 schematically shows a block diagram of a data processing apparatus according to another embodiment of the present disclosure.
As shown in fig. 9, the data processing apparatus 900 of this embodiment includes a batch file parsing module 910 and a link parsing module 920.
The batch file parsing module 910 is configured to parse the job nodes in one or more batch files according to the batch files and format files uploaded by the front end, to obtain node data, and to persist the node data to a database for use by the subsequent link parsing module. The node data mainly includes an absolute path name, an input condition, an output condition, and a weight value. The batch file parsing module 910 is further configured to construct a directed acyclic graph according to the association between the input conditions and the output conditions of the nodes. It should be noted that the batch file parsing module 910 may correspond to the first acquisition module 810, the parsing module 820, and the first generation module 830 in fig. 8.
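A minimal sketch of how a graph might be built from the association between output conditions and input conditions is given below. It assumes that a link runs from every node that outputs a condition to every node that lists the same condition as an input; the function name `build_graph`, the field names, and the sample path names are hypothetical and are not taken from the disclosure.

```python
def build_graph(node_data):
    """Return an adjacency dict: producer node -> sorted list of consumer nodes.

    node_data: dict keyed by absolute path name; each value holds the
    "inputs", "outputs", and "weight" of one job node (assumed layout).
    """
    # producers[c] lists the nodes whose output condition is c.
    producers = {}
    for name, data in node_data.items():
        for condition in data["outputs"]:
            producers.setdefault(condition, []).append(name)

    edges = {name: set() for name in node_data}
    for name, data in node_data.items():
        for condition in data["inputs"]:
            for source in producers.get(condition, []):
                edges[source].add(name)  # source must finish before name starts
    return {name: sorted(successors) for name, successors in edges.items()}


sample_nodes = {
    "/batch/load": {"inputs": [], "outputs": ["load_ok"], "weight": 5},
    "/batch/clean": {"inputs": ["load_ok"], "outputs": ["clean_ok"], "weight": 3},
    "/batch/report": {"inputs": ["load_ok", "clean_ok"], "outputs": [], "weight": 2},
}
graph = build_graph(sample_nodes)
print(graph["/batch/load"])  # prints ['/batch/clean', '/batch/report']
```

The sketch does not itself guarantee that the result is acyclic; a separate loop check on the node conditions would still be needed.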
The link parsing module 920 is configured to construct an upstream scheduling flow chart and a downstream scheduling flow chart by performing a breadth-first traversal from the designated input node, based on the directed acyclic graph and the node data of each node, and to calculate the weight values of nodes and links during the traversal. The link parsing module 920 is further configured to perform condition and loop detection in response to a condition and loop detection operation for the target node. It should be noted that the link parsing module 920 may correspond to the second generation module 840 in fig. 8.
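The breadth-first construction of the upstream and downstream flows can be sketched as follows. The sketch assumes the directed acyclic graph is available as an adjacency dict; the names `bfs_flow` and `reverse_edges` are hypothetical, and the weight calculation mentioned above is left out for brevity.

```python
from collections import deque


def bfs_flow(edges, target):
    """Return the nodes reachable from target in breadth-first order, target first."""
    order, seen, queue = [], {target}, deque([target])
    while queue:
        node = queue.popleft()
        order.append(node)
        for successor in edges.get(node, []):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return order


def reverse_edges(edges):
    """Flip every link so that a downstream walk on the result is an upstream walk."""
    reversed_edges = {node: [] for node in edges}
    for source, successors in edges.items():
        for destination in successors:
            reversed_edges.setdefault(destination, []).append(source)
    return reversed_edges


dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
downstream = bfs_flow(dag, "B")               # nodes that run after B
upstream = bfs_flow(reverse_edges(dag), "D")  # nodes that must run before D
print(downstream)  # prints ['B', 'D', 'E']
print(upstream)    # prints ['D', 'B', 'C', 'A']
```

Running the same traversal on the reversed graph keeps the upstream and downstream cases symmetric, at the cost of building a second adjacency dict.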
Fig. 10 schematically illustrates a block diagram of an electronic device adapted to implement a data processing method according to an embodiment of the disclosure.
As shown in fig. 10, an electronic device 1000 according to an embodiment of the present disclosure includes a processor 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. The processor 1001 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 1001 may also include on-board memory for caching purposes. The processor 1001 may include a single processing unit or multiple processing units for performing different actions of the method flows according to embodiments of the present disclosure.
In the RAM 1003, various programs and data necessary for the operation of the electronic apparatus 1000 are stored. The processor 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. The processor 1001 performs various operations of the method flow according to the embodiment of the present disclosure by executing programs in the ROM 1002 and/or the RAM 1003. Note that the program may be stored in one or more memories other than the ROM 1002 and the RAM 1003. The processor 1001 may also perform various operations of the method flow according to the embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the disclosure, the electronic device 1000 may also include an input/output (I/O) interface 1005, which is also connected to the bus 1004. The electronic device 1000 may also include one or more of the following components connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output section 1007 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), or the like, and a speaker or the like; a storage section 1008 including a hard disk or the like; and a communication section 1009 including a network interface card such as a LAN card, a modem, or the like. The communication section 1009 performs communication processing via a network such as the Internet. A drive 1010 is also connected to the I/O interface 1005 as needed. A removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1010 as needed, so that a computer program read from it can be installed into the storage section 1008 as needed.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 1002 and/or RAM 1003 and/or one or more memories other than ROM 1002 and RAM 1003 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. When the computer program product is run on a computer system, the program code causes the computer system to carry out the methods provided by the embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 1001. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be based on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted in the form of signals on a network medium, distributed, and downloaded and installed via the communication section 1009, and/or installed from the removable medium 1011. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 1009, and/or installed from the removable medium 1011. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 1001. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
According to embodiments of the present disclosure, program code for executing the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages. In particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or in assembly/machine languages. Such programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or integrated in a variety of ways, even if such combinations or integrations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be combined and/or integrated in various ways without departing from the spirit and teachings of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. These examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (15)

1. A method of data processing, the method comprising:
obtaining a batch graph file, wherein the batch graph file comprises M job nodes, and M is a positive integer greater than or equal to 1;
for each job node of the M job nodes, parsing a target attribute of the job node to obtain node data corresponding to the job node;
generating a directed acyclic graph according to the node data of the M job nodes; and
in response to an operation of generating an upstream scheduling flow chart and/or a downstream scheduling flow chart of a target node, generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to node data of the target node and the directed acyclic graph.
2. The method of claim 1, wherein the node data of the job node includes an input condition and an output condition;
the generating the directed acyclic graph according to the node data of the M job nodes comprises:
and generating the directed acyclic graph according to the association relation between the input condition and the output condition of each of the M job nodes.
3. The method according to claim 1, wherein the method further comprises:
simplifying the job nodes and links in the directed acyclic graph according to a preset rule to obtain a simplified directed acyclic graph, wherein a link comprises a physical connecting line between two job nodes;
wherein the generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to the node data of the target node and the directed acyclic graph comprises:
generating the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to the node data of the target node and the simplified directed acyclic graph.
4. A method according to claim 3, wherein the simplifying the job nodes and links of the directed acyclic graph according to the preset rule to obtain the simplified directed acyclic graph comprises:
for each job node in the directed acyclic graph, when the job node contains at least two concurrently scheduled jobs, selecting a representative node from the at least two job nodes executing the concurrently scheduled jobs;
hiding non-representative nodes of the at least two job nodes, wherein the non-representative nodes comprise those of the at least two job nodes executing the concurrently scheduled jobs that are not the representative node;
detecting duplicate conditions of each job node in the directed acyclic graph;
selecting a representative link from at least two links containing a duplicate condition; and
hiding non-representative links of the at least two links, wherein the non-representative links comprise those of the at least two links containing a duplicate condition that are not the representative link.
5. The method according to claim 1, wherein the method further comprises:
in response to a scheduling time display operation for the target node, calculating the time consumed in executing from a starting node to the target node in the directed acyclic graph, to obtain the scheduling time of the target node; and
displaying the scheduling time of the target node on a display interface.
6. The method of claim 5, wherein the calculating the time consumed in executing from the starting node to the target node in the directed acyclic graph to obtain the scheduling time of the target node comprises:
calculating link weight values of all links between the starting node and the target node; and
determining the scheduling time of the target node according to the link weight values of all the links.
7. The method of claim 6, wherein said determining the scheduling time of the target node based on the link weight values of all links comprises:
when it is determined that at least two job flows exist between the starting node and the target node, determining, for each job flow, a job flow weight value of the job flow according to the links contained in the job flow, to obtain at least two job flow weight values; and
determining the maximum of the at least two job flow weight values as the scheduling time of the target node.
8. The method according to claim 1, wherein the method further comprises:
when it is determined that the number of job nodes in the upstream scheduling flow chart and/or the downstream scheduling flow chart is greater than a preset node display number, hiding the job nodes that are child nodes, to obtain an optimized upstream scheduling flow chart and/or downstream scheduling flow chart; and
displaying the optimized upstream scheduling flow chart and/or the optimized downstream scheduling flow chart on a display interface.
9. The method according to claim 1, wherein the method further comprises:
when the number of levels of the upstream scheduling flow chart and/or the downstream scheduling flow chart is greater than a preset number of levels, optimizing the job nodes in the upstream scheduling flow chart and/or the downstream scheduling flow chart according to the hierarchical relationship and in a preset optimization manner, to obtain an optimized upstream scheduling flow chart and/or downstream scheduling flow chart; and
displaying the optimized upstream scheduling flow chart and/or the optimized downstream scheduling flow chart on a display interface.
10. The method according to claim 1, wherein the method further comprises:
in response to a loop detection operation for the directed acyclic graph, traversing the output conditions and input conditions of all job nodes, starting from a starting node of the directed acyclic graph;
determining that the directed acyclic graph contains a loop when the output condition of the current job node is determined to be the input condition of a job node preceding the current job node; and
displaying the current job node on a display interface.
11. The method according to claim 1, wherein the method further comprises:
when the batch file is not in a preset file format, acquiring a format file corresponding to the format of the batch file, wherein the format file is used to define the format of the batch file;
wherein the parsing the target attribute of the job node to obtain node data corresponding to the job node comprises:
parsing the target attribute of the job node according to the format file to obtain the node data corresponding to the job node.
12. A data processing apparatus, the apparatus comprising:
a first acquisition module, configured to obtain a batch graph file, wherein the batch graph file comprises M job nodes, and M is a positive integer greater than or equal to 1;
a parsing module, configured to parse, for each job node of the M job nodes, a target attribute of the job node to obtain node data corresponding to the job node;
a first generation module, configured to generate a directed acyclic graph according to the node data of the M job nodes; and
a second generation module, configured to, in response to an operation of generating an upstream scheduling flow chart and/or a downstream scheduling flow chart of a target node, generate the upstream scheduling flow chart and/or the downstream scheduling flow chart of the target node according to node data of the target node and the directed acyclic graph.
13. An electronic device, comprising:
one or more processors; and
a memory for storing one or more computer programs,
characterized in that the one or more processors execute the one or more computer programs to implement the steps of the method according to any one of claims 1 to 11.
14. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, realizes the steps of the method according to any one of claims 1-11.
15. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-11.
CN202410257003.7A 2024-03-06 2024-03-06 Data processing method, device, equipment and storage medium Pending CN118153931A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410257003.7A CN118153931A (en) 2024-03-06 2024-03-06 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410257003.7A CN118153931A (en) 2024-03-06 2024-03-06 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN118153931A true CN118153931A (en) 2024-06-07

Family

ID=91288025

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410257003.7A Pending CN118153931A (en) 2024-03-06 2024-03-06 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN118153931A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination