CN117608985A - Real-time task management method and device, electronic equipment and storage medium
- Publication number
- CN117608985A (application number CN202311092049.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3447—Performance evaluation by modeling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3089—Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
- G06F11/3093—Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application relates to a real-time task management method, a device, electronic equipment and a storage medium. The real-time task management method comprises the following steps: periodically collecting set performance indexes of a target real-time task; determining characteristic data of the set performance indexes; inputting the characteristic data into a performance analysis model to determine a performance analysis result corresponding to the set performance indexes; and controlling configuration information of the real-time task based on the performance analysis result. With the method and the device, performance indexes of a target real-time task in a real-time task chain can be collected periodically, and the performance condition of the target real-time task can then be accurately determined from the collected performance indexes and the performance analysis model. When the target real-time task is abnormal, its configuration information is adjusted promptly, so that execution of the target real-time task is improved, execution of the real-time task chain is better guaranteed, and the use experience is improved.
Description
Technical Field
The present invention relates to the field of real-time task management technologies, and in particular, to a real-time task management method, device, electronic device, and storage medium.
Background
The volume and complexity of data in every industry keep growing. In a real-time computing task chain, the real-time computing tasks and their performance need to be monitored in real time, so that the upstream and downstream relations among the real-time computing tasks between the data source and the business application are displayed comprehensively, dynamically and visually, and real-time computing tasks with performance problems are located promptly.
In the related art, the data flow of a real-time computing task chain is complex. In business scenarios with strict data timeliness requirements, when an abnormality occurs, every real-time computing task between the data application and the data source has to be checked one by one. This is inefficient, so problems are discovered late, the real-time computing tasks are not improved in time, and the use experience is seriously affected.
Disclosure of Invention
One object of the present application is to provide a real-time task management method that can promptly detect abnormalities during real-time task execution, promptly improve the execution of the real-time task, and improve the use experience; a second object of the present application is to provide a real-time task management device; a third object of the present application is to provide an electronic device; a fourth object of the present application is to provide a storage medium.
In order to achieve the above object, in a first aspect, the present application provides a real-time task management method, including:
periodically collecting set performance indexes of a target real-time task; wherein the set performance indexes include at least one of: data backlog level, data skew level, system index, task index, operator index, checkpoint index, TaskManager index, and JobManager index;
determining characteristic data of the set performance index;
inputting the characteristic data into a performance analysis model to determine a performance analysis result corresponding to the set performance index; wherein the performance analysis result comprises a performance state label and/or an optimization suggestion label;
and controlling the configuration information of the real-time task based on the performance analysis result.
Further, the data backlog level is determined by:
collecting data backlog information corresponding to the target real-time task; the data backlog information comprises the data consumption rate and the partition data backlog quantity of each corresponding partition of the target real-time task in the data processing system within a set time length;
determining, for each corresponding partition, the quotient of the partition data backlog quantity and the partition data consumption rate of that same partition;
and determining the data backlog level based on the quotients of all corresponding partitions.
Further, the determining the data backlog level based on the quotients of all corresponding partitions includes:
determining the data backlog level based on the maximum value of the quotients of the corresponding partitions and on set backlog level information;
wherein, when the maximum value is smaller than or equal to the maximum set level in the set backlog level information, the set level corresponding to the maximum value in the set backlog level information is determined as the data backlog level; and when the maximum value is larger than the maximum set level in the set backlog level information, the maximum set level is determined as the data backlog level.
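As an illustration only, the backlog level determination described above can be sketched in Python. The function name `backlog_level`, the default `max_level` of 5, the rounding of the maximum quotient up to the next integer level, and the treatment of a stalled partition are all assumptions made for this sketch; the text above only states that the maximum quotient is mapped to a set level and capped at the maximum set level.

```python
import math
from typing import Mapping


def backlog_level(backlog: Mapping[int, float],
                  consume_rate: Mapping[int, float],
                  max_level: int = 5) -> int:
    """Map per-partition backlog and consumption rate to a level in [0, max_level]."""
    quotients = []
    for partition, pending in backlog.items():
        rate = consume_rate.get(partition, 0.0)
        if rate <= 0:
            # Assumption: a partition that is not being consumed but still has
            # pending data is treated as the worst case.
            quotients.append(math.inf if pending > 0 else 0.0)
        else:
            # Quotient of partition backlog quantity and partition consumption rate.
            quotients.append(pending / rate)
    worst = max(quotients, default=0.0)
    if worst > max_level:
        # Above the maximum set level: cap at the maximum set level.
        return max_level
    # Assumption: the corresponding set level is the quotient rounded up.
    return math.ceil(worst)
```

The unit of the quotient follows the unit of the consumption rate; with a rate in records per minute, for instance, the quotient estimates the minutes needed to drain the partition.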
Further, the data skew level is determined by:
collecting data skew information corresponding to the target real-time task; the data skew information comprises the partition data backlog quantity of each corresponding partition of the target real-time task in a data processing system within a set time length;
determining a data backlog average value and a data backlog standard deviation corresponding to the data skew information;
and determining the data skew level based on the data backlog average value and the data backlog standard deviation.
Further, the determining the data skew level based on the data backlog average value and the data backlog standard deviation includes at least one of:
if the data backlog standard deviation is smaller than a first percentage of the data backlog average value, determining the data skew level as not skewed;
if the data backlog standard deviation is greater than or equal to the first percentage of the data backlog average value and smaller than a second percentage of the data backlog average value, determining the data skew level as slightly skewed;
and if the data backlog standard deviation is greater than or equal to the second percentage of the data backlog average value, determining the data skew level as severely skewed.
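The three-way comparison above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `skew_level`, the example percentages (10% and 30%), the use of the population standard deviation, and the handling of an all-zero backlog are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def skew_level(partition_backlogs: Sequence[float],
               first_pct: float = 0.10,
               second_pct: float = 0.30) -> str:
    """Classify skew by comparing the backlog standard deviation to the average."""
    if not partition_backlogs:
        return "not skewed"
    avg = mean(partition_backlogs)
    if avg == 0:
        # Assumption: no backlog on any partition means there is nothing to skew.
        return "not skewed"
    std = pstdev(partition_backlogs)  # population standard deviation (assumption)
    if std < first_pct * avg:
        return "not skewed"
    if std < second_pct * avg:
        return "slightly skewed"
    return "severely skewed"
```

For example, three partitions with equal backlog are classified as not skewed, while one partition holding almost all of the backlog is classified as severely skewed.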
Further, the performance analysis model is determined by:
constructing a training set; the training set comprises a plurality of training sample pairs, wherein each training sample pair comprises an input sample and an output sample, each output sample comprises a performance analysis result sample, and each input sample comprises a characteristic data sample of a set performance index corresponding to the corresponding performance analysis result sample;
and training the original model based on the training set to determine the performance analysis model meeting the set accuracy.
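For illustration, training until a set accuracy is reached can be sketched with a very small stand-in model. The text above does not name the model type; the perceptron below, the binary label (1 for abnormal, 0 for normal), the names `train_until` and `target_acc`, and the two-element feature vectors are assumptions chosen only to keep the sketch self-contained.

```python
from typing import List, Sequence, Tuple

# One training sample pair: (characteristic data sample, result label 0 or 1).
Sample = Tuple[Sequence[float], int]


def predict(weights: List[float], bias: float, x: Sequence[float]) -> int:
    """Return 1 (abnormal) if the weighted sum is positive, else 0 (normal)."""
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0


def accuracy(weights: List[float], bias: float, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the output sample."""
    hits = sum(predict(weights, bias, x) == y for x, y in samples)
    return hits / len(samples)


def train_until(samples: List[Sample], target_acc: float = 0.95,
                max_epochs: int = 100, lr: float = 1.0):
    """Train a perceptron until the set accuracy is met or epochs run out."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        if accuracy(weights, bias, samples) >= target_acc:
            break
        for x, y in samples:
            err = y - predict(weights, bias, x)
            weights = [w + lr * err * v for w, v in zip(weights, x)]
            bias += lr * err
    return weights, bias, accuracy(weights, bias, samples)
```

In this sketch the accuracy is measured on the training samples themselves; a held-out validation set would normally be used to judge whether the set accuracy is met.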
Further, the real-time task management method includes:
acquiring target configuration information of the target real-time task;
defining target node relation data corresponding to the target real-time task based on the target configuration information; the target node relation data comprises a first node, a second node, a node relation between the first node and the second node and a relation direction of the node relation, wherein the first node represents a producer of the target real-time task, the second node represents a consumer of the target real-time task, the node relation represents the target real-time task, and the relation direction represents a data processing direction of the target real-time task;
and adding the target node relation data to a graph database for real-time task management.
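The node relation data described above can be illustrated with a small in-memory structure. This is a sketch only: `NodeRelation`, `InMemoryGraph`, `relation_from_config` and the configuration keys `task_name`, `source_topic` and `sink_topic` are hypothetical names, and the in-memory class merely stands in for a real graph database.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class NodeRelation:
    producer: str  # first node: producer of the real-time task
    consumer: str  # second node: consumer of the real-time task
    task: str      # node relation: the real-time task itself
    # The relation direction is producer -> consumer (data processing direction).


class InMemoryGraph:
    """Stand-in for a graph database (assumption for this sketch)."""

    def __init__(self) -> None:
        self.nodes: Set[str] = set()
        self.edges: List[NodeRelation] = []

    def add_relation(self, rel: NodeRelation) -> None:
        """Add both nodes and the directed relation, ignoring exact duplicates."""
        self.nodes.update((rel.producer, rel.consumer))
        if rel not in self.edges:
            self.edges.append(rel)

    def downstream(self, node: str) -> List[str]:
        """Return the nodes that consume data produced by the given node."""
        return [e.consumer for e in self.edges if e.producer == node]


def relation_from_config(config: Dict[str, str]) -> NodeRelation:
    """Define node relation data from task configuration (hypothetical keys)."""
    return NodeRelation(producer=config["source_topic"],
                        consumer=config["sink_topic"],
                        task=config["task_name"])
```

Storing the task as a directed edge between a producer node and a consumer node makes upstream and downstream lookups a simple edge traversal.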
Further, after defining the target node relationship data corresponding to the target real-time task based on the target configuration information, the real-time task management method includes:
collecting node attribute data and/or relationship attribute data;
wherein the node attribute data includes at least one of: data production rate, total number of data produced, data production storage rate, total size of data production storage, node status, data skew level, total number of node partitions, node topic name and node message middleware cluster name;
The relationship attribute data includes at least one of: data consumption rate, total number of data consumption, consumer group name, task running state, data backlog level, and data consumption backlog reason.
Further, the real-time task management method includes:
drawing a task chain map corresponding to a real-time task chain based on the node relation data in the graph database; the real-time task chain comprises at least one real-time task, the task chain map comprises a task map of the at least one real-time task, and the task map represents node relation data of the real-time task;
and displaying the task chain map on a front-end display interface based on a timed task.
Further, the drawing of the task chain map corresponding to the real-time task chain based on the node relation data in the graph database includes:
and drawing task maps of the corresponding real-time tasks in different forms based on the different types of real-time tasks.
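One possible way to draw different task types in different forms is to emit Graphviz DOT text with a line style per task type. This is a sketch under assumptions: the task types `compute` and `sync`, the style mapping, and the function name `to_dot` are hypothetical, and the text above does not state which drawing tool is used.

```python
from typing import Iterable, Tuple

# Hypothetical mapping from task type to edge style.
STYLE_BY_TYPE = {"compute": "solid", "sync": "dashed"}

# One edge: (producer node, consumer node, task name, task type).
Edge = Tuple[str, str, str, str]


def to_dot(edges: Iterable[Edge]) -> str:
    """Render a task chain as DOT text, one directed edge per real-time task."""
    lines = ["digraph task_chain {"]
    for producer, consumer, task, task_type in edges:
        # Unknown task types fall back to a dotted line.
        style = STYLE_BY_TYPE.get(task_type, "dotted")
        lines.append(
            f'  "{producer}" -> "{consumer}" [label="{task}", style={style}];'
        )
    lines.append("}")
    return "\n".join(lines)
```

The resulting text can be passed to any DOT renderer; a front end could regenerate it on a timer to refresh the displayed map.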
To achieve the above object, in a second aspect, the present application further provides a real-time task management device, including:
the acquisition module is used for periodically collecting set performance indexes of the target real-time task; wherein the set performance indexes include at least one of: data backlog level, data skew level, system index, task index, operator index, checkpoint index, TaskManager index, and JobManager index;
the determining module is used for determining the characteristic data of the set performance indexes,
and for inputting the characteristic data into a performance analysis model to determine a performance analysis result corresponding to the set performance indexes; wherein the performance analysis result comprises a performance state label and/or an optimization suggestion label;
and the control module is used for controlling the configuration information of the real-time task based on the performance analysis result.
To achieve the above object, according to a third aspect, the present application further provides an electronic device, including: a processor and a memory, the processor being configured to execute a control program stored in the memory, to implement the real-time task management method according to any one of the first aspect.
To achieve the above object, in a fourth aspect, the present application further provides a storage medium storing one or more programs executable by one or more processors to implement the real-time task management method as described above.
The beneficial effects of this application:
With the method and the device, performance indexes of a target real-time task in a real-time task chain can be collected periodically, and the performance condition of the target real-time task can then be accurately determined from the collected performance indexes and the performance analysis model. When the target real-time task is abnormal, its configuration information is adjusted promptly, so that execution of the target real-time task is improved, execution of the real-time task chain is better guaranteed, and the use experience is improved.
Drawings
FIG. 1 shows a flow chart of a real-time task management method according to an embodiment of the present application;
FIG. 2 shows a flow chart of determining a data backlog level according to an embodiment of the present application;
FIG. 3 shows a flow chart of determining a data skew level according to an embodiment of the present application;
FIG. 4a shows a flow chart of a real-time task management method according to an embodiment of the present application;
FIG. 4b shows a schematic diagram of a task chain map provided by an embodiment of the present application;
FIG. 4c shows a schematic diagram of a task map provided by an embodiment of the present application;
FIG. 4d shows a schematic diagram of a task map provided by an embodiment of the present application;
FIG. 4e shows a data production rate funnel diagram of a task chain data node provided by an embodiment of the present application;
FIG. 4f shows a schematic diagram of a real-time task chain dynamic monitoring system provided by an embodiment of the present application;
FIG. 5 shows a schematic structural diagram of a real-time task management device according to an embodiment of the present application;
FIG. 6 shows a schematic structural diagram of an electronic device according to an embodiment of the present application.
Wherein:
10. an acquisition module; 20. a determining module; 30. a control module; 40. a defining module; 50. an adding module; 60. a drawing module; 70. a display module;
100. an electronic device; 101. a processor; 102. a memory; 1021. an operating system; 1022. an application program; 103. a user interface; 104. a network interface; 105. a bus system.
Detailed Description
Further advantages and effects of the present application will be readily apparent to those skilled in the art from the disclosure in the present specification, by describing embodiments of the present application with reference to the accompanying drawings and preferred examples. The present application may be embodied or carried out in other specific embodiments, and the details of the present application may be modified or changed from various points of view and applications without departing from the spirit of the present application. It should be understood that the preferred embodiments are presented by way of illustration only and not limitation to the scope of the present application.
It should be noted that, the illustrations provided in the following embodiments merely illustrate the basic concepts of the application by way of illustration, and only the components related to the application are shown in the drawings and are not drawn according to the number, shape and size of the components in actual implementation, and the form, number and proportion of the components in actual implementation may be arbitrarily changed, and the layout of the components may be more complex.
For the purpose of facilitating an understanding of the embodiments of the present application, reference will now be made to the following description of specific embodiments, taken in conjunction with the accompanying drawings, in which the embodiments are not intended to limit the embodiments of the present application.
The embodiment provides a real-time task management method which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 1, the method may include:
S110, periodically collecting set performance indexes of a target real-time task;
S120, determining characteristic data of the set performance indexes;
S130, inputting the characteristic data into a performance analysis model to determine a performance analysis result corresponding to the set performance indexes; wherein the performance analysis result comprises a performance state label and/or an optimization suggestion label;
S140, controlling configuration information of the real-time task based on the performance analysis result.
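Steps S110 to S140 can be read as one management round, sketched below. The names `manage_once` and `rule_based_model`, the feature key `backlog_level`, the threshold of 3, and the label values are hypothetical; in particular, the rule-based function only stands in for the trained performance analysis model.

```python
from typing import Callable, Dict

Metrics = Dict[str, float]
Result = Dict[str, str]


def manage_once(collect: Callable[[], Metrics],
                featurize: Callable[[Metrics], Metrics],
                analyze: Callable[[Metrics], Result],
                apply_config: Callable[[Result], None]) -> Result:
    """Run one round of the four steps and return the analysis result."""
    metrics = collect()            # S110: collect the set performance indexes
    features = featurize(metrics)  # S120: determine the characteristic data
    result = analyze(features)     # S130: performance analysis model
    apply_config(result)           # S140: control the task configuration
    return result


def rule_based_model(features: Metrics) -> Result:
    """Stand-in for the trained model (assumption): one threshold rule."""
    if features.get("backlog_level", 0) >= 3:
        return {"state": "abnormal", "suggestion": "increase_parallelism"}
    return {"state": "normal", "suggestion": "none"}
```

Passing the four steps in as callables keeps the round independent of how indexes are collected and how configuration changes are applied.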
In step S110, the target real-time task refers to a real-time task for which performance analysis is required. The real-time task may be a computing task or other tasks, which is not limited.
The set performance indexes include at least one of: data backlog level, data skew level, system index, task index, operator index, checkpoint index, TaskManager index, and JobManager index.
In this step, the set performance indexes may be collected by configuring a timed collection task for them. The timed collection task may be set to collect once every 1 second or at any other interval, which is not limited.
In some embodiments, when configuring the timed collection task, the set performance indexes may be determined to include a system index, a task index, an operator index, a checkpoint index, a TaskManager index, and a JobManager index, and monitoring indexes, alarm thresholds, reminder messages and the like are configured. A performance monitoring thread is added to the real-time task (namely the Flink task), and if a monitored performance index exceeds its threshold, the collected performance index is written into the time-series database. Meanwhile, passively collected performance indexes and their collection frequency may be configured; the performance monitoring thread then collects these performance indexes periodically and writes them into the time-series database to complete performance index data collection.
The set performance index may be acquired by other means besides the above-described means, and is not limited thereto.
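A timed collection loop of the kind described above might look like the following sketch. It is not the monitoring thread of the embodiment: `run_collector`, the row layout, and the use of a plain list in place of a time-series database are assumptions, and the sketch records every sample while flagging those that exceed a configured alarm threshold.

```python
import time
from typing import Callable, Dict, List, Tuple

# One stored row: (timestamp, index name, value, threshold exceeded).
Row = Tuple[float, str, float, bool]


def run_collector(read_metrics: Callable[[], Dict[str, float]],
                  thresholds: Dict[str, float],
                  sink: List[Row],
                  interval_s: float = 1.0,
                  rounds: int = 3) -> None:
    """Collect indexes at a fixed interval and append them to the sink."""
    for i in range(rounds):
        now = time.time()
        for name, value in read_metrics().items():
            limit = thresholds.get(name)
            # An index without a configured threshold is never flagged.
            exceeded = limit is not None and value > limit
            sink.append((now, name, value, exceeded))
        if i < rounds - 1:
            time.sleep(interval_s)
```

In a running task this loop would typically execute on its own thread so that collection does not block data processing.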
The system index may include, for example, hostname, CPU, memory, network, IO (including throughput and IOPS (Input/Output Operations Per Second), i.e., the number of input/output operations per second), and the like.
The task metrics may include, for example:
start time (Start Time): the timestamp at which the task starts execution;
completion time (Completion Time): the timestamp at which the task completes execution;
execution duration (Execution Duration): the execution time of the task, i.e., the time difference from start to completion;
input records (Input Records): the number of input records processed by the task;
output records (Output Records): the number of output records generated by the task;
parallelism (Parallelism): the parallelism of the task, i.e., the number of task instances executed in parallel at the same time;
state size (State Size): the size of the state data maintained by the task;
network traffic (Network Bytes): the amount of data transmitted between tasks over the network;
buffer utilization (Buffer Utilization): the utilization rate of the buffers used by the task within a certain time period;
latency (Latency): the delay with which the task processes data, i.e., the time difference from the data entering the task to the completion of data processing;
failure restart count (Failure Restart Count): the number of times the task is restarted after failure;
data loss count (Data Loss Count): the amount of data the task has lost during processing;
consumption rate (Consumer Rate): the rate at which the task consumes data from the input source;
waiting time (Waiting Time): the time the task waits before executing.
It should be noted that, during the execution of a Flink task, multiple kinds of task index data may be collected and monitored in order to evaluate and optimize the performance of the application. These task index data may be collected and presented by Flink's monitoring tools (e.g., the Flink Web UI or the Flink Metrics API). By monitoring and analyzing the index data in real time, the performance, resource utilization and bottlenecks of the task can be understood, and performance tuning and troubleshooting can be carried out as needed.
The operator index may include, for example:
input records (Input Records): the number of input records processed by the operator;
output records (Output Records): the number of output records generated by the operator;
start time (Start Time): the timestamp at which the operator starts execution;
completion time (Completion Time): the timestamp at which the operator completes execution;
execution duration (Execution Duration): the execution time of the operator, i.e., the time difference from start to completion;
state size (State Size): the size of the state data maintained by the operator;
buffer utilization (Buffer Utilization): the utilization rate of the buffers used by the operator within a certain time period;
network traffic (Network Bytes): the amount of data transmitted between operators over the network;
latency (Latency): the delay with which the operator processes data, i.e., the time difference from the data entering the operator to the completion of data processing;
average processing time (Average Processing Time): the average time the operator takes to process each record;
progress (Progress): the progress of the operator's processing, expressed as the percentage of data already processed;
parallelism (Parallelism): the parallelism of the operator, i.e., the number of operator instances executed in parallel at the same time;
failure restart count (Failure Restart Count): the number of times the operator is restarted after failure;
data loss count (Data Loss Count): the amount of data the operator has lost during processing.
It should be noted that, while a Flink job is running, index data of each operator (Operator) may be collected and monitored in order to evaluate and optimize the performance of the operator. These operator index data may be collected and presented by Flink's monitoring tools (e.g., the Flink Web UI or the Flink Metrics API). By monitoring and analyzing the index data in real time, the performance, resource utilization and bottlenecks of the operator can be understood, and performance tuning and troubleshooting can be carried out as needed.
The checkpoint index may include, for example:
checkpoint start time (Checkpoint Start Time): the timestamp at which each Checkpoint starts execution;
checkpoint completion time (Checkpoint Completion Time): the timestamp at which each Checkpoint completes;
checkpoint duration (Checkpoint Duration): the execution time of each Checkpoint, i.e., the time difference from start to completion;
checkpoint data size (Checkpoint Data Size): the size of the data stored after the state is written during each Checkpoint;
maximum parallelism (Max Parallelism): the maximum parallelism setting of the job, i.e., the maximum number of tasks that can run simultaneously;
currently pending aligned Barrier (Current Aligned Barrier): the Barriers that all tasks are currently waiting to align;
acknowledged Barriers (Acknowledged Barriers): the number of Barriers that have been acknowledged by all tasks;
total completed Checkpoints (Total Completed Checkpoints): the total number of Checkpoints that have completed successfully.
It should be noted that Flink's Checkpoint is a fault-tolerance mechanism that ensures the data consistency of a streaming application when a fault occurs. The Checkpoint indexes reflect the performance and real-time status of Checkpoints. By monitoring these indexes, the behavior of Checkpoints can be understood, which helps with performance optimization, troubleshooting, and system monitoring.
The TaskManager index data may include, for example:
CPU Usage (CPU Usage): representing the use of CPU resources on the TaskManager node, typically in percent;
memory Usage (Memory Usage): representing the use condition of memory resources on the TaskManager node, wherein the use condition is usually represented by percentage;
network Traffic (Network Traffic): the data quantity transmitted by the network on the TaskManager node is represented, so that the utilization rate of the network bandwidth can be measured;
Heap memory usage (Heap Memory Usage): representing the use condition of Java heap memory on a task manager node, wherein the use condition comprises the allocated, used and available heap memory size;
non-heap memory usage (Non-heap Memory Usage): the service condition of non-heap memory (such as a thread stack and direct memory) on the task manager node is represented;
off-heap memory usage (Off-heap Memory Usage): representing the out-of-heap (Off-map) memory size used on the TaskManager node;
thread Count (Thread Count): representing the number of active threads on the TaskManager node, including threads for executing tasks, I/O, etc.;
number of connected tasks (Connected Task Count): representing the number of tasks (tasks) that the TaskManager node is currently connected to JobManager;
parallelism (Parallelism): representing the degree of task parallelism on the TaskManager node, namely the number of task instances executed in parallel at the same time;
network buffer usage (Network Buffer Usage): representing the use condition of network buffers on the TaskManager node, including the number of buffers allocated, free and used for data transmission;
the TaskManager state (TaskManager State): representing the current state of the TaskManager node, such as running, idle, stopped, etc.
It should be noted that during the execution of a Flink task, the index data of the TaskManager may be collected and monitored in order to evaluate and optimize the performance of the task. The TaskManager index data can be collected and exposed through Flink's monitoring tools (e.g., the Flink Web UI or the Flink Metrics API). By monitoring and analyzing the index data in real time, the resource utilization, performance bottlenecks and fault conditions of the TaskManager can be understood, and system tuning and troubleshooting can be performed as required.
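As a rough illustration of collecting TaskManager index data through Flink's REST monitoring interface, the sketch below builds a metrics query URL and converts a JSON reply into a dictionary. The base URL, the TaskManager id, and the selected metric names are assumptions for illustration; the metrics that are actually exposed depend on the Flink version and deployment.

```python
import json
from urllib.parse import quote

# Metric names assumed for illustration; verify them against your Flink deployment.
DEFAULT_METRICS = (
    "Status.JVM.CPU.Load",
    "Status.JVM.Memory.Heap.Used",
    "Status.JVM.Memory.NonHeap.Used",
    "Status.JVM.Threads.Count",
)

def taskmanager_metrics_url(base_url, taskmanager_id, metrics=DEFAULT_METRICS):
    """Build the URL that asks one TaskManager for the selected metrics."""
    tm = quote(taskmanager_id, safe="")
    return f"{base_url.rstrip('/')}/taskmanagers/{tm}/metrics?get={','.join(metrics)}"

def parse_metrics(body):
    """Turn a reply shaped like [{"id": ..., "value": ...}, ...] into {id: float}."""
    parsed = {}
    for item in json.loads(body):
        try:
            parsed[item["id"]] = float(item["value"])
        except (KeyError, TypeError, ValueError):
            continue  # skip entries that are missing or not numeric
    return parsed
```

The reply body would be fetched with any HTTP client on the timer that drives the collection task; keeping the parsing separate from the network call makes it easy to test.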
The JobManager index data may include, for example:
CPU Usage (CPU Usage): representing the use of CPU resources on JobManager nodes, typically expressed in percent;
memory Usage (Memory Usage): representing the use of memory resources on JobManager nodes, typically in percent;
network Traffic (Network Traffic): the data quantity transmitted by the network on the JobManager node is represented, so that the utilization rate of the network bandwidth can be measured;
heap memory usage (Heap Memory Usage): representing the use of Java heap memory on JobManager nodes, including allocated, used and available heap memory sizes;
non-heap memory usage (Non-heap Memory Usage): representing the use condition of non-heap memory (such as a thread stack and direct memory) on a JobManager node;
off-heap memory usage (Off-heap Memory Usage): representing the off-heap memory size used on the JobManager node;
thread Count (Thread Count): representing the number of active threads on JobManager nodes, including executing threads for job management, scheduling, communication, etc.;
job number (Job Count): representing the number of jobs currently running or completed on the JobManager node;
number of active tasks (Active Task Count): representing the number of tasks (tasks) currently running on the JobManager node;
number of failed restarts (Failure Restart Count): representing the number of restarts after a job failure occurred on the JobManager node;
JobManager State (JobManager State): representing the current state of the JobManager node, such as running, idle, stopped, etc.
It should be noted that, while a Flink task is running, the index data of the JobManager may be collected and monitored in order to evaluate and optimize the performance of the job. The JobManager index data may be collected and presented by Flink's monitoring tools (e.g., the Flink Web UI or the Flink Metrics API). By monitoring and analyzing the index data in real time, the resource utilization, performance bottlenecks and fault conditions of the JobManager can be understood, and system tuning and troubleshooting can be performed as required.
In step S120, after the set performance index is collected, the collected data may be preprocessed and feature extracted to extract feature data from the collected set performance index, so as to facilitate a more efficient and effective performance analysis. The pretreatment of the data may include cleaning, denoising, de-duplication, and the like, which are not limited thereto.
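The preprocessing and feature extraction of step S120 might look like the following minimal sketch. It assumes each collected metric arrives as `(timestamp, value)` pairs; the median-of-three smoothing and the chosen features (mean, max, last value) are illustrative assumptions, not the specific processing used by the method.

```python
from statistics import mean, median

def preprocess(samples):
    """Clean, de-duplicate and denoise a list of (timestamp, value) samples."""
    seen = set()
    values = []
    for ts, value in sorted(samples, key=lambda s: s[0]):
        if value is None or ts in seen:
            continue  # cleaning: drop missing values; de-duplication: one sample per timestamp
        seen.add(ts)
        values.append(float(value))
    smoothed = []
    for i, v in enumerate(values):
        if 0 < i < len(values) - 1:
            smoothed.append(median(values[i - 1:i + 2]))  # denoising: median of three
        else:
            smoothed.append(v)  # end points are kept unchanged
    return smoothed

def extract_features(values):
    """Summarize a cleaned series into a small feature dictionary."""
    if not values:
        return {"mean": 0.0, "max": 0.0, "last": 0.0}
    return {"mean": mean(values), "max": max(values), "last": values[-1]}
```

For example, the spike of 100 in `[(1, 10), (2, None), (2, 12), (3, 100), (3, 100), (4, 14)]` is suppressed by the median filter before the features are computed.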
In step S130, after the feature data is determined, the feature data may be input into a performance analysis model, and the performance analysis model may process the feature data and output a corresponding performance analysis result. This performance analysis result is recorded as the performance analysis result corresponding to the set performance index of the target real-time task.
The performance analysis result may include a performance state label, an optimization suggestion label, or both. The performance state label can indicate whether the operation of the target real-time task is abnormal and, when it is abnormal, the cause of the abnormality. The optimization suggestion label can indicate the optimization measure corresponding to the performance state label, which may include an adjustment to the configuration information of the target real-time task.
For example, if the performance state label indicates that the operator parallelism of the target real-time task is too low, the optimization suggestion label can indicate a prompt to adjust the parallelism. For another example, if the performance state label indicates that the operator performance of the target real-time task is low, the optimization suggestion label indicates a prompt to optimize the operator logic. For another example, if the performance state label indicates that the target real-time task has insufficient memory, the optimization suggestion label indicates that the memory resource configuration of the TaskManager should be adjusted. For another example, if the performance state label indicates that the target real-time task is running normally, the optimization suggestion label indicates that no adjustment is needed.
Besides the above performance state labels and their corresponding optimization suggestion labels, the performance analysis model of the method may output other performance state labels and corresponding optimization suggestion labels, which is not limited here.
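The pairing between performance state labels and optimization suggestion labels in the examples above can be pictured as a simple lookup table. The label names and suggestion strings below are hypothetical; in the method itself both labels are produced by the performance analysis model.

```python
from enum import Enum

class PerformanceState(Enum):
    """Hypothetical performance state labels taken from the examples in the text."""
    NORMAL = "normal"
    LOW_PARALLELISM = "low_parallelism"
    SLOW_OPERATOR = "slow_operator"
    INSUFFICIENT_MEMORY = "insufficient_memory"

# Hypothetical optimization suggestion labels, one per state label.
SUGGESTIONS = {
    PerformanceState.NORMAL: "no adjustment needed",
    PerformanceState.LOW_PARALLELISM: "adjust parallelism",
    PerformanceState.SLOW_OPERATOR: "optimize operator logic",
    PerformanceState.INSUFFICIENT_MEMORY: "adjust TaskManager memory configuration",
}

def suggestion_for(state):
    """Return the suggestion paired with a state label, if one is defined."""
    return SUGGESTIONS.get(state, "no suggestion defined")
```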
When determining the performance analysis model, a training set may first be constructed. The training set may comprise a plurality of training sample pairs, each including an input sample and an output sample. The output sample comprises a performance analysis result sample, and the input sample comprises the feature data sample of the set performance index corresponding to that performance analysis result sample.
First, the set performance index of a real-time task is collected at regular intervals and subjected to preprocessing and feature extraction, so as to obtain the feature data corresponding to the set performance index, which can be recorded as a feature data sample.
In addition, labels can be defined for the performance analysis result corresponding to the collected set performance index. The labels may include a performance state label, an optimization suggestion label, or both, and can be set according to actual requirements.
Then, this performance analysis result is noted as a performance analysis result sample. A pair of training sample pairs can be formed by the characteristic data samples and the corresponding performance analysis result samples. The characteristic data sample is used as an input sample, and the performance analysis result sample is used as an output sample. By the method, a plurality of training sample pairs can be constructed based on the set performance indexes and the corresponding performance analysis results in a period of time, so that the construction of the training set is completed.
After the training set is built, the training set can be used for training the original model, the test set or the verification set is used for evaluating the trained model, and if the accuracy rate of the model reaches the required set accuracy rate, the model can be determined to be a performance analysis model. If the accuracy rate does not reach the required set accuracy rate, the model can be further optimized until the accuracy rate reaches the required set accuracy rate, and then the model is determined to be a performance analysis model.
It should be noted that the setting accuracy may be set according to actual requirements, and specific numerical values thereof may not be limited. For example, the setting accuracy may be set to 98%, or may be set to another value.
In step S140, after the performance analysis result is obtained, the configuration information of the target real-time task may be adjusted based on the performance analysis result, so as to reduce the end-to-end data delay of the target real-time task in the real-time task chain, thereby ensuring good operation of the target real-time task.
When the performance analysis result includes the performance state label, the running condition of the target real-time task can be known once the performance state label is determined, and corresponding control measures can be adopted based on the information indicated by the performance state label. When the performance analysis result includes the optimization suggestion label, the configuration information of the target real-time task can be adjusted, once the optimization suggestion label is determined, based on the information indicated by the optimization suggestion label, so as to ensure good operation of the target real-time task. When the performance analysis result includes both the performance state label and the corresponding optimization suggestion label, the configuration information of the target real-time task can be adjusted based on the information indicated by both labels, so that good operation of the target real-time task is better ensured.
In some embodiments, while a target real-time task in the real-time task chain is running, performance analysis is performed on the feature data of the set performance index of the target real-time task through the performance analysis model. If the output performance state label indicates that an abnormality exists and that its cause is a data backlog type abnormality, and the optimization suggestion label output by the performance analysis model indicates that adaptive resource adjustment should be performed on the resource optimization parameters, the real-time task chain dynamic monitoring system can call the adaptive resource optimization interface through the rest api service, pass in the resource optimization parameters, and start an adaptive resource adjustment task thread, which adds one unit (set according to actual conditions) to the corresponding resource parameter and restarts the target real-time task. If the target real-time task still reports a task resource abnormality, one more unit is added to the corresponding resource parameter and the task is resubmitted. This is repeated until, within the time period in which the target real-time task is restarted, the amount of data consumed by the task is 20% greater than the amount of data written to the upstream message middleware (such as kafka). At that point the execution resources of the target real-time task are considered optimized, and the updated resource parameters and the operation process can be recorded to provide a history query function.
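The adaptive resource adjustment loop described above can be sketched as follows. The callback `restart_and_measure` is hypothetical: it stands for restarting the task with the given resource setting and returning the amount consumed by the task and the amount written upstream during the restart window. The upper bound `max_units` is an added safety assumption that the text does not mention.

```python
def adapt_resources(initial_units, restart_and_measure, step=1, max_units=64, margin=0.20):
    """Add one unit at a time until consumption exceeds upstream writes by `margin`.

    Returns (optimized, units, history), where history records every attempt
    so that it can later be queried.
    """
    units = initial_units
    history = []
    while units + step <= max_units:
        units += step
        consumed, written = restart_and_measure(units)  # hypothetical restart + measurement
        history.append((units, consumed, written))
        if consumed > written * (1 + margin):
            return True, units, history
    return False, units, history
```

For instance, if each resource unit consumes 110 records per window while upstream writes 500, starting from 3 units the loop stops at 6 units, the first setting whose consumption (660) exceeds 120% of the writes (600).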
According to the method, certain performance indexes of the target real-time task in the real-time task chain can be collected at regular intervals, the performance condition of the target real-time task can then be accurately determined based on the collected performance indexes and the performance analysis model, and when the target real-time task is abnormal, its configuration information is adjusted in time. This improves the execution of the target real-time task, better ensures the execution of the real-time task chain, and improves the user experience.
The embodiment provides a real-time task management method which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to FIG. 2, in the method, the data backlog level may be determined by:
S210, collecting data backlog information corresponding to a target real-time task;
S220, determining the quotient of the partition data backlog amount and the partition data consumption rate of the same corresponding partition;
S230, determining the data backlog level based on the quotients of all the corresponding partitions.
In step S210, the periodic tasks may be configured in the real-time task chain dynamic monitoring system to periodically collect data backlog information corresponding to the target real-time task in the message middleware (e.g., kafka).
The data backlog information comprises the data consumption rate of the target real-time task in the data processing system and the partition data backlog quantity of the corresponding partition in a set time length.
It should be noted that, the corresponding partition of the target real-time task in the data processing system may be recorded as a corresponding partition, and the corresponding partition of the target real-time task may be set according to the actual situation, which is not limited. In addition, the set time period may be set according to practical situations, which is not limited. The period corresponding to the periodic task can also be set according to practical situations, which is not limited. For example, the period may be 1s, the set period may be 10s or 1min, and so on.
In step S220, after the partition data consumption rates and the partition data backlog amounts of all the partitions corresponding to the target real-time task are collected, the data backlog of each partition can be analyzed.
The quotient of the partition data backlog amount and the partition data consumption rate of the same corresponding partition can be determined, and this quotient reflects the degree of data backlog of the corresponding partition.
In step S230, after determining all the quotients of the corresponding partitions of the target real-time task, the data backlog level of the target real-time task may be determined based on all the quotients.
The maximum value can be determined from the quotient of all the corresponding partitions, and then the maximum value is compared with the set backlog level information to determine the final data backlog level.
When the maximum value is smaller than or equal to the maximum set level in the set backlog level information, determining the set level corresponding to the maximum value in the set backlog level information as the data backlog level; and when the maximum value is larger than the maximum set level of the set backlog level information, determining the maximum set level as the data backlog level.
In some embodiments, the set backlog level information may include a total of 10 set backlog levels, numbered 0 to 9. In this embodiment, a periodic task may be configured in the real-time task chain dynamic monitoring system to determine at regular intervals, by analyzing the consumer data offsets of the nodes in kafka, whether data consumption is backlogged. Assume that the target real-time task has 10 corresponding partitions, that the partition data backlog amounts of the partitions are denoted x1, x2, ..., x10, and that the data consumption rate of a given consumer is denoted a.
In this embodiment, after the rest api service obtains the partition data backlog amounts and the data consumption rate, it may divide the partition data backlog amount of each partition by the data consumption rate, take the maximum of the resulting quotients, and compare that maximum with the 10 set backlog levels in the set backlog level information. If the maximum value is greater than 9, the data backlog level is determined to be 9. If the maximum value is less than or equal to 9, the level among the set backlog levels 0 to 9 that corresponds to the maximum value (the maximum value rounded down) is determined as the data backlog level.
The data backlog level is determined according to the following formula:
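$$\text{data backlog level} = \begin{cases} 9, & \max\left(\lfloor x_1/a \rfloor, \lfloor x_2/a \rfloor, \ldots, \lfloor x_{10}/a \rfloor\right) \ge 9 \\ \left\lfloor \max\left(x_1/a, x_2/a, \ldots, x_{10}/a\right) \right\rfloor, & \text{otherwise} \end{cases}$$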
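The level computation described above amounts to taking the largest per-partition quotient, rounding it down, and capping it at the highest set level. A minimal sketch follows; the function name is hypothetical, and treating a non-positive consumption rate as the worst level and an empty partition list as level 0 are assumptions, since the text does not cover these cases.

```python
import math

def data_backlog_level(partition_backlogs, consumption_rate, max_level=9):
    """Map per-partition backlog amounts and a consumption rate to a level 0..max_level."""
    if consumption_rate <= 0:
        return max_level  # assumption: a stalled consumer counts as the worst level
    worst = max((x / consumption_rate for x in partition_backlogs), default=0.0)
    return min(max_level, math.floor(worst))
```

If the backlog is counted in messages and the rate in messages per second, the quotient can be read as the number of seconds the consumer would need to drain the most backlogged partition.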
In this embodiment, after the rest api service determines the data backlog level, the data backlog level may be written into the relationship attribute corresponding to the target real-time task in the graph database to facilitate subsequent visual display, so that the relevant personnel can learn the data backlog level of the target real-time task in time.
It should be noted that the data backlog level may be determined by other means besides the above-described means, and is not limited thereto.
The method divides the data backlog level of the real-time task, and can determine the data backlog level of the real-time task through the quotient of the data backlog amount and the data consumption rate of each partition corresponding to the real-time task, so that the data backlog level of the real-time task can be simply and efficiently determined, and the performance analysis on the operation of the real-time task is facilitated.
The embodiment provides a real-time task management method which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 3, in the method, the data skew level may be determined by:
S310, collecting data skew information corresponding to a target real-time task;
S320, determining a data backlog average value and a data backlog standard deviation corresponding to the data skew information;
S330, determining the data skew level based on the data backlog average value and the data backlog standard deviation.
In step S310, the data skew information includes the partition data backlog amount of each corresponding partition of the target real-time task in the data processing system within a set duration.
It should be noted that, the collection of the backlog amount of the partition data may refer to the description related to step S210 in other embodiments, which is not described herein.
In step S320, after the partition data backlog amounts of all the corresponding partitions of the target real-time task are collected, the data backlog amount average value and the data backlog amount standard deviation corresponding to the target real-time task may be calculated based on all the partition data backlog amounts.
In step S330, after the data backlog average value and the data backlog standard deviation are determined, skew analysis may be performed on them to determine the data skew level.
In this step, the data skew level can be divided into three levels: no skew, slight skew and severe skew. If the data backlog standard deviation is less than the first percentage of the data backlog average value, the data skew level is determined to be no skew; if the data backlog standard deviation is greater than or equal to the first percentage of the data backlog average value and less than the second percentage of the data backlog average value, the data skew level is determined to be slight skew; if the data backlog standard deviation is greater than or equal to the second percentage of the data backlog average value, the data skew level is determined to be severe skew.
The first percentage and the second percentage may be set according to actual conditions, and their specific values are not limited.
In some embodiments, the set skew level information may be configured first. The set skew level information may include three skew levels, denoted no skew, slight skew and severe skew. Here 0 may be defined as the label for no skew, 1 as the label for slight skew, and 2 as the label for severe skew. The first percentage may be set to 20% and the second percentage may be set to 50%.
In this embodiment, the periodic task may be configured in the real-time task chain dynamic monitoring system for use in timing determination of the partition data backlog of each corresponding partition of the target real-time task. Assume that there are 10 partitions corresponding to the target real-time task, and the partition data backlog of each partition is denoted as x1, x2 … x10.
In this embodiment, after the rest api service obtains the partition data backlog amounts, the average value and the standard deviation of the 10 partition data backlog amounts may be calculated and recorded as the data backlog average value and the data backlog standard deviation, respectively.
The data skew level is determined as follows:
If the data backlog standard deviation is less than 20% of the data backlog average value, the value 0 is returned, that is, the data skew level is determined to be no skew; if the data backlog standard deviation is greater than or equal to 20% of the data backlog average value and less than 50% of the data backlog average value, the value 1 is returned, that is, the data skew level is determined to be slight skew; and if the data backlog standard deviation is greater than or equal to 50% of the data backlog average value, the value 2 is returned, that is, the data skew level is determined to be severe skew.
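The three-way classification above can be sketched as a short function. The text does not say whether the population or the sample standard deviation is meant; the population form (`statistics.pstdev`) is assumed here, and returning 0 when every partition has an empty backlog is likewise an assumption.

```python
from statistics import mean, pstdev

def data_skew_level(partition_backlogs, first_pct=0.20, second_pct=0.50):
    """Return 0 (no skew), 1 (slight skew) or 2 (severe skew) for a non-empty list."""
    avg = mean(partition_backlogs)
    if avg == 0:
        return 0  # assumption: no backlog anywhere means nothing is skewed
    std = pstdev(partition_backlogs)  # assumption: population standard deviation
    if std < first_pct * avg:
        return 0
    if std < second_pct * avg:
        return 1
    return 2
```

With ten partitions that each hold 100 pending records the result is 0, while a single partition holding the entire backlog yields 2.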
In this embodiment, after the rest api service determines the data skew level, the data skew level may be written into the node attribute corresponding to the target real-time task in the graph database to facilitate subsequent visual display, so that the relevant personnel can learn the data skew level of the target real-time task in time.
It should be noted that the data skew level may also be determined by means other than the one described above, which is not limited here.
The method grades the data skew of the real-time task and determines the data skew level from the data backlog amounts of the partitions corresponding to the real-time task, so that the data skew level of the real-time task can be determined simply and efficiently, which facilitates performance analysis of the running real-time task.
The embodiment provides a real-time task management method which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 4a, the method may include:
S410, acquiring target configuration information of a target real-time task;
S420, defining target node relationship data corresponding to the target real-time task based on the target configuration information;
S430, adding the target node relationship data to a graph database for real-time task management.
In step S410, the target real-time task refers to a real-time task that needs to be managed. Each real-time task may correspond to some configuration information. The configuration information corresponding to the target real-time task may be denoted as target configuration information.
In some embodiments, the real-time task may be configured by following the prompts of the task creation window, and the real-time task can be added, deleted, viewed, started, stopped, uploaded and downloaded. After the real-time task is created, it can be used as the target real-time task, and its configuration information in a configuration file or a configuration center can be automatically acquired and monitored. The configuration information of the producer of the message middleware (e.g., the kafka source end) and of the consumer of the message middleware (e.g., the kafka sink end) can be acquired by matching configuration keys against a fixed prefix.
The target configuration information may be obtained by other means besides the above-described means, and is not limited thereto.
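Selecting the producer and consumer settings by a fixed key prefix can be sketched as below. The prefixes `kafka.source.` and `kafka.sink.` and the flat key/value layout of the configuration are hypothetical; the actual prefix is whatever the deployment's configuration convention fixes.

```python
def split_kafka_config(config, source_prefix="kafka.source.", sink_prefix="kafka.sink."):
    """Split a flat configuration mapping into source-end and sink-end settings.

    Keys are returned with the matching prefix removed; unrelated keys are ignored.
    """
    source, sink = {}, {}
    for key, value in config.items():
        if key.startswith(source_prefix):
            source[key[len(source_prefix):]] = value
        elif key.startswith(sink_prefix):
            sink[key[len(sink_prefix):]] = value
    return source, sink
```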
In step S420, the target node relationship data includes a first node, a second node, a node relationship between the first node and the second node, and a relationship direction of the node relationship, where the first node represents the producer of the target real-time task (i.e., the kafka source end), the second node represents the consumer of the target real-time task (i.e., the kafka sink end), the node relationship represents the target real-time task, and the relationship direction represents the data processing direction of the target real-time task.
For example, after the configuration information of the target real-time task is acquired, the nodes and the relationship labels of the target real-time task may be defined based on the configuration information: the kafka source end and the kafka sink end may be defined as two nodes, the target real-time task may be defined as a relationship, and the direction from the kafka source end to the kafka sink end may be defined as the relationship direction.
The data source name of the kafka source end of the target real-time task can be defined as the name of the first node, the data source name of the kafka sink end of the target real-time task can be defined as the name of the second node, and the name of the target real-time task can be defined as the name of the node relationship.
The target node relationship data of the target real-time task may be defined in the above manner or in other manners, which is not limited here.
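One way to picture the node relationship data is as a directed edge between two named nodes. The classes below are an in-memory stand-in for the graph database, written only to illustrate the data model; the class and method names are hypothetical and do not correspond to any particular graph database API.

```python
from dataclasses import dataclass, field

@dataclass
class TaskRelation:
    """One real-time task, modeled as a directed relationship between two nodes."""
    name: str    # real-time task name
    source: str  # first node: data source name of the kafka source end
    sink: str    # second node: data source name of the kafka sink end
    attributes: dict = field(default_factory=dict)  # relationship attribute data

@dataclass
class TaskChainGraph:
    """Minimal in-memory substitute for the graph database (illustration only)."""
    nodes: dict = field(default_factory=dict)      # node name -> node attribute data
    relations: list = field(default_factory=list)  # TaskRelation objects

    def add_task(self, relation):
        # Create both end points if they are missing, then store the directed edge.
        self.nodes.setdefault(relation.source, {})
        self.nodes.setdefault(relation.sink, {})
        self.relations.append(relation)

    def downstream(self, node):
        """Names of the nodes written by tasks that read from `node`."""
        return [r.sink for r in self.relations if r.source == node]
```

Because the sink of one task can be the source of the next, adding tasks one by one naturally builds up the task chain.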
In addition, after the definition of the target node relationship data is completed, node attribute data and/or relationship attribute data may be collected. The node attribute data refers to attribute data of the first node and the second node. The relationship attribute data refers to attribute data of a relationship between the first node and the second node, that is, attribute data of a target real-time task.
The node attribute data may include at least one of: data production rate, total number of data productions, data production storage rate, total data production storage size, node status, data skew level, total number of node partitions, node topic name, and node message middleware cluster name (e.g., node kafka cluster name).
The relationship attribute data may include at least one of: data consumption rate, total number of data consumption, consumer group name, task running state, data backlog level, and data consumption backlog reason.
In some embodiments, a performance index collection task can be configured in the real-time task chain dynamic monitoring system to periodically collect node attribute data. The attribute data can include: data production rate, total number of data productions, data production storage rate, total data production storage size, node status, partition data skew level, total number of node partitions, node topic name, node kafka cluster name, and others. Each attribute is defined as follows:
data production rate attribute: the difference in the total number of data productions over one timing task period, divided by the time difference of the timing task;
data production total number attribute: the timing task queries the number of messages written to each partition in kafka, and the sum over the partitions is the total number of messages written;
data production storage rate attribute: the difference in data production storage size over one timing task period, divided by the time difference of the timing task;
data production storage total size attribute: the timing task queries the physical disk space occupied by the topic in kafka;
node state attribute: whether the node is used by a running real-time task; if so, the state is in use, otherwise the state is offline;
node partition total number attribute: the total number of partitions of the topic in kafka;
node topic name attribute: the topic name of the node in kafka;
node kafka cluster name attribute: the kafka data source name in the data source management system.
It should be noted that, in addition to the above attribute data, the node attribute data may include other attribute data set according to service requirements, which is not limited here.
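The rate attributes above are all differences between two consecutive collections divided by the elapsed time. A minimal sketch, assuming timestamps in seconds and a hypothetical function name:

```python
def rate_between(prev_total, prev_ts, curr_total, curr_ts):
    """Rate of change of a running total between two collection times (per second)."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("the current timestamp must be later than the previous one")
    return (curr_total - prev_total) / elapsed
```

For example, if the total number of messages written grows from 1000 to 1600 across a 60-second collection period, the data production rate is 10 messages per second. The same calculation applies to the storage rate when the totals are storage sizes.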
In this embodiment, a performance index collection task can also be configured in the real-time task chain dynamic monitoring system to periodically collect the relationship attribute data. The relationship attribute data may include: data consumption rate, total number of data consumption, consumer group name, task running state, data consumption backlog level, data consumption backlog cause, and others. The relationship attribute data are defined as follows:
data consumption rate attribute: the difference in the total number of data consumption over one timing task period, divided by the time difference of the timing task;
data consumption total number attribute: the timing task queries the number of messages consumed from each partition in kafka, and the sum over the partitions is the total number of messages consumed;
consumer group name attribute: the kafka consumer group name;
task running state attribute: the task running state is either running or offline.
It should be noted that, in addition to the above attribute data, the relationship attribute data may include other attribute data set according to service requirements, which is not limited here.
It should also be noted that the node attribute data and the relationship attribute data may be collected in ways other than those described above, which is not limited here.
In step S430, after the definition of the target node relationship data is completed, the target node relationship data may be added to the graph database for real-time task management, for use in the subsequent generation of the real-time task chain graph (e.g., as shown in fig. 4b).
In addition, after the node attribute data and the relationship attribute data are acquired based on the target node relationship data, the acquired node attribute data and relationship attribute data can be written into the graph database.
In the method, a task chain map corresponding to a real-time task chain can be drawn based on node relation data in a graph database. The real-time task chain comprises at least one real-time task, and the task chain graph comprises a task graph (such as shown in fig. 4c and 4 d) of the at least one real-time task, wherein the task graph represents node relation data of the real-time task.
In addition, when node attribute data and relationship attribute data of a real-time task exist in the graph database, the node attribute data and relationship attribute data may be displayed in a task map of the real-time task. It should be noted that when the task map is drawn, the task map of the corresponding real-time task can be drawn in different forms based on different types of real-time tasks, so that the related personnel can distinguish and know the running condition of each real-time task more clearly.
In the method, a timing task can be configured in the real-time task chain dynamic monitoring system, so that the real-time task chain dynamic monitoring system can display a task chain map on a front-end display interface based on the timing task.
In some embodiments, referring to fig. 4b, 4c, 4d, and 4e, node labels and node relationship labels of real-time computing tasks are defined in a graph database, and the node relationship direction points from the kafka source end to the kafka sink end. The nodes include ods1, dwd1, dws1, app1, ods2, dwd2, dws2, app2, dim1, dim2, app3, and the like, and the node relationships include dwd task 1, dws task 1, app1 task, dwd task 2, dws task 2, app2 task, app3 task, and the like.
Various indexes can also be defined as node attribute data and relationship attribute data. In this embodiment, by monitoring the addition, modification, and deletion of the configuration files (i.e., files including configuration information) corresponding to all real-time tasks, the relevant node information and relationship information can be automatically extracted and synchronously written into the graph database, which facilitates drawing the task chain graph.
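The extraction of node relation data from task configuration can be sketched as follows. This is a minimal illustration, not the implementation described in the application: the in-memory `TaskGraph` stands in for the graph database, and the configuration keys `task_name`, `producer_topic`, and `consumer_topic` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class TaskRelation:
    """One real-time task, modelled as a directed relation between two nodes."""
    producer: str  # first node; data flows from this node
    consumer: str  # second node; data flows to this node
    task: str      # relation label: the real-time task itself


@dataclass
class TaskGraph:
    """In-memory stand-in for the graph database (an assumption of this sketch)."""
    relations: Dict[str, TaskRelation] = field(default_factory=dict)

    @property
    def nodes(self) -> Set[str]:
        # Nodes are derived from the relations, so deleting a task also
        # drops nodes that no remaining task refers to.
        return {n for r in self.relations.values() for n in (r.producer, r.consumer)}

    def upsert(self, config: Dict[str, str]) -> TaskRelation:
        """Handle an added or modified configuration file (hypothetical keys)."""
        rel = TaskRelation(config["producer_topic"], config["consumer_topic"],
                           config["task_name"])
        self.relations[rel.task] = rel
        return rel

    def remove(self, task_name: str) -> None:
        """Handle a deleted configuration file."""
        self.relations.pop(task_name, None)
```

With the node names of fig. 4b, calling `upsert` for "dwd task 1" (ods1 to dwd1) and "dws task 1" (dwd1 to dws1) yields three nodes and two directed relations; removing "dwd task 1" leaves only dwd1 and dws1.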
The front-end display page can acquire a task chain map of a real-time calculation task chain from the graph database through timing tasks. The topological structure of the task chain can be drawn in the atlas tool according to the structure and the relation of the task chain so as to form a task chain atlas.
In mapping, appropriate colors, shapes and styles may be used to distinguish between different types of real-time tasks, as well as links representing data flows and dependencies. According to the execution sequence and the dependency relationship of the real-time tasks, nodes and connecting lines are reasonably laid out, and the task chain is comprehensively displayed.
By configuring a data update period at the visualization front end, the real-time task chain data is queried periodically and written into the graph database, and the relations, nodes, and attributes of the labelled real-time task chain are displayed dynamically and visually at the front end.
A relation whose task running state is offline and a node whose node state is unused are displayed in gray. The color of a relation line reflects the data consumption backlog level attribute, using 10 color levels from green to red. The data skew level is displayed in 3 color levels from green to red, corresponding to the 3 skew levels respectively. The write volume of each topic data node and the offset and backlog amount of each of its group ids are collected periodically, the production speed and the consumption speed are calculated, and the data is written into a time sequence database. By querying the time sequence database, the production, consumption, and speed data of a topic within a historical time range can be displayed at the node and drawn as a line chart for visual display. In addition, related data nodes can be selected to generate a data production rate funnel diagram of the task chain data nodes, as shown in fig. 4e.
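As a rough sketch of the calculations behind this display, the production speed, consumption speed, and backlog amount can be derived from two consecutive periodic samples, and a level can be mapped onto a green-to-red ramp. The `OffsetSample` shape, the function names, and the linear color ramp are assumptions made for illustration; the application does not specify them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class OffsetSample:
    """One periodic reading for a topic and one group id (hypothetical shape)."""
    timestamp: float       # seconds
    end_offset: int        # messages written to the topic so far
    committed_offset: int  # messages consumed by the group so far


def rates_and_backlog(prev: OffsetSample, curr: OffsetSample) -> Tuple[float, float, int]:
    """Return (production speed, consumption speed, backlog) for two samples."""
    dt = curr.timestamp - prev.timestamp
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")
    production = (curr.end_offset - prev.end_offset) / dt
    consumption = (curr.committed_offset - prev.committed_offset) / dt
    backlog = curr.end_offset - curr.committed_offset
    return production, consumption, backlog


def level_color(level: int, levels: int) -> str:
    """Map level 1..levels onto a linear green-to-red ramp as a hex color."""
    if not 1 <= level <= levels:
        raise ValueError("level out of range")
    t = 0.0 if levels == 1 else (level - 1) / (levels - 1)
    red = round(255 * t)
    green = round(255 * (1 - t))
    return f"#{red:02x}{green:02x}00"
```

For example, `level_color(1, 10)` gives pure green and `level_color(10, 10)` gives pure red, which matches the 10-level backlog display; calling it with `levels=3` covers the 3 skew levels.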
It should be noted that the visual display of the real-time task chain may be implemented in the above manner or in other manners, which is not limited here.
The construction and business data processing of the real-time task chain can comprise a plurality of real-time tasks, and the application can provide a unified task management function based on the real-time task management window, so that the task submitting, executing, offline and other operations of the real-time task can be conveniently operated.
In the related technology, in the construction of a real-time task chain and the processing of service data, a real-time calculation task link is long and complex, a data flow link is unclear, and the task relationship between real-time tasks and the upstream and downstream data sources of the tasks cannot be clearly displayed. The method and the system can be used for visually displaying the full-link data real-time task flow chart from the data access to each real-time task and then to the data application, and the relation among all the real-time tasks and the upstream and downstream data sources are clear at a glance.
In the related art, the data flow of a real-time task chain is complex. In business scenarios with high requirements on data timeliness, when a data backlog delay occurs, all real-time tasks involved from the data application back to the data source need to be checked one by one, so the working efficiency is low and problems are not found in time. The method and the system can realize dynamic monitoring of the task chain, monitor the running state of each task in real time, and feed back in time how the data of each node changes during real-time calculation, so that a manager can quickly learn how the data processing efficiency is changing and whether the real-time tasks have data backlog delay. In combination with the performance analysis model, performance problems in the real-time calculation of the data can be automatically located within the full-link real-time tasks and automatically optimized, so that the timeliness and effectiveness of the data are ensured.
The embodiment provides a real-time task chain dynamic monitoring system, which is shown in reference to fig. 4f, and may include a data source management unit, a real-time task management unit, a configuration center, a storage unit, a visual display unit, a timing task management unit, a performance index acquisition unit and a performance analysis unit, wherein the performance analysis unit is provided with a performance analysis model.
In the system, S1 represents creating a real-time task; in this step, the task management unit selects a producer data source and a consumer data source from the data source management unit.
S2 represents synchronizing the configuration information of the created real-time task to the configuration center.
S3 represents parsing the configuration information of the real-time task and writing information such as the producer data source, the consumer data source, and the specific task content corresponding to the real-time task into the graph database of the storage unit. The producer data source can be used as the first node in the target node relation data, the consumer data source can be used as the second node, and the target real-time task can be used as the node relation between the first node and the second node, with the relation direction pointing from the producer data source to the consumer data source. In addition, in step S3, when the configuration information of the real-time task is modified, the modified information may be synchronized to the graph database.
S4.1 represents the timed task management unit distributing, to the performance index acquisition unit, a timed task for periodically collecting the set performance indexes.
S4.2 represents the timed task management unit distributing a timed task for visual display to the visual display unit.
S5 represents the performance index acquisition unit periodically collecting, based on the timed task distributed by the timed task management unit, the set performance indexes corresponding to the real-time task from the task management unit.
S6 represents the performance index acquisition unit transmitting the periodically collected set performance indexes to the storage unit, for example to the graph database and the time sequence database of the storage unit. The graph database is used for storing the task chain graph, and the time sequence database is used for storing the historical set performance indexes, which may be stored in graph form or in other forms, which is not limited here.
S7 represents the performance index acquisition unit transmitting the collected set performance indexes to the performance analysis unit.
S8 represents the performance analysis unit transmitting the performance analysis result to the storage unit, for example to the graph database of the storage unit.
S9 represents the performance analysis unit transmitting the performance analysis result to the task management unit, so that the task management unit can adjust the configuration information of the real-time task based on the performance analysis result, thereby reducing the end-to-end data delay of the real-time task and, in turn, the data delay of the whole real-time task chain.
S10 represents the visual display unit visually displaying the data stored in the storage unit based on the timed task for visual display; the visual display unit can display the task chain graph based on the data in the graph database and can display the historical set performance indexes based on the data in the time sequence database.
According to the system, after the performance analysis result is determined based on the set performance index acquired at regular time, the performance analysis unit can act on the task management unit, the task management unit can adjust the configuration information of the target real-time task based on the performance analysis result, and end-to-end data delay of the target real-time task in the real-time task chain is reduced, so that good operation of the target real-time task is guaranteed.
The embodiment provides a real-time task management device which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 5, the apparatus may be used to implement the real-time task management method described above. For example, the apparatus may include an acquisition module 10, a determination module 20, and a control module 30.
The acquisition module 10 is used for periodically collecting set performance indexes of the target real-time task; wherein the set performance indexes include at least one of: data backlog level, data skew level, system index, task index, operator index, checkpoint index, TaskManager index, and JobManager index;
a determining module 20, configured to determine feature data of the set performance index;
and further configured to input the feature data into a performance analysis model so as to determine a performance analysis result corresponding to the set performance index; wherein the performance analysis result comprises a performance state label and/or an optimization suggestion label;
and a control module 30 for controlling configuration information of the real-time task based on the performance analysis result.
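The cooperation of the three modules can be sketched as one collect, analyse, and adjust cycle. This is only an illustration under stated assumptions: `toy_model` is a rule-based stand-in for the trained performance analysis model, and the feature names, the label values, and the "double the parallelism" adjustment are hypothetical and not taken from the application.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical feature layout; a fixed order keeps the model input stable.
FEATURE_ORDER = ["data_backlog_level", "data_skew_level", "checkpoint_duration_ms"]

Model = Callable[[List[float]], Dict[str, str]]


def to_features(metrics: Dict[str, float]) -> List[float]:
    """Turn one collected metric snapshot into a fixed-order feature vector."""
    return [float(metrics.get(name, 0.0)) for name in FEATURE_ORDER]


def toy_model(features: List[float]) -> Dict[str, str]:
    """Rule-based stand-in for the trained performance analysis model."""
    if features[0] >= 5:
        return {"state": "backlogged", "suggestion": "increase_parallelism"}
    return {"state": "healthy", "suggestion": "none"}


def manage_once(metrics: Dict[str, float], model: Model,
                config: Dict[str, int]) -> Tuple[Dict[str, int], Dict[str, str]]:
    """Run one analyse-and-adjust cycle; the input config is left unchanged."""
    result = model(to_features(metrics))
    new_config = dict(config)
    if result["suggestion"] == "increase_parallelism":
        # Hypothetical adjustment rule, chosen only for this example.
        new_config["parallelism"] = config["parallelism"] * 2
    return new_config, result
```

In a running system a timed task would call `manage_once` with each newly collected snapshot; keeping the model behind a plain callable makes it easy to swap the stand-in for a trained model.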
The embodiment provides a real-time task management device which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 5, in the apparatus,
the acquisition module 10 is used for acquiring data backlog information corresponding to the target real-time task; the data backlog information comprises the data consumption rate of the target real-time task in the data processing system and the partition data backlog quantity of the corresponding partition in a set time length;
a determining module 20, configured to determine a quotient of the partition data backlog amount and the partition data consumption rate of the same corresponding partition;
and may also be used to determine a data backlog level based on the quotient of all corresponding partitions.
The embodiment provides a real-time task management device which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 5, in the apparatus, the determining module 20 may be configured to:
determine the data backlog level based on the maximum value of the quotients of the corresponding partitions and set backlog level information;
when the maximum value is smaller than or equal to the maximum set level in the set backlog level information, determining the set level corresponding to the maximum value in the set backlog level information as the data backlog level; and when the maximum value is larger than the maximum set level of the set backlog level information, determining the maximum set level as the data backlog level.
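The backlog level rule above can be sketched as follows. The threshold list `LEVEL_BOUNDS` (read here as seconds needed to drain a partition) and the treatment of a partition with zero consumption rate are assumptions for illustration; the application only states that the maximum quotient is looked up in the set backlog level information and capped at the maximum set level.

```python
from bisect import bisect_left
from typing import Sequence

# Hypothetical set backlog level information: upper bound of each level,
# in seconds needed to drain the backlog. Index + 1 is the level.
LEVEL_BOUNDS = [1, 5, 10, 30, 60, 120, 300, 600, 1800, 3600]


def backlog_level(backlogs: Sequence[float], rates: Sequence[float],
                  bounds: Sequence[float] = LEVEL_BOUNDS) -> int:
    """Level from per-partition backlog amounts and consumption rates."""
    if not backlogs or len(backlogs) != len(rates):
        raise ValueError("need one backlog and one rate per partition")
    quotients = []
    for lag, rate in zip(backlogs, rates):
        if rate > 0:
            quotients.append(lag / rate)
        elif lag > 0:
            # Assumption: a stalled partition with backlog counts as worst case.
            quotients.append(float("inf"))
        else:
            quotients.append(0.0)
    worst = max(quotients)
    if worst > bounds[-1]:
        return len(bounds)                 # cap at the maximum set level
    return bisect_left(bounds, worst) + 1  # first level whose bound covers it
```

For two partitions with backlogs of 100 and 300 messages consumed at 50 and 10 messages per second, the quotients are 2 and 30 seconds, the maximum is 30, and the function returns level 4 under the assumed bounds.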
The embodiment provides a real-time task management device which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 5, in the apparatus,
the acquisition module 10 is used for acquiring data skew information corresponding to a target real-time task; the data skew information comprises the partition data backlog amount of each corresponding partition of the target real-time task in the data processing system within a set time length;
the determining module 20 may be configured to determine a data backlog average value and a data backlog standard deviation from the partition data backlog amounts in the data skew information;
and may also be used for determining the data skew level based on the data backlog average value and the data backlog standard deviation.
The embodiment provides a real-time task management device which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 5, in the apparatus, the determining module 20 may be configured to perform at least one of:
if the standard deviation of the data backlog amount is less than a first percentage of the average value of the data backlog amount, determining the data skew level as not skewed;
if the standard deviation of the data backlog amount is greater than or equal to the first percentage of the average value of the data backlog amount and less than a second percentage of the average value of the data backlog amount, determining the data skew level as slightly skewed;
if the standard deviation of the data backlog amount is greater than or equal to the second percentage of the average value of the data backlog amount, determining the data skew level as severely skewed.
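The three-way classification above can be sketched as follows. The default percentages (20% and 50%), the use of the population standard deviation, and the handling of an all-zero backlog are assumptions for illustration; the application leaves the first and second percentages unspecified.

```python
from statistics import mean, pstdev
from typing import Sequence


def skew_level(partition_backlogs: Sequence[float],
               first_pct: float = 0.2, second_pct: float = 0.5) -> str:
    """Classify data skew from the backlog amounts of the partitions."""
    if not partition_backlogs:
        raise ValueError("need at least one partition")
    avg = mean(partition_backlogs)
    if avg == 0:
        # Assumption: with no backlog anywhere there is nothing to skew.
        return "not skewed"
    std = pstdev(partition_backlogs)  # population standard deviation
    if std < first_pct * avg:
        return "not skewed"
    if std < second_pct * avg:
        return "slightly skewed"
    return "severely skewed"
```

For example, backlogs of 100, 130, and 70 have a mean of 100 and a standard deviation of about 24.5, which falls between the assumed 20% and 50% bounds and is therefore classified as slightly skewed.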
The embodiment provides a real-time task management device which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 5, the apparatus may include a definition module 40 and an addition module 50.
The acquisition module 10 is used for acquiring target configuration information of a target real-time task;
the definition module 40 is configured to define target node relationship data corresponding to a target real-time task based on the target configuration information; the target node relation data comprises a first node, a second node, a node relation between the first node and the second node and a relation direction of the node relation, wherein the first node represents a producer of a target real-time task, the second node represents a consumer of the target real-time task, the node relation represents the target real-time task, and the relation direction represents a data processing direction of the target real-time task;
The adding module 50 may be configured to add the target node relationship data to a graph database for real-time task management.
The embodiment provides a real-time task management device which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 5, in the apparatus,
the collection module 10 is configured to collect node attribute data and/or relationship attribute data after defining target node relationship data corresponding to a target real-time task based on the target configuration information;
wherein the node attribute data includes at least one of: data production rate, total number of data production, data production storage rate, total size of data production storage, node status, data tilt level, total number of node partitions, node topic name and node message middleware cluster name;
the relationship attribute data includes at least one of: data consumption rate, total number of data consumption, consumer group name, task running state, data backlog level, and data consumption backlog reason.
The embodiment provides a real-time task management device which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 5, the apparatus may include a drawing module 60 and a display module 70.
The drawing module 60 is configured to draw a task chain graph corresponding to the real-time task chain based on the node relationship data in the graph database; the real-time task chain comprises at least one real-time task, the task chain map comprises a task map of the at least one real-time task, and the task map represents node relation data of the real-time task;
the display module 70 may be configured to display a task chain graph on the front-end presentation interface based on the timed task.
The embodiment provides a real-time task management device which can be applied to electronic equipment, and the electronic equipment can be provided with a real-time task chain dynamic monitoring system. Referring to fig. 5, in the apparatus, a drawing module 60 may be used to:
and drawing task maps of the corresponding real-time tasks in different forms based on the different types of real-time tasks.
The embodiment provides an electronic device. The electronic device may be an electronic device in various fields, for example: large-screen devices, artificial intelligence (AI) speakers, and high-fidelity (HiFi) speakers in the smart home field; cell phones, tablet computers, wearable devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, and personal digital assistants (PDA) in the intelligent terminal field; and logistics vehicles and intelligent shelves in the intelligent manufacturing field. The specific type of the electronic device is not limited by the embodiments of the present application.
Referring to fig. 6, the electronic device 100 includes at least one processor 101, a memory 102, a user interface 103, and at least one network interface 104. The various components in the electronic device 100 are coupled together by a bus system 105. It is understood that the bus system 105 is used to enable connection and communication between these components. In addition to a data bus, the bus system 105 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, the various buses are all labeled as the bus system 105.
The user interface 103 may include a display, a keyboard, or a pointing device (e.g., a mouse, a trackball, a touch pad, or a touch screen).
It is to be appreciated that the memory 102 in embodiments of the present application may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 102 described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
In some implementations, the memory 102 stores the following elements, executable units or data structures, or a subset thereof, or an extended set thereof: an operating system 1021, and application programs 1022.
The operating system 1021 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, for implementing various basic services and processing hardware-based tasks. The application programs 1022 include various application programs such as a Media Player (Media Player), a Browser (Browser), and the like for implementing various application services. A program for implementing the method of the embodiment of the present application may be included in the application program 1022.
In the embodiment of the present application, the processor 101 is configured to execute the methods provided in the method embodiments by calling a program or an instruction stored in the memory 102, specifically, a program or an instruction stored in the application 1022.
The method disclosed in the embodiments of the present application may be applied to the processor 101 or implemented by the processor 101. The processor 101 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 101 or by instructions in the form of software. The processor 101 described above may be a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly as being performed by a hardware decoding processor, or performed by a combination of hardware and software elements in a decoding processor. The software elements may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 102, and the processor 101 reads the information in the memory 102 and, in combination with its hardware, performs the steps of the above method.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing units may be implemented within one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processor, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field programmable gate arrays (Field-Programmable Gate Array, FPGA), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units designed to perform the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented by means of units that perform the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
The embodiment of the application also provides a storage medium (computer readable storage medium). The storage medium here stores one or more programs. Wherein the storage medium may comprise volatile memory, such as random access memory; the memory may also include non-volatile memory, such as read-only memory, flash memory, hard disk, or solid state disk; the memory may also comprise a combination of the above types of memories.
The one or more programs in the storage medium are executable by one or more processors. When the storage medium is applied to the electronic device, the method performed at the electronic device as described above may be implemented. The processor is configured to execute a control program of the electronic device stored in the memory, so as to implement the method performed at the electronic device.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of function in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It should be noted that references in the specification to "one implementation," "an embodiment," "an example embodiment," "some embodiments," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Furthermore, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or electronic device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or electronic device. Without further limitations, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or electronic device comprising the element.
The above embodiments are merely preferred embodiments for the purpose of fully explaining the present application, and the scope of the present application is not limited thereto. Equivalent substitutions and modifications will occur to those skilled in the art based on the present application, and are intended to be within the scope of the present application.
Claims (13)
1. A real-time task management method, characterized in that the real-time task management method comprises:
periodically collecting set performance indexes of a target real-time task; wherein the set performance indexes include at least one of: data backlog level, data skew level, system index, task index, operator index, checkpoint index, TaskManager index, and JobManager index;
determining characteristic data of the set performance index;
inputting the characteristic data into a performance analysis model to determine a performance analysis result corresponding to the set performance index; wherein the performance analysis result comprises a performance state label and/or an optimization suggestion label;
and controlling the configuration information of the real-time task based on the performance analysis result.
2. The real-time task management method of claim 1, wherein the data backlog level is determined by:
collecting data backlog information corresponding to the target real-time task; the data backlog information comprises the data consumption rate of the target real-time task in the data processing system and the partition data backlog quantity of the corresponding partition within a set time length;
Determining the quotient of the partition data backlog quantity and the partition data consumption rate of the same corresponding partition;
the data backlog level is determined based on the quotient of all corresponding partitions.
3. The method of real-time task management according to claim 2, wherein said determining the data backlog level based on the quotient of all corresponding partitions comprises:
determining the data backlog level based on the maximum value of the quotients of the corresponding partitions and set backlog level information;
when the maximum value is smaller than or equal to the maximum set level in the set backlog level information, determining the corresponding set level of the maximum value in the set backlog level information as the data backlog level; and when the maximum value is larger than the maximum set level of the set backlog level information, determining the maximum set level as the data backlog level.
4. The real-time task management method according to claim 1, wherein the data skew level is determined by:
collecting data inclination information corresponding to the target real-time task; the data inclination information comprises partition data backlog amount of a corresponding partition of the target real-time task in a data processing system within a set time length;
Determining a data backlog average value and a data backlog standard deviation corresponding to the data backlog information;
and determining the data inclination grade based on the data backlog average value and the data backlog standard deviation.
5. The real-time task management method of claim 4, wherein said determining said data skew level based on said data backlog average and said data backlog standard deviation comprises at least one of:
if the standard deviation of the data backlog is smaller than the first percentage of the average value of the data backlog, determining the data inclination grade as non-inclination;
if the standard deviation of the data backlog is greater than or equal to a first percentage of the average value of the data backlog and less than a second percentage of the average value of the data backlog, determining the data inclination grade as slightly inclined;
and if the standard deviation of the data backlog quantity is greater than or equal to the second percentage of the average value of the data backlog quantity, determining the data inclination grade as serious inclination.
6. The real-time task management method of claim 1, wherein the performance analysis model is determined by:
Constructing a training set; the training set comprises a plurality of training sample pairs, wherein each training sample pair comprises an input sample and an output sample, each output sample comprises a performance analysis result sample, and each input sample comprises a characteristic data sample of a set performance index corresponding to the corresponding performance analysis result sample;
and training the original model based on the training set to determine the performance analysis model meeting the set accuracy.
7. The real-time task management method according to any one of claims 1 to 6, characterized in that the real-time task management method comprises:
acquiring target configuration information of the target real-time task;
defining target node relation data corresponding to the target real-time task based on the target configuration information; the target node relation data comprises a first node, a second node, a node relation between the first node and the second node and a relation direction of the node relation, wherein the first node represents a producer of the target real-time task, the second node represents a consumer of the target real-time task, the node relation represents the target real-time task, and the relation direction represents a data processing direction of the target real-time task;
and adding the target node relation data to a graph database for real-time task management.
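The node relation data described in this claim can be pictured as a directed, labelled edge. The sketch below is illustrative: `InMemoryGraph` is a stand-in for a real graph database, and the configuration keys (`source_topic`, `sink_topic`, `task_name`) are hypothetical names not taken from the source.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class NodeRelation:
    producer: str  # first node: producer of the real-time task
    consumer: str  # second node: consumer of the real-time task
    task: str      # the relation itself represents the real-time task

@dataclass
class InMemoryGraph:
    """Stand-in for a graph database; a real deployment would use its client API."""
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_relation(self, rel):
        self.nodes.update((rel.producer, rel.consumer))
        # Tuple order encodes the relation direction: producer -> consumer.
        self.edges.append((rel.producer, rel.task, rel.consumer))

def relation_from_config(config):
    """Derive node relation data from task configuration (hypothetical keys)."""
    return NodeRelation(config["source_topic"], config["sink_topic"], config["task_name"])

graph = InMemoryGraph()
graph.add_relation(relation_from_config(
    {"task_name": "clean_orders", "source_topic": "orders_raw", "sink_topic": "orders_clean"}
))
```

Modelling the task as the edge rather than as a node means a chain of tasks appears directly as a path through the graph.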
8. The method for real-time task management according to claim 7, wherein after defining the target node relationship data corresponding to the target real-time task based on the target configuration information, the method for real-time task management comprises:
collecting node attribute data and/or relationship attribute data;
wherein the node attribute data includes at least one of: data production rate, total quantity of data produced, storage rate of produced data, total storage size of produced data, node status, data skew level, total number of node partitions, node topic name, and node message middleware cluster name;
the relationship attribute data includes at least one of: data consumption rate, total quantity of data consumed, consumer group name, task running state, data backlog level, and reason for the data consumption backlog.
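Because each attribute set is "at least one of" the listed items, one way to represent it is a record whose fields are all optional. The sketch below covers only a subset of the listed attributes; the field names, units, and example values are assumptions.

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class NodeAttributes:
    topic_name: Optional[str] = None
    cluster_name: Optional[str] = None
    partition_count: Optional[int] = None
    production_rate: Optional[float] = None  # records per second (assumed unit)
    skew_level: Optional[str] = None

@dataclass
class RelationAttributes:
    consumer_group: Optional[str] = None
    consumption_rate: Optional[float] = None  # records per second (assumed unit)
    running_state: Optional[str] = None
    backlog_level: Optional[str] = None
    backlog_reason: Optional[str] = None

def collected(attrs):
    """Keep only the attributes that were actually collected (non-None)."""
    return {key: value for key, value in asdict(attrs).items() if value is not None}

node_props = collected(NodeAttributes(topic_name="orders_raw", partition_count=12))
edge_props = collected(
    RelationAttributes(consumer_group="clean_orders_cg", running_state="RUNNING")
)
```

The resulting dictionaries could then be stored as properties on the corresponding node or relation in the graph database.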
9. The real-time task management method according to claim 7, wherein the real-time task management method comprises:
drawing a task chain map corresponding to a real-time task chain based on the node relation data in the graph database; the real-time task chain comprises at least one real-time task, the task chain map comprises a task map of the at least one real-time task, and the task map represents node relation data of the real-time task;
and displaying the task chain map on a front-end display interface based on a timed task.
10. The method for real-time task management according to claim 9, wherein the drawing a task chain map corresponding to a real-time task chain based on node relation data in the graph database includes:
and drawing task maps of the corresponding real-time tasks in different forms based on the different types of real-time tasks.
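As a rough sketch of drawing a task chain map with a different form per task type, the code below renders node relation data as Graphviz DOT text. The task types (`sync`, `aggregate`), the style mapping, and the sample chain are hypothetical; the claims do not specify the forms or the rendering tool.

```python
# Assumed mapping from task type to edge style.
EDGE_STYLE = {"sync": "solid", "aggregate": "dashed"}

def to_dot(edges):
    """Render (producer, task, consumer, task_type) tuples as Graphviz DOT text."""
    lines = ["digraph task_chain {"]
    for producer, task, consumer, task_type in edges:
        style = EDGE_STYLE.get(task_type, "dotted")  # fallback for unknown types
        lines.append(f'  "{producer}" -> "{consumer}" [label="{task}", style={style}];')
    lines.append("}")
    return "\n".join(lines)

chain = [
    ("orders_raw", "clean_orders", "orders_clean", "sync"),
    ("orders_clean", "daily_totals", "orders_daily", "aggregate"),
]
dot_text = to_dot(chain)
print(dot_text)
```

A timed task could regenerate this text at a fixed interval and hand it to the front end for display.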
11. A real-time task management device, characterized in that the real-time task management device comprises:
the acquisition module is used for regularly acquiring set performance indexes of the target real-time task; wherein the set performance index includes at least one of: data backlog level, data skew level, system index, task index, operator index, checkpoint index, TaskManager index, and JobManager index;
the determining module is used for determining the characteristic data of the set performance index,
and for inputting the characteristic data into a performance analysis model to determine a performance analysis result corresponding to the set performance index; wherein the performance analysis result comprises a performance state label and/or an optimization suggestion label;
and the control module is used for controlling the configuration information of the real-time task based on the performance analysis result.
12. An electronic device, comprising: a processor and a memory, the processor being configured to execute a control program stored in the memory to implement the real-time task management method according to any one of claims 1 to 10.
13. A storage medium storing one or more programs executable by one or more processors to implement the real-time task management method of any of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311092049.XA CN117608985A (en) | 2023-08-28 | 2023-08-28 | Real-time task management method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117608985A (en) | 2024-02-27 |
Family
ID=89948488
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311092049.XA Pending CN117608985A (en) | 2023-08-28 | 2023-08-28 | Real-time task management method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117608985A (en) |
- 2023-08-28: CN application CN202311092049.XA (publication CN117608985A), status: active, Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||