CN114064403A - Task delay analysis processing method and device - Google Patents

Info

Publication number
CN114064403A
Authority
CN
China
Prior art keywords
task
detected
computing resource
time
delay
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111314727.3A
Other languages
Chinese (zh)
Inventor
田恒宇
张宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202111314727.3A
Publication of CN114064403A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/301 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is a virtual computing platform, e.g. logically partitioned systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a task delay analysis processing method and device, relating to the technical field of computers. A specific implementation of the method comprises: in response to the triggering of task delay analysis, acquiring basic information of a task to be detected; judging, according to the basic information, whether the task to be detected is delayed; when the task to be detected is delayed, determining a usage coefficient of a computing resource indicator of the cluster and/or queue to which the task belongs; when the usage coefficient of the computing resource indicator is smaller than a first threshold, confirming that the computing resource is a cause of the task delay; when the usage coefficient is greater than or equal to the first threshold, confirming that the computing resource is not a cause of the task delay. The implementation makes task delay analysis faster and more convenient, requires no manual analysis, and is highly efficient.

Description

Task delay analysis processing method and device
Technical Field
The invention relates to the technical field of computers, in particular to a task delay analysis processing method and device.
Background
In the construction of an enterprise-level data warehouse, the dependencies among tasks, from data extraction/collection tasks through to the final internal/external data service tasks, are complex. The core tasks of the basic detail layer are the most upstream link in the whole data pipeline, so guaranteeing their stable and efficient operation is particularly important. With resources and production cost kept under control, an enterprise needs to avoid delays of core tasks and ensure that the tasks run stably. To avoid task delay as far as possible, the cause of a delay must be analyzed; only then can delays be reduced by addressing that cause.
In the process of implementing the invention, the inventor found that at least the following problem exists in the prior art:
Task delay analysis based on computing resources is complex, relies mainly on manual work, is time-consuming, cannot quickly and conveniently locate the cause of a task delay, and is inefficient.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for analyzing and processing task delay, which can more quickly and conveniently locate a cause affecting task delay compared to the prior art.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a task delay analysis processing method including:
responding to the triggering of task delay analysis, and acquiring basic information of a task to be detected;
judging whether the task to be detected is delayed or not according to the basic information of the task to be detected;
determining a use coefficient of a computing resource index of a cluster and/or a queue to which the task to be detected belongs under the condition that the task to be detected is delayed; when the use coefficient of the computing resource index is smaller than a first threshold value, confirming that the computing resource is a reason causing task delay; when the usage coefficient of the computing resource indicator is greater than or equal to a first threshold value, confirming that the computing resource is not a cause of task delay.
Optionally, the method further includes:
before the basic information of the task to be detected is obtained, obtaining a time dimension of task delay analysis appointed by a user; the time dimension of the task delay analysis specified by the user is the current time or a specified time period; determining the type of task delay analysis according to the time dimension; the types of task delay analysis include: real-time task delay analysis, and offline task delay analysis.
Optionally, the determining, according to the basic information of the task to be detected, whether the task to be detected is delayed includes:
when the type of the task delay analysis is real-time task delay analysis, determining the current operation duration of the task to be detected according to the operation starting time and the current time of the task to be detected; determining the ratio of the current running time of the task to be detected to the average historical running time of the task to be detected; when the ratio is larger than or equal to a second threshold value, confirming that the task to be detected is delayed; and when the ratio is smaller than a second threshold value, confirming that the task to be detected has no delay.
Optionally, the determining, according to the basic information of the task to be detected, whether the task to be detected is delayed includes:
when the type of the task delay analysis is offline task delay analysis, determining the operation duration of the task to be detected according to the operation starting time and the operation ending time of the task to be detected; determining the ratio of the running time of the task to be detected to the average historical running time of the task to be detected; when the ratio is larger than or equal to a second threshold value, confirming that the task to be detected is delayed; and when the ratio is smaller than a second threshold value, confirming that the task to be detected has no delay.
Optionally, the determining the usage coefficient of the computing resource indicator of the cluster and/or the queue to which the task to be detected belongs includes:
when the type of the task delay analysis is real-time task delay analysis, acquiring index values of various computing resources of a cluster and/or a queue to which the task to be detected belongs at the current time; and for each of the plurality of computing resources, determining the use coefficient of the computing resource index according to the computing resource index value of the cluster and/or the queue to which the task to be detected belongs at the current time and the average value of the computing resource indexes of a plurality of historical time periods.
Optionally, the determining the usage coefficient of the computing resource indicator of the cluster and/or the queue to which the task to be detected belongs includes:
when the type of the task delay analysis is offline task delay analysis, acquiring index values of various computing resources of a cluster and/or a queue to which the task to be detected belongs in a specified time period; and for each of the plurality of computing resources, determining the use coefficient of the computing resource index according to the computing resource index value of the cluster and/or the queue to which the task to be detected belongs in the specified time period and the average value of the computing resource indexes in a plurality of historical time periods.
Optionally, the method further includes:
before determining the use coefficient of the computing resource index, receiving the computing resource index input by a user, verifying the computing resource index input by the user, and confirming that the verification is passed.
Optionally, the method further includes:
after determining the cause of the task delay according to the analysis result, determining abnormal tasks according to the set of tasks running at the current time and the sets of tasks that ran in a plurality of historical time periods, and outputting information on the abnormal tasks.
To achieve the above object, according to another aspect of the embodiments of the present invention, there is provided a task delay analysis apparatus including:
the acquisition module is used for responding to the triggering of the task delay analysis and acquiring the basic information of the task to be detected;
the judging module is used for judging whether the task to be detected is delayed or not according to the basic information of the task to be detected;
the analysis module is used for determining the use coefficient of the computing resource index of the cluster and/or the queue to which the task to be detected belongs under the condition that the task to be detected is delayed; when the use coefficient of the computing resource index is smaller than a first threshold value, confirming that the computing resource is a reason causing task delay; when the usage coefficient of the computing resource indicator is greater than or equal to a first threshold value, confirming that the computing resource is not a cause of task delay.
To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided a task delay analysis electronic device including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement any of the task delay analysis processing methods described above.
To achieve the above object, according to a further aspect of the embodiments of the present invention, there is provided a computer-readable medium on which a computer program is stored, the program, when executed by a processor, implementing any one of the task delay analysis processing methods described above.
One embodiment of the above invention has the following advantages or benefits: basic information of a task to be detected is acquired; whether the task is delayed is judged according to that information; when the task is delayed, the computing resource indicators of the cluster and/or queue to which the task belongs during its run are analyzed; and the cause of the delay is determined from the analysis result. Delayed tasks and the causes of their delay are thereby located automatically and quickly, which overcomes the complexity, slowness, and low efficiency of computing-resource-based task delay analysis in the prior art and makes task delay analysis faster and more convenient.
Further effects of the above optional implementations are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a task delay analysis processing method according to an embodiment of the present invention;
FIG. 2 is a flow diagram illustrating an alternative method for analyzing and processing task delays according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart diagram illustrating an alternative task delay analysis processing method according to an embodiment of the present invention;
FIG. 4 is a flow chart illustrating an alternative task delay analysis processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the main blocks of a task delay analysis processing apparatus according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 7 is a schematic block diagram of a computer system suitable for use with a mobile device or server implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments of the present invention are mainly applied to a task delay analysis scenario, and are applicable to various systems or computer systems, such as a database, a server, a home or public computer system, and the like.
Referring to FIG. 1, a main flowchart of a task delay analysis processing method according to an embodiment of the present invention is shown, including the following steps:
S101: In response to the triggering of task delay analysis, acquire basic information of the task to be detected.
S102: Judge, according to the basic information of the task to be detected, whether the task to be detected is delayed.
S103: determining a use coefficient of a computing resource index of a cluster and/or a queue to which the task to be detected belongs under the condition that the task to be detected is delayed; when the use coefficient of the computing resource index is smaller than a first threshold value, confirming that the computing resource is a reason causing task delay; when the usage coefficient of the computing resource indicator is greater than or equal to a first threshold value, confirming that the computing resource is not a cause of task delay.
In the above embodiment, for step S101, the task delay analysis may be triggered on one or more running or historically run tasks, specified either by a user or automatically by a computer program implementing the method. The tasks may be in the same cluster or in different clusters, each of which has one or more resource schedulers to manage the tasks; for example, the resource scheduler may be a YARN resource scheduler. In addition, tasks have a degree of importance; for example, the levels from high to low may be L0 to L3, where L0 is the highest level and tasks at that level are the most critical.
After a task is specified, its task information is obtained through a specific interface so as to obtain the task's serial number pid, which facilitates the subsequent steps. For example, the interface may be an interface of a big data platform or of a database. The task information may include, for example, the cluster, the resource scheduler, and the queue to which the task belongs. In addition, when acquiring task information, the program or the user may set an acquisition rule, for example: acquire only the highest-level tasks.
For step S102, after the task information is acquired, the running duration of the task to be detected is determined from its running start time and its end time (if the task is still running, the current time is used in place of the end time). The ratio of this running duration to the task's average historical running duration is then determined; when the ratio is greater than or equal to a second threshold, the task to be detected is confirmed as delayed; when the ratio is smaller than the second threshold, the task is confirmed as not delayed. For example, for a task that starts at 10:00:00 am and ends at 11:00:00 am on the same day, the running duration is 1 hour, the difference between 11:00:00 and 10:00:00. Suppose the program's query shows that the running durations of the task over the previous three days sum to 1 hour; dividing by 3 gives an average running duration of about 0.333 hours. The ratio of the 1-hour running duration to the 0.333-hour three-day average is 3; if the second threshold has been set to 1.3, by a user or automatically by the program, the task to be detected is judged to be delayed.
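The delay check in step S102 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the invention: the function names `runtime_ratio` and `is_delayed`, the date 2021-03-02, and the three 20-minute historical durations (chosen so that they sum to the 1 hour used in the example) are hypothetical.

```python
from datetime import datetime
from typing import Optional, Sequence


def runtime_ratio(start: datetime,
                  end: Optional[datetime],
                  now: datetime,
                  history_hours: Sequence[float]) -> float:
    """Ratio of the task's running duration to its average historical duration."""
    if not history_hours:
        raise ValueError("at least one historical running duration is required")
    # A task that is still running has no end time, so the current time is used.
    effective_end = end if end is not None else now
    run_hours = (effective_end - start).total_seconds() / 3600.0
    average_hours = sum(history_hours) / len(history_hours)
    return run_hours / average_hours


def is_delayed(ratio: float, second_threshold: float = 1.3) -> bool:
    """A task counts as delayed when the ratio reaches the second threshold."""
    return ratio >= second_threshold


# Worked example from the text: a 1-hour run against a 1/3-hour average.
start = datetime(2021, 3, 2, 10, 0, 0)  # hypothetical date
end = datetime(2021, 3, 2, 11, 0, 0)
ratio = runtime_ratio(start, end, now=end, history_hours=[1 / 3, 1 / 3, 1 / 3])
print(round(ratio, 3), is_delayed(ratio))
```

Passing `end=None` covers the real-time case, where the running duration is measured up to `now` instead of a recorded end time.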
For step S103, if step S102 determines that the task to be detected is delayed, the user or the program determines which computing resource indicators to analyze, and the indicators are obtained by reading background information or logs through an interface. The computing resource indicators include, but are not limited to, memory and CPU indicators of the cluster and/or the queue to which the task belongs. For example, the memory indicators may include:
totalMB (total configured memory), OveralllocationMemoryMB (over-allocated memory), availableMB (available memory), allocatedMB (allocated memory), totalOveralllocationMB (total over-allocated memory), and effectiveMemoryMB (effective memory).
For example, the CPU indicators may include:
totalVirtualCores (total number of virtual cores in the cluster), overAllocation Vcors (number of over-allocated cores), availableVirtualCores (number of available virtual cores), and allocatedVirtualCores (number of allocated virtual cores).
For example, the queue indicators may include:
queue memory usage, queue vCore usage, and the queue's numbers of Apps Pending, Pending containers, Pending MB, Apps Running, Apps Completed, and Apps Submitted.
After the computing resource indicators have been obtained, a usage coefficient M is calculated for each of them from the indicator's value during the task's current or most recent run (illustratively, times may be expressed at any granularity from years down to seconds) and the average of the indicator's values over a plurality of historical time periods. When the usage coefficient M is smaller than a first threshold, which may be set by a user, generated by a program, or preset, the computing resource is confirmed as a cause of the task delay; when the usage coefficient is greater than or equal to the first threshold, the computing resource is confirmed as not a cause of the task delay.
For example, assume the current time is 2021-03-02 17:00:00, the task that is now delayed started running at 2021-03-02 16:00:00, and the computer program has selected totalMB as the resource indicator for analyzing the delayed task. The usage coefficient of the selected indicator, $M_{\mathrm{totalMB}}$, can be determined by the following equation:
$$M_{\mathrm{totalMB}} = \frac{\mathrm{totalMB}_{\mathrm{Time}}}{\frac{1}{3}\left(\mathrm{totalMB}_{\mathrm{day(Time-1)}} + \mathrm{totalMB}_{\mathrm{day(Time-2)}} + \mathrm{totalMB}_{\mathrm{day(Time-3)}}\right)}$$
where $\mathrm{totalMB}_{\mathrm{Time}}$ is the total configured memory size used in the period from 2021-03-02 16:00:00 to 2021-03-02 17:00:00; $\mathrm{totalMB}_{\mathrm{day(Time-1)}}$ is the total configured memory size used in the period from 2021-03-01 16:00:00 to 2021-03-01 17:00:00; $\mathrm{totalMB}_{\mathrm{day(Time-2)}}$ is the total configured memory size used in the period from 2021-02-28 16:00:00 to 2021-02-28 17:00:00; and $\mathrm{totalMB}_{\mathrm{day(Time-3)}}$ is the total configured memory size used in the period from 2021-02-27 16:00:00 to 2021-02-27 17:00:00. The equation divides the indicator value during the current or most recent run (times may be expressed at any granularity from years down to seconds) by the average of the indicator values over a plurality of historical time periods; the resulting ratio is the usage coefficient. Assume that the first threshold is set to 1, that $\mathrm{totalMB}_{\mathrm{Time}}$ is 1024 MB, and that $\mathrm{totalMB}_{\mathrm{day(Time-1)}}$, $\mathrm{totalMB}_{\mathrm{day(Time-2)}}$, and $\mathrm{totalMB}_{\mathrm{day(Time-3)}}$ are each 2048 MB. The usage coefficient $M_{\mathrm{totalMB}}$ of the total configured memory is then 0.5, which is smaller than the first threshold of 1, so the cause of the analyzed task's delay is that the total configured memory in the cluster to which the task belongs is insufficient.
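The usage coefficient and the threshold decision described above can be illustrated with a short Python sketch. The function names `usage_coefficient` and `resource_is_cause` are hypothetical; the numbers are the ones from the example (1024 MB for the current period, 2048 MB for each of the three previous days, first threshold 1).

```python
from statistics import mean
from typing import Sequence


def usage_coefficient(current_value: float,
                      historical_values: Sequence[float]) -> float:
    """Indicator value for the current period divided by the historical mean."""
    return current_value / mean(historical_values)


def resource_is_cause(coefficient: float, first_threshold: float = 1.0) -> bool:
    """Per the text, a coefficient below the first threshold marks the resource as a cause."""
    return coefficient < first_threshold


# Worked example from the text, using the totalMB indicator.
m_total_mb = usage_coefficient(1024, [2048, 2048, 2048])
print(m_total_mb, resource_is_cause(m_total_mb))  # 0.5 True
```

The same two functions apply unchanged to any other indicator, since the coefficient is a dimensionless ratio.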
In the embodiment of the invention, the above steps locate delayed tasks and the causes of their delay automatically and quickly, which overcomes the complexity, slowness, and low efficiency of computing-resource-based task delay analysis in the prior art and makes task delay analysis faster and more convenient.
Referring to FIG. 2, a schematic flow chart of an alternative task delay analysis processing method according to an embodiment of the present invention is shown, including the following steps:
S201: In response to the triggering of task delay analysis, acquire the time dimension of the task delay analysis specified by the user.
S202: Acquire basic information of the task to be detected according to the acquired time dimension.
S203: Judge, according to the basic information of the task to be detected, whether the task to be detected is delayed.
S204: When the task to be detected is delayed, analyze the computing resource indicators of the cluster and/or queue to which the task belongs during its run, so as to determine the cause of the task delay from the analysis result.
In steps S201 and S202, the user selects the time dimension through the program interface. The time dimension is either a time period that has already elapsed or the real-time current time, and the interface provides an entry for the user to choose between them. For example, if the current time is 15:00:00 on month 02 of 2021, the user may enter a specified time period on the interface, such as 02:32:00 on month 01 of 2021 to 03:00:00 on month 01 of 2021, or may select the current time. Once the user has made the selection, the back-end program obtains the user-specified time dimension of the task delay analysis and determines the tasks that ran in the selected time period or at the selected time point; which of these tasks are analyzed may then be chosen by the user or automatically by the system.
Further, for a specific task, the type of the task delay analysis in the subsequent step is also confirmed according to the time dimension selected by the user, and the type of the task delay analysis comprises real-time task delay analysis and offline task delay analysis. The real-time task delay analysis refers to delay analysis performed on a task running at the current time, and the off-line task delay analysis refers to delay analysis performed on a task running in a selected time period, wherein the starting time and the ending time of the time period are earlier than the current time.
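As an illustration of how the analysis type might follow from the time dimension, the Python sketch below models the user's choice as either no period (meaning the current time) or a start/end pair. The names `AnalysisType` and `analysis_type`, and the representation of the time dimension as an optional tuple, are assumptions made for this example only.

```python
from datetime import datetime
from enum import Enum
from typing import Optional, Tuple


class AnalysisType(Enum):
    REAL_TIME = "real-time task delay analysis"
    OFFLINE = "offline task delay analysis"


def analysis_type(period: Optional[Tuple[datetime, datetime]],
                  now: datetime) -> AnalysisType:
    """Map the user's time dimension to the type of task delay analysis."""
    if period is None:
        # The user chose the current time: analyze tasks running right now.
        return AnalysisType.REAL_TIME
    start, end = period
    # An offline period must lie entirely in the past.
    if not (start < end < now):
        raise ValueError("the period must start before it ends and end before the current time")
    return AnalysisType.OFFLINE


now = datetime(2021, 3, 2, 15, 0, 0)  # hypothetical current time
past_period = (datetime(2021, 3, 1, 2, 32, 0), datetime(2021, 3, 1, 3, 0, 0))
print(analysis_type(None, now).value)
print(analysis_type(past_period, now).value)
```

Rejecting a period that extends to or past the current time keeps the two analysis types disjoint, matching the requirement that both the start and the end of an offline period are earlier than the current time.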
In the embodiment of the invention, the time dimension of the task delay analysis specified by the user is obtained, and the task delay analysis type is determined according to the obtained time dimension, so that the compatibility of the offline task delay analysis function and the real-time task delay analysis function is realized, the user can select the offline task delay analysis or the real-time task delay analysis according to the requirement, and the user experience in the task delay analysis process is improved.
In the embodiment of the present invention, how steps S203 to S204 are performed may refer to the description of steps S102 to S103 shown in fig. 1, and will not be described herein again. Other parts of steps S201 to S202 can be referred to the description of step S101 shown in fig. 1.
Referring to FIG. 3, a schematic flow chart of another task delay analysis processing method according to an embodiment of the present invention is shown, including the following steps:
S301: In response to the triggering of task delay analysis, acquire basic information of the task to be detected.
S302: Judge, according to the basic information of the task to be detected, whether the task to be detected is delayed.
S303: When the task to be detected is delayed, receive a computing resource indicator input by the user and verify it.
S304: If the verification passes, analyze the computing resource indicators of the cluster and/or queue to which the task belongs during its run, so as to determine the cause of the task delay from the analysis result.
In the embodiment of the present invention, how steps S301 to S302 are performed may refer to the description of steps S101 to S102 shown in fig. 1, and will not be described herein again.
In the embodiment of the present invention, steps S303 to S304 address a situation beyond step S103 shown in FIG. 1: when the user decides which computing resource indicators to analyze, the user must enter their names through an input device such as a keyboard or touch screen, or select them from options provided by the program. The user may mistype an indicator name, or may select a resource indicator whose form is unsuitable for the subsequent analysis. Verifying the user's input in step S303 prevents such input errors from stalling or breaking the subsequent steps.
For example, in step S303 the check may use a string matching function to determine whether the indicator name manually entered by the user is among the existing resource indicators; it may also compare the type parameter of the selected resource indicator with the type parameter of the resource indicators that need to be analyzed for the task, to judge whether the selected indicator is formally suitable for the subsequent analysis. If the entered name is correct, or the selected indicator is formally applicable, the verification passes and the resource indicator analysis in the subsequent step is executed; if the name is wrong or the indicator is formally inapplicable, the user is asked to specify the computing resource indicator again. For example, if the user intends to analyze allocatedMB but enters a misspelled name (for example, allocatadMB), the above embodiment prompts the user that the indicator is not available and lets the user enter it again.
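A minimal Python sketch of such a check is given below. It is an illustration only: the registry `KNOWN_INDICATORS`, its type labels "memory" and "cpu", and the function name `verify_indicator` are hypothetical, and the name check is an exact string match against indicator names listed in the description.

```python
from typing import Dict, Optional

# Hypothetical registry mapping indicator names to an indicator type.
KNOWN_INDICATORS: Dict[str, str] = {
    "totalMB": "memory",
    "allocatedMB": "memory",
    "totalVirtualCores": "cpu",
    "allocatedVirtualCores": "cpu",
}


def verify_indicator(name: str, required_type: Optional[str] = None) -> bool:
    """Return True when the name is a known indicator of the required type."""
    indicator_type = KNOWN_INDICATORS.get(name.strip())
    if indicator_type is None:
        return False  # name check failed, for example because of a typo
    # Type check: skipped when no particular type is required.
    return required_type is None or indicator_type == required_type


print(verify_indicator("allocatedMB", "memory"))        # correct name and type
print(verify_indicator("allocatadMB", "memory"))        # misspelled name
print(verify_indicator("totalVirtualCores", "memory"))  # known name, wrong type
```

A caller would loop on this function, prompting the user again until it returns True, before moving on to the indicator analysis.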
For other parts in steps S303 to S304, reference may be made to the description of step S103 shown in fig. 1.
Referring to fig. 4, a schematic flow chart of another alternative task delay analysis processing method according to the embodiment of the present invention is shown, which includes the following steps:
S401: In response to a trigger of task delay analysis, acquire the basic information of the task to be detected.
S402: Judge whether the task to be detected is delayed according to the basic information of the task to be detected.
S403: When the task to be detected is delayed, analyze the various computing resource indexes of the cluster and/or the queue to which the task to be detected belongs during its running period, so as to determine the cause of the task delay according to the analysis result.
S404: After determining the cause of the task delay according to the analysis result, output the information of the abnormal task.
In the embodiment of the present invention, how steps S401 to S403 are performed may refer to the description about steps S101 to S103 in the embodiment shown in fig. 1, and details are not repeated herein.
In the embodiment of the present invention, for step S404, after the cause of the task delay is determined, the task set U_t running at the current time and the task sets that ran in a plurality of historical time periods are first obtained through an interface and the background resource manager. The task sets obtained for the plurality of historical time periods are de-duplicated to obtain a new set U_k. For example, assuming that the start time of the task is 2021-05-28 12:00:00 and the current time is 2021-05-28 13:00:00, the task sets of the plurality of historical time periods may be the tasks that ran 1 day earlier (2021-05-27 12:00:00 to 2021-05-27 13:00:00), the tasks that ran 2 days earlier (2021-05-26 12:00:00 to 2021-05-26 13:00:00), and the tasks that ran 3 days earlier (2021-05-25 12:00:00 to 2021-05-25 13:00:00). An operation is then performed on U_t and U_k to obtain the task set U_f, which is the set of newly added abnormal tasks. All newly added abnormal tasks occupying resources can be located from this set, and their specific information, such as task name, task execution time, and resources occupied, can be obtained through interface calls. The operation may be taking the difference set, that is, the tasks in U_t that do not appear in U_k.
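As an illustrative sketch of the set operation described above, the following assumes that each task is identified by a unique name string and that the difference set is the chosen operation. The function name and the task names are hypothetical.

```python
from typing import Iterable, Set


def new_abnormal_tasks(current: Iterable[str],
                       history_windows: Iterable[Iterable[str]]) -> Set[str]:
    """Return U_f, the tasks running now that ran in no historical window."""
    # U_k: the union of all historical windows, which also removes duplicates.
    u_k: Set[str] = set()
    for window in history_windows:
        u_k.update(window)
    # U_f = U_t minus U_k (difference set).
    return set(current) - u_k


# Hypothetical task names for the 12:00-13:00 window on four consecutive days.
u_t = ["etl_orders", "daily_report", "adhoc_export"]
history = [
    ["etl_orders", "daily_report"],  # 1 day earlier
    ["etl_orders"],                  # 2 days earlier
    ["daily_report", "etl_orders"],  # 3 days earlier
]
u_f = new_abnormal_tasks(u_t, history)
```

Here only `adhoc_export` is absent from every historical window, so it is the single newly added task in `u_f`.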
Further, based on the acquired task set, the task level of each single task in the set may be retrieved by accessing its metadata. If the task level is lower than a preset level, for example lower than level L1, the task may be located as a low-level abnormal task occupying resources, and the specific information of such abnormal tasks may be obtained through interface calls.
Furthermore, the located newly added abnormal tasks and/or low-level abnormal tasks, of which there may be zero or more, can be closed by calling a task manager or another manager, so that these tasks no longer occupy resources and the normal operation of the core tasks is ensured.
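As an illustrative sketch, locating low-level tasks and closing them might look as follows. The level ordering in `LEVEL_RANK` (a smaller rank meaning a more important task), the dictionary layout of a task, and the `kill` callback standing in for the task manager are all assumptions made for this example.

```python
from typing import Callable, Dict, List

# Assumption: levels are ordered L0 (most important) to L3 (least important).
LEVEL_RANK = {"L0": 0, "L1": 1, "L2": 2, "L3": 3}


def low_level_tasks(tasks: List[Dict[str, str]],
                    preset_level: str = "L1") -> List[Dict[str, str]]:
    """Return the tasks whose level is lower than the preset level."""
    limit = LEVEL_RANK[preset_level]
    return [t for t in tasks if LEVEL_RANK[t["level"]] > limit]


def close_tasks(tasks: List[Dict[str, str]],
                kill: Callable[[str], bool]) -> List[str]:
    """Ask the (hypothetical) task manager to close each task; return those closed."""
    return [t["name"] for t in tasks if kill(t["name"])]


running = [
    {"name": "core_settlement", "level": "L0"},
    {"name": "daily_report", "level": "L1"},
    {"name": "adhoc_export", "level": "L3"},
]
abnormal = low_level_tasks(running)
closed = close_tasks(abnormal, kill=lambda name: True)  # stub manager
```

Passing the manager as a callback keeps the selection logic independent of whichever task manager actually performs the closing operation.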
Referring to fig. 5, a schematic diagram of main modules of a task delay analysis processing apparatus 500 according to an embodiment of the present invention is shown, including:
the obtaining module 501 is configured to obtain basic information of a task to be detected in response to a trigger of task delay analysis.
The determining module 502 is configured to determine whether the task to be detected is delayed according to the basic information of the task to be detected.
An analysis module 503, configured to determine a usage coefficient of a computing resource indicator of a cluster and/or a queue to which the task to be detected belongs when the task to be detected is delayed; when the use coefficient of the computing resource index is smaller than a first threshold value, confirming that the computing resource is a reason causing task delay; when the usage coefficient of the computing resource indicator is greater than or equal to a first threshold value, confirming that the computing resource is not a cause of task delay.
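As an illustrative sketch of the threshold decision made by the analysis module, the following assumes that the usage coefficients have already been computed and that one first threshold applies to every index. The function name, the second index name, and the numeric values are hypothetical.

```python
from typing import Dict


def resources_causing_delay(usage_coefficients: Dict[str, float],
                            first_threshold: float) -> Dict[str, bool]:
    """Map each index name to True when it is confirmed as a cause of the delay."""
    # Rule from the description: coefficient < first threshold means "is a cause";
    # coefficient >= first threshold means "is not a cause".
    return {name: coeff < first_threshold
            for name, coeff in usage_coefficients.items()}


verdict = resources_causing_delay(
    {"allocatedMB": 0.4, "hypotheticalCpuIndex": 1.0},  # assumed coefficients
    first_threshold=0.8,                                # assumed threshold
)
```

A coefficient exactly equal to the first threshold falls on the "not a cause" side, matching the "greater than or equal to" wording above.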
Optionally, in an implementation apparatus of the present invention, the obtaining module 501 is further configured to:
before the basic information of the task to be detected is obtained, obtaining a time dimension of task delay analysis specified by a user; the time dimension of the task delay analysis specified by the user is the current time or a specified time period; determining the type of task delay analysis according to the time dimension; the types of task delay analysis include: real-time task delay analysis, and offline task delay analysis.
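As an illustrative sketch, the mapping from the time dimension to the analysis type could be expressed as follows. Representing "current time" as `None` and a specified time period as a `(start, end)` pair is an assumption of this example, as are all names.

```python
from datetime import datetime
from enum import Enum
from typing import Optional, Tuple


class AnalysisType(Enum):
    REAL_TIME = "real-time task delay analysis"
    OFFLINE = "offline task delay analysis"


def analysis_type(period: Optional[Tuple[datetime, datetime]]) -> AnalysisType:
    """None means the user chose the current time; a pair is a specified period."""
    if period is None:
        return AnalysisType.REAL_TIME
    start, end = period
    if end <= start:
        raise ValueError("the specified time period must end after it starts")
    return AnalysisType.OFFLINE
```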
Optionally, in an implementation apparatus of the present invention, the determining module 502 is configured to:
when the type of the task delay analysis is real-time task delay analysis, determining the current operation duration of the task to be detected according to the operation starting time and the current time of the task to be detected; determining the ratio of the current running time of the task to be detected to the average historical running time of the task to be detected; when the ratio is larger than or equal to a second threshold value, confirming that the task to be detected is delayed; and when the ratio is smaller than a second threshold value, confirming that the task to be detected has no delay.
Optionally, in an implementation apparatus of the present invention, the determining module 502 is configured to:
when the type of the task delay analysis is offline task delay analysis, determining the operation duration of the task to be detected according to the operation starting time and the operation ending time of the task to be detected; determining the ratio of the running time of the task to be detected to the average historical running time of the task to be detected; when the ratio is larger than or equal to a second threshold value, confirming that the task to be detected is delayed; and when the ratio is smaller than a second threshold value, confirming that the task to be detected has no delay.
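As an illustrative sketch, the real-time and offline delay checks differ only in the end point of the running duration (the current time versus the operation ending time), so a single helper can cover both. The function name, the use of seconds, and the sample values are assumptions of this example.

```python
from datetime import datetime
from statistics import fmean
from typing import Sequence


def is_delayed(start: datetime, end_or_now: datetime,
               historical_durations_s: Sequence[float],
               second_threshold: float) -> bool:
    """Compare the running duration with the average historical duration."""
    duration_s = (end_or_now - start).total_seconds()
    average_s = fmean(historical_durations_s)
    if average_s <= 0:
        raise ValueError("average historical running time must be positive")
    # Delayed when the ratio is greater than or equal to the second threshold.
    return duration_s / average_s >= second_threshold


# Real-time case: the task started at 12:00 and is still running at 13:00.
delayed = is_delayed(
    start=datetime(2021, 5, 28, 12, 0, 0),
    end_or_now=datetime(2021, 5, 28, 13, 0, 0),
    historical_durations_s=[1800.0, 1800.0],  # assumed history: 30 minutes each
    second_threshold=1.5,                     # assumed threshold
)
```

For the offline case the caller passes the operation ending time as `end_or_now`; the comparison itself is unchanged.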
Optionally, in an implementation apparatus of the present invention, the analysis module 503 is configured to:
when the type of the task delay analysis is real-time task delay analysis, acquiring index values of various computing resources of a cluster and/or a queue to which the task to be detected belongs at the current time; and for each of the plurality of computing resources, determining the use coefficient of the computing resource index according to the computing resource index value of the cluster and/or the queue to which the task to be detected belongs at the current time and the average value of the computing resource indexes of a plurality of historical time periods.
Optionally, in an implementation apparatus of the present invention, the analysis module 503 is configured to:
when the type of the task delay analysis is offline task delay analysis, acquiring index values of various computing resources of a cluster and/or a queue to which the task to be detected belongs in a specified time period; and for each of the plurality of computing resources, determining the use coefficient of the computing resource index according to the computing resource index value of the cluster and/or the queue to which the task to be detected belongs in the specified time period and the average value of the computing resource indexes in a plurality of historical time periods.
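As an illustrative sketch only, the use coefficient could be computed as below. This passage does not state the formula, so the ratio of the observed index value to the mean of the historical values is an assumption of this example, as are the function name and the sample numbers.

```python
from statistics import fmean
from typing import Sequence


def usage_coefficient(observed_value: float,
                      historical_values: Sequence[float]) -> float:
    """Assumed formula: observed index value / mean of the historical values."""
    historical_mean = fmean(historical_values)
    if historical_mean == 0:
        raise ValueError("historical mean is zero; the coefficient is undefined")
    return observed_value / historical_mean


# The observed value is taken at the current time for real-time analysis,
# or over the specified time period for offline analysis.
coeff = usage_coefficient(50.0, [100.0, 100.0, 100.0])
```

The resulting coefficient is then compared with the first threshold to decide whether the computing resource is a cause of the task delay.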
Optionally, in an implementation apparatus of the present invention, the analysis module 503 is further configured to:
before determining the use coefficient of the computing resource index, receiving the computing resource index input by a user, verifying the computing resource index input by the user, and confirming that the verification is passed.
Optionally, in an implementation apparatus of the present invention, the analysis module 503 is further configured to:
and after determining the reason causing the task delay according to the analysis result, determining an abnormal task according to the task set running in the current time and the task set running in a plurality of historical time periods, and outputting the information of the abnormal task.
In the embodiment of the invention, the device locates delayed tasks automatically and quickly, and likewise locates the cause of the task delay automatically and quickly. This solves the problems of complexity, low speed and low efficiency of the computing-resource-based task delay analysis methods in the prior art, and makes the analysis of task delay faster and more convenient.
FIG. 6 illustrates an exemplary system architecture 600 to which embodiments of the invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605 (by way of example only). The network 604 serves to provide a medium for communication links between the terminal devices 601, 602, 603 and the server 605. Network 604 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 601, 602, 603 to interact with the server 605 via the network 604 to receive or send messages or the like. Various communication client applications can be installed on the terminal devices 601, 602, 603.
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 605 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 601, 602, 603.
It should be noted that the task delay analysis processing method provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the task delay analysis processing apparatus is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, shown is a block diagram of a computer system 700 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprises an acquisition module, a judgment module and an analysis module. The names of these modules do not in some cases form a limitation to the module itself, and for example, the acquiring module may also be described as a "task information acquiring module".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform the following operations: responding to the triggering of task delay analysis, and acquiring basic information of a task to be detected; judging whether the task to be detected is delayed or not according to the basic information of the task to be detected; determining a use coefficient of a computing resource index of a cluster and/or a queue to which the task to be detected belongs under the condition that the task to be detected is delayed; when the use coefficient of the computing resource index is smaller than a first threshold value, confirming that the computing resource is a reason causing task delay; when the usage coefficient of the computing resource index is greater than or equal to the first threshold value, confirming that the computing resource is not a cause of task delay.
According to the technical scheme of the embodiment of the invention, the problems of complexity, low speed and low efficiency of a task delay analysis method based on computing resources in the prior art are solved, and the effect that the task delay analysis becomes faster and more convenient is achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A task delay analysis processing method is characterized by comprising the following steps:
responding to the triggering of task delay analysis, and acquiring basic information of a task to be detected;
judging whether the task to be detected is delayed or not according to the basic information of the task to be detected;
determining a use coefficient of a computing resource index of a cluster and/or a queue to which the task to be detected belongs under the condition that the task to be detected is delayed; when the use coefficient of the computing resource index is smaller than a first threshold value, confirming that the computing resource is a reason causing task delay; when the usage coefficient of the computing resource indicator is greater than or equal to a first threshold value, confirming that the computing resource is not a cause of task delay.
2. The method of claim 1, further comprising:
before the basic information of the task to be detected is obtained, obtaining a time dimension of task delay analysis specified by a user; the time dimension of the task delay analysis specified by the user is the current time or a specified time period; determining the type of task delay analysis according to the time dimension; the types of task delay analysis include: real-time task delay analysis, and offline task delay analysis.
3. The method according to claim 2, wherein the determining whether the task to be detected has a delay according to the basic information of the task to be detected comprises:
when the type of the task delay analysis is real-time task delay analysis, determining the current operation duration of the task to be detected according to the operation starting time and the current time of the task to be detected; determining the ratio of the current running time of the task to be detected to the average historical running time of the task to be detected; when the ratio is larger than or equal to a second threshold value, confirming that the task to be detected is delayed; and when the ratio is smaller than a second threshold value, confirming that the task to be detected has no delay.
4. The method according to claim 2, wherein the determining whether the task to be detected has a delay according to the basic information of the task to be detected comprises:
when the type of the task delay analysis is offline task delay analysis, determining the operation duration of the task to be detected according to the operation starting time and the operation ending time of the task to be detected; determining the ratio of the running time of the task to be detected to the average historical running time of the task to be detected; when the ratio is larger than or equal to a second threshold value, confirming that the task to be detected is delayed; and when the ratio is smaller than a second threshold value, confirming that the task to be detected has no delay.
5. The method according to claim 2, wherein determining the usage coefficient of the calculation resource indicator of the cluster and/or the queue to which the task to be detected belongs comprises:
when the type of the task delay analysis is real-time task delay analysis, acquiring index values of various computing resources of a cluster and/or a queue to which the task to be detected belongs at the current time; and for each of the plurality of computing resources, determining the use coefficient of the computing resource index according to the computing resource index value of the cluster and/or the queue to which the task to be detected belongs at the current time and the average value of the computing resource indexes of a plurality of historical time periods.
6. The method according to claim 2, wherein determining the usage coefficient of the calculation resource indicator of the cluster and/or the queue to which the task to be detected belongs comprises:
when the type of the task delay analysis is offline task delay analysis, acquiring index values of various computing resources of a cluster and/or a queue to which the task to be detected belongs in a specified time period; and for each of the plurality of computing resources, determining the use coefficient of the computing resource index according to the computing resource index value of the cluster and/or the queue to which the task to be detected belongs in the specified time period and the average value of the computing resource indexes in a plurality of historical time periods.
7. The method of claim 5 or 6, further comprising:
before determining the use coefficient of the computing resource index, receiving the computing resource index input by a user, verifying the computing resource index input by the user, and confirming that the verification is passed.
8. The method of claim 1, characterized in that the method further comprises:
and after determining the reason causing the task delay according to the analysis result, determining an abnormal task according to the task set running in the current time and the task set running in a plurality of historical time periods, and outputting the information of the abnormal task.
9. A task delay analysis processing apparatus, comprising:
the acquisition module is used for responding to the triggering of the task delay analysis and acquiring the basic information of the task to be detected;
the judging module is used for judging whether the task to be detected is delayed or not according to the basic information of the task to be detected;
the analysis module is used for determining the use coefficient of the computing resource index of the cluster and/or the queue to which the task to be detected belongs under the condition that the task to be detected is delayed; when the use coefficient of the computing resource index is smaller than a first threshold value, confirming that the computing resource is a reason causing task delay; when the usage coefficient of the computing resource indicator is greater than or equal to a first threshold value, confirming that the computing resource is not a cause of task delay.
10. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.
11. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-8.
CN202111314727.3A 2021-11-08 2021-11-08 Task delay analysis processing method and device Pending CN114064403A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111314727.3A CN114064403A (en) 2021-11-08 2021-11-08 Task delay analysis processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111314727.3A CN114064403A (en) 2021-11-08 2021-11-08 Task delay analysis processing method and device

Publications (1)

Publication Number Publication Date
CN114064403A true CN114064403A (en) 2022-02-18

Family

ID=80274317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111314727.3A Pending CN114064403A (en) 2021-11-08 2021-11-08 Task delay analysis processing method and device

Country Status (1)

Country Link
CN (1) CN114064403A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116663868A (en) * 2023-08-01 2023-08-29 深圳市特旺电子有限公司 PCB assembly progress monitoring system
CN116663868B (en) * 2023-08-01 2024-04-19 江门市科能电子有限公司 PCB assembly progress monitoring system

Similar Documents

Publication Publication Date Title
US10917463B2 (en) Minimizing overhead of applications deployed in multi-clouds
US10360087B2 (en) Web API recommendations based on usage in cloud-provided runtimes
US11861405B2 (en) Multi-cluster container orchestration
US10102033B2 (en) Method and system for performance ticket reduction
CN109152061B (en) Channel allocation method, device, server and storage medium
US10698785B2 (en) Task management based on an access workload
CN111383100A (en) Risk model-based full life cycle management and control method and device
CN113361838A (en) Business wind control method and device, electronic equipment and storage medium
CN114064403A (en) Task delay analysis processing method and device
CN113742057A (en) Task execution method and device
CN113760982A (en) Data processing method and device
CN114327918B (en) Method and device for adjusting resource amount, electronic equipment and storage medium
CN110912949B (en) Method and device for submitting sites
CN113220705A (en) Slow query identification method and device
US11627193B2 (en) Method and system for tracking application activity data from remote devices and generating a corrective action data structure for the remote devices
CN113986097B (en) Task scheduling method and device and electronic equipment
CN114924937A (en) Batch task processing method and device, electronic equipment and computer readable medium
CN112395081A (en) Resource online automatic recovery method, system, server and storage medium
US11556425B2 (en) Failover management for batch jobs
US20240086216A1 (en) Prioritizing tasks of an application server
US11372692B2 (en) Methods and systems for application program interface call management
CN117290113B (en) Task processing method, device, system and storage medium
US20220405174A1 (en) Method, device, and program product for managing data backup
CN115774706A (en) Database creation method and device, electronic equipment and storage medium
CN114490583A (en) Data migration method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination