CN114968739A - Operation and maintenance task management method, operation and maintenance method, device, equipment and medium - Google Patents

Operation and maintenance task management method, operation and maintenance method, device, equipment and medium


Publication number
CN114968739A
Authority
CN
China
Prior art keywords
task
maintenance
execution
execution result
cluster
Prior art date
Legal status
Pending
Application number
CN202210543761.6A
Other languages
Chinese (zh)
Inventor
杨诚
沈一帆
Current Assignee
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202210543761.6A
Publication of CN114968739A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452 Performance evaluation by statistical analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/80 Management or planning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to an operation and maintenance task management method, an operation and maintenance method, an apparatus, a device, and a medium, which can be used in the field of financial technology or other related fields. The operation and maintenance task management method comprises the following steps: in response to a task management request sent by an operation and maintenance execution system, generating task information corresponding to the task management request in a database of the task management system, the task management request being generated when the operation and maintenance execution system receives a target task; receiving a log file corresponding to the target task sent by the operation and maintenance execution system, the log file being generated after the operation and maintenance execution system executes the target task; analyzing the log file to obtain an execution result corresponding to the target task; and storing the execution result corresponding to the target task at the position corresponding to the task information in the database. The method automatically analyzes, organizes, and summarizes the results of Ansible target tasks, solving the problem that operation and maintenance task results cannot be displayed intuitively or analyzed and counted automatically.

Description

Operation and maintenance task management method, operation and maintenance method, device, equipment and medium
Technical Field
The present application relates to the field of operation and maintenance management technologies, and in particular, to an operation and maintenance task management method, an operation and maintenance method, an apparatus, a device, and a medium.
Background
Ansible is a mature automated operation and maintenance tool: operation and maintenance personnel can apply changes to target servers through simple, modular scripts.
Operation and maintenance teams generally manage their tasks in external documents such as Excel worksheets or task boards. The document is updated at every change, and the change status is recorded manually for later analysis.
However, in a real production environment there may be hundreds of historical operation and maintenance tasks and dozens of tasks still undergoing changes, and frequent change requirements in production generate a large number of task entries. Managing these tasks is costly: the number of operation and maintenance tasks (each with multiple updates), the large number and variety of production servers, and the completion details of each task must all be tracked. If this information is summarized by the current manual recording approach, the work of operation and maintenance personnel becomes highly complex and their efficiency drops.
Disclosure of Invention
In view of the foregoing, it is desirable to provide an operation and maintenance task management method, an operation and maintenance method, an apparatus, a device, and a medium capable of automatically recording operation and maintenance task results.
In a first aspect, the present application provides an operation and maintenance task management method, which is applied to a task management system, where the task management system is connected to an operation and maintenance execution system, and the method includes:
responding to a task management request sent by the operation and maintenance execution system, and generating task information corresponding to the task management request in a database of the task management system; the task management request is generated by the operation and maintenance execution system under the condition of receiving a target task;
receiving a log file corresponding to the target task sent by the operation and maintenance execution system, wherein the log file is generated after the operation and maintenance execution system executes the target task;
analyzing the log file to obtain an execution result corresponding to the target task;
and storing the execution result corresponding to the target task to the position corresponding to the task information in the database.
In one embodiment, the operation and maintenance execution system comprises a plurality of server clusters, each server cluster comprising a corresponding plurality of node servers; the execution result comprises a total task execution result, a cluster task execution result and a node task execution result;
the analyzing the log file to obtain an execution result corresponding to the target task includes: acquiring a node task execution result of each node server in the log file;
obtaining a cluster task execution result of each server cluster corresponding to the task information according to the node task execution result of each node server; if the node task execution results of all the node servers corresponding to the server cluster are all completed, the cluster task execution result is completed, otherwise, the cluster task execution result is not completed;
obtaining a total task execution result according to a cluster task execution result of each server cluster; if the task execution results of all the server clusters are completed, the total task execution result is completed, otherwise, the total task execution result is not completed.
In one embodiment, the task information includes total task information, cluster task information, and node task information;
the storing of the execution result corresponding to the target task to the position corresponding to the task information in the database includes:
storing the total task execution result to a position corresponding to the total task information in the database;
storing the cluster task execution result to a position corresponding to the cluster task information in the database;
and storing the node task execution result to a position corresponding to the node task information in the database.
In one embodiment, the receiving a log file corresponding to the target task sent by the operation and maintenance execution system includes:
and responding to a reporting request sent by the operation and maintenance execution system, and receiving a log file corresponding to the target task sent by the operation and maintenance execution system.
In one embodiment, the method further comprises:
acquiring a task query request;
and querying the task information and/or the execution result corresponding to the task query request according to the task query request.
In one embodiment, the operation and maintenance execution system includes a plurality of execution machines and a plurality of server clusters, each execution machine is connected to the plurality of server clusters, and the method further includes:
respectively acquiring a cluster identifier of each server cluster in the plurality of server clusters;
dividing the cluster identifiers into N parts to obtain N cluster identifier sets, wherein N is a positive integer;
and respectively sending the N cluster identifier sets to different N execution machines so that each execution machine sends the target task to a corresponding server cluster according to the received cluster identifier set.
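The division of cluster identifiers into N sets can be sketched as follows. The source does not say how the identifiers are divided, so the round-robin rule, the function name `split_cluster_ids`, and the sample identifiers are all assumptions made for illustration.

```python
from typing import List, Sequence


def split_cluster_ids(cluster_ids: Sequence[str], n: int) -> List[List[str]]:
    """Divide cluster identifiers into n sets, dealing them out round-robin.

    Round-robin is an assumed splitting rule; the source only requires
    that the identifiers be divided into N parts.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    id_sets: List[List[str]] = [[] for _ in range(n)]
    for index, cluster_id in enumerate(cluster_ids):
        id_sets[index % n].append(cluster_id)
    return id_sets


# Hypothetical cluster identifiers, for illustration only.
id_sets = split_cluster_ids(["c1", "c2", "c3", "c4", "c5"], 2)
print(id_sets)
```

Round-robin keeps the set sizes within one element of each other, which is one simple way to spread the dispatch work evenly across execution machines.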
In one embodiment, the sending the N sets of cluster identifiers to different N execution machines respectively includes:
acquiring resource use conditions of the plurality of execution machines;
determining N execution machines with the minimum resource usage in the plurality of execution machines according to the resource usage condition;
and respectively sending the N cluster identifier sets to the N execution machines with the minimum resource usage.
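Selecting the N least-loaded execution machines might look like the sketch below. It assumes resource usage is available as a single number per machine (for example a CPU utilization fraction); the source does not define the metric, and the names `pick_least_loaded` and `assign_sets` are hypothetical.

```python
import heapq
from typing import Dict, List


def pick_least_loaded(usage: Dict[str, float], n: int) -> List[str]:
    """Return the n execution machines with the smallest resource usage."""
    return heapq.nsmallest(n, usage, key=usage.get)


def assign_sets(usage: Dict[str, float],
                id_sets: List[List[str]]) -> Dict[str, List[str]]:
    """Pair each cluster identifier set with one of the least-loaded machines."""
    executors = pick_least_loaded(usage, len(id_sets))
    if len(executors) < len(id_sets):
        raise ValueError("fewer execution machines than identifier sets")
    return dict(zip(executors, id_sets))


# Hypothetical usage figures and identifier sets, for illustration only.
usage = {"exec-a": 0.72, "exec-b": 0.15, "exec-c": 0.40}
assignment = assign_sets(usage, [["c1", "c3"], ["c2", "c4"]])
print(assignment)
```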
In a second aspect, the present application provides an operation and maintenance method, which is applied to an operation and maintenance execution system, where the operation and maintenance execution system is connected to a task management system, and the method includes:
receiving an operation and maintenance task, if the operation and maintenance task comprises a result recording script, judging that the operation and maintenance task is a target task, and sending a task management request to the task management system so that the task management system generates task information corresponding to the task management request in a database of the task management system;
executing the target task and generating a log file;
and sending the log file corresponding to the target task to the task management system so that the task management system analyzes the log file to obtain an execution result corresponding to the target task, and storing the log file corresponding to the target task to a position corresponding to the task information in the database.
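The flow on the operation and maintenance execution system side can be sketched as below. The dictionary layout of a task, the `result_script` key, and the injected callables are assumptions; the source does not describe how non-target tasks are handled, so this sketch simply executes them without reporting.

```python
from typing import Callable, Dict

Task = Dict[str, str]


def handle_task(task: Task,
                send_management_request: Callable[[Dict[str, str]], None],
                execute: Callable[[Task], str],
                send_log: Callable[[str, str], None]) -> bool:
    """Run one task; return True if it was treated as a target task."""
    # A task counts as a target task when it carries a result recording script.
    is_target = bool(task.get("result_script"))
    if is_target:
        # Ask the task management system to create the task information first.
        send_management_request(
            {"task_id": task["task_id"], "version": task["version"]})
    log_text = execute(task)
    if is_target:
        # Report the log so the task management system can analyze it.
        send_log(task["task_id"], log_text)
    return is_target


# Stand-ins for the network calls and the executor, for illustration only.
sent_requests, sent_logs = [], []
target = {"task_id": "t1", "version": "v1",
          "task_script": "upgrade.yml", "result_script": "record.yml"}
was_target = handle_task(
    target,
    sent_requests.append,
    lambda task: "log for " + task["task_id"],
    lambda task_id, log: sent_logs.append((task_id, log)),
)
```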
In a third aspect, the application further provides an operation and maintenance task management device. The device is applied to a task management system, the task management system is connected with an operation and maintenance execution system, and the device comprises:
the task information generating module is used for responding to a task management request sent by the operation and maintenance execution system and generating task information corresponding to the task management request in a database of the task management system; the task management request is generated by the operation and maintenance execution system under the condition of receiving a target task;
the receiving module is used for receiving a log file corresponding to the target task sent by the operation and maintenance execution system, wherein the log file is generated after the operation and maintenance execution system executes the target task;
the analysis module is used for analyzing the log file to obtain an execution result corresponding to the target task;
and the storage module is used for storing the execution result corresponding to the target task to the position corresponding to the task information in the database.
In a fourth aspect, the application further provides an operation and maintenance device. The device comprises:
the receiving and judging module is used for receiving the operation and maintenance task, judging the operation and maintenance task as a target task if the operation and maintenance task comprises an execution result recording script, and sending a task management request to the task management system so that the task management system generates task information corresponding to the task management request in a database of the task management system;
the execution module is used for executing the target task and generating a log file;
and the log sending module is used for sending the log file corresponding to the target task to the task management system so that the task management system analyzes the log file to obtain an execution result corresponding to the target task, and storing the log file corresponding to the target task to a position corresponding to the task information in the database.
In a fifth aspect, the present application further provides a computer device. The computer device comprises a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In a sixth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, carries out the steps of the above method embodiments.
In a seventh aspect, the present application further provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, carries out the steps of the above method embodiments.
According to the operation and maintenance task management method, the operation and maintenance method, the apparatus, the device, and the medium, the task management system, in response to the task management request sent by the operation and maintenance execution system, generates task information corresponding to the task management request in a database of the task management system, the task management request being generated when the operation and maintenance execution system receives a target task; receives a log file corresponding to the target task sent by the operation and maintenance execution system, the log file being generated after the operation and maintenance execution system executes the target task; analyzes the log file to obtain an execution result corresponding to the target task; and stores the execution result corresponding to the target task at the position corresponding to the task information in the database. The results of Ansible target tasks are thus automatically analyzed, organized, and summarized, which solves the problem that operation and maintenance task results cannot be displayed intuitively or analyzed and counted automatically.
Drawings
FIG. 1 is a diagram of an application environment of a method for managing an operation and maintenance task according to an embodiment;
FIG. 2 is a flowchart illustrating a method for managing an operation and maintenance task according to an embodiment;
FIG. 3 is a flow diagram illustrating an operation and maintenance method according to an embodiment;
FIG. 4 is a block diagram of an embodiment of an operation and maintenance task management device;
FIG. 5 is a block diagram of an embodiment of an operation and maintenance device;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Ansible is a mature automated operation and maintenance tool: operation and maintenance personnel can apply changes to target servers through simple, modular scripts. However, when faced with very large fleets of production servers, Ansible itself does not provide a management platform for operation and maintenance tasks. Operation and maintenance teams generally manage their tasks in external documents such as Excel worksheets or task boards. Each time a task changes, the document is updated and the change status is recorded manually for later analysis. In a real production environment, however, there may be hundreds of historical operation and maintenance tasks and dozens of tasks still undergoing changes, and frequent change requirements in production generate a large number of task entries. Managing these tasks is costly: the number of subtasks of each operation and maintenance task (multiple updates), the large number and variety of production servers, and the completion details of each task must all be tracked, which is particularly complex if it is summarized in documents.
For these reasons, and to address this shortcoming of the Ansible operation and maintenance tool, the embodiment of the present application provides an operation and maintenance task management method that can be applied to the application environment shown in FIG. 1, in which the task management system 102 communicates with the operation and maintenance execution system 104 through a network. The operation and maintenance execution system comprises an operation and maintenance management platform and a plurality of server clusters connected to it. The operation and maintenance management platform manages the execution of tasks: it distributes target tasks to the server clusters so that the node server at each node of a server cluster executes the operation and maintenance task. After the target task is executed, the task execution result of each node server is recorded, a log file is generated from the task execution results, and the log file is returned to the operation and maintenance management platform. The task management system then obtains the log file, analyzes it to obtain the corresponding execution results, and records and stores them.
In one embodiment, as shown in FIG. 2, an operation and maintenance task management method is provided and is applied to a task management system, where the task management system is connected with an operation and maintenance execution system. The task management system comprises an access layer, a service layer, and a data layer. The access layer connects to the operation and maintenance execution system and to third-party applications. The service layer implements the processing of task information, such as adding, deleting, and modifying task information, and obtains log files from the operation and maintenance execution system; it also exposes an application service interface (for example, an API application service) to provide interface services for the system back end. The data layer stores the task management tables, which record the task information of each task and the task execution result corresponding to that task information. Taking the application of the method to the task management system in FIG. 1 as an example, the method includes the following steps:
Step 202, responding to a task management request sent by an operation and maintenance execution system, and generating task information corresponding to the task management request in a database of a task management system; the task management request is generated by the operation and maintenance execution system under the condition of receiving the target task.
The task management request is generated by the operation and maintenance execution system after it receives the target task, and causes the task management system to automatically create task information for the target task. The task management request includes task information corresponding to the target task, such as a task ID and a version. The target task is a task file uploaded to the operation and maintenance execution system by operation and maintenance personnel. The task file comprises a task script and a result recording script: the task script executes the target task, and the result recording script records the result of executing the task and generates a log file.
Specifically, after receiving a target task, the operation and maintenance execution system sends a task management request carrying the task information of the target task to the task management system. After receiving the task management request, the task management system adds the corresponding task information to its database. A task management table is preset in the database, and the task management system adds the task information to this table to generate the task information corresponding to the task management request. In some embodiments, in addition to the system generating task information by itself, the user may add, delete, and modify stored task information through a task management interface of the task management system. Through this interface the user can enter task information including a name, server cluster information, version information, and the like. The user can also modify or delete the task information through the task management interface before or while the operation and maintenance execution system executes the target task, which helps keep the task information accurate.
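A minimal sketch of step 202 using Python's built-in `sqlite3` module is given below. The table name `task_management`, its columns, and the request layout are assumptions; the source only states that the task information includes items such as a task ID and a version.

```python
import sqlite3
from typing import Dict


def create_task_info(conn: sqlite3.Connection, request: Dict[str, str]) -> None:
    """Add the task information carried by a task management request."""
    # Assumed schema: one row per (task_id, version), result filled in later.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_management ("
        " task_id TEXT NOT NULL,"
        " version TEXT NOT NULL,"
        " result TEXT,"
        " PRIMARY KEY (task_id, version))"
    )
    # Repeated requests for the same task and version are ignored.
    conn.execute(
        "INSERT OR IGNORE INTO task_management (task_id, version) VALUES (?, ?)",
        (request["task_id"], request["version"]),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
# Hypothetical request payload, for illustration only.
create_task_info(conn, {"task_id": "upgrade-nginx", "version": "v1"})
create_task_info(conn, {"task_id": "upgrade-nginx", "version": "v1"})
rows = conn.execute(
    "SELECT task_id, version, result FROM task_management").fetchall()
print(rows)
```

The `result` column stays empty until the log file has been analyzed, which matches the order of steps 202 to 208.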
Step 204, receiving a log file corresponding to the target task sent by the operation and maintenance execution system, wherein the log file is generated after the operation and maintenance execution system executes the target task.
The log file is generated by the operation and maintenance execution system after it executes the target task, and records the task execution result. After generating the log file, the operation and maintenance execution system may send it directly to the task management system, or may store it first and send it after receiving a log file request from the task management system.
Step 206, analyzing the log file to obtain an execution result corresponding to the target task.
After receiving the log file, the task management system analyzes the log file and determines the execution result corresponding to the target task. The log file records the execution result of each node server in the operation and maintenance execution system, and the execution result of the target task is determined from the execution results of the individual node servers.
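Extracting per-node results from a log file could be sketched as follows. The source does not specify the log format, so this example assumes one JSON object per line with hypothetical `cluster`, `node`, and `status` fields; a real result recording script may write something quite different.

```python
import json
from typing import Dict, Tuple


def parse_log(log_text: str) -> Dict[Tuple[str, str], bool]:
    """Map (cluster, node) to True when that node completed the task."""
    results: Dict[Tuple[str, str], bool] = {}
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        key = (record["cluster"], record["node"])
        results[key] = record["status"] == "completed"
    return results


# Hypothetical log content, for illustration only.
sample_log = """
{"cluster": "c1", "node": "10.0.0.1", "status": "completed"}
{"cluster": "c1", "node": "10.0.0.2", "status": "failed"}
{"cluster": "c2", "node": "10.0.1.1", "status": "completed"}
"""
node_results = parse_log(sample_log)
```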
Step 208, storing the execution result corresponding to the target task to the position corresponding to the task information in the database.
The task information and the execution result can be stored in the same table in the database or in different tables. For example, the database may hold a single task information management table: each time a task management request sent by the operation and maintenance execution system is received, a new task information entry is added to the table, and the execution result is recorded in the same entry as the task information. Alternatively, the database may hold a task information management table and a separate execution result table: each time a task management request is received, a new task information entry is added to the task information management table, a mapping relation is established between that entry and the corresponding entry in the execution result table, and the execution result is recorded in the execution result table according to the mapping relation.
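The second layout, with a separate execution result table linked to the task information by a mapping, might be sketched with `sqlite3` as below. The table names, column names, and the use of a foreign key as the mapping relation are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task_info (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id TEXT NOT NULL,
    version TEXT NOT NULL
);
-- task_info_id is the assumed mapping back to the task information entry.
CREATE TABLE execution_result (
    task_info_id INTEGER PRIMARY KEY REFERENCES task_info(id),
    result TEXT NOT NULL
);
""")


def register_task(conn: sqlite3.Connection, task_id: str, version: str) -> int:
    """Add a task information entry and return its key."""
    cursor = conn.execute(
        "INSERT INTO task_info (task_id, version) VALUES (?, ?)",
        (task_id, version))
    return cursor.lastrowid


def store_result(conn: sqlite3.Connection, task_info_id: int, result: str) -> None:
    """Record the execution result under the mapped task information key."""
    conn.execute(
        "INSERT OR REPLACE INTO execution_result (task_info_id, result)"
        " VALUES (?, ?)", (task_info_id, result))


def lookup(conn: sqlite3.Connection, task_id: str):
    """Return task information joined with its result (None if not stored yet)."""
    return conn.execute(
        "SELECT t.task_id, t.version, r.result FROM task_info t"
        " LEFT JOIN execution_result r ON r.task_info_id = t.id"
        " WHERE t.task_id = ?", (task_id,)).fetchall()


key = register_task(conn, "upgrade-nginx", "v1")
before = lookup(conn, "upgrade-nginx")
store_result(conn, key, "completed")
after = lookup(conn, "upgrade-nginx")
```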
In the operation and maintenance task management method, the task management system, in response to the task management request sent by the operation and maintenance execution system, generates task information corresponding to the task management request in a database of the task management system, the task management request being generated when the operation and maintenance execution system receives a target task; receives a log file corresponding to the target task sent by the operation and maintenance execution system, the log file being generated after the operation and maintenance execution system executes the target task; analyzes the log file to obtain an execution result corresponding to the target task; and stores the execution result corresponding to the target task at the position corresponding to the task information in the database. The results of Ansible target tasks are automatically analyzed, organized, and summarized, which solves the problem that operation and maintenance task results cannot be displayed intuitively or analyzed and counted automatically.
In one embodiment, the operation and maintenance execution system comprises a plurality of server clusters, each server cluster comprising a corresponding plurality of node servers; the execution result comprises a total task execution result, a cluster task execution result and a node task execution result. Analyzing the log file to obtain an execution result corresponding to the target task includes:
Step 302, obtaining a node task execution result of each node server in a log file;
Step 304, obtaining a cluster task execution result of each server cluster corresponding to the task information according to the node task execution result of each node server; if the node task execution results of all the node servers corresponding to the server cluster are all completed, the cluster task execution result is completed, otherwise, the cluster task execution result is not completed;
Step 306, obtaining a total task execution result according to the cluster task execution result of each server cluster; if the task execution results of all the server clusters are completed, the total task execution result is completed, otherwise, the total task execution result is not completed.
Specifically, the server cluster may be any type of server cluster. In this embodiment the server cluster is a Kubernetes cluster; Kubernetes is an open-source container management platform, and the cluster's resources can be used to run various container-based applications. The nodes in the cluster can be broadly divided into worker (load) nodes and master (management) nodes. The target task comprises a task script and a result recording script, both of which are executed on the node server at each node of the server cluster. The task script runs the target task; the result recording script records the result of the node server executing the target task and generates log data. The log data of all node servers in the operation and maintenance execution system are returned to the operation and maintenance management platform in the operation and maintenance execution system, which assembles all the log data into a log file. After receiving the log file, the task management system first obtains the log data in the log file and extracts the execution result of each node server from the log data as the node task execution result. It then determines, from the log data, the server cluster to which each node server belongs, and determines the cluster task execution result of each server cluster from the execution results of its node servers: if the execution results of all node servers in the server cluster are completed, the cluster task execution result is completed; otherwise it is not completed. That is, if the execution result of at least one node server in the server cluster is not completed, the cluster task execution result is not completed.
The total task execution result is then determined from the cluster task execution results of all the server clusters: if the cluster task execution results of all server clusters are completed, the total task execution result is completed; otherwise it is not completed. That is, if the cluster task execution result of at least one server cluster is not completed, the total task execution result is not completed.
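The roll-up from node results to cluster results to the total result is a pair of "all completed" checks, sketched below. The input shape (a mapping from `(cluster, node)` to a completed flag) and the function name `aggregate` are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate(node_results: Dict[Tuple[str, str], bool]
              ) -> Tuple[Dict[str, bool], bool]:
    """Derive cluster-level and total results from node-level results."""
    per_cluster: Dict[str, List[bool]] = defaultdict(list)
    for (cluster, _node), completed in node_results.items():
        per_cluster[cluster].append(completed)
    # A cluster is completed only if every one of its nodes is completed.
    cluster_results = {c: all(flags) for c, flags in per_cluster.items()}
    # The total task is completed only if every cluster is completed.
    # Note: with no clusters at all, all() is vacuously True.
    total_result = all(cluster_results.values())
    return cluster_results, total_result


# Hypothetical node results, for illustration only.
cluster_results, total_result = aggregate({
    ("c1", "10.0.0.1"): True,
    ("c1", "10.0.0.2"): False,
    ("c2", "10.0.1.1"): True,
})
```

Here one unfinished node in `c1` marks both that cluster and the total task as not completed, while `c2` remains completed.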
In one embodiment, the task information includes total task information, cluster task information, and node task information; storing the execution result corresponding to the target task to the position corresponding to the task information in the database comprises:
step 402, storing the total task execution result to a position corresponding to the total task information in a database;
step 404, storing the cluster task execution result to a position corresponding to the cluster task information in a database;
and step 406, storing the node task execution result to a position corresponding to the node task information in the database.
Specifically, each task information includes total task information, cluster task information, and node task information, where the total task information includes a task identifier, task version information, and the like of a target task, the cluster task information includes a corresponding cluster identifier, task version information, node details, and the like, and the node task information includes a node IP, task version information, and the like. Each kind of task information is stored in different task tables, the database comprises a total task table, a cluster task table and a node task table, the total task information is recorded in the total task table, the cluster task information is recorded in the cluster task table, and the node task information is recorded in the node task table. Different execution results can be recorded in the corresponding task table, or different execution result tables can be established separately, and the mapping relation between the different execution result tables and the corresponding task information is established.
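One possible layout of the three task tables is sketched below using Python's built-in `sqlite3` module. The table names, column names, and helper functions are hypothetical; the embodiment only states that total, cluster, and node task information are kept in separate tables and that execution results are stored at the corresponding positions. Here the result is recorded in a column of the same table, which is the first of the two options described above.

```python
import sqlite3

# Hypothetical schema: one table per level of task information.
SCHEMA = """
CREATE TABLE total_task (
    task_id TEXT PRIMARY KEY,
    task_version TEXT,
    result TEXT
);
CREATE TABLE cluster_task (
    task_id TEXT,
    cluster_id TEXT,
    task_version TEXT,
    result TEXT,
    PRIMARY KEY (task_id, cluster_id)
);
CREATE TABLE node_task (
    task_id TEXT,
    cluster_id TEXT,
    node_ip TEXT,
    task_version TEXT,
    result TEXT,
    PRIMARY KEY (task_id, node_ip)
);
"""


def create_task_tables(conn):
    """Create the total, cluster, and node task tables."""
    conn.executescript(SCHEMA)


def register_task(conn, task_id, version, nodes_by_cluster):
    """Create task information rows with empty results for a new target task."""
    conn.execute("INSERT INTO total_task VALUES (?, ?, NULL)", (task_id, version))
    for cluster_id, node_ips in nodes_by_cluster.items():
        conn.execute(
            "INSERT INTO cluster_task VALUES (?, ?, ?, NULL)",
            (task_id, cluster_id, version),
        )
        for node_ip in node_ips:
            conn.execute(
                "INSERT INTO node_task VALUES (?, ?, ?, ?, NULL)",
                (task_id, cluster_id, node_ip, version),
            )


def store_results(conn, task_id, node_results, cluster_results, total_result):
    """Write each execution result next to the matching task information row."""
    conn.execute(
        "UPDATE total_task SET result = ? WHERE task_id = ?", (total_result, task_id)
    )
    for cluster_id, result in cluster_results.items():
        conn.execute(
            "UPDATE cluster_task SET result = ? WHERE task_id = ? AND cluster_id = ?",
            (result, task_id, cluster_id),
        )
    for (cluster_id, node_ip), result in node_results.items():
        conn.execute(
            "UPDATE node_task SET result = ? WHERE task_id = ? AND node_ip = ?",
            (result, task_id, node_ip),
        )
```

Separate result tables linked to the task tables by a mapping relation, the second option described above, would work equally well; the single-table form is used here only to keep the sketch short.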
In one embodiment, receiving a log file corresponding to a target task sent by an operation and maintenance execution system includes: and responding to a reporting request sent by the operation and maintenance execution system, and receiving a log file corresponding to a target task sent by the operation and maintenance execution system.
Specifically, after all server clusters in the operation and maintenance execution system have completed a target task and the log file has been generated, a reporting request is generated and automatically sent to the task management system. The reporting request carries the log file corresponding to the target task, and the task management system automatically receives the log file upon receiving the reporting request. This further automates the management of the log files of target tasks: the files can be analyzed and the corresponding execution results stored without operation and maintenance personnel having to monitor the execution of the target task continuously.
In one embodiment, the operation and maintenance task management method further includes:
step 502, acquiring a task query request;
and step 504, inquiring task information and/or execution results according to the task inquiry request.
Specifically, the operation and maintenance task management system in this embodiment can automatically query the corresponding task information according to information input by a user. The task query request may carry any item of task information or any execution result. Given an item of task information, the system queries all information on the corresponding target task, including the execution result of the target task, the information and cluster task execution result of each corresponding server cluster, and the information and node task execution result of each node server in those server clusters. For example, to query the information of a certain target task, the task query request carries the target task identifier, and the task management system uses the identifier to look up the task version information, the corresponding server cluster information, the corresponding node server information, and the various execution results of the target task. Given an execution result, the system can query which total tasks were executed successfully or unsuccessfully, and which cluster tasks or node tasks succeeded or failed.
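The two query directions, by task information and by execution result, can be illustrated with a small in-memory sketch. The nested record structure, the field names, and both function names are assumptions for illustration; a real system would run the equivalent queries against its database.

```python
def query_tasks(task_records, task_id=None, result=None):
    """Return the task records matching the given identifier and/or result."""
    return [
        record
        for record in task_records
        if (task_id is None or record["task_id"] == task_id)
        and (result is None or record["result"] == result)
    ]


def unfinished_nodes(record):
    """List (cluster_id, node_ip) pairs whose node task result is not completed."""
    return [
        (cluster_id, node_ip)
        for cluster_id, cluster in record["clusters"].items()
        for node_ip, node_result in cluster["nodes"].items()
        if node_result != "completed"
    ]
```

Calling `query_tasks(records, task_id="task-1")` returns everything recorded for that task, while `query_tasks(records, result="not completed")` lists the total tasks that did not complete; `unfinished_nodes` then narrows a task down to the node servers responsible.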
In one embodiment, the operation and maintenance execution system includes a plurality of execution machines and a plurality of server clusters, each execution machine is connected to the plurality of server clusters, and the method further includes:
step 602, respectively obtaining a cluster identifier of each server cluster in a plurality of server clusters;
step 604, dividing the cluster identifier into N parts to obtain N cluster identifier sets, where N is a positive integer;
and step 606, respectively sending the N cluster identifier sets to N different execution machines, so that each execution machine sends the target task to the corresponding server clusters according to the received cluster identifier set.
The task management system can also optimize the operation and maintenance execution system's scheduling strategy for the target task. An execution machine distributes the target task to the corresponding server clusters so that the server clusters execute it, and the distribution strategy of the execution machines is determined automatically by the task management system.
Ansible is a single-task scheduling system: one command can execute only one operation and maintenance operation, and only one Ansible execution machine can send tasks at a time. Ansible also lacks distributed deployment and scheduling capability, so when multiple tasks are changed in production it cannot automatically perform distributed scheduling and parallel processing of the operation and maintenance tasks, and operation and maintenance personnel must manually plan which tasks run on which Ansible task machines. The task management system of this embodiment therefore also optimizes the scheduling policy of the target task.
Specifically, the task management system obtains the cluster identifier of each server cluster in the plurality of server clusters. The cluster identifiers may be stored in the task management system, in which case the task management system reads all of them locally; they may also be stored in the operation and maintenance management platform of the operation and maintenance execution system, in which case the task management system receives them from that platform. After all the cluster identifiers are obtained, they are divided into N parts, either equally or randomly, where N is a preset degree of concurrency. The resulting N cluster identifier sets are then sent to N execution machines in the form of script packages, so that each execution machine receives one cluster identifier set. After receiving a script package containing a cluster identifier set, the execution machine parses the script package, determines the clusters to which the target task must be sent, and sends the target task to the corresponding server clusters according to the cluster identifiers in the received set. The target task is thus distributed automatically, operation and maintenance personnel no longer need to draw up an execution plan for the operation and maintenance task, and operation and maintenance time and labor costs are reduced.
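Dividing the cluster identifiers into N sets can be sketched as below. The function name and the `shuffle` and `seed` parameters are hypothetical; the embodiment only states that the split may be equal or random and that N is a preset degree of concurrency.

```python
import random


def split_cluster_ids(cluster_ids, n, shuffle=False, seed=None):
    """Divide cluster identifiers into n sets whose sizes differ by at most one.

    With shuffle=True the identifiers are randomly reordered first, which
    gives a random division; otherwise the input order is kept.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    ids = list(cluster_ids)
    if shuffle:
        random.Random(seed).shuffle(ids)
    # Round-robin slicing: set i takes items i, i + n, i + 2n, ...
    return [ids[i::n] for i in range(n)]
```

If N exceeds the number of clusters, some of the returned sets are empty; a caller would presumably cap N at the number of clusters or skip empty sets before sending them to execution machines.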
In one embodiment, sending the N sets of cluster identifiers to different N execution machines respectively includes:
step 702, acquiring resource use conditions of a plurality of execution machines;
step 704, determining the N execution machines with the minimum resource usage among the plurality of execution machines according to the resource usage;
and step 706, respectively sending the N cluster identifier sets to the N execution machines with the minimum resource usage.
Specifically, resource usage refers to the usage of the central processing unit (CPU) or memory (MEM) of an execution machine. The task management system acquires the resource usage of the plurality of execution machines, determines from it the N execution machines with the lowest resource usage, and distributes the divided cluster identifier sets to those N execution machines.
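Selecting the N least-loaded execution machines and pairing them with the cluster identifier sets might look like the following sketch. It assumes each machine's resource usage has already been reduced to a single number (for example a CPU or memory utilization fraction); the machine names and function names are hypothetical.

```python
import heapq


def pick_least_loaded(usage_by_machine, n):
    """Return the n execution machines with the lowest resource usage."""
    if n > len(usage_by_machine):
        raise ValueError("not enough execution machines for n sets")
    return heapq.nsmallest(n, usage_by_machine, key=usage_by_machine.get)


def assign_sets(usage_by_machine, cluster_id_sets):
    """Give each cluster identifier set to one of the least-loaded machines."""
    machines = pick_least_loaded(usage_by_machine, len(cluster_id_sets))
    return dict(zip(machines, cluster_id_sets))
```

For example, with usage `{"exec-a": 0.72, "exec-b": 0.15, "exec-c": 0.40}` and two identifier sets, the sets go to `exec-b` and `exec-c`, and `exec-a` receives nothing.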
In an embodiment, there is further provided an operation and maintenance method, as shown in fig. 3, applied to an operation and maintenance execution system, where the operation and maintenance execution system is connected to a task management system, and the method includes:
step 802, receiving the operation and maintenance task, if the operation and maintenance task comprises a result recording script, judging the operation and maintenance task as a target task, and sending a task management request to a task management system so that the task management system generates task information corresponding to the task management request in a database of the task management system;
step 804, executing the target task and generating a log file;
step 806, sending the log file corresponding to the target task to the task management system, so that the task management system analyzes the log file to obtain an execution result corresponding to the target task, and storing the log file corresponding to the target task to a position corresponding to the task information in the database.
The operation and maintenance task is uploaded to the operation and maintenance execution system in the form of a script package. It may be uploaded manually by operation and maintenance personnel, or it may be stored in the task management system, in which case the operation and maintenance execution system retrieves the script package from the task management system when the task needs to be executed. The operation and maintenance execution system parses the script package and determines whether it includes a result recording script, which generates log data containing the execution result of the operation and maintenance task after the task has been executed. If the operation and maintenance task includes the result recording script, its execution result can be recorded automatically in the task management system, and the operation and maintenance task is determined to be a target task. A task management request is then sent to the task management system, which automatically generates task information corresponding to the task management request in its database. The operation and maintenance execution system executes the target task; that is, the server clusters in the operation and maintenance execution system execute the target task and generate log data, which is returned to the operation and maintenance management platform of the operation and maintenance execution system to form a log file. Finally, the log file is sent to the task management system, which analyzes it and stores the analysis result at the position corresponding to the task information in the database.
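The flow on the operation and maintenance execution system side can be sketched as follows. Everything concrete here is an assumption for illustration: the script package is modelled as a mapping from file names to contents, the result recording script is identified by a hypothetical file name, and the three callables stand in for whatever transport and execution mechanisms a real system uses. The embodiment does not say how a task without a result recording script is handled; this sketch simply executes it without reporting.

```python
# Hypothetical file name used to recognise the result recording script.
RESULT_SCRIPT_NAME = "record_result.sh"


def is_target_task(script_package):
    """A task is a target task if its package contains a result recording script."""
    return RESULT_SCRIPT_NAME in script_package


def handle_ops_task(task_id, script_package, send_request, run_task, send_log):
    """Run one operation and maintenance task and report it if it is a target task.

    send_request, run_task and send_log are caller-supplied callables
    (assumptions of this sketch). Returns True when results were reported.
    """
    if not is_target_task(script_package):
        # Assumption: non-target tasks are executed but not reported.
        run_task(script_package)
        return False

    # Step 802: ask the task management system to create the task information.
    send_request({"task_id": task_id})
    # Step 804: execute the target task and collect the log file.
    log_file = run_task(script_package)
    # Step 806: send the log file for analysis and storage.
    send_log(task_id, log_file)
    return True
```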
It should be understood that, although the steps in the flowcharts related to the above embodiments are shown in sequence as indicated by the arrows, the steps are not necessarily executed in that sequence. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least some of the steps in the flowcharts related to the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; nor are these sub-steps or stages necessarily performed sequentially, as they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the present application further provides an operation and maintenance task management device for implementing the above-mentioned operation and maintenance task management method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme described in the above method, so specific limitations in one or more embodiments of the operation and maintenance task management device provided below may refer to the limitations in the above operation and maintenance task management method, and details are not described here.
In one embodiment, as shown in fig. 4, there is provided an operation and maintenance task management apparatus 400, including: a task information generating module 401, a receiving module 402, an analyzing module 403 and a storing module 404, wherein:
the task information generating module 401 is configured to respond to a task management request sent by an operation and maintenance execution system, and generate task information corresponding to the task management request in a database of the task management system; the task management request is generated by the operation and maintenance execution system under the condition of receiving the target task.
The receiving module 402 is configured to receive a log file corresponding to a target task sent by an operation and maintenance execution system, where the log file is generated after the operation and maintenance execution system executes the target task.
And the analysis module 403 is configured to analyze the log file to obtain an execution result corresponding to the target task.
The storage module 404 is configured to store an execution result corresponding to the target task to a position corresponding to the task information in the database.
In one embodiment, the operation and maintenance execution system comprises a plurality of server clusters, each server cluster comprising a corresponding plurality of node servers; the execution result comprises a total task execution result, a cluster task execution result and a node task execution result; the analysis module 403 includes: the system comprises a node task result acquisition module, a cluster task result analysis module and a total task analysis module, wherein the node task result acquisition module is used for acquiring a node task execution result of each node server in a log file; the cluster task result analysis module is used for obtaining a cluster task execution result of each server cluster corresponding to the task information according to the node task execution result of each node server; if the node task execution results of all the node servers corresponding to the server cluster are all completed, the cluster task execution result is completed, otherwise, the cluster task execution result is not completed; the total task analysis module is used for obtaining a total task execution result according to the cluster task execution result of each server cluster; if the task execution results of all the server clusters are completed, the total task execution result is completed, otherwise, the total task execution result is not completed.
In one embodiment, the task information includes total task information, cluster task information, and node task information; the storage module 404 includes: the system comprises a total task execution result storage module, a cluster task execution result storage module and a node task execution result storage module, wherein the total task execution result storage module is used for storing a total task execution result to a position corresponding to total task information in a database; the cluster task execution result storage module is used for storing the cluster task execution result to a position corresponding to the cluster task information in the database; and the node task execution result storage module is used for storing the node task execution result to a position corresponding to the node task information in the database.
In an embodiment, the receiving module 402 is specifically configured to receive, in response to a report request sent by an operation and maintenance execution system, a log file corresponding to a target task sent by the operation and maintenance execution system.
In one embodiment, the operation and maintenance task management device further includes: the query module is used for acquiring a task query request; and inquiring task information and/or an execution result corresponding to the task inquiry request according to the task inquiry request.
In one embodiment, the operation and maintenance execution system includes a plurality of execution machines and a plurality of server clusters, each execution machine is connected to the plurality of server clusters, and the operation and maintenance task management device further includes: the system comprises a cluster identifier acquisition module, a distribution module and a set sending module, wherein the cluster identifier acquisition module is used for respectively acquiring the cluster identifier of each server cluster in a plurality of server clusters; the distribution module is used for dividing the cluster identifier into N parts to obtain N cluster identifier sets, wherein N is a positive integer; the set sending module is used for sending the N cluster identifier sets to different N execution machines respectively, so that each execution machine sends the target task to the corresponding server cluster according to the received cluster identifier set.
In one embodiment, the set sending module is specifically configured to obtain resource usage of a plurality of execution machines; determining N execution machines with the minimum resource usage in the multiple execution machines according to the resource usage condition; and respectively sending the N cluster identifier sets to the N execution machines with the minimum resource usage.
In one embodiment, as shown in fig. 5, there is provided an operation and maintenance device, including: a receiving judgment module 501, an execution module 502 and a log sending module 503, wherein:
the receiving and determining module 501 is configured to receive the operation and maintenance task, determine that the operation and maintenance task is a target task if the operation and maintenance task includes an execution result recording script, and send the task management request to the task management system, so that the task management system generates task information corresponding to the task management request in a database of the task management system.
An execution module 502 for executing the target task and generating a log file;
the log sending module 503 is configured to send the log file corresponding to the target task to the task management system, so that the task management system analyzes the log file to obtain an execution result corresponding to the target task, and stores the log file corresponding to the target task in a location corresponding to the task information in the database.
The modules in the operation and maintenance task management device and in the operation and maintenance device may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in fig. 6. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores operation and maintenance data. The network interface of the computer device communicates with an external terminal through a network connection. The computer program, when executed by the processor, implements an operation and maintenance task management method.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the above-described method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include a Read-Only Memory (ROM), a magnetic tape, a floppy disk, a flash memory, an optical memory, a high-density embedded non-volatile memory, a Resistive Random Access Memory (ReRAM), a Magnetic Random Access Memory (MRAM), a Ferroelectric Random Access Memory (FRAM), a Phase Change Memory (PCM), a graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others. The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be, without limitation, general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (13)

1. An operation and maintenance task management method is applied to a task management system, the task management system is connected with an operation and maintenance execution system, and the method comprises the following steps:
responding to a task management request sent by the operation and maintenance execution system, and generating task information corresponding to the task management request in a database of the task management system; the task management request is generated by the operation and maintenance execution system under the condition of receiving a target task;
receiving a log file corresponding to the target task sent by the operation and maintenance execution system, wherein the log file is generated after the operation and maintenance execution system executes the target task;
analyzing the log file to obtain an execution result corresponding to the target task;
and storing the execution result corresponding to the target task to the position corresponding to the task information in the database.
2. The method of claim 1, wherein the operation and maintenance execution system comprises a plurality of server clusters, each server cluster comprising a corresponding plurality of node servers; the execution result comprises a total task execution result, a cluster task execution result and a node task execution result;
the analyzing the log file to obtain the execution result corresponding to the target task includes:
acquiring a node task execution result of each node server in the log file;
obtaining a cluster task execution result of each server cluster corresponding to the task information according to the node task execution result of each node server; if the node task execution results of all the node servers corresponding to the server cluster are all completed, the cluster task execution result is completed, otherwise, the cluster task execution result is not completed;
obtaining a total task execution result according to a cluster task execution result of each server cluster; if the task execution results of all the server clusters are completed, the total task execution result is completed, otherwise, the total task execution result is not completed.
3. The method of claim 2, wherein the task information includes total task information, cluster task information, and node task information;
the storing of the execution result corresponding to the target task to the position corresponding to the task information in the database includes:
storing the total task execution result to a position corresponding to the total task information in the database;
storing the cluster task execution result to a position corresponding to the cluster task information in the database;
and storing the node task execution result to a position corresponding to the node task information in the database.
4. The method of claim 1, wherein the receiving the log file corresponding to the target task sent by the operation and maintenance execution system comprises:
and responding to a report request sent by the operation and maintenance execution system, and receiving a log file corresponding to the target task sent by the operation and maintenance execution system.
5. The method of claim 1, further comprising:
acquiring a task query request;
and inquiring task information and/or the execution result corresponding to the task inquiry request according to the task inquiry request.
6. The method of claim 1, wherein the operation and maintenance execution system comprises a plurality of execution machines and a plurality of server clusters, each execution machine being connected to the plurality of server clusters, the method further comprising:
respectively acquiring a cluster identifier of each server cluster in the plurality of server clusters;
dividing the cluster identifier into N parts to obtain N cluster identifier sets, wherein N is a positive integer;
and respectively sending the N cluster identifier sets to different N execution machines so that each execution machine sends the target task to a corresponding server cluster according to the received cluster identifier set.
7. The method of claim 6, wherein the sending the N sets of cluster identifiers to different N execution machines, respectively, comprises:
acquiring resource use conditions of the plurality of execution machines;
determining N execution machines with the minimum resource usage in the plurality of execution machines according to the resource usage condition;
and respectively sending the N cluster identifier sets to the N execution machines with the minimum resource usage.
8. An operation and maintenance method is applied to an operation and maintenance execution system, the operation and maintenance execution system is connected with a task management system, and the method comprises the following steps:
receiving an operation and maintenance task, if the operation and maintenance task comprises a result recording script, judging that the operation and maintenance task is a target task, and sending a task management request to the task management system so that the task management system generates task information corresponding to the task management request in a database of the task management system;
executing the target task and generating a log file;
and sending the log file corresponding to the target task to the task management system so that the task management system analyzes the log file to obtain an execution result corresponding to the target task, and storing the log file corresponding to the target task to a position corresponding to the task information in the database.
9. An operation and maintenance task management device is applied to a task management system, the task management system is connected with an operation and maintenance execution system, and the device comprises:
the task information generating module is used for responding to a task management request sent by the operation and maintenance execution system and generating task information corresponding to the task management request in a database of the task management system; the task management request is generated by the operation and maintenance execution system under the condition of receiving a target task;
the receiving module is used for receiving a log file corresponding to the target task sent by the operation and maintenance execution system, wherein the log file is generated after the operation and maintenance execution system executes the target task;
the analysis module is used for analyzing the log file to obtain an execution result corresponding to the target task;
and the storage module is used for storing the execution result corresponding to the target task to the position corresponding to the task information in the database.
10. An operation and maintenance device, comprising:
the receiving and judging module is used for receiving the operation and maintenance task, judging the operation and maintenance task as a target task if the operation and maintenance task comprises an execution result recording script, and sending a task management request to the task management system so that the task management system generates task information corresponding to the task management request in a database of the task management system;
the execution module is used for executing the target task and generating a log file;
and the log sending module is used for sending the log file corresponding to the target task to the task management system so that the task management system analyzes the log file to obtain an execution result corresponding to the target task, and storing the log file corresponding to the target task to a position corresponding to the task information in the database.
11. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 8.
12. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
13. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 8 when executed by a processor.
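The interaction between the task management device of claim 9 and the operation and maintenance device of claim 10 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the in-memory dictionary standing in for the database, the `EXEC_RESULT=` log marker, and the `has_result_recording_script` flag are all hypothetical names chosen for illustration, and the handling of non-target tasks is an assumption, since the claims do not specify it.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical marker assumed to be written by the execution result recording script.
RESULT_PREFIX = "EXEC_RESULT="


@dataclass
class TaskInfo:
    task_id: int
    name: str
    execution_result: Optional[str] = None


class TaskManagementSystem:
    """Claim 9 side: generates task information, parses logs, stores results."""

    def __init__(self) -> None:
        # A dict stands in for the database of the task management system.
        self.database: Dict[int, TaskInfo] = {}
        self._ids = itertools.count(1)

    def handle_task_management_request(self, task_name: str) -> int:
        # Task information generating module.
        task_id = next(self._ids)
        self.database[task_id] = TaskInfo(task_id, task_name)
        return task_id

    @staticmethod
    def parse_log(log_text: str) -> str:
        # Analysis module: take the last result marker found in the log.
        for line in reversed(log_text.splitlines()):
            if line.startswith(RESULT_PREFIX):
                return line[len(RESULT_PREFIX):].strip()
        return "UNKNOWN"

    def receive_log(self, task_id: int, log_text: str) -> None:
        # Receiving module plus storage module: store the parsed result
        # at the position of the matching task information.
        self.database[task_id].execution_result = self.parse_log(log_text)


@dataclass
class OpsTask:
    name: str
    run: Callable[[], bool]
    has_result_recording_script: bool = False


class OpsExecutionSystem:
    """Claim 10 side: identifies target tasks, executes them, sends the log."""

    def __init__(self, tms: TaskManagementSystem) -> None:
        self.tms = tms

    def receive(self, task: OpsTask) -> Optional[int]:
        if not task.has_result_recording_script:
            # Assumption: non-target tasks run without being tracked.
            task.run()
            return None
        task_id = self.tms.handle_task_management_request(task.name)
        log_lines = [f"start {task.name}"]
        try:
            ok = task.run()
        except Exception as exc:
            log_lines.append(f"error: {exc}")
            ok = False
        log_lines.append(RESULT_PREFIX + ("SUCCESS" if ok else "FAILURE"))
        self.tms.receive_log(task_id, "\n".join(log_lines))
        return task_id
```

In this sketch, a task flagged as carrying a result recording script is registered before it runs, so the stored execution result always has task information to attach to, even when execution fails.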
CN202210543761.6A 2022-05-19 2022-05-19 Operation and maintenance task management method, operation and maintenance method, device, equipment and medium Pending CN114968739A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210543761.6A CN114968739A (en) 2022-05-19 2022-05-19 Operation and maintenance task management method, operation and maintenance method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210543761.6A CN114968739A (en) 2022-05-19 2022-05-19 Operation and maintenance task management method, operation and maintenance method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN114968739A true CN114968739A (en) 2022-08-30

Family

ID=82985664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210543761.6A Pending CN114968739A (en) 2022-05-19 2022-05-19 Operation and maintenance task management method, operation and maintenance method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114968739A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115766862A (en) * 2022-11-16 2023-03-07 中国工商银行股份有限公司 Container operation and maintenance method and device, computer equipment and storage medium
CN116400928A (en) * 2023-02-08 2023-07-07 广东名阳信息科技有限公司 Method and system for improving operation and maintenance efficiency based on log data
CN117312042A (en) * 2023-12-01 2023-12-29 之江实验室 Operation and maintenance method and operation and maintenance system of computer cluster

Similar Documents

Publication Publication Date Title
US11392561B2 (en) Data migration using source classification and mapping
US11144361B2 (en) System and method for automatic dependency analysis for use with a multidimensional database
US11797496B2 (en) System and method for parallel support of multidimensional slices with a multidimensional database
US20200104377A1 (en) Rules Based Scheduling and Migration of Databases Using Complexity and Weight
US20190230000A1 (en) Intelligent analytic cloud provisioning
CN114968739A (en) Operation and maintenance task management method, operation and maintenance method, device, equipment and medium
CN109408746B (en) Image information query method, image information query device, computer equipment and storage medium
US20200142872A1 (en) System and method for use of a dynamic flow in a multidimensional database environment
US7979858B2 (en) Systems and methods for executing a computer program that executes multiple processes in a multi-processor environment
US9135071B2 (en) Selecting processing techniques for a data flow task
US20110307291A1 (en) Creating a capacity planning scenario
US10158709B1 (en) Identifying data store requests for asynchronous processing
CN108073696B (en) GIS application method based on distributed memory database
CN105468720A (en) Method for integrating distributed data processing systems, corresponding systems and data processing method
US8892557B2 (en) Optimal persistence of a business process
CN112163048A (en) Method and device for realizing OLAP analysis based on ClickHouse
US20200250192A1 (en) Processing queries associated with multiple file formats based on identified partition and data container objects
CN117389830A (en) Cluster log acquisition method and device, computer equipment and storage medium
CN111752539A (en) BI service cluster system and building method thereof
CN113946618B (en) Cloud data interaction method and system, computer equipment and storage medium
CN114462859A (en) Workflow processing method and device, computer equipment and storage medium
CN103488792A (en) PM2.5 monitoring, storing and processing method achieved through cloud computing
CN109918410B (en) Spark platform based distributed big data function dependency discovery method
CN111782363A (en) Method and flow system for supporting multi-service scene calling
CN103761248A (en) Method and system for querying data through main memory database

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination