CN105915633B - Automatic operation and maintenance system and method - Google Patents

Publication number
CN105915633B
Authority
CN
China
Prior art keywords
task
automation
server
automatic
receiving
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610389650.9A
Other languages
Chinese (zh)
Other versions
CN105915633A (en)
Inventor
朱宇
张恒华
王丽梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201610389650.9A
Publication of CN105915633A
Application granted
Publication of CN105915633B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/562 Brokering proxy services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/04 Network management architectures or arrangements
    • H04L41/046 Network management architectures or arrangements comprising network management agents or mobile agents therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/566 Grouping or aggregating service requests, e.g. for unified processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/567 Integrating service provisioning from a plurality of service providers

Abstract

The application discloses an automated operation and maintenance system and method. One embodiment of the automated operation and maintenance system comprises: a task receiving server for receiving an automation task from a resource management system, adding the automation task to a corresponding task cache queue, and determining the type of the automation task; a task distribution server for distributing the automation tasks in the task cache queue to automation task processing servers of the corresponding types; and an automation task processing server for processing the received automation task. This embodiment reduces the coupling between the resource management system and the automation task processing servers, thereby reducing development difficulty and improving operation and maintenance efficiency.

Description

Automatic operation and maintenance system and method
Technical Field
The application relates to the technical field of computers, in particular to the technical field of system maintenance or management, and particularly relates to an automatic operation and maintenance system and method.
Background
An automation platform may provide services for server automation, network automation, and DNS domain name management within a system. As network demand increases, system scale grows and the volume of automation tasks grows with it. The automated processing modules in such a system are developed and brought online independently of one another and differ in structural design, so the automation platform operates inefficiently and is difficult to maintain. In addition, each module has its own log processing method and stores its logs locally, so the automation task processing logs in the system cannot be queried and analyzed efficiently. The configuration information of each module is also stored locally, and when the configuration is modified the corresponding module must be restarted for the change to take effect, which may interrupt the automated operation and maintenance service.
Disclosure of Invention
In view of the above, it is desirable to provide an automated operation and maintenance management architecture that is efficient and easy to maintain, and further desirable to provide an automated operation and maintenance system that manages logs and configuration information efficiently. To address one or more of the problems set forth above, the present application provides an automated operation and maintenance system and method.
In a first aspect, the present application provides an automated operation and maintenance system, including: a task receiving server for receiving an automation task from a resource management system, adding the automation task to a corresponding task cache queue, and determining the type of the automation task; a task distribution server for distributing the automation tasks in the task cache queue to automation task processing servers of the corresponding types; and an automation task processing server for processing the received automation task.
In some embodiments, the automation task includes a task keyword; the task receiving server includes: a transceiver module for receiving an automation task from the resource management system and determining the type of the automation task according to the task keyword; a cache module for adding the automation task to the corresponding task cache queue according to its type; and a control module for extracting the automation tasks from the task cache queue and sending them to the task distribution server.
In some embodiments, the automation task processing server is further configured to generate automation task state information and send the automation task state information to the task distribution server; the task distribution server is also used for receiving the automation task state information from the automation task processing server and sending the automation task state information to the control module; the control module is also used for receiving the automatic task state information sent by the task distribution server; the transceiver module is further configured to send the automation task state information to the resource management system.
In some embodiments, the cache module is further configured to add the automation task state information to a message cache queue; the transceiver module is further configured to extract the automation task state information from the message cache queue and send the extracted automation task state information to the resource management system.
In some embodiments, the automation task processing server is further configured to: configure a task type; and send the task type to the task distribution server so as to register the task type with the task distribution server.
In some embodiments, the task distribution server is further configured to receive the task type and store the automation task processing server in association with the task type.
In some embodiments, the task type of the automated task processing server includes at least one of: domain name system automation, network address translation automation, gateway automation, and server automation.
In some embodiments, the task distribution server further comprises a protocol conversion module; the protocol conversion module is used for converting a data protocol between the automatic task processing server and the task distribution server.
In some embodiments, the task receiving server further comprises a first log collection module for collecting log information of the task receiving server; the task distribution server further comprises a second log collection module for collecting log information of the task distribution server; the automation task processing server further comprises a third log collection module for collecting log information of the automation task processing server; and the system further comprises a log management server for acquiring the log information of the task receiving server, the log information of the task distribution server and the log information of the automation task processing server through the first log collection module, the second log collection module and the third log collection module respectively.
In some embodiments, the system further comprises a configuration update server for updating the configuration information of the task receiving server, the task distribution server and the automation task processing server.
In a second aspect, the present application provides an automated operation and maintenance method, including: receiving an automation task from a resource management system and determining the type of the automation task, wherein the automation task comprises a task keyword; adding the automation task to a corresponding task cache queue; and sending the automation tasks in the task cache queue to a task distribution server so that the task distribution server can distribute the automation tasks to automation task processing servers of the corresponding task types.
In some embodiments, the method further comprises: receiving automatic task state information sent by the task distribution server; sending the automated task state information to the resource management system; wherein the automation task state information is generated by the automation task processing server.
In some embodiments, prior to sending the automation task state information to the resource management system, the method further comprises: adding the automation task state information to a message buffer queue; and said sending said automated task state information to said resource management system comprises: reading the automation task state information from the message buffer queue; and sending the read automatic task state information to the resource management system.
In some embodiments, the task type of the automation task processing server is preconfigured and registered in the task distribution server.
In some embodiments, the task type includes at least one of: domain name system automation, network address translation automation, gateway automation, and server automation.
In some embodiments, the method further comprises: collecting operation and maintenance log information, and sending the operation and maintenance log information to a log management server; and updating the configuration information in response to monitoring the updating operation of the configuration updating server.
In the automated operation and maintenance system and method provided by the present application, the task receiving server and the task distribution server carry the communication between the resource management system and each automation task processing server. This reduces the coupling between the resource management system and the automation task processing servers, lowers development difficulty, and helps keep the automation services running stably and efficiently.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings in which:
FIG. 1 is a schematic system architecture diagram of an automated operation and maintenance system to which the present application may be applied;
FIG. 2 is a schematic block diagram of one embodiment of an automated operation and maintenance system according to the present application;
FIG. 3 is a schematic diagram of data interaction in the automated operation and maintenance system shown in FIG. 2;
FIG. 4 is a schematic diagram of a specific application scenario of the automated operation and maintenance system;
FIG. 5 is a flow diagram of one embodiment of an automated operation and maintenance management method according to the present application;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing a server according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit it. It should be noted that, for convenience of description, only the portions related to the invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Referring to FIG. 1, there is shown a schematic system architecture diagram of an automated operation and maintenance system to which the present application may be applied.
As shown in fig. 1, the system architecture 100 includes a terminal device 101, an operation and maintenance management server 102, and automation processing servers 103, 104, 105, and the like. The terminal device 101 may be connected to the operation and maintenance management server 102 through a wired connection or a wireless connection.
The operation and maintenance personnel 110 can use the terminal device 101 to interact with the operation and maintenance management server 102. The terminal device 101 may install an operation platform that controls the operation and maintenance management server 102. The operation and maintenance personnel 110 can execute the operation and maintenance operation on the operation platform, and the terminal device 101 can generate an operation and maintenance instruction according to the operation and maintenance operation of the operation and maintenance personnel 110 and send the operation and maintenance instruction to the operation and maintenance management server. The operation platform may also show the operation and maintenance status of the system to the operation and maintenance personnel 110.
The operation and maintenance management server 102 may receive the operation and maintenance instruction sent by the terminal device 101, generate an automation processing request after analyzing the instruction, and send the automation processing request to the automation processing servers 103, 104, 105, and the like.
The automation processing servers 103, 104, 105 may perform automation processing tasks included in the automation processing request, such as automatic assignment of network addresses, disk formatting, server offline, and the like.
It should be understood that the numbers of terminal devices, operation and maintenance management servers, and automation processing servers in fig. 1 are merely illustrative. Any number of terminal devices, operation and maintenance management servers and automation processing servers can be provided according to implementation requirements.
Referring to fig. 2, a schematic structural diagram of an embodiment of an automated operation and maintenance system according to the present application is shown. As shown in fig. 2, the automated operation and maintenance system 200 includes a task receiving server 210, a task distribution server 220, and an automation task processing server 230. The task receiving server 210 is configured to receive an automation task from the resource management system, add the automation task to a corresponding task cache queue, and determine the type of the automation task. The task distribution server 220 is configured to distribute the automation tasks in the task cache queue to an automation task processing server of the corresponding type. The automation task processing server 230 is configured to process the received automation task.
In this embodiment, the task receiving server 210 may first receive an automation task sent by an RMS (Resource Management System). The RMS manages the computers in a network system, monitors the running state of each computer in real time, and generates automation tasks according to service requirements. The automation tasks may include server automation, network automation, DNS (Domain Name System) management, and other tasks. Specifically, the automation tasks may include automated operations such as automatic network address allocation, system reinstallation, switch update, machine offline, automatic domain name allocation, and disk formatting.
The task receiving server can establish a task cache queue by using a Redis storage system, and store the received automation tasks into the task cache queue. A determination may also be made as to the type of automation task received. The type of the automation task may include one or more of domain name system automation, network address translation automation, gateway automation, server automation, and may also include other task types. The types of the automation tasks can be configured in advance by operation and maintenance personnel, and the task receiving server can create a corresponding task cache queue for each type of the automation tasks. When receiving the automation task sent by the RMS, the automation task can be stored in a corresponding task buffer queue according to the type of the automation task.
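As a non-limiting illustration, the per-type task cache queues described above can be sketched as follows. This sketch uses in-memory queues as a stand-in for the Redis storage system; the class name `TaskCache`, its method names, and the short type labels (`dns`, `nat`, `gateway`, `server`) are assumptions made for the example and do not appear in the original disclosure.

```python
from collections import deque

# Short labels for the task types named in the text (labels are assumed).
TASK_TYPES = ("dns", "nat", "gateway", "server")


class TaskCache:
    """In-memory stand-in for the Redis-backed task cache queues."""

    def __init__(self, task_types):
        # One FIFO queue is created for each preconfigured task type.
        self._queues = {t: deque() for t in task_types}

    def add(self, task_type, task):
        """Store a received task in the queue that matches its type."""
        if task_type not in self._queues:
            raise KeyError(f"no cache queue configured for type {task_type!r}")
        self._queues[task_type].append(task)

    def pop(self, task_type):
        """Return the oldest task of the given type, or None if none is queued."""
        queue = self._queues[task_type]
        return queue.popleft() if queue else None

    def size(self, task_type):
        return len(self._queues[task_type])
```

Keeping one queue per type means a backlog of one kind of task does not delay tasks of another kind.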
In a further embodiment, the task receiving server 210 may include a transceiver module 211, a cache module 212, and a control module 213. The transceiver module 211 may be configured to receive an automation task from the resource management system, the buffer module 212 may be configured to add the automation task to a corresponding task buffer queue, and the control module 213 is configured to extract the automation task from the buffer queue and send the automation task to the task distribution server.
The transceiver module 211 may receive an automation task, wherein the automation task may include a task keyword. The task keywords are used to identify the type of automation task. The transceiving module 211 may determine the type of the received automation task according to the task keyword.
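By way of example only, determining the task type from the task keyword may amount to a table lookup. The keyword strings, the mapping itself, and the function name `determine_task_type` below are hypothetical; the disclosure does not specify the keyword vocabulary.

```python
# Hypothetical mapping from task keyword to task type (not from the disclosure).
KEYWORD_TO_TYPE = {
    "assign_domain_name": "dns",
    "translate_address": "nat",
    "update_gateway": "gateway",
    "reinstall_system": "server",
    "format_disk": "server",
    "machine_offline": "server",
}


def determine_task_type(task):
    """Return the task type identified by the task's keyword.

    Raises ValueError for a missing or unknown keyword so that such tasks
    are rejected instead of being queued under the wrong type.
    """
    keyword = task.get("keyword")
    if keyword not in KEYWORD_TO_TYPE:
        raise ValueError(f"unknown task keyword: {keyword!r}")
    return KEYWORD_TO_TYPE[keyword]
```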
Further, the transceiver module 211 may include a receiving unit, a processing unit and a query unit. The receiving unit can receive the automation task and perform security verification on it, the processing unit can send the automation task state information acquired by the transceiver module to the resource management system, and the query unit can serve the service state query requests sent by the resource management system.
The cache module 212 is configured to write the received automation task to the database. Task cache queues corresponding to the types of the automation tasks can be created by utilizing Redis, and the received automation tasks are added to the corresponding task cache queues according to the types. Redis may store the received data in a master-slave mode, and the cache module 212 may write the automation task into the database in the master-slave mode. The caching module 212 may cache the automation task sent by the resource management system for a preset time period, and may provide a caching environment when the automation task processing server cannot normally provide a service, so as to provide a repair time for the abnormal automation task processing server.
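One possible reading of caching a task "for a preset time period" is a queue whose entries expire. The sketch below assumes that expired tasks are simply dropped when the queue is read, which the disclosure does not state; the class name `ExpiringTaskQueue` and the injectable `clock` parameter are likewise assumptions for illustration.

```python
import time
from collections import deque


class ExpiringTaskQueue:
    """FIFO queue that holds each task for at most `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the behavior can be tested
        self._items = deque()    # entries are (enqueue time, task)

    def add(self, task):
        self._items.append((self._clock(), task))

    def pop(self):
        """Return the oldest unexpired task, or None if none remain."""
        now = self._clock()
        while self._items:
            queued_at, task = self._items.popleft()
            if now - queued_at <= self._ttl:
                return task
            # Assumption: a task held longer than the preset period is discarded.
        return None
```

While a processing server is being repaired, tasks accumulate in the queue and are still available once the server resumes fetching, provided the preset period has not elapsed.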
The control module 213 may extract the automation task from the cache queue and send the automation task to the task distribution server 220 according to the type of the automation task. In a practical scenario, the control module 213 may include task distribution logic for various types of automation tasks, and the control module 213 may send the automation tasks to the task distribution server according to the task distribution logic.
The task distribution server 220 can be used to deliver automation tasks to the corresponding automation task processing servers. The task distribution server 220 can configure task queues corresponding to different types of automation tasks; upon receiving an automation task from the task receiving server, it can store the automation task in the corresponding task queue so that the automation task processing server 230 can retrieve the automation task from that queue.
In the present embodiment, a GearmanServer (Gearman server) may be adopted as the task distribution server 220. The GearmanServer can provide various Application Programming Interfaces (APIs), and the APIs are used for realizing communication with the task receiving server, the resource management system and the automation task processing server.
The GearmanServer provides a distribution framework that can dispatch tasks of a given type to a server suited to processing that type, which facilitates parallel task processing and load balancing. In a particular application, upon receiving an automation task the GearmanServer may send a wake-up command to the automation task processing server 230. Once awakened, the automation task processing server 230 generates a task fetch request and sends it to the GearmanServer.
The automation task processing server 230 may grab the automation task and process the automation task. Alternatively, the automation task processing server 230 may configure the task type and transmit the task type to the task distribution server 220 to register the task type of the automation task processing server to the task distribution server 220. The configured task type may be consistent with the type of automation task that the automation task processing server 230 is capable of processing, and may include at least one of: domain name system automation, network address translation automation, gateway automation, and server automation. The task type of the automated task processing server can also include other custom types.
The task distribution server 220 may also be used to receive task types and store automated task processing servers in association with the task types. The task distribution server 220, upon receiving an automation task, may notify the automation task processing server that registered the type of automation task. When capturing an automation task, the automation task processing server 230 may directly capture the automation task from the corresponding type of task queue of the task distribution server 220, so that the task distribution server may accurately and efficiently distribute the automation task.
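The registration, notification and fetch steps described above can be illustrated with a minimal in-memory sketch. It does not use the Gearman API; the class `TaskDistributor` and its methods `register`, `submit` and `grab` are hypothetical names chosen for the example.

```python
from collections import defaultdict, deque


class TaskDistributor:
    """In-memory stand-in for the task distribution server (not the Gearman API)."""

    def __init__(self):
        self._workers = defaultdict(list)   # task type -> registered worker names
        self._queues = defaultdict(deque)   # task type -> pending tasks

    def register(self, worker, task_type):
        """Store the worker in association with a task type it can process."""
        if worker not in self._workers[task_type]:
            self._workers[task_type].append(worker)

    def submit(self, task_type, task):
        """Queue a task and return the workers that should be notified."""
        self._queues[task_type].append(task)
        return list(self._workers[task_type])

    def grab(self, worker, task_type):
        """Let a registered worker take the next task of its type, if any."""
        if worker not in self._workers[task_type]:
            raise PermissionError(f"{worker} is not registered for {task_type}")
        queue = self._queues[task_type]
        return queue.popleft() if queue else None
```

Because workers pull tasks from the queue for their own type, several workers registered for the same type share the load without the distributor tracking how busy each one is.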
Further, the control module 213 of the task receiving server 210 may also be used to configure the distribution logic of the automation tasks, so that the control module 213 sends each type of automation task to the corresponding task queue of the task distribution server 220, thereby ensuring that the automation task processing server 230 can obtain the automation task from the corresponding task queue of the task distribution server 220.
In this embodiment, the automation task processing server may also directly obtain the automation task from a task cache queue created by a cache module of the task receiving server, for example, an automation task processing server of the server automation type may send a task obtaining request to the task receiving server and receive the automation task in a Redis queue sent by the task receiving server.
It should be noted that fig. 2 only schematically illustrates one automation task processing server 230, in an application scenario of the present application, the automation operation and maintenance system 200 may include a plurality of automation task processing servers, where each automation task processing server may be configured with the same or different task types, and the present application does not limit the number of the automation task processing servers.
In some embodiments, the automation task processing server 230 is further configured to generate and transmit automation task status information to the task distribution server 220, the task distribution server 220 is further configured to receive and transmit automation task status information from the automation task processing server 230 to the control module 213, the control module 213 is further configured to receive automation task status information transmitted by the task distribution server 220, and the transceiving module 211 is further configured to transmit automation task status information to the resource management system. Specifically, the automation task processing server 230 may generate task state information according to a processing state of the automation task and send the task state information to the task distribution server 220 in an operation process of processing the automation task, the task distribution server 220 may forward the task state information to the caching module 212, the caching module 212 may create a message caching queue, and further add the task state information received from the task distribution server to the message caching queue. The transceiver module 211 may extract task state information from the message buffer queue and send the extracted automation task state information to the resource management system, so as to report the current process of the automation task processing to the resource management system.
Alternatively, the resource management system may send a task status query request to the task receiving server 210, and the query unit in the transceiver module 211 may, in response to the task status query request, look up the corresponding status information in the message cache queue and return it to the resource management system.
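The two feedback paths described above, pushing buffered status messages to the resource management system and answering its status queries, can be sketched as follows. The class `StatusBuffer`, its methods, and the choice to remember only the latest status per task are assumptions made for illustration.

```python
from collections import deque


class StatusBuffer:
    """In-memory stand-in for the message cache queue on the task receiving server."""

    def __init__(self):
        self._pending = deque()   # status messages not yet pushed to the RMS
        self._latest = {}         # task id -> most recent status, for queries

    def add(self, task_id, status):
        """Record status information forwarded by the task distribution server."""
        self._pending.append((task_id, status))
        self._latest[task_id] = status

    def drain(self, send):
        """Push every pending status message to the RMS through `send`."""
        sent = 0
        while self._pending:
            task_id, status = self._pending.popleft()
            send(task_id, status)
            sent += 1
        return sent

    def query(self, task_id):
        """Answer a status query from the RMS; None if the task is unknown."""
        return self._latest.get(task_id)
```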
In the scheme described in this embodiment, the task receiving server and the task distribution server forward tasks and messages between the automation task processing servers and the resource management system, so that the processing of automation tasks is scheduled sensibly and the processing efficiency of the automated operation and maintenance system can be improved. Meanwhile, because a third party (the task receiving server and the task distribution server) carries the communication between the automation task processing servers and the resource management system, the coupling between the resource management system and the automation task processing servers, and among the automation task processing servers themselves, is reduced. The automation task processing servers no longer need a uniform architecture, which lowers development difficulty and makes the automated operation and maintenance system easier to maintain and extend.
In some optional implementations of this embodiment, the task distribution server 220 may further include a protocol conversion module. The protocol conversion module can be used for converting data protocols between the automation task processing server and the task distribution server. When some automation task processing servers (for example, those whose task type is server automation) use a data protocol different from that of the task distribution server, the protocol conversion module can convert between the two. The task distribution server can therefore work with automation task processing servers that use different data protocols, which improves the extensibility of the automated operation and maintenance system.
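The disclosure does not name the data protocols involved. Purely as an illustration, the sketch below assumes that the task distribution server exchanges flat JSON objects while one processing server expects `key=value` pairs separated by semicolons; both formats and both function names are hypothetical.

```python
import json


def json_to_kv(payload):
    """Convert a flat JSON object into a 'key=value;key=value' string (keys sorted)."""
    data = json.loads(payload)
    return ";".join(f"{key}={data[key]}" for key in sorted(data))


def kv_to_json(line):
    """Convert a 'key=value;key=value' string back into a JSON object of strings."""
    pairs = (item.split("=", 1) for item in line.split(";") if item)
    return json.dumps({key: value for key, value in pairs}, sort_keys=True)
```

Placing the conversion in the distribution server means a processing server with its own protocol can be added without changing the task receiving server or the resource management system.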
In some optional embodiments, the task receiving server 210 may further include a first log collection module, the task distribution server 220 further includes a second log collection module, and the automation task processing server 230 further includes a third log collection module. That is, each server in the automated operation and maintenance system includes a log collection module. The first, second and third log collection modules are used for collecting the log information of the task receiving server 210, the task distribution server 220 and the automation task processing server 230, respectively. The first, second and third log collection modules may employ a flash log collection module to monitor the log files of the task receiving server 210, the task distribution server 220 and the automation task processing server 230 using the tail command (which displays the last part of a file), or may use a flash log collection module to monitor the output of rsyslog, which is a log processing system.
Further, the automated operation and maintenance system may include a log management server, which is configured to obtain the log information of the task receiving server, the task distribution server and the automation task processing server through the first, second and third log collection modules, respectively. The first, second and third log collection modules can transmit the collected log information to the log management server in a unified manner through a flash communication unit. Operation and maintenance personnel can therefore query or analyze the logs directly on the log management server. Compared with the prior-art approach in which each server in the system stores its logs independently, this speeds up log query and analysis and helps improve the operation and maintenance efficiency of the system.
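As a rough sketch of the collection step, the following code forwards lines appended to a local log file to a central store, in the spirit of following a file with tail. It is not the collection tool named in the text; `LogCollector`, `demo`, the file name and the server name are assumptions, and a plain list stands in for the log management server.

```python
import os
import tempfile


class LogCollector:
    """Forwards lines appended to a log file since the previous poll."""

    def __init__(self, path, source):
        self.path = path
        self.source = source   # name of the server that owns the log
        self._offset = 0       # number of bytes already forwarded

    def poll(self, forward):
        """Forward each new complete line as (source, line); return the count."""
        with open(self.path, "rb") as f:
            f.seek(self._offset)
            chunk = f.read()
        # Consume only up to the last newline so a partial line is retried later.
        end = chunk.rfind(b"\n") + 1
        self._offset += end
        lines = chunk[:end].decode("utf-8").splitlines()
        for line in lines:
            forward(self.source, line)
        return len(lines)


def demo():
    central = []  # stand-in for the log management server
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "receiver.log")
        with open(path, "w", encoding="utf-8") as f:
            f.write("task 1 received\n")
        collector = LogCollector(path, "task-receiving-server")
        collector.poll(lambda src, line: central.append((src, line)))
        with open(path, "a", encoding="utf-8") as f:
            f.write("task 1 queued\npartial")
        collector.poll(lambda src, line: central.append((src, line)))
    return central
```

Tagging every forwarded line with its source server is what lets operators query the logs of all three server roles in one place.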
In some optional embodiments, the automated operation and maintenance system further includes a configuration update server for updating the configuration information of the task receiving server, the task distribution server, and the automation task processing server. The configuration update server can be used for uniformly managing the configuration information of each server in the automated operation and maintenance system. Each server may monitor the operations of the configuration update server. When the configuration information of a server in the system is to be modified, the modified configuration information may be entered into the configuration update server. After detecting the operation on the configuration update server, the server can change its configuration information accordingly. In a specific application, a ZooKeeper server cluster may be adopted as the configuration update server. The ZooKeeper server cluster comprises a plurality of nodes, and each node is used for managing the configuration information of one or more servers in the automated operation and maintenance system. When a configuration change is made on a node, the corresponding server may detect the operation and, in response, receive the changed configuration information from the node and load it. In this process the server whose configuration has changed does not need to be restarted and can load the new configuration dynamically, which optimizes configuration management in the automated operation and maintenance system and further improves operation and maintenance efficiency.
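The watch-and-reload behavior described above can be illustrated with an in-memory sketch. It does not use the ZooKeeper API; `ConfigNode`, `Server`, their methods, and the `queue_timeout` setting are hypothetical names for the example.

```python
class ConfigNode:
    """In-memory stand-in for one configuration node (not the ZooKeeper API)."""

    def __init__(self, config):
        self._config = dict(config)
        self._watchers = []

    def watch(self, callback):
        """Register a callback and deliver the current configuration once."""
        self._watchers.append(callback)
        callback(dict(self._config))

    def update(self, **changes):
        """Apply a change and notify every watching server."""
        self._config.update(changes)
        for callback in self._watchers:
            callback(dict(self._config))


class Server:
    """A server that reloads its configuration in place, without restarting."""

    def __init__(self, name):
        self.name = name
        self.config = {}
        self.reloads = 0

    def on_config(self, config):
        # The new configuration replaces the old one while the server keeps running.
        self.config = config
        self.reloads += 1
```

For instance, after `node.watch(server.on_config)` a later call to `node.update(queue_timeout=60)` reaches the server immediately, and the server object is never recreated.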
Referring to fig. 3, a schematic diagram of data interaction in the automated operation and maintenance system shown in fig. 2 is shown. As shown in fig. 3, in the task distribution process, the automation task is sent by the resource management system to the transceiver module of the task receiving server in step 311. Wherein the automation task may include task keywords for identifying a task type. The transceiver module may determine the type of the automation task according to the task keyword in step 312, the automation task may be added to the task cache queue by the cache module in step 313, and the control module may extract the automation task from the task cache queue and send it to the task distribution server in step 314. The task distribution server may distribute the automation task to the automation task processing server according to the task type in step 315. The automated task processing server processes the task in step 316. Thereby, the distribution of the automation task is realized.
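The task distribution flow of steps 311 to 315 can be sketched roughly as below. The keyword-to-type table, the queue names, and the functions `receive` and `forward_all` are assumptions made for illustration; the in-memory deques merely stand in for the task cache queues and the task distribution server.

```python
from collections import deque
from typing import Deque, Dict, List

# Hypothetical mapping from task keyword to task type (values are illustrative).
KEYWORD_TO_TYPE: Dict[str, str] = {
    "dns": "dns_automation",
    "nat": "nat_automation",
    "gateway": "gateway_automation",
    "server": "server_automation",
}

# One task cache queue per task type, plus a stand-in for the distribution server.
task_cache_queues: Dict[str, Deque[dict]] = {t: deque() for t in KEYWORD_TO_TYPE.values()}
distributed: Dict[str, List[dict]] = {t: [] for t in KEYWORD_TO_TYPE.values()}


def receive(task: dict) -> str:
    """Steps 312-313: determine the type from the keyword, then cache the task."""
    task_type = KEYWORD_TO_TYPE[task["keyword"]]
    task_cache_queues[task_type].append(task)
    return task_type


def forward_all() -> None:
    """Steps 314-315: drain each cache queue and hand tasks to the distributor."""
    for task_type, cache_queue in task_cache_queues.items():
        while cache_queue:
            distributed[task_type].append(cache_queue.popleft())


receive({"keyword": "gateway", "payload": "apply virtual IP"})
receive({"keyword": "dns", "payload": "add record"})
forward_all()
```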
In the task state feedback flow, the automation task processing server generates state information of the automation task in step 321 and transmits the state information to the task distribution server in step 322. The task distribution server forwards the state information to the control module in step 323, and the cache module adds the state information of the automation task to the message buffer queue in step 324. The automation task state information is then extracted from the message buffer queue in step 325 and fed back to the resource management system in step 326. In this way, the resource management system and the automation task processing server communicate about the task state.
In the log management process, the task receiving server, the task distributing server and the automation task processing server may upload log information to the log management server in step 331, step 332 and step 333, respectively.
In the configuration update flow, the configuration update server updates the configuration information in step 341, transmits the updated configuration information to the task receiving server in step 342, transmits the updated configuration information to the task distributing server in step 343, and transmits the updated configuration information to the automation task processing server in step 344.
With continued reference to FIG. 4, a schematic diagram of a specific application scenario of the automated operation and maintenance system is shown. As shown in fig. 4, in the automation operation and maintenance system 400, a UIM module 401, a Redis queue 402, and a TASK-CTRL module 403 may be configured in the task receiving server, wherein the UIM module 401 may serve as the transceiver module, the Redis queue 402 may be created by the cache module, and the TASK-CTRL module 403 may serve as the control module. The UIM module may be divided into a UIM_accept unit, a UIM_process unit, and a UIM_query unit. The UIM_accept unit receives an automation task request transmitted by the RMS, performs security verification by using a white list, a secret key, and the like, and stores the task in the Redis queue 402. The UIM_process unit may return the task state information to the RMS. The UIM_query unit may provide a task status query service for the RMS, feeding back task status messages in the Redis queue 402 to the RMS. The TASK-CTRL module 403 may obtain an automation task from the Redis queue and send it to the corresponding task pipe of the GearmanServer according to the type of the automation task. Meanwhile, the TASK-CTRL module 403 may keep statistical records of the tasks. The GearmanServer 404 is the task distribution server and can be configured with a plurality of task pipes, each task pipe corresponding to one type of automation task. The GearmanServer can distribute and schedule the automation tasks, distributing the different types of automation tasks to the corresponding automation task processing servers.
The DNS automation module 405, the gateway/network address translation automation module 406, and the server automation module 407 are servers for handling different types of automation tasks. The DNS automation module 405 may handle DNS automation tasks, the gateway/network address translation automation module 406 may handle gateway/network address translation automation tasks, and the server automation module 407 may handle server automation tasks.
The relay module 408 may be configured in the task distribution server and is used for converting the data protocol between the server automation module 407 and the GearmanServer 404.
The configuration center 409 may update the configuration information of each automation module, and the log center 410 may uniformly manage the log information of each automation module.
Taking the application for a virtual IP as an example, the application scenario of the automation operation and maintenance system shown in fig. 4 may be as follows. The UIM module receives a task sent by the resource management system (RMS) and determines from the task keyword that the task type is gateway automation. After security authentication and parameter verification, the task is stored in the Redis queue. The TASK-CTRL module obtains the task data from the Redis queue by continuous polling and pushes the task to the GearmanServer. The gateway automation server, which is responsible for gateway automation operations, registers the task types it is interested in with the GearmanServer after it starts. After receiving the task sent by the TASK-CTRL module, the GearmanServer pushes the task data to the gateway automation server, which parses the task data and then performs the gateway automation operation. After the gateway automation operation is completed, the gateway automation server returns the completion state to the GearmanServer. After receiving the reply from the GearmanServer, the TASK-CTRL module pushes the information to the Redis queue, and the UIM_process unit of the UIM module reads the return value from the Redis queue and sends it to the RMS, which completes the automation operation for the virtual IP application. In this process, the gateway automation server does not communicate with the RMS directly but through the trusted third parties Redis and the GearmanServer, which reduces the coupling between the gateway automation server and the RMS and makes the system easier to maintain.
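The register-then-dispatch interaction in this scenario can be sketched as follows. This is a simplified in-memory model under stated assumptions: `Dispatcher`, `register`, `submit`, and `gateway_worker` are hypothetical names, and the code does not use or reproduce the API of any Gearman library.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List

Handler = Callable[[dict], str]


class Dispatcher:
    """Hypothetical in-memory stand-in for the task distribution server."""

    def __init__(self) -> None:
        self.workers: Dict[str, List[Handler]] = defaultdict(list)
        self.pipes: Dict[str, Deque[dict]] = defaultdict(deque)

    def register(self, task_type: str, handler: Handler) -> None:
        # A worker registers the task type it is interested in at startup.
        self.workers[task_type].append(handler)

    def submit(self, task_type: str, task: dict) -> List[str]:
        # Store the task in the pipe for its type, then let a registered worker take it.
        self.pipes[task_type].append(task)
        results: List[str] = []
        while self.pipes[task_type] and self.workers[task_type]:
            handler = self.workers[task_type][0]
            results.append(handler(self.pipes[task_type].popleft()))
        return results


def gateway_worker(task: dict) -> str:
    # Stand-in for the gateway automation operation; returns a completion state.
    return f"virtual IP applied for {task['host']}"


dispatcher = Dispatcher()
dispatcher.register("gateway_automation", gateway_worker)
replies = dispatcher.submit("gateway_automation", {"host": "10.0.0.8"})
pending = dispatcher.submit("dns_automation", {"name": "example.test"})
```

A task whose type has no registered worker simply stays in its pipe, which mirrors the buffering role the queues play when a processing server is unavailable.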
In another application scenario of the automatic operation and maintenance system, the automation modules in the system can be extended. Once its service flow has been verified as correct, a newly added automation module first configures a flash log acquisition module on its server, then hands its configuration information over to the ZooKeeper cluster for management, and then registers the task types it is interested in with the GearmanServer. Finally, task distribution logic is written in the TASK-CTRL module so that task data of the corresponding type, obtained by the TASK-CTRL module from the Redis queue, is sent to the GearmanServer task queue for that task type after passing verification. Because a new automation module can be added by configuring a log collection module, handing over configuration information for management, registering the task types of interest, and writing task distribution logic, the expandability of the automation operation and maintenance system is improved.
Referring to FIG. 5, a flow diagram of one embodiment of an automated operation and maintenance management method according to the present application is shown. The automatic operation and maintenance management method can be applied to a task receiving server in an automatic operation and maintenance system. As shown in fig. 5, the method 500 for automated operation and maintenance management includes the following steps:
Step 501, receiving an automation task from a resource management system and determining the type of the automation task.
In this embodiment, the electronic device (e.g., the task receiving server in the foregoing embodiment) on which the automation operation and maintenance method runs may receive an automation task, where the automation task may include a task keyword. The task keywords are used to identify the type of automation task. The type of automation task received may be determined based on the task keywords.
The electronic device on which the automated operation and maintenance method operates may provide an interface for data interaction with the resource management system, through which the resource management system may issue requests containing automated tasks. The electronic device on which the automated operation and maintenance method operates (e.g., the task receiving server in the foregoing embodiment) may receive the request and parse the automated task from the request.
Optionally, the type of automation task may include at least one of: domain name system automation, network address translation automation, gateway automation, and server automation. The types of automation tasks may also include other custom types.
Step 502, add the automation task to the corresponding task buffer queue.
In this embodiment, the received automation task may be written to a database. In a specific implementation, a task cache queue corresponding to each type of automation task may be created by using Redis, and the received automation task is added to the corresponding task cache queue according to its type. Redis can store received data in a master-slave mode and can cache an automation task sent by the resource management system for a preset time period. This provides a buffer when an automation task processing server cannot provide services normally, and thereby gives time to repair the abnormal automation task processing server.
Redis may create a task cache queue corresponding to different types of automation tasks, and after determining the type of the received automation task, may add the automation task to the task cache queue corresponding to the type.
Further, the electronic device on which the automation operation and maintenance method operates can perform security verification on the received automation task. The automation task that passes the security verification may be added to the corresponding buffer queue in step 502.
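One possible form of this security verification is sketched below. The patent only mentions a white list and a secret key; the use of an HMAC-SHA256 signature, the constants `ALLOWED_SOURCES` and `SHARED_KEY`, and the functions `sign` and `verify` are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical values for illustration only.
ALLOWED_SOURCES = {"10.1.2.3"}
SHARED_KEY = b"example-shared-key"


def sign(body: bytes, key: bytes = SHARED_KEY) -> str:
    """Compute the signature the sender is assumed to attach to a request."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify(source_ip: str, body: bytes, signature: str) -> bool:
    """Accept a request only if the source is whitelisted and the signature matches."""
    if source_ip not in ALLOWED_SOURCES:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sign(body), signature)


body = b'{"keyword": "gateway"}'
ok = verify("10.1.2.3", body, sign(body))
bad_source = verify("192.0.2.9", body, sign(body))
bad_signature = verify("10.1.2.3", body, sign(body, b"wrong-key"))
```

Only a task for which `verify` returns `True` would go on to be added to a cache queue in step 502.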
Step 503, the automation task in the task cache queue is sent to the task distribution server, so that the task distribution server distributes the automation task to the automation task processing server corresponding to the task type.
The automation tasks can be extracted from the task cache queue and sent to the task distribution server according to the type of the automation tasks. In an actual scenario, when the automated task processing server is started, the task type and the task distribution logic may be configured in advance in the electronic device on which the automated operation and maintenance method is executed. The electronic equipment can send the automation task to the task distribution server according to the task distribution logic, and simultaneously inform the task distribution server of the task type, so that the task distribution server can add the automation task to the corresponding task queue according to the task type. The automation task processing server can extract the automation task from the corresponding task queue and process the automation task, so that the automation operation is completed. The task type of the automatic task processing server is configured in advance and registered in the task distribution server. The task type may include at least one of: domain name system automation, network address translation automation, gateway automation, and server automation.
In some embodiments, the automated operation and maintenance method 500 may further include: and receiving the automatic task state information sent by the task distribution server, and sending the automatic task state information to the resource management system. Wherein the automation task state information is generated by an automation task processing server. The automatic task processing server can generate task state information according to the task processing state and send the task state information to the task distribution server. The electronic device on which the automatic operation and maintenance method operates can receive the task state information forwarded by the task distribution server and feed back the task state information to the resource management system.
Further, the automation task state information may be added to the message buffer queue prior to sending the automation task state information to the resource management system. The electronic device may create a message buffer queue by using Redis and write the received task state information into the message buffer queue. The electronic device may then read the state information of the automation task from the message buffer queue within a predetermined time period and send the read state information to the resource management system.
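Buffering state information and reading it back within a bounded wait can be sketched as below. This is an illustration under stated assumptions: the standard-library `queue.Queue` stands in for the Redis message buffer queue, and `buffer_status` and `drain_statuses` are hypothetical names.

```python
import queue
from typing import List

# In-memory stand-in for the message buffer queue.
message_buffer: queue.Queue = queue.Queue()


def buffer_status(status: dict) -> None:
    """Write task state information into the message buffer queue."""
    message_buffer.put(status)


def drain_statuses(wait_seconds: float) -> List[dict]:
    """Read buffered state messages, waiting at most wait_seconds for each one."""
    sent: List[dict] = []
    while True:
        try:
            status = message_buffer.get(timeout=wait_seconds)
        except queue.Empty:
            break  # nothing arrived within the period
        sent.append(status)  # a real system would send this to the RMS here
    return sent


buffer_status({"task_id": "t-1", "state": "done"})
buffer_status({"task_id": "t-2", "state": "failed"})
forwarded = drain_statuses(wait_seconds=0.05)
```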
In a further embodiment, the automated operation and maintenance method 500 may further include: collecting operation and maintenance log information and sending it to a log management server. Operation and maintenance log information can be monitored by using the tail command, and the operation and maintenance log information of the electronic device can be collected by using a log monitoring system such as rsyslog and reported to the log management server.
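The tail-style monitoring mentioned here amounts to remembering a read position and picking up only the lines appended since then. A minimal sketch, assuming a plain text log file and a hypothetical helper `read_new_lines` (reporting to the log management server is left out):

```python
import os
import tempfile
from typing import List, Tuple


def read_new_lines(path: str, offset: int) -> Tuple[List[str], int]:
    """Return lines appended since offset, plus the new offset (tail-like)."""
    with open(path, "r", encoding="utf-8") as handle:
        handle.seek(offset)
        lines = handle.read().splitlines()
        return lines, handle.tell()


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "ops.log")
    with open(log_path, "w", encoding="utf-8") as log:
        log.write("task t-1 received\n")
    first, position = read_new_lines(log_path, 0)
    with open(log_path, "a", encoding="utf-8") as log:
        log.write("task t-1 done\n")
    # Only the line written after the first read is returned this time.
    second, position = read_new_lines(log_path, position)
```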
In a further embodiment, the automated operation and maintenance method 500 may further include: and updating the configuration information in response to monitoring the updating operation of the configuration updating server. In the automatic operation and maintenance system, the configuration information of the electronic equipment can be managed by using the configuration updating server. The operation of the configuration update server on updating the configuration information of the electronic device can be monitored, and the updated configuration information is synchronized when the updating operation is monitored.
The automated operation and maintenance method provided by the embodiment of the application can be applied to task distribution of the resource management system, and the method is used for reasonably distributing the automated task request sent by the resource management system to the corresponding automated task processing server, so that the coupling between the resource management system and the automated task processing server can be reduced, the requirement on the structural consistency of the respective automated task processing server is reduced, and the operation and maintenance efficiency is improved.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing a server according to embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.
In particular, the process described above with reference to the flowchart of fig. 5 may be implemented as a computer software program, according to an embodiment of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As another aspect, the present application also provides a non-volatile computer storage medium, which may be the non-volatile computer storage medium included in the apparatus in the above-described embodiments, or may be a non-volatile computer storage medium that exists separately and is not incorporated into a device. The non-volatile computer storage medium stores one or more programs that, when executed by a device, cause the device to: receive an automation task from a resource management system and determine the type of the automation task, wherein the automation task comprises a task keyword; add the automation task to a corresponding task cache queue; and send the automation tasks in the task cache queue to a task distribution server so that the task distribution server can distribute the automation tasks to automation task processing servers of corresponding task types.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (14)

1. An automated operation and maintenance system, comprising:
The task receiving server is used for receiving an automation task from a resource management system, determining the type of the automation task and adding the automation task to a corresponding task cache queue according to the type;
The task distribution server is used for distributing the automatic tasks in the task cache queue to the automatic task processing servers of the corresponding types; and
the automatic task processing server is used for configuring a task type, sending the task type to the task distribution server so as to register the task type to the task distribution server and process the received automatic task;
The task distribution server is configured with task queues corresponding to different types of automation tasks, and after receiving the automation tasks from the task receiving server, the automation tasks are stored in the corresponding task queues; and
When the task distribution server receives the automation tasks, the task distribution server informs an automation task processing server which registers the task types of the received automation tasks, and the automation task processing server captures the automation tasks from a task queue of the corresponding type of the task distribution server.
2. The system of claim 1, wherein the automation task comprises a task keyword;
The task receiving server includes:
The receiving and sending module is used for receiving an automation task from the resource management system and determining the type of the automation task according to the task keyword;
The buffer module is used for adding the automatic tasks to corresponding task buffer queues according to the types; and
And the control module is used for extracting the automation task from the cache queue and sending the automation task to a task distribution server.
3. The system of claim 2, wherein the automated task processing server is further configured to generate and send automated task state information to the task distribution server;
the task distribution server is also used for receiving the automation task state information from the automation task processing server and sending the automation task state information to the control module;
The control module is also used for receiving the automatic task state information sent by the task distribution server;
The transceiver module is further configured to send the automation task state information to the resource management system.
4. The system of claim 3, wherein the caching module is further configured to add the automation task state information to a message caching queue;
The transceiver module is further configured to extract the task state information from the message buffer queue and send the extracted automated task state information to the resource management system.
5. The system of claim 1, wherein the task distribution server is further configured to: receive the task type and store the automatic task processing server in association with the task type.
6. The system of claim 1, wherein the task type of the automated task processing server comprises at least one of: domain name system automation, network address translation automation, gateway automation, and server automation.
7. The system of claim 6, wherein the task distribution server further comprises a protocol conversion module;
the protocol conversion module is used for converting a data protocol between the automatic task processing server and the task distribution server.
8. The system according to any one of claims 1 to 7, wherein the task receiving server further comprises a first log collection module, the first log collection module is configured to collect log information of the task receiving server;
The task distribution server also comprises a second log acquisition module, and the second log acquisition module is used for acquiring log information of the task distribution server;
The automatic task processing server also comprises a third log acquisition module, and the third log acquisition module is used for acquiring log information of the automatic task processing server; and
The system further comprises:
And the log management server is used for acquiring the log information of the task receiving server, the log information of the task distributing server and the log information of the automatic task processing server through the first log acquisition module, the second log acquisition module and the third log acquisition module respectively.
9. The system according to any one of claims 1-7, further comprising:
and the configuration updating server is used for updating the configuration information of the task receiving server, the task distributing server and the automatic task processing server.
10. An automated operation and maintenance method, comprising:
receiving an automation task from a resource management system and determining the type of the automation task, wherein the automation task comprises a task keyword;
Adding the automatic tasks to corresponding task cache queues according to task types;
The automatic tasks in the task cache queues are sent to a task distribution server, so that when the task distribution server receives the automatic tasks, the automatic tasks are stored in corresponding task queues, the automatic task processing server registering the task types of the received automatic tasks is informed, the automatic tasks are distributed to the automatic task processing servers corresponding to the task types, and the automatic task processing servers capture the automatic tasks from the task queues corresponding to the task distribution servers;
The task type of the automatic task processing server is pre-configured and registered in the task distribution server.
11. The method of claim 10, further comprising:
receiving automatic task state information sent by the task distribution server;
sending the automated task state information to the resource management system;
Wherein the automation task state information is generated by the automation task processing server.
12. The method of claim 11, wherein prior to sending the automation task state information to the resource management system, the method further comprises:
adding the automation task state information to a message buffer queue; and
The sending the automated task state information to the resource management system includes:
Reading the automation task state information from the message buffer queue;
And sending the read automatic task state information to the resource management system.
13. The method of claim 10, wherein the task type comprises at least one of: domain name system automation, network address translation automation, gateway automation, and server automation.
14. The method according to any one of claims 10-13, further comprising:
collecting operation and maintenance log information, and sending the operation and maintenance log information to a log management server; and
And updating the configuration information in response to monitoring the updating operation of the configuration updating server.
CN201610389650.9A 2016-06-02 2016-06-02 Automatic operation and maintenance system and method Active CN105915633B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610389650.9A CN105915633B (en) 2016-06-02 2016-06-02 Automatic operation and maintenance system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610389650.9A CN105915633B (en) 2016-06-02 2016-06-02 Automatic operation and maintenance system and method

Publications (2)

Publication Number Publication Date
CN105915633A CN105915633A (en) 2016-08-31
CN105915633B true CN105915633B (en) 2019-12-10

Family

ID=56743322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610389650.9A Active CN105915633B (en) 2016-06-02 2016-06-02 Automatic operation and maintenance system and method

Country Status (1)

Country Link
CN (1) CN105915633B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106484530A (en) * 2016-09-05 2017-03-08 努比亚技术有限公司 A kind of distributed task dispatching O&M monitoring system and method
CN106850848A (en) * 2017-03-14 2017-06-13 广东小天才科技有限公司 Data management implementation method and data management platform based on data management platform
CN107045459A (en) * 2017-03-31 2017-08-15 北京奇艺世纪科技有限公司 A kind of O&M request processing method and device based on ansible
CN107222523B (en) * 2017-05-04 2021-03-26 北京京电电网维护集团有限公司 Terminal data processing method, device and system
CN108874513A (en) * 2017-05-11 2018-11-23 北京京东尚科信息技术有限公司 Handle method, system, electronic equipment and the computer-readable medium of timed task
CN107368365A (en) * 2017-07-25 2017-11-21 携程旅游信息技术(上海)有限公司 Cloud platform automatic O&M method, system, equipment and storage medium
CN107968836B (en) * 2017-12-06 2020-12-18 北京微网通联股份有限公司 Task distribution method and device
CN108390786B (en) * 2018-02-27 2021-05-07 北京奇艺世纪科技有限公司 Business operation and maintenance method and device and electronic equipment
CN110209431B (en) * 2018-02-28 2021-04-27 杭州海康威视数字技术股份有限公司 Data partition splitting method and device
CN110290163B (en) * 2018-08-28 2022-03-25 新华三技术有限公司 Data processing method and device
CN109450979B (en) * 2018-10-10 2020-12-04 广州华多网络科技有限公司 Distributed dynamic task execution method and related device
CN110196731B (en) * 2018-10-29 2021-05-11 腾讯科技(深圳)有限公司 Operation and maintenance system, method and storage medium
CN109409853A (en) * 2018-12-29 2019-03-01 深圳市思迪信息技术股份有限公司 Flow of task processing method and processing device based on operation management platform
CN112398744A (en) * 2019-08-16 2021-02-23 阿里巴巴集团控股有限公司 Network communication method and device and electronic equipment
CN111414381B (en) * 2020-03-04 2021-09-14 腾讯科技(深圳)有限公司 Data processing method and device, electronic equipment and storage medium
CN112307046A (en) * 2020-11-26 2021-02-02 北京金堤征信服务有限公司 Data acquisition method and device, computer readable storage medium and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104407922A (en) * 2014-10-29 2015-03-11 中国建设银行股份有限公司 Asynchronous batch-processing dispatching method and system
CN104504495A (en) * 2014-11-27 2015-04-08 北京百度网讯科技有限公司 Operation and maintenance abnormity processing method, device and equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009134772A2 (en) * 2008-04-29 2009-11-05 Maxiscale, Inc Peer-to-peer redundant file server system and methods
CN103941662A (en) * 2014-03-19 2014-07-23 华存数据信息技术有限公司 Task scheduling system and method based on cloud computing

Also Published As

Publication number Publication date
CN105915633A (en) 2016-08-31

JP4863126B2 (en) Server monitoring system and server monitoring method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant