WO2021174694A1 - Data center-based operation and maintenance monitoring method, device, equipment, and storage medium - Google Patents

Data center-based operation and maintenance monitoring method, device, equipment, and storage medium

Info

Publication number
WO2021174694A1
Authority
WO
WIPO (PCT)
Prior art keywords
monitoring data
monitoring
target
source
processed
Prior art date
Application number
PCT/CN2020/093315
Other languages
English (en)
French (fr)
Inventor
朱仁宇
董超
许俊威
黄伟星
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021174694A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display

Definitions

  • This application relates to the field of data monitoring technology, and in particular to a data center-based operation and maintenance monitoring method, device, equipment, and storage medium.
  • Monitoring is an important link in the operation and maintenance of current business systems.
  • Operation and maintenance personnel discover and locate fault events in the business system through monitoring, so that they can maintain and update the business system according to those fault events.
  • an enterprise especially a group enterprise
  • Operation and maintenance personnel use different monitoring systems to monitor multiple business systems separately, and feed back the monitoring results to the monitoring management center of the enterprise.
  • The inventor realized that the monitoring results collected by different monitoring systems are currently incompatible, so the results collected by multiple monitoring systems cannot be monitored and analyzed effectively, which makes monitoring and analysis inefficient.
  • The embodiments of the application provide a data center-based operation and maintenance monitoring method, device, equipment, and storage medium to solve the problem that the monitoring results collected by different monitoring systems are incompatible, so that the results collected by multiple monitoring systems cannot be monitored and analyzed effectively and monitoring and analysis are inefficient.
  • a data center-based operation and maintenance monitoring method including:
  • each original monitoring data includes the original alarm level
  • a data center-based operation and maintenance monitoring device including:
  • an original monitoring data acquisition module, configured to obtain the original monitoring data corresponding to the business system collected by the associated monitoring system, each piece of original monitoring data including an original alarm level;
  • a to-be-processed monitoring data acquisition module, configured to standardize the original alarm level to obtain a standard alarm level, and to determine the original monitoring data whose standard alarm level is the target alarm level as the to-be-processed monitoring data;
  • a target source acquisition module, configured to perform source detection on the to-be-processed monitoring data and obtain the target source corresponding to the to-be-processed monitoring data;
  • an effective monitoring data acquisition module, configured to perform validity detection on the to-be-processed monitoring data, obtain a validity detection result, and determine the to-be-processed monitoring data whose validity detection result is "alarm valid" as effective monitoring data;
  • a target monitoring data acquisition module, configured to perform formatting based on the effective monitoring data and its corresponding target source to obtain the target monitoring data;
  • an alarm monitoring result acquisition module, configured to automatically monitor the target monitoring data based on the target source to obtain an alarm monitoring result.
  • a computer device includes a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, and the processor implements the following steps when executing the computer-readable instructions:
  • each original monitoring data includes the original alarm level
  • One or more readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • each original monitoring data includes the original alarm level
  • The data center is networked with the associated monitoring system so that it can obtain from the associated monitoring system the collected original monitoring data corresponding to the business system.
  • Screening by standard alarm level ensures that the monitoring data to be processed is relevant; performing source detection on the monitoring data to be processed determines the target source and keeps subsequent automated monitoring targeted; filtering effective monitoring data based on the validity detection result of the monitoring data to be processed ensures that subsequent automated monitoring is targeted and timely; formatting based on the effective monitoring data and its corresponding target source yields target monitoring data in a unified format, which makes the subsequent automated monitoring feasible; finally, the formatted target monitoring data corresponding to any target source is automatically monitored to obtain the alarm monitoring results output by the automated monitoring program, which improves the efficiency with which the data center monitors the target monitoring data collected by at least one associated monitoring system.
  • FIG. 1 is a schematic diagram of an application environment of a data center-based operation and maintenance monitoring method in an embodiment of the present application.
  • FIG. 2 is a flowchart of a data center-based operation and maintenance monitoring method in an embodiment of the present application.
  • FIG. 3 is another flowchart of a data center-based operation and maintenance monitoring method in an embodiment of the present application.
  • FIG. 5 is another flowchart of a data center-based operation and maintenance monitoring method in an embodiment of the present application.
  • FIG. 6 is another flowchart of a data center-based operation and maintenance monitoring method in an embodiment of the present application.
  • FIG. 7 is another flowchart of a data center-based operation and maintenance monitoring method in an embodiment of the present application.
  • FIG. 8 is another flowchart of a data center-based operation and maintenance monitoring method in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a data center-based operation and maintenance monitoring device in an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a computer device in an embodiment of the present application.
  • The data center-based operation and maintenance monitoring method can be applied to the application environment shown in FIG. 1.
  • The method is applied in a data center-based operation and maintenance monitoring system, which includes the data center shown in FIG. 1 and at least one associated monitoring system in communication with the data center.
  • Each associated monitoring system is connected to at least one business system, so that the associated monitoring system can collect the original monitoring data corresponding to the connected business system, and the data center can obtain all the original monitoring data from the at least one associated monitoring system.
  • a business system refers to a system that needs to be monitored and can realize a specific business.
  • the business system is a monitored object, which can specifically be a system of a certain application or an application product.
  • The associated monitoring system is a system that is connected to the business system and monitors it in order to detect whether fault events exist and to locate faults.
  • A data center is a processing center that communicates with at least one associated monitoring system; it automatically monitors and analyzes the original monitoring data collected by all associated monitoring systems, enabling real-time, effective monitoring and analysis of multiple business systems.
  • the business system, the associated monitoring system, and the data center in this embodiment all include a server and a client that communicates with the server through a network.
  • The client, also called the user terminal, is the program that corresponds to the server and provides local services to the user.
  • the client can be installed on, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • a data center-based operation and maintenance monitoring method is provided. Taking the method applied to the server of the data center in FIG. 1 as an example for description, the method includes the following steps:
  • The original monitoring data is the unprocessed data that the associated monitoring system produces while monitoring the operation of the business system, specifically the data produced when the associated monitoring system identifies a fault event, including but not limited to the data content, event occurrence time, monitoring log, and original alarm level.
  • The original alarm level is the level that the associated monitoring system assigns to a fault event identified during monitoring, by rating and classifying the event against the alarm level judgment criteria preset in that system.
  • For example, a monitoring system can use preset alarm levels such as Blocker, Critical, Major, Minor, and Info, or alarm levels such as L1, L2, L3, L4, and L5, each with corresponding alarm level judgment criteria.
  • The associated monitoring system monitors in real time the operating conditions that arise while the business system runs. If an operating condition meets an alarm level judgment criterion, the alarm level corresponding to that criterion is determined as the original alarm level.
  • The data center can send data collection instructions, in real time or at regular intervals, to at least one associated monitoring system connected to it, and receive the original monitoring data that each associated monitoring system collected from its connected business systems based on those instructions. The data center can then automatically monitor the original monitoring data collected by all associated monitoring systems, which enables real-time, effective monitoring of multiple business systems, improves monitoring efficiency, and reduces monitoring costs.
  • S202 Perform standardized processing on the original alarm level, obtain the standard alarm level, and determine the original monitoring data whose standard alarm level is the target alarm level as the to-be-processed monitoring data.
  • the standardization of the original alarm levels refers to the conversion of the original alarm levels determined by different alarm level judgment standards collected by all associated monitoring systems into the alarm levels corresponding to the unified alarm level judgment standards.
  • the standard alarm level is the alarm level determined after the original alarm level is standardized.
  • the target alarm level refers to the preset alarm level that needs to be monitored and processed. For example, an alarm level with a more serious fault level can be set as the target alarm level, so that the data center can monitor the fault events corresponding to the target alarm level at multiple levels to ensure monitoring efficiency.
  • The data center standardizes the original alarm levels of all the original monitoring data to determine the corresponding standard alarm levels, and then judges, for each piece of original monitoring data, whether its standard alarm level is the target alarm level. If it is, that original monitoring data is determined as monitoring data to be processed and passed on to subsequent automated monitoring. This ensures multi-level, real-time monitoring of the original monitoring data at the target alarm level and makes monitoring more targeted.
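A minimal Python sketch of the standardization and screening in S202. The text names two original scales (Blocker to Info, and L1 to L5) but does not give the unified scale, the mapping between scales, or the target alarm levels, so `STANDARD_LEVELS`, the 1-to-5 scale, `TARGET_LEVELS`, and the record field names below are all illustrative assumptions.

```python
# Hypothetical mapping from original alarm levels to a unified 1-5 scale
# (1 = most severe). The patent text does not specify this mapping.
STANDARD_LEVELS = {
    "blocker": 1, "critical": 2, "major": 3, "minor": 4, "info": 5,
    "l1": 1, "l2": 2, "l3": 3, "l4": 4, "l5": 5,
}

# Assumed target alarm levels: only the two most severe are processed further.
TARGET_LEVELS = {1, 2}


def standardize_level(original_level):
    """Convert an original alarm level to the unified standard level."""
    key = original_level.strip().lower()
    if key not in STANDARD_LEVELS:
        raise ValueError(f"unknown original alarm level: {original_level!r}")
    return STANDARD_LEVELS[key]


def select_pending(raw_records):
    """Return the records whose standard alarm level is a target alarm level."""
    pending = []
    for record in raw_records:
        level = standardize_level(record["original_level"])
        if level in TARGET_LEVELS:
            # Keep the original record and attach the standardized level.
            pending.append({**record, "standard_level": level})
    return pending


raw = [
    {"id": "a", "original_level": "Critical"},
    {"id": "b", "original_level": "L4"},
    {"id": "c", "original_level": "L1"},
]
pending = select_pending(raw)
```

Here records `a` and `c` are kept as monitoring data to be processed, while `b` is dropped because its standardized level is not a target level.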
  • S203 Perform source detection on the monitoring data to be processed, and obtain the target source corresponding to the monitoring data to be processed.
  • The target source is the source determined by performing source detection on the monitoring data to be processed, specifically a source within the business system such as a professional company or a business team, for example professional company A or operation and maintenance responsibility group B.
  • After determining the monitoring data to be processed, the data center performs source detection on each piece of it to determine the corresponding target source from its data content, so that subsequent automated monitoring and analysis can be carried out along the target source dimension. This ensures that the monitoring data belonging to one target source in the business system is monitored in a unified way, which makes it easier to track fault events effectively in real time.
  • S204 Perform validity detection on the monitoring data to be processed, obtain a validity detection result, and determine the to-be-processed monitoring data whose validity detection result is alarm valid as valid monitoring data.
  • the validity detection of the monitoring data to be processed is a process for real-time detection of whether the fault event corresponding to the monitoring data to be processed is legal and valid, so as to determine the validity detection result.
  • the validity check result is the result determined after the validity check of the monitoring data to be processed.
  • the validity detection result includes two conditions: the alarm is valid and the alarm is invalid.
  • The data center performs validity detection on the monitoring data to be processed according to a preset validity detection script to determine whether the validity detection result is alarm valid or alarm invalid.
  • The to-be-processed monitoring data whose validity detection result is alarm valid is screened out and determined as effective monitoring data, so that the effective monitoring data can be automatically monitored and processed later, ensuring that monitoring and processing are targeted and timely.
  • S205 Perform formatting processing based on the effective monitoring data and the target source corresponding to the effective monitoring data, and obtain the target monitoring data.
  • The data center uses the determined target source to format the effective monitoring data uploaded by all associated monitoring systems. That is, the incompatible parts of the effective monitoring data uploaded by different associated monitoring systems are converted into a unified format that subsequent automated monitoring programs can recognize, which makes subsequent automated monitoring feasible and helps improve monitoring efficiency.
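A short Python sketch of the formatting step, under stated assumptions. The text does not define the unified format or the incompatibilities between monitoring systems, so the `TargetRecord` fields, the alternative input field names (`ts`/`event_time`, `message`/`content`), and the choice of ISO 8601 UTC timestamps are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TargetRecord:
    """Assumed unified format consumed by the automated monitoring program."""
    target_source: str
    standard_level: int
    occurred_at: str  # ISO 8601 timestamp in UTC
    content: str


def to_iso_utc(value):
    """Normalize an epoch timestamp or an ISO 8601 string to ISO 8601 UTC."""
    if isinstance(value, (int, float)):
        moment = datetime.fromtimestamp(value, tz=timezone.utc)
    else:
        moment = datetime.fromisoformat(value)
        if moment.tzinfo is None:
            # Assumption: timestamps without an offset are already in UTC.
            moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).isoformat()


def format_record(effective, target_source):
    """Map differently named input fields onto the unified record."""
    occurred = effective.get("event_time", effective.get("ts"))
    if occurred is None:
        raise ValueError("record has no event time")
    content = effective.get("content", effective.get("message", ""))
    return TargetRecord(
        target_source=target_source,
        standard_level=effective["standard_level"],
        occurred_at=to_iso_utc(occurred),
        content=content,
    )


rec_a = format_record(
    {"standard_level": 1, "ts": 0, "message": "disk full"},
    "professional company A",
)
rec_b = format_record(
    {"standard_level": 2, "event_time": "2020-05-29T08:00:00+08:00", "content": "timeout"},
    "operation and maintenance responsibility group B",
)
```

Both records end up with the same field names and the same timestamp convention, so a single downstream program can consume them regardless of which monitoring system produced them.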
  • the data center is pre-installed with an automated monitoring program.
  • the formatted target monitoring data corresponding to any target source can be automatically monitored to obtain the alarm monitoring results output by the automated monitoring program. Therefore, the efficiency of the data center for monitoring the target monitoring data collected by at least one associated monitoring system is improved. Since the target monitoring data undergoes format conversion, the feasibility of the automated monitoring process can be ensured.
  • The data center is networked with the associated monitoring system, so that the data center can obtain from the associated monitoring system the collected original monitoring data corresponding to the business system and carry out unified, multi-level monitoring of all the original monitoring data collected by the associated monitoring systems;
  • the original alarm level of the original monitoring data is standardized to determine the standard alarm level, and the monitoring data to be processed is screened out according to the standard alarm level, which keeps the screening targeted;
  • source detection is performed on the monitoring data to be processed to determine the target source, which keeps subsequent automated monitoring targeted;
  • effective monitoring data is screened out based on the validity detection result of the monitoring data to be processed, which ensures that subsequent automated monitoring is targeted and timely; formatting based on the effective monitoring data and its corresponding target source yields target monitoring data in a unified format, which makes the subsequent automated monitoring feasible; finally, the formatted target monitoring data corresponding to any target source is automatically monitored to obtain the alarm monitoring results output by the automated monitoring program.
  • the data center-based operation and maintenance monitoring method further includes the following steps: based on the alarm monitoring result, executing a target monitoring reminder mechanism corresponding to the alarm monitoring result.
  • The original monitoring reminder mechanism is a preset reminder mechanism, for example a mechanism that reminds operation and maintenance personnel by telephone, by email, or by other means.
  • The data center determines which reminder condition the alarm monitoring result meets, determines the original monitoring reminder mechanism corresponding to that reminder condition as the target monitoring reminder mechanism, and executes it.
  • Executing the target monitoring reminder mechanism corresponding to the alarm monitoring result sends reminder information to the responsible operation and maintenance personnel, so that all target monitoring data corresponding to the target source is responded to promptly and response efficiency improves.
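The selection of a reminder mechanism can be sketched in Python as an ordered list of (condition, mechanism) pairs. The reminder conditions themselves are not given in the text, so the `alarm_count` field, the thresholds, and the two stand-in notifier functions are hypothetical; real mechanisms would place a call or send an email rather than return a string.

```python
def notify_by_phone(result):
    """Stand-in for a telephone reminder mechanism."""
    return f"PHONE: {result['target_source']} has {result['alarm_count']} alarms"


def notify_by_email(result):
    """Stand-in for an email reminder mechanism."""
    return f"EMAIL: {result['target_source']} has {result['alarm_count']} alarms"


# Assumed reminder conditions, checked in order; the first match wins.
REMINDER_RULES = [
    (lambda r: r["alarm_count"] >= 10, notify_by_phone),
    (lambda r: r["alarm_count"] >= 1, notify_by_email),
]


def execute_reminder(result):
    """Pick the mechanism whose condition the result meets and execute it."""
    for condition, mechanism in REMINDER_RULES:
        if condition(result):
            return mechanism(result)
    return None  # no reminder condition met


urgent = execute_reminder({"target_source": "group B", "alarm_count": 12})
routine = execute_reminder({"target_source": "group B", "alarm_count": 2})
quiet = execute_reminder({"target_source": "group B", "alarm_count": 0})
```

Ordering the rules from most to least severe means a result that satisfies several conditions triggers only the strongest reminder.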
  • the data center-based operation and maintenance monitoring method further includes the following step: displaying the alarm monitoring result corresponding to the target monitoring data according to the preset display interface.
  • the alarm monitoring results corresponding to the target monitoring data are displayed through the Web page; or the alarm monitoring results corresponding to the target monitoring data are displayed through periodic reports; or, the alarm monitoring results corresponding to the target monitoring data are displayed through the external interface.
  • Processing interfaces such as page query, periodic report, and data interface can be provided in the client of the data center, so that users can run ad hoc queries and periodic inspections through the client.
  • step S201 namely obtaining the original monitoring data corresponding to the business system collected by the associated monitoring system, specifically includes the following steps:
  • S301 Monitor the number of monitoring systems corresponding to the associated monitoring system in real time.
  • the number of monitoring systems is the number of associated monitoring systems connected to the data center.
  • The data center can broadcast HTTP requests to all associated monitoring systems, count the HTTP responses to those requests that arrive within a preset time period, and determine the number of monitoring systems from the number of HTTP responses received.
  • S302 Create as many data collection processes as there are monitoring systems, each data collection process corresponding to one associated monitoring system.
  • the data collection process is a process for collecting data created on a server corresponding to the data center.
  • each data collection process corresponds to an associated monitoring system, so that the data collection process is dedicated to collecting the original monitoring data corresponding to the corresponding associated monitoring system to ensure the pertinence of data collection.
  • the data collection process can be associated with the network address corresponding to the associated monitoring system, so that the data collection process can communicate with the associated monitoring system according to the network address, so as to obtain the original monitoring data sent by the associated monitoring system.
  • S303 Execute all data collection processes in parallel, and obtain the original monitoring data corresponding to the business system collected by each associated monitoring system in the current collection period.
  • The current collection period is the time span covered by this round of data collection, specifically from the collection time of the previous round to the current system time at which the data center forms the data collection instruction.
  • The data center executes all data collection processes in parallel, so that each data collection process sends a data collection instruction to its corresponding associated monitoring system.
  • The data collection instruction carries the current collection period. After receiving the instruction, each associated monitoring system sends to the data center all the original monitoring data whose event occurrence time falls within the current collection period, so that the data center obtains the original monitoring data corresponding to the business system collected by each associated monitoring system in that period. This keeps the collected original monitoring data continuous in time.
  • Executing all data collection processes in parallel collects, at the same time, the original monitoring data that multiple associated monitoring systems produce while monitoring their business systems. This keeps collection efficient and timely, and prevents the data center from receiving data from some associated monitoring systems late and therefore failing to monitor and handle fault events in time.
  • Data collection processes are created according to the number of monitoring systems, so that each data collection process obtains the original monitoring data collected by its corresponding associated monitoring system, which keeps the collection of original monitoring data targeted.
  • Executing all data collection processes in parallel collects, at the same time, the original monitoring data that multiple associated monitoring systems produce while monitoring their business systems, which keeps collection efficient and timely. Each data collection process collects the original monitoring data produced by its associated monitoring system in the current collection period, which keeps the collected original monitoring data continuous in time.
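A minimal Python sketch of S302 and S303, assuming one collector per associated monitoring system. It uses threads from `concurrent.futures` rather than separate operating-system processes, which is a substitution made here for brevity. The `collect_from` function is a stand-in: a real collector would send a data collection instruction to the system's network address, whereas this one filters records held in memory. The half-open interval (previous collection time, current time] is also an assumption about how period boundaries are treated.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_from(system, period_start, period_end):
    """Stand-in for one data collection process.

    Returns the records whose event time lies in (period_start, period_end].
    """
    return [
        record for record in system["records"]
        if period_start < record["event_time"] <= period_end
    ]


def collect_all(systems, period_start, period_end):
    """Run one collector per associated monitoring system in parallel."""
    if not systems:
        return {}
    with ThreadPoolExecutor(max_workers=len(systems)) as pool:
        futures = {
            system["name"]: pool.submit(collect_from, system, period_start, period_end)
            for system in systems
        }
        # result() blocks until each collector has finished.
        return {name: future.result() for name, future in futures.items()}


systems = [
    {"name": "monitor-1", "records": [{"event_time": 95}, {"event_time": 105}]},
    {"name": "monitor-2", "records": [{"event_time": 110}, {"event_time": 130}]},
]
collected = collect_all(systems, period_start=100, period_end=120)
```

Passing the previous round's end time as the next round's `period_start` gives consecutive, non-overlapping periods, which is one way to obtain the continuity in time that the text describes.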
  • step S203 that is, source detection of the monitoring data to be processed, and obtaining the target source corresponding to the monitoring data to be processed, specifically includes the following steps:
  • S401 Use a keyword recognition algorithm to identify the monitoring data to be processed, and determine whether the monitoring data to be processed includes the source key field.
  • the keyword recognition algorithm is an algorithm used to identify whether a specific keyword is included in a certain text content.
  • The keyword recognition algorithm includes but is not limited to a regular-expression matching algorithm, a substring extraction algorithm, and a hybrid matching algorithm.
  • The source key field is a field that directly reflects the source of the data. For example, the source field or the system attribute field is preset as the source key field.
  • If the monitoring data to be processed includes the source key field, the field content corresponding to the source key field is determined as the target source corresponding to the monitoring data to be processed.
  • The data center uses a keyword recognition algorithm to identify the monitoring data to be processed. When it determines that the data content contains a source key field, such as the source field or the system attribute field, it takes the field content corresponding to that source key field as the target source, which improves the efficiency and accuracy of determining the target source.
  • S403 If the to-be-processed monitoring data does not contain the source key field, use a keyword identification algorithm to identify the to-be-processed monitoring data, and determine whether the to-be-processed monitoring data contains associated key fields.
  • the associated key field is a field that cannot directly reflect its data source but is related to the data source, for example, the eventName field or the department field.
  • The data center uses a keyword recognition algorithm to identify the monitoring data to be processed. When it determines that the data content does not contain the source key field, it uses the keyword recognition algorithm again to identify whether the monitoring data to be processed contains an associated key field, and subsequent processing to determine the target source is based on that judgment result.
  • Step S404, namely performing data processing on the field content corresponding to the associated key field and obtaining the target source corresponding to the to-be-processed monitoring data, specifically includes the following step: the processing logic corresponding to the associated key field is applied to the field content corresponding to that field to determine the target source corresponding to the monitoring data to be processed. Here, the processing logic corresponding to the associated key field is the logic that derives the target source from the field content of that field. For example, the field content can be split, merged, or otherwise processed, so that the target source is determined from the associated key field quickly and accurately.
  • Alternatively, step S404 specifically includes the following steps: splitting the field content corresponding to the associated key field and extracting associated information from it; querying the association mapping table based on the associated information; and determining the data source that corresponds to the associated information in the association mapping table as the target source of the monitoring data to be processed.
  • Splitting and extracting the field content corresponding to the associated key field refers to the process of splitting the field content of the associated key field and extracting the associated information related to the data source.
  • The association mapping table is a preset data table that stores associated information and its corresponding data sources. It can be understood that this example performs a table lookup based on the associated information extracted by splitting the associated key field, which determines the target source quickly.
  • After the keyword recognition algorithm is used to determine whether the monitoring data to be processed contains an associated key field, if it does not, other processing logic, such as the processing logic corresponding to steps S501-S502, can be used to determine the target source, or the to-be-processed monitoring data can be transferred to a manual processing mechanism in which operation and maintenance personnel manually label the target source for subsequent processing.
  • Other processing logic here may include calling the data source interface, determining the monitoring system information corresponding to the associated monitoring system that transmits the monitoring data to be processed, querying a preset source mapping table based on the monitoring system information, and determining the target corresponding to the monitoring data to be processed source.
  • A keyword recognition algorithm is used to identify, in turn, whether the monitoring data to be processed contains the source key field and the associated key field, and the target source is determined from the field content corresponding to whichever of these fields is found, which ensures that the target source is determined accurately and efficiently.
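The two-stage source detection described above (source key field first, then associated key field with a table lookup) can be sketched in Python with regular-expression matching. The field names `source`, `system`, `eventName`, and `department` follow the examples in the text; the `key=value;` layout of the data content, the token separators, and the contents of `ASSOCIATION_TABLE` are assumptions made for illustration.

```python
import re

SOURCE_KEY_FIELDS = ("source", "system")
ASSOCIATED_KEY_FIELDS = ("eventName", "department")

# Hypothetical association mapping table: associated information -> data source.
ASSOCIATION_TABLE = {
    "payA": "professional company A",
    "opsB": "operation and maintenance responsibility group B",
}


def find_field(text, field):
    """Return the value of `field` in 'key=value;key=value' text, or None."""
    match = re.search(rf"(?:^|;)\s*{re.escape(field)}=([^;]+)", text)
    return match.group(1).strip() if match else None


def detect_source(text):
    """Try source key fields first, then associated key fields, else None."""
    for field in SOURCE_KEY_FIELDS:
        value = find_field(text, field)
        if value:
            return value  # the field content is the target source itself
    for field in ASSOCIATED_KEY_FIELDS:
        value = find_field(text, field)
        if value:
            # Split the field content and look each token up in the table.
            for token in re.split(r"[_\-/]", value):
                if token in ASSOCIATION_TABLE:
                    return ASSOCIATION_TABLE[token]
    return None  # caller falls back to other logic or manual labeling


direct = detect_source("source=professional company A;level=L1")
mapped = detect_source("level=L2;eventName=opsB_disk_full")
unknown = detect_source("level=L3;message=timeout")
```

Returning `None` rather than guessing leaves the fallback decision (a recognition model, a source mapping table keyed on the sending monitoring system, or manual labeling) to the caller.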
  • Step S203, that is, performing source detection on the monitoring data to be processed and obtaining the target source corresponding to the monitoring data to be processed, specifically includes the following steps:
  • S501 Use a preset source identification model to perform source detection on the monitoring data to be processed, and obtain at least one identification source and an identification probability corresponding to each identification source.
  • the preset source recognition model is a pre-trained model used to recognize the data source of the monitoring data.
  • the identification source is the data source identified by the source detection of the monitoring data to be processed using the preset source identification model.
  • the identification probability corresponding to the identified source refers to the probability that the preset source identification model recognizes the monitoring data to be processed and determines that it belongs to a certain data source.
  • the data center-based operation and maintenance monitoring method further includes a process of training a preset source recognition model, which specifically includes the following steps:
  • Model training samples are samples for model training formed by pre-marking historical monitoring data with corresponding data sources.
  • model training samples in the training set can be input to a CNN, RNN, or other neural network model for model training, so as to update the model parameters in the neural network model, thereby forming an original source recognition model.
  • The model test result is the test accuracy obtained by testing the original source recognition model with the model training samples in the test set.
  • Each model training sample in the test set is input to the original source recognition model for recognition to obtain its recognition result; if the recognition result is consistent with the data source marked on the model training sample, the recognition is determined to be accurate; if the recognition result is inconsistent with the data source marked on the model training sample, the recognition is determined to be inaccurate; the test accuracy is determined from the number of model training samples in the test set whose recognition results are accurate and the number of all model training samples in the test set.
  • The preset standard is a preset accuracy threshold used to evaluate whether the original source recognition model is accurate enough; for example, it can be set to 90%.
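  • The test-accuracy calculation can be illustrated with the short sketch below. The function names are hypothetical, `predict` stands in for the original source recognition model, and the use of `>=` against the preset standard is an assumption, since the text only says the model must meet the standard.

```python
from typing import Callable, Sequence, Tuple


def compute_test_accuracy(
    predict: Callable[[str], str],
    test_set: Sequence[Tuple[str, str]],
) -> float:
    """Share of test samples whose recognized source equals the marked source."""
    if not test_set:
        raise ValueError("test set must not be empty")
    accurate = sum(1 for data, marked in test_set if predict(data) == marked)
    return accurate / len(test_set)


def meets_preset_standard(accuracy: float, preset_standard: float = 0.9) -> bool:
    """Assumed comparison: the model passes when accuracy >= the preset standard."""
    return accuracy >= preset_standard
```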
  • S502 Compare the maximum recognition probability with a preset probability threshold, and if the maximum recognition probability is greater than the preset probability threshold, determine the recognition source corresponding to the maximum recognition probability as the target source corresponding to the monitoring data to be processed.
  • the maximum identification probability is the maximum value of the identification probabilities corresponding to multiple identification sources identified by the preset source identification model.
  • the preset probability threshold is a preset threshold used to evaluate whether an identification probability is high enough for the corresponding identification source to be accepted as the target source.
  • the data center may use a pre-trained preset source identification model to perform source detection on the monitoring data to be processed, and obtain the detection result output by the preset source identification model.
  • The detection result includes at least one identification source and the identification probability corresponding to each identification source. The identification probabilities corresponding to all the identification sources are then sorted, and the maximum identification probability is compared with the preset probability threshold. If the maximum identification probability is greater than the preset probability threshold, the identification source corresponding to the maximum identification probability is determined as the target source corresponding to the monitoring data to be processed.
  • If the maximum identification probability is not greater than the preset probability threshold, other processing logic for obtaining the target source can be executed, or the monitoring data to be processed can be transferred to a manual processing mechanism, where operation and maintenance personnel manually calibrate the target source for subsequent processing.
  • The data center-based operation and maintenance monitoring method provided in this embodiment uses a pre-trained preset source identification model to perform source detection on the monitoring data to be processed, so that the target source is determined quickly and effectively from the data content of the monitoring data to be processed, ensuring the accuracy and efficiency of target source determination.
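  • Step S502 reduces to a maximum-and-threshold comparison. The sketch below assumes the model output is available as a mapping from identification source to identification probability; the function name is hypothetical.

```python
from typing import Dict, Optional


def select_target_source(
    probabilities: Dict[str, float], threshold: float
) -> Optional[str]:
    """Return the identification source with the maximum probability if that
    probability is strictly greater than the threshold, otherwise None."""
    if not probabilities:
        return None
    source, max_probability = max(probabilities.items(), key=lambda item: item[1])
    return source if max_probability > threshold else None
```

  • A `None` result corresponds to the case where other processing logic or the manual processing mechanism is used.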
  • Step S204, that is, performing validity detection on the monitoring data to be processed and obtaining the validity detection result, specifically includes the following steps:
  • S601 Query a legal source mapping table based on the target source of the to-be-processed monitoring data, and obtain a legality verification result.
  • the legal source mapping table is a preset data table used to evaluate whether the data source is legal.
  • the legality verification result is the result of the legality verification of the target source of the monitoring data to be processed.
  • the legality verification result is one of two results, legal source or illegal source: legal source means that the target source of the monitoring data to be processed is a legal data source that the data center needs to monitor; illegal source means that the target source of the monitoring data to be processed is an illegal data source that the data center does not need to monitor.
  • the source information of all legal sources can be stored in the legal source mapping table, and the legal source mapping table can be queried based on the target source identified from the monitoring data to be processed. If the target source is in the legal source mapping table, the legality verification result is legal source; if the target source is not in the legal source mapping table, the legality verification result is illegal source.
  • Alternatively, multiple data sources and the source attribute corresponding to each data source can be stored in the legal source mapping table, where the source attribute is either legal source or illegal source; the legal source mapping table is queried based on the target source identified from the monitoring data to be processed, and the corresponding legality verification result is determined according to the source attribute of the target source in the legal source mapping table.
  • S602 Query the alarm state information table based on the to-be-processed monitoring data, and obtain the alarm state result.
  • the alarm status information table is a preset data table used to evaluate whether the alarm event in the monitoring data is valid.
  • the alarm status result is the result of determining whether a certain monitoring data to be processed is still in a valid state according to the alarm status information table.
  • the data center needs to query the alarm status information table based on the monitoring log recorded in the monitoring data to be processed to determine whether this monitoring log is still identified as a fault event that needs monitoring, and obtains the alarm status result according to the judgment result.
  • If querying the alarm status information table according to the monitoring log determines that it is a fault event that needs to be monitored, the obtained alarm status result is a valid state; if querying the alarm status information table according to the monitoring log determines that it is a fault event that does not need to be monitored, the obtained alarm status result is an invalid state.
  • When the data center checks the validity of the monitoring data to be processed, if the legality verification result is legal source and the alarm status result is a valid state, the target source of the monitoring data to be processed is a legal data source that the data center needs to monitor, and the fault event corresponding to the monitoring data to be processed is one that the data center needs to keep monitoring. In this case the validity detection result is alarm valid, so that the monitoring data with a valid alarm can continue to be processed.
  • If the legality verification result is illegal source, or the alarm status result is an invalid state, the target source of the monitoring data to be processed is an illegal data source that the data center does not need to monitor, or the fault event corresponding to the monitoring data to be processed is not one that the data center needs to keep monitoring. In this case the validity detection result is alarm invalid, so that the data center does not need to continue monitoring and analyzing this monitoring data.
  • In this embodiment, the legal source mapping table and the alarm state information table are each queried based on the monitoring data to be processed to obtain the legality verification result and the alarm state result; if the legality verification result is legal source and the alarm state result is a valid state, the alarm is determined to be valid; otherwise, the alarm is determined to be invalid. This ensures the accuracy and timeliness of the validity detection result.
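  • The two table look-ups in S601 and S602 and their combination can be sketched as below. The table contents, the `source` and `eventName` keys, and the choice to treat events missing from the alarm state information table as invalid are all assumptions made for illustration.

```python
# Hypothetical tables: a set of legal sources, and a map from event name to
# whether the event still needs monitoring.
LEGAL_SOURCE_MAPPING_TABLE = {"payment-system", "crm-system"}
ALARM_STATE_INFO_TABLE = {"disk-full": True, "test-alarm": False}


def check_validity(record: dict) -> str:
    """Return 'alarm valid' only when the source is legal AND the alarm
    state is valid; every other combination yields 'alarm invalid'."""
    source_is_legal = record.get("source") in LEGAL_SOURCE_MAPPING_TABLE
    # Assumption: events absent from the table are treated as invalid.
    state_is_valid = ALARM_STATE_INFO_TABLE.get(record.get("eventName"), False)
    return "alarm valid" if source_is_legal and state_is_valid else "alarm invalid"
```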
  • step S205 that is, formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data, to obtain the target monitoring data, specifically includes the following steps:
  • S701 Encode and format effective monitoring data to obtain standard monitoring data.
  • the code conversion of effective monitoring data refers to the process of converting all effective monitoring data into uniform codes.
  • Standard monitoring data refers to the monitoring data determined after the effective monitoring data is encoded and formatted.
  • the data center can uniformly convert the effective monitoring data uploaded by different associated monitoring systems into the target encoding format, such as UTF-8 encoding format, to ensure the feasibility of subsequent data monitoring and analysis and processing, thereby improving the efficiency and accuracy of monitoring and analysis.
  • the data center identifies the current encoding format of the effective monitoring data; if the current encoding format is already the target encoding format (UTF-8, a Unicode encoding format), no encoding conversion is needed; if the current encoding format is not the target encoding format, the effective monitoring data is converted from the current encoding format to obtain standard monitoring data that matches the target encoding format, so that all standard monitoring data shares one encoding format and subsequent automated monitoring and analysis remains feasible.
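  • A minimal sketch of the encoding step S701 follows. It assumes the current encoding has already been identified and is passed in by name; how the data center detects the encoding is not described in the text and is not shown here.

```python
import codecs


def to_target_encoding(raw: bytes, current_encoding: str) -> bytes:
    """Convert raw monitoring data to UTF-8, skipping data already in UTF-8."""
    # codecs.lookup normalizes aliases such as "UTF8" or "utf_8" to "utf-8".
    if codecs.lookup(current_encoding).name == "utf-8":
        return raw
    return raw.decode(current_encoding).encode("utf-8")
```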
  • S702 Use a keyword recognition algorithm to identify the standard monitoring data, and determine whether the standard monitoring data includes core format fields.
  • the core format field refers to a field whose content needs to adopt a specific format.
  • the data center can use a keyword recognition algorithm to identify the data content in the standard monitoring data and determine whether any field in the standard monitoring data is a core format field, so as to decide from the judgment result whether further format conversion is required, keeping the format conversion targeted.
  • S703 If the standard monitoring data includes core format fields, use the format conversion script corresponding to the core format fields to perform format conversion on the original field content corresponding to the core format fields, obtain the target field content, replace the original field content with the target field content, and add the source field and the corresponding target source to obtain the target monitoring data.
  • the format conversion script corresponding to the core format field is a preset script for realizing format conversion of the field content corresponding to the core format field.
  • For example, when the core format field is eventName (event name), the format conversion script corresponding to the core format field is used to uniformly convert the original field content corresponding to the core format field in the standard monitoring data into target field content in the specific format "ObjName (project name)-ObjType (project type)-Source (source)-Desc (team)", and the original field content is then replaced with the target field content.
  • As another example, the format conversion script corresponding to the core format field is used to check the character length of the original field content corresponding to the core format field in the standard monitoring data. If the character length meets the standard length, no format conversion is required; if the character length does not meet the standard length, content is added or replaced according to the established rules to obtain target field content whose character length meets the standard length, and the original field content is then replaced with the target field content.
  • As another example, the format conversion script corresponding to the core format field is used to convert the time unit of the original field content corresponding to the core format field in the standard monitoring data into a unified time unit, so that time-related monitoring data can subsequently be monitored automatically and monitoring efficiency improves.
  • After the format conversion script has been used to format the original field content corresponding to the core format field and the original field content has been replaced with the target field content, the source field and the target source corresponding to the source field need to be added to the standard monitoring data. The resulting target monitoring data then adopts a unified format, which keeps subsequent automatic monitoring and analysis feasible, and each piece of target monitoring data carries information such as the target source, so that all target monitoring data can subsequently be monitored and analyzed automatically by target source, ensuring the effectiveness of monitoring.
  • If the standard monitoring data does not include the core format field, the source field and the target source corresponding to the source field are added directly to the standard monitoring data, so that each piece of target monitoring data carries information such as the target source and all target monitoring data can subsequently be monitored and analyzed automatically by target source, ensuring the effectiveness of monitoring.
  • The data center-based operation and maintenance monitoring method provided in this embodiment encodes and formats the effective monitoring data so that the resulting standard monitoring data is recognizable and the automated monitoring program of the data center can identify its data content accurately and effectively; when the standard monitoring data contains the core format field, the format conversion script is used for format conversion and the original field content is replaced with the target field content, so that the target field content corresponding to the core format field adopts a unified format and subsequent automatic monitoring and analysis remains feasible; the source field and the target source corresponding to the source field are added to the standard monitoring data, so that each piece of target monitoring data carries the target source and other information and all target monitoring data can subsequently be monitored and analyzed automatically by target source, ensuring the effectiveness of monitoring.
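  • Steps S702 and S703 can be sketched with a table that maps each core format field to its format conversion script. The example uses the time-unit conversion case; the field name `duration`, the accepted units, and the output format are hypothetical choices made only for illustration.

```python
import re

UNIT_SECONDS = {"s": 1, "min": 60, "h": 3600}


def duration_to_seconds(content: str) -> str:
    """Hypothetical conversion script: normalize '5min' or '2h' to seconds."""
    match = re.fullmatch(r"(\d+)\s*(s|min|h)", content.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {content!r}")
    value, unit = match.groups()
    return f"{int(value) * UNIT_SECONDS[unit]}s"


# Core format field -> format conversion script (assumed field name).
FORMAT_CONVERSION_SCRIPTS = {"duration": duration_to_seconds}


def build_target_monitoring_data(standard_record: dict, target_source: str) -> dict:
    """Convert every core format field that is present, then add the source field."""
    target = dict(standard_record)  # leave the input untouched
    for field, convert in FORMAT_CONVERSION_SCRIPTS.items():
        if field in target:
            target[field] = convert(target[field])  # replace original content
    target["source"] = target_source
    return target
```

  • Records without a core format field pass through the loop unchanged and only gain the source field, which matches the branch where no format conversion is needed.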
  • Step S206, that is, automatically monitoring the target monitoring data based on the target source and obtaining the alarm monitoring result, specifically includes the following steps:
  • S801 Start an automated monitoring task, which includes a monitoring cycle and target monitoring indicators.
  • the automated monitoring task is a computer-executable task set in advance for realizing automatic monitoring.
  • the monitoring cycle refers to the time interval over which the monitoring data that the data center needs to monitor is collected from business data.
  • Target monitoring indicators refer to the indicators that need to be monitored for this automatic monitoring task, including but not limited to alarm processing rate, alarm timeout rate, alarm processing timeliness, and user utilization rate.
  • S802 Divide all target monitoring data with the event occurrence time within the monitoring period according to the target source, and determine the monitoring data to be analyzed corresponding to each target source.
  • the data center can compare the event occurrence time of each piece of target monitoring data with the monitoring cycle to determine whether the event occurrence time is within the monitoring cycle, so as to filter out all target monitoring data whose event occurrence time is within the monitoring cycle as the data that needs automatic monitoring. Then, all target monitoring data whose event occurrence time is within the monitoring cycle is divided according to the target source it carries, producing as many data sets to be monitored as there are target sources. Each data set to be monitored corresponds to one target source and contains the monitoring data to be analyzed for that target source, that is, all target monitoring data of that target source whose event occurrence time is within the monitoring cycle.
  • S803 Call the indicator calculation script corresponding to the target monitoring indicator, read the variable factor value corresponding to the target variable factor from the monitoring data to be analyzed, and use the indicator calculation logic to calculate the variable factor value to obtain the alarm monitoring result corresponding to each target source .
  • the indicator calculation script is a computer-executable script for calculating the target monitoring indicator; it is written in a computer language and executes the indicator calculation formula corresponding to the target monitoring indicator.
  • the indicator calculation formula includes indicator variable factors and indicator calculation logic.
  • the indicator variable factors are the variable factors used to calculate the target monitoring indicators
  • the data center can execute the indicator calculation script corresponding to the target monitoring indicator set in the automated monitoring task. Since each piece of monitoring data to be analyzed is formatted target monitoring data, the indicator calculation script can quickly read from it the variable factor value corresponding to the target variable factor in the indicator calculation formula, and then apply the indicator calculation logic in the formula to the variable factor value, obtaining the alarm monitoring result corresponding to each target source. This realizes automatic monitoring of all target monitoring data corresponding to each target source, improves the efficiency of data monitoring, and reduces its cost.
  • Dividing the target monitoring data into the monitoring data to be analyzed based on the event occurrence time and the target source makes it feasible to subsequently process all target monitoring data along the target-source dimension; calling the indicator calculation script corresponding to the target monitoring indicator automatically and quickly applies the indicator calculation logic to the variable factor values read from the monitoring data to be analyzed, obtaining the alarm monitoring result corresponding to each target source, so as to realize automatic monitoring of all target monitoring data corresponding to each target source, improve the efficiency of data monitoring, and reduce the cost of data monitoring.
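  • Steps S802 and S803 can be sketched as a filter, a group-by, and a per-group calculation. The record keys (`eventTime`, `source`, `processed`) are hypothetical, and the alarm processing rate formula used here (processed alarms divided by all alarms) is an assumed example, since the text does not give the indicator calculation formulas.

```python
from collections import defaultdict
from datetime import datetime
from typing import Callable, Dict, List


def alarm_processing_rate(records: List[dict]) -> float:
    """Assumed indicator formula: processed alarms / all alarms."""
    processed = sum(1 for record in records if record["processed"])
    return processed / len(records)


def run_monitoring_task(
    records: List[dict],
    period_start: datetime,
    period_end: datetime,
    indicator: Callable[[List[dict]], float] = alarm_processing_rate,
) -> Dict[str, float]:
    """Group in-period records by target source and compute one result each."""
    by_source: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        # Keep only records whose event occurrence time is in the cycle.
        if period_start <= record["eventTime"] < period_end:
            by_source[record["source"]].append(record)
    return {source: indicator(group) for source, group in by_source.items()}
```

  • Passing a different `indicator` function swaps in another target monitoring indicator without changing the grouping logic.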
  • a data center-based operation and maintenance monitoring device is provided, which corresponds one-to-one to the data center-based operation and maintenance monitoring method in the foregoing embodiment.
  • the data center-based operation and maintenance monitoring device includes an original monitoring data acquisition module 901, a to-be-processed monitoring data acquisition module 902, a target source acquisition module 903, an effective monitoring data acquisition module 904, a target monitoring data acquisition module 905, and an alarm monitoring result acquisition module 906.
  • each functional module is as follows:
  • the original monitoring data acquisition module 901 is used to obtain the original monitoring data corresponding to the business system collected by the associated monitoring system, and each original monitoring data includes the original alarm level.
  • the to-be-processed monitoring data acquisition module 902 is configured to standardize the original alarm level, obtain the standard alarm level, and determine the original monitoring data whose standard alarm level is the target alarm level as the to-be-processed monitoring data.
  • the target source obtaining module 903 is configured to perform source detection on the monitoring data to be processed and obtain the target source corresponding to the monitoring data to be processed.
  • the effective monitoring data acquisition module 904 is configured to perform validity detection on the monitoring data to be processed, obtain the validity detection result, and determine the monitoring data to be processed whose validity detection result is alarm valid as effective monitoring data.
  • the target monitoring data acquisition module 905 is configured to perform formatting processing based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain the target monitoring data.
  • the alarm monitoring result obtaining module 906 is configured to automatically monitor the target monitoring data based on the target source, and obtain the alarm monitoring result.
  • the original monitoring data acquisition module includes a monitoring system quantity acquisition unit, a data collection process creation unit, and a collection process parallel execution unit.
  • the monitoring system quantity acquisition unit is used for real-time monitoring of the monitoring system quantity corresponding to the associated monitoring system.
  • the data collection process creation unit is used to create data collection processes corresponding to the number of monitoring systems, and each data collection process corresponds to an associated monitoring system.
  • the collection process parallel execution unit is used to execute all data collection processes in parallel, and obtain the original monitoring data corresponding to the business system collected by each associated monitoring system in the current collection cycle.
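  • The three units above amount to counting the associated monitoring systems, creating one collection worker per system, and running the workers in parallel. The sketch below uses threads from `concurrent.futures` in place of the data collection processes named in the text, purely to keep the example short; the collector callables are hypothetical stand-ins for each associated monitoring system.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def collect_all(collectors: Dict[str, Callable[[], List[dict]]]) -> Dict[str, List[dict]]:
    """Run one collection worker per associated monitoring system in parallel."""
    if not collectors:
        return {}
    # One worker per monitoring system (threads substituted for processes).
    with ThreadPoolExecutor(max_workers=len(collectors)) as pool:
        futures = {name: pool.submit(collect) for name, collect in collectors.items()}
        return {name: future.result() for name, future in futures.items()}
```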
  • the target source acquiring module includes a first keyword identifying unit, a first source determining unit, a second keyword identifying unit, and a second source determining unit.
  • the first keyword identification unit is configured to use a keyword identification algorithm to identify the monitoring data to be processed, and to determine whether the monitoring data to be processed contains the source key field.
  • the first source determining unit is configured to determine the field content corresponding to the source key field as the target source corresponding to the to-be-processed monitoring data if the monitoring data to be processed includes the source key field.
  • the second keyword identification unit is configured to use a keyword identification algorithm to identify the monitoring data to be processed if the monitoring data to be processed does not contain the source key fields, and to determine whether the monitoring data to be processed contains associated key fields.
  • the second source determination unit is configured to, if the monitoring data to be processed contains associated key fields, perform data processing on the field content corresponding to the associated key fields to obtain the target source corresponding to the monitoring data to be processed.
  • the target source acquisition module includes a model recognition processing unit and a third source determination unit.
  • the model identification processing unit is configured to use a preset source identification model to perform source detection on the monitoring data to be processed, and obtain at least one identification source and an identification probability corresponding to each identification source.
  • the third source determination unit is used to compare the maximum recognition probability with the preset probability threshold, and if the maximum recognition probability is greater than the preset probability threshold, then the recognition source corresponding to the maximum recognition probability is determined as the target source corresponding to the monitoring data to be processed .
  • the effective monitoring data acquisition module includes a legality verification result acquisition unit, an alarm state result acquisition unit, a legal source result acquisition unit, and an illegal source result acquisition unit.
  • the legality verification result obtaining unit is used to query the legal source mapping table based on the target source of the monitoring data to be processed to obtain the legality verification result.
  • the alarm state result obtaining unit is used to query the alarm state information table based on the monitoring data to be processed, and obtain the alarm state result.
  • the legal source result obtaining unit is configured to obtain the validity detection result that the alarm is valid if the legality check result is that the source is legal and the alarm state result is a valid state.
  • the illegal source result obtaining unit is configured to obtain the validity detection result that the alarm is invalid if the legality check result is that the source is illegal, or the alarm state result is an invalid state.
  • the target monitoring data acquisition module includes a standard monitoring data acquisition unit, a core format field judgment unit, a first target data acquisition unit, and a second target data acquisition unit.
  • the standard monitoring data acquisition unit is used to encode and format effective monitoring data and obtain standard monitoring data.
  • the core format field judging unit is used to identify the standard monitoring data using a keyword recognition algorithm, and to determine whether the standard monitoring data includes the core format field.
  • the first target data acquisition unit is used, if the standard monitoring data contains the core format field, to perform format conversion on the original field content corresponding to the core format field using the format conversion script corresponding to the core format field, obtain the target field content, replace the original field content with the target field content, add the source field and the corresponding target source, and obtain the target monitoring data.
  • the second target data obtaining unit is used to add a source field and a corresponding target source if the standard monitoring data does not include the core format field, and obtain the target monitoring data.
  • the alarm monitoring result acquisition module includes a monitoring task start unit, a data to be analyzed determination unit, and an alarm monitoring processing unit.
  • the monitoring task start unit is used to start an automated monitoring task.
  • the automated monitoring task includes a monitoring cycle and target monitoring indicators.
  • the to-be-analyzed data determining unit is used to divide all target monitoring data whose event occurrence time is within the monitoring cycle according to the target source, and determine the to-be-analyzed monitoring data corresponding to each target source.
  • the alarm monitoring processing unit is used to call the indicator calculation script corresponding to the target monitoring indicator, read the variable factor value corresponding to the target variable factor from the monitoring data to be analyzed, and use the indicator calculation logic to calculate the variable factor value to obtain the alarm monitoring result corresponding to each target source.
  • Each module in the above-mentioned data center-based operation and maintenance monitoring device can be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above-mentioned modules may be embedded in, or independent of, the processor in the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 10.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store the data adopted or generated during the process of executing the operation and maintenance monitoring method based on the data center.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by the processor to realize a data center-based operation and maintenance monitoring method.
  • one or more readable storage media storing computer readable instructions are provided.
  • the readable storage medium stores computer-readable instructions; when the computer-readable instructions are executed by one or more processors, the one or more processors implement the data center-based operation and maintenance monitoring method in the foregoing embodiment, such as S201-S206 shown in FIG. 2 or the steps shown in FIGS. 3 to 8, which are not repeated here.
  • When the processor executes the computer-readable instructions, the functions of each module/unit in the embodiment of the data center-based operation and maintenance monitoring device are implemented, for example, the original monitoring data acquisition module 901 and the to-be-processed monitoring data acquisition shown in FIG.
  • the readable storage medium in this embodiment includes a non-volatile readable storage medium and a volatile readable storage medium.
  • a computer-readable storage medium is provided, on which computer-readable instructions are stored.
  • when the computer-readable instructions are executed by a processor, the data center-based operation and maintenance monitoring method in the above embodiment is implemented, such as S201-S206 shown in FIG. 2 or the steps shown in FIGS. 3 to 8, which are not repeated here to avoid repetition.
  • alternatively, when the computer-readable instructions are executed by the processor, the functions of each module/unit in the above embodiment of the data center-based operation and maintenance monitoring apparatus are implemented, for example, the functions of the original monitoring data acquisition module 901, the to-be-processed monitoring data acquisition module 902, the target source acquisition module 903, the effective monitoring data acquisition module 904, the target monitoring data acquisition module 905, and the alarm monitoring result acquisition module 906 shown in FIG. 9, which are not repeated here to avoid repetition.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Alarm Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

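A data center-based operation and maintenance monitoring method, apparatus, device, and storage medium. The method includes: acquiring original monitoring data collected by an associated monitoring system, each piece of original monitoring data including an original alarm level; standardizing the original alarm level to obtain a standard alarm level and determining to-be-processed monitoring data; performing source detection on the to-be-processed monitoring data to obtain a target source; performing validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determining the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data (S204); formatting based on the effective monitoring data and the corresponding target source to obtain target monitoring data (S205); and performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result (S206). The method can improve the efficiency with which the data center monitors the target monitoring data collected by the associated monitoring systems.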

Description

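Data center-based operation and maintenance monitoring method, apparatus, device, and storage medium
This application is based on, and claims priority to, Chinese invention application No. 202010153280.5, filed on March 6, 2020 and entitled "Data center-based operation and maintenance monitoring method, apparatus, device, and storage medium".
Technical Field
This application relates to the technical field of data monitoring, and in particular to a data center-based operation and maintenance monitoring method, apparatus, device, and storage medium.
Background
Monitoring is an important part of operating and maintaining today's business systems. Through monitoring, operation and maintenance personnel detect and locate fault events in a business system so that the system can be maintained and updated accordingly. Generally, an enterprise (especially a group enterprise) runs multiple business systems; operation and maintenance personnel monitor them with different monitoring systems and feed the monitoring results back separately to the enterprise's monitoring management center. The inventors realized that the monitoring results collected by different monitoring systems are currently incompatible, so the results collected by multiple monitoring systems cannot be effectively monitored and analyzed together, which makes monitoring and analysis inefficient.
Summary
The embodiments of this application provide a data center-based operation and maintenance monitoring method, apparatus, device, and storage medium, to solve the problem that the monitoring results collected by different monitoring systems are incompatible, so that the results collected by multiple monitoring systems cannot be effectively monitored and analyzed and monitoring and analysis are inefficient.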
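A data center-based operation and maintenance monitoring method, including:
acquiring original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level;
standardizing the original alarm level to obtain a standard alarm level, and determining the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data;
performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data;
performing validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determining the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data;
formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data;
performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.
A data center-based operation and maintenance monitoring apparatus, including:
an original monitoring data acquisition module, configured to acquire original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level;
a to-be-processed monitoring data acquisition module, configured to standardize the original alarm level to obtain a standard alarm level, and determine the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data;
a target source acquisition module, configured to perform source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data;
an effective monitoring data acquisition module, configured to perform validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determine the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data;
a target monitoring data acquisition module, configured to perform formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data;
an alarm monitoring result acquisition module, configured to perform automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.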
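A computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
acquiring original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level;
standardizing the original alarm level to obtain a standard alarm level, and determining the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data;
performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data;
performing validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determining the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data;
formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data;
performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.
One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
acquiring original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level;
standardizing the original alarm level to obtain a standard alarm level, and determining the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data;
performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data;
performing validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determining the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data;
formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data;
performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.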
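In the above data center-based operation and maintenance monitoring method, apparatus, device, and storage medium, the data center is networked with the associated monitoring systems, so that the data center can obtain from them the original monitoring data they collect from the business systems and can perform unified, multi-level monitoring of all of that data. The original alarm level of the original monitoring data is standardized to determine a standard alarm level, and the to-be-processed monitoring data is filtered by standard alarm level, which keeps the filtering targeted. Source detection is performed on the to-be-processed monitoring data to determine the target source, which keeps the subsequent automated monitoring targeted. Effective monitoring data is filtered by the validity detection result, which keeps the subsequent automated monitoring targeted and timely. Formatting based on the effective monitoring data and the corresponding target source yields target monitoring data in a unified format, which makes the subsequent automated monitoring feasible. Finally, the formatted target monitoring data of any target source is monitored automatically to obtain the alarm monitoring result output by the automated monitoring program, which improves the efficiency with which the data center monitors the target monitoring data collected by at least one associated monitoring system.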
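Brief Description of the Drawings
To describe the technical solutions of the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment of the data center-based operation and maintenance monitoring method in an embodiment of this application;
FIG. 2 is a flowchart of the data center-based operation and maintenance monitoring method in an embodiment of this application;
FIG. 3 is another flowchart of the data center-based operation and maintenance monitoring method in an embodiment of this application;
FIG. 4 is another flowchart of the data center-based operation and maintenance monitoring method in an embodiment of this application;
FIG. 5 is another flowchart of the data center-based operation and maintenance monitoring method in an embodiment of this application;
FIG. 6 is another flowchart of the data center-based operation and maintenance monitoring method in an embodiment of this application;
FIG. 7 is another flowchart of the data center-based operation and maintenance monitoring method in an embodiment of this application;
FIG. 8 is another flowchart of the data center-based operation and maintenance monitoring method in an embodiment of this application;
FIG. 9 is a schematic diagram of the data center-based operation and maintenance monitoring apparatus in an embodiment of this application;
FIG. 10 is a schematic diagram of the computer device in an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.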
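The data center-based operation and maintenance monitoring method provided by the embodiments of this application can be applied in the application environment shown in FIG. 1. Specifically, the method is applied in a data center-based operation and maintenance monitoring system, which includes, as shown in FIG. 1, a data center and at least one associated monitoring system in communication with the data center. Each associated monitoring system is connected to at least one business system, so that the associated monitoring system can collect the original monitoring data of the business systems connected to it. The data center can obtain all original monitoring data from the at least one associated monitoring system and analyze and process it to obtain alarm monitoring results. In this way, the original monitoring data collected by all associated monitoring systems is monitored and analyzed automatically, multiple business systems are monitored effectively in real time, and monitoring and analysis become more efficient. Here, a business system is a system that implements a specific business and needs to be monitored; it is the monitored object and may be the system of a particular application or application product. An associated monitoring system is a system connected to a business system that monitors it in order to determine whether a fault event exists and to locate faults. The data center is a processing center that communicates with at least one associated monitoring system, automatically monitors and analyzes the original monitoring data collected by all associated monitoring systems, and thereby monitors and analyzes multiple business systems effectively in real time.
In this embodiment, the business system, the associated monitoring system, and the data center each include a server and a client that communicates with the server over a network. The client, also called the user side, is the program that corresponds to the server and provides local services to the user. The client can be installed on, but is not limited to, personal computers, laptops, smartphones, tablets, and portable wearable devices. The server can be implemented as a standalone server or as a cluster of multiple servers.
In an embodiment, as shown in FIG. 2, a data center-based operation and maintenance monitoring method is provided. The method is described using its application on the server of the data center in FIG. 1 as an example, and includes the following steps: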
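S201: Acquire original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level.
Original monitoring data is the unprocessed data formed by an associated monitoring system while it monitors the operation of a business system. Specifically, it is the data formed when the associated monitoring system determines that a fault event has occurred, including but not limited to the data content, event occurrence time, monitoring log, and original alarm level.
The original alarm level is the level that the associated monitoring system assigns to a fault event identified while monitoring the business system, according to the alarm level criteria preset in that system. As an example, an associated monitoring system may use preset alarm levels of decreasing severity such as Blocker, Critical, Major, Minor, and Info, or alarm levels such as L1, L2, L3, L4, and L5, each with corresponding criteria. The associated monitoring system monitors the operating conditions of the business system in real time; if the operating conditions satisfy the criteria for an alarm level, that alarm level is determined as the original alarm level.
In this example, the data center can send data collection instructions, in real time or on a schedule, to the at least one associated monitoring system networked with it, and receive the original monitoring data that each associated monitoring system has collected, based on the instruction, from the at least one business system connected to it. The data center can then automatically monitor the original monitoring data collected by all associated monitoring systems, which allows multiple business systems to be monitored effectively in real time, improves monitoring efficiency, and reduces monitoring costs.
S202: Standardize the original alarm level to obtain a standard alarm level, and determine the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data.
Standardizing the original alarm level means converting the original alarm levels that the associated monitoring systems determined under different alarm level criteria into alarm levels under one unified set of criteria. Accordingly, the standard alarm level is the alarm level obtained after standardization. The target alarm level is a preset alarm level that requires monitoring. For example, the alarm levels for more severe faults can be set as target alarm levels, so that the data center performs multi-level monitoring of the corresponding fault events and monitoring efficiency is maintained.
As an example, after acquiring the original monitoring data collected by at least one associated monitoring system, the data center standardizes the original alarm level of every piece of original monitoring data to determine its standard alarm level, and checks whether each standard alarm level is a target alarm level. If it is, the corresponding original monitoring data is determined as to-be-processed monitoring data for subsequent automated monitoring. This ensures multi-level real-time monitoring of the original monitoring data at the target alarm levels and keeps the monitoring targeted.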
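Step S202 can be illustrated with a minimal sketch. The text names the two example level vocabularies but not the unified scale, so the numeric 1-5 scale, the choice of target levels, and the record field names (`original_level`, `standard_level`) are assumptions made only for illustration.

```python
# Hypothetical mapping from system-specific alarm levels to one standard scale.
# The 1-5 scale (1 = most severe) is an assumption, not taken from the source.
LEVEL_MAP = {
    "Blocker": 1, "Critical": 2, "Major": 3, "Minor": 4, "Info": 5,
    "L1": 1, "L2": 2, "L3": 3, "L4": 4, "L5": 5,
}
TARGET_LEVELS = {1, 2}  # assumed: only the most severe levels are monitored


def select_pending(raw_records):
    """Standardize each record's original level and keep target-level records."""
    pending = []
    for record in raw_records:
        standard = LEVEL_MAP.get(record.get("original_level"))
        if standard in TARGET_LEVELS:
            # Copy the record and attach the standard alarm level.
            pending.append({**record, "standard_level": standard})
    return pending
```

Records whose original level is unknown map to no standard level and are simply not selected in this sketch; a real deployment might instead route them to manual review.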
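S203: Perform source detection on the to-be-processed monitoring data to obtain the target source corresponding to the to-be-processed monitoring data.
The target source is the source of the to-be-processed monitoring data as determined by source detection, specifically a source within the business system such as a specialized company or a business team, for example, specialized company A or operation and maintenance responsibility group B.
As an example, after determining the to-be-processed monitoring data, the data center performs source detection on each piece to determine its target source from its data content, so that later automated monitoring and analysis can use the target source as an analysis dimension. This makes unified monitoring of the data from a given target source in the business system effective and makes it easier to track fault events in real time.
S204: Perform validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determine the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data.
Validity detection checks in real time whether the fault event corresponding to the to-be-processed monitoring data is legitimate and still valid, and yields a validity detection result. In this example, the validity detection result is either alarm valid or alarm invalid.
As an example, the data center runs a preset validity detection script on the to-be-processed monitoring data to determine whether its validity detection result is alarm valid or alarm invalid, and selects the to-be-processed monitoring data that is alarm valid as effective monitoring data. Subsequent automated monitoring then operates on the effective monitoring data, which keeps the processing targeted and timely.
S205: Format based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data.
As an example, after obtaining the effective monitoring data, the data center uses the determined target source to format it. That is, the portions of the effective monitoring data uploaded by different associated monitoring systems that are mutually incompatible are converted into a unified format that the subsequent automated monitoring program can recognize. This makes subsequent automated monitoring feasible and helps improve monitoring efficiency.
S206: Perform automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.
As an example, the data center is preconfigured with an automated monitoring program. By executing this program, the data center automatically monitors the formatted target monitoring data of any target source and obtains the alarm monitoring result that the program outputs, which improves the efficiency with which the data center monitors the target monitoring data collected by at least one associated monitoring system. Because the target monitoring data has been converted to a unified format, the automated monitoring process is feasible.
In the data center-based operation and maintenance monitoring method provided by this embodiment, the data center is networked with the associated monitoring systems, so that the data center can obtain from them the original monitoring data they collect from the business systems and can perform unified, multi-level monitoring of all of that data. The original alarm level of the original monitoring data is standardized to determine a standard alarm level, and the to-be-processed monitoring data is filtered by standard alarm level, which keeps the filtering targeted. Source detection is performed on the to-be-processed monitoring data to determine the target source, which keeps the subsequent automated monitoring targeted. Effective monitoring data is filtered by the validity detection result, which keeps the subsequent automated monitoring targeted and timely. Formatting based on the effective monitoring data and the corresponding target source yields target monitoring data in a unified format, which makes the subsequent automated monitoring feasible. Finally, the formatted target monitoring data of any target source is monitored automatically to obtain the alarm monitoring result output by the automated monitoring program, which improves the efficiency with which the data center monitors the target monitoring data collected by at least one associated monitoring system.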
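As an example, after step S206, the data center-based operation and maintenance monitoring method further includes the following step: based on the alarm monitoring result, executing the target monitoring reminder mechanism corresponding to the alarm monitoring result.
In this example, multiple original monitoring reminder mechanisms are preset in the data center, each corresponding to a reminder condition. An original monitoring reminder mechanism is a preset mechanism for issuing reminders; for example, it can be set to remind operation and maintenance personnel by phone, by email, or by other means.
Specifically, after obtaining the alarm monitoring result of any target source, the data center determines which reminder condition the result satisfies, determines the original monitoring reminder mechanism corresponding to that condition as the target monitoring reminder mechanism, and executes it to send a reminder to the relevant operation and maintenance personnel. In this way all target monitoring data of the target source is responded to promptly, and response efficiency improves.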
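The selection of a target monitoring reminder mechanism can be sketched as a lookup over (mechanism, condition) pairs. The mechanism names, the `timeout_rate` key, and the thresholds below are hypothetical; the text only states that each preset mechanism has a reminder condition.

```python
# Hypothetical preset reminder mechanisms, ordered from most to least urgent.
# Each entry pairs a mechanism name with its reminder condition.
MECHANISMS = [
    ("phone", lambda result: result["timeout_rate"] >= 0.5),
    ("email", lambda result: result["timeout_rate"] >= 0.1),
]


def pick_reminder(result, mechanisms=MECHANISMS):
    """Return the first mechanism whose condition the result satisfies."""
    for name, condition in mechanisms:
        if condition(result):
            return name
    return None  # no reminder condition is met
```

Ordering the list by urgency is a design choice of this sketch: when several conditions hold, the most urgent mechanism wins.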
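As an example, after step S206, the data center-based operation and maintenance monitoring method further includes the following step: displaying the alarm monitoring result corresponding to the target monitoring data according to a preset display interface. For example, the alarm monitoring result can be displayed on a Web page, in periodic reports, or through an external interface. Accordingly, the client of the data center can provide page query, periodic report, and data interface functions, so that users can run ad hoc queries, periodic inspections, and checks through the client.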
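In an embodiment, as shown in FIG. 3, step S201, that is, acquiring the original monitoring data of a business system collected by an associated monitoring system, specifically includes the following steps:
S301: Monitor in real time the number of monitoring systems corresponding to the associated monitoring systems.
The number of monitoring systems is the number of associated monitoring systems networked with the data center. As an example, the data center can broadcast an HTTP request to all associated monitoring systems, count the HTTP responses to that request received within a preset time period, and determine the number of monitoring systems from the number of responses received.
S302: Create data collection processes corresponding to the number of monitoring systems, each data collection process corresponding to one associated monitoring system.
A data collection process is a process created on the data center's server for collecting data. In this example, each data collection process corresponds to one associated monitoring system and is dedicated to collecting that system's original monitoring data, which keeps data collection targeted. For example, a data collection process can be associated with the network address of an associated monitoring system, so that the process can communicate with the system at that address and obtain the original monitoring data it sends.
S303: Execute all data collection processes in parallel to obtain the original monitoring data of the business systems collected by each associated monitoring system within the current collection period.
The current collection period is the time span of the present data collection, specifically the interval from the time of the previous data collection to the current system time at which the data center generates the data collection instruction.
As an example, the data center executes all data collection processes in parallel, so that each process sends a data collection instruction carrying the current collection period to its associated monitoring system. On receiving the instruction, each associated monitoring system sends the data center all original monitoring data whose event occurrence time falls within the current collection period. The data center thus obtains the original monitoring data collected by each associated monitoring system within the current collection period, which keeps the collected data continuous in time. Moreover, executing all data collection processes in parallel collects, at the same time, the original monitoring data formed by multiple associated monitoring systems while monitoring their business systems. This keeps collection efficient and timely, and avoids the situation in which the data center collects data from some associated monitoring systems late and fault events are not monitored and handled in time.
In the data center-based operation and maintenance monitoring method provided by this embodiment, data collection processes are created according to the number of associated monitoring systems, so that each process obtains the original monitoring data collected by its own associated monitoring system, which keeps collection targeted. Executing all data collection processes in parallel collects the original monitoring data of multiple associated monitoring systems at the same time, which keeps collection efficient and timely. Each data collection process collects the original monitoring data formed by its associated monitoring system within the current collection period, which keeps the collected data continuous in time.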
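Steps S302 and S303 can be sketched with one worker per associated monitoring system. This sketch uses threads from Python's standard `concurrent.futures` module rather than separate operating-system processes, and it takes the transport as an injected `fetch` callable; both choices, the `event_time` field name, and the half-open period `(period_start, period_end]` are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_all(systems, fetch, period_start, period_end):
    """Collect records from every associated monitoring system in parallel.

    `systems` is a list of hashable identifiers (for example network addresses);
    `fetch(system, start, end)` is a caller-supplied function returning records.
    """
    def worker(system):
        records = fetch(system, period_start, period_end)
        # Keep only events whose occurrence time is in the current period.
        return [r for r in records
                if period_start < r["event_time"] <= period_end]

    # One worker per system, mirroring "one collection process per system".
    with ThreadPoolExecutor(max_workers=max(1, len(systems))) as pool:
        results = pool.map(worker, systems)  # preserves the order of `systems`
        return dict(zip(systems, results))
```

Using the previous collection time as the exclusive lower bound means consecutive periods neither overlap nor leave gaps, which matches the continuity-in-time goal stated above.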
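In an embodiment, as shown in FIG. 4, step S203, that is, performing source detection on the to-be-processed monitoring data to obtain the target source corresponding to the to-be-processed monitoring data, specifically includes the following steps:
S401: Use a keyword recognition algorithm to examine the to-be-processed monitoring data and determine whether it contains a source key field.
A keyword recognition algorithm is an algorithm for recognizing whether a piece of text contains specific keywords; it includes but is not limited to regular expression matching, string extraction, and hybrid matching algorithms. A source key field is a field that directly reflects the data source; for example, the source field or a system attribute field can be preset as source key fields.
S402: If the to-be-processed monitoring data contains a source key field, determine the field content corresponding to the source key field as the target source corresponding to the to-be-processed monitoring data.
As an example, when the data center examines the to-be-processed monitoring data with the keyword recognition algorithm and determines that its data content contains a source key field such as the source field or a system attribute field, it takes the content of that field as the target source, which improves the efficiency and accuracy of determining the target source.
S403: If the to-be-processed monitoring data does not contain a source key field, use the keyword recognition algorithm to examine the to-be-processed monitoring data and determine whether it contains an associated key field.
An associated key field is a field that does not directly reflect the data source but is related to it, for example, the eventName field or the department field.
As an example, when the data center determines with the keyword recognition algorithm that the data content contains no source key field, it applies the keyword recognition algorithm again to check whether the data contains an associated key field, and proceeds according to the result to determine the target source corresponding to the to-be-processed monitoring data.
S404: If the to-be-processed monitoring data contains an associated key field, process the field content corresponding to the associated key field to obtain the target source corresponding to the to-be-processed monitoring data.
As an example, step S404 specifically includes the following step: processing the content of the associated key field with the processing logic that corresponds to that field, to determine the target source corresponding to the to-be-processed monitoring data. The processing logic corresponding to an associated key field is the logic used to derive the target source from the field's content. For example, the field content can be split, merged, or otherwise processed so that the target source is determined quickly and accurately from the associated key field.
As another example, step S404 specifically includes the following steps: splitting the content of the associated key field and extracting association information from it; then querying an association mapping table with the association information, and determining the data source that the table maps to that information as the target source corresponding to the to-be-processed monitoring data. Splitting and extracting here means splitting the field content and extracting the association information related to the data source. The association mapping table is a preset data table that stores association information and the corresponding data sources. In this example, a table lookup on the association information extracted from the associated key field quickly determines the target source.
As another example, after step S403, if the to-be-processed monitoring data contains no associated key field, other processing logic for obtaining the target source can be executed (such as the logic of steps S501-S502), or the data can be routed to a manual handling mechanism in which operation and maintenance personnel label its target source by hand for subsequent processing. The other processing logic can include calling a data source interface to determine the monitoring system information of the associated monitoring system that transmitted the data, and querying a preset source mapping table with that information to determine the target source corresponding to the to-be-processed monitoring data.
In the data center-based operation and maintenance monitoring method provided by this embodiment, the keyword recognition algorithm is applied in turn to check whether the to-be-processed monitoring data contains a source key field and then an associated key field, and the target source is determined from the content of whichever field is found, which keeps the determination accurate and efficient.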
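The two-stage lookup of steps S401 to S404 can be sketched as follows, assuming each record has already been parsed into a dictionary. The field names `source`, `eventName`, and `department` come from the text; the association mapping table contents and the splitting rule are hypothetical.

```python
import re

SOURCE_KEYS = ("source",)                      # source key fields
ASSOCIATED_KEYS = ("eventName", "department")  # associated key fields
# Hypothetical association mapping table: association info -> data source.
ASSOCIATION_TABLE = {"pay": "Company A", "dba": "O&M group B"}


def detect_source(record):
    """Return the target source, or None if manual labelling is needed."""
    # S401/S402: a source key field gives the target source directly.
    for key in SOURCE_KEYS:
        if record.get(key):
            return record[key]
    # S403/S404: split an associated key field and look up each token.
    for key in ASSOCIATED_KEYS:
        value = record.get(key)
        if not value:
            continue
        for token in re.split(r"[-_/\s]+", value):
            if token.lower() in ASSOCIATION_TABLE:
                return ASSOCIATION_TABLE[token.lower()]
    return None
```

A `None` result corresponds to the fallback described above: another detection logic or manual labelling by operation and maintenance personnel.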
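In an embodiment, as shown in FIG. 5, step S203, that is, performing source detection on the to-be-processed monitoring data to obtain the target source corresponding to the to-be-processed monitoring data, specifically includes the following steps:
S501: Use a preset source recognition model to perform source detection on the to-be-processed monitoring data, obtaining at least one recognized source and the recognition probability corresponding to each recognized source.
The preset source recognition model is a pre-trained model for recognizing the data source of monitoring data. A recognized source is a data source identified by the model for the to-be-processed monitoring data. The recognition probability of a recognized source is the probability, as determined by the model, that the data belongs to that data source.
As an example, before step S201, the data center-based operation and maintenance monitoring method further includes training the preset source recognition model, which specifically includes the following steps:
(1) Obtain model training samples from historical monitoring data, and divide the model training samples into a training set and a test set. Historical monitoring data is monitoring data collected before step S201 is executed, that is, before the current system time at which the original monitoring data is acquired. Model training samples are samples for model training formed by labeling the historical monitoring data with the corresponding data sources in advance.
(2) Train a neural network model with the model training samples in the training set and update the model parameters of the neural network model to obtain an original source recognition model. As an example, the training samples can be fed into a CNN, an RNN, or another neural network model for training, updating the model parameters to form the original source recognition model.
(3) Test the original source recognition model with the model training samples in the test set to obtain a model test result; if the model test result meets a preset standard, determine the original source recognition model as the preset source recognition model. The model test result is the test accuracy on the test set. For example, each sample in the test set is fed into the original source recognition model to obtain a recognition result; if the result matches the data source labeled on the sample, the recognition is counted as accurate, otherwise as inaccurate. The test accuracy is determined from the number of accurately recognized samples and the total number of samples in the test set. The preset standard is a preset threshold for deciding whether the accuracy of the original source recognition model is high enough; for example, it can be set to 90%.
S502: Compare the maximum recognition probability with a preset probability threshold; if the maximum recognition probability is greater than the preset probability threshold, determine the recognized source corresponding to the maximum recognition probability as the target source corresponding to the to-be-processed monitoring data.
The maximum recognition probability is the largest of the recognition probabilities of the recognized sources output by the preset source recognition model. Generally, the larger the recognition probability, the more likely the recognized source is the target source of the to-be-processed monitoring data. The preset probability threshold is a preset threshold for deciding whether a recognition probability is high enough for the recognized source to be taken as the target source.
As an example, the data center can use the pre-trained preset source recognition model to perform source detection on the to-be-processed monitoring data and obtain the detection result output by the model, which includes at least one recognized source and the recognition probability of each. The recognition probabilities of all recognized sources are then sorted, and the maximum recognition probability is compared with the preset probability threshold. If the maximum recognition probability is greater than the threshold, the corresponding recognized source is determined as the target source. If it is not, other processing logic for obtaining the target source can be executed, or the data can be routed to a manual handling mechanism in which operation and maintenance personnel label its target source by hand for subsequent processing.
The data center-based operation and maintenance monitoring method provided by this embodiment uses a pre-trained preset source recognition model to perform source detection on the to-be-processed monitoring data, so that the target source is determined quickly and effectively from the data content, which keeps the determination accurate and efficient.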
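The threshold decision of step S502 and the test-accuracy check of training step (3) can be sketched without committing to any particular model. The function names, the default threshold of 0.8, and the representation of the model output as a dictionary of probabilities are assumptions; only the 90% accuracy example comes from the text.

```python
def choose_source(probabilities, threshold=0.8):
    """Pick the most probable recognized source if it exceeds the threshold.

    `probabilities` maps each recognized source to its recognition probability.
    Returns None when the data should go to another logic or manual labelling.
    """
    if not probabilities:
        return None
    source, prob = max(probabilities.items(), key=lambda item: item[1])
    return source if prob > threshold else None


def accuracy_on_test_set(predict, labelled_samples):
    """Share of test samples whose predicted source matches the label."""
    hits = sum(1 for data, label in labelled_samples if predict(data) == label)
    return hits / len(labelled_samples)


def meets_preset_standard(accuracy, standard=0.9):
    """Accept the model when test accuracy reaches the preset standard."""
    return accuracy >= standard
```

Note that the comparison in `choose_source` is strict, following the "greater than" wording of step S502.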
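In an embodiment, as shown in FIG. 6, step S204, that is, performing validity detection on the to-be-processed monitoring data to obtain a validity detection result, specifically includes the following steps:
S601: Query the legal source mapping table based on the target source of the to-be-processed monitoring data to obtain a legality check result.
The legal source mapping table is a preset data table for evaluating whether a data source is legal. The legality check result is the result of checking the legality of the target source of the to-be-processed monitoring data; it is either source legal or source illegal. Source legal means that the target source of the data is a legal data source that the data center needs to monitor; source illegal means that the target source is an illegal data source that the data center does not need to monitor.
As an example, the legal source mapping table can store the source information of all legal sources. The table is queried with the target source identified for the to-be-processed monitoring data; if the target source is in the table, the legality check result is source legal; if it is not, the legality check result is source illegal.
As another example, the legal source mapping table can store multiple data sources together with a source attribute for each, the source attribute being either source legal or source illegal. The table is queried with the target source identified for the to-be-processed monitoring data, and the legality check result is determined from the source attribute recorded for that target source.
S602: Query the alarm status information table based on the to-be-processed monitoring data to obtain an alarm status result.
The alarm status information table is a preset data table for evaluating whether an alarm event in the monitoring data is still valid. The alarm status result indicates, according to that table, whether a given piece of to-be-processed monitoring data is still in a valid state.
As an example, there is a time gap between the moment an associated monitoring system collects original monitoring data and the moment the data center obtains that data from it, and the state of the to-be-processed monitoring data may change within this gap. Therefore, after obtaining the to-be-processed monitoring data, the data center queries the alarm status information table based on the monitoring log recorded in the data to determine whether that log still corresponds to a fault event that needs to be monitored, and obtains the alarm status result accordingly. If the lookup shows a fault event that still needs to be monitored, the alarm status result is the valid state; if it shows a fault event that no longer needs to be monitored, the alarm status result is the invalid state.
S603: If the legality check result is source legal and the alarm status result is the valid state, obtain a validity detection result of alarm valid.
As an example, during validity detection, if the data center determines that the legality check result is source legal and the alarm status result is the valid state, then the target source of the to-be-processed monitoring data is a legal data source that the data center needs to monitor, and the corresponding fault event is one that the data center needs to keep monitoring. In this case the validity detection result is alarm valid, and the data continues to be processed.
S604: If the legality check result is source illegal, or the alarm status result is the invalid state, obtain a validity detection result of alarm invalid.
As an example, during validity detection, if the data center determines that the legality check result is source illegal, or that the alarm status result is the invalid state, then either the target source of the to-be-processed monitoring data is not a legal data source that the data center needs to monitor, or the corresponding fault event is not one that the data center needs to keep monitoring. In this case the validity detection result is alarm invalid, and the data center does not need to continue monitoring and analyzing the data.
In the data center-based operation and maintenance monitoring method provided by this embodiment, the legal source mapping table and the alarm status information table are each queried based on the to-be-processed monitoring data to obtain the legality check result and the alarm status result. The validity detection result is alarm valid when the legality check result is source legal and the alarm status result is the valid state, and alarm invalid otherwise, which keeps the validity detection result accurate and timely.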
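The combination rule of steps S601 to S604 reduces to a logical AND of two table lookups. In this sketch both tables are in-memory stand-ins with hypothetical contents, and the alarm status table is keyed by a hypothetical `event_id` field, whereas the text describes a lookup based on the monitoring log.

```python
# Hypothetical legal source mapping table (set of legal sources).
LEGAL_SOURCES = {"Company A", "O&M group B"}
# Hypothetical alarm status information table: event id -> current status.
ALARM_STATUS = {"evt-1": "active", "evt-2": "cleared"}


def is_alarm_valid(record):
    """True only if the source is legal AND the alarm is still active."""
    source_legal = record.get("target_source") in LEGAL_SOURCES       # S601
    still_active = ALARM_STATUS.get(record.get("event_id")) == "active"  # S602
    return source_legal and still_active                              # S603/S604


def select_effective(pending_records):
    """Keep only the to-be-processed records whose alarm is valid."""
    return [r for r in pending_records if is_alarm_valid(r)]
```

An event missing from the status table is treated as invalid here; that default is an assumption of the sketch.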
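In an embodiment, as shown in FIG. 7, step S205, that is, formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data, specifically includes the following steps:
S701: Perform encoding formatting on the effective monitoring data to obtain standard monitoring data.
Encoding formatting of the effective monitoring data means converting all effective monitoring data to one unified encoding. Standard monitoring data is the monitoring data obtained after encoding formatting.
The data center can convert the effective monitoring data uploaded by different associated monitoring systems to one target encoding format, for example UTF-8, which makes subsequent monitoring and analysis feasible and improves its efficiency and accuracy. As an example, the data center identifies the current encoding format of the effective monitoring data. If the current encoding format is already the target encoding format, UTF-8, no conversion is needed. If the current encoding format is not UTF-8 but, for example, the Unicode encoding format, the data is converted according to the format conversion rule between the current and target encoding formats, yielding standard monitoring data that matches the target encoding format. This keeps the encoding of the standard monitoring data consistent, so that subsequent automated monitoring and analysis is feasible.
S702: Use a keyword recognition algorithm to examine the standard monitoring data and determine whether it contains a core format field.
A core format field is a field whose content must follow a specific format.
As an example, after obtaining the standard monitoring data, the data center can use the keyword recognition algorithm to examine its data content and determine whether any of its fields is a core format field, and then decide from the result whether further format conversion is needed, which keeps format conversion targeted.
S703: If the standard monitoring data contains a core format field, use the format conversion script corresponding to the core format field to convert the format of the original field content of that field and obtain target field content, replace the original field content with the target field content, and add a source field and the corresponding target source to obtain the target monitoring data.
The format conversion script corresponding to a core format field is a preset script that converts the format of that field's content. As an example, if the standard monitoring data contains the core format field eventName (event name), the corresponding format conversion script converts the original field content into target field content in the specific format "ObjName (project name)-ObjType (project type)-Source (source)-Desc (team)", and the original field content is then replaced with the target field content.
As an example, if the standard monitoring data contains the core format field tags.department (tag department), the corresponding format conversion script checks the character length of the original field content. If the character length meets the standard length, no format conversion is needed. If it does not, new content is assembled according to established rules and added or substituted to obtain target field content whose character length meets the standard length, and the original field content is then replaced with the target field content.
As an example, if the standard monitoring data contains the core format field duration (alarm duration), the corresponding format conversion script converts the time unit of the original field content to one unified time unit, so that time-related monitoring data can be monitored automatically later, which improves monitoring efficiency.
In this example, when the standard monitoring data contains a core format field, the format conversion script converts the original field content of that field, and after the original field content has been replaced with the target field content, a source field and the corresponding target source are added to the standard monitoring data. The resulting target monitoring data therefore uses a unified format, which makes subsequent automated monitoring and analysis feasible, and each piece of target monitoring data carries its target source, so that all target monitoring data can later be monitored and analyzed automatically by target source, which keeps the monitoring effective.
S704: If the standard monitoring data does not contain a core format field, add a source field and the corresponding target source to obtain the target monitoring data.
In this example, when the standard monitoring data contains no core format field, no format conversion of field content is needed. A source field and the corresponding target source are added directly to the standard monitoring data, so that each piece of target monitoring data carries its target source and all target monitoring data can later be monitored and analyzed automatically by target source, which keeps the monitoring effective.
In the data center-based operation and maintenance monitoring method provided by this embodiment, encoding formatting of the effective monitoring data makes the resulting standard monitoring data recognizable, so that the data center's automated monitoring program can read its data content accurately. When the standard monitoring data contains a core format field, the format conversion script converts the field content and the target field content replaces the original field content, so that the content of core format fields uses a unified format and subsequent automated monitoring and analysis is feasible. A source field and the corresponding target source are added to the standard monitoring data, so that each piece of target monitoring data carries its target source and all target monitoring data can later be monitored and analyzed automatically by target source, which keeps the monitoring effective.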
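Steps S701 to S704 can be sketched as an encoding step followed by a per-record formatting step. Only the `duration` core format field is handled here; representing a duration as a dictionary with `value` and `unit`, the unit labels, and the choice of minutes as the unified unit are assumptions, and the `eventName` and `tags.department` conversions are omitted because the text does not specify their rules in detail.

```python
# Assumed unit labels and conversion factors to minutes.
UNIT_TO_MINUTES = {"s": 1 / 60, "min": 1, "h": 60}


def transcode_to_utf8(payload: bytes, current_encoding: str) -> bytes:
    """S701: re-encode raw bytes as UTF-8; UTF-8 input passes through."""
    if current_encoding.lower().replace("-", "") == "utf8":
        return payload
    return payload.decode(current_encoding).encode("utf-8")


def format_record(record: dict, target_source: str) -> dict:
    """S702-S704: convert core format fields, then add the source field."""
    formatted = dict(record)  # leave the caller's record untouched
    duration = formatted.get("duration")
    if isinstance(duration, dict):  # e.g. {"value": 2, "unit": "h"}
        # Replace the original field content with the converted content.
        formatted["duration"] = duration["value"] * UNIT_TO_MINUTES[duration["unit"]]
    formatted["source"] = target_source
    return formatted
```

For instance, `format_record({"duration": {"value": 2, "unit": "h"}}, "Company A")` yields a record whose `duration` is 120 (minutes) and whose `source` is "Company A".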
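In an embodiment, as shown in FIG. 8, step S206, that is, performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result, specifically includes the following steps:
S801: Start an automated monitoring task, the automated monitoring task including a monitoring period and a target monitoring indicator.
An automated monitoring task is a preset computer-executable task for automatic monitoring. The monitoring period is the time interval of the monitoring data, collected from the business systems, that the data center needs to monitor. The target monitoring indicator is the indicator that the automated monitoring task needs to monitor, including but not limited to alarm handling rate, alarm timeout rate, alarm handling timeliness, and user usage rate.
S802: Divide all target monitoring data whose event occurrence time is within the monitoring period according to the target source, and determine the to-be-analyzed monitoring data corresponding to each target source.
As an example, the data center can compare the event occurrence time of the target monitoring data with the monitoring period to determine whether the event occurrence time falls within the period, and thereby select all target monitoring data whose event occurrence time is within the monitoring period as the data to be monitored automatically. This data is then divided according to the target source recorded in each piece of target monitoring data, so that all target monitoring data is divided into as many to-be-monitored data sets as there are target sources. Each data set corresponds to one target source and contains that target source's to-be-analyzed monitoring data, that is, all target monitoring data of that target source whose event occurrence time is within the monitoring period.
S803: Call the indicator calculation script corresponding to the target monitoring indicator, read from the to-be-analyzed monitoring data the variable factor values corresponding to the target variable factors, and calculate on the variable factor values with the indicator calculation logic to obtain the alarm monitoring result corresponding to each target source.
An indicator calculation script is a computer-executable script for calculating a target monitoring indicator; it is written in a programming language and executes the indicator calculation formula of that indicator. Generally, an indicator calculation formula includes indicator variable factors and indicator calculation logic: the indicator variable factors are the variables used to calculate the target monitoring indicator, and the indicator calculation logic is the mathematical operation applied to them. For example: alarm timeout rate = number of timed-out alarms / total number of alarms. Alarm handling rate = 1 - alarm timeout rate, where an alarm event is considered timed out if the fault lasts more than 30 minutes without being handled. Alarm handling timeliness = total alarm duration / total number of alarms, where the total alarm duration is the sum of the durations of all alarms counted in the total, and the unit is minutes. User usage rate = number of core URLs that reach a certain number of visits / total number of core URLs.
As an example, the data center can execute the indicator calculation script of the target monitoring indicator set in the automated monitoring task. Because each piece of to-be-analyzed monitoring data is formatted target monitoring data, the script can quickly read from it the variable factor values corresponding to the target variable factors in the indicator calculation formula, and then calculate on those values with the indicator calculation logic of the formula to obtain the alarm monitoring result of each target source. All target monitoring data of each target source is thus monitored automatically, which improves the efficiency of data monitoring and reduces its cost.
In the data center-based operation and maintenance monitoring method provided by this embodiment, the formatted target monitoring data all contains an event occurrence time and a target source, which makes it feasible to divide the to-be-analyzed monitoring data by event occurrence time and target source and to process all target monitoring data along the target source dimension. Calling the indicator calculation script of the target monitoring indicator automatically and quickly applies the indicator calculation logic to the variable factor values read from the to-be-analyzed monitoring data and yields the alarm monitoring result of each target source, so that all target monitoring data of each target source is monitored automatically, which improves the efficiency of data monitoring and reduces its cost.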
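Steps S802 and S803 can be sketched for the three alarm indicators whose formulas are given above. The record field names (`source`, `event_time`, `duration` in minutes) are assumptions, and treating every alarm whose duration exceeds 30 minutes as timed out is a simplification of the "more than 30 minutes without being handled" rule.

```python
from collections import defaultdict

TIMEOUT_MINUTES = 30  # from the text: unhandled for more than 30 minutes


def alarm_metrics(records, period_start, period_end):
    """Group in-period records by target source and compute its indicators."""
    # S802: keep in-period records and divide them by target source.
    by_source = defaultdict(list)
    for r in records:
        if period_start <= r["event_time"] <= period_end:
            by_source[r["source"]].append(r)

    # S803: apply the indicator formulas to each source's records.
    results = {}
    for source, items in by_source.items():
        total = len(items)
        timed_out = sum(1 for r in items if r["duration"] > TIMEOUT_MINUTES)
        timeout_rate = timed_out / total
        results[source] = {
            "timeout_rate": timeout_rate,
            "handling_rate": 1 - timeout_rate,
            "handling_timeliness": sum(r["duration"] for r in items) / total,
        }
    return results
```

The user usage rate is left out of this sketch because it depends on URL visit counts rather than on alarm records.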
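It should be understood that the step numbers in the above embodiments do not imply an order of execution. The order in which the processes are executed is determined by their functions and internal logic, and the numbering does not limit the implementation of the embodiments of this application in any way.
In an embodiment, a data center-based operation and maintenance monitoring apparatus is provided, which corresponds one-to-one to the data center-based operation and maintenance monitoring method in the above embodiments. As shown in FIG. 9, the apparatus includes an original monitoring data acquisition module 901, a to-be-processed monitoring data acquisition module 902, a target source acquisition module 903, an effective monitoring data acquisition module 904, a target monitoring data acquisition module 905, and an alarm monitoring result acquisition module 906. The functional modules are described in detail as follows:
The original monitoring data acquisition module 901 is configured to acquire original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level.
The to-be-processed monitoring data acquisition module 902 is configured to standardize the original alarm level to obtain a standard alarm level, and determine the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data.
The target source acquisition module 903 is configured to perform source detection on the to-be-processed monitoring data to obtain the target source corresponding to the to-be-processed monitoring data.
The effective monitoring data acquisition module 904 is configured to perform validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determine the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data.
The target monitoring data acquisition module 905 is configured to perform formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data.
The alarm monitoring result acquisition module 906 is configured to perform automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.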
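Preferably, the original monitoring data acquisition module includes a monitoring system number acquisition unit, a data collection process creation unit, and a collection process parallel execution unit.
The monitoring system number acquisition unit is configured to monitor in real time the number of monitoring systems corresponding to the associated monitoring systems.
The data collection process creation unit is configured to create data collection processes corresponding to the number of monitoring systems, each data collection process corresponding to one associated monitoring system.
The collection process parallel execution unit is configured to execute all data collection processes in parallel to obtain the original monitoring data of the business systems collected by each associated monitoring system within the current collection period.
Preferably, the target source acquisition module includes a first keyword recognition unit, a first source determination unit, a second keyword recognition unit, and a second source determination unit.
The first keyword recognition unit is configured to examine the to-be-processed monitoring data with a keyword recognition algorithm and determine whether the to-be-processed monitoring data contains a source key field.
The first source determination unit is configured to, if the to-be-processed monitoring data contains a source key field, determine the field content corresponding to the source key field as the target source corresponding to the to-be-processed monitoring data.
The second keyword recognition unit is configured to, if the to-be-processed monitoring data does not contain a source key field, examine the to-be-processed monitoring data with the keyword recognition algorithm and determine whether the to-be-processed monitoring data contains an associated key field.
The second source determination unit is configured to, if the to-be-processed monitoring data contains an associated key field, process the field content corresponding to the associated key field to obtain the target source corresponding to the to-be-processed monitoring data.
Preferably, the target source acquisition module includes a model recognition processing unit and a third source determination unit.
The model recognition processing unit is configured to perform source detection on the to-be-processed monitoring data with a preset source recognition model to obtain at least one recognized source and the recognition probability corresponding to each recognized source.
The third source determination unit is configured to compare the maximum recognition probability with a preset probability threshold and, if the maximum recognition probability is greater than the preset probability threshold, determine the recognized source corresponding to the maximum recognition probability as the target source corresponding to the to-be-processed monitoring data.
Preferably, the effective monitoring data acquisition module includes a legality check result acquisition unit, an alarm status result acquisition unit, a legal source result acquisition unit, and an illegal source result acquisition unit.
The legality check result acquisition unit is configured to query the legal source mapping table based on the target source of the to-be-processed monitoring data to obtain a legality check result.
The alarm status result acquisition unit is configured to query the alarm status information table based on the to-be-processed monitoring data to obtain an alarm status result.
The legal source result acquisition unit is configured to, if the legality check result is source legal and the alarm status result is the valid state, obtain a validity detection result of alarm valid.
The illegal source result acquisition unit is configured to, if the legality check result is source illegal or the alarm status result is the invalid state, obtain a validity detection result of alarm invalid.
Preferably, the target monitoring data acquisition module includes a standard monitoring data acquisition unit, a core format field judgment unit, a first target data acquisition unit, and a second target data acquisition unit.
The standard monitoring data acquisition unit is configured to perform encoding formatting on the effective monitoring data to obtain standard monitoring data.
The core format field judgment unit is configured to examine the standard monitoring data with a keyword recognition algorithm and determine whether the standard monitoring data contains a core format field.
The first target data acquisition unit is configured to, if the standard monitoring data contains a core format field, convert the format of the original field content corresponding to the core format field with the format conversion script corresponding to the core format field to obtain target field content, replace the original field content with the target field content, and add a source field and the corresponding target source to obtain the target monitoring data.
The second target data acquisition unit is configured to, if the standard monitoring data does not contain a core format field, add a source field and the corresponding target source to obtain the target monitoring data.
Preferably, the alarm monitoring result acquisition module includes a monitoring task start unit, a to-be-analyzed data determination unit, and an alarm monitoring processing unit.
The monitoring task start unit is configured to start an automated monitoring task, the automated monitoring task including a monitoring period and a target monitoring indicator.
The to-be-analyzed data determination unit is configured to divide all target monitoring data whose event occurrence time is within the monitoring period according to the target source, and determine the to-be-analyzed monitoring data corresponding to each target source.
The alarm monitoring processing unit is configured to call the indicator calculation script corresponding to the target monitoring indicator, read from the to-be-analyzed monitoring data the variable factor values corresponding to the target variable factors, and calculate on the variable factor values with the indicator calculation logic to obtain the alarm monitoring result corresponding to each target source.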
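For the specific limitations of the data center-based operation and maintenance monitoring apparatus, refer to the limitations of the data center-based operation and maintenance monitoring method above, which are not repeated here. Each module in the apparatus can be implemented wholly or partly in software, hardware, or a combination of the two. The modules can be embedded in, or independent of, the processor of a computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call them and perform the operations corresponding to each module.
In an embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 10. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device stores the data used or generated while executing the data center-based operation and maintenance monitoring method. The network interface of the computer device communicates with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement a data center-based operation and maintenance monitoring method.
In an embodiment, one or more readable storage media storing computer-readable instructions are provided. When executed by one or more processors, the computer-readable instructions cause the one or more processors to implement the data center-based operation and maintenance monitoring method in the above embodiments, such as S201-S206 shown in FIG. 2 or the steps shown in FIGS. 3 to 8, which are not repeated here to avoid repetition. Alternatively, when the processor executes the computer-readable instructions, the functions of each module/unit in the embodiment of the data center-based operation and maintenance monitoring apparatus are implemented, for example, the functions of the original monitoring data acquisition module 901, the to-be-processed monitoring data acquisition module 902, the target source acquisition module 903, the effective monitoring data acquisition module 904, the target monitoring data acquisition module 905, and the alarm monitoring result acquisition module 906 shown in FIG. 9, which are not repeated here to avoid repetition. The readable storage media in this embodiment include non-volatile readable storage media and volatile readable storage media.
In an embodiment, a computer-readable storage medium is provided, on which computer-readable instructions are stored. When executed by a processor, the computer-readable instructions implement the data center-based operation and maintenance monitoring method in the above embodiments, such as S201-S206 shown in FIG. 2 or the steps shown in FIGS. 3 to 8, which are not repeated here to avoid repetition. Alternatively, when executed by the processor, the computer-readable instructions implement the functions of each module/unit in the above embodiment of the data center-based operation and maintenance monitoring apparatus, for example, the functions of the original monitoring data acquisition module 901, the to-be-processed monitoring data acquisition module 902, the target source acquisition module 903, the effective monitoring data acquisition module 904, the target monitoring data acquisition module 905, and the alarm monitoring result acquisition module 906 shown in FIG. 9, which are not repeated here to avoid repetition.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions directing the relevant hardware. The computer-readable instructions can be stored in a non-volatile readable storage medium or in a volatile readable storage medium, and when executed they can include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
A person skilled in the art will clearly understand that, for convenience and brevity of description, the above division into functional units and modules is only an example. In practical applications, the above functions can be assigned to different functional units or modules as needed; that is, the internal structure of the apparatus can be divided into different functional units or modules to complete all or part of the functions described above.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the protection scope of this application.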

Claims (20)

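  1. A data center-based operation and maintenance monitoring method, comprising:
    acquiring original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level;
    standardizing the original alarm level to obtain a standard alarm level, and determining the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data;
    performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data;
    performing validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determining the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data;
    formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data;
    performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.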
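  2. The data center-based operation and maintenance monitoring method according to claim 1, wherein the acquiring original monitoring data of a business system collected by an associated monitoring system comprises:
    monitoring in real time the number of monitoring systems corresponding to the associated monitoring systems;
    creating data collection processes corresponding to the number of monitoring systems, each data collection process corresponding to one associated monitoring system;
    executing all the data collection processes in parallel to obtain the original monitoring data of the business system collected by each associated monitoring system within a current collection period.
  3. The data center-based operation and maintenance monitoring method according to claim 1, wherein the performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data comprises:
    examining the to-be-processed monitoring data with a keyword recognition algorithm to determine whether the to-be-processed monitoring data contains a source key field;
    if the to-be-processed monitoring data contains a source key field, determining the field content corresponding to the source key field as the target source corresponding to the to-be-processed monitoring data;
    if the to-be-processed monitoring data does not contain a source key field, examining the to-be-processed monitoring data with the keyword recognition algorithm to determine whether the to-be-processed monitoring data contains an associated key field;
    if the to-be-processed monitoring data contains an associated key field, processing the field content corresponding to the associated key field to obtain the target source corresponding to the to-be-processed monitoring data.
  4. The data center-based operation and maintenance monitoring method according to claim 1, wherein the performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data comprises:
    performing source detection on the to-be-processed monitoring data with a preset source recognition model to obtain at least one recognized source and a recognition probability corresponding to each recognized source;
    comparing the maximum recognition probability with a preset probability threshold, and if the maximum recognition probability is greater than the preset probability threshold, determining the recognized source corresponding to the maximum recognition probability as the target source corresponding to the to-be-processed monitoring data.
  5. The data center-based operation and maintenance monitoring method according to claim 1, wherein the performing validity detection on the to-be-processed monitoring data to obtain a validity detection result comprises:
    querying a legal source mapping table based on the target source of the to-be-processed monitoring data to obtain a legality check result;
    querying an alarm status information table based on the to-be-processed monitoring data to obtain an alarm status result;
    if the legality check result is source legal and the alarm status result is a valid state, obtaining a validity detection result of alarm valid;
    if the legality check result is source illegal or the alarm status result is an invalid state, obtaining a validity detection result of alarm invalid.
  6. The data center-based operation and maintenance monitoring method according to claim 1, wherein the formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data comprises:
    performing encoding formatting on the effective monitoring data to obtain standard monitoring data;
    examining the standard monitoring data with a keyword recognition algorithm to determine whether the standard monitoring data contains a core format field;
    if the standard monitoring data contains a core format field, converting the format of the original field content corresponding to the core format field with a format conversion script corresponding to the core format field to obtain target field content, replacing the original field content with the target field content, and adding a source field and the corresponding target source to obtain the target monitoring data;
    if the standard monitoring data does not contain a core format field, adding a source field and the corresponding target source to obtain the target monitoring data.
  7. The data center-based operation and maintenance monitoring method according to claim 1, wherein the performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result comprises:
    starting an automated monitoring task, the automated monitoring task including a monitoring period and a target monitoring indicator;
    dividing all the target monitoring data whose event occurrence time is within the monitoring period according to the target source, and determining to-be-analyzed monitoring data corresponding to each target source;
    calling an indicator calculation script corresponding to the target monitoring indicator, reading from the to-be-analyzed monitoring data the variable factor values corresponding to target variable factors, and calculating on the variable factor values with indicator calculation logic to obtain the alarm monitoring result corresponding to each target source.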
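  8. A data center-based operation and maintenance monitoring apparatus, comprising:
    an original monitoring data acquisition module, configured to acquire original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level;
    a to-be-processed monitoring data acquisition module, configured to standardize the original alarm level to obtain a standard alarm level, and determine the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data;
    a target source acquisition module, configured to perform source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data;
    an effective monitoring data acquisition module, configured to perform validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determine the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data;
    a target monitoring data acquisition module, configured to perform formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data;
    an alarm monitoring result acquisition module, configured to perform automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.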
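  9. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    acquiring original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level;
    standardizing the original alarm level to obtain a standard alarm level, and determining the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data;
    performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data;
    performing validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determining the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data;
    formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data;
    performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.
  10. The computer device according to claim 9, wherein the performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data comprises:
    examining the to-be-processed monitoring data with a keyword recognition algorithm to determine whether the to-be-processed monitoring data contains a source key field;
    if the to-be-processed monitoring data contains a source key field, determining the field content corresponding to the source key field as the target source corresponding to the to-be-processed monitoring data;
    if the to-be-processed monitoring data does not contain a source key field, examining the to-be-processed monitoring data with the keyword recognition algorithm to determine whether the to-be-processed monitoring data contains an associated key field;
    if the to-be-processed monitoring data contains an associated key field, processing the field content corresponding to the associated key field to obtain the target source corresponding to the to-be-processed monitoring data.
  11. The computer device according to claim 9, wherein the performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data comprises:
    performing source detection on the to-be-processed monitoring data with a preset source recognition model to obtain at least one recognized source and a recognition probability corresponding to each recognized source;
    comparing the maximum recognition probability with a preset probability threshold, and if the maximum recognition probability is greater than the preset probability threshold, determining the recognized source corresponding to the maximum recognition probability as the target source corresponding to the to-be-processed monitoring data.
  12. The computer device according to claim 9, wherein the performing validity detection on the to-be-processed monitoring data to obtain a validity detection result comprises:
    querying a legal source mapping table based on the target source of the to-be-processed monitoring data to obtain a legality check result;
    querying an alarm status information table based on the to-be-processed monitoring data to obtain an alarm status result;
    if the legality check result is source legal and the alarm status result is a valid state, obtaining a validity detection result of alarm valid;
    if the legality check result is source illegal or the alarm status result is an invalid state, obtaining a validity detection result of alarm invalid.
  13. The computer device according to claim 9, wherein the formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data comprises:
    performing encoding formatting on the effective monitoring data to obtain standard monitoring data;
    examining the standard monitoring data with a keyword recognition algorithm to determine whether the standard monitoring data contains a core format field;
    if the standard monitoring data contains a core format field, converting the format of the original field content corresponding to the core format field with a format conversion script corresponding to the core format field to obtain target field content, replacing the original field content with the target field content, and adding a source field and the corresponding target source to obtain the target monitoring data;
    if the standard monitoring data does not contain a core format field, adding a source field and the corresponding target source to obtain the target monitoring data.
  14. The computer device according to claim 9, wherein the performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result comprises:
    starting an automated monitoring task, the automated monitoring task including a monitoring period and a target monitoring indicator;
    dividing all the target monitoring data whose event occurrence time is within the monitoring period according to the target source, and determining to-be-analyzed monitoring data corresponding to each target source;
    calling an indicator calculation script corresponding to the target monitoring indicator, reading from the to-be-analyzed monitoring data the variable factor values corresponding to target variable factors, and calculating on the variable factor values with indicator calculation logic to obtain the alarm monitoring result corresponding to each target source.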
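  15. One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring original monitoring data of a business system collected by an associated monitoring system, each piece of original monitoring data including an original alarm level;
    standardizing the original alarm level to obtain a standard alarm level, and determining the original monitoring data whose standard alarm level is a target alarm level as to-be-processed monitoring data;
    performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data;
    performing validity detection on the to-be-processed monitoring data to obtain a validity detection result, and determining the to-be-processed monitoring data whose validity detection result is alarm valid as effective monitoring data;
    formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data;
    performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result.
  16. The readable storage medium according to claim 15, wherein the performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data comprises:
    examining the to-be-processed monitoring data with a keyword recognition algorithm to determine whether the to-be-processed monitoring data contains a source key field;
    if the to-be-processed monitoring data contains a source key field, determining the field content corresponding to the source key field as the target source corresponding to the to-be-processed monitoring data;
    if the to-be-processed monitoring data does not contain a source key field, examining the to-be-processed monitoring data with the keyword recognition algorithm to determine whether the to-be-processed monitoring data contains an associated key field;
    if the to-be-processed monitoring data contains an associated key field, processing the field content corresponding to the associated key field to obtain the target source corresponding to the to-be-processed monitoring data.
  17. The readable storage medium according to claim 15, wherein the performing source detection on the to-be-processed monitoring data to obtain a target source corresponding to the to-be-processed monitoring data comprises:
    performing source detection on the to-be-processed monitoring data with a preset source recognition model to obtain at least one recognized source and a recognition probability corresponding to each recognized source;
    comparing the maximum recognition probability with a preset probability threshold, and if the maximum recognition probability is greater than the preset probability threshold, determining the recognized source corresponding to the maximum recognition probability as the target source corresponding to the to-be-processed monitoring data.
  18. The readable storage medium according to claim 15, wherein the performing validity detection on the to-be-processed monitoring data to obtain a validity detection result comprises:
    querying a legal source mapping table based on the target source of the to-be-processed monitoring data to obtain a legality check result;
    querying an alarm status information table based on the to-be-processed monitoring data to obtain an alarm status result;
    if the legality check result is source legal and the alarm status result is a valid state, obtaining a validity detection result of alarm valid;
    if the legality check result is source illegal or the alarm status result is an invalid state, obtaining a validity detection result of alarm invalid.
  19. The readable storage medium according to claim 15, wherein the formatting based on the effective monitoring data and the target source corresponding to the effective monitoring data to obtain target monitoring data comprises:
    performing encoding formatting on the effective monitoring data to obtain standard monitoring data;
    examining the standard monitoring data with a keyword recognition algorithm to determine whether the standard monitoring data contains a core format field;
    if the standard monitoring data contains a core format field, converting the format of the original field content corresponding to the core format field with a format conversion script corresponding to the core format field to obtain target field content, replacing the original field content with the target field content, and adding a source field and the corresponding target source to obtain the target monitoring data;
    if the standard monitoring data does not contain a core format field, adding a source field and the corresponding target source to obtain the target monitoring data.
  20. The readable storage medium according to claim 15, wherein the performing automated monitoring on the target monitoring data based on the target source to obtain an alarm monitoring result comprises:
    starting an automated monitoring task, the automated monitoring task including a monitoring period and a target monitoring indicator;
    dividing all the target monitoring data whose event occurrence time is within the monitoring period according to the target source, and determining to-be-analyzed monitoring data corresponding to each target source;
    calling an indicator calculation script corresponding to the target monitoring indicator, reading from the to-be-analyzed monitoring data the variable factor values corresponding to target variable factors, and calculating on the variable factor values with indicator calculation logic to obtain the alarm monitoring result corresponding to each target source.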
PCT/CN2020/093315 2020-03-06 2020-05-29 Data center-based operation and maintenance monitoring method, apparatus, device, and storage medium WO2021174694A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010153280.5 2020-03-06
CN202010153280.5A CN111475370A (zh) 2020-03-06 2020-03-06 Data center-based operation and maintenance monitoring method, apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021174694A1 true WO2021174694A1 (zh) 2021-09-10

Family

ID=71748061

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/093315 WO2021174694A1 (zh) 2020-03-06 2020-05-29 Data center-based operation and maintenance monitoring method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN111475370A (zh)
WO (1) WO2021174694A1 (zh)


* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170132063A1 (en) * 2015-11-10 2017-05-11 China Construction Bank Corporation Information system fault scenario information collecting method and system
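CN107679713A (zh) * 2017-09-16 2018-02-09 广西电网有限责任公司电力科学研究院 Method for processing status alarms of power transmission and transformation equipment
CN108763038A (zh) * 2018-08-08 2018-11-06 平安科技(深圳)有限公司 Alarm data management method, apparatus, computer device, and storage medium
CN109218097A (zh) * 2018-09-19 2019-01-15 山东浪潮云投信息科技有限公司 Alarm system and alarm method with configurable alarm rules for a cloud platform
CN110505102A (zh) * 2019-09-11 2019-11-26 国网湖北省电力有限公司鄂州供电公司 Platform system and method for integrated monitoring and standardized service management of electric power information and communication
CN110620690A (zh) * 2019-09-19 2019-12-27 国网思极网安科技(北京)有限公司 Method for handling network attack events and electronic device therefor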

Also Published As

Publication number Publication date
CN111475370A (zh) 2020-07-31

Similar Documents

Publication Publication Date Title
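WO2021174694A1 (zh) Data-center-based operation and maintenance monitoring method, apparatus, device, and storage medium
CN111859384B (zh) Abnormal event monitoring method and apparatus, computer device, and storage medium
CN109274526B (zh) Automatic early-warning method and apparatus for test defects, computer device, and storage medium
CN111143163B (zh) Data monitoring method and apparatus, computer device, and storage medium
CN109344170B (zh) Stream data processing method and system, electronic device, and readable storage medium
WO2021218178A1 (zh) Automatic report generation method and apparatus, computer device, and storage medium
CN109684052B (zh) Transaction analysis method, apparatus, device, and storage medium
CN112631913B (zh) Application runtime fault monitoring method, apparatus, device, and storage medium
CN108509313B (zh) Service monitoring method, platform, and storage medium
CN109272215B (zh) Project development quality monitoring method and apparatus, computer device, and storage medium
CN109542764B (zh) Automated web page testing method and apparatus, computer device, and storage medium
CN112035437A (zh) Medical record data transmission method and apparatus, computer device, and storage medium
CN112527601A (zh) Monitoring and early-warning method and apparatus
CN110310127B (zh) Audio recording acquisition method and apparatus, computer device, and storage medium
CN117273429A (zh) Event monitoring method and system, electronic device, and storage medium
WO2020233021A1 (zh) Test result analysis method based on intelligent decision-making, and related apparatus
CN113190411A (zh) Data processing method and apparatus, electronic device, and storage medium
CN110990223A (zh) Monitoring and alarm method and apparatus based on system logs
CN115525392A (zh) Container monitoring method and apparatus, electronic device, and storage medium
CN115514618A (zh) Alarm event processing method and apparatus, electronic device, and medium
CN115309638A (zh) Method and apparatus for assisting model optimization
CN111274112B (zh) Application stress testing method and apparatus, computer device, and storage medium
CN112433909A (zh) Kafka-based method for processing real-time monitoring data
CN114428715A (zh) Log processing method, apparatus, system, and storage medium
CN109658052B (zh) Method and apparatus for obtaining supervision results, computer device, and storage medium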

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application
Ref document number: 20923078; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 EP: PCT application non-entry in European phase
Ref document number: 20923078; Country of ref document: EP; Kind code of ref document: A1