WO2022253054A1 - Fault handling method and apparatus, and server and storage medium

Info

Publication number
WO2022253054A1
Authority
WO
WIPO (PCT)
Prior art keywords
fault
atomic service
model
current
workflow
Prior art date
Application number
PCT/CN2022/094796
Other languages
French (fr)
Chinese (zh)
Inventor
蔡金玲
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2022253054A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring

Definitions

  • The embodiments of the present application relate to the field of fault handling, and in particular to a fault handling method, device, server, and storage medium.
  • The purpose of the embodiments of the present application is to provide a fault handling method, device, server, and storage medium.
  • An embodiment of the present application provides a fault handling method, including the following steps: according to the processing status of the current fault, selecting a preset atomic service model or creating an atomic service model, where the atomic service model provides at least one atomic service process, and the atomic service process performs one or any combination of the following on a fault: monitoring, analysis, correction, evaluation; creating a workflow according to the atomic service process in the atomic service model, where the workflow includes the processing chain of the atomic service process; generating the fault model to which the current fault belongs, where the fault model includes indication information of the workflow; and, according to the workflow determined by the indication information, calling the atomic service process in the atomic service model to handle the current fault.
  • Embodiments of the present application also provide a fault handling device, including:
  • An atomic service module, used to select a preset atomic service model or create an atomic service model according to the processing status of the current fault, where the atomic service model provides at least one type of atomic service process, and the atomic service process performs one or any combination of the following on a fault: monitoring, analysis, correction, evaluation; a workflow module, used to create a workflow according to the atomic service process in the atomic service model, where the workflow includes the processing chain of the atomic service process; a fault scenario module, used to generate the fault model to which the current fault belongs, where the fault model includes indication information of the workflow; and a processing module, used to call, according to the workflow determined by the indication information, the atomic service process in the atomic service model to handle the current fault.
  • An embodiment of the present application also provides a server, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the above fault handling method.
  • Embodiments of the present application also provide a computer-readable storage medium storing a computer program that implements the above fault handling method when executed by a processor.
  • FIG. 1 is a flowchart of a fault handling method provided according to an embodiment of the present application.
  • FIG. 2 shows a fault handling device provided according to an embodiment of the present application.
  • FIG. 3 shows a server provided according to an embodiment of the present application.
  • The terms “first” and “second” in the embodiments of the present application are used for description only, and cannot be understood as indicating or implying relative importance or implicitly indicating the quantity of the indicated technical features. Thus, features defined as “first” and “second” may explicitly or implicitly include at least one of these features.
  • The terms “including” and “having” and any variations thereof are intended to cover non-exclusive inclusion. For example, a system, product, or piece of equipment comprising a series of components or units is not limited to the listed components or units, but optionally also includes components or units not listed, or optionally also includes other components or units inherent in such products or equipment.
  • “Plurality” means at least two, such as two or three, unless otherwise specifically defined.
  • An embodiment of the present application relates to a fault handling method. The specific process is shown in FIG. 1.
  • Step 101: according to the processing status of the current fault, select a preset atomic service model or create an atomic service model. The atomic service model provides at least one type of atomic service process, and the atomic service process performs one or any combination of the following on a fault: monitoring, analysis, correction, evaluation.
  • Step 102: create a workflow according to the atomic service process in the atomic service model. The workflow includes a processing chain of the atomic service process.
  • Step 103: generate the fault model to which the current fault belongs. The fault model includes indication information of the workflow.
  • Step 104: according to the workflow determined by the indication information, call the atomic service process in the atomic service model to handle the current fault.
  • The closed-loop fault resolution process is first analyzed and split into process types such as fault identification and fault analysis; the workflow model is used to assemble and call the atomic service models according to the actual situation; and the fault types in the fault model, together with the corresponding workflows, allow fault scenarios to be customized, added, and resolved. This reduces the professional expertise required to customize fault scenarios, enables users to add or remove fault handling processes conveniently and flexibly, and improves the efficiency of fault handling.
  • In step 101, according to the processing status of the current fault, a preset atomic service model is selected or an atomic service model is created. The atomic service model provides at least one type of atomic service process, and the atomic service process performs one or any combination of the following on a fault: monitoring, analysis, correction, evaluation.
  • An atomic service model that can process the fault parameters of the current fault is searched for among the preset atomic service models. If no preset atomic service model covers the current fault parameters, an atomic service model needs to be created; if one exists, it is simply called.
  • Each atomic service step in the atomic service model can be classified into a type of atomic service process, such as fault monitoring, fault analysis, fault correction, or fault evaluation.
  • Fault monitoring process: for example, monitors the operation of the field network according to fault definition information, where the fault definition information can be obtained from the fault model or preset in the fault monitoring process.
  • Fault analysis process: for example, performs fault analysis on the monitored fault, obtains the cause of the fault, and gives suggestions for an automatic or manual solution.
  • Fault correction process: for example, corrects the fault according to the solution suggestions obtained from fault analysis.
  • Fault evaluation process: for example, uses the relevant fault indicator data to evaluate the effect of fault correction and confirms, in a closed loop, whether the fault has been recovered.
  • The main data characteristics of the atomic service model may include: the name of the atomic service (name), which briefly describes the function of the atomic service; the fault handling process (faultStep), such as fault monitoring, fault analysis, fault correction, or fault evaluation; and the invocation method (invokeMethod), which identifies and calls the program segment that implements the function of the current atomic service model, and may include information such as a uniform resource locator (URL) prefix describing how to invoke the functionality that implements this atomic service.
  • name: name of the atomic service
  • faultStep: process of fault handling
  • invokeMethod: invocation method
  • This step first confirms the atomic service model, either by selecting a preset model or by adding a new one, as the basis for handling the current fault.
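  • As an illustration of step 101, the following is a minimal Python sketch of an atomic service model record and the select-or-create decision. Only the field names name, faultStep, and invokeMethod come from the text; the class and function names, the matching rule (same name and same process type), and the URL prefixes are assumptions made for the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class FaultStep(Enum):
    """The four process types named in the text (faultStep)."""
    MONITORING = "monitoring"
    ANALYSIS = "analysis"
    CORRECTION = "correction"
    EVALUATION = "evaluation"


@dataclass(frozen=True)
class AtomicServiceModel:
    name: str              # brief description of the service's function
    fault_step: FaultStep  # faultStep: which process type the service belongs to
    invoke_method: str     # invokeMethod: how to call it, e.g. a URL prefix


def select_or_create(preset, name, fault_step, invoke_method):
    """Return the matching preset model, or create a new one if none matches.

    Matching on (name, fault_step) is an assumption made for this sketch.
    """
    for model in preset:
        if model.name == name and model.fault_step is fault_step:
            return model
    return AtomicServiceModel(name, fault_step, invoke_method)


# Hypothetical preset and lookups.
preset = [AtomicServiceModel("kpi-monitor", FaultStep.MONITORING, "/monitor/")]
reused = select_or_create(preset, "kpi-monitor", FaultStep.MONITORING, "/ignored/")
created = select_or_create(preset, "root-cause", FaultStep.ANALYSIS, "/analysis/")
```

  • Here reused is the preset object itself, while created is a new record that a caller could then store alongside the presets.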
  • In step 102, a workflow is created according to the atomic service process in the atomic service model. The workflow includes the processing chain of the atomic service process.
  • After the atomic service processes for handling the current fault are determined, the order in which they execute must be determined to complete the processing flow for the current fault, which makes the framework for fault handling more flexible. For example, the workflow needs to execute the fault evaluation process after the fault correction process, so that the validity of the processing result is fed back in a closed loop.
  • The main data characteristics of a workflow may include: a workflow name (workflowName), which is unique; and a processing chain of atomic service processes (atomicServiceList), which describes the execution chain of the workflow, that is, the data needed to invoke the corresponding atomic service functions.
  • This step establishes a workflow, which defines the execution sequence and execution logic of the atomic service processes, so that a solution tailored to the current fault can be carried out.
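  • The workflow described in step 102 could be sketched as follows. The fields workflowName and atomicServiceList come from the text; the build_workflow helper, the representation of the chain as (service name, process type) pairs, and the specific ordering check are assumptions used to illustrate the uniqueness and evaluation-after-correction requirements.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    workflow_name: str  # workflowName: must be unique
    # atomicServiceList: ordered (service name, fault step) pairs
    atomic_service_list: list = field(default_factory=list)


def build_workflow(name, services, existing_names):
    """Create a workflow after checking name uniqueness and step order."""
    if name in existing_names:
        raise ValueError(f"workflow name already in use: {name}")
    steps = [step for _, step in services]
    # Evaluation closes the loop, so it may not precede correction.
    if "correction" in steps and "evaluation" in steps:
        if steps.index("evaluation") < steps.index("correction"):
            raise ValueError("evaluation must run after correction")
    return Workflow(name, list(services))


# Hypothetical chain covering all four process types.
workflow = build_workflow(
    "dropped-call-closed-loop",
    [("kpi-monitor", "monitoring"), ("root-cause", "analysis"),
     ("auto-fix", "correction"), ("kpi-check", "evaluation")],
    existing_names=set(),
)
```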
  • In step 103, the fault model to which the current fault belongs is generated. The fault model includes indication information of the workflow, and the indication information may be the name of the workflow.
  • The main data characteristics of the fault model include: the name of the current fault; the fault type (faultSceneType), such as network fault, alarm, or transmission; fault indicator information (sceneDefinition), for example the observed indicators and threshold conditions; and the name of the workflow to be called (workflowName).
  • Generating the fault model to which the current fault belongs includes: if the fault parameters of the current fault are of the same type as the fault parameters of a first fault model among the preset fault models, copying the first fault model, and modifying the copied first fault model according to the fault parameters of the current fault to generate the fault model to which the current fault belongs; [...] fault model.
  • For example, the fault parameters in the first fault model among the preset fault models are a, b, c, and the fault parameters of the current fault are b, c, d, where a, b, c, and d are all parameters of network congestion. That is, although the current fault cannot be handled entirely according to the first fault model, the fault parameters are of the same type, so the fault model for the current fault can be obtained with a slight modification. The model does not need to be built from scratch, which reduces data processing and computation.
  • Preset categories of fault parameters exist in the system to support this step.
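  • The copy-and-modify generation of a fault model can be illustrated with a short Python sketch. The fields faultSceneType, sceneDefinition, and workflowName and the parameters a, b, c, d come from the text; the function name, the threshold strings, and the model names are placeholders. The sketch returns None when no preset model has the same type, because that branch is not spelled out in the text above.

```python
import copy
from dataclasses import dataclass


@dataclass
class FaultModel:
    name: str
    fault_scene_type: str   # faultSceneType, e.g. a network fault class
    scene_definition: dict  # sceneDefinition: indicator -> threshold condition
    workflow_name: str      # workflowName: indication information of the workflow


def derive_fault_model(preset_models, name, fault_scene_type, scene_definition):
    """Copy a preset model of the same type and adapt it to the current fault.

    Returns None when no preset model has the same type; that case is
    outside the scope of this sketch.
    """
    for model in preset_models:
        if model.fault_scene_type == fault_scene_type:
            derived = copy.deepcopy(model)  # leave the preset untouched
            derived.name = name
            derived.scene_definition = dict(scene_definition)
            return derived
    return None


# Parameters a, b, c and b, c, d follow the example in the text;
# the threshold strings and names are placeholders.
first = FaultModel("congestion-1", "network congestion",
                   {"a": "> 1", "b": "> 2", "c": "> 3"}, "congestion-workflow")
current = derive_fault_model([first], "congestion-2", "network congestion",
                             {"b": "> 2", "c": "> 3", "d": "> 4"})
```

  • The derived model keeps the workflow reference of the first model, which is what lets the existing processing chain be reused for the new fault.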
  • The fault model also includes an execution mode (executeMode), such as immediate execution, timed execution, or periodic execution, which lets the time of fault handling be customized so that fault handling is more effective or better meets the user's needs.
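  • How the three execution modes might translate into a next run time is sketched below. The text names only executeMode and the three modes; the mode strings, the next_run function, its parameters, and the dates are assumptions for illustration.

```python
from datetime import datetime, timedelta


def next_run(execute_mode, now, start_at=None, period=None):
    """Return when a fault scenario should next run for a given executeMode."""
    if execute_mode == "immediate":
        return now
    if execute_mode == "timed":
        return start_at
    if execute_mode == "periodic":
        if now <= start_at:
            return start_at
        # Number of whole periods needed to reach or pass `now` (ceiling).
        ticks = -((start_at - now) // period)
        return start_at + ticks * period
    raise ValueError(f"unknown execute mode: {execute_mode}")


# Hypothetical schedule: hourly runs starting at midnight.
start = datetime(2022, 5, 24, 0, 0)
now = datetime(2022, 5, 24, 2, 30)
upcoming = next_run("periodic", now, start_at=start, period=timedelta(hours=1))
```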
  • Creating an atomic service model includes: selecting a template of the atomic service model; and creating the atomic service model according to the fault parameters of the current fault and the selected template. That is, preset atomic service model templates exist. When an atomic service model needs to be created, a preset template is selected as the starting point, which ensures that the data needed for this execution process is obtained and that the newly created atomic service model can be put into use. Specifically, the atomic service model is created from the selected template in combination with the fault parameters of the current fault that need to be processed. For example, user instructions are received, and the data required by the template is adapted according to the fault parameters of the current fault.
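  • Template-based creation can be sketched as filling a preset record and checking that nothing required is left empty. The template contents, the faultParams key, and the function name are hypothetical; only name, faultStep, and invokeMethod are field names taken from the text.

```python
REQUIRED_FIELDS = ("name", "faultStep", "invokeMethod")

# Hypothetical preset templates, keyed by process type.
TEMPLATES = {
    "analysis": {"name": "", "faultStep": "analysis", "invokeMethod": "/analysis/"},
    "correction": {"name": "", "faultStep": "correction", "invokeMethod": "/correction/"},
}


def create_from_template(template_key, fault_params, **overrides):
    """Fill a preset template with values adapted to the current fault."""
    model = dict(TEMPLATES[template_key])  # copy so the template stays reusable
    model["faultParams"] = list(fault_params)
    model.update(overrides)
    missing = [f for f in REQUIRED_FIELDS if not model.get(f)]
    if missing:
        raise ValueError(f"template fields still empty: {missing}")
    return model


model = create_from_template("analysis", ["b", "c", "d"], name="congestion-root-cause")
```

  • The required-field check stands in for the guarantee that a model built from a template carries the data it needs before it is put into use.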
  • In step 104, according to the workflow determined by the indication information, the atomic service process in the atomic service model is invoked to handle the current fault. Specifically, because the fault model is generated for the current fault, it can be called to handle the current fault in a targeted manner: the indication information of the workflow in the fault model points to the corresponding workflow, and the atomic service processes listed in that workflow are called to handle the current fault.
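  • Step 104 amounts to a lookup followed by a chained call, which the following sketch illustrates. The keys workflowName and atomicServiceList come from the text; the in-process callables stand in for services that would really be reached through invokeMethod, and the service names, indicator values, and threshold are hypothetical.

```python
def handle_fault(fault_model, workflows, services, context):
    """Resolve the workflow named by the fault model and run its chain."""
    workflow = workflows[fault_model["workflowName"]]  # indication information
    for service_name in workflow["atomicServiceList"]:
        context = services[service_name](context)  # each step feeds the next
    return context


# Hypothetical atomic services; real ones would be reached via invokeMethod.
services = {
    "monitor": lambda ctx: {**ctx, "detected": ctx["drop_rate"] > ctx["threshold"]},
    "analyze": lambda ctx: {**ctx, "cause": "congestion" if ctx["detected"] else None},
    "correct": lambda ctx: {**ctx, "drop_rate": 0.01 if ctx["cause"] else ctx["drop_rate"]},
    "evaluate": lambda ctx: {**ctx, "recovered": ctx["drop_rate"] <= ctx["threshold"]},
}
workflows = {"closed-loop": {"atomicServiceList": ["monitor", "analyze", "correct", "evaluate"]}}
fault_model = {"name": "dropped calls", "workflowName": "closed-loop"}

result = handle_fault(fault_model, workflows, services,
                      {"drop_rate": 0.05, "threshold": 0.02})
```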
  • A data model (data) is used to define the input and output data types of each execution process when atomic service models and workflows are created, so as to ensure that output results are valid for users. Valid output is, for example, visible to the user or usable for analysis, rather than a series of useless code words that cannot be processed or recognized. That is, the range of data types is limited during execution of the method, and only the corresponding data types are responded to, so that the method can be carried out smoothly.
  • The main data characteristics of the data model can include: the input data form (inputData), for example, the input of the fault overview step can be the entire network, a group, or a list of designated network elements or cells; and the output data form (outputData), for example, the output data of fault analysis includes a list of poor-quality cells, the reasons for poor quality, and suggested operations.
  • inputData: input data form
  • outputData: output data form
  • After the atomic service process in the atomic service model is invoked according to the workflow determined by the indication information to handle the current fault, the method further includes: determining the format of the corresponding output data according to the atomic service process; and outputting the result of processing the current fault according to that format.
  • The output data formats of different atomic service processes are different.
  • The main data characteristics of the data model (data) can also include the atomic service process (faultStep). The data types differ between atomic service processes, so the results of each process can be told apart, which makes the execution process clearer and more stable and reduces errors due to data failure.
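  • A data model keyed by faultStep might constrain output as in the sketch below. The keys inputData, outputData, and faultStep come from the text; the concrete field names, the format_output function, and the sample values are placeholders chosen for illustration.

```python
# Hypothetical data models keyed by faultStep; field names are placeholders.
DATA_MODELS = {
    "analysis": {
        "inputData": {"scope"},
        "outputData": {"poor_quality_cells", "reasons", "suggestions"},
    },
    "evaluation": {
        "inputData": {"indicators"},
        "outputData": {"recovered"},
    },
}


def format_output(fault_step, raw_output):
    """Keep only the fields declared for this step and reject incomplete output."""
    expected = DATA_MODELS[fault_step]["outputData"]
    missing = expected - set(raw_output)
    if missing:
        raise ValueError(f"{fault_step} output is missing: {sorted(missing)}")
    return {key: raw_output[key] for key in sorted(expected)}


report = format_output("analysis", {
    "poor_quality_cells": ["cell-17"],
    "reasons": ["interference"],
    "suggestions": ["adjust antenna tilt"],
    "debug_blob": b"\x00\x01",  # not declared, so it is dropped
})
```

  • Dropping undeclared fields is one way to keep unrecognizable data out of what the user sees; a stricter variant could reject such output instead.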
  • The above steps are divided into units:
  • An interface management unit, which provides users with a visual entry point for operation and viewing.
  • A model management unit, which adds, deletes, modifies, and queries the above models according to user operations and stores them accordingly.
  • A data storage unit, which uses a relational database or disk files to store the execution result files of each fault scenario.
  • An application running unit: after the user triggers execution of a scenario through the interface, or execution is triggered periodically, this unit first reads the corresponding workflow data from the model management unit and then executes the atomic service processes configured in the workflow data in sequence, enabling the identification, analysis, correction, and/or evaluation of faults.
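  • The interplay between the model management unit, the application running unit, and the data storage unit can be sketched as below. The class, method, and key names other than workflowName and atomicServiceList are hypothetical, and an in-memory dictionary stands in for the relational database or disk files mentioned in the text.

```python
class ModelManagementUnit:
    """In-memory stand-in for the model management unit (add/delete/modify/query)."""

    def __init__(self):
        self._models = {}

    def add(self, kind, name, data):
        if (kind, name) in self._models:
            raise KeyError(f"{kind} '{name}' already exists")
        self._models[(kind, name)] = dict(data)

    def delete(self, kind, name):
        del self._models[(kind, name)]

    def modify(self, kind, name, **changes):
        self._models[(kind, name)].update(changes)

    def query(self, kind, name):
        return dict(self._models[(kind, name)])  # copy so callers cannot mutate storage


def run_scenario(models, scenario_name, services, results):
    """Application running unit: read the workflow, run it, store the results."""
    scenario = models.query("fault", scenario_name)
    workflow = models.query("workflow", scenario["workflowName"])
    outputs = [services[name]() for name in workflow["atomicServiceList"]]
    results[scenario_name] = outputs  # stand-in for the data storage unit
    return outputs


models = ModelManagementUnit()
models.add("workflow", "closed-loop", {"atomicServiceList": ["monitor", "evaluate"]})
models.add("fault", "dropped calls", {"workflowName": "closed-loop"})
results = {}
outputs = run_scenario(models, "dropped calls",
                       {"monitor": lambda: "monitored", "evaluate": lambda: "evaluated"},
                       results)
```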
  • Users can conveniently modify, through the interface, the fault scenario solutions automatically generated by the system; for example, the execution mode of a newly added fault scenario solution or the threshold definition of the fault scenario can be modified.
  • E-RAB: Evolved Radio Access Bearer
  • The scenario model information includes the scenario name, the indicator formula required by the scenario, and fault type information. At the same time, the start time of the new fault scenario is set; the scenario can be executed immediately or at a scheduled time. After the scenario is added successfully, the fault can be handled in a closed loop according to the new fault handling process.
  • For example, the fault model for "wireless disconnection rate" in the dropped-call category has been preset, and the "Flow disconnection rate" indicator in the field, which also belongs to the dropped-call category, suddenly continues to deteriorate; both the "wireless disconnection rate" and "Flow disconnection rate" scenarios belong to the dropped-call category.
  • The system finds that the existing fault model, workflow model, atomic model, and data model for "wireless disconnection rate" can support closed-loop handling of the newly added "Flow disconnection rate" fault. That is, apart from the name, other definitions, and adjustable parameters, processing according to the flow for "wireless disconnection rate" can repair the abnormal data in "Flow disconnection rate".
  • The system therefore automatically follows the "wireless disconnection rate" solution and, after modifying the information that cannot be reused, such as the name, generates a "Flow disconnection rate" solution to handle the new "Flow disconnection rate" fault in a closed loop.
  • The process checks the root cause of the fault and the recommended solution for the root cause, triggers the fault monitoring or fault evaluation process, and automatically monitors fault indicators and field service conditions in order to monitor and evaluate fault resolution.
  • The steps of fault handling are organized, and planned and managed in a modular way: the closed-loop fault resolution process is analyzed and divided into process types such as fault identification and fault analysis, and different solutions are provided for each process; the data model specifies the input and output data types of each process, so that execution steps can be added or replaced in a customized way; the workflow assembles and calls the atomic service models according to the actual situation; and the user-defined fault information in the fault model, together with the corresponding workflow, allows fault scenarios to be customized, added, and resolved.
  • This reduces the professional expertise required to customize fault scenarios, enables users to add or remove fault handling processes conveniently and flexibly, and improves the efficiency of fault handling.
  • The step division of the above methods is only for clarity of description. In implementation, steps may be combined into one step, or a step may be split into multiple steps; as long as they include the same logical relationship, they fall within the protection scope of this patent. Adding insignificant modifications to, or introducing insignificant designs into, the algorithm or process without changing its core design also falls within the protection scope of this patent.
  • An embodiment of the present application relates to a fault handling device, as shown in FIG. 2, including:
  • An atomic service module 201, configured to select a preset atomic service model or create an atomic service model according to the processing status of the current fault, where the atomic service model provides at least one type of atomic service process, and the atomic service process performs one or any combination of the following on a fault: monitoring, analysis, correction, evaluation;
  • A workflow module 202, configured to create a workflow according to the atomic service process in the atomic service model, where the workflow includes a processing chain of the atomic service process;
  • A fault scenario module 203, configured to generate the fault model to which the current fault belongs, where the fault model includes indication information of the workflow;
  • A processing module 204, configured to call, according to the workflow determined by the indication information, the atomic service process in the atomic service model to handle the current fault.
  • Creating the atomic service model includes: selecting a template of the atomic service model; and creating the atomic service model according to the fault parameters of the current fault and the selected template.
  • Generating the fault model to which the current fault belongs includes: if the fault parameters of the current fault are of the same type as the fault parameters of a first fault model among the preset fault models, copying the first fault model, and modifying the copied first fault model according to the fault parameters of the current fault to generate the fault model to which the current fault belongs; [...] the fault model to which the fault belongs.
  • The fault model to which the current fault belongs includes one or any combination of the following: the name of the current fault, an indicator of the current fault, and type information of the current fault.
  • The fault model to which the current fault belongs further includes an execution mode for processing the current fault, where the execution mode includes immediate execution, timed execution, or periodic execution.
  • The processing module 204 is further configured to, after invoking the atomic service process in the atomic service model according to the workflow determined by the indication information to handle the current fault, determine the format of the corresponding output data according to the atomic service process, and output the result of processing the current fault.
  • This embodiment is a system embodiment corresponding to the above embodiment, and can be implemented in cooperation with it.
  • The relevant technical details mentioned in the foregoing embodiment remain valid in this embodiment and are not repeated here.
  • Correspondingly, the relevant technical details mentioned in this embodiment may also be applied in the foregoing embodiment.
  • The modules involved in this embodiment are logical modules.
  • A logical unit can be a physical unit, a part of a physical unit, or a combination of multiple physical units.
  • Units that are not closely related to solving the technical problem addressed by the present application are not introduced in this embodiment, but this does not mean that no other units exist in this embodiment.
  • An embodiment of the present application relates to a server, as shown in FIG. 3, including: at least one processor 301; and a memory communicatively connected to the at least one processor.
  • The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the above fault handling method.
  • The memory and the processor are connected by a bus.
  • The bus may include any number of interconnected buses and bridges, and connects one or more processors and the various circuits of the memory together.
  • The bus may also connect various other circuits, such as peripherals, voltage regulators, and power management circuits, all of which are well known in the art and are therefore not described further here.
  • The bus interface provides an interface between the bus and the transceiver.
  • A transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other devices over a transmission medium.
  • The data processed by the processor is transmitted over the wireless medium through the antenna; the antenna also receives data and passes it to the processor.
  • The processor is responsible for managing the bus and for general processing, and can also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory, in turn, can be used to store data that the processor uses when performing operations.
  • The first embodiment of the present application relates to a computer-readable storage medium storing a computer program.
  • The above method embodiments are implemented when the computer program is executed by the processor.
  • The storage medium includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments of the present application relate to the field of fault handling, and particularly to a fault handling method and apparatus, and a server and a storage medium. The fault handling method comprises: selecting a preset atomic service model or creating an atomic service model according to the handling status of the current fault, wherein the atomic service model is used for providing at least one type of atomic service process, and the atomic service process is used for performing one or any combination of the following handlings on a fault: monitoring, analysis, correction, and evaluation; creating a workflow according to the atomic service process in the atomic service model, wherein the workflow comprises a handling chain of the atomic service process; generating a fault model for the current fault, wherein the fault model comprises indication information of the workflow; and according to the workflow determined by the indication information, calling the atomic service process in the atomic service model, so as to handle the current fault.

Description

A fault handling method, device, server, and storage medium
Cross Reference
This application is based on the Chinese patent application with application number "202110600323.4", filed on May 31, 2021, and claims priority to that Chinese patent application, the entire content of which is incorporated into this application by reference.
Technical Field
The embodiments of the present application relate to the field of fault handling, and in particular to a fault handling method, device, server, and storage medium.
Background
With the development of artificial intelligence, methods such as data mining and machine learning can automatically identify faults, suggest solutions, and evaluate network optimization, which greatly reduces the dependence on the experience of fault handlers and saves labor costs.
However, most current systems support only preset fault scenarios and cannot help with new fault scenarios in the field; conditions differ from field to field, as do the requirements on network performance. Meanwhile, systems that support modification only allow technicians to configure newly added scenarios at the code level, so customizing fault handling scenarios and methods has a technical threshold, and users' day-to-day fault handling is inefficient.
发明内容Contents of the invention
本申请实施方式的目的在于提供一种故障处理方法、装置、服务器及存储介质。The purpose of the embodiments of the present application is to provide a fault handling method, device, server, and storage medium.
本申请的实施方式提供了一种故障处理方法,包括以下步骤:根据对当前故障的处理状况,选择预置的原子服务模型或创建原子服务模型;原子服务模 型用于提供至少一个原子服务进程,其中,原子服务进程用于对故障进行以下处理之一或其任意组合:监控、分析、修正、评估;根据原子服务模型中的原子服务进程,创建工作流;工作流包括原子服务进程的处理链;生成当前故障所属的故障模型,故障模型包括工作流的指示信息;根据指示信息确定的工作流调用原子服务模型中的原子服务进程以处理当前故障。The embodiment of the present application provides a fault handling method, including the following steps: according to the processing status of the current fault, select a preset atomic service model or create an atomic service model; the atomic service model is used to provide at least one atomic service process, Among them, the atomic service process is used to process one of the following faults or any combination thereof: monitoring, analysis, correction, evaluation; create a workflow according to the atomic service process in the atomic service model; the workflow includes the processing chain of the atomic service process ; Generate the fault model to which the current fault belongs, and the fault model includes the indication information of the workflow; the workflow determined according to the indication information calls the atomic service process in the atomic service model to handle the current fault.
本申请的实施方式还提供了一种故障处理装置,包括:Embodiments of the present application also provide a fault handling device, including:
原子服务模块,用于根据对当前故障的处理状况,选择预置的原子服务模型或创建原子服务模型;所述原子服务模型用于提供至少一类原子服务进程,其中,所述原子服务进程用于对故障进行以下处理之一或其任意组合:监控、分析、修正、评估;工作流模块,用于根据所述原子服务模型中的原子服务进程,创建工作流;所述工作流包括所述原子服务进程的处理链;故障场景模块,用于生成所述当前故障所属的故障模型,所述故障模型包括所述工作流的指示信息;处理模块,用于根据所述指示信息确定的工作流调用所述原子服务模型中的所述原子服务进程以处理所述当前故障。The atomic service module is used to select a preset atomic service model or create an atomic service model according to the processing status of the current fault; the atomic service model is used to provide at least one type of atomic service process, wherein the atomic service process uses One or any combination of the following processes for faults: monitoring, analysis, correction, evaluation; workflow module, used to create a workflow according to the atomic service process in the atomic service model; the workflow includes the The processing chain of the atomic service process; the fault scenario module, used to generate the fault model to which the current fault belongs, and the fault model includes the indication information of the workflow; the processing module, used to determine the workflow according to the indication information Invoking the atomic service process in the atomic service model to handle the current fault.
An embodiment of the present application further provides a server, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the above fault handling method.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program, where the computer program implements the above fault handling method when executed by a processor.
Description of the drawings
One or more embodiments are illustrated by the figures in the corresponding drawings. These illustrations do not limit the embodiments. Elements with the same reference numerals in the drawings denote similar elements, and unless otherwise stated, the figures are not drawn to scale.
FIG. 1 is a flowchart of a fault handling method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a fault handling device according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a server according to an embodiment of the present application.
Detailed description
To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments are described in detail below with reference to the drawings. Those of ordinary skill in the art will understand that many technical details are given in the embodiments so that the reader can better understand the present application. However, the technical solution claimed in the present application can be implemented even without these technical details and without the various changes and modifications based on the following embodiments. The division into embodiments below is for convenience of description and does not limit the specific implementation of the present application; the embodiments may be combined with and refer to one another where they do not contradict.
The terms "first" and "second" in the embodiments of the present application are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present application, the terms "including" and "having" and any variants of them are intended to cover non-exclusive inclusion. For example, a system, product, or device that includes a series of components or units is not limited to the listed components or units, but optionally also includes components or units that are not listed, or optionally also includes other components or units inherent to the product or device. In the description of the present application, "multiple" means at least two, for example two or three, unless otherwise specifically defined.
An embodiment of the present application relates to a fault handling method. The specific flow is shown in FIG. 1.
Step 101: select a preset atomic service model or create an atomic service model according to the processing status of the current fault; the atomic service model is used to provide at least one type of atomic service process, where the atomic service process is used to perform one or any combination of the following on a fault: monitoring, analysis, correction, evaluation;
Step 102: create a workflow according to the atomic service process in the atomic service model; the workflow includes a processing chain of the atomic service process;
Step 103: generate the fault model to which the current fault belongs; the fault model includes indication information of the workflow;
Step 104: the workflow determined according to the indication information invokes the atomic service process in the atomic service model to handle the current fault.
In this embodiment, the closed-loop fault resolution flow is first analyzed and split into process types such as fault identification and fault analysis; the workflow model assembles and invokes atomic service models according to the actual situation; and the fault type in the fault model, together with the corresponding workflow, makes it possible to add and resolve custom fault scenarios. This lowers the technical expertise required to define custom fault scenarios, lets users conveniently and flexibly add or remove steps in the fault handling flow, and improves fault handling efficiency.
The implementation details of the fault handling method of this embodiment are described below. They are provided only to aid understanding and are not required for implementing this solution.
In step 101, a preset atomic service model is selected or an atomic service model is created according to the processing status of the current fault; the atomic service model is used to provide at least one type of atomic service process, where the atomic service process is used to perform one or any combination of the following on a fault: monitoring, analysis, correction, evaluation. In this step, the preset atomic service models are first searched, according to the fault parameters of the current fault, for a model that can handle those parameters. If no preset atomic service model can handle the current fault parameters, an atomic service model needs to be created; if such a model exists, it is simply invoked.
In one example, the atomic service steps in the atomic service model can be grouped into different types of atomic service processes, for example fault monitoring, fault analysis, fault correction, and fault evaluation. Specifically, the fault monitoring process monitors field operation according to the definition information of the fault, where the definition information can be obtained from the fault model or preset in the fault monitoring process; the fault analysis process analyzes the monitored fault, determines its cause, and gives automatic or manual resolution suggestions; the fault correction process corrects the fault according to the resolution suggestions produced by fault analysis; and the fault evaluation process uses the indicator data related to the fault to evaluate the effect of the correction and to confirm, in a closed loop, whether the fault has been recovered.
In a specific implementation, the main data features of the atomic service model (atomicService) may include: the atomic service name (name), which summarizes the function of the atomic service; the fault handling process (faultStep), for example fault monitoring, fault analysis, fault correction, or fault evaluation; and the invocation method (invokeMethod), which identifies and invokes the corresponding program segment to implement the function of the atomic service model, and may include information such as a uniform resource locator (URL) prefix that describes how to invoke the implementation of this atomic service function.
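The three data features listed above could be sketched as a small record type. This is only an illustration under stated assumptions: the field names follow the identifiers in the text (name, faultStep, invokeMethod), while the enum values, the example service, and the URL prefix are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class FaultStep(Enum):
    """The four process types named in the text."""
    MONITORING = "monitoring"
    ANALYSIS = "analysis"
    CORRECTION = "correction"
    EVALUATION = "evaluation"


@dataclass(frozen=True)
class AtomicService:
    name: str              # short summary of what the service does
    fault_step: FaultStep  # faultStep: which process type it belongs to
    invoke_method: str     # invokeMethod: e.g. a URL prefix (hypothetical value below)


# Hypothetical example entry; the URL prefix is an assumption for illustration.
erab_analysis = AtomicService(
    name="E-RAB setup failure analysis",
    fault_step=FaultStep.ANALYSIS,
    invoke_method="/hypothetical/api/erab/analysis",
)
```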
In this step, the atomic service model is first confirmed, either by selecting a preset model or by adding a new one, and serves as the basis for handling the current fault.
In step 102, a workflow is created according to the atomic service process in the atomic service model; the workflow includes a processing chain of the atomic service process.
After the atomic service processes for handling the current fault have been determined, the order in which they execute needs to be determined so as to complete the handling flow for the current fault, which makes the planning of the fault handling framework more flexible. For example, the workflow may require that the fault evaluation process be executed after the fault correction process, so that the effectiveness of the handling result is fed back in a closed loop.
In a specific implementation, the main data features of the workflow (workflow) may include: the workflow name (workflowName), which is unique; and the processing chain of atomic service processes (atomicServiceList), which describes the execution chain of the workflow, that is, the data needed to invoke the atomic service functions.
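A minimal sketch of these two features follows, assuming the processing chain is stored as an ordered list of atomic service names and that uniqueness of workflowName is enforced by a registry. The `WorkflowRegistry` class and the example names are hypothetical and not part of the original text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Workflow:
    workflow_name: str  # workflowName: must be unique
    # atomicServiceList: ordered chain of atomic service names
    atomic_service_list: List[str] = field(default_factory=list)


class WorkflowRegistry:
    """Hypothetical store that rejects duplicate workflow names."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Workflow] = {}

    def register(self, workflow: Workflow) -> None:
        if workflow.workflow_name in self._by_name:
            raise ValueError(f"duplicate workflow name: {workflow.workflow_name}")
        self._by_name[workflow.workflow_name] = workflow

    def get(self, workflow_name: str) -> Workflow:
        return self._by_name[workflow_name]


registry = WorkflowRegistry()
# Evaluation is placed after correction so the result is fed back in a closed loop.
registry.register(
    Workflow("erab_drop", ["monitoring", "analysis", "correction", "evaluation"])
)
```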
This step establishes the workflow, which defines the execution order and execution logic of the atomic service processes so that the current fault can be resolved in a correspondingly customized way.
In step 103, the fault model to which the current fault belongs is generated. The fault model includes indication information of the workflow, and the indication information may be the name of the workflow. In a specific implementation, the main data features of the fault model (faultScene) include: the name of the current fault; the fault type (faultSceneType), for example network fault, alarm, or transmission; the indicator information of the fault (sceneDefinition), for example the observed indicators and threshold conditions; and the name of the workflow to be invoked (workflowName).
In one example, generating the fault model to which the current fault belongs includes: if the fault parameters of the current fault are of the same type as the fault parameters of a first fault model among the preset fault models, copying the first fault model and modifying the copy according to the fault parameters of the current fault to generate the fault model to which the current fault belongs; if the fault parameters of the current fault differ in type from the fault parameters of every preset fault model, creating the fault model to which the current fault belongs. That is, whether a fault model needs to be added or modified is judged from the types of the current fault parameters and of the fault parameters in each preset fault model. For example, the fault parameters of the first fault model are a, b, c and the fault parameters of the current fault are b, c, d, where a, b, c, and d are all network congestion parameters. Although the first fault model cannot fully handle the current fault, the parameter types are the same, so a fault model for the current fault can be obtained with a slight modification. No completely new model is needed, which reduces data processing and computation. In addition, the system holds a preset classification of fault parameter types to support this step.
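The copy-or-create decision above can be sketched as follows. This is an illustration under assumptions: "same type" is interpreted as the two parameter sets mapping to the same set of categories in the preset classification, and the function name, the `param_types` mapping, and the sample values are hypothetical.

```python
import copy
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FaultScene:
    name: str
    fault_scene_type: str
    scene_definition: Dict[str, float]  # indicator name -> threshold
    workflow_name: str


def generate_fault_model(
    name: str,
    fault_params: Dict[str, float],
    param_types: Dict[str, str],  # preset classification: parameter -> type
    presets: List[FaultScene],
    new_workflow_name: str,
) -> FaultScene:
    current_types = {param_types[p] for p in fault_params}
    for preset in presets:
        preset_types = {param_types[p] for p in preset.scene_definition}
        if preset_types == current_types:
            # Same parameter type: copy the preset and adapt the copy only.
            model = copy.deepcopy(preset)
            model.name = name
            model.scene_definition = dict(fault_params)
            return model
    # No preset of this type: create a new fault model.
    return FaultScene(
        name, "/".join(sorted(current_types)), dict(fault_params), new_workflow_name
    )


PARAM_TYPES = {"a": "network congestion", "b": "network congestion",
               "c": "network congestion", "d": "network congestion",
               "e": "alarm"}
PRESETS = [FaultScene("first", "network congestion",
                      {"a": 1.0, "b": 2.0, "c": 3.0}, "wf_congestion")]

reused = generate_fault_model(
    "current", {"b": 2.0, "c": 3.0, "d": 4.0}, PARAM_TYPES, PRESETS, "wf_new")
created = generate_fault_model(
    "alarm fault", {"e": 1.0}, PARAM_TYPES, PRESETS, "wf_new")
```

In this sketch the copied model keeps the preset's workflow, which mirrors the a, b, c versus b, c, d example: only the scenario definition changes, and the preset itself is left untouched.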
In another example, the fault model further includes the time at which handling is executed (executeMode), for example immediate execution, scheduled execution, or periodic execution. Defining the handling period per fault makes fault handling more effective or better matched to the user's needs.
In one step, creating the atomic service model includes: selecting a template of the atomic service model; and creating the atomic service model according to the fault parameters of the current fault and the selected template. That is, preset atomic service model templates exist, and when an atomic service model needs to be created, a preset template is selected as the starting point. This ensures that the data required by the execution process is collected and that the newly created atomic service model can be put into use. Specifically, the atomic service model is created from the selected template in combination with the fault parameters of the current fault that need to be handled. For example, a user instruction is received and the data required by the template is adapted to the fault parameters of the current fault.
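One way to picture template-based creation is a template with placeholders that are filled from the fault parameters. The sketch below is an assumption about how such a template might look; the placeholder syntax, the key names, the function name, and the URL pattern are all hypothetical.

```python
from typing import Dict

# Hypothetical template: every required field is present, with placeholders.
ATOMIC_SERVICE_TEMPLATE: Dict[str, str] = {
    "name": "{indicator} {faultStep}",
    "faultStep": "{faultStep}",
    "invokeMethod": "/hypothetical/{faultStep}/{indicator}",
}

REQUIRED_FIELDS = {"name", "faultStep", "invokeMethod"}


def create_atomic_service(template: Dict[str, str],
                          fault_params: Dict[str, str]) -> Dict[str, str]:
    """Fill a template with the current fault's parameters."""
    missing = REQUIRED_FIELDS - set(template)
    if missing:
        raise ValueError(f"template lacks required fields: {sorted(missing)}")
    return {key: value.format(**fault_params) for key, value in template.items()}


service = create_atomic_service(
    ATOMIC_SERVICE_TEMPLATE,
    {"indicator": "erab_setup_success_rate", "faultStep": "monitoring"},
)
```

Starting from a template means a new model cannot be created without the fields the execution process depends on.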
In step 104, the workflow determined according to the indication information invokes the atomic service process in the atomic service model to handle the current fault. Specifically, because the fault model was generated for the current fault, invoking the fault model handles the current fault in a targeted way: the indication information in the fault model points to the corresponding workflow, and the atomic service processes listed in that workflow are invoked to handle the current fault.
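The lookup chain described in step 104 (fault model, then workflow, then atomic service processes) can be sketched as below. This assumes each atomic service is a plain callable that takes and returns a dictionary; the registries, service bodies, and sample data are hypothetical.

```python
from typing import Any, Callable, Dict, List

Data = Dict[str, Any]


def handle_fault(
    fault_model: Data,
    workflows: Dict[str, List[str]],
    services: Dict[str, Callable[[Data], Data]],
    fault_data: Data,
) -> Data:
    # The indication information (here, workflowName) points to the workflow.
    chain = workflows[fault_model["workflowName"]]
    data = fault_data
    for service_name in chain:
        # Each atomic service process consumes the previous one's output.
        data = services[service_name](data)
    return data


def monitor(data: Data) -> Data:
    bad = [cell for cell, rate in data["drop_rate"].items()
           if rate > data["threshold"]]
    return {**data, "bad_cells": bad}


def analyze(data: Data) -> Data:
    return {**data, "suggestion": "check " + ", ".join(data["bad_cells"])}


WORKFLOWS = {"drop_rate_flow": ["monitor", "analyze"]}
SERVICES = {"monitor": monitor, "analyze": analyze}
FAULT_MODEL = {"name": "drop rate", "workflowName": "drop_rate_flow"}

result = handle_fault(
    FAULT_MODEL, WORKFLOWS, SERVICES,
    {"drop_rate": {"cell-1": 0.01, "cell-2": 0.09}, "threshold": 0.05},
)
```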
In one example, a data model (data) is used, when atomic service models and workflow information are created, to define the input and output data types of each execution process, which ensures that the output is useful to the user. Useful output is, for example, output that the user can view or analyze, rather than a series of codewords that cannot be processed or recognized. In other words, the data types used during execution of this method are restricted, and only the corresponding data types are accepted, so that the method can be carried out smoothly. Specifically, the main data features of the data model (data) may include: the input data form (inputData), for example the input of the fault overview step may be the whole network, a group, or a specified list of network elements or cells; and the output data form (outputData), for example the output of fault analysis includes the list of poor-quality cells, the cause of the poor quality, and the proposed operation suggestions.
In one example, after the workflow determined according to the indication information invokes the atomic service process in the atomic service model to handle the current fault, the method further includes: determining the format of the corresponding output data according to the atomic service process; and outputting the result of handling the current fault in that format. Different atomic service processes have different output data formats. Specifically, the main data features of the data model (data) may further include: the atomic service process (faultStep), because different atomic service processes use different data types, which distinguishes the results of each process, makes the execution flow clearer and more stable, and reduces execution order errors caused by data formats; and the fault type (faultSceneType), because in some cases the output data form may differ for different fault types within the same atomic service process. For example, the fault evaluation output for a network fault is the monitoring result of network indicators, whereas the fault evaluation output for an alarm is the alarm recovery status.
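The dependence of the output format on faultStep and, in some cases, on faultSceneType could be sketched as a lookup keyed by both. This is an assumed representation: the table contents are taken from the examples in the text, but the field names, the fallback rule, and the function names are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

# (faultStep, faultSceneType) -> output fields; None means "any fault type".
OUTPUT_FORMATS: Dict[Tuple[str, Optional[str]], List[str]] = {
    ("analysis", None): ["poor_quality_cells", "cause", "suggestion"],
    ("evaluation", "network fault"): ["indicator_monitoring_result"],
    ("evaluation", "alarm"): ["alarm_recovery_status"],
}


def output_fields(fault_step: str, fault_scene_type: str) -> List[str]:
    specific = OUTPUT_FORMATS.get((fault_step, fault_scene_type))
    if specific is not None:
        return specific
    # Fall back to the format shared by all fault types of this process.
    return OUTPUT_FORMATS[(fault_step, None)]


def format_result(fault_step: str, fault_scene_type: str,
                  raw: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields that the data model declares for this process."""
    return {name: raw[name] for name in output_fields(fault_step, fault_scene_type)}


raw_output = {"alarm_recovery_status": "cleared", "debug_trace": "internal"}
alarm_result = format_result("evaluation", "alarm", raw_output)
```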
In one implementation, the above steps are divided into units for the user's convenience. For example: an interface management unit gives the user a visual entry point for operating and viewing; a model management unit adds, deletes, modifies, and queries the above models according to user operations and stores them accordingly; a data storage unit uses a relational database or disk files to store the execution result files of each fault scenario; and an application running unit, after the user triggers scenario execution through the interface or execution is triggered periodically, first reads the corresponding workflow data from the model management unit and then executes, in order, the atomic service processes configured in the workflow data to identify, analyze, correct, and/or evaluate the fault. This makes it easier for the user to modify, through the interface, the fault scenario solutions that the system generates automatically, for example the execution mode of a newly added fault scenario solution or the threshold definition of a fault scenario.
Take an actual execution as an example. The establishment success rate of the Evolved Radio Access Bearer (E-RAB) drops in the field, and the system's preset scenarios are found unable to resolve the drop, so a fault handling flow for the drop in E-RAB establishment success rate needs to be added. Specifically: the user adds atomic service information according to the preset atomic service model framework, including the inputs and outputs of the atomic service and its functional implementation; adds workflow information according to the workflow model framework, including the invocation logic and rules of the atomic services for each step of analyzing the fault; and creates a new scenario model and associates it with the workflow, where the scenario model information includes the scenario name, the indicator formulas required by the scenario, and the fault type. The start time of the new fault scenario is also set, which can be immediate or scheduled. Once the scenario has been added, the fault can be handled in a closed loop according to the new fault handling flow.
In another actual execution, a fault model for the "wireless drop rate" in the dropped call category has been preset, and the "Flow drop rate" indicator in the field, which also belongs to the dropped call category, suddenly keeps deteriorating. The fault data in both the "wireless drop rate" and "Flow drop rate" scenarios belongs to the dropped call category. By analyzing the existing fault models, the system finds that the fault model, workflow model, atomic service model, and data model of the "wireless drop rate" can handle the new "Flow drop rate" fault in a closed loop. That is, apart from fixed information such as the name, processing the adjustable parameters according to the flow for the "wireless drop rate" can repair the abnormal data of the "Flow drop rate". The system therefore automatically derives a "Flow drop rate" solution from the "wireless drop rate" solution, replacing the fixed information such as the name, and uses it to handle the new "Flow drop rate" fault in a closed loop.
In addition, after the fault model is obtained, the monitoring information of the fault can be viewed through the visual interface, for example the TOP worst cells (that is, the cells with the most severe faults) and the indicator trend chart of a single cell. If fault analysis is triggered, the root cause of the fault and the recommended solution for that root cause can be viewed through the fault analysis process. If the fault monitoring or fault evaluation process is triggered, fault indicators and field service conditions can be monitored automatically to monitor and evaluate how well the fault has been resolved.
In this embodiment, the fault handling steps are organized and managed in a modular way. The closed-loop fault resolution flow is analyzed and split into process types such as fault identification and fault analysis, and different solutions are provided for each process. The data model specifies the input and output data types of each process, so that execution steps can be added or replaced by the user. The workflow assembles and invokes atomic service models according to the actual situation. The user-defined fault information in the fault model, together with the corresponding workflow, makes it possible to add and resolve custom fault scenarios. This lowers the technical expertise required to define custom fault scenarios, lets users conveniently and flexibly add or remove steps in the fault handling flow, and improves fault handling efficiency.
The division of the above methods into steps is only for clarity of description. In implementation, steps may be merged into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, they fall within the protection scope of this patent. Adding insignificant modifications or introducing insignificant designs to the algorithm or flow without changing its core design also falls within the protection scope of this patent.
An embodiment of the present application relates to a fault handling device, as shown in FIG. 2, including:
an atomic service module 201, configured to select a preset atomic service model or create an atomic service model according to the processing status of a current fault; the atomic service model is used to provide at least one type of atomic service process, where the atomic service process is used to perform one or any combination of the following on a fault: monitoring, analysis, correction, evaluation;
a workflow module 202, configured to create a workflow according to the atomic service process in the atomic service model; the workflow includes a processing chain of the atomic service process;
a fault scenario module 203, configured to generate the fault model to which the current fault belongs; the fault model includes indication information of the workflow;
a processing module 204, configured to invoke, by the workflow determined according to the indication information, the atomic service process in the atomic service model to handle the current fault.
In the atomic service module 201, creating the atomic service model includes: selecting a template of the atomic service model; and creating the atomic service model according to the fault parameters of the current fault and the selected template.
In the fault scenario module 203, generating the fault model to which the current fault belongs includes: if the fault parameters of the current fault are of the same type as the fault parameters of a first fault model among the preset fault models, copying the first fault model and modifying the copy according to the fault parameters of the current fault to generate the fault model to which the current fault belongs; if the fault parameters of the current fault differ in type from the fault parameters of every preset fault model, creating the fault model to which the current fault belongs.
In one example, the fault model to which the current fault belongs includes one or any combination of the following: the name of the current fault, the indicators of the current fault, and the type information of the current fault.
In another example, the fault model to which the current fault belongs further includes the execution mode for handling the current fault, where the execution mode includes immediate execution, scheduled execution, or periodic execution.
In the processing module 204, after the workflow determined according to the indication information invokes the atomic service process in the atomic service model to handle the current fault, the processing further includes: determining the format of the corresponding output data according to the atomic service process; and outputting the result of handling the current fault in that format.
In addition, different atomic service processes have different output data formats.
In the embodiments of the present application, the closed-loop fault resolution flow is analyzed and split into process types such as fault identification and fault analysis; the workflow assembles and invokes atomic service models according to the actual situation; and the fault type in the fault model, together with the corresponding workflow, makes it possible to add and resolve custom fault scenarios. This lowers the technical expertise required to define custom fault scenarios, lets users conveniently and flexibly add or remove steps in the fault handling flow, and improves fault handling efficiency.
This embodiment is a system embodiment corresponding to the above embodiment, and the two can be implemented in cooperation with each other. The technical details given in the above embodiment remain valid in this embodiment and are not repeated here. Correspondingly, the technical details given in this embodiment can also be applied to the above embodiment.
It is worth noting that the modules in this embodiment are all logical modules. In practical applications, a logical unit may be a physical unit, part of a physical unit, or a combination of multiple physical units. In addition, to highlight the innovative part of the present application, units that are not closely related to solving the technical problem raised by the present application are not introduced in this embodiment, but this does not mean that no other units exist in this embodiment.
An embodiment of the present application relates to a server which, as shown in FIG. 3, includes: at least one processor 301; and
a memory 302 communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the fault handling method described above.
The memory and the processor are connected by a bus. The bus may include any number of interconnected buses and bridges, and it connects the various circuits of the one or more processors and the memory together. The bus may also connect various other circuits, such as peripheral devices, voltage regulators, and power management circuits; these are well known in the art and are therefore not described further herein. A bus interface provides an interface between the bus and a transceiver. The transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, and provides a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor is transmitted over a wireless medium through an antenna; the antenna also receives data and passes it to the processor.
The processor is responsible for managing the bus and for general processing, and may also provide various functions, including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory may be used to store data that the processor uses when performing operations.
An embodiment of the present application relates to a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, the foregoing method embodiments are implemented.
That is, those skilled in the art will understand that all or some of the steps of the methods in the foregoing embodiments can be completed by a program instructing the relevant hardware. The program is stored in a storage medium and includes a number of instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Those of ordinary skill in the art will understand that the foregoing embodiments are specific examples of implementing the present application, and that in practical applications various changes in form and detail may be made to them without departing from the spirit and scope of the present application.

Claims (10)

  1. A fault handling method, comprising:
    selecting a preset atomic service model or creating an atomic service model according to the handling status of a current fault, wherein the atomic service model is used to provide at least one type of atomic service process, and the atomic service process is used to perform one or any combination of the following on a fault: monitoring, analysis, correction, and evaluation;
    creating a workflow according to the atomic service process in the atomic service model, wherein the workflow includes a processing chain of the atomic service process;
    generating a fault model to which the current fault belongs, wherein the fault model includes indication information of the workflow; and
    invoking, by the workflow determined according to the indication information, the atomic service process in the atomic service model to handle the current fault.
  2. The fault handling method according to claim 1, wherein generating the fault model to which the current fault belongs comprises:
    if the fault parameters of the current fault are of the same type as the fault parameters of a first fault model among preset fault models, copying the first fault model and modifying the copied first fault model according to the fault parameters of the current fault, to generate the fault model to which the current fault belongs; and
    if the fault parameters of the current fault differ in type from the fault parameters of every preset fault model, creating the fault model to which the current fault belongs.
  3. The fault handling method according to claim 1 or 2, wherein the fault model to which the current fault belongs includes one or any combination of the following: a name of the current fault, an indicator of the current fault, and type information of the current fault.
  4. The fault handling method according to claim 3, wherein the fault model to which the current fault belongs further includes:
    an execution mode for handling the current fault, wherein the execution mode includes immediate execution, timed execution, or periodic execution.
  5. The fault handling method according to any one of claims 1 to 4, wherein, after the workflow determined according to the indication information invokes the atomic service process in the atomic service model to handle the current fault, the method further comprises:
    determining a format of corresponding output data according to the atomic service process; and
    outputting a result of handling the current fault in the format of the output data.
  6. The fault handling method according to any one of claims 1 to 5, wherein different atomic service processes have different output data formats.
  7. The fault handling method according to any one of claims 1 to 6, wherein creating the atomic service model comprises:
    selecting a template of the atomic service model; and
    creating the atomic service model according to the fault parameters of the current fault and the selected template of the atomic service model.
  8. A fault handling apparatus, comprising:
    an atomic service module, configured to select a preset atomic service model or create an atomic service model according to the handling status of a current fault, wherein the atomic service model is used to provide at least one type of atomic service process, and the atomic service process is used to perform one or any combination of the following on a fault: monitoring, analysis, correction, and evaluation;
    a workflow module, configured to create a workflow according to the atomic service process in the atomic service model, wherein the workflow includes a processing chain of the atomic service process;
    a fault scenario module, configured to generate a fault model to which the current fault belongs, wherein the fault model includes indication information of the workflow; and
    a processing module, configured to invoke, by the workflow determined according to the indication information, the atomic service process in the atomic service model to handle the current fault.
  9. A server, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the fault handling method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the fault handling method according to any one of claims 1 to 7.
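
For illustration only, the fault model generation recited in claim 2 (copy and modify a preset model when the parameter types match, otherwise create a new model) might be sketched in Python as follows. This is a sketch under stated assumptions, not the applicant's implementation: "same type of fault parameters" is interpreted here as "same set of parameter names", and `FaultModel`, `generate_fault_model`, `default_workflow_id`, and all sample values are hypothetical.

```python
import copy
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FaultModel:
    name: str
    params: Dict[str, object]      # fault parameters, e.g. metric and threshold
    workflow_id: str               # indication information of the workflow
    execution: str = "immediate"   # "immediate", "timed" or "periodic" (claim 4)


def generate_fault_model(
    name: str,
    fault_params: Dict[str, object],
    presets: List[FaultModel],
    default_workflow_id: str,
) -> FaultModel:
    """Copy and adapt a matching preset model, or create a new model."""
    wanted = set(fault_params)
    for preset in presets:
        # Assumption: parameter "types" match when the parameter names match.
        if set(preset.params) == wanted:
            model = copy.deepcopy(preset)  # leave the preset itself untouched
            model.name = name
            model.params.update(fault_params)
            return model
    # No preset has the same parameter types: create a fresh fault model.
    return FaultModel(name, dict(fault_params), default_workflow_id)


presets = [FaultModel("cpu-high", {"metric": "cpu", "threshold": 0.8}, "wf-cpu", "periodic")]

copied = generate_fault_model(
    "cpu-high-node7", {"metric": "cpu", "threshold": 0.9}, presets, "wf-generic"
)
created = generate_fault_model("link-down", {"port": "eth0"}, presets, "wf-generic")
```

In this sketch the copied model inherits the preset's workflow and execution mode and only its parameter values change, while an unmatched fault falls back to a hypothetical default workflow.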
PCT/CN2022/094796 2021-05-31 2022-05-24 Fault handling method and apparatus, and server and storage medium WO2022253054A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110600323.4 2021-05-31
CN202110600323.4A CN115934451A (en) 2021-05-31 2021-05-31 Fault processing method, device, server and storage medium

Publications (1)

Publication Number Publication Date
WO2022253054A1 true WO2022253054A1 (en) 2022-12-08

Family

ID=84322781

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/094796 WO2022253054A1 (en) 2021-05-31 2022-05-24 Fault handling method and apparatus, and server and storage medium

Country Status (2)

Country Link
CN (1) CN115934451A (en)
WO (1) WO2022253054A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170102982A1 (en) * 2015-10-13 2017-04-13 Honeywell International Inc. Methods and apparatus for the creation and use of reusable fault model components in fault modeling and complex system prognostics
CN106651328A (en) * 2017-01-13 2017-05-10 山东浪潮商用系统有限公司 Configurable intelligent flow navigation method
CN107844799A (en) * 2017-10-17 2018-03-27 西安建筑科技大学 A kind of handpiece Water Chilling Units method for diagnosing faults of integrated SVM mechanism
CN111831520A (en) * 2019-04-17 2020-10-27 烽火通信科技股份有限公司 Fault diagnosis method and system for Linux operating system
CN111859047A (en) * 2019-04-23 2020-10-30 华为技术有限公司 Fault solving method and device

Also Published As

Publication number Publication date
CN115934451A (en) 2023-04-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22815101

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE