CN113760579A - Troubleshooting method and device - Google Patents


Info

Publication number
CN113760579A
Authority
CN
China
Prior art keywords
troubleshooting
service
program
application program
service information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111040339.0A
Other languages
Chinese (zh)
Inventor
朱进
张一曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN202111040339.0A
Publication of CN113760579A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/547 Remote procedure calls [RPC]; Web services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/547 Messaging middleware
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Abstract

The application provides a troubleshooting method and a troubleshooting device, relates to the field of computer technology, and addresses the heavy workload and low operation and maintenance efficiency of troubleshooting an application system in the prior art. The method comprises: acquiring service information and fault data of an application program, and performing an association query according to the service information and the fault data to obtain associated service information; freely combining fields of the service information, the fault data, or the associated service information, and performing troubleshooting on the combined data; and performing online service troubleshooting of the application program together with its upstream and downstream service programs, and outputting a troubleshooting result.

Description

Troubleshooting method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a troubleshooting method and apparatus.
Background
With the continuing development of information technology, most application systems are now distributed, and a single application system may run on dozens or even hundreds of virtual computer environments. This increases the complexity and workload of operation and maintenance, so it is important to develop automated operation and maintenance tools for troubleshooting and emergency response of application systems.
Current operation orchestration tools allow a series of "operations" to be arranged into a flow by dragging and graphical editing. The flow can then be executed manually, triggered automatically by a condition, or run on a schedule, thereby achieving maintenance task visualization, flow automation, and service automation. Among them, the Operation Organization (OO) tool can be used for the orchestration design of OO flows.
However, current operation orchestration tools can only check a single function, for example a particular fault or a particular function of an application program. Moreover, when a check script is run, the user must manually apply for a login identity (ID) and frequently perform operations such as logging in to multiple machines and checking manually, so operation and maintenance efficiency is low.
Disclosure of Invention
The application provides a troubleshooting method and a troubleshooting device, which address the heavy workload and low operation and maintenance efficiency of troubleshooting an application system in the prior art.
To achieve this purpose, the application adopts the following technical solutions:
In a first aspect, a troubleshooting method is provided, the method including: acquiring service information and fault data of an application program, and performing an association query according to the service information and the fault data to obtain associated service information; freely combining fields of the service information, the fault data, or the associated service information, and performing troubleshooting on the combined data; and performing online service troubleshooting of the application program together with its upstream and downstream service programs, and outputting a troubleshooting result.
In one embodiment, performing online service troubleshooting of the upstream and downstream service programs for the application program specifically includes: performing troubleshooting by calling at least one of the following: a local transaction table and an external associated transaction table of the application program; an associated service program of an upstream service or a downstream service; logs of the application program and the associated service program; and communication plane middleware of the application program and the associated service program.
In one embodiment, the communication plane middleware comprises message queue (MQ) middleware.
In one embodiment, before outputting the troubleshooting result, the method further comprises: performing auxiliary troubleshooting according to at least one of a batch status check, a statistical analysis result, or a post-change check.
In one embodiment, before performing online service troubleshooting of the upstream and downstream service programs for the application program, the method further comprises: performing multi-dimensional free combination of different fields in the log of the application program, and performing a keyword check on the data obtained by the free combination.
In one embodiment, the method further comprises: executing emergency processing on the application program according to the troubleshooting result.
In one embodiment, the emergency processing includes at least one of a program backup, a program restart, or a custom emergency processing program.
According to the embodiments of the application, an automated troubleshooting tool for the application system is constructed that troubleshoots online transactions through automated operation orchestration during operation and maintenance. It provides functions such as one-key troubleshooting of online transactions, real-time monitoring, and multi-dimensional statistical analysis of maintenance data, helps operation and maintenance personnel locate faults of the application system quickly and accurately, and allows application emergencies to be handled promptly and safely.
In a second aspect, a troubleshooting apparatus is provided, comprising: a parameter input module, configured to acquire service information and fault data of an application program and to perform an association query according to the service information and the fault data to obtain associated service information; an information free query module, configured to freely combine fields of the service information, the fault data, or the associated service information and to perform troubleshooting on the combined data; and an online service troubleshooting module, configured to perform online service troubleshooting of the application program together with its upstream and downstream service programs and to output a troubleshooting result.
In an embodiment, the online service troubleshooting module is specifically configured to perform troubleshooting by calling at least one of the following: a local transaction table and an external associated transaction table of the application program; an associated service program of an upstream service or a downstream service; logs of the application program and the associated service program; and communication plane middleware of the application program and the associated service program.
In one embodiment, the communication plane middleware comprises message queue (MQ) middleware.
In one embodiment, the apparatus further comprises: an auxiliary troubleshooting module, configured to perform auxiliary troubleshooting according to at least one of a batch status check, a statistical analysis result, or a post-change check.
In an embodiment, the information free query module is further specifically configured to perform multi-dimensional free combination of different fields in the log of the application program, and to perform a keyword check on the data obtained by the free combination.
In one embodiment, the apparatus further comprises: an emergency processing module, configured to execute emergency processing on the application program according to the troubleshooting result.
In one embodiment, the emergency processing includes at least one of a program backup, a program restart, or a custom emergency processing program.
In a third aspect, an electronic device is provided, which includes: a processor and a transmission interface; wherein the processor is configured to execute instructions stored in a memory to implement the method of any implementation of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, storing instructions that, when executed by a processor of an electronic device, enable the electronic device to perform the method of any implementation of the first aspect.
In a fifth aspect, a computer program product is provided, which, when run on a computer or a processor, causes the computer or the processor to perform the method of any implementation of the first aspect.
It should be understood that any of the troubleshooting apparatus, the electronic device, the computer-readable storage medium, and the computer program product provided above can be used to execute the corresponding method provided above. For the beneficial effects they achieve, reference may therefore be made to the beneficial effects of the corresponding method, which are not repeated here.
Drawings
Fig. 1 is a schematic diagram of a system architecture according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a troubleshooting method according to an embodiment of the present application;
Fig. 3 is a schematic functional module diagram of an electronic device according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of a troubleshooting apparatus provided in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present embodiment, "a plurality" means two or more unless otherwise specified.
It is noted that, in the present application, words such as "exemplary" or "for example" are used to mean serving as an example, illustration, or description. Any embodiment or design described herein as "exemplary" or "for example" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the words "exemplary" or "for example" is intended to present related concepts in a concrete fashion.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
First, a brief description is given of an implementation environment and an application scenario of the embodiment of the present application.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the present method for troubleshooting may be applied.
As shown in FIG. 1, the system architecture 100 may include a server 101, a network 102, and an operation and maintenance computer 103.
The server 101 may be an electronic device running an application program, and specifically may be a single server, a server cluster, or a server in a distributed architecture. The application program running on the server 101 may be a banking application, or may be an application providing other services, which is not limited in this application.
Network 102 serves as a communication medium that provides a communication link between server 101 and operation and maintenance computer 103. Network 102 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.
A user may use the operation and maintenance computer 103 to perform data interaction with the server 101 through the network 102, so as to perform data access, operation and maintenance, parameter update, message receiving or issuing, and the like on an application program running on the server 101. The server 101 and the operation and maintenance computer 103 may be installed with various applications for implementing communication therebetween, such as an operation and maintenance application, a data transmission application, an instant messaging application, and the like.
The server 101 and the operation and maintenance computer 103 may be hardware or software. When the server 101 is hardware, it may be a hardware cluster constructed from a plurality of electronic devices including, but not limited to, smart phones, tablets, laptop and desktop computers, workstations, servers, and the like. When the server 101 is software, it may be installed in the electronic device listed above, and it may be implemented as multiple software or software modules, or may be implemented as a single software or software module, and is not limited in this respect. When the operation and maintenance computer 103 is hardware, it may be a single operation and maintenance computer, or a distributed computer cluster formed by a plurality of computers; when the operation and maintenance computer 103 is software, it may be implemented as a plurality of software or software modules, or may be implemented as a single software or software module, and is not limited herein.
The operation and maintenance computer 103 may be an electronic device for performing operation and maintenance on an application program, and may provide operation and maintenance troubleshooting services for the server 101 through various built-in applications, such as an operation and maintenance application.
Specifically, when running the operation and maintenance application, the operation and maintenance computer 103 may do the following: first, log in to the application program running on the server 101 through the network 102 and acquire the running data of each functional module; then, determine from the running data the associated running data that links different functional modules; and finally, determine the abnormal data present in the operation and maintenance process according to the running data and the associated running data, and perform troubleshooting. That is, through the above processing steps, the operation and maintenance computer 103 can output the information found during the current check of the application program running on the server 101, such as abnormal data, health indicators, and emergency processing.
So as to affect the normal services running on the server 101 as little as possible, the troubleshooting method provided in the following embodiments of the present application is generally executed by the operation and maintenance computer 103, which is independent of the server 101; accordingly, the troubleshooting apparatus is generally disposed in the operation and maintenance computer 103. However, when the server 101 itself has sufficient computing capability and computing resources, in particular when it currently has a large amount of spare computing resources, the server 101 may perform the above operations through an operation and maintenance application installed on it and output the same result as the operation and maintenance computer 103. In that case the troubleshooting apparatus may be disposed in the server 101, and the exemplary system architecture 100 need not include the operation and maintenance computer 103 or the network 102.
It should be understood that the number of servers, networks, and operation and maintenance computers in FIG. 1 are merely illustrative. There may be any number of servers, networks, and operation and maintenance computers, as desired for an implementation.
The following embodiments of the present application are described using an electronic device as an example.
The embodiment of the application provides a troubleshooting method applied to an electronic device, which may specifically be an operation and maintenance computer that performs operation maintenance, data monitoring, troubleshooting, emergency response, and similar processing on an application program. In the embodiment of the application, the operation orchestration tool of the electronic device enables one-key troubleshooting of the application program, that is, comprehensive online troubleshooting of the application program from multiple angles. Faults among the associated services of an application system can thus be checked automatically and emergency processing provided in time, which reduces the manual workload of operation and maintenance personnel and improves operation and maintenance efficiency.
As shown in fig. 2, the embodiment of the present application includes the following steps:
201: the electronic equipment acquires the service information and the fault data, and performs correlation query according to the service information and the fault data to obtain correlation service information.
The service information may specifically be service data generated by an application program, or monitoring data. The service information may be input by an external device, or may be obtained by the electronic device or the operation and maintenance personnel monitoring the operation of the service program or the associated service program. For example, the service information may be a work order, or data or a fault recorded or marked by an operation and maintenance person in the operation process of the service program.
The fault data may be a fault record input by an external device, or a user fault record logged by a service platform of the application program. For example, a user who finds, while using the application program, that the data it returns is incorrect or that it is running abnormally can report and register the problem through the customer service platform of the application program, for example by telephone or online customer service, so that the fault information is reported to the back end of the application program.
In one embodiment, the electronic device may first perform security verification on externally provided service information to ensure the security of the externally provided service information. After the security verification passes, an association query can be performed.
The association query refers to querying related service data of the service information in the upstream service program and the downstream service program of the application program through the upstream service or the downstream service associated with the application program, so as to obtain the associated service information.
Illustratively, the application program is a loan application system in a financial system, and service information 1 is the service data of a loan application made by user 1 in the loan application system. The upstream service of the loan application system is a card management system, and its downstream service is a loan auditing system. Service information 1 of user 1 is associated with service information 2 in the card management system and also with service information 3 in the loan auditing system. In the embodiment of the present application, when troubleshooting is performed, the associated service information 2, service information 3, and so on can be obtained by querying the data related to service information 1 in the upstream and downstream service programs of the application program, which supports thorough automated troubleshooting and fault diagnosis.
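The association query in this example can be sketched as follows. This is a minimal illustration rather than the method as claimed: the record layout, the system names, the `user_id` link key, and the assumption that service information 3 resides in the downstream loan auditing system are all hypothetical, and in-memory lists stand in for the real service programs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRecord:
    system: str     # service program that holds the record
    record_id: str  # e.g. "info-1" for service information 1
    user_id: str    # hypothetical key that links records across systems


# Hypothetical stand-ins for the data held by the three systems.
RECORDS = [
    ServiceRecord("card_management", "info-2", "user-1"),
    ServiceRecord("loan_application", "info-1", "user-1"),
    ServiceRecord("loan_audit", "info-3", "user-1"),
    ServiceRecord("loan_audit", "info-9", "user-7"),
]

# Assumed upstream/downstream topology of the application program.
TOPOLOGY = {
    "loan_application": {
        "upstream": ["card_management"],
        "downstream": ["loan_audit"],
    },
}


def association_query(record, records=RECORDS, topology=TOPOLOGY):
    """Return records in upstream/downstream systems that share the link key."""
    neighbours = topology.get(record.system, {})
    linked = set(neighbours.get("upstream", [])) | set(neighbours.get("downstream", []))
    return [
        r for r in records
        if r.system in linked and r.user_id == record.user_id
    ]


info_1 = RECORDS[1]
associated = association_query(info_1)
```

Here `associated` holds the records for service information 2 and service information 3, while the unrelated record of another user is left out.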
In one embodiment, as shown in fig. 3, the electronic device may include the following functional modules: the system comprises a parameter input module 301, an information free query module 302, an online service troubleshooting module 303 and an emergency processing module 304.
The parameter input module 301 may be configured to obtain externally input service information, fault data, and the like. The parameter input module 301 may also be used to perform security verification on externally input service information or fault data.
In one embodiment, the electronic device may further perform online monitoring and troubleshooting according to service data monitored online. Specifically, the electronic device may monitor, in real time, the application program or the upstream or downstream service systems associated with the application program, and obtain public and private online transaction data in real time. If the electronic device finds a potential safety hazard or abnormal data in the service data monitored online, the online service troubleshooting stage can be started.
As shown in fig. 3, the electronic device may further include an online monitoring module 305 for inputting online monitored service data. The online monitoring module 305 may further be configured to perform association query according to the service information and the fault data to obtain associated service information.
202: The electronic device performs a free-combination query according to the service information, the fault data, and the associated service information, and performs troubleshooting according to preset keywords.
In the embodiment of the application, operation orchestration for troubleshooting system faults can be approached from an active perspective and a passive perspective. The passive perspective refers to troubleshooting performed according to externally acquired service information and fault data. The active perspective refers to freely combining the service information in multiple dimensions and then examining the resulting data to find potential safety hazards, abnormal data, and the like.
Specifically, the electronic device may perform multi-dimensional free combination of different fields drawn from the service information, the fault data, the associated service information, and the like. It then queries the data table generated by the combination and performs troubleshooting according to preset keywords. A keyword may be an error (bug) keyword, an error type keyword, a fault type keyword, or the like. The preset keywords may be preset by the troubleshooting tool, customized or configured by operation and maintenance personnel, or common fault keywords in the technical field, which is not limited in this application.
The electronic device may also automatically generate Structured Query Language (SQL) statements over arbitrary, freely arranged fields drawn from the service information, the fault data, the associated service information, and the like, and perform a keyword check on the results of the generated SQL statements. Alternatively, the electronic device may perform arbitrary multi-dimensional free combination on the logs and other data stored in the system, and then perform a keyword check on the data obtained by the free combination.
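One way to picture the free combination of fields and the subsequent keyword check is the sketch below. It is an illustration under stated assumptions, not the patented implementation: the table name `txn`, the column names, and the keyword list are hypothetical, and an in-memory SQLite database stands in for the real transaction tables. Column names are interpolated into the SQL text only because they come from a fixed list defined in the code.

```python
import sqlite3
from itertools import combinations

FIELDS = ["txn_id", "channel", "status", "message"]  # hypothetical column names
KEYWORDS = ["ERROR", "TIMEOUT"]                      # hypothetical preset keywords


def build_queries(table, fields, max_dims=2):
    """Yield (field combination, SELECT statement) for 1..max_dims fields."""
    for k in range(1, max_dims + 1):
        for combo in combinations(fields, k):
            yield combo, f"SELECT {', '.join(combo)} FROM {table}"


def keyword_hits(conn, table, fields, keywords, max_dims=2):
    """Run every generated query and collect the rows that contain a keyword."""
    hits = set()
    for combo, sql in build_queries(table, fields, max_dims):
        for row in conn.execute(sql):
            text = " ".join(str(value) for value in row).upper()
            for keyword in keywords:
                if keyword in text:
                    hits.add((combo, row, keyword))
    return hits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (txn_id TEXT, channel TEXT, status TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO txn VALUES (?, ?, ?, ?)",
    [
        ("t1", "public", "OK", "done"),
        ("t2", "private", "FAIL", "timeout calling audit"),
    ],
)
queries = list(build_queries("txn", FIELDS))
hits = keyword_hits(conn, "txn", FIELDS, KEYWORDS)
```

With four fields and at most two dimensions, ten queries are generated; only the combinations that include the `message` column surface the timeout in transaction `t2`.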
In one embodiment, the electronic device can freely combine the service information to examine the transaction operation of the application system in multiple dimensions in real time, and can proceed directly to the online service troubleshooting stage as soon as a potential safety hazard or abnormal data is found.
As shown in fig. 3, the information free query module 302 may be used to freely combine multiple fields of the service information, the fault data, and the associated service information. It can also be used to perform a keyword check on the data table obtained by the free combination to obtain a troubleshooting result.
203: The electronic device performs online service troubleshooting and outputs a troubleshooting result.
Specifically, the electronic device may execute an operation orchestration script for troubleshooting. The script performs online service troubleshooting by calling the local transaction table of the application program, the external associated transaction table, the associated system of an upstream or downstream service, the logs of the application program and the associated service program, and the communication-layer middleware of the application program and the associated service program, and then outputs a troubleshooting result.
The communication-layer middleware may specifically be message queue (MQ) middleware. A message queue is based on the first-in, first-out queue, one of the basic data structures, and is generally used for application decoupling, asynchronous messaging, traffic peak shaving, and similar purposes.
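The first-in, first-out behaviour and the decoupling it gives can be illustrated with a small sketch. It uses Python's in-process `queue.Queue` purely as a stand-in for real MQ middleware, which is an assumption made for illustration; the message texts are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel telling the consumer to exit


def consumer(mq, received):
    """Downstream program: handle messages in arrival order."""
    while True:
        message = mq.get()
        if message is STOP:
            break
        received.append(message)


mq = queue.Queue()  # FIFO by construction
received = []
worker = threading.Thread(target=consumer, args=(mq, received))
worker.start()

# Upstream program: enqueue and continue without waiting for the consumer.
messages = ["txn-1 created", "txn-2 created", "txn-3 created"]
for message in messages:
    mq.put(message)
mq.put(STOP)
worker.join()
```

The producer never calls the consumer directly, which is the decoupling property; the consumer still sees the messages in the order they were sent.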
In one embodiment, the electronic device may perform linked checks, by keyword or by transaction time window within the current day, across private and public online transactions, MQ middleware, log files, and the like.
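A linked check of this kind might look like the following sketch, which filters several sources by the same keyword and time window so that related entries can be read side by side. The source names, the entry format of `(timestamp, text)` pairs, and the example data are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def linked_check(sources, keyword, start, end):
    """Return {source name: entries} for entries in [start, end] containing keyword."""
    hits = {}
    for name, entries in sources.items():
        matched = [
            (ts, text) for ts, text in entries
            if start <= ts <= end and keyword.lower() in text.lower()
        ]
        if matched:
            hits[name] = matched
    return hits


day = datetime(2021, 9, 6)  # arbitrary example date
sources = {
    "online_transactions": [
        (day + timedelta(hours=10, minutes=1), "txn-42 status=FAILED"),
    ],
    "mq": [
        (day + timedelta(hours=10, minutes=2), "txn-42 redelivered"),
        (day + timedelta(hours=15), "txn-42 acknowledged"),
    ],
    "log": [
        (day + timedelta(hours=10, minutes=1), "ERROR txn-42 audit call timed out"),
        (day + timedelta(hours=10, minutes=3), "INFO txn-43 ok"),
    ],
}
window_start = day + timedelta(hours=10)
window_end = day + timedelta(hours=10, minutes=5)
hits = linked_check(sources, "txn-42", window_start, window_end)
```

Entries outside the window, such as the 15:00 acknowledgement, and entries for other transactions are dropped, leaving one matching entry per source.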
In one embodiment, the troubleshooting result output by the electronic device may be a health indicator of the application program, for example presented in visual form. Alternatively, the troubleshooting result may be specific fault data or abnormal data, which may include parameters such as the fault type and the fault time.
In the embodiment of the application, an automated troubleshooting tool for the application system is constructed that troubleshoots online transactions through automated operation orchestration during operation and maintenance. It provides functions such as one-key troubleshooting of online transactions, real-time monitoring, and multi-dimensional statistical analysis of maintenance data, helps operation and maintenance personnel locate faults of the application system quickly and accurately, and allows application emergencies to be handled promptly and safely.
In an embodiment, before the electronic device outputs the troubleshooting result, the method may further include:
204: The electronic device performs auxiliary troubleshooting according to the results of a batch status check, statistical analysis, a post-change check, and the like.
As shown in fig. 3, the auxiliary troubleshooting module 306 may specifically include a batch status check module, a statistical analysis module, and a post-change check (check_list) module. The auxiliary troubleshooting module 306 may serve as an auxiliary means of online troubleshooting, that is, it performs health checks and troubleshooting of the application system from other possible perspectives.
The batch status check may refer to checking the batch running status, data files, batch logs, and the like of the application program.
The statistical analysis may refer to the results of compiling statistics on service indicators, technical indicators, and the like of the application program.
The post-change check (check_list) is record information used to check the programs on the private server, the public server, and the server cluster after the application program has been changed. For example, it may record a program update, a parameter check, or a status check performed on a port, a process, or a program of a given server.
In the above possible implementation, the batch status check module, the statistical analysis module, the post-change check (check_list) module, and the like serve as auxiliary means for online troubleshooting, so that health checks of the application system are assisted from other perspectives and operation and maintenance efficiency is improved.
In one embodiment, the electronic device may start emergency processing according to the output troubleshooting result, so as to avoid major problems such as a crash or interruption of the running application program. Therefore, the method provided by the embodiment of the present application may further include:
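The way several auxiliary checks feed one report can be sketched as follows. This is a minimal illustration under stated assumptions: the function run_auxiliary_checks, the check names, and the ok/abnormal/error labels are hypothetical and are not defined by the application.

```python
from typing import Callable, Dict

# An auxiliary check returns True when nothing abnormal is found.
AuxCheck = Callable[[], bool]


def run_auxiliary_checks(checks: Dict[str, AuxCheck]) -> Dict[str, str]:
    """Run every named check and summarize each as ok / abnormal / error."""
    report: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            report[name] = "ok" if check() else "abnormal"
        except Exception as exc:  # one failing probe must not stop the others
            report[name] = f"error: {exc}"
    return report


# Hypothetical stand-ins for the three auxiliary checks named in the text.
report = run_auxiliary_checks({
    "batch_status": lambda: True,
    "statistical_analysis": lambda: False,
    "check_list": lambda: True,
})
print(report)
```

Catching exceptions per check is a design choice of this sketch: an auxiliary probe that itself fails is reported rather than allowed to abort the remaining checks.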
205: the electronic device performs emergency processing on the application program.
The emergency treatment may specifically include at least one of program backup, program restart, or custom emergency treatment program.
Program backup refers to copying data in a file system or a database system so as to recover application programs or business data according to the copied data.
In one embodiment, the electronic device may implement one-key backup through a visual operation and maintenance interface, that is, the user is prompted to perform program backup with a single key press. In this way, the application programs on the public server, the private server, and the batch server or server cluster are backed up.
A program restart refers to shutting down an executable program and starting it again, or starting a new process that runs the same program with the same parameters.
In an embodiment, the electronic device may implement online one-key start and stop through a visual operation and maintenance interface. Specifically, it may stop and start the application programs on the public server, the private server, and the batch server or server cluster, or perform a rolling restart of multiple servers in turn.
In addition, the user may execute a customized emergency treatment program through a parallel processing console (CMD) of the electronic device. For example, when the application program exhibits a garbled interface display, a command for parallel execution on multiple machines may be entered through the CMD console to adjust the interface display in the horizontal and vertical directions, so that the display error or fault is resolved in real time.
Optionally, in an embodiment, before step 203 in the above embodiment of the present application, that is, before the electronic device performs online service troubleshooting, the method may further include: the electronic device performs a background script security audit. That is, before the electronic device executes troubleshooting, the scripts of the troubleshooting operation orchestration are first scanned for high-risk operations, so as to eliminate possible script errors or potential security hazards and to warn of operation risks. For example, if an orchestrated script contains operations that would maliciously disrupt the normal running of the application program, the background script security audit can warn of or modify these high-risk operations in time, further improving operation and maintenance efficiency.
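A rolling restart of several servers in turn might look like the sketch below. It is only an illustration: the function rolling_restart and its stop, start, and is_healthy callbacks are hypothetical, and halting at the first unhealthy server is an assumed policy that the application does not specify.

```python
from typing import Callable, Iterable, List


def rolling_restart(
    servers: Iterable[str],
    stop: Callable[[str], None],
    start: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Restart servers one at a time, halting at the first unhealthy one."""
    restarted: List[str] = []
    for server in servers:
        stop(server)
        start(server)
        if not is_healthy(server):
            raise RuntimeError(f"{server} unhealthy after restart; rollout halted")
        restarted.append(server)
    return restarted


# Usage with in-memory stand-ins instead of real servers.
events: List[str] = []
done = rolling_restart(
    ["public-1", "private-1", "batch-1"],
    stop=lambda s: events.append(f"stop {s}"),
    start=lambda s: events.append(f"start {s}"),
    is_healthy=lambda s: True,
)
print(done)
```

Restarting one server at a time keeps the remaining servers available to serve online transactions while each one is cycled.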
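Parallel execution of one command on multiple machines could be sketched as follows, assuming a caller-supplied transport. The function run_on_all, the run callback, and the command string are hypothetical; how the CMD console actually reaches each server is not described in the application.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_on_all(
    hosts: List[str],
    command: str,
    run: Callable[[str, str], str],
) -> Dict[str, str]:
    """Submit the same command to every host concurrently and collect outputs."""
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        futures = {host: pool.submit(run, host, command) for host in hosts}
        return {host: future.result() for host, future in futures.items()}


def fake_run(host: str, command: str) -> str:
    # Stand-in for a real remote-execution transport (an assumption).
    return f"{host}: ran '{command}'"


outputs = run_on_all(["srv-a", "srv-b"], "refresh-display", fake_run)
print(outputs)
```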
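The scan for high-risk operations can be illustrated with a simple pattern-based audit. This is a minimal sketch under stated assumptions: the pattern list, the reasons, and the function audit_script are hypothetical examples, and a real audit would need a rule set suited to the scripts in use.

```python
import re
from typing import List, Tuple

# Hypothetical high-risk patterns; the application does not enumerate any.
HIGH_RISK_PATTERNS = [
    (re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"),
     "recursive forced delete"),
    (re.compile(r"\bdrop\s+(table|database)\b", re.IGNORECASE),
     "destructive SQL"),
    (re.compile(r"\bkill\s+-9\b"), "forced process kill"),
]


def audit_script(text: str) -> List[Tuple[int, str, str]]:
    """Return (line number, reason, line) for every high-risk match."""
    findings: List[Tuple[int, str, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, reason in HIGH_RISK_PATTERNS:
            if pattern.search(line):
                findings.append((lineno, reason, line.strip()))
    return findings


script = "echo start\nrm -rf /data\nselect 1;\nDROP TABLE t;\n"
for finding in audit_script(script):
    print(finding)
```

An orchestration tool could refuse to run a script with any finding, or show the findings as warnings before execution.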
Based on the above embodiments, the present application further provides a troubleshooting apparatus. As shown in fig. 4, the apparatus 400 includes a parameter input module 401, an information free query module 402, and an online service troubleshooting module 403.
The parameter input module 401 may be configured to obtain service information and fault data of an application program, and perform association query according to the service information and the fault data to obtain associated service information.
The information free query module 402 may be configured to perform free combination according to the service information, the fault data, or the associated service information, and perform troubleshooting.
The online service troubleshooting module 403 may be configured to perform online service troubleshooting on an upstream service program and a downstream service program for the application program, and output a troubleshooting result.
In an embodiment, the online service troubleshooting module 403 is specifically configured to invoke at least one of a local transaction table of an application program, an external associated transaction table, an associated service program of an upstream service or a downstream service, a Log of the application program and the associated service program, and a communication plane middleware of the application program and the associated service program, to perform troubleshooting.
In one embodiment, the communication plane middleware comprises message queue (MQ) middleware.
In one embodiment, the apparatus 400 may further include an auxiliary troubleshooting module configured to perform auxiliary troubleshooting according to at least one of a batch status check, a statistical analysis result, or a post-change check.
In an embodiment, the information free query module 402 is further specifically configured to perform multidimensional free combination on different fields in a log of the application program, and to perform a keyword search on the data obtained by the free combination.
In one embodiment, the apparatus 400 further includes an emergency processing module configured to execute emergency processing on the application program according to the troubleshooting result.
In one embodiment, the emergency processing includes at least one of a program backup, a program restart, or a custom emergency treatment program.
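Cross-checking a local transaction table against an external associated transaction table can be sketched with an in-memory SQLite database. Everything specific here is an assumption made for illustration: the table names local_txn and external_txn, their columns, the sample rows, and the rule that any status other than SUCCESS is suspicious.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE local_txn (txn_id TEXT PRIMARY KEY, status TEXT);
CREATE TABLE external_txn (txn_id TEXT, upstream_system TEXT, status TEXT);
INSERT INTO local_txn VALUES ('T001', 'FAILED'), ('T002', 'SUCCESS');
INSERT INTO external_txn VALUES
    ('T001', 'gateway', 'TIMEOUT'),
    ('T002', 'gateway', 'SUCCESS');
""")


def find_suspect_transactions(db: sqlite3.Connection):
    """Join both tables on the transaction ID and keep non-successful rows."""
    return db.execute("""
        SELECT l.txn_id, l.status, e.upstream_system, e.status
        FROM local_txn AS l
        JOIN external_txn AS e ON e.txn_id = l.txn_id
        WHERE l.status <> 'SUCCESS' OR e.status <> 'SUCCESS'
        ORDER BY l.txn_id
    """).fetchall()


suspects = find_suspect_transactions(conn)
print(suspects)
```

Joining on a shared transaction identifier lets one query show whether a failure originated locally or in an upstream or downstream service program.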
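One possible reading of multidimensional free combination followed by a keyword search is sketched below: the caller combines any subset of log fields as exact-match conditions and then searches the message text. The function query_logs, the field names, and the sample records are hypothetical and are not taken from the application.

```python
from typing import Dict, List, Optional


def query_logs(
    records: List[Dict[str, str]],
    conditions: Optional[Dict[str, str]] = None,
    keyword: str = "",
) -> List[Dict[str, str]]:
    """Keep records matching every field condition and containing the keyword."""
    conditions = conditions or {}
    matched = []
    for record in records:
        fields_match = all(record.get(f) == v for f, v in conditions.items())
        if fields_match and keyword in record.get("message", ""):
            matched.append(record)
    return matched


records = [
    {"time": "2021-09-06 10:30:01", "server": "public-1", "level": "ERROR",
     "message": "MQ send timeout"},
    {"time": "2021-09-06 10:30:02", "server": "private-1", "level": "ERROR",
     "message": "db connection refused"},
    {"time": "2021-09-06 10:30:03", "server": "public-1", "level": "INFO",
     "message": "MQ send ok"},
]

hits = query_logs(records, {"server": "public-1", "level": "ERROR"}, keyword="timeout")
print(hits)
```

Because the conditions are an arbitrary dictionary, the same function serves a query on one field, on several fields at once, or on the keyword alone.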
It should be noted that, for the specific implementation process and embodiment of the apparatus 400, reference may be made to the steps executed by the electronic device in the foregoing method embodiment and the related description, and the technical problem to be solved and the technical effect brought about may also refer to the content described in the foregoing embodiment, which is not described herein again.
In this embodiment, the apparatus may be presented in the form of functional modules divided in an integrated manner. A "module" herein may refer to a specific circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or other devices that can provide the functionality described above. In a simple embodiment, a person skilled in the art will appreciate that the above-described apparatus may take the form shown in fig. 5.
Other embodiments of the present application provide an electronic device, which may include: a memory and one or more processors, the memory and processors coupled. The memory is for storing computer program code comprising computer instructions. When the processor executes the computer instructions, the electronic device may perform the various functions or steps of the above-described method embodiments. The structure of the electronic device may refer to the structure of the electronic device 500 shown in fig. 5.
Fig. 5 is a schematic structural diagram of an exemplary electronic device 500 shown in an embodiment of the present application, configured to execute the method executed by the electronic device in the foregoing embodiment. As shown in fig. 5, the electronic device 500 may include at least one processor 501, a communication link 502, and a memory 503.
The processor 501 may be a central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits.
Communication link 502, which may be, for example, a bus, may include a path for communicating information between the aforementioned components.
The memory 503 may be, but is not limited to, a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, and the like), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be separate and coupled to the processor via the communication link 502. The memory may also be integral to the processor. The memory provided by the embodiment of the application is generally a nonvolatile memory. The memory 503 is used for storing computer program instructions related to the implementation of the solution of the embodiment of the present application, and their execution is controlled by the processor 501. The processor 501 is configured to execute the computer program instructions stored in the memory 503, thereby implementing the methods provided by the embodiments of the present application.
Optionally, the computer program instructions in the embodiments of the present application may also be referred to as application program codes, which are not specifically limited in the embodiments of the present application.
In a specific implementation, as an embodiment, the processor 501 may include one or more CPUs, such as CPU0 and CPU1 in fig. 5.
In a specific implementation, as an embodiment, the electronic device 500 may include multiple processors, such as the processor 501 and the processor 507 in fig. 5. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In a specific implementation, as an embodiment, the electronic device 500 may also include a communication interface 504. The electronic device may receive and transmit data through the communication interface 504, or communicate with other devices or a communication network. The communication interface 504 may be, for example, an Ethernet interface, a radio access network (RAN) interface, a wireless local area network (WLAN) interface, or a USB interface.
In a specific implementation, as an embodiment, the electronic device 500 may also include an output device 505 and an input device 506. The output device 505 is in communication with the processor 501 and may display information in a variety of ways. For example, the output device 505 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, or the like. The input device 506 is in communication with the processor 501 and may receive user input in a variety of ways. For example, the input device 506 may be a mouse, a keyboard, a touch screen device, or a sensing device, among others.
In a specific implementation, the electronic device 500 may be a desktop, a laptop, a web server, a Personal Digital Assistant (PDA), a mobile phone, a tablet, a wireless terminal device, an embedded device, or a device with a similar structure as in fig. 5. The embodiment of the present application does not limit the type of the electronic device 500.
In some embodiments, the processor 501 in fig. 5 may cause the electronic device 500 to perform the methods in the above-described method embodiments by calling computer program instructions stored in the memory 503.
Illustratively, the functions/implementation of the processing modules in fig. 4 may be implemented by the processor 501 in fig. 5 calling computer program instructions stored in the memory 503.
In an exemplary embodiment, a computer-readable storage medium comprising instructions is also provided, the instructions being executable by the processor 501 of the electronic device 500 to perform the method of the above-described embodiments. For the technical effects obtained, reference may be made to the method embodiments, which are not described herein again.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented using a software program, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are generated in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device.
The embodiment of the present application further provides a computer storage medium, where the computer storage medium includes computer instructions, and when the computer instructions are run on the electronic device, the electronic device is enabled to execute each function or step executed by the electronic device in the foregoing method embodiment.
The embodiment of the present application further provides a computer program product, which when running on a computer, causes the computer to execute each function or step executed by the electronic device in the above method embodiments.
Through the description of the above embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative. For example, the division of the modules or units is only one logical functional division, and there may be other divisions in an actual implementation: a plurality of units or components may be combined or integrated into another device, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in an electrical, mechanical, or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part thereof that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that: the above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A troubleshooting method, the method comprising:
acquiring service information and fault data of an application program, and performing association query according to the service information and the fault data to obtain associated service information;
freely combining according to the service information, the fault data or the associated service information, and performing fault troubleshooting;
and performing online service troubleshooting of upstream and downstream service programs on the application program, and outputting a troubleshooting result.
2. The method according to claim 1, wherein performing online service troubleshooting of upstream and downstream service programs on the application program specifically comprises:
calling at least one of a local transaction table and an external associated transaction table of the application program, an associated service program of an upstream service or a downstream service, a log of the application program and the associated service program, and communication plane middleware of the application program and the associated service program to perform troubleshooting.
3. The method according to claim 1 or 2, characterized in that the communication plane middleware comprises message queue (MQ) middleware.
4. The method of claim 1 or 2, wherein prior to outputting the troubleshooting results, the method further comprises:
performing auxiliary troubleshooting according to at least one of a batch status check, a statistical analysis result, or a post-change check.
5. The method of claim 1 or 2, wherein before performing online service troubleshooting of the upstream and downstream service programs on the application program, the method further comprises:
performing multidimensional free combination on different fields in the log of the application program, and performing a keyword search on the data obtained by the free combination.
6. The method according to claim 1 or 2, characterized in that the method further comprises:
executing emergency treatment on the application program according to the troubleshooting result.
7. The method of claim 6, wherein the emergency treatment comprises at least one of a program backup, a program restart, or a custom emergency treatment program.
8. A troubleshooting apparatus, the apparatus comprising:
the parameter input module is used for acquiring service information and fault data of an application program and performing association query according to the service information and the fault data to obtain associated service information;
the information free query module is used for carrying out free combination according to the service information, the fault data or the associated service information and carrying out fault troubleshooting;
and the online service troubleshooting module is used for performing online service troubleshooting on the upstream and downstream service programs on the application program and outputting a troubleshooting result.
9. The apparatus of claim 8, wherein the online service troubleshooting module is specifically configured to perform troubleshooting by invoking a local transaction table of the application program, an external associated transaction table, an associated service program of an upstream service or a downstream service, a Log of the application program and the associated service program, and at least one of communication plane middleware of the application program and the associated service program.
10. The apparatus according to claim 8 or 9, wherein the communication plane middleware comprises message queue (MQ) middleware.
11. The apparatus of claim 8 or 9, further comprising:
an auxiliary troubleshooting module configured to perform auxiliary troubleshooting according to at least one of a batch status check, a statistical analysis result, or a post-change check.
12. The apparatus according to claim 8 or 9, wherein the information free query module is further configured to perform multidimensional free combination on different fields in the log of the application program, and to perform a keyword search on the data obtained by the free combination.
13. The apparatus of claim 8 or 9, further comprising:
an emergency processing module configured to execute emergency processing on the application program according to the troubleshooting result.
14. The apparatus of claim 12, wherein the emergency treatment comprises at least one of a program backup, a program restart, or a custom emergency treatment program.
15. An electronic device, characterized in that the electronic device comprises:
a processor and a transmission interface; wherein the processor is configured to execute instructions stored in the memory to implement the method of any one of claims 1 to 7.
16. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any of claims 1-7.
CN202111040339.0A 2021-09-06 2021-09-06 Troubleshooting method and device Pending CN113760579A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111040339.0A CN113760579A (en) 2021-09-06 2021-09-06 Troubleshooting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111040339.0A CN113760579A (en) 2021-09-06 2021-09-06 Troubleshooting method and device

Publications (1)

Publication Number Publication Date
CN113760579A true CN113760579A (en) 2021-12-07

Family

ID=78793240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111040339.0A Pending CN113760579A (en) 2021-09-06 2021-09-06 Troubleshooting method and device

Country Status (1)

Country Link
CN (1) CN113760579A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114422338A (en) * 2022-03-29 2022-04-29 浙江网商银行股份有限公司 Fault influence analysis method and device
CN114422338B (en) * 2022-03-29 2022-08-26 浙江网商银行股份有限公司 Fault influence analysis method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination