CN118193265A - Fault detection method and related equipment - Google Patents

Fault detection method and related equipment


Publication number
CN118193265A
Authority
CN
China
Prior art keywords
transaction
thread
type
application program
acquiring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410360861.4A
Other languages
Chinese (zh)
Inventor
郑肇义
李承文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202410360861.4A priority Critical patent/CN118193265A/en
Publication of CN118193265A publication Critical patent/CN118193265A/en
Pending legal-status Critical Current

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The application provides a fault detection method and related equipment, and relates to the technical field of software. The method comprises the following steps: acquiring the running time of a thread of an application program executing a first transaction, and determining the type of the first transaction; acquiring configuration information associated with the type, and determining a duration threshold of the first transaction according to the configuration information; when the running time is greater than or equal to the duration threshold, determining that a dead loop or long transaction has occurred in the application program, and acquiring a processing strategy associated with the type; and processing the thread according to the processing strategy. The method and the related equipment reduce the repair time of the application program.

Description

Fault detection method and related equipment
Technical Field
The present application relates to the field of software technologies, and in particular, to a fault detection method and related devices.
Background
In the process of executing financial transactions by an application program, it is necessary to determine whether a dead loop or a long transaction occurs in the application program.
In an exemplary technique, whether a dead loop or long transaction has occurred in an application program is determined from the memory usage of the application program.
However, after a dead loop or long transaction is detected through memory usage, the problem must be reproduced in a test environment, the defect in the application program must be located by debugging with additional breakpoints, and a processing strategy must then be devised based on the defect. As a result, the repair time of the application program is long.
Disclosure of Invention
The application provides a fault detection method and related equipment, which are used for solving the problem of long repair time of an application program.
In a first aspect, the present application provides a fault detection method, including:
Acquiring the running time of executing a first transaction by a thread of an application program, and determining the type of the first transaction, wherein the running time is used for indicating the interval time between the current time point and the starting time point of executing the first transaction;
Acquiring configuration information associated with the type, and determining a duration threshold value of the first transaction according to the configuration information;
When the running time is greater than or equal to the duration threshold, determining that a dead loop or long transaction has occurred in the application program, and acquiring a processing strategy associated with the type;
And processing the thread according to the processing strategy, wherein the processing strategy comprises controlling the thread to stop running, releasing the resources of the thread, recording the abnormal information of the thread and/or outputting the abnormal execution information of the thread.
In one possible design, before the step of obtaining the configuration information associated with the type, the method further includes:
After the version of the application program corresponding to the thread is updated, acquiring each second transaction completed after the application program is updated, and acquiring thread execution information of each second transaction, wherein the type of each second transaction is the same as the type of the first transaction;
determining a third transaction among the second transactions according to the thread execution information, and determining the ratio of the number of third transactions to the number of second transactions, wherein a third transaction is a second transaction in which abnormal thread execution occurred;
When the ratio is larger than a preset ratio, setting an associated duration threshold value for the type corresponding to the second transaction, and generating configuration information associated with the type according to the duration threshold value.
In one possible design, the step of obtaining the running time of the thread of the application program to execute the first transaction includes:
Determining a location of a current node that processes the first transaction in a responsibility processing chain, the responsibility processing chain consisting of a plurality of nodes;
Determining a target sub-transaction to be processed by the current node in each sub-transaction of the first transaction according to the position;
when the target sub-transaction is a preset transaction, acquiring the running time for which a thread of the application program in the current node has been executing the target sub-transaction, and using it as the running time of the thread executing the first transaction, wherein a preset transaction is a sub-transaction whose probability of a dead loop occurring is greater than a preset probability.
In one possible design, the step of obtaining the running time of the thread of the application program to execute the first transaction includes:
acquiring a transaction identifier corresponding to the first transaction;
and when the transaction identifier is positioned on the white list, acquiring the running time of executing the first transaction by the thread of the application program.
In one possible design, the step of obtaining the running time of the thread of the application program to execute the first transaction includes:
acquiring information acquired by an abstract processor, wherein the abstract processor is used for acquiring the running information of the threads of the application program;
And acquiring the running time of executing the first transaction by the thread of the application program according to the acquired information.
In one possible design, the step of obtaining information collected by the abstract processor includes:
Acquiring configuration information of the abstract processor when a transaction request of the first transaction is detected;
And when the configuration information of the abstract processor indicates that the abstract processor is in an open (enabled) state, acquiring the information collected by the abstract processor.
In one possible design, when the type of the first transaction is a query type, the processing policy includes controlling the thread to stop running and releasing resources of the thread;
when the type of the first transaction is a maintenance type, the processing strategy comprises recording abnormal information of the thread;
when the type of the first transaction is a specified type, the processing policy includes outputting abnormal execution information of the thread.
In a second aspect, the present application provides a fault detection device comprising:
a first acquisition module, configured to acquire the running time of a thread of an application program executing a first transaction and to determine the type of the first transaction, wherein the running time indicates the interval between the current time point and the start time point of executing the first transaction;
a second acquisition module, configured to acquire the configuration information associated with the type and to determine a duration threshold of the first transaction according to the configuration information;
a third acquisition module, configured to determine that a dead loop or long transaction has occurred in the application program when the running time is greater than or equal to the duration threshold, and to acquire the processing strategy associated with the type;
a processing module, configured to process the thread according to the processing strategy, wherein the processing strategy comprises controlling the thread to stop running, releasing the resources of the thread, recording the abnormal information of the thread and/or outputting the abnormal execution information of the thread.
In a third aspect, an embodiment of the present application provides an electronic device, including: at least one processor and memory; the memory stores computer-executable instructions; the at least one processor executes the computer-executable instructions stored by the memory, causing the at least one processor to perform the fault detection method as described above in the first aspect and the various possible designs of the first aspect.
In a fourth aspect, an embodiment of the present application provides a responsibility processing chain system, where the responsibility processing chain system includes a plurality of nodes connected in sequence, and any of the nodes is configured to perform the fault detection method as described in the first aspect and the various possible designs of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer readable storage medium, where computer executable instructions are stored, which when executed by a processor, implement the fault detection method according to the first aspect and the various possible designs of the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements the fault detection method according to the first aspect and the various possible designs of the first aspect.
According to the fault detection method and related equipment provided by the application, the running time of a thread of the application program executing a transaction is acquired, the type of the transaction is determined, and the duration threshold of the transaction is determined based on the configuration information associated with that type. When the running time is greater than or equal to the duration threshold, it can be determined that a dead loop or long transaction has occurred in the application program, and the processing strategy associated with the type is acquired so that the thread is processed based on it. By monitoring the threads of the application program, the faulty thread in the application program can be located directly, which shortens the fault location time and therefore the repair time of the application program. Furthermore, processing strategies such as controlling the thread to stop running, releasing the resources of the thread, recording the abnormal information of the thread and/or outputting the abnormal execution information of the thread are provided directly, so a strategy for repairing the fault is already available and does not have to be devised after the fault is located in a test environment, which further reduces the repair time of the application program.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic diagram of the architecture of a responsibility processing chain system to which the present application is applicable;
Fig. 2 is a schematic flow chart of a first embodiment of a fault detection method according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of a second embodiment of a fault detection method according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of a third embodiment of a fault detection method according to an embodiment of the present application;
Fig. 5 is a schematic flow chart of a fourth embodiment of a fault detection method according to an embodiment of the present application;
Fig. 6 is a schematic flow chart of a fifth embodiment of a fault detection method according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a fault detection device according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with related laws and regulations and standards, and provide corresponding operation entries for the user to select authorization or rejection.
In the process of executing financial transactions by an application program, it is necessary to determine whether a dead loop or a long transaction occurs in the application program.
In an exemplary technique, whether a dead loop or long transaction has occurred in an application program is determined from the memory usage of the application program.
However, after a dead loop or long transaction is detected through memory usage, the problem must be reproduced in a test environment, the defect in the application program must be located by debugging with additional breakpoints, and a processing strategy must then be devised based on the defect. As a result, the repair time of the application program is long.
To address the above technical problem, the application proposes the following technical concept: by monitoring threads, which are the minimum unit by which the application program executes transactions, the faulty thread can be found directly, so that the fault of the application program is located quickly through the faulty thread, which reduces the fault location time and thus the repair time of the application program. In addition, after the faulty thread is determined, it is handled by processing strategies such as controlling the thread to stop running, releasing the resources of the thread, recording the abnormal information of the thread and/or outputting the abnormal execution information of the thread, so a handling scheme for the faulty thread is given directly; there is no need to first find the fault in a test environment and then devise a handling scheme based on it, which further reduces the repair time of the application program.
Referring to Fig. 1, Fig. 1 is a schematic diagram of the architecture of a responsibility processing chain system according to the present application. The responsibility processing chain system 100 includes a plurality of nodes connected in sequence; a node may be a client of an application program for executing transactions or a terminal device loaded with the application program. Illustratively, referring to Fig. 1, the responsibility processing chain system 100 includes a node 110, a node 120, and a node 130. Further, each node is provided with an abstract processor: node 110 is provided with an abstract processor 111, node 120 with an abstract processor 121, and node 130 with an abstract processor 131. The abstract processor may also be an external device, in which case each node is connected to one abstract processor. The abstract processor collects information about the threads of the application program in the node where it is located.
The terms involved in the present application are explained below.
Long transaction: in a financial system, generally a transaction whose processing time during system operation exceeds a preset timeout period.
Dead loop (infinite loop): a situation in which a program in the system cannot terminate under its own control.
Query-type transactions: transactions of the query class among financial transactions, including but not limited to queries for bills, customer information, account information, and the like.
Maintenance-type transactions: transactions of the maintenance class among financial transactions, including transfers, repayments, and the like.
Thread: the smallest unit of execution that the operating system can schedule; it is contained in a process and is the actual unit of operation within the process.
Responsibility processing chain: a chain formed by a plurality of objects (nodes) connected in sequence; a request is passed along the chain until an object on the chain decides to process it.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application are described below with reference to the accompanying drawings.
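As a minimal sketch of the responsibility processing chain defined above, the following Python fragment passes a request along sequentially connected nodes until one of them decides to process it. The `Node` class, the `can_handle` predicate, and the request strings are hypothetical names introduced only for illustration; they are not taken from the application.

```python
class Node:
    """One object (node) on the chain; all names here are illustrative assumptions."""

    def __init__(self, name, can_handle, next_node=None):
        self.name = name
        self.can_handle = can_handle  # predicate: does this node process the request?
        self.next_node = next_node    # next node on the chain, or None at the end

    def handle(self, request):
        # Process the request here if this node decides to; otherwise pass it on.
        if self.can_handle(request):
            return self.name
        if self.next_node is None:
            return None  # no node on the chain processed the request
        return self.next_node.handle(request)


# Three nodes connected in sequence, mirroring nodes 110, 120 and 130 of Fig. 1.
node_130 = Node("node 130", lambda request: request == "settle")
node_120 = Node("node 120", lambda request: request == "validate", node_130)
node_110 = Node("node 110", lambda request: request == "receive", node_120)

handled_by = node_110.handle("validate")
```

Here the request enters at node 110, is passed on, and is processed by node 120.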
Based on the application scenario shown in fig. 1, the embodiment of the application further provides a fault detection method. Fig. 2 is a schematic flow chart of a first embodiment of a fault detection method according to an embodiment of the present application, where the fault detection method includes:
step S201, acquiring an operation duration of executing the first transaction by the thread of the application program, and determining a type of the first transaction, where the operation duration is used to indicate an interval duration between a current time point and a start time point of executing the first transaction.
In this embodiment, the execution body is a fault detection device, referred to below simply as the device for convenience of description. The device may be a client of the application program or a terminal device loaded with the application program. Furthermore, the device may be any node in the responsibility processing chain. The application program refers to an application that can conduct financial transactions.
When running on the device, the application program can be regarded as a process, and a process includes one or more threads. When the application program executes a transaction, the transaction is executed by a thread, which is the smallest scheduling unit.
The device may acquire, in real time or at a fixed time, a running time of the thread to execute the transaction, where the executed transaction is defined as a first transaction, and the running time refers to an interval time between a current time point and a start time point of executing the first transaction.
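The running time described above can be sketched as the difference between the current time point and the recorded start time point. The `TransactionRun` structure and its field names below are assumptions made for illustration, and a monotonic clock is chosen here only because it is not affected by wall-clock adjustments; the application does not specify a clock source.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TransactionRun:
    """Bookkeeping for one thread executing one transaction (hypothetical structure)."""

    thread_id: int
    transaction_code: str
    # Start time point of executing the transaction, in seconds.
    start: float = field(default_factory=time.monotonic)

    def running_time(self, now=None):
        """Interval between the current time point and the start time point."""
        current = time.monotonic() if now is None else now
        return current - self.start


# Fixed time points are passed in so that the example is deterministic.
run = TransactionRun(thread_id=1, transaction_code="C0001", start=100.0)
elapsed = run.running_time(now=102.5)
```

In a real monitor, `running_time()` would be called without arguments, either continuously or on a fixed schedule.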
Transactions are of various types, for example maintenance-type transactions, query-type transactions, and the like. The device determines the type of the first transaction as follows: the device receives a transaction request carrying a transaction code of the first transaction, and determines the type of the first transaction from the transaction code. For example, if the first letter of the transaction code is W, the first transaction is a maintenance-type transaction; if the first letter of the transaction code is C, the first transaction is a query-type transaction.
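The mapping from transaction code to transaction type in the example above can be sketched as a prefix lookup. The letters W and C come from the description; the function name, the type labels, and the sample codes are assumptions, as is returning `None` for codes with an unrecognized first letter.

```python
from typing import Optional

# First letter of the transaction code -> transaction type (W and C per the description).
TYPE_BY_FIRST_LETTER = {
    "W": "maintenance",
    "C": "query",
}


def transaction_type(transaction_code: str) -> Optional[str]:
    """Return the transaction type for a code, or None if the first letter is unknown."""
    if not transaction_code:
        return None
    return TYPE_BY_FIRST_LETTER.get(transaction_code[0])


maintenance = transaction_type("W1001")
query = transaction_type("C2002")
```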
Step S202, acquiring configuration information of type association, and determining a duration threshold of the first transaction according to the configuration information.
Configuration information is provided for each type of transaction and may be entered manually. The configuration information includes a duration threshold, which is the maximum transaction duration for that type of transaction. Because different types of transactions differ in flow and complexity, their maximum transaction durations differ, and so do the duration thresholds in their associated configuration information. The device stores the configuration information associated with each type, and reads from its storage area the configuration information associated with the type to which the first transaction belongs.
Step S203, when the running time is greater than or equal to the time threshold, determining that the application program has a dead loop or long transaction, and acquiring a processing strategy associated with the type.
After the duration threshold is obtained, the device judges whether the running time of the thread is greater than or equal to the duration threshold. If so, it may be determined that a dead loop or long transaction has occurred in the application program.
The device handles dead loops or long transactions differently depending on the type of transaction that caused them. The device stores the processing policies associated with the different types, and reads from its storage area the processing policy associated with the type of the first transaction. Alternatively, the processing policy may be recorded in the configuration information associated with the type to which the first transaction belongs, in which case the device obtains the processing policy directly from that configuration information.
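Steps S202 and S203 can be sketched as a lookup of the per-type configuration followed by a greater-than-or-equal comparison. The threshold values, the key name `duration_threshold_s`, and the decision not to flag types without configuration are assumptions made for this sketch, not values given in the application.

```python
# Per-type configuration information; the thresholds (in seconds) are made-up examples.
CONFIG_BY_TYPE = {
    "query": {"duration_threshold_s": 5.0},
    "maintenance": {"duration_threshold_s": 30.0},
}


def duration_threshold(tx_type):
    """Return the duration threshold configured for a type, or None if unconfigured."""
    config = CONFIG_BY_TYPE.get(tx_type)
    return None if config is None else config["duration_threshold_s"]


def is_dead_loop_or_long_transaction(running_time_s, tx_type):
    """True when the running time is greater than or equal to the type's threshold."""
    threshold = duration_threshold(tx_type)
    if threshold is None:
        return False  # assumption: types without configuration are not monitored
    return running_time_s >= threshold


at_threshold = is_dead_loop_or_long_transaction(5.0, "query")
below_threshold = is_dead_loop_or_long_transaction(4.9, "query")
```

Note that a running time exactly equal to the threshold is already treated as a fault, matching the "greater than or equal to" condition of step S203.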
In one example, the processing policy includes controlling the thread to stop running, freeing resources of the thread, recording exception information for the thread, and/or outputting exception execution information for the thread.
In another example, when the type of the first transaction is a query type, the processing policy includes controlling the thread to stop running and releasing resources of the thread; when the type of the first transaction is a maintenance type, the processing policy includes recording abnormal information of the thread; when the type of the first transaction is a specified type, the processing policy includes outputting abnormal execution information of the thread. Recording the abnormal information of the thread may consist, for example, of recording the stack information of the thread and sending it to a message queue, so that the thread is recorded as abnormal asynchronously, via the message queue, based on the stack information. Outputting the abnormal execution information of the thread may consist of directly displaying that the thread has failed, or of sending a short message (SMS) prompt indicating that a dead loop or long transaction has occurred in the application program.
It will be appreciated that the device can identify different transactions and handle them in a targeted manner, that is, the way each type of transaction is to be handled is configurable. A thread executing a query-type long transaction can be killed and its resources released; for a thread executing a maintenance-type transaction, the stack information (abnormal execution information) of the thread can be recorded asynchronously; and for a thread executing an important transaction, an alarm can be output.
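Recording a thread's stack information and handing it to a queue for asynchronous processing might look like the sketch below. It is only an illustration under stated assumptions: an in-process `queue.Queue` stands in for the message queue, `record_stack` is a hypothetical helper name, and `sys._current_frames()` is a CPython-specific function intended for debugging use.

```python
import queue
import sys
import threading
import traceback

# In-process stand-in for the message queue mentioned in the description (assumption).
stack_queue = queue.Queue()


def record_stack(thread_ident):
    """Capture the current stack of the given thread and enqueue it for a consumer."""
    frame = sys._current_frames().get(thread_ident)
    if frame is None:
        return False  # the thread no longer exists
    stack_text = "".join(traceback.format_stack(frame))
    stack_queue.put((thread_ident, stack_text))
    return True


# For demonstration, record the stack of the calling thread itself.
recorded = record_stack(threading.get_ident())
ident, stack_text = stack_queue.get_nowait()
```

A separate consumer would read entries from the queue and mark the corresponding thread as abnormal, so the monitored transaction is not delayed by the recording itself.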
Step S204, the thread is processed according to a processing strategy, wherein the processing strategy comprises controlling the thread to stop running, releasing the resources of the thread, recording the abnormal information of the thread and/or outputting the abnormal execution information of the thread.
After determining the processing policy, the apparatus processes the thread based on the processing policy, in one example, controlling the thread to stop running, releasing resources of the thread, recording exception information for the thread, and/or outputting exception execution information for the thread.
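The type-specific processing described in this step can be sketched as a table of actions per transaction type. All names below (`MonitoredThread`, `POLICY_BY_TYPE`, the action labels) are hypothetical. Because Python cannot forcibly terminate a thread, stopping is modelled here as a cooperative stop flag that the worker is assumed to check; this is a substitute for illustration, not the mechanism of the application.

```python
import threading


class MonitoredThread:
    """Handle for a thread executing a transaction (hypothetical structure)."""

    def __init__(self, name):
        self.name = name
        self.stop_requested = threading.Event()  # cooperative stop flag (assumption)
        self.resources_released = False


# Processing policy associated with each transaction type, per the description.
POLICY_BY_TYPE = {
    "query": ("stop", "release"),
    "maintenance": ("record",),
    "specified": ("alert",),
}


def apply_policy(tx_type, thread, log):
    """Apply every action of the policy for tx_type; return the actions applied."""
    actions = POLICY_BY_TYPE.get(tx_type, ())
    for action in actions:
        if action == "stop":
            thread.stop_requested.set()
        elif action == "release":
            thread.resources_released = True
        elif action == "record":
            log.append(f"abnormal information recorded for {thread.name}")
        elif action == "alert":
            log.append(f"alert: dead loop or long transaction in {thread.name}")
    return actions


log = []
query_thread = MonitoredThread("query-worker")
maintenance_thread = MonitoredThread("maintenance-worker")
apply_policy("query", query_thread, log)
apply_policy("maintenance", maintenance_thread, log)
```

Keeping the policy in a table means that a new transaction type can be given its own handling by adding a configuration entry rather than changing the dispatch code.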
In this embodiment, the running time of a thread of the application program executing a transaction is acquired, the type of the transaction is determined, and the duration threshold of the transaction is determined based on the configuration information associated with that type. When the running time is greater than or equal to the duration threshold, it can be determined that a dead loop or long transaction has occurred in the application program, and the processing strategy associated with the type is acquired so that the thread is processed based on it. By monitoring the threads of the application program, the faulty thread in the application program can be located directly, which shortens the fault location time and therefore the repair time of the application program. Furthermore, processing strategies such as controlling the thread to stop running, releasing the resources of the thread, recording the abnormal information of the thread and/or outputting the abnormal execution information of the thread are provided directly, so a strategy for repairing the fault is already available and does not have to be devised after the fault is located in a test environment, which further reduces the repair time of the application program.
Fig. 3 is a flowchart of a second embodiment of a fault detection method according to an embodiment of the present application, based on the first embodiment, before step S201, further includes:
Step S301, after the version of the application program corresponding to the thread is updated, acquiring each second transaction completed after the update of the application program, and acquiring thread execution information of each second transaction, where the type of each second transaction is the same as the type of the first transaction.
In this embodiment, after the version of the application program is updated, a dead loop or long transaction may occur when the application program executes transactions, and the application program then needs to be repaired; the configuration information therefore needs to be set in the device in advance.
In this regard, after the version of the application program is updated, each transaction completed after the update of the application program is acquired, the transaction is defined as a second transaction, and the type of each second transaction is the same as the type of the first transaction. The second transaction has corresponding execution information, and the device acquires the execution information of the thread in the execution information, namely, acquires the thread execution information.
Step S302, determining a third transaction among the second transactions according to the thread execution information, and determining the ratio of the number of third transactions to the number of second transactions, wherein a third transaction is a second transaction in which abnormal thread execution occurred.
After the thread execution information is obtained, the device determines, based on the thread execution information, whether each second transaction is a third transaction; that is, the device treats every second transaction in which abnormal thread execution occurred as a third transaction. The device then determines the ratio of the number of third transactions to the number of second transactions.
Step S303, when the ratio is larger than the preset ratio, setting an associated duration threshold for the type corresponding to the second transaction, and generating configuration information of the type association according to the duration threshold.
After determining the ratio, the device determines whether the ratio is greater than the preset ratio. If it is, it may be determined that dead loops or long transactions occur frequently when the application program executes transactions of the type of the first transaction, so an associated duration threshold needs to be set for the type corresponding to the second transactions, and configuration information associated with the type is generated based on the duration threshold, that is, the configuration information includes the duration threshold. After the configuration information associated with the type has been generated, the type and the configuration information are stored in association with each other. It should be noted that configuration information for other types may be set in the same manner.
In this embodiment, after the version of the application program is updated, the transaction types to be monitored can be dynamically added or removed in the production environment through the configuration information, which increases the flexibility with which the device monitors the application program and allows the device to be used for emergency handling in the production environment.
In this embodiment, when dead loops or long transactions occur frequently while the application program executes a certain type of transaction, configuration information is set for that type of transaction. Because configuration information is set for the transactions in which dead loops or long transactions occur frequently, the device can locate faults of the application program more quickly.
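Steps S302 and S303 amount to counting the abnormal (third) transactions among the completed (second) transactions and generating configuration information only when their share exceeds the preset ratio. In the sketch below, the thread execution information is reduced to one boolean per second transaction, and the function name, parameter names, and numeric values are assumptions for illustration.

```python
def build_config(abnormal_flags, preset_ratio, duration_threshold_s):
    """Return configuration information for a type, or None if none is warranted.

    abnormal_flags holds one boolean per second transaction: True means that
    abnormal thread execution occurred, i.e. the transaction is a third transaction.
    """
    if not abnormal_flags:
        return None  # no completed second transactions to evaluate
    third_count = sum(1 for abnormal in abnormal_flags if abnormal)
    ratio = third_count / len(abnormal_flags)
    if ratio > preset_ratio:
        return {"duration_threshold_s": duration_threshold_s}
    return None


# Two of four second transactions were abnormal: ratio 0.5 exceeds the preset 0.25.
config = build_config([True, False, False, True], preset_ratio=0.25,
                      duration_threshold_s=5.0)
```

The comparison is strict, so a ratio exactly equal to the preset ratio does not produce configuration information, in line with the "greater than" condition of step S303.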
Fig. 4 is a flowchart of a third embodiment of the fault detection method according to an embodiment of the present application. Based on the first or second embodiment, step S201 includes:
Step S401, determining a position of a current node processing the first transaction in a responsibility processing chain, the responsibility processing chain being composed of a plurality of nodes.
In this embodiment, the nodes are connected in sequence to form the responsibility processing chain, and each node executes a different part of the first transaction. Illustratively, the first transaction includes sub-transaction 1, sub-transaction 2, and sub-transaction 3, with node 1 in the responsibility processing chain processing sub-transaction 1, node 2 processing sub-transaction 2, and node 3 processing sub-transaction 3.
The application program may be prone to dead loops or long transactions when executing a particular sub-transaction of the first transaction, so the thread executing that sub-transaction is monitored in order to find the fault quickly.
To this end, the device determines the position, in the responsibility processing chain, of the current node processing the first transaction; this position indicates which sub-transaction of the first transaction is processed by the current node.
Step S402, determining a target sub-transaction to be processed by the current node in each sub-transaction of the first transaction according to the position.
The device determines, based on the position, the target sub-transaction to be processed by the current node among the sub-transactions of the first transaction. Illustratively, node 1 of the responsibility processing chain processes the first sub-transaction in the transaction and node 2 processes the second sub-transaction; if the position identifier is 2, the position indicates that the current node is node 2, and the second sub-transaction in the first transaction is the target sub-transaction. The first sub-transaction is executed before the second sub-transaction.
Step S403, when the target sub-transaction is a preset transaction, acquiring the running time for which the thread of the application program in the current node has executed the target sub-transaction, and using it as the running time of the thread executing the first transaction, where the preset transaction indicates a sub-transaction whose probability of a dead loop is greater than a preset probability.
The device stores the sub-transactions to be monitored, which are defined as preset transactions; each indicates a sub-transaction whose probability of a dead loop is greater than the preset probability. The dead loop probability may be the ratio of the number of executions of the sub-transaction in which a dead loop occurred to the total number of executions of that sub-transaction.
After determining the target sub-transaction, the device determines whether it is a preset transaction. If it is, the device acquires the running time for which the thread of the application program in the current node has executed the target sub-transaction, and uses it as the running time of the thread executing the first transaction.
In this embodiment, the device monitors specific sub-transactions in the transaction to quickly determine whether a dead loop or long transaction has occurred in the application program.
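Steps S401 to S403 can be sketched as follows, assuming that the position in the responsibility processing chain is a 1-based index and that the start time of each sub-transaction is already known. The function names and sample data are hypothetical and are not taken from the application.

```python
def target_sub_transaction(sub_transactions, position):
    """Map the current node's 1-based position in the chain to its sub-transaction."""
    return sub_transactions[position - 1]


def running_time_if_monitored(sub_transactions, position, preset, start_times, now):
    """Return the running time of the target sub-transaction, or None if it is not monitored."""
    target = target_sub_transaction(sub_transactions, position)
    if target not in preset:
        # Not a preset transaction: this node does not monitor it.
        return None
    # Running time = current time point minus the start time point.
    return now - start_times[target]


# Hypothetical first transaction split into three sub-transactions.
subs = ["sub-transaction 1", "sub-transaction 2", "sub-transaction 3"]
preset_transactions = {"sub-transaction 2"}   # assumed to be prone to dead loops
starts = {"sub-transaction 2": 100.0}         # assumed start time, in seconds

monitored = running_time_if_monitored(subs, 2, preset_transactions, starts, now=107.5)
skipped = running_time_if_monitored(subs, 1, preset_transactions, starts, now=107.5)
```

Here node 2 handles a preset transaction, so its running time is used as the running time of the thread executing the first transaction, while node 1 is not monitored.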
Fig. 5 is a flowchart of a fourth embodiment of the fault detection method according to an embodiment of the present application. Based on any one of the first to third embodiments, step S201 includes:
Step S501, acquiring a transaction identifier corresponding to the first transaction.
Step S502, when the transaction identifier is on the white list, acquiring the running time of the thread of the application program executing the first transaction.
In this embodiment, the device monitors only the transactions stored in the white list for dead loops. Transactions on the white list may be important transactions specified by the user, or transactions that are prone to causing dead loops or long transactions in the application. The white list stores the transaction identifiers of these transactions, such as transaction codes.
After obtaining the transaction request, the device parses it to obtain the transaction identifier corresponding to the first transaction and determines whether the transaction identifier is on the white list. When it is, the thread executing the first transaction needs to be monitored, and the device acquires the running time of the thread of the application program executing the first transaction in order to determine whether a dead loop or long transaction has occurred in the application program.
It should be noted that the running time of the first transaction may be acquired through steps S401 to S403, or the running time of the thread executing the first transaction may be acquired directly.
In this embodiment, monitoring for dead loops or long transactions is performed only when a transaction on the white list is executed, which greatly reduces the resource consumption of the device.
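The white list check of steps S501 and S502 amounts to a membership test on the transaction identifier parsed from the request. The sketch below assumes that a transaction request is a dictionary carrying a `transaction_code` field; this field name and the sample codes are illustrative assumptions only.

```python
def parse_transaction_id(request):
    """Extract the transaction identifier (here, a transaction code) from a request."""
    return request["transaction_code"]


def should_monitor(request, white_list):
    """Only transactions whose identifier is on the white list are monitored."""
    return parse_transaction_id(request) in white_list


# Hypothetical white list of transaction codes.
white_list = {"TX001", "TX002"}

monitor_first = should_monitor({"transaction_code": "TX001"}, white_list)
monitor_other = should_monitor({"transaction_code": "TX999"}, white_list)
```

Storing the white list as a set keeps the check cheap for every incoming request, which is in line with the stated aim of reducing the resource consumption of the device.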
Fig. 6 is a flowchart of a fifth embodiment of the fault detection method according to an embodiment of the present application. Based on any one of the first to third embodiments, step S201 includes:
Step S601, acquiring information collected by an abstract processor, where the abstract processor is used to collect running information of a thread of the application program.
In this embodiment, the abstract processor DeadLockFinderHandler is provided in the apparatus, or is connected to the apparatus as an external device. After initialization, the abstract processor starts a thread, and the running information of the thread of the application program executing the first transaction is collected through this started thread; that is, the abstract processor is used to collect the running information of the threads of the application program.
Step S602, acquiring the running time of the thread of the application program executing the first transaction according to the collected information.
After the device obtains the information collected by the abstract processor, it can obtain from that information the running time of the thread of the application program executing the first transaction. It should be noted that, in the third embodiment, the running time for which the thread of the application program in the current node executes the target sub-transaction may likewise be obtained from the information collected by the abstract processor. Furthermore, the abstract processor may collect information only for threads executing a target transaction, a target transaction being a transaction recorded in the white list; the white list may therefore store the thread numbers of the threads executing those transactions, so that the abstract processor collects the running information of the threads corresponding to the thread numbers in the white list. In addition, after the device obtains the white list, it takes the transaction code of each transaction in the white list, the stored thread number of the thread, and the time point at which the thread started executing the transaction as thread information, and writes this thread information into a linked list with the transaction code as the key. The abstract processor inspects the threads corresponding to the thread information in the linked list, from which the running time of each thread executing its transaction can be obtained.
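The thread information bookkeeping described above can be sketched as a small registry keyed by transaction code. The application speaks of a linked list keyed by transaction code; this sketch substitutes an ordered dictionary for simplicity, and the names `ThreadInfo`, `ThreadRegistry`, `register` and `inspect` are assumptions, not the actual interface of DeadLockFinderHandler.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class ThreadInfo:
    transaction_code: str  # key under which the entry is stored
    thread_id: int         # thread number of the thread executing the transaction
    start_time: float      # time point at which the thread started the transaction


class ThreadRegistry:
    """Keeps thread information for white-listed transactions only."""

    def __init__(self, white_list):
        self.white_list = set(white_list)
        self.entries = OrderedDict()  # transaction code -> list of ThreadInfo

    def register(self, info):
        if info.transaction_code not in self.white_list:
            return False
        self.entries.setdefault(info.transaction_code, []).append(info)
        return True

    def inspect(self, now):
        """Return the running time of every registered thread at time `now`."""
        return {
            info.thread_id: now - info.start_time
            for infos in self.entries.values()
            for info in infos
        }


registry = ThreadRegistry({"TX001"})
accepted = registry.register(ThreadInfo("TX001", 11, 100.0))
rejected = registry.register(ThreadInfo("TX999", 12, 100.0))
durations = registry.inspect(now=103.0)
```

In a real deployment the inspection would presumably run periodically on the thread started by the abstract processor; here it is called once with an explicit time so that the behaviour is easy to follow.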
After the abstract processor has completed the determination of whether a dead loop or long transaction has occurred in the application program, the invoke method (function) of the scheduler in the device (the current node) is called in the abstract processor to enter the next node of the responsibility processing chain.
In addition, the abstract processor has corresponding configuration information that indicates whether the abstract processor is enabled or disabled; in effect, this configuration information is the master switch of the abstract processor. The device obtains the configuration information of the abstract processor when it detects a transaction request for the first transaction. If the configuration information indicates that the abstract processor is enabled, the abstract processor collects thread information, and the device obtains the collected information to determine whether a dead loop or long transaction has occurred in the application program. If the configuration information indicates that the abstract processor is disabled, the device does not inspect the threads of the application program.
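The master switch can be sketched as a guard around the collection call. The `enabled` key and the `collector` callback are hypothetical stand-ins for the configuration information and the abstract processor's collection routine.

```python
def collect_thread_info(handler_config, collector):
    """Return collected thread info, or None when the handler is switched off."""
    if not handler_config.get("enabled", False):
        # Master switch is off: no thread inspection is performed.
        return None
    return collector()


calls = []


def fake_collector():
    # Stand-in for the abstract processor's collection routine.
    calls.append("collected")
    return {"thread_id": 11, "running_time_s": 3.0}


on_result = collect_thread_info({"enabled": True}, fake_collector)
off_result = collect_thread_info({"enabled": False}, fake_collector)
```

Because the switch is read when a transaction request is detected, changing the configuration takes effect for subsequent requests without restarting the application, under the assumptions of this sketch.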
In this embodiment, the information of the thread executing the first transaction is collected by the abstract processor, so the device does not need to collect the thread information itself, which reduces the resource consumption of the device.
Fig. 7 is a schematic structural diagram of a fault detection device according to an embodiment of the present application. As shown in fig. 7, the fault detection device includes:
A first obtaining module 710, configured to obtain an operation duration of executing the first transaction by the thread of the application program, and determine a type of the first transaction, where the operation duration is used to indicate an interval duration between a current time point and a start time point of executing the first transaction;
a second obtaining module 720, configured to obtain configuration information associated with the type, and determine a duration threshold of the first transaction according to the configuration information;
A third obtaining module 730, configured to determine that a dead loop or long transaction occurs in the application when the running duration is greater than or equal to the duration threshold, and obtain a processing policy associated with the type;
The processing module 740 is configured to process the thread according to a processing policy, where the processing policy includes controlling the thread to stop running, releasing resources of the thread, recording exception information of the thread, and/or outputting exception execution information of the thread.
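The core check carried out by the first to third obtaining modules can be sketched as a comparison between the running time and the type-specific duration threshold. The configuration layout and function names below are illustrative assumptions.

```python
def running_time(start_time, now):
    """Interval between the current time point and the start of the first transaction."""
    return now - start_time


def is_dead_loop_or_long_transaction(start_time, tx_type, type_configs, now):
    """True when the running time reaches the duration threshold configured for tx_type."""
    config = type_configs.get(tx_type)
    if config is None:
        # No configuration for this type: it is not monitored.
        return False
    return running_time(start_time, now) >= config["duration_threshold_s"]


# Hypothetical configuration information associated with transaction types.
type_configs = {"query": {"duration_threshold_s": 5.0}}

over = is_dead_loop_or_long_transaction(100.0, "query", type_configs, now=105.0)
under = is_dead_loop_or_long_transaction(100.0, "query", type_configs, now=104.0)
unmonitored = is_dead_loop_or_long_transaction(100.0, "transfer", type_configs, now=500.0)
```

Note that the comparison is "greater than or equal to", so a running time exactly equal to the threshold is already treated as a dead loop or long transaction.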
In some embodiments, the fault detection device is specifically configured to:
after the version of the application program corresponding to the thread is updated, acquiring each second transaction completed after the update of the application program, and acquiring thread execution information of each second transaction, wherein the type of each second transaction is the same as that of the first transaction;
determining the third transactions among the second transactions according to the thread execution information, and determining the ratio of the number of third transactions to the number of second transactions, where a third transaction is a second transaction in which abnormal thread execution occurred;
when the ratio is greater than a preset ratio, setting an associated duration threshold for the type corresponding to the second transaction, and generating the configuration information associated with the type according to the duration threshold.
In some embodiments, the fault detection device is specifically configured to:
Determining a position of a current node for processing the first transaction in a responsibility processing chain, wherein the responsibility processing chain is composed of a plurality of nodes;
Determining a target sub-transaction to be processed by the current node in each sub-transaction of the first transaction according to the position;
when the target sub-transaction is a preset transaction, acquiring the running time for which the thread of the application program in the current node has executed the target sub-transaction, and using it as the running time of the thread executing the first transaction, where the preset transaction indicates a sub-transaction whose probability of a dead loop is greater than a preset probability.
In some embodiments, the fault detection device is specifically configured to:
acquiring a transaction identifier corresponding to a first transaction;
and when the transaction identifier is positioned on the white list, acquiring the running time of executing the first transaction by the thread of the application program.
In some embodiments, the fault detection device is specifically configured to:
Acquiring information acquired by an abstract processor, wherein the abstract processor is used for acquiring the running information of a thread of an application program;
And acquiring the running time of executing the first transaction by the thread of the application program according to the acquired information.
In some embodiments, the fault detection device is specifically configured to:
acquiring configuration information of an abstract processor when a transaction request of a first transaction is detected;
and when the configuration information indicates that the abstract processor is in an enabled state, acquiring the information collected by the abstract processor.
In some embodiments, when the type of the first transaction is a query type, the processing policy includes controlling the thread to stop running and releasing resources of the thread;
When the type of the first transaction is a maintenance type, the processing strategy comprises recording abnormal information of the thread;
When the type of the first transaction is a specified type, the processing policy includes outputting abnormal execution information of the thread.
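The association between transaction types and processing policies can be sketched as a lookup table. The action names are placeholders: actually stopping a thread or releasing its resources is platform-specific and is not shown, so this sketch only records which actions would be taken.

```python
# Hypothetical mapping from transaction type to the actions of its processing policy.
POLICIES = {
    "query": ["stop_thread", "release_resources"],
    "maintenance": ["record_exception_info"],
    "specified": ["output_abnormal_execution_info"],
}


def apply_policy(tx_type, thread_id, actions_log):
    """Record, in order, the actions the policy for tx_type prescribes for the thread."""
    for action in POLICIES.get(tx_type, []):
        actions_log.append((action, thread_id))
    return actions_log


query_actions = apply_policy("query", 7, [])
maintenance_actions = apply_policy("maintenance", 8, [])
unknown_actions = apply_policy("transfer", 9, [])
```

A plausible reading of this split is that query transactions do not modify data and can therefore be stopped safely, whereas maintenance transactions are only recorded so that a partially applied change is not interrupted; the application itself does not state this rationale.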
The fault detection device provided by the embodiment of the application can be used for executing the technical scheme of the fault detection method in the embodiment, and the implementation principle and the technical effect are similar, and are not repeated here.
It should be understood that the division of the above apparatus into modules is merely a division of logical functions; in an actual implementation, the modules may be fully or partially integrated into one physical entity or may be physically separate. These modules may all be implemented as software invoked by a processing element, or may all be implemented in hardware; alternatively, some modules may be implemented as software invoked by a processing element and the others in hardware. For example, the first obtaining module 710 may be a separately provided processing element, may be integrated in a chip of the above apparatus, or may be stored in a memory of the above apparatus in the form of program code that is called by a processing element of the above apparatus to perform the functions of the first obtaining module 710. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capability. In implementation, each step of the above method or each of the above modules may be carried out by an integrated logic circuit of hardware in the processing element or by instructions in the form of software.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 8, the electronic device may include: a transceiver 81, a processor 82, a memory 83.
Processor 82 executes computer-executable instructions stored in memory that cause processor 82 to perform the aspects of the embodiments described above. The processor 82 may be a general-purpose processor including a central processing unit CPU, a network processor (network processor, NP), etc.; but may also be a digital signal processor DSP, an application specific integrated circuit ASIC, a field programmable gate array FPGA or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component.
The memory 83 is connected to and communicates with the processor 82 via a system bus, and the memory 83 is used to store computer program instructions.
The transceiver 81 may be used to obtain a task to be run and configuration information of the task to be run.
The system bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The system bus may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, the bus is represented in the figure by only one bold line, but this does not mean that there is only one bus or only one type of bus. The transceiver is used to enable communication between the database access device and other computers (e.g., clients, read-write libraries, and read-only libraries). The memory may include random access memory (RAM) and may also include non-volatile memory.
The electronic device provided by the embodiment of the application can be the terminal device of the embodiment.
The embodiment of the application also provides a chip for running the instruction, which is used for executing the technical scheme of the fault detection method in the embodiment.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores computer instructions, and when the computer instructions run on a computer, the computer is caused to execute the technical scheme of the fault detection method of the embodiment.
The embodiment of the application also provides a computer program product, which comprises a computer program stored in a computer readable storage medium, at least one processor can read the computer program from the computer readable storage medium, and the technical scheme of the fault detection method in the embodiment can be realized when the at least one processor executes the computer program.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules is merely a division of logical functions, and other divisions are possible in an actual implementation; for instance, multiple modules may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or modules, and may be electrical, mechanical, or of another form.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to implement the solution of this embodiment.
In addition, each functional module in the embodiments of the present application may be integrated in one processing unit, or each module may exist alone physically, or two or more modules may be integrated in one unit. The units formed by the modules can be realized in a form of hardware or a form of hardware and software functional units.
The integrated modules, which are implemented in the form of software functional modules, may be stored in a computer readable storage medium. The software functional modules described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or processor to perform some of the steps of the methods of the various embodiments of the application.
It should be appreciated that the processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.
The memory may comprise a high-speed RAM memory, and may further comprise a non-volatile memory NVM, such as at least one magnetic disk memory, and may also be a U-disk, a removable hard disk, a read-only memory, a magnetic disk or optical disk, etc.
The bus may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or to one type of bus.
The storage medium may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). Of course, the processor and the storage medium may also reside as discrete components in an electronic control unit or master control device.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (12)

1. A fault detection method, comprising:
Acquiring the running time of executing a first transaction by a thread of an application program, and determining the type of the first transaction, wherein the running time is used for indicating the interval time between the current time point and the starting time point of executing the first transaction;
Acquiring configuration information associated with the type, and determining a duration threshold value of the first transaction according to the configuration information;
When the running time length is greater than or equal to the time length threshold, determining that the application program has a dead loop or long transaction, and acquiring a processing strategy associated with the type;
And processing the thread according to the processing strategy, wherein the processing strategy comprises controlling the thread to stop running, releasing the resources of the thread, recording the abnormal information of the thread and/or outputting the abnormal execution information of the thread.
2. The fault detection method according to claim 1, wherein prior to the step of obtaining the type-associated configuration information, further comprising:
After the version of the application program corresponding to the thread is updated, acquiring each second transaction completed after the application program is updated, and acquiring thread execution information of each second transaction, wherein the type of each second transaction is the same as the type of the first transaction;
determining the third transactions among the second transactions according to the thread execution information, and determining the ratio of the number of third transactions to the number of second transactions, wherein a third transaction is a second transaction in which abnormal thread execution occurred;
When the ratio is larger than a preset ratio, setting an associated duration threshold value for the type corresponding to the second transaction, and generating configuration information associated with the type according to the duration threshold value.
3. The method of claim 1, wherein the step of obtaining the running time of the thread of the application program to execute the first transaction comprises:
Determining a location of a current node that processes the first transaction in a responsibility processing chain, the responsibility processing chain consisting of a plurality of nodes;
Determining a target sub-transaction to be processed by the current node in each sub-transaction of the first transaction according to the position;
when the target sub-transaction is a preset transaction, acquiring the running time for which the thread of the application program in the current node has executed the target sub-transaction, and using it as the running time of the thread executing the first transaction, wherein the preset transaction indicates a sub-transaction whose probability of a dead loop is greater than a preset probability.
4. The method of claim 1, wherein the step of obtaining the running time of the thread of the application program to execute the first transaction comprises:
acquiring a transaction identifier corresponding to the first transaction;
and when the transaction identifier is positioned on the white list, acquiring the running time of executing the first transaction by the thread of the application program.
5. The method of claim 1, wherein the step of obtaining the running time of the thread of the application program to execute the first transaction comprises:
acquiring information acquired by an abstract processor, wherein the abstract processor is used for acquiring the running information of the threads of the application program;
And acquiring the running time of executing the first transaction by the thread of the application program according to the acquired information.
6. The fault detection method of claim 5, wherein the step of obtaining information collected by the abstract processor comprises:
Acquiring configuration information of the abstract processor when a transaction request of the first transaction is detected;
and when the configuration information indicates that the abstract processor is in an enabled state, acquiring the information collected by the abstract processor.
7. The fault detection method according to any one of claims 1-6, wherein when the type of the first transaction is a query type, the processing policy includes controlling the thread to stop running and releasing resources of the thread;
when the type of the first transaction is a maintenance type, the processing strategy comprises recording abnormal information of the thread;
when the type of the first transaction is a specified type, the processing policy includes outputting abnormal execution information of the thread.
8. A fault detection device, comprising:
a first acquisition module, configured to acquire the running time of a thread of an application program executing a first transaction and to determine the type of the first transaction, wherein the running time indicates the interval between the current time point and the starting time point of executing the first transaction;
the second acquisition module is used for acquiring the configuration information related to the type and determining a duration threshold value of the first transaction according to the configuration information;
The third acquisition module is used for determining that the application program has a dead loop or long transaction when the running time length is greater than or equal to the time length threshold value and acquiring the processing strategy associated with the type;
the processing module is used for processing the thread according to the processing strategy, and the processing strategy comprises controlling the thread to stop running, releasing the resources of the thread, recording the abnormal information of the thread and/or outputting the abnormal execution information of the thread.
9. A responsibility handling chain system, characterized in that it comprises a plurality of nodes connected in sequence, any of said nodes being adapted to implement the method according to any of claims 1-7.
10. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
The processor executes computer-executable instructions stored in the memory to implement the method of any one of claims 1-7.
11. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor are adapted to carry out the method of any one of claims 1-7.
12. A computer program product comprising a computer program which, when executed by a processor, implements the method of any of claims 1-7.
CN202410360861.4A 2024-03-27 2024-03-27 Fault detection method and related equipment Pending CN118193265A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410360861.4A CN118193265A (en) 2024-03-27 2024-03-27 Fault detection method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410360861.4A CN118193265A (en) 2024-03-27 2024-03-27 Fault detection method and related equipment

Publications (1)

Publication Number Publication Date
CN118193265A true CN118193265A (en) 2024-06-14

Family

ID=91392506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410360861.4A Pending CN118193265A (en) 2024-03-27 2024-03-27 Fault detection method and related equipment

Country Status (1)

Country Link
CN (1) CN118193265A (en)

Similar Documents

Publication Publication Date Title
US7493477B2 (en) Method and apparatus for disabling a processor core based on a number of executions of an application exceeding a threshold
CN108197032B (en) Main thread jamming monitoring method, medium, equipment and system for IOS application
US20090044053A1 (en) Method, computer system, and computer program product for problem determination using system run-time behavior analysis
CN106682162B (en) Log management method and device
CN109144873B (en) Linux kernel processing method and device
US7765434B2 (en) Resource efficient software tracing for problem diagnosis
CN109324959B (en) Method for automatically transferring data, server and computer readable storage medium
JP5623557B2 (en) Method, apparatus, and computer program in a multi-threaded computing environment for collecting diagnostic data
US11093361B2 (en) Bus monitoring system, method and apparatus
CN118193265A (en) Fault detection method and related equipment
CN114003416B (en) Memory error dynamic processing method, system, terminal and storage medium
US7617417B2 (en) Method for reading input/output port data
CN116126832A (en) Database switching method, switching device, electronic equipment and storage medium
CN113127245B (en) Method, system and device for processing system management interrupt
CN115314289A (en) Attacked executor identifying method, output voter, equipment and storage medium
CN114153712A (en) Exception handling method, device, equipment and storage medium
CN113742113A (en) Embedded system health management method, equipment and storage medium
CN110740062A (en) Breakpoint resume method and device
CN108415788B (en) Data processing apparatus and method for responding to non-responsive processing circuitry
US20060230196A1 (en) Monitoring system and method using system management interrupt
CN115935341B (en) Vulnerability defense method, vulnerability defense system, vulnerability defense server and storage medium
CN114253825B (en) Memory leak detection method, device, computer equipment and storage medium
CN117171155A (en) Data cleaning method, device, equipment and storage medium
CN116302975A (en) Service system change testing method and device
CN115640215A (en) Method and device for generating debugging file, computing equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination