CN114490416A - Method and system for fault detection switching of workflow service and computer storage medium - Google Patents

Info

Publication number
CN114490416A
Authority
CN
China
Prior art keywords
service
fault
workflow service
workflow
script
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210138438.0A
Other languages
Chinese (zh)
Inventor
杨飞
郭玉章
陈洁
李颢
吕震
刘永科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202210138438.0A
Publication of CN114490416A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/362: Software debugging
    • G06F11/366: Software debugging using diagnostics
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/3668: Software testing
    • G06F11/3672: Test management
    • G06F11/3684: Test management for test design, e.g. generating new test cases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a method and a system for fault detection and switching of a workflow service, and a computer storage medium. The method comprises: at a preset interval configured in a fault detection script, the fault detection script sends a detection request to the corresponding main workflow service, where the fault detection script is deployed on the standby workflow service and runs in the background; based on the number of detections and the post-failure interval preset in the fault detection script, the script judges whether the main workflow service has a fault; and if the main workflow service has a fault, the start script of the standby workflow service is invoked and the standby workflow service is started to take over the service. Thus, when the main workflow service fails or cannot provide service normally, the fault is detected and the standby workflow service is brought up automatically, which reduces manual operation and shortens the response time. No shared storage needs to be requested and only the relevant scripts need to be deployed on the standby workflow service, so the deployment scheme is simpler and there is no frequent I/O read-write overhead.

Description

Method and system for fault detection switching of workflow service and computer storage medium
Technical Field
The invention belongs to the technical field of WebLogic, and particularly relates to a method and a system for fault detection and switching of a workflow service, and a computer storage medium.
Background
With the continuous development of banking business, business processes are becoming increasingly complex. Many middle-office and back-office tasks cannot be completed in a single step and must pass through multiple steps. For this reason, bank systems (mainly process platform systems) make extensive use of workflow services. A workflow service is a technique that divides a business operation into specific work items according to configured business rules, such as layout recognition, image splitting, first entry, second entry, entry arbitration, data verification, and accounting. Business personnel or robots in the corresponding role fetch the tasks for that role and write the results back to the workflow service after processing; the task then flows onward, which ensures that complex business is processed in order.
WebLogic is widely used in bank IT systems as large-scale commercial middleware, and most applications are packaged in WebLogic. However, a workflow service running in a WebLogic container involves globally unified task ordering. To avoid duplicate task acquisition, and the locking and performance degradation caused by synchronizing in-memory data across hosts, such a service generally does not support cluster deployment. High availability of WebLogic-based workflow services has therefore become a pressing problem.
Disclosure of Invention
In view of this, the present invention provides a method and a system for fault detection and switching of a workflow service, and a computer storage medium, which detect a fault of the main workflow service and automatically bring up the standby workflow service, thereby reducing manual operation and shortening the response time.
The first aspect of the present application discloses a method for detecting and switching a fault of a workflow service, including:
at a preset interval configured in a fault detection script, the fault detection script initiates a detection request to the corresponding main workflow service, wherein the fault detection script is deployed on the standby workflow service and runs in the background;
based on the number of detections and the post-failure interval preset in the fault detection script, the fault detection script judges whether the main workflow service has a fault;
if the main workflow service has a fault, the start script of the standby workflow service is invoked and the standby workflow service is started to take over the service.
Optionally, before invoking the start script of the standby workflow service and starting the standby workflow service to take over the service, the method further includes:
judging whether the standby workflow service has been started;
if not, performing the step of invoking the start script of the standby workflow service and starting the standby workflow service to take over the service.
Optionally, after judging whether the standby workflow service has been started, the method further includes:
if the standby workflow service has been started, returning to the step in which, at the preset interval configured in the fault detection script, the fault detection script initiates a detection request to the corresponding main workflow service.
Optionally, the fault detection script judging whether the main workflow service has a fault, based on the number of detections and the post-failure interval preset in the fault detection script, includes:
the fault detection script judges whether detection has failed the preset number of times;
if so, the fault detection script judges that the main workflow service has a fault;
if not, the fault detection script judges that the main workflow service has no fault.
Optionally, the fault detection script judging whether a detection has failed includes:
the fault detection script accesses the main workflow service through an interface that the service provides;
if a response from the main workflow service is received, the main workflow service is normal and the detection succeeds;
if no response from the main workflow service is received, the main workflow service is abnormal and the detection fails.
A second aspect of the present application discloses a system for detecting and switching a fault of a workflow service, including:
a request unit, configured to cause the fault detection script to initiate a detection request to the corresponding main workflow service at a preset interval configured in the fault detection script, wherein the fault detection script is deployed on the standby workflow service and runs in the background;
a fault judgment unit, configured to judge whether the main workflow service has a fault based on the number of detections and the post-failure interval preset in the fault detection script;
a calling unit, configured to invoke the start script of the standby workflow service and start the standby workflow service to take over the service if the fault judgment unit judges that the main workflow service has a fault.
Optionally, the system further includes:
a first judgment unit, configured to judge whether the standby workflow service has been started;
if not, the calling unit is triggered to invoke the start script of the standby workflow service and start the standby workflow service to take over the service.
Optionally, when judging whether the main workflow service has a fault based on the number of detections and the post-failure interval preset in the fault detection script, the fault judgment unit is specifically configured so that:
the fault detection script judges whether detection has failed the preset number of times;
if so, the fault detection script judges that the main workflow service has a fault;
if not, the fault detection script judges that the main workflow service has no fault.
Optionally, when the fault detection script judges whether a detection has failed, the fault judgment unit is specifically configured so that:
the fault detection script accesses the main workflow service through an interface that the service provides;
if a response from the main workflow service is received, the main workflow service is normal and the detection succeeds;
if no response from the main workflow service is received, the main workflow service is abnormal and the detection fails.
A third aspect of the present application discloses a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method for fault detection and switching of a workflow service according to any implementation of the first aspect of the present application.
According to the above technical solution, the method for fault detection and switching of a workflow service provided by the present invention includes: at a preset interval configured in the fault detection script, the fault detection script initiates a detection request to the corresponding main workflow service, the fault detection script being deployed on the standby workflow service and running in the background; based on the number of detections and the post-failure interval preset in the fault detection script, the script judges whether the main workflow service has a fault; and if the main workflow service has a fault, the start script of the standby workflow service is invoked and the standby workflow service is started to take over the service. That is, when the main workflow service fails or cannot provide service normally, the fault is detected and the standby workflow service is brought up automatically, which reduces manual operation and shortens the response time. In addition, no shared storage needs to be requested and only the relevant scripts need to be deployed on the standby workflow service, so the deployment scheme is simple and there is no frequent I/O read-write overhead.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a method for fault detection and switching of a workflow service according to an embodiment of the present invention;
Fig. 2 is a flowchart of another method for fault detection and switching of a workflow service according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a system for fault detection and switching of a workflow service according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
It should be noted that high availability (HA) generally describes a system that has been specially designed to reduce the downtime caused by hardware and software factors, such as hardware failures and program defects, so that its services remain highly available. Common high availability modes include the master-slave mode, the dual-machine duplex mode, and the cluster mode.
The embodiment of the application provides a fault detection and switching method for a workflow service. It addresses the following problem in the prior art: WebLogic is widely used in bank IT systems as large-scale commercial middleware and most applications are packaged in it, but a workflow service running in a WebLogic container involves globally unified task ordering and therefore generally does not support cluster deployment, in order to avoid duplicate task acquisition and the locking and performance degradation caused by synchronizing in-memory data across hosts.
Referring to Fig. 1, the method for fault detection and switching of a workflow service includes:
S101, at the preset interval configured in the fault detection script, the fault detection script initiates a detection request to the corresponding main workflow service.
The fault detection script is deployed on the standby workflow service and runs in the background.
The fault detection script may be uploaded by the operation and maintenance personnel of the system and left running in the background.
Only the relevant scripts need to be deployed on the standby workflow service, so the deployment scheme is simple. Background execution can be achieved by packaging the logic as a shell script and running it in the background as a resident process: nohup sh xx.
S102, based on the number of detections and the post-failure interval preset in the fault detection script, the fault detection script judges whether the main workflow service has a fault.
It should be noted that the parameters of the fault detection script are configurable, for example:
TRY_TIMES=2    # number of probe attempts per queue
DELAY=5        # interval after a failed probe (seconds)
TIME_SLEEP=30  # interval before the next round (seconds)
TRY_TIMES is the number of detection attempts for the workflow service; it determines how many consecutive detection failures must occur before the corresponding main workflow service is considered faulty. DELAY is the post-failure interval: after one probe fails, the next probe is made after the time defined by this parameter. TIME_SLEEP is the interval before the next round: after a round judges that the main workflow service has no fault, the next round starts after the time defined by this parameter. By configuring these three parameters, the sensitivity of the detection and switching method can be adjusted to the importance of the service.
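As a rough illustration of how the three parameters interact, the sketch below estimates an upper bound on the time between a failure of the main workflow service and the fault decision. It assumes that a full TIME_SLEEP may elapse before the first failing probe and that the script sleeps DELAY after every failed probe, and it ignores the duration of each probe call. The function name worst_case_seconds is hypothetical and not part of the original script.

```sh
# Hypothetical helper: upper bound (in seconds) on the time to declare a fault,
# ignoring how long each individual probe call takes.
worst_case_seconds() {
    try_times=$1
    delay=$2
    time_sleep=$3
    echo $(( time_sleep + try_times * delay ))
}

# With the example values from the text (TRY_TIMES=2, DELAY=5, TIME_SLEEP=30):
worst_case_seconds 2 5 30   # prints 40
```

Under these assumptions, lowering TIME_SLEEP shortens the bound most directly, at the cost of more frequent probe requests to the main workflow service.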
If the main workflow service has a fault, step S103 is executed.
S103, the start script of the standby workflow service is invoked and the standby workflow service is started to take over the service.
In this embodiment, at the preset interval configured in the fault detection script, the fault detection script initiates a detection request to the corresponding main workflow service, the fault detection script being deployed on the standby workflow service and running in the background; based on the number of detections and the post-failure interval preset in the fault detection script, the script judges whether the main workflow service has a fault; and if the main workflow service has a fault, the start script of the standby workflow service is invoked and the standby workflow service is started to take over the service. That is, when the main workflow service fails or cannot provide service normally, the fault is detected and the standby workflow service is brought up automatically, which reduces manual operation and shortens the response time. In addition, no shared storage needs to be requested and only the relevant scripts need to be deployed on the standby workflow service, so the deployment scheme is simple and there is no frequent I/O read-write overhead.
It should be noted that the prior art offers two approaches. In the first, a standby workflow service is deployed but normally left stopped while the main workflow service is monitored; after the main workflow service goes down, the standby service is started manually. In the second, a standby workflow service is configured and a process monitoring script is deployed on the main workflow service; the script writes the number of main workflow processes to shared storage, the standby workflow service reads that count from the shared storage at regular intervals, and when the count is abnormal the standby workflow service calls its own start script to take over the service.
The first approach, commonly called cold standby, is simple to deploy but has an obvious drawback: every operation is manual, so the handling speed is hard to guarantee and the timeliness requirement cannot be met. For important services, even a minute-level interruption can have a large business impact.
The second approach can switch over a failed workflow service automatically to some extent, but has the following problems:
a. It judges whether the main workflow service has failed by checking the number of processes. In the special case where the main workflow service hangs, the processes still exist but cannot provide service, and the standby workflow service is not brought up.
b. Scripts must be deployed on both the main and the standby workflow services, which complicates deployment, and shared storage must be requested to record the state of the main workflow service, which consumes more resources.
c. The shared storage must be read and written frequently, which creates a certain amount of I/O pressure.
The embodiment of the application switches over automatically after the main workflow service fails, which reduces manual operation and shortens the response time. No shared storage needs to be requested and only the relevant scripts need to be deployed on the standby workflow service, so the deployment scheme is simple and there is no frequent I/O read-write overhead. The detection frequency and the number of failures tolerated before switching are parameterized, so the user can adjust the sensitivity of the method to the importance of the service. Finally, because the method accesses the main workflow service remotely instead of merely checking whether its processes exist, the detection result is more accurate.
In practical application, before the start script of the standby workflow service is invoked and the standby workflow service is started to take over the service in step S103, the method further includes: judging whether the standby workflow service has been started.
If not, step S103 is executed: the start script of the standby workflow service is invoked and the standby workflow service is started to take over the service.
In practical application, after judging whether the standby workflow service has been started, the method further includes:
if the standby workflow service has been started, the process returns to step S101.
That is, after the main workflow service is judged to be faulty, the script checks whether the standby workflow service has already been brought up. If not, it calls the local start script to bring up the standby workflow service; if the standby workflow service is already up, the detection cycle simply continues.
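A minimal sketch of this guard follows. It is not the original script: the function name takeover_if_needed is hypothetical, and the check and the start action are passed in as command names so that the logic can be exercised without a WebLogic installation.

```sh
# takeover_if_needed CHECK_CMD START_CMD
#   CHECK_CMD: command that exits 0 when the standby service is already running
#   START_CMD: command that starts the standby service
# Both arguments are hypothetical hooks, assumed for illustration.
takeover_if_needed() {
    check_cmd=$1
    start_cmd=$2
    if "$check_cmd"; then
        # Standby is already up: do nothing and let the detection cycle continue.
        echo "standby already running"
        return 0
    fi
    echo "starting standby"
    "$start_cmd"
}
```

Checking first keeps the start script from being invoked again on every round while the main workflow service remains down.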
Specifically, the corresponding code (the end of the detect function, followed by the main loop) is as follows:
    # $ret is the exit status of the last probe; non-zero means every attempt failed
    if [ $ret -ne 0 ]; then
        echo "abnormal, details as follows:"
        printf "$log\n"
        # Check whether the standby WebLogic server process is already running
        ps=$(ps -ef | grep "Dweblogic.Name=$ServerName" | grep java)
        if [ $? -eq 0 ]; then
            echo "However, backup server $ServerName is already started; process information:"
            echo "UID PID PPID C STIME TTY TIME CMD"
            echo $ps
        else
            echo "Starting server $ServerName, which hosts the queue of organization $organ"
            cd $DOMAIN_PATH/bin
            ./nm_start_mserver.sh $ServerName
            cd $SHELLPATH
            echo
        fi
    else
        echo "normal"
    fi
}
# Main loop: probe every configured server, then wait for the next round
while [ true ]; do
    date "+%Y-%m-%d %H:%M:%S"
    for ((i=0; i<${#ServerInfo[*]}; i++)); do
        read ServerName ip AppName <<< $(echo ${ServerInfo[$i]})
        echo "ServerName ip AppName is: $ServerName $ip $AppName"
        detect
    done
    sleep $TIME_SLEEP
done
The purpose of this code is to check, after the main workflow service is judged to be faulty, whether the standby workflow service has been brought up; if not, the local start script is called to bring it up, and if it is already up, the loop continues.
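The main loop splits each ServerInfo entry into three whitespace-separated fields. The sketch below shows the same splitting step in POSIX sh, without the bash arrays and here-strings that the loop relies on. The function name parse_server_info and the sample record in the usage note are made up for illustration.

```sh
# Split one "ServerName ip AppName" record into its three fields.
parse_server_info() {
    read -r ServerName ip AppName <<EOF
$1
EOF
    echo "ServerName=$ServerName ip=$ip AppName=$AppName"
}
```

For instance, calling parse_server_info with the made-up record "srv01 192.0.2.10 COSMainQueue01" assigns srv01, 192.0.2.10 and COSMainQueue01 to the three variables.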
In practical application, step S102, in which the fault detection script judges whether the main workflow service has a fault based on the number of detections and the post-failure interval preset in the fault detection script, includes:
1) The fault detection script judges whether detection has failed the preset number of times.
If yes, step 2) is executed.
2) The fault detection script judges that the main workflow service has a fault.
If not, step 3) is executed.
3) The fault detection script judges that the main workflow service has no fault.
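Steps 1) to 3) amount to a bounded retry loop. The following is a minimal sketch under stated assumptions: the probe is passed in as an arbitrary command, and the function name probe_with_retries is hypothetical. TRY_TIMES and DELAY carry the meaning given in the text.

```sh
# Defaults mirror the example parameter values given in the text.
TRY_TIMES=${TRY_TIMES:-2}
DELAY=${DELAY:-5}

# probe_with_retries CMD [ARGS...]
# Runs CMD up to TRY_TIMES times. Returns 0 as soon as one attempt succeeds
# (no fault) and 1 if every attempt fails (fault).
probe_with_retries() {
    try=0
    while [ "$try" -lt "$TRY_TIMES" ]; do
        if "$@"; then
            return 0
        fi
        try=$((try + 1))
        # Wait the post-failure interval after each failed attempt.
        sleep "$DELAY"
    done
    return 1
}
```

A single successful probe ends the round immediately, so a transient failure followed by a success is not reported as a fault.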
It should be noted that the detection part of the script may be as follows:
detect()
{
    # Return immediately unless the address $ip is configured on this host
    (! ifconfig | grep -w $ip > /dev/null) && return
    # Strip the "COSMainQueue" prefix and convert the remainder to a number
    organ=$(expr $(echo $AppName | sed "s/COSMainQueue//") \* 1)
    organ=${ORGS[$organ]}
    printf "Queue status of organization $organ"
    try=0
    # Probe up to TRY_TIMES times; stop at the first success
    while [ $try -lt $TRY_TIMES ]
    do
        log=$(java -Dsun.lang.ClassLoader.allowArraySyntax=true -cp ${CLASS_PATH} com/ccb/cost/management/dailymaintain/ExecQueueMonitor $organ 2>/home/ap/cost/switch.log)
        ret=$?
        printf "."
        if [ $ret -eq 0 ]; then
            break
        fi
        try=$(expr $try + 1)
        sleep $DELAY
    done
That is, when probing, the script invokes ExecQueueMonitor to determine whether the workflow service is alive. ExecQueueMonitor is an interface that uses RMI (Remote Method Invocation), an application programming interface in the Java programming language for implementing remote procedure calls.
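One detail of the detection script is how it turns an application name into a numeric index before looking up the organization. The sketch below isolates that step. The function name queue_index and the sample names are assumptions; only the "COSMainQueue" prefix comes from the script itself.

```sh
# Derive a numeric index from a name such as "COSMainQueue08".
queue_index() {
    suffix=$(echo "$1" | sed "s/COSMainQueue//")
    # expr parses "08" as decimal 8, whereas shell arithmetic treats a
    # leading zero as octal, in which 08 is not a valid constant.
    expr "$suffix" \* 1
}
```

Multiplying by 1 with expr is a compact way to drop leading zeros. Note that expr exits with a non-zero status when the result is 0, which matters if the caller checks the exit status.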
In practical applications, the fault detection script judging whether a detection has failed includes:
the fault detection script accesses the main workflow service through the interface that the service provides.
If a response from the main workflow service is received, the main workflow service is normal and the detection succeeds.
If no response from the main workflow service is received, the main workflow service is abnormal and the detection fails.
Specifically, the main workflow service provides a server-side interface and the detection method accesses the main workflow service through it. If the main workflow service responds, it is normal; if it cannot respond, it is abnormal. The core code is as follows:
public int getQueueStatus(String centerCode) {
    // 1 = abnormal; set to 0 only when the queue reports a running or initializing state
    int reValue = 1;
    Connection conn = null;
    Properties props = null;
    try {
        conn = this.connectionManager.getConnection();
        props = DateExportUtil.getQClientProPertiesFromDB(conn);
        Configuration.INSTANCE.init(props);
        QWorkItemHandler workItemHandler = null;
        if (centerCode != null && centerCode.toLowerCase().contains("auto".toLowerCase())) {
            workItemHandler = WorkitemHandlerFactory.getDefaultHandler();
        } else if (centerCode != null) {
            workItemHandler = WorkitemHandlerFactory.getRMIHandlerByArea(centerCode);
        } else {
            log.error("queue=" + centerCode + ", parameter exception");
        }
        // If centerCode was null, the handler is null and the resulting
        // exception is caught below, leaving reValue at 1.
        TaskQGroupStatus status = workItemHandler.getStatus();
        String description = status.getDescription();
        // Status literals are reproduced as they appear in the translated text.
        if (!"STATE_RUNNING".equals(description) && !"is RUNNING".equals(description)
                && !"STATE_INITIALZING".equals(description) && !"is INITIALIZING".equals(description)) {
            log.error("queue=" + centerCode + ", state=" + description);
        } else {
            reValue = 0;
            log.info("queue=" + centerCode + ", state=" + description);
        }
    } catch (Throwable var12) {
        reValue = 1;
        log.error(var12.getMessage() + " get queue " + centerCode + " running state exception", var12);
    } finally {
        this.safeClose(conn, reValue);
    }
    return reValue;
}
This method is more accurate than a simple process-count check, and avoids the case in which the application process exists but cannot provide service.
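The difference between the two checks can be illustrated with a small decision function. This is an illustrative sketch only: the function name diagnose and the labels it prints are hypothetical and do not appear in the original script.

```sh
# diagnose PROCESS_FOUND PROBE_RC
#   PROCESS_FOUND: 1 if a process listing shows the service process, else 0
#   PROBE_RC:      exit status of a remote probe (0 means the service answered)
diagnose() {
    if [ "$2" -eq 0 ]; then
        echo "healthy"
    elif [ "$1" -eq 1 ]; then
        # Process exists but does not answer: a process-count check misses this.
        echo "hung"
    else
        echo "down"
    fi
}
```

A process-count check alone would treat the "hung" case as healthy, while a remote probe reports it as a failure, which is the case the method is designed to catch.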
It should be noted that if the main workflow service is normal, the script does not probe again after DELAY; it probes again only after TIME_SLEEP. Both parameters are configurable, and TIME_SLEEP is usually larger than DELAY.
When the detection results reach the configured threshold, that is, TRY_TIMES consecutive failed probes, the detection script judges that the main workflow service has failed.
It should be noted that, as shown in Fig. 2, a specific detection and switching process is as follows:
(1) The fault detection script begins execution.
(2) Detection is performed according to the preset parameters.
(3) The script judges whether the main queue, that is, the main workflow service, has a fault.
If yes, step (4) is executed; if not, the round ends and detection continues after the next-round interval TIME_SLEEP.
(4) After the post-failure interval DELAY, the script detects again whether the main workflow service has a fault.
If not, the round ends and detection continues after the next-round interval TIME_SLEEP. If yes, step (5) is executed.
(5) The script checks whether the standby workflow service has been brought up.
If yes, the round ends and detection continues after the next-round interval TIME_SLEEP. If not, step (6) is executed.
(6) The start script of the standby workflow service is called to start the service.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Although the operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous.
It should be understood that the various steps recited in method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Another embodiment of the present application provides a system for detecting and switching a workflow service.
Referring to fig. 3, the system for detecting and switching a fault of a workflow service includes:
a request unit 101, configured to initiate a detection request to a corresponding main workflow service by a fault detection script based on a preset interval time in the fault detection script; the fault detection script is deployed in the standby workflow service and is executed in a background.
And the fault judging unit 102 is configured to judge whether the main workflow service has a fault or not based on the detection times and the failure interval time preset in the fault detection script.
And the invoking unit 103 is configured to invoke a start script of the standby workflow service and start the standby workflow service to take over the service if the fault determining unit 102 determines that the main workflow service has a fault.
In practical applications, the system for detecting and switching the fault of the workflow service further includes:
the first judging unit is used for judging whether the standby workflow service is started or not.
If not, triggering the invoking unit 103 to execute the starting script for invoking the standby workflow service, and starting the standby workflow service to take over the service.
In practical applications, when causing the fault detection script to judge whether the main workflow service has a fault based on the detection times and the failure interval time preset in the fault detection script, the fault judging unit 102 is specifically configured so that:
the fault detection script judges whether detection has failed the preset number of times.
If so, the fault detection script judges that the main workflow service has a fault.
If not, the fault detection script judges that the main workflow service has no fault.
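The judgment itself reduces to a small predicate over the recorded detection results. The sketch below is illustrative only: `main_service_faulty` and its parameters are hypothetical names, and it assumes the preset number of failures must be the most recent, consecutive detections, which the description leaves open.

```python
from typing import Sequence


def main_service_faulty(results: Sequence[bool], preset_count: int) -> bool:
    """Return True when the latest `preset_count` detections all failed.

    `results` holds one entry per detection, oldest first; True means the
    main workflow service responded.
    """
    if preset_count <= 0:
        raise ValueError("preset_count must be positive")
    recent = results[-preset_count:]
    # Too few detections so far means no fault can be declared yet.
    return len(recent) == preset_count and not any(recent)
```

Requiring several failures before declaring a fault keeps a single lost response, for example during a brief network hiccup, from triggering a switch.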
In practical applications, when the fault detection script judges whether a detection has failed, the fault judging unit 102 is specifically configured so that:
the fault detection script accesses the main workflow service through the provided interface.
If a response from the main workflow service is received, the main workflow service is normal and the detection succeeds.
If no response from the main workflow service is received, the main workflow service is abnormal and the detection fails.
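A single detection can be sketched as one request with a timeout. The description does not specify the protocol of the provided interface, so the following assumes, purely for illustration, that it is an HTTP endpoint; `probe_main_service`, `url`, and `timeout_s` are hypothetical names.

```python
import urllib.error
import urllib.request

# Assumption: the probe talks to the main service directly, so proxy
# settings from the environment are ignored.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe_main_service(url: str, timeout_s: float = 3.0) -> bool:
    """Return True if the main workflow service answers within the timeout."""
    try:
        with _OPENER.open(url, timeout=timeout_s):
            return True
    except urllib.error.HTTPError:
        # The service answered, although with an error status. Following
        # the "response received" criterion literally, this counts as a
        # successful detection.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, host unreachable, or timeout: no response.
        return False
```

Treating any HTTP reply as success mirrors the criterion stated above; a stricter deployment could instead count error statuses as failed detections.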
For the specific working process and principle of each unit, refer to the method for fault detection switching of a workflow service provided in the above embodiment; the details are not repeated here. They may be determined according to the actual situation, and all such variations fall within the protection scope of the present application.
In this embodiment, the request unit 101 causes the fault detection script to initiate a detection request to the corresponding main workflow service based on the interval time preset in the fault detection script; the fault judging unit 102 judges whether the main workflow service has a fault based on the detection times and the failure interval time preset in the fault detection script; and if the main workflow service has a fault, the invoking unit 103 invokes the start script of the standby workflow service and starts the standby workflow service to take over the service. That is, when the main workflow service fails or cannot provide service normally, the fault is detected and the standby workflow service is brought up automatically, which reduces manual operation and makes the switch timely. In addition, no shared storage needs to be requested; only the relevant scripts need to be deployed on the standby workflow service, so the deployment scheme is simple and does not suffer from frequent IO reads and writes.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. In some cases the name of a unit does not constitute a limitation of the unit itself; for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
Another embodiment of the present application provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method for fault detection switching of a workflow service according to any one of the above embodiments.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
Features described in the embodiments in the present specification may be replaced with or combined with each other. The same and similar portions among the embodiments may be referred to each other, and each embodiment is described with emphasis on its differences from the other embodiments. In particular, the system embodiments are substantially similar to the method embodiments and are therefore described relatively briefly; for related points, refer to the corresponding descriptions of the method embodiments. The above-described system embodiments are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for fault detection switching of a workflow service, characterized by comprising the following steps:
initiating, by a fault detection script, a detection request to a corresponding main workflow service based on an interval time preset in the fault detection script, wherein the fault detection script is deployed on a standby workflow service and is executed in the background;
judging, by the fault detection script, whether the main workflow service has a fault based on the detection times and the failure interval time preset in the fault detection script;
if the main workflow service has a fault, invoking a start script of the standby workflow service and starting the standby workflow service to take over the service.
2. The method for fault detection switching of a workflow service according to claim 1, wherein before invoking the start script of the standby workflow service and starting the standby workflow service to take over the service, the method further comprises:
judging whether the standby workflow service has been started;
if not, executing the step of invoking the start script of the standby workflow service and starting the standby workflow service to take over the service.
3. The method for fault detection switching of a workflow service according to claim 2, wherein after judging whether the standby workflow service has been started, the method further comprises:
if the standby workflow service has been started, returning to the step of initiating, by the fault detection script, a detection request to the corresponding main workflow service based on the interval time preset in the fault detection script.
4. The method for fault detection switching of a workflow service according to claim 1, wherein judging, by the fault detection script, whether the main workflow service has a fault based on the detection times and the failure interval time preset in the fault detection script comprises:
judging, by the fault detection script, whether detection has failed the preset number of times;
if so, judging, by the fault detection script, that the main workflow service has a fault;
if not, judging, by the fault detection script, that the main workflow service has no fault.
5. The method for fault detection switching of a workflow service according to claim 4, wherein judging, by the fault detection script, whether a detection has failed comprises:
accessing, by the fault detection script, the main workflow service through a provided interface;
if a response from the main workflow service is received, the main workflow service is normal and the detection succeeds;
if no response from the main workflow service is received, the main workflow service is abnormal and the detection fails.
6. A system for fault detection switching of a workflow service, comprising:
a request unit, configured to cause a fault detection script to initiate a detection request to a corresponding main workflow service based on an interval time preset in the fault detection script, wherein the fault detection script is deployed on a standby workflow service and is executed in the background;
a fault judging unit, configured to cause the fault detection script to judge whether the main workflow service has a fault based on the detection times and the failure interval time preset in the fault detection script;
a calling unit, configured to invoke a start script of the standby workflow service and start the standby workflow service to take over the service if the fault judging unit judges that the main workflow service has a fault.
7. The system for fault detection switching of a workflow service according to claim 6, further comprising:
a first judging unit, configured to judge whether the standby workflow service has been started;
if not, the calling unit is triggered to invoke the start script of the standby workflow service and start the standby workflow service to take over the service.
8. The system for fault detection switching of a workflow service according to claim 6, wherein, when causing the fault detection script to judge whether the main workflow service has a fault based on the detection times and the failure interval time preset in the fault detection script, the fault judging unit is specifically configured so that:
the fault detection script judges whether detection has failed the preset number of times;
if so, the fault detection script judges that the main workflow service has a fault;
if not, the fault detection script judges that the main workflow service has no fault.
9. The system for fault detection switching of a workflow service according to claim 8, wherein, when the fault detection script judges whether a detection has failed, the fault judging unit is specifically configured so that:
the fault detection script accesses the main workflow service through a provided interface;
if a response from the main workflow service is received, the main workflow service is normal and the detection succeeds;
if no response from the main workflow service is received, the main workflow service is abnormal and the detection fails.
10. A computer storage medium, having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method of failure detection switching of a workflow service according to any one of claims 1 to 5.
CN202210138438.0A 2022-02-15 2022-02-15 Method and system for fault detection switching of workflow service and computer storage medium Pending CN114490416A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210138438.0A CN114490416A (en) 2022-02-15 2022-02-15 Method and system for fault detection switching of workflow service and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210138438.0A CN114490416A (en) 2022-02-15 2022-02-15 Method and system for fault detection switching of workflow service and computer storage medium

Publications (1)

Publication Number Publication Date
CN114490416A true CN114490416A (en) 2022-05-13

Family

ID=81480524

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210138438.0A Pending CN114490416A (en) 2022-02-15 2022-02-15 Method and system for fault detection switching of workflow service and computer storage medium

Country Status (1)

Country Link
CN (1) CN114490416A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116208472A (en) * 2023-02-28 2023-06-02 中国工商银行股份有限公司 Site switching method, device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
US7000150B1 (en) Platform for computer process monitoring
US7062676B2 (en) Method and system for installing program in multiple system
US8495573B2 (en) Checkpoint and restartable applications and system services
CN100498725C (en) Method and system for minimizing loss in a computer application
US9189348B2 (en) High availability database management system and database management method using same
EP2831796B1 (en) Persistent and resilient worker processes
US20020161859A1 (en) Workflow engine and system
US20080244307A1 (en) Method to avoid continuous application failovers in a cluster
CN111143044B (en) Task scheduling management system, method, device and storage medium thereof
JP5519909B2 (en) Non-intrusive method for replaying internal events in an application process and system implementing this method
WO2022016847A1 (en) Automatic test method and device applied to cloud platform
US7941703B2 (en) Capturing machine state of unstable java program
JP2005505831A (en) Method for integrating Java servlets with asynchronous messages
US20150236799A1 (en) Method and system for quick testing and detecting mobile devices
US20110067007A1 (en) Automatic thread dumping
US7680917B2 (en) Method and system for unit testing web framework applications
CN114490416A (en) Method and system for fault detection switching of workflow service and computer storage medium
CN114371943A (en) Elegant publishing method based on micro-service architecture, and apparatus, device and medium thereof
US6519637B1 (en) Method and apparatus for managing a memory shortage situation in a data processing system
US8065569B2 (en) Information processing apparatus, information processing apparatus control method and control program
US11539612B2 (en) Testing virtualized network functions
CN112445549A (en) Operation and maintenance method, operation and maintenance device, electronic equipment and medium
CN114217950B (en) Node scheduling state control method and system
CN116305265A (en) Database processing method, device, server side and storage medium
US11886283B2 (en) Automatic node crash detection and remediation in distributed computing systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination