WO2021032175A1 - Fault injection method and device, and service system - Google Patents

Info

Publication number
WO2021032175A1
WO2021032175A1 PCT/CN2020/110351 CN2020110351W WO2021032175A1 WO 2021032175 A1 WO2021032175 A1 WO 2021032175A1 CN 2020110351 W CN2020110351 W CN 2020110351W WO 2021032175 A1 WO2021032175 A1 WO 2021032175A1
Authority
WO
WIPO (PCT)
Prior art keywords
fault
fault injection
message
business
application
Prior art date
Application number
PCT/CN2020/110351
Other languages
French (fr)
Chinese (zh)
Inventor
崔成
魏凌
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021032175A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3664 Environments for testing or debugging software
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3684 Test management for test design, e.g. generating new test cases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites

Definitions

  • This application relates to the computer field, in particular to a fault injection method and device, and business service system.
  • Building a cloud service (hereinafter referred to as cloud service B) on the basis of another cloud service (hereinafter referred to as cloud service A) can effectively improve the development efficiency of cloud services.
  • However, this makes cloud service B dependent on cloud service A, so cloud service B may become unavailable when the implementation system of cloud service A fails. It is therefore necessary to test the reliability of the implementation system of a cloud service when the implementation system of the cloud service on which it depends fails.
  • Currently, the reliability of the implementation system of cloud service B is tested by injecting faults into the implementation system of cloud service A to simulate its failure. The faulty implementation system then continues to provide cloud service A to the implementation system of cloud service B, the behavior of the implementation system of cloud service B is observed, and its reliability is evaluated based on that behavior.
  • The faults currently injected mainly include: a shutdown fault of the server that implements the cloud service, a network disconnection fault of the server that implements the cloud service, and a forced exit fault of the cloud service process.
  • the present application provides a fault injection method, a device thereof, and a business service system, which can solve the problem of low accuracy of the current reliability test.
  • a fault injection method is provided, which is applied to a business service system in which a first application, a second application, and a fault injection module are running.
  • The method includes: the first application generates a business message, where the business message is used to access the second application; the fault injection module obtains the business message and performs the fault injection operation indicated by the fault attribute parameter on the business message; and the fault injection module sends the business message on which the fault injection operation has been performed to the second application.
  • Because the fault injection module obtains the business message, performs the fault injection operation indicated by the fault attribute parameter on it, and sends the resulting business message to the second application, faults can be injected during the sending of business messages. The granularity of the injected faults is thereby refined, which effectively improves the accuracy of reliability testing based on injected faults.
  • In addition, because the fault is injected by intercepting the business message while it is being sent, the object into which the fault is injected does not need to be adapted, and imperceptible fault injection can be achieved.
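The interception flow described above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the class `FaultInjectionModule`, the `BusinessMessage` type, the `truncate_body` fault, and the address string are all hypothetical names chosen for illustration, and the second application is simulated by a list.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class BusinessMessage:
    destination: str  # address of the application the message should reach
    body: bytes


# A fault operation returns the message to forward, or None to drop it.
FaultOperation = Callable[[BusinessMessage], Optional[BusinessMessage]]


class FaultInjectionModule:
    """Sits between the first and second application (hypothetical sketch)."""

    def __init__(self, operation: FaultOperation,
                 send_to_second_app: Callable[[BusinessMessage], None]) -> None:
        self.operation = operation
        self.send_to_second_app = send_to_second_app

    def handle(self, message: BusinessMessage) -> None:
        # Obtain the message, apply the configured fault, then forward it.
        faulted = self.operation(message)
        if faulted is not None:
            self.send_to_second_app(faulted)


def truncate_body(message: BusinessMessage) -> BusinessMessage:
    """Example fault (an assumption): keep only the first four bytes."""
    return replace(message, body=message.body[:4])


# The second application is simulated by a list that records what arrives.
received: List[BusinessMessage] = []
module = FaultInjectionModule(truncate_body, received.append)
module.handle(BusinessMessage("second-app:8080", b"GET /orders"))
print(received[0].body)
```

Note that the code producing the message is unchanged by the fault; only the message in transit is altered, which is the property the text relies on for imperceptible injection.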
  • An address conversion module may also run in the business service system. Before the fault injection module obtains the business message, the method further includes: the address conversion module obtains the business message, modifies the destination address of the business message to the address of the fault injection module, and sends the business message with the modified destination address.
  • The fault injection module performing the fault injection operation indicated by the fault attribute parameter on the business message includes: the fault injection module performs the fault injection operation indicated by the fault attribute parameter on the business message when the business message meets the filter condition.
  • When the business message is a business request, the filter conditions involve one or more of the following: request type, address requested to be accessed, request header keywords, and request body keywords. When the business message is a response to a business request, the filter conditions involve one or more of the following: response status code, response header keywords, and response body keywords.
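As a rough illustration of the request-side filter conditions, the sketch below checks a request against an optional request type, accessed address, header keyword, and body keyword. The names `BusinessRequest` and `RequestFilter`, and the choice of prefix matching for the address and substring matching for keywords, are assumptions made for this example; the source does not specify how matching is performed.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BusinessRequest:
    method: str                      # request type, e.g. GET or POST
    url: str                         # address requested to be accessed
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


@dataclass
class RequestFilter:
    """Each condition is optional; unset conditions always match."""
    method: Optional[str] = None
    url_prefix: Optional[str] = None       # assumption: prefix match
    header_keyword: Optional[str] = None   # assumption: substring match
    body_keyword: Optional[str] = None     # assumption: substring match

    def matches(self, request: BusinessRequest) -> bool:
        if self.method is not None and request.method.upper() != self.method.upper():
            return False
        if self.url_prefix is not None and not request.url.startswith(self.url_prefix):
            return False
        if self.header_keyword is not None:
            header_text = " ".join(f"{k}: {v}" for k, v in request.headers.items())
            if self.header_keyword not in header_text:
                return False
        if self.body_keyword is not None and self.body_keyword not in request.body:
            return False
        return True


request = BusinessRequest("POST", "/api/orders", {"X-Tenant": "test"}, '{"item": 1}')
print(RequestFilter(method="post", url_prefix="/api").matches(request))
```

A response-side filter would follow the same pattern, with a status code in place of the request type and address.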
  • The fault injection operation indicated by the fault attribute parameter may be a packet loss operation, a delay operation, or an error packet operation. Accordingly, the fault injection operation indicated by the fault attribute parameter may be performed as follows:
  • When the fault injection operation indicated by the fault attribute parameter is a packet loss operation and the business message includes multiple data packets, the fault injection module performing the fault injection operation on the business message includes: the fault injection module discards some or all of the multiple data packets based on the fault attribute parameter.
  • When the fault injection operation indicated by the fault attribute parameter is a delay operation, the fault injection module performing the fault injection operation on the business message includes: the fault injection module sends the business message after delaying it for the period of time specified by the fault attribute parameter.
  • When the fault injection operation indicated by the fault attribute parameter is an error packet operation, the fault injection module performing the fault injection operation on the business message includes: the fault injection module modifies the content of the business message based on the fault attribute parameter.
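The three operations can be sketched as small Python functions over a list of data packets. This is only an illustration: the parameters `drop_ratio`, `delay_seconds`, and `replacement` are hypothetical stand-ins for whatever the fault attribute parameter actually carries, and the `sleep` argument exists only so the delay can be replaced by a stub in tests.

```python
import random
import time
from typing import Callable, List


def drop_packets(packets: List[bytes], drop_ratio: float,
                 rng: random.Random) -> List[bytes]:
    """Packet loss: discard each packet with probability drop_ratio.

    A ratio of 1.0 discards all packets; 0.0 discards none.
    """
    return [p for p in packets if rng.random() >= drop_ratio]


def delay_packets(packets: List[bytes], delay_seconds: float,
                  sleep: Callable[[float], None] = time.sleep) -> List[bytes]:
    """Delay: wait for the specified time before releasing the packets."""
    sleep(delay_seconds)
    return list(packets)


def corrupt_packets(packets: List[bytes], replacement: bytes) -> List[bytes]:
    """Error packet: overwrite the content of every packet."""
    return [replacement for _ in packets]


packets = [b"part-1", b"part-2"]
print(drop_packets(packets, 1.0, random.Random(0)))
print(corrupt_packets(packets, b"?"))
```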
  • In a second aspect, a business service system is provided. The business service system includes: a first server, a second server, and a fault injection module. The first server is used to generate business messages, and the business messages are used to access the second server. The fault injection module is used to obtain the business message and perform the fault injection operation indicated by the fault attribute parameter on the business message. The fault injection module is also used to send the business message on which the fault injection operation has been performed to the second server.
  • The business service system further includes an address conversion module. The address conversion module is used to obtain the business message, modify the destination address of the business message to the address of the fault injection module, and send the business message with the modified destination address. The fault injection module is specifically used to receive the business message sent by the address conversion module and perform the fault injection operation indicated by the fault attribute parameter on the business message.
  • the fault injection module is specifically configured to perform the fault injection operation indicated by the fault attribute parameter on the business message when the business message meets the filter condition.
  • When the business message is a business request, the filter conditions involve one or more of the following: request type, address to be accessed, request header keywords, and request body keywords. When the business message is a response to a business request, the filter conditions involve one or more of the following: response status code, response header keywords, and response body keywords.
  • When the fault injection operation indicated by the fault attribute parameter is a packet loss operation, the fault injection module is specifically used to: discard the data packet based on the fault attribute parameter when the business message includes one data packet; and discard some or all of the multiple data packets based on the fault attribute parameter when the business message includes multiple data packets.
  • When the fault injection operation indicated by the fault attribute parameter is a delay operation, the fault injection module is specifically configured to send the business message after delaying it for the time specified by the fault attribute parameter.
  • When the fault injection operation indicated by the fault attribute parameter is an error packet operation, the fault injection module is specifically configured to modify the content of the business message based on the fault attribute parameter.
  • In a third aspect, a fault injection device is provided. The fault injection device includes: a first monitoring unit and a fault injection unit. The first monitoring unit is used to obtain a business message generated by a first application, where the business message is used to access a second application. The fault injection unit is used to obtain the business message from the first monitoring unit, perform the fault injection operation indicated by the fault attribute parameter on the business message, and send the business message on which the fault injection operation has been performed to the second application.
  • The fault injection device further includes a filtering unit. The filtering unit is configured to receive the business message sent by the first monitoring unit, filter the business message according to the filtering condition, and send the business message that meets the filtering condition to the fault injection unit.
  • The fault injection device further includes a second monitoring unit. The fault injection unit is specifically configured to send the business message on which the fault injection operation has been performed to the second monitoring unit, and the second monitoring unit is used to send that business message to the second application.
  • The fault injection device further includes an address conversion unit. The address conversion unit is used to modify the destination address of the business message to be sent by the second monitoring unit to the address of the second application.
  • FIG. 1 is a schematic structural diagram of a fault injection module provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application scenario involved in a fault injection method provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an application scenario involved in another fault injection method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an application scenario involved in another fault injection method provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of a fault injection method provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a user interface for managing an application in a client provided by an embodiment of the present application.
  • When the reliability of the implementation system of a cloud service is tested, the injected faults mainly include: a downtime fault of the server that implements the cloud service, a network disconnection fault of the server that implements the cloud service, and a forced exit fault of the cloud service process. Because all of these faults are at server-level granularity, injecting them also affects the functions of other unrelated services and introduces unrelated test factors, resulting in low accuracy of reliability testing based on the injected faults. Moreover, injecting faults into the implementation system of the cloud service on which the cloud service under test depends usually requires a lot of adaptation work, so the faults cannot be injected imperceptibly.
  • the embodiments of the present application provide a fault injection method, which can improve the accuracy of reliability testing based on injected faults.
  • The first application program and the second application program may be application programs that have a dependency relationship in function realization. That is, the function of one of the two applications (the depending end) relies on the function of the other (the depended-on end).
  • The dependency relationship between the two can be expressed as follows: the depending end sends a business request to the depended-on end; the depended-on end processes the business request and sends a business response carrying the processing result to the depending end; and the depending end implements its function based on the business response.
  • Without fault injection, the business message generated by the first application for accessing the second application can be sent directly from the first application to the second application.
  • In the fault injection method provided in the embodiments of the present application, the business message generated by the first application program for accessing the second application program needs to be sent to the fault injection module first.
  • The fault injection module performs the fault injection operation indicated by the fault attribute parameter on the business message, and then sends the business message after fault injection to the second application.
  • The fault attribute parameter may be pre-stored in the fault injection module. Alternatively, the fault injection module may receive a fault injection request, which is used to indicate that a fault is to be injected into a business message and which may carry the fault attribute parameter.
  • For example, the first application can be the depending end (the application that relies on the other), the second application can be the depended-on end, and the business message can be a business request sent by the first application to the second application.
  • When a fault is injected while the business request is being sent, the second application appears to be faulty from the perspective of the first application.
  • Analyzing how the first application behaves after receiving the business response sent by the second application therefore enables reliability analysis of the first application when the second application fails.
  • Alternatively, the second application can be the depending end (the application that relies on the other), the first application can be the depended-on end, and the business message can be a business response sent by the first application to the second application.
  • When a fault is injected while the business response is being sent, the first application appears to be faulty from the perspective of the second application.
  • Analyzing how the second application behaves after receiving the faulty business response therefore enables reliability analysis of the second application when the first application fails.
  • the function of forwarding the business message generated by the first application program for accessing the second application program to the fault injection module may be implemented by the address conversion module.
  • The address conversion module can obtain the business message generated by the first application program, modify the destination address of the business message to the address of the fault injection module, and send the business message with the modified destination address, so that the business message that originally needed to be sent to the second application is sent to the fault injection module instead.
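The redirection performed by the address conversion module can be sketched as follows. This is an illustrative assumption rather than the described implementation: the `Message` type, the address strings, and the rule of redirecting only messages addressed to the second application are all hypothetical, and the network is simulated by a list.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Message:
    destination: str
    payload: bytes


class AddressConversionModule:
    """Redirects traffic meant for the second application (sketch only)."""

    def __init__(self, second_app_address: str, fault_injection_address: str,
                 send: Callable[[Message], None]) -> None:
        self.second_app_address = second_app_address
        self.fault_injection_address = fault_injection_address
        self.send = send

    def forward(self, message: Message) -> None:
        # Only messages addressed to the second application are redirected;
        # everything else is passed through unchanged.
        if message.destination == self.second_app_address:
            message = replace(message, destination=self.fault_injection_address)
        self.send(message)


sent: List[Message] = []
converter = AddressConversionModule("10.0.0.2:80", "10.0.0.9:9000", sent.append)
converter.forward(Message("10.0.0.2:80", b"request"))
converter.forward(Message("10.0.0.3:80", b"unrelated"))
print([m.destination for m in sent])
```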
  • the function of the fault injection module can be realized by multiple functional units.
  • The fault injection module may include: a first monitoring unit C01 and a fault injection unit C02.
  • The first monitoring unit C01 is used to obtain a business message generated by a first application and send the business message to the fault injection unit C02.
  • The fault injection unit C02 is configured to perform the fault injection operation indicated by the fault attribute parameter on the business message, and send the business message after fault injection to the second application, so that the second application can process the business message.
  • When the address conversion module modifies the destination address of the business message to the address of the fault injection module, it essentially modifies the destination address of the business message to the address of the first monitoring unit C01.
  • the fault injection module can inject faults into business messages with specific attributes for targeted reliability testing.
  • the business messages can be screened, and faults can be injected into the business messages that pass the screening.
  • The function of screening business messages can be implemented by a filtering unit C03. That is, as shown in FIG. 1, the multiple functional units may further include a filtering unit C03. The filtering unit C03 is used to filter business messages according to filtering conditions, send the business messages that meet the filtering conditions to the fault injection unit C02, and send the business messages that do not meet the filtering conditions to the second application.
  • the multiple functional units may further include: a second monitoring unit C04.
  • the fault injection unit C02 may send the service message after the fault is injected to the second monitoring unit C04, so as to send the service message after the fault injection to the second application through the second monitoring unit C04.
  • The filtering unit C03 may bypass the fault injection unit C02 and send the business messages that do not meet the filtering conditions to the second monitoring unit C04, so that they are sent directly to the second application program through the second monitoring unit C04.
  • The fault injection module may further include an address conversion unit C05, which is used to modify the destination address of the business message after fault injection to the address of the second application.
  • For example, the address conversion unit C05 may modify the destination address of the business message after fault injection to the address of the second application, so that the second monitoring unit C04 sends that business message to the second application according to the modified destination address.
  • Similarly, the address conversion unit C05 may modify the destination address of a business message that does not meet the filtering conditions to the address of the second application, so that the second monitoring unit C04 sends that business message to the second application according to the modified destination address.
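The cooperation of units C01 to C05 can be summarized in a short Python sketch. The function `run_pipeline`, its callback parameters, and the sample filter and fault are hypothetical; the sketch only mirrors the order of operations described above (filter, optional fault injection, destination rewrite, send) and assumes the message has already been received by the first monitoring unit.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Message:
    destination: str
    payload: str


def run_pipeline(message: Message,
                 meets_filter: Callable[[Message], bool],
                 inject_fault: Callable[[Message], Message],
                 second_app_address: str,
                 send: Callable[[Message], None]) -> None:
    """Process one message already received by the first monitoring unit (C01)."""
    # C03: only messages that meet the filter condition reach C02.
    if meets_filter(message):
        message = inject_fault(message)  # C02: perform the fault injection
    # C05: point the message back at the second application.
    message = replace(message, destination=second_app_address)
    # C04: send the message on to the second application.
    send(message)


delivered: List[Message] = []
for payload in ("order:create", "health:ping"):
    run_pipeline(
        Message("fault-injector:9000", payload),
        meets_filter=lambda m: m.payload.startswith("order"),
        inject_fault=lambda m: replace(m, payload="<corrupted>"),
        second_app_address="second-app:8080",
        send=delivered.append,
    )
print([(m.destination, m.payload) for m in delivered])
```

Messages that fail the filter still have their destination rewritten and are delivered unchanged, which matches the bypass path from C03 to C04.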
  • the functions of the above fault injection module can be implemented in whole or in part by software, hardware, or a combination of software and hardware.
  • the function of the fault injection module may be realized in the form of a computer program product in whole or in part.
  • the computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on the computer, the processes or functions described in the embodiments of the present application are fully or partially realized.
  • the computer can be a general-purpose computer, a dedicated computer, a computer network, or other programmable devices.
  • the computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example: coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless manner (for example: infrared, wireless, microwave).
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrated with one or more available media.
  • The available medium can be a magnetic medium (for example: floppy disk, hard disk, magnetic tape), an optical medium (for example: Digital Versatile Disc (DVD)), a semiconductor medium (for example: Solid State Disk (SSD)), and so on.
  • the application scenarios involved in the fault injection method provided in the embodiments of the present application may have multiple deployment modes, and the following are described as examples.
  • the application scenario includes: a first server 110 and a second server 120.
  • Both the first server 110 and the second server 120 may be one server, or a server cluster composed of several servers, or a cloud computing service center.
  • the first server 110 and the second server 120 may be connected through a wired network or a wireless network.
  • the first server 110 includes a first processor 1101, a first communication interface 1102, and a first memory 1103.
  • the first processor 1101, the first communication interface 1102, and the first memory 1103 are connected to each other through a first bus 1104.
  • the second server 120 includes a second processor 1201, a second communication interface 1202, and a second memory 1203.
  • the second processor 1201, the second communication interface 1202, and the second memory 1203 are connected to each other through a second bus 1204.
  • the first memory 1103 and the second memory 1203 are both used to store a computer program, and the computer program may be an application program.
  • When the first processor 1101 calls the application program in the first memory 1103, it can realize the function of that application program.
  • When the second processor 1201 calls the application program in the second memory 1203, it can realize the function of that application program.
  • the first server 110 may run a first application program, an address conversion module, and a fault injection module
  • the second server 120 may run a second application program.
  • the first application program 1103a, the address conversion module 1103b, and the fault injection module 1103c may be stored in the first memory 1103, and the second application program 1203a may be stored in the second memory 1203.
  • the first server 110 may run a first application program and an address conversion module
  • the second server 120 may run a second application program and a fault injection module.
  • the first memory 1103 may store the first application program 1103a and the address conversion module 1103b
  • the second memory 1203 may store the second application program 1203a and the fault injection module 1203b.
  • FIG. 4 is a schematic diagram of an application scenario involved in another fault injection method provided by an embodiment of the present application.
  • the application scenario includes: a first server 110, a second server 120, and a third server 130.
  • the first server 110, the second server 120, and the third server 130 may all be one server, or a server cluster composed of several servers, or a cloud computing service center.
  • the first server 110 and the second server 120, the first server 110 and the third server 130, and the second server 120 and the third server 130 may all be connected through a wired network or a wireless network.
  • the first server 110 includes a first processor 1101, a first communication interface 1102, and a first memory 1103.
  • the first processor 1101, the first communication interface 1102, and the first memory 1103 are connected to each other through a first bus 1104.
  • the second server 120 includes a second processor 1201, a second communication interface 1202, and a second memory 1203.
  • the second processor 1201, the second communication interface 1202, and the second memory 1203 are connected to each other through a second bus 1204.
  • the third server 130 includes a third processor 1301, a third communication interface 1302, and a third memory 1303.
  • the third processor 1301, the third communication interface 1302, and the third memory 1303 are connected to each other through a third bus 1304.
  • The first memory 1103, the second memory 1203, and the third memory 1303 are all used to store computer programs, and the computer programs may be application programs. When a processor in a server calls an application program in the server's memory, it can implement the function of the application program.
  • The first application program and the address conversion module may run in the first server 110, the second application program may run in the second server 120, and the fault injection module may run in the third server 130.
  • Correspondingly, the first application program 1103a and the address conversion module 1103b can be stored in the first memory 1103, the second application program 1203a can be stored in the second memory 1203, and the fault injection module 1303a can be stored in the third memory 1303.
  • the application scenarios involved in the fault injection method provided in the embodiments of the present application may include: a first server, a second server, and a fault injection module.
  • Both the first server and the second server may be one server, or a server cluster composed of several servers, or a cloud computing service center.
  • the first server and the second server, the first server and the fault injection module, and the fault injection module and the second server may be connected through a wired network or a wireless network.
  • For the deployment of the first application in the first server and the deployment of the second application in the second server, refer to the deployment in FIG. 3; details are not repeated here.
  • the address conversion module may be integrated in the first application program.
  • the address conversion module may be a small program in the first application program.
  • The address conversion module may obtain the business message generated by the first application program for accessing the second application program, and modify the destination address of the business message to the address of the fault injection module so that the fault injection module can obtain the business message.
  • Alternatively, the address conversion module and the first application program may be independently deployed in the first server 110. For example, the foregoing FIGS. 2 to 4 are all exemplary illustrations of the independent deployment of the first application program and the address conversion module.
  • When the address conversion module is deployed independently of the first application program, from the perspective of the first application program, the process of sending a business message generated for accessing the second application program is the same as when no reliability test is being carried out. The first application program therefore does not need to be changed for the test, which avoids the large amount of adaptation work that the related technology requires before a test can be carried out and achieves imperceptible fault injection.
  • Any one of the first bus 1104, the second bus 1204, and the third bus 1304 can be divided into an address bus, a data bus, a control bus, and the like.
  • For ease of representation, only one thick line is used to represent each bus in FIGS. 2 to 4, but this does not mean that there is only one bus or one type of bus.
  • Any one of the first processor 1101, the second processor 1201, and the third processor 1301 may be a hardware chip, and the hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • The above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • Alternatively, any one of the processors may be a general-purpose processor, for example, a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
  • Any one of the first memory 1103, the second memory 1203, and the third memory 1303 may include volatile memory, such as random-access memory (RAM); non-volatile memory, such as flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or a combination of the above types of memory.
  • The following describes the implementation process of the fault injection method provided in the embodiments of the present application, taking as an example the case in which the multiple functional units of the fault injection module include a first monitoring unit, a fault injection unit, a filtering unit, a second monitoring unit, and an address conversion unit.
  • When the first application, the second application, the address conversion module, and the fault injection module are deployed in other ways, refer to the implementation process of step 501 to step 507 below for the implementation process of the fault injection method.
  • the method includes:
  • Step 501: The first monitoring unit receives the fault injection request, and sends an address conversion request to the address conversion module based on the fault injection request.
  • the address conversion request is used to request that the destination address of the business message generated by the first application program for accessing the second application program be modified to the address of the first monitoring unit.
  • the fault injection request is used to request for fault injection during the process of sending a business message from the first application to the second application.
  • the fault injection request may carry a fault attribute parameter, and the fault attribute parameter is used to indicate the fault to be injected.
  • After receiving the fault injection request, the first monitoring unit also needs to send the fault attribute parameter to the fault injection unit.
  • the fault attribute parameter may also be pre-stored in the fault injection unit, which is not specifically limited in the embodiment of the present application.
  • The fault injection request is sent to the fault injection module and can be received by any functional unit in the fault injection module, for example, by the first monitoring unit, the filtering unit, the fault injection unit, or the second monitoring unit.
  • Figure 5 is an example in which the fault injection request is received by the first monitoring unit.
  • A fault injection request can be triggered on the management client to request that faults be injected into the implementation system whose reliability is to be analyzed.
  • the tester can submit a fault injection task in the management client to trigger the fault injection request.
  • the fault injection request can be triggered by invoking a software development kit (SDK).
  • The fault injection task can be submitted through the user interface of an application in the management client, or through the user interface of a browser in the management client.
  • the fault attribute parameter can be input in the user interface of the management client.
  • the fault attribute parameter may include: the fault identifier of the injected fault.
  • The first server may store program instructions for injecting different faults. After receiving the fault injection request carrying the fault identifier, the first server may call the program instructions of the fault indicated by the fault identifier and, by running the called program instructions, inject the fault indicated by the fault identifier.
  • the fault attribute parameter may further include one or more of the following: the probability of occurrence of the fault and the duration of the fault.
  • By setting the duration of the fault, the duration of the fault injection can be effectively controlled, so that the failure state of the dependent end among the first application and the second application can be simulated within a specified time period.
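  • The role of the fault duration can be sketched as a simple time window, as below. This is only an illustration under stated assumptions: the class name FaultWindow and the injectable clock are hypothetical and do not come from the application.

```python
import time


class FaultWindow:
    """Tracks whether fault injection is still within its configured duration.

    Hypothetical helper: duration_seconds plays the role of the fault duration
    carried in the fault attribute parameter.
    """

    def __init__(self, duration_seconds, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + duration_seconds

    def active(self):
        """Return True while the configured duration has not elapsed."""
        return self._clock() < self._deadline


# Example with a fake clock so the behaviour is easy to follow.
now = [0.0]
window = FaultWindow(3600, clock=lambda: now[0])
print(window.active())  # faults are injected while this is True
```

  • A monotonic clock is used by default so that adjustments to the system time do not shorten or extend the injection period.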
  • When the fault attribute parameter includes the probability of occurrence of the fault, injecting the fault according to the fault attribute parameter makes the fault appear randomly according to that probability. This simulates the situation in which the dependent end is in a sub-healthy state and enables reliability testing when the dependent end is in a sub-healthy state.
  • The dependent end being in a sub-healthy state means that the dependent end can provide services but fails with a certain probability.
  • For example, assume that the dependent end is the second application and the injected fault is a packet loss fault. When the probability of occurrence of the fault is 100% (that is, the packet loss rate is 100%), every packet is discarded, which is the failure state. When the probability of occurrence of the fault is 50% (that is, the packet loss rate is 50%), packets are discarded with a probability of 50%, which corresponds to the sub-healthy state.
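  • The probabilistic packet loss described above can be sketched as follows. This is a minimal sketch, not the application's implementation: the function names and the use of a pseudo-random generator are assumptions, and drop_rate is expressed as a fraction between 0.0 and 1.0.

```python
import random


def should_inject(drop_rate, rng):
    """Decide whether the fault occurs for one packet.

    drop_rate is the probability of occurrence, from 0.0 to 1.0.
    """
    if not 0.0 <= drop_rate <= 1.0:
        raise ValueError("drop_rate must be between 0.0 and 1.0")
    # rng.random() is in [0.0, 1.0), so 1.0 always injects and 0.0 never does.
    return rng.random() < drop_rate


def drop_packets(packets, drop_rate, rng=None):
    """Return the packets that survive the simulated packet loss fault."""
    if rng is None:
        rng = random.Random()
    return [p for p in packets if not should_inject(drop_rate, rng)]


# A 100% rate models the failure state; 50% models the sub-healthy state.
print(len(drop_packets(list(range(10)), 1.0)))
print(len(drop_packets(list(range(1000)), 0.5, random.Random(7))))
```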
  • The fault attribute parameters can also include one or more of the following: the port number and IP of the injected fault, and filter conditions for filtering business messages.
  • The port number and IP of the injected fault are the port number and IP of the port used by the second application to receive the service message. By setting them, the failure of that port can be simulated.
  • By setting filter conditions for filtering business messages, faults can be injected only into business messages with specific attributes. Faults are then injected into business requests in a targeted manner, which avoids introducing irrelevant tests into the testing process, avoids affecting other services that do not need to be tested, and achieves accurate fault injection. It should be noted that, when the fault injection request carries a filter condition, the first monitoring unit needs to send the filter condition to the filtering unit after receiving the fault injection request.
  • FIG. 6 is a schematic diagram of the user interface of an application in the management client provided by an embodiment of the application.
  • The tester can input fault attribute parameters in the user interface and, after the parameter input is completed, click the "Inject now" button in the user interface to trigger a fault injection request.
  • The fault attribute parameters that need to be input in the user interface shown in Figure 6 include: the environment name of the test environment, the name of the environment node used to deploy the test environment, the Internet Protocol (IP) address of the server under test running the first application, the network data exchange rule (protocol) in the test environment, the IP and port number of the server on which the business depends (server_address), the type of the injected fault (drop_type, also called the fault identifier of the injected fault), the probability of occurrence of the fault (drop_rate), the duration of the fault (timeout), and the filter conditions for filtering business messages (filter_keyword_content).
  • Figure 6 shows the user interface when the injected fault is a packet loss fault. In this example:
  • the environment name is: ECS_cui
  • the name of the environment node is: Apigateway_001
  • the IP of the server under test is: 172.168.200.41
  • the network data exchange rule is: Hypertext Transfer Protocol Secure (https)
  • the IP and port number of the server on which the business depends are: 172.168.200.42:8080
  • the fault identifier of the injected fault is: req_drop (the identifier indicates that the injected fault is a packet loss fault)
  • the probability of occurrence of the fault, that is, the packet loss rate
  • the duration of the fault is: 3600 seconds
  • the filter condition for filtering business messages is: all, which indicates that all business messages are selected, that is, fault injection is performed on all business messages.
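  • The parameters of Figure 6 could be carried in a fault injection request roughly as sketched below. The field names drop_type, server_address, drop_rate, timeout, and filter_keyword_content come from the text; the class name, the validation rules, the to_request helper, and the drop_rate value of 0.5 are assumptions made for illustration only.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FaultAttributeParams:
    """Fault attribute parameters, named after the fields listed for Figure 6."""
    drop_type: str               # fault identifier, e.g. "req_drop" (packet loss)
    server_address: str          # "IP:port" of the server the business depends on
    drop_rate: float             # probability of occurrence, from 0.0 to 1.0
    timeout: int                 # duration of the fault, in seconds
    filter_keyword_content: str  # filter condition; "all" selects every message

    def __post_init__(self):
        if not 0.0 <= self.drop_rate <= 1.0:
            raise ValueError("drop_rate must be between 0.0 and 1.0")
        if self.timeout <= 0:
            raise ValueError("timeout must be a positive number of seconds")
        host, sep, port = self.server_address.rpartition(":")
        if not (host and sep and port.isdigit()):
            raise ValueError("server_address must look like 'IP:port'")

    def to_request(self):
        """Return the parameters as a plain dict, ready to be serialized."""
        return asdict(self)


params = FaultAttributeParams(
    drop_type="req_drop",
    server_address="172.168.200.42:8080",
    drop_rate=0.5,  # illustrative value only; not taken from Figure 6
    timeout=3600,
    filter_keyword_content="all",
)
print(params.to_request())
```

  • Validating the parameters when the request is built means that a malformed address or an out-of-range probability is rejected before any fault is injected.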
  • Step 502 The first application program generates a business message for accessing the second application program.
  • The user can send a business request instruction to the first application through the client, and the first application can generate a corresponding business message after receiving the business request instruction (at this time, the business message is also called a business request).
  • For example, the first application can generate a service request for accessing the second application, so as to request the second application to provide the corresponding service to the first application, so that the first application provides the client with the service indicated by the service request instruction according to the service provided by the second application.
  • Alternatively, when the user needs the second application to provide business services (for example, cloud services), the user can send a business request instruction to the second application through the client, and the second application can generate the corresponding service request after receiving the business request instruction. After the second application sends the service request to the first application, the first application can generate, based on the service request, a service message for accessing the second application (at this time, the service message is also called a service response), so that the second application provides the client with the service indicated by the service request instruction according to the service response.
  • Step 503 The address conversion module obtains the service message, based on the address conversion request, modifies the destination address of the service message to the address of the first monitoring unit, and sends the modified service message.
  • After the address conversion module receives the address conversion request, it can monitor whether the first application has generated a business message for accessing the second application. When it detects such a business message, it acquires the business message, modifies the destination address of the business message to the address of the first monitoring unit, and sends the business message with the modified destination address to the first monitoring unit.
  • For example, assume that the IP and port number of the first application are 172.168.200.41:4000, those of the first monitoring unit are 172.168.200.41:5000, those of the second monitoring unit are 1.2.3.4:8080, and those of the second application are 172.168.200.42:8080, and that the service request generated by the first application is: http://172.168.200.42:8080/v1/xxxx.
  • When fault injection is not required, the sender address of the business message generated by the first application for accessing the second application is 172.168.200.41:4000 and the destination address of the business message is 172.168.200.42:8080, so routers and other devices in the network send the service message to the second application according to its destination address.
  • When fault injection is required, the address conversion module can modify the destination address of the business message to the address of the first monitoring unit, that is, from 172.168.200.42:8080 to 172.168.200.41:5000. Gateways and other devices in the network then send the service message to the first monitoring unit according to the modified destination address. Therefore, by modifying the destination address of the business message, the fault injection module can intercept the business message that was to be sent to the second application and, after injecting a fault into the business message, send the fault-injected business message to the second application.
  • The destination address of the business message can be modified through the iptables command (a kind of program instruction).
  • For example, the command to modify the destination address of the business message from 172.168.200.42:8080 to 172.168.200.41:5000 can be: iptables -t nat -A OUTPUT -d 172.168.200.42 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 172.168.200.41:5000
  • The address conversion module can also be controlled to stop working, so that the first application sends business messages for accessing the second application directly. The first application and the second application then work in the original way and the business returns to normal.
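  • The redirection rule and its removal can be sketched as a small command builder, shown below. The helper name dnat_rule is hypothetical, and using the iptables -D option to delete the rule is only one assumed way of stopping the address conversion; the application does not say how the module is stopped. The sketch only builds the argument list and does not run it, since changing NAT rules requires root privileges.

```python
def dnat_rule(action, orig_ip, orig_port, new_ip, new_port):
    """Build the argument list for an iptables DNAT rule in the nat OUTPUT chain.

    action is "-A" to append the rule (start redirecting) or "-D" to delete
    the same rule (stop redirecting).
    """
    if action not in ("-A", "-D"):
        raise ValueError("action must be '-A' or '-D'")
    return [
        "iptables", "-t", "nat", action, "OUTPUT",
        "-d", orig_ip, "-p", "tcp", "-m", "tcp",
        "--dport", str(orig_port),
        "-j", "DNAT", "--to-destination", f"{new_ip}:{new_port}",
    ]


# Redirect traffic for the second application to the first monitoring unit,
# then build the matching rule deletion.
start = dnat_rule("-A", "172.168.200.42", 8080, "172.168.200.41", 5000)
stop = dnat_rule("-D", "172.168.200.42", 8080, "172.168.200.41", 5000)
print(" ".join(start))
print(" ".join(stop))
```

  • Building the command as a list rather than a single string keeps each argument separate, which avoids shell quoting problems if the list is later passed to a process runner.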
  • Step 504 The first monitoring unit receives the service message with the modified destination address, and sends the modified service message to the filtering unit.
  • Step 505 The filtering unit filters the business messages according to the filtering conditions, and when the business messages meet the filtering conditions, sends the business messages to the fault injection unit.
  • Faults can be injected into business messages with specific attributes as follows: the filtering unit filters the business messages obtained by the fault injection module according to the filtering conditions and, when a business message meets the filtering conditions, sends the business message to the fault injection unit, which injects the fault into it. This avoids introducing irrelevant tests into the test process and achieves accurate fault injection.
  • the filter conditions can be set according to actual needs.
  • When the business message is a business request, the filter condition may involve one or more of the following: request type, address requested to be accessed, request header keyword, and request body keyword.
  • When the business message is a response to a business request, the filter condition may involve one or more of the following: response status code, response header keyword, and response body keyword.
  • The request type may include: a request to submit changed information to the receiver (a POST request), a request to query data (a GET request), a request for the header of a page (a HEAD request), a request that allows the client to view the performance of the server (an OPTIONS request), a request from the client to the server to replace the content of a specified document (a PUT request), and other request types.
  • the address requested for access can be represented by a uniform resource identifier (URI).
  • The request header keyword can be the Content-Type of the request.
  • The request body keyword can be, for example, success, indicating that the request succeeded.
  • The response header keyword can be the Content-Type of the response.
  • The response body keyword can be, for example, success, indicating that the response succeeded.
  • the response status code can be 200, 404, 500, etc.
  • the response status code 200 indicates that the request is successful
  • the response status code 404 indicates that the requested resource does not exist
  • the response status code 500 indicates that the server has an unexpected error.
  • If the filtering unit determines that the service message does not meet the filtering conditions, it can send the service message to the second monitoring unit, which sends the service message to the second application.
  • When the second monitoring unit is not provided, if the filtering unit determines that the service message does not meet the filtering conditions, the service message can be sent directly to the second application.
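  • The filtering behaviour of step 505 can be sketched as follows. This is a hedged illustration: the BusinessMessage and FilterCondition classes, their field names, and the string labels returned by route are all hypothetical, and an empty condition is assumed to behave like the "all" filter.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BusinessMessage:
    """Simplified view of a business request or response (hypothetical)."""
    method: Optional[str] = None       # request type, e.g. "POST"
    uri: Optional[str] = None          # address requested to be accessed
    status_code: Optional[int] = None  # response status code, e.g. 200
    headers: dict = field(default_factory=dict)
    body: str = ""


@dataclass
class FilterCondition:
    """Each field left as None is ignored; an empty condition matches all."""
    method: Optional[str] = None
    uri_prefix: Optional[str] = None
    status_code: Optional[int] = None
    header_keyword: Optional[str] = None
    body_keyword: Optional[str] = None

    def matches(self, msg):
        if self.method is not None and msg.method != self.method:
            return False
        if self.uri_prefix is not None and not (msg.uri or "").startswith(self.uri_prefix):
            return False
        if self.status_code is not None and msg.status_code != self.status_code:
            return False
        if self.header_keyword is not None and not any(
            self.header_keyword in f"{k}: {v}" for k, v in msg.headers.items()
        ):
            return False
        if self.body_keyword is not None and self.body_keyword not in msg.body:
            return False
        return True


def route(msg, cond):
    """Name the unit that receives the message after filtering."""
    return "fault_injection_unit" if cond.matches(msg) else "second_monitoring_unit"


request = BusinessMessage(
    method="POST",
    uri="/v1/xxxx",
    headers={"Content-Type": "application/json"},
    body='{"result": "success"}',
)
print(route(request, FilterCondition(method="POST", body_keyword="success")))
print(route(request, FilterCondition(method="GET")))
```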
  • Step 506 The fault injection unit performs the fault injection operation indicated by the fault attribute parameter on the service request, and sends the service message after the fault injection operation is performed to the second monitoring unit.
  • the faults injected into the service request can be network faults, such as packet loss faults, delay faults, and error packet faults.
  • The packet loss fault refers to the situation in which the data packet used to carry the service message is dropped while the service message is being sent. That is, the fault injection unit injects the packet loss fault into the service request as follows: when the service request includes one data packet, the fault injection unit discards the data packet according to the fault attribute parameter; when the service request includes multiple data packets, the fault injection unit discards all or part of the multiple data packets according to the fault attribute parameter.
  • The delay fault refers to the situation in which the data packet used to carry the service message is forwarded after a delay while the service message is being sent.
  • That is, the fault injection unit injects the delay fault into the service request by sending the data packet of the service message after delaying for the time specified by the fault attribute parameter.
  • The error packet fault refers to the situation in which the content of the data packet used to carry the business message is modified while the business message is being sent, for example, one or more of the message header, message body, and status code in the data packet are modified. That is, the fault injection unit injects the error packet fault into the service request by modifying the message in the service message according to the fault attribute parameter.
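  • The three network faults above can be sketched as operations on a list of packets, as shown below. This is an illustration under stated assumptions: packets are modelled as bytes objects, req_drop is the identifier from the text, and the identifiers req_delay and req_error, the function names, and the byte-flipping corruption are hypothetical.

```python
import time


def inject_delay(packets, delay_seconds, sleep=time.sleep):
    """Delay fault: wait for the configured time, then forward unchanged."""
    sleep(delay_seconds)
    return list(packets)


def inject_error(packets):
    """Error packet fault: modify the content of each non-empty packet.

    Here the first byte is inverted; a real test would modify the header,
    body, or status code according to the fault attribute parameter.
    """
    return [bytes([p[0] ^ 0xFF]) + p[1:] if p else p for p in packets]


def inject_fault(packets, fault_type, **params):
    """Dispatch on the fault identifier and return the packets to forward."""
    handlers = {
        "req_drop": lambda pkts: [],  # discard every packet (100% loss)
        "req_delay": lambda pkts: inject_delay(pkts, params.get("delay_seconds", 0.0)),
        "req_error": inject_error,
    }
    if fault_type not in handlers:
        raise ValueError(f"unknown fault type: {fault_type}")
    return handlers[fault_type](packets)


print(inject_fault([b"GET /v1"], "req_drop"))
print(inject_fault([b"GET /v1"], "req_error"))
```

  • Keeping the handlers in a dictionary keyed by fault identifier reflects the statement that the set of injected faults is expandable: a new fault only needs a new entry.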
  • the injected faults may also be system resource faults, node faults, database faults, container faults, and other types of faults.
  • The system resource faults may include: central processing unit (CPU) boost fault, memory leak fault, hard disk fault, network fault, abnormal process exit fault, file abnormality fault, file system fault, system management fault, and so on.
  • the node faults may include: abnormal shutdown of the node and abnormal restart of the node.
  • the injected fault is not limited to the above-described fault, and may also be other faults, that is, the injected fault is expandable.
  • The program instructions for injecting faults can be written in advance and stored in the storage medium. When faults need to be injected, the fault to be injected can be determined by the fault injection method provided in the embodiments of the present application, and the program instructions for injecting the corresponding fault can be called to perform the corresponding fault injection.
  • When the fault injection module includes the second monitoring unit, after the fault injection unit injects the fault into the business message, it can send the fault-injected service message to the second monitoring unit, which sends the fault-injected business message to the second application.
  • When the fault injection module does not include the second monitoring unit, after the fault injection unit injects the fault into the service message, the fault-injected service message may be sent directly to the second application, that is, step 507 may not be executed.
  • The address conversion unit in the fault injection module can also modify the destination address of the service message to be sent by the second monitoring unit to the address of the second application.
  • the address conversion unit may modify the destination address of the service message from 1.2.3.4:8080 to 172.168.200.42:8080.
  • The command to modify the destination address of the business message from 1.2.3.4:8080 to 172.168.200.42:8080 can be: iptables -t nat -A OUTPUT -d 1.2.3.4 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 172.168.200.42:8080
  • Alternatively, the receiving end of the fault-injected service message can be set as the second application, so that the functional unit used to send the fault-injected service message to the second application can obtain the receiving end of that message according to the fault attribute parameter. In that case, the fault injection module may not include an address conversion unit.
  • Step 507 The second monitoring unit sends the fault-injected service message to the second application, so that the second application processes the fault-injected service message.
  • When the business message is a business request, after the second application receives the fault-injected business request, it provides business services based on that request and sends a business response to the first application based on the result; the first application reacts after receiving the business response. This is equivalent to simulating a failure of the second application, so the reliability of the first application when the second application fails can be analyzed.
  • When the business message is a business response, the second application reacts after receiving the fault-injected business response. This is equivalent to simulating a failure of the first application, so the reliability of the second application when the first application fails can be analyzed.
  • In summary, in the fault injection method provided by the embodiments of the present application, the fault injection module obtains the business message, performs the fault injection operation indicated by the fault attribute parameter on the business message, and sends the business message after the fault injection operation to the second application, so that faults are injected while the business message is being sent. Compared with the global faults in the related art, the granularity of the injected faults is finer, which effectively improves the accuracy of reliability testing based on the injected faults.
  • In addition, because the fault is injected by intercepting the business message while it is being sent, there is no need to adapt the object into which the fault is injected, and the fault can be injected without that object being aware of it.
  • The sequence of steps of the fault injection method provided in the embodiments of the present application can be appropriately adjusted, and steps can be added or removed according to the situation. For example, whether to execute step 507 can be chosen according to the situation. Any variation that a person familiar with the technical field can readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application, and is therefore not described further.
  • the embodiment of the present application also provides a fault injection device, which can be deployed on a server or computer equipment.
  • the fault injection device may include the fault injection module provided in the embodiment of the present application.
  • the fault injection device may include: a first monitoring unit and a fault injection unit; the first monitoring unit is used to obtain a business message generated by a first application, and the business message is used to access a second application; The fault injection unit is used to obtain the service message from the first monitoring unit, perform the fault injection operation indicated by the fault attribute parameter on the service message, and send the service message after the fault injection operation is performed to the second application.
  • The fault injection device may further include a filtering unit. The filtering unit is configured to receive the business message sent by the first monitoring unit, filter the business message according to the filtering conditions, and send the business message that meets the filtering conditions to the fault injection unit.
  • The fault injection device may further include a second monitoring unit. In this case, the fault injection unit is specifically configured to send the service message after the fault injection operation is performed to the second monitoring unit, and the second monitoring unit is configured to send the service message after the fault injection operation is performed to the second application.
  • the fault injection device may further include: an address conversion unit; the address conversion unit is configured to modify the destination address of the service message to be sent by the second monitoring unit to the address of the second application.
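  • The way these units cooperate can be sketched as a small pipeline, shown below. This is a minimal sketch under stated assumptions: the class FaultInjectionDevice, its callable parameters, and the convention that a return value of None models a dropped message are hypothetical and are not taken from the application.

```python
class FaultInjectionDevice:
    """Chains the monitoring, filtering, and fault injection units.

    inject: callable applied to messages that pass the filter; returning
        None models a message that was dropped.
    send_to_second_app: callable standing in for the second monitoring unit.
    matches: optional filter predicate; when omitted, every message is selected.
    """

    def __init__(self, inject, send_to_second_app, matches=None):
        self.inject = inject
        self.send = send_to_second_app
        self.matches = matches

    def on_message(self, message):
        """Entry point of the first monitoring unit for one redirected message."""
        if self.matches is None or self.matches(message):
            message = self.inject(message)
        if message is not None:
            self.send(message)


delivered = []
device = FaultInjectionDevice(
    inject=lambda m: None if "drop" in m else m.upper(),
    send_to_second_app=delivered.append,
    matches=lambda m: m.startswith("POST"),
)
for msg in ["POST /v1/a", "GET /v1/b", "POST drop"]:
    device.on_message(msg)
print(delivered)
```

  • Messages that do not meet the filter condition bypass the fault injection unit and are forwarded unchanged, which matches the behaviour described for the filtering unit.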
  • the embodiment of the present application also provides a storage medium.
  • the storage medium is a non-volatile computer-readable storage medium.
  • The storage medium stores instructions which, when executed, implement the functions of the fault injection module or the address conversion module in the embodiment of the present application.
  • the embodiments of the present application also provide a computer program product containing instructions.
  • When the computer program product runs on a computer, the computer executes the functions implemented by the fault injection module or the address conversion module in the embodiments of the present application.
  • the terms “first”, “second” and “third” are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance.
  • the term “at least one” refers to one or more, and the term “plurality” refers to two or more, unless specifically defined otherwise.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)
  • Debugging And Monitoring (AREA)

Abstract

A fault injection method, comprising: a first application generates a service message, the service message being used for accessing a second application; a fault injection module obtains the service message, and performs a fault injection operation indicated by fault attribute parameters on the service message; the fault injection module sends the service message having experienced the fault injection operation to the second application. The method improves the accuracy of reliability testing according to an injected fault.

Description

Fault injection method and device, and business service system
Technical field
This application relates to the computer field, and in particular to a fault injection method and device, and a business service system.
Background
With the rapid development of cloud computing technology, cloud services are used more and more widely. For a cloud service provider, building its own cloud service (hereinafter referred to as cloud service B) on top of another cloud service (hereinafter referred to as cloud service A) can effectively improve the development efficiency of the cloud service. However, this makes cloud service B dependent on cloud service A, so that when the implementation system of cloud service A fails, cloud service B may become unavailable. It is therefore necessary to test the reliability of the implementation system of a cloud service when the implementation system of the cloud service on which it depends fails.
In the related art, the reliability of the implementation system of cloud service B is tested as follows: faults are injected into the implementation system of cloud service A to simulate a failure of that system; the failed implementation system then continues to provide cloud services to the implementation system of cloud service B; the behavior of the implementation system of cloud service B after receiving the cloud services provided by the failed implementation system is observed; and the reliability of the implementation system of cloud service B is evaluated according to that behavior. The faults currently injected mainly include: a shutdown fault of the server that implements the cloud service, a network disconnection fault of the server that implements the cloud service, and a forced exit fault of the cloud service process.
However, the above faults are all global faults, so reliability tests based on them have low accuracy.
Summary of the invention
This application provides a fault injection method and device, and a business service system, which can solve the problem that current reliability testing has low accuracy.
In a first aspect, a fault injection method is provided. The method is applied to a business service system in which a first application, a second application, and a fault injection module run. The method includes: the first application generates a business message, where the business message is used to access the second application; the fault injection module obtains the business message and performs the fault injection operation indicated by a fault attribute parameter on the business message; and the fault injection module sends the business message after the fault injection operation is performed to the second application.
In the fault injection method provided by the embodiments of this application, after the first application generates a business message for accessing the second application, the fault injection module obtains the business message, performs the fault injection operation indicated by the fault attribute parameter on the business message, and sends the business message after the fault injection operation is performed to the second application. Faults can thus be injected while the business message is being sent. Compared with the global faults in the related art, the granularity of the injected faults is finer, which effectively improves the accuracy of reliability testing based on the injected faults. In addition, because the fault is injected by intercepting the business message while it is being sent, there is no need to adapt the object into which the fault is injected, and the fault can be injected without that object being aware of it.
作为使故障注入模块获取业务消息的一种可实现方式中,业务服务系统中还运行 有地址转换模块,在故障注入模块获取业务消息之前,方法还包括:地址转换模块获取业务消息,将业务消息的目的地址修改为故障注入模块的地址,发送修改目的地址后的业务消息。As an achievable way for the fault injection module to obtain business messages, an address conversion module is also running in the business service system. Before the fault injection module obtains the business message, the method further includes: the address conversion module obtains the business message and transfers the business message The destination address is modified to the address of the fault injection module, and the business message with the modified destination address is sent.
在一种实现方式中,故障注入模块对业务消息执行故障属性参数所指示的故障注入操作,包括:故障注入模块在业务消息符合过滤条件时,对业务消息执行故障属性参数所指示的故障注入操作。通过使用过滤条件对业务消息进行过滤,并在业务消息符合过滤条件时,对业务消息注入故障,使得可以对业请求有针对性地进行故障注入,能够避免在测试过程中引入无关的测试,实现对故障的精准注入。In one implementation, the fault injection module performs the fault injection operation indicated by the fault attribute parameter on the business message, including: the fault injection module performs the fault injection operation indicated by the fault attribute parameter on the business message when the business message meets the filter condition . By using filter conditions to filter business messages, and when the business message meets the filter conditions, faults are injected into the business messages, so that fault injection can be carried out in a targeted manner for business requests, which can avoid the introduction of irrelevant tests in the test process. Accurate injection of faults.
When the business message is a business request, the filter condition involves one or more of the following: the request type, the address the request accesses, a request header keyword, and a request body keyword. When the business message is a response to a business request, the filter condition involves one or more of the following: the response status code, a response header keyword, and a response body keyword.
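The request-side filter condition can be illustrated with a minimal sketch. The class names, field names, and the choice to combine the configured items with a logical AND are assumptions made for illustration; the application does not specify how the items are combined or how keywords are matched.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BusinessRequest:
    method: str                                   # request type, e.g. "GET"
    url: str                                      # address the request accesses
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


@dataclass
class RequestFilter:
    # Every item is optional; None means "do not filter on this item".
    method: Optional[str] = None
    url_prefix: Optional[str] = None
    header_keyword: Optional[str] = None
    body_keyword: Optional[str] = None

    def matches(self, req: BusinessRequest) -> bool:
        """Return True only if the request satisfies every configured item."""
        if self.method is not None and req.method.upper() != self.method.upper():
            return False
        if self.url_prefix is not None and not req.url.startswith(self.url_prefix):
            return False
        if self.header_keyword is not None:
            # Search the keyword in a flattened "name: value" view of the headers.
            header_text = " ".join(f"{k}: {v}" for k, v in req.headers.items())
            if self.header_keyword not in header_text:
                return False
        if self.body_keyword is not None and self.body_keyword not in req.body:
            return False
        return True
```

A response-side filter could follow the same pattern, with a status code field in place of the request type and address.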
Optionally, the fault injection operation indicated by the fault attribute parameter may be a packet loss operation, a delay operation, or a packet error operation. Correspondingly, performing the fault injection operation indicated by the fault attribute parameter may include the following cases:
The business message includes multiple data packets. When the fault injection operation indicated by the fault attribute parameter is a packet loss operation, the fault injection module performing the fault injection operation on the business message includes: the fault injection module discards some or all of the multiple data packets based on the fault attribute parameter.
When the fault injection operation indicated by the fault attribute parameter is a delay operation, the fault injection module performing the fault injection operation on the business message includes: the fault injection module, based on the fault attribute parameter, sends the business message after delaying it for the duration specified by the fault attribute parameter.
When the fault injection operation indicated by the fault attribute parameter is a packet error operation, the fault injection module performing the fault injection operation on the business message includes: the fault injection module modifies the packet content of the business message based on the fault attribute parameter.
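The three cases above can be sketched as a single dispatch function. This is a simplified illustration under stated assumptions: the operation names "drop", "delay", and "corrupt", the fields of `FaultAttributes`, and the idea of overwriting each payload with a fixed replacement are all hypothetical, since the application does not define the layout of the fault attribute parameter.

```python
import random
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FaultAttributes:
    operation: str                # "drop", "delay", or "corrupt" (hypothetical names)
    drop_ratio: float = 1.0       # probability of discarding each packet for "drop"
    delay_seconds: float = 0.0    # how long to hold the message for "delay"
    replacement: bytes = b""      # payload written into each packet for "corrupt"


def inject_fault(packets: List[bytes], attrs: FaultAttributes,
                 rng: Optional[random.Random] = None) -> List[bytes]:
    """Return the packets that should still be forwarded after fault injection."""
    rng = rng or random.Random()
    if attrs.operation == "drop":
        # Packet loss: discard some or all of the packets.
        return [p for p in packets if rng.random() >= attrs.drop_ratio]
    if attrs.operation == "delay":
        # Delay: forward the unchanged message after the specified duration.
        time.sleep(attrs.delay_seconds)
        return list(packets)
    if attrs.operation == "corrupt":
        # Packet error: modify the content of the message.
        return [attrs.replacement for _ in packets]
    raise ValueError(f"unknown fault operation: {attrs.operation!r}")
```

With `drop_ratio=1.0` every packet is discarded, and with `drop_ratio=0.0` none is, which covers the "some or all" wording of the packet loss case.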
In a second aspect, a business service system is provided. The business service system includes a first server, a second server, and a fault injection module. The first server is configured to generate a business message, and the business message is used to access the second server. The fault injection module is configured to obtain the business message and perform on it the fault injection operation indicated by the fault attribute parameter. The fault injection module is further configured to send the business message on which the fault injection operation has been performed to the second server.
Optionally, the business service system further includes an address conversion module. The address conversion module is configured to obtain the business message, modify the destination address of the business message to the address of the fault injection module, and send the business message with the modified destination address. The fault injection module is specifically configured to receive the business message sent by the address conversion module and perform on it the fault injection operation indicated by the fault attribute parameter.
Optionally, the fault injection module is specifically configured to perform on the business message the fault injection operation indicated by the fault attribute parameter when the business message meets a filter condition.
Optionally, when the business message is a business request, the filter condition involves one or more of the following: the request type, the address the request accesses, a request header keyword, and a request body keyword. When the business message is a response to a business request, the filter condition involves one or more of the following: the response status code, a response header keyword, and a response body keyword.
Optionally, when the fault injection operation indicated by the fault attribute parameter is a packet loss operation, the fault injection module is specifically configured to: when the business message includes one data packet, discard that data packet based on the fault attribute parameter; and when the business message includes multiple data packets, discard some or all of the multiple data packets based on the fault attribute parameter.
Optionally, when the fault injection operation indicated by the fault attribute parameter is a delay operation, the fault injection module is specifically configured to send the business message, based on the fault attribute parameter, after delaying it for the duration specified by the fault attribute parameter.
Optionally, when the fault injection operation indicated by the fault attribute parameter is a packet error operation, the fault injection module is specifically configured to modify the packet content of the business message based on the fault attribute parameter.
In a third aspect, a fault injection device is provided. The fault injection device includes a first monitoring unit and a fault injection unit. The first monitoring unit is configured to obtain a business message generated by a first application, the business message being used to access a second application. The fault injection unit is configured to obtain the business message from the first monitoring unit, perform on it the fault injection operation indicated by the fault attribute parameter, and send the business message on which the fault injection operation has been performed to the second application.
Optionally, the fault injection device further includes a filtering unit. The filtering unit is configured to receive the business message sent by the first monitoring unit, filter the business message according to a filter condition, and send business messages that meet the filter condition to the fault injection unit.
Optionally, the fault injection device further includes a second monitoring unit. The fault injection unit is specifically configured to send the business message on which the fault injection operation has been performed to the second monitoring unit, and the second monitoring unit is configured to send that business message to the second application.
Optionally, the fault injection device further includes an address conversion unit. The address conversion unit is configured to modify the destination address of the business message to be sent by the second monitoring unit to the address of the second application.
Description of the drawings
FIG. 1 is a schematic structural diagram of a fault injection module provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of an application scenario involved in a fault injection method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of an application scenario involved in another fault injection method provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of an application scenario involved in yet another fault injection method provided by an embodiment of the present application;
FIG. 5 is a flowchart of a fault injection method provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a user interface of an application in a management client provided by an embodiment of the present application.
Detailed description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
In the related art, when the reliability of a system implementing a cloud service is tested, the injected faults are mainly global faults, such as a shutdown of the server implementing the cloud service, a network disconnection of that server, and a forced exit of the cloud service process. Because all of these faults are at server granularity, injecting them affects the functions of other, unrelated services and introduces irrelevant tests, so the accuracy of reliability tests based on the injected faults is low. Moreover, injecting faults into the system implementing a cloud service on which the cloud service under test depends usually requires a large amount of adaptation work, so the faults cannot be injected imperceptibly.
The embodiments of the present application provide a fault injection method that can improve the accuracy of reliability tests based on injected faults. The application scenarios involved in the fault injection method are described first.
The application scenario involves a first application, a second application, and a fault injection module. The first application and the second application may be applications whose functions have a dependency relationship. That is, of the two applications, the function of the depending end relies on the function of the depended-on end. The dependency may be expressed as follows: the depending end sends a business request to the depended-on end; the depended-on end processes the business request and sends the depending end a business response carrying the processing result; and the depending end implements its function based on that business response. During normal operation of the first application and the second application, a business message generated by the first application for accessing the second application may be sent directly from the first application to the second application.
In the embodiments of the present application, by contrast, a business message generated by the first application for accessing the second application is first sent to the fault injection module. After the fault injection module performs on the business message the fault injection operation indicated by the fault attribute parameter, the business message with the fault injected is sent to the second application. The fault attribute parameter may be pre-stored in the fault injection module. Alternatively, the fault injection module may receive a fault injection request, which instructs it to inject a fault into the business message and may carry the fault attribute parameter.
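The two sources of the fault attribute parameter described above can be sketched as follows. The class name, the dictionary representation of the parameter, and the "fault_attributes" key are hypothetical; the application does not define the format of a fault injection request.

```python
from typing import Optional


class FaultInjectionModule:
    """Holds the fault attribute parameter that governs later injections."""

    def __init__(self, stored: Optional[dict] = None) -> None:
        # Option 1: the parameter is pre-stored in the module.
        self.fault_attributes = stored

    def on_fault_injection_request(self, request: dict) -> None:
        # Option 2: a fault injection request carries the parameter.
        # "fault_attributes" is a hypothetical field name.
        carried = request.get("fault_attributes")
        if carried is not None:
            self.fault_attributes = carried
```

In this sketch a request that carries no parameter leaves the stored one in effect, which is one possible reading of "may carry".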
In one possible implementation, the first application is the depending end and the second application is the depended-on end, so the business message is a business request sent by the first application to the second application. In this case, a fault is injected while the business request is being sent. From the perspective of the first application, this appears as a fault in the second application. Correspondingly, by analyzing how the first application reacts after receiving the business response sent by the second application, the reliability of the first application when the second application fails can be analyzed.
In another implementation, the second application is the depending end and the first application is the depended-on end, so the business message is a business response sent by the first application to the second application. In this case, a fault is injected while the business response is being sent. From the perspective of the second application, this appears as a fault in the first application. Correspondingly, by analyzing how the second application reacts after receiving the business response into which the fault was injected, the reliability of the second application when the first application fails can be analyzed.
Optionally, the function of forwarding to the fault injection module the business message generated by the first application for accessing the second application may be implemented by an address conversion module. For example, the address conversion module may obtain the business message generated by the first application, modify the destination address of the business message to the address of the fault injection module, and send the business message with the modified destination address, so that a business message originally destined for the second application is sent to the fault injection module instead.
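The redirection performed by the address conversion module can be sketched as a pure function on a message object. The `BusinessMessage` type, the (host, port) address representation, and the function name are assumptions for illustration; the application does not state at which network layer the address is rewritten.

```python
from dataclasses import dataclass, replace
from typing import Tuple

Address = Tuple[str, int]  # (host, port); a hypothetical address representation


@dataclass(frozen=True)
class BusinessMessage:
    destination: Address
    payload: bytes


def redirect_to_injector(msg: BusinessMessage, injector: Address) -> BusinessMessage:
    """Return a copy of the message addressed to the fault injection module.

    Only the destination changes; the payload produced by the first
    application is forwarded untouched.
    """
    return replace(msg, destination=injector)
```

Because the message is copied rather than edited in place, the first application's own view of the message is unchanged, which matches the idea that the application needs no modification.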
Optionally, the function of the fault injection module may be implemented by multiple functional units. For example, as shown in FIG. 1, the fault injection module may include a first monitoring unit C01 and a fault injection unit C02. The first monitoring unit C01 is configured to obtain the business message generated by the first application and send it to the fault injection unit C02. The fault injection unit C02 is configured to perform on the business message, based on the fault attribute parameter, the fault injection operation indicated by that parameter, and to send the business message with the fault injected to the second application so that the second application processes it. In this case, because the business message generated by the first application is obtained through the first monitoring unit C01, when the address conversion module modifies the destination address of the business message to the address of the fault injection module, it in effect modifies the destination address to the address of the first monitoring unit C01.
In addition, the fault injection module may inject faults into business messages that have specific attributes, so that reliability testing can be targeted. In this case, the business messages are screened, and faults are injected into the messages that pass the screening. In one possible implementation, the screening function is implemented by a filtering unit C03. That is, as shown in FIG. 1, the multiple functional units may further include a filtering unit C03, which is configured to filter business messages according to a filter condition, send business messages that meet the filter condition to the fault injection unit C02, and send business messages that do not meet the filter condition to the second application. Injecting faults only into business messages with specific attributes avoids introducing irrelevant tests during testing and achieves precise fault injection.
Further, as shown in FIG. 1, the multiple functional units may also include a second monitoring unit C04. In this case, the fault injection unit C02 may send the business message with the fault injected to the second monitoring unit C04, which sends it to the second application. The filtering unit C03 may send business messages that do not meet the filter condition to the second monitoring unit C04, bypassing the fault injection unit C02, so that the second monitoring unit C04 sends them directly to the second application.
In one possible implementation, to ensure that the business message with the fault injected can be sent to the second application, as shown in FIG. 1, the fault injection module may further include an address conversion unit C05, which is configured to modify the destination address of the business message with the fault injected to the address of the second application. For example, after the fault injection unit C02 sends the business message with the fault injected to the second monitoring unit C04, the address conversion unit C05 may modify the destination address of that business message to the address of the second application, so that the second monitoring unit C04 sends the message to the second application according to the modified destination address. Alternatively, after the filtering unit C03 sends a business message that does not meet the filter condition to the second monitoring unit C04, the address conversion unit C05 may modify the destination address of that business message to the address of the second application, so that the second monitoring unit C04 sends the message to the second application according to the modified destination address.
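The path of a business message through units C01 to C05 can be sketched as one function. The `Message` type, the callables `matches` and `inject`, and the assumption that injection always returns a message to forward (so a packet loss fault that discards the whole message is not modelled) are simplifications for illustration only.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple

Address = Tuple[str, int]  # (host, port); a hypothetical address representation


@dataclass(frozen=True)
class Message:
    destination: Address
    payload: bytes


def process(msg: Message,
            matches: Callable[[Message], bool],
            inject: Callable[[Message], Message],
            second_app: Address) -> Message:
    """Route one message through the fault injection module.

    C01 receives the message, C03 filters it, C02 injects the fault into
    matching messages only, C05 restores the destination to the second
    application, and C04 sends the result.
    """
    if matches(msg):          # C03: message meets the filter condition
        msg = inject(msg)     # C02: perform the fault injection operation
    # Non-matching messages bypass C02 unchanged.
    return replace(msg, destination=second_app)  # C05, then sent by C04
```

Both branches end with the same destination rewrite, which reflects that the address conversion unit handles injected and bypassed messages alike.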
It should be noted that the function of the fault injection module may be implemented in whole or in part by software, hardware, or a combination of software and hardware. When software is used, the function of the fault injection module may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are implemented in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, over coaxial cable, optical fiber, or a digital subscriber line (DSL)) or a wireless manner (for example, by infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
When the function of the fault injection module is implemented in software, the application scenarios involved in the fault injection method provided by the embodiments of the present application may be deployed in several ways. The following deployments are described as examples.
As shown in FIG. 2 and FIG. 3, the application scenario includes a first server 110 and a second server 120. Each of the first server 110 and the second server 120 may be a single server, a server cluster composed of several servers, or a cloud computing service center. The first server 110 and the second server 120 may be connected through a wired network or a wireless network.
As also shown in FIG. 2 and FIG. 3, the first server 110 includes a first processor 1101, a first communication interface 1102, and a first memory 1103, which are connected to one another through a first bus 1104. The second server 120 includes a second processor 1201, a second communication interface 1202, and a second memory 1203, which are connected to one another through a second bus 1204. The first memory 1103 and the second memory 1203 are both used to store computer programs, and a computer program may be an application. When the first processor 1101 calls an application in the first memory 1103, it can implement the function of that application. When the second processor 1201 calls an application in the second memory 1203, it can implement the function of that application.
In the application scenario shown in FIG. 2, the first application, the address conversion module, and the fault injection module may run in the first server 110, and the second application may run in the second server 120. In this case, as shown in FIG. 2, the first memory 1103 may store a first application 1103a, an address conversion module 1103b, and a fault injection module 1103c, and the second memory 1203 may store a second application 1203a.
In the application scenario shown in FIG. 3, the first application and the address conversion module may run in the first server 110, and the second application and the fault injection module may run in the second server 120. In this case, as shown in FIG. 3, the first memory 1103 may store a first application 1103a and an address conversion module 1103b, and the second memory 1203 may store a second application 1203a and a fault injection module 1203b.
FIG. 4 is a schematic diagram of an application scenario involved in another fault injection method provided by an embodiment of the present application. As shown in FIG. 4, the application scenario includes a first server 110, a second server 120, and a third server 130. Each of the first server 110, the second server 120, and the third server 130 may be a single server, a server cluster composed of several servers, or a cloud computing service center. The first server 110 and the second server 120, the first server 110 and the third server 130, and the second server 120 and the third server 130 may each be connected through a wired network or a wireless network.
As also shown in FIG. 4, the first server 110 includes a first processor 1101, a first communication interface 1102, and a first memory 1103, which are connected to one another through a first bus 1104. The second server 120 includes a second processor 1201, a second communication interface 1202, and a second memory 1203, which are connected to one another through a second bus 1204. The third server 130 includes a third processor 1301, a third communication interface 1302, and a third memory 1303, which are connected to one another through a third bus 1304. The first memory 1103, the second memory 1203, and the third memory 1303 are all used to store computer programs, and a computer program may be an application. When the processor of a server calls an application in the memory of that server, it can implement the function of that application.
In the application scenario shown in FIG. 4, the first application and the address conversion module may run in the first server 110, the second application may run in the second server 120, and the fault injection module may run in the third server 130. In this case, as shown in FIG. 4, the first memory 1103 may store a first application 1103a and an address conversion module 1103b, the second memory 1203 stores a second application 1203a, and the third memory 1303 stores a fault injection module 1303a.
When the function of the fault injection module is implemented in hardware, the application scenario involved in the fault injection method provided by the embodiments of the present application may include a first server, a second server, and a fault injection module. Each of the first server and the second server may be a single server, a server cluster composed of several servers, or a cloud computing service center. The first server and the second server, the first server and the fault injection module, and the fault injection module and the second server may each be connected through a wired network or a wireless network. In this application scenario, for the deployment of the first application in the first server and of the second application in the second server, refer to the deployment shown in FIG. 3; details are not repeated here.
It should be noted that when the first application and the address conversion module are both deployed in the first server 110, the address conversion module may be integrated into the first application. For example, the address conversion module may be a small program within the first application. When a fault injection request is received, the address conversion module may obtain the business message generated by the first application for accessing the second application and modify the destination address of the business message to the address of the fault injection module, so that the fault injection module can obtain the business message.
Alternatively, when the first application and the address conversion module are both deployed in the first server 110, they may be deployed independently of each other in the first server 110. For example, FIG. 2 to FIG. 4 each illustrate the first application and the address conversion module deployed independently. When the address conversion module is deployed independently of the first application, then from the perspective of the first application, after it generates a business message for accessing the second application, the process by which it sends the business message is the same as when no reliability test is performed. The first application therefore does not need to be changed for testing, which solves the problem in the related art that a large amount of adaptation work is required before testing, and achieves imperceptible fault injection.
In FIGS. 2 to 4, any of the first bus 1104, the second bus 1204, and the third bus 1304 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, each bus is drawn as a single thick line in FIGS. 2 to 4, but this does not mean that there is only one bus or only one type of bus.
In FIGS. 2 to 4, any of the first processor 1101, the second processor 1201, and the third processor 1301 may be a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. Alternatively, the processor may be a general-purpose processor, for example, a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
In FIGS. 2 to 4, any of the first memory 1103, the second memory 1203, and the third memory 1303 may include a volatile memory, such as a random-access memory (RAM); may include a non-volatile memory, such as a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or may include a combination of the foregoing types of memory.
The following describes the implementation of the fault injection method provided in the embodiments of this application, taking as an example the deployment shown in FIG. 2, in which the functional units of the fault injection module include a first listening unit, a fault injection unit, a filtering unit, a second listening unit, and an address conversion unit. When the first application, the second application, the address conversion module, and the fault injection module are deployed in other ways, refer correspondingly to the implementation of steps 501 to 507 below. As shown in FIG. 5, the method includes:
Step 501: The first listening unit receives a fault injection request and, based on the fault injection request, sends an address conversion request to the address conversion module.
The address conversion request is used to request that the destination address of the service message generated by the first application for accessing the second application be changed to the address of the first listening unit. The fault injection request is used to request that a fault be injected while the first application sends a service message to the second application. The fault injection request may carry fault attribute parameters, which indicate the fault to be injected. In that case, after receiving the fault injection request, the first listening unit also needs to send the fault attribute parameters to the fault injection unit. Alternatively, the fault attribute parameters may be stored in the fault injection unit in advance; this is not specifically limited in the embodiments of this application.
In addition, the fault injection request is sent to the fault injection module and may be received by any functional unit in the fault injection module. For example, it may be received by the first listening unit, the filtering unit, the fault injection unit, or the second listening unit. FIG. 5 shows an example in which the first listening unit receives the fault injection request.
When the reliability of a system that implements an application needs to be analyzed, a fault injection request may be triggered on a management client to request that a fault be injected into the system under analysis. For example, a tester may submit a fault injection task in the management client to trigger the fault injection request. Alternatively, the fault injection request may be triggered by calling a software development kit (SDK). A fault injection task may be submitted either in the user interface of an application on the management client or in the user interface of a browser on the management client.
The fault attribute parameters may be entered in the user interface of the management client. Optionally, the fault attribute parameters may include the fault identifier of the fault to be injected. For example, the first server may store program instructions for injecting different faults. After receiving a fault injection request that carries a fault identifier, the first server may invoke the program instructions for the fault indicated by that identifier and run them to inject the indicated fault.
In one possible implementation, the fault attribute parameters may further include one or more of the following: the probability of occurrence of the fault and the duration of the fault. Setting the duration of the fault effectively controls how long the fault is injected, so that the faulty state of the depended-on end (of the first application and the second application) can be simulated within a specified time period. When the fault attribute parameters include the probability of occurrence, injecting the fault according to these parameters causes the fault to occur randomly at that probability. This simulates a depended-on end in a sub-healthy state and enables a reliability test of the dependent end under that condition. Here, a depended-on end in a sub-healthy state is one whose service fails probabilistically. For example, assume that the depended-on end is the second application and the injected fault is a packet loss fault. If the probability of occurrence is 100% (that is, the packet loss rate is 100%), this simulates a fault state in which every data packet is discarded on receipt. If the probability of occurrence is 50% (that is, the packet loss rate is 50%), this simulates a sub-healthy state in which the second application randomly discards received data packets with a probability of 50%.
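The following minimal Python sketch illustrates how a probability of occurrence and a duration could jointly control a packet loss fault. The class name `ProbabilisticDropFault` and its parameters are hypothetical and are not taken from this application; the sketch only assumes that each message is dropped independently at the configured probability while the fault is active.

```python
import random
import time


class ProbabilisticDropFault:
    """Hypothetical helper: drops messages at drop_rate until the fault expires."""

    def __init__(self, drop_rate, duration_s, rng=None, clock=time.monotonic):
        if not 0.0 <= drop_rate <= 1.0:
            raise ValueError("drop_rate must be between 0 and 1")
        self.drop_rate = drop_rate
        self.clock = clock
        # The fault is active from creation until duration_s seconds later.
        self.expires_at = clock() + duration_s
        self.rng = rng if rng is not None else random.Random()

    def is_active(self):
        return self.clock() < self.expires_at

    def should_drop(self):
        # After the duration has elapsed, messages pass through untouched.
        if not self.is_active():
            return False
        # random() lies in [0.0, 1.0): a rate of 1.0 always drops
        # and a rate of 0.0 never drops.
        return self.rng.random() < self.drop_rate


# 100% loss simulates a failed depended-on end; 50% simulates a sub-healthy one.
failed = ProbabilisticDropFault(drop_rate=1.0, duration_s=3600)
sub_healthy = ProbabilisticDropFault(drop_rate=0.5, duration_s=3600)
```

Passing the clock and the random generator as arguments is a design choice of this sketch that keeps the behavior reproducible in tests.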
In another possible implementation, faults may be injected in a targeted manner. In this case, the fault attribute parameters may further include one or more of the following: the port number and IP address at which the fault is injected, and a filter condition for filtering service messages. When a fault is injected into a service message, the port number and IP address at which the fault is injected are those of the port that the second application uses to receive service messages. Injecting the fault according to these parameters simulates a failure of that port. Setting a filter condition for service messages allows faults to be injected only into service messages that have specific attributes, so that faults can be injected into service requests in a targeted manner. This avoids introducing irrelevant tests into the test process and avoids affecting other services that do not need to be tested, thereby achieving precise fault injection. It should be noted that when the fault injection request carries a filter condition, the first listening unit, after receiving the fault injection request, also needs to send the filter condition to the filtering unit.
For example, FIG. 6 is a schematic diagram of a user interface of an application on a management client according to an embodiment of this application. A tester may enter fault attribute parameters in this user interface and, after entering them, click the "Inject now" button to trigger a fault injection request. The fault attribute parameters to be entered in the user interface shown in FIG. 6 include: the environment name of the test environment; the name of the environment node on which the test environment is deployed; the Internet Protocol (IP) address of the server under test that runs the first application; the network data exchange rule of the test environment (protocol); the IP address and port number of the server on which the service depends (server_address); the type of the injected fault (drop_type, also called the fault identifier of the injected fault); the probability of occurrence of the fault (drop_rate); the duration of the fault (timeout); and the filter condition for filtering service messages (filter_keyword_content).
FIG. 6 shows the user interface when the injected fault is a packet loss fault. As shown in FIG. 6, the environment name for the packet loss fault is ECS_cui; the name of the environment node is Apigateway_001; the IP address of the server under test is 172.168.200.41; the network data exchange rule is hypertext transfer protocol secure (HTTPS); the IP address and port number of the server on which the service depends are 172.168.200.42:8080; the fault identifier of the injected fault is req_drop (which indicates that the injected fault is a packet loss fault); the probability of occurrence of the fault (that is, the packet loss rate) is 100%; the duration of the fault is 3600 seconds; and the filter condition for filtering service messages is all, which means that every service message passes the filter, that is, the fault is injected into all service messages.
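The parameters shown in FIG. 6 can be pictured as a single record. The sketch below is illustrative only: the field names `protocol`, `server_address`, `drop_type`, `drop_rate`, and `filter_keyword_content` follow the labels quoted in the text, whereas `environment`, `node`, `target_ip`, `timeout_s`, and the class name `FaultAttributes` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultAttributes:
    environment: str             # environment name (hypothetical field name)
    node: str                    # environment node name (hypothetical field name)
    target_ip: str               # IP of the server under test (hypothetical field name)
    protocol: str                # network data exchange rule
    server_address: str          # "ip:port" of the server the service depends on
    drop_type: str               # fault identifier
    drop_rate: float             # probability of occurrence, 0.0 to 1.0
    timeout_s: int               # duration of the fault in seconds
    filter_keyword_content: str = "all"

    def server_host_port(self):
        # Split "ip:port" at the last colon and convert the port to an integer.
        host, _, port = self.server_address.rpartition(":")
        return host, int(port)


# Values taken from the packet loss example of FIG. 6.
FIGURE_6_EXAMPLE = FaultAttributes(
    environment="ECS_cui",
    node="Apigateway_001",
    target_ip="172.168.200.41",
    protocol="https",
    server_address="172.168.200.42:8080",
    drop_type="req_drop",
    drop_rate=1.0,
    timeout_s=3600,
    filter_keyword_content="all",
)
```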
Step 502: The first application generates a service message for accessing the second application.
When a user needs the first application to provide a service (for example, a cloud service), the user may send a service request instruction to the first application through a client. After receiving the service request instruction, the first application may generate a corresponding service message (in this case also called a service request). Because the functions of the first application depend on the functions of the second application, the first application may generate a service request for accessing the second application, so as to request that the second application provide the corresponding service to the first application, which in turn allows the first application to provide the client with the service indicated by the service request instruction. Alternatively, when a user needs the second application to provide a service (for example, a cloud service), the user may send a service request instruction to the second application through a client. After receiving the service request instruction, the second application may generate a corresponding service request. After the second application sends the service request to the first application, the first application may generate, based on the service request, a service message for accessing the second application (in this case also called a service response), so that the second application can provide the client with the service indicated by the service request instruction according to the service response.
Step 503: The address conversion module obtains the service message, changes the destination address of the service message to the address of the first listening unit based on the address conversion request, and sends the modified service message.
After receiving the address conversion request, the address conversion module may monitor whether the first application has generated a service message for accessing the second application. When it detects such a service message, it obtains the service message, changes the destination address of the service message to the address of the first listening unit, and sends the service message with the modified destination address to the first listening unit.
For example, assume that the IP address and port number of the first application are 172.168.200.41:4000, those of the first listening unit are 172.168.200.41:5000, those of the second listening unit are 1.2.3.4:8080, and those of the second application are 172.168.200.42:8080, and that the service request generated by the first application is http://172.168.200.42:8080/v1/xxxx. When no fault injection is required, the sender address of the service message generated by the first application for accessing the second application is 172.168.200.41:4000 and its destination address is 172.168.200.42:8080. After the first application sends the service message, devices such as routers in the network deliver it to the second application according to its destination address.
When fault injection is required, the address conversion module may change the destination address of the service message to the address of the first listening unit, that is, from 172.168.200.42:8080 to 172.168.200.41:5000. After the address conversion module sends the service message with the modified destination address, devices such as gateways in the network deliver the service message to the first listening unit according to the modified destination address. By modifying the destination address of the service message in this way, the fault injection module can intercept a service message that was originally intended for the second application, inject a fault into it, and then send the service message with the injected fault to the second application.
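The destination rewrite described above can be sketched in a few lines of Python. This is a simplified model that treats a service message as a dictionary; the function name `rewrite_destination` and the dictionary keys are hypothetical, and only the addresses come from the example in the text.

```python
def rewrite_destination(message, redirect_table):
    """Return a copy of message, redirected if its destination has a rule."""
    original = message["dst"]
    new_dst = redirect_table.get(original, original)
    # Copy instead of mutating, so the message built by the application is untouched.
    return {**message, "dst": new_dst}


# Second application address -> first listening unit address (from the example).
REDIRECTS = {"172.168.200.42:8080": "172.168.200.41:5000"}

message = {
    "src": "172.168.200.41:4000",
    "dst": "172.168.200.42:8080",
    "url": "http://172.168.200.42:8080/v1/xxxx",
}
redirected = rewrite_destination(message, REDIRECTS)
```

With an empty redirect table the message keeps its original destination, which corresponds to the case in which no fault injection is requested.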
The destination address of the service message may be modified by using an iptables command (a type of program instruction). For example, the command for changing the destination address of the service message from 172.168.200.42:8080 to 172.168.200.41:5000 may be: iptables -t nat -A OUTPUT -d 172.168.200.42 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 172.168.200.41:5000.
It should be noted that when fault injection needs to be stopped, the address conversion module may be controlled to stop working, so that the first application sends service messages for accessing the second application directly. The first application and the second application then work in their original manner, and the service returns to normal.
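As an illustration of how the iptables rule of step 503 could be assembled programmatically, the sketch below builds the argument list for the DNAT rule and for its removal. The function name `build_dnat_rule` is hypothetical, and removing the rule with `-D` is only one assumed way of stopping the redirection; the sketch does not execute the command, which would require root privileges.

```python
def build_dnat_rule(original_ip, original_port, new_ip, new_port, delete=False):
    """Build the iptables argument list that redirects locally generated TCP traffic."""
    # -A appends the rule to the chain; -D deletes a rule with the same specification.
    action = "-D" if delete else "-A"
    return [
        "iptables", "-t", "nat", action, "OUTPUT",
        "-d", original_ip, "-p", "tcp", "-m", "tcp",
        "--dport", str(original_port),
        "-j", "DNAT", "--to-destination", f"{new_ip}:{new_port}",
    ]


# Redirect traffic for the second application to the first listening unit.
start_cmd = build_dnat_rule("172.168.200.42", 8080, "172.168.200.41", 5000)
# Remove the same rule so that the applications communicate directly again.
stop_cmd = build_dnat_rule("172.168.200.42", 8080, "172.168.200.41", 5000, delete=True)
```

The resulting lists could be passed to `subprocess.run` by a privileged process.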
Step 504: The first listening unit receives the service message with the modified destination address and sends the modified service message to the filtering unit.
Step 505: The filtering unit filters the service message according to the filter condition and, when the service message meets the filter condition, sends the service message to the fault injection unit.
A reliability test may be carried out in a targeted manner. In this case, a fault may be injected into service messages that have specific attributes, and the reliability test is then completed based on the service messages with the injected fault. Injecting a fault into service messages with specific attributes may be implemented as follows: the filtering unit filters the service messages obtained by the fault injection module according to the filter condition and, when a service message meets the filter condition, sends it to the fault injection unit, which injects the fault into it. This avoids introducing irrelevant tests into the test process and achieves precise fault injection.
The filter condition may be set according to actual needs. For example, when the service message is a service request, the filter condition may involve one or more of the following: the request type, the requested address, a request header keyword, and a request body keyword. When the service message is a response to a service request, the filter condition may involve one or more of the following: the response status code, a response header keyword, and a response body keyword.
For example, the request types may include: a request that submits changed information to the receiving end (a POST request), a request that queries data (a GET request), a request for the headers of a page (a HEAD request), a request that allows the client to query the capabilities of the server (an OPTIONS request), and a request in which data sent from the client to the server replaces the content of a specified document (a PUT request). The requested address may be expressed as a uniform resource identifier (URI); for example, the requested address may be /v1/createVM. A request header keyword may be the type of the request content (Content-Type). A request body keyword may be success. A response header keyword may be the type of the response content (Content-Type). A response body keyword may be success. The response status code may be 200, 404, 500, or the like, where 200 indicates that the request succeeded, 404 indicates that the requested resource does not exist, and 500 indicates that an unexpected error occurred on the server.
It should be noted that in the application scenario shown in FIG. 2, if the filtering unit determines that the service message does not meet the filter condition, it may send the service message to the second listening unit, which forwards it to the second application. In the application scenarios shown in FIG. 3 or FIG. 4, when no second listening unit is provided, if the filtering unit determines that the service message does not meet the filter condition, it may send the service message directly to the second application.
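The filtering of step 505 and the routing of messages that do not match can be sketched as follows. The class names `ServiceRequest` and `RequestFilter`, the function `next_hop`, and the returned labels are hypothetical; the sketch assumes that an empty field in the filter means "match anything", which is one possible reading of the filter condition all.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRequest:
    method: str
    uri: str
    headers: dict = field(default_factory=dict)
    body: str = ""


@dataclass
class RequestFilter:
    methods: frozenset = frozenset()  # empty means any request type
    uri_prefix: str = ""              # empty means any requested address
    header_keyword: str = ""          # empty means no header constraint
    body_keyword: str = ""            # empty means no body constraint

    def matches(self, req):
        if self.methods and req.method.upper() not in self.methods:
            return False
        if not req.uri.startswith(self.uri_prefix):
            return False
        if self.header_keyword:
            lines = (f"{k}: {v}".lower() for k, v in req.headers.items())
            if not any(self.header_keyword.lower() in line for line in lines):
                return False
        if self.body_keyword and self.body_keyword not in req.body:
            return False
        return True


def next_hop(req, request_filter, has_second_listener=True):
    """Decide where the filtering unit forwards a service request."""
    if request_filter.matches(req):
        return "fault_injection_unit"
    # Non-matching messages bypass the fault injection unit.
    return "second_listening_unit" if has_second_listener else "second_application"


create_vm_only = RequestFilter(methods=frozenset({"POST"}), uri_prefix="/v1/createVM")
```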
Step 506: The fault injection unit performs, on the service message, the fault injection operation indicated by the fault attribute parameters, and sends the service message on which the fault injection operation has been performed to the second listening unit.
The fault injected into a service request may be a network fault, for example, a packet loss fault, a delay fault, or a corrupted-packet fault. A packet loss fault is a situation in which data packets carrying the service message are discarded while the service message is being sent. That is, when the fault injection unit injects a packet loss fault into a service request that consists of a single data packet, it discards that packet according to the fault attribute parameters; when the service request consists of multiple data packets, it discards all or some of those packets according to the fault attribute parameters. A delay fault is a situation in which data packets carrying the service message are forwarded late while the service message is being sent. That is, when the fault injection unit injects a delay fault into a service request, it sends the data packets of the service message after waiting for the time specified by the fault attribute parameters. A corrupted-packet fault is a situation in which the content of data packets carrying the service message is modified while the service message is being sent, for example, one or more of the message header, the message body, and the status code in a data packet are modified. That is, when the fault injection unit injects a corrupted-packet fault into a service request, it modifies the content of the service message according to the fault attribute parameters.
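The three network faults can be sketched as small Python functions that operate on a list of packets, each modeled here as a dictionary. The function names and the packet layout are hypothetical and serve only to make the definitions above concrete.

```python
import random
import time


def inject_drop(packets, drop_rate, rng):
    """Packet loss: discard each packet independently with probability drop_rate."""
    # random() lies in [0.0, 1.0), so a rate of 1.0 discards every packet.
    return [p for p in packets if rng.random() >= drop_rate]


def inject_delay(packets, delay_s, sleep=time.sleep):
    """Delay: hold the packets for delay_s seconds, then forward them unchanged."""
    sleep(delay_s)
    return list(packets)


def inject_corruption(packets, **overrides):
    """Corrupted packet: overwrite selected fields such as header, body, or status."""
    # New dictionaries are built so that the original packets stay unmodified.
    return [{**p, **overrides} for p in packets]


packets = [
    {"header": {"Content-Type": "application/json"}, "body": "success", "status": 200},
]
rng = random.Random(0)
all_lost = inject_drop(packets, 1.0, rng)          # every packet is discarded
server_error = inject_corruption(packets, status=500)
```

Accepting the sleep function as a parameter is a choice of this sketch that allows the delay to be tested without actually waiting.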
The injected fault may also be of another type, such as a system resource fault, a node fault, a database fault, or a container fault. For example, system resource faults may include: a central processing unit (CPU) stress (load increase) fault, a memory leak fault, a hard disk fault, a network fault, an abnormal process exit fault, a file exception fault, a file system fault, and a system management fault. Node faults may include an abnormal node shutdown fault and an abnormal node restart fault. In the embodiments of this application, the injected fault is not limited to the faults described above and may be another fault; that is, the set of injectable faults is extensible. Program instructions for injecting a fault may be written in advance and stored in a storage medium. When a fault needs to be injected, the fault to be injected is determined by using the fault injection method provided in the embodiments of this application, and the program instructions for injecting that fault are executed to inject it.
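One way to picture this extensibility is a registry that maps each fault identifier to pre-written injection code. The sketch below is a minimal illustration, not the implementation of this application: only the identifier `req_drop` appears in the text, while `resp_status`, the decorator `register_fault`, and the handler signatures are hypothetical.

```python
FAULT_HANDLERS = {}


def register_fault(fault_id):
    """Decorator that stores a handler under its fault identifier."""
    def decorator(func):
        FAULT_HANDLERS[fault_id] = func
        return func
    return decorator


@register_fault("req_drop")
def drop_request(message, params):
    # Packet loss: the message is discarded, so nothing is forwarded.
    return None


@register_fault("resp_status")  # hypothetical identifier for a corrupted status code
def override_status(message, params):
    return {**message, "status": params["status"]}


def inject(fault_id, message, params=None):
    """Look up the handler for fault_id and apply it to the message."""
    try:
        handler = FAULT_HANDLERS[fault_id]
    except KeyError:
        raise ValueError(f"unknown fault identifier: {fault_id}") from None
    return handler(message, params or {})
```

A new fault type is supported by registering one more handler, without changing `inject` itself.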
It should be noted that when the fault injection module includes a second listening unit, the fault injection unit, after injecting the fault into the service message, may send the service message with the injected fault to the second listening unit, which forwards it to the second application. When the fault injection module does not include a second listening unit, the fault injection unit, after injecting the fault into the service message, may send the service message with the injected fault directly to the second application; that is, step 507 may be skipped.
In addition, before the second listening unit sends the service message with the injected fault to the second application, the address conversion unit in the fault injection module may change the destination address of the service message to be sent by the second listening unit to the address of the second application. For example, continuing the example in step 503, the address conversion unit may change the destination address of the service message from 1.2.3.4:8080 to 172.168.200.42:8080. The command for this change may be: iptables -t nat -A OUTPUT -d 1.2.3.4 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 172.168.200.42:8080.
It should also be noted that when the fault attribute parameters are set, the receiving end of the service message with the injected fault may be set to the second application. The functional unit that sends the service message with the injected fault to the second application can then determine the receiving end from the fault attribute parameters. Therefore, the fault injection module may also omit the address conversion unit.
It should also be noted that when the fault injected into the service message is a packet loss fault with a packet loss rate of 100%, the service message does not need to be sent to the second application.
Step 507: The second listening unit sends the service message with the injected fault to the second application, so that the second application processes the service message with the injected fault.
When the service message is a service request, the second application, after receiving the service request with the injected fault, provides a service according to it and sends a service response to the first application based on the result. The first application reacts after receiving the service response. This is equivalent to simulating a failure of the second application. By analyzing how the first application reacts after receiving the service response, the reliability of the first application when the second application fails can be analyzed.
When the service message is a service response, the second application reacts after receiving the service response with the injected fault. This is equivalent to simulating a failure of the first application. By analyzing how the second application reacts after receiving the service response, the reliability of the second application when the first application fails can be analyzed.
In summary, in the fault injection method provided by the embodiments of this application, after the first application generates a service message for accessing the second application, the fault injection module obtains the service message, performs on it the fault injection operation indicated by the fault attribute parameter, and sends the resulting service message to the second application. Faults can thus be injected into a service message while it is being sent. Compared with the global faults of the related art, this refines the granularity of fault injection and effectively improves the precision of reliability tests based on the injected faults.
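The obtain, inject, and forward flow summarized above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of this application: the `FaultAttributes` fields, the operation names `"delay"`, `"corrupt"`, and `"drop"`, and the `send_to_second_app` callback are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FaultAttributes:
    """Hypothetical fault attribute parameter; the field names are assumptions."""
    operation: str               # "delay", "corrupt", or "drop"
    delay_seconds: float = 0.0   # used by "delay"
    replacement: bytes = b""     # used by "corrupt"


def inject_fault(message: bytes, attrs: FaultAttributes) -> Optional[bytes]:
    """Apply the indicated operation; return None when nothing should be sent."""
    if attrs.operation == "delay":
        time.sleep(attrs.delay_seconds)  # hold the message, then pass it on unchanged
        return message
    if attrs.operation == "corrupt":
        return attrs.replacement         # stand-in for an error packet modification
    if attrs.operation == "drop":
        return None                      # full loss: nothing reaches the receiver
    raise ValueError(f"unknown fault operation: {attrs.operation!r}")


def handle_message(message: bytes, attrs: FaultAttributes,
                   send_to_second_app: Callable[[bytes], None]) -> None:
    """Take a message from the first application, inject the fault, forward it."""
    faulty = inject_fault(message, attrs)
    if faulty is not None:
        send_to_second_app(faulty)
```

Because the fault is applied per message inside `handle_message`, neither application needs to change, which is the point of injecting on the sending path.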
Moreover, because the fault is injected by intercepting the service message while it is being sent, the object into which the fault is injected requires no adaptation, and the fault can be injected without that object being aware of it.
In addition, service messages are filtered using a filter condition, and a fault is injected into a service message only when the message meets the filter condition. Faults can therefore be injected into service messages in a targeted manner, which avoids introducing irrelevant tests during testing and avoids affecting other services that do not need to be tested, achieving precise fault injection.
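A filter of this kind might look like the sketch below for service requests. It uses the four request criteria listed in the claims (request type, requested address, request header keyword, request body keyword). The `Request` and `RequestFilter` classes, their field names, and the prefix match on the address are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Request:
    method: str
    url: str
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


@dataclass
class RequestFilter:
    """Hypothetical filter condition; a criterion left as None is not checked."""
    method: Optional[str] = None
    url_prefix: Optional[str] = None
    header_keyword: Optional[str] = None
    body_keyword: Optional[str] = None


def matches(req: Request, flt: RequestFilter) -> bool:
    """Return True only if the request satisfies every criterion that is set."""
    if flt.method is not None and req.method.upper() != flt.method.upper():
        return False
    if flt.url_prefix is not None and not req.url.startswith(flt.url_prefix):
        return False
    if flt.header_keyword is not None and not any(
        flt.header_keyword in f"{name}: {value}" for name, value in req.headers.items()
    ):
        return False
    if flt.body_keyword is not None and flt.body_keyword not in req.body:
        return False
    return True
```

Requests for which `matches` returns `False` would be forwarded untouched, so services outside the test scope are unaffected.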
The order of the steps of the fault injection method provided by the embodiments of this application may be adjusted as appropriate, and steps may be added or removed as circumstances require. For example, whether to perform step 507 may be decided case by case. Any variation that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application, and is therefore not described further.
The embodiments of this application further provide a fault injection device, which may be deployed on a server or a computer device. The fault injection device may include the fault injection module provided by the embodiments of this application. For example, the fault injection device may include a first monitoring unit and a fault injection unit. The first monitoring unit is configured to obtain a service message generated by a first application, the service message being used to access a second application. The fault injection unit is configured to obtain the service message from the first monitoring unit, perform on the service message the fault injection operation indicated by the fault attribute parameter, and send the service message on which the fault injection operation has been performed to the second application.
Optionally, the fault injection device may further include a filtering unit. The filtering unit is configured to receive the service message sent by the first monitoring unit, filter the service message according to a filter condition, and send a service message that meets the filter condition to the fault injection unit.
Optionally, the fault injection device may further include a second monitoring unit. In this case, the fault injection unit is specifically configured to send the service message on which the fault injection operation has been performed to the second monitoring unit, and the second monitoring unit is configured to send that service message to the second application.
Optionally, the fault injection device may further include an address conversion unit. The address conversion unit is configured to modify the destination address of the service message to be sent by the second monitoring unit to the address of the second application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of each unit in the fault injection device may be found in the description of the corresponding unit in the foregoing system embodiment, and is not repeated here.
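The two address rewrites involved (towards the fault injection module before injection, and back to the second application afterwards) can be sketched as below. The `Message` class, the `redirect` helper, and both addresses are hypothetical and serve only to show the round trip.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    destination: str   # "host:port" of the intended receiver
    payload: bytes


def redirect(message: Message, new_destination: str) -> Message:
    """Return a copy of the message addressed to new_destination."""
    return replace(message, destination=new_destination)


# Made-up addresses, for illustration only.
INJECTOR_ADDR = "10.0.0.9:9000"
SECOND_APP_ADDR = "10.0.0.2:8080"

original = Message(SECOND_APP_ADDR, b"GET /orders")
to_injector = redirect(original, INJECTOR_ADDR)          # steer to the fault injection module
to_second_app = redirect(to_injector, SECOND_APP_ADDR)   # restore the real destination
```

Only the destination changes at each hop; the payload is left for the fault injection unit to modify.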
The embodiments of this application further provide a storage medium. The storage medium is a non-volatile computer-readable storage medium, and when instructions in the storage medium are executed by a processor, the functions implemented by the fault injection module or the address conversion module in the embodiments of this application are implemented.
The embodiments of this application further provide a computer program product containing instructions. When the computer program product runs on a computer, it causes the computer to perform the functions implemented by the fault injection module or the address conversion module in the embodiments of this application.
A person of ordinary skill in the art will understand that all or some of the steps of the foregoing embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the embodiments of this application, the terms "first", "second", and "third" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance. The term "at least one" means one or more, and the term "a plurality of" means two or more, unless expressly defined otherwise.
The term "and/or" in this application merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" in this document generally indicates an "or" relationship between the objects before and after it.
The foregoing are merely optional embodiments of this application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the concept and principles of this application shall fall within the protection scope of this application.

Claims (18)

  1. A fault injection method, wherein the fault injection method is applied to a service system in which a first application, a second application, and a fault injection module run, the method comprising:
    the first application generates a service message, the service message being used to access the second application;
    the fault injection module obtains the service message and performs on the service message the fault injection operation indicated by a fault attribute parameter;
    the fault injection module sends the service message on which the fault injection operation has been performed to the second application.
  2. The method according to claim 1, wherein an address conversion module also runs in the service system, and before the fault injection module obtains the service message, the method further comprises:
    the address conversion module obtains the service message, modifies the destination address of the service message to the address of the fault injection module, and sends the modified service message.
  3. The method according to claim 1 or 2, wherein the fault injection module performing on the service message the fault injection operation indicated by the fault attribute parameter comprises:
    the fault injection module performs on the service message the fault injection operation indicated by the fault attribute parameter when the service message meets a filter condition.
  4. The method according to any one of claims 1 to 3, wherein when the service message is a service request, the filter condition involves one or more of the following: request type, requested address, request header keyword, and request body keyword;
    when the service message is a response to a service request, the filter condition involves one or more of the following: response status code, response header keyword, and response body keyword.
  5. The method according to any one of claims 1 to 4, wherein the service message comprises a plurality of data packets, and when the fault injection operation indicated by the fault attribute parameter is a packet loss operation, the fault injection module performing on the service message the fault injection operation indicated by the fault attribute parameter comprises:
    the fault injection module discards some or all of the plurality of data packets based on the fault attribute parameter.
  6. The method according to any one of claims 1 to 5, wherein when the fault injection operation indicated by the fault attribute parameter is a delay operation, the fault injection module performing on the service message the fault injection operation indicated by the fault attribute parameter comprises:
    the fault injection module, based on the fault attribute parameter, sends the service message after delaying for the duration specified by the fault attribute parameter.
  7. The method according to any one of claims 1 to 6, wherein when the fault injection operation indicated by the fault attribute parameter is an error packet operation, the fault injection module performing on the service message the fault injection operation indicated by the fault attribute parameter comprises:
    the fault injection module modifies the service message based on the fault attribute parameter.
  8. A service system, wherein the service system comprises: a first server, a second server, and a fault injection module;
    a first application running on the first server is configured to generate a service message, the service message being used to access a second application running on the second server;
    the fault injection module is configured to obtain the service message, perform on the service message the fault injection operation indicated by a fault attribute parameter, and send the service message on which the fault injection operation has been performed to the second application running on the second server.
  9. The system according to claim 8, wherein the service system further comprises: an address conversion module;
    the address conversion module is configured to obtain the service message, modify the destination address of the service message to the address of the fault injection module, and send the modified service message.
  10. The system according to claim 8 or 9, wherein
    the fault injection module is specifically configured to perform on the service message the fault injection operation indicated by the fault attribute parameter when the service message meets a filter condition.
  11. The system according to any one of claims 8 to 10, wherein when the service message is a service request, the filter condition involves one or more of the following: request type, requested address, request header keyword, and request body keyword;
    when the service message is a response to a service request, the filter condition involves one or more of the following: response status code, response header keyword, and response body keyword.
  12. The system according to any one of claims 8 to 11, wherein the service message comprises a plurality of data packets, and when the fault injection operation indicated by the fault attribute parameter is a packet loss operation, the fault injection module is specifically configured to discard some or all of the plurality of data packets based on the fault attribute parameter.
  13. The system according to any one of claims 8 to 12, wherein when the fault injection operation indicated by the fault attribute parameter is a delay operation, the fault injection module is specifically configured to, based on the fault attribute parameter, send the service message after delaying for the duration specified by the fault attribute parameter.
  14. The system according to any one of claims 8 to 13, wherein when the fault injection operation indicated by the fault attribute parameter is an error packet operation, the fault injection module is specifically configured to modify packets in the service message based on the fault attribute parameter.
  15. A fault injection device, wherein the device comprises: a first monitoring unit and a fault injection unit;
    the first monitoring unit is configured to obtain a service message generated by a first application, the service message being used to access a second application;
    the fault injection unit is configured to obtain the service message from the first monitoring unit, perform on the service message the fault injection operation indicated by a fault attribute parameter, and send the service message on which the fault injection operation has been performed to the second application.
  16. The device according to claim 15, wherein the device further comprises: a filtering unit;
    the filtering unit is configured to receive the service message sent by the first monitoring unit, filter the service message according to a filter condition, and send a service message that meets the filter condition to the fault injection unit.
  17. The device according to claim 15 or 16, wherein the device further comprises: a second monitoring unit;
    the fault injection unit is specifically configured to send the service message on which the fault injection operation has been performed to the second monitoring unit;
    the second monitoring unit is configured to send the service message on which the fault injection operation has been performed to the second application.
  18. The device according to any one of claims 15 to 17, wherein the device further comprises: an address conversion unit;
    the address conversion unit is configured to modify the destination address of the service message to be sent by the second monitoring unit to the address of the second application.
PCT/CN2020/110351 2019-08-20 2020-08-20 Fault injection method and device, and service system WO2021032175A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910769067.4A CN110674028A (en) 2019-08-20 2019-08-20 Fault injection method and device and business service system thereof
CN201910769067.4 2019-08-20

Publications (1)

Publication Number Publication Date
WO2021032175A1 true WO2021032175A1 (en) 2021-02-25

Family

ID=69076380

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/110351 WO2021032175A1 (en) 2019-08-20 2020-08-20 Fault injection method and device, and service system

Country Status (2)

Country Link
CN (1) CN110674028A (en)
WO (1) WO2021032175A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674028A (en) * 2019-08-20 2020-01-10 华为技术有限公司 Fault injection method and device and business service system thereof
CN111858381B (en) * 2020-07-31 2023-05-16 抖音视界有限公司 Application fault tolerance capability test method, electronic device and medium
CN112905434A (en) * 2021-03-22 2021-06-04 北京车和家信息技术有限公司 Fault drilling method, device, equipment, system and computer storage medium
CN113778834B (en) * 2021-11-10 2022-03-18 统信软件技术有限公司 System performance testing method and device of application software and computing equipment
CN115549966A (en) * 2022-08-25 2022-12-30 支付宝(杭州)信息技术有限公司 Security audit method and device for service request

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150193319A1 (en) * 2014-01-06 2015-07-09 Fujitsu Limited Method and a computing system allowing a method of injecting hardware faults into an executing application
CN106155883A (en) * 2015-03-30 2016-11-23 华为技术有限公司 A kind of virtual machine method for testing reliability and device
US20170242784A1 (en) * 2016-02-19 2017-08-24 International Business Machines Corporation Failure recovery testing framework for microservice-based applications
CN107368408A (en) * 2017-05-31 2017-11-21 中国船舶工业综合技术经济研究院 A kind of software fault towards interface injects automated testing method
CN109032825A (en) * 2018-06-06 2018-12-18 阿里巴巴集团控股有限公司 A kind of fault filling method, device and equipment
CN110674028A (en) * 2019-08-20 2020-01-10 华为技术有限公司 Fault injection method and device and business service system thereof

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060126799A1 (en) * 2004-12-15 2006-06-15 Microsoft Corporation Fault injection
CN104081346B (en) * 2012-02-07 2018-02-27 英特尔公司 For being interrupted using between tracking data Processing for removing device to support the method and apparatus of the address conversion in multiprocessor virtual machine environment
CN105335245B (en) * 2014-07-31 2019-02-01 华为技术有限公司 Failed storage method and apparatus, trouble shoot method and apparatus
CN104461865A (en) * 2014-11-04 2015-03-25 哈尔滨工业大学 Cloud environment distributed file system reliability test suite
CN104657244A (en) * 2015-02-10 2015-05-27 上海创景计算机系统有限公司 Embedded device CPU (Central Processing Unit) bus fault injection test system and test method
CN105446882B (en) * 2015-11-27 2017-11-07 合肥通用机械研究院 The method of testing of family expenses and similar applications electrical equipment software evaluation Black-box Testing system
US9965378B1 (en) * 2016-03-29 2018-05-08 Amazon Technologies, Inc. Mediated fault invocation service
CN108614764B (en) * 2016-12-12 2021-09-14 中国航空工业集团公司西安航空计算技术研究所 IMA application software fault injection method
CN108011743B (en) * 2017-07-28 2020-09-29 北京经纬恒润科技有限公司 Fault injection method and device
EP3438832B1 (en) * 2017-08-03 2020-10-07 Siemens Aktiengesellschaft A method for executing a program in a computer
CN108491317B (en) * 2018-02-06 2021-04-16 南京航空航天大学 SDC error detection method based on instruction vulnerability analysis
CN109597752B (en) * 2018-10-19 2022-11-04 中国船舶重工集团公司第七一六研究所 Fault propagation path simulation method based on complex network model

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116702146A (en) * 2023-08-07 2023-09-05 北京理想乡网络技术有限公司 Injection vulnerability scanning method and system of Web server
CN116702146B (en) * 2023-08-07 2024-03-22 天翼安全科技有限公司 Injection vulnerability scanning method and system of Web server

Also Published As

Publication number Publication date
CN110674028A (en) 2020-01-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20853849

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20853849

Country of ref document: EP

Kind code of ref document: A1