CN114500349B - Cloud platform chaos testing method and device - Google Patents

Info

Publication number
CN114500349B
Authority
CN
China
Prior art keywords
test
concurrent
round
concurrency
error rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111613738.1A
Other languages
Chinese (zh)
Other versions
CN114500349A (en)
Inventor
杨帆
刘磊
何玥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Cloud Technology Co Ltd
Original Assignee
Tianyi Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Cloud Technology Co Ltd
Priority to CN202111613738.1A
Publication of CN114500349A
Application granted
Publication of CN114500349B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/50: Testing arrangements
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)

Abstract

The invention discloses a cloud platform chaos testing method and device. The method comprises the following steps: after a fault is injected into the cloud platform to be tested, obtaining a test result of the k-th round of concurrent testing of the cloud platform to be tested and counting the actual error rate of the k-th round; estimating the predicted error rate of the (k+1)-th round from the actual error rate of the k-th round; judging whether the predicted error rate of the (k+1)-th round is greater than a preset threshold; when the predicted error rate is greater than the preset threshold, reducing the concurrent request quantity; when the predicted error rate is less than the preset threshold, increasing the concurrent request quantity; and when the predicted error rate of the (k+1)-th round is equal to the preset threshold, determining the concurrent request quantity of the k-th round as the critical value of the cloud platform to be tested under the fault. The method uses historical test results to adjust the concurrent request quantity of the current test, adapting the concurrent request quantity after fault injection, so that the performance critical value of the cloud platform under the injected fault is measured accurately and the performance degradation of the cloud platform after the fault is determined.

Description

Cloud platform chaos testing method and device
Technical Field
The invention relates to the technical field of testing, in particular to a cloud platform chaos testing method and device.
Background
In recent years, cloud computing has been an active research direction in the ICT field, and quality control of cloud platform performance has also received attention. Currently, mainstream open-source cloud platform software (such as OpenStack) generally adopts a microservice architecture. A microservice architecture splits monolithic software into multiple software services that have distinct functions and can be run and deployed independently, and it offers good scalability, easy deployment, and easy development. Adopting a microservice architecture helps reduce the cost of software development and combines well with the DevOps (development and operations) way of working, but it also introduces new challenges. Software with a microservice architecture is distributed, and replicas of microservices providing the same functionality are typically located on different hosts, virtual machines, or containers. Unexpected starts and stops of microservice replicas may reduce or even interrupt the ability of the software to provide services externally. For cloud service providers, both a sharp degradation of control plane performance and a failure to respond can result in significant economic loss.
Chaos experiments have been a research direction in the field of software testing in recent years. A chaos experiment is mainly used to observe whether a microservice software system can cope with faults when faults are injected at random. Executing the chaos experiment is an important part of automating it. Existing chaos experiment tools, such as ChaosBlade and Chaos Monkey, can simulate faults in resources such as the CPU and memory. However, the inventors found that if a constant number of concurrent requests is used when testing a cloud platform, one can only judge from the request success rate whether a fault affects the result; how much the fault degrades the performance of the cloud platform cannot be determined. An assessment of the performance impact can then be obtained only by repeating the experiment many times.
Disclosure of Invention
Therefore, the invention aims to solve the technical problem that the performance degradation condition of the cloud platform caused by faults cannot be determined in the prior art, and provides a method and a device for testing the chaos of the cloud platform.
In one aspect of the embodiments of the invention, a cloud platform chaos testing method is provided, comprising: after a fault is injected into the cloud platform to be tested, obtaining a test result of the k-th round of concurrent testing of the cloud platform to be tested and counting the actual error rate of the k-th round, where k = 1, 2, 3, …; estimating the predicted error rate of the (k+1)-th round of concurrent testing from the actual error rate of the k-th round; judging whether the predicted error rate of the (k+1)-th round is greater than a preset threshold; when the predicted error rate of the (k+1)-th round is greater than the preset threshold, reducing the concurrent request quantity on the basis of the k-th round to obtain the concurrent request quantity of the (k+1)-th round; when the predicted error rate of the (k+1)-th round is less than the preset threshold, increasing the concurrent request quantity on the basis of the k-th round to obtain the concurrent request quantity of the (k+1)-th round; and when the predicted error rate of the (k+1)-th round is equal to the preset threshold, determining the concurrent request quantity of the k-th round as the critical value of the cloud platform to be tested under the fault.
Optionally, estimating the predicted error rate of the (k+1)-th round of concurrent testing from the actual error rate of the k-th round comprises: obtaining the predicted error rate of the k-th round; and calculating the predicted error rate of the (k+1)-th round from a preconfigured weight together with the predicted error rate and the actual error rate of the k-th round.
Optionally, the prediction error rate of the k+1st round of concurrent test is calculated by the following formula:
$$e'_{k+1} = \alpha e_k + (1 - \alpha)\, e'_k$$
wherein $e_k$ represents the actual error rate of the k-th round of concurrent testing, $e'_k$ represents the predicted error rate of the k-th round of concurrent testing, and $\alpha$ represents the smoothing coefficient.
Optionally, reducing the concurrent request quantity on the basis of the k-th round of concurrent testing to obtain the concurrent request quantity of the (k+1)-th round comprises: calculating the concurrent request quantity of the (k+1)-th round using the predicted error rate of the (k+1)-th round as an attenuation coefficient.
Optionally, the concurrent request quantity of the (k+1)-th round of concurrent testing is calculated by the following formula:
[formula image not reproduced in the text]
wherein $e'_{k+1}$ represents the predicted error rate of the (k+1)-th round of concurrent testing, $C_k$ represents the concurrent request quantity of the k-th round of concurrent testing, and $\lceil \cdot \rceil$ represents the round-up (ceiling) operator.
Optionally, increasing the concurrent request quantity on the basis of the k-th round of concurrent testing to obtain the concurrent request quantity of the (k+1)-th round comprises: determining the quantity to be added from a preset floating coefficient and the concurrent request quantity of the k-th round, and adding it to the concurrent request quantity of the k-th round to obtain the concurrent request quantity of the (k+1)-th round.
Optionally, the concurrent request quantity of the (k+1)-th round of concurrent testing is calculated by the following formula:
[formula image not reproduced in the text]
wherein $e'_{k+1}$ represents the predicted error rate of the (k+1)-th round of concurrent testing, $C_k$ represents the concurrent request quantity of the k-th round of concurrent testing, $\beta$ represents the floating coefficient, and $\lfloor \cdot \rfloor$ represents the round-down (floor) operator.
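Because the two formulas for the concurrent request quantity are not reproduced in this text, the following is only a sketch of one form consistent with the description: the decrease treats the predicted error rate as an attenuation coefficient and rounds up, and the increase adds a fraction β of the current quantity and rounds down. The exact expressions, the function names, and the lower bound of 1 are assumptions rather than the formulas of the invention; the original increase formula may also involve the predicted error rate.

```python
import math


def decrease_concurrency(c_k: int, predicted_next: float) -> int:
    """Assumed form: C_{k+1} = ceil(C_k * (1 - e'_{k+1})), never below 1."""
    return max(1, math.ceil(c_k * (1.0 - predicted_next)))


def increase_concurrency(c_k: int, beta: float) -> int:
    """Assumed form: C_{k+1} = C_k + floor(beta * C_k)."""
    return c_k + math.floor(beta * c_k)
```

Under these assumptions, 100 concurrent requests with a predicted error rate of 0.25 would be reduced to 75, and 100 concurrent requests with β = 0.25 would be raised to 125.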
In another aspect of the present invention, there is also provided a cloud platform chaos testing device, including: an acquisition module, used for obtaining a test result of the k-th round of concurrent testing of the cloud platform to be tested after a fault is injected into the cloud platform to be tested, and counting the actual error rate of the k-th round, where k = 1, 2, 3, …; an estimating module, used for estimating the predicted error rate of the (k+1)-th round of concurrent testing from the actual error rate of the k-th round; a judging module, used for judging whether the predicted error rate of the (k+1)-th round is greater than a preset threshold; a first calculation module, used for reducing the concurrent request quantity on the basis of the k-th round to obtain the concurrent request quantity of the (k+1)-th round when the predicted error rate of the (k+1)-th round is greater than the preset threshold; a second calculation module, used for increasing the concurrent request quantity on the basis of the k-th round to obtain the concurrent request quantity of the (k+1)-th round when the predicted error rate of the (k+1)-th round is less than the preset threshold; and a determining module, used for determining the concurrent request quantity of the k-th round as the critical value of the cloud platform to be tested under the fault when the predicted error rate of the (k+1)-th round is equal to the preset threshold.
In another aspect of the present invention, there is also provided a computer apparatus including: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the cloud platform chaos testing method is executed.
In another aspect of the present invention, a computer readable storage medium is provided, where the computer readable storage medium stores computer instructions for causing a computer to execute the cloud platform chaos testing method described above.
The technical scheme of the invention has the following advantages:
According to the embodiment of the invention, historical test results are used to adjust the concurrent request quantity of the current test, and the concurrent request quantity is adjusted adaptively after fault injection, so that the performance critical value of the cloud platform under the injected fault is measured accurately and the performance degradation of the cloud platform after the fault is determined.
According to the embodiment of the invention, the current error rate is dynamically adjusted by utilizing the error rate of the previous concurrent test, namely, historical data, so that the aim of adjusting the concurrent request quantity is fulfilled, and the evaluation accuracy of the performance degradation level after the fault is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a test system according to an embodiment of the present invention;
FIG. 2 is a flowchart of a specific example of the cloud platform chaos testing method in embodiment 1 of the present invention;
FIG. 3 is a timing diagram of a chaotic test fault injection of a cloud platform according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a deployment architecture of a test system according to an embodiment of the present invention;
FIG. 5 is a schematic block diagram of a specific example of a cloud platform chaos testing device according to embodiment 2 of the present invention;
FIG. 6 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of protection of the invention.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; the two components can be directly connected or indirectly connected through an intermediate medium, or can be communicated inside the two components, or can be connected wirelessly or in a wired way. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
In addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
The cloud platform chaos testing method and device provided by the embodiment of the invention can dynamically adjust the concurrent request quantity after faults in the automatic testing process of the cloud platform, can control the iteration times or time when the faults are injected and released, and realize the simulation of faults such as service start and stop, network congestion, high memory load and the like. The device provided by the invention can automatically execute and record fault injection of the chaotic experiment, improves the diversity of steady-state indexes of the chaotic experiment, and perfects the evaluation of faults on performance degradation.
Before introducing the testing method of the embodiment of the invention, the contents such as testing resources and environment provided by the embodiment of the invention are introduced.
The testing process of the embodiment of the invention needs the following resources: chaos tasks, users, projects, and roles. A chaos task is a single execution of a chaos configuration on the cloud platform to be tested. The project is the smallest unit for managing chaos configurations and chaos tasks. Users, projects, and roles are related through role-based access control (RBAC). A user accesses the resources under a project by virtue of a role.
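The relationship between users, projects, and roles can be illustrated with a minimal sketch. The role names, the permission strings, and the `is_allowed` helper are hypothetical and serve only to show how a role bound to a user within a project could gate access to chaos tasks.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RoleBinding:
    """Binds a user to a role inside one project."""
    user: str
    project: str
    role: str


# Hypothetical permission table; the source does not list concrete roles.
ROLE_PERMISSIONS = {
    "admin": {"task:create", "task:read", "task:update", "task:delete"},
    "viewer": {"task:read"},
}


def is_allowed(bindings: Iterable[RoleBinding], user: str, project: str, action: str) -> bool:
    """Return True if any role held by the user in the project grants the action."""
    return any(
        b.user == user
        and b.project == project
        and action in ROLE_PERMISSIONS.get(b.role, set())
        for b in bindings
    )


bindings = [
    RoleBinding("alice", "project-a", "admin"),
    RoleBinding("bob", "project-a", "viewer"),
]
```

In this sketch, `is_allowed(bindings, "bob", "project-a", "task:create")` evaluates to `False`, because the viewer role carries only read permission.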
The embodiment of the invention also provides a testing system for executing the chaos test. As shown in FIG. 1, it mainly comprises the following modules: a user rights management module, a task management module, a cloud platform test suite, a remote fault injection tool, and so on. In addition to these modules, the system relies on a code hosting platform and a continuous integration platform.
User rights management module: provides functions for creating, deleting, modifying, and querying users, projects, and roles, and supports access control through RBAC.
Code hosting platform: a code hosting repository such as GitLab or Gerrit can be used. The user submits the YAML configuration for fault injection to the code hosting platform. The configuration parameters of the YAML file include, but are not limited to: fault type, fault injection iteration, fault release iteration, total number of iterations, etc. The configuration is reviewed and merged on the code hosting platform.
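As an illustration of the parameters listed above, the following sketch models a parsed fault injection configuration and checks that it is self-consistent. The field names, the validation rule, and the convention that the fault is no longer active at the release iteration are assumptions; the source does not give the schema of the configuration file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultInjectionConfig:
    """Hypothetical in-memory form of the fault injection configuration."""
    fault_type: str
    inject_iteration: int   # iteration at which the fault is injected
    release_iteration: int  # iteration at which the fault is released
    total_iterations: int

    def validate(self) -> None:
        ordered = 1 <= self.inject_iteration < self.release_iteration <= self.total_iterations
        if not ordered:
            raise ValueError("expected 1 <= inject < release <= total iterations")

    def fault_active(self, k: int) -> bool:
        """Assumed convention: the fault is active from injection up to, but not including, release."""
        return self.inject_iteration <= k < self.release_iteration


# Values taken from the example in the description: 1000 iterations,
# injection at the 200th and recovery at the 800th.
config = FaultInjectionConfig("service_stop", 200, 800, 1000)
config.validate()
```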
Task management module: the task management module runs as an HTTP service and provides functions for adding, deleting, modifying, and querying chaos tasks. When a chaos task is added, the task management module first verifies the request, including but not limited to checking whether the start time is later than the current time and whether the fault injection configuration file exists. After verification, information such as the ID, name, remark, start time, and configuration file content is stored in a database. When the task start time is reached, the task management module triggers a pipeline for cloud platform fault injection and testing on the continuous integration platform and updates the task state to running. After the continuous integration platform pipeline completes, the task management module updates the task state to finished and stores the test result in the database. When a chaos task is updated, the task management module first performs a state check: the name and remark fields can be updated in any state, but the start time can be updated only for a task in the pending state. After the update, the task management module stores the information in the database.
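The verification and update rules described for the task management module can be sketched as follows. The class name, field names, and exception types are hypothetical; persistence and the HTTP layer are omitted, and only the state rules stated above are modelled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ChaosTask:
    task_id: str
    name: str
    remark: str
    start_time: datetime
    config_text: str
    state: str = "pending"  # pending -> running -> finished


def validate_new_task(start_time: datetime, config_text: Optional[str], now: datetime) -> None:
    """Reject a task whose start time has passed or whose configuration is missing."""
    if start_time <= now:
        raise ValueError("start time must be later than the current time")
    if not config_text:
        raise ValueError("fault injection configuration file is missing")


def update_task(task: ChaosTask, name: Optional[str] = None,
                remark: Optional[str] = None,
                start_time: Optional[datetime] = None) -> None:
    """Name and remark may change in any state; start time only while pending."""
    if start_time is not None:
        if task.state != "pending":
            raise ValueError("start time can only be changed while the task is pending")
        task.start_time = start_time
    if name is not None:
        task.name = name
    if remark is not None:
        task.remark = remark


now = datetime(2024, 1, 1, 12, 0, 0)
task = ChaosTask("t1", "demo", "", now + timedelta(hours=1), "fault_type: service_stop")
validate_new_task(task.start_time, task.config_text, now)
```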
Continuous integration platform pipeline: when the pipeline is triggered, the slave node of the continuous integration platform first downloads the cloud platform test tool, the fault injection tool, and the fault injection configuration file from the code hosting platform. It then installs the fault injection tool and the cloud platform test tool in turn. Finally, it performs the basic configuration of the cloud platform test and runs the test with the specified fault injection configuration file.
Cloud platform test tool: the slave node of the continuous integration platform runs the cloud platform test tool. After the test command is issued, the cloud platform test tool starts two threads, one for executing the test task and the other for handling fault injection. The tasks to be tested and the iteration numbers are each managed in first-in-first-out queues; before each execution, the test thread puts the information for that run, such as the iteration number, into the queue. If fault injection is triggered at a specified iteration number, the fault-handling thread monitors the iteration-number queue and takes iteration numbers from it in order for comparison. If fault injection is triggered at a specified time, a timer is started when the first iteration number is taken from the queue.
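The interaction between the two threads can be sketched with a first-in-first-out queue, for the case where fault injection is triggered by iteration number. The function name, the `None` sentinel used to end the fault-handling thread, and the recorded event tuples are assumptions made for this illustration; real test execution and fault injection calls are replaced by comments.

```python
import queue
import threading
from typing import List, Tuple


def run_with_fault_injection(total_iterations: int, inject_at: int,
                             release_at: int) -> List[Tuple[str, int]]:
    """Run a test thread and a fault thread that communicate through a FIFO queue."""
    iterations: "queue.Queue" = queue.Queue()
    events: List[Tuple[str, int]] = []

    def test_worker() -> None:
        for k in range(1, total_iterations + 1):
            iterations.put(k)  # publish the iteration number before executing it
            # ... the k-th round of concurrent testing would run here ...
        iterations.put(None)   # sentinel (assumption): no more iterations

    def fault_worker() -> None:
        while True:
            k = iterations.get()  # blocks until the test thread publishes a number
            if k is None:
                break
            if k == inject_at:
                events.append(("inject", k))   # the fault injection API would be called here
            elif k == release_at:
                events.append(("release", k))  # the fault would be released here

    threads = [threading.Thread(target=test_worker), threading.Thread(target=fault_worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


events = run_with_fault_injection(1000, 200, 800)
```

Because only the fault-handling thread appends to `events` and the queue preserves order, the injection event is always recorded before the release event.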
Cloud platform fault injection: the cloud platform fault injection module provides the capabilities of service start/stop, Docker start/stop, server start/stop, network packet loss, network delay, memory load injection, and CPU load injection. The fault injection module exposes an API for other programs to call. In the invention, the fault injection module is called by the cloud platform test tool: after the fault-handling thread in the test tool is triggered, it calls the fault injection API. After logging in over SSH on the management network or over IPMI on the out-of-band network, the fault injection module completes the fault injection task using tools such as systemctl and tc.
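A minimal sketch of how such a module might translate a fault type into a systemctl or tc command line is given below. It only builds argument lists and does not log in to any host or execute anything. The fault type names, the parameter names, the default interface `eth0`, and the choice of the `netem` queueing discipline for packet loss and delay are assumptions, not details given in the source.

```python
from typing import List


def build_fault_command(fault_type: str, **params: str) -> List[str]:
    """Return the argument list of a command that would inject or clear a fault."""
    iface = params.get("iface", "eth0")  # assumed network interface name
    if fault_type == "service_stop":
        return ["systemctl", "stop", params["service"]]
    if fault_type == "service_start":
        return ["systemctl", "start", params["service"]]
    if fault_type == "packet_loss":
        return ["tc", "qdisc", "add", "dev", iface, "root", "netem",
                "loss", params["percent"] + "%"]
    if fault_type == "network_delay":
        return ["tc", "qdisc", "add", "dev", iface, "root", "netem",
                "delay", params["delay_ms"] + "ms"]
    if fault_type == "network_restore":
        return ["tc", "qdisc", "del", "dev", iface, "root"]
    raise ValueError("unsupported fault type: " + fault_type)


loss_cmd = build_fault_command("packet_loss", percent="10")
stop_cmd = build_fault_command("service_stop", service="example.service")
```

Keeping command construction separate from remote execution makes the mapping easy to review before it is run against a real host.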
The cloud platform chaos testing method provided by the embodiment of the invention, as shown in fig. 2, comprises the following steps:
Step S101: after the fault is injected into the cloud platform to be tested, obtain the test result of the k-th round of concurrent testing of the cloud platform to be tested and count the actual error rate of the k-th round, where k = 1, 2, 3, ….
In the k-th round of concurrent testing, that is, in the k-th test iteration, a cloud platform test tool can be used to send the corresponding number of concurrent requests to the cloud platform, so as to obtain the corresponding test result. The test result may be the response to each individual concurrent request: if no response is received, an error is recorded; if the response succeeds, the request passes. The ratio of the number of errors to the number of concurrent requests can be used as the actual error rate. In the embodiment of the invention, k takes the values 1, 2, 3, …, and may count either the test rounds after the fault is injected or the test rounds in the whole test task. For example, if the test iterates 1000 times, k takes the values 1, 2, 3, …, 1000, with fault injection starting at the 200th iteration and recovery at the 800th. The embodiment of the invention mainly concerns how to adjust the concurrent request quantity after fault injection.
Of course, those skilled in the art will appreciate, after reading the embodiments of the present invention, that the adjustment of the concurrent request quantity may run throughout the entire test process. The embodiment of the invention focuses on how to determine the critical value of the degraded performance of the cloud platform to be tested in the fault state after fault injection.
In the embodiment of the present invention, message passing between test execution and fault injection is completed through queue 1, in which the iteration number (i.e., k) is stored. Before each test is executed, the test process stores the iteration number in queue 1 and then starts a new thread to carry out the test task. When the number of running test threads reaches the concurrent request quantity, the test process waits for the test threads to finish, reads the test results, and counts the actual error rate. Each test thread corresponds to one test case, and one test case corresponds to one concurrent test request.
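One round of concurrent testing and the counting of its actual error rate can be sketched as follows. The `send_request` callable stands in for a single test case and is hypothetical, as is the use of a thread pool in place of individually started threads; a request counts as an error if it raises an exception or returns a falsy value.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_round(send_request: Callable[[int], bool], concurrency: int) -> float:
    """Issue `concurrency` requests in parallel and return errors / concurrency."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    errors = 0
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send_request, i) for i in range(concurrency)]
        for future in futures:  # wait for every test thread to finish
            try:
                ok = future.result()
            except Exception:
                ok = False      # an exception is counted as a failed request
            if not ok:
                errors += 1
    return errors / concurrency


def fake_request(i: int) -> bool:
    """Stand-in test case: every fourth request fails."""
    return i % 4 != 0


actual_error_rate = run_round(fake_request, 8)
```

With the stand-in test case, requests 0 and 4 fail, so the actual error rate of this round is 2/8 = 0.25.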
Step S102: estimate the predicted error rate of the (k+1)-th round of concurrent testing from the actual error rate of the k-th round.
In this embodiment, after the actual error rate is counted, it is smoothed and used to estimate the predicted error rate of the next round of concurrent testing. Specifically, the predicted error rate of the (k+1)-th round may be estimated directly from the actual error rate of the k-th round, for example by multiplying the actual error rate by a coefficient. Alternatively, the actual error rate and the predicted error rate of the k-th round may be used together, for example by taking their weighted sum as the predicted error rate of the (k+1)-th round.
As an optional implementation, estimating the predicted error rate of the (k+1)-th round of concurrent testing from the actual error rate of the k-th round includes: obtaining the predicted error rate of the k-th round; and calculating the predicted error rate of the (k+1)-th round from a preconfigured weight together with the predicted error rate and the actual error rate of the k-th round.
In the embodiment of the invention, the predicted error rate of the k-th round is calculated after the (k-1)-th round finishes; if k = 1, the predicted error rate takes its initial value, namely 0. The preconfigured weight is the relative weight given to the predicted error rate and the actual error rate of the k-th round when calculating the predicted error rate of the (k+1)-th round.
According to the embodiment of the invention, the current error rate is dynamically adjusted by utilizing the error rate of the previous concurrent test, namely, historical data, so that the aim of adjusting the concurrent request quantity is fulfilled, and the evaluation accuracy of the performance degradation level after the fault is improved.
As a further alternative embodiment, the prediction error rate of the k+1st round of concurrent testing may be calculated by the following formula:
$$e'_{k+1} = \alpha e_k + (1 - \alpha)\, e'_k$$
wherein $e_k$ represents the actual error rate of the k-th round of concurrent testing, $e'_k$ represents the predicted error rate of the k-th round of concurrent testing, and $\alpha$ represents the smoothing coefficient. The value of $\alpha$ is a constant greater than 0.5 and less than 1, for example 0.8. The higher the smoothing coefficient, the greater the weight given to the newest observation. Because the test process of the embodiment of the invention is iterative, the formula is also applied iteratively, and the more iterations are performed, the smaller the error of the estimate becomes.
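The smoothing formula can be traced with a short sketch. The function name is hypothetical; the initial predicted error rate of 0 and the example smoothing coefficient of 0.8 follow the description, while the sequence of actual error rates is made up for illustration.

```python
def predict_next_error_rate(actual_k: float, predicted_k: float, alpha: float = 0.8) -> float:
    """Exponential smoothing: e'_{k+1} = alpha * e_k + (1 - alpha) * e'_k."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha * actual_k + (1.0 - alpha) * predicted_k


predicted = 0.0              # e'_1 = 0, the initial value
trace = []
for actual in (0.5, 0.5, 0.5):  # made-up actual error rates e_1, e_2, e_3
    predicted = predict_next_error_rate(actual, predicted)
    trace.append(predicted)
```

With a constant actual error rate of 0.5, the predictions are approximately 0.4, 0.48, and 0.496, so the influence of the initial value 0 shrinks by a factor of (1 - α) in each round.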
And step S103, judging whether the prediction error rate of the k+1st round of concurrent test is larger than a preset threshold value.
And step S104, when the prediction error rate of the k+1th round of concurrent test is larger than the preset threshold, reducing the concurrent request quantity on the basis of the k-th round of concurrent test to obtain the concurrent request quantity of the k+1th round of concurrent test.
And step S105, when the prediction error rate of the k+1th round of concurrent test is smaller than the preset threshold, increasing the concurrent request quantity on the basis of the k-th round of concurrent test to obtain the concurrent request quantity of the k+1th round of concurrent test.
And S106, determining the concurrency request quantity of the kth round of concurrency test as the critical value of the cloud platform to be tested under the fault when the prediction error rate of the kth+1 round of concurrency test is equal to the preset threshold value.
In the embodiment of the invention, when the predicted error rate of the (k+1)-th round of concurrent testing is less than the preset threshold, the cloud platform to be tested has not reached the critical value of concurrent request processing, so the concurrent request quantity of the next round can be adjusted upward. If the predicted error rate of the (k+1)-th round is greater than the preset threshold, the cloud platform to be tested has exceeded its limit of concurrent request processing, and the concurrent request quantity of the next round needs to be adjusted downward. If the predicted error rate is equal to the preset threshold, the current concurrent request quantity can be regarded as the limit of the cloud platform to be tested under the fault, and the concurrent request quantity need not be adjusted.
In the embodiment of the invention, whatever the relation between the predicted error rate and the preset threshold, the process can return to the next round of iterative testing. If the predicted error rate is greater than or less than the preset threshold, the next round of concurrent testing is performed: the value of k is increased by 1, the corresponding concurrent test is executed, that is, steps S101 to S103 are executed to make the corresponding judgment, and subsequent actions follow the judgment result. When k reaches its maximum value, the chaos test is complete. Alternatively, the loop exit condition may be set as follows: when the predicted error rate of the (k+1)-th round is equal to the preset threshold, the concurrency limit of the cloud platform to be tested under the fault is determined.
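Steps S101 to S106 can be tied together in a minimal simulation. Everything below is a sketch under stated assumptions: the decrease and increase rules (attenuation by the predicted error rate with rounding up, and a floating coefficient β with rounding down) are assumed forms, the tolerance used to test equality of floating-point values is an addition, and `simulated_platform` is a toy model rather than a real cloud platform.

```python
import math
from typing import Callable, List, Tuple


def adaptive_chaos_test(measure_error_rate: Callable[[int], float],
                        initial_concurrency: int, threshold: float,
                        alpha: float = 0.8, beta: float = 0.1,
                        max_rounds: int = 50,
                        tolerance: float = 1e-9) -> List[Tuple[int, int, float, float]]:
    """Return one (k, C_k, e_k, e'_{k+1}) tuple per executed round."""
    history = []
    concurrency = initial_concurrency
    predicted = 0.0  # e'_1 = 0
    for k in range(1, max_rounds + 1):
        actual = measure_error_rate(concurrency)                  # S101
        predicted = alpha * actual + (1.0 - alpha) * predicted    # S102
        history.append((k, concurrency, actual, predicted))
        if abs(predicted - threshold) <= tolerance:               # S106: critical value found
            break
        if predicted > threshold:                                 # S103 and S104: decrease
            concurrency = max(1, math.ceil(concurrency * (1.0 - predicted)))
        else:                                                     # S105: increase
            concurrency += max(1, math.floor(beta * concurrency))
    return history


def simulated_platform(concurrency: int, capacity: int = 100) -> float:
    """Toy faulty platform: every request beyond `capacity` fails."""
    return max(0, concurrency - capacity) / concurrency


history = adaptive_chaos_test(simulated_platform, initial_concurrency=50, threshold=0.1)
```

In this toy run the concurrent request quantity climbs from 50 until it passes the simulated capacity and then oscillates in a band just above it, which is the behaviour the adaptive adjustment is meant to produce. With floating-point error rates, exact equality with the threshold is rare, so in practice the loop usually ends when the maximum number of rounds is reached.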
According to the embodiment of the invention, the concurrent request amount of the current test is adjusted using historical test results, and the concurrent request amount is adjusted adaptively after fault injection, so that the performance critical value of the cloud platform with the injected fault is tested accurately and the performance degradation of the cloud platform after the fault is determined.
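The three-way comparison described above can be sketched as a small decision function. This is a minimal illustration rather than the patented implementation: the names `Action` and `decide` are hypothetical, and the `tol` parameter is an assumption added here because two floating-point error rates are rarely exactly equal.

```python
from enum import Enum


class Action(Enum):
    INCREASE = "increase"  # predicted error rate below the threshold
    DECREASE = "decrease"  # predicted error rate above the threshold
    HOLD = "hold"          # predicted error rate equal to the threshold


def decide(predicted_error_rate: float, threshold: float, tol: float = 0.0) -> Action:
    """Compare the predicted error rate of round k+1 with the preset threshold.

    tol is a hypothetical tolerance (not part of the source text): values
    within tol of the threshold are treated as equal to it.
    """
    if abs(predicted_error_rate - threshold) <= tol:
        return Action.HOLD
    if predicted_error_rate > threshold:
        return Action.DECREASE
    return Action.INCREASE
```

With `tol` left at 0.0 the function follows the strict greater-than, smaller-than and equal-to cases of the text.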
As an optional implementation, reducing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing includes: calculating the concurrent request amount of the (k+1)th round of concurrent testing using the prediction error rate of the (k+1)th round of concurrent testing as an attenuation coefficient.
The prediction error rate is calculated from the actual error rate of the previous round. Using it as the attenuation coefficient therefore keeps the adjusted concurrent request amount consistent with the test result of the previous round, so that the test approaches the critical value of the cloud platform under the fault more quickly.
Specifically, the embodiment of the invention can calculate the concurrency request amount (i.e. the down-regulated concurrency request amount) of the k+1st round of concurrency test through the following formula:
wherein $e'_{k+1}$ represents the prediction error rate of the (k+1)th round of concurrent testing, $C_k$ represents the concurrent request amount of the kth round of concurrent testing, and $\lceil\cdot\rceil$ represents the round-up (ceiling) operator.
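The formula referred to above is not reproduced in this text. Assuming it takes the form $C_{k+1} = \lceil C_k (1 - e'_{k+1}) \rceil$, which is consistent with the symbol definitions and with the prediction error rate acting as an attenuation coefficient, the down-regulation could be sketched as follows. The function name `decrease_concurrency` and the lower bound of 1 are assumptions made for this sketch.

```python
import math


def decrease_concurrency(c_k: int, predicted_error_rate: float) -> int:
    """Assumed form: C_{k+1} = ceil(C_k * (1 - e'_{k+1})).

    The result is clamped to at least 1 (an assumption of this sketch) so
    that a predicted error rate of 1.0 does not stop the test entirely.
    """
    if not 0.0 <= predicted_error_rate <= 1.0:
        raise ValueError("predicted_error_rate must lie in [0, 1]")
    return max(1, math.ceil(c_k * (1.0 - predicted_error_rate)))
```

Under this assumed form, a higher predicted error rate removes a larger share of the concurrent requests in a single step, and rounding up keeps the reduction from overshooting by a full request.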
Alternatively, in the embodiment of the present invention, the concurrent request amount may be adjusted upwards in a similar manner, with the prediction error rate used as the floating coefficient. The calculation and its effect are similar to those of the above formula and are not repeated here.
As an optional implementation, increasing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing includes: determining the concurrent request amount to be added using a preset floating coefficient and the concurrent request amount of the kth round of concurrent testing, and adding it to the concurrent request amount of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing.
That is, in the embodiment of the present invention, a fixed floating coefficient may be set to calculate the up-regulated concurrent request amount. Specifically, the concurrent request amount of the (k+1)th round of concurrent testing can be calculated by the following formula:
wherein $e'_{k+1}$ represents the prediction error rate of the (k+1)th round of concurrent testing, $C_k$ represents the concurrent request amount of the kth round of concurrent testing, $\beta$ represents the floating coefficient, which can be set according to experience, and $\lfloor\cdot\rfloor$ represents the round-down (floor) operator.
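The up-regulation formula is likewise missing from this text. Following the verbal description (an increment derived from the floating coefficient and the kth-round amount, added to the kth-round amount, with rounding down), one plausible reading is $C_{k+1} = C_k + \lfloor \beta C_k \rfloor$. The sketch below assumes that form; `increase_concurrency` is a hypothetical name.

```python
import math


def increase_concurrency(c_k: int, beta: float) -> int:
    """Assumed form: C_{k+1} = C_k + floor(beta * C_k).

    beta is the preset floating coefficient; it must be non-negative.
    """
    if beta < 0.0:
        raise ValueError("beta must be non-negative")
    return c_k + math.floor(beta * c_k)
```

Because of the rounding down, the amount does not grow when `beta * c_k` is below 1, so with small concurrency values a larger coefficient may be needed for the test to make progress.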
Combining the above formulas gives the following:
wherein E is the preset threshold. If the prediction error rate is greater than the preset threshold, the test management process attenuates its concurrent request amount according to the prediction error rate. If the prediction error rate is smaller than the preset threshold, for example when the current error rate is 0, the concurrent request amount has not reached the critical value of the cloud platform, and the test management process then attempts to increase its concurrent request amount appropriately.
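The combined formula is not reproduced in this text. Under the assumption that the down-regulation uses $\lceil C_k (1 - e'_{k+1}) \rceil$ and the up-regulation uses $C_k + \lfloor \beta C_k \rfloor$, a piecewise form consistent with the description would be the following; it is a reconstruction under those assumptions, not the original expression.

```latex
C_{k+1} =
\begin{cases}
\left\lceil C_k \,(1 - e'_{k+1}) \right\rceil, & e'_{k+1} > E \\[4pt]
C_k + \left\lfloor \beta\, C_k \right\rfloor, & e'_{k+1} < E \\[4pt]
C_k, & e'_{k+1} = E
\end{cases}
```

The third case reflects the statement that the concurrent request amount need not be adjusted when the prediction error rate equals the preset threshold.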
In the embodiment of the invention, the fault injection thread has two working modes: a timing mode and an iteration count mode. In the timing mode, the chaos experiment configuration specifies the fault injection time relative to the start of the test. In the iteration count mode, the fault injection time is determined by the number of times the test has been executed. Both modes use polling.
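The two polling modes can be illustrated with a small worker function. This is a sketch under stated assumptions: the function name, the mode strings `"timing"` and `"iterations"`, the callbacks `get_iteration` and `inject`, and the polling interval are all hypothetical and do not come from the source.

```python
import threading
import time
from typing import Callable, Optional


def fault_injection_worker(
    mode: str,
    trigger: float,
    get_iteration: Callable[[], int],
    inject: Callable[[], None],
    start_time: float,
    poll_interval: float = 0.01,
    stop: Optional[threading.Event] = None,
) -> None:
    """Poll until the configured trigger is reached, then inject the fault once.

    mode "timing":     trigger is the number of seconds after start_time.
    mode "iterations": trigger is the number of completed test executions.
    """
    while stop is None or not stop.is_set():
        if mode == "timing":
            due = time.monotonic() - start_time >= trigger
        elif mode == "iterations":
            due = get_iteration() >= trigger
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        if due:
            inject()
            return
        time.sleep(poll_interval)
```

In a test tool the worker would typically run in its own thread, for example `threading.Thread(target=fault_injection_worker, args=(...))`, while the main thread executes the test cases and updates the iteration counter.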
The embodiment of the present invention is described below with reference to the chaos test workflow shown in FIG. 3, which comprises:
Step 1: a user submits a chaos experiment configuration to a code hosting platform.
Step 2: after review by an administrator, the configuration is merged into the code repository.
Step 3: the user creates a chaos experiment task through a RESTful API, specifying the configuration and the start time of the chaos experiment task at creation.
Step 4: when the task management module detects that the start time of the chaos experiment task has arrived, it triggers the pipeline of the continuous integration platform.
Step 5: the slave node of the continuous integration platform downloads the chaos experiment configuration, the fault injection tool and the cloud platform test tool.
Step 6: the slave node of the continuous integration platform installs the fault injection tool and the cloud platform test tool in sequence.
Step 7: the cloud platform test is run; during the test, the cloud platform test tool and the fault injection tool manage fault injection and recovery.
Step 8: a test result is generated and the task state is returned.
FIG. 4 shows a chaos testing platform according to an embodiment of the present invention, deployed as follows: the code hosting platform is deployed on server 1; the user authority management module, the task management module and the database are deployed on server 2; the Jenkins master is deployed on server 3; and server 4 serves as the Jenkins slave. Servers 1-4 are connected to the cloud platform management network, and server 4 is also connected to the out-of-band management network of all management nodes and computing nodes of the cloud platform.
The cloud platform to be tested uses OpenStack and consists of three management nodes and two computing nodes. The management nodes are deployed with services such as keystone, nova-api, nova-scheduler, nova-conductor, placement, cinder-api, cinder-scheduler, glance, neutron-server and neutron-dhcp-agent. The computing nodes are deployed with services such as nova-compute and cinder-volume.
The specific operation steps of the user comprise:
Step 1: the user submits the chaos experiment configuration to the code hosting platform on server 1 through git.
Step 2: the configuration is reviewed and merged.
Step 3: the user applies to the user authority management module for a token using credential information, and uses the token to send a request to the task management module to create the chaos experiment task.
Step 4: the Jenkins pipeline is triggered.
Step 5: the Jenkins slave begins downloading the configuration and related software.
Step 6: the chaos experiment task is executed on the Jenkins slave node (server 4), and the fault injection tool and the cloud platform test tool are installed on server 4.
Step 7: the cloud platform test tool performs the test. In this embodiment, the chaos experiment configuration specifies running a test that creates cloud hosts, with a total of 1000 iterations and a concurrency of 20. The fault type is the nova-api service going down on a randomly selected management node; the fault is injected at iteration 200 and removed at iteration 800. When the test tool has run the test case 200 times, the fault injection thread in the test tool detects that the specified count has been reached, logs in remotely to management node 1 (randomly selected) through the cloud platform management network, and executes the command systemctl stop openstack-nova-api to stop the service. The service is restored when iteration 800 is reached. After iteration 200, the concurrency is adjusted adaptively according to the error rate statistics and continuously approaches the concurrency limit at which the system can still process requests normally.
Step 8: after the chaos experiment is finished, the test result and the chaos experiment state are returned.
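The experiment parameters of Step 7 can be captured in a small configuration object. The sketch below is illustrative only: the class name `ChaosExperiment`, its field names and the `phase` helper are hypothetical, while the numeric values (1000 iterations, concurrency 20, injection at iteration 200, recovery at iteration 800) are taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChaosExperiment:
    """Parameters of the example experiment (field names are hypothetical)."""
    total_iterations: int = 1000
    initial_concurrency: int = 20
    fault: str = "nova-api service down on a randomly selected management node"
    inject_at: int = 200   # iteration count at which the fault is injected
    recover_at: int = 800  # iteration count at which the service is restored


def phase(exp: ChaosExperiment, iteration: int) -> str:
    """Return the phase the experiment is in after `iteration` test executions."""
    if iteration < exp.inject_at:
        return "baseline"
    if iteration < exp.recover_at:
        return "fault"
    return "recovered"
```

Adaptive adjustment of the concurrency would apply from the "fault" phase onward, matching the statement that the concurrency is adjusted after iteration 200.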
Example 2
The present embodiment provides a cloud platform chaos testing device, which may be used to execute the testing method in the foregoing embodiment 1, as shown in fig. 5, and includes:
the obtaining module 501, configured to obtain a test result of the kth round of concurrent testing of the cloud platform to be tested after a fault is injected into the cloud platform to be tested, and to count the actual error rate of the kth round of concurrent testing, where k is 1, 2, 3, ...;
the estimation module 502, configured to estimate the prediction error rate of the (k+1)th round of concurrent testing using the actual error rate of the kth round of concurrent testing;
the judging module 503, configured to judge whether the prediction error rate of the (k+1)th round of concurrent testing is greater than a preset threshold;
the first calculation module 504, configured to reduce the concurrent request amount on the basis of the kth round of concurrent testing when the prediction error rate of the (k+1)th round of concurrent testing is greater than the preset threshold, to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
the second calculation module 505, configured to increase the concurrent request amount on the basis of the kth round of concurrent testing when the prediction error rate of the (k+1)th round of concurrent testing is smaller than the preset threshold, to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
and the determining module 506, configured to determine the concurrent request amount of the kth round of concurrent testing as the critical value of the cloud platform to be tested under the fault when the prediction error rate of the (k+1)th round of concurrent testing is equal to the preset threshold.
According to the embodiment of the invention, the error rate of the previous round of concurrent testing, that is, historical data, is used to dynamically estimate the current error rate, and the concurrent request amount is adjusted accordingly, which improves the accuracy with which the level of performance degradation after the fault is evaluated.
For a specific description of the device embodiment, reference may be made to the above method embodiment; the details are not repeated here.
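The cooperation of modules 501-506 can be sketched as a single loop. The estimation step uses the exponential smoothing formula stated in claim 3, $e'_{k+1} = \alpha e_k + (1-\alpha) e'_k$. Everything else is an assumption of this sketch: the function names, the `run_round` callback standing in for a real concurrent test, the initial prediction of 0.0, and the assumed adjustment forms $\lceil C_k (1 - e'_{k+1}) \rceil$ and $C_k + \lfloor \beta C_k \rfloor$.

```python
import math
from typing import Callable, List, Tuple


def estimate(alpha: float, actual_k: float, predicted_k: float) -> float:
    """Claim 3: e'_{k+1} = alpha * e_k + (1 - alpha) * e'_k."""
    return alpha * actual_k + (1.0 - alpha) * predicted_k


def run_chaos_test(
    run_round: Callable[[int], float],  # hypothetical: runs one round, returns actual error rate
    c0: int,
    threshold: float,
    alpha: float,
    beta: float,
    max_rounds: int,
) -> Tuple[int, List[int]]:
    """Return the final concurrent request amount and the amounts tried."""
    c = c0
    predicted = 0.0  # assumed initial prediction, not specified in this text
    history: List[int] = []
    for _ in range(max_rounds):
        history.append(c)
        actual = run_round(c)                           # obtaining module 501
        predicted = estimate(alpha, actual, predicted)  # estimation module 502
        if predicted > threshold:                       # judging 503, first calculation 504
            c = max(1, math.ceil(c * (1.0 - predicted)))
        elif predicted < threshold:                     # second calculation 505
            c = c + math.floor(beta * c)
        else:                                           # determining module 506
            break
    return c, history
```

A usage example with a deterministic stand-in for the platform: `run_chaos_test(lambda c: 0.5 if c > 16 else (0.25 if c > 12 else 0.0), 20, 0.25, 1.0, 0.5, 10)` tries 20, then 10, then 15 concurrent requests and stops at 15, where the predicted error rate equals the threshold.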
Example 3
In one embodiment of the present invention, there is also provided a computer device, whose internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory and a network interface connected by a system bus, and may also include a display screen and an input device. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with external computer devices through a network connection. The computer program, when executed by the processor, implements the cloud platform chaos testing method described above. Where the computer device includes a display screen and an input device, the display screen may be a liquid crystal display or an electronic ink display, and the input device may be a touch layer covering the display screen, or a key, trackball or touch pad arranged on the housing of the computer device.
Alternatively, the computer device may not include a display screen and an input device. Those skilled in the art will understand that the structure shown in FIG. 6 is merely a block diagram of the part of the structure related to the present application and does not limit the computer device to which the present application is applied; a specific computer device may include more or fewer components than shown in the drawing, may combine some components, or may have a different arrangement of components.
In one embodiment, a computer device is provided that includes at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to perform the steps of:
after a fault is injected into the cloud platform to be tested, obtaining a test result of the kth round of concurrent testing of the cloud platform to be tested, and counting the actual error rate of the kth round of concurrent testing, where k is 1, 2, 3, ...;
estimating the prediction error rate of the (k+1)th round of concurrent testing using the actual error rate of the kth round of concurrent testing;
judging whether the prediction error rate of the (k+1)th round of concurrent testing is greater than a preset threshold;
when the prediction error rate of the (k+1)th round of concurrent testing is greater than the preset threshold, reducing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
when the prediction error rate of the (k+1)th round of concurrent testing is smaller than the preset threshold, increasing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
and when the prediction error rate of the (k+1)th round of concurrent testing is equal to the preset threshold, determining the concurrent request amount of the kth round of concurrent testing as the critical value of the cloud platform to be tested under the fault.
In one embodiment, a computer-readable storage medium is provided, storing computer instructions for causing a computer to perform:
after a fault is injected into the cloud platform to be tested, obtaining a test result of the kth round of concurrent testing of the cloud platform to be tested, and counting the actual error rate of the kth round of concurrent testing, where k is 1, 2, 3, ...;
estimating the prediction error rate of the (k+1)th round of concurrent testing using the actual error rate of the kth round of concurrent testing;
judging whether the prediction error rate of the (k+1)th round of concurrent testing is greater than a preset threshold;
when the prediction error rate of the (k+1)th round of concurrent testing is greater than the preset threshold, reducing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
when the prediction error rate of the (k+1)th round of concurrent testing is smaller than the preset threshold, increasing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
and when the prediction error rate of the (k+1)th round of concurrent testing is equal to the preset threshold, determining the concurrent request amount of the kth round of concurrent testing as the critical value of the cloud platform to be tested under the fault.
Those skilled in the art will appreciate that all or part of the above methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, which, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It is apparent that the above examples are given by way of illustration only and do not limit the embodiments. Other variations or modifications in different forms may be made by those of ordinary skill in the art on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom remain within the scope of protection of the invention.

Claims (10)

1. A cloud platform chaos testing method, characterized by comprising the following steps:
after a fault is injected into the cloud platform to be tested, obtaining a test result of the kth round of concurrent testing of the cloud platform to be tested, and counting the actual error rate of the kth round of concurrent testing, where k is 1, 2, 3, ...;
estimating the prediction error rate of the (k+1)th round of concurrent testing using the actual error rate of the kth round of concurrent testing;
judging whether the prediction error rate of the (k+1)th round of concurrent testing is greater than a preset threshold;
when the prediction error rate of the (k+1)th round of concurrent testing is greater than the preset threshold, reducing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
when the prediction error rate of the (k+1)th round of concurrent testing is smaller than the preset threshold, increasing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
and when the prediction error rate of the (k+1)th round of concurrent testing is equal to the preset threshold, determining the concurrent request amount of the kth round of concurrent testing as the critical value of the cloud platform to be tested under the fault.
2. The cloud platform chaos testing method according to claim 1, wherein estimating the prediction error rate of the (k+1)th round of concurrent testing using the actual error rate of the kth round of concurrent testing includes:
obtaining the prediction error rate of the kth round of concurrent testing;
and calculating the prediction error rate of the (k+1)th round of concurrent testing using a preconfigured weight together with the prediction error rate and the actual error rate of the kth round of concurrent testing.
3. The cloud platform chaos testing method according to claim 2, wherein the prediction error rate of the (k+1)th round of concurrent testing is calculated by the following formula:
$e'_{k+1} = \alpha e_k + (1-\alpha) e'_k$
wherein $e_k$ represents the actual error rate of the kth round of concurrent testing, $e'_k$ represents the prediction error rate of the kth round of concurrent testing, and $\alpha$ represents the smoothing coefficient.
4. The cloud platform chaos testing method according to claim 1, wherein reducing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing includes:
calculating the concurrent request amount of the (k+1)th round of concurrent testing using the prediction error rate of the (k+1)th round of concurrent testing as an attenuation coefficient.
5. The cloud platform chaos testing method according to claim 4, wherein the concurrent request amount of the (k+1)th round of concurrent testing is calculated by the following formula:
wherein $e'_{k+1}$ represents the prediction error rate of the (k+1)th round of concurrent testing, $C_k$ represents the concurrent request amount of the kth round of concurrent testing, and $\lceil\cdot\rceil$ represents the round-up (ceiling) operator.
6. The cloud platform chaos testing method according to claim 1, wherein increasing the concurrent request amount on the basis of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing includes:
determining the concurrent request amount to be added using a preset floating coefficient and the concurrent request amount of the kth round of concurrent testing, and adding it to the concurrent request amount of the kth round of concurrent testing to obtain the concurrent request amount of the (k+1)th round of concurrent testing.
7. The cloud platform chaos testing method according to claim 6, wherein the concurrent request amount of the (k+1)th round of concurrent testing is calculated by the following formula:
wherein $C_k$ represents the concurrent request amount of the kth round of concurrent testing, $\beta$ represents the floating coefficient, and $\lfloor\cdot\rfloor$ represents the round-down (floor) operator.
8. A cloud platform chaos testing device, characterized by comprising:
an obtaining module, configured to obtain a test result of the kth round of concurrent testing of the cloud platform to be tested after a fault is injected into the cloud platform to be tested, and to count the actual error rate of the kth round of concurrent testing, where k is 1, 2, 3, ...;
an estimation module, configured to estimate the prediction error rate of the (k+1)th round of concurrent testing using the actual error rate of the kth round of concurrent testing;
a judging module, configured to judge whether the prediction error rate of the (k+1)th round of concurrent testing is greater than a preset threshold;
a first calculation module, configured to reduce the concurrent request amount on the basis of the kth round of concurrent testing when the prediction error rate of the (k+1)th round of concurrent testing is greater than the preset threshold, to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
a second calculation module, configured to increase the concurrent request amount on the basis of the kth round of concurrent testing when the prediction error rate of the (k+1)th round of concurrent testing is smaller than the preset threshold, to obtain the concurrent request amount of the (k+1)th round of concurrent testing;
and a determining module, configured to determine the concurrent request amount of the kth round of concurrent testing as the critical value of the cloud platform to be tested under the fault when the prediction error rate of the (k+1)th round of concurrent testing is equal to the preset threshold.
9. A computer device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to perform the cloud platform chaos test method according to any one of claims 1-7.
10. A computer-readable storage medium storing computer instructions for causing a computer to perform the cloud platform chaos test method according to any one of claims 1 to 7.
CN202111613738.1A 2021-12-27 2021-12-27 Cloud platform chaos testing method and device Active CN114500349B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111613738.1A CN114500349B (en) 2021-12-27 2021-12-27 Cloud platform chaos testing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111613738.1A CN114500349B (en) 2021-12-27 2021-12-27 Cloud platform chaos testing method and device

Publications (2)

Publication Number Publication Date
CN114500349A CN114500349A (en) 2022-05-13
CN114500349B true CN114500349B (en) 2023-08-08

Family

ID=81496035

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111613738.1A Active CN114500349B (en) 2021-12-27 2021-12-27 Cloud platform chaos testing method and device

Country Status (1)

Country Link
CN (1) CN114500349B (en)

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant