CN113986719A - Automatic test method and system for large-scale cluster performance based on cloud service


Publication number
CN113986719A
Authority
CN
China
Prior art keywords
test
client
server
software
performance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111176883.8A
Other languages
Chinese (zh)
Inventor
周同庆
李广辉
冯光
孙利杰
陈松政
刘文清
杨涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Qilin Xin'an Technology Co ltd
Original Assignee
Hunan Qilin Xin'an Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Qilin Xin'an Technology Co ltd
Priority to CN202111176883.8A
Publication of CN113986719A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3684Test management for test design, e.g. generating new test cases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/60Software deployment
    • G06F8/61Installation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45562Creating, deleting, cloning virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a cloud-service-based automatic performance test method and system for large-scale clusters. The method comprises: generating computing nodes in batches on a physical server from an operating system in which test software is installed; among the generated computing nodes, designating the test software of one computing node as the server and the test software of the remaining computing nodes as clients; and issuing, through the server, a test program and a test task based on the test program to designated clients, and synchronously acquiring the resource occupation states of each client and of the physical server/physical client while each client executes the test task. The invention can determine the performance boundaries reachable by a cloud computing solution on the same hardware and on different hardware, and the performance boundaries reachable by different cloud computing solutions on the same hardware, with high test efficiency and small test error.

Description

Automatic test method and system for large-scale cluster performance based on cloud service
Technical Field
The invention relates to a large-scale cluster performance testing technology based on cloud service, in particular to a large-scale cluster performance automatic testing method and system based on cloud service.
Background
A large-scale cluster based on cloud services is the basic architecture for providing cloud services: for example, virtual machines on physical servers serve as computing nodes, and physical clients log in to the computing nodes to perform remote operations or computations, including cloud desktops. At present, such clusters have the following problems: it is difficult to determine the upper limit of resources available to the virtual machines running in the cluster; the computing resources that must be allocated to a specific operation inside a virtual machine cannot be determined; the network resources consumed by remote operation of a virtual machine cannot be determined accurately; and neither the maximum performance achievable by the cloud server hardware nor the performance degradation caused by hardware aging over its service life can be perceived effectively. Nor is it possible to compare the maximum performance boundaries reachable by different hardware, the computing resources consumed by the same operation in software service systems from different vendors on the same hardware, the time and performance cost of running the same load, or the maximum performance boundaries those systems can reach. Performance tests are difficult to grade: everyone judges performance subjectively, test data are hard to bring to a unified standard, and every company claims that its own hardware and software perform better.
Comparison tests of large-scale clusters based on cloud services are mostly performed manually. Each tester has their own test methods and habits, and differences in reaction time and monitoring time introduce human error into the test results. In addition, the test scripts currently on the market cannot be written once and executed on multiple platforms; for example, a test script written on win7 cannot be run on win10, win2008 or Linux.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the above problems in the prior art, the invention provides a cloud-service-based automatic performance test method and system for large-scale clusters, which can determine the performance boundaries reachable by a cloud computing solution on the same hardware and on different hardware, and the performance boundaries reachable by different cloud computing solutions on the same hardware, with high test efficiency and small test error.
In order to solve the technical problems, the invention adopts the technical scheme that:
a large-scale cluster performance automatic testing method based on cloud service comprises the following steps:
1) preparing an operating system provided with test software, and generating computing nodes in batches on a physical server based on the operating system provided with the test software;
2) designating test software of one computing node as a server and test software of the other computing nodes as clients in the generated computing nodes;
3) issuing, through the server, a test program and a test task based on the test program to designated clients, and synchronously acquiring the resource occupation states of each client and of the physical server/physical client while each client executes the test task.
Optionally, step 1) comprises: 1.1) preparing an operating system in which the test software is installed; 1.2) creating a source virtual machine on a physical server, and either installing the operating system with the test software in the created source virtual machine or importing a virtual machine in which that operating system is already installed; 1.3) using the batch release function of the virtualization system on the physical server to create virtual machine copies in batches from the source virtual machine, thereby obtaining a batch of computing nodes whose operating system has the test software installed.
Optionally, designating the test software of one computing node as the server and the test software of the remaining computing nodes as clients in step 2) means: logging in remotely to a designated computing node from a physical client and, through a designated operation, triggering the test software on that computing node to send broadcast messages to the other computing nodes, so that the test software of that computing node acts as the server and the test software of the other computing nodes acts as clients.
Optionally, after step 2) and before step 3), the method further comprises a step in which each client periodically sends its own identity information, IP address and resource occupation state to the server. The resource occupation state includes at least one of CPU, memory, disk I/O and network I/O state data: the CPU state data is the CPU occupation ratio, the memory state data includes the total memory and the free memory, the disk I/O state data includes bytes read and bytes written, and the network I/O state data includes bytes sent and bytes received.
Optionally, the test program in step 3) is one or more of an exe executable file, a python executable file, a shell executable file, a cmd executable file and an execution script program generated by a recorded manual operation sequence; the operation executed by the test program in the step 3) comprises one or more of window operation, drawing program operation, picture browsing operation, video playing operation, browser operation, resource browser operation, office software operation, maximum calculation amount pressure test operation, maximum disk read-write pressure test operation and maximum network bandwidth pressure test operation.
Optionally, the attributes of the test task based on the test program in step 3) include a test task name, a selected test program, an execution mode of the test program, and execution parameters, where the execution mode includes an immediate execution mode and a timed execution mode, and the execution parameters include a number of repeated executions, an execution interval time, and a start execution time.
Optionally, synchronously acquiring the resource occupation states of each client and of the physical server/physical client while each client executes the test task in step 3) comprises: when the server issues the test task based on the test program to the designated clients, simultaneously starting to acquire the resource occupation states of the physical server/physical client, each through the SSH protocol, and starting to record the resource occupation state of each client; upon receiving a test-task completion notification from any client, ending the recording of that client's resource occupation state, thereby obtaining the resource occupation state of that client during execution of the test task; upon receiving test-task completion notifications from all clients, ending the recording of the resource occupation state of the physical server/physical client, thereby obtaining the resource occupation state of the physical server/physical client while the clients executed the test task. The resource occupation state includes at least one of CPU, memory, disk I/O and network I/O state data: the CPU state data is the CPU occupation ratio, the memory state data includes the total memory and the free memory, the disk I/O state data includes bytes read and bytes written, and the network I/O state data includes bytes sent and bytes received.
Optionally, after step 2) and before step 3), the method further includes the step of updating the clients in batches: and issuing the distribution address of the new version of the test software to each client at the server, and acquiring the new version of the test software and completing the updating and upgrading of the local test software by each client based on the received distribution address.
In addition, the invention also provides a large-scale cluster performance automatic test system based on the cloud service, which comprises a microprocessor and a memory which are connected with each other, and is characterized in that the microprocessor is programmed or configured to execute the steps of the large-scale cluster performance automatic test method based on the cloud service.
In addition, the present invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program programmed or configured to execute the cloud service-based large-scale cluster performance automated testing method.
Compared with the prior art, the invention has the following advantages:
1. the method and the device can define the performance boundaries which can be reached by the cloud computing solutions under the same hardware and different hardware, and define the performance boundaries which can be reached by the different cloud computing solutions under the same hardware, the testing process is simple and quick, and the complexity of the operation process can be reduced.
2. A cloud computing deployment may contain thousands of cloud virtual machines; testing them manually consumes human resources and requires a large investment of testing time. The invention meets the hard requirement of large-scale cluster performance testing quickly: the test environment can be deployed rapidly, which improves test efficiency, shortens the test period and reduces the human resources consumed by testing.
3. The invention generates computing nodes in batches on a physical server from an operating system in which the test software is installed, issues the test program and the test task based on the test program to designated clients through the server, and synchronously acquires the resource occupation states of each client and of the physical server/physical client while each client executes the test task. The test environment and the test method are therefore identical across runs, which removes the influence of human participation and the errors of manual performance recording, eliminates human interference in the performance evaluation of the server, keeps test errors small and ensures reliable, stable test results. Every performance test is thus comparable with the others: only when the test actions are identical can the degree of performance optimization of hardware and software be evaluated objectively.
4. Conventional test software cannot evaluate the performance of a transmission protocol or judge whether a protocol optimization has succeeded. As the research and development department iterates and releases new versions, it is very difficult to determine whether the network bandwidth and system performance consumed by the same desktop operation have improved or regressed. The invention synchronously acquires the resource occupation states of each client and of the physical server/physical client while each client executes the test task, so it can be used to compare the performance and quality of network transmission protocols and image compression protocols and thereby support their optimization.
Drawings
FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a topology of a system in an embodiment of the present invention.
FIG. 3 is a schematic diagram of a performance testing principle of the system in the embodiment of the invention.
FIG. 4 is a diagram illustrating the prior art of acquiring the resource occupation status on a physical server/physical client.
FIG. 5 is a schematic diagram of a command window for acquiring a memory occupation state according to an embodiment of the present invention.
FIG. 6 is a diagram of a command window for acquiring the CPU occupation status according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a command window for obtaining network usage status data according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating the synchronous acquisition of resource occupation status on physical servers/physical clients according to an embodiment of the present invention.
FIG. 9 is a topology diagram of remote desktop traffic monitoring in an embodiment of the invention.
Detailed Description
As shown in fig. 1, fig. 2 and fig. 3, the method for automatically testing the performance of a large-scale cluster based on cloud service in this embodiment includes:
1) preparing an operating system provided with test software, and generating computing nodes in batches on a physical server based on the operating system provided with the test software;
2) designating test software of one computing node as a server and test software of the other computing nodes as clients in the generated computing nodes;
3) issuing, through the server, a test program and a test task based on the test program to designated clients, and synchronously acquiring the resource occupation states of each client and of the physical server/physical client while each client executes the test task.
In this embodiment, the test software is named Kylintest. It is built with the Qt graphical interface framework, batch nested multithreading, remote network communication and the kernel API interfaces of Linux and Windows systems, and it generates test programs using techniques such as macro screen recording.
In this embodiment, step 1) includes: 1.1) preparing an operating system in which the test software is installed; 1.2) creating a source virtual machine on a physical server, and either installing the operating system with the test software in the created source virtual machine or importing a virtual machine in which that operating system is already installed; 1.3) using the batch release function of the virtualization system on the physical server to create virtual machine copies in batches from the source virtual machine, thereby obtaining a batch of computing nodes whose operating system has the test software installed. That is, after the source virtual machine is created, an operating system with Kylintest installed is installed in it, or a virtual machine in which such an operating system is already installed is imported; a large number of operating system copies are then quickly cloned from the source virtual machine on the physical server using the batch release function of the virtualization system, and the source virtual machine is started. After the operating system boots, Kylintest starts automatically.
In this embodiment, designating the test software of one computing node as the server and the test software of the remaining computing nodes as clients in step 2) means: logging in remotely to a designated computing node from a physical client and, through a designated operation, triggering the test software on that computing node to send broadcast messages to the other computing nodes, so that the test software of that computing node acts as the server and the test software of the other computing nodes acts as clients. In this embodiment, all the computing nodes belong to the cloud service cluster, and the one acting as the server controls all the others, so that the computing nodes in the cloud service cluster are deployed uniformly.
In this embodiment, after step 2) and before step 3), the method further includes a step in which each client periodically sends its own identity information, IP address and resource occupation state to the server. The resource occupation state in this embodiment includes at least one of CPU, memory, disk I/O and network I/O state data (selected as needed): the CPU state data is the CPU occupation ratio, the memory state data includes the total memory size and the free memory size, the disk I/O state data includes bytes read and bytes written, and the network I/O state data includes bytes sent and bytes received. As an optional implementation, the server in this embodiment also generates an Excel file from the resource occupation states reported by the clients.
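The broadcast step described above can be sketched as follows. This is a minimal illustration and not the Kylintest implementation: the patent does not specify the message format, transport or port, so the JSON layout, the use of UDP and the names `ANNOUNCE_PORT`, `build_announcement`, `parse_announcement` and `broadcast_announcement` are all assumptions.

```python
import json
import socket

ANNOUNCE_PORT = 50000  # hypothetical port, not taken from the source


def build_announcement(server_ip: str, port: int) -> bytes:
    """Encode the message a newly designated server would broadcast (assumed JSON layout)."""
    return json.dumps({"type": "server_announce", "ip": server_ip, "port": port}).encode("utf-8")


def parse_announcement(payload: bytes):
    """Return (ip, port) if the payload is a server announcement, otherwise None."""
    try:
        msg = json.loads(payload.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return None
    if not isinstance(msg, dict) or msg.get("type") != "server_announce":
        return None
    ip, port = msg.get("ip"), msg.get("port")
    if not isinstance(ip, str) or not isinstance(port, int):
        return None
    return ip, port


def broadcast_announcement(server_ip: str, port: int = ANNOUNCE_PORT) -> None:
    """Send the announcement to the local broadcast address over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_announcement(server_ip, port), ("255.255.255.255", port))
```

A node that receives a payload for which `parse_announcement` returns an address would treat itself as a client of that server; any other payload is ignored.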
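The periodic report described above could be modeled roughly as below. The field names, the JSON encoding and the helper names are illustrative assumptions; the source only lists which quantities are reported.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ResourceStatus:
    """One periodic report from a client (field names are hypothetical)."""
    client_id: str         # identity information
    ip: str                # client IP address
    cpu_percent: float     # CPU occupation ratio, 0-100
    mem_total: int         # total memory, bytes
    mem_free: int          # free memory, bytes
    disk_read_bytes: int
    disk_write_bytes: int
    net_sent_bytes: int
    net_recv_bytes: int

    def mem_used_ratio(self) -> float:
        """Fraction of memory in use, derived from total and free."""
        if self.mem_total == 0:
            return 0.0
        return (self.mem_total - self.mem_free) / self.mem_total


def encode_report(status: ResourceStatus) -> str:
    """Serialize a report for sending to the server (JSON is an assumption)."""
    return json.dumps(asdict(status))


def decode_report(text: str) -> ResourceStatus:
    """Rebuild a report on the server side."""
    return ResourceStatus(**json.loads(text))
```

Keeping the raw totals (rather than only ratios) in the report lets the server derive whatever summary a given test needs.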
In this embodiment, in step 3), the server issues the test program to the specified client, and the latest test script can be quickly sent to each client, so that the workload of manually deploying the automatic test script is reduced, and the deployment of a new test item is quickly realized.
In this embodiment, the test program in step 3) is one or more of an exe executable file, a python executable file, a shell executable file, a cmd executable file and an execution script program generated from a recorded manual operation sequence. This reduces the workload of subsequent testers and ensures that every test run takes the same time and proceeds at the same operating speed, removing test-to-test differences and giving the performance results a comparable basis. The operations executed by the test program in step 3) include one or more of window operation, drawing program operation, picture browsing operation, video playing operation, browser operation, resource browser operation, office software operation, maximum compute-load stress test operation, maximum disk read-write stress test operation and maximum network bandwidth stress test operation. In this embodiment, the operations executed by the test program differ from ordinary stress tests and random tests: the aim is to reproduce a normal office workflow. The test program can be an imported existing executable file (an exe executable file, a python executable file, a shell executable file, a cmd executable file or an execution script program generated from a recorded manual operation sequence), or an execution script program generated from a manual operation sequence recorded on site, which imitates the operations a person commonly performs on a desktop and faithfully reproduces manual operation. For example, it can simulate the office stress that cloud desktop users place on office desktops: playing videos, paging through PPT, Word and Excel documents, browsing the Taobao home page and paging in a browser, viewing, enlarging and reducing images, dragging windows and the like.
Because many kinds of executable files are supported, Windows and Linux operating systems can be tested comprehensively. The operations executed by the test program include the maximum compute-load stress test, the maximum disk read-write stress test and the maximum network bandwidth stress test, so the upper limit supported under maximum office demand can be tested; testing the upper limit supported by the virtual machine system through common, routine office operations makes that limit more accurate, and simulating real manual operation reproduces the computing and concurrent network pressure of normal computer use. Traditional functional testing cannot guarantee that the same person tests the same way every time, nor that different testers follow the same test flow, so the operation intervals and operation steps differ from person to person. For example, if tester A operates Word and records the consumed hardware performance as report A1, then operates Word again and records report A2, A1 and A2 may differ. If tester B tests Word and records the consumed hardware performance as B1, B1 may differ from the A1 recorded by tester A. If A1, A2 and B1 cannot be guaranteed to be the same, test reports cannot be used to verify that performance has improved after a hardware upgrade or software performance optimization. The test program of this embodiment may be an execution script program generated from a recorded manual operation sequence, so that a 1:1 recording of a tester's operations serves as the template for performance tests of hardware and software across multiple iterations. As an optional implementation, the test programs in this embodiment are shown in table 1.
Table 1: example of a test procedure.
[Table 1 appears only as images in the original publication: Figures BDA0003295545900000061, BDA0003295545900000071, BDA0003295545900000081 and BDA0003295545900000091.]
When a test task needs a new test program that simulates real user operation, the test program can be issued to the designated clients through the server so that the clients obtain it. For example, when a new test requirement calls for a new script, an execution script program generated from a recorded manual operation sequence can be added as a custom script: clicking on the server starts the recording function, all keyboard and mouse operations are recorded into a script, and the script is added directly to the script management page and made available to custom tasks. The server then issues the recorded script to each client, so that test scripts are updated quickly. It should be noted that simulating real user operation is the preferred implementation of the test program in this embodiment, but the method does not depend on it; the test program may also be an ordinary stress test or another test operation. Traditional stress tests (heavy CPU computation, high network occupancy, high disk I/O read-write pressure and the like) differ from the way users normally use a desktop: normal desktop use fluctuates, with peaks and valleys, so the stress tests on the market do not simulate real desktop usage habits, and tests that do not resemble human operation have little reference value. Moreover, the performance of the CPU, memory, disk and so on is stated in the specification of the hardware product and does not need to be verified by software; even if it were verified, the result would mean little, because performance must serve actual use.
The test in this embodiment aims to simulate real human use in order to test the upper limit the system can support in actual business, including maximum concurrency: for example, the maximum number of people who can simultaneously play PPT, write Word documents, or use a browser, Excel and so on.
In this embodiment, the attributes of the test task based on the test program in step 3) include a test task name, a selected test program, an execution mode of the test program, and execution parameters, where the execution mode is either immediate execution or timed execution, and the execution parameters include the number of repeated executions, the execution interval and the start time. For example, because some tests need to run when the access volume is low, the start time of a newly created test task can be set to 0:00 (midnight) before the task is released; each client then checks whether the time has reached 0:00 and, once it has, starts executing the task released by the server. The server records the resource occupation states reported by the clients from 0:00 onward, as well as the resource occupation state of the physical server/physical client, so that they can be reviewed on the server the next day.
In this embodiment, in step 3), the server issues a test task based on the test program to the designated clients. With the custom task as the test unit, the multithreading of the custom script, the execution time of the task, the number of repetitions, the interval between tests and the choice of test program can all be customized, and the resource occupation states of each client and of the physical server/physical client are acquired synchronously while each client executes the test task, so that the recorded performance changes are all caused by the execution of the task. The clients are managed through cluster-wide concurrent testing and are directed to execute custom tasks for large-scale performance stress testing. A large number of virtual machines acting as clients execute a given operation at the same time in order to determine the maximum support boundary the cluster can reach for that operation, and the relative performance of different systems can be compared under the same amount of custom automated test operations.
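The idea of replaying a recorded manual operation sequence with identical timing on every run can be sketched as follows. This is an illustration only: the event fields and the `replay` helper are hypothetical, and actually injecting keyboard and mouse input is left to the caller-supplied `perform` function.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class RecordedEvent:
    """One recorded keyboard or mouse action (layout is an assumption)."""
    offset_s: float  # seconds since the recording started
    action: str      # e.g. "mouse_click" or "key_press"
    detail: str      # e.g. coordinates or key name


def replay(events: Iterable[RecordedEvent],
           perform: Callable[[RecordedEvent], None],
           sleep: Callable[[float], None] = time.sleep) -> None:
    """Replay events in recorded order, waiting the recorded gap between them."""
    elapsed = 0.0
    for event in sorted(events, key=lambda e: e.offset_s):
        wait = event.offset_s - elapsed
        if wait > 0:
            sleep(wait)  # reproduce the original pause before this action
        elapsed = event.offset_s
        perform(event)
```

Because the gaps come from the recording rather than from a human operator, two runs of the same script issue the same actions at the same relative times, which is what makes their resource measurements comparable.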
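The task attributes listed above map naturally onto a small record type. The sketch below is an assumption-laden illustration: the names `TaskSpec`, `is_due` and `planned_starts` are hypothetical, and the interval is treated as start-to-start spacing, which the source does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TaskSpec:
    """Attributes of a test task (field names are hypothetical)."""
    name: str
    program: str                           # selected test program
    mode: str = "immediate"                # "immediate" or "timed"
    repeat_count: int = 1                  # number of repeated executions
    interval_s: float = 0.0                # execution interval, seconds
    start_time: Optional[datetime] = None  # used only in timed mode


def is_due(task: TaskSpec, now: datetime) -> bool:
    """Client-side check: should the task start executing now?"""
    if task.mode == "immediate":
        return True
    return task.start_time is not None and now >= task.start_time


def planned_starts(task: TaskSpec, now: datetime) -> List[datetime]:
    """Start time of each repetition, assuming start-to-start intervals."""
    first = task.start_time if task.mode == "timed" and task.start_time else now
    return [first + timedelta(seconds=i * task.interval_s)
            for i in range(task.repeat_count)]
```

For the midnight example in the text, a client would poll `is_due` with the current time and begin executing once it returns true.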
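The source does not say how the maximum support boundary is computed from the recorded data. One simple possibility, shown purely as an assumption, is to run the same operation at increasing client counts and take the largest count whose peak host CPU usage stays within a chosen limit; the function name, the CPU-only criterion and the 90% default are all hypothetical.

```python
from typing import Dict


def max_supported_clients(peak_cpu_by_count: Dict[int, float],
                          cpu_limit: float = 90.0) -> int:
    """Largest tested client count before peak host CPU first exceeds the limit."""
    supported = 0
    for count in sorted(peak_cpu_by_count):
        if peak_cpu_by_count[count] > cpu_limit:
            break  # the boundary has been crossed; larger counts are not considered
        supported = count
    return supported


# Hypothetical measurements: concurrent clients -> peak host CPU percent.
example = {10: 35.0, 50: 70.0, 100: 88.0, 150: 97.0}
```

In practice memory, disk I/O and network I/O would be checked the same way, and the boundary would be the smallest of the per-resource limits.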
In this embodiment, the step 3) of synchronously acquiring resource occupation states of each client and the physical server/physical client during the period in which each client executes the test task includes: the method comprises the steps that when a test task based on a test program is issued to a designated client through a server, resource occupation states on a physical server/a physical client are simultaneously obtained through an SSH protocol respectively, and the resource occupation states of the client are recorded; after receiving a notification of completion of a test task sent by any client, ending recording of the resource occupation state of the client, thereby obtaining the resource occupation state of the client during execution of the test task; after receiving the notification of completion of the test tasks sent by all the clients, ending recording the resource occupation state on the physical server/physical client, thereby obtaining the resource occupation state of the physical server/physical client during the test tasks executed by each client; the resource occupation state comprises at least one of CPU, memory, disk I/O and network I/O state data, the CPU state data refers to the CPU occupation ratio, the memory state data comprises the size of the total memory and the space memory, the disk I/O state data comprises the number of read bytes and write bytes, and the network I/O state data comprises the number of sent bytes and received bytes. When the resource occupation state on the physical server/physical client is obtained through the SSH protocol, the server can automatically input the user name and login password of the linux server to be monitored and the monitoring terminal command of various data to be monitored, so that the resource occupation states of the clients and the physical server/physical client during the test task execution of the clients can be synchronously obtained. 
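The start and stop rules for recording described above amount to a small piece of bookkeeping, sketched below. The class and method names are hypothetical, and the SSH sampling itself is out of scope here; the sketch only shows when samples are accepted.

```python
from typing import Any, Dict, Iterable, List


class RecordingSession:
    """Tracks which samples belong to the test window (illustrative only)."""

    def __init__(self, client_ids: Iterable[str]) -> None:
        ids = list(client_ids)
        self.pending = set(ids)  # clients that have not reported completion
        self.client_samples: Dict[str, List[Any]] = {cid: [] for cid in ids}
        self.host_samples: List[Any] = []
        self.host_recording = bool(ids)  # recording starts at task dispatch

    def add_client_sample(self, client_id: str, sample: Any) -> None:
        """Keep a client sample only while that client is still executing."""
        if client_id in self.pending:
            self.client_samples[client_id].append(sample)

    def add_host_sample(self, sample: Any) -> None:
        """Keep a physical server/client sample until every client is done."""
        if self.host_recording:
            self.host_samples.append(sample)

    def mark_finished(self, client_id: str) -> None:
        """Handle a completion notification from one client."""
        self.pending.discard(client_id)
        if not self.pending:
            self.host_recording = False
```

Each client's window closes on its own completion notification, while the host window closes only after the last one, matching the two stop conditions in the text.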
In this embodiment, SSH remote control is the basis for monitoring the physical server/physical client. When a task is executed and the physical server/physical client is monitored, which resource occupation states of the system are checked can be freely customized according to the specific requirements of each test, and the monitored data are recorded. Once the server is connected to the physical server/physical client through the SSH protocol, then as soon as a task sent to the clients starts, whether it is executed immediately or at a scheduled time, the information of the physical server/physical client specified by the monitoring settings is recorded, and Excel files and line graphs are generated. This makes it convenient to carry out performance analysis on large batches of tests afterwards, and the relative merits of different versions, different systems, and different hardware can be obtained from the results of continued analysis.
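As an illustration of the recording step, the sketch below serializes monitoring samples into a table that a spreadsheet program can open. It is only a stand-in under stated assumptions: it writes CSV text rather than a native Excel file, and the column names in `FIELDS` and the function name `samples_to_csv` are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable

# Assumed column layout, one column per resource occupation value.
FIELDS = [
    "timestamp", "cpu_percent", "mem_total", "mem_free",
    "disk_read_bytes", "disk_write_bytes", "net_sent_bytes", "net_recv_bytes",
]


def samples_to_csv(samples: Iterable[Dict[str, float]]) -> str:
    """Serialize monitoring samples to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for sample in samples:
        writer.writerow(sample)   # values missing from a sample are left empty
    return buf.getvalue()
```

A line graph can then be drawn from any column against `timestamp`.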
In this embodiment, synchronously acquiring in step 3) the resource occupation states of the clients and of the physical server/physical client while the clients execute the test tasks removes the influence of human factors caused by human participation and the errors caused by recording performance manually. For example, when a PPT is played in a virtual machine and the resulting performance change of the virtualization server is to be monitored, that is, the server resources consumed by the virtual machine playing the PPT, the normal manual monitoring flow, as shown in fig. 4, includes: step one: open server memory monitoring: in an SSH tool, monitor performance with a script command, as shown in fig. 5, the command being "free -s 3 | grep Mem"; step two: open server CPU monitoring: in an SSH tool, monitor performance with a script command, as shown in fig. 6, the command being "top | grep qemu"; step three: open server disk monitoring: in an SSH tool, monitor performance with a script command; step four: open server network monitoring: in an SSH tool, monitor performance with a script command, as shown in fig. 7, the command being "iftop -n -t | grep cumulative"; step five: play the PPT inside the virtual machine; step six: close the PPT inside the virtual machine; step seven: close server network monitoring: enter the SSH tool and terminate monitoring with Ctrl+C; step eight: close server disk monitoring: enter the SSH tool and terminate monitoring with Ctrl+C; step nine: close server CPU monitoring: enter the SSH tool and terminate monitoring with Ctrl+C; step ten: close server memory monitoring: enter the SSH tool and terminate monitoring with Ctrl+C.
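The memory command in step one (the output of `free` filtered to its `Mem` line) yields one line of whitespace-separated counters per sample. A minimal parsing sketch follows; it assumes the common procps layout in which the first three numeric columns are total, used, and free, and the function name `parse_free_mem_line` is hypothetical.

```python
from typing import Dict


def parse_free_mem_line(line: str) -> Dict[str, int]:
    """Extract total/used/free from a 'Mem:' line as printed by free.

    Assumes the column order 'Mem: total used free ...'; other builds of
    free may order or name the columns differently.
    """
    fields = line.split()
    if len(fields) < 4 or fields[0] != "Mem:":
        raise ValueError(f"not a 'Mem:' line: {line!r}")
    return {
        "total": int(fields[1]),
        "used": int(fields[2]),
        "free": int(fields[3]),
    }
```

The `total` and `free` values correspond to the total memory size and free memory size named in the memory state data.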
The manual procedure above introduces timing errors at several points: monitoring has started but the PPT has not yet begun playing, or the PPT has been closed but monitoring has not yet stopped. The error is largest for memory monitoring, since it is opened first and closed last. As shown in fig. 8, synchronously acquiring the resource occupation states of the clients and of the physical server/physical client while the clients execute the test tasks includes: step one: at the moment the test starts, simultaneously start memory monitoring, CPU monitoring, disk monitoring, network monitoring, and PPT playback inside the virtual machine; step two: at the moment the test ends, simultaneously stop memory monitoring, CPU monitoring, disk monitoring, and network monitoring, and close the PPT played in the virtual machine. Through these steps, the influence of human factors caused by human participation and the errors caused by recording performance manually can be removed.
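The two synchronized steps can be sketched with a shared start signal and a shared stop signal, so that every monitor begins and ends together with the workload. This is a sketch under assumptions, not the embodiment's implementation: the function `run_synchronized`, the callable-based monitor interface, and the sampling period are all hypothetical, and the monitors here run as local threads rather than over SSH.

```python
import threading
import time
from typing import Callable, Dict, List


def run_synchronized(monitors: Dict[str, Callable[[], float]],
                     workload: Callable[[], None],
                     period_s: float = 0.01) -> Dict[str, List[float]]:
    """Start all monitors and the workload together, then stop them together."""
    start = threading.Event()
    stop = threading.Event()
    results: Dict[str, List[float]] = {name: [] for name in monitors}

    def sample_loop(name: str, read: Callable[[], float]) -> None:
        start.wait()                      # every monitor waits for the same signal
        while True:
            results[name].append(read())  # take at least one sample
            if stop.wait(period_s):       # returns True once the stop signal is set
                break

    threads = [threading.Thread(target=sample_loop, args=(n, r))
               for n, r in monitors.items()]
    for t in threads:
        t.start()
    start.set()                           # step one: everything starts at once
    try:
        workload()                        # e.g. play the PPT in the virtual machine
    finally:
        stop.set()                        # step two: everything stops at once
        for t in threads:
            t.join()
    return results
```

Because no monitor is opened or closed by hand, the recorded window matches the workload window up to thread scheduling delay.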
In this embodiment, after step 2) and before step 3), the method further includes a step of updating the clients in batches: the server issues the distribution address of the new version of the test software to each client, and each client obtains the new version of the test software from the received distribution address and completes the update and upgrade of its local test software. The test software can thus be updated quickly, and the workload of iterative software updates is reduced. If the test software is updated during testing, testers do not need to replace and upgrade the test software manually on the cloud nodes one by one; the test software only needs to be placed at a unified address and the server only needs to send an update command for all clients to update automatically. For example, suppose a cluster has 300 computing nodes, each with version 0.5 of the KylinTest test system installed. To update to version 0.6, the version 0.6 KylinTest file is placed on an HTTP file server, its network address is entered as the update address on the server, and the update button is clicked. The distribution address of KylinTest version 0.6 is then issued to each client, and each client obtains KylinTest version 0.6 from the received distribution address and completes the update and upgrade of its local KylinTest.
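The server side of the batch update can be sketched as building one update command and deciding which clients should receive it. The embodiment only states that the distribution address is issued to each client; the version comparison, the message fields, and the function names below are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '0.6' into a comparable tuple (0, 6)."""
    return tuple(int(part) for part in text.split("."))


def clients_to_update(installed: Dict[str, str], new_version: str) -> List[str]:
    """Return the ids of clients whose installed version is older than new_version."""
    target = parse_version(new_version)
    return sorted(cid for cid, version in installed.items()
                  if parse_version(version) < target)


def build_update_command(distribution_url: str, new_version: str) -> Dict[str, str]:
    """Hypothetical message the server would send to each selected client."""
    return {"type": "update", "version": new_version, "url": distribution_url}
```

Comparing versions as integer tuples avoids the string-comparison trap in which "0.10" would sort before "0.9".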
In summary, there is at present no system tool that can monitor the background performance of cloud servers, monitor the performance of cloud computing nodes, and issue pressure test tasks uniformly and in batches to carry out cloud computing performance boundary pressure tests. The custom scripts of this embodiment simulate a person using a desktop, for example using Word, Excel, and PPT, viewing pictures, browsing web pages, and dragging windows, rather than performing simple I/O read-write pressure tests and computation tests. They can therefore simulate how an ordinary cloud desktop user uses the cloud desktop and yield a realistic upper limit on the load the cloud server can bear. Different software uses different image compression algorithms, so the amount of traffic the compressed data generates on the network differs. The method can simulate the average traffic consumed by different software in use and calculate the amount of, and differences in, network bandwidth consumed by different software performing similar operations. For example, with Word 2007/2010/2013, Excel 2007/2010/2013, PPT 2007/2010/2013, WPS 11, and the like, the compression algorithms applied when handling the same file differ, so the network bandwidth consumed for browsing the same file and performing the same operation differs. This identifies the software that most needs optimization when optimizing the protocol algorithms of the various remote cloud desktops.
The method of this embodiment can also monitor the instantaneous network traffic of a single piece of software while it is in use. By matching the network traffic consumed against the points in time at which operations are performed on the software, the network traffic that a specific operation consumes can be obtained, for example maximizing and minimizing Excel, inserting a picture in Word, turning a page in PPT, or dragging a picture in PPT. The network traffic consumed by such an operation reaches an accurately measurable peak, so the image algorithm for that operation can be optimized in a targeted manner during protocol optimization, which indicates where protocol optimization should focus. The method can evaluate, unattended, the cloud service performance, the cloud virtual machine performance, and the performance consumed by particular software operations in the virtual operating system. It can run scheduled test tasks unattended at night, simulate large-scale office usage scenarios, and automatically measure the various performance indicators, so that the high availability of the system can be verified before a real cloud service system formally goes online.
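Attributing traffic to individual operations amounts to looking up, for each operation's time window, the traffic samples that fall inside it. The following is a minimal sketch; the function name `peak_traffic_per_operation`, the tuple layouts, and the idea of logging an explicit start and end time per operation are assumptions for illustration.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]            # (timestamp, bytes per second)
Operation = Tuple[str, float, float]    # (name, start time, end time)


def peak_traffic_per_operation(samples: List[Sample],
                               operations: List[Operation]) -> Dict[str, float]:
    """Return the peak traffic rate observed inside each operation's window."""
    peaks: Dict[str, float] = {}
    for name, start, end in operations:
        in_window = [rate for ts, rate in samples if start <= ts <= end]
        if in_window:                   # skip operations with no samples
            peaks[name] = max(in_window)
    return peaks
```

The operation with the highest peak is then a natural first candidate for protocol optimization.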
In addition, the method of this embodiment can provide a virtualization performance consumption test function. KVM, virt-manager, the Sangfor virtualization platform, the Kylin KSVD virtualization platform, and OpenStack are all virtualization frameworks that virtualize hardware devices into multiple software devices for use by virtualized systems and virtualized applications. The virtualization process itself consumes part of the performance, and how much it consumes becomes especially significant on a large-scale virtualized cloud platform. On a common virtualization platform it is difficult to test how much performance virtualization itself consumes. On a kvm-qemu platform, for example, a virtual machine appears in the host system only as a single qemu process, and the CPU, memory, disk I/O, and so on occupied by that qemu process are the sum of the consumption inside the virtual machine (including the virtualized system, such as Windows or Linux) and the performance consumed by the virtualization process of the platform. The performance lost to the virtualization step is therefore difficult to measure and has become a testing difficulty in the industry. Moreover, in a large-scale cluster, KVM, OpenStack, the Sangfor platform, and the Kylin KSVD platform all use CPU functions such as NUMA and dynamically allocate the computing resources of idle virtual machines to virtual machines with heavy computing loads, which makes the peak performance of a virtualization cluster platform even harder to measure. To address these two points, the method of this embodiment makes all virtual machines in the cluster run the same application operation and the same computation, and occupy the same computing resources, at the same time, so that the cloud server has no idle virtual machine from which computing resources can be reallocated.
In this way, the concurrency upper limit that the cluster can support for a given application operation can be calculated. In addition, subtracting the total performance consumption reported from inside the virtual machines from the physical performance consumption of the physical server cluster gives the performance consumed by the cluster's hardware virtualization software. (This value is a key reference for comparing the products of virtualization vendors: the lower the value, the better the virtualization algorithm.) As shown in fig. 9, the personal computer collects the performance of the physical servers and the performance of the virtual machines, and the hardware virtualization performance consumption is the sum of the physical server performance minus the sum of the virtual machine performance.
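The subtraction described above can be written out directly. This is a small sketch that assumes all readings are expressed in the same unit for the same resource over the same test window; the function name `virtualization_overhead` is hypothetical.

```python
from typing import Iterable


def virtualization_overhead(host_usage: Iterable[float],
                            vm_usage: Iterable[float]) -> float:
    """Total physical server consumption minus total in-VM consumption.

    host_usage: one reading per physical server in the cluster.
    vm_usage:   one reading per virtual machine, as reported from inside it.
    """
    return sum(host_usage) - sum(vm_usage)
```

For example, if two physical servers report 60 and 55 units while ten virtual machines each report 10 units, the virtualization layer accounts for the remaining 15 units.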
In addition, the present embodiment further provides a cloud service-based large-scale cluster performance automated testing system, which includes a microprocessor and a memory connected to each other, where the microprocessor is programmed or configured to execute the steps of the cloud service-based large-scale cluster performance automated testing method.
In addition, the present embodiment also provides a computer-readable storage medium, in which a computer program programmed or configured to execute the foregoing cloud service-based large-scale cluster performance automation test method is stored.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. 
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims (10)

1. A large-scale cluster performance automatic test method based on cloud service is characterized by comprising the following steps:
1) preparing an operating system provided with test software, and generating computing nodes in batches on a physical server based on the operating system provided with the test software;
2) designating test software of one computing node as a server and test software of the other computing nodes as clients in the generated computing nodes;
3) issuing a test program and a test task based on the test program to a specified client through the server, and synchronously acquiring the resource occupation states of each client and of a physical server/physical client during the execution of the test task by each client.
2. The automated test method for the performance of the large-scale cluster based on the cloud service according to claim 1, wherein the step 1) comprises: 1.1) preparing an operating system with test software; 1.2) creating a source virtual machine on a physical server, and installing an operating system of test software in the created source virtual machine or importing the virtual machine which is installed with the operating system of the test software; 1.3) copying a batch virtual machine copy on the basis of a source virtual machine by using a batch release function of a virtualization system on a physical server to obtain batch computing nodes of an operating system provided with test software.
3. The automated testing method for the performance of the large-scale cluster based on the cloud service according to claim 1, wherein the step 2) of designating the testing software of one computing node as a server and the testing software of the other computing nodes as clients in the generated computing nodes means that: the method comprises the steps that a designated computing node is remotely logged in on a physical client, and test software on the designated computing node triggers the test software of the computing node to send broadcast messages to other computing nodes through designated operation, so that the test software of the computing node serves as a server side, and the test software of the other computing nodes serves as client sides.
4. The cloud service-based large-scale cluster performance automated testing method according to claim 1, further comprising a step of, after step 2) and before step 3), each client periodically sending identity information, an IP address, and a resource occupation state of each client to the server, wherein the resource occupation state includes at least one of CPU, memory, disk I/O, and network I/O state data, the CPU state data refers to a CPU occupation ratio, the memory state data includes a total memory size and a free memory size, the disk I/O state data includes read bytes and write bytes, and the network I/O state data includes send bytes and receive bytes.
5. The automated test method for the performance of the large-scale cluster based on the cloud service as claimed in claim 1, wherein the test program in step 3) is one or more of an exe executable file, a python executable file, a shell executable file, a cmd executable file and a recorded execution script program generated by a manual operation sequence; the operation executed by the test program in the step 3) comprises one or more of window operation, drawing program operation, picture browsing operation, video playing operation, browser operation, resource browser operation, office software operation, maximum calculation amount pressure test operation, maximum disk read-write pressure test operation and maximum network bandwidth pressure test operation.
6. The cloud-based large-scale cluster performance automated testing method according to claim 1, wherein the attributes of the test tasks based on the test programs in step 3) include test task names, selected test programs, execution modes of the test programs and execution parameters, the execution modes include an immediate execution mode and a timed execution mode, and the execution parameters include the number of repeated executions, execution interval time and start execution time.
7. The automated testing method for the performance of the large-scale cluster based on the cloud service as claimed in claim 1, wherein the step 3) of synchronously acquiring the resource occupation status of each client and the physical server/physical client during the test task executed by each client comprises: when a test task based on a test program is issued to a designated client through a server, simultaneously obtaining the resource occupation states on a physical server/a physical client through an SSH protocol respectively, and recording the resource occupation states of the client; after receiving a notification of completion of a test task sent by any client, ending recording of the resource occupation state of the client, thereby obtaining the resource occupation state of the client during execution of the test task; after receiving the notification of completion of the test tasks sent by all the clients, ending recording the resource occupation state on the physical server/physical client, thereby obtaining the resource occupation state of the physical server/physical client during the test tasks executed by each client; the resource occupation state comprises at least one of CPU, memory, disk I/O and network I/O state data, the CPU state data refers to the CPU occupation ratio, the memory state data comprises the size of the total memory and the free memory, the disk I/O state data comprises the number of read bytes and write bytes, and the network I/O state data comprises the number of sent bytes and received bytes.
8. The automated testing method for the performance of the large-scale cluster based on the cloud service according to claim 1, wherein the method further comprises a step of updating the clients in batches after the step 2) and before the step 3): and issuing the distribution address of the new version of the test software to each client at the server, and acquiring the new version of the test software and completing the updating and upgrading of the local test software by each client based on the received distribution address.
9. A cloud service-based large-scale cluster performance automated testing system, comprising a microprocessor and a memory which are connected with each other, characterized in that the microprocessor is programmed or configured to execute the steps of the cloud service-based large-scale cluster performance automated testing method according to any one of claims 1 to 8.
10. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, the computer program being programmed or configured to execute the automated testing method for cloud service-based large-scale cluster performance according to any one of claims 1 to 8.
CN202111176883.8A 2021-10-09 2021-10-09 Automatic test method and system for large-scale cluster performance based on cloud service Pending CN113986719A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111176883.8A CN113986719A (en) 2021-10-09 2021-10-09 Automatic test method and system for large-scale cluster performance based on cloud service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111176883.8A CN113986719A (en) 2021-10-09 2021-10-09 Automatic test method and system for large-scale cluster performance based on cloud service

Publications (1)

Publication Number Publication Date
CN113986719A true CN113986719A (en) 2022-01-28

Family

ID=79737909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111176883.8A Pending CN113986719A (en) 2021-10-09 2021-10-09 Automatic test method and system for large-scale cluster performance based on cloud service

Country Status (1)

Country Link
CN (1) CN113986719A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115981937A (en) * 2022-12-23 2023-04-18 深圳市章江科技有限公司 Memory automatic testing method and system based on hybrid cloud
CN116860643A (en) * 2023-07-17 2023-10-10 广东保伦电子股份有限公司 Method for building software concurrency performance test platform
CN116860643B (en) * 2023-07-17 2024-05-14 广东保伦电子股份有限公司 Method for building software concurrency performance test platform

Similar Documents

Publication Publication Date Title
US11550630B2 (en) Monitoring and automatic scaling of data volumes
US10673981B2 (en) Workload rebalancing in heterogeneous resource environments
US8145751B2 (en) Validating software in a grid environment using ghost agents
US9471951B2 (en) Watermarking and scalability techniques for a virtual desktop planning tool
US9417895B2 (en) Concurrent execution of a first instance and a cloned instance of an application
US7945657B1 (en) System and method for emulating input/output performance of an application
US7519527B2 (en) Method for a database workload simulator
US8677324B2 (en) Evaluating performance of an application using event-driven transactions
US9679090B1 (en) Systematically exploring programs during testing
CN113986719A (en) Automatic test method and system for large-scale cluster performance based on cloud service
Zeldovich et al. Interactive Performance Measurement with VNCPlay.
US9292423B1 (en) Monitoring applications for compatibility issues
US11055568B2 (en) Method and system that measure application response time
JP2017201470A (en) Setting support program, setting support method, and setting support device
Di Sanzo et al. A flexible framework for accurate simulation of cloud in-memory data stores
CN111625407B (en) SSD performance test method and related components
CN101883019A (en) Test method for verifying video application of storage server
CA2524835C (en) Method and apparatus for a database workload simulator
Bodik Automating datacenter operations using machine learning
CN113127312B (en) Method, device, electronic equipment and storage medium for database performance test
US20050076191A1 (en) Gathering operational metrics within a grid environment using ghost agents
Lochmann et al. Reproducible load tests for android systems with trace-based benchmarks
US11520675B2 (en) Accelerated replay of computer system configuration sequences
US20220413995A1 (en) Automated mocking of computer system deployments
US20230205550A1 (en) Comparative sessions benchmarking tool for characterizing end user experience in differently configured enterprise computer systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination