CN111338868A - Method and device for judging performance of storage equipment - Google Patents

Publication number
CN111338868A
Authority
CN
China
Prior art keywords
performance
configuration parameters
decision
storage
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010109607.9A
Other languages
Chinese (zh)
Inventor
李闯
李玲侠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010109607.9A
Publication of CN111338868A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2273 Test methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2268 Logging of test results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method and a device for judging the performance of storage equipment, wherein the method comprises the following steps: performing a performance test based on different configuration parameters on the storage device and obtaining a plurality of storage performances based on different configuration parameters as a plurality of test results; combining different configuration parameters and a plurality of test results into a plurality of samples in a one-to-one correspondence manner and establishing a sample training set; constructing a plurality of decision trees by using the sample training set to form a decision forest; and respectively processing the specific configuration parameters by using a plurality of decision trees in the decision forest and judging the storage performance of the storage equipment under the specific configuration parameters by combining a plurality of processing results. The invention can automatically judge the performance of the storage equipment under different configurations, and has strong operability and convenient implementation.

Description

Method and device for judging performance of storage equipment
Technical Field
The present invention relates to the field of storage technologies, and in particular, to a method and an apparatus for determining performance of a storage device.
Background
With the rapid development of scientific computing and network applications, the amount of information generated keeps growing and data storage receives increasing attention. Storage components therefore occupy an increasingly important position in the overall computer system, and storage has moved from single disks and single tapes to disk arrays and on to the storage networks that are popular today. Demand from large-scale data applications continues to emerge, and mass data and its applications have become a new development direction. Data storage now strongly influences people's work and life, so improving the performance of the storage devices in use naturally receives more and more attention.
With the hardware configuration unchanged, the input/output performance of a single storage device differs greatly across different management software layer configurations. How to judge the performance of a configured storage device in a fixed hardware environment is a problem that customers focus on when using the storage device, and it is also the goal of performance testing of the storage device.
The key performance indicator for a storage device is IOPS (I/O operations per second), the maximum number of input/output (I/O) operations per second. With the hardware unchanged, the performance value of the storage device depends on parameter configurations such as the number of selected links, the RAID (redundant array of independent disks) level, the number of disks in the RAID, the number of LUNs (logical unit numbers) created, and the concurrency number. When optimizing storage performance, every parameter adjustment affects performance. In general, the optimal storage performance configuration can only be selected through repeated debugging combined with experience; there is no direct theoretical basis or scheme to refer to, and the determination is difficult to make simply.
For the problems in the prior art that performance testing of storage devices is complex, tedious and labor-intensive, no effective solution is currently available.
Disclosure of Invention
In view of this, an object of the embodiments of the present invention is to provide a method and an apparatus for judging performance of a storage device, which can automatically judge performance of the storage device under different configurations, and are strong in operability and convenient to implement.
In view of the foregoing, a first aspect of the embodiments of the present invention provides a method for determining performance of a storage device, including the following steps:
performing a performance test based on different configuration parameters on the storage device and obtaining a plurality of storage performances based on different configuration parameters as a plurality of test results;
combining different configuration parameters and a plurality of test results into a plurality of samples in a one-to-one correspondence manner and establishing a sample training set;
constructing a plurality of decision trees by using the sample training set to form a decision forest;
and respectively processing the specific configuration parameters by using a plurality of decision trees in the decision forest and judging the storage performance of the storage equipment under the specific configuration parameters by combining a plurality of processing results.
In some embodiments, the configuration parameters include at least one of: array level, number of disks, number of output links, number of logical unit numbers, maximum concurrency; the array level includes one of: RAID0, RAID10, RAID5, RAID6; the number of disks is 3 to 5; the number of output links is 1 to 4; the number of logical unit numbers is 1 to 3; the maximum concurrency is 1 to 8; the storage performance includes a maximum number of inputs and outputs per second.
In some embodiments, constructing a plurality of decision trees using the sample training set to form a decision forest comprises:
randomly extracting a plurality of samples from a sample training set;
forming a decision tree using the plurality of samples;
putting the plurality of samples back into the sample training set;
and repeating the steps to obtain a plurality of decision trees to form a decision forest.
In some embodiments, the storage performance includes a performance range determined based on the maximum number of inputs and outputs per second; the performance ranges comprise a low performance range, a medium performance range and a high performance range, divided in ascending order of the maximum number of inputs and outputs per second;
processing the specific configuration parameters with each of the plurality of decision trees in the decision forest and judging the storage performance of the storage device under the specific configuration parameters by combining the plurality of processing results comprises: processing the specific configuration parameters with each decision tree in the decision forest to obtain one performance range judgment per decision tree, and taking the performance range that receives the most judgments as the storage performance, as judged by the decision forest, of the storage device under the specific configuration parameters.
In some embodiments, processing the specific configuration parameters using a decision tree to obtain a performance range judgment includes:
determining a plurality of information gain ratios among the plurality of samples in the decision tree according to the different configuration parameters of the samples and the corresponding test results;
and multiplying the specific configuration parameters by the plurality of information gain ratios respectively and accumulating the products to obtain the performance range judgment.
A second aspect of an embodiment of the present invention provides a performance determination apparatus for a storage device, including:
a processor; and
a memory storing program code executable by the processor, the program code when executed sequentially performing the steps of:
performing a performance test based on different configuration parameters on the storage device and obtaining a plurality of storage performances based on different configuration parameters as a plurality of test results;
combining different configuration parameters and a plurality of test results into a plurality of samples in a one-to-one correspondence manner and establishing a sample training set;
constructing a plurality of decision trees by using the sample training set to form a decision forest;
and respectively processing the specific configuration parameters by using a plurality of decision trees in the decision forest and judging the storage performance of the storage equipment under the specific configuration parameters by combining a plurality of processing results.
In some embodiments, the configuration parameters include at least one of: array level, number of disks, number of output links, number of logical unit numbers, maximum concurrency; the array level includes one of: RAID0, RAID10, RAID5, RAID6; the number of disks is 3 to 5; the number of output links is 1 to 4; the number of logical unit numbers is 1 to 3; the maximum concurrency is 1 to 8; the storage performance includes a maximum number of inputs and outputs per second.
In some embodiments, constructing a plurality of decision trees using the sample training set to form a decision forest comprises:
randomly extracting a plurality of samples from a sample training set;
forming a decision tree using the plurality of samples;
putting the plurality of samples back into the sample training set;
and repeating the steps to obtain a plurality of decision trees to form a decision forest.
In some embodiments, the storage performance includes a performance range determined based on the maximum number of inputs and outputs per second; the performance ranges comprise a low performance range, a medium performance range and a high performance range, divided in ascending order of the maximum number of inputs and outputs per second;
processing the specific configuration parameters with each of the plurality of decision trees in the decision forest and judging the storage performance of the storage device under the specific configuration parameters by combining the plurality of processing results comprises: processing the specific configuration parameters with each decision tree in the decision forest to obtain one performance range judgment per decision tree, and taking the performance range that receives the most judgments as the storage performance, as judged by the decision forest, of the storage device under the specific configuration parameters.
In some embodiments, processing the specific configuration parameters using a decision tree to obtain a performance range judgment includes:
determining a plurality of information gain ratios among the plurality of samples in the decision tree according to the different configuration parameters of the samples and the corresponding test results;
and multiplying the specific configuration parameters by the plurality of information gain ratios respectively and accumulating the products to obtain the performance range judgment.
The invention has the following beneficial technical effects: the method and device for judging the performance of a storage device provided by the embodiments of the invention perform performance tests based on different configuration parameters on the storage device and obtain a plurality of storage performances based on the different configuration parameters as a plurality of test results; combine the different configuration parameters and the plurality of test results into a plurality of samples in one-to-one correspondence and establish a sample training set; construct a plurality of decision trees using the sample training set to form a decision forest; and process the specific configuration parameters with each of the decision trees in the decision forest and judge the storage performance of the storage device under the specific configuration parameters by combining the processing results. This technical scheme can automatically judge the performance of the storage device under different configurations, is highly operable and is convenient to implement.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a method for determining performance of a storage device according to the present invention;
FIG. 2 is a schematic diagram of decision tree establishment in the method for determining the performance of a storage device according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two entities or parameters that share the same name but are not identical. "First" and "second" are used merely for convenience of description and should not be construed as limiting the embodiments of the present invention; this is not repeated in the following embodiments.
In view of the above, a first aspect of the embodiments of the present invention proposes an embodiment of a method capable of automatically determining the performance of a storage device in different configurations. Fig. 1 is a schematic flow chart illustrating a method for determining performance of a storage device according to the present invention.
The performance determination method of the storage device, as shown in fig. 1, includes the following steps:
step S101: performing a performance test based on different configuration parameters on the storage device and obtaining a plurality of storage performances based on different configuration parameters as a plurality of test results;
step S103: combining different configuration parameters and a plurality of test results into a plurality of samples in a one-to-one correspondence manner and establishing a sample training set;
step S105: constructing a plurality of decision trees by using the sample training set to form a decision forest;
step S107: and respectively processing the specific configuration parameters by using a plurality of decision trees in the decision forest and judging the storage performance of the storage equipment under the specific configuration parameters by combining a plurality of processing results.
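As a minimal sketch of step S103, the one-to-one pairing of configuration parameters with test results can be written as follows. The type names `Config` and `Sample`, the helper `build_training_set`, and the IOPS figures are hypothetical and serve only as an illustration; they do not come from the patent.

```python
from typing import NamedTuple


class Config(NamedTuple):
    """One storage configuration (field names are illustrative)."""
    raid_level: str
    disks: int
    links: int
    luns: int
    concurrency: int


class Sample(NamedTuple):
    """A configuration paired with the IOPS measured under it."""
    config: Config
    iops: float


def build_training_set(configs, measured_iops):
    """Pair each tested configuration with its test result, one-to-one."""
    if len(configs) != len(measured_iops):
        raise ValueError("each configuration needs exactly one test result")
    return [Sample(c, r) for c, r in zip(configs, measured_iops)]


# Hypothetical measurements, for illustration only.
configs = [Config("RAID5", 3, 1, 1, 1), Config("RAID6", 5, 4, 3, 8)]
results = [850.0, 12500.0]
training_set = build_training_set(configs, results)
```

Rejecting mismatched lengths keeps the correspondence strictly one-to-one, which is what the step requires.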
The random forest machine learning algorithm used in the invention applies the bootstrap resampling technique: n samples are repeatedly and randomly drawn, with replacement, from the original training sample set N to generate a new training sample set, which is used to train a decision tree. Repeating these steps generates m decision trees that form a random forest, and the classification of new data is decided by the number of votes cast by the classification trees. In essence the method is an improvement on the decision tree algorithm: multiple decision trees are combined, and each tree is built from independently drawn samples, so that better prediction fit and calculation speed than a single decision tree can be achieved and a good statistical rule can be obtained.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a Random Access Memory (RAM), or the like. Embodiments of the computer program may achieve the same or similar effects as any of the preceding method embodiments to which it corresponds.
In some embodiments, the configuration parameters include at least one of: array level, number of disks, number of output links, number of logical unit numbers, maximum concurrency; the array level includes one of: RAID0, RAID10, RAID5, RAID6; the number of disks is 3 to 5; the number of output links is 1 to 4; the number of logical unit numbers is 1 to 3; the maximum concurrency is 1 to 8; the storage performance includes a maximum number of inputs and outputs per second.
In some embodiments, constructing a plurality of decision trees using the sample training set to form a decision forest comprises:
randomly extracting a plurality of samples from a sample training set;
forming a decision tree using the plurality of samples;
putting the plurality of samples back into the sample training set;
and repeating the steps to obtain a plurality of decision trees to form a decision forest.
In some embodiments, the storage performance includes a performance range determined based on the maximum number of inputs and outputs per second; the performance ranges comprise a low performance range, a medium performance range and a high performance range, divided in ascending order of the maximum number of inputs and outputs per second;
processing the specific configuration parameters with each of the plurality of decision trees in the decision forest and judging the storage performance of the storage device under the specific configuration parameters by combining the plurality of processing results comprises: processing the specific configuration parameters with each decision tree in the decision forest to obtain one performance range judgment per decision tree, and taking the performance range that receives the most judgments as the storage performance, as judged by the decision forest, of the storage device under the specific configuration parameters.
In some embodiments, processing the specific configuration parameters using a decision tree to obtain a performance range judgment includes:
determining a plurality of information gain ratios among the plurality of samples in the decision tree according to the different configuration parameters of the samples and the corresponding test results;
and multiplying the specific configuration parameters by the plurality of information gain ratios respectively and accumulating the products to obtain the performance range judgment.
The method disclosed according to an embodiment of the present invention may also be implemented as a computer program executed by a CPU (central processing unit), and the computer program may be stored in a computer-readable storage medium. The computer program, when executed by the CPU, performs the above-described functions defined in the method disclosed in the embodiments of the present invention. The above-described method steps and system elements may also be implemented using a controller and a computer-readable storage medium for storing a computer program for causing the controller to implement the functions of the above-described steps or elements.
The following further illustrates embodiments of the invention in terms of specific examples.
First, a performance test is performed on each storage device to obtain performance data under different configurations, where the different configurations comprise:
RAID level: the RAID levels of the storage include RAID0, RAID10, RAID5, RAID6 and so on. RAID5 and RAID6 have the most practical application scenarios, so the model is first established with these two RAID levels; other levels can be added by analogy.
Number of disks in the RAID: the number of disks contained in the RAID affects performance. RAID5 requires at least 3 disks, so 3/4/5 disks can be selected in the model, and scenarios with more disks can be added as appropriate.
Number of output links of the storage: 1/2/4 output links are usually selected, and the model is built accordingly.
Number of LUNs created per RAID: the LUNs under a RAID can be divided into 1/2/3 and so on according to the scenario, and the specific number can be increased for different scenarios.
Concurrency number for the performance test: the concurrency number is set to the common values 1/2/8 and so on, and other concurrency numbers can be added according to the scenario.
The essence of the random forest algorithm is an improvement on the decision tree algorithm: multiple decision trees are combined, and each tree is built from independently drawn samples. n samples are repeatedly and randomly drawn, with replacement, from the original training sample set N to generate a new training sample set for training a decision tree, and m decision trees are then generated in this way to form a random forest; for example, n = 10 samples are drawn from the training sample set N and m = 100.
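The resampling described here can be sketched with the standard library alone. The function names, the fixed seed, and the stand-in training set of 50 integer IDs are assumptions for illustration; only the draw size of 10 and the tree count of 100 come from the example in the text.

```python
import random


def bootstrap_sample(training_set, n, rng):
    """Draw n samples with replacement; the training set is left unchanged."""
    return [rng.choice(training_set) for _ in range(n)]


def draw_forest_samples(training_set, n, m, seed=0):
    """Return m independent bootstrap samples, one per decision tree."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [bootstrap_sample(training_set, n, rng) for _ in range(m)]


# Stand-in sample IDs; a real training set would hold (configuration, result) pairs.
training_set = list(range(50))
subsets = draw_forest_samples(training_set, n=10, m=100)
print(len(subsets), len(subsets[0]))  # 100 10
```

Because sampling is with replacement, the same sample may appear more than once within a subset, and each subset is drawn from the full, unmodified training set.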
According to different scenarios, the model built by the m decision trees can be configured and expanded as shown in FIG. 2, taking RAID5, disk number 3, link 1, LUN number 1, concurrency number 1 as an example. The other configurations expand in the same way as the leftmost configuration, giving 2 × 3 × 3 × 3 × 3 = 162 configuration schemes in total. Parameters can be added or removed in an actual scenario; for example, if only RAID5 is of interest, the 81 configuration schemes on the RAID6 side are not required.
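The size of this configuration space can be checked by enumerating the Cartesian product of the parameter values listed in the example. The constant names below are illustrative; the values are the ones given in the text.

```python
from itertools import product

# Parameter values from the example; the names are illustrative.
RAID_LEVELS = ("RAID5", "RAID6")
DISK_COUNTS = (3, 4, 5)
LINK_COUNTS = (1, 2, 4)
LUN_COUNTS = (1, 2, 3)
CONCURRENCY = (1, 2, 8)

# Every combination of one value per parameter is one configuration scheme.
all_configs = list(
    product(RAID_LEVELS, DISK_COUNTS, LINK_COUNTS, LUN_COUNTS, CONCURRENCY)
)

# Restricting to RAID5 drops the RAID6 half of the space.
raid5_only = [c for c in all_configs if c[0] == "RAID5"]

print(len(all_configs), len(raid5_only))  # 162 81
```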
Because the data measured for the performance index are specific numerical values, they are classified against unified performance thresholds, and the conclusion of the model classifies the performance index as high, medium or low. For example, an IOPS below 1000 is low, an IOPS of 1000 to 10000 is medium, and an IOPS above 10000 is high.
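A minimal sketch of this labelling step, assuming the boundaries of 1000 and 10000 IOPS from the example. The function name and the handling of values that fall exactly on a boundary are assumptions, since the text does not specify them.

```python
def classify_iops(iops, low_threshold=1000, high_threshold=10000):
    """Map a measured IOPS value to a performance level.

    Boundary handling is an assumption: values from low_threshold up to and
    including high_threshold are treated as "medium".
    """
    if iops < low_threshold:
        return "low"
    if iops <= high_threshold:
        return "medium"
    return "high"


print(classify_iops(500), classify_iops(5000), classify_iops(20000))
# low medium high
```

Turning the numeric test results into three labels is what lets the decision trees be trained as classifiers and their outputs be combined by voting.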
The model of each decision tree shows the correspondence between the object features of different storage configurations and the performance-level conclusion. The conclusion set {high performance value, medium performance value, low performance value} is the feature set A in the decision, and {RAID5, RAID6}, {disk number 3, disk number 4, disk number 5}, {link 1, link 2, link 4}, {LUN number 1, LUN number 2, LUN number 3} and {concurrency number 1, concurrency number 2, concurrency number 8} are the data sets Di (i = 1, 2, …, 5).
There are two mainstream decision tree generation algorithms: ID3 and C4.5. From the different input configurations and the resulting performance value conclusions, the information gain ratio g(Di/A) of the feature set A to the data set Di can be obtained, where g(Di/A) indicates the gain effect from node to node along each tree branch.
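For reference, the information gain ratio used by C4.5 can be sketched as below. This follows the standard textbook formulation, in which one configuration parameter is the splitting attribute and the performance level is the class label; it is an assumption that this matches the patent's g(Di/A), whose notation differs. The function names and the four toy samples are illustrative.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def gain_ratio(attribute_values, labels):
    """C4.5 gain ratio: information gain divided by the split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(attribute_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data, for illustration only: RAID level fully determines the label,
# while the disk count carries no information about it.
labels = ["low", "low", "high", "high"]
raid = ["RAID5", "RAID5", "RAID6", "RAID6"]
disks = [3, 4, 3, 4]
print(gain_ratio(raid, labels), gain_ratio(disks, labels))  # 1.0 0.0
```

C4.5 selects the attribute with the highest gain ratio at each node; ID3 uses the plain information gain instead.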
In the configuration, each of the five data sets Di is assigned a weight of 20%, so that the weights of the five data sets sum to 1. For a decision tree model whose training is finished, only the configuration conditions Di need to be input; g(Di/A) of each branch is multiplied by its weight to obtain the output result in the feature set A. For example, for the input configuration RAID5, disk number 3, link 1, LUN number 1, concurrency number 1, the conclusion may be that the performance value is low.
The models of the m decision trees are combined to generate a random forest model. For new data, namely the configuration information of a storage device whose IOPS performance is to be judged, the configuration is input into each of the m decision trees to obtain m performance predictions; the predictions are put to a vote, and the result with the most votes is the prediction of the random forest. For example, when m is 100, if 80 decision trees give a high performance value, 15 decision trees give a medium performance value and 5 decision trees give a low performance value, then according to the vote the conclusion of the 80 decision trees is taken, that is, the performance value is high. This gives the user a practical reference when deciding whether to use this storage configuration, and also guides the search for an optimal configuration during performance testing.
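The voting step can be sketched as a simple majority count over the per-tree predictions. The function name is illustrative, and the prediction list reproduces the 80/15/5 split from the example, reading the 15 trees as the medium-performance votes.

```python
from collections import Counter


def forest_vote(tree_predictions):
    """Return the label with the most votes and its vote count."""
    counts = Counter(tree_predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes


# Per-tree predictions for m = 100, matching the example in the text.
predictions = ["high"] * 80 + ["medium"] * 15 + ["low"] * 5
print(forest_vote(predictions))  # ('high', 80)
```

The text does not say how ties are resolved; in this sketch `Counter.most_common` returns the label that was encountered first among equally common ones.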
As can be seen from the foregoing embodiments, the method for judging the performance of a storage device provided by the embodiments of the present invention performs performance tests based on different configuration parameters on the storage device and obtains a plurality of storage performances based on the different configuration parameters as a plurality of test results; combines the different configuration parameters and the plurality of test results into a plurality of samples in one-to-one correspondence and establishes a sample training set; constructs a plurality of decision trees using the sample training set to form a decision forest; and processes the specific configuration parameters with each of the decision trees in the decision forest and judges the storage performance of the storage device under the specific configuration parameters by combining the processing results. This technical scheme can automatically judge the performance of the storage device under different configurations, is highly operable and is convenient to implement.
It should be particularly noted that the steps in the embodiments of the method for judging the performance of a storage device described above can be interleaved, replaced, added or deleted. Methods for judging the performance of a storage device obtained through such reasonable permutations and combinations therefore also fall within the scope of the present invention, and the scope of the present invention shall not be limited to the described embodiments.
In view of the above-mentioned objects, a second aspect of the embodiments of the present invention provides an embodiment of an apparatus capable of automatically determining performance of a storage device in different configurations. The performance judgment device of the storage equipment comprises:
a processor; and
a memory storing program code executable by the processor, the program code when executed sequentially performing the steps of:
performing a performance test based on different configuration parameters on the storage device and obtaining a plurality of storage performances based on different configuration parameters as a plurality of test results;
combining different configuration parameters and a plurality of test results into a plurality of samples in a one-to-one correspondence manner and establishing a sample training set;
constructing a plurality of decision trees by using the sample training set to form a decision forest;
and respectively processing the specific configuration parameters by using a plurality of decision trees in the decision forest and judging the storage performance of the storage equipment under the specific configuration parameters by combining a plurality of processing results.
In some embodiments, the configuration parameters include at least one of: array level, number of disks, number of output links, number of logical unit numbers, maximum concurrency; the array level includes one of: RAID0, RAID10, RAID5, RAID6; the number of disks is 3 to 5; the number of output links is 1 to 4; the number of logical unit numbers is 1 to 3; the maximum concurrency is 1 to 8; the storage performance includes a maximum number of inputs and outputs per second.
In some embodiments, constructing a plurality of decision trees using the sample training set to form a decision forest comprises:
randomly extracting a plurality of samples from a sample training set;
forming a decision tree using the plurality of samples;
putting the plurality of samples back into the sample training set;
and repeating the steps to obtain a plurality of decision trees to form a decision forest.
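The draw, build, put back, and repeat loop described above might look like the following sketch. It assumes a hypothetical `build_tree` callable that turns a list of samples into a tree, since the source does not name a tree-building routine here. Each draw uses `random.sample`, so a sample appears at most once per tree; many bagging implementations instead draw with replacement (for example with `random.choices`).

```python
import random
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")
T = TypeVar("T")


def build_forest(training_set: Sequence[S],
                 n_trees: int,
                 sample_size: int,
                 build_tree: Callable[[List[S]], T],
                 seed: int = 0) -> List[T]:
    """Build n_trees trees, each from a fresh random draw of sample_size samples."""
    rng = random.Random(seed)
    pool = list(training_set)
    forest: List[T] = []
    for _ in range(n_trees):
        # The pool is never modified, so the drawn samples are effectively
        # "put back" before the next tree is built.
        subset = rng.sample(pool, sample_size)
        forest.append(build_tree(subset))
    return forest
```

Because the pool is restored after every draw, different trees can share samples while still being trained on different subsets.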
In some embodiments, the storage performance includes a performance range determined based on the maximum number of inputs and outputs per second; the performance ranges comprise a low performance range, a medium performance range, and a high performance range, divided in ascending order of the maximum number of inputs and outputs per second;
processing the specific configuration parameters with each of the plurality of decision trees in the decision forest and determining the storage performance of the storage device under the specific configuration parameters by combining the plurality of processing results comprises: processing the specific configuration parameters with each decision tree in the decision forest to obtain one performance range judgment per decision tree, and taking the performance range that receives the most judgments as the storage performance determined by the decision forest for the storage device under the specific configuration parameters.
In some embodiments, processing the specific configuration parameters using a decision tree to obtain the performance range judgment includes:
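The range labelling and the majority vote can be sketched as below. The two cut-off values are purely hypothetical, since the source gives no thresholds, and the three ranges are labelled "low", "medium", and "high" here. Each tree is modelled as a callable that maps a configuration to one of those labels.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

LOW_CUT = 10_000.0    # hypothetical boundary between the low and middle ranges
HIGH_CUT = 50_000.0   # hypothetical boundary between the middle and high ranges


def performance_range(max_iops: float) -> str:
    """Map a maximum inputs/outputs-per-second value to a range label."""
    if max_iops < LOW_CUT:
        return "low"
    if max_iops < HIGH_CUT:
        return "medium"
    return "high"


def forest_vote(trees: Sequence[Callable[[Dict], str]], config: Dict) -> str:
    """Return the range label predicted by the largest number of trees."""
    if not trees:
        raise ValueError("the decision forest contains no trees")
    votes = Counter(tree(config) for tree in trees)
    return votes.most_common(1)[0][0]
```

With three trees that answer "high", "low", and "high", `forest_vote` returns "high". The source does not say how ties are resolved; this sketch simply returns whichever tied label `Counter` lists first.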
determining a plurality of information gain ratios among a plurality of samples in the decision tree according to different configuration parameters of the plurality of samples and corresponding test results in the decision tree;
and multiplying the specific configuration parameters by a plurality of information gain ratios respectively, and accumulating to obtain the performance range judgment.
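As an illustration of the two steps above, the sketch below computes an information gain ratio using the standard C4.5 definition (information gain divided by split information) and then forms the weighted sum the text describes. Two points are assumptions rather than statements from the source: parameter values must already be numeric for the multiplication (an array level such as RAID5 would need some encoding), and the source does not specify how the accumulated score is mapped to a performance range, so that step is left out.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy, in bits, of a sequence of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Information gain of splitting on `values`, divided by split information."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(values)
    # A parameter that takes a single value carries no split information.
    return gain / split_info if split_info > 0 else 0.0


def weighted_score(param_values: Sequence[float],
                   ratios: Sequence[float]) -> float:
    """Multiply each parameter value by its gain ratio and accumulate."""
    return sum(v * r for v, r in zip(param_values, ratios))
```

For example, a parameter whose two values separate "low" samples from "high" samples perfectly has a gain ratio of 1, while a parameter that is constant across all samples has a gain ratio of 0.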
As can be seen from the foregoing embodiments, the performance judgment apparatus for a storage device according to an embodiment of the present invention performs performance tests on the storage device based on different configuration parameters and obtains a plurality of storage performances based on the different configuration parameters as a plurality of test results; combines the different configuration parameters and the plurality of test results into a plurality of samples in one-to-one correspondence and establishes a sample training set; constructs a plurality of decision trees using the sample training set to form a decision forest; and processes specific configuration parameters with each of the decision trees in the decision forest and determines the storage performance of the storage device under the specific configuration parameters by combining the plurality of processing results. This technical solution can automatically determine the performance of the storage device under different configurations, and is highly operable and convenient to implement.
It should be particularly noted that the above embodiment of the performance judgment apparatus uses the embodiment of the performance judgment method to describe the working process of each module in detail, and those skilled in the art can readily apply these modules to other embodiments of the performance judgment method of the storage device. Of course, since the steps in the method embodiments may be interleaved, replaced, added, or deleted, performance judgment apparatuses of a storage device that result from such reasonable permutations and combinations also fall within the scope of the present invention, and the scope of the present invention is not limited to the described embodiment.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items. The serial numbers of the embodiments disclosed in the embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Within the idea of an embodiment of the invention, technical features in the above embodiment or in different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.

Claims (10)

1. A performance judgment method of a storage device is characterized by comprising the following steps:
performing a performance test based on different configuration parameters on a storage device and obtaining a plurality of storage performances based on the different configuration parameters as a plurality of test results;
combining the different configuration parameters and the plurality of test results into a plurality of samples in a one-to-one correspondence manner and establishing a sample training set;
constructing a plurality of decision trees using the sample training set to form a decision forest;
and respectively processing specific configuration parameters by using the plurality of decision trees in the decision forest and judging the storage performance of the storage device under the specific configuration parameters by combining a plurality of processing results.
2. The method of claim 1, wherein the configuration parameter comprises at least one of: array level, number of disks, number of output links, number of logical unit numbers, maximum concurrency; the array level includes one of: RAID0, RAID10, RAID5, RAID6; the number of disks is 3 to 5; the number of output links is 1 to 4; the number of logical unit numbers is 1 to 3; the maximum concurrency is 1 to 8; the storage performance includes a maximum number of inputs and outputs per second.
3. The method of claim 1, wherein constructing a plurality of decision trees using the sample training set to form a decision forest comprises:
randomly drawing a plurality of the samples from the training set of samples;
forming a decision tree using a plurality of said samples;
placing a plurality of the samples back into the sample training set;
repeating the above steps to obtain the plurality of decision trees to form the decision forest.
4. The method of claim 3, wherein the storage performance comprises a performance range determined based on a maximum number of inputs and outputs per second; the performance range comprises a low performance range, a medium performance range and a high performance range which are divided from small to large based on the maximum number of inputs and outputs per second;
respectively processing specific configuration parameters by using the plurality of decision trees in the decision forest and judging the storage performance of the storage device under the specific configuration parameters by combining a plurality of processing results comprises: respectively processing the specific configuration parameters by using the plurality of decision trees in the decision forest, respectively obtaining a plurality of performance range judgments of the plurality of decision trees, and taking the performance range that receives the most judgments as the storage performance determined by the decision forest for the storage device under the specific configuration parameters.
5. The method of claim 4, wherein processing the specific configuration parameter using the decision tree and combining the plurality of processing results to obtain the performance range determination comprises:
determining a plurality of information gain ratios among a plurality of samples within the decision tree according to the different configuration parameters of the plurality of samples and the corresponding test results in the decision tree;
and multiplying the specific configuration parameter by the plurality of information gain ratios respectively and accumulating the result to obtain the performance range judgment.
6. A performance judging apparatus of a storage device, comprising:
a processor; and
a memory storing program code executable by the processor, the program code when executed sequentially performing the steps of:
performing a performance test based on different configuration parameters on a storage device and obtaining a plurality of storage performances based on the different configuration parameters as a plurality of test results;
combining the different configuration parameters and the plurality of test results into a plurality of samples in a one-to-one correspondence manner and establishing a sample training set;
constructing a plurality of decision trees using the sample training set to form a decision forest;
and respectively processing specific configuration parameters by using the plurality of decision trees in the decision forest and judging the storage performance of the storage device under the specific configuration parameters by combining a plurality of processing results.
7. The apparatus of claim 6, wherein the configuration parameter comprises at least one of: array level, number of disks, number of output links, number of logical unit numbers, maximum concurrency; the array level includes one of: RAID0, RAID10, RAID5, RAID6; the number of disks is 3 to 5; the number of output links is 1 to 4; the number of logical unit numbers is 1 to 3; the maximum concurrency is 1 to 8; the storage performance includes a maximum number of inputs and outputs per second.
8. The apparatus of claim 6, wherein constructing a plurality of decision trees using the sample training set to form a decision forest comprises:
randomly drawing a plurality of the samples from the training set of samples;
forming a decision tree using a plurality of said samples;
placing a plurality of the samples back into the sample training set;
repeating the above steps to obtain the plurality of decision trees to form the decision forest.
9. The apparatus of claim 8, wherein the storage performance comprises a performance range determined based on a maximum number of inputs and outputs per second; the performance range comprises a low performance range, a medium performance range and a high performance range which are divided from small to large based on the maximum number of inputs and outputs per second;
respectively processing specific configuration parameters by using the plurality of decision trees in the decision forest and judging the storage performance of the storage device under the specific configuration parameters by combining a plurality of processing results comprises: respectively processing the specific configuration parameters by using the plurality of decision trees in the decision forest, respectively obtaining a plurality of performance range judgments of the plurality of decision trees, and taking the performance range that receives the most judgments as the storage performance determined by the decision forest for the storage device under the specific configuration parameters.
10. The apparatus of claim 9, wherein processing the specific configuration parameter using the decision tree and combining the plurality of processing results to obtain the performance range determination comprises:
determining a plurality of information gain ratios among a plurality of samples within the decision tree according to the different configuration parameters of the plurality of samples and the corresponding test results in the decision tree;
and multiplying the specific configuration parameter by the plurality of information gain ratios respectively and accumulating the result to obtain the performance range judgment.
CN202010109607.9A 2020-02-22 2020-02-22 Method and device for judging performance of storage equipment Pending CN111338868A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010109607.9A CN111338868A (en) 2020-02-22 2020-02-22 Method and device for judging performance of storage equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010109607.9A CN111338868A (en) 2020-02-22 2020-02-22 Method and device for judging performance of storage equipment

Publications (1)

Publication Number Publication Date
CN111338868A true CN111338868A (en) 2020-06-26

Family

ID=71181941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010109607.9A Pending CN111338868A (en) 2020-02-22 2020-02-22 Method and device for judging performance of storage equipment

Country Status (1)

Country Link
CN (1) CN111338868A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109597401A (en) * 2018-12-06 2019-04-09 华中科技大学 A kind of equipment fault diagnosis method based on data-driven

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
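Anonymous: "Random Forest, decision trees, bagging, boosting (Adaptive Boosting, GBDT)", 《HTTP://WWW.CNBLOGS.COM/MAYBE2030/P/4585705.HTML》 *
Zhu Zhichuan et al.: "A measurement model and empirical study of coordinated system development based on G1 modified by information gain ratio", 《Statistics & Decision》 *
Chen Yu et al.: "Automatic tuning of Ceph parameters based on random forest and genetic algorithm", 《Journal of Computer Applications》 *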

Similar Documents

Publication Publication Date Title
CN109711528A (en) Based on characteristic pattern variation to the method for convolutional neural networks beta pruning
CN107609130A (en) A kind of method and server for selecting data query engine
CN109784365A (en) A kind of feature selection approach, terminal, readable medium and computer program
CN110781174A (en) Feature engineering modeling method and system using pca and feature intersection
CN105512156A (en) Method and device for generation of click models
CN116402117A (en) Image classification convolutional neural network pruning method and core particle device data distribution method
Stützle et al. New benchmark instances for the QAP and the experimental analysis of algorithms
CA2520317A1 (en) Decision tree analysis
CN111338868A (en) Method and device for judging performance of storage equipment
CN101894063A (en) Method and device for generating test program for verifying function of microprocessor
CN111814414A (en) Coverage rate convergence method and system based on genetic algorithm
CN111461815A (en) Order recognition model generation method, recognition method, system, device and medium
CN108256694A (en) Based on Fuzzy time sequence forecasting system, the method and device for repeating genetic algorithm
CN111985644B (en) Neural network generation method and device, electronic equipment and storage medium
CN111260036B (en) Neural network acceleration method and device
CN109977977A (en) A kind of method and corresponding intrument identifying potential user
CN109145518B (en) Method for constructing reliability decision graph model of large-scale complex equipment
CN107433032A (en) Chess game data processing method and device
Li et al. A multi-objective learning method for building sparse defect prediction models
CN110020725A (en) A kind of test design method for serving Weapon Equipment System operation emulation
CN108073502B (en) Test method and system thereof
CN108536299A (en) Human-computer interaction result generation method based on matrix and system
CN110728299A (en) Multi-extreme learning machine-based transient stability hierarchical evaluation method after power system fault
CN109460533A (en) A kind of method and device improving GEMM calculated performance
CN115983719B (en) Training method and system for software comprehensive quality evaluation model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200626)