CN112162966A - Distributed storage system parameter adjusting method and device, electronic equipment and medium - Google Patents


Info

Publication number
CN112162966A
CN112162966A (application CN202010956173.6A)
Authority
CN
China
Prior art keywords
neural network
network model
internal state
parameter
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010956173.6A
Other languages
Chinese (zh)
Inventor
王团结
梁鑫辉
曹琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Inspur Data Technology Co Ltd
Original Assignee
Beijing Inspur Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Inspur Data Technology Co Ltd filed Critical Beijing Inspur Data Technology Co Ltd
Priority to CN202010956173.6A priority Critical patent/CN112162966A/en
Publication of CN112162966A publication Critical patent/CN112162966A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks


Abstract

The application discloses a method, apparatus, device and medium for adjusting parameters of a distributed storage system, wherein the method comprises the following steps: acquiring a neural network model created in advance from data obtained by interaction between a vdbench client and a storage cluster, the model being used for evaluating time delay; determining actual internal state data of the current storage cluster, wherein the actual internal state data comprises ganesha interface delay data; and searching, using the neural network model and according to the actual internal state data, for a target parameter that maximizes the output value, and adjusting the parameters of the current storage cluster according to the target parameter, wherein the output value is negatively correlated with the client time delay. The method and the device can create a neural network model for evaluating time delay, obtain the actual internal state data of the storage cluster, and search for the target parameter that maximizes the output value based on the created model, thereby obtaining a target parameter that keeps the time delay small and achieving automatic optimization of the storage cluster.

Description

Distributed storage system parameter adjusting method and device, electronic equipment and medium
Technical Field
The present application relates to the field of distributed storage technologies, and in particular, to a method and an apparatus for adjusting parameters of a distributed storage system, an electronic device, and a computer-readable storage medium.
Background
Current storage systems expose a large number of adjustable parameters, each with a very wide adjustable range, and modifying these parameters often affects system performance in different ways. The default parameter configuration is typically provided by the vendor, and this combination of parameter values is usually not an optimal set. Adjusting even a small subset of parameter values can improve the system's energy consumption and performance efficiency by several times or more.
Traditional parameter adjustment is completed by a system administrator based on professional knowledge and experience. As the complexity of storage systems continues to grow, manual parameter adjustment can no longer keep up with large-scale storage systems, and it further suffers from drawbacks such as the inability to monitor around the clock and high labor cost. Research into automatic optimization methods for storage systems is therefore urgently needed, and how to solve the above problems is of great concern to those skilled in the art.
Disclosure of Invention
The application aims to provide a distributed storage system parameter adjusting method and device, an electronic device and a computer readable storage medium, which can realize automatic optimization of a storage cluster.
In order to achieve the above object, the present application provides a method for adjusting parameters of a distributed storage system, including:
acquiring a neural network model which is created in advance according to data obtained by interaction between a vdbench client and a storage cluster, wherein the neural network model is used for evaluating the time delay;
determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of a client.
Optionally, the interface delay data includes: any one or combination of any several of interface name, number of received requests, number of processed requests, average processing delay, maximum processing delay, minimum processing delay, average request waiting delay, maximum request waiting delay and minimum request waiting delay.
Optionally, the creating process of the neural network model includes:
constructing a full-connection neural network as an initial neural network model, and initializing neural network parameters in the initial neural network model;
acquiring a first internal state of a storage cluster at the current moment, selecting a group of parameters in a parameter space, performing dynamic configuration, performing IO read-write operation on a storage system after the configuration is completed, and acquiring average time delay data of a client; the parameter space is generated by setting a minimum value and a maximum value corresponding to ganesha configuration data and a sampling interval;
querying a second internal state of the storage cluster after the IO read-write operation is finished, and generating a group of training data by combining the average delay data, the ganesha configuration data and the first internal state;
and training the initial neural network model by utilizing the multiple groups of training data to generate a final neural network model.
Optionally, the ganesha configuration data includes: any one or combination of any several items of the number of working threads, the total number of reading working threads, the number of reading working thread groups, the maximum number of cached files and the caching time.
Optionally, the searching, by using the neural network model and according to the actual internal state data, for the target parameter that maximizes the output value of the neural network model includes:
traversing a parameter space of ganesha configuration data, inputting the actual internal state data and each group of parameters into the neural network model, and acquiring corresponding output values;
and selecting a maximum value from all the output values, and acquiring a parameter corresponding to the maximum value as the target parameter.
Optionally, after searching, by using the neural network model, a target parameter that maximizes an output value of the neural network model according to the actual internal state data and adjusting a parameter of the current storage cluster according to the target parameter, the method further includes:
and recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
In order to achieve the above object, the present application provides a distributed storage system parameter adjusting apparatus, including:
the model acquisition module is used for acquiring a neural network model which is created in advance according to data obtained by interaction between the vdbench client and the storage cluster, and the neural network model is used for evaluating the time delay;
the state determining module is used for determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
and the parameter adjusting module is used for searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of the client.
Optionally, the method further includes:
and the model updating module is used for recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
To achieve the above object, the present application provides an electronic device including:
a memory for storing a computer program;
a processor for implementing the steps of any one of the aforementioned disclosed distributed storage system parameter adjustment methods when executing the computer program.
To achieve the above object, the present application provides a computer-readable storage medium, on which a computer program is stored, the computer program, when being executed by a processor, implementing the steps of any one of the distributed storage system parameter adjusting methods disclosed in the foregoing.
According to the scheme, the method for adjusting the parameters of the distributed storage system comprises the following steps: acquiring a neural network model which is created in advance according to data obtained by interaction between a vdbench client and a storage cluster, wherein the neural network model is used for evaluating the time delay; determining actual internal state data of the current storage cluster, wherein the actual internal state data comprises interface delay data of ganesha; and searching a target parameter that maximizes the output value of the neural network model according to the actual internal state data by using the neural network model, and adjusting the parameters of the current storage cluster according to the target parameter, wherein the output value is negatively correlated with the client time delay. In this way, a neural network model for evaluating client time delay can be created in advance; when parameters actually need to be adjusted, the actual internal state data of the current storage cluster can be obtained, and the target parameter that maximizes the output value of the neural network model can be searched for based on the created model. Since a larger output value represents a smaller client time delay, a target parameter yielding a smaller client time delay can be obtained to adjust the storage cluster. The optimal parameters can thus be obtained automatically without manual parameter adjustment, reducing labor cost and achieving automatic tuning of the storage cluster.
The application also discloses a distributed storage system parameter adjusting device, an electronic device and a computer readable storage medium, which can also achieve the technical effects.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for adjusting parameters of a distributed storage system according to an embodiment of the present application;
FIG. 2 is a flow chart of a neural network model creation process disclosed in an embodiment of the present application;
fig. 3 is a structural diagram of a distributed storage system parameter adjustment apparatus disclosed in an embodiment of the present application;
fig. 4 is a block diagram of an electronic device disclosed in an embodiment of the present application;
fig. 5 is a block diagram of another electronic device disclosed in the embodiments of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Traditional parameter adjustment is completed by a system administrator based on professional knowledge and experience. As the complexity of storage systems continues to grow, manual parameter adjustment can no longer keep up with large-scale storage systems, and it further suffers from drawbacks such as the inability to monitor around the clock and high labor cost. Research into automatic optimization methods for storage systems is therefore urgently needed, and how to solve the above problems is of great concern to those skilled in the art.
Therefore, the embodiment of the application discloses a distributed storage system parameter adjusting method, which can automatically acquire optimal parameters and realize automatic adjustment and optimization of a storage cluster.
Referring to fig. 1, a method for adjusting parameters of a distributed storage system disclosed in an embodiment of the present application includes:
s101: acquiring a neural network model which is created in advance according to data obtained by interaction between a vbdbench client and a storage cluster, wherein the neural network model is used for evaluating the time delay;
in the embodiment of the application, interaction can be carried out on the storage cluster through the vdbench client in advance, the internal state of the storage cluster, the average time delay of the client and ganesha configuration data are collected, a neural network model is created through offline learning, and the neural network model is used for evaluating the time delay of the client.
It should be noted that vdbench is an I/O workload generator used to verify data integrity and measure the performance of direct-attached and network-attached storage; thanks to its easy installation and good compatibility, it has become a common tool for testing and benchmarking. Ganesha refers specifically to NFS-Ganesha, an NFS (Network File System) file server that runs in user mode.
Specifically, the ganesha configuration data may include, but is not limited to: the number of working threads, the total number of reading working threads, the grouping number of reading working threads, the maximum number of cached files and the caching time.
S102: determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
in this step, the actual internal state data of the storage cluster which needs parameter tuning at present can be obtained. Specifically, the actual internal state data may include interface latency data of ganesha, and the interface latency data of ganesha is a set of statistical indexes, which may specifically include but is not limited to: interface name, number of requests received, number of requests processed, average delay of processing, maximum delay of processing, minimum delay of processing, average delay of request waiting, maximum delay of request waiting, and minimum delay of request waiting.
S103: searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of a client.
It should be noted that after the actual internal state data of the current storage cluster is obtained, the actual internal state data and the ganesha configuration data may be used as input data of the neural network model. Specifically, the parameter space of the ganesha configuration data may be traversed, the actual internal state data and each group of parameters input into the neural network model, and the corresponding output values obtained; the maximum value is then selected from all the output values, and the parameter corresponding to that maximum value is taken as the target parameter, so that the parameters of the current storage cluster are adjusted using the target parameter.
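The search described above is an exhaustive argmax over the parameter space. A minimal Python sketch (illustrative only; the model is abstracted as a callable, and all names are assumptions rather than taken from the patent):

```python
def find_target_parameters(model, state, parameter_space):
    """Exhaustively score every candidate parameter set with the model
    and return the set whose output value is maximum."""
    best_params, best_value = None, float("-inf")
    for params in parameter_space:
        # Model input: internal state vector concatenated with one parameter set.
        value = model(state + params)
        if value > best_value:
            best_value, best_params = value, params
    return best_params, best_value

# Toy usage: a stand-in "model" playing the role of the latency evaluator
# (a larger output value stands for a lower predicted client latency).
toy_model = lambda x: -sum(x)
best, score = find_target_parameters(toy_model, [1.0, 2.0], [[0.0], [3.0], [10.0]])
```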
As a preferred implementation manner, in the embodiment of the present application, after the target parameter is obtained by searching and the parameters of the current storage cluster are adjusted accordingly, the actual internal state data and the target parameter may further be recorded as the latest training parameters, so that the neural network model can subsequently be updated with them to improve its accuracy. For example, after the target parameters, that is, the optimal configuration parameters, are obtained, they may be configured online; after running for a period of time, the system feedback is obtained from the reward function together with the internal state of the cluster at the current time, and the data is stored and recorded so as to further update the neural network model.
According to the scheme, the method for adjusting the parameters of the distributed storage system comprises the following steps: acquiring a neural network model which is created in advance according to data obtained by interaction between a vdbench client and a storage cluster, wherein the neural network model is used for evaluating the time delay; determining actual internal state data of the current storage cluster, wherein the actual internal state data comprises interface delay data of ganesha; and searching a target parameter that maximizes the output value of the neural network model according to the actual internal state data by using the neural network model, and adjusting the parameters of the current storage cluster according to the target parameter, wherein the output value is negatively correlated with the client time delay. In this way, a neural network model for evaluating client time delay can be created in advance; when parameters actually need to be adjusted, the actual internal state data of the current storage cluster can be obtained, and the target parameter that maximizes the output value of the neural network model can be searched for based on the created model. Since a larger output value represents a smaller client time delay, a target parameter yielding a smaller client time delay can be obtained to adjust the storage cluster. The optimal parameters can thus be obtained automatically without manual parameter adjustment, reducing labor cost and achieving automatic tuning of the storage cluster.
The creation process of the neural network model is further described below. Referring to fig. 2, specifically:
s201: constructing a full-connection neural network as an initial neural network model, and initializing neural network parameters in the initial neural network model;
according to the embodiment of the application, a four-layer fully-connected neural network can be specifically constructed to serve as a neural network model, and parameters of the neural network are initialized. The input layer is the internal state of the cluster and the ganesha core parameter, the first hidden layer comprises M neurons, the activation function is Relu, the second hidden layer comprises N neurons, the activation function is Relu, and the output layer comprises one neuron. The values of M and N may be set according to actual conditions in practical implementation, and are not limited herein, for example, M may be set to 100, and N may be set to 10.
S202: acquiring a first internal state of a storage cluster at the current moment, selecting a group of parameters in a parameter space, performing dynamic configuration, performing IO read-write operation on a storage system after the configuration is completed, and acquiring average time delay data of a client;
in this step, a first internal state s of the storage cluster at the current time can be obtainedt. Specifically, the interface latency of ganesha can be taken as the internal state of the cluster, including but not limited to the interface name, the number of received requests, the number of processed requests, the processing average latency, the processing maximum latency, the processing minimum latency, the request waiting average latency, the request waiting maximum latency, the request waiting minimum latency, and the like, and can be regarded as a 24 × 8 two-dimensional table. Expanding the two-dimensional table according to rows to obtain the length of 192 dimensionsInternal state vector s of a cluster of degreest
It should be noted that when the vdbench client runs with parameters a_t in cluster state s_t, the lower the average latency of the IO read and write operations, the better, so the reward function is r_t = -latency_t. Ganesha has five core configuration items, and the parameter space corresponding to the ganesha configuration data can be generated by setting the minimum value, the maximum value and the sampling interval for each item. For example, one specific parameter space may be as shown in Table 1 below:
TABLE 1

Parameter name          Parameter meaning                      [minimum, maximum, sampling interval]
Nb_Worker               Number of worker threads               [1, 128, 10]
nb_worker_req           Total number of read worker threads    [1, 128, 10]
nb_worker_queue         Number of read-worker-thread groups    [1, 100, 10]
Entries_HWMark          Maximum number of cached files         [100, 200000, 1000]
Attr_Expiration_Time    Cache time (in seconds)                [1, 100, 10]
For example, a specific parameter vector is: a_t = [10, 10, 5, 100, 8].
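Generating the parameter space from the [minimum, maximum, sampling interval] triples of Table 1 can be sketched as below; the dictionary keys mirror the table, while the function names are illustrative assumptions (note that an example vector like a_t = [10, 10, 5, 100, 8] would come from a differently sampled grid):

```python
from itertools import product

# [minimum, maximum, sampling interval] per parameter, mirroring Table 1.
PARAM_RANGES = {
    "Nb_Worker":            (1, 128, 10),
    "nb_worker_req":        (1, 128, 10),
    "nb_worker_queue":      (1, 100, 10),
    "Entries_HWMark":       (100, 200000, 1000),
    "Attr_Expiration_Time": (1, 100, 10),
}

def sample_axis(lo, hi, step):
    """All sampled values for one parameter: lo, lo+step, ... up to hi."""
    return list(range(lo, hi + 1, step))

def parameter_space(ranges):
    """Lazily yield every combination (Cartesian product) of sampled values."""
    axes = [sample_axis(*r) for r in ranges.values()]
    for combo in product(*axes):
        yield list(combo)
```

The full product contains hundreds of thousands of combinations here, which is why the generator yields lazily instead of materializing the whole space.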
It should be noted that, in the offline learning process, a group of parameters needs to be selected from the parameter space A. Specifically, the configuration parameters a_t are obtained using the ε-greedy method: with probability ε, a value close to 1 such as 0.99, the internal state of the current cluster together with each set of parameters in the parameter space is input into the neural network, and the set of parameters that maximizes the network output is taken; with the smaller probability 1 − ε, a set of parameters is randomly selected from the parameter space.
Specifically, a set of parameters a_t may be obtained from the internal state of the current cluster by the method above and configured dynamically; a randomly sampled multi-application-scenario vdbench script is then executed to perform IO read and write operations on the storage system for a set running time, for example 2 minutes. After the IO finishes, the feedback of the system can be obtained from the reward function.
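The ε-greedy selection described above can be sketched as follows (illustrative; the model is an arbitrary callable scoring a state-plus-parameters vector, and the function name is an assumption):

```python
import random

def epsilon_greedy(model, state, parameter_space, epsilon=0.99):
    """With probability epsilon, exploit: pick the parameter set the model
    scores highest; with probability 1 - epsilon, explore: pick at random."""
    if random.random() >= epsilon:
        return random.choice(parameter_space)                    # explore
    return max(parameter_space, key=lambda p: model(state + p))  # exploit
```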
S203: querying a second internal state of the storage cluster after the IO read-write operation is finished, and generating a group of training data by combining the average delay data, the ganesha configuration data and the first internal state;
in this step, the second internal state s of the storage cluster after the IO read-write operation is finished can be queriedt+1. In combination with the average delay data, ganesha configuration data, the first internal state and the second internal state to generate a set of training data, e.g.,(s)t,at,rt,st+1) And storing the data into a cache list.
S204: and training the initial neural network model by utilizing the multiple groups of training data to generate a final neural network model.
It can be understood that multiple sets of training data can be randomly sampled from the cache list to train the neural network model and generate the required model. For a sample with input s_i, the target value y_i is given by the current immediate feedback r_i and the evaluation of the target neural network model Q̂, i.e. y_i = r_i + γ·max_a Q̂(s_{i+1}, a). According to the input s_i and the target value y_i, the neural network model Q is updated with a gradient descent algorithm. Every C updates of the Q network, the target network is updated once, i.e. Q̂ ← Q. C may be set according to the actual implementation scenario and is not limited by this embodiment; for example, C may be set to 10, that is, the target network Q̂ is refreshed after every ten updates of Q.
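The target computation and periodic target-network synchronization follow the standard deep-Q-learning recipe. A hedged sketch (γ = 0.9 is an assumed discount factor not stated here, the function names are illustrative, and the actual gradient step is elided):

```python
import random

GAMMA, C = 0.9, 10  # assumed discount factor; target-sync period from the text

def dqn_targets(batch, q_target, actions):
    """Target values y_i = r_i + GAMMA * max_a Q_hat(s_{i+1}, a) for a
    batch of transitions (s, a, r, s_next)."""
    return [r + GAMMA * max(q_target(s_next + a2) for a2 in actions)
            for (_s, _a, r, s_next) in batch]

def train(q, sync_target, cache, steps, batch_size=4):
    """Skeleton of the update loop: sample a batch, take an (elided)
    gradient step on Q toward the targets, and copy Q into Q_hat every C steps."""
    for step in range(1, steps + 1):
        batch = random.sample(cache, min(len(cache), batch_size))
        # ... gradient-descent update of q toward dqn_targets(batch, ...) ...
        if step % C == 0:
            sync_target()  # Q_hat <- Q
```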
In the following, a distributed storage system parameter adjusting apparatus provided in an embodiment of the present application is introduced, and a distributed storage system parameter adjusting apparatus described below and a distributed storage system parameter adjusting method described above may refer to each other.
Referring to fig. 3, a distributed storage system parameter adjustment apparatus provided in an embodiment of the present application includes:
the model acquisition module 301 is configured to acquire a neural network model created in advance according to data obtained by interaction between the vdbench client and the storage cluster, where the neural network model is used to evaluate the time delay;
a state determination module 302, configured to determine actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
a parameter adjusting module 303, configured to search, by using the neural network model, a target parameter that maximizes an output value of the neural network model according to the actual internal state data, and adjust a parameter of the current storage cluster according to the target parameter, where the output value is negatively related to a client time delay.
For the specific implementation process of the modules 301 to 303, reference may be made to the corresponding content disclosed in the foregoing embodiments, and details are not repeated here.
On the basis of the foregoing embodiment, as a preferred implementation, the distributed storage system parameter adjustment apparatus provided in the embodiment of the present application may further include:
and the model updating module is used for recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
The present application further provides an electronic device, and as shown in fig. 4, an electronic device provided in an embodiment of the present application includes:
a memory 100 for storing a computer program;
the processor 200, when executing the computer program, may implement the steps provided by the above embodiments.
Specifically, the memory 100 includes a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system and computer-readable instructions, and the internal memory provides an environment for running them. The processor 200 may in some embodiments be a central processing unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip; it provides computing and control capability for the electronic device, and when executing the computer program stored in the memory 100 it may implement the distributed storage system parameter adjustment method disclosed in any of the foregoing embodiments.
On the basis of the above embodiment, as a preferred implementation, referring to fig. 5, the electronic device further includes:
and an input interface 300 connected to the processor 200, for acquiring computer programs, parameters and instructions imported from the outside, and storing the computer programs, parameters and instructions into the memory 100 under the control of the processor 200. The input interface 300 may be connected to an input device for receiving parameters or instructions manually input by a user. The input device may be a touch layer covered on a display screen, or a button, a track ball or a touch pad arranged on a terminal shell, or a keyboard, a touch pad or a mouse, etc.
And a display unit 400 connected to the processor 200 for displaying data processed by the processor 200 and for displaying a visualized user interface. The display unit 400 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like.
a network port 500, connected to the processor 200, for establishing communication connections with external terminal devices. The communication technology adopted by the communication connection may be wired or wireless, such as Mobile High-definition Link (MHL), Universal Serial Bus (USB), High Definition Multimedia Interface (HDMI), wireless fidelity (WiFi), Bluetooth, Bluetooth Low Energy, or IEEE 802.11s-based communication.
While FIG. 5 shows only an electronic device having components 100 to 500, those skilled in the art will appreciate that the configuration shown in FIG. 5 does not limit the electronic device: it may include fewer or more components than shown, combine certain components, or arrange the components differently.
The present application also provides a computer-readable storage medium, which may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. The storage medium stores a computer program which, when executed by a processor, implements the distributed storage system parameter adjustment method disclosed in any of the foregoing embodiments.
According to the method and the device of the present application, a neural network model for evaluating client time delay can be created in advance. When parameters actually need to be adjusted, the actual internal state data of the current storage cluster is obtained, and the created neural network model is used to search for the target parameter that maximizes its output value. Because a larger output value represents a smaller client time delay, the target parameter so obtained reduces client time delay when applied to the storage cluster. Optimal parameters are thus obtained automatically and manual parameter tuning is unnecessary, which reduces labor cost and achieves automatic adjustment of the storage cluster.
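The tuning loop summarized above can be sketched in a few lines; the model, state vector, and candidate parameter sets below are hypothetical stand-ins for illustration, not the patent's actual implementation:

```python
import itertools

def tune(model, state, param_space):
    """Exhaustively score every candidate parameter set for the given
    internal state and return the one with the maximum model output
    (a higher output represents a lower predicted client latency)."""
    best_params, best_score = None, float("-inf")
    for params in param_space:
        score = model(list(state) + list(params))
        if score > best_score:
            best_params, best_score = params, score
    return best_params

# Hypothetical stand-in for the trained network: penalizes distance from 4
model = lambda x: -sum((v - 4) ** 2 for v in x)
state = [4.0]                                         # current internal state
space = list(itertools.product([2, 4, 8], [16, 32]))  # candidate settings
print(tune(model, state, space))                      # → (4, 16)
```

The exhaustive scan mirrors the traversal of the parameter space described later in claim 5; for large spaces the same idea applies with batched model evaluation.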
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the system disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief; for relevant details, refer to the description of the method. It should be noted that those skilled in the art can make several improvements and modifications to the present application without departing from its principle, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in this specification, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," and any variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

Claims (10)

1. A distributed storage system parameter adjustment method is characterized by comprising the following steps:
acquiring a neural network model created in advance according to data obtained through interaction between a vbdbench client and a storage cluster, the neural network model being used to evaluate client time delay;
determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of a client.
2. The distributed storage system parameter adjustment method of claim 1, wherein the interface latency data comprises: any one or combination of any several of interface name, number of received requests, number of processed requests, average processing delay, maximum processing delay, minimum processing delay, average request waiting delay, maximum request waiting delay and minimum request waiting delay.
3. The distributed storage system parameter adjustment method according to claim 1, wherein the creation process of the neural network model includes:
constructing a full-connection neural network as an initial neural network model, and initializing neural network parameters in the initial neural network model;
acquiring a first internal state of a storage cluster at the current moment, selecting a group of parameters in a parameter space, performing dynamic configuration, performing IO read-write operation on a storage system after the configuration is completed, and acquiring average time delay data of a client; the parameter space is generated by setting a minimum value and a maximum value corresponding to ganesha configuration data and a sampling interval;
querying a second internal state of the storage cluster after the IO read-write operation is finished, and generating a group of training data by combining the average delay data, the ganesha configuration data and the first internal state;
and training the initial neural network model by utilizing the multiple groups of training data to generate a final neural network model.
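The data-collection steps of claim 3 above can be sketched as follows; the cluster interface, parameter ranges, and random sampling policy are hypothetical illustrations of the claimed procedure, not the patent's implementation:

```python
import itertools
import random

def build_param_space(ranges):
    """Generate the parameter space from per-item (min, max, step)
    triples, as the claim describes for ganesha configuration data."""
    axes = [range(lo, hi + 1, step) for lo, hi, step in ranges]
    return list(itertools.product(*axes))

def collect_training_data(cluster, param_space, rounds):
    """Each round: read the first internal state, apply one candidate
    configuration, run IO and measure average client latency, then read
    the second internal state; the tuple forms one training sample."""
    samples = []
    for _ in range(rounds):
        first_state = cluster.read_internal_state()
        params = random.choice(param_space)      # one set from the space
        cluster.apply(params)                    # dynamic reconfiguration
        latency = cluster.run_io_benchmark()     # average client latency
        second_state = cluster.read_internal_state()
        samples.append((first_state, params, latency, second_state))
    return samples
```

The accumulated samples would then be used to train the fully connected network, with the model learning to map (state, parameters) to a score that decreases with measured latency.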
4. The distributed storage system parameter adjustment method according to claim 3, wherein the ganesha configuration data includes: any one or combination of any several items of the number of working threads, the total number of reading working threads, the number of reading working thread groups, the maximum number of cached files and the caching time.
5. The distributed storage system parameter adjustment method according to any one of claims 1 to 4, wherein the searching, by using the neural network model, for the target parameter that maximizes the output value of the neural network model from the actual internal state data includes:
traversing a parameter space of ganesha configuration data, inputting the actual internal state data and each group of parameters into the neural network model, and acquiring corresponding output values;
and selecting a maximum value from all the output values, and acquiring a parameter corresponding to the maximum value as the target parameter.
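The two-step traversal in claim 5 can also be batched so that the whole parameter space is scored in one forward pass; the linear model below is a hypothetical stand-in for the trained network, and the state and grid values are illustrative:

```python
import numpy as np

def best_params(model_fn, state, param_grid):
    """Score every (state, params) row in one batch and return the
    parameter set with the maximum model output."""
    grid = np.asarray(param_grid, dtype=float)
    states = np.tile(np.asarray(state, dtype=float), (len(grid), 1))
    inputs = np.hstack([states, grid])   # one row per candidate
    scores = model_fn(inputs)            # batch forward pass
    return param_grid[int(np.argmax(scores))]

# Hypothetical stand-in for the trained network: prefers a small last column
model = lambda x: -x[:, -1]
print(best_params(model, [0.3, 0.7], [(4, 10), (8, 5), (16, 20)]))  # → (8, 5)
```

Batching the grid keeps the search cheap relative to calling the model once per candidate, which matters when the sampled parameter space is large.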
6. The method according to claim 5, further comprising, after the searching, by using the neural network model, for a target parameter that maximizes the output value of the neural network model according to the actual internal state data, and the adjusting of the parameter of the current storage cluster according to the target parameter:
and recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
7. A distributed storage system parameter adjustment apparatus, comprising:
the model acquisition module is used for acquiring a neural network model which is created in advance according to data obtained by interaction between the vbdbench client and the storage cluster, and the neural network model is used for evaluating the time delay;
the state determining module is used for determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
and the parameter adjusting module is used for searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of the client.
8. The distributed storage system parameter adjustment apparatus of claim 7, further comprising:
and the model updating module is used for recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method of adjusting parameters of a distributed storage system according to any one of claims 1 to 6 when executing said computer program.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method for distributed storage system parameter adjustment according to any one of claims 1 to 6.
CN202010956173.6A 2020-09-11 2020-09-11 Distributed storage system parameter adjusting method and device, electronic equipment and medium Pending CN112162966A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010956173.6A CN112162966A (en) 2020-09-11 2020-09-11 Distributed storage system parameter adjusting method and device, electronic equipment and medium


Publications (1)

Publication Number Publication Date
CN112162966A true CN112162966A (en) 2021-01-01

Family

ID=73858948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010956173.6A Pending CN112162966A (en) 2020-09-11 2020-09-11 Distributed storage system parameter adjusting method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN112162966A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113485649A (en) * 2021-07-23 2021-10-08 中国电信股份有限公司 Data storage method, system, device, medium and electronic equipment
CN113608677A (en) * 2021-06-28 2021-11-05 山东海量信息技术研究院 Parameter tuning method, system and device of distributed storage system
CN113992703A (en) * 2021-09-29 2022-01-28 浪潮电子信息产业股份有限公司 Distributed storage system parameter optimization method and related components
CN118396038A (en) * 2024-06-24 2024-07-26 阿里云计算有限公司 Method and device for determining parameter tuning information and electronic equipment

Citations (6)

Publication number Priority date Publication date Assignee Title
US20150142801A1 (en) * 2013-11-15 2015-05-21 International Business Machines Corporation System and Method for Intelligently Categorizing Data to Delete Specified Amounts of Data Based on Selected Data Characteristics
US20150286200A1 (en) * 2012-07-31 2015-10-08 Caterva Gmbh Device for an Optimized Operation of a Local Storage System in an Electrical Energy Supply Grid with Distributed Generators, Distributed Storage Systems and Loads
CN110134697A (en) * 2019-05-22 2019-08-16 南京大学 A kind of parameter automated tuning method, apparatus, system towards key-value pair storage engines
CN110515724A (en) * 2019-08-13 2019-11-29 新华三大数据技术有限公司 Resource allocation method, device, monitor and machine readable storage medium
CN110650208A (en) * 2019-09-29 2020-01-03 北京浪潮数据技术有限公司 Distributed cluster storage method, system, device and computer readable storage medium
CN111008699A (en) * 2019-12-05 2020-04-14 首都师范大学 Neural network data storage method and system based on automatic driving

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
US20150286200A1 (en) * 2012-07-31 2015-10-08 Caterva Gmbh Device for an Optimized Operation of a Local Storage System in an Electrical Energy Supply Grid with Distributed Generators, Distributed Storage Systems and Loads
US20150142801A1 (en) * 2013-11-15 2015-05-21 International Business Machines Corporation System and Method for Intelligently Categorizing Data to Delete Specified Amounts of Data Based on Selected Data Characteristics
CN110134697A (en) * 2019-05-22 2019-08-16 南京大学 A kind of parameter automated tuning method, apparatus, system towards key-value pair storage engines
CN110515724A (en) * 2019-08-13 2019-11-29 新华三大数据技术有限公司 Resource allocation method, device, monitor and machine readable storage medium
CN110650208A (en) * 2019-09-29 2020-01-03 北京浪潮数据技术有限公司 Distributed cluster storage method, system, device and computer readable storage medium
CN111008699A (en) * 2019-12-05 2020-04-14 首都师范大学 Neural network data storage method and system based on automatic driving

Non-Patent Citations (1)

Title
XU Jiangfeng, TAN Yulong: "Research on HBase Configuration Parameter Optimization Based on Machine Learning", Computer Science, vol. 47, no. 6, 15 June 2020 (2020-06-15), pages 474-479 *

Cited By (6)

Publication number Priority date Publication date Assignee Title
CN113608677A (en) * 2021-06-28 2021-11-05 山东海量信息技术研究院 Parameter tuning method, system and device of distributed storage system
CN113608677B (en) * 2021-06-28 2024-08-27 山东海量信息技术研究院 Parameter tuning method, system and device for distributed storage system
CN113485649A (en) * 2021-07-23 2021-10-08 中国电信股份有限公司 Data storage method, system, device, medium and electronic equipment
CN113992703A (en) * 2021-09-29 2022-01-28 浪潮电子信息产业股份有限公司 Distributed storage system parameter optimization method and related components
CN113992703B (en) * 2021-09-29 2024-04-05 浪潮电子信息产业股份有限公司 Distributed storage system parameter tuning method and related components
CN118396038A (en) * 2024-06-24 2024-07-26 阿里云计算有限公司 Method and device for determining parameter tuning information and electronic equipment

Similar Documents

Publication Publication Date Title
CN112162966A (en) Distributed storage system parameter adjusting method and device, electronic equipment and medium
Eksombatchai et al. Pixie: A system for recommending 3+ billion items to 200+ million users in real-time
JP7194163B2 (en) Multimedia resource recommendation method, multimedia resource recommendation device, electronic device, non-transitory computer-readable storage medium, and computer program
WO2020207268A1 (en) Database performance adjustment method and apparatus, device, system, and storage medium
US8468146B2 (en) System and method for creating search index on cloud database
JP6718500B2 (en) Optimization of output efficiency in production system
Khan et al. DivIDE: efficient diversification for interactive data exploration
CN103514229A (en) Method and device used for processing database data in distributed database system
JP2015528611A (en) Dynamic data acquisition method and system
CN114895773B (en) Energy consumption optimization method, system and device for heterogeneous multi-core processor and storage medium
US20150356163A1 (en) Methods and systems for analyzing datasets
CN110941447A (en) Directional release method, device and medium of application program and electronic equipment
US20220035794A1 (en) Data retrieval via incremental updates to graph data structures
US8463784B1 (en) Improving data clustering stability
WO2018120726A1 (en) Data mining based modeling method, system, electronic device and storage medium
CN115168389A (en) Request processing method and device
CN110516164A (en) A kind of information recommendation method, device, equipment and storage medium
CN109657695A (en) A kind of fuzzy division clustering method and device based on definitive operation
TWI534704B (en) Processing method for time series and system thereof
EP3009900B1 (en) Dynamic recommendation of elements suitable for use in an engineering configuration
CN104580109A (en) Method and device for generating click verification code
CN111666302A (en) User ranking query method, device, equipment and storage medium
CN110059025A (en) A kind of method and system of cache prefetching
JP2015001884A (en) Candidate presentation device, candidate presentation method, and program
Lyu et al. Sapphire: Automatic configuration recommendation for distributed storage systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination