CN112162966A - Distributed storage system parameter adjusting method and device, electronic equipment and medium - Google Patents


Info

Publication number
CN112162966A
CN112162966A (application CN202010956173.6A)
Authority
CN
China
Prior art keywords
neural network
network model
internal state
parameter
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010956173.6A
Other languages
Chinese (zh)
Inventor
王团结
梁鑫辉
曹琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Inspur Data Technology Co Ltd
Original Assignee
Beijing Inspur Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Inspur Data Technology Co Ltd filed Critical Beijing Inspur Data Technology Co Ltd
Priority to CN202010956173.6A priority Critical patent/CN112162966A/en
Publication of CN112162966A publication Critical patent/CN112162966A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks


Abstract

The application discloses a method, apparatus, device and medium for adjusting parameters of a distributed storage system, wherein the method comprises the following steps: acquiring a neural network model created in advance from data obtained by interaction between a vdbench client and a storage cluster, the model being used for evaluating time delay; determining actual internal state data of the current storage cluster, wherein the actual internal state data comprises ganesha interface delay data; and searching, using the neural network model and according to the actual internal state data, for a target parameter that maximizes the output value, and adjusting the parameters of the current storage cluster according to the target parameter, wherein the output value is negatively correlated with the client time delay. The method and the device can create a neural network model for evaluating time delay, obtain the actual internal state data of the storage cluster, and search for the target parameter that maximizes the output value based on the created model, thereby obtaining a target parameter that keeps the time delay small and achieving automatic optimization of the storage cluster.

Description

Distributed storage system parameter adjusting method and device, electronic equipment and medium
Technical Field
The present application relates to the field of distributed storage technologies, and in particular, to a method and an apparatus for adjusting parameters of a distributed storage system, an electronic device, and a computer-readable storage medium.
Background
Current storage systems expose a large number of adjustable parameters, each with a very wide adjustable range, and modifying these parameters often affects system performance in different ways. The default parameter configuration is typically provided by the vendor, and this combination of parameter values is usually not an optimal set. Adjusting even a small subset of parameter values can improve the system's energy consumption and performance efficiency by several times or more.
Traditional parameter adjustment is completed by a system administrator based on professional knowledge and experience. As the complexity of storage systems continues to grow, manual parameter adjustment can no longer keep up with large-scale storage systems, and it further suffers from drawbacks such as the inability to monitor around the clock and high labor cost. Research into automatic optimization methods for storage systems is therefore urgently needed, and how to solve the above problems is of great concern to those skilled in the art.
Disclosure of Invention
The application aims to provide a distributed storage system parameter adjusting method and device, an electronic device and a computer readable storage medium, which can realize automatic optimization of a storage cluster.
In order to achieve the above object, the present application provides a method for adjusting parameters of a distributed storage system, including:
acquiring a neural network model which is created in advance according to data obtained by interaction between a vdbench client and a storage cluster, wherein the neural network model is used for evaluating the time delay;
determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of a client.
Optionally, the interface delay data includes: any one or combination of any several of interface name, number of received requests, number of processed requests, average processing delay, maximum processing delay, minimum processing delay, average request waiting delay, maximum request waiting delay and minimum request waiting delay.
Optionally, the creating process of the neural network model includes:
constructing a full-connection neural network as an initial neural network model, and initializing neural network parameters in the initial neural network model;
acquiring a first internal state of a storage cluster at the current moment, selecting a group of parameters in a parameter space, performing dynamic configuration, performing IO read-write operation on a storage system after the configuration is completed, and acquiring average time delay data of a client; the parameter space is generated by setting a minimum value and a maximum value corresponding to ganesha configuration data and a sampling interval;
querying a second internal state of the storage cluster after the IO read-write operation is finished, and generating a group of training data by combining the average delay data, the ganesha configuration data and the first internal state;
and training the initial neural network model by utilizing the multiple groups of training data to generate a final neural network model.
Optionally, the ganesha configuration data includes: any one or combination of any several items of the number of working threads, the total number of reading working threads, the number of reading working thread groups, the maximum number of cached files and the caching time.
Optionally, the searching, by using the neural network model and according to the actual internal state data, for the target parameter that maximizes the output value of the neural network model includes:
traversing a parameter space of ganesha configuration data, inputting the actual internal state data and each group of parameters into the neural network model, and acquiring corresponding output values;
and selecting a maximum value from all the output values, and acquiring a parameter corresponding to the maximum value as the target parameter.
Optionally, after searching, by using the neural network model, a target parameter that maximizes an output value of the neural network model according to the actual internal state data and adjusting a parameter of the current storage cluster according to the target parameter, the method further includes:
and recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
In order to achieve the above object, the present application provides a distributed storage system parameter adjusting apparatus, including:
the model acquisition module is used for acquiring a neural network model which is created in advance according to data obtained by interaction between the vdbench client and the storage cluster, and the neural network model is used for evaluating the time delay;
the state determining module is used for determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
and the parameter adjusting module is used for searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of the client.
Optionally, the method further includes:
and the model updating module is used for recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
To achieve the above object, the present application provides an electronic device including:
a memory for storing a computer program;
a processor for implementing the steps of any one of the aforementioned disclosed distributed storage system parameter adjustment methods when executing the computer program.
To achieve the above object, the present application provides a computer-readable storage medium, on which a computer program is stored, the computer program, when being executed by a processor, implementing the steps of any one of the distributed storage system parameter adjusting methods disclosed in the foregoing.
According to the scheme, the method for adjusting the parameters of the distributed storage system comprises the following steps: acquiring a neural network model which is created in advance according to data obtained by interaction between a vdbench client and a storage cluster, wherein the neural network model is used for evaluating the time delay; determining actual internal state data of the current storage cluster, wherein the actual internal state data comprises interface delay data of ganesha; and searching a target parameter that maximizes the output value of the neural network model according to the actual internal state data by using the neural network model, and adjusting the parameters of the current storage cluster according to the target parameter, wherein the output value is negatively correlated with the client time delay. In this way, a neural network model for evaluating client time delay can be created in advance; when parameters actually need to be adjusted, the actual internal state data of the current storage cluster can be obtained, and the target parameter that maximizes the output value of the neural network model can be searched for based on the created model. Since a larger output value represents a smaller client time delay, a target parameter yielding a smaller client time delay can be obtained to adjust the storage cluster. The optimal parameters can thus be obtained automatically without manual parameter adjustment, reducing labor cost and achieving automatic tuning of the storage cluster.
The application also discloses a distributed storage system parameter adjusting device, an electronic device and a computer readable storage medium, which can also achieve the technical effects.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for adjusting parameters of a distributed storage system according to an embodiment of the present application;
FIG. 2 is a flow chart of a neural network model creation process disclosed in an embodiment of the present application;
fig. 3 is a structural diagram of a distributed storage system parameter adjustment apparatus disclosed in an embodiment of the present application;
fig. 4 is a block diagram of an electronic device disclosed in an embodiment of the present application;
fig. 5 is a block diagram of another electronic device disclosed in the embodiments of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Traditional parameter adjustment is completed by a system administrator based on professional knowledge and experience. As the complexity of storage systems continues to grow, manual parameter adjustment can no longer keep up with large-scale storage systems, and it further suffers from drawbacks such as the inability to monitor around the clock and high labor cost. Research into automatic optimization methods for storage systems is therefore urgently needed, and how to solve the above problems is of great concern to those skilled in the art.
Therefore, the embodiment of the application discloses a distributed storage system parameter adjusting method, which can automatically acquire optimal parameters and realize automatic adjustment and optimization of a storage cluster.
Referring to fig. 1, a method for adjusting parameters of a distributed storage system disclosed in an embodiment of the present application includes:
s101: acquiring a neural network model which is created in advance according to data obtained by interaction between a vbdbench client and a storage cluster, wherein the neural network model is used for evaluating the time delay;
in the embodiment of the application, interaction can be carried out on the storage cluster through the vdbench client in advance, the internal state of the storage cluster, the average time delay of the client and ganesha configuration data are collected, a neural network model is created through offline learning, and the neural network model is used for evaluating the time delay of the client.
It should be noted that vdbench is an I/O workload generator used to verify data integrity and measure the performance of direct-attached and network-attached storage; thanks to its easy installation and good compatibility, it has become a common tool for testing and benchmarking. Ganesha refers specifically to NFS-Ganesha, an NFS (Network File System) file server that runs in user mode.
Specifically, the ganesha configuration data may include, but is not limited to: the number of working threads, the total number of reading working threads, the grouping number of reading working threads, the maximum number of cached files and the caching time.
S102: determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
in this step, the actual internal state data of the storage cluster which needs parameter tuning at present can be obtained. Specifically, the actual internal state data may include interface latency data of ganesha, and the interface latency data of ganesha is a set of statistical indexes, which may specifically include but is not limited to: interface name, number of requests received, number of requests processed, average delay of processing, maximum delay of processing, minimum delay of processing, average delay of request waiting, maximum delay of request waiting, and minimum delay of request waiting.
S103: searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of a client.
It should be noted that after the actual internal state data of the current storage cluster is obtained, the actual internal state data and the ganesha configuration data may be used as input data of the neural network model. Specifically, the parameter space of the ganesha configuration data may be traversed, the actual internal state data and each group of parameters input into the neural network model, and the corresponding output values obtained; the maximum value is then selected from all the output values, and the parameter corresponding to that maximum value is taken as the target parameter, so that the parameters of the current storage cluster are adjusted using the target parameter.
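The search described above is an exhaustive argmax over the parameter space. A minimal Python sketch (illustrative only; the model is abstracted as a callable, and all names are assumptions rather than taken from the patent):

```python
def find_target_parameters(model, state, parameter_space):
    """Exhaustively score every candidate parameter set with the model
    and return the set whose output value is maximum."""
    best_params, best_value = None, float("-inf")
    for params in parameter_space:
        # Model input: internal state vector concatenated with one parameter set.
        value = model(state + params)
        if value > best_value:
            best_value, best_params = value, params
    return best_params, best_value

# Toy usage: a stand-in "model" playing the role of the latency evaluator
# (a larger output value stands for a lower predicted client latency).
toy_model = lambda x: -sum(x)
best, score = find_target_parameters(toy_model, [1.0, 2.0], [[0.0], [3.0], [10.0]])
```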
As a preferred implementation manner, in the embodiment of the present application, after the target parameter is obtained by searching and the parameters of the current storage cluster are adjusted accordingly, the actual internal state data and the target parameter may further be recorded as the latest training parameters, so that the neural network model can subsequently be updated with them to improve its accuracy. For example, after the target parameters, that is, the optimal configuration parameters, are obtained, they may be configured online; after running for a period of time, the system feedback is obtained from the reward function together with the internal state of the cluster at the current time, and the data is stored and recorded so as to further update the neural network model.
According to the scheme, the method for adjusting the parameters of the distributed storage system comprises the following steps: acquiring a neural network model which is created in advance according to data obtained by interaction between a vdbench client and a storage cluster, wherein the neural network model is used for evaluating the time delay; determining actual internal state data of the current storage cluster, wherein the actual internal state data comprises interface delay data of ganesha; and searching a target parameter that maximizes the output value of the neural network model according to the actual internal state data by using the neural network model, and adjusting the parameters of the current storage cluster according to the target parameter, wherein the output value is negatively correlated with the client time delay. In this way, a neural network model for evaluating client time delay can be created in advance; when parameters actually need to be adjusted, the actual internal state data of the current storage cluster can be obtained, and the target parameter that maximizes the output value of the neural network model can be searched for based on the created model. Since a larger output value represents a smaller client time delay, a target parameter yielding a smaller client time delay can be obtained to adjust the storage cluster. The optimal parameters can thus be obtained automatically without manual parameter adjustment, reducing labor cost and achieving automatic tuning of the storage cluster.
The creation process of the neural network model is further described below. Referring to fig. 2, specifically:
s201: constructing a full-connection neural network as an initial neural network model, and initializing neural network parameters in the initial neural network model;
according to the embodiment of the application, a four-layer fully-connected neural network can be specifically constructed to serve as a neural network model, and parameters of the neural network are initialized. The input layer is the internal state of the cluster and the ganesha core parameter, the first hidden layer comprises M neurons, the activation function is Relu, the second hidden layer comprises N neurons, the activation function is Relu, and the output layer comprises one neuron. The values of M and N may be set according to actual conditions in practical implementation, and are not limited herein, for example, M may be set to 100, and N may be set to 10.
S202: acquiring a first internal state of a storage cluster at the current moment, selecting a group of parameters in a parameter space, performing dynamic configuration, performing IO read-write operation on a storage system after the configuration is completed, and acquiring average time delay data of a client;
in this step, a first internal state s of the storage cluster at the current time can be obtainedt. Specifically, the interface latency of ganesha can be taken as the internal state of the cluster, including but not limited to the interface name, the number of received requests, the number of processed requests, the processing average latency, the processing maximum latency, the processing minimum latency, the request waiting average latency, the request waiting maximum latency, the request waiting minimum latency, and the like, and can be regarded as a 24 × 8 two-dimensional table. Expanding the two-dimensional table according to rows to obtain the length of 192 dimensionsInternal state vector s of a cluster of degreest
It should be noted that when the vdbench client runs with parameters a_t in cluster state s_t, the lower the average latency of the IO read and write operations, the better, so the reward function is r_t = -latency_t. Ganesha has five core configuration items, and the parameter space corresponding to the ganesha configuration data can be generated by setting the minimum value, the maximum value and the sampling interval for each item. For example, one specific parameter space may be as shown in Table 1 below:
TABLE 1

Parameter name          Parameter meaning                      [minimum, maximum, sampling interval]
Nb_Worker               Number of worker threads               [1, 128, 10]
nb_worker_req           Total number of read worker threads    [1, 128, 10]
nb_worker_queue         Number of read-worker-thread groups    [1, 100, 10]
Entries_HWMark          Maximum number of cached files         [100, 200000, 1000]
Attr_Expiration_Time    Cache time (in seconds)                [1, 100, 10]
For example, a specific parameter vector is: a_t = [10, 10, 5, 100, 8].
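Generating the parameter space from the [minimum, maximum, sampling interval] triples of Table 1 can be sketched as below; the dictionary keys mirror the table, while the function names are illustrative assumptions (note that an example vector like a_t = [10, 10, 5, 100, 8] would come from a differently sampled grid):

```python
from itertools import product

# [minimum, maximum, sampling interval] per parameter, mirroring Table 1.
PARAM_RANGES = {
    "Nb_Worker":            (1, 128, 10),
    "nb_worker_req":        (1, 128, 10),
    "nb_worker_queue":      (1, 100, 10),
    "Entries_HWMark":       (100, 200000, 1000),
    "Attr_Expiration_Time": (1, 100, 10),
}

def sample_axis(lo, hi, step):
    """All sampled values for one parameter: lo, lo+step, ... up to hi."""
    return list(range(lo, hi + 1, step))

def parameter_space(ranges):
    """Lazily yield every combination (Cartesian product) of sampled values."""
    axes = [sample_axis(*r) for r in ranges.values()]
    for combo in product(*axes):
        yield list(combo)
```

The full product contains hundreds of thousands of combinations here, which is why the generator yields lazily instead of materializing the whole space.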
It should be noted that, in the offline learning process, a group of parameters needs to be selected from the parameter space A. Specifically, the configuration parameters a_t are obtained using the ε-greedy method: with probability ε, a value close to 1 such as 0.99, the internal state of the current cluster together with each set of parameters in the parameter space is input into the neural network, and the set of parameters that maximizes the network output is taken; with the smaller probability 1 − ε, a set of parameters is randomly selected from the parameter space.
Specifically, a set of parameters a_t may be obtained from the internal state of the current cluster by the method above and configured dynamically; a randomly sampled multi-application-scenario vdbench script is then executed to perform IO read and write operations on the storage system for a set running time, for example 2 minutes. After the IO finishes, the feedback of the system can be obtained from the reward function.
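The ε-greedy selection described above can be sketched as follows (illustrative; the model is an arbitrary callable scoring a state-plus-parameters vector, and the function name is an assumption):

```python
import random

def epsilon_greedy(model, state, parameter_space, epsilon=0.99):
    """With probability epsilon, exploit: pick the parameter set the model
    scores highest; with probability 1 - epsilon, explore: pick at random."""
    if random.random() >= epsilon:
        return random.choice(parameter_space)                    # explore
    return max(parameter_space, key=lambda p: model(state + p))  # exploit
```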
S203: querying a second internal state of the storage cluster after the IO read-write operation is finished, and generating a group of training data by combining the average delay data, the ganesha configuration data and the first internal state;
in this step, the second internal state s of the storage cluster after the IO read-write operation is finished can be queriedt+1. In combination with the average delay data, ganesha configuration data, the first internal state and the second internal state to generate a set of training data, e.g.,(s)t,at,rt,st+1) And storing the data into a cache list.
S204: and training the initial neural network model by utilizing the multiple groups of training data to generate a final neural network model.
It can be understood that multiple sets of training data can be randomly sampled from the cache list to train the neural network model and generate the required model. For a sample with input s_i, the target value y_i is given by the current immediate feedback r_i and the evaluation of the target neural network model Q̂, i.e. y_i = r_i + γ·max_a Q̂(s_{i+1}, a). According to the input s_i and the target value y_i, the neural network model Q is updated with a gradient descent algorithm. Every C updates of the Q network, the target network is updated once, i.e. Q̂ ← Q. C may be set according to the actual implementation scenario and is not limited by this embodiment; for example, C may be set to 10, that is, the target network Q̂ is refreshed after every ten updates of Q.
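The target computation and periodic target-network synchronization follow the standard deep-Q-learning recipe. A hedged sketch (γ = 0.9 is an assumed discount factor not stated here, the function names are illustrative, and the actual gradient step is elided):

```python
import random

GAMMA, C = 0.9, 10  # assumed discount factor; target-sync period from the text

def dqn_targets(batch, q_target, actions):
    """Target values y_i = r_i + GAMMA * max_a Q_hat(s_{i+1}, a) for a
    batch of transitions (s, a, r, s_next)."""
    return [r + GAMMA * max(q_target(s_next + a2) for a2 in actions)
            for (_s, _a, r, s_next) in batch]

def train(q, sync_target, cache, steps, batch_size=4):
    """Skeleton of the update loop: sample a batch, take an (elided)
    gradient step on Q toward the targets, and copy Q into Q_hat every C steps."""
    for step in range(1, steps + 1):
        batch = random.sample(cache, min(len(cache), batch_size))
        # ... gradient-descent update of q toward dqn_targets(batch, ...) ...
        if step % C == 0:
            sync_target()  # Q_hat <- Q
```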
In the following, a distributed storage system parameter adjusting apparatus provided in an embodiment of the present application is introduced, and a distributed storage system parameter adjusting apparatus described below and a distributed storage system parameter adjusting method described above may refer to each other.
Referring to fig. 3, a distributed storage system parameter adjustment apparatus provided in an embodiment of the present application includes:
the model acquisition module 301 is configured to acquire a neural network model created in advance according to data obtained by interaction between the vdbench client and the storage cluster, where the neural network model is used to evaluate the time delay;
a state determination module 302, configured to determine actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
a parameter adjusting module 303, configured to search, by using the neural network model, a target parameter that maximizes an output value of the neural network model according to the actual internal state data, and adjust a parameter of the current storage cluster according to the target parameter, where the output value is negatively related to a client time delay.
For the specific implementation process of the modules 301 to 303, reference may be made to the corresponding content disclosed in the foregoing embodiments, and details are not repeated here.
On the basis of the foregoing embodiment, as a preferred implementation, the distributed storage system parameter adjustment apparatus provided in the embodiment of the present application may further include:
and the model updating module is used for recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
The present application further provides an electronic device, and as shown in fig. 4, an electronic device provided in an embodiment of the present application includes:
a memory 100 for storing a computer program;
the processor 200, when executing the computer program, may implement the steps provided by the above embodiments.
Specifically, the memory 100 includes a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system and computer-readable instructions, and the internal memory provides an environment for running them. The processor 200 may in some embodiments be a central processing unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip; it provides computing and control capability for the electronic device, and when executing the computer program stored in the memory 100 it may implement the distributed storage system parameter adjustment method disclosed in any of the foregoing embodiments.
On the basis of the above embodiment, as a preferred implementation, referring to fig. 5, the electronic device further includes:
and an input interface 300 connected to the processor 200, for acquiring computer programs, parameters and instructions imported from the outside, and storing the computer programs, parameters and instructions into the memory 100 under the control of the processor 200. The input interface 300 may be connected to an input device for receiving parameters or instructions manually input by a user. The input device may be a touch layer covered on a display screen, or a button, a track ball or a touch pad arranged on a terminal shell, or a keyboard, a touch pad or a mouse, etc.
And a display unit 400 connected to the processor 200 for displaying data processed by the processor 200 and for displaying a visualized user interface. The display unit 400 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like.
a network port 500, connected to the processor 200, for establishing communication connections with external terminal devices. The communication technology adopted by the communication connection may be wired or wireless, such as Mobile High-definition Link (MHL), Universal Serial Bus (USB), High Definition Multimedia Interface (HDMI), wireless fidelity (WiFi), Bluetooth, Bluetooth Low Energy, or IEEE 802.11s-based communication.
While FIG. 5 shows only an electronic device having components 100 to 500, those skilled in the art will appreciate that the configuration shown in FIG. 5 does not limit the electronic device: it may include fewer or more components than shown, combine certain components, or arrange the components differently.
The present application also provides a computer-readable storage medium, which may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. The storage medium stores a computer program which, when executed by a processor, implements the distributed storage system parameter adjustment method disclosed in any of the foregoing embodiments.
According to the method and the device of the present application, a neural network model for evaluating client time delay can be created in advance. When parameters actually need to be adjusted, the actual internal state data of the current storage cluster is obtained, and the created neural network model is used to search for the target parameter that maximizes its output value. Because a larger output value represents a smaller client time delay, the target parameter so obtained reduces client time delay when applied to the storage cluster. Optimal parameters are thus obtained automatically and manual parameter tuning is unnecessary, which reduces labor cost and achieves automatic adjustment of the storage cluster.
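The tuning loop summarized above can be sketched in a few lines; the model, state vector, and candidate parameter sets below are hypothetical stand-ins for illustration, not the patent's actual implementation:

```python
import itertools

def tune(model, state, param_space):
    """Exhaustively score every candidate parameter set for the given
    internal state and return the one with the maximum model output
    (a higher output represents a lower predicted client latency)."""
    best_params, best_score = None, float("-inf")
    for params in param_space:
        score = model(list(state) + list(params))
        if score > best_score:
            best_params, best_score = params, score
    return best_params

# Hypothetical stand-in for the trained network: penalizes distance from 4
model = lambda x: -sum((v - 4) ** 2 for v in x)
state = [4.0]                                         # current internal state
space = list(itertools.product([2, 4, 8], [16, 32]))  # candidate settings
print(tune(model, state, space))                      # → (4, 16)
```

The exhaustive scan mirrors the traversal of the parameter space described later in claim 5; for large spaces the same idea applies with batched model evaluation.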
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the system disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief; for relevant details, refer to the description of the method. It should be noted that those skilled in the art can make several improvements and modifications to the present application without departing from its principle, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in this specification, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," and any variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

Claims (10)

1. A distributed storage system parameter adjustment method is characterized by comprising the following steps:
acquiring a neural network model created in advance according to data obtained through interaction between a vbdbench client and a storage cluster, the neural network model being used to evaluate client time delay;
determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of a client.
2. The distributed storage system parameter adjustment method of claim 1, wherein the interface latency data comprises: any one or combination of any several of interface name, number of received requests, number of processed requests, average processing delay, maximum processing delay, minimum processing delay, average request waiting delay, maximum request waiting delay and minimum request waiting delay.
3. The distributed storage system parameter adjustment method according to claim 1, wherein the creation process of the neural network model includes:
constructing a full-connection neural network as an initial neural network model, and initializing neural network parameters in the initial neural network model;
acquiring a first internal state of a storage cluster at the current moment, selecting a group of parameters in a parameter space, performing dynamic configuration, performing IO read-write operation on a storage system after the configuration is completed, and acquiring average time delay data of a client; the parameter space is generated by setting a minimum value and a maximum value corresponding to ganesha configuration data and a sampling interval;
querying a second internal state of the storage cluster after the IO read-write operation is finished, and generating a group of training data by combining the average delay data, the ganesha configuration data and the first internal state;
and training the initial neural network model by utilizing the multiple groups of training data to generate a final neural network model.
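The data-collection steps of claim 3 above can be sketched as follows; the cluster interface, parameter ranges, and random sampling policy are hypothetical illustrations of the claimed procedure, not the patent's implementation:

```python
import itertools
import random

def build_param_space(ranges):
    """Generate the parameter space from per-item (min, max, step)
    triples, as the claim describes for ganesha configuration data."""
    axes = [range(lo, hi + 1, step) for lo, hi, step in ranges]
    return list(itertools.product(*axes))

def collect_training_data(cluster, param_space, rounds):
    """Each round: read the first internal state, apply one candidate
    configuration, run IO and measure average client latency, then read
    the second internal state; the tuple forms one training sample."""
    samples = []
    for _ in range(rounds):
        first_state = cluster.read_internal_state()
        params = random.choice(param_space)      # one set from the space
        cluster.apply(params)                    # dynamic reconfiguration
        latency = cluster.run_io_benchmark()     # average client latency
        second_state = cluster.read_internal_state()
        samples.append((first_state, params, latency, second_state))
    return samples
```

The accumulated samples would then be used to train the fully connected network, with the model learning to map (state, parameters) to a score that decreases with measured latency.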
4. The distributed storage system parameter adjustment method according to claim 3, wherein the ganesha configuration data includes: any one or combination of any several items of the number of working threads, the total number of reading working threads, the number of reading working thread groups, the maximum number of cached files and the caching time.
5. The distributed storage system parameter adjustment method according to any one of claims 1 to 4, wherein the searching, by using the neural network model, for the target parameter that maximizes the output value of the neural network model from the actual internal state data includes:
traversing a parameter space of ganesha configuration data, inputting the actual internal state data and each group of parameters into the neural network model, and acquiring corresponding output values;
and selecting a maximum value from all the output values, and acquiring a parameter corresponding to the maximum value as the target parameter.
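The two-step traversal in claim 5 can also be batched so that the whole parameter space is scored in one forward pass; the linear model below is a hypothetical stand-in for the trained network, and the state and grid values are illustrative:

```python
import numpy as np

def best_params(model_fn, state, param_grid):
    """Score every (state, params) row in one batch and return the
    parameter set with the maximum model output."""
    grid = np.asarray(param_grid, dtype=float)
    states = np.tile(np.asarray(state, dtype=float), (len(grid), 1))
    inputs = np.hstack([states, grid])   # one row per candidate
    scores = model_fn(inputs)            # batch forward pass
    return param_grid[int(np.argmax(scores))]

# Hypothetical stand-in for the trained network: prefers a small last column
model = lambda x: -x[:, -1]
print(best_params(model, [0.3, 0.7], [(4, 10), (8, 5), (16, 20)]))  # → (8, 5)
```

Batching the grid keeps the search cheap relative to calling the model once per candidate, which matters when the sampled parameter space is large.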
6. The method according to claim 5, further comprising, after the searching, by using the neural network model, for a target parameter that maximizes the output value of the neural network model according to the actual internal state data, and the adjusting of the parameter of the current storage cluster according to the target parameter:
and recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
7. A distributed storage system parameter adjustment apparatus, comprising:
the model acquisition module is used for acquiring a neural network model which is created in advance according to data obtained by interaction between the vbdbench client and the storage cluster, and the neural network model is used for evaluating the time delay;
the state determining module is used for determining actual internal state data of the current storage cluster; wherein, the actual internal state data comprises interface delay data of ganesha;
and the parameter adjusting module is used for searching a target parameter which enables the output value of the neural network model to be maximum according to the actual internal state data by using the neural network model, and adjusting the parameter of the current storage cluster according to the target parameter, wherein the output value is in negative correlation with the time delay of the client.
8. The distributed storage system parameter adjustment apparatus of claim 7, further comprising:
and the model updating module is used for recording the actual internal state data and the target parameters as latest training parameters so as to update the neural network model by using the latest training parameters.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method of adjusting parameters of a distributed storage system according to any one of claims 1 to 6 when executing said computer program.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method for distributed storage system parameter adjustment according to any one of claims 1 to 6.
CN202010956173.6A 2020-09-11 2020-09-11 Distributed storage system parameter adjusting method and device, electronic equipment and medium Pending CN112162966A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010956173.6A CN112162966A (en) 2020-09-11 2020-09-11 Distributed storage system parameter adjusting method and device, electronic equipment and medium


Publications (1)

Publication Number Publication Date
CN112162966A true CN112162966A (en) 2021-01-01

Family

ID=73858948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010956173.6A Pending CN112162966A (en) 2020-09-11 2020-09-11 Distributed storage system parameter adjusting method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN112162966A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113485649A (en) * 2021-07-23 2021-10-08 中国电信股份有限公司 Data storage method, system, device, medium and electronic equipment
CN113608677A (en) * 2021-06-28 2021-11-05 山东海量信息技术研究院 Parameter tuning method, system and device of distributed storage system
CN113992703A (en) * 2021-09-29 2022-01-28 浪潮电子信息产业股份有限公司 Distributed storage system parameter optimization method and related components
CN118396038A (en) * 2024-06-24 2024-07-26 阿里云计算有限公司 Method and device for determining parameter tuning information and electronic equipment

Citations (6)

Publication number Priority date Publication date Assignee Title
US20150142801A1 (en) * 2013-11-15 2015-05-21 International Business Machines Corporation System and Method for Intelligently Categorizing Data to Delete Specified Amounts of Data Based on Selected Data Characteristics
US20150286200A1 (en) * 2012-07-31 2015-10-08 Caterva Gmbh Device for an Optimized Operation of a Local Storage System in an Electrical Energy Supply Grid with Distributed Generators, Distributed Storage Systems and Loads
CN110134697A (en) * 2019-05-22 2019-08-16 南京大学 A kind of parameter automated tuning method, apparatus, system towards key-value pair storage engines
CN110515724A (en) * 2019-08-13 2019-11-29 新华三大数据技术有限公司 Resource allocation method, device, monitor and machine readable storage medium
CN110650208A (en) * 2019-09-29 2020-01-03 北京浪潮数据技术有限公司 Distributed cluster storage method, system, device and computer readable storage medium
CN111008699A (en) * 2019-12-05 2020-04-14 首都师范大学 Neural network data storage method and system based on automatic driving

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
US20150286200A1 (en) * 2012-07-31 2015-10-08 Caterva Gmbh Device for an Optimized Operation of a Local Storage System in an Electrical Energy Supply Grid with Distributed Generators, Distributed Storage Systems and Loads
US20150142801A1 (en) * 2013-11-15 2015-05-21 International Business Machines Corporation System and Method for Intelligently Categorizing Data to Delete Specified Amounts of Data Based on Selected Data Characteristics
CN110134697A (en) * 2019-05-22 2019-08-16 南京大学 A kind of parameter automated tuning method, apparatus, system towards key-value pair storage engines
CN110515724A (en) * 2019-08-13 2019-11-29 新华三大数据技术有限公司 Resource allocation method, device, monitor and machine readable storage medium
CN110650208A (en) * 2019-09-29 2020-01-03 北京浪潮数据技术有限公司 Distributed cluster storage method, system, device and computer readable storage medium
CN111008699A (en) * 2019-12-05 2020-04-14 首都师范大学 Neural network data storage method and system based on automatic driving

Non-Patent Citations (1)

Title
XU Jiangfeng, TAN Yulong: "Research on HBase Configuration Parameter Optimization Based on Machine Learning", Computer Science, vol. 47, no. 6, 15 June 2020 (2020-06-15), pages 474-479 *

Cited By (6)

Publication number Priority date Publication date Assignee Title
CN113608677A (en) * 2021-06-28 2021-11-05 山东海量信息技术研究院 Parameter tuning method, system and device of distributed storage system
CN113608677B (en) * 2021-06-28 2024-08-27 山东海量信息技术研究院 Parameter tuning method, system and device for distributed storage system
CN113485649A (en) * 2021-07-23 2021-10-08 中国电信股份有限公司 Data storage method, system, device, medium and electronic equipment
CN113992703A (en) * 2021-09-29 2022-01-28 浪潮电子信息产业股份有限公司 Distributed storage system parameter optimization method and related components
CN113992703B (en) * 2021-09-29 2024-04-05 浪潮电子信息产业股份有限公司 Distributed storage system parameter tuning method and related components
CN118396038A (en) * 2024-06-24 2024-07-26 阿里云计算有限公司 Method and device for determining parameter tuning information and electronic equipment

Similar Documents

Publication Publication Date Title
CN112162966A (en) Distributed storage system parameter adjusting method and device, electronic equipment and medium
Eksombatchai et al. Pixie: A system for recommending 3+ billion items to 200+ million users in real-time
JP7194163B2 (en) Multimedia resource recommendation method, multimedia resource recommendation device, electronic device, non-transitory computer-readable storage medium, and computer program
WO2020207268A1 (en) Database performance adjustment method and apparatus, device, system, and storage medium
US8468146B2 (en) System and method for creating search index on cloud database
JP6718500B2 (en) Optimization of output efficiency in production system
Khan et al. DivIDE: efficient diversification for interactive data exploration
CN103514229A (en) Method and device used for processing database data in distributed database system
JP2015528611A (en) Dynamic data acquisition method and system
CN114895773B (en) Energy consumption optimization method, system and device for heterogeneous multi-core processor and storage medium
US20150356163A1 (en) Methods and systems for analyzing datasets
CN110941447A (en) Directional release method, device and medium of application program and electronic equipment
US20220035794A1 (en) Data retrieval via incremental updates to graph data structures
US8463784B1 (en) Improving data clustering stability
WO2018120726A1 (en) Data mining based modeling method, system, electronic device and storage medium
CN115168389A (en) Request processing method and device
CN110516164A (en) A kind of information recommendation method, device, equipment and storage medium
CN109657695A (en) A kind of fuzzy division clustering method and device based on definitive operation
TWI534704B (en) Processing method for time series and system thereof
EP3009900B1 (en) Dynamic recommendation of elements suitable for use in an engineering configuration
CN104580109A (en) Method and device for generating click verification code
CN111666302A (en) User ranking query method, device, equipment and storage medium
CN110059025A (en) A kind of method and system of cache prefetching
JP2015001884A (en) Candidate presentation device, candidate presentation method, and program
Lyu et al. Sapphire: Automatic configuration recommendation for distributed storage systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination