CN109885469B - Capacity expansion method, prediction model creation method, device, equipment and medium - Google Patents

Publication number
CN109885469B
Authority
CN
China
Prior art keywords
capacity
model
prediction model
information
training sample
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910146524.4A
Other languages
Chinese (zh)
Other versions
CN109885469A (en)
Inventor
崔晓飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Application filed by Sangfor Technologies Co Ltd
Priority to CN201910146524.4A
Publication of CN109885469A
Application granted
Publication of CN109885469B

Abstract

The application discloses a capacity expansion method, a prediction model creation method, an apparatus, a device, and a medium. The capacity expansion method comprises the following steps: acquiring a first prediction model, where the first prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, with a training sample constructed from time series data of the historical capacity consumption of a storage system; predicting, by using the first prediction model, the capacity information to be expanded corresponding to the expected value of the capacity consumption duration; and performing a corresponding capacity expansion operation on the storage system according to the capacity information to be expanded. Because the machine-learning-based prediction model is trained on time series data of historical capacity consumption, and such data objectively reflects the actual trend of capacity consumption, the capacity information to be expanded predicted by the first prediction model better matches the user's actual day-to-day capacity requirements, which improves the prediction accuracy of the expanded capacity.

Description

Capacity expansion method, prediction model creation method, device, equipment and medium
Technical Field
The present application relates to the field of storage technologies, and in particular, to a capacity expansion method, a prediction model creation method, an apparatus, a device, and a medium.
Background
Storage products inevitably need capacity expansion over time. The difficulty lies in deciding how much capacity to add so that it meets the usage requirements of the following years, or even more than ten years. Conventionally, a rough estimate is made from historical usage, and the estimate is typically a proportional one.
However, the capacity to be expanded determined by the above-mentioned proportional estimation method is not very accurate, and there often occurs a problem that the estimated capacity to be expanded is significantly excessive or insufficient.
Disclosure of Invention
In view of the above, an object of the present application is to provide a capacity expansion method, a prediction model creation method, an apparatus, a device, and a medium, which can effectively improve the prediction accuracy of the expanded capacity. The specific scheme is as follows:
in a first aspect, the present application discloses a capacity expansion method, including:
acquiring a first prediction model; the first prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical capacity consumption of a storage system;
predicting the capacity information to be expanded corresponding to the expected value of the capacity consumption duration by utilizing the first prediction model;
and carrying out corresponding capacity expansion operation on the storage system according to the capacity information to be expanded.
Optionally, before the obtaining the first prediction model, the method further includes:
acquiring first time series data of historical capacity consumption of the storage system;
constructing a first training sample by using the time slice information in the first time sequence data and the corresponding historical capacity consumption;
and training a model to be trained, which is constructed based on a machine learning algorithm, by using the first training sample to obtain the first prediction model.
Optionally, the training a model to be trained, which is constructed based on a machine learning algorithm, by using the first training sample to obtain the first prediction model includes:
and training a model to be trained constructed based on a machine learning algorithm by using the time slice information in the first training sample as model input side data and using the historical capacity consumption in the first training sample as model output side data to obtain the first prediction model.
Optionally, the training a model to be trained, which is constructed based on a machine learning algorithm, by using the first training sample to obtain the first prediction model includes:
and training a model to be trained constructed based on a machine learning algorithm by taking the historical capacity consumption in the first training sample as model input side data and taking time segment information in the first training sample as model output side data to obtain the first prediction model.
Optionally, after predicting the to-be-expanded capacity information corresponding to the expected value of the capacity consumption duration and before performing the corresponding capacity expansion operation on the storage system according to the to-be-expanded capacity information, the method further includes:
acquiring a second prediction model; the second prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a second training sample, and the second training sample comprises storage system hardware information, historical expansion capacity information and corresponding expansion duration information;
and acquiring current hardware information of the storage system, and inputting the capacity information to be expanded and the current hardware information into the second prediction model to obtain an expansion duration prediction value output by the second prediction model.
Optionally, before obtaining the second prediction model, the method further includes:
obtaining the second training sample;
and training a model to be trained constructed based on a machine learning algorithm by using the hardware information of the storage system and the historical expansion capacity information in the second training sample as model input side data and using the expansion duration information in the second training sample as output side data to obtain the second prediction model.
Optionally, before performing corresponding capacity expansion operation on the storage system according to the capacity information to be expanded, the method further includes:
acquiring a third prediction model; the third prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical read-write throughput of the storage system;
inputting the expected value of the capacity expansion time period to the third prediction model to obtain the predicted read-write throughput, output by the third prediction model, at each time point within the expected capacity expansion time period;
establishing a corresponding graph according to the correspondence between the expected value of the capacity expansion time period and the predicted read-write throughput;
selecting from the graph, as a target curve segment, the curve segment whose time span equals the predicted expansion duration and whose area enclosed with the time axis is smallest;
and determining the time period corresponding to the target curve segment as the optimal expansion time period.
Optionally, before the obtaining the third prediction model, the method further includes:
acquiring second time sequence data of historical read-write throughput of the storage system;
constructing a third training sample by using the time slice information in the second time sequence data and the corresponding historical read-write throughput;
and training a model to be trained constructed based on a machine learning algorithm by using the time slice information in the third training sample as model input side data and the historical read-write throughput in the third training sample as model output side data to obtain the third prediction model.
Optionally, the performing, according to the to-be-expanded capacity information, a corresponding expansion operation on the storage system includes:
and performing corresponding capacity expansion operation on the storage system according to the information of the capacity to be expanded and the optimal capacity expansion time period.
In a second aspect, the present application discloses a method for creating a prediction model, comprising:
acquiring first time series data of historical capacity consumption of a storage system;
constructing a first training sample by using the time slice information in the first time sequence data and the corresponding historical capacity consumption;
and training a model to be trained, which is constructed based on a machine learning algorithm, by using the first training sample to obtain a first prediction model.
Optionally, the method for creating a prediction model further includes:
obtaining a second training sample; the second training sample comprises storage system hardware information, historical expansion capacity information and corresponding expansion duration information;
and training a model to be trained constructed based on a machine learning algorithm by taking the hardware information of the storage system and the historical expansion capacity information in the second training sample as model input side data and taking the expansion duration information in the second training sample as output side data to obtain a second prediction model.
Optionally, the method for creating a prediction model further includes:
acquiring second time sequence data of historical read-write throughput of the storage system;
constructing a third training sample by using the time slice information in the second time sequence data and the corresponding historical read-write throughput;
and training a model to be trained constructed based on a machine learning algorithm by using the time slice information in the third training sample as model input side data and the historical read-write throughput in the third training sample as model output side data to obtain a third prediction model.
In a third aspect, the present application discloses a capacity expansion device, including:
the first model obtaining module is used for obtaining a first prediction model; the first prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical capacity consumption of a storage system;
the first prediction module is used for predicting the capacity information to be expanded corresponding to the expected value of the capacity consumption duration by utilizing the first prediction model;
and the capacity expansion module is used for carrying out corresponding capacity expansion operation on the storage system according to the capacity information to be expanded.
In a fourth aspect, the present application discloses a prediction model creation apparatus, including:
the first data acquisition module is used for acquiring first time sequence data of historical capacity consumption of the storage system;
the first sample construction module is used for constructing a first training sample by utilizing time slice information in the first time series data and corresponding historical capacity consumption;
and the first model training module is used for training a model to be trained, which is constructed based on a machine learning algorithm, by using the first training sample to obtain a first prediction model.
In a fifth aspect, the present application discloses a capacity expansion device, including:
a memory for storing a computer program;
and the processor is used for executing the computer program to realize the capacity expansion method disclosed by the foregoing.
In a sixth aspect, the present application discloses a predictive model creating apparatus, including:
a memory for storing a computer program;
a processor for executing the computer program to implement the predictive model creation method disclosed in the foregoing.
In a seventh aspect, the present application discloses a computer readable storage medium for storing a computer program; wherein, the computer program realizes the capacity expansion method disclosed in the foregoing when being executed by a processor.
In an eighth aspect, the present application discloses a computer readable storage medium for storing a computer program; wherein the computer program, when executed by a processor, implements the predictive model creation method disclosed in the foregoing.
Therefore, the present application first obtains a first prediction model, which is obtained by machine learning training on training samples constructed from time series data of the historical capacity consumption of a storage system, and then uses the first prediction model to predict the capacity information to be expanded corresponding to the expected value of the capacity consumption duration. The prediction model is trained on time series data of historical capacity consumption, and such data objectively reflects the actual trend of the storage system's capacity consumption, which is generally nonlinear. Unlike the existing approach of estimating from a simple proportional (linear) trend, the prediction therefore takes the actual nonlinear trend of capacity consumption into account. As a result, the capacity information to be expanded predicted by the first prediction model better matches the user's actual day-to-day capacity requirements, avoids capacity that is obviously excessive or insufficient, and improves the prediction accuracy of the expanded capacity.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a flow chart of a capacity expansion method disclosed in the present application;
FIG. 2 is a sub-flowchart of a capacity expansion method disclosed in the present application;
FIG. 3 is a sub-flowchart of a capacity expansion method disclosed herein;
FIG. 4 is a sub-flowchart of a capacity expansion method disclosed herein;
FIG. 5 is a sub-flow diagram of a capacity expansion method disclosed herein;
FIG. 6 is a schematic structural diagram of a capacity expansion device disclosed in the present application;
FIG. 7 is a schematic diagram of a partial structure of a capacity expansion device disclosed in the present application;
FIG. 8 is a schematic diagram of a partial structure of a capacity expansion device disclosed in the present application;
FIG. 9 is a schematic diagram of a partial structure of a capacity expansion device disclosed in the present application;
FIG. 10 is a schematic diagram of a partial structure of a capacity expansion device disclosed in the present application;
FIG. 11 is a schematic diagram of a partial structure of a capacity expansion device disclosed in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
At present, the difficulty in expanding a storage product is deciding how much capacity to add so that it meets the usage requirements of the following years, or even more than ten years. The conventional approach is proportional estimation. However, the capacity to be expanded determined in this way is not very accurate, and the estimate is often significantly excessive or insufficient. The present application therefore provides a capacity expansion scheme in which the predicted capacity information to be expanded better matches the user's actual day-to-day capacity requirements, avoiding capacity that is obviously excessive or insufficient and improving the prediction accuracy of the expanded capacity.
Referring to fig. 1, an embodiment of the present application discloses a capacity expansion method, including:
step S11: acquiring a first prediction model; the first prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical capacity consumption of a storage system.
It is understood that the first prediction model is trained in advance, before step S11, so that step S11 only needs to call the trained model directly. In this embodiment, the first prediction model is trained on time series data of the historical capacity consumption of the storage system. This data describes how the historical capacity consumption of the storage system changes with usage time, for example over the last 5 years or the last 10 years, and objectively reflects the actual trend of the storage system's capacity consumption.
In this embodiment, the storage system may be specifically a storage system constructed based on a storage cluster. In addition, the machine learning algorithm in the present embodiment may be selected according to actual situations, and is not specifically limited herein.
Step S12: and predicting the capacity information to be expanded corresponding to the expected value of the capacity consumption duration by utilizing the first prediction model.
Step S13: and carrying out corresponding capacity expansion operation on the storage system according to the capacity information to be expanded.
Therefore, the embodiment of the application trains the machine-learning-based prediction model on time series data of historical capacity consumption. Because this data objectively reflects the actual trend of the storage system's capacity consumption, the embodiment can take the actual nonlinear trend of capacity consumption into account during prediction, unlike the existing estimation method based on a simple proportional (linear) trend. As a result, the capacity information to be expanded predicted by the first prediction model better matches the user's actual day-to-day capacity requirements, avoids capacity that is obviously excessive or insufficient, improves the prediction accuracy of the expanded capacity, and is more reliable.
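Steps S11 to S13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: `plan_expansion`, `toy_model`, and the unit choices (days, GB) are hypothetical, and treating the capacity to be expanded as the shortfall between predicted consumption and current capacity is one possible reading of step S12.

```python
from typing import Callable


def plan_expansion(
    predict_consumption: Callable[[float], float],
    expected_duration_days: float,
    current_capacity_gb: float,
) -> float:
    """Return the capacity (GB) to add so the system lasts the expected duration."""
    # Step S12: predicted cumulative consumption at the end of the expected duration.
    needed_gb = predict_consumption(expected_duration_days)
    # Assumption: the capacity to be expanded is the shortfall, never negative.
    return max(0.0, needed_gb - current_capacity_gb)


# Step S11: a hypothetical stand-in for a previously trained first prediction model
# (time in days as input, consumed capacity in GB as output).
def toy_model(day: float) -> float:
    return 100.0 + 0.5 * day + 0.001 * day ** 2


to_add = plan_expansion(toy_model, expected_duration_days=1000, current_capacity_gb=1000.0)
# Step S13 would pass `to_add` to the storage system's own expansion procedure.
print(f"capacity to add: {to_add:.1f} GB")
```

The prediction model is passed in as a plain callable so that any trained model with a time-in, capacity-out interface could be substituted.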
On the basis of the foregoing embodiments, the embodiments of the present application further explain and optimize the technical solutions. Specifically, the method comprises the following steps:
referring to fig. 2, before obtaining the first prediction model, the method further includes:
step S21: first time series data of historical capacity consumption of the storage system is obtained.
It is understood that, in step S21, the first time series data of the historical capacity consumption amount of the storage system in different capacity consumption time periods may be obtained, and accordingly, multiple sets of the first time series data are obtained. The different capacity consumption periods may or may not overlap each other, and the greater the number of sets of the first time-series data obtained in step S21, the better the prediction accuracy of the first prediction model.
In this embodiment, the first time series data may be obtained with open source tools such as Diamond and Zabbix, or collected directly with iostat. For a software-defined storage system such as CEPH, the data may also be obtained directly from the corresponding software.
Step S22: and constructing a first training sample by using the time slice information in the first time sequence data and the corresponding historical capacity consumption.
It should be noted that the time slice in the first time series data may be the entire time period covered by the first time series data, or a randomly selected sub-segment of that period.
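A minimal sketch of step S22 follows, assuming each point of the time series is an (elapsed days, consumed GB) pair. The function name `build_samples`, the tuple layout, and the seeding are hypothetical choices made for illustration. The sketch covers both cases mentioned above: using the whole covered period, or a randomly chosen contiguous sub-segment of it.

```python
import random
from typing import List, Tuple

Sample = Tuple[float, float]  # (elapsed days, consumed capacity in GB)


def build_samples(series: List[Sample], use_random_segment: bool = False,
                  seed: int = 0) -> List[Sample]:
    """Build training samples from one time series of capacity consumption."""
    ordered = sorted(series)  # order by time
    if not use_random_segment or len(ordered) < 3:
        # Use the entire period covered by the series.
        return ordered
    # Otherwise pick a random contiguous sub-segment with at least two points.
    rng = random.Random(seed)
    start = rng.randrange(0, len(ordered) - 1)
    end = rng.randrange(start + 2, len(ordered) + 1)
    return ordered[start:end]


series = [(3.0, 40.0), (0.0, 10.0), (1.0, 18.0), (4.0, 55.0), (2.0, 27.0)]
print(build_samples(series))
print(build_samples(series, use_random_segment=True, seed=1))
```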
Step S23: and training a model to be trained constructed based on a machine learning algorithm by using the first training sample to obtain the first prediction model.
In a first specific embodiment, the training a model to be trained, which is constructed based on a machine learning algorithm, with the first training sample to obtain the first prediction model specifically may include: and training a model to be trained constructed based on a machine learning algorithm by using the time slice information in the first training sample as model input side data and using the historical capacity consumption in the first training sample as model output side data to obtain the first prediction model.
It can be understood that, in the first prediction model obtained by the first embodiment, in the actual prediction application, the information input to the first prediction model is time information, and the information output by the first prediction model is capacity information.
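The first embodiment (time in, capacity out) might look like the sketch below. The application does not fix a machine learning algorithm, so a log-linear least-squares fit is used here purely as an assumed stand-in that can capture a nonlinear (exponential) growth trend; `fit_time_to_capacity` and the sample values are hypothetical.

```python
import math
from typing import Callable, List, Tuple


def fit_time_to_capacity(samples: List[Tuple[float, float]]) -> Callable[[float], float]:
    """Fit log(capacity) = intercept + slope * time by ordinary least squares."""
    times = [t for t, _ in samples]
    logs = [math.log(c) for _, c in samples]  # capacities must be positive
    n = len(samples)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("need at least two distinct time points")
    slope = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs)) / var_t
    intercept = mean_l - slope * mean_t
    # The returned model takes time as input and outputs capacity.
    return lambda t: math.exp(intercept + slope * t)


# Hypothetical history: consumption grows 10% per year from 100 TB.
samples = [(year, 100.0 * 1.1 ** year) for year in range(6)]
model = fit_time_to_capacity(samples)
forecast = model(10)  # predicted consumption after 10 years
print(f"{forecast:.2f} TB")
```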
In a second specific embodiment, the training a model to be trained, which is constructed based on a machine learning algorithm, with the first training sample to obtain the first prediction model specifically may include: and training a model to be trained constructed based on a machine learning algorithm by taking the historical capacity consumption in the first training sample as model input side data and taking time segment information in the first training sample as model output side data to obtain the first prediction model.
It is to be understood that, in the first prediction model obtained by the second embodiment, in the actual prediction application, the information input to the first prediction model is capacity information, and the information output by the first prediction model is time information.
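For the second embodiment the roles are swapped: capacity is the input and time is the output. The sketch below uses piecewise-linear interpolation (with linear extrapolation from the last two points) as an assumed stand-in for the trained model. `fit_capacity_to_time` is a hypothetical name, and the sketch assumes the recorded capacities are strictly increasing.

```python
from bisect import bisect_left
from typing import Callable, List, Tuple


def fit_capacity_to_time(samples: List[Tuple[float, float]]) -> Callable[[float], float]:
    """samples are (time, consumed capacity); the model maps capacity -> time."""
    points = sorted((c, t) for t, c in samples)
    caps = [c for c, _ in points]
    times = [t for _, t in points]

    def predict(capacity: float) -> float:
        # Locate the bracketing pair; clamp so the end segments are extrapolated.
        i = min(max(bisect_left(caps, capacity), 1), len(caps) - 1)
        c0, c1 = caps[i - 1], caps[i]
        t0, t1 = times[i - 1], times[i]
        return t0 + (t1 - t0) * (capacity - c0) / (c1 - c0)

    return predict


# Hypothetical history: (years elapsed, TB consumed).
history = [(0, 100.0), (1, 150.0), (2, 250.0), (3, 400.0)]
when = fit_capacity_to_time(history)
print(when(200.0))  # time at which 200 TB was consumed
print(when(550.0))  # extrapolated time at which 550 TB would be consumed
```

With a model of this shape, one could query candidate capacities until the returned time reaches the expected capacity consumption duration.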
Further, in this embodiment, after predicting the to-be-expanded capacity information corresponding to the expected value of the capacity consumption duration and before performing the corresponding capacity expansion operation on the storage system according to the to-be-expanded capacity information, the method may further include:
acquiring a second prediction model; the second prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a second training sample, and the second training sample comprises storage system hardware information, historical expansion capacity information and corresponding expansion duration information;
and acquiring current hardware information of the storage system, and inputting the capacity information to be expanded and the current hardware information into the second prediction model to obtain an expansion duration prediction value output by the second prediction model.
It should be noted that, in this embodiment, the capacity information to be expanded may be directly represented by a capacity value, or may be indirectly represented by the number of disks or the number of other storage nodes.
Referring to fig. 3, before obtaining the second prediction model, the method may further include:
step S31: obtaining a second training sample; the second training sample comprises storage system hardware information, historical expansion capacity information and corresponding expansion duration information.
In this embodiment, the hardware information of the storage system may include hardware parameter information and cluster scale information. The hardware parameter information may be acquired by obtaining detailed parameters of objects such as the CPU, memory, network ports, and hard disks, using existing open source methods or tools, or by directly reading system configuration files. The cluster scale information may be acquired through an interface provided by the storage product, which is used to determine the configuration information of the storage nodes in the storage cluster.
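As a rough illustration of collecting hardware parameter information on a single node, the sketch below uses only Python standard-library calls. It covers CPU count and disk capacity only; memory, network port, and cluster scale information would need other tools or the storage product's own interface, which are not shown. The function name and dictionary keys are hypothetical.

```python
import os
import platform
import shutil
from typing import Dict, Union


def collect_hardware_info(path: str = os.path.abspath(os.sep)) -> Dict[str, Union[int, str]]:
    """Collect a few hardware parameters of the local node."""
    usage = shutil.disk_usage(path)  # total/used/free bytes of the filesystem at `path`
    return {
        "cpu_count": os.cpu_count() or 1,   # os.cpu_count() may return None
        "machine": platform.machine(),      # e.g. the CPU architecture string
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
    }


info = collect_hardware_info()
print(info)
```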
Step S32: and training a model to be trained constructed based on a machine learning algorithm by using the hardware information of the storage system and the historical expansion capacity information in the second training sample as model input side data and using the expansion duration information in the second training sample as output side data to obtain the second prediction model.
It should be noted that the machine learning algorithm in step S32 may be selected according to actual situations, and is not limited herein.
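Since the algorithm for the second prediction model is left open, the sketch below uses k-nearest-neighbour regression as one assumed choice. The feature layout (node count, capacity to add in TB), the record values, and the name `knn_predict` are all hypothetical. Hardware information and expansion capacity form the input side, and the expansion duration is the output side.

```python
import math
from typing import List, Sequence, Tuple

Record = Tuple[Sequence[float], float]  # (features, expansion duration in hours)


def knn_predict(records: List[Record], query: Sequence[float], k: int = 2) -> float:
    """Predict the expansion duration as the mean of the k nearest records."""
    # Scale every feature by its range so no single unit dominates the distance.
    spans = []
    for d in range(len(query)):
        values = [features[d] for features, _ in records]
        spans.append((max(values) - min(values)) or 1.0)

    def distance(features: Sequence[float]) -> float:
        return math.sqrt(sum(((a - b) / s) ** 2
                             for a, b, s in zip(features, query, spans)))

    nearest = sorted(records, key=lambda r: distance(r[0]))[:k]
    return sum(duration for _, duration in nearest) / len(nearest)


# Hypothetical history: ((node count, capacity added in TB), duration in hours).
records = [((3, 10), 2.0), ((3, 20), 4.0), ((6, 10), 1.5), ((6, 40), 5.0)]
predicted_hours = knn_predict(records, query=(3, 15), k=2)
print(predicted_hours)
```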
Further, referring to fig. 4, in this embodiment, before performing the corresponding capacity expansion operation on the storage system according to the capacity information to be expanded, the method may further include:
step S41: acquiring a third prediction model; the third prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical read-write throughput of the storage system.
It can be understood that, in this embodiment, the time series data of the historical read-write throughput of the storage system specifically refers to a change situation of the historical read-write throughput of the storage system over time, and can objectively reflect an actual change trend of the read-write throughput, for example, a change situation of the read-write throughput of the storage system between 5 pm and 12 pm every day, or a change situation of the read-write throughput of the storage system over the weekend every week.
Step S42: inputting the expected value of the capacity expansion time period into the third prediction model to obtain the predicted read-write throughput, output by the third prediction model, at each time point within the expected capacity expansion time period.
Step S43: creating a corresponding graph according to the correspondence between the expected value of the capacity expansion time period and the predicted read-write throughput.
Step S44: selecting from the graph, as the target curve segment, the curve segment whose time span equals the predicted expansion duration and whose area enclosed with the time axis is smallest.
It can be understood that, in this embodiment, a curve segment with the smallest area formed between the curve segment and the time axis is selected as the target curve segment, which aims to ensure that the read-write throughput in the corresponding time period is at a lower level as much as possible, thereby reducing the interference influence on the normal read-write service caused by the subsequent capacity expansion operation.
Further, in this embodiment, the interference influence degree of the subsequent expansion operation on the normal read-write operation may be estimated based on the area formed between the target curve segment and the time axis, and the interference influence degree may be output so as to be checked by the user.
Step S45: and determining the time period corresponding to the target curve segment as the optimal expansion time period.
Compared with the existing approach of determining the capacity expansion time period from a service engineer's personal experience, the approach in this embodiment is more accurate and more stable.
In this embodiment, after the optimal capacity expansion time period is obtained, the optimal capacity expansion time period may be marked in the graph.
It is understood that, in this embodiment, the expected value of the capacity expansion time period may be set by the user. For example, suppose it is set to run from 8 pm to 4 pm of the next day, and the predicted expansion duration is 3 hours. The expected value of the capacity expansion time period is input to the third prediction model, which outputs the predicted read-write throughput at each time point between 8 pm of that day and 4 pm of the next day. A corresponding graph is then created according to the correspondence between the expected value of the capacity expansion time period and the predicted read-write throughput. From the graph, the curve segment that spans 3 hours and encloses the smallest area with the time axis is selected as the target curve segment, and the time period corresponding to the target curve segment is determined as the optimal capacity expansion time period.
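Steps S42 to S45 amount to a sliding-window search for the segment with the smallest area under the predicted throughput curve. The sketch below assumes evenly spaced time points and approximates the area with the trapezoidal rule; `best_window` and the throughput values are hypothetical. Hours are counted from midnight of the first day, so 20 is 8 pm and 40 is 4 pm of the next day.

```python
from typing import List, Tuple


def best_window(times: List[float], throughput: List[float],
                duration: float) -> Tuple[float, float, float]:
    """Return (start, end, area) of the window of `duration` with the smallest area."""
    step = times[1] - times[0]          # assumes evenly spaced time points
    n_steps = round(duration / step)    # number of intervals in one window
    if n_steps < 1 or n_steps >= len(times):
        raise ValueError("duration does not fit inside the expected period")
    best = None
    for i in range(len(times) - n_steps):
        segment = throughput[i:i + n_steps + 1]
        # Trapezoidal area between the curve segment and the time axis.
        area = sum((a + b) / 2 * step for a, b in zip(segment, segment[1:]))
        if best is None or area < best[2]:
            best = (times[i], times[i + n_steps], area)
    return best


hours = list(range(20, 41))  # 8 pm to 4 pm of the next day, hourly points
predicted = [90, 80, 70, 60, 40, 30, 20, 10, 10, 15, 30,
             50, 70, 90, 95, 100, 100, 95, 90, 85, 80]  # hypothetical MB/s values
start, end, area = best_window(hours, predicted, duration=3)
print(start % 24, end % 24, area)  # window from 2 am to 5 am of the next day
```

The returned area could also serve as the basis for the interference estimate mentioned above, since a smaller area means less read-write traffic during the window.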
Further, the performing, according to the to-be-expanded capacity information, a corresponding expansion operation on the storage system may specifically include: and carrying out corresponding capacity expansion operation on the storage system according to the capacity information to be expanded and the optimal capacity expansion time period. This may enable the capacity of the storage system to be expanded to an appropriate level within an optimal expansion period.
Referring to fig. 5, before obtaining the third prediction model, the method may further include:
Step S51: acquiring second time series data of historical read-write throughput of the storage system.
In this embodiment, the second time series data may be collected with an open-source tool such as Diamond or Zabbix, may be collected directly with iostat, or, for a software-defined storage system such as Ceph, may be obtained directly from the corresponding software.
Step S52: constructing a third training sample by using the time slice information in the second time series data and the corresponding historical read-write throughput.
The time slice used from the second time series data may cover the entire time span of the second time series data, or may be any sub-interval of that span.
Step S53: training a model to be trained, which is constructed based on a machine learning algorithm, by using the time slice information in the third training sample as model input side data and the historical read-write throughput in the third training sample as model output side data, to obtain the third prediction model.
The machine learning algorithm in step S53 may be selected according to actual conditions, and is not specifically limited herein.
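As a rough illustration of steps S51 to S53, the sketch below trains a throughput predictor from (time slice, throughput) pairs. The embodiment does not fix a particular algorithm, so the hour-of-day averaging used here is only a deliberately simple stand-in for "a model constructed based on a machine learning algorithm". The function name `train_throughput_model` and the choice of hour-of-day as the time slice feature are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Iterable, Tuple


def train_throughput_model(
    samples: Iterable[Tuple[int, float]]
) -> Callable[[int], float]:
    """Build a predictor from (hour_of_day, throughput) training samples.

    Input side data: the time slice (here, hour of day).
    Output side data: the historical read-write throughput.
    Stand-in model: the mean throughput observed for each hour of day.
    """
    samples = list(samples)
    by_hour = defaultdict(list)
    for hour, throughput in samples:
        by_hour[hour % 24].append(throughput)
    table = {hour: mean(values) for hour, values in by_hour.items()}
    overall = mean(throughput for _, throughput in samples)

    def predict(hour: int) -> float:
        # Fall back to the overall mean for hours never seen in training.
        return table.get(hour % 24, overall)

    return predict
```

In practice a regression or time-series model would replace the averaging table; the point of the sketch is only the direction of the mapping, from time slice to throughput.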
In addition, the embodiment of the application also discloses a method for creating the prediction model, which comprises the following steps:
acquiring first time series data of historical capacity consumption of a storage system; constructing a first training sample by utilizing the time slice information in the first time series data and the corresponding historical capacity consumption; and training the model to be trained constructed based on the machine learning algorithm by using the first training sample to obtain a first prediction model.
It can be seen that the embodiment of the application trains a prediction model based on a machine learning algorithm using time series data of historical capacity consumption. Because this time series data objectively reflects the actual trend of the storage system's capacity consumption, the prediction can take the actual nonlinear trend of capacity consumption into account, unlike existing estimation methods based on a simple proportional linear trend. As a result, the capacity information to be expanded predicted by the first prediction model better matches the user's actual day-to-day capacity needs, avoids obviously excessive or insufficient expansion, and improves the accuracy and reliability of the predicted expansion capacity.
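To make the contrast with a proportional linear estimate concrete, the following sketch fits a nonlinear trend to historical capacity consumption and derives a capacity to be expanded. It is a sketch under assumptions rather than the method of the embodiment: exponential growth fitted by least squares on the logarithm of consumption is just one possible nonlinear model, the names `fit_log_linear` and `capacity_to_expand` are hypothetical, and computing the capacity to be expanded as predicted consumption minus current total capacity is an assumed interpretation.

```python
import math
from typing import Callable, Sequence


def fit_log_linear(days: Sequence[float], used: Sequence[float]) -> Callable[[float], float]:
    """Fit used = exp(a + b * day) by least squares on log(used)."""
    n = len(days)
    logs = [math.log(u) for u in used]
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda day: math.exp(a + b * day)


def capacity_to_expand(
    predict: Callable[[float], float], target_day: float, total_capacity: float
) -> float:
    """Assumed rule: expand by the predicted shortfall at the target day, never below zero."""
    return max(0.0, predict(target_day) - total_capacity)
```

For consumption that doubles every day (100, 200, 400, 800 over days 0 to 3), the fitted model predicts roughly 1600 on day 4, whereas a straight line through the first and last points would predict only about 1033.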
Further, the method for creating a prediction model in this embodiment may further include:
obtaining a second training sample; the second training sample comprises storage system hardware information, historical expansion capacity information and corresponding expansion duration information; and training a model to be trained constructed based on a machine learning algorithm by taking the hardware information of the storage system and the historical expansion capacity information in the second training sample as model input side data and taking the expansion duration information in the second training sample as output side data to obtain a second prediction model.
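The second prediction model maps hardware information and expansion capacity to an expansion duration. The sketch below illustrates that input/output arrangement with a k-nearest-neighbour average, chosen here only because it is short; the embodiment does not name an algorithm. The function name `train_duration_model` and the feature layout (disk count, capacity to expand in TB) are hypothetical, and in practice the features would need to be scaled to comparable ranges before distances are computed.

```python
import math
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (input features, expansion duration in hours)


def train_duration_model(samples: List[Sample], k: int = 2) -> Callable[[Sequence[float]], float]:
    """Return a predictor that averages the durations of the k nearest training samples.

    Input side data: hardware information plus historical expansion capacity.
    Output side data: the corresponding expansion duration.
    """
    if not samples:
        raise ValueError("at least one training sample is required")

    def predict(features: Sequence[float]) -> float:
        ranked = sorted(samples, key=lambda s: math.dist(s[0], features))
        nearest = ranked[:k]
        return sum(duration for _, duration in nearest) / len(nearest)

    return predict
```

At prediction time, the capacity information to be expanded produced by the first prediction model and the currently collected hardware information would form the feature vector passed to `predict`.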
Further, the method for creating a prediction model in this embodiment may further include:
acquiring second time sequence data of historical read-write throughput of the storage system; constructing a third training sample by utilizing the time slice information in the second time series data and the corresponding historical read-write throughput; and training a model to be trained, which is constructed based on a machine learning algorithm, by using the time slice information in the third training sample as model input side data and the historical read-write throughput in the third training sample as model output side data to obtain a third prediction model.
For more details of the prediction model creation method of this embodiment and the specific application process of each prediction model, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
Correspondingly, referring to fig. 6, an embodiment of the present application further discloses a capacity expansion device, including:
a first model obtaining module 11, configured to obtain a first prediction model; the first prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical capacity consumption of a storage system;
the first prediction module 12 is configured to predict, by using the first prediction model, capacity information to be expanded corresponding to a capacity consumption duration expected value;
and the capacity expansion module 13 is configured to perform corresponding capacity expansion operation on the storage system according to the capacity information to be expanded.
It can be seen that the embodiment of the application trains a prediction model based on a machine learning algorithm using time series data of historical capacity consumption. Because this time series data objectively reflects the actual trend of the storage system's capacity consumption, the prediction can take the actual nonlinear trend of capacity consumption into account, unlike existing estimation methods based on a simple proportional linear trend. As a result, the capacity information to be expanded predicted by the first prediction model better matches the user's actual day-to-day capacity needs, avoids obviously excessive or insufficient expansion, and improves the accuracy and reliability of the predicted expansion capacity.
Further, referring to fig. 7, the expansion device may further include:
a first data obtaining module 21, configured to obtain first time series data of historical capacity consumption of the storage system;
a first sample construction module 22, configured to construct a first training sample by using the time slice information in the first time series data and the corresponding historical capacity consumption amount;
and the first model training module 23 is configured to train a model to be trained, which is constructed based on a machine learning algorithm, with the first training sample to obtain the first prediction model.
In a first specific embodiment, the first model training module 23 is specifically configured to train a model to be trained, which is constructed based on a machine learning algorithm, with time slice information in the first training sample as model input-side data and historical capacity consumption in the first training sample as model output-side data, so as to obtain the first prediction model.
In a second specific embodiment, the first model training module 23 is specifically configured to train a model to be trained, which is constructed based on a machine learning algorithm, with the historical capacity consumption in the first training sample as model input-side data and the time segment information in the first training sample as model output-side data, to obtain the first prediction model.
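The two specific embodiments above differ only in which column of the first training sample is used as model input and which as model output. The sketch below shows that swap using piecewise-linear interpolation as a stand-in model; the name `make_interpolator` is hypothetical, and the approach assumes the input column is strictly increasing, which holds for cumulative capacity consumption only if consumption never stays flat or drops.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def make_interpolator(xs: Sequence[float], ys: Sequence[float]) -> Callable[[float], float]:
    """Return a piecewise-linear predictor mapping x to y (linear extrapolation at the ends)."""
    if len(xs) < 2 or len(xs) != len(ys):
        raise ValueError("need at least two samples with matching lengths")
    if any(b <= a for a, b in zip(xs, xs[1:])):
        raise ValueError("input side data must be strictly increasing")

    def predict(x: float) -> float:
        # Index of the right end of the segment used for interpolation.
        i = min(max(bisect_right(xs, x), 1), len(xs) - 1)
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return predict


# Hypothetical training sample: day index and cumulative capacity consumption (GB).
days = [0, 10, 20]
used_gb = [100, 150, 300]

# First embodiment: time slice information in, capacity consumption out.
time_to_capacity = make_interpolator(days, used_gb)
# Second embodiment: capacity consumption in, time slice information out.
capacity_to_time = make_interpolator(used_gb, days)
```

Here `time_to_capacity(15)` estimates the consumption on day 15, while `capacity_to_time(225)` estimates the day on which 225 GB will have been consumed.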
Further, referring to fig. 8, the expansion device may further include:
a second model obtaining module 31, configured to obtain a second prediction model; the second prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a second training sample, and the second training sample comprises storage system hardware information, historical expansion capacity information and corresponding expansion duration information;
the second prediction module 32 is configured to collect current hardware information of the storage system, and input the capacity information to be expanded and the current hardware information to the second prediction model to obtain a predicted value of an expansion duration output by the second prediction model.
Further, referring to fig. 9, the expansion device may further include:
a second data obtaining module 41, configured to obtain the second training sample;
and the second model training module 42 is configured to train the model to be trained, which is constructed based on the machine learning algorithm, with the storage system hardware information and the historical expansion capacity information in the second training sample as model input-side data and the expansion duration information in the second training sample as output-side data, so as to obtain the second prediction model.
Further, referring to fig. 10, the expansion device may further include:
a third model obtaining module 51, configured to obtain a third prediction model; the third prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical read-write throughput of the storage system;
the third prediction module 52 is configured to input the expected value of the capacity expansion time period to the third prediction model, and obtain the predicted read-write throughput, output by the third prediction model, at each time point within the expected value of the capacity expansion time period;
a graph creating module 53, configured to create a corresponding graph according to the correspondence between the expected value of the capacity expansion time period and the predicted read-write throughput values;
a curve segment selection module 54, configured to select, from the graph, the curve segment whose duration equals the predicted expansion duration and whose area enclosed with the time axis is smallest as a target curve segment;
and an optimal expansion time period determining module 55, configured to determine a time period corresponding to the target curve segment as an optimal expansion time period.
Further, referring to fig. 11, the expansion device may further include:
a third data obtaining module 61, configured to obtain second time series data of historical read-write throughput of the storage system;
a second sample construction module 62, configured to construct a third training sample by using the time slice information in the second time series data and the corresponding historical read-write throughput;
and a third model training module 63, configured to train the model to be trained, which is constructed based on a machine learning algorithm, with the time slice information in the third training sample as model input-side data and the historical read-write throughput in the third training sample as model output-side data, so as to obtain the third prediction model.
In this embodiment, the capacity expansion module 13 may specifically perform a corresponding capacity expansion operation on the storage system according to the information about the capacity to be expanded and the optimal capacity expansion time period.
In addition, for more specific working processes of the modules, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
Correspondingly, the embodiment of the present application further discloses a device for creating a prediction model, including:
the first data acquisition module is used for acquiring first time sequence data of historical capacity consumption of the storage system;
the first sample construction module is used for constructing a first training sample by utilizing time slice information in the first time sequence data and corresponding historical capacity consumption;
and the first model training module is used for training a model to be trained, which is constructed based on a machine learning algorithm, by using the first training sample to obtain a first prediction model.
It can be seen that the embodiment of the application trains a prediction model based on a machine learning algorithm using time series data of historical capacity consumption. Because this time series data objectively reflects the actual trend of the storage system's capacity consumption, the prediction can take the actual nonlinear trend of capacity consumption into account, unlike existing estimation methods based on a simple proportional linear trend. As a result, the capacity information to be expanded predicted by the first prediction model better matches the user's actual day-to-day capacity needs, avoids obviously excessive or insufficient expansion, and improves the accuracy and reliability of the predicted expansion capacity.
Further, the prediction model creating apparatus in this embodiment may further include:
the second data acquisition module is used for acquiring a second training sample; the second training sample comprises storage system hardware information, historical expansion capacity information and corresponding expansion duration information;
and the second model training module is used for training the model to be trained constructed based on the machine learning algorithm by taking the hardware information and the historical expansion capacity information of the storage system in the second training sample as model input side data and taking the expansion duration information in the second training sample as output side data to obtain a second prediction model.
Further, the prediction model creating apparatus in this embodiment may further include:
the third data acquisition module is used for acquiring second time sequence data of historical read-write throughput of the storage system;
the second sample construction module is used for constructing a third training sample by utilizing the time slice information in the second time sequence data and the corresponding historical read-write throughput;
and the third model training module is used for training the model to be trained constructed based on the machine learning algorithm by taking the time slice information in the third training sample as model input side data and taking the historical read-write throughput in the third training sample as model output side data to obtain a third prediction model.
In addition, for more specific working processes of the modules, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
Further, the present application also discloses a capacity expansion device, including:
a memory for storing a computer program;
and a processor for executing the computer program to implement the capacity expansion method disclosed in the foregoing embodiments.
For the specific steps of the above capacity expansion method, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated herein.
Further, the present application also discloses a prediction model creation device, including:
a memory for storing a computer program;
a processor for executing the computer program to implement the prediction model creation method disclosed in the foregoing embodiments.
For the specific steps of the prediction model creation method, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
Further, the present application also discloses a computer-readable storage medium for storing a computer program; wherein, the computer program realizes the capacity expansion method disclosed by the foregoing embodiment when being executed by a processor.
For the specific steps of the above capacity expansion method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and details are not repeated herein.
Further, the present application also discloses a computer-readable storage medium for storing a computer program; wherein the computer program, when executed by a processor, implements the predictive model creation method disclosed in the foregoing embodiments.
For the specific steps of the prediction model creation method, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another. Because the devices disclosed in the embodiments correspond to the methods disclosed in the embodiments, they are described only briefly, and the relevant details can be found in the description of the methods.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The capacity expansion method, prediction model creation method, apparatus, device, and medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present application, and the description of the above embodiments is intended only to help in understanding the method and core idea of the present application. Meanwhile, a person skilled in the art may, following the idea of the present application, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (18)

1. A capacity expansion method is characterized by comprising the following steps:
acquiring a first prediction model; the first prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical capacity consumption of a storage system;
predicting the capacity information to be expanded corresponding to the expected value of the capacity consumption duration by utilizing the first prediction model;
carrying out corresponding capacity expansion operation on the storage system according to the capacity information to be expanded;
and, the method further comprises: selecting, from a graph constructed according to the correspondence between the expected value of the capacity expansion time period and the predicted read-write throughput values, the curve segment whose duration equals the predicted expansion duration and whose area enclosed with the time axis is smallest as a target curve segment, and determining the time period corresponding to the target curve segment as the optimal capacity expansion time period.
2. The capacity expansion method of claim 1, wherein before the obtaining the first prediction model, further comprising:
acquiring first time series data of historical capacity consumption of the storage system;
constructing a first training sample by using the time slice information in the first time sequence data and the corresponding historical capacity consumption;
and training a model to be trained constructed based on a machine learning algorithm by using the first training sample to obtain the first prediction model.
3. The capacity expansion method according to claim 2, wherein the training of the model to be trained, which is constructed based on a machine learning algorithm, with the first training sample to obtain the first prediction model includes:
and training a model to be trained constructed based on a machine learning algorithm by using the time slice information in the first training sample as model input side data and using the historical capacity consumption in the first training sample as model output side data to obtain the first prediction model.
4. The capacity expansion method according to claim 2, wherein the training of the model to be trained, which is constructed based on a machine learning algorithm, with the first training sample to obtain the first prediction model includes:
and training a model to be trained constructed based on a machine learning algorithm by taking the historical capacity consumption in the first training sample as model input side data and taking time segment information in the first training sample as model output side data to obtain the first prediction model.
5. The capacity expansion method according to any one of claims 1 to 4, wherein after predicting the capacity information to be expanded corresponding to the expected value of the capacity consumption duration and before performing the corresponding capacity expansion operation on the storage system according to the capacity information to be expanded, the method further includes:
acquiring a second prediction model; the second prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a second training sample, and the second training sample comprises storage system hardware information, historical expansion capacity information and corresponding expansion duration information;
and acquiring current hardware information of the storage system, and inputting the capacity information to be expanded and the current hardware information into the second prediction model to obtain an expansion duration prediction value output by the second prediction model.
6. The capacity expansion method of claim 5, wherein before the obtaining the second prediction model, further comprising:
obtaining the second training sample;
and training a model to be trained constructed based on a machine learning algorithm by using the hardware information of the storage system and the historical expansion capacity information in the second training sample as model input side data and using the expansion duration information in the second training sample as output side data to obtain the second prediction model.
7. A capacity expansion method according to claim 5, wherein before performing corresponding capacity expansion operation on the storage system according to the information about the capacity to be expanded, the method further includes:
acquiring a third prediction model; the third prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical read-write throughput of the storage system;
inputting the expected value of the capacity expansion time period to the third prediction model to obtain the predicted read-write throughput, output by the third prediction model, at each time point within the expected value of the capacity expansion time period;
creating a corresponding graph according to the correspondence between the expected value of the capacity expansion time period and the predicted read-write throughput values;
selecting, from the graph, the curve segment whose duration equals the predicted expansion duration and whose area enclosed with the time axis is smallest as a target curve segment;
and determining the time period corresponding to the target curve segment as the optimal expansion time period.
8. The capacity expansion method according to claim 7, wherein before the obtaining the third prediction model, the method further comprises:
acquiring second time series data of historical read-write throughput of the storage system;
constructing a third training sample by utilizing the time slice information in the second time series data and the corresponding historical read-write throughput;
and training a model to be trained, which is constructed based on a machine learning algorithm, by using the time slice information in the third training sample as model input side data and the historical read-write throughput in the third training sample as model output side data to obtain a third prediction model.
9. The capacity expansion method according to claim 7, wherein performing the corresponding capacity expansion operation on the storage system according to the information about the capacity to be expanded includes:
and carrying out corresponding capacity expansion operation on the storage system according to the capacity information to be expanded and the optimal capacity expansion time period.
10. A method of predictive model creation, comprising:
acquiring first time series data of historical capacity consumption of a storage system;
constructing a first training sample by using the time slice information in the first time sequence data and the corresponding historical capacity consumption;
training a model to be trained, which is constructed based on a machine learning algorithm, by using the first training sample to obtain a first prediction model, so that capacity expansion operation is performed based on an optimal capacity expansion time period by using the first prediction model;
the optimal capacity expansion time period is the time period corresponding to a target curve segment, and the target curve segment is the curve segment, in a graph constructed according to the correspondence between the expected value of the capacity expansion time period and the predicted read-write throughput values, whose duration equals the predicted expansion duration and whose area enclosed with the time axis is smallest.
11. The predictive model creation method of claim 10, further comprising:
obtaining a second training sample; the second training sample comprises storage system hardware information, historical expansion capacity information and corresponding expansion duration information;
and training a model to be trained constructed based on a machine learning algorithm by taking the hardware information of the storage system and the historical expansion capacity information in the second training sample as model input side data and taking the expansion duration information in the second training sample as output side data to obtain a second prediction model.
12. The method of predictive model creation as described in claim 11, further comprising:
acquiring second time series data of historical read-write throughput of the storage system;
constructing a third training sample by using the time slice information in the second time sequence data and the corresponding historical read-write throughput;
and training a model to be trained constructed based on a machine learning algorithm by using the time slice information in the third training sample as model input side data and the historical read-write throughput in the third training sample as model output side data to obtain a third prediction model.
13. A capacity expansion device, comprising:
the first model acquisition module is used for acquiring a first prediction model; the first prediction model is obtained by training a model to be trained, which is constructed based on a machine learning algorithm, by using a training sample constructed based on time sequence data of historical capacity consumption of a storage system;
the first prediction module is used for predicting the capacity information to be expanded corresponding to the expected value of the capacity consumption duration by utilizing the first prediction model;
the capacity expansion module is used for carrying out corresponding capacity expansion operation on the storage system according to the capacity information to be expanded;
and the capacity expansion device is further used for selecting, from a graph constructed according to the correspondence between the expected value of the capacity expansion time period and the predicted read-write throughput values, the curve segment whose duration equals the predicted expansion duration and whose area enclosed with the time axis is smallest as a target curve segment, and determining the time period corresponding to the target curve segment as the optimal capacity expansion time period.
14. A prediction model creation apparatus, characterized by comprising:
the first data acquisition module is used for acquiring first time series data of historical capacity consumption of the storage system;
the first sample construction module is used for constructing a first training sample by utilizing time slice information in the first time sequence data and corresponding historical capacity consumption;
the first model training module is used for training a model to be trained, which is constructed based on a machine learning algorithm, by using the first training sample to obtain a first prediction model so as to perform capacity expansion operation based on an optimal capacity expansion time period by using the first prediction model;
the optimal capacity expansion time period is the time period corresponding to a target curve segment, and the target curve segment is the curve segment, in a graph constructed according to the correspondence between the expected value of the capacity expansion time period and the predicted read-write throughput values, whose duration equals the predicted expansion duration and whose area enclosed with the time axis is smallest.
15. A capacity expansion device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the capacity expansion method according to any one of claims 1 to 9.
16. A prediction model creation apparatus characterized by comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the predictive model creation method of any of claims 10 to 12.
17. A computer-readable storage medium for storing a computer program; wherein the computer program when executed by a processor implements the capacity expansion method of any one of claims 1 to 9.
18. A computer-readable storage medium for storing a computer program; wherein the computer program, when executed by a processor, implements a predictive model creation method as claimed in any one of claims 10 to 12.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910146524.4A CN109885469B (en) 2019-02-27 2019-02-27 Capacity expansion method, prediction model creation method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910146524.4A CN109885469B (en) 2019-02-27 2019-02-27 Capacity expansion method, prediction model creation method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN109885469A CN109885469A (en) 2019-06-14
CN109885469B true CN109885469B (en) 2022-09-30

Family

ID=66929651

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910146524.4A Active CN109885469B (en) 2019-02-27 2019-02-27 Capacity expansion method, prediction model creation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN109885469B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377228A (en) * 2019-06-19 2019-10-25 深圳壹账通智能科技有限公司 Automatic expansion method, device, O&M terminal and the storage medium of block chain node
CN110737593B (en) * 2019-09-19 2022-03-29 平安科技(深圳)有限公司 Intelligent capacity management method, device and storage medium
CN110768850A (en) * 2019-11-12 2020-02-07 国家电网有限公司 Communication capacity expansion processing method and device based on power system
CN111176565B (en) * 2019-12-25 2024-04-19 三星(中国)半导体有限公司 Method and device for determining storage load of application
CN113254261A (en) * 2020-02-07 2021-08-13 伊姆西Ip控股有限责任公司 Data backup method, electronic device and computer program product
CN112130767B (en) * 2020-09-18 2023-01-10 苏州浪潮智能科技有限公司 Storage pool use capacity determination method, device, equipment and medium
CN112291086A (en) * 2020-10-16 2021-01-29 苏州浪潮智能科技有限公司 Memory capacity expansion method, system and device of switch
CN112988071A (en) * 2021-03-15 2021-06-18 中国建设银行股份有限公司 Distributed storage capacity expansion method and device, storage medium and equipment
CN116414662B (en) * 2023-04-24 2023-10-10 山东溯源安全科技有限公司 Storage space expansion prompting method and device, electronic equipment and storage medium
CN116414663A (en) * 2023-04-24 2023-07-11 山东溯源安全科技有限公司 Capacity expansion prompting method, device and storage medium based on capacity use prediction

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105094708A (en) * 2015-08-25 2015-11-25 北京百度网讯科技有限公司 Method and apparatus for predicting disk capacity
CN106469107A (en) * 2016-08-31 2017-03-01 浪潮(北京)电子信息产业有限公司 Capacity prediction method and device for storage resources
CN107231264A (en) * 2017-07-25 2017-10-03 北京百度网讯科技有限公司 Method and apparatus for managing the capacity of a cloud server
CN107480028A (en) * 2017-07-21 2017-12-15 东软集团股份有限公司 Method and device for acquiring the remaining usable duration of a disk
JPWO2016208149A1 (en) * 2015-06-24 2018-04-05 日本電気株式会社 Prediction system, prediction method, and prediction program
CN108920324A (en) * 2018-06-08 2018-11-30 广东轩辕网络科技股份有限公司 Method, system and electronic device for trend analysis and early warning of information technology equipment storage capacity
CN109032914A (en) * 2018-09-06 2018-12-18 掌阅科技股份有限公司 Resource occupation data prediction method, electronic device and storage medium
CN109189323A (en) * 2018-07-06 2019-01-11 华为技术有限公司 Expansion method and equipment

Also Published As

Publication number Publication date
CN109885469A (en) 2019-06-14

Similar Documents

Publication Publication Date Title
CN109885469B (en) Capacity expansion method, prediction model creation method, device, equipment and medium
de Jonge et al. The influence of practical factors on the benefits of condition-based maintenance over time-based maintenance
US9696786B2 (en) System and method for optimizing energy consumption by processors
Huerta et al. Time-varying models for extreme values
US7107187B1 (en) Method for modeling system performance
CN104809333A (en) Capacity predicating method and system based on Kalman filter
CN110633194B (en) Performance evaluation method of hardware resources in specific environment
CN111310981A (en) Reservoir water level trend prediction method based on time series
CN113868953B (en) Multi-unit operation optimization method, device and system in industrial system and storage medium
CN110544130A (en) flow transaction monitoring method and device, computer equipment and storage medium
CN110909306B (en) Business abnormality detection method and device, electronic equipment and storage equipment
KR101793625B1 (en) Evaluation method and system of rainfall quantile considering climate change in nonstationary regional frequency analysis
JP7422272B2 (en) Method and apparatus for facilitating storage of data from industrial automation control systems or power systems
McNeil et al. Developing scenarios for future extreme losses using the POT model
CN103646670A (en) Method and device for evaluating performances of storage system
JP2022050354A (en) Feature quantity selection method and health state evaluation method
Inoue et al. Lognormal process software reliability modeling with testing-effort
JP4790371B2 (en) Time series data storage, extraction and synthesis method and program
CN110069379B (en) Screening method and screening device for monitoring indexes
JP2005063208A (en) Software reliability growth model selection method, software reliability growth model selection apparatus, software reliability growth model selection program and program recording medium
US10528883B2 (en) Hybrid estimation of transition probability values in markov decision processes
KR20060093079A (en) Long-run water demand forecast method and system by co-integrating regression with regional time varying coefficients
JP5636922B2 (en) Performance prediction apparatus, performance prediction method, and program
CN107665185A (en) A kind of customer behavior analysis method and relevant apparatus based on short message service
JP6493904B2 (en) Parameter selection method, parameter selection program, and parameter selection device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant