CN112115036A - Cluster capacity prediction method and device - Google Patents

Cluster capacity prediction method and device

Info

Publication number
CN112115036A
Authority
CN
China
Prior art keywords
performance index
sample
neural network
network model
capacity prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011038553.8A
Other languages
Chinese (zh)
Other versions
CN112115036B (en)
Inventor
徐凯路
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd filed Critical Bank of China Ltd
Priority to CN202011038553.8A priority Critical patent/CN112115036B/en
Publication of CN112115036A publication Critical patent/CN112115036A/en
Application granted granted Critical
Publication of CN112115036B publication Critical patent/CN112115036B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3447Performance evaluation by modeling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/81Threshold
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/835Timestamp

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a cluster capacity prediction method and device. The method comprises the following steps: in the running process of each system application, obtaining each performance index value of each system application; extracting, from the performance index values, the performance index value corresponding to each preset index field, and determining each extracted value as a target performance index value; and inputting each target performance index value into a pre-constructed capacity prediction model to obtain the capacity prediction result of the cluster output by the model. In this technical scheme, the capacity prediction model is constructed in advance and predicts the cluster capacity based on the performance indexes of each system application in the real production environment. Because the prediction data come from the real production environment and the prediction is made for the cluster as a whole, the cluster capacity prediction result is more accurate than in the prior art.

Description

Cluster capacity prediction method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for predicting cluster capacity.
Background
A cluster is a group of mutually independent system applications interconnected by a high-speed network; they form a group and are managed as a single system. With the rapid development and wide application of internet technology, many services now require cluster processing to meet performance requirements.
The cluster capacity refers to the overall bearing capacity of the cluster, and accurate cluster capacity prediction is important for guiding production capacity expansion and reduction. At present, cluster capacity prediction is generally based on a test environment: the capacity of each single system application is predicted first, and the cluster capacity is then evaluated from those single-application predictions. However, a test environment differs considerably from the online production environment in physical machines and deployment. Moreover, the services processed by a cluster involve multiple system applications working in linkage and coordination, so the service is presented to the outside as a whole; evaluating cluster capacity from single-application capacity assessments cannot reflect the overall capacity of the cluster. The accuracy of existing cluster capacity prediction is therefore poor.
Disclosure of Invention
The application provides a cluster capacity prediction method and a cluster capacity prediction device, and aims to solve the problem that the existing cluster capacity prediction accuracy is poor.
In order to achieve the above object, the present application provides the following technical solutions:
a method of cluster capacity prediction, the cluster comprising at least two system applications, the method comprising:
in the running process of each system application, obtaining each performance index value of each system application;
extracting performance index values corresponding to each preset index field from each performance index value, and determining each extracted performance index value as a target performance index value;
and inputting each target performance index value into a pre-constructed capacity prediction model to obtain a capacity prediction result of the cluster output by the capacity prediction model.
The method described above, optionally, the construction process of the capacity prediction model includes:
obtaining each performance index sample pre-stored in a database;
performing data preprocessing on each performance index sample to obtain an initial performance index sample corresponding to each performance index sample;
extracting the initial performance index corresponding to each preset index field from each initial performance index sample, and determining each extracted initial performance index as a target performance index sample;
grouping the target performance index samples according to the corresponding time stamp of each target performance index sample to obtain a plurality of sample groups; the number of target performance indicator samples of each sample group is the same;
acquiring a capacity value corresponding to each sample group;
establishing a neural network model, and training the neural network model according to the sample groups and the capacity values corresponding to the sample groups;
determining a sample group meeting a preset boundary condition in each sample group as an optimized sample group;
and optimizing the trained neural network model according to the optimized sample sets and the capacity values corresponding to the optimized sample sets to obtain a target neural network model, and taking the target neural network model as a capacity prediction model.
The method described above, optionally, the process of storing each performance index sample includes:
in the running process of each system application, acquiring each performance index value of each system application according to a preset period, and taking each acquired performance index value as a performance index sample;
determining the service type of each performance index sample;
and for each performance index sample, storing the performance index sample into a position corresponding to the service type of the performance index sample in a database.
Optionally, optimizing the trained neural network model according to the optimized sample groups and their corresponding capacity values to obtain the target neural network model includes:
selecting one of the optimized sample groups from each of the optimized sample groups;
training a current neural network model according to the selected optimized sample group and the capacity value of the optimized sample group to obtain a first result;
calculating an error rate according to the current first result and the capacity value of the optimized sample set corresponding to the current first result;
judging whether the current error rate is smaller than a preset error threshold value or not;
if not, selecting one optimized sample group from the remaining unselected optimized sample groups, and returning to the step of training the current neural network model with the selected optimized sample group to obtain a first result, until the current error rate is smaller than the preset error threshold, at which point the optimization of the neural network model is complete;
and taking the optimized neural network model as a target neural network model.
Optionally, after obtaining the capacity prediction result of the cluster output by the capacity prediction model, the method further includes:
and sending the capacity prediction result of the cluster to a visual interface for displaying.
The above method, optionally, further includes:
acquiring full load index values corresponding to the preset index fields respectively;
and inputting each full load index value into the capacity prediction model to obtain the limit capacity prediction result of the cluster output by the capacity prediction model.
An apparatus for cluster capacity prediction, the cluster comprising at least two system applications, the apparatus comprising:
the first acquisition unit is used for acquiring each performance index value of each system application in the running process of each system application;
the first extraction unit is used for extracting the performance index value corresponding to each preset index field from each performance index value and determining each extracted performance index value as a target performance index value;
and the first prediction unit is used for inputting each target performance index value into a pre-constructed capacity prediction model to obtain a capacity prediction result of the cluster output by the capacity prediction model.
The above apparatus, optionally, further comprises:
the second acquisition unit is used for acquiring each performance index sample pre-stored in the database;
the preprocessing unit is used for preprocessing data of each performance index sample to obtain an initial performance index sample corresponding to each performance index sample;
the second extraction unit is used for extracting the initial performance index corresponding to each preset index field from each initial performance index sample and determining each extracted initial performance index as a target performance index sample;
the grouping unit is used for grouping the target performance index samples according to the corresponding time stamp of each target performance index sample to obtain a plurality of sample groups; the number of target performance indicator samples of each sample group is the same;
a third obtaining unit, configured to obtain a capacity value corresponding to each sample group;
the training unit is used for establishing a neural network model and training the neural network model according to the sample groups and the capacity values corresponding to the sample groups;
the first determining unit is used for determining the sample group meeting the preset boundary condition in each sample group as an optimized sample group;
and the optimization unit is used for optimizing the trained neural network model according to the capacity values corresponding to the optimization sample sets and the optimization sample sets to obtain a target neural network model, and the target neural network model is used as a capacity prediction model.
The above apparatus, optionally, further comprises:
the third acquisition unit is used for acquiring each performance index value of each system application according to a preset period in the running process of each system application, and taking each acquired performance index value as a performance index sample;
the second determining unit is used for determining the service type of each performance index sample;
and the storage unit is used for storing the performance index samples to the positions corresponding to the service types of the performance index samples in the database aiming at each performance index sample.
Optionally, in the apparatus described above, when optimizing the trained neural network model according to each optimized sample group to obtain the target neural network model, the optimization unit is specifically configured to:
selecting one of the optimized sample groups from each of the optimized sample groups;
optimizing the current neural network model by the selected optimized sample group to obtain a first result;
calculating an error rate based on the current first result and the selected optimized sample set;
judging whether the error rate is smaller than a preset error threshold;
if not, selecting one optimized sample group from the remaining unselected optimized sample groups, and returning to the step of optimizing the current neural network model with the selected optimized sample group to obtain a first result, until the current error rate is smaller than the preset error threshold, at which point the optimization of the neural network model is complete;
and taking the optimized neural network model as a target neural network model.
A storage medium comprising stored instructions, wherein the instructions, when executed, control a device on which the storage medium is located to perform the above-mentioned cluster capacity prediction method.
An electronic device comprising a memory, and one or more instructions stored in the memory and configured to be executed by one or more processors to perform the cluster capacity prediction method described above.
Compared with the prior art, the method has the following advantages:
the application provides a cluster capacity prediction method and a device, wherein the method comprises the following steps: in the running process of each system application, each performance index value of each system application is obtained, the performance index value corresponding to each preset index field is extracted from each performance index value, each extracted performance index value is determined as a target performance index value, each target performance index value is input into a pre-constructed capacity prediction model, and a capacity prediction result of a cluster output by the capacity prediction model is obtained. Therefore, according to the technical scheme, the capacity prediction model is constructed in advance, the cluster capacity is predicted through the capacity prediction model based on each performance index applied by each system in the real production environment, and the cluster capacity prediction result is more accurate than that in the prior art because the data of the cluster capacity prediction is from the real production environment and the cluster capacity prediction is performed based on the whole cluster.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from the provided drawings without creative effort.
Fig. 1 is a flowchart of a method of a cluster capacity prediction method provided in the present application;
FIG. 2 is a flow chart of another method of a cluster capacity prediction method provided herein;
FIG. 3 is a flow chart of another method of a cluster capacity prediction method provided herein;
FIG. 4 is a diagram illustrating a method for predicting cluster capacity according to the present disclosure;
fig. 5 is a schematic structural diagram of a cluster capacity prediction apparatus provided in the present application;
fig. 6 is a schematic structural diagram of an electronic device provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a cluster capacity prediction method, where the cluster comprises at least two system applications. The method can be applied to various system platforms, and its execution subject may be a processor running on a computer. A flowchart of the cluster capacity prediction method is shown in fig. 1, and specifically includes:
s101, in the running process of each system application, obtaining each performance index value of each system application.
In the method provided by the embodiment of the application, a distributed monitoring acquisition environment is set up in advance to collect each performance index value of each system application in the real production environment, that is, the performance index values produced while each system application is running.
Optionally, the respective performance index values of each system application may be obtained in the form of logs.
It should be noted that the obtained performance index values of the respective system applications belong to the same time sequence, that is, the timestamps corresponding to the obtained performance index values belong to the same time sequence.
It should be noted that the performance index values of a system application include performance index values at multiple layers, optionally including bottom-layer hardware, the middleware layer, and the upper-layer application.
Optionally, in the method provided by the embodiment of the present application, the distributed monitoring acquisition environment may be built on the distributed monitoring system Zabbix, which collects the performance index values of each layer of the system applications; the cluster may be cooperatively managed by the distributed coordination service ZooKeeper.
S102, extracting the performance index value corresponding to each preset index field from each performance index value, and determining each extracted performance index value as a target performance index value.
A plurality of index fields are preset; each preset index field is an index field with a large influence on the cluster capacity. For example, a preset field may be the CPU usage rate.
According to each preset index field, the performance index value corresponding to that field is extracted from all the performance index values. Specifically, for each preset index field, the corresponding performance index value is obtained by traversing all the performance index values.
And determining the performance index value corresponding to each extracted preset index field as a target performance index value.
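As a hedged illustration of this extraction step, the following sketch filters one application's collected metrics down to the preset index fields. The field names are assumptions for illustration, not taken from the patent.

```python
# Hypothetical sketch of step S102: keep only the performance index values
# whose field is one of the preset index fields.
PRESET_INDEX_FIELDS = {"cpu_usage", "memory_usage", "disk_io"}

def extract_target_values(metrics):
    """Return the target performance index values for one system application."""
    return {field: value for field, value in metrics.items()
            if field in PRESET_INDEX_FIELDS}

raw = {"cpu_usage": 0.42, "memory_usage": 0.61, "uptime_s": 86400}
targets = extract_target_values(raw)
```

In practice the same filter would be applied to the metrics of every system application in the cluster before the values are fed to the model.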
S103, inputting each target performance index value into a pre-constructed capacity prediction model to obtain a capacity prediction result of the cluster output by the capacity prediction model.
The capacity prediction model is constructed in advance, and it needs to be noted that the construction of the capacity prediction model relates to model training and model optimization, and the model training and the model optimization are performed based on data acquired in a real production environment, namely training data of the model training and optimization data of the model optimization are both from the real production environment.
And taking each target performance index value as the input of the capacity prediction model, and obtaining the capacity prediction result of the cluster output by the capacity prediction model after the processing of the capacity prediction model.
It should be noted that, before each target performance index value is input to the capacity prediction model, data preprocessing may also be performed on each target performance index value. Optionally, the data preprocessing includes but is not limited to data cleaning; the specific process of data cleaning is in the prior art and is not described here again. Each target performance index value after data preprocessing is then input into the capacity prediction model to obtain the capacity prediction result of the cluster output by the capacity prediction model.
In the cluster capacity prediction method provided by the embodiment of the application, in the running process of each system application, each performance index value of each system application is obtained; the performance index value corresponding to each preset index field is extracted from the performance index values, and each extracted value is determined as a target performance index value; each target performance index value is then input into a pre-constructed capacity prediction model to obtain the capacity prediction result of the cluster output by the model. Because the capacity prediction model is constructed in advance, its input data come from the real production environment, and the prediction is made for the cluster as a whole, the cluster capacity prediction result is more accurate than in the prior art.
For the capacity prediction model involved in step S103 of fig. 1, a flowchart of the process for constructing the capacity prediction model is shown in fig. 2, and the process includes the following steps:
s201, obtaining each performance index sample pre-stored in a database.
Performance index samples of each system application in the cluster are stored in a database in advance. The performance index samples are obtained from the real production environment according to a preset period, that is, they are historical data of the real production environment. It should be noted that the database stores the performance index samples of a past time period, so the number of performance index samples pre-stored in the database is large enough.
The process of obtaining each performance index sample pre-stored in the database specifically includes the following steps:
in the running process of each system application, acquiring each performance index value of each system application according to a preset period, and taking each acquired performance index value as a performance index sample;
determining the service type of each performance index sample;
and storing the performance index samples into a position corresponding to the service type of the performance index sample in a database aiming at each performance index sample.
In the method provided by the embodiment of the application, in the running process of each system application in the cluster, each performance index value of each system application is obtained according to a preset period, that is, the performance index values of each system application running in the real production environment are collected at regular intervals. Each performance index value obtained in this way is taken as a performance index sample, and the service type of each performance index sample is determined. Based on the service type, the performance index samples with the same service type are classified together, and the samples of each class are stored in the database at the position corresponding to that service type. Optionally, each performance index sample may be converted to a preset data format before storage, and the preset format may be the JSON format.
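The storage flow above can be sketched as follows. This is a hedged illustration: the sample fields and service types are assumptions, and an in-memory dictionary of JSON strings stands in for the database positions the patent describes.

```python
import json
from collections import defaultdict

def store_samples(samples):
    """Classify samples by service type and serialize each class as JSON."""
    buckets = defaultdict(list)
    for sample in samples:
        buckets[sample["service_type"]].append(sample)
    # One JSON document per service type, mirroring the "position
    # corresponding to the service type" in the database.
    return {stype: json.dumps(group) for stype, group in buckets.items()}

samples = [
    {"service_type": "payment", "cpu_usage": 0.40, "ts": 1},
    {"service_type": "query", "cpu_usage": 0.25, "ts": 1},
    {"service_type": "payment", "cpu_usage": 0.55, "ts": 2},
]
stored = store_samples(samples)
```

A real deployment would write each serialized group to the database location keyed by its service type rather than returning a dictionary.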
S202, carrying out data preprocessing on each performance index sample to obtain an initial performance index sample corresponding to each performance index sample.
For the performance index samples obtained from the database, data preprocessing may be performed on each performance index sample, optionally, the data preprocessing includes but is not limited to data cleaning, and it should be noted that a specific process of performing data cleaning on each performance index sample is the prior art, please refer to the existing data cleaning process, which is not described herein again.
And taking each performance index sample after data preprocessing as an initial performance index sample, namely, performing data preprocessing on the performance index samples to obtain initial performance index samples corresponding to each performance index sample.
S203, extracting the initial performance index corresponding to each preset index field from each initial performance index sample, and determining each extracted initial performance index as a target performance index sample.
The preset index field mentioned in step S203 is the same as the preset index field mentioned in step S102, and is not described herein again.
Initial performance index samples corresponding to each preset index field are extracted from all the initial performance index samples, that is, the initial performance index samples with a large influence on the cluster capacity are extracted, and each extracted sample is determined as a target performance index sample, thereby obtaining a plurality of target performance index samples.
And S204, grouping the target performance index samples according to the corresponding time stamp of each target performance index sample to obtain a plurality of sample groups.
The target performance index samples are grouped according to their corresponding timestamps, so that target performance index samples whose timestamps belong to the same time sequence fall into the same group, thereby obtaining a plurality of sample groups. Each sample group contains the performance index values corresponding to all preset index fields, and each sample group contains the same number of target performance index samples.
It should be noted that each sample group contains a performance index value for each system application in the cluster. For example, if the cluster includes system application A, system application B, and system application C, and the preset field is CPU utilization, then sample group 1 may contain a CPU utilization of 40% for system application A, 30% for system application B, and 14% for system application C.
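The grouping in step S204 can be sketched as follows; samples sharing a timestamp form one sample group keyed by (application, index field). Application names, field names, and the flat-record layout are illustrative assumptions.

```python
from collections import defaultdict

def group_by_timestamp(samples):
    """Group target samples whose timestamps belong to the same time sequence."""
    groups = defaultdict(dict)
    for s in samples:
        # Each group holds one value per (application, preset index field).
        groups[s["ts"]][(s["app"], s["field"])] = s["value"]
    return dict(groups)

samples = [
    {"ts": 100, "app": "A", "field": "cpu_usage", "value": 0.40},
    {"ts": 100, "app": "B", "field": "cpu_usage", "value": 0.30},
    {"ts": 200, "app": "A", "field": "cpu_usage", "value": 0.45},
]
groups = group_by_timestamp(samples)
```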
And S205, acquiring the capacity value corresponding to each sample group.
According to the method provided by the embodiment of the application, the capacity value corresponding to each sample group is obtained.
S206, establishing a neural network model, and training the neural network model according to each sample group and the capacity value corresponding to each sample group.
A neural network model is established and trained according to each sample group and its corresponding capacity value. Each training pass uses one sample group and the capacity value corresponding to that sample group; when all sample groups have been used for model training, the training of the neural network model is complete.
It should be noted that the process of performing model training with a sample group and its corresponding capacity value is known in the prior art and is not described here again.
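Since the patent leaves the network architecture unspecified, the following hedged stand-in fits a single linear neuron to (sample group, capacity value) pairs with plain stochastic gradient descent; a real implementation would use a multi-layer network in a proper framework, and the toy data are assumptions.

```python
def train(groups, capacities, lr=0.1, epochs=2000):
    """Fit a single linear neuron: capacity ~ w . x + b (illustrative only)."""
    n = len(groups[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(groups, capacities):
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = pred - y                                  # prediction error
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Toy data: capacity is exactly twice the (single) target index value.
w, b = train([[0.1], [0.2], [0.3], [0.4]], [0.2, 0.4, 0.6, 0.8])
```

On this exactly linear toy data the neuron converges to the underlying relation, so the prediction for an unseen index value of 0.25 lands near 0.5.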
And S207, determining the sample group meeting the preset boundary condition in each sample group as an optimized sample group.
A sample group meeting the preset boundary conditions is determined from the sample groups and taken as an optimized sample group. The preset boundary conditions comprise a boundary condition corresponding to each preset index field; a sample group meets the preset boundary conditions when each target performance index sample in the group meets the boundary condition of its corresponding preset index field. That is, for each sample group, it is judged whether each target performance index sample contained in the group meets the boundary condition of the corresponding preset index field, and if so, the sample group is determined as an optimized sample group. For example, suppose the CPU utilization of system application A in sample group 1 is 60%, that of system application B is 50%, and that of system application C is 35%, and the boundary condition corresponding to the preset index field CPU utilization is 50%; since the CPU utilization of system application C in sample group 1 is 35%, which does not satisfy the 50% boundary condition, sample group 1 is not an optimized sample group.
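The boundary-condition check of step S207 can be sketched as below. The threshold value and the "value must reach the boundary" direction are assumptions inferred from the patent's example, not stated APIs.

```python
# Assumed boundary condition per preset index field (illustrative).
BOUNDARY = {"cpu_usage": 0.50}

def is_optimized_group(group):
    """A group qualifies only if every target sample meets its boundary."""
    return all(value >= BOUNDARY[field]
               for (_app, field), value in group.items())

group1 = {("A", "cpu_usage"): 0.60, ("B", "cpu_usage"): 0.50,
          ("C", "cpu_usage"): 0.35}   # C fails the 50% boundary
group2 = {("A", "cpu_usage"): 0.70, ("B", "cpu_usage"): 0.55,
          ("C", "cpu_usage"): 0.50}   # every sample meets the boundary
```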
And S208, optimizing the trained neural network model according to the optimized sample sets and the capacity values corresponding to the optimized sample sets to obtain a target neural network model, and taking the target neural network model as a capacity prediction model.
The trained neural network model is optimized according to the optimized sample groups and their corresponding capacity values. Optionally, one optimized sample group and its corresponding capacity value are used for each optimization step; after each step the error rate is calculated, and when the error rate meets the error threshold, optimization of the neural network model is complete and the optimized neural network model is determined to be the target neural network model.
Referring to fig. 3, a specific process of optimizing the trained neural network model according to each optimized sample set to obtain a target neural network model includes the following steps:
S301, selecting one optimized sample group from the optimized sample groups.
S302, training the current neural network model with the selected optimized sample group and the capacity value of that optimized sample group to obtain a first result.
And S303, calculating an error rate according to the current first result and the capacity value of the optimized sample set corresponding to the current first result.
In the method provided by the embodiment of the application, each time the current neural network model is trained with the selected optimized sample group and the capacity value of that optimized sample group, the error rate is calculated from the first result obtained by the training and the capacity value of the optimized sample group corresponding to that first result.
S304, judging whether the current error rate is smaller than a preset error threshold value.
The error rate is compared with a preset error threshold: when the error rate is less than the preset error threshold, step S305 is performed; when it is not, step S306 is performed.
S305, completing optimization of the neural network model, and taking the optimized neural network model as a target neural network model.
S306, selecting an optimized sample group from the rest unselected optimized sample groups.
If the current error rate is not less than the preset error threshold, one optimized sample group is selected from the remaining unselected optimized sample groups, and the step S302 is executed again.
In the method provided by the embodiment of the application, the trained neural network model is optimized by optimizing the sample set, namely the sample set meeting the boundary conditions and the corresponding capacity value, so that the optimized neural network model can predict the boundary data more accurately.
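The loop of steps S301-S306 can be sketched as follows; `predict` and `train_on` are hypothetical stand-ins for the model's forward pass and one optimization step, and the error threshold is illustrative:

```python
# Sketch of steps S301-S306: fine-tune the trained model one optimized
# sample group at a time, stopping once the error rate drops below a
# preset threshold.

ERROR_THRESHOLD = 0.05

def predict(model, group):
    # placeholder for the trained model's forward pass
    return model["scale"] * sum(group)

def train_on(model, group, capacity):
    # placeholder for one optimization step on one optimized sample group
    target_scale = capacity / sum(group)
    model["scale"] += 0.5 * (target_scale - model["scale"])
    return model

def optimize(model, optimized_groups):
    for group, capacity in optimized_groups:                  # S301 / S306
        model = train_on(model, group, capacity)              # S302
        first_result = predict(model, group)
        error_rate = abs(first_result - capacity) / capacity  # S303
        if error_rate < ERROR_THRESHOLD:                      # S304
            return model                      # S305: optimization complete
    return model

model = optimize({"scale": 0.0}, [([1.0, 2.0], 6.0), ([2.0, 2.0], 8.0)])
```

As in the method, the loop either exits early once the error rate satisfies the threshold or ends when every optimized sample group has been consumed.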
In the cluster capacity prediction method provided by the embodiment of the application, the neural network model is trained through historical data of a real production environment to obtain the capacity prediction model.
In the cluster capacity prediction method provided in the embodiment of the present application, the full load index value corresponding to each preset index field may also be obtained, that is, the full load value of each system application for each preset index field; the number of full load index values obtained is therefore the same as the number of inputs of the capacity prediction model. Each full load index value is a value constructed for a preset index field. Inputting each full load index value into the capacity prediction model yields the limit capacity prediction result of the cluster output by the capacity prediction model, thereby predicting the limit capacity of the cluster.
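A minimal sketch of this limit capacity prediction, where `capacity_model` is a hypothetical stand-in for the trained capacity prediction model and all weights and values are illustrative:

```python
# Sketch of limit capacity prediction: feed the full load index value of
# every preset index field (one value per model input) into the trained
# capacity prediction model.

def capacity_model(index_values):
    # stand-in for the trained neural network's forward pass
    weights = [100.0, 80.0, 60.0]
    return sum(w * v for w, v in zip(weights, index_values))

# One full load value per preset index field (e.g. cpu, memory, i/o),
# matching the number of inputs of the capacity prediction model.
full_load_values = [1.0, 1.0, 1.0]

limit_capacity = capacity_model(full_load_values)  # limit capacity prediction
```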
In the cluster capacity prediction method provided by the embodiment of the application, after the capacity prediction result and the limit capacity prediction result of the cluster output by the capacity prediction model are obtained, the capacity prediction result and the limit capacity prediction result can be sent to a visual interface for displaying.
In the method provided in the embodiment of the present application, an overall implementation of the cluster capacity prediction method is described, as shown in fig. 4, specifically including the following steps:
The distributed monitoring system zabbix collects the performance index values of all layers of each system application as performance index samples and writes them into the distributed log system kafka, with indexes of the same type written into the same kafka topic; the cluster is coordinated and managed by the distributed coordination service zookeeper. The indexes required for the current cluster are customized, and data is consumed periodically by multiple processes. The time-series database opentsdb periodically stores the performance index samples collected from kafka. The data in opentsdb is read and preprocessed: multiple performance index samples with the same time-series dimension are associated into one row of data to form a sample group. A pre-established neural network model is trained on the sample groups, and the trained model is optimized to obtain the capacity prediction model. The capacity prediction model is then used to perform real-time or periodic capacity prediction on the cluster; limit capacity prediction can also be performed based on the full load indexes, and the capacity prediction results are sent to a front-end visual interface for display.
The trained model is used to predict the limit capacity value that the system can bear under the corresponding high index values, and the predicted value is automatically pushed to the front end for updating.
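The association step in this pipeline, combining performance index samples that share a timestamp into one row of data, can be sketched as follows (sample values and field names are illustrative, not from the patent):

```python
# Sketch of the preprocessing step in the pipeline of fig. 4: associate
# performance index samples sharing the same time-series dimension
# (timestamp) into one row, forming a sample group.
from collections import defaultdict

samples = [  # (timestamp, system application, index field, value)
    (1600000000, "A", "cpu", 0.60),
    (1600000000, "B", "cpu", 0.50),
    (1600000060, "A", "cpu", 0.65),
    (1600000060, "B", "cpu", 0.55),
]

rows = defaultdict(dict)
for ts, app, field, value in samples:
    rows[ts][f"{app}.{field}"] = value  # same timestamp -> same row

sample_groups = [rows[ts] for ts in sorted(rows)]
```

Each resulting row has the same number of target performance index samples, matching the requirement that every sample group contain the same number of indicators.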
Corresponding to the method described in fig. 1, an embodiment of the present application further provides a cluster capacity prediction apparatus, which is used for implementing the method in fig. 1 specifically, and a schematic structural diagram of the cluster capacity prediction apparatus is shown in fig. 5, and specifically includes:
a first obtaining unit 501, configured to obtain each performance index value of each system application in an operation process of each system application;
a first extracting unit 502, configured to extract a performance index value corresponding to each preset index field from each performance index value, and determine each extracted performance index value as a target performance index value;
a first prediction unit 503, configured to input each target performance index value into a capacity prediction model constructed in advance, and obtain a capacity prediction result of the cluster output by the capacity prediction model.
In the cluster capacity prediction device provided in the embodiment of the application, each performance index value of each system application is obtained during the running of that system application; the performance index value corresponding to each preset index field is extracted from those performance index values; each extracted performance index value is determined as a target performance index value; and each target performance index value is input into a pre-constructed capacity prediction model to obtain the capacity prediction result of the cluster output by the capacity prediction model. Because the capacity prediction model is constructed in advance, the data used for cluster capacity prediction comes from the real production environment, and prediction is performed on the basis of the whole cluster, the cluster capacity prediction result obtained by the device is more accurate than in the prior art.
In an embodiment of the present application, based on the foregoing scheme, the method may further include:
the second acquisition unit is used for acquiring each performance index sample pre-stored in the database;
the preprocessing unit is used for preprocessing data of each performance index sample to obtain an initial performance index sample corresponding to each performance index sample;
the second extraction unit is used for extracting the initial performance index corresponding to each preset index field from each initial performance index sample and determining each extracted initial performance index as a target performance index sample;
the grouping unit is used for grouping the target performance index samples according to the corresponding time stamp of each target performance index sample to obtain a plurality of sample groups; the number of target performance indicator samples of each sample group is the same;
a third obtaining unit, configured to obtain a capacity value corresponding to each sample group;
the training unit is used for establishing a neural network model and training the neural network model according to the sample groups and the capacity values corresponding to the sample groups;
the first determining unit is used for determining the sample group meeting the preset boundary condition in each sample group as an optimized sample group;
and the optimization unit is used for optimizing the trained neural network model according to the capacity values corresponding to the optimization sample sets and the optimization sample sets to obtain a target neural network model, and the target neural network model is used as a capacity prediction model.
In an embodiment of the present application, based on the foregoing scheme, the method may further include:
the third acquisition unit is used for acquiring each performance index value of each system application according to a preset period in the running process of each system application, and taking each acquired performance index value as a performance index sample;
the second determining unit is used for determining the service type of each performance index sample;
and the storage unit is used for storing the performance index samples to the positions corresponding to the service types of the performance index samples in the database aiming at each performance index sample.
In an embodiment of the present application, based on the foregoing scheme, the optimization unit, in optimizing the trained neural network model according to each optimized sample group to obtain the target neural network model, is specifically configured to:
selecting one of the optimized sample groups from each of the optimized sample groups;
optimizing the current neural network model by the selected optimized sample group to obtain a first result;
calculating an error rate based on the current first result and the selected optimized sample set;
judging whether the error rate meets a preset error threshold value or not;
if not, selecting one optimized sample group from the rest unselected optimized sample groups, and returning to the step of executing the optimization of the selected optimized sample group on the current neural network model to obtain a first result until the current error rate meets the preset error threshold value, and completing the optimization of the neural network model;
and taking the optimized neural network model as a target neural network model.
In an embodiment of the present application, based on the foregoing scheme, the method may further include:
and the display unit is used for sending the capacity prediction result of the cluster to a visual interface for displaying.
In an embodiment of the present application, based on the foregoing scheme, the method may further include:
the fourth obtaining unit is used for obtaining the full load index value corresponding to each preset index field;
and a second prediction unit, configured to input each full load index value into the capacity prediction model, and obtain a limit capacity prediction result of the cluster output by the capacity prediction model.
An embodiment of the present application further provides a storage medium, where the storage medium includes stored instructions, and when the instructions are executed, the apparatus where the storage medium is located is controlled to perform the following operations:
in the running process of each system application, obtaining each performance index value of each system application;
extracting performance index values corresponding to each preset index field from each performance index value, and determining each extracted performance index value as a target performance index value;
and inputting each target performance index value into a pre-constructed capacity prediction model to obtain a capacity prediction result of the cluster output by the capacity prediction model.
An embodiment of the present application further provides an electronic device, whose schematic structural diagram is shown in fig. 6, and which specifically includes a memory 601 and one or more instructions 602, where the one or more instructions 602 are stored in the memory 601 and are configured to be executed by one or more processors 603 to perform the following operations:
in the running process of each system application, obtaining each performance index value of each system application;
extracting performance index values corresponding to each preset index field from each performance index value, and determining each extracted performance index value as a target performance index value;
and inputting each target performance index value into a pre-constructed capacity prediction model to obtain a capacity prediction result of the cluster output by the capacity prediction model.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. For the device-like embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.
The foregoing describes a method and an apparatus for predicting cluster capacity provided by the present application in detail, and a specific example is applied in the present application to explain the principle and the implementation of the present application, and the description of the foregoing embodiment is only used to help understand the method and the core idea of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A method for cluster capacity prediction, wherein the cluster includes at least two system applications, the method comprising:
in the running process of each system application, obtaining each performance index value of each system application;
extracting performance index values corresponding to each preset index field from each performance index value, and determining each extracted performance index value as a target performance index value;
and inputting each target performance index value into a pre-constructed capacity prediction model to obtain a capacity prediction result of the cluster output by the capacity prediction model.
2. The method of claim 1, wherein the capacity prediction model is constructed by a process comprising:
obtaining each performance index sample pre-stored in a database;
performing data preprocessing on each performance index sample to obtain an initial performance index sample corresponding to each performance index sample;
extracting initial performance indexes corresponding to each preset index field from each initial performance index sample, and determining each extracted initial performance index as a target performance index sample;
grouping the target performance index samples according to the corresponding time stamp of each target performance index sample to obtain a plurality of sample groups; the number of target performance indicator samples of each sample group is the same;
acquiring a capacity value corresponding to each sample group;
establishing a neural network model, and training the neural network model according to the sample groups and the capacity values corresponding to the sample groups;
determining a sample group meeting a preset boundary condition in each sample group as an optimized sample group;
and optimizing the trained neural network model according to the optimized sample sets and the capacity values corresponding to the optimized sample sets to obtain a target neural network model, and taking the target neural network model as a capacity prediction model.
3. The method of claim 2, wherein the act of storing each of the performance indicator samples comprises:
in the running process of each system application, acquiring each performance index value of each system application according to a preset period, and taking each acquired performance index value as a performance index sample;
determining the service type of each performance index sample;
and for each performance index sample, storing the performance index sample into a position corresponding to the service type of the performance index sample in a database.
4. The method of claim 2, wherein the optimizing the trained neural network model according to the capacity value corresponding to each optimized sample set and each optimized sample set to obtain a target neural network model comprises:
selecting one of the optimized sample groups from each of the optimized sample groups;
training a current neural network model according to the selected optimized sample group and the capacity value of the optimized sample group to obtain a first result;
calculating an error rate according to the current first result and the capacity value of the optimized sample set corresponding to the current first result;
judging whether the current error rate is smaller than a preset error threshold value or not;
if not, selecting one optimized sample group from the rest unselected optimized sample groups, and returning to the step of executing the training of the current neural network model by the selected optimized sample group to obtain a first result until the current error rate meets the preset error threshold value, and completing the optimization of the neural network model;
and taking the optimized neural network model as a target neural network model.
5. The method of claim 1, wherein after obtaining the capacity prediction result of the cluster output by the capacity prediction model, further comprising:
and sending the capacity prediction result of the cluster to a visual interface for displaying.
6. The method of claim 1, further comprising:
acquiring full load index values corresponding to the preset index fields respectively;
and inputting each full load index value into the capacity prediction model to obtain the limit capacity prediction result of the cluster output by the capacity prediction model.
7. An apparatus for cluster capacity prediction, wherein the cluster includes at least two system applications, the apparatus comprising:
the first acquisition unit is used for acquiring each performance index value of each system application in the running process of each system application;
the first extraction unit is used for extracting the performance index value corresponding to each preset index field from each performance index value and determining each extracted performance index value as a target performance index value;
and the first prediction unit is used for inputting each target performance index value into a pre-constructed capacity prediction model to obtain a capacity prediction result of the cluster output by the capacity prediction model.
8. The apparatus of claim 7, further comprising:
the second acquisition unit is used for acquiring each performance index sample pre-stored in the database;
the preprocessing unit is used for preprocessing data of each performance index sample to obtain an initial performance index sample corresponding to each performance index sample;
the second extraction unit is used for extracting the initial performance index corresponding to each preset index field from each initial performance index sample and determining each extracted initial performance index as a target performance index sample;
the grouping unit is used for grouping the target performance index samples according to the corresponding time stamp of each target performance index sample to obtain a plurality of sample groups; the number of target performance indicator samples of each sample group is the same;
a third obtaining unit, configured to obtain a capacity value corresponding to each sample group;
the training unit is used for establishing a neural network model and training the neural network model according to the sample groups and the capacity values corresponding to the sample groups;
the first determining unit is used for determining the sample group meeting the preset boundary condition in each sample group as an optimized sample group;
and the optimization unit is used for optimizing the trained neural network model according to the capacity values corresponding to the optimization sample sets and the optimization sample sets to obtain a target neural network model, and the target neural network model is used as a capacity prediction model.
9. The apparatus of claim 8, further comprising:
the third acquisition unit is used for acquiring each performance index value of each system application according to a preset period in the running process of each system application, and taking each acquired performance index value as a performance index sample;
the second determining unit is used for determining the service type of each performance index sample;
and the storage unit is used for storing the performance index samples to the positions corresponding to the service types of the performance index samples in the database aiming at each performance index sample.
10. The apparatus of claim 8, wherein the optimization unit performs optimization of the trained neural network model according to each optimized sample set to obtain a target neural network model, and is configured to:
selecting one of the optimized sample groups from each of the optimized sample groups;
optimizing the current neural network model by the selected optimized sample group to obtain a first result;
calculating an error rate based on the current first result and the selected optimized sample set;
judging whether the error rate meets a preset error threshold value or not;
if not, selecting one optimized sample group from the rest unselected optimized sample groups, and returning to the step of executing the optimization of the selected optimized sample group on the current neural network model to obtain a first result until the current error rate meets the preset error threshold value, and completing the optimization of the neural network model;
and taking the optimized neural network model as a target neural network model.
CN202011038553.8A 2020-09-28 2020-09-28 Cluster capacity prediction method and device Active CN112115036B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011038553.8A CN112115036B (en) 2020-09-28 2020-09-28 Cluster capacity prediction method and device

Publications (2)

Publication Number Publication Date
CN112115036A true CN112115036A (en) 2020-12-22
CN112115036B CN112115036B (en) 2024-09-17

Family

ID=73798611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011038553.8A Active CN112115036B (en) 2020-09-28 2020-09-28 Cluster capacity prediction method and device

Country Status (1)

Country Link
CN (1) CN112115036B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170111233A1 (en) * 2015-10-15 2017-04-20 Citrix Systems, Inc. Systems and methods for determining network configurations using historical and real-time network metrics data
CN107231264A (en) * 2017-07-25 2017-10-03 北京百度网讯科技有限公司 For the method and apparatus for the capacity for managing Cloud Server
CN110289994A (en) * 2019-06-06 2019-09-27 厦门网宿有限公司 A kind of cluster capacity adjustment method and device
US20200027014A1 (en) * 2015-12-30 2020-01-23 Nutanix, Inc. Method for forecasting distributed resource utilization in a virtualization environment
CN111124689A (en) * 2019-12-31 2020-05-08 中国电子科技集团公司信息科学研究院 Dynamic allocation method for container resources in cluster
CN111427753A (en) * 2020-03-23 2020-07-17 上海新炬网络信息技术股份有限公司 ARIMA model-based capacity prediction device and control method thereof


Also Published As

Publication number Publication date
CN112115036B (en) 2024-09-17

Similar Documents

Publication Publication Date Title
CN110149540B (en) Recommendation processing method and device for multimedia resources, terminal and readable medium
CN110059894B (en) Equipment state evaluation method, device, system and storage medium
CN108647329B (en) User behavior data processing method and device and computer readable storage medium
CN116307215A (en) Load prediction method, device, equipment and storage medium of power system
CN113268403B (en) Time series analysis and prediction method, device, equipment and storage medium
CN110764714A (en) Data processing method, device and equipment and readable storage medium
US11651271B1 (en) Artificial intelligence system incorporating automatic model updates based on change point detection using likelihood ratios
CN110956278A (en) Method and system for retraining machine learning models
CN116225848A (en) Log monitoring method, device, equipment and medium
CN110610267A (en) Talent information processing method and device, computer storage medium and electronic equipment
CN112115036B (en) Cluster capacity prediction method and device
CN116204522A (en) Data auditing method and device, electronic equipment and storage medium
CN115767601A (en) 5GC network element automatic nanotube method and device based on multidimensional data
CN113344585A (en) Anti-fraud prediction model training method and device, storage medium and electronic equipment
CN113610225A (en) Quality evaluation model training method and device, electronic equipment and storage medium
CN112395167A (en) Operation fault prediction method and device and electronic equipment
CN116843203B (en) Service access processing method, device, equipment, medium and product
CN117667606B (en) High-performance computing cluster energy consumption prediction method and system based on user behaviors
CN118312399B (en) Test environment detection method, electronic equipment and computer readable storage medium
CN117473268A (en) Threshold prediction method, system, equipment and storage medium
CN118212033A (en) Data processing method, device, equipment and storage medium
CN117952446A (en) Monitoring method of business processing model, related equipment and storage medium
CN116402375A (en) Index state management method and device
CN117852695A (en) Automatic iteration method and device for artificial intelligent model
CN116069604A (en) User behavior prediction method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant