CN115220904A - VNF resource demand prediction method and system based on feature selection - Google Patents

Publication number
CN115220904A
Authority
CN
China
Prior art keywords
vnf
resource demand
data
demand prediction
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110413409.6A
Other languages
Chinese (zh)
Inventor
江凌云
武静雯
朱洪波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202110413409.6A
Publication of CN115220904A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5077 Logical partitioning of resources; Management or configuration of virtualized resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a VNF resource demand prediction method and system based on feature selection. The method comprises the following steps: obtaining VNF reference data; preprocessing the VNF reference data to obtain preprocessed data from which abnormal data, non-numerical data and irrelevant data have been removed; inputting the preprocessed data into pre-trained VNF resource demand prediction models for different VNF types; and determining the type of the VNF from the VNF reference data, selecting the VNF resource demand prediction model corresponding to that type, and outputting a VNF resource demand prediction result through that model. The advantages are as follows: a candidate feature set highly correlated with the prediction target is first screened out based on the data features; the candidate feature set is then further screened with a greedy forward search strategy to obtain an optimal feature set; finally, different types of prediction models are trained on the optimal feature set. This yields better prediction performance, and the method is also readily extensible.

Description

VNF resource demand prediction method and system based on feature selection
Technical Field
The invention relates to a VNF resource demand prediction method and a VNF resource demand prediction system based on feature selection, and belongs to the technical field of network virtualization and network slicing.
Background
In the telecommunications industry, network slicing based on Software Defined Networking (SDN) and Network Function Virtualization (NFV) is attracting growing attention. In this scenario, a Service Function Chain (SFC) creates a dynamic network service, and because the Virtual Network Functions (VNFs) in the SFC run on general-purpose servers, this service processing architecture keeps the VNFs flexible and adjustable. Under earlier VNF resource allocation policies, each VNF is usually instantiated with a fixed amount of resources that developers define in advance in the VNF descriptor. However, a VNF rarely needs a fixed amount of resources, so this approach easily leads to under-allocation or over-allocation. Designing an effective VNF resource allocation scheme is therefore a major challenge.
In network service deployment, the resource requirements of VNFs must be predicted, and service operators must map the performance specifications in Service Level Agreements (SLAs) to an appropriate amount of resources in the virtualization infrastructure. With dedicated hardware, performance specifications are easy to guarantee because the environment is controlled and isolated, but the total resource reservation is inflexible, which often results in over-provisioned and expensive hardware. With VNFs, by contrast, the resource reservation can be expressed as the amount of resources needed to handle the real-time network load, and that amount can be adjusted dynamically to keep meeting the SLA performance specifications as network load or performance fluctuates. However, accurately predicting the resource requirements of a VNF from real-time network load is not easy.
Because VNFs are implemented in software, numerous configuration combinations must be tested during data collection, and the resulting configuration files contain millions of data points. Machine learning is therefore well suited to processing these configuration files, since it can automatically train a suitable, accurate and compact model from the measured data. Feature selection is the process of selecting the most effective features from the original features in order to reduce data dimensionality, and it is an important means of improving the performance of a learning algorithm. Since the VNF network load is characterized by multiple features in the configuration file, a model that uses only a single feature as the basis for prediction gives one-sided results. However, using all relevant network load features for prediction ignores the correlation among features and also incurs a large processing cost. How to select effectively among all relevant features so as to optimize the model's prediction of VNF resource demand is therefore one of the key points and difficulties of current research.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the defects of the prior art and provide a VNF resource demand prediction method and a VNF resource demand prediction system based on feature selection.
In order to solve the above technical problem, the present invention provides a VNF resource demand prediction method based on feature selection, including:
obtaining VNF reference data;
preprocessing VNF reference data to obtain preprocessed data with abnormal data, non-numerical data and irrelevant data removed;
inputting the preprocessed data into pre-trained VNF resource demand prediction models for different VNF types;
determining the type of the VNF according to the VNF reference data, determining the VNF resource demand prediction model corresponding to that type, and outputting a VNF resource demand prediction result through that VNF resource demand prediction model;
the process of determining the VNF resource demand prediction models for the different VNF types includes:
obtaining historical reference data for different types of VNF;
preprocessing the historical reference data of a given VNF type to obtain preprocessed historical data with abnormal data, non-numerical data and irrelevant data removed;
extracting all relevant features of the VNF from the preprocessed historical data;
screening out a candidate feature set highly correlated with the prediction target from all relevant features of the VNF;
further screening the candidate feature set based on a greedy forward search strategy to obtain an optimal feature set;
and training VNF resource demand prediction models for the VNF type with different regression models based on the optimal feature set, validating the trained models, and determining the optimal VNF resource demand prediction model for that VNF type.
Further, the process of screening out a candidate feature set highly correlated with the predicted target according to all correlated features of the VNF includes:
screening all the obtained relevant features of the VNF to select the original features that have a high temporal correlation with the VNF resource allocation cpu, obtaining an original feature set $N_{raw}$;
calculating a Pearson correlation coefficient and a distance correlation coefficient to capture the linear and nonlinear relations between the original features and the prediction target, and, according to these two coefficients, selecting $N_{can}$ features from the original feature set $N_{raw}$ as the candidate feature set, where $N_{can}$ is a settable parameter.
Further, the process of further screening the candidate feature set based on the greedy forward search strategy to obtain the optimal feature set includes:
selecting from the candidate feature set the single feature that minimizes the prediction error RMSE; then repeatedly selecting, from the remaining candidate features, the feature that most improves the prediction error RMSE and adding it to the final feature set; and stopping when the largest achievable improvement is smaller than a preset threshold, at which point the final feature set is taken as the optimal feature set $M_{pred}$.
Further, the different regression models include:
a linear regression model, a ridge regression model, a K-nearest neighbor model, a decision tree model, a random forest model, and an adaptive boosting model.
A VNF resource demand prediction system based on feature selection, comprising:
the first acquisition module is used for acquiring VNF reference data;
the first preprocessing module is used for preprocessing the VNF reference data to obtain preprocessed data with abnormal data, non-numerical data and irrelevant data removed;
the input module is used for inputting the preprocessed data into pre-trained VNF resource demand prediction models for different VNF types;
the output module is used for determining the type of the VNF according to the VNF reference data, determining a VNF resource demand prediction model corresponding to the type of the VNF according to the type of the VNF, and outputting a VNF resource demand prediction result through the VNF resource demand prediction model;
the output module includes a model determination module comprising:
the second acquisition module is used for acquiring different types of VNF historical reference data;
the second preprocessing module is used for preprocessing the VNF historical reference data of a certain type to obtain preprocessed historical data with abnormal data, non-numerical data and irrelevant data removed;
the extraction module is used for extracting all relevant features of the VNF according to the preprocessed historical data;
the first screening module is used for screening out a candidate feature set highly related to the predicted target according to all related features of the VNF;
the second screening module is used for further screening the candidate feature set based on a greedy forward search strategy to obtain an optimal feature set;
and the optimal model determining module is used for training different VNF resource demand prediction models corresponding to the VNF types by adopting different regression models based on the optimal feature set, verifying the trained different VNF resource demand prediction models corresponding to the VNF types and determining the optimal VNF resource demand prediction model corresponding to the VNF types.
Further, the first screening module includes:
the original feature screening module is used for screening all the obtained relevant features of the VNF to select the original features that have a high temporal correlation with the VNF resource allocation cpu, obtaining an original feature set $N_{raw}$;
the subsequent feature screening module is used for calculating a Pearson correlation coefficient and a distance correlation coefficient to capture the linear and nonlinear relations between the original features and the prediction target, and, according to these two coefficients, selecting $N_{can}$ features from the original feature set $N_{raw}$ as the candidate feature set, where $N_{can}$ is a settable parameter.
Further, the second screening module comprises:
the adding module is used for selecting from the candidate feature set the single feature that minimizes the prediction error RMSE, then repeatedly selecting from the remaining candidate features the feature that most improves (that is, most reduces) the prediction error RMSE, and adding it to the final feature set;
the judging module is used for stopping the addition of features to the final feature set when the largest achievable improvement is smaller than a preset threshold, the final feature set at that point being the optimal feature set $M_{pred}$.
The invention achieves the following beneficial effects:
The algorithm is applied to VNF resource demand prediction in a network slicing environment. To address the problem that conventional VNF resource allocation strategies cannot satisfy dynamic resource demand and easily lead to insufficient or excessive resource allocation, the invention provides a VNF resource demand prediction method based on a two-stage algorithm (TSA). Models trained with this method achieve better prediction performance, the method is readily extensible, and the trained models can be integrated directly into existing VNF deployment algorithms.
Drawings
FIG. 1 shows the VNF dynamic resource allocation framework in a network slicing scenario.
FIG. 2 is a block diagram of the VNF resource demand prediction method.
FIGS. 3, 4, 5 and 6 show simulation results, using the Nginx VNF as an example: a comparison of model prediction error, a comparison of model prediction $R^2$, a comparison of model training time, and a comparison of model prediction time, respectively.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the embodiments described below are only a part of the embodiments of the present invention, but not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 shows the VNF dynamic resource allocation framework in a network slicing scenario. From left to right it comprises the wireless virtual network users, the virtual network management platform, the virtual resource pool, and the SFCs formed by arranging, in order, the VNFs abstracted from the underlying network functions. In this system, wireless virtual network users submit service requests whose requirements on mobility, security, latency, reliability and so on differ across application scenarios. In the virtual network management platform, a network state monitoring entity monitors the network state of the VNFs, and a virtual network scheduler uniformly schedules resources based on the VNF resource demand predictions. Traditional network entities of different types are abstracted into VNFs through NFV, and multiple VNFs of different types are arranged in the required order to form an SFC; to satisfy the different service requests made by users, the required resources must be allocated to the different types of VNF in real time. The cloud servers in the virtual resource pool provide several kinds of virtual network resources, including computing, cache and bandwidth resources. The network state monitoring entity monitors the VNF network state and obtains the corresponding VNF configuration file; a model that accurately predicts the resource requirements of the VNF is trained with a machine learning method; and the virtual network scheduler then schedules according to the prediction results, that is, it dynamically allocates the virtual resources in the resource pool.
The objective of this work is formulated as follows. The substrate network $G = (V, L)$ is the underlying network on which a given network service is to be deployed. Users are distributed throughout the network: each user $u \in U$ is located at a network node $v \in V$ and requests a network service $S$ at data rate $\lambda_u$. The network service $S = (C, A)$ consists of a number of components $c \in C$ (i.e. VNFs) that are interconnected by virtual links $a \in A$. A VNF may be instantiated one or more times, and each VNF instance is allocated a certain amount of resources $r_c$ according to its resource requirements. The prior art generally sets $r_c$ to a predetermined fixed value or to a linear function $r_c(\lambda)$ of the network load $\lambda$; neither assumption is realistic, and both easily lead to over-allocation or under-allocation. The object of the present invention is therefore to model and predict VNF resource requirements accurately using machine learning. With reference to FIG. 2, a block diagram of the VNF resource demand prediction method, the specific implementation steps are as follows:
First, a data set is collected. Relevant features of the VNF, including experimental features and time series features, are collected with an NFV benchmarking framework. A VNF profile is obtained by analyzing a given VNF, systematically configuring it with different resources, and logging the corresponding performance. The data set covers representative VNFs from a network security scenario, a Web scenario and an Internet of Things scenario: a VNF that passively analyzes traffic and forwards it transparently (an IDS), a VNF that actively modifies traffic (a proxy server), and a VNF from the Internet of Things scenario (an MQTT server) that can serve as an example of a 5G vertical industry. Details are given in Table 1.
Table 1 VNF reference dataset summary (the table is reproduced only as an image in the original publication)
Second, the data are preprocessed. The data set in Table 1 mainly comprises the execution environment information of the VNF, the processing information of the VNF, monitoring information and so on. The raw data obtained directly from the test environment are therefore very disordered and contain missing data, abnormal data and non-numerical data that is hard to process. Before model training, the data set is preprocessed: missing values are set to the median of the corresponding feature, and abnormal data, non-numerical data and a small amount of irrelevant data are removed, which makes the raw data cleaner and more usable.
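The preprocessing step can be sketched in plain Python as follows. This is a minimal illustration, not the patent's implementation: the row layout (a list of dicts with `None` for missing values), the function names, and the z-score rule used to flag abnormal data are all assumptions, since the text does not define what counts as abnormal. Removal of "irrelevant" data is left out because the text gives no criterion for it.

```python
from statistics import mean, median, pstdev


def is_number(value):
    """True for int/float values, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def preprocess(rows, z_max=3.0):
    """Clean benchmark rows given as a list of dicts with identical keys."""
    # 1. Keep only columns whose non-missing values are all numeric.
    numeric = []
    for col in rows[0]:
        present = [r[col] for r in rows if r[col] is not None]
        if present and all(is_number(v) for v in present):
            numeric.append(col)
    cleaned = [{col: r[col] for col in numeric} for r in rows]

    # 2. Replace missing values with the median of the corresponding feature.
    for col in numeric:
        med = median(r[col] for r in cleaned if r[col] is not None)
        for r in cleaned:
            if r[col] is None:
                r[col] = med

    # 3. Drop rows containing an outlier (assumed rule: |z-score| > z_max).
    stats = {}
    for col in numeric:
        values = [r[col] for r in cleaned]
        stats[col] = (mean(values), pstdev(values))

    def is_normal(row):
        return all(sd == 0 or abs(row[col] - mu) / sd <= z_max
                   for col, (mu, sd) in stats.items())

    return [r for r in cleaned if is_normal(r)]
```

Imputing before outlier removal keeps rows with a single missing field in the data set; reversing the order would be an equally defensible choice.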
Third, the optimal single feature is selected. In view of the shortcomings of existing research, all relevant features are traversed to find the single feature that minimizes the prediction error RMSE.
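A minimal sketch of this traversal is shown below. It assumes a one-variable least-squares line as the stand-in predictor and measures RMSE on the same data it was fitted on; the patent instead compares six regression models, so the helper names (`fit_line`, `rmse`, `best_single_feature`) and the choice of model are illustrative only.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def best_single_feature(features, target):
    """Try every feature on its own and return the one with the lowest RMSE."""
    errors = {}
    for name, values in features.items():
        slope, intercept = fit_line(values, target)
        errors[name] = rmse(target, [slope * v + intercept for v in values])
    return min(errors, key=errors.get), errors
```

The returned error table also serves as the single-feature baseline against which a multi-feature set can later be compared.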
Fourth, a candidate feature set is selected. In stage 1 (see Algorithm 1), the number of original features under consideration is reduced by keeping only those with a high temporal correlation with the VNF resource allocation cpu. The invention denotes the VNF resource allocation (here, CPU time) as cpu, and represents the time series of cpu used by the VNF up to the end of time interval $t$ as a vector (the vector expression is rendered as an image in the original publication). An original feature is denoted $m$ and the set of all original features $M$; the time series of each feature up to the end of time interval $t$ is likewise represented as a vector (also rendered as an image in the original publication).
For each feature $m \in M$, $r_m$ denotes the absolute value of the Pearson correlation coefficient between cpu and $m$, and $d_m$ denotes the distance correlation coefficient between cpu and $m$. The algorithm takes both correlation measures into account, because a Pearson correlation coefficient of 0 does not necessarily mean that two variables are independent; they may still be nonlinearly related. The distance correlation coefficient was introduced to overcome this weakness of the Pearson correlation coefficient, and it mainly measures the degree of nonlinear correlation. Finally, $N_{can}$ features are selected from the original feature set $N_{raw}$ as the candidate features for the second-stage algorithm. $N_{can}$ is a settable parameter that allows the second stage to trade off higher model accuracy against lower overhead.
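The two correlation measures of stage 1 can be sketched in plain Python as below. The text does not say how $r_m$ and $d_m$ are combined into a ranking, so scoring each feature by the larger of the two is an assumption made here for illustration, as are the function names. The distance correlation follows the standard sample definition based on double-centered pairwise distance matrices.

```python
from math import sqrt


def pearson_abs(x, y):
    """Absolute value of the Pearson correlation coefficient (r_m)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / sqrt(sxx * syy) if sxx and syy else 0.0


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # row means equal column means (symmetry)
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def distance_corr(x, y):
    """Sample distance correlation (d_m), a value in [0, 1]."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for r in a for v in r) / n ** 2
    dvar_y = sum(v * v for r in b for v in r) / n ** 2
    denom = sqrt(dvar_x * dvar_y)
    return sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


def select_candidates(features, cpu, n_can):
    """Rank features by max(r_m, d_m) (assumed rule) and keep the top n_can."""
    scores = {m: max(pearson_abs(s, cpu), distance_corr(s, cpu))
              for m, s in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_can]
```

For a symmetric quadratic relation such as `x = [-2, -1, 0, 1, 2]` and `y = [4, 1, 0, 1, 4]`, the Pearson coefficient is zero while the distance correlation is clearly positive, which is the case the text says the second measure is meant to catch.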
Fifth, the optimal feature set is selected. In stage 2 (see Algorithm 2), various subsets of the candidate features produced in stage 1 are explored and the subset with the best prediction performance is selected. During selection, the model's error RMSE (root mean square error) is obtained by 10-fold cross-validation, and the prediction capabilities of the six prediction models are compared at the same time.
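A 10-fold cross-validated RMSE can be computed along the following lines. This sketch makes several assumptions that are not in the text: folds are interleaved without shuffling, the squared errors are pooled over all folds before taking the root, and the two toy models (a mean baseline and a 1-nearest-neighbour regressor) merely stand in for the six regression models named in the patent.

```python
from math import sqrt


def kfold_rmse(fit, X, y, k=10):
    """Pooled RMSE over k interleaved folds; fit(X, y) must return a predictor."""
    n = len(y)
    folds = [list(range(i, n, k)) for i in range(k)]
    squared_error = 0.0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [i for i in range(n) if i not in held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        squared_error += sum((predict(X[i]) - y[i]) ** 2 for i in test_idx)
    return sqrt(squared_error / n)


def fit_mean(X_train, y_train):
    """Baseline that always predicts the training mean."""
    avg = sum(y_train) / len(y_train)
    return lambda x: avg


def fit_1nn(X_train, y_train):
    """1-nearest-neighbour regressor using squared Euclidean distance."""
    def predict(x):
        def dist(i):
            return sum((a - b) ** 2 for a, b in zip(X_train[i], x))
        return y_train[min(range(len(X_train)), key=dist)]
    return predict


def compare_models(models, X, y, k=10):
    """Return the cross-validated RMSE of every model in a name -> fit mapping."""
    return {name: kfold_rmse(fit, X, y, k) for name, fit in models.items()}
```

Because every model is passed in as a `fit` callable, the same loop can score any regressor, which mirrors the side-by-side comparison of models described above.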
After stage 1 the number of features has been reduced (the feature counts before and after are rendered as images in the original publication), and the remaining candidate features are highly correlated with the prediction target. Even so, combining all candidate features in one model does not necessarily provide more useful information. A greedy forward search strategy is therefore proposed: from the $N_{can}$ features obtained in stage 1, the single feature that minimizes the prediction error RMSE is selected first; then, from the remaining features, the feature that most improves the RMSE is selected in turn, and a feature is no longer added to the final feature set once its improvement falls below a given threshold. This yields the optimal feature set $M_{pred}$ for each model. In the specific implementation, taking the Nginx VNF as an example, the optimal feature sets obtained with the different ML algorithms are shown in Table 2.
Table 2 Optimal feature sets for the different ML algorithms (Nginx VNF) (the table is reproduced only as an image in the original publication)
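The greedy forward search of stage 2 can be sketched as follows. The function and parameter names are hypothetical, and `evaluate` is an assumed callback that returns the RMSE of a model trained on a given feature subset (for example, a cross-validated error); the patent does not prescribe this interface.

```python
def greedy_forward_selection(candidates, evaluate, threshold):
    """Grow a feature set greedily until the RMSE gain drops below threshold.

    candidates: iterable of feature names from stage 1.
    evaluate:   callable taking a list of feature names and returning an RMSE.
    threshold:  minimum RMSE improvement required to accept another feature.
    """
    selected = []
    remaining = list(candidates)
    best_error = float("inf")
    while remaining:
        # Score every remaining feature when added to the current set.
        errors = {f: evaluate(selected + [f]) for f in remaining}
        best_feature = min(errors, key=errors.get)
        improvement = best_error - errors[best_feature]
        # The first feature is always accepted; later ones must clear the bar.
        if selected and improvement < threshold:
            break
        selected.append(best_feature)
        remaining.remove(best_feature)
        best_error = errors[best_feature]
    return selected, best_error
```

With $N_{can}$ candidates the search makes at most on the order of $N_{can}^2$ calls to `evaluate`, far fewer than the $2^{N_{can}}$ subsets an exhaustive search would need, which is one reason for keeping $N_{can}$ small in stage 1.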
Sixth, the algorithm is integrated into a VNF deployment algorithm. First, the VNF deployment algorithm loads the trained machine learning models into memory; all models are preloaded when the algorithm is initialized so that VNF resource requirements can be predicted quickly. Then the routine that determines the resource allocation of a VNF instance is adjusted so that it no longer returns predefined values but the prediction of the optimal machine learning model for the corresponding VNF. The resources allocated to each deployed VNF instance are thus determined dynamically, which addresses both over-allocation and under-allocation. The invention is described using the B-JointSP algorithm as a representative recent VNF deployment algorithm. In the deployment algorithm, a pre-trained machine learning model is loaded for every requested VNF. Because the machine learning models of the sklearn API share a consistent interface, loading and exchanging different machine learning models is simple and transparent to the algorithm, which makes it easy to select the optimal prediction model. Finally, the machine learning model replaces the existing linear function approximation, so accurate resource allocation can be performed for each VNF instance with few changes to the existing VNF deployment algorithm.
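The integration idea can be illustrated with a small sketch in which any object exposing a `predict` method can be swapped in. Everything here is hypothetical: `ResourcePredictor`, `LinearApproximation` and `cpu_demand` are names invented for illustration and are not part of B-JointSP or of any library, and the linear rule only stands in for the existing linear function $r_c(\lambda)$ that the learned model replaces.

```python
from typing import Dict, Protocol, Sequence


class Regressor(Protocol):
    """Anything with a predict(X) method taking rows of feature values."""

    def predict(self, X: Sequence[Sequence[float]]) -> Sequence[float]: ...


class LinearApproximation:
    """Stand-in for the existing rule r_c(load) = slope * load + intercept."""

    def __init__(self, slope: float, intercept: float) -> None:
        self.slope = slope
        self.intercept = intercept

    def predict(self, X: Sequence[Sequence[float]]) -> Sequence[float]:
        return [self.slope * row[0] + self.intercept for row in X]


class ResourcePredictor:
    """Holds one preloaded model per VNF type and answers demand queries."""

    def __init__(self, models: Dict[str, Regressor]) -> None:
        # All models are kept in memory from initialization onward.
        self._models = dict(models)

    def swap(self, vnf_type: str, model: Regressor) -> None:
        """Exchange the model for one VNF type without touching the caller."""
        self._models[vnf_type] = model

    def cpu_demand(self, vnf_type: str, features: Sequence[float]) -> float:
        """Predict the cpu demand of one VNF instance from its features."""
        model = self._models[vnf_type]
        return float(model.predict([list(features)])[0])
```

Because the deployment code only calls `cpu_demand`, a trained regressor with the same `predict` interface could replace the linear rule through `swap` without any change to the caller, which is the kind of transparency the text attributes to a consistent model interface.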
FIG. 3 and FIG. 4 compare model prediction performance based on the optimal single feature and on the optimal feature set. FIG. 3 shows that, for the different types of prediction model, prediction performance based on the optimal feature set is better than that based on the optimal single feature, with a maximum improvement of 25%. FIG. 4 confirms this conclusion from another angle: except for the Dtree model, the prediction $R^2$ of all models improves. FIG. 5 and FIG. 6 compare model training time and prediction time based on the optimal single feature and on the optimal feature set. Relative to the optimal single feature, training time with the optimal feature set increases for some prediction models and decreases for others, while prediction time with the optimal feature set is noticeably longer. Because the baseline prediction time is small, however, the absolute increase is small and the resulting time cost is acceptable. This demonstrates the effectiveness of the proposed TSA-based VNF resource demand prediction method.
Correspondingly, the invention also provides a VNF resource demand prediction system based on feature selection, which includes:
the first acquisition module is used for acquiring VNF reference data;
the first preprocessing module is used for preprocessing VNF reference data to obtain preprocessed data with abnormal data, non-numerical data and irrelevant data removed;
the input module is used for inputting the preprocessed data into pre-trained VNF resource demand prediction models for different VNF types;
the output module is used for determining the type of the VNF according to the VNF reference data, determining a VNF resource demand prediction model corresponding to the type of the VNF according to the type of the VNF, and outputting a VNF resource demand prediction result through the VNF resource demand prediction model;
the output module includes a model determination module comprising:
the second acquisition module is used for acquiring different types of VNF historical reference data;
the second preprocessing module is used for preprocessing the VNF historical reference data of a certain type to obtain preprocessed historical data with abnormal data, non-numerical data and irrelevant data removed;
the extraction module is used for extracting all relevant features of the VNF according to the preprocessed historical data;
the first screening module is used for screening out a candidate feature set highly related to the predicted target according to all related features of the VNF;
the second screening module is used for further screening the candidate feature set based on a greedy forward search strategy to obtain an optimal feature set;
and the optimal model determining module is used for training different VNF resource demand prediction models corresponding to the VNF types by adopting different regression models based on the optimal feature set, verifying the trained different VNF resource demand prediction models corresponding to the VNF types and determining the optimal VNF resource demand prediction model corresponding to the VNF of the type.
The first screening module includes:
the original feature screening module is used for screening all the obtained relevant features of the VNF to select the original features that have a high temporal correlation with the VNF resource allocation cpu, obtaining an original feature set $N_{raw}$;
the subsequent feature screening module is used for calculating a Pearson correlation coefficient and a distance correlation coefficient to capture the linear and nonlinear relations between the original features and the prediction target, and, according to these two coefficients, selecting $N_{can}$ features from the original feature set $N_{raw}$ as the candidate feature set, where $N_{can}$ is a settable parameter.
The second screening module includes:
the adding module is used for selecting from the candidate feature set the single feature that minimizes the prediction error RMSE, then repeatedly selecting from the remaining candidate features the feature that most improves (that is, most reduces) the prediction error RMSE, and adding it to the final feature set;
the judging module is used for stopping the addition of features to the final feature set when the largest achievable improvement is smaller than a preset threshold, the final feature set at that point being the optimal feature set $M_{pred}$.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; while the invention has been described in detail and with reference to the foregoing examples, those skilled in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. A VNF resource demand prediction method based on feature selection is characterized by comprising the following steps:
obtaining VNF reference data;
preprocessing VNF reference data to obtain preprocessed data without abnormal data, non-numerical data and irrelevant data;
inputting the preprocessed data into pre-trained VNF resource demand prediction models for different VNF types;
determining the type of the VNF according to the VNF reference data, determining a VNF resource demand prediction model corresponding to the type of the VNF according to the type of the VNF, and outputting a VNF resource demand prediction result through the VNF resource demand prediction model;
wherein the process of determining the VNF resource demand prediction models for the different VNF types includes:
obtaining different types of VNF historical reference data;
preprocessing the VNF historical reference data of a certain type to obtain preprocessed historical data with abnormal data, non-numerical data and irrelevant data removed;
extracting all related features of the VNF according to the preprocessed historical data;
screening out a candidate feature set highly related to the predicted target according to all related features of the VNF;
further screening the candidate feature set based on a greedy forward search strategy to obtain an optimal feature set;
and training different VNF resource demand prediction models corresponding to the VNF types by adopting different regression models based on the optimal feature set, verifying the trained different VNF resource demand prediction models corresponding to the VNF types, and determining the optimal VNF resource demand prediction model corresponding to the VNF of the type.
2. The feature selection-based VNF resource demand prediction method of claim 1, wherein the process of screening out a candidate feature set highly correlated with a prediction target according to all correlated features of the VNF comprises:
screening all the obtained related features of the VNF and selecting the original features that have a high temporal correlation with the CPU resource allocation of the VNF, to obtain an original feature set N_raw;
calculating a Pearson correlation coefficient and a distance correlation coefficient, which capture the linear and nonlinear relations between the original features and the prediction target, and selecting, according to these two coefficients, N_can features from the features of the original feature set N_raw as the candidate feature set, where N_can is a settable parameter.
3. The feature selection-based VNF resource demand prediction method of claim 2, wherein the greedy forward search strategy-based process of further screening candidate feature sets to obtain an optimal feature set comprises:
selecting, from the candidate feature set, the single feature that minimizes the prediction error RMSE, then sequentially selecting, from the remaining candidate features, the feature that yields the greatest improvement (reduction) in the prediction error RMSE and adding that feature to the final feature set, and stopping the addition of features to the final feature set when the improvement offered by the best remaining feature is smaller than a preset threshold, the final feature set at that point being the optimal feature set M_pred.
4. The feature selection-based VNF resource demand prediction method of claim 1, wherein the different regression models include:
a linear regression model, a ridge regression model, a K-nearest neighbor model, a decision tree model, a random forest model, and an adaptive boosting model.
5. A VNF resource demand prediction system based on feature selection, comprising:
the first acquisition module is used for acquiring VNF reference data;
the first preprocessing module is used for preprocessing VNF reference data to obtain preprocessed data with abnormal data, non-numerical data and irrelevant data removed;
the input module is used for inputting the preprocessed data into pre-trained VNF resource demand prediction models for different VNF types;
the output module is used for determining the type of the VNF according to the VNF reference data, determining a VNF resource demand prediction model corresponding to the type of the VNF according to the type of the VNF, and outputting a VNF resource demand prediction result through the VNF resource demand prediction model;
the output module includes a model determination module comprising:
the second acquisition module is used for acquiring different types of VNF historical reference data;
the second preprocessing module is used for preprocessing the VNF historical reference data of a certain type to obtain preprocessed historical data with abnormal data, non-numerical data and irrelevant data removed;
the extraction module is used for extracting all related features of the VNF according to the preprocessed historical data;
the first screening module is used for screening out a candidate feature set highly related to the predicted target according to all related features of the VNF;
the second screening module is used for further screening the candidate feature set based on a greedy forward search strategy to obtain an optimal feature set;
and the optimal model determining module is used for training different VNF resource demand prediction models corresponding to the VNF types by adopting different regression models based on the optimal feature set, verifying the trained different VNF resource demand prediction models corresponding to the VNF types and determining the optimal VNF resource demand prediction model corresponding to the VNF types.
6. The feature selection-based VNF resource demand prediction system of claim 5, wherein the first screening module comprises:
the original feature screening module is used for screening all the obtained related features of the VNF and selecting the original features that have a high temporal correlation with the CPU resource allocation of the VNF, to obtain an original feature set N_raw;
a subsequent feature screening module is used for calculating a Pearson correlation coefficient and a distance correlation coefficient, which capture the linear and nonlinear relations between the original features and the prediction target, and for selecting, according to these two coefficients, N_can features from the features of the original feature set N_raw as the candidate feature set, where N_can is a settable parameter.
7. The feature selection-based VNF resource demand prediction system of claim 6, wherein the second screening module comprises:
an adding module, configured to select, from the candidate feature set, the single feature that minimizes the prediction error RMSE, and then to sequentially select, from the remaining candidate features, the feature that yields the greatest improvement (reduction) in the prediction error RMSE and add that feature to the final feature set;
a judging module, configured to stop the addition of features to the final feature set when the improvement offered by the best remaining feature is smaller than a preset threshold, the final feature set at that point being the optimal feature set M_pred.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110413409.6A CN115220904A (en) 2021-04-16 2021-04-16 VNF resource demand prediction method and system based on feature selection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110413409.6A CN115220904A (en) 2021-04-16 2021-04-16 VNF resource demand prediction method and system based on feature selection

Publications (1)

Publication Number Publication Date
CN115220904A true CN115220904A (en) 2022-10-21

Family

ID=83604453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110413409.6A Pending CN115220904A (en) 2021-04-16 2021-04-16 VNF resource demand prediction method and system based on feature selection

Country Status (1)

Country Link
CN (1) CN115220904A (en)

Similar Documents

Publication Publication Date Title
US10958515B2 (en) Assessment and dynamic provisioning of computing resources for multi-tiered application
Bunyakitanon et al. End-to-end performance-based autonomous VNF placement with adopted reinforcement learning
US10460241B2 (en) Server and cloud computing resource optimization method thereof for cloud big data computing architecture
US10070328B2 (en) Predictive network traffic management
US20200257968A1 (en) Self-learning scheduler for application orchestration on shared compute cluster
US9111232B2 (en) Portable workload performance prediction for the cloud
Kousiouris et al. Dynamic, behavioral-based estimation of resource provisioning based on high-level application terms in cloud platforms
US11704123B2 (en) Automated orchestration of containers by assessing microservices
US9740534B2 (en) System for controlling resources, control pattern generation apparatus, control apparatus, method for controlling resources and program
US11055139B2 (en) Smart accelerator allocation and reclamation for deep learning jobs in a computing cluster
US10862765B2 (en) Allocation of shared computing resources using a classifier chain
Vakilinia et al. Analysis and optimization of big-data stream processing
KR102027303B1 (en) Migration System and Method by Fuzzy Value Rebalance in Distributed Cloud Environment
CN116057518A (en) Automatic query predicate selective prediction using machine learning model
KR101630125B1 (en) Method for resource provisioning in cloud computing resource management system
US9367351B1 (en) Profiling input/output behavioral characteristics in distributed infrastructure
US20200150957A1 (en) Dynamic scheduling for a scan
Carvalho et al. QoE-aware container scheduler for co-located cloud environments
US11164086B2 (en) Real time ensemble scoring optimization
US11240340B2 (en) Optimized deployment of analytic models in an edge topology
KR102062332B1 (en) An Memory Bandwidth Management Method and Apparatus for Latency-sensitive Workload
CN115220904A (en) VNF resource demand prediction method and system based on feature selection
CN115913967A (en) Micro-service elastic scaling method based on resource demand prediction in cloud environment
Hanczewski et al. Determining Resource Utilization in Cloud Systems: An Analytical Algorithm for IaaS Architecture
US20240231927A1 (en) Proactive resource provisioning in large-scale cloud service with intelligent pooling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination