CN113268822A

CN113268822A - Centrifugal pump performance prediction method based on small-sample kernel machine learning

Info

Publication number: CN113268822A
Application number: CN202110385957.2A
Authority: CN
Inventors: 赵旭涛; 张德胜; 孙龙月; 杨港; 薛加磊; 沈熙
Original assignee: Jiangsu University
Current assignee: Jiangsu University
Priority date: 2021-04-09
Filing date: 2021-04-09
Publication date: 2021-08-17

Abstract

A centrifugal pump performance prediction method based on small-sample kernel machine learning belongs to the technical field of centrifugal pump performance prediction and mainly comprises the following steps: (1) performing feature selection and standardization on the collected sample data; (2) constructing a small-sample kernel machine learning Gaussian process regression prediction model; (3) selecting a suitable nonlinear kernel function; (4) training the unknown hyperparameters of the model on the training data; (5) verifying the validity of the model on the test data. The method takes the impeller structure and design parameters as its basic data, learns the nonlinear relation between the impeller parameters and the head and efficiency of the centrifugal pump through a Gaussian process regression model, and thereby predicts the performance of the centrifugal pump. The invention makes secondary use of existing centrifugal pump design data. The prediction model needs few training samples, is easy to construct, and offers high accuracy and good stability, which makes it well suited to rapid performance prediction in the engineering design and optimization of centrifugal pumps.

Description

A centrifugal pump performance prediction method based on small sample kernel machine learning

Technical field

The invention belongs to the field of centrifugal pump performance prediction, and specifically relates to a centrifugal pump performance prediction method based on small-sample kernel machine learning.

Background art

As general-purpose machines, centrifugal pumps are widely used in many industrial fields. The impeller is one of the most important hydraulic components of a centrifugal pump: it converts the mechanical energy of the motor into energy of the liquid and thereby transports the liquid, so a sound impeller design is very important to the performance of the whole pump. The impeller has many structural parameters, such as outlet width, number of blades and outer diameter. Changes in some of these parameters cause complex flow phenomena inside the rotating impeller, such as secondary flow, jet-wake flow and vortices. Some of these flow structures consume the energy of the liquid and thus affect pump performance, such as head and efficiency. Because the internal flow is so complex, the relationship between impeller structural parameters, internal flow and external characteristics is often strongly nonlinear, and the effect of changing a single impeller parameter on performance cannot be measured quantitatively. This makes the design, or design optimization, of a centrifugal pump difficult, and makes a preliminary prediction of pump performance very important.

The common methods for predicting centrifugal pump performance are flow-field analysis based on numerical simulation, empirical statistical methods, and data-driven machine learning methods. Modern pump design generally estimates performance by numerical simulation, but for physical models with complex structure the computational cost is prohibitive and the process is cumbersome. Empirical statistical methods compute the various losses in the pump from semi-empirical, semi-theoretical formulas: after making certain simplifications and assumptions about the flow, they relate each loss to the structural parameters of the pump and build a hydraulic loss prediction model. Because pump types differ, these methods do not generalize.

Gaussian process regression (GPR) is a data-driven kernel machine learning method. It is suited to nonlinear prediction from small samples and has several further advantages: it provides not only a predicted value but also a variance that quantifies the uncertainty of the prediction; its predictions are not very sensitive to the hyperparameters; and, as a kernel method, it offers many kernel functions that suit data with different characteristics, while new kernels can be developed to match the data. Because a GPR model is efficient to construct, needs few training samples and learns nonlinear relationships well, applying it to centrifugal pump performance prediction can greatly improve design efficiency and shorten the time needed to predict performance.

Summary of the invention

In view of the above, the purpose of the present invention is to provide a centrifugal pump performance prediction method based on small-sample kernel machine learning that can predict the hydraulic performance of a centrifugal pump from its impeller parameters efficiently and accurately, using only a small number of samples.

The present invention provides the following technical solution: a centrifugal pump performance prediction method based on small-sample kernel machine learning, comprising the following steps.

(1) Standardize the collected sample data.

(2) Construct a small-sample kernel machine learning Gaussian process regression prediction model.

(3) Select a suitable nonlinear kernel function.

(4) Train the unknown hyperparameters of the model on the training data.

(5) Verify the validity of the model on the test data.

In the above scheme, in step (1), to reduce the training time of the prediction model and preserve its prediction accuracy, the collected sample data are standardized so that each dimension has a mean of 0 and a variance of 1. The formula is:

$$\hat{x}_i = \frac{x_i - \mu}{\sigma}$$

where $\hat{x}_i$ is the standardized data, $x_i$ is the original data, $\mu$ is the mean of the data in the same dimension, and $\sigma$ is the standard deviation of the data in the same dimension.
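The standardization in step (1) can be sketched in a few lines. The patent works in MATLAB; Python with NumPy is swapped in here purely for illustration, and the function name `standardize` and the sample values are hypothetical.

```python
import numpy as np

def standardize(X):
    """Column-wise z-score; also returns the mean and standard deviation used."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)  # population standard deviation (ddof=0)
    return (X - mu) / sigma, mu, sigma

# Made-up values: rows are samples, columns are [specific speed, blade count]
X = np.array([[60.0, 5.0], [90.0, 6.0], [120.0, 7.0]])
Z, mu, sigma = standardize(X)
print(Z.mean(axis=0), Z.std(axis=0))
```

Returning `mu` and `sigma` alongside the standardized data keeps them available for mapping predictions back to physical units later.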

In the above scheme, in step (2), following the mathematical principle of Gaussian process regression, the prior joint distribution of the known training samples and the unknown test samples is constructed in MATLAB. Its formula is:

$$\begin{bmatrix} y \\ f_* \end{bmatrix} \sim \mathcal{N}\left(0,\ \begin{bmatrix} K(X,X)+\sigma_n^2 I & K(X,x_*) \\ K(x_*,X) & K(x_*,x_*) \end{bmatrix}\right)$$

where $y$ is the set of training sample output variables; $f_*$ is the output of the unknown test sample; $K(X,X)$ is the kernel function relationship among the training sample input variables; $K(X,x_*)$ and $K(x_*,X)$ are both the kernel function relationship between the training sample input variables and the test sample input variables; $K(x_*,x_*)$ is the kernel function relationship among the test sample input variables; $\sigma_n^2$ is the noise variance; and $I$ is the identity matrix.

In the above scheme, the posterior distribution of the unknown test sample output variable can be expressed as

$$f_* \mid X, y, x_* \sim \mathcal{N}\left(\bar{f}_*,\ \operatorname{cov}(f_*)\right)$$

where $\bar{f}_*$ is the mean of the posterior distribution, whose value represents the unknown output variable of the test sample, and $\operatorname{cov}(f_*)$ is the variance of the posterior distribution, which characterizes the uncertainty of that output variable. With $\sigma_n^2 I$ the noise term of the prior, they are given by the following two formulas:

$$\bar{f}_* = K(x_*,X)\left[K(X,X)+\sigma_n^2 I\right]^{-1} y$$

$$\operatorname{cov}(f_*) = K(x_*,x_*) - K(x_*,X)\left[K(X,X)+\sigma_n^2 I\right]^{-1} K(X,x_*)$$
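As a minimal sketch of how the posterior mean and covariance are computed in practice, the following uses the standard GPR expressions with a Cholesky factorization in place of an explicit matrix inverse. It assumes a squared-exponential kernel with placeholder hyperparameters. The names `se_kernel` and `gp_posterior` and the toy data are illustrative and do not come from the patent, whose implementation is in MATLAB.

```python
import numpy as np

def se_kernel(A, B, sigma_f=1.0, length=1.0):
    """Squared-exponential kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return sigma_f**2 * np.exp(-0.5 * d2 / length**2)

def gp_posterior(X, y, X_star, sigma_n=0.1, **kern):
    """Posterior mean and covariance of f_* given training data (X, y)."""
    K_y = se_kernel(X, X, **kern) + sigma_n**2 * np.eye(len(X))
    K_s = se_kernel(X_star, X, **kern)        # K(x_*, X)
    K_ss = se_kernel(X_star, X_star, **kern)  # K(x_*, x_*)
    L = np.linalg.cholesky(K_y)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K_y^{-1} y
    V = np.linalg.solve(L, K_s.T)                        # L^{-1} K(X, x_*)
    return K_s @ alpha, K_ss - V.T @ V

# Toy 1-D data (assumed already standardized)
X = np.array([[-1.0], [0.0], [1.0]])
y = np.array([-0.8, 0.1, 0.9])
mean, cov = gp_posterior(X, y, np.array([[0.5], [3.0]]))
print(mean, np.diag(cov))
```

The second test point lies far from the training inputs, so its predictive variance is larger than that of the first. This is the behavior the method relies on when it reports prediction uncertainty.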

In the above scheme, in step (3), the performance of three commonly used nonlinear kernel functions is compared: the squared exponential (SE) kernel, the rational quadratic (RQ) kernel and the Matern 5/2 kernel. The SE kernel is finally adopted to model the nonlinear relationship between the impeller parameters and centrifugal pump performance. Its formula is:

$$k_{SE}(x, x') = \sigma_f^2 \exp\left(-\frac{\lVert x - x' \rVert^2}{2 l^2}\right)$$

where $\sigma_f^2$ is called the signal variance and controls the output magnitude of the kernel function, and $l$ is called the characteristic length and controls how strongly the feature attributes of each dimension of the input variable influence the output.
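For comparison, the three candidate kernels can be written as functions of the distance r between two inputs. The patent gives only the SE formula. The RQ and Matern 5/2 expressions below are the standard textbook forms, and the shape parameter `alpha` of the RQ kernel is an assumption, since the patent does not state the value it used.

```python
import numpy as np

def se(r, sigma_f=1.0, length=1.0):
    """Squared-exponential kernel as a function of distance r."""
    return sigma_f**2 * np.exp(-0.5 * (r / length) ** 2)

def rq(r, sigma_f=1.0, length=1.0, alpha=1.0):
    """Rational quadratic kernel; alpha is the shape parameter."""
    return sigma_f**2 * (1 + r**2 / (2 * alpha * length**2)) ** (-alpha)

def matern52(r, sigma_f=1.0, length=1.0):
    """Matern kernel with smoothness 5/2."""
    s = np.sqrt(5.0) * r / length
    return sigma_f**2 * (1 + s + s**2 / 3) * np.exp(-s)

r = np.linspace(0.0, 3.0, 7)
for name, k in [("SE", se), ("RQ", rq), ("Matern 5/2", matern52)]:
    print(name, np.round(k(r), 3))
```

All three equal the signal variance at r = 0 and decay with distance. The RQ kernel approaches the SE kernel as `alpha` grows.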

In the above scheme, in step (4), the unknown hyperparameters are given random initial values, and their optimal values are obtained by training. The objective function for training the unknown hyperparameters of the Gaussian process regression, the negative log marginal likelihood (NLML), can be expressed as

$$\mathrm{NLML}(\theta) = -\log p(y \mid X, \theta) = \frac{1}{2} y^{\top} K_y^{-1} y + \frac{1}{2}\log\lvert K_y\rvert + \frac{n}{2}\log 2\pi, \qquad K_y = K(X,X)+\sigma_n^2 I$$

where $\theta$ contains the unknown hyperparameters of the model and $n$ is the number of training samples. The optimal hyperparameter values are obtained by minimizing NLML with gradient descent, using its partial derivatives with respect to each unknown hyperparameter.
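A minimal sketch of this training step, assuming the SE kernel and optimizing the logarithms of the three hyperparameters (signal standard deviation, characteristic length, noise standard deviation) so that they stay positive. The log parameterization, the step-halving rule and all names are illustrative choices, not details given in the patent.

```python
import numpy as np

def nlml_and_grad(log_theta, X, y):
    """NLML and its gradient w.r.t. log(sigma_f), log(l), log(sigma_n)."""
    sigma_f, length, sigma_n = np.exp(log_theta)
    n = len(y)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    K_f = sigma_f**2 * np.exp(-0.5 * D2 / length**2)
    K_y = K_f + sigma_n**2 * np.eye(n)
    K_inv = np.linalg.inv(K_y)
    alpha = K_inv @ y
    _, logdet = np.linalg.slogdet(K_y)
    nlml = 0.5 * y @ alpha + 0.5 * logdet + 0.5 * n * np.log(2 * np.pi)
    # Derivatives of K_y with respect to each log-hyperparameter
    dKs = [2 * K_f, K_f * D2 / length**2, 2 * sigma_n**2 * np.eye(n)]
    # d NLML / d theta_j = -0.5 * tr((alpha alpha^T - K_y^{-1}) dK_y/d theta_j)
    W = np.outer(alpha, alpha) - K_inv
    grad = np.array([-0.5 * np.sum(W * dK) for dK in dKs])
    return nlml, grad

def fit(X, y, log_theta0, lr=0.1, steps=200):
    """Gradient descent that halves the step whenever NLML fails to decrease."""
    log_theta = np.array(log_theta0, dtype=float)
    value, grad = nlml_and_grad(log_theta, X, y)
    for _ in range(steps):
        trial = log_theta - lr * grad
        trial_value, trial_grad = nlml_and_grad(trial, X, y)
        if trial_value < value:
            log_theta, value, grad = trial, trial_value, trial_grad
        else:
            lr *= 0.5  # step too long: shrink it and retry
    return log_theta, value

# Toy data: a smooth curve with a small deterministic perturbation
X = np.linspace(-2.0, 2.0, 8).reshape(-1, 1)
y = np.sin(X[:, 0]) + 0.05 * np.cos(7 * X[:, 0])
log_theta, value = fit(X, y, np.zeros(3))
print(np.exp(log_theta), value)
```

Because a step is accepted only when it lowers the objective, the returned NLML is never worse than the starting value, at the cost of some wasted evaluations.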

In the above scheme, in step (5), once the optimal hyperparameter values have been obtained by training, the standardized impeller parameter values of the test samples are fed into the model, and the trained model returns the mean and variance of the corresponding centrifugal pump performance values. The mean and variance are then de-standardized: the de-standardized mean is the predicted centrifugal pump performance, and the de-standardized variance represents the uncertainty of that prediction. The accuracy and validity of the proposed prediction model are checked by calculating the relative error between the predicted performance values and the actual experimental values.
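The de-standardization and the relative error in step (5) can be sketched as follows, assuming the output was standardized by subtracting its mean and dividing by its standard deviation. The function names and all numbers are made up for illustration.

```python
import numpy as np

def destandardize(mean_z, var_z, mu_y, sigma_y):
    """Map a standardized predictive mean and variance back to physical units."""
    return mean_z * sigma_y + mu_y, var_z * sigma_y**2

def relative_error_percent(pred, measured):
    """Absolute relative error between prediction and measurement, in percent."""
    return np.abs(pred - measured) / np.abs(measured) * 100.0

# Made-up numbers: head standardized with mean 30 m and standard deviation 10 m
head, head_var = destandardize(np.array([0.0, 1.0]), np.array([1.0, 0.25]), 30.0, 10.0)
err = relative_error_percent(head, np.array([32.0, 50.0]))
print(head, head_var, err)
```

The variance scales with the square of the standard deviation, since it has squared units.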

Beneficial effects of the present invention: with only a small number of training samples, the centrifugal pump performance prediction method based on small-sample kernel machine learning uses the different kernel functions available in a Gaussian process to learn flexibly the nonlinear relationship between the impeller parameters and the performance of the centrifugal pump, and thereby predicts that performance accurately. The proposed prediction model has few unknown parameters to set, and their values have little influence on the prediction results, so the model is much more efficient to construct and better suited to performance prediction of centrifugal pumps in engineering practice.

Description of the drawings

Figure 1 is a flow chart of the centrifugal pump performance prediction method of the present invention.

Figure 2 compares the predictive ability of three different nonlinear kernel functions.

Figure 3 compares the predicted head with the experimental values for the 14 test samples.

Figure 4 compares the predicted efficiency with the experimental values for the 14 test samples.

Figure 5 compares the head predictions obtained by four different methods for the 14 test samples.

Figure 6 compares the efficiency predictions obtained by four different methods for the 14 test samples.

Detailed description of embodiments

To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.

On the contrary, the invention covers any alternatives, modifications, equivalent methods and schemes within its spirit and scope as defined by the claims. Further, to give the public a better understanding of the invention, some specific details are described at length below. Those skilled in the art can fully understand the invention without these details.

(1) Standardize the collected sample data. Hydraulic model data for 68 impellers with different specific speeds were collected from Modern Pump Theory and Design (《现代泵理论与设计》) to form the whole sample set. The model input variables can be impeller parameters such as specific speed, impeller rotational speed, impeller outlet width and number of blades, and the output variables can be performance parameters such as head and efficiency. From the whole sample set, 14 samples ordered from low to high specific speed are selected as test samples, and the remaining samples are used as training samples. The whole sample set is standardized together so that each attribute has a mean of 0 and a variance of 1 across all samples. Part of the sample data is shown in Table 1.
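The patent does not say exactly how the 14 test samples are chosen beyond covering specific speeds from low to high. One plausible reading, shown here only as an assumption, is to sort the samples by specific speed and take evenly spaced ones. The specific speed values below are random placeholders, not the data from the book.

```python
import numpy as np

def split_by_specific_speed(specific_speed, n_test=14):
    """Pick n_test samples evenly spaced along the sorted specific speeds."""
    order = np.argsort(specific_speed)
    picks = np.linspace(0, len(order) - 1, n_test).round().astype(int)
    test_idx = order[picks]
    train_idx = np.setdiff1d(order, test_idx)
    return train_idx, test_idx

rng = np.random.default_rng(0)
n_s = rng.uniform(20.0, 300.0, size=68)  # placeholder specific speeds
train_idx, test_idx = split_by_specific_speed(n_s)
print(len(train_idx), len(test_idx))
```

With 68 samples this leaves 54 for training, and the test set spans the lowest to the highest specific speed in the data.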

Table 1. Partial centrifugal pump sample data

(2) Build the prediction model. Following the mathematical principle by which Gaussian process regression derives the posterior distribution from the prior distribution, a corresponding program is written in MATLAB to obtain the prediction model. Three Gaussian process kernel functions with strong nonlinear learning ability are selected to learn the nonlinear relationship between the impeller parameters and the head and efficiency of the centrifugal pump: the SE kernel, the RQ kernel and the Matern 5/2 kernel. The performance predictions based on the three kernels are compared to identify the kernel best suited to learning the relationship between impeller parameters and pump performance. Figure 2 shows four statistical metrics that reflect the learning ability and prediction accuracy of each kernel: the root mean square error (RMSE), the mean relative error (MAPE), the maximum relative error (MARE) and the coefficient of determination (R²) between the predicted and experimental values. Figure 2 shows that the Gaussian process regression model based on the SE kernel predicts head and efficiency most accurately, so the SE kernel is chosen as the kernel function for learning the relationship between impeller parameters and centrifugal pump performance.
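The four metrics named above can be computed as in the sketch below. The patent does not spell out their formulas, so the usual definitions are assumed, with MAPE and MARE expressed in percent. The function name and the sample values are illustrative.

```python
import numpy as np

def evaluation_metrics(pred, actual):
    """RMSE, mean and maximum absolute relative error (%), and R^2."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    rel = np.abs(pred - actual) / np.abs(actual)
    ss_res = np.sum((actual - pred) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return {
        "RMSE": float(np.sqrt(np.mean((pred - actual) ** 2))),
        "MAPE": float(rel.mean() * 100),
        "MARE": float(rel.max() * 100),
        "R2": float(1 - ss_res / ss_tot),
    }

# Made-up predicted and measured values
metrics = evaluation_metrics([10.0, 20.0, 33.0], [10.0, 20.0, 30.0])
print(metrics)
```

Lower RMSE, MAPE and MARE and an R² closer to 1 indicate a better fit, which is how the kernels are ranked.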

(3) Train the prediction model. The unknown hyperparameters are first given initial values. Because the Gaussian process regression model adapts well, the predicted pump performance is not very sensitive to these initial values, so 20 groups of initial values are drawn at random from a normal distribution with mean 0 and variance 2. Then, using the training data, the unknown hyperparameters are optimized by gradient descent. The objective function of this optimization is

$$\mathrm{NLML}(\theta) = -\log p(y \mid X, \theta) = \frac{1}{2} y^{\top} K_y^{-1} y + \frac{1}{2}\log\lvert K_y\rvert + \frac{n}{2}\log 2\pi, \qquad K_y = K(X,X)+\sigma_n^2 I$$

where $\theta$ contains the unknown hyperparameters of the model and $n$ is the number of training samples. After all 20 groups of initial values have been optimized, the group that yields the smallest value of the objective function NLML is selected for use by the prediction model in predicting centrifugal pump performance.
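The multi-start strategy (20 random initial values from a normal distribution with mean 0 and variance 2, each optimized by gradient descent, keeping the run with the smallest objective) can be sketched generically. To stay short, the example minimizes a simple one-dimensional stand-in objective with two local minima rather than the NLML itself. The objective, step size and function names are all illustrative assumptions.

```python
import numpy as np

def gradient_descent(obj_and_grad, theta0, lr=0.005, steps=2000):
    """Plain fixed-step gradient descent from one starting point."""
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        _, grad = obj_and_grad(theta)
        theta = theta - lr * grad
    return theta, obj_and_grad(theta)[0]

def multi_start(obj_and_grad, dim, n_starts=20, variance=2.0, seed=0):
    """Optimize from n_starts random starts and keep the lowest objective."""
    rng = np.random.default_rng(seed)
    starts = rng.normal(0.0, np.sqrt(variance), size=(n_starts, dim))
    runs = [gradient_descent(obj_and_grad, s) for s in starts]
    return min(runs, key=lambda run: run[1])

def toy_objective(theta):
    """Stand-in for the NLML: a quartic with one local and one global minimum."""
    t = theta[0]
    return t**4 - 3 * t**2 + t, np.array([4 * t**3 - 6 * t + 1])

best_theta, best_value = multi_start(toy_objective, dim=1)
print(best_theta, best_value)
```

A single start can settle in the shallower minimum. Restarting from several random points and keeping the best run makes that far less likely.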

(4) Verify the validity of the prediction model. Once trained, the prediction model has essentially learned the nonlinear relationship between the impeller parameters and the head and efficiency of the centrifugal pump in the training samples. The impeller parameters of the 14 test samples chosen in the first step are used as input to the model, which predicts the mean and variance of the head and efficiency for each of the 14 parameter sets. These 14 means and variances are then de-standardized: the de-standardized mean is the predicted head or efficiency, and the variance represents the uncertainty of the prediction. Figures 3 and 4 compare the predicted head and efficiency with their experimental values. The shaded area in each figure is the 95% confidence interval derived from the predictive variance. The narrower the interval, the smaller the uncertainty of the prediction and the higher its reliability.
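For a Gaussian predictive distribution, a 95% interval is the mean plus or minus about 1.96 standard deviations. The sketch below assumes this is how the shaded bands are obtained from the predictive variance. The patent does not give the conversion explicitly, and the numbers are made up.

```python
import numpy as np

def confidence_band(mean, var, z=1.96):
    """Lower and upper bounds of the central 95% interval of a Gaussian."""
    half_width = z * np.sqrt(var)
    return mean - half_width, mean + half_width

# Made-up de-standardized head predictions (m) and variances (m^2)
lower, upper = confidence_band(np.array([30.0, 42.0]), np.array([4.0, 0.25]))
print(lower, upper)
```

The second prediction has the smaller variance and therefore the narrower band.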

Figure 3 shows that the predicted head agrees well with the experimental values. The 95% confidence interval is sufficiently narrow for the first 10 test samples. For the last four test samples, the specific speed is higher and relatively little training data is available in the high specific speed range, so the prediction uncertainty grows. Figure 4 shows that efficiency is predicted slightly less accurately than head: the predicted efficiency is generally lower than the experimental value, although the two follow essentially the same trend. The uncertainty of the efficiency prediction is relatively large, but the experimental values lie essentially within the predicted 95% confidence interval, meaning that the interval contains the true value, so the prediction remains credible. The absolute relative errors between the predicted and experimental head and efficiency were also calculated for the 14 test samples. Across all test samples, the maximum absolute relative error for head is 6.66%, with most errors below 4%, and the maximum absolute relative error for efficiency is 10.54%, with most errors below 8%. This accuracy meets the prediction requirements of practical centrifugal pump engineering design.

Table 2 compares the proposed centrifugal pump performance prediction method based on small-sample kernel machine learning with three common prediction models: the back-propagation neural network (BPNN), the radial basis function neural network (RBFNN) and support vector regression (SVR). Judged across the four performance metrics in Table 2, and with the same training samples, GPR predicts centrifugal pump head and efficiency with the highest accuracy and the best stability of the four models.

Table 2. Performance comparison of four centrifugal pump prediction models

Figures 5 and 6 compare the head and efficiency predicted by the four models with the experimental values. For head, the predictions of all four models follow essentially the same trend as the experimental values on the test data, which shows that the nonlinear relationship between the input variables and head has been learned adequately and that the chosen input variables represent the factors influencing head well. For efficiency, the model predictions also follow essentially the same trend as the experimental values. Overall, the head and efficiency predicted by GPR are closest to the experimental values, indicating higher prediction accuracy.

Compared with the common approach of predicting centrifugal pump performance by numerical simulation, predicting performance from impeller parameters with a GPR model makes secondary use of data that already exist in engineering design practice, and gives performance prediction with a short turnaround, low difficulty and broad applicability. Compared with other common data-driven prediction models, a GPR-based model needs less training data for centrifugal pump performance prediction, which makes data collection easier. The model is easy to construct and its parameters have little influence on its predictive performance, which makes training easier. The GPR model is also stable, and because it provides the uncertainty of each prediction as well as the prediction itself, it is more reliable.

In summary, the above embodiments do not limit the present invention. Any modification or equivalent variation made by those skilled in the art on the basis of the substance of the invention falls within its technical scope.

Claims

1. A centrifugal pump performance prediction method based on small-sample kernel machine learning, characterized by comprising the following steps:

(1) performing feature selection and standardization on the collected sample data;

(2) constructing a small-sample kernel machine learning Gaussian process regression prediction model;

(3) selecting a suitable nonlinear kernel function;

(4) training the unknown hyperparameters of the model on the training data;

(5) verifying the validity of the model on the test data.

2. The centrifugal pump performance prediction method based on small-sample kernel machine learning according to claim 1, wherein in step (1), to reduce the training time of the prediction model and preserve its prediction accuracy, the collected sample data are standardized so that each dimension has a mean of 0 and a variance of 1, the formula being:

$$\hat{x}_i = \frac{x_i - \mu}{\sigma}$$

wherein $\hat{x}_i$ is the standardized data, $x_i$ is the original data, $\mu$ is the mean of the data in the same dimension, and $\sigma$ is the standard deviation of the data in the same dimension.

3. The centrifugal pump performance prediction method based on small-sample kernel machine learning according to claim 1, wherein in step (2), the prior joint distribution of the known training samples and the unknown test samples is constructed in MATLAB according to the mathematical principle of Gaussian process regression, the formula being:

$$\begin{bmatrix} y \\ f_* \end{bmatrix} \sim \mathcal{N}\left(0,\ \begin{bmatrix} K(X,X)+\sigma_n^2 I & K(X,x_*) \\ K(x_*,X) & K(x_*,x_*) \end{bmatrix}\right)$$

wherein $y$ is the set of training sample output variables; $f_*$ is the output of the unknown test sample; $K(X,X)$ is the kernel function relationship among the training sample input variables; $K(X,x_*)$ and $K(x_*,X)$ are both the kernel function relationship between the training sample input variables and the test sample input variables; $K(x_*,x_*)$ is the kernel function relationship among the test sample input variables; $\sigma_n^2$ is the noise variance; and $I$ is the identity matrix.

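4. The centrifugal pump performance prediction method based on small-sample kernel machine learning according to claim 1, wherein the posterior distribution of the unknown test sample output variable is expressed as

$$f_* \mid X, y, x_* \sim \mathcal{N}\left(\bar{f}_*,\ \operatorname{cov}(f_*)\right)$$

wherein $\bar{f}_*$ is the mean of the posterior distribution, whose value represents the unknown output variable of the test sample, and $\operatorname{cov}(f_*)$ is the variance of the posterior distribution, which characterizes the uncertainty of that output variable; with $\sigma_n^2 I$ the noise term of the prior, they are given by the following two formulas:

$$\bar{f}_* = K(x_*,X)\left[K(X,X)+\sigma_n^2 I\right]^{-1} y$$

$$\operatorname{cov}(f_*) = K(x_*,x_*) - K(x_*,X)\left[K(X,X)+\sigma_n^2 I\right]^{-1} K(X,x_*)$$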

5. The centrifugal pump performance prediction method based on small-sample kernel machine learning according to claim 1, wherein in step (3), the performance of three commonly used nonlinear kernel functions, namely the squared exponential (SE) kernel, the rational quadratic (RQ) kernel and the Matern 5/2 kernel, is compared, and the SE kernel is finally adopted to model the nonlinear relationship between the impeller parameters and centrifugal pump performance, the formula being:

$$k_{SE}(x, x') = \sigma_f^2 \exp\left(-\frac{\lVert x - x' \rVert^2}{2 l^2}\right)$$

wherein $\sigma_f^2$ is called the signal variance and controls the output magnitude of the kernel function; and $l$ is called the characteristic length and controls how strongly the feature attributes of each dimension of the input variable influence the output.

6. The centrifugal pump performance prediction method based on small-sample kernel machine learning according to claim 1, wherein in step (4), the unknown hyperparameters are given random initial values and their optimal values are obtained by training, the objective function for training the unknown hyperparameters of the Gaussian process regression being expressed as

$$\mathrm{NLML}(\theta) = -\log p(y \mid X, \theta) = \frac{1}{2} y^{\top} K_y^{-1} y + \frac{1}{2}\log\lvert K_y\rvert + \frac{n}{2}\log 2\pi, \qquad K_y = K(X,X)+\sigma_n^2 I$$

wherein $\theta$ contains the unknown hyperparameters of the model and $n$ is the number of training samples, and the optimal hyperparameter values are obtained by minimizing the objective function NLML with gradient descent, using its partial derivatives with respect to each unknown hyperparameter.

7. The centrifugal pump performance prediction method based on small-sample kernel machine learning according to claim 1, wherein in step (5), once the optimal hyperparameter values have been obtained by training, the standardized impeller parameter values of the test samples are input into the model, and the trained model returns the mean and variance of the corresponding centrifugal pump performance values; the mean and variance are de-standardized, the de-standardized mean representing the predicted centrifugal pump performance and the de-standardized variance representing the uncertainty of that prediction; and the accuracy and validity of the proposed centrifugal pump prediction model are checked by calculating the relative error between the predicted centrifugal pump performance values and the actual experimental values.