CN115394380A

CN115394380A - Prediction method between material related parameters based on random degradation process

Info

Publication number: CN115394380A
Application number: CN202210996082.4A
Authority: CN
Inventors: 严兵; 郭宇; 贾攀
Original assignee: Jiangsu XCMG Guozhong Laboratory Technology Co Ltd
Current assignee: Jiangsu XCMG Guozhong Laboratory Technology Co Ltd
Priority date: 2022-08-19
Filing date: 2022-08-19
Publication date: 2022-11-25

Abstract

The invention discloses a method for predicting between related material parameters based on a random degradation process, comprising the following steps: performing sample distribution calculation on the parameter pairs of sample data of the related parameters to obtain a parameter array conforming to a distribution and to determine the distribution type, and constructing a distribution model accordingly; generating a new test sample data sequence as training sample data; extracting the input vector and output vector of the training samples and training a regression model to obtain its decision function; calculating the mean vector of the training-sample input vectors and feeding it into the regression model to obtain a mean prediction vector; feeding the training-sample input vector into the regression model to obtain a predicted output vector of the related parameter, and calculating the weighted mean square error between the predicted output vector and the training-sample output vector; and calculating a confidence interval of the mean prediction vector and, in combination with a linear difference method, a confidence interval of the predicted value, which is used to evaluate the reliability of the predicted value. The method can accurately predict and evaluate test data for related parameters that are difficult to measure during the degradation of materials used in mechanical engineering.

Description

Prediction method between material related parameters based on random degradation process

Technical Field

The invention belongs to the field of reliability analysis of engineering machinery product materials undergoing random degradation, and in particular relates to a method for predicting between related material parameters based on a random degradation process.

Background

The physical and chemical performance parameters of the materials used in engineering machinery products are very important for evaluating product reliability. In practice, however, these parameters change nonlinearly as the product degrades. Effective reliability evaluation therefore requires the physical and chemical parameters of the product material to be measured and analyzed regularly, which involves a heavy workload. Moreover, because product materials often have many such parameters and some of them are difficult to measure, the actual measurement work often consumes considerable manpower and material resources.

Physical and chemical parameters are intrinsic properties of the material, so some of them are strongly correlated, for example the hardness and tensile strength of steel, or the shear modulus and toughness of rubber. Some of these parameters, such as hardness and shear modulus, are relatively simple to measure. By exploiting this correlation and applying regression analysis with the easily measured parameter data as the independent variable, the related parameters of the product material that are difficult to measure can be predicted during the random degradation process.

Disclosure of Invention

Purpose of the invention: to overcome the defects of the prior art, the invention provides a method for predicting between related material parameters based on a random degradation process. Based on SVM regression, the method accurately predicts and evaluates test data for related parameters that are difficult to measure during the degradation of materials used in mechanical engineering, thereby realizing regression prediction between correlated physicochemical parameters of the material.

The technical scheme is as follows: in a first aspect, the present invention provides a method for predicting material-related parameters based on a stochastic degradation process, including:

acquiring a parameter pair of sample data among related parameters;

carrying out sample distribution calculation on the parameter pairs to obtain a parameter array conforming to distribution and determine the distribution type;

constructing a distribution model according to the distributed parameter array and the determined distribution type;

generating a new test sample data sequence according to the distribution model and the parameter pairs of the related parameter sample data;

taking the new test sample data sequence as training sample data of an SVM regression algorithm, extracting an input vector of the training sample according to the training sample data and calculating a mean vector of the training sample;

carrying out regression model training on training sample data to obtain a regression model; taking the input vector of the training sample as the input vector of the test sample, introducing the input vector and the mean vector of the training sample into a regression model for prediction, and respectively obtaining a related parameter prediction output vector and a mean prediction vector;

calculating the weighted mean square error between the predicted output vector and the output vector of the training sample;

calculating a confidence interval of the mean prediction vector from the weighted mean square error and the mean prediction vector;

substituting the confidence interval of the mean prediction vector and the mean vector of the training sample into a linear difference method for calculation to obtain a confidence interval of a predicted value;

and evaluating the reliability of the predicted value of the relevant parameter according to the predicted value confidence interval.

In a further embodiment, the method for obtaining the parameter pair of the sample data among the related parameters comprises the following steps:

in the process of a product material random degradation experiment, setting a plurality of experiment samples aiming at relevant parameters; wherein the related parameters are two or more than two material intrinsic property parameters which are related to each other;

respectively measuring a plurality of experimental samples, and recording sample data among related parameters at different moments;

arranging the sample data of the plurality of experimental samples into a parameter pair set of the sample data among the related parameters according to a time sequence;

the parameter pairs of the related parameter sample data are expressed as: $\{(\alpha_{ij}, \beta_{ij})\}|_{i=1,2,\ldots,m;\ j=1,2,\ldots,n}$

In the formula, α and β denote two mutually related parameters that describe different attributes of the product material, n is the number of measurement times, m is the total number of test sample batches, j indexes the measurement time, and i indexes the test sample batch at the j-th time.
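The indexing convention can be made concrete with a small data-structure sketch. The nested-list layout, the function name `array_at_time` and the numeric values are illustrative assumptions rather than anything specified in the patent; the sketch only shows how the m values of each parameter at one time j could be pulled out of the set of parameter pairs.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (alpha, beta), e.g. (hardness, strength)

def array_at_time(pairs: List[List[Pair]], j: int) -> Tuple[List[float], List[float]]:
    """Return the alpha and beta arrays of all m batches at time index j (0-based)."""
    alphas = [row[j][0] for row in pairs]
    betas = [row[j][1] for row in pairs]
    return alphas, betas

# Hypothetical data: m = 3 batches (rows) measured at n = 2 times (columns).
pairs = [
    [(200.0, 690.0), (190.0, 655.0)],
    [(205.0, 700.0), (193.0, 668.0)],
    [(198.0, 684.0), (188.0, 650.0)],
]
alphas_t0, betas_t0 = array_at_time(pairs, 0)
```

Here `pairs[i][j]` plays the role of the pair indexed by batch i and time j, with 0-based indices instead of the 1-based indices used in the text.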

In a further embodiment, the method for calculating the distribution of the parameters to obtain the parameter array conforming to the distribution and determining the distribution type comprises the following steps:

extracting a related parameter array at the same time from a parameter pair set of related parameter sample data;

substituting the related parameter array into a probability density function for testing, and judging whether the related parameter is a random variable that conforms to the distribution with the given distribution parameters;

calculating, in turn, the maximum likelihood ratio for the parameter arrays of the plurality of experimental samples of the tested related parameter, to obtain the maximum likelihood ratio corresponding to each parameter array;

judging the magnitudes of the maximum likelihood ratios corresponding to the parameter arrays;

determining the distribution type of the related parameter and the parameter set conforming to that distribution according to the maximum likelihood ratio;

wherein the probability density function is:

$f(\alpha \mid \theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$ (1)

the expression of substituting the related parameter array into the probability density function test is as follows:

wherein $(\theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$ is the parameter set of the distribution model of parameter α at time t; $H_0$ and $H_1$ are the hypotheses to be tested;

the maximum likelihood ratio is calculated according to the formula:

in the formula,

is $(\theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$, and λ is the maximum likelihood ratio; the decision expression for judging the magnitudes of the maximum likelihood ratios is as follows:

wherein K is a critical value.
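A simplified sketch of likelihood-based selection of the distribution type follows. It does not reproduce the patent's likelihood ratio statistic or the critical value K, whose formulas are not available in this text. Instead it assumes a simpler stand-in: fit each candidate distribution by maximum likelihood and keep the one with the largest maximized log-likelihood. The candidate set (normal, log-normal, exponential) and all function names are illustrative assumptions, chosen because their maximum likelihood estimates have closed forms.

```python
import math
from typing import Dict, List

def max_log_likelihoods(x: List[float]) -> Dict[str, float]:
    """Maximized log-likelihood of each candidate distribution for positive data x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n        # MLE variance (divides by n)
    logs = [math.log(v) for v in x]                  # requires every value > 0
    lmean = sum(logs) / n
    lvar = sum((v - lmean) ** 2 for v in logs) / n
    return {
        "normal": -0.5 * n * (math.log(2 * math.pi * var) + 1),
        "lognormal": -0.5 * n * (math.log(2 * math.pi * lvar) + 1) - sum(logs),
        "exponential": -n * math.log(mean) - n,      # rate estimated as 1 / mean
    }

def select_distribution(x: List[float]) -> str:
    """Pick the candidate with the largest maximized log-likelihood."""
    ll = max_log_likelihoods(x)
    return max(ll, key=ll.get)

# Hypothetical hardness readings of m = 5 samples at one time.
hardness = [198.0, 200.0, 202.0, 199.0, 201.0]
best = select_distribution(hardness)
```

For tightly clustered data such as these readings, the exponential model scores far worse than the normal or log-normal models, which is the kind of discrimination the test step is meant to provide.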

In a further embodiment, the distribution parameter set is determined by the maximum likelihood method

and, according to the determined distribution function type, the distribution model of the material parameter is established, expressed as follows:

In a further embodiment, the method for generating a new test sample data sequence according to the distribution model and the parameter pairs of the related parameter sample data comprises:

for the related parameters α and β, a distribution model L of each parameter at time t is established, and the Monte Carlo (MC) method is used to generate simulated data sequences of α and β that conform to their respective distribution parameter sets; these sequences together make up the sample simulation data, where $m_0$ denotes the number of simulated data at time j.

The sample data $\{(\alpha_{ij}, \beta_{ij})\}|_{i=1,2,\ldots,m;\ j=1,2,\ldots,n}$ and the simulated data together constitute a new test sample data sequence $\{(\alpha_{i'j}, \beta_{i'j})\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$, which effectively contains the statistical information in the distribution models of the related parameter pair α and β. Wherein:

$m' = m + m_0$ represents the total number of batches of new test sample data at the n times;

i' denotes the i'-th batch of experimental samples at time j;

$m_0$ is subject to the constraint:

$40 < m + m_0 \le 50$ (6)

In formula (6), if the number m of measured sample data exceeds 50, no simulated data are generated by the Monte Carlo method; if m is not more than 40, an appropriate amount of data is simulated by the MC method so that the total of measured and simulated sample data satisfies formula (6).
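The constraint in formula (6) and the Monte Carlo augmentation can be sketched as follows. This is a minimal sketch under stated assumptions: the distribution model at the given time is taken to be Gaussian with already fitted parameters `mu` and `sigma`, the total is filled up to an assumed `target` of 50, and the function names are hypothetical. When m already lies between 41 and 50 the constraint holds with no simulated data, so the sketch returns zero in that case as well.

```python
import random
from typing import List

def simulated_count(m: int, target: int = 50) -> int:
    """Number m0 of simulated samples so that 40 < m + m0 <= 50.

    `target` is an assumed choice of the total within that range.
    """
    if not 40 < target <= 50:
        raise ValueError("target must satisfy 40 < target <= 50")
    if m > 40:
        return 0          # enough measured data: no simulation needed
    return target - m

def augment(measured: List[float], mu: float, sigma: float,
            rng: random.Random, target: int = 50) -> List[float]:
    """Append Monte Carlo draws from an assumed Gaussian model to the measured data."""
    m0 = simulated_count(len(measured), target)
    simulated = [rng.gauss(mu, sigma) for _ in range(m0)]
    return measured + simulated

# Hypothetical example: 10 measured hardness values at one time, filled up to 50.
measured = [200.0 + 0.5 * k for k in range(10)]
augmented = augment(measured, mu=202.0, sigma=2.0, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the simulated sequence reproducible, which is convenient when the augmented data are later used for regression training.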

In a further embodiment, the method of taking the new test sample data sequence as training sample data of the SVM regression algorithm, extracting the input vector of the training samples and calculating the mean vector of the training samples comprises:

from the new test sample data sequence $\{(\alpha_{i'j}, \beta_{i'j})\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$, selecting the parameter array of the easily measured parameter α as the input vector $\{\alpha_{i'j}\}_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the training samples, and calculating the mean vector of the training samples at the n times

In a further embodiment, and without loss of generality, the training process of the regression model in n-dimensional space is as follows:

let the hyperplane in n-dimensional space be $W^T X + b = 0$, where $W = (w_1, w_2, \ldots, w_n)$ and $b$ are the feature vector and constant term of the hyperplane, respectively, and $X = (x_1, x_2, \ldots, x_n)$ is an n-dimensional training sample point vector;

distances measured along the normal direction given by the hyperplane feature vector W are taken as positive, and distances in the opposite direction as negative; $y_k$ is a scalar, with $y_k = 1$ when the training sample point vector lies at a positive distance from the hyperplane and $y_k = -1$ when it lies at a negative distance;

the total number of training sample point vectors is r ($r = m' \cdot n$), and the distance d from any training sample point $X_k \in \{X_1, X_2, \ldots, X_r\}$ to the hyperplane can be expressed as:

in the formula, $X_{k'} \in \{X_1, X_2, \ldots, X_r\}$ is the training sample point vector farthest from the hyperplane;

the objective of regression training is to obtain a hyperplane $W^T X + b = 0$ with feature vector W such that the distance d from the training sample points to the hyperplane is as small as possible; that is, the optimal solution problem for the feature vector W in formula (7) is equivalent to the constrained optimization problem of formula (8):

d' in formula (8) is a constant determined by the training sample data; it is treated as a constant term in the subsequent calculation and does not affect the training result of the regression model;

constrained optimization problems of this kind are usually solved by the Lagrangian method: a Lagrange multiplier vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_r)^T$ is introduced as the undetermined coefficients of the constraints in formula (8), and the Lagrangian function is defined as:

wherein $\alpha_k|_{k=1,2,\ldots,r}$ is the k-th component of the Lagrange multiplier vector;

solving the optimization problem with inequality constraints in formula (8), whose optimal solution must satisfy the following conditions, namely the KKT (Karush-Kuhn-Tucker) conditions:

combining the Lagrangian function (9) with the necessary KKT conditions (10), the optimal solution problem of formula (8) is equivalent to:

wherein $(X_i \cdot X_j)$ is the inner product between training sample point vectors;

solving the optimal solution problem of formula (11) with the training sample point vector data gives the optimal solution of the Lagrange multipliers:

$\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_r^*)^T$ (12)

the decision function of the regression problem is:

wherein $\alpha_k^*$ is the k-th component of the optimal solution of the Lagrange multipliers and $b^*$ is the fitted constant term;

by selecting a suitable kernel function $K(X_i \cdot X_j)$ to replace the inner product $(X_i \cdot X_j)$ between training sample point vectors, the optimal solution problem of formula (8) can be used for nonlinear regression analysis; the kernel function may be a Gaussian (radial basis) kernel function, a polynomial kernel function, or the like. Slack variables $\xi_k|_{k=1,2,\ldots,r}$ and a penalty parameter $C > 0$ are also introduced, where the slack variable $\xi_k$ represents the degree to which the k-th sample point fails to satisfy the regression condition and the penalty parameter C represents the error that the decision function can tolerate, so that the optimal solution problem of formula (8) becomes a nonlinear soft-margin regression problem, namely:

following the solution process of formulas (9) to (12), the optimal solution of the Lagrange multipliers is calculated in turn, and the regression decision function is obtained as follows:

In a further embodiment, the input vector of the training samples is fed into the regression model for training to obtain the output vector of the training samples, and the method of feeding the input vector and the mean vector of the training samples into the regression model to obtain, respectively, the predicted output vector of the related parameter and the mean prediction vector comprises:

feeding the input vector of the training samples into the regression model for training to obtain the output vector of the training samples, which is the parameter array $\{\beta_{i'j}\}_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the hard-to-measure related parameter β;

feeding the input vector $\{\alpha_{i'j}\}_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the training samples into the regression model as the input vector of the test samples to obtain the predicted output vector of the related parameter β

meanwhile, taking the mean vector of the training samples

and feeding it into the regression model as the input vector of a new test sample to obtain the mean prediction vector

Wherein,

In a further embodiment, the method of calculating the weighted mean square error between the predicted output vector and the output vector of the training samples is as follows:

for the output vector $\{\beta_{i'j}\}_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the training samples and the predicted output vector at each time

the weighted mean square error $\sigma_j^2$ is calculated according to the following formula:

where $f(\cdot)$ is a weight function with the properties $f(0) = 0$, $f(1) = 1$, and monotonically and continuously increasing on $[0, 1]$, e.g.
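The weighting step can be illustrated with a short sketch. The patent's formula for the weighted mean square error is not reproduced in this text, so the way the weight function enters is an assumption: here the weight of the i'-th sample is taken as the increment f(i'/m') - f((i'-1)/m'), which is non-negative and sums to f(1) - f(0) = 1 for any function with the stated properties. The function names are hypothetical.

```python
from typing import Callable, List

def increment_weights(m: int, f: Callable[[float], float]) -> List[float]:
    """Assumed weights w_i = f(i/m) - f((i-1)/m) for i = 1..m; they sum to 1."""
    return [f(i / m) - f((i - 1) / m) for i in range(1, m + 1)]

def weighted_mse(actual: List[float], predicted: List[float],
                 f: Callable[[float], float] = lambda u: u) -> float:
    """Weighted mean square error between actual and predicted outputs at one time."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    w = increment_weights(len(actual), f)
    return sum(wi * (a - p) ** 2 for wi, a, p in zip(w, actual, predicted))

# With the identity weight function every sample gets weight 1/m (ordinary MSE).
plain = weighted_mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
# With f(u) = u**2 later samples in the ordering get larger weights.
skewed = weighted_mse([0.0, 0.0], [1.0, 2.0], f=lambda u: u * u)
```

Under this reading the identity function reduces to the ordinary mean square error, while a convex choice such as f(u) = u² emphasises samples later in the ordering.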

In a further embodiment, the method of substituting the confidence interval of the mean prediction vector and the mean vector of the training samples into a linear difference method to obtain the confidence interval of the predicted value comprises:

taking the weighted root mean square error $\sigma_j$ at each time as the boundary width, the confidence interval of the mean prediction vector is

meanwhile, the boundary width may be set to a multiple of the weighted root mean square error according to the confidence requirement of the actual problem, in which case the confidence interval may be

where the value of n' is determined by the confidence requirement of the actual problem.

In a further embodiment, the method for evaluating the reliability of the predicted value of the relevant parameter according to the predicted value confidence interval comprises the following steps:

taking any measured value α' of the parameter α as the input of the regression prediction model, and evaluating the reliability of the predicted value β' of the parameter β: if the predicted output β' lies within the confidence interval of the predicted value

the predicted value of the hard-to-measure related parameter β is evaluated as reliable.
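The interval construction and the reliability check can be sketched as follows. The interval formulas themselves are not reproduced in this text, so the sketch rests on assumptions: the band at each time is taken as the mean prediction plus or minus n' times σ_j, the "linear difference method" is read as piecewise linear interpolation of the band edges between neighbouring mean input values, the mean inputs are sorted in increasing order, and inputs outside their range are clamped to the nearest band. All names and numbers are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

def interval_at(alpha_new: float, mean_alpha: List[float], mean_pred: List[float],
                sigma: List[float], n_prime: float = 1.0) -> Tuple[float, float]:
    """Interpolate the band mean_pred[j] +/- n_prime * sigma[j] at alpha_new.

    mean_alpha must be strictly increasing; values outside its range are clamped.
    """
    lower = [p - n_prime * s for p, s in zip(mean_pred, sigma)]
    upper = [p + n_prime * s for p, s in zip(mean_pred, sigma)]
    if alpha_new <= mean_alpha[0]:
        return lower[0], upper[0]
    if alpha_new >= mean_alpha[-1]:
        return lower[-1], upper[-1]
    k = bisect_right(mean_alpha, alpha_new)   # mean_alpha[k-1] <= alpha_new < mean_alpha[k]
    t = (alpha_new - mean_alpha[k - 1]) / (mean_alpha[k] - mean_alpha[k - 1])
    lo = lower[k - 1] + t * (lower[k] - lower[k - 1])
    hi = upper[k - 1] + t * (upper[k] - upper[k - 1])
    return lo, hi

def is_reliable(beta_pred: float, interval: Tuple[float, float]) -> bool:
    """A predicted value is judged reliable if it lies inside the interval."""
    return interval[0] <= beta_pred <= interval[1]

# Hypothetical mean inputs, mean predictions and sigma values at n = 2 times.
band = interval_at(150.0, [100.0, 200.0], [300.0, 700.0], [10.0, 30.0])
```

With these numbers the band at α' = 150 runs from 480 to 520, so a predicted value of 500 would be judged reliable and a value of 530 would not; doubling n' widens the band to 460 through 540.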

Beneficial effects: compared with the prior art, the invention has the following advantages:

Based on SVM regression, the method accurately predicts and evaluates test data for related parameters that are difficult to measure during the degradation of materials used in mechanical engineering, realizing regression prediction between correlated physicochemical parameters of the material. It addresses random disturbance of the test data, small data volume and nonlinear regression analysis, and the support vector machine (SVM) regression algorithm is flexibly applicable, so the method can be applied to the reliability evaluation and analysis of engineering machinery products.

Compared with polynomial regression, SVM regression keeps the predicted values of the test samples within an interval of higher reliability, avoids overfitting, and strikes a good balance between prediction accuracy and practicality.

Compared with deep learning regression, SVM regression still performs well on nonlinear regression problems when training samples are few, and is flexible and widely applicable. In addition, the SVM method needs no two-step mapping process: the transformation is completed in one step through the kernel function, and the prediction takes the form of a linear combination of radial basis kernel functions, so the amount of computation is greatly reduced compared with deep learning.
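The statement that the prediction is a linear combination of radial basis kernel functions can be illustrated with a short sketch. It only evaluates a decision function of that form for given coefficients; it does not train the model. The Gaussian kernel with width parameter `gamma`, the function names and the numeric values are assumptions, and each coefficient stands for the product of an optimal Lagrange multiplier and its label as obtained from training.

```python
import math
from typing import List, Sequence

def gaussian_kernel(x: Sequence[float], z: Sequence[float], gamma: float) -> float:
    """Gaussian (radial basis) kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decision_function(x: Sequence[float], support_vectors: List[Sequence[float]],
                      coeffs: List[float], b: float, gamma: float) -> float:
    """Evaluate sum_k coeffs[k] * K(X_k, x) + b for a trained model."""
    return sum(c * gaussian_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coeffs)) + b

# Hypothetical trained model with two support vectors in one dimension.
value = decision_function([0.0], [[0.0], [1.0]], coeffs=[2.0, -1.0], b=0.5, gamma=1.0)
```

Prediction therefore costs one kernel evaluation per support vector, which is the basis of the computational argument made in the paragraph above.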

Compared with Bayesian linear regression, SVM regression depends less on the number of training samples and still yields a good regression model when training samples are few; its prediction is a decision result rather than a posterior probability distribution interval, and therefore has higher certainty.

Drawings

FIG. 1 is a flow chart of a method for predicting between relevant parameters according to the present invention;

FIG. 2 is a diagram of the prediction result when the boundary width is ±1σ according to the present invention;

FIG. 3 is a diagram of the prediction result when the boundary width is ±2σ according to the present invention.

Detailed Description

In order to more fully understand the technical content of the present invention, the technical solution of the present invention will be further described and illustrated with reference to the following specific embodiments, but not limited thereto.

Example 1:

The method for predicting between related material parameters based on a random degradation process provided in this embodiment is further described with reference to FIG. 1, and includes:

acquiring a parameter pair of sample data among related parameters;

calculating the predicted output vector and the output vector of the training sample respectively to obtain a weighted mean square error;

arranging the sample data of the plurality of experimental samples into a parameter pair set of sample data among related parameters according to a time sequence;

the parameter pairs of the related parameter sample data are expressed as: $\{(\alpha_{ij}, \beta_{ij})\}|_{i=1,2,\ldots,m;\ j=1,2,\ldots,n}$

substituting the related parameter array into a probability density function for inspection, and judging that the related parameters are random distribution variables which accord with the distribution parameters;

wherein the probability density function is:

$f(\alpha \mid \theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$ (1)

the maximum likelihood ratio is calculated according to the formula:

in the formula,

is $(\theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$, and λ is the maximum likelihood ratio; the decision expression for judging the magnitudes of the maximum likelihood ratios is as follows:

wherein K is a critical value.

for the related parameters α and β, a distribution model L of each parameter at time t is established, and the Monte Carlo method is used to generate simulated data sequences of α and β that conform to their respective distribution parameter sets; these sequences together make up the sample simulation data, where $m_0$ denotes the number of simulated data at time j.

forming a new test sample data sequence $\{(\alpha_{i'j}, \beta_{i'j})\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$, which effectively contains the statistical information in the distribution models of the related parameter pair α and β. Wherein:

$m' = m + m_0$ represents the total number of batches of new test sample data at the n times;

i' denotes the i'-th batch of experimental samples at time j;

$m_0$ is subject to the constraint:

$40 < m + m_0 \le 50$ (6)

In formula (6), if the number m of measured sample data exceeds 50, no simulated data are generated by the Monte Carlo method; if m is not more than 40, an appropriate amount of data is simulated by the MC method so that the total of measured and simulated sample data satisfies formula (6).

In a further embodiment, the method of taking the new test sample data sequence as training sample data of the SVM regression algorithm, extracting the input vector of the training samples and calculating the mean vector of the training samples is as follows:

from the new test sample data sequence $\{(\alpha_{i'j}, \beta_{i'j})\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$, selecting the parameter array of the easily measured parameter α as the input vector $\{\alpha_{i'j}\}_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the training samples, and calculating the mean vector of the training samples at the n times

distances measured along the normal direction given by the hyperplane feature vector W are taken as positive, and distances in the opposite direction as negative; $y_k$ is a scalar, with $y_k = 1$ when the training sample point vector lies at a positive distance from the hyperplane and $y_k = -1$ when it lies at a negative distance;

the total number of training sample point vectors is r ($r = m' \cdot n$), and the distance d from any training sample point $X_k \in \{X_1, X_2, \ldots, X_r\}$ to the hyperplane can be expressed as:

the objective of regression training is to obtain a hyperplane $W^T X + b = 0$ with feature vector W such that the distance d from the training sample points to the hyperplane is as small as possible; that is, the optimal solution problem for the feature vector W in formula (7) is equivalent to the constrained optimization problem of formula (8):

constrained optimization problems of this kind are usually solved by the Lagrangian method: a Lagrange multiplier vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_r)^T$ is introduced as the undetermined coefficients of the constraints in formula (8), and the Lagrangian function is defined as:

wherein $\alpha_k|_{k=1,2,\ldots,r}$ is the k-th component of the Lagrange multiplier vector;

solving the optimization problem with inequality constraints in formula (8), whose optimal solution must satisfy the following conditions, namely the KKT (Karush-Kuhn-Tucker) conditions:

solving the optimal solution problem of formula (11) with the training sample point vector data gives the optimal solution of the Lagrange multipliers:

$\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_r^*)^T$ (12)

the decision function of the regression problem is:

wherein $\alpha_k^*$ is the k-th component of the optimal solution of the Lagrange multipliers and $b^*$ is the fitted constant term;

by selecting a suitable kernel function $K(X_i \cdot X_j)$ to replace the inner product $(X_i \cdot X_j)$ between training sample point vectors, the optimal solution problem of formula (8) can be used for nonlinear regression analysis; the kernel function may be a Gaussian (radial basis) kernel function, a polynomial kernel function, or the like. Slack variables $\xi_k|_{k=1,2,\ldots,r}$ and a penalty parameter $C > 0$ are also introduced, where the slack variable $\xi_k$ represents the degree to which the k-th sample point fails to satisfy the regression condition and the penalty parameter C represents the error that the decision function can tolerate, so that the optimal solution problem of formula (8) becomes a nonlinear soft-margin regression problem, namely:

feeding the input vector $\{\alpha_{i'j}\}_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the training samples into the regression model as the input vector of the test samples to obtain the predicted output vector of the related parameter β

meanwhile, the mean vector of the training samples is used

Wherein,

In a further embodiment, the method of calculating the weighted mean square error between the predicted output vector and the output vector of the training samples comprises:

the weighted mean square error $\sigma_j^2$ is calculated according to the following formula:

where $f(\cdot)$ is a weight function with the properties $f(0) = 0$, $f(1) = 1$, and monotonically and continuously increasing on $[0, 1]$, e.g.

In a further embodiment, the method of substituting the confidence interval of the mean prediction vector and the mean vector of the training samples into a linear difference method comprises:

taking the weighted root mean square error $\sigma_j$ at each time as the boundary width, the confidence interval of the mean prediction vector is

Example 2:

This embodiment is further explained with reference to FIGS. 2 and 3, taking the "hardness-strength" related parameter pair of steel as an example, so as to realize regression prediction between correlated physicochemical parameters of the material and apply it to the reliability evaluation and analysis of engineering machinery products.

Step 1: during the random degradation of the engineering machinery product, the "hardness-strength" related parameter pairs $\{(\alpha_{ij}, \beta_{ij})\}|_{i=1,2,\ldots,m;\ j=1,2,\ldots,n}$ of m test samples are measured, collected and arranged at n different times.

Step 2: for the "hardness-strength" related parameter pairs $\{(\alpha_{ij}, \beta_{ij})\}|_{i=1,2,\ldots,m;\ j=1,2,\ldots,n}$ obtained in step 1, the distribution type (such as Gaussian, Weibull or gamma distribution) of the parameters $\{\alpha_{ij}\}|_{i=1,2,\ldots,m}$ and $\{\beta_{ij}\}|_{i=1,2,\ldots,m}$ of the m test samples at a given time t is determined by the distribution likelihood ratio test method, and the distribution models $L(\alpha_j \mid \theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$ and $L(\beta_j \mid \theta_{\beta j1}, \theta_{\beta j2}, \ldots, \theta_{\beta jk})$ are established respectively, where $(\theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$ and $(\theta_{\beta j1}, \theta_{\beta j2}, \ldots, \theta_{\beta jk})$ are the parameter sets of the distribution models of α and β at time t. The distribution type and distribution parameters given by the likelihood ratio test are determined independently at each time. Taking parameter α as an example, the distribution likelihood ratio test proceeds as follows:

at a given time t, the parameter α is assumed to be a random variable whose distribution parameters are the unknown parameter set $(\theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$ and whose probability density function is $f(\alpha \mid \theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$; consider the test problem:

for m parameter data, the maximum likelihood ratio is calculated by the formula:

in the formula,

is $(\theta_{\alpha j1}, \theta_{\alpha j2}, \ldots, \theta_{\alpha jk})$, and λ is the maximum likelihood ratio; the decision expression for judging the magnitudes of the maximum likelihood ratios is as follows:

wherein K is a critical value.

After determining the distribution type and the parameter set of the parameter α according to the magnitude of the likelihood ratio λ, the distribution model at the time t can be expressed as:

Step 3: according to the distribution model L of the parameters at time t determined in step 2, the Monte Carlo (MC) method is used to generate simulated data sequences of α and β that conform to their respective distribution parameter sets; these sequences together make up the sample simulation data, where $m_0$ denotes the number of simulated data at time j.

$m' = m + m_0$ represents the total number of batches of new test sample data at the n times;

i' denotes the i'-th batch of experimental samples at time j;

$m_0$ is subject to the constraint:

$40 < m + m_0 \le 50$ (5)

In formula (5), if the number m of measured sample data exceeds 50, no simulated data are generated by the Monte Carlo method; if m is not more than 40, an appropriate amount of data is simulated by the MC method so that the total of measured and simulated sample data satisfies formula (5).

Step 4: The new test sample data sequence $\{(\alpha_{i'j},\beta_{i'j})\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ generated in step 3 is used as the training sample data of the SVM regression algorithm. Because the hardness of steel is easy to measure, $\{\alpha_{i'j}\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ is taken as the input vector of the training samples, and the strength parameter $\{\beta_{i'j}\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$, which is difficult to measure, is taken as the output vector of the training samples for regression model training. The specific training process comprises the following steps:

Let the two-dimensional regression line equation be $W^T X + b = 0$, where $W = (w_1, w_2)$ and b are the feature vector and the constant term of the regression line, respectively, and $X = (x_1, x_2) = \{(\alpha_{i'j},\beta_{i'j})\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ is a training sample point vector.

Take the direction of the normal vector of the two-dimensional regression line, given by its feature vector W, as the positive distance direction and the opposite direction as the negative distance direction. $y_k$ is a scalar: $y_k = 1$ when the training sample point vector lies at a positive distance from the two-dimensional regression line, and $y_k = -1$ when it lies at a negative distance;

In the formula, $X_{k'} \in \{X_1, X_2, \ldots, X_r\}$ is the training sample point vector farthest from the hyperplane;

The objective of the regression training is to obtain a hyperplane $W^T X + b = 0$ with feature vector W such that the distance d from the training sample points to the hyperplane is as small as possible, that is, to solve the optimization problem for the feature vector W in formula (6), which is equivalent to the constrained optimization problem of formula (7) below:

d' in formula (7) is a constant determined by the training sample data; it is treated as a constant term in the subsequent calculation and does not affect the training result of the regression model;

An optimization problem with constraints is usually solved by the Lagrangian method. A Lagrange multiplier vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_r)^T$ is introduced as the undetermined coefficients of the constraints in formula (7), and the Lagrangian function is defined as:

Because the constraint in formula (7) is an inequality, the optimal solution of the optimization problem must satisfy the following conditions, namely the KKT (Karush-Kuhn-Tucker) conditions:

Combining the Lagrangian function (8) with the necessary KKT conditions (9), the optimization problem of formula (7) is equivalent to:

Combining the training sample point vector data and solving the optimization problem of formula (10) gives the optimal solution of the Lagrange multipliers:

$$\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_r^*)^T \qquad (11)$$

The decision function of the regression problem is:

By selecting a suitable kernel function $K(X_i \cdot X_j)$ to replace the inner product $(X_i \cdot X_j)$ between training sample point vectors, nonlinear regression analysis can be performed on the optimization problem of formula (7); the kernel function may be, for example, a Gaussian (radial basis) kernel function or a polynomial kernel function. Slack variables $\xi_k|_{k=1,2,\ldots,r}$ and a penalty parameter $C > 0$ are also introduced, where the slack variable $\xi_k$ represents the degree to which the k-th sample point fails to satisfy the regression condition and the penalty parameter C governs the error that the decision function can tolerate, so that the optimization problem of formula (7) becomes a nonlinear soft regression problem, namely:

Following the solution process of formulas (8) to (11), the optimal solution of the Lagrange multipliers is calculated in turn, and the regression decision function is obtained as:
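For orientation, a kernel-based decision function of this kind has the form of a weighted sum of kernel values plus a constant. The Python sketch below only evaluates such a function for a Gaussian kernel; it does not solve the optimization problem. The names `gaussian_kernel` and `decision_function`, the parameter `gamma`, the signed coefficients `coeffs` (assumed to have been derived from the optimal Lagrange multipliers), and all numeric values are hypothetical.

```python
import math


def gaussian_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision_function(x, support_vectors, coeffs, bias, gamma=1.0):
    """Evaluate f(x) = sum_k c_k * K(X_k, x) + b for one input point x."""
    return bias + sum(
        c * gaussian_kernel(sv, x, gamma) for sv, c in zip(support_vectors, coeffs)
    )


if __name__ == "__main__":
    # Hypothetical values; in practice they come from solving the dual problem.
    svs = [(0.0,), (1.0,)]
    coeffs = [2.0, -1.0]
    print(decision_function((0.0,), svs, coeffs, bias=0.5))
```

In practice the coefficients and the constant term would come from an SVM regression trainer, for example the `SVR` class of scikit-learn with its `kernel`, `C`, and `epsilon` parameters.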

Step 5: Using the regression model trained in step 4, the parameter vector $\{\alpha_{i'j}\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the training samples $\{(\alpha_{i'j},\beta_{i'j})\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ is taken as the input vector of the test samples and predicted by the regression model to obtain the prediction output vector $\{\hat{\beta}_{i'j}\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$. In addition, the training sample mean vector of the n times, $\{\bar{\alpha}_j\}|_{j=1,2,\ldots,n}$, is taken as a new test sample input vector to obtain a new prediction vector of the test samples, $\{\hat{\bar{\beta}}_j\}|_{j=1,2,\ldots,n}$. These are referred to as the mean test vector and the mean prediction vector, respectively, where $\bar{\alpha}_j = \frac{1}{m'}\sum_{i'=1}^{m'} \alpha_{i'j}$.

Step 6: According to steps 4 and 5, the weighted mean square error $\sigma_j^2$ between the output vector $\{\beta_{i'j}\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the training samples and the prediction output vector $\{\hat{\beta}_{i'j}\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the test samples is calculated at each time t. The calculation is as follows:

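where f(·) is a weight function with the properties f(0) = 0 and f(1) = 1 that is continuous and monotonically increasing on [0, 1], for example $f(w) = w^n|_{n>0}$, and so on.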

Step 7: According to steps 5 and 6, the confidence interval of the mean prediction vector $\{\hat{\bar{\beta}}_j\}|_{j=1,2,\ldots,n}$ is calculated. From the weighted mean square error $\sigma_j^2$ of step 6, the weighted root mean square error $\sigma_j$ at each time t is used as the boundary width of the confidence interval of the mean prediction vector, i.e. the confidence interval is $[\hat{\bar{\beta}}_j - \sigma_j,\ \hat{\bar{\beta}}_j + \sigma_j]$. A width of several weighted root mean square errors can also be set according to the confidence requirement of the actual problem, i.e. the confidence interval can be $[\hat{\bar{\beta}}_j - n'\sigma_j,\ \hat{\bar{\beta}}_j + n'\sigma_j]$, where the value of n' is determined according to the confidence requirement of the actual problem, as shown in figures 2 and 3.
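Steps 5 to 7 can be sketched in Python as follows. The sketch assumes the data are stored as nested lists indexed `[sample][time]` and that the regression model is available as a plain callable. The exact weighting used in the weighted mean square error is not reproduced in this text, so the per-sample `weights` argument (uniform by default) is an assumption, as are all function names and numeric values.

```python
import math


def column_means(alpha):
    """Mean input per time point; alpha[i][j] is sample i at time j."""
    m, n = len(alpha), len(alpha[0])
    return [sum(alpha[i][j] for i in range(m)) / m for j in range(n)]


def weighted_rmse(beta, beta_pred, weights=None):
    """Weighted root mean square error per time point.

    `weights[i]` is an assumed per-sample weight (uniform by default).
    """
    m, n = len(beta), len(beta[0])
    w = weights or [1.0] * m
    total = sum(w)
    return [
        math.sqrt(
            sum(w[i] * (beta[i][j] - beta_pred[i][j]) ** 2 for i in range(m)) / total
        )
        for j in range(n)
    ]


def confidence_band(mean_pred, sigma, n_prime=1.0):
    """Interval [pred - n' * sigma, pred + n' * sigma] for each time point."""
    return [(p - n_prime * s, p + n_prime * s) for p, s in zip(mean_pred, sigma)]


if __name__ == "__main__":
    model = lambda a: 3.0 * a + 10.0         # stand-in for the trained regression
    alpha = [[100.0, 90.0], [104.0, 94.0]]   # hypothetical hardness values
    beta = [[311.0, 279.0], [321.0, 293.0]]  # hypothetical strength values
    beta_pred = [[model(a) for a in row] for row in alpha]
    mean_pred = [model(a) for a in column_means(alpha)]
    sigma = weighted_rmse(beta, beta_pred)
    print(confidence_band(mean_pred, sigma, n_prime=2.0))
```

The multiplier `n_prime` plays the role of n' above: a larger value widens the band when the application calls for a higher confidence level.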

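Step 8: According to step 7, for any measured hardness parameter $\alpha_0'$, the confidence interval of the related strength parameter $\beta_0'$ can be determined by linear interpolation in the following two steps:

If $\bar{\alpha}_j \le \alpha_0' \le \bar{\alpha}_{j+1}$, the confidence interval width $\sigma'$ of the corresponding predicted parameter $\beta_0'$ is calculated by:

$$\sigma' = \sigma_j + \frac{\alpha_0' - \bar{\alpha}_j}{\bar{\alpha}_{j+1} - \bar{\alpha}_j}\,(\sigma_{j+1} - \sigma_j)$$

where $\sigma_j$ and $\sigma_{j+1}$ are the confidence interval widths corresponding to $\bar{\alpha}_j$ and $\bar{\alpha}_{j+1}$, respectively;

The confidence interval of the predicted $\beta_0'$ is expressed as $[\hat{\beta}_0' - \sigma',\ \hat{\beta}_0' + \sigma']$, and its characteristic predicted value $\hat{\beta}_0'$ is calculated by:

$$\hat{\beta}_0' = \hat{\bar{\beta}}_j + \frac{\alpha_0' - \bar{\alpha}_j}{\bar{\alpha}_{j+1} - \bar{\alpha}_j}\,(\hat{\bar{\beta}}_{j+1} - \hat{\bar{\beta}}_j)$$

where $\hat{\bar{\beta}}_j$ and $\hat{\bar{\beta}}_{j+1}$ are the predicted values corresponding to $\bar{\alpha}_j$ and $\bar{\alpha}_{j+1}$, respectively.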

Step 9: According to step 8, any measured parameter α' is taken as the input of the regression prediction model, and the reliability of its predicted value β' is evaluated: if the predicted output β' lies within the confidence interval calculated in step 8, the predicted value of the related parameter is considered acceptable, as shown in fig. 2.
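Steps 8 and 9 amount to a linear interpolation between two neighbouring time points followed by a range check. The Python sketch below illustrates this under stated assumptions: the per-time lists of mean inputs, mean predictions, and interval widths are already available, the first bracketing pair of neighbours is used, and all names and numbers are hypothetical.

```python
def interpolate(x, x0, x1, y0, y1):
    """Linear interpolation of y at x between (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) / (x1 - x0) * (y1 - y0)


def predicted_interval(alpha0, mean_alpha, mean_pred, sigma):
    """Interpolated prediction and confidence interval for a measured alpha0.

    The three lists are aligned per time point; neighbouring entries that
    bracket alpha0 are used, whichever of the two is larger.
    """
    for j in range(len(mean_alpha) - 1):
        a0, a1 = mean_alpha[j], mean_alpha[j + 1]
        if a0 != a1 and min(a0, a1) <= alpha0 <= max(a0, a1):
            centre = interpolate(alpha0, a0, a1, mean_pred[j], mean_pred[j + 1])
            width = interpolate(alpha0, a0, a1, sigma[j], sigma[j + 1])
            return centre, (centre - width, centre + width)
    raise ValueError("alpha0 lies outside the range of the mean test vector")


def is_acceptable(beta_pred, interval):
    """Step 9 check: the prediction is accepted if it lies in the interval."""
    low, high = interval
    return low <= beta_pred <= high


if __name__ == "__main__":
    mean_alpha = [100.0, 90.0, 80.0]   # hypothetical mean hardness per time
    mean_pred = [310.0, 280.0, 250.0]  # hypothetical mean predictions
    sigma = [2.0, 4.0, 6.0]            # hypothetical interval widths
    centre, interval = predicted_interval(95.0, mean_alpha, mean_pred, sigma)
    print(centre, interval, is_acceptable(296.5, interval))
```

The bracket test uses `min` and `max` because a degrading parameter such as hardness may decrease from one time point to the next, so the neighbouring means need not be in ascending order.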

Compared with a polynomial regression method, the SVM regression method keeps the predicted values of the test samples within a higher-reliability interval, avoids overfitting, and achieves a good balance between prediction accuracy and practicability.

Compared with a deep learning regression method, the SVM regression prediction method still gives a good regression effect on nonlinear regression problems when fewer training samples are used, and it is flexible and widely applicable. In addition, the SVM method does not need a two-step mapping process: the transformation is completed in one step through the kernel function, and the prediction expression is a linear combination of kernel functions, so the amount of calculation is greatly reduced compared with deep learning.

Compared with a Bayesian linear regression method, the SVM regression prediction method depends less on the number of training samples and still yields a good regression model when training samples are few. Moreover, because its prediction result is a decision result rather than a posterior probability distribution interval, it has higher certainty.

In summary, the SVM regression prediction method of the present invention accurately predicts and evaluates test data of related parameters that are difficult to measure during the degradation of materials used in mechanical engineering, thereby realizing regression prediction of physicochemical parameters that are correlated within a material. It addresses the problems of random disturbance of the test data, small data volume, and nonlinear regression analysis, and the support vector machine (SVM) regression algorithm is very flexible in its applicability, so that the method can be applied to the reliability evaluation and analysis of engineering machinery products.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the technical principle of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims

1. A method for predicting material-related parameters based on a random degradation process, characterized by comprising the following steps:

acquiring a parameter pair of sample data among related parameters;

generating a simulation data sequence according to the distribution model and the parameter pair of the related parameter sample data, and forming a new test sample data sequence with the actual measurement sample data;

importing the input vector of the training sample into a regression model for training to obtain the output vector of the training sample; importing the input vector of the training sample and the mean vector of the training sample into a regression model for prediction to respectively obtain a prediction output vector and a mean prediction vector of the relevant parameters;

calculating the predicted output vector and the output vector of the training sample to obtain a weighted mean square error;

2. The method according to claim 1, wherein the method for obtaining the parameter pairs of the sample data among the related parameters comprises:

the parameter pairs of the sample data of the related parameters are expressed as: $\{(\alpha_{ij},\beta_{ij})\}|_{i=1,2,\ldots,m;\ j=1,2,\ldots,n}$

In the formula, α and β represent two mutually related parameters describing different attributes of the product material, n represents the number of times, m represents the total number of batches of test samples, j represents any one of the times, and i represents the i-th batch of test samples at the j-th time.

3. The method of claim 1, wherein the step of performing a sample distribution calculation on the parameter pairs to obtain a distribution-compliant parameter array and determining the distribution type comprises:

the probability density function of the related parameter α array is expressed as:

$$f(\alpha|\theta_{\alpha j1},\theta_{\alpha j2},\ldots,\theta_{\alpha jk}) \qquad (1)$$

the hypothesis test applied to the probability density function of the related parameter α array is expressed as:

where $(\theta_{\alpha j1},\theta_{\alpha j2},\ldots,\theta_{\alpha jk})$ is the parameter set of the distribution model of the parameter α at time t, and $H_0$ and $H_1$ are the hypotheses to be tested;

the maximum likelihood ratio is calculated according to the formula:

in the formula,

in the formula, K is a critical value;

4. The method for predicting material-related parameters based on a random degradation process according to claim 3, wherein the method for constructing the distribution model from the distribution-compliant parameter array and the determined distribution type comprises: according to the distribution parameter set determined by the maximum likelihood method and the determined type of the distribution function, constructing the distribution model of the material parameters with the following expression:

5. The method for predicting material-related parameters based on a random degradation process according to claim 1, wherein the method for generating the new test sample data sequence according to the distribution model and the parameter pairs of the related parameter sample data comprises:

for the related parameters α and β, according to the distribution models L established for the parameters at time t, using the Monte Carlo method to generate simulated data sequences $\{\alpha_{ij}\}|_{i=1,2,\ldots,m_0}$ and $\{\beta_{ij}\}|_{i=1,2,\ldots,m_0}$ that follow the distribution parameter sets $(\theta_{\alpha j1},\theta_{\alpha j2},\ldots,\theta_{\alpha jk})$ and $(\theta_{\beta j1},\theta_{\beta j2},\ldots,\theta_{\beta jk})$, composing the sample simulation data $\{(\alpha_{ij},\beta_{ij})\}|_{i=1,2,\ldots,m_0;\ j=1,2,\ldots,n}$, where $m_0$ represents the amount of simulated data at time j;

the simulated data and the measured sample data constitute a new test sample data sequence $\{(\alpha_{i'j},\beta_{i'j})\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$, which effectively contains the statistical information of the related parameter pairs in the α and β distribution models; wherein:

$m' = m + m_0$, where $m'$ represents the total number of batches of new test sample data at the n times;

$i'$ represents the $i'$-th batch of test samples at time j;

the constraint on $m_0$ is:

$$40 < m + m_0 \le 50 \qquad (6)$$

6. The method for predicting the material correlation parameters based on the random degradation process according to claim 1, wherein the method for extracting the input vector of the training sample according to the training sample data and calculating the mean vector of the training sample comprises the following steps:

7. The method for predicting material-related parameters based on a random degradation process according to claim 1, wherein the input vector of the training samples is imported into a regression model for training to obtain the output vector of the training samples, and the method for importing the input vector of the training samples and the mean vector of the training samples into the regression model for prediction to respectively obtain the prediction output vector and the mean prediction vector of the related parameters comprises the following steps:

introducing the training sample data sequence $\{(\alpha_{i'j},\beta_{i'j})\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ into a two-dimensional space nonlinear equation and training by the SVM regression algorithm to obtain a regression model f(α, β); and importing the input vector $\{\alpha_{i'j}\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the training samples into the regression model as the input vector of the test samples for prediction to obtain the prediction output vector of the related parameter β

Meanwhile, training sample mean vector is used

Wherein,

8. the method of claim 1, wherein the weighted mean square error is obtained by computing a prediction output vector and an output vector of the training samples, respectively, by:

calculating, at each time, the weighted mean square error $\sigma_j^2$ between the output vector $\{\beta_{i'j}\}|_{i'=1,2,\ldots,m';\ j=1,2,\ldots,n}$ of the training samples and the prediction output vector according to the following formula:

where f(·) is a weight function with the properties f(0) = 0 and f(1) = 1 that is continuous and monotonically increasing on [0, 1], for example $f(w) = w^n|_{n>0}$,

9. The method for predicting material-related parameters based on a random degradation process according to claim 1, wherein the confidence interval of the mean prediction vector and the mean vector of the training samples are substituted into a linear interpolation method for calculation, and the method for obtaining the confidence interval of the predicted value comprises the following steps:

10. The method for predicting the material-related parameters based on the stochastic degradation process of claim 9, wherein the method for evaluating the reliability of the predicted values of the related parameters according to the predicted value confidence intervals comprises the following steps: