CN115497574A - HPC compressive strength prediction method and system based on model fusion - Google Patents

HPC compressive strength prediction method and system based on model fusion

Info

Publication number
CN115497574A
Authority
CN
China
Prior art keywords
model
hpc
compressive strength
prediction
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211078389.2A
Other languages
Chinese (zh)
Inventor
张永涛
田唯
肖垚
王永威
朱浩
李焜耀
杨华东
郑建新
王紫超
刘志昂
陈圆
薛现凯
李�浩
代百华
周浩
孙南昌
杨切
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCCC Second Harbor Engineering Co
CCCC Highway Long Bridge Construction National Engineering Research Center Co Ltd
Original Assignee
CCCC Second Harbor Engineering Co
CCCC Highway Long Bridge Construction National Engineering Research Center Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCCC Second Harbor Engineering Co, CCCC Highway Long Bridge Construction National Engineering Research Center Co Ltd filed Critical CCCC Second Harbor Engineering Co
Priority to CN202211078389.2A priority Critical patent/CN115497574A/en
Publication of CN115497574A publication Critical patent/CN115497574A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/10 Analysis or design of chemical reactions, syntheses or processes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Medical Informatics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an HPC (high-performance concrete) compressive strength prediction method and system based on model fusion. The method comprises: collecting parameter data relevant to high-performance concrete; exploratory data analysis and data cleaning; outlier processing and data transformation of the concrete data; feature engineering of the concrete data; constructing concrete compressive strength prediction models; optimizing the parameters of the concrete compressive strength prediction models; fusing the concrete compressive strength prediction models; and performing SHAP-based interpretability analysis on the concrete compressive strength prediction model. The method and system overcome the difficulty of training traditional neural network models and their high demand for data; the combined average of the results of multiple models is more reliable than the prediction of a single model; and the method has a short test period, high precision and low test cost, giving it stronger engineering feasibility than the traditional empirical formula method and test method.

Description

HPC compressive strength prediction method and system based on model fusion
Technical Field
The invention belongs to the technical field of large open caisson construction, and particularly relates to an HPC compressive strength prediction method and system based on model fusion.
Background
HPC (high-performance concrete) is widely used in the construction of large-span bridges because of its high strength, high durability and other outstanding characteristics. Compressive strength is an important index for evaluating concrete quality and largely reflects the safety performance of a building structure. Research into accurate methods for predicting the compressive strength of high-performance concrete is therefore of great significance for the precise control of construction projects and the scientific evaluation of engineering projects.
Existing methods for predicting compressive strength fall mainly into three groups: methods based on empirical formulas, methods based on tests, and methods based on statistical machine learning.
The empirical formula method relies on human experience: a complex mathematical model is established to fit the various parameter indexes of the concrete, yielding a compressive strength calculation model. The method depends heavily on manual experience, involves a complex iterative calculation process, and has extremely limited fitting precision; it is not suitable for calculating the compressive strength of high-performance concrete materials, which contain many admixtures and have extremely complex mix proportions.
The test-based method monitors the structure of the high-performance concrete material before and after molding using various test instruments and equipment in order to obtain the compressive strength. Its test precision is high and its results are reliable; however, the test period is long, the test cost is high, and the risk is very high when tests are carried out in a construction environment with complicated site conditions, so such methods are mostly used for compressive strength testing in a laboratory environment.
The method based on statistical machine learning takes machine learning as its theoretical basis and establishes a compressive strength prediction model directly in a data-driven way, without excessive prior assumptions. It is low in cost, short in test period and high in precision, and therefore has very high research and application value.
To date there has been some research on predicting the compressive strength of concrete with machine learning methods, such as methods based on the AdaBoost algorithm, random forests and intelligent algorithms, BP or RBF neural networks, support vector machines (SVM), linear regression (LR) and deep learning (DL). These methods still have drawbacks. They all predict the compressive strength from the output of a single model, which limits reliability. Methods based on artificial neural networks or deep learning depend on large amounts of experimental data and are therefore unsuitable for scenarios, such as civil engineering, where risk is high and data are difficult to acquire; their models are also difficult to train and prone to falling into local optima or overfitting. Methods based on SVM or LR are very susceptible to outliers, so that predictions can differ greatly from actual results, and the LR method has difficulty fitting the complex nonlinear relationships between HPC constituents. Moreover, existing prediction methods are on the whole black-box models that lack interpretability: the specific influence of each data sample on the compressive strength cannot be clearly known, which hinders concrete guidance of actual engineering projects.
Disclosure of Invention
Therefore, in view of the problems that existing HPC (high-performance concrete) compressive strength prediction methods depend strongly on manual experience, require large amounts of data, have complex model training processes, give low-precision predictions and are poorly interpretable, the invention provides an accurate prediction method that is applicable to ordinary concrete and particularly suitable for HPC compressive strength.
The HPC compressive strength prediction method based on model fusion, which achieves the first object of the invention, comprises the following steps:
S1. Train a first prediction model and a second prediction model separately on collected historical parameter data of the HPC (high-performance concrete), obtaining a trained first prediction model and a trained second prediction model for predicting the compressive strength of the HPC;
S2. Perform a composite operation on the trained first prediction model and the trained second prediction model to obtain a fusion model for HPC compressive strength prediction, which outputs the final prediction result of the HPC compressive strength.
In a further technical scheme, the method also comprises, after step S2, a step S3:
S3. Interpret and analyze the fusion model for HPC compressive strength prediction using a first algorithm, obtaining the contribution value of the content of each HPC component parameter to the HPC compressive strength output by the fusion model; the contribution value measures whether the content of each HPC component raises the predicted value of the HPC compressive strength, and is used to guide actual concrete mix design.
In a further technical scheme, the first algorithm is a SHAP-based interpretability algorithm. The contribution value is expressed by the Shapley value, denoted Ψ, which is defined as follows:
$$\Psi_j(\mathrm{val}) = \sum_{S \subseteq \{x_1, \ldots, x_p\} \setminus \{x_j\}} \frac{|S|!\,(p - |S| - 1)!}{p!} \left( \mathrm{val}_x(S \cup \{x_j\}) - \mathrm{val}_x(S) \right)$$
where:
$S$ is a subset of the features input to the model to be explained, i.e. a set of concrete parameters input to the fusion model for HPC compressive strength prediction;
$x_j$ is the j-th feature variable of the sample to be explained, i.e. the j-th concrete parameter;
$p$ is the total number of features, i.e. the total number of concrete parameters;
$\mathrm{val}_x(S)$ is the model's prediction for sample $x$ when $S$ is taken as the input feature set, i.e. the compressive strength output by the model, where $x$ is the sample and its elements are denoted $x_i$, the value of the i-th feature variable;
The SHAP value indicates the importance of the j-th feature to the model output, i.e. its marginal contribution, and the Shapley value is the average of these marginal contributions. The model interpretation result for the fusion model for HPC compressive strength prediction is defined as follows:
$$g(z') = \Psi_0 + \sum_{j=1}^{M} \Psi_j z'_j$$
where:
$g$ is the compressive strength prediction model to be explained, i.e. the fusion model for HPC compressive strength prediction;
$z' \in \{0,1\}^M$ is the coalition vector, indicating whether each feature $z_j$ ($j \in [1, M]$) is present, where $z_j$ is the j-th concrete parameter input to the fusion model for HPC compressive strength prediction; $z'$ identifies which of $z_1, \ldots, z_M$ are present in the parameter set input to the fusion model;
$M$ is the number of features in the coalition, i.e. the number of concrete parameters input to the fusion model for HPC compressive strength prediction;
$\Psi_j \in \mathbb{R}$ is the feature attribution (Shapley value) of feature $j$, i.e. the contribution of the j-th parameter to the prediction of the fusion model, namely its contribution to the compressive strength;
$\Psi_0$ is the average prediction of the fusion model for HPC compressive strength prediction, i.e. the mean of the predicted compressive strengths.
The Shapley value measures the contribution of a feature to the overall prediction; $\Psi_j > 0$ indicates that the feature raises the predicted compressive strength, i.e. it has the effect of enhancing the HPC compressive strength.
The SHAP global feature importance of a feature is the mean of the absolute values of its Shapley values over the samples, i.e.
$$I_j = \frac{1}{n} \sum_{i=1}^{n} \left| \Psi_j^{(i)} \right|$$
where $\Psi_j^{(i)}$ is the Shapley value of feature $j$ for the i-th of $n$ samples.
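As an illustration of the Shapley value definition above, the following minimal Python sketch computes exact Shapley values by enumerating every feature subset. It rests on assumptions that are not part of the source: `toy_model` is a hypothetical two-feature stand-in for the fusion model, and the value function for a subset S is approximated by taking the features in S from the sample and all remaining features from a hypothetical `baseline` sample. Exact enumeration is only practical for a small number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at sample `x`.

    Assumption: val_x(S) is evaluated with the features in S taken from `x`
    and every other feature taken from `baseline`.
    Returns (list of per-feature contributions, base value val_x(empty set)).
    """
    p = len(x)

    def val(subset):
        mixed = [x[k] if k in subset else baseline[k] for k in range(p)]
        return predict(mixed)

    psi = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        total = 0.0
        for size in range(p):
            # Weight |S|! (p - |S| - 1)! / p! from the definition
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (val(s | {j}) - val(s))
        psi.append(total)
    return psi, val(set())


def global_importance(psi_per_sample):
    """Mean absolute Shapley value of each feature over all samples."""
    n = len(psi_per_sample)
    p = len(psi_per_sample[0])
    return [sum(abs(row[j]) for row in psi_per_sample) / n for j in range(p)]


def toy_model(features):
    """Hypothetical model: features are [cement, water] in kg/m^3."""
    cement, water = features
    return 10.0 + 0.1 * cement - 0.2 * water + 0.001 * cement * water


psi, psi_0 = shapley_values(toy_model, x=[400.0, 160.0], baseline=[300.0, 180.0])
```

With this baseline convention, the base value plus the sum of the per-feature contributions reproduces the model output for the sample, which is the additive form of the explanation; a positive entry in `psi` marks a feature that raises the predicted strength.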
In a further technical scheme, in step S2 the fusion model for HPC compressive strength prediction is obtained by performing a composite operation on the first prediction model and the second prediction model using a weighted average method.
In a further technical scheme, the composite operation on the first prediction model and the second prediction model by the weighted average method is as follows:
$$\text{minimize}(Loss) \quad \text{s.t.} \quad w_1 + w_2 = 1 \ \text{and} \ w_1 \ge 0,\ w_2 \ge 0$$
where:
$w_1$ is the weight of the first prediction model;
$w_2$ is the weight of the second prediction model;
$Loss$ is the loss function of the fusion model $H(x)$ for HPC compressive strength prediction, calculated as follows:
Figure BDA0003831971760000052
where:
$n$ is the sample size, i.e. the total number of collected concrete samples;
$y_i$ is the actual compressive strength of the concrete for the i-th sample;
$\hat{y}_i$ is the concrete compressive strength predicted for the i-th sample by the fusion model $H(x)$ for HPC compressive strength prediction;
$H(x)$ is expressed as follows:
$$H(x) = w_1 h_1(x) + w_2 h_2(x)$$
where:
$H(x)$ is the final predicted HPC compressive strength output by the fusion model for HPC compressive strength prediction;
$w_1$, $w_2$ are the weights of the first prediction model and the second prediction model, respectively;
$h_1(x)$, $h_2(x)$ are the concrete compressive strengths predicted by the first prediction model and the second prediction model, respectively.
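The constrained weight search described above can be sketched in a few lines of Python. Two assumptions are made here that the source does not confirm: the loss is taken to be the mean squared error between actual and fused predictions, and the minimization is done by a simple grid search over the first weight (with the second weight fixed by the constraint that the two weights sum to 1). The prediction values are hypothetical.

```python
def fuse(h1, h2, w1):
    """Weighted-average fusion: H(x) = w1*h1(x) + w2*h2(x), with w2 = 1 - w1."""
    w2 = 1.0 - w1
    return [w1 * a + w2 * b for a, b in zip(h1, h2)]


def mse(y_true, y_pred):
    """Mean squared error (assumed form of the loss)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def best_weights(y_true, h1, h2, steps=100):
    """Grid-search w1 in [0, 1]; both weights stay non-negative and sum to 1."""
    candidates = [k / steps for k in range(steps + 1)]
    w1 = min(candidates, key=lambda w: mse(y_true, fuse(h1, h2, w)))
    return w1, 1.0 - w1


# Hypothetical validation data (MPa): actual strengths and two models' predictions
y_true = [40.0, 50.0, 60.0]
h1 = [44.0, 46.0, 64.0]   # first model, larger errors
h2 = [39.0, 51.0, 59.0]   # second model, smaller errors

w1, w2 = best_weights(y_true, h1, h2)
fused = fuse(h1, h2, w1)
```

In this toy data the model with the smaller errors receives the larger weight, and the fused prediction has a lower error than either model alone because the two models err in opposite directions.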
In a further technical scheme, the first prediction model predicts the compressive strength of the HPC based on the AdaBoost algorithm.
In a further technical scheme, the second prediction model predicts the compressive strength of the HPC based on the CatBoost algorithm.
In a further technical scheme, when the first prediction model and the second prediction model are fused by the weighted average method, the weight of the second prediction model based on the CatBoost algorithm is greater than that of the first prediction model based on the AdaBoost algorithm.
In a further technical scheme, before step S2, hyper-parameter tuning is performed on the first prediction model and the second prediction model using a Bayesian optimization method, and the tuned models are cross-validated; the hyper-parameters include the number of trees and the tree depth. This yields the first prediction model and the second prediction model for HPC compressive strength prediction with optimized hyper-parameters.
The AdaBoost model and the CatBoost model are both tree-based ensemble models whose base models are decision trees; that is, a number of base models (trees) together form the ensemble models AdaBoost and CatBoost, both of which are built on the Boosting ensemble learning framework. The number of trees of the AdaBoost and CatBoost models is the number of decision trees; the depth of the AdaBoost and CatBoost models is the number of layers of the trees in each model;
In a further technical scheme, step S1 further comprises obtaining new feature parameters from the collected historical parameter data of the HPC by feature construction and using them to expand the data set, so that the predicted HPC compressive strength is more accurate; feature construction means performing mathematical calculations on different concrete parameters, guided by engineering experience, to obtain the ratios between them.
The HPC compressive strength prediction system based on model fusion, which achieves the second object of the invention, comprises a model training module and a model composite operation module;
The model training module trains the first prediction model and the second prediction model separately on the collected historical parameter data of the HPC, obtaining a trained first prediction model and a trained second prediction model for predicting the compressive strength of the HPC;
The model composite operation module performs a composite operation on the HPC compressive strengths output by the trained first prediction model and the trained second prediction model to obtain a fusion model for HPC compressive strength prediction, which outputs the final prediction result of the HPC compressive strength.
Further, the model composite operation module uses a weighted average method to perform the composite operation on the HPC compressive strengths output by the first prediction model and the second prediction model.
The system further comprises a parameter tuning module, which performs hyper-parameter tuning on the trained first prediction model and the trained second prediction model to obtain the parameter-optimized first and second prediction models, and cross-validates the tuned compressive strength prediction models; the cross-validation method includes five-fold cross-validation;
The system further comprises a model interpretation and analysis module, which interprets and analyzes the fusion model for HPC compressive strength prediction using the first algorithm to obtain the influence of the content of each HPC component parameter on the HPC compressive strength.
The system further comprises an outlier processing module, which detects outliers in the collected historical parameter data of the concrete; the outlier detection method uses an algorithm that combines K-Means++ clustering with an isolation forest.
The steps of the K-Means++ algorithm are as follows:
a) Initialize an empty set $M$ to store the initial cluster centers;
b) Randomly select the first cluster center $\mu^{(j)}$ from the input samples and add it to the set $M$;
c) For each sample $x^{(i)}$ that is not in $M$, find the minimum squared distance $d(x^{(i)}, M)^2$ to the cluster centers in $M$;
d) Using the weighted probability distribution
$$\frac{d(\mu^{(p)}, M)^2}{\sum_i d(x^{(i)}, M)^2}$$
randomly select the next cluster center $\mu^{(p)}$;
e) Repeat steps c) and d) until K cluster centers have been selected;
f) Continue with the classical K-Means algorithm, starting from the set $M$;
g) Select the best-performing K-Means model according to the SSE (sum of squared errors, i.e. the residual sum of squares), thereby obtaining the best cluster centers;
Using the K cluster centers obtained, outliers in the data set are then detected with the isolation forest algorithm.
The isolation forest algorithm is an outlier detection algorithm based on partitioning and ensemble learning. If it is applied directly, without the preceding cluster analysis, it suffers from a large amount of computation, a long running time and a strong dependence on manual choices in the partitioning process. Performing cluster analysis of the data with the K-Means++ algorithm first can greatly improve the detection efficiency of the isolation forest algorithm.
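The seeding part of K-Means++ (steps a to d above, repeated until K centers exist) can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: samples are tuples of floats, the distance is Euclidean, and the names `kmeans_pp_init` and `squared_distance` are hypothetical. The subsequent K-Means iterations and the isolation forest step are not shown.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans_pp_init(samples, k, seed=0):
    """Choose k initial cluster centers with K-Means++ seeding."""
    rng = random.Random(seed)
    # Steps a) and b): start the center set with one randomly chosen sample
    centers = [rng.choice(samples)]
    while len(centers) < k:
        # Step c): squared distance from each sample to its nearest chosen center
        d2 = [min(squared_distance(x, c) for c in centers) for x in samples]
        # Step d): draw the next center with probability d2 / sum(d2);
        # already chosen samples have weight 0 and cannot be drawn again
        centers.append(rng.choices(samples, weights=d2, k=1)[0])
    return centers


# Hypothetical two-feature samples forming three well-separated groups
samples = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.1, 0.0)]
centers = kmeans_pp_init(samples, k=3, seed=0)
```

Because the draw is weighted by squared distance, samples far from the existing centers are much more likely to be chosen, which is what spreads the initial centers apart.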
Advantageous effects:
(1) The concrete compressive strength prediction models are built by ensemble learning, and the ensemble models are then fused again, which improves prediction precision and overcomes the difficulty of training traditional neural network models and their high demand for data; moreover, the combined average of multi-model results is more reliable than the prediction of a single model;
(2) The HPC compressive strength prediction method, based on statistical machine learning, has a short test period, high precision and low test cost, and is more feasible in engineering practice than the traditional empirical formula method, the traditional test method and the like;
(3) The method combines model fusion with the SHAP interpretability algorithm for predicting the compressive strength of concrete, overcoming the unpredictability and black-box nature of the modeling process in traditional methods; SHAP interpretability analysis makes the specific influence of each component on the compressive strength clearer and so facilitates concrete engineering;
(4) Outliers in the concrete parameter data are processed by combining K-Means++ clustering with an isolation forest, which overcomes the high computational complexity and the manual dependence of data partitioning that arise when the isolation forest method is used directly;
(5) The invention expands the data set by feature engineering, constructing new ratio features such as the water-cement ratio and the water-binder ratio, which helps to avoid overfitting of the model.
Drawings
FIG. 1 is a flow chart for building the HPC compressive strength prediction model of the present invention;
FIG. 2 is a first graph illustrating the fitting effect of the HPC compressive strength prediction model of the present invention;
FIG. 3 is a second graph illustrating the fitting effect of the HPC compressive strength prediction model of the present invention;
FIG. 4 is a schematic diagram of the HPC compressive strength prediction model fusion process of the present invention;
FIG. 5 is a first diagram illustrating a single-sample interpretation result of the SHAP model interpretability algorithm of the present invention;
FIG. 6 is a second diagram illustrating a single-sample interpretation result of the SHAP model interpretability algorithm of the present invention;
FIG. 7 is a diagram illustrating the interpretation result of the SHAP model interpretability algorithm of the present invention over the entire data set.
Detailed Description
The following detailed description explains the claims of the present invention so that those skilled in the art can understand them. The scope of the invention is not limited to the specific implementations described below; it is to be determined by those skilled in the art from the claims, read together with the following detailed description.
As shown in FIG. 1, the present embodiment includes the following steps:
Step 1, collecting parameter data relevant to high-performance concrete. The data are collected on site at concrete plants, mixing plants and the like, and include but are not limited to cement content, fly ash content, slag content, water reducing agent content, coarse/fine aggregate content, water content, curing period, temperature and slump. They form a sample set
Figure BDA0003831971760000113
The cement, fly ash, slag, water reducing agent, coarse/fine aggregate and water contents are all in kg/m^3, i.e. the mass of the corresponding component per cubic meter of concrete; the curing period is in days, the temperature in °C (degrees Celsius) and the slump in mm (millimeters). These values can be obtained by weighing or measuring when the concrete is prepared;
The actual compressive strength corresponding to each sample is taken as the target variable
Figure BDA0003831971760000111
thereby constructing an experimental data set
Figure BDA0003831971760000112
After the data set is constructed, it is stored on a local disk or in a relational database;
Step 2, exploratory data analysis and data cleaning
The data set is first explored by means of visualization, statistical analysis and the like. Guided by the visual analysis results, missing values, duplicate values, outliers and the like in the collected high-performance concrete parameter data are processed, and the data distribution is examined. For feature variables with few missing values (within 10%) and little difference between the feature's extreme values, missing values are filled with the mean; for feature variables with few missing values but a large difference between extreme values, missing values are filled with the median; for feature variables with a missing-value proportion of about 10% to 50%, missing values are predicted and filled using a decision-tree-based algorithm; feature variables with a missing-value proportion above 50% are removed; duplicate values in the data set are deleted;
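The missing-value rules of step 2 can be written as a small decision function. The sketch below is an illustration under assumptions not stated in the source: missing entries are encoded as `None`, whether the extreme values differ "greatly" is passed in as a boolean decided elsewhere, and the function names and strategy labels are hypothetical. The decision-tree-based filling itself is not implemented here.

```python
from statistics import mean, median


def choose_missing_value_strategy(missing_ratio, large_extreme_spread):
    """Map a feature's missing ratio (0..1) to the handling rule of step 2."""
    if missing_ratio > 0.5:
        return "drop_feature"                  # more than 50% missing
    if missing_ratio > 0.1:
        return "predict_with_decision_tree"    # roughly 10% to 50% missing
    # Few missing values: median if the extremes differ greatly, else mean
    return "fill_median" if large_extreme_spread else "fill_mean"


def fill_missing(values, strategy):
    """Fill None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "fill_mean":
        fill = mean(observed)
    elif strategy == "fill_median":
        fill = median(observed)
    else:
        raise ValueError("only mean and median filling are sketched here")
    return [fill if v is None else v for v in values]


# Hypothetical slump readings (mm) with one missing entry
slump = [180.0, None, 200.0]
ratio = sum(v is None for v in slump) / len(slump)
filled = fill_missing([1.0, None, 3.0], "fill_mean")
```

The median branch exists because a few extreme readings pull the mean away from typical values, whereas the median is unaffected by them.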
Step 3, outlier processing and data transformation of the concrete data
Step 3.1, data transformation
Feature variables whose values have a large difference between extremes are normalized. The normalization in this embodiment uses the Robust Scaler method, which is robust to outliers, and comprises the following steps:
a) Compute the 25%, 50% and 75% quantiles of the data to be processed, where the 50% quantile (i.e. the median) is the value to be removed, and store the corresponding quantiles;
b) Calculate the IQR, defined as the difference between the 75% quantile and the 25% quantile;
c) Scale the feature variables by the IQR so that they reach a uniform scale;
Based on the data visualization results, a logarithmic transformation is applied to the data set or to feature variables that do not satisfy the inductive bias of the current algorithm: 1 is added to the feature value and the logarithm is then taken, so that the distribution of each feature variable is closer to a normal distribution and the adverse effect of skewed data on the model's predictions is avoided;
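The two transformations of step 3.1 can be sketched with the Python standard library. The quantile convention is an assumption: `statistics.quantiles` with `method="inclusive"` is used to obtain the quartiles, and the sketch assumes the interquartile range is non-zero. The function names and sample values are hypothetical.

```python
import math
from statistics import quantiles


def robust_scale(values):
    """Subtract the median and divide by the IQR (75% minus 25% quantile)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # assumed non-zero
    return [(v - q2) / iqr for v in values]


def log_transform(values):
    """Add 1 to each value, then take the natural logarithm."""
    return [math.log1p(v) for v in values]


# Hypothetical feature with one extreme value
raw = [1.0, 2.0, 3.0, 4.0, 100.0]
scaled = robust_scale(raw)
logged = log_transform(raw)
```

Because the median and IQR ignore the tails of the distribution, the extreme value 100.0 does not compress the scaled values of the other four points, which is the motivation for preferring this scaler when outliers are present.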
Step 3.2, outlier processing
Outliers in the data set are processed by combining box plots with clustering and an isolation forest, where the box plot is used to compare the data before and after outlier processing and to confirm the benefit of the processing. This embodiment uses outlier detection based on the K-Means++ clustering algorithm plus an isolation forest; K-Means++ produces more consistent results than traditional K-Means by placing the initial centroids far from each other.
Step 4, feature engineering of the concrete data
The original data set acquired in step 1 contains only the numerical content of each component of the HPC, with no explicit mix-proportion relationships; the data set is therefore further expanded by feature construction (i.e. mathematical calculations on the numerical features of different components, guided by engineering experience). For example, the ratio of the original features 'water content' and 'cement content' gives the water-cement ratio; the ratio of 'water content' to the binder (cement content, slag content and fly ash content together) gives the water-binder ratio. The features newly constructed through feature engineering are shown in Table 1 below:
Figure BDA0003831971760000131
TABLE 1
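The two ratio features named in step 4 can be computed as in the sketch below. The dictionary keys and the function name `add_ratio_features` are hypothetical, and the binder is taken as the sum of cement, slag and fly ash contents; any further constructed features are not covered.

```python
def add_ratio_features(sample):
    """Return a copy of `sample` with water-cement and water-binder ratios added.

    All contents are in kg/m^3; the binder is cement + slag + fly ash.
    """
    binder = sample["cement"] + sample["slag"] + sample["fly_ash"]
    enriched = dict(sample)  # leave the original record untouched
    enriched["water_cement_ratio"] = sample["water"] / sample["cement"]
    enriched["water_binder_ratio"] = sample["water"] / binder
    return enriched


# Hypothetical mix design
mix = {"cement": 300.0, "slag": 100.0, "fly_ash": 100.0, "water": 150.0}
features = add_ratio_features(mix)
```

Ratios of this kind encode the proportioning relationship directly, so the model does not have to infer it from the raw contents.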
Step 5, constructing and evaluating concrete compressive strength prediction model
Based on the data set after the preprocessing and the feature engineering, dividing a training set and a test set according to the proportion of 8. Meanwhile, the performance of the first prediction model and the performance of the second prediction model are respectively evaluated by using regression model evaluation indexes in combination with 5-fold cross validation. The evaluation indexes adopted in this embodiment are respectively defined as follows:
Figure BDA0003831971760000132
Figure BDA0003831971760000133
Figure BDA0003831971760000134
Figure BDA0003831971760000135
Figure BDA0003831971760000136
where:
$n$ is the total number of collected concrete samples;
$i$ is the sample index;
$\hat{y}_i$ is the compressive strength predicted by the model for the i-th sample, i.e. the predicted value;
$y_i$ is the actual compressive strength corresponding to the i-th sample, i.e. the observed value;
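Which evaluation indexes the embodiment uses cannot be read from the text here, so the following sketch is an assumption rather than the source's definition: it computes four regression metrics that are commonly paired with 5-fold cross-validation (MAE, MSE, RMSE and the coefficient of determination R^2) from the actual and predicted compressive strengths. The function name and sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Common regression metrics (assumed choice, not taken from the source)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mse = sum(e * e for e in errors) / n           # mean squared error
    rmse = math.sqrt(mse)                          # root mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical actual and predicted compressive strengths (MPa)
metrics = regression_metrics([40.0, 50.0, 60.0], [42.0, 50.0, 58.0])
```

Error metrics such as MAE and RMSE are in the units of the target, while R^2 is dimensionless, so reporting both kinds gives a fuller picture when comparing the two base models.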
Step 6, parameter optimization of the concrete compressive strength prediction models
Guided by the model performance evaluation results, hyper-parameter tuning is performed on the first prediction model and the second prediction model using a Bayesian optimization method, further improving model performance. The key hyper-parameters tuned include the number of base-model trees and the tree depth of the ensemble models; the Bayesian optimization process takes the model error as the objective function and searches over parameter combinations for the combination with the minimum error.
Step 7, fusing the first prediction model and the second prediction model
Fig. 2 shows the fitting effect on the test set of the second prediction model based on the CatBoost algorithm, and Fig. 3 shows the fitting effect on the test set of the first prediction model based on the AdaBoost algorithm. As can be seen from the figures, the second prediction model based on the CatBoost algorithm predicts the compressive strength markedly better than the first prediction model based on the AdaBoost algorithm.
In order to improve the reliability of the model prediction result, a group decision approach is adopted to integrate the prediction results of multiple models, and fusion is performed at the model decision level to improve the accuracy of the prediction result. As shown in Fig. 4, in this embodiment a weighted average method is adopted to fuse the CatBoost model and the AdaBoost model, and a greater weight is given to the CatBoost model in the fusion process.
The weighted average model fusion process is as follows:
Taking the first prediction model based on the AdaBoost algorithm and the second prediction model based on the CatBoost algorithm as the base models h(x), and denoting the fusion model for HPC compressive strength prediction as H(x), the fusion model is expressed as follows:
$$H(x) = \sum_{i=1}^{T} w_i\, h_i(x)$$
In the formula:
$w_i$: the weight of the i-th base model; in this embodiment $w_i \ge 0$ and $\sum_{i=1}^{T} w_i = 1$;
$h_i(x)$: the HPC compressive strength predicted by the i-th base model;
T: the number of base models to be fused; the base models in this embodiment are AdaBoost and CatBoost, so T = 2;
H(x): the final predicted result of the compressive strength of the concrete.
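The weighted average fusion described above can be sketched for a single sample as follows. The function name `fuse`, the weights 0.3 and 0.7, and the two base-model predictions are hypothetical values chosen for illustration; the source does not state the weights obtained in the embodiment.

```python
def fuse(predictions, weights):
    """Weighted average H(x) = sum_i w_i * h_i(x) for one sample.

    `predictions` holds the base-model outputs h_i(x); `weights` holds w_i,
    which must be non-negative and sum to 1.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per base model")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * h for w, h in zip(weights, predictions))


# Hypothetical base-model predictions (MPa) for one mix.
h_adaboost, h_catboost = 40.0, 44.0
# Hypothetical weights with the larger weight on the CatBoost prediction.
fused = fuse([h_adaboost, h_catboost], [0.3, 0.7])
```

Because the weights are non-negative and sum to 1, the fused value always lies between the smallest and largest base-model prediction.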
The fusion weight $w_i$ of each base model is determined as follows.
The loss function of the fusion model, defined in conjunction with RMSE, is:
$$\mathrm{Loss} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - H(x_i)\bigr)^2}$$
In the formula:
n is the sample size, i.e., the total number of collected concrete samples;
$y_i$ is the actual concrete compressive strength of the i-th sample;
$H(x_i)$ is the concrete compressive strength predicted for the i-th sample by the fusion model H(x) for HPC compressive strength prediction.
Thus, the final optimization objective of the fusion model is defined as follows:
$$\min_{w_1,\,w_2}\ \mathrm{Loss} \quad \text{s.t.}\quad w_1 + w_2 = 1,\ \ w_1 \ge 0,\ \ w_2 \ge 0 \qquad (8)$$
In the formula:
Loss is the loss function of the fusion model H(x);
s.t. is the abbreviation of "subject to" and introduces the constraints;
$w_1$ and $w_2$ are the fusion weights of the base models CatBoost and AdaBoost;
min denotes minimization.
The fusion weights w of the base models are obtained by solving the constrained minimization problem shown in equation (8).
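One way to solve this constrained problem for two base models is sketched below. The source does not specify a solver, so the approach is an assumption: substituting $w_2 = 1 - w_1$ turns the squared loss into a one-dimensional convex quadratic in $w_1$, whose minimizer can be computed in closed form and clipped to [0, 1] (minimizing RMSE and minimizing its square give the same weights). The function names and data are hypothetical.

```python
import math


def fusion_rmse(y, h1, h2, w1):
    """RMSE of the fused prediction w1*h1 + (1 - w1)*h2 against y."""
    w2 = 1.0 - w1
    n = len(y)
    sq = sum((yi - (w1 * a + w2 * b)) ** 2 for yi, a, b in zip(y, h1, h2))
    return math.sqrt(sq / n)


def solve_weights(y, h1, h2):
    """Return (w1, w2) minimizing fusion RMSE with w1 + w2 = 1, w1, w2 >= 0.

    With d = h1 - h2 the residual is (y - h2) - w1*d, so the unconstrained
    least-squares minimizer is sum(d*(y - h2)) / sum(d*d). Clipping it to
    [0, 1] gives the constrained minimizer because the objective is convex.
    """
    d = [a - b for a, b in zip(h1, h2)]
    denom = sum(di * di for di in d)
    if denom == 0.0:
        # Both models predict identically, so any valid weights are optimal.
        return 0.5, 0.5
    w1 = sum(di * (yi - b) for di, yi, b in zip(d, y, h2)) / denom
    w1 = min(1.0, max(0.0, w1))
    return w1, 1.0 - w1


# Hypothetical data: model 1 overshoots by 4 MPa, model 2 undershoots by 1 MPa.
y = [30.0, 40.0, 50.0]
h1 = [34.0, 44.0, 54.0]
h2 = [29.0, 39.0, 49.0]
w1, w2 = solve_weights(y, h1, h2)
```

In this toy case the more accurate base model receives the larger weight. To avoid an optimistic estimate, the weights would normally be fitted on predictions for data not used to train the base models.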
Step 8, concrete compressive strength prediction model interpretability analysis based on SHAP
Through the above model construction and evaluation, model parameter tuning and model fusion, a concrete compressive strength prediction model with good prediction capability is obtained. Furthermore, this embodiment uses the SHAP model-interpretation algorithm to analyze and explain the model prediction results, so as to better understand the influence of each feature value on the concrete compressive strength predicted by the model. This influence is expressed as the Shapley value, abbreviated as Ψ, and is defined as follows:
$$\Psi_j(val) = \sum_{S \subseteq \{x_1,\dots,x_p\}\setminus\{x_j\}} \frac{|S|!\,(p-|S|-1)!}{p!}\,\bigl(val_x(S\cup\{x_j\}) - val_x(S)\bigr)$$
In the formula:
S is a subset of the features input to the model to be explained, i.e., a subset of the concrete parameters input to the fusion model for HPC compressive strength prediction;
$x_j$ is the j-th feature variable of the sample to be explained, i.e., the j-th concrete parameter;
p is the total number of features, i.e., the total number of concrete parameters;
$val_x(S)$ is the prediction result of the model to be explained on sample x when S is taken as the input features, i.e., the compressive strength output by the model to be explained; x is the sample, whose elements are denoted $x_i$, where $x_i$ is the value of the i-th feature variable.
The SHAP value expresses the importance of the j-th feature to the output of the model to be explained, i.e., its marginal contribution; the Shapley value is the average of these marginal contributions over feature subsets. The model interpretation result for the model to be explained is defined as follows:
$$g(z') = \Psi_0 + \sum_{j=1}^{M} \Psi_j\, z'_j$$
In the formula:
g is the model to be explained, in this embodiment the fusion model H(x) for HPC compressive strength prediction obtained by fusing CatBoost and AdaBoost;
$z' \in \{0,1\}^M$ is the coalition vector, indicating whether each feature $z_j$ ($j \in [1, M]$) is present or absent; in this embodiment $z_j$ is the j-th concrete parameter input to the fusion model for HPC compressive strength prediction, and $z'$ identifies whether $z_1$ to $z_M$ are present in the set of parameters input to that fusion model;
M is the number of features in the coalition, i.e., the number of input parameters of the model g to be explained;
$\Psi_j \in \mathbb{R}$ is the Shapley value feature attribution of feature j, i.e., the contribution of the j-th parameter to the prediction result of the model g to be explained, which in this embodiment is its contribution to the compressive strength;
$\Psi_0$ is the average prediction result of the model g to be explained, which in this embodiment is the average of the compressive strength predictions output by the fusion model for HPC compressive strength prediction.
The Shapley value measures the contribution of a feature to the overall prediction. When $\Psi_j > 0$, the feature has a positive effect on the predicted value, i.e., it enhances the predicted compressive strength.
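For a small number of features, the Shapley value definition above can be evaluated exactly by enumerating all feature subsets, as in the sketch below. Several parts are assumptions made for illustration: the toy `model` and its coefficients are hypothetical, and "absent" features are replaced by fixed baseline values, so $\Psi_0$ here is the baseline prediction rather than an average over a data set as the SHAP algorithm would use.

```python
from itertools import combinations
from math import factorial


def shapley_values(val, p):
    """Exact Shapley values for a value function over p features.

    `val` maps a frozenset of present feature indexes to a prediction.
    """
    psi = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        total = 0.0
        for size in range(len(others) + 1):
            # Weight |S|! (p - |S| - 1)! / p! from the Shapley definition.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (val(s | {j}) - val(s))
        psi.append(total)
    return psi


# Hypothetical sample and baseline: [curing age, water-cement ratio, cement].
x = [28.0, 0.4, 450.0]
baseline = [14.0, 0.5, 400.0]


def model(v):
    """Toy strength predictor with one interaction term (illustrative only)."""
    age, wc, cement = v
    return 10.0 + 0.5 * age - 40.0 * wc + 0.05 * cement + 0.01 * age * (cement - 400.0)


def val(s):
    """Prediction when only the features in `s` take the sample's values."""
    return model([x[k] if k in s else baseline[k] for k in range(len(x))])


psi = shapley_values(val, len(x))
psi0 = val(frozenset())
```

The attributions satisfy the additive form of the interpretation model: `psi0 + sum(psi)` equals `model(x)` up to rounding. Exact enumeration costs on the order of $2^p$ model evaluations per feature, which is why practical SHAP implementations rely on approximations or tree-specific algorithms.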
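The SHAP global feature importance of a feature is the mean of the absolute values of its Shapley values over all samples, i.e.:
$$I_j = \frac{1}{n}\sum_{i=1}^{n}\bigl|\Psi_j^{(i)}\bigr|$$
where $\Psi_j^{(i)}$ is the Shapley value of feature j for the i-th sample and n is the number of samples.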
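The global importance computation can be sketched as a column-wise mean of absolute Shapley values. The matrix below is hypothetical (rows are samples, columns are features) and is not taken from the embodiment's results.

```python
def global_importance(shap_matrix):
    """Mean absolute Shapley value per feature.

    `shap_matrix[i][j]` is the Shapley value of feature j for sample i.
    """
    n = len(shap_matrix)
    p = len(shap_matrix[0])
    return [sum(abs(row[j]) for row in shap_matrix) / n for j in range(p)]


# Hypothetical Shapley values for 3 samples and 3 features.
shap_matrix = [
    [8.0, -3.0, 1.0],
    [-6.0, 2.0, 0.5],
    [10.0, -1.0, -1.5],
]
importance = global_importance(shap_matrix)
# Feature indexes ordered from most to least important.
ranking = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so a feature that strongly raises some predictions and strongly lowers others still ranks as important.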
Figs. 5 to 7 show part of the model interpretability analysis performed in this embodiment using the SHAP algorithm.
Fig. 5 is the SHAP local interpretation of the first sample, and Fig. 6 is the SHAP local interpretation of the tenth sample. In the figures, the average prediction result of the model is 35.25 MPa, the compressive strength predicted by the model for the first sample is 76.17 MPa, and the compressive strength predicted for the tenth sample is 38.73 MPa. Based on the model prediction result, the actual observation result and the value of each parameter, guidance can be provided for the design of the concrete mix proportion.
Fig. 7 is the summary plot of the SHAP global interpretation, in which the Y axis lists the features affecting the compressive strength from top to bottom in order of contribution; the curing period, the water-cement ratio and the cement content are the three factors with the most significant influence on the HPC compressive strength, followed by the water-cement ratio and others. The X axis is the average influence of each factor on the prediction result of the compressive strength prediction model. In the current experimental results, the compressive strength increases by 8 MPa on average as the curing period increases.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
An embodiment of the present application also provides a system, which comprises a model training module and a model composite operation module;
the model training module is used for respectively training the first model and the second model according to the collected historical parameter data of the HPC to respectively obtain a first prediction model and a second prediction model which are used for predicting the compressive strength of the HPC and are trained;
and the model composite operation module is used for carrying out composite operation on the HPC compressive strength output by the trained first prediction model and the trained second prediction model for predicting the HPC compressive strength to obtain a fusion model for predicting the HPC compressive strength, and the fusion model for predicting the HPC compressive strength outputs a final prediction result of the HPC compressive strength.
And in the model composite operation module, composite operation is carried out on the HPC compressive strength output by the first prediction model and the second prediction model by adopting a weighted average method.
In another embodiment, the HPC compressive strength prediction system further comprises a model interpretation analysis module, configured to perform interpretation analysis on the fusion model for HPC compressive strength prediction using the first algorithm, so as to obtain the effect of the component content of each HPC parameter on the HPC compressive strength.
Matters not described in detail in this specification are well known to those skilled in the art.

Claims (10)

1. A HPC compressive strength prediction method based on model fusion is characterized by comprising the following steps:
s1, respectively training a first prediction model and a second prediction model according to collected historical parameter data of the HPC to respectively obtain the trained first prediction model and the trained second prediction model for predicting the compressive strength of the HPC;
and S2, performing composite operation on the trained first prediction model and the trained second prediction model to obtain a fusion model for predicting the HPC compressive strength, and outputting a final prediction result of the HPC compressive strength by the fusion model for predicting the HPC compressive strength.
2. The model fusion-based HPC compressive strength prediction method of claim 1, further comprising, after step S2, step S3:
and S3, the fusion model of the HPC compressive strength prediction is explained and analyzed by using a first algorithm, and the contribution value of the component content of each HPC parameter to the HPC compressive strength output by the fusion model of the HPC compressive strength prediction is obtained.
3. The model fusion-based HPC compressive strength prediction method of claim 2, wherein the first algorithm is a SHAP-interpretable algorithm.
4. The model fusion-based HPC compressive strength prediction method of claim 1, wherein the step S2 of obtaining the fused model of the HPC compressive strength prediction comprises performing a composite operation on the first prediction model and the second prediction model by using a weighted average method.
5. The model fusion-based HPC compressive strength prediction method of claim 4, wherein the first prediction model predicts the compressive strength of the HPC based on an AdaBoost algorithm; the second prediction model predicts the compressive strength of the HPC based on the CatBoost algorithm.
6. The model fusion-based HPC compressive strength prediction method of claim 5, wherein when the first prediction model and the second prediction model are subjected to modular composite operation by using a weighted average method, the weight of the second prediction model based on the Catboost algorithm is greater than that of the first prediction model based on the AdaBoost algorithm.
7. The model fusion-based HPC compressive strength prediction method of claim 6, wherein the method of calculating the weight for each prediction model in the weighted average method comprises: the weights of the first prediction model and the second prediction model are obtained by solving a constrained minimum optimization problem as shown in the following formula:
$$\min_{w_1,\,w_2}\ \mathrm{Loss} \quad \text{s.t.}\quad w_1 + w_2 = 1,\ \ w_1 \ge 0,\ \ w_2 \ge 0$$
in the formula:
$w_1$ and $w_2$ are the weights corresponding to the first prediction model and the second prediction model, respectively;
Loss is the loss function of H(x) below;
$$H(x) = w_1 h_1(x) + w_2 h_2(x)$$
in the formula:
H(x): the final predicted HPC compressive strength output by the fusion model of the HPC compressive strength prediction;
$w_1$, $w_2$: the weights of the first prediction model and the second prediction model, respectively;
$h_1(x)$, $h_2(x)$: the HPC compressive strength predicted by the first prediction model and the second prediction model, respectively.
8. The model fusion-based HPC compressive strength prediction system of claim 1, comprising: the model training module and the model compound operation module;
the model training module is used for respectively training the first model and the second model according to the collected historical parameter data of the HPC to respectively obtain a first prediction model and a second prediction model which are trained and used for predicting the compressive strength of the HPC;
and the model composite operation module is used for carrying out composite operation on the trained first prediction model and the trained second prediction model to obtain a fusion model for predicting the HPC compressive strength, and the fusion model for predicting the HPC compressive strength outputs a final prediction result of the HPC compressive strength.
9. The model-fusion-based HPC compressive strength prediction system of claim 8, further comprising a model interpretation analysis module configured to perform interpretation analysis on the fused model of the HPC compressive strength prediction using a first algorithm to obtain a contribution of the compositional content of each HPC parameter to the HPC compressive strength of the fused model output of the HPC compressive strength prediction.
10. The model fusion-based HPC compressive strength prediction system of claim 8, wherein the model composition operation module performs a composition operation on the HPC compressive strengths output by the first prediction model and the second prediction model using a weighted average method.
CN202211078389.2A 2022-09-05 2022-09-05 HPC compressive strength prediction method and system based on model fusion Pending CN115497574A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211078389.2A CN115497574A (en) 2022-09-05 2022-09-05 HPC compressive strength prediction method and system based on model fusion

Publications (1)

Publication Number Publication Date
CN115497574A true CN115497574A (en) 2022-12-20

Family

ID=84468246

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211078389.2A Pending CN115497574A (en) 2022-09-05 2022-09-05 HPC compressive strength prediction method and system based on model fusion

Country Status (1)

Country Link
CN (1) CN115497574A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination