CN117313041A - Method for predicting pitting corrosion of outer surface of buried pipeline and analyzing factors - Google Patents

Publication number
CN117313041A
Authority
CN
China
Prior art keywords
pipeline
data
pitting
soil
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311346348.1A
Other languages
Chinese (zh)
Inventor
王勤英
宋宇辉
西宇辰
张兴寿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Petroleum University
Original Assignee
Southwest Petroleum University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Petroleum University filed Critical Southwest Petroleum University
Priority to CN202311346348.1A priority Critical patent/CN117313041A/en
Publication of CN117313041A publication Critical patent/CN117313041A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/27 Regression, e.g. linear or logistic regression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Testing Resistance To Weather, Investigating Materials By Mechanical Methods (AREA)

Abstract

The invention relates to the field of corrosion protection of oil and gas pipelines and discloses a method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its influencing factors. The method comprises the following steps: selecting soil characteristics and pipeline characteristics related to pitting of the outer surface of the buried pipeline as input features; obtaining sample data of soil types and pipeline coating types; preprocessing the sample data to form a corrosion data set; constructing a regression prediction model of the pitting depth of the outer surface of the buried pipeline based on the XGBoost algorithm; randomly selecting 80% of the data in the corrosion data set as a training set for optimized training of the model, and using the remaining 20% as a test set for evaluating the prediction performance of the trained model; and inputting the sample data into the trained model to obtain the importance of the soil and pipeline characteristics and the predicted pitting depth. The invention converts categorical features into numerical variables that can be used by the pitting depth regression prediction model, and achieves accurate prediction of the pitting depth of the pipeline.

Description

Method for predicting pitting corrosion of outer surface of buried pipeline and analyzing factors
Technical Field
The invention relates to the technical field of corrosion protection of oil and gas pipelines, in particular to a method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing factors.
Background
Oil and gas pipelines are often called the blood vessels of the energy system and play an important role in ensuring energy security. After a buried oil and gas pipeline has been in service for some time, corrosion failure can occur through the interaction of various media and the environment. This causes oil and gas leakage, disrupts normal operation of the transmission system, and can lead to environmental pollution, economic loss and even safety accidents. Predicting the pitting corrosion of buried oil and gas pipelines is therefore urgently needed to provide a basis for their inspection and maintenance.
At present, methods for predicting the pitting depth of buried oil and gas pipelines mainly include empirical calculation models, corrosion mechanism models, grey theory and neural network models. However, these models predict the pitting depth of the outer surface of buried oil and gas pipelines with poor accuracy and are poorly interpretable, so the importance and action rules of the various pitting influence factors cannot be determined. As a result, the output of these prediction models cannot directly guide the upgrading of pipeline materials and construction processes.
To meet the practical need for accurate prediction of pitting corrosion of the outer surface of oil and gas pipelines and for quantitative analysis of its factors, a prediction model based on interpretable machine learning is urgently needed, together with an in-depth analysis and evaluation of the action rules of the various factors.
Disclosure of Invention
In view of the above problems, the invention provides a method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its influencing factors.
The invention adopts the following technical scheme:
A method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its influencing factors comprises the following steps:
Step 1: Select soil characteristics and pipeline characteristics related to pitting of the outer surface of the buried pipeline as input features.
Step 2: Obtain sample data of soil types and pipeline coating types.
Step 3: Preprocess the sample data obtained in step 2 to form a corrosion data set.
Step 4: Construct a regression prediction model of the pitting depth of the outer surface of the buried pipeline based on the extreme gradient boosting algorithm XGBoost.
Step 5: Randomly select 80% of the data in the corrosion data set as a training set for optimized training of the XGBoost model, and use the remaining 20% as a test set for evaluating the prediction performance of the trained model.
Step 6: Input the sample data into the trained model to obtain the importance of the soil and pipeline characteristics and the predicted pitting depth.
Step 7: Calculate the accumulated local effect (ALE) values to analyze the action rules and thresholds of the soil characteristics and pipeline characteristics with respect to the pitting depth.
Further, the soil characteristics in step 1 include soil type, resistance, water content, HCO₃⁻ content, Cl⁻ content, SO₄²⁻ content, redox potential, pH, soil resistivity and soil-to-pipeline potential; the pipeline characteristics include pipeline coating type and service life.
Further, the sample data preprocessing in step 3 includes the following steps:
Step 3.1: Outlier processing: outliers in the sample data of step 2 are detected using the interquartile range of a box plot, and each outlier is replaced with the mode.
Step 3.2: Data conversion: the soil type and pipeline coating type variables in the sample data of step 2 are converted into numerical data by one-hot encoding.
Further, step 5 includes the following steps:
Step 5.1: Input the training set into the model constructed in step 4.
Step 5.2: Perform feature optimization on the XGBoost-based prediction model for pitting of the outer surface of the buried pipeline.
Step 5.3: Perform hyperparameter optimization on the XGBoost-based prediction model for pitting of the outer surface of the buried pipeline.
Further, the feature optimization in step 5.2 is carried out as follows: the Pearson correlation coefficient r between each pair of features is calculated with the following formula; two features whose correlation coefficient exceeds 0.7 are regarded as highly linearly correlated, one of them is treated as redundant and removed, and the remaining features are input into the model for training to obtain the optimal model:
$$r = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{N}(y_i-\bar{y})^2}}$$
where $x_i$ and $y_i$ are the values of the i-th sample for features x and y respectively, $\bar{x}$ and $\bar{y}$ are the means of the two features, and N is the total number of samples.
Further, the hyperparameter optimization in step 5.3 uses grid search with cross-validation, and specifically includes:
Step 5.3.1: Set the parameter grid by defining a list of candidate values for each hyperparameter.
Step 5.3.2: Set the number of folds K for K-fold cross-validation: divide the training set into K parts, use each part in turn as the validation set with the remaining K-1 parts as the training set, and take the average evaluation result of the K models on their validation sets as the cross-validation error.
Step 5.3.3: Calculate the cross-validation result of the model for each hyperparameter combination, and output the combination with the smallest cross-validation error as the optimal hyperparameters.
Step 5.3.4: With the optimal hyperparameter combination, retrain the model and verify it on the held-out test data to obtain the optimal model.
Further, in step 6, the importance of a soil or pipeline characteristic is measured by the total number of times the feature is used as a split attribute across all decision trees; a feature whose total exceeds a set threshold is regarded as an important feature.
Further, the accumulated local effect (ALE) value in step 7 is calculated as follows: the distribution range of each feature is divided into K subintervals, and the ALE value of the feature in each subinterval is calculated with the following formula; a line graph is drawn from the calculated ALE values, and the line graph reflects how the pitting depth changes with the feature, including the threshold:
$$\mathrm{ALE}(j,x)=\sum_{k=1}^{k_j(x)}\frac{1}{n_j(k)}\sum_{i=1}^{n_j(k)}\left[f\left(z_{k,j},\,x_{\setminus j}^{(i)}\right)-f\left(z_{k-1,j},\,x_{\setminus j}^{(i)}\right)\right]$$
where ALE(j, x) is the uncentered ALE value of feature j; $x_{\setminus j}$ denotes the features other than feature j; $k_j(x)$ is the number of intervals of feature j over which the effect is accumulated; $n_j(k)$ is the number of samples in the k-th interval; i indexes the sample points in the k-th interval; $z_{k,j}$ and $z_{k-1,j}$ are the upper and lower boundary values of feature j in the k-th interval; $x_{\setminus j}^{(i)}$ denotes the values of the other features at the i-th sample point in the k-th interval; and f denotes the prediction of the trained model.
The beneficial effects of the invention are as follows:
1. The method converts categorical features into numerical variables that can be used by the XGBoost-based pitting depth regression prediction model, and achieves accurate prediction of the pitting depth of the pipeline.
2. The invention can screen and sort the importance of the features, and improve the prediction performance and generalization capability of the model.
3. The method can quantitatively analyze the influence rule of soil characteristics and pipeline characteristics on pitting corrosion, can find out the influence threshold value of each characteristic, increases the understanding of a pipeline pitting corrosion mechanism, and can be used for pipeline pitting corrosion prediction and rule analysis and development of corresponding corrosion protection methods.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings of the embodiments are briefly described below. The drawings described below relate only to some embodiments of the present invention and do not limit the present invention.
FIG. 1 is a schematic diagram of the operational framework of the present invention;
FIG. 2 is a diagram showing the Pearson correlation coefficients of the input features obtained after preprocessing in the present invention;
FIG. 3 is a schematic view of the feature importance result of the feature importance evaluation calculation according to the present invention;
FIG. 4 is a schematic diagram of the predicted result of the pitting depth of the outer surface of the oil and gas pipeline according to the invention;
FIG. 5 is a graph schematically illustrating the concentration of chloride ions versus the depth of pitting corrosion calculated by the cumulative local effect value according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without creative efforts, based on the described embodiments of the present invention fall within the protection scope of the present invention.
The invention will be further described with reference to the drawings and examples.
As shown in FIG. 1, the method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its influencing factors comprises the following steps:
Step 1: Select soil characteristics and pipeline characteristics related to pitting of the outer surface of the buried pipeline as input features.
The soil characteristics in step 1 include soil type, resistance, water content, HCO₃⁻ content, Cl⁻ content, SO₄²⁻ content, redox potential, pH, soil resistivity and soil-to-pipeline potential; the pipeline characteristics include pipeline coating type and service life.
Step 2: Obtain sample data of soil types and pipeline coating types.
Step 3: Preprocess the sample data obtained in step 2 to form a corrosion data set.
The preprocessing in step 3 comprises the following steps:
Step 3.1: Outlier processing: outliers in the sample data of step 2 are detected using the interquartile range of a box plot, and each outlier is replaced with the mode.
Step 3.2: Data conversion: the soil type and pipeline coating type variables in the sample data of step 2 are converted into numerical data by one-hot encoding.
Step 4: Construct a regression prediction model of the pitting depth of the outer surface of the buried pipeline based on the extreme gradient boosting algorithm XGBoost.
Step 5: Randomly select 80% of the data in the corrosion data set as a training set for optimized training of the XGBoost model, and use the remaining 20% as a test set for evaluating the prediction performance of the trained model.
Step 5 comprises the following steps:
Step 5.1: Input the training set into the model constructed in step 4.
Step 5.2: Perform feature optimization on the XGBoost-based prediction model for pitting of the outer surface of the buried pipeline.
The feature optimization in step 5.2 is carried out as follows: the Pearson correlation coefficient r between each pair of features is calculated with the following formula; two features whose correlation coefficient exceeds 0.7 are regarded as highly linearly correlated, one of them is treated as redundant and removed, and the remaining features are input into the model for training to obtain the optimal model:
$$r = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{N}(y_i-\bar{y})^2}}$$
where $x_i$ and $y_i$ are the values of the i-th sample for features x and y respectively, $\bar{x}$ and $\bar{y}$ are the means of the two features, and N is the total number of samples.
Step 5.3: Perform hyperparameter optimization on the XGBoost-based prediction model for pitting of the outer surface of the buried pipeline.
The hyperparameter optimization in step 5.3 uses grid search with cross-validation, and specifically includes:
Step 5.3.1: Set the parameter grid by defining a list of candidate values for each hyperparameter.
Step 5.3.2: Set the number of folds K for K-fold cross-validation: divide the training set into K parts, use each part in turn as the validation set with the remaining K-1 parts as the training set, and take the average evaluation result of the K models on their validation sets as the cross-validation error.
Step 5.3.3: Calculate the cross-validation result of the model for each hyperparameter combination, and output the combination with the smallest cross-validation error as the optimal hyperparameters.
Step 5.3.4: With the optimal hyperparameter combination, retrain the model and verify it on the held-out test data to obtain the optimal model.
Step 6: Input the sample data into the trained model to obtain the importance of the soil and pipeline characteristics and the predicted pitting depth.
In step 6, the importance of a soil or pipeline characteristic is measured by the total number of times the feature is used as a split attribute across all decision trees; a feature whose total exceeds a set threshold is regarded as an important feature.
Step 7: Calculate the accumulated local effect (ALE) values to analyze the action rules of the soil characteristics and pipeline characteristics with respect to the pitting depth.
The ALE value in step 7 is calculated as follows: the distribution range of each feature is divided into K subintervals, and the ALE value of the feature in each subinterval is calculated with the following formula; a line graph is drawn from the calculated ALE values, and the line graph reflects how the pitting depth changes with the feature, including the threshold. Specifically, the pitting depth increases as the feature concentration increases, and increases rapidly once the concentration exceeds the threshold (the abrupt-change point).
$$\mathrm{ALE}(j,x)=\sum_{k=1}^{k_j(x)}\frac{1}{n_j(k)}\sum_{i=1}^{n_j(k)}\left[f\left(z_{k,j},\,x_{\setminus j}^{(i)}\right)-f\left(z_{k-1,j},\,x_{\setminus j}^{(i)}\right)\right]$$
where ALE(j, x) is the uncentered ALE value of feature j; $x_{\setminus j}$ denotes the features other than feature j; $k_j(x)$ is the number of intervals of feature j over which the effect is accumulated; $n_j(k)$ is the number of samples in the k-th interval; i indexes the sample points in the k-th interval; $z_{k,j}$ and $z_{k-1,j}$ are the upper and lower boundary values of feature j in the k-th interval; $x_{\setminus j}^{(i)}$ denotes the values of the other features at the i-th sample point in the k-th interval; and f denotes the prediction of the trained model.
Examples
The method for predicting the pitting corrosion and analyzing the factors of the outer surface of the buried pipeline provided by the embodiment of the invention is described in detail below.
Step 1: The maximum pitting depth of the outer surface of the pipeline is selected as the index of pitting corrosion and used as the output feature of the model. The input features are soil type, resistance, water content, HCO₃⁻ content, Cl⁻ content, SO₄²⁻ content, redox potential, pH, soil resistivity, soil-to-pipeline potential, pipeline coating type and service life.
Step 2: Sample data of soil types and pipeline coating types are obtained. A total of 259 samples are obtained by on-site measurement of the maximum pitting depth of the outer surface of in-service oil and gas pipelines at different locations and of the surrounding soil; each sample contains all input and output features of step 1.
Step 3: The 259 samples obtained in step 2 are preprocessed to form a corrosion data set. The specific process is as follows:
Step 3.1: Outlier processing: outliers in the sample data of step 2 are detected using a box plot with quartile-based whiskers, where an outlier is defined as a value greater than Q3 + 1.5 IQR or less than Q1 - 1.5 IQR; each outlier is replaced with the mode of the corresponding feature. Here Q1 is the lower quartile of the data, Q3 is the upper quartile, and IQR is the interquartile range, i.e. the difference between Q3 and Q1.
Step 3.2: Data conversion: the soil type and pipeline coating type variables in the sample data of step 2 are converted into numerical data by one-hot encoding. Specifically, the number m of categories of the variable (soil type or pipeline coating type) is counted, so that the variable is encoded with m bits; for the i-th category, the i-th bit is set to 1 and all other bits are set to 0. The converted soil type and pipeline coating type data of the example are shown in Table 1:
Table 1. Soil type and pipeline coating type data after conversion
Note: Soil type: C (clay), SYCL (sandy clay loam), CL (clay loam), SCL (silty clay loam), SC (silty clay), SL (silty loam)
Type of coating: NC (uncoated), AEC (asphalt enamel coating), WTC (tape coating), CTC (coal tar coating), FBE (fusion-bonded epoxy coating).
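The two preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the inventors' implementation: the function names, the sample pH readings and the choice of the "inclusive" quartile method are assumptions made for the example.

```python
import statistics

def replace_outliers_with_mode(values):
    """Replace values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] with the mode."""
    # Quartile convention is an assumption; the source does not specify one.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    mode = statistics.mode(values)
    return [v if low <= v <= high else mode for v in values]

def one_hot(labels, categories):
    """Encode each label as an m-bit vector with a single 1 at its category index."""
    index = {c: i for i, c in enumerate(categories)}
    m = len(categories)
    return [[1 if index[label] == i else 0 for i in range(m)] for label in labels]

# Hypothetical pH readings; 12.9 lies far above Q3 + 1.5*IQR.
ph = [6.1, 6.4, 6.4, 6.8, 7.0, 7.2, 12.9]
cleaned_ph = replace_outliers_with_mode(ph)

# Soil type abbreviations as listed in the note to Table 1.
soil_categories = ["C", "SYCL", "CL", "SCL", "SC", "SL"]
encoded_soil = one_hot(["CL", "C", "SL"], soil_categories)
```

With six soil categories each sample receives a 6-bit code, matching the m-bit scheme described in step 3.2.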
Step 4: Construct a regression prediction model of the pitting depth of the outer surface of the buried pipeline based on the extreme gradient boosting algorithm XGBoost.
FIG. 1 shows the framework of the XGBoost-based regression prediction model of the pitting depth of the outer surface of the buried pipeline. The XGBoost model is built with the machine learning library Scikit-learn; the input variables are the input features preprocessed in step 3. The model itself is an existing model.
Step 5: Randomly select 80% of the data in the data set as a training set for optimized training of the XGBoost model, and use the remaining 20% as a test set for evaluating the prediction performance of the trained model. Training, optimization and evaluation are carried out with the XGBoost model framework constructed in step 4; the specific process is as follows:
step 5.1: inputting the training set into the model constructed in the step 4.
Step 5.2: Perform feature optimization on the XGBoost-based prediction model for pitting of the outer surface of the buried pipeline.
The feature optimization in step 5.2 is carried out as follows: the Pearson correlation coefficient r between each pair of features is calculated with the following formula; two features whose correlation coefficient exceeds 0.7 are regarded as highly linearly correlated, one of them is treated as redundant and removed, and the remaining features are input into the model for training to obtain the optimal model:
$$r = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{N}(y_i-\bar{y})^2}}$$
where $x_i$ and $y_i$ are the values of the i-th sample for features x and y respectively, $\bar{x}$ and $\bar{y}$ are the means of the two features, and N is the total number of samples.
The Pearson correlation coefficients of the input features obtained after preprocessing in the example are shown in FIG. 2. The features are the encoded soil types (clay (ct_C), sandy clay loam (ct_SYCL), clay loam (ct_CL), silty clay loam (ct_SCL), silty clay (ct_SC), silty loam (ct_SL)), water content (wc), HCO₃⁻ content (bc), Cl⁻ content (cc), SO₄²⁻ content (sc), redox potential (rp), pH (pH), soil resistivity (re), soil-to-pipe potential (pp), the encoded pipeline coating types (no coating (class_NC), asphalt enamel coating (class_AEC), tape coating (class_WTC), coal tar coating (class_CTC), fusion-bonded epoxy coating (class_FBE)), and service life (t).
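The correlation-based filtering of step 5.2 can be sketched as follows. The source does not say which feature of a highly correlated pair is removed, nor whether the 0.7 threshold applies to the absolute value of r; this sketch assumes the absolute value and keeps whichever feature appears first. The feature values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient r of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(features, threshold=0.7):
    """Keep features in order; drop any whose |r| with a kept feature exceeds threshold."""
    kept = []
    for name, values in features.items():
        # Assumption: the earlier feature of a correlated pair is the one retained.
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical measurements: water content (wc), soil resistivity (re), pH.
features = {
    "wc": [10, 20, 30, 40, 50],
    "re": [50, 41, 29, 22, 9],       # falls almost linearly as wc rises
    "pH": [6.5, 7.1, 6.2, 7.4, 6.6],
}
kept = drop_correlated(features)
```

Here `re` is strongly (negatively) correlated with `wc` and is dropped, while `pH` is only weakly correlated and is kept.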
Step 5.3: Perform hyperparameter optimization on the XGBoost-based prediction model for pitting of the outer surface of the buried pipeline.
The hyperparameter optimization in step 5.3 uses grid search with cross-validation, and specifically includes:
Step 5.3.1: Set the parameter grid: a list of candidate values is set for each hyperparameter. In this embodiment the candidate values for the XGBoost model are: number of base estimators n_estimators: [10, 20, 50, 100, 200, 300]; subsampling rate subsample: [1.0, 0.9, 0.8]; maximum tree depth max_depth: [5, 9, 15, 20, 25]; learning rate learning_rate: [0.05, 0.1, 0.2, 0.4, 0.8, 1]. The remaining hyperparameters keep their default values.
Step 5.3.2: Set the number of folds K for K-fold cross-validation: divide the training set into K parts, use each part in turn as the validation set with the remaining K-1 parts as the training set, and take the average evaluation result of the K models on their validation sets as the cross-validation error. The number of folds is set to 5 in the example.
Step 5.3.3: Calculate the cross-validation result of the model for each hyperparameter combination, and output the combination with the smallest cross-validation error as the optimal hyperparameters.
Step 5.3.4: After the optimal hyperparameter combination is obtained, retrain the model and verify it on the held-out test data to obtain the optimal model.
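The 80/20 split and the grid search with K-fold cross-validation described in step 5 can be sketched with the standard library alone. The helper names and the random seed are assumptions, and `toy_cv_error` is a placeholder that stands in for "train the model on the training folds and measure its error on the validation fold"; it does not train anything.

```python
import itertools
import random

def train_test_split(n_samples, test_fraction=0.2, seed=0):
    """Randomly split sample indices into (train, test)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = round(n_samples * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_folds(indices, k):
    """Yield (train, validation) index lists; each fold is the validation set once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def grid_search(param_grid, indices, cv_error, k=5):
    """Return (best_params, best_error) with the smallest mean K-fold error."""
    names = list(param_grid)
    best_params, best_error = None, float("inf")
    for combo in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, combo))
        errors = [cv_error(params, tr, va) for tr, va in k_folds(indices, k)]
        mean_error = sum(errors) / k
        if mean_error < best_error:
            best_params, best_error = params, mean_error
    return best_params, best_error

# Candidate values listed in step 5.3.1 of the example.
param_grid = {
    "n_estimators": [10, 20, 50, 100, 200, 300],
    "subsample": [1.0, 0.9, 0.8],
    "max_depth": [5, 9, 15, 20, 25],
    "learning_rate": [0.05, 0.1, 0.2, 0.4, 0.8, 1],
}
n_combinations = 1
for candidates in param_grid.values():
    n_combinations *= len(candidates)

train_idx, test_idx = train_test_split(259)  # 259 samples as in the example

def toy_cv_error(params, train, val):
    # Placeholder error (illustration only): ignores the data and simply
    # favours one arbitrary combination so the search runs end to end.
    return abs(params["max_depth"] - 9) + abs(params["learning_rate"] - 0.1)

best_params, best_error = grid_search(param_grid, train_idx, toy_cv_error, k=5)
```

The grid in the example contains 6 × 3 × 5 × 6 = 540 combinations, so a 5-fold search fits 2700 models before the final retraining of step 5.3.4.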
Step 6: After model prediction, the importance of the soil and pipeline characteristics and the predicted pitting depth are obtained.
In step 6, the importance of a soil or pipeline characteristic is measured by the total number of times the feature is used as a split attribute across all decision trees; a feature whose total exceeds a set threshold is regarded as an important feature.
The feature importance calculated in the example is shown in FIG. 3. The features are the encoded soil types (clay (ct_C), sandy clay loam (ct_SYCL), clay loam (ct_CL), silty clay loam (ct_SCL), silty clay (ct_SC), silty loam (ct_SL)), water content (wc), HCO₃⁻ content (bc), Cl⁻ content (cc), SO₄²⁻ content (sc), redox potential (rp), pH (pH), soil resistivity (re), soil-to-pipe potential (pp), the encoded pipeline coating types (no coating (class_NC), asphalt enamel coating (class_AEC), tape coating (class_WTC), coal tar coating (class_CTC), fusion-bonded epoxy coating (class_FBE)), and service life (t).
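The split-count importance described in step 6 can be illustrated on a toy ensemble. The nested-dictionary tree layout, the tree contents and the threshold value below are hypothetical and chosen only to make the counting visible; in the XGBoost library this kind of count corresponds to the "weight" importance type.

```python
from collections import Counter

def count_splits(node, counts):
    """Recursively count how often each feature is used as a split attribute."""
    if isinstance(node, dict):          # internal node; leaves are plain numbers
        counts[node["feature"]] += 1
        count_splits(node["left"], counts)
        count_splits(node["right"], counts)

def important_features(trees, threshold):
    """Return (split counts, features whose count exceeds the threshold)."""
    counts = Counter()
    for tree in trees:
        count_splits(tree, counts)
    selected = [f for f, c in counts.most_common() if c > threshold]
    return counts, selected

# Hypothetical three-tree ensemble over service life (t), Cl- content (cc)
# and soil-to-pipe potential (pp).
trees = [
    {"feature": "t",
     "left": {"feature": "cc", "left": 0.4, "right": 1.1},
     "right": {"feature": "pp", "left": 1.5, "right": 2.3}},
    {"feature": "cc", "left": 0.2,
     "right": {"feature": "t", "left": 0.9, "right": 1.8}},
    {"feature": "cc", "left": 0.1, "right": 0.7},
]
split_counts, selected = important_features(trees, threshold=1)
```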
In the example, the test-set predictions of the optimal model are shown in FIG. 4; the predicted values are close to the true values, which verifies the reliability of the model.
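The closeness of predicted and true values can be quantified with standard regression metrics. The text does not state which metric was used, so the root-mean-square error and the coefficient of determination below are assumptions, and the depth values are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical true and predicted maximum pitting depths (mm).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
test_rmse = rmse(y_true, y_pred)
test_r2 = r_squared(y_true, y_pred)
```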
Step 7: The accumulated local effect (ALE) value is calculated as follows: the distribution range of each feature is divided into K subintervals, and the ALE value of the feature in each subinterval is calculated with the following formula; a line graph is drawn from the calculated ALE values, and the line graph reflects how the pitting depth changes with the feature, including the threshold.
$$\mathrm{ALE}(j,x)=\sum_{k=1}^{k_j(x)}\frac{1}{n_j(k)}\sum_{i=1}^{n_j(k)}\left[f\left(z_{k,j},\,x_{\setminus j}^{(i)}\right)-f\left(z_{k-1,j},\,x_{\setminus j}^{(i)}\right)\right]$$
where ALE(j, x) is the uncentered ALE value of feature j; $x_{\setminus j}$ denotes the features other than feature j; $k_j(x)$ is the number of intervals of feature j over which the effect is accumulated; $n_j(k)$ is the number of samples in the k-th interval; i indexes the sample points in the k-th interval; $z_{k,j}$ and $z_{k-1,j}$ are the upper and lower boundary values of feature j in the k-th interval; $x_{\setminus j}^{(i)}$ denotes the values of the other features at the i-th sample point in the k-th interval; and f denotes the prediction of the trained model.
The value of K for the ALE calculation in the example was set to 20. The line graph of the accumulated local effect of chloride ion concentration on pitting depth is shown in FIG. 5: the pitting depth increases as the chloride ion concentration increases, and increases rapidly once the concentration exceeds the threshold (the abrupt-change point) of 30 ppm.
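The uncentered ALE calculation can be sketched as below. This is an illustration under stated assumptions: the intervals are taken to be of equal width (the text only says the range is divided into K subintervals), `toy_model` is a hypothetical stand-in for the trained model with a deliberate change of slope at 30, and the samples are invented. K is set to 6 rather than 20 to keep the output short.

```python
def ale_curve(predict, samples, j, n_intervals):
    """Uncentered ALE of feature j; returns (upper interval edges, ALE values)."""
    values = [s[j] for s in samples]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals      # equal-width intervals (assumption)
    edges = [lo + k * width for k in range(n_intervals)] + [hi]
    ale, running = [], 0.0
    for k in range(1, n_intervals + 1):
        z_lo, z_hi = edges[k - 1], edges[k]
        # Samples in (z_lo, z_hi]; the first interval also includes z_lo.
        members = [s for s in samples
                   if z_lo < s[j] <= z_hi or (k == 1 and s[j] == z_lo)]
        if members:
            diffs = []
            for s in members:
                upper, lower = list(s), list(s)
                upper[j], lower[j] = z_hi, z_lo   # move feature j to the edges
                diffs.append(predict(upper) - predict(lower))
            running += sum(diffs) / len(diffs)    # mean local effect, accumulated
        ale.append(running)
    return edges[1:], ale

def toy_model(sample):
    # Hypothetical predictor: pitting depth grows slowly with chloride
    # content and faster above 30, plus a pH term.
    chloride, ph = sample
    extra = 0.1 * (chloride - 30) if chloride > 30 else 0.0
    return 0.02 * chloride + extra + 0.5 * ph

samples = [[c, 6.5 + 0.1 * (c % 3)] for c in range(0, 61, 10)]
upper_edges, ale_values = ale_curve(toy_model, samples, j=0, n_intervals=6)
```

Plotting `ale_values` against `upper_edges` gives a line whose slope steepens after 30, which is the kind of abrupt-change point the line graph in step 7 is read for.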
The foregoing is only a preferred embodiment of the present invention and does not limit the invention in any way; any simple modification, equivalent variation or adaptation made to the above embodiment in accordance with the technical substance of the present invention still falls within the scope of the technical solution of the present invention.

Claims (8)

1. A method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its influencing factors, characterized by comprising the following steps:
Step 1: Select soil characteristics and pipeline characteristics related to pitting of the outer surface of the buried pipeline as input features;
Step 2: Obtain sample data of soil types and pipeline coating types;
Step 3: Preprocess the sample data obtained in step 2 to form a corrosion data set;
Step 4: Construct a regression prediction model of the pitting depth of the outer surface of the buried pipeline based on the extreme gradient boosting algorithm XGBoost;
Step 5: Randomly select 80% of the data in the corrosion data set as a training set for optimized training of the model, and use the remaining 20% as a test set for evaluating the prediction performance of the trained model;
Step 6: Input the sample data into the trained model to obtain the importance of the soil and pipeline characteristics and the predicted pitting depth;
Step 7: Calculate the accumulated local effect (ALE) values to analyze the action rules of the soil characteristics and pipeline characteristics with respect to the pitting depth.
2. The method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its influencing factors according to claim 1, wherein the soil characteristics in step 1 include soil type, water content, HCO₃⁻ content, Cl⁻ content, SO₄²⁻ content, redox potential, pH, soil resistivity and soil-to-pipeline potential; the pipeline characteristics include pipeline coating type and service life.
3. The method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its factors according to claim 1, wherein the preprocessing in step 3 comprises the following steps:
step 3.1: outlier processing: detecting outliers in the sample data of step 2 using the interquartile range (IQR) of a box plot, and replacing the detected outliers with the mode;
step 3.2: data conversion: converting the soil type and pipeline coating type variables in the sample data of step 2 into numerical data by one-hot encoding.
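The two preprocessing steps can be sketched as follows. This is an illustrative example only: the 1.5 × IQR whisker rule is the usual box-plot convention and is assumed here because the claim does not state the multiplier, and the pH readings and soil labels are invented sample values.

```python
import statistics


def replace_outliers_with_mode(values, whisker=1.5):
    """Replace values outside [Q1 - w*IQR, Q3 + w*IQR] with the mode."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # sample quartiles
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    mode = statistics.mode(values)                 # most frequent value
    return [v if low <= v <= high else mode for v in values]


def one_hot(labels):
    """Map each categorical label to a 0/1 vector, one column per category."""
    categories = sorted(set(labels))
    return categories, [[int(lab == c) for c in categories] for lab in labels]


# Invented pH readings; 13.9 lies far outside the box-plot whiskers.
ph_values = [6.1, 6.3, 6.3, 6.5, 6.8, 7.0, 13.9]
cleaned = replace_outliers_with_mode(ph_values)

# Invented soil-type labels.
categories, encoded = one_hot(["clay", "sandy", "clay"])
```

Here the outlying reading is replaced by 6.3, the most frequent value, and each soil type becomes its own 0/1 column.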
4. The method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its factors according to claim 1, wherein step 5 comprises the following steps:
step 5.1: inputting the training set into the model constructed in step 4;
step 5.2: performing feature optimization on the XGBoost-based prediction model for pitting of the outer surface of the buried pipeline;
step 5.3: performing hyperparameter optimization on the XGBoost-based prediction model for pitting of the outer surface of the buried pipeline.
5. The method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its factors according to claim 4, wherein the feature optimization in step 5.2 specifically comprises: calculating the Pearson correlation coefficient r between features using the following formula, wherein two features whose correlation coefficient exceeds 0.7 are regarded as highly linearly correlated, one of them is removed as a redundant feature, and the remaining features are input into the model for training to obtain the optimal model;

$$ r = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \, \sum_{i=1}^{N} (y_i - \bar{y})^2}} $$

wherein $x_i$ and $y_i$ represent the values of the i-th sample for features x and y, $\bar{x}$ and $\bar{y}$ represent the means of the two features, and N is the total number of samples.
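A minimal sketch of this redundancy filter is given below. Two choices are assumptions of the example rather than statements of the claim: the comparison uses the absolute value of r, so that strong negative correlations are also treated as redundant, and of two correlated features the one listed first is kept. The feature values are invented.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.7):
    """Keep features in order; drop any whose |r| with a kept feature exceeds threshold."""
    kept = []
    for name in features:
        if all(abs(pearson_r(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Invented values: water_content is almost a linear function of resistivity.
features = {
    "resistivity": [10.0, 20.0, 30.0, 40.0, 50.0],
    "water_content": [52.0, 41.0, 29.0, 22.0, 9.0],
    "pH": [6.0, 7.5, 6.2, 7.4, 6.1],
}
kept = drop_redundant(features)
```

With these values, `water_content` is removed because of its strong correlation with `resistivity`, while `pH` is retained.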
6. The method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its factors according to claim 4, wherein the hyperparameter optimization in step 5.3 uses grid search with cross-validation, and specifically comprises the following steps:
step 5.3.1: setting a hyperparameter grid by defining a list of candidate values for each hyperparameter;
step 5.3.2: setting the number of folds K for K-fold cross-validation, dividing the training set into K parts, using each part in turn as the validation set and the remaining K-1 parts as the training set, and taking the average evaluation result of the K models on their validation sets as the cross-validation error;
step 5.3.3: calculating the cross-validation results of the model under the different hyperparameter combinations, and outputting the combination with the smallest cross-validation error as the optimal hyperparameters;
step 5.3.4: retraining the model with the optimal hyperparameter combination, and validating it on the held-out test data to obtain the optimal model.
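Steps 5.3.1 to 5.3.3 can be illustrated with the sketch below. To keep the example self-contained it does not use XGBoost: a small inverse-distance-weighted nearest-neighbour regressor stands in for the model, and its two hyperparameters (`n_neighbors`, `power`), the toy data, and the mean-squared-error criterion are all assumptions of this example.

```python
import itertools


def knn_predict(train, x, n_neighbors, power):
    """Stand-in regressor (NOT XGBoost): distance-weighted k-NN on 1-D inputs."""
    nearest = sorted(train, key=lambda s: abs(s[0] - x))[:n_neighbors]
    weights = [1.0 / (abs(xi - x) + 1e-9) ** power for xi, _ in nearest]
    return sum(w * yi for w, (_, yi) in zip(weights, nearest)) / sum(weights)


def cross_val_error(data, k_folds, **params):
    """Step 5.3.2: mean squared error averaged over the K validation folds."""
    folds = [data[i::k_folds] for i in range(k_folds)]
    fold_errors = []
    for i, val in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        errs = [(knn_predict(train, x, **params) - y) ** 2 for x, y in val]
        fold_errors.append(sum(errs) / len(errs))
    return sum(fold_errors) / k_folds


def grid_search(data, grid, k_folds=5):
    """Step 5.3.3: return the combination with the smallest cross-validation error."""
    names = list(grid)
    best_params, best_err = None, float("inf")
    for combo in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, combo))
        err = cross_val_error(data, k_folds, **params)
        if err < best_err:
            best_params, best_err = params, err
    return best_params, best_err


# Toy training set (y = 2x) and a hypothetical grid (step 5.3.1).
training_set = [(float(x), 2.0 * x) for x in range(20)]
grid = {"n_neighbors": [1, 2, 4], "power": [0, 1]}
best_params, best_error = grid_search(training_set, grid)
```

For step 5.3.4, the model would be refitted on the whole training set with `best_params` and then scored once on the held-out test set.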
7. The method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its factors according to claim 1, wherein the importance of a soil characteristic or pipeline characteristic in step 6 is the total number of times the feature is used as a splitting attribute across all decision trees, and a feature whose total exceeds a set threshold is judged to be an important feature.
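The split-count importance described in this claim can be sketched as follows. The nested-dictionary tree representation, the feature names, and the threshold value are hypothetical and serve only to make the counting explicit; in the XGBoost Python package, a comparable count is understood to be available as the "weight" importance type.

```python
from collections import Counter


def count_splits(node, counts):
    """Recursively count how often each feature is used as a split attribute."""
    if "feature" in node:                 # internal node; leaves have no "feature" key
        counts[node["feature"]] += 1
        count_splits(node["left"], counts)
        count_splits(node["right"], counts)


def important_features(trees, threshold):
    """Sum split counts over all trees and keep features above the threshold."""
    counts = Counter()
    for tree in trees:
        count_splits(tree, counts)
    return counts, [name for name, c in counts.items() if c > threshold]


# Two hypothetical decision trees.
leaf = {"leaf": 0.0}
trees = [
    {"feature": "service_life",
     "left": {"feature": "pH", "left": leaf, "right": leaf},
     "right": leaf},
    {"feature": "service_life",
     "left": leaf,
     "right": {"feature": "service_life", "left": leaf, "right": leaf}},
]
counts, important = important_features(trees, threshold=2)
```

In this toy ensemble `service_life` is used for three splits and `pH` for one, so only `service_life` exceeds the threshold of 2.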
8. The method for predicting pitting corrosion of the outer surface of a buried pipeline and analyzing its factors according to claim 1, wherein calculating the accumulated local effect (ALE) value in step 7 specifically comprises: dividing the distribution range of each feature into K subintervals, and calculating the ALE value of the feature in each subinterval using the following formula; and plotting the calculated ALE values as a line graph, which reflects how the pitting depth changes with the feature;

$$ \mathrm{ALE}(j, x) = \sum_{k=1}^{k_j(x)} \frac{1}{n_j(k)} \sum_{i:\, x_j^{i} \in N_j(k)} \left[ f\!\left(z_{k,j},\, x_{\setminus j}^{i}\right) - f\!\left(z_{k-1,j},\, x_{\setminus j}^{i}\right) \right] $$

wherein ALE(j, x) represents the non-centered ALE value of feature j; $f$ denotes the prediction of the trained model; $x_{\setminus j}^{i}$ represents the values of the features other than feature j for the i-th sample; $k_j(x)$ is the number of intervals into which feature j is divided; $n_j(k)$ represents the number of samples in the k-th interval; i indexes the i-th sample point in the k-th interval; $z_{k,j}$ is the upper boundary value of feature j in the k-th interval; $z_{k-1,j}$ is the lower boundary value of feature j in the k-th interval; and $i: x_j^{i} \in N_j(k)$ denotes the sample points i whose value of feature j falls in the k-th interval.
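The accumulation described in this claim can be illustrated with the following sketch. It computes non-centered ALE values for one feature of a toy model; the equal-width intervals, the function name `uncentered_ale`, the linear `toy_model`, and the sample values are assumptions of this example (quantile-based intervals are also common in practice).

```python
def uncentered_ale(predict, samples, j, n_intervals):
    """Return interval upper bounds and accumulated local effects for feature j."""
    values = [row[j] for row in samples]
    lo, hi = min(values), max(values)
    # Equal-width interval edges z_0 .. z_K (an assumption of this sketch).
    edges = [lo + (hi - lo) * k / n_intervals for k in range(n_intervals + 1)]
    ale, running = [], 0.0
    for k in range(1, n_intervals + 1):
        z_low, z_high = edges[k - 1], edges[k]
        # Samples whose feature j lies in interval k; the first interval
        # also includes its lower edge.
        members = [r for r in samples
                   if z_low < r[j] <= z_high or (k == 1 and r[j] == z_low)]
        if members:
            diffs = []
            for r in members:
                upper = list(r)
                upper[j] = z_high          # move feature j to the upper boundary
                lower = list(r)
                lower[j] = z_low           # move feature j to the lower boundary
                diffs.append(predict(upper) - predict(lower))
            running += sum(diffs) / len(diffs)   # average local effect, accumulated
        ale.append(running)
    return edges[1:], ale


def toy_model(row):
    """Hypothetical stand-in for the trained model: depth = 3*x0 + x1."""
    return 3.0 * row[0] + row[1]


samples = [[0.0, 5.0], [1.0, 1.0], [2.0, 4.0], [3.0, 2.0], [4.0, 0.0]]
upper_bounds, ale_values = uncentered_ale(toy_model, samples, j=0, n_intervals=4)
```

Because the toy model is linear in feature 0 with slope 3, the accumulated effect rises by 3 per unit interval; plotting `ale_values` against `upper_bounds` gives the line graph referred to in the claim.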
CN202311346348.1A 2023-10-17 2023-10-17 Method for predicting pitting corrosion of outer surface of buried pipeline and analyzing factors Pending CN117313041A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311346348.1A CN117313041A (en) 2023-10-17 2023-10-17 Method for predicting pitting corrosion of outer surface of buried pipeline and analyzing factors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311346348.1A CN117313041A (en) 2023-10-17 2023-10-17 Method for predicting pitting corrosion of outer surface of buried pipeline and analyzing factors

Publications (1)

Publication Number Publication Date
CN117313041A true CN117313041A (en) 2023-12-29

Family

ID=89286452

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311346348.1A Pending CN117313041A (en) 2023-10-17 2023-10-17 Method for predicting pitting corrosion of outer surface of buried pipeline and analyzing factors

Country Status (1)

Country Link
CN (1) CN117313041A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284877A (en) * 2018-11-19 2019-01-29 福州大学 Based on AIGA-WLSSVM Buried Pipeline rate prediction method
US20210319265A1 (en) * 2020-11-02 2021-10-14 Zhengzhou University Method for segmentation of underground drainage pipeline defects based on full convolutional neural network
CN116227166A (en) * 2023-01-13 2023-06-06 中国地质大学(武汉) Method and device for calculating explosion vibration speed of corrosion metal pipelines in different operating years

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YUHUI SONG et al.: "Interpretable machine learning for maximum corrosion depth and influence factor analysis", MATERIALS DEGRADATION, 28 February 2023 (2023-02-28), pages 1 - 15 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117589663A (en) * 2024-01-18 2024-02-23 西南石油大学 Residual life prediction method for nonmetallic pipeline of oil-gas field
CN117589663B (en) * 2024-01-18 2024-03-19 西南石油大学 Residual life prediction method for nonmetallic pipeline of oil-gas field

Similar Documents

Publication Publication Date Title
CN111291097B (en) Drilling leaking layer position real-time prediction method based on decision tree data mining
KR102170765B1 (en) Method for creating a shale gas production forecasting model using deep learning
CN117313041A (en) Method for predicting pitting corrosion of outer surface of buried pipeline and analyzing factors
CN111259953B (en) Equipment defect time prediction method based on capacitive equipment defect data
CN116881671A (en) Atmospheric pollution tracing method and system based on neural network
Sambo et al. Application of Artificial Intelligence Methods for Predicting Water Saturation from New Seismic Attributes.
US11555943B2 (en) Method for identifying misallocated historical production data using machine learning to improve a predictive ability of a reservoir simulation
CN110795888A (en) Petroleum drilling risk prediction method
CN116644284A (en) Stratum classification characteristic factor determining method, system, electronic equipment and medium
CN116738192A (en) Digital twinning-based security data evaluation method and system
CN115455791B (en) Method for improving landslide displacement prediction accuracy based on numerical simulation technology
CN113468821B (en) Decision regression algorithm-based slope abortion sand threshold determination method
CN115545112A (en) Method for automatically identifying and processing large amount of sewage real-time automatic monitoring data
CN115906602A (en) Wind turbine generator tower barrel dumping monitoring and evaluating method based on deep learning Transformer self-coding
CN114066037A (en) Drainage basin pollution source tracing prediction method and device based on artificial intelligence
CN113887049A (en) Drilling speed prediction method and system for petroleum drilling based on machine learning
CN117708625B (en) Dam monitoring historical data filling method under spent data background
CN117540277B (en) Lost circulation early warning method based on WGAN-GP-TabNet algorithm
CN116976146B (en) Fracturing well yield prediction method and system coupled with physical driving and data driving
CN112070399B (en) Safety risk assessment method and system for large-scale engineering structure
RATNAYAKE Classification of static mechanical equipment using a fuzzy inference system: a case study from an offshore installation
Torebekov PREDICTING OIL RECOVERY FACTOR IN POLYMER INJECTION: A COMPARATIVE ANALYSIS OF MLP AND RBF NEURAL NETWORKS WITH OPTIMIZED PARAMETERS
Wang et al. Characteristic space-based algorithm for detecting abnormal monitoring data in underground engineering
CN118153944A (en) Storage tank risk prediction method, storage tank risk prediction system, computer equipment and storage medium
CN114239264A (en) Crude oil water content self-adaptive soft measurement method based on incremental Gaussian mixed regression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination