CN118013426B - Steel product inclusion causal effect optimization method, system, electronic equipment and medium - Google Patents

Info

Publication number
CN118013426B
CN118013426B CN202410338274.5A CN202410338274A CN118013426B CN 118013426 B CN118013426 B CN 118013426B CN 202410338274 A CN202410338274 A CN 202410338274A CN 118013426 B CN118013426 B CN 118013426B
Authority
CN
China
Prior art keywords
inclusion
data set
intervention
effect
variables
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410338274.5A
Other languages
Chinese (zh)
Other versions
CN118013426A (en)
Inventor
吕志民
武钰淳
张昊东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology Beijing USTB
Original Assignee
University of Science and Technology Beijing USTB
Filing date
Publication date
Application filed by University of Science and Technology Beijing USTB
Priority to CN202410338274.5A
Publication of CN118013426A
Application granted
Publication of CN118013426B
Legal status: Active

Abstract

The invention relates to a steel product inclusion causal effect optimization method, system, electronic device and medium. The method comprises: establishing a process parameter data set and an inclusion defect data set related to the causes of inclusion defects in steel products, and merging the two data sets; performing data fusion and sample screening on the merged steel product inclusion defect data set to obtain a new data set; performing attribute selection and outlier processing on the new data set according to its actual data information and prior information to obtain a data set to be processed; constructing an intervention variable for each process parameter in the data set to be processed, and calculating the causal effect of each intervention variable on inclusion defects; and formulating a process parameter optimization strategy based on the attributes of the causal effect estimates. The invention measures the true causal effect between process parameters and defect indicators in the manufacturing process, designs a process parameter optimization strategy, and finally verifies the accuracy of the causal effect estimation and of the optimization strategy.

Description

Steel product inclusion causal effect optimization method, system, electronic equipment and medium
Technical Field
The invention belongs to the technical field of inclusion defect control for steel products, and in particular relates to a steel product inclusion causal effect optimization method, system, electronic device and medium.
Background
Inclusion defects are defects within steel, usually caused by non-metallic inclusions: impurities that are mixed in during steel production and that degrade the quality and service performance of the steel. Inclusion defects not only impair the surface quality of the steel but also reduce its mechanical properties, such as toughness, ductility and service life. Under high load or impact, products often crack or even fracture completely, creating safety hazards. In addition, these inclusions are difficult to remove by conventional methods in subsequent processing. Analyzing the root causes of steel inclusion defects, intervening in the production process, improving surface quality and reducing the occurrence of inclusions is therefore of great significance for ensuring product quality, reducing production cost and meeting user demands.
According to production process knowledge, inclusion defects in strip steel can be attributed to molten steel inclusions, mold flux (covering slag), covering agents, corrosion products, iron oxide scale and the like that are not removed in time during converter steelmaking, secondary (external) refining and continuous casting. To reduce the inclusion defect rate, questions of causal inference must be answered. For example: what causal relationship exists between process parameters and inclusion defects? If a process parameter is intervened on, what happens to the inclusion defect of the sample concerned? For samples that actually received some intervention, how can their counterfactual outcomes be evaluated? Traditional inclusion defect control methods are usually based on mechanisms and prior knowledge; they are inefficient, lack uniform standards, make after-the-fact quality diagnosis, tracing and optimization difficult, and do not exploit the value contained in the data. There is therefore a need, in the field of inclusion defect control for steel products, for an automated technique that quantitatively characterizes the causal effect between process parameters and inclusion defects and optimizes the process parameters so as to control inclusion defects.
Disclosure of Invention
To overcome the problems in the prior art, the invention provides a steel product inclusion causal effect optimization method, system, electronic device and medium.
A steel product inclusion causal effect optimization method, the method comprising the following steps:
1) Establishing a process parameter data set and an inclusion defect data set related to the causes of inclusion defects in steel products, and merging the process parameter data set and the inclusion defect data set;
2) Performing data fusion and sample screening on the merged steel product inclusion defect data set to obtain a new data set;
3) Performing attribute selection and outlier processing on the new data set according to its actual data information and prior information to obtain a data set to be processed;
4) Constructing an intervention variable for each process parameter in the data set to be processed, and calculating the causal effect of the intervention variable on inclusion defects;
5) Formulating a process parameter optimization strategy based on the attributes of the causal effect estimates.
In the above aspect and any possible implementation thereof, an implementation is further provided in which step 1) specifically includes:
11) Establishing a process parameter data set related to the causes of inclusion defects in steel products;
12) Establishing the inclusion defect data set corresponding to the process parameter data set;
13) Merging the process parameter data set and the inclusion defect data set, the resulting data set being referred to as the original sample data.
In the above aspect and any possible implementation thereof, an implementation is further provided in which the data types of the process parameters in the data set to be processed include a discrete type and a continuous type, and step 4) specifically includes:
41) For discrete data types, determining intervention variables by one-hot encoding of the discrete values that occur in the samples, constructing as many intervention variables as there are discrete values;
42) For continuous data types, taking the median as the split point, setting values greater than the median to 1 and values less than or equal to the median to 0, so that each continuous process parameter generates one intervention variable.
In the above aspect and any possible implementation thereof, an implementation is further provided in which, for the intervention variable corresponding to a given process parameter, the remaining process parameters are regarded as covariates of that intervention variable and the outcome variable is the inclusion defect label, and step 4) further includes:
43) Predicting a propensity score using logistic regression;
44) Comparing the sample counts of the treatment group and the control group of the intervention variable, and matching the group with fewer samples against the group with more samples to obtain matched samples;
45) Determining the causal effect value from the matched samples;
46) Testing the causal effect value.
In the above aspect and any possible implementation thereof, an implementation is further provided in which the propensity score is predicted by logistic regression according to the following formula:
$$e(T_i) = P(T_i = 1 \mid X_{T_i})$$
where $e(T_i)$ is the propensity score, $T_i$ is an intervention variable of a process parameter with $i$ a positive integer, and $P$ denotes probability. The remaining variables in the process parameter set $X$ are regarded as the covariates $X_{T_i}$ of the intervention variable $T_i$; the process parameter corresponding to $T_i$ is denoted $x_{T_i}$, i.e. $X_{T_i} = X \setminus \{x_{T_i}\}$.
In the above aspect and any possible implementation thereof, an implementation is further provided in which step 5) specifically includes:
51) Placing in an optimization candidate set the process parameters whose intervention variables show a significant difference in the outcome variable under the test, assigning those with a causal effect value greater than 0 to the negative influence set of the outcome variable and those with a causal effect value less than or equal to 0 to the positive influence set of the outcome variable;
52) For discrete variables in the optimization candidate set, taking their actual values as the optimization target;
53) For continuous variables in the optimization candidate set, first determining their intervention points, then decreasing the values of the process parameters in the negative influence set and increasing the values of the process parameters in the positive influence set to form a new process parameter data set;
54) Inputting the new process parameter data set into a classification model to predict the proportion of inclusion defects, and comparing it with the proportion of inclusion defects in the original samples to verify the accuracy of the causal effect estimation and the intervention optimization strategy.
In the above aspect and any possible implementation thereof, an implementation is further provided in which step 54) includes:
541) Splitting the original samples into a training set and a test set in an 8:2 ratio, training a classification model, and cross-validating it using classification accuracy as the evaluation metric;
542) Training the classification model on all original samples and checking its classification accuracy;
543) Inputting the new process parameter data set into the classification model trained in 542), predicting the proportion of inclusion defects after process parameter optimization, and comparing it with the proportion of inclusion defects in the original samples.
The invention also provides a steel product inclusion causal effect optimization system that implements the method and comprises the following modules:
an establishing and merging module, configured to establish a process parameter data set and an inclusion defect data set related to the causes of inclusion defects in steel products, and to merge the process parameter data set and the inclusion defect data set;
a fusion and screening module, configured to perform data fusion and sample screening on the merged steel product inclusion defect data set to obtain a new data set;
a processing module, configured to perform attribute selection and outlier processing on the new data set according to its actual data information and prior information to obtain a data set to be processed;
a calculation module, configured to construct an intervention variable for each process parameter in the data set to be processed and to calculate the causal effect of the intervention variable on inclusion defects;
and an optimization and verification module, configured to formulate a process parameter optimization strategy based on the attributes of the causal effect estimates and to verify that strategy.
The invention also provides an electronic device, which comprises:
a memory storing executable instructions;
and a processor executing the executable instructions in the memory to implement the method.
The invention also provides a computer storage medium having stored thereon a computer program for execution by a processor to perform the method.
Beneficial effects of the invention
The method of the invention comprises: establishing a process parameter data set and an inclusion defect data set related to the causes of inclusion defects in steel products, and merging the two data sets; performing data fusion and sample screening on the merged steel product inclusion defect data set to obtain a new data set; performing attribute selection and outlier processing on the new data set according to its actual data information and prior information to obtain a data set to be processed; constructing an intervention variable for each process parameter in the data set to be processed, and calculating the causal effect of the intervention variable on inclusion defects; and formulating a process parameter optimization strategy based on the attributes of the causal effect estimates. The invention measures the true causal effect between process parameters and defect indicators in the manufacturing process, designs a process parameter optimization strategy, and finally verifies the accuracy of the causal effect estimation and of the optimization strategy. Process parameters related to the causes of inclusion defects in steel products are extracted from industrial big data, which improves the accuracy and objectivity of identifying the factors that influence inclusion defects and the efficiency of process parameter optimization.
Specifically, the beneficial effects are as follows:
(1) A high-accuracy AutoGluon-Tabular (AGT) model is used to cross-validate the prediction accuracy of inclusion defects on the original data, providing strong support for steel production modeling and for verification of the causal inference;
(2) Based on actual steel production scenarios and taking the problem of balanced sample matching into account, a design strategy for process parameter intervention variables and their covariates is creatively proposed;
(3) An optimization window is designed in combination with the high-accuracy prediction model AGT, realizing quantitative optimization of process parameters and ensuring that the optimization measures are reasonable;
(4) The causal inference method based on propensity score matching (PSM) is creatively applied to quality modeling with industrial big data. A mathematical model is built for inclusion defects in steel products; provided the corresponding mathematical assumptions are met, causal effects are estimated with PSM, and a process parameter optimization strategy is formulated in combination with AGT. This significantly reduces the inclusion defect rate, helps technicians determine the direction and magnitude of process parameter adjustments, and provides guidance for inclusion defect management in steel production.
Drawings
FIG. 1 is a flow chart of a PSM-based steel inclusion causal effect analysis and optimization method provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of the distribution of sample propensity scores before and after matching according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the intervention effect value of each process parameter on inclusions before and after matching according to an embodiment of the present invention;
FIG. 4 is a comparative schematic diagram of inclusion defects before and after optimization of process parameters according to an embodiment of the present invention.
Detailed Description
For a better understanding of the invention, the disclosure includes but is not limited to the following detailed description; similar techniques and methods should be regarded as falling within the scope of protection. To make the technical problems to be solved, the technical solutions and the advantages clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
It should be understood that the described embodiments of the invention are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIG. 1, the steel product inclusion causal effect optimization method comprises the following steps:
1) Establishing a process parameter data set and an inclusion defect data set related to the causes of inclusion defects in steel products, and merging the process parameter data set and the inclusion defect data set;
2) Performing data fusion and sample screening on the merged steel product inclusion defect data set to obtain a new data set;
3) Performing attribute selection and outlier processing on the new data set according to its actual data information and prior information to obtain a data set to be processed;
4) Constructing an intervention variable for each process parameter in the data set to be processed, and calculating the causal effect of the intervention variable on inclusion defects;
5) Formulating a process parameter optimization strategy based on the magnitude, direction and significance of the causal effect estimate corresponding to each process parameter.
Preferably, step 1) specifically includes:
11) Establishing a process parameter data set related to the causes of inclusion defects in steel products;
12) Establishing the inclusion defect data set corresponding to the process parameter data set;
13) Merging the process parameter data set and the inclusion defect data set, the resulting data set being the original sample data.
Preferably, the data types of the process parameters in the data set to be processed include a discrete type and a continuous type, and step 4) specifically includes:
41) For discrete data types, determining intervention variables by one-hot encoding of each discrete value, constructing as many intervention variables as there are discrete values;
42) For continuous data types, taking the median as the split point, setting values greater than the median to 1 and values less than or equal to the median to 0, so that each continuous process parameter generates one intervention variable.
Preferably, for the intervention variable corresponding to a given process parameter, the remaining process parameters are regarded as covariates of that intervention variable and the outcome variable is the inclusion defect label, and step 4) further includes:
43) Predicting a propensity score using logistic regression;
44) Comparing the sample counts of the treatment group and the control group of the intervention variable, and matching the group with fewer samples against the group with more samples to obtain matched samples;
45) Determining the causal effect value from the matched samples;
46) Testing the causal effect value.
Preferably, the propensity score is predicted by logistic regression according to the following formula:
$$e(T_i) = P(T_i = 1 \mid X_{T_i})$$
where $e(T_i)$ is the propensity score, $T_i$ is an intervention variable of a process parameter with $i$ a positive integer, and $P$ denotes probability. The remaining variables in the process parameter set $X$ are regarded as the covariates $X_{T_i}$ of the intervention variable $T_i$; the process parameter corresponding to $T_i$ is denoted $x_{T_i}$, i.e. $X_{T_i} = X \setminus \{x_{T_i}\}$.
Preferably, step 5) specifically includes:
51) Placing in an optimization candidate set the process parameters whose intervention variables show a significant difference in the outcome variable under the test, assigning those with a causal effect value greater than 0 to the negative influence set of the outcome variable and those with a causal effect value less than or equal to 0 to the positive influence set of the outcome variable;
52) For discrete variables in the optimization candidate set, taking their actual values as the optimization target;
53) For continuous variables in the optimization candidate set, first determining their intervention points, then decreasing the values of the process parameters in the negative influence set and increasing the values of the process parameters in the positive influence set to form a new process parameter data set;
54) Inputting the new process parameter data set into a classification model to predict the proportion of inclusion defects, and comparing it with the proportion of inclusion defects in the original samples to verify the accuracy of the causal effect estimation and the intervention optimization strategy.
Preferably, step 54) includes:
541) Splitting the original samples into a training set and a test set in an 8:2 ratio, training a classification model, and cross-validating it using classification accuracy as the evaluation metric;
542) Training the classification model on all original samples and checking its classification accuracy;
543) Inputting the new process parameter data set into the classification model trained in 542), predicting the proportion of inclusion defects after process parameter optimization, and comparing it with the proportion of inclusion defects in the original samples.
Specifically, the implementation process of the invention is as follows:
S1. Establish a process parameter data set $X$ related to the causes of inclusion defects in steel products, with $m$ parameters and $n$ samples; the $i$-th sample, comprising $m$ process parameters, is $x_i = (x_{i1}, x_{i2}, \dots, x_{im})$.
S2. For the process parameter set $X$, create the corresponding inclusion defect data set $Y = \{y_1, y_2, \dots, y_n\}$. $Y$ is a binary data set of 0s and 1s, where 0 indicates no inclusion, 1 indicates an inclusion, and $n$ is the number of samples.
S3. Merge the process parameter set and the defect set into $D = \{X, Y\}$; the $i$-th sample of the inclusion defect data set, comprising $m$ parameters, is $d_i = (x_{i1}, x_{i2}, \dots, x_{im}, y_i)$. The merged data set is used as the original sample data.
S4. Perform data fusion and sample screening on the steel product inclusion defect data set $D$ from step S3: merge repeated samples and extract the data of a given steel grade within a given time period. Let the number of samples in the data set be $n_1$; the data set obtained in this step is denoted $D_1$.
S5. Perform attribute selection and outlier processing according to the actual data information and prior information of $D_1$. The actual data information includes the data distribution and the like; the prior information includes the reasonable value ranges of the data and the like. Useless attributes are removed, outliers in $D_1$ are filled by neighborhood search or similar means, and finally any samples that still contain outliers are deleted. Let the number of process parameters after removing useless attributes be $p$ and the number of remaining samples be $n_2$; the data set obtained in this step is $D_2$, and the process parameter set is $X = \{x_1, x_2, \dots, x_p\}$.
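As a rough illustration of S4 and S5, the sketch below merges repeated samples and fills out-of-range values from neighboring samples. It is only one possible reading of "neighborhood search": the function names, the dictionary-per-sample layout, the `valid_range` prior ranges and the `window` size are all assumptions made for illustration and are not specified in the source.

```python
def merge_duplicates(rows):
    """Drop repeated samples while keeping first-seen order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    return unique


def fill_outliers(rows, valid_range, window=1):
    """Replace out-of-range values with the mean of in-range neighbours.

    valid_range maps a parameter name to an assumed (low, high) prior range.
    A sample whose outlier has no in-range neighbour is deleted.
    """
    def ok(value, name):
        low, high = valid_range[name]
        return low <= value <= high

    cleaned = []
    for i, row in enumerate(rows):
        new_row, keep = dict(row), True
        for name in valid_range:
            if ok(row[name], name):
                continue
            lo, hi = max(0, i - window), min(len(rows), i + window + 1)
            near = [rows[j][name] for j in range(lo, hi)
                    if j != i and ok(rows[j][name], name)]
            if near:
                new_row[name] = sum(near) / len(near)
            else:
                keep = False
        if keep:
            cleaned.append(new_row)
    return cleaned
```

Attribute selection (removing useless columns) is omitted here because the source does not state the selection criterion.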
S6. For each process parameter $x_j$ in the process parameter set $X$, construct intervention variables according to its data type and value distribution. Process parameters are either discrete or continuous, and intervention variables are constructed differently for the two types. The specific steps are as follows:
S61. A discrete feature takes a small, fixed set of values, usually integers. Intervention variables are designed in turn for each value of each discrete feature: a discrete feature with k distinct values is represented by k intervention variables. For example, if a discrete feature a takes the three values 3, 4 and 5, three intervention variables a-3, a-4 and a-5 are constructed. Under each intervention variable the original sample value is replaced by 1 or 0, where 1 means the sample takes the corresponding value and 0 means it does not. One discrete feature therefore usually corresponds to several intervention variables.
S62. For a continuous feature, considering the need for balanced matched samples, the median of the feature distribution is taken as the split point: values greater than the median are set to 1 and values less than or equal to the median are set to 0, distinguishing a "high value" state from a "low value" state. Each continuous feature generates only one intervention variable.
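The two construction rules in S61 and S62 can be sketched as follows. The function names are hypothetical; the naming pattern `a-3`, `a-4`, `a-5` follows the example in the text.

```python
from statistics import median


def one_hot_interventions(name, values):
    """One binary intervention variable per distinct discrete value (S61)."""
    return {
        f"{name}-{level}": [1 if v == level else 0 for v in values]
        for level in sorted(set(values))
    }


def median_intervention(values):
    """Single binary intervention variable for a continuous parameter (S62)."""
    cut = median(values)
    # Values strictly above the median are "high" (1); the rest are "low" (0).
    return [1 if v > cut else 0 for v in values]
```

Splitting at the median keeps the two groups of a continuous parameter close to equal in size, which is the balance consideration the text mentions for the later matching step.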
S7. Use propensity score matching (PSM) to calculate the causal effect of each intervention variable on inclusion defects. The specific steps are as follows:
S71. Combining the discrete and continuous features yields $K$ intervention variables, denoted $T = \{T_1, T_2, \dots, T_K\}$. The process parameter corresponding to intervention variable $T_i$ is denoted $x_{T_i}$, and the remaining variables in $X$ are regarded as the covariates of $T_i$, i.e. $X_{T_i} = X \setminus \{x_{T_i}\}$. The outcome variable is the inclusion defect label $Y$.
S72. Use logistic regression to predict the propensity score $e(T_i)$ of each intervention variable $T_i$, calculated as:
$$e(T_i) = P(T_i = 1 \mid X_{T_i})$$
where $e(T_i)$ is the propensity score of intervention variable $T_i$ and $P$ is the probability, computed by logistic regression, that $T_i$ takes the value 1 given the observed covariates $X_{T_i}$.
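A minimal, dependency-free sketch of the propensity score step is given below. It fits a logistic regression by plain batch gradient descent, which is an illustrative choice and not necessarily the solver used by the authors; the function names, learning rate and epoch count are assumptions, and the covariates are assumed to be scaled to a comparable range beforehand.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(covariates, treatment, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n, d = len(covariates), len(covariates[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, t in zip(covariates, treatment):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity_scores(covariates, weights):
    """e(x) = P(T = 1 | x) for each sample under the fitted model."""
    return [sigmoid(weights[0] + sum(wj * xj for wj, xj in zip(weights[1:], x)))
            for x in covariates]
```

In practice an established library implementation of logistic regression would normally replace `fit_logistic`; the sketch only shows what the score represents.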
S73. Compare the sample counts of the treatment group (value 1) and the control group (value 0) of intervention variable $T_i$, and match the group with fewer samples against the group with more samples. The group with fewer samples is called the base and the larger group is the matching pool. An intervention state of 0 indicates no intervention and 1 indicates intervention; every sample has state 0 or 1, and the base and the pool have opposite intervention states. Nearest-neighbor matching is used: based on the propensity scores of the samples in the pool and the base, a 1:1 matching strategy with replacement finds, for each base sample $s$, the pool sample $s'$ whose propensity score is closest, which can be understood as the homogeneous counterpart of $s$.
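The matching rule in S73 can be sketched as below. The function name and the index-pair return format are assumptions; both groups are assumed to be non-empty, and ties in score distance are broken by the first candidate found.

```python
def match_nearest(scores, treatment):
    """1:1 nearest-neighbour matching with replacement on propensity score.

    The smaller group is used as the base; each base sample is paired with
    the sample of the opposite intervention state whose score is closest.
    Returns a list of (base_index, matched_index) pairs.
    """
    treated = [i for i, t in enumerate(treatment) if t == 1]
    control = [i for i, t in enumerate(treatment) if t == 0]
    if len(treated) <= len(control):
        base, pool = treated, control
    else:
        base, pool = control, treated
    return [(i, min(pool, key=lambda j: abs(scores[j] - scores[i])))
            for i in base]
```

Because matching is with replacement, the same pool sample may be paired with several base samples, as in `match_nearest([0.9, 0.2, 0.8, 0.3, 0.5], [1, 0, 1, 0, 0])`, where both treated samples are matched to index 4.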
S74. Define the causal effect of intervention variable $T_i$ on the outcome variable $Y$ as the average causal effect $\mathrm{ATE}(T_i)$ (average treatment effect) over the base samples and their matched samples, calculated as:
$$\mathrm{ATE}(T_i) = E[Y \mid T_i = 1] - E[Y \mid T_i = 0]$$
where $\mathrm{ATE}(T_i)$, the causal effect of $T_i$ on $Y$, is the average causal effect value computed as the difference between the outcome variable $Y$ of the base samples and that of the matched samples. $E[Y \mid T_i = 1]$ is the mathematical expectation of $Y$ when the intervention state is 1, and $E[Y \mid T_i = 0]$ is the mathematical expectation of $Y$ when the intervention state is 0. The sign of $\mathrm{ATE}(T_i)$ gives the direction of the effect and its absolute value gives the magnitude.
S75. Perform a two-tailed t-test on the outcome variables of the base samples and the matched samples to check whether the difference between the two group means is significant. A two-tailed test checks whether the sample statistic differs from the hypothesized parameter by too much in either direction, and splits the risk between the left and right tails. For example, at a 5% significance level the rejection region is 2.5% in each tail, corresponding to a 95% confidence interval. A one-tailed test checks a single direction only, that is, whether the statistic is too high or too low. For example, at a 5% significance level the whole 5% rejection region lies in one tail, which corresponds to one bound of a 90% two-sided confidence interval. Because the aim here is to determine whether the means of the two groups differ, a two-tailed t-test is used. The test result characterizes the significance of the difference between the two groups and also indicates whether the causal effect estimate reflects a real intervention effect.
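The effect estimate of S74 and the significance check of S75 can be sketched together as follows. This is an approximation under stated assumptions: the test statistic is a Welch-type two-sample statistic, and the two-tailed p-value uses a standard normal approximation to the t distribution (reasonable only for large samples), because the Python standard library has no t distribution. The function name and the `(index, index)` pair format are hypothetical, and at least two matched pairs are assumed.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def matched_effect(pairs, treatment, outcome):
    """Mean outcome difference on matched pairs and a two-tailed p-value.

    pairs holds (index, index) tuples of matched samples with opposite
    intervention states; treatment and outcome are per-sample 0/1 lists.
    """
    y1, y0 = [], []
    for a, b in pairs:
        treated, control = (a, b) if treatment[a] == 1 else (b, a)
        y1.append(outcome[treated])
        y0.append(outcome[control])
    ate = mean(y1) - mean(y0)  # E[Y | T = 1] - E[Y | T = 0]
    se = sqrt(variance(y1) / len(y1) + variance(y0) / len(y0))
    if se == 0:
        return ate, None, None  # no variation, test statistic undefined
    t_stat = ate / se
    # Two-tailed: probability mass beyond |t| in both tails.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return ate, t_stat, p_value
```

A positive first return value means the intervention state 1 is associated with more inclusion defects; the p-value is then compared with the chosen significance level, for example 5%.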
S8. Based on the magnitude, direction and significance level of the causal effect estimates, formulate a process parameter optimization strategy and design a verification model. The specific steps are as follows:
S81. Process parameters whose intervention variables show a significant difference in the outcome variable after matching (that is, a significant two-tailed t-test result) are placed in an optimization candidate set, which is divided into an inclusion positive influence set and an inclusion negative influence set. A causal effect value $\mathrm{ATE}(T_i)$ is calculated for each intervention variable: those with a value greater than 0 go into the inclusion negative influence set, those with a value less than 0 go into the inclusion positive influence set, and a value equal to 0 indicates no influence.
S82. For discrete variables in the optimization candidate set, their actual values are taken as the optimization target.
S83. For continuous variables in the optimization candidate set, first determine the intervention point, then randomly decrease the values of the process parameters in the negative influence set and randomly increase those in the positive influence set. Specifically, for a process parameter whose intervention direction is "increase", find the samples above the intervention point, randomly select half of them, and multiply their actual values by 1.5 as the values of the intervention state, that is, increase the original value by half. For a process parameter whose intervention direction is "decrease", find the samples below the intervention point, randomly select half of them, and multiply their actual values by 0.5 as the values of the intervention state, that is, reduce the original value by half. The result is a new process parameter data set.
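The candidate split of S81 and the random rescaling of S83 can be sketched as follows. The function names, the `effects` dictionary layout and the default significance level `alpha=0.05` are assumptions for illustration; the factors 1.5 and 0.5 and the "random half" rule come from the text.

```python
import random


def split_candidates(effects, alpha=0.05):
    """Split significant interventions into influence sets (S81).

    effects maps a parameter name to (causal_effect, p_value).
    Returns (negative_influence_set, positive_influence_set).
    """
    negative = [k for k, (ate, p) in effects.items() if p < alpha and ate > 0]
    positive = [k for k, (ate, p) in effects.items() if p < alpha and ate < 0]
    return negative, positive


def adjust_parameter(values, cut, direction, rng):
    """Rescale a random half of the eligible samples (S83).

    "increase": samples above the intervention point are multiplied by 1.5.
    "decrease": samples below the intervention point are multiplied by 0.5.
    """
    if direction == "increase":
        eligible = [i for i, v in enumerate(values) if v > cut]
        factor = 1.5
    else:
        eligible = [i for i, v in enumerate(values) if v < cut]
        factor = 0.5
    chosen = set(rng.sample(eligible, len(eligible) // 2))
    return [v * factor if i in chosen else v for i, v in enumerate(values)]
```

Passing an explicit `random.Random(seed)` as `rng` keeps the random selection reproducible, which makes the before and after comparison in the verification step repeatable.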
S84, optimizing the process parameter data set of the previous stepInputting a high-precision classification model AutoGluon-Tabular (AGT), predicting the proportion of inclusion defects, and comparing with the proportion of inclusion defects of the original sample data to verify the accuracy of causal effect estimation and intervention strategies. The method comprises the following specific steps:
S841, dividing the original sample data into a training set and a testing set according to the proportion of 8:2, training Autogluon a model, and cross-verifying classification accuracy by taking classification accuracy as an evaluation index.
S842, inputting all original sample data into a training Autogluon classification model, and checking classification accuracy.
S843, new technological parameter data setAnd inputting Autogluon models trained by all original sample data, predicting the proportion of the inclusion defects after the process parameter optimization, and comparing with the proportion of the inclusion defects of the original sample data.
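The simulated intervention of step S83 can be sketched as follows. The function name, the fixed random seed and the sample values are hypothetical, and using the median as the intervention point is an assumption carried over from the median split used for continuous features; the text itself does not say how the intervention point is chosen.

```python
import random
from statistics import median


def intervene(values, direction, seed=0):
    """Return a copy of `values` with half of the eligible samples rescaled.

    direction="increase": eligible samples lie above the intervention point
    and are multiplied by 1.5; direction="decrease": eligible samples lie
    below it and are multiplied by 0.5. The median as intervention point is
    an assumption made for this sketch.
    """
    point = median(values)
    if direction == "increase":
        eligible = [i for i, v in enumerate(values) if v > point]
        factor = 1.5
    elif direction == "decrease":
        eligible = [i for i, v in enumerate(values) if v < point]
        factor = 0.5
    else:
        raise ValueError("direction must be 'increase' or 'decrease'")
    rng = random.Random(seed)
    chosen = rng.sample(eligible, len(eligible) // 2)  # random half
    out = list(values)
    for i in chosen:
        out[i] = values[i] * factor
    return out


original = [float(v) for v in range(1, 21)]  # hypothetical parameter values
raised = intervene(original, "increase")
lowered = intervene(original, "decrease")
```

Only the selected half of the eligible samples changes; every other sample keeps its original value, so the new data set differs from the original only where the intervention is applied.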
The specific embodiments of the invention are as follows:
S1. Establish a process parameter data set X related to the cause and effect of inclusion defects in steel products, with m parameters and n samples; each sample comprises m process parameters.
In this example, the process parameters are derived from four production stages: converter steelmaking, external refining, continuous casting and composition.
S2. For the process parameter set X, create the corresponding inclusion defect data set Y. Y is a binary data set of 0s and 1s, where 0 indicates no inclusion and 1 indicates an inclusion, and n is the number of samples.
S3. Combine the process parameter set and the defect set to obtain the steel product inclusion defect data set, in which each sample comprises its process parameters together with its inclusion defect label.
S4. Preprocess the steel product inclusion defect data set obtained in step S3. The specific steps are as follows:
S41. Merge the samples collected in each time period, where K is the number of time periods. This embodiment includes samples collected in 6 time periods, for a total of 51402 samples.
S42. Samples with the same smelting number share the same process parameter set, so the merged sample data are combined by smelting number.
S43. Extract the data set of one steel grade at one steel plant (for example, the large plant or the small plant).
In this embodiment, after the samples are combined by smelting number, the selected steel grade has 3347 samples, of which 2244 come from the large plant.
S5. Perform attribute selection and outlier processing on the data set according to its actual data information and prior information. The specific steps are as follows:
S51. Delete attributes whose values are abnormal or never change, and keep only one copy of any duplicated attribute, giving a data set with the useless attributes removed.
In this embodiment, the number of original attributes m is 95, and 50 process parameters remain after the useless attributes are removed.
S52. Delete the samples in which certain discrete attributes are empty, giving a data set of the remaining samples.
In this embodiment, 2214 samples remain after the samples with empty discrete attributes are deleted.
S53. Replace null or abnormal values of composition attributes with non-zero random numbers; for continuous variables such as temperature and carbon and oxygen content, replace null or abnormal values with a value drawn at random from the normal values of similar attributes found by neighborhood search; then delete any samples that still contain null values, forming a new data set and its process parameter set.
In this embodiment, 2191 samples remain after the random or neighborhood replacement of outliers.
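The attribute and sample screening of steps S51 and S52 can be sketched on a small table held as a dictionary of columns. The column names and values below are hypothetical, and the sketch covers only constant attributes, duplicated attributes and samples with missing discrete values; the random and neighborhood replacement of S53 is not shown.

```python
def drop_useless_attributes(table):
    """Remove constant columns and keep the first of any duplicated columns.

    `table` maps attribute name -> list of values (one per sample).
    """
    kept = {}
    seen = set()
    for name, column in table.items():
        if len(set(column)) <= 1:   # value never changes: carries no information
            continue
        key = tuple(column)
        if key in seen:             # identical to an earlier column
            continue
        seen.add(key)
        kept[name] = column
    return kept


def drop_samples_with_missing(table, required):
    """Delete samples whose value is None in any attribute listed in `required`."""
    n = len(next(iter(table.values())))
    keep = [i for i in range(n) if all(table[a][i] is not None for a in required)]
    return {name: [col[i] for i in keep] for name, col in table.items()}


# Hypothetical raw table: one duplicated column, one constant column,
# and one sample with a missing discrete attribute.
raw = {
    "tapping_temp": [1650, 1662, 1648, 1655],
    "tapping_temp_copy": [1650, 1662, 1648, 1655],
    "line_id": [3, 3, 3, 3],
    "slag_dam_manufacturer": [1.0, None, 2.0, 1.0],
}
cleaned = drop_useless_attributes(raw)
cleaned = drop_samples_with_missing(cleaned, ["slag_dam_manufacturer"])
```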
S6. For each process parameter in the process parameter set, construct intervention variables according to its data type and numerical distribution. Process parameters are either discrete or continuous, and the intervention variables are constructed differently for each type. The specific steps are as follows:
S61. For a discrete feature, the number of intervention variables equals the number of distinct values the feature takes; one intervention variable is constructed per value, where 1 indicates that the corresponding value is used and 0 indicates that it is not.
In this step, there are 4 discrete features taking 14 distinct values in total, so 14 intervention variables are generated.
S62. For a continuous feature, to keep the matched samples balanced, the median is used as the split point: values greater than the median are coded 1 and values less than or equal to the median are coded 0, distinguishing a 'high value' state from a 'low value' state. One discrete feature corresponds to several intervention variables, but one continuous feature generates only one intervention variable.
In this step, there are 39 continuous features, so 39 intervention variables are generated.
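The two construction rules of S61 and S62 can be sketched as follows. The helper names and the sample values are hypothetical; the naming pattern `feature_value` for the discrete case mirrors labels such as "slag dam manufacturer_1.0" used later in the embodiment.

```python
from statistics import median


def one_hot(name, values):
    """One binary intervention variable per distinct value of a discrete feature."""
    return {
        f"{name}_{level}": [1 if v == level else 0 for v in values]
        for level in sorted(set(values))
    }


def median_split(name, values):
    """Single binary intervention variable: 1 above the median, 0 otherwise."""
    cut = median(values)
    return {name: [1 if v > cut else 0 for v in values]}


# Hypothetical samples: one discrete and one continuous process parameter.
discrete = one_hot("slag_dam_manufacturer", [1.0, 2.0, 1.0, 1.0])
continuous = median_split("MN", [0.8, 1.2, 1.0, 1.4])
```

A discrete feature with k distinct values yields k intervention variables, while each continuous feature yields exactly one, which is how 4 discrete features with 14 values and 39 continuous features give 53 intervention variables in total.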
S7. Calculate the causal effect of each process parameter on inclusion defects using propensity score matching. For one intervention variable corresponding to a process parameter, the remaining variables in X are treated as covariates of that intervention variable, and the result variable is the inclusion defect label. The specific steps are as follows:
S71. Use logistic regression to predict, for each intervention variable, the propensity score of every sample, that is, the probability that the sample is in intervention state 1 given its covariates.
S72. Compare the sample counts of the experimental group and the control group of the intervention variable, and match the group with more samples to the group with fewer samples. The two groups have opposite intervention states, and the smaller group may be in state 0 or state 1. Using nearest-neighbor, 1:1 matching with replacement, a counterfactual sample is found in the larger group for each sample in the smaller group; the matched samples are pooled into a new homogeneous sample set, and the balance of the samples before and after matching is checked.
In this step, the balance of the samples before and after matching is shown in the left and right parts of FIG. 2.
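The matching strategy of S72 (nearest neighbor, 1:1, with replacement) can be sketched as below. The propensity scores are taken as given; in the method they would come from the logistic regression of S71. The function names, the scores and the simple balance measure are hypothetical choices made for this sketch.

```python
def match_with_replacement(scores, treatment):
    """1:1 nearest-neighbour matching on propensity scores, with replacement.

    scores[i] is the propensity score of sample i and treatment[i] is its
    intervention state (0 or 1). Each sample in the smaller group is paired
    with the sample of the larger group whose score is closest.
    Returns a list of (index_in_smaller_group, index_of_match) pairs.
    """
    ones = [i for i, t in enumerate(treatment) if t == 1]
    zeros = [i for i, t in enumerate(treatment) if t == 0]
    small, large = (ones, zeros) if len(ones) <= len(zeros) else (zeros, ones)
    pairs = []
    for i in small:
        j = min(large, key=lambda k: abs(scores[k] - scores[i]))
        pairs.append((i, j))   # j may be reused by another sample
    return pairs


def mean_score_gap(scores, pairs):
    """Average absolute score difference within pairs: a crude balance check."""
    return sum(abs(scores[i] - scores[j]) for i, j in pairs) / len(pairs)


# Hypothetical propensity scores and intervention states for six samples.
scores = [0.81, 0.35, 0.40, 0.78, 0.52, 0.30]
treatment = [1, 0, 0, 1, 0, 0]
pairs = match_with_replacement(scores, treatment)
gap = mean_score_gap(scores, pairs)
```

In this toy input both samples of the smaller group are matched to the same sample of the larger group, which is what matching with replacement permits.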
S73. After matching, both outcomes of each sample, under intervention states 0 and 1, are available. The average causal effect of the intervention variable on the result variable is defined as the mean difference between these two outcomes over the matched samples.
In this example, the causal effect estimates of all process parameters on inclusion defects are shown in FIG. 3.
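The propensity score of S71 and the average causal effect of S73 can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the original formulas: $T_j$ is the $j$-th intervention variable, $X_{-j}$ the remaining process parameters used as covariates, $S$ the set of matched samples, and $Y_i(1)$, $Y_i(0)$ the outcomes of sample $i$ under the two intervention states, one observed and the other taken from its matched counterfactual sample.

```latex
e_j(x) = P\left(T_j = 1 \mid X_{-j} = x\right)
\qquad
\widehat{\mathrm{ACE}}_j = \frac{1}{|S|} \sum_{i \in S} \bigl( Y_i(1) - Y_i(0) \bigr)
```

Under this sign convention, a positive estimate means that intervention state 1 is associated with more inclusion defects, which is consistent with S81 assigning such parameters to the negative influence set.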
S74. A two-tailed t-test is applied to the two outcomes of the samples under intervention states 0 and 1 to characterize the significance of their difference; the result also indicates whether the causal effect estimate reflects a real intervention effect.
In this embodiment, 15 of the 53 intervention variables have a significant causal effect on inclusion defects. The process parameters corresponding to these intervention variables are included in the candidate set of variables to be optimized; details are given in Table 1 below.
TABLE 1 candidate set of variables to be optimized
S8. Based on the magnitude, direction and significance level of the causal effect estimate, formulate a process parameter optimization strategy and design a verification model. The specific steps are as follows:
S81. The process parameters corresponding to intervention variables whose matched result variables differ significantly (that is, whose two-tailed t-test is significant) are placed in the optimization candidate set. Those with a causal effect value greater than 0 form the result variable negative influence set, and those with a causal effect value less than 0 form the result variable positive influence set.
In this embodiment, the positive influence set contains slag dam manufacturer_1.0, V, RH vacuum degree (KPa), oxygen content in molten steel during tapping, NB, CR and ALS; the negative influence set contains number of times of the water outlet, slag dam manufacturer_2.0, number of times of the sliding plate, weight of molten steel, pouring, MN, SI and TI.
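The assignment rule of S81 can be sketched as a small function. The effect values and p-values below are made up for illustration and are not the embodiment's results, and the 0.05 threshold is an assumed significance level.

```python
def split_candidates(effects, p_values, alpha=0.05):
    """Sort significant intervention variables by the sign of their effect.

    effect > 0: the intervention raises the defect rate -> negative influence set
    effect < 0: the intervention lowers the defect rate -> positive influence set
    alpha = 0.05 is an assumed significance threshold.
    """
    positive, negative = [], []
    for name, effect in effects.items():
        if p_values[name] >= alpha or effect == 0:
            continue  # not significant, or no effect: not a candidate
        (negative if effect > 0 else positive).append(name)
    return positive, negative


# Hypothetical causal effect estimates and two-tailed p-values.
effects = {"MN": 0.12, "CR": -0.08, "TI": 0.05, "other": 0.03}
p_values = {"MN": 0.001, "CR": 0.01, "TI": 0.04, "other": 0.40}
positive_set, negative_set = split_candidates(effects, p_values)
```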
S82. For discrete variables in the optimization candidate set, the true values of the discrete variables are taken as the optimization targets.
In this embodiment, inclusions are reduced more effectively when the discrete parameter 'slag dam manufacturer' is 1.0, whereas a value of 2.0 clearly increases inclusion defects, so 'slag dam manufacturer' is optimized directly to 1.0. In the raw data, 2114 samples use slag dam manufacturer '1.0' and 77 use '2.0'. To simulate a realistic intervention, half (38) of the 77 samples to be optimized are randomly selected, and their slag dam manufacturer value is changed from '2.0' to '1.0'.
S83. For continuous variables in the optimization candidate set, first determine the intervention point, then randomly reduce the values of process parameters in the negative influence set and randomly increase the values of process parameters in the positive influence set. Specifically, for a process parameter whose intervention direction is 'increase', find the samples above the intervention point, randomly select half of them, and multiply their true values by 1.5 as the value in the intervention state, that is, increase the original value by half. For a process parameter whose intervention direction is 'decrease', find the samples below the intervention point, randomly select half of them, and multiply their true values by 0.5 as the value in the intervention state, that is, reduce the original value by half. Together these form a new process parameter data set.
S84. The new process parameter data set constructed in S83 is input into the high-precision classification model AutoGluon-Tabular (AGT) to predict the proportion of inclusion defects, which is compared with the proportion of inclusion defects in the original samples to verify the accuracy of the causal effect estimation and the intervention strategy. The specific steps are as follows:
S841. Divide the original samples into a training set and a test set at a ratio of 8:2, train an AutoGluon model, and cross-validate it using classification accuracy as the evaluation index.
In this embodiment, the cross-validated classification accuracy of the AGT model is 0.802, as shown on the left in FIG. 4.
S842. Input all original samples to train an AutoGluon classification model, and check its classification accuracy.
In this embodiment, the classification accuracy of the AGT model trained on all original samples is 0.901, as shown on the right in FIG. 4.
S843. Input the new process parameter data set into the AutoGluon model trained on all original samples, predict the proportion of inclusion defects after process parameter optimization, and compare it with the proportion of inclusion defects in the original samples.
In this embodiment, the AGT model trained on all original samples predicts an inclusion defect proportion of 13.65% for the original samples and 2.56% after process parameter optimization, a reduction of 11.09 percentage points, which verifies the effectiveness of the optimization strategy.
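The comparison in S843 is simple arithmetic, shown here to keep the absolute and relative reductions apart. The two proportions are the values reported in this embodiment; the relative reduction is derived from them and is not a figure stated in the text.

```python
before = 13.65  # predicted inclusion defect proportion, original samples (%)
after = 2.56    # predicted proportion after process parameter optimization (%)

absolute_drop = before - after                 # in percentage points
relative_drop = absolute_drop / before * 100   # as a share of the original rate

print(f"absolute reduction: {absolute_drop:.2f} percentage points")
print(f"relative reduction: {relative_drop:.1f}%")
```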
As an embodiment of the present disclosure, a system for optimizing the causal effect of steel product inclusions is also disclosed. The system implements the method described above and comprises the following modules:
an establishing and merging module, configured to establish a process parameter data set and an inclusion defect data set related to the cause and effect of inclusion defects in steel products, and to merge the process parameter data set and the inclusion defect data set;
a fusion and screening module, configured to perform data fusion and sample screening on the merged steel product inclusion defect data set to obtain a new data set;
a processing module, configured to perform attribute selection and outlier processing on the new data set according to its actual data information and prior information to obtain a data set to be processed;
a calculation module, configured to construct an intervention variable for each process parameter in the data set to be processed and to calculate the causal effect of the intervention variable on inclusion defects;
an optimization and verification module, configured to formulate a process parameter optimization strategy based on the attributes of the causal effect estimate and to verify the process parameter optimization strategy.
As an embodiment of the present disclosure, the present disclosure further discloses an electronic device, including:
a memory storing executable instructions;
A processor executing the executable instructions in the memory to implement the method of the present invention.
As an embodiment of the present disclosure, the present disclosure also discloses a computer storage medium, on which a computer program is stored, the computer program being executed by a processor to implement the method of the present disclosure.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
While the foregoing description illustrates and describes preferred embodiments of the present invention, it is to be understood that the invention is not limited to the forms disclosed herein. The description is not to be construed as excluding other embodiments; the invention may be used in various other combinations, modifications and environments, and may be changed or modified within the scope of the inventive concept described herein, through the above teachings or through the skill or knowledge of the relevant art. Modifications and variations that do not depart from the spirit and scope of the invention are intended to fall within the scope of the appended claims.

Claims (7)

1. A method for optimizing the causal effect of steel product inclusions, characterized in that the method comprises the following steps:
1) Establishing a process parameter data set and an inclusion defect data set related to the cause and effect of inclusion defects in steel products, and combining the process parameter data set and the inclusion defect data set, specifically comprising:
11) Establishing a process parameter data set related to the cause and effect of inclusion defects in steel products;
12) Creating a corresponding inclusion defect data set for the process parameter data set;
13) Combining the process parameter data set and the inclusion defect data set;
2) Performing data fusion and sample screening on the combined steel product inclusion defect data set to obtain a new data set;
3) Performing attribute selection and outlier processing on the new data set according to its actual data information and prior information to obtain a data set to be processed;
4) Constructing an intervention variable for each process parameter in the data set to be processed, and calculating the causal effect of the intervention variable on inclusion defects, wherein the data type of each process parameter comprises a discrete type and a continuous type, and step 4) specifically comprises:
41) For discrete data types, applying one-hot encoding to each discrete value to determine the intervention variables, and constructing as many intervention variables as there are discrete values;
42) For continuous data types, using the median as the split point, where values greater than the median are coded 1 and values less than or equal to the median are coded 0, so that each continuous process parameter generates one intervention variable;
For an intervention variable corresponding to a process parameter, the variables remaining after that process parameter is removed are regarded as covariates of the intervention variable, and the result variable is the inclusion defect label, wherein step 4) further comprises:
43) Predicting a propensity score using logistic regression;
44) Comparing the sample counts of the experimental group and the control group of the intervention variable, and matching the group with more samples to the group with fewer samples to obtain matched samples;
45) Determining the causal effect value of the matched samples;
46) Testing the causal effect value;
5) Formulating a process parameter optimization strategy based on the attributes of the causal effect estimate.
2. The steel product inclusion causal effect optimization method according to claim 1, wherein the propensity score predicted using logistic regression is the probability of the intervention variable given its covariates; the index of the intervention variable of a process parameter is a positive integer; and the variables remaining in the process parameter set X, after removing the process parameter corresponding to the intervention variable, are regarded as the covariates of that intervention variable.
3. The steel product inclusion causal effect optimization method according to claim 1, wherein step 5) comprises the following steps:
51) Placing the process parameters corresponding to intervention variables whose tested result variables differ significantly into an optimization candidate set, taking those with a causal effect value greater than 0 as the result variable negative influence set and those with a causal effect value less than or equal to 0 as the result variable positive influence set;
52) For discrete variables in the optimization candidate set, taking the true values of the discrete variables as optimization targets;
53) For continuous variables in the optimization candidate set, first determining their intervention points, then reducing the values of process parameters in the negative influence set and increasing the values of process parameters in the positive influence set to form a new process parameter data set;
54) Inputting the new process parameter data set into a classification model to predict the proportion of inclusion defects, and comparing it with the proportion of inclusion defects in the original samples to verify the accuracy of the causal effect estimation and the intervention optimization strategy.
4. The steel product inclusion causal effect optimization method according to claim 3, characterized in that step 54) comprises:
541) Dividing the original samples into a training set and a test set at a ratio of 8:2, training a classification model, and cross-validating it using classification accuracy as the evaluation index;
542) Inputting all original samples to train a classification model, and checking its classification accuracy;
543) Inputting the new process parameter data set into the classification model trained in 542), predicting the proportion of inclusion defects after process parameter optimization, and comparing it with the proportion of inclusion defects in the original samples.
5. A steel product inclusion causal effect optimization system, characterized in that it implements the method of any one of claims 1-4 and comprises the following modules:
an establishing and merging module, configured to establish a process parameter data set and an inclusion defect data set related to the cause and effect of inclusion defects in steel products, and to merge the process parameter data set and the inclusion defect data set;
a fusion and screening module, configured to perform data fusion and sample screening on the merged steel product inclusion defect data set to obtain a new data set;
a processing module, configured to perform attribute selection and outlier processing on the new data set according to its actual data information and prior information to obtain a data set to be processed;
a calculation module, configured to construct an intervention variable for each process parameter in the data set to be processed and to calculate the causal effect of the intervention variable on inclusion defects;
an optimization and verification module, configured to formulate a process parameter optimization strategy based on the attributes of the causal effect estimate and to verify the process parameter optimization strategy.
6. An electronic device, the electronic device comprising:
a memory storing executable instructions;
A processor executing the executable instructions in the memory to implement the method of any of claims 1-4.
7. A computer storage medium, characterized in that the medium has stored thereon a computer program which is executed by a processor to implement the method of any of claims 1-4.
CN202410338274.5A 2024-03-25 Steel product inclusion causal effect optimization method, system, electronic equipment and medium Active CN118013426B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410338274.5A CN118013426B (en) 2024-03-25 Steel product inclusion causal effect optimization method, system, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN118013426A CN118013426A (en) 2024-05-10
CN118013426B true CN118013426B (en) 2024-06-28

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116050607A (en) * 2023-01-03 2023-05-02 阿里云计算有限公司 Parameter optimization method and device for production process

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant