CN115049123A - Prediction method for silicon content of molten iron in blast furnace based on GA-XGboost model - Google Patents

Prediction method for silicon content of molten iron in blast furnace based on GA-XGboost model

Info

Publication number
CN115049123A
Authority
CN
China
Prior art keywords
data
blast furnace
content
silicon content
molten iron
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210641526.2A
Other languages
Chinese (zh)
Inventor
王德全
田铁磊
刘燕军
李涛
邓勇
李丽红
杨佳毅
王艾军
王杰
张书楼
张通亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China University of Science and Technology
Delong Steel Ltd
Original Assignee
North China University of Science and Technology
Delong Steel Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China University of Science and Technology, Delong Steel Ltd filed Critical North China University of Science and Technology
Priority to CN202210641526.2A priority Critical patent/CN115049123A/en
Publication of CN115049123A publication Critical patent/CN115049123A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/12Computing arrangements based on biological models using genetic models
    • G06N3/126Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Economics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Biomedical Technology (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Manufacture Of Iron (AREA)

Abstract

A blast furnace molten iron silicon content prediction method based on a GA-XGboost model comprises the following steps: first, collecting historical smelting data of a blast furnace; second, standardizing the data set; third, dividing the data into different clusters; fourth, eliminating characteristic variables whose correlation coefficients are larger than a set value, and dividing the data into a training set and a test set; fifth, training the GA-XGboost model using the data in the training set; sixth, testing the trained GA-XGboost model using the data in the test set; and seventh, predicting the silicon content of the blast furnace molten iron using the GA-XGboost model that has passed the test. The method uses a genetic algorithm to optimize the XGboost algorithm, and divides the prediction data set into a plurality of data subsets through the KMeans++ algorithm before prediction, so that high prediction accuracy can be obtained under complex and variable blast furnace conditions and the prediction efficiency is greatly improved.

Description

Prediction method for silicon content of molten iron in blast furnace based on GA-XGboost model
Technical Field
The invention relates to a method for predicting the silicon content of blast furnace molten iron that improves prediction efficiency while ensuring the accuracy of the prediction result, and belongs to the technical field of metal smelting.
Background
At present, the combination of blast furnace smelting and machine learning is developing vigorously. Combining the two helps make blast furnace production parameters more accurate and accelerates the automation of smelting, and these advantages make it a likely direction for the future development of blast furnace smelting. However, a large gap remains between China and the international level, and prediction technology for the blast furnace smelting process is not yet mature. In predicting blast furnace silicon content, mechanistic models do not match actual conditions, and prediction accuracy and generalization ability are poor. In particular, there is no efficient and mature method or model for predicting silicon content under high coal injection smelting, so the silicon content and furnace temperature state cannot be fed back in time after the burden structure or the blast furnace operating parameters change, which makes blast furnace operation unstable.
Traditional big data prediction models usually adopt either a neural network or a statistical method to predict the target result. With the former, although the predicted values approximate the true values well, the prediction process is not interpretable, and it is difficult to judge how strongly each blast furnace input parameter influences the predicted silicon content. The latter applies mature statistical analysis methods, which makes the prediction result interpretable, but when the data set is very large the prediction efficiency drops sharply, and high prediction accuracy cannot be obtained under complex and variable blast furnace conditions. Therefore, it is necessary to find an efficient and accurate method for predicting the silicon content of blast furnace molten iron.
Disclosure of Invention
The invention aims to provide a blast furnace molten iron silicon content prediction method based on a GA-XGboost model aiming at the defects of the prior art so as to improve the prediction efficiency of the blast furnace molten iron silicon content and ensure the accuracy of the prediction result.
The problem of the invention is solved by the following technical scheme:
a blast furnace molten iron silicon content prediction method based on a GA-XGboost model comprises the following steps:
first, collecting historical smelting data of a blast furnace, and preprocessing the collected data set;
second, standardizing the data set;
third, dividing the data in the data set into different clusters through the KMeans++ clustering algorithm;
fourth, analyzing the correlation among the characteristic parameters in each cluster using the Pearson correlation coefficient, eliminating characteristic variables whose correlation coefficient is larger than a set value, and dividing the data in each cluster into a training set and a test set;
fifth, training the GA-XGboost model using the data in the training set;
sixth, testing the trained GA-XGboost model using the data in the test set;
and seventh, predicting the silicon content of the blast furnace molten iron using the GA-XGboost model that has passed the test.
In the above prediction method for the silicon content of blast furnace molten iron based on the GA-XGboost model, training the GA-XGboost model using the data in the training set exploits the population search of the genetic algorithm: the parameter values of XGboost are taken as an individual of the genetic algorithm, the currently optimized parameter values, drawn from a set parameter combination interval, are passed to XGboost for prediction, and the result is used as the input of the fitness function of the genetic algorithm over multiple iterations, finally obtaining the optimal parameter combination of XGboost. The specific steps are as follows:
a. setting each initial parameter and selectable parameter interval of the genetic algorithm;
b. the fitness of the current model parameter is calculated by adopting a genetic algorithm, and the calculation formula of a fitness function is as follows:
Figure BDA0003682321270000021
where fitness represents the fitness, m represents the number of samples of the sub data set,
Figure BDA0003682321270000022
representing the true value of the ith sample in the test set;
c. setting the number of retained parents, selecting the individuals with the highest fitness as the retained parents, and randomly crossing the genes of two parents to generate new offspring;
d. forming a new individual from every offspring individual by randomly mutating a single gene, and taking the new individuals as the parents of the next iteration of the genetic algorithm;
e. repeating steps b to d until the specified number of iterations is completed or the fitness reaches the specified requirement.
In the above prediction method for the silicon content of blast furnace molten iron based on the GA-XGboost model, when the GA-XGboost model is trained using the data in the training set, the number of retained parents is set to 3, the genes of the parents are crossed by a uniform crossover method, and each gene of an offspring is selected independently from the parents, since the genes are independent of one another.
In the above prediction method, when every offspring individual forms a new individual by randomly mutating a single gene, the mutation randomly selects a value within the selectable range to replace the original parameter, and only one gene of the offspring is changed in each mutation.
In the above prediction method, preprocessing the collected data set comprises handling missing values and handling abnormal values: samples for which more than half of the characteristic-parameter values are missing are eliminated by the deletion method, and for the remaining data a missing value is filled with the average of the data in the week before and after it; abnormal values are screened and cleaned using a box plot.
In the above prediction method, the data set is standardized using the following formula:
$$x^{*} = \frac{x - \mu}{\sigma}$$
where $x$ is the data before normalization, $x^{*}$ is the data after normalization, $\mu$ is the data mean, and $\sigma$ is the data standard deviation.
In the above prediction method, the specific steps of dividing the data in the data set into different clusters through the KMeans++ clustering algorithm are as follows:
a data set $X = \{x_1, x_2, \ldots, x_n\}$ ($x_i \in R^t$) containing n t-dimensional data is divided into a plurality of non-intersecting clusters, where $R^t$ represents t-dimensional data and $x_i$ is the ith t-dimensional data; the number of clusters is determined by the size of the silhouette coefficient s, which is calculated as follows:
$$s = \frac{b - a}{\max(a, b)}$$
where b is the average Euclidean distance between a data point and the data outside its own cluster, and a is the average Euclidean distance between the data point and the other data in its own cluster; when the silhouette coefficient is maximal for a division into k clusters, the data set is divided into k clusters:
$$X = \{X_1, X_2, \ldots, X_k\}, \quad X_j = \{x_{j1}, x_{j2}, \ldots, x_{jn_j}\}$$
where $X_j$ denotes the jth cluster, j = 1, 2, …, k, $x_{ji}$ represents the ith t-dimensional data in the jth cluster, and $n_j$ indicates the number of t-dimensional data in the jth cluster.
In the above prediction method, when characteristic variables whose correlation coefficient is larger than a set value are eliminated, the set value of the correlation coefficient is 0.9, and when the data in each cluster are divided into a training set and a test set, the ratio of the number of data in the training set to the number in the test set is 7:3.
In the above prediction method, the collected historical smelting data of the blast furnace comprise air temperature, air volume, oxygen content, oxygen enrichment rate, coke ratio, coal ratio, pressure difference, top pressure, top temperature, permeability index, gas CO content, gas CO2 content, slag SiO2 content, slag binary basicity, slag ternary basicity, slag quaternary basicity, utilization coefficient, molten iron sulfur content, blast kinetic energy, sinter silicon content, common pellet silicon content, acid pellet silicon content, lump ore 1 silicon content, lump ore 2 silicon content, magnesium acid pellet silicon content and calcium carbonate content, wherein the seven characteristic parameters related to burden proportion are the sinter silicon content, the common pellet silicon content, the acid pellet silicon content, the lump ore 1 silicon content, the lump ore 2 silicon content, the magnesium acid pellet silicon content and the calcium carbonate content.
In the above prediction method, the characteristic parameters of the collected data set that relate to the raw materials charged into the blast furnace form a sparse matrix; after the data in the data set are divided into different clusters, the characteristic parameters related to burden proportion are compressed through a PCA algorithm into a single one-dimensional feature, the burden silicon content, so as to improve the prediction accuracy of the silicon content of the blast furnace molten iron.
The method uses a genetic algorithm to optimize the XGboost algorithm, and divides the prediction data set into a plurality of data subsets through the KMeans++ algorithm before prediction, which not only eliminates the influence of furnace conditions on silicon content prediction but also obtains higher prediction accuracy under complex and variable blast furnace conditions and greatly improves the prediction efficiency.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings.
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram showing random crossing of genes in two parents, wherein FIG. 2(a) shows a state before crossing and FIG. 2(b) shows a state after crossing.
The symbols in the text are: fitness represents the fitness, m represents the number of samples of the sub data set,
Figure BDA0003682321270000041
represents the true value of the ith sample in the test set; $x$ is the data before normalization; $x^{*}$ is the data after normalization; $\mu$ is the data mean; $\sigma$ is the data standard deviation; $R^t$ represents t-dimensional data; $x_i$ is the ith t-dimensional data; s is the silhouette coefficient; b is the average Euclidean distance between a data point and the data outside its own cluster; a is the average Euclidean distance between the data point and the other data in its own cluster; $X_j$ denotes the jth cluster; $x_{ji}$ represents the ith t-dimensional data in the jth cluster; and $n_j$ indicates the number of t-dimensional data in the jth cluster.
Detailed Description
The invention provides a method for predicting the silicon content of blast furnace molten iron based on a GA-XGboost model. A genetic algorithm is used to optimize the XGboost algorithm, and before prediction the prediction data set is divided by the KMeans++ algorithm into a plurality of data subsets according to their degree of similarity, so as to distinguish the blast furnace conditions during smelting and improve the accuracy of the prediction result. Because the environment of the blast furnace hearth is harsh, online measurement of the molten iron temperature is difficult, smelting data are not fed back in time, and corresponding control measures cannot be taken in advance to stabilize the thermal state of the blast furnace. Moreover, many mechanistic and data-driven models must satisfy strong mathematical assumptions: they give good predictions when the furnace runs stably and smoothly, but when the furnace condition is unstable their predictions often deviate considerably from the actual situation. The invention aims to solve the problem of predicting the silicon content in blast furnace smelting and to achieve accurate prediction of the silicon content under various blast furnace conditions.
The method comprises the following steps:
S1, collecting historical smelting data of a blast furnace, wherein each sample comprises m sample characteristics, and preprocessing the collected data set.
Specifically, 26 variables are selected as input characteristics of the prediction model: air temperature, air volume, oxygen content, oxygen enrichment rate, coke ratio, coal ratio, pressure difference, top pressure, top temperature, permeability index, gas CO content, gas CO2 content, slag SiO2 content, slag binary basicity, slag ternary basicity, slag quaternary basicity, utilization coefficient, molten iron sulfur content, blast kinetic energy, sinter silicon content, common pellet silicon content, acid pellet silicon content, lump ore 1 silicon content, lump ore 2 silicon content, magnesium acid pellet silicon content and calcium carbonate content.
The preprocessing comprises two parts, handling missing values and handling abnormal values. Missing values are handled by a deletion method and a filling method: a sample for which more than half of the characteristic-parameter values are missing is eliminated by the deletion method, and the filling method fills a missing value with the average of the data in the week before and after it. Abnormal values are screened and cleaned using a box plot.
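The preprocessing described above can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: missing values are represented as `None`, one record per day is assumed so that "a week" becomes a window of 7 positions, and the box-plot rule uses the common 1.5 × IQR whiskers. The function names are hypothetical.

```python
from statistics import mean, quantiles

def drop_sparse_samples(rows):
    """Deletion method: keep a sample only if at most half of its features are missing (None)."""
    return [r for r in rows if sum(v is None for v in r) <= len(r) / 2]

def fill_gaps(series, window=7):
    """Filling method: replace None with the mean of the known values
    within `window` positions before and after it (assumed: one record per day)."""
    filled = list(series)
    for i, v in enumerate(series):
        if v is None:
            lo, hi = max(0, i - window), min(len(series), i + window + 1)
            neighbours = [x for x in series[lo:hi] if x is not None]
            if neighbours:  # leave the gap if nothing nearby is known
                filled[i] = mean(neighbours)
    return filled

def boxplot_mask(values, k=1.5):
    """Box-plot screening: True where a value lies inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return [q1 - k * iqr <= v <= q3 + k * iqr for v in values]
```

Rows flagged `False` by `boxplot_mask` would then be removed or inspected before standardization.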
S2, after preprocessing, the data set is standardized: it is centered on the data mean μ and then scaled by the data standard deviation σ, so that each characteristic parameter has zero mean and unit variance, that is, the standardization uses the following formula:
$$x^{*} = \frac{x - \mu}{\sigma}$$
where $x$ is the data before normalization and $x^{*}$ is the data after normalization.
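As a small worked example of this standardization, the sketch below applies it to a single feature column. It assumes the population standard deviation is used (the text does not say whether the population or sample form is intended), and `standardize` is a hypothetical helper name.

```python
from statistics import fmean, pstdev

def standardize(column):
    """Center a feature column on its mean and scale by its standard deviation."""
    mu = fmean(column)
    sigma = pstdev(column)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in column]  # constant column: nothing to scale
    return [(x - mu) / sigma for x in column]

z = standardize([1.0, 2.0, 3.0])
```

After the call, `z` has mean 0 and standard deviation 1, which is what the formula guarantees; it does not by itself make the data normally distributed.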
S3, dividing the data in the data set into different clusters through the KMeans++ clustering algorithm, which eliminates the influence of different furnace conditions on silicon content prediction and improves the accuracy of the prediction result. A data set $X = \{x_1, x_2, \ldots, x_n\}$ ($x_i \in R^t$) containing n t-dimensional data is divided into a plurality of non-intersecting clusters, where $R^t$ represents t-dimensional data and $x_i$ is the ith t-dimensional data. The number of clusters is determined by the size of the silhouette coefficient s, which is calculated as follows:
$$s = \frac{b - a}{\max(a, b)}$$
where b is the average Euclidean distance between a data point and the data outside its own cluster, and a is the average Euclidean distance between the data point and the other data in its own cluster. When the silhouette coefficient is maximal for a division into k clusters, the data set is divided into k clusters:
$$X = \{X_1, X_2, \ldots, X_k\}, \quad X_j = \{x_{j1}, x_{j2}, \ldots, x_{jn_j}\}$$
where $X_j$ denotes the jth cluster, j = 1, 2, …, k, $x_{ji}$ represents the ith t-dimensional data in the jth cluster, and $n_j$ represents the number of t-dimensional data in the jth cluster, each cluster corresponding to one furnace condition.
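The cluster-count criterion can be illustrated with a short silhouette computation. The sketch assumes the standard definition, in which b is taken with respect to the nearest other cluster, and averages the per-point scores; the text does not spell out either detail, and `silhouette` is a hypothetical helper name.

```python
from math import dist
from statistics import fmean

def silhouette(points, labels):
    """Mean silhouette coefficient; `labels` must contain at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster (dist(p, p) = 0)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster (assumption)
        b = min(fmean([dist(p, q) for q in other])
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return fmean(scores)

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(points, [0, 0, 1, 1])  # compact, well separated clusters
bad = silhouette(points, [0, 1, 0, 1])   # clusters that mix the two groups
```

Running the clustering for several candidate values of k and keeping the k with the largest score follows the selection rule stated above.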
S4, the characteristic parameters of the data set that relate to the raw materials charged into the blast furnace form a sparse matrix, which affects the accuracy of the prediction model. Therefore the seven characteristic parameters related to burden proportion, namely sinter silicon content, common pellet silicon content, acid pellet silicon content, lump ore 1 silicon content, lump ore 2 silicon content, magnesium acid pellet silicon content and calcium carbonate content, are compressed through a PCA algorithm into a single one-dimensional feature: the burden silicon content.
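To make the compression step concrete, the sketch below projects multi-column data onto its first principal component using power iteration on the covariance matrix. It is a minimal stand-in for a PCA library call, written with the standard library only; the starting vector, iteration count and function name are assumptions, and a pathological starting vector orthogonal to the leading direction would not converge.

```python
from math import sqrt
from statistics import fmean

def first_principal_component(rows, iters=200):
    """Return (direction, scores): the leading PCA direction and each row's 1-D projection."""
    n, d = len(rows), len(rows[0])
    means = [fmean([r[j] for r in rows]) for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-uniform start (assumption)
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # degenerate data or unlucky start
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

direction, burden_feature = first_principal_component([[1, 2], [2, 4], [3, 6]])
```

In the setting described here, `rows` would hold the seven burden-related columns of one cluster and `burden_feature` would be the single compressed feature.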
S5, analyzing the correlation among the characteristic parameters of each cluster using the Pearson correlation coefficient, eliminating characteristic variables whose correlation coefficient is larger than 0.9, and taking the remaining characteristic parameters as input values of the prediction model. The data in each cluster are divided into a training set and a test set at a ratio of 7:3.
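A possible reading of this step is sketched below. Two details are assumptions, since the text does not state them: the absolute value of the correlation is compared with 0.9, and of two highly correlated features the one listed first is kept. The split is a shuffled 7:3 split with a fixed seed; all function names are hypothetical.

```python
import random
from math import sqrt
from statistics import fmean

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def drop_correlated(columns, threshold=0.9):
    """Greedy filter: keep a feature only if |r| <= threshold against every feature already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

def split_7_3(rows, seed=0):
    """Shuffle the samples of one cluster and split them 70% / 30%."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = round(0.7 * len(rows))
    return rows[:cut], rows[cut:]
```

For example, with columns `a = [1, 2, 3, 4]`, `b = [2, 4, 6, 8.1]` and `c = [4, 1, 3, 2]`, feature `b` is dropped because it is almost perfectly correlated with `a`, while `c` is kept.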
S6, training the GA-XGboost model using the data in the training set of each cluster. The method exploits the population search of the genetic algorithm: the parameter values of XGboost are taken as an individual of the genetic algorithm, the currently optimized parameter values, drawn from a set parameter combination interval, are passed to XGboost for prediction, and the result is used as the input of the fitness function of the genetic algorithm over multiple iterations, finally obtaining the optimal parameter combination of XGboost.
The method comprises the following specific steps:
a. setting each initial parameter and selectable parameter interval of the genetic algorithm;
b. the fitness of the current model parameter is calculated by a genetic algorithm, and the calculation formula of a fitness function is as follows:
Figure BDA0003682321270000071
where fitness represents the fitness, m represents the number of samples of the sub data set,
Figure BDA0003682321270000072
representing the true value of the ith sample in the test set.
c. Setting the number of retained parents, selecting the individuals with the highest fitness as the retained parents, and randomly crossing the genes (data) of two parents to generate new offspring (see fig. 2).
d. Forming a new individual from every offspring individual by randomly mutating a single gene, and taking the new individuals as the parents of the next iteration of the genetic algorithm.
e. Repeating steps b to d until the specified number of iterations is completed or the fitness reaches the specified requirement.
In step a, each initial parameter is set to a random value within the set parameter interval. Seven parameters that mainly affect the XGboost algorithm are selected for adjustment and used as the genes of the genetic algorithm population; the optimal parameters are searched for by crossover, mutation and similar operations, and the fast convergence of this search shortens the training time of the algorithm.
In step c, the number of retained parent individuals is set to 3. Experiments verified this choice: retaining too many parents reduces the difference between parents and offspring, which slows convergence and lengthens training, while retaining too few makes the difference between parents and offspring too large, so that convergence to the optimal value becomes difficult.
In step c, the genes of the parents are crossed by a uniform crossover method: because the genes are independent of one another, each gene of an offspring is selected independently from the parents. In step d, mutation randomly selects a value for a parameter within its selectable range; changing the value by a random amount introduces diversity among the offspring, and only one gene of an offspring is changed in each mutation. Crossover and mutation act on the offspring and make them differ from the parents, so that individuals with high fitness in the parent generation are retained while the search continues to iterate toward the optimal value, and the added mutation helps keep the genetic algorithm from being trapped in a local optimum during iterative convergence.
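Steps a to e can be sketched as a small genetic loop. Everything specific here is an assumption made for illustration: the text does not name the seven tuned parameters, so `SPACE` lists commonly tuned XGBoost parameter names with made-up candidate values; the population size and generation count are arbitrary; and `toy_fitness` is a stand-in score, whereas a real run would train XGBoost with each individual and evaluate it with the patent's fitness function, which is not reproduced in this text.

```python
import random

# Hypothetical search space (parameter names and values are illustrative only).
SPACE = {
    "max_depth": [3, 4, 5, 6, 7, 8],
    "learning_rate": [0.01, 0.05, 0.1, 0.2, 0.3],
    "n_estimators": [100, 200, 300, 400, 500],
    "subsample": [0.6, 0.7, 0.8, 0.9, 1.0],
    "colsample_bytree": [0.6, 0.7, 0.8, 0.9, 1.0],
    "min_child_weight": [1, 3, 5, 7],
    "gamma": [0.0, 0.1, 0.2, 0.4],
}

def random_individual(rng):
    """Step a: each gene starts as a random value from its selectable range."""
    return {k: rng.choice(v) for k, v in SPACE.items()}

def uniform_crossover(p1, p2, rng):
    """Step c: every gene is taken independently from one of the two parents."""
    return {k: rng.choice((p1[k], p2[k])) for k in SPACE}

def mutate_one_gene(child, rng):
    """Step d: exactly one gene is redrawn from its selectable range."""
    gene = rng.choice(list(SPACE))
    mutated = dict(child)
    mutated[gene] = rng.choice(SPACE[gene])
    return mutated

def evolve(fitness, pop_size=12, n_keep=3, generations=20, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):  # step e: repeat b to d
        # Steps b and c: score everyone, retain the n_keep fittest as parents
        parents = sorted(population, key=fitness, reverse=True)[:n_keep]
        children = []
        while len(children) < pop_size - n_keep:
            p1, p2 = rng.sample(parents, 2)
            children.append(mutate_one_gene(uniform_crossover(p1, p2, rng), rng))
        # Assumption: retained parents are carried over unchanged
        population = parents + children
    return max(population, key=fitness)

def toy_fitness(ind):
    # Stand-in score so the sketch runs without XGBoost installed.
    return -abs(ind["max_depth"] - 5) - abs(ind["learning_rate"] - 0.1)
```

Calling `evolve(toy_fitness)` returns the fittest parameter dictionary found; because the retained parents are carried over, the best score never decreases from one generation to the next.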
S7, inputting the data of the test set into the trained GA-XGboost model to obtain a group of predicted values, plotting the predicted values against the true values for visual comparison, and thereby testing the trained GA-XGboost model and observing how well it performs.
S8, predicting the silicon content of the blast furnace molten iron using the GA-XGboost model that has passed the test.
Experiments show that after the sample set is separated into different furnace conditions by the clustering algorithm, the hit rate of the prediction result improves markedly: the hit rate can reach 100% when the allowed error is less than 0.1, and averages more than 90% when the allowed error is less than 0.05. Compared with predictions made without distinguishing furnace conditions, distinguishing them effectively raises the hit rate, and dividing the sample set by degree of similarity also improves the accuracy of the prediction result. In addition, the XGboost model optimized by the genetic algorithm converges quickly to the optimum, which reduces manual parameter tuning and improves the prediction efficiency of the model.
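The hit rate quoted above can be computed as the share of predictions whose absolute error is below a tolerance. The sketch below assumes that reading; the sample values are invented purely to show the calculation and are not results from the patent.

```python
def hit_rate(y_true, y_pred, tolerance):
    """Fraction of samples with |true - predicted| strictly below `tolerance`."""
    hits = sum(abs(t - p) < tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Illustrative values only (not measured data).
y_true = [0.40, 0.45, 0.50, 0.55]
y_pred = [0.42, 0.52, 0.49, 0.53]
loose = hit_rate(y_true, y_pred, 0.1)    # every error is below 0.1
strict = hit_rate(y_true, y_pred, 0.05)  # the 0.07 error now counts as a miss
```

Tightening the tolerance can only lower the hit rate or leave it unchanged, which is why the two thresholds are reported separately.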
The GA-XGboost prediction method not only gives prediction results good interpretability, but also completes prediction on a huge data set within an acceptable time, achieves high accuracy, and reduces manual parameter tuning.
At present, high coal injection blast furnace smelting technology in China is still at a development stage, and far less data has accumulated for it than for traditional smelting methods. Because there is no mature technical scheme for applying this technology, the increased coal ratio makes conditions inside the blast furnace more complex and variable than under traditional smelting, and affects changes in blast furnace operating parameters. These problems cause the characteristic parameters of the prediction data set to differ greatly, so the data are divided by the KMeans++ algorithm into a plurality of sub data sets (clusters) according to their degree of similarity, in order to distinguish different conditions in the blast furnace. Distinguishing furnace conditions in this way groups similar characteristic parameters together and greatly improves the accuracy of the prediction result.
Interpretation of terms:
XGboost (eXtreme Gradient Boosting) is a gradient boosting algorithm built on residual decision trees. Its basic idea is to add trees to the model one at a time, such that each added CART decision tree improves the overall result (reduces the objective function). A plurality of decision trees (individually weak learners) are combined into one ensemble model, and each leaf node is given a weight.
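The idea of adding one tree at a time, each fitted to what the current model still gets wrong, can be shown with a deliberately simplified booster. This is not XGBoost: it uses one-split stumps on a single feature and plain squared-error residuals, whereas XGBoost additionally uses second-order gradient information and regularization. The function names, learning rate and round count are assumptions.

```python
def fit_stump(x, residuals):
    """Best single-threshold split of 1-D inputs, minimising squared error on the residuals.
    Requires at least two distinct values in x."""
    best = None
    for t in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm  # leaf values act as leaf weights

def boost(x, y, rounds=20, lr=0.5):
    """Additive model: start from the mean, then add stumps fitted to the residuals."""
    base = sum(y) / len(y)
    trees = []
    pred = [base] * len(y)
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(x, residuals)
        trees.append(tree)
        pred = [pi + lr * tree(xi) for xi, pi in zip(x, pred)]
    return lambda xi: base + lr * sum(tree(xi) for tree in trees)

model = boost([1, 2, 3, 4, 5, 6], [1, 1, 1, 3, 3, 3])
```

On this step-shaped toy data each round halves the remaining residual, so `model(1)` approaches 1 and `model(6)` approaches 3.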
GA-XGboost is the XGboost algorithm optimized and improved by a genetic algorithm.
KMeans++ is a clustering algorithm that improves on the KMeans algorithm.
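The improvement KMeans++ makes over plain KMeans lies in how the initial cluster centres are chosen: after a random first centre, each further centre is drawn with probability proportional to its squared distance from the nearest centre already chosen. The sketch below shows only this seeding step, with a hypothetical function name and a fixed random seed; the usual KMeans iterations would follow.

```python
import random
from math import dist

def kmeanspp_seeds(points, k, rng=None):
    """Pick k initial centres with KMeans++ seeding.
    Assumes the data contain at least k distinct points."""
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen centre
        d2 = [min(dist(p, c) ** 2 for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

seeds = kmeanspp_seeds([(0.0, 0.0), (0.0, 0.0), (5.0, 5.0), (5.0, 5.0)], k=2)
```

Points that coincide with an existing centre have weight zero, so in this toy data set the two seeds always land on the two distinct locations.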

Claims (10)

1. A blast furnace molten iron silicon content prediction method based on a GA-XGboost model, characterized by comprising the following steps:
first, collecting historical smelting data of a blast furnace, and preprocessing the collected data set;
second, standardizing the data set;
third, dividing the data in the data set into different clusters through the KMeans++ clustering algorithm;
fourth, analyzing the correlation among the characteristic parameters in each cluster using the Pearson correlation coefficient, eliminating characteristic variables whose correlation coefficient is larger than a set value, and dividing the data in each cluster into a training set and a test set;
fifth, training the GA-XGboost model using the data in the training set;
sixth, testing the trained GA-XGboost model using the data in the test set;
and seventh, predicting the silicon content of the blast furnace molten iron using the GA-XGboost model that has passed the test.
2. The method for predicting the silicon content of blast furnace molten iron based on the GA-XGboost model as claimed in claim 1, characterized in that training the GA-XGboost model using the data in the training set exploits the population search of the genetic algorithm: the parameter values of XGboost are taken as an individual of the genetic algorithm, the currently optimized parameter values, drawn from a set parameter combination interval, are passed to XGboost for prediction, and the result is used as the input of the fitness function of the genetic algorithm over multiple iterations, finally obtaining the optimal parameter combination of XGboost, the specific steps being as follows:
a. setting each initial parameter and selectable parameter interval of the genetic algorithm;
b. the fitness of the current model parameter is calculated by adopting a genetic algorithm, and the calculation formula of a fitness function is as follows:
Figure FDA0003682321260000011
where fitness represents the fitness, m represents the number of samples of the sub data set,
Figure FDA0003682321260000012
representing the true value of the ith sample in the test set;
c. setting the number of retained parents, selecting the individuals with the highest fitness as the retained parents, and randomly crossing the genes of two parents to generate new offspring;
d. forming a new individual from every offspring individual by randomly mutating a single gene, and taking the new individuals as the parents of the next iteration of the genetic algorithm;
e. repeating steps b to d until the specified number of iterations is completed or the fitness reaches the specified requirement.
3. The method for predicting the silicon content of the molten iron of the blast furnace based on the GA-XGboost model as claimed in claim 2, wherein, when the GA-XGboost model is trained by using the data in the training set, the number of retained parents is 3, gene crossover between the parents uses the uniform crossover method, and, because each gene is independent, each gene of the offspring is selected independently from one of the parents.
4. The method for predicting the silicon content of the molten iron of the blast furnace based on the GA-XGboost model as claimed in claim 3, wherein, when a single gene of each offspring individual is randomly mutated to form new individuals, the mutation randomly selects a parameter value within the selectable range to replace the original value, and each mutation changes only one gene of the offspring.
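The selection, uniform crossover, and single-gene mutation of claims 2 to 4 can be sketched as follows. This is only an illustrative outline: the parameter names and candidate values in `PARAM_SPACE`, the population size, and the helper names are assumptions, and `toy_fitness` is a hypothetical stand-in for the fitness that the claims derive from an XGboost prediction. Carrying the retained parents unchanged into the next generation is also an assumption about how "retained" is meant.

```python
import random

# Hypothetical search space; names and candidate values are illustrative only.
PARAM_SPACE = {
    "max_depth": [3, 4, 5, 6, 7, 8],
    "learning_rate": [0.01, 0.05, 0.1, 0.2, 0.3],
    "n_estimators": [100, 200, 300, 400, 500],
}


def random_individual(rng):
    """Step a: draw one parameter combination from the selectable intervals."""
    return {name: rng.choice(values) for name, values in PARAM_SPACE.items()}


def uniform_crossover(parent_a, parent_b, rng):
    """Claim 3: each gene of the child is chosen independently from a parent."""
    return {name: rng.choice([parent_a[name], parent_b[name]])
            for name in PARAM_SPACE}


def mutate_single_gene(individual, rng):
    """Claim 4: replace exactly one gene with another value from its range."""
    child = dict(individual)
    gene = rng.choice(list(PARAM_SPACE))
    alternatives = [v for v in PARAM_SPACE[gene] if v != child[gene]]
    child[gene] = rng.choice(alternatives)
    return child


def run_ga(fitness, pop_size=10, n_parents=3, n_iterations=20, seed=0):
    """Steps b to e: score, retain the fittest parents, cross, mutate, repeat."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(n_iterations):
        # Steps b and c: keep the n_parents individuals with the highest fitness.
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        children = []
        for _ in range(pop_size - n_parents):
            # Steps c and d: cross two distinct parents, then mutate one gene.
            a, b = rng.sample(parents, 2)
            children.append(mutate_single_gene(uniform_crossover(a, b, rng), rng))
        # Assumption: retained parents survive alongside the mutated offspring.
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(params):
    """Hypothetical stand-in: higher is better, peak at an arbitrary target."""
    return -(abs(params["max_depth"] - 6)
             + abs(params["learning_rate"] - 0.1) * 10
             + abs(params["n_estimators"] - 300) / 100)


best = run_ga(toy_fitness)
```

In an actual pipeline, `toy_fitness` would be replaced by a function that trains XGboost with the given parameters and returns the fitness computed from its predictions.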
5. The method for predicting the silicon content of the molten iron of the blast furnace based on the GA-XGboost model according to any one of claims 1 to 4, wherein preprocessing the collected data set comprises handling missing values and abnormal values: samples in which more than half of the characteristic parameters are missing are deleted; for the remaining data with missing values, each missing value is filled with the average of the neighboring data before and after it; and outliers are screened and removed by means of a box plot.
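A minimal sketch of the preprocessing in claim 5, under stated assumptions: missing values are represented as `None`, the "neighboring data" are taken to be the nearest valid value on each side, and the box-plot rule uses the conventional 1.5 × IQR whiskers. None of these specifics is given in the claim, and the function names are hypothetical.

```python
import statistics


def drop_sparse_rows(rows):
    """Delete samples in which more than half of the features are missing."""
    return [r for r in rows if sum(v is None for v in r) <= len(r) / 2]


def fill_with_neighbors(column):
    """Fill each missing value with the mean of the nearest valid values
    before and after it (only one side is used at the ends of the series)."""
    filled = list(column)
    for i, value in enumerate(column):
        if value is None:
            before = [x for x in reversed(column[:i]) if x is not None][:1]
            after = [x for x in column[i + 1:] if x is not None][:1]
            neighbors = before + after
            if neighbors:
                filled[i] = sum(neighbors) / len(neighbors)
    return filled


def boxplot_bounds(values, k=1.5):
    """Return the lower and upper whisker limits of a box plot."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that lie within the box-plot whiskers."""
    low, high = boxplot_bounds(values, k)
    return [v for v in values if low <= v <= high]
```

For example, `fill_with_neighbors([1.0, None, 3.0])` fills the gap with the mean of its two neighbors, and `remove_outliers` discards any value outside the whiskers.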
6. The method for predicting the silicon content of the molten iron of the blast furnace based on the GA-XGboost model according to claim 5, wherein the data set is normalized by the following formula:
$$X^{*} = \frac{X - \mu}{\sigma}$$
where $X$ is the data before normalization, $X^{*}$ is the data after normalization, $\mu$ is the data mean, and $\sigma$ is the data standard deviation.
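As a small worked example of this normalization, the sketch below standardizes one feature column by subtracting the mean and dividing by the standard deviation. Using the population standard deviation, and returning zeros for a constant column, are assumptions; the claim does not say which convention is used.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


z = standardize([2, 4, 4, 4, 5, 5, 7, 9])  # mean 5, standard deviation 2
```

After standardization the column has zero mean and unit standard deviation, so features with different physical units become comparable.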
7. The method for predicting the silicon content of the molten iron of the blast furnace based on the GA-XGboost model as claimed in claim 6, wherein the specific method for dividing the data in the data set into different clusters by the KMeans++ clustering algorithm is as follows:
a data set $X = \{x_1, x_2, \dots, x_n\}$ ($x_i \in R^t$) containing n t-dimensional data points is divided into a plurality of disjoint clusters, where $R^t$ denotes the t-dimensional data space and $x_i$ denotes the i-th t-dimensional data point; the number of clusters is determined by the value of the silhouette coefficient s, which is calculated as follows:
$$s = \frac{b - a}{\max(a, b)}$$
where b is the average Euclidean distance between a data point and the data outside its own cluster, and a is the average Euclidean distance between the data point and the other data in its own cluster; when the silhouette coefficient is largest for a division into k clusters, the data are divided into k clusters:
Figure FDA0003682321260000032
where $X_j$ denotes the j-th cluster, $j = 1, 2, \dots, k$, $x_{ji}$ denotes the i-th t-dimensional data point in the j-th cluster, and $n_j$ denotes the number of t-dimensional data points in the j-th cluster.
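The use of the silhouette coefficient to choose the number of clusters can be illustrated with the sketch below. It scores candidate partitions that have already been produced, and does not implement KMeans++ itself. One assumption to note: b is computed as the mean distance to the nearest other cluster, which is the usual silhouette convention; the claim's wording could also be read as the mean over all points outside the cluster (the two coincide when there are only two clusters).

```python
import math


def silhouette_sample(point, own_others, other_clusters):
    """Return s = (b - a) / max(a, b) for one data point."""
    if not own_others:
        return 0.0  # Convention for a cluster containing a single point.
    a = sum(math.dist(point, p) for p in own_others) / len(own_others)
    b = min(sum(math.dist(point, p) for p in c) / len(c) for c in other_clusters)
    return (b - a) / max(a, b)


def mean_silhouette(clusters):
    """Average the silhouette coefficient over every point of a partition."""
    scores = []
    for ci, cluster in enumerate(clusters):
        others = [c for cj, c in enumerate(clusters) if cj != ci]
        for pi, point in enumerate(cluster):
            own_others = cluster[:pi] + cluster[pi + 1:]
            scores.append(silhouette_sample(point, own_others, others))
    return sum(scores) / len(scores)


# Two hypothetical candidate partitions of the same four points.
good = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
bad = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]
best_partition = max([good, bad], key=mean_silhouette)
```

In practice, one candidate partition would be generated per value of k, and the k whose partition gives the largest mean silhouette coefficient would be retained.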
8. The method for predicting the silicon content of the molten iron of the blast furnace based on the GA-XGboost model according to claim 7, wherein, when characteristic variables with a correlation coefficient larger than the set value are removed, the set value of the correlation coefficient is 0.9, and, when the data in each cluster are divided into a training set and a test set, the ratio of the number of data in the training set to the number of data in the test set is 7:3.
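A hedged sketch of the correlation filter and the 7:3 split follows. The claim fixes only the threshold (0.9) and the ratio; comparing the absolute value of the coefficient, dropping the later feature of a correlated pair, and shuffling before the split are all assumptions, as are the function names.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # A constant feature has no defined correlation.
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every already kept feature is at most
    the threshold. `features` maps a feature name to its list of values."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


def split_7_3(samples, seed=0):
    """Shuffle the samples and split them into 70 % training, 30 % test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]
```

The filter and the split would be applied separately to the data of each cluster, since the claims train and test one model per cluster.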
9. The method for predicting the silicon content of the molten iron of the blast furnace based on the GA-XGboost model as claimed in claim 3, wherein the collected historical smelting data of the blast furnace comprise blast temperature, blast volume, oxygen enrichment ratio, coke ratio, coal ratio, pressure difference, top pressure, top temperature, permeability index, CO content of the gas, CO2 content of the gas, SiO2 content of the slag, binary basicity of the slag, ternary basicity of the slag, quaternary basicity of the slag, utilization coefficient, sulfur content of the molten iron, blast kinetic energy, silicon content of sintered ore, silicon content of common pellets, silicon content of acid pellets, silicon content of lump ore 1, silicon content of lump ore 2, silicon content of magnesium acid pellets, and calcium carbonate content, wherein the seven characteristic parameters related to the charge proportioning are the silicon content of sintered ore, the silicon content of common pellets, the silicon content of acid pellets, the silicon content of lump ore 1, the silicon content of lump ore 2, the silicon content of magnesium acid pellets, and the calcium carbonate content.
10. The method for predicting the silicon content of the molten iron of the blast furnace based on the GA-XGboost model as claimed in claim 9, wherein the characteristic parameters of the collected data set that relate to the raw materials charged into the furnace form a sparse matrix, and, after the data in the data set are divided into different clusters, the plurality of characteristic parameters related to the charge proportioning are compressed by a PCA algorithm into a single one-dimensional feature, the silicon content of the furnace charge, so as to improve the prediction accuracy of the silicon content of the molten iron of the blast furnace.
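The compression of several charge-related features into one dimension can be sketched as a projection onto the first principal component. The version below finds that component by power iteration on the covariance matrix, which is a choice made here for a dependency-free illustration and is not described in the claims; the function name and the sample rows are hypothetical.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (direction, scores): the first principal direction and the
    one-dimensional projection of every sample onto it."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; it can fail if the start vector happens to be
    # orthogonal to the dominant eigenvector.
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical rows of three charge-related features; the third is all zero,
# mimicking a sparse column.
rows = [(1.0, 2.0, 0.0), (2.0, 4.0, 0.0), (3.0, 6.0, 0.0), (4.0, 8.0, 0.0)]
direction, charge_feature = first_principal_component(rows)
```

Here `charge_feature` plays the role of the single compressed feature; a production pipeline would more likely rely on an established PCA implementation.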
CN202210641526.2A 2022-06-07 2022-06-07 Prediction method for silicon content of molten iron in blast furnace based on GA-XGboost model Pending CN115049123A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210641526.2A CN115049123A (en) 2022-06-07 2022-06-07 Prediction method for silicon content of molten iron in blast furnace based on GA-XGboost model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210641526.2A CN115049123A (en) 2022-06-07 2022-06-07 Prediction method for silicon content of molten iron in blast furnace based on GA-XGboost model

Publications (1)

Publication Number Publication Date
CN115049123A true CN115049123A (en) 2022-09-13

Family

ID=83161561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210641526.2A Pending CN115049123A (en) 2022-06-07 2022-06-07 Prediction method for silicon content of molten iron in blast furnace based on GA-XGboost model

Country Status (1)

Country Link
CN (1) CN115049123A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115732041A (en) * 2022-12-07 2023-03-03 中国石油大学(北京) Carbon dioxide capture amount prediction model construction method, intelligent prediction method and device
CN115732041B (en) * 2022-12-07 2023-10-13 中国石油大学(北京) Carbon dioxide capture quantity prediction model construction method, intelligent prediction method and device

Similar Documents

Publication Publication Date Title
CN108764517B (en) Method, equipment and storage medium for predicting change trend of silicon content in molten iron of blast furnace
CN106022377B (en) A kind of on-line prediction method of agglomeration for iron mine bed permeability state
CN104899463B (en) The method for building up of the classification trend prediction model of blast furnace molten iron silicon content four and application
CN108469180A (en) The method for building up of sintering end point forecasting system based on big data and machine learning
CN111260157B (en) Smelting ingredient optimization method based on ecological niche optimization genetic algorithm
CN115034465B (en) Coke quality prediction method based on artificial intelligence
CN111444942B (en) Intelligent forecasting method and system for silicon content of blast furnace molten iron
CN108460213A (en) Based on the T-S models of multi-cluster prototype to the prediction technique and program of gas flowrate in bosh
CN101139661A (en) Copper flash smelting operation parameter optimization method
CN111639820A (en) Energy consumption parameter optimization method and system for cement kiln production
CN113589693B (en) Cement industrial decomposing furnace temperature model predictive control method based on neighborhood optimization
CN115049123A (en) Prediction method for silicon content of molten iron in blast furnace based on GA-XGboost model
CN114239400A (en) Multi-working-condition process self-adaptive soft measurement modeling method based on local double-weighted probability hidden variable regression model
CN107622279A (en) The sorting technique of blast furnace internal state
Li et al. Burden surface decision using MODE with TOPSIS in blast furnace ironmkaing
CN114066069A (en) Combined weight byproduct gas generation amount prediction method
CN116307149A (en) Blast furnace performance optimization method based on attention LSTM and KBNSGA
CN114004153A (en) Penetration depth prediction method based on multi-source data fusion
CN109934421B (en) Blast furnace molten iron silicon content prediction and compensation method for fluctuating furnace condition
CN111639821A (en) Cement kiln production energy consumption prediction method and system
CN111833970A (en) Construction method and application of cement clinker quality characterization parameter prediction model
CN112329269B (en) Sintering ignition temperature modeling prediction method based on working condition identification
CN112836902A (en) Coal combustion calorific capacity prediction method based on improved BP neural network
CN112861276B (en) Blast furnace burden surface optimization method based on data and knowledge dual drive
CN115186900A (en) Dynamic blast furnace gas production prediction method and system suitable for multiple working condition types

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination