CN118213000A

CN118213000A - A method for predicting drum strength based on dynamic weighted distribution adaptation network

Info

Publication number: CN118213000A
Application number: CN202410054305.4A
Authority: CN
Inventors: 严锋; 杨春节
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2024-01-15
Filing date: 2024-01-15
Publication date: 2024-06-18

Abstract

The invention discloses a drum strength prediction method based on a dynamic weighted distribution adaptation network, and belongs to the field of soft measurement (soft sensor) modeling of industrial processes. The method uses transfer learning to extract domain-invariant features of the source domain and the target domain, and uses pseudo-labels of the target domain to guide the feature extractor toward features that are highly correlated with drum strength, which improves the prediction performance of the model. First, a long short-term memory (LSTM) network extracts the features of the source domain and the target domain. Then, the marginal distribution and the conditional distribution are aligned by minimizing the distribution distance, so that features common to both domains are learned. Next, a target-driven adaptation module is constructed that maximizes the correlation between the target domain features and the labels, extracting features that are highly correlated with the target. Finally, the trained feature extractor and regressor are used to predict the test data. The method predicts drum strength under data drift, the resulting model is more consistent with actual industrial production, and the accuracy of drum strength prediction is improved.

Description

Drum strength prediction method based on dynamic weighted distribution adaptive network

Technical Field

The invention belongs to the field of soft measurement of key performance indicators in the process industry, and particularly relates to a drum strength prediction method based on a dynamic weighted distribution adaptation network.

Background

At present, long-process blast furnace ironmaking still plays the dominant role in steel manufacturing. In the blast furnace ironmaking process, sintered ore (sinter) is one of the main raw materials, and it accounts for up to 70% of the furnace charge. The quality of molten iron is closely related to the quality of the sinter, and stable, high-quality sinter production ensures that ironmaking proceeds smoothly. Therefore, as the main route for obtaining sinter, the sintering process is a key link in blast furnace ironmaking.

Sintering is one of the main methods for producing agglomerated raw materials: powdery raw materials are heated at high temperature and bonded into lumps without being completely melted. The quality of the sinter directly affects the yield, quality and energy consumption of ironmaking. According to the literature, the sintering process is the second-largest energy consumer after blast furnace ironmaking, accounting for about 10% of the energy consumption of the whole steel production process. Intelligent prediction and control of key index parameters in the sintering process is therefore important for improving sinter yield and quality and for reducing energy consumption. The sintering process involves a large number of physicochemical reactions and gives rise to many key index parameters, such as quality parameters, state parameters and energy consumption parameters. From the perspective of stable control of the sintering process, accurate prediction and intelligent control of these parameters is a focus of research in academia and industry. Drum strength is a key physical property index of the sinter and has an important influence on the quality and yield of the subsequent blast furnace ironmaking.

Because timely measurements are lacking, drum strength is difficult to estimate and predict, which makes accurate control difficult. Gao et al. constructed a neural network for drum strength prediction using principal component analysis and a genetic algorithm. To address the scarcity of labels, Chen et al. proposed a semi-supervised learning system for predicting drum strength. The system comprises three parts: a Gaussian mixture model divides the operating conditions, a just-in-time learning method selects the operating condition nearest to the sample, and a semi-supervised least squares method learns from the few labeled samples. The physical and chemical reactions during sintering are difficult to describe quantitatively, so building an accurate kinetic model is highly complex. Furthermore, purely data-driven methods do not predict drum strength well because the data obtained from the sintering site contain limited features. To address this problem, Ye et al. proposed a mechanism and data fusion approach for drum strength prediction that uses a local thermal non-equilibrium (LTNE) mechanism model to capture the uncertainty of the thermochemical reaction equations of the sintering process. These methods have advanced the sintering field, but the problem of scarce labels for sinter quality indexes still requires further study.

Analysis of the drum strength prediction problem reveals two difficulties: (1) The sintering process is time-varying and has multiple operating conditions, so the data distribution drifts and the training set and the test set are not independent and identically distributed (i.i.d.). Conventional deep learning models are built on the i.i.d. assumption and cannot handle the distribution differences caused by data drift. (2) Testing drum strength is costly, so labeled samples are scarce, which makes training a deep learning model challenging. Conventional deep learning models require a large amount of labeled data and often train poorly when labels are scarce.

Disclosure of Invention

To address the difficulty of measuring drum strength under data drift, the invention provides an intelligent method for detecting sinter drum strength based on a dynamic weighted joint distribution adaptation network. The method mainly comprises four steps. First, a long short-term memory (LSTM) network extracts the features of the source domain and the target domain separately. Second, the marginal distribution distance and the conditional distribution distance between the source domain and the target domain are calculated, and the two distances are summed using a dynamic weighting method. Third, a target-driven adaptation module is provided, which calculates the sum of the linear and nonlinear correlations between the pseudo-labels and the features in the target domain and maximizes that correlation. Finally, the joint distribution distance and the correlation function of the target-driven module are embedded into the loss function, so that the network learns domain-invariant features and target-correlated features. The method was verified on real data from an actual sintering plant, and the results show that it is more accurate than the compared methods.

The invention is realized by adopting the following technical scheme:

1) Selecting auxiliary variables related to the drum strength of the sinter, extracting time slices from historical data by a sliding window method, and constructing a source domain data set X_s and a target domain data set X_t from different time periods;

2) Extracting the features of the source domain and the target domain respectively with a long short-term memory (LSTM) network, and then using the Wasserstein distance (earth mover's distance) to calculate the distance between the marginal distributions and the distance between the conditional distributions of the source domain and the target domain;

3) Summing the two distribution distances using a dynamic weighting method, constructing a distribution loss function, and minimizing the distance between the distributions;

4) Constructing a target-driven adaptation network module, calculating the correlation between the extracted features and the target domain labels using the Pearson coefficient and the Spearman coefficient, extracting the features highly correlated with drum strength, and constructing a target correlation loss function;

constructing a comprehensive loss function based on the distribution loss function and the target correlation loss function; forming a dynamic weighted distribution adaptation network model from the LSTM network and the target-driven adaptation network module, and training the constructed model;

5) Deploying the trained dynamic weighted distribution adaptation network model to an actual sintering site to predict the drum strength in real time.
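Step 1) above can be illustrated with a minimal sketch. The patent does not state the window width, the stride, or how the split between time periods is chosen, so `width`, `stride` and `split_index` below are hypothetical parameters, and the toy record stands in for the real auxiliary variables.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]   # one time step: values of the auxiliary variables
Window = List[Row]      # one time slice of `width` consecutive steps


def sliding_windows(rows: Sequence[Row], width: int, stride: int = 1) -> List[Window]:
    """Cut a chronologically ordered record into overlapping time slices."""
    if width <= 0 or stride <= 0:
        raise ValueError("width and stride must be positive")
    last_start = len(rows) - width
    return [list(rows[i:i + width]) for i in range(0, last_start + 1, stride)]


def split_domains(rows: Sequence[Row], split_index: int, width: int,
                  stride: int = 1) -> Tuple[List[Window], List[Window]]:
    """Window the earlier period as the source domain, the later one as the target domain."""
    x_s = sliding_windows(rows[:split_index], width, stride)
    x_t = sliding_windows(rows[split_index:], width, stride)
    return x_s, x_t


# Toy record: 10 time steps, 2 auxiliary variables per step (illustrative values only).
record = [(float(t), float(t) * 2.0) for t in range(10)]
x_s, x_t = split_domains(record, split_index=6, width=3)
print(len(x_s), len(x_t))  # 4 2
```

Windowing each period separately, as done here, keeps any single time slice from straddling the boundary between the source and target periods; whether the patent does the same is not stated.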

The invention has the beneficial effects that:

1. The invention uses a transfer learning model to build an intelligent drum strength detection method in which feature distributions are aligned under data drift. Most previous studies use conventional deep models, which require the training set and the test set to satisfy the independent and identically distributed assumption and cannot handle the effects of data distribution drift. In the actual sintering process, the training set and the test set follow different distributions under data drift. By combining the model with transfer learning, the invention provides a joint distribution adaptation network that predicts drum strength under data drift, and the resulting model is more consistent with actual industrial production.

2. The invention uses a dynamic balance factor to characterize the relative importance of the marginal distribution and the conditional distribution. The prior art sums the two distributions directly, whereas the relative importance of the marginal distribution and the conditional distribution changes dynamically during the sintering process.

3. The invention provides a target-driven adaptation module for learning the correlation between the target domain features and the labels. The prior art generally aligns features directly using the joint distribution, which lets the target domain features drift away from the target domain labels and weakens the expressive power of the features. The invention calculates the correlation between the target domain features and the labels using the Pearson correlation coefficient and the Spearman correlation coefficient, so that the model learns features highly correlated with the target domain labels and the prediction accuracy of drum strength is improved. The method can also serve as an approach to accurate intelligent prediction of key indexes in other industrial processes, improving the stability and safety of industrial production.

Drawings

FIG. 1 is a schematic diagram of a sinter drum strength prediction method based on a dynamic weighted joint distribution adaptation network;

FIG. 2 is a schematic diagram of the target-driven adaptation module;

FIG. 3 is a diagram of data distribution drift;

FIG. 4 is a diagram of the original feature distribution;

FIG. 5 is a schematic diagram of the feature distribution after transfer;

FIG. 6 is a graph showing a comparison of drum strength prediction results.

Detailed Description

The invention will be described in further detail below with reference to the drawings and examples, it being noted that the examples described below are intended to facilitate an understanding of the invention and are not intended to limit the invention in any way.

FIG. 1 provides steps of a drum strength prediction method based on a dynamic weighted joint distribution adaptation network, specifically including:

1) Auxiliary variables related to the drum strength of the sinter are selected, time slices are extracted from historical data by a sliding window method, and a source domain data set X_s and a target domain data set X_t are constructed from different time periods.

2) The features of the source domain and the target domain are then extracted respectively using a long short-term memory (LSTM) network, and the Wasserstein distance (earth mover's distance) is used to calculate the distance between the marginal distributions and the distance between the conditional distributions of the source domain and the target domain. The distances are calculated as follows: first, the source domain data X_s and the target domain data X_t are each input to the LSTM feature extractor φ to obtain the source domain features Z_s and the target domain features Z_t; the marginal distribution distance is then:

d_WD(D_s,D_t)＝||Z_s-Z_t||² (2)

where d_WD(D_s,D_t) represents the Wasserstein distance between the two marginal distributions, and D_s and D_t represent the data distributions of the source domain and the target domain, respectively. The conditional distribution distance is defined in terms of the following quantities:

d_CWD(D_s,D_t) represents the Wasserstein distance between the two conditional distributions; C represents the number of equal-width intervals into which the pseudo-labels on the target domain are divided, where the pseudo-labels are obtained by feeding the target domain features extracted by the LSTM network to a fully connected layer; Z_s(c) and Z_t(c) represent the features belonging to the c-th interval in the source domain and the target domain, respectively; and the two sample counts in the formula are the numbers of samples belonging to the c-th interval in the source domain and the target domain, respectively.
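The two distances can be sketched as follows. This is only an illustration under stated assumptions, not the patent's exact computation: the conditional distance formula is not recoverable from the text, so the sketch assumes that both distances compare mean feature vectors, that interval edges are taken from the range of the target pseudo-labels, and that source samples are assigned to intervals by their true labels. All function names are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(features: Sequence[Vector]) -> List[float]:
    """Column-wise mean of a non-empty list of feature vectors."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance ||a - b||^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def marginal_distance(z_s: Sequence[Vector], z_t: Sequence[Vector]) -> float:
    """Assumed reading of formula (2): distance between the domain mean features."""
    return squared_distance(mean_vector(z_s), mean_vector(z_t))


def interval_index(label: float, low: float, high: float, c: int) -> int:
    """Index (0..c-1) of the equal-width interval of [low, high] that holds label."""
    if high == low:
        return 0
    raw = int((label - low) / (high - low) * c)
    return min(max(raw, 0), c - 1)  # clamp labels on or outside the edges


def conditional_distance(z_s: Sequence[Vector], y_s: Sequence[float],
                         z_t: Sequence[Vector], y_t_pseudo: Sequence[float],
                         c: int) -> float:
    """Sum of per-interval distances between source and target mean features (assumed form)."""
    low, high = min(y_t_pseudo), max(y_t_pseudo)
    total = 0.0
    for k in range(c):
        s_k = [z for z, y in zip(z_s, y_s) if interval_index(y, low, high, c) == k]
        t_k = [z for z, y in zip(z_t, y_t_pseudo) if interval_index(y, low, high, c) == k]
        if s_k and t_k:  # skip intervals that are empty in either domain
            total += squared_distance(mean_vector(s_k), mean_vector(t_k))
    return total


z_s, y_s = [[0.0, 0.0], [2.0, 2.0]], [0.0, 1.0]
z_t, y_t = [[1.0, 1.0], [3.0, 3.0]], [0.0, 1.0]
print(marginal_distance(z_s, z_t))                  # 2.0
print(conditional_distance(z_s, y_s, z_t, y_t, 2))  # 4.0
```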

3) The two distribution distances are summed using a dynamic weighting method, a distribution loss function is constructed, and the distance between the distributions is minimized. The marginal distribution and the conditional distribution are dynamically weighted by a dynamic balance factor α. The initial value α₀ of the dynamic balance factor is first determined from the relative distance between the two distributions; the specific formula is as follows:

α₀ represents the initial value of the dynamic balance factor. The update strategy for the dynamic balance factor, based on incremental learning, is as follows: if the marginal distribution distance after iteration n+1 is larger than that after iteration n, a penalty factor is used to increase the weight coefficient of the marginal distribution, so that the dynamic weighted distribution adaptation network model learns the marginal distribution with a larger weight. The specific formula is as follows:

where the formula includes a penalty factor, σ represents the hyperbolic sine activation function, and α_(n+1) and α_n represent the dynamic balance factors of iterations n+1 and n, respectively. The final dynamically weighted joint distribution distance can then be expressed as:

d_joint(D_s,D_t)＝αd_WD(D_s,D_t)+(1-α)d_CWD(D_s,D_t) (7)

where α represents the dynamic balance factor and d_joint(D_s,D_t) represents the joint distribution distance between the source domain and the target domain.
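Formula (7) and the qualitative update rule can be illustrated with a short sketch. `joint_distance` follows formula (7) directly. `update_alpha` is only a placeholder: the patent's actual update formula (with its penalty factor and the activation σ) is not recoverable from the text, so the additive step and the `penalty` value used here are assumptions that merely reproduce the stated direction of the update.

```python
def joint_distance(d_wd: float, d_cwd: float, alpha: float) -> float:
    """Formula (7): d_joint = alpha * d_WD + (1 - alpha) * d_CWD."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * d_wd + (1.0 - alpha) * d_cwd


def update_alpha(alpha: float, d_wd_prev: float, d_wd_curr: float,
                 penalty: float = 0.25) -> float:
    """Placeholder update (assumption): raise alpha when the marginal distance grew.

    The source only states that the marginal weight is increased by a penalty
    factor in this case; the additive form below is hypothetical.
    """
    if d_wd_curr > d_wd_prev:
        return min(1.0, alpha + penalty)
    return alpha


alpha = 0.5
print(joint_distance(2.0, 4.0, alpha))  # 3.0
alpha = update_alpha(alpha, d_wd_prev=1.0, d_wd_curr=1.5)
print(alpha)                            # 0.75
print(joint_distance(2.0, 4.0, alpha))  # 2.5
```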

4) A target-driven adaptation network module is constructed, the correlation between the extracted features and the target domain labels is calculated using the Pearson coefficient and the Spearman coefficient, the features highly correlated with drum strength are extracted, and a target correlation loss function is constructed.

As shown in FIG. 2, the target-driven adaptation network module calculates the correlation as follows. First, the linear correlation between each feature and the target domain label is calculated using the Pearson correlation coefficient, namely:

The formula involves the following quantities: the d-th dimensional feature of the i-th sample in the target domain; the mean value of the d-th dimensional feature in the target domain; the pseudo-label of the i-th sample in the target domain; the average of all pseudo-labels in the target domain; and the Pearson correlation coefficient between the d-th dimensional feature and the label in the target domain. N_t represents the number of samples in the target domain.

Then, the nonlinear correlation between each feature and the target domain label is calculated using the Spearman correlation coefficient, namely:

where c_j(d) represents the rank difference of the j-th sample, that is, the difference between the rank of the d-th dimensional feature of the j-th sample and the rank of the pseudo-label of the j-th sample in the target domain; the result is the Spearman correlation coefficient between the d-th dimensional feature and the label.

Then the sum of the linear and nonlinear correlations of all features can be expressed as:

where ρ_p and ρ_s represent the absolute values of the linear and nonlinear correlations of all features, respectively, and ρ_tar represents the sum of the absolute values of the linear and nonlinear correlations.
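The correlation step can be sketched with the textbook forms of the two coefficients. The Pearson function is the standard definition, and the Spearman function uses the classic rank-difference formula 1 - 6·Σc_j² / (n(n² - 1)), which matches the description of c_j(d) as a rank difference but assumes there are no ties. Summing the absolute values over all feature dimensions is an assumed reading of ρ_tar; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Standard Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def ranks(values: Sequence[float]) -> List[int]:
    """Ordinal ranks starting at 1 (ties are not averaged in this sketch)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Rank-difference form of the Spearman coefficient (valid without ties)."""
    n = len(x)
    if n < 2:
        return 0.0
    diff_sq = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * diff_sq / (n * (n * n - 1))


def target_correlation(z_t: Sequence[Sequence[float]], y_pseudo: Sequence[float]) -> float:
    """rho_tar: summed |Pearson| plus summed |Spearman| over feature dimensions (assumed)."""
    columns = list(zip(*z_t))
    rho_p = sum(abs(pearson(col, y_pseudo)) for col in columns)
    rho_s = sum(abs(spearman(col, y_pseudo)) for col in columns)
    return rho_p + rho_s


print(spearman([1.0, 2.0, 3.0], [1.0, 4.0, 9.0]))  # 1.0 (monotonic but not linear)
print(spearman([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))  # -1.0
```

The first printed case shows why the two coefficients are combined: a monotonic but nonlinear relationship gives a Spearman coefficient of exactly 1 while the Pearson coefficient stays below 1.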

5) A comprehensive loss function is constructed from the distribution loss function and the target correlation loss function, so that features highly correlated with drum strength are extracted. The comprehensive loss function comprises three parts: the regression loss function, the joint distribution loss function, and the target adaptation loss function. The regression loss function is calculated as follows:

where Y_k represents the true label of the k-th sample in the source domain, the other label term in the formula is the predicted label of the k-th sample in the source domain, and N represents the number of source domain samples.

The joint distribution loss function is calculated as follows:

The target adaptation loss function is calculated as follows:

The comprehensive loss function is calculated as follows:

where λ₁ and λ₂ are two hyperparameters.
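How the three parts might combine can be sketched as follows. None of the loss formulas survive in the text, so every form here is an assumption: the regression loss is taken to be the mean squared error over source samples, the joint distribution loss is taken to be d_joint itself, the target adaptation loss is taken to be the negative of ρ_tar (so that minimizing the loss maximizes the correlation), and λ₁ and λ₂ are assumed to weight the second and third terms.

```python
from typing import Sequence


def regression_loss(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Assumed form: mean squared error over the N source domain samples."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def comprehensive_loss(y_true: Sequence[float], y_pred: Sequence[float],
                       d_joint: float, rho_tar: float,
                       lambda_1: float, lambda_2: float) -> float:
    """Assumed combination: L_reg + lambda_1 * L_joint + lambda_2 * L_tar."""
    loss_reg = regression_loss(y_true, y_pred)
    loss_joint = d_joint   # assumption: the joint distance is used directly
    loss_tar = -rho_tar    # assumption: maximizing correlation = minimizing its negative
    return loss_reg + lambda_1 * loss_joint + lambda_2 * loss_tar


# Toy numbers, not taken from the patent.
loss = comprehensive_loss(y_true=[1.0, 2.0], y_pred=[1.0, 4.0],
                          d_joint=3.0, rho_tar=1.5, lambda_1=0.5, lambda_2=1.0)
print(loss)  # 2.0
```

With these assumed signs, a smaller distribution distance and a larger target correlation both lower the total loss, which matches the stated goals of minimizing the distribution distance and maximizing the feature-label correlation.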

6) The LSTM network and the target-driven adaptation network module together form the dynamic weighted distribution adaptation network model, which is then trained; the trained model is deployed to an actual industrial site for intelligent prediction of drum strength.

In this embodiment, the accuracy of the method is verified. The constructed dynamic weighted distribution adaptation network model is trained on real data from a sintering plant, and the trained model is deployed to an actual industrial site for offline prediction accuracy analysis and online intelligent prediction of drum strength.

The invention is further described below in connection with specific examples.

(1) Sintering process description and auxiliary variable determination

The sintering machine used in the experiment is a belt-type sintering machine with 24 wind boxes, and it is divided into a north side and a south side (for example, the south side of wind box No. 1 and the north side of wind box No. 1). The process comprises five steps: proportioning, mixing, ignition, draft sintering, and cooling and screening. First, the various iron ore powders undergo primary proportioning (pre-proportioning) to form a neutralized powder with uniform composition and relatively high iron content; the neutralized powder, sinter return fines, coke powder and flux then undergo secondary proportioning to form the sinter batch. The mixture is formed by adding water twice, mixing and granulating, and is conveyed by belt conveyor to the mixture bin. As the sintering trolleys move along the track toward the tail of the sintering machine, a nine-roller distributor spreads the mixture evenly onto them. When a trolley passes the igniter, the surface of the mixture ignites. At the same time, the exhaust fan below the trolleys draws air downward so that the mixture burns from top to bottom. By the time a trolley reaches the tail, the mixture has burned through and formed sinter. The sinter falls off at the tail of the machine and is crushed, screened and cooled; the qualified product is sent to the blast furnace, and the remaining unqualified sinter is reused as return fines and hearth layer (bedding) material. The drum strength of the sinter directly affects the quality and yield of the subsequent blast furnace ironmaking, so intelligent prediction of drum strength is necessary. Because testing drum strength is costly, a data-driven soft measurement method is used to predict it.

Through the proprietary numerical correlation analysis, 23 auxiliary variables, such as the temperature and negative pressure (south and north) of wind box No. 1, were selected as inputs to the model, as shown in Table 1.

TABLE 1 sintering process input and output variables

(2) Analysis of drift characteristics of data distribution during sintering

The sintering process involves many process variables, such as bed thickness, ignition temperature and trolley speed; the trolley speed of the sintering machine is taken as an example to analyze the data distribution drift. As shown in FIG. 3, the distribution of the process variable differs markedly between the training set and the test set, which poses a challenge for conventional machine learning models, because such models are built on the assumption that the data are independent and identically distributed (i.i.d.). Owing to fluctuations in raw materials and changes in operating conditions during sintering, the working conditions fluctuate in a complex way, so the training set and the test set often do not satisfy the i.i.d. assumption and an obvious distribution drift arises. A model trained on the training set fits well only data that follow the training distribution; it cannot predict effectively on an out-of-distribution test set, and its prediction accuracy drops sharply. It is therefore necessary to model the data distribution differences with a transfer learning method. The method mainly uses a distance minimization function to reduce the distribution difference between the source domain and the target domain, improving the accuracy of the model under data distribution drift.

(3) Data set construction and experimental setup

To verify the accuracy of the method of the invention, sinter data were collected from a sintering plant in South China. According to the distribution of the sintering data, the data from September 2021 to October 2022 are selected as the training set (source domain), and the data from November 2022 to 2023 are selected as the test set (target domain). After data preprocessing and sliding window division, there are 3000 source domain samples and 1000 target domain samples. The model results are measured using three evaluation indices: root mean square error (RMSE), mean absolute error (MAE) and hit rate (HR, with e=0.03). In these indices, the predicted value of the i-th sample is compared with its true value y⁽ⁱ⁾, and e represents the relative percentage error between the true and predicted values. To reduce the effect of chance, the evaluation index reported for each model is the average of 10 runs.
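The three evaluation indices can be sketched as follows. RMSE and MAE are the standard definitions. The hit rate formula is not recoverable from the text, so the sketch assumes that HR is the fraction of samples whose relative error |ŷ - y| / |y| does not exceed e; the sample values are illustrative, not results from the patent.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def hit_rate(y_true: Sequence[float], y_pred: Sequence[float], e: float = 0.03) -> float:
    """Assumed definition: share of samples with relative error <= e.

    True values must be nonzero, which holds for drum strength readings.
    """
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(p - t) / abs(t) <= e)
    return hits / len(y_true)


# Illustrative values only.
y_true = [80.0, 80.0, 80.0, 80.0]
y_pred = [80.0, 82.0, 84.0, 76.0]
print(rmse(y_true, y_pred))      # 3.0
print(mae(y_true, y_pred))       # 2.5
print(hit_rate(y_true, y_pred))  # 0.5
```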

(4) Offline model creation and comparison

To demonstrate the superiority of the dynamically weighted joint distribution adaptation network model (TDWJDA) established by the invention, one typical machine learning model and three transfer learning models are selected for comparison: the variable-wise weighted stacked autoencoder (VWSAE), transfer component analysis (TCA), joint distribution adaptation (JDA) and balanced distribution adaptation (BDA). Table 2 shows the comparison results of the different prediction models, with the performance of the proposed TDWJDA in bold. Overall, the proposed TDWJDA model achieves the best performance on all evaluation metrics, with RMSE, MAE and HR values of 0.3979, 0.3097 and 94.20%, respectively. The conventional machine learning model (VWSAE) performs worst because it works well only for independent and identically distributed data and cannot learn features under a shift in data distribution. In contrast, TCA reduces the marginal distribution difference between the source domain and the target domain through a distance function, which improves the feature learning capability of the model. However, TCA cannot characterize the conditional distributions of the source and target domains, and in practice the conditional distribution also drifts during sintering. The JDA model takes the differences in both the marginal distribution and the conditional distribution into account and therefore performs better. However, the importance of the marginal distribution and the conditional distribution changes dynamically over time during sintering, while JDA simply adds the two distributions and ignores their difference in importance.

The BDA model balances the importance of the marginal distribution and the conditional distribution with a dynamic balance factor and thus exceeds JDA in accuracy. However, BDA does not consider the correlation between the target domain features and the labels, so the extracted target domain features gradually deviate from the target domain and many redundant features are learned. The TDWJDA provided by the invention uses the target-driven adaptation module to maximize the correlation between the target domain features and the labels, which reduces the gap between them, refines the target domain features and improves the prediction performance of the model. TDWJDA therefore surpasses all the comparison models and greatly improves the prediction accuracy of drum strength.

To increase the interpretability of the TDWJDA model, the invention visualizes the feature distribution before and after transfer, as shown in FIGS. 4 and 5. Comparing the two figures shows that the distributions of the original features differ considerably, whereas after transfer learning the features of the source domain and the target domain follow a consistent distribution, so that the independent and identically distributed assumption is satisfied. In addition, the distance between the center points of the feature distributions is greatly reduced after transfer, which shows that the TDWJDA model does reduce the distribution distance between the source domain and the target domain and thereby improves the prediction accuracy.

Table 2 model prediction results comparison

(5) Online model results

Finally, the model established by the method was tested online on an actual sintering machine. FIG. 6 shows the curves of the actual and predicted values of sinter drum strength. The fitting ability of the TDWJDA model of the invention exceeds that of the other comparison models. The close agreement between the true value curve and the predicted value curve in FIG. 6 shows that the proposed TDWJDA model can predict drum strength accurately and thus meets the requirements of practical engineering applications.

The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of the invention should be assessed as that of the appended claims.

Claims

1. The drum strength prediction method based on the dynamic weighted distribution adaptation network is characterized by comprising the following steps of:

2. The method for predicting drum strength of a dynamic weighted distribution adaptation network according to claim 1, wherein in step 2), the marginal distribution distance and the conditional distribution distance are calculated as follows: first, the source domain data X_s and the target domain data X_t are each input to a long short-term memory (LSTM) network feature extractor φ to obtain a source domain feature Z_s and a target domain feature Z_t, and the marginal distribution distance is:

d_WD(D_s,D_t)＝||Z_s-Z_t||² (2)

where d_WD(D_s,D_t) represents the Wasserstein distance (earth mover's distance) between the two marginal distributions, and D_s and D_t represent the data distributions of the source domain and the target domain, respectively;

the conditional distribution distance calculation formula is as follows:

3. The method for predicting the drum strength of a dynamic weighted distribution adaptive network according to claim 1, wherein the step 3) is:

the marginal distribution and the conditional distribution are dynamically weighted by a dynamic balance factor α; first, an initial value α₀ of the dynamic balance factor is determined from the relative distance between the two distributions, and the specific formula is as follows:

the update strategy for the dynamic balance factor, based on incremental learning, is as follows: if the marginal distribution distance after iteration n+1 is larger than that after iteration n, a penalty factor is used to increase the weight coefficient of the marginal distribution, so that the dynamic weighted distribution adaptation network model learns the marginal distribution with a larger weight, and the specific formula is as follows:

where the formula includes a penalty factor, σ represents the hyperbolic sine activation function, and α_(n+1) and α_n represent the dynamic balance factors of iterations n+1 and n, respectively; the final dynamically weighted joint distribution distance is then expressed as:

d_joint(D_s,D_t)＝αd_WD(D_s,D_t)+(1-α)d_CWD(D_s,D_t) (7)

4. The method for predicting drum strength of a dynamic weighted distribution adaptation network according to claim 1, wherein in step 4), the step of calculating the correlation by the target drive adaptation network module is as follows:

First, the linear correlation between each feature and the target domain label is calculated using pearson correlation coefficients, namely:

wherein the formula involves the following quantities: the d-th dimensional feature of the i-th sample in the target domain; the mean value of the d-th dimensional feature in the target domain; the pseudo-label of the i-th sample in the target domain; the average of all pseudo-labels in the target domain; and the Pearson correlation coefficient between the d-th dimensional feature and the label in the target domain; N_t represents the number of samples in the target domain;

then, the nonlinear correlation between each feature and the target domain label is calculated using the Spearman correlation coefficient, wherein c_j(d) represents the rank difference of the j-th sample, that is, the difference between the rank of the d-th dimensional feature of the j-th sample and the rank of the pseudo-label of the j-th sample in the target domain, and the result is the Spearman correlation coefficient between the d-th dimensional feature and the label;

then the sum of the linear and nonlinear correlations of all features is expressed as:

where ρ_p and ρ_s represent the absolute values of the linear and nonlinear correlations of all features, respectively, and ρ_tar represents the sum of the correlation absolute values.

5. The method for predicting drum strength of a dynamically weighted distribution adaptation network as recited in claim 4, wherein in step 4), the comprehensive loss function comprises three parts: a regression loss function, a joint distribution loss function and a target adaptation loss function;

the comprehensive loss function is calculated as follows:

where λ₁ and λ₂ are two hyperparameters.

6. The method for predicting drum strength of a dynamically weighted distribution adaptation network of claim 5, wherein the regression loss function is calculated as follows:

where Y_k represents the true label of the k-th sample in the source domain, the other label term in the formula is the predicted label of the k-th sample in the source domain, and N represents the number of source domain samples;

the joint distribution loss function is calculated as follows:

where d_joint(D_s,D_t) represents the joint distribution distance between the source domain and the target domain;

the target adaptation loss function is calculated as follows:

7. The method for predicting the drum strength of a dynamic weighted distribution adaptation network according to claim 1, wherein the auxiliary variables comprise a middling proportion, a pulverized coal proportion, a limestone proportion, a material thickness, a sintering machine speed, an ignition temperature, an ignition strength, a large flue negative pressure, a circular cooler speed, a front air box negative pressure, a front air box temperature, a middle air box temperature, a tail air box temperature, and a drum strength at a previous moment.