CN113102516A

CN113102516A - Hot continuous rolling strip steel head width prediction method integrating rolling mechanism and deep learning

Info

Publication number: CN113102516A
Application number: CN202110243168.5A
Authority: CN
Inventors: 李旭; 何垚东; 栾峰; 曹雷; 陈丰; 马冰冰; 高坤; 霍利峰; 张殿华; 丁敬国; 韩月娇
Original assignee: Northeastern University China
Current assignee: Northeastern University China
Priority date: 2021-03-05
Filing date: 2021-03-05
Publication date: 2021-07-13
Anticipated expiration: 2041-03-05
Also published as: CN113102516B

Abstract

The invention provides a method for predicting the head width of hot continuous rolled strip steel that integrates a rolling mechanism model with deep learning. Production data are first acquired from the hot continuous rolling site, and outliers are removed with the Pauta criterion to obtain sample data. Influence factor data are then screened according to the factors that affect rolling spread. A rolling mechanism prediction model is constructed for each stand and used to calculate a prediction reference value of the strip head width from the influence factor data, while a deep belief neural network model predicts a correction value of the strip head width. Finally, the prediction reference value and the prediction correction value are added to obtain the final predicted width of the strip head at the exit measuring position. By combining the rolling mechanism with a deep belief neural network, the method addresses the low prediction accuracy of models based on a traditional single-hidden-layer neural network and their tendency to fall into local extrema, and it provides a good basis for optimizing the setup models of the process automation level.

Description

Hot continuous rolling strip steel head width prediction method integrating rolling mechanism and deep learning

Technical Field

The invention relates to the technical field of automatic control of steel rolling, and in particular to a method for predicting the head width of hot continuous rolled strip steel that integrates a rolling mechanism with deep learning.

Background

Width accuracy is one of the most important dimensional indexes in strip steel production. Although most width control means on current hot continuous rolling lines are concentrated in the rough rolling area, the setup of the rough rolling width model is affected by the width change in the finish rolling area. When the setup model parameters of the process automation level are adjusted, an accurate prediction of the head width of the rolled piece after finish rolling can guide the parameter adjustment and provide a basis for correcting the width setup model.

Because of the nonlinear interactions and dynamic coupling in steel production, predicting the width is very complicated. A width prediction model built on the rolling mechanism conforms to general rolling laws, but its derivation inevitably involves simplifications and approximations, and mechanism-based modeling ignores many field factors. The model therefore deviates from actual production conditions, so a prediction that relies on the rolling mechanism alone has a relatively large width error and cannot meet increasingly strict rolling accuracy requirements.

With the development of intelligent technology, several width prediction methods based on rolling data and neural networks have appeared in recent years. Although these methods improve accuracy, the black-box nature of neural networks makes such predictions poorly interpretable and less reliable. In addition, the commonly used single-hidden-layer structure still has insufficient prediction accuracy, weak generalization capability, and a tendency to fall into local extrema. To overcome these problems, a modeling approach that integrates the rolling mechanism with a neural network is adopted: the rolling mechanism model predicts a width reference value, the neural network model predicts a width correction value, and the neural network is a deep belief network from deep learning.

Disclosure of Invention

To address the deficiencies of the prior art, the invention provides a method for predicting the head width of hot continuous rolled strip steel that integrates a rolling mechanism with deep learning, comprising the following steps:

Step 1: Acquire production data at the same measuring position for M different strip heads at the hot continuous rolling site, where each strip head corresponds to one group of production process data, and the production data comprise every type of measurement data detected by the instruments installed on the hot continuous rolling line and every type of parameter data in the rolling schedule data issued by the process automation level;

Step 2: Remove outlier data from the production data using the Pauta criterion to obtain m sample data;

Step 3: Assign all data of the type that represents the strip head width in the sample data to a reference sequence, and assign each remaining type of data to a comparison sequence;

Step 4: Screen the data in the comparison sequence according to the factors that influence rolling spread to obtain N groups of influence factor data that affect the strip head width;

Step 5: Construct a rolling mechanism prediction model for each stand, and calculate a prediction reference value of the strip head width from the influence factor data;

Step 6: Subtract the prediction reference value of each strip head width from the corresponding strip head width data in the reference sequence to obtain prediction deviation data;

Step 7: Eliminate the dimensional differences among the types of influence factor data using min-max standardization to obtain standardized data;

Step 8: Take the standardized influence factor data as the input of the deep belief neural network model and the prediction deviation data as its output, and train the model to obtain a deep belief neural network model with optimal parameters;

Step 9: Apply the deep belief neural network model with optimal parameters to the production data of the strip head to be predicted at the exit measuring position to obtain a prediction correction value of the strip head width;

Step 10: Add the prediction reference value and the prediction correction value of the strip head width to obtain the final predicted value of the strip head width at the exit measuring position.
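
Steps 6 and 10 amount to residual learning: the network is trained on the difference between the measured width and the mechanism-based reference value, and its output is added back to the reference value at prediction time. A minimal Python sketch of that bookkeeping follows; the function names and the numeric values are illustrative assumptions, not taken from the patent.

```python
import numpy as np


def deviation_targets(measured_width, reference_width):
    """Step 6: training target = measured head width - mechanism reference value."""
    return np.asarray(measured_width, dtype=float) - np.asarray(reference_width, dtype=float)


def final_prediction(reference_width, predicted_correction):
    """Step 10: final width = mechanism reference value + network correction value."""
    return np.asarray(reference_width, dtype=float) + np.asarray(predicted_correction, dtype=float)


# Hypothetical widths in mm, for illustration only.
measured = [711.2, 709.8, 710.5]
reference = [710.0, 710.0, 710.0]
targets = deviation_targets(measured, reference)
```

A perfect correction model would return `targets` unchanged, in which case `final_prediction(reference, targets)` reproduces the measured widths.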

Step 2 comprises the following:

Taking the Pauta criterion of formula (1) as the screening criterion, data that satisfy the criterion are judged to be outlier data and removed:

$$\left| y_i - \bar{y} \right| > 3 S_y \tag{1}$$

where $y_i$ denotes the values in the production data that characterize the strip head exit width, $i = 1, 2, 3, \ldots, M$; $\bar{y}$ is the mean of $y_i$; and $S_y$ is the standard deviation of $y_i$.
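
The Pauta (three-sigma) screening of step 2 can be sketched in a few lines of Python. This is a minimal illustration that assumes the sample standard deviation (`ddof=1`) is meant and that the check is applied once rather than iteratively; the data are synthetic.

```python
import numpy as np


def pauta_keep_mask(y):
    """Return True for values kept, False for values flagged by |y_i - mean| > 3 * S_y."""
    y = np.asarray(y, dtype=float)
    mean = y.mean()
    s_y = y.std(ddof=1)  # assumption: sample standard deviation
    return np.abs(y - mean) <= 3.0 * s_y


# Synthetic exit widths in mm: 30 values near 710 mm plus one gross outlier.
widths = np.array([710.0 + 0.1 * (i % 5) for i in range(30)] + [760.0])
keep = pauta_keep_mask(widths)
clean_widths = widths[keep]
```

Note that the outlier itself inflates both the mean and the standard deviation, so the criterion is only effective when the sample is reasonably large.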

Step 5 comprises the following:

Step 5.1: From the stand exit thickness and the second flow (constant mass flow) equation, back-calculate the stand entry thickness $h_0$ using formula (2):

$$h_0 = \frac{h_1 v_1}{v_0} \tag{2}$$

where $h_1$ denotes the stand exit thickness, $v_0$ denotes the stand entry speed, and $v_1$ denotes the stand exit speed;

Step 5.2: Calculate the contact length $l_c$ of the deformation zone using formula (3):

$$l_c = \sqrt{R\,(h_0 - h_1)} \tag{3}$$

where $R$ denotes the roll radius;

Step 5.3: Calculate the spread coefficient $S_B$ using Hill's formula, formula (4), in which $b_0$ denotes the stand entry width and $C$ is a constant;

Step 5.4: Calculate the width spread $D_B$ of flat-roll rolling in the finish rolling area using formula (5);

Step 5.5: Calculate the stand exit width $b_1$ using formula (6):

$$b_1 = b_0 + D_B \tag{6}$$

Further, for a multi-stand hot continuous rolling finishing mill, the exit width output by the rolling mechanism prediction model of one stand is taken as the entry width of the next stand, and the calculation proceeds stand by stand in the running direction of the production line until the exit width of the last stand is obtained as the prediction reference value of the strip head width;
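
The stand-by-stand calculation of step 5 can be sketched as below. The expressions for formulas (4) and (5) are not reproduced in this text, so the sketch substitutes a commonly quoted Hill-type form, $S_B = C\,e^{-k\,b_0/l_c}$ with $S_B = \ln(b_1/b_0)/\ln(h_0/h_1)$, which gives $D_B = b_0\left[(h_0/h_1)^{S_B} - 1\right]$. That form, the coefficient `k`, and all numeric inputs are assumptions made for illustration and may differ from the patent's own formulas.

```python
import math


def contact_length(R, h0, h1):
    """Projected contact length of the deformation zone, l_c = sqrt(R * (h0 - h1))."""
    return math.sqrt(R * (h0 - h1))


def spread_coefficient(b0, l_c, C=0.5, k=0.525):
    # ASSUMPTION: Hill-type form S_B = C * exp(-k * b0 / l_c); k is hypothetical.
    return C * math.exp(-k * b0 / l_c)


def stand_exit_width(b0, h0, h1, R, C=0.5, k=0.525):
    """One stand: entry width b0 -> exit width b1 = b0 + D_B."""
    l_c = contact_length(R, h0, h1)
    s_b = spread_coefficient(b0, l_c, C, k)
    # ASSUMPTION: S_B = ln(b1/b0) / ln(h0/h1), hence D_B = b0 * ((h0/h1)**S_B - 1).
    d_b = b0 * ((h0 / h1) ** s_b - 1.0)
    return b0 + d_b


def reference_width(b_entry, stands):
    """Chain the stands: each exit width becomes the next stand's entry width."""
    b = b_entry
    for s in stands:
        b = stand_exit_width(b, s["h0"], s["h1"], s["R"])
    return b


# Hypothetical two-stand example (all lengths in mm).
stands = [{"h0": 30.0, "h1": 18.0, "R": 350.0},
          {"h0": 18.0, "h1": 11.0, "R": 350.0}]
b_ref = reference_width(700.0, stands)
```

With wide strip and a short contact length the exponential term is very small, so the predicted spread per stand is well under a millimetre in this example; the residual is what the correction model is meant to capture.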

Step 7 comprises the following:

The standardized data $x'_{jk}$, obtained after eliminating the dimensional differences in the influence factor data set, are calculated using formula (7):

$$x'_{jk} = \frac{x_{jk} - x_{j\min}}{x_{j\max} - x_{j\min}}, \quad j = 1, 2, \ldots, N \tag{7}$$

where $x_{jk}$ denotes the $k$-th data element of the $j$-th type of data, $x_{j\min}$ denotes the minimum value of the $j$-th type of data, $x_{j\max}$ denotes the maximum value of the $j$-th type of data, and $N$ denotes the number of data types in the influence factor data set;
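
As an illustration of the min-max standardization in step 7, the following Python sketch scales each data type to the range [0, 1]. The array layout (rows are samples, columns are data types), the handling of constant columns, and the sample values are assumptions of this sketch.

```python
import numpy as np


def min_max_standardize(X):
    """Scale each column (one influence-factor type j) to [0, 1].

    Rows are samples k, columns are data types j. Constant columns are mapped
    to 0 to avoid division by zero (a choice made for this sketch).
    """
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (X - x_min) / span, x_min, x_max


# Hypothetical samples: [finish rolling entry temperature in deg C, exit thickness in mm]
X = [[1000.0, 2.0], [1100.0, 4.0], [1050.0, 3.0]]
X_std, x_min, x_max = min_max_standardize(X)
```

The returned `x_min` and `x_max` would normally be stored from the training data and reused to scale new strips at prediction time, so that training and prediction inputs share the same scale.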

The bottom layer of the deep belief neural network model uses restricted Boltzmann machines with unsupervised pre-training, the top layer uses an error back-propagation regression model with supervised fine-tuning, the activation function is the ReLU function, and dropout is used as the regularization method to prevent overfitting.

Training the model in step 8 to obtain the deep belief neural network model with optimal parameters is specifically as follows:

Step 8.1: Set the initial learning rate to α, the initial number of hidden layers to A, the initial number of nodes per hidden layer to B, and the maximum number of iterations to x;

Step 8.2: Set the update step of the node number to b and update the node number by the step b in each iteration. Calculate the mean square error after each iteration using formula (8); when the maximum number of iterations x is reached, take the node number of the iteration with the smallest mean square error as the optimal number of hidden-layer nodes:

$$\mathrm{MSE} = \frac{1}{n} \sum_{k=1}^{n} \left( Y_k - \hat{Y}_k \right)^2 \tag{8}$$

where MSE denotes the mean square error, $Y_k$ denotes the input prediction deviation value, $\hat{Y}_k$ denotes the head width prediction correction value output by the deep belief neural network model, and $n$ denotes the number of samples over which the error is evaluated;

Step 8.3: Fix the number of nodes in each hidden layer at the optimal value found in step 8.2. Set the update step of the layer number to a and update the number of hidden layers by the step a in each iteration. Calculate the mean square error after each iteration using formula (8); when the maximum number of iterations x is reached, take the layer number of the iteration with the smallest mean square error as the optimal number of hidden layers;

Step 8.4: Fix the number of nodes in each hidden layer and the number of hidden layers at their optimal values. Set the update step of the learning rate to d and update the learning rate by the step d in each iteration. Calculate the mean square error after each iteration using formula (8); when the maximum number of iterations x is reached, take the learning rate of the iteration with the smallest mean square error as the optimal learning rate of the model.
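
Steps 8.2 to 8.4 describe a one-parameter-at-a-time search: the node number is tuned first, then the layer number, then the learning rate, each judged by the mean square error. The Python sketch below illustrates that control flow only. It assumes that the maximum number of iterations x is the number of candidate values tried per parameter, and `toy_evaluate` is a synthetic stand-in for "train a deep belief network and return its MSE"; neither is specified in this form by the source.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def sequential_search(evaluate, nodes0, layers0, lr0, b, a, d, x):
    """Tune nodes, then layers, then learning rate, one at a time.

    evaluate(nodes, layers, lr) must return the MSE of a model trained with
    those settings; it is a caller-supplied, hypothetical callback.
    """
    node_grid = [nodes0 + i * b for i in range(x)]
    best_nodes = min(node_grid, key=lambda n: evaluate(n, layers0, lr0))

    layer_grid = [layers0 + i * a for i in range(x)]
    best_layers = min(layer_grid, key=lambda k: evaluate(best_nodes, k, lr0))

    lr_grid = [lr0 + i * d for i in range(x)]
    best_lr = min(lr_grid, key=lambda r: evaluate(best_nodes, best_layers, r))
    return best_nodes, best_layers, best_lr


def toy_evaluate(nodes, layers, lr):
    """Purely synthetic error surface with its minimum at (150, 3, 0.0007)."""
    return (nodes - 150) ** 2 / 1e4 + (layers - 3) ** 2 + ((lr - 0.0007) * 1e3) ** 2


best = sequential_search(toy_evaluate, nodes0=50, layers0=2, lr0=0.0001,
                         b=50, a=1, d=0.0003, x=5)
```

Searching one parameter at a time needs far fewer training runs than a full grid, at the cost of possibly missing interactions between the parameters.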

The invention has the beneficial effects that:

The invention provides a method for predicting the head width of hot continuous rolled strip steel that integrates a rolling mechanism with deep learning. Outliers are removed from the actual production data of the hot continuous rolling site using the Pauta criterion, so that the model is not disturbed by abnormal values during modeling. The fused modeling approach overcomes both the large width error of a rolling mechanism prediction model used alone and the poor interpretability and low reliability of a neural network prediction model used alone. In addition, the selected deep belief neural network achieves higher prediction accuracy because of its multiple hidden layers, and its training mode, which combines unsupervised pre-training with supervised fine-tuning, gives the model faster convergence and makes it less likely to fall into a local extremum. The method offers high prediction accuracy, strong generalization capability, high reliability and easy model maintenance; it addresses the weak adaptability of traditional width prediction models to the actual production process, saves production investment, and provides a good basis for adjusting the setup model parameters of the process automation level.

Drawings

FIG. 1 is a flow chart of a hot continuous rolling strip steel head width prediction method integrating a rolling mechanism and deep learning in the invention;

FIG. 2 is a layout diagram of main equipment and meters of a hot continuous rolling line according to an embodiment of the present invention;

FIG. 3 is a flow chart of the rolling mechanism model prediction for a multi-stand mill group in the present invention;

FIG. 4 is a block diagram of a deep belief neural network in accordance with the present invention;

FIG. 5 is a flow chart of the training of the deep belief neural network employed in the present invention;

FIG. 6 is a comparison graph of the predicted value and the actual measured value of the strip steel head width obtained by the method of the present invention in the embodiment of the present invention; wherein the graph (a) shows a comparison result of a target width of 685mm of a finished product, the graph (b) shows a comparison result of a target width of 710mm of a finished product, the graph (c) shows a comparison result of a target width of 735mm of a finished product, and the graph (d) shows a comparison result of a target width of 737mm of a finished product.

Detailed Description

The invention is further described with reference to the following figures and specific examples.

As shown in fig. 1, a method for predicting the head width of a hot continuous rolling strip steel by combining a rolling mechanism and deep learning, comprises the following steps:

This embodiment uses a typical finishing mill group of a hot continuous rolling line. The arrangement of the main equipment and detection instruments of the line is shown in fig. 2, where RE denotes a rough rolling edger, R denotes a rough rolling flat-roll mill, FE denotes a finish rolling edger, and F denotes a finish rolling flat-roll mill. In the hot continuous rolling process, the production data come mainly from the actual values measured by the detection instruments and from the rolling schedule data calculated at the process automation level. The measurement data used in this embodiment come from the thickness gauge, width gauge, temperature gauge, speed sensor, rolling mill pressure sensor, position sensor, rolling mill power sensor, rolling mill rotating speed sensor and angle sensor. They include the thickness of the rolled piece measured by the thickness gauge, the width of the rolled piece measured by the width gauge, the surface temperature of the rolled piece measured by the temperature gauge, the linear speed of the rolling mill measured by the speed sensor, the rolling force during deformation of the rolled piece measured by the rolling mill pressure sensor, and the roll gap and roll gap deviation measured by the position sensor. The measuring signals generated by the detection instruments are transmitted from the basic automation level to the process automation level. The parameter data in the rolling schedule data used in this embodiment include the set thickness of the intermediate billet, the set tension between the finishing stands, the set rolling force of the finishing vertical roll, the set roll gap of the finishing vertical roll, the set linear speed of the finishing vertical roll, the set roll diameter, and the roll wear compensation coefficient.
All of these data constitute the production data of the hot continuous rolling site acquired in this embodiment.

Step 2: Remove outlier data from the production data using the Pauta criterion to obtain m sample data. Data that satisfy formula (1) are judged to be outlier data and removed:

$$\left| y_i - \bar{y} \right| > 3 S_y \tag{1}$$

where $y_i$ denotes the data in the production data that represent the strip head exit width, $i = 1, 2, 3, \ldots, M$; $\bar{y}$ is the mean of $y_i$; and $S_y$ is the standard deviation of $y_i$.

In this embodiment, the M = 2744 groups of production data are divided by finished-product target width into 685 mm, 710 mm, 735 mm and 737 mm, and outliers are removed using the Pauta criterion. The rejection results are shown in table 1. In total, 14 groups of outlier data are removed, leaving m = 2730 groups of sample data.

TABLE 1 outlier rejection results

Because width spread in the finish rolling area is mainly affected by the finishing vertical roll parameters, the reduction ratio of the finishing flat rolls, and the tension between the finishing stands, 48 groups of influence factor data are finally selected in this embodiment: the finish rolling entry temperature, finish rolling exit temperature, intermediate billet thickness, finish rolling exit thickness, target width of the finished product, rough rolling exit width, finishing vertical roll rolling force, finishing vertical roll gap, finishing vertical roll linear speed, and, for the stands F1-F8 of the mill group, the roll diameter, roll linear speed, roll wear compensation, inter-stand tension and back-calculated thickness;

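Step 5: Construct a rolling mechanism prediction model for each stand, and calculate a prediction reference value of the strip head width from the influence factor data, as follows:

Step 5.1: From the stand exit thickness and the second flow equation, back-calculate the stand entry thickness $h_0$ using formula (2):

$$h_0 = \frac{h_1 v_1}{v_0} \tag{2}$$

where $h_1$ denotes the stand exit thickness, $v_0$ denotes the stand entry speed, and $v_1$ denotes the stand exit speed;

Step 5.2: Calculate the contact length $l_c$ of the deformation zone using formula (3):

$$l_c = \sqrt{R\,(h_0 - h_1)} \tag{3}$$

where $R$ denotes the roll radius;

Step 5.3: Calculate the spread coefficient $S_B$ using Hill's formula, formula (4), in which $b_0$ denotes the stand entry width and $C$ is a constant, generally equal to 0.5;

Step 5.4: Calculate the width spread $D_B$ of flat-roll rolling in the finish rolling area using formula (5);

Step 5.5: Calculate the stand exit width $b_1$ using formula (6):

$$b_1 = b_0 + D_B \tag{6}$$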

For a single-stand rolling mill, the stand exit width $b_1$ calculated in step 5.5 is the prediction reference value of the strip head width. For a multi-stand hot continuous rolling finishing mill, the exit width output by the rolling mechanism prediction model of one stand is taken as the entry width of the next stand, and the calculation proceeds stand by stand in the running direction of the production line until the exit width of the last stand is obtained as the prediction reference value of the strip head width;

The prediction flow of the rolling mechanism model is shown in fig. 3. In this embodiment, the algorithm that calculates the inter-stand thicknesses from the second flow equation of formula (2) has been written into the process automation level program at the production site, so the site automatically calculates and stores the thicknesses between the finishing stands from the instrument parameters. Specifically, formula (2) is programmed in C++ and packaged as a function. For the multi-stand finishing mill, the packaged function is called with the roll linear speed of the last stand and the finish rolling exit thickness measured by the instrument to calculate the exit thickness of the preceding stand, and the process is repeated until the exit thickness of the first stand is calculated.
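
The backward thickness calculation described above can be illustrated with a short Python sketch (the production implementation is stated to be in C++; Python is used here only for consistency with the rest of the embodiment). It assumes that the strip exit speed at each stand can be approximated by that stand's roll linear speed, that is, forward slip is ignored; the speeds and thickness are hypothetical.

```python
def back_calculate_thicknesses(exit_speeds, h_last):
    """Back-calculate the exit thickness of every stand from the last stand.

    Uses constant mass flow between neighbouring stands:
    h[i] * v[i] = h[i + 1] * v[i + 1].
    exit_speeds: strip exit speed per stand, first stand to last stand.
    h_last: measured finish rolling exit thickness (exit of the last stand).
    """
    n = len(exit_speeds)
    thicknesses = [0.0] * n
    thicknesses[-1] = h_last
    for i in range(n - 2, -1, -1):
        thicknesses[i] = thicknesses[i + 1] * exit_speeds[i + 1] / exit_speeds[i]
    return thicknesses


# Hypothetical three-stand example: speeds in m/s, thickness in mm.
h = back_calculate_thicknesses([1.0, 2.0, 4.0], h_last=2.0)
```

Each stand's exit thickness is also the entry thickness of the stand after it, so the returned list supplies both $h_0$ and $h_1$ for every stand except the first.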

In this embodiment the finish rolling area has 1 vertical rolling mill and 8 flat rolling mills, so the calculations of step 5.2 to step 5.5 are repeated starting from the finishing vertical roll gap. The exit width of each stand is taken as the entry width of the next stand, stand by stand, until the exit width of the last stand is obtained as the prediction reference value of the strip head width. Here the strip head width refers to the width of the strip head at the exit of the last stand, and the stand exit width refers to the width of the strip head at the exit of an intermediate stand.

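Step 7: Eliminate the dimensional differences among the types of influence factor data using min-max standardization to obtain standardized data, calculated using formula (7):

$$x'_{jk} = \frac{x_{jk} - x_{j\min}}{x_{j\max} - x_{j\min}}, \quad j = 1, 2, \ldots, N \tag{7}$$

where $x_{jk}$ denotes the $k$-th data element of the $j$-th type of data, $x_{j\min}$ and $x_{j\max}$ denote the minimum and maximum values of the $j$-th type of data, and $N$ denotes the number of data types in the influence factor data set;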

The bottom layer of the deep belief neural network model uses restricted Boltzmann machines with unsupervised pre-training, the top layer uses an error back-propagation regression model with supervised fine-tuning, the activation function is the ReLU function, and dropout with a drop probability of 0.3 is used as the regularization method to prevent overfitting.

Step 8.1: Set the initial learning rate α = 0.0001, the initial number of hidden layers A = 2, the initial number of nodes per hidden layer B = 50, and the maximum number of iterations x;

Step 8.2: Set the update step of the node number to b = 50 and update the node number by the step b = 50 in each iteration. Calculate the mean square error after each iteration using formula (8); when the maximum number of iterations x is reached, take the node number of the iteration with the smallest mean square error as the optimal number of hidden-layer nodes;

Step 8.3: Fix the number of nodes in each hidden layer at the optimal value found in step 8.2. Set the update step of the layer number to a = 1 and update the number of hidden layers by the step a = 1 in each iteration. Calculate the mean square error after each iteration using formula (8); when the maximum number of iterations x is reached, take the layer number of the iteration with the smallest mean square error as the optimal number of hidden layers.

The selection results of the hidden layer structure are shown in table 2;

TABLE 2 Effect of hidden layer Structure on deep belief network model

Step 8.4: Fix the number of nodes in each hidden layer and the number of hidden layers at their optimal values. Set the update step of the learning rate to d = 0.0003 and update the learning rate by the step d in each iteration. Calculate the mean square error after each iteration using formula (8); when the maximum number of iterations x is reached, take the learning rate of the iteration with the smallest mean square error as the optimal learning rate of the model.

The results of selecting the learning rate are shown in table 3.

TABLE 3 Effect of learning Rate on deep belief network model

Specifically, because of the deep structure of the deep belief network and its training mode, which combines unsupervised pre-training with supervised fine-tuning, the width correction value predicted by the deep belief network model has high prediction accuracy and strong generalization capability and is less likely to fall into a local extremum. The detailed structure and training process of the model are as follows:

The deep belief network (DBN) is a probabilistic generative model formed by stacking several restricted Boltzmann machines (RBMs). The bottom layer of the DBN receives the input data vector and transforms it to the hidden layer through an RBM; that is, the input of a higher-level RBM is the output of the RBM below it. An RBM consists of a visible layer and a hidden layer, and the neurons of the visible layer are fully and bidirectionally connected to the neurons of the hidden layer. Assuming that an RBM has V visible-layer neurons and H hidden-layer neurons, the energy function for a given state (v, h) is defined as:

$$E_\theta(v, h) = -\sum_{f=1}^{V} a'_f v_f - \sum_{g=1}^{H} b'_g h_g - \sum_{f=1}^{V} \sum_{g=1}^{H} v_f W'_{f,g} h_g \tag{9}$$

where $\theta = \{W', a', b'\}$ are the parameters of the RBM; $W'$ denotes the connection weights between the visible layer and the hidden layer, and $W'_{f,g}$ denotes the connection weight between visible unit f and hidden unit g; $a'$ denotes the bias of the visible layer, and $a'_f$ denotes the bias of visible unit f; $b'$ denotes the bias of the hidden layer, and $b'_g$ denotes the bias of hidden unit g; H denotes the number of hidden-layer neurons and V the number of visible-layer neurons; $v_f$ and $h_g$ are the states of visible unit f and hidden unit g; and $E_\theta(v, h)$ denotes the energy when the hidden-layer state is h and the visible-layer state is v.

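Based on the above energy function, the joint probability distribution of a given state (v, h) is:

$$p_\theta(v, h) = \frac{1}{Z_\theta} e^{-E_\theta(v, h)} \tag{10}$$

$$Z_\theta = \sum_{v, h} e^{-E_\theta(v, h)} \tag{11}$$

where $Z_\theta$ denotes the partition function.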

Because an RBM has connections between layers but none within a layer, the activation states of the hidden-layer neurons are mutually independent when the states of the visible-layer neurons are given, and likewise the activation states of the visible-layer neurons are mutually independent when the states of the hidden-layer neurons are given. The activation probabilities of the g-th hidden-layer neuron and the f-th visible-layer neuron are therefore, respectively:

$$p_\theta(h_g = 1 \mid v) = \sigma\!\left( b'_g + \sum_{f=1}^{V} v_f W'_{f,g} \right) \tag{12}$$

$$p_\theta(v_f = 1 \mid h) = \sigma\!\left( a'_f + \sum_{g=1}^{H} W'_{f,g} h_g \right) \tag{13}$$

where σ denotes the activation function.

Because the partition function $Z_\theta$ is difficult to compute, the joint probability distribution $p_\theta(v, h)$ cannot be calculated directly. The contrastive divergence algorithm can be used to speed up RBM training. When the RBM is trained by contrastive divergence, the parameters are updated as follows:

$$W' = W' + \rho \left( h v^{T} - h' (v')^{T} \right) \tag{14}$$

$$b' = b' + \rho\,(h - h') \tag{15}$$

$$a' = a' + \rho\,(v - v') \tag{16}$$

where $v'$ denotes the reconstruction of the visible layer v, $h'$ denotes the hidden layer obtained from the reconstruction $v'$, and ρ denotes the learning rate.
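
A single contrastive divergence update following formulas (14) to (16) can be sketched with NumPy as below. This is a minimal single-sample, one-step (CD-1) illustration under several assumptions that the text does not fix: σ is taken to be the logistic sigmoid, the hidden state is sampled as binary, probabilities rather than samples are used in the update terms, and the weight matrix is stored with shape (H, V) so that it matches the outer product $h v^{T}$.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd1_step(v, W, a, b, rho, rng):
    """One CD-1 update for a single visible vector v.

    W: weights, shape (H, V); a: visible biases, shape (V,);
    b: hidden biases, shape (H,); rho: learning rate.
    """
    h_prob = sigmoid(b + W @ v)                        # p(h = 1 | v)
    h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
    v_rec = sigmoid(a + W.T @ h_sample)                # reconstruction v'
    h_rec = sigmoid(b + W @ v_rec)                     # hidden layer h' from v'

    W_new = W + rho * (np.outer(h_prob, v) - np.outer(h_rec, v_rec))
    b_new = b + rho * (h_prob - h_rec)
    a_new = a + rho * (v - v_rec)
    return W_new, a_new, b_new


# Hypothetical toy RBM with 4 visible and 3 hidden units.
rng = np.random.default_rng(0)
W = np.zeros((3, 4))
a = np.zeros(4)
b = np.zeros(3)
v = np.array([1.0, 0.0, 1.0, 0.0])
W1, a1, b1 = cd1_step(v, W, a, b, rho=0.1, rng=rng)
```

In practice the update is averaged over mini-batches and repeated for many epochs for each RBM in the stack before supervised fine-tuning begins.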

However, stacked RBMs can only extract high-level features from complex raw data and cannot perform regression prediction directly. To obtain a complete DBN model, a conventional supervised regressor must be added on top of the stacked RBMs. The basic structure of a DBN is shown in fig. 4. As the figure shows, the training of the DBN consists of two processes: unsupervised layer-by-layer pre-training and supervised fine-tuning. Every two adjacent layers of neurons form an RBM; the RBMs are pre-trained without supervision layer by layer from bottom to top, the final result is fed into the supervised regressor at the top layer, and the network weights and biases are fine-tuned with the back-propagation algorithm. The overall training process is shown in fig. 5.

In this embodiment, the 2730 groups of screened actual data are used as the experimental data of the model. The rolling mechanism prediction model predicts the reference value of the strip head width in the finish rolling process, and the prediction model based on the deep belief neural network predicts the correction value of the strip head width. The data are randomly divided into a training set of 2180 groups and a test set of 550 groups, which are applied to the prediction model based on the deep belief neural network. The influence factor data of 48 groups of variables are selected as the input of the deep belief network prediction model: the finish rolling entry temperature, finish rolling exit temperature, intermediate billet thickness, finish rolling exit thickness, target width of the finished product, rough rolling exit width, finishing vertical roll rolling force, finishing vertical roll gap, finishing vertical roll linear speed, and, for the stands F1-F8 of the mill group, the roll diameter, roll linear speed, roll wear compensation, inter-stand tension and back-calculated thickness. The head width reference value predicted by the rolling mechanism prediction model and the head width correction value predicted by the deep belief neural network model are summed to obtain the final predicted value of the strip head width. The embodiment is implemented in the Python language, and the comparison between the predicted results and the actual measured results is shown in fig. 6.

In conclusion, compared with traditional methods for predicting the head width in finish rolling, the method of the invention, which integrates the rolling mechanism with a deep belief neural network, predicts the strip head width in the finish rolling process accurately and efficiently. The method generalizes well to strip steel with different finished-product target specifications. It also overcomes the low accuracy and weak adaptability to actual production of predictions based on the rolling mechanism alone, as well as the low reliability and poor interpretability of predictions based on a neural network alone. In addition, because the prediction model is built on the rolling mechanism and deep learning, the equipment of the existing hot continuous rolling line does not need to be modified, which saves production investment and provides a good basis for adjusting the setup model parameters of the process automation level.

Claims

1. A hot continuous rolling strip steel head width prediction method integrating a rolling mechanism and deep learning is characterized by comprising the following steps:

2. The method for predicting the head width of the hot continuous rolling strip steel by combining the rolling mechanism and the deep learning according to claim 1, wherein the step 2 comprises the following steps:

where $\bar{y}$ is the mean of $y_i$ and $S_y$ is the standard deviation of $y_i$.

3. The method for predicting the head width of the hot continuous rolling strip steel by combining the rolling mechanism and the deep learning according to claim 1, wherein the step 5 comprises the following steps:

where $R$ denotes the roll radius;

where $b_0$ denotes the stand entry width and $C$ is a constant;

Step 5.4: Calculate the width spread $D_B$ of flat-roll rolling in the finish rolling area using formula (5):

Step 5.5: Calculate the stand exit width $b_1$ using formula (6):

$$b_1 = b_0 + D_B \tag{6}$$

4. The method for predicting the head width of the hot continuous rolling strip steel by fusing the rolling mechanism and the deep learning according to claim 3, wherein for a multi-stand hot continuous rolling finishing mill, the exit width output by the rolling mechanism prediction model of one stand is taken as the entry width of the next stand, and the calculation proceeds stand by stand in the running direction of the production line until the exit width of the last stand is obtained as the prediction reference value of the head width of the hot continuous rolling strip steel.

5. The method for predicting the head width of the hot continuous rolling strip steel by combining the rolling mechanism and the deep learning according to claim 1, wherein the step 7 comprises the following steps:

where $x_{jk}$ denotes the $k$-th data element of the $j$-th type of data, $x_{j\min}$ denotes the minimum value of the $j$-th type of data, $x_{j\max}$ denotes the maximum value of the $j$-th type of data, and $N$ denotes the number of data types in the influence factor data set.

6. The hot continuous rolling strip steel head width prediction method integrating the rolling mechanism and the deep learning according to claim 1, characterized in that the bottom layer of the deep belief neural network model uses restricted Boltzmann machines with unsupervised pre-training, the top layer uses an error back-propagation regression model with supervised fine-tuning, the activation function is the ReLU function, and dropout is used as the regularization method to prevent overfitting.

7. The method for predicting the head width of the hot continuous rolling strip steel by fusing the rolling mechanism and the deep learning according to claim 1, wherein the model is trained in the step 8 to obtain a deep belief neural network model with optimal parameters, and the method is specifically expressed as follows:

step 8.3: the number of nodes of each hidden layer is set as

Setting the updating step length of the layer number as a, updating the hidden layer number in each iteration by using the step length a, calculating the mean square error after each iteration by using a formula (8), and taking the layer number corresponding to the iteration with the minimum mean square error value as the optimal layer number of the hidden layer when the maximum iteration number x is reached

Step 8.4: the number of nodes of each hidden layer is set as

The number of the hidden layers is set as