CN116244625A

CN116244625A - Overflow type mill load indirect forecasting method based on multi-feature fusion neural network

Info

Publication number: CN116244625A
Application number: CN202310009904.XA
Authority: CN
Inventors: 张克胜; 徐泉; 柴天佑; 刘长鑫
Original assignee: 东北大学
Priority date: 2023-01-04
Filing date: 2023-01-04
Publication date: 2023-06-09

Abstract

The invention provides an indirect load forecasting method for overflow mills based on a multi-feature fusion neural network. A mechanism model is established that relates the mill load to the running state of the slurry pump of the pump pool downstream of the mill discharge. A multivariate delay correlation analysis method and a mill load forecasting method based on a multi-feature fusion neural network are proposed: the time-lag relationships between the input variables and the target variable are classified and analyzed, and a corresponding neural network forecasting model is built for variables with different time-lag characteristics. The invention thus offers a new soft measurement and multi-step prediction method for mill load. Conventional mill load detection requires grinding-sound and vibration sensors to be installed on the cylinder, bearings and other parts. In an actual industrial site, especially where workshop space is limited and several major machines located close together run simultaneously alongside auxiliary equipment, this leads to high cost, inconvenient installation, and spectrum signals that are difficult to interpret and analyze. The invention addresses these problems.

Description

Overflow type mill load indirect forecasting method based on multi-feature fusion neural network

Technical Field

The invention relates to the technical field of state detection and deep learning of mill equipment, in particular to an overflow type mill load indirect forecasting method based on a multi-feature fusion neural network.

Background

The mining industry is a basic industry for national economic development. It involves many fields such as the chemical industry, metallurgy, aerospace and information technology, and plays an irreplaceable role in economic development. The grinding and separation process currently used in China combines multi-stage screening and magnetic separation with high-energy-consumption ore grinding. It involves many kinds of equipment and multiple equipment groups, such as high-frequency screens, magnetic separators and ball mills, has a complex mechanism and a long process flow, and is a key process affecting concentrate grade and metal recovery rate in beneficiation. It has typical process-industry characteristics, such as a complex production process, strong lag in production adjustment, and strong coupling between processes.

The operating rate and efficiency of the ball mill generally determine the production efficiency and performance indices of the grinding process, and even of the whole beneficiation process; mill efficiency is one of the important performance indices of the grinding process. When the ball mill is underloaded, the ore feed rate is low, the proportion of ore in the cylinder is low, and the ore stays in the cylinder for a long grinding time, so overgrinding occurs: the concentrate grade becomes excessively high, the yield drops, and the cost rises. When the ball mill is overloaded, the ore feed rate exceeds the processing capacity of the mill, ore is discharged without being effectively ground fine in the cylinder, the concentrate grade fails to meet specification, and product quality cannot be guaranteed. Judging the running state of the ball mill promptly, or ahead of time, is therefore very important for ensuring the quality and yield of beneficiation production.

In actual mill production the operating environment is complex and changeable, and the ball mill works continuously as a closed rotating vessel, so the amounts of ore, steel balls and water inside it are difficult to quantify in real time, and the mill load is difficult to calculate with a mechanism model or to detect directly. At present, the effective and widely applied methods are the grinding-sound method and the vibration method. Both require grinding-sound sensors and vibration sensors to be installed on the mill cylinder, bearings and other parts, in order to detect the noise generated while the ball mill runs and the vibration of the cylinder, bearings and other parts. The grinding-sound method judges the current running state of the ball mill and from this determines the mill load. The vibration method determines the relationship between mill load and vibration energy by signal analysis. However, these methods must take into account factors such as sensor selection and installation position, and they are costly and inconvenient to install. In addition, the mill production data and the grinding-sound and vibration signal data are large in volume and the signals are complex in composition, so the computational cost of evaluating functions on these data is too high, and conventional mill load soft measurement methods struggle to achieve effective detection.

Moreover, when workshop space is limited, several large machines located close together (such as a cone crusher, a high-pressure roller mill and several ball mills) run simultaneously, and auxiliary equipment (such as a high-frequency screen, a linear vibrating screen and an automatic ball feeder) is also running, the mill operates in an environment with heavy noise and complex vibration sources. While ore and steel balls continuously tumble, collide and rub inside the ball mill cavity, the mechanical vibration signals of the mill are corrupted by external vibration and noise. This makes the noise and vibration signals harder to interpret and analyze, so they cannot effectively and accurately represent the load of ore, steel balls and water in the mill cavity.

Disclosure of Invention

Conventional mill load detection methods require grinding-sound and vibration sensors to be installed on the cylinder, bearings and other parts. In an actual industrial site, especially where workshop space is limited and several major machines located close together run simultaneously alongside auxiliary equipment, this leads to high cost, inconvenient installation, and spectrum signals that are difficult to interpret and analyze. To solve these problems, the invention provides an indirect load forecasting method for overflow mills based on a multi-feature fusion neural network.

For this purpose, the invention adopts the following technical scheme:

the invention provides an overflow type mill load indirect forecasting method based on a multi-feature fusion neural network, which comprises the following steps:

according to the ore discharge mode of the overflow ball mill and the grinding and separation process flow, establishing a mechanism model relating the mill outlet load to the motor current of the slurry pump fed from the pump pool downstream of the mill;

collecting production data upstream and downstream of the mill according to the grinding and separation process flow; preprocessing the raw process variable data in the production data, analyzing the index correlation and delay correlation of the process variables upstream and downstream of the overflow ball mill using the Pearson correlation coefficient, and classifying the features of the mill process variables;

based on the variable feature classification result, establishing and training an indirect overflow mill load prediction model based on a multi-feature fusion neural network, wherein the prediction model comprises a time-lag feature data temporal pattern attention network, a non-time-lag feature data long short-term memory (LSTM) network, a historical target variable feature data LSTM compensation network, a multi-feature fusion layer network, a dropout layer and a fully connected layer; the input of the time-lag feature data temporal pattern attention network is the time-lag feature data, and its output is denoted the time-lag output matrix; the input of the non-time-lag feature data LSTM network is the non-time-lag feature data, and its output is denoted the non-time-lag output matrix; the input of the historical target variable feature data LSTM compensation network is the historical target variable feature data, and its output is denoted the historical target feature output matrix; the inputs of the multi-feature fusion layer network comprise the time-lag output matrix, the non-time-lag output matrix and the historical target feature output matrix, and its output is the input of the dropout layer and the fully connected layer; the output of the dropout layer and the fully connected layer is the forecast of the slurry pump motor current;

forecasting the slurry pump motor current with the trained model, and obtaining the overflow mill load forecast from the forecast slurry pump motor current and the mechanism model.

Further, the mechanism model relating the mill outlet load to the motor current of the slurry pump fed from the pump pool downstream of the mill is as follows:

where $I$ is the actual running current of the motor of the slurry pump fed from the pump pool downstream of the mill, in A; $H_m$ is the pulp pumping head, in m; $\rho_p$ is the pulp density, in t/m³; $\eta_m$ is the pulp pumping efficiency; $U$ is the motor voltage, in V; $\cos\eta_d$ is the motor power factor; $W_{F3}$ is the fixed make-up water flow of the pump pool, in m³/h; and $A$ is the production imbalance coefficient.

Further, preprocessing the raw process variable data in the production data, including:

after the production data are aligned to a uniform sampling timestamp, the raw second-level sampled data are preprocessed into minute-level sampled data to obtain a data set; the data set comprises a plurality of process variables and one target variable, the target variable being the primary magnetic separation feed slurry pump current.

Further, preprocessing the original data according to second-level sampling data, including:

acquiring a variable data set to be processed;

retrieving all data within each minute of the variable to be processed;

computing the maximum value M of the retrieved data;

judging whether M is equal to 0; if M is equal to 0, M is taken as the value of the variable to be processed for that minute and stored in the database;

if M is not equal to 0, judging whether M is non-numeric; if M is non-numeric, the variable to be processed has no value in that minute and is recorded as NaN;

if M is numeric, computing the mean P of the valid part (values greater than 0) of the retrieved data; P is taken as the value of the variable to be processed for that minute and stored in the database.

Further, the feature classification includes: non-time-lag features, time-lag features, and irrelevant features.

Further, the non-time-lag feature data LSTM network and the historical target variable feature data LSTM compensation network are each composed of long short-term memory (LSTM) neural units, and are used to extract the temporal features in the non-time-lag feature data and in the historical data of the predicted target variable, respectively.

Further, the time-lag feature data temporal pattern attention network comprises:

extracting the long-term and short-term temporal features of the time-lag feature data with an LSTM neural network;

further extracting the temporal patterns of the LSTM layer with convolution kernels to build a temporal pattern attention layer; the temporal pattern attention layer uses k filters to convolve the m hidden-state features, improving the learning ability of the model and producing a matrix $H^C$ with m rows and k columns.

Further, the multi-feature fusion layer network includes: reshaping and merging the hidden-state output $h_t$ of the current time step and the spatiotemporal feature $v_t$ obtained from the time-lag feature network, the hidden-state matrix output by the non-time-lag feature network, and the hidden-state matrix output by the historical target data network, to obtain the state output of the multi-feature fusion layer; the final output of the model is then obtained through the dropout layer and the fully connected layer.

Further, training an overflow type mill load indirect prediction model based on the multi-feature fusion neural network, comprising:

initializing the network parameters, setting the number of training epochs and the number of iterations per epoch, and starting training once all training-related hyperparameters have been set;

if the overall network reaches the set number of training epochs, or the loss of its internal neural networks has not decreased over several epochs, the overall network is considered to have converged, and the trained network parameters at that point are saved.
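The stopping rule above can be sketched as a small loop. This is a minimal illustration, not the patent's training code: `run_epoch` is a hypothetical callable that trains for one epoch and returns a loss, `patience` is an assumed name for "several epochs", and the sketch tracks the best epoch (where parameters would be saved), which is a common variant of saving at the moment of convergence.

```python
def train_with_early_stopping(run_epoch, max_epochs, patience):
    """Train until max_epochs is reached or the loss stops decreasing.

    run_epoch: hypothetical callable taking the epoch index and returning
               that epoch's loss.
    patience:  number of consecutive non-improving epochs tolerated
               (an assumed reading of "several epochs").
    Returns (best_epoch, best_loss, epochs_run).
    """
    best_loss = float("inf")
    best_epoch = -1
    stale = 0
    epochs_run = 0
    for epoch in range(max_epochs):
        loss = run_epoch(epoch)
        epochs_run += 1
        if loss < best_loss:
            # Improvement: this is where network parameters would be saved.
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break  # loss has not decreased for `patience` epochs
    return best_epoch, best_loss, epochs_run


# Toy loss curve standing in for a real training run.
losses = [1.0, 0.6, 0.5, 0.55, 0.52, 0.51, 0.4]
result = train_with_early_stopping(lambda e: losses[e], max_epochs=7, patience=3)
```

With this toy curve the loop stops after the sixth epoch, because the loss fails to improve on 0.5 for three consecutive epochs.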

The invention has the following beneficial effects:

1. The invention provides a new indirect detection method for mill load. It avoids the excessive reliance of conventional mill load soft measurement methods on mechanical signals, which require sensor acquisition devices to be installed, as model inputs. To address problems such as non-optimal production regulation of mill load and the differing lags of the input features, it proposes a multivariate delay correlation analysis method and a mill load forecasting method based on a multi-feature fusion neural network.

2. The multivariate delay correlation analysis method and the multi-feature fusion deep learning neural network prediction model provided by the invention have good universality for different multivariate time sequence prediction problems, and the network scale corresponding to different feature variables in the overall network can be properly adjusted according to the size of a data set.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort to a person skilled in the art.

FIG. 1 is a flow chart of indirect prediction of mill load based on multi-feature fusion in an embodiment of the invention;

FIG. 2 is a schematic diagram of a production process flow of a grinding process in an embodiment of the invention;

FIG. 3 is a flowchart illustrating steps of a data preprocessing method according to an embodiment of the present invention;

FIG. 4 is a diagram of a neural network based on multi-feature fusion in an embodiment of the present invention;

FIG. 5 is a block diagram of an LSTM neural network unit in accordance with an embodiment of the present invention;

FIG. 6 is a diagram of the network structure of the time-lag feature data temporal pattern attention mechanism in an embodiment of the present invention;

FIG. 7 is a graph showing the maximum inflection point of the variable delay correlation curve in an embodiment of the present invention.

Detailed Description

In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

As shown in fig. 1, the embodiment of the invention provides an overflow type mill load indirect forecasting method based on a multi-feature fusion neural network, which comprises the following steps:

Step one: establish a mechanism model relating the mill outlet load to the motor current of the slurry pump fed from the pump pool downstream of the mill, according to the ore discharge mode of the overflow ball mill and the grinding and separation process flow.

According to the ore discharge mode of the overflow ball mill (that is, the components of the mill outlet load), the outlet load $L_{Lo}$ (t/h) of the overflow ball mill is a key indicator of the operating load of this type of ball mill; the mill outlet load comprises the worn steel balls, the dry ore and the water discharged at the mill outlet.

In the production process flow shown in fig. 2, the ball mill discharge flows by gravity into the primary magnetic separation feed pump pool, the pump pool receives a fixed flow of make-up water, and the primary magnetic separation feed pump delivers the slurry through a pipeline to the next production step.

The shaft power $P_m$ when pumping pulp is calculated as follows:

where $H_m$ is the pulp head, in m; $Q_m$ is the pulp flow, in L/s; $\rho_p$ is the pulp density, in t/m³; and $\eta_m$ is the pulp pumping efficiency, in %.

When the pump shaft power is calculated, factors such as pump start-up and flow fluctuation are taken into account, so a standard motor is selected with a power margin coefficient K. K is generally taken as 1.10 to 1.20, with the smaller value for high-power slurry pumps and the larger value for low-power slurry pumps.

$$N = \frac{K P_m}{\eta_0} \qquad (2)$$

where $N$ is the motor power, in kW; $K$ is the power margin coefficient; $P_m$ is the shaft power when pumping pulp, in kW; and $\eta_0$ is the transmission efficiency, with $\eta_0 = 1.0$ for direct drive.

The motor power of the slurry pump when feeding slurry can also be calculated from the running current, as follows:

$$N = U I \cos\eta_d \qquad (3)$$

where $I$ is the actual running current of the primary magnetic separation slurry pump motor (the primary magnetic separation current for short), in A; $U$ is the motor voltage, in V; and $\cos\eta_d$ is the motor power factor, taken as 0.95 to 1.0.

From the above formulas:

The flow rate $Q_m$ of the pumped pulp is calculated as:

$$Q_m = A\,(L_{Lo} + W_{F3}) \qquad (5)$$

where $Q_m$ is the pulp volume conveyed by the pump, in m³/h; $W_{F3}$ is the fixed make-up water flow of the pump pool, in m³/h; and $A$ is the production imbalance coefficient, generally 1.1 to 1.2.

Finally, combining equation (4) and equation (5) gives equation (6), the mechanism model relating the current $I$ of the slurry pump fed from the pump pool to the mill outlet load.

Rearranging equation (6) gives the final expression for the mill outlet load $L_{Lo}$.

In the production process flow shown in fig. 2, once the production process and equipment selection are determined, the make-up water flow to the pump pool and the frequency of the slurry pump are fixed. In this example: pulp pumping head $H_m = 28.4$ m; pulp density $\rho_p = 1.585$ t/m³; pulp pumping efficiency $\eta_m = 0.85$; power margin coefficient $K = 1.1$; transmission efficiency $\eta_0 = 1.0$; motor voltage $U = 380$ V; motor power factor $\cos\eta_d = 0.95$; fixed make-up water flow of the pump pool $W_{F3} = 33$ m³/h; production imbalance coefficient $A = 1.1$. It follows that the mill outlet load increases linearly with the motor current $I$ of the slurry pump fed from the pump pool downstream of the mill.

For an overflow ball mill of a given model, the mill outlet load differs from the mill load by a constant, namely the mill's overflow load upper limit. The perception and soft measurement of mill load is therefore converted into the perception, modeling and prediction of the slurry pump motor current.
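The chain from motor current to outlet load can be illustrated numerically. The sketch below is an approximation under stated assumptions rather than the patent's equations (4), (6) and (7), which are not reproduced in this text: it takes the motor power as $N = U I \cos\eta_d$ from equation (3), assumes $N = K P_m / \eta_0$ from the symbol definitions given for the motor power, assumes the textbook SI shaft-power relation $P_m = \rho g Q H / \eta_m$, and inverts equation (5). The function names and the use of watts and SI units are choices made for this illustration.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def pulp_flow_from_current(I, U=380.0, cos_phi=0.95, eta_m=0.85, eta_0=1.0,
                           K=1.1, rho_p=1.585, H_m=28.4):
    """Pulp flow in m^3/h implied by a motor current I in amperes.

    Assumptions (not the source's own equations):
      N   = U * I * cos_phi        motor power, here in W (equation (3))
      N   = K * P_m / eta_0        motor power from shaft power
      P_m = rho * g * Q * H / eta  textbook shaft power in SI units
    Defaults are the example values quoted in the text.
    """
    shaft_power_w = U * I * cos_phi * eta_0 / K       # P_m in W
    rho_si = rho_p * 1000.0                           # t/m^3 -> kg/m^3
    q_m3_per_s = shaft_power_w * eta_m / (rho_si * G * H_m)
    return q_m3_per_s * 3600.0                        # m^3/s -> m^3/h


def outlet_load_from_current(I, A=1.1, W_F3=33.0, **pump_kwargs):
    """Invert equation (5), Q_m = A * (L_Lo + W_F3), for L_Lo."""
    return pulp_flow_from_current(I, **pump_kwargs) / A - W_F3


load_100 = outlet_load_from_current(100.0)
load_200 = outlet_load_from_current(200.0)
load_300 = outlet_load_from_current(300.0)
```

Under these assumptions, equal steps in current produce equal steps in outlet load, which is the linear relationship the forecasting method relies on. The absolute numbers depend on the assumed shaft-power relation and should not be read as plant values.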

Step two: preprocess the raw process variable data, analyze the index correlation and delay correlation between the process variables and the operating indices upstream and downstream of the overflow ball mill using the Pearson correlation coefficient, and classify the features of the mill process variables.

Production data upstream and downstream of the mill were obtained from 22 days of the grinding process at a certain grinding plant, including the following 20 indicators: mill No. 1 belt feeder frequency, mill No. 11 belt feeder current, mill No. 2 belt feeder frequency, mill No. 2 belt feeder current, mill No. 3 belt feeder current, high-frequency screen feed pump current, secondary magnetic separation feed slurry pump current, ball mill excitation current, ball mill stator current, primary magnetic separation concentration measurement, high-frequency screen concentration measurement, magnetic separation column concentration measurement, magnetic separation flow measurement, high-frequency screen flow measurement, magnetic separation column feed 1 flow measurement, high-frequency screen pressure measurement, No. 1 belt buffer bin level measurement, No. 2 belt buffer bin level measurement, primary magnetic separation feed slurry pump pool level measurement, and primary magnetic separation feed slurry pump current. After the production data are aligned to a uniform sampling timestamp, the raw second-level sampled data are preprocessed into minute-level sampled data, giving the data set $\text{Dataset} = \{x_1, x_2, \ldots, x_{19}, y\}$, which contains 19 process variables ($x_i$, $i = 1, \ldots, 19$) and the target variable $y$. The 19 process variables are the model inputs, and the target variable is the primary magnetic separation feed slurry pump current.

The steps of the data preprocessing method are shown in fig. 3 and include:

S201, acquiring the data set of the variable to be processed;

S202, retrieving all data of the variable to be processed within each minute;

S203, computing the maximum value M of the retrieved data;

S204, judging whether M is equal to 0; if M is equal to 0, M is taken as the value of the variable to be processed for that minute and stored in the database;

S205, if M is not equal to 0, judging whether M is non-numeric; if M is non-numeric, the variable to be processed has no value in that minute and is recorded as NaN;

S206, if M is numeric, computing the mean P of the valid part (values greater than 0) of the retrieved data; P is taken as the value of the variable to be processed for that minute and stored in the database.
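Steps S201 to S206 can be sketched in a few lines. The code below is an illustration under assumptions: missing readings are represented as `float("nan")`, a minute with no numeric readings is treated as the "non-numeric M" case, and a minute whose maximum is negative (a case the steps do not cover) is also recorded as NaN. The function names and the `(unix_seconds, value)` record layout are hypothetical.

```python
import math


def aggregate_minute(samples):
    """Collapse one minute of second-level samples into one value.

    samples: list of floats; float('nan') marks a missing reading.
    """
    valid = [s for s in samples if not math.isnan(s)]
    if not valid:
        return float("nan")            # S205: no numeric maximum exists
    M = max(valid)                     # S203: maximum of the minute
    if M == 0:
        return 0.0                     # S204: store M itself
    positive = [s for s in valid if s > 0]
    if not positive:
        return float("nan")            # assumption: negative-only minute
    return sum(positive) / len(positive)   # S206: mean of values > 0


def resample_to_minutes(records):
    """Group (unix_seconds, value) records by minute and aggregate each."""
    buckets = {}
    for ts, value in records:
        buckets.setdefault(int(ts // 60), []).append(value)
    return {minute: aggregate_minute(vals)
            for minute, vals in sorted(buckets.items())}


nan = float("nan")
minute_values = resample_to_minutes(
    [(0, 1.0), (30, 3.0), (60, 0.0), (90, 0.0), (120, nan)]
)
```

Averaging only the positive samples means that brief zero readings inside an otherwise active minute do not pull the minute value down, which appears to be the intent of the "valid data" rule.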

After data preprocessing, the data sampling period was 1 minute and the data set length was 18808. 12665 sets of data were used as training sets, 3166 sets of data were used as validation sets, and the remaining 2967 sets of data were used as test sets.

The Pearson correlation coefficient $r(X, Y)$ is used to calculate the correlation and the delay correlation between the different process variables and the target variable. It is calculated as follows:

$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}}$$

where $\mathrm{Cov}(X, Y)$ is the covariance of $X$ and $Y$, $\mathrm{Var}[X]$ is the variance of $X$, and $\mathrm{Var}[Y]$ is the variance of $Y$.

The Pearson correlation coefficient is then used to compute the delay correlation between each of the 19 input variables and the target variable y at different lag times Δt.
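A delay correlation of this kind can be computed by shifting the target series against the input series before applying the Pearson coefficient. The sketch below assumes the lag convention "input at time t against target at time t + Δt" and returns 0.0 when either series is constant (where the coefficient is undefined); both are assumptions for illustration, as are the function names.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat an undefined coefficient as 0
    return cov / math.sqrt(var_x * var_y)


def delay_correlation(x, y, lags):
    """Correlate x[t] with y[t + lag] for each lag (in samples)."""
    result = {}
    for lag in lags:
        if lag == 0:
            xs, ys = x, y
        else:
            xs, ys = x[:-lag], y[lag:]   # drop samples with no partner
        result[lag] = pearson(xs, ys)
    return result


# Toy series: y repeats x one sample later.
x = [0.0, 1.0, 0.0, -1.0] * 3
y = [0.0] + x[:-1]
dc = delay_correlation(x, y, [0, 1, 2])
```

In this toy case the correlation peaks at a lag of one sample, which is how a time-lag feature would show up in the delay correlation curve.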

All feature variables are divided into non-time-lag features, time-lag features and irrelevant features according to the differences and ranking of the correlation coefficients between the different input features and the output feature, and according to the values of the delay correlation coefficients. The specific definitions are as follows:

If $|r(X, Y)| > 0$ and $|r(X, Y+\Delta t)| - |r(X, Y)| \le 0$, then $X$ is called a non-time-lag feature of $Y$. If $|r(X, Y)| > 0$ and $|r(X, Y+\Delta t)| - |r(X, Y)| > 0$, then $X$ is called a time-lag feature of $Y$. If $|r(X, Y)| = 0$, then $X$ is called an irrelevant feature of $Y$. Here $\Delta t$ is the lag time of the variable relative to the current time.
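The three definitions can be applied mechanically to a table of delay correlations. The following sketch assumes the correlations are supplied as a dictionary from lag to coefficient, and introduces a hypothetical tolerance `eps` because an exactly zero correlation rarely occurs in measured data; with the default `eps=0.0` it follows the definitions literally.

```python
def classify_feature(delay_corr, eps=0.0):
    """Classify one input variable from its delay correlations.

    delay_corr: dict mapping lag -> r(X, Y + lag); must contain lag 0.
    eps:        hypothetical tolerance for "correlation equals zero".
    Returns (label, lag_of_strongest_correlation).
    """
    r0 = abs(delay_corr[0])
    if r0 <= eps:
        return "irrelevant", 0
    # Lag at which the absolute correlation is largest.
    best_lag = max(delay_corr, key=lambda lag: abs(delay_corr[lag]))
    if abs(delay_corr[best_lag]) - r0 > 0:
        return "time-lag", best_lag
    return "non-time-lag", 0


decaying = classify_feature({0: 0.5, 2: 0.4, 4: 0.3})
peaked = classify_feature({0: 0.3, 10: 0.5, 20: 0.7, 30: 0.6})
flat = classify_feature({0: 0.0, 2: 0.1})
```

A variable whose correlation only decays with lag is labeled non-time-lag, while one whose correlation rises to a peak at a positive lag is labeled time-lag, with the peak position as a candidate lag time.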

In this example, the delay correlation between each variable and y is computed once for every 2 minutes of lag, 50 times in total, i.e., Δt = 0, 2, …, 98, 100. The feature variables are then classified by combining the relationship between each input variable's delay correlation and the lag time Δt with the numerical differences and ranking of the delay correlations.

According to the feature variable classification result, the independent-variable data set is divided into a non-time-lag data set $X_{NS} = \{X_{ns,1}, \ldots, X_{ns,m}\}$ and a time-lag data set $X_{DS} = \{X_{ds,1}, \ldots, X_{ds,n}\}$, where m is the number of non-time-lag features and n is the number of time-lag features. The optimal lag time $\Delta t^*$ is determined from the visualized variable delay correlation curves and the data characteristics.

In this example, for 11 input variables $[x_5, x_{16}, x_{14}, x_{11}, x_7, x_{10}, x_{12}, x_{15}, x_9, x_{19}, x_{13}]$ the delay correlation decreases as Δt increases; these are non-time-lag feature variables and form the data set $X_{NS}$. For 8 input variables $[x_4, x_2, x_6, x_3, x_1, x_{17}, x_{18}, x_8]$ the delay correlation first increases and then decreases as Δt increases; these are time-lag feature variables and form the data set $X_{DS}$. A specific example is shown in fig. 7: from the maximum inflection point of the visualized variable delay correlation curve and the data characteristics, the optimal lag time is determined to be $\Delta t^* = 20$. The delay correlation analysis of the input variables and the target variable is shown in Table 2. The optimal lag time $\Delta t^*$ can be used to initialize the number of neurons of the neural network model.

TABLE 2

Step three: construct an indirect overflow mill load prediction model based on the multi-feature fusion neural network according to the variable feature classification result.

An adaptive deep learning method based on multi-feature fusion is adopted: different neural networks are built for the different variable feature classes in order to extract the implicit feature information contained in the different types of data features.

As shown in fig. 4, the overall network structure of the indirect overflow mill load prediction model based on the multi-feature fusion neural network comprises: the time-lag feature data temporal pattern attention network (time-lag network for short), the non-time-lag feature data LSTM network (non-time-lag network for short), the historical target variable feature data LSTM compensation network (historical target variable network for short), the multi-feature fusion layer, the dropout layer, and the fully connected layer (the output layer).

The non-time-lag network and the historical target variable network are both composed of Long Short-Term Memory (LSTM) neural units and are used to extract the temporal features, especially the short-term temporal features, in the non-time-lag feature data and in the historical data of the predicted target variable, respectively. The LSTM structure effectively alleviates the exploding and vanishing gradient problems of the recurrent neural network (Recurrent Neural Network, RNN) structure. Compared with the RNN unit, the core idea of the LSTM is a state memory cell together with several gate structures, namely an input gate, an output gate and a forget gate. The LSTM unit structure is shown in fig. 5, and the vector flow through it can be expressed by the following formulas:

$$f_t = \mathrm{sigmoid}(W_f [h_{t-1}, x_t] + b_f)$$

$$i_t = \mathrm{sigmoid}(W_i [h_{t-1}, x_t] + b_i)$$

$$o_t = \mathrm{sigmoid}(W_o [c_t, h_{t-1}, x_t] + b_o)$$

$$c_t = f_t * c_{t-1} + i_t * \tanh(W_c [h_{t-1}, x_t] + b_c)$$

$$h_t = o_t * \tanh(c_t)$$

where $f_t$, $i_t$ and $o_t$ are the coefficients of the forget gate, the input gate and the output gate respectively, $c_t$ and $h_t$ are the cell state and the output hidden state respectively, and $W$ and $b$ are the weights and biases to be trained.
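A single time step of these gate equations can be written out directly. The sketch below uses plain Python lists so that it needs no external library; the parameter dictionary layout and helper names are choices made for this illustration, and the output gate takes the new cell state $c_t$ as an extra input because that is how the equation above is written.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b for a nested-list matrix W and list vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + bias
            for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following the gate equations in the text.

    p: dict with W_f, W_i, W_c of shape hidden x (hidden + inputs),
       W_o of shape hidden x (2 * hidden + inputs), and biases b_f,
       b_i, b_c, b_o of length hidden.
    """
    hx = h_prev + x_t                                   # [h_{t-1}, x_t]
    f = [sigmoid(z) for z in affine(p["W_f"], hx, p["b_f"])]
    i = [sigmoid(z) for z in affine(p["W_i"], hx, p["b_i"])]
    g = [math.tanh(z) for z in affine(p["W_c"], hx, p["b_c"])]
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    o = [sigmoid(z) for z in affine(p["W_o"], c + hx, p["b_o"])]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


# Toy parameters: hidden size 2, input size 1, all weights zero.
params = {"W_f": zeros(2, 3), "b_f": [0.0, 0.0],
          "W_i": zeros(2, 3), "b_i": [0.0, 0.0],
          "W_c": zeros(2, 3), "b_c": [0.0, 0.0],
          "W_o": zeros(2, 5), "b_o": [0.0, 0.0]}
h, c = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
```

With all-zero weights every gate evaluates to 0.5, so the new cell state is half the previous one, which makes the role of the forget gate easy to see.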

To make better use of the non-time-lag feature data $X_{NS}$ and the historical target data $Y_d$, two separate LSTM networks are used to independently extract the features contained in the two time series, which further improves the compensation performance of the model; each network finally yields the hidden-state matrix output by its units.

In the time-lag network, an LSTM neural network first extracts the long-term and short-term temporal features of the time-lag feature data; convolution kernels then further extract the temporal patterns of the LSTM layer, forming an improved temporal pattern attention layer (Improved Temporal Patterns Attention, iTPA). The input matrix passes through the LSTM layer to give the hidden-state output matrix $H = \{h_1, h_2, \ldots, h_t\}$, the hidden-state output $h_t$ of the current time step, and the cell-state output $c_t$, which serve as the input of the iTPA network. In the iTPA network, k filters convolve the m hidden-state features to improve the learning ability of the model, producing a matrix $H^C$ with m rows and k columns. The specific time-lag network structure is shown in fig. 6. The vector flow can be expressed by the following formula:

where $H^C_{i,j}$ represents the convolution value of the i-th row vector and the j-th filter, and w is the filter window length.

The attention score function is calculated as follows:

where $W_a$ is a trainable parameter with dimension [k, 2m].

The attention score is obtained with a sigmoid function, as shown in the following formula:

$$\alpha_i = \mathrm{sigmoid}(f(s_t, h_t, c_t))$$

After the attention scores are obtained, $H^C$ is weighted along the variable dimension to obtain $v_t$, as described by the following formula:

In the overall neural network architecture, the number of LSTM layer units is set equal to the optimal lag time $\Delta t^*$, namely 20; the number of nodes per unit is m = 100; the number of LSTM layers is l = 2; the number of convolution kernels in the convolution layer is k = 10; the stride of the convolution sliding window is 1; and the batch size of the training data is set to 64.
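The attention branch described above can be sketched in NumPy using the stated sizes (m = 100, k = 10, window length 20). Several details are assumptions made only for illustration, because the corresponding formulas are not reproduced in the text: each filter is assumed to span the whole window so that each row yields one value per filter, the score is assumed to be the bilinear form $H^C_i W_a [h_t; c_t]$ (chosen because it matches the stated [k, 2m] dimension of $W_a$), and $v_t$ is assumed to be the attention-weighted sum of the rows of $H^C$. The function name `itpa_attention` is hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def itpa_attention(H, h_t, c_t, filters, W_a):
    """Temporal-pattern attention over LSTM hidden states (illustrative).

    H        : [m, w] hidden-state features (rows) over w time steps (columns)
    h_t, c_t : [m] hidden state and cell state of the current time step
    filters  : [k, w] convolution kernels, each assumed to span the window
    W_a      : [k, 2m] trainable score matrix
    """
    H_C = H @ filters.T                  # [m, k]: row i convolved with filter j
    query = np.concatenate([h_t, c_t])   # [2m]
    scores = H_C @ (W_a @ query)         # [m]: assumed bilinear score per row
    alpha = sigmoid(scores)              # [m]: attention weights
    v_t = alpha @ H_C                    # [k]: weighted sum of the rows of H^C
    return H_C, alpha, v_t


# Usage with the sizes stated above: m = 100, k = 10, window w = 20.
rng = np.random.default_rng(0)
m, k, w = 100, 10, 20
H = rng.normal(size=(m, w))
H_C, alpha, v_t = itpa_attention(
    H,
    H[:, -1],                                   # h_t: last column of H
    rng.normal(size=m),                         # c_t: stand-in cell state
    rng.normal(scale=0.1, size=(k, w)),         # k filters
    rng.normal(scale=0.1, size=(k, 2 * m)),     # W_a
)
```

Using a sigmoid rather than a softmax means the weights need not sum to one, so several hidden-state features can be emphasised at the same time.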

Construct the multi-feature fusion layer network. The hidden-state output $h_t$ of the current time step and the spatiotemporal feature $v_t$ obtained by the time-lag feature neural network, the hidden-state matrix output by the non-time-lag feature network, and the hidden-state matrix output by the historical target data network are reconstructed and combined to obtain the state output $H_{FF}$ of the feature fusion layer. The final model output $H_{FC}$ is then obtained through a Dropout layer and a fully connected layer. The specific feature fusion formulas are as follows:

$$h'_t = W_h h_t + W_v v_t$$

$$H_{FC} = W_{FC} H_{FF} + b_{FC}$$

where $W_h$, $W_v$, $W_{FF}$, $W_{FC}$ and $b_{FC}$ are trainable parameters, and σ is the node parameter of the Dropout layer, with a default value of 0.2.
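A minimal NumPy sketch of the fusion head is given below. The formulas for $h'_t$ and $H_{FC}$ follow the text; the form of $H_{FF}$ is an assumption (a linear map $W_{FF}$ applied to the concatenation of $h'_t$ and the final hidden states of the other two branches), since the text does not reproduce it. The names `fuse`, `h_ns`, `h_d` and the fused width q = 32 are hypothetical, and σ is interpreted here as the dropout rate.

```python
import numpy as np


def fuse(h_t, v_t, h_ns, h_d, p, drop_rate=0.2, training=False, rng=None):
    """Fusion head: h'_t and H_FC follow the formulas above; H_FF is assumed.

    h_t [m], v_t [k]  : outputs of the time-lag branch
    h_ns [m], h_d [m] : final hidden states of the two other LSTM branches
    p                 : hypothetical dict of W_h [m, m], W_v [m, k],
                        W_FF [q, 3m], W_FC [1, q], b_FC [1]
    """
    h_prime = p["W_h"] @ h_t + p["W_v"] @ v_t       # h'_t = W_h h_t + W_v v_t
    # Assumption: H_FF is a linear map of the concatenated branch features.
    H_FF = p["W_FF"] @ np.concatenate([h_prime, h_ns, h_d])
    if training:                                     # inverted dropout
        rng = rng or np.random.default_rng()
        mask = rng.random(H_FF.shape) >= drop_rate
        H_FF = H_FF * mask / (1.0 - drop_rate)
    return p["W_FC"] @ H_FF + p["b_FC"]              # H_FC = W_FC H_FF + b_FC


# Usage with m = 100 and k = 10 as stated in the text, and an assumed q = 32.
rng = np.random.default_rng(0)
m, k, q = 100, 10, 32
p = {
    "W_h": rng.normal(scale=0.1, size=(m, m)),
    "W_v": rng.normal(scale=0.1, size=(m, k)),
    "W_FF": rng.normal(scale=0.1, size=(q, 3 * m)),
    "W_FC": rng.normal(scale=0.1, size=(1, q)),
    "b_FC": np.zeros(1),
}
h_t, v_t, h_ns, h_d = (rng.normal(size=n) for n in (m, k, m, m))
y_hat = fuse(h_t, v_t, h_ns, h_d, p)
```

Dropout is applied only when `training=True`, so the forecast at test time is deterministic.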

Initialize the network parameters, set the hyperparameters for training the neural network, and train the network.

Initialize the network parameters and set the number of training epochs and the number of iterations per epoch; training starts once all training-related hyperparameters have been set. The neural network uses the Adam optimizer, and the loss function is MSELoss().

If the set number of training epochs (40) is reached, or the loss of the neural network has not decreased within 10 training epochs of the whole network, the whole network is considered to have converged, and the trained network parameters at that point are saved.
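One way to read this stopping rule is as early stopping with a maximum of 40 epochs and a patience of 10 epochs. The sketch below shows that logic in plain Python under that reading; `run_epoch` is a hypothetical callable standing in for one epoch of training that returns the epoch loss, and the loss curve in the usage example is made up for illustration.

```python
def train_with_early_stopping(run_epoch, max_epochs=40, patience=10):
    """Run epochs until max_epochs is reached or the loss stalls.

    run_epoch is a hypothetical callable that trains for one epoch and
    returns that epoch's loss (for example the mean squared error).
    Returns (best_loss, epochs_run).
    """
    best_loss = float("inf")
    stalled = 0
    epochs_run = 0
    for _ in range(max_epochs):
        loss = run_epoch()
        epochs_run += 1
        if loss < best_loss:
            # Improvement: this is where the network parameters would be saved.
            best_loss, stalled = loss, 0
        else:
            stalled += 1
            if stalled >= patience:
                break
    return best_loss, epochs_run


# Usage with a stand-in loss curve that stops improving after epoch 5.
losses = iter([1.0, 0.6, 0.4, 0.3, 0.25] + [0.25] * 35)
best, epochs = train_with_early_stopping(lambda: next(losses))
```

With this stand-in curve the loop stops after 15 epochs: 5 improving epochs followed by 10 epochs without improvement.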

Step 5: Test with the trained model and calculate the performance indices.

The test set data undergo the same preprocessing procedure and feature variable classification as the training set. The processed non-time-lag data set and time-lag data set are then fed into the trained network.

The mean absolute error (Mean Absolute Error, MAE), the root mean square error (Root Mean Squared Error, RMSE), the coefficient of determination (Coefficient of Determination, $R^2$) and other indices are calculated. MAE and RMSE are commonly used measures of model prediction error and describe the deviation between the predicted values and the true values. Smaller MAE and RMSE values indicate smaller model prediction errors and better algorithm performance. $R^2$ describes the proportion of the variance of the dependent variable explained by the model in the total variance and is used to judge the goodness of fit of the prediction results. The larger the value of $R^2$, the stronger the linear relationship between the independent and dependent variables and the smaller the model prediction error. The evaluation indices are calculated as follows:

$$\mathrm{MAE} = \frac{1}{l} \sum_{i=1}^{l} \left| y_i - \hat{y}_i \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{l} \sum_{i=1}^{l} \left( y_i - \hat{y}_i \right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{l} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{l} \left( y_i - \bar{y} \right)^2}$$

where the test set data length is l = 2967, $\hat{y}_i$ is the predicted value, $y_i$ is the actual value, and $\bar{y}$ is the mean of the actual values.
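The three indices can be computed with a few lines of plain Python, using the standard definitions of MAE, RMSE and the coefficient of determination. The function name `regression_metrics` and the four-point data set are illustrative only and are not the test data of the example.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) using the standard definitions."""
    n = len(y_true)
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((yt - mean_true) ** 2 for yt in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Usage on a toy series: one of four predictions is off by 1.
mae, rmse, r2 = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
# mae = 0.25, rmse = 0.5, r2 = 0.8
```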

The evaluation index comparison results of the multi-step prediction performance of different neural networks on the test set are shown in table 2.

TABLE 2

In the experiments of the above table, k denotes the prediction step of the model; the three groups of comparison experiments predict the target variable 1 minute, 5 minutes and 10 minutes ahead, respectively. The table shows that, for this example problem, the method of the invention achieves more accurate time-series prediction than the conventional LSTM neural network prediction model.

Experiments show that the invention has good mill load forecasting performance: the $R^2$ index of 1-step forecasting reaches 0.997, which is at an advanced level.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims

1. An overflow type mill load indirect forecasting method based on a multi-feature fusion neural network is characterized by comprising the following steps:

2. The overflow type mill load indirect forecasting method based on the multi-feature fusion neural network, characterized in that the mechanism model relating the mill discharge load to the motor current of the slurry pump fed by the pump pool downstream of the mill is as follows:

3. The overflow mill load indirect forecasting method based on the multi-feature fusion neural network according to claim 1, wherein preprocessing the original process variable data in the production data comprises the following steps:

4. The overflow type mill load indirect forecasting method based on the multi-feature fusion neural network according to claim 3, wherein preprocessing the raw data sampled at second-level intervals comprises the following steps:

acquiring a variable data set to be processed;

retrieving all data within each minute of the variable to be processed;

calculating a detected data maximum value M;

if M is a numerical value, calculating the average value P of the valid portion of the detected data; the value P is stored in the database as the one-minute data value of the variable to be processed.

5. The overflow type mill load indirect forecasting method based on the multi-feature fusion neural network according to claim 1, wherein the feature classification comprises: non-time-lag features, time-lag features, and extraneous features.

6. The overflow type mill load indirect forecasting method based on the multi-feature fusion neural network according to claim 1, wherein the long short-term memory network for the non-time-lag feature data and the long short-term memory compensation network for the historical target variable feature data are composed of long short-term memory units and are respectively used for extracting temporal features from the non-time-lag feature data and from the historical data of the forecast target variable.

7. The overflow type mill load indirect forecasting method based on the multi-feature fusion neural network according to claim 6, wherein the temporal pattern attention network for the time-lag feature data comprises:

8. The overflow type mill load indirect forecasting method based on the multi-feature fusion neural network according to claim 6 or 7, wherein the multi-feature fusion layer network comprises: the hidden-state output $h_t$ of the current time step and the spatiotemporal feature $v_t$ obtained by the time-lag feature neural network, the hidden-state matrix output by the non-time-lag feature network, and the hidden-state matrix output by the historical target data network

9. The method for predicting mill load based on the multi-feature fusion neural network according to claim 1, wherein training the overflow type mill load indirect prediction model based on the multi-feature fusion neural network comprises the following steps: