CN109784561A - Thickener underflow concentration prediction method based on ensemble learning - Google Patents


Info

Publication number
CN109784561A
Authority
CN
China
Prior art keywords
data
model
prediction
underflow
integrated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910036868.XA
Other languages
Chinese (zh)
Inventor
吴爱祥
刘婷
袁兆麟
王少勇
王洪江
王贻明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology Beijing USTB
Original Assignee
University of Science and Technology Beijing USTB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology Beijing USTB
Priority to CN201910036868.XA
Publication of CN109784561A
Legal status: Pending


Abstract

The present invention provides a thickener underflow concentration prediction method based on ensemble learning, belonging to the field of mining technology. The method obtains historical records of actual production, which are stored in an enterprise database, preprocesses the acquired data set, and uses the preprocessed data to construct a training set and a test set. An ensemble learning method then builds a model from the constructed training and test sets to accurately predict the underflow concentration of a deep cone thickener, and the prediction results are finally displayed on a visualization platform. The method takes into account most of the factors that influence underflow concentration, which addresses the bottleneck of current underflow concentration prediction models, namely that they consider too few influencing factors. Using an ensemble learning model also addresses the limited learning capacity of a single machine learning model and its inability to handle large-scale data, thereby providing a more effective and accurate reference for thickener control.

Description

Thickener underflow concentration prediction method based on ensemble learning
Technical field
The present invention relates to the field of mining technology, and in particular to a thickener underflow concentration prediction method based on ensemble learning.
Background art
In mining production, goaf collapse and tailings dam failure are the two main causes of mine disasters. To ensure mining safety, the paste backfill mining method is widely used around the world because it sets quickly, achieves high backfill strength, effectively controls the above disasters, and is economical and environmentally friendly. In paste backfill, unclassified tailings with a high content of fine particles are thickened into a paste slurry that does not segregate, settle, or bleed water, and the slurry is then pumped underground for backfilling. Tailings thickening is the first stage of the paste backfill process. The deep cone thickener is suited to processing fine-grained material and offers simple operation, large capacity, and high underflow concentration, so it has become the core equipment for tailings thickening. Its working principle is that tailings particles form a higher-concentration slurry under gravity, the pressure of a mud bed of a certain height, and the stirring action of the rake. The key factor in the success of paste backfill mining is the underflow concentration of the deep cone thickener: if the concentration is too high, accidents such as pipe blockage and rake jamming occur easily; if it is too low, the strength of the underground backfill is insufficient, creating a safety hazard. To achieve safe and efficient paste backfill mining, it is necessary to understand how the underflow concentration of the deep cone thickener changes and to predict it accurately. The predicted underflow concentration serves as a basis for workers carrying out paste backfill mining; detecting possible abnormal underflow concentrations early can avoid many unnecessary losses and greatly reduce safety hazards in mining production.
Many scholars in China and abroad have studied the prediction of underflow concentration. Although the models proposed in these studies discuss the influence of some factors on underflow concentration, most consider only the parameters of the thickener itself or the influence of individual stages of the thickening process. For example, Wang Yong et al. built a mathematical model of deep cone thickener underflow based on the height-to-diameter ratio to discuss the relationship between that ratio and underflow concentration [Wang Yong. Mathematical model of deep cone thickener underflow concentration based on height-to-diameter ratio [J]. Journal of Wuhan University of Technology, 2011, 33(8): 113-117.]. Their study considered only the influence of the thickener's height-to-diameter ratio, and its predictions of the underflow concentration range agreed with measurements by at most 76%. Jiao Huazhe et al., based on experimental research on the flocculation and settling characteristics of unclassified tailings, studied the influence of thickener feed concentration, flocculant unit consumption, and flocculant concentration on settling velocity and settling concentration. However, that study considered only these three factors of the thickening stage, and it examined their relationship with settling velocity and settling concentration, so it cannot predict underflow concentration directly. The prediction accuracy of these studies for underflow concentration reaches at most 76% and cannot be improved further; the bottleneck is that they consider too few influencing factors and use simple models.
As can be seen from the above, most current research on underflow concentration prediction is based on mathematical models, and prediction models based on machine learning are still rare. Machine learning models have, however, been used extensively for other prediction problems. For example, Zhang Leqin et al. used a Logistic regression model to predict the future level of urbanization [Zhang Leqin, Chen Fakui. Prediction and analysis of the potential impact of urbanization evolution on cultivated land based on the Logistic model [J]. Transactions of the Chinese Society of Agricultural Engineering, 2014, 30(04): 1-11.], and Xiao Bai et al. used a support vector machine model for spatial load forecasting [Xiao Bai, et al. Spatial load forecasting method based on multi-level clustering analysis and support vector machines [J]. Automation of Electric Power Systems, 2015, 39(12): 56-61.]. Traditional single machine learning models of this kind have a relatively complete theoretical basis and perform well on prediction problems with small amounts of data. The data used by Zhang Leqin et al. comprised only a few dozen records, and the study examined only the one-sided influence of urbanization on cultivated land, whereas cultivated land use change is in fact a complex, dynamic system with many influencing factors. The study of Xiao Bai et al. likewise processed only 380 records, so in their experiments a simple, single machine learning model performed relatively well. When the data volume is large, however, such a model can only capture the situation that dominates the statistics. The underflow concentration prediction problem involves on the order of a million records, most of which come from normal equipment operation during paste preparation, while the data set also contains several fault shutdowns and restarts. A traditional single machine learning model clearly cannot analyze and predict across the entire thickening process, and its prediction of underflow concentration under abnormal conditions is not good enough. In actual production, if abnormal underflow concentrations are not detected in time, safety hazards in production are easily overlooked.
Using the large data sets monitored during paste preparation to develop a prediction model that can analyze the entire thickening process and compute changes in underflow concentration from multiple factors, thereby providing an efficient and accurate reference for thickener control, has become a key technology in current need of research and development.
Summary of the invention
The technical problem to be solved by the present invention is to provide a thickener underflow concentration prediction method based on ensemble learning.
In research on underflow concentration prediction, most current methods are based on mathematical methods or mining theory, and models based on machine learning are still rare. With the arrival of the big data era, data mining techniques and machine learning methods have matured in both theory and application. Many enterprises and researchers have recognized the importance of data and are exploring data-driven ways to improve production efficiency, aiming to apply the results of data analysis to enterprise production operations and obtain better economic benefits.
Paste backfill systems in mining enterprises are now widely equipped with basic automation systems that provide remote control, equipment interlocking, embedded monitoring sensors, and data acquisition and storage, so the hardware conditions for applying data mining already exist. Data mining can be applied to the control of paste preparation: monitoring sensors record changes in the various process parameters in real time during paste backfill production, the recorded historical monitoring data are exported, machine learning algorithms are used to explore how the paste slurry concentration changes, and the predicted changes in underflow concentration are finally displayed on a visualization platform.
By considering the many factors that influence underflow concentration during thickening, the present invention trains an ensemble of multiple learners into a model that can accurately predict underflow concentration and analyzes the trend of underflow concentration from multiple factors, thereby providing an efficient and accurate reference for thickener control. The method comprises the following steps:
S1: Data acquisition: obtain historical records of actual production, which are recorded by the mine's automation system and stored in the enterprise database;
S2: Data preprocessing: preprocess the data obtained in S1, remove irrelevant attributes, and then perform feature selection to obtain the preprocessed data set;
S3: Construction of the training and test sets: construct the training set and test set from the data set preprocessed in S2;
S4: Building the prediction model: using an ensemble learning method, build the model from the training set and test set constructed in S3 to accurately predict the underflow concentration of the deep cone thickener;
S5: Displaying the prediction results: display the results predicted in S4 on a visualization platform.
Wherein, the irrelevant attributes in S2 include date, time, and mechanical power.
After feature selection in S2, the resulting data set contains 11 independent variable features (feed solids mass, feed flow rate, feed concentration, dilution water flow rate, flocculant dosage, mud bed pressure, rake speed, underflow pump speed 1, underflow pump speed 2, surface pump speed, and surface material level) and 1 dependent variable feature, the underflow concentration.
The training set and test set in S3 are constructed as follows:
The training set is constructed by sampling so that the time interval between records is 30 s to 2 min; the test set consists of other data from a given period that are not in the training set.
The model in S4 is built as follows:
S41: select the ensemble learning model and its sub-models according to the scale of the actual problem and the research objective;
S42: build the ensemble model to accurately predict the underflow concentration of the deep cone thickener.
Specifically:
S421: import the required packages and libraries, including the cross-validation function and the algorithm packages;
S422: set the parameters of the primary learners and the secondary learner separately;
S423: train the Stacking ensemble model, predict the test set with the trained ensemble model, and verify the performance of the ensemble model;
S424: carry out comparative experiments on underflow concentration prediction using other machine learning models.
The prediction results in S5 are displayed as follows: the interface visualization uses Echarts, a web chart component library that provides data interfaces; the processed data are passed to the corresponding component to obtain the desired visualization.
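As an illustration of passing processed data to a chart component, the following minimal sketch builds an ECharts-style option object in Python and serializes it to JSON for a web page to consume. The function name, series names, axis label, and sample values are assumptions made for illustration; the patent does not specify the chart type or the data interface it uses.

```python
import json


def build_line_chart_option(timestamps, measured, predicted):
    """Build an ECharts-style option dict with measured and predicted series.

    All names and labels here are hypothetical, not taken from the patent.
    """
    if not (len(timestamps) == len(measured) == len(predicted)):
        raise ValueError("all input sequences must have the same length")
    return {
        "title": {"text": "Underflow concentration"},
        "tooltip": {"trigger": "axis"},
        "legend": {"data": ["measured", "predicted"]},
        "xAxis": {"type": "category", "data": list(timestamps)},
        "yAxis": {"type": "value", "name": "concentration (%)"},
        "series": [
            {"name": "measured", "type": "line", "data": list(measured)},
            {"name": "predicted", "type": "line", "data": list(predicted)},
        ],
    }


# Made-up sample values, only to show the shape of the data.
option = build_line_chart_option(
    ["10:00:00", "10:01:00", "10:02:00"],
    [68.2, 68.5, 68.1],
    [68.0, 68.4, 68.3],
)
option_json = json.dumps(option)  # string a front end could hand to the chart
```

On the web side, the resulting JSON would typically be parsed and handed to the chart instance; how the platform in the patent transports the data is not described.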
The beneficial effects of the above technical solution of the present invention are as follows:
(1) Thickener underflow concentration prediction is a complex industrial prediction problem: many factors influence the change in underflow concentration, and these factors are coupled, nonlinear, and diverse. Prediction techniques based on mathematical models can usually consider only one or two influencing factors and struggle with this problem. Because machine learning can approximate nonlinear functions, the present invention uses a machine learning model based on ensemble learning that considers most of the factors influencing underflow concentration throughout the thickening process, and it offers high prediction accuracy and fast training.
(2) The ensemble learning used in the present invention combines multiple weak classifiers into one strong classifier and integrates their results to obtain the final result. The ensemble model addresses the limited learning capacity of a single machine learning model and its inability to handle large-scale data, and it improves not only prediction accuracy but also the generalization ability of the model.
Brief description of the drawings
Fig. 1 is a flow chart of the thickener underflow concentration prediction method based on ensemble learning of the present invention;
Fig. 2 is a structural schematic diagram of the deep cone thickener in the embodiment of the present invention;
Fig. 3 is a structure diagram of the ensemble-based underflow concentration prediction model of the embodiment of the present invention;
Fig. 4 is a flow chart of data acquisition in the embodiment of the present invention;
Fig. 5 is a schematic diagram of the construction of the training set and test set in the embodiment of the present invention;
Fig. 6 is a schematic diagram of the Stacking model framework of the embodiment of the present invention;
Fig. 7 is the ensemble algorithm framework of the embodiment of the present invention;
Fig. 8 is a flow diagram of sub-model training in the ensemble model of the embodiment of the present invention;
Fig. 9 is a comparison of the mean absolute error of several algorithms in the embodiment of the present invention;
Fig. 10 is a flow chart of the visualization of the underflow concentration prediction results in the embodiment of the present invention.
Detailed description of the embodiments
To make the technical problem to be solved, the technical solution, and the advantages of the present invention clearer, a detailed description is given below with reference to the drawings and a specific embodiment.
The present invention provides a thickener underflow concentration prediction method based on ensemble learning.
As shown in Fig. 1, the method comprises the following steps:
S1: Data acquisition: obtain historical records of actual production, which are recorded by the mine's automation system and stored in the enterprise database;
S2: Data preprocessing: preprocess the data obtained in S1, remove irrelevant attributes, and then perform feature selection to obtain the preprocessed data set;
S3: Construction of the training and test sets: construct the training set and test set from the data set preprocessed in S2;
S4: Building the prediction model: using an ensemble learning method, build the model from the training set and test set constructed in S3 to accurately predict the underflow concentration of the deep cone thickener;
S5: Displaying the prediction results: display the results predicted in S4 on a visualization platform.
The method is explained below with reference to a specific implementation.
The research object of the present invention is the thickener, the core equipment of the paste thickening system. Its main working principle is that a feedwell with automatic concentration dilution ensures that fine tailings particles quickly form flocs and that the flocs are not destroyed, so that a higher-concentration underflow is obtained. Some deep cone thickeners are equipped with a customized stirring mechanism that accelerates solid-liquid separation of the flocs by disrupting their stress balance. The steep cone structure forms a floc compression zone in the deep cone, which helps squeeze water out of the flocs and allows the system to reach a higher underflow concentration. The research object in the embodiment of the present invention is a PPSM-type deep cone thickener, whose basic structure is shown in Fig. 2. The deep cone thickener is the core tailings slurry thickening and dewatering equipment in most mines that currently use the paste preparation and backfill process; it features large capacity, high efficiency, and high underflow concentration, and has been adopted by many mines in China and abroad.
The other components of the backfill system are equipment that supports the operation of the deep cone thickener:
1. The feed pipe at the top of the thickener, the overflow pipe, and the flocculant preparation and dosing system: mainly responsible for feeding slurry from the mineral processing plant into the thickener, reusing the clear water that overflows from the thickener, and preparing and adding the flocculant needed for backfilling.
2. The underflow pump and circulation pipes at the bottom of the thickener: an underflow pump at the bottom outlet of the thickener provides power for the thickened slurry, which, once prepared, enters either the circulation pipe or the mixing system depending on the current operating state of the backfill system. The circulation pipes are generally of two types, low-level and high-level: the low-level pipe re-enters the thickener above the cone angle, and the high-level pipe re-enters at the middle of the cylindrical section. Both types of circulation are used to adjust the thickener underflow concentration in actual production.
Many parameters influence the underflow concentration of the deep cone thickener during deep cone thickening; their relationships are complex and strongly coupled, which is one reason why a theoretical model is difficult to establish. The present invention proposes a thickener underflow concentration prediction system based on ensemble learning, whose core is a prediction model of thickener underflow concentration based on ensemble learning. By considering the many factors that influence underflow concentration during thickening, an ensemble of multiple learners is trained into a model that can accurately predict underflow concentration, with the aim of precisely controlling changes in underflow concentration. The overall data flow of the underflow concentration prediction model of the present invention is shown in Fig. 3.
In a specific implementation, the following technical solution and steps are used:
S1: Data acquisition: obtain historical records of actual production, which are recorded by the mine's automation system and stored in the enterprise database. The data in the enterprise database are exported through an OPC (OLE for Process Control) server to obtain an Excel file that is stored locally. The data acquisition flow is shown in Fig. 4 and mainly comprises the following steps:
S11: Establish local data-source communication. Install the OPC server and place the database of the MACS V6.5.2 DCS system (the database in which the enterprise stores its data) in the access area of the OPC server so that the OPC server can read data from the DCS database. OPC comprises a complete set of standard interfaces, properties, and methods for process control and manufacturing automation systems;
S12: Establish OPC communication between the OPC client and the OPC server. OPC communication is implemented according to the OPC specifications, mainly the DA (Data Access) specification, which defines data transfer within OPC and mainly contains the interface functions related to data transfer;
S13: Export the Excel file from the client. The OPC client can group data. Right-clicking the group to be exported opens that group's context menu; selecting the "export excel" item opens a directory selection dialog, and the Excel file is saved after the file is named and a directory is chosen. The acquired data span nearly one year and total about 1.1 million records, with an interval of 10 seconds between rows. The data are time-ordered, that is, they are recorded chronologically by the same automation system of the same plant. The acquired data set includes the flow rates and concentrations of pipelines at different locations of the backfill station, parameters of the flocculant preparation system, the binder dosing system, and the deep cone thickener such as feed flow rate and feed concentration, and the real-time power of each pump in the system, for a total of 73 attributes. This data set is referred to as the raw data.
S2: Data preprocessing: preprocess the acquired data set, remove attributes unrelated to the subsequent study, such as date, time, and mechanical power, and then perform feature selection. The specific steps are as follows: preprocess the raw data and determine the attributes required for subsequent work. Some attributes in the raw data have little to do with the backfill operating system; these attributes unrelated to the subsequent study, including date, time, and mechanical power, are removed first. Through discussion with the designers of the enterprise automation system and experts in the paste backfill field, the following were selected: rake speed, corresponding to stirring rate; mud bed pressure, corresponding to mud bed height; feed solids mass, feed flow rate, and feed concentration, corresponding to thickener feed concentration; and flocculant dosage and dilution water flow rate, corresponding to flocculant unit consumption and flocculant concentration. In addition, based on the field equipment and sensor layout, four further parameters were selected: underflow pump speed 1, underflow pump speed 2, surface pump speed, and surface material level. Pump speed affects the residence time of material in the production equipment across the whole production stage and hence the settling time of the paste, while the surface material level records the state of the slurry. These four parameters influence underflow concentration indirectly and were therefore included. The rearranged data set contains 11 independent variable features (feed solids mass, feed flow rate, feed concentration, dilution water flow rate, flocculant dosage, mud bed pressure, rake speed, underflow pump speed 1, underflow pump speed 2, surface pump speed, and surface material level) and 1 dependent variable feature, the underflow concentration.
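The preprocessing described in S2 can be sketched roughly as follows. The column names are hypothetical English labels chosen for this example (the patent does not give the attribute names used in the exported Excel file), and the rows are assumed to have already been read into dictionaries.

```python
# Hypothetical column names; the real exported file may label them differently.
IRRELEVANT = {"date", "time", "mechanical_power"}
FEATURES = [
    "feed_solids_mass",
    "feed_flow_rate",
    "feed_concentration",
    "dilution_water_flow",
    "flocculant_dosage",
    "mud_bed_pressure",
    "rake_speed",
    "underflow_pump_speed_1",
    "underflow_pump_speed_2",
    "surface_pump_speed",
    "surface_material_level",
]
TARGET = "underflow_concentration"


def preprocess(rows):
    """Drop irrelevant attributes and keep the 11 features plus the target."""
    X, y = [], []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in IRRELEVANT}
        X.append([float(kept[name]) for name in FEATURES])
        y.append(float(kept[TARGET]))
    return X, y


# One made-up raw record, only to show the shape of the input.
raw_row = dict.fromkeys(FEATURES, 1.0)
raw_row.update({"date": "2018-06-01", "time": "10:00:00",
                "mechanical_power": 35.0, TARGET: 68.5})
X, y = preprocess([raw_row])
```

Any of the remaining raw attributes that are not in the feature list are simply ignored by this sketch.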
S3: Construction of the training and test sets. The training set and test set are constructed from the preprocessed data set by equal-step sampling with different starting positions and step sizes, as follows:
S31: If the interval between rows of the collected data is too short, then to prevent over-fitting (the machine learning model performs too well on the training samples and consequently performs poorly on the validation and test data sets), the training set must be constructed by sampling so that the interval between rows of the training data is about one minute. The specific steps are as follows: the archived data acquired in the present invention are recorded at 10-second intervals. Training a model at this interval would involve a large amount of data, which would not only incur a huge training time cost but would also tend to cause over-fitting. The training set is therefore sampled with a step of 5 records. Because the monitored values are unstable when the thickener has just started, selection begins at the 20th record of the preprocessed data set and runs to the 1,000,020th record (to give a round 1,000,000 records), with one record taken as training data after every 5 records, that is, every sixth record. The resulting training data comprise 166,666 (1,000,000/6) records with 11 independent variables (train_X) and 1 dependent variable, the underflow concentration (train_Y).
S32: The test set does not need to be constructed by sampling. It suffices to choose data from a given period of the preprocessed data set that are not in the training set, because the test set serves to test the accuracy of the trained model, so the time interval between records in the test set does not affect model performance. The specific steps are as follows: as described in S31, the training set is selected from the 20th record of the preprocessed data set to the 1,000,020th record (to give a round 1,000,000 records), with one record taken after every 5 records; the test set consists of all other records between the 20th and the 1,000,020th record that do not belong to the training set.
The operation is illustrated in Fig. 5.
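The sampling scheme of S31 and S32 can be sketched as below. The stride of 6 is inferred from the stated 1,000,000/6 record count and the roughly one-minute spacing at 10-second rows; the function name and defaults are assumptions for illustration, and the exact record count depends on how the window endpoints are handled.

```python
def split_by_stride(records, start=19, count=1_000_000, stride=6):
    """Split a time-ordered list into training and test records.

    start is zero-based, so start=19 is the 20th record. Every `stride`-th
    record inside the window goes to the training set; every other record
    inside the window goes to the test set.
    """
    window = range(start, min(start + count, len(records)))
    train = [records[i] for i in window if (i - start) % stride == 0]
    test = [records[i] for i in window if (i - start) % stride != 0]
    return train, test


# Small stand-in for the preprocessed data set: 100 records numbered 0..99.
records = list(range(100))
train, test = split_by_stride(records, start=19, count=60, stride=6)
```

With the toy input above, the training records are 19, 25, 31, and so on, and the test set holds the remaining records of the same window.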
S4: Building the prediction model: using an ensemble learning method, build the model from the training set and test set constructed above to accurately predict the underflow concentration of the deep cone thickener.
The present invention uses the sub-model-based approach to ensemble learning: several machine learning models are used to model the target, and ensemble theory then combines them into an overall ensemble algorithm. The sub-models should be "good and different". "Good" means that each sub-model should itself predict well; "different" means that the sub-models should differ substantially in structure, so that the ensemble can predict better than any sub-model used alone. There are currently four main sub-model-based ensemble methods: Bagging, Boosting, Blending, and Stacking. The present invention takes the Stacking method as an example.
Stacking is a layered model ensemble method. The bottom layer consists of base learners whose input is the constructed training set; the next layer is trained on the outputs of the first-layer models, and the final-layer classifier produces the overall output. In practice, the constructed training set is usually divided by K-fold cross-validation (the training set is split into k mutually exclusive subsets of similar size; each time, the union of k-1 subsets is used for training and the remaining subset for testing, which yields k training/test splits and hence k rounds of training and testing). Fig. 6 shows a schematic of a two-layer Stacking model framework. The first layer consists of three primary learners and the second layer of one learner; the second layer is trained on the output of the first layer and produces the overall output.
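The K-fold division described above can be illustrated with a small dependency-free sketch. It uses contiguous, unshuffled folds, which is only one possible choice; the function name is hypothetical, and the patent's own implementation relies on sklearn's KFold rather than on code like this.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs over range(n_samples).

    The samples are cut into k mutually exclusive, contiguous folds of
    similar size; each fold serves once as the test subset while the union
    of the other k-1 folds is the training subset.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    splits = []
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, folds[i]))
    return splits


splits = k_fold_indices(10, 3)  # three splits over ten samples
```

In a Stacking setup, the predictions made on each held-out fold are typically collected to form the training input of the next layer, so that the second-layer learner never sees predictions made on data the first layer was fitted on.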
Specifically, this comprises the following:
S41: Select a suitable number of Stacking layers and suitable sub-models according to the scale of the actual problem and the research objective. The present invention builds the overall ensemble framework with a two-layer Stacking scheme. On the one hand, too many layers keep increasing the computational cost and make training too long; on the other hand, a two-layer ensemble architecture accomplishes the research objective of the present invention well and lets the overall algorithm reach the required accuracy. The base sub-models (primary learners) are strong classifiers: extreme gradient boosting (eXtreme Gradient Boosting, XGBoost), LightGBM, GBDT, random forest (Random Forests, RF), and a BP (backpropagation) neural network. The second-layer learner again uses XGBoost as the model that outputs the final result. The ensemble algorithm framework is shown in Fig. 7.
S42: Build the ensemble model, comprising the following steps:
S421: Import the required packages and libraries, comprising the following steps:
S4211: Import KFold from sklearn.model_selection; KFold is the utility used for cross-validation in the sklearn package.
S4212: Import the sub-model libraries. Specifically: import xgboost and lightgbm; import RandomForestRegressor (RF) and GradientBoostingRegressor (GBDT) from sklearn.ensemble; import MLPRegressor (BP) from sklearn.neural_network.
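A two-layer Stacking regressor of the kind described in S41 might be sketched as follows. This is not the patent's implementation: it substitutes scikit-learn's StackingRegressor for the hand-built framework, uses only RF, GBDT, and an MLP as primary learners, and uses GBDT in place of XGBoost as the second-layer learner so that the example needs scikit-learn alone. The data are synthetic and the hyperparameters are kept small for speed.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data with 11 features, as in the selected feature set.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 11))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=200)

# First layer: structurally different primary learners.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=20, max_depth=6, random_state=0)),
    ("gbdt", GradientBoostingRegressor(n_estimators=50, random_state=0)),
    ("bp", MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0)),
]

# Second layer: trained on cross-validated predictions of the first layer.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=GradientBoostingRegressor(n_estimators=50, random_state=0),
    cv=5,
)
stack.fit(X[:150], y[:150])
pred = stack.predict(X[150:])
```

The cv=5 argument makes the second layer learn from out-of-fold predictions, which corresponds to the K-fold division mentioned in the description of Stacking.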
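The imports listed in S4211 and S4212 would look roughly like this in Python. The guarded imports for xgboost and lightgbm and the KFold settings (5 shuffled folds with a fixed seed) are assumptions added for this sketch; the patent does not state the number of folds at this point.

```python
from sklearn.model_selection import KFold
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.neural_network import MLPRegressor

# xgboost and lightgbm are separate third-party packages; each import is
# guarded so that this sketch still runs where one of them is not installed.
try:
    import xgboost as xgb
except ImportError:
    xgb = None
try:
    import lightgbm as lgb
except ImportError:
    lgb = None

# Assumed settings, chosen only for illustration.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
```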
S422: Set the parameters of each sub-model, comprising the following steps:
S4221: The parameters of the xgboost model (xgb_model_1) are set as follows:
'objective': 'reg:linear' (the loss function to be minimized),
'learning_rate': 0.02 (learning rate),
'n_estimators': 2000 (number of boosting iterations),
'max_depth': 5 (maximum depth of a tree. This value also helps avoid over-fitting: the larger max_depth is, the more specific and local the patterns the model learns; max_depth is set to 5),
'subsample': 0.9 (the fraction of samples randomly drawn for each tree. Lowering this value makes the algorithm more conservative and helps avoid over-fitting, but setting it too low may cause under-fitting; subsample is set to 0.9),
'colsample_bytree': 0.9 (the fraction of columns randomly sampled for each tree; each column is a feature),
'min_child_weight': 10 (the minimum sum of sample weights in a leaf node)
S4222: The parameters of the lightgbm model (lgb_model) are set as follows:
'learning_rate': 0.01 (learning rate, i.e. the step size of gradient descent),
'n_estimators': 1250 (number of boosting iterations),
'max_bin': 10 (the maximum number of bins into which feature values are bucketed),
'subsample': 0.8 (the fraction of samples randomly drawn for each tree. Lowering this value makes the algorithm more conservative and helps avoid over-fitting, but too small a value may cause under-fitting. Typical values: 0.5-1),
'subsample_freq': 10 (subsampling frequency),
'colsample_bytree': 0.8 (the sampling ratio of features, i.e. columns),
'min_child_samples': 500 (the minimum number of samples required in a leaf).
S4223: The parameters of the RF model (rf_model) are set as follows:
'n_estimators': 200 (number of trees in the forest),
'max_depth': 6 (tree depth; the larger the depth, the more likely over-fitting),
'min_samples_split': 70 (the minimum number of samples required to split an internal node. This value limits whether a subtree continues to split: if a node has fewer samples than min_samples_split, no further attempt is made to select an optimal feature for splitting),
'min_samples_leaf': 30 (the minimum number of samples in a leaf node. If a leaf would have fewer samples than this value, it is pruned together with its sibling node).
S4224: The parameter of the GBDT model (gb_model) is set as: max_depth=5 (tree depth; the larger the depth, the more likely over-fitting).
S4225: The parameter of the BP model (BP_model) is set as: hidden_layer_sizes=(100,), a tuple of length n_layers-2 whose default is (100,); the i-th element gives the number of neurons in the i-th hidden layer.
S4226: The parameters of the secondary learner xgb_model_2 (parameter meanings as above) are set as:
n_estimators=2000,
objective='reg:linear',
learning_rate=0.1.
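For reference, the settings listed in S4221 to S4226 can be collected into plain Python dictionaries. This is only a sketch of one possible layout: the dictionary names (`XGB_MODEL_1_PARAMS` and so on) are assumptions made here, while the keys and values are those given in the text.

```python
# Hyperparameters from S4221-S4226, gathered as keyword-argument dicts.
XGB_MODEL_1_PARAMS = {
    # As written in the source; newer xgboost releases name this
    # objective "reg:squarederror".
    "objective": "reg:linear",
    "learning_rate": 0.02,
    "n_estimators": 2000,
    "max_depth": 5,
    "subsample": 0.9,
    "colsample_bytree": 0.9,
    "min_child_weight": 10,
}

LGB_MODEL_PARAMS = {
    "learning_rate": 0.01,
    "n_estimators": 1250,
    "max_bin": 10,
    "subsample": 0.8,
    "subsample_freq": 10,
    "colsample_bytree": 0.8,
    "min_child_samples": 500,
}

RF_MODEL_PARAMS = {
    "n_estimators": 200,
    "max_depth": 6,
    "min_samples_split": 70,
    "min_samples_leaf": 30,
}

GB_MODEL_PARAMS = {"max_depth": 5}

# A one-element tuple needs the trailing comma in Python.
BP_MODEL_PARAMS = {"hidden_layer_sizes": (100,)}

XGB_MODEL_2_PARAMS = {
    "n_estimators": 2000,
    "objective": "reg:linear",
    "learning_rate": 0.1,
}

# Hypothetical registry keyed by the model names used in the text.
SUBMODEL_PARAMS = {
    "xgb_model_1": XGB_MODEL_1_PARAMS,
    "lgb_model": LGB_MODEL_PARAMS,
    "rf_model": RF_MODEL_PARAMS,
    "gb_model": GB_MODEL_PARAMS,
    "BP_model": BP_MODEL_PARAMS,
    "xgb_model_2": XGB_MODEL_2_PARAMS,
}
```

Assuming the corresponding packages are installed, each dictionary can be unpacked into its constructor, for example `xgboost.XGBRegressor(**XGB_MODEL_1_PARAMS)`, `lightgbm.LGBMRegressor(**LGB_MODEL_PARAMS)` or `sklearn.ensemble.RandomForestRegressor(**RF_MODEL_PARAMS)`.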
S423: Train the Stacking ensemble model, predict the test set with the trained ensemble model, and verify the performance of the ensemble model, with the following specific steps:
S4231: Define the parameters of the ensemble model. The k in K-fold cross-validation is defined as k=5; the secondary learner is stacker=xgb_model_2; the primary learners are base_models=(xgb_model_1, rf_model, gb_model, lgb_model, BP_model);
S4232: Import the constructed training set and test set: train_X, train_Y, test_X, test_Y;
S4233: Call KFold to split the training set (train_X, train_Y). The KFold parameters are set as: n_splits (number of folds)=k, shuffle (whether to shuffle the data before splitting)=True, random_state (random seed, used when shuffle==True)=2016. This yields 5 training/test groups (let j denote the j-th group, j=1,2,3,4,5).
S4234: For each sub-model among the primary learners (let i denote the i-th sub-model, i=1,2,3,4,5), execute the following steps in a loop:
First, train the sub-model (i=1) on the first training group (j=1) to obtain a trained model, and use this model to predict the first test group. The prediction result is stored in the i-th column of an array S_train (here i=1, so the first column). At the same time, use this model to predict the test set test_X; this second prediction result is stored in the j-th column of an array S_test_i (here j=1, so the first column).
Likewise, train the same sub-model on the second training group (j=2, i=1) to obtain another trained model. The prediction result for the second test group is again stored in the i-th column of S_train (here i=1, so the first column), appended after the existing data, and the prediction result for test_X is stored in the j-th column of S_test_i (here j=2, so the second column).
Perform the same operations on the remaining three training/test groups. In the end, S_train holds one column of data and S_test_i holds five columns. The five columns of S_test_i are averaged, and the result is stored in the i-th column of another array S_test (here i=1, so the first column).
At this point the training of the first sub-model is complete; the flow is shown in Fig. 8.
S4235: For the remaining 4 sub-models (i = 2, 3, 4, 5), execute step S4234. In the end, the array S_train holds five columns of data and the array S_test also holds five columns;
S4236: Call the secondary learner stacker, using S_train as the independent variables and train_Y as the dependent variable of the training set. Training stacker gives a trained secondary learner model, which then predicts on S_test to obtain the final prediction y_predict. Comparing this prediction with test_Y of the test set gives the performance of the trained ensemble model.
The present invention uses MAE (mean absolute error) as the evaluation index of model performance. MAE is defined as $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\mathrm{True}(i) - \mathrm{predict}(i)\right|$, where True(i) is the true underflow concentration of the i-th record in the test set, predict(i) is the model's predicted underflow concentration for the i-th record, n is the number of records in the test set, and |·| denotes the absolute value of the difference. From the viewpoint of the paste-filling preparation process, the filling requirement is considered met when the paste slurry deviates from its target by no more than 1%. Therefore, if the established model predicts the underflow concentration on the test set with MAE≤1, its prediction accuracy can be regarded as meeting the requirement, and in actual industrial production it can serve as a highly credible reference both for understanding the production operating state and for predicting hidden shutdown risks.
S4237: Call the function mean_absolute_error() to compute the mean absolute error between y_predict and test_Y; the MAE obtained for the ensemble model is 0.5875%.
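The procedure of S4233 to S4237 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: it uses only the Python standard library, the two toy learners (`MeanRegressor`, `RowMeanLinearRegressor`) are hypothetical stand-ins for the XGBoost, LightGBM, GBDT, RF and BP sub-models, the data are synthetic, and the helper names `make_folds`, `stacking_fit_predict` and `mae` are invented here.

```python
import random

class MeanRegressor:
    """Toy learner (hypothetical): always predicts the training mean."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self
    def predict(self, X):
        return [self.mean_ for _ in X]

class RowMeanLinearRegressor:
    """Toy learner (hypothetical): least-squares line on each row's mean."""
    def fit(self, X, y):
        z = [sum(row) / len(row) for row in X]
        z_bar, y_bar = sum(z) / len(z), sum(y) / len(y)
        var = sum((a - z_bar) ** 2 for a in z)
        cov = sum((a - z_bar) * (b - y_bar) for a, b in zip(z, y))
        self.slope_ = cov / var if var else 0.0
        self.intercept_ = y_bar - self.slope_ * z_bar
        return self
    def predict(self, X):
        return [self.intercept_ + self.slope_ * sum(r) / len(r) for r in X]

def make_folds(n, k, seed):
    """Shuffle the row indices and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

def stacking_fit_predict(base_models, stacker, train_X, train_Y, test_X,
                         k=5, seed=2016):
    n, m = len(train_X), len(base_models)
    folds = make_folds(n, k, seed)
    S_train = [[0.0] * m for _ in range(n)]
    S_test = [[0.0] * m for _ in range(len(test_X))]
    for i, model in enumerate(base_models):
        S_test_i = []  # one set of test_X predictions per fold
        for hold_idx in folds:
            hold = set(hold_idx)
            fit_idx = [t for t in range(n) if t not in hold]
            model.fit([train_X[t] for t in fit_idx],
                      [train_Y[t] for t in fit_idx])
            # Held-out predictions fill column i of S_train; writing to
            # row t keeps S_train aligned with train_Y.
            held_pred = model.predict([train_X[t] for t in hold_idx])
            for t, p in zip(hold_idx, held_pred):
                S_train[t][i] = p
            S_test_i.append(model.predict(test_X))
        # Average the k test_X predictions into column i of S_test.
        for r in range(len(test_X)):
            S_test[r][i] = sum(col[r] for col in S_test_i) / k
    stacker.fit(S_train, train_Y)  # secondary learner trained on S_train
    return stacker.predict(S_test), S_train, S_test

def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic data, for illustration only.
rng = random.Random(0)
def make_data(n):
    X = [[rng.uniform(0, 1) for _ in range(3)] for _ in range(n)]
    y = [60 + 10 * sum(row) / 3 + rng.gauss(0, 0.1) for row in X]
    return X, y

train_X, train_Y = make_data(100)
test_X, test_Y = make_data(30)
y_predict, S_train, S_test = stacking_fit_predict(
    [MeanRegressor(), RowMeanLinearRegressor()], RowMeanLinearRegressor(),
    train_X, train_Y, test_X)
print(f"MAE on the synthetic test set: {mae(test_Y, y_predict):.4f}")
```

The function only relies on `fit` and `predict`, so regressors from sklearn, xgboost or lightgbm could in principle be passed in place of the toy learners.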
S424: Carry out comparative experiments on underflow concentration prediction. Predictions are made on the same training set and test set; the conventional machine-learning algorithms selected for comparison are support vector regression (Support Vector Regression, SVR), RF, GBDT, XGBoost, LightGBM, linear regression, and a BP neural network. Specifically:
S4241: Comparative experiment with support vector regression. Specific steps: import SVR from sklearn.svm, set the SVR parameters C=1.0 and epsilon=0.2, train the SVR model on the training set train_X, train_Y to obtain a trained SVR model, predict the test set test_X with the trained model, and then call mean_absolute_error(); the MAE obtained for the SVR model is 3.2709%.
S4242: Comparative experiment with XGBoost. Specific steps: import the xgboost package, use its XGBRegressor model with the parameters n_estimators=2000, objective='reg:linear', learning_rate=0.1, train the XGBoost model on the training set train_X, train_Y to obtain a trained XGBoost model, predict the test set test_X with the trained model, and then call mean_absolute_error(); the MAE obtained for the XGBoost model is 0.8297%.
S4243: Comparative experiment with LightGBM. The specific steps are the same as in S4242; for the model parameters see S4222. The MAE obtained for the LightGBM model is 0.9313%.
S4244: Comparative experiment with linear regression. The specific steps are roughly the same as in S4242, except that LinearRegression is imported from sklearn.linear_model and no parameters need to be set. The MAE obtained for the linear regression model is 2.7226%.
S4245: Comparative experiment with the BP neural network. The specific steps are roughly the same as in S4242; for the parameters see S4225. The MAE obtained for the BP neural network model is 2.5781%.
S4246: Comparative experiment with RF. Import RandomForestRegressor (RF) from sklearn.ensemble; for the parameters see S4223. The MAE obtained for the RF model is 2.1653%.
S4247: Comparative experiment with GBDT. Import GradientBoostingRegressor (GBDT) from sklearn.ensemble; for the parameters see S4224. The MAE obtained for the GBDT model is 0.90633%.
S425: Experimental results. The comparison of the mean absolute errors of the algorithms is shown in Fig. 9.
The mean absolute error is 0.8297% for XGBoost, 0.9313% for LightGBM, 0.90633% for the gradient boosting tree (GBDT), 2.1653% for random forest, 0.5875% for the ensemble algorithm, 2.5781% for the BP neural network, 2.7226% for linear regression, and 3.2709% for support vector regression. The ensemble model therefore has the smallest mean absolute error and the highest accuracy. LightGBM, GBDT and XGBoost perform similarly, and SVR performs worst.
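The comparison can be checked with a few lines of Python that rank the reported MAE values and compare them with the MAE≤1 criterion stated earlier. The numbers are those reported in S4237 and S4241 to S4247; the variable names are assumptions made for this sketch.

```python
# MAE values (in percent) as reported in the comparative experiments.
reported_mae = {
    "Stacking ensemble": 0.5875,
    "XGBoost": 0.8297,
    "GBDT": 0.90633,
    "LightGBM": 0.9313,
    "Random forest": 2.1653,
    "BP neural network": 2.5781,
    "Linear regression": 2.7226,
    "SVR": 3.2709,
}
THRESHOLD = 1.0  # acceptance criterion MAE <= 1 from the description

# Sort from lowest (best) to highest (worst) error.
ranking = sorted(reported_mae.items(), key=lambda kv: kv[1])
meets = [name for name, value in ranking if value <= THRESHOLD]

for name, value in ranking:
    status = "meets criterion" if value <= THRESHOLD else "above criterion"
    print(f"{name:<18} {value:7.4f}  {status}")

# Relative reduction in MAE of the ensemble versus the best single model.
best_single = min(v for k, v in reported_mae.items() if k != "Stacking ensemble")
improvement = 1 - reported_mae["Stacking ensemble"] / best_single
print(f"Relative MAE reduction vs. best single model: {improvement:.1%}")
```

Only the ensemble and the three boosting models fall under the criterion; the remaining four models exceed it.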
S43: Encapsulate the underflow concentration prediction model. The trained underflow concentration prediction model is wrapped in a function, and an interface is built that takes an Excel file as input and outputs the predicted underflow concentration.
S5: Display the prediction result: the prediction result is displayed on a visualization platform. The interface is visualized with ECharts, a web charting component that provides a data interface; passing the processed data to the corresponding component yields the expected visualization. The operating procedure is shown in Fig. 10 and specifically includes the following steps:
S51: In the front-end JavaScript code, an asynchronous GET request issued via ajax (a technique for creating fast, dynamic web pages) obtains the underflow concentration data; a URL routes the request to the function get_underflowConcentration(), which obtains the underflow concentration data;
S52: The function get_underflowConcentration() fetches the predicted underflow concentration data and returns it to the front end as an array; the front end updates the data of the ECharts line chart with it, and the interface refreshes the underflow concentration line chart.
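The server side of S51 and S52 might be sketched as below. The text does not name a web framework, so this example assumes Python's standard `http.server`; the URL path `/underflow_concentration`, the handler and function names other than `get_underflowConcentration`, and the returned numbers (placeholders, not measured data) are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def get_underflowConcentration():
    """Return the predicted underflow concentrations as a list.

    Placeholder values only; a real deployment would return the latest
    predictions of the trained model.
    """
    return [64.8, 65.1, 65.3]

class PredictionHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical URL that the front-end ajax GET request would call.
        if self.path == "/underflow_concentration":
            body = json.dumps(get_underflowConcentration()).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)  # JSON array consumed by the chart
        else:
            self.send_error(404)

def run_server(port=8000):
    """Serve requests until interrupted (not started automatically)."""
    HTTPServer(("127.0.0.1", port), PredictionHandler).serve_forever()
```

Calling `run_server()` starts the endpoint; the front end would then pass the returned array to the line chart's series data through the ECharts `setOption` call.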
The above is a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principles of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (6)

1. A thickener underflow concentration prediction method based on ensemble learning, characterized by comprising the following steps:
S1: Data acquisition: obtain historical records of actual production, which are recorded by the mine's automation system and stored in the enterprise database;
S2: Data preprocessing: preprocess the data obtained in S1, remove irrelevant attributes, and then perform feature selection to obtain the preprocessed data set;
S3: Construct the training set and test set: use the preprocessed data set from S2 to construct the training set and test set;
S4: Build the prediction model: using an ensemble learning method, build the model with the training set and test set constructed in S3 to achieve accurate prediction of the underflow concentration of the deep cone thickener;
S5: Display the prediction result: display the result predicted in S4 on a visualization platform.
2. The thickener underflow concentration prediction method based on ensemble learning according to claim 1, characterized in that: the irrelevant data in S2 include date, time, and mechanical power.
3. The thickener underflow concentration prediction method based on ensemble learning according to claim 1, characterized in that: after feature selection in S2, the resulting data set includes 11 independent-variable features in total (feed solids amount, feed rate, feed concentration, dilution water flow, flocculant amount, mud layer pressure, rake speed, underflow pump speed one, underflow pump speed two, surface pump speed, and surface stock level) and 1 dependent-variable feature, the underflow concentration.
4. The thickener underflow concentration prediction method based on ensemble learning according to claim 1, characterized in that: the training set and test set in S3 are constructed as follows:
The training set is constructed by sampling, so that the time interval between records is 30 s to 2 min; for the test set, data from a certain period that are not in the training set are selected.
5. The thickener underflow concentration prediction method based on ensemble learning according to claim 1, characterized in that: the model in S4 is built as follows:
S41: Select the ensemble learning model and the sub-models according to the scale of the actual problem and the research goal;
S42: Build the ensemble model to achieve accurate prediction of the underflow concentration of the deep cone thickener.
6. The thickener underflow concentration prediction method based on ensemble learning according to claim 1, characterized in that: the prediction result in S5 is displayed as follows: the interface is visualized with ECharts, a web charting component that provides a data interface; passing the processed data to the corresponding component yields the expected visualization.
CN201910036868.XA 2019-01-15 2019-01-15 A kind of thickener underflow concentration prediction method based on integrated study Pending CN109784561A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910036868.XA CN109784561A (en) 2019-01-15 2019-01-15 A kind of thickener underflow concentration prediction method based on integrated study

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910036868.XA CN109784561A (en) 2019-01-15 2019-01-15 A kind of thickener underflow concentration prediction method based on integrated study

Publications (1)

Publication Number Publication Date
CN109784561A true CN109784561A (en) 2019-05-21

Family

ID=66500522

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910036868.XA Pending CN109784561A (en) 2019-01-15 2019-01-15 A kind of thickener underflow concentration prediction method based on integrated study

Country Status (1)

Country Link
CN (1) CN109784561A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110276128A (en) * 2019-06-21 2019-09-24 东北大学 A kind of thickener underflow concentration prediction method based on DAJYPLS algorithm
CN110472778A (en) * 2019-07-29 2019-11-19 上海电力大学 A kind of short-term load forecasting method based on Blending integrated study
CN110874373A (en) * 2019-12-10 2020-03-10 杭州岑石能源科技有限公司 Linear variation relation judgment method based on machine learning stacking model
CN110942147A (en) * 2019-11-28 2020-03-31 支付宝(杭州)信息技术有限公司 Neural network model training and predicting method and device based on multi-party safety calculation
CN111199343A (en) * 2019-12-24 2020-05-26 上海大学 Multi-model fusion tobacco market supervision abnormal data mining method
CN111210085A (en) * 2020-01-15 2020-05-29 重庆邮电大学 Coal mine gas concentration early warning method based on multi-view ensemble learning
CN111507507A (en) * 2020-03-24 2020-08-07 重庆森鑫炬科技有限公司 Big data-based monthly water consumption prediction method
CN111753399A (en) * 2020-05-26 2020-10-09 中南大学 Method for predicting filling slurry ring pipe pressure drop by machine learning
CN111986814A (en) * 2020-08-21 2020-11-24 南通大学 Modeling method of lupus nephritis prediction model of lupus erythematosus patient
CN112292642A (en) * 2018-06-27 2021-01-29 西门子股份公司 Control device for controlling a technical system and method for configuring a control device
CN112445136A (en) * 2020-12-16 2021-03-05 北京科技大学 Thickener prediction control method and system based on continuous time neural network
CN112995202A (en) * 2021-04-08 2021-06-18 昆明理工大学 SDN-based DDoS attack detection method
CN113723495A (en) * 2021-08-25 2021-11-30 广西大学 Electromagnetic equipment multi-parameter non-invasive identification method, system, equipment and storage medium based on integrated algorithm

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106283806A (en) * 2016-08-30 2017-01-04 东北大学 A kind of high consistency refining system pulp quality control method and system
CN108734197A (en) * 2018-04-17 2018-11-02 东北大学 A kind of Fault monitoring and diagnosis method of the dense washing process of hydrometallurgy
CN109065171A (en) * 2018-11-05 2018-12-21 苏州贝斯派生物科技有限公司 The construction method and system of Kawasaki disease risk evaluation model based on integrated study
CA3079625A1 (en) * 2017-10-24 2019-05-02 Nuralogix Corporation System and method for camera-based stress determination

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112292642A (en) * 2018-06-27 2021-01-29 西门子股份公司 Control device for controlling a technical system and method for configuring a control device
CN112292642B (en) * 2018-06-27 2023-09-01 西门子股份公司 Control device for controlling a technical system and method for configuring a control device
CN110276128A (en) * 2019-06-21 2019-09-24 东北大学 A kind of thickener underflow concentration prediction method based on DAJYPLS algorithm
CN110472778A (en) * 2019-07-29 2019-11-19 上海电力大学 A kind of short-term load forecasting method based on Blending integrated study
CN110942147A (en) * 2019-11-28 2020-03-31 支付宝(杭州)信息技术有限公司 Neural network model training and predicting method and device based on multi-party safety calculation
CN110874373A (en) * 2019-12-10 2020-03-10 杭州岑石能源科技有限公司 Linear variation relation judgment method based on machine learning stacking model
CN111199343A (en) * 2019-12-24 2020-05-26 上海大学 Multi-model fusion tobacco market supervision abnormal data mining method
CN111210085B (en) * 2020-01-15 2023-01-24 重庆邮电大学 Coal mine gas concentration early warning method based on multi-view ensemble learning
CN111210085A (en) * 2020-01-15 2020-05-29 重庆邮电大学 Coal mine gas concentration early warning method based on multi-view ensemble learning
CN111507507A (en) * 2020-03-24 2020-08-07 重庆森鑫炬科技有限公司 Big data-based monthly water consumption prediction method
CN111753399A (en) * 2020-05-26 2020-10-09 中南大学 Method for predicting filling slurry ring pipe pressure drop by machine learning
CN111986814A (en) * 2020-08-21 2020-11-24 南通大学 Modeling method of lupus nephritis prediction model of lupus erythematosus patient
CN111986814B (en) * 2020-08-21 2024-01-16 南通大学 Modeling method of lupus nephritis prediction model of lupus erythematosus patient
CN112445136A (en) * 2020-12-16 2021-03-05 北京科技大学 Thickener prediction control method and system based on continuous time neural network
CN112995202A (en) * 2021-04-08 2021-06-18 昆明理工大学 SDN-based DDoS attack detection method
CN113723495A (en) * 2021-08-25 2021-11-30 广西大学 Electromagnetic equipment multi-parameter non-invasive identification method, system, equipment and storage medium based on integrated algorithm

Similar Documents

Publication Publication Date Title
CN109784561A (en) A kind of thickener underflow concentration prediction method based on integrated study
CN110414788A (en) A kind of power quality prediction technique based on similar day and improvement LSTM
CN103984788B (en) A kind of coal entry anchor rod support automated intelligent design and optimization system
CN106168965A (en) Knowledge mapping constructing system
CN106150477A (en) A kind of method determining single well controlled reserves
CN107122860A (en) Bump danger classes Forecasting Methodology based on grid search and extreme learning machine
CN103226728B (en) High density polyethylene polymerization cascade course of reaction Intelligent Measurement and yield optimization method
CN103353883B (en) Big data stream type cluster processing system and method for on-demand clustering
CN103955558A (en) Method for collecting and processing engineering investigation data of different industries
CN105488628A (en) Electric power big data visualization oriented data mining method
CN113130014B (en) Rare earth extraction simulation method and system based on multi-branch neural network
CN109386272A (en) Ultra deep reef flat facies gas reservoir rational spacing between wells Multipurpose Optimal Method
CN114117881A (en) Sand production risk prediction method and system
CN102243628A (en) Mineralizing case reasoning model and method
CN103364831A (en) Physical property parameter quantification method based on neural network algorithm
CN106709822A (en) Industry power consumption data correlation mining method and device
Tan et al. Fracturing productivity prediction model and optimization of the operation parameters of shale gas well based on machine learning
CN109063269A (en) Hydraulic support group's top plate supporting control method based on graph model, storage medium
CN103593534A (en) Shield tunneling machine intelligent model selection method and device based on engineering geology factor relevance
Chen et al. Investigation of Coal Preparation for Life Cycle by Using Building Information Modeling (BIM): A Case Study
CN110188439A (en) The subway work ground settlement method for early warning of case-based reasioning and system dynamics
Zhou et al. Application of BP neural network in efficiency prediction of oilfield mechanized mining system
Huan et al. Underflow concentration prediction model of deep-cone thickener based on data-driven
CN113537706A (en) Oil field production increasing measure optimization method based on intelligent integration
CN108009668B (en) Large-scale load adjustment prediction method applying machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190521
