CN116720630A - Coal mine raw coal yield prediction method, equipment and medium based on time sequence - Google Patents

Coal mine raw coal yield prediction method, equipment and medium based on time sequence

Info

Publication number
CN116720630A
Authority
CN
China
Prior art keywords
yield
period
month
predicted
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310987159.6A
Other languages
Chinese (zh)
Other versions
CN116720630B (en)
Inventor
田铭
肖雪
侯明亚
董世德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Yunzhou Industrial Internet Co Ltd
Original Assignee
Inspur Yunzhou Industrial Internet Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Yunzhou Industrial Internet Co Ltd
Priority to CN202310987159.6A
Publication of CN116720630A
Application granted
Publication of CN116720630B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/27Regression, e.g. linear or logistic regression
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/10Machine learning using kernel methods, e.g. support vector machines [SVM]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02Agriculture; Fishing; Forestry; Mining
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Health & Medical Sciences (AREA)
  • Mining & Mineral Resources (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Animal Husbandry (AREA)
  • Agronomy & Crop Science (AREA)
  • Quality & Reliability (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a time-series-based method, equipment and medium for predicting the raw coal yield of a coal mine, and belongs to the technical field of data processing. In the method, a plurality of historical yield data samples are obtained, and a yield prediction model to be trained is trained based on a preset number of parameters to be combined and each historical yield data sample. The first period yield of the month to be predicted is determined by the trained yield prediction model. The second period yield of the month to be predicted is determined based on the time series yield data corresponding to the historical yield data samples. The undetermined period yield is determined from the first period yield, the second period yield and the corresponding yield weight binary group. The undetermined period yield is then updated based on the planned production information corresponding to the month to be predicted, so as to determine the predicted raw coal yield of the coal mine. The method addresses the problem that existing coal mine raw coal yield prediction is easily influenced by human subjective factors, which makes the prediction results inaccurate and unreliable.

Description

Coal mine raw coal yield prediction method, equipment and medium based on time sequence
Technical Field
The application relates to the technical field of data processing, in particular to a method, equipment and medium for predicting the yield of raw coal in a coal mine based on a time sequence.
Background
Coal is one of the main energy sources, and its production is important for national energy security and economic development. Accurate prediction of the raw coal yield of a coal mine plays an important role in coal mine production management, in balancing supply and demand on the coal market, and in related areas. Traditional prediction methods rely mainly on experience and subjective judgment; they suffer from low prediction accuracy and poor timeliness, and cannot meet practical needs.
For example, when coal yield is predicted by using the current-stage empirical method and expert system method, it is difficult to avoid adding subjective factors to the predicted result, and the predicted result obtained by the method has uncertainty and poor reliability. In addition, in the process of obtaining the prediction result, the influence factors influencing the yield of raw coal may be ignored, so that the prediction result deviates from reality.
Disclosure of Invention
The embodiments of the application provide a time-series-based method, equipment and medium for predicting the raw coal yield of a coal mine. They address the problem in the prior art that, when the raw coal yield of a coal mine is predicted, the prediction is easily influenced by human subjective factors or the factors influencing the raw coal yield are ignored, so that the yield prediction result is inaccurate and unreliable. The raw coal yield of the coal mine can thereby be predicted more systematically, scientifically and reasonably.
In one aspect, the embodiment of the application provides a method for predicting the raw coal yield of a coal mine based on a time sequence, which comprises the following steps:
acquiring a plurality of historical yield data samples;
training a yield prediction model to be trained based on a preset number of parameters to be combined and each historical yield data sample; the yield prediction model to be trained is a seasonal autoregressive integrated moving average SARIMAX model;
determining the first period yield of the month to be predicted by training a finished yield prediction model;
determining a second period yield of the month to be predicted based on the time series yield data corresponding to the historical yield data samples;
determining the yield of the undetermined period according to the yield of the first period, the yield of the second period and the corresponding yield weight binary group;
and updating the yield in the undetermined period based on the yield in the undetermined period and the planned production information corresponding to the month to be predicted so as to determine the predicted yield of the raw coal of the coal mine.
In one implementation of the present application, before training the yield prediction model to be trained based on a preset number of parameters to be combined and each of the historical yield data samples, the method further includes:
converting, in time order, the data structure of each historical yield data sample into a two-dimensional DataFrame table using the data analysis tool Pandas, so as to obtain the time series yield data.
In one implementation manner of the present application, training a yield prediction model to be trained based on a preset number of parameters to be combined and each of the historical yield data samples specifically includes:
determining a parameter grid list corresponding to each parameter to be combined through a grid search algorithm and preset parameter values; wherein each parameter to be combined at least comprises one or more of the following: the order of the autoregressive term, the differential order, the order of the moving average term, the order of the seasonal autoregressive term, the seasonal differential order, the order of the seasonal moving average term, the seasonal period length;
and determining, according to the parameter grid list and the Akaike information criterion (AIC), the undetermined parameter group with the minimum AIC index value among the undetermined parameter groups corresponding to the parameter grid list as the combined parameter group of the yield prediction model to be trained.
In one implementation manner of the present application, determining, according to the parameter grid list and the Akaike information criterion (AIC), the undetermined parameter group with the minimum AIC index value among the undetermined parameter groups corresponding to the parameter grid list as the combined parameter group of the yield prediction model to be trained specifically includes:
determining the parameter quantity of each undetermined parameter group in the parameter grid list;
determining the training sample size corresponding to each undetermined parameter group, and obtaining the undetermined predicted yield from that training sample size; the training sample size is the number of samples drawn from the plurality of historical yield data samples, and the historical yield data samples so drawn are input into the yield prediction model to be trained, configured with the corresponding undetermined parameter group, to obtain the undetermined predicted yield;
determining a corresponding residual square sum according to the undetermined predicted yield and the actual yield of the training sample size corresponding to the undetermined predicted yield;
determining a corresponding log-likelihood function value according to the residual square sum and the training sample size and the log-likelihood function;
and determining the AIC index value according to the log likelihood function value and the corresponding parameter quantity, so as to take the parameter set to be determined with the minimum AIC index value in the parameter sets to be determined as a combined parameter set of the yield prediction model to be trained.
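The chain from residual sum of squares to log-likelihood to AIC can be sketched as below. This is a minimal illustration, not the formula of the application: it assumes Gaussian residuals with variance estimated as RSS / n, which is the common textbook choice, and the function names, the two candidate groups and their yields are hypothetical.

```python
import math

def residual_sum_of_squares(predicted, actual):
    """Sum of squared differences between predicted and actual yields."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))

def gaussian_log_likelihood(rss, n):
    """Log-likelihood of n Gaussian residuals (assumption: variance = rss / n, rss > 0)."""
    return -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln(L), with k the parameter count."""
    return 2 * k - 2 * log_likelihood

# Hypothetical actual yields and predictions from two undetermined parameter groups.
actual = [100.0, 110.0, 120.0, 130.0]
candidates = {
    "group_a": {"k": 3, "predicted": [101.0, 109.0, 121.0, 129.0]},
    "group_b": {"k": 5, "predicted": [104.0, 106.0, 124.0, 126.0]},
}

scores = {}
for name, cand in candidates.items():
    rss = residual_sum_of_squares(cand["predicted"], actual)
    scores[name] = aic(gaussian_log_likelihood(rss, len(actual)), cand["k"])

# The group with the smallest AIC index value becomes the combined parameter group.
best = min(scores, key=scores.get)
```

A smaller residual sum of squares raises the log-likelihood, while each additional parameter adds a penalty of 2, so the criterion trades fit against model size.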
In one implementation of the present application, determining, based on time-series yield data corresponding to the historical yield data samples, a second period yield of the month to be predicted corresponding to the time-series yield data specifically includes:
determining the annual output of a preset historical year and the corresponding monthly output according to the time sequence output data;
determining a month yield ratio corresponding to each historical month in the same historical year according to the month yield and the corresponding annual yield;
and calculating the second period yield of the month to be predicted based on the month yield ratio and the annual yield.
In one implementation of the present application, calculating the second period yield of the month to be predicted based on the month yield ratio and the year yield specifically includes:
taking the average of the month yield ratios of the same month across the historical years as the month yield prediction weight of the corresponding month;
and determining the second period yield of the month to be predicted as the product of the average of the annual yields and the month yield prediction weight.
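The month-ratio calculation described above can be sketched as follows. The function name, the dictionary layout and the sample figures are hypothetical and serve only to illustrate the arithmetic.

```python
def second_period_yield(monthly_by_year, month):
    """Estimate the yield of `month` (1-12) from historical month yield ratios.

    monthly_by_year maps a historical year to a list of its 12 monthly yields.
    """
    ratios = []
    annual_yields = []
    for yields in monthly_by_year.values():
        annual = sum(yields)                      # annual yield of this year
        annual_yields.append(annual)
        ratios.append(yields[month - 1] / annual) # month yield ratio in this year
    weight = sum(ratios) / len(ratios)            # month yield prediction weight
    mean_annual = sum(annual_yields) / len(annual_yields)
    return weight * mean_annual

# Hypothetical history: two years of monthly yields in arbitrary units.
history = {
    2021: [10.0] * 12,            # annual yield 120
    2022: [20.0] + [10.0] * 11,   # annual yield 130
}
january_estimate = second_period_yield(history, 1)
```

Averaging the ratio rather than the raw monthly yield keeps the seasonal shape separate from the overall production level of each year.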
In one implementation of the present application, before determining the pending period yield according to the first period yield, the second period yield, and the corresponding yield weight binary group, the method further includes:
inputting a specified sample of the historical yield data samples into the yield prediction model to obtain a first period yield of a simulated predicted month;
determining a second period yield of the simulated predicted month corresponding to the time series yield data according to the time series yield data corresponding to the specified sample;
and correcting the yield weights respectively corresponding to the first period yield and the second period yield according to a preset step length based on a grid search algorithm and the historical yield of the simulated predicted month corresponding to the specified sample until the difference between the yield of the corresponding simulated predicted period and the historical yield is smaller than a preset threshold value, so as to obtain the yield weight binary group.
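The weight correction above can be read as a one-dimensional scan over candidate weight pairs. The sketch below assumes that the two weights sum to 1, which the text does not state, and it falls back to the closest pair if no candidate meets the threshold; the function names and numbers are hypothetical.

```python
def combine(first, second, weights):
    """Undetermined period yield as the weighted sum of the two period yields."""
    w1, w2 = weights
    return w1 * first + w2 * second

def search_weight_pair(first, second, actual, step=0.05, threshold=0.5):
    """Scan (w1, w2) with w1 + w2 = 1 (assumption) in increments of `step`.

    Returns the first pair whose combined yield is within `threshold` of the
    historical yield, or the closest pair found if none meets the threshold.
    """
    best_pair, best_error = None, float("inf")
    for i in range(round(1 / step) + 1):
        pair = (i * step, 1.0 - i * step)
        error = abs(combine(first, second, pair) - actual)
        if error < best_error:
            best_pair, best_error = pair, error
        if error < threshold:
            break
    return best_pair, best_error

# Hypothetical simulated month: model yield 100, ratio-based yield 120, actual 106.
pair, error = search_weight_pair(first=100.0, second=120.0, actual=106.0)
pending = combine(100.0, 120.0, pair)
```

A smaller step gives a finer grid at the cost of more evaluations; with only two weights the scan stays cheap either way.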
In one implementation of the present application, updating the yield of the pending period based on the yield of the pending period and the planned production information corresponding to the month to be predicted to determine the predicted yield of the raw coal of the coal mine specifically includes:
crawling the planned production information of the raw coal of the coal mine through the Internet; the planned production information at least comprises production year, production influence days and total production influence;
determining the affected yield of the corresponding month to be predicted according to the planned production information;
and updating the yield of the undetermined period according to the yield of the undetermined period and the affected yield to determine the predicted yield of the coal mine raw coal.
On the other hand, the embodiment of the application also provides a time sequence-based coal mine raw coal yield prediction device, which comprises:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to:
acquiring a plurality of historical yield data samples;
training a yield prediction model to be trained based on a preset number of parameters to be combined and each historical yield data sample; the yield prediction model to be trained is a seasonal autoregressive integrated moving average SARIMAX model;
determining the first period yield of the month to be predicted by training a finished yield prediction model;
determining a second period yield of the month to be predicted based on the time series yield data corresponding to the historical yield data samples;
determining the yield of the undetermined period according to the yield of the first period, the yield of the second period and the corresponding yield weight binary group;
and updating the yield in the undetermined period based on the yield in the undetermined period and the planned production information corresponding to the month to be predicted so as to determine the predicted yield of the raw coal of the coal mine.
In yet another aspect, an embodiment of the present application further provides a time-series based non-volatile computer storage medium for predicting raw coal yield in a coal mine, storing computer-executable instructions configured to:
acquiring a plurality of historical yield data samples;
training a yield prediction model to be trained based on a preset number of parameters to be combined and each historical yield data sample; the yield prediction model to be trained is a seasonal autoregressive integrated moving average SARIMAX model;
determining the first period yield of the month to be predicted by training a finished yield prediction model;
determining a second period yield of the month to be predicted based on the time series yield data corresponding to the historical yield data samples;
determining the yield of the undetermined period according to the yield of the first period, the yield of the second period and the corresponding yield weight binary group;
and updating the yield in the undetermined period based on the yield in the undetermined period and the planned production information corresponding to the month to be predicted so as to determine the predicted yield of the raw coal of the coal mine.
The above technical scheme addresses the problem that, when the raw coal yield of a coal mine is predicted, the prediction is easily influenced by human subjective factors or the factors influencing the raw coal yield are ignored, so that the yield prediction result is inaccurate and unreliable. The method uses the SARIMAX model to take the influence of seasonal factors on the raw coal yield into account, which improves prediction accuracy; it combines two prediction modes, which improves prediction efficiency; and it avoids introducing personal subjective factors. The application can predict the raw coal yield of the coal mine more systematically, scientifically and reasonably, is suitable for every stage of coal mine production management, and is convenient for users.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic flow chart of a method for predicting the raw coal yield of a coal mine based on a time sequence in an embodiment of the application;
FIG. 2 is a schematic structural diagram of a coal mine raw coal yield prediction device based on a time sequence in an embodiment of the application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The prediction of the coal yield at the present stage mainly comprises an empirical method and an expert system method. The empirical method is a method for predicting based on the historical yield and empirical knowledge of the mine, and the empirical method has the main disadvantage that the prediction result is easily influenced by subjective factors. In addition, the applicability of the experience is gradually reduced with the change of factors such as coal market demand, policy environment and the like. The expert system method is a prediction method based on expert knowledge and experience, has high requirements on the expert, and needs the expert to have deeper knowledge on the development trend of coal markets and industries. Meanwhile, the method is also influenced by subjective factors of experts, and a certain uncertainty exists in a prediction result. In addition, neither of the above-mentioned methods has a mature system or method, and cannot be widely used.
Based on the above, the embodiments of the application provide a time-series-based method, equipment and medium for predicting the raw coal yield of a coal mine. They address the problem that current prediction approaches are easily influenced by human subjective factors or ignore the factors influencing the raw coal yield, so that the yield prediction result is inaccurate and unreliable. The raw coal yield of the coal mine can thereby be predicted more systematically, scientifically and reasonably.
Various embodiments of the present application are described in detail below with reference to the attached drawing figures.
The embodiment of the application provides a coal mine raw coal yield prediction method based on a time sequence, which can comprise the following steps of S101-S106 as shown in figure 1:
S101, the server acquires a plurality of historical yield data samples.
The server is an execution subject of the method for predicting the raw coal yield of the coal mine based on the time series, and is merely an example, and the execution subject is not limited to the server, and the present application is not particularly limited thereto.
The server may obtain the plurality of historical yield data samples as follows: a crawler, written by a developer using a Python development framework, accesses the coal mine yield registration system to obtain the raw coal yield of the coal mine over a past time period, from which the historical yield data samples are generated.
For example, the historical yield data samples may cover the past five years and include the historical yield of each month and the annual yield of each year in that period.
S102, the server trains a yield prediction model to be trained based on a preset number of parameters to be combined and each historical yield data sample.
The yield prediction model to be trained is a seasonal autoregressive integrated moving average SARIMAX model.
In the embodiment of the application, based on the preset number of parameters to be combined and each historical yield data sample, before training the yield prediction model to be trained, the method further comprises the following steps:
converting, in time order, the data structure of each historical yield data sample into a two-dimensional DataFrame table using the data analysis tool Pandas, so as to obtain the time series yield data.
That is, the server may utilize the Pandas data analysis tool to process the raw historical production data samples into time series based data and convert the data into data of a DataFrame data structure, with the corresponding time series data in a time format.
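The conversion might look like the sketch below. The record layout and the column names `month` and `yield_t` are hypothetical; only the use of Pandas, a DataFrame and a time-formatted index comes from the description.

```python
import pandas as pd

# Hypothetical raw samples as they might arrive from the registration system,
# deliberately out of order.
records = [
    {"month": "2023-03", "yield_t": 118.0},
    {"month": "2023-01", "yield_t": 120.5},
    {"month": "2023-02", "yield_t": 101.2},
]

df = pd.DataFrame(records)
# Parse the month strings into timestamps so the index is in a time format.
df["month"] = pd.to_datetime(df["month"], format="%Y-%m")
# Index by time and sort chronologically to obtain the time series yield data.
ts = df.set_index("month").sort_index()
# Align to a regular month-start frequency; absent months would appear as NaN.
ts = ts.asfreq("MS")
```

Aligning to a regular monthly frequency makes gaps explicit, which matters because seasonal models expect evenly spaced observations.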
In addition, in the above embodiment, when the historical yield data of a certain year or month is missing, the server may use mean interpolation: the average of the historical yield data of the preceding and following year, or of the preceding and following month, is taken as the historical yield data of the missing year or month. Missing values or outliers are handled in this way.
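A minimal sketch of this mean interpolation follows. The function name is hypothetical, and two behaviours are assumptions that go beyond the text: the nearest known value on each side is used when the immediate neighbour is also missing, and a gap at either end is filled from the single available side.

```python
def fill_missing_with_neighbor_mean(values):
    """Replace each None with the mean of the nearest known values on either side."""
    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        # Nearest known value before and after position i, looking only at originals.
        prev_known = next(
            (values[j] for j in range(i - 1, -1, -1) if values[j] is not None), None
        )
        next_known = next(
            (values[j] for j in range(i + 1, len(values)) if values[j] is not None), None
        )
        neighbors = [x for x in (prev_known, next_known) if x is not None]
        if neighbors:
            filled[i] = sum(neighbors) / len(neighbors)
    return filled

# Hypothetical monthly yields with two missing months.
monthly = [100.0, None, 110.0, 108.0, None]
filled = fill_missing_with_neighbor_mean(monthly)
```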
The preset number of parameters to be combined is set by a user, and the parameters to be combined include, but are not limited to, the order of the autoregressive terms, the differential order, the order of the moving average term, the order of the seasonal autoregressive terms, the seasonal differential order, the order of the seasonal moving average term and the seasonal period length of the yield prediction model to be trained.
The yield prediction model to be trained is a seasonal autoregressive integrated moving average (SARIMAX) model. It should be noted that the SARIMAX model is obtained by adding a seasonal component (S, seasonal) and exogenous factors (X, eXogenous) to the autoregressive integrated moving average (ARIMA) model. That is, it extends ARIMA with periodicity and seasonality, and is suitable for time series with significant periodic and seasonal characteristics. The full name of the ARIMA model is Autoregressive Integrated Moving Average model, also written ARIMA(p, d, q); it is one of the most common statistical models for time series prediction. In ARIMA(p, d, q), AR stands for "autoregressive" and p is the order of the autoregressive term; MA stands for "moving average" and q is the order of the moving average term; d is the differencing order applied to make the series stationary. The SARIMAX model in the application further includes: the order P of the seasonal autoregressive (SAR) term, the seasonal differencing order D, the order Q of the seasonal moving average (SMA) term, and the seasonal period length s.
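For reference, a common textbook way of writing such a model in backshift notation is given below. The symbols B (backshift operator), the polynomials φ, θ, Φ and Θ, the exogenous regressors x_t with coefficients β, and the noise term ε_t are notation introduced here for illustration and do not appear in the application.

```latex
y_t = \beta^{\top} x_t + u_t, \qquad
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,u_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t
```

Here φ_p and θ_q have degrees p and q, Φ_P and Θ_Q have degrees P and Q in B^s, and the two differencing factors correspond to the orders d and D.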
The SARIMAX model has the following advantages:
1. Seasonal factors can be considered: coal yield is strongly affected by seasonal factors, for example higher yield in winter and lower yield in summer. The SARIMAX model can capture such seasonal factors well, thereby improving prediction accuracy.
2. Trend factors can be considered: coal yield is also affected by trend factors such as economic situation, policy adjustments, etc. The SARIMAX model can model trend factors to better predict future coal yields.
3. The model has strong interpretability: parameters of the SARIMAX model and the fitting process can be explained, so that the influence of each factor on the prediction result can be conveniently known. This can help the decision maker to better understand the cause and trend of the yield variation for coal yield prediction.
4. The prediction precision is high: the SARIMAX model can capture trends, seasonal and randomness in time series data, enabling more accurate predictions of future data. For coal yield prediction, this can provide a more accurate prediction result for the decision maker, thereby making decisions better.
In the embodiment of the application, based on a preset number of parameters to be combined and each historical yield data sample, a yield prediction model to be trained is trained, which specifically comprises:
The server determines a parameter grid list corresponding to each parameter to be combined through a grid search algorithm and preset parameter values. Each parameter to be combined comprises at least one or more of the following: the order of the autoregressive term, the differencing order, the order of the moving average term, the order of the seasonal autoregressive term, the seasonal differencing order, the order of the seasonal moving average term, and the seasonal period length. According to the parameter grid list and the Akaike information criterion (AIC), the undetermined parameter group with the minimum AIC index value among the undetermined parameter groups corresponding to the parameter grid list is determined as the combined parameter group of the yield prediction model to be trained.
The grid search algorithm is an exhaustive search over specified parameter values: the parameters of an estimator are optimized to obtain the best-performing configuration. That is, the possible values of the respective parameters are permuted and combined, and all possible combinations are listed to form a "grid". In the general form of the algorithm, each combination is then used for training and its performance is evaluated using cross-validation; in the application, the model trained is the SARIMAX model and the combinations are compared by their AIC index values.
For example, the candidate values of each parameter to be combined of the SARIMAX model of the application are (0, 1, 2). The server combines the parameters to be combined and enumerates the grid of possible value combinations through the grid search algorithm to obtain the parameter grid list. For instance, the undetermined parameter group formed by one row of the parameter grid list may be {p=1, q=1, d=1, P=1, D=0, Q=0}, and that formed by another row {p=2, q=1, d=1, P=1, D=0, Q=0}. The server then calculates the AIC index value of every undetermined parameter group in the parameter grid list, selects the group with the minimum AIC index value as the combined parameter group, and sets it as the model parameters of the yield prediction model to be trained to obtain the yield prediction model.
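The enumeration and selection described above can be sketched with the standard library alone. The fixed seasonal period of 12 and the scoring function `toy_aic` are assumptions for illustration; `toy_aic` is a stand-in for fitting the model with each group and reading off its AIC index value.

```python
from itertools import product

candidate_values = (0, 1, 2)   # candidate values from the example in the text
seasonal_period = 12           # assumption: monthly data with a yearly cycle

names = ("p", "d", "q", "P", "D", "Q")
# Parameter grid list: one dict per undetermined parameter group.
parameter_grid = [
    dict(zip(names, combo), s=seasonal_period)
    for combo in product(candidate_values, repeat=len(names))
]

def select_by_aic(grid, aic_of):
    """Return the undetermined parameter group with the minimum AIC index value."""
    return min(grid, key=aic_of)

def toy_aic(params):
    """Hypothetical score; a real run would fit the model and return its AIC."""
    return (
        abs(params["p"] - 1) + abs(params["d"] - 1) + abs(params["q"] - 1)
        + params["P"] + params["D"] + params["Q"]
    )

best = select_by_aic(parameter_grid, toy_aic)
```

In practice `toy_aic` would be replaced by a function that fits the model for one group; in statsmodels, for instance, the result of fitting `SARIMAX` exposes an `aic` attribute that could serve this purpose.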
According to the parameter grid list and the Akaike information criterion (AIC), the server determines the undetermined parameter group with the minimum AIC index value among all undetermined parameter groups corresponding to the parameter grid list as the combined parameter group of the yield prediction model to be trained, which specifically comprises the following steps:
first, the server determines the number of parameters for each set of parameters to be determined in the parameter grid list.
For example, the set of undetermined parameters includes 5 variables, and the formula is combined by calculating the variables=32, the number of parameters is 32.
Then, the server determines the training sample size corresponding to each pending parameter set, and the pending predicted yield obtained with that training sample size.
The training sample size is the number of samples drawn from the historical yield data samples. Those samples are input into the yield prediction model to be trained, configured with the corresponding pending parameter set, to obtain the pending predicted yield.
The training sample size may be specified by the user, for example 5 or 10; the application does not limit its specific value. The user may specify a training sample size for each pending parameter set separately, or specify one common training sample size for all pending parameter sets; the application does not limit this either. After the server sets a pending parameter set as the model parameters of the yield prediction model to be trained, the model predicts on the historical yield data samples of the corresponding training sample size to obtain the pending predicted yield.
The server then determines the corresponding residual sum of squares from the pending predicted yield and the actual yield of the corresponding training samples.
That is, the server calculates the residual sum of squares by comparing the pending predicted yield with the actual yield recorded in the historical yield data samples. The calculation formula is as follows:
$$RSS = \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$
where $RSS$ is the residual sum of squares, $y_i$ is the actual yield of the $i$-th historical yield data sample, $\hat{y}_i$ is the pending predicted yield of the $i$-th historical yield data sample, and $n$ is the training sample size. The smaller the value of $RSS$, the better the model fits the data.
Next, the server determines the corresponding log-likelihood function value from the residual sum of squares, the training sample size, and the log-likelihood function.
The log-likelihood function is calculated as follows:
$$\ln L = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\left(\frac{RSS}{n}\right) - \frac{n}{2}$$
where $\ln L$ is the log-likelihood function value, $n$ is the training sample size, $RSS$ is the residual sum of squares, and $\pi$ is the circle constant, which may be taken as 3.14 in the calculations of the embodiment of the present application.
The server then determines the AIC index value from the log-likelihood function value and the corresponding number of parameters, and takes the pending parameter set with the smallest AIC index value among the pending parameter sets as the combined parameter set of the yield prediction model to be trained.
The AIC index value is calculated as follows:
$$AIC = 2k - 2\ln L$$
where $AIC$ is the AIC index value, $k$ is the number of parameters of the pending parameter set, and $\ln L$ is the log-likelihood function value. The server calculates the AIC index value of each pending parameter set and uses the pending parameter set with the smallest AIC index value as the model parameters of the trained yield prediction model.
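The chain from residuals to AIC can be illustrated with a short Python sketch. It assumes the standard Gaussian log-likelihood written in terms of the residual sum of squares and the usual definition AIC = 2k - 2 ln L; the function names and the sample numbers are hypothetical and serve only to show that, for the same number of parameters, the better-fitting predictions receive the lower AIC.

```python
import math


def residual_sum_of_squares(actual, predicted):
    """Sum of squared differences between actual and predicted yields."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def gaussian_log_likelihood(rss, n):
    """Gaussian log-likelihood from the residual sum of squares.

    Assumed form: -n/2 * ln(2*pi) - n/2 * ln(rss/n) - n/2, which requires rss > 0.
    """
    if rss <= 0 or n <= 0:
        raise ValueError("rss and n must be positive")
    return -n / 2 * math.log(2 * math.pi) - n / 2 * math.log(rss / n) - n / 2


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


# Hypothetical yields, for illustration only.
actual = [100.0, 110.0, 120.0, 130.0]
close_fit = [101.0, 109.0, 121.0, 129.0]
loose_fit = [105.0, 105.0, 125.0, 125.0]

n = len(actual)
k = 3  # hypothetical number of parameters, identical for both candidates
aic_close = aic(gaussian_log_likelihood(residual_sum_of_squares(actual, close_fit), n), k)
aic_loose = aic(gaussian_log_likelihood(residual_sum_of_squares(actual, loose_fit), n), k)
print(aic_close < aic_loose)  # True: the closer fit has the smaller AIC
```

Because the penalty term 2k grows with the number of parameters, a more complex parameter set is selected only if it reduces the residual sum of squares enough to offset that penalty.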
S103, the server determines the first period yield of the month to be predicted by means of the trained yield prediction model.
After training of the yield prediction model is complete, the server uses it to predict the future raw coal yield of the coal mine. For example, when a user requests, through a user interface, a raw coal yield prediction for the next 6 months, the server calculates the first period yield for the next 6 months with the trained yield prediction model.
S104, the server determines the second period yield of the month to be predicted based on the time-series yield data corresponding to the historical yield data samples.
In an embodiment of the present application, determining the second period yield of the month to be predicted based on the time-series yield data corresponding to the historical yield data samples specifically includes:
The server determines the annual yield of each preset historical year and the corresponding monthly yields from the time-series yield data. It then determines the monthly yield ratio of each historical month within the same historical year from the monthly yield and the corresponding annual yield. Finally, it calculates the second period yield of the month to be predicted from the monthly yield ratios and the annual yields.
That is, after obtaining the time-series yield data corresponding to the historical yield data samples, the server selects, by year, the annual yield and corresponding monthly yields of the preset historical years specified by the user. The number of preset historical years can be set in actual use and is not limited by the application. The server obtains each monthly yield ratio as the ratio of the monthly yield to the annual yield of the same year, and then calculates the second period yield from the monthly yield ratios.
Specifically, the server may take the average of the monthly yield ratios of the same month across the historical years as the monthly yield prediction weight of that month. It then determines the second period yield of the month to be predicted as the product of the mean annual yield and the monthly yield prediction weight.
For example, let the set of annual yields of the historical years be $\{Y_1, Y_2, \dots, Y_n\}$, where $n$ is a natural number denoting the number of historical years and $Y_i$ is the annual yield of the $i$-th historical year in the time-series yield data. Each $Y_i$ comprises the monthly yields of 12 months, $y_{i,j}$, where $j$ denotes the month and is a natural number greater than 0 and less than 13. The monthly yield ratio is calculated by the following formula:
$$r_{i,j} = \frac{y_{i,j}}{Y_i}$$
where $y_{i,j}$ is the monthly yield of the $j$-th month of the $i$-th year. The server may then calculate the average share of the same month in the annual yield across the historical years, i.e. the monthly yield prediction weight, as follows:
$$w_j = \frac{1}{n}\sum_{i=1}^{n} r_{i,j}$$
where $w_j$ is the monthly yield prediction weight.
In addition, the mean annual yield is calculated as:
$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$$
For example, if $n$ is 5, the above formula gives the mean annual yield $\bar{Y}$ of 5 historical years.
The server may then calculate the second period yield $F^{(2)}_j = w_j \times \bar{Y}$, which is the raw coal yield of the $j$-th month.
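The ratio-based calculation above can be sketched in a few lines of Python. The function name, the dictionary layout (year mapped to a list of 12 monthly yields), and the sample figures are assumptions made for illustration; the logic follows the description: monthly share per year, average share per month, then multiplication by the mean annual yield.

```python
def second_period_yield(monthly_by_year, month):
    """Estimate one month's yield from historical monthly shares.

    monthly_by_year maps a year to a list of 12 monthly yields (hypothetical
    layout); month is 1..12.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if not monthly_by_year:
        raise ValueError("at least one historical year is required")

    # Annual yield of each historical year.
    annual = {year: sum(months) for year, months in monthly_by_year.items()}
    # Share of the requested month in each year's annual yield.
    ratios = [months[month - 1] / annual[year]
              for year, months in monthly_by_year.items()]
    # Monthly yield prediction weight: average share across years.
    weight = sum(ratios) / len(ratios)
    # Mean annual yield across the historical years.
    mean_annual = sum(annual.values()) / len(annual)
    return weight * mean_annual


# Hypothetical history: two years of monthly raw coal yields.
history = {
    2021: [10.0] * 12,
    2022: [20.0] * 11 + [40.0],
}
print(second_period_yield(history, 12))
```

With this made-up history, December accounts for 1/12 of the 2021 total and 40/260 of the 2022 total, so its weight is the mean of those two shares applied to the mean annual yield of 190.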
S105, the server determines the pending period yield according to the first period yield, the second period yield and the corresponding yield weight doublet.
In an embodiment of the present application, before the server determines the pending period yield according to the first period yield, the second period yield and the corresponding yield weight doublet, the method further includes:
The server inputs a specified sample of the historical yield data samples into the yield prediction model to obtain the first period yield of a simulated predicted month. It determines the second period yield of the simulated predicted month based on the time-series yield data corresponding to the specified sample. Then, based on a grid search algorithm and the historical yield of the simulated predicted month corresponding to the specified sample, it adjusts the yield weights corresponding to the first period yield and the second period yield by a preset step length until the difference between the resulting simulated pending period yield and the historical yield is smaller than a preset threshold, thereby obtaining the yield weight doublet.
In other words, both the first period yield obtained from the yield prediction model and the second period yield obtained from historical-year data are useful references for future yield prediction. The server therefore assigns a weight to each of the two algorithms so that the final predicted coal mine raw coal yield is more reliable.
To calculate the yield weight doublet, the server first runs the trained yield prediction model on a sample specified by the user among the historical yield data samples to obtain the first period yield of the simulated predicted month. The specified sample may be chosen at random by the user, and the simulated predicted month may be set by the user or generated at random by the server; the application does not limit either.
The server likewise obtains the second period yield of the simulated predicted month through the embodiment described above. Next, the server uses the grid search algorithm to generate two yield weight values, one for the first period yield and one for the second period yield, for example $\alpha$ and $\beta$. The grid search algorithm lets $\alpha$ take values successively from a preset interval according to a preset step length. The preset interval is (0, 1) and the preset step length is, for example, 0.01 or 0.1; it is set by the user and is not limited by the application. Correspondingly, $\beta$ is calculated by the formula $\beta = 1 - \alpha$.
From the two yield weight values obtained, the yield weight doublet $[\alpha, \beta]$ is generated.
The server may calculate the pending period yield by the following formula:
$$F_j = \alpha F^{(1)}_j + \beta F^{(2)}_j$$
where $F^{(1)}_j$ and $F^{(2)}_j$ are the first period yield and the second period yield of month $j$, and $F_j$ is the pending period yield.
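The weight search can be sketched as a one-dimensional scan in Python. This is an illustrative reading of the description, not the patented code: the function names, the fallback to the best weight found when no candidate meets the threshold, and the sample yields are assumptions.

```python
def blend(first, second, alpha, beta):
    """Pending period yield as the weighted sum of the two predictions."""
    return alpha * first + beta * second


def search_yield_weights(first, second, actual, step=0.01, threshold=None):
    """Scan alpha over the open interval (0, 1) with the given step.

    beta is always 1 - alpha. The scan stops at the first alpha whose blended
    yield differs from the historical yield by less than threshold; if no
    threshold is given (or none is met), the alpha with the smallest
    difference is returned. That fallback is an assumption of this sketch.
    """
    if not 0 < step < 1:
        raise ValueError("step must lie in (0, 1)")
    n_steps = round(1 / step)
    best = None
    for i in range(1, n_steps):
        alpha = i * step
        beta = 1 - alpha
        error = abs(blend(first, second, alpha, beta) - actual)
        if best is None or error < best[2]:
            best = (alpha, beta, error)
        if threshold is not None and error < threshold:
            break
    return best[0], best[1]


# Hypothetical simulated month: model says 100, ratio method says 120,
# and the recorded historical yield was 106.
alpha, beta = search_yield_weights(100.0, 120.0, 106.0, step=0.1)
print(round(alpha, 2), round(beta, 2))  # 0.7 0.3
```

A smaller step gives a finer doublet at the cost of more evaluations; since each evaluation is a single weighted sum, even a step of 0.01 is inexpensive.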
S106, the server updates the pending period yield based on the pending period yield and the planned production information corresponding to the month to be predicted, so as to determine the predicted coal mine raw coal yield.
In the embodiment of the application, updating the pending period yield based on the pending period yield and the planned production information corresponding to the month to be predicted, so as to determine the predicted coal mine raw coal yield, specifically includes:
The server crawls the planned production information of the coal mine raw coal from the Internet. The planned production information includes at least the production year, the number of affected production days, and the total affected yield. The server determines the affected yield of the month to be predicted from the planned production information, and then updates the pending period yield according to the pending period yield and the affected yield to determine the predicted coal mine raw coal yield.
That is, when the raw coal yield of the month to be predicted is being predicted, planned production information such as a production stoppage plan or a production resumption plan may apply. From this information the server can crawl the number of days on which the plan affects production in the production year (for example, a 20-day stoppage) and the total yield affected. The planned production information records both the number of affected production days in the production year and the total affected yield.
The server can then calculate the monthly affected yield corresponding to the planned production information. Taking a negative effect (yield reduction) as an example, the calculation is:
$$A_j = \frac{d_j}{D} \times T$$
where $A_j$ is the affected yield of the $j$-th month, $d_j$ is the number of days of the $j$-th month, $T$ is the total affected yield, and $D$ is the number of affected production days in the production year.
For the month to be predicted $j$, the server takes the difference between the pending period yield $F_j$ and the affected yield $A_j$ to obtain the updated pending period yield, i.e. the predicted coal mine raw coal yield $\hat{F}_j = F_j - A_j$.
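A small Python sketch of this adjustment is given below. It rests on two assumptions that the text does not state explicitly: the total affected yield is allocated to a month in proportion to the affected days that fall in that month, and the effect is a reduction. The function names and the figures are hypothetical.

```python
def affected_yield(affected_days_in_month, affected_days_total, affected_total_yield):
    """Share of the total affected yield attributed to one month.

    Assumes a proportional allocation by days: days in the month covered by
    the plan divided by all affected days, times the total affected yield.
    """
    if affected_days_total <= 0:
        raise ValueError("affected_days_total must be positive")
    if not 0 <= affected_days_in_month <= affected_days_total:
        raise ValueError("affected_days_in_month must lie in [0, affected_days_total]")
    return affected_days_in_month / affected_days_total * affected_total_yield


def updated_yield(pending_yield, month_affected_yield):
    """Predicted yield after subtracting a negative production impact."""
    return pending_yield - month_affected_yield


# Hypothetical plan: a 20-day stoppage costing 60,000 tonnes in total,
# with 5 of those days falling in the month to be predicted.
loss = affected_yield(5, 20, 60000.0)
print(updated_yield(100000.0, loss))  # 85000.0
```

For a plan with a positive effect, such as a production resumption, the same allocation could be added rather than subtracted; the text only works through the negative case.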
In addition, after obtaining the predicted coal mine raw coal yield, the server can send it to an enterprise terminal and a user terminal for use by enterprises or users. The enterprise terminal and the user terminal include, but are not limited to, mobile phones, computers and other devices; the application does not limit this.
The above technical scheme addresses the problem that coal mine raw coal yield predictions are easily affected by subjective human factors, or ignore factors that influence raw coal yield, which makes the prediction results inaccurate and unreliable. The method uses the SARIMAX model to account for the influence of seasonal factors on coal mine raw coal yield, which improves prediction accuracy; it combines two prediction modes, which improves prediction efficiency; and it avoids introducing personal subjective factors. The application is suitable for every stage of coal mine production management and is convenient for users.
Fig. 2 is a schematic structural diagram of a time series-based coal mine raw coal yield prediction apparatus according to an embodiment of the present application. As shown in Fig. 2, the apparatus includes:
at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to:
Obtain a number of historical yield data samples. Train a yield prediction model to be trained based on a preset number of parameters to be combined and each historical yield data sample, the yield prediction model to be trained being a seasonal autoregressive integrated moving average with exogenous regressors (SARIMAX) model. Determine the first period yield of the month to be predicted by means of the trained yield prediction model. Determine the second period yield of the month to be predicted based on the time-series yield data corresponding to the historical yield data samples. Determine the pending period yield according to the first period yield, the second period yield and the corresponding yield weight doublet. Update the pending period yield based on the pending period yield and the planned production information corresponding to the month to be predicted, so as to determine the predicted coal mine raw coal yield.
The embodiment of the application also provides a non-volatile computer storage medium for time series-based coal mine raw coal yield prediction, which stores computer-executable instructions configured to:
Obtain a number of historical yield data samples. Train a yield prediction model to be trained based on a preset number of parameters to be combined and each historical yield data sample, the yield prediction model to be trained being a seasonal autoregressive integrated moving average with exogenous regressors (SARIMAX) model. Determine the first period yield of the month to be predicted by means of the trained yield prediction model. Determine the second period yield of the month to be predicted based on the time-series yield data corresponding to the historical yield data samples. Determine the pending period yield according to the first period yield, the second period yield and the corresponding yield weight doublet. Update the pending period yield based on the pending period yield and the planned production information corresponding to the month to be predicted, so as to determine the predicted coal mine raw coal yield.
The embodiments of the present application are described in a progressive manner; for parts that are the same or similar across embodiments, reference may be made between them, and each embodiment focuses on its differences from the others. In particular, because the apparatus and medium embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
The apparatus and medium provided by the embodiments of the application correspond one-to-one to the method, so they have beneficial technical effects similar to those of the method. Since those effects have been described in detail above, they are not repeated here.
It should also be noted that the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (10)

1. A method for predicting raw coal yield of a coal mine based on a time sequence, the method comprising:
acquiring a plurality of historical yield data samples;
training a yield prediction model to be trained based on a preset number of parameters to be combined and each historical yield data sample; the yield prediction model to be trained is a seasonal autoregressive integrated moving average with exogenous regressors (SARIMAX) model;
determining the first period yield of the month to be predicted by means of the trained yield prediction model;
determining a second period of production for the month to be predicted corresponding to the time series production data based on the time series production data corresponding to the historical production data samples;
determining the yield of the undetermined period according to the yield of the first period, the yield of the second period and the corresponding yield weight binary group;
and updating the yield in the undetermined period based on the yield in the undetermined period and the planned production information corresponding to the month to be predicted so as to determine the predicted yield of the raw coal of the coal mine.
2. The method of claim 1, wherein before training the yield prediction model to be trained based on a predetermined number of parameters to be combined and each of the historical yield data samples, the method further comprises:
the data structure of each historical output data sample is converted into a two-dimensional table DataFrame data structure according to the time sequence through a data analysis tool Pandas so as to obtain the time sequence output data.
3. The method for predicting the raw coal yield of the coal mine based on the time sequence according to claim 1, wherein the training of the yield prediction model to be trained is performed based on a preset number of parameters to be combined and each historical yield data sample, and specifically comprises the following steps:
determining a parameter grid list corresponding to each parameter to be combined through a grid search algorithm and preset parameter values; wherein each parameter to be combined at least comprises one or more of the following: the order of the autoregressive term, the differential order, the order of the moving average term, the order of the seasonal autoregressive term, the seasonal differential order, the order of the seasonal moving average term, the seasonal period length;
and determining, according to the parameter grid list and the Akaike information criterion (AIC), the parameter set to be determined with the minimum AIC index value among the parameter sets to be determined corresponding to the parameter grid list as a combined parameter set of the yield prediction model to be trained.
4. The method for predicting the raw coal yield of a coal mine based on a time sequence according to claim 3, wherein determining, according to the parameter grid list and the Akaike information criterion (AIC), the parameter set to be determined with the minimum AIC index value among the parameter sets to be determined corresponding to the parameter grid list as the combined parameter set of the yield prediction model to be trained specifically comprises the following steps:
determining the parameter quantity of each undetermined parameter group in the parameter grid list;
determining the training sample size corresponding to each parameter set to be determined, and the pending predicted yield obtained according to the corresponding training sample size; the training sample size is the number of samples obtained from the plurality of historical yield data samples, and the historical yield data samples corresponding to the training sample size are input into the yield prediction model to be trained, configured with the corresponding parameter set to be determined, to obtain the pending predicted yield;
determining a corresponding residual square sum according to the undetermined predicted yield and the actual yield of the training sample size corresponding to the undetermined predicted yield;
determining a corresponding log-likelihood function value according to the residual square sum and the training sample size and the log-likelihood function;
and determining the AIC index value according to the log likelihood function value and the corresponding parameter quantity, so as to take the parameter set to be determined with the minimum AIC index value in the parameter sets to be determined as a combined parameter set of the yield prediction model to be trained.
5. The method for predicting the raw coal yield of a coal mine based on a time sequence according to claim 1, wherein determining the second period yield of the month to be predicted corresponding to the time sequence yield data based on the time sequence yield data corresponding to the historical yield data sample specifically comprises:
determining the annual output of a preset historical year and the corresponding monthly output according to the time sequence output data;
determining a month yield ratio corresponding to each historical month in the same historical year according to the month yield and the corresponding annual yield;
the second period yield of the month to be predicted is calculated based on the month yield fraction and the annual yield.
6. The method for predicting the raw coal yield in a coal mine based on a time series of claim 5, wherein calculating the second period yield of the month to be predicted based on the month yield ratio and the year yield comprises:
taking an average value of the month yield occupation ratios of the same month in each historical year as a month yield prediction weight of the corresponding month;
and determining the second period yield of the month to be predicted according to the corresponding average value of the annual yields and the product value of the month yield prediction weights.
7. A method of time series based raw coal yield prediction in a coal mine as claimed in claim 1 wherein prior to determining the yield in the pending period from the first period yield, the second period yield and the corresponding yield weight doublet, the method further comprises:
inputting a specified sample of the historical yield data samples into the yield prediction model to obtain a first period yield of a simulated predicted month;
determining a second period yield of the simulated predicted month corresponding to the time series yield data according to the time series yield data corresponding to the specified sample;
and correcting the yield weights respectively corresponding to the first period yield and the second period yield according to a preset step length based on a grid search algorithm and the historical yield of the simulated predicted month corresponding to the specified sample until the difference between the yield of the corresponding simulated predicted period and the historical yield is smaller than a preset threshold value, so as to obtain the yield weight binary group.
8. The method for predicting the yield of raw coal in a coal mine based on a time sequence according to claim 1, wherein the yield in the pending period is updated based on the yield in the pending period and planned production information corresponding to the month to be predicted, so as to determine the predicted yield of raw coal in the coal mine, specifically comprising:
crawling the planned production information of the raw coal of the coal mine through the Internet; the planned production information at least comprises production year, production influence days and total production influence;
determining the affected yield of the corresponding month to be predicted according to the planned production information;
and updating the yield of the undetermined period according to the yield of the undetermined period and the affected yield to determine the predicted yield of the coal mine raw coal.
9. A time series-based raw coal yield prediction apparatus for a coal mine, the apparatus comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a time series based method of predicting raw coal production in a coal mine as claimed in any one of claims 1 to 8.
10. A time series based non-volatile computer storage medium storing computer executable instructions for performing a time series based method for predicting raw coal production in a coal mine as claimed in any one of claims 1 to 8.
CN202310987159.6A 2023-08-08 2023-08-08 Coal mine raw coal yield prediction method, equipment and medium based on time sequence Active CN116720630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310987159.6A CN116720630B (en) 2023-08-08 2023-08-08 Coal mine raw coal yield prediction method, equipment and medium based on time sequence


Publications (2)

Publication Number Publication Date
CN116720630A true CN116720630A (en) 2023-09-08
CN116720630B CN116720630B (en) 2023-12-22

Family

ID=87871906

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310987159.6A Active CN116720630B (en) 2023-08-08 2023-08-08 Coal mine raw coal yield prediction method, equipment and medium based on time sequence

Country Status (1)

Country Link
CN (1) CN116720630B (en)



Also Published As

Publication number Publication date
CN116720630B (en) 2023-12-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant