CN112150205A - Price prediction method and device and electronic equipment - Google Patents

Info

Publication number
CN112150205A
Authority
CN
China
Prior art keywords
price
target
value
data
factor
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011032490.5A
Other languages
Chinese (zh)
Other versions
CN112150205B (en)
Inventor
刘菲
王圣茂
刘峰
孙庆恩
陈华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur General Software Co Ltd
Huaibei Mining Group Co Ltd
Original Assignee
Inspur General Software Co Ltd
Huaibei Mining Group Co Ltd
Application filed by Inspur General Software Co Ltd, Huaibei Mining Group Co Ltd
Priority to CN202011032490.5A
Publication of CN112150205A
Application granted
Publication of CN112150205B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0206 Price or cost determination based on market factors

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a price prediction method and device and electronic equipment. The method comprises the following steps: acquiring historical data of a target commodity; determining, according to the historical data, a target factor value of at least one price influencing factor that influences the price of the target commodity, wherein the target factor value represents the magnitude of the price influencing factor; dividing the historical data into a training set and a test set other than the training set; establishing a price prediction model according to the target factor value and the training set by using a SARIMAX algorithm; determining, by using the price prediction model and according to the test set, a predicted value representing the price of the target commodity in a time period to be tested; obtaining, by using a Prophet algorithm, a residual sequence in the time period to be tested according to the predicted value and the real price of the target commodity in the time period to be tested in the test set; and determining, by using the price prediction model, the predicted price of the target commodity in a target time period according to the residual sequence and the predicted value. The scheme can improve the accuracy of price prediction.

Description

Price prediction method and device and electronic equipment
Technical Field
The invention relates to the technical field of data analysis, in particular to a price prediction method and device and electronic equipment.
Background
The prediction of commodity price is the basis of market prediction analysis and commodity production and sale decision, is an important problem in the field of market prediction, and plays a key role in many aspects such as commodity production, sale and the like.
Existing price prediction methods generally fit and forecast the price of a commodity with the Prophet time series prediction model. However, such forecasts tend to miss stationary, non-periodic components, which reduces the accuracy of price prediction.
Disclosure of Invention
The embodiment of the invention provides a price prediction method and device and electronic equipment, which can improve the accuracy of price prediction.
In a first aspect, an embodiment of the present invention provides a price prediction method, where the method includes:
acquiring historical data of a target commodity;
determining a target factor value for at least one price influencing factor influencing the price of the target commodity from the historical data, wherein the target factor value characterizes the amount of the price influencing factor;
dividing the historical data into a training set and a test set except the training set;
establishing a price prediction model according to the target factor value and the training set by using a SARIMAX algorithm;
determining, by using the price prediction model and according to the test set, a predicted value representing the price of the target commodity in a time period to be tested;
obtaining, by using a Prophet algorithm, a residual sequence in the time period to be tested according to the predicted value and the real price of the target commodity in the time period to be tested in the test set;
and determining, by using the price prediction model, the predicted price of the target commodity in a target time period according to the residual sequence and the predicted value.
Preferably,
the determining a target factor value of at least one price influencing factor influencing the price of the target commodity according to the historical data comprises:
determining, from the historical data, a numerical value and a first time point corresponding to at least one candidate factor that influences the price of the target commodity;
constructing an original matrix that takes the first time points as rows and the numerical values corresponding to the candidate factors as columns;
performing dimensionless processing on the original matrix to generate a target matrix;
taking the data of the i-th row of the target matrix as a reference data sequence;
for each row of data other than the reference data sequence, performing the following operations:
calculating the difference between the data in the x-th column of the current row and the data in the x-th column of the reference data sequence;
obtaining a difference sequence consisting of the calculated differences;
determining a minimum value, a maximum value and an absolute difference value in the difference sequence, and a preset resolution coefficient, wherein the absolute difference value is the difference between the difference data in the difference sequence and the reference number used for the dimensionless processing of the current row;
calculating at least one correlation coefficient corresponding to the data of the current row according to the minimum value, the maximum value, the absolute difference value and the resolution coefficient;
determining whether a target correlation coefficient within a preset threshold range exists among the at least one correlation coefficient;
if so, determining the target candidate factor corresponding to the target correlation coefficient in the current row;
taking the target candidate factor as a price influencing factor;
and acquiring the numerical value corresponding to the price influencing factor from the original matrix, and taking the acquired numerical value as the target factor value of the price influencing factor.
Preferably,
after the acquiring historical data of the target commodity and before the dividing the historical data into a training set and a test set other than the training set, the method further comprises:
converting the historical data into a target format to generate a price data set;
determining whether data is missing from the price data set;
when it is determined that data is missing from the price data set, inserting missing values into the historical data through linear interpolation according to the historical data;
determining whether abnormal data exists in the price data set after the missing values are inserted;
when it is determined that abnormal data exists in the price data set after the missing values are inserted, generating replacement data corresponding to the abnormal data through linear interpolation according to the historical data;
deleting the abnormal data, and inserting the replacement data at the position corresponding to the abnormal data to obtain modified historical data;
the dividing the historical data into a training set and a test set other than the training set comprises:
dividing the modified historical data into a training set and a test set other than the training set.
Preferably,
the determining, by using the price prediction model and according to the test set, a predicted value representing the price of the target commodity in a time period to be tested comprises:
determining at least one second time point included in the test set;
determining at least one target second time point from the at least one second time point, and generating a time period to be tested consisting of the at least one target second time point;
and inputting the time period to be tested into the price prediction model to obtain a predicted value representing the price of the target commodity in the time period to be tested.
Preferably,
the determining, by using the price prediction model, the predicted price of the target commodity in the target time period according to the residual sequence and the predicted value comprises:
S1: determining whether the residual sequence fails to conform to a normal distribution; if so, executing S2, otherwise executing S4;
S2: inputting the residual sequence into the Prophet algorithm to obtain an adjustment value;
S3: summing the adjustment value and the predicted value to obtain a replacement value, taking the replacement value as the predicted value, and returning to the step of obtaining a residual sequence in the time period to be tested according to the predicted value and the real price of the target commodity in the time period to be tested in the test set;
S4: performing error analysis on the residual sequence to obtain an error rate;
S5: determining whether the error rate is within a preset error range; if so, executing S6, otherwise executing S7;
S6: inputting a target time period into the price prediction model to obtain the predicted price of the target commodity in the target time period;
S7: returning to the step of determining a target factor value of at least one price influencing factor influencing the price of the target commodity according to the historical data.
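The residual loop in S1 to S5 can be sketched in plain Python. This is only an illustration under stated assumptions: the text does not name a normality test or an error measure, so the sketch uses a Jarque-Bera statistic compared against 5.991 (the 95% quantile of the chi-square distribution with 2 degrees of freedom) and the mean absolute percentage error. The helper `fit_residual_model` is a hypothetical stand-in for fitting Prophet on the residuals, and `max_error` and `max_rounds` are illustrative parameters.

```python
from statistics import mean


def residuals(actual, predicted):
    """Residual sequence: real price minus predicted value, point by point."""
    return [a - p for a, p in zip(actual, predicted)]


def jarque_bera(sample):
    """Jarque-Bera statistic; large values suggest a non-normal sample."""
    n = len(sample)
    m = mean(sample)
    m2 = sum((x - m) ** 2 for x in sample) / n
    if m2 == 0:
        return 0.0  # constant residuals: nothing left to model
    m3 = sum((x - m) ** 3 for x in sample) / n
    m4 = sum((x - m) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)


def looks_normal(sample, critical=5.991):
    """Assumed normality check; the test is unreliable on short samples."""
    return jarque_bera(sample) <= critical


def mape(actual, predicted):
    """Assumed error rate: mean absolute percentage error (actual != 0)."""
    return mean(abs((a - p) / a) for a, p in zip(actual, predicted))


def refine(actual, predicted, fit_residual_model, max_error=0.05, max_rounds=5):
    """Run S1-S5 and return (predicted values, error rate, within range?)."""
    current = list(predicted)
    for _ in range(max_rounds):
        res = residuals(actual, current)
        if looks_normal(res):                      # S1: normal, go to S4
            break
        adjustment = fit_residual_model(res)       # S2: Prophet in the text
        current = [p + a for p, a in zip(current, adjustment)]  # S3
    error = mape(actual, current)                  # S4
    return current, error, error <= max_error     # S5
```

If the returned flag is false, the caller would go back to factor selection (S7); if it is true, the fitted model is used for the target time period (S6).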
Preferably,
the establishing a price prediction model according to the target factor value and the training set by using a SARIMAX algorithm comprises:
S1: determining whether the training set is a stationary time series; if so, executing S2, otherwise executing S3;
S2: inputting the target factor value and the training set into the SARIMAX algorithm to obtain the price prediction model;
S3: performing seasonal differencing on the training set to obtain a difference set, taking the difference set as the training set, and returning to S1.
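The S1/S3 loop above can be illustrated with a short Python sketch. The text does not say which stationarity test is used, so the check is passed in as a callable; `halves_look_alike` is a deliberately crude, hypothetical placeholder (a real implementation would more likely use a unit-root test such as the augmented Dickey-Fuller test), and `period` and `max_rounds` are illustrative parameters.

```python
from statistics import mean, pstdev


def seasonal_difference(series, period):
    """Subtract the value one season earlier from each observation."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def halves_look_alike(series, tol=0.5):
    """Crude placeholder check: the two halves have similar means."""
    half = len(series) // 2
    spread = pstdev(series) or 1.0  # avoid a zero tolerance on flat series
    return abs(mean(series[:half]) - mean(series[half:])) <= tol * spread


def difference_until_stationary(series, period, is_stationary, max_rounds=2):
    """Repeat S1 (check) and S3 (seasonal differencing) until S1 passes."""
    current = list(series)
    rounds = 0
    while not is_stationary(current):
        if rounds == max_rounds or len(current) <= period:
            raise ValueError("series is still non-stationary after differencing")
        current = seasonal_difference(current, period)
        rounds += 1
    return current, rounds


# Made-up quarterly-style data: linear trend plus a repeating pattern.
trend_seasonal = [t + [0, 3, 1, 2][t % 4] for t in range(16)]
stationary, rounds = difference_until_stationary(
    trend_seasonal, period=4, is_stationary=halves_look_alike
)
```

The differenced series, rather than the raw training set, would then be handed to the SARIMAX fit in S2 together with the target factor values.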
In a second aspect, an embodiment of the present invention provides a price predicting apparatus, including:
the acquisition module is used for acquiring historical data of the target commodity;
the determining module is used for determining a target factor value of at least one price influence factor influencing the price of the target commodity according to the historical data acquired by the acquiring module, wherein the target factor value represents the amount of the price influence factor;
the processing module is used for dividing the historical data acquired by the acquisition module into a training set and a test set except the training set;
the prediction module is used for establishing a price prediction model, by using a SARIMAX algorithm, according to the target factor value determined by the determining module and the training set obtained by the processing module; determining, by using the price prediction model and according to the test set, a predicted value representing the price of the target commodity in a time period to be tested; obtaining, by using a Prophet algorithm, a residual sequence in the time period to be tested according to the predicted value and the real price of the target commodity in the time period to be tested in the test set; and determining, by using the price prediction model, the predicted price of the target commodity in a target time period according to the residual sequence and the predicted value.
Preferably,
the determining module is configured to perform:
determining, from the historical data, a numerical value and a first time point corresponding to at least one candidate factor that influences the price of the target commodity;
constructing an original matrix that takes the first time points as rows and the numerical values corresponding to the candidate factors as columns;
performing dimensionless processing on the original matrix to generate a target matrix;
taking the data of the i-th row of the target matrix as a reference data sequence;
for each row of data other than the reference data sequence, performing the following operations:
calculating the difference between the data in the x-th column of the current row and the data in the x-th column of the reference data sequence;
obtaining a difference sequence consisting of the calculated differences;
determining a minimum value, a maximum value and an absolute difference value in the difference sequence, and a preset resolution coefficient, wherein the absolute difference value is the difference between the difference data in the difference sequence and the reference number used for the dimensionless processing of the current row;
calculating at least one correlation coefficient corresponding to the data of the current row according to the minimum value, the maximum value, the absolute difference value and the resolution coefficient;
determining whether a target correlation coefficient within a preset threshold range exists among the at least one correlation coefficient;
if so, determining the target candidate factor corresponding to the target correlation coefficient in the current row;
taking the target candidate factor as a price influencing factor;
and acquiring the numerical value corresponding to the price influencing factor from the original matrix, and taking the acquired numerical value as the target factor value of the price influencing factor.
In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor is configured to invoke the machine-readable program to perform the method of any of the first aspects.
In a fourth aspect, embodiments of the present invention provide a computer-readable medium having stored thereon computer instructions, which, when executed by a processor, cause the processor to perform the method of any of the first aspects.
The embodiments of the invention provide a price prediction method, a price prediction device and electronic equipment. Conventional price prediction methods generally fit and forecast the price of the target commodity with the Prophet algorithm, but such forecasts tend to miss stationary, non-periodic components, which reduces the accuracy of price prediction. To improve accuracy, the SARIMAX algorithm can be used instead. SARIMAX extends the autoregressive integrated moving average (ARIMA) model with seasonal terms and with external factors that influence the price, and is suited to time series with pronounced periodic and seasonal characteristics. A price influencing factor is an external factor that influences the price of the target commodity, and its target factor value measures the degree of that influence. Accordingly, a target factor value of at least one price influencing factor is first determined from the acquired historical data of the target commodity, and the historical data is divided into a training set and a test set. A price prediction model is then established from the training set and the target factor value; the model covers both the price of the target commodity and the target factor value of the at least one price influencing factor, so it can measure how strongly each target factor value influences the price. To verify the model, it is used to predict the price of the target commodity in the time period to be tested, which is represented by the test set, yielding a predicted value. The Prophet algorithm is then used to obtain, from the predicted value and the real price of the target commodity in the test set for that period, a residual sequence representing the difference between the predicted value and the real value. Finally, the price of the target commodity in the target time period is predicted from the residual sequence and the predicted value. Predicting the price of the target commodity in this manner can improve the accuracy of price prediction.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a price forecasting method according to an embodiment of the present invention;
FIG. 2 is a flow chart of another price prediction method provided by an embodiment of the invention;
FIG. 3 is a schematic diagram of a price prediction device according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions are described below clearly and completely with reference to the drawings in the embodiments. The described embodiments are some, but not all, embodiments of the present invention; all other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
As shown in fig. 1, an embodiment of the present invention provides a price prediction method, which may include the following steps:
step 101: acquiring historical data of a target commodity;
step 102: determining a target factor value of at least one price influencing factor influencing the price of the target commodity according to the historical data, wherein the target factor value represents the amount of the price influencing factor;
step 103: dividing historical data into a training set and a test set except the training set;
step 104: establishing a price prediction model according to the target factor value and the training set by using a SARIMAX algorithm;
step 105: determining, by using the price prediction model and according to the test set, a predicted value representing the price of the target commodity in a time period to be tested;
step 106: obtaining, by using a Prophet algorithm, a residual sequence in the time period to be tested according to the predicted value and the real price of the target commodity in the time period to be tested in the test set;
step 107: and determining the predicted price of the target commodity in the target time period according to the residual sequence and the predicted value by using a price prediction model.
In the embodiment of the invention, conventional price prediction methods generally fit and forecast the price of the target commodity with the Prophet algorithm, but such forecasts tend to miss stationary, non-periodic components, which reduces the accuracy of price prediction. To improve accuracy, the SARIMAX algorithm can be used instead. SARIMAX extends the autoregressive integrated moving average (ARIMA) model with seasonal terms and with external factors that influence the price, and is suited to time series with pronounced periodic and seasonal characteristics. A price influencing factor is an external factor that influences the price of the target commodity, and its target factor value measures the degree of that influence. Accordingly, a target factor value of at least one price influencing factor is first determined from the acquired historical data of the target commodity, and the historical data is divided into a training set and a test set. A price prediction model is then established from the training set and the target factor value; the model covers both the price of the target commodity and the target factor value of the at least one price influencing factor, so it can measure how strongly each target factor value influences the price. To verify the model, it is used to predict the price of the target commodity in the time period to be tested, which is represented by the test set, yielding a predicted value. The Prophet algorithm is then used to obtain, from the predicted value and the real price of the target commodity in the test set for that period, a residual sequence representing the difference between the predicted value and the real value. Finally, the price of the target commodity in the target time period is predicted from the residual sequence and the predicted value. Predicting the price of the target commodity in this manner can improve the accuracy of price prediction.
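For orientation, the textbook form of a SARIMAX model (a regression on external factors with seasonal ARIMA errors) can be written as below. The notation is an assumption introduced here and does not appear in the source: $y_t$ is the price at time $t$, $x_t$ the vector of target factor values, $\beta$ their coefficients, $L$ the lag operator, $s$ the seasonal period, $(p, d, q)$ and $(P, D, Q)$ the non-seasonal and seasonal orders, and $\varepsilon_t$ white noise. The source does not state which orders or seasonal period are used.

```latex
\begin{aligned}
y_t &= \beta^{\top} x_t + u_t, \\
\phi_p(L)\,\Phi_P(L^{s})\,(1-L)^{d}\,(1-L^{s})^{D}\,u_t
  &= \theta_q(L)\,\Theta_Q(L^{s})\,\varepsilon_t .
\end{aligned}
```

Setting $\beta = 0$ and dropping the seasonal polynomials $\Phi_P$, $\Theta_Q$ and the factor $(1-L^{s})^{D}$ recovers plain ARIMA, which is the sense in which SARIMAX adds seasonal and external factors to ARIMA.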
To determine the target factor value of the at least one price influencing factor influencing the price of the target commodity, in an embodiment of the present invention, step 102 in the foregoing embodiment may be specifically implemented as follows:
determining, from the historical data, a numerical value and a first time point corresponding to at least one candidate factor that influences the price of the target commodity;
constructing an original matrix that takes the first time points as rows and the numerical values corresponding to the candidate factors as columns;
performing dimensionless processing on the original matrix to generate a target matrix;
taking the data of the i-th row of the target matrix as a reference data sequence;
for each row of data other than the reference data sequence, performing the following operations:
calculating the difference between the data in the x-th column of the current row and the data in the x-th column of the reference data sequence;
obtaining a difference sequence consisting of the calculated differences;
determining a minimum value, a maximum value and an absolute difference value in the difference sequence, and a preset resolution coefficient, wherein the absolute difference value is the difference between the difference data in the difference sequence and the reference number used for the dimensionless processing of the current row;
calculating at least one correlation coefficient corresponding to the data of the current row according to the minimum value, the maximum value, the absolute difference value and the resolution coefficient;
determining whether a target correlation coefficient within a preset threshold range exists among the at least one correlation coefficient;
if so, determining the target candidate factor corresponding to the target correlation coefficient in the current row;
taking the target candidate factor as a price influencing factor;
and acquiring the numerical value corresponding to the price influencing factor from the original matrix, and taking the acquired numerical value as the target factor value of the price influencing factor.
In the embodiment of the present invention, a price influencing factor is an external factor that influences the price of the target commodity, and its target factor value measures the degree of that influence. Many external factors may influence the price, however, and to keep the price prediction model simple the number of input variables, that is, the number of price influencing factors, needs to be controlled. The candidate factors can therefore be screened with the grey relational analysis shown in the steps above: the main external factors influencing the price are identified and fed into the price prediction model as input variables to predict the price of the target commodity.
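The screening idea can be sketched with the standard formulation of grey relational analysis, which may differ in detail from the row and column layout described in the steps above. The assumptions here are: the price series is the reference sequence, each candidate factor is a comparison sequence, series are made dimensionless by dividing by their mean, the resolution coefficient `rho` defaults to 0.5, and a factor is kept when its mean coefficient reaches a hypothetical `threshold`. All names and sample numbers are illustrative.

```python
from statistics import mean


def normalize(series):
    """Make a series dimensionless by dividing by its (non-zero) mean."""
    m = mean(series)
    if m == 0:
        raise ValueError("cannot normalize a series with zero mean")
    return [v / m for v in series]


def grey_relational_grades(reference, factors, rho=0.5):
    """Return {factor name: grade in (0, 1]} against the reference series."""
    ref = normalize(reference)
    diffs = {
        name: [abs(r - v) for r, v in zip(ref, normalize(values))]
        for name, values in factors.items()
    }
    all_diffs = [d for seq in diffs.values() for d in seq]
    d_min, d_max = min(all_diffs), max(all_diffs)
    if d_max == 0:
        # Every factor matches the reference exactly after normalization.
        return {name: 1.0 for name in diffs}
    grades = {}
    for name, seq in diffs.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in seq]
        grades[name] = mean(coeffs)
    return grades


def select_factors(reference, factors, threshold=0.7, rho=0.5):
    """Keep the candidate factors whose grade reaches the threshold."""
    grades = grey_relational_grades(reference, factors, rho)
    return [name for name, grade in grades.items() if grade >= threshold]


# Made-up numbers: "demand" moves in proportion to price, "noise" does not.
price = [100, 110, 120, 130]
candidates = {"demand": [10, 11, 12, 13], "noise": [5, 1, 9, 2]}
selected = select_factors(price, candidates)
```

The values of the selected factors, taken from the original (non-normalized) data, would then serve as the external inputs of the price prediction model.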
To improve the accuracy of price prediction, in an embodiment of the present invention, after the historical data of the target commodity is acquired in step 101 and before the historical data is divided into a training set and a test set other than the training set in step 103, the method further includes:
converting the historical data into a target format to generate a price data set;
determining whether data is missing from the price data set;
when it is determined that data is missing from the price data set, inserting missing values into the historical data through linear interpolation according to the historical data;
determining whether abnormal data exists in the price data set after the missing values are inserted;
when it is determined that abnormal data exists in the price data set after the missing values are inserted, generating replacement data corresponding to the abnormal data through linear interpolation according to the historical data;
deleting the abnormal data, and inserting the replacement data at the position corresponding to the abnormal data to obtain modified historical data;
the dividing the historical data into a training set and a test set other than the training set includes:
dividing the modified historical data into a training set and a test set other than the training set.
In the embodiment of the invention, the historical data is first converted into a price data set in a target format so that the data formats are unified. The historical data of the target commodity can be crawled from a website for the target commodity or acquired from other data sources. Because the historical data may come from different data sources, some data may be missing or abnormal, so the acquired historical data needs to be preprocessed. Where the price data set has missing data, missing values can be inserted by linear interpolation to keep the historical data complete; where the filled price data set still contains abnormal data, the abnormal data can be replaced by linear interpolation. The training set and the test set are then divided from the historical data after this processing.
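A minimal Python sketch of this preprocessing follows. Missing values are represented as `None` and filled by linear interpolation between the nearest known neighbours. The text does not say how abnormal data is detected, so the z-score rule and its `z_threshold` are assumptions made for illustration, and the sample prices are made up.

```python
from statistics import mean, stdev


def fill_linear(values):
    """Fill None entries by linear interpolation; edges copy the nearest value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def replace_outliers(values, z_threshold=2.0):
    """Mask points far from the mean (assumed rule), then re-interpolate them."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return list(values)
    masked = [None if abs(v - m) / s > z_threshold else v for v in values]
    return fill_linear(masked)


raw = [10.0, 11.0, None, 12.0, 11.0, 100.0, 10.0, 11.0, 12.0, 10.0]
cleaned = replace_outliers(fill_linear(raw))
```

Here the gap at index 2 is filled first, and the spike of 100.0 is then replaced by the value interpolated from its neighbours, mirroring the order of the steps above.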
To perform price prediction by using the price prediction model, in an embodiment of the present invention, step 105 in the foregoing embodiment may be specifically implemented by the following steps:
determining at least one second time point included in the test set;
determining at least one target second time point from the at least one second time point, and generating a time period to be tested consisting of the at least one target second time point;
and inputting the time period to be tested into the price prediction model to obtain a predicted value representing the price of the target commodity in the time period to be tested.
In the embodiment of the invention, the historical data is divided into a training set and a test set. The price prediction model makes predictions from the data in the training set, and the predicted values are analyzed and verified against the data in the test set. For example, if the historical data is the data of Huai mine commodity coal from January to December 2019, the data from January to August 2019 can be used as the training set and the data from September to December 2019 as the test set; that is, the at least one second time point included in the test set is September to December 2019. The historical data from January to August 2019 is input into the price prediction model to obtain a predicted value of the price of the target commodity, and prediction analysis is then performed based on the predicted value and the real data of the time period to be tested corresponding to the test set.
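The January to August / September to December split described above can be sketched as follows. This is a minimal illustration only: the function name `split_by_date` and the price values are hypothetical and do not come from the patent.

```python
from datetime import date


def split_by_date(records, first_test_month):
    """Split (month, price) records chronologically.

    Records dated before first_test_month form the training set;
    the remaining records form the test set.
    """
    ordered = sorted(records, key=lambda record: record[0])
    train = [r for r in ordered if r[0] < first_test_month]
    test = [r for r in ordered if r[0] >= first_test_month]
    return train, test


# Hypothetical monthly prices for 2019 (yuan/ton); the values are placeholders.
history = [(date(2019, month, 1), 500.0 + month) for month in range(1, 13)]

train_set, test_set = split_by_date(history, date(2019, 9, 1))

# The second time points of the test set make up the time period to be tested.
period_to_test = [month for month, _ in test_set]
```

A chronological split is used rather than a random one because the model must be evaluated on time points that follow the training data.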
In order to determine the predicted price of the target commodity in the target time period, in an embodiment of the present invention, step 107 in the above embodiment, in which the predicted price of the target commodity in the target time period is determined according to the residual sequence and the predicted value by using the price prediction model, may be specifically implemented as follows:
S1: determining whether the residual sequence is a sequence that does not conform to a normal distribution; if so, executing S2; otherwise, executing S4;
S2: assigning the residual sequence to the Prophet algorithm to obtain an adjustment value;
S3: summing the adjustment value and the predicted value to obtain a replacement value, taking the replacement value as the predicted value, and returning to the step of obtaining the residual sequence in the time period to be tested according to the predicted value and the real price of the target commodity in the time period to be tested in the test set;
S4: performing error analysis on the residual sequence to obtain an error rate;
S5: determining whether the error rate is within a preset error range; if so, executing S6; otherwise, executing S7;
S6: assigning the target time period to the price prediction model to obtain the predicted price of the target commodity in the target time period;
S7: returning to the step of determining, from the historical data, a target factor value of at least one price influence factor that influences the price of the target commodity.
In the embodiment of the invention, the residual sequence is the difference between the predicted values and the real values of the data in the test set. Because the residual sequence may be random, it is analyzed to determine whether it conforms to a normal distribution. When the residual sequence does not conform to a normal distribution, it is treated as a random sequence and the predicted value is further adjusted: the residual sequence is assigned to the Prophet algorithm to obtain an adjustment value, the adjustment value and the predicted value are summed to obtain the final predicted value, and the final residual sequence is obtained from the difference between the final predicted value and the real data in the test set. Because the residual sequence may still contain errors, error analysis is performed on it. When the error rate of the residual sequence is within the preset error range, the target time period is assigned to the price prediction model to obtain the predicted price of the target commodity in the target time period. When the error rate is not within the preset error range, the target factor value of at least one price influence factor is adjusted and the price prediction process is performed again. When the residual sequence is determined to conform to a normal distribution, it is not a random sequence, so error analysis can be performed directly and the price of the target commodity in the target time period can be predicted.
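The control flow of S1 to S5 can be sketched as below. Several points are assumptions made for illustration: the patent does not define the error rate, so the mean absolute percentage error is used; the residual is taken as real price minus predicted value so that adding the adjustment moves the prediction toward the real price; the normality check and the Prophet-based adjustment are passed in as callables; and a cap on the number of rounds is added so the loop always terminates.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, an assumed stand-in for the 'error rate'."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def refine_prediction(actual, predicted, is_normal, fit_adjustment,
                      max_error, max_rounds=5):
    """Return (final_prediction, error_rate, accepted) following S1 to S5."""
    predicted = list(predicted)
    for _ in range(max_rounds):
        residuals = [a - p for a, p in zip(actual, predicted)]
        if is_normal(residuals):                    # S1: normal, go to S4
            break
        adjustment = fit_adjustment(residuals)      # S2: adjustment value
        predicted = [p + d for p, d in zip(predicted, adjustment)]  # S3
    error_rate = mape(actual, predicted)            # S4: error analysis
    accepted = error_rate <= max_error              # S5: within preset range?
    return predicted, error_rate, accepted


# Toy stand-ins: the normality check and the adjustment model are stubs.
real_prices = [100.0, 110.0, 120.0]
first_guess = [90.0, 100.0, 110.0]
final, rate, ok = refine_prediction(
    real_prices,
    first_guess,
    is_normal=lambda res: all(abs(r) < 1.0 for r in res),
    fit_adjustment=lambda res: list(res),
    max_error=0.05,
)
```

When `accepted` is true the caller would proceed to S6; otherwise it would return to factor selection as in S7.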
In order to create the price prediction model, in an embodiment of the present invention, step 104 in the foregoing embodiment, in which the price prediction model is created according to the target factor value and the training set by using the SARIMAX algorithm, may be specifically implemented by:
S1: determining whether the training set is a stationary time series; if so, executing S2; otherwise, executing S3;
S2: assigning the target factor value and the training set to the SARIMAX algorithm to obtain the price prediction model;
S3: performing seasonal difference processing on the training set to obtain a difference set, taking the difference set as the training set, and returning to S1.
In the embodiment of the invention, when the price prediction model is created, a stationarity test is performed on the original time sequence, namely the training set, to improve the accuracy of prediction. When the training set is determined to be a stationary time sequence, the target factor value and the training set are assigned to the SARIMAX algorithm to obtain the price prediction model. When the training set is determined not to be a stationary time sequence, seasonal difference processing is performed on the training set to obtain a difference set, the difference set is used as the training set, and the stationarity test is repeated until the time sequence is stationary.
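The S1/S3 loop can be sketched as follows. The stationarity test is supplied by the caller because the patent does not fix one at this point; the toy check used in the example (a series counts as stationary when it is constant) and the cap on the number of differencing passes are assumptions for illustration only.

```python
def seasonal_difference(series, period=12):
    """Subtract from each value the value one seasonal period earlier."""
    return [series[i] - series[i - period] for i in range(period, len(series))]


def make_stationary(series, is_stationary, period=12, max_passes=2):
    """Difference seasonally until is_stationary(series) holds.

    Returns the resulting series and the number of seasonal differences
    applied (the D term of a seasonal order).
    """
    current = list(series)
    passes = 0
    while (not is_stationary(current)
           and passes < max_passes
           and len(current) > period):
        current = seasonal_difference(current, period)   # S3
        passes += 1
    return current, passes


# Toy monthly series: a linear trend plus a pattern that repeats every 12 months.
toy = [i + (10 if i % 12 < 6 else 0) for i in range(36)]

# Hypothetical stationarity check: the series is constant.
stationary, n_diffs = make_stationary(toy, lambda s: max(s) - min(s) < 1e-9)
```

In practice a statistical test such as a unit root test would replace the toy check.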
In one embodiment of the invention, the prediction uses a combined model of linear Prophet and nonlinear SARIMAX. The combined prediction method processes the time sequence and the error sequence separately, which improves prediction precision; the error sequence is the difference between the original sequence and the linear prediction. The original sequence, which has seasonal and periodic regularity, is processed by the nonlinear model SARIMAX, while the error sequence, which may contain some linear relationships, is analyzed with the Prophet model. Combining the linear and nonlinear models improves the accuracy of the system.
In order to illustrate the technical solution and the advantages of the present invention more clearly, the price prediction method provided in the embodiment of the present invention is described in detail below. As shown in fig. 2, the method may specifically include the following steps:
Step 201: Obtaining historical data of the target commodity, and determining, from the historical data, the values and first time points corresponding to at least one factor to be selected that influences the price of the target commodity.
For example, assuming that the target commodity is Huai mine commodity coal, the historical data can be the monthly price data of the commodity coal from June 2008 to March 2020, together with the provincial clean coal sales, the raw coal output of the mining area, and the mixed coal and washed mixed coal inventory.
Step 202: Constructing an original matrix whose rows are the first time points and whose columns are the values corresponding to the factors to be selected, and performing non-dimensionalization processing on the original matrix to generate a target matrix.
Step 203: taking the data of the ith row of the target matrix as a reference data sequence, calculating the difference value between the data of the x-th column in the data of the current row and the data of the x-th column of the reference data sequence aiming at the data of each row except the reference data sequence, and acquiring a difference value sequence consisting of the calculated difference values.
Step 204: determining a minimum value, a maximum value, an absolute difference value and a preset resolution coefficient in the difference sequence, and calculating at least one correlation coefficient corresponding to the data of the current line according to the minimum value, the maximum value, the absolute difference value and the resolution coefficient, wherein the absolute difference value is the difference value between the difference data in the difference sequence and a reference number for non-dimensionalization processing in the current line.
Step 205: Determining whether a target correlation coefficient within a preset threshold range exists in the at least one correlation coefficient.
Step 206: When it is determined that a target correlation coefficient within the preset threshold range exists in the at least one correlation coefficient, determining the target factor to be selected that corresponds to the target correlation coefficient in the current row, and taking the target factor to be selected as a price influence factor; then acquiring the value corresponding to the price influence factor from the original matrix, and taking the acquired value as the target factor value of the price influence factor, wherein the target factor value represents the amount of the price influence factor.
Specifically, since there are many external influence factors affecting the price, the target factor value of at least one target price influence factor may be determined by the gray relevance algorithm.
For example, assume that the factors to be selected are the provincial clean coal sales, the raw coal production of the provincial mining area, and the mixed coal and washed mixed coal inventory, each with its corresponding first time points. Taking the provincial clean coal sales as an example, at the first time point of June 2008 the value of the factor to be selected is 2045.3 tons. The values of the other factors to be selected are shown in the following table:
Figure BDA0002704199540000141
The constructed original matrix is:
Figure BDA0002704199540000142
The original matrix is non-dimensionalized by the following first equation:
Figure BDA0002704199540000143
where X1 denotes the target matrix, Xij denotes the data in the i-th row and j-th column of the original matrix, Xi1 denotes the data in the i-th row and 1st column of the original matrix, and j is a positive integer other than 1.
Here, Xi1 serves as the reference column for non-dimensionalization.
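The first equation itself is only available as an image. A form consistent with the symbol definitions above, offered as a reconstruction rather than as the patent's exact notation, divides each entry of a row by the entry in the reference column of that row:

```latex
\[
  X1_{ij} = \frac{X_{ij}}{X_{i1}}, \qquad j = 2, 3, \dots
\]
```

Under this form the first column of the target matrix is identically 1, which agrees with the leading 1 of the reference data sequence given below.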
The resulting target matrix is:
Figure BDA0002704199540000144
It can be determined that the reference data sequence is X0={1,0.9496,0.8005};
Taking the absolute value of the difference between the data in the same column in the 2 nd row and the 1 st row in the target matrix as the data in the 1 st row of the difference sequence, and taking the absolute value of the difference between the data in the same column in the 3 rd row and the 1 st row in the target matrix as the data in the 2 nd row of the difference sequence, the difference sequence can be obtained as follows:
Figure BDA0002704199540000151
The correlation coefficient is determined by the following second equation:
Figure BDA0002704199540000152
where Δ(min) is the minimum value in the difference sequence, ρ is the resolution coefficient (typically 0.5), Δ(max) is the maximum value in the difference sequence, and Δ0 is the absolute difference.
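The second equation is likewise only available as an image. With the symbols just defined, the usual form of the gray relational coefficient is shown below; it is given as a likely reconstruction, with ξ introduced here as an assumed name for the correlation coefficient:

```latex
\[
  \xi = \frac{\Delta(\min) + \rho\,\Delta(\max)}{\Delta_0 + \rho\,\Delta(\max)}
\]
```

With ρ = 0.5, an absolute difference of zero gives ξ = 1 whenever Δ(min) = 0, and larger absolute differences give smaller coefficients.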
The correlation coefficient may be determined as:
Figure BDA0002704199540000153
Figure BDA0002704199540000154
Figure BDA0002704199540000155
Assuming that the threshold for the correlation coefficient is 0.95, the target factors to be selected are the provincial clean coal sales and the mixed coal and washed mixed coal inventory, which can be used as the price influence factors.
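The factor screening of steps 202 to 206 can be sketched as follows. The numbers in the matrix are hypothetical placeholders, not the values from the patent's table, and averaging the coefficients of each column into a single grade before applying the threshold is an assumption made here to obtain one decision per factor.

```python
def nondimensionalize(matrix):
    """Divide every entry of a row by that row's first-column entry."""
    return [[value / row[0] for value in row] for row in matrix]


def relational_coefficients(target, rho=0.5):
    """Coefficients of every row after the first against the first row."""
    reference = target[0]
    deltas = [[abs(v - r) for v, r in zip(row, reference)] for row in target[1:]]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # All rows equal the reference row: every coefficient is 1.
        return [[1.0] * len(reference) for _ in deltas]
    return [[(d_min + rho * d_max) / (d + rho * d_max) for d in row]
            for row in deltas]


def select_factors(names, matrix, threshold=0.95, rho=0.5):
    """Return the factors whose mean coefficient reaches the threshold."""
    coefficients = relational_coefficients(nondimensionalize(matrix), rho)
    grades = [sum(column) / len(column) for column in zip(*coefficients)]
    chosen = [name for name, grade in zip(names, grades) if grade >= threshold]
    return chosen, grades


factor_names = [
    "provincial clean coal sales",
    "mining area raw coal production",
    "mixed and washed coal inventory",
]
# Rows are first time points, columns are factors; values are hypothetical.
original = [
    [2000.0, 1900.0, 1600.0],
    [2100.0, 1785.0, 1680.0],
    [2200.0, 1760.0, 1760.0],
]
selected, grades = select_factors(factor_names, original)
```

With these placeholder values the first and third factors are retained, while the second falls well below the 0.95 threshold.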
Step 207: Converting the historical data into a target format to generate a price data set.
Step 208: Determining whether data is missing from the price data set.
Step 209: When it is determined that data is missing from the price data set, inserting the missing values into the historical data through linear interpolation according to the historical data.
Step 210: Determining whether abnormal data exists in the filled price data set.
Step 211: When abnormal data exists in the filled price data set, generating replacement data corresponding to the abnormal data through linear interpolation according to the historical data, deleting the abnormal data, inserting the replacement data into the position corresponding to the abnormal data to obtain modified historical data, and dividing the modified historical data into a training set and a test set other than the training set.
Specifically, the price of Huai mine commodity coal exhibits trend, seasonality, holiday effects and other irregular periodicity, and the sequence contains some abnormal values, so data preprocessing is needed for the missing data and abnormal data in the historical data.
For example, assuming that the price of commodity coal is 150 yuan/ton in June 2008 and 130 yuan/ton in August 2008, the missing price for July 2008 can be determined as 140 yuan/ton by linear interpolation. Similarly, if the recorded price for July 2008 is 300 yuan/ton, which is much higher than the prices in the other months of the same year, the July 2008 price is abnormal and can be replaced with 140 yuan/ton by linear interpolation. If the price data for July 2008 is duplicated, the duplicate data can be deleted to avoid inaccurate price prediction.
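The worked example above can be reproduced with a short sketch. The rule used to flag abnormal data (deviation from the series mean beyond `max_deviation`) and all function names are assumptions for illustration; the patent only states that the abnormal price is much higher than that of the other months.

```python
def interpolate_missing(values):
    """Fill None entries by linear interpolation between known neighbours.

    Leading or trailing None entries have no two neighbours and stay None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def replace_outliers(values, max_deviation):
    """Blank out values far from the mean, then re-fill them by interpolation."""
    mean = sum(values) / len(values)
    masked = [v if abs(v - mean) <= max_deviation else None for v in values]
    return interpolate_missing(masked)


def drop_duplicates(records):
    """Keep the first (month, price) record seen for each month."""
    seen = {}
    for month, price in records:
        seen.setdefault(month, price)
    return sorted(seen.items())


# June, July and August 2008, with the July price missing.
filled = interpolate_missing([150.0, None, 130.0])

# The same months with an abnormal July price of 300 yuan/ton.
cleaned = replace_outliers([150.0, 300.0, 130.0], max_deviation=100.0)

deduplicated = drop_duplicates(
    [("2008-07", 140.0), ("2008-06", 150.0), ("2008-07", 140.0)]
)
```

Both the missing-value case and the abnormal-value case yield 140 yuan/ton for July 2008, matching the example in the text.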
Step 212: Determining whether the training set is a stationary time series; if so, executing step 213; otherwise, executing step 214.
Step 213: Assigning the target factor value and the training set to the SARIMAX algorithm to obtain the price prediction model, and executing step 215.
Specifically, the method is based on the SARIMAX algorithm, which adds seasonality and external factors that affect the price (such as power plant inventory and daily consumption, port inventory, port coal price, and railway volume). In selecting the variables, the price index of the commercial coal is used as the explained variable, and the supply of commercial coal, the demand for commercial coal, the supply and demand gap of commercial coal, and the social inventory of power coal are used as the explanatory variables. Accordingly, the fluctuations of the supply of commercial coal, of the stock of commercial coal, and of the supply and demand gap of commercial coal each contribute to predicting the fluctuation of the commercial coal price index. When price prediction is performed through the SARIMAX algorithm, the data for the model can be prepared by creating a timestamp, converting the type of the date or time column, making the series univariate, and the like.
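One possible way to pass the explanatory variables to a SARIMAX model is sketched below using the third-party statsmodels library, which the patent does not name. The helper names are hypothetical, and the `order` and `seasonal_order` defaults are placeholders that would in practice be chosen from the ACF and PACF analysis.

```python
def build_exog(factors):
    """Turn {factor name: [value per month]} into one regressor row per month.

    Columns are ordered by factor name so that the training and forecasting
    inputs line up.
    """
    names = sorted(factors)
    return [list(row) for row in zip(*(factors[name] for name in names))]


def fit_price_model(prices, exog_rows,
                    order=(1, 1, 1), seasonal_order=(1, 1, 1, 12)):
    """Fit SARIMAX with the target factor values as exogenous regressors."""
    import numpy as np  # imported lazily; third-party dependency
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    model = SARIMAX(
        np.asarray(prices, dtype=float),
        exog=np.asarray(exog_rows, dtype=float),
        order=order,
        seasonal_order=seasonal_order,
    )
    return model.fit(disp=False)


def forecast_prices(fitted, future_exog_rows):
    """Forecast one price per row of future explanatory values."""
    import numpy as np

    return fitted.forecast(
        steps=len(future_exog_rows),
        exog=np.asarray(future_exog_rows, dtype=float),
    )


# Hypothetical two-month example of arranging the explanatory variables.
exog = build_exog({"supply": [1.0, 2.0], "demand": [3.0, 4.0]})
```

Forecasting with exogenous regressors requires their future values for the forecast horizon, which is why `forecast_prices` takes them as input.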
For example, an original graph with time as the abscissa and price as the ordinate is created from the historical data of commodity coal. Based on the prices in the original graph, the average commodity coal price over the time period shown on the abscissa is calculated. When the absolute value of the difference between the price at any time point in the original graph and this average is not within the preset difference range, that price is replaced by linear interpolation, and a moving average graph is generated from the replaced price data. The original graph and the moving average graph have the same unit length on the abscissa and represent the same time period. A standard deviation is then calculated for each time point on the abscissa of the original graph and the moving average graph, and a standard deviation graph is generated from the time points and their corresponding standard deviations; its abscissa has the same unit length and represents the same time period as that of the original graph.
It is then detected whether the time sequence of the commodity coal price in the moving average graph has zero mean and whether the square of each standard deviation in the standard deviation graph (that is, the corresponding variance) is constant. If not, the time sequence of the commodity coal price is non-stationary, and the time sequence is decomposed to obtain graphs of the influence of the long-term trend factor, the seasonal variation factor, the cyclic variation factor and the irregular variation factor on the price. It is then determined whether the commodity coal price is influenced by seasonal factors. If so, the autocorrelation coefficients are checked to determine whether they lie within the confidence interval. When they do not, a unit root test is performed to avoid the spurious regression problem: whether a unit root exists is determined by comparing the value of the test statistic with its critical value at the given confidence level. When a unit root exists, the original time sequence is confirmed to be non-stationary and seasonal difference processing is required.
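The moving average and standard deviation curves described above can be computed with the standard library as sketched here. The window length of 3 and the toy prices are assumptions, and comparing the spread of each curve with a small tolerance is only a rough stand-in for the visual inspection and the formal tests mentioned in the text.

```python
from statistics import mean, pstdev


def rolling(values, window, func):
    """Apply func to every full window of consecutive values."""
    return [func(values[i - window + 1:i + 1])
            for i in range(window - 1, len(values))]


def spread(values):
    """Difference between the largest and smallest value of a curve."""
    return max(values) - min(values)


# Hypothetical prices with a steady upward trend (yuan/ton).
prices = [100.0, 102.0, 104.0, 106.0, 108.0]

moving_average = rolling(prices, 3, mean)
moving_std = rolling(prices, 3, pstdev)

# A drifting moving average signals a non-stationary mean, even though the
# variance (square of the standard deviation) stays constant here.
mean_is_constant = spread(moving_average) < 1e-9
variance_is_constant = spread([s * s for s in moving_std]) < 1e-9
```

In this toy series the variance is constant but the mean drifts, so the series would be treated as non-stationary.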
Step 214: Performing seasonal difference processing on the training set to obtain a difference set, taking the difference set as the training set, and returning to step 212.
For example, when the original time sequence is not a stationary sequence, seasonal differencing can be performed, for example with a lag of 12 months. A white noise check can be performed after the differencing: if the result shows that the sequence is stationary and is not white noise, the differencing ends; otherwise the differencing operation continues. An autocorrelation function (ACF) graph and a partial autocorrelation function (PACF) graph can then be created to determine the input parameters of the price prediction model, and the seasonal autoregressive order P and the seasonal moving average order Q can be determined from the ACF and PACF graphs.
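The quantities behind an ACF graph and a white noise check can be sketched as follows. The patent does not say which white noise test is used; the Ljung-Box Q statistic is assumed here, and only the statistic is computed, not its p-value. The PACF is not covered by this sketch.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (series must not be constant)."""
    n = len(series)
    mu = sum(series) / n
    denominator = sum((x - mu) ** 2 for x in series)
    return [
        sum((series[t] - mu) * (series[t - k] - mu) for t in range(k, n))
        / denominator
        for k in range(max_lag + 1)
    ]


def ljung_box_q(series, max_lag):
    """Ljung-Box Q statistic; large values indicate the series is not white noise."""
    n = len(series)
    acf = autocorrelation(series, max_lag)
    return n * (n + 2) * sum(acf[k] ** 2 / (n - k) for k in range(1, max_lag + 1))


# Toy series that alternates in sign, so it is strongly autocorrelated.
toy = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
acf = autocorrelation(toy, 2)
q_stat = ljung_box_q(toy, 2)
```

The Q statistic would be compared with a chi-square critical value whose degrees of freedom equal the number of lags tested.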
Step 215: Determining at least one second time point included in the test set; determining at least one target second time point from the at least one second time point; generating a time period to be tested consisting of the at least one target second time point; and assigning the time period to be tested to the price prediction model to obtain a predicted value characterizing the price of the target commodity in the time period to be tested.
Step 216: Obtaining, by using the Prophet algorithm, a residual sequence in the time period to be tested according to the predicted value and the real price of the target commodity in the time period to be tested in the test set.
Step 217: Determining whether the residual sequence is a sequence that does not conform to a normal distribution; if so, executing step 218; otherwise, executing step 219.
Step 218: Assigning the residual sequence to the Prophet algorithm to obtain an adjustment value; summing the adjustment value and the predicted value to obtain a replacement value, taking the replacement value as the predicted value, and returning to step 216.
Step 219: Performing error analysis on the residual sequence to obtain an error rate.
Step 220: Determining whether the error rate is within a preset error range; if so, executing step 221; otherwise, returning to step 201.
Step 221: Assigning the target time period to the price prediction model to obtain the predicted price of the target commodity in the target time period.
Specifically, when the residual sequence is predicted based on the Prophet algorithm, the following settings may be made: first, a piecewise linear or logistic growth trend curve, where Prophet can automatically detect trend changes by selecting change points from the data; second, periodic components, each created by using dummy variables; and third, a user-defined list of important festivals and holidays.
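These settings map onto constructor arguments of the open-source `prophet` package, as sketched below. The sketch assumes monthly data and that `prophet` and `pandas` are installed; the helper names and the choice of which seasonalities to enable are illustrative assumptions, not the patent's configuration.

```python
def holiday_rows(name, dates):
    """Build the rows of a holiday table with 'holiday' and 'ds' fields."""
    return [{"holiday": name, "ds": day} for day in dates]


def fit_residual_model(dates, residuals, holidays=None):
    """Fit Prophet to a residual sequence and return the fitted model."""
    import pandas as pd  # imported lazily; third-party dependencies
    from prophet import Prophet

    frame = pd.DataFrame({"ds": pd.to_datetime(dates), "y": residuals})
    holiday_frame = None
    if holidays:
        holiday_frame = pd.DataFrame(holidays)
        holiday_frame["ds"] = pd.to_datetime(holiday_frame["ds"])
    model = Prophet(
        growth="linear",            # piecewise linear trend with change points
        yearly_seasonality=True,    # periodic component for monthly data
        weekly_seasonality=False,
        daily_seasonality=False,
        holidays=holiday_frame,     # user-defined festival and holiday list
    )
    model.fit(frame)
    return model


def adjustment_values(model, dates):
    """Predict the adjustment value for each requested date."""
    import pandas as pd

    future = pd.DataFrame({"ds": pd.to_datetime(dates)})
    return model.predict(future)["yhat"].tolist()


# Illustrative holiday list: Spring Festival dates for 2019 and 2020.
spring_festival = holiday_rows("Spring Festival", ["2019-02-05", "2020-01-25"])
```

The values returned by `adjustment_values` would be summed with the predicted values as in step 218.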
For example, a kernel density estimate (KDE) curve of the residuals can be compared with a normal distribution curve to determine whether the residual sequence is random.
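Where a numeric check is preferred over a visual comparison, a normality test can be substituted. The sketch below uses the Jarque-Bera statistic, which is a swapped-in technique and not the KDE comparison described in the text; the 0.05 significance level is an assumption, and the test is asymptotic, so it is unreliable for very short residual sequences.

```python
import math


def jarque_bera(residuals):
    """Return the Jarque-Bera statistic and its asymptotic p-value.

    The residuals must not all be equal, otherwise the variance is zero.
    """
    n = len(residuals)
    mu = sum(residuals) / n
    m2 = sum((r - mu) ** 2 for r in residuals) / n
    m3 = sum((r - mu) ** 3 for r in residuals) / n
    m4 = sum((r - mu) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


def conforms_to_normal(residuals, alpha=0.05):
    """True when normality is not rejected at significance level alpha."""
    return jarque_bera(residuals)[1] >= alpha


# Toy residual sequences: one symmetric, one dominated by a single spike.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
spiky = [0.0] * 19 + [100.0]

jb_symmetric, p_symmetric = jarque_bera(symmetric)
```

A sequence for which `conforms_to_normal` is false would be sent to the Prophet adjustment of step 218.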
As shown in fig. 3, an embodiment of the present invention provides a price predicting apparatus, including:
an acquisition module 301, configured to acquire historical data of a target commodity;
a determining module 302, configured to determine, according to the historical data acquired by the acquisition module 301, a target factor value of at least one price influence factor that influences the price of the target commodity, where the target factor value represents the amount of the price influence factor;
a processing module 303, configured to divide the historical data acquired by the acquisition module 301 into a training set and a test set other than the training set;
a prediction module 304, configured to create, by using the SARIMAX algorithm, a price prediction model according to the target factor value determined by the determining module 302 and the training set obtained by the processing module 303; determine, by using the price prediction model, a predicted value characterizing the price of the target commodity in the time period to be tested according to the test set; obtain, by using the Prophet algorithm, a residual sequence in the time period to be tested according to the predicted value and the real price of the target commodity in the time period to be tested in the test set; and determine, by using the price prediction model, the predicted price of the target commodity in the target time period according to the residual sequence and the predicted value.
In the embodiment of the invention, conventional price prediction methods generally fit and predict the price of the target commodity with the Prophet algorithm, but such predictions tend to miss stationary components that have no periodicity, which reduces the accuracy of price prediction. To improve accuracy, the SARIMAX algorithm can be used for price prediction. It adds seasonal factors and external factors that influence the price to the autoregressive integrated moving average (ARIMA) model, and is suitable for time sequences with obvious periodicity and seasonal characteristics. The price influence factor is an external factor that influences the price of the target commodity, and the target factor value measures the degree to which the price influence factor influences the price. Accordingly, the determining module determines the target factor value of at least one price influence factor from the historical data acquired by the acquisition module, and the processing module divides the historical data into a training set and a test set. The prediction module then creates a price prediction model from the training set and the target factor value; the model contains the price of the target commodity and the target factor value of at least one price influence factor, so it can measure the degree to which the target factor value influences the price. To verify the price prediction model, the price of the target commodity in the time period to be tested, which is characterized by the test set, is predicted with the model to obtain a predicted value. Using the Prophet algorithm, a residual sequence representing the difference between the predicted value and the real value of the test set in the time period to be tested is then obtained from the predicted value and the real price of the target commodity in the test set. Finally, the price of the target commodity in the target time period is predicted from the residual sequence and the predicted value to obtain the predicted price. Predicting the price of the target commodity in this way improves the accuracy of price prediction.
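The division of work among modules 301 to 304 can be pictured as four pluggable components, as in the structural sketch below. The class name, field names and the stub implementations are hypothetical; the stubs only show how data passes between the modules and do not perform real prediction.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class PricePredictionDevice:
    """Wires together stand-ins for modules 301 to 304."""

    acquire: Callable[[], List[float]]                        # module 301
    determine_factors: Callable[[List[float]], List[float]]   # module 302
    split: Callable[[List[float]], Tuple[list, list]]         # module 303
    predict: Callable[[list, list, list, Any], float]         # module 304

    def run(self, target_period):
        history = self.acquire()
        factor_values = self.determine_factors(history)
        train, test = self.split(history)
        return self.predict(factor_values, train, test, target_period)


# Stub modules: the "prediction" is just the mean of the training set.
device = PricePredictionDevice(
    acquire=lambda: [1.0, 2.0, 3.0, 4.0],
    determine_factors=lambda history: [0.5],
    split=lambda history: (history[:3], history[3:]),
    predict=lambda factors, train, test, period: sum(train) / len(train),
)
predicted_price = device.run("2020-04")
```

Keeping the modules as separate callables mirrors the statement that components may be combined, split, or implemented in hardware or software.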
In an embodiment of the present invention, the determining module 302 is configured to perform:
determining a numerical value and a first time point corresponding to at least one factor to be selected which influences the price of the target commodity from the historical data;
constructing an original matrix which takes the first time point as a row and takes the numerical value corresponding to the factor to be selected as a column;
carrying out dimensionless processing on the original data matrix to generate a target matrix;
taking the data of the ith row of the target matrix as a reference data sequence;
for each line of data outside the reference data sequence, performing the following operations:
calculating a difference value between the data of the x-th column in the data of the current row and the data of the x-th column of the reference data sequence;
obtaining a difference sequence consisting of the calculated differences;
determining a minimum value, a maximum value, an absolute difference value and a preset resolution coefficient in the difference value sequence, wherein the absolute difference value is the difference value between the difference value data in the difference value sequence and a reference number for non-dimensionalization processing in a current line;
calculating at least one correlation coefficient corresponding to the data of the current line according to the minimum value, the maximum value, the absolute difference value and the resolution coefficient;
determining whether a target correlation coefficient within a preset threshold range exists in at least one correlation coefficient;
if so, determining the target factor to be selected that corresponds to the target correlation coefficient in the current row;
taking the target factor to be selected as a price influence factor;
and acquiring the value corresponding to the price influence factor from the original matrix, and taking the acquired value as the target factor value of the price influence factor.
It is to be understood that the illustrated structure of the embodiment of the present invention does not specifically limit the price prediction device. In other embodiments of the invention the price prediction means may comprise more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Because the information interaction, execution process, and other contents between the units in the device are based on the same concept as the method embodiment of the present invention, specific contents may refer to the description in the method embodiment of the present invention, and are not described herein again.
The embodiment of the invention also provides a price prediction device, which comprises: at least one memory and at least one processor;
at least one memory for storing a machine readable program;
at least one processor for invoking a machine readable program to perform a price prediction method in any embodiment of the present invention.
Embodiments of the present invention further provide a computer-readable medium, on which computer instructions are stored, and when executed by a processor, the computer instructions cause the processor to execute the price prediction method in any of the embodiments of the present invention.
Specifically, a system or an apparatus equipped with a storage medium on which software program codes that realize the functions of any of the above-described embodiments are stored may be provided, and a computer (or a CPU or MPU) of the system or the apparatus is caused to read out and execute the program codes stored in the storage medium.
In this case, the program code itself read from the storage medium can realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code constitute a part of the present invention.
Examples of the storage medium for supplying the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communications network.
Further, it should be clear that the functions of any one of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform a part or all of the actual operations based on instructions of the program code.
Further, it is to be understood that the program code read out from the storage medium is written to a memory provided in an expansion board inserted into the computer or to a memory provided in an expansion unit connected to the computer, and then causes a CPU or the like mounted on the expansion board or the expansion unit to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above-described embodiments.
The embodiments of the invention have at least the following beneficial effects:
1. In the embodiment of the invention, conventional price prediction methods generally fit and predict the price of the target commodity with the Prophet algorithm, but such predictions tend to miss stationary components that have no periodicity, which reduces the accuracy of price prediction. To improve accuracy, the SARIMAX algorithm can be used for price prediction. It adds seasonal factors and external factors that influence the price to the autoregressive integrated moving average (ARIMA) model, and is suitable for time sequences with obvious periodicity and seasonal characteristics. The price influence factor is an external factor that influences the price of the target commodity, and the target factor value measures the degree to which the price influence factor influences the price. Accordingly, the target factor value of at least one price influence factor is first determined from the acquired historical data of the target commodity, and the historical data is divided into a training set and a test set. A price prediction model is then created from the training set and the target factor value; the model contains the price of the target commodity and the target factor value of at least one price influence factor, so it can measure the degree to which the target factor value influences the price. To verify the price prediction model, the price of the target commodity in the time period to be tested, which is characterized by the test set, is predicted with the model to obtain a predicted value. Using the Prophet algorithm, a residual sequence representing the difference between the predicted value and the real value of the test set in the time period to be tested is then obtained from the predicted value and the real price of the target commodity in the test set. Finally, the price of the target commodity in the target time period is predicted from the residual sequence and the predicted value to obtain the predicted price. Predicting the price of the target commodity in this way improves the accuracy of price prediction;
2. In an embodiment of the invention, the price influence factor is an external factor that influences the price of the target commodity, and the target factor value measures how strongly that factor influences the price. Many external factors may influence the price, however, so to keep the price prediction model simple, the number of input variables, that is, the number of price influence factors, needs to be controlled. All candidate price influence factors can therefore be screened with the grey relational analysis (gray relevance) algorithm shown in the above steps, so that the main external factors influencing the price are selected from the candidates and fed into the price prediction model as input variables to predict the price of the target commodity;
3. In an embodiment of the invention, the historical data needs to be converted into a price data set in the target format so that the data format is uniform. The historical data of the target commodity can be crawled from a target commodity website or obtained from other data sources. Because the historical data may come from different data sources, some data may be missing or abnormal, so the missing data and abnormal data in the acquired historical data need to be preprocessed. When the price data set has missing data, the missing values can be filled in by linear interpolation to ensure the integrity of the historical data; when the filled price data set still contains abnormal data, the abnormal data can be replaced by linear interpolation. The training set and the test set are then divided from the historical data in which the missing and abnormal data have been processed.
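The preprocessing described in item 3 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the function names are hypothetical, missing values are represented as `None`, the rule used to flag abnormal data (distance from the median measured in median absolute deviations, with a cutoff `k`) is an assumption because the text does not say how abnormal data is detected, and the 80/20 chronological split ratio is likewise assumed.

```python
from statistics import median


def interpolate_missing(values):
    """Fill None entries by linear interpolation between the nearest known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    for i, v in enumerate(filled):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:        # leading gap: copy the first known value
            filled[i] = values[right]
        elif right is None:     # trailing gap: copy the last known value
            filled[i] = values[left]
        else:                   # interior gap: straight line between the neighbours
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def replace_abnormal(values, k=5.0):
    """Replace abnormal points by linear interpolation.

    Assumed rule (not from the source): a point is abnormal when it lies more
    than k median absolute deviations away from the median of the series.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)
    masked = [None if abs(v - med) > k * mad else v for v in values]
    return interpolate_missing(masked)


def split_chronologically(values, train_ratio=0.8):
    """Split a time-ordered series into a training set and the remaining test set."""
    cut = int(len(values) * train_ratio)
    return values[:cut], values[cut:]
```

Filling the gaps before looking for abnormal values mirrors the order given in the text: the abnormal-data check runs on the already filled price data set.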
It should be noted that not all steps and modules in the above flows and system structure diagrams are necessary, and some steps or modules may be omitted according to actual needs. The execution order of the steps is not fixed and can be adjusted as required. The system structure described in the above embodiments may be a physical structure or a logical structure, that is, some modules may be implemented by the same physical entity, or some modules may be implemented by a plurality of physical entities, or some components in a plurality of independent devices may be implemented together.
In the above embodiments, a hardware unit may be implemented mechanically or electrically. For example, a hardware unit may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. A hardware unit may also comprise programmable logic or circuitry, such as a general purpose processor or other programmable processor, that is temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, permanently dedicated circuitry, or temporarily configured circuitry) may be determined based on cost and time considerations.
While the invention has been shown and described in detail in the drawings and in the preferred embodiments, the invention is not limited to the embodiments disclosed. It will be apparent to those skilled in the art that the means of the various embodiments described above may be combined to obtain further embodiments of the invention, which also fall within the scope of the invention.

Claims (10)

1. A method of price forecasting, the method comprising:
acquiring historical data of a target commodity;
determining a target factor value for at least one price influencing factor influencing the price of the target commodity from the historical data, wherein the target factor value characterizes the amount of the price influencing factor;
dividing the historical data into a training set and a test set other than the training set;
establishing a price prediction model according to the target factor value and the training set by using a SARIMAX algorithm;
determining a predicted value for representing the price of the target commodity in a time period to be tested according to the test set by using the price prediction model;
obtaining a residual sequence in the time period to be predicted according to the predicted value and the real price of the target commodity in the time period to be predicted in the test set by utilizing a Prophet algorithm;
and determining the predicted price of the target commodity in the target time period according to the residual sequence and the predicted value by using the price prediction model.
2. The method of claim 1,
the determining a target factor value for at least one price influencing factor influencing the price of the target commodity from the historical data comprises:
determining a numerical value and a first time point corresponding to at least one factor to be selected which influences the price of the target commodity from the historical data;
constructing an original matrix which takes the first time point as a row and takes the numerical value corresponding to the factor to be selected as a column;
carrying out dimensionless processing on the original matrix to generate a target matrix;
taking the data of the ith row of the target matrix as a reference data sequence;
for each row of data outside the reference data sequence, performing the following operations:
calculating a difference value of the data of the x column in the data of the current row and the data of the x column of the reference data sequence;
obtaining a difference sequence consisting of the calculated differences;
determining a minimum value, a maximum value, an absolute difference value and a preset resolution coefficient in the difference value sequence, wherein the absolute difference value is a difference value between difference value data in the difference value sequence and a reference number for non-dimensionalization processing in a current row;
calculating at least one correlation coefficient corresponding to the data of the current row according to the minimum value, the maximum value, the absolute difference value and the resolution coefficient;
determining whether a target correlation coefficient within a preset threshold range exists in the at least one correlation coefficient;
if so, determining a target factor to be selected corresponding to the target correlation coefficient in the current row;
taking the target factor to be selected as a price influence factor;
and acquiring a numerical value corresponding to the price influence factor from the original matrix, and taking the acquired numerical value as a target factor value of the price influence factor.
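As an illustration only, the screening recited in claim 2 resembles grey relational analysis, which can be sketched as below. The sketch follows the common textbook formulation (price series as the reference sequence, each candidate factor as a comparison sequence, min-max scaling as the dimensionless processing, and the mean coefficient per factor compared with a threshold); the claim's exact row and column convention may differ. The function names, the resolution coefficient `rho=0.5`, and the threshold `0.7` are assumptions.

```python
def min_max_normalize(seq):
    """Dimensionless processing: rescale a sequence to the range [0, 1]."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(v - lo) / (hi - lo) for v in seq]


def grey_relational_grades(reference, candidates, rho=0.5):
    """Return {factor name: mean grey relational coefficient}.

    reference  -- price series, ordered by time
    candidates -- dict mapping a factor name to its series, aligned with reference
    rho        -- resolution (distinguishing) coefficient, assumed to be 0.5
    """
    ref = min_max_normalize(reference)
    # Absolute differences between each normalized factor and the reference.
    diffs = {
        name: [abs(r - c) for r, c in zip(ref, min_max_normalize(series))]
        for name, series in candidates.items()
    }
    all_diffs = [d for seq in diffs.values() for d in seq]
    d_min, d_max = min(all_diffs), max(all_diffs)
    if d_max == 0:  # every factor tracks the reference exactly
        return {name: 1.0 for name in diffs}
    grades = {}
    for name, seq in diffs.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in seq]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


def select_factors(grades, threshold=0.7):
    """Keep the factors whose grade reaches the (assumed) threshold."""
    return [name for name, grade in grades.items() if grade >= threshold]
```

A factor that moves in lockstep with the price receives a grade of 1, while an unrelated factor receives a lower grade and is dropped, which keeps the number of input variables of the prediction model small.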
3. The method of claim 1,
after the obtaining of the historical data of the target commodity, before the dividing the historical data into a training set and a test set other than the training set, further comprising:
converting the historical data into a target format to generate a price data set;
determining whether data is missing from the price dataset;
when the price data set is determined to be missing data, inserting missing values into the historical data through linear interpolation according to the historical data;
determining whether the filled price data set has abnormal data;
when abnormal data exist in the price data set after the missing values are inserted, generating replacement data corresponding to the abnormal data through linear interpolation according to the historical data;
deleting the abnormal data, and inserting the replacement data into a position corresponding to the abnormal data to obtain modified historical data;
the dividing the historical data into a training set and a test set other than the training set includes:
and dividing the modified historical data into a training set and a test set other than the training set.
4. The method of claim 1,
the determining, by using the price prediction model and according to the test set, a predicted value for characterizing the price of the target commodity in a time period to be tested includes:
determining at least one second point in time comprised by the test set;
determining at least one target second time point from the at least one second time point, and generating a time period to be measured consisting of the at least one target second time point;
and assigning the time period to be measured to the price prediction model to obtain a prediction value for representing the price of the target commodity in the time period to be measured.
5. The method of claim 1,
the determining the predicted price of the target commodity in the target time period according to the residual sequence and the predicted value by using the price prediction model comprises:
S1: determining whether the residual sequence is a sequence which does not conform to normal distribution, if so, executing S2, otherwise, executing S4;
S2: assigning the residual sequence to the Prophet algorithm to obtain an adjustment value;
S3: summing the adjustment value and the predicted value to obtain a replacement value, taking the replacement value as the predicted value, and returning to the step of obtaining a residual sequence in the time period to be predicted according to the predicted value and the real price of the target commodity in the time period to be predicted in the test set;
S4: carrying out error analysis on the residual sequence to obtain an error rate;
S5: determining whether the error rate is within a preset error range, if so, executing S6, otherwise, executing S7;
S6: assigning a target time period to the price prediction model to obtain the predicted price of the target commodity in the target time period;
S7: returning to the step of determining a target factor value for at least one price influencing factor influencing the price of the target commodity from the historical data.
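The control flow of steps S1 to S5 can be sketched as below, purely as an illustration. Several pieces are assumptions that the claim does not specify: normality is checked with a Jarque-Bera statistic against the 5% chi-square critical value (5.991), the error rate is a mean absolute percentage error, the residual is defined as real price minus predicted value, and `max_rounds` caps the loop so that it always terminates. The residual model is passed in as a callable (`fit_residual_model`, a hypothetical name); in practice it could wrap a Prophet fit-and-predict cycle, which is not shown here.

```python
from statistics import mean


def jarque_bera(residuals):
    """Jarque-Bera statistic computed from sample skewness and kurtosis."""
    n = len(residuals)
    mu = mean(residuals)
    m2 = sum((r - mu) ** 2 for r in residuals) / n
    if m2 == 0:
        return 0.0
    m3 = sum((r - mu) ** 3 for r in residuals) / n
    m4 = sum((r - mu) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def looks_normal(residuals, critical=5.991):
    """Assumed test: accept normality when the statistic is below the 5% critical value."""
    return jarque_bera(residuals) < critical


def mean_absolute_percentage_error(actual, predicted):
    """Assumed error rate for S4 (requires non-zero actual prices)."""
    return mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def correct_with_residual_model(actual, predicted, fit_residual_model,
                                max_rounds=5, max_error=0.05):
    """Return (corrected predictions, error rate, accepted flag)."""
    predicted = list(predicted)
    for _ in range(max_rounds):
        residuals = [a - p for a, p in zip(actual, predicted)]
        if looks_normal(residuals):                 # S1: normal, so go to S4
            break
        adjustment = fit_residual_model(residuals)  # S2: model the residuals
        predicted = [p + adj for p, adj in zip(predicted, adjustment)]  # S3
    error_rate = mean_absolute_percentage_error(actual, predicted)      # S4
    return predicted, error_rate, error_rate <= max_error               # S5
```

When the returned flag is true the caller would proceed to S6 and forecast the target time period; when it is false the caller would go back to factor selection, as in S7.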
6. The method of claim 1,
the establishing a price prediction model according to the target factor value and the training set by using a SARIMAX algorithm comprises:
S1: determining whether the training set is a stationary time series, if so, executing S2, otherwise, executing S3;
S2: assigning the target factor value and the training set to the SARIMAX algorithm to obtain a price prediction model;
S3: performing seasonal difference processing on the training set to obtain a difference set, taking the difference set as the training set, and returning to S1.
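The loop of S1 and S3 can be sketched as below, as an illustration under assumptions. The stationarity test is supplied by the caller as a callable (`is_stationary`, a hypothetical name; an augmented Dickey-Fuller test is one common choice), the seasonal period is assumed to be known, and `max_diffs` is an assumed safety cap that the claim does not mention. Fitting the SARIMAX model itself (S2) is left to the caller.

```python
def seasonal_difference(series, period):
    """Subtract from each value the value one seasonal period earlier."""
    if period <= 0 or period >= len(series):
        raise ValueError("period must be positive and shorter than the series")
    return [series[i] - series[i - period] for i in range(period, len(series))]


def make_stationary(series, period, is_stationary, max_diffs=2):
    """Apply seasonal differencing until is_stationary(series) returns True.

    Returns (differenced series, number of seasonal differences applied).
    """
    current = list(series)
    applied = 0
    while not is_stationary(current):                    # S1
        if applied == max_diffs:
            raise RuntimeError("series still non-stationary after max_diffs")
        current = seasonal_difference(current, period)   # S3, then back to S1
        applied += 1
    return current, applied                              # ready for S2
```

The returned count corresponds to the seasonal differencing order; many SARIMAX implementations accept that order as a parameter and difference internally, in which case only the count, not the differenced series, needs to be passed on.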
7. A price prediction device, comprising:
the acquisition module is used for acquiring historical data of the target commodity;
the determining module is used for determining a target factor value of at least one price influence factor influencing the price of the target commodity according to the historical data acquired by the acquiring module, wherein the target factor value represents the amount of the price influence factor;
the processing module is used for dividing the historical data acquired by the acquisition module into a training set and a test set except the training set;
the prediction module is used for establishing a price prediction model according to the target factor value determined by the determination module and the training set obtained by the processing module by utilizing a SARIMAX algorithm; determining a predicted value for representing the price of the target commodity in a time period to be tested according to the test set by using the price prediction model; obtaining a residual sequence in the time period to be predicted according to the predicted value and the real price of the target commodity in the time period to be predicted in the test set by utilizing a Prophet algorithm; and determining the predicted price of the target commodity in the target time period according to the residual sequence and the predicted value by using the price prediction model.
8. The apparatus of claim 7,
the determining module is configured to perform:
determining a numerical value and a first time point corresponding to at least one factor to be selected which influences the price of the target commodity from the historical data;
constructing an original matrix which takes the first time point as a row and takes the numerical value corresponding to the factor to be selected as a column;
carrying out dimensionless processing on the original matrix to generate a target matrix;
taking the data of the ith row of the target matrix as a reference data sequence;
for each row of data outside the reference data sequence, performing the following operations:
calculating a difference value of the data of the x column in the data of the current row and the data of the x column of the reference data sequence;
obtaining a difference sequence consisting of the calculated differences;
determining a minimum value, a maximum value, an absolute difference value and a preset resolution coefficient in the difference value sequence, wherein the absolute difference value is a difference value between difference value data in the difference value sequence and a reference number for non-dimensionalization processing in a current row;
calculating at least one correlation coefficient corresponding to the data of the current row according to the minimum value, the maximum value, the absolute difference value and the resolution coefficient;
determining whether a target correlation coefficient within a preset threshold range exists in the at least one correlation coefficient;
if so, determining a target factor to be selected corresponding to the target correlation coefficient in the current row;
taking the target factor to be selected as a price influence factor;
and acquiring a numerical value corresponding to the price influence factor from the original matrix, and taking the acquired numerical value as a target factor value of the price influence factor.
9. An electronic device, comprising: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor, configured to invoke the machine readable program to perform the method of any of claims 1 to 4.
10. Computer readable medium, characterized in that it has stored thereon computer instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 4.
CN202011032490.5A 2020-09-27 2020-09-27 Price prediction method and device and electronic equipment Active CN112150205B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011032490.5A CN112150205B (en) 2020-09-27 2020-09-27 Price prediction method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN112150205A true CN112150205A (en) 2020-12-29
CN112150205B CN112150205B (en) 2022-08-30

Family

ID=73894672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011032490.5A Active CN112150205B (en) 2020-09-27 2020-09-27 Price prediction method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112150205B (en)

Also Published As

Publication number Publication date
CN112150205B (en) 2022-08-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant