CN116645132A - Multi-factor variable-based time sequence prediction method and device, electronic equipment and medium - Google Patents

Info

Publication number
CN116645132A
Authority
CN
China
Prior art keywords
data
variables
historical data
prediction
influence factor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310559065.9A
Other languages
Chinese (zh)
Inventor
徐倩
周珂馨
邵家伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yuannian Technology Co ltd
Original Assignee
Beijing Yuannian Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yuannian Technology Co ltd
Priority to CN202310559065.9A
Publication of CN116645132A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Computation (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Medical Informatics (AREA)
  • Game Theory and Decision Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a multi-factor variable-based time sequence prediction method and device, electronic equipment and a computer readable storage medium, wherein the method comprises the following steps: acquiring historical data; determining a plurality of influence factor variables and a target variable based on the historical data; predicting future data of the plurality of influence factor variables based on the historical data and a preset prediction model; and predicting the target variable by using a machine learning algorithm based on the future data and the historical data to generate a first prediction result. The method addresses the technical problem that the prior art does not show how strongly different factors influence the prediction result, which makes it difficult to adjust the factors and observe the changed prediction as an aid to decision-making.

Description

Multi-factor variable-based time sequence prediction method and device, electronic equipment and medium
Technical Field
The application belongs to the field of data processing, and particularly relates to a time sequence prediction method and device based on multi-factor variables, electronic equipment and a computer readable storage medium.
Background
In a commodity sales enterprise, sales prediction is an important link in enterprise marketing decision and strategic management, for example, sales prediction can be applied to commodity replenishment strategies, commodity inventory management, commodity production plan planning and the like.
Early time sequence prediction models based on recurrent neural networks (RNNs), such as the DeepAR model and the LSTNet model, learn the time sequence features and the added factor features through a recurrent neural network.
However, the RNN structure suffers from vanishing gradients as the sequence length increases, so long-range data features gradually fade, and multi-step rolling prediction suffers from error accumulation.
In the prior art, a Transformer model is used to learn time sequence features through an attention mechanism, which gives it the ability to capture global correlations. However, the original Transformer model has a large number of parameters and high time complexity. Later models reduce the time complexity through sparse attention and accelerate model training. Building on the Transformer, the Autoformer model replaces the attention mechanism with a sequence correlation computed by the fast Fourier transform, and learns time sequence features by decomposing the series into trend and periodic terms.
However, the prior art does not show how strongly different factors influence the prediction result, which makes it difficult to adjust the factors and observe the changed prediction as an aid to decision-making.
Based on this, the present application has been made.
Disclosure of Invention
The embodiments of the application provide a multi-factor variable-based time sequence prediction method and device, electronic equipment and a computer readable storage medium, which address the technical problem that the prior art does not show how strongly different factors influence the prediction result and therefore makes it difficult to adjust the factors and observe the changed prediction as an aid to decision-making.
According to a first aspect of the present application, there is provided a multi-factor variable based timing prediction method, the method comprising:
acquiring historical data;
determining a plurality of influencing factor variables and target variables based on the historical data;
predicting future data of a plurality of influence factor variables based on historical data and a preset prediction model;
and predicting the target variable by using a machine learning algorithm based on the future data and the historical data to generate a first prediction result.
Optionally, the method further comprises:
determining importance of a plurality of influence factor variables based on a preset prediction model;
modifying the influence factor variable based on the importance of the influence factor variable to obtain a second prediction result;
and comparing the first prediction result with the second prediction result to generate a comparison result diagram.
Optionally, predicting future data of the plurality of influencing factor variables based on the historical data and a preset prediction model includes:
decomposing historical data of a plurality of influence factor variables by using a moving average algorithm to obtain trend data;
determining seasonal data based on the historical data and the trend data;
respectively inputting trend data and season data into a preset prediction model to obtain a first output result and a second output result;
and obtaining future data based on the first output result and the second output result.
Optionally, determining the importance of the plurality of influencing factor variables based on a preset prediction model includes:
combining the future data with the historical data to obtain a target data set;
extracting features of the target data set by using a machine learning algorithm to obtain time sequence features of multiple dimensions;
the importance of a plurality of influencing factor variables is determined according to the time sequence characteristics of a plurality of dimensions.
Optionally, the time sequence features of the multiple dimensions include a first time sequence feature, a second time sequence feature and a third time sequence feature, and the feature extraction is performed on the target data set by using a machine learning algorithm to obtain the time sequence features of the multiple dimensions, including:
summing or averaging the target data within a first preset time to determine a first timing characteristic of a plurality of influence factor variables;
performing a hyperparameter search to obtain the period from the fitting result, and extracting target data from the same period for the plurality of influence factor variables and the target variable to determine second time sequence characteristics of the plurality of influence factor variables;
extracting target data of the plurality of influence factor variables in a second preset time to determine a third time sequence characteristic of the plurality of influence factor variables; the second preset time includes any one of: day, month, year, or whether the day is a weekend.
Optionally, determining a plurality of influence factor variables based on the historical data includes:
determining a plurality of initial factor variables based on the historical data;
and carrying out correlation screening on the plurality of initial factor variables to determine a plurality of influence factor variables.
Optionally, after the historical data is acquired, the method further includes:
filling the historical data;
deleting repeated data within groups in the historical data;
and filtering the data with the sequence length smaller than the preset sequence length in the historical data.
According to a second aspect of the present application, there is provided a multi-factor variable based timing prediction apparatus, the apparatus comprising: the acquisition module is used for acquiring historical data; a first determining module for determining a plurality of influencing factor variables and target variables based on the history data; the first prediction module is used for predicting future data of a plurality of influence factor variables based on historical data and a preset prediction model; and the second prediction module is used for predicting the target variable by utilizing a machine learning algorithm based on the future data and the historical data to generate a first prediction result.
Optionally, the apparatus further comprises: the second determining module is used for determining the importance of a plurality of influence factor variables based on a preset prediction model; the modification module is used for modifying the influence factor variable based on the importance of the influence factor variable to obtain a second prediction result; and the comparison module is used for comparing the first prediction result with the second prediction result to generate a comparison result diagram.
Optionally, the first prediction module includes: the decomposition unit is used for decomposing the historical data of the plurality of influence factor variables by using a moving average algorithm to obtain trend data; a first determining unit configured to determine season data based on the history data and the trend data; the training unit is used for respectively inputting the trend data and the season data into a preset prediction model to obtain a first output result and a second output result; and the calculation unit is used for obtaining future data based on the first output result and the second output result.
Optionally, the second determining module includes: the merging unit is used for merging the future data and the historical data to obtain a target data set; the extraction unit is used for extracting the characteristics of the target data set by using a machine learning algorithm to obtain time sequence characteristics of multiple dimensions; and the second determining unit is used for determining the importance of a plurality of influence factor variables according to the time sequence characteristics of a plurality of dimensions.
Optionally, the time sequence features of the multiple dimensions include a first time sequence feature, a second time sequence feature and a third time sequence feature, wherein the extracting unit is used for summing or averaging the target data in a first preset time to determine the first time sequence feature of the plurality of influence factor variables; extracting target data from the same period for the plurality of influence factor variables and the target variable to determine second time sequence characteristics of the plurality of influence factor variables; and extracting target data of the plurality of influence factor variables in a second preset time to determine a third time sequence characteristic of the plurality of influence factor variables; the second preset time includes any one of: day, month, year, or whether the day is a weekend.
Optionally, the first determining module is configured to determine a plurality of initial factor variables based on the history data; and carrying out correlation screening on the plurality of initial factor variables to determine a plurality of influence factor variables.
Optionally, the apparatus further comprises: the filling module is used for filling the historical data; the deleting module is used for deleting repeated data within groups in the historical data; the filtering module is used for filtering out the data whose sequence length is smaller than the preset sequence length in the historical data.
According to a third aspect of the present application, there is provided an electronic device comprising: a processor and a memory storing computer program instructions; any of the above-described multi-factor variable-based timing prediction methods is implemented when the processor executes the computer program instructions.
According to a fourth aspect of the present application, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement any of the above-described multi-factor variable based timing prediction methods.
In summary, the time sequence prediction method and device based on the multi-factor variable provided by the application have at least the following beneficial effects:
the time sequence prediction method based on the multi-factor variable comprises the following steps: acquiring historical data; determining a plurality of influence factor variables and a target variable based on the historical data; predicting future data of the plurality of influence factor variables based on the historical data and a preset prediction model; and predicting the target variable by using a machine learning algorithm based on the future data and the historical data to generate a first prediction result. The application therefore predicts the target variable through the machine learning algorithm from the combination of the historical data and the future data, thereby obtaining the prediction result. This addresses the technical problem that the prior art does not show how strongly different factors influence the prediction result, which makes it difficult to adjust the factors and observe the changed prediction as an aid to decision-making.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are some embodiments of the application and that other drawings can be obtained from them without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a timing prediction method based on a multi-factor variable according to an embodiment of the present application;
FIG. 2 is a flowchart of a timing prediction method based on a multi-factor variable according to an embodiment of the present application;
FIG. 3 is a block diagram of a timing prediction apparatus based on a multi-factor variable according to an embodiment of the present application;
FIG. 4 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To further clarify the above and other features and advantages of the present application, the application is described further below with reference to the appended drawings. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not limiting for those skilled in the art.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. It will be apparent, however, to one skilled in the art that the specific details need not be employed to practice the present application. In other instances, well-known steps or operations have not been described in detail in order to avoid obscuring the application.
The time sequence prediction method based on the multi-factor variable provided by the embodiment of the application can be executed by the time sequence prediction device based on the multi-factor variable provided by the embodiment of the application, and the device can be configured in electronic equipment.
Referring to fig. 1, the present application provides a time sequence prediction method based on a multi-factor variable, the method comprising:
Step S11, historical data is acquired.
Specifically, in the present application, a central processing unit (Central Processing Unit, CPU) may be used as the execution subject, and the CPU may acquire the historical data from a database.
Optionally, the historical data may include sales data, soil pH data, etc. The specific types of historical data are not enumerated here and may be chosen by those skilled in the art.
Step S13, determining a plurality of influence factor variables and target variables based on the historical data.
Specifically, in the present application, after the historical data is acquired, a plurality of influence factor variables and a target variable of the historical data may be determined. Note that an influence factor variable characterizes the trend of the historical data: the trend changes as the influence factor variable changes, that is, the influence factor variables determine the trend of the historical data.
Step S15, predicting future data of a plurality of influence factor variables based on the historical data and a preset prediction model.
Specifically, in the present application, the preset prediction model may be a model trained on the historical data. Future data of the plurality of influence factor variables may then be predicted from the historical data and the preset prediction model under a given constraint condition (such as a time condition); that is, the historical data is extrapolated according to the constraint condition to form the future data. For example, suppose the constraint condition is a time condition (one month into the future) and the historical data includes beverage sales of 300 cups in month 1, 200 cups in month 2 and 250 cups in month 3. Future data for the following month can then be predicted, for example 450 cups; the actual value is determined by the preset prediction model.
Step S17, predicting the target variable by using a machine learning algorithm based on the future data and the historical data to generate a first prediction result.
Specifically, in the present application, after obtaining the historical data and the future data, the CPU or a graphics processing unit (Graphics Processing Unit, GPU) may predict the target variable by using a machine learning algorithm to generate the first prediction result. The specific steps of the machine learning algorithm are described in detail below.
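Steps S11 to S17 can be sketched end to end as follows. This is only a minimal illustration under stated assumptions: the patent does not fix the preset prediction model or the machine learning algorithm, so the function forecast_factor (mean of the last three values) and the least-squares line in fit_line are hypothetical stand-ins, and the temperature and sales figures are invented sample data.

```python
from typing import List, Tuple


def forecast_factor(history: List[float], horizon: int) -> List[float]:
    """Hypothetical stand-in for the preset prediction model:
    repeat the mean of the last three observations."""
    window = history[-3:]
    level = sum(window) / len(window)
    return [level] * horizon


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = a + b * x.
    Hypothetical stand-in for the machine learning algorithm."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# S11: acquire historical data (invented sample values).
history = {
    "temperature": [20.0, 24.0, 28.0, 30.0, 26.0, 22.0],
    "sales": [200.0, 240.0, 280.0, 300.0, 260.0, 220.0],
}

# S13: choose the influence factor variable and the target variable.
factor, target = "temperature", "sales"

# S15: predict future data of the influence factor variable.
future_factor = forecast_factor(history[factor], horizon=2)

# S17: fit on historical data, then predict the target from the future data.
intercept, slope = fit_line(history[factor], history[target])
first_prediction = [intercept + slope * value for value in future_factor]
print(first_prediction)
```

The point of the sketch is the ordering: the factor is forecast first, and the target is then predicted from the forecast factor values rather than directly from its own past.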
In an alternative embodiment, the method further comprises:
determining importance of a plurality of influence factor variables based on a preset prediction model;
modifying the influence factor variable based on the importance of the influence factor variable to obtain a second prediction result;
and comparing the first prediction result with the second prediction result to generate a comparison result diagram.
Specifically, in the application, the importance of a plurality of influence factor variables is determined based on a preset prediction model, and after the importance of the plurality of influence factor variables is obtained, the influence factor variables are modified according to the importance of each influence factor variable, so that the target variables are predicted, and a second prediction result is obtained. The application can adjust the size of the influence factor variable according to the importance of the influence factor variable and check the prediction condition of the target variable, thereby assisting the business decision. Specific steps relating to the machine learning algorithm are described in detail below.
In addition, after the second prediction result and the first prediction result are obtained, the second prediction result and the first prediction result are compared, and a comparison result diagram is generated, so that the comparison result diagram is convenient to view.
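The adjust-and-compare procedure described above can be sketched as a small what-if experiment. The sketch assumes a hypothetical linear model whose weights (WEIGHTS, BIAS) are invented for illustration, and it uses the mean absolute contribution of each factor as a crude importance proxy; the patent does not specify how importance is computed. The comparison is printed as text rather than drawn as a diagram.

```python
from typing import Dict, List

# Hypothetical fitted model: target = BIAS + sum(weight * factor value).
WEIGHTS: Dict[str, float] = {"temperature": 8.0, "discount": 120.0}
BIAS = 50.0


def predict(factors: Dict[str, List[float]]) -> List[float]:
    """Predict the target for every future step from the factor values."""
    steps = len(next(iter(factors.values())))
    return [
        BIAS + sum(WEIGHTS[name] * factors[name][t] for name in WEIGHTS)
        for t in range(steps)
    ]


def importance(factors: Dict[str, List[float]]) -> Dict[str, float]:
    """Crude importance proxy: mean absolute contribution, normalised to 1."""
    contribution = {
        name: sum(abs(WEIGHTS[name] * v) for v in values) / len(values)
        for name, values in factors.items()
    }
    total = sum(contribution.values())
    return {name: c / total for name, c in contribution.items()}


future = {"temperature": [25.0, 27.0, 30.0], "discount": [0.0, 0.0, 0.1]}

first = predict(future)                      # first prediction result
scores = importance(future)
most_important = max(scores, key=scores.get)

adjusted = dict(future)                      # raise the top factor by 10 %
adjusted[most_important] = [v * 1.1 for v in future[most_important]]
second = predict(adjusted)                   # second prediction result

comparison = [(t, f, s, s - f) for t, (f, s) in enumerate(zip(first, second))]
for t, f, s, diff in comparison:
    print(f"step {t}: first={f:.1f} second={s:.1f} diff={diff:+.1f}")
```

In practice the rows of comparison could be passed to any plotting library to produce the comparison result diagram.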
In this method, the importance of the influence factor variables is determined by a machine learning algorithm from the combination of the historical data and the future data, and once the importance is determined, the target variable is predicted to obtain the prediction result. This addresses the technical problem that the prior art does not show how strongly different factors influence the prediction result, which makes it difficult to adjust the factors and observe the changed prediction as an aid to decision-making.
In an alternative embodiment, the plurality of influencing factor variables includes a known factor and an unknown factor; the known factors include holidays or special event days.
Adding holidays or special event days as factor variables can effectively improve the accuracy of the future data. For example, when the sales data is beverage sales, the number of beverages sold often increases on holidays or special event days, so taking the importance of holidays or special event days into account effectively improves the accuracy of the future data.
Alternatively, the unknown factors may be variables that are known in the past but unknown in the future, such as temperature, sales, etc.
Alternatively, the known factors may be variables that are known both in the past and in the future, such as time, holidays, or commodity pricing that the company has already determined for the next few years, etc.
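The practical difference between the two kinds of factors shows up when the future input rows are assembled: known factors can be looked up in advance, while unknown factors must first be predicted. The sketch below illustrates this under assumptions; the HOLIDAYS calendar, the helper names and the temperature forecast are all hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List

# Hypothetical holiday calendar; a real system would load this from a source.
HOLIDAYS = {date(2023, 6, 22)}


def known_factors(day: date) -> Dict[str, object]:
    """Factors known in the past and in the future (calendar information)."""
    return {"is_holiday": day in HOLIDAYS, "is_weekend": day.weekday() >= 5}


def build_future_rows(
    last_day: date, horizon: int, forecast_unknown: Dict[str, List[float]]
) -> List[Dict[str, object]]:
    """Combine known factors with predicted values of unknown factors."""
    rows = []
    for step in range(1, horizon + 1):
        day = last_day + timedelta(days=step)
        row: Dict[str, object] = {"date": day}
        row.update(known_factors(day))  # available without prediction
        for name, values in forecast_unknown.items():
            row[name] = values[step - 1]  # output of a factor forecast
        rows.append(row)
    return rows


rows = build_future_rows(
    last_day=date(2023, 6, 20),
    horizon=3,
    forecast_unknown={"temperature": [29.0, 31.0, 30.0]},
)
for row in rows:
    print(row)
```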
As shown in fig. 2, in an alternative embodiment, predicting future data of a plurality of influencing factor variables based on the historical data and a preset prediction model in step S15 includes:
Step S151, decomposing historical data of a plurality of influence factor variables by using a moving average algorithm to obtain trend data.
Step S152, determining season data based on the history data and the trend data.
Step S153, the trend data and the season data are respectively input into a preset prediction model to obtain a first output result and a second output result.
Step S154, based on the first output result and the second output result, future data is obtained.
It should be noted that the moving average method is a common way to predict quantities such as a company's product demand or production capacity for one or more future periods by using a set of the most recent actual data values. The moving average method is suited to near-term prediction. It is most useful when demand is neither growing nor declining rapidly and has no seasonal component, in which case it effectively removes random fluctuations from the prediction. Moving average methods differ in the weights assigned to the elements used in the prediction.
In this embodiment, the historical data of the plurality of influence factor variables can be decomposed by using a moving average algorithm to obtain the trend data (for example, the underlying product demand), and the season data is then obtained by subtracting the trend data from the historical data. Finally, the trend data and the season data are input separately into the preset prediction model, which after training yields a first output result and a second output result, namely the predicted trend data and the predicted season data; the two results are added to obtain the future data. In this way, the accuracy of the future data can be effectively ensured.
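Steps S151 to S154 can be sketched as follows. The decomposition (moving-average trend, remainder as season data) follows the description above; the two forecasters are hypothetical stand-ins for the preset prediction model, which the patent does not specify, and the window and period values are assumptions chosen for the sample series.

```python
from typing import List, Tuple


def moving_average(series: List[float], window: int) -> List[float]:
    """Centred moving average; the window shrinks at the edges so that the
    trend has the same length as the input."""
    half = window // 2
    trend = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    return trend


def decompose(series: List[float], window: int) -> Tuple[List[float], List[float]]:
    """S151 and S152: trend by moving average, season data as the remainder."""
    trend = moving_average(series, window)
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal


def forecast_trend(trend: List[float], horizon: int) -> List[float]:
    """Hypothetical stand-in model: extend the trend by its average step."""
    step = (trend[-1] - trend[0]) / (len(trend) - 1)
    return [trend[-1] + step * (h + 1) for h in range(horizon)]


def forecast_seasonal(seasonal: List[float], period: int, horizon: int) -> List[float]:
    """Hypothetical stand-in model: repeat the last full period."""
    last_period = seasonal[-period:]
    return [last_period[h % period] for h in range(horizon)]


history = [10.0, 14.0, 12.0, 16.0, 14.0, 18.0, 16.0, 20.0]
trend, seasonal = decompose(history, window=3)

first_output = forecast_trend(trend, horizon=2)                  # S153, trend
second_output = forecast_seasonal(seasonal, period=2, horizon=2)  # S153, season
future = [t + s for t, s in zip(first_output, second_output)]     # S154
print(future)
```

Because the season data is defined as the remainder, trend plus season always reconstructs the historical series exactly, which is why the two forecasts can simply be added.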
In an alternative embodiment, the preset prediction model may be configured with or without automatic hyperparameter search. With automatic search, the optimal prediction model is searched for; without it, the hyperparameters used to train the default model may be adjusted according to prior knowledge.
In an alternative embodiment, the preset prediction model may be a tree model, and STL (seasonal-trend decomposition using LOESS) may be used to calculate the trend of the target variable, return a target trend graph, and return periodic data according to the date granularity of the input data, so that a user can readily view the overall trend and the periodic factors of the data.
In an alternative embodiment, determining the importance of the plurality of influencing factor variables based on a preset predictive model may include:
Combining the future data with the historical data to obtain a target data set.
Extracting features of the target data set by using a machine learning algorithm to obtain time sequence features of multiple dimensions.
The importance of a plurality of influencing factor variables is determined according to the time sequence characteristics of a plurality of dimensions.
In this embodiment, after future data and historical data are obtained, the future data and the historical data are combined to form a complete data chain, that is, a target data set, then feature extraction is performed on the target data set by using a machine learning algorithm to obtain time sequence features of multiple dimensions, and importance of multiple influencing factor variables is determined according to the time sequence features of the multiple dimensions.
In an alternative embodiment, the merging may be performed in a time sequence.
In one embodiment, the plurality of dimensional timing characteristics includes a first timing characteristic, a second timing characteristic, and a third timing characteristic; the method comprises the steps of extracting characteristics of a target data set by using a machine learning algorithm to obtain time sequence characteristics of multiple dimensions, wherein the method comprises the following steps:
summing or averaging the target data within a first preset time to determine a first timing characteristic of a plurality of influence factor variables;
extracting target data in the same period as the plurality of influence factor variables and the target variable to determine second time sequence characteristics of the plurality of influence factor variables;
extracting target data of the plurality of influence factor variables in a second preset time to determine a third time sequence characteristic of the plurality of influence factor variables; the second preset time includes any one of: day, month, year, or whether the day is a weekend.
In this embodiment, the features of the target data set need to be extracted from three aspects, specifically: local features, periodic features, and calendar features.
For local features, the present embodiment sums or averages the target data set over a first preset time to determine a first time sequence characteristic (i.e., the local feature) of the plurality of influence factor variables. For example, sales may drop suddenly on a day with bad weather. To smooth such effects, the target data for several consecutive days may be summed or averaged; for instance, the historical data of the previous day, or the sum of the historical data of the previous two days, may be used as the local feature.
For the period feature, the present embodiment extracts target data from the same period for the plurality of influence factor variables and the target variable to determine a second time sequence characteristic of the plurality of influence factor variables. It should be noted that the preset prediction model may automatically learn the periodicity of the target data set and then extract the historical data that falls in the same period for the factor variables and the target variable.
As for the calendar feature, the calendar feature in the present embodiment may include any one of: day, month, year, or whether the day is a weekend. Target data of the plurality of influence factor variables in a second preset time is extracted to determine a third time sequence characteristic of the plurality of influence factor variables. The present embodiment may use these variables to capture calendar-related seasonal information.
Through these three kinds of features, the embodiment can provide a data stability check, so that the accuracy of the future data can be assessed reliably.
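The three kinds of features can be sketched for a single daily series as follows. The window length (first preset time) and the period are passed in as parameters here, which is an assumption; in the description the period is learned or searched for rather than supplied. The function name and the sample values are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List


def extract_features(
    dates: List[date], values: List[float], window: int, period: int
) -> List[Dict[str, object]]:
    """Build local, periodic and calendar features for every time step."""
    rows = []
    for i, day in enumerate(dates):
        # Local feature: only strictly earlier values are used, so the
        # current value never leaks into its own features.
        past = values[max(0, i - window):i]
        rows.append({
            "local_sum": sum(past) if past else None,
            "local_mean": sum(past) / len(past) if past else None,
            # Periodic feature: value at the same position one period ago.
            "same_period": values[i - period] if i >= period else None,
            # Calendar features.
            "day": day.day,
            "month": day.month,
            "year": day.year,
            "is_weekend": day.weekday() >= 5,
        })
    return rows


start = date(2023, 5, 1)  # a Monday
dates = [start + timedelta(days=i) for i in range(10)]
values = [300.0, 200.0, 250.0, 260.0, 270.0, 400.0, 420.0, 310.0, 210.0, 240.0]

features = extract_features(dates, values, window=2, period=7)
print(features[7])
```

Returning None where there is not enough history leaves the choice of how to handle the first few rows (drop or fill) to the caller.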
In an alternative embodiment, determining a plurality of influence factor variables based on the history data in step S13 includes:
based on the historical data, a plurality of initial factor variables are determined.
And carrying out correlation screening on the plurality of initial factor variables to determine a plurality of influence factor variables.
In this embodiment, a plurality of initial factor variables may be determined from the historical data according to business logic requirements. However, the initial factor variables may be correlated with one another, so that some factor variables are similar to each other, which would make the determination of future data inaccurate. Correlation screening is therefore performed on the initial factor variables.
In an alternative embodiment, the correlation filtering of the plurality of initial factor variables includes:
and removing the initial factor variable of which the correlation meets the preset condition.
And reserving the initial factor variables of which the correlation does not meet the preset condition to obtain a plurality of influence factor variables.
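The screening step can be sketched as follows. The application does not state the preset condition; this example assumes it is an absolute Pearson correlation at or above a threshold with an already retained variable. The names `pearson` and `screen_by_correlation`, the threshold value, and the sample data are all hypothetical.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def screen_by_correlation(
    factors: Dict[str, List[float]], threshold: float = 0.9
) -> List[str]:
    """Keep a variable only if it is not highly correlated with any kept variable."""
    kept: List[str] = []
    for name, series in factors.items():
        if all(abs(pearson(series, factors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# temp_f is a linear transform of temp_c, so it carries no extra information.
initial_factors = {
    "temp_c": [10.0, 12.0, 15.0, 11.0, 9.0],
    "temp_f": [50.0, 53.6, 59.0, 51.8, 48.2],
    "promo": [0.0, 1.0, 0.0, 0.0, 1.0],
}
influence_factors = screen_by_correlation(initial_factors, threshold=0.9)
```

In this sketch the first variable of a correlated pair is the one retained; a real system might instead keep the variable that business logic considers more meaningful.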
In an alternative embodiment, after step S11, the method further comprises:
and filling the historical data.
And deleting duplicate data within groups in the historical data.
And filtering the data with the sequence length smaller than the preset sequence length in the historical data.
In this embodiment, a filling method for null values may be selected according to business logic, and data containing null values may be filled using the specified method. For example, for sales data and temperature data, when temperature data is missing, it can be filled using methods such as the mean value or interpolation. Duplicate data within groups in the historical data is deleted, and data whose sequence length is smaller than the preset sequence length is filtered out, which avoids the situation in which there is too little data to obtain a result.
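The three preprocessing operations can be sketched in plain Python. The row layout `(group id, date, value)`, the function name `preprocess`, and the choice of mean filling are assumptions for this example. The sketch removes duplicates before filling so that repeated rows do not bias the mean; the application does not prescribe this order.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, str, Optional[float]]  # (group id, ISO date, value or None)


def preprocess(rows: List[Row], min_len: int) -> Dict[str, List[Tuple[str, float]]]:
    """Deduplicate within groups, drop short sequences, fill nulls with the group mean."""
    # Delete duplicates within each group: keep the first row per (group, date).
    groups: Dict[str, Dict[str, Optional[float]]] = {}
    for group, day, value in rows:
        groups.setdefault(group, {}).setdefault(day, value)

    cleaned: Dict[str, List[Tuple[str, float]]] = {}
    for group, by_day in groups.items():
        # Filter out sequences shorter than the preset sequence length.
        if len(by_day) < min_len:
            continue
        known = [v for v in by_day.values() if v is not None]
        if not known:
            continue  # no observed value to compute a mean from
        mean = sum(known) / len(known)
        # Fill null values with the group mean, ordered by date.
        cleaned[group] = [
            (day, mean if by_day[day] is None else by_day[day])
            for day in sorted(by_day)
        ]
    return cleaned


raw_rows: List[Row] = [
    ("A", "2023-01-01", 20.0),
    ("A", "2023-01-01", 20.0),  # duplicate within group A
    ("A", "2023-01-02", None),  # missing temperature reading
    ("A", "2023-01-03", 24.0),
    ("B", "2023-01-01", 5.0),   # group B is too short and is filtered out
]
clean = preprocess(raw_rows, min_len=3)
```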
In an alternative embodiment, the present application may also employ a deep learning algorithm for time sequence prediction, wherein the structure of the deep learning algorithm includes variable selection networks, an LSTM layer, and an Attention layer. The variable selection networks learn, through training, the influence of multiple factors on the time sequence prediction. The model comprises three variable selection networks: an encoding-region dynamic variable selection network, a decoding-region dynamic variable selection network, and a static variable selection network. The model can process both factors that are unknown in the future and factors that are known in the future: during training, the factors unknown in the future are fed only into the encoding-region dynamic variable selection network; the factors known in the future are fed into both the encoding-region and the decoding-region dynamic variable selection networks; and the static factors are fed only into the static variable selection network. The outputs of the encoding region and the decoding region of the LSTM network can both be used as input to the next network, and the LSTM network is used to extract short-range time sequence features. The Attention network uses the output of the LSTM decoding region as the precursor of the query (q) and the output of the LSTM encoding region as the precursors of the key (k) and the value (v), and is used to learn long-range time sequence features. The model outputs quantiles, and the loss function is the quantile loss function. The application can thus provide a plurality of selectable algorithms for time sequence prediction, which can be chosen according to actual preference and prior knowledge.
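The quantile loss mentioned at the end of this embodiment is commonly defined as the pinball loss. The sketch below shows that standard definition for a single quantile level `q`; the function name and sample values are illustrative, and the application does not state which quantile levels or averaging scheme it uses.

```python
from typing import List


def quantile_loss(y_true: List[float], y_pred: List[float], q: float) -> float:
    """Mean pinball loss for quantile level q, with 0 < q < 1."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        diff = actual - predicted
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


# For q = 0.9, predicting too low costs more than predicting too high.
under = quantile_loss([10.0], [8.0], q=0.9)
over = quantile_loss([10.0], [12.0], q=0.9)
```

Because the penalty is asymmetric, minimizing this loss for a high `q` pushes the prediction toward an upper quantile of the target, which is what allows the model output to be read as a quantile.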
According to the present application, there is provided a time sequence prediction apparatus based on a multi-factor variable, as shown in fig. 3, the apparatus comprising: an acquisition module 31 for acquiring history data; a first determination module 32 for determining a plurality of influence factor variables based on the historical data; a first prediction module 33, configured to predict future data of a plurality of influencing factor variables based on the historical data and a preset prediction model; a second determination module 34 for determining importance of a plurality of influencing factor variables using a machine learning algorithm based on the future data and the historical data; the second prediction module 35 is configured to predict the target variable based on importance of the plurality of influencing factor variables, so as to obtain a prediction result.
According to the application, the importance of the influence factor variables is determined by a machine learning algorithm from the combination of historical data and future data, and after the importance of the factor variables is determined, the target variable is predicted to obtain a prediction result. This solves the technical problem that the prior art does not indicate how strongly different factors influence the prediction result, which makes it difficult to assist policy decisions by adjusting the factors to change the prediction result.
Optionally, the apparatus further comprises: the second determining module is used for determining the importance of a plurality of influence factor variables based on a preset prediction model; the modification module is used for modifying the influence factor variable based on the importance of the influence factor variable to obtain a second prediction result; and the comparison module is used for comparing the first prediction result with the second prediction result to generate a comparison result diagram.
Optionally, the first prediction module 33 includes: the decomposition unit is used for decomposing the historical data of the plurality of influence factor variables by using a moving average algorithm to obtain trend data; a first determining unit configured to determine season data based on the history data and the trend data; the training unit is used for respectively inputting the trend data and the season data into a preset prediction model to obtain a first output result and a second output result; and the calculation unit is used for obtaining future data based on the first output result and the second output result.
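The decompose, predict, and recombine flow of these units can be sketched as follows. Several points are assumptions made for illustration: the season data is taken as the historical data minus the trend, the two outputs are recombined by addition, and `naive_forecast` is a hypothetical stand-in for the preset prediction model, which the application does not specify here.

```python
from typing import List, Tuple


def moving_average(values: List[float], window: int) -> List[float]:
    """Trailing moving average; the window shrinks at the start of the series."""
    trend = []
    for t in range(len(values)):
        chunk = values[max(0, t - window + 1):t + 1]
        trend.append(sum(chunk) / len(chunk))
    return trend


def decompose(values: List[float], window: int) -> Tuple[List[float], List[float]]:
    """Split a series into trend data and season data (assumed additive)."""
    trend = moving_average(values, window)
    season = [v - tr for v, tr in zip(values, trend)]
    return trend, season


def naive_forecast(series: List[float], horizon: int, period: int = 1) -> List[float]:
    """Hypothetical stand-in model: repeat the last `period` values."""
    start = len(series) - period
    return [series[start + (h % period)] for h in range(horizon)]


def forecast(values: List[float], window: int, horizon: int, period: int) -> List[float]:
    trend, season = decompose(values, window)
    first_output = naive_forecast(trend, horizon, period=1)        # trend branch
    second_output = naive_forecast(season, horizon, period=period)  # season branch
    return [a + b for a, b in zip(first_output, second_output)]


history = [1.0, 2.0, 3.0, 4.0]
trend_data, season_data = decompose(history, window=2)
future_data = forecast(history, window=2, horizon=2, period=2)
```

Predicting the two components separately lets each branch model a simpler signal; any forecasting model could replace `naive_forecast` without changing the surrounding flow.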
Optionally, the second determining module includes: the merging unit is used for merging the future data and the historical data to obtain a target data set; the extraction unit is used for extracting the characteristics of the target data set by using a machine learning algorithm to obtain time sequence characteristics of multiple dimensions; and the second determining unit is used for determining the importance of a plurality of influence factor variables according to the time sequence characteristics of a plurality of dimensions.
Optionally, the time sequence features of the multiple dimensions include a first time sequence feature, a second time sequence feature and a third time sequence feature, wherein the extracting unit is used for summing or averaging the target data in a first preset time to determine the first time sequence feature of the multiple influence factor variables; extracting target data in the same period as the plurality of influence factor variables and the target variable to determine the second time sequence feature of the plurality of influence factor variables; and extracting target data of the plurality of influence factor variables in a second preset time to determine the third time sequence feature of the plurality of influence factor variables; the second preset time includes any one of: day, month, year, or whether the day is a weekend.
Optionally, the first determining module 32 is configured to determine a plurality of initial factor variables based on the history data; and carrying out correlation screening on the plurality of initial factor variables to determine a plurality of influence factor variables.
Optionally, the apparatus further comprises: the filling module is used for filling the historical data; the deleting module is used for deleting duplicate data within groups in the historical data; the filtering module is used for filtering the data with the sequence length smaller than the preset sequence length in the historical data.
It is to be understood that the specific features, operations and details described herein before with respect to the method of the application may also be similarly applied to the apparatus and system of the application, or vice versa. In addition, each step of the method of the present application described above may be performed by a corresponding component or unit of the apparatus or system of the present application.
It is to be understood that the various modules/units of the apparatus of the application may be implemented in whole or in part by software, hardware, firmware, or a combination thereof. Each module/unit may be embedded in a processor of the electronic device in hardware or firmware or may be independent of the processor, or may be stored in a memory of the electronic device in software for the processor to call to perform the operations of each module/unit. Each module/unit may be implemented as a separate component or module, or two or more modules/units may be implemented as a single component or module.
As shown in fig. 4, the present application provides an electronic device 400 comprising a processor 401 and a memory 402 storing computer program instructions. Wherein the processor 401, when executing the computer program instructions, implements the steps of the multi-factor variable based timing prediction method described above. The electronic device 400 may be broadly a server, a terminal, or any other electronic device having the necessary computing and/or processing capabilities.
In one embodiment, the electronic device 400 may include a processor, memory, network interface, communication interface, etc. connected by a system bus. The processor of the electronic device 400 may be used to provide the necessary computing, processing, and/or control capabilities. The memory of the electronic device 400 may include non-volatile storage media and internal memory. The non-volatile storage medium may store an operating system, computer programs, and the like. The internal memory may provide an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface and communication interface of the electronic device 400 may be used to connect and communicate with external devices via a network. The computer program, when executed by the processor, performs the steps of the method of the application.
The application provides a computer readable storage medium, wherein computer program instructions are stored on the computer readable storage medium, and the time sequence prediction method based on multi-factor variables is realized when the computer program instructions are executed by a processor.
Those skilled in the art will appreciate that the method steps of the present application may be implemented by a computer program, which may be stored on a non-transitory computer readable storage medium, to instruct related hardware such as the electronic device 400 or the processor, which when executed causes the steps of the present application to be performed. Any reference herein to memory, storage, or other medium may include non-volatile or volatile memory, as the case may be. Examples of nonvolatile memory include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), flash memory, magnetic tape, floppy disk, magneto-optical data storage, hard disk, solid state disk, and the like. Examples of volatile memory include Random Access Memory (RAM), external cache memory, and the like.
The technical features described above may be arbitrarily combined. Although not all possible combinations of features are described, any combination of features should be considered to be covered by the description provided that such combinations are not inconsistent.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (10)

1. A method of timing prediction based on a multi-factor variable, the method comprising:
acquiring historical data;
determining a plurality of influencing factor variables and target variables based on the historical data;
predicting future data of a plurality of influence factor variables based on the historical data and a preset prediction model;
and predicting a target variable by using a machine learning algorithm based on the future data and the historical data to generate a first prediction result.
2. The multi-factor variable based timing prediction method of claim 1, further comprising:
determining importance of a plurality of influence factor variables based on a preset prediction model;
modifying the influence factor variable based on the importance of the influence factor variable to obtain a second prediction result;
and comparing the first prediction result with the second prediction result to generate a comparison result diagram.
3. The multi-factor variable based time series prediction method according to claim 1, wherein predicting future data of a plurality of the influencing factor variables based on the history data and a preset prediction model comprises:
decomposing historical data of a plurality of influence factor variables by using a moving average algorithm to obtain trend data;
determining seasonal data based on the historical data and the trend data;
respectively inputting the trend data and the season data into a preset prediction model to obtain a first output result and a second output result;
and obtaining future data based on the first output result and the second output result.
4. The multi-factor variable based timing prediction method of claim 2, wherein determining importance of a plurality of the influencing factor variables based on a preset prediction model comprises:
combining the future data with the historical data to obtain a target data set;
extracting features of the target data set by using a machine learning algorithm to obtain time sequence features of multiple dimensions;
and determining the importance of a plurality of influence factor variables according to the time sequence characteristics of a plurality of dimensions.
5. The multi-factor variable based timing prediction method of claim 4, wherein the timing features of the plurality of dimensions include a first timing feature, a second timing feature, and a third timing feature, wherein the performing feature extraction on the target data set using a machine learning algorithm to obtain the timing features of the plurality of dimensions comprises:
summing or averaging the target data within a first preset time to determine first timing characteristics of a plurality of the influence factor variables;
performing a hyperparameter search to obtain the period from the fitting result, and extracting target data in the same period as the influence factor variables and the target variable to determine second time sequence characteristics of the influence factor variables;
extracting target data of a plurality of influence factor variables in a second preset time to determine third time sequence characteristics of the plurality of influence factor variables; wherein the second preset time includes any one of: day, month, year, or whether the day is a weekend.
6. The multi-factor variable based timing prediction method of claim 1, wherein the determining a plurality of influencing factor variables based on the historical data comprises:
determining a plurality of initial factor variables based on the historical data;
and carrying out correlation screening on a plurality of initial factor variables to determine a plurality of influence factor variables.
7. The multi-factor variable based timing prediction method of claim 1, wherein after acquiring the historical data, the method further comprises:
filling the historical data;
deleting duplicate data within groups in the historical data;
and filtering the data with the sequence length smaller than the preset sequence length in the historical data.
8. A multi-factor variable based timing prediction apparatus, the apparatus comprising:
the acquisition module is used for acquiring historical data;
a first determining module for determining a plurality of influencing factor variables and target variables based on the history data;
the first prediction module is used for predicting future data of a plurality of influence factor variables based on the historical data and a preset prediction model;
and the second prediction module is used for predicting the target variable by utilizing a machine learning algorithm based on the future data and the historical data to generate a first prediction result.
9. An electronic device, the electronic device comprising: a processor and a memory storing computer program instructions;
the processor, when executing the computer program instructions, implements a multi-factor variable based timing prediction method as claimed in any of claims 1-7.
10. A computer readable storage medium, having stored thereon computer program instructions which, when executed by a processor, implement the multi-factor variable based timing prediction method of any of claims 1-7.
CN202310559065.9A 2023-05-17 2023-05-17 Multi-factor variable-based time sequence prediction method and device, electronic equipment and medium Pending CN116645132A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310559065.9A CN116645132A (en) 2023-05-17 2023-05-17 Multi-factor variable-based time sequence prediction method and device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310559065.9A CN116645132A (en) 2023-05-17 2023-05-17 Multi-factor variable-based time sequence prediction method and device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN116645132A true CN116645132A (en) 2023-08-25

Family

ID=87623970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310559065.9A Pending CN116645132A (en) 2023-05-17 2023-05-17 Multi-factor variable-based time sequence prediction method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN116645132A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117648383A (en) * 2024-01-30 2024-03-05 中国人民解放军国防科技大学 Heterogeneous database real-time data synchronization method, device, equipment and medium
CN117648383B (en) * 2024-01-30 2024-06-11 中国人民解放军国防科技大学 Heterogeneous database real-time data synchronization method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN110060144B (en) Method for training credit model, method, device, equipment and medium for evaluating credit
CN109816221B (en) Project risk decision method, apparatus, computer device and storage medium
CN110400021B (en) Bank branch cash usage prediction method and device
CN110390425A (en) Prediction technique and device
CN103778474A (en) Resource load capacity prediction method, analysis prediction system and service operation monitoring system
CN110415036B (en) User grade determining method, device, computer equipment and storage medium
CN112818032B (en) Data screening method and data analysis server for serving big data mining analysis
CN116645132A (en) Multi-factor variable-based time sequence prediction method and device, electronic equipment and medium
CN108182633A (en) Loan data processing method, device, computer equipment and storage medium
CN116091118A (en) Electricity price prediction method, device, equipment, medium and product
CN116091110A (en) Resource demand prediction model training method, prediction method and device
CN114782201A (en) Stock recommendation method and device, computer equipment and storage medium
CN117540336A (en) Time sequence prediction method and device and electronic equipment
CN115511562A (en) Virtual product recommendation method and device, computer equipment and storage medium
CN110992189A (en) Resource data estimation method, resource data estimation device, computer equipment and storage medium
CN112132498A (en) Inventory management method, device, equipment and storage medium
CN112667394B (en) Computer resource utilization rate optimization method
CN111783487B (en) Fault early warning method and device for card reader equipment
CN114648406A (en) User credit integral prediction method and device based on random forest
CN113837782B (en) Periodic term parameter optimization method and device of time sequence model and computer equipment
CN111783486A (en) Maintenance early warning method and device for card reader equipment
CN114676167B (en) User persistence model training method, user persistence prediction method and device
CN112862137A (en) Method and device for predicting quantity, computer equipment and computer readable storage medium
CN114997879B (en) Payment routing method, device, equipment and storage medium
CN111339156B (en) Method, apparatus and computer readable storage medium for long-term determination of business data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination