CN117272051B - Time sequence prediction method, device and medium based on LSTM optimization model - Google Patents


Info

Publication number
CN117272051B
CN117272051B (application CN202311553141.1A)
Authority
CN
China
Prior art keywords
lstm
model
prediction
lstm model
parameters
Prior art date
Legal status
Active
Application number
CN202311553141.1A
Other languages
Chinese (zh)
Other versions
CN117272051A (en)
Inventor
唐昌明
徐同明
马士中
王金丽
任聪
Current Assignee
Inspur General Software Co Ltd
Original Assignee
Inspur General Software Co Ltd
Priority date
Filing date
Publication date
Application filed by Inspur General Software Co Ltd
Priority to CN202311553141.1A
Publication of CN117272051A
Application granted
Publication of CN117272051B
Legal status: Active
Anticipated expiration


Classifications

    • G06F18/214 Pattern recognition — generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/10 Pattern recognition — pre-processing; data cleansing
    • G06N3/006 Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N3/0442 Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/0985 Hyperparameter optimisation; meta-learning; learning-to-learn
    • G06F2123/02 Data types in the time domain, e.g. time-series data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a time series prediction method, device, and medium based on an LSTM optimization model, belonging to the field of data processing systems or methods specially adapted for prediction purposes. The method comprises the following steps: initializing the LSTM model hyperparameters and preprocessing data samples to obtain training samples and test samples; optimizing the LSTM model hyperparameters through the multi-layer LSTM model based on the training samples, and calculating the initial fitness of the multi-layer LSTM model based on the test samples; when the initial fitness indicates that the multi-layer LSTM model has not reached a preset iteration-count threshold, performing population iteration on the LSTM model hyperparameters with a population algorithm, and determining the average score of the population iteration and the parameter-optimized LSTM model hyperparameters; calculating the current fitness of the multi-layer LSTM model based on the average score and the training samples, and obtaining the global optimal parameters when the current fitness is greater than the current optimal solution; and obtaining an LSTM optimization model based on the global optimal parameters, and predicting the time series through the LSTM optimization model.

Description

Time series prediction method, device and medium based on LSTM optimization model
Technical Field
The present disclosure relates to the field of data processing systems and methods, and more particularly, to a time series prediction method, device, and medium based on an LSTM optimization model.
Background
In current industry practice, predictions of business development trends and short-term conditions mostly depend on factors such as how long a system has been online, company strategic planning, and personal experience. The lack of a standardized tool for trend prediction based on time series leaves prediction results without uniformity or accuracy. Existing prediction products require high learning costs and dedicated maintenance, and are disconnected from the products they serve. The root cause is that intelligent software is not yet widespread: lowering the entry barrier still only serves the relevant technicians, and training an intelligent model requires many parameters and dedicated study to adapt it to different usage scenarios and business characteristics.
Disclosure of Invention
The embodiments of the present application provide a time series prediction method, device, and medium based on an LSTM optimization model, to solve the technical problems that existing prediction approaches require high learning costs and dedicated maintenance.
In one aspect, an embodiment of the present application provides a time series prediction method based on an LSTM optimization model, including:
initializing the hyperparameters of the multi-layer prediction LSTM model, and preprocessing pre-acquired data samples to obtain preprocessed training samples and test samples;
optimizing the initialized LSTM model hyperparameters through a multi-layer LSTM model based on the training samples, and calculating the initial fitness of the multi-layer LSTM model based on the test samples;
when the initial fitness indicates that the multi-layer LSTM model has not reached a preset iteration-count threshold, performing population iteration on the LSTM model hyperparameters with a population algorithm, and determining the average score of the population iteration and the parameter-optimized LSTM model hyperparameters;
calculating the current fitness of the multi-layer LSTM model based on the average score and the training samples, and obtaining the global optimal parameters when the current fitness is greater than the current optimal solution;
and obtaining the corresponding LSTM optimization model based on the global optimal parameters, and predicting the time series through the LSTM optimization model.
In one implementation of the present application, initializing the hyperparameters of the multi-layer prediction LSTM model and preprocessing pre-acquired data samples to obtain preprocessed training and test samples specifically includes:
determining the business scenario of the time series to be predicted and the business characteristics corresponding to that scenario, and determining multi-layer prediction parameters based on the business scenario and the corresponding business characteristics; the multi-layer prediction parameters include at least: the number of hidden layers, the number of iterations, the input-set size, the time steps, and the learning rate;
taking the multi-layer prediction parameters as the LSTM model hyperparameters and initializing them to obtain the initialized LSTM model hyperparameters;
and preprocessing a plurality of pre-acquired sample data based on a data-completion mechanism and a random time-step grouping mechanism to obtain the corresponding training and test samples.
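A minimal sketch of the two preprocessing mechanisms named above, under assumed details the patent does not spell out (forward-fill for data completion; window lengths drawn uniformly at random for the time-step grouping):

```python
import random

def fill_gaps(series):
    """Data-completion sketch: replace missing values (None) with the
    previous observed value (forward-fill is an assumed mechanism)."""
    filled, last = [], 0.0
    for v in series:
        last = last if v is None else v
        filled.append(last)
    return filled

def random_step_windows(series, min_steps=3, max_steps=6, seed=0):
    """Random time-step grouping sketch: cut the series into windows
    whose lengths are drawn at random, yielding (input, target) pairs."""
    rng = random.Random(seed)
    samples, i = [], 0
    while i + min_steps < len(series):
        steps = rng.randint(min_steps, max_steps)
        if i + steps >= len(series):
            break
        samples.append((series[i:i + steps], series[i + steps]))
        i += steps
    return samples

data = fill_gaps([1.0, None, 3.0, 4.0, None, 6.0, 7.0, 8.0, 9.0, 10.0])
train = random_step_windows(data)
```

The split of the resulting samples into training and test sets would follow the random partitioning described later in the text.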
In one implementation of the present application, optimizing the initialized LSTM model hyperparameters through the multi-layer LSTM model based on the training samples specifically includes:
when the population algorithm is the fruit fly optimization algorithm, training the multi-layer LSTM model on the training samples to adjust the number of population iterations and the inertia weight of the fruit flies;
and adjusting the search step size and search range of the grid search based on random step sizes, determining the adjusted LSTM model hyperparameters, and thereby realizing parameter optimization of the LSTM model hyperparameters.
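The random-step adjustment of the search can be illustrated as follows; `foa_candidate`, the parameter names, and the bounds are hypothetical, and only the general scheme (perturb the current best by a bounded random step, fruit-fly style) follows the text:

```python
import random

def foa_candidate(best, step_scale, bounds, rng):
    """Generate one fruit-fly candidate: perturb each hyperparameter of
    the current best with a random step proportional to its range,
    clipped back into the allowed bounds."""
    cand = {}
    for name, value in best.items():
        lo, hi = bounds[name]
        step = (hi - lo) * step_scale * rng.uniform(-1.0, 1.0)
        cand[name] = min(hi, max(lo, value + step))
    return cand

rng = random.Random(42)
best = {"learning_rate": 0.01, "time_steps": 12.0}
bounds = {"learning_rate": (1e-4, 0.1), "time_steps": (4.0, 48.0)}
cand = foa_candidate(best, 0.2, bounds, rng)
```

Shrinking `step_scale` over iterations would narrow the search range, matching the adaptive step adjustment the text describes.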
In one implementation of the present application, calculating the initial fitness of the multi-layer LSTM model based on the test samples specifically includes:
determining multi-layer parameter evaluation criteria for the multi-layer LSTM model, and taking these criteria as the fitness function for parameter evaluation; the multi-layer parameter evaluation criteria include at least: a loss function, a recent-loss function, and an overfitting evaluation function;
and calculating the initial fitness of the multi-layer LSTM model with the fitness function on the test samples, and determining, based on the initial fitness, whether the multi-layer LSTM model has reached the preset iteration-count threshold.
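A hedged sketch of a fitness function built from the three named criteria; the weights and the exact way the terms combine are assumptions, not values given by the patent:

```python
def fitness(train_losses, test_losses, recent_k=3):
    """Combine the three criteria named in the text: overall test loss,
    loss over the most recent window (recent-loss), and an overfitting
    term measured as the gap between test and train loss. Weights are
    illustrative assumptions; higher fitness means a better model."""
    overall = sum(test_losses) / len(test_losses)
    recent = sum(test_losses[-recent_k:]) / min(recent_k, len(test_losses))
    train_avg = sum(train_losses) / len(train_losses)
    overfit = max(0.0, overall - train_avg)  # penalize test >> train
    return 1.0 / (1.0 + 0.5 * overall + 0.3 * recent + 0.2 * overfit)
```

A model with low, balanced train/test losses scores close to 1, while a model that fits training data but fails on test data is penalized through both the overall and overfitting terms.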
In one implementation of the present application, when the initial fitness indicates that the multi-layer LSTM model has not reached the preset iteration-count threshold, before performing population iteration on the LSTM model hyperparameters with the population algorithm, the method further includes:
based on evaluating random partitions and on business demands focused on the recent period, acquiring a plurality of pieces of business data corresponding to those demands, and performing multiple checks on that business data against the multi-layer parameter evaluation criteria;
and, for business data that fails verification, acquiring the corresponding training result and back-filling the failed business data according to that training result.
In one implementation of the present application, determining the average score of the population iteration and the parameter-optimized LSTM model hyperparameters specifically includes:
inputting the LSTM model hyperparameters adjusted by population iteration and parameter optimization into the first layer of the multi-layer LSTM model, and inputting the output of the first layer into the second layer of the multi-layer LSTM model;
continuing to feed the output of the second layer downward until it reaches the last layer of the multi-layer LSTM model;
constructing a corresponding fully connected layer for each LSTM layer, and constructing the corresponding back-propagation for each fully connected layer;
and determining the model loss corresponding to each fully connected layer's output, and feeding the model losses and the back-propagation of each fully connected layer into an optimizer to obtain the average score of the LSTM model hyperparameters.
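The layer-by-layer feeding and score averaging described above can be sketched with plain functions standing in for the LSTM layers and fully connected heads (a real implementation would use a deep-learning framework; this only shows the data flow):

```python
def run_stack(x, layers, heads):
    """Each 'layer' feeds its output to the next; every layer also has
    its own fully connected head, and the per-layer scores are averaged
    into the single score used to evaluate the hyperparameters."""
    scores = []
    for layer, head in zip(layers, heads):
        x = layer(x)            # output of layer i becomes input of layer i+1
        scores.append(head(x))  # per-layer fully connected head
    return x, sum(scores) / len(scores)

layers = [lambda v: v * 0.9, lambda v: v + 0.1]  # stand-in LSTM layers
heads = [lambda v: v * 2.0, lambda v: v * 0.5]   # stand-in FC heads
out, avg_score = run_stack(1.0, layers, heads)
```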
In one implementation of the present application, after obtaining the corresponding LSTM optimization model based on the global optimal parameters, the method further includes:
distributing the LSTM model hyperparameters to a plurality of different prediction nodes according to the number of model-training and verification runs, and performing parallel prediction on the LSTM model hyperparameters through the plurality of prediction nodes;
receiving the prediction return values of the plurality of prediction nodes, and calculating the fitness of the LSTM optimization model from the plurality of prediction return values;
and evaluating the LSTM model hyperparameters according to the fitness of the LSTM optimization model to obtain a model performance evaluation of the LSTM optimization model.
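The fan-out to prediction nodes might look like the following sketch, where local functions stand in for remote prediction services and the aggregation rule (a simple mean of the return values) is an assumption:

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_on_nodes(params, node_fns):
    """Send one hyperparameter set to several prediction nodes in
    parallel and fold the returned scores into a single fitness value."""
    with ThreadPoolExecutor(max_workers=len(node_fns)) as pool:
        returns = list(pool.map(lambda fn: fn(params), node_fns))
    return sum(returns) / len(returns)

# toy stand-ins for remote nodes
nodes = [lambda p: p["lr"] * 100, lambda p: p["lr"] * 200]
fit = evaluate_on_nodes({"lr": 0.01}, nodes)
```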
In one implementation of the present application, predicting the time series through the LSTM optimization model specifically includes:
determining the prediction type of the time series to be predicted based on user demand, and determining the corresponding prediction mechanism according to the prediction type; the prediction types include long-term prediction and short-term prediction, and the prediction mechanisms include a shared prediction mechanism and a single-output prediction mechanism;
for a time series whose prediction type is long-term prediction, predicting it through the shared prediction mechanism, outputting the corresponding long-term prediction series, and reducing the offset of the long-term prediction series through a single-step sliding mechanism;
and, for a time series whose prediction type is short-term prediction, predicting it through the single-output prediction mechanism and outputting the corresponding short-term prediction series, thereby realizing prediction of the time series.
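The single-step sliding mechanism for long-term prediction can be illustrated as follows: predict one step, append it to the window, slide, and repeat, which limits drift compared with predicting the whole horizon at once. The toy one-step model here is purely for demonstration:

```python
def long_term_forecast(history, one_step_model, horizon):
    """Single-step sliding sketch: one_step_model is any function
    f(window) -> next value; the window slides by one after each step."""
    window = list(history)
    preds = []
    for _ in range(horizon):
        nxt = one_step_model(window)
        preds.append(nxt)
        window = window[1:] + [nxt]  # slide the window forward one step
    return preds

# toy model: next value continues the last observed difference
model = lambda w: w[-1] + (w[-1] - w[-2])
preds = long_term_forecast([1.0, 2.0, 3.0], model, 3)
```

Short-term prediction under the single-output mechanism would be a single call to the model, without the sliding loop.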
In another aspect, an embodiment of the present application further provides a time series prediction device based on an LSTM optimization model, including:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the time series prediction method based on an LSTM optimization model described above.
In another aspect, an embodiment of the present application further provides a non-volatile computer storage medium storing computer-executable instructions, wherein a computer, when executing the instructions, implements the time series prediction method based on an LSTM optimization model described above.
The time series prediction method, device, and medium based on an LSTM optimization model provided by the embodiments of the present application have at least the following beneficial effects:
the model convergence speed can be increased by reasonably selecting the initial value of the super parameter, the expression capacity of the model is improved, a good foundation is laid for the subsequent optimization process, noise and unnecessary characteristics in sample data can be eliminated by preprocessing the data samples, useful information is extracted, cleaner and more representative training samples and test samples are obtained, and the prediction capacity of the model is improved; the ultra-parameters of the LSTM model can be automatically adjusted through the multi-layer LSTM model and the training samples, the most suitable ultra-parameter value can be automatically selected according to actual data conditions, so that a more accurate prediction result is obtained, the prediction capability of the model to unseen data can be evaluated through inputting the test samples into the multi-layer LSTM model, and the initial fitness can be used as a reference index for further optimization and used for judging the quality degree of the model; when the multilayer LSTM model does not reach the preset iteration number threshold, iteration optimization can be carried out on the super parameters of the model by using a population algorithm, the performance of the model can be gradually improved by continuously updating the super parameters, a more stable and reliable evaluation result can be obtained by averaging scores of the super parameters of the LSTM model obtained by multiple iterations, misjudgment caused by accidental of a single iteration result can be avoided, and the accurate evaluation of the performance of the model is improved; by comparing the relation between the current fitness and the current optimal solution, whether the global optimal parameter is obtained can be determined, when the current fitness is larger than the current optimal solution, the better parameter combination can be determined to be obtained, so that the global optimal 
parameter is obtained, and further, an LSTM (least squares) optimization model is obtained based on the global optimal parameter, so that the corresponding time sequence is predicted through the LSTM optimization model.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and constitute a part of it, illustrate embodiments of the application and together with the description serve to explain it; they do not unduly limit the application. In the drawings:
Fig. 1 is a flowchart of a time series prediction method based on an LSTM optimization model according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a multi-layer LSTM model according to an embodiment of the present application;
Fig. 3 is a flowchart of another time series prediction method based on the LSTM optimization model according to an embodiment of the present application;
Fig. 4 is a schematic diagram of the internal structure of a time series prediction device based on an LSTM optimization model according to an embodiment of the present application.
Detailed Description
To make the purposes, technical solutions, and advantages of the present application clearer, the technical solutions of the present application will be described clearly and completely below with reference to specific embodiments and the corresponding drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art, without creative effort, based on the embodiments of the present application fall within the protection scope of the present application.
The embodiments of the present application provide a time series prediction method, device, and medium based on an LSTM optimization model with the beneficial effects set out in the Summary above: faster convergence and better expressive capacity from well-chosen initial hyperparameter values; cleaner, more representative training and test samples from preprocessing; automatic hyperparameter tuning through the multi-layer LSTM model and evaluation of the model on unseen data via the initial fitness; stable, reliable evaluation from population iteration and score averaging; and global optimal parameters obtained by comparing the current fitness against the current optimal solution, from which the LSTM optimization model is built to predict the corresponding time series. This solves the technical problems that existing prediction approaches require high learning costs and dedicated maintenance.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a flowchart of the time series prediction method based on an LSTM optimization model according to an embodiment of the present application.
The method of the embodiments of the present application may be executed by a terminal device or a server; this application places no particular limitation on the executing entity. For ease of understanding and description, the following embodiments are described in detail taking a server as an example.
It should be noted that the server may be a single device or a system composed of multiple devices, that is, a distributed server; this is not specifically limited in this application.
As shown in fig. 1, a time sequence prediction method based on an LSTM optimization model provided in an embodiment of the present application includes:
101. Initialize the hyperparameters of the multi-layer prediction LSTM model, and preprocess the pre-acquired data samples to obtain preprocessed training and test samples.
The present application effectively optimizes the LSTM model hyperparameters by combining a multi-layer Long Short-Term Memory (LSTM) model with a population algorithm. Initializing the LSTM model hyperparameters and preprocessing the data samples improves the accuracy and stability of prediction, so the method can be widely applied to various time series prediction scenarios, such as stock-market prediction and weather prediction, and has high practical value and economic benefit.
In one embodiment of the present application, for the business scenario of the time series to be predicted and its corresponding business characteristics, the server applies the idea of a population algorithm and takes multi-layer prediction parameters such as the multi-layer prediction ratio, number of hidden layers, number of iterations, input-set size, time steps, and learning rate as the LSTM model hyperparameters, initializing them to obtain the initialized LSTM model hyperparameters; the server also frames the parameter ranges and the size relations between the layers. Then the server preprocesses a plurality of pre-acquired sample data with a data-completion mechanism and a random time-step grouping mechanism to obtain the corresponding training and test samples; these preprocessing steps ensure the integrity and accuracy of the data and improve the predictive ability of the model.
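A hypothetical initialization of the hyperparameters listed above; the ranges are illustrative assumptions, not values fixed by the patent:

```python
import random

def init_hyperparams(rng=None):
    """Draw one initial hyperparameter set within assumed framed ranges.
    Real ranges would be chosen per business scenario, as the text says."""
    rng = rng or random.Random(0)
    return {
        "hidden_layers": rng.randint(1, 4),
        "iterations": rng.randint(50, 500),
        "input_size": rng.randint(8, 128),
        "time_steps": rng.randint(4, 48),
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform draw
    }

params = init_hyperparams()
```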
102. Optimize the initialized LSTM model hyperparameters through the multi-layer LSTM model based on the training samples, and calculate the initial fitness of the multi-layer LSTM model based on the test samples.
In one embodiment of the present application, when the population algorithm is the fruit fly optimization algorithm, the server trains the multi-layer LSTM model on the training samples to adjust the number of population iterations and the inertia weight of the fruit flies; adjusting these parameters improves the search capacity and convergence speed of the model, so it adapts better to different time series data. The server adjusts the search step size and search range of the grid search through random step sizes and perturbation, which enlarges the search space and raises the probability of finding the global optimum; it then determines the adjusted LSTM model hyperparameters, realizing parameter optimization of the hyperparameters. Combining the fruit fly optimization algorithm in this way effectively solves the model's hyperparameter-optimization problem and improves its predictive performance.
Specifically, the server determines the multi-layer parameter evaluation criteria of the multi-layer LSTM model and uses them as the fitness function for parameter evaluation; these criteria measure the prediction error, short- and long-term prediction performance, and the model's ability to avoid overfitting. It should be noted that the multi-layer parameter evaluation criteria in the embodiments of the present application include at least: a loss function, a recent-loss function, and an overfitting evaluation function.
Then, using the fitness function, the server calculates the initial fitness of the multi-layer LSTM model on the test samples, providing a quantitative evaluation of the model's performance on test data and helping to assess its effectiveness and accuracy. By comparing against the preset iteration-count threshold, the server determines whether the model has reached it, which helps control the training process and prevent overfitting. The server then optimizes the LSTM model hyperparameters according to the initial fitness and the threshold to obtain better predictive performance; adjusting the model's multi-layer parameters, such as the number of hidden layers, the number of iterations, and the learning rate, further improves its prediction accuracy and generalization.
103. When the initial fitness indicates that the multi-layer LSTM model has not reached the preset iteration-count threshold, perform population iteration on the LSTM model hyperparameters with the population algorithm, and determine the average score of the population iteration and the parameter-optimized LSTM model hyperparameters.
In one embodiment of the present application, the multi-layer LSTM model is introduced to address the problems that a plain LSTM model converges slowly, resists interference poorly, and has difficulty retaining both near and far features at the same time. Based on the order in which the hidden layers are used, the server takes each layer's output as the input of the layer below: the population-iterated, parameter-optimized LSTM model hyperparameters are fed into the first layer of the multi-layer LSTM model, the first layer's output is fed into the second layer, and each layer's output continues downward until it reaches the last layer of the multi-layer LSTM model.
The server introduces a per-layer output ratio, builds a corresponding fully connected layer for each LSTM layer, constructs the back-propagation of each layer through its fully connected layer, determines the model loss corresponding to each fully connected layer's output, and feeds the model losses and the back-propagation of each fully connected layer into the optimizer, obtaining the average score of the LSTM model hyperparameters. In this way, the layering and ratios address rapid convergence and the retention of both near and far features, while the back-propagation of the fully connected layers addresses the model's self-learning and improves its interference resistance.
In one embodiment, before population iteration is performed on the LSTM model hyperparameters (when the initial fitness shows the multi-layer LSTM model has not reached the preset iteration-count threshold), the server uses evaluated random partitioning to obtain a plurality of pieces of business data related to the business demands; by emphasizing the business demands of the recent period, the model is made to capture and adapt to current business trends better. The server formulates the multi-layer parameter evaluation criteria and performs multiple checks on the obtained business data, ensuring a comprehensive evaluation of business demands along different dimensions and improving the model's robustness. For business data that fails verification, the server obtains the corresponding training result, analyzes it to understand where the model falls short on that data, and then repairs the failed business data with a back-fill mechanism according to the training result, adjusting model parameters or adding training samples so the model adapts better to that data. This removes the interference that business data which initially failed to propagate upward would otherwise cause to the later model.
Fig. 2 is a schematic structural diagram of the multi-layer LSTM model according to an embodiment of the present application. As shown in Fig. 2, the server first initializes the LSTM model hyperparameters and then feeds them into the first LSTM layer of the multi-layer model; the first layer's processed output is fed into the second LSTM layer, and following this flow each layer's processed output continues downward until it reaches the last LSTM layer of the multi-layer model.
A corresponding full-connection layer is constructed for each layer of the multilayer LSTM model, the back propagation of each full-connection layer is computed through that layer, and the back-propagation result of each layer is input into the optimizer. The server also determines the output result of each full-connection layer, calculates the model loss of the LSTM model from the outputs of all full-connection layers, and inputs the model loss into the optimizer. The optimizer then performs optimization based on the backward outputs and the model loss of each full-connection layer, yielding the average score of the LSTM model super-parameters.
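The per-layer flow above (each stacked layer feeding both the next layer and its own fully connected head, with the per-head losses combined into one score) can be sketched structurally as follows. The callables, the simple mean over head losses, and the sign convention for the score are assumptions for illustration:

```python
def run_multilayer(hyperparams, layers, heads, loss_fn, target):
    """Pass `hyperparams` through stacked layers; each layer also feeds
    its own fully connected head, and the per-head losses are combined.

    layers:  list of callables, layer(state) -> state  (stacked LSTM layers)
    heads:   list of callables, head(state) -> output  (per-layer FC heads)
    loss_fn: callable(output, target) -> float
    Returns the final state and a score (negated mean head loss,
    so a higher score means a lower combined loss).
    """
    state = hyperparams
    losses = []
    for layer, head in zip(layers, heads):
        state = layer(state)                      # layer i feeds layer i+1
        losses.append(loss_fn(head(state), target))
    model_loss = sum(losses) / len(losses)        # combined loss for the optimizer
    return state, -model_loss
```

With real LSTM layers the `state` would be a sequence tensor rather than a scalar; the skeleton only shows how the per-layer heads contribute to one averaged score.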
104. And calculating the current fitness of the multilayer LSTM model based on the average score and the training sample, and obtaining the global optimal parameter under the condition that the current fitness is larger than the current optimal solution.
The server calculates the current fitness of the multilayer LSTM model from the average score obtained by executing the LSTM model super-parameters multiple times through the multilayer LSTM model and from the training samples obtained by preprocessing the data samples. It then obtains the initial optimal parameters of the multilayer LSTM model based on the current fitness and compares the current fitness with the current optimal solution. When the current fitness is larger than the current optimal solution, the current optimal solution is updated, and the global optimal parameters of the multilayer LSTM model are obtained from the updated optimal solution and the initial optimal parameters.
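The greater-than comparison and best-solution update described here reduce to a small rule, sketched below. The function name and the tuple return shape are illustrative assumptions:

```python
def update_global_best(fitness, params, best_fitness, best_params):
    """Keep the global optimum: replace the current best only when the
    new fitness strictly exceeds the current optimal solution."""
    if fitness > best_fitness:
        return fitness, params
    return best_fitness, best_params
```

Calling this once per iteration with the current fitness and candidate parameters yields the global optimal parameters after the last iteration.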
105. Based on the global optimal parameters, a corresponding LSTM optimization model is obtained, and the time sequence is predicted through the LSTM optimization model.
Specifically, in one embodiment of the present application, the server determines the prediction type corresponding to the time sequence to be predicted based on the user requirement, and determines the prediction mechanism corresponding to the time sequence to be predicted according to the prediction type. It should be noted that the prediction types in the embodiment of the present application include long-term prediction and short-term prediction, and the prediction mechanisms include a shared prediction mechanism and a single-output prediction mechanism.
For a time sequence to be predicted whose prediction type is long-term prediction, the server predicts it through the shared prediction mechanism and outputs the corresponding long-term prediction time sequence, reducing the offset of the long-term prediction time sequence through a single-step sliding mechanism. For a time sequence whose prediction type is short-term prediction, the server predicts it through the single-output prediction mechanism and outputs the corresponding short-term prediction time sequence, thereby realizing the prediction of the time sequence.
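One common reading of the single-step sliding mechanism for long-horizon forecasting is a rolling forecast: predict one step, append it to the input window, slide by one, and repeat. The sketch below assumes this reading; the `model` callable and window handling are illustrative, not the patented procedure:

```python
def rolling_forecast(model, history, horizon, window):
    """Long-horizon forecast via single-step sliding.

    model:   callable(window_list) -> next value (one-step predictor)
    history: observed series, oldest first
    horizon: number of future steps to produce
    window:  number of recent values fed to the model each step
    """
    buf = list(history[-window:])
    preds = []
    for _ in range(horizon):
        nxt = model(buf[-window:])  # predict one step from the latest window
        preds.append(nxt)
        buf.append(nxt)             # slide the window by one step
    return preds
```

Compared with emitting the whole horizon from one stale window, each step here is conditioned on the most recent (predicted) values, which is how the mechanism can reduce offset.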
In one embodiment of the application, after obtaining the corresponding LSTM optimization model based on the global optimal parameters, the server sends the LSTM model super-parameters to a plurality of different prediction nodes according to the model training and verification times, and the prediction nodes predict the LSTM model super-parameters in parallel. The server then receives the prediction return values of the prediction nodes, calculates the fitness of the LSTM optimization model from these return values, and evaluates the LSTM model super-parameters according to that fitness, thereby obtaining a model performance evaluation of the LSTM optimization model.
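Fanning the same super-parameters out to several nodes and folding their return values into one fitness can be sketched with a thread pool. Treating each node as a local callable and averaging the returns are simplifying assumptions; in the embodiment the nodes would be remote services:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_evaluate(hyperparams, nodes):
    """Send the same hyper-parameters to several prediction 'nodes' in
    parallel, then combine their return values into one fitness value.

    nodes: list of callables, node(hyperparams) -> numeric return value
    """
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        returns = list(pool.map(lambda n: n(hyperparams), nodes))
    fitness = sum(returns) / len(returns)  # simple mean as the combiner
    return fitness, returns
```

`pool.map` preserves node order, so each return value can still be attributed to its node when evaluating the super-parameters.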
Fig. 3 is a flowchart of another time sequence prediction method based on the LSTM optimization model according to an embodiment of the present application. As shown in fig. 3, the server first initializes the LSTM model super-parameters and preprocesses the pre-acquired data samples to obtain preprocessed training samples and test samples. The server optimizes the initialized LSTM model super-parameters according to the training samples and inputs the optimized super-parameters into the multilayer LSTM model, thereby obtaining the average score over multiple model executions.
Meanwhile, the server calculates the initial fitness of the multilayer LSTM model based on the training samples and determines, based on the initial fitness, whether the multilayer LSTM model has reached the preset iteration number threshold. If the threshold has been reached, the current data can be obtained directly through the acquirer, and the global optimal parameters are determined from the obtained current data. Otherwise, the server calculates the current fitness of the multilayer LSTM model according to the training samples and the average score, and updates the optimal solution when the current fitness is larger than the current optimal solution, obtaining the corresponding global optimal parameters. The LSTM optimization model can then be obtained based on the global optimal parameters, and model performance evaluation is carried out through the LSTM optimization model.
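The overall loop of fig. 3 (initialize, evaluate fitness, iterate until the threshold, keep the global best) resembles a fruit fly optimization (FOA) search, which claim 2 names as one choice of population algorithm. The sketch below is a minimal single-fly variant under that assumption; the decaying step (inertia), the 1-D parameter, and all names are illustrative, not the patented procedure:

```python
import random

def fruit_fly_search(fitness_fn, init, iters=30, step=0.5, seed=0):
    """Minimal fruit-fly-style search: take a random step around the
    current position, evaluate fitness, update the global best only on
    improvement, and shrink the step over iterations."""
    rng = random.Random(seed)
    best_x, best_f = init, fitness_fn(init)
    x = init
    for t in range(iters):
        inertia = step * (1 - t / iters)           # decaying search step
        cand = x + rng.uniform(-inertia, inertia)  # random-direction move
        f = fitness_fn(cand)
        if f > best_f:                             # keep the global optimum
            best_x, best_f = cand, f
            x = cand                               # the swarm flies to the best
    return best_x, best_f
```

In the embodiment the scalar `x` would be the vector of LSTM hyper-parameters and `fitness_fn` the multilayer evaluation standard; the control flow is the same.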
The foregoing is a method embodiment presented herein. Based on the same inventive concept, the embodiment of the application also provides a time sequence prediction device based on the LSTM optimization model, and the structure of the time sequence prediction device is shown in fig. 4.
Fig. 4 is a schematic diagram of an internal structure of a time series prediction device based on an LSTM optimization model according to an embodiment of the present application. As shown in fig. 4, the apparatus includes:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to:
initializing the super parameters of the multilayer-prediction LSTM model, and preprocessing the pre-acquired data samples to obtain preprocessed training samples and preprocessed test samples;
optimizing the initialized LSTM model hyper-parameters through the multilayer LSTM model and based on training samples, and calculating the initial fitness of the multilayer LSTM model based on test samples;
under the condition that the multilayer LSTM model does not reach a preset iteration number threshold value based on the initial fitness, carrying out population iteration on the LSTM model super-parameters based on a population algorithm, and determining average scores of the population iteration and the LSTM model super-parameters after parameter optimization;
calculating the current fitness of the multilayer LSTM model based on the average score and the training sample, and obtaining a global optimal parameter under the condition that the current fitness is larger than the current optimal solution;
based on the global optimal parameters, a corresponding LSTM optimization model is obtained, and the time sequence is predicted through the LSTM optimization model.
The embodiments of the present application also provide a non-volatile computer storage medium storing computer-executable instructions which, when executed, enable the computer to:
initializing the super parameters of the multilayer-prediction LSTM model, and preprocessing the pre-acquired data samples to obtain preprocessed training samples and preprocessed test samples;
optimizing the initialized LSTM model hyper-parameters through the multilayer LSTM model and based on training samples, and calculating the initial fitness of the multilayer LSTM model based on test samples;
under the condition that the multilayer LSTM model does not reach a preset iteration number threshold value based on the initial fitness, carrying out population iteration on the LSTM model super-parameters based on a population algorithm, and determining average scores of the population iteration and the LSTM model super-parameters after parameter optimization;
calculating the current fitness of the multilayer LSTM model based on the average score and the training sample, and obtaining a global optimal parameter under the condition that the current fitness is larger than the current optimal solution;
based on the global optimal parameters, a corresponding LSTM optimization model is obtained, and the time sequence is predicted through the LSTM optimization model.
The embodiments in this application are described in a progressive manner; identical and similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the others. In particular, the apparatus and medium embodiments are described relatively simply because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
The foregoing describes specific embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The devices and media provided in the embodiments of the present application are in one-to-one correspondence with the methods, so that the devices and media also have similar beneficial technical effects as the corresponding methods, and since the beneficial technical effects of the methods have been described in detail above, the beneficial technical effects of the devices and media are not described in detail herein.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium which can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (7)

1. A method for predicting a time sequence based on an LSTM optimization model, the method comprising:
initializing the super parameters of the multilayer-prediction LSTM model, and preprocessing the pre-acquired data samples to obtain preprocessed training samples and preprocessed test samples;
optimizing the initialized LSTM model hyper-parameters through a multi-layer LSTM model and based on the training samples, and calculating the initial fitness of the multi-layer LSTM model based on the test samples;
under the condition that the multilayer LSTM model does not reach a preset iteration number threshold based on the initial fitness, carrying out population iteration on the LSTM model super-parameters based on a population algorithm, and determining average scores of the population iteration and the LSTM model super-parameters after parameter optimization;
calculating the current fitness of the multilayer LSTM model based on the average score and the training sample, and obtaining a global optimal parameter under the condition that the current fitness is larger than a current optimal solution;
based on the global optimal parameters, a corresponding LSTM optimization model is obtained, and the time sequence is predicted through the LSTM optimization model;
the initializing the super parameters of the multilayer-prediction LSTM model and preprocessing a data sample obtained in advance to obtain a preprocessed training sample and a preprocessed test sample specifically comprises the following steps:
determining a service scene of a time sequence to be predicted and service characteristics corresponding to the service scene, and determining multi-layer prediction parameters based on the service scene and the corresponding service characteristics; the multi-layer prediction parameters include at least: number of hidden layers, number of iterations, input set size, time step, and learning rate;
taking the multilayer prediction parameters as LSTM model superparameters, and initializing the LSTM model superparameters to obtain initialized LSTM model superparameters;
preprocessing a plurality of sample data obtained in advance based on a data complement mechanism and a random time step number grouping mechanism, and obtaining corresponding training samples and test samples;
the calculating the initial fitness of the multilayer LSTM model based on the test sample specifically comprises the following steps:
determining a multi-layer parameter evaluation standard of a multi-layer LSTM model, and taking the multi-layer parameter evaluation standard as an adaptability function of parameter evaluation; the multi-layer parameter evaluation criteria include at least: a loss function, a recent loss function, and an overfitting evaluation function;
calculating the initial fitness of the multi-layer LSTM model through the fitness function according to a test sample, and determining whether the multi-layer LSTM model reaches a preset iteration number threshold or not based on the initial fitness;
under the condition that the multilayer LSTM model does not reach a preset iteration number threshold based on the initial fitness, before population iteration is carried out on the LSTM model super-parameters based on a population algorithm, the method further comprises the following steps:
based on evaluating random partitioning and on the service demands of the recent period being emphasized, acquiring a plurality of service data corresponding to the service demands, and performing multiple verifications under the multi-layer parameter evaluation standard according to the plurality of service data;
aiming at the service data with verification failure, acquiring a training result corresponding to the service data with verification failure, and reversely supplementing the service data with verification failure according to the training result.
2. The method for predicting time series based on the LSTM optimization model according to claim 1, wherein the optimizing the initialized LSTM model hyper-parameters by using the multilayer LSTM model and based on the training samples specifically includes:
under the condition that the population algorithm is a drosophila algorithm, training the multilayer LSTM model based on the training sample so as to adjust the population iteration times and the inertia weight of the drosophila;
and adjusting the searching step length and the searching range of the grid searching based on the random step length, determining the adjusted LSTM model hyper-parameters, and realizing parameter optimization of the LSTM model hyper-parameters.
3. The LSTM optimization model-based time series prediction method of claim 1, wherein determining the population iteration and the average score of the LSTM model super parameter after the parameter optimization specifically includes:
inputting the LSTM model superparameter subjected to population iteration and parameter optimization adjustment into a first layer of the multi-layer LSTM model, and inputting the LSTM model superparameter output by the first layer into a second layer of the multi-layer LSTM model;
continuously inputting the LSTM model superparameter output by the second layer downwards until the LSTM model superparameter is input to the last layer in the multilayer LSTM model;
constructing a corresponding full-connection layer aiming at each layer of LSTM model, and constructing corresponding back propagation of each full-connection layer respectively;
and determining model loss corresponding to the output result of each full-connection layer, and inputting the model loss and the back propagation of each full-connection layer into an optimizer to obtain the average score of the LSTM model hyper-parameters.
4. The method for predicting time series based on LSTM optimization model according to claim 1, wherein after obtaining the corresponding LSTM optimization model based on the global optimal parameter, the method further comprises:
based on model training and verification times, transmitting the LSTM model super-parameters to a plurality of different prediction nodes, and carrying out parallel prediction on the LSTM model super-parameters through the plurality of prediction nodes;
respectively receiving the prediction return values of the plurality of prediction nodes, and calculating the fitness of the LSTM optimization model according to the plurality of prediction return values;
and evaluating the LSTM model hyper-parameters according to the fitness of the LSTM optimization model to obtain model performance evaluation of the LSTM optimization model.
5. The method for predicting time series based on the LSTM optimization model according to claim 1, wherein predicting time series by the LSTM optimization model specifically comprises:
determining a prediction type corresponding to a time sequence to be predicted based on user requirements, and determining a prediction mechanism corresponding to the time sequence to be predicted according to the prediction type; the prediction type includes: long-term prediction and short-term prediction, the prediction mechanism comprising: a shared prediction mechanism and a single output prediction mechanism;
for a time sequence to be predicted whose prediction type is long-term prediction, predicting the time sequence to be predicted through the shared prediction mechanism, outputting a corresponding long-term prediction time sequence, and reducing the offset of the long-term prediction time sequence through a single-step sliding mechanism;
and for a time sequence to be predicted whose prediction type is short-term prediction, predicting the time sequence to be predicted through the single-output prediction mechanism, and outputting a corresponding short-term prediction time sequence to realize the prediction of the time sequence.
6. A time series prediction apparatus based on an LSTM optimization model, the apparatus comprising:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a LSTM optimization model-based time series prediction method as claimed in any one of claims 1-5.
7. A non-transitory computer storage medium storing computer executable instructions which, when executed, implement a LSTM optimization model-based time series prediction method as claimed in any one of claims 1-5.
CN202311553141.1A 2023-11-21 2023-11-21 Time sequence prediction method, device and medium based on LSTM optimization model Active CN117272051B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311553141.1A CN117272051B (en) 2023-11-21 2023-11-21 Time sequence prediction method, device and medium based on LSTM optimization model


Publications (2)

Publication Number Publication Date
CN117272051A CN117272051A (en) 2023-12-22
CN117272051B true CN117272051B (en) 2024-03-08

Family

ID=89201213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311553141.1A Active CN117272051B (en) 2023-11-21 2023-11-21 Time sequence prediction method, device and medium based on LSTM optimization model

Country Status (1)

Country Link
CN (1) CN117272051B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112653142A (en) * 2020-12-18 2021-04-13 武汉大学 Wind power prediction method and system for optimizing depth transform network
CN112733996A (en) * 2021-01-14 2021-04-30 河海大学 GA-PSO (genetic Algorithm-particle swarm optimization) based hydrological time sequence prediction method for optimizing XGboost
CN114781692A (en) * 2022-03-24 2022-07-22 国网河北省电力有限公司衡水供电分公司 Short-term power load prediction method and device and electronic equipment
CN115185804A (en) * 2022-07-28 2022-10-14 苏州浪潮智能科技有限公司 Server performance prediction method, system, terminal and storage medium
CN115545151A (en) * 2022-08-25 2022-12-30 南京上铁电子工程有限公司 Method for training prediction model for time series data prediction and related equipment
CN116526450A (en) * 2023-04-14 2023-08-01 三峡大学 Error compensation-based two-stage short-term power load combination prediction method
CN116757057A (en) * 2023-03-25 2023-09-15 北京工业大学 Air quality prediction method based on PSO-GA-LSTM model


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Evolving CNN-LSTM models for time series prediction using enhanced grey wolf optimizer; HaiLin Xie et al.; IEEE Access; Vol. 8; 161519-161541 *
Time-series gas concentration prediction model based on IAPSO-Holt-TCN; Wen Tingxin et al.; China Safety Production Science and Technology (《中国安全生产科学技术》); Vol. 19, No. 7; 57-62 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant