CN113569972A - Meteorological data interpolation method, meteorological data interpolation device, electronic equipment and storage medium - Google Patents

Publication number
CN113569972A
Authority
CN
China
Prior art keywords
data
meteorological data
model
data interpolation
interpolation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110884988.2A
Other languages
Chinese (zh)
Inventor
黄翀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Geographic Sciences and Natural Resources of CAS
Original Assignee
Institute of Geographic Sciences and Natural Resources of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Geographic Sciences and Natural Resources of CAS filed Critical Institute of Geographic Sciences and Natural Resources of CAS
Priority to CN202110884988.2A priority Critical patent/CN113569972A/en
Publication of CN113569972A publication Critical patent/CN113569972A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Complex Calculations (AREA)

Abstract

The application provides a meteorological data interpolation method, a device, electronic equipment and a storage medium, relating to the field of meteorological monitoring. The method comprises: obtaining a missing air temperature segment based on the obtained automatic observation data; and interpolating the missing air temperature segment with a trained meteorological data interpolation model, taking the obtained manual observation data as the true value, to obtain complete high-frequency time-series meteorological data. Complete, high-precision temperature observation data are obtained with the aid of the low-frequency temperature observation data by interpolating the high-frequency temperature observation data that contain missing values, which can address the low precision of existing temperature observation data.

Description

Meteorological data interpolation method, meteorological data interpolation device, electronic equipment and storage medium
Technical Field
The application relates to the field of meteorological monitoring, in particular to a meteorological data interpolation method and device, electronic equipment and a storage medium.
Background
Temperature is an important observed quantity in agricultural and ecosystem research and must be considered in crop growth simulation, agro-meteorological disaster monitoring and ecosystem simulation. As agricultural and ecological simulation becomes more refined, higher precision is required of temperature observation. Temperature observation data are generally acquired by field meteorological observation stations, where equipment failure, harsh environments or manual operation errors easily cause data loss and thereby reduce the precision of the temperature observation data. Current temperature observation technology therefore suffers from low precision of temperature observation data.
Disclosure of Invention
An object of the embodiments of the present application is to provide a meteorological data interpolation method, device, electronic device, and storage medium, so as to solve the problem of low accuracy of current temperature observation data.
In a first aspect, an embodiment of the present application provides a meteorological data interpolation method, including:
obtaining a missing air temperature segment based on the obtained automatic observation data;
and interpolating the missing air temperature segment according to a trained meteorological data interpolation model by taking the obtained artificial observation data as a true value to obtain complete high-frequency time sequence meteorological data.
In the implementation process, the meteorological data interpolation method provided by the embodiment of the application uses a meteorological data interpolation model, with the manual observation data as reference, to interpolate the missing air temperature segments in the automatic observation data. Complete, high-frequency and high-precision temperature observation data can thus be obtained, the accuracy and integrity of the data are improved, and the problem of low precision of current temperature observation data can be addressed.
Optionally, before the inputting the missing temperature segment data into the meteorological data interpolation model, the method further includes:
constructing a meteorological data interpolation initial model;
training the meteorological data interpolation initial model based on a model data set to obtain the meteorological data interpolation model, wherein the model data set comprises a training data set, a verification data set and a test data set, the training data set and the verification data set are used for training the meteorological data interpolation initial model, and the test data set is used for verifying the generalization capability of the trained meteorological data interpolation initial model; the meteorological data interpolation initial model is a BiLSTM-I model, and the constructing of the meteorological data interpolation initial model comprises: constructing the meteorological data interpolation initial model based on the Seq2Seq and Encoder-Decoder architecture.
In the implementation process, the meteorological data interpolation model provided by the embodiment of the application interpolates the missing air temperature segment, so that the data interpolation precision and robustness can be improved, and the condition that the meteorological station data is missing due to equipment failure, harsh environment and the like can be met.
Optionally, the meteorological data interpolation initial model includes a coding part, the structure of the coding part is an LSTM-I structure, and a formula describing an internal unit connection process of the LSTM-I structure is as follows:
Figure BDA0003193718530000021
Figure BDA0003193718530000022
Figure BDA0003193718530000023
Figure BDA0003193718530000024
wherein the quantity shown in Figure BDA0003193718530000031 is the estimated vector, h_{t-1} is the hidden state of the previous LSTM unit, W_x and b_x are model parameters, the quantity shown in Figure BDA0003193718530000032 is the missing value, x_t is the input vector, m_t is the mask vector, h_t is the predicted state, and l_t is the estimation error.
In the implementation process, the long short-term memory (LSTM) structure provided in the embodiment of the present application controls the transmitted state through gating: it retains the meteorological data that need to be remembered for a long time and forgets unimportant data, which can improve the accuracy of the interpolated meteorological data.
Optionally, the meteorological data interpolation initial model further includes a decoding portion, and a formula describing a decoding process of the decoding portion is as follows:
s_t = LSTM(h_t, s_{t-1})
y_t = W_y s_t + b_y
Figure BDA0003193718530000033
wherein s is the output state sequence, y is the output interpolation result sequence, W_y and b_y are model parameters, and l_y is the interpolation result error of the decoding layer.
In the implementation process, the long short-term memory (LSTM) structure provided in the embodiment of the present application controls the transmitted state through gating: it retains the meteorological data that need to be remembered for a long time and forgets unimportant data, which can improve the accuracy of the interpolated meteorological data.
Optionally, the constructing the meteorological data interpolation initial model includes:
and forming the encoding part of the meteorological data interpolation initial model based on a bidirectional encoding LSTM-I neural network, wherein an output sequence of the encoding part comprises a forward hidden state sequence and a backward hidden state sequence, and the forward hidden state sequence and the backward hidden state sequence are spliced to be used as an output sequence of the encoding part, so that the decoding part receives the output sequence and generates an interpolated output interpolation result sequence.
In the implementation process, adopting the LSTM structure alleviates the vanishing or exploding gradients that arise during model training, which makes it better suited to training on long sequences than an ordinary recurrent neural network (RNN) structure; a meteorological data interpolation model with the LSTM structure can therefore improve the accuracy of long-sequence meteorological data interpolation.
Optionally, a formula for calculating an estimation error of the meteorological data interpolation initial model is as follows:
Figure BDA0003193718530000041
wherein the quantity shown in Figure BDA0003193718530000042 is the estimation error corresponding to the forward hidden state sequence, and the quantity shown in Figure BDA0003193718530000043 is the estimation error corresponding to the backward hidden state sequence.
In the implementation process, the bidirectional encoding structure of BiLSTM-I allows the meteorological data that need to be remembered for a long time to be identified more accurately, which improves the accuracy of the interpolated meteorological data.
Optionally, the obtaining the missing air temperature segment based on the obtained automatic observation data includes:
and performing data interpolation on data gaps lower than a preset time threshold in the automatic observation data and the artificial observation data by adopting a Kalman smoothing method to obtain the missing air temperature segment data.
In the implementation process, a data interpolation is performed on occasional or short-time data gaps in the time sequence of the meteorological data by adopting a Kalman smoothing method, so that the short-time data gaps can be eliminated, the model focuses more on the long-time sequence meteorological data interpolation, and the accuracy of the meteorological data interpolation can be improved.
In a second aspect, an embodiment of the present application further provides a meteorological data interpolation device, including:
and the data acquisition module is used for acquiring the missing air temperature segment based on the acquired automatic observation data.
And the data interpolation module is used for interpolating the missing air temperature segment according to the trained meteorological data interpolation model by taking the acquired artificial observation data as a true value so as to obtain complete high-frequency time sequence meteorological data.
In the implementation process, the meteorological data interpolation device can acquire complete high-precision temperature observation data through low-frequency temperature observation data, and perform data interpolation on temperature data with missing values, so that the problem of low precision of the current temperature observation data can be solved.
Optionally, the meteorological data interpolation device may further include:
and the initial model building module is used for building a meteorological data interpolation initial model before inputting the missing air temperature segment data into a meteorological data interpolation model.
The model training module is used for training the meteorological data interpolation initial model based on a model data set to obtain the meteorological data interpolation model, wherein the model data set comprises a training data set, a verification data set and a test data set, the training data set and the verification data set are used for training the meteorological data interpolation initial model, and the test data set is used for verifying the generalization capability of the trained meteorological data interpolation initial model; the meteorological data interpolation initial model may be a BiLSTM-I model, and the constructing of the meteorological data interpolation initial model comprises: constructing the meteorological data interpolation initial model based on the Seq2Seq and Encoder-Decoder architecture.
In the implementation process, the meteorological data interpolation model is used for interpolating the missing air temperature segment data, so that the data interpolation precision and robustness can be improved. The method can be used for dealing with the condition of data loss of the meteorological station caused by equipment failure, harsh environment and the like.
Optionally, the model training module may be further configured to form the encoding portion of the meteorological data interpolation initial model based on a bidirectional encoded LSTM-I neural network, where an output sequence of the encoding portion includes a forward hidden state sequence and a backward hidden state sequence, and the forward hidden state sequence and the backward hidden state sequence are spliced to serve as an output sequence of the encoding portion, so that the decoding portion receives the output sequence and generates an interpolated output interpolation result sequence.
In the implementation process, adopting the LSTM structure alleviates the vanishing or exploding gradients that arise during model training, which makes it better suited to training on long sequences than an ordinary recurrent neural network (RNN) structure; a meteorological data interpolation model with the LSTM structure can therefore improve the accuracy of long-sequence meteorological data interpolation.
Optionally, the data obtaining module may further be configured to:
and performing data interpolation on data gaps lower than a preset time threshold in the automatic observation data and the artificial observation data by adopting a Kalman smoothing method to obtain the missing air temperature segment data.
In the implementation process, the data interpolation is carried out by adopting a Kalman smoothing method for occasional or short-time data gaps in the time sequence of the meteorological data, so that the short-time data gaps can be eliminated, the model focuses on the long-time sequence meteorological data interpolation, and the accuracy of the meteorological data interpolation can be improved.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes a memory and a processor, where the memory stores program instructions, and the processor executes the steps in any one of the foregoing implementation manners when reading and executing the program instructions.
In a fourth aspect, an embodiment of the present application further provides a storage medium, where the readable storage medium stores computer program instructions, and the computer program instructions are read by a processor and executed to perform the steps in any of the foregoing implementation manners.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a schematic diagram of a meteorological data interpolation method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating a step of constructing a meteorological data interpolation model according to an embodiment of the present application;
fig. 3 is a schematic diagram of a meteorological data interpolation device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. Referring to fig. 1, fig. 1 is a schematic diagram of a meteorological data interpolation method according to an embodiment of the present application, where the method includes:
in step S12, a missing air temperature segment is obtained based on the acquired automatic observation data.
In step S14, the missing air temperature segment is interpolated according to the trained weather data interpolation model with the acquired artificial observation data as the true value, so as to obtain complete high-frequency time-series weather data.
The automatic observation data can be temperature data observed by a meteorological instrument every other preset time, and the manual observation data can be temperature data observed by a weather instrument every day for a fixed number of times.
In the embodiment of the application, the automatic observation data may be temperature data observed by the meteorological instrument once every half hour, and the manual observation data may be observations taken at three fixed times every day. The high-frequency automatic observation data are searched for data gaps to obtain the missing air temperature segments.
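The search for data gaps can be sketched as follows. This is a minimal illustration, assuming that missing half-hourly readings are stored as NaN; the function name `find_missing_segments` and the sample values are hypothetical and do not come from the application.

```python
import numpy as np

def find_missing_segments(series):
    """Return (start, length) pairs for each run of consecutive NaN values."""
    missing = np.isnan(np.asarray(series, dtype=float))
    # Pad with False so that runs touching either end are still delimited.
    padded = np.concatenate(([False], missing, [False]))
    change = np.diff(padded.astype(int))
    starts = np.flatnonzero(change == 1)    # first index of each run
    ends = np.flatnonzero(change == -1)     # index just past each run
    return [(int(s), int(e - s)) for s, e in zip(starts, ends)]

# Hypothetical half-hourly temperatures with one short and one longer gap.
temps = np.array([10.0, np.nan, 10.4, np.nan, np.nan, np.nan, 11.0])
segments = find_missing_segments(temps)
print(segments)
```

Each pair gives the start index and the length of one gap in half-hour steps, so gaps can later be separated into short ones and long ones by comparing the length with a threshold.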
And substituting the missing air temperature segment data into a meteorological data interpolation model to interpolate the temperature observation time sequence of the missing air temperature segment data to obtain complete air temperature time sequence data.
Therefore, the meteorological data interpolation method provided by the embodiment of the application uses a meteorological data interpolation model, with the manual observation data as reference, to interpolate the missing air temperature segments in the automatic observation data. Complete, high-frequency and high-precision temperature observation data can thus be obtained, the accuracy and integrity of the data are improved, and the problem of low precision of current temperature observation data can be addressed.
It should be understood that the meteorological data interpolation model also needs to be constructed before the meteorological data interpolation model is used for data interpolation.
Optionally, referring to fig. 2, fig. 2 is a schematic diagram of a step of constructing a meteorological data interpolation model according to an embodiment of the present application, and for step S14, before the missing air temperature segment is interpolated according to the trained meteorological data interpolation model, the method may further include the following steps:
in step S131, a meteorological data interpolation initial model is constructed.
In step S132, the meteorological data interpolation initial model is trained based on the model data set to obtain a meteorological data interpolation model.
The model data set comprises a training data set, a verification data set and a test data set, the training data set and the verification data set are used for training the meteorological data interpolation initial model, and the test data set is used for verifying the generalization capability of the trained meteorological data interpolation initial model.
For example, the meteorological data interpolation initial model may be a BiLSTM-I model, and the constructing the meteorological data interpolation initial model includes:
constructing the meteorological data interpolation initial model based on the Seq2Seq and Encoder-Decoder architecture.
Data inspection is performed on the acquired automatic observation data and manual observation data, the two kinds of data are matched in time and masks are made, and samples are drawn in a rolling window manner to obtain the model data set.
Illustratively, the automatic observation data and the manual observation data are matched in time to determine the segments that need data interpolation. A mask can be used to explicitly force the model to ignore certain values, for example so that attention is paid to the elements to be interpolated and those values do not affect the loss function and introduce errors into model learning. With the rolling window method, the time series segmented by day is used to construct the sample set for model training, which can strengthen the generalization capability of the model across different missing-value windows.
Therefore, the meteorological data interpolation model provided by the embodiment of the application interpolates the missing air temperature segment, so that the data interpolation precision and robustness can be improved, and the situation that meteorological station data are missing due to equipment failure, harsh environment and the like can be solved.
Specifically, the meteorological data interpolation initial model comprises a coding part, the structure of the coding part is an LSTM-I structure, and a formula describing an internal unit connection process of the LSTM-I structure is as follows:
Figure BDA0003193718530000091
Figure BDA0003193718530000092
Figure BDA0003193718530000093
Figure BDA0003193718530000094
wherein the quantity shown in Figure BDA0003193718530000095 is the estimated vector, h_{t-1} is the hidden state of the previous LSTM unit, W_x and b_x are model parameters, the quantity shown in Figure BDA0003193718530000096 is the missing value, x_t is the input vector, m_t is the mask vector, h_t is the predicted state, and l_t is the estimation error.
Illustratively, the basic structure of the coding part is LSTM-I, in which the recurrent neural network unit is a long short-term memory unit. In the formula
Figure BDA0003193718530000097
the hidden state h_{t-1} of the previous LSTM unit is converted into the estimated vector
Figure BDA0003193718530000098
In the formula
Figure BDA0003193718530000099
the mask vector m_t is used to replace the missing values in the input vector x_t with the corresponding values of the estimated vector
Figure BDA00031937185300000910
In the formula
Figure BDA00031937185300000911
the LSTM network unit takes
Figure BDA00031937185300000912
and the hidden state h_{t-1} and generates the predicted state h_t.
In the formula
Figure BDA00031937185300000913
the estimation error of the LSTM-I unit is calculated, the estimation error being the accumulated absolute difference between the observed value and the estimated value at the position of a missing value.
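The estimate, replace and error steps described above can be sketched numerically. The formula images are not available, so the linear form of the estimate, the mask convention (1 = a value is available, 0 = missing) and the choice to accumulate the error only where a reference value exists are all assumptions, and `lstm_i_impute_step` is a hypothetical name. The LSTM state update itself is left out of this sketch.

```python
import numpy as np

def lstm_i_impute_step(x_t, m_t, h_prev, W_x, b_x):
    """One imputation step: estimate, fill missing entries, masked L1 error."""
    x_hat = W_x @ h_prev + b_x                  # estimate from the previous hidden state
    x_obs = np.where(m_t == 1, x_t, 0.0)        # neutralise NaN at unobserved positions
    x_c = m_t * x_obs + (1 - m_t) * x_hat       # keep observations, fill the rest
    err = float(np.sum(m_t * np.abs(x_obs - x_hat)))  # error where a value is available
    return x_hat, x_c, err

# Hypothetical two-feature example: the second feature is missing.
h_prev = np.array([1.0, -1.0])
W_x = np.array([[0.5, 0.0], [0.0, 2.0]])
b_x = np.array([10.0, 10.0])
x_t = np.array([11.0, np.nan])
m_t = np.array([1.0, 0.0])

x_hat, x_c, err = lstm_i_impute_step(x_t, m_t, h_prev, W_x, b_x)
print(x_hat, x_c, err)
```

In a full model the filled vector `x_c` and `h_prev` would be passed to the LSTM unit to produce the next hidden state, and the per-step errors would be accumulated over the sequence.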
Therefore, the long short-term memory (LSTM) structure provided by the embodiment of the present application controls the transmitted state through gating: it retains the meteorological data that need to be remembered for a long time and forgets unimportant data, which can improve the accuracy of the interpolated meteorological data.
Further, the meteorological data interpolation initial model further comprises a decoding part, and a formula describing a decoding process of the decoding part is as follows:
s_t = LSTM(h_t, s_{t-1})
y_t = W_y s_t + b_y
Figure BDA0003193718530000101
wherein s is the output state sequence, y is the output interpolation result sequence, W_y and b_y are model parameters, and l_y is the interpolation result error of the decoding layer.
The formula s_t = LSTM(h_t, s_{t-1}) indicates that the bottom of the decoding layer is a standard LSTM network, which synthesizes the encoded output sequence h into an output state sequence s = {s_1, s_2, …, s_n} containing richer information. The formula y_t = W_y s_t + b_y indicates that, because the temperature is a continuous value, the top of the decoding layer is a linear fully connected layer that outputs the interpolation result sequence y. The quantity shown in
Figure BDA0003193718530000102
is the interpolation result error of the decoding layer.
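The two decoding formulas can be illustrated with a small NumPy sketch: a textbook LSTM cell followed by a linear output layer. The gate layout, the dimensions and the random weights are assumptions chosen only to show the data flow and the shapes; they are not the trained parameters or the exact cell of the application.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, s_prev, c_prev, W, U, b):
    """Standard LSTM cell; W, U, b stack the input, forget, output and candidate gates."""
    n = s_prev.shape[0]
    z = W @ x + U @ s_prev + b
    i, f, o = sigmoid(z[:n]), sigmoid(z[n:2 * n]), sigmoid(z[2 * n:3 * n])
    g = np.tanh(z[3 * n:])
    c = f * c_prev + i * g            # new cell state
    s = o * np.tanh(c)                # new output state s_t
    return s, c

def decode(h_seq, W, U, b, W_y, b_y):
    """s_t = LSTM(h_t, s_{t-1}), then y_t = W_y s_t + b_y for every step."""
    n = U.shape[1]
    s, c = np.zeros(n), np.zeros(n)
    ys = []
    for h_t in h_seq:
        s, c = lstm_step(h_t, s, c, W, U, b)
        ys.append(W_y @ s + b_y)      # linear layer: temperature is continuous
    return np.array(ys)

rng = np.random.default_rng(0)
enc_dim, dec_dim, steps = 4, 3, 5     # hypothetical sizes
h_seq = rng.normal(size=(steps, enc_dim))
W = rng.normal(size=(4 * dec_dim, enc_dim)) * 0.1
U = rng.normal(size=(4 * dec_dim, dec_dim)) * 0.1
b = np.zeros(4 * dec_dim)
W_y = rng.normal(size=(1, dec_dim))
b_y = np.array([15.0])

y = decode(h_seq, W, U, b, W_y, b_y)
print(y.shape)
```

The output has one interpolated temperature per time step, which matches the role of the interpolation result sequence y in the text.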
Therefore, by adopting the LSTM structure, the embodiment of the application alleviates the vanishing or exploding gradients that arise during model training and is better suited to training on long sequences than an ordinary recurrent neural network (RNN) structure, so that a meteorological data interpolation model with the LSTM structure can improve the accuracy of long-sequence meteorological data interpolation.
Optionally, for step S131, the constructing the meteorological data interpolation initial model includes:
and forming the encoding part of the meteorological data interpolation initial model based on a bidirectional encoding LSTM-I neural network, wherein an output sequence of the encoding part comprises a forward hidden state sequence and a backward hidden state sequence, and the forward hidden state sequence and the backward hidden state sequence are spliced to be used as an output sequence of the encoding part, so that the decoding part receives the output sequence and generates an interpolated output interpolation result sequence.
Illustratively, in bidirectional encoding, one pass reads the input from the beginning to the end of the time sequence and produces the forward hidden state vector sequence
Figure BDA0003193718530000111
The other pass reads the input in reverse, from the end to the beginning of the time sequence, and produces the backward hidden state sequence
Figure BDA0003193718530000112
The forward and backward hidden state sequences are spliced to form the encoded output of the coding layer
Figure BDA0003193718530000113
wherein the vector h_i can be expressed as:
Figure BDA0003193718530000114
Further, the error of the bidirectionally encoded LSTM-I coding network includes both the forward and the backward estimation errors. The formula for calculating the estimation error of the meteorological data interpolation initial model is as follows:
Figure BDA0003193718530000115
wherein the quantity shown in Figure BDA0003193718530000116 is the estimation error corresponding to the forward hidden state sequence, and the quantity shown in Figure BDA0003193718530000117 is the estimation error corresponding to the backward hidden state sequence.
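The splicing and the combined error can be written compactly. The following is a hedged reconstruction rather than a transcription of the formula images: it assumes that splicing means vector concatenation and that the two directional errors are simply added, and the arrow notation and the symbol l_e are illustrative.

```latex
h_i = \left[\, \overrightarrow{h}_i \,;\, \overleftarrow{h}_i \,\right],
\qquad
l_e = \overrightarrow{l}_e + \overleftarrow{l}_e
```

Under this reading, each spliced vector has twice the dimension of a single-direction hidden state, and the training signal penalizes estimation errors from both reading directions.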
Therefore, the bidirectional encoding structure of BiLSTM-I allows the meteorological data that need to be remembered for a long time to be identified more accurately, which improves the accuracy of the interpolated meteorological data.
Optionally, for step S12, the obtaining missing temperature segment data based on the obtained automatic observation data and manual observation data includes:
and performing data interpolation on data gaps lower than a preset time threshold in the automatic observation data and the artificial observation data by adopting a Kalman smoothing method to obtain the missing air temperature segment data.
Therefore, according to the embodiment of the application, the data interpolation is performed by adopting the Kalman smoothing method for the occasional or short-time data gaps in the time sequence of the meteorological data, so that the short-time data gaps can be eliminated, the model can pay more attention to the long-time sequence meteorological data interpolation, and the accuracy of the meteorological data interpolation can be improved.
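Short-gap filling by Kalman smoothing can be sketched as below. The application does not specify the state-space model, so this example assumes a scalar random-walk-plus-noise model with hypothetical noise variances `q` and `r`, combined with a backward Rauch-Tung-Striebel pass. Gaps at or above the threshold are left untouched for the interpolation model, and the series is assumed to contain at least one observed value.

```python
import numpy as np

def kalman_smooth_local_level(y, q=0.05, r=0.01):
    """Smoothed states of a random-walk-plus-noise model; NaN marks a missing step."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    xf, pf = np.zeros(n), np.zeros(n)          # filtered mean and variance
    xp, pp = np.zeros(n), np.zeros(n)          # predicted mean and variance
    x, p = y[~np.isnan(y)][0], 1.0             # start from the first observed value
    for t in range(n):
        xp[t], pp[t] = x, p + q                # predict
        if np.isnan(y[t]):
            x, p = xp[t], pp[t]                # no observation: skip the update
        else:
            k = pp[t] / (pp[t] + r)            # Kalman gain
            x = xp[t] + k * (y[t] - xp[t])
            p = (1 - k) * pp[t]
        xf[t], pf[t] = x, p
    xs = xf.copy()
    for t in range(n - 2, -1, -1):             # backward smoothing pass
        g = pf[t] / pp[t + 1]
        xs[t] = xf[t] + g * (xs[t + 1] - xp[t + 1])
    return xs

def fill_short_gaps(y, threshold):
    """Fill only NaN runs shorter than `threshold` steps with smoothed values."""
    y = np.asarray(y, dtype=float)
    smoothed = kalman_smooth_local_level(y)
    out, isnan, t = y.copy(), np.isnan(y), 0
    while t < len(y):
        if isnan[t]:
            end = t
            while end < len(y) and isnan[end]:
                end += 1
            if end - t < threshold:
                out[t:end] = smoothed[t:end]
            t = end
        else:
            t += 1
    return out

y = np.array([10.0, 10.2, np.nan, 10.6, np.nan, np.nan, np.nan, np.nan, 11.0])
out = fill_short_gaps(y, threshold=3)
print(out)
```

In this example the single missing step is filled from the smoothed state, while the four-step gap stays missing so that it can be handled by the interpolation model.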
Further, the embodiment of the application takes the meteorological temperature observation data of a field observation research station as an example. At this station the automatic observation data are updated once every half hour and the manual observation data are recorded three times per day. The automatic observation data contain both short-time and long-time temperature gaps, the long gap being two months in length.
First, the data are preprocessed: short-time gaps in the temperature data are eliminated by the Kalman smoothing method, so that the preprocessed data contain only one gap, two months in length. Taking the day as the unit, the temperature time series of each day is denoted d. A day with no missing values in either the manual or the machine observations is denoted
Figure BDA0003193718530000121
and a day that contains only the three manual observations and no machine observations is denoted
Figure BDA0003193718530000122
Segmented by day, the time series of meteorological data is represented as:
Figure BDA0003193718530000123
This sequence represents a temperature time series n days long with a missing-value window m days wide. The sequence has a data length of 48n and contains a data gap of length 48m. To indicate the locations of the missing values in the sequence, for the half-hourly automatic observation temperature time series of length L (i.e., 48n)
Figure BDA0003193718530000124
a mask time series of the same length L is constructed
Figure BDA0003193718530000125
wherein the mask sequence is represented as follows:
Figure BDA0003193718530000126
Taking the day as the unit, the half-hourly mask sequence of length L is segmented. The mask segment of a day with no missing values is denoted
Figure BDA0003193718530000131
and the mask segment of a day containing only the three manual observations is denoted
Figure BDA0003193718530000132
This establishes a day-segmented mask sequence corresponding to the time series of meteorological data:
Figure BDA0003193718530000133
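The mask construction and day-wise segmentation can be sketched as follows. The mask definition itself is given only as a figure in the source, so the convention used here (1 for an observed value, 0 for a missing one) is an assumption, as are the helper names `build_mask` and `split_by_day`.

```python
from typing import List, Optional

SAMPLES_PER_DAY = 48  # half-hourly automatic observations


def build_mask(series: List[Optional[float]]) -> List[int]:
    """Return 1 where a value is observed and 0 where it is missing (assumed convention)."""
    return [0 if v is None else 1 for v in series]


def split_by_day(seq: list, per_day: int = SAMPLES_PER_DAY) -> List[list]:
    """Cut a sequence of length 48n into n daily segments."""
    if len(seq) % per_day != 0:
        raise ValueError("sequence length must be a multiple of per_day")
    return [seq[i:i + per_day] for i in range(0, len(seq), per_day)]


# Three days of data in which the second day is entirely missing.
series = [1.0] * 48 + [None] * 48 + [2.0] * 48
daily_masks = split_by_day(build_mask(series))
```

Here `daily_masks` holds one 48-element mask per day, so the temperature segments and the mask segments stay aligned day by day.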
A rolling-window method is used to construct the sample set for training the meteorological data interpolation model from the day-segmented time series. To interpolate a gap of length m (days), a rolling sample window longer than m is constructed, with observation data of length s (days) retained on each side of the gap, so that the rolling window length is w = m + 2s days. The model data set is constructed following the Seq2Seq training approach, where the temperature observation sequence in a training input sample of length w is:
Figure BDA0003193718530000134
Training produces the following output time series:
Figure BDA0003193718530000135
In the above sequence,
Figure BDA0003193718530000136
is the interpolated value for a missing day, that is, a complete daily segment of half-hourly automatic observation values. When a training sample is constructed, a mask sequence of length w (days) is built from the day-wise correspondence between the observations and the mask sequence and is used as the second input of the sample.
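The window arithmetic w = m + 2s can be made concrete with a small sketch. It is a simplification under stated assumptions: the hidden days are zero-filled completely, whereas the text keeps the three daily manual observations inside the gap; the function names and the tuple layout are hypothetical.

```python
from typing import List, Tuple


def rolling_windows(n_days: int, m: int, s: int) -> List[Tuple[int, int, int, int]]:
    """Enumerate windows of w = m + 2*s days over n_days fully observed days.

    Each tuple is (start, gap_start, gap_end, end) in day indices, end-exclusive:
    days [start, gap_start) and [gap_end, end) are kept as context, and
    days [gap_start, gap_end) are the artificially hidden target.
    """
    w = m + 2 * s
    return [(d, d + s, d + s + m, d + w) for d in range(0, n_days - w + 1)]


def make_sample(days: List[List[float]], window: Tuple[int, int, int, int]):
    """Build (inputs, mask, target) for one window; hidden days are zero-filled in the input."""
    start, gap_start, gap_end, end = window
    inputs, mask = [], []
    for d in range(start, end):
        hidden = gap_start <= d < gap_end
        inputs.append([0.0] * len(days[d]) if hidden else list(days[d]))
        mask.append([0 if hidden else 1] * len(days[d]))
    target = [list(days[d]) for d in range(start, end)]
    return inputs, mask, target
```

With m = 30 and s = 14 the window is 58 days long, so 100 fully observed days yield 43 overlapping samples when the window is advanced one day at a time.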
In the embodiment of the present application, using the example data, the observation data to the left of the one-month missing-value window are used to construct the training set, and the observation data to its right are used to construct the verification set. Meanwhile, to verify the generalization ability of the model, two test samples are constructed, one with a missing-value window of 30 days and the other with a missing-value window of 60 days; in both samples, 14 days of continuous observations are retained before and after the missing-value window.
After the model data set is constructed, the model can be built with the PyTorch deep learning framework. The training data set and the verification data set are used to train the model, and the test data set is used to test its generalization ability. Specifically, the model trained with a 30-day missing-value window is used to interpolate the temperature observation time series with a 60-day gap, and the model trained with a 60-day missing-value window is then used to interpolate the temperature observation time series with a 30-day gap.
Finally, the two-month missing data segment is arranged according to the input data structure and fed into the trained and tested model for computation, yielding the complete air temperature time series.
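The final step can be sketched as a merge of observed values with model output. The source does not describe how the output is combined with the observations, so keeping observed values untouched and using model output only at masked positions is an assumption; `predict` is a hypothetical stand-in for the trained model, not an API from the source.

```python
from typing import Callable, List, Optional


def fill_long_gap(
    series: List[Optional[float]],
    predict: Callable[[List[float], List[int]], List[float]],
) -> List[float]:
    """Replace missing entries with model output, keeping observed values as they are.

    `predict` is a hypothetical stand-in for the trained model: it receives the
    zero-filled series and its 0/1 mask and returns one value per time step.
    """
    mask = [0 if v is None else 1 for v in series]
    zero_filled = [0.0 if v is None else v for v in series]
    output = predict(zero_filled, mask)
    if len(output) != len(series):
        raise ValueError("model output must have the same length as the input")
    return [x if m == 1 else y for x, m, y in zip(zero_filled, mask, output)]
```

Any callable with the same shape can be substituted for testing, for example a dummy predictor that returns a constant for every time step.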
Referring to fig. 3, based on the same inventive concept, an embodiment of the present application further provides a meteorological data interpolation device 30. Fig. 3 is a schematic diagram of the meteorological data interpolation device provided in the embodiment of the present application. The device includes:
a data acquisition module 31, configured to obtain a missing air temperature segment based on the acquired automatic observation data; and
a data interpolation module 32, configured to interpolate the missing air temperature segment according to the trained meteorological data interpolation model, using the acquired manual observation data as true values, so as to obtain complete high-frequency time series meteorological data.
Optionally, the meteorological data interpolation device 30 may further include:
an initial model building module, configured to build a meteorological data interpolation initial model before the missing air temperature segment data are input into the meteorological data interpolation model; and
a model training module, configured to train the meteorological data interpolation initial model based on a model data set to obtain the meteorological data interpolation model, where the model data set includes a training data set, a verification data set and a test data set, the training data set and the verification data set are used to train the meteorological data interpolation initial model, and the test data set is used to verify the generalization capability of the trained meteorological data interpolation initial model. The meteorological data interpolation initial model may be a BiLSTM-I model, and constructing the meteorological data interpolation initial model includes constructing it based on Seq2Seq and the Encoder-Decoder architecture.
Optionally, the model training module may be further configured to form the encoding portion of the meteorological data interpolation initial model based on a bidirectional LSTM-I neural network, where the output of the encoding portion includes a forward hidden state sequence and a backward hidden state sequence, and the two sequences are concatenated to form the output sequence of the encoding portion, so that the decoding portion receives this output sequence and generates the interpolation result sequence.
Optionally, the data acquisition module 31 may be further configured to:
perform data interpolation by Kalman smoothing on data gaps shorter than a preset time threshold in the automatic observation data and the manual observation data, to obtain the missing air temperature segment data.
In a third aspect, an embodiment of the present application further provides an electronic device including a memory and a processor, where the memory stores program instructions, and the processor performs the steps of any of the foregoing implementations when it reads and runs the program instructions.
In a fourth aspect, an embodiment of the present application further provides a readable storage medium storing computer program instructions which, when read and run by a processor, perform the steps of any of the foregoing implementations.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
For example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The storage medium may be a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), or another medium capable of storing program code. The storage medium is used for storing a program, and the processor executes the program after receiving an execution instruction.
In addition, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
It should be noted that the functions, if implemented in the form of software functional modules and sold or used as independent products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, or the portion of it that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave).
In this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A meteorological data interpolation method, characterized by comprising:
obtaining a missing air temperature segment based on the obtained automatic observation data;
and interpolating the missing air temperature segment according to a trained meteorological data interpolation model, using the obtained manual observation data as true values, to obtain complete high-frequency time series meteorological data.
2. The method of claim 1, wherein prior to said interpolating the missing air temperature segment according to the trained meteorological data interpolation model, the method further comprises:
constructing a meteorological data interpolation initial model;
training the meteorological data interpolation initial model based on a model data set to obtain a meteorological data interpolation model, wherein the model data set comprises a training data set, a verification data set and a test data set, the training data set and the verification data set are used for training the meteorological data interpolation initial model, and the test data set is used for verifying the generalization capability of the trained meteorological data interpolation initial model; the meteorological data interpolation initial model is a BiLSTM-I model, and the constructing of the meteorological data interpolation initial model comprises:
constructing the meteorological data interpolation initial model based on Seq2Seq and the Encoder-Decoder architecture.
3. The method of claim 2, wherein the meteorological data interpolation initial model comprises an encoding part, the structure of the encoding part is an LSTM-I structure, and the formulas describing the internal unit connection process of the LSTM-I structure are as follows:
Figure FDA0003193718520000011
Figure FDA0003193718520000012
Figure FDA0003193718520000013
Figure FDA0003193718520000014
wherein
Figure FDA0003193718520000021
is the estimated vector, h_{t-1} is the hidden state of the previous LSTM cell, W_x and b_x are model parameters,
Figure FDA0003193718520000022
is the missing value, x_t is the input vector, m_t is the mask vector, h_t is the predicted state, and l_t is the estimation error.
4. The method of claim 3, wherein the meteorological data interpolation initial model further comprises a decoding part, and the formulas describing the decoding process of the decoding part are as follows:
s_t = LSTM(h_t, s_{t-1})
y_t = W_y s_t + b_y
Figure FDA0003193718520000023
wherein s is the output state sequence, y is the output interpolation result sequence, W_y and b_y are model parameters, and l_y is the interpolation result error of the decoding layer.
5. The method of claim 4, wherein the constructing the meteorological data interpolation initial model comprises:
forming the encoding part of the meteorological data interpolation initial model based on a bidirectional LSTM-I neural network, wherein the output of the encoding part comprises a forward hidden state sequence and a backward hidden state sequence;
and concatenating the forward hidden state sequence and the backward hidden state sequence to form the output sequence of the encoding part, so that the decoding part receives the output sequence and generates the output interpolation result sequence.
6. The method of claim 5, wherein the formula for calculating the estimation error of the meteorological data interpolation initial model is as follows:
Figure FDA0003193718520000024
wherein
Figure FDA0003193718520000025
is the estimation error corresponding to the forward hidden state sequence, and
Figure FDA0003193718520000026
is the estimation error corresponding to the backward hidden state sequence.
7. The method according to claim 1, wherein said deriving a missing air temperature segment based on the obtained automatic observation data comprises:
performing data interpolation by Kalman smoothing on data gaps in the automatic observation data that are shorter than a preset time threshold, to obtain the missing air temperature segment.
8. A meteorological data interpolation device, comprising:
a data acquisition module, configured to obtain a missing air temperature segment based on the acquired automatic observation data;
and a data interpolation module, configured to interpolate the missing air temperature segment according to the trained meteorological data interpolation model, using the acquired manual observation data as true values, so as to obtain complete high-frequency time series meteorological data.
9. An electronic device comprising a memory and a processor, the memory having program instructions stored therein, wherein the processor, when executing the program instructions, performs the steps of the method of any one of claims 1 to 7.
10. A storage medium having stored thereon computer program instructions for executing the steps of the method according to any one of claims 1 to 7 when executed by a processor.
CN202110884988.2A 2021-08-03 2021-08-03 Meteorological data interpolation method, meteorological data interpolation device, electronic equipment and storage medium Pending CN113569972A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110884988.2A CN113569972A (en) 2021-08-03 2021-08-03 Meteorological data interpolation method, meteorological data interpolation device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110884988.2A CN113569972A (en) 2021-08-03 2021-08-03 Meteorological data interpolation method, meteorological data interpolation device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113569972A true CN113569972A (en) 2021-10-29

Family

ID=78170152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110884988.2A Pending CN113569972A (en) 2021-08-03 2021-08-03 Meteorological data interpolation method, meteorological data interpolation device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113569972A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114116742A (en) * 2021-11-18 2022-03-01 佳都科技集团股份有限公司 Method and device for filling time sequence data based on subway comprehensive monitoring system
CN114169500A (en) * 2021-11-30 2022-03-11 电子科技大学 Neural network model processing method based on small-scale electromagnetic data
CN114911788A (en) * 2022-07-15 2022-08-16 中国长江三峡集团有限公司 Data interpolation method and device and storage medium
CN115345279A (en) * 2022-08-10 2022-11-15 中国电信股份有限公司 Multi-index abnormality detection method and device, electronic equipment and storage medium
CN115935139A (en) * 2023-01-09 2023-04-07 吉林大学 Space field interpolation method for ocean observation data
CN116362915A (en) * 2023-05-31 2023-06-30 深圳市峰和数智科技有限公司 Method and device for supplementing and aligning meteorological data of photovoltaic power station and related equipment
CN116992221A (en) * 2023-07-31 2023-11-03 武汉天翌数据科技发展有限公司 Fault detection method, device and equipment of operation and maintenance platform and storage medium
CN117609706A (en) * 2023-10-20 2024-02-27 北京师范大学 Method for interpolating data of carbon water flux
CN118394799A (en) * 2024-05-15 2024-07-26 成都虚谷伟业科技有限公司 Missing data intelligent matching method of time sequence database

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107577649A (en) * 2017-09-26 2018-01-12 广州供电局有限公司 The interpolation processing method and device of missing data
CN108090558A (en) * 2018-01-03 2018-05-29 华南理工大学 A kind of automatic complementing method of time series missing values based on shot and long term memory network
CN111967509A (en) * 2020-07-31 2020-11-20 北京赛博星通科技有限公司 Method and device for processing and detecting data acquired by industrial equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
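NAN JIANG et al.: "BiLSTM-A: A missing value imputation method for PM2.5 prediction", 《2020 2ND INTERNATIONAL CONFERENCE ON APPLIED MACHINE LEARNING (ICAML)》 *
YANG Long et al.: "Short-term load forecasting with a Bi-LSTM network considering feature selection in new-energy power grids" (新能源电网中考虑特征选择的Bi-LSTM网络短期负荷预测), 《Automation of Electric Power Systems》 (电力系统自动化) *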

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114116742A (en) * 2021-11-18 2022-03-01 佳都科技集团股份有限公司 Method and device for filling time sequence data based on subway comprehensive monitoring system
CN114116742B (en) * 2021-11-18 2023-08-08 佳都科技集团股份有限公司 Time sequence data filling method and device based on subway integrated monitoring system
CN114169500B (en) * 2021-11-30 2023-04-18 电子科技大学 Neural network model processing method based on small-scale electromagnetic data
CN114169500A (en) * 2021-11-30 2022-03-11 电子科技大学 Neural network model processing method based on small-scale electromagnetic data
CN114911788A (en) * 2022-07-15 2022-08-16 中国长江三峡集团有限公司 Data interpolation method and device and storage medium
CN114911788B (en) * 2022-07-15 2022-09-27 中国长江三峡集团有限公司 Data interpolation method and device and storage medium
CN115345279A (en) * 2022-08-10 2022-11-15 中国电信股份有限公司 Multi-index abnormality detection method and device, electronic equipment and storage medium
CN115345279B (en) * 2022-08-10 2024-03-29 中国电信股份有限公司 Multi-index anomaly detection method and device, electronic equipment and storage medium
CN115935139A (en) * 2023-01-09 2023-04-07 吉林大学 Space field interpolation method for ocean observation data
CN116362915A (en) * 2023-05-31 2023-06-30 深圳市峰和数智科技有限公司 Method and device for supplementing and aligning meteorological data of photovoltaic power station and related equipment
CN116362915B (en) * 2023-05-31 2023-08-15 深圳市峰和数智科技有限公司 Method and device for supplementing and aligning meteorological data of photovoltaic power station and related equipment
CN116992221A (en) * 2023-07-31 2023-11-03 武汉天翌数据科技发展有限公司 Fault detection method, device and equipment of operation and maintenance platform and storage medium
CN116992221B (en) * 2023-07-31 2024-03-26 武汉天翌数据科技发展有限公司 Fault detection method, device and equipment of operation and maintenance platform and storage medium
CN117609706A (en) * 2023-10-20 2024-02-27 北京师范大学 Method for interpolating data of carbon water flux
CN117609706B (en) * 2023-10-20 2024-06-04 北京师范大学 Method for interpolating data of carbon water flux
CN118394799A (en) * 2024-05-15 2024-07-26 成都虚谷伟业科技有限公司 Missing data intelligent matching method of time sequence database

Similar Documents

Publication Publication Date Title
CN113569972A (en) Meteorological data interpolation method, meteorological data interpolation device, electronic equipment and storage medium
Todini A model conditional processor to assess predictive uncertainty in flood forecasting
US10026221B2 (en) Wetland modeling and prediction
CN111458661A (en) Power distribution network line variation relation diagnosis method, device and system
CN112462261B (en) Motor abnormality detection method and device, electronic equipment and storage medium
CN113948159B (en) Fault detection method, device and equipment for transformer
CN115471625A (en) Cloud robot platform big data intelligent decision method and system
CN116777452B (en) Prepayment system and method for intelligent ammeter
CN115935139A (en) Space field interpolation method for ocean observation data
Bousquet et al. Detecting and correcting underreported catches in fish stock assessment: trial of a new method
CN114819289A (en) Prediction method, training method, device, electronic device and storage medium
CN116595493A (en) Non-idealized flexible cable axial tension demodulation method and system based on LSTM network
CN112215495B (en) Pollution source contribution calculation method based on long-time and short-time memory neural network
Keitel et al. Selecting creep models using Bayesian methods
CN115063337A (en) Intelligent maintenance decision-making method and device for buried pipeline
CN116859317A (en) Method and device for predicting metering error of capacitive voltage transformer
CN116609852A (en) Underground medium parameter high-precision modeling method and equipment for well-seismic fusion
CN115952916A (en) Artificial intelligence-based wind power prediction error correction method, device and equipment
CN116228132A (en) Data management method and device of RM system, electronic equipment and medium
Weber et al. Invited commentary: themes and issues from the workshop Operational river flow and water supply forecasting
CN113435927A (en) User intention prediction method, device, equipment and storage medium
RU2714612C1 (en) Method of identifying nonlinear systems
CN112329983B (en) Data processing method and device
CN118364270B (en) Domain generalization modeling method, device, equipment and medium for industrial equipment health prediction
CN117312765A (en) Method for bidirectional processing of missing data by ARIMA-LSTM model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211029
