CN116933125A - Time series data prediction method, device, equipment and storage medium - Google Patents
- Publication number
- CN116933125A (application CN202310721940.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- fusion
- feature
- global
- local
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2123/00—Data types
- G06F2123/02—Data types in the time domain, e.g. time-series data
Abstract
The application discloses a time series data prediction method, device, equipment and storage medium. The method comprises: acquiring a time sequence to be processed, where the time sequence to be processed is used to characterize operation data of an object to be predicted at a plurality of detection times; performing feature extraction processing on the time sequence to be processed through a prediction model to obtain local feature data, global feature data and fusion feature data of the time sequence; and performing feature coding processing on the local feature data, the global feature data and the fusion feature data by using the prediction model to obtain a coding result, then performing classification prediction on the coding result to obtain a prediction result of the object to be predicted. With this scheme, the short-range period information and long-range period information of the time sequence to be processed can be captured at a finer granularity, so that downstream tasks are predicted by combining more comprehensive time series representation information, greatly improving prediction accuracy.
Description
Technical Field
The present application relates generally to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for predicting time series data.
Background
With the rapid development of information technology, industrial enterprises can collect large amounts of data through Internet-of-Things technology and process it in real time during production and operation. Such data are sequences recorded in time order, characterized by time stamps, structure, append-only writes, and a single data source, and time series data processing is widely applied in smart cities, the Internet of Things, the Internet of Vehicles, the Industrial Internet, and other fields. To make better use of time series data in applications such as electric power monitoring, mechanical equipment fault detection, and intelligent operation and maintenance of equipment in the energy industry, prediction processing based on time series data is very important.
At present, in the related art, the original time sequence can be decomposed to extract time series features, and a prediction result is obtained by performing prediction on those features. In real scenarios, however, periodic and trend characteristics are usually superimposed in the original time sequence, so the time series information obtained by a single decomposition and extraction is rather one-sided, and the accuracy of downstream-task prediction based on the time series data is low.
Disclosure of Invention
In view of the foregoing drawbacks or shortcomings in the prior art, it is desirable to provide a time series data prediction method, apparatus, device, and storage medium.
In a first aspect, the present invention provides a method for predicting time-series data, the method comprising:
acquiring a time sequence to be processed; the time sequence to be processed is used for representing operation data of an object to be predicted at a plurality of detection times;
performing feature extraction processing on the time sequence to be processed through a prediction model to obtain local feature data, global feature data and fusion feature data of the time sequence to be processed;
and carrying out feature coding processing on the local feature data, the global feature data and the fusion feature data by using the prediction model to obtain a coding result, and carrying out classified prediction on the coding result to obtain a prediction result of the object to be predicted.
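The three claimed steps form a fixed pipeline: feature extraction, fusion, feature coding, then classification prediction. As a minimal sketch, assuming each stage is a pluggable callable (all the toy stand-ins below are hypothetical, not the patent's trained modules):

```python
import numpy as np

def predict(series, extract_local, extract_global, fuse, encode, classify):
    """Sketch of the claimed pipeline: feature extraction -> fusion ->
    feature coding -> classification prediction."""
    local = extract_local(series)          # local feature data
    glob = extract_global(series)          # global feature data
    fused = fuse(local, glob)              # fusion feature data
    code = encode(local, glob, fused)      # coding result
    return classify(code)                  # prediction result

# toy stand-ins: operation data of one object at 12 detection times
series = np.linspace(0.0, 1.0, 12)
result = predict(
    series,
    extract_local=lambda s: s.reshape(3, 4),     # consecutive windows
    extract_global=lambda s: s.reshape(4, 3).T,  # interval-sampled view
    fuse=lambda l, g: np.concatenate([l.ravel(), g.ravel()]),
    encode=lambda l, g, f: np.concatenate([l.ravel(), g.ravel(), f]),
    classify=lambda c: int(c.mean() > 0.5),      # binary prediction
)
```

Each lambda stands in for a learned component; in the patent these roles are filled by the first/second feature extraction modules, the feature fusion module, the feature coding module, and the classification network.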
In one embodiment, the prediction model includes a first feature extraction module, a second feature extraction module and a feature fusion module, and the feature extraction processing is performed on the time sequence to be processed through the prediction model to obtain local feature data, global feature data and fusion feature data of the time sequence to be processed, including:
performing continuous sampling processing on the time sequence to be processed through the first feature extraction module to obtain a plurality of continuous fragments, and taking the plurality of continuous fragments as the local feature data;
performing interval sampling processing on the time sequence to be processed through the second feature extraction module to obtain a plurality of discontinuous fragments, and taking the plurality of discontinuous fragments as the global feature data;
and performing fusion processing on the local feature data and the global feature data through the feature fusion module to obtain the fusion feature data.
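Read literally, the two sampling schemes differ only in how indices are chosen: continuous sampling takes consecutive windows, while interval sampling takes strided, discontinuous sub-sequences. A rough illustration (the window and stride sizes are arbitrary assumptions, not values from the patent):

```python
import numpy as np

def continuous_fragments(series, window):
    """Consecutive, non-overlapping windows -> local feature data."""
    n = len(series) // window
    return [series[i * window:(i + 1) * window] for i in range(n)]

def interval_fragments(series, stride):
    """Every stride-th point, one sub-sequence per offset -> global feature data."""
    return [series[offset::stride] for offset in range(stride)]

series = np.arange(12)
local = continuous_fragments(series, window=4)   # [0..3], [4..7], [8..11]
global_ = interval_fragments(series, stride=4)   # [0,4,8], [1,5,9], ...
```

The continuous fragments preserve short-range neighborhoods, while each strided fragment spans the full sequence, which is what lets the second module see long-range periodicity.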
In one embodiment, the feature coding processing is performed on the local feature data, the global feature data and the fusion feature data by using the prediction model to obtain a coding result, including:
performing time projection processing on the local feature data, the global feature data and the fusion feature data in parallel to obtain the converted time dimension information of each;
performing channel projection processing based on the converted time dimension information of the local feature data, the global feature data and the fusion feature data to obtain the converted channel dimension information;
performing dimension reduction processing on the converted time dimension information and the converted channel dimension information through a bottleneck layer to obtain dimension-reduced data;
and performing linear mapping processing on the dimension-reduced data to obtain a local coding result, a global coding result and a fusion coding result.
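The four coding steps chain straightforwardly. The sketch below uses random matrices in place of learned weights, and the dimension names (`d_time`, `d_bottleneck`, `d_out`) are invented purely to show the shape flow through time projection, channel projection, bottleneck reduction, and linear mapping:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(features, d_time=8, d_bottleneck=4, d_out=16):
    """Hypothetical coding sketch; weights are random stand-ins for
    learned parameters of the feature coding module."""
    T, C = features.shape
    W_time = rng.normal(size=(T, d_time))       # time-dimension projection
    t_proj = features.T @ W_time                # (C, d_time)
    W_chan = rng.normal(size=(C, C))            # channel-dimension projection
    c_proj = W_chan @ t_proj                    # (C, d_time)
    W_bottle = rng.normal(size=(d_time, d_bottleneck))
    reduced = c_proj @ W_bottle                 # bottleneck dimension reduction
    W_map = rng.normal(size=(d_bottleneck, d_out))
    return (reduced @ W_map).mean(axis=0)       # linear mapping -> (d_out,)

feat = rng.normal(size=(30, 3))  # 30 detection times, 3 channels
code = encode(feat)
```

The same function would be applied in parallel to the local, global, and fusion feature data to produce the three coding results.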
In one embodiment, the training process of the prediction model includes the following steps:
acquiring a sample time sequence; the sample time sequence is labeled with a history labeling result;
performing data enhancement processing twice on the sample time sequence to obtain two sets of enhancement data;
performing feature extraction and coding processing on the two groups of enhancement data through an initial feature extraction network of an initial prediction model to obtain two groups of sample local coding information, two groups of sample global coding information, local fusion coding information and global fusion coding information;
and performing iterative training on an initial classification network of an initial prediction model based on the two groups of sample local coding information, the two groups of sample global coding information, the local fusion coding information, the global fusion coding information and the history labeling result to obtain the prediction model.
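The two enhancement passes simply apply the same stochastic augmentation twice, yielding two correlated views of one sample. A common choice for time series, jitter plus random scaling, might look like the following; the augmentation family itself is an assumption, as the patent does not specify one here:

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(series):
    """One hypothetical enhancement pass: random jitter plus random scaling."""
    jitter = rng.normal(scale=0.05, size=series.shape)
    scale = rng.normal(loc=1.0, scale=0.1)
    return series * scale + jitter

sample = np.sin(np.linspace(0.0, 6.28, 50))       # one sample time sequence
view_1, view_2 = augment(sample), augment(sample)  # two sets of enhancement data
```

The two views then flow through the shared feature extraction network, and the contrast loss pulls their encodings together.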
In one embodiment, performing feature extraction processing and encoding processing on the two sets of enhancement data to obtain two sets of sample local coding information, two sets of sample global coding information, local fusion coding information and global fusion coding information includes:
performing feature extraction processing on the two groups of enhancement data respectively through a first coding module in the initial feature extraction network to obtain two groups of sample local features, and in parallel performing feature extraction processing on the two groups of enhancement data through a second coding module in the initial feature extraction network to obtain two groups of sample global features;
performing fusion processing on the two groups of sample local features and the two groups of sample global features to obtain local fusion features and global fusion features;
and performing encoding processing through a feature coding module in the feature extraction network based on the two groups of sample local features, the two groups of sample global features, the local fusion features and the global fusion features to obtain the two groups of sample local coding information, the two groups of sample global coding information, the local fusion coding information and the global fusion coding information.
In one embodiment, performing iterative training on an initial classification network of an initial prediction model based on the two sets of sample local coding information, the two sets of sample global coding information, the local fusion coding information, the global fusion coding information and the history labeling result to obtain the prediction model, including:
constructing a contrast loss function based on the two groups of sample local coding information, the two groups of sample global coding information, the local fusion coding information and the global fusion coding information, and iteratively training parameters of the first feature extraction module, the second feature extraction module, the feature fusion module and the feature coding module by minimizing the contrast loss function to obtain the feature extraction network of the prediction model;
and performing iterative training on the initial classification network based on the feature extraction network and the history labeling result to obtain the prediction model.
In one embodiment, performing iterative training on the initial classification network based on the feature extraction network and the history labeling result to obtain the prediction model includes:
performing feature extraction processing on the sample time sequence through a first feature extraction module and a second feature extraction module in the feature extraction network to obtain local data and global data;
performing fusion processing on the local data and the global data through the feature fusion module to obtain fusion data, and performing encoding processing on the fusion data, the local data and the global data through the feature coding module in the feature extraction network to obtain a fusion result;
performing classification prediction on the fusion result through the initial classification network to obtain an output result;
and constructing a classification loss function based on the output result and the history labeling result, iteratively training parameters of the initial classification network by minimizing the classification loss function to obtain the classification network of the prediction model, and constructing the prediction model based on the feature extraction network and the classification network.
In one embodiment, the contrast loss function includes a first component, a second component, and a third component;
the first component is used for representing the difference loss between the two groups of sample local coding information;
the second component is used for representing the difference loss between the two groups of sample global coding information;
the third component is used to characterize a loss of difference between the local fusion encoded information and the global fusion encoded information.
In a second aspect, an embodiment of the present application provides a time-series data prediction apparatus, including:
the acquisition module is used for acquiring a time sequence to be processed; the time sequence to be processed is used for representing operation data of an object to be predicted at a plurality of detection times;
the feature extraction module is used for carrying out feature extraction processing on the time sequence to be processed through a prediction model to obtain local feature data, global feature data and fusion feature data of the time sequence to be processed;
and the classification prediction module is used for performing feature coding processing on the local feature data, the global feature data and the fusion feature data by using the prediction model to obtain a coding result, and performing classification prediction on the coding result to obtain a prediction result of the object to be predicted.
In a third aspect, an embodiment of the present application provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the time series data prediction method of the first aspect when executing the program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium having stored thereon a computer program for implementing the time-series data prediction method of the above first aspect.
According to the time series data prediction method, device, equipment and storage medium, a time sequence to be processed is acquired, feature extraction processing is performed on it through a prediction model to obtain local feature data, global feature data and fusion feature data, feature coding processing is then performed on these data through the prediction model to obtain coding results, and classification prediction is performed on the coding results to obtain the prediction result of the object to be predicted. Compared with the prior art, on the one hand, this technical scheme can extract local, global and fusion feature data of the time sequence from both global and local perspectives, so its short-range and long-range period information can be captured at a finer granularity, providing more accurate guidance for subsequent classification prediction. On the other hand, by encoding and classifying the local, global and fusion feature data, the prediction result is determined from more comprehensive features, so the method can be adapted to different downstream tasks such as equipment fault diagnosis, outlier detection, and natural gas or power load prediction, improving prediction accuracy.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
FIG. 1 is a schematic diagram of an implementation environment of a time-series data prediction method according to an embodiment of the present application;
fig. 2 is a flowchart of a time series data prediction method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a time-series data prediction structure according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating a method for determining a coding result according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a feature encoding module according to an embodiment of the present application;
FIG. 6 is a flowchart of a method for training a prediction model according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a structure for training a prediction model according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a comparison of calculated speeds of an algorithm of the present application and a baseline model provided by an embodiment of the present application;
fig. 9 is a flowchart of a method for predicting time-series data according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a time-series data prediction apparatus according to an embodiment of the present application;
Fig. 11 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the application are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
It can be understood that in industrial Internet applications, contrastive learning, as a machine learning algorithm, can effectively mine rich information from massive data, construct powerful time series representations, and reduce the high cost of data labeling. Time series data, numerical sequences formed by arranging index values of some phenomenon in time order, can be applied in different scenarios; in time series analysis it is typically used for continuous-sequence prediction problems, predicting future values of a sequence from the numerical regularities the sequence has already exhibited. For example: predicting the market index level on the next trading day in the financial field; predicting future weather conditions; predicting sales of a commodity at the next moment; predicting changes in a movie's box office.
At present, in the related art, the original time sequence can be decomposed to extract time series features, and a prediction result is obtained by performing prediction processing on those features. In real scenarios, however, periodic and trend characteristics are usually superimposed in the original time sequence, so the time series information obtained by a single decomposition and extraction is rather one-sided, and the accuracy of downstream-task prediction based on the time series data is low.
To address these defects, the application provides a time series data prediction method, device, equipment and storage medium. Compared with the prior art, on the one hand, the technical scheme can extract local feature data, global feature data and fusion feature data of the time sequence to be processed from both global and local perspectives, so that its short-range and long-range period information can be captured at a finer granularity, providing more accurate guidance for subsequent classification prediction. On the other hand, by performing coding processing and classification prediction on the local, global and fusion feature data, the prediction result is determined from more comprehensive features, so the method can be adapted to different downstream tasks such as equipment fault diagnosis, outlier detection, and natural gas or power load prediction, improving prediction accuracy.
Fig. 1 is a schematic diagram of an implementation environment of a time-series data prediction method according to an embodiment of the present application. As shown in fig. 1, the implementation environment architecture includes: a terminal 100 and a server 200.
In the field of time series data prediction, the prediction processing of time series data may be performed at the terminal 100 or at the server 200. For example, after the terminal 100 obtains the time sequence to be processed, prediction processing can be performed locally by the terminal 100 to obtain the prediction result of the object to be predicted; alternatively, the time sequence to be processed may be sent to the server 200, so that the server 200 performs the prediction processing to obtain the prediction result and then sends it back to the terminal 100.
In addition, the terminal 100 may display an application interface through which the time sequence to be processed uploaded by a user can be acquired, or through which the uploaded time sequence can be transmitted to the server 200.
Alternatively, the terminal 100 may be a terminal device in various AI application scenarios. For example, the terminal 100 may be a smart home device such as a smart television or a smart television set-top box, a mobile portable terminal such as a smartphone, a tablet computer, or an e-book reader, or a smart wearable device such as smart glasses or a smart watch; this embodiment is not limited thereto.
The server 200 may be a server device that provides background services for the AI application installed on the terminal 100 described above. The server 200 may be a single server, a server cluster or distributed system formed by a plurality of servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), big data, and artificial intelligence platforms.
A communication connection is established between the terminal 100 and the server 200 through a wired or wireless network. Alternatively, the wireless network or wired network described above uses standard communication techniques and/or protocols. The network is typically the Internet, but may be any network including, but not limited to, a local area network (Local Area Network, LAN), metropolitan area network (Metropolitan Area Network, MAN), wide area network (Wide Area Network, WAN), a mobile, wired or wireless network, a private network, or any combination of virtual private networks.
For easy understanding and explanation, the method, apparatus, device and storage medium for predicting time-series data according to the embodiments of the present application are described in detail below with reference to fig. 2 to 11.
Fig. 2 is a flowchart of a method for predicting time-series data according to an embodiment of the present application, where the method may be performed by a computer device, and the computer device may be the server 200 or the terminal 100 in the system shown in fig. 1, or the computer device may be a combination of the terminal 100 and the server 200. As shown in fig. 2, the method includes:
s101, acquiring a time sequence to be processed; the temporal sequence to be processed is used to characterize the operational data of the object to be predicted at a plurality of detection times.
It should be noted that the time sequence to be processed consists of data collected at different times and describes how a phenomenon changes over time; such data reflect the state, or degree of change, of a certain thing or phenomenon as time passes. The object to be predicted refers to the main object for which classification prediction is to be performed; the object to be predicted and the corresponding time sequence to be processed differ across downstream task scenarios.
Taking a fault detection scenario as an example, the object to be predicted may be a mechanical device, and the time sequence to be processed refers to operation data of the mechanical device at a plurality of detection times. Taking a power load detection scenario as an example, if the object to be predicted is the power consumption of different households, the time sequence to be processed refers to power load data of those households at a plurality of detection times. Taking a natural gas load detection scenario as an example, if the object to be predicted is the natural gas consumption of different households, the time sequence to be processed refers to natural gas load data of those households at a plurality of detection times; for example, the time sequence to be processed may be natural gas load data collected from a user over 30 days.
In the embodiment of the application, the time sequence to be processed may be acquired by invoking a data acquisition device, obtained from the cloud, retrieved from a database or a blockchain, or imported from an external device.
S102, carrying out feature extraction processing on the time sequence to be processed through a prediction model to obtain local feature data, global feature data and fusion feature data of the time sequence to be processed.
It should be noted that the prediction model is a neural network model whose input is the time sequence to be processed and whose output is the prediction result of the object to be predicted; it has the capability of integrating information over both short-range and long-range periods of the time sequence to be processed. The model parameters of the prediction model are in an optimal (trained) state, so that inputting the time sequence to be processed into the prediction model yields the prediction result of the object to be predicted.
Referring to fig. 3, the prediction model may include a first feature extraction module, which may be a local feature extraction module, a second feature extraction module, which may be a global feature extraction module, and a feature fusion module. The first feature extraction module is used for carrying out local feature extraction processing on the time sequence to be processed to obtain local feature data, and the second feature extraction module is used for carrying out global feature extraction on the time sequence to be processed to obtain global feature data. The local characteristic data are used for representing short-range period characteristics of the to-be-processed time sequence, the global characteristic data are used for representing long-range period characteristics of the to-be-processed time sequence, and the fusion characteristic data are used for representing characteristics of the short-range period and the long-range period fused with the to-be-processed time sequence.
The local feature data, the global feature data and the fusion feature data may be represented in a matrix or vector form.
After the local feature data and the global feature data are obtained, the local feature data and the global feature data can be fused through a feature fusion module. When the local feature data and the global feature data are represented by vectors, information fusion can be performed in a vector combination mode to obtain fusion feature data; when the local feature data and the global feature data are represented by a matrix, information fusion can be performed in a matrix splicing mode, so that corresponding fusion feature data are obtained.
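The fusion step described above can be sketched as follows. This is a minimal NumPy illustration of the "vector combination" and "matrix splicing" fusion the text describes; the function name `fuse_features` and the shapes are hypothetical.

```python
import numpy as np

def fuse_features(local_feat, global_feat):
    """Fuse local and global features by concatenation: vector combination
    for vectors, matrix splicing along the feature axis for matrices."""
    if local_feat.ndim == 1:
        return np.concatenate([local_feat, global_feat])
    return np.concatenate([local_feat, global_feat], axis=-1)

local = np.ones((8, 16))   # e.g. 8 segments x 16-dim local features
globl = np.zeros((8, 16))  # matching global features
fused = fuse_features(local, globl)
print(fused.shape)  # (8, 32)
```

In the vector case, two length-16 vectors would fuse into a single length-32 vector in the same way.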
In the embodiment, the time sequence to be processed is subjected to feature extraction processing through the prediction model to obtain the local feature data, the global feature data and the fusion feature data, and the short-term dependency and the long-term dependency of the time sequence to be processed can be extracted in a finer granularity, so that good data guiding information is provided for the subsequent classification prediction, and the accuracy of predicting the prediction result is improved.
And S103, performing feature coding processing on the local feature data, the global feature data and the fusion feature data by using a prediction model to obtain a coding result, and performing classified prediction on the coding result to obtain a prediction result of the object to be predicted.
The predictive model may also include a feature encoding module. As an implementation manner, the feature encoding module may include a plurality of processing layers, and may sequentially perform feature encoding processing on the local feature data, the global feature data, and the fusion feature data to obtain a local encoding result, a global encoding result, and a fusion encoding result. Taking the processing of the local feature data as an example, the local feature data can be subjected to channel projection and time projection processing through a first processing layer, the time dimension information and the channel dimension information of the local feature data are extracted, and the dimension reduction processing is performed through a second processing layer, so that a local coding result is obtained. And similarly, the global coding result and the fusion coding result can be obtained by adopting the same processing mode as the local characteristic data. The local encoding result, the global encoding result, and the fusion encoding result may be expressed in the form of vectors.
Alternatively, the feature encoding module may be an encoder in a Transformer model, or may be an encoder in a convolutional neural network.
After the local coding result, the global coding result and the fusion coding result are determined, the local coding result, the global coding result and the fusion coding result can be classified and predicted through a classification network, for example, the local coding result, the global coding result and the fusion coding result are processed through a full connection layer and an activation function, and a prediction result of an object to be predicted is obtained.
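The classification step described above — a fully connected layer followed by an activation function over the three encoding results — can be sketched as follows. This is a hypothetical NumPy sketch; the function names, dimensions, and the number of classes are assumptions, not the patent's actual network.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(local_enc, global_enc, fused_enc, W, b):
    """Hypothetical classification head: concatenate the three encoding
    results, apply one fully connected layer, then an activation."""
    x = np.concatenate([local_enc, global_enc, fused_enc])
    return softmax(W @ x + b)

rng = np.random.default_rng(0)
enc = [rng.normal(size=16) for _ in range(3)]    # three 16-dim encodings
W, b = rng.normal(size=(4, 48)), np.zeros(4)     # 4 hypothetical classes
probs = classify(*enc, W, b)
print(probs.shape)  # (4,) — a probability over the candidate classes
```

The index of the largest probability would then identify the predicted type of the object to be predicted.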
It should be noted that, the prediction result of the object to be predicted refers to a processing result obtained by analyzing and extracting the time domain feature data and the frequency domain feature data, and is used for identifying type information of the object to be predicted at a plurality of detection times, so as to quickly obtain type information and attribute characteristics of the time sequence to be processed. Taking the fault detection application scenario as an example, the prediction result includes a fault type of the object to be predicted, or may include a plurality of fault attributes of the object to be predicted under the fault type.
According to the time sequence data prediction method provided by the embodiment of the application, the local feature data, the global feature data and the fusion feature data of the time sequence to be processed are obtained by obtaining the time sequence to be processed and carrying out feature extraction processing on the time sequence to be processed through the prediction model, then the local feature data, the global feature data and the fusion feature data are subjected to feature coding processing by utilizing the prediction model to obtain a coding result, and the coding result is subjected to classification prediction to obtain the prediction result of the object to be predicted. Compared with the prior art, on the one hand, the technical scheme can extract the local feature data, the global feature data and the fusion feature data of the time sequence to be processed from the global and local angles, so that the short-range period information and the long-range period information of the time sequence to be processed can be captured in a finer granularity, and more accurate guiding information is provided for subsequent classification prediction. On the other hand, the prediction result is determined by carrying out coding processing and classification prediction on the local feature data, the global feature data and the fusion feature data and combining more comprehensive features, so that the method can be suitable for different downstream tasks such as equipment fault diagnosis, abnormal value detection, natural gas, power load prediction and the like, and the accuracy of prediction is improved.
In one embodiment, the present embodiment provides a specific implementation manner of performing feature extraction processing on a time sequence to be processed through a prediction model to obtain local feature data, global feature data and fusion feature data of the time sequence to be processed. The method comprises the following steps:
The time sequence to be processed is continuously sampled through the first feature extraction module to obtain a plurality of continuous segments, and the plurality of continuous segments are taken as the local feature data. Interval sampling processing is performed on the time sequence to be processed through the second feature extraction module to obtain a plurality of discontinuous segments, and the plurality of discontinuous segments are taken as the global feature data. Then, the local feature data and the global feature data are fused through the feature fusion module to obtain the fusion feature data.
In particular, the first feature extraction module is used for extracting short-term or local features of the time series to be processed. The time sequence to be processed can be sampled by adopting a continuous sampling strategy, the time sequence to be processed is divided into a plurality of continuous time segments, and the plurality of continuous segments are used as local characteristic data. At this time, each time slice can effectively reserve local information of the time series itself to be processed.
The second feature extraction module is used for extracting long-term or global features of the time series to be processed. An interval sampling strategy can be adopted to sample the time sequence to be processed, for example, sampling once every several time points to obtain a plurality of discontinuous segments, and the plurality of discontinuous segments are used as the global feature data. Each segment is composed of time points sampled uniformly at a designated step length, so each segment can effectively retain the global information or long-range period information of the time series to be processed.
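The two sampling strategies above can be sketched as follows. This is a minimal NumPy illustration; the function names, segment length, and stride are hypothetical choices.

```python
import numpy as np

def continuous_segments(series, seg_len):
    """Continuous sampling: split the series into adjacent segments
    (local feature data)."""
    n = len(series) // seg_len
    return series[: n * seg_len].reshape(n, seg_len)

def interval_segments(series, seg_len, stride):
    """Interval sampling: each segment takes one point every `stride`
    steps (global feature data)."""
    return np.stack([series[i::stride][:seg_len] for i in range(stride)])

x = np.arange(24)
print(continuous_segments(x, 6))   # 4 adjacent segments of length 6
print(interval_segments(x, 6, 4))  # strided segments: [0,4,8,...], [1,5,9,...]
```

Each continuous segment preserves local neighbourhood information, while each strided segment spans the whole sequence and preserves long-range period information.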
After the local feature data and the global feature data are obtained, the local feature data and the global feature data can be fused through a feature fusion module to obtain fused feature data, wherein the fused feature data not only comprises local information of a time sequence to be processed, but also comprises global information of the time sequence to be processed.
In this embodiment, the local feature data, the global feature data, and the fusion feature data of the time series to be processed can be extracted more comprehensively and at a fine granularity through the first feature extraction module, the second feature extraction module, and the feature fusion module of the prediction model, so that guidance information is provided for the subsequent feature encoding processing and the accuracy of the prediction processing is improved.
In one embodiment, a specific implementation manner is provided for performing feature encoding processing on the local feature data, the global feature data and the fusion feature data by using a prediction model to obtain an encoding result, and referring to fig. 4, the method includes:
and S201, carrying out time projection processing on the local feature data, the global feature data and the fusion feature data in parallel to obtain time dimension information after the local feature data, the global feature data and the fusion feature data are converted.
S202, performing channel projection processing based on the local feature data, the global feature data and the time dimension information after the fusion feature data are converted, and obtaining the converted channel dimension information.
S203, performing dimension reduction processing on the converted time dimension information and the converted channel dimension information through a bottleneck layer to obtain dimension reduced data.
S204, performing linear mapping processing on the dimension reduced data to obtain a local coding result, a global coding result and a fusion coding result.
It should be noted that the local feature data, the global feature data, and the fusion feature data may each be represented by a matrix. The time dimension information (temporal dim) refers to the number of time steps and features in the matrix, and the channel dimension information (channel dim) refers to the number of values possessed by each matrix element. For example, a 3×4 matrix has 12 elements; if each element has three values, the matrix is said to have 3 channels, i.e., channels=3. A common example is a color picture, which has three channels: red, green, and blue. As another example, a 3×4×5 matrix with dims=3 and channels=12 means a three-dimensional matrix in which each matrix element has 12 values.
Specifically, as shown in fig. 5, the feature encoder may adopt a Multi-Layer Perceptron (MLP) structure, which includes four parts: channel projection, temporal projection, a bottleneck design, and output projection. Taking the local feature data processed by the feature encoding module as an example, suppose the time dimension (temporal dim) of the local feature data is H and the channel dimension (channel dim) is W. The local feature data is first subjected to temporal projection: information exchange is performed through an affine transformation of the time dimension, so that the converted time dimension information F' is obtained; at this point the local feature data is converted from R^H to R^{F'}, i.e., the time dimension H is converted into the time dimension F'.
Then, the converted time dimension information of the local feature data is subjected to channel projection: information interaction is performed through an affine transformation of the channel dimension to obtain the converted channel dimension information W, with the local feature data lying in R^W along the channel dimension. The converted time dimension information F' and the converted channel dimension information are then subjected to dimension reduction through a bottleneck layer to obtain the corresponding dimension F, so that the dimension-reduced data lies in R^F. Finally, the dimension-reduced data is subjected to output projection (a linear mapping) to obtain the local coding result.
And in the same way, the global characteristic data and the fusion characteristic data can be coded by adopting the same processing mode as the local characteristic data, so that the corresponding global coding result and fusion coding result are obtained.
It will be appreciated that the bottleneck layer described above may use a 1×1 convolution. It is called the bottleneck layer because its shape, narrow in the middle, resembles the neck of a bottle. In this embodiment, the bottleneck layer is placed between two MLPs to reduce the dimension, so as to reduce the parameter count of the model and improve the computational efficiency of the algorithm. Through the linear mapping processing, the learned local and global features can be affine-transformed into specified dimensions, which facilitates the subsequent prediction processing.
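The four-part encoder described above can be sketched as follows. This is a hypothetical NumPy sketch with random (untrained) weights, only meant to show the shape flow H → F' → F through temporal projection, channel projection, bottleneck, and output projection; all names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def affine(x, in_dim, out_dim):
    """One affine (fully connected) transformation with random weights,
    standing in for a learned projection."""
    W = rng.normal(size=(out_dim, in_dim)) * 0.1
    return W @ x

def encode(feat, H, W_dim, F_prime, F):
    """Hypothetical encoder sketch: temporal projection (H -> F'),
    channel projection over the channel dimension, a bottleneck layer
    reducing F' -> F, and an output projection (linear mapping)."""
    x = feat                                                 # (W_dim, H)
    x = np.stack([affine(r, H, F_prime) for r in x])         # temporal proj
    x = np.stack([affine(c, W_dim, W_dim) for c in x.T]).T   # channel proj
    x = np.stack([affine(r, F_prime, F) for r in x])         # bottleneck
    return affine(x.reshape(-1), W_dim * F, F)               # output proj

out = encode(rng.normal(size=(3, 32)), H=32, W_dim=3, F_prime=16, F=8)
print(out.shape)  # (8,)
```

In a trained model the affine weights would of course be learned parameters rather than random draws.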
In the embodiment, the local feature data, the global feature data and the fusion feature data are subjected to feature coding processing, so that the corresponding coding result can be accurately determined, and the accuracy of model prediction is improved.
In one embodiment, the present application further provides a schematic diagram of a training process for training a prediction model, referring to fig. 6, the training process includes the following steps:
S301, acquiring a sample time sequence; the sample time sequence is marked with a history labeling result.
The sample time series is data required for training the prediction model. The sample time series is used to characterize the operational data of the sample object at a plurality of historical detection times. The corresponding sample time sequences are different in different application scenarios. Taking a fault detection scene as an example, the sample object may be a mechanical device, and the sample time sequence refers to operation data of the mechanical device at a plurality of historical detection times; taking the power load detection scenario as an example, if the sample object is a power value, the sample time sequence refers to power value load data of different households at a plurality of historical detection times.
Taking a natural gas load detection scenario as an example, with the collected historical gas data of 2000 industrial and commercial users, the above sample time series may be the detection data within 30 days, the obtained sample time series including 30×24 time points (one detection time point per hour). After the sample time series is obtained, it may be subjected to two rounds of enhancement processing to obtain two sets of enhancement data.
The sample time series data is marked with a historical marking result, the historical marking result can be a result obtained by manually marking the data, taking a device fault detection scene as an example, for example, the historical marking result of the sample time series can be known fault type information.
In this embodiment, the sample time sequence may be first randomly divided into a training set and a verification set according to a certain proportion, and then a prediction model is constructed by using the training set and the verification set according to a training learning algorithm. The training set is used for training the initial prediction model to obtain a trained prediction model, and the verification set is used for verifying the trained prediction model to verify the performance of the prediction model. For example, 80% of the sample time series is selected as the training set, and the remaining 20% of the sample time series is selected as the validation set.
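The random 80/20 split described above can be sketched as follows; the function name and seed are hypothetical.

```python
import numpy as np

def split_samples(samples, train_ratio=0.8, seed=42):
    """Randomly divide sample time series into a training set and a
    validation set according to a given proportion (80/20 here)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    cut = int(len(samples) * train_ratio)
    return [samples[i] for i in idx[:cut]], [samples[i] for i in idx[cut:]]

series = [np.arange(10) + i for i in range(100)]  # 100 toy sample sequences
train, val = split_samples(series)
print(len(train), len(val))  # 80 20
```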
In the embodiment of the application, the sample time sequence may be acquired in several ways: it may be collected by invoking a data acquisition device, acquired through a cloud, obtained from a database or a blockchain, or imported from an external device.
S302, performing two times of data enhancement processing on the sample time sequence to obtain two sets of enhancement data.
The enhanced data is data obtained by enhancing the sample time series.
As one implementation manner, either of the two sets of enhancement data is obtained by at least one of the following processes: up-sampling, down-sampling, noise addition, flipping, and decomposition of the sample time series. Specifically, up-sampling and down-sampling may be performed on one section of the sample time sequence to obtain a processed sequence, which is then spliced with the other sections to obtain enhancement data. Enhancement data may also be obtained by randomly adding Gaussian noise to the original sample time series, or by inverting the sign of the sequence values (flipping). Alternatively, STL decomposition may be used: the sample time series is decomposed into period, trend, and residual components, each component is enhanced respectively, and the results are finally merged to obtain the enhancement data.
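Several of the augmentations above can be sketched as follows. This is a minimal NumPy illustration; the function names and noise scale are hypothetical, and the down/up-sampling here is a crude stand-in using linear interpolation.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(series, sigma=0.1):
    """Augmentation by randomly adding Gaussian noise."""
    return series + rng.normal(scale=sigma, size=series.shape)

def flip(series):
    """Augmentation by inverting the sign of the sequence values."""
    return -series

def down_up_sample(series):
    """Crude down-sampling (every 2nd point) followed by linear
    up-sampling back to the original length."""
    down = series[::2]
    xs = np.linspace(0, len(down) - 1, num=len(series))
    return np.interp(xs, np.arange(len(down)), down)

x = np.sin(np.linspace(0, 6.28, 64))
views = [add_gaussian_noise(x), flip(x), down_up_sample(x)]
print([v.shape for v in views])  # three augmented views, each (64,)
```

Two different augmented views of the same series would form the positive pair used in the contrastive training below.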
As another implementation manner, the amplitude and phase of the sample time sequence can be adjusted and perturbed to obtain enhancement data; alternatively, a section of the sample time sequence can be randomly erased, or a frequency-domain principal component can be added, to obtain enhancement data. The two sets of enhancement data may include first enhancement data and second enhancement data: the two sets of enhancement data of the same sample time sequence form a positive sample pair, while enhancement data of other sample time sequences serve as negative samples.
In this embodiment, enhancing the sample time sequence enriches the sample data set, so that the trained prediction model performs better, and the generalization capability and robustness of the model are improved.
And S303, carrying out feature extraction and coding processing on the two groups of enhancement data through an initial feature extraction network of an initial prediction model to obtain two groups of sample local coding information, two groups of sample global coding information, local fusion coding information and global fusion coding information.
S304, performing iterative training on an initial classification network of the initial prediction model based on the two groups of sample local coding information, the two groups of sample global coding information, the local fusion coding information, the global fusion coding information and the history labeling result to obtain the prediction model.
Optionally, referring to fig. 7, the initial prediction model may include an initial feature extraction network and an initial classification network, and in the process of training the initial prediction model through a training set, two sets of enhancement data may be input into the initial feature extraction network, where the initial feature extraction network includes a first feature extraction module, a second feature extraction module, and a feature fusion module, where the first feature extraction module is configured to perform continuous sampling processing on the two sets of enhancement data to obtain a plurality of continuous segments, and the plurality of continuous segments are used as two sets of sample local feature data; the second feature extraction module is used for performing interval sampling processing on the two groups of enhancement data to obtain a plurality of discontinuous fragments, and taking the plurality of discontinuous fragments as two groups of sample global feature data; the feature fusion module is used for carrying out fusion processing on the two groups of sample local feature data and the two groups of sample global feature data to obtain sample fusion feature data, wherein the sample fusion feature data can comprise local fusion coding information and global fusion coding information.
Specifically, the two sets of enhancement data can be subjected to feature extraction processing through a first coding module in an initial feature extraction network respectively to obtain two sets of sample local feature data, and the two sets of enhancement data are subjected to feature extraction processing through a second coding module in the initial feature extraction network in parallel to obtain two sets of sample global feature data; and then processing the two groups of sample local feature data and the two groups of sample global feature data through feature fusion to obtain sample fusion feature data, wherein the sample fusion feature comprises local fusion features and global fusion features.
The initial feature extraction network can further comprise a feature coding module, and encoding processing can be performed through the feature coding module based on the two sets of sample local feature data, the two sets of sample global feature data, the local fusion features, and the global fusion features, to obtain two sets of sample local coding information, two sets of sample global coding information, local fusion coding information, and global fusion coding information. A contrast loss function is then constructed based on the two sets of sample local coding information, the two sets of sample global coding information, the local fusion coding information, and the global fusion coding information; the parameters of the first feature extraction module, the second feature extraction module, the feature fusion module, and the feature coding module are iteratively trained according to the contrast loss function to obtain the feature extraction network of the prediction model, and the initial classification network is then iteratively trained based on the feature extraction network and the history labeling result to obtain the prediction model.
After obtaining two sets of sample local coding information, two sets of sample global coding information, local fusion coding information and global fusion coding information, local contrast loss can be built based on the two sets of sample local coding information, global contrast loss is built based on the two sets of sample global coding information, fusion contrast loss is built based on the local fusion coding information and the global fusion coding information, a contrast loss function is built according to the local contrast loss, the global contrast loss and the fusion contrast loss, parameters of the first feature extraction module, the second feature extraction module, the feature fusion module and the feature coding module are iterated according to the minimization of the contrast loss function, and a trained feature extraction network is obtained.
The local contrast loss, the global contrast loss, and the fusion contrast loss can each be expressed by the InfoNCE loss as follows:

$$l_{i,j} = -\log \frac{\exp\left(\operatorname{sim}(x_i, x_j)/\tau\right)}{\sum_{k=1}^{N} \mathbb{1}_{[k \neq i]} \exp\left(\operatorname{sim}(x_i, x_k)/\tau\right)}$$

where sim(·,·) denotes the similarity between two representations and τ is a temperature coefficient. When l_{i,j} denotes the local contrast loss, x_i and x_j respectively represent the positive sample pair formed by the two sets of sample local coding information, N represents the number of samples, and x_k represents a negative sample. When l_{i,j} denotes the global contrast loss, x_i and x_j respectively represent the positive sample pair formed by the two sets of sample global coding information, and x_k represents a negative sample. When l_{i,j} denotes the fusion contrast loss, x_i and x_j respectively represent the positive sample pair formed by the local fusion coding information and the global fusion coding information, and x_k represents a negative sample.
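The InfoNCE loss named above can be sketched as follows for a single anchor. This is a standard-form NumPy sketch (cosine similarity, temperature τ), not the patent's exact implementation; names and the temperature value are assumptions.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.5):
    """InfoNCE contrastive loss for one anchor: pull the positive pair
    together, push the negative samples away."""
    def sim(a, b):  # cosine similarity
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.exp(sim(anchor, positive) / tau)
    neg = sum(np.exp(sim(anchor, n) / tau) for n in negatives)
    return -np.log(pos / (pos + neg))

rng = np.random.default_rng(0)
z_i = rng.normal(size=8)
z_j = z_i + 0.01 * rng.normal(size=8)          # positive pair: two views
negs = [rng.normal(size=8) for _ in range(6)]  # negative samples
loss_close = info_nce(z_i, z_j, negs)
loss_far = info_nce(z_i, -z_j, negs)           # dissimilar "positive"
print(loss_close < loss_far)  # True: similar positives give lower loss
```

Minimising this loss makes the similarity between positive pairs as large as possible and the similarity to negative samples as small as possible, which is exactly the training objective described in the surrounding text.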
When the initial feature extraction network is trained by iteratively minimizing the contrast loss function, the similarity between the two sets of sample local feature data, between the two sets of sample global feature data, and between the local fusion data and the global fusion data must continuously be made as large as possible, while the similarity between each of these and the other negative samples must be made as small as possible.
It should be noted that, the iterative training of the first feature extraction module, the second feature extraction module and the feature fusion module in the embodiment of the present application is three independent processing procedures, and may be only the iterative training of the first feature extraction module, or only the iterative training of the second feature extraction module, or only the iterative training of the feature fusion module. Of course, the first feature extraction module, the second feature extraction module and the feature fusion module may be all subjected to iterative training, and the execution sequence of the three may not be limited, and may be executed in series in one iterative training or may be executed in parallel.
After the trained feature extraction network is determined, parameters in the feature extraction network are controlled to be unchanged, and the initial classification network is subjected to iterative training based on the feature extraction network and the historical labeling result to obtain a prediction model.
Specifically, in the process of performing iterative training on the initial classification network, the sample time sequence can be subjected to feature extraction processing through a feature extraction network to obtain a fusion result, then the fusion result is subjected to classification prediction through the initial classification network to obtain an output result, then a classification loss function is constructed based on the output result and the history labeling result, parameters of the initial classification network are iteratively trained according to the minimization of the classification loss function to obtain a classification network of a prediction model, and the prediction model is constructed based on the feature extraction network and the classification network.
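The two-stage procedure above — freeze the trained feature extraction network, then iteratively train only the classification network on the history labeling results — can be sketched as follows. This is a toy NumPy sketch with a random frozen extractor, synthetic labels made separable in feature space, and a logistic classification loss; everything here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 stand-in: a frozen feature extraction network (parameters fixed).
W_frozen = rng.normal(size=(4, 16)) * 0.5

def extract(x):
    return np.tanh(W_frozen @ x)   # W_frozen is never updated below

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy labelled samples; labels are made separable in feature space so the
# sketch converges (stand-in for real history labeling results).
X = rng.normal(size=(200, 16))
feats = np.array([extract(x) for x in X])
y = (feats[:, 0] > 0).astype(float)

# Stage 2: iteratively train only the classification head by minimising
# a logistic (classification) loss; extractor parameters stay unchanged.
w, b, lr = np.zeros(4), 0.0, 0.5
for _ in range(500):
    p = sigmoid(feats @ w + b)
    grad = p - y
    w -= lr * feats.T @ grad / len(y)
    b -= lr * grad.mean()

acc = ((sigmoid(feats @ w + b) > 0.5) == y).mean()
print(acc)
```

The frozen extractor plus the trained head together correspond to the final prediction model built from the feature extraction network and the classification network.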
The initial classification network is used for classification or prediction, and may be a two-layer MLP network, for implementing prediction of a downstream target task, for example, performing equipment fault diagnosis prediction, load prediction, abnormal value detection, and the like. The trained feature extraction network and the trained classification network may be used as a trained predictive model.
After the training set is used to obtain a trained prediction model, the verification set can be input into the trained model for processing to obtain an output result, and a performance index is determined to evaluate the prediction performance; when the index is greater than a preset threshold, the corresponding trained model is determined to be the prediction model. In addition, in order to test the performance of the scheme of the application, time sequence data prediction can be performed with the natural gas load of 2000 industrial and commercial users as the scenario, and the prediction precision distributions of different schemes, such as the baseline models and the algorithm of the application, can be calculated respectively, as shown in fig. 8, where the light-colored part indicates users for whom the algorithm performs better than the baseline model, and the dark-colored part indicates users for whom the algorithm performs worse than the baseline model. It is apparent from the figure that the algorithm proposed by the application is superior to the baseline models for a greater number of users. Meanwhile, the number of floating point operations (FLOPs) for short sequence prediction and long sequence prediction can be calculated for the algorithm of the application, baseline model 1, and baseline model 2; FLOPs measures the amount of computation required, so a lower value indicates faster computation. Specific figures can be seen in the following table:
| FLOPS | Short sequence prediction | Long sequence prediction |
| --- | --- | --- |
| The algorithm | 30 | 328 |
| Baseline model 1 | 5078 | 883 |
| Baseline model 2 | 11652 | 5078 |
The above data clearly show that, compared with the existing baseline models, the FLOPs determined for the algorithm of the application is lower; that is, its speed in both short sequence prediction and long sequence prediction is superior to that of the baseline models.
In this embodiment, when the feature extraction network is trained, the local features and the global features of the time sequence are fully considered, and both the long-term and short-term dependency relationships of the sample time sequence are captured, so that the model focuses on the short-term and long-term features of the sample time sequence during training. By constructing the global contrast learning task, the local contrast learning task, and the fusion contrast learning task, the time sequence features can be extracted more comprehensively, and through the trained feature extraction network, the prediction model can be migrated to different downstream time sequence prediction and classification task scenarios, such as natural gas load prediction, electric load prediction, missing value filling, outlier interpolation, and rotating mechanical equipment fault detection, thereby improving the task prediction accuracy.
In another embodiment of the present application, a specific implementation of the contrast loss function described above is also provided. In one possible implementation, the contrast loss function may include a first component, a second component, and a third component. The first component is used for representing the difference loss between two groups of sample local coding information; the second component is used for representing the difference loss between the two groups of sample global coding information; the third component is used to characterize the loss of difference between the local fusion encoded information and the global fusion encoded information.
The contrast loss function constructed as described above may be the sum of the first component, the second component and the third component. By setting these three components in the contrast loss function, the difference loss between the two groups of sample local coding information, the difference loss between the two groups of sample global coding information, and the difference loss between the local fusion coding information and the global fusion coding information can all be reduced, so that the sample local coding information and the sample global coding information fuse the local and global characteristics of the sample time sequence, and the parameters of the resulting feature extraction network are better.
In the embodiment of the present application, when the contrast loss function is constructed to obtain the feature extraction network, the difference between the two groups of sample local coding information, the difference between the two groups of sample global coding information, and the difference between the local fusion coding information and the global fusion coding information are all taken into account. Training the initial feature extraction network on this contrast loss function allows the model parameters to be iterated more accurately and comprehensively, so that the resulting feature extraction network is better and its feature extraction accuracy is higher.
Optionally, in the process of iteratively training the initial feature extraction network, reasonable weight coefficients may further be allocated to the first, second and third components of the loss function, so that the prediction difference of the model closely matches the actual service requirement, improving model performance. In one possible implementation, the loss function may be determined by first determining the weight coefficients of the first, second and third components, and then combining the components according to those coefficients. The weight coefficient of the first component is related to the importance of the local coding information; the weight coefficient of the second component is related to the importance of the global coding information; the weight coefficient of the third component is related to the importance of the fused feature representation.
The first component, the second component, the third component and the loss function satisfy the following formulas:
Y = a1*y1 + a2*y2 + a3*y3
wherein Y is the contrast loss function in the training of the feature extraction network, y1 is the first component and a1 is its weight coefficient, y2 is the second component and a2 is its weight coefficient, and y3 is the third component and a3 is its weight coefficient; for example, a1, a2 and a3 may each be 1/3.
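As a minimal illustration (the function name and default weights are assumptions, not from the patent text), the weighted combination above can be sketched in Python as:

```python
def contrast_loss(y1, y2, y3, a1=1/3, a2=1/3, a3=1/3):
    """Weighted sum of the three contrast-loss components.

    y1: difference loss between the two groups of sample local coding information
    y2: difference loss between the two groups of sample global coding information
    y3: difference loss between the local and global fusion coding information
    """
    return a1 * y1 + a2 * y2 + a3 * y3
```

With a1 = a2 = a3 = 1/3 this reduces to a simple average of the three components.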
In one embodiment, after the feature extraction network is trained, the sample time sequence may be subjected to feature extraction by the first feature extraction module of the feature extraction network to obtain local data, and by the second feature extraction module to obtain global data. The local data and the global data are then fused by the feature fusion module to obtain fusion data, and the local data, the global data and the fusion data are encoded by the feature coding module of the feature extraction network to obtain a fusion result. The fusion result is classified by the initial classification network to obtain an output result, a classification loss function is constructed based on the output result and the history labeling result, and the parameters of the initial classification network are iteratively trained by minimizing the classification loss function to obtain the classification network of the prediction model. The prediction model is then constructed from the feature extraction network and the classification network.
In the process of minimizing iterative training of the initial classification network according to the classification loss function, for example, a gradient descent method may be used for iterative training. When the initial classification network is trained iteratively, parameters in the initial classification network may be updated, for example, matrix parameters such as a weight matrix and a bias matrix in the initial classification network may be updated. Wherein the weight matrix and the bias matrix include, but are not limited to, matrix parameters in a self-attention layer, a feed-forward network layer and a full connection layer of the initial classification network.
When the parameters of the initial classification network are updated through the classification loss function, it may be determined from the loss function that the initial classification network has not converged, and its parameters may then be adjusted until the network converges, yielding the adjusted classification network. Convergence of the initial classification network may mean that the difference between its output result and the history labeling result is smaller than a preset threshold, or that the rate of change of that difference approaches some lower value. When the calculated loss function is sufficiently small, or its difference from the loss output by the previous iteration approaches 0, the initial classification network is considered converged, the prediction model is in turn considered converged, and the trained feature extraction network and classification network are used as the prediction model.
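A minimal sketch of the convergence test just described, where the change in loss between successive iterations must stay near zero (the threshold and window size are illustrative assumptions, not specified in the patent):

```python
def has_converged(loss_history, threshold=1e-4, patience=3):
    """Converged when the change in loss between successive
    iterations stays below `threshold` for `patience` steps."""
    if len(loss_history) < patience + 1:
        return False
    recent = loss_history[-(patience + 1):]
    deltas = [abs(b - a) for a, b in zip(recent, recent[1:])]
    return all(d < threshold for d in deltas)
```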
In the embodiment, the sample time sequence is obtained, the sample time sequence is enhanced, enhanced data are obtained, and the initial prediction model is iteratively trained based on the enhanced data and the history labeling result, so that the accuracy of the trained prediction model is higher, and the accuracy of the prediction model in determining the prediction result of the object to be predicted is further improved.
For a better understanding of the embodiments of the present application, the complete flow of the time series data prediction method proposed by the present application is further described below. As shown in fig. 9, the method may include the following steps:
s401, acquiring a time sequence to be processed, wherein the time sequence to be processed is used for representing operation data of an object to be predicted at a plurality of detection times.
Specifically, taking a natural gas load detection scenario as an example, a time sequence to be processed may be obtained. The time sequence to be processed may be, for example, a user's natural gas consumption over 30 days, sampled once per hour, i.e. 30×24 = 720 time points.
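To make the scale concrete, a toy 30-day hourly series of this shape can be generated as follows (the daily-cycle signal and noise level are illustrative only, not data from the patent):

```python
import numpy as np

HOURS_PER_DAY = 24
DAYS = 30
n_points = DAYS * HOURS_PER_DAY  # 30 x 24 = 720 hourly readings

# toy hourly gas-consumption series: a daily cycle plus random noise
t = np.arange(n_points)
rng = np.random.default_rng(0)
series = 10 + 3 * np.sin(2 * np.pi * t / HOURS_PER_DAY) + rng.normal(0, 0.5, n_points)
```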
S402, continuously sampling the time sequence to be processed through a first feature extraction module to obtain a plurality of continuous fragments, and taking the plurality of continuous fragments as local feature data.
S403, performing interval sampling processing on the time sequence to be processed through a second feature extraction module to obtain a plurality of discontinuous segments, and taking the plurality of discontinuous segments as global feature data.
S404, carrying out fusion processing on the local feature data and the global feature data through a feature fusion module to obtain fusion feature data.
In this embodiment, considering that future data of the time sequence must be predicted, the time sequence to be processed is related not only to the data at adjacent time points but also to the data at certain historical time points. For example, the data at 12 o'clock today is related not only to the data at 11 o'clock and 13 o'clock on the same day, but also to the data at 12 o'clock on the previous day. Therefore, when predicting time sequence data, both the short-term and the long-term dependency relationships of the time sequence must be considered. The present application therefore designs a lightweight MLP-based deep learning architecture whose key idea is to apply a simple MLP structure to two sampling strategies, achieving effective extraction of both short-range and long-range information.
After the time sequence to be processed is obtained, it may be continuously sampled by the first feature extraction module of the prediction model to obtain a plurality of continuous segments, which are used as the local feature data; interval sampling is performed on the time sequence to be processed by the second feature extraction module of the prediction model to obtain a plurality of discontinuous segments, which are used as the global feature data; and the local feature data and the global feature data are fused by the feature fusion module to obtain the fusion feature data.
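The two sampling strategies of S402 and S403 can be sketched as follows (function names are illustrative; continuous sampling yields the local segments, strided interval sampling yields the global ones):

```python
def continuous_sampling(series, window):
    """Split the series into consecutive, non-overlapping segments
    (local feature data); any incomplete tail segment is discarded."""
    n = len(series) // window
    return [series[i * window:(i + 1) * window] for i in range(n)]

def interval_sampling(series, stride):
    """Take every `stride`-th point starting from each offset, giving
    discontinuous segments that each span the whole series (global
    feature data)."""
    return [series[offset::stride] for offset in range(stride)]
```

For a 12-point series with window = stride = 4, continuous sampling produces three adjacent blocks, while interval sampling produces four interleaved subsequences covering the full range.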
S405, the local feature data, the global feature data and the fusion feature data are subjected to coding processing through a feature coding module in parallel, and a coding result is obtained.
S406, carrying out classified prediction on the coding result through a classification network of the prediction model to obtain a prediction result of the object to be predicted.
After the local feature data, the global feature data and the fusion feature data are obtained, the time dimension information (temporal dim) and the channel dimension information (channel dim) of the local feature data are determined; for example, the temporal dim is H and the channel dim is W. Temporal projection is then performed: information is exchanged through an affine transformation of the time dimension to obtain the converted time dimension information F', at which point the local feature data is converted from R^H to R^F', i.e. the time dimension H is converted into the time dimension F'. Next, channel projection is performed on the converted time dimension information of the local feature data: information interaction is carried out through an affine transformation of the channel dimension to obtain the converted channel dimension information W, at which point the local feature data is converted into R^W. The converted time dimension information F' and the converted channel dimension information are then subjected to dimension reduction through a bottleneck layer to obtain the corresponding dimension F, the data after dimension reduction being R^F. Finally, linear mapping (output projection) is performed on the dimension-reduced data to obtain the local coding result. The global feature data and the fusion feature data are encoded in the same manner as the local feature data, yielding the corresponding global coding result and fusion coding result.
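A shape-level sketch of the per-branch encoding pipeline described above (temporal projection H to F', channel projection over W, bottleneck reduction to F, then output projection). The weights here are random placeholders standing in for trained parameters, and the dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, f_dim, bottleneck_dim, out_dim):
    """Encode one branch's feature matrix x of shape (H, W),
    where H is the temporal dim and W the channel dim."""
    H, W = x.shape
    W_t = rng.standard_normal((f_dim, H))               # temporal projection: H -> F'
    x = W_t @ x                                         # shape (F', W)
    W_c = rng.standard_normal((W, W))                   # channel projection over W
    x = x @ W_c                                         # shape (F', W)
    W_b = rng.standard_normal((bottleneck_dim, f_dim))  # bottleneck: F' -> F
    x = W_b @ x                                         # shape (F, W)
    W_o = rng.standard_normal((out_dim, bottleneck_dim))  # output projection
    return W_o @ x                                      # final coding result
```

The same `encode` call would be applied in parallel to the local, global and fusion feature data to obtain the three coding results.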
After the corresponding coding result is determined, it is classified and predicted through the classification network of the prediction model; for example, the coding result may be processed by a multi-classification function, which outputs the natural gas anomaly detection type. Optionally, the multi-classification function may be a softmax function, which adds a non-linear factor (a linear model alone is not sufficiently expressive) and transforms the continuous real values of the input into outputs between 0 and 1.
Taking three classes as an example, the multi-classification function outputs the probability that each element value included in the final result corresponds to each natural gas anomaly detection type, and the type with the highest probability may be selected as the prediction result of the time sequence to be processed.
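The three-class softmax step can be illustrated as follows (the logit values are made up for demonstration):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a vector of class logits."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([1.2, 0.3, 2.5])        # one logit per anomaly detection type
probs = softmax(logits)                   # probabilities between 0 and 1, summing to 1
predicted_class = int(np.argmax(probs))   # index of the most probable type
```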
In the embodiment of the application, the feature information of the time sequence to be processed can be more comprehensively extracted by carrying out feature extraction processing on the time sequence to be processed through the feature extraction network of the prediction model, so that the more comprehensive information is combined for classification prediction, and the classification processing is carried out through the classification network, so that the prediction result of the object to be predicted is more accurately obtained.
It should be noted that although the operations of the method of the present application are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order or that all of the illustrated operations be performed in order to achieve desirable results. Rather, the steps depicted in the flowcharts may change the order of execution. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform.
On the other hand, fig. 10 is a schematic structural diagram of a time-series data prediction apparatus according to an embodiment of the present application. The apparatus may be an apparatus within a computer device, as shown in fig. 10, the apparatus 400 comprising:
An acquisition module 410, configured to acquire a time sequence to be processed; the time sequence to be processed is used for representing the running data of the object to be predicted at a plurality of detection times;
the feature extraction module 420 is configured to perform feature extraction processing on the time sequence to be processed through a prediction model, so as to obtain local feature data, global feature data and fusion feature data of the time sequence to be processed;
the classification prediction module 430 is configured to perform feature encoding processing on the local feature data, the global feature data, and the fusion feature data by using a prediction model to obtain an encoding result, and perform classification prediction on the encoding result to obtain a prediction result of the object to be predicted.
Optionally, the feature extraction module 420 is specifically configured to:
continuously sampling the time sequence to be processed through a first feature extraction module to obtain a plurality of continuous fragments, and taking the plurality of continuous fragments as local feature data;
performing interval sampling processing on the time sequence to be processed through a second feature extraction module to obtain a plurality of discontinuous fragments, and taking the plurality of discontinuous fragments as global feature data;
and carrying out fusion processing on the local feature data and the global feature data through a feature fusion module to obtain fusion feature data.
Optionally, the above classification prediction module 430 is specifically configured to:
performing time projection processing on the local feature data, the global feature data and the fusion feature data in parallel to acquire time dimension information after the local feature data, the global feature data and the fusion feature data are converted;
performing channel projection processing based on the local feature data, the global feature data and the time dimension information after the feature data conversion, and obtaining the converted channel dimension information;
performing dimension reduction processing on the converted time dimension information and the converted channel dimension information through a bottleneck layer to obtain dimension reduced data;
and performing linear mapping processing on the data subjected to dimension reduction to obtain a local coding result, a global coding result and a fusion coding result.
Optionally, the training process of the prediction model includes the following steps:
acquiring a sample time sequence; the sample time sequence is marked with a history labeling result;
performing two times of data enhancement processing on the sample time sequence to obtain two sets of enhancement data;
performing feature extraction and coding processing on the two groups of enhancement data through an initial feature extraction network of an initial prediction model to obtain two groups of sample local coding information, two groups of sample global coding information, local fusion coding information and global fusion coding information;
And carrying out iterative training on an initial classification network of the initial prediction model based on the two groups of sample local coding information, the two groups of sample global coding information, the local fusion coding information, the global fusion coding information and the historical labeling result to obtain the prediction model.
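The two data enhancement passes are not specified further in this passage; a plausible sketch using random scaling plus jitter (a common choice in time series contrastive learning, and an assumption here, not the patent's stated method) is:

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(series, noise_std=0.05, scale_range=(0.9, 1.1)):
    """One possible enhancement: random amplitude scaling plus
    additive Gaussian jitter, preserving the series length."""
    scale = rng.uniform(*scale_range)
    noise = rng.normal(0.0, noise_std, size=len(series))
    return series * scale + noise

sample = np.sin(np.linspace(0.0, 6.28, 100))
view_a = augment(sample)   # first group of enhancement data
view_b = augment(sample)   # second group of enhancement data
```

The two enhanced views are then fed through the initial feature extraction network to produce the two groups of local and global coding information used by the contrast loss.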
Optionally, the device is further configured to:
performing feature extraction processing on the two groups of enhancement data through a first coding module in an initial feature extraction network respectively to obtain two groups of sample local features, and performing feature extraction processing on the two groups of enhancement data through a second coding module in the initial feature extraction network in parallel to obtain two groups of sample global features;
processing the two groups of sample local features and the two groups of sample global features through the feature fusion module to obtain local fusion features and global fusion features;
and carrying out coding processing by a feature coding module in a feature extraction network based on the two groups of sample local features, the two groups of sample global features, the local fusion features and the global fusion features to obtain two groups of sample local coding information, two groups of sample global coding information, local fusion coding information and global fusion coding information.
Optionally, the device is further configured to:
Constructing a contrast loss function based on the two groups of sample local coding information, the two groups of sample global coding information, the local fusion coding information and the global fusion coding information, and training parameters of a first feature extraction module, a second feature extraction module, a feature fusion module and a feature coding module according to the minimized iteration of the contrast loss function to obtain a feature extraction network of the prediction model;
and performing iterative training on the initial classification network based on the feature extraction network and the history labeling result to obtain a prediction model.
Optionally, the device is further configured to:
carrying out feature extraction processing on the sample time sequence through a first feature extraction module and a second feature extraction module in a feature extraction network to obtain local data and global data;
the local data and the global data are fused through a feature fusion module to obtain fusion data, and the fusion data, the local data and the global data are encoded through a feature coding module in the feature extraction network to obtain a fusion result;
carrying out classification prediction on the fusion result through an initial classification network to obtain an output result;
and constructing a classification loss function based on the output result and the history labeling result, iteratively training parameters of the initial classification network according to the minimization of the classification loss function to obtain a classification network of the prediction model, and constructing the prediction model based on the feature extraction network and the classification network.
Optionally, the contrast loss function includes a first component, a second component, and a third component;
the first component is used for representing the difference loss between two groups of sample local coding information;
the second component is used for representing the difference loss between the two groups of sample global coding information;
the third component is used to characterize the loss of difference between the local fusion encoded information and the global fusion encoded information.
Compared with the prior art, on the one hand, the time sequence data prediction device provided by the embodiment of the application can extract the local characteristic data, the global characteristic data and the fusion characteristic data of the time sequence to be processed from the global and local angles, so that the short-range period information and the long-range period information of the time sequence to be processed can be captured in a finer granularity, and more accurate guiding information is provided for subsequent classification prediction. On the other hand, the prediction result is determined by carrying out coding processing and classification prediction on the local feature data, the global feature data and the fusion feature data and combining more comprehensive features, so that the method can be suitable for different downstream tasks such as equipment fault diagnosis, abnormal value detection, natural gas, power load prediction and the like, and the accuracy of prediction is improved.
In another aspect, a computer device provided in an embodiment of the present application includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the time-series data prediction method described above when the processor executes the program.
Referring now to fig. 11, fig. 11 is a schematic diagram illustrating a computer system of a computer device according to an embodiment of the present application.
As shown in fig. 11, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, mouse, etc.; an output portion 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. Removable media 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on drive 610 so that a computer program read therefrom is installed as needed into storage section 608.
In particular, according to embodiments of the present application, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the application include a computer program product comprising a computer program embodied on a machine-readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from the network through the communication section 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the present application are performed when the computer program is executed by the Central Processing Unit (CPU) 601.
The computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules involved in the embodiments of the present application may be implemented in software or in hardware. The described units or modules may also be provided in a processor, for example, as: a processor, comprising: the device comprises an acquisition module, a characteristic extraction module and a classification prediction module. Wherein the names of the units or modules do not constitute a limitation of the units or modules themselves in some cases, for example, the acquisition module may also be described as "for acquiring a temporal sequence to be processed"; the temporal sequence to be processed is used to characterize the operational data of the object to be predicted at a plurality of detection times.
As another aspect, the present application also provides a computer-readable storage medium, which may be contained in the electronic device described in the above embodiments, or may exist alone without being incorporated into the electronic device. The computer-readable storage medium stores one or more programs that, when executed by one or more processors, perform the time-series data prediction method described in the present application:
acquiring a time sequence to be processed; the time sequence to be processed is used for representing operation data of an object to be predicted at a plurality of detection times;
performing feature extraction processing on the time sequence to be processed through a prediction model to obtain local feature data, global feature data and fusion feature data of the time sequence to be processed;
and carrying out feature coding processing on the local feature data, the global feature data and the fusion feature data by using the prediction model to obtain a coding result, and carrying out classified prediction on the coding result to obtain a prediction result of the object to be predicted.
In summary, according to the method, the device, the equipment and the storage medium for predicting time series data provided by the embodiment of the application, the local feature data, the global feature data and the fusion feature data of the time series to be processed are obtained by obtaining the time series to be processed and performing feature extraction processing on the time series to be processed through the prediction model, then the local feature data, the global feature data and the fusion feature data are subjected to feature coding processing by utilizing the prediction model to obtain a coding result, and the coding result is subjected to classification prediction to obtain the prediction result of the object to be predicted. Compared with the prior art, on the one hand, the technical scheme can extract the local feature data, the global feature data and the fusion feature data of the time sequence to be processed from the global and local angles, so that the short-range period information and the long-range period information of the time sequence to be processed can be captured in a finer granularity, and more accurate guiding information is provided for subsequent classification prediction. On the other hand, the prediction result is determined by carrying out coding processing and classification prediction on the local feature data, the global feature data and the fusion feature data and combining more comprehensive features, so that the method can be suitable for different downstream tasks such as equipment fault diagnosis, abnormal value detection, natural gas, power load prediction and the like, and the accuracy of prediction is improved.
The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the application is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with technical features of similar functions disclosed in (but not limited to) the present application.
Claims (11)
1. A method of predicting time series data, the method comprising:
acquiring a time sequence to be processed; the time sequence to be processed is used for representing operation data of an object to be predicted at a plurality of detection times;
performing feature extraction processing on the time sequence to be processed through a prediction model to obtain local feature data, global feature data and fusion feature data of the time sequence to be processed;
and carrying out feature coding processing on the local feature data, the global feature data and the fusion feature data by using the prediction model to obtain a coding result, and carrying out classification prediction on the coding result to obtain a prediction result of the object to be predicted.
2. The method according to claim 1, wherein the prediction model includes a first feature extraction module, a second feature extraction module, and a feature fusion module, and performing feature extraction processing on the time sequence to be processed through the prediction model to obtain local feature data, global feature data, and fusion feature data of the time sequence to be processed, including:
continuously sampling the time sequence to be processed through the first feature extraction module to obtain a plurality of continuous fragments, and taking the plurality of continuous fragments as local feature data;
performing interval sampling processing on the time sequence to be processed through the second feature extraction module to obtain a plurality of discontinuous fragments, and taking the plurality of discontinuous fragments as global feature data;
and carrying out fusion processing on the local feature data and the global feature data through the feature fusion module to obtain fusion feature data.
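A minimal sketch of the two sampling operations described in claim 2, assuming a one-dimensional sequence; the function names and the trailing-remainder handling are illustrative assumptions:

```python
def continuous_sampling(x, seg_len):
    """Continuous sampling: contiguous segments -> local feature data."""
    n = len(x) - len(x) % seg_len  # drop any trailing remainder
    return [x[i:i + seg_len] for i in range(0, n, seg_len)]

def interval_sampling(x, stride):
    """Interval sampling: every stride-th point, from each phase offset
    -> discontinuous fragments / global feature data."""
    return [x[i::stride] for i in range(stride)]
```

Continuous sampling keeps neighbouring points together (short-range patterns), while interval sampling groups points that are far apart in time (long-range patterns), which is the local/global split the claim relies on.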
3. The method according to claim 1 or 2, wherein the feature coding processing is performed on the local feature data, the global feature data and the fusion feature data by using the prediction model to obtain a coding result, comprising:
performing time projection processing on the local feature data, the global feature data and the fusion feature data in parallel to obtain converted time dimension information of the local feature data, the global feature data and the fusion feature data;
performing channel projection processing based on the converted time dimension information of the local feature data, the global feature data and the fusion feature data to obtain converted channel dimension information;
performing dimension reduction processing on the converted time dimension information and the converted channel dimension information through a bottleneck layer to obtain dimension reduced data;
and performing linear mapping processing on the dimension reduced data to obtain a local coding result, a global coding result and a fusion coding result.
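The four coding stages of claim 3 can be sketched with plain linear algebra. Every weight shape, dimension and the random initialization below are assumptions made for illustration only; the patent does not specify them.

```python
import numpy as np

def encode_claim3(feat, d_time=8, d_chan=8, d_bneck=4, d_out=16, seed=0):
    """Illustrative encoder following the four claim-3 stages for one
    (time x channel) feature matrix; all shapes are assumptions."""
    rng = np.random.default_rng(seed)
    t, c = feat.shape
    w_t = rng.standard_normal((d_time, t))
    w_c = rng.standard_normal((c, d_chan))
    w_b = rng.standard_normal((d_time * d_chan, d_bneck))
    w_o = rng.standard_normal((d_bneck, d_out))
    t_proj = w_t @ feat            # time projection (mixes the time axis)
    c_proj = t_proj @ w_c          # channel projection (mixes channels)
    z = c_proj.reshape(-1) @ w_b   # bottleneck dimension reduction
    return z @ w_o                 # linear mapping -> coding result
```

Running the same encoder over the local, global and fusion feature data would yield the local, global and fusion coding results the claim names.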
4. The method according to claim 1, wherein the training process of the predictive model comprises the steps of:
acquiring a sample time sequence; the sample time sequence is labeled with a history labeling result;
performing data enhancement processing twice on the sample time sequence to obtain two sets of enhancement data;
performing feature extraction and coding processing on the two groups of enhancement data through an initial feature extraction network of an initial prediction model to obtain two groups of sample local coding information, two groups of sample global coding information, local fusion coding information and global fusion coding information;
and performing iterative training on an initial classification network of the initial prediction model based on the two groups of sample local coding information, the two groups of sample global coding information, the local fusion coding information, the global fusion coding information and the history labeling result to obtain the prediction model.
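A sketch of the two data-enhancement passes in claim 4; the patent does not fix the enhancement type, so jitter-plus-scaling (a common time series choice) stands in here, and all parameter values are illustrative:

```python
import numpy as np

def augment(x, rng, jitter=0.05, scale=0.1):
    """One data-enhancement pass: additive random jitter plus a random
    global scaling factor (an assumed choice, not from the patent)."""
    noise = rng.normal(0.0, jitter, size=x.shape)
    factor = 1.0 + rng.normal(0.0, scale)
    return factor * x + noise

rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 6.28, 50))
view_a = augment(series, rng)  # first enhanced view
view_b = augment(series, rng)  # second enhanced view
```

The two views `view_a` and `view_b` are what the claim calls the "two sets of enhancement data"; the contrastive training in claims 6 and 8 pushes their encodings to agree.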
5. The method of claim 4, wherein performing feature extraction and encoding on the two sets of enhancement data to obtain two sets of sample local encoding information, two sets of sample global encoding information, local fusion encoding information, and global fusion encoding information, comprises:
performing feature extraction processing on the two groups of enhancement data through a first feature extraction module in the initial feature extraction network to obtain two groups of sample local features, and performing feature extraction processing on the two groups of enhancement data through a second feature extraction module in the initial feature extraction network in parallel to obtain two groups of sample global features;
processing the two groups of sample local features and the two groups of sample global features through a feature fusion module to obtain local fusion features and global fusion features;
and carrying out coding processing through a feature coding module in the feature extraction network based on the two groups of sample local features, the two groups of sample global features, the local fusion features and the global fusion features to obtain two groups of sample local coding information, two groups of sample global coding information, local fusion coding information and global fusion coding information.
6. The method of claim 5, wherein iteratively training an initial classification network of an initial prediction model based on the two sets of sample local coding information, the two sets of sample global coding information, the local fusion coding information, the global fusion coding information, and the history labeling result to obtain the prediction model, comprises:
constructing a contrast loss function based on the two groups of sample local coding information, the two groups of sample global coding information, the local fusion coding information and the global fusion coding information, and iteratively training parameters of the first feature extraction module, the second feature extraction module, the feature fusion module and the feature coding module by minimizing the contrast loss function to obtain a feature extraction network of the prediction model;
and performing iterative training on the initial classification network based on the feature extraction network and the history labeling result to obtain the prediction model.
7. The method of claim 6, wherein iteratively training the initial classification network based on the feature extraction network and the historical labeling results to obtain the predictive model comprises:
performing feature extraction processing on the sample time sequence through a first feature extraction module and a second feature extraction module in the feature extraction network to obtain local data and global data;
performing fusion processing on the local data and the global data through the feature fusion module to obtain fusion data, and performing coding processing on the fusion data, the local data and the global data through a feature coding module in the feature extraction network to obtain a fusion result;
carrying out classification prediction on the fusion result through the initial classification network to obtain an output result;
and constructing a classification loss function based on the output result and the history labeling result, iteratively training parameters of the initial classification network by minimizing the classification loss function to obtain a classification network of the prediction model, and constructing the prediction model based on the feature extraction network and the classification network.
8. The method of claim 6, wherein the contrast loss function comprises a first component, a second component, and a third component;
the first component is used for representing the difference loss between the two groups of sample local coding information;
the second component is used for representing the difference loss between the two groups of sample global coding information;
the third component is used to characterize a loss of difference between the local fusion encoded information and the global fusion encoded information.
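The three components of claim 8 can be sketched as follows. The patent only names three difference losses, so a mean squared difference stands in for each component; a real contrastive implementation would more likely use an InfoNCE-style term.

```python
import numpy as np

def sq_diff(a, b):
    # Mean squared difference as a stand-in "difference loss"
    return float(np.mean((a - b) ** 2))

def contrast_loss(local_a, local_b, global_a, global_b, fuse_l, fuse_g):
    """Three-component objective mirroring claim 8 (illustrative only)."""
    first = sq_diff(local_a, local_b)     # two sets of local encodings agree
    second = sq_diff(global_a, global_b)  # two sets of global encodings agree
    third = sq_diff(fuse_l, fuse_g)       # local vs. global fusion encodings agree
    return first + second + third
```

The loss is zero exactly when each pair of encodings matches, which is the agreement the training in claim 6 minimizes toward.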
9. A time series data prediction apparatus, the apparatus comprising:
the acquisition module is used for acquiring a time sequence to be processed; the time sequence to be processed is used for representing operation data of an object to be predicted at a plurality of detection times;
the feature extraction module is used for carrying out feature extraction processing on the time sequence to be processed through a prediction model to obtain local feature data, global feature data and fusion feature data of the time sequence to be processed;
and the classification prediction module is used for carrying out feature coding processing on the local feature data, the global feature data and the fusion feature data by utilizing the prediction model to obtain a coding result, and carrying out classification prediction on the coding result to obtain a prediction result of the object to be predicted.
10. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the time series data prediction method according to any one of claims 1-8.
11. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the time series data prediction method according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310721940.9A CN116933125A (en) | 2023-06-16 | 2023-06-16 | Time series data prediction method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116933125A true CN116933125A (en) | 2023-10-24 |
Family
ID=88385430
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310721940.9A Pending CN116933125A (en) | 2023-06-16 | 2023-06-16 | Time series data prediction method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116933125A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117473330A (en) * | 2023-12-27 | 2024-01-30 | 苏州元脑智能科技有限公司 | Data processing method, device, equipment and storage medium |
CN117473330B (en) * | 2023-12-27 | 2024-03-22 | 苏州元脑智能科技有限公司 | Data processing method, device, equipment and storage medium |
CN117591813A (en) * | 2024-01-18 | 2024-02-23 | 广东工业大学 | Complex equipment fault diagnosis method and system based on multidimensional features |
CN117591813B (en) * | 2024-01-18 | 2024-04-19 | 广东工业大学 | Complex equipment fault diagnosis method and system based on multidimensional features |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114362367B (en) | Cloud-edge-cooperation-oriented power transmission line monitoring system and method, and cloud-edge-cooperation-oriented power transmission line identification system and method | |
CN108427939B (en) | Model generation method and device | |
CN116933125A (en) | Time series data prediction method, device, equipment and storage medium | |
CN112994701B (en) | Data compression method, device, electronic equipment and computer readable medium | |
CN115885289A (en) | Modeling dependency with global self-attention neural networks | |
CN113326764A (en) | Method and device for training image recognition model and image recognition | |
CN112863180B (en) | Traffic speed prediction method, device, electronic equipment and computer readable medium | |
CN112418292B (en) | Image quality evaluation method, device, computer equipment and storage medium | |
CN116933124A (en) | Time series data prediction method, device, equipment and storage medium | |
CN114757432B (en) | Future execution activity and time prediction method and system based on flow log and multi-task learning | |
CN112541852A (en) | Urban people flow monitoring method and device, electronic equipment and storage medium | |
CN111353620A (en) | Method, device and equipment for constructing network point component prediction model and storage medium | |
CN109903075B (en) | DNN-based regression distribution model, training method thereof and electronic equipment | |
Blier-Wong et al. | Rethinking representations in P&C actuarial science with deep neural networks | |
CN114580548A (en) | Training method of target detection model, target detection method and device | |
CN115619999A (en) | Real-time monitoring method and device for power equipment, electronic equipment and readable medium | |
CN117422490A (en) | User loss prediction method, device, apparatus, medium and program product | |
CN117135032A (en) | Abnormality identification method, device and equipment and computer storage medium | |
CN116500335A (en) | Smart power grid electricity larceny detection method and system based on one-dimensional features and two-dimensional features | |
CN113723712B (en) | Wind power prediction method, system, equipment and medium | |
CN113486980B (en) | Aluminum electrolysis cell condition identification method, system and equipment based on self-expression dictionary pair | |
CN115456168A (en) | Training method and energy consumption determination method and device for reinforcement learning model | |
CN111784377B (en) | Method and device for generating information | |
Li et al. | Novel Hybrid Spatiotemporal Convolution Neural Network Model for Short-Term Passenger Flow Prediction in a Large-Scale Metro System | |
CN117273241B (en) | Method and device for processing data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||