CN112766619A

CN112766619A - Commodity time sequence data prediction method and system

Info

Publication number: CN112766619A
Application number: CN202110369758.2A
Authority: CN
Inventors: 张凯; 曲浩; 崔超然; 丁冬睿
Original assignee: Guangdong Zhongju Artificial Intelligence Technology Co ltd
Current assignee: Guangdong Zhongju Artificial Intelligence Technology Co ltd
Priority date: 2021-04-07
Filing date: 2021-04-07
Publication date: 2021-05-07
Anticipated expiration: 2041-04-07
Also published as: CN112766619B

Abstract

The invention provides a commodity time-series data prediction method and system. The scheme comprises: obtaining initial commodity time-series data and preprocessing it into standard commodity time-series data; acquiring commodity time-series features from the standard commodity time-series data, performing adjacent-interval division and periodic-interval division to obtain an adjacent-interval high-level semantic information representation and a periodic-interval high-level semantic information representation, and splicing and feature-transforming these representations to generate information fusion data; acquiring and processing features of seasonal information data to generate seasonal feature data, and splicing and feature-transforming the information fusion data and the seasonal feature data to generate target fusion data; and applying a feature transformation and an activation function to the target fusion data to predict the price and sales volume of the commodity. Starting from the proximity, periodicity and seasonality of commodities, the scheme performs multi-scale attention mechanism learning on the basis of GRU network coding, so that the characteristics of commodity time-series data are captured accurately.

Description

Commodity time sequence data prediction method and system

Technical Field

The invention relates to the technical field of time sequence data prediction, in particular to a commodity time sequence data prediction method and a commodity time sequence data prediction system.

Background

Time series data are data collected at different points in time to describe a phenomenon that varies over time. Such data reflect how the state or extent of an object or phenomenon changes over time. In recent years, time series data have attracted much attention because they are easy to collect and are produced in large volumes in daily life and work. Such data support different tasks in different domains: industrial data can be used to detect anomalous points in historical records; electrocardiosignal data can be used for biometric identification; and weather and financial data can be used to predict future trends.

However, existing time series prediction schemes have the following defects. Exponential smoothing, the main existing approach, is limited by the form of the sequence: single exponential smoothing mainly targets sequences without trend or seasonality, double exponential smoothing mainly targets sequences with trend but without seasonality, and triple exponential smoothing mainly targets sequences with both trend and seasonality. The BP (back-propagation) feedforward neural network easily falls into a local optimum and cannot reach the global optimum. The RNN (Recurrent Neural Network) cannot solve the long-term dependence problem because the original data information is overwritten as the recursion proceeds over time. In summary, although there are various methods for predicting commodity time series data, each conventional single method has certain limitations and cannot accurately and quickly capture the characteristics of commodity time series data.

Disclosure of Invention

In view of the above problems, the present invention provides a method and a system for predicting commodity time series data, which start from the proximity, periodicity and seasonality of commodities and perform multi-scale attention mechanism learning on the basis of GRU network coding, thereby solving the problem that each conventional single method has certain limitations and cannot accurately and quickly capture the characteristics of commodity time series data.

According to a first aspect of the embodiments of the present invention, a method for predicting time series data of a commodity is provided, which specifically includes:

acquiring initial commodity time sequence data, preprocessing the initial commodity time sequence data, and storing the initial commodity time sequence data as standard commodity time sequence data;

acquiring commodity time sequence characteristics according to the standard commodity time sequence data, and acquiring first intermediate data by utilizing GRU network coding;

carrying out multi-head self-attention mechanism weighted summation after carrying out adjacent interval division and cycle interval division according to the first intermediate data to obtain high-level semantic information representation of adjacent intervals and high-level semantic information representation of cycle intervals;

splicing and performing feature transformation according to the adjacent interval high-level semantic information representation and the periodic interval high-level semantic information representation to generate information fusion data;

acquiring and processing the characteristics of the seasonal information data to generate seasonal characteristic data;

splicing and performing characteristic transformation according to the information fusion data and the seasonal characteristic data to generate target fusion data;

and performing characteristic transformation and an activation function according to the target fusion data, and predicting the price of the commodity and the sales volume of the commodity.

In one or more embodiments, preferably, the obtaining of the initial commodity time series data, and preprocessing the initial commodity time series data, and storing the initial commodity time series data as standard commodity time series data specifically include:

crawling available commodity time sequence data through the Internet, and storing the commodity time sequence data as first initial commodity time sequence data;

inputting the existing commodity time sequence data in a manual input mode, and storing the commodity time sequence data as second initial commodity time sequence data;

saving the first initial commodity time sequence data and the second initial commodity time sequence data together as the initial commodity time sequence data;

performing data preprocessing on the initial commodity time sequence data by using a first calculation formula, and storing the initial commodity time sequence data as the standard commodity time sequence data;

the first calculation formula is:

$$x' = \frac{x - min}{max - min}$$

where $x$ is the initial commodity time series data, $x'$ is the standard commodity time series data, $min$ is the minimum value of the initial commodity time series data $x$, and $max$ is the maximum value of the initial commodity time series data $x$.

In one or more embodiments, preferably, the obtaining of the commodity time series characteristic according to the standard commodity time series data and obtaining the first intermediate data by using the GRU network code specifically include:

according to the standard commodity time sequence data, commodity time sequence characteristics are extracted by using a second calculation formula, and the commodity time sequence characteristics are obtained;

storing the commodity time sequence characteristics as bottom data for GRU network coding by using a third calculation formula to generate first intermediate data;

the second calculation formula is:

$$X = \{x_1, x_2, \ldots, x_n\} \in \mathbb{R}^{n \times d}$$

where $X$ is the commodity time series feature, $X$ is composed of the $x_i$ of $n$ days, $x_i$ is the feature of the commodity on day $i$, and $\mathbb{R}^{n \times d}$ is a matrix of $n$ columns and $d$ rows;

the third calculation formula is:

$$H = \{h_1, h_2, \ldots, h_n\} \in \mathbb{R}^{n \times d}$$

where $H$ is the first intermediate data, $H$ is composed of the $h_i$ of $n$ days, $h_i$ is the feature of the first intermediate data on day $i$, and $\mathbb{R}^{n \times d}$ is a matrix of $n$ columns and $d$ rows.

In one or more embodiments, preferably, after performing adjacent interval division and cycle interval division according to the first intermediate data, performing multi-head self-attention mechanism weighted summation to obtain an adjacent interval high-level semantic information representation and a cycle interval high-level semantic information representation, specifically including:

dividing the first intermediate data into adjacent intervals to obtain second intermediate data in the form of a fourth calculation formula;

dividing the first intermediate data into periodic intervals to obtain third intermediate data in the form of a fifth calculation formula;

performing multi-head self-attention mechanism weighted summation according to the second intermediate data to obtain high-level semantic information representation of the adjacent interval;

performing multi-head self-attention mechanism weighted summation according to the third intermediate data to obtain high-level semantic information representation of the periodic interval;

the fourth calculation formula is:

[formula missing in source]

where [symbol missing in source] is the second intermediate data, $h_n$ is the feature of the second intermediate data on day $n$, and $t$ is the maximum value of the adjacent interval;

the fifth calculation formula is:

[formula missing in source]

where [symbol missing in source] is the third intermediate data, $h_i$ is the feature of the first intermediate data on day $i$, and $i$ is an integer greater than 0 and less than or equal to $n$;

the multi-head self-attention mechanism weighted summation comprises the steps of carrying out weight mapping calculation by using a sixth calculation formula, further carrying out single-head self-attention calculation by using a seventh calculation formula according to a weight mapping result to obtain single-head self-attention, and calculating by using an eighth calculation formula according to the single-head self-attention to obtain multi-head attention;

the sixth calculation formula is:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V$$

where Attention is the weight mapping function, $Q$, $K$ and $V$ are the 3 input vectors, $Q$ and $K$ are $d_k$-dimensional vectors, $V$ is a $d_v$-dimensional vector, $K^{T}$ represents the transpose of $K$, and softmax is a mapping function whose result is mapped to [0,1];

the seventh calculation formula is:

$$head_i = \mathrm{Attention}(Q W_i^{Q}, K W_i^{K}, V W_i^{V})$$

where $head_i$ is the $i$-th single-head self-attention, $W_i^{Q}$ is the $i$-th first transformation matrix, $W_i^{K}$ is the $i$-th second transformation matrix, $W_i^{V}$ is the $i$-th third transformation matrix, and Attention is the weight mapping function;

the eighth calculation formula is:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(head_1, head_2, \ldots)\, W^{o}$$

where MultiHead is the multi-head attention, Concat splices a plurality of the single-head self-attentions, $head_i$ is the $i$-th single-head self-attention, and $W^{o}$ is a fourth transformation matrix.
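The sixth to eighth calculation formulas can be illustrated with a minimal NumPy sketch. It assumes the standard scaled dot-product form softmax(QK^T / sqrt(d_k))V for the weight mapping function; the number of heads, the matrix shapes and the random weights are illustrative assumptions, not values given in this document.

```python
import numpy as np

def softmax(a, axis=-1):
    """Numerically stable softmax; each slice along `axis` sums to 1."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Weight mapping function (sixth formula, assumed scaled dot-product form)."""
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V

def multi_head_self_attention(H, WQ, WK, WV, WO):
    """Seventh and eighth formulas, with Q = K = V = H (self-attention)."""
    heads = [attention(H @ wq, H @ wk, H @ wv) for wq, wk, wv in zip(WQ, WK, WV)]
    return np.concatenate(heads, axis=-1) @ WO

# Toy setup (all sizes are assumptions): 7 days, 8 features per day, 2 heads.
rng = np.random.default_rng(0)
t, d, num_heads = 7, 8, 2
d_k = d // num_heads
H = rng.random((t, d))
WQ = [rng.normal(size=(d, d_k)) for _ in range(num_heads)]
WK = [rng.normal(size=(d, d_k)) for _ in range(num_heads)]
WV = [rng.normal(size=(d, d_k)) for _ in range(num_heads)]
WO = rng.normal(size=(num_heads * d_k, d))
out = multi_head_self_attention(H, WQ, WK, WV, WO)
```

In this sketch each day is stored as one row of `H`, which is a layout choice made for convenience rather than something the text prescribes.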

In one or more embodiments, preferably, the generating information fusion data according to the splicing and feature transformation performed on the adjacent interval high-level semantic information representation and the cycle interval high-level semantic information representation specifically includes:

arranging the periodic interval high-level semantic information representation into the form of a ninth calculation formula and performing information fusion to obtain periodic high-level semantic features, wherein the periodic high-level semantic features form a matrix of dimension 7 by $d$;

splicing the periodic high-level semantic features and the adjacent interval high-level semantic information representation to obtain fourth intermediate data;

performing feature transformation by using a tenth calculation formula according to the fourth intermediate data to obtain the information fusion data;

the ninth calculation formula is:

[formula missing in source]

where [symbol missing in source] represents the periodic interval high-level semantic information representation;

the tenth calculation formula is:

$$M = W_f F + b_f$$

where $M$ is the information fusion data, $W_f$ is the transformation matrix for the first information fusion, $F$ is the fourth intermediate data, and $b_f$ is the bias for the first information fusion.
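The splicing and feature transformation described above can be sketched as a concatenation followed by a linear map, assuming the tenth calculation formula has the usual affine form M = W_f F + b_f. The shapes (a 7-by-d periodic feature block and a single d-dimensional adjacent-interval vector), the flattening before splicing, and the random weights are assumptions made only for illustration.

```python
import numpy as np

def fuse(parts, W, b):
    """Splice the parts into one vector F, then apply the linear map W F + b."""
    F = np.concatenate([np.ravel(p) for p in parts])  # splicing
    return W @ F + b                                  # feature transformation

rng = np.random.default_rng(0)
d = 4
periodic = rng.random((7, d))   # periodic high-level semantic features (assumed 7 x d)
adjacent = rng.random(d)        # adjacent-interval representation (assumed d-dimensional)
W_f = rng.normal(scale=0.1, size=(d, 8 * d))  # hypothetical weights
b_f = np.zeros(d)
M = fuse([periodic, adjacent], W_f, b_f)      # information fusion data
```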

In one or more embodiments, preferably, the performing feature acquisition and processing on the seasonal information data to generate seasonal feature data specifically includes: according to the seasonal factor information, text-encoding the four seasons (spring, summer, autumn and winter) in the seasonal factor information using Word2Vec to generate the seasonal feature data.

In one or more embodiments, preferably, the generating target fusion data by performing splicing and feature transformation according to the information fusion data and the seasonal feature data specifically includes:

splicing the information fusion data and the seasonal feature data to obtain fifth intermediate data;

performing feature transformation by using an eleventh calculation formula according to the fifth intermediate data to obtain the target fusion data;

the eleventh calculation formula is:

$$E = W_{fu} F + b_{fu}$$

where $E$ is the target fusion data, $F$ is the fifth intermediate data, $W_{fu}$ is the transformation matrix for the second information fusion, and $b_{fu}$ is the bias for the second information fusion.

In one or more embodiments, preferably, the predicting the price of the commodity and the sales volume of the commodity by performing the feature transformation and the activation function according to the target fusion data specifically includes:

performing feature transformation and an activation function by using a twelfth calculation formula to generate the price of the commodity and the sales volume of the commodity;

the twelfth calculation formula is:

$$Y = \mathrm{Relu}(W_e E + b_e)$$

where Relu is the activation function, $E$ is the target fusion data, $W_e$ is the feature transformation matrix, $b_e$ is the feature transformation bias, $Y$ is the set $\{y_1, y_2\}$ of the price of the commodity and the sales volume of the commodity, $y_1$ is the price of the commodity, and $y_2$ is the sales volume of the commodity.
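The last two steps (fusing seasonal features and predicting price and sales volume) can be sketched as follows, assuming the eleventh and twelfth calculation formulas take the forms E = W_fu F + b_fu and Y = Relu(W_e E + b_e). The random lookup table is only a stand-in for Word2Vec season embeddings, and every dimension and weight below is a hypothetical value chosen for the example.

```python
import numpy as np

SEASONS = ("spring", "summer", "autumn", "winter")

def season_vector(season, table):
    """Look up the embedding row for a season (stand-in for a Word2Vec vector)."""
    return table[SEASONS.index(season)]

def predict(M, s, W_fu, b_fu, W_e, b_e):
    """Return Y = (price, sales volume) from fused features M and season vector s."""
    F = np.concatenate([M, s])            # fifth intermediate data (splicing)
    E = W_fu @ F + b_fu                   # target fusion data
    return np.maximum(0.0, W_e @ E + b_e) # ReLU keeps both outputs non-negative

rng = np.random.default_rng(0)
d, d_s = 4, 3                              # assumed feature sizes
table = rng.normal(size=(len(SEASONS), d_s))
M = rng.random(d)                          # information fusion data (toy values)
Y = predict(
    M,
    season_vector("winter", table),
    rng.normal(size=(d, d + d_s)), np.zeros(d),
    rng.normal(size=(2, d)), np.zeros(2),
)
price, sales = Y
```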

According to a second aspect of the embodiments of the present invention, there is provided a system for predicting time series data of a commodity, including:

the time series data acquisition module is used for acquiring initial commodity time series data, preprocessing the initial commodity time series data, and storing it as standard commodity time series data;

the first feature acquisition module is used for acquiring commodity time series features according to the standard commodity time series data and obtaining first intermediate data by using GRU network coding;

the second feature acquisition module is used for performing multi-head self-attention mechanism weighted summation after performing adjacent interval division and periodic interval division according to the first intermediate data, to obtain an adjacent interval high-level semantic information representation and a periodic interval high-level semantic information representation;

the first semantic fusion module is used for performing splicing and feature transformation according to the adjacent interval high-level semantic information representation and the periodic interval high-level semantic information representation to generate information fusion data;

the seasonal data acquisition module is used for acquiring and processing the features of the seasonal information data to generate seasonal feature data;

the second semantic fusion module is used for performing splicing and feature transformation according to the information fusion data and the seasonal feature data to generate target fusion data;

and the target prediction module is used for applying a feature transformation and an activation function to the target fusion data to predict the price of the commodity and the sales volume of the commodity.

In one or more embodiments, preferably, the commodity time series data prediction system further includes:

the display module is used for displaying the information fusion data, the seasonal characteristic data and the price and sales volume of the commodity;

and the storage module is used for storing the standard commodity time sequence data, the first intermediate data, the adjacent interval high-level semantic information representation, the period interval high-level semantic information representation, the information fusion data, the target fusion data and the price and sales volume of the commodity.

The technical scheme provided by the embodiment of the invention can have the following beneficial effects:

compared with previous methods, the multi-scale-attention-based commodity time series data prediction method focuses more on the proximity, periodicity and seasonality of commodity time series data. The proximity and periodicity of the time series data are encoded using a multi-head self-attention mechanism, and the seasonal factors of the commodity are taken into account and fused with the time series features as external information, thereby improving the accuracy of the model.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

Fig. 1 is a flowchart of a method for predicting time series data of a commodity according to an embodiment of the present invention.

Fig. 2 is a flowchart of acquiring initial commodity time series data, preprocessing the initial commodity time series data, and storing the initial commodity time series data as standard commodity time series data in the commodity time series data prediction method according to an embodiment of the present invention.

Fig. 3 is a flowchart of obtaining a time series characteristic of a commodity according to the standard commodity time series data and obtaining first intermediate data by using GRU network coding in a commodity time series data prediction method according to an embodiment of the present invention.

Fig. 4 is a flowchart of performing multi-head self-attention mechanism weighted summation after performing adjacent interval division and cycle interval division according to the first intermediate data in the commodity time series data prediction method according to an embodiment of the present invention to obtain an adjacent interval high-level semantic information representation and a cycle interval high-level semantic information representation.

Fig. 5 is a flowchart of generating information fusion data by performing splicing and feature transformation according to the adjacent interval high-level semantic information representation and the periodic interval high-level semantic information representation in the commodity time series data prediction method according to an embodiment of the present invention.

Fig. 6 is a flowchart of generating target fusion data by performing splicing and feature transformation according to the information fusion data and the seasonal feature data in a commodity time series data prediction method according to an embodiment of the present invention.

Fig. 7 is a block diagram of a commodity time series data prediction system according to an embodiment of the present invention.

Fig. 8 is a schematic structural framework diagram of a commodity time series data prediction system according to an embodiment of the present invention.

Detailed Description

In some of the flows described in the present specification and claims and in the above figures, a number of operations are included that occur in a particular order, but it should be clearly understood that these operations may be performed out of order or in parallel as they occur herein, with the order of the operations being indicated as 101, 102, etc. merely to distinguish between the various operations, and the order of the operations by themselves does not represent any order of performance. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first", "second", etc. in this document are used for distinguishing different messages, devices, modules, etc., and do not represent a sequential order, nor limit the types of "first" and "second" to be different.

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In particular, with the rapid development of deep learning, methods that apply deep neural networks to commodity time series data keep emerging, such as the BP (back-propagation) feedforward neural network, the convolutional neural network and the recurrent neural network, all of which are designed after the structure of the neural networks of the human brain. However, these network models have certain defects. The BP neural network easily falls into a local optimum and cannot reach the global optimum. The convolutional neural network attends to local interval information determined by the size of the convolution kernel but has a weak grasp of global information. The RNN (Recurrent Neural Network) cannot solve the long-term dependence problem because the original data information is overwritten as the recursion proceeds over time. The LSTM (Long Short-Term Memory) network was designed to solve the long-term dependence problem of the RNN; it has a special gating mechanism whose gate units control how much of the current input is kept and how much of the earlier information is forgotten, but it cannot be computed in parallel. Commodity time series data have the characteristics of proximity, periodicity and seasonality, whereas the above network structures behave as follows: the convolutional neural network focuses on local interval information; the RNN attends only to adjacent interval information; and the LSTM can attend to long-term dependence but consumes a large amount of computing resources. Therefore, a conventional single method cannot capture the characteristics of commodity time series data well.

The embodiment of the invention provides a commodity time series data prediction method and system. The scheme approaches the problem from the proximity, periodicity and seasonality of commodities. The bottom layer of the method is based on GRU (Gated Recurrent Unit) network coding, because a GRU converges faster than an ordinary RNN, retains the effect of the LSTM and has a simpler structure. Multi-scale attention mechanism learning is then carried out on top of the GRU network: adjacent intervals and periodic intervals are continually divided according to the historical time series data, and external seasonal information is added. With this arrangement, the model can quickly and accurately capture the characteristics of commodity time series data.

According to a first aspect of the embodiments of the present invention, a method for predicting time series data of a commodity is provided.

As shown in fig. 1, the method for predicting time series data of a commodity includes:

S101, obtaining initial commodity time series data, preprocessing the initial commodity time series data, and storing it as standard commodity time series data;

S102, acquiring commodity time series features according to the standard commodity time series data, and obtaining first intermediate data by using GRU network coding;

S103, after performing adjacent interval division and periodic interval division according to the first intermediate data, performing multi-head self-attention mechanism weighted summation to obtain an adjacent interval high-level semantic information representation and a periodic interval high-level semantic information representation;

S104, performing splicing and feature transformation according to the adjacent interval high-level semantic information representation and the periodic interval high-level semantic information representation to generate information fusion data;

S105, acquiring and processing the features of the seasonal information data to generate seasonal feature data;

S106, performing splicing and feature transformation according to the information fusion data and the seasonal feature data to generate target fusion data;

and S107, performing characteristic transformation and activation functions according to the target fusion data, and predicting the price and sales volume of the commodity.

In the embodiment of the invention, commodity time series data is predicted using multi-scale attention, which specifically attends to the proximity, periodicity and seasonality of the commodity time series data. The proximity and periodicity of the time series data are encoded using a multi-head self-attention mechanism; on this basis, the seasonal factors of the commodity are taken into account and fused with the time series features as external information, thereby improving the accuracy of the model.

As shown in fig. 2, in one or more embodiments, preferably, the obtaining initial commodity time series data, preprocessing the initial commodity time series data, and storing the initial commodity time series data as standard commodity time series data specifically includes:

S201, crawling available commodity time series data from the Internet and saving it as first initial commodity time series data;

S202, entering existing commodity time series data manually and saving it as second initial commodity time series data;

S203, saving the first initial commodity time series data and the second initial commodity time series data together as the initial commodity time series data;

S204, performing data preprocessing on the initial commodity time series data by using a first calculation formula, and storing the result as the standard commodity time series data;

the first calculation formula is:

$$x' = \frac{x - min}{max - min}$$

In the embodiment of the invention, an acquisition mode and a data processing flow for the standard commodity time series data are provided. Sufficient commodity time series data can be acquired effectively through manual entry and internet crawling, but the acquired data may follow different data standards. To bring the acquired data to the same standard, normalization is performed using the first calculation formula, yielding the final standard commodity time series data. The processing flow in this embodiment ensures better consistency of the commodity time series data in subsequent prediction and can thereby improve the convergence speed of the model.
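The normalization step can be illustrated with a short sketch of min-max scaling, x' = (x - min) / (max - min), which is the form implied by the definitions of min and max given for the first calculation formula. The function name, the sample prices and the handling of a constant series are assumptions added for the example.

```python
def min_max_normalize(values):
    """Scale a sequence of numbers to [0, 1] using x' = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series maps to all zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

prices = [12.0, 15.0, 18.0, 24.0]        # toy crawled or manually entered prices
normalized = min_max_normalize(prices)   # [0.0, 0.25, 0.5, 1.0]
```

Because crawled and manually entered records may use different scales, applying the same mapping to every series puts them in a common [0, 1] range before encoding.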

As shown in fig. 3, in one or more embodiments, preferably, the obtaining of the commodity time series characteristic according to the standard commodity time series data and obtaining the first intermediate data by using the GRU network code specifically includes:

S301, extracting commodity time series features from the standard commodity time series data by using a second calculation formula;

S302, using the commodity time series features as the underlying data for GRU network coding by means of a third calculation formula to generate first intermediate data;

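the second calculation formula is:

$$X = \{x_1, x_2, \ldots, x_n\} \in \mathbb{R}^{n \times d}$$

where $\mathbb{R}^{n \times d}$ is a matrix of $n$ columns and $d$ rows;

the third calculation formula is:

$$H = \{h_1, h_2, \ldots, h_n\} \in \mathbb{R}^{n \times d}$$

where $\mathbb{R}^{n \times d}$ is a matrix of $n$ columns and $d$ rows.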

In the embodiment of the present invention, the standard commodity time series data is further processed. The main purpose of this processing is to obtain the underlying data for GRU network coding. Specifically, the first intermediate data takes the form of a set of d-dimensional features, one for each day, and the purpose of standardizing the form of the first intermediate data is to ensure that subsequent data processing can proceed normally.
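As an illustration of the encoding step, the sketch below runs a single-layer GRU over n days of d-dimensional features and collects one hidden state per day. It uses the commonly published GRU update equations; the weight initialization, the hidden size being equal to d, and the one-day-per-row layout are assumptions for the example and are not specified in this document.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_encode(X, params):
    """Encode X (n days x d features) into H (n x d), one hidden state per day."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    h = np.zeros(X.shape[1])
    H = []
    for x in X:
        z = sigmoid(Wz @ x + Uz @ h + bz)              # update gate
        r = sigmoid(Wr @ x + Ur @ h + br)              # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)  # candidate state
        h = (1.0 - z) * h + z * h_tilde                # blend old and candidate state
        H.append(h)
    return np.stack(H)

rng = np.random.default_rng(0)
n, d = 14, 4                      # toy sizes: 14 days, 4 features per day
X = rng.random((n, d))            # stands in for normalized commodity features
params = []
for _ in range(3):                # update gate, reset gate, candidate weights
    params += [rng.normal(scale=0.1, size=(d, d)),
               rng.normal(scale=0.1, size=(d, d)),
               np.zeros(d)]
H = gru_encode(X, params)         # first intermediate data in this sketch
```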

As shown in fig. 4, in one or more embodiments, preferably, after performing adjacent interval division and cycle interval division according to the first intermediate data, performing multi-head self-attention mechanism weighted summation to obtain an adjacent interval high-level semantic information representation and a cycle interval high-level semantic information representation, specifically including:

S401, dividing the first intermediate data into adjacent intervals to obtain second intermediate data in the form of a fourth calculation formula;

S402, dividing the first intermediate data into periodic intervals to obtain third intermediate data in the form of a fifth calculation formula;

S403, performing multi-head self-attention mechanism weighted summation according to the second intermediate data to obtain the adjacent interval high-level semantic information representation;

S404, performing multi-head self-attention mechanism weighted summation according to the third intermediate data to obtain the periodic interval high-level semantic information representation;

the fourth calculation formula is:

wherein,

the fifth calculation formula is:

wherein the result of the fifth calculation formula is said third intermediate data, h_i is the feature of the i-th day in the first intermediate data, and i is an integer greater than 0 and less than or equal to n;

the sixth calculation formula is:

The seventh calculation formula is:

wherein head_i is the i-th single-head self-attention, W_i^Q is the i-th first transformation matrix, W_i^K is the i-th second transformation matrix, W_i^V is the i-th third transformation matrix, and Attention is said weight mapping function;

the eighth calculation formula is:

wherein MultiHead is the multi-head attention, Concat splices the plurality of single-head self-attentions, head_i is the i-th single-head self-attention, and W^o is a fourth transformation matrix.
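The bodies of the sixth to eighth calculation formulas are not reproduced in this text. Assuming they follow the standard scaled dot-product multi-head attention formulation, they would be consistent with the symbol definitions above if written as below. This is an assumption rather than the patent's confirmed notation: the symbols Q, K, V, d_k and the head count h are introduced here for illustration only.

```latex
\begin{align}
\mathrm{Attention}(Q, K, V) &= \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V \\
\mathrm{head}_i &= \mathrm{Attention}\!\left(Q W_i^{Q},\; K W_i^{K},\; V W_i^{V}\right) \\
\mathrm{MultiHead}(Q, K, V) &= \mathrm{Concat}\!\left(\mathrm{head}_1, \dots, \mathrm{head}_h\right) W^{o}
\end{align}
```

In self-attention, Q, K and V are all taken from the same input, here the second or the third intermediate data.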

In this embodiment of the invention, the adjacent interval high-level semantic information representation is obtained, which reflects the proximity of the commodity time sequence data, and the periodic interval high-level semantic information representation is obtained, which reflects the periodicity of the commodity time sequence data. Once these two representations are available, deeper information fusion and extraction can be carried out more conveniently.
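The interval division and attention-weighted summation can be illustrated with a small sketch. The exact division rules are given by formulas not reproduced here, so the code assumes that the adjacent interval is the most recent `window` days and that the periodic interval groups days by their position in a 7-day cycle. The attention is a single head with identity projections, a simplified stand-in for the multi-head mechanism. All function names are hypothetical.

```python
import math


def split_adjacent(states, window):
    """Keep the most recent `window` days (assumed meaning of 'adjacent interval'; window >= 1)."""
    return states[-window:]


def split_periodic(states, period=7):
    """Group days by position in the cycle (assumed period of 7 days)."""
    return [states[k::period] for k in range(period)]


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(rows):
    """Single-head scaled dot-product self-attention with identity projections."""
    d = len(rows[0])
    out = []
    for q in rows:
        # Similarity of this day to every day in the interval.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in rows]
        weights = softmax(scores)
        # Weighted sum of all days' features.
        out.append([sum(w * v[j] for w, v in zip(weights, rows)) for j in range(d)])
    return out


if __name__ == "__main__":
    states = [[float(i), 1.0] for i in range(14)]  # 14 days, d = 2
    adjacent = self_attention(split_adjacent(states, window=3))
    periodic = [self_attention(group) for group in split_periodic(states)]
    print(len(adjacent), len(periodic))
```

Each output row is a weighted sum over the days of its interval, so it has the same dimension d as the input features.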

As shown in Fig. 5, in one or more embodiments, preferably, generating the information fusion data by splicing and feature transformation of the adjacent interval high-level semantic information representation and the periodic interval high-level semantic information representation specifically includes:

S501, arranging the periodic interval high-level semantic information representation into the form of a ninth calculation formula and performing information fusion to obtain periodic high-level semantic features, wherein the periodic high-level semantic features form a matrix whose two dimensions are 7 and d;

S502, splicing the periodic high-level semantic features with the adjacent interval high-level semantic information representation to obtain fourth intermediate data;

S503, performing feature transformation on the fourth intermediate data by using a tenth calculation formula to obtain the information fusion data;

the ninth calculation formula is:

wherein the symbol used in the ninth calculation formula denotes the periodic interval high-level semantic information representation;

the tenth calculation formula is:

In this embodiment of the invention, the key point is how to fuse the adjacent interval high-level semantic information representation with the periodic interval high-level semantic information representation. The two features are fused by splicing their matrices and then superimposing and fusing the data of the two matrices through feature transformation, so that the resulting information fusion data contains both the adjacent interval semantics and the periodic interval semantics.
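A minimal sketch of this splice-then-transform fusion follows. It assumes that splicing stacks the two matrices along the day axis and that the feature transformation is a plain linear map with a weight matrix `W`. The tenth calculation formula is not reproduced in this text, so both choices, and the names `stack_rows` and `fuse`, are illustrative assumptions.

```python
def stack_rows(a, b):
    """Splice two matrices (lists of rows) that share the same feature dimension."""
    if a and b and len(a[0]) != len(b[0]):
        raise ValueError("feature dimensions differ")
    return a + b


def matmul(A, B):
    """Plain matrix product of A (m x k) and B (k x d)."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("inner dimensions differ")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def fuse(periodic, adjacent, W):
    """Stack periodic and adjacent features, then mix the rows with weight matrix W."""
    stacked = stack_rows(periodic, adjacent)  # (7 + k) x d
    return matmul(W, stacked)                 # m x d


if __name__ == "__main__":
    periodic = [[1.0, 2.0] for _ in range(7)]  # 7 periodic feature rows, d = 2
    adjacent = [[3.0, 4.0] for _ in range(3)]  # 3 adjacent feature rows
    W = [[0.1] * 10]                           # placeholder weights, m = 1
    print(fuse(periodic, adjacent, W))
```

Because every output row is a weighted mix of all stacked rows, the fused result depends on both the adjacent and the periodic features.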

In this embodiment of the invention, the seasonal information data is input and seasonal feature data in vector form is generated, which lays a foundation for the subsequent fusion. Word2Vec is a model for generating word vectors; it maps each word to a vector and can represent word-to-word relationships, and Word2Vec word vectors express similarity and analogy relationships between different words well. For example, spring is mapped to one word vector matrix and summer to another, and the two word vector matrices carry different semantic information.
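The idea of mapping each season to a vector can be illustrated with a lookup table. The vectors below are hand-picked placeholders, not the output of a trained Word2Vec model, and the names `SEASON_VECTORS`, `encode_season` and `cosine` are hypothetical. The sketch shows only that each season receives its own vector and that vector similarity can be compared.

```python
import math

# Placeholder embeddings standing in for trained Word2Vec vectors (assumed values).
SEASON_VECTORS = {
    "spring": [0.9, 0.1, 0.3],
    "summer": [0.8, 0.6, 0.1],
    "autumn": [0.2, 0.7, 0.5],
    "winter": [0.1, 0.2, 0.9],
}


def encode_season(name):
    """Return a copy of the vector for a season name (case-insensitive)."""
    return list(SEASON_VECTORS[name.lower()])


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    spring, summer = encode_season("spring"), encode_season("summer")
    print(cosine(spring, summer))
```

In practice the table would be replaced by vectors learned from text, so that seasons used in similar contexts end up with similar vectors.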

As shown in Fig. 6, in one or more embodiments, preferably, generating the target fusion data by splicing and feature transformation of the information fusion data and the seasonal feature data specifically includes:

S601, splicing the information fusion data with the seasonal feature data to obtain fifth intermediate data;

S602, performing feature transformation on the fifth intermediate data by using an eleventh calculation formula to obtain the target fusion data;

the eleventh calculation formula is:

In this embodiment of the invention, once the information fusion data and the seasonal feature data have been obtained, they are spliced and the data of the two matrices is then superimposed and fused through feature transformation. This finally yields high-level semantic features that carry the proximity, periodicity and seasonality of the time sequence data, namely the target fusion data.
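One way to picture this step is to flatten the information fusion data, append the seasonal vector, and apply a linear transformation. The eleventh calculation formula is not reproduced in this text, so the flattening, the weights `W` and the bias `b` are assumptions, and `target_fusion` is a hypothetical name.

```python
def flatten(matrix):
    """Turn a matrix (list of rows) into a single vector."""
    return [value for row in matrix for value in row]


def target_fusion(info_fusion, season_vec, W, b):
    """Concatenate flattened fusion data with the season vector, then apply W x + b."""
    x = flatten(info_fusion) + list(season_vec)
    if any(len(row) != len(x) for row in W):
        raise ValueError("weight matrix width must match the spliced vector length")
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


if __name__ == "__main__":
    info_fusion = [[1.0, 2.0]]   # placeholder information fusion data
    season_vec = [3.0]           # placeholder seasonal feature data
    W = [[1.0, 1.0, 1.0], [0.0, 0.0, 2.0]]
    b = [0.0, 1.0]
    print(target_fusion(info_fusion, season_vec, W, b))
```

The output vector mixes entries from both inputs, which is the sense in which the target fusion data carries proximity, periodicity and seasonality together.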

the twelfth calculation formula is:

In this embodiment of the invention, the rectified linear unit (ReLU) is an activation function used in artificial neural networks and generally refers to the nonlinear functions represented by the ramp function and its variants. Applying the prediction operation yields the final prediction matrix, which contains the predicted price and sales volume of the commodity.
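The prediction step can be sketched as a linear transformation followed by ReLU. The twelfth calculation formula is not reproduced in this text, so the shape of the weights, the treatment of the target fusion data as a flat vector, and the name `predict` are assumptions made for illustration.

```python
def relu(x):
    """Ramp function: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def predict(target_fusion, W, b):
    """Compute ReLU(W x + b); with two rows in W the outputs are [price, sales volume]."""
    return [relu(sum(w * x for w, x in zip(row, target_fusion)) + bi)
            for row, bi in zip(W, b)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]                          # placeholder target fusion data
    W = [[1.0, 0.0, 1.0], [-1.0, -1.0, 0.0]]     # placeholder weights
    b = [0.5, 1.0]
    print(predict(x, W, b))
```

ReLU keeps both outputs non-negative, which matches the fact that prices and sales volumes cannot be negative.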

According to a second aspect of the embodiments of the present invention, a system for predicting time series data of a commodity is provided.

As shown in fig. 7, the product time series data prediction system includes:

a time sequence data acquisition module 701, configured to acquire the commodity time sequence characteristics according to the standard commodity time sequence data and to obtain first intermediate data by using GRU network coding;

a first feature obtaining module 702, configured to perform multi-head self-attention mechanism weighted summation after performing adjacent interval division and cycle interval division according to the first intermediate data, to obtain a high-level semantic information representation of an adjacent interval and a high-level semantic information representation of a cycle interval;

a second feature obtaining module 703, configured to perform splicing and feature transformation according to the adjacent interval high-level semantic information representation and the periodic interval high-level semantic information representation, and generate information fusion data;

a first semantic fusion module 704, configured to perform feature acquisition and processing on seasonal information data, and generate seasonal feature data;

a seasonal data acquisition module 705, configured to perform splicing and feature transformation according to the information fusion data and the seasonal feature data, and generate target fusion data;

a second semantic fusion module 706, configured to perform feature transformation and apply an activation function according to the target fusion data, and predict the price and sales volume of the commodity;

a target prediction module 707, configured to perform feature transformation and apply an activation function according to the target fusion data, and predict the price and sales volume of the commodity;

a display module 708, configured to display the information fusion data, the seasonal feature data, and the price and sales volume of the commodity;

a storage module 709, configured to store the standard commodity time sequence data, the first intermediate data, the adjacent interval high-level semantic information representation, the cycle interval high-level semantic information representation, the information fusion data, the target fusion data, and the price and sales volume of the commodity.

Fig. 8 is a schematic structural framework diagram of a commodity time series data prediction system according to an embodiment of the present invention. Conventional methods for processing time series data consider only the proximity of the data and do not divide the data according to factors such as period, proximity and season. The commodity time sequence data prediction system has two framework parts, namely an adjacent interval framework and a periodic interval framework. When the data is encoded through a GRU network, each state of the encoded data carries forward information and backward information owing to the gating mechanism of the GRU. The encoded data is divided into adjacent intervals and periodic intervals; a multi-head self-attention mechanism fuses the information in the adjacent intervals, and the data of the periodic intervals is fused in the same way. To account for the external factor of seasonal information, the seasons are encoded using Word2Vec. Finally, the three parts of information, namely the adjacent interval feature, the periodic interval feature and the seasonal feature, are fused, and the predicted commodity price and sales volume are output.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A commodity time series data prediction method is characterized by comprising the following steps:

2. The method for predicting the commodity time series data according to claim 1, wherein the obtaining of the initial commodity time series data, the preprocessing of the initial commodity time series data, and the storage of the initial commodity time series data as standard commodity time series data specifically comprises:

the first calculation formula is:

3. The method according to claim 1, wherein the obtaining of the time series characteristics of the commodity according to the standard commodity time series data and the obtaining of the first intermediate data by using the GRU network coding specifically include:

the second calculation formula is:

whose result is a matrix R of n columns and d rows;

the third calculation formula is:

whose result is a matrix R of n columns and d rows.

4. The method for predicting the commodity time series data according to claim 1, wherein after the adjacent interval division and the cycle interval division are performed according to the first intermediate data, the multi-head self-attention mechanism weighted summation is performed to obtain the adjacent interval high-level semantic information representation and the cycle interval high-level semantic information representation, and the method specifically comprises the following steps:

the fourth calculation formula is:

wherein,

the fifth calculation formula is:

wherein,

the sixth calculation formula is:

The seventh calculation formula is:

the eighth calculation formula is:

wherein MultiHead is the multi-head attention, Concat splices the plurality of single-head self-attentions, head_i is the i-th single-head self-attention, and W^o is a fourth transformation matrix.

5. The method for predicting the commodity time series data according to claim 1, wherein the generating of the information fusion data by performing splicing and feature transformation according to the adjacent interval high-level semantic information representation and the periodic interval high-level semantic information representation specifically comprises:

the ninth calculation formula is:

wherein the symbol used in the ninth calculation formula denotes the periodic interval high-level semantic information representation;

the tenth calculation formula is:

6. The method for predicting commodity time series data according to claim 1, wherein the obtaining and processing of the characteristics of the seasonal information data to generate seasonal feature data specifically comprises: according to the seasonal factor information, performing text coding on the spring, summer, autumn and winter segments in the seasonal factor information by using Word2Vec to generate the seasonal feature data.

7. The method for predicting time series data of commodities according to claim 1, wherein the generating of target fusion data by performing splicing and feature transformation according to the information fusion data and the seasonal feature data specifically comprises:

the eleventh calculation formula is:

8. The method for predicting the time series data of the commodities, as claimed in claim 1, wherein said performing the feature transformation and the activation function according to the target fusion data to predict the prices and sales of the commodities specifically comprises:

the twelfth calculation formula is:

9. A system for predicting time series data of a commodity, the system comprising:

10. The system for forecasting merchandise temporal data according to claim 9, further comprising: