CN113052618A - Data prediction method and related equipment - Google Patents

Info

Publication number: CN113052618A
Application number: CN201911374073.6A
Authority: CN (China)
Prior art keywords: time, data, time series, tensor, dimension
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 史启权, 蔡嘉俊, 尹嘉铭, 陈磊, 袁明轩
Assignee: Huawei Technologies Co Ltd (current and original)
Application filed by Huawei Technologies Co Ltd

Classifications

    • G06Q30/0202: Market predictions or forecasting for commercial activities
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G06N3/045: Neural network architectures; Combinations of networks
    • G06N3/08: Neural network learning methods


Abstract

The embodiment of the application discloses a data prediction method, which can be applied to data prediction in the field of artificial intelligence and comprises the following steps: acquiring a plurality of time series, wherein each time series comprises data arranged in a time dimension, the plurality of time series have the same dimension, and the changes of the data in the time dimension have an association relationship; mapping the plurality of time series into a target tensor, wherein the dimension of the target tensor is greater than that of the time series, and the target tensor contains the data included in the plurality of time series; and processing the target tensor through a first prediction model to obtain a prediction result, wherein the prediction result comprises a data prediction value of each of the plurality of time series in the time dimension. By mapping the time series into a higher-order multidimensional target tensor, the method enriches the input of the prediction model, enables the prediction model to learn the association relationships between the time series, and improves the prediction accuracy of the prediction model.

Description

Data prediction method and related equipment
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a data prediction method and related device.
Background
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making. Data prediction using artificial intelligence is a common application of artificial intelligence.
In some scenarios, the sales volume and demand volume of certain products need to be predicted. In the prior art, sales and demand can be predicted by a data prediction model that takes as input a time series of the product comprising its historical sales and historical demand.
However, some products, such as electronic products (mobile phones or notebook computers), have a short life cycle. In such cases, demand prediction and sales prediction must be based on very short (small-sample) time series, and the prediction effect is poor.
Disclosure of Invention
The embodiment of the application provides a data prediction method, which comprises the following steps: acquiring a plurality of time series, wherein each time series comprises data arranged in a time dimension, the plurality of time series have the same dimension, and the changes of the data in the time dimension have an association relationship; mapping the plurality of time series into a target tensor, wherein the dimension of the target tensor is greater than that of the time series, and the target tensor contains the data included in the plurality of time series; and processing the target tensor through a first prediction model to obtain a prediction result, wherein the prediction result comprises a data prediction value of each of the plurality of time series in the time dimension. In the embodiment of the application, mapping the plurality of time series into a higher-order multidimensional target tensor enriches the input of the prediction model, so that the prediction model can learn the association relationships among the plurality of time series, which improves the prediction accuracy of the prediction model.
In an optional implementation of the first aspect, the plurality of time series includes a first time series and a second time series, the first time series includes data representing a change in a first characteristic of a first object in a time dimension, the second time series includes data representing a change in the first characteristic of a second object in the time dimension, and the first object and the second object have an association relationship in the first characteristic.
In an optional implementation of the first aspect, the plurality of time series includes a third time series and a fourth time series, the third time series includes data used for representing a change of a second characteristic of a third object in a time dimension, the fourth time series includes data used for representing a change of a third characteristic of the third object in the time dimension, and the second characteristic of the third object and the third characteristic of the third object have an association relationship.
In an optional implementation of the first aspect, the processing the target tensor by the first prediction model to obtain a prediction result includes: performing tensor decomposition on the target tensor to obtain a core tensor; and processing the core tensor through the first prediction model to obtain the prediction result. In the embodiment of the application, the core tensor obtained by tensor decomposition is used both to train the model and, when data prediction is performed, as the input of the prediction model. Because the intrinsic temporal correlations are easier to capture from the core tensor than from the original data (the target tensor), the computation amount and storage requirements of the first prediction model during training can be reduced.
In an optional implementation of the first aspect, the processing the core tensor by the first prediction model includes: processing the core tensor through the first prediction model to obtain a first tensor; processing the first tensor through the inverse transformation of the tensor decomposition to obtain a second tensor; and mapping the second tensor into a plurality of prediction value sequences, wherein each prediction value sequence corresponds to one time series and comprises the data prediction values of the corresponding time series in the time dimension.
In an optional implementation of the first aspect, the tensor decomposition of the target tensor includes: performing a Tucker decomposition on the target tensor based on a mapping matrix, wherein the mapping matrix has no orthogonal constraint in the time dimension when the Tucker decomposition is performed.
In an optional implementation of the first aspect, the mapping the plurality of time series to the corresponding target tensor includes: processing the plurality of time series by data augmentation in the time dimension; and mapping the plurality of time series after the data augmentation processing into the target tensor. In the embodiment of the present application, the plurality of time series are augmented in the time dimension. This is aimed at small-sample time series: because their information is very limited, the prior art cannot use them directly as input for training and running a prediction model, or cannot obtain a good prediction result.
In an optional implementation of the first aspect, the mapping the plurality of time series to corresponding target tensors includes: mapping a plurality of time series to the target tensor in a time dimension by a multi-dimensional delay embedding transformation (MDT).
In an optional implementation of the first aspect, the first prediction model is an autoregressive integrated moving average (ARIMA) model supporting tensor inputs.
In a second aspect, the present application provides a data prediction method, including: acquiring a plurality of time series and target data corresponding to each time series, wherein each time series comprises a plurality of first data arranged in a time dimension, the target data is located after the plurality of first data of each time series in the time dimension, the time series have the same dimension, and the changes of the first data in the time dimension have an association relationship; mapping the time series into a target tensor, the dimension of the target tensor being greater than the dimension of the time series, the target tensor containing the data included in the plurality of time series; processing the target tensor through a second prediction model to obtain a prediction result, wherein the second prediction model is a model that has not undergone iterative training, and the prediction result comprises a data prediction value of each of the plurality of time series in the time dimension; iteratively training the second prediction model with a first loss function according to the data prediction value of each time series in the time dimension and the corresponding target data, until the similarity between the data prediction value of each time series in the time dimension and the corresponding target data reaches a first preset degree; and outputting a first prediction model, wherein the first prediction model is obtained after the iterative training of the second prediction model.
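To make this training procedure concrete, the following is a minimal sketch in Python of such an iterative loop. The LinearForecaster class, the mean-squared-error loss, and all hyperparameters are illustrative assumptions, standing in for the second prediction model and the first loss function of this application, which are not limited to any specific form.

```python
import numpy as np

class LinearForecaster:
    """Illustrative stand-in for the second prediction model (an assumption)."""
    def __init__(self, T):
        self.w = np.zeros(T)                     # one weight per past time step

    def predict(self, series):                   # series: I x T matrix of first data
        return series @ self.w                   # one data prediction value per series

    def update(self, series, preds, targets, lr):
        grad = series.T @ (preds - targets) / len(targets)
        self.w -= lr * grad                      # one iterative-training step

def train(series, targets, lr=0.01, tol=1e-4, max_iter=10000):
    model = LinearForecaster(series.shape[1])
    for _ in range(max_iter):
        preds = model.predict(series)
        loss = np.mean((preds - targets) ** 2)   # the "first loss function" (MSE assumed)
        if loss < tol:                           # similarity reached the first preset degree
            return model                         # output as the first prediction model
        model.update(series, preds, targets, lr)
    return model

# Usage: 1000 series of 40 first data each, plus one target value per series.
model = train(np.random.rand(1000, 40), np.random.rand(1000))
```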
In an optional implementation of the second aspect, the plurality of time series includes a first time series and a second time series, the first time series includes first data for representing a change in a first characteristic of a first object in a time dimension, the second time series includes first data for representing a change in the first characteristic of a second object in the time dimension, and the first object and the second object have an association relationship in the first characteristic.
In an optional implementation of the second aspect, the plurality of time series includes a third time series and a fourth time series, the third time series includes first data for representing a change of a second characteristic of a third object in a time dimension, the fourth time series includes first data for representing a change of a third characteristic of the third object in the time dimension, and the second characteristic of the third object and the third characteristic of the third object have an association relationship.
In an optional implementation of the second aspect, the processing the target tensor by the second prediction model to obtain the prediction result includes: carrying out tensor decomposition on the target tensor to obtain a core tensor; and processing the core tensor through the second prediction model to obtain the prediction result.
In an optional implementation of the second aspect, the processing the core tensor by the second prediction model includes: processing the core tensor through the second prediction model to obtain a first tensor; processing the first tensor through the inverse transformation of the tensor decomposition to obtain a second tensor; and mapping the second tensor into a plurality of prediction value sequences, wherein each prediction value sequence corresponds to one time series and comprises the data prediction values of the corresponding time series in the time dimension.
In an optional implementation of the second aspect, the tensor decomposition of the target tensor includes: performing a Tucker decomposition on the target tensor based on a mapping matrix, wherein the mapping matrix has no orthogonal constraint in the time dimension when the Tucker decomposition is performed.
In an optional implementation of the second aspect, the mapping the plurality of time series to corresponding target tensors includes: processing a plurality of time series in a data augmentation mode in the time dimension; and mapping the plurality of time sequences after the data augmentation processing into the target tensor.
In an optional implementation of the second aspect, the mapping the plurality of time series to corresponding target tensors includes: mapping a plurality of time series to the target tensor in a time dimension by a multi-dimensional delay embedding transformation (MDT).
In an optional implementation of the second aspect, the second prediction model and the first prediction model are autoregressive integrated moving average (ARIMA) models supporting tensor inputs.
In a third aspect, the present application provides an execution apparatus, comprising:
the acquisition module is used for acquiring a plurality of time sequences, wherein each time sequence comprises data arranged in a time dimension, the time sequences have the same dimension, and the change of the data in the time dimension has an association relation;
a mapping module, configured to map the plurality of time series into a target tensor, a dimension of the target tensor being greater than a dimension of the time series, the target tensor including data included in the plurality of time series;
and the prediction module is used for processing the target tensor through a first prediction model to obtain a prediction result, and the prediction result comprises a data prediction value of each time sequence in the plurality of time sequences in a time dimension.
In an optional implementation of the third aspect, the plurality of time series includes a first time series and a second time series, the first time series includes data representing a change in a first characteristic of a first object in a time dimension, the second time series includes data representing a change in the first characteristic of a second object in the time dimension, and the first object and the second object have an association relationship in the first characteristic.
In an optional implementation of the third aspect, the plurality of time series includes a third time series and a fourth time series, the third time series includes data used for representing a change of a second characteristic of a third object in a time dimension, the fourth time series includes data used for representing a change of a third characteristic of the third object in the time dimension, and the second characteristic of the third object and the third characteristic of the third object have an association relationship.
In an optional implementation of the third aspect, the prediction module is specifically configured to:
carrying out tensor decomposition on the target tensor to obtain a core tensor;
and processing the core tensor through the first prediction model to obtain the prediction result.
In an optional implementation of the third aspect, the prediction module is specifically configured to:
processing the core tensor through the first prediction model to obtain a first tensor;
processing the first tensor through tensor decomposition inverse transformation to obtain a second tensor;
mapping the second tensor into a plurality of prediction value sequences, wherein each prediction value sequence corresponds to one time series and comprises the data prediction values of the corresponding time series in the time dimension.
In an optional implementation of the third aspect, the prediction module is specifically configured to:
performing a Tucker decomposition on the target tensor based on a mapping matrix, wherein the mapping matrix is free of orthogonal constraints in a time dimension when the Tucker decomposition is performed.
In an optional implementation of the third aspect, the mapping module is specifically configured to:
processing a plurality of time series in a data augmentation mode in the time dimension;
and mapping the plurality of time sequences after the data augmentation processing into the target tensor.
In an optional implementation of the third aspect, the mapping module is specifically configured to:
mapping a plurality of time series to the target tensor in a time dimension by a multi-dimensional delay embedding transformation (MDT).
In an optional implementation of the third aspect, the first prediction model is an autoregressive integrated moving average (ARIMA) model supporting tensor inputs.
In a fourth aspect, the present application provides a training apparatus, the apparatus comprising:
the acquisition module is used for acquiring a plurality of time sequences and target data corresponding to each time sequence, wherein each time sequence comprises a plurality of first data arranged in a time dimension, the target data is positioned behind the plurality of first data included in each time sequence in the time dimension, the time sequences have the same dimension, and the change of the data included in the time sequences in the time dimension has an association relationship;
a mapping module, configured to map the time series into a target tensor, a dimension of the target tensor being greater than a dimension of the time series, the target tensor including data included in the plurality of time series;
the prediction module is used for processing the target tensor through a second prediction model to obtain a prediction result, wherein the second prediction model is a model which is not subjected to iterative training, and the prediction result comprises a data prediction value of each time sequence in the plurality of time sequences in the time dimension;
the iterative training module is used for performing iterative training on the second prediction model by using a first loss function according to the data prediction value of each time sequence in the time dimension and the corresponding target data until the similarity between the data prediction value of each time sequence in the time dimension and the corresponding target data reaches a first preset degree;
and the output module is used for outputting a first prediction model, and the first prediction model is obtained after the second prediction model is subjected to iterative training.
In an optional implementation of the fourth aspect, the plurality of time series includes a first time series and a second time series, the first time series includes first data for representing a change in a first characteristic of a first object in a time dimension, the second time series includes first data for representing a change in the first characteristic of a second object in the time dimension, and the first object and the second object have an association relationship in the first characteristic.
In an optional implementation of the fourth aspect, the plurality of time series includes a third time series and a fourth time series, the third time series includes first data for representing a change of the second characteristic of the third object in the time dimension, the fourth time series includes first data for representing a change of the third characteristic of the third object in the time dimension, and the second characteristic of the third object and the third characteristic of the third object have an association relationship.
In an optional implementation of the fourth aspect, the prediction module is specifically configured to perform tensor decomposition on the target tensor to obtain a core tensor; and processing the core tensor through the second prediction model to obtain the prediction result.
In an optional implementation of the fourth aspect, the prediction module is specifically configured to process the core tensor through the second prediction model to obtain a first tensor; process the first tensor through the inverse transformation of the tensor decomposition to obtain a second tensor; and map the second tensor into a plurality of prediction value sequences, wherein each prediction value sequence corresponds to one time series and comprises the data prediction values of the corresponding time series in the time dimension.
In an optional implementation of the fourth aspect, the prediction module is specifically configured to perform a Tucker decomposition on the target tensor based on a mapping matrix, where, in the Tucker decomposition, the mapping matrix has no orthogonal constraint in a time dimension.
In an optional implementation of the fourth aspect, the mapping module is specifically configured to perform data augmentation processing on a plurality of time series in the time dimension; and mapping the plurality of time sequences after the data augmentation processing into the target tensor.
In an optional implementation of the fourth aspect, the mapping module is specifically configured to map the plurality of time series into the target tensor through a multi-dimensional time-delay embedding transformation MDT in a time dimension.
In an optional implementation of the fourth aspect, the second prediction model and the first prediction model are autoregressive integrated moving average (ARIMA) models supporting tensor inputs.
In a fifth aspect, the present application provides a training device, including a processor and a memory, the processor being coupled with the memory; the memory is used for storing a program; and the processor is configured to execute the program in the memory, so that the training device performs the method according to the second aspect.
In a sixth aspect, the present application provides a computer-readable storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the data prediction method according to the first aspect or the second aspect.
In a seventh aspect, the present application provides a computer program, which when run on a computer, causes the computer to execute the data prediction method according to the first aspect or the second aspect.
In an eighth aspect, the present application provides a chip system, which includes a processor for enabling an executing device or a training device to implement the functions referred to in the above aspects, for example, to transmit or process data and/or information referred to in the above methods. In one possible design, the system-on-chip further includes a memory for storing program instructions and data necessary for the execution device or the training device. The chip system may be formed by a chip, or may include a chip and other discrete devices.
In a ninth aspect, the present application provides a data prediction method, comprising:
acquiring a plurality of time sequences, wherein each time sequence comprises data arranged in a time dimension, each time sequence is used for representing the change of one characteristic of one sales commodity in the time dimension, the characteristic is sales volume or demand volume, the time sequences have the same dimension, and the change of the data in the time dimension has an association relationship;
mapping the plurality of time series into a target tensor, wherein the dimension of the target tensor is larger than that of the time series, and the target tensor contains data included by the plurality of time series;
and processing the target tensor through a first prediction model to obtain a prediction result, wherein the prediction result comprises a data prediction value of each time sequence in the plurality of time sequences in a time dimension.
In an optional implementation of the ninth aspect, the plurality of time series includes a first time series and a second time series, the first time series includes data representing a change in a time dimension of a first characteristic of a first sold commodity, the second time series includes data representing a change in a time dimension of the first characteristic of a second sold commodity, there is a correlation between the first characteristic of the first sold commodity and the first characteristic of the second sold commodity, and the first characteristic is a sales volume or a demand volume.
In an optional implementation of the ninth aspect, the plurality of time series includes a third time series and a fourth time series, the third time series includes data representing a change in a second characteristic of a third article for sale in a time dimension, the fourth time series includes data representing a change in a third characteristic of the third article for sale in the time dimension, the second characteristic of the third article for sale and the third characteristic of the third article for sale have an association relationship, and the second characteristic and the third characteristic are each a sales volume or a demand volume.
In the embodiment of the application, a plurality of time series are obtained, wherein each time series comprises data arranged in a time dimension, the plurality of time series have the same dimension, and the changes of the data included in the plurality of time series in the time dimension have an association relationship; the plurality of time series are mapped into a target tensor, wherein the dimension of the target tensor is greater than that of the time series, and the target tensor contains the data included in the plurality of time series; and the target tensor is processed through a first prediction model to obtain a prediction result, wherein the prediction result comprises a data prediction value of each of the plurality of time series in the time dimension. By this method, the time series are mapped into a higher-order multidimensional target tensor, which enriches the input of the prediction model, enables the prediction model to learn the association relationships between the time series, and improves the prediction accuracy of the prediction model.
Drawings
FIG. 1 is a schematic structural diagram of an artificial intelligence body framework provided by an embodiment of the present application;
FIG. 2 is a schematic illustration of a product hierarchy provided by an embodiment of the present application;
fig. 3 is a schematic diagram illustrating an embodiment of a data prediction method according to an embodiment of the present application;
FIG. 4a is a structural schematic of a repeating matrix provided in an embodiment of the present application;
fig. 4b is a flowchart of a multi-dimensional delay embedding transformation MDT provided in the embodiment of the present application;
fig. 5 is a flowchart illustrating an embodiment of a data prediction method according to an embodiment of the present application;
FIG. 6 is a system architecture diagram of a data processing system according to an embodiment of the present application;
FIG. 7 is a flowchart illustrating a data prediction method according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an execution device according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a training apparatus provided in an embodiment of the present application;
fig. 10 is a schematic structural diagram of an execution device according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a training apparatus according to an embodiment of the present disclosure;
fig. 12 is a schematic structural diagram of a chip according to an embodiment of the present disclosure.
Detailed Description
The embodiment of the application provides a data prediction method and related equipment, which map a plurality of associated time series into a higher-order target tensor and process the target tensor through a prediction model, so that the prediction model can learn the association relationships among the time series and the prediction accuracy is improved.
Embodiments of the present application are described below with reference to the accompanying drawings. As can be known to those skilled in the art, with the development of technology and the emergence of new scenarios, the technical solution provided in the embodiments of the present application is also applicable to similar technical problems.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and are merely descriptive of the various embodiments of the application and how objects of the same nature can be distinguished. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The general workflow of the artificial intelligence system will be described first. Please refer to fig. 1, which shows a schematic structural diagram of an artificial intelligence body framework; the framework is explained below from two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis). The "intelligent information chain" reflects a series of processes from data acquisition onward, for example the general processes of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision making, and intelligent execution and output. In this process, the data undergoes a "data-information-knowledge-wisdom" refinement process. The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure and information (provision and processing technology) up to the industrial ecology of the system.
(1) Infrastructure
The infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and is supported by a base platform. It communicates with the outside through sensors; computing power is provided by intelligent chips (hardware acceleration chips such as CPU, NPU, GPU, ASIC and FPGA); and the base platform comprises related platform guarantees and support such as distributed computing frameworks and networks, and may comprise cloud storage and computing, interconnection networks, and the like. For example, sensors communicate with the outside to acquire data, and the data is provided to intelligent chips in the distributed computing system provided by the base platform for computation.
(2) Data
Data at the upper level of the infrastructure is used to represent the data source for the field of artificial intelligence. The data relates to graphs, images, voice and texts, and also relates to the data of the Internet of things of traditional equipment, including service data of the existing system and sensing data such as force, displacement, liquid level, temperature, humidity and the like.
(3) Data processing
Data processing typically includes data training, machine learning, deep learning, searching, reasoning, decision making, and the like.
The machine learning and the deep learning can perform symbolized and formalized intelligent information modeling, extraction, preprocessing, training and the like on data.
Inference means a process of simulating an intelligent human inference mode in a computer or an intelligent system, using formalized information to think about and solve a problem by a machine according to an inference control strategy, and a typical function is searching and matching.
The decision-making refers to a process of making a decision after reasoning intelligent information, and generally provides functions of classification, sequencing, prediction and the like.
(4) General capabilities
After the above-mentioned data processing, further based on the result of the data processing, some general capabilities may be formed, such as algorithms or a general system, e.g. translation, analysis of text, computer vision processing, speech recognition, recognition of images, etc.
(5) Intelligent product and industrial application
Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they are the encapsulation of the overall artificial intelligence solution, commercializing intelligent information decision-making and realizing practical applications. The application fields mainly include: intelligent terminals, intelligent manufacturing, intelligent transportation, smart home, intelligent healthcare, intelligent security, autonomous driving, safe cities, etc.
The method can be applied to the field of time series prediction. Time series prediction uses the characteristics of an event over a past period of historical time to predict the characteristics of the event over a future period of time; it depends on the order in which events occur. Specifically, in some practical business scenarios, time series prediction needs to be performed on certain characteristics (such as sales volume, demand volume, and the like) of certain target objects (such as products) to predict how the characteristics of the products will change in the future.
A scenario in which the data prediction method of the present application is applicable is described below:
the data prediction method can be applied to a plurality of time series and the prediction problems of a certain relation between the time series, such as the demand prediction problem of multiple products of a supply chain, the power prediction problem of multiple regions (cities and subordinate regions) and the like. The application scenario of the data sequence prediction method of the present application is described below by taking demand prediction of a plurality of products having a tree structure as an example.
Referring to fig. 2, fig. 2 is a schematic illustration of a product hierarchy structure according to an embodiment of the present application.
Complete machine (pick to order, PTO): refers to a complete product for direct sale, such as a server or computer; this may be product 1 in fig. 2.
Semi-finished item (AI): refers to an intermediate part that needs to be processed by a factory; the semi-finished items may be products 2, 3, 4, 5, and 6 in fig. 2.
Procured item (PI): refers to a product obtained directly by procurement; the procured items may be products 7, 8, 9, 10, 11, and 12 in fig. 2.
In some scenarios, certain characteristics (referred to as a first characteristic or a second characteristic in the following embodiments) of certain products are associated with one another in the time dimension. For example, product A is a complete machine and product B is a semi-finished item used to assemble product A; in this case, the demand and sales of product A are associated with the demand of product B. In other words, the time series of the sales of product A may be affected by a change in the time series of product B. Taking this internal association between the time series (or products) into account, a plurality of associated time series can be used as input to train a time series prediction model, so that the model learns the association relationships among the plurality of time series; the trained model then predicts the plurality of time series simultaneously, and the prediction effect (accuracy) is better.
Illustratively, consider a notebook computer and its display screen, mainboard, and keyboard. Because the display screen, mainboard, and keyboard are components of the notebook computer, the trend of the data in the time series of notebook sales is associated with the trends of the data in the time series of the sales of the display screen, mainboard, keyboard, and so on. Likewise, the mainboard is associated with the fan, the memory, the hard disk drive, and the central processing unit (CPU): because these are components of the mainboard, the trends of the data in their sales time series are associated with the trend of the data in the mainboard's sales time series, and in turn with the trend of the data in the notebook's sales time series.
In some scenarios, certain characteristics of a single product are associated with one another in the time dimension; for example, the sales volume and demand volume of product A are associated. In other words, a change in the time series of one characteristic of a product may affect the time series of another characteristic of the same product. Taking this internal association between the time series into account, a plurality of associated time series can be used as input to train a time series prediction model, so that the model learns the association relationships among the plurality of time series; the trained model then predicts the plurality of time series simultaneously, and the prediction effect (accuracy) is good.
An autoregressive integrated moving average (ARIMA) model is a time series prediction method proposed by Box and Jenkins. It is a model established by converting a non-stationary time series into a stationary time series, and then regressing the dependent variable only on its lag values and on the present and lag values of a random error term. However, ARIMA cannot process multiple time series at the same time, so it cannot exploit the association relationships between the time series to improve the prediction effect. Moreover, since ARIMA processes each time series separately and estimates its model parameters, the computation time required for multiple time series, especially for data with large sample numbers, is very large.
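For readers unfamiliar with the classical workflow just described, the following is a minimal single-series sketch using the statsmodels Python library; the order (p, d, q) = (2, 1, 1) and the synthetic data are illustrative assumptions, not values prescribed by this application.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# One short sales series (synthetic, for illustration only).
sales = np.array([12., 15., 14., 18., 21., 20., 25., 27., 26., 30.])

# Classical ARIMA handles one series at a time: differencing (d=1) makes
# the series stationary, then AR (p=2) and MA (q=1) terms are fitted.
fitted = ARIMA(sales, order=(2, 1, 1)).fit()

# Forecast the next 3 time steps of this single series; every additional
# series would require its own independent model fit.
print(fitted.forecast(steps=3))
```

Note how nothing in this classical setup lets one series inform another; that limitation is what the tensor-based method below addresses.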
Based on this, the embodiment of the application provides a data prediction method.
For ease of understanding, several concepts related to the embodiments of the present application will be described below.
1. Time series:
A time series is a sequence of data points arranged in the order in which they occur in time. Typically, the time interval of a time series is a constant, so the time series can be analyzed as discrete-time data.
2. An autoregressive integrated moving average (ARIMA) model, proposed by Box and Jenkins, is a model established by converting a non-stationary time series into a stationary time series, and then regressing the dependent variable only on its lag values and on the present and lag values of a random error term.
3. Tensor: a multilinear function used to express linear relationships between vectors, scalars and other tensors. A one-dimensional tensor is called a vector, a two-dimensional tensor is called a matrix, and tensors of three or more dimensions are generally simply called tensors. Optionally, the dimension of a tensor is related to the number of data classes it includes; for example, a tensor that includes account data, resource data and tag data is a three-dimensional tensor.
4. Tensor decomposition: tensor decomposition extracts valuable features from raw data by decomposing it. Tensor is the general term for multidimensional data; vectors and matrices may be considered first-order (one-dimensional) and second-order (two-dimensional) tensors, respectively. In the real world, much data, such as video, is in tensor form. The Tucker decomposition is one of the most commonly used decomposition techniques. For a third-order tensor, the Tucker decomposition yields three second-order factor matrices and a third-order core tensor. Put another way: the Tucker decomposition maps the original tensor, via factor matrices (also called mapping matrices), to a core tensor with good properties (such as low rank). The core tensor can then be further used in classification, clustering, regression, and other tasks.
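A minimal sketch of this Tucker decomposition in Python, using the tensorly library (the library choice and the multilinear rank are illustrative assumptions; the application does not prescribe them):

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

# A third-order tensor, e.g. 1000 series x delay size 5 x 36 time slices.
X = tl.tensor(np.random.rand(1000, 5, 36))

# Tucker maps X, via three factor (mapping) matrices, to a small core
# tensor; the multilinear rank [20, 5, 36] here is purely illustrative.
core, factors = tucker(X, rank=[20, 5, 36])

print(core.shape)                   # (20, 5, 36): the core tensor
print([U.shape for U in factors])   # [(1000, 20), (5, 5), (36, 36)]
```

The core tensor is much smaller along the first mode than the original tensor, which is what later allows the prediction model to work with less computation and storage.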
Referring to fig. 3, fig. 3 is a schematic diagram of an embodiment of a data prediction method provided in an embodiment of the present application, and as shown in fig. 3, the data prediction method provided in the embodiment of the present application includes:
301. acquiring a plurality of time sequences, wherein each time sequence comprises data arranged in a time dimension, the plurality of time sequences have the same dimension, and the change of the data in the time dimension has an association relationship.
In this embodiment of the application, a plurality of time series may be obtained, where the plurality of time series includes a first time series and a second time series, the first time series includes data used to represent a change in a first characteristic of a first object in a time dimension, the second time series includes data used to represent a change in the first characteristic of a second object in the time dimension, and the first object and the second object have an association relationship in the first characteristic.
In the embodiment of the present application, the plurality of time series includes a first time series corresponding to the first object and a second time series corresponding to the second object. The first time series represents a first characteristic of the first object in the time dimension; in one scenario, the first characteristic may be a demand volume or a sales volume, and the first time series may represent the change in demand or sales of the first object in the time dimension. For example, the first time series may be the sales of the first object in each month of the past year. Likewise, the second time series may represent the change in demand or sales of the second object in the time dimension; for example, the second time series may be the sales of the second object in each month of the past year. The first object and the second object have an association relationship in the first characteristic, that is, the first object and the second object have an association relationship between products as shown in fig. 2, for example an association in sales volume or an association in demand volume, which is not limited here. It should be noted that this association relationship is possessed by the first object and the second object themselves.
In this embodiment, the plurality of time series may include a third time series and a fourth time series, where the third time series includes data used for representing a change of a second characteristic of a third object in a time dimension, and the fourth time series includes data used for representing a change of a third characteristic of the third object in the time dimension, and there is an association relationship between the second characteristic of the third object and the third characteristic of the third object.
In this embodiment, the plurality of time series may include a third time series and a fourth time series corresponding to a third object. The third time series is a time series of a second characteristic of the third object; the second characteristic may be a demand volume or a sales volume, and the third time series may indicate the change in demand or sales of the third object in the time dimension. For example, the third time series may be the sales of the third object in each month of the past year. The fourth time series may represent the change of a third characteristic of the third object in the time dimension; for example, the fourth time series may be the demand of the third object in each month of the past year. The second characteristic and the third characteristic of the third object have an association relationship, for example, the sales and the demand of the third object are associated. It should be noted that this association relationship is possessed by the second characteristic and the third characteristic themselves.
In the embodiment of the application, the changes of the data included in the plurality of time series in the time dimension have an association relationship: the time series are associated with one another in the time dimension, and a change in one time series can affect the other time series. Taking this internal association among the time series into account, the associated time series can be used as input to train a time series prediction model, so that the model learns the association relationships among the time series; the plurality of time series can then be predicted simultaneously through the trained model, and the prediction effect (accuracy) is good.
In one scenario, the first characteristic may be the power demand of a region. In this case, the first time series may include data representing the change in the power demand of region A in the time dimension, the second time series may include data representing the change in the power demand of region B in the time dimension, and region A and region B have an association relationship in power demand.
In this embodiment, the first object, the second object, and the third object may be products or commodities for sale, and their time series may be the monthly demand of the products over the past year. The data prediction method in this embodiment can therefore be applied to product demand prediction, predicting the monthly demand of the products over a future period of time.
For another example, the first object, the second object, and the third object may be regions; the regions may be provinces, cities, counties, districts, and the like in China, and their time series may be the daily power demand in watts of each region over the past month. The data prediction method in this embodiment can then be applied to power demand prediction, predicting the daily power demand of each region over a future period of time. It will be appreciated that there is a fixed hierarchy between regions; for example, a province may include multiple cities, and a city may include multiple districts, counties, and so on.
302. Mapping the plurality of time series into a target tensor, a dimension of the target tensor being greater than a dimension of the time series, the target tensor including the data included in the plurality of time series.
In the embodiment of the present application, after acquiring a plurality of time series, each time series including data arranged in a time dimension, the plurality of time series may be mapped to a target tensor, a dimension of the target tensor being greater than a dimension of the time series.
In some scenarios, the life cycle of a product is short, such as an electronic product (a mobile phone or a notebook computer, etc.), resulting in a short raw material supply cycle and a short product sale cycle. In such a case, the demand prediction and the sales prediction are based on very short (small number) samples, and the variation of the corresponding time series is large.
Based on this, the plurality of time series can be subjected to data augmentation processing in the time dimension to increase the data quantity of the time series in the time dimension, and the plurality of time series subjected to data augmentation processing are mapped to corresponding target tensors.
In the embodiment of the present application, a plurality of time series may be converted into a target tensor by using a transformation technique (also referred to as a spatial reconstruction technique), where the dimension of the target tensor is greater than the dimension of the time series. Specifically, the plurality of time series may be converted into the target tensor along the time dimension by using a multi-dimensional delay embedding transformation (MDT), a Fourier transform, a wavelet transform, empirical mode decomposition (EMD), or the like.
Next, the embodiment of the present application will be described by taking as an example that multiple time series are mapped to corresponding target tensors through the multi-dimensional delay embedding transformation MDT in the time dimension:
In the embodiment of the present application, the time series are converted into a target tensor along the time dimension by using the multi-dimensional delay embedding transformation (MDT), and the resulting target tensor may be referred to as a block Hankel tensor (BHT). If the transformation is performed only along the time dimension, the dimension of the target tensor is greater than that of the time series by 1.
In the embodiment of the present application, the plurality of time series may be processed by data augmentation in the time dimension based on a repeating matrix S_T. Specifically, referring to fig. 4a, which shows the structure of the repeating matrix S_T provided in the embodiment of the present application: S_T is a τ(T−τ+1) × T matrix whose diagonal comprises T−τ+1 identity blocks I_τ. Referring to fig. 4b, which shows a flowchart of the multi-dimensional delay embedding transformation MDT provided in the embodiment of the present application: after the matrix formed by arranging the I time series is multiplied by the repeating matrix S_T, a matrix augmented in the time dimension is obtained. The size of the augmented matrix is τ(T−τ+1) × I, and its amount of data in the time dimension, τ(T−τ+1), is greatly increased compared with the amount T in the time dimension of a time series without augmentation. The parameter τ is a preset parameter and may be chosen according to the actual situation, which is not limited here.
In this embodiment of the application, after the data-augmented matrix is obtained, it may be split in the time dimension to obtain a plurality of matrix slices; for example, as shown in fig. 4b, each matrix slice has size I × τ, and the target tensor is composed of the plurality of matrix slices. In the example of fig. 4b, the dimension of the target tensor is 3 and the dimension of the time series is 2, so the dimension of the target tensor is greater than that of the time series.
For example, if the number of time series is I = 1000, the length of each time series is T = 40, and the parameter τ = 5, performing the MDT transformation along the time dimension yields a three-dimensional target tensor of size 1000 × 5 × (40 − 5 + 1) = 1000 × 5 × 36. It should be noted that this is only an example and does not limit the present application.
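The MDT step just described can be reproduced with a short numpy sketch; the block construction of the repeating matrix follows fig. 4a, the function names are illustrative, and the shapes match the 1000 × 5 × 36 example above:

```python
import numpy as np

def repeating_matrix(T: int, tau: int) -> np.ndarray:
    # S_T is tau*(T-tau+1) x T, stacking T-tau+1 identity blocks I_tau,
    # each shifted one column to the right (fig. 4a).
    S = np.zeros((tau * (T - tau + 1), T))
    for k in range(T - tau + 1):
        S[k * tau:(k + 1) * tau, k:k + tau] = np.eye(tau)
    return S

def mdt(series: np.ndarray, tau: int) -> np.ndarray:
    # series: I x T (I time series of length T).
    I, T = series.shape
    S = repeating_matrix(T, tau)
    augmented = series @ S.T                     # I x tau*(T-tau+1)
    # Fold into T-tau+1 slices of size I x tau: the target tensor.
    return augmented.reshape(I, T - tau + 1, tau).transpose(0, 2, 1)

X = mdt(np.random.rand(1000, 40), tau=5)
print(X.shape)  # (1000, 5, 36), matching the example in the text
```

In practice the multiplication by S_T need not be materialized; the same tensor can be filled by slicing windows series[:, k:k+tau] directly, but the matrix form mirrors the description above.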
In the embodiment of the present application, the block Hankel tensor BHT has good structural properties, such as low-rankness or smoothness, which make it easier to learn from and train on than the original data (the multiple time series).
It should be noted that, in the embodiment of the present application, the MDT transformation is performed only in the time dimension. The adjacency relations of the multiple time series in the other, non-time dimensions are not strongly correlated (in other words, the ordering among different time series can be permuted arbitrarily), so performing the MDT transformation on the non-time dimensions is of little significance and would instead increase the amount of calculation.
In this embodiment of the application, data augmentation processing may also be omitted for the plurality of time series in the time dimension. For example, for time series whose data amount is already sufficient, the time series may be split directly in the time dimension to obtain a plurality of matrix slices, and the target tensor is composed of the plurality of matrix slices.
It should be noted that the above describes only how to convert the plurality of time series into the target tensor along the time dimension by using the MDT transform; in practical applications, the plurality of time series may be converted into the target tensor by other methods, for example, the Fourier transform, the wavelet transform, or empirical mode decomposition (EMD), and the present embodiment is not limited thereto.
303. The target tensor is processed through a first prediction model to obtain a prediction result, where the prediction result includes a data prediction value, in the time dimension, of each time series in the plurality of time series.
In this embodiment of the application, after the plurality of time series are mapped to the corresponding target tensor, whose dimension is greater than the dimension of the time series, the target tensor may be processed by the first prediction model to obtain a prediction result that includes the data prediction value of each time series in the time dimension. Alternatively, tensor decomposition may first be performed on the target tensor to obtain a core tensor, and the core tensor is then processed by the first prediction model to obtain the same kind of prediction result.
In the embodiment of the application, after the target tensor is obtained, tensor decomposition can be performed on it to obtain the core tensor. Because the inherent temporal correlations can be captured more easily from the core tensor than from the original data (the target tensor), the amount of calculation and the storage requirements of the first prediction model in the training process can be reduced, and the core tensor can be used directly for training.
Next, how tensor decomposition is performed on the target tensor is explained:
In one embodiment, the target tensor can be subjected to a Tucker decomposition based on a set of mapping matrices $U^{(1)}, U^{(2)}, \dots, U^{(N)}$ (one per dimension), which the Tucker decomposition requires to be shared in common. In the prior art, an orthogonality constraint is imposed on the mapping matrix in every dimension direction, i.e.

$$U^{(n)\top} U^{(n)} = I, \quad n = 1, 2, \dots, N.$$

In the embodiment of the present application, the orthogonal constraint is not added in the time dimension (taken here as the $N$-th mode); that is, only

$$U^{(n)\top} U^{(n)} = I, \quad n = 1, 2, \dots, N-1$$

is required, and the time-mode mapping matrix $U^{(N)}$ is left unconstrained. The orthogonal constraints of the other dimensions are unchanged; in this way the prediction model is more robust to the input parameters and its performance is more stable.
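For concreteness, the resulting decomposition problem can be written as follows (a hedged formulation; the symbols $\mathcal{X}$, $\mathcal{G}$, $U^{(n)}$ and the choice of the time mode as the last mode are notational assumptions, not taken verbatim from the application):

$$\min_{\mathcal{G},\,U^{(1)},\dots,U^{(N)}} \left\|\mathcal{X} - \mathcal{G} \times_1 U^{(1)} \times_2 \cdots \times_N U^{(N)}\right\|_F^2 \quad \text{s.t.}\quad U^{(n)\top}U^{(n)} = I,\ n = 1,\dots,N-1,$$

where $\mathcal{X}$ is the target tensor, $\mathcal{G}$ is the core tensor, $\times_n$ denotes the mode-$n$ product, and the time-mode matrix $U^{(N)}$ carries no orthogonality constraint.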
In this embodiment, the first prediction model may be obtained by training based on the core tensor. In one embodiment, the first prediction model may be an autoregressive integrated moving average (ARIMA) model supporting tensor input. Specifically, after the block Hankel tensor BHT is obtained, a core tensor is obtained through the Tucker tensor decomposition; the core tensor is processed by the first prediction model ARIMA to obtain a first tensor (a new core tensor); the first tensor is then processed by the inverse transformation of the Tucker tensor decomposition to obtain a second tensor (a new target tensor); and the second tensor is mapped into a plurality of prediction result sequences (for example, the prediction value of each time series in the plurality of time series is obtained through the MDT inverse transformation).
For example, a target tensor of 1000 × 5 × 36 is obtained after the MDT transformation; a core tensor of 50 × 4 × 36 is obtained through the Tucker decomposition (assuming that the rank of the core tensor is set to [50, 4]); a new core tensor (the first tensor) of size 50 × 4 × 37 is obtained through ARIMA prediction; a second tensor of 1000 × 5 × 37 is obtained through the inverse transformation of the tensor decomposition; and a prediction result of 1000 × 41 is obtained through the MDT inverse transformation. The prediction result includes a plurality of prediction result sequences, each corresponding to one time series and including the data prediction values of the corresponding time series in the time dimension; that is, the data prediction values of the 41st time point of the 1000 time series can be output from the prediction result.
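The worked example above can be sketched end to end in numpy. This is a hedged illustration only: the Tucker step is realized as a plain HOSVD with factor matrices on the non-time modes (leaving the time mode untouched, mirroring the absence of a time-dimension constraint), and a per-entry least-squares AR(1) forecaster stands in for the tensor ARIMA model; every name and the random input are assumptions for illustration.

```python
import numpy as np

def mode_unfold(X, mode):
    # mode-n unfolding: move the chosen mode to the front, flatten the rest
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

I, tau, K = 1000, 5, 36                 # K = T - tau + 1 time slices
rank = (50, 4)                          # core ranks for the two non-time modes
X = np.random.randn(I, tau, K)          # target tensor from the MDT step

# Tucker/HOSVD factor matrices for the non-time modes only; the time mode
# is left untouched (no constraint or compression in the time dimension).
U = [np.linalg.svd(mode_unfold(X, m), full_matrices=False)[0][:, :rank[m]]
     for m in range(2)]

core = np.einsum('itk,ir,ts->rsk', X, U[0], U[1])   # core tensor: 50 x 4 x 36

# Stand-in for the ARIMA step: least-squares AR(1) per core entry -> 50 x 4 x 37.
flat = core.reshape(-1, K)
coef = (flat[:, 1:] * flat[:, :-1]).sum(1) / (flat[:, :-1] ** 2).sum(1)
next_slice = (coef * flat[:, -1]).reshape(rank)
core_ext = np.concatenate([core, next_slice[..., None]], axis=-1)

# Inverse Tucker: back to a 1000 x 5 x 37 second tensor.
X_ext = np.einsum('rsk,ir,ts->itk', core_ext, U[0], U[1])

# Inverse MDT: average the duplicated entries back onto 1000 series of
# length 41 (the pseudo-inverse of the repeating matrix averages overlaps).
T_new = tau + X_ext.shape[-1] - 1       # 41
recon = np.zeros((I, T_new))
counts = np.zeros(T_new)
for k in range(X_ext.shape[-1]):
    recon[:, k:k + tau] += X_ext[:, :, k]
    counts[k:k + tau] += 1
recon /= counts
print(recon.shape)                      # (1000, 41); recon[:, -1] holds the
                                        # predictions for the 41st time point
```

In an actual implementation, the HOSVD factors would be refined iteratively and the AR(1) step replaced by a tensor ARIMA fit; the sketch only traces the shapes and the order of the transforms.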
Exemplarily, referring to fig. 5, fig. 5 is a flowchart of an embodiment of the data prediction method provided in the embodiment of the present application. As shown in fig. 5, X_1, X_2, …, X_T are the slices formed by the values of the plurality of time series at each time point. After the MDT transformation, the tensor slices X̂_1, X̂_2, …, X̂_{T−τ+1} are obtained, and together they form the target tensor. After tensor decomposition, the core tensor is obtained, consisting of the matrix slices G_1, G_2, …, G_{T−τ+1}. The core tensor is processed by the first prediction model to obtain a first tensor; the first tensor is processed by the inverse transformation of the tensor decomposition to obtain a second tensor; and after the MDT inverse transformation is performed on the second tensor, the result is mapped into a plurality of prediction result sequences, where each prediction result sequence includes the data prediction values of the corresponding time series in the time dimension.
In the embodiment of the application, the plurality of time series are mapped into a high-order multidimensional target tensor, which enriches the input of the prediction model, enables the prediction model to learn the association relationship among the plurality of time series, and improves the prediction precision of the prediction model.
In the embodiment of the present application, the plurality of time series are processed in the time dimension in a data augmentation manner. This is directed at small-sample time series: because the information they carry is very limited, the prior art either cannot directly use such time series as input for training and running a prediction model, or cannot obtain a good prediction result.
In the embodiment of the application, the model is trained with the core tensor obtained after tensor decomposition, and the core tensor is likewise used as the input of the prediction model when data prediction is performed. Because the inherent temporal correlations can be captured more easily from the core tensor than from the original data (the target tensor), the amount of calculation and the storage requirements of the first prediction model in the training process can be reduced.
It should be noted that, in this embodiment, the method for mapping the plurality of time series to the corresponding target tensor is only an illustration and does not constitute a limitation of this application; in practical applications, the MDT transformation in the above embodiments may be replaced by other technologies, such as the wavelet transform or other transformation technologies.
In this embodiment, the method of tensor decomposition of the target tensor is only one example and does not limit the present application; in practical applications, the Tucker decomposition may be replaced by other tensor decomposition methods, such as CP decomposition (CANDECOMP/PARAFAC) or a singular value decomposition (SVD) tensor decomposition model, and the present application is not limited thereto.
In the present embodiment, the type of the first prediction model is not limited to the ARIMA model, and may be a prediction model such as Auto Regression (AR) or Support Vector Regression (SVR), and the present application is not limited thereto.
In the embodiment of the application, a plurality of time series are obtained, where each time series includes data arranged in a time dimension, the plurality of time series have the same dimension, and the changes of the data included in the plurality of time series in the time dimension have an association relationship; the plurality of time series are mapped into a target tensor, where the dimension of the target tensor is larger than that of the time series and the target tensor contains the data included in the plurality of time series; and the target tensor is processed through a first prediction model to obtain a prediction result, where the prediction result includes a data prediction value of each time series in the plurality of time series in the time dimension. By means of the method, the multiple time series are mapped into a high-order multidimensional target tensor, which enriches the input of the prediction model, enables the prediction model to learn the association relationship among the multiple time series, and improves the prediction accuracy of the prediction model.
Referring to fig. 6, fig. 6 is a system architecture diagram of a data processing system according to an embodiment of the present application, in fig. 6, the data processing system 200 includes an execution device 210, a training device 220, a database 230, a client device 240, and a data storage system 250, and the execution device 210 includes a calculation module 211.
The database 230 stores a plurality of time sequences, the training device 220 generates a target model/rule 201 for processing the plurality of time sequences, and performs iterative training on the target model/rule 201 by using the plurality of time sequences in the database to obtain a mature target model/rule 201. In the embodiment of the present application, the target model/rule 201 is taken as an example of the first prediction model.
The first prediction model obtained by the training device 220 may be applied to different systems or devices, such as a mobile phone, a tablet, a notebook, and so on. The execution device 210 may call data, codes, and the like in the data storage system 250, or store data, instructions, and the like in the data storage system 250. The data storage system 250 may be disposed in the execution device 210 or the data storage system 250 may be an external memory with respect to the execution device 210.
The calculation module 211 may map, through the first prediction model, a plurality of time series received by the client device 240 into corresponding target tensors, where a dimension of the target tensor is greater than a dimension of the time series, and process the target tensor through the first prediction model to obtain a prediction result, where the prediction result includes a data prediction value of each time series in the time dimension.
In some embodiments of the present application, referring to fig. 6, the execution device 210 and the client device 240 may be independent devices, the execution device 210 is configured with the I/O interface 212 to interact with the client device 240, the "user" may input a plurality of time sequences to the I/O interface 212 through the client device 240, and the execution device 210 returns predicted values of data to the client device 240 through the I/O interface 212 to provide the predicted values to the user.
It should be noted that fig. 6 is only an architecture diagram of a data processing system according to an embodiment of the present invention, and the position relationship between the devices, modules, and the like shown in the diagram does not constitute any limitation. For example, in other embodiments of the present application, the execution device 210 may be configured in the client device 240, for example, when the client device is a mobile phone or a tablet, the execution device 210 may be a module in a main processor (Host CPU) of the mobile phone or the tablet for data prediction, and the execution device 210 may also be a processor or a neural Network Processor (NPU) in the mobile phone or the tablet, where the NPU is mounted as a coprocessor to the main processor and the main processor allocates tasks.
With reference to the above description, a specific implementation flow of the training phase of the data prediction method provided in the embodiment of the present application is described below.
First, training phase
In this embodiment of the present application, a training phase describes a process of how the training device 220 obtains the first prediction model by using a plurality of time sequences maintained in the database 230 and target data corresponding to each time sequence, specifically, please refer to fig. 7, where fig. 7 is a flowchart of a data prediction method provided in this embodiment of the present application, and the data prediction method provided in this embodiment of the present application may include:
701. The training device acquires a plurality of time series and target data corresponding to each time series, where each time series includes a plurality of first data arranged in a time dimension, the target data is located after the plurality of first data included in each time series in the time dimension, the plurality of time series have the same dimension, and the changes of the first data included in the plurality of time series in the time dimension have an association relationship.
In some embodiments of the application, a plurality of time sequences and target data corresponding to each time sequence need to be stored in advance on the training device, before the second prediction model is trained, the plurality of time sequences and the target data corresponding to each time sequence are obtained, and the plurality of time sequences and the target data corresponding to each time sequence are used for the training device to train the second prediction model.
Specifically, each time series includes a plurality of first data arranged in a time dimension, and the target data is located after the plurality of first data included in each time series in the time dimension. For example, a time series includes a plurality of first data arranged in time order (A1, A2, A3, A4, A5), and the target data A6 is located after the plurality of first data in the time dimension; that is, A1, A2, A3, A4, A5, and A6 may also constitute a time series.
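A minimal sketch of this split, with the concrete values and variable names assumed for illustration:

```python
import numpy as np

series = np.array([3.0, 4.0, 5.0, 6.0, 7.0, 8.0])  # A1..A6 as one time series
first_data, target = series[:-1], series[-1]        # (A1..A5) and A6
print(first_data, target)                            # [3. 4. 5. 6. 7.] 8.0
```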
702. The training device maps the time series to a target tensor, the dimension of the target tensor is larger than that of the time series, and the target tensor contains data included by the time series.
In this embodiment of the application, after acquiring the plurality of time series and the target data corresponding to each time series, the training device may map the plurality of time series into a corresponding target tensor, where each time series includes a plurality of first data arranged in the time dimension, the target data is located after the plurality of first data in the time dimension, and the dimension of the target tensor is greater than the dimension of the time series.
Specifically, the plurality of time series may be subjected to data augmentation processing in the time dimension to increase the data amount of the time series in the time dimension, and the plurality of time series subjected to data augmentation processing may be mapped to corresponding target tensors.
In the embodiment of the present application, a plurality of time series may be converted into a target tensor by using a transformation technique (or referred to as a spatial reconstruction technique), wherein a dimension of the target tensor is larger than a dimension of the time series. Specifically, the plurality of time series may be converted into the target tensor by using a multi-way delay embedding transform (MDT), a fourier transform, a wavelet transform, an Empirical Mode Decomposition (EMD), or the like.
703. The training device processes the target tensor through a second prediction model to obtain a prediction result, where the second prediction model is a model that has not been subjected to iterative training, and the prediction result includes a data prediction value of each time series in the plurality of time series in the time dimension.
In some embodiments of the application, before the training device trains the second prediction model, a second prediction model needs to be initialized; that is, the second prediction model is a time series prediction model on which iterative training has not been performed. The plurality of time series may then be input into the second prediction model and processed by it, thereby obtaining a prediction result.
In this embodiment of the application, after the plurality of time series are mapped to the corresponding target tensor, whose dimension is greater than the dimension of the time series, the target tensor may be processed by the second prediction model to obtain a prediction result that includes the data prediction value of each time series in the time dimension. Alternatively, tensor decomposition may first be performed on the target tensor to obtain a core tensor, and the core tensor is then processed by the second prediction model to obtain the same kind of prediction result.
In the embodiment of the application, after the target tensor is obtained, tensor decomposition can be performed on it to obtain the core tensor. Because the inherent temporal correlations can be captured more easily from the core tensor than from the original data (the target tensor), the amount of calculation and the storage requirements in the training process can be reduced, and the core tensor can be used directly for training the second prediction model.
704. The training device performs iterative training on the second prediction model by using a first loss function according to the data prediction value of each time series in the time dimension and the corresponding target data, until the similarity between the data prediction value of each time series in the time dimension and the corresponding target data reaches a first preset degree.
In some embodiments of the application, after obtaining the data prediction value of each time sequence in the time dimension, the training device may compare the data prediction value of each time sequence in the time dimension with the target data corresponding to each time sequence, and perform iterative training on the second prediction model through a first loss function until the similarity between the data prediction value of each time sequence in the time dimension and the target data corresponding to each time sequence reaches a first preset degree.
It should be understood that the concrete form of the first loss function is not limited here, as long as the first loss function can represent the difference between the data prediction value and the target data in terms of data size.
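As one hedged illustration (the application does not mandate this choice), mean squared error satisfies the requirement, with a simple threshold standing in for the first preset degree:

```python
import numpy as np

def first_loss(pred, target):
    # difference between prediction and target in the dimension of data size
    return float(np.mean((pred - target) ** 2))

pred = np.array([7.9, 3.1, 5.0])       # data prediction values (assumed)
target = np.array([8.0, 3.0, 5.0])     # corresponding target data (assumed)
converged = first_loss(pred, target) < 1e-2   # "first preset degree" (assumed)
print(first_loss(pred, target), converged)    # 0.00666..., True
```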
705. The training device outputs a first prediction model, where the first prediction model is obtained after the second prediction model is subjected to iterative training.
In some embodiments of the present application, the training device may output the first prediction model after performing a plurality of iterative operations on the second prediction model, where the first prediction model is a general concept and refers to a prediction model obtained after performing iterative training on the second prediction model with lower prediction accuracy.
In some embodiments of the present application, the training device may send the first predictive model to the execution device after outputting the first predictive model.
Optionally, the plurality of time series includes a first time series and a second time series, the first time series includes first data for representing a change of a first characteristic of a first object in a time dimension, the second time series includes first data for representing a change of the first characteristic of a second object in the time dimension, and the first object and the second object have an association relationship in a first feature.
Optionally, the plurality of time series includes a third time series and a fourth time series, the third time series includes first data for representing a change of the second characteristic of the third object in the time dimension, the fourth time series includes first data for representing a change of the third characteristic of the third object in the time dimension, and the second characteristic of the third object and the third characteristic of the third object have an association relationship.
On the basis of the embodiments corresponding to fig. 1 to fig. 7, in order to better implement the above-mentioned scheme of the embodiments of the present application, the following also provides related equipment for implementing the above-mentioned scheme. Referring to fig. 8 in particular, fig. 8 is a schematic structural diagram of an execution device according to an embodiment of the present application, where the execution device 800 includes: an obtaining module 801, configured to obtain a plurality of time series, where each time series includes data arranged in a time dimension, the plurality of time series have the same dimension, and changes of the data included in the plurality of time series in the time dimension have an association relationship; a mapping module 802, configured to map the plurality of time series into a target tensor, a dimension of the target tensor being greater than a dimension of the time series, the target tensor including data included in the plurality of time series; the predicting module 803 is configured to process the target tensor through a first prediction model to obtain a prediction result, where the prediction result includes a data prediction value of each of the plurality of time series in a time dimension.
In this embodiment of the application, the obtaining module 801 obtains a plurality of time series, where each time series includes data arranged in a time dimension, the plurality of time series have the same dimension, and the changes of the data included in the plurality of time series in the time dimension have an association relationship; the mapping module 802 maps the plurality of time series into a target tensor, where the dimension of the target tensor is larger than the dimension of the time series and the target tensor contains the data included in the plurality of time series; and the predicting module 803 processes the target tensor through the first prediction model to obtain a prediction result, where the prediction result includes a data prediction value of each of the plurality of time series in the time dimension. In the embodiment of the application, the plurality of time series are mapped into a high-order multidimensional target tensor, which enriches the input of the prediction model, enables the prediction model to learn the association relationship among the plurality of time series, and improves the prediction precision of the prediction model.
In one possible design, the plurality of time series includes a first time series and a second time series, the first time series includes data representing a change in a first characteristic of a first object in a time dimension, the second time series includes data representing a change in the first characteristic of a second object in the time dimension, and the first object and the second object have an association relationship in a first feature.
In this embodiment, the time series includes a first time series corresponding to the first object and a second time series corresponding to the second object, the first time series represents a first characteristic of the first object in a time dimension, the first characteristic may be a demand or a sales volume, and the first time series may represent a demand or a sales volume change of the first object in the time dimension, for example, the first time series may be a sales volume of each month in the past year of the first object. The second time series may represent a demand or a change in sales of the second object in the time dimension, for example, the second time series may be sales of the second object in each month in the past year, and the first object and the second object have an association relationship on the first characteristic, that is, the first object and the second object have an association relationship between products as shown in fig. 2, for example, the first object and the second object have an association relationship on sales or an association relationship on demand, which is not limited herein. It should be noted that the association relationship is possessed by the first object and the second object themselves.
In one possible design, the plurality of time series includes a third time series and a fourth time series, the third time series includes data representing a change in a second characteristic of a third object in the time dimension, the fourth time series includes data representing a change in a third characteristic of the third object in the time dimension, and the second characteristic of the third object and the third characteristic of the third object are in an associated relationship.
In this embodiment, the plurality of time series may include a third time series and a fourth time series corresponding to a third object, where the third time series is a time series of a second characteristic of the third object; the second characteristic may be demand or sales, and the third time series may represent the change in demand or sales of the third object in the time dimension, for example, the sales of the third object in each month of the past year. The fourth time series may represent the change of a third characteristic of the third object in the time dimension, for example, the demand of the third object in each month of the past year, and the second characteristic and the third characteristic of the third object have an association relationship, for example, the sales and the demand of the third object have an association relationship.
In one possible design, the prediction module 803 is specifically configured to: perform tensor decomposition on the target tensor to obtain a core tensor; and process the core tensor through the first prediction model to obtain the prediction result. In the embodiment of the application, the model is trained with the core tensor obtained after tensor decomposition, and the core tensor is likewise used as the input of the prediction model when data prediction is performed; because the inherent temporal correlations can be captured more easily from the core tensor than from the original data (the target tensor), the amount of calculation and the storage requirements of the first prediction model in the training process can be reduced.
In one possible design, the prediction module 803 is specifically configured to: processing the core tensor through the first prediction model to obtain a first tensor; processing the first tensor through tensor decomposition inverse transformation to obtain a second tensor; mapping the second tensor into a plurality of predictor sequences, wherein each predictor sequence corresponds to a time sequence and the predictor sequences comprise data predictors of the corresponding time sequence in a time dimension.
In one possible design, the prediction module 803 is specifically configured to: performing a Tucker decomposition on the target tensor based on a mapping matrix, wherein the mapping matrix is free of orthogonal constraints in a time dimension when the Tucker decomposition is performed.
In the embodiment of the application, the orthogonal constraint is not added in the time dimension, and the orthogonal constraints of the other dimensions are unchanged, so that the prediction model is more robust to the input parameters and its performance is more stable. In this embodiment, the method of tensor decomposition of the target tensor is only one example and does not limit the present application; in practical applications, the Tucker decomposition may be replaced by other tensor decomposition methods, such as CP decomposition (CANDECOMP/PARAFAC) or truncated singular value decomposition (TSVD), and the present application is not limited thereto.
In one possible design, the mapping module 802 is specifically configured to: process the plurality of time series in a data augmentation manner in the time dimension; and map the plurality of time series after the data augmentation processing into the corresponding target tensor. In the embodiment of the present application, the plurality of time series are processed in the time dimension in a data augmentation manner. This is directed at small-sample time series: because the information they carry is very limited, the prior art either cannot directly use such time series as input for training and running a prediction model, or cannot obtain a good prediction result.
In one possible design, the mapping module 802 is specifically configured to: map the plurality of time series into the corresponding target tensor through the multi-way delay embedding transform (MDT) in the time dimension, where the target tensor is a block Hankel tensor BHT.
In one possible design, the first prediction model is an autoregressive integrated moving average (ARIMA) model that supports tensor input. In the present embodiment, the type of the first prediction model is not limited to the ARIMA model, and may also be a prediction model such as autoregression (AR) or nearest neighbors (NN), and the present application is not limited thereto.
It should be noted that, the contents of performing information interaction and performing processes between the modules/units in the device 800 are based on the same concept as the method embodiments corresponding to fig. 3 in the present application, and specific contents may refer to descriptions in the foregoing method embodiments in the present application, and are not described herein again.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a training apparatus provided in an embodiment of the present application, where the training apparatus 900 includes:
an obtaining module 901, configured to obtain a plurality of time series and target data corresponding to each time series, where each time series includes a plurality of first data arranged in a time dimension, the target data is located behind the plurality of first data included in each time series in the time dimension, the plurality of time series have the same dimension, and changes of the first data included in the plurality of time series in the time dimension have an association relationship;
a mapping module 902, configured to map the time series into a target tensor, a dimension of the target tensor being greater than a dimension of the time series, the target tensor including data included in the plurality of time series;
a predicting module 903, configured to process the target tensor through a second prediction model to obtain a prediction result, where the second prediction model is a model that has not been subjected to iterative training, and the prediction result includes a data prediction value of each of the multiple time sequences in a time dimension;
an iterative training module 904, configured to perform iterative training on the second prediction model by using a first loss function according to the data prediction value of each time sequence in the time dimension and the corresponding target data until a similarity between the data prediction value of each time sequence in the time dimension and the corresponding target data reaches a first preset degree;
an output module 905, configured to output a first prediction model, where the first prediction model is a model obtained after the second prediction model is subjected to iterative training.
In this embodiment of the present application, the obtaining module 901 is configured to obtain a plurality of time sequences and target data corresponding to each time sequence, where each time sequence includes a plurality of first data arranged in a time dimension, the target data is located behind the plurality of first data included in each time sequence in the time dimension, the time sequences have the same dimension, and changes of the first data included in the time sequences in the time dimension have an association relationship; a mapping module 902, configured to map the time series into a target tensor, a dimension of the target tensor being greater than a dimension of the time series, the target tensor including data included in the plurality of time series; a predicting module 903, configured to process the target tensor through a second prediction model to obtain a prediction result, where the second prediction model is a model that has not been subjected to iterative training, and the prediction result includes a data prediction value of each of the multiple time sequences in a time dimension; an iterative training module 904, configured to perform iterative training on the second prediction model by using a first loss function according to the data prediction value of each time sequence in the time dimension and the corresponding target data until a similarity between the data prediction value of each time sequence in the time dimension and the corresponding target data reaches a first preset degree; an output module 905, configured to output a first prediction model, where the first prediction model is a model obtained after the second prediction model is subjected to iterative training.
It should be noted that, the information interaction, the execution process, and other contents between the modules/units in the training device 900 are based on the same concept as those of the method embodiments corresponding to fig. 7 in the present application, and specific contents may refer to the description in the foregoing method embodiments in the present application, and are not described herein again.
Referring to fig. 10, fig. 10 is a schematic structural diagram of an execution device provided in the embodiment of the present application, and the execution device 1000 may be embodied as a mobile phone, a tablet, a notebook computer, a server, and the like, which is not limited herein. The execution device 1000 may be disposed with the execution device 800 described in the embodiment corresponding to fig. 8, and is configured to implement the function of the execution device 800 in the embodiment corresponding to fig. 8. Specifically, the execution apparatus 1000 includes: a receiver 1001, a transmitter 1002, a processor 1003 and a memory 1004 (wherein the number of processors 1003 in the execution device 1000 may be one or more, and one processor is taken as an example in fig. 10), wherein the processor 1003 may include an application processor 10031 and a communication processor 10032. In some embodiments of the present application, the receiver 1001, the transmitter 1002, the processor 1003, and the memory 1004 may be connected by a bus or other means.
The memory 1004 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1003. A portion of the memory 1004 may also include non-volatile random access memory (NVRAM). The memory 1004 stores processor-executable operating instructions, executable modules or data structures, or a subset or an expanded set thereof, where the operating instructions may include various operating instructions for performing various operations.
The processor 1003 controls the operation of the execution apparatus. In a particular application, the various components of the execution device are coupled together by a bus system that may include a power bus, a control bus, a status signal bus, etc., in addition to a data bus. For clarity of illustration, the various buses are referred to in the figures as a bus system.
The method disclosed in the embodiment of the present application may be applied to the processor 1003 or implemented by the processor 1003. The processor 1003 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be implemented by integrated logic circuits of hardware or by instructions in the form of software in the processor 1003. The processor 1003 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The processor 1003 may implement or execute the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory 1004, and the processor 1003 reads the information in the memory 1004 and completes the steps of the method in combination with its hardware.
The receiver 1001 may be used to receive input numeric or character information and generate signal inputs related to performing relevant settings and function control of the device. The transmitter 1002 may be configured to output numeric or character information via a first interface; the transmitter 1002 may also be configured to send instructions to the disk group via the first interface to modify data in the disk group; the transmitter 1002 may also include a display device such as a display screen.
In this embodiment, in one case, the processor 1003 is configured to execute the data prediction method executed by the execution device 800 in the corresponding embodiment of fig. 8. Specifically, the application processor 10031 is configured to obtain a plurality of time series, where each time series includes data arranged in a time dimension, the plurality of time series have the same dimension, and changes in the time dimension of the data included in the plurality of time series have an association relationship; mapping the plurality of time series into a target tensor, wherein the dimension of the target tensor is larger than that of the time series, and the target tensor contains data included by the plurality of time series; and processing the target tensor through a first prediction model to obtain a prediction result, wherein the prediction result comprises a data prediction value of each time sequence in the plurality of time sequences in a time dimension.
In an alternative implementation, the plurality of time series includes a first time series and a second time series, the first time series includes data representing a change of a first characteristic of a first object in a time dimension, the second time series includes data representing a change of the first characteristic of a second object in the time dimension, and the first object and the second object have an association relationship in a first feature.
In an alternative implementation, the plurality of time series includes a third time series and a fourth time series, the third time series includes data representing a change of a second characteristic of a third object in a time dimension, the fourth time series includes data representing a change of a third characteristic of the third object in the time dimension, and the second characteristic of the third object and the third characteristic of the third object are in an associated relationship.
In an alternative implementation, the application processor 10031 is configured to perform tensor decomposition on the target tensor to obtain a core tensor, and to process the core tensor through the first prediction model to obtain the prediction result. In the embodiment of the application, the model is trained with the core tensor obtained after tensor decomposition, and the core tensor is likewise used as the input of the prediction model when data prediction is performed; because the inherent temporal correlations can be captured more easily from the core tensor than from the original data (the target tensor), the amount of calculation and the storage requirements of the first prediction model in the training process can be reduced.
In an optional implementation, the application processor 10031 is configured to process the core tensor through the first prediction model to obtain a first tensor; processing the first tensor through tensor decomposition inverse transformation to obtain a second tensor; mapping the second tensor into a plurality of predictor sequences, wherein each predictor sequence corresponds to a time sequence and the predictor sequences comprise data predictors of the corresponding time sequence in a time dimension.
In an optional implementation, the application processor 10031 is configured to perform a Tucker decomposition on the target tensor based on a mapping matrix, where, in the Tucker decomposition, the mapping matrix has no orthogonal constraint in the time dimension.
In one optional implementation, the plurality of time series are data-augmented in the time dimension, and the plurality of time series after the data augmentation processing are mapped into the corresponding target tensor. In the embodiment of the present application, the plurality of time series are processed in the time dimension in a data augmentation manner. This is directed at small-sample time series: because the information they carry is very limited, the prior art either cannot directly use such time series as input for training and running a prediction model, or cannot obtain a good prediction result.
In an alternative implementation, the application processor 10031 is configured to map the plurality of time series to the target tensor through the multi-way delay embedding transform (MDT) in the time dimension.
In an alternative implementation, the first prediction model is an autoregressive integrated moving average (ARIMA) model supporting tensor input.
It should be noted that, the specific manner in which the application processor 10031 executes the above steps is based on the same concept as that of the method embodiments corresponding to fig. 3 in the present application, and the technical effect brought by the specific manner is the same as that of the method embodiments corresponding to fig. 3 in the present application, and specific contents may refer to the description in the foregoing method embodiments in the present application, and are not described again here.
Referring to fig. 11, fig. 11 is a schematic structural diagram of a training device provided in the embodiment of the present application, where the training device 900 described in the embodiment corresponding to fig. 9 may be disposed on the training device 1100, and is used to implement the functions of the training device 900 in the embodiment corresponding to fig. 9, specifically, the training device 1100 is implemented by one or more servers, and the training device 1100 may generate relatively large differences due to different configurations or performances, and may include one or more Central Processing Units (CPUs) 1122 (e.g., one or more processors) and a memory 1132, and one or more storage media 1130 (e.g., one or more mass storage devices) storing an application program 1142 or data 1144. Memory 1132 and storage media 1130 may be, among other things, transient storage or persistent storage. The program stored on the storage medium 1130 may include one or more modules (not shown), each of which may include a sequence of instructions for operating on the exercise device. Still further, central processor 1122 may be configured to communicate with storage medium 1130 to perform a series of instructional operations on training device 1100 in storage medium 1130.
Training apparatus 1100 may also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input-output interfaces 1158, and/or one or more operating systems 1141, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
In the embodiment of the present application, the central processing unit 1122 is configured to execute the data prediction method executed by the training apparatus in the embodiment corresponding to fig. 9. Specifically, the central processor 1122 is configured to obtain a plurality of time series and target data corresponding to each time series, where each time series includes a plurality of first data arranged in a time dimension, the target data is located behind the plurality of first data included in each time series in the time dimension, the plurality of time series have the same dimension, and changes of the first data included in the plurality of time series in the time dimension have an association relationship; mapping the plurality of time series into a target tensor, wherein the dimension of the target tensor is larger than that of the time series, and the target tensor contains data included by the plurality of time series; processing the target tensor through a second prediction model to obtain a prediction result, wherein the second prediction model is a model which is not subjected to iterative training, and the prediction result comprises a data prediction value of each time sequence in the plurality of time sequences in the time dimension; performing iterative training on the second prediction model by using a first loss function according to the data prediction value of each time sequence in the time dimension and the corresponding target data until the similarity between the data prediction value of each time sequence in the time dimension and the corresponding target data reaches a first preset degree; and outputting a first prediction model, wherein the first prediction model is obtained after the second prediction model is subjected to iterative training.
It should be noted that the specific manner in which the central processing unit 1122 executes the above steps is based on the same concept as the method embodiments corresponding to fig. 7 in the present application and brings the same technical effect; for details, refer to the description in the foregoing method embodiments of the present application, which is not repeated here.
Also provided in the embodiments of the present application is a computer program product, which when run on a computer, causes the computer to perform the steps performed by the device in the method described in the foregoing embodiment shown in fig. 8, or causes the computer to perform the steps performed by the training device in the method described in the foregoing embodiment shown in fig. 10.
Also provided in the embodiments of the present application is a computer-readable storage medium, which stores a program for signal processing, and when the program is run on a computer, the program causes the computer to execute the steps executed by the device in the method described in the foregoing embodiment shown in fig. 8, or causes the computer to execute the steps executed by the training device in the method described in the foregoing embodiment shown in fig. 10.
The execution device and the training device provided by the embodiment of the application can be specifically chips, and the chips comprise: a processing unit, which may be for example a processor, and a communication unit, which may be for example an input/output interface, a pin or a circuit, etc. The processing unit may execute the computer executable instructions stored by the storage unit to cause a chip in the execution device to perform the steps performed by the execution device in the method described in the embodiment shown in fig. 8 or cause the computer to perform the steps performed by the training device in the method described in the embodiment shown in fig. 10. Optionally, the storage unit is a storage unit in the chip, such as a register, a cache, and the like, and the storage unit may also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a Random Access Memory (RAM), and the like.
Specifically, referring to fig. 12, fig. 12 is a schematic structural diagram of a chip provided in the embodiment of the present application, where the chip may be represented as a neural network processor NPU 1200, and the NPU 1200 is mounted on a main CPU (Host CPU) as a coprocessor, and the Host CPU allocates tasks. The core portion of the NPU is an arithmetic circuit 1203, and the controller 1204 controls the arithmetic circuit 1203 to extract matrix data in the memory and perform multiplication.
In some implementations, the arithmetic circuitry 1203 internally includes multiple processing units (PEs). In some implementations, the operational circuitry 1203 is a two-dimensional systolic array. The arithmetic circuit 1203 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuitry 1203 is a general-purpose matrix processor.
For example, assume that there is an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit fetches the data corresponding to matrix B from the weight memory 1202 and buffers it on each PE in the arithmetic circuit. The arithmetic circuit takes the matrix A data from the input memory 1201, performs a matrix operation with matrix B, and stores the partial or final results of the resulting matrix in the accumulator 1208.
The unified memory 1206 is used for storing input data and output data. The weight data is transferred to the weight memory 1202 through a direct memory access controller (DMAC) 1205. The input data is also carried into the unified memory 1206 by the DMAC.
The bus interface unit (BIU) 1210 is used for the interaction of the AXI bus with the DMAC and the instruction fetch buffer (IFB) 1209; it is used for the instruction fetch memory 1209 to fetch instructions from an external memory, and is also used for the storage unit access controller 1205 to fetch the original data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data in the external memory DDR to the unified memory 1206 or to transfer weight data into the weight memory 1202 or to transfer input data into the input memory 1201.
The vector calculation unit 1207 includes a plurality of operation processing units and, if necessary, performs further processing on the output of the operation circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, and magnitude comparison. It is mainly used for non-convolution/fully-connected layer computation in the neural network, such as batch normalization, pixel-level summation, and up-sampling of a feature plane.
In some implementations, the vector calculation unit 1207 can store the processed output vector to the unified memory 1206. For example, the vector calculation unit 1207 may apply a linear function and/or a nonlinear function to the output of the operation circuit 1203, such as performing linear data augmentation on the feature plane extracted by the convolution layer, or accumulating a vector of values to generate an activation value. In some implementations, the vector calculation unit 1207 generates normalized values, pixel-level summed values, or both. In some implementations, the vector of processed outputs can be used as an activation input to the arithmetic circuit 1203, for example for use in subsequent layers of the neural network.
An instruction fetch buffer (IFB) 1209 connected to the controller 1204 is configured to store instructions used by the controller 1204.
The unified memory 1206, the input memory 1201, the weight memory 1202, and the instruction fetch memory 1209 are all on-chip memories. The external memory is private to the NPU hardware architecture.
The operations of the layers in the high-dimensional convolutional neural network shown in fig. 7 and 8 may be performed by the operation circuit 1203 or the vector calculation unit 1207.
Wherein any of the aforementioned processors may be a general purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits configured to control the execution of the programs of the method of the first aspect.
It should be noted that the above-described embodiments of the apparatus are merely schematic, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiments of the apparatus provided in the present application, the connection relationship between the modules indicates that there is a communication connection therebetween, and may be implemented as one or more communication buses or signal lines.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus necessary general-purpose hardware, and certainly can also be implemented by special-purpose hardware including special-purpose integrated circuits, special-purpose CPUs, special-purpose memories, special-purpose components and the like. Generally, functions performed by computer programs can be easily implemented by corresponding hardware, and specific hardware structures for implementing the same functions may be various, such as analog circuits, digital circuits, or dedicated circuits. However, for the present application, the implementation of a software program is more preferable. Based on such understanding, the technical solutions of the present application may be substantially embodied in the form of a software product, which is stored in a readable storage medium, such as a floppy disk, a usb disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, an exercise device, or a network device) to execute the method according to the embodiments of the present application.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When software is used, the implementation may take the form of a computer program product, in whole or in part.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced, in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center by wired means (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a training device or a data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive (SSD)), among others.

Claims (27)

1. A method of data prediction, the method comprising:
acquiring a plurality of time series, wherein each time series comprises data arranged in a time dimension, the plurality of time series have the same dimension, and the changes of their data in the time dimension are associated with one another;
mapping the plurality of time series into a target tensor, wherein the dimension of the target tensor is greater than the dimension of the time series, and the target tensor contains the data included in the plurality of time series;
and processing the target tensor through a first prediction model to obtain a prediction result, wherein the prediction result comprises a data prediction value of each of the plurality of time series in the time dimension.
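By way of non-limiting illustration only, a minimal NumPy sketch of this claim, assuming each time series is a one-dimensional array of T values: stacking the series produces a target tensor whose dimension exceeds that of any single series, and `first_prediction_model` is a hypothetical placeholder for the model, not the patented implementation:

```python
import numpy as np

def map_to_target_tensor(series_list):
    # N series of length T become one order-2 tensor of shape (N, T):
    # the tensor's dimension (2) is greater than each series' dimension (1).
    return np.stack(series_list, axis=0)

def predict(target_tensor, first_prediction_model):
    # The claim only requires one data prediction value per time series.
    return first_prediction_model(target_tensor)

series = [np.sin(np.linspace(0, 6, 50)) + i for i in range(3)]
tensor = map_to_target_tensor(series)
result = predict(tensor, lambda t: t[:, -1])  # naive "repeat last value" stand-in
```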
2. The method of claim 1, wherein the plurality of time series comprises a first time series and a second time series, the first time series comprises data representing the change of a first characteristic of a first object in the time dimension, the second time series comprises data representing the change of the first characteristic of a second object in the time dimension, and the first object and the second object are associated with each other in respect of the first characteristic.
3. The method according to claim 1 or 2, wherein the plurality of time series comprises a third time series and a fourth time series, the third time series comprises data representing the change of a second characteristic of a third object in the time dimension, the fourth time series comprises data representing the change of a third characteristic of the third object in the time dimension, and the second characteristic and the third characteristic of the third object are associated with each other.
4. The method according to any one of claims 1 to 3, wherein the processing the target tensor through the first prediction model to obtain the prediction result comprises:
carrying out tensor decomposition on the target tensor to obtain a core tensor;
and processing the core tensor through the first prediction model to obtain the prediction result.
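As one non-limiting illustration of how a core tensor can be obtained, the sketch below uses a truncated higher-order SVD (HOSVD), a standard form of Tucker decomposition; this is an assumption about a possible realization, not a statement of the patented method:

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding of a tensor into a matrix."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def mode_product(X, M, mode):
    """Multiply tensor X by matrix M along the given mode."""
    return np.moveaxis(np.tensordot(M, np.moveaxis(X, mode, 0), axes=1), 0, mode)

def hosvd(X, ranks):
    """Truncated HOSVD: factor matrices from the mode-wise SVDs, then the core."""
    factors = [np.linalg.svd(unfold(X, m), full_matrices=False)[0][:, :r]
               for m, r in enumerate(ranks)]
    core = X
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)  # project onto each factor basis
    return core, factors
```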
5. The method of claim 4, wherein the processing the core tensor through the first prediction model comprises:
processing the core tensor through the first prediction model to obtain a first tensor;
processing the first tensor through tensor decomposition inverse transformation to obtain a second tensor;
mapping the second tensor into a plurality of predictor sequences, wherein each predictor sequence corresponds to one time series and comprises the data prediction values of the corresponding time series in the time dimension.
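A hedged sketch of this claim, assuming the Tucker factors from the decomposition step of claim 4 are available; `mode_product` is redefined so the snippet stands alone, and the series mode is assumed to be axis 0:

```python
import numpy as np

def mode_product(X, M, mode):
    return np.moveaxis(np.tensordot(M, np.moveaxis(X, mode, 0), axes=1), 0, mode)

def inverse_tucker(first_tensor, factors):
    # Inverse of the Tucker projection: carry the model's output core tensor
    # back into the original series space, yielding the second tensor.
    second_tensor = first_tensor
    for mode, U in enumerate(factors):
        second_tensor = mode_product(second_tensor, U, mode)
    return second_tensor

def to_predictor_sequences(second_tensor, series_mode=0):
    # One predictor sequence per time series: slices along the series mode.
    return [seq for seq in np.moveaxis(second_tensor, series_mode, 0)]
```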
6. The method of claim 4 or 5, wherein the tensor decomposition of the target tensor comprises:
performing a Tucker decomposition on the target tensor based on a mapping matrix, wherein the mapping matrix is free of orthogonal constraints in a time dimension when the Tucker decomposition is performed.
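The HOSVD sketch shown after claim 4 produces orthogonal factors in every mode. One possible way (an assumption for illustration, not necessarily the patented construction) to drop the orthogonality constraint in the time dimension is to keep the non-time factors and refit the time-mode mapping matrix by unconstrained least squares:

```python
import numpy as np

def refit_time_factor(X, core, factors, time_mode):
    # Propagate the core through all non-time factor matrices first.
    B = core
    for mode, U in enumerate(factors):
        if mode != time_mode:
            B = np.moveaxis(np.tensordot(U, np.moveaxis(B, mode, 0), axes=1), 0, mode)
    # Solve X_(t) ~ U_t @ B_(t) by plain least squares: the resulting U_t is a
    # time-dimension mapping matrix with no orthogonality constraint.
    Xt = np.moveaxis(X, time_mode, 0).reshape(X.shape[time_mode], -1)
    Bt = np.moveaxis(B, time_mode, 0).reshape(B.shape[time_mode], -1)
    return np.linalg.lstsq(Bt.T, Xt.T, rcond=None)[0].T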
7. The method of any of claims 1 to 6, wherein the mapping the plurality of time series into the target tensor comprises:
performing data augmentation processing on the plurality of time series in the time dimension;
and mapping the plurality of time series after the data augmentation processing into the target tensor.
8. The method of any of claims 1 to 7, wherein the mapping the plurality of time series into the target tensor comprises:
mapping the plurality of time series into the target tensor in the time dimension by a multi-way delay embedding transform (MDT).
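The core of a multi-way delay embedding is Hankelization along the time dimension. A minimal sketch, assuming NumPy 1.20+ for sliding_window_view and omitting the duplication matrices and inverse transform that the full MDT defines:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def delay_embed(series_matrix, tau):
    # series_matrix: (N, T) stacked series. Each series is Hankelized along
    # time with delay window tau, raising the order of the data by one:
    # the result is an order-3 tensor of shape (N, tau, T - tau + 1).
    windows = sliding_window_view(series_matrix, tau, axis=1)  # (N, T-tau+1, tau)
    return windows.transpose(0, 2, 1)
```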
9. The method of any of claims 1 to 8, wherein the first prediction model is an autoregressive integrated moving average (ARIMA) model supporting tensor inputs.
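As a naive stand-in, not the patented model, the following sketch illustrates the "I" and "AR" parts of an ARIMA-style model operating on tensor input: difference the tensor along its time mode, fit an AR(1) coefficient per entry by least squares, and forecast one step; a faithful implementation would also estimate the MA terms:

```python
import numpy as np

def arima_110_forecast(T):
    # T: tensor-valued time series with time as the last axis, shape (..., L).
    d = np.diff(T, axis=-1)                                       # "I": differencing
    x, y = d[..., :-1], d[..., 1:]
    phi = (x * y).sum(-1) / np.maximum((x * x).sum(-1), 1e-12)    # "AR(1)" fit
    return T[..., -1] + phi * d[..., -1]                          # undo differencing
```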
10. A method of data prediction, the method comprising:
acquiring a plurality of time series and target data corresponding to each time series, wherein each time series comprises a plurality of first data arranged in a time dimension, the target data is located, in the time dimension, after the plurality of first data of the corresponding time series, the plurality of time series have the same dimension, and the changes of the first data in the time dimension are associated with one another;
mapping the plurality of time series into a target tensor, wherein the dimension of the target tensor is greater than the dimension of the time series, and the target tensor contains the data included in the plurality of time series;
processing the target tensor through a second prediction model to obtain a prediction result, wherein the second prediction model is a model that has not undergone iterative training, and the prediction result comprises a data prediction value of each of the plurality of time series in the time dimension;
performing iterative training on the second prediction model by using a first loss function according to the data prediction value of each time sequence in the time dimension and the corresponding target data until the similarity between the data prediction value of each time sequence in the time dimension and the corresponding target data reaches a first preset degree;
and outputting a first prediction model, wherein the first prediction model is obtained after the second prediction model is subjected to iterative training.
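By way of non-limiting illustration, a minimal sketch of this training loop, assuming a linear second prediction model over a flattened view of the target tensor, a squared-error first loss function, and a loss threshold standing in for the "first preset degree"; all names and hyperparameters are hypothetical:

```python
import numpy as np

def train_first_prediction_model(X, targets, lr=1e-2, tol=1e-4, max_iter=10_000):
    # X: (num_series, num_features) flattened view of the target tensor;
    # targets: (num_series,) target data located after each series in time.
    w = np.zeros(X.shape[1])                        # second prediction model: untrained
    for _ in range(max_iter):
        preds = X @ w                               # data prediction values
        residual = preds - targets
        if float((residual ** 2).mean()) < tol:     # first loss function vs. threshold
            break
        w -= lr * (2.0 / len(targets)) * (X.T @ residual)  # gradient step on the loss
    return w                                        # output: the first prediction model
```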
11. The method of claim 10, wherein the plurality of time series comprises a first time series and a second time series, the first time series comprises first data representing the change of a first characteristic of a first object in the time dimension, the second time series comprises first data representing the change of the first characteristic of a second object in the time dimension, and the first object and the second object are associated with each other in respect of the first characteristic.
12. The method according to claim 10 or 11, wherein the plurality of time series comprises a third time series and a fourth time series, the third time series comprises first data representing the change of a second characteristic of a third object in the time dimension, the fourth time series comprises first data representing the change of a third characteristic of the third object in the time dimension, and the second characteristic and the third characteristic of the third object are associated with each other.
13. An execution device, the device comprising:
an acquisition module, configured to acquire a plurality of time series, wherein each time series comprises data arranged in a time dimension, the plurality of time series have the same dimension, and the changes of their data in the time dimension are associated with one another;
a mapping module, configured to map the plurality of time series into a target tensor, wherein the dimension of the target tensor is greater than the dimension of the time series, and the target tensor contains the data included in the plurality of time series;
and a prediction module, configured to process the target tensor through a first prediction model to obtain a prediction result, wherein the prediction result comprises a data prediction value of each of the plurality of time series in the time dimension.
14. The apparatus of claim 13, wherein the plurality of time series comprises a first time series and a second time series, the first time series comprises data representing the change of a first characteristic of a first object in the time dimension, the second time series comprises data representing the change of the first characteristic of a second object in the time dimension, and the first object and the second object are associated with each other in respect of the first characteristic.
15. The apparatus according to claim 13 or 14, wherein the plurality of time series comprises a third time series and a fourth time series, the third time series comprises data representing the change of a second characteristic of a third object in the time dimension, the fourth time series comprises data representing the change of a third characteristic of the third object in the time dimension, and the second characteristic and the third characteristic of the third object are associated with each other.
16. The apparatus according to any one of claims 13 to 15, wherein the prediction module is specifically configured to:
carrying out tensor decomposition on the target tensor to obtain a core tensor;
and processing the core tensor through the first prediction model to obtain the prediction result.
17. The apparatus of claim 16, wherein the prediction module is specifically configured to:
processing the core tensor through the first prediction model to obtain a first tensor;
processing the first tensor through tensor decomposition inverse transformation to obtain a second tensor;
mapping the second tensor into a plurality of predictor sequences, wherein each predictor sequence corresponds to one time series and comprises the data prediction values of the corresponding time series in the time dimension.
18. The device according to claim 16 or 17, wherein the prediction module is specifically configured to:
performing a Tucker decomposition on the target tensor based on a mapping matrix, wherein the mapping matrix is free of orthogonal constraints in a time dimension when the Tucker decomposition is performed.
19. The device according to any one of claims 13 to 18, wherein the mapping module is specifically configured to:
performing data augmentation processing on the plurality of time series in the time dimension;
and mapping the plurality of time series after the data augmentation processing into the target tensor.
20. The device according to any one of claims 13 to 19, wherein the mapping module is specifically configured to:
mapping the plurality of time series into the target tensor in the time dimension by a multi-way delay embedding transform (MDT).
21. The apparatus according to any one of claims 13 to 20,
wherein the first prediction model is an autoregressive integrated moving average (ARIMA) model supporting tensor input.
22. A training apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a plurality of time series and target data corresponding to each time series, wherein each time series comprises a plurality of first data arranged in a time dimension, the target data is located, in the time dimension, after the plurality of first data of the corresponding time series, the plurality of time series have the same dimension, and the changes of the first data in the time dimension are associated with one another;
a mapping module, configured to map the plurality of time series into a target tensor, wherein the dimension of the target tensor is greater than the dimension of the time series, and the target tensor contains the data included in the plurality of time series;
a prediction module, configured to process the target tensor through a second prediction model to obtain a prediction result, wherein the second prediction model is a model that has not undergone iterative training, and the prediction result comprises a data prediction value of each of the plurality of time series in the time dimension;
an iterative training module, configured to iteratively train the second prediction model with a first loss function according to the data prediction value of each time series in the time dimension and the corresponding target data, until the similarity between the data prediction value of each time series in the time dimension and the corresponding target data reaches a first preset degree;
and an output module, configured to output a first prediction model, the first prediction model being obtained by the iterative training of the second prediction model.
23. The apparatus of claim 22, wherein the plurality of time series comprises a first time series and a second time series, the first time series comprises data representing the change of a first characteristic of a first object in the time dimension, the second time series comprises data representing the change of the first characteristic of a second object in the time dimension, and the first object and the second object are associated with each other in respect of the first characteristic.
24. The apparatus according to claim 22 or 23, wherein the plurality of time series comprises a third time series and a fourth time series, the third time series comprises data representing the change of a second characteristic of a third object in the time dimension, the fourth time series comprises data representing the change of a third characteristic of the third object in the time dimension, and the second characteristic and the third characteristic of the third object are associated with each other.
25. A device comprising a processor and a memory, the processor coupled with the memory, wherein the device is a terminal device or a training device;
the memory is configured to store a program;
the processor is configured to execute the program in the memory, so as to cause the device to perform the method according to any one of claims 10 to 12.
26. An execution device comprising a processor and a memory, the processor coupled with the memory,
the memory is configured to store a program;
the processor is configured to execute the program in the memory, so as to cause the execution device to perform the method according to any one of claims 1 to 9.
27. A computer-readable storage medium comprising a program which, when run on a computer, causes the computer to perform the method of any one of claims 1 to 9.
CN201911374073.6A 2019-12-26 2019-12-26 Data prediction method and related equipment Pending CN113052618A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911374073.6A CN113052618A (en) 2019-12-26 2019-12-26 Data prediction method and related equipment

Publications (1)

Publication Number Publication Date
CN113052618A (en) 2021-06-29

Family

ID=76506229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911374073.6A Pending CN113052618A (en) 2019-12-26 2019-12-26 Data prediction method and related equipment

Country Status (1)

Country Link
CN (1) CN113052618A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548016A (en) * 2016-10-24 2017-03-29 天津大学 Time series analysis method based on tensor relativity of time domain decomposition model
CN107146015A (en) * 2017-05-02 2017-09-08 联想(北京)有限公司 Multivariate Time Series Forecasting Methodology and system
CN107292806A (en) * 2017-06-28 2017-10-24 南京师范大学 A kind of remote sensing image digital watermark embedding and extracting method based on quaternion wavelet
CN110046787A (en) * 2019-01-15 2019-07-23 重庆邮电大学 A kind of urban area charging demand for electric vehicles spatio-temporal prediction method
CN109978228A (en) * 2019-01-31 2019-07-05 中南大学 A kind of PM2.5 concentration prediction method, apparatus and medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113392114A (en) * 2021-07-29 2021-09-14 浩鲸云计算科技股份有限公司 Intelligent relationship management and intelligent data fusion method based on business object
CN115114345A (en) * 2022-04-02 2022-09-27 腾讯科技(深圳)有限公司 Feature representation extraction method, device, equipment, storage medium and program product
CN115114345B (en) * 2022-04-02 2024-04-09 腾讯科技(深圳)有限公司 Feature representation extraction method, device, equipment, storage medium and program product
CN115481808A (en) * 2022-09-23 2022-12-16 江苏天成科技集团有限公司 Laying hen laying rate prediction method based on MDT-LSSVM model

Similar Documents

Publication Publication Date Title
CN111797893B (en) Neural network training method, image classification system and related equipment
WO2022068623A1 (en) Model training method and related device
CN111950596A (en) Training method for neural network and related equipment
CN116415654A (en) Data processing method and related equipment
CN113052618A (en) Data prediction method and related equipment
CN111797589A (en) Text processing network, neural network training method and related equipment
CN113159273B (en) Neural network training method and related equipment
CN113191241A (en) Model training method and related equipment
CN112529149B (en) Data processing method and related device
CN115081616A (en) Data denoising method and related equipment
CN115238909A (en) Data value evaluation method based on federal learning and related equipment thereof
WO2024041483A1 (en) Recommendation method and related device
CN113627163A (en) Attention model, feature extraction method and related device
CN115879508A (en) Data processing method and related device
WO2023050143A1 (en) Recommendation model training method and apparatus
CN113627421A (en) Image processing method, model training method and related equipment
CN113065634A (en) Image processing method, neural network training method and related equipment
WO2023246735A1 (en) Item recommendation method and related device therefor
WO2023185541A1 (en) Model training method and related device
CN116739154A (en) Fault prediction method and related equipment thereof
CN117056589A (en) Article recommendation method and related equipment thereof
CN116910357A (en) Data processing method and related device
CN115795025A (en) Abstract generation method and related equipment thereof
CN116308640A (en) Recommendation method and related device
CN115907041A (en) Model training method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination