CN114726745B - Network traffic prediction method, device and computer readable storage medium - Google Patents
- Publication number
- CN114726745B (CN202110005328.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H04L41/147—Network analysis or design for predicting network behaviour
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/24—Classification techniques
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
Abstract
The application discloses a network traffic prediction method, a device and a computer readable storage medium, relating to the technical field of artificial intelligence. The method comprises the following steps: acquiring historical flow data of each region in a target range, wherein the historical flow data comprises first flow data corresponding to a region to be predicted; determining a feature matrix corresponding to the region to be predicted based on the first flow data; and inputting the historical flow data and the feature matrix into a preset model to determine a flow prediction result of the region to be predicted. The preset model comprises a first preset model, a second preset model and a third preset model, wherein the first preset model is cascaded with both the second preset model and the third preset model, and the second preset model is cascaded with the third preset model. In this way, the flow of the region to be predicted is predicted by a plurality of cascaded models, so that the prediction result takes into account both the feature matrix of the region to be predicted and the flow of the other regions in the target range, which improves the accuracy of the prediction result.
Description
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a network traffic prediction method, a network traffic prediction device, and a computer readable storage medium.
Background
With the development of fourth generation (4G) and fifth generation (5G) mobile communication networks, a large number of network services have become part of daily life through various terminal devices, and users' demands on network speed and latency keep rising. To guarantee a good user experience, the network must be given sufficient resources. An operator usually sets its network resource allocation policy per provincial company. By predicting network traffic from the time series of network performance indicators of each city in the province, the network experts of each provincial company judge the network congestion of each city over a future period, and draw up a reasonable resource allocation scheme according to the resource load of each subordinate city, so that the network resource requirements of each city are met while resource waste is avoided as far as possible.
However, conventional network traffic prediction methods generally split the original time series into a plurality of sub-series, predict each sub-series, and add the sub-series predictions together to obtain the prediction for the original time series. Because such a method fits the prediction purely mathematically, it is only suitable for predicting a single time series that is not affected by other time series. When it is applied to network traffic prediction, the similarity and correlation of traffic changes among the cities of the same province are ignored, and the accuracy of the prediction result is therefore low.
Disclosure of Invention
The embodiment of the invention provides a network traffic prediction method, a network traffic prediction device and a computer readable storage medium, which are used for solving the problem that the accuracy of a prediction result of the existing network traffic prediction method is low.
In order to solve the technical problems, the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides a network traffic prediction method, where the method includes:
Acquiring historical flow data of each region in a target range, wherein the historical flow data comprises first flow data corresponding to a region to be predicted;
Determining a feature matrix corresponding to the region to be predicted based on the first flow data;
inputting the historical flow data and the feature matrix into a preset model, and determining a flow prediction result of the area to be predicted;
The preset models comprise a first preset model, a second preset model and a third preset model, wherein the first preset model is respectively cascaded with the second preset model and the third preset model, and the second preset model is cascaded with the third preset model.
Optionally, the inputting the historical flow data and the feature matrix into a preset model, and determining the flow prediction result of the to-be-predicted area includes:
Inputting the first flow data and the feature matrix into the first preset model to obtain a first prediction result, wherein the first prediction result comprises an initial flow prediction sequence;
Inputting the initial flow prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, wherein the second prediction result comprises a flow difference feature sequence;
Inputting the first prediction result and the flow difference characteristic sequence into the third preset model to obtain a flow prediction result of the region to be predicted, wherein the flow difference characteristic sequence is used for indicating the influence quantity of population flow of each region in the target range on the flow of the region to be predicted;
The first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used for correcting the initial flow prediction sequence according to the flow difference characteristic sequence.
Optionally, the first prediction result further includes:
A holiday sequence, a trend sequence and a season sequence of the region to be predicted, wherein the holiday sequence is used for indicating holiday characteristics of the region to be predicted within a preset prediction time period, the trend sequence is used for indicating trend characteristics of the region to be predicted within the prediction time period, and the season sequence is used for indicating seasonal characteristics of the region to be predicted within the prediction time period.
Optionally, the second preset model includes N sub-models, the second prediction result includes N second prediction sub-results, and N is a positive integer;
Inputting the initial flow prediction sequence and the feature matrix into a second preset model to obtain a second prediction result, wherein the method comprises the following steps of:
Inputting the initial flow prediction sequence and the feature matrix into a target sub-model to obtain a second prediction sub-result;
The target sub-model is any one of the N sub-models.
Optionally, before the inputting the initial flow prediction sequence and the feature matrix into a second preset model, the method includes:
Training and learning the historical flow data and the feature matrix to obtain the N sub-models;
the value of the N is consistent with the number of population flowing types included in each area in the target range, and the population flowing types comprise at least one of inflow type, outflow type and stable type.
Optionally, the training learning is performed on the historical traffic data and the feature matrix to obtain the N submodels, including:
Classifying historical flow data of each region according to population flow types of each region in the target range;
Respectively calculating N mean value sequences corresponding to the N population flow types, wherein each mean value sequence is used for indicating, at each acquisition time point, the mean flow of the plurality of regions in the target range that share the same population flow type;
Based on the first flow data and the N mean value sequences, determining N differential feature sequences corresponding to the N population flow types, wherein each differential feature sequence comprises the differences between the flow value at each acquisition time point in the first flow data and the mean flow at the corresponding time point of the mean value sequence;
And taking the first flow data and the feature matrix as the input of each sub-model in the N sub-models, taking the N differential feature sequences as the output of each sub-model in the N sub-models, and training and learning to obtain the N sub-models.
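The mean value sequences and differential feature sequences described in the steps above can be illustrated with a minimal Python sketch. The function names, the dictionary layout and the sample numbers are assumptions made for illustration only; the patent does not prescribe a data structure. Each differential feature sequence would then serve as the training target of one of the N sub-models.

```python
def mean_sequences(history, flow_type_of):
    """Mean flow per acquisition time point for each population flow type.

    history:      region name -> list of flow values (same length for all regions)
    flow_type_of: region name -> population flow type ("inflow", "outflow", "stable")
    """
    groups = {}
    for region, series in history.items():
        groups.setdefault(flow_type_of[region], []).append(series)
    # zip(*series_list) walks the regions of one type column by column (per time point)
    return {
        flow_type: [sum(column) / len(column) for column in zip(*series_list)]
        for flow_type, series_list in groups.items()
    }


def differential_sequences(first_flow, means):
    """Difference between the target region's flow and each type's mean, per time point."""
    return {
        flow_type: [value - mean for value, mean in zip(first_flow, mean_seq)]
        for flow_type, mean_seq in means.items()
    }


# Hypothetical data: three regions, two acquisition time points.
history = {"a": [10, 20], "b": [30, 40], "c": [5, 5]}
flow_type_of = {"a": "inflow", "b": "inflow", "c": "outflow"}
means = mean_sequences(history, flow_type_of)
diffs = differential_sequences([12, 18], means)
```

Here N equals 2 because only two population flow types occur among the sample regions.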
Optionally, the determining, based on the first flow data, a feature matrix corresponding to the region to be predicted includes:
Performing feature labeling on the first flow data to obtain a plurality of feature sequences;
Determining a feature matrix corresponding to the region to be predicted based on the plurality of feature sequences;
The first flow data is a four-dimensional time sequence comprising a plurality of acquisition time points, flow values, holiday information and activity information corresponding to the acquisition time points, and the plurality of feature sequences comprise a time feature sequence, a holiday feature sequence, an activity feature sequence and an area feature sequence.
In a second aspect, an embodiment of the present invention provides a network traffic prediction apparatus, where the apparatus includes:
the acquisition module is used for acquiring historical flow data of each region in the target range, wherein the historical flow data comprises first flow data corresponding to the region to be predicted;
the first determining module is used for determining a feature matrix corresponding to the region to be predicted based on the first flow data;
the second determining module is used for inputting the historical flow data and the feature matrix into a preset model and determining a flow prediction result of the area to be predicted;
The preset models comprise a first preset model, a second preset model and a third preset model, wherein the first preset model is respectively cascaded with the second preset model and the third preset model, and the second preset model is cascaded with the third preset model.
Optionally, the second determining module includes:
The first input sub-module is used for inputting the first flow data and the feature matrix into the first preset model to obtain a first prediction result, and the first prediction result comprises an initial flow prediction sequence;
The second input sub-module is used for inputting the initial flow prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, and the second prediction result comprises a flow difference feature sequence;
the third input sub-module is used for inputting the first prediction result and the flow difference characteristic sequence into the third preset model to obtain a flow prediction result of the region to be predicted, and the flow difference characteristic sequence is used for indicating the influence quantity of population flow of each region in the target range on the flow of the region to be predicted;
The first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used for correcting the initial flow prediction sequence according to the flow difference characteristic sequence.
Optionally, the first prediction result further includes:
A holiday sequence, a trend sequence and a season sequence of the region to be predicted, wherein the holiday sequence is used for indicating holiday characteristics of the region to be predicted within a preset prediction time period, the trend sequence is used for indicating trend characteristics of the region to be predicted within the prediction time period, and the season sequence is used for indicating seasonal characteristics of the region to be predicted within the prediction time period.
Optionally, the second preset model includes N sub-models, the second prediction result includes N second prediction sub-results, and N is a positive integer; the second input submodule includes:
The input unit is used for inputting the initial flow prediction sequence and the feature matrix into a target sub-model to obtain a second prediction sub-result;
Wherein the target sub-model is any one of the N sub-models.
Optionally, the second determining module further includes:
The training learning sub-module is used for training and learning the historical flow data and the feature matrix to obtain the N sub-models;
the value of the N is consistent with the number of population flowing types included in each area in the target range, and the population flowing types comprise at least one of inflow type, outflow type and stable type.
Optionally, the training learning sub-module is specifically configured to:
Classifying historical flow data of each region according to population flow types of each region in the target range;
Respectively calculating N mean value sequences corresponding to the N population flow types, wherein each mean value sequence is used for indicating, at each acquisition time point, the mean flow of the plurality of regions in the target range that share the same population flow type;
Based on the first flow data and the N mean value sequences, determining N differential feature sequences corresponding to the N population flow types, wherein each differential feature sequence comprises the differences between the flow value at each acquisition time point in the first flow data and the mean flow at the corresponding time point of the mean value sequence;
And taking the first flow data and the feature matrix as the input of each sub-model in the N sub-models, taking the N differential feature sequences as the output of each sub-model in the N sub-models, and training and learning to obtain the N sub-models.
Optionally, the first determining module includes:
The marking sub-module is used for carrying out characteristic marking on the first flow data to obtain a plurality of characteristic sequences;
The second determining submodule is used for determining a feature matrix corresponding to the region to be predicted based on the plurality of feature sequences;
The first flow data is a four-dimensional time sequence comprising a plurality of acquisition time points, flow values, holiday information and activity information corresponding to the acquisition time points, and the plurality of feature sequences comprise a time feature sequence, a holiday feature sequence, an activity feature sequence and an area feature sequence.
In a third aspect, an embodiment of the present invention further provides a network traffic prediction apparatus, including: a processor, a memory and a program stored on the memory and executable on the processor, which when executed by the processor implements the steps of the method as described in the first aspect above.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method according to the first aspect.
In the embodiment of the invention, historical flow data of each region in the target range is acquired, the historical flow data comprising first flow data corresponding to the region to be predicted; a feature matrix corresponding to the region to be predicted is determined based on the first flow data; and the historical flow data and the feature matrix are input into a preset model to determine a flow prediction result of the region to be predicted. The preset model comprises a first preset model, a second preset model and a third preset model, wherein the first preset model is cascaded with both the second preset model and the third preset model, and the second preset model is cascaded with the third preset model. In this way, the network flow of the region to be predicted is predicted by a plurality of cascaded models, so that the prediction result takes into account both the feature matrix of the region to be predicted and the flow influence of the other regions in the target range, which improves the accuracy of the prediction result.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.
FIG. 1 is a first flowchart of a network traffic prediction method according to an embodiment of the present invention;
FIG. 2 is a second flowchart of a network traffic prediction method according to an embodiment of the present invention;
FIG. 3 is a third flowchart of a network traffic prediction method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of training data provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a preset model according to an embodiment of the present invention;
FIG. 6 is a first schematic structural diagram of a network traffic prediction device according to an embodiment of the present invention;
FIG. 7 is a second schematic structural diagram of a network traffic prediction device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention provides a network traffic prediction method, a device and related equipment to solve the following problem: existing network traffic prediction methods fit the prediction result purely mathematically and are therefore only suitable for predicting a single time series that is not affected by other time series; because they ignore the similarity and correlation of traffic changes among the cities of the same province, the accuracy of their prediction results is low.
Referring to FIG. 1, FIG. 1 is a first flowchart of a network traffic prediction method according to an embodiment of the present invention. As shown in FIG. 1, the method comprises the following steps:
step 101, acquiring historical flow data of each region in a target range, wherein the historical flow data comprises first flow data corresponding to a region to be predicted.
The target range may include a plurality of areas, and the target range may be a network coverage corresponding to a certain country, or a network coverage of a certain province in a certain country, or a network coverage of a certain city in a certain province, or the like. When the target range is the network coverage corresponding to a certain country, the region represents the network coverage of a certain province in the country; when the target range is the network coverage of a certain province in a certain country, the region represents the network coverage of a certain city in the province; when the target range is the network coverage of a certain city in a certain province, the area represents the network coverage of a certain county in the city.
The historical flow data is the data collected for each region in the target range within a historical preset duration. The historical preset duration may be any duration such as half a year, one year, two years or three years, and the acquisition time granularity may be any granularity such as hours, days or months; neither is specifically limited in the present application. The historical flow data may include data of multiple dimensions, such as the acquisition time, the flow value corresponding to each acquisition time, the holiday information corresponding to each acquisition time, the activity information corresponding to each acquisition time, and so on. For example, assuming that the target range includes 10 regions, collecting four dimensions of data for each of the 10 regions within the historical preset duration (the acquisition time, the flow value corresponding to each acquisition time, the holiday information corresponding to each acquisition time, and the activity information corresponding to each acquisition time) yields 10 time series {W}_{w×4}, where {W} represents the four-dimensional time series corresponding to the w acquisition time points of a region, w represents the length of the time series, 4 represents the dimension of the time series, and the value of w is determined by the historical preset duration and the acquisition time granularity.
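As a rough illustration of the 10-region example above, the following Python sketch builds one four-dimensional time series per region and derives w from the historical duration and the acquisition granularity. The `Sample` class, the helper names, the region names and the placeholder flow values are all hypothetical; the patent does not specify a storage format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    """One acquisition time point with the four dimensions named in the text."""
    time: datetime   # acquisition time
    flow: float      # flow value at that time
    holiday: str     # holiday information ("" if none)
    activity: str    # activity information ("" if none)


def series_length(duration: timedelta, granularity: timedelta) -> int:
    """w is fixed by the historical preset duration and the time granularity."""
    return duration // granularity


def make_series(start, w, granularity, flows):
    """Build a w x 4 series; holiday/activity are left empty in this sketch."""
    return [Sample(start + i * granularity, flows[i], "", "") for i in range(w)]


# One year of daily samples for 10 hypothetical regions (constant placeholder flows).
w = series_length(timedelta(days=365), timedelta(days=1))
history = {
    f"region_{r}": make_series(datetime(2020, 1, 1), w, timedelta(days=1), [100.0 + r] * w)
    for r in range(10)
}
```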
Step 102, determining a feature matrix corresponding to the region to be predicted based on the first flow data.
The first flow data may include data of multiple dimensions collected by the area to be predicted within a historical preset duration, such as a collection time, a flow value corresponding to each collection time, holiday information corresponding to each collection time, activity information corresponding to each collection time, and the like. The feature matrix includes a plurality of feature sequences, each of which is used to indicate a feature of the first flow data, such as a temporal feature, a holiday feature, an activity feature, a regional feature, and so on.
In an embodiment, the first flow data includes the acquisition time, the flow value corresponding to each acquisition time, the holiday information corresponding to each acquisition time, and the activity information corresponding to each acquisition time, and is denoted by {W1}_{w×4}, where {W1} represents the four-dimensional data matrix corresponding to the w acquisition time points of the region to be predicted, and the value of w is determined by the historical preset duration and the acquisition time granularity. The first flow data can thus be feature-labeled through big data analysis. The features comprise a time feature, a holiday feature, an activity feature and a region feature, so that a time feature sequence, a holiday feature sequence, an activity feature sequence and a region feature sequence are obtained, and a feature matrix {F}_{w×k} is formed from these four sequences, where {F} represents the set of k-dimensional feature sequences corresponding to the w acquisition time points, w represents the length of the feature sequences, k represents the dimension of the feature sequences, the value of w is determined by the historical preset duration and the acquisition time granularity, and k is any positive integer.
Specifically, the time feature is used for indicating the time information of each acquisition time of the first flow data, such as the corresponding season (spring, summer, autumn or winter), the corresponding period of the day (morning, noon, afternoon or evening), whether the time falls in peak or off-peak hours, etc.; the holiday feature is used for indicating the holiday information of each acquisition time of the first flow data, such as whether the day is a holiday, the holiday name, the holiday duration and the like; the activity feature is used for indicating the activity information of each acquisition time of the first flow data, such as whether the day is an activity day, the activity name, the activity duration, the area covered by the activity and the like; the region feature is used for indicating information about the region to be predicted, such as its population flow type (population inflow type, population outflow type or population stable type), its city type (tourist city, agricultural city or industrial city), its gross domestic product (GDP) for the current month, its resident population, its urban population, its number of colleges and universities, its number of primary and middle schools, and the like.
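A minimal sketch of how such a w×k feature matrix might be assembled is given below. The choice of k = 6 columns, the numeric encodings, the month-to-season mapping and the function names are assumptions for illustration; the patent only lists the kinds of features and leaves k open.

```python
from datetime import datetime

FLOW_TYPES = ["inflow", "outflow", "stable"]  # population flow types named in the text


def season_of(t: datetime) -> int:
    """0=spring, 1=summer, 2=autumn, 3=winter (month mapping is an assumption)."""
    return {3: 0, 4: 0, 5: 0, 6: 1, 7: 1, 8: 1, 9: 2, 10: 2, 11: 2}.get(t.month, 3)


def feature_row(t, holiday, activity, flow_type, resident_population):
    """Encode one acquisition time point as a k-dimensional row (k = 6 here)."""
    return [
        float(season_of(t)),                  # time feature: season
        float(t.hour),                        # time feature: hour of day
        1.0 if holiday else 0.0,              # holiday feature: is it a holiday?
        1.0 if activity else 0.0,             # activity feature: is it an activity day?
        float(FLOW_TYPES.index(flow_type)),   # region feature: population flow type
        float(resident_population),           # region feature: resident population
    ]


def build_feature_matrix(samples, flow_type, resident_population):
    """samples: list of (time, flow, holiday, activity) tuples -> w x k matrix."""
    return [
        feature_row(t, holiday, activity, flow_type, resident_population)
        for t, _flow, holiday, activity in samples
    ]


# Hypothetical first flow data with two acquisition time points.
samples = [
    (datetime(2021, 1, 1, 8), 10.0, "New Year", ""),
    (datetime(2021, 7, 2, 20), 12.0, "", "concert"),
]
F = build_feature_matrix(samples, "inflow", 1_000_000)
```

The region features are constant across rows because they describe the region rather than an individual time point.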
And step 103, inputting the historical flow data and the feature matrix into a preset model, and determining a flow prediction result of the area to be predicted.
The preset model may include a plurality of trained models, and may predict the network flow of the region to be predicted within a future preset time period according to the input historical flow data and feature matrix, so as to obtain a flow prediction result. Specifically, the preset model may include a first preset model for predicting an initial flow prediction sequence, a holiday sequence, a trend sequence, a season sequence and the like of the region to be predicted, a second preset model for predicting the flow conversion relationship between the region to be predicted and the other regions within the target range, and a third preset model for correcting the initial flow prediction sequence. The first preset model may be a machine learning model such as a Prophet model, used for obtaining the periodic characteristics of the region to be predicted; the second preset model may be a machine learning model such as a Long Short-Term Memory (LSTM) model, used for obtaining the temporal closeness characteristics of the region to be predicted; and the third preset model may be an eXtreme Gradient Boosting (XGBoost) model. The first preset model is cascaded with both the second preset model and the third preset model, and the second preset model is cascaded with the third preset model, so that the prediction result of the first preset model can be used as input data of the second preset model and the third preset model, and the prediction result of the second preset model can be used as input data of the third preset model. The flow prediction result of the region to be predicted is finally determined by the third preset model.
In this embodiment, the network traffic of the area to be predicted may be predicted by using multiple models with cascade relationships, so that the prediction result considers the feature matrix of the area to be predicted and the traffic influence of other areas in the target range, thereby improving the accuracy of the prediction result.
Further, referring to fig. 2, fig. 2 is a second flowchart of a network traffic prediction method according to an embodiment of the present invention, based on the embodiment shown in fig. 1, the step 103 of inputting historical traffic data and a feature matrix into a preset model to determine a traffic prediction result of a region to be predicted includes:
Step 201, inputting the first flow data and the feature matrix into a first preset model to obtain a first prediction result, wherein the first prediction result comprises an initial flow prediction sequence.
In an embodiment, the preset model includes a first preset model, a second preset model, and a third preset model, where the first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used to correct the initial flow prediction sequence in the first prediction result according to the second prediction result. In the preset models, the first preset model is respectively cascaded with a second preset model and a third preset model, and the second preset model is cascaded with the third preset model.
Specifically, the first preset model is obtained by training and learning in advance on first historical flow data of the area to be predicted, where the first historical flow data may be the same as or different from the first flow data. To make the trained first preset model predict more accurately, historical flow data covering a longer period, for example the past 5 years, may be selected as training data. Based on the 5 years of historical flow data, an initial flow prediction sequence, a holiday sequence, a trend sequence, a season sequence and the like of the area to be predicted within a preset prediction duration (such as the next 1 year) are predicted. The first preset model includes, but is not limited to, a Prophet model; the specific training and learning process of the first preset model is prior art and is not described herein.
When the prediction is performed through the first preset model, the first flow data and the feature matrix may be input to the first preset model, so that a first prediction result is output, where the first prediction result includes, but is not limited to, an initial flow prediction sequence, a holiday sequence, a trend sequence, and a season sequence.
Step 202, inputting the initial flow prediction sequence and the feature matrix into a second preset model to obtain a second prediction result, wherein the second prediction result comprises a flow difference feature sequence, and the flow difference feature sequence is used for indicating the influence quantity of population flow of each region in a target range on the flow of the region to be predicted.
Specifically, the second preset model includes one or more neural network models corresponding to population flow types, and the neural network models include, but are not limited to, Long Short-Term Memory (LSTM) models.
When the prediction is performed through the second preset model, the second prediction result can be obtained by inputting the initial flow prediction sequence and the feature matrix into the corresponding neural network model. The second prediction result comprises one or more flow difference feature sequences corresponding to population flow types, wherein the flow difference feature sequences are used for indicating the influence quantity of population flow of each region in the target range on the flow of the region to be predicted.
Step 203, inputting the first prediction result and the flow difference feature sequence into a third preset model to obtain a flow prediction result of the region to be predicted.
Specifically, the third preset model includes, but is not limited to, an extreme gradient boosting (eXtreme Gradient Boosting, XGBoost for short) model, which is not specifically limited by the present application. When the third preset model is trained, the output results of the first preset model and the second preset model are used as input data of the third preset model for training and learning, so that the third preset model is obtained; the specific training and learning process is prior art and is not repeated here.
When the prediction is performed through the third preset model, the first prediction result and the flow difference characteristic sequence can be input into the third preset model, and the flow prediction result of the area to be predicted is obtained through the third preset model. The flow prediction result of the area to be predicted is a flow value corresponding to each prediction time point of the area to be predicted in a preset prediction time length.
In this embodiment, an initial flow prediction sequence of the area to be predicted may be obtained through the first preset model, then a flow difference feature sequence corresponding to each population flow type is predicted from the initial flow prediction sequence through the second preset model, and finally the initial flow prediction sequence is corrected with the flow difference feature sequences through the third preset model, so that a flow prediction result of the area to be predicted is obtained. The prediction result thus considers a plurality of features of the area to be predicted and the influence of other areas in the target range, so that the accuracy of the prediction result is improved.
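The data flow of steps 201 to 203 can be sketched as follows. This is a minimal illustration only: `predict_cascade` and the three stub models are hypothetical placeholders, not the trained Prophet, LSTM or XGBoost models of the embodiment, and the stub correction rule (adding the mean of the difference sequences) is an assumption made purely so the example produces a value.

```python
from typing import Callable, Dict, List, Sequence

Series = List[float]


def predict_cascade(
    first_flow: Series,
    features: Sequence[Sequence[float]],
    first_model: Callable[..., Dict[str, Series]],
    sub_models: Dict[str, Callable[..., Series]],
    third_model: Callable[..., Series],
) -> Series:
    """Run the three cascaded models in the order of steps 201 to 203."""
    # Step 201: the first model yields the initial forecast (and optional components).
    first_result = first_model(first_flow, features)
    initial = first_result["initial"]
    # Step 202: each sub-model of the second model yields one flow difference sequence.
    diffs = {name: model(initial, features) for name, model in sub_models.items()}
    # Step 203: the third model corrects the initial forecast using those sequences.
    return third_model(first_result, diffs)


# Hypothetical stand-ins for the trained models (assumptions, for illustration only).
def stub_first(flow: Series, features, horizon: int = 3) -> Dict[str, Series]:
    return {"initial": [flow[-1]] * horizon, "trend": [0.0] * horizon}


def stub_sub(offset: float) -> Callable[..., Series]:
    def model(initial: Series, features) -> Series:
        return [offset] * len(initial)
    return model


def stub_third(first_result: Dict[str, Series], diffs: Dict[str, Series]) -> Series:
    n = len(diffs)
    return [
        y + sum(d[i] for d in diffs.values()) / n
        for i, y in enumerate(first_result["initial"])
    ]


result = predict_cascade(
    [10.0, 12.0],
    [[0.0], [1.0]],
    stub_first,
    {"inflow": stub_sub(3.0), "outflow": stub_sub(-1.0), "stable": stub_sub(1.0)},
    stub_third,
)
```

Passing the models in as callables keeps the cascade order explicit while leaving the choice of each concrete model open, which matches the "includes, but is not limited to" wording of the embodiment.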
Further, the first prediction result further includes: at least one of a holiday sequence, a trend sequence and a season sequence of the area to be predicted, wherein the holiday sequence is used for indicating holiday characteristics of the area to be predicted within a preset prediction duration, the trend sequence is used for indicating trend characteristics of the area to be predicted within the prediction duration, and the season sequence is used for indicating season characteristics of the area to be predicted within the preset prediction duration.
In an embodiment, the initial traffic prediction sequence {Y'}_L of the region to be predicted and at least one of the holiday sequence {H}_L, the trend sequence {T}_L and the season sequence {S}_L may be obtained through a pre-trained Prophet model, where the holiday sequence {H}_L is used to indicate holiday characteristics of the region to be predicted within a preset prediction duration, the trend sequence {T}_L is used to indicate trend characteristics of the region to be predicted within the preset prediction duration, and the season sequence {S}_L is used to indicate seasonal characteristics of the region to be predicted within the preset prediction duration. {Y'}_L, {H}_L, {T}_L and {S}_L are each a one-dimensional time series of L time points, where the L time points are the time points within the preset prediction duration. For example, assuming that the traffic of the area to be predicted needs to be predicted for one week in the future at an hourly time granularity, L is 7×24=168. In this way, when the initial flow prediction sequence is corrected in the third preset model, the influence of time, holidays and activities on the flow value of the area to be predicted can be considered, so that the flow prediction result of the area to be predicted is more accurate.
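As a small worked example of how L follows from the prediction duration and the time granularity, the sketch below enumerates the prediction time points for one week at hourly granularity. The function name `prediction_time_points` and the start date are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import List


def prediction_time_points(start: datetime, days: int, step_hours: int = 1) -> List[datetime]:
    """Return the L time points covering `days` days at `step_hours` granularity."""
    count = days * 24 // step_hours  # L = days x 24 / granularity in hours
    return [start + timedelta(hours=i * step_hours) for i in range(count)]


# One week ahead at hourly granularity gives L = 7 x 24 = 168 time points.
points = prediction_time_points(datetime(2021, 1, 4), days=7)
```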
Further, the second preset model comprises N sub-models, the second prediction result comprises N second prediction sub-results, and N is a positive integer;
Step 202, inputting the initial flow prediction sequence and the feature matrix into a second preset model to obtain a second prediction result, including:
Inputting the initial flow prediction sequence and the feature matrix into a target sub-model to obtain a second predictor result;
The target sub-model is any one of the N sub-models.
In an embodiment, the second preset model may include 1, 2, 3 or more sub-models, each corresponding to one second predictor result. Thus, when the initial flow prediction sequence and the feature matrix are input into the target sub-model, one second predictor result is obtained through the target sub-model, and N second predictor results are obtained from the N sub-models. The target sub-model is any one of the N sub-models. For example, assuming that the second preset model contains 3 sub-models, which are respectively used for predicting the flow difference feature sequence corresponding to the population inflow type, the flow difference feature sequence corresponding to the population outflow type and the flow difference feature sequence corresponding to the population stable type, the flow difference feature sequences corresponding to the population inflow type, the population outflow type and the population stable type can be obtained simply by inputting the initial flow prediction sequence and the feature matrix of the region to be predicted into the 3 sub-models.
In this embodiment, different sub-models are set to respectively predict the flow difference feature sequences of different population flow types, so as to obtain the change condition of the network flow of the region to be predicted under the influence of the regions of different population flow types, and enable the predicted flow value of the region to be predicted to be more accurate.
Further, before inputting the initial flow prediction sequence and the feature matrix into a second preset model to obtain a second prediction result, the method comprises the following steps:
Training and learning the historical flow data and the feature matrix to obtain N sub-models;
Wherein the value of N is consistent with the number of population flow types included in each region in the target range, and the population flow types comprise at least one of inflow type, outflow type and stable type.
In an embodiment, the population flow type may include at least one of inflow type, outflow type and stable type, and the present application is not particularly limited in this respect. For example, when the population flow types of the regions within the target range include the 3 types of inflow, outflow and stable, the second preset model needs to include 3 sub-models. Before the 3 sub-models are used for prediction, training and learning need to be carried out on the historical flow data and the feature matrix to obtain the 3 sub-models. Specifically, the historical flow data of each region in the target range are classified according to the 3 population flow types; for each type, the flow average value of the regions having that population flow type is calculated at each acquisition time point, giving 3 flow average value sequences; the difference between the first flow data of the region to be predicted and each of the 3 flow average value sequences is taken, giving 3 differential feature sequences; and the historical flow data and the feature matrix of the region to be predicted are used as inputs of the 3 sub-models, with the 3 differential feature sequences used as the respective outputs of the 3 sub-models, so that 3 LSTM models are obtained through training.
Of course, when each region within the target range includes two population flow types or one population flow type, two sub-models or one sub-model in the second preset model may be trained in the manner described above.
In this embodiment, a plurality of sub-models may be set according to the population flow types, and training and learning may be performed on the plurality of sub-models in the second preset model, so as to predict the amount of influence that the regions corresponding to the different population flow types have on the flow of the region to be predicted.
Further, referring to fig. 3, fig. 3 is a third flowchart of a network traffic prediction method according to an embodiment of the present invention. Training and learning the historical flow data and the feature matrix to obtain N sub-models, wherein the training and learning comprises the following steps:
Step 301, classifying historical flow data of each region according to population flow types of each region in the target range.
In one embodiment, the population flow types of the regions within the target range may include the 3 types of inflow, outflow and stable. The historical flow data of the regions can be classified and combined according to the 3 population flow types, so that the historical time series corresponding to the 3 population flow types are obtained. Assuming that the historical flow data of each region is a time series of four-dimensional data corresponding to w acquisition time points, an inflow time series {W2}_{w×4}, an outflow time series {W3}_{w×4} and a stable time series {W4}_{w×4} can be obtained.
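The classification in step 301 amounts to grouping the regional series by a type label. A minimal sketch, assuming each region's history is reduced to a list of flow values and that a mapping from region to population flow type is already available (the function and variable names are hypothetical):

```python
from collections import defaultdict
from typing import Dict, List


def group_by_flow_type(
    region_flows: Dict[str, List[float]],
    region_types: Dict[str, str],
) -> Dict[str, Dict[str, List[float]]]:
    """Group each region's historical flow series under its population flow type."""
    groups: Dict[str, Dict[str, List[float]]] = defaultdict(dict)
    for region, series in region_flows.items():
        groups[region_types[region]][region] = series
    return dict(groups)


# Illustrative data: three regions, two of them of the inflow type.
flows = {"A": [1.0, 2.0], "B": [3.0, 4.0], "C": [5.0, 6.0]}
types = {"A": "inflow", "B": "inflow", "C": "stable"}
groups = group_by_flow_type(flows, types)
```

Only the types that actually occur become keys, so the number of groups equals N in the sense used above.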
Step 302, calculating N average value sequences corresponding to the N population flow types respectively, where the average value sequences are used to indicate flow average values corresponding to multiple regions with the same population flow type in the target range at each acquisition time point.
The average value of the flow values at each acquisition time point in the inflow time sequence {W2}_{w×4}, the outflow time sequence {W3}_{w×4} and the stable time sequence {W4}_{w×4} is calculated respectively to obtain the flow average value corresponding to each acquisition time point, thereby obtaining the average value sequence {C1}_{w×2} of the inflow time sequence {W2}_{w×4}, the average value sequence {C2}_{w×2} of the outflow time sequence {W3}_{w×4} and the average value sequence {C3}_{w×2} of the stable time sequence {W4}_{w×4}, wherein each average value sequence is a two-dimensional time sequence comprising the w acquisition times and the flow average values corresponding to the w acquisition times.
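The per-time-point averaging of step 302 can be illustrated with a short sketch. It assumes the series of all regions of one population flow type are aligned on the same acquisition time points and holds only the flow values; `mean_sequence` is a hypothetical helper name.

```python
from typing import List


def mean_sequence(series_list: List[List[float]]) -> List[float]:
    """Average the flow values of several regions at each acquisition time point."""
    length = len(series_list[0])
    if any(len(s) != length for s in series_list):
        raise ValueError("all series must share the same acquisition time points")
    return [sum(s[t] for s in series_list) / len(series_list) for t in range(length)]


# Two inflow-type regions observed at two acquisition time points.
c1 = mean_sequence([[10.0, 20.0], [30.0, 40.0]])
```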
Step 303, determining N differential feature sequences corresponding to the N population flow types based on the first flow data and the N mean value sequences, where the differential feature sequences include differences between flow values corresponding to each acquisition time point in the first flow data and flow mean values of corresponding time points of the mean value sequences.
The difference between the flow value corresponding to each acquisition time point in the first flow data {W1}_{w×4} and the flow average value corresponding to the same acquisition time point in each of the average value sequences {C1}_{w×2}, {C2}_{w×2} and {C3}_{w×2} is calculated, so as to obtain a differential feature sequence {D1}_{w×2} corresponding to the inflow type, a differential feature sequence {D2}_{w×2} corresponding to the outflow type, and a differential feature sequence {D3}_{w×2} corresponding to the stable type.
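Step 303 is an element-wise difference. The sketch below assumes the sign convention "first flow data minus type mean"; the text only states that a difference is taken, so the direction is an assumption, as is the helper name `differential_sequence`.

```python
from typing import List


def differential_sequence(first_flow: List[float], mean_seq: List[float]) -> List[float]:
    """Difference between the target region's flow and a type mean at each time point.

    Sign convention (an assumption): first flow data minus the mean value.
    """
    if len(first_flow) != len(mean_seq):
        raise ValueError("sequences must cover the same acquisition time points")
    return [y - c for y, c in zip(first_flow, mean_seq)]


# Target region flow against the inflow-type mean at two time points.
d1 = differential_sequence([25.0, 25.0], [20.0, 30.0])
```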
Step 304, taking the first flow data and the feature matrix as the input of each sub-model in the N sub-models, taking the N differential feature sequences respectively as the outputs of the N sub-models, and training and learning to obtain the N sub-models.
Since the population flow types of the regions within the target range include the 3 types of inflow, outflow and stable, neural network models corresponding to the 3 population flow types need to be trained. When the 3 neural network models are trained, the first flow data {W1}_{w×4} and the feature matrix {F}_{w×k} are used as inputs of the 3 neural network models, the differential feature sequence {D1}_{w×2} corresponding to the inflow type, the differential feature sequence {D2}_{w×2} corresponding to the outflow type and the differential feature sequence {D3}_{w×2} corresponding to the stable type are respectively used as outputs of the 3 neural network models, and the parameters of the 3 neural network models are trained and learned, so that the 3 neural network models are obtained.
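The pairing of shared inputs with per-type targets in step 304 can be sketched as below. This only assembles the training sets and does not train an LSTM; reducing the first flow data to its flow value column, and the name `build_training_sets`, are simplifying assumptions.

```python
from typing import Dict, List, Sequence, Tuple

TrainingSet = Tuple[List[List[float]], List[float]]


def build_training_sets(
    first_flow: Sequence[float],
    feature_matrix: Sequence[Sequence[float]],
    diff_sequences: Dict[str, Sequence[float]],
) -> Dict[str, TrainingSet]:
    """Pair one shared input table with one target sequence per population flow type."""
    if len(first_flow) != len(feature_matrix):
        raise ValueError("flow data and feature matrix must have the same number of rows")
    # Each input row: the flow value followed by the k features of that time point.
    inputs = [[y] + list(row) for y, row in zip(first_flow, feature_matrix)]
    sets: Dict[str, TrainingSet] = {}
    for name, target in diff_sequences.items():
        if len(target) != len(inputs):
            raise ValueError(f"target for {name!r} has the wrong length")
        sets[name] = (inputs, list(target))
    return sets


sets = build_training_sets(
    [10.0, 12.0],
    [[0.0, 1.0], [1.0, 0.0]],
    {"inflow": [1.0, 2.0], "outflow": [-1.0, 0.0]},
)
```

Every sub-model sees the same inputs and differs only in its target, which is what lets the number of sub-models follow the number of population flow types.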
Therefore, after the 3 neural network models are obtained through training, the initial flow prediction sequence {Y'}_L and the feature matrix {F}_{w×k} can be input into the 3 neural network models to predict 3 flow difference feature sequences, namely a flow difference feature sequence {D1'}_L corresponding to the inflow type, a flow difference feature sequence {D2'}_L corresponding to the outflow type and a flow difference feature sequence {D3'}_L corresponding to the stable type.
In this embodiment, the flow difference feature sequence corresponding to each population flow type may be determined according to the population flow type of each region in the target range, so that the initial flow prediction sequence of the region to be predicted can be conveniently corrected by the flow difference feature sequences corresponding to the different population flow types, and the flow prediction result of the region to be predicted is more accurate.
Further, the step 102 of determining the feature matrix corresponding to the region to be predicted based on the first flow data includes:
Performing feature labeling on the first flow data to obtain a plurality of feature sequences;
determining a feature matrix corresponding to the region to be predicted based on the plurality of feature sequences;
the first flow data is a four-dimensional time sequence comprising a plurality of acquisition time points, flow values corresponding to the acquisition time points, holiday information and activity information, and the plurality of feature sequences comprise a time feature sequence, a holiday feature sequence, an activity feature sequence and an area feature sequence.
In an embodiment, the first flow data includes the acquisition times, a flow value corresponding to each acquisition time, holiday information corresponding to each acquisition time, and activity information corresponding to each acquisition time, denoted by {W1}_{w×4}, where W1 represents a four-dimensional data matrix corresponding to the w acquisition time points of the area to be predicted, and w is determined by the historical preset duration and the acquisition time granularity. Thus, feature labeling can be performed on the first flow data through big data analysis, the features comprising time features, holiday features, activity features and area features, so that a time feature sequence, a holiday feature sequence, an activity feature sequence and an area feature sequence are obtained, and a feature matrix {F}_{w×k} is formed from these four feature sequences, where F represents a k-dimensional data matrix corresponding to the w acquisition time points and k is any positive integer. Specifically, the time feature is used for indicating time information of each acquisition time of the first flow data, such as the corresponding season (i.e. spring, summer, autumn or winter), the corresponding period of the day (i.e. morning, noon, afternoon or evening), whether the time falls in peak or off-peak hours, etc.; the holiday feature is used for indicating holiday information of each acquisition time of the first flow data, such as whether the day is a holiday, the holiday name, the holiday duration and the like; the activity feature is used for indicating activity information of each acquisition time of the first flow data, such as whether the day is an activity day, the activity name, the activity duration, the activity area range and the like; the regional feature is used to indicate information of the region to be predicted, such as the population flow type (i.e., population inflow type, population outflow type or population stable type), the city type (i.e., tourist city, agricultural city or industrial city) to which the region belongs, the gross domestic product (GDP) in the current month, the resident population of the region, the urban population, the number of colleges and universities, the number of primary and middle schools, and the like.
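A minimal sketch of the feature labeling is given below, covering a few of the time, holiday and activity features named above. The integer encodings, the period boundaries, the peak hours and the season mapping are all assumptions made for illustration; the embodiment does not fix them, and region features are omitted for brevity.

```python
from datetime import date, datetime
from typing import List, Sequence, Set

PEAK_HOURS = range(19, 23)  # assumed peak hours, for illustration only


def season_of(month: int) -> int:
    """0 = winter (Dec-Feb), 1 = spring, 2 = summer, 3 = autumn (assumed mapping)."""
    return (month % 12) // 3


def period_of(hour: int) -> int:
    """0 = morning, 1 = noon, 2 = afternoon, 3 = evening (assumed boundaries)."""
    if 6 <= hour < 11:
        return 0
    if 11 <= hour < 14:
        return 1
    if 14 <= hour < 18:
        return 2
    return 3


def feature_row(ts: datetime, holidays: Set[date], activity_days: Set[date]) -> List[int]:
    """One row of the feature matrix for a single acquisition time."""
    return [
        season_of(ts.month),
        period_of(ts.hour),
        int(ts.hour in PEAK_HOURS),
        int(ts.date() in holidays),
        int(ts.date() in activity_days),
    ]


def feature_matrix(
    timestamps: Sequence[datetime], holidays: Set[date], activity_days: Set[date]
) -> List[List[int]]:
    """Build a w x k matrix, here with k = 5 columns."""
    return [feature_row(ts, holidays, activity_days) for ts in timestamps]


F = feature_matrix(
    [datetime(2021, 1, 1, 20), datetime(2021, 7, 15, 9)],
    holidays={date(2021, 1, 1)},
    activity_days=set(),
)
```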
It should be noted that the training data of the model may be obtained in the same manner as the prediction data described above. Referring to fig. 4, fig. 4 is a schematic diagram of the training data provided in the embodiment of the present invention. As shown in fig. 4, historical flow data is collected first, where the historical flow data represents the historical flow data of each region in the target range, and the number of acquisition time points and the granularity of the acquisition time may be set according to actual needs. The historical flow data are time series data with multiple dimensions, comprising the acquisition time, the flow value corresponding to the acquisition time, the holiday information corresponding to the acquisition time, the activity information corresponding to the acquisition time and the like. After the first flow data of the area to be predicted is acquired, feature sequences are constructed, which can comprise a time feature sequence, a holiday feature sequence, an activity feature sequence and an area feature sequence of the area to be predicted. After the feature sequences are constructed, the historical flow data of each region, together with the time feature sequence, the holiday feature sequence, the activity feature sequence and the area feature sequence, are used as training data for training and learning the preset model.
In this embodiment, the feature matrix of the area to be predicted is constructed by performing feature labeling on the first flow data of the area to be predicted, so that when training and prediction are performed through a preset model, the influence of factors such as time features, holiday features, activity features and area features on the flow value of the area to be predicted can be comprehensively considered, and the accuracy of the prediction result can be improved.
In an application example, referring to fig. 5, fig. 5 is a schematic structural diagram of a preset model provided by the embodiment of the present invention. In fig. 5, the preset model includes a first preset model, a second preset model and a third preset model, the first preset model is cascaded with the second preset model and the third preset model, the second preset model is cascaded with the third preset model, and the second preset model includes a first sub-model, a second sub-model and a third sub-model, which respectively correspond to the inflow type, the outflow type and the stable type among the population flow types. The first traffic data {W1}_{w×4} and the feature matrix {F}_{w×k} of the region to be predicted are used as inputs of the first preset model, and an initial traffic prediction sequence {Y'}_L, a holiday sequence {H}_L, a trend sequence {T}_L and a season sequence {S}_L are output through the first preset model. The initial flow prediction sequence {Y'}_L and the feature matrix {F}_{w×k} are input into the first sub-model, the second sub-model and the third sub-model, and 3 flow difference feature sequences are obtained through prediction, namely a flow difference feature sequence {D1'}_L corresponding to the inflow type, a flow difference feature sequence {D2'}_L corresponding to the outflow type and a flow difference feature sequence {D3'}_L corresponding to the stable type.
Finally, the initial flow prediction sequence {Y'}_L, the holiday sequence {H}_L, the trend sequence {T}_L, the season sequence {S}_L, the flow difference feature sequence {D1'}_L corresponding to the inflow type, the flow difference feature sequence {D2'}_L corresponding to the outflow type and the flow difference feature sequence {D3'}_L corresponding to the stable type are used as the input of the third preset model, and the flow prediction result {Y}_L of the area to be predicted is obtained through the third preset model.
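The seven length-L sequences fed to the third preset model can be viewed as the columns of an L×7 input table. The sketch below shows one way to assemble such a table; the column layout, the dictionary keys and the helper name `assemble_correction_inputs` are assumptions, and the values are made-up illustrative numbers with L = 2.

```python
from typing import Dict, List


def assemble_correction_inputs(sequences: Dict[str, List[float]]) -> List[List[float]]:
    """Stack equally long sequences column-wise into one row per prediction time point."""
    lengths = {len(s) for s in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must cover the same L time points")
    # zip(*...) transposes the columns into rows; dicts keep insertion order.
    return [list(row) for row in zip(*sequences.values())]


rows = assemble_correction_inputs({
    "Y'": [100.0, 110.0],   # initial flow prediction sequence
    "H": [0.0, 1.0],        # holiday sequence
    "T": [90.0, 91.0],      # trend sequence
    "S": [10.0, 19.0],      # season sequence
    "D1'": [5.0, 6.0],      # inflow-type flow difference feature sequence
    "D2'": [-3.0, -2.0],    # outflow-type flow difference feature sequence
    "D3'": [0.5, 0.0],      # stable-type flow difference feature sequence
})
```

Each row then serves as the input for one prediction time point, and a correction model trained on such rows would output the corresponding value of {Y}_L.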
In this embodiment, data such as an initial flow prediction sequence, a holiday sequence, a trend sequence and a season sequence of the region to be predicted can be obtained through the first preset model, a flow difference feature sequence corresponding to each population flow type can be obtained through the second preset model, and finally the initial flow prediction sequence is corrected through the third preset model, so that the prediction result considers a plurality of features of the region to be predicted and the influence of population flows of other regions in the target range on the flow, and the accuracy of the prediction result is improved.
The embodiment of the present invention further provides a network traffic prediction device. Referring to fig. 6, fig. 6 is a schematic structural diagram of the network traffic prediction device provided in the embodiment of the present invention. As shown in fig. 6, the network traffic prediction device 600 includes:
The acquiring module 601 is configured to acquire historical flow data of each region in the target range, where the historical flow data includes first flow data corresponding to a region to be predicted;
a first determining module 602, configured to determine a feature matrix corresponding to the region to be predicted based on the first flow data;
A second determining module 603, configured to input the historical flow data and the feature matrix to a preset model, and determine a flow prediction result of the area to be predicted;
The preset models comprise a first preset model, a second preset model and a third preset model, wherein the first preset model is respectively cascaded with the second preset model and the third preset model, and the second preset model is cascaded with the third preset model.
Optionally, the second determining module 603 includes:
The first input sub-module is used for inputting the first flow data and the feature matrix into a first preset model to obtain a first prediction result, and the first prediction result comprises an initial flow prediction sequence;
The second input sub-module is used for inputting the initial flow prediction sequence and the feature matrix into a second preset model to obtain a second prediction result, wherein the second prediction result comprises a flow difference feature sequence;
The third input sub-module is used for inputting the first prediction result and the flow difference characteristic sequence into a third preset model to obtain a flow prediction result of the area to be predicted, and the flow difference characteristic sequence is used for indicating the influence quantity of population flow of each area in the target range on the flow of the area to be predicted;
the first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used for correcting the initial flow prediction sequence according to the flow difference characteristic sequence.
Optionally, the first prediction result further includes:
At least one of a holiday sequence, a trend sequence and a season sequence of the area to be predicted, wherein the holiday sequence is used for indicating holiday characteristics of the area to be predicted within a preset prediction duration, the trend sequence is used for indicating trend characteristics of the area to be predicted within the prediction duration, and the season sequence is used for indicating season characteristics of the area to be predicted within the preset prediction duration.
Optionally, the second preset model includes N sub-models, the second prediction result includes N second prediction sub-results, and N is a positive integer; the second input submodule includes:
the input unit is used for inputting the initial flow prediction sequence and the feature matrix into the target sub-model to obtain a second predictor result;
Wherein the target sub-model is any one of the N sub-models.
Optionally, the second determining module 603 further includes:
The training and learning sub-module is used for training and learning the historical flow data and the feature matrix to obtain N sub-models;
Wherein the value of N is consistent with the number of population flow types included in each region in the target range, and the population flow types comprise at least one of inflow type, outflow type and stable type.
Optionally, the training learning sub-module is specifically configured to:
classifying historical flow data of each region according to population flow types of each region in a target range;
Respectively calculating N average value sequences corresponding to the N population flow types, wherein the average value sequences are used for indicating flow average values corresponding to a plurality of areas with the same population flow type in a target range at each acquisition time point;
based on the first flow data and the N average value sequences, N differential feature sequences corresponding to the N population flow types are determined, wherein the differential feature sequences comprise differences between flow values corresponding to all acquisition time points in the first flow data and flow average values of corresponding time points of the average value sequences;
the first flow data and the feature matrix are used as the input of each sub-model in the N sub-models, the N differential feature sequences are respectively used as the output of each sub-model in the N sub-models, and the N sub-models are obtained through training and learning.
Optionally, the first determining module 602 includes:
the marking sub-module is used for carrying out characteristic marking on the first flow data to obtain a plurality of characteristic sequences;
the second determining submodule is used for determining a feature matrix corresponding to the region to be predicted based on the plurality of feature sequences;
the first flow data is a four-dimensional time sequence comprising a plurality of acquisition time points, flow values corresponding to the acquisition time points, holiday information and activity information, and the plurality of feature sequences comprise a time feature sequence, a holiday feature sequence, an activity feature sequence and an area feature sequence.
The network traffic prediction device 600 can implement the processes of the above-mentioned network traffic prediction method embodiment, and achieve the same beneficial effects, and for avoiding repetition, a detailed description is omitted here.
The embodiment of the invention also provides a network traffic prediction device 600, which comprises: a processor, a memory, and a program stored in the memory and executable on the processor; when the program is executed by the processor, the processes of the network traffic prediction method embodiment can be implemented and the same technical effects can be achieved.
Specifically, referring to fig. 7, the embodiment of the present invention further provides a network traffic prediction apparatus, which includes a bus 701, a transceiver 702, an antenna 703, a bus interface 704, a processor 705, and a memory 706.
A processor 705, configured to obtain historical flow data of each region in the target range, where the historical flow data includes first flow data corresponding to the region to be predicted;
Determining a feature matrix corresponding to the region to be predicted based on the first flow data;
inputting the historical flow data and the feature matrix into a preset model, and determining a flow prediction result of the area to be predicted;
The preset models comprise a first preset model, a second preset model and a third preset model, wherein the first preset model is respectively cascaded with the second preset model and the third preset model, and the second preset model is cascaded with the third preset model.
Further, the processor 705 is further configured to input the first flow data and the feature matrix to a first preset model, to obtain a first prediction result, where the first prediction result includes an initial flow prediction sequence;
Inputting the initial flow prediction sequence and the feature matrix into a second preset model to obtain a second prediction result, wherein the second prediction result comprises a flow difference feature sequence;
inputting the first prediction result and the flow difference characteristic sequence into a third preset model to obtain a flow prediction result of the area to be predicted, wherein the flow difference characteristic sequence is used for indicating the influence quantity of population flow of each area in the target range on the flow of the area to be predicted;
the first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used for correcting the initial flow prediction sequence according to the flow difference characteristic sequence.
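The three-stage cascade described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function `predict_cascade` and the three stand-in models (`repeat_last_value`, `toy_difference_model`, `additive_correction`) are hypothetical names, and the additive form of the correction is an assumption, since the text only states that the third preset model corrects the initial flow prediction sequence according to the flow difference feature sequence.

```python
from typing import Callable, List, Sequence

Series = List[float]
Matrix = List[List[float]]


def predict_cascade(
    first_flow: Sequence[float],
    feature_matrix: Matrix,
    first_model: Callable[[Sequence[float], Matrix], Series],
    second_model: Callable[[Series, Matrix], Series],
    third_model: Callable[[Series, Series], Series],
) -> Series:
    # Stage 1: first flow data + feature matrix -> initial flow prediction sequence.
    initial = first_model(first_flow, feature_matrix)
    # Stage 2: initial prediction + feature matrix -> flow difference feature sequence.
    difference = second_model(initial, feature_matrix)
    # Stage 3: correct the initial prediction using the difference sequence.
    return third_model(initial, difference)


# Hypothetical stand-ins, used only to make the sketch runnable.
def repeat_last_value(first_flow: Sequence[float], feature_matrix: Matrix) -> Series:
    # Naive forecast: repeat the last observed value once per feature row.
    return [first_flow[-1]] * len(feature_matrix)


def toy_difference_model(initial: Series, feature_matrix: Matrix) -> Series:
    # Toy rule (assumption): column 0 flags time points where 10% extra flow is expected.
    return [0.1 * y * row[0] for y, row in zip(initial, feature_matrix)]


def additive_correction(initial: Series, difference: Series) -> Series:
    # Assumed correction: add the difference sequence element-wise.
    return [y + d for y, d in zip(initial, difference)]


history = [100.0, 120.0, 110.0]
features = [[0.0], [1.0], [0.0]]
forecast = predict_cascade(
    history, features, repeat_last_value, toy_difference_model, additive_correction
)
```

Passing the models in as callables keeps the sketch independent of any particular machine learning library; the only structural constraint taken from the text is the order of the three stages and what each one consumes.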
Further, the first prediction result further includes:
At least one of a holiday sequence, a trend sequence and a season sequence of the area to be predicted, wherein the holiday sequence is used for indicating holiday characteristics of the area to be predicted within a preset prediction duration, the trend sequence is used for indicating trend characteristics of the area to be predicted within the prediction duration, and the season sequence is used for indicating season characteristics of the area to be predicted within the preset prediction duration.
Further, the processor 705 is further configured to input the initial flow prediction sequence and the feature matrix to the target sub-model, to obtain a second prediction sub-result;
The target sub-model is any one of the N sub-models.
Further, the processor 705 is further configured to perform training learning on the historical traffic data and the feature matrix to obtain N sub-models;
Wherein the value of N is consistent with the number of population flow types included in each region in the target range, and the population flow types comprise at least one of inflow type, outflow type and stable type.
Further, the processor 705 is further configured to classify historical traffic data of each region according to population flow types of each region within the target range;
Respectively calculating N mean value sequences corresponding to the N population flow types, wherein each mean value sequence is used for indicating, at each acquisition time point, the flow mean value over the plurality of areas in the target range that have the same population flow type;
based on the first flow data and the N mean value sequences, N differential feature sequences corresponding to the N population flow types are determined, wherein each differential feature sequence comprises the differences between the flow values at the acquisition time points in the first flow data and the flow mean values at the corresponding time points of the mean value sequence;
the first flow data and the feature matrix are used as the input of each sub-model in the N sub-models, the N differential feature sequences are respectively used as the output of each sub-model in the N sub-models, and the N sub-models are obtained through training and learning.
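The preparation of training targets for the N sub-models can be illustrated with a short sketch. It assumes each region's history is a list of flow values aligned on the same acquisition time points and that each region carries exactly one population flow type label; the function names, region names, and sample numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def mean_sequences(
    history: Dict[str, List[float]], flow_type: Dict[str, str]
) -> Dict[str, List[float]]:
    # Group the regional series by population flow type.
    grouped: Dict[str, List[List[float]]] = defaultdict(list)
    for region, series in history.items():
        grouped[flow_type[region]].append(series)
    # For each type, average across regions at every acquisition time point.
    return {
        kind: [sum(values) / len(values) for values in zip(*group)]
        for kind, group in grouped.items()
    }


def differential_sequences(
    first_flow: List[float], means: Dict[str, List[float]]
) -> Dict[str, List[float]]:
    # One differential feature sequence per type: first flow data minus the type mean.
    return {
        kind: [x - m for x, m in zip(first_flow, mean)]
        for kind, mean in means.items()
    }


history = {
    "A": [10.0, 20.0, 30.0],
    "B": [30.0, 40.0, 50.0],
    "C": [5.0, 5.0, 5.0],
    "D": [100.0, 90.0, 80.0],
}
flow_type = {"A": "inflow", "B": "inflow", "C": "stable", "D": "outflow"}

means = mean_sequences(history, flow_type)
# Region "A" plays the role of the region to be predicted.
targets = differential_sequences(history["A"], means)
```

With three population flow types, `targets` holds three sequences, so N = 3; each one would serve as the output that one sub-model is trained to reproduce from the first flow data and the feature matrix.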
Further, the processor 705 is further configured to perform feature labeling on the first traffic data to obtain a plurality of feature sequences;
determining a feature matrix corresponding to the region to be predicted based on the plurality of feature sequences;
the first flow data is a four-dimensional time sequence comprising a plurality of acquisition time points, flow values corresponding to the acquisition time points, holiday information and activity information, and the plurality of feature sequences comprise a time feature sequence, a holiday feature sequence, an activity feature sequence and an area feature sequence.
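A minimal sketch of the feature labeling step follows. The text does not specify how each feature sequence is encoded, so the encoding here is an assumption: the time feature is split into hour of day and day of week, holiday and activity information become 0/1 flags, and the area feature is a numeric region code. `Sample`, `build_feature_matrix`, and `area_code` are hypothetical names.

```python
from datetime import datetime
from typing import List, NamedTuple


class Sample(NamedTuple):
    # One point of the four-dimensional time sequence.
    time: datetime   # acquisition time point
    flow: float      # flow value at that time point
    holiday: bool    # holiday information
    activity: bool   # activity information


def build_feature_matrix(samples: List[Sample], area_code: int) -> List[List[float]]:
    # One row per acquisition time point; columns are
    # [hour, weekday, holiday flag, activity flag, area code].
    return [
        [
            float(s.time.hour),
            float(s.time.weekday()),
            float(s.holiday),
            float(s.activity),
            float(area_code),
        ]
        for s in samples
    ]


samples = [
    Sample(datetime(2021, 1, 1, 8), 120.0, True, False),
    Sample(datetime(2021, 1, 1, 9), 150.0, True, True),
]
matrix = build_feature_matrix(samples, area_code=3)
```

The flow value itself is left out of the matrix because, in the text, the first flow data and the feature matrix are supplied to the preset model as separate inputs.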
In FIG. 7, the bus architecture is represented by the bus 701. The bus 701 may include any number of interconnected buses and bridges, and links together various circuits, including one or more processors, represented by the processor 705, and memory, represented by the memory 706. The bus 701 may also link together various other circuits such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further herein. The bus interface 704 provides an interface between the bus 701 and the transceiver 702. The transceiver 702 may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. Data processed by the processor 705 is transmitted over a wireless medium via the antenna 703; the antenna 703 also receives data and passes it to the processor 705.
The processor 705 is responsible for managing the bus 701 and for general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 706 may be used to store data used by the processor 705 in performing operations.
Optionally, the processor 705 may be a CPU, an ASIC, an FPGA, or a CPLD.
The embodiment of the invention also provides a computer readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the processes of the network traffic prediction method embodiment described above and achieves the same technical effects; to avoid repetition, the details are not repeated here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present invention and the scope of the claims, which are to be protected by the present invention.
Claims (9)
1. A method for predicting network traffic, the method comprising:
Acquiring historical flow data of each region in a target range, wherein the historical flow data comprises first flow data corresponding to a region to be predicted;
Determining a feature matrix corresponding to the region to be predicted based on the first flow data;
inputting the historical flow data and the feature matrix into a preset model, and determining a flow prediction result of the area to be predicted;
The preset models comprise a first preset model, a second preset model and a third preset model, the first preset model is respectively cascaded with the second preset model and the third preset model, the second preset model is cascaded with the third preset model, and the second preset model is used for predicting the flow conversion relation between the region to be predicted and other regions in the target range;
Inputting the historical flow data and the feature matrix into a preset model, and determining a flow prediction result of the region to be predicted comprises the following steps:
Inputting the first flow data and the feature matrix into the first preset model to obtain a first prediction result, wherein the first prediction result comprises an initial flow prediction sequence;
Inputting the initial flow prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, wherein the second prediction result comprises a flow difference feature sequence, and the flow difference feature sequence is used for indicating the influence quantity of population flow of each region in the target range on the flow of the region to be predicted;
inputting the first prediction result and the flow difference characteristic sequence into the third preset model to obtain a flow prediction result of the region to be predicted;
The first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used for correcting the initial flow prediction sequence according to the flow difference characteristic sequence.
2. The method of claim 1, wherein the first prediction result further comprises:
At least one of a holiday sequence, a trend sequence and a season sequence of the region to be predicted, wherein the holiday sequence is used for indicating holiday characteristics of the region to be predicted within a preset prediction duration, the trend sequence is used for indicating trend characteristics of the region to be predicted within the prediction duration, and the season sequence is used for indicating season characteristics of the region to be predicted within the preset prediction duration.
3. The method of claim 1, wherein the second preset model comprises N sub-models, the second prediction result comprises N second prediction sub-results, and N is a positive integer;
Inputting the initial flow prediction sequence and the feature matrix into a second preset model to obtain a second prediction result, wherein the method comprises the following steps of:
Inputting the initial flow prediction sequence and the feature matrix into a target sub-model to obtain a second prediction sub-result;
The target sub-model is any one of the N sub-models.
4. The method according to claim 3, further comprising, before said inputting the initial flow prediction sequence and the feature matrix into the second preset model to obtain a second prediction result:
Training and learning the historical flow data and the feature matrix to obtain the N sub-models;
the value of the N is consistent with the number of population flowing types included in each area in the target range, and the population flowing types comprise at least one of inflow type, outflow type and stable type.
5. The method of claim 4, wherein training the historical traffic data and the feature matrix to obtain the N sub-models comprises:
Classifying historical flow data of each region according to population flow types of each region in the target range;
Respectively calculating N mean value sequences corresponding to the N population flow types, wherein each mean value sequence is used for indicating, at each acquisition time point, the flow mean value over the plurality of areas in the target range that have the same population flow type;
Based on the first flow data and the N mean value sequences, N differential feature sequences corresponding to the N population flow types are determined, wherein each differential feature sequence comprises the differences between the flow values at the acquisition time points in the first flow data and the flow mean values at the corresponding time points of the mean value sequence;
And taking the first flow data and the feature matrix as the input of each sub-model in the N sub-models, taking the N differential feature sequences as the output of each sub-model in the N sub-models, and training and learning to obtain the N sub-models.
6. The method of claim 1, wherein determining the feature matrix corresponding to the region to be predicted based on the first traffic data comprises:
Performing feature labeling on the first flow data to obtain a plurality of feature sequences;
Determining a feature matrix corresponding to the region to be predicted based on the plurality of feature sequences;
The first flow data is a four-dimensional time sequence comprising a plurality of acquisition time points, flow values, holiday information and activity information corresponding to the acquisition time points, and the plurality of feature sequences comprise a time feature sequence, a holiday feature sequence, an activity feature sequence and an area feature sequence.
7. A network traffic prediction apparatus, the apparatus comprising:
the acquisition module is used for acquiring historical flow data of each region in the target range, wherein the historical flow data comprises first flow data corresponding to the region to be predicted;
the first determining module is used for determining a feature matrix corresponding to the region to be predicted based on the first flow data;
the second determining module is used for inputting the historical flow data and the feature matrix into a preset model and determining a flow prediction result of the area to be predicted;
The preset models comprise a first preset model, a second preset model and a third preset model, the first preset model is respectively cascaded with the second preset model and the third preset model, the second preset model is cascaded with the third preset model, and the second preset model is used for predicting the flow conversion relation between the region to be predicted and other regions in the target range;
The second determining module includes:
The first input sub-module is used for inputting the first flow data and the feature matrix into the first preset model to obtain a first prediction result, and the first prediction result comprises an initial flow prediction sequence;
The second input sub-module is used for inputting the initial flow prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, wherein the second prediction result comprises a flow difference feature sequence, and the flow difference feature sequence is used for indicating the influence quantity of population flow of each region in the target range on the flow of the region to be predicted;
The third input sub-module is used for inputting the first prediction result and the flow difference characteristic sequence into the third preset model to obtain a flow prediction result of the region to be predicted;
The first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used for correcting the initial flow prediction sequence according to the flow difference characteristic sequence.
8. A network traffic prediction apparatus, comprising: a processor, a memory and a program stored on the memory and executable on the processor, which when executed by the processor implements the steps of the network traffic prediction method according to any one of claims 1 to 6.
9. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the network traffic prediction method according to any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110005328.2A CN114726745B (en) | 2021-01-05 | 2021-01-05 | Network traffic prediction method, device and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110005328.2A CN114726745B (en) | 2021-01-05 | 2021-01-05 | Network traffic prediction method, device and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114726745A (en) | 2022-07-08 |
CN114726745B (en) | 2024-05-17 |
Family
ID=82234109
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110005328.2A Active CN114726745B (en) | 2021-01-05 | 2021-01-05 | Network traffic prediction method, device and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114726745B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116709356B (en) * | 2022-09-05 | 2024-07-26 | 荣耀终端有限公司 | Flow prediction method, device and system |
CN115802366B (en) * | 2023-02-13 | 2023-04-28 | 网络通信与安全紫金山实验室 | Network traffic prediction method, device, computer equipment and medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109120463A (en) * | 2018-10-15 | 2019-01-01 | 新华三大数据技术有限公司 | Method for predicting and device |
CN110267292A (en) * | 2019-05-16 | 2019-09-20 | 湖南大学 | Cellular network method for predicting based on Three dimensional convolution neural network |
CN110995520A (en) * | 2020-02-28 | 2020-04-10 | 清华大学 | Network flow prediction method and device, computer equipment and readable storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10644979B2 (en) * | 2018-06-06 | 2020-05-05 | The Joan and Irwin Jacobs Technion-Cornell Institute | Telecommunications network traffic metrics evaluation and prediction |
Also Published As
Publication number | Publication date |
---|---|
CN114726745A (en) | 2022-07-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110969854A (en) | Traffic flow prediction method, system and terminal equipment | |
CN114726745B (en) | Network traffic prediction method, device and computer readable storage medium | |
CN111210093B (en) | Daily water consumption prediction method based on big data | |
CN109448361B (en) | Resident traffic travel flow prediction system and prediction method thereof | |
Hasnine et al. | Tour-based mode choice modelling as the core of an activity-based travel demand modelling framework: A review of state-of-the-art | |
CN110555561A (en) | Medium-and-long-term runoff ensemble forecasting method | |
CN111726243B (en) | Method and device for predicting node state | |
CN110175690A (en) | A kind of method, apparatus, server and the storage medium of scenic spot passenger flow forecast | |
CN114331542A (en) | Method and device for predicting charging demand of electric vehicle | |
CN113762595A (en) | Traffic time prediction model training method, traffic time prediction method and equipment | |
CN113256022A (en) | Method and system for predicting electric load of transformer area | |
CN115423162A (en) | Traffic flow prediction method and device, electronic equipment and storage medium | |
CN117494906B (en) | Natural gas daily load prediction method based on multivariate time series | |
CN112287503B (en) | Dynamic space network construction method for traffic demand prediction | |
JP4780668B2 (en) | Traffic analysis model construction method, apparatus, construction program, and storage medium thereof | |
CN111833088B (en) | Supply and demand prediction method and device | |
CN113344290B (en) | Method for correcting sub-season rainfall weather forecast based on U-Net network | |
CN115409170A (en) | Sample data generation and trip demand prediction model training and prediction method and device | |
Su et al. | Graph ode recurrent neural networks for traffic flow forecasting | |
Jia et al. | RHMX: Bus Arrival Time Prediction via Mixed Model | |
CN117994006B (en) | Vehicle charging recommendation method and device | |
CN115578861B (en) | Highway traffic flow prediction method based on embedded feature selection strategy | |
CN117272848B (en) | Subway passenger flow prediction method and model training method based on space-time influence | |
CN117896671B (en) | Intelligent management method and system for Bluetooth AOA base station | |
JP4780670B2 (en) | Traffic analysis model construction method, apparatus, construction program, and storage medium thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||