CN114726745B - Network traffic prediction method, device and computer readable storage medium

Info

Publication number: CN114726745B
Application number: CN202110005328.2A
Authority: CN
Inventors: 魏玉琼; 朱琳; 郭倩影; 王士一
Original assignee: China Mobile Communications Group Co Ltd; Research Institute of China Mobile Communication Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; Research Institute of China Mobile Communication Co Ltd
Priority date: 2021-01-05
Filing date: 2021-01-05
Publication date: 2024-05-17
Anticipated expiration: 2041-01-05
Also published as: CN114726745A

Abstract

The application discloses a network traffic prediction method, a device and a computer-readable storage medium, relating to the technical field of artificial intelligence. The method comprises: acquiring historical traffic data of each region within a target range, wherein the historical traffic data comprises first traffic data corresponding to a region to be predicted; determining a feature matrix corresponding to the region to be predicted based on the first traffic data; and inputting the historical traffic data and the feature matrix into a preset model to determine a traffic prediction result for the region to be predicted. The preset model comprises a first preset model, a second preset model and a third preset model, wherein the first preset model is cascaded with each of the second preset model and the third preset model, and the second preset model is cascaded with the third preset model. In this way, the traffic of the region to be predicted is predicted by a plurality of cascaded models, so that the prediction result takes into account both the feature matrix of the region to be predicted and the traffic influence of the other regions within the target range, which improves the accuracy of the prediction result.

Description

Network traffic prediction method, device and computer readable storage medium

Technical Field

The present invention relates to the field of artificial intelligence, and in particular to a network traffic prediction method, device and computer-readable storage medium.

Background

With the development of fourth-generation (4G) and fifth-generation (5G) mobile communication networks, a large number of network services have become part of daily life through various terminal devices, and users' requirements for network speed and latency have risen accordingly. To ensure a good user experience, the network must be given sufficient resources. Operators usually formulate network resource allocation strategies at the level of provincial companies. Network experts in each provincial company predict network traffic from the time series of network performance indicators of the cities in the province, judge the network congestion each city will face over a coming period, and draw up a resource allocation plan based on the resource load of each subordinate city, so as to meet each city's demand for network resources while avoiding wasted resources as far as possible.

However, existing network traffic prediction methods usually split the original time series into multiple sub-series, predict each sub-series separately, and sum the sub-series predictions to obtain the prediction for the original series. Because such methods fit the prediction purely mathematically, they suit a single time series that is not affected by other time series. Applied to network traffic prediction, they ignore the similarity and correlation of traffic changes among cities in the same province, which results in low prediction accuracy.

Summary of the Invention

Embodiments of the present invention provide a network traffic prediction method, device and computer-readable storage medium to solve the problem that existing network traffic prediction methods produce prediction results of low accuracy.

To solve the above technical problem, the present invention is implemented as follows:

In a first aspect, an embodiment of the present invention provides a network traffic prediction method, the method comprising:

acquiring historical traffic data of each region within a target range, wherein the historical traffic data includes first traffic data corresponding to a region to be predicted;

determining, based on the first traffic data, a feature matrix corresponding to the region to be predicted;

inputting the historical traffic data and the feature matrix into a preset model to determine a traffic prediction result for the region to be predicted;

wherein the preset model includes a first preset model, a second preset model and a third preset model, the first preset model is cascaded with each of the second preset model and the third preset model, and the second preset model is cascaded with the third preset model.

Optionally, inputting the historical traffic data and the feature matrix into the preset model to determine the traffic prediction result for the region to be predicted includes:

inputting the first traffic data and the feature matrix into the first preset model to obtain a first prediction result, wherein the first prediction result includes an initial traffic prediction sequence;

inputting the initial traffic prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, wherein the second prediction result includes a traffic difference feature sequence;

inputting the first prediction result and the traffic difference feature sequence into the third preset model to obtain the traffic prediction result for the region to be predicted, wherein the traffic difference feature sequence indicates how much the population flow of each region within the target range affects the traffic of the region to be predicted;

wherein the first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used to correct the initial traffic prediction sequence according to the traffic difference feature sequence.

Optionally, the first prediction result further includes:

at least one of a holiday sequence, a trend sequence and a seasonal sequence of the region to be predicted, wherein the holiday sequence indicates the holiday characteristics of the region to be predicted within a preset prediction duration, the trend sequence indicates the trend characteristics of the region to be predicted within the preset prediction duration, and the seasonal sequence indicates the seasonal characteristics of the region to be predicted within the preset prediction duration.

Optionally, the second preset model includes N sub-models, the second prediction result includes N second prediction sub-results, and N is a positive integer;

inputting the initial traffic prediction sequence and the feature matrix into the second preset model to obtain the second prediction result includes:

inputting the initial traffic prediction sequence and the feature matrix into a target sub-model to obtain a second prediction sub-result;

wherein the target sub-model is any one of the N sub-models.

Optionally, before inputting the initial traffic prediction sequence and the feature matrix into the second preset model to obtain the second prediction result, the method further includes:

performing training and learning on the historical traffic data and the feature matrix to obtain the N sub-models;

wherein the value of N equals the number of population flow types found among the regions within the target range, and the population flow types include at least one of an inflow type, an outflow type and a stable type.

Optionally, performing training and learning on the historical traffic data and the feature matrix to obtain the N sub-models includes:

classifying the historical traffic data of each region according to the population flow type of each region within the target range;

calculating N mean sequences corresponding to the N population flow types respectively, wherein each mean sequence indicates, for each collection time point, the mean traffic of the regions within the target range that share the same population flow type;

determining, based on the first traffic data and the N mean sequences, N difference feature sequences corresponding to the N population flow types, wherein each difference feature sequence consists of the differences between the traffic value at each collection time point in the first traffic data and the mean traffic at the corresponding time point in the mean sequence;

using the first traffic data and the feature matrix as the input of each of the N sub-models, using the N difference feature sequences respectively as the outputs of the N sub-models, and obtaining the N sub-models through training and learning.
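The computation of the mean sequences and difference feature sequences described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names, the region labels and the traffic values are hypothetical, every series is assumed to be sampled at the same collection time points, and the region to be predicted is assumed to be included when averaging its own flow type.

```python
def mean_sequence(series_list):
    """Point-wise mean of several equal-length traffic series."""
    return [sum(values) / len(values) for values in zip(*series_list)]


def difference_sequences(target, traffic_by_region, flow_type_by_region):
    """Return {flow_type: difference sequence} for the region to be predicted.

    target: traffic values of the region to be predicted.
    traffic_by_region: {region: traffic values} for every region in the range.
    flow_type_by_region: {region: "inflow" | "outflow" | "stable"}.
    """
    # Step 1: classify the regional series by population flow type.
    groups = {}
    for region, series in traffic_by_region.items():
        groups.setdefault(flow_type_by_region[region], []).append(series)

    # Steps 2 and 3: one mean sequence per type, then target minus mean.
    result = {}
    for flow_type, series_list in groups.items():
        mean = mean_sequence(series_list)
        result[flow_type] = [x - m for x, m in zip(target, mean)]
    return result


# Hypothetical toy data: three regions, three collection time points.
traffic = {"A": [10, 20, 30], "B": [20, 30, 40], "C": [5, 5, 5]}
flow_types = {"A": "inflow", "B": "inflow", "C": "outflow"}
diffs = difference_sequences(traffic["A"], traffic, flow_types)
```

Here two flow types are present, so N is 2, and each entry of `diffs` would serve as the training target of one sub-model.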

Optionally, determining the feature matrix corresponding to the region to be predicted based on the first traffic data includes:

performing feature marking on the first traffic data to obtain a plurality of feature sequences;

determining, based on the plurality of feature sequences, the feature matrix corresponding to the region to be predicted;

wherein the first traffic data is a four-dimensional time series comprising a plurality of collection time points together with the traffic value, holiday information and activity information corresponding to each collection time point, and the plurality of feature sequences include a time feature sequence, a holiday feature sequence, an activity feature sequence and a regional feature sequence.

In a second aspect, an embodiment of the present invention provides a network traffic prediction device, the device comprising:

an acquisition module, configured to acquire historical traffic data of each region within a target range, wherein the historical traffic data includes first traffic data corresponding to a region to be predicted;

a first determination module, configured to determine, based on the first traffic data, a feature matrix corresponding to the region to be predicted;

a second determination module, configured to input the historical traffic data and the feature matrix into a preset model to determine a traffic prediction result for the region to be predicted;

Optionally, the second determination module includes:

a first input submodule, configured to input the first traffic data and the feature matrix into the first preset model to obtain a first prediction result, wherein the first prediction result includes an initial traffic prediction sequence;

a second input submodule, configured to input the initial traffic prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, wherein the second prediction result includes a traffic difference feature sequence;

a third input submodule, configured to input the first prediction result and the traffic difference feature sequence into the third preset model to obtain the traffic prediction result for the region to be predicted, wherein the traffic difference feature sequence indicates how much the population flow of each region within the target range affects the traffic of the region to be predicted;

Optionally, the second preset model includes N sub-models, the second prediction result includes N second prediction sub-results, and N is a positive integer; the second input submodule includes:

an input unit, configured to input the initial traffic prediction sequence and the feature matrix into a target sub-model to obtain a second prediction sub-result;

wherein the target sub-model is any one of the N sub-models.

Optionally, the second determination module further includes:

a training and learning submodule, configured to perform training and learning on the historical traffic data and the feature matrix to obtain the N sub-models;

Optionally, the training and learning submodule is specifically configured to:

Optionally, the first determination module includes:

a marking submodule, configured to perform feature marking on the first traffic data to obtain a plurality of feature sequences;

a second determination submodule, configured to determine, based on the plurality of feature sequences, the feature matrix corresponding to the region to be predicted;

In a third aspect, an embodiment of the present invention further provides a network traffic prediction device, comprising: a processor, a memory, and a program stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the method described in the first aspect.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method described in the first aspect.

In the embodiments of the present invention, historical traffic data of each region within a target range is acquired, the historical traffic data including first traffic data corresponding to a region to be predicted; a feature matrix corresponding to the region to be predicted is determined based on the first traffic data; and the historical traffic data and the feature matrix are input into a preset model to determine a traffic prediction result for the region to be predicted. The preset model includes a first preset model, a second preset model and a third preset model, the first preset model is cascaded with each of the second preset model and the third preset model, and the second preset model is cascaded with the third preset model. In this way, the network traffic of the region to be predicted is predicted by multiple cascaded models, so that the prediction result takes into account both the feature matrix of the region to be predicted and the traffic influence of the other regions within the target range, which improves the accuracy of the prediction result.

Brief Description of the Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a first flowchart of the network traffic prediction method provided by an embodiment of the present invention;

FIG. 2 is a second flowchart of the network traffic prediction method provided by an embodiment of the present invention;

FIG. 3 is a third flowchart of the network traffic prediction method provided by an embodiment of the present invention;

FIG. 4 is a schematic diagram of training data provided by an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of the preset model provided by an embodiment of the present invention;

FIG. 6 is a first schematic structural diagram of the network traffic prediction device provided by an embodiment of the present invention;

FIG. 7 is a second schematic structural diagram of the network traffic prediction device provided by an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.

The embodiments of the present invention propose a network traffic prediction method, device and related equipment. They address the problem that existing network traffic prediction methods fit the prediction purely mathematically and therefore suit only a single time series that is not affected by other time series; because such methods ignore the similarity and correlation of traffic changes among cities in the same province, their prediction accuracy is low.

Referring to FIG. 1, FIG. 1 is a first flowchart of the network traffic prediction method provided by an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:

Step 101: Acquire historical traffic data of each region within a target range, where the historical traffic data includes first traffic data corresponding to the region to be predicted.

The target range may include multiple regions. The target range may be the network coverage of a country, the network coverage of a province within a country, the network coverage of a city within a province, and so on. When the target range is the network coverage of a country, a region is the network coverage of a province in that country; when the target range is the network coverage of a province, a region is the network coverage of a city in that province; when the target range is the network coverage of a city, a region is the network coverage of a district or county in that city.

The historical traffic data is the data collected for each region within the target range over a preset historical duration. The preset historical duration may be any length, such as half a year, one year, two years or three years, and the collection granularity may be any granularity, such as hour, day or month; this application places no specific limit on either. The historical traffic data may include data of multiple dimensions, such as the collection time, the traffic value at each collection time, the holiday information for each collection time, the activity information for each collection time, and so on. For example, if the target range includes 10 regions, collecting these four dimensions for each of the 10 regions over the preset historical duration yields 10 time series {W}_{w×4}, where {W} denotes the four-dimensional time series over the w collection time points of one region, w is the length of the time series, 4 is its number of dimensions, and the value of w is determined by the preset historical duration and the collection granularity.
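The data layout described above can be sketched as follows. This is a minimal illustration, assuming daily granularity and binary holiday and activity flags; the helper `build_series`, the region names, the dates and all traffic values are hypothetical and are not taken from the application.

```python
from datetime import date, timedelta


def build_series(start, traffic, holidays, activities):
    """Build one region's w x 4 series.

    Each row is (collection_time, traffic_value, is_holiday, has_activity).
    """
    rows = []
    for offset, value in enumerate(traffic):
        day = start + timedelta(days=offset)
        rows.append((day, value, int(day in holidays), int(day in activities)))
    return rows


start = date(2020, 1, 1)
w = 7  # one week of daily samples; in general w = duration / granularity
holidays = {date(2020, 1, 1)}    # assumed holiday calendar
activities = {date(2020, 1, 4)}  # assumed local event

# Ten regions, each with a made-up traffic curve of length w.
history = {
    f"region_{r}": build_series(
        start, [100.0 + r + d for d in range(w)], holidays, activities
    )
    for r in range(10)
}

first_traffic_data = history["region_0"]  # the region to be predicted
shape = (len(first_traffic_data), len(first_traffic_data[0]))
```

With these inputs, `history` holds ten series and `shape` is `(w, 4)`, matching the {W}_{w×4} notation.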

Step 102: Determine, based on the first traffic data, a feature matrix corresponding to the region to be predicted.

The first traffic data may include data of multiple dimensions collected for the region to be predicted over the preset historical duration, such as the collection time, the traffic value at each collection time, the holiday information for each collection time, the activity information for each collection time, and so on. The feature matrix includes multiple feature sequences, each of which indicates one feature of the first traffic data, such as a time feature, a holiday feature, an activity feature or a regional feature.

In one embodiment, the first traffic data includes the collection time, the traffic value at each collection time, the holiday information for each collection time and the activity information for each collection time, and is denoted {W1}_{w×4}, where W1 is the four-dimensional data matrix over the w collection time points of the region to be predicted, and the value of w is determined by the preset historical duration and the collection granularity. The first traffic data can then be marked with features through big data analysis. The features include time features, holiday features, activity features and regional features, which yield a time feature sequence, a holiday feature sequence, an activity feature sequence and a regional feature sequence. These sequences are combined into a feature matrix {F}_{w×k}, where {F} denotes the set of k-dimensional feature sequences over the w collection time points, w is the length of the feature sequences, k is the number of feature dimensions, the value of w is determined by the preset historical duration and the collection granularity, and k is any positive integer.

Specifically, the time feature indicates time information for each collection time of the first traffic data, such as the season (spring, summer, autumn or winter), the time of day (morning, noon, afternoon or evening), whether it falls in a commuting rush hour, and so on. The holiday feature indicates holiday information for each collection time, such as whether it is a holiday, the name of the holiday and the length of the holiday. The activity feature indicates activity information for each collection time, such as whether it is an activity day, the name of the activity, the duration of the activity and the area the activity covers. The regional feature indicates information about the region to be predicted, such as its population flow type (population inflow, population outflow or population stable), its city type (tourist city, agricultural city or industrial city), its gross domestic product (GDP) for the month, the number of permanent residents in the region, the urban population, the number of colleges and universities, the number of primary and secondary schools, and so on.
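One way the feature marking could turn the w×4 series into a w×k feature matrix is sketched below. It is only an illustration under stated assumptions: the chosen features (season index, weekday, holiday flag, activity flag and a one-hot population flow type, so k = 7), the season convention and all names are hypothetical, and the application leaves the actual feature set open.

```python
from datetime import date

FLOW_TYPES = ("inflow", "outflow", "stable")


def season_index(day):
    """0 = winter (Dec-Feb), 1 = spring, 2 = summer, 3 = autumn (assumed convention)."""
    return (day.month % 12) // 3


def feature_row(day, is_holiday, has_activity, flow_type):
    """One row of F: time, holiday, activity and regional features."""
    flow_one_hot = [int(flow_type == t) for t in FLOW_TYPES]
    return [season_index(day), day.weekday(), is_holiday, has_activity] + flow_one_hot


def feature_matrix(series, flow_type):
    """series: rows of (collection_time, traffic_value, is_holiday, has_activity)."""
    return [
        feature_row(day, holiday, activity, flow_type)
        for day, _traffic, holiday, activity in series
    ]


# Two hypothetical collection points for a population-inflow region.
series = [
    (date(2020, 1, 1), 100.0, 1, 0),
    (date(2020, 7, 4), 120.0, 0, 1),
]
F = feature_matrix(series, "inflow")
```

The regional features are constant over time here, so they are simply repeated in every row; further columns such as monthly GDP or resident population would be appended the same way.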

Step 103: Input the historical traffic data and the feature matrix into the preset model to determine the traffic prediction result for the region to be predicted.

The preset model may include multiple trained models. Given the input historical traffic data and feature matrix, it predicts the network traffic of the region to be predicted over a preset future duration and outputs the traffic prediction result. Specifically, the preset model may include a first preset model that predicts the initial traffic prediction sequence, holiday sequence, trend sequence and seasonal sequence of the region to be predicted, a second preset model that predicts the traffic conversion relationship between the region to be predicted and the other regions within the target range, and a third preset model that corrects the initial traffic prediction sequence. The first preset model may be a machine learning model such as a Prophet model, used to capture the periodic features of the region to be predicted. The second preset model may be a machine learning model such as a Long Short-Term Memory (LSTM) network, used to capture the temporal closeness features of the region to be predicted. The third preset model may be an eXtreme Gradient Boosting (XGBoost) model or the like. The first preset model is cascaded with each of the second preset model and the third preset model, and the second preset model is cascaded with the third preset model. In this way, the prediction result of the first preset model serves as input data for the second and third preset models, the prediction result of the second preset model serves as input data for the third preset model, and the third preset model finally determines the traffic prediction result for the region to be predicted.
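The cascade wiring described above can be sketched as follows. The three models are passed in as plain callables so that only the data flow is shown; the stand-in functions below are deliberately trivial placeholders, not Prophet, LSTM or XGBoost, and the additive correction in the third stand-in is an assumption made for illustration only.

```python
def cascade_predict(first_model, second_model, third_model, traffic, features):
    """Run the three cascaded models and return the corrected forecast."""
    # Model 1 feeds both model 2 and model 3.
    first_result = first_model(traffic, features)
    # Model 2 consumes model 1's initial sequence plus the feature matrix.
    difference_sequence = second_model(first_result["initial"], features)
    # Model 3 corrects the initial sequence using model 2's output.
    return third_model(first_result, difference_sequence)


# Hypothetical stand-ins used only to exercise the wiring.
def naive_first(traffic, features):
    # Repeat the last observed value over a 3-step horizon.
    return {"initial": [traffic[-1]] * 3}


def toy_second(initial, features):
    # Pretend the population-flow effect grows by 0.5 per step.
    return [0.5 * step for step in range(len(initial))]


def additive_third(first_result, difference_sequence):
    # Assumed correction rule: add the difference to the initial forecast.
    return [y + d for y, d in zip(first_result["initial"], difference_sequence)]


forecast = cascade_predict(
    naive_first, toy_second, additive_third, [1.0, 2.0, 4.0], features=[]
)
```

Keeping the models behind a common callable interface makes it straightforward to swap any stage for a trained model without changing the cascade itself.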

In this embodiment, the network traffic of the region to be predicted is predicted by multiple cascaded models, so that the prediction result takes into account both the feature matrix of the region to be predicted and the traffic influence of the other regions within the target range, which improves the accuracy of the prediction result.

Further, referring to FIG. 2, FIG. 2 is a second flowchart of the network traffic prediction method provided by an embodiment of the present invention. Based on the embodiment shown in FIG. 1, step 103 above (inputting the historical traffic data and the feature matrix into the preset model to determine the traffic prediction result for the region to be predicted) includes:

Step 201: Input the first traffic data and the feature matrix into the first preset model to obtain a first prediction result, where the first prediction result includes an initial traffic prediction sequence.

在一实施例中，上述预设模型包括第一预设模型、第二预设模型和第三预设模型，第一预设模型和第二预设模型为不同的机器学习模型，第三预设模型为修正模型，第三预设模型用于根据第二预测结果对第一预测结果中的初始流量预测序列进行修正。在该预设模型中，第一预设模型分别与第二预设模型和第三预设模型级联，第二预设模型与第三预设模型级联。In one embodiment, the preset model includes a first preset model, a second preset model and a third preset model, the first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used to correct the initial traffic prediction sequence in the first prediction result according to the second prediction result. In the preset model, the first preset model is cascaded with the second preset model and the third preset model respectively, and the second preset model is cascaded with the third preset model.

Specifically, the first preset model is obtained by pre-training on first historical traffic data of the area to be predicted. This first historical traffic data may be the same as, or different from, the first traffic data described above. To make the trained first preset model more accurate, historical traffic data covering a longer period, such as the past 5 years, can be selected as training data. From these 5 years of historical traffic data, the initial traffic prediction sequence, holiday sequence, trend sequence and seasonal sequence of the area to be predicted within the preset prediction period (for example, the next year) can be predicted. The first preset model includes but is not limited to the Prophet model. The specific training process of the first preset model is prior art and is not repeated here.

在通过第一预设模型进行预测时，可以将第一流量数据和特征矩阵输入至第一预设模型，由此输出得到第一预测结果，该第一预设结果包括但不限于初始流量预测序列、节假日序列、趋势序列和季节序列。When making predictions through the first preset model, the first traffic data and the feature matrix can be input into the first preset model, thereby outputting a first prediction result, which includes but is not limited to an initial traffic prediction sequence, a holiday sequence, a trend sequence, and a seasonal sequence.

步骤202、将初始流量预测序列与特征矩阵输入至第二预设模型，得到第二预测结果，第二预测结果包括流量差异性特征序列，流量差异性特征序列用于指示目标范围内各区域的人口流动对待预测区域的流量的影响量。Step 202: Input the initial traffic prediction sequence and the feature matrix into a second preset model to obtain a second prediction result, wherein the second prediction result includes a traffic difference feature sequence, and the traffic difference feature sequence is used to indicate the impact of population flow in each area within the target range on the traffic in the predicted area.

具体地，上述第二预设模型包括一个或多个与人口流动类型对应的神经网络模型，该神经网络模型包括但不限于长短期记忆网络(Long Short-Term Memory，简称LSTM)模型。Specifically, the above-mentioned second preset model includes one or more neural network models corresponding to the population flow type, and the neural network model includes but is not limited to a long short-term memory network (Long Short-Term Memory, referred to as LSTM) model.

When predicting with the second preset model, the initial traffic prediction sequence and the feature matrix can be input into the corresponding neural network model to obtain the second prediction result. The second prediction result includes one or more traffic difference feature sequences, each corresponding to a population flow type. A traffic difference feature sequence indicates the impact of the population flow of each area within the target range on the traffic of the area to be predicted.

Step 203: Input the first prediction result and the traffic difference feature sequence into the third preset model to obtain the traffic prediction result of the area to be predicted.

Specifically, the third preset model includes but is not limited to an eXtreme Gradient Boosting (XGBoost) model, which is not specifically limited in this application. When training the third preset model, the outputs of the first preset model and the second preset model are used as the input data of the third preset model. The specific training process is prior art and is not repeated here.

在通过第三预设模型进行预测时，可以将第一预测结果和流量差异性特征序列输入至第三预设模型，通过第三预设模型得到待预测区域的流量预测结果。该待预测区域的流量预测结果为待预测区域在预设预测时长内各预测时间点对应的流量值。When the third preset model is used for prediction, the first prediction result and the flow difference characteristic sequence can be input into the third preset model, and the flow prediction result of the area to be predicted can be obtained by the third preset model. The flow prediction result of the area to be predicted is the flow value corresponding to each prediction time point in the area to be predicted within the preset prediction time.

In this embodiment, the initial traffic prediction sequence of the area to be predicted is obtained through the first preset model. The second preset model then takes the initial traffic prediction sequence as input and predicts the traffic difference feature sequence corresponding to each population flow type. Finally, the third preset model uses the traffic difference feature sequences to correct the initial traffic prediction sequence, yielding the traffic prediction result of the area to be predicted. The prediction result thus takes into account multiple features of the area to be predicted and the influence of other areas within the target range, which improves its accuracy.
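The data flow of Steps 201 to 203 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function `run_cascade`, the dictionary key `"initial"`, and the three toy stand-in models are hypothetical names introduced here, and the additive correction in `toy_third` is an assumption made only so that the example runs. In the embodiments the three stages would be trained models such as Prophet, LSTM and XGBoost.

```python
from typing import Callable, Dict, List

Series = List[float]
Matrix = List[List[float]]


def run_cascade(
    first_model: Callable[[Series, Matrix], Dict[str, Series]],
    sub_models: Dict[str, Callable[[Series, Matrix], Series]],
    third_model: Callable[[Dict[str, Series], Dict[str, Series]], Series],
    first_traffic: Series,
    feature_matrix: Matrix,
) -> Series:
    """Chain the three preset models in the order of Steps 201 to 203."""
    # Step 201: the first preset model produces the first prediction result.
    first_result = first_model(first_traffic, feature_matrix)
    initial = first_result["initial"]
    # Step 202: each sub-model maps the initial sequence to one
    # traffic difference feature sequence.
    diffs = {
        name: model(initial, feature_matrix)
        for name, model in sub_models.items()
    }
    # Step 203: the third preset model corrects the initial sequence.
    return third_model(first_result, diffs)


# Toy stand-ins (hypothetical); real models would be trained beforehand.
def toy_first(traffic: Series, features: Matrix) -> Dict[str, Series]:
    mean = sum(traffic) / len(traffic)
    return {"initial": [mean, mean, mean]}


def toy_inflow(initial: Series, features: Matrix) -> Series:
    return [0.5 for _ in initial]


def toy_third(first_result: Dict[str, Series], diffs: Dict[str, Series]) -> Series:
    # Additive correction is an assumption used only for this toy example.
    return [
        y + sum(d[i] for d in diffs.values())
        for i, y in enumerate(first_result["initial"])
    ]


forecast = run_cascade(
    toy_first,
    {"inflow": toy_inflow},
    toy_third,
    first_traffic=[10.0, 12.0, 14.0],
    feature_matrix=[[1.0], [1.0], [1.0]],
)
print(forecast)
```

The first preset model's full output is passed to the third stage, while only its initial sequence is passed to the second stage, which mirrors the cascade relationship described above.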

进一步地，第一预测结果还包括：待预测区域的节假日序列、趋势序列和季节序列中的至少一项，其中，节假日序列用于指示待预测区域在预设预测时长内的节假日特征，趋势序列用于指示待预测区域预测时长内的趋势特征，季节序列用于指示待预测区域在预设预测时长内的季节特征。Furthermore, the first prediction result also includes: at least one of: a holiday sequence, a trend sequence and a season sequence of the area to be predicted, wherein the holiday sequence is used to indicate the holiday characteristics of the area to be predicted within a preset prediction period, the trend sequence is used to indicate the trend characteristics of the area to be predicted within the prediction period, and the season sequence is used to indicate the seasonal characteristics of the area to be predicted within the preset prediction period.

In one embodiment, the pre-trained Prophet model yields the initial traffic prediction sequence {Y'}_L of the area to be predicted, together with at least one of the holiday sequence {H}_L, the trend sequence {T}_L and the seasonal sequence {S}_L. The holiday sequence {H}_L indicates the holiday characteristics of the area to be predicted within the preset prediction period, the trend sequence {T}_L indicates its trend characteristics within the preset prediction period, and the seasonal sequence {S}_L indicates its seasonal characteristics within the preset prediction period. {Y'}_L, {H}_L, {T}_L and {S}_L are each one-dimensional time series over L time points, where the L time points are the time points within the preset prediction period. For example, if the traffic of the area to be predicted is to be predicted for the next week at hourly granularity, then L = 7 × 24 = 168. In this way, when the third preset model corrects the initial traffic prediction sequence, the influence of time, holidays and activities on the traffic value of the area to be predicted can be taken into account, making the traffic prediction result more accurate.
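The value of L in the example above follows directly from the prediction period and the time granularity. A minimal sketch, assuming the granularity divides the period evenly; the helper name `horizon_points` is hypothetical:

```python
from datetime import timedelta


def horizon_points(duration: timedelta, granularity: timedelta) -> int:
    """Number of prediction time points L for a given period and granularity."""
    if granularity <= timedelta(0):
        raise ValueError("granularity must be positive")
    if duration % granularity != timedelta(0):
        raise ValueError("granularity must divide the prediction period evenly")
    # Dividing one timedelta by another with // yields an int.
    return duration // granularity


# One week at hourly granularity, as in the example above.
L = horizon_points(timedelta(days=7), timedelta(hours=1))
print(L)
```

With a 15-minute granularity, the same one-week period would give 672 time points.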

进一步地，第二预设模型包括N个子模型，第二预测结果包括N个第二预测子结果，N为正整数；Further, the second preset model includes N sub-models, the second prediction result includes N second prediction sub-results, and N is a positive integer;

上述步骤202、将初始流量预测序列与特征矩阵输入至第二预设模型，得到第二预测结果，包括：The above step 202, inputting the initial traffic prediction sequence and the feature matrix into the second preset model to obtain the second prediction result, includes:

将初始流量预测序列与特征矩阵输入至目标子模型，得到第二预测子结果；Input the initial traffic prediction sequence and feature matrix into the target sub-model to obtain the second prediction sub-result;

Here, the target sub-model is any one of the N sub-models.

In one embodiment, the second preset model may include one, two, three or more sub-models, and each sub-model corresponds to one second prediction sub-result. When the initial traffic prediction sequence and the feature matrix are input into a target sub-model, that sub-model produces one second prediction sub-result. Doing this for each sub-model yields N second prediction sub-results. Here, the target sub-model is any one of the N sub-models. For example, suppose the second preset model contains three sub-models, which predict the traffic difference feature sequences corresponding to the population inflow type, the population outflow type and the population stable type respectively. Then inputting the initial traffic prediction sequence and the feature matrix of the area to be predicted into these three sub-models yields the traffic difference feature sequences corresponding to the inflow, outflow and stable types.

在本实施例中，通过设置不同的子模型分别对不同人口流动类型的流量差异性特征序列进行预测，由此得到待预测区域在不同人口流动类型的区域的影响下网络流量的变化情况，使得待预测区域的预测流量值更加准确。In this embodiment, different sub-models are set to predict the traffic difference characteristic sequences of different population flow types, thereby obtaining the change of network traffic in the predicted area under the influence of areas with different population flow types, so that the predicted traffic value of the predicted area is more accurate.

进一步地，在将初始流量预测序列与特征矩阵输入至第二预设模型，得到第二预测结果之前，包括：Furthermore, before the initial traffic prediction sequence and the feature matrix are input into the second preset model to obtain the second prediction result, the method includes:

Training on the historical traffic data and the feature matrix to obtain the N sub-models;

其中，N的取值与目标范围内各区域所包括的人口流动类型的数量一致，人口流动类型包括流入型、流出型和稳定型中的至少一种。Among them, the value of N is consistent with the number of population flow types included in each area within the target range, and the population flow types include at least one of inflow, outflow and stable types.

In one embodiment, the population flow types may include at least one of an inflow type, an outflow type and a stable type, which is not specifically limited in this application. For example, when the areas within the target range cover all three types (inflow, outflow and stable), the second preset model needs to include three sub-models. Before these three sub-models are used for prediction, they must be trained on the historical traffic data and the feature matrix. Specifically, the historical traffic data of the areas within the target range is classified according to the three population flow types. For each type, the traffic mean of the areas of that type is calculated at each collection time point, giving three traffic mean sequences. Each of the three traffic mean sequences is then subtracted from the first traffic data of the area to be predicted, giving three difference feature sequences. Finally, the historical traffic data and the feature matrix of the area to be predicted are used as the input of the three sub-models, and the three difference feature sequences are used as the respective outputs of the three sub-models, so that three LSTM models are obtained by training.

当然，当目标范围内的各区域包括两种人口流动类型或者一种人口流动类型时，可以采用上述方式训练得到第二预设模型中的两个子模型或者一个子模型。Of course, when each area within the target range includes two types of population mobility or one type of population mobility, the above method can be used to train to obtain two sub-models or one sub-model in the second preset model.
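The rule that N equals the number of population flow types present within the target range can be illustrated with a short sketch. The region names, the type labels `"inflow"`, `"outflow"` and `"stable"`, and the helper `flow_types_present` are hypothetical names chosen for this example.

```python
from typing import Dict, List

KNOWN_TYPES = ["inflow", "outflow", "stable"]


def flow_types_present(region_types: Dict[str, str]) -> List[str]:
    """Return the distinct population flow types among the regions, in a fixed order."""
    present = set(region_types.values())
    unknown = present - set(KNOWN_TYPES)
    if unknown:
        raise ValueError(f"unknown population flow types: {sorted(unknown)}")
    return [t for t in KNOWN_TYPES if t in present]


# Hypothetical target range in which no region is of the outflow type.
region_types = {
    "region_a": "inflow",
    "region_b": "stable",
    "region_c": "inflow",
}

types = flow_types_present(region_types)
n_sub_models = len(types)  # one sub-model per type that actually occurs
print(types, n_sub_models)
```

In this example only two sub-models would be trained, because only two population flow types occur among the regions.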

In this embodiment, multiple sub-models can be set according to the population flow types, and the sub-models of the second preset model can be trained so as to predict the traffic impact that the areas of each population flow type have on the area to be predicted.

Further, referring to FIG. 3, FIG. 3 is a third flowchart of the network traffic prediction method provided by an embodiment of the present invention. The above step of training on the historical traffic data and the feature matrix to obtain the N sub-models includes:

步骤301、根据目标范围内各区域的人口流动类型，对各区域的历史流量数据进行分类。Step 301: Classify the historical traffic data of each area according to the population flow type of each area within the target range.

In one embodiment, the population flow types of the areas within the target range may include three types: inflow, outflow and stable. The historical traffic data of the areas can be classified by these three types and then merged, giving one historical time series per population flow type. Assuming the historical traffic data of each area is a time series of four-dimensional data over w collection time points, this yields the inflow-type time series {W2}_w×4, the outflow-type time series {W3}_w×4 and the stable-type time series {W4}_w×4.

步骤302、分别计算N个人口流动类型对应的N个均值序列，均值序列用于指示目标范围内人口流动类型相同的多个区域在各个采集时间点对应的流量均值。Step 302: calculate N mean sequences corresponding to N population flow types respectively, where the mean sequences are used to indicate the flow mean values corresponding to multiple areas with the same population flow type within the target range at each collection time point.

The traffic values at each collection time point are averaged separately within the inflow-type time series {W2}_w×4, the outflow-type time series {W3}_w×4 and the stable-type time series {W4}_w×4, giving the traffic mean at each collection time point. This yields the mean sequence {C1}_w×2 of the inflow-type time series {W2}_w×4, the mean sequence {C2}_w×2 of the outflow-type time series {W3}_w×4, and the mean sequence {C3}_w×2 of the stable-type time series {W4}_w×4. Each mean sequence is a two-dimensional time series consisting of the w collection times and the traffic means corresponding to those w collection times.
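Steps 301 and 302 can be sketched as a group-then-average operation. This is a simplified illustration under stated assumptions: only the traffic-value column of each area's data is used, all areas share the same w collection time points (so the time column is left implicit), and the names `mean_sequences`, `traffic_by_region` and `type_by_region` are hypothetical.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, List


def mean_sequences(
    traffic_by_region: Dict[str, List[float]],
    type_by_region: Dict[str, str],
) -> Dict[str, List[float]]:
    """Average traffic per collection time point over regions of the same type."""
    # Step 301: classify each region's series by its population flow type.
    grouped: Dict[str, List[List[float]]] = defaultdict(list)
    for region, series in traffic_by_region.items():
        grouped[type_by_region[region]].append(series)
    # Step 302: zip(*...) lines up the values of one time point across regions.
    return {
        flow_type: [fmean(values) for values in zip(*series_list)]
        for flow_type, series_list in grouped.items()
    }


traffic_by_region = {
    "region_a": [10.0, 20.0, 30.0],
    "region_b": [30.0, 40.0, 50.0],
    "region_c": [5.0, 5.0, 5.0],
}
type_by_region = {
    "region_a": "inflow",
    "region_b": "inflow",
    "region_c": "outflow",
}

means = mean_sequences(traffic_by_region, type_by_region)
print(means)
```

Pairing each mean with its collection time would give the w×2 layout of the mean sequences described above.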

步骤303、基于第一流量数据和N个均值序列，确定N个人口流动类型对应的N个差异性特征序列，差异性特征序列包括第一流量数据中各采集时间点对应的流量值与均值序列的相应时间点的流量均值之间的差值。Step 303: Based on the first flow data and N mean sequences, determine N difference feature sequences corresponding to N population flow types, wherein the difference feature sequences include the difference between the flow value corresponding to each collection time point in the first flow data and the flow mean at the corresponding time point of the mean sequence.

The traffic mean at each collection time point in the mean sequences {C1}_w×2, {C2}_w×2 and {C3}_w×2 is subtracted from the traffic value at the corresponding collection time point in the first traffic data {W1}_w×4. This yields the difference feature sequence {D1}_w×2 corresponding to the inflow type, the difference feature sequence {D2}_w×2 corresponding to the outflow type, and the difference feature sequence {D3}_w×2 corresponding to the stable type.
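The subtraction in Step 303 is element-wise over the collection time points. A minimal sketch with made-up numbers, again using only the traffic-value column; `difference_sequence` is a hypothetical helper name:

```python
from typing import List


def difference_sequence(
    target_traffic: List[float],
    mean_sequence: List[float],
) -> List[float]:
    """Traffic of the area to be predicted minus the type mean, per time point."""
    if len(target_traffic) != len(mean_sequence):
        raise ValueError("both sequences must cover the same collection time points")
    return [x - m for x, m in zip(target_traffic, mean_sequence)]


# Traffic values of the area to be predicted at three collection time points.
target = [25.0, 28.0, 45.0]
# Mean sequence of the inflow-type areas at the same time points.
inflow_mean = [20.0, 30.0, 40.0]

d1 = difference_sequence(target, inflow_mean)
print(d1)
```

A positive entry means the area to be predicted carried more traffic than the average inflow-type area at that time point, and a negative entry means less.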

步骤304、将第一流量数据和特征矩阵作为N个子模型中各子模型的输入，将N个差异性特征序列分别作为N个子模型中各子模型的输出，训练学习得到N个子模型。Step 304: Use the first traffic data and the feature matrix as inputs of each of the N sub-models, use the N differential feature sequences as outputs of each of the N sub-models, and obtain N sub-models through training and learning.

由于目标范围内各区域的人口流动类型包括流入型、流出型和稳定型3种，因而需要训练得到这3种人口流动类型对应的神经网络模型。在训练这3个神经网络模型时，可以将第一流量数据{W1}_w×4和特征矩阵{F}_w×k作为这3个神经网络模型的输入，将流入型对应的差异性特征序列{D1}_w×2、流出型对应的差异性特征序列{D2}_w×2和稳定型对应的差异性特征序列{D3}_w×2分别作为这3个神经网络模型的输出，对这3个神经网络模型中的参数进行训练学习，这样就可以得到3个神经网络模型。Since the population flow types of each region within the target range include inflow type, outflow type and stable type, it is necessary to train the neural network models corresponding to the three population flow types. When training the three neural network models, the first flow data {W1} _w×4 and the feature matrix {F} _w×k can be used as the input of the three neural network models, and the difference feature sequence {D1} _w×2 corresponding to the inflow type, the difference feature sequence {D2} _w×2 corresponding to the outflow type and the difference feature sequence {D3} _w×2 corresponding to the stable type can be used as the output of the three neural network models respectively. The parameters in the three neural network models are trained and learned, so that the three neural network models can be obtained.

由此，在训练得到3个神经网络模型后，可以将初始流量预测序列{Y’}_L和特征矩阵{F}_w×k输入至这3个神经网络模型，预测得到3个流量差异性特征序列，即流入型对应的流量差异性特征序列{D1′}_L、流出型对应的流量差异性特征序列{D2′}_L和稳定型对应的流量差异性特征序列{D3′}_L。在本实施例中，可以根据目标范围内各区域的人口流动类型，确定各人口流动类型对应的流量差异性特征序列，由此方便后续通过不同人口流动类型对应的流量差异性特征序列对待预测区域的初始流量预测序列进行修正，使得待预测区域的流量预测结果更加准确。Thus, after training to obtain three neural network models, the initial flow prediction sequence {Y'} _L and the feature matrix {F} _w×k can be input into the three neural network models to predict three flow difference feature sequences, namely, the flow difference feature sequence {D1′} _L corresponding to the inflow type, the flow difference feature sequence {D2′} _L corresponding to the outflow type, and the flow difference feature sequence {D3′} _L corresponding to the stable type. In this embodiment, the flow difference feature sequence corresponding to each population flow type can be determined according to the population flow type of each area within the target range, thereby facilitating the subsequent correction of the initial flow prediction sequence of the area to be predicted by the flow difference feature sequence corresponding to different population flow types, so that the flow prediction result of the area to be predicted is more accurate.

进一步地，上述步骤102、基于第一流量数据，确定待预测区域对应的特征矩阵，包括：Furthermore, the above step 102, based on the first traffic data, determines the feature matrix corresponding to the area to be predicted, including:

对第一流量数据进行特征标记，获得多个特征序列；Performing feature marking on the first flow data to obtain multiple feature sequences;

基于多个特征序列，确定待预测区域对应的特征矩阵；Based on multiple feature sequences, determine the feature matrix corresponding to the area to be predicted;

其中，第一流量数据为包括多个采集时间点以及多个采集时间点对应的流量值、节假日信息和活动信息的四维的时间序列，多个特征序列包括时间特征序列、节假日特征序列、活动特征序列和区域特征序列。Among them, the first traffic data is a four-dimensional time series including multiple collection time points and the traffic values corresponding to the multiple collection time points, holiday information and activity information, and the multiple feature sequences include time feature sequences, holiday feature sequences, activity feature sequences and regional feature sequences.

In one embodiment, the first traffic data includes the collection times, the traffic value at each collection time, the holiday information at each collection time and the activity information at each collection time. It is denoted {W1}_w×4, where W1 is the four-dimensional data matrix over the w collection time points of the area to be predicted, and the value of w is determined by the preset historical duration and the collection time granularity. Feature marking can then be performed on the first traffic data through big data analysis. The features include time features, holiday features, activity features and regional features, giving a time feature sequence, a holiday feature sequence, an activity feature sequence and a regional feature sequence. These sequences together form the feature matrix {F}_w×k, where F is the k-dimensional data matrix over the w collection time points and k is any positive integer.
Specifically, the time feature indicates the time information of each collection time of the first traffic data, such as the season (spring, summer, autumn or winter), the time period (morning, noon, afternoon or evening) and whether it falls within commuting rush hours. The holiday feature indicates the holiday information of each collection time, such as whether it is a holiday, the holiday name and the holiday duration. The activity feature indicates the activity information of each collection time, such as whether it is an activity day, the activity name, the activity duration and the area covered by the activity. The regional feature indicates information about the area to be predicted, such as its population flow type (inflow, outflow or stable), its city type (tourist, agricultural or industrial city), its gross domestic product (GDP) for the month, the number of permanent residents in the area, the urban population, the number of colleges and universities, and the number of primary and secondary schools.
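Feature marking of the collection times can be sketched as follows. The encoding is an assumption made for illustration only: the integer codes for season and time period, the rush-hour window (07:00 to 09:00 and 17:00 to 19:00), and the function names are all hypothetical, and regional features are omitted because they are constant for a given area.

```python
from datetime import date, datetime
from typing import Iterable, List, Set


def period_of(hour: int) -> int:
    """0 = morning, 1 = noon, 2 = afternoon, 3 = evening/night (assumed bands)."""
    if 6 <= hour < 11:
        return 0
    if 11 <= hour < 13:
        return 1
    if 13 <= hour < 18:
        return 2
    return 3


def feature_row(ts: datetime, holidays: Set[date], activity_days: Set[date]) -> List[int]:
    # 0 = winter (Dec-Feb), 1 = spring, 2 = summer, 3 = autumn.
    season = (ts.month % 12) // 3
    rush = int(ts.hour in (7, 8, 17, 18))
    return [
        season,
        period_of(ts.hour),
        rush,
        int(ts.date() in holidays),
        int(ts.date() in activity_days),
    ]


def feature_matrix(
    times: Iterable[datetime],
    holidays: Set[date],
    activity_days: Set[date],
) -> List[List[int]]:
    """Build a w x k matrix (here k = 5) with one row per collection time."""
    return [feature_row(ts, holidays, activity_days) for ts in times]


F = feature_matrix(
    [datetime(2021, 1, 1, 8), datetime(2021, 7, 15, 12)],
    holidays={date(2021, 1, 1)},
    activity_days={date(2021, 7, 15)},
)
print(F)
```

Each row corresponds to one collection time point, so the number of rows equals w and the number of columns equals k.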

It should be noted that the training data for the preset model can be obtained in the same manner as described above. Referring to FIG. 4, FIG. 4 is a schematic diagram of training data provided by an embodiment of the present invention. As shown in FIG. 4, historical traffic data is first collected. Here the historical traffic data is the historical traffic data of each area within the target range, and the number of collection time points and the collection time granularity can be set according to actual needs. The historical traffic data is time series data with multiple dimensions, including the collection times, the traffic value at each collection time, the holiday information at each collection time and the activity information at each collection time. After the first traffic data of the area to be predicted is collected, feature sequences are constructed, which may include the time feature sequence, holiday feature sequence, activity feature sequence and regional feature sequence of the area to be predicted. Once the feature sequences are constructed, the historical traffic data of each area, together with the time, holiday, activity and regional feature sequences, is used as training data for training the preset model.

在本实施例中，通过对待预测区域的第一流量数据进行特征标记，构建待预测区域的特征矩阵，由此在通过预设模型进行训练和预测时，可以综合考虑到时间特征、节假日特征、活动特征和区域特征等因素对待预测区域的流量值造成的影响，从而可以提高预测结果的准确性。In this embodiment, the feature matrix of the area to be predicted is constructed by feature marking the first flow data of the area to be predicted. Therefore, when training and prediction are performed through a preset model, the influence of factors such as time characteristics, holiday characteristics, activity characteristics and regional characteristics on the flow value of the area to be predicted can be comprehensively considered, thereby improving the accuracy of the prediction results.

In an application example, referring to FIG. 5, FIG. 5 is a schematic structural diagram of the preset model provided by an embodiment of the present invention. In FIG. 5, the preset model includes a first preset model, a second preset model and a third preset model. The first preset model is cascaded with both the second preset model and the third preset model, and the second preset model is cascaded with the third preset model. The second preset model includes a first sub-model, a second sub-model and a third sub-model, which correspond to the inflow, outflow and stable population flow types respectively. The first traffic data {W1}_w×4 and the feature matrix {F}_w×k of the area to be predicted are used as the input of the first preset model, which outputs the initial traffic prediction sequence {Y'}_L, the holiday sequence {H}_L, the trend sequence {T}_L and the seasonal sequence {S}_L.
The initial traffic prediction sequence {Y'}_L and the feature matrix {F}_w×k are input into the first, second and third sub-models, which predict three traffic difference feature sequences: {D1′}_L for the inflow type, {D2′}_L for the outflow type and {D3′}_L for the stable type. Finally, the initial traffic prediction sequence {Y'}_L, the holiday sequence {H}_L, the trend sequence {T}_L, the seasonal sequence {S}_L and the traffic difference feature sequences {D1′}_L, {D2′}_L and {D3′}_L are used as the input of the third preset model, which outputs the traffic prediction result {Y}_L of the area to be predicted.
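The seven length-L sequences fed to the third preset model can be arranged as one row of seven features per prediction time point. The sketch below shows only this assembly step under assumed names (`third_model_inputs` and the dictionary keys are hypothetical). It does not train or call a gradient boosting model, and the numbers are made up.

```python
from typing import Dict, List

FIRST_KEYS = ("initial", "holiday", "trend", "season")  # {Y'}, {H}, {T}, {S}
DIFF_KEYS = ("inflow", "outflow", "stable")             # {D1'}, {D2'}, {D3'}


def third_model_inputs(
    first_result: Dict[str, List[float]],
    diff_sequences: Dict[str, List[float]],
) -> List[List[float]]:
    """Stack the seven length-L sequences column-wise into an L x 7 matrix."""
    columns = [first_result[k] for k in FIRST_KEYS]
    columns += [diff_sequences[k] for k in DIFF_KEYS]
    if len({len(c) for c in columns}) != 1:
        raise ValueError("all sequences must have the same length L")
    # zip(*columns) turns seven columns into L rows of seven features each.
    return [list(row) for row in zip(*columns)]


first_result = {
    "initial": [100.0, 110.0],
    "holiday": [0.0, 5.0],
    "trend": [90.0, 91.0],
    "season": [10.0, 14.0],
}
diff_sequences = {
    "inflow": [2.0, 3.0],
    "outflow": [-1.0, -2.0],
    "stable": [0.5, 0.0],
}

rows = third_model_inputs(first_result, diff_sequences)
print(rows)
```

A regression model trained on rows of this shape, with the observed traffic values as targets, would then output one corrected traffic value per prediction time point.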

In this embodiment, data such as the initial traffic prediction sequence, holiday sequence, trend sequence and seasonal sequence of the area to be predicted can be obtained through the first preset model, and the traffic difference feature sequence corresponding to each population flow type can be obtained through the second preset model. Finally, the third preset model corrects the initial traffic prediction sequence. The prediction result thus takes into account multiple features of the area to be predicted and the impact on traffic of population flow in other areas within the target range, which improves its accuracy.

An embodiment of the present invention also provides a network traffic prediction device. Referring to FIG. 6, FIG. 6 is a first schematic structural diagram of the network traffic prediction device provided by an embodiment of the present invention. As shown in FIG. 6, the network traffic prediction device 600 includes:

获取模块601，用于获取目标范围内各区域的历史流量数据，历史流量数据包括待预测区域对应的第一流量数据；An acquisition module 601 is used to acquire historical traffic data of each area within a target range, where the historical traffic data includes first traffic data corresponding to the area to be predicted;

第一确定模块602，用于基于第一流量数据，确定待预测区域对应的特征矩阵；A first determination module 602 is used to determine a feature matrix corresponding to the area to be predicted based on the first traffic data;

第二确定模块603，用于将历史流量数据和特征矩阵输入至预设模型，确定待预测区域的流量预测结果；The second determination module 603 is used to input the historical traffic data and the feature matrix into a preset model to determine the traffic prediction result of the area to be predicted;

其中，预设模型包括第一预设模型、第二预设模型和第三预设模型，第一预设模型分别与第二预设模型和第三预设模型级联，第二预设模型与第三预设模型级联。The preset models include a first preset model, a second preset model and a third preset model. The first preset model is cascaded with the second preset model and the third preset model respectively, and the second preset model is cascaded with the third preset model.

Optionally, the second determination module 603 includes:

A first input submodule, configured to input the first traffic data and the feature matrix into the first preset model to obtain a first prediction result, where the first prediction result includes an initial traffic prediction sequence;

A second input submodule, configured to input the initial traffic prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, where the second prediction result includes a traffic difference feature sequence;

A third input submodule, configured to input the first prediction result and the traffic difference feature sequence into the third preset model to obtain the traffic prediction result of the area to be predicted, where the traffic difference feature sequence indicates how much the population flow of each area within the target range affects the traffic of the area to be predicted;

wherein the first preset model and the second preset model are different machine learning models, and the third preset model is a correction model that corrects the initial traffic prediction sequence according to the traffic difference feature sequence.
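The cascade of the three submodules can be sketched as a chain of function calls. This is only an illustration of the data flow: the function `run_cascade`, the three stand-in models and the additive correction rule are assumptions made so the sketch runs, and the text does not specify any of them. For brevity the sketch passes only the initial traffic prediction sequence to the third stage, although the first prediction result may carry further sequences.

```python
from typing import Callable, List, Sequence

Series = List[float]
FeatureMatrix = List[List[float]]


def run_cascade(
    first_traffic: Sequence[float],
    features: FeatureMatrix,
    first_model: Callable[[Sequence[float], FeatureMatrix], Series],
    second_model: Callable[[Series, FeatureMatrix], Series],
    third_model: Callable[[Series, Series], Series],
) -> Series:
    """Chain the three preset models in the order the text describes."""
    # Stage 1: first traffic data + feature matrix -> initial prediction sequence.
    initial = first_model(first_traffic, features)
    # Stage 2: initial prediction + feature matrix -> traffic difference feature sequence.
    difference = second_model(initial, features)
    # Stage 3: the correction model adjusts the initial prediction.
    return third_model(initial, difference)


# Hypothetical stand-ins, chosen only so the sketch is runnable.
def naive_first(traffic: Sequence[float], features: FeatureMatrix) -> Series:
    # Repeat the last observed value over a 3-step horizon.
    return [traffic[-1]] * 3


def constant_second(initial: Series, features: FeatureMatrix) -> Series:
    # Pretend population flow adds 2 units of traffic at every step.
    return [2.0 for _ in initial]


def additive_third(initial: Series, difference: Series) -> Series:
    # Assumed correction rule: element-wise addition.
    return [p + d for p, d in zip(initial, difference)]


forecast = run_cascade(
    [10.0, 12.0, 11.0],
    [[0.0], [1.0], [0.0]],
    naive_first,
    constant_second,
    additive_third,
)
print(forecast)  # [13.0, 13.0, 13.0]
```

Passing the models as callables keeps the sketch independent of any particular machine learning library; real models would replace the three stand-ins.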

Optionally, the first prediction result further includes:

At least one of a holiday sequence, a trend sequence and a seasonal sequence of the area to be predicted, where the holiday sequence indicates the holiday characteristics of the area to be predicted within a preset prediction period, the trend sequence indicates its trend characteristics within the prediction period, and the seasonal sequence indicates its seasonal characteristics within the preset prediction period.

Optionally, the second preset model includes N sub-models, the second prediction result includes N second prediction sub-results, and N is a positive integer; the second input submodule includes:

An input unit, configured to input the initial traffic prediction sequence and the feature matrix into a target sub-model to obtain a second prediction sub-result;

wherein the target sub-model is any one of the N sub-models.
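The fan-out to the N sub-models can be pictured as applying the same input to each sub-model and collecting one sub-result per sub-model. The sketch below keys the sub-models by population flow type (inflow, outflow, stable, as named in claim 4); the function name `predict_sub_results` and the constant stand-in sub-models are hypothetical.

```python
from typing import Callable, Dict, List

Series = List[float]
FeatureMatrix = List[List[float]]
SubModel = Callable[[Series, FeatureMatrix], Series]


def predict_sub_results(
    initial: Series,
    features: FeatureMatrix,
    sub_models: Dict[str, SubModel],
) -> Dict[str, Series]:
    """Feed the same input to every sub-model and collect N sub-results."""
    return {name: model(initial, features) for name, model in sub_models.items()}


# Hypothetical sub-models: each returns a constant difference sequence.
sub_models: Dict[str, SubModel] = {
    "inflow": lambda initial, features: [1.0 for _ in initial],
    "outflow": lambda initial, features: [-1.0 for _ in initial],
    "stable": lambda initial, features: [0.0 for _ in initial],
}

sub_results = predict_sub_results([5.0, 6.0], [[0.0], [1.0]], sub_models)
```

Each entry of `sub_results` corresponds to one second prediction sub-result, so the dictionary has exactly N entries.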

Optionally, the second determination module 603 further includes:

A training submodule, configured to train on the historical traffic data and the feature matrix to obtain the N sub-models.

Optionally, the training submodule is specifically configured to:

Classify the historical traffic data of each area according to the population flow type of that area within the target range;

Calculate N mean sequences corresponding to the N population flow types, where each mean sequence gives, for every collection time point, the mean traffic of the areas within the target range that share the same population flow type;

Determine, based on the first traffic data and the N mean sequences, N difference feature sequences corresponding to the N population flow types, where each difference feature sequence consists of the differences between the traffic value at each collection time point in the first traffic data and the mean traffic at the corresponding time point of the mean sequence;

Train the N sub-models, using the first traffic data and the feature matrix as the input of each sub-model and the N difference feature sequences as the respective outputs of the N sub-models.
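The construction of the training targets described above (classify by population flow type, average per type, subtract from the first traffic data) can be sketched in plain Python. The function names, the region identifiers and the sample numbers are illustrative assumptions; the final training step is omitted because the text does not name a learning algorithm.

```python
from collections import defaultdict
from typing import Dict, List

Series = List[float]


def mean_sequences(history: Dict[str, Series], flow_type: Dict[str, str]) -> Dict[str, Series]:
    """Mean traffic per population flow type at each collection time point."""
    grouped: Dict[str, List[Series]] = defaultdict(list)
    for region, series in history.items():
        grouped[flow_type[region]].append(series)
    # zip(*group) walks the grouped series time point by time point.
    return {
        kind: [sum(values) / len(values) for values in zip(*group)]
        for kind, group in grouped.items()
    }


def difference_sequences(first_traffic: Series, means: Dict[str, Series]) -> Dict[str, Series]:
    """Traffic of the area to be predicted minus each type's mean, per time point."""
    return {
        kind: [value - mean for value, mean in zip(first_traffic, mean_series)]
        for kind, mean_series in means.items()
    }


# Hypothetical data: three regions, two collection time points each.
history = {"A": [10.0, 20.0], "B": [30.0, 40.0], "C": [5.0, 7.0]}
flow_type = {"A": "inflow", "B": "inflow", "C": "outflow"}

means = mean_sequences(history, flow_type)
# Region "A" plays the role of the area to be predicted.
targets = difference_sequences(history["A"], means)
print(means)    # {'inflow': [20.0, 30.0], 'outflow': [5.0, 7.0]}
print(targets)  # {'inflow': [-10.0, -10.0], 'outflow': [5.0, 13.0]}
```

Each entry of `targets` would then serve as the output that one sub-model is trained to reproduce from the first traffic data and the feature matrix.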

Optionally, the first determination module 602 includes:

A marking submodule, configured to perform feature marking on the first traffic data to obtain a plurality of feature sequences;

A second determination submodule, configured to determine the feature matrix corresponding to the area to be predicted based on the plurality of feature sequences.
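As a rough illustration of feature marking, the sketch below turns a four-dimensional time series (collection time point, traffic value, holiday information, activity information, as listed in claim 6) into time, holiday, activity and regional feature sequences and stacks them into a matrix with one row per collection time point. The concrete encodings (hour of day, 0/1 flags, an integer region code) and the names `Sample` and `feature_matrix` are assumptions, since the text does not fix them.

```python
from datetime import datetime
from typing import List, NamedTuple


class Sample(NamedTuple):
    time: datetime   # collection time point
    traffic: float   # traffic value
    holiday: bool    # holiday information
    activity: bool   # activity information


def feature_matrix(samples: List[Sample], region_code: int) -> List[List[int]]:
    """Build one row [hour, holiday, activity, region] per collection time point."""
    time_seq = [s.time.hour for s in samples]          # time feature sequence
    holiday_seq = [int(s.holiday) for s in samples]    # holiday feature sequence
    activity_seq = [int(s.activity) for s in samples]  # activity feature sequence
    region_seq = [region_code] * len(samples)          # regional feature sequence
    return [list(row) for row in zip(time_seq, holiday_seq, activity_seq, region_seq)]


samples = [
    Sample(datetime(2021, 1, 1, 8), 120.0, True, False),
    Sample(datetime(2021, 1, 1, 9), 150.0, True, True),
]
matrix = feature_matrix(samples, region_code=7)
print(matrix)  # [[8, 1, 0, 7], [9, 1, 1, 7]]
```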

The network traffic prediction device 600 can implement each process of the above network traffic prediction method embodiment and achieve the same beneficial effects. To avoid repetition, the details are not described again here.

An embodiment of the present invention further provides a network traffic prediction device 600, including a processor, a memory, and a program stored in the memory and executable on the processor. When executed by the processor, the program implements each process of the above network traffic prediction method embodiment and achieves the same technical effects. To avoid repetition, the details are not described again here.

Specifically, as shown in FIG. 7, an embodiment of the present invention further provides a network traffic prediction device, including a bus 701, a transceiver 702, an antenna 703, a bus interface 704, a processor 705 and a memory 706.

The processor 705 is configured to acquire historical traffic data of each area within the target range, where the historical traffic data includes first traffic data corresponding to the area to be predicted;

determine, based on the first traffic data, a feature matrix corresponding to the area to be predicted; and

input the historical traffic data and the feature matrix into the preset model to determine the traffic prediction result of the area to be predicted.

Further, the processor 705 is also configured to input the first traffic data and the feature matrix into the first preset model to obtain a first prediction result, where the first prediction result includes an initial traffic prediction sequence;

input the initial traffic prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, where the second prediction result includes a traffic difference feature sequence; and

input the first prediction result and the traffic difference feature sequence into the third preset model to obtain the traffic prediction result of the area to be predicted, where the traffic difference feature sequence indicates how much the population flow of each area within the target range affects the traffic of the area to be predicted.

Further, the first prediction result also includes at least one of a holiday sequence, a trend sequence and a seasonal sequence of the area to be predicted.

Further, the processor 705 is also configured to input the initial traffic prediction sequence and the feature matrix into the target sub-model to obtain a second prediction sub-result.

Further, the processor 705 is also configured to train on the historical traffic data and the feature matrix to obtain the N sub-models.

Further, the processor 705 is also configured to classify the historical traffic data of each area according to the population flow type of that area within the target range.

Further, the processor 705 is also configured to perform feature marking on the first traffic data to obtain a plurality of feature sequences.

In FIG. 7, the bus architecture is represented by the bus 701. The bus 701 may include any number of interconnected buses and bridges, and links together various circuits, including one or more processors represented by the processor 705 and memory represented by the memory 706. The bus 701 may also link various other circuits, such as peripheral devices, voltage regulators and power management circuits; these are well known in the art and are therefore not described further here. The bus interface 704 provides an interface between the bus 701 and the transceiver 702. The transceiver 702 may be a single element or multiple elements, such as multiple receivers and transmitters, and provides a unit for communicating with various other devices over a transmission medium. Data processed by the processor 705 is transmitted over a wireless medium via the antenna 703; the antenna 703 also receives data and passes it to the processor 705.

The processor 705 is responsible for managing the bus 701 and for general processing, and may also provide various functions, including timing, peripheral interfaces, voltage regulation, power management and other control functions. The memory 706 may be used to store data used by the processor 705 when performing operations.

Optionally, the processor 705 may be a CPU, an ASIC, an FPGA or a CPLD.

An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above network traffic prediction method embodiment and achieves the same technical effects. To avoid repetition, the details are not described again here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

It should be noted that, in this document, the terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.

From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the methods described in the embodiments of the present invention.

The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the specific embodiments described, which are merely illustrative rather than restrictive. Guided by the present invention, those of ordinary skill in the art may devise many other forms without departing from the spirit of the present invention and the scope protected by the claims, all of which fall within the protection of the present invention.

Claims

1. A network traffic prediction method, characterized in that the method comprises:

Acquiring historical traffic data of each area within a target range, wherein the historical traffic data includes first traffic data corresponding to an area to be predicted;

Determining, based on the first traffic data, a feature matrix corresponding to the area to be predicted;

Inputting the historical traffic data and the feature matrix into a preset model to determine a traffic prediction result of the area to be predicted;

Wherein the preset model includes a first preset model, a second preset model and a third preset model, the first preset model is cascaded with the second preset model and the third preset model respectively, the second preset model is cascaded with the third preset model, and the second preset model is used to predict the traffic conversion relationship between the area to be predicted and other areas within the target range;

The step of inputting the historical traffic data and the feature matrix into a preset model to determine the traffic prediction result of the area to be predicted includes:

Inputting the first traffic data and the feature matrix into the first preset model to obtain a first prediction result, wherein the first prediction result includes an initial traffic prediction sequence;

Inputting the initial traffic prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, wherein the second prediction result includes a traffic difference feature sequence, and the traffic difference feature sequence is used to indicate the influence of the population flow in each area within the target range on the traffic of the area to be predicted;

Inputting the first prediction result and the traffic difference feature sequence into the third preset model to obtain the traffic prediction result of the area to be predicted;

Wherein the first preset model and the second preset model are different machine learning models, the third preset model is a correction model, and the third preset model is used to correct the initial traffic prediction sequence according to the traffic difference feature sequence.

2. The method according to claim 1, characterized in that the first prediction result further comprises:

At least one of a holiday sequence, a trend sequence and a seasonal sequence of the area to be predicted, wherein the holiday sequence is used to indicate the holiday characteristics of the area to be predicted within a preset prediction time period, the trend sequence is used to indicate the trend characteristics of the area to be predicted within the prediction time period, and the seasonal sequence is used to indicate the seasonal characteristics of the area to be predicted within the preset prediction time period.

3. The method according to claim 1, characterized in that the second preset model includes N sub-models, the second prediction result includes N second prediction sub-results, and N is a positive integer;

The step of inputting the initial traffic prediction sequence and the feature matrix into the second preset model to obtain a second prediction result includes:

Inputting the initial traffic prediction sequence and the feature matrix into a target sub-model to obtain a second prediction sub-result;

Wherein the target sub-model is any one of the N sub-models.

4. The method according to claim 3, characterized in that before inputting the initial traffic prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, the method further comprises:

Performing training and learning on the historical traffic data and the feature matrix to obtain the N sub-models;

Wherein the value of N is consistent with the number of population flow types included in the areas within the target range, and the population flow types include at least one of an inflow type, an outflow type and a stable type.

5. The method according to claim 4, characterized in that the training and learning of the historical traffic data and the feature matrix to obtain the N sub-models comprises:

Classifying the historical traffic data of each area according to the population flow type of each area within the target range;

Calculating N mean sequences corresponding to the N population flow types respectively, wherein the mean sequences are used to indicate the mean traffic values, at each collection time point, of multiple areas with the same population flow type within the target range;

Determining, based on the first traffic data and the N mean sequences, N difference feature sequences corresponding to the N population flow types, the difference feature sequences comprising differences between the traffic values corresponding to each collection time point in the first traffic data and the mean traffic values at the corresponding time points in the mean sequence;

Using the first traffic data and the feature matrix as inputs of each of the N sub-models and the N difference feature sequences as the respective outputs of the N sub-models, and obtaining the N sub-models through training and learning.

6. The method according to claim 1, characterized in that the step of determining the feature matrix corresponding to the area to be predicted based on the first traffic data comprises:

Performing feature marking on the first traffic data to obtain a plurality of feature sequences;

Determining, based on the plurality of feature sequences, a feature matrix corresponding to the area to be predicted;

Wherein the first traffic data is a four-dimensional time series including multiple collection time points and the traffic values, holiday information and activity information corresponding to the multiple collection time points, and the plurality of feature sequences include time feature sequences, holiday feature sequences, activity feature sequences and regional feature sequences.

7. A network traffic prediction device, characterized in that the device comprises:

An acquisition module, used to acquire historical traffic data of each area within the target range, wherein the historical traffic data includes first traffic data corresponding to the area to be predicted;

A first determination module, configured to determine a feature matrix corresponding to the area to be predicted based on the first traffic data;

A second determination module, used to input the historical traffic data and the feature matrix into a preset model to determine the traffic prediction result of the area to be predicted;

The second determination module includes:

A first input submodule, used for inputting the first traffic data and the feature matrix into the first preset model to obtain a first prediction result, wherein the first prediction result includes an initial traffic prediction sequence;

A second input submodule, used for inputting the initial traffic prediction sequence and the feature matrix into the second preset model to obtain a second prediction result, wherein the second prediction result includes a traffic difference feature sequence, and the traffic difference feature sequence is used to indicate the impact of population flow in each area within the target range on the traffic of the area to be predicted;

A third input submodule, used for inputting the first prediction result and the traffic difference feature sequence into the third preset model to obtain the traffic prediction result of the area to be predicted;

8. A network traffic prediction device, characterized in that it comprises: a processor, a memory, and a program stored in the memory and executable on the processor, wherein when the program is executed by the processor, the steps of the network traffic prediction method as described in any one of claims 1 to 6 are implemented.

9. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the network traffic prediction method according to any one of claims 1 to 6 are implemented.