CN117635218A

CN117635218A - Business district traffic prediction method based on six degrees of separation theory and graph attention network

Info

Publication number: CN117635218A
Application number: CN202410103109.1A
Authority: CN
Inventors: 张佳琪; 黄世杰
Original assignee: Beijing Jiaotong University
Current assignee: Beijing Jiaotong University
Priority date: 2024-01-25
Filing date: 2024-01-25
Publication date: 2024-03-01
Anticipated expiration: 2044-01-25
Also published as: CN117635218B

Abstract

The invention provides a business district traffic prediction method based on the six degrees of separation theory and a graph attention network, comprising the following steps: S1, collecting business district traffic data; S2, constructing an adaptive business district traffic graph structure based on the six degrees of separation theory; S3, extracting temporal features with a linear gated convolutional attention unit; S4, extracting spatial features with multi-head graph attention; S5, outputting the prediction result through a fully connected layer, which completes the construction of the business district traffic prediction model; S6, training the business district traffic prediction model; S7, calling the trained business district traffic prediction model to predict traffic. The invention addresses the long-distance spatial correlation between nodes, the time lag, and the dynamic coupling that arise from the particular characteristics of business districts and from commercial marketing activities, and its spatio-temporal feature extraction differs clearly from that used in other traffic prediction processes.

Description

Business district traffic prediction method based on six degrees of separation theory and graph attention network

Technical Field

The invention belongs to the technical field of business prediction, and in particular relates to a business district traffic prediction method based on the six degrees of separation theory and a graph attention network.

Background Art

Business district traffic prediction plays an important role in the operation and management of shopping malls. With accurate traffic prediction, a shopping mall can tailor different business strategies. Traffic prediction is an indispensable part of an intelligent shopping mall management system. Existing traffic prediction models fall roughly into three groups: models based on linear statistical theory, models based on machine learning, and models based on deep neural networks. Most models based on linear statistical theory or machine learning perform static modeling and consider information on a single spatial or temporal scale, so they cannot mine the dynamic dependencies among multivariate time series. Prediction models based on deep neural networks have made significant progress in traffic prediction. Among them, graph neural networks are better suited to modeling the complex relationships in a network of nodes. They use the graph structure to capture complex relationships between nodes and iteratively update node features so that nodes can aggregate and propagate information, and they have performed very well in traffic prediction.

There are three kinds of graph model construction methods: methods based on domain knowledge, methods based on text data, and methods based on data.

(1) Graph model construction based on domain knowledge represents the entities in the data as nodes of the graph and determines the connections between nodes from the knowledge of domain experts or from prior information. Alternatively, domain knowledge is organized into entity-relationship triples and an entity-relationship graph is built, in which entities are nodes and the relationships between entities are edges; the edge weights can reflect the strength or relevance of a relationship. Both approaches depend on domain expert knowledge.

(2) Graph model construction based on text data converts the information in text into a graph structure, for example a word co-occurrence graph, in which the words in the text are nodes, the co-occurrence relationships between words are edges, and the edge weights can represent co-occurrence frequency.

(3) Graph model construction based on data includes the k-nearest-neighbor graph and the radius-neighbor graph. The k-nearest-neighbor graph represents each data point as a node, computes the distance or similarity between nodes, and connects each node to its k nearest nodes. The radius-neighbor graph also represents data points as nodes, but decides whether two nodes are connected according to a predefined distance radius.

These methods capture the relationships between data points and so provide the underlying graph structure for a variety of tasks. Data-based construction is automated, data-driven, interpretable, and widely applicable, which makes it a general and powerful tool for building graph models. Because it extracts information directly from the data and does not rely on domain knowledge or on the interpretation of text, it applies to many data types and fields and suits large-scale data processing and machine learning tasks.

However, existing data-based graph construction methods and traffic prediction schemes based on graph neural networks still have limitations and are difficult to apply to real-world business district traffic prediction, mainly for the following reasons:

1. When constructing the graph model, existing methods consider only the physical location distribution of the nodes and ignore the influence of the marketing patterns and mall activities that are common in business district traffic prediction. The spatial dependence between malls and store locations is highly dynamic rather than static.

2. Existing schemes suffer from over-smoothing, which makes long-distance spatial correlation hard to capture. A business district contains several shopping malls that resemble one another to some degree, so two distant store locations may have similar traffic patterns; the spatial dependence is long-range.

3. Existing methods do not consider the time delay that may occur when spatial information propagates between locations. For example, when one location starts a product promotion, it takes some time (a delay) before the traffic at other stores is affected.

Summary of the Invention

The purpose of the embodiments of the present invention is to provide a business district traffic prediction method based on the six degrees of separation theory and a graph attention network, which dynamically adjusts the graph construction to the business district, captures long-distance spatial correlation, and introduces a linear gated convolutional attention unit to predict business district traffic.

To solve the above technical problems, the technical solution adopted by the present invention is a business district traffic prediction method based on the six degrees of separation theory and a graph attention network, comprising the following steps:

S1. Collect business district traffic data;

S2. Construct an adaptive business district traffic graph structure based on the six degrees of separation theory;

S3. Extract temporal features with a linear gated convolutional attention unit;

S4. Extract spatial features with multi-head graph attention;

S5. Output the prediction result through a fully connected layer, completing the construction of the business district traffic prediction model;

S6. Train the business district traffic prediction model;

S7. Call the trained business district traffic prediction model to predict traffic.

Further, the business district traffic data collected in S1 covers the business district range, the store locations, the collection frequency, and the collection time period.

Further, the adaptive business district traffic graph structure based on the six degrees of separation theory is constructed in S2 as follows:

S21. Compute the correlation coefficient between the variables of every pair of store nodes to obtain the affinity matrix S. Normalize S so that every element of the matrix lies between 0 and 1. Sort the upper-triangular elements of S in descending order. When an element of S is greater than or equal to the threshold, set the corresponding entry of the adjacency matrix A to 1; set all other entries to 0. This yields the adjacency matrix A.

S22. For each adjacency matrix A, compute the corresponding network average consistency h and the network average clustering coefficient CC, which form an h curve and a CC curve. The intersection of the two curves is the balance point between efficiency and network redundancy; its value is the network radius at the intersection of the h and CC curves.

S23. Take the value at the balance point as the optimal threshold and connect the node pairs whose distance measure is greater than or equal to this threshold, forming the business district traffic node graph structure.

Further, extracting temporal features in S3 with the linear gated convolutional attention unit is specifically as follows. First, the self-attention mechanism of the Transformer model is replaced with locally context-sensitive convolutional attention, forming a gated attention unit that fits the temporal characteristics of time-series data. The input is historical time-series data of a fixed time length. The input series is divided into blocks; exact convolutional attention is used within each block and fast linear attention is used across blocks, giving a gated attention unit with linear complexity. This linear-complexity gated attention unit then performs temporal convolution on the input data and outputs feature vectors that capture the correlation between timestamps.

Further, extracting spatial features in S4 with multi-head graph attention is specifically as follows:

The temporal feature vectors obtained in S3 are fed into the spatial convolution layer, which computes feature vectors that carry spatial characteristics. The calculation is as follows:

[Equation: multi-head graph attention update of the node feature vectors; not preserved in the source text]

In this formula, the concatenation operator joins matrices. For the same node, multi-head self-attention computes attention once per head and merges the per-head results by concatenation or by averaging. The inputs are the feature vectors of a node and of its neighboring nodes at a given layer. After the aggregation operation built around the attention mechanism, the output is the node's new feature vector, which can be understood as that node's feature representation after the attention mechanism has been applied.

The other symbols denote the attention head, the attention coefficient, the degree of a node, the learnable parameters, the vector representations of the nodes, the attention weight that one node places on another under a given head index, and the activation function.

The attention coefficient is calculated as follows:

[Equation: attention coefficient; not preserved in the source text]

Here the learnable parameters produce the attention coefficient of each head, the exponential function is applied to the scores, and the scores are computed from the feature vectors of the nodes.
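For orientation, the widely used multi-head graph attention (GAT) formulation is written out below. It is an assumption that the method follows this form: the notation is chosen here for illustration, and the symbol descriptions in this text also mention the degree of a node, which does not appear in the standard form.

```latex
% Assumed notation (not taken from the patent):
% h_i^{(l)}  feature vector of node i at layer l
% N(i)       neighbour set of node i given by the adjacency matrix A
% K          number of attention heads
% W^k, a^k   learnable parameters of head k
% \sigma     activation function, \Vert concatenation
\begin{align}
e_{ij}^{k} &= \mathrm{LeakyReLU}\!\left( (a^{k})^{\top}
              \left[ W^{k} h_i^{(l)} \,\Vert\, W^{k} h_j^{(l)} \right] \right) \\
\alpha_{ij}^{k} &= \frac{\exp\!\left(e_{ij}^{k}\right)}
                        {\sum_{m \in N(i)} \exp\!\left(e_{im}^{k}\right)} \\
h_i^{(l+1)} &= \Big\Vert_{k=1}^{K} \,
               \sigma\!\left( \sum_{j \in N(i)} \alpha_{ij}^{k} W^{k} h_j^{(l)} \right)
\end{align}
```

When the heads are merged by averaging instead of concatenation, the concatenation over k is replaced by a mean over the K heads.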

Further, S5 is specifically as follows: a fully connected layer fuses the temporal and spatial feature vectors and outputs a prediction result of the required length. In the output, one index denotes the node, the other denotes the time step, and each element is the predicted value for that node at that time.

Further, the model training process in S6 is specifically as follows:

S61. Train the model with the mean squared error loss function:

$$\mathrm{Loss} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $n$ is the total number of samples in the time series used to compute the loss, $y_i$ is the true value, and $\hat{y}_i$ is the predicted value.

S62. Use the Adam optimization algorithm to compute, during backpropagation, the gradient of the network error with respect to each weight parameter, and obtain new weights through the parameter update. Iterate on the model weights until the loss falls below a predetermined small value, which gives the best predictions.

Further, calling the trained business district traffic prediction model in S7 to predict traffic is specifically as follows:

S71. Start real-time data collection at a sampling frequency of once every 5 minutes, and feed the collected data into the graph structure obtained in S2;

S72. Call the model built and trained in S6 and output the predicted traffic values.

The beneficial effects of the present invention are as follows:

Business districts with complex commercial attributes have traffic that is easily affected by marketing and events. To address this, the present invention proposes an adaptive business district traffic prediction method based on the idea of six degrees of separation. It addresses three problems of commercial scenarios: the dynamic coupling between stores caused by marketing activities, the over-smoothing caused by the similarity of shopping malls within a business district, and the traffic time lag of marketing activities. The invention can be applied directly to different business districts without manually adjusting, for each district, the graph structure that encodes the spatial relationships of traffic nodes.

Brief Description of the Drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for the description are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Figure 1 is a flow chart of the method of the present invention;

Figure 2 is a flow chart of the adaptive adjacency matrix generation algorithm based on the idea of six degrees of separation;

Figure 3 is a structural diagram of the business district traffic prediction model based on the idea of six degrees of separation.

Detailed Description of the Embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the scope of protection of the present invention.

As shown in Figure 3, the business district traffic prediction method based on the six degrees of separation theory and a graph attention network comprises seven steps. Step 1 is data collection, step 2 builds the graph model of business district traffic, steps 3 to 5 build the prediction model, step 6 is the model training strategy, and step 7 is the deployment and application of the method. The details are as follows:

Step 1: Traffic data collection

First, draw up a detailed business district traffic data collection plan that fixes the business district range, the store locations, the collection frequency, and the collection time period. Deploy the traffic data collection devices at store entrances and make sure that their position and angle are suitable for capturing traffic. Start the devices according to the plan, begin recording traffic data, and set up a data storage and management system. Throughout the process, clean and validate the data regularly, add timestamps, and back up the data to ensure data security and privacy.

Step 2: Construct an adaptive business district traffic graph model based on the six degrees of separation theory

Based on the six degrees of separation theory and on two characteristics of real business districts, namely that marketing messages spread efficiently and that stores cluster locally, the business district traffic graph model is built with the radius-network construction technique. Figure 2 is a schematic flow chart of the algorithm, in which S is the affinity matrix formed by the correlations between every pair of traffic nodes, A is the node adjacency matrix of the graph structure built with the radius-network construction technique, CC(A) quantifies how densely the nodes of the complex network are connected, and h(A) measures the global transmission capacity of the complex network. In the plot, the vertical axis shows the values of CC(A) and h(A) and the horizontal axis shows the radius values.

According to the six degrees of separation theory, any two nodes in a business district can be linked through no more than six intermediate levels of connection, so a marketing message can spread through very few nodes. In addition, the stores of a business district are arranged in zones by category and can be clustered under a user-defined distance (such as the correlation between stores or their physical distance), so the radius-network construction technique can be used to build the graph between business district nodes. The choice of threshold is critical in this construction. A large threshold yields an extremely sparse random network. As the threshold decreases, the network becomes denser and its characteristics move toward those of a fully connected regular network. The invention therefore uses the network average consistency (h) and the clustering coefficient (CC) to quantify the propagation efficiency and sparsity of the graph structure, and determines the threshold by finding the balance between the two. The graph structure is built as follows:

First, compute the correlation coefficient between the variables of every pair of store nodes to obtain the affinity matrix S. After normalization, every element of the matrix lies between 0 and 1.

Then sort the upper-triangular elements of S in descending order. The threshold starts at 1 and is lowered in fixed steps.

If an element of S is greater than or equal to the current threshold, set the corresponding entry of the adjacency matrix A to 1 and set all other entries to 0, which yields the adjacency matrix A.
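The affinity and thresholding steps can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not say how S is normalized, so the sketch takes the absolute value of the Pearson correlation (which also treats strongly anti-correlated stores as related), and the function names and toy traffic series are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(vx * vy)

def affinity_matrix(series):
    """Affinity S with entries in [0, 1]. Assumption: normalise by taking |r|."""
    n = len(series)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = min(1.0, abs(pearson(series[i], series[j])))
            S[i][j] = S[j][i] = value
    return S

def adjacency_from_threshold(S, threshold):
    """A[i][j] = 1 where S[i][j] >= threshold (no self-loops), otherwise 0."""
    n = len(S)
    return [[1 if i != j and S[i][j] >= threshold else 0 for j in range(n)]
            for i in range(n)]

# Hypothetical traffic counts for four stores over six sampling intervals.
flows = [
    [10, 12, 15, 20, 18, 14],  # store 0
    [11, 13, 16, 22, 19, 15],  # store 1 rises and falls with store 0
    [20, 18, 15, 10, 12, 16],  # store 2 mirrors store 0
    [5, 9, 4, 8, 5, 9],        # store 3 is nearly uncorrelated with the others
]
S = affinity_matrix(flows)
A = adjacency_from_threshold(S, 0.9)
```

Lowering the threshold passed to `adjacency_from_threshold` adds edges, which matches the description that the network becomes denser as the threshold decreases.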

The threshold is gradually decreased to 0, which yields one graph structure per threshold value.

For each resulting adjacency matrix, compute the corresponding network average consistency h and network average clustering coefficient CC, which finally form the h curve and the CC curve. Here h measures the propagation efficiency of the network and CC measures how densely the network nodes are connected. The aim is a graph structure that is both efficient and informative; the intersection of the two curves is the balance point between efficiency and network redundancy and is therefore the best threshold for building the adjacency matrix.

The value at this balance point is then taken as the threshold, and node pairs whose distance measure is greater than or equal to it are connected, forming the business district traffic node graph structure. The construction method adapts its parameters to the distribution of malls and stores in each business district. This reduces the work needed to apply the method in practice: implementers no longer have to set parameters from experience and then select the best ones through trial experiments, a process that is time-consuming, labor-intensive, and without theoretical support.
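The threshold sweep can be sketched as follows. The exact definition of the network average consistency h is not given in this text, so the sketch substitutes the standard global efficiency (mean inverse shortest-path length) as a stand-in; that substitution, the function names, and the toy affinity matrix are assumptions. The crossing is located by a strict sign change between neighbouring samples, which deliberately ignores the trivial ties at an empty graph (both metrics 0) and a complete graph (both metrics 1).

```python
from collections import deque

def neighbors(A, i):
    """Indices of the nodes adjacent to node i."""
    return [j for j, a in enumerate(A[i]) if a]

def average_clustering(A):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbours count as 0."""
    total = 0.0
    for i in range(len(A)):
        nb = neighbors(A, i)
        k = len(nb)
        if k >= 2:
            links = sum(A[u][v] for p, u in enumerate(nb) for v in nb[p + 1:])
            total += 2.0 * links / (k * (k - 1))
    return total / len(A)

def global_efficiency(A):
    """Stand-in for h: mean of 1 / shortest-path length over ordered node pairs."""
    n = len(A)
    total = 0.0
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in neighbors(A, u):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != s)
    return total / (n * (n - 1))

def sweep_curves(S, steps=10):
    """Lower the threshold from 1 to 0 and record both metrics at every step."""
    n = len(S)
    ts, hs, ccs = [], [], []
    for k in range(steps + 1):
        t = 1.0 - k / steps
        A = [[1 if i != j and S[i][j] >= t else 0 for j in range(n)]
             for i in range(n)]
        ts.append(t)
        hs.append(global_efficiency(A))
        ccs.append(average_clustering(A))
    return ts, hs, ccs

def first_crossing(ts, hs, ccs):
    """Threshold at the first strict sign change of h - CC, found by linear
    interpolation between neighbouring samples; None if the curves never cross."""
    for k in range(len(ts) - 1):
        d0 = hs[k] - ccs[k]
        d1 = hs[k + 1] - ccs[k + 1]
        if d0 * d1 < 0:
            frac = d0 / (d0 - d1)
            return ts[k] + frac * (ts[k + 1] - ts[k])
    return None

# Hypothetical 3-node affinity matrix, used only to exercise the sweep.
S_toy = [[0.0, 0.9, 0.2], [0.9, 0.0, 0.5], [0.2, 0.5, 0.0]]
ts, hs, ccs = sweep_curves(S_toy, steps=4)
balance = first_crossing(ts, hs, ccs)
```

On very small graphs the two stand-in curves may not cross at all, in which case `first_crossing` returns `None`; a real deployment would use the patent's own definition of h and a finer threshold grid.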

Step 3: Temporal feature extraction based on the linear gated convolutional attention unit

This step is the temporal feature extraction layer, which is the first layer of the model. Its input is historical time-series data of a fixed time length, together with the adjacency matrix built in step 2.

RNN, LSTM, and similar networks currently used to extract nonlinear temporal characteristics have difficulty capturing long-term dependencies, while the Transformer model applies point-wise dot-product self-attention to the series, ignores context, and is insensitive to the effects of nearby commercial marketing activities. The invention therefore replaces the self-attention mechanism of the Transformer with locally context-sensitive convolutional attention, forming a gated attention unit that fits the temporal characteristics of time-series data. The gated attention unit has quadratic complexity, and business district traffic prediction requires long traffic histories, so the computation becomes very expensive. To address this, the invention divides the input series into blocks, using exact convolutional attention within each block and fast linear attention across blocks, which gives a gated attention unit with linear complexity.
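The block-wise idea can be illustrated with a small sketch: exact attention inside each block, plus a linear-attention term that summarises all earlier blocks in a running matrix. Several things here are assumptions rather than the patent's design: the sketch uses plain softmax attention inside a block (it takes the queries, keys, and values as given and does not model the convolution that would produce them), the cross-block term is an unnormalised sum of key-value outer products, and all names are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def chunked_attention(Q, K, V, chunk):
    """Exact softmax attention inside each chunk plus a linear-attention term
    over all earlier chunks, carried in a running sum of outer(k, v)."""
    d_k, d_v = len(K[0]), len(V[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # summary of past chunks
    out = []
    for start in range(0, len(Q), chunk):
        qs = Q[start:start + chunk]
        ks = K[start:start + chunk]
        vs = V[start:start + chunk]
        for q in qs:
            # Quadratic cost, but only in the chunk length.
            w = softmax([dot(q, k) / math.sqrt(d_k) for k in ks])
            local = [sum(wi * v[c] for wi, v in zip(w, vs)) for c in range(d_v)]
            # Cost independent of how many chunks came before.
            glob = [sum(q[r] * kv[r][c] for r in range(d_k)) for c in range(d_v)]
            out.append([l + g for l, g in zip(local, glob)])
        # Fold this chunk into the summary only after it has been processed.
        for k, v in zip(ks, vs):
            for r in range(d_k):
                for c in range(d_v):
                    kv[r][c] += k[r] * v[c]
    return out

# Hypothetical 4-step sequence with 2-dimensional queries/keys and scalar values.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
V = [[1.0], [2.0], [3.0], [4.0]]
out_two_chunks = chunked_attention(Q, K, V, chunk=2)
out_one_chunk = chunked_attention(Q, K, V, chunk=4)
```

For a fixed chunk length the work per time step is constant, so the total cost grows linearly with the sequence length, which is the property the paragraph above relies on.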

In detail, this layer performs temporal convolution with the linear gated convolutional attention unit and adaptively captures the correlation between the data at different timestamps; CA concentrates attention on changes in the local context so that more features are learned. The output of this temporal convolution layer is a set of feature vectors that capture the correlation between timestamps. The structure of the linear gated convolutional attention unit is shown in Figure 3, where CA denotes the convolutional attention module, ACA denotes the convolutional attention computed from the input, and dense denotes a densely connected layer that weights and integrates information. Xt*UT+bT and Xt*WT+cT form the fast linear attention module, σ denotes the activation function, and the remaining operator in the figure denotes the cross product.
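One plausible reading of the two expressions Xt*UT+bT and Xt*WT+cT is a gated linear unit, in which a linear branch is multiplied element-wise by a sigmoid gate computed from the same input. The sketch below follows that reading for a single scalar series. It is an assumption, not a statement of the unit in Figure 3: `*` is taken to be a causal 1-D convolution, σ is taken to be the sigmoid, and the kernel shapes and names are hypothetical.

```python
import math

def causal_conv(x, kernel, bias):
    """1-D causal convolution: output[t] uses x[t], x[t-1], ... with zero padding."""
    out = []
    for t in range(len(x)):
        acc = bias
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += w * x[t - k]
        out.append(acc)
    return out

def sigmoid(z):
    """Sigmoid written to avoid overflow for large negative inputs."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gated_temporal_unit(x, U, b, W, c):
    """Element-wise product of a linear branch (x*U + b) and a sigmoid gate (x*W + c)."""
    value = causal_conv(x, U, b)
    gate = causal_conv(x, W, c)
    return [v * sigmoid(g) for v, g in zip(value, gate)]

# Hypothetical traffic series and kernels of length 2.
series = [1.0, 2.0, 3.0, 4.0]
features = gated_temporal_unit(series, U=[0.5, 0.5], b=0.0, W=[1.0, -1.0], c=0.0)
```

The gate lets the unit pass or suppress each time step depending on its local context, here the difference between the current and previous sample.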

The temporal feature extraction method proposed by the invention addresses the long-distance spatial correlation between nodes, the time lag, and the dynamic coupling that arise from the particular characteristics of business districts and from commercial marketing activities, and it differs clearly from the spatio-temporal feature extraction used in the traffic prediction processes of the prior art.

Step 4: Spatial feature extraction based on multi-head graph attention

To make full use of the information in each node and its neighborhood, the method builds the spatial convolution layer with multi-head graph attention. Multi-head graph attention can model long-term and short-term dependencies at the same time and compute attention in different subspaces in parallel, which supports multi-step, long-range prediction. The temporal feature vectors extracted in step 3 are fed into the spatial convolution layer, which computes feature vectors that carry spatial characteristics. The calculation is as follows:

[Equation: multi-head graph attention update of the node feature vectors; not preserved in the source text]

In this formula, the concatenation operator joins matrices. For the same node, multi-head self-attention computes attention once per head and merges the per-head results by concatenation or by averaging. The other symbols denote the feature variable of a node at a given layer, the attention head, the attention vector, the degree of a node, the learnable parameters, the vector representations of the nodes, and the attention weight that one node places on another under a given head index.

Multi-head self-attention digs deeper into the node data and lets the model better understand what the node features mean. The attention coefficient is computed by a 2-layer multi-layer perceptron (MLP), with the following formula:

[Equation: attention coefficient; not preserved in the source text]

Here the symbols denote the learnable parameters, the activation function, the attention coefficient of each head, the attention vector, the exponential function, the node index, and the feature vector of a node.
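A multi-head graph attention layer of the general kind described here can be sketched in plain Python. The sketch follows the standard GAT scoring (a single learnable vector applied to the concatenated projected features, LeakyReLU, then a softmax over the neighbourhood) rather than the 2-layer MLP mentioned in the text, lets each node attend to itself as well as to its neighbours, and omits the output activation; these choices and all names are assumptions.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_head(H, A, W, a):
    """One attention head: project features with W, score each neighbour with a,
    normalise the scores with a softmax, and aggregate the projected features."""
    Z = [matvec(W, h) for h in H]
    out = []
    for i in range(len(H)):
        nbrs = [j for j in range(len(H)) if A[i][j] or j == i]
        scores = [leaky_relu(sum(p * q for p, q in zip(a, Z[i] + Z[j])))
                  for j in nbrs]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]  # attention weights, sum to 1
        out.append([sum(al * Z[j][c] for al, j in zip(alpha, nbrs))
                    for c in range(len(Z[i]))])
    return out

def multi_head_gat(H, A, heads, merge="concat"):
    """heads is a list of (W, a) pairs; per-head outputs are concatenated or averaged."""
    outs = [gat_head(H, A, W, a) for W, a in heads]
    n = len(H)
    if merge == "concat":
        return [[x for o in outs for x in o[i]] for i in range(n)]
    dim = len(outs[0][0])
    return [[sum(o[i][c] for o in outs) / len(outs) for c in range(dim)]
            for i in range(n)]

# Hypothetical 3-store path graph with scalar features and two identical heads.
H = [[1.0], [2.0], [3.0]]
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
heads = [([[1.0]], [0.0, 0.0]), ([[1.0]], [0.0, 0.0])]
concat_out = multi_head_gat(H, A, heads, merge="concat")
mean_out = multi_head_gat(H, A, heads, merge="mean")
```

With the all-zero scoring vector used in this toy call every neighbour receives the same weight, so each node simply averages itself with its neighbours; learned parameters would make the weights differ per node pair.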

Step 5: Fully connected layer

After temporal and spatial feature extraction, a fully connected layer produces the final output of the required length. In the output, one index denotes the node and the other denotes the time step; each element is the predicted value for a node at a time step, and the notation refers generically to all nodes.

Step 6: Training the business district traffic prediction model

Once the model has been built, training begins. The training strategy is as follows:

The mean squared error loss function is used to optimize the model during training. It is defined as:

$$\mathrm{Loss} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $n$ is the total number of samples in the time series used to compute the loss, $y_i$ is the true value, and $\hat{y}_i$ is the predicted value.

Within this framework, the Adam optimization algorithm computes, during backpropagation, the gradient of the network error with respect to each weight parameter, and new weights are obtained through the parameter update. The model weights are updated iteratively until the loss falls below a predetermined small value, which gives the best predictions. Adam is chosen as the optimizer because it maintains an independent adaptive learning rate for each parameter and, above all, makes the computation more efficient. The overall training algorithm is as follows:

(1) Input the multivariate time series data X and the adjacency matrix A;

(2) Initialize the model: set the learning rate, the batch size b, and the number of iterations iter, and initialize the weight matrix W and the bias matrix.

(3) Train the network in a loop:

A. Sample b items at a time from the multivariate time series data X until all of the data has been sampled once:

1. Compute the model output

2. Compute the loss Loss

3. Compute the gradient and update the parameters with the optimization algorithm, using the learning rate set in step (2).

B. iter = iter + 1

(4) End the loop; model training is complete.
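The training procedure above can be sketched in plain Python. To stay self-contained, the example replaces the spatio-temporal network with a one-variable linear model and implements the Adam update by hand with its commonly used default decay rates; the model, the data, the hyperparameter values, and all names are assumptions for illustration only.

```python
import math
import random

def mse(xs, ys, params):
    """Mean square error of the linear model y = w * x + c."""
    w, c = params
    return sum((w * x + c - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_adam(xs, ys, lr=0.05, batch_size=4, num_iters=200, seed=0):
    """Mini-batch training with MSE loss and a hand-written Adam update."""
    rng = random.Random(seed)
    params = [0.0, 0.0]            # (2) initialize weight w and bias c
    m, v = [0.0, 0.0], [0.0, 0.0]  # Adam first and second moment estimates
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    step = 0
    data = list(zip(xs, ys))
    for _ in range(num_iters):     # (3) training loop; B: iter = iter + 1
        rng.shuffle(data)
        # A: take batch_size items at a time until all data has been seen once
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grads = [0.0, 0.0]
            for x, y in batch:
                err = params[0] * x + params[1] - y   # 1. model output minus target
                grads[0] += 2 * err * x / len(batch)  # 2-3. gradient of the batch MSE
                grads[1] += 2 * err / len(batch)
            step += 1
            for k in range(2):                        # 3. Adam parameter update
                m[k] = beta1 * m[k] + (1 - beta1) * grads[k]
                v[k] = beta2 * v[k] + (1 - beta2) * grads[k] ** 2
                m_hat = m[k] / (1 - beta1 ** step)    # bias-corrected moments
                v_hat = v[k] / (1 - beta2 ** step)
                params[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params                                     # (4) training complete

# Hypothetical noise-free data generated from y = 2x + 1.
xs = [i / 8 for i in range(8)]
ys = [2.0 * x + 1.0 for x in xs]
initial_loss = mse(xs, ys, [0.0, 0.0])
params = train_adam(xs, ys)
final_loss = mse(xs, ys, params)
```

Dividing each gradient by the square root of its own second-moment estimate is what gives every parameter its own effective learning rate, which is the property the description cites as the reason for choosing Adam.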

Step 7: Model deployment and traffic prediction

As shown in Figure 1, real-time data collection is started at a sampling frequency of once every 5 minutes, and the collected data is fed into the designed graph structure. The built and trained model is then called and outputs the predicted traffic values. This helps managers follow the dynamic changes in traffic and provides real-time decision support for business district management and operations.
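One way to organize this deployment step is to keep a rolling window of the most recent 5-minute samples and to call the model once the window is full. The sketch below assumes a hypothetical `RollingWindow` helper and uses a stand-in persistence model in place of the trained network; neither is part of the patent.

```python
from collections import deque

class RollingWindow:
    """Holds the most recent `history_len` samples, one value per shop node."""

    def __init__(self, num_nodes, history_len):
        self.num_nodes = num_nodes
        self.buffer = deque(maxlen=history_len)  # oldest sample is dropped automatically

    def push(self, sample):
        if len(sample) != self.num_nodes:
            raise ValueError("expected one value per node")
        self.buffer.append(list(sample))

    def ready(self):
        return len(self.buffer) == self.buffer.maxlen

    def as_model_input(self):
        # Shape: history_len x num_nodes, oldest sample first
        return [list(s) for s in self.buffer]

def predict_when_ready(window, model):
    """Call the model only once enough history has been collected."""
    return model(window.as_model_input()) if window.ready() else None

def persistence_model(history):
    # Stand-in for the trained network: repeats the latest observation.
    return history[-1]

# Hypothetical stream of 5-minute samples for 2 shops.
window = RollingWindow(num_nodes=2, history_len=3)
latest = None
for sample in [[10, 20], [11, 21], [12, 22], [13, 23]]:
    window.push(sample)
    latest = predict_when_ready(window, persistence_model)
```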

Based on the six degrees of separation theory, the present invention proposes a method for constructing an adaptive business district traffic graph model. By balancing the global efficiency of the network against its redundancy, the method models the spatial relationships of traffic between different shops in the business district, captures long-distance spatial correlations, and removes the over-smoothing caused by shop similarity. The attention mechanism of the transformer is improved, and a new temporal feature extractor is designed to counter the time delays that may arise when spatial information propagates between locations. A multi-head graph attention mechanism is adopted, and a new spatial feature extractor is designed to dynamically capture the effects of different marketing methods and commercial activities on traffic, achieving dynamic coupling of the spatial relationships between nodes.

The embodiments in this specification are described in a related manner. For parts that are the same or similar between embodiments, the embodiments may be referred to one another; each embodiment focuses on its differences from the others. In particular, since the system embodiments are basically similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments.

The above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims

1. A business district traffic prediction method based on six degrees of separation theory and graph attention network, which is characterized by including the following steps:

S1. Collect business district traffic data;

S2. Construct an adaptive business district traffic graph structure based on six degrees of separation theory;

S3. Extract temporal features based on linear gated convolution attention unit;

S4. Extract spatial features based on multi-head graph attention;

S5. Output the prediction results through the fully connected layer to complete the construction of the business district traffic prediction model;

S6. Business district traffic prediction model training;

S7. Call the trained business district traffic prediction model to predict traffic.

2. The business district traffic prediction method based on six degrees of separation theory and graph attention network according to claim 1, characterized in that the business district traffic data collected by S1 includes business district range, shop location, collection frequency, collection period.

3. The business district traffic prediction method based on six degrees of separation theory and graph attention network according to claim 1, characterized in that the adaptive business district traffic graph structure construction process based on six degrees of separation theory in S2 is as follows:

S21. Calculate the correlation coefficient of any two store node variables to obtain the affinity matrix S; normalize the affinity matrix S so that every element of the matrix lies between 0 and 1; arrange the upper-triangular elements of the affinity matrix S in descending order; when an element of S satisfies the threshold condition, set the corresponding position of the adjacency matrix A to 1, and set the remaining positions to 0, to obtain the adjacency matrix A;

S22. For the adjacency matrix A, calculate the corresponding network average consistency and the network average clustering coefficient CC, ultimately forming two curves; the intersection of the two curves is the balance point between efficiency and network redundancy, and its value is the network radius at which the consistency curve intersects the CC curve;

S23. Set the balance-point value as the optimal threshold, and connect the node pairs whose distance is greater than or equal to this threshold to form the business district traffic node graph structure.

4. The business district traffic prediction method based on six degrees of separation theory and graph attention network according to claim 1, characterized in that S3 extracts temporal features based on the linear gated convolution attention unit as follows: first, the self-attention mechanism in the transformer model is replaced with local, context-sensitive convolutional attention to form a gated attention unit that fits the time-dimension characteristics of time series data; the input is historical time series data of a given time length, which is divided into blocks, with exact convolutional attention used within each block and fast linear attention used across blocks, yielding a gated attention unit with linear complexity; the gated attention unit then performs temporal convolution on the input data and outputs the feature vectors of the correlations between timestamps.

5. The business district traffic prediction method based on six degrees of separation theory and graph attention network according to claim 1, characterized in that the S4 extracting spatial features based on multi-head graph attention is specifically:

Input the temporal feature vector obtained by S3 into the spatial convolution layer to calculate the feature vector with spatial characteristics. The specific calculation process is as follows:

wherein the concatenation operator denotes matrix splicing; for the same node, multi-head self-attention computes several attention heads separately and merges them by splicing or averaging; the input is the feature vector of a node at the current layer, and the output, after the aggregation operation with the attention mechanism at its core, is the new feature vector of that node at the next layer, which can be understood as the feature representation of the node after passing through the attention mechanism;

the remaining terms denote, respectively: the index of the attention head; the attention coefficient; the degree of a node; the learnable parameters; the vector representation of a node; the attention weight, within a given head, of one node with respect to another; and the activation function;

The attention coefficient is calculated as follows:

wherein the terms denote, respectively: a learnable parameter; the attention coefficient of the given head; the exponential function; and the feature vector of a node.

6. The business district traffic prediction method based on six degrees of separation theory and graph attention network according to claim 1, characterized in that S5 specifically comprises: merging the temporal and spatial feature vectors through a fully connected layer and outputting prediction results of the required length; wherein one index denotes the node, the other denotes the time step, and each output element is the predicted value of a node at a time step.

7. The business district traffic prediction method based on six degrees of separation theory and graph attention network according to claim 1, characterized in that the process of the S6 training model is specifically:

S61 trains the model through the mean square error loss function. The loss function is:

wherein the terms denote the total number of samples in the time series used to calculate the loss, the true value, and the predicted value;

S62. Use the Adam optimization algorithm to calculate the gradient of the network error for each weight parameter in backpropagation, and obtain new weights through the parameter update process; iteratively calculate the model weight until a predetermined small loss is reached, and the best prediction value is obtained.

8. The business district traffic prediction method based on six degrees of separation theory and graph attention network according to claim 1, characterized in that the S7 calls the trained business district traffic prediction model to perform traffic prediction:

S71. Start real-time data collection, maintain a sampling frequency of once every 5 minutes, and input the collected data into the graph structure obtained by S2;

S72. Call the model built in S6 to output the predicted traffic value.