CN110119884A - High-speed railway passenger flow time period division method based on affinity propagation clustering - Google Patents

Info

Publication number
CN110119884A
Authority
CN
China
Prior art keywords
passenger flow
sample
passenger
time
time point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910307332.7A
Other languages
Chinese (zh)
Other versions
CN110119884B (en)
Inventor
王文宪
肖蒙
翟玉江
林群煦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuyi University Fujian
Original Assignee
Wuyi University Fujian
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuyi University Fujian
Priority to CN201910307332.7A
Publication of CN110119884A
Application granted
Publication of CN110119884B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Train Traffic Observation, Control, And Security (AREA)

Abstract

The invention provides a method for dividing high-speed railway passenger flow into time periods based on affinity propagation clustering. The statistical period is divided into a number of time points, passenger flow data are collected for each time point, and the preprocessed sample variables are assembled into a sequence of time-point samples. The sample sequence is then partitioned with the affinity propagation clustering algorithm. Finally, clustering validity indices (CH, Hartigan and IGP) are used to determine the optimal clustering result, from which the division of the year into operating periods is derived. By clustering and merging time points with similar passenger flow within the year and choosing the optimal number of clusters from the CH, Hartigan and IGP indices, the method improves classification accuracy. It reflects passenger demand in different periods of the year more objectively and accurately, overcomes the subjectivity, low efficiency and low precision of manual division, and thus lays a foundation for adaptive adjustment of the train operation plan.

Description

A method for dividing high-speed railway passenger flow into time periods based on affinity propagation clustering

Technical field

The invention relates to the technical field of high-speed railways, and in particular to a method for dividing high-speed railway passenger flow into time periods based on affinity propagation clustering.

Background

With the gradual completion of the "eight horizontal and eight vertical" high-speed railway network and the intercity railway network, more and more medium- and long-distance passengers choose high-speed rail as their preferred mode of travel. The train operation plan, an important factor in passenger service quality, specifies the number of passenger trains, their running sections, their stops and so on. To improve service quality for every passenger flow direction on the network while keeping train operating costs as low as possible, high-speed railway passenger transport management departments need to adapt the train operation plan to the fluctuation of passenger flow over the year so that it matches demand in different periods of the year.

The distribution of passenger flow over the trains in a railway network is an important basis for evaluating how efficiently a passenger train operation plan is implemented. A plan already in operation is usually evaluated with the actual distribution of passengers on board, but a plan that is still being optimized can only be evaluated with a distribution generated by passenger flow assignment. Because the efficiency of the assignment and the plausibility of its results directly affect the level of optimization achieved, train passenger flow assignment has long been one of the basic research topics in the optimization of railway passenger train operation plans.

Adjusting the train operation plan, however, involves many factors and is a large, complex piece of systems engineering, so the number of adjustments per year is limited. A feasible strategy is to divide the operating year of the high-speed railway into periods according to the fluctuation characteristics of passenger flow and then adjust the train operation plan according to the passenger flow in each period. A scientific and reasonable division of operating periods is therefore the basic premise and an important basis for adjusting the train operation plan, and an important guarantee that the adjusted plan matches dynamic passenger demand. The current practice divides the year into the Spring Festival travel period, the summer travel period, holiday periods and off-peak periods according to changes in the total passenger flow on the target line over the year. Although this reflects the differences in passenger flow between periods, the quality of the division depends largely on the experience of on-site engineers and technicians. It is highly subjective, easily produces unreasonable divisions, and cannot accurately reflect seasonally varying passenger demand within the year.

No research at home or abroad addresses the division of high-speed railway operating periods directly. The problem is essentially similar to the division of traffic periods in time-of-day (TOD) signal control design for road intersections. For the TOD problem, researchers have mainly plotted the cumulative traffic volume curve of a representative intersection over one day and relied on manual experience to pick the time nodes where the curve changes markedly as period boundaries. Early train passenger flow assignment methods were rarely studied on their own and were usually embedded in research on optimizing train operation plans. These studies build a passenger transfer network from a given train operation plan, define a generalized travel cost for passengers (fare, travel time, congestion effects and so on), establish a static or stochastic user equilibrium assignment model, and assign the flow to train running sections in equilibrium (see Shi Feng, Deng Lianbo, Li Xinhua, Fang Qigen. Research on the operation plan of passenger trains related to passenger dedicated lines [J]. Journal of the China Railway Society, 2004, 26(2): 16-20).

Passenger flow assignment in urban public transport is very similar to that on high-speed rail and has been studied extensively. Hamdouch et al. constructed, on a space-time network, the set of travel alternatives available to a passenger arriving at any station, and proposed a schedule-based transit strategy equilibrium assignment model that accounts for departure time requirements, strict capacity constraints and a space-time priority principle. This work mainly considers capacity constraints and space-time priority in the assignment process and does not analyse ticket purchasing behaviour, although that behaviour strongly influences the travel choices of high-speed railway passengers. A passenger flow assignment method suited to the high-speed railway network is therefore needed.

Summary of the invention

In view of the shortcomings of the prior art, the invention provides a method for dividing high-speed railway passenger flow into time periods based on affinity propagation clustering. The method divides the whole year into reasonable periods according to how the passenger flow of each direction on the high-speed railway line changes, improving the fit between the train operation plan and passenger demand.

The technical solution of the invention is a method for dividing high-speed railway passenger flow into time periods based on affinity propagation clustering, comprising the following steps:

S1) Divide a year into T time points and let high-speed railway line X contain n important stations. For each time interval $t_k$, count the passenger departure volume of every station in the up or down direction of line X, that is,

$$X_k = (x_k^1, x_k^2, \ldots, x_k^n), \quad k = 1, 2, \ldots, T,$$

where $X_k$ is the passenger departure volume of line X within time interval $t_k$, and $x_k^n$ is the passenger departure volume of the n-th station of line X within time interval $t_k$;
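As an illustration of step S1 only, the following minimal sketch assembles the per-time-point station vectors $X_k$ from raw departure records. The record layout `(time_index, station, passengers)` and the names `build_departure_vectors`, `records` and `stations` are assumptions made for this example and do not come from the patent.

```python
def build_departure_vectors(records, stations, num_points):
    """Return one vector per time point; entry s is the departures of station s.

    records: iterable of (time_index, station, passengers) tuples (hypothetical layout).
    """
    index = {name: pos for pos, name in enumerate(stations)}
    vectors = [[0] * len(stations) for _ in range(num_points)]
    for k, station, passengers in records:
        # Accumulate, because several records may refer to the same day and station.
        vectors[k][index[station]] += passengers
    return vectors


records = [(0, "A", 120), (0, "B", 80), (1, "A", 100), (0, "A", 30)]
X = build_departure_vectors(records, ["A", "B", "C"], num_points=2)
```

With daily time points and a full year of records, `num_points` would be the T of step S1 and each row of `X` one sample for the later clustering.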

S2) Use a threshold δ to judge whether the passenger departure volume at each time point is abnormal. Specifically, take the passenger departure vectors of the m time points adjacent to time point $t_l$ and compute their mean $\bar{X}$. If $X_l$, the passenger departure volume at time point $t_l$, deviates from $\bar{X}$ by no more than the threshold δ allows, the data at $t_l$ are regarded as normal; otherwise they are regarded as abnormal;
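The exact inequality used in step S2 is not reproduced in this text, so the sketch below assumes one plausible form of the test: a time point is normal when, for every station, its value differs from the mean of the neighbouring points by at most δ times that mean. The symmetric neighbour window and the names `is_normal`, `series`, `m` and `delta` are likewise assumptions for illustration.

```python
def is_normal(series, l, m, delta):
    """Return True when series[l] stays within delta of its neighbours' mean.

    series: list of per-time-point station vectors.
    m: number of neighbouring points, taken symmetrically around l and
       clipped at the ends of the series (assumed window shape).
    Assumed test: |x - mean| <= delta * mean for every station.
    """
    half = m // 2
    lo, hi = max(0, l - half), min(len(series), l + half + 1)
    neighbours = [series[t] for t in range(lo, hi) if t != l]
    for s in range(len(series[l])):
        mean = sum(v[s] for v in neighbours) / len(neighbours)
        if abs(series[l][s] - mean) > delta * mean:
            return False
    return True


series = [[100, 50], [104, 52], [98, 49], [250, 51], [102, 48], [99, 50], [101, 51]]
flags = [is_normal(series, l, m=4, delta=0.5) for l in range(len(series))]
```

In this toy series only the fourth time point, whose first station jumps to 250, is flagged as abnormal.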

If the data are abnormal, delete them and repair the passenger departure volume at that time point by fitting the passenger flow data of its m adjacent time points:

$$L_n(t) = \sum_{i=1}^{m} X_i\, l_i(t), \qquad l_k(t) = \prod_{j=1,\ j \neq k}^{m} \frac{t - t_j}{t_k - t_j},$$

where $L_n(t)$ is the Lagrange interpolation polynomial, $l_k(t)$ (and likewise $l_i(t)$) is the Lagrange basis polynomial associated with the k-th (i-th) time point, t is the time point to be fitted, $t_j$ is the j-th time point, $t_i$ is the i-th time point, and $X_i$ is the passenger flow matrix at the i-th time point;
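The repair step can be sketched with a direct evaluation of the Lagrange interpolating polynomial through the neighbouring time points. This is a minimal illustration for one station's scalar series; the function name `lagrange_repair` and the toy data are assumptions, and a vector of stations would be repaired by applying the same function to each station separately.

```python
def lagrange_repair(times, values, t):
    """Evaluate the Lagrange polynomial through (times[i], values[i]) at t."""
    total = 0.0
    for i, (t_i, x_i) in enumerate(zip(times, values)):
        basis = 1.0
        for j, t_j in enumerate(times):
            if j != i:
                # Factor of the i-th basis polynomial: (t - t_j) / (t_i - t_j).
                basis *= (t - t_j) / (t_i - t_j)
        total += x_i * basis
    return total


# The neighbours of the abnormal day 3 (days 1, 2, 4, 5) lie on the line x = 2t + 90.
repaired = lagrange_repair([1, 2, 4, 5], [92, 94, 98, 100], t=3)
```

Because the four neighbouring values lie on a straight line, the interpolated value for day 3 falls on the same line.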

Merge the repaired data with the original normal data, then apply standard-deviation (z-score) standardization to remove differences in scale between variables:

$$Z\text{-score} = \frac{x - \bar{x}}{\sigma},$$

where Z-score is the standardized value, x is the passenger departure volume of a station at a given time point, $\bar{x}$ is the mean passenger departure volume of all stations over the year, and σ is the standard deviation of the passenger departure volume of all stations over the year;
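A short sketch of the standardization, using only the Python standard library. It assumes the population standard deviation (`statistics.pstdev`); the text does not say whether the population or sample form is intended, and the function name `zscore` is chosen for this example.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma for x in values]


z = zscore([90, 100, 110])
```

After standardization the values are centred on zero, so stations with very different traffic levels contribute on a comparable scale to the distance used in clustering.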

S3) Divide the periods with the affinity propagation clustering algorithm. All T time-point samples in the passenger departure data set are treated as candidate exemplars (class representatives), and the similarity s(i,k) between the passenger departure volumes of any two time intervals is evaluated. s(i,k) is the similarity between sample $X_k$ of time interval k and sample $X_i$ of time interval i, and quantifies how well $X_k$ is suited to be the exemplar of $X_i$. At initialization all samples are assumed equally likely to be chosen as exemplars, so every s(k,k) is set to the same median preference value p. The similarity of any two samples is

$$s(i,k) = -\lVert x_i - x_k \rVert^2,$$

where $x_i$ and $x_k$ are the passenger departure volume samples of time intervals i and k.

Define a responsibility matrix r and an availability matrix a. The responsibility r(i,k) is sent from sample $x_i$ to sample $x_k$ and indicates how well suited $x_k$ is to serve as the exemplar of $x_i$; the availability a(i,k) is sent from sample $x_k$ to sample $x_i$ and indicates how appropriate it is for $x_i$ to choose $x_k$ as its exemplar. For each sample $x_i$, compute the sum of the responsibility r(i,k) and the availability a(i,k) with respect to the samples of the other time intervals; the sample $x_k$ with the largest sum is its exemplar. The category assigned to every time point is then output.

Specifically, the step comprises:

S301) Initialize the responsibility matrix r(i,k) and the availability matrix a(i,k) to 0;

S302) Compute the similarity matrix s(i,k) of the passenger departure samples of all time intervals, using the negative squared Euclidean distance as the measure, that is, $s(i,k) = -\lVert x_i - x_k \rVert^2$;

Set the diagonal elements s(k,k) to the same median preference value p, that is,

$$s(k,k) = p, \quad k = 1, 2, \ldots, N,$$

where N is the number of samples;

S303) Update the responsibility matrix r(i,k) and the availability matrix a(i,k). The responsibility update is

$$r(i,k) = s(i,k) - \max_{k' \neq k}\{a(i,k') + s(i,k')\},$$

and the availability update is

$$a(i,k) = \min\Big\{0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\{0, r(i',k)\}\Big\} \ (i \neq k), \qquad a(k,k) = \sum_{i' \neq k} \max\{0, r(i',k)\};$$

S304) Set a damping factor λ to suppress numerical oscillation during iteration, that is,

$$r_{new}(i,k) = \lambda\, r_{old}(i,k) + (1-\lambda)\, r_{new}(i,k), \qquad a_{new}(i,k) = \lambda\, a_{old}(i,k) + (1-\lambda)\, a_{new}(i,k),$$

where $r_{new}(i,k)$ and $r_{old}(i,k)$ are the responsibility matrices from the current and the previous update, $a_{new}(i,k)$ and $a_{old}(i,k)$ are the availability matrices from the current and the previous update, and λ∈(0,1) is the damping factor;

S305) For each passenger departure sample, compute the sum of its responsibility and availability with respect to all samples, and find the exemplar of each sample according to $k^* = \arg\max_k \{a(i,k) + r(i,k)\}$;

S306) Update the iteration counter n←n+1 and check whether the message-passing process has reached the maximum number of iterations, that is, whether $n \geq N_{max}$. If so, the algorithm terminates and outputs the category of every time point; otherwise return to step S302);
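Steps S301 to S306 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the shared preference is taken as the median of the off-diagonal similarities (the text only says "median preference"), the loop runs a fixed number of iterations, and the function name `affinity_propagation` and the toy one-dimensional points are invented for the example. The responsibility and availability updates are the standard affinity propagation message updates.

```python
def affinity_propagation(points, preference=None, damping=0.5, max_iter=200):
    """Return, for every sample, the index of its exemplar (needs >= 2 samples)."""
    n = len(points)
    # S302: similarity = negative squared Euclidean distance.
    s = [[-sum((a - b) ** 2 for a, b in zip(points[i], points[k]))
          for k in range(n)] for i in range(n)]
    if preference is None:
        # Assumption: shared preference = median of off-diagonal similarities.
        off = sorted(s[i][k] for i in range(n) for k in range(n) if i != k)
        preference = off[len(off) // 2]
    for k in range(n):
        s[k][k] = preference
    # S301: responsibilities r and availabilities a start at zero.
    r = [[0.0] * n for _ in range(n)]
    a = [[0.0] * n for _ in range(n)]
    for _ in range(max_iter):  # S306: stop after max_iter iterations
        # S303 + S304: damped responsibility update.
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best_other)
        # S303 + S304: damped availability update.
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # sum over i' != k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    # S305: each sample picks the k that maximizes a(i,k) + r(i,k).
    return [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]


points = [[0.0], [1.0], [2.0], [5.0], [6.0], [7.0]]
labels = affinity_propagation(points)
```

For the toy data the two groups are each represented by their middle point. Raising the preference toward zero makes every sample its own exemplar, which is how the preference controls the number of classes compared in step S4.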

S4) Compute the Calinski-Harabasz, Hartigan and In-Group Proportion indices for each of the different time-point classification results, and select the optimal number of time-point classes and the corresponding classification result;

S5) Check and correct the operating period division. Traverse all classes and compare their samples pairwise: if the time points of two samples in the same class are adjacent, merge them into one operating period; otherwise treat them as belonging to different operating periods;
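One way to read step S5 is that consecutive time points carrying the same class label are merged into a single operating period. The sketch below implements that reading; the function name `merge_into_periods` and the `(start, end, label)` output format are assumptions for illustration.

```python
def merge_into_periods(labels):
    """Collapse runs of consecutive equal labels into (start, end, label) periods."""
    periods = []
    start = 0
    for t in range(1, len(labels) + 1):
        # Close the current run at the end of the list or when the label changes.
        if t == len(labels) or labels[t] != labels[start]:
            periods.append((start, t - 1, labels[start]))
            start = t
    return periods


periods = merge_into_periods([0, 0, 1, 1, 1, 0, 2, 2])
```

Note that class 0 appears in two separate periods here: time points of the same class that are not adjacent stay in different operating periods.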

S6) Assess adaptability to passenger demand. After the operating periods are divided, draw up a train operation plan from the mean passenger demand of each period and simulate passenger flow assignment. Three indicators (passenger demand satisfaction rate, average train occupancy rate and direct passenger rate) are introduced to quantify and summarize how well the train operation plan fits passenger demand in each period.

Further, the Calinski-Harabasz index is a measure based on the within-class and between-class scatter matrices of all samples, and the number of classes at which it reaches its maximum is taken as the optimal number of clusters, that is,

$$CH(k) = \frac{\operatorname{tr}B(k)/(k-1)}{\operatorname{tr}W(k)/(n-k)},$$

where k is the number of clusters, trB(k) is the trace of the between-class scatter matrix, trW(k) is the trace of the within-class scatter matrix, and n is the number of time-point samples.
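The index can be checked on a small example. The sketch below computes trB(k) and trW(k) as sums of squared distances to centroids, which equal the traces of the corresponding scatter matrices; the function names and the toy data are assumptions for illustration.

```python
def _centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def _sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def calinski_harabasz(points, labels):
    """CH(k) = (trB / (k - 1)) / (trW / (n - k)); requires 1 < k < n."""
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    overall = _centroid(points)
    # trW: squared distances of points to their own cluster centroid.
    tr_w = sum(_sq_dist(p, _centroid(members))
               for members in clusters.values() for p in members)
    # trB: size-weighted squared distances of cluster centroids to the overall centroid.
    tr_b = sum(len(members) * _sq_dist(_centroid(members), overall)
               for members in clusters.values())
    return (tr_b / (k - 1)) / (tr_w / (n - k))


ch = calinski_harabasz([[0], [2], [10], [12]], [0, 0, 1, 1])
```

Here trW = 4 and trB = 100, so CH(2) = (100 / 1) / (4 / 2) = 50.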

Further, the Hartigan index can also be evaluated when the number of clusters is 1; the smallest number of classes satisfying Ha ≤ 10 is taken as the optimal number of clusters, that is,

$$Ha(k) = \left(\frac{\operatorname{tr}W(k)}{\operatorname{tr}W(k+1)} - 1\right)(n-k-1),$$

where k is the total number of time-point classes in the clustering result, trW(k) is the trace of the within-class scatter matrix, and n is the number of time-point samples.
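A minimal sketch of the selection rule follows. It assumes the partitions for k and k+1 clusters are already available (hand-written here); the function names `within_scatter` and `hartigan` and the toy data are invented for the example.

```python
def within_scatter(points, labels):
    """trW: sum of squared distances of every point to its cluster centroid."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum((x - c) ** 2 for p in members for x, c in zip(p, centroid))
    return total


def hartigan(points, labels_k, labels_k1):
    """Ha(k) = (trW(k) / trW(k+1) - 1) * (n - k - 1)."""
    n, k = len(points), len(set(labels_k))
    ratio = within_scatter(points, labels_k) / within_scatter(points, labels_k1)
    return (ratio - 1) * (n - k - 1)


points = [[0], [2], [10], [12], [20], [22]]
partitions = {1: [0] * 6, 2: [0, 0, 0, 0, 1, 1], 3: [0, 0, 1, 1, 2, 2], 4: [0, 0, 1, 1, 2, 3]}
ha = {k: hartigan(points, partitions[k], partitions[k + 1]) for k in (1, 2, 3)}
best_k = min(k for k, value in ha.items() if value <= 10)
```

For these three well-separated pairs, Ha(1) and Ha(2) exceed 10 while Ha(3) does not, so the rule selects three clusters.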

Further, the In-Group Proportion (IGP) index measures whether the nearest neighbour of each sample in a class belongs to the same class. The larger the average IGP over all clusters, the better the clustering quality, and the number of classes at which it reaches its maximum is the optimal number of clusters, that is,

$$IGP(u) = \frac{\#\{j \mid Class(j) = Class(j_N) = u\}}{\#\{j \mid Class(j) = u\}},$$

where u is the label of a cluster, Class(j) is the class label of sample j, $j_N$ is the sample nearest to sample j, and # denotes the number of samples satisfying the condition.
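The following sketch evaluates the IGP of one cluster on a toy example, using squared Euclidean distance to find nearest neighbours. The function name `in_group_proportion` and the data are assumptions for illustration.

```python
def in_group_proportion(points, labels, u):
    """Share of samples in cluster u whose nearest neighbour is also in cluster u."""
    members = [j for j, lab in enumerate(labels) if lab == u]
    hits = 0
    for j in members:
        # Nearest other sample by squared Euclidean distance.
        nearest = min((i for i in range(len(points)) if i != j),
                      key=lambda i: sum((a - b) ** 2
                                        for a, b in zip(points[i], points[j])))
        if labels[nearest] == u:
            hits += 1
    return hits / len(members)


points = [[0], [1], [2.5], [9], [10], [11.5]]
igp_good = in_group_proportion(points, [0, 0, 0, 1, 1, 1], u=1)
igp_bad = in_group_proportion(points, [0, 0, 1, 1, 1, 1], u=1)
```

In the second labelling the sample at 2.5 is assigned to cluster 1 although its nearest neighbour lies in cluster 0, which lowers the IGP of cluster 1 from 1.0 to 0.75.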

Further, the passenger demand satisfaction rate reflects how far the transport capacity provided by the train operation plan meets passenger demand between the passenger flow directions of the high-speed railway and the related network. Under the constraints of capacity resources, in particular train seating capacity, it is expressed as the ratio of the passenger volume that can be effectively served under the given train operation plan to the total passenger demand.

In this ratio, $q'_w$ is the total passenger flow carried by the high-speed railway between passenger flow OD pair w, and w is the number of passenger flow directions in the network.
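Since the formula itself is not reproduced here, the sketch below follows the verbal definition only: served passengers summed over all OD pairs, divided by total demand over the same pairs. The dictionary layout and the names `demand_satisfaction_rate`, `served` and `demand` are assumptions for illustration.

```python
def demand_satisfaction_rate(served, demand):
    """Ratio of passengers actually carried to total demand, summed over OD pairs."""
    return sum(served.values()) / sum(demand.values())


demand = {("A", "B"): 500, ("A", "C"): 300, ("B", "C"): 200}
served = {("A", "B"): 450, ("A", "C"): 300, ("B", "C"): 150}
rate = demand_satisfaction_rate(served, demand)
```

In this example 900 of 1000 requested trips are served, giving a satisfaction rate of 0.9.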

Further, the average train occupancy rate is the mean of the occupancy rates of all trains within the evaluation scope. The occupancy rate of a train is the weighted ratio of the passenger flow it carries over its running sections to the total number of seats it provides. The indicator reflects how passengers of different OD pairs choose among the various types of high-speed train.

The calculation uses the passenger flow carried by train h on section (i,j), the seating capacity $A_h$ of train h, and the number of sections $E_h$ over which train h runs.
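The sketch below is one reading of this definition, offered as an assumption because the formula is not reproduced: each train's occupancy is its summed section load divided by $A_h \cdot E_h$, and the indicator is the plain mean over trains. The function name `average_occupancy` and the data layout are invented for the example.

```python
def average_occupancy(trains):
    """trains: list of (section_loads, seats); returns the mean per-train occupancy.

    Per-train occupancy = sum of section loads / (seats * number of sections),
    i.e. load divided by A_h * E_h (assumed reading of the definition).
    """
    rates = [sum(loads) / (seats * len(loads)) for loads, seats in trains]
    return sum(rates) / len(rates)


trains = [([400, 500, 300], 500), ([300, 300], 600)]
occupancy = average_occupancy(trains)
```

The first train is 80% full on average over its three sections and the second 50% over its two, so the indicator is 0.65.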

Further, the passenger demand structure consists of different demand directions, each of which has direct or transfer itineraries to its destination. The direct passenger rate is, under the given train operation plan and passenger demand structure, the ratio of the passenger flow that reaches its destination without transfer between each pair of demand points to the total passenger flow in that direction.

In the calculation, w is the number of passenger flow directions in the network, |e| is the number of transfers of a given passenger flow direction, and the two flow quantities are the number of passengers of OD pair w who reach their destination directly without transfer and the number who reach it through |e| transfers.
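As a hedged illustration of this indicator, the sketch below computes the direct share for each OD pair and then a simple average over the pairs. Whether the original formula averages per-pair ratios or pools all passengers is not recoverable from this text, so the averaging, the function name `direct_rates` and the data layout are assumptions.

```python
def direct_rates(flows):
    """flows: OD pair -> list where index e holds passengers needing e transfers.

    Returns (per-pair direct share, mean of the per-pair shares).
    """
    per_pair = {od: counts[0] / sum(counts) for od, counts in flows.items()}
    return per_pair, sum(per_pair.values()) / len(per_pair)


flows = {("A", "B"): [400, 100], ("A", "C"): [200, 50, 50]}
per_pair, mean_rate = direct_rates(flows)
```

Here 80% of A-B passengers and two thirds of A-C passengers travel without a transfer.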

The beneficial effects of the invention are:

1. Using the passenger flow data of the stations along the line at each time point, the invention clusters and merges time points with similar passenger flow within the year by affinity propagation and determines the optimal number of clusters from the CH, Hartigan and IGP indices, improving classification accuracy;

2. The cluster-analysis-based method for dividing high-speed railway operating periods, combined with validation by clustering validity indices, reflects passenger demand in different periods of the year more objectively and accurately and overcomes the subjectivity, low efficiency and low precision of manual division, laying a foundation for adaptive adjustment of the train operation plan;

3. After the optimal clustering result is determined, the result is analysed manually. Checking and correcting the operating period division ensures the accuracy of the planning, and assessing adaptability to passenger demand through the passenger demand satisfaction rate, the average train occupancy rate and the direct passenger rate further improves how well the train operation plan fits passenger demand in each period;

4. The invention also preprocesses the collected data: abnormal data are deleted and repaired by fitting based on Lagrange interpolation, ensuring the usability of the data and hence the reliability of the planning results.

Description of drawings

Figure 1 is a flow chart of the division of high-speed railway operating periods based on affinity propagation clustering;

Figure 2 is a schematic diagram of the validity index values for different numbers of clusters in the 2014 operating period division;

Figure 3 is a schematic diagram of the validity index values for different numbers of clusters in the 2015 operating period division;

Figure 4 is a schematic diagram of the clustering result for the 2014 operating periods;

Figure 5 is a schematic diagram of the clustering result for the 2015 operating periods.

Detailed description of the embodiments

Specific embodiments of the invention are further described below with reference to the drawings:

As shown in Figure 1, this embodiment provides a method for dividing high-speed railway passenger flow into time periods based on affinity propagation clustering. For ease of understanding, the embodiment illustrates the invention with the passenger departure volumes of a high-speed railway in regular operation from January 1 to December 31, 2014 and from January 1 to December 31, 2015. The railway has 9 stations. The method comprises the following steps:

S1) Divide the period from January 1 to December 31, 2014 and the period from January 1 to December 31, 2015 into 365 time points each, one time point per day, and count the down-direction passenger flow of the railway at each time point, that is, $X_k = (x_k^1, x_k^2, \ldots, x_k^9)$, which describes the passenger flow OD matrix in the down direction of the line, where $X_k$ is the passenger flow at time point k and $x_k^i$ is the passenger flow of station i at time point k;

S2)、采用阈值δ判定判断每个时间点的客流发送量是否异常,具体为统计时间点tl相邻的m个时间点的客流发送向量,并计算其均值如果S2), use the threshold value δ to determine whether the passenger flow transmission volume at each time point is abnormal, specifically counting the passenger flow transmission vectors of the m adjacent time points at the time point t1 , and calculating the mean value thereof if

则认为该时间点tl的客流发送量数据为非异常数据,否则为异常数据;Then it is considered that the passenger flow data at this time point tl is non-abnormal data, otherwise it is abnormal data;

其中,Xl为时间点tl的客流发送量;Wherein, X l is the amount of passenger flow sent at time point t l ;

如果数据异常,则删除异常数据,并利用该时间点相邻m个时间点的客流数据对其客流发送量进行拟合修复,计算式如下:If the data is abnormal, delete the abnormal data, and use the passenger flow data of m adjacent time points at this time point to fit and repair the passenger flow transmission volume. The calculation formula is as follows:

式中,lk(t)为k-1次拟合多项式,Ln(t)为拉格朗日插值多项式、t为待拟合的时间点、tj为第j个时间点、ti为第i个时间点、Xi为第i个时间点的客流矩阵、li(t)为i-1次拟合多项式;In the formula, l k (t) is the k-1 degree fitting polynomial, L n (t) is the Lagrangian interpolation polynomial, t is the time point to be fitted, t j is the jth time point, t i is the i-th time point, X i is the passenger flow matrix of the i-th time point, and l i (t) is the i-1 degree fitting polynomial;

The repaired data are merged with the original normal data, and standard-deviation (Z-score) standardization is then applied to remove scale differences between variables:

$$Z\text{-}score = \frac{x - \bar{x}}{\sigma}$$

where Z-score is the standardized value, $x$ is the station passenger dispatch volume at a given time point, $\bar{x}$ is the mean passenger dispatch volume of all stations within the year, and $\sigma$ is the standard deviation of the passenger dispatch volume of all stations within the year;
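The standardization can be illustrated with a short sketch. It assumes the population standard deviation (`statistics.pstdev`); the text does not say whether the population or the sample form is meant. The function name `z_scores` and the numbers are illustrative.

```python
from statistics import mean, pstdev

def z_scores(xs):
    """Standardize xs to zero mean and unit (population) standard deviation."""
    mu = mean(xs)
    sigma = pstdev(xs)  # assumption: population form, divides by len(xs)
    return [(x - mu) / sigma for x in xs]

# Hypothetical dispatch volumes with mean 5 and population std 2.
scores = z_scores([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

After this step a busy station and a quiet station contribute on the same scale to the Euclidean distances used in the clustering.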

S3) The time periods are divided with the affinity propagation (AP) clustering algorithm. AP clusters on the similarity matrix S built from the sample data points. Like other clustering algorithms, it seeks to minimize the distance between each data point and the representative point (exemplar) of its class, and thereby obtains the class division. The procedure comprises the following steps:

S301) Initialize the responsibility matrix r(i,k) and the availability matrix a(i,k) to 0;

S302) Compute the sample similarity matrix s(i,k) using the Euclidean distance as the measure, i.e. $s(i,k) = -\lVert x_i - x_k \rVert^2$, and set every diagonal element s(k,k) to the same preference value, the median $p$;

where N is the number of samples entering the computation of $p$, taken as 265 in this embodiment;

S303) Update the responsibility matrix r(i,k) and the availability matrix a(i,k).

In the standard affinity propagation form, the responsibility matrix r(i,k) is updated as:

$$r(i,k) \leftarrow s(i,k) - \max_{k' \neq k} \{ a(i,k') + s(i,k') \}$$

and the availability matrix a(i,k) is updated as:

$$a(i,k) \leftarrow \min\Big\{ 0,\ r(k,k) + \sum_{i' \notin \{i,k\}} \max\{0, r(i',k)\} \Big\} \ (i \neq k), \qquad a(k,k) \leftarrow \sum_{i' \neq k} \max\{0, r(i',k)\};$$

S304) Apply a damping factor λ to suppress numerical oscillation during the iteration:

$$r(i,k) = (1-\lambda)\, r_{new}(i,k) + \lambda\, r_{old}(i,k), \qquad a(i,k) = (1-\lambda)\, a_{new}(i,k) + \lambda\, a_{old}(i,k)$$

where $r_{new}(i,k)$ and $r_{old}(i,k)$ are the responsibility matrices obtained in the current and the previous update, and $a_{new}(i,k)$ and $a_{old}(i,k)$ are the availability matrices obtained in the current and the previous update. In this embodiment the damping factor is set to λ = 0.9;

S305) For the passenger flow sample of each time point, compute the sum of its responsibility and availability with respect to all samples, find the class-centre sample of each sample according to $k^* = \arg\max_k \{ a(i,k) + r(i,k) \}$, and then output the class assignment of all time points;
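Steps S301) to S305) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the message updates follow the textbook affinity propagation rules, the preference is taken as the median of the off-diagonal similarities, and the loop runs a fixed number of iterations instead of testing convergence. The function name `affinity_propagation` and the toy data are hypothetical.

```python
from statistics import median

def affinity_propagation(points, damping=0.9, max_iter=200):
    """Return, for each point, the index of the exemplar it selects."""
    n = len(points)
    # S302: similarity is the negative squared Euclidean distance.
    s = [[-sum((u - v) ** 2 for u, v in zip(points[i], points[k]))
          for k in range(n)] for i in range(n)]
    # Assumption: shared preference p = median of off-diagonal similarities.
    p = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        s[k][k] = p
    # S301: responsibilities r and availabilities a start at zero.
    r = [[0.0] * n for _ in range(n)]
    a = [[0.0] * n for _ in range(n)]
    for _ in range(max_iter):
        # S303 and S304: damped responsibility update.
        for i in range(n):
            for k in range(n):
                best = max(a[i][kk] + s[i][kk] for kk in range(n) if kk != k)
                r[i][k] = (1 - damping) * (s[i][k] - best) + damping * r[i][k]
        # S303 and S304: damped availability update.
        for k in range(n):
            pos = [max(0.0, r[ii][k]) for ii in range(n)]
            total = sum(pos) - pos[k]  # positive support from all i' != k
            for i in range(n):
                new = total if i == k else min(0.0, r[k][k] + total - pos[i])
                a[i][k] = (1 - damping) * new + damping * a[i][k]
    # S305: each sample picks the k that maximizes a(i,k) + r(i,k).
    return [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]

# Two well-separated groups of one-dimensional samples (hypothetical data).
samples = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
exemplars = affinity_propagation(samples)
```

The sketch costs O(N³) per iteration, which is acceptable for a few hundred daily samples; a vectorized library implementation would be the natural choice for larger data.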

S4) Because the AP algorithm of step S3) outputs a series of clustering results, the Calinski-Harabasz, Hartigan and In-Group Proportion indices are used to test the validity of each result. The results are shown in Fig. 2 and Fig. 3: for both the 2014 and the 2015 high-speed railway passenger flow data, the optimal number of sample clusters is 5. This is taken as the final clustering result, which is plotted in Fig. 4 and Fig. 5;
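As an illustration of one of these validity checks, the sketch below computes the Calinski-Harabasz index in its usual form, the between-class trace divided by (k − 1) over the within-class trace divided by (n − k). The function name `calinski_harabasz` and the toy data are assumptions made for this example.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index of a labelling; larger means better separated."""
    n, dim = len(points), len(points[0])
    clusters = {}
    for pt, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(pt)
    k = len(clusters)
    overall = [sum(pt[d] for pt in points) / n for d in range(dim)]
    tr_b = tr_w = 0.0
    for members in clusters.values():
        centre = [sum(pt[d] for pt in members) / len(members) for d in range(dim)]
        # Between-class scatter: cluster size times squared centre offset.
        tr_b += len(members) * sum((c - o) ** 2 for c, o in zip(centre, overall))
        # Within-class scatter: squared distance of members to their centre.
        tr_w += sum((pt[d] - centre[d]) ** 2 for pt in members for d in range(dim))
    return (tr_b / (k - 1)) / (tr_w / (n - k))

pts = [[0.0], [2.0], [10.0], [12.0]]
ch_good = calinski_harabasz(pts, [0, 0, 1, 1])  # natural grouping
ch_bad = calinski_harabasz(pts, [0, 1, 0, 1])   # interleaved grouping
```

Running the index over every clustering produced by AP and keeping the labelling with the largest value mirrors the selection rule described for this index.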

S5) All classes are traversed and the passenger flow samples of the time points in each class are compared pairwise. Non-consecutive time points within the same class are split apart, giving the operating period divisions for 2014 and 2015, whose structure is shown in Table 1;

Table 1 Operating period division results
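The splitting rule of S5) amounts to cutting the year wherever the class label changes from one day to the next. A minimal sketch, with the hypothetical function name `split_into_periods` and made-up labels:

```python
from itertools import groupby

def split_into_periods(labels):
    """Turn per-day class labels into (start_day, end_day, label) runs.

    labels[d] is the class of day d + 1; days are numbered from 1.
    """
    periods = []
    day = 1
    for label, run in groupby(labels):
        length = len(list(run))
        periods.append((day, day + length - 1, label))
        day += length
    return periods

# Class "A" occurs in two separate stretches, so it yields two periods.
day_labels = ["A", "A", "A", "B", "B", "A", "A", "C"]
periods = split_into_periods(day_labels)
```

This is why 5 classes can produce 13 operating periods: the same class may recur at several points in the year.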

As Table 1 shows, the passenger-flow-based division of high-speed railway periods yields 5 classes for both 2014 and 2015, and the 365 days of each year are divided into 13 operating periods. Operating periods 3, 6, 7, 8 and 12 have the same time span in 2014 and 2015, while the spans of the other operating periods differ. The difference comes from the shifting dates of the Spring Festival travel season: Spring Festival fell on January 31 in 2014 (day 31 of the year) and on February 19 in 2015 (day 50). In both years operating period 2 begins 7 days before Spring Festival. The seasons corresponding to the other operating periods are clear and can be summarized as follows:

Operating period 1 is the flat passenger flow period after New Year and before Spring Festival; operating periods 2 to 4 are the Spring Festival travel peak; operating period 5 is the flat period between the Spring Festival travel season and Qingming Festival; operating period 6 is the Qingming Festival peak; operating period 7 is the flat period between Qingming Festival and May Day; operating period 8 is the May Day peak; operating period 9 is the flat period between May Day and the summer travel season; operating period 10 is the summer travel peak; operating period 11 is the flat period between the summer travel season and the National Day holiday (October 1); operating period 12 is the National Day peak; operating period 13 is the flat period between the National Day holiday and New Year.

S6) Correction of the operating period division. Operating periods 3, 6 and 8 span only one or a few days. For the high-speed railway passenger transport department, a large-scale adjustment of the train operation plan to match demand in such short periods would disturb the existing transport plan too much and would consume excessive manpower and resources. Therefore, based on field experience, this embodiment merges the three operating periods shorter than 7 days with their adjacent operating periods. The corrected high-speed railway operating period division is shown in Table 2.

Table 2 Corrected operating period division results
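The correction in S6) can be sketched as a two-pass merge. The text does not say which neighbour absorbs a short period, so this sketch assumes it takes the label of the preceding period (or of the following one when it is the first), after which neighbouring periods with equal labels are fused. The function name `merge_short_periods`, the labels and the day ranges are illustrative.

```python
def merge_short_periods(periods, min_days=7):
    """Merge periods shorter than min_days into a neighbour.

    periods is an ordered, gap-free list of (start_day, end_day, label).
    """
    # Pass 1: relabel each short period with its neighbour's label.
    relabelled = []
    for idx, (start, end, label) in enumerate(periods):
        if end - start + 1 < min_days and len(periods) > 1:
            label = relabelled[-1][2] if idx > 0 else periods[1][2]
        relabelled.append((start, end, label))
    # Pass 2: fuse neighbouring periods that now share a label.
    merged = [relabelled[0]]
    for start, end, label in relabelled[1:]:
        if merged[-1][2] == label:
            merged[-1] = (merged[-1][0], end, label)
        else:
            merged.append((start, end, label))
    return merged

# A one-day holiday spike inside a peak season (hypothetical day numbers).
raw = [(1, 23, "calm"), (24, 37, "peak"), (38, 38, "holiday"),
       (39, 63, "peak"), (64, 94, "calm")]
corrected = merge_short_periods(raw)
```

Here the single-day period is absorbed and the two surrounding peak periods become one, which is the pattern by which 13 periods collapse to 7 in the embodiment.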

The corrected operating period division can be summarized as follows: operating period 1 is the flat period after New Year and before Spring Festival; operating period 2 is the Spring Festival travel peak; operating period 3 is the flat period between the Spring Festival travel season and the summer travel season; operating period 4 is the summer travel peak; operating period 5 is the flat period between the summer travel season and the National Day holiday; operating period 6 is the National Day peak; operating period 7 is the flat period between the National Day holiday and New Year. This division can serve as the basis for evaluating and adjusting the train operation plan: within each operating period, the fit between the forecast passenger demand and the train operation plan is evaluated, and if the result is unsatisfactory the current train operation plan is adjusted;

S7) Evaluation of adaptability to passenger demand. To show that a train operation plan drawn up from the operating period division fits passenger demand better, a train operation plan is compiled from the mean passenger flow of each period, and the adaptability indices between the plan and passenger demand are computed by simulation for each period. The results are compared with the adaptability obtained under the period division actually used in high-speed railway operation in 2014 and 2015, as shown in Table 3.

Table 3 Comparison with actual operating conditions

As Table 3 shows, with the number of large-scale adjustments of the train operation plan unchanged, the plan compiled from the affinity propagation period division fits passenger demand better. In 2014 the passenger demand satisfaction rate, the average train occupancy rate and the direct passenger rate increased by 7.6%, 16.7% and 14.1% respectively; in 2015 the same three indices increased by 5.7%, 18.4% and 14.4% respectively.

This embodiment uses survey data on the daily passenger dispatch volume of the stations along the line, clusters time points with similar passenger flow within the year using the affinity propagation algorithm, and determines the optimal number of clusters from the CH, Hartigan and IGP indices. On this basis, a method for dividing the annual operating periods of a high-speed railway is designed. The main conclusions are as follows:

(1) The operating period division method based on cluster analysis, checked against cluster validity indices, reflects passenger demand in different periods of the year more objectively and accurately. It overcomes the subjectivity, low efficiency and limited accuracy of manual division, and thus lays the foundation for adaptive adjustment of the train operation plan.

(2) A case study using passenger dispatch statistics from the stations along one high-speed railway shows that, by determining the best clustering result for the annual operating periods and then analysing that result manually, the whole year can be divided into reasonable operating periods.

The embodiments and description above only illustrate the principle and preferred embodiment of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention.

Claims (7)

1. A method for dividing high-speed railway passenger flow time periods based on affinity propagation clustering, comprising the following steps:
S1) dividing a year into T time points; letting any high-speed railway line X comprise n major stations; counting the passenger dispatch volume of each station in the up or down direction of line X within each time interval $t_k$; and constructing the passenger flow matrix $X_k = (x_k^1, x_k^2, \ldots, x_k^n)$,
where $X_k$ denotes the passenger dispatch volume of line X within time interval $t_k$, and $x_k^n$ denotes the passenger dispatch volume of the n-th station of line X within time interval $t_k$;
S2) judging with a threshold δ whether the passenger dispatch volume of each time point is abnormal, specifically by collecting the passenger dispatch vectors of the m time points adjacent to time point $t_l$ and computing their mean $\bar{X}$; if $X_l$ and $\bar{X}$ satisfy the threshold condition defined by δ,
the passenger dispatch data of time point $t_l$ are regarded as non-abnormal, and otherwise as abnormal;
where $X_l$ is the passenger dispatch volume at time point $t_l$;
if the data are abnormal, deleting the abnormal data and repairing the passenger dispatch volume by Lagrange interpolation over the passenger flow data of the m adjacent time points, calculated as $L_n(t) = \sum_{i=1}^{n} X_i\, l_i(t)$ with $l_i(t) = \prod_{j=1,\ j \neq i}^{n} \frac{t - t_j}{t_i - t_j}$,
where $L_n(t)$ is the Lagrange interpolation polynomial, $l_i(t)$ is the fitting (basis) polynomial associated with the i-th time point, t is the time point to be fitted, $t_j$ is the j-th time point, $t_i$ is the i-th time point, and $X_i$ is the passenger flow matrix of the i-th time point;
merging the repaired data with the original normal data, and then removing scale differences between variables by standard-deviation standardization, calculated as $Z\text{-}score = \frac{x - \bar{x}}{\sigma}$,
where Z-score is the standardized value, x is the station passenger dispatch volume at a given time point, $\bar{x}$ is the mean passenger dispatch volume of all stations within the year, and σ is the standard deviation of the passenger dispatch volume of all stations within the year;
S3) dividing the time periods based on the affinity propagation clustering algorithm: the passenger dispatch volume data of the T time points in the data set are all treated as candidate class representatives, and the similarity s(i,k) between the passenger dispatch volumes of any two time intervals is evaluated; the similarity s(i,k) expresses how similar the passenger dispatch volume sample $X_k$ of time interval k is to the sample $X_i$ of time interval i, i.e. the degree to which sample $X_k$ is suited to be the class representative of sample $X_i$; at initialization all samples are assumed equally likely to be class representatives, i.e. all s(k,k) are set to the same preference median p; the similarity of any two samples is calculated as:
$s(i,k) = -\lVert x_i - x_k \rVert^2$
where $x_i$ denotes the passenger dispatch volume sample at time i and $x_k$ denotes the passenger dispatch volume sample at time k;
defining a responsibility matrix r and an availability matrix a, where r(i,k) points from sample $x_i$ to sample $x_k$ and indicates how well suited sample $x_k$ is to serve as the class representative of $x_i$, and a(i,k) points from sample $x_k$ to sample $x_i$ and indicates how appropriate it is for $x_i$ to choose $x_k$ as its class representative; for any sample $x_i$, computing the sum of the responsibility r(i,k) and the availability a(i,k) with respect to the passenger dispatch volumes of the other time intervals, taking the sample $x_k$ with the largest sum as the class representative, and outputting the class assignment of all time points;
specifically comprising the following steps:
S301) setting the initial values of the responsibility matrix r(i,k) and the availability matrix a(i,k) to 0;
S302) computing the similarity matrix s(i,k) of the passenger dispatch volume samples of any time periods, using the Euclidean distance as the measure, i.e. $s(i,k) = -\lVert x_i - x_k \rVert^2$;
setting every diagonal element s(k,k) to the same preference median p,
where N is the number of samples entering the computation of p;
S303) updating the responsibility matrix r(i,k) and the availability matrix a(i,k), where, in the standard affinity propagation form, the responsibility matrix is updated as $r(i,k) \leftarrow s(i,k) - \max_{k' \neq k} \{ a(i,k') + s(i,k') \}$,
and the availability matrix is updated as $a(i,k) \leftarrow \min\{ 0,\ r(k,k) + \sum_{i' \notin \{i,k\}} \max\{0, r(i',k)\} \}$ for $i \neq k$, with $a(k,k) \leftarrow \sum_{i' \neq k} \max\{0, r(i',k)\}$;
S304) applying a damping factor λ to suppress numerical oscillation during the iteration, i.e. $r(i,k) = (1-\lambda)\, r_{new}(i,k) + \lambda\, r_{old}(i,k)$ and $a(i,k) = (1-\lambda)\, a_{new}(i,k) + \lambda\, a_{old}(i,k)$,
where $r_{new}(i,k)$ and $r_{old}(i,k)$ are the responsibility matrices obtained in the current and the previous update, $a_{new}(i,k)$ and $a_{old}(i,k)$ are the availability matrices obtained in the current and the previous update, and λ ∈ (0,1) is the damping factor;
S305) for any passenger dispatch volume sample, computing the sum of its responsibility and availability with respect to all passenger dispatch volume samples, and finding the class-centre sample of each sample according to $k^* = \arg\max_k \{ a(i,k) + r(i,k) \}$;
S306) updating the current iteration count, n ← n + 1, and judging whether the message-passing iteration has reached the set maximum number of iterations $N_{max}$, i.e. whether $n \geq N_{max}$; if so, terminating the algorithm and outputting the class assignment of all time points, and otherwise returning to step S302);
S4) computing the Calinski-Harabasz, Hartigan and In-Group Proportion indices for each of the different time point class divisions, and selecting the optimal number of time point classes and the corresponding class division;
S5) checking and correcting the operating period division: traversing all classes and comparing the samples pairwise; if the time points of two samples are adjacent, they are merged into one operating period, and otherwise they are treated as belonging to different operating periods;
S6) evaluating adaptability to passenger demand: after the operating periods have been divided, compiling a train operation plan from the mean passenger demand of each period and simulating the passenger flow assignment, and introducing three indices, the passenger demand satisfaction rate, the average train occupancy rate and the direct passenger rate, to quantitatively evaluate and summarize how well the train operation plan fits the passenger demand of each period.
2. The method for dividing high-speed railway passenger flow time periods based on affinity propagation clustering according to claim 1, characterized in that: in step S4), the Calinski-Harabasz index is a measure based on the within-class and between-class scatter matrices of all samples, and the class number at which it reaches its maximum is taken as the optimal number of clusters, i.e. $CH(k) = \frac{trB(k)/(k-1)}{trW(k)/(n-k)}$,
where k is the number of clusters, trB(k) is the trace of the between-class scatter matrix, trW(k) is the trace of the within-class scatter matrix, and n is the number of time point samples.
3. The method for dividing high-speed railway passenger flow time periods based on affinity propagation clustering according to claim 1, characterized in that: in step S4), the Hartigan index can also be used when the number of clusters is 1, and the smallest class number satisfying Ha ≤ 10 is taken as the optimal number of clusters, i.e. $Ha(k) = \left( \frac{trW(k)}{trW(k+1)} - 1 \right)(n - k - 1)$,
where k is the number of clusters, trW(k) is the trace of the within-class scatter matrix, and n is the number of time point samples.
4. The method for dividing high-speed railway passenger flow time periods based on affinity propagation clustering according to claim 1, characterized in that: in step S4), the In-Group Proportion (IGP) index measures whether the nearest neighbour of each sample in a class belongs to the same class; the larger the average IGP over all clusters, the better the clustering, and the class number at the maximum is the optimal number of clusters, i.e. $IGP(u) = \frac{\#\{ j \mid Class(j) = Class(j_N) = u \}}{\#\{ j \mid Class(j) = u \}}$,
where u is the label of a cluster, Class(j) is the label of sample j, $j_N$ is the sample nearest to sample j, and # denotes the number of samples satisfying the condition.
5. The method for dividing high-speed railway passenger flow time periods based on affinity propagation clustering according to claim 1, characterized in that: in step S6), the passenger demand satisfaction rate expresses, for each passenger flow direction (OD pair) of the high-speed railway and the related network, how far the passenger carrying capacity provided by the train operation plan meets passenger demand; under the constraints of transport capacity resources, in particular train seating capacity, it is the ratio of the passenger volume that can obtain effective transport service under the given train operation plan to the total passenger demand.
In the calculation formula, $q'_w$ is the total passenger volume carried by the high-speed railway for passenger flow OD pair w, and w indexes the passenger flow OD pairs of the network.
6. The method for dividing high-speed railway passenger flow time periods based on affinity propagation clustering according to claim 1, characterized in that: in step S6), the average train occupancy rate is the mean of the occupancy rates of all trains within the evaluation scope, where the occupancy rate of a train is the weighted ratio of the passenger volume it carries over its running sections to the total number of seats it provides; the index reflects how passengers of different OD pairs choose among the various types of high-speed trains.
In the calculation formula, the passenger volume carried by train h on section (i, j) is used together with $A_h$, the seating capacity of train h, and $E_h$, the number of sections over which train h runs.
7. The method for dividing high-speed railway passenger flow time periods based on affinity propagation clustering according to claim 1, characterized in that: in step S6), the passenger demand structure consists of different demand directions, each of which may be served by direct or transfer itineraries to the destination; the direct passenger rate is, under the given train operation plan and passenger demand structure, the ratio of the passenger volume that reaches the destination without transfer between each pair of demand points to the total passenger volume in that direction.
In the calculation formula, w indexes the passenger flow OD pairs of the network, |e| is the number of transfers of the passenger flow of a given OD pair, and the two passenger-count terms are the number of passengers of OD pair w who reach the destination directly without transfer and the number who reach it with |e| transfers.
CN201910307332.7A 2019-04-17 2019-04-17 High-speed railway passenger flow time interval division method based on neighbor propagation clustering Expired - Fee Related CN110119884B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910307332.7A CN110119884B (en) 2019-04-17 2019-04-17 High-speed railway passenger flow time interval division method based on neighbor propagation clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910307332.7A CN110119884B (en) 2019-04-17 2019-04-17 High-speed railway passenger flow time interval division method based on neighbor propagation clustering

Publications (2)

Publication Number Publication Date
CN110119884A true CN110119884A (en) 2019-08-13
CN110119884B CN110119884B (en) 2022-09-13

Family

ID=67521058

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910307332.7A Expired - Fee Related CN110119884B (en) 2019-04-17 2019-04-17 High-speed railway passenger flow time interval division method based on neighbor propagation clustering

Country Status (1)

Country Link
CN (1) CN110119884B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105857350A (en) * 2016-03-17 2016-08-17 中南大学 High-speed rail train driving method based on section profile passenger flow
WO2017063356A1 (en) * 2015-10-14 2017-04-20 深圳市天行家科技有限公司 Designated-driving order predicting method and designated-driving transport capacity scheduling method
CN107145985A (en) * 2017-05-09 2017-09-08 北京城建设计发展集团股份有限公司 A kind of urban track traffic for passenger flow Regional Linking method for early warning
JP2018503920A (en) * 2015-01-27 2018-02-08 ベイジン ディディ インフィニティ テクノロジー アンド ディベロップメント カンパニー リミティッド Method and system for providing on-demand service information
CN108665083A (en) * 2017-03-31 2018-10-16 江苏瑞丰信息技术股份有限公司 A kind of method and system for advertisement recommendation for dynamic trajectory model of being drawn a portrait based on user
CN108805344A (en) * 2018-05-29 2018-11-13 五邑大学 A kind of high-speed railway network train running scheme optimization method considering time-dependent demand

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘吉华等: "不同轴重下轮轨损伤行为研究", 《五邑大学学报(自然科学版)》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181955A (en) * 2020-09-01 2021-01-05 西南交通大学 A data normative governance method for information sharing of heavy-haul railway comprehensive big data platform
CN112181955B (en) * 2020-09-01 2022-12-09 西南交通大学 A data standard governance method for information sharing of heavy-haul railway comprehensive big data platform
CN112749836A (en) * 2020-12-22 2021-05-04 蓝海(福建)信息科技有限公司 Customized passenger intelligent transportation capacity allocation method based on passenger flow time sequence
CN113902011A (en) * 2021-10-08 2022-01-07 南威软件股份有限公司 Urban rail transit short-time passenger flow prediction method based on cyclic neural network
CN114020808A (en) * 2021-11-01 2022-02-08 卡斯柯信号(郑州)有限公司 Calculation method of urban rail transit driving scheme based on multi-day passenger flow integration
CN116757390A (en) * 2023-05-05 2023-09-15 北京蔚行科技有限公司 Bus operation period division method based on time series clustering
CN116757390B (en) * 2023-05-05 2024-08-27 北京蔚行科技有限公司 Bus operation period division method based on time sequence clustering

Also Published As

Publication number Publication date
CN110119884B (en) 2022-09-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20220913)