CN110119884B

CN110119884B - High-speed railway passenger flow time interval division method based on neighbor propagation clustering

Info

Publication number: CN110119884B
Application number: CN201910307332.7A
Authority: CN
Inventors: 王文宪; 肖蒙; 翟玉江; 林群煦
Original assignee: Wuyi University
Current assignee: Wuyi University
Priority date: 2019-04-17
Filing date: 2019-04-17
Publication date: 2022-09-13
Anticipated expiration: 2039-04-17
Also published as: CN110119884A

Abstract

The invention provides a high-speed railway passenger flow time interval division method based on neighbor propagation (affinity propagation) clustering. The method divides the statistical period into a plurality of time points, counts the passenger flow data at each time point, and constructs a time sample sequence of preprocessed sample variables. The sample sequence is then partitioned with the neighbor propagation clustering algorithm. Finally, the optimal clustering result is determined with clustering validity indexes such as CH, Hartigan and IGP, which yields the annual operation time interval division result. The method objectively and accurately reflects passenger flow demand in different time intervals of the year and overcomes the strong subjectivity, low efficiency and low precision of manual division, thereby laying a foundation for the adaptive adjustment of the train operation scheme.

Description

High-speed railway passenger flow time interval division method based on neighbor propagation clustering

Technical Field

The invention relates to the technical field of high-speed railways, in particular to a high-speed railway passenger flow time interval division method based on neighbor propagation clustering.

Background

As the "eight vertical and eight horizontal" high-speed railway network and the intercity railway network are gradually completed, more and more medium- and long-distance passengers choose the high-speed railway as their preferred mode of travel. The train operation scheme, which specifies the number of passenger trains operated, their running sections, their stops and so on, is an important factor influencing passenger service quality. To improve the service quality for passenger flow in each direction of the network while keeping train operating costs as low as possible, the high-speed railway passenger transport management department needs to adjust the train operation scheme adaptively according to the fluctuation of passenger flow over the year, so as to meet passenger flow demand in different time intervals of the year.

The distribution of passenger flow over trains in the railway network is an important basis for evaluating how well a passenger train operation scheme performs. A scheme already in operation is usually evaluated with the actual passenger flow distribution, but for a scheme still being designed and optimized, the distribution of passenger flow over trains can only be generated by a passenger flow assignment method. Because the efficiency of the assignment and the reasonableness of its result directly affect how well the operation scheme can be optimized, train passenger flow assignment is one of the important basic research topics in optimizing railway passenger train operation schemes.

However, adjusting the train operation scheme involves many factors and is a large and complex systems engineering task, so the number of adjustments per year is limited. A feasible strategy is to divide the operating year of the high-speed railway into time intervals according to the fluctuation characteristics of passenger flow, and then adjust the train operation scheme according to the passenger flow of each interval. A scientific and reasonable division of operation periods is therefore a basic premise and an important basis for adjusting the train operation scheme, and an important guarantee that the adjustment matches the dynamic passenger flow demand. The existing method divides the operating year into the Spring Festival travel period, the summer travel period, holiday periods and off-peak periods according to the change in the total passenger flow counted on the target line over the year. Although this reflects the difference in passenger volume between periods, the division result depends largely on the experience of field engineers and technicians; it is highly subjective, easily produces unreasonable divisions, and can hardly reflect accurately the passenger flow demand that varies seasonally within the year.

No relevant research on the division of high-speed railway operation periods has been found at home or abroad. The problem is similar in nature to the division of traffic periods in time-of-day (TOD) multi-period signal control design at road intersections. For the TOD problem, scholars at home and abroad have carried out some research: a one-day cumulative traffic curve is drawn for a representative intersection, and the time nodes at which the curve changes markedly are chosen as period division points by manual experience. Early train passenger flow assignment methods were rarely studied on their own and were usually applied within research on optimizing train operation schemes. Such studies construct a passenger transfer network based on a given train operation scheme, design a generalized travel cost for passengers (including fare, travel time, congestion effects and the like), establish a static user equilibrium or stochastic user equilibrium assignment model, and assign the flow to train running sections in equilibrium (see the research on passenger train operation schemes in the Journal of the China Railway Society, 2004, 26(2): 16-20). Passenger flow assignment in urban public transit is very similar to that of high-speed railways, and there is a large body of research in that field. That research mainly considers capacity constraints and space-time priority in the assignment process and does not analyze passengers' ticket purchasing behavior, although ticket purchasing has an important influence on the travel choices of high-speed railway passengers. A passenger flow assignment method suited to the high-speed railway transportation network therefore needs to be designed.

Disclosure of Invention

In view of the defects of the prior art, the invention provides a high-speed railway passenger flow time interval division method based on neighbor propagation clustering, which divides the whole year reasonably according to the variation characteristics of passenger flow in each direction on a high-speed railway line and improves the match between the train operation scheme and passenger flow demand.

The technical scheme of the invention is as follows: a high-speed railway passenger flow time interval division method based on neighbor propagation clustering comprises the following steps:

S1) Divide one year into T time points. Let the high-speed railway line X comprise n important stations, and count the passenger flow sending volume of each station within each time interval $t_k$ in the up or down direction of line X, i.e.

$$X = [X_1, X_2, \ldots, X_T]^{\mathsf T}, \qquad X_k = [x_k^1, x_k^2, \ldots, x_k^n]$$

where $X_k$ represents the passenger flow sending volume of high-speed railway line X in time interval $t_k$, and $x_k^n$ represents the passenger flow sending volume of the nth station of line X in time interval $t_k$;

S2) Use a threshold δ to judge whether the passenger flow sending volume at each time point is abnormal. Specifically, take the passenger flow sending vectors of the m time points adjacent to time point $t_l$ and calculate their mean value

If

the passenger flow sending volume data at time point $t_l$ are considered normal; otherwise they are considered abnormal;

where $X_l$ is the passenger flow sending volume at time point $t_l$;

If the data are abnormal, delete them and repair the passenger flow sending volume by fitting the passenger flow data of the m time points adjacent to that time point, using:

$$L_n(t) = \sum_{i} X_i\, l_i(t), \qquad l_i(t) = \prod_{j \neq i} \frac{t - t_j}{t_i - t_j}$$

where $l_k(t)$ is a fitting polynomial of order k-1, $L_n(t)$ is the Lagrange interpolation polynomial, t is the time point to be fitted, $t_j$ is the jth time point, $t_i$ is the ith time point, $X_i$ is the passenger flow matrix at the ith time point, $l_i(t)$ is a fitting polynomial of degree i-1, and i and j range over the adjacent time points used for the fit;

Merge the repaired data with the original normal data, then standardize by the standard deviation (Z-score) to eliminate scale differences between variables, using:

$$Z\text{-score} = \frac{x - \bar{x}}{\sigma}$$

where Z-score is the standardized value, x is the passenger sending volume of a station at a given time point, $\bar{x}$ is the mean passenger sending volume of all stations over the year, and σ is the standard deviation of the passenger sending volume of all stations over the year;
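The preprocessing in step S2 can be sketched as follows. This is only an illustration under stated assumptions: the abnormality criterion (relative deviation of the daily total from the mean of the m neighbouring days exceeding δ), the default values of `m` and `delta`, and the function names are all hypothetical, since the criterion formula is not reproduced in this text.

```python
import numpy as np

def lagrange_interpolate(t, ts, xs):
    """Evaluate at time t the Lagrange polynomial through the points (ts[i], xs[i])."""
    ts = np.asarray(ts, dtype=float)
    xs = np.asarray(xs, dtype=float)
    value = np.zeros(xs.shape[1:])
    for i in range(len(ts)):
        others = np.delete(ts, i)
        # Basis polynomial l_i(t) = prod_{j != i} (t - t_j) / (t_i - t_j)
        basis = np.prod((t - others) / (ts[i] - others))
        value = value + xs[i] * basis
    return value

def repair_and_standardize(X, m=4, delta=0.5):
    """X: (T, n) array of sending volumes (T time points, n stations).

    Returns (Z, repaired, abnormal): standardized data, repaired raw data,
    and a boolean mask of the time points judged abnormal.
    """
    X = np.asarray(X, dtype=float)
    T = X.shape[0]
    repaired = X.copy()
    abnormal = np.zeros(T, dtype=bool)
    half = m // 2
    for l in range(T):
        # Window of m + 1 consecutive points containing l, clipped at the ends
        lo = max(0, min(l - half, T - m - 1))
        idx = [j for j in range(lo, lo + m + 1) if j != l]
        mean = X[idx].mean(axis=0)
        # ASSUMED criterion: relative deviation of the daily total above delta
        if abs(X[l].sum() - mean.sum()) > delta * mean.sum():
            abnormal[l] = True
            repaired[l] = lagrange_interpolate(l, idx, X[idx])
    # Z-score over all stations and all time points, as in the text
    Z = (repaired - repaired.mean()) / repaired.std()
    return Z, repaired, abnormal
```

Because the neighbour mean itself includes any outlier in the window, a very large spike can also cause its neighbours to be flagged; a robust statistic such as the median would avoid this, but the text specifies the mean.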

S3) Divide the time intervals with the neighbor propagation clustering algorithm. The passenger flow sending volume data of all T time points are taken as candidate class representatives, and the similarity s(i, k) between the passenger flow sending volumes of any two time intervals is evaluated. Here s(i, k) represents the similarity between the passenger flow sending volume sample $X_k$ of interval k and the sample $X_i$ of interval i, i.e. how well sample $X_k$ is suited to be the class representative of sample $X_i$. When the algorithm is initialized, all samples are assumed to be equally likely to become class representatives, i.e. all s(k, k) are set to the same preference value p (the median attraction value). The similarity of any two samples is calculated as:

$s(i,k) = -\|x_i - x_k\|^2$;

where $x_i$ is the passenger flow sending volume sample of interval i and $x_k$ is the passenger flow sending volume sample of interval k;

Define a credibility (responsibility) matrix r and an availability matrix a. The credibility r(i, k) points from sample $x_i$ to sample $x_k$ and represents how well $x_k$ is suited to be the class representative of $x_i$; the availability a(i, k) points from sample $x_k$ to sample $x_i$ and represents how appropriate it is for $x_i$ to select $x_k$ as its class representative. For any sample $x_i$, calculate the sum of the credibility r(i, k) and the availability a(i, k) with respect to the passenger flow sending volume samples of the other time intervals; the sample $x_k$ for which this sum is largest is the class representative, and the classification results of all time points are output;

the method specifically comprises the following steps:

S301) Set the initial values of the credibility matrix r(i, k) and the availability matrix a(i, k) to 0;

S302) Calculate the similarity matrix s(i, k) of the passenger flow sending volume samples of any two time intervals, using the negative squared Euclidean distance as the measure, i.e. $s(i,k) = -\|x_i - x_k\|^2$;

Set the diagonal elements s(k, k) to the same preference value (the median attraction value), i.e.

where N is the number of samples;

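S303) Update the credibility matrix r(i, k) and the availability matrix a(i, k). The credibility matrix r(i, k) is updated by:

$$r(i,k) \leftarrow s(i,k) - \max_{k' \neq k}\{a(i,k') + s(i,k')\}$$

The availability matrix a(i, k) is updated by:

$$a(i,k) \leftarrow \min\Big\{0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\{0, r(i',k)\}\Big\} \ (i \neq k), \qquad a(k,k) \leftarrow \sum_{i' \neq k} \max\{0, r(i',k)\}$$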

S304) Set a damping factor λ to eliminate numerical oscillation during the iteration, i.e.

$$r_{new}(i,k) \leftarrow \lambda\, r_{old}(i,k) + (1-\lambda)\, r_{new}(i,k), \qquad a_{new}(i,k) \leftarrow \lambda\, a_{old}(i,k) + (1-\lambda)\, a_{new}(i,k)$$

where $r_{new}(i,k)$ and $r_{old}(i,k)$ are the credibility matrices obtained from the current and the previous update, respectively; $a_{new}(i,k)$ and $a_{old}(i,k)$ are the availability matrices obtained from the current and the previous update, respectively; and λ ∈ (0, 1) is the damping factor;

S305) For any passenger flow sending volume sample, calculate the sum of its credibility and availability with respect to all passenger flow sending volume samples, and according to

$$\arg\max_{k}\,\{a(i,k) + r(i,k)\}$$

find the class center sample of each sample;

S306) Update the current iteration count N ← N + 1 and check whether the message-passing iteration has reached the preset maximum number of iterations $N_{max}$ (the iteration continues while N ≤ $N_{max}$). If the maximum has been reached, terminate the algorithm and output the class division results of all time points; if not, return to step S302);
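Steps S301 to S306 correspond to the standard affinity propagation message-passing scheme, which can be sketched in NumPy as below. This is a minimal illustration rather than the patented implementation: the function name, the fixed iteration count without a convergence check, and the choice of the median off-diagonal similarity as the preference are assumptions.

```python
import numpy as np

def affinity_propagation(X, damping=0.9, max_iter=200, preference=None):
    """Return, for each sample, the index of its class representative (exemplar)."""
    X = np.asarray(X, dtype=float)
    N = X.shape[0]
    # S302: similarity s(i, k) = -||x_i - x_k||^2
    diff = X[:, None, :] - X[None, :, :]
    S = -np.sum(diff ** 2, axis=2)
    if preference is None:
        # ASSUMPTION: preference p = median of the off-diagonal similarities
        preference = np.median(S[~np.eye(N, dtype=bool)])
    np.fill_diagonal(S, preference)
    # S301: credibility (responsibility) and availability start at 0
    R = np.zeros((N, N))
    A = np.zeros((N, N))
    rows = np.arange(N)
    for _ in range(max_iter):
        # S303: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        # S304: damping
        R = damping * R + (1 - damping) * R_new
        # S303: a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, np.diag(R))      # keep r(k,k) unclipped
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = np.diag(A_new).copy()          # a(k,k) = sum_{i' != k} max(0, r(i',k))
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new
    # S305: the exemplar of sample i maximizes a(i,k) + r(i,k)
    return np.argmax(A + R, axis=1)
```

Samples that share the same exemplar index form one class; `np.unique(exemplars, return_inverse=True)` converts the exemplar indices into consecutive class labels. Varying `preference` changes the number of classes produced, which is one way to obtain the series of clustering results compared in step S4.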

S4) Calculate the Calinski-Harabasz, Hartigan and In-Group Proportion (IGP) indexes for the class division results with different numbers of time point classes, and select the optimal number of classes and the corresponding class division result;

S5) Check and correct the operation period division result: traverse all classes and compare the samples pairwise; if the time points of two samples in the same class are adjacent, merge them into one operation period, otherwise assign them to different operation periods;

S6) Evaluate the adaptability to passenger flow demand: after the operation periods have been divided, compile a train operation scheme according to the mean passenger flow demand of each period, carry out a passenger flow assignment simulation, and introduce three indexes (passenger flow demand satisfaction rate, average train occupancy rate and direct passenger flow rate) to quantify and summarize how well the train operation scheme of each period matches passenger flow demand.
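Step S5 amounts to splitting the class labels, taken in calendar order, into runs of consecutive time points. A minimal sketch, assuming the labels are given as a day-ordered sequence and using a hypothetical `(first_day, last_day, label)` tuple to describe each operation period:

```python
from itertools import groupby

def split_into_periods(labels):
    """Turn a day-ordered label sequence into (first_day, last_day, label) runs.

    Days are numbered from 1. Consecutive days with the same class label form
    one operation period; a class that reappears later starts a new period.
    """
    periods = []
    day = 1
    for label, run in groupby(labels):
        length = len(list(run))
        periods.append((day, day + length - 1, label))
        day += length
    return periods
```

For example, `split_into_periods([0, 0, 1, 1, 1, 0, 2])` yields four periods, because class 0 occurs in two non-adjacent runs.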

Furthermore, the Calinski-Harabasz index is a measure based on the within-class dispersion matrix and the between-class dispersion matrix of all samples, and the number of classes at which it reaches its maximum is taken as the optimal number of clusters, i.e.

$$CH(k) = \frac{trB(k)/(k-1)}{trW(k)/(n-k)}$$

where k is the number of clusters, trB(k) is the trace of the between-class dispersion matrix, trW(k) is the trace of the within-class dispersion matrix, and n is the number of time point samples.
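As an illustration, the Calinski-Harabasz index can be computed directly from the two traces. The function names below are hypothetical, and the sketch assumes at least two classes and fewer classes than samples.

```python
import numpy as np

def scatter_traces(X, labels):
    """Return (trB, trW): traces of the between- and within-class dispersion matrices."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    overall = X.mean(axis=0)
    trB = 0.0
    trW = 0.0
    for u in np.unique(labels):
        members = X[labels == u]
        center = members.mean(axis=0)
        trW += np.sum((members - center) ** 2)
        trB += len(members) * np.sum((center - overall) ** 2)
    return trB, trW

def calinski_harabasz(X, labels):
    """CH(k) = [trB(k) / (k - 1)] / [trW(k) / (n - k)]."""
    n = len(X)
    k = len(np.unique(labels))
    trB, trW = scatter_traces(X, labels)
    return (trB / (k - 1)) / (trW / (n - k))
```

The candidate clustering with the largest `calinski_harabasz` value would be preferred under this index.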

Further, the Hartigan index can also be used when the number of clusters is 1, and the smallest number of clusters satisfying H(k) ≤ 10 is taken as the optimal number of clusters, i.e.

$$H(k) = \left(\frac{trW(k)}{trW(k+1)} - 1\right)(n - k - 1)$$

where k is the total number of time point classes in the clustering result, trW(k) is the trace of the within-class dispersion matrix, and n is the number of time point samples.
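The selection rule for the Hartigan index can be sketched as follows. The dictionary `partitions`, mapping each candidate class count k to a label array, is a hypothetical input format chosen for the illustration.

```python
import numpy as np

def within_trace(X, labels):
    """trW: sum of squared distances of samples to their class centers."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for u in np.unique(labels):
        members = X[labels == u]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return total

def hartigan(X, labels_k, labels_k1):
    """H(k) = (trW(k) / trW(k+1) - 1) * (n - k - 1)."""
    n = len(X)
    k = len(np.unique(labels_k))
    return (within_trace(X, labels_k) / within_trace(X, labels_k1) - 1.0) * (n - k - 1)

def optimal_k_hartigan(X, partitions, threshold=10.0):
    """Smallest k with H(k) <= threshold, or None if no candidate qualifies."""
    for k in sorted(partitions):
        if k + 1 in partitions and hartigan(X, partitions[k], partitions[k + 1]) <= threshold:
            return k
    return None
```

Evaluating H(k) requires the clustering with k + 1 classes as well, so the largest candidate k cannot be tested by itself.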

Further, the In-Group Proportion (IGP) index measures whether the sample closest to each sample of a class belongs to the same class. The larger the average IGP over all clusters, the better the clustering quality, and the number of classes at which the average reaches its maximum is the optimal number of clusters, i.e.

$$igp(u) = \frac{\#\{\, j \mid class(j) = class(j^N) = u \,\}}{\#\{\, j \mid class(j) = u \,\}}$$

where u is the class label of a cluster, class(j) is the class label of sample j, $j^N$ is the sample closest to sample j, and # denotes the number of samples satisfying the condition.
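A short sketch of the IGP computation, assuming squared Euclidean distance for the nearest-neighbour search (the distance measure is not stated for this index) and a hypothetical function name:

```python
import numpy as np

def in_group_proportion(X, labels):
    """Return {class label: share of its samples whose nearest neighbour is in the same class}."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    d = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(d, np.inf)          # a sample is not its own neighbour
    nearest = np.argmin(d, axis=1)       # index of j^N for every sample j
    same = labels[nearest] == labels
    return {u.item(): float(same[labels == u].mean()) for u in np.unique(labels)}
```

The average of the returned values over all classes gives the overall score used to compare clusterings with different numbers of classes.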

Further, the passenger flow demand satisfaction rate reflects how well the passenger capacity provided by the train operation scheme satisfies the demand of each passenger flow direction on the high-speed railway and the related network. It is expressed as the ratio of the passenger volume effectively transported under the given train operation scheme, subject to transport capacity constraints (in particular the rated passenger capacity of trains), to the total passenger flow demand, and is calculated as:

where $q'_w$ is the total passenger volume transported by the high-speed railway for passenger flow OD pair w, and w indexes the passenger flow directions (OD pairs) of the network.

Further, the average train occupancy rate is the mean of the occupancy rates of all trains within the evaluation range, where the occupancy rate of a train is the weighted ratio of the passenger volume it carries over its running sections to the total number of seats it provides. The index reflects how passengers of different OD pairs choose among the various types of high-speed trains, and is calculated as:

where

is the passenger volume of train h in section (i, j), $A_h$ is the rated passenger capacity of train h, and $E_h$ is the number of sections over which train h runs.

Further, the passenger flow demand structure is composed of different demand directions, and each direction reaches its destination by a direct or a transfer itinerary. The direct passenger flow rate is the ratio of the passenger volume that reaches its destination directly, without transfer, between the passenger flow demand point pairs to the total passenger volume transported, under the given train operation scheme and passenger flow demand structure, and is calculated as:

where w indexes the passenger flow directions (OD pairs) of the network, |e| is the number of transfers of the passenger flow of a given direction,

is the number of passengers of OD pair w who reach their destination directly without transfer, and

is the number of passengers of OD pair w who reach their destination with |e| transfers.
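The three evaluation indexes can be illustrated with the sketch below. The formulas themselves are not reproduced in this text, so the code follows only the verbal definitions; the equal weighting of sections and trains, the dictionary-based inputs, and all names are assumptions.

```python
def demand_satisfaction_rate(served, demand):
    """Transported passengers over total demand, summed over OD pairs w."""
    return sum(served.values()) / sum(demand.values())

def average_occupancy_rate(section_loads, capacity):
    """Mean over trains h of sum(load per section) / (A_h * E_h).

    section_loads[h] lists the passengers on train h in each running section;
    capacity[h] is the rated passenger capacity A_h.
    ASSUMPTION: every section and every train carries equal weight.
    """
    rates = [sum(loads) / (capacity[h] * len(loads))
             for h, loads in section_loads.items()]
    return sum(rates) / len(rates)

def direct_rate(direct, transfer):
    """Direct passengers over all transported passengers.

    direct[w] is the number of passengers of OD pair w travelling without transfer;
    transfer[w][e] is the number travelling with e transfers.
    """
    total_direct = sum(direct.values())
    total_transfer = sum(sum(by_count.values()) for by_count in transfer.values())
    return total_direct / (total_direct + total_transfer)
```

With purely illustrative numbers, `demand_satisfaction_rate({"A-B": 80, "A-C": 40}, {"A-B": 100, "A-C": 50})` gives a satisfaction rate of about 0.8.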

The invention has the beneficial effects that:

1. The method combines the passenger flow data of the stations along the line at each time point, uses the neighbor propagation algorithm to cluster and merge time points with similar passenger flow over the year, and determines the optimal number of clusters according to the CH, Hartigan and IGP indexes, which improves the accuracy of the classification;

2. The cluster-analysis-based method for dividing high-speed railway operation periods, combined with verification by cluster validity indexes, objectively and accurately reflects passenger flow demand in different time intervals of the year and overcomes the strong subjectivity, low efficiency and low precision of manual division, thereby laying a foundation for the adaptive adjustment of the train operation scheme;

3. After the optimal clustering result is determined, it is analyzed manually, and the accuracy of the plan is ensured by checking and correcting the operation period division result. In addition, the adaptability to passenger flow demand is evaluated with the passenger flow demand satisfaction rate, the average train occupancy rate and the direct passenger flow rate, which further improves the match between passenger flow demand and the train operation scheme in each period.

4. The invention also preprocesses the collected data, deletes the abnormal data, and adopts a Lagrange interpolation method to carry out fitting repair on the abnormal data, thereby ensuring the usability of the data and further ensuring the reliability of the planning result.

Drawings

FIG. 1 is a flow chart of high-speed railway operation period division based on neighbor propagation clustering;

FIG. 2 is a schematic diagram of the validity index values for different numbers of clusters in the 2014 operation period division;

FIG. 3 is a schematic diagram of the validity index values for different numbers of clusters in the 2015 operation period division;

FIG. 4 is a schematic diagram of the clustering result for the 2014 operation periods;

FIG. 5 is a schematic diagram of the clustering result for the 2015 operation periods.

Detailed Description

The following further describes embodiments of the present invention with reference to the accompanying drawings:

As shown in FIG. 1, this embodiment provides a high-speed railway passenger flow time interval division method based on neighbor propagation clustering. For ease of understanding, the embodiment uses as data the passenger flow sending volumes of a high-speed railway in normal operation from January 1, 2014 to December 31, 2014 and from January 1, 2015 to December 31, 2015. The railway has 9 stations. The method specifically comprises the following steps:

S1) Divide the periods from January 1 to December 31, 2014 and from January 1 to December 31, 2015 into 365 time points each, i.e. one time point per day, and count the passenger flow in the down direction of the railway at each time point, i.e.

which respectively describe the passenger flow OD matrices in the down direction of the line, where $X_k$ is the passenger flow at time point k, and

is the passenger flow of station i at time point k;

If

where $X_l$ is the passenger flow sending volume at time point $t_l$;

where $l_k(t)$ is a fitting polynomial of order k-1, $L_n(t)$ is the Lagrange interpolation polynomial, t is the time point to be fitted, $t_j$ is the jth time point, $t_i$ is the ith time point, $X_i$ is the passenger flow matrix at the ith time point, and $l_i(t)$ is a fitting polynomial of degree i-1;

Merge the repaired data with the original normal data, then standardize by the standard deviation to eliminate scale differences between variables, using the formula:

S3) Divide the time intervals with the Affinity Propagation (neighbor propagation) clustering algorithm, which clusters on the similarity matrix S formed from the sample data points. Like other clustering algorithms, it aims to minimize, within each class, the distance between each data point and the representative point of its class, thereby achieving the class division. It specifically comprises the following steps:

S301) Initialize the credibility matrix r(i, k) and the availability matrix a(i, k) to 0;

S302) Calculate the sample similarity matrix s(i, k), using the negative squared Euclidean distance as the measure, i.e. $s(i,k) = -\|x_i - x_k\|^2$, and set the diagonal elements s(k, k) to the same preference value (the median attraction value), i.e.

where N is the number of samples, which is 265 in this embodiment;

S303) Update the credibility matrix r(i, k) and the availability matrix a(i, k) with the following formulas.

The credibility matrix r(i, k) is updated by:

$$r(i,k) \leftarrow s(i,k) - \max_{k' \neq k}\{a(i,k') + s(i,k')\}$$

The availability matrix a(i, k) is updated by:

$$a(i,k) \leftarrow \min\Big\{0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\{0, r(i',k)\}\Big\} \ (i \neq k), \qquad a(k,k) \leftarrow \sum_{i' \neq k} \max\{0, r(i',k)\}$$

S304) Set a damping factor to eliminate numerical oscillation during the iteration:

$$r_{new}(i,k) \leftarrow \lambda\, r_{old}(i,k) + (1-\lambda)\, r_{new}(i,k), \qquad a_{new}(i,k) \leftarrow \lambda\, a_{old}(i,k) + (1-\lambda)\, a_{new}(i,k)$$

where $r_{new}(i,k)$ and $r_{old}(i,k)$ are the credibility matrices obtained from the current and the previous update, respectively, and $a_{new}(i,k)$ and $a_{old}(i,k)$ are the availability matrices obtained from the current and the previous update, respectively; in this embodiment the damping factor λ is set to 0.9;

S305) For the passenger flow data sample of any time point, calculate the sum of its credibility and availability with respect to all samples, and according to

$$\arg\max_{k}\,\{a(i,k) + r(i,k)\}$$

find the class center sample of each sample, then output the classification results of all time points;

S4) Because the AP algorithm of step S3) outputs a series of clustering results when it clusters the samples, the Calinski-Harabasz, Hartigan and In-Group Proportion indexes are used to test the validity of the clustering results obtained. The results are shown in FIG. 2 and FIG. 3. The figures show that the optimal number of clusters for the samples based on the 2014 and 2015 high-speed railway passenger flow data is 5; this is taken as the final clustering result and is depicted in FIG. 4 and FIG. 5;

S5) Traverse all classes, compare the passenger flow data samples of the time points pairwise, and split non-consecutive time points that fall in the same class, forming the operation period division result of the high-speed railway for 2014 and 2015, which is shown in Table 1;

Table 1 Operation period division results

As can be seen from Table 1, the time points of both 2014 and 2015 fall into 5 classes according to the passenger flow variation pattern, and the 365 days of each year are divided into 13 operation periods. Operation periods 3, 6, 7, 8 and 12 have the same time span in 2014 and 2015, while the other operation periods differ. The difference arises because the Spring Festival travel period falls on different dates each year: the Spring Festival fell on January 31 (day 31) in 2014 and on February 19 (day 50) in 2015. Operation period 2 begins 7 days before the Spring Festival in each year. The parts of the year corresponding to the other operation periods are evident and can be summarized as follows:

operation period 1 is the off-peak period after New Year's Day and before the Spring Festival; operation periods 2 to 4 are the Spring Festival travel peak; operation period 5 is the off-peak period between the Spring Festival travel period and the Qingming Festival; operation period 6 is the Qingming Festival peak; operation period 7 is the off-peak period between the Qingming Festival and May Day (Labor Day); operation period 8 is the May Day peak; operation period 9 is the off-peak period between May Day and the summer travel period; operation period 10 is the summer travel peak; operation period 11 is the off-peak period between the summer travel period and National Day; operation period 12 is the National Day peak; operation period 13 is the off-peak period after National Day and before the New Year.

S6) Correct the operation period division result. Operation periods 3, 6 and 8 span only one or a few days. For the high-speed railway passenger transport management department, making a large-scale adjustment to the train operation scheme to meet passenger flow demand in such periods would interfere excessively with the existing transport plan and consume too much manpower and material resources. Therefore, based on field work experience, this embodiment merges the three operation periods shorter than 7 days with their adjacent operation periods. The corrected operation period division result of the high-speed railway is shown in Table 2.

Table 2 Corrected operation period division results

The corrected period division result of the high-speed railway can be summarized as follows: operation period 1 is the off-peak period after New Year's Day and before the Spring Festival; operation period 2 is the Spring Festival travel peak; operation period 3 is the off-peak period between the Spring Festival travel period and the summer travel period; operation period 4 is the summer travel peak; operation period 5 is the off-peak period between the summer travel period and National Day; operation period 6 is the National Day peak; operation period 7 is the off-peak period after National Day and before the New Year. This period division can serve as the premise for evaluating and adjusting the train operation scheme: the match between the forecast passenger flow demand and the train operation scheme in each operation period is evaluated, and if the evaluation result is unsatisfactory, the current train operation scheme needs to be adjusted;
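The correction described above, in which periods shorter than 7 days are merged with a neighbour, can be sketched as follows. Absorbing a short period into the preceding period and then joining neighbouring periods of the same class are assumptions made for the illustration; the embodiment relies on field experience to choose the neighbour. The `(first_day, last_day, label)` tuples and the labels in the usage note are hypothetical and do not reproduce Table 1.

```python
def merge_short_periods(periods, min_days=7):
    """Merge periods shorter than min_days into the preceding period.

    periods: ordered, contiguous list of (first_day, last_day, label).
    After a short period is absorbed, neighbouring periods that carry the
    same label are joined as well. A short period at the very start has no
    predecessor and is kept unchanged in this sketch.
    """
    merged = []
    for start, end, label in periods:
        is_short = end - start + 1 < min_days
        if merged and (is_short or merged[-1][2] == label):
            prev_start, _, prev_label = merged[-1]
            merged[-1] = (prev_start, end, prev_label)
        else:
            merged.append((start, end, label))
    return merged
```

For instance, a one-day period between two "peak" periods is absorbed and the two peaks are joined, so `[(25, 37, "peak"), (38, 38, "top"), (39, 60, "peak")]` collapses into the single period `(25, 60, "peak")`.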

S7) Evaluate the adaptability to passenger flow demand. To show that a train operation scheme made on the basis of the operation period division result adapts better to passenger flow demand, a train operation scheme is compiled according to the mean passenger flow of each period on the basis of the division, and the adaptability indexes between the train operation scheme and the passenger flow demand of each period are calculated by simulation. The result is compared with the adaptability of the train operation scheme to passenger flow demand under the period division actually used in operation on this high-speed railway in 2014 and 2015. The results are shown in Table 3.

Table 3 Comparison with actual operating conditions

As can be seen from Table 3, with the number of large-scale adjustments of the train operation scheme unchanged, the scheme compiled according to the operation period division result of the neighbor propagation clustering algorithm adapts better to passenger flow demand. The passenger flow demand satisfaction rate, the average train occupancy rate and the direct passenger flow rate increased by 7.6%, 16.7% and 14.1% respectively in 2014, and by 5.7%, 18.4% and 14.4% respectively in 2015.

In this embodiment, daily passenger flow survey data from the stations along the line are combined, the neighbor propagation algorithm is used to cluster and merge time points with similar passenger flow over the year, the optimal number of clusters is determined according to the CH, Hartigan and IGP indexes, and a method for dividing the annual operation periods of a high-speed railway is designed on this basis. The main conclusions are as follows:

(1) The high-speed railway operation time interval division method based on the clustering analysis is combined with the result test of the clustering validity index, the passenger flow requirements in different time intervals in the year can be objectively and accurately reflected, and the defects of low subjectivity, low efficiency and low precision of a manual division method are overcome, so that a foundation is laid for the adaptive adjustment of a train operation scheme.

(2) A case study using the passenger flow sending volume statistics of the stations along a certain high-speed railway as samples shows that, once the optimal clustering result of the annual operation time intervals has been determined and the clustering result has been manually analyzed, the whole year can be divided into reasonable operation time intervals.

The foregoing embodiments and description have been provided to illustrate the principles and preferred embodiments of the invention, and various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A high-speed railway passenger flow time interval division method based on neighbor propagation clustering comprises the following steps:

S1) dividing one year into T time points, letting any high-speed railway line X comprise n important stations, and counting the passenger flow sending volume of each station at each time point t_k in the up or down direction of the high-speed railway line X to build a passenger flow matrix, i.e.

If it is not

wherein X_l is the passenger flow sending volume at time point t_l;

if the data are abnormal, deleting the abnormal data, and fitting and repairing the passenger flow sending volume at that time point from the passenger flow data of its m adjacent time points by the Lagrange interpolation method, wherein the calculation formula is as follows:

in the formula, I_k(t) is a fitting polynomial of order k-1, L_n(t) is the Lagrange interpolation polynomial, t is the time point to be fitted, t_j is the j-th time point, t_i is the i-th time point, X_i is the passenger flow matrix at the i-th time point, and l_i(t) is a fitting polynomial of degree i-1;

is the annual mean of the passenger flow sending volume of all stations, and σ is the annual standard deviation of the passenger flow sending volume of all stations;

S3) performing time interval division based on the neighbor propagation clustering algorithm: the passenger flow sending volume samples of all T time points in the data set are taken as candidate class representatives, and the similarity s(i, k) of the passenger flow sending volume of any two time intervals is computed; the similarity s(i, k) represents the attraction of the passenger flow sending volume sample X_k of time interval k to the passenger flow sending volume sample X_i of time interval i, i.e. how well suited sample X_k is to serve as the class representative of sample X_i; when the algorithm is initialized, all samples are assumed to be equally likely to be chosen as class representatives, i.e. all s(k, k) are set to the same attraction median value p, where the similarity of any two samples is calculated as:

s(i, k) = -||x_i - x_k||²;

wherein x_i represents the passenger flow sending volume sample at time i, and x_k represents the passenger flow sending volume sample at time k;

defining a credibility matrix r and an availability matrix a, wherein r(i, k) points from sample x_i to sample x_k and represents the degree to which sample x_k is suitable as the class representative of x_i; a(i, k) points from sample x_k to sample x_i and represents the degree to which it is appropriate for x_i to select x_k as its class representative; for any sample x_i, calculating the sum of the credibility r(i, k) and the availability a(i, k) with respect to the passenger flow sending volume of the other time intervals, taking the sample x_k for which this sum is maximum as the class representative of x_i, and outputting the classification results of all time points;

the method specifically comprises the following steps:

setting the diagonal elements s(k, k) to the same attraction median value, i.e.

in the formula, N is the number of samples;

the availability matrix a(i, k) is updated by the following calculation formula:

in the formula, r_new(i, k) and r_old(i, k) are the credibility matrices obtained by the current update and the previous update, respectively; a_new(i, k) and a_old(i, k) are the availability matrices obtained by the current update and the previous update, respectively; λ ∈ (0, 1) is a damping factor;

Finding a class center sample of each sample;

S306) updating the current iteration count N ← N + 1, and judging whether the message iteration process has reached the set maximum number of iterations, namely N ≥ N_max; if so, terminating the algorithm and outputting the class division results of all time points; if not, returning to step S302);

S4) calculating the Calinski-Harabasz, Hartigan and In-Group Proportion indexes for each of the different time point class division results, and selecting the optimal number of time point classes and the corresponding class division result;

S6) evaluating the adaptability to the passenger flow demand: on the basis of the operation time interval division, compiling a train operation scheme according to the mean passenger flow demand of each time interval, simulating the passenger flow distribution, introducing three indexes, namely the passenger flow demand satisfaction rate, the average train occupancy rate and the passenger flow direct rate, and quantitatively evaluating and summarizing the adaptability between the passenger flow demand and the train operation scheme in each time interval.
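Steps S1) to S3) of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the Lagrange and responsibility/availability update formulas are the standard textbook forms (the formulas themselves are not reproduced in this text), the standard deviation is taken as the population standard deviation, the default preference is taken as the median of the off-diagonal similarities, and the function names `lagrange_repair`, `standardize` and `affinity_propagation` are hypothetical.

```python
from statistics import mean, median, pstdev


def lagrange_repair(times, values, t):
    """Evaluate L(t) = sum_i X_i * l_i(t), l_i(t) = prod_{j != i} (t - t_j) / (t_i - t_j)."""
    total = 0.0
    for i, (t_i, x_i) in enumerate(zip(times, values)):
        basis = 1.0
        for j, t_j in enumerate(times):
            if j != i:
                basis *= (t - t_j) / (t_i - t_j)
        total += x_i * basis
    return total


def standardize(values):
    """Z-score (x - mean) / sigma; using the population sigma is an assumption."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def affinity_propagation(samples, damping=0.5, max_iter=200, preference=None):
    """Return, for each sample, the index of its class representative."""
    n = len(samples)
    # Similarity s(i, k) = -||x_i - x_k||^2
    s = [[-sum((u - v) ** 2 for u, v in zip(samples[i], samples[k]))
          for k in range(n)] for i in range(n)]
    if preference is None:
        # Assumption: median of the off-diagonal similarities
        preference = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        s[k][k] = preference
    r = [[0.0] * n for _ in range(n)]  # credibility (responsibility)
    a = [[0.0] * n for _ in range(n)]  # availability
    for _ in range(max_iter):
        for i in range(n):
            for k in range(n):
                rival = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - rival)
        for i in range(n):
            for k in range(n):
                support = sum(max(0.0, r[j][k]) for j in range(n) if j not in (i, k))
                new = support if i == k else min(0.0, r[k][k] + support)
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    # Each sample picks the k that maximizes r(i, k) + a(i, k)
    return [max(range(n), key=lambda k: r[i][k] + a[i][k]) for i in range(n)]


# Toy data (made up for illustration): two groups of three time-point samples
points = [[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]]
labels = affinity_propagation(points)
```

The sketch always runs the fixed number of iterations described in step S306) and has no early convergence check; the triple loop is cubic in the number of samples per iteration, which is acceptable for a few hundred time points but would need vectorizing for larger data sets.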

2. The high-speed railway passenger flow time interval division method based on neighbor propagation clustering according to claim 1, characterized in that: in step S4), the Calinski-Harabasz index is a measure based on the intra-class dispersion matrix and the inter-class dispersion matrix of all samples, and the number of classes corresponding to its maximum value is taken as the optimal number of clusters, namely
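The Calinski-Harabasz formula is not reproduced in this text. Assuming the standard definition CH(k) = [trB(k) / (k - 1)] / [trW(k) / (n - k)], where trB(k) and trW(k) are the traces of the inter-class and intra-class dispersion matrices, the index can be sketched as below. The function names and the toy data are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def calinski_harabasz(samples, labels):
    """CH = [trB / (k - 1)] / [trW / (n - k)]; requires 2 <= k < n."""
    n = len(samples)
    clusters = {}
    for x, lab in zip(samples, labels):
        clusters.setdefault(lab, []).append(x)
    k = len(clusters)
    overall = centroid(samples)
    tr_w = 0.0  # trace of the intra-class dispersion matrix
    tr_b = 0.0  # trace of the inter-class dispersion matrix
    for pts in clusters.values():
        c = centroid(pts)
        tr_w += sum(sq_dist(x, c) for x in pts)
        tr_b += len(pts) * sq_dist(c, overall)
    return (tr_b / (k - 1)) / (tr_w / (n - k))


# Toy data (made up): a compact partition scores higher than a mixed one
data = [[0.0], [1.0], [10.0], [11.0]]
good = calinski_harabasz(data, [0, 0, 1, 1])
bad = calinski_harabasz(data, [0, 1, 0, 1])
```

In use, the index would be evaluated for each candidate clustering result and the number of classes with the largest value retained.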

3. The high-speed railway passenger flow time interval division method based on neighbor propagation clustering according to claim 1, characterized in that: in step S4), the Hartigan index is also applicable to the case where the number of clusters is 1, and the smallest number of clusters satisfying Ha ≤ 10 is taken as the optimal number of clusters, namely

where k is the number of clusters, trW(k) is the trace of the intra-class dispersion matrix, and n is the number of time point samples.
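The Hartigan formula is likewise not reproduced in this text. Assuming the standard form Ha(k) = (trW(k) / trW(k + 1) - 1)(n - k - 1), which uses the same symbols k, trW(k) and n defined above, the selection rule of claim 3 can be sketched as follows. The function names and the trW values are made up for illustration.

```python
def hartigan_index(tr_w_k, tr_w_next, n, k):
    """Ha(k) = (trW(k) / trW(k + 1) - 1) * (n - k - 1)."""
    return (tr_w_k / tr_w_next - 1.0) * (n - k - 1)


def optimal_k_by_hartigan(tr_w, n, threshold=10.0):
    """tr_w maps k to trW(k); return the smallest k with Ha(k) <= threshold."""
    for k in sorted(tr_w):
        if k + 1 in tr_w and hartigan_index(tr_w[k], tr_w[k + 1], n, k) <= threshold:
            return k
    return None  # no candidate satisfied the threshold


# Hypothetical intra-class traces for k = 1, 2, 3 on n = 4 samples
tr_w = {1: 101.0, 2: 1.0, 3: 0.5}
best_k = optimal_k_by_hartigan(tr_w, n=4)
```

Because Ha(k) only needs trW(k) and trW(k + 1), it is defined for k = 1, which is why the claim notes that the index covers the single-cluster case.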

4. The high-speed railway passenger flow time interval division method based on neighbor propagation clustering according to claim 1, characterized in that: in step S4), the In-Group Proportion index measures whether the sample closest to each sample in a given class belongs to the same class; the larger the average In-Group Proportion index over all clusters, the better the clustering quality, and the number of classes corresponding to its maximum value is the optimal number of clusters, namely
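The In-Group Proportion of a class can be read as the share of its members whose nearest neighbor is in the same class. The sketch below follows that reading; the use of squared Euclidean distance to find the nearest neighbor is an assumption (the formula is not reproduced in this text), and the function names and toy data are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def in_group_proportion(samples, labels):
    """Return {class: share of members whose nearest neighbor is in that class}."""
    n = len(samples)
    nearest = [
        min((j for j in range(n) if j != i),
            key=lambda j: sq_dist(samples[i], samples[j]))
        for i in range(n)
    ]
    igp = {}
    for u in set(labels):
        members = [i for i in range(n) if labels[i] == u]
        same = sum(1 for i in members if labels[nearest[i]] == u)
        igp[u] = same / len(members)
    return igp


def mean_igp(samples, labels):
    """Average IGP over all classes, used to compare clustering results."""
    values = in_group_proportion(samples, labels).values()
    return sum(values) / len(values)


# Toy data (made up): a compact partition versus a mixed one
data = [[0.0], [1.0], [10.0], [11.0]]
good = mean_igp(data, [0, 0, 1, 1])
bad = mean_igp(data, [0, 1, 0, 1])
```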

5. The high-speed railway passenger flow time interval division method based on neighbor propagation clustering according to claim 1, characterized in that: in step S6), the passenger flow demand satisfaction rate reflects the degree to which the passenger transport capacity provided by the train operation scheme satisfies the passenger flow demand of each direction on the high-speed railway and the related road network; it is expressed as the ratio of the passenger volume that obtains effective transport service under the established train operation scheme, subject to the transport capacity resource constraint of the train seating capacity, to the total passenger flow demand, and the calculation formula is as follows:

6. The high-speed railway passenger flow time interval division method based on neighbor propagation clustering according to claim 1, characterized in that: in step S6), the average train occupancy rate is the average of the occupancy rates of all trains within the evaluation range, where the occupancy rate of a train is the weighted ratio of the passenger flow volume carried by the train over its running sections to the total number of seats it provides; the index reflects the result of passengers of different passenger flow OD pairs choosing among the various types of high-speed trains, and the calculation formula of the average train occupancy rate is as follows:

in the formula,

7. The high-speed railway passenger flow time interval division method based on neighbor propagation clustering according to claim 1, characterized in that: in step S6), the passenger flow demand structure is composed of different demand directions, each of which is served either by a direct journey to the destination or by a journey with transfers; the passenger flow direct rate is, under the established train operation scheme and passenger flow demand structure, the ratio of the passenger volume that reaches the destination directly without transfer to the total passenger volume of that direction between each pair of passenger flow demand points, and the calculation formula is:

wherein w is the index of a passenger flow direction (OD pair) in the road network, |e| is the number of transfers made by the passenger flow of a given direction,

is the number of passengers of passenger flow OD pair w who reach the destination with |e| transfers.
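The three evaluation indexes of claims 5 to 7 can be sketched from their verbal definitions. The formulas are not reproduced in this text, so the following is an assumption-laden illustration: the occupancy rate of a train is taken as passenger-kilometres over seat-kilometres (one possible section-length weighting), the input layouts and function names are hypothetical, and all numbers are made up.

```python
def demand_satisfaction_rate(served, demand):
    """Passengers given effective transport service divided by total demand."""
    return sum(served) / sum(demand)


def train_occupancy_rate(sections, seats):
    """sections: list of (length_km, passengers_on_board) for one train."""
    passenger_km = sum(length * pax for length, pax in sections)
    seat_km = seats * sum(length for length, _ in sections)
    return passenger_km / seat_km


def average_occupancy_rate(trains):
    """trains: list of (sections, seats); plain mean over all trains."""
    rates = [train_occupancy_rate(sections, seats) for sections, seats in trains]
    return sum(rates) / len(rates)


def direct_rate(flows):
    """flows: {od_pair: {number_of_transfers: passengers}}."""
    direct = sum(by_transfer.get(0, 0) for by_transfer in flows.values())
    total = sum(sum(by_transfer.values()) for by_transfer in flows.values())
    return direct / total


# Made-up inputs for illustration only
satisfaction = demand_satisfaction_rate(served=[80, 50], demand=[100, 60])
occupancy = average_occupancy_rate([
    ([(100, 400), (200, 250)], 500),
    ([(100, 300)], 600),
])
direct = direct_rate({
    ("A", "B"): {0: 90, 1: 10},
    ("A", "C"): {0: 50, 1: 30, 2: 20},
})
```

Computing the same three values for each candidate time interval division would give a side-by-side comparison of the kind summarized in Table 3.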