WO2020215798A1 - Method, system and electronic device for estimating regional passenger flow in subway stations - Google Patents

Method, system and electronic device for estimating regional passenger flow in subway stations

Info

Publication number
WO2020215798A1
Authority
WO
WIPO (PCT)
Prior art keywords
passenger flow
data
mac
mac data
ticket
Prior art date
Application number
PCT/CN2019/130540
Other languages
English (en)
French (fr)
Inventor
张伟林
张帆
张鋆
孙黎
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2020215798A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25: Integrating or interfacing systems involving database management systems
    • G06F16/252: Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40: Business processes related to the transportation industry

Definitions

  • This application belongs to the field of rail transit technology, and particularly relates to a method, system and electronic equipment for estimating regional passenger flow in subway stations.
  • By estimating passenger flow in the areas of a subway station to track how passenger flow changes, the relevant subway departments can formulate operation plans based on passenger flow conditions or respond promptly to emergencies. This not only improves passenger comfort and satisfaction but also helps those departments draw up reasonable operating schemes as passenger flow changes, which is of great significance for planning new subway lines and for unified management, operation and decision-making.
  • At present, the methods for estimating passenger flow in subway station areas mainly include manual visual counting, infrared sensing, video recognition and RFID (Radio Frequency Identification) passenger flow monitoring. Most of these methods rely on a single independent data source to estimate passenger flow, so the resulting statistics have low accuracy and large errors.
  • This application provides a method, system and electronic equipment for estimating passenger flow in a subway station area, aiming to solve one of the above technical problems in the prior art at least to a certain extent.
  • A method for estimating regional passenger flow in subway stations includes the following steps:
  • Step a: Calculate the ticket card data passenger flow of the station from the subway ticket card data;
  • Step b: Calculate, from the MAC data collected by the AP devices, the MAC data passenger flow corresponding to the ticket card data;
  • Step c: Calculate the correspondence between the ticket card data passenger flow and the MAC data passenger flow using a linear regression method;
  • Step d: Divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and obtain the passenger flow of each area in the station by applying the correspondence to the granular passenger flow.
  • The technical solution adopted in the embodiment of the present application further includes: step b further includes cleaning the MAC data collected by the AP devices so that the MAC data of each passenger corresponds to the ticket card data.
  • The MAC data cleaning specifically includes:
  • Repeated code cleaning: within the time interval, assign the MAC data with the highest signal strength to the area of the device with the strongest signal;
  • Device code cleaning: remove the MAC records corresponding to AP devices from the MAC data using the MAC static information table;
  • Pseudo code cleaning: a MAC address whose second digit is 0, 4, 8 or C is a non-pseudo code; in addition, taking the travel trajectory of each MAC into account, MAC data that appears at only one station is filtered out, and the remaining MAC data is regarded by default as real, valid MAC data;
  • Abnormal code cleaning: filter out MAC data whose signal level is outside the valid signal range, MAC data collected in the early morning and collected multiple times, and MAC data collected by the same AP device for a cumulative time exceeding a certain range.
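As an illustration of the device code step in the list above, the following is a minimal sketch rather than the applicant's implementation. The record layout `(mac, ap_id, timestamp_s, signal)` and the function name `remove_device_codes` are assumptions made for the example; the MAC static information table is modelled simply as a collection of AP MAC addresses.

```python
def remove_device_codes(records, static_ap_macs):
    """Drop records whose MAC belongs to an AP device in the static table.

    records: iterable of (mac, ap_id, timestamp_s, signal) tuples (assumed layout).
    static_ap_macs: MAC addresses of the AP devices themselves (assumed table form).
    """
    known = {m.lower() for m in static_ap_macs}
    # Compare case-insensitively so "AA:BB" and "aa:bb" are treated as the same MAC.
    return [r for r in records if r[0].lower() not in known]
```

A case-insensitive set lookup keeps the filter linear in the number of records regardless of the size of the static table.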
  • The technical solution adopted in the embodiment of the present application further includes: in step c, calculating the correspondence between the ticket card data passenger flow and the MAC data passenger flow using the linear regression method is specifically: a data fitting method is used to fit the relationship between the ticket card data passenger flow and the MAC data passenger flow, and a neural network combined with the gradient descent method is used to solve for the unknown parameters. A neural network model without hidden layers is constructed, with three input neurons and one output neuron. The output of the neural network model is:
  • $$\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^T, \quad i = 1, 2, \ldots, n$$
  • In the above formula, prediction represents the target value, and a, b and c are the parameters of the corresponding terms.
  • The loss function is defined by equation image PCTCN2019130540-appb-000001 in the original. In that formula, loss represents the loss value, prediction the predicted value, target the true value, and n the number of samples.
  • The goal of the algorithm is to use the gradient descent method to find a set of model parameters θ that minimizes the loss function:
  • $$\theta = [a \;\; b \;\; c]^T, \qquad \theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$$
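Written out, the model and objective described above could take the following form. The quadratic model and the parameter vector follow the text of the original; the normalization of the loss (a plain mean of squared errors) and the symbol α for the learning rate are assumptions, because the original gives the loss and the update rule only as equation images and states only that the mean squared error is minimized by gradient descent.

```latex
\begin{aligned}
\mathrm{prediction}^{(i)} &= a + b x_i + c x_i^2
  = \begin{bmatrix} 1 & x_i & x_i^2 \end{bmatrix} \theta,
  \qquad \theta = \begin{bmatrix} a & b & c \end{bmatrix}^{T},
  \qquad i = 1, \dots, n \\
\mathrm{loss}(\theta) &= \frac{1}{n} \sum_{i=1}^{n}
  \left( \mathrm{prediction}^{(i)} - \mathrm{target}^{(i)} \right)^{2} \\
\theta &\leftarrow \theta - \alpha \, \nabla_{\theta}\, \mathrm{loss}(\theta)
\end{aligned}
```

Because the model is linear in θ, the fit is a linear regression on the features 1, x and x², which is consistent with the original calling it a linear regression method.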
  • A system for estimating regional passenger flow in subway stations includes:
  • Ticket card data passenger flow calculation module: used to calculate the ticket card data passenger flow of the station from the subway ticket card data;
  • MAC data passenger flow calculation module: used to calculate, from the MAC data collected by the AP devices, the MAC data passenger flow corresponding to the ticket card data;
  • Data fitting module: used to calculate the correspondence between the ticket card data passenger flow and the MAC data passenger flow using a linear regression method;
  • Regional passenger flow calculation module: used to divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and obtain the passenger flow of each area in the station by applying the correspondence to the granular passenger flow.
  • The technical solution adopted in the embodiment of the present application further includes a MAC data cleaning module, which is used to clean the MAC data collected by the AP devices so that the MAC data of each passenger corresponds to the ticket card data.
  • The MAC data cleaning specifically includes:
  • Repeated code cleaning: within the time interval, assign the MAC data with the highest signal strength to the area of the device with the strongest signal;
  • Device code cleaning: remove the MAC records corresponding to AP devices from the MAC data using the MAC static information table;
  • Pseudo code cleaning: a MAC address whose second digit is 0, 4, 8 or C is a non-pseudo code; in addition, taking the travel trajectory of each MAC into account, MAC data that appears at only one station is filtered out, and the remaining MAC data is regarded by default as real, valid MAC data;
  • Abnormal code cleaning: filter out MAC data whose signal level is outside the valid signal range, MAC data collected in the early morning and collected multiple times, and MAC data collected by the same AP device for a cumulative time exceeding a certain range.
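  • The technical solution adopted in the embodiment of the application further includes: the data fitting module uses a linear regression method to calculate the correspondence between the ticket card data passenger flow and the MAC data passenger flow. Specifically, a data fitting method is used to fit the relationship between the ticket card data passenger flow and the MAC data passenger flow, and a neural network combined with the gradient descent method is used to solve for the unknown parameters. The output of the neural network model is:
  • $$\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^T, \quad i = 1, 2, \ldots, n$$
  • In the above formula, prediction represents the target value, and a, b and c are the parameters of the corresponding terms.
  • The loss function is defined by equation image PCTCN2019130540-appb-000001 in the original. In that formula, loss represents the loss value, prediction the predicted value, target the true value, and n the number of samples.
  • The goal of the algorithm is to use the gradient descent method to find a set of model parameters θ that minimizes the loss function:
  • $$\theta = [a \;\; b \;\; c]^T, \qquad \theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$$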
  • An electronic device includes:
  • At least one processor; and
  • A memory communicatively connected with the at least one processor; wherein,
  • The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the following operations of the above-mentioned method for estimating regional passenger flow in subway stations:
  • Step a: Calculate the ticket card data passenger flow of the station from the subway ticket card data;
  • Step b: Calculate, from the MAC data collected by the AP devices, the MAC data passenger flow corresponding to the ticket card data;
  • Step c: Calculate the correspondence between the ticket card data passenger flow and the MAC data passenger flow using a linear regression method;
  • Step d: Divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and obtain the passenger flow of each area in the station by applying the correspondence to the granular passenger flow.
  • The beneficial effects of the embodiments of this application are as follows: the method, system and electronic device for estimating regional passenger flow in subway stations combine subway ticket card data with MAC data collected by AP devices, using multiple data sources to estimate passenger flow in the areas of a subway station. Compared with the prior art, this enables more accurate and more real-time passenger flow monitoring.
  • FIG. 1 is a flowchart of a method for estimating regional passenger flow in subway stations according to an embodiment of the present application;
  • FIG. 2 is a schematic diagram of the neural network model structure;
  • FIG. 3 is a graph of the relationship between the ticket card data passenger flow and the MAC data passenger flow;
  • FIG. 4 is a schematic diagram of station area division;
  • FIG. 5 is a schematic structural diagram of a system for estimating regional passenger flow in subway stations according to an embodiment of the present application;
  • FIG. 6 is a schematic diagram of the hardware device structure for the method for estimating regional passenger flow in subway stations provided by an embodiment of the present application.
  • The method for estimating regional passenger flow in subway stations in the embodiments of this application combines subway ticket card data with MAC (Media Access Control) data collected by AP (Access Point) devices, using multiple data sources to produce more real-time and more accurate passenger flow estimates for the areas of a subway station.
  • FIG. 1 is a flowchart of a method for estimating passenger flow in a subway station area according to an embodiment of the present application.
  • the method for estimating passenger flow in a subway station area of the embodiment of the present application includes the following steps:
  • Step 100: Obtain the subway ticket card data collected by the subway ticketing system, and calculate the ticket card data passenger flow of the station (per minute) from the subway ticket card data;
  • Step 200: Obtain the MAC data collected by the AP devices in the station;
  • Step 300: Clean the acquired MAC data so that the MAC data of each passenger corresponds to the ticket card data;
  • In step 300, the goal of MAC data cleaning is to make each MAC correspond to one real subway passenger. Only then can the MAC data of each passenger in the station be matched to the ticket card data.
  • MAC data cleaning covers four kinds of codes: repeated codes, device codes, pseudo codes and abnormal codes. The specific cleaning process is as follows:
  • A. Repeated codes: When the system collects the MAC data of a station over a period of time, different devices may collect the same MAC address with different signal strengths, so repeated codes need to be processed.
  • The processing method is: since the server uploads MAC data once every time interval t, the MAC data with the highest signal strength within the interval is assigned to the area of the device with the strongest signal.
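The repeated code rule can be sketched as follows. This is an illustrative example under assumptions, not the applicant's code: records are assumed to be `(mac, ap_id, timestamp_s, signal)` tuples, the upload interval t is passed as `interval_s`, and a numerically larger signal value (for example -50 versus -70) is assumed to mean a stronger signal.

```python
def assign_strongest(records, interval_s):
    """Keep, for each MAC and upload interval, only the strongest-signal record.

    The AP that produced the surviving record identifies the area the MAC
    is assigned to for that interval.
    """
    best = {}
    for mac, ap_id, ts, signal in records:
        key = (mac, ts // interval_s)  # same MAC within the same interval
        if key not in best or signal > best[key][3]:
            best[key] = (mac, ap_id, ts, signal)
    # Return in time order so downstream counting is deterministic.
    return sorted(best.values(), key=lambda r: (r[2], r[0]))
```

After this step each MAC appears at most once per interval, so counting records per area no longer double-counts a passenger seen by several AP devices.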
  • C. Pseudo codes: To protect user privacy, terminal devices such as iPhones and some Android phones automatically randomize their MAC address before sending it, so the MAC address collected by the AP device is not the real MAC address of the phone, and it is re-randomized each time it is sent to the collection device. This produces pseudo codes that interfere with the data.
  • The processing method is: a MAC address whose second digit is 0, 4, 8 or C is a non-pseudo code. When historical data is available, the travel trajectory of each MAC can also be taken into account: MAC data that appears at only one station is filtered out, and the remaining MAC data is regarded by default as real, valid MAC data.
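The pseudo code rule lends itself to a short sketch. The function names and the `station_visits` mapping (MAC to the set of stations where it was seen in historical data) are assumptions made for illustration. The digit test itself matches how MAC addresses are structured: a second hexadecimal digit of 0, 4, 8 or C means that both low bits of the first octet are zero, that is, a globally administered unicast address, whereas randomized addresses set the locally administered bit.

```python
NON_PSEUDO_SECOND_DIGITS = {"0", "4", "8", "C"}

def is_non_pseudo(mac):
    """True if the second hex digit of the MAC address is 0, 4, 8 or C."""
    digits = [ch for ch in mac.upper() if ch in "0123456789ABCDEF"]
    return len(digits) == 12 and digits[1] in NON_PSEUDO_SECOND_DIGITS

def filter_pseudo(station_visits):
    """Keep MACs that pass the digit rule and were seen at more than one station.

    station_visits: dict mapping MAC -> set of station ids (assumed structure).
    """
    return {mac for mac, stations in station_visits.items()
            if is_non_pseudo(mac) and len(stations) > 1}
```

Separators such as `:` or `-` are ignored, so the check works on the common textual MAC formats.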
  • D. Abnormal codes: These include MACs outside the station whose collected signal is very weak, MACs of other equipment not in the MAC static information table, and MAC data of staff inside the station.
  • The processing method is: a collected MAC signal level between -40 and -120 is treated as valid, so MAC data whose signal level is outside this range can be filtered out directly.
  • MAC data collected in the early morning and collected multiple times can also be filtered out. Since the AP device uploads data once every time interval t, if a MAC is collected by the same AP device for longer than a certain time range (this typically matches staff stationed at the station rather than passengers), the MAC can also be judged to be abnormal.
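Two of the abnormal code rules can be sketched as below. The example rests on assumptions: records are `(mac, ap_id, timestamp_s, signal)` tuples, the dwell time at one AP is approximated by the span between the first and last sighting (the original speaks of cumulative collection time without defining it), and `max_dwell_s` is a hypothetical threshold. The early-morning rule is left out because the original does not give its time window.

```python
def filter_abnormal(records, max_dwell_s, weak=-120.0, strong=-40.0):
    """Drop out-of-range signals and MACs that stay too long at a single AP."""
    # Rule 1: keep only records whose signal lies in the valid range.
    in_range = [r for r in records if weak <= r[3] <= strong]

    # Rule 2: track first and last sighting of each MAC at each AP.
    first_last = {}
    for mac, ap_id, ts, _signal in in_range:
        lo, hi = first_last.get((mac, ap_id), (ts, ts))
        first_last[(mac, ap_id)] = (min(lo, ts), max(hi, ts))

    # MACs seen at one AP over a span longer than max_dwell_s are treated
    # as resident (for example staff) and removed entirely.
    resident = {mac for (mac, _ap), (lo, hi) in first_last.items()
                if hi - lo > max_dwell_s}
    return [r for r in in_range if r[0] not in resident]
```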
  • The MAC data cleaned by the above steps is counted to obtain the granular passenger flow (per minute) of a given station on a given day, corresponding to the ticket card data.
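The counting step can be illustrated with a small sketch. It assumes cleaned records of the form `(mac, ap_id, timestamp_s, signal)` and a one-minute bucket; counting distinct MACs per bucket is one plausible reading of the granular passenger flow, not a definition taken from the original.

```python
from collections import defaultdict

def per_minute_flow(records):
    """Count the distinct MACs observed in each one-minute bucket."""
    seen = defaultdict(set)
    for mac, _ap_id, ts, _signal in records:
        seen[ts // 60].add(mac)  # minute index since the start of the day
    return {minute: len(macs) for minute, macs in sorted(seen.items())}
```

The same bucketing applied to ticket card entry and exit records would give the ticket card data passenger flow on an identical time grid, which is what makes the two series comparable in the fitting step.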
  • Step 400: Calculate, from the cleaned MAC data, the MAC data passenger flow of the station corresponding to the ticket card data;
  • Step 500: Fit the correspondence between the ticket card data passenger flow and the MAC data passenger flow using a linear regression method;
  • In step 500, the ticket card data passenger flow calculated from the ticket card data is only the overall passenger flow of the station, so the passenger flow of each area in the station cannot be obtained from it, while the MAC data passenger flow calculated from the MAC data suffers a certain loss. It is therefore necessary to establish the proportional relationship between the ticket card data passenger flow and the MAC data passenger flow; the granular passenger flow of each area in the station can then be calculated from the MAC data.
  • A data fitting method is adopted to fit the relationship between the ticket card data passenger flow and the MAC data passenger flow.
  • Data fitting methods include the least squares method, genetic algorithms, neural networks and so on.
  • The purpose of data fitting is to approximate a complex, unknown function with a relatively simple one.
  • The data is shown in Table 1.
  • The embodiment of the application uses a neural network combined with the gradient descent method to solve for the unknown parameters. Specifically:
  • The output (objective function) of the neural network model can be expressed as:
  • $$\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^T, \quad i = 1, 2, \ldots, n$$
  • In the above formula, prediction represents the target value, and a, b and c are the parameters of the corresponding terms.
  • The loss function can be defined by equation image PCTCN2019130540-appb-000001 in the original. In that formula, loss represents the loss value, prediction the predicted value, target the true value, and n the number of samples.
  • The goal of the algorithm is to use the gradient descent method to find a set of model parameters θ that minimizes the loss function:
  • $$\theta = [a \;\; b \;\; c]^T, \qquad \theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$$
  • By minimizing the mean squared error with gradient descent, the parameters a, b and c of the objective function are obtained, which gives the correspondence between the ticket card data passenger flow and the MAC data passenger flow.
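The fitting procedure can be sketched in plain Python as follows. This is a minimal illustration, not the applicant's implementation: the mean squared error normalization, the learning rate `lr`, the epoch count, and the choice of the MAC data passenger flow as input x with the ticket card data passenger flow as target are all assumptions. Inputs are assumed to be rescaled to roughly [0, 1]; with raw passenger counts this fixed learning rate would be unstable.

```python
def fit_quadratic(xs, targets, lr=0.3, epochs=10000):
    """Fit prediction = a + b*x + c*x**2 by batch gradient descent on the MSE."""
    a = b = c = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_a = grad_b = grad_c = 0.0
        for x, t in zip(xs, targets):
            err = (a + b * x + c * x * x) - t
            # Partial derivatives of (1/n) * sum(err**2) w.r.t. a, b, c.
            grad_a += 2.0 * err / n
            grad_b += 2.0 * err * x / n
            grad_c += 2.0 * err * x * x / n
        a, b, c = a - lr * grad_a, b - lr * grad_b, c - lr * grad_c
    return a, b, c
```

Since the model has no hidden layer and is linear in a, b and c, an ordinary least squares solve would reach the same minimum directly; gradient descent is shown here only because it is the method the text names.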
  • FIG. 3 is a graph of the relationship between the ticket card data passenger flow and the MAC data passenger flow. The figure shows that the granular passenger flow calculated from either the ticket card data or the MAC data follows a clear pattern, with two peaks during the day: passenger flow at the station is higher during the morning and evening peak hours. At the same time, the passenger flow patterns calculated from the two data sources are fairly consistent and show a certain correlation.
  • Step 600: Divide the station into grid areas, clean the MAC data collected by the AP devices in each area and calculate the granular passenger flow from it, and obtain the passenger flow of each area in the station by applying the correspondence to the granular passenger flow.
  • In step 600, regional passenger flow has to be calculated from MAC data. Because the AP devices are distributed by area within the station, the station is divided into a grid, as shown in FIG. 4, a schematic diagram of station area division. To calculate the passenger flow in area A, only the MAC data collected by the AP devices in area A needs to be analyzed. Since repeated codes have already been handled in the data preprocessing stage, what remains at this point is the MAC data for which an AP device in area A recorded the strongest signal. In addition, MAC data whose signal is below a certain value is filtered out according to the density of the AP devices, to ensure that a MAC is counted only in area A and not in area B.
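The regional calculation in step 600 can be sketched as two small functions. The names, the `ap_region` mapping from AP device to area, the `min_signal` cut-off and the record layout `(mac, ap_id, timestamp_s, signal)` are hypothetical; the second function simply applies an already fitted relationship prediction = a + b x + c x² to each area's MAC count.

```python
def region_mac_counts(records, ap_region, min_signal):
    """Count distinct MACs per area, ignoring signals weaker than min_signal."""
    macs_by_region = {}
    for mac, ap_id, _ts, signal in records:
        if signal < min_signal:
            continue  # too weak to be attributed reliably to this area
        macs_by_region.setdefault(ap_region[ap_id], set()).add(mac)
    return {region: len(macs) for region, macs in macs_by_region.items()}

def regional_flow(mac_counts, a, b, c):
    """Map each area's MAC count through the fitted quadratic relationship."""
    return {region: a + b * x + c * x * x for region, x in mac_counts.items()}
```

If the relationship was fitted on rescaled inputs, the same rescaling has to be applied to the regional counts before calling `regional_flow`.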
  • FIG. 5 is a schematic structural diagram of a passenger flow estimation system in a subway station according to an embodiment of the present application.
  • The system for estimating regional passenger flow in subway stations of the embodiment of the present application includes a ticket card data passenger flow calculation module, a MAC data acquisition module, a MAC data cleaning module, a MAC data passenger flow calculation module, a data fitting module and a regional passenger flow calculation module.
  • Ticket card data passenger flow calculation module: used to obtain the subway ticket card data collected by the subway ticketing system, and to calculate the ticket card data passenger flow of the station from the subway ticket card data;
  • MAC data acquisition module: used to acquire the MAC data collected by the AP devices in the station;
  • MAC data cleaning module: used to clean the acquired MAC data so that the MAC data of each passenger corresponds to the ticket card data. The goal of MAC data cleaning is to make each MAC correspond to one real subway passenger. Only then can the MAC data of each passenger in the station be matched to the ticket card data.
  • MAC data cleaning covers four kinds of codes: repeated codes, device codes, pseudo codes and abnormal codes. The specific cleaning process is as follows:
  • A. Repeated codes: When the system collects the MAC data of a station over a period of time, different devices may collect the same MAC address with different signal strengths, so repeated codes need to be processed.
  • The processing method is: since the server uploads MAC data once every time interval t, the MAC with the highest signal strength within the interval is assigned to the area of the device with the strongest signal.
  • C. Pseudo codes: To protect user privacy, terminal devices such as iPhones and some Android phones automatically randomize their MAC address before sending it, so the MAC address collected by the AP device is not the real MAC address of the phone, and it is re-randomized each time it is sent to the collection device. This produces pseudo codes that interfere with the data.
  • The processing method is: a MAC address whose second digit is 0, 4, 8 or C is a non-pseudo code. When historical data is available, the travel trajectory of each MAC can also be taken into account: MAC data that appears at only one station is filtered out, and the remaining MAC data is regarded by default as real, valid MAC data.
  • D. Abnormal codes: These include MACs outside the station whose collected signal is very weak, MAC data of other devices not in the MAC static information table, and MAC data of staff inside the station.
  • The processing method is: a collected MAC signal level between -40 and -120 is treated as valid, so MAC data whose signal level is outside this range can be filtered out directly.
  • MAC data collected in the early morning and collected multiple times can also be filtered out. Since the AP device uploads data once every time interval t, if a MAC is collected by the same AP device for longer than a certain time range (this typically matches staff stationed at the station rather than passengers), the MAC can also be judged to be abnormal.
  • The MAC data cleaned by the above steps is counted to obtain the granular passenger flow (per minute) of a given station on a given day, corresponding to the ticket card data.
  • MAC data passenger flow calculation module: used to calculate, from the cleaned MAC data, the MAC data passenger flow of the station corresponding to the ticket card data;
  • Data fitting module: used to fit the correspondence between the ticket card data passenger flow and the MAC data passenger flow using a linear regression method. The ticket card data passenger flow calculated from the ticket card data is only the overall passenger flow of the station, so the passenger flow of each area in the station cannot be obtained from it, while the MAC data passenger flow calculated from the MAC data suffers a certain loss. It is therefore necessary to establish a proportional relationship between the ticket card data passenger flow and the MAC data passenger flow, so that the granular passenger flow of each area in the station can be obtained from the MAC data.
  • A data fitting method is adopted to fit the relationship between the ticket card data passenger flow and the MAC data passenger flow.
  • Data fitting methods include the least squares method, genetic algorithms, neural networks and so on.
  • The purpose of data fitting is to approximate a complex, unknown function with a relatively simple one.
  • The data is shown in Table 1.
  • The embodiment of the application uses a neural network combined with the gradient descent method to solve for the unknown parameters. Specifically:
  • The output (objective function) of the neural network model can be expressed as:
  • $$\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^T, \quad i = 1, 2, \ldots, n$$
  • In the above formula, prediction represents the target value, and a, b and c are the parameters of the corresponding terms.
  • The loss function can be defined by equation image PCTCN2019130540-appb-000001 in the original. In that formula, loss represents the loss value, prediction the predicted value, target the true value, and n the number of samples.
  • The goal of the algorithm is to use the gradient descent method to find a set of model parameters θ that minimizes the loss function:
  • $$\theta = [a \;\; b \;\; c]^T, \qquad \theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$$
  • By minimizing the mean squared error with gradient descent, the parameters a, b and c of the objective function are obtained, which gives the correspondence between the ticket card data passenger flow and the MAC data passenger flow.
  • FIG. 3 is a graph of the relationship between the ticket card data passenger flow and the MAC data passenger flow. The figure shows that the granular passenger flow calculated from either the ticket card data or the MAC data follows a clear pattern, with two peaks during the day: passenger flow at the station is higher during the morning and evening peak hours. At the same time, the passenger flow patterns calculated from the two data sources are fairly consistent and show a certain correlation.
  • Regional passenger flow calculation module: used to divide the station into grid areas, clean the MAC data collected by the AP devices in each area and calculate the granular passenger flow from it, and obtain the passenger flow of each area in the station by applying the correspondence to the granular passenger flow. Regional passenger flow has to be calculated from MAC data. Because the AP devices are distributed by area within the station, the station is divided into a grid, as shown in FIG. 4, a schematic diagram of station area division. To calculate the passenger flow in area A, only the MAC data collected by the AP devices in area A needs to be analyzed. What remains at this point is the MAC data for which an AP device in area A recorded the strongest signal. In addition, MAC data whose signal is below a certain value is filtered out according to the density of the AP devices, to ensure that a MAC is counted only in area A and not in area B.
  • FIG. 6 is a schematic diagram of the hardware device structure of the method for estimating passenger flow in a subway station area provided by an embodiment of the present application.
  • The device includes one or more processors and a memory; one processor is taken as an example. The device may also include an input system and an output system.
  • The processor, the memory, the input system and the output system may be connected by a bus or by other means; connection by a bus is taken as an example.
  • the memory can be used to store non-transitory software programs, non-transitory computer executable programs, and modules.
  • the processor executes various functional applications and data processing of the electronic device by running non-transitory software programs, instructions, and modules stored in the memory, that is, realizing the processing methods of the foregoing method embodiments.
  • the memory may include a program storage area and a data storage area, where the program storage area can store an operating system and an application program required by at least one function; the data storage area can store data and the like.
  • the memory may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid state storage devices.
  • the storage may optionally include storage remotely arranged with respect to the processor, and these remote storages may be connected to the processing system through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • the input system can receive input digital or character information, and generate signal input.
  • the output system may include display devices such as a display screen.
  • The one or more modules are stored in the memory and, when executed by the one or more processors, perform the following operations of any of the foregoing method embodiments:
  • Step a: Calculate the ticket card data passenger flow of the station from the subway ticket card data;
  • Step b: Calculate, from the MAC data collected by the AP devices, the MAC data passenger flow corresponding to the ticket card data;
  • Step c: Calculate the correspondence between the ticket card data passenger flow and the MAC data passenger flow using a linear regression method;
  • Step d: Divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and obtain the passenger flow of each area in the station by applying the correspondence to the granular passenger flow.
  • The embodiments of the present application provide a non-transitory (non-volatile) computer storage medium that stores computer-executable instructions, and the computer-executable instructions can perform the following operations:
  • Step a: Calculate the ticket card data passenger flow of the station from the subway ticket card data;
  • Step b: Calculate, from the MAC data collected by the AP devices, the MAC data passenger flow corresponding to the ticket card data;
  • Step c: Calculate the correspondence between the ticket card data passenger flow and the MAC data passenger flow using a linear regression method;
  • Step d: Divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and obtain the passenger flow of each area in the station by applying the correspondence to the granular passenger flow.
  • The embodiment of the present application provides a computer program product that includes a computer program stored on a non-transitory computer-readable storage medium. The computer program includes program instructions which, when executed by a computer, cause the computer to perform the following operations:
  • Step a: Calculate the ticket card data passenger flow of the station from the subway ticket card data;
  • Step b: Calculate, from the MAC data collected by the AP devices, the MAC data passenger flow corresponding to the ticket card data;
  • Step c: Calculate the correspondence between the ticket card data passenger flow and the MAC data passenger flow using a linear regression method;
  • Step d: Divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and obtain the passenger flow of each area in the station by applying the correspondence to the granular passenger flow.
  • The method, system and electronic device for estimating regional passenger flow in subway stations of the embodiment of the present application combine subway ticket card data with MAC data collected by AP devices, using multiple data sources to estimate passenger flow in the areas of a subway station. Compared with the prior art, this enables more accurate and more real-time passenger flow monitoring.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Devices For Checking Fares Or Tickets At Control Points (AREA)

Abstract

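A method, system and electronic device for estimating regional passenger flow in subway stations. The method includes: step a: calculating the ticket card data passenger flow of the station from the subway ticket card data; step b: calculating, from the MAC data collected by the AP devices, the MAC data passenger flow corresponding to the ticket card data; step c: calculating the correspondence between the ticket card data passenger flow and the MAC data passenger flow using a linear regression method; step d: dividing the station into grid areas, calculating the granular passenger flow from the MAC data collected by the AP devices in each area, and obtaining the passenger flow of each area in the station by applying the correspondence to the granular passenger flow. By combining subway ticket card data with MAC data collected by AP devices and using multiple data sources to estimate passenger flow in the areas of a subway station, more accurate and more real-time passenger flow monitoring can be achieved.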

Description

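Method, system and electronic device for estimating passenger flow in areas within a subway station
Technical Field
The present application belongs to the technical field of rail transit, and in particular relates to a method, system and electronic device for estimating passenger flow in areas within a subway station.
Background
With the vigorous development of subways in recent years, and especially as many cities' 13th Five-Year Plans give rail transit a very prominent position, rail transit is expected to keep developing rapidly over the coming decade and to play an increasingly important role in social development and progress.
Estimating the passenger flow in areas within a subway station makes it possible to track how passenger flow changes. Subway authorities can then draw up operating plans for different passenger flow conditions or respond promptly to emergencies. This not only improves passenger comfort and satisfaction, but also helps the authorities devise reasonable operating schemes as passenger flow changes, and is of great significance for planning new subway lines and for unified management, operation and even decision-making.
At present, methods for estimating passenger flow in areas within a subway station mainly include passenger flow monitoring techniques such as manual visual counting, infrared sensing, video recognition and RFID (Radio Frequency Identification). Most of these methods rely on a single independent data source, so the resulting passenger flow statistics have low accuracy and large errors.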
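Summary of the Invention
The present application provides a method, system and electronic device for estimating passenger flow in areas within a subway station, aiming to solve, at least to some extent, one of the above technical problems in the prior art.
To solve the above problems, the present application provides the following technical solutions:
A method for estimating passenger flow in areas within a subway station, comprising the following steps:
Step a: calculate the station's ticket-card passenger flow from the subway ticket-card data;
Step b: calculate, from the MAC data collected by AP devices, the MAC-data passenger flow corresponding to the ticket-card data;
Step c: use a linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow;
Step d: divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and map the granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.
The technical solution adopted by the embodiments of the present application further includes: step b further comprises cleaning the MAC data collected by the AP devices so that each passenger's MAC data corresponds to their ticket-card data.
The technical solution adopted by the embodiments of the present application further includes: the MAC data cleaning specifically comprises:
Duplicate-code cleaning: within a time interval, assign the MAC data with the greatest signal strength to the area of the device with the strongest signal;
Device-code cleaning: remove the MAC records belonging to AP devices from the MAC data by means of a static MAC information table;
Pseudo-code cleaning: a MAC address whose second digit is 0, 4, 8 or C is a non-pseudo code; in addition, based on the travel trajectory of each MAC, MAC data that appears at only one station is filtered out, and the remaining MAC data is taken by default to be real and valid;
Abnormal-code cleaning: filter out MAC data whose signal level is outside the valid signal range, MAC data that is collected in the early hours of the morning and collected multiple times, and MAC data collected by the same AP device for a cumulative duration exceeding a certain time range.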
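The technical solution adopted by the embodiments of the present application further includes: in step c, using the linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow specifically comprises: fitting the relationship between the ticket-card passenger flow and the MAC-data passenger flow with a data fitting method, and solving for the unknown parameters with a neural network combined with gradient descent; a neural network model with no hidden layer is constructed, containing three input neurons and one output neuron, and the output of the neural network model is:
$\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^{T}, \quad i = 1, 2, \ldots, n$
In the above formula, prediction denotes the value of the objective function (the model output), and a, b, c are the parameters of the corresponding terms;
The loss function is defined as:
Figure PCTCN2019130540-appb-000001
In the above formula, loss denotes the loss value, prediction the predicted value, target the true value, and n the number of samples;
The goal of the algorithm is to use gradient descent to find a set of model parameters θ that minimizes the loss function:
$\theta = [a \;\; b \;\; c]^{T}$
$\theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$
The model parameters are optimized by gradient descent, with the parameter update formula:
Figure PCTCN2019130540-appb-000002
In the above formula,
Figure PCTCN2019130540-appb-000003
denotes the learning rate; by minimizing the mean squared error with gradient descent, the parameters a, b, c of the objective function are obtained, which gives the correspondence between the ticket-card passenger flow and the MAC-data passenger flow.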
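Another technical solution adopted by the embodiments of the present application is a system for estimating passenger flow in areas within a subway station, comprising:
Ticket-card passenger flow calculation module: used to calculate the station's ticket-card passenger flow from the subway ticket-card data;
MAC-data passenger flow calculation module: used to calculate, from the MAC data collected by AP devices, the MAC-data passenger flow corresponding to the ticket-card data;
Data fitting module: used to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow with a linear regression method;
Area passenger flow calculation module: used to divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and map the granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.
The technical solution adopted by the embodiments of the present application further includes a MAC data cleaning module, which is used to clean the MAC data collected by the AP devices so that each passenger's MAC data corresponds to their ticket-card data.
The technical solution adopted by the embodiments of the present application further includes: the MAC data cleaning specifically comprises:
Duplicate-code cleaning: within a time interval, assign the MAC data with the greatest signal strength to the area of the device with the strongest signal;
Device-code cleaning: remove the MAC records belonging to AP devices from the MAC data by means of a static MAC information table;
Pseudo-code cleaning: a MAC address whose second digit is 0, 4, 8 or C is a non-pseudo code; in addition, based on the travel trajectory of each MAC, MAC data that appears at only one station is filtered out, and the remaining MAC data is taken by default to be real and valid;
Abnormal-code cleaning: filter out MAC data whose signal level is outside the valid signal range, MAC data that is collected in the early hours of the morning and collected multiple times, and MAC data collected by the same AP device for a cumulative duration exceeding a certain time range.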
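The technical solution adopted by the embodiments of the present application further includes: the data fitting module using the linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow specifically comprises: fitting the relationship between the ticket-card passenger flow and the MAC-data passenger flow with a data fitting method, and solving for the unknown parameters with a neural network combined with gradient descent; a neural network model with no hidden layer is constructed, containing three input neurons and one output neuron, and the output of the neural network model is:
$\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^{T}, \quad i = 1, 2, \ldots, n$
In the above formula, prediction denotes the value of the objective function (the model output), and a, b, c are the parameters of the corresponding terms;
The loss function is defined as:
Figure PCTCN2019130540-appb-000004
In the above formula, loss denotes the loss value, prediction the predicted value, target the true value, and n the number of samples;
The goal of the algorithm is to use gradient descent to find a set of model parameters θ that minimizes the loss function:
$\theta = [a \;\; b \;\; c]^{T}$
$\theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$
The model parameters are optimized by gradient descent, with the parameter update formula:
Figure PCTCN2019130540-appb-000005
In the above formula,
Figure PCTCN2019130540-appb-000006
denotes the learning rate; by minimizing the mean squared error with gradient descent, the parameters a, b, c of the objective function are obtained, which gives the correspondence between the ticket-card passenger flow and the MAC-data passenger flow.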
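Yet another technical solution adopted by the embodiments of the present application is an electronic device, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the following operations of the above method for estimating passenger flow in areas within a subway station:
Step a: calculate the station's ticket-card passenger flow from the subway ticket-card data;
Step b: calculate, from the MAC data collected by AP devices, the MAC-data passenger flow corresponding to the ticket-card data;
Step c: use a linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow;
Step d: divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and map the granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.
Compared with the prior art, the beneficial effect of the embodiments of the present application is as follows: the method, system and electronic device combine subway ticket-card data with MAC data collected by AP devices, using multiple data sources to estimate the passenger flow of areas within the station, which enables more accurate and more real-time passenger flow monitoring than the prior art.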
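Brief Description of the Drawings
FIG. 1 is a flowchart of the method for estimating passenger flow in areas within a subway station according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the neural network model structure;
FIG. 3 shows the relationship curves of the ticket-card passenger flow and the MAC-data passenger flow;
FIG. 4 is a schematic diagram of the station area division;
FIG. 5 is a schematic structural diagram of the system for estimating passenger flow in areas within a subway station according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of the hardware device for the method for estimating passenger flow in areas within a subway station provided by an embodiment of the present application.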
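Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and are not intended to limit it.
To solve the problems of the prior art, the method of the embodiments of the present application combines subway ticket-card data with MAC (Media Access Control) data collected by AP (Access Point, i.e. wireless access point) devices, using multiple data sources to estimate the passenger flow of areas within a subway station more accurately and in closer to real time.
Please refer to FIG. 1, a flowchart of the method for estimating passenger flow in areas within a subway station according to an embodiment of the present application. The method includes the following steps:
Step 100: Obtain the subway ticket-card data collected by the subway ticketing system, and calculate the station's ticket-card passenger flow from it (at a granularity of * minutes);
Step 200: Obtain the MAC data collected by the AP devices in the station;
Step 300: Clean the obtained MAC data so that each passenger's MAC data corresponds to their ticket-card data;
In step 300, the goal of MAC data cleaning is to make each MAC correspond to one real subway passenger; only then can each passenger's MAC data in the station be matched with the ticket-card data. In this embodiment, MAC data cleaning covers four kinds of codes: duplicate codes, device codes, pseudo codes and abnormal codes. The cleaning process is as follows:
A. Duplicate codes: When the system gathers a station's MAC data over a period of time, different devices may collect the same MAC address with different signal strengths, so duplicates must be handled. Because the server uploads MAC data once every time interval t, within each such interval the MAC data with the greatest signal strength is assigned to the area of the device that recorded the strongest signal.
B. Device codes: AP devices have MAC addresses of their own, and they continuously collect one another's addresses during operation. The MAC records belonging to AP devices therefore need to be removed from the MAC data using the static device MAC information table supplied by the AP vendor.
C. Pseudo codes: To protect user privacy, terminals such as iPhones and some Android phones automatically randomize their MAC address before transmitting it. The address collected by the AP device is then not the phone's real MAC address, and it is re-randomized on every transmission to the collecting device, which produces pseudo codes that pollute the data. The handling rule is: a MAC address whose second (hexadecimal) digit is 0, 4, 8 or C is a non-pseudo code. In addition, where historical data is available, the travel trajectory of each MAC can also be used: MAC data that appears at only one station is filtered out, and the remaining MAC data is taken by default to be real and valid.
D. Abnormal codes: These include MACs outside the station that are collected with a very weak signal, MACs of other devices not listed in the static MAC information table, and the MACs of station staff. The handling rule is: only a collected MAC signal level between -40 and -120 counts as valid, so MAC data whose signal level is outside this range can be filtered out directly.
In addition, AP devices generally run 24 hours a day, whereas passengers are generally not in the station in the early hours of the morning, so MAC data that is collected in the early hours and collected multiple times can also be filtered out. Because an AP device uploads data once every time interval t, if a MAC is collected by the same AP device for a cumulative duration exceeding a certain time range (a pattern that typically matches staff stationed at the station rather than passengers), that MAC can also be judged abnormal.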
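The four cleaning rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the record layout `MacRecord`, the function names, the early-hours window of 01:00 to 05:00, and the thresholds `max_night_hits` and `max_dwell_bins` are all hypothetical values introduced here. The trajectory check across stations is omitted because it needs multi-station history.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MacRecord:
    mac: str     # e.g. "3C:5A:B4:01:02:03"
    ap_id: str   # hypothetical identifier of the collecting AP device
    ts: int      # collection time, seconds since midnight
    rssi: int    # signal level reported by the AP


def is_real_mac(mac: str) -> bool:
    """Rule C: a second hex digit of 0, 4, 8 or C marks a non-pseudo address."""
    return mac[1].upper() in "048C"


def dedupe_strongest(records, t):
    """Rule A: per MAC and per interval of t seconds, keep only the record
    with the strongest signal; its AP then determines the area."""
    best = {}
    for r in records:
        key = (r.mac, r.ts // t)
        if key not in best or r.rssi > best[key].rssi:
            best[key] = r
    return list(best.values())


def clean(records, ap_macs, t=60, rssi_range=(-120, -40),
          night=(1 * 3600, 5 * 3600), max_night_hits=1, max_dwell_bins=120):
    """Apply rules A-D; every threshold here is an illustrative assumption."""
    lo, hi = rssi_range
    ap_macs = {m.upper() for m in ap_macs}
    kept = [r for r in records
            if r.mac.upper() not in ap_macs   # Rule B: drop AP devices
            and is_real_mac(r.mac)            # Rule C: drop randomized MACs
            and lo <= r.rssi <= hi]           # Rule D: valid signal range
    kept = dedupe_strongest(kept, t)          # Rule A

    night_hits = defaultdict(int)             # sightings in the early hours
    dwell = defaultdict(set)                  # intervals seen per (MAC, AP)
    for r in kept:
        if night[0] <= r.ts < night[1]:
            night_hits[r.mac] += 1
        dwell[(r.mac, r.ap_id)].add(r.ts // t)

    abnormal = {m for m, n in night_hits.items() if n > max_night_hits}
    abnormal |= {m for (m, _), bins in dwell.items()
                 if len(bins) > max_dwell_bins}
    return [r for r in kept if r.mac not in abnormal]
```

The digit test in `is_real_mac` matches the rule in the text because the two lowest bits of the first octet of a MAC address are the multicast and locally-administered flags; both are zero exactly when the second hex digit is 0, 4, 8 or C.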
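Finally, counting the MAC data that has been cleaned by the above steps gives the granular passenger flow (at a granularity of * minutes) of a given station on a given day, corresponding to the ticket-card data.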
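Counting per time slice might look like the sketch below. The function name `granular_flow`, the default slice width of 5 minutes (the source leaves the width unspecified as "*"), the seconds-since-midnight timestamps, and the choice to count distinct identifiers per slice are all assumptions made for illustration. The same routine could in principle be applied to card numbers from the ticket-card data in step 100 so that both series share one time grid.

```python
from collections import defaultdict


def granular_flow(events, minutes=5):
    """Count distinct IDs (MAC addresses or card numbers) per time slice.

    events: iterable of (identifier, seconds_since_midnight) pairs.
    Returns {slice_index: number_of_distinct_identifiers}.
    """
    width = minutes * 60
    seen = defaultdict(set)
    for ident, ts in events:
        seen[ts // width].add(ident)
    return {k: len(v) for k, v in sorted(seen.items())}
```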
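Step 400: Calculate, from the cleaned MAC data, the station's MAC-data passenger flow corresponding to the ticket-card data;
Step 500: Use a linear regression method to fit the correspondence between the ticket-card passenger flow and the MAC-data passenger flow;
In step 500, the ticket-card passenger flow calculated from ticket-card data is only the overall passenger flow of the station and cannot give the flow of individual areas, while the MAC-data passenger flow calculated from MAC data suffers a certain loss. A proportional relationship between the two therefore needs to be established, so that the granular passenger flow of each area in the station can then be calculated from the MAC data.
This embodiment fits the relationship between the ticket-card passenger flow and the MAC-data passenger flow using a data fitting method; data fitting methods include least squares, genetic algorithms, neural networks, and so on. The purpose of data fitting is to approximate a complex, unknown function with a simpler one. The data layout is shown in Table 1:
Time | T_1 | T_2 | ... | T_{n-1} | T_n
Ticket-card granular passenger flow | x_1 | x_2 | ... | x_{n-1} | x_n
MAC-data granular passenger flow | y_1 | y_2 | ... | y_{n-1} | y_n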
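This embodiment uses a neural network combined with gradient descent to solve for the unknown parameters. Specifically:
A neural network model with no hidden layer is constructed, containing three input neurons and one output neuron; its structure is shown in FIG. 2. The output of the neural network model (the objective function) can be expressed as:
$\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^{T}, \quad i = 1, 2, \ldots, n$  (1)
In formula (1), prediction denotes the value of the objective function (the model output), and a, b, c are the parameters of the corresponding terms.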
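Formula (1) amounts to a dot product between the three network inputs [1, x_i, x_i^2] and the parameter vector [a, b, c]. The model is quadratic in x but linear in its parameters, which is why it can still be treated as a linear regression. A minimal sketch, with illustrative function names that do not come from the source:

```python
def features(x):
    """The three inputs of the network for one sample: [1, x, x^2]."""
    return [1.0, x, x * x]


def predict(theta, x):
    """Formula (1): prediction = [1, x, x^2] . [a, b, c]^T."""
    return sum(w * f for w, f in zip(theta, features(x)))
```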
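The loss function can be defined as:
Figure PCTCN2019130540-appb-000007
In formula (2), loss denotes the loss value, prediction the predicted value, target the true value, and n the number of samples.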
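The image for formula (2) is not reproduced in this text. Given the symbols defined for it and the statement elsewhere in the description that the mean squared error is minimized, one form consistent with the description would be the following. The exact normalization used in the original figure (for example an extra factor of 1/2) cannot be recovered from the text, so this is an assumed reconstruction rather than the published formula.

```latex
\mathrm{loss} = \frac{1}{n} \sum_{i=1}^{n}
  \left( \mathrm{prediction}^{(i)} - \mathrm{target}^{(i)} \right)^{2}
```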
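The goal of the algorithm is to use gradient descent to find a set of model parameters θ that minimizes the loss function:
$\theta = [a \;\; b \;\; c]^{T}$  (3)
$\theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$  (4)
The model parameters are optimized by gradient descent, with the parameter update formula:
Figure PCTCN2019130540-appb-000008
In formula (5),
Figure PCTCN2019130540-appb-000009
denotes the learning rate.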
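The update rule itself is only available as an image, so the sketch below uses the standard batch gradient descent step, in which each parameter moves against its gradient scaled by the learning rate. It assumes the loss is the mean squared error (1/n) Σ (prediction − target)²; the function name `fit_quadratic`, the zero initialization and the default hyperparameters are illustrative choices, not taken from the source.

```python
def fit_quadratic(xs, ys, lr=0.1, epochs=1000):
    """Fit y ~ a + b*x + c*x^2 by batch gradient descent on the MSE."""
    a = b = c = 0.0
    n = len(xs)
    for _ in range(epochs):
        ga = gb = gc = 0.0
        for x, y in zip(xs, ys):
            err = (a + b * x + c * x * x) - y   # prediction - target
            ga += 2.0 * err / n                 # d(loss)/da
            gb += 2.0 * err * x / n             # d(loss)/db
            gc += 2.0 * err * x * x / n         # d(loss)/dc
        # Step against the gradient, scaled by the learning rate.
        a, b, c = a - lr * ga, b - lr * gb, c - lr * gc
    return a, b, c
```

In practice the inputs would probably need rescaling first (for example dividing the flows by their maximum), because squaring raw passenger counts produces very large gradients and plain gradient descent then diverges unless the learning rate is tiny.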
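By minimizing the mean squared error with gradient descent, the parameters a, b, c of the objective function can be obtained, which gives the correspondence between the ticket-card passenger flow and the MAC-data passenger flow. FIG. 3 shows the relationship curves of the ticket-card passenger flow and the MAC-data passenger flow. As the figure shows, the granular passenger flow calculated from either the ticket-card data or the MAC data follows a regular pattern: it is bimodal over the day, with high passenger flow at the station during the morning and evening peaks and lower flow in off-peak periods. The patterns obtained from the two data sources are largely consistent and show a certain correlation.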
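Step 600: Divide the station into grid areas, clean the MAC data collected by the AP devices in each area and calculate the granular passenger flow from it, then map that granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.
In step 600, the area passenger flow has to be calculated from MAC data. Because the AP devices are distributed across the station by area, the station is divided into grid areas, as shown in the station area division diagram of FIG. 4. To calculate the passenger flow in area A, only the MAC data collected by the AP devices in area A needs to be analysed. Since duplicate codes were already handled during data preprocessing, what remains is the MAC data for which an AP device in area A recorded the strongest signal. In addition, depending on how densely the AP devices are deployed, MAC data whose signal is below a certain value is filtered out, to ensure that the MAC appears only in area A and not in area B.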
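Step 600 could be sketched as below, under several assumptions that are not in the source: each AP is mapped to exactly one grid area through a hypothetical `ap_area` table; the records passed in are already cleaned and deduplicated, so a MAC appears at most once per time slice; `min_rssi` stands in for the density-dependent signal cutoff; and `to_passengers` is whatever mapping from a MAC-based count to a passenger count results from step 500. The source does not spell out in which direction the fitted relation is applied, so that mapping is left as a caller-supplied function.

```python
from collections import defaultdict


def area_flow(records, ap_area, to_passengers, min_rssi=-90):
    """Estimate passenger flow per area for one time slice.

    records: iterable of (mac, ap_id, rssi), already cleaned and deduplicated.
    ap_area: dict mapping AP id -> area label (e.g. "A", "B").
    to_passengers: callable turning a MAC count into a passenger estimate.
    """
    macs = defaultdict(set)
    for mac, ap_id, rssi in records:
        # Skip weak signals and APs that are not assigned to any area.
        if rssi >= min_rssi and ap_id in ap_area:
            macs[ap_area[ap_id]].add(mac)
    return {area: to_passengers(len(m)) for area, m in sorted(macs.items())}
```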
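Please refer to FIG. 5, a schematic structural diagram of the system for estimating passenger flow in areas within a subway station according to an embodiment of the present application. The system includes a ticket-card passenger flow calculation module, a MAC data acquisition module, a MAC data cleaning module, a MAC-data passenger flow calculation module, a data fitting module and an area passenger flow calculation module.
Ticket-card passenger flow calculation module: used to obtain the subway ticket-card data collected by the subway ticketing system, and to calculate the station's ticket-card passenger flow from it;
MAC data acquisition module: used to obtain the MAC data collected by the AP devices in the station;
MAC data cleaning module: used to clean the obtained MAC data so that each passenger's MAC data corresponds to their ticket-card data. The goal of MAC data cleaning is to make each MAC correspond to one real subway passenger; only then can each passenger's MAC data in the station be matched with the ticket-card data. In this embodiment, MAC data cleaning covers four kinds of codes: duplicate codes, device codes, pseudo codes and abnormal codes. The cleaning process is as follows:
A. Duplicate codes: When the system gathers a station's MAC data over a period of time, different devices may collect the same MAC address with different signal strengths, so duplicates must be handled. Because the server uploads MAC data once every time interval t, within each such interval the MAC with the greatest signal strength is assigned to the area of the device that recorded the strongest signal.
B. Device codes: AP devices have MAC addresses of their own, and they continuously collect one another's addresses during operation. The MAC records belonging to AP devices therefore need to be removed from the MAC data using the static device MAC information table supplied by the AP vendor.
C. Pseudo codes: To protect user privacy, terminals such as iPhones and some Android phones automatically randomize their MAC address before transmitting it. The address collected by the AP device is then not the phone's real MAC address, and it is re-randomized on every transmission to the collecting device, which produces pseudo codes that pollute the data. The handling rule is: a MAC address whose second (hexadecimal) digit is 0, 4, 8 or C is a non-pseudo code. In addition, where historical data is available, the travel trajectory of each MAC can also be used: MAC data that appears at only one station is filtered out, and the remaining MAC data is taken by default to be real and valid.
D. Abnormal codes: These include MACs outside the station that are collected with a very weak signal, MACs of other devices not listed in the static MAC information table, and the MACs of station staff. The handling rule is: only a collected MAC signal level between -40 and -120 counts as valid, so MAC data whose signal level is outside this range can be filtered out directly.
In addition, AP devices generally run 24 hours a day, whereas passengers are generally not in the station in the early hours of the morning, so MAC data that is collected in the early hours and collected multiple times can also be filtered out. Because an AP device uploads data once every time interval t, if a MAC is collected by the same AP device for a cumulative duration exceeding a certain time range (a pattern that typically matches staff stationed at the station rather than passengers), that MAC can also be judged abnormal.
Finally, counting the MAC data that has been cleaned by the above steps gives the granular passenger flow (at a granularity of * minutes) of a given station on a given day, corresponding to the ticket-card data.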
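MAC-data passenger flow calculation module: used to calculate, from the cleaned MAC data, the station's MAC-data passenger flow corresponding to the ticket-card data;
Data fitting module: used to fit the correspondence between the ticket-card passenger flow and the MAC-data passenger flow with a linear regression method. The ticket-card passenger flow calculated from ticket-card data is only the overall passenger flow of the station and cannot give the flow of individual areas, while the MAC-data passenger flow calculated from MAC data suffers a certain loss. A proportional relationship between the two therefore needs to be established, so that the granular passenger flow of each area in the station can then be calculated from the MAC data.
This embodiment fits the relationship between the ticket-card passenger flow and the MAC-data passenger flow using a data fitting method; data fitting methods include least squares, genetic algorithms, neural networks, and so on. The purpose of data fitting is to approximate a complex, unknown function with a simpler one. The data layout is shown in Table 1:
Time | T_1 | T_2 | ... | T_{n-1} | T_n
Ticket-card granular passenger flow | x_1 | x_2 | ... | x_{n-1} | x_n
MAC-data granular passenger flow | y_1 | y_2 | ... | y_{n-1} | y_n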
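This embodiment uses a neural network combined with gradient descent to solve for the unknown parameters. Specifically:
A neural network model with no hidden layer is constructed, containing three input neurons and one output neuron; its structure is shown in FIG. 2. The output of the neural network model (the objective function) can be expressed as:
$\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^{T}, \quad i = 1, 2, \ldots, n$  (1)
In formula (1), prediction denotes the value of the objective function (the model output), and a, b, c are the parameters of the corresponding terms.
The loss function can be defined as:
Figure PCTCN2019130540-appb-000010
In formula (2), loss denotes the loss value, prediction the predicted value, target the true value, and n the number of samples.
The goal of the algorithm is to use gradient descent to find a set of model parameters θ that minimizes the loss function:
$\theta = [a \;\; b \;\; c]^{T}$  (3)
$\theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$  (4)
The model parameters are optimized by gradient descent, with the parameter update formula:
Figure PCTCN2019130540-appb-000011
In formula (5),
Figure PCTCN2019130540-appb-000012
denotes the learning rate.
By minimizing the mean squared error with gradient descent, the parameters a, b, c of the objective function can be obtained, which gives the correspondence between the ticket-card passenger flow and the MAC-data passenger flow. FIG. 3 shows the relationship curves of the ticket-card passenger flow and the MAC-data passenger flow. As the figure shows, the granular passenger flow calculated from either the ticket-card data or the MAC data follows a regular pattern: it is bimodal over the day, with high passenger flow at the station during the morning and evening peaks and lower flow in off-peak periods. The patterns obtained from the two data sources are largely consistent and show a certain correlation.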
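Area passenger flow calculation module: used to divide the station into grid areas, clean the MAC data collected by the AP devices in each area and calculate the granular passenger flow from it, then map that granular passenger flow through the correspondence to obtain the passenger flow of each area in the station. The area passenger flow has to be calculated from MAC data. Because the AP devices are distributed across the station by area, the station is divided into grid areas, as shown in the station area division diagram of FIG. 4. To calculate the passenger flow in area A, only the MAC data collected by the AP devices in area A needs to be analysed. Since duplicate codes were already handled during data preprocessing, what remains is the MAC data for which an AP device in area A recorded the strongest signal. In addition, depending on how densely the AP devices are deployed, MAC data whose signal is below a certain value is filtered out, to ensure that the MAC appears only in area A and not in area B.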
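FIG. 6 is a schematic structural diagram of the hardware device for the method for estimating passenger flow in areas within a subway station provided by an embodiment of the present application. As shown in FIG. 6, the device includes one or more processors and a memory. Taking one processor as an example, the device may further include an input system and an output system.
The processor, memory, input system and output system may be connected by a bus or in other ways; FIG. 6 takes a bus connection as an example.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules. By running the non-transitory software programs, instructions and modules stored in the memory, the processor executes the various functional applications and data processing of the electronic device, that is, implements the processing method of the above method embodiments.
The memory may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data and the like. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be connected to the processing system over a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
The input system can receive input numeric or character information and generate signal input. The output system may include a display device such as a display screen.
The one or more modules are stored in the memory and, when executed by the one or more processors, perform the following operations of any of the above method embodiments: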
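Step a: calculate the station's ticket-card passenger flow from the subway ticket-card data;
Step b: calculate, from the MAC data collected by AP devices, the MAC-data passenger flow corresponding to the ticket-card data;
Step c: use a linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow;
Step d: divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and map the granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.
The above product can execute the method provided by the embodiments of the present application, and has the functional modules and beneficial effects corresponding to executing the method. For technical details not described exhaustively in this embodiment, refer to the method provided by the embodiments of the present application.
An embodiment of the present application provides a non-transitory (non-volatile) computer storage medium storing computer-executable instructions that can perform the following operations:
Step a: calculate the station's ticket-card passenger flow from the subway ticket-card data;
Step b: calculate, from the MAC data collected by AP devices, the MAC-data passenger flow corresponding to the ticket-card data;
Step c: use a linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow;
Step d: divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and map the granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.
An embodiment of the present application provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the following operations:
Step a: calculate the station's ticket-card passenger flow from the subway ticket-card data;
Step b: calculate, from the MAC data collected by AP devices, the MAC-data passenger flow corresponding to the ticket-card data;
Step c: use a linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow;
Step d: divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and map the granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.
The method, system and electronic device of the embodiments of the present application combine subway ticket-card data with MAC data collected by AP devices, using multiple data sources to estimate the passenger flow of areas within a subway station, which enables more accurate and more real-time passenger flow monitoring than the prior art.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. The present application is therefore not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.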

Claims (9)

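  1. A method for estimating passenger flow in areas within a subway station, characterized by comprising the following steps:
    Step a: calculate the station's ticket-card passenger flow from the subway ticket-card data;
    Step b: calculate, from the MAC data collected by AP devices, the MAC-data passenger flow corresponding to the ticket-card data;
    Step c: use a linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow;
    Step d: divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and map the granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.
  2. The method for estimating passenger flow in areas within a subway station according to claim 1, characterized in that step b further comprises: cleaning the MAC data collected by the AP devices so that each passenger's MAC data corresponds to their ticket-card data.
  3. The method for estimating passenger flow in areas within a subway station according to claim 2, characterized in that the MAC data cleaning specifically comprises:
    Duplicate-code cleaning: within a time interval, assign the MAC data with the greatest signal strength to the area of the device with the strongest signal;
    Device-code cleaning: remove the MAC records belonging to AP devices from the MAC data by means of a static MAC information table;
    Pseudo-code cleaning: a MAC address whose second digit is 0, 4, 8 or C is a non-pseudo code; in addition, based on the travel trajectory of each MAC, MAC data that appears at only one station is filtered out, and the remaining MAC data is taken by default to be real and valid;
    Abnormal-code cleaning: filter out MAC data whose signal level is outside the valid signal range, MAC data that is collected in the early hours of the morning and collected multiple times, and MAC data collected by the same AP device for a cumulative duration exceeding a certain time range.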
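  4. The method for estimating passenger flow in areas within a subway station according to any one of claims 1 to 3, characterized in that, in step c, using the linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow specifically comprises: fitting the relationship between the ticket-card passenger flow and the MAC-data passenger flow with a data fitting method, and solving for the unknown parameters with a neural network combined with gradient descent; a neural network model with no hidden layer is constructed, containing three input neurons and one output neuron, and the output of the neural network model is:
    $\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^{T}, \quad i = 1, 2, \ldots, n$
    In the above formula, prediction denotes the value of the objective function (the model output), and a, b, c are the parameters of the corresponding terms;
    The loss function is defined as:
    Figure PCTCN2019130540-appb-100001
    In the above formula, loss denotes the loss value, prediction the predicted value, target the true value, and n the number of samples;
    The goal of the algorithm is to use gradient descent to find a set of model parameters θ that minimizes the loss function:
    $\theta = [a \;\; b \;\; c]^{T}$
    $\theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$
    The model parameters are optimized by gradient descent, with the parameter update formula:
    Figure PCTCN2019130540-appb-100002
    In the above formula,
    Figure PCTCN2019130540-appb-100003
    denotes the learning rate; by minimizing the mean squared error with gradient descent, the parameters a, b, c of the objective function are obtained, which gives the correspondence between the ticket-card passenger flow and the MAC-data passenger flow.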
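  5. A system for estimating passenger flow in areas within a subway station, characterized by comprising:
    Ticket-card passenger flow calculation module: used to calculate the station's ticket-card passenger flow from the subway ticket-card data;
    MAC-data passenger flow calculation module: used to calculate, from the MAC data collected by AP devices, the MAC-data passenger flow corresponding to the ticket-card data;
    Data fitting module: used to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow with a linear regression method;
    Area passenger flow calculation module: used to divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and map the granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.
  6. The system for estimating passenger flow in areas within a subway station according to claim 5, characterized by further comprising a MAC data cleaning module, which is used to clean the MAC data collected by the AP devices so that each passenger's MAC data corresponds to their ticket-card data.
  7. The system for estimating passenger flow in areas within a subway station according to claim 6, characterized in that the MAC data cleaning specifically comprises:
    Duplicate-code cleaning: within a time interval, assign the MAC data with the greatest signal strength to the area of the device with the strongest signal;
    Device-code cleaning: remove the MAC records belonging to AP devices from the MAC data by means of a static MAC information table;
    Pseudo-code cleaning: a MAC address whose second digit is 0, 4, 8 or C is a non-pseudo code; in addition, based on the travel trajectory of each MAC, MAC data that appears at only one station is filtered out, and the remaining MAC data is taken by default to be real and valid;
    Abnormal-code cleaning: filter out MAC data whose signal level is outside the valid signal range, MAC data that is collected in the early hours of the morning and collected multiple times, and MAC data collected by the same AP device for a cumulative duration exceeding a certain time range.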
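  8. The system for estimating passenger flow in areas within a subway station according to any one of claims 5 to 7, characterized in that the data fitting module using the linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow specifically comprises: fitting the relationship between the ticket-card passenger flow and the MAC-data passenger flow with a data fitting method, and solving for the unknown parameters with a neural network combined with gradient descent; a neural network model with no hidden layer is constructed, containing three input neurons and one output neuron, and the output of the neural network model is:
    $\mathrm{prediction}^{(i)} = a + b x_i + c x_i^2 = [1 \;\; x_i \;\; x_i^2]\,[a \;\; b \;\; c]^{T}, \quad i = 1, 2, \ldots, n$
    In the above formula, prediction denotes the value of the objective function (the model output), and a, b, c are the parameters of the corresponding terms;
    The loss function is defined as:
    Figure PCTCN2019130540-appb-100004
    In the above formula, loss denotes the loss value, prediction the predicted value, target the true value, and n the number of samples;
    The goal of the algorithm is to use gradient descent to find a set of model parameters θ that minimizes the loss function:
    $\theta = [a \;\; b \;\; c]^{T}$
    $\theta = \arg\min_{\theta} \lVert \mathrm{loss}(\theta) \rVert$
    The model parameters are optimized by gradient descent, with the parameter update formula:
    Figure PCTCN2019130540-appb-100005
    In the above formula,
    Figure PCTCN2019130540-appb-100006
    denotes the learning rate; by minimizing the mean squared error with gradient descent, the parameters a, b, c of the objective function are obtained, which gives the correspondence between the ticket-card passenger flow and the MAC-data passenger flow.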
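  9. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the following operations of the method for estimating passenger flow in areas within a subway station according to any one of claims 1 to 4:
    Step a: calculate the station's ticket-card passenger flow from the subway ticket-card data;
    Step b: calculate, from the MAC data collected by AP devices, the MAC-data passenger flow corresponding to the ticket-card data;
    Step c: use a linear regression method to calculate the correspondence between the ticket-card passenger flow and the MAC-data passenger flow;
    Step d: divide the station into grid areas, calculate the granular passenger flow from the MAC data collected by the AP devices in each area, and map the granular passenger flow through the correspondence to obtain the passenger flow of each area in the station.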
PCT/CN2019/130540 2019-04-22 2019-12-31 Method, system and electronic device for estimating passenger flow in areas within a subway station WO2020215798A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910321767.7 2019-04-22
CN201910321767.7A CN110109991B (zh) 2019-04-22 2019-04-22 Method, system and electronic device for estimating passenger flow in areas within a subway station

Publications (1)

Publication Number Publication Date
WO2020215798A1 true WO2020215798A1 (zh) 2020-10-29

Family

ID=67486087

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130540 WO2020215798A1 (zh) 2019-04-22 2019-12-31 Method, system and electronic device for estimating passenger flow in areas within a subway station

Country Status (2)

Country Link
CN (1) CN110109991B (zh)
WO (1) WO2020215798A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113962472A (zh) * 2021-10-31 2022-01-21 东南大学 Short-term subway passenger flow prediction method with spatio-temporal dual attention based on a GAT-Seq2seq model

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
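CN110109991B (zh) * 2019-04-22 2021-08-24 中国科学院深圳先进技术研究院 Method, system and electronic device for estimating passenger flow in areas within a subway station
CN111757272B (zh) * 2020-06-29 2024-03-05 北京百度网讯科技有限公司 Method for predicting subway congestion level, model training method and apparatus
CN111885639A (zh) * 2020-07-24 2020-11-03 上海应用技术大学 Subway crowd flow detection method and system
CN112686428B (zh) * 2020-12-15 2022-07-19 广州新科佳都科技有限公司 Subway passenger flow prediction method and apparatus based on the similarity of stations in the subway network
CN114399726B (zh) * 2021-12-06 2023-07-07 上海市黄浦区城市运行管理中心(上海市黄浦区城市网格化综合管理中心、上海市黄浦区大数据中心) Method and system for real-time intelligent passenger flow monitoring and early warning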

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160125297A1 (en) * 2014-10-30 2016-05-05 Umm Al-Qura University System and method for solving spatiotemporal-based problems
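CN105869388A (zh) * 2016-05-31 2016-08-17 苏州朗捷通智能科技有限公司 Method and system for bus passenger flow data collection and origin-destination analysis
CN108022000A (zh) * 2016-10-28 2018-05-11 浙江师范大学 Subway passenger flow prediction and early-warning system and method
CN110109991A (zh) * 2019-04-22 2019-08-09 中国科学院深圳先进技术研究院 Method, system and electronic device for estimating passenger flow in areas within a subway station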

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180012197A1 (en) * 2016-07-07 2018-01-11 NextEv USA, Inc. Battery exchange licensing program based on state of charge of battery pack
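CN207883011U (zh) * 2017-11-10 2018-09-18 同济大学 Bus passenger flow data collection device and OD analysis system
CN108665178B (zh) * 2018-05-17 2020-05-29 上海工程技术大学 AFC-based method for predicting passenger flow on stairs and escalators in subway stations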

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
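CN113962472A (zh) * 2021-10-31 2022-01-21 东南大学 Short-term subway passenger flow prediction method with spatio-temporal dual attention based on a GAT-Seq2seq model
CN113962472B (zh) * 2021-10-31 2024-04-19 东南大学 Short-term subway passenger flow prediction method with spatio-temporal dual attention based on a GAT-Seq2seq model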

Also Published As

Publication number Publication date
CN110109991B (zh) 2021-08-24
CN110109991A (zh) 2019-08-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19926202

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19926202

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 180322)

122 Ep: pct application non-entry in european phase

Ref document number: 19926202

Country of ref document: EP

Kind code of ref document: A1