CN107609682B - Medium-short term early warning method for population aggregation in big data environment - Google Patents

Medium-short term early warning method for population aggregation in big data environment Download PDF

Info

Publication number
CN107609682B
CN107609682B CN201710726883.8A CN201710726883A CN107609682B CN 107609682 B CN107609682 B CN 107609682B CN 201710726883 A CN201710726883 A CN 201710726883A CN 107609682 B CN107609682 B CN 107609682B
Authority
CN
China
Prior art keywords
time
thiessen polygon
thiessen
population
polygon
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710726883.8A
Other languages
Chinese (zh)
Other versions
CN107609682A (en
Inventor
刘杰
顾高翔
张颖
吴佳玲
郭鹏
宫龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI SHIMAI INFORMATION TECHNOLOGY CO LTD
Original Assignee
SHANGHAI SHIMAI INFORMATION TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI SHIMAI INFORMATION TECHNOLOGY CO LTD filed Critical SHANGHAI SHIMAI INFORMATION TECHNOLOGY CO LTD
Priority to CN201710726883.8A priority Critical patent/CN107609682B/en
Publication of CN107609682A publication Critical patent/CN107609682A/en
Application granted granted Critical
Publication of CN107609682B publication Critical patent/CN107609682B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Abstract

The invention aims to utilize an activity data set (namely communication records of a mobile terminal individual and a fixed sensor) of the mobile terminal individual in a specified time range and space range, divide Thiessen polygons according to the space distribution of the fixed sensor to determine the control range of the Thiessen polygons, mine travel space-time sequence data of a large number of individuals, and calculate a one-step state transition matrix of the individual in each Thiessen polygon by adopting a Markov process, thereby forecasting the possibility and mathematical expectation of individual distribution in the subsequent time by monitoring the number of the individuals in each Thiessen polygon in real time and early warning the possible large-scale population aggregation. In order to achieve the purpose, the invention provides a medium-short term early warning method for population aggregation in a big data environment. The method can quickly and effectively predict the potential danger of large-scale population aggregation.

Description

Medium-short term early warning method for population aggregation in big data environment
Technical Field
The invention relates to a population clustering medium and short term prediction and early warning method based on massive anonymous encryption time sequence positioning data, which comprises the steps of constructing massive individual trip time-space sequence data according to individual time and space position data, training by taking the massive individual trip time-space sequence data as a sample, and obtaining position transfer probability characteristics of an individual in space in different time periods; and calculating the positions and the probabilities of the population in different areas at the appointed time point at the next time point by adopting a Markov method, predicting the total population and the occurrence probability of each area in the future time period, and adopting different population aggregation early warning measures according to the prediction result by formulating an early warning mechanism.
Background
With the acceleration of the urbanization process, more and more people flow into the city, people in urban public places gather more and more frequently, and the consequences and the influence are more and more serious. The occurrence of major or special events often brings large-scale passenger flow in a short time and easily causes aggregation, and the public place passenger flow volume overload caused by population aggregation causes various crowding and trampling events to cause serious economic loss and cause adverse social influence, seriously threatens social public safety, and provides a serious challenge for the safety guarantee of public place people. In 2008, 29 months, trampling accidents occurred in the morning of 29 days in suburban areas of the capital of division of the country, i.e. 16 people died, and nearly 20 people were injured; in 31 days 12 th 2014, a plurality of people fall down and fold at the beach outside Huangpu area in Shanghai city, further causing a crowded stepping event to occur, causing 36 people to die, 49 people are injured in 2015, 9 th and 25 th days, and Sadi Arabia Megaka causes a stepping event of a pilgrimage to occur, causing at least 2177 people to die. Therefore, the method is important for the prediction and early warning of the crowd gathering phenomenon in medium and short periods by extracting and analyzing the space-time characteristics of large-scale passenger flow and for improving the urban public safety.
In recent years, with the development of information technology, the data information amount is increased explosively, the data sources are more and more, and the data amount is also more and more huge. Data recorded by information sensors such as mobile phones, WIFI and the Internet of things become the most important data source in big data analysis, and relatively complete individual trip records of the data provide good data support for big data, especially for traffic big data analysis. Taking a mobile phone as an example, in 2015, mobile phone users reach 13.06 hundred million, which accounts for more than 96% of the total population, and signal information continuously generated by mobile phone terminal equipment forms a series of data sets for recording user travel, so that an important data source is provided for large-scale passenger flow generation and extraction of population gathering phenomena and evolution characteristics thereof caused by the large-scale passenger flow generation.
Disclosure of Invention
The invention aims to utilize an activity data set (namely communication records of a mobile terminal individual and a fixed sensor) of the mobile terminal individual in a specified time range and space range, divide Thiessen polygons according to the space distribution of the fixed sensor to determine the control range of the Thiessen polygons, mine travel space-time sequence data of a large number of individuals, and calculate a one-step state transition matrix of the individual in each Thiessen polygon by adopting a Markov process, thereby forecasting the possibility and mathematical expectation of individual distribution in the subsequent time by monitoring the number of the individuals in each Thiessen polygon in real time and early warning the possible large-scale population aggregation.
In order to achieve the purpose, the invention provides a medium-short term early warning method for population aggregation in a big data environment, which is characterized by comprising the following steps:
step 1, a system reads anonymous encrypted mobile terminal sensor data obtained from a sensor operator, wherein the anonymous encrypted mobile terminal sensor data are continuous in time and space, different mobile terminals correspond to different EPIDs, and communication signaling records triggered by each EPID in a specified time period are extracted to form an EPD travel track data set;
step 2, extracting position information of all fixed sensors, clustering and combining the position information, then calculating a Voronoi diagram to obtain the actual control range of each sensor, namely a Thiessen polygon of each sensor, recording the area of each Thiessen polygon, and setting a three-level population bearing threshold of each Thiessen polygon;
step 3, sequentially extracting travel track data sets of each EPID in a specified time period, and sequencing the travel track data sets according to a time sequence; starting from a time starting point T0, interpolating travel trajectory data at intervals of space and time by taking T as a time interval, and establishing a travel space-time sequence data set; mapping each point on the travel time-space sequence data set to a Voronoi diagram, and giving the number of a Thiessen polygon corresponding to each point;
step 4, dividing all user travel track data into three types of working days, weekends, common festivals and major festivals, traversing all Thiessen polygons, starting from a time starting point T0, inquiring EPIDs of each Thiessen polygon in each time period T in the same type of date each day, searching the position of the EPID at the next time, counting the frequency of the EPIDs in a certain Thiessen polygon at the time T to other Thiessen polygons at the next time, and calculating the one-step transition probability of the states of all EPIDs among the Thiessen polygons through a large sample to form a Markov one-step transition matrix of all Thiessen polygons at the time T;
step 5, setting a large-scale population aggregation early warning mechanism according to the three-level population bearing threshold value set in the step 2, monitoring the number of EPIDs in each Thiessen polygon in real time, and expanding samples according to a certain proportion; starting a yellow alert if the population density in a Thiessen polygon reaches a lowest-level population density bearing threshold value at a certain moment, and predicting the mathematical expectation of the total population size and the probability of the maximum population aggregation of the Thiessen polygon and the peripheral Thiessen polygons thereof in a future time period T by using a Markov method with the moment as a starting point;
and 6, continuously predicting the population total amount and population density of each Thiessen polygon in the next stage target range based on the previous stage result according to a Markov method, determining whether to start higher-level orange and red warnings and implement necessary evacuation work or reduce the early warning level according to a preset rule until yellow warning is removed, stopping prediction calculation, and recovering the normal monitoring state.
Preferably, the step 1 comprises:
step 1.1, the system reads the anonymous encryption mobile terminal sensor data obtained from the sensor operator, the anonymous encryption mobile terminal sensor data is continuous in time and space, and the anonymous encryption mobile terminal sensor data comprises the following steps: the method comprises the steps of a unique user number EPID, a communication action TYPE TYPE, a communication action occurrence TIME TIME, a sensor located large area REGIONCODE and a sensor specific number SENSORID, wherein the sensor located large area REGIONCODE and the sensor specific number SENSORID form a sensor number;
step 1.2, one piece of anonymous encryption mobile terminal sensor data is a signaling record, and each signaling record is decrypted;
and step 1.3, inquiring all signaling records of the EPID in a specified time period according to the EPID, and constructing a travel track data set corresponding to the current EPID.
Preferably, the step 2 includes:
step 2.1, extracting the sensor numbers of all the fixed sensors and corresponding longitude and latitude coordinates LON-LAT thereof, and converting the longitude and latitude coordinates into geographic coordinates X-Y;
step 2.2, importing the records of the fixed sensors into geographic information system software, merging the fixed sensors overlapped in a vertical space into one fixed sensor, carrying out cluster analysis on the spatial position of the fixed sensors on the basis, merging the fixed sensors with the distance smaller than rds into one fixed sensor if the cluster radius is rds, taking the gravity center of the spatial position of the merged fixed sensors as the position of the merged fixed sensors, and numbering all the fixed sensors again after merging;
step 2.3, selecting to create Thiessen polygons, creating a Voronoi diagram taking a fixed sensor as the center of the polygons, creating an object for each Thiessen polygon, wherein the number of the ith Thiessen polygon is THi
Step 2.4, associating the numbers of the rearranged fixed sensors to a Thiessen polygon, and if a plurality of fixed sensors overlap or approach in a vertical space for a certain Thiessen polygon, assigning the sensor numbers of the combined fixed sensors to the Thiessen polygon;
step 2.5, endowing the attribute of the geographic space to a Thiessen polygon to measure the population bearing capacity PCC, and specifically comprises the following steps: the method comprises the steps that natural attributes NC, land occupation attributes LUC and construction conditions CC of a plot where a Thiessen polygon is located at present, for the condition that a plurality of fixed sensors are combined in one Thiessen polygon, the population bearing capacity PCC of the Thiessen polygon is correspondingly increased, the ith Thiessen polygon is formed by a plurality of plots which are divided into a road plot, a residential plot, a general plot and a factory building plot, and the population bearing capacity PCC of the ith Thiessen polygon isiThe calculation formula of (2) is as follows:
Figure BDA0001385604120000041
in the formula (I), the compound is shown in the specification,
Figure BDA0001385604120000042
and
Figure BDA0001385604120000043
respectively showing the areas of the jth road land, residential land, general land and commercial facility land;
Figure BDA0001385604120000044
and
Figure BDA0001385604120000045
respectively showing the bearing capacity of the jth road land, residential land, general land and commercial facility land;
Figure BDA0001385604120000046
and
Figure BDA0001385604120000047
respectively representing the number of layers of the jth road land, the residential land, the general land and the commercial facility land;
step 2.6, setting a three-level population bearing threshold value of each Thiessen polygon according to historical experience, wherein the l-level population bearing threshold value of the ith Thiessen polygon is
Figure BDA0001385604120000048
Then there are:
Figure BDA0001385604120000049
in the formula, alEarly warning thresholds are aggregated for class I population.
Preferably, the step 3 comprises:
step 3.1, traversing all travel track data sets of the EPIDs, sequentially arranging according to the triggered communication time TIMESTAMP, traversing the travel track data sets from a time starting point, fitting a secondary curve to every 3 adjacent signaling record points, wherein the X axis of the secondary curve is the time line of the user travel track, and the Y axis is the X-Y coordinates of the signaling record points, so that if the user travel track comprises n communication record points, 2n-4 secondary curves need to be fitted;
step 3.2, starting from an integer time starting point T0, calculating the X-Y coordinates of the user at each time point according to the time interval T, forming an interpolation point by X (T0+ nT) and Y ((T0+ nT) at the same time, sequencing all interpolation points according to the time sequence, setting the interpolation point closest to the original recording point in time as an interpolation recording point, and forming travel space-time sequence data of the user by all interpolation points;
and 3.3, performing spatial association with the Voronoi diagram according to the X-Y coordinates of each interpolation point in the user travel space-time sequence data, and endowing each interpolation point in the user travel space-time sequence data with the Thiessen polygon number.
Preferably, the step 4 comprises:
step 4.1, traversing all Thiessen polygons, creating an object for each Thiessen polygon, starting from a time starting point t0, searching for an EPID (extended object identifier) of the Thiessen polygon at the time t from user travel time-space sequence data, and storing the EPID and an interpolation point or an interpolation recording point of the EPID into a user communication LIST Temp _ EPID _ LIST;
step 4.2, traversing the Temp _ EPID _ LIST, searching the number of the Thiessen polygon corresponding to each EPID at the time t +1, and storing the number into a Thiessen polygon LIST Temp _ Th _ List where the user is located at the next time;
step 4.3, reading Temp _ Th _ List, storing the Temp _ Th _ List into a variable MarMatrix [ t ] Freq of a Thiessen polygon TH in a dynamic array AppendlList form, if the number TH-ID of a certain Thiessen polygon in the Temp _ Th _ List already exists in MarMatrix [ t ] Freq, adding 1 to the frequency, and if the number TH-ID does not exist, adding the TH-ID to MarMatrix [ t ] Freq and setting the frequency to be 1;
reading Temp _ EPID _ LIST, counting the total number between interpolation points and interpolation recording points, and storing in a sample expansion ratio Marmatrix [ t ]. EnlargeProp;
step 4.4, after traversing Temp _ Th _ List, according to Marmatrix [ t ]]Freq, statistics time t at ith Thiessen polygon THiThe spatial distribution of the EPID at the time t +1, thereby calculating the user's timet at the ith Thiessen polygon THiI.e. the user moves from the ith Thiessen polygon TH from time t to t +1iTransition to the nth Thiessen polygon THnProbability of (2)
Figure BDA0001385604120000051
Is the ith Thiessen polygon TH from time t to t +1 for all EPIDsiTransition to the nth Thiessen polygon THnDivided by the ith Thiessen polygon TH at time tiThe total number of EPIDs of (1);
the Markov one-step transition matrix of the ith Thiessen polygon at time t is:
Figure BDA0001385604120000052
the Markov one-step transfer matrix for all Thiessen polygons at time t is:
Figure BDA0001385604120000061
preferably, the step 5 comprises:
step 5.1, setting the three-level population density threshold value of each Thiessen polygon as yellow alert, orange alert and red alert respectively, wherein the three-level population density threshold value of the ith Thiessen polygon is
Figure BDA0001385604120000062
Step 5.2, monitoring the number of communication users of each fixed sensor in each appointed time period in real time, using sample expansion ratio Marmatrix [ t ] EnlargeProp to expand samples as the total number of users of the Thiessen polygon where the fixed sensor is located in the current time period, and then expanding the samples to the total number of users in the Thiessen polygon according to the population total sample expansion ratio EnlargeRadio;
step 5.3, if the ith Thiessen polygon TH in the current time tiPopulation density of
Figure BDA0001385604120000063
Reach first-class population density early warning threshold
Figure BDA0001385604120000064
Then the yellow alert is started;
step 5.4, if the ith Thiessen polygon THiWhen the yellow alert is turned on at time t, the ith Thiessen polygon TH is usediAs a center, the population size at time t +1 including the Thiessen polygon adjacent thereto and the ith Thiessen polygon TH are calculatediAdjacent Thiessen polygons THjThere are K adjacent Thiessen polygons in total, then the Thiessen polygon THjPopulation size at time t +1 is
Figure BDA0001385604120000065
Figure BDA0001385604120000066
In the formula (I), the compound is shown in the specification,
Figure BDA0001385604120000067
is equal to Thiessen polygon THjAdjacent kth Thiessen polygon THkThe size of the population at the time t,
Figure BDA0001385604120000068
from time t to t +1 from Thiessen polygon THkTransfer to Thiessen polygon THjThe probability of (d);
step 5.5, using Thiessen polygon THiAs a center, calculate the Thiessen polygon TH at time t +1iHas a population density exceeding
Figure BDA0001385604120000071
And
Figure BDA0001385604120000072
comprising the following steps:
step 5.5.1, for Thiessen polygon THiThe Thiessen polygon within two surrounding layers is transferred to the Thiessen polygon TH according to the Thiessen polygon TH between the time t and the time t +1iThe probabilities of the two groups are sorted from big to small;
step 5.5.2, with Thiessen polygon THiPopulation at time t is radix, assuming Thiessen polygon TH at time t to t +1iAll of the population of (A) is left in the Thiessen polygon THiThe population is
Figure BDA0001385604120000073
Probability of being
Figure BDA0001385604120000074
Step 5.5.3, traversing the sequenced Thiessen polygons, wherein the jth Thiessen polygon THjIs completely transferred to the Thiessen polygon TH at time t +1iHas a probability of
Figure BDA0001385604120000075
Suppose that the jth Thiessen polygon THjAll transition to Thiessen polygon TH at the next instantiThen the Thiessen polygon TH at time t +1iThe largest possible population is
Figure BDA0001385604120000076
Probability of being
Figure BDA0001385604120000077
Calculate the Thiessen polygon TH at this timeiIf it exceeds
Figure BDA0001385604120000078
The Thiessen polygon TH is recordediExceeds at time t +1
Figure BDA0001385604120000079
Has a probability of
Figure BDA00013856041200000710
And 5.5.4. Continuously traversing the sorted Thiessen polygons, wherein the z-TH Thiessen polygon THzIs completely transferred to the Thiessen polygon TH at time t +1iHas a probability of
Figure BDA00013856041200000711
Also assume that it all transitions to the Thiessen polygon TH at the next instant in timeiThen the Thiessen polygon TH at time t +1iThe largest possible population is
Figure BDA00013856041200000712
Probability of being
Figure BDA00013856041200000713
Step 5.5.5, traversing n Thiessen polygons by the same method as steps 5.5.3 and 5.5.4, and then obtaining the Thiessen polygon THiThe largest possible population of
Figure BDA00013856041200000714
Probability of being
Figure BDA00013856041200000715
Up to Thiessen polygon THiIs greater than the maximum possible population density
Figure BDA00013856041200000716
Record Thiessen polygon THiExceeds at time t +1
Figure BDA00013856041200000717
Has a probability of
Figure BDA00013856041200000718
Preferably, the step 6 comprises:
step 6.1, if the Thiessen polygon TH is obtained through calculationiPopulation density at time t +1
Figure BDA0001385604120000081
Is greater than
Figure BDA0001385604120000082
Or greater than
Figure BDA0001385604120000083
If the probability of (1) is greater than p1, starting a yellow alert;
step 6.2, if Thiessen polygon THiPopulation density at time t +1
Figure BDA0001385604120000084
Is greater than
Figure BDA0001385604120000085
And the population density of the adjacent Thiessen polygons existing around the polygon at the time t +1 is more than D _ THR2Or greater than D _ THR2If the probability of the user is more than p2, starting an orange alert and needing to take certain abstinence and evacuation measures of population gathering;
step 6.3, if Thiessen polygon THiPopulation density at time t +1
Figure BDA0001385604120000086
Is greater than
Figure BDA0001385604120000087
The orange warning is directly started, and if the population density of the adjacent Thiessen polygons existing around the Thiessen polygons at the moment t +1 is greater than that of the adjacent Thiessen polygons
Figure BDA0001385604120000088
Or greater than
Figure BDA0001385604120000089
If the probability of the alarm is higher than p3, starting a red alarm and taking evacuation measures immediately;
step 6.4, if the Thiessen polygon TH in the calculation processiCircumferentially adjacent Thiessen polygons THjThe alarm threshold is instead higher than the Thiessen polygon THiThen the Thiessen polygon THjFor the calculated center point, the Thiessen polygon THiReduced to Thiessen polygon THjAdjacent Thiessen polygons;
step 6.5, if in the calculation process, the Thiessen polygon THiPopulation density at time t + n
Figure BDA00013856041200000810
Is less than
Figure BDA00013856041200000811
And the population density of the adjacent Thiessen polygons at the time t + n is less than
Figure BDA00013856041200000812
The yellow alarm is deactivated.
Processing and screening big data of a mobile terminal, constructing time-space sequence data of individual trips through communication records between the mobile terminal held by an individual and a sensor, complementing the time-space sequence data of the user trips with uniform time intervals through mathematical interpolation, dividing a geographical background into Thiessen polygons according to the space positions of fixed sensors, training the big sample data through a Markov method, calculating the probability of transferring the individual from one Thiessen polygon to another Thiessen polygon in each time period, and predicting the population density of each Thiessen polygon at a certain time point in the future on the basis of the probability; and through a three-level early warning mechanism, the population gathering condition of the designated area is monitored in real time, and corresponding evacuation measures are taken according to different early warning levels.
The invention has the advantages that: the method has the advantages that the existing large data resources of communication between the mobile terminal and the sensor held by the user are fully relied, the existing massive continuous encrypted position information of the anonymous mobile terminal in the communication network is utilized, the travel time-space sequence of a large number of people in a specified time range can be obtained automatically and conveniently at low cost, and the spatial transition probabilities of individuals in different places and different time periods are trained by taking the time-space sequence as a sample, so that the potential danger of large-scale population aggregation is predicted quickly and effectively.
Drawings
Fig. 1 is a Voronoi diagram generated by this example.
Detailed Description
In order to make the invention more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Step 1, a system reads anonymous encryption mobile terminal sensor data obtained from a sensor operator, wherein the anonymous encryption mobile terminal sensor data are continuous in time and space theoretically, different mobile terminals correspond to different EPIDs, and communication signaling records triggered by each EPID in a specified time period are extracted to form a travel data set of the EPID.
The anonymous encryption mobile terminal sensor data is encrypted position information of an anonymous mobile phone user time sequence obtained by an operator from a mobile communication network, a fixed broadband network, wireless WIFI, a position service related APP and the like in real time and subjected to desensitization encryption, and the content comprises the following steps: EPID, TYPE, TIME, REGIONCODE, SENSORID, see the Chinese patent with application number 201610273693.0. The specific introduction is as follows:
the EPID (anonymous one-way EncryPtion globally unique mobile terminal identification code) is used for carrying out one-way irreversible EncryPtion on each mobile terminal user, so that each mobile terminal user is uniquely identified, the user number privacy information is not exposed, and the encrypted EPID of each mobile terminal user is required to keep uniqueness, namely the EPID of each mobile phone user is kept unchanged at any moment and is not repeated with other mobile phone users.
TYPE, which is the TYPE of communication action related to the current record, such as internet access, call, calling and called, short message receiving and sending, GPS positioning, sensor cell switching, sensor switching, power on and power off, etc.
TIME is the TIME at which the communication operation related to the current record occurs, and is expressed in milliseconds.
The REGIONCODE and the sensor are sensor encryption position information in which the communication operation related to the current recording occurs. The number of the REGIONCODE, SENSORID sensor, wherein REGIONCODE represents the area where the sensor is located, and SENSORID is the number of the particular sensor.
Step 1.1, the system reads sensor data of an anonymous encryption mobile terminal obtained from a sensor operator, theoretically, the sensor data of the anonymous encryption mobile terminal should be continuous in time and space, and the method comprises the following steps: the unique number EPID of the user, the TYPE TYPE of the communication action, the TIME of the occurrence of the communication action, the REGIONCODE of the area where the sensor is located and the specific number SENSORID of the sensor; wherein, the REGIONCODE of the area where the sensor is located and the specific sensor number SENSORID form the sensor number;
step 1.2, one piece of anonymous encryption mobile terminal sensor data is a signaling record, and each signaling record is decrypted;
step 1.3, inquiring all signaling records of the user within a specified time period according to the user number EPID, and constructing user travel data;
in this example, the extracted real-time signaling record data of the user and the sensor is:
table 1: decrypted newly received real-time signaling record data
RECORDID EPID TYPE TIMESTAMP REGIONCODE SENSORID
…… …… …… …… …… ……
R101 E1 T1 2017-04-12 13:05:24 9540 9784
R102 E1 T2 2017-04-12 11:01:26 9540 5781
R103 E1 T1 2017-04-12 11:01:48 9540 7675
R104 E1 T2 2017-04-12 11:01:48 9540 2746
…… …… …… …… …… ……
R301 E1 T1 2017-04-12 11:15:09 10408 8641
R302 E1 T1 2017-04-12 11:16:45 10408 8642
R303 E1 T1 2017-04-12 11:18:20 10408 8644
R304 E1 T1 2017-04-12 11:18:48 10408 8513
R305 E1 T1 2017-04-12 11:19:26 10408 8092
…… …… …… …… …… ……
R441 E1 T2 2017-04-12 11:35:21 9874 3325
R442 E1 T2 2017-04-12 11:35:30 9874 2144
R443 E1 T4 2017-04-12 11:35:59 9881 5744
R444 E1 T1 2017-04-12 11:36:04 9875 7121
…… …… …… …… …… ……
Step 2, extracting the position information of all the fixed sensors, clustering and merging the position information, calculating a Voronoi diagram of the fixed sensors to obtain the actual control range (namely Thiessen polygons) of each sensor, recording the area of each Thiessen polygon, and setting the three-level population bearing capacity of each Thiessen polygon, wherein the method comprises the following steps:
step 2.1, extracting all fixed sensor numbers REGIONCODE-SENSORID and corresponding longitude and latitude coordinates LON-LAT thereof, and converting the longitude and latitude coordinates into geographic coordinates X-Y;
in this example, the number and geographic coordinates of the stationary sensors are shown in Table 2:
TABLE 2 fixed sensor X-Y coordinates after latitude and longitude conversion
REGIONCODE SENSORID X Y
…… …… …… ……
9878 4796 64948.9426 120901.1132
9879 9721 68860.6841 132947.0579
9646 2716 65657.4389 120328.1643
9421 4523 72559.6835 118194.8941
9880 8095 68948.2975 114069.1227
9421 4041 72556.7439 118391.5958
9878 4339 63132.7352 120466.3362
10407 2086 61696.4767 113860.7929
…… …… …… ……
2.2, importing the fixed sensor records into geographic information system software, and combining a plurality of sensors into one sensor for the special condition that the sensors are overlapped in a vertical space, if each floor of a business center has one sensor; on the basis, carrying out clustering analysis on the spatial positions of the fixed sensors, combining the fixed sensors with the distance smaller than rds into one sensor on the assumption that the clustering radius is rds, taking the gravity center of the spatial positions of the combined sensors as the position of the combined sensor, and numbering all the sensors again after combination;
TABLE 3 fixed sensors renumbered after consolidation
Figure BDA0001385604120000111
Step 2.3, selecting to create Thiessen polygons, creating a Voronoi diagram taking a fixed sensor as the center of each Thiessen polygon, and creating an object for each Thiessen polygon, wherein the number of the ith Thiessen polygon is THi
In this example, the Thiessen polygon class contains the following variables:
polygon number TH-ID;
population bearing CAPACITY;
population density bearing threshold D _ THR;
markov transfer matrix class Markrix [ t ] at any time t;
wherein the Markov transfer matrix class contains the following variables:
the user transfers frequency MarMatrix [ t ] Freq from time t to t + 1;
the user transfer probability MarMatrix [ t ] P is from the moment t to t + 1;
interpolation point and fitting record point ratio Marmatrix t. EnlargeProp at time t;
recording the ratio Marmatrix [ t ] EnlargeRadio of points to population total at time t;
the Voronoi diagram generated by this example is shown in fig. 1.
Step 2.4, associating the numbers of the rearranged sensors to a Thiessen polygon, and if the plurality of sensors in the step 2.2 are overlapped or close to each other in the vertical space, assigning the numbers REGIONENCODE-SENSORID of the combined sensors to the Thiessen polygon;
TABLE 4 association of Thiessen polygons with stationary sensors
Figure BDA0001385604120000121
Step 2.5, endowing the attribute of the geographic space to a Thiessen polygon to measure the population bearing capacity PCC; the method specifically comprises the following steps: the natural attribute NC (land, river, lake) of the plot of the Thiessen polygon, the land attribute LUC (road, commercial facility, residential land, universal land), the construction condition CC (road level, building floor number, capacity of accommodating population); for the case that the plurality of sensors in step 2.2 are merged into one Thiessen polygon, the population bearing capacity PCC of the Thiessen polygon is correspondingly increased;
in this example, the population bearing capacity of each Thiessen polygon is:
TABLE 5 population bearing capacity of Thiessen polygon (people/square meter)
Figure BDA0001385604120000131
Step 2.6, setting a three-level population bearing threshold value D _ THR according to historical experience,
Figure BDA0001385604120000132
alif the population density exceeds D _ THR, starting early warning;
in this example, a is setl、al、alRespectively 0.9, 1.1 and 1.3, the three-level early warning threshold of the Thiessen polygon is:
TABLE 6 Tertiary early warning threshold for Thiessen polygons
Figure BDA0001385604120000133
Figure BDA0001385604120000141
Step 3, sequentially extracting travel track data sets of each EPID in a specified time period, and sequencing the travel track data sets according to a time sequence; starting from a time starting point T0, interpolating travel data at intervals of T time in space and time, and establishing a travel space-time sequence data set; mapping each point on the travel space-time sequence data set to a Voronoi diagram, and assigning a Thiessen polygon number corresponding to each point, wherein the method comprises the following steps:
step 3.1, traversing the user travel track data obtained in step 1.3, arranging the travel track data according to the triggered communication time TIMESTAMP in sequence, traversing the travel data from the time starting point, fitting a secondary curve to every 3 adjacent communication recording points, wherein the X axis of the secondary curve is the time line of the user travel track, and the Y axis is the X-Y coordinates of the communication recording points, so that if the user travel track contains n communication recording points, 2n-4 secondary curves need to be fitted in total
In this example, the user's original travel data is:
table 7 original travel record of user associated with X-Y coordinates
RECORDID EPID TYPE TIMESTAMP REGIONCODE SENSORID X Y
…… …… …… …… …… …… …… ……
R1 E1 T1 2017-04-30 07:58:17 9881 8835 66906.63 114996.29
R2 E1 T4 2017-04-30 08:02:55 9881 8835 66906.63 114996.29
R3 E1 T4 2017-04-30 08:04:57 9881 8835 66906.63 114996.29
R4 E1 T3 2017-04-30 08:07:15 9881 8835 66906.63 114996.29
R5 E1 T4 2017-04-30 08:12:15 9881 8813 67695.63 115742.55
R6 E1 T1 2017-04-30 08:17:03 9881 8548 67895.97 115699.61
R7 E1 T2 2017-04-30 08:21:04 9881 2101 68141.16 115533.00
R8 E1 T2 2017-04-30 08:56:17 9881 8855 68450.68 115118.71
R9 E1 T2 2017-04-30 09:26:18 9881 8855 68450.68 115118.71
R10 E1 T4 2017-04-30 09:56:23 9881 6130 68366.52 114560.06
R11 E1 T1 2017-04-30 10:56:28 9881 6130 68366.52 114560.06
R12 E1 T1 2017-04-30 11:10:14 9881 6135 68899.23 114451.70
R13 E1 T1 2017-04-30 11:33:28 9881 2849 68931.60 114652.72
R14 E1 T1 2017-04-30 11:37:45 9881 2101 68939.49 114927.09
R15 E1 T2 2017-04-30 11:44:09 9877 1857 68811.01 115143.33
R16 E1 T1 2017-04-30 11:49:45 9877 5331 68818.71 115568.92
R17 E1 T3 2017-04-30 12:00:30 9881 9457 68900.19 115793.66
…… …… …… …… …… …… …… ……
Step 3.2, starting from an integer time starting point T0, calculating the X-Y coordinates of the user at each time point according to the time interval T, forming an interpolation point by X (T0+ nT) and Y ((T0+ nT) at the same time, sequencing all interpolation points according to the time sequence, setting the interpolation point closest to the original recording point in time as an interpolation recording point, and forming travel space-time sequence data of the user by all interpolation points;
in this example, let T be 15 minutes, and start interpolation from 8:00, then the obtained interpolation data is:
TABLE 8 interpolation data and recorded data
RECORDID EPID TYPE TIMESTAMP REGIONCODE SENSORID X Y
…… …… …… …… …… …… …… ……
R1 E1 T1 2017-04-30 07:58:17 9881 8835 66906.63 114996.29
INS1 2017-04-30 08:00:00 66906.63 114996.29
R2 E1 T4 2017-04-30 08:02:55 9881 8835 66906.63 114996.29
R3 E1 T4 2017-04-30 08:04:57 9881 8835 66906.63 114996.29
R4 E1 T3 2017-04-30 08:07:15 9881 8835 66906.63 114996.29
R5 E1 T4 2017-04-30 08:12:15 9881 8813 67695.63 115742.55
INS2 2017-04-30 08:15:00 67797.82 115720.34
R6 E1 T1 2017-04-30 08:17:03 9881 8548 67895.97 115699.61
R7 E1 T2 2017-04-30 08:21:04 9881 2101 68141.16 115533.00
INS3 2017-04-30 08:30:00 68235.34 115436.54
INS4 2017-04-30 08:45:00 68364.56 115231.63
R8 E1 T2 2017-04-30 08:56:17 9881 8855 68450.68 115118.71
INS5 2017-04-30 09:00:00 68450.68 115118.71
INS6 2017-04-30 09:15:00 68450.68 115118.71
R9 E1 T2 2017-04-30 09:26:18 9881 8855 68450.68 115118.71
INS7 2017-04-30 09:30:00 68434.43 115012.52
INS8 2017-04-30 09:45:00 68382.57 114653.31
R10 E1 T4 2017-04-30 09:56:23 9881 6130 68366.52 114560.06
INS9 2017-04-30 10:00:00 68366.52 114560.06
INS10 2017-04-30 10:15:00 68366.52 114560.06
INS11 2017-04-30 10:30:00 68366.52 114560.06
INS12 2017-04-30 10:45:00 68366.52 114560.06
R11 E1 T1 2017-04-30 10:56:28 9881 6130 68366.52 114560.06
INS13 2017-04-30 11:00:00 68533.88 114523.91
R12 E1 T1 2017-04-30 11:10:14 9881 6135 68899.23 114451.70
INS14 2017-04-30 11:15:00 68902.24 114487.71
INS15 2017-04-30 11:30:00 68926.88 114612.35
R13 E1 T1 2017-04-30 11:33:28 9881 2849 68931.60 114652.72
R14 E1 T1 2017-04-30 11:37:45 9881 2101 68939.49 114927.09
R15 E1 T2 2017-04-30 11:44:09 9877 1857 68811.01 115143.33
INS16 2017-04-30 11:45:00 68812.86 115234.12
R16 E1 T1 2017-04-30 11:49:45 9877 5331 68818.71 115568.92
INS17 2017-04-30 12:00:00 68894.23 115786.84
R17 E1 T3 2017-04-30 12:00:30 9881 9457 68900.19 115793.66
…… …… …… …… …… …… …… ……
3.3, according to the X-Y coordinates of each interpolation point in the user trip time-space sequence data, carrying out spatial association with the Voronoi diagram which is generated in the step 2.3 and is formed by sensor distribution, and endowing each interpolation point in the user trip time-space sequence data with the Thiessen polygon number;
in this example, the spatial association of the user's interpolation point and Thiessen polygon is shown as:
TABLE 9 interpolation points associated with Thiessen polygons
RECORDID TIMESTAMP X Y TH-ID
…… …… …… …… ……
INS1 2017-04-30 08:00:00 66906.63 114996.29 162
INS2 2017-04-30 08:15:00 67797.82 115720.34 117
INS3 2017-04-30 08:30:00 68235.34 115436.54 97
INS4 2017-04-30 08:45:00 68364.56 115231.63 67
INS5 2017-04-30 09:00:00 68450.68 115118.71 61
INS6 2017-04-30 09:15:00 68450.68 115118.71 61
INS7 2017-04-30 09:30:00 68434.43 115012.52 61
INS8 2017-04-30 09:45:00 68382.57 114653.31 72
INS9 2017-04-30 10:00:00 68366.52 114560.06 91
INSI0 2017-04-30 10:15:00 68366.52 114560.06 91
INS11 2017-04-30 10:30:00 68366.52 114560.06 91
INS12 2017-04-30 10:45:00 68366.52 114560.06 91
INS13 2017-04-30 11:00:00 68533.88 114523.91 85
INS14 2017-04-30 11:15:00 68902.24 114487.71 82
INS15 2017-04-30 11:30:00 68926.88 114612.35 56
INS16 2017-04-30 11:45:00 68812.86 115234.12 46
INS17 2017-04-30 12:00:00 68894.23 115786.84 34
…… …… …… …… ……
Step 4, dividing all user travel data into three types of working days, weekends, common festivals and major festivals, traversing all Thiessen polygons, starting from a time starting point T0, inquiring EPIDs of each Thiessen polygon in each time period T in the same type of date every day, searching the position of the EPID at the next moment, counting the EPIDs in a certain Thiessen polygon at a certain moment, counting the frequency of going to other Thiessen polygons at the next moment, and calculating the one-step transition probability of the states of the EPIDs among the Thiessen polygons by adopting a large sample;
step 4.1, traversing Thiessen polygons, creating an object for each Thiessen polygon, starting from a time starting point t0, and searching for the Thiessen polygon TH at the moment t in a time t from the user travel time-space sequence data setiUser EPID (i.e., at TH at time t)iAn EPID with an interpolation point), storing the EPID and the attribute (interpolation point or interpolation record point) of the point into a user communication LIST Temp _ EPID _ LIST;
in this example, the Thiessen polygon TH100The Temp _ EPID _ LIST at time t of the weekday is:
TABLE 10 TH100User communication LIST Temp _ EPID _ LIST at time t
TH-ID EPID TYPE
…… …… ……
100 E0441 INS
100 E0087 R
100 E0087 INS
100 E0087 R
100 E4002 INS
100 E9875 INS
100 E1931 INS
100 E7638 INS
100 E2836 INS
100 E2836 R
100 E1837 INS
100 E5938 INS
100 E5938 INS
100 E2891 INS
100 E6829 INS
100 E2881 R
100 E7892 R
100 E1983 INS
100 E1983 INS
100 E1983 INS
100 E1983 INS
100 E1983 INS
100 E5438 INS
100 E0192 INS
100 E9103 INS
100 E8701 INS
100 E4289 R
100 E3429 INS
100 E5431 INS
100 E4366 INS
…… …… ……
Step 4.2, traversing the Temp _ EPID _ LIST, searching the Thiessen polygon number corresponding to each EPID at the next moment t +1, storing the Thiessen polygon LIST Temp _ Th _ List where the user is at the next moment, and returning the Temp _ EPID _ LIST and the Temp _ Th _ List to the Thiessen polygon THi
In this example, TH100The location List Temp _ Th _ List of EPIDs at time t +1 on weekdays is:
TABLE 11 TH100Temp _ Th _ List at time t
EPID TH-ID
…… ……
E0441 100
E0087 98
E0087 98
E0087 99
E4002 96
E9875 90
E1931 104
E7638 100
E2836 101
E2836 103
E1837 103
E5938 103
E5938 105
E2891 99
E6829 101
E2881 87
E7892 99
E1983 103
E1983 103
E1983 103
E1983 100
E1983 103
E5438 112
E0192 96
E9103 99
E8701 100
E4289 103
E3429 103
E5431 99
E4366 100
…… ……
Step 4.3, reading Temp _ Th _ List, storing the Temp _ Th _ List into a variable MarMatrix [ t ] Freq of a Thiessen polygon TH in a dynamic array AppendlList form, if a certain Thiessen polygon number TH-ID in the Temp _ Th _ List already exists in the MarMatrix [ t ] Freq, adding 1 to the frequency, and if the No, adding the TH-ID to the MarMatrix [ t ] Freq and setting the frequency to be 1; reading Temp _ EPID _ LIST, counting the total number between interpolation points and interpolation recording points, and storing in a sample expansion ratio Marmatrix [ t ]. EnlargeProp;
in this example, TH100Marmatrix [ t ] at time t of weekday]EnlargeProp 0.2158, Marmatrix [ t ]]Freq is:
TABLE 12 TH100Frequency statistics of MarMatrix t at time t].Freq
TH-ID Frequency
103 1431
98 1034
100 783
99 204
96 134
104 32
102 13
…… ……
Step 4.4, after traversing Temp _ Th _ List, according to Marmatrix [ t ]]Freq, statistical time t at Thiessen polygon THiThe spatial distribution of the user EPID at the time t +1, thereby calculating the polygon TH of the user at the time t and the ThiesseniI.e. the user goes from TH from time t to t +1iTo THnIs that all EPIDs are from TH from time t to t +1iTo THnDivided by the sum at TH at time tiThe total number of EPIDs of (1);
in this example, TH100The one-step transition probability from time t to t +1 on weekdays is:
TABLE 13 TH100One-step transition probability MarMatrix [ t ] from time t to t +1].P
Figure BDA0001385604120000191
Figure BDA0001385604120000201
Step 5, setting a large-scale population aggregation early warning mechanism according to the three-level population bearing threshold value set in the step 2, monitoring the number of EPIDs in each Thiessen polygon in real time, and expanding samples according to a certain proportion; starting a yellow alert if the population density in a Thiessen polygon reaches a lowest-level population density bearing threshold value at a certain moment, and predicting the mathematical expectation of the total population size and the probability of the maximum population aggregation of the Thiessen polygon and the peripheral Thiessen polygons thereof in a future time period T by using a Markov method with the moment as a starting point;
step 5.1, according to the step 2.5, a three-level population density threshold value of each Thiessen polygon is set
Figure BDA0001385604120000202
According to the danger degree, respectively setting the alarm as yellow alarm, orange alarm and red alarm;
in this example, the Thiessen polygon TH100Has population bearing capacity of 6.5378 persons/square meter, and three-level population density threshold values of 5.8840, 7.1916 and 8.4991 respectively.
Step 5.2, monitoring the number of communication users of each fixed sensor in each appointed time period in real time, and using a sample expansion ratio MarMatrix [ t ]]EnlargeProp sample expansion is the Thiessen polygon TH where the fixed sensor is located in the current periodiThe total number of users, and then the TH is expanded to the current time period according to the population total sample expansion ratio EnlargeRadioiThe total number of people in;
in this example, the Thiessen polygon TH100Marmatrix [ t ] at weekday time t]An EnlargeProp of 0.2158, externally obtained population-expanded proportion MarMatrix [ t ]]Enlargeradio 0.8732, TH100The number of the users recorded at the time t is 6529, the number of the users after the sample expansion of the whole users is 36784, and the population after the sample expansion of the total population is 42125;
step 5.3, if TH is within the current time tiPopulation density of
Figure BDA0001385604120000203
To reach THiFirst-level population density early warning threshold
Figure BDA0001385604120000204
Then the yellow alert is started;
in this example, TH100The area of the system is 7056 square meters, the population density at the moment t reaches 5.97 persons/square meter, and the yellow alert is started when the population density exceeds a first-level population density early warning threshold value;
step 5.4, if Thiessen polygon THiThe yellow alert is turned on at the time t, thenBy THiAs the center, calculating the population size of the Thiessen polygon adjacent to the Thiessen polygon at the time t + 1;
in this example, TH is calculated100The population at time t remains at TH at time t +110010995 people, 42562 people, TH, flowing in from the surrounding Thiessen polygon100The population count at time t +1 is 53557;
step 5.5, with THiAs a center, TH at time t +1 is calculatediHas a population density exceeding
Figure BDA00013856041200002114
And
Figure BDA00013856041200002115
the specific steps are as follows:
step 5.5.1, for THiThiessen polygons within two surrounding layers, according to which the transition to TH occurs between times t and t +1iThe probabilities of the two groups are sorted from big to small;
step 5.5.2, with THiPopulation at time t is the base, and TH is assumed to be at time t to t +1iAll of the population of (A) is left at THiThe population is
Figure BDA0001385604120000211
Probability of being
Figure BDA0001385604120000212
Step 5.5.3, traversing the sorted Thiessen polygons, wherein the person of the Thiessen polygon j is completely transferred to TH at t +1iHas a probability of
Figure BDA0001385604120000213
Assuming that it all transitions to TH at the next instantiAnd then TH at t +1iThe largest possible population is
Figure BDA0001385604120000214
Probability of being
Figure BDA0001385604120000215
Calculate TH at this timeiIf it exceeds
Figure BDA0001385604120000216
Then record THiExceeds at t +1
Figure BDA0001385604120000217
Has a probability of
Figure BDA0001385604120000218
Step 5.5.4, continuously traversing the sequenced Thiessen polygons, namely the Thiessen polygon THzIs completely transferred to TH at t +1iHas a probability of
Figure BDA0001385604120000219
Also assume that it all transitions to TH at the next timeiAnd then TH at t +1iThe largest possible population is
Figure BDA00013856041200002110
Probability of being
Figure BDA00013856041200002111
Step 5.5.5, traversing n Thiessen polygons, THiThe largest possible population of
Figure BDA00013856041200002112
Probability of being
Figure BDA00013856041200002113
Up to THiIs greater than the maximum possible population density
Figure BDA0001385604120000221
Record THiExceeds at t +1
Figure BDA0001385604120000222
Has a probability of
Figure BDA0001385604120000223
In this example, TH is calculated100Population density of (2) exceeds at time t +1
Figure BDA0001385604120000224
Has a probability of 0.2154, over
Figure BDA0001385604120000225
Has a probability of 0.1548;
and 6, continuously predicting the population total amount and population density of each Thiessen polygon in the next stage target range based on the previous stage result according to a Markov method, determining whether to start higher-level orange and red warnings and implement necessary evacuation work or reduce the early warning level according to a preset rule until yellow warning is removed, stopping prediction calculation, and recovering the normal monitoring state.
Step 6.1, if the Thiessen polygon TH is obtained through calculationiPopulation density at time t +1
Figure BDA0001385604120000226
Is greater than
Figure BDA0001385604120000227
Or greater than D _ THR1If the probability of (1) is greater than p1, starting a yellow alert;
step 6.2, if THiPopulation density at time t +1
Figure BDA0001385604120000229
Is greater than
Figure BDA00013856041200002210
And the population density of the adjacent Thiessen polygons existing around the polygon at the time t +1 is more than D _ THR2Or greater than D _ THR2If the probability of the user is more than p2, starting an orange alert and needing to take certain abstinence and evacuation measures of population gathering;
step 6.3, if THiPopulation density at time t +1Degree of rotation
Figure BDA00013856041200002211
Is greater than
Figure BDA00013856041200002212
The orange alert is directly started, and if the population density of the adjacent Thiessen polygons existing around the Thiessen polygons at the moment t +1 is larger than D _ THR3Or greater than D _ THR3If the probability of the alarm is higher than p3, starting a red alarm and taking evacuation measures immediately;
step 6.4, if TH in the calculation processiCircumferentially adjacent Thiessen polygons THjThe alarm threshold reached is instead higher than THiThen TH will bejDetermining the center point of calculation, and determining THiIs reduced to THjAdjacent Thiessen polygons;
step 6.5, if in the calculation process, THiPopulation density at time t + n
Figure BDA00013856041200002213
Is less than
Figure BDA00013856041200002214
And the population density of the adjacent Thiessen polygons at the time t + n is less than D _ THR1The yellow alarm is released;
in this example, the probability thresholds p1, p2, and p3 are set to 0.6, 0.4, and 0.25, respectively, and TH is calculated100Is 7.5902, exceeding the population density at time t +1
Figure BDA0001385604120000231
Population density of adjacent Thiessen polygons at the time t +1 of non-existence is greater than D _ THR2If the condition is not met, the standard for starting the orange early warning is met, and the yellow warning is kept;
time t +1, TH100Is expected to have a population density of 8.5245 people per square meter in excess of
Figure BDA0001385604120000232
Direct opening orangeAlert, the population density of the adjacent Thiessen polygons reaches D _ THR2To achieve D _ THR3The probabilities of the two are all less than 0.1353, and if the probabilities of the two do not reach p, an orange warning is started;
time t +2, TH100Is expected to have a population density of 8.8342 people per square meter in excess of
Figure BDA0001385604120000233
There is a population density of adjacent Thiessen polygons that exceeds D _ THR2Not reaching D _ THR3But reaches D _ THR3The maximum probability of (2) is 0.2613, exceeds p3, and a red alert is started;
……
at time t + n, TH100Is desirably 5.6458 < D _ THR1And the population density of adjacent Thiessen polygons is less than D _ THR1The yellow alarm is deactivated.

Claims (4)

1. A middle-short term early warning method for population aggregation in a big data environment is characterized by comprising the following steps:
step 1, a system reads anonymous encrypted mobile terminal sensor data obtained from a sensor operator, wherein the anonymous encrypted mobile terminal sensor data are continuous in time and space, different mobile terminals correspond to different EPIDs, and communication signaling records triggered by each EPID in a specified time period are extracted to form a travel track data set of the EPID;
step 2, extracting position information of all fixed sensors, clustering and combining the position information, then calculating a Voronoi diagram to obtain the actual control range of each sensor, namely a Thiessen polygon of each sensor, recording the area of each Thiessen polygon, and setting a three-level population bearing threshold of each Thiessen polygon;
step 3, sequentially extracting travel track data sets of each EPID in a specified time period, and sequencing the travel track data sets according to a time sequence; starting from a time starting point T0, interpolating travel trajectory data at intervals of space and time by taking T as a time interval, and establishing a travel space-time sequence data set; mapping each point on the travel space-time sequence dataset to a Voronoi diagram, and assigning the number of a Thiessen polygon corresponding to each point, wherein the mapping comprises the following steps:
step 3.1, traversing all travel track data sets of the EPIDs, sequentially arranging according to the triggered communication time TIMESTAMP, traversing the travel track data sets from a time starting point, fitting a secondary curve to every 3 adjacent signaling record points, wherein the X axis of the secondary curve is the time line of the user travel track, and the Y axis is the X-Y coordinates of the signaling record points, so that if the user travel track comprises n communication record points, 2n-4 secondary curves need to be fitted;
step 3.2, starting from an integer time starting point T0, calculating the X-Y coordinates of the user at each time point according to the time interval T, forming an interpolation point by X (T0+ nT) and Y ((T0+ nT) at the same time, sequencing all interpolation points according to the time sequence, setting the interpolation point closest to the original recording point in time as an interpolation recording point, and forming travel space-time sequence data of the user by all interpolation points;
3.3, performing spatial association with the Voronoi diagram according to the X-Y coordinates of each interpolation point in the user trip time-space sequence data, and endowing each interpolation point in the user trip time-space sequence data with the Thiessen polygon number;
step 4, dividing all user travel track data into three types of working days, weekends, common festivals and major festivals, traversing all Thiessen polygons, starting from a time starting point T0, inquiring EPIDs of each Thiessen polygon in each time period T in the same type of date each day, searching the position of the EPID at the next time, counting the frequency of the EPIDs in a certain Thiessen polygon at the time T to other Thiessen polygons at the next time, and calculating the one-step transition probability of the states of all EPIDs among the Thiessen polygons through a large sample to form a Markov one-step transition matrix of all Thiessen polygons at the time T, wherein the Markov one-step transition matrix comprises the following steps:
step 4.1, traversing all Thiessen polygons, creating an object for each Thiessen polygon, starting from a time starting point t0, searching for an EPID (extended object identifier) of the Thiessen polygon at the time t from user travel time-space sequence data, and storing the EPID and an interpolation point or an interpolation recording point of the EPID into a user communication LIST Temp _ EPID _ LIST;
step 4.2, traversing the Temp _ EPID _ LIST, searching the number of the Thiessen polygon corresponding to each EPID at the time t +1, and storing the number into a Thiessen polygon LIST Temp _ Th _ List where the user is located at the next time;
step 4.3, reading Temp _ Th _ List, storing the Temp _ Th _ List into a variable MarMatrix [ t ] Freq of a Thiessen polygon TH in a dynamic array AppendlList form, if the number TH-ID of a certain Thiessen polygon in the Temp _ Th _ List already exists in MarMatrix [ t ] Freq, adding 1 to the frequency, and if the number TH-ID does not exist, adding the TH-ID to MarMatrix [ t ] Freq and setting the frequency to be 1;
reading Temp _ EPID _ LIST, counting the total number between interpolation points and interpolation recording points, and storing in a sample expansion ratio Marmatrix [ t ]. EnlargeProp;
step 4.4, after traversing Temp _ Th _ List, according to Marmatrix [ t ]]Freq, statistics time t at ith Thiessen polygon THiThe spatial distribution of the EPID at the time t +1, so as to calculate the ith Thiessen polygon TH of the user at the time tiI.e. the user moves from the ith Thiessen polygon TH from time t to t +1iTransition to the nth Thiessen polygon THnProbability of (2)
Figure FDA0002521633130000021
Is the ith Thiessen polygon TH from time t to t +1 for all EPIDsiTransition to the nth Thiessen polygon THnDivided by the ith Thiessen polygon TH at time tiThe total number of EPIDs of (1);
the Markov one-step transition matrix of the ith Thiessen polygon at time t is:
Figure FDA0002521633130000022
the Markov one-step transfer matrix for all Thiessen polygons at time t is:
Figure FDA0002521633130000031
step 5, setting a large-scale population aggregation early warning mechanism according to the three-level population bearing threshold value set in the step 2, monitoring the number of EPIDs in each Thiessen polygon in real time, and expanding samples according to a certain proportion; if the population density in a Thiessen polygon reaches the lowest level population density bearing threshold value at a certain moment, starting a yellow alert, and taking the moment as a starting point, predicting the mathematical expectation of the total population size and the probability of the maximum population aggregation of the Thiessen polygon and the peripheral Thiessen polygons thereof in a future time period T by adopting a Markov method, wherein the mathematical expectation comprises the following steps:
step 5.1, setting the three-level population density threshold value of each Thiessen polygon as yellow alert, orange alert and red alert respectively, wherein the three-level population density threshold value of the ith Thiessen polygon is
Figure FDA0002521633130000032
Step 5.2, monitoring the number of communication users of each fixed sensor in each appointed time period in real time, using sample expansion ratio Marmatrix [ t ] EnlargeProp to expand samples as the total number of users of the Thiessen polygon where the fixed sensor is located in the current time period, and then expanding the samples to the total number of users in the Thiessen polygon according to the population total sample expansion ratio EnlargeRadio;
step 5.3, if the ith Thiessen polygon TH in the current time tiPopulation density of
Figure FDA0002521633130000033
Reach first-class population density early warning threshold
Figure FDA0002521633130000034
Then the yellow alert is started;
step 5.4, if the ith Thiessen polygon THiWhen the yellow alert is turned on at time t, the ith Thiessen polygon TH is usediAs a center, the population size at time t +1 including the Thiessen polygon adjacent thereto and the ith Thiessen polygon TH are calculatediAdjacent Thiessen polygons THjThere are K adjacent Thiessen polygons in total, then the Thiessen polygon THjPopulation size at time t +1 is
Figure FDA0002521633130000035
Figure FDA0002521633130000036
In the formula (I), the compound is shown in the specification,
Figure FDA0002521633130000041
is equal to Thiessen polygon THjAdjacent kth Thiessen polygon THkThe size of the population at the time t,
Figure FDA0002521633130000042
from time t to t +1 from Thiessen polygon THkTransfer to Thiessen polygon THjThe probability of (d);
step 5.5, using Thiessen polygon THiAs a center, calculate the Thiessen polygon TH at time t +1iHas a population density exceeding
Figure FDA0002521633130000044
And
Figure FDA0002521633130000045
comprising the following steps:
step 5.5.1, for Thiessen polygon THiThe Thiessen polygon within two surrounding layers is transferred to the Thiessen polygon TH according to the Thiessen polygon TH between the time t and the time t +1iThe probabilities of the two groups are sorted from big to small;
step 5.5.2, with Thiessen polygon THiPopulation at time t is radix, assuming Thiessen polygon TH at time t to t +1iAll of the population of (A) is left in the Thiessen polygon THiThe population is
Figure FDA0002521633130000046
Probability of being
Figure FDA0002521633130000047
Step 5.5.3, traversing the sequenced Thiessen polygons, wherein the jth Thiessen polygon THjIs completely transferred to the Thiessen polygon TH at time t +1iHas a probability of
Figure FDA0002521633130000048
Suppose that the jth Thiessen polygon THjAll transition to Thiessen polygon TH at the next instantiThen the Thiessen polygon TH at time t +1iThe largest possible population is
Figure FDA0002521633130000049
Probability of being
Figure FDA00025216331300000410
Calculate the Thiessen polygon TH at this timeiIf it exceeds
Figure FDA00025216331300000411
The Thiessen polygon TH is recordediExceeds at time t +1
Figure FDA00025216331300000412
Has a probability of
Figure FDA00025216331300000413
And step 5.5.4, continuously traversing the sequenced Thiessen polygons, wherein the z-TH Thiessen polygon THzIs completely transferred to the Thiessen polygon TH at time t +1iHas a probability of
Figure FDA00025216331300000414
Also assume that it all transitions to the Thiessen polygon TH at the next instant in timeiThen, thenThiessen polygon TH at time t +1iThe largest possible population is
Figure FDA00025216331300000415
Probability of being
Figure FDA00025216331300000416
Step 5.5.5, traversing n Thiessen polygons by the same method as steps 5.5.3 and 5.5.4, and then obtaining the Thiessen polygon THiThe largest possible population of
Figure FDA0002521633130000043
Probability of being
Figure FDA0002521633130000051
Up to Thiessen polygon THiIs greater than the maximum possible population density
Figure FDA0002521633130000053
Record Thiessen polygon THiExceeds at time t +1
Figure FDA0002521633130000054
Has a probability of
Figure FDA0002521633130000052
And 6, continuously predicting the population total amount and population density of each Thiessen polygon in the next stage target range based on the previous stage result according to a Markov method, determining whether to start higher-level orange and red warnings and implement necessary evacuation work or reduce the early warning level according to a preset rule until yellow warning is removed, stopping prediction calculation, and recovering the normal monitoring state.
2. The method for short-term and medium-term pre-warning of population aggregation in big data environment as claimed in claim 1, wherein the step 1 comprises:
step 1.1, the system reads the anonymous encryption mobile terminal sensor data obtained from the sensor operator, the anonymous encryption mobile terminal sensor data is continuous in time and space, and the anonymous encryption mobile terminal sensor data comprises the following steps: the method comprises the steps of a unique user number EPID, a communication action TYPE TYPE, a communication action occurrence TIME TIME, a sensor located large area REGIONCODE and a sensor specific number SENSORID, wherein the sensor located large area REGIONCODE and the sensor specific number SENSORID form a sensor number;
step 1.2, one piece of anonymous encryption mobile terminal sensor data is a signaling record, and each signaling record is decrypted;
and step 1.3, inquiring all signaling records of the EPID in a specified time period according to the EPID, and constructing a travel track data set corresponding to the current EPID.
3. The method for short-term and medium-term pre-warning of population aggregation in big data environment as claimed in claim 2, wherein the step 2 comprises:
step 2.1, extracting the sensor numbers of all the fixed sensors and corresponding longitude and latitude coordinates LON-LAT thereof, and converting the longitude and latitude coordinates into geographic coordinates X-Y;
step 2.2, importing the records of the fixed sensors into geographic information system software, merging the fixed sensors overlapped in a vertical space into one fixed sensor, carrying out cluster analysis on the spatial position of the fixed sensors on the basis, merging the fixed sensors with the distance smaller than rds into one fixed sensor if the cluster radius is rds, taking the gravity center of the spatial position of the merged fixed sensors as the position of the merged fixed sensors, and numbering all the fixed sensors again after merging;
step 2.3, selecting to create Thiessen polygons, creating a Voronoi diagram taking a fixed sensor as the center of the polygons, creating an object for each Thiessen polygon, wherein the number of the ith Thiessen polygon is THi
Step 2.4, associating the numbers of the rearranged fixed sensors to a Thiessen polygon, and if a plurality of fixed sensors overlap or approach in a vertical space for a certain Thiessen polygon, assigning the sensor numbers of the combined fixed sensors to the Thiessen polygon;
step 2.5, endowing the attribute of the geographic space to a Thiessen polygon to measure the population bearing capacity PCC, and specifically comprises the following steps: the method comprises the steps that natural attributes NC, land occupation attributes LUC and construction conditions CC of a plot where a Thiessen polygon is located at present, for the condition that a plurality of fixed sensors are combined in one Thiessen polygon, the population bearing capacity PCC of the Thiessen polygon is correspondingly increased, the ith Thiessen polygon is formed by a plurality of plots which are divided into a road plot, a residential plot, a general plot and a factory building plot, and the population bearing capacity PCC of the ith Thiessen polygon isiThe calculation formula of (2) is as follows:
Figure FDA0002521633130000061
in the formula (I), the compound is shown in the specification,
Figure FDA0002521633130000062
and
Figure FDA0002521633130000063
respectively showing the areas of the jth road land, residential land, general land and commercial facility land;
Figure FDA0002521633130000064
and
Figure FDA0002521633130000065
respectively showing the bearing capacity of the jth road land, residential land, general land and commercial facility land;
Figure FDA0002521633130000066
and
Figure FDA0002521633130000067
respectively representing the number of layers of the jth road land, the residential land, the general land and the commercial facility land;
step 2.6, setting a three-level population bearing threshold value of each Thiessen polygon according to historical experience, wherein the l-level population bearing threshold value of the ith Thiessen polygon is
Figure FDA0002521633130000068
Then there are:
Figure FDA0002521633130000069
in the formula, alEarly warning thresholds are aggregated for class I population.
4. The method for short-term and medium-term pre-warning of population aggregation in big data environment as claimed in claim 1, wherein the step 6 comprises:
step 6.1, if the Thiessen polygon TH is obtained through calculationiPopulation density at time t +1
Figure FDA00025216331300000610
Is greater than
Figure FDA00025216331300000611
Or greater than
Figure FDA00025216331300000612
If the probability of (1) is greater than p1, starting a yellow alert;
step 6.2, if Thiessen polygon THiPopulation density at time t +1
Figure FDA00025216331300000613
Is greater than
Figure FDA0002521633130000071
And the population density of the adjacent Thiessen polygons existing around the polygon at the time t +1 is more than D _ THR2Or greater than D _ THR2If the probability of the user is more than p2, starting an orange alert and needing to take certain abstinence and evacuation measures of population gathering;
step 6.3, if Thiessen polygon THiPopulation density at time t +1
Figure FDA0002521633130000072
Is greater than
Figure FDA0002521633130000073
The orange warning is directly started, and if the population density of the adjacent Thiessen polygons existing around the Thiessen polygons at the moment t +1 is greater than that of the adjacent Thiessen polygons
Figure FDA0002521633130000074
Or greater than
Figure FDA0002521633130000075
If the probability of the alarm is higher than p3, starting a red alarm and taking evacuation measures immediately;
step 6.4, if the Thiessen polygon TH in the calculation processiCircumferentially adjacent Thiessen polygons THjThe alarm threshold is instead higher than the Thiessen polygon THiThen the Thiessen polygon THjFor the calculated center point, the Thiessen polygon THiReduced to Thiessen polygon THjAdjacent Thiessen polygons;
step 6.5, if in the calculation process, the Thiessen polygon THiPopulation density at time t + n
Figure FDA0002521633130000076
Is less than
Figure FDA0002521633130000077
And the population density of the adjacent Thiessen polygons at the time t + n is less than
Figure FDA0002521633130000078
The yellow alarm is deactivated.
CN201710726883.8A 2017-08-22 2017-08-22 Medium-short term early warning method for population aggregation in big data environment Active CN107609682B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710726883.8A CN107609682B (en) 2017-08-22 2017-08-22 Medium-short term early warning method for population aggregation in big data environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710726883.8A CN107609682B (en) 2017-08-22 2017-08-22 Medium-short term early warning method for population aggregation in big data environment

Publications (2)

Publication Number Publication Date
CN107609682A CN107609682A (en) 2018-01-19
CN107609682B true CN107609682B (en) 2020-09-11

Family

ID=61065492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710726883.8A Active CN107609682B (en) 2017-08-22 2017-08-22 Medium-short term early warning method for population aggregation in big data environment

Country Status (1)

Country Link
CN (1) CN107609682B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108133298B (en) * 2018-03-08 2022-04-19 河南工业大学 National grain consumption prediction method based on multiple regression model
CN109520499B (en) * 2018-10-08 2020-06-26 浙江浙大中控信息技术有限公司 Method for realizing regional real-time isochrones based on vehicle GPS track data
CN109711447A (en) * 2018-12-19 2019-05-03 武大吉奥信息技术有限公司 A kind of special population event early warning and monitoring method and device
CN109831774B (en) * 2019-01-08 2021-08-10 中国联合网络通信集团有限公司 Big data sample expansion method and device
CN110139221B (en) * 2019-05-09 2020-02-14 特斯联(北京)科技有限公司 Population cluster dynamic monitoring method and system based on mobile phone signal micro-card port
CN111669710B (en) * 2020-04-21 2021-07-06 上海因势智能科技有限公司 Demographic deduplication method
CN111680830B (en) * 2020-05-25 2024-01-26 广州衡昊数据科技有限公司 Epidemic situation prevention method and device based on aggregation risk early warning
CN112367608B (en) * 2020-10-27 2022-09-20 上海世脉信息科技有限公司 Fixed sensor spatial position mining method in big data environment
CN112418508B (en) * 2020-11-19 2021-10-08 中国科学院地理科学与资源研究所 Population distribution prediction method based on interaction between physical space and social network space
CN115865988B (en) * 2023-02-21 2023-05-26 武汉理工大学三亚科教创新园 Passenger ship passenger tread event monitoring system and method by utilizing mobile phone sensor network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105488120A (en) * 2015-11-23 2016-04-13 上海川昱信息科技有限公司 Method for collecting spatial population distribution in real time on basis of mobile phone big data and realizing large passenger flow early warning
CN105760454A (en) * 2016-02-04 2016-07-13 东南大学 Method for dynamically measuring distribution density of city population in real time
CN106792517A (en) * 2016-12-05 2017-05-31 武汉大学 Base station service number time sequence forecasting method based on mobile phone location Time-spatial diversion probability

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105488120A (en) * 2015-11-23 2016-04-13 上海川昱信息科技有限公司 Method for collecting spatial population distribution in real time on basis of mobile phone big data and realizing large passenger flow early warning
CN105760454A (en) * 2016-02-04 2016-07-13 东南大学 Method for dynamically measuring distribution density of city population in real time
CN106792517A (en) * 2016-12-05 2017-05-31 武汉大学 Base station service number time sequence forecasting method based on mobile phone location Time-spatial diversion probability

Also Published As

Publication number Publication date
CN107609682A (en) 2018-01-19

Similar Documents

Publication Publication Date Title
CN107609682B (en) Medium-short term early warning method for population aggregation in big data environment
EP3132592B1 (en) Method and system for identifying significant locations through data obtainable from a telecommunication network
Zheng et al. Diagnosing New York city's noises with ubiquitous data
Thuillier et al. Clustering weekly patterns of human mobility through mobile phone data
EP3335209B1 (en) Method and system for computing an o-d matrix obtained through radio mobile network data
Gao et al. Detecting origin-destination mobility flows from geotagged tweets in greater Los Angeles area
CN109996193B (en) Short message sending method, device, system and equipment based on intelligent communication platform
Qin et al. Applying big data analytics to monitor tourist flow for the scenic area operation management
JP5815368B2 (en) Estimation of dynamic movement behavior in mobile networks
CN107977673B (en) Economic activity population identification method based on big data
CN109688532B (en) Method and device for dividing city functional area
US20130166352A1 (en) Mobile categorization
Demissie et al. Analysis of the pattern and intensity of urban activities through aggregate cellphone usage
CN111080501B (en) Real crowd density space-time distribution estimation method based on mobile phone signaling data
Demissie et al. Inferring origin-destination flows using mobile phone data: A case study of Senegal
CN106504524B (en) A method of express highway section is divided based on mobile signaling protocol dynamic
CN111445369A (en) Urban large-scale gathering activity intelligence early warning method and device based on L BS big data
CN115034524A (en) Method, system and storage medium for predicting working population based on mobile phone signaling
Alhasoun et al. The city browser: Utilizing massive call data to infer city mobility dynamics
Yuan et al. Recognition of functional areas based on call detail records and point of interest data
Kashiyama et al. Pseudo-PFLOW: Development of nationwide synthetic open dataset for people movement based on limited travel survey and open statistical data
Platos et al. Population data mobility retrieval at territory of Czechia in pandemic COVID‐19 period
EP3563592B1 (en) Method for determining the mobility status of a user of a wireless communication network
Lwin et al. Analysis of trip distributions of human mobility patterns and their transit behaviors using mobile call detail records
Jiang et al. Improved F-DBSCAN for trip end identification using mobile phone data in combination with base station density

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant