CN110889536A - Method and system for predicting and early warning situation - Google Patents


Publication number
CN110889536A
Authority
CN
China
Prior art keywords
prediction
seasonal
data
parameters
alarm
Prior art date
Legal status
Pending
Application number
CN201911035111.5A
Other languages
Chinese (zh)
Inventor
姜坤
周建忠
袁弘强
孙坚
王金宝
王涛
张旭东
吕玉璋
饶启玉
徐小磊
郭晓峰
张锦贵
岳朝娥
Current Assignee
KUNMING PUBLIC SECURITY BUREAU
Xinzhi Cognitive Digital Technology Co Ltd
Original Assignee
KUNMING PUBLIC SECURITY BUREAU
Xinzhi Cognitive Digital Technology Co Ltd
Application filed by KUNMING PUBLIC SECURITY BUREAU, Xinzhi Cognitive Digital Technology Co Ltd
Priority to CN201911035111.5A
Publication of CN110889536A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An alarm situation prediction and early warning method comprises two parts: first, predicting the number of specific alarms in a prediction time period; second, issuing early warnings for the specific alarm. The invention uses all available data, including real-name track data, one-standard three-real data, alarm data, prior-record data, data crawled from the Internet (such as weather and regional building (business) information), and manually labeled data (such as urban village data). It uses multiple models for prediction and early warning, and trains independent prediction models and early warning models for different types of alarm.

Description

Method and system for predicting and early warning situation
Technical Field
The invention belongs to the technical field of alarm situation analysis, and particularly relates to an alarm situation prediction and early warning method and system.
Background
Current alarm situation prediction and early warning rely on statistical analysis of historical alarms in the time and space dimensions: the time dimension uses year-on-year and period-on-period comparisons, and the space dimension uses branch bureau and police station districts. Such methods consider only historical factors, and future alarm situations are inferred manually from historical statistics. The result is strongly affected by the subjectivity of the analyst, other available data are not fully used, and no objective judgment of the future is made by combining spatial factors with current time factors.
Disclosure of Invention
Based on the above, an alarm situation prediction and early warning method and system are provided to address the above technical problems.
In order to solve the technical problems, the invention adopts the following technical scheme:
an alarm situation prediction and early warning method comprises the following steps:
first, predicting the number of specific alarms in a prediction time period, wherein the prediction time period is defined by a preset prediction start date and a number of prediction days:
(111) counting the specific alarms occurring before the prediction start date to obtain time series data of the specific alarm: each date on which the specific alarm occurred and the number of occurrences on that date;
(112) performing statistical analysis on the time series data using the seasonal autoregressive integrated moving average model SARIMA to obtain the trend and distribution of the specific alarm, and determining trend parameters and seasonal parameters from the analysis result, wherein the trend parameters comprise a trend autoregressive order p, a trend differencing order d and a trend moving average order q, and the seasonal parameters comprise a seasonal autoregressive order P, a seasonal differencing order D, a seasonal moving average order Q and a seasonal trend parameter s;
(113) constructing a quantity prediction model of the specific alarm from the trend parameters and the seasonal parameters: SARIMA (p, d, q) x (P, D, Q, s), and training the quantity prediction model on the time series data of the specific alarm so that it fits the data;
(114) inputting the number of prediction days into the quantity prediction model to obtain the prediction result for the prediction time period;
second, issuing early warnings for the specific alarm:
(121) dividing a district into grid areas of n × n meters on an electronic map, and setting three granularity areas from small to large: grid, district and sub-district, n is 450-;
(122) at the three granularities, constructing the spatio-temporal features of each area on each historical day:
122a, constructing activity features in the current area from the real-name track database and the prior-record database, wherein the activity features comprise the total number of active people / the number of men / the number of persons with prior records (predecessors) in hotels in the current area in the n1/n2/n3/n4 days before the current day, and the same three counts for internet cafes in the current area in the n1/n2/n3/n4 days before the current day;
122b, constructing alarm features of the specific alarm in the current area from the alarm database, wherein the alarm features comprise the number of the specific alarms in the current area in the n1/n2/n3/n4 days before the current day;
122c, obtaining point-of-interest attribute features in the current area from the electronic map;
122d, constructing regional attribute features in the current area from the one-standard three-real database;
122e, acquiring weather data from the Internet to construct the weather features of the current area on the current day;
(123) setting a label for each historical day of each area: judging whether the number of the specific alarms in the N days starting from and including the current day is greater than the number of the specific alarms in the N days before the current day and is not 0; if so, setting the label of the current day to 1, otherwise setting it to 0;
(124) counting, from the alarm database, the historical daily average number of the specific alarms in each grid to obtain the high-alarm-rate grids, namely the grids whose historical daily average number of alarms is higher than 90%;
(125) at the three granularities, constructing a first feature set and a label set from the spatio-temporal features and the corresponding labels of each area, training a neural network model with the first feature set and the label set, taking the output of its intermediate layer as a second feature set, and saving the intermediate-layer model;
(126) training a boosted tree model with the second feature set and the label set at the three granularities;
(127) constructing the spatio-temporal features of each high-alarm-rate grid on the current early warning date according to steps 122a to 122e, inputting them into the boosted tree model, obtaining the probability that the number of the specific alarms of each high-alarm-rate grid in the N days starting from the current date is greater than the number in the preceding N days, and issuing early warnings according to the probability;
wherein the number of prediction days is 1-7, n1 < n2 < n3 < n4, and N is 1-10.
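The selection of high-rate grids in step (124) can be sketched as follows. The source does not say what "higher than 90%" is measured against; this sketch assumes it means "above the 90th percentile of the daily averages of all grids". The function name `high_rate_grids` and the input layout (a mapping from grid id to historical daily average) are hypothetical.

```python
from statistics import quantiles


def high_rate_grids(daily_avg_by_grid, pct=90):
    """Return the ids of grids whose historical daily average exceeds a percentile.

    ASSUMPTION: "higher than 90%" is read as "above the 90th percentile of
    all grids' daily averages"; the patent text does not define it further.
    """
    values = list(daily_avg_by_grid.values())
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    threshold = quantiles(values, n=100, method="inclusive")[pct - 1]
    return {grid for grid, avg in daily_avg_by_grid.items() if avg > threshold}
```

Under this reading, roughly the top tenth of grids is kept for early warning, and all other grids receive the fourth color.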
The step (112) further comprises:
112a, differencing the time series data d times and testing the stationarity of the differenced data with a unit root test; if the differenced data are stationary, setting the differencing order to d; otherwise increasing d by 1 and differencing again until stationary time series data are obtained;
112b, plotting the partial autocorrelation function (PACF) and the autocorrelation function (ACF) of the stationary time series data; if the PACF shows a significant spike at lag i and no comparable spike at larger lags, setting p = i; if the ACF shows a significant spike at lag j and no comparable spike at larger lags, setting q = j; constructing the autoregressive moving average models ARMA (0, q), ARMA (p, 0) and ARMA (p, q) respectively, calculating the Akaike information criterion AIC = -2 log (L) + 2 (p + q + k + 1) of the stationary time series data on the three models, selecting the model with the smallest AIC, and thereby determining the parameters (p, q), wherein L is the likelihood of the stationary time series data, k = 1 when c ≠ 0, k = 0 when c = 0, and c is the average change between consecutive observations;
112c, constructing the model ARMA (p, q) from the parameters (p, q), calculating the residuals of the stationary time series data under the model, testing whether the residuals are autocorrelated with the Durbin-Watson (D-W) test, and drawing a quantile (Q-Q) plot to check whether the residuals follow a normal distribution with mean 0 and constant variance, thereby confirming the selected parameters (p, q); if these conditions are not met, returning to the previous step to reselect (p, q), choosing smaller values close to the previously selected ones;
112d, taking the time series data of step (111), using time delays of 7 days, 1 month and 3 months, with the corresponding seasonal trend parameter s set to 7, 12 and 4 respectively, performing seasonal differencing with each delay in turn, applying a unit root test to the differenced data to judge stationarity, and selecting the seasonal trend parameter s that gives the best stationarity;
112e, after determining the seasonal trend parameter s, repeating step 112a on the seasonally differenced data to obtain the seasonal differencing order D, and repeating steps 112b and 112c to obtain the seasonal autoregressive order P and the seasonal moving average order Q.
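For orientation, the model whose orders are selected in steps 112a to 112e can be written in the usual textbook notation, with the trend orders in lower case and the seasonal orders in upper case. The backshift operator B, the coefficients and the noise term are notation introduced here for illustration and do not appear in the source.

```latex
% SARIMA (p, d, q) x (P, D, Q, s) in backshift notation, with B y_t = y_{t-1}
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,y_t
  = c + \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t

% non-seasonal and seasonal polynomials
\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}, \qquad
\theta_q(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}
\Phi_P(B^{s}) = 1 - \Phi_1 B^{s} - \dots - \Phi_P B^{Ps}, \qquad
\Theta_Q(B^{s}) = 1 + \Theta_1 B^{s} + \dots + \Theta_Q B^{Qs}

% information criterion used in step 112b
\mathrm{AIC} = -2\log(L) + 2\,(p + q + k + 1), \qquad
k = \begin{cases} 1, & c \neq 0 \\ 0, & c = 0 \end{cases}
```

Here y_t is the daily alarm count and ε_t is white noise. The factors (1 - B)^d and (1 - B^s)^D correspond to the ordinary and seasonal differencing of steps 112a and 112d.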
The early warning according to the probability consists of marking the corresponding grids with four different colors:
marking the grids with a probability greater than 0.68 with a first color;
marking the grids with a probability greater than 0.34 and less than or equal to 0.68 with a second color;
marking the grids with a probability less than or equal to 0.34 with a third color;
and marking the grids that are not high-alarm-rate grids with a fourth color.
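The four-color rule above is a simple threshold mapping and can be sketched as follows. The function name `grid_color` and the string labels are hypothetical placeholders, since the source does not name the colors.

```python
def grid_color(probability, is_high_rate_grid=True):
    """Map an early-warning probability to one of the four color classes."""
    if not is_high_rate_grid:
        return "fourth"   # grids outside the high-rate set get no probability class
    if probability > 0.68:
        return "first"
    if probability > 0.34:
        return "second"   # 0.34 < probability <= 0.68
    return "third"        # probability <= 0.34
```

The boundary values follow the text: 0.68 falls in the second class and 0.34 in the third.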
The step (122) further comprises:
constructing other time-related features: whether the day is a weekend, whether the day is a holiday, and whether the day is a rest day.
The invention also relates to an alarm situation prediction and early warning system, which comprises a storage module, wherein a plurality of instructions are stored in the storage module, and the instructions are loaded and executed by a processor:
first, predicting the number of specific alarms in a prediction time period, wherein the prediction time period is defined by a preset prediction start date and a number of prediction days:
(111) counting the specific alarms occurring before the prediction start date to obtain time series data of the specific alarm: each date on which the specific alarm occurred and the number of occurrences on that date;
(112) performing statistical analysis on the time series data using the seasonal autoregressive integrated moving average model SARIMA to obtain the trend and distribution of the specific alarm, and determining trend parameters and seasonal parameters from the analysis result, wherein the trend parameters comprise a trend autoregressive order p, a trend differencing order d and a trend moving average order q, and the seasonal parameters comprise a seasonal autoregressive order P, a seasonal differencing order D, a seasonal moving average order Q and a seasonal trend parameter s;
(113) constructing a quantity prediction model of the specific alarm from the trend parameters and the seasonal parameters: SARIMA (p, d, q) x (P, D, Q, s), and training the quantity prediction model on the time series data of the specific alarm so that it fits the data;
(114) inputting the number of prediction days into the quantity prediction model to obtain the prediction result for the prediction time period;
second, issuing early warnings for the specific alarm:
(121) dividing a district into grid areas of n × n meters on an electronic map, and setting three granularity areas from small to large: grid, district and sub-district, n is 450-;
(122) at the three granularities, constructing the spatio-temporal features of each area on each historical day:
122a, constructing activity features in the current area from the real-name track database and the prior-record database, wherein the activity features comprise the total number of active people / the number of men / the number of persons with prior records (predecessors) in hotels in the current area in the n1/n2/n3/n4 days before the current day, and the same three counts for internet cafes in the current area in the n1/n2/n3/n4 days before the current day;
122b, constructing alarm features of the specific alarm in the current area from the alarm database, wherein the alarm features comprise the number of the specific alarms in the current area in the n1/n2/n3/n4 days before the current day;
122c, obtaining point-of-interest attribute features in the current area from the electronic map;
122d, constructing regional attribute features in the current area from the one-standard three-real database;
122e, acquiring weather data from the Internet to construct the weather features of the current area on the current day;
(123) setting a label for each historical day of each area: judging whether the number of the specific alarms in the N days starting from and including the current day is greater than the number of the specific alarms in the N days before the current day and is not 0; if so, setting the label of the current day to 1, otherwise setting it to 0;
(124) counting, from the alarm database, the historical daily average number of the specific alarms in each grid to obtain the high-alarm-rate grids, namely the grids whose historical daily average number of alarms is higher than 90%;
(125) at the three granularities, constructing a first feature set and a label set from the spatio-temporal features and the corresponding labels of each area, training a neural network model with the first feature set and the label set, taking the output of its intermediate layer as a second feature set, and saving the intermediate-layer model;
(126) training a boosted tree model with the second feature set and the label set at the three granularities;
(127) constructing the spatio-temporal features of each high-alarm-rate grid on the current early warning date according to steps 122a to 122e, inputting them into the boosted tree model, obtaining the probability that the number of the specific alarms of each high-alarm-rate grid in the N days starting from the current date is greater than the number in the preceding N days, and issuing early warnings according to the probability;
wherein the number of prediction days is 1-7, n1 < n2 < n3 < n4, and N is 1-10.
The step (112) further comprises:
112a, differencing the time series data d times and testing the stationarity of the differenced data with a unit root test; if the differenced data are stationary, setting the differencing order to d; otherwise increasing d by 1 and differencing again until stationary time series data are obtained;
112b, plotting the partial autocorrelation function (PACF) and the autocorrelation function (ACF) of the stationary time series data; if the PACF shows a significant spike at lag i and no comparable spike at larger lags, setting p = i; if the ACF shows a significant spike at lag j and no comparable spike at larger lags, setting q = j; constructing the autoregressive moving average models ARMA (0, q), ARMA (p, 0) and ARMA (p, q) respectively, calculating the Akaike information criterion AIC = -2 log (L) + 2 (p + q + k + 1) of the stationary time series data on the three models, selecting the model with the smallest AIC, and thereby determining the parameters (p, q), wherein L is the likelihood of the stationary time series data, k = 1 when c ≠ 0, k = 0 when c = 0, and c is the average change between consecutive observations;
112c, constructing the model ARMA (p, q) from the parameters (p, q), calculating the residuals of the stationary time series data under the model, testing whether the residuals are autocorrelated with the Durbin-Watson (D-W) test, and drawing a quantile (Q-Q) plot to check whether the residuals follow a normal distribution with mean 0 and constant variance, thereby confirming the selected parameters (p, q); if these conditions are not met, returning to the previous step to reselect (p, q), choosing smaller values close to the previously selected ones;
112d, taking the time series data of step (111), using time delays of 7 days, 1 month and 3 months, with the corresponding seasonal trend parameter s set to 7, 12 and 4 respectively, performing seasonal differencing with each delay in turn, applying a unit root test to the differenced data to judge stationarity, and selecting the seasonal trend parameter s that gives the best stationarity;
112e, after determining the seasonal trend parameter s, repeating step 112a on the seasonally differenced data to obtain the seasonal differencing order D, and repeating steps 112b and 112c to obtain the seasonal autoregressive order P and the seasonal moving average order Q.
The early warning according to the probability consists of marking the corresponding grids with four different colors:
marking the grids with a probability greater than 0.68 with a first color;
marking the grids with a probability greater than 0.34 and less than or equal to 0.68 with a second color;
marking the grids with a probability less than or equal to 0.34 with a third color;
and marking the grids that are not high-alarm-rate grids with a fourth color.
The step (122) further comprises:
constructing other time-related features: whether the day is a weekend, whether the day is a holiday, and whether the day is a rest day.
The invention uses all available data, including real-name track data, one-standard three-real data, alarm data, prior-record data, data crawled from the Internet (such as weather and regional building (business) information), and manually labeled data (such as urban village data). It uses multiple models for prediction and early warning, and trains independent prediction models and early warning models for different types of alarm.
Detailed Description
An alarm situation prediction and early warning method comprises the following steps:
First, predicting the number of specific alarms in a prediction time period, wherein the prediction time period is defined by a preset prediction start date and a number of prediction days; the number of prediction days is 1-7, and is 7 in this embodiment.
Here, a specific alarm refers to one particular type of alarm.
The invention trains an independent quantity prediction model for each type of alarm, and can therefore predict the number of alarms of each type more accurately.
(111) Counting the specific alarms occurring before the prediction start date to obtain time series data of the specific alarm: each date on which the specific alarm occurred and the number of occurrences on that date.
(112) Performing statistical analysis on the time series data using the seasonal autoregressive integrated moving average model SARIMA to obtain the trend and distribution of the specific alarm, and determining trend parameters and seasonal parameters from the analysis result, wherein the trend parameters comprise a trend autoregressive order p, a trend differencing order d and a trend moving average order q, and the seasonal parameters comprise a seasonal autoregressive order P, a seasonal differencing order D, a seasonal moving average order Q and a seasonal trend parameter s:
112a, differencing the time series data d times (one step at a time) and testing the stationarity of the differenced data with a unit root test; if the differenced data are stationary, setting the differencing order to d; otherwise increasing d by 1 and differencing again until stationary time series data are obtained. For example, the time series data are differenced once; if the result is stationary, the differencing order is 1; otherwise the data are differenced a second time; if the result is then stationary, the differencing order is 2; otherwise the data are differenced a third time, and so on. Generally, the data become stationary within 3 rounds of differencing.
112b, plotting the partial autocorrelation function (PACF) and the autocorrelation function (ACF) of the stationary time series data; if the PACF shows a significant spike at lag i and no comparable spike at larger lags, setting p = i; if the ACF shows a significant spike at lag j and no comparable spike at larger lags, setting q = j; constructing the autoregressive moving average models ARMA (0, q) (i.e., p = 0), ARMA (p, 0) (i.e., q = 0) and ARMA (p, q) respectively, calculating the Akaike information criterion AIC = -2 log (L) + 2 (p + q + k + 1) of the stationary time series data on the three models, selecting the model with the smallest AIC, and thereby determining the parameters (p, q), wherein L is the likelihood of the stationary time series data, k = 1 when c ≠ 0, k = 0 when c = 0, and c is the average change between consecutive observations. Consecutive observations are consecutive values of the stationary time series obtained after d rounds of differencing; each data value in the series is called an observation. For example, if the observation on 2019-07-07 is 20 and the change from it to the observation on 2019-07-08 is 5, that change is one consecutive-observation change; the average of these changes over all pairs of adjacent dates is the value c.
112c, constructing the model ARMA (p, q) from the determined parameters (p, q), calculating the residuals of the stationary time series data under the model, testing whether the residuals are autocorrelated with the Durbin-Watson (D-W) test, and drawing a quantile (Q-Q) plot to check whether the residuals follow a normal distribution with mean 0 and constant variance, thereby confirming the selected parameters (p, q); if these conditions are not met, returning to the previous step to reselect (p, q), choosing smaller values close to the previously selected ones.
112d, taking the time series data of step (111), using time delays m of 7 days, 1 month and 3 months, with the corresponding seasonal trend parameter s set to 7, 12 and 4 respectively, performing seasonal differencing (differencing with delay m) with each delay in turn, applying a unit root test to the differenced data to judge stationarity, and selecting the seasonal trend parameter s that gives the best stationarity.
112e, after determining the seasonal trend parameter s, repeating step 112a on the seasonally differenced data to obtain the seasonal differencing order D, and repeating steps 112b and 112c to obtain the seasonal autoregressive order P and the seasonal moving average order Q.
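The arithmetic in steps 112a to 112d can be sketched with the Python standard library: ordinary and seasonal differencing, the AIC formula of step 112b, and the Durbin-Watson statistic of step 112c. All function names are hypothetical. The unit root test and the model fitting are deliberately left out; in practice they would come from a statistics library, for example `adfuller` in `statsmodels.tsa.stattools`, which the source does not name.

```python
def difference(series, lag=1, order=1):
    """Difference a series `order` times at the given lag.

    lag=1 is the ordinary differencing of step 112a; lag=7 corresponds to
    the 7-day seasonal differencing of step 112d.
    """
    out = list(series)
    for _ in range(order):
        out = [out[i] - out[i - lag] for i in range(lag, len(out))]
    return out


def aic(log_likelihood, p, q, c):
    """AIC = -2 log(L) + 2 (p + q + k + 1), with k = 1 if c != 0 else 0."""
    k = 1 if c != 0 else 0
    return -2.0 * log_likelihood + 2 * (p + q + k + 1)


def pick_by_aic(candidates):
    """Pick (p, q) with the smallest AIC.

    candidates: iterable of (p, q, log_likelihood, c) tuples, one per fitted
    model, e.g. for ARMA(0, q), ARMA(p, 0) and ARMA(p, q).
    """
    best = min(candidates, key=lambda t: aic(t[2], t[0], t[1], t[3]))
    return best[0], best[1]


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den
```

The log-likelihood values passed to `pick_by_aic` would come from fitting each candidate model, which this sketch does not do.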
(113) Constructing a quantity prediction model of the specific alarm from the trend parameters and the seasonal parameters: SARIMA (p, d, q) x (P, D, Q, s), and training the quantity prediction model on the time series data of the specific alarm so that it fits the data.
(114) Inputting the number of prediction days into the quantity prediction model to obtain the prediction result for the prediction time period.
Because predictions become less accurate as the number of prediction days grows, the number of prediction days in this embodiment should not be too large (at most 7). The model is retrained every day: the parameters are fixed once determined and need not be determined again; only one more day of data is added to the training data. After retraining, the new prediction results overwrite those produced the day before.
Taking alarm A as an example, and assuming that the prediction start date is 2019-05-05, the number of alarms A must be predicted from the time series data of alarm A before 2019-05-05. Suppose the time series data before 2019-05-05 are:
...
2019-05-01 26
2019-05-02 31
2019-05-03 28
2019-05-04 24
The model SARIMA (p, d, q) x (P, D, Q, s) is determined from the above data by the above steps, trained on the time series data so that it fits the data, and then used for prediction. Because it is a time series model, a model trained on the time series data up to a given date can directly predict the data for a specified number of days after that date: if the training data cover 2019-03-01 to 2019-05-04, the trained model directly predicts the following days, up to the specified number of prediction days. If the specified number of prediction days is 4, the data for the four days 2019-05-05 to 2019-05-08 are predicted.
Once the data of 2019-05-05 are available, the parameters of the model SARIMA (p, d, q) x (P, D, Q, s) remain unchanged; the data of 2019-05-05 are added to the training data, which then cover 2019-03-01 to 2019-05-05, and the model is retrained. The number of prediction days is still 4, so the data for the four days 2019-05-06 to 2019-05-09 are predicted; the predictions made the day before for 2019-05-06 to 2019-05-08 are overwritten, and a prediction for 2019-05-09 is added.
Second, issuing early warnings for the specific alarm:
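The daily retrain-and-overwrite bookkeeping described above can be sketched as follows. Only the date handling is shown; the forecast values themselves would come from the retrained SARIMA model. The function names and the dictionary used as a prediction store are hypothetical.

```python
from datetime import date, timedelta


def forecast_window(last_training_day, horizon):
    """Dates covered by a forecast made after training up to last_training_day."""
    return [last_training_day + timedelta(days=i) for i in range(1, horizon + 1)]


def update_forecasts(stored, last_training_day, horizon, predicted_values):
    """Write new predictions into the store, overwriting earlier ones for the same dates."""
    days = forecast_window(last_training_day, horizon)
    stored.update(zip(days, predicted_values))
    return stored
```

With training data up to 2019-05-04 and 4 prediction days, the window is 2019-05-05 to 2019-05-08. Adding the data of 2019-05-05 moves it to 2019-05-06 to 2019-05-09, so three stored values are overwritten and one is added.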
(121) dividing a district into grid areas of n × n meters on an electronic map, and setting three granularity areas from small to large: grid, district and sub-district, n is 450-.
(122) At the three granularities, the spatio-temporal features of each area on each historical day are constructed.
The history generally covers the 2-3 years before the current date, that is, the data of the 2-3 years before the current date are acquired. The length of the history depends on the data in the database: if the database holds less than 2 years of data, all available data are used. For example, if the current date is September 24, 2019 and an early warning is needed for a specific alarm, the current early warning date is September 24, 2019.
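A minimal sketch of assigning a location to an n × n meter grid cell, assuming the map coordinates have already been projected to a planar system measured in meters and that an origin corner of the district is known. Both assumptions, and the name `grid_cell`, are illustrative and not taken from the source. The side length is passed in as a parameter because its value is not fully legible in the text.

```python
import math


def grid_cell(x_m, y_m, origin_x_m, origin_y_m, n_m):
    """Return the (row, col) of the n_m x n_m meter cell containing the point.

    ASSUMPTION: coordinates are planar and in meters, with the origin at the
    south-west corner of the district.
    """
    col = math.floor((x_m - origin_x_m) / n_m)
    row = math.floor((y_m - origin_y_m) / n_m)
    return row, col
```

Alarm records and points of interest can then be grouped by `(row, col)` to build the grid-level features, and aggregated upward for the two coarser granularities.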
122a. Activity features in the current region are constructed from the real-name track database and the prior-offender database. The activity features comprise the total number of people, the number of men and the number of prior offenders with activity in hotels in the current region during the previous n1/n2/n3/n4 days, and the same three counts for internet cafes in the current region during the previous n1/n2/n3/n4 days, where n1 < n2 < n3 < n4.
A "prior offender" here is a person with a prior record for the specific alarm.
In this embodiment, n1, n2, n3 and n4 are 1, 2, 5 and 7, respectively; the same applies below.
For example, for a historical day on which the current day is 2018-08-08, the activity features of region A on that day include the total number of people active in all hotels in region A in the previous 1 day (2018-08-07), in the previous 2 days (2018-08-06 to 2018-08-07), in the previous 5 days and in the previous 7 days.
For a hotel, the number of people is the number registered as checking in; for an internet cafe, it is the number registered as going online.
122b. Alarm features of the specific alarm in the current region are constructed from the alarm database; they comprise the number of occurrences of the specific alarm in the current region during the previous n1/n2/n3/n4 days. For example, if the current day is 2018-08-08 and the region is region A, the features include the number of occurrences of alarm A in region A in the previous 1 day (2018-08-07) and in the previous 2 days (2018-08-06 to 2018-08-07).
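The window features of steps 122a and 122b can be sketched as sums over the n days before the current day. The daily counts below are invented for illustration, and the feature names are hypothetical.

```python
from datetime import date, timedelta

WINDOWS = (1, 2, 5, 7)  # n1..n4 in this embodiment

def window_counts(daily_counts, current_day, windows=WINDOWS):
    """Sum counts over the n days strictly before `current_day`, for each window n."""
    features = {}
    for n in windows:
        days = [current_day - timedelta(days=k) for k in range(1, n + 1)]
        features[f"prev_{n}d"] = sum(daily_counts.get(d, 0) for d in days)
    return features

# Hypothetical daily counts of alarm A in region A.
alarms_area_a = {date(2018, 8, 7): 3, date(2018, 8, 6): 1, date(2018, 8, 2): 4}
features = window_counts(alarms_area_a, date(2018, 8, 8))
```

The same function applies to hotel or internet cafe head counts by passing a different `daily_counts` mapping.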
122c. Point-of-interest attribute features in the current region are obtained from the electronic map: the numbers of hotels, internet cafes, construction sites, traffic hubs, cameras, residential communities, markets, bars, KTVs and banks, the distance to the nearest police station, whether the region is a rural area, and so on. The distance to the nearest police station is calculated from the center of the region.
122d. Region attribute features in the current region are constructed from the one-standard three-real database: the size and distribution of the regional population, the density and distribution of regional buildings, the distribution of regional businesses, and so on.
122e. Weather data are acquired from the Internet to construct the weather features of the current region for the day, such as weather conditions, temperature, wind, air quality and statistics thereof.
Other time-related features can also be constructed: whether the day falls on a weekend, whether it is a holiday, and whether it is a rest day.
(123) A label is set for each historical day of each region: if the number of specific alarms in the N days starting from and including the current day is greater than the number in the N days before the current day, and is not 0, the label of the current day is 1; otherwise it is 0. N is 1-10; in this embodiment N is 7, and the same applies below. For example, if the number of alarms in the following 7 days including the current day (2018-08-08 to 2018-08-14) is greater than the number in the preceding 7 days (2018-08-01 to 2018-08-07) and is not 0, the label is 1; otherwise the label is 0.
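The labelling rule of step (123) can be written as a short function. The counts are hypothetical, and the window boundaries follow the 2018-08-08 example (the current day belongs to the "after" window).

```python
from datetime import date, timedelta

def make_label(daily_counts, current_day, n=7):
    """Return 1 if alarms in the n days from current_day (inclusive) exceed those in the n days before it."""
    after = sum(daily_counts.get(current_day + timedelta(days=k), 0) for k in range(n))
    before = sum(daily_counts.get(current_day - timedelta(days=k), 0) for k in range(1, n + 1))
    # The "not 0" condition mirrors the text; after > before already implies after > 0.
    return 1 if after > before and after != 0 else 0

# Hypothetical daily counts of the specific alarm in one region.
counts = {date(2018, 8, 3): 2, date(2018, 8, 9): 1,
          date(2018, 8, 14): 2, date(2018, 8, 15): 5}
label = make_label(counts, date(2018, 8, 8))
```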
(124) From the alarm database, the historical daily average number of the specific alarm in each grid is counted, and the high-rate grids, whose historical daily average number of alarms is higher than that of 90% of the grids, are obtained.
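A possible sketch of the high-rate grid selection follows. It assumes that "higher than 90%" means a grid's daily average exceeds that of at least 90% of all grids; the grid identifiers and totals are invented.

```python
def high_rate_grids(total_alarms, n_days, quantile=0.9):
    """Return grids whose daily average exceeds that of at least `quantile` of all grids."""
    averages = {grid: total / n_days for grid, total in total_alarms.items()}
    values = list(averages.values())
    selected = set()
    for grid, avg in averages.items():
        share_lower = sum(v < avg for v in values) / len(values)
        if share_lower >= quantile:
            selected.add(grid)
    return selected

# Hypothetical alarm totals per grid over 100 historical days.
totals = {f"g{i}": 10 * i for i in range(1, 11)}
hot = high_rate_grids(totals, n_days=100)
```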
(125) At the three granularities, a first feature set and a label set are constructed from the spatio-temporal features of each region and the corresponding labels; a neural network model is trained with the first feature set and the label set, the output of the middle layer is taken as a second feature set, and the middle-layer model is stored.
For example, at the grid granularity, a single-layer neural network model (NN) is constructed for alarm A; the spatio-temporal features for alarm A and the corresponding labels are fed to the model for training, the output of the middle layer is taken as the new feature set, and the middle-layer model is stored.
The spatio-temporal features are as follows:
Time 1, region 1: feature 1.1, feature 2.1, feature 3.1, feature 4.1, ..., label 1
Time 2, region 1: feature 1.2, feature 2.2, feature 3.2, feature 4.2, ..., label 2
Time 1, region 2: feature 1.3, feature 2.3, feature 3.3, feature 4.3, ..., label 3
Time 2, region 2: feature 1.4, feature 2.4, feature 3.4, feature 4.4, ..., label 4
(126) A boosting tree model is trained with the second feature set and the label set at the three granularities:
At each of the three granularities, the second feature set and the label set are divided into a training set, a validation set and a test set. The training set is used to train the model, the validation set is used to monitor how far training has progressed and to stop it in time, and the test set is used to check the generalization ability of the model (its ability to predict unseen data). The hyper-parameters of the boosting tree model are set and the model is trained; training stops when the accuracy of the model on the validation set no longer improves, and the generalization ability of the model is then checked with the test set. The hyper-parameters are adjusted, and the training step and the test-set check are repeated. Finally, the group of hyper-parameters with the best result on the test set is selected, and the boosting tree model is retrained and stored.
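The split and the stopping rule can be sketched without any particular boosting library. The 70/15/15 split, the ordered (unshuffled) indices and the patience of 3 rounds are assumptions; the text only says that training stops once validation accuracy no longer improves.

```python
def split_indices(n, train_pct=70, val_pct=15):
    """Split range(n) into training, validation and test index lists (percentages are assumptions)."""
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    idx = list(range(n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def early_stop_round(val_accuracy, patience=3):
    """Return the best round, stopping once accuracy has not improved for `patience` rounds."""
    best_round, best_acc = 0, float("-inf")
    for i, acc in enumerate(val_accuracy):
        if acc > best_acc:
            best_round, best_acc = i, acc
        elif i - best_round >= patience:
            break  # validation accuracy has stalled
    return best_round

train_idx, val_idx, test_idx = split_indices(100)
# Hypothetical validation accuracies after each boosting round.
best = early_stop_round([0.60, 0.70, 0.72, 0.71, 0.72, 0.70, 0.90])
```

In the example the late value 0.90 is never reached, because training has already stopped three rounds after the last improvement.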
(127) The spatio-temporal features of each high-rate grid on the early-warning current date are constructed according to steps 122a to 122e and input into the boosting tree model, which yields the probability that the number of specific alarms in each high-rate grid in the N days after the current date is greater than the number in the N days before it; early warning is carried out according to this probability.
In this embodiment, the corresponding grids are marked for early warning in four different colors:
grids with a probability greater than 0.68 are marked in a first color (red);
grids with a probability greater than 0.34 and less than or equal to 0.68 are marked in a second color (orange);
grids with a probability less than or equal to 0.34 are marked in a third color (yellow);
grids that are not high-rate grids are marked in a fourth color (green).
Of course, early warning can also be carried out in other ways according to the probability.
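The four-color rule of this embodiment maps directly onto a small function; the function and argument names are illustrative only.

```python
def warning_color(probability, is_high_rate_grid=True):
    """Map a predicted probability to the warning color used in this embodiment."""
    if not is_high_rate_grid:
        return "green"   # fourth color: not a high-rate grid
    if probability > 0.68:
        return "red"     # first color
    if probability > 0.34:
        return "orange"  # second color
    return "yellow"      # third color

colors = [warning_color(p) for p in (0.9, 0.68, 0.34, 0.1)]
```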
For example, for alarm A, the following spatio-temporal features exist:
Time 1, region 1: feature 1.1, feature 2.1, feature 3.1, feature 4.1, ..., label 1
Time 2, region 1: feature 1.2, feature 2.2, feature 3.2, feature 4.2, ..., label 2
Time 1, region 2: feature 1.3, feature 2.3, feature 3.3, feature 4.3, ..., label 3
Time 2, region 2: feature 1.4, feature 2.4, feature 3.4, feature 4.4, ..., label 4
From these, the first feature set X is obtained, that is,
X = [ feature 1.1, feature 2.1, feature 3.1, feature 4.1, ...
      feature 1.2, feature 2.2, feature 3.2, feature 4.2, ...
      feature 1.3, feature 2.3, feature 3.3, feature 4.3, ...
      feature 1.4, feature 2.4, feature 3.4, feature 4.4, ...
      ... ],
and the label set y is obtained, that is, y = [label 1, label 2, label 3, label 4, ...].
X and y are input into the neural network model for training, the middle-layer model is stored, and the second feature set is obtained, that is:
[ new feature 1.1, new feature 2.1, new feature 3.1, new feature 4.1, ...
  new feature 1.2, new feature 2.2, new feature 3.2, new feature 4.2, ...
  new feature 1.3, new feature 2.3, new feature 3.3, new feature 4.3, ...
  new feature 1.4, new feature 2.4, new feature 3.4, new feature 4.4, ...
  ... ].
The second feature set and the label set y are then taken as input, divided into a training set, a validation set and a test set, and used to train the boosting tree model.
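How a stored middle-layer model turns the first feature set into the second one can be sketched as a single forward pass. The sigmoid activation, the layer width and the weight values are all assumptions for illustration; in the described method the weights would come from training the network on X and y.

```python
import math

def hidden_features(x, weights, biases):
    """Output of the middle (hidden) layer for one sample: sigmoid(W x + b)."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
            for row, b in zip(weights, biases)]

# Hypothetical trained middle-layer parameters: 3 hidden units over 2 input features.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
b = [0.0, 0.1, -0.1]

X = [[0.0, 0.0], [1.0, 2.0]]                          # first feature set (toy values)
second_set = [hidden_features(x, W, b) for x in X]    # second feature set
```

Each row of `second_set` would then be paired with its label and passed to the boosting tree model.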
Suppose the early-warning current date is 2019-05-05 and N is 7. The data used for training are the data before 2019-05-05 (2019-03-01 to 2019-05-04): the spatio-temporal features and labels of all regions for 2019-03-01 to 2019-05-04 are constructed to form the feature set and the label set. The feature set is passed through the neural network middle-layer model to obtain the new feature set, which is input into the trained boosting tree model; the model outputs a predicted value, which is the probability that the number of alarms in the current region in the 7 days after the early-warning current date is greater than the number in the previous 7 days.
The invention uses all available data, including real-name track data, one-standard three-real data, alarm data, prior-offender data, data crawled from the Internet (such as weather and regional building and business information) and manually labeled data (such as urban village data). It uses several models for prediction and early warning, and trains separate prediction models and early-warning models (boosting tree models) for different alarms.
The scheme also relates to an alarm situation prediction and early warning system comprising a storage module in which a plurality of instructions are stored, the instructions being loaded and executed by a processor to perform the following:
predicting the number of specific alarms in a prediction time period, where the prediction time period is defined by a preset prediction start date and a number of prediction days; the number of prediction days is 1-7, and is 7 in this embodiment.
Here, a specific alarm refers to a certain type of alarm.
The invention trains a separate quantity prediction model for each type of alarm, so that the number of the corresponding alarms can be predicted more accurately.
(111) The occurrences of the specific alarm before the prediction start date are counted to obtain the time series data of the specific alarm: each date on which the specific alarm occurred and the number of occurrences on that date.
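Building the time series of step (111) amounts to counting alarm records per date. The records below are invented, and filling days without any alarm with 0 is an assumption made so that the series has one value per calendar day.

```python
from collections import Counter
from datetime import date, timedelta

def daily_series(alarm_dates, start, end):
    """Return [(day, count)] for every day from start to end inclusive; missing days get 0."""
    counts = Counter(alarm_dates)
    n_days = (end - start).days + 1
    days = [start + timedelta(days=k) for k in range(n_days)]
    return [(day, counts[day]) for day in days]

# Hypothetical alarm records: one date per reported alarm.
records = [date(2019, 5, 1), date(2019, 5, 1), date(2019, 5, 3)]
series = daily_series(records, date(2019, 5, 1), date(2019, 5, 3))
```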
(112) The time series data are statistically analyzed with the seasonal autoregressive integrated moving average model SARIMA to obtain the trend and distribution of the specific alarm, and trend parameters and seasonal parameters are determined from the analysis result. The trend parameters comprise the trend autoregressive order p, the trend difference order d and the trend moving average order q; the seasonal parameters comprise the seasonal autoregressive order P, the seasonal difference order D, the seasonal moving average order Q and the seasonal trend parameter s:
112a. The time series data are differenced d times (first-order differences), starting with d = 1, and a unit root test is used to check the stationarity of the differenced data. If the differenced data are stationary, the difference order is d; otherwise d is increased by 1 and differencing continues until stationary time series data are obtained. That is, the time series data are differenced once and, if the result is stationary, the difference order is 1; otherwise they are differenced twice and, if the result is stationary, the order is 2; otherwise three times, and so on. In general, 3 differences are enough to make the data stationary.
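The search for the difference order d can be sketched as a loop around a stationarity check. The check is passed in as a function; `toy_is_stationary` below is only a placeholder for a real unit root test and is an assumption of this sketch.

```python
def difference(series):
    """First-order difference: y[t] - y[t-1]."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

def find_difference_order(series, is_stationary, max_d=3):
    """Difference repeatedly until `is_stationary` accepts the data; return (d, differenced data)."""
    current = list(series)
    for d in range(1, max_d + 1):
        current = difference(current)
        if is_stationary(current):
            return d, current
    raise ValueError("series is still not stationary after max_d differences")

def toy_is_stationary(values):
    """Placeholder for a unit root test: accepts only a constant series."""
    return max(values) == min(values)

d, stationary = find_difference_order([t * t for t in range(10)], toy_is_stationary)
```

A quadratic trend needs two differences before it becomes constant, so the sketch returns d = 2 for this toy input.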
112b. The partial autocorrelation graph and the autocorrelation graph of the stationary time series data are plotted. If the partial autocorrelation graph shows an obvious spike at lag i but no similar spike at larger lags, p is set equal to i; if the autocorrelation graph shows an obvious spike at lag j but no similar spike at larger lags, q is set equal to j. The autoregressive moving average models ARMA(0, q) (that is, p = 0), ARMA(p, 0) (that is, q = 0) and ARMA(p, q) are constructed, and the Akaike information criterion AIC = -2 log(L) + 2(p + q + k + 1) of the stationary time series data is calculated for each of the three models; the model with the smallest AIC is selected, which determines the parameters (p, q). Here L is the likelihood of the stationary time series data, k = 1 when c ≠ 0 and k = 0 when c = 0, and c is the average of the changes between consecutive observations. Consecutive observations are consecutive values of the stationary time series obtained after d differences, each data value in the series being one observation. For example, if the observation on 2019-07-07 is 20 and the change from it to the observation on 2019-07-08 is 5, then 5 is one such change; c is the average of the differences between the observations on all pairs of adjacent dates.
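The AIC comparison among the three candidate models can be sketched as below. The log-likelihood values are invented; in practice they would come from fitting each ARMA model to the stationary series.

```python
def aic(log_likelihood, p, q, c_nonzero):
    """AIC = -2 log(L) + 2 (p + q + k + 1), with k = 1 if c != 0 and k = 0 otherwise."""
    k = 1 if c_nonzero else 0
    return -2.0 * log_likelihood + 2 * (p + q + k + 1)

# Hypothetical log-likelihoods for ARMA(0, q), ARMA(p, 0) and ARMA(p, q) with p = 1, q = 2.
candidates = {(0, 2): -120.5, (1, 0): -123.0, (1, 2): -119.8}
scores = {pq: aic(ll, pq[0], pq[1], c_nonzero=True) for pq, ll in candidates.items()}
best_pq = min(scores, key=scores.get)
```

The penalty term grows with p + q, so the largest model is not chosen unless its likelihood gain outweighs the extra parameters.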
112c. The model ARMA(p, q) is constructed from the determined parameters (p, q), and the residuals of the stationary time series data under ARMA(p, q) are calculated. The Durbin-Watson (D-W) test is used to check whether the residuals are autocorrelated, and a quantile-quantile (Q-Q) plot is drawn to check whether the residuals follow a normal distribution with mean 0 and constant variance, which further confirms the selected parameters (p, q). If these conditions are not met, the procedure returns to the previous step to reselect (p, q), choosing smaller values close to the previously selected ones.
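The Durbin-Watson statistic used in the residual check is the sum of squared successive differences of the residuals divided by their sum of squares. Values near 2 indicate no first-order autocorrelation, values near 0 positive autocorrelation and values near 4 negative autocorrelation. The residuals below are toy values.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e[t] - e[t-1])^2) / sum(e[t]^2)."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # strongly negatively autocorrelated
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])       # strongly positively autocorrelated
```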
112d. The time series data from step (111) are taken, with time lags m of 7 days, 1 month and 3 months corresponding to seasonal trend parameters s of 7, 12 and 4, respectively. Seasonal differencing (differencing with lag m) is carried out with each lag in turn, a unit root test is applied to the differenced data to judge their stationarity, and the seasonal trend parameter s that gives the best stationarity is selected.
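Seasonal differencing subtracts from each value the value one season earlier. In the sketch below the variance of the differenced data is a toy stand-in for the unit root test, and all three candidate lags are applied to one daily toy series, which simplifies the pairing of lags (7 days, 1 month, 3 months) with s = 7, 12 and 4 described above.

```python
from statistics import pvariance

def seasonal_difference(series, m):
    """Seasonal difference with lag m: y[t] - y[t-m]."""
    return [series[t] - series[t - m] for t in range(m, len(series))]

def choose_seasonal_period(series, candidates=(7, 12, 4)):
    """Pick the lag whose differenced data look most stable (lowest variance; a toy criterion)."""
    scores = {s: pvariance(seasonal_difference(series, s)) for s in candidates}
    return min(scores, key=scores.get), scores

weekly = [(t % 7) + 1 for t in range(28)]  # toy series with an exact 7-day cycle
best_s, scores = choose_seasonal_period(weekly)
```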
112e. After the seasonal trend parameter s is determined, step 112a is repeated on the seasonally differenced data to obtain the seasonal difference order D, and steps 112b and 112c are repeated to obtain the seasonal autoregressive order P and the seasonal moving average order Q.
(113) A quantity prediction model for the specific alarm is constructed from the trend parameters and the seasonal parameters: SARIMA(p, d, q) × (P, D, Q, s). The model is trained on the time series data of the specific alarm so that it fits the data.
(114) The number of prediction days is input into the quantity prediction model to obtain the prediction result for the prediction time period.
Because predictions become less accurate as the number of prediction days grows, the number of prediction days in this embodiment should not be too large (<7). The model is retrained every day; the parameters do not need to be determined again: once fixed they stay fixed, and only one more day of data is added to the training data. After each retraining, the new prediction result overwrites the previous day's.
Taking alarm A as an example, assume the prediction start date is 2019-05-05. The number of occurrences of alarm A must then be predicted from the time series data of alarm A before 2019-05-05. Suppose the time series data before 2019-05-05 are:
...
2019-05-01 26
2019-05-02 31
2019-05-03 28
2019-05-04 24
The model SARIMA(p, d, q) × (P, D, Q, s) is determined from the above data by the above steps and trained on the time series data so that it fits the data; the trained model is then used for prediction. Because the model is a time series model, once it has been trained on the time series data up to a certain point it can directly predict the data for a specified number of days after that point: if the training data cover 2019-03-01 to 2019-05-04, the trained model can directly predict the data for the specified number of prediction days. If the specified number of prediction days is 4, the data for the four days 2019-05-05 to 2019-05-08 are predicted.
Once 2019-05-05 has elapsed, the parameters of the model SARIMA(p, d, q) × (P, D, Q, s) remain unchanged; the data for 2019-05-05 are added to the training data, so that the training data cover 2019-03-01 to 2019-05-05, and the model is retrained. The number of prediction days is still 4, so data for the four days 2019-05-06 to 2019-05-09 are predicted: the previous day's predictions for 2019-05-06 to 2019-05-08 are overwritten, and a prediction for 2019-05-09 is added.
However, those skilled in the art should recognize that the above embodiments are illustrative only and do not limit the present invention; changes and modifications to the above embodiments fall within the scope of the appended claims provided they fall within the true spirit of the present invention.

Claims (8)

1. An alarm situation prediction and early warning method is characterized by comprising the following steps:
predicting the number of specific alarms in a prediction time period, wherein the prediction time period comprises a preset prediction starting date and a prediction number of days:
(111) counting the specific alarm condition occurring before the prediction starting date to obtain the time sequence data of the specific alarm condition: the date on which the specific alarm condition occurs and the occurrence number of the specific alarm condition on the date;
(112) performing statistical analysis on the time series data by adopting a seasonal autoregressive integrated moving average model SARIMA to obtain the trend and the distribution of the specific alarm, and determining trend parameters and seasonal parameters according to the analysis result, wherein the trend parameters comprise a trend autoregressive order p, a trend difference order d and a trend moving average order q, and the seasonal parameters comprise a seasonal autoregressive order P, a seasonal difference order D, a seasonal moving average order Q and a seasonal trend parameter s;
(113) constructing a quantity prediction model of the specific alarm according to the trend parameters and the seasonal parameters: SARIMA(p, d, q) × (P, D, Q, s), and training the quantity prediction model on the time series data of the specific alarm so that it fits the time series data;
(114) inputting the prediction days into the quantity prediction model to obtain a prediction result in the prediction time period;
secondly, early warning is carried out on specific warning situations:
(121) dividing a district into grid areas of n × n meters on an electronic map, and setting three granularity areas from small to large: grid, district and sub-district, n is 450-;
(122) at the three granularities, the spatio-temporal characteristics of each region at each day of history are constructed:
122a, constructing activity features in the current area according to the real-name track database and the prior-offender database, wherein the activity features comprise the total number of people / number of men / number of prior offenders with activity in hotels in the current area in the n1/n2/n3/n4 days before the current day, and the total number of people / number of men / number of prior offenders with activity in internet cafes in the current area in the n1/n2/n3/n4 days before the current day;
122b, constructing the alarm characteristics of the specific alarm in the current area according to an alarm database, wherein the alarm characteristics comprise the alarm quantity of the specific alarm in the current area n1/n2/n3/n4 days before the current day;
122c, obtaining interest point attribute characteristics in the current area from the electronic map;
122d, constructing the region attribute features in the current area according to the one-standard three-real database;
122e, acquiring weather data from the Internet to construct weather characteristics in the current area of the day;
(123) setting a label for each historical day of each area: judging whether the number of the specific alarms in the N days from and including the current day is greater than the number of the specific alarms in the N days before the current day and is not 0; if so, the label of the current day is 1, otherwise the label of the current day is 0;
(124) counting, according to an alarm database, the historical daily average number of the specific alarm in each grid to obtain high-rate grids whose historical daily average number of alarms is higher than that of 90% of the grids;
(125) on the three granularities, a first feature set and a label set are constructed through the space-time features and the corresponding labels of each region, a neural network model is trained through the first feature set and the label set, the output of the middle layer is taken as a second feature set, and the middle layer model is stored;
(126) training a boosting tree model through the second feature set and the label set at the three granularities;
(127) constructing the spatio-temporal features of each high-rate grid on the early-warning current date according to steps 122a to 122e, inputting the spatio-temporal features into the boosting tree model, obtaining the probability that the number of the specific alarms of each high-rate grid in the N days after the current date is greater than the number of the specific alarms in the N days before it, and carrying out early warning according to the probability;
wherein the number of prediction days is 1-7, n1 < n2 < n3 < n4, and N is 1-10.
2. An alert situation prediction and warning method according to claim 1, wherein the step (112) further comprises:
112a, differencing the time series data d times, using a unit root test to check the stationarity of the differenced data, determining the difference order as d if the differenced data are stationary, and otherwise increasing d by 1 and continuing to difference until stationary time series data are obtained;
112b, plotting a partial autocorrelation graph and an autocorrelation graph of the stationary time series data, wherein when the partial autocorrelation graph shows an obvious spike at lag i but no similar spike at larger lags, p is set equal to i; when the autocorrelation graph shows an obvious spike at lag j but no similar spike at larger lags, q is set equal to j; autoregressive moving average models ARMA(0, q), ARMA(p, 0) and ARMA(p, q) are respectively constructed, the Akaike information criterion AIC = -2 log(L) + 2(p + q + k + 1) of the stationary time series data is calculated for the three models, the model with the smallest AIC is selected, and the parameters (p, q) are determined, wherein L is the likelihood of the stationary time series data, k = 1 when c ≠ 0, k = 0 when c = 0, and c is the average of the changes between consecutive observations;
112c, constructing the model ARMA(p, q) according to the parameters (p, q), calculating the residuals of the stationary time series data under the model ARMA(p, q), using the Durbin-Watson (D-W) test to check whether the residuals are autocorrelated, drawing a quantile-quantile (Q-Q) plot to check whether the residuals follow a normal distribution with mean 0 and constant variance, thereby further confirming the selected parameters (p, q), and, if the conditions are not met, returning to the previous step to reselect the parameters (p, q), choosing smaller values close to the previously selected parameters;
112d, obtaining time sequence data in the step (111), taking time delays of 7 days, 1 month and 3 months, respectively setting corresponding seasonal trend parameters s as 7, 12 and 4, sequentially using the time delays to carry out seasonal differentiation, carrying out unit root inspection on differentiated data, judging data stability, selecting the seasonal trend parameter s with the best stability, and determining the value of the seasonal trend parameter s;
112e, after determining the seasonal trend s, repeating the step 112a using the seasonally differentiated data to obtain a seasonal differentiation order D, and repeating the steps 112b and 112c to obtain a seasonal autoregressive order P and a seasonal moving average order Q.
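As an illustration of the model selection in step 112b and the residual check in step 112c, the following minimal Python sketch evaluates the AIC formula given in the claim and the Durbin-Watson statistic. The function names and the log-likelihood values are hypothetical and chosen only for the example; in practice the log-likelihoods would come from fitting each candidate ARMA model.

```python
def aic(log_likelihood, p, q, has_constant):
    """AIC = -2 log(L) + 2 (p + q + k + 1), with k = 1 if c != 0, else k = 0."""
    k = 1 if has_constant else 0
    return -2.0 * log_likelihood + 2.0 * (p + q + k + 1)


def pick_min_aic(candidates):
    """candidates: list of (p, q, log_likelihood, has_constant).

    Returns the (p, q) pair of the candidate with the smallest AIC.
    """
    best = min(candidates, key=lambda c: aic(c[2], c[0], c[1], c[3]))
    return best[0], best[1]


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical log-likelihoods for ARMA(0, q), ARMA(p, 0), ARMA(p, q)
# with p = 2 and q = 1; these numbers are made up for illustration.
candidates = [
    (0, 1, -120.0, True),
    (2, 0, -118.5, True),
    (2, 1, -118.0, True),
]
best_pq = pick_min_aic(candidates)
print(best_pq)
```

A Durbin-Watson value close to 2 is conventionally read as no first-order autocorrelation in the residuals, which is the condition step 112c checks before accepting (p, q).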
3. An alarm situation prediction and early warning method according to claim 1 or 2, wherein the early warning according to the probability comprises marking the corresponding grids in four different colors:
marking grids with a probability greater than 0.68 in a first color;
marking grids with a probability greater than 0.34 and less than or equal to 0.68 in a second color;
marking grids with a probability less than or equal to 0.34 in a third color;
and marking grids that are not alarm high-rate grids in a fourth color.
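The four-color rule of this claim can be sketched as a small Python function. The function name, the argument names and the string labels for the colors are assumptions made for illustration; the claim itself only fixes the thresholds 0.34 and 0.68.

```python
def warning_color(probability, is_high_rate_grid=True):
    """Map a predicted probability to one of the four color classes.

    Grids that are not high-rate grids always receive the fourth color,
    regardless of the probability value.
    """
    if not is_high_rate_grid:
        return "fourth"
    if probability > 0.68:
        return "first"
    if probability > 0.34:
        return "second"
    return "third"


for prob in (0.9, 0.68, 0.5, 0.34, 0.1):
    print(prob, warning_color(prob))
```

Note that both thresholds are inclusive on the lower class: a probability of exactly 0.68 falls in the second color and exactly 0.34 in the third.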
4. A method for alarm situation prediction and forewarning according to claim 3, characterized in that said step (122) further comprises:
constructing other time-related features: whether the day is a weekend, whether the day is a holiday, and whether the day is an adjusted rest day.
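A minimal Python sketch of these calendar features follows. The holiday and rest-day sets are hypothetical inputs that the caller would supply from an official calendar, and the third flag assumes that the claim's last feature refers to days designated as rest days by a calendar adjustment; the claim does not define it further.

```python
from datetime import date


def time_features(day, holidays, rest_days):
    """Return the three calendar flags for one day.

    holidays and rest_days are sets of datetime.date supplied by the caller.
    """
    return {
        "is_weekend": day.weekday() >= 5,  # Monday is 0, so 5 and 6 are Sat/Sun
        "is_holiday": day in holidays,
        "is_rest_day": day in rest_days,
    }


# Hypothetical calendar data for illustration only.
holidays = {date(2019, 10, 1)}
rest_days = set()

print(time_features(date(2019, 10, 1), holidays, rest_days))
print(time_features(date(2019, 10, 5), holidays, rest_days))
```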
5. An alarm situation prediction and early warning system is characterized by comprising a storage module, wherein a plurality of instructions are stored in the storage module, and the instructions are loaded and executed by a processor:
predicting the number of specific alarms in a prediction time period, wherein the prediction time period comprises a preset prediction starting date and a prediction number of days:
(111) counting the specific alarm condition occurring before the prediction starting date to obtain the time sequence data of the specific alarm condition: the date on which the specific alarm condition occurs and the occurrence number of the specific alarm condition on the date;
(112) performing statistical analysis on the time series data with a seasonal autoregressive integrated moving average model SARIMA to obtain the trend and distribution of the specific alarm, and determining trend parameters and seasonal parameters from the analysis result, wherein the trend parameters comprise a trend autoregressive order p, a trend differencing order d and a trend moving average order q, and the seasonal parameters comprise a seasonal autoregressive order P, a seasonal differencing order D, a seasonal moving average order Q and a seasonal trend parameter s;
(113) constructing a quantity prediction model of the specific alarm according to the trend parameters and the seasonal parameters: SARIMA(p, d, q) × (P, D, Q, s), and training the quantity prediction model on the time series data of the specific alarm so that it fits that data;
(114) inputting the prediction days into the quantity prediction model to obtain a prediction result in the prediction time period;
secondly, early warning is carried out on specific warning situations:
(121) dividing a district into grid areas of n × n meters on an electronic map, and setting three granularity areas from small to large: grid, district and sub-district, n is 450-;
(122) at the three granularities, the spatio-temporal characteristics of each region at each day of history are constructed:
122a, constructing activity features in the current area according to the real-name track database and the prior-offender database, wherein the activity features comprise the total number of activities / the number of people / the number of prior offenders in hotels in the current area in the n1/n2/n3/n4 days before the current day, and the total number of activities / the number of people / the number of prior offenders in internet cafes in the current area in the n1/n2/n3/n4 days before the current day;
122b, constructing the alarm characteristics of the specific alarm in the current area according to an alarm database, wherein the alarm characteristics comprise the alarm quantity of the specific alarm in the current area n1/n2/n3/n4 days before the current day;
122c, obtaining interest point attribute characteristics in the current area from the electronic map;
122d, constructing the regional attribute characteristics in the current region according to a standard three-entity database;
122e, acquiring weather data from the Internet to construct weather characteristics in the current area of the day;
(123) setting a label for each historical day of each area: judging whether the number of specific alarms in the N days from the current day onward is greater than the number of specific alarms in the N days before the current day and is not 0; if so, setting the label of the current day to 1; otherwise, setting the label of the current day to 0;
(124) according to the alarm database, calculating the historical daily average number of the specific alarms in each grid to obtain the alarm high-rate grids, whose historical daily average number of alarms is higher than 90%;
(125) on the three granularities, a first feature set and a label set are constructed through the space-time features and the corresponding labels of each region, a neural network model is trained through the first feature set and the label set, the output of the middle layer is taken as a second feature set, and the middle layer model is stored;
(126) training a boosted tree model with the second feature set and the label set at the three granularities;
(127) constructing the spatiotemporal features of each alarm high-rate grid on the current early-warning date according to steps 122a to 122e, inputting the spatiotemporal features into the boosted tree model, obtaining the probability that the number of specific alarms in each alarm high-rate grid over the N days following the current date is greater than the number over the preceding N days, and issuing an early warning according to the probability;
wherein the number of prediction days is 1 to 7, n1 < n2 < n3 < n4, and N is 1 to 10.
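The labeling rule of step (123) can be sketched in Python as follows. The translated wording of the condition is ambiguous, so this sketch rests on two assumptions: the forward window is taken as the N days starting at the current day, and label 1 marks the case where the forward count exceeds the backward count and is not 0. The function name and the sample counts are hypothetical.

```python
def build_labels(daily_counts, n):
    """Label each day that has n days of history and n days of future data.

    daily_counts: alarm counts per day for one area, oldest first.
    Returns a dict mapping day index -> label (1 or 0).
    """
    labels = {}
    for t in range(n, len(daily_counts) - n + 1):
        after = sum(daily_counts[t:t + n])    # n days starting at the current day
        before = sum(daily_counts[t - n:t])   # n days immediately before it
        labels[t] = 1 if (after > before and after != 0) else 0
    return labels


# Made-up daily counts for one grid, for illustration only.
counts = [0, 1, 0, 2, 3, 0, 0, 1]
labels = build_labels(counts, n=2)
print(labels)
```

Days too close to either end of the series are left unlabeled, since one of the two windows would be incomplete.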
6. An alarm situation prediction and warning system according to claim 5, wherein the step (112) further comprises:
112a, differencing the time series data d times and testing the stationarity of the differenced data with a unit root test; if the differenced data are stationary, determining the differencing order parameter as d; otherwise, increasing d by 1 and continuing to difference until stationary time series data are obtained;
112b, plotting the partial autocorrelation function and the autocorrelation function of the stationary time series data; when the partial autocorrelation plot shows a significant spike at lag i but no similar spike at larger lags, setting p = i; when the autocorrelation plot shows a significant spike at lag j but no similar spike at larger lags, setting q = j; constructing the autoregressive moving average models ARMA(0, q), ARMA(p, 0) and ARMA(p, q) respectively, calculating the Akaike information criterion AIC = -2 log(L) + 2(p + q + k + 1) of the stationary time series data on the three models, selecting the model with the minimum AIC, and determining the parameters (p, q), wherein L is the likelihood of the stationary time series data, k = 1 when c ≠ 0, k = 0 when c = 0, and c is the average change between consecutive observations;
112c, constructing the autoregressive moving average model ARMA(p, q) according to the parameters (p, q), calculating the residuals of the stationary time series data under the model ARMA(p, q), using the Durbin-Watson (D-W) test to check whether the residuals are autocorrelated, and drawing a quantile (Q-Q) plot to check whether the residuals follow a normal distribution with mean 0 and constant variance, thereby confirming the selected parameters (p, q); if these conditions are not met, returning to the previous step to reselect the parameters (p, q), choosing smaller values close to the previously selected ones;
112d, obtaining the time series data of step (111), taking lags of 7 days, 1 month and 3 months with corresponding seasonal trend parameters s of 7, 12 and 4, performing seasonal differencing with each lag in turn, applying a unit root test to the differenced data to judge its stationarity, and selecting the value of the seasonal trend parameter s that gives the best stationarity;
112e, after determining the seasonal trend parameter s, repeating step 112a on the seasonally differenced data to obtain the seasonal differencing order D, and repeating steps 112b and 112c to obtain the seasonal autoregressive order P and the seasonal moving average order Q.
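The choice of the seasonal parameter s in step 112d can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the claimed procedure itself: the unit root test is replaced by a basic Dickey-Fuller t-statistic with no constant and no lagged terms, "best stationarity" is taken to mean the most negative statistic, and every candidate lag is applied to the same series without the resampling that monthly or quarterly periods would imply. All function names and the sample series are hypothetical.

```python
import math


def seasonal_difference(series, s):
    """Return y[t] - y[t - s] for every t where the lagged value exists."""
    return [series[t] - series[t - s] for t in range(s, len(series))]


def dickey_fuller_stat(series):
    """t-statistic of rho in dy[t] = rho * y[t-1] + e[t].

    Simplified form: no constant and no extra lagged differences.
    More negative values are stronger evidence against a unit root.
    """
    y_lag = series[:-1]
    dy = [series[t] - series[t - 1] for t in range(1, len(series))]
    sxx = sum(v * v for v in y_lag)
    if sxx == 0:
        # All-zero lagged values: nothing is left to difference, so this
        # sketch treats the series as fully stationary.
        return -math.inf
    rho = sum(a * b for a, b in zip(y_lag, dy)) / sxx
    resid = [b - rho * a for a, b in zip(y_lag, dy)]
    s2 = sum(e * e for e in resid) / (len(resid) - 1)
    if s2 == 0:
        return math.copysign(math.inf, rho)
    return rho / math.sqrt(s2 / sxx)


def pick_seasonal_period(series, candidates=(7, 12, 4)):
    """Return the candidate s whose seasonal difference looks most stationary."""
    return min(candidates,
               key=lambda s: dickey_fuller_stat(seasonal_difference(series, s)))


# Made-up daily counts with an exact weekly pattern, for illustration only.
weekly_pattern = [3, 5, 4, 6, 8, 12, 10]
series = weekly_pattern * 8
best_s = pick_seasonal_period(series)
print(best_s)
```

With real data a full augmented Dickey-Fuller test from a statistics library would normally be used in place of the simplified statistic shown here.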
7. An alarm situation prediction and early warning system according to claim 5 or 6, wherein the early warning according to the probability comprises marking the corresponding grids in four different colors:
marking grids with a probability greater than 0.68 in a first color;
marking grids with a probability greater than 0.34 and less than or equal to 0.68 in a second color;
marking grids with a probability less than or equal to 0.34 in a third color;
and marking grids that are not alarm high-rate grids in a fourth color.
8. An alert situation prediction and warning system according to claim 7, wherein the step (122) further comprises:
constructing other time-related features: whether the day is a weekend, whether the day is a holiday, and whether the day is an adjusted rest day.
CN201911035111.5A 2019-10-29 2019-10-29 Method and system for predicting and early warning situation Pending CN110889536A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911035111.5A CN110889536A (en) 2019-10-29 2019-10-29 Method and system for predicting and early warning situation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911035111.5A CN110889536A (en) 2019-10-29 2019-10-29 Method and system for predicting and early warning situation

Publications (1)

Publication Number Publication Date
CN110889536A true CN110889536A (en) 2020-03-17

Family

ID=69746569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911035111.5A Pending CN110889536A (en) 2019-10-29 2019-10-29 Method and system for predicting and early warning situation

Country Status (1)

Country Link
CN (1) CN110889536A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353828A (en) * 2020-03-30 2020-06-30 中国工商银行股份有限公司 Method and device for predicting number of people arriving at store from network
CN113570846A (en) * 2021-06-08 2021-10-29 北京交通大学 Traffic warning situation analysis and research method, equipment and readable storage medium
CN114418071A (en) * 2022-01-24 2022-04-29 中国光大银行股份有限公司 Cyclic neural network training method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107331132A (en) * 2017-08-04 2017-11-07 深圳航天智慧城市系统技术研究院有限公司 A kind of method and system of Urban Fires hidden danger dynamic prediction monitoring
CN107392644A (en) * 2017-06-19 2017-11-24 华南理工大学 A kind of commodity purchasing predicts modeling method
KR101830522B1 (en) * 2016-08-22 2018-02-21 가톨릭대학교 산학협력단 Method for predicting crime occurrence of prediction target region using big data
CN109214716A (en) * 2018-10-17 2019-01-15 四川佳联众合企业管理咨询有限公司 Mountain fire risk profile modeling method based on stacking algorithm
CN109376227A (en) * 2018-10-29 2019-02-22 山东大学 A kind of prison term prediction technique based on multitask artificial neural network
CN109447331A (en) * 2018-10-17 2019-03-08 四川佳联众合企业管理咨询有限公司 Mountain fire Risk Forecast Method based on stacking algorithm
CN110008979A (en) * 2018-12-13 2019-07-12 阿里巴巴集团控股有限公司 Abnormal data prediction technique, device, electronic equipment and computer storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101830522B1 (en) * 2016-08-22 2018-02-21 가톨릭대학교 산학협력단 Method for predicting crime occurrence of prediction target region using big data
CN107392644A (en) * 2017-06-19 2017-11-24 华南理工大学 A kind of commodity purchasing predicts modeling method
CN107331132A (en) * 2017-08-04 2017-11-07 深圳航天智慧城市系统技术研究院有限公司 A kind of method and system of Urban Fires hidden danger dynamic prediction monitoring
CN109214716A (en) * 2018-10-17 2019-01-15 四川佳联众合企业管理咨询有限公司 Mountain fire risk profile modeling method based on stacking algorithm
CN109447331A (en) * 2018-10-17 2019-03-08 四川佳联众合企业管理咨询有限公司 Mountain fire Risk Forecast Method based on stacking algorithm
CN109376227A (en) * 2018-10-29 2019-02-22 山东大学 A kind of prison term prediction technique based on multitask artificial neural network
CN110008979A (en) * 2018-12-13 2019-07-12 阿里巴巴集团控股有限公司 Abnormal data prediction technique, device, electronic equipment and computer storage medium

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
ALIF RIDZUAN KHAIRUDDIN等: ""Comparative Study on Artificial Intelligence Techniques in Crime Forecasting"", 《APPLIED MECHANICS AND MATERIALS》 *
SOKRATIS PAPADOPOULOS等: ""Short-term electricity load forecasting using time series and ensemble learning methods"", 《2015 IEEE POWER AND ENERGY CONFERENCE AT ILLINOIS (PECI)》 *
SUHONG KIM等: ""Crime Analysis Through Machine Learning"", 《2018 IEEE 9TH ANNUAL INFORMATION TECHNOLOGY, ELECTRONICS AND MOBILE COMMUNICATION CONFERENCE (IEMCON)》 *
丁红军: ""基于Elman神经网络110警情预测研究"", 《网络安全技术与应用》 *
赖慧慧: ""大数据背景下基于 ARMA 模型的增值税销项税额预测"", 《税务研究》 *
陈鹏等: ""基于时间序列模型的110警情数据预测研究"", 《信息系统工程》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353828A (en) * 2020-03-30 2020-06-30 中国工商银行股份有限公司 Method and device for predicting number of people arriving at store from network
CN111353828B (en) * 2020-03-30 2023-09-12 中国工商银行股份有限公司 Method and device for predicting number of people coming to store at website
CN113570846A (en) * 2021-06-08 2021-10-29 北京交通大学 Traffic warning situation analysis and research method, equipment and readable storage medium
CN113570846B (en) * 2021-06-08 2022-11-04 北京交通大学 Traffic warning situation analysis and judgment method, equipment and readable storage medium
CN114418071A (en) * 2022-01-24 2022-04-29 中国光大银行股份有限公司 Cyclic neural network training method

Similar Documents

Publication Publication Date Title
Ali et al. A data-driven approach for multi-scale GIS-based building energy modeling for analysis, planning and support decision making
CN107844915B (en) Automatic scheduling method of call center based on traffic prediction
Meng et al. Degree-day based non-domestic building energy analytics and modelling should use building and type specific base temperatures
CN109242049B (en) Water supply pipe network multipoint leakage positioning method and device based on convolutional neural network
CN113361665B (en) Highland mountain tourism safety risk early warning method based on reinforcement learning
Liu et al. Land-use decision support in brownfield redevelopment for urban renewal based on crowdsourced data and a presence-and-background learning (PBL) method
CN110889536A (en) Method and system for predicting and early warning situation
CN105469602B (en) A kind of Forecasting Methodology of the bus passenger waiting time scope based on IC-card data
AU2005232219A1 (en) Forecasting based on geospatial modeling
US20110085649A1 (en) Fluctuation Monitoring Method that Based on the Mid-Layer Data
CN105678457A (en) Method for evaluating user behavior on the basis of position mining
CN107992968A (en) Electric energy meter measurement error Forecasting Methodology based on integrated techniques of teime series analysis
CN110889092A (en) Short-time large-scale activity peripheral track station passenger flow volume prediction method based on track transaction data
CN106127333A (en) Movie attendance Forecasting Methodology and system
CN114418175A (en) Personnel management method and device, electronic equipment and storage medium
Willis Estimating the benefits of job creation from local investment subsidies
Dwyer Cost-benefit analysis
Soldatenko et al. Managing climate risks associated with socio-economic development of the Russian Arctic
CN115293465B (en) Crowd density prediction method and system
CN117252305A (en) House risk assessment method, device, equipment and medium
CN116992265A (en) Carbon emission estimation method, apparatus, device, and storage medium
CN112163964B (en) Risk prediction method, risk prediction device, electronic equipment and storage medium
CN108846746A (en) A kind of carbon transaction behavior modeling method of combination discrete statistics and extreme learning machine
CN114493027A (en) Future talent demand prediction method and system based on Markov model
CN112926664A (en) Feature selection and CART forest short-time strong rainfall forecasting method based on evolutionary algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200317