CN109447331B - Mountain fire risk prediction method based on stacking algorithm

Publication number
CN109447331B
Authority
CN
China
Prior art keywords
data
mountain fire
dynamic
static
time
Prior art date
Legal status
Active
Application number
CN201811209152.7A
Other languages
Chinese (zh)
Other versions
CN109447331A (en
Inventor
黄科
Current Assignee
Chengdu Cap Data Service Co.,Ltd.
Original Assignee
Sichuan Jialian Zhonghe Enterprise Management Consultation Co ltd
Priority date
Filing date
Publication date
Application filed by Sichuan Jialian Zhonghe Enterprise Management Consultation Co ltd
Priority to CN201811209152.7A
Publication of CN109447331A
Application granted
Publication of CN109447331B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services
    • G06Q50/265: Personal security, identity or safety
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses a mountain fire risk prediction method based on a stacking algorithm, which improves prediction efficiency and prediction accuracy. The method uses multiple kinds of spatio-temporal data, including combustible factor data, geographic data, meteorological data and historical mountain fire data, to predict the risk of mountain fire occurrence, and designs a processing technique for multi-source, heterogeneous and massive spatio-temporal data that yields a rich feature set for predicting mountain fire occurrence. The method can process massive spatio-temporal data. Its modeling is data-driven, which avoids the complicated Bayesian modeling process and improves the efficiency of spatio-temporal data modeling. The method also treats temporal and spatial features, and dynamic and static features, according to their respective characteristics, and uses stacking to generate second-stage features, which improves the overall performance of mountain fire risk prediction. In experimental verification the AUC reaches 0.85. The method is suitable for popularization and application in the technical field of data processing.

Description

Mountain fire risk prediction method based on stacking algorithm
Technical Field
The invention relates to the technical field of data processing, in particular to a mountain fire risk prediction method based on a stacking algorithm.
Background
Since the 1920s, research on the prediction and early warning of mountain fire risk has never stopped. With the availability of massive spatio-temporal data such as remote sensing and meteorological data, and with great progress in modern information processing and analysis, mountain fire risk prediction has moved from its early reliance on experiments and numerical calculation to the current rapid development of techniques such as data mining and machine learning.
Supervised data mining methods for mountain fire risk prediction are represented by supervised learning techniques such as Bayesian networks, decision trees and SVMs. The main approach is as follows: whether a mountain fire occurs (or the burned area) in a region (or pixel) in the future is used as the mountain fire risk level label, and factors such as meteorological elements (temperature, humidity, rainfall, wind speed and wind direction) and human activity are used as features to build a statistical learning model that predicts mountain fire risk.
The Bayesian network approach uses Bayesian probabilistic modeling to predict whether a mountain fire will occur in a given region (pixel) in the future. It builds a probability model over the factors that influence mountain fire occurrence (weather and so on) and uses it to predict occurrence. Its advantages are that it can describe uncertainty and incorporate human expert knowledge, but it has high time and space complexity when processing massive data, which severely restricts its use for prediction with massive spatio-temporal data such as remote sensing and meteorological data.
In existing research, machine learning models such as decision trees, SVMs and neural networks are generally used for supervised learning with the burned-area grade as the target variable. In application, with burned area as the risk prediction target, the approach can be used to predict the risk of mountain fire spread in the initial stage of a fire, and also to predict whether a mountain fire will occur in the short term under known environmental conditions. Because modeling and prediction are data-driven, a complicated and tedious modeling process is avoided. However, existing research and applications mainly use remote sensing or meteorological data, so the data source is limited, and the spatio-temporal autocorrelation and spatial heterogeneity of spatio-temporal data receive little treatment.
Spatio-temporal data mining is a data mining approach that takes spatio-temporal data as its object of study, and it has become a research and application hotspot as such data have accumulated in recent years. At present, spatio-temporal data mining is used in crime and disease prediction analysis, but no related literature has been published on its use for mountain fire prediction.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a mountain fire risk prediction method based on a stacking algorithm that improves both prediction efficiency and prediction accuracy.
The technical solution adopted by the invention to solve this problem is as follows. The mountain fire risk prediction method based on the stacking algorithm comprises the following steps:
1) establishing a mountain fire risk prediction model by using a stacking algorithm;
the mountain fire risk prediction model comprises a base model and a meta model;
2) acquiring combustible factor data, geographic data, meteorological data and historical mountain fire data for a historical period ending at the current time, according to the requirements of the mountain fire risk prediction task;
3) processing the meteorological data acquired in step 2) with a temporal resolution fusion method to obtain daily meteorological data;
4) fusing and spatially matching the combustible factor data, the geographic data, the historical mountain fire data and the daily meteorological data obtained in step 3) with a spatial data fusion method;
5) dividing the combustible factor data, geographic data, meteorological data and historical mountain fire data obtained in step 4) into dynamic data, static data and time data according to how the features change; the dynamic data are denoted Dynamic_Indexs and the static data are denoted Static_Indexs;
6) extracting the dynamic features Dynamic_Feats from the dynamic data Dynamic_Indexs with a statistical generalization method over a "time + space" window;
7) normalizing the numerical data in all dynamic features Dynamic_Feats and static data Static_Indexs with the min-max method, and encoding the discrete features in the static data Static_Indexs with one-hot encoding;
8) inputting the dynamic features Dynamic_Feats processed in step 7) into the base model to predict the probability pred_i of mountain fire occurrence, then combining pred_i with the static features Static_Indexs processed in step 7) into a new feature set and inputting it into the meta model to obtain the final probability NFire of mountain fire occurrence;
9) generating the target variable:
The target variable uses the occurrence of a mountain fire as the mountain fire risk label; the label of a given pixel on day t is given by the formula
Figure BDA0001831994790000021
where $label_{i,j,t}$ is the target variable label of the pixel at $(i, j)$ on day $t$, $NFire_{i,j,h}$ indicates the occurrence (or probability of occurrence) of a mountain fire at pixel $(i, j)$ on day $h$, $label_{i,j,t} = 0$ means no occurrence, and $label_{i,j,t} = 1$ means occurrence.
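The formula image for the target variable did not survive extraction, so its exact form is not known here. As a minimal sketch only, the following Python assumes that a pixel is labelled 1 on day t when a fire is recorded at that pixel on any day h in a look-ahead window after t. The window length `horizon`, the dictionary layout of the fire records and the function name `make_label` are all assumptions for illustration, not taken from the patent.

```python
from typing import Dict, Tuple

# Hypothetical layout: (column i, row j, day h) -> 1 if a fire was recorded.
FireRecords = Dict[Tuple[int, int, int], int]


def make_label(fire: FireRecords, i: int, j: int, t: int, horizon: int = 1) -> int:
    """Return 1 if pixel (i, j) has a recorded fire on any of days t+1 .. t+horizon.

    The look-ahead window is an assumption; the patent's formula image is missing.
    """
    return int(any(fire.get((i, j, h), 0) for h in range(t + 1, t + horizon + 1)))


# One fire at pixel (3, 5) on day 11, labelled with a 2-day look-ahead.
fire_records: FireRecords = {(3, 5, 11): 1}
labels = [make_label(fire_records, 3, 5, t, horizon=2) for t in (8, 9, 10, 11)]
print(labels)
```

Under this assumed definition, days 9 and 10 are positive examples because the fire on day 11 falls inside their look-ahead windows.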
Further, in step 1), establishing the mountain fire risk prediction model by using a stacking algorithm comprises the following steps:
A. collecting combustible factor data, geographic data, meteorological data and historical mountain fire data for a historical period ending at the current time, according to the requirements of the mountain fire risk prediction task;
B. processing the meteorological data acquired in step A with a temporal resolution fusion method to obtain daily meteorological data;
C. fusing and spatially matching the combustible factor data, the geographic data, the historical mountain fire data and the daily meteorological data obtained in step B with a spatial data fusion method;
D. dividing the combustible factor data, geographic data, meteorological data and historical mountain fire data obtained in step C into dynamic data, static data and time data according to how the features change; the dynamic data are denoted Dynamic_Indexs and the static data are denoted Static_Indexs;
E. extracting the dynamic features Dynamic_Feats from the dynamic data Dynamic_Indexs with a statistical generalization method over a "time + space" window;
F. normalizing the numerical data in all dynamic features Dynamic_Feats and static data Static_Indexs with the min-max method, and encoding the discrete features in the static data Static_Indexs with one-hot encoding;
G. splitting the data obtained in step F at the time point $t_0$ to obtain two data subsets:
dataset0 ($t \le t_0$)
dataset1 ($t_0 < t$);
H. constructing the base model with GBDT, specifically: first, data splitting and feature selection are carried out by extracting the dynamic features Dynamic_Feats of dataset0 and randomly splitting them into N parts {dataset0_i}, i ∈ [1, …, N], where each row of each data subset is the observation record of one pixel on a given day and each column is a dynamic feature index; then the base models are trained by training a GBDT model on each {dataset0_i}, i ∈ [1, …, N], to obtain N different base models {base model_i}, i ∈ [1, …, N];
I. constructing the meta model, specifically: first, the feature data set $dataset1_{new}$ is generated as follows: the dynamic features Dynamic_Feats of dataset1 are extracted and input into the base models {base model_i}, i ∈ [1, …, N], to obtain the prediction probabilities {pred_i}, i ∈ [1, …, N], of the N models; meanwhile, the static data Static_Indexs of dataset1 are extracted; the prediction probabilities {pred_i}, i ∈ [1, …, N], and the static data Static_Indexs extracted from dataset1 are combined to form the new data set $dataset1_{new}$; next, the meta learning model is constructed on dataset1, using logistic regression as the meta learner, and trained on $dataset1_{new}$ to obtain the final prediction model, the meta model.
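Steps G to I can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented implementation: it uses scikit-learn's `GradientBoostingClassifier` as the GBDT and `LogisticRegression` as the meta learner, the data are synthetic, and the feature counts, `t0`, `n_base` and all variable names are made up for the example. The cross-validation and Bayesian hyperparameter search described in the text are omitted.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): 600 pixel-day rows, 4 dynamic and 2 static features.
n_rows, n_base = 600, 3
dynamic = rng.random((n_rows, 4))   # plays the role of Dynamic_Feats
static = rng.random((n_rows, 2))    # plays the role of Static_Indexs
day = np.sort(rng.integers(0, 100, size=n_rows))
label = (dynamic[:, 0] + 0.3 * static[:, 0] + 0.2 * rng.random(n_rows) > 0.8).astype(int)

# Step G: split at time point t0 into dataset0 (t <= t0) and dataset1 (t > t0).
t0 = 60
early, late = day <= t0, day > t0

# Step H: randomly split dataset0 into N parts and train one GBDT per part
# on the dynamic features only.
parts = np.array_split(rng.permutation(np.flatnonzero(early)), n_base)
base_models = [
    GradientBoostingClassifier(n_estimators=30, random_state=k).fit(dynamic[idx], label[idx])
    for k, idx in enumerate(parts)
]

# Step I: base-model probabilities on dataset1 plus its static features
# form dataset1_new, on which the logistic-regression meta model is trained.
preds = np.column_stack([m.predict_proba(dynamic[late])[:, 1] for m in base_models])
dataset1_new = np.hstack([preds, static[late]])
meta_model = LogisticRegression().fit(dataset1_new, label[late])
final_prob = meta_model.predict_proba(dataset1_new)[:, 1]
print(dataset1_new.shape, final_prob[:3])
```

Because the base models never see dataset1 during training, their predictions on it are out-of-sample, which is what keeps the meta model from simply learning the base models' training-set overfit.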
Further, in step A, the combustible factor data include combustible moisture content (FMC), combustible load (FL) and combustible type (FT); the combustible factor data have a spatial resolution of 500 m and a temporal resolution of 1 day. The geographic data include elevation, slope and aspect, with a spatial resolution of 30 m. The meteorological data include temperature, humidity, rainfall, wind speed and wind direction, with a spatial resolution of 500 m and a temporal resolution of 1 hour. The historical mountain fire data include the longitude, latitude and time of historical mountain fire occurrences.
Further, in step B, the temporal resolution fusion method is specifically as follows:
the hourly observations $m_{d,i}$ ($i \in \{0, \dots, 23\}$) of meteorological index $m$ on day $d$ are aggregated by day to generate the corresponding statistical indexes: the daily maximum $max\_m_d$, the daily minimum $min\_m_d$, the daily range $range\_m_d$ and the daily total $total\_m_d$, where
$max\_m_d = \max(m_{d,i}),\ i \in \{0, \dots, 23\}$
$min\_m_d = \min(m_{d,i}),\ i \in \{0, \dots, 23\}$
$range\_m_d = \max(m_{d,i}) - \min(m_{d,i}),\ i \in \{0, \dots, 23\}$
$total\_m_d = \mathrm{sum}(m_{d,i}),\ i \in \{0, \dots, 23\}$
The maximum $max\_m_d$ and minimum $min\_m_d$ are used for temperature, humidity, rainfall and wind speed; the range $range\_m_d$ is used for temperature and humidity; the daily total $total\_m_d$ is used for rainfall.
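The daily aggregation above can be sketched in plain Python as follows. The function names, the `STATS_BY_INDEX` mapping and the output key format are hypothetical; the mapping itself follows the sentence above about which statistic applies to which index. Wind direction is left out because the text does not say how it is aggregated.

```python
from typing import Dict, List


def aggregate_daily(hourly: List[float]) -> Dict[str, float]:
    """Collapse one day's 24 hourly observations m_{d,i} into daily statistics."""
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly observations")
    return {
        "max": max(hourly),
        "min": min(hourly),
        "range": max(hourly) - min(hourly),
        "total": sum(hourly),
    }


# Which statistics are kept for which meteorological index, following the text.
STATS_BY_INDEX = {
    "temperature": ("max", "min", "range"),
    "humidity": ("max", "min", "range"),
    "rainfall": ("max", "min", "total"),
    "wind_speed": ("max", "min"),
}


def daily_features(index: str, hourly: List[float]) -> Dict[str, float]:
    """Return only the daily statistics that apply to the given index."""
    stats = aggregate_daily(hourly)
    return {f"{name}_{index}": stats[name] for name in STATS_BY_INDEX[index]}


# 20 dry hours followed by a short shower.
rain = [0.0] * 20 + [1.0, 2.0, 0.5, 0.0]
rain_daily = daily_features("rainfall", rain)
print(rain_daily)
```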
Further, in step C, the spatial data fusion method uses nearest-neighbor matching or Kriging interpolation.
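As a minimal sketch of the nearest-neighbor option, the following brute-force Python assigns to each point of a target grid the value of the closest source point. It assumes coordinates are already in a common projected system measured in metres; the function name and sample values are hypothetical. For real rasters a spatial index (for example a k-d tree) would replace the linear scan, and Kriging is not shown.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def nearest_match(targets: List[Point], sources: List[Point], values: List[float]) -> List[float]:
    """For each target grid point, take the value of the closest source point."""
    matched = []
    for tx, ty in targets:
        k = min(
            range(len(sources)),
            key=lambda n: math.hypot(sources[n][0] - tx, sources[n][1] - ty),
        )
        matched.append(values[k])
    return matched


# Fine-resolution elevation samples matched onto two points of a coarser grid.
src_points = [(0.0, 0.0), (30.0, 0.0), (480.0, 510.0)]
elevation = [812.0, 815.0, 901.0]
grid_500m = [(0.0, 0.0), (500.0, 500.0)]
matched_elevation = nearest_match(grid_500m, src_points, elevation)
print(matched_elevation)
```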
Further, in step D, the dynamic data include combustible moisture content (FMC), combustible load (FL), temperature, humidity, rainfall, wind speed and wind direction; the static data include combustible type (FT), elevation, slope and aspect; the time data include year, month, quarter, week and day.
Further, in step E, the statistical generalization method over the "time + space" window is specifically as follows:
for the pixel data $DI_{i,j}$ located at column $i$ and row $j$, with spatio-temporal feature value $DI_{i,j,t}$ at time $t$, let $w_s = [W_s/2]$, where $W_s$ is the spatial window size (typically {4, 8, …}), and let $W_t$ be the time window size; then:
mean value feature
Figure BDA0001831994790000051
Maximum value characteristic
Figure BDA0001831994790000052
Minimum feature
Figure BDA0001831994790000053
$W_s = 0$ means that no features of spatially neighboring pixels are extracted, and $W_t = 1$ denotes a time window of 1; with $W_s = 8$ and $W_t \in \{7, 30\}$, the complete set of spatio-temporal dynamic features is formed.
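The three formula images for the mean, maximum and minimum features are not recoverable from the text, so the following Python is only a sketch of one plausible reading. It assumes the window covers the pixels within $w_s = [W_s/2]$ cells of $(i, j)$ in each direction and the $W_t$ most recent days ending at $t$, clipped at the grid and time boundaries. The array layout `cube[day][row][col]` and the function name are assumptions.

```python
from typing import Dict, List


def window_features(
    cube: List[List[List[float]]], i: int, j: int, t: int, W_s: int, W_t: int
) -> Dict[str, float]:
    """Mean, max and min of one dynamic index over a space-time window.

    cube[day][row][col] holds the index values; i is the column and j the row,
    matching the text. The window shape is an assumed reading of the lost formulas.
    """
    w_s = W_s // 2
    n_rows, n_cols = len(cube[0]), len(cube[0][0])
    values = [
        cube[day][r][c]
        for day in range(max(0, t - W_t + 1), t + 1)
        for r in range(max(0, j - w_s), min(n_rows, j + w_s + 1))
        for c in range(max(0, i - w_s), min(n_cols, i + w_s + 1))
    ]
    return {"mean": sum(values) / len(values), "max": max(values), "min": min(values)}


# Toy 3-day, 3x3 grid whose value encodes its position: day*10 + row*3 + col.
cube = [[[d * 10 + r * 3 + c for c in range(3)] for r in range(3)] for d in range(3)]
point_only = window_features(cube, i=1, j=1, t=2, W_s=0, W_t=1)
neighbourhood = window_features(cube, i=1, j=1, t=2, W_s=2, W_t=2)
print(point_only, neighbourhood)
```

With `W_s=0` and `W_t=1` the window collapses to the pixel's own value on day t, which matches the text's statement that no neighborhood features are extracted in that case.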
Further, in step H, the parameters of the GBDT model are determined by cross-validation combined with a Bayesian search algorithm.
Further, in step I, the parameters of the logistic regression model are determined by cross-validation combined with a Bayesian search algorithm.
The beneficial effects of the invention are as follows. The mountain fire risk prediction method based on the stacking algorithm uses multiple kinds of spatio-temporal data, including combustible factor data, geographic data, meteorological data and historical mountain fire data, to predict the risk of mountain fire occurrence, and designs a processing technique for multi-source, heterogeneous and massive spatio-temporal data that yields a rich feature set for predicting mountain fire occurrence. The method can process massive spatio-temporal data. The mountain fire risk prediction model is built with spatio-temporal ensemble learning, so modeling is data-driven, the complicated and tedious Bayesian modeling process is avoided, and the efficiency of spatio-temporal data modeling is improved. The method treats temporal and spatial features, and dynamic and static features, according to their respective characteristics, and uses stacking to generate second-stage features, which improves the overall performance of mountain fire risk prediction. In experimental verification the AUC (Area Under the ROC Curve) reaches 0.85.
Detailed Description
The mountain fire risk prediction method based on the stacking algorithm comprises the following steps:
1) establishing a mountain fire risk prediction model by using a stacking algorithm;
the mountain fire risk prediction model comprises a base model and a meta model;
2) acquiring combustible factor data, geographic data, meteorological data and historical mountain fire data for a historical period ending at the current time, according to the requirements of the mountain fire risk prediction task; the combustible factor data, geographic data and historical mountain fire data are obtained by satellite remote sensing, the meteorological data are obtained from the meteorological department, and the data are acquired automatically from the relevant data channels through HTTP/FTP data acquisition interfaces; the combustible factor data include combustible moisture content (FMC), combustible load (FL) and combustible type (FT), with a spatial resolution of 500 m and a temporal resolution of 1 day; the geographic data include elevation, slope and aspect, with a spatial resolution of 30 m; the meteorological data include temperature, humidity, rainfall, wind speed and wind direction, with a spatial resolution of 500 m and a temporal resolution of 1 hour; the historical mountain fire data include the longitude, latitude and time of historical mountain fire occurrences;
3) processing the meteorological data acquired in step 2) with a temporal resolution fusion method to obtain daily meteorological data; this step is needed because different types of data have different temporal resolutions: the temporal resolution fusion method converts hourly meteorological data into daily meteorological data so that their temporal resolution is consistent with that of the other data; the temporal resolution fusion method is specifically as follows:
the hourly observations $m_{d,i}$ ($i \in \{0, \dots, 23\}$) of meteorological index $m$ on day $d$ are aggregated by day to generate the corresponding statistical indexes: the daily maximum $max\_m_d$, the daily minimum $min\_m_d$, the daily range $range\_m_d$ and the daily total $total\_m_d$, where
$max\_m_d = \max(m_{d,i}),\ i \in \{0, \dots, 23\}$
$min\_m_d = \min(m_{d,i}),\ i \in \{0, \dots, 23\}$
$range\_m_d = \max(m_{d,i}) - \min(m_{d,i}),\ i \in \{0, \dots, 23\}$
$total\_m_d = \mathrm{sum}(m_{d,i}),\ i \in \{0, \dots, 23\}$
The maximum $max\_m_d$ and minimum $min\_m_d$ are used for temperature, humidity, rainfall and wind speed; the range $range\_m_d$ is used for temperature and humidity; the daily total $total\_m_d$ is used for rainfall;
4) fusing and spatially matching the combustible factor data, the geographic data, the historical mountain fire data and the daily meteorological data obtained in step 3) with a spatial data fusion method; this step is needed because different types of data have different spatial resolutions: spatial data fusion resolves differences in spatial resolution between data sources and differences between raster coordinate points; the spatial data fusion method uses nearest-neighbor matching or Kriging interpolation;
5) dividing the combustible factor data, geographic data, meteorological data and historical mountain fire data obtained in step 4) into dynamic data, static data and time data according to how the features change; the dynamic data are denoted Dynamic_Indexs and the static data are denoted Static_Indexs; dynamic data are index data that change over time; they reflect dynamic changes in combustibles and meteorological conditions and are very valuable for predicting the risk of mountain fire occurrence; static data are index data that do not change over time (or change only over a long period); the dynamic data include combustible moisture content (FMC), combustible load (FL), temperature, humidity, rainfall, wind speed and wind direction; the static data include combustible type (FT), elevation, slope and aspect; the time data include year, month, quarter, week and day;
6) extracting the dynamic features Dynamic_Feats from the dynamic data Dynamic_Indexs with a statistical generalization method over a "time + space" window; the statistical generalization method over the "time + space" window is specifically as follows:
for the pixel data $DI_{i,j}$ located at column $i$ and row $j$, with spatio-temporal feature value $DI_{i,j,t}$ at time $t$, let $w_s = [W_s/2]$, where $W_s$ is the spatial window size (typically {4, 8, …}), and let $W_t$ be the time window size; then:
mean value feature
Figure BDA0001831994790000071
Maximum value characteristic
Figure BDA0001831994790000072
Minimum feature
Figure BDA0001831994790000073
$W_s = 0$ means that no features of spatially neighboring pixels are extracted, and $W_t = 1$ denotes a time window of 1; with $W_s = 8$ and $W_t \in \{7, 30\}$, the complete set of spatio-temporal dynamic features is formed;
7) normalizing the numerical data in all dynamic features Dynamic_Feats and static data Static_Indexs with the min-max method, and encoding the discrete features in the static data Static_Indexs with one-hot encoding;
8) inputting the dynamic features Dynamic_Feats processed in step 7) into the base model to predict the probability pred_i of mountain fire occurrence, then combining pred_i with the static features Static_Indexs processed in step 7) into a new feature set and inputting it into the meta model to obtain the final probability NFire of mountain fire occurrence;
9) generating the target variable:
The target variable uses the occurrence of a mountain fire as the mountain fire risk label; the label of a given pixel on day t is given by the formula
Figure BDA0001831994790000074
where $label_{i,j,t}$ is the target variable label of the pixel at $(i, j)$ on day $t$, $NFire_{i,j,h}$ indicates the occurrence (or probability of occurrence) of a mountain fire at pixel $(i, j)$ on day $h$, $label_{i,j,t} = 0$ means no occurrence, and $label_{i,j,t} = 1$ means occurrence.
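Step 7) of the list above, min-max normalization of numerical columns and one-hot encoding of discrete columns, can be sketched in plain Python as follows. The helper names and sample values are hypothetical. In practice the minimum, maximum and category list would normally be fixed on the training data and reused at prediction time, which this short sketch does not show.

```python
from typing import List, Sequence


def min_max(column: Sequence[float]) -> List[float]:
    """Scale a numerical column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def one_hot(column: Sequence[str]) -> List[List[int]]:
    """Encode a discrete column as 0/1 vectors, one position per sorted category."""
    categories = sorted(set(column))
    return [[int(x == c) for c in categories] for x in column]


# A numerical static index and a discrete one (combustible type).
elevation = [200.0, 650.0, 1100.0]
fuel_type = ["grass", "forest", "shrub"]
elevation_scaled = min_max(elevation)
fuel_type_encoded = one_hot(fuel_type)
print(elevation_scaled, fuel_type_encoded)
```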
The mountain fire risk prediction method based on the stacking algorithm uses multiple kinds of spatio-temporal data, including combustible factor data, geographic data, meteorological data and historical mountain fire data, to predict the risk of mountain fire occurrence, and designs a processing technique for multi-source, heterogeneous and massive spatio-temporal data that yields a rich feature set for predicting mountain fire occurrence. The method can process massive spatio-temporal data. The mountain fire risk prediction model is built with spatio-temporal ensemble learning, so modeling is data-driven, the complicated and tedious Bayesian modeling process is avoided, and the efficiency of spatio-temporal data modeling is improved. The method treats temporal and spatial features, and dynamic and static features, according to their respective characteristics, and uses stacking to generate second-stage features, which improves the overall performance of the mountain fire prediction model. In experimental verification the AUC (Area Under the ROC Curve) reaches 0.85.
In the above embodiment, in step 1), establishing the mountain fire risk prediction model by using the stacking algorithm comprises the following steps:
A. collecting combustible factor data, geographic data, meteorological data and historical mountain fire data for a historical period ending at the current time, according to the requirements of the mountain fire risk prediction task; the combustible factor data, geographic data and historical mountain fire data are obtained by satellite remote sensing, the meteorological data are obtained from the meteorological department, and the data are acquired automatically from the relevant data channels through HTTP/FTP data acquisition interfaces; the combustible factor data include combustible moisture content (FMC), combustible load (FL) and combustible type (FT), with a spatial resolution of 500 m and a temporal resolution of 1 day; the geographic data include elevation, slope and aspect, with a spatial resolution of 30 m; the meteorological data include temperature, humidity, rainfall, wind speed and wind direction, with a spatial resolution of 500 m and a temporal resolution of 1 hour; the historical mountain fire data include the longitude, latitude and time of historical mountain fire occurrences;
B. processing the meteorological data acquired in step A with a temporal resolution fusion method to obtain daily meteorological data; this step is needed because different types of data have different temporal resolutions: the temporal resolution fusion method converts hourly meteorological data into daily meteorological data so that their temporal resolution is consistent with that of the other data; the temporal resolution fusion method is specifically as follows:
the hourly observations $m_{d,i}$ ($i \in \{0, \dots, 23\}$) of meteorological index $m$ on day $d$ are aggregated by day to generate the corresponding statistical indexes: the daily maximum $max\_m_d$, the daily minimum $min\_m_d$, the daily range $range\_m_d$ and the daily total $total\_m_d$, where
$max\_m_d = \max(m_{d,i}),\ i \in \{0, \dots, 23\}$
$min\_m_d = \min(m_{d,i}),\ i \in \{0, \dots, 23\}$
$range\_m_d = \max(m_{d,i}) - \min(m_{d,i}),\ i \in \{0, \dots, 23\}$
$total\_m_d = \mathrm{sum}(m_{d,i}),\ i \in \{0, \dots, 23\}$
The maximum $max\_m_d$ and minimum $min\_m_d$ are used for temperature, humidity, rainfall and wind speed; the range $range\_m_d$ is used for temperature and humidity; the daily total $total\_m_d$ is used for rainfall;
C. fusing and spatially matching the combustible factor data, the geographic data, the historical mountain fire data and the daily meteorological data obtained in step B with a spatial data fusion method; this step is needed because different types of data have different spatial resolutions: spatial data fusion resolves differences in spatial resolution between data sources and differences between raster coordinate points; the spatial data fusion method uses nearest-neighbor matching or Kriging interpolation;
D. dividing the combustible factor data, geographic data, meteorological data and historical mountain fire data obtained in step C into dynamic data, static data and time data according to how the features change; the dynamic data are denoted Dynamic_Indexs and the static data are denoted Static_Indexs; dynamic data are index data that change over time; they reflect dynamic changes in combustibles and meteorological conditions and are very valuable for predicting the risk of mountain fire occurrence; static data are index data that do not change over time (or change only over a long period); the dynamic data include combustible moisture content (FMC), combustible load (FL), temperature, humidity, rainfall, wind speed and wind direction; the static data include combustible type (FT), elevation, slope and aspect; the time data include year, month, quarter, week and day;
E. extracting the dynamic features Dynamic_Feats from the dynamic data Dynamic_Indexs with a statistical generalization method over a "time + space" window; the statistical generalization method over the "time + space" window is specifically as follows:
for the pixel data $DI_{i,j}$ located at column $i$ and row $j$, with spatio-temporal feature value $DI_{i,j,t}$ at time $t$, let $w_s = [W_s/2]$, where $W_s$ is the spatial window size (typically {4, 8, …}), and let $W_t$ be the time window size; then:
mean value feature
Figure BDA0001831994790000091
Maximum value characteristic
Figure BDA0001831994790000092
Minimum feature
Figure BDA0001831994790000093
When W issWhen W is equal to 0, the feature of the spatial neighborhood pixel is not extractedt1 denotes a time window of 1, when Ws=8,WtThe element belongs to {7,30}, and all space-time dynamic characteristics are formed;
F. normalizing the numerical data in all dynamic features Dynamic_Feats and static data Static_Indexs by the min-max method, and encoding the discrete features in the static data Static_Indexs by one-hot encoding;
G. dividing the data obtained in step F at the time point t_0 to obtain two data subsets:
dataset0 (t ≤ t_0)
dataset1 (t_0 < t);
H. constructing the base models using GBDT, specifically: first, performing data splitting and feature selection, as follows: extracting the dynamic features Dynamic_Feats of dataset0 and randomly splitting them into N parts {dataset0_i}, i ∈ [1, …, N], where each row of each data subset represents the observation record of one pixel on a given day and each column is a dynamic feature index; then, training the base models, specifically training a GBDT model on each {dataset0_i}, i ∈ [1, …, N], with the parameters of the GBDT model determined by cross-validation combined with a Bayesian search algorithm, to obtain N different base models {base model_i}, i ∈ [1, …, N];
I. constructing the meta model, specifically: first, generating the feature data set dataset1_new as follows: extracting the dynamic features Dynamic_Feats of dataset1 and inputting them into the base models {base model_i}, i ∈ [1, …, N], to obtain the prediction probabilities {pred_i}, i ∈ [1, …, N], of the N models; meanwhile, extracting the static data Static_Indexs of dataset1; the prediction probabilities {pred_i}, i ∈ [1, …, N], and the static data Static_Indexs extracted from dataset1 together form the new data set dataset1_new; next, constructing the meta-learning model on dataset1 with logistic regression as the meta learner: the model is trained on dataset1_new, with the parameters of the logistic regression model determined by cross-validation combined with a Bayesian search algorithm, to obtain the final prediction model meta model.
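Steps G to I can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: it assumes NumPy and scikit-learn are available, uses synthetic stand-in data, fixes the GBDT and logistic regression parameters instead of running the cross-validation and Bayesian search described above, and all variable names (dynamic_feats, static_indexs, build_meta_features and so on) are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical): one row per pixel-day.
n_rows, n_dyn, n_static = 600, 6, 3
t = rng.integers(0, 100, size=n_rows)                 # observation day of each row
dynamic_feats = rng.normal(size=(n_rows, n_dyn))      # stands in for Dynamic_Feats
static_indexs = rng.random(size=(n_rows, n_static))   # stands in for scaled Static_Indexs
noise = rng.normal(scale=0.5, size=n_rows)
y = (dynamic_feats[:, 0] + 0.5 * static_indexs[:, 0] + noise > 0.5).astype(int)

# Step G: split at the time point t0 into dataset0 (t <= t0) and dataset1 (t0 < t).
t0 = 60
mask0 = t <= t0
mask1 = ~mask0

# Step H: randomly split dataset0 into N parts and train one GBDT per part,
# using the dynamic features only. Parameters are fixed here for brevity.
N = 3
shuffled = rng.permutation(np.flatnonzero(mask0))
base_models = []
for part in np.array_split(shuffled, N):
    model = GradientBoostingClassifier(n_estimators=50, random_state=0)
    model.fit(dynamic_feats[part], y[part])
    base_models.append(model)


def build_meta_features(dyn, static):
    """Stack the N base-model probabilities next to the static features."""
    preds = [m.predict_proba(dyn)[:, 1] for m in base_models]
    return np.column_stack(preds + [static])


# Step I: build dataset1_new and train the logistic regression meta learner on it.
dataset1_new = build_meta_features(dynamic_feats[mask1], static_indexs[mask1])
meta_model = LogisticRegression(max_iter=1000)
meta_model.fit(dataset1_new, y[mask1])

# Final fire probability for the same rows (in practice, for new pixel-days).
fire_prob = meta_model.predict_proba(dataset1_new)[:, 1]
```

Training the meta learner only on data later than t_0 means the base-model probabilities it sees come from rows the base models were never fitted on, which is the usual reason for holding out a separate subset in stacking.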
The mountain fire risk prediction modeling method based on the stacking algorithm models the risk of mountain fire occurrence from multiple kinds of spatio-temporal data, such as combustible factor data, geographic data, meteorological data and historical mountain fire data, and provides processing techniques for multi-source, heterogeneous and massive spatio-temporal data that yield a rich feature set for predicting mountain fire occurrence; in terms of data processing capacity, the method is able to process massive spatio-temporal data; the mountain fire risk prediction model is constructed by a spatio-temporal data ensemble learning technique, which achieves data-driven modeling, avoids a cumbersome Bayesian modeling process, and improves the efficiency of spatio-temporal data modeling; the method takes both temporal and spatial features and both dynamic and static features into account, and generates second-level features through the stacking method, thereby improving the overall performance of the mountain fire prediction model; in experimental verification, the AUC (Area Under the ROC Curve) index reached 0.85.
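For readers unfamiliar with the AUC index mentioned above, the following minimal sketch shows one standard way to compute it: as the fraction of (fire, no-fire) sample pairs in which the fire sample receives the higher predicted probability, with ties counted as one half. The function name roc_auc and the example values are illustrative assumptions and are unrelated to the experiment reported in the text.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Illustrative labels (1 = fire occurred) and predicted probabilities.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of samples; it is meant to make the definition explicit, not to be efficient on large pixel-day data sets.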

Claims (7)

1. The mountain fire risk prediction method based on the stacking algorithm is characterized by comprising the following steps of:
1) establishing a mountain fire risk prediction model by using a stacking algorithm;
the mountain fire risk prediction model comprises a base model and a meta model;
the method for establishing the mountain fire risk prediction model by using the stacking algorithm comprises the following steps:
A. collecting combustible factor data, geographic data, meteorological data and historical mountain fire data in a historical time period from the current time according to the requirement of a mountain fire risk prediction task; the combustible factor data includes: combustible water content FMC, combustible load FL and combustible type FT; the spatial resolution of combustible factor data is 500m, and the time resolution is 1 day; the geographic data comprises elevation, gradient and slope direction; the geographic data spatial resolution is 30 m; the meteorological data comprise temperature, humidity, rainfall, wind speed and wind direction, the spatial resolution of the meteorological data is 500m, and the time resolution is 1 hour; the historical mountain fire data comprise longitude and latitude and time information of occurrence of historical mountain fire;
B. processing the meteorological data acquired in step A by a time resolution fusion method to obtain daily meteorological data;
C. fusing and matching the combustible factor data, the geographic data, the historical mountain fire data and the daily meteorological data obtained in step B by a spatial data fusion method;
D. dividing the combustible factor data, geographic data, meteorological data and historical mountain fire data processed in step C into dynamic data, static data and time data according to how the features change over time; the dynamic data are denoted Dynamic_Indexs and the static data are denoted Static_Indexs;
E. extracting dynamic features Dynamic_Feats from the dynamic data Dynamic_Indexs by a statistical generalization method over a "time + space" window;
F. normalizing the numerical data in all dynamic features Dynamic_Feats and static data Static_Indexs by the min-max method, and encoding the discrete features in the static data Static_Indexs by one-hot encoding;
G. dividing the data obtained in step F at the time point t_0 to obtain two data subsets:
dataset0 (t ≤ t_0)
dataset1 (t_0 < t);
H. constructing the base models using GBDT, specifically: first, performing data splitting and feature selection, as follows: extracting the dynamic features Dynamic_Feats of dataset0 and randomly splitting them into N parts {dataset0_i}, i ∈ [1, …, N], where each row of each data subset represents the observation record of one pixel on a given day and each column is a dynamic feature index; then, training the base models, specifically training a GBDT model on each {dataset0_i}, i ∈ [1, …, N], to obtain N different base models {base model_i}, i ∈ [1, …, N];
I. constructing the meta model, specifically: first, generating the feature data set dataset1_new as follows: extracting the dynamic features Dynamic_Feats of dataset1 and inputting them into the base models {base model_i}, i ∈ [1, …, N], to obtain the prediction probabilities {pred_i}, i ∈ [1, …, N], of the N models; meanwhile, extracting the static data Static_Indexs of dataset1; the prediction probabilities {pred_i}, i ∈ [1, …, N], and the static data Static_Indexs extracted from dataset1 together form the new data set dataset1_new; next, constructing the meta-learning model on dataset1 with logistic regression as the meta learner, and training the model on dataset1_new to obtain the final prediction model meta model;
2) acquiring combustible factor data, geographic data, meteorological data and historical mountain fire data in a historical period from the current time according to the requirements of a mountain fire risk prediction task;
3) processing the meteorological data acquired in step 2) by the time resolution fusion method to obtain daily meteorological data;
4) fusing and matching the combustible factor data, the geographic data, the historical mountain fire data and the daily meteorological data obtained in step 3) by the spatial data fusion method;
5) dividing the combustible factor data, geographic data, meteorological data and historical mountain fire data processed in step 4) into dynamic data, static data and time data according to how the features change over time; the dynamic data are denoted Dynamic_Indexs and the static data are denoted Static_Indexs;
6) extracting dynamic features Dynamic_Feats from the dynamic data Dynamic_Indexs by the statistical generalization method over a "time + space" window;
7) normalizing the numerical data in all dynamic features Dynamic_Feats and static data Static_Indexs by the min-max method, and encoding the discrete features in the static data Static_Indexs by one-hot encoding;
8) inputting the dynamic features Dynamic_Feats processed in step 7) into the base models to predict the probabilities pred_i of mountain fire occurrence, then combining pred_i with the static features Static_Indexs processed in step 7) into a new feature set, and inputting the new feature set into the meta model to obtain the final probability NFire of mountain fire occurrence;
9) generating the target variable
The target variable uses the occurrence of mountain fire as the mountain fire risk label; the label of a pixel on day t is given by the formula
Figure FDA0002599858050000031
where label_{i,j,t} is the target variable label of pixel (i, j) on day t, NFire_{i,j,h} represents the probability of mountain fire occurring at pixel (i, j) on day h, label_{i,j,t} = 0 indicates no occurrence, and label_{i,j,t} = 1 indicates occurrence.
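The min-max normalization and one-hot encoding named in step F and step 7) of claim 1 can be illustrated with a short, dependency-free Python sketch. The helper names (min_max_scale, one_hot), the handling of a constant column, and the example values are assumptions made for illustration; the claim itself does not prescribe them.

```python
def min_max_scale(values):
    """Rescale numeric values linearly to [0, 1]: (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: map everything to 0.0 (an assumed convention).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(categories):
    """Encode discrete values as 0/1 indicator rows, one column per level."""
    levels = sorted(set(categories))
    index = {level: k for k, level in enumerate(levels)}
    rows = []
    for c in categories:
        row = [0] * len(levels)
        row[index[c]] = 1
        rows.append(row)
    return levels, rows


# Hypothetical static data for three pixels.
elevation = [350.0, 1200.0, 775.0]        # numeric static index
fuel_type = ["grass", "shrub", "grass"]   # discrete static index (combustible type FT)

scaled_elevation = min_max_scale(elevation)
ft_levels, ft_encoded = one_hot(fuel_type)
```

In a real pipeline the minimum, maximum and category levels would normally be learned on the training data and reused unchanged at prediction time, so that steps F and 7) apply the same transformation.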
2. The mountain fire risk prediction method based on the stacking algorithm as claimed in claim 1, wherein: in step B, the temporal resolution fusion method is specifically as follows:
the hourly observations m_{d,i} (i ∈ 0, …, 23) of meteorological index m on day d are aggregated by day to generate the corresponding statistical indices, comprising: the daily maximum max_m_d, the daily minimum min_m_d, the daily range range_m_d and the daily total total_m_d,
where max_m_d = max(m_{d,i}) (i ∈ 0, …, 23)
min_m_d = min(m_{d,i}) (i ∈ 0, …, 23)
range_m_d = max(m_{d,i}) - min(m_{d,i}) (i ∈ 0, …, 23)
total_m_d = sum(m_{d,i}) (i ∈ 0, …, 23)
the maximum max_m_d and the minimum min_m_d are used for temperature, humidity, rainfall and wind speed; the range range_m_d is used for temperature and humidity; and the daily total total_m_d is used for rainfall.
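The daily aggregation of claim 2 can be sketched in plain Python as below. The function and dictionary names are hypothetical, the example readings are invented, and wind direction is left out because the claim does not state how it is aggregated.

```python
def aggregate_day(hourly):
    """Aggregate 24 hourly readings m_{d,i} (i = 0..23) into daily statistics."""
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly observations")
    hi, lo = max(hourly), min(hourly)
    return {"max": hi, "min": lo, "range": hi - lo, "total": sum(hourly)}


# Which statistics the claim assigns to each meteorological index.
USED_STATS = {
    "temperature": ("max", "min", "range"),
    "humidity": ("max", "min", "range"),
    "rainfall": ("max", "min", "total"),
    "wind_speed": ("max", "min"),
}


def daily_features(hourly_by_index):
    """Build the daily feature dict, e.g. 'max_temperature', 'total_rainfall'."""
    features = {}
    for name, hourly in hourly_by_index.items():
        stats = aggregate_day(hourly)
        for stat in USED_STATS[name]:
            features[f"{stat}_{name}"] = stats[stat]
    return features


# Invented readings for one pixel on one day.
day = {
    "temperature": [10.0] * 6 + [20.0] * 12 + [14.0] * 6,
    "rainfall": [0.0] * 22 + [1.5, 2.5],
}
day_features = daily_features(day)
```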
3. The mountain fire risk prediction method based on the stacking algorithm as claimed in claim 2, wherein: in step C, the spatial data fusion method adopts a neighbor matching method or a Kriging interpolation method.
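As a rough illustration of the neighbor matching option in claim 3, the sketch below assigns to each target pixel the value of the closest source point. It assumes that both layers are expressed in the same projected coordinate system in metres, and it reads "neighbor matching" as nearest-neighbor matching; the function name and the sample coordinates are hypothetical, and Kriging interpolation is not shown.

```python
import math


def nearest_neighbor_match(targets, sources):
    """Return, for each target (x, y), the value of the nearest source (x, y, value)."""
    matched = []
    for tx, ty in targets:
        nearest = min(sources, key=lambda s: math.hypot(s[0] - tx, s[1] - ty))
        matched.append(nearest[2])
    return matched


# Hypothetical 500 m grid centres carrying a combustible water content value.
fmc_sources = [(0.0, 0.0, 0.30), (500.0, 0.0, 0.10), (0.0, 500.0, 0.20)]
# Hypothetical pixel centres of another layer that need an FMC value.
target_pixels = [(100.0, 50.0), (400.0, 20.0), (30.0, 480.0)]

matched_fmc = nearest_neighbor_match(target_pixels, fmc_sources)
```

The brute-force search is quadratic in the number of points; on large rasters a spatial index or direct grid arithmetic would normally replace it.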
4. The mountain fire risk prediction method based on the stacking algorithm as claimed in claim 3, wherein: in step D, the dynamic data comprise combustible water content FMC, combustible load FL, temperature, humidity, rainfall, wind speed and wind direction; the static data comprise combustible type FT, elevation, gradient and slope direction; the time data comprise year, month, quarter, week and day.
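The three-way division in claim 4 can be pictured with the following sketch, which splits one fused pixel-day record into dynamic, static and time data. The field names are hypothetical, and two readings are assumptions: "week" is taken as the ISO week number and "day" as the day of the month, since the claim does not define them further.

```python
from datetime import date

# Field names are hypothetical labels for the indices listed in the claim.
DYNAMIC_INDEXS = ("FMC", "FL", "temperature", "humidity",
                  "rainfall", "wind_speed", "wind_direction")
STATIC_INDEXS = ("FT", "elevation", "gradient", "slope_direction")


def time_data(d):
    """Derive year, month, quarter, week and day from the observation date."""
    return {
        "year": d.year,
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,
        "week": d.isocalendar()[1],  # ISO week number (assumed meaning of "week")
        "day": d.day,                # day of month (assumed meaning of "day")
    }


def split_record(record, d):
    """Split one fused record into (dynamic, static, time) dictionaries."""
    dynamic = {k: record[k] for k in DYNAMIC_INDEXS}
    static = {k: record[k] for k in STATIC_INDEXS}
    return dynamic, static, time_data(d)


# Invented example record.
record = {
    "FMC": 0.12, "FL": 3.4, "temperature": 21.5, "humidity": 0.35,
    "rainfall": 0.0, "wind_speed": 4.2, "wind_direction": 270.0,
    "FT": "shrub", "elevation": 820.0, "gradient": 14.0, "slope_direction": 180.0,
}
dynamic, static, when = split_record(record, date(2018, 10, 17))
```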
5. The mountain fire risk prediction method based on the stacking algorithm as claimed in claim 4, wherein: in step E, the statistical generalization method using the "time + space" window is specifically as follows:
for the pixel data DI_{i,j} located at column i and row j, let DI_{i,j,t} be its spatio-temporal feature value at time t; let w_s = [W_s/2] represent the size of the spatial window, where W_s is set to {4, 8, …}, and let W_t represent the time window size; then:
Mean feature
Figure FDA0002599858050000041
Maximum feature
Figure FDA0002599858050000042
Minimum feature
Figure FDA0002599858050000043
When W_s = 0, the features of spatially neighboring pixels are not extracted; W_t = 1 denotes a time window of 1; when W_s = 8 and W_t ∈ {7, 30}, the full set of spatio-temporal dynamic features is formed.
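The window statistics of claim 5 can be sketched as follows. Because the formulas themselves are only available as figures, the window bounds used here are assumptions: the spatial window is taken as the square of pixels within w_s = floor(W_s / 2) rows and columns of pixel (i, j), clipped at the raster edge, and the time window as the W_t days ending at day t. The function name and the toy data cube are hypothetical.

```python
def window_stats(cube, i, j, t, W_s, W_t):
    """Mean, max and min of one dynamic index over a "time + space" window.

    cube[day][row][col] holds the index value; the pixel of interest sits at
    column i, row j. Window bounds are assumptions, see the lead-in text.
    """
    w_s = W_s // 2
    n_rows, n_cols = len(cube[0]), len(cube[0][0])
    values = [
        cube[day][row][col]
        for day in range(max(0, t - W_t + 1), t + 1)
        for row in range(max(0, j - w_s), min(n_rows, j + w_s + 1))
        for col in range(max(0, i - w_s), min(n_cols, i + w_s + 1))
    ]
    return {
        "mean": sum(values) / len(values),
        "max": max(values),
        "min": min(values),
    }


# Toy cube: 3 days of a 3 x 3 raster, value = 10 * day + 3 * row + col.
cube = [[[10 * d + 3 * r + c for c in range(3)] for r in range(3)] for d in range(3)]

full = window_stats(cube, i=1, j=1, t=2, W_s=2, W_t=2)       # 3 x 3 pixels, 2 days
pixel_only = window_stats(cube, i=1, j=1, t=2, W_s=0, W_t=1)  # no neighbors, 1 day
```

With W_s = 0 and W_t = 1 the window collapses to the pixel's own value on day t, which matches the statement that no neighborhood features are extracted in that case.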
6. The mountain fire risk prediction method based on the stacking algorithm as claimed in claim 5, wherein: in step H, the parameters of the GBDT model are determined by cross-validation combined with a Bayesian search algorithm.
7. The mountain fire risk prediction method based on the stacking algorithm as claimed in claim 6, wherein: in step I, the parameters of the logistic regression model are determined by cross-validation combined with a Bayesian search algorithm.
CN201811209152.7A 2018-10-17 2018-10-17 Mountain fire risk prediction method based on stacking algorithm Active CN109447331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811209152.7A CN109447331B (en) 2018-10-17 2018-10-17 Mountain fire risk prediction method based on stacking algorithm

Publications (2)

Publication Number Publication Date
CN109447331A CN109447331A (en) 2019-03-08
CN109447331B true CN109447331B (en) 2020-10-27

Family

ID=65547092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811209152.7A Active CN109447331B (en) 2018-10-17 2018-10-17 Mountain fire risk prediction method based on stacking algorithm

Country Status (1)

Country Link
CN (1) CN109447331B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110889536A (en) * 2019-10-29 2020-03-17 新智认知数字科技股份有限公司 Method and system for predicting and early warning situation
CN111445011B (en) * 2020-04-01 2023-07-28 成都思晗科技股份有限公司 Mountain fire early warning method based on meteorological and remote sensing data
CN111931648B (en) * 2020-08-10 2023-08-01 成都思晗科技股份有限公司 Mountain fire real-time monitoring method based on Himaware 8-band data
CN112819356B (en) * 2021-02-08 2022-10-14 国网山西省电力公司电力科学研究院 Power transmission line forest fire risk grade forecasting method based on gradient lifting tree

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105678419A (en) * 2016-01-05 2016-06-15 天津大学 Fine grit-based forest fire hazard probability forecasting system
US20180096253A1 (en) * 2016-10-04 2018-04-05 Civicscape, LLC Rare event forecasting system and method

Also Published As

Publication number Publication date
CN109447331A (en) 2019-03-08

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210202

Address after: No.1, 3rd floor, building 1, No.366, north section of Hupan Road, Tianfu New District, Chengdu, Sichuan 610041, China (Sichuan) pilot Free Trade Zone, Chengdu, China

Patentee after: Chengdu Cap Data Service Co.,Ltd.

Address before: Room 503, 5th floor, unit 1, building 12, 333 Taihe 2nd Street, high tech Zone, Chengdu, Sichuan 610041

Patentee before: SICHUAN JIALIAN ZHONGHE ENTERPRISE MANAGEMENT CONSULTATION Co.,Ltd.
