CN108038081A - Landslide hazard logistic regression analysis method based on eigenfunction spatial filtering - Google Patents

Publication number
CN108038081A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711425595.5A
Other languages
Chinese (zh)
Other versions
CN108038081B (en)
Inventor
陈玉敏
李慧芳
周江
杨家鑫
张静祎
陈娒杰
方涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Application filed by Wuhan University WHU
Priority to CN201711425595.5A
Publication of CN108038081A
Application granted
Publication of CN108038081B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/15: Correlation function computation including computation of convolution operations
    • G06F17/16: Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Abstract

The present invention proposes a logistic regression method for landslide hazard analysis based on spatial filtering. For landslide hazard analysis, the idea of spatial filtering is introduced into the ordinary logistic regression model, and a landslide regression analysis algorithm is designed that comprises selection of non-landslide points, acquisition and classification of causative factor values, construction of the adjacency matrix, computation of eigenvalues and eigenvectors, stepwise-regression selection of eigenvectors, and regression modeling. The invention addresses the low model accuracy that arises when a logistic regression model is affected by spatial autocorrelation among variables. A filter operator built from the selected eigenvectors is added to the logistic regression model, which effectively filters out residual autocorrelation, improves the goodness of fit and prediction accuracy of the regression model, and enables accurate simulation and prediction of landslide hazards.

Description

Landslide hazard logistic regression analysis method based on eigenfunction spatial filtering
Technical field
The invention belongs to the field of geostatistics and spatial analysis, and in particular relates to a landslide hazard logistic regression analysis method based on eigenfunction spatial filtering.
Background
Landslides are among the most common geological hazards. Landslide hazard analysis falls into two broad categories, qualitative and quantitative (Yalcin et al., 2011, see background document 1). Qualitative analysis relies on researchers in the relevant fields, who analyze and assess landslide hazards using their professional knowledge after an in-depth study and survey of the research area. It is mostly applied to small areas or specific events, and its main methods are expert experience, the weighted linear combination method, and the analytic hierarchy process. Quantitative landslide analysis rests on a more complete theoretical basis and studies landslide hazards from a macroscopic perspective; it mainly comprises deterministic methods, artificial intelligence methods, and multivariate statistical analysis.
The logistic regression model is one of the more widely used methods for quantitative landslide hazard analysis. As a generalized linear statistical method, it takes the probability that an event occurs as the dependent variable and the factors influencing the event as independent variables to build a regression model, and it suits both binary and multi-class variables. In landslide studies the dependent variable is categorical, namely binary data indicating whether or not a landslide occurred, which an ordinary linear regression model cannot analyze. The advantage of the logistic regression model is that it transforms the binary variable into a logit variable that can be modeled by regression on several independent variables (Bewick et al., 2005, see background document 2), which makes it applicable to landslide hazard analysis. Many scholars (Bai et al., 2010, see background document 3; Das et al., 2010, see background document 4; Mousavi et al., 2011, see background document 5; Budimir et al., 2014, see background document 6) have built GIS-based logistic regression models to study landslide susceptibility in different regions and have evaluated those models, demonstrating the applicability of logistic regression to landslide analysis. In addition, comparative studies between logistic regression and models such as neural networks, frequency ratio, decision trees, weights of evidence, and information value (Yesilnacar et al., 2005, see background document 7; Wang et al., 2016, see background document 8; Chen et al., 2016, see background document 9) show that, relative to other methods, logistic regression performs well in applicability, model accuracy, and evaluation results.
However, the First Law of Geography states that geographic objects or attributes are related to one another in their spatial distribution, showing clustered, random, or regular patterns, and that the closer they are, the stronger the correlation (Miller et al., 2004, see background document 10). In the traditional logistic regression model, spatial correlation among variables propagates through the errors into the model residuals, and the Moran's I value of the residuals is usually used as the measure (Waldhör, 1996, see background document 11). This often leads to misjudgment of the model and reduces its accuracy. To solve this problem, the influence of spatial autocorrelation must be eliminated.
The main methods for eliminating the influence of spatial autocorrelation are geographically weighted regression and spatial filtering. Spatial filtering was first proposed by Getis (Getis, 1995, see background document 12) and Griffith (Griffith, 2000, see background document 13). Its core idea is to decompose each variable in the model into a spatial component and a non-spatial component; once the spatial component has been extracted and filtered out, conventional regression methods can be used for the analysis. The spatial filtering method proposed by Getis uses the local Gi statistic to transform the independent variables by formula, eliminating the spatial component in the residuals. This method does filter out spatial autocorrelation, but it requires the variables to be positive and is therefore unsuitable for ratio or percentage variables.
The eigenfunction spatial filtering method proposed by Griffith adds selected eigenvectors to the independent variables to build a filter operator that replaces the autocorrelated part of the model residuals, so that the remaining residuals are affected only by random error, thereby eliminating the influence of spatial autocorrelation (Getis and Griffith, 2002, see background document 14). The filter operator stands in for the autocorrelated part of the residuals, so it must contain the spatial relationships among geographic units. A spatial weight matrix, by encoding the binary relationships between geographic units, effectively expresses their spatial correlation, so the filter operator can be built from the spatial weight matrix. Adding a filter operator built from the selected eigenvectors to a linear regression model effectively reduces the model misspecification caused by spatial autocorrelation of the residuals. Patuelli (Patuelli et al., 2011, see background document 15) used spatial filtering to study unemployment in Germany and found that adding the spatial filter improved the regression model's prediction accuracy for unemployment, demonstrating the effectiveness of the method empirically.
Murakami and Griffith (Murakami et al., 2015, see background document 16) analyzed the problem that random effects eigenvector spatial filtering (ESF) cannot handle spatial autocorrelation effectively once spatial confounding is taken into account. Their article details two main shortcomings of random effects ESF and proposes adding a maximum likelihood estimate of the residuals to an extended model to solve the problem, further widening the scope of application of spatial filtering. Chun (Chun et al., 2016, see background document 17) pointed out that although spatial filtering effectively solves the spatial autocorrelation problem and suits different research fields, the algorithm itself is complex and its computational efficiency needs improvement, and therefore proposed a faster and more effective way to generate the eigenvector subset, greatly improving the efficiency of spatial filtering. Spatial filtering is thus becoming increasingly mature.
Background documents:
[1] Yalcin A, Reis S, Aydinoglu A C, et al. A GIS-based comparative study of frequency ratio, analytical hierarchy process, bivariate statistics and logistics regression methods for landslide susceptibility mapping in Trabzon, NE Turkey[J]. Catena, 2011, 85(3): 274-287.
[2] Bewick V, Cheek L, Ball J. Statistics review 14: Logistic regression[J]. Critical Care, 2005, 9(1): 112.
[3] Bai S B, Jian W, Zhou P G, et al. GIS-based logistic regression for landslide susceptibility mapping of the Zhongxian segment in the Three Gorges area, China[J]. Geomorphology, 2010, 115(1-2): 23-31.
[4] Das I, Sahoo S, Westen C V, et al. Landslide susceptibility assessment using logistic regression and its comparison with a rock mass classification system, along a road section in the northern Himalayas (India)[J]. Geomorphology, 2010, 114(4): 627-637.
[5] Seyedeh Zohreh Mousavi, Ataollah Kavian, Karim Soleimani, et al. GIS-based spatial prediction of landslide susceptibility using logistic regression model[J]. Geomatics Natural Hazards & Risk, 2011, 2(1): 33-50.
[6] Budimir M E A, Atkinson P M, Lewis H G. A systematic review of landslide probability mapping using logistic regression[J]. Landslides, 2015, 12(3): 419-436.
[7] Yesilnacar E, Topal T. Landslide susceptibility mapping: A comparison of logistic regression and neural networks methods in a medium scale study, Hendek region (Turkey)[J]. Engineering Geology, 2005, 79(3-4): 251-266.
[8] Wang L J, Guo M, Sawada K, et al. A comparative study of landslide susceptibility maps using logistic regression, frequency ratio, decision tree, weights of evidence and artificial neural network[J]. Geosciences Journal, 2016, 20(1): 117-136.
[9] Chen T, Niu R, Jia X. A comparison of information value and logistic regression models in landslide susceptibility mapping by using GIS[J]. Environmental Earth Sciences, 2016, 75(10): 1-16.
[10] Miller H J. Tobler's First Law and Spatial Analysis[J]. Annals of the Association of American Geographers, 2004, 94(2): 284-289.
[11] Waldhör T. The spatial autocorrelation coefficient Moran's I under heteroscedasticity[J]. Statistics in Medicine, 1996, 15(7-9): 887.
[12] Getis A. Spatial Filtering in a Regression Framework: Examples Using Data on Urban Crime, Regional Inequality, and Government Expenditures[M]//New Directions in Spatial Econometrics. 1995: 172-185.
[13] Griffith D A. A linear regression solution to the spatial autocorrelation problem[J]. Journal of Geographical Systems, 2000, 2(2): 141-156.
[14] Getis A, Griffith D A. Comparative Spatial Filtering in Regression Analysis[J]. Geographical Analysis, 2002, 34(2): 130-140.
[15] Patuelli R, Griffith D A, Tiefelsdorf M, et al. Spatial Filtering Methods For Tracing Space-Time Developments In An Open Regional System: Experiments with German Unemployment Data[M]//Societies in Motion: Innovation, Migration and Regional Transformation. 2012.
[16] Murakami D, Griffith D A. Random effects specifications in eigenvector spatial filtering: a simulation study[J]. Journal of Geographical Systems, 2015, 17(4): 1-21.
[17] Chun Y, Griffith D A, Lee M, et al. Eigenvector selection with stepwise regression techniques to construct eigenvector spatial filters[J]. Journal of Geographical Systems, 2016, 18(1): 67-85.
Summary of the invention
To solve the problem that, when a logistic regression model is applied to landslide hazard analysis, spatial autocorrelation among variables lowers model accuracy, the present invention provides a landslide hazard logistic regression analysis method based on eigenfunction spatial filtering.
The technical solution adopted by the invention is a landslide hazard logistic regression analysis method based on eigenfunction spatial filtering, comprising the following steps:
Step 1: select and process the landslide sample data, which comprise landslide point samples and non-landslide point samples. This includes obtaining the landslide point samples with their corresponding spatial location attributes and landslide area attributes, and selecting the same number of non-landslide point samples as landslide point samples;
Step 2: for the landslide sample data obtained in step 1, acquire and classify the corresponding causative factor values;
Step 3: for the landslide sample points obtained in step 1, determine the spatial adjacency between sample points by constructing Thiessen polygons, obtain the corresponding spatial adjacency matrix W, and center W to obtain the matrix C;
Step 4: compute the eigenvalues and eigenvectors of the matrix C obtained in step 3;
Step 5: for logistic regression, select eigenvectors by stepwise regression from the eigenvectors and eigenvalues obtained in step 4, as follows:
Step 5.1: preliminary screening of eigenvectors. Compute the Moran's I value of each eigenvector from its corresponding eigenvalue, and select the eigenvectors whose Moran's I value exceeds a preset threshold as the candidate eigenvector set E_n for subsequent selection;
Step 5.2: for the logistic regression model, add each of the n candidate eigenvectors in the set E_n obtained in step 5.1 separately to the regression model without a filter operator, giving n new regression models; compute the likelihood ratio test statistic LRT between each new model and the old model; add the eigenvector with the largest LRT statistic to the regression model and remove it from E_n;
Step 5.3: test the significance of the selected eigenvector; if the result is not significant, discard the eigenvector and return to step 5.2; if the result is significant, perform step 5.4;
Step 5.4: test the significance of the residual spatial autocorrelation of the new model containing the added eigenvector; if the result is significant, return to steps 5.2 and 5.3; if the result is not significant, eigenvector selection ends;
Step 6: add the eigenvectors selected in step 5 to the logistic regression model as independent variables, building the landslide hazard logistic regression model based on eigenfunction spatial filtering.
Moreover, in step 1, a buffer is drawn with each landslide point as the center and the landslide influence distance as the radius, giving the landslide influence area. The whole landslide study region minus the landslide influence area is the selection area for non-landslide points, in which non-landslide point samples equal in number to the landslide point samples are selected at random.
Moreover, four indicators are chosen to evaluate the landslide hazard logistic regression model based on eigenfunction spatial filtering: the residual Moran's I, Prob > chi2, Pseudo R2, and the AUC of the ROC curve.
The invention targets landslide hazard analysis: the idea of spatial filtering is introduced into the ordinary logistic regression model, and a logistic regression landslide hazard analysis method based on eigenfunction spatial filtering is designed. The invention addresses the low accuracy of logistic regression models caused by residual autocorrelation. Filtering out the residual autocorrelation effectively improves the goodness of fit and prediction accuracy of the regression model and enables accurate simulation and prediction of landslide hazards.
Brief description of the drawings
Fig. 1 is a flow chart of an embodiment of the invention.
Fig. 2 is a sub-flow chart of step 1 of the embodiment of the invention.
Fig. 3 is a sub-flow chart of step 2 of the embodiment of the invention.
Fig. 4 is a sub-flow chart of step 5 of the embodiment of the invention.
Embodiments
To help those of ordinary skill in the art understand and implement the invention, it is described in further detail below with reference to the drawings and embodiments. The embodiments described here serve only to illustrate and explain the invention and are not intended to limit it.
The core problem the invention solves is this: when landslide hazard analysis is performed with a logistic regression model, eigenfunction spatial filtering is used to eliminate the effect of spatial autocorrelation among variables on model accuracy and goodness of fit.
Referring to Fig. 1, the following steps are performed:
Step 1: select and process the landslide sample data (the union of landslide point samples and non-landslide point samples). This includes obtaining, according to actual conditions, the landslide point samples with their corresponding spatial location attributes and landslide area attributes, and selecting the same number of non-landslide point samples as landslide point samples;
Referring to Fig. 2, in a specific implementation, non-landslide point samples equal in number to the landslide point samples are selected. Their selection should follow two principles: first, a non-landslide point should lie at a certain distance from any area where a landslide has already occurred; second, the non-landslide points should be distributed as evenly as possible, to avoid model error caused by clustering. The specific procedure is as follows. First, the landslide influence area is computed from the existing landslide point samples and their landslide areas: a buffer is drawn with each landslide point as the center and the corresponding landslide influence distance as the radius, where the influence distance is computed from the landslide area by the formula:
[Formula not reproduced in the source text; it gives R in terms of A and ρ.]
where R is the landslide influence distance, A is the landslide area, and ρ is a proportionality constant that can be set according to the specific situation. The whole landslide study region minus the landslide influence area is the selection area for non-landslide point samples, and an equal number of non-landslide point samples is generated at random within it. The resulting non-landslide point samples and the known landslide point samples together form the landslide sample data.
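The selection of non-landslide points described above can be sketched as simple rejection sampling. This is a minimal illustration under stated assumptions, not the patented implementation: the study region is assumed to be a rectangle, the influence distances R are passed in directly because the formula for R is not reproduced in the text, and the function name `sample_non_landslide_points` is hypothetical.

```python
import math
import random

def sample_non_landslide_points(landslides, radii, bounds, seed=0, max_tries=100000):
    """Draw len(landslides) random points lying outside every landslide buffer.

    landslides: list of (x, y) landslide point coordinates
    radii:      influence distance R for each landslide point (assumed given)
    bounds:     (xmin, ymin, xmax, ymax), a rectangular study region (assumption)
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    picked = []
    tries = 0
    while len(picked) < len(landslides):
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place enough points outside the buffers")
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        # Keep the candidate only if it falls outside all circular buffers.
        if all(math.hypot(x - lx, y - ly) > r
               for (lx, ly), r in zip(landslides, radii)):
            picked.append((x, y))
    return picked
```

Uniform random draws only approximate the "evenly distributed" principle; a stratified or grid-based draw could be substituted if clustering of the random points is a concern.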
Step 2: for the landslide sample data (point data) obtained in step 1, acquire and classify the corresponding causative factor values;
Referring to Fig. 3, in a specific implementation, the factors that most strongly induce landslides in the study region are chosen as the landslide causative factors. The data sources for causative factor values are mainly raster data and vector data, and vector data in turn include line data and polygon data. Acquiring the causative factor values for the landslide sample data therefore mainly involves extracting raster values at point features, computing the distance from a point feature to a line feature, and determining the positional relationship between a point and a polygon. For example, the elevation of each landslide sample is obtained from a digital elevation model (DEM).
Raster value extraction at a point feature converts the geographic coordinates of the point back into the row and column coordinates of the causative factor raster image; reading the corresponding raster value then gives the factor value for that landslide sample. The main methods for converting between a georeferenced coordinate system and raster positions are GCP (multiple ground control point positioning) and affine transformation.
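The affine-transformation route can be illustrated with a short sketch. It assumes a six-parameter transform (a, b, c, d, e, f) with x = a + col·b + row·c and y = d + col·e + row·f, and a raster held as a list of rows; the function names and the parameter layout are illustrative assumptions, not part of the patent.

```python
import math

def world_to_pixel(x, y, transform):
    """Invert x = a + col*b + row*c, y = d + col*e + row*f to get (row, col)."""
    a, b, c, d, e, f = transform
    det = b * f - c * e
    if det == 0:
        raise ValueError("affine transform is not invertible")
    dx, dy = x - a, y - d
    col = (f * dx - c * dy) / det
    row = (-e * dx + b * dy) / det
    return int(math.floor(row)), int(math.floor(col))

def sample_raster(grid, x, y, transform):
    """Raster value under the point (x, y), or None if the point is off the grid."""
    row, col = world_to_pixel(x, y, transform)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None
```

For a north-up raster with 10 m cells whose upper-left corner is at (100, 200), the transform would be `(100.0, 10.0, 0.0, 200.0, 0.0, -10.0)`.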
The distance from a point feature to a line feature can be reduced to distances from a point to line segments: compute in turn the distance from the point feature to each segment making up the line feature, and take the minimum as the point-to-line distance. There are three specific algorithms for the point-to-segment distance: the classical geometric analysis algorithm, the area algorithm, and the vector algorithm.
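Of the three algorithms named, the vector algorithm is the most compact to write down. The sketch below projects the point onto each segment, clamps the projection to the segment, and takes the minimum over all segments; planar coordinates are assumed and the function names are illustrative.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to segment ab using vector projection."""
    px, py = p
    ax, ay = a
    bx, by = b
    abx, aby = bx - ax, by - ay
    seg_len2 = abx * abx + aby * aby
    if seg_len2 == 0.0:                      # degenerate segment: a == b
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the line, clamped to [0, 1].
    t = ((px - ax) * abx + (py - ay) * aby) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * abx), py - (ay + t * aby))

def point_polyline_distance(p, vertices):
    """Minimum distance from p to the segments of a polyline."""
    return min(point_segment_distance(p, vertices[i], vertices[i + 1])
               for i in range(len(vertices) - 1))
```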
Determining the relative position of a point and a polygon identifies the polygon to which a landslide point belongs; reading that polygon's attribute value then gives the corresponding causative factor value. The main algorithms for testing whether a point lies inside a polygon are the area-sum test, the angle-sum test, and the ray casting method.
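A minimal sketch of the ray casting test follows, together with an attribute lookup. Polygons are assumed to be simple rings given as vertex lists, points exactly on a boundary are not treated specially, and the names `point_in_polygon` and `factor_value_at` are hypothetical.

```python
def point_in_polygon(p, polygon):
    """Ray casting: count crossings of a ray running from p toward +x."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):             # edge straddles the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def factor_value_at(p, polygons):
    """Attribute of the first (ring, value) polygon containing p, else None."""
    for ring, value in polygons:
        if point_in_polygon(p, ring):
            return value
    return None
```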
The causative factor values of the landslide sample data are classified according to data type. Quantitative data are classified mainly by the natural breaks method, which maximizes the differences between classes; non-quantitative data are classified mainly with reference to earlier qualitative studies, according to the magnitude of their influence on landslide hazards.
Step 3: for the landslide sample data obtained in step 1, determine the spatial adjacency between sample points by constructing Thiessen polygons, obtain the corresponding spatial adjacency matrix W, and center W to obtain the matrix C;
In a specific implementation, Thiessen polygons are built from the sample points, each polygon corresponding to one discrete landslide sample point, so that judging adjacency between points is converted into judging adjacency between polygons. If two spatial units are adjacent, the weight between them is 1, otherwise 0, which finally gives an n×n matrix, the spatial adjacency matrix W. The adjacency matrix W built this way is symmetric about its diagonal. If its eigenvectors were used directly, they would not be guaranteed to be mutually uncorrelated or uncorrelated with the model intercept, which could cause multicollinearity and model misspecification, so W must be centered. The formula is:
$$C = \left(I - \frac{\mathbf{1}\mathbf{1}^{T}}{n}\right) W \left(I - \frac{\mathbf{1}\mathbf{1}^{T}}{n}\right)$$
where C is the centered matrix, I is the identity matrix, $\mathbf{1}\mathbf{1}^{T}$ is the matrix whose elements are all 1, and n is the number of rows and columns of the adjacency matrix (the two are equal).
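The centering operation can be sketched with NumPy as follows. The neighbor pairs are a made-up four-unit example standing in for adjacency derived from Thiessen polygons, and the helper names are illustrative assumptions.

```python
import numpy as np

def adjacency_from_pairs(n, neighbor_pairs):
    """Binary, symmetric n x n adjacency matrix from index pairs of neighbors."""
    W = np.zeros((n, n))
    for i, j in neighbor_pairs:
        W[i, j] = W[j, i] = 1.0
    return W

def center_matrix(W):
    """Return C = (I - 11^T/n) W (I - 11^T/n)."""
    n = W.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n
    return M @ W @ M

# Hypothetical adjacency among four Thiessen polygons.
pairs = [(0, 1), (1, 2), (2, 3), (0, 2)]
W = adjacency_from_pairs(4, pairs)
C = center_matrix(W)
```

After centering, every row and column of C sums to zero, so eigenvectors of C with nonzero eigenvalues have zero mean.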
Step 4: compute the eigenvalues and eigenvectors of the matrix C obtained in step 3;
In a specific implementation, the eigenvalues and eigenvectors of the centered matrix C are computed using numerical analysis methods and computer algorithms. Existing techniques can be used: commonly used algorithms for solving eigenvalues and eigenvectors currently include the power method, the inverse power method, the Jacobi iteration method, and the QR algorithm. Many software packages and open-source libraries can also perform the computation; the more common ones include MATLAB, the Eigen library, and the math library that ships with C#.
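As an illustration only, the decomposition can also be done with NumPy's symmetric eigensolver `numpy.linalg.eigh`, which is swapped in here in place of the libraries listed above. The four-unit chain adjacency is a made-up example.

```python
import numpy as np

def eigen_decompose(C):
    """Eigenvalues (descending) and matching eigenvectors of a symmetric matrix."""
    C = (C + C.T) / 2.0                      # remove round-off asymmetry
    values, vectors = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]

# Hypothetical example: a chain of four units 0-1-2-3, centered as in step 3.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
M = np.eye(4) - np.ones((4, 4)) / 4
values, vectors = eigen_decompose(M @ W @ M)
```

Column `vectors[:, k]` is the eigenvector belonging to `values[k]`; sorting in descending order puts the eigenvectors with the strongest positive spatial autocorrelation first.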
Step 5: for logistic regression, select eigenvectors by stepwise regression from the eigenvectors and eigenvalues obtained in step 4. The specific selection steps are as follows:
Referring to Fig. 4, the specific implementation is as follows:
Step 5.1: preliminary screening of eigenvectors. Compute the Moran's I value of each eigenvector from its corresponding eigenvalue, and select the eigenvectors whose Moran's I value exceeds a preset threshold (preferably 0.25) as the candidate eigenvector set E_n for subsequent selection; Moran's I denotes the Moran index;
In a specific implementation, the eigenvectors should have autocorrelation consistent with that of the residuals. The Moran's I value measures the autocorrelation of an eigenvector: the larger it is, the better the eigenvector represents residual autocorrelation. Eigenvectors with Moran's I > 0.25 can usually be chosen as the candidate subset, which improves the efficiency of later selection. The Moran's I value of an eigenvector can be computed from its eigenvalue by the formula:
$$MI_i = \frac{n}{\mathbf{1}^{T} W \mathbf{1}}\,\lambda_i$$
where $\lambda_i$ is the corresponding eigenvalue, n is the number of rows and columns of the matrix, W is the original adjacency matrix, and $\mathbf{1}$ is an n×1 vector whose elements are all 1.
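The screening in step 5.1 can be sketched as below, assuming the eigenvalues come from the centered matrix of step 3 and using the preferred threshold of 0.25. The function names are illustrative.

```python
import numpy as np

def moran_from_eigenvalues(W, eigenvalues):
    """Moran's I of each eigenvector: n / (1^T W 1) * lambda_i."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    return n / W.sum() * np.asarray(eigenvalues, dtype=float)

def screen_candidates(W, threshold=0.25):
    """Eigenvectors of the centered W whose Moran's I exceeds the threshold."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n
    C = M @ W @ M
    eigenvalues, eigenvectors = np.linalg.eigh((C + C.T) / 2.0)
    moran = moran_from_eigenvalues(W, eigenvalues)
    keep = moran > threshold
    return eigenvectors[:, keep], moran[keep]
```

Because the retained eigenvectors have zero mean, the value from the eigenvalue formula agrees with Moran's I computed directly on the eigenvector, which makes a convenient sanity check.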
Step 5.2: for the logistic regression model, let the candidate eigenvector set E_n contain n candidate eigenvectors. Each candidate eigenvector E_i (i ∈ {1, 2, …, n}) of the set obtained in step 5.1 is added separately to the regression model Y = aX + b, that is, the independent variables X are replaced by X = X + E_i (X extended with E_i), where the original independent variables X are the classified causative factor values of the landslide sample data obtained in step 2. This yields n new regression models. The likelihood ratio test statistic LRT between each new model and the old model is computed, and the eigenvector with the largest LRT statistic is added to the regression model and removed from E_n;
In a specific implementation, the null hypothesis is that the added eigenvector contributes nothing to the regression model. The hypothesis is tested using the likelihood function as the criterion, by constructing the likelihood ratio test statistic LRT:
$$LRT = -2\ln\frac{L_0}{L_1} = 2\left(\ln L_1 - \ln L_0\right)$$
where $L_0$ is the maximum of the likelihood function of the initial regression model and $L_1$ is the maximum of the likelihood function of the new regression model after the eigenvector is added. The larger the LRT value, the greater the change in the maximized likelihood caused by the added eigenvector and the stronger the grounds for rejecting the null hypothesis, so the eigenvector with the largest LRT statistic is chosen.
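One selection round of step 5.2 might look like the following sketch. The logistic model is fitted here with a plain Newton-Raphson loop, an assumption made to keep the example self-contained; any maximum likelihood fitter could take its place, and the function names are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, max_iter=50, tol=1e-10):
    """Newton-Raphson logistic fit; returns (coefficients, maximized log-likelihood)."""
    A = np.column_stack([np.ones(len(y)), X])          # intercept + covariates
    beta = np.zeros(A.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        gradient = A.T @ (y - p)
        hessian = (A * (p * (1.0 - p))[:, None]).T @ A
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-A @ beta)), 1e-12, 1.0 - 1e-12)
    loglik = float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
    return beta, loglik

def best_candidate_by_lrt(X, y, candidates):
    """LRT of each candidate column against the current model X.

    Returns (index of the largest LRT, array of all LRT values).
    """
    _, loglik_old = fit_logistic(X, y)
    lrt = np.empty(candidates.shape[1])
    for i in range(candidates.shape[1]):
        X_new = np.column_stack([X, candidates[:, i]])
        _, loglik_new = fit_logistic(X_new, y)
        lrt[i] = 2.0 * (loglik_new - loglik_old)
    return int(np.argmax(lrt)), lrt
```

Here `X` holds the current independent variables (classified factor values plus any eigenvectors already selected) and `candidates` holds the remaining columns of E_n. The plain Newton loop assumes the classes are not perfectly separable.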
Step 5.3: test the significance of the selected eigenvector. If the result is not significant, discard the eigenvector, return to step 5.2, and select again from the remaining eigenvector set; if the result is significant, perform step 5.4 below;
In a specific implementation, the significance test on the selected eigenvector checks whether its parameter is meaningful. Because the logistic regression model is a nonlinear regression model, the embodiment uses the Wald chi-square test to judge the significance of the regression parameter of the selected eigenvector, and thus whether the eigenvector must be discarded. The null hypothesis of the test is that the parameter of the selected eigenvector in the regression model is 0. The p value of the chi-square statistic is computed under this hypothesis; if the p value is below the significance level, usually 0.05, the null hypothesis is rejected and the eigenvector need not be discarded. Otherwise the eigenvector is discarded and another is selected.
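For a single coefficient, the Wald chi-square statistic is the squared ratio of the estimate to its standard error, with one degree of freedom. The sketch below assumes the estimate and standard error are already available from the fitted model, and uses the identity that the upper tail of a one-degree-of-freedom chi-square at w equals erfc(sqrt(w/2)). The function names are illustrative.

```python
import math

def wald_chi_square(coefficient, standard_error):
    """Wald statistic and p value for H0: coefficient = 0 (1 degree of freedom)."""
    statistic = (coefficient / standard_error) ** 2
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

def keep_eigenvector(coefficient, standard_error, alpha=0.05):
    """True if the eigenvector's coefficient is significant at level alpha."""
    _, p_value = wald_chi_square(coefficient, standard_error)
    return p_value < alpha
```

For example, a coefficient of 0.9 with standard error 0.3 gives a statistic of 9 and is retained at the 0.05 level.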
Step 5.4: Perform a significance test on the spatial autocorrelation of the residuals of the new model containing the added eigenvector. If the result is significant, the residuals still exhibit significant spatial autocorrelation after the selected eigenvector has been added to the regression model, so further suitable eigenvectors must be chosen; the current eigenvector remains selected, and steps 5.2 and 5.3 are performed again. If the result is not significant, the eigenvector selection ends;
Under normal conditions, adding a few eigenvectors to the model is generally enough to achieve this goal. If residual spatial autocorrelation cannot be eliminated even after all eigenvectors have been traversed, the method is not applicable to the case at hand.
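The control flow of steps 5.2 to 5.4 can be sketched as a loop over the candidate set. This is only a skeleton under stated assumptions: the three callables `lrt`, `wald_p` and `residual_p` are hypothetical stand-ins for the likelihood ratio statistic, the Wald test p value and the residual autocorrelation p value, and the keep/reject rule follows the embodiment's description of the Wald test (a coefficient with p below the significance level is retained).

```python
from typing import Callable, List, Sequence, Tuple

def select_eigenvectors(
    candidates: Sequence[int],
    lrt: Callable[[List[int], int], float],
    wald_p: Callable[[List[int], int], float],
    residual_p: Callable[[List[int]], float],
    alpha: float = 0.05,
) -> Tuple[List[int], bool]:
    """Return (selected candidate ids, whether residual autocorrelation was removed)."""
    pool = list(candidates)
    selected: List[int] = []
    while pool:
        # Step 5.2: take the candidate with the largest LRT and remove it from the pool.
        best = max(pool, key=lambda k: lrt(selected, k))
        pool.remove(best)
        # Step 5.3: discard the candidate if its coefficient is not significant.
        if wald_p(selected, best) >= alpha:
            continue
        selected.append(best)
        # Step 5.4: stop once the residuals show no significant autocorrelation.
        if residual_p(selected) > alpha:
            return selected, True
    return selected, False        # pool exhausted without removing autocorrelation

# Toy callables (assumptions) that only exercise the control flow.
scores = {0: 1.0, 1: 5.0, 2: 3.0}
result = select_eigenvectors(
    [0, 1, 2],
    lrt=lambda sel, k: scores[k],
    wald_p=lambda sel, k: 0.5 if k == 2 else 0.01,
    residual_p=lambda sel: 0.2 if len(sel) >= 2 else 0.01,
)
```

The second element of the returned tuple corresponds to the case noted above in which every eigenvector has been traversed without eliminating residual spatial autocorrelation.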
In a specific implementation, the most widely used residual for Logistic regression is the Pearson residual. The embodiment preferably decides whether to stop or to continue selecting eigenvectors by testing the significance of the spatial autocorrelation of the Pearson residuals of the new model obtained from the steps above. First, the Pearson residuals e of the model and the Moran's I value of the residuals are computed, denoted P_real and I_real respectively. Because there is only one Pearson residual vector and one corresponding Moran's I value, the values in the Pearson residual vector are randomly permuted N times (999 by default; each permutation shuffles the entries of the vector e, giving 999 one-dimensional e vectors). Each permuted residual vector is used to obtain a new Pearson residual P_rnd, and its Moran's I value I_rnd is computed. The p value is the proportion of the N values I_rnd that are greater than I_real.
If p is greater than 0.05, the spatial filtering algorithm terminates.
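The permutation test on the residuals can be sketched as below. It is a minimal illustration assuming a symmetric binary adjacency matrix `W` and the usual definitions of the Pearson residual and of Moran's I; the function names and the fixed random seed are assumptions introduced here.

```python
import numpy as np

def pearson_residuals(y, p_hat):
    """Pearson residuals of a fitted logistic model: (y - p) / sqrt(p (1 - p))."""
    return (y - p_hat) / np.sqrt(p_hat * (1.0 - p_hat))

def morans_i(x, W):
    """Moran's I of vector x under spatial weight matrix W."""
    z = x - x.mean()
    return len(x) / W.sum() * (z @ W @ z) / (z @ z)

def permutation_p_value(resid, W, n_perm=999, seed=0):
    """Share of randomly permuted residual vectors whose Moran's I exceeds I_real."""
    rng = np.random.default_rng(seed)
    i_real = morans_i(resid, W)
    i_rnd = np.array([morans_i(rng.permutation(resid), W) for _ in range(n_perm)])
    return i_real, float(np.mean(i_rnd > i_real))

def chain_adjacency(n):
    """Toy adjacency (assumption): each point neighbours the next one in a line."""
    W = np.diag(np.ones(n - 1), 1)
    return W + W.T

# A smoothly varying residual vector is strongly autocorrelated on a chain.
i_real, p_value = permutation_p_value(np.linspace(-1.0, 1.0, 30), chain_adjacency(30))
```

Here a small `p_value` means the residuals are still spatially autocorrelated, so eigenvector selection would continue; a value above 0.05 would end it.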
Step 6: Add the eigenvectors selected in step 5 to the Logistic regression model as independent variables and build the landslide disaster Logistic regression model based on eigenfunction spatial filtering. That is, the regression model is built from the eigenvectors selected in step 5 together with the original independent variables.
In a specific implementation, the logistic regression model after adding all of the chosen eigenvectors can be expressed as:
$$\operatorname{logit}(x) = w_0 + w_1 x_1 + \dots + w_n x_n + E\alpha$$
Wherein x_1, ..., x_n are the graded values of the disaster-inducing factors corresponding to the landslide sample data, and E is the set of chosen eigenvectors with coefficient vector α. The parameters of the logistic regression model can be solved by maximum likelihood estimation, yielding the Logistic regression model based on eigenfunction spatial filtering for the study and analysis of landslide disasters.
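As an illustration of how the fitted model above maps factor values and eigenvector values to a probability, the sketch below evaluates the linear predictor and applies the inverse logit. The function name and the toy coefficient values are assumptions; in practice the coefficients come from the maximum likelihood fit.

```python
import numpy as np

def landslide_probability(X, E, w0, w, alpha):
    """Inverse logit of w0 + X w + E alpha, one probability per sample point."""
    eta = w0 + X @ w + E @ alpha
    return 1.0 / (1.0 + np.exp(-eta))

# Toy values (assumptions): one factor, one selected eigenvector, two sample points.
X = np.array([[1.0], [0.0]])
E = np.array([[0.0], [0.0]])
probs = landslide_probability(X, E, w0=0.0, w=np.array([np.log(3.0)]), alpha=np.array([2.0]))
```

For the first point the linear predictor equals ln 3, which corresponds to a probability of 0.75; for the second it equals 0, giving 0.5.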
Step 7: Model evaluation. Four indicators, namely residual Moran's I, Prob>chi2, Pseudo R2 and the AUC value of the ROC curve, are selected as evaluation parameters to evaluate the Logistic regression model based on eigenfunction spatial filtering built in step 6.
In a specific implementation, residual Moran's I, Prob>chi2, Pseudo R2 and the AUC value are computed for both the ordinary Logistic model and the Logistic regression model based on eigenfunction spatial filtering. Residual Moran's I evaluates how well spatial filtering handles residual autocorrelation; in general, when residual Moran's I is below the threshold 0.05, the residuals can be judged to be free of spatial autocorrelation. Prob>chi2 is the probability associated with the chi-square statistic under the null hypothesis that the independent variables have no effect on the dependent variable, and is used to evaluate whether the regression model parameters are significant. Usually 0.05 or 0.01 is taken as the significance level; when Prob>chi2 is below 0.05 or 0.01, the independent variables have a significant effect on the dependent variable and the parameters are considered meaningful. Pseudo R2, also called pseudo R-squared, is a test statistic proposed for logistic regression models by analogy with the R-squared of linear regression models; it evaluates the goodness of fit of the model, and a larger value indicates a better fit. The ROC curve, also called the receiver operating characteristic curve, is a composite indicator combining specificity, sensitivity and misclassification rate. The area under the curve is the AUC value, which evaluates the predictive accuracy of the regression model: the larger the AUC value, the higher the predictive accuracy of the model.
A model that passes the evaluation can be applied to the whole study area; through regression, the landslide susceptibility of the entire region can be predicted.
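Two of the four indicators can be sketched in a few lines. The text does not say which pseudo R-squared variant is used, so the McFadden form (one minus the ratio of the model log-likelihood to the intercept-only log-likelihood) is an assumption here; the AUC is computed through its rank interpretation, the probability that a landslide point scores higher than a non-landslide point. Function names are hypothetical.

```python
import numpy as np

def auc_score(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg)))

def mcfadden_pseudo_r2(ll_model, ll_null):
    """McFadden pseudo R-squared from the two maximised log-likelihoods."""
    return 1.0 - ll_model / ll_null

auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
r2 = mcfadden_pseudo_r2(ll_model=-50.0, ll_null=-100.0)
```

In the toy call three of the four landslide/non-landslide pairs are ordered correctly, so the AUC is 0.75, and halving the log-likelihood magnitude gives a pseudo R-squared of 0.5.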
It should be understood that the parts not elaborated in this specification belong to the prior art.
The foregoing is merely one embodiment of the present invention and is not intended to limit the invention. Any modification, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (3)

  1. A landslide disaster logistic regression analysis method based on eigenfunction spatial filtering, characterized by comprising the following steps:
    Step 1, select and process landslide sample data, the landslide sample data comprising landslide point samples and non-landslide point samples, including obtaining the landslide point samples with their corresponding spatial location attributes and landslide area attributes, and choosing non-landslide point samples equal in number to the landslide point samples;
    Step 2, acquire and grade the corresponding disaster-inducing factor values for the landslide sample data obtained in step 1;
    Step 3, for the landslide sample points obtained in step 1, determine the spatial adjacency relationships between sample points by constructing Thiessen polygons, obtain the corresponding spatial adjacency matrix W, and apply a centering operation to the spatial adjacency matrix W to obtain matrix C;
    Step 4, compute the eigenvalues and eigenvectors of the matrix C obtained in step 3;
    Step 5, for Logistic regression, select eigenvectors by stepwise regression from the eigenvectors and eigenvalues obtained in step 4, implemented as follows,
    Step 5.1, preliminary screening of eigenvectors, including computing the Moran's I value of each eigenvector from its corresponding eigenvalue and choosing the eigenvectors whose Moran's I value is greater than the corresponding preset threshold as the candidate eigenvector set E_n for subsequent eigenvector selection;
    Step 5.2, for the Logistic regression model, add each of the n candidate eigenvectors in the candidate eigenvector set E_n obtained in step 5.1 separately to the regression model without the filtering operator, obtaining n new regression models; compute the likelihood ratio test statistic LRT between the new and old models, add the eigenvector with the largest LRT statistic to the regression model, and remove the chosen eigenvector from E_n;
    Step 5.3, perform a significance test on the selected eigenvector; if the result is significant, reject this eigenvector and return to step 5.2; if the result is not significant, perform step 5.4;
    Step 5.4, perform a significance test on the residual spatial autocorrelation of the new model containing the added eigenvector; if the result is significant, return to step 5.2 and step 5.3; if the result is not significant, the eigenvector selection ends;
    Step 6, add the eigenvectors selected in step 5 to the Logistic regression model as independent variables and build the landslide disaster Logistic regression model based on eigenfunction spatial filtering.
  2. The logistic regression landslide disaster analysis method based on eigenvector spatial filtering according to claim 1, characterized in that: in step 1, a buffer zone is drawn with each landslide disaster point as the centre and the landslide influence range as the radius, giving the landslide-affected area; the whole landslide study area minus the landslide-affected area is the selection area for non-landslide points, and non-landslide point samples equal in number to the landslide point samples are chosen at random within the selection area.
  3. The logistic regression landslide disaster analysis method based on eigenvector spatial filtering according to claim 1 or 2, characterized in that: four indicators, namely residual Moran's I, Prob>chi2, Pseudo R2 and the AUC value of the ROC curve, are chosen as evaluation parameters to evaluate the landslide disaster Logistic regression model based on eigenfunction spatial filtering.
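The centering, eigendecomposition and preliminary screening recited in steps 3 to 5.1 of claim 1 can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: it takes a symmetric adjacency matrix `W` as given instead of deriving it from Thiessen polygons, uses the common centering C = (I - 11'/n) W (I - 11'/n), obtains each eigenvector's Moran's I as n / sum(W) times its eigenvalue, and treats the threshold 0.25 as a hypothetical preset value.

```python
import numpy as np

def candidate_eigenvectors(W, threshold=0.25):
    """Eigenvectors of the centered adjacency matrix whose Moran's I exceeds threshold."""
    n = W.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n        # centering projector
    C = M @ W @ M                              # centered adjacency matrix
    eigvals, eigvecs = np.linalg.eigh(C)       # C is symmetric when W is symmetric
    moran = n / W.sum() * eigvals              # Moran's I of each eigenvector
    keep = moran > threshold
    return eigvecs[:, keep], moran[keep]

# Toy adjacency (assumption): ten sample points arranged in a line.
W = np.diag(np.ones(9), 1)
W = W + W.T
E_n, moran_values = candidate_eigenvectors(W)
```

The columns of `E_n` play the role of the candidate eigenvector set from which the stepwise selection proceeds.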
CN201711425595.5A 2017-12-25 2017-12-25 Landslide disaster logistic regression analysis method based on characteristic function spatial filtering value Active CN108038081B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711425595.5A CN108038081B (en) 2017-12-25 2017-12-25 Landslide disaster logistic regression analysis method based on characteristic function spatial filtering value

Publications (2)

Publication Number Publication Date
CN108038081A true CN108038081A (en) 2018-05-15
CN108038081B CN108038081B (en) 2020-02-11

Family

ID=62101191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711425595.5A Active CN108038081B (en) 2017-12-25 2017-12-25 Landslide disaster logistic regression analysis method based on characteristic function spatial filtering value

Country Status (1)

Country Link
CN (1) CN108038081B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103364830A (en) * 2013-07-24 2013-10-23 北京师范大学 Predication method of happening position of slump disaster after earthquake based on multiple factors
CN106600578A (en) * 2016-11-22 2017-04-26 武汉大学 Remote-sensing-image-based parallelization method of regression model of characteristic function space filter value

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JONATHAN B.THAYN 等: "Accounting for Spatial Autocorrelation in Linear Regression Models Using Spatial Filtering with Eigenvectors", 《ANNALS OF THE ASSOCIATION OF AMERICAN GEOGRAPHERS》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784552A (en) * 2018-12-29 2019-05-21 武汉大学 A kind of construction method of the space variable coefficient PM2.5 concentration appraising model based on Re-ESF algorithm
CN109784552B (en) * 2018-12-29 2022-12-13 武汉大学 Re-ESF algorithm-based construction method of space variable coefficient PM2.5 concentration estimation model
CN110188324A (en) * 2019-05-17 2019-08-30 武汉大学 A kind of traffic accident poisson regression analysis based on characteristic vector space filter value
CN110188324B (en) * 2019-05-17 2023-05-30 武汉大学 Traffic accident poisson regression analysis method based on feature vector space filtering value
CN110569554A (en) * 2019-08-13 2019-12-13 成都垣景科技有限公司 Landslide susceptibility evaluation method based on spatial logistic regression and geographic detector
CN110569554B (en) * 2019-08-13 2020-11-10 成都垣景科技有限公司 Landslide susceptibility evaluation method based on spatial logistic regression and geographic detector
CN112070366A (en) * 2020-08-19 2020-12-11 核工业湖州工程勘察院有限公司 Regional landslide risk quantitative measuring and calculating method based on multi-source monitoring data correlation analysis
CN112070366B (en) * 2020-08-19 2022-03-29 核工业湖州勘测规划设计研究院股份有限公司 Regional landslide risk quantitative measuring and calculating method based on multi-source monitoring data correlation analysis
CN112181642A (en) * 2020-09-16 2021-01-05 武汉大学 Artificial intelligence optimization method for space calculation operation
CN112181642B (en) * 2020-09-16 2024-02-02 武汉大学 Artificial intelligence optimization method for space calculation operation
CN113468477A (en) * 2020-12-23 2021-10-01 南方科技大学 Sensitive data investigation and analysis method, storage medium and equipment
CN113468477B (en) * 2020-12-23 2023-11-24 南方科技大学 Sensitive data investigation analysis method, storage medium and equipment

Also Published As

Publication number Publication date
CN108038081B (en) 2020-02-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant