CN111582615A - Evaluation method and system based on LOGISTIC regression model - Google Patents

Evaluation method and system based on LOGISTIC regression model

Publication number
CN111582615A
Authority
CN
China
Prior art keywords
grid
evaluation
grids
area
event
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910119929.9A
Other languages
Chinese (zh)
Inventor
陈朝亮
钱静
彭树宏
胡增运
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0639: Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393: Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Operations Research (AREA)
  • Strategic Management (AREA)
  • Pure & Applied Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Mathematical Physics (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Algebra (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an evaluation method and system based on a Logistic regression model, and belongs to the technical field of models and evaluation. The method comprises: dividing the area to be evaluated into grids of the same size; acquiring the number of first event points in each grid, and determining from that number whether the grid and its surrounding grids are first grids, where each first event point corresponds to a first evaluation index; acquiring a first grid evaluation index from the first evaluation index in the first grid; and calculating the probability of the first event occurring in the evaluation area from the first grid evaluation index and the Logistic regression model. Counting the number and positions of event points in the grids determines the attribute of each disaster grid, and regression analysis is performed on those attributes, which improves the precision of the fitted regression result, allows the weight of each evaluation index to be calculated more accurately, and yields higher evaluation precision.

Description

Evaluation method and system based on LOGISTIC regression model
Technical Field
The invention relates to the technical field of model evaluation, in particular to an evaluation method and system based on a Logistic regression model.
Background
The geological disaster evaluation methods in common use are based on index weights: the weight of each index is obtained with a corresponding mathematical model, the total evaluation value of the research area is obtained by linear weighted summation, and the evaluation result is graded according to the relevant standard to obtain the geological disaster susceptibility zoning result.
Methods for determining evaluation index weights can be divided into subjective and objective methods. The subjective approach is represented by the Analytic Hierarchy Process (AHP): experts in the field compare the evaluation indexes pairwise and score their relative importance on a scale of 1-9; a judgment matrix is constructed from these scores and the index hierarchy; the largest eigenvalue of the judgment matrix and its corresponding eigenvector are solved; the eigenvector gives the importance ranking of the evaluation factors, and normalizing it yields the weight of each index. The method is mathematically rigorous, but the importance scoring is strongly affected by human subjectivity, and incomplete knowledge easily makes the resulting index weights inaccurate. The objective approach is represented by the binary Logistic regression model: historical disaster points and an equal number of randomly generated non-disaster points are assigned the values 1 and 0 respectively; the fuzzy values of the evaluation index layers at the positions of these points are extracted with geographic information tools; and Logistic regression is then performed on the indexes to fit the weight of each evaluation index. This method has a strong theoretical basis and eliminates the influence of human subjectivity, but the randomly generated non-disaster points cannot be controlled and easily fall inside or overlap disaster areas, which lowers the accuracy of the evaluation index weights.
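The AHP weight calculation described above can be sketched as follows. This is a minimal illustration, assuming power iteration as the eigenvector solver (the source does not say which solver is used); the function name `ahp_weights` and the 3x3 judgment matrix are hypothetical.

```python
def ahp_weights(matrix, iterations=100, tol=1e-12):
    """Approximate the principal eigenvector of a pairwise judgment matrix.

    Returns (weights, lambda_max), where weights sum to 1.
    Power iteration is an assumed solver, not one named by the source.
    """
    n = len(matrix)
    w = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        # Multiply the judgment matrix by the current weight vector.
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        new_w = [x / total for x in v]
        # Because sum(w) == 1, sum(A w) estimates the largest eigenvalue.
        lam = total
        converged = max(abs(a - b) for a, b in zip(new_w, w)) < tol
        w = new_w
        if converged:
            break
    return w, lam


# Hypothetical, perfectly consistent judgments: index 1 is twice as
# important as index 2 and four times as important as index 3.
judgments = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights, lambda_max = ahp_weights(judgments)
print(weights, lambda_max)
```

For a perfectly consistent matrix the largest eigenvalue equals the matrix order; expert scores in practice are rarely this consistent, which is one source of the subjectivity noted above.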
Prior-art geological disaster evaluation therefore has two problems. First, the positions of non-disaster points cannot be determined accurately. The binary Logistic regression model of the prior art is fitted to historical disaster points and non-disaster points; because a large amount of data is required, the non-disaster points can only be generated randomly within the research area, and their distance from disaster points cannot be determined accurately, so some non-disaster points fall inside disaster areas and the model fit becomes inaccurate. Second, the evaluation precision of existing schemes is low: the confidence of each index weight is low, so the fitting precision of the model cannot be guaranteed and the evaluation result is inaccurate.
Disclosure of Invention
In view of the above, the invention provides an evaluation method and system based on a Logistic regression model, to solve the technical problems that existing evaluation based on a Logistic regression model requires a large amount of data, produces inaccurate model fitting results, and has low evaluation precision.
The technical scheme of the invention is as follows:
a Logistic regression model-based evaluation method, comprising:
dividing the area to be evaluated into grids with the same size;
acquiring the number of first event points in the grid, and determining whether the grid is a first grid or not and whether surrounding grids are first grids or not according to the number of the first event points in the grid; the first event point corresponds to a first evaluation index;
acquiring a first grid evaluation index according to the first evaluation index in the first grid;
and calculating the probability of the first event in the evaluation area according to the first grid evaluation index and the Logistic regression model.
Correspondingly, the determining whether the grid is the first grid and whether the surrounding grid thereof is the first grid according to the number of the first event points in the grid includes:
if the number of first event points in the grid is N, determining the grid and all grids in the N-layer neighborhood around it to be disaster grids.
Accordingly, the first event point comprises a geological disaster point.
Correspondingly, the first evaluation index at least comprises one or more of gradient, elevation, distance from the river, annual average precipitation, normalized vegetation index, distance from the road, distance from the structure and formation lithology.
Correspondingly, the obtaining a first grid evaluation index according to the first evaluation index in the first grid includes:
and acquiring the mean value of the maximum value and the minimum value of the first evaluation index corresponding to the first event point in the first grid as the first grid evaluation index.
Correspondingly, the calculating the probability of the first event in the evaluation area according to the first grid evaluation index and the Logistic regression model comprises:
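$$\operatorname{Logit} P = \ln\frac{P}{1-P}$$
the Logistic regression model is:
$$\operatorname{Logit} P = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i$$
$$P = \frac{\exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i)}{1 + \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i)}$$
where $P$ represents the probability of occurrence of the first event within the evaluation area, $X_i$ represents a first evaluation index, and $\beta_0, \beta_1, \ldots, \beta_i$ are the logistic regression coefficients.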
Correspondingly, the dividing the region to be evaluated into grids of the same size includes: and determining the size of the grid according to the size of the scale in the area to be evaluated.
In addition, in order to achieve the above object, the present invention further provides an evaluation system based on Logistic regression model, the system comprising:
the dividing unit is used for dividing the area to be evaluated into grids with the same size;
the determining unit is used for acquiring the number of the first event points in the grid and determining whether the grid is the first grid or not and whether the surrounding grid is the first grid or not according to the number of the first event points in the grid; the first event point corresponds to a first evaluation index;
the index acquisition unit is used for acquiring a first grid evaluation index according to the first evaluation index in the first grid;
and the model calculation unit is used for calculating the probability of the first event in the evaluation area according to the first grid evaluation index and the Logistic regression model.
Correspondingly, the determining unit determines whether the grid is the first grid and whether the surrounding grid is the first grid according to the number of the first event points in the grid, and the determining unit includes:
if the number of first event points in the grid is N, determining the grid and all grids in the N-layer neighborhood around it to be disaster grids.
Accordingly, the first event point comprises a geological disaster point.
Correspondingly, the first evaluation index at least comprises one or more of gradient, elevation, distance from the river, annual average precipitation, normalized vegetation index, distance from the road, distance from the structure and formation lithology.
Correspondingly, the index obtaining unit comprises:
and acquiring the mean value of the maximum value and the minimum value of the first evaluation index corresponding to the first event point in the first grid as the first grid evaluation index.
Accordingly, the model calculation unit includes:
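$$\operatorname{Logit} P = \ln\frac{P}{1-P}$$
the Logistic regression model is:
$$\operatorname{Logit} P = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i$$
$$P = \frac{\exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i)}{1 + \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i)}$$
where $P$ represents the probability of occurrence of the first event within the evaluation area, $X_i$ represents a first evaluation index, and $\beta_0, \beta_1, \ldots, \beta_i$ are the logistic regression coefficients.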
Correspondingly, the dividing unit comprises: and determining the size of the grid according to the size of the scale in the area to be evaluated.
In the scheme of the embodiments of the invention, the area to be evaluated is divided into grids of the same size; the number of first event points in each grid is acquired, and whether the grid and its surrounding grids are first grids is determined from that number, where each first event point corresponds to a first evaluation index; a first grid evaluation index is acquired from the first evaluation index in the first grid; and the probability of the first event occurring in the evaluation area is calculated from the first grid evaluation index and the Logistic regression model. The invention replaces disaster points and non-disaster points with grids for the binary Logistic regression analysis, which improves the precision of the fitted regression result. The attribute of each disaster grid is determined by counting the number and positions of disaster points in the grid, and regression analysis is performed on those attributes; compared with the conventional binary Logistic regression on disaster points and random non-disaster points, this calculates the weight of each evaluation index more accurately and achieves higher evaluation precision.
Drawings
Fig. 1 is a flowchart of the evaluation method based on a Logistic regression model according to an embodiment of the present invention;
Fig. 2(a) is a schematic diagram of the eight-neighborhood of a disaster point according to an embodiment of the present invention;
Fig. 2(b) is a schematic diagram of the disaster grids expanded outward by 2 layers around a disaster point according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the evaluation system based on a Logistic regression model according to the second embodiment of the present invention;
Fig. 4 is the ROC curve obtained by verification with disaster points according to the third embodiment of the present invention;
Fig. 5 is the ROC curve obtained by verification with disaster grids according to the third embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
The embodiment of the invention provides an evaluation method based on a Logistic regression model, and fig. 1 is a flow chart of the evaluation method based on the Logistic regression model provided by the embodiment of the invention; the method comprises the following steps:
S101, dividing the area to be evaluated into grids of the same size;
correspondingly, the dividing the region to be evaluated into grids of the same size includes: and determining the size of the grid according to the size of the scale in the area to be evaluated.
In practical application, the size of the grid is determined according to the scale size of the index data in the research area range, and the formula is as follows:
$$G_s = 7.5 + 0.0006\,S - 2.01\times10^{-9}\,S^2 + 2.91\times10^{-15}\,S^3$$
where $G_s$ is the grid size and $S$ is the scale of the base data.
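The grid-size formula can be evaluated directly, as in the sketch below. It rests on two assumptions that the source does not state: that S is the scale denominator (for example 50000 for a 1:50,000 map), and that the powers of ten carry negative exponents (10^-9 and 10^-15), since positive exponents would give meaningless grid sizes. The function name `grid_size` is hypothetical.

```python
def grid_size(scale_denominator: float) -> float:
    """Grid size G_s computed from the base-data scale S.

    Assumptions: S is the scale denominator and the exponents on the
    powers of ten are negative; the source states neither explicitly.
    """
    s = scale_denominator
    return 7.5 + 0.0006 * s - 2.01e-9 * s**2 + 2.91e-15 * s**3


for s in (10_000, 50_000, 100_000):
    print(f"1:{s:,} -> grid size {grid_size(s):.2f}")
```

Under these assumptions the linear term dominates at common map scales, so the grid size grows roughly in proportion to the scale denominator.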
S102, acquiring the number of first event points in the grid, and determining whether the grid is a first grid or not and whether surrounding grids are first grids or not according to the number of the first event points in the grid; the first event point corresponds to a first evaluation index;
accordingly, the first event point comprises a geological disaster point.
In practical application, because a binary Logistic regression model performs regression on the research area against a definite binary value, the disaster grids and non-disaster grids must be determined first. To do so, the grid units divided in the previous step are overlaid on the known disaster points, and the number of disaster points in each grid is counted from the positional relationship between the points and the grids.
Correspondingly, the determining whether the grid is the first grid and whether the surrounding grid thereof is the first grid according to the number of the first event points in the grid includes:
if the number of first event points in the grid is N, determining the grid and all grids in the N-layer neighborhood around it to be disaster grids.
In practical application, whether each grid is a disaster grid is determined from the number of disaster points in it, as follows. First, if a grid contains a disaster point, the grid and all grids in its surrounding eight-neighborhood are determined to be disaster grids; as shown in fig. 2(a), the central grid contains a disaster point (the black dot), the blue grids are its eight-neighborhood, and all of them are determined to be disaster grids. Second, the number of disaster points in each grid is counted, and if it is more than 1, the disaster grids are expanded outward by one more layer for each additional disaster point. In fig. 2(b), for example, the central grid contains 2 disaster points (the black dots), so the grid and all grids in the 2-layer neighborhood around it are determined to be disaster grids.
After the disaster grids are determined, the non-disaster grids are obtained by inverting the selection, that is, by taking all grid units that are not disaster grids.
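The counting and N-layer expansion described above can be sketched as follows. This is a minimal illustration, assuming a regular grid stored as nested lists, square cells, and point coordinates measured from the grid origin; the names `count_points` and `label_disaster_grids` are hypothetical.

```python
def count_points(points, cell_size, n_rows, n_cols):
    """Count the disaster points falling in each grid cell.

    Assumption: the row index comes from y and the column index from x,
    both measured from the grid origin; points outside the grid are ignored.
    """
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        r, c = int(y // cell_size), int(x // cell_size)
        if 0 <= r < n_rows and 0 <= c < n_cols:
            counts[r][c] += 1
    return counts


def label_disaster_grids(counts):
    """Mark a cell holding N points, and its N-layer neighborhood, as disaster."""
    n_rows, n_cols = len(counts), len(counts[0])
    disaster = [[False] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            n = counts[r][c]
            if n == 0:
                continue
            # N layers around (r, c) form a (2N+1) x (2N+1) window,
            # clipped at the edges of the study area.
            for rr in range(max(0, r - n), min(n_rows, r + n + 1)):
                for cc in range(max(0, c - n), min(n_cols, c + n + 1)):
                    disaster[rr][cc] = True
    return disaster


counts = count_points([(3.5, 3.5), (3.2, 3.8)], cell_size=1.0, n_rows=7, n_cols=7)
mask = label_disaster_grids(counts)
# Non-disaster grids are the inverse selection of the disaster grids.
non_disaster = [[not cell for cell in row] for row in mask]
print(sum(map(sum, mask)), sum(map(sum, non_disaster)))
```

With one point in the central cell the window covers the eight-neighborhood (9 cells); with two points it covers the 2-layer neighborhood (25 cells), matching fig. 2(a) and fig. 2(b) as described.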
S103, acquiring a first grid evaluation index according to the first evaluation index in the first grid;
correspondingly, the obtaining a first grid evaluation index according to the first evaluation index in the first grid includes:
and acquiring the mean value of the maximum value and the minimum value of the first evaluation index corresponding to the first event point in the first grid as the first grid evaluation index.
In practical application, a method that performs Logistic regression analysis on disaster points can extract the value of each evaluation index directly at the disaster points. The present method replaces the disaster points with disaster grids: the maximum and minimum of each index within the grid are extracted first, their mean is then calculated, and this mean replaces the index values within the grid in the regression analysis. The mean value $M_a$ of an index a in the grid is calculated as follows:
$$M_a = \frac{a_{\max} + a_{\min}}{2}$$
where $a_{\max}$ is the maximum value of index a in the grid and $a_{\min}$ is the minimum value of index a in the grid.
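The grid index value is the midpoint of the extremes, which in general differs from the arithmetic mean of all values in the grid. A short sketch, with the hypothetical name `grid_index_value` and made-up sample values:

```python
def grid_index_value(values):
    """Mean of the maximum and minimum of one index within a grid cell."""
    if not values:
        raise ValueError("the grid cell contains no index values")
    return (max(values) + min(values)) / 2


# Hypothetical slope samples (degrees) inside one disaster grid.
slopes = [10.0, 12.0, 30.0]
print(grid_index_value(slopes))    # midpoint of the extremes
print(sum(slopes) / len(slopes))   # arithmetic mean, shown for contrast
```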
Correspondingly, the first evaluation index at least comprises one or more of gradient, elevation, distance from the river, annual average precipitation, normalized vegetation index, distance from the road, distance from the structure and formation lithology.
In practical application, following the geological environment characteristics of the research area and the principles of pertinence, universality, quantifiability and data accessibility, 8 indexes in 3 categories (natural geography, basic geology and ecological conditions) are selected as evaluation factors of the ecological-geological environment quality of the research area: slope, elevation, distance from river, annual average precipitation, normalized difference vegetation index (NDVI), distance from road, distance from structure, and stratigraphic lithology.
And S104, calculating the probability of the first event in the evaluation area according to the first grid evaluation index and the Logistic regression model.
Correspondingly, the calculating the probability of the first event in the evaluation area according to the first grid evaluation index and the Logistic regression model comprises:
Whether a geological disaster occurs is taken as the binary dependent variable Y (1 represents occurrence, 0 represents non-occurrence), and the disaster-causing factors $X_i$ are taken as the independent variables. Let P be the probability that a geological disaster occurs, with value range [0,1]; then 1-P is the probability that it does not occur. Taking the natural logarithm of the ratio of the two, denoted Logit P, gives:
$$\operatorname{Logit} P = \ln\frac{P}{1-P}$$
the Logistic linear regression model is:
$$\operatorname{Logit} P = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i$$
where $\beta_0, \beta_1, \ldots, \beta_i$ are the logistic regression coefficients, i.e. the weights of the factors, and $\beta_0$ is the constant term. Solving for P gives:
$$P = \frac{\exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i)}{1 + \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i)}$$
After the weights of the evaluation factors are determined, the probability of a geological disaster occurring in the research area is obtained from the above formula, and the geological disaster susceptibility of the research area is divided into grades according to the relevant rules to obtain a geological disaster risk prediction zoning map of the research area.
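Once the coefficients are known, the probability for one grid follows from the logistic formula. The sketch below assumes the index values have already been extracted for the grid; the coefficient values are purely illustrative and are not taken from the source, and the function names are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Natural logarithm of the odds P / (1 - P)."""
    return math.log(p / (1.0 - p))


def occurrence_probability(x, beta0, betas):
    """P = exp(z) / (1 + exp(z)) with z = beta0 + sum(beta_i * x_i)."""
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    # Evaluate in a form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Illustrative coefficients and index values only.
p = occurrence_probability([1.0, 2.0], beta0=-0.5, betas=[0.8, -0.3])
print(p, logit(p))
```

Applying `logit` to the result recovers the linear predictor, which is a quick check that the two formulas are inverses of each other.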
Specifically, an area where the probability of a geological disaster is above 0.7 is classified as a high-susceptibility area; an area where the probability is between 0.3 and 0.7 is classified as a medium-susceptibility area; and an area where the probability is below 0.3 is classified as a low-susceptibility area.
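The grading rule can be written as a small function. The source does not say which grade the boundary values 0.3 and 0.7 belong to; this sketch assumes both fall in the medium grade, and the name `susceptibility_level` is hypothetical.

```python
def susceptibility_level(p: float) -> str:
    """Map an occurrence probability to a susceptibility grade.

    Assumption: the boundaries 0.3 and 0.7 are treated as medium.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p > 0.7:
        return "high"
    if p >= 0.3:
        return "medium"
    return "low"


for p in (0.85, 0.5, 0.1):
    print(p, susceptibility_level(p))
```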
Example two
The second embodiment of the invention provides an evaluation system based on a Logistic regression model. Fig. 3 is a schematic structural diagram of the system; as shown in fig. 3, the system includes:
a dividing unit 301 that divides the region to be evaluated into grids of the same size;
accordingly, the dividing unit 301 includes: and determining the size of the grid according to the size of the scale in the area to be evaluated.
In practical application, the size of the grid is determined according to the scale size of the index data in the research area range, and the formula is as follows:
$$G_s = 7.5 + 0.0006\,S - 2.01\times10^{-9}\,S^2 + 2.91\times10^{-15}\,S^3$$
where $G_s$ is the grid size and $S$ is the scale of the base data.
A determining unit 302, configured to obtain the number of first event points in the grid, and determine whether the grid is a first grid and whether surrounding grids are the first grid according to the number of first event points in the grid; the first event point corresponds to a first evaluation index;
accordingly, the first event point comprises a geological disaster point.
In practical application, because a binary Logistic regression model performs regression on the research area against a definite binary value, the disaster grids and non-disaster grids must be determined first. To do so, the grid units divided in the previous step are overlaid on the known disaster points, and the number of disaster points in each grid is counted from the positional relationship between the points and the grids.
Accordingly, the determining unit 302 includes:
if the number of first event points in the grid is N, determining the grid and all grids in the N-layer neighborhood around it to be disaster grids.
In practical application, whether each grid is a disaster grid is determined from the number of disaster points in it, as follows. First, if a grid contains a disaster point, the grid and all grids in its surrounding eight-neighborhood are determined to be disaster grids; as shown in fig. 2(a), the central grid contains a disaster point (the black dot), the blue grids are its eight-neighborhood, and all of them are determined to be disaster grids. Second, the number of disaster points in each grid is counted, and if it is more than 1, the disaster grids are expanded outward by one more layer for each additional disaster point. In fig. 2(b), for example, the central grid contains 2 disaster points (the black dots), so the grid and all grids in the 2-layer neighborhood around it are determined to be disaster grids.
After the disaster grids are determined, the non-disaster grids are obtained by inverting the selection, that is, by taking all grid units that are not disaster grids.
An index obtaining unit 303, which obtains a first grid evaluation index according to the first evaluation index in the first grid;
correspondingly, the index obtaining unit 303 includes:
and acquiring the mean value of the maximum value and the minimum value of the first evaluation index corresponding to the first event point in the first grid as the first grid evaluation index.
In practical application, a method that performs Logistic regression analysis on disaster points can extract the value of each evaluation index directly at the disaster points. The present method replaces the disaster points with disaster grids: the maximum and minimum of each index within the grid are extracted first, their mean is then calculated, and this mean replaces the index values within the grid in the regression analysis. The mean value $M_a$ of an index a in the grid is calculated as follows:
$$M_a = \frac{a_{\max} + a_{\min}}{2}$$
where $a_{\max}$ is the maximum value of index a in the grid and $a_{\min}$ is the minimum value of index a in the grid.
Correspondingly, the first evaluation index at least comprises one or more of gradient, elevation, distance from the river, annual average precipitation, normalized vegetation index, distance from the road, distance from the structure and formation lithology.
In practical application, following the geological environment characteristics of the research area and the principles of pertinence, universality, quantifiability and data accessibility, 8 indexes in 3 categories (natural geography, basic geology and ecological conditions) are selected as evaluation factors of the ecological-geological environment quality of the research area: slope, elevation, distance from river, annual average precipitation, normalized difference vegetation index (NDVI), distance from road, distance from structure, and stratigraphic lithology.
A model calculation unit 304, configured to calculate the probability of occurrence of the first event in the evaluation area according to the first grid evaluation index and the Logistic regression model.
Accordingly, the model calculation unit 304 includes:
Whether a geological disaster occurs is taken as the binary dependent variable Y (1 represents occurrence, 0 represents non-occurrence), and the disaster-causing factors $X_i$ are taken as the independent variables. Let P be the probability that a geological disaster occurs, with value range [0,1]; then 1-P is the probability that it does not occur. Taking the natural logarithm of the ratio of the two, denoted Logit P, gives:
$$\operatorname{Logit} P = \ln\frac{P}{1-P}$$
the Logistic linear regression model is:
$$\operatorname{Logit} P = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i$$
where $\beta_0, \beta_1, \ldots, \beta_i$ are the logistic regression coefficients, i.e. the weights of the factors, and $\beta_0$ is the constant term. Solving for P gives:
$$P = \frac{\exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i)}{1 + \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i)}$$
After the weights of the evaluation factors are determined, the probability of a geological disaster occurring in the research area is obtained from the above formula, and the geological disaster susceptibility of the research area is divided into grades according to the relevant rules to obtain a geological disaster risk prediction zoning map of the research area.
Specifically, an area where the probability of a geological disaster is above 0.7 is classified as a high-susceptibility area; an area where the probability is between 0.3 and 0.7 is classified as a medium-susceptibility area; and an area where the probability is below 0.3 is classified as a low-susceptibility area.
Example three
Based on the evaluation method and system provided in the first and second embodiments, this embodiment verifies and analyzes the model with data.
The experiments use 1108 historical disaster points of a certain city, of which 80% (886 disaster points) are used to generate the model and 20% (222 disaster points) are used to verify it.
Experiment one: following the best method in the prior art, the values of each evaluation index are first extracted at the known disaster points and the generated non-disaster points; binary Logistic regression analysis on these values gives the weight of each evaluation index; the evaluation indexes are combined by linear weighting with these weights; and the result is graded for display to obtain the final evaluation result.
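The regression step, which fits the index weights from samples labelled 1 (disaster) and 0 (non-disaster), can be sketched as below. The source does not name the software or solver used; this sketch assumes plain batch gradient ascent on the log-likelihood, and the function names and toy data are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, evaluated in an overflow-safe form."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit beta0 and betas by batch gradient ascent on the log-likelihood."""
    n_features = len(X[0])
    beta0, betas = 0.0, [0.0] * n_features
    m = len(X)
    for _ in range(epochs):
        g0, g = 0.0, [0.0] * n_features
        for xi, yi in zip(X, y):
            # Residual between the label and the current predicted probability.
            err = yi - sigmoid(beta0 + sum(b * v for b, v in zip(betas, xi)))
            g0 += err
            for j, v in enumerate(xi):
                g[j] += err * v
        beta0 += lr * g0 / m
        betas = [b + lr * gj / m for b, gj in zip(betas, g)]
    return beta0, betas


# Toy data: one normalized index; labels are not perfectly separable.
X = [[0.1], [0.2], [0.3], [0.6], [0.4], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
beta0, betas = fit_logistic(X, y)
print(beta0, betas)
```

A statistics package would normally be used for this step; the loop is shown only to make explicit what "fitting the weight of each evaluation index" means.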
Experiment two: according to the grid method, firstly, a research area is divided into 5983872 grids according to a grid dividing formula, then historical disaster points are overlapped with the grids, the number of the disaster points in each grid is calculated by using a subarea statistical tool, all the disaster grids are counted according to the statistic, 11032 disaster grids are counted, 80% (8826 grids) of the historical disaster points are randomly extracted for generating a model, the remaining 20% (2206 grids) of the historical disaster points are used for model verification, the disaster grids are subtracted from all the grids to obtain non-disaster grids, then the same number (8826) of the non-disaster grids are randomly selected from the non-disaster grids, the average value of evaluation indexes in the grids is extracted together with the disaster grids for regression analysis, the weight of each evaluation index is obtained, linear weighting operation is carried out on each evaluation index according to the average value, and the operation result is displayed in a grading mode to obtain a final evaluation result.
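The sampling steps of experiment two can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names, the square cell indexing by integer division, the fixed random seed and the toy 10 x 10 grid are all hypothetical, and the set of disaster grids is passed in as already determined.

```python
import random
from collections import Counter


def count_points_per_cell(points, cell_size):
    """Count event points per equal-size grid cell, keyed by (column, row)."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


def split_and_balance(all_cells, disaster_cells, train_fraction=0.8, seed=0):
    """Split disaster grids into modelling and verification sets, then draw
    an equal number of non-disaster grids for the modelling set."""
    rng = random.Random(seed)
    disaster = sorted(disaster_cells)
    rng.shuffle(disaster)
    n_train = round(train_fraction * len(disaster))
    train_disaster = disaster[:n_train]
    verify_disaster = disaster[n_train:]
    # Non-disaster grids are all grids minus the disaster grids.
    non_disaster = [cell for cell in all_cells if cell not in disaster_cells]
    train_non_disaster = rng.sample(non_disaster, n_train)
    return train_disaster, verify_disaster, train_non_disaster


# Toy data: three points on a grid with 10-unit cells.
counts = count_points_per_cell([(5, 5), (12, 3), (14, 4)], cell_size=10)

# Toy 10 x 10 grid in which the two bottom rows are disaster grids.
cells = [(c, r) for c in range(10) for r in range(10)]
disaster = {(c, r) for c in range(10) for r in range(2)}
train_d, verify_d, train_n = split_and_balance(cells, disaster)
```

Applied to the 11032 disaster grids reported above, `round(0.8 * 11032)` gives 8826 modelling grids and leaves 2206 for verification, matching the figures in the text.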
Result verification:
Method one: the evaluation results generated by the two experiments are overlaid with their respective disaster points and disaster grids, the evaluation result values at the disaster points and disaster grids are extracted, and ROC analysis is performed on these values against the known value of '1' to obtain the ROC curves and AUC values, as shown in Fig. 4 and Fig. 5. The AUC value is the area under the ROC curve; the larger it is, the more accurate the simulation result, so the accuracy of each method can be judged from its AUC value.
Method two: the remaining 20% of the disaster points and disaster grids are overlaid on the evaluation result maps, and the proportion of disaster points and disaster grids that are correctly predicted is counted, as shown in Table 1.
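As a rough illustration of the AUC calculation, the sketch below computes the area under the ROC curve as the fraction of (disaster, non-disaster) pairs in which the disaster sample receives the higher evaluation value, counting ties as one half. It assumes that evaluation values for non-disaster samples (label 0) are available in addition to the disaster samples (label 1), since an ROC curve needs both classes. The function name and the example scores are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of the two classes."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ordering
    return wins / (len(positives) * len(negatives))


# Hypothetical evaluation values: three disaster and three non-disaster samples.
example_scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(example_scores, example_labels)
```

In this toy example eight of the nine pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic in the sample count; for sets as large as those in experiment two, a rank-based calculation would be the more practical choice.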
TABLE 1 number and ratio of disaster points in different susceptibility levels
Figure BDA0001971528500000111
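A tabulation of the kind shown in Table 1 could be produced with a short Python sketch such as the one below. The function names and the example probabilities are hypothetical and do not reproduce the patent's data; the thresholds follow the grades given earlier in the description.

```python
from collections import Counter


def level_of(p, low=0.3, high=0.7):
    """Susceptibility grade for a probability, using the 0.3 / 0.7 thresholds."""
    if p > high:
        return "high"
    if p >= low:
        return "medium"
    return "low"


def level_distribution(probabilities):
    """Return {grade: (count, ratio)} for held-out disaster points or grids."""
    counts = Counter(level_of(p) for p in probabilities)
    total = len(probabilities)
    return {
        name: (counts[name], counts[name] / total if total else 0.0)
        for name in ("high", "medium", "low")
    }


# Hypothetical evaluation values at five held-out disaster points.
held_out = [0.92, 0.81, 0.75, 0.55, 0.12]
table = level_distribution(held_out)
```

A larger share of held-out disaster points falling in the high grade indicates a better prediction, which is how the two experiments are compared.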
Comparing the two methods shows that, under the same experimental conditions, the grid method achieves a higher AUC and a higher prediction accuracy than the best prior-art method.
A series of experiments shows that the high-precision binary Logistic regression evaluation method outperforms the existing comparable techniques.
In the scheme of the embodiment of the invention, the area to be evaluated is divided into grids of the same size; the number of first event points in each grid is acquired, and whether the grid is a first grid and whether its surrounding grids are first grids is determined according to that number, each first event point corresponding to a first evaluation index; a first grid evaluation index is acquired according to the first evaluation index in the first grid; and the probability of the first event in the evaluation area is calculated according to the first grid evaluation index and the Logistic regression model. The invention uses grids in place of disaster points and non-disaster points for binary Logistic regression analysis, which improves the precision of the simulated regression result. The property of each disaster grid is determined by counting the number and positions of the disaster points in the grid, and regression analysis is performed on that property. Compared with the conventional method of performing binary Logistic regression on disaster points and random non-disaster points, this calculates the weight of each evaluation index more accurately and achieves higher evaluation precision.
Those skilled in the art will understand that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
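The grid-labelling step and the grid evaluation index can be sketched in Python as follows. The sketch follows the rule stated in the claims (a grid containing N first event points marks itself and the N surrounding layers of grids as disaster grids, and the grid index is the mean of the maximum and minimum index values). Interpreting a "layer" as a square ring of cells is an assumption, and the function names and example values are hypothetical.

```python
def label_disaster_cells(point_counts, n_cols, n_rows):
    """Return the set of disaster cells for a grid of n_cols x n_rows cells.

    point_counts maps (column, row) to the number of event points in that cell.
    A cell with N points marks itself and the N surrounding layers of cells.
    """
    disaster = set()
    for (col, row), n in point_counts.items():
        if n <= 0:
            continue
        for dc in range(-n, n + 1):
            for dr in range(-n, n + 1):
                c, r = col + dc, row + dr
                # Skip neighbours that fall outside the study area.
                if 0 <= c < n_cols and 0 <= r < n_rows:
                    disaster.add((c, r))
    return disaster


def grid_evaluation_index(values):
    """Mean of the maximum and minimum index values found in one grid."""
    return (max(values) + min(values)) / 2.0


# One point in the centre of a 5 x 5 grid marks a 3 x 3 block of cells.
centre_block = label_disaster_cells({(2, 2): 1}, n_cols=5, n_rows=5)

# Two points in a corner cell mark two layers, clipped at the boundary.
corner_block = label_disaster_cells({(0, 0): 2}, n_cols=5, n_rows=5)

# Hypothetical slope values at three event points in one grid.
slope_index = grid_evaluation_index([10.0, 30.0, 14.0])
```

Expanding each disaster point into a neighbourhood of grids is what lets the 1108 points of the experiments give rise to a much larger number of disaster grids for regression.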
In addition, each functional unit in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (14)

1. An evaluation method based on a Logistic regression model is characterized by comprising the following steps:
dividing the area to be evaluated into grids with the same size;
acquiring the number of first event points in the grid, and determining whether the grid is a first grid or not and whether surrounding grids are first grids or not according to the number of the first event points in the grid; the first event point corresponds to a first evaluation index;
acquiring a first grid evaluation index according to the first evaluation index in the first grid;
and calculating the probability of the first event in the evaluation area according to the first grid evaluation index and the Logistic regression model.
2. The evaluation method of claim 1, wherein determining whether the grid is a first grid and whether its surrounding grids are first grids according to the number of first event points in the grid comprises:
if the number of first event points in the grid is N, determining the grid and all grids within the N layers of grids surrounding it to be disaster grids.
3. The evaluation method of claim 2, wherein the first event point comprises a geological disaster point.
4. The evaluation method of claim 3, wherein the first evaluation index comprises one or more of slope, elevation, distance from river, annual average precipitation, normalized vegetation index, distance from road, distance from structure, and formation lithology.
5. The evaluation method according to claim 1 or 4, wherein the acquiring a first grid evaluation index according to the first evaluation index in the first grid comprises:
acquiring the mean of the maximum value and the minimum value of the first evaluation index corresponding to the first event points in the first grid as the first grid evaluation index.
6. The evaluation method according to claim 1 or 4, wherein the calculating the probability of the first event in the evaluation area according to the first grid evaluation index and the Logistic regression model comprises:
Figure FDA0001971528490000011
the Logistic regression model is:
Figure FDA0001971528490000012
Figure FDA0001971528490000021
where P represents the probability of occurrence of the first event within the evaluation area, Xi represents a first evaluation index, and β0, β1, …, βi are the logistic regression coefficients.
7. The evaluation method according to claim 1, wherein the dividing the area to be evaluated into grids with the same size comprises: determining the size of the grid according to the scale of the area to be evaluated.
8. An evaluation system based on Logistic regression model, the system comprising:
the dividing unit is used for dividing the area to be evaluated into grids with the same size;
the determining unit is used for acquiring the number of the first event points in the grid and determining whether the grid is the first grid or not and whether the surrounding grid is the first grid or not according to the number of the first event points in the grid; the first event point corresponds to a first evaluation index;
the index acquisition unit is used for acquiring a first grid evaluation index according to the first evaluation index in the first grid;
and the model calculation unit is used for calculating the probability of the first event in the evaluation area according to the first grid evaluation index and the Logistic regression model.
9. The evaluation system of claim 8, wherein the determining unit determining whether the grid is a first grid and whether its surrounding grids are first grids according to the number of first event points in the grid comprises:
if the number of first event points in the grid is N, determining the grid and all grids within the N layers of grids surrounding it to be disaster grids.
10. The evaluation system of claim 9, wherein the first event point comprises a geological disaster point.
11. The evaluation system of claim 10, wherein the first evaluation index comprises one or more of slope, elevation, distance from river, annual average precipitation, normalized vegetation index, distance from road, distance from structure, and formation lithology.
12. The evaluation system according to claim 8 or 11, wherein the index acquisition unit is used for:
acquiring the mean of the maximum value and the minimum value of the first evaluation index corresponding to the first event points in the first grid as the first grid evaluation index.
13. The evaluation system according to claim 8 or 11, wherein the model calculation unit includes:
Figure FDA0001971528490000031
the Logistic regression model is:
Figure FDA0001971528490000032
Figure FDA0001971528490000033
where P represents the probability of occurrence of the first event within the evaluation area, Xi represents a first evaluation index, and β0, β1, ..., βi are the logistic regression coefficients.
14. The evaluation system according to claim 8, wherein the dividing unit is used for: determining the size of the grid according to the scale of the area to be evaluated.
CN201910119929.9A 2019-02-18 2019-02-18 Evaluation method and system based on LOGISTIC regression model Pending CN111582615A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910119929.9A CN111582615A (en) 2019-02-18 2019-02-18 Evaluation method and system based on LOGISTIC regression model

Publications (1)

Publication Number Publication Date
CN111582615A true CN111582615A (en) 2020-08-25

Family

ID=72110752

Country Status (1)

Country Link
CN (1) CN111582615A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107358327A (en) * 2017-07-21 2017-11-17 重庆大学 Landslide liability assessment method based on unmanned aerial vehicle remote sensing images
CN107957982A (en) * 2017-12-05 2018-04-24 中国科学院遥感与数字地球研究所 Secondary Geological Hazards liability fast evaluation method and system after shake
CN108597189A (en) * 2018-04-24 2018-09-28 河海大学 Small watershed geological disaster and flood warning method in distribution based on Critical Rainfall
CN109118265A (en) * 2018-06-27 2019-01-01 阿里巴巴集团控股有限公司 Commercial circle determines method, apparatus and server
CN109165424A (en) * 2018-08-03 2019-01-08 四川理工学院 A kind of landslide assessment of easy generation method based on domestic GF-1 satellite data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
慕金波等编著: "《崂山生态与保护》", vol. 1, 山东大学出版社, pages: 156 - 158 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113379200A (en) * 2021-05-25 2021-09-10 中国地质大学(武汉) Method and device for determining geological disaster susceptibility evaluation
CN113379200B (en) * 2021-05-25 2022-08-16 中国地质大学(武汉) Method and device for determining geological disaster susceptibility evaluation
NL2031633A (en) * 2021-05-25 2022-12-08 Univ China Geosciences Wuhan Method and Device for Geohazard Susceptibility Mapping

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination