CN113901348A - Oncomelania snail distribution influence factor identification and prediction method based on mathematical model - Google Patents

Publication number
CN113901348A
CN113901348A CN202111324115.2A CN202111324115A CN113901348A CN 113901348 A CN113901348 A CN 113901348A CN 202111324115 A CN202111324115 A CN 202111324115A CN 113901348 A CN113901348 A CN 113901348A
Authority
CN
China
Prior art keywords
time
oncomelania
space
distribution
data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111324115.2A
Other languages
Chinese (zh)
Inventor
杨坤
王喆
蒋甜甜
施亮
刘璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Institute of Parasitic Diseases
Original Assignee
Jiangsu Institute of Parasitic Diseases
Application filed by Jiangsu Institute of Parasitic Diseases
Priority to CN202111324115.2A
Publication of CN113901348A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/904 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Abstract

The invention discloses a method, based on a mathematical model, for identifying the factors that influence Oncomelania snail distribution and for predicting that distribution. The method comprises the following steps: establishing a spatiotemporal database of snail distribution and attribute data; using an ellipsoidal coordinate system to balance the measurement units of time and space, and defining a spatiotemporal distance by analogy with the Euclidean distance formula in three-dimensional space; substituting this distance into a bi-square kernel function to construct a spatiotemporal weight matrix for each snail distribution point, and selecting the "optimal" bandwidth according to the Akaike information criterion; and constructing a geographically and temporally weighted regression (GTWR) model to obtain the regression parameters, visualizing how those parameters vary over time and space, and analyzing the underlying patterns to obtain predicted values of snail distribution. The main advantage of the method is that it incorporates the temporal non-stationarity of snail survey data into the conventional snail density prediction and analysis model, which improves the goodness of fit, supports quantitative analysis of the factors influencing snail distribution in a region, and can markedly improve prediction accuracy.

Description

Oncomelania snail distribution influence factor identification and prediction method based on mathematical model
Technical Field
The invention relates to the technical field of mathematical models, in particular to a method for identifying and predicting oncomelania distribution influence factors based on a mathematical model.
Background
Schistosomiasis is a disease closely related to biological, environmental and socioeconomic factors. It has multiple transmission links and complex epidemic factors, and its epidemiological characteristics include a distinct season of susceptibility, uncertain environments of infection, concentrated high-risk groups, and modes of infection closely tied to how residents work and live. Oncomelania hupensis is the only intermediate host of Schistosoma japonicum, and its breeding and distribution directly affect the prevalence and spread of schistosomiasis. The distribution of Oncomelania hupensis is distinctly regional. The factors currently thought to influence its breeding, distribution and spread are social and environmental; the environmental factors mainly concern weather, terrain and vegetation, such as water level, sunlight, elevation and vegetation cover. The spatial environment in which the snails live is complex. Field snail surveys are costly and consume a great deal of manpower and material resources; the snails are mostly found in complex settings such as shoals and water systems, surveys are constrained by environment and climate, and some environmental factors are hard to collect in the field. Spatial technologies such as remote sensing can extract the relevant environmental factors accurately, which provides a feasible way to survey and predict snail distribution and potential schistosomiasis-endemic areas.
Most current studies of snail distribution analyze the relationship between changes in distribution and environmental factors with conventional statistical methods, namely correlation analysis and (simple or multiple linear) regression such as ordinary least squares (OLS). These models are too simple to handle complex spatial problems. Geographically weighted regression (GWR) adds spatial information to least squares regression by including geographic coordinates in the model. It can therefore reveal spatial heterogeneity in the relationship between snail density and the influencing factors, that is, the relationship may differ from place to place, which makes it well suited to spatial data. Snail survey data, however, are not purely spatial: besides having individual geographic coordinates, a spatial distribution and interdependence between neighboring areas, they also carry a time attribute. Most existing analysis methods and models consider only spatial variation and cannot incorporate temporal variation, so their results lose the temporal information and do not faithfully reflect the snail distribution data.
In order to solve the problems, the invention provides a method for identifying and predicting oncomelania distribution influence factors based on a mathematical model.
Disclosure of Invention
Based on the technical problems in the background art, the invention provides a method for identifying and predicting oncomelania distribution influence factors based on a mathematical model.
The invention provides a method for identifying and predicting oncomelania distribution influence factors based on a mathematical model, which comprises the following steps:
Step one: establish a spatiotemporal database of snail distribution and environmental factors. Collect snail distribution data for the target area as the basic data, and obtain the relevant environmental factors for the target area as attribute data. Join both datasets to a vector map and import them together into ArcGIS software to build a spatiotemporal database of snail habitats.
Step two: construct a geographically and temporally weighted regression (GTWR) model and compute its spatiotemporal characteristics. Build the GTWR model on the data in the spatiotemporal database from step one, run the regression analysis to obtain the coefficients relating each environmental factor to snail distribution, and then predict snail distribution from the regression.
Step three: visualize the spatiotemporal distribution of the coefficients describing how environmental factors influence snail distribution. Based on the output of the GTWR model, visualize the regression parameters over space and over time separately, revealing how they vary within a given spatiotemporal range.
Preferably, in step one, the snail distribution data collected for the target area as basic attribute data mainly include the snail habitat name, the latitude and longitude of the habitat, survey time, snail-infested area, number of survey frames, snail density and habitat type. The environmental factors related to snail breeding in the target area include rainfall, sunshine duration, normalized difference vegetation index (NDVI), vegetation coverage, land surface temperature, humidity, water level, flooding duration and elevation. Rainfall and sunshine duration data are downloaded from the national meteorological science data center; NDVI, vegetation coverage, land surface temperature, humidity and similar indices are obtained by downloading satellite imagery of the target area, preprocessing it and performing inversion; hydrological data and DEM elevation data are requested from the local water resources department and the geological surveying and mapping institute, respectively. To remove the effect of differing units on the fit, all data are normalized.
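The text does not say which normalization is used. As a minimal sketch, assuming z-score standardization (an assumption; min-max scaling would serve the same purpose), each environmental factor could be rescaled as follows. The field names and values are hypothetical and only illustrate the data layout.

```python
from statistics import mean, pstdev


def z_score(values):
    """Rescale a list of numbers to zero mean and unit (population) standard deviation."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Hypothetical records: one dict per snail habitat survey.
records = [
    {"rainfall_mm": 1020.0, "ndvi": 0.42, "elevation_m": 5.1},
    {"rainfall_mm": 980.0, "ndvi": 0.55, "elevation_m": 6.3},
    {"rainfall_mm": 1100.0, "ndvi": 0.37, "elevation_m": 4.8},
]
factors = ["rainfall_mm", "ndvi", "elevation_m"]

# Normalize each factor column independently, then rebuild the records.
columns = {f: z_score([r[f] for r in records]) for f in factors}
normalized = [{f: columns[f][i] for f in factors} for i in range(len(records))]
```

After this step every factor is unitless, so regression coefficients for factors measured in millimetres, metres or index values become comparable in magnitude.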
Preferably, in step two, the GTWR model is a regression algorithm based on local spatial statistics: a regression model is solved separately at each independently sampled analysis point, giving regression coefficients that correspond one-to-one to spatial and temporal positions. Because the coefficients vary with spatiotemporal position, they can quantify spatiotemporal non-stationarity. The model is constructed as follows:
i. Extract the coordinates $u$ and $v$ of the snail points in the target area from the spatial database; $y$ denotes snail density, $t$ denotes time, $n$ denotes the number of snail points in the area, and $x_i$ denotes an environmental factor.
ii. Following the approach of measuring the proximity between a regression point and its surrounding observation points in an ellipsoidal coordinate system, integrate the temporal and spatial distances directly into a spatiotemporal distance function, and compute the spatiotemporal distance with the following formula:
$$(d_{ij}^{ST})^2 = \lambda\left[(u_i-u_j)^2+(v_i-v_j)^2\right] + \mu\,(t_i-t_j)^2$$
where $\lambda$ and $\mu$ are the scale factors for the spatial distance and the temporal distance, respectively, and are obtained by minimizing the regression residual sum of squares;
iii. Construct a spatial weight matrix $W^{S}$ and a temporal weight matrix $W^{T}$ from the distances, then combine them into a spatiotemporal weight matrix $W^{ST} = W^{S} \times W^{T}$. The kernel is the bi-square function:
$$w_{ij} = \begin{cases} \left[1-\left(d_{ij}^{ST}/h_{ST}\right)^2\right]^2, & d_{ij}^{ST} < h_{ST} \\ 0, & \text{otherwise} \end{cases}$$
where $d_{ij}^{ST}$ is the spatiotemporal distance between location $i$ and location $j$, and $h_{ST}$ is the bandwidth; the weight decays with distance. The invention uses a variable bandwidth, selected by the Akaike information criterion (AIC):
$$\mathrm{AIC}_c = 2n\ln(\hat{\sigma}) + n\ln(2\pi) + n\,\frac{n+\operatorname{tr}(S)}{n-2-\operatorname{tr}(S)}$$
where $\hat{\sigma}$ is the estimated standard deviation of the model error, $\operatorname{tr}(S)$ is the trace of the hat matrix $S$, and $\mathrm{AIC}_c$ is the corrected AIC value; the "optimal" bandwidth is the one that minimizes $\mathrm{AIC}_c$;
iv. On the basis of the spatiotemporal weight matrix and the "optimal" bandwidth $h_{ST}$, construct the GTWR model:
$$y_i = \beta_0(u_i,v_i,t_i) + \sum_{k}\beta_k(u_i,v_i,t_i)\,X_{ik} + \varepsilon_i$$
where, for each observation $i$, $y_i$ is the dependent variable and $X_{ik}$ is the $k$-th explanatory variable; $(u_i,v_i,t_i)$ are the spatiotemporal coordinates of observation $i$, with $u_i$ and $v_i$ the projected spatial coordinates and $t_i$ the time coordinate; $\beta_0(u_i,v_i,t_i)$ is the intercept, $\beta_k(u_i,v_i,t_i)$ is the $k$-th regression coefficient at the $i$-th data point, and $\varepsilon_i$ is a random error term.
Preferably, in step three, the spatial and temporal variation maps of the regression coefficients are produced with ArcGIS software: the spatial variation map shows how the coefficients differ between snail distribution areas at the same time, and the temporal variation map shows how the coefficients differ between time periods in the same snail distribution area. Visual interpretation of the GTWR results in ArcGIS is one of the key points of the invention: it displays the heterogeneity of the spatiotemporal relationship directly, so that its variation over time and space can be interpreted and analyzed in greater depth.
The invention has the following beneficial effects:
1. The main advantage of the method is that it incorporates the temporal non-stationarity of snail survey data into the conventional snail density prediction and analysis model, which improves the goodness of fit, provides more detail, supports quantitative analysis of the factors influencing snail distribution in a region, and can markedly improve prediction accuracy.
2. The snail survey spatial database is built in ArcGIS software and the model is run through a GTWR plug-in, which greatly improves computational efficiency, yields the variation of the coefficients over a given spatiotemporal range, and outputs the results directly in visual form for easy analysis and interpretation.
Drawings
FIG. 1 is a schematic diagram of the spatiotemporal distance in an ellipsoidal coordinate system in the mathematical-model-based method for identifying and predicting oncomelania distribution influence factors provided by the invention;
FIG. 2 is a schematic diagram of a fixed bandwidth in the mathematical-model-based method for identifying and predicting oncomelania distribution influence factors;
FIG. 3 is a schematic diagram of a variable bandwidth in the mathematical-model-based method for identifying and predicting oncomelania distribution influence factors.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
Referring to fig. 1-3, a method for identifying and predicting oncomelania distribution influence factors based on a mathematical model comprises the following steps:
Step one: establish a spatiotemporal database of snail distribution and environmental factors. Collect snail distribution data for the target area as the basic data, and obtain the relevant environmental factors for the target area as attribute data. Join both datasets to a vector map and import them together into ArcGIS software to build a spatiotemporal database of snail habitats.
The snail distribution data collected for the target area as basic attribute data mainly include the snail habitat name, the latitude and longitude of the habitat, survey time, snail-infested area, number of survey frames, snail density and habitat type. The environmental factors related to snail breeding in the target area include rainfall, sunshine duration, normalized difference vegetation index (NDVI), vegetation coverage, land surface temperature, humidity, water level, flooding duration and elevation. Rainfall and sunshine duration data are downloaded from the national meteorological science data center; NDVI, vegetation coverage, land surface temperature, humidity and similar indices are obtained by downloading satellite imagery of the target area (Sentinel-2, Landsat 8), preprocessing it and performing inversion; hydrological data and DEM elevation data are requested from the local water resources department and the geological surveying and mapping institute, respectively. All data are normalized to remove the effect of differing units on the fit.
In particular, the electronic base map of the study area is downloaded from the national basic geographic map database of the national basic information center (website: http://www.gscloud.cn/) and includes map data such as administrative divisions, county-level administrative divisions and boundaries, first-order rivers, and rivers of the third order and above. The downloaded vector map is joined with the basic attribute data on snail distribution and the attribute data on environmental factors and imported into ArcGIS software to build the spatiotemporal database of snail distribution in the study area.
Step two: construct a geographically and temporally weighted regression (GTWR) model and compute its spatiotemporal characteristics. Build the GTWR model on the data in the spatiotemporal database from step one, run the regression analysis to obtain the coefficients relating each environmental factor to snail distribution, and then predict snail distribution from the regression.
The GTWR model is a regression algorithm based on local spatial statistics: a regression model is solved separately at each independently sampled analysis point, giving regression coefficients that correspond one-to-one to spatial and temporal positions. Because the coefficients vary with spatiotemporal position, they can quantify spatiotemporal non-stationarity. The model is constructed as follows:
1. Extract the coordinates $u$ and $v$ of the snail points in the target area from the spatial database; $y$ denotes snail density, $t$ denotes time, $n$ denotes the number of snail points in the area, and $x_i$ denotes an environmental factor.
2. Following the approach of measuring the proximity between a regression point and its surrounding observation points in an ellipsoidal coordinate system, integrate the temporal and spatial distances directly into a spatiotemporal distance function, and compute the spatiotemporal distance with the following formula:
$$(d_{ij}^{ST})^2 = \lambda\left[(u_i-u_j)^2+(v_i-v_j)^2\right] + \mu\,(t_i-t_j)^2$$
where $\lambda$ and $\mu$ are the scale factors for the spatial distance and the temporal distance, respectively, and are obtained by minimizing the regression residual sum of squares;
3. Construct a spatial weight matrix $W^{S}$ and a temporal weight matrix $W^{T}$ from the distances, then combine them into a spatiotemporal weight matrix $W^{ST} = W^{S} \times W^{T}$. The kernel is the bi-square function:
$$w_{ij} = \begin{cases} \left[1-\left(d_{ij}^{ST}/h_{ST}\right)^2\right]^2, & d_{ij}^{ST} < h_{ST} \\ 0, & \text{otherwise} \end{cases}$$
where $d_{ij}^{ST}$ is the spatiotemporal distance between location $i$ and location $j$, and $h_{ST}$ is the bandwidth; the weight decays with distance. The invention uses a variable bandwidth, selected by the Akaike information criterion (AIC):
$$\mathrm{AIC}_c = 2n\ln(\hat{\sigma}) + n\ln(2\pi) + n\,\frac{n+\operatorname{tr}(S)}{n-2-\operatorname{tr}(S)}$$
where $\hat{\sigma}$ is the estimated standard deviation of the model error, $\operatorname{tr}(S)$ is the trace of the hat matrix $S$, and $\mathrm{AIC}_c$ is the corrected AIC value; the "optimal" bandwidth is the one that minimizes $\mathrm{AIC}_c$;
4. On the basis of the spatiotemporal weight matrix and the "optimal" bandwidth $h_{ST}$, construct the GTWR model:
$$y_i = \beta_0(u_i,v_i,t_i) + \sum_{k}\beta_k(u_i,v_i,t_i)\,X_{ik} + \varepsilon_i$$
where, for each observation $i$, $y_i$ is the dependent variable and $X_{ik}$ is the $k$-th explanatory variable; $(u_i,v_i,t_i)$ are the spatiotemporal coordinates of observation $i$, with $u_i$ and $v_i$ the projected spatial coordinates and $t_i$ the time coordinate; $\beta_0(u_i,v_i,t_i)$ is the intercept, $\beta_k(u_i,v_i,t_i)$ is the $k$-th regression coefficient at the $i$-th data point, and $\varepsilon_i$ is a random error term.
Step three: visualize the spatiotemporal distribution of the coefficients describing how environmental factors influence snail distribution. Based on the output of the GTWR model, visualize the regression parameters over space and over time separately, revealing how they vary within a given spatiotemporal range.
The spatial and temporal variation maps of the regression coefficients are produced with ArcGIS software: the spatial variation map shows how the coefficients differ between snail distribution areas at the same time, and the temporal variation map shows how the coefficients differ between time periods in the same snail distribution area. Visual interpretation of the GTWR results in ArcGIS is one of the key points of the invention: it displays the heterogeneity of the spatiotemporal relationship directly, so that its variation over time and space can be interpreted and analyzed in greater depth.
The specific implementation is as follows:
The invention identifies the factors influencing snail distribution and predicts that distribution by constructing a geographically and temporally weighted regression model. The main steps are:
1. Collect snail survey data for the target area from the responsible institution or data platform as basic attribute data, mainly including the spatial position of each snail habitat, habitat name, snail-infested area, number of survey frames, number of frames with snails, snail density and habitat type. Snail density is used as the dependent variable of the regression model.
Obtain the environmental factors of the target area related to snail breeding as the independent variables of the regression model, as detailed in the following table:
Figure BDA0003346324370000091
All data are normalized to remove the effect of differing units on the fit.
The electronic base map of the study area is downloaded from the national basic geographic map database of the national basic information center (website: http://www.gscloud.cn/) and includes map data such as administrative divisions, county-level administrative divisions and boundaries, first-order rivers, and rivers of the third order and above. The downloaded vector map is joined with basic attribute data such as the latitude and longitude of the snail habitats in the study area and imported into ArcGIS software. The Universal Transverse Mercator projection is used, with WGS 1984 UTM Zone 50N as the projected coordinate system for all data. A point layer of all snail habitats is obtained, which completes the spatial database of snail distribution in the study area.
2. Construct the GTWR model. As an extension of the GWR model, GTWR accounts not only for the spatial non-stationarity of geographic data but also brings the time effect into the computation, which improves the goodness of fit and yields more informative results. The model is constructed as follows:
First, compute the spatiotemporal distance.
The key to measuring spatiotemporal non-stationarity is to build a weight matrix that expresses how important each observation in the dataset is when estimating the parameters at location $i$. In general, the closer an observation is to $i$, the larger its weight, so each estimation point has its own weight matrix. Computing the spatiotemporal distance in an ellipsoidal coordinate system balances the temporal and spatial measurements well; that is, the proximity between points is represented on the surface of an ellipsoid.
Given a spatial distance $d^{S}$ and a temporal distance $d^{T}$, the two can be combined into a spatiotemporal distance $d^{ST}$:
$$d^{ST} = \lambda d^{S} + \mu d^{T} \qquad (1)$$
where $\lambda$ and $\mu$ are scale factors that balance the different effects of the spatial and temporal distances, which are measured in different units.
By analogy with the Euclidean distance in three-dimensional space, the spatiotemporal distance can be computed with the following formulas:
$$(d_{ij}^{S})^2 = (u_i-u_j)^2 + (v_i-v_j)^2 \qquad (2)$$
$$(d_{ij}^{T})^2 = (t_i-t_j)^2 \qquad (3)$$
$$(d_{ij}^{ST})^2 = \lambda\left[(u_i-u_j)^2+(v_i-v_j)^2\right] + \mu\,(t_i-t_j)^2 \qquad (4)$$
The distances between location $i$ and all observations are computed with formula (4) and are used to construct the weight function in the next step. In theory, if the observed data have no temporal variation, the parameter $\mu$ can be set to 0, so that the distance calculation reduces to that of the conventional GWR model. Likewise, if $\lambda$ is set to 0, only temporal non-stationarity is considered and the model reduces to the conventional temporally weighted regression (TWR) model.
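The distance calculation can be sketched in Python. This is an illustrative sketch rather than the patented implementation: it assumes the squared form commonly given for GTWR, in which the squared spatiotemporal distance is λ times the squared planar distance plus μ times the squared time difference. The function name and the sample coordinates are hypothetical.

```python
import math


def spatiotemporal_distance(p, q, lam=1.0, mu=1.0):
    """Distance between two observations p and q, each given as (u, v, t).

    u and v are projected spatial coordinates, t is the time coordinate.
    lam scales the spatial part and mu scales the temporal part (assumed form).
    """
    du = p[0] - q[0]
    dv = p[1] - q[1]
    dt = p[2] - q[2]
    return math.sqrt(lam * (du * du + dv * dv) + mu * dt * dt)


# Two hypothetical survey points: coordinates in km, time in years.
a = (0.0, 0.0, 2016.0)
b = (3.0, 4.0, 2018.0)

d_full = spatiotemporal_distance(a, b, lam=1.0, mu=1.0)
d_space_only = spatiotemporal_distance(a, b, lam=1.0, mu=0.0)  # GWR-like case
d_time_only = spatiotemporal_distance(a, b, lam=0.0, mu=1.0)   # TWR-like case
```

Setting `mu=0` leaves only the planar distance and setting `lam=0` leaves only the time difference, which mirrors the two degenerate cases described in the text.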
Second, build the spatiotemporal regression weight matrix.
At each regression point $i$, the model is solved by weighted linear least squares:
$$\hat{\beta}(u_i,v_i,t_i) = \left[X^{\top} W(u_i,v_i,t_i)\,X\right]^{-1} X^{\top} W(u_i,v_i,t_i)\,Y \qquad (5)$$
where $\hat{\beta}(u_i,v_i,t_i)$ is the vector of regression coefficients at the $i$-th data point and $W(u_i,v_i,t_i)$ is a diagonal matrix whose diagonal elements are the spatiotemporal weights of each data point with respect to regression point $i$. A spatial weight matrix $W^{S}$ and a temporal weight matrix $W^{T}$ are constructed and then combined into a spatiotemporal weight matrix:
$$W^{ST} = W^{S} \times W^{T}$$
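The local weighted least squares solve at a single regression point can be sketched as follows. This is a minimal pure-Python sketch under the assumption that the standard closed form (X'WX)^-1 X'Wy applies; the helper names `solve_linear` and `local_wls` and the sample data are hypothetical, and a production implementation would normally use a numerical library.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few effectively weighted points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def local_wls(x_rows, y, weights):
    """Weighted least squares at one regression point.

    x_rows: explanatory values per observation (without the intercept column).
    weights: spatiotemporal weight of each observation for this regression point.
    Returns [intercept, coefficient_1, ..., coefficient_p].
    """
    design = [[1.0] + list(row) for row in x_rows]
    p = len(design[0])
    xtwx = [[sum(w * d[r] * d[c] for w, d in zip(weights, design)) for c in range(p)]
            for r in range(p)]
    xtwy = [sum(w * d[r] * yi for w, d, yi in zip(weights, design, y)) for r in range(p)]
    return solve_linear(xtwx, xtwy)


# Hypothetical data lying exactly on y = 2 + 3x; nearer points get larger weights.
beta = local_wls([[0.0], [1.0], [2.0], [3.0]], [2.0, 5.0, 8.0, 11.0], [1.0, 0.8, 0.5, 0.1])
```

Repeating this solve with a different weight vector for every regression point is what produces coefficients that vary over space and time.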
In line with the first law of geography, the GTWR model assigns weights on the following principle: the smaller the spatiotemporal distance, the higher the weight, and the larger the distance, the lower the weight. The weight matrix is built with a bi-square kernel function, which measures the influence between observation points in the regression:
$$w_{ij} = \begin{cases} \left[1-\left(d_{ij}^{ST}/h_{ST}\right)^2\right]^2, & d_{ij}^{ST} < h_{ST} \\ 0, & \text{otherwise} \end{cases} \qquad (6)$$
where $d_{ij}^{ST}$ is the spatiotemporal distance between location $i$ and location $j$, and $h_{ST}$ is the bandwidth; the weight decays with distance. Together with the bandwidth, the bi-square kernel explicitly delimits the local neighborhood considered in the GTWR computation, which makes the model results easier to interpret. Substituting formula (4) into formula (6) gives the spatiotemporal weight matrix $W^{ST}$.
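The bi-square weighting can be illustrated with a short sketch, assuming the usual definition of the bi-square kernel (weight 1 at zero distance, falling smoothly to 0 at the bandwidth and staying 0 beyond it). The function name and sample distances are hypothetical.

```python
def bisquare_weight(distance, bandwidth):
    """Bi-square kernel weight for one observation.

    Returns [1 - (d/h)^2]^2 when d < h, and 0 for observations at or beyond
    the bandwidth, so only the local neighborhood contributes.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if distance >= bandwidth:
        return 0.0
    ratio = distance / bandwidth
    return (1.0 - ratio * ratio) ** 2


# Hypothetical spatiotemporal distances from one regression point, bandwidth 10.
distances = [0.0, 5.0, 10.0, 12.0]
weights = [bisquare_weight(d, 10.0) for d in distances]
```

The hard cut-off at the bandwidth is what gives the kernel its clearly delimited local range.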
The bandwidth is an important control parameter in the GTWR weight calculation and can be either fixed or variable. When the data points are unevenly dense, a fixed bandwidth leaves some local models with too few samples. To avoid this, a variable bandwidth is used: a number of nearest neighbors $N$ is defined, and the distance from the regression point to its $N$-th nearest neighbor is used as the bandwidth for that local model. The bandwidth is selected by the Akaike information criterion (AIC):
$$\mathrm{AIC}_c = 2n\ln(\hat{\sigma}) + n\ln(2\pi) + n\,\frac{n+\operatorname{tr}(S)}{n-2-\operatorname{tr}(S)}$$
where $\hat{\sigma}$ is the estimated standard deviation of the model error, $\operatorname{tr}(S)$ is the trace of the hat matrix $S$, and $\mathrm{AIC}_c$ is the corrected AIC value; the "optimal" bandwidth is the one with the smallest $\mathrm{AIC}_c$. Selecting the bandwidth by $\mathrm{AIC}_c$ gives a better optimization than selecting it by the CV value from cross-validation.
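The variable bandwidth and the selection criterion can be sketched as follows. Both functions are illustrative assumptions: the AICc expression is the form commonly used for GWR-type models, and taking the error standard deviation as the square root of RSS/n is one common convention that the text does not confirm. Function names and sample values are hypothetical.

```python
import math


def adaptive_bandwidth(distances, n_neighbors):
    """Variable bandwidth: distance from the regression point to its N-th nearest neighbor."""
    if not 1 <= n_neighbors <= len(distances):
        raise ValueError("n_neighbors must be between 1 and the number of observations")
    return sorted(distances)[n_neighbors - 1]


def aicc(residuals, trace_s):
    """Corrected AIC from model residuals and the trace of the hat matrix.

    Assumed form: 2n ln(sigma) + n ln(2 pi) + n (n + tr(S)) / (n - 2 - tr(S)),
    with sigma = sqrt(RSS / n) (an assumption, see the lead-in).
    """
    n = len(residuals)
    if n - 2 - trace_s <= 0:
        raise ValueError("model is too flexible for the sample size")
    rss = sum(e * e for e in residuals)
    sigma_hat = math.sqrt(rss / n)
    return (2 * n * math.log(sigma_hat)
            + n * math.log(2 * math.pi)
            + n * (n + trace_s) / (n - 2 - trace_s))


# Hypothetical distances from one regression point to four observations.
h = adaptive_bandwidth([5.0, 1.0, 3.0, 2.0], n_neighbors=2)

# With the same model complexity, smaller residuals give a lower (better) AICc.
score_good = aicc([0.1] * 20, trace_s=3.0)
score_poor = aicc([0.5] * 20, trace_s=3.0)
```

In practice one would fit the model for a range of candidate values of N, compute AICc for each fit, and keep the N with the smallest score.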
Third, construct the GTWR model and estimate the regression parameters.
Using the spatiotemporal regression weight matrix and the optimal bandwidth from the previous steps, the regression equation is constructed:
$$y_i = \beta_0(u_i,v_i,t_i) + \sum_{k}\beta_k(u_i,v_i,t_i)\,x_{ik} + \varepsilon_i$$
For each observation $i$, $y_i$ is the dependent variable and $x_{ik}$ is the $k$-th explanatory variable; $(u_i,v_i,t_i)$ are the spatiotemporal coordinates of observation $i$, with $u_i$ and $v_i$ the projected spatial coordinates and $t_i$ the time coordinate; $\beta_0(u_i,v_i,t_i)$ is the intercept, $\beta_k(u_i,v_i,t_i)$ is the $k$-th regression coefficient at the $i$-th data point, and $\varepsilon_i$ is a random error term.
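Once local coefficients are available, the fitted snail density at each point follows directly from the regression equation. The sketch below assumes the coefficients have already been estimated and are stored per observation; the function name and the numbers are hypothetical.

```python
def fitted_values(local_betas, x_rows):
    """Evaluate the local regression equation at every observation.

    local_betas[i] holds [intercept, beta_1, ..., beta_p] estimated at observation i;
    x_rows[i] holds that observation's explanatory values [x_i1, ..., x_ip].
    """
    if len(local_betas) != len(x_rows):
        raise ValueError("one coefficient vector is required per observation")
    result = []
    for beta, x in zip(local_betas, x_rows):
        result.append(beta[0] + sum(b * value for b, value in zip(beta[1:], x)))
    return result


# Two hypothetical observations with two normalized environmental factors each.
betas = [[1.0, 2.0, 0.0], [0.5, -1.0, 3.0]]
factors = [[1.0, 5.0], [2.0, 1.0]]
predicted = fitted_values(betas, factors)

# Residuals against hypothetical observed densities.
observed = [3.2, 1.4]
residuals = [obs - fit for obs, fit in zip(observed, predicted)]
```

Unlike a global regression, each observation is evaluated with its own coefficient vector, which is why the same factor can have a positive effect at one point and a negative effect at another.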
3. Visualize the spatiotemporal distribution of the coefficients.
The previous step yields, for each snail distribution point, the coefficients relating snail density to the environmental factors. As the analysis requires, a spatial distribution map and a time-series line chart of the coefficients are output with ArcGIS 10.2: the spatial variation map shows how the coefficients differ between areas within the same time period, and the temporal variation chart shows how the coefficient of an environmental factor at a given snail distribution point changes over time. The coefficients differ greatly over time and space, which provides detailed support for studying the relationship between snail density and environmental factors and is one of the important contributions of the invention.
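The text produces these outputs in ArcGIS. As a tool-independent sketch of the data preparation behind a time-series line chart, the local coefficients can be averaged per year; this is a substitute illustration, not the ArcGIS workflow, and the record layout, field names and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def coefficient_time_series(rows, factor):
    """Average the local coefficient of one factor per year, in chronological order."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row[factor])
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Hypothetical GTWR output: one row per habitat point and survey year.
rows = [
    {"site": "A", "year": 2017, "ndvi": -0.5},
    {"site": "A", "year": 2016, "ndvi": 0.25},
    {"site": "B", "year": 2016, "ndvi": 0.75},
]
series = coefficient_time_series(rows, "ndvi")
```

The resulting year-to-mean mapping can be passed to any charting tool; filtering `rows` to a single site instead of averaging gives the per-point time series described in the text.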
Example:
Data on re-emergent snail habitats along the Yangtze River marshlands in the three cities of Nanjing, Zhenjiang and Yangzhou from 2016 to 2020 were collected and, together with the following table, used to verify the method for identifying and predicting factors influencing oncomelania distribution based on the mathematical model.
1. As shown in the following table, data for 2016-2020 were collected, including attribute information such as snail habitat name, snail area, positive snail area, number of survey frames, number of snails, number of positive snails, year of first occurrence and year of last occurrence. In total, 221 re-emergent snail habitat distribution points on the river marshlands were counted.
Sentinel-2 and Landsat 8 satellite data for 2016-2020 were acquired, sorted and preprocessed in the remote sensing image processing platform ENVI, and NDVI, humidity, NDSI and LST were extracted. Elevation data at a resolution of 1 m were obtained from a geological surveying and mapping institute.
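For reference, NDVI is conventionally defined per pixel as (NIR − Red) / (NIR + Red). The patent performs this extraction in ENVI; the sketch below only illustrates the index on a pair of reflectance values. The function name and the handling of a zero denominator are assumptions made for this example.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel.

    nir, red -- surface reflectance in the near-infrared and red bands
                (band 5 and band 4 on Landsat 8; B8 and B4 on Sentinel-2)
    """
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat pixels with no signal as NDVI = 0
    return (nir - red) / total
```

Values close to 1 indicate dense green vegetation, and values near or below 0 indicate bare ground or water.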
A 1:4,000,000 electronic base map was downloaded from the national basic geographic map database of the national basic information center (website: http://www.gscloud.cn/), and the water system map and administrative division data of the Nanjing-Zhenjiang-Yangzhou area were extracted. The downloaded vector map was joined with the basic attribute data of the oncomelania distribution and the attribute data of the environmental factors and imported into ArcGIS software to establish the spatiotemporal database of oncomelania distribution in the study area.
2. Oncomelania density is taken as the dependent variable, and the five influencing factors NDVI, humidity, NDSI, LST and DEM are taken as independent variables. A three-dimensional space-time coordinate system is constructed, a space-time weight matrix is established based on the bi-square kernel function, the bandwidth is calculated using the Akaike information criterion (AIC), and the GTWR model is constructed. The AIC-based bandwidth calculation determines the number of nearest neighbors as N = 51.
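The weighting step can be sketched as below. Two assumptions are made: the variable bandwidth at a regression point is taken as the distance to its N-th nearest observation (a common adaptive scheme), and the bi-square kernel has its usual form w = (1 − (d/h)²)² for d < h and 0 otherwise. The patent gives its kernel only as a figure, and the function names are hypothetical.

```python
def adaptive_bandwidth(distances, n_neighbors):
    """Distance from a regression point to its n_neighbors-th nearest observation.

    distances -- space-time distances from the regression point to all
                 observations (the point itself may be included at distance 0)
    """
    ordered = sorted(distances)
    if not 1 <= n_neighbors <= len(ordered):
        raise ValueError("n_neighbors must be between 1 and len(distances)")
    return ordered[n_neighbors - 1]


def bisquare(distance, bandwidth):
    """Bi-square kernel weight: 1 at distance 0, falling to 0 at the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def weights_for_point(distances, n_neighbors):
    """Kernel weights of all observations for one regression point."""
    h = adaptive_bandwidth(distances, n_neighbors)
    return [bisquare(d, h) for d in distances]
```

With the nearest-neighbor count reported in the example, each regression point would call `weights_for_point(distances, 51)`, so that only its closest observations receive non-zero weight.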
3. The correlation coefficients between oncomelania density and the environmental factors in different years are calculated with the GTWR model. The study covers 221 water-network oncomelania distribution points and 5 parameters, so 221 × 5 regression parameters are obtained in total. The regression coefficients are summarized by five statistics (mean, median, standard deviation, minimum and maximum), as shown in the following table.
Figure BDA0003346324370000141
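A summary of this kind could be produced as sketched below, assuming the coefficients are held as one row per distribution point with one value per environmental factor. The sample standard deviation is used here; whether the patent's table uses the sample or the population form is not stated. The function names are illustrative.

```python
import statistics


def summarize(values):
    """Mean, median, sample standard deviation, minimum and maximum."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # assumption: sample (n - 1) form
        "min": min(values),
        "max": max(values),
    }


def summarize_coefficients(coefficients, factor_names):
    """Summarize each factor's regression coefficients across all points.

    coefficients -- list of rows, one per distribution point, each row holding
                    one coefficient per factor in the order of factor_names
    """
    return {
        name: summarize([row[k] for row in coefficients])
        for k, name in enumerate(factor_names)
    }
```

For the example above, `coefficients` would hold 221 rows of 5 values and `factor_names` would be `["NDVI", "humidity", "NDSI", "LST", "DEM"]`.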
The spatial and temporal variation maps of the regression coefficients are visualized with ArcGIS software in order to study in depth how the relationship between oncomelania distribution density and the environmental factors changes under different space-time conditions.
In the method for identifying and predicting factors influencing oncomelania distribution based on geographically and temporally weighted regression, snail survey data in the target area are collected as basic attribute data (including longitude and latitude information), and the downloaded vector map is joined with the basic attribute data. The environmental factors related to oncomelania breeding in the target area (rainfall, sunshine duration, normalized difference vegetation index, vegetation coverage, surface temperature, humidity, water level, flooding time and elevation) are obtained from different sources, standardized, and imported together into ArcGIS software to construct a spatial database of snail habitats. The proximity between a regression point and its surrounding observation points is measured in an ellipsoidal coordinate system: temporal and spatial distances are integrated directly into a space-time distance function, giving a space-time coordinate system in which the longitude and latitude of each snail point form the XY plane and time is the Z axis. The distance between every pair of space-time oncomelania distribution points is calculated in the ellipsoidal coordinate system, and from these distances a space-time regression weight matrix is constructed between each oncomelania distribution point and the other distribution points. The weight is inversely related to the distance between distribution points: the closer two points are, the greater their mutual influence. Using the space-time regression weight matrix and the optimal bandwidth obtained through AIC, regression equations between oncomelania density and the environmental factors are constructed for different regions and different time periods.
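The space-time distance described above can be sketched as follows, assuming the combined form used in the GTWR literature, d² = s·[(u_i − u_j)² + (v_i − v_j)²] + τ·(t_i − t_j)², where s and τ scale the spatial and temporal components. The patent's own formula appears only as a figure, so the parameter names `space_scale` and `time_scale` are stand-ins chosen for this example rather than the patent's symbols.

```python
import math


def st_distance(p, q, space_scale=1.0, time_scale=1.0):
    """Space-time distance between two points given as (u, v, t) tuples.

    u, v are projected spatial coordinates and t is the time coordinate;
    space_scale and time_scale balance the two kinds of distance.
    """
    du, dv, dt = p[0] - q[0], p[1] - q[1], p[2] - q[2]
    return math.sqrt(space_scale * (du * du + dv * dv) + time_scale * dt * dt)


def st_distance_matrix(points, space_scale=1.0, time_scale=1.0):
    """Pairwise space-time distances between all distribution points."""
    return [[st_distance(p, q, space_scale, time_scale) for q in points]
            for p in points]
```

Each row of the resulting matrix holds the distances from one regression point to every observation, which is the input a kernel function needs to produce that point's row of the weight matrix.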
Using the results of the GTWR model, spatial and temporal variation maps of the correlation coefficients between the environmental variables and oncomelania density are output. The spatial variation map shows the difference in correlation coefficients between different spatial regions at the same time; the temporal variation map shows the difference in correlation coefficients between different time periods in the same region.
Compared with the ordinary least squares (OLS) model and the geographically weighted regression (GWR) model, the method accounts for temporal and spatial non-stationarity simultaneously and reveals the space-time characteristics of the factors influencing oncomelania density. It overcomes the lack of a time variable in existing methods, improves fitting accuracy and extracts more space-time information, which is of great significance for exploring oncomelania distribution characteristics in depth and for planning schistosomiasis control measures more precisely.
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited thereto. Any equivalent replacement or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims (4)

1. A method for identifying and predicting factors influencing oncomelania distribution based on a mathematical model, characterized by comprising the following steps:
Step one: establishing a spatiotemporal database of oncomelania distribution and environmental factors. Oncomelania distribution data of the target area are collected as basic data, and the related environmental factors of the target area are obtained as attribute data. The vector map is joined with both data sets, and they are imported together into ArcGIS software to construct a spatiotemporal database of snail habitats.
Step two: constructing a geographically and temporally weighted regression (GTWR) model and calculating its space-time characteristics. A GTWR model is constructed for the data of the spatiotemporal database from step one, and regression analysis is performed to obtain the correlation coefficients between the environmental factors and the oncomelania distribution, from which the oncomelania distribution is further predicted.
Step three: visualizing the space-time distribution of the influence coefficients of the environmental factors on the oncomelania distribution. The regression parameters are visualized in space and in time according to the results of the GTWR model, revealing how they vary within a given space-time range.
2. The method as claimed in claim 1, wherein in step one, the oncomelania distribution data of the target area collected as basic attribute data mainly comprise attribute information such as snail habitat name, longitude and latitude coordinates of the snail habitat, snail survey time, snail area, number of survey frames, snail density and environment type. The environmental factors related to oncomelania breeding in the target area are obtained: rainfall, sunshine duration, normalized difference vegetation index, vegetation coverage, surface temperature, humidity, water level, flooding time, elevation and the like. Rainfall and sunshine duration data are downloaded from the national meteorological science data center; indices such as the normalized difference vegetation index, vegetation coverage, surface temperature and humidity are obtained by downloading satellite image data of the target area, preprocessing them and performing inversion; hydrological data and DEM elevation data are requested from the local water resources department and a geological surveying and mapping institute, respectively. All data are standardized to eliminate the effect of differing dimensions on the fitting results;
the electronic base map of the study area is downloaded from the national basic geographic map database of the national basic information center, and specifically comprises map data such as administrative divisions, county-level administrative divisions and boundaries, first-level rivers, and rivers of level three and above. The downloaded vector map is joined with the basic attribute data of the oncomelania distribution and the attribute data of the environmental factors and imported into ArcGIS software to establish the spatiotemporal database of oncomelania distribution in the study area.
3. The method as claimed in claim 1, wherein in step two, the GTWR model is a regression algorithm based on local spatial statistics; the regression model is solved at each independently sampled analysis point to obtain regression coefficients in one-to-one correspondence with spatial and temporal positions. The regression coefficients vary with space-time position and thus quantitatively characterize space-time non-stationarity. The model is constructed as follows:
i. extracting the coordinates u and v of the snail points in the target area from the spatial database, where y denotes oncomelania density, t denotes time, n denotes the number of snail points in the area, and x_i denotes an environmental factor.
Following the approach of measuring the proximity between a regression point and its surrounding observation points in an ellipsoidal coordinate system, the temporal and spatial distances are integrated directly into a space-time distance function, and the space-time distance is calculated with the following formula:
Figure FDA0003346324360000021
where λ and μ are the scale adjustment coefficients for the temporal and spatial distances, respectively, and are obtained by optimizing the residual sum of squares of the regression;
A distance-based spatial weighting matrix W_S and a temporal weighting matrix W_T are constructed and then combined to form the space-time weighting matrix W_ST = W_S × W_T. The kernel function is the bi-square function:
Figure FDA0003346324360000031
where
Figure FDA0003346324360000032
represents the measure of the spatial and temporal distance between location i and location j, and h_ST is the bandwidth. The invention adopts a variable bandwidth, selected by the Akaike information criterion:
Figure FDA0003346324360000033
where
Figure FDA0003346324360000034
is the estimate of the model standard deviation and tr(S) is the trace of the hat matrix S; AICc denotes the corrected AIC value, and the "optimal" bandwidth is the one that minimizes AICc;
On the basis of the space-time weighting matrix and the "optimal" bandwidth h_ST, the GTWR model is constructed as follows:
Figure FDA0003346324360000035
where, for each observation i, y_i is the dependent variable and x_ik is the k-th explanatory variable. (u_i, v_i, t_i) are the space-time coordinates of observation i: u_i and v_i are the projected spatial coordinates and t_i is the projected time coordinate. β_0(u_i, v_i, t_i) is the intercept term, β_k(u_i, v_i, t_i) is the k-th regression coefficient at the i-th data point, and ε_i is the random error term.
4. The method for identifying and predicting factors influencing oncomelania distribution based on a mathematical model as claimed in claim 1, wherein in step three, the spatial and temporal variation maps of the regression coefficients are visualized with ArcGIS software: the spatial variation map shows the difference in correlation coefficients between different oncomelania distribution areas at the same time; the temporal variation map shows the difference in correlation coefficients between different time periods in the same oncomelania distribution area.
CN202111324115.2A 2021-11-10 2021-11-10 Oncomelania snail distribution influence factor identification and prediction method based on mathematical model Pending CN113901348A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111324115.2A CN113901348A (en) 2021-11-10 2021-11-10 Oncomelania snail distribution influence factor identification and prediction method based on mathematical model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111324115.2A CN113901348A (en) 2021-11-10 2021-11-10 Oncomelania snail distribution influence factor identification and prediction method based on mathematical model

Publications (1)

Publication Number Publication Date
CN113901348A true CN113901348A (en) 2022-01-07

Family

ID=79193750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111324115.2A Pending CN113901348A (en) 2021-11-10 2021-11-10 Oncomelania snail distribution influence factor identification and prediction method based on mathematical model

Country Status (1)

Country Link
CN (1) CN113901348A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114580178A (en) * 2022-03-09 2022-06-03 中国科学院地理科学与资源研究所 Mosquito distribution prediction method, device, equipment and storage medium
CN116383774A (en) * 2023-06-06 2023-07-04 中国科学院精密测量科学与技术创新研究院 Positioning method for oncomelania living environment in water network area

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107103392A (en) * 2017-05-24 2017-08-29 北京航空航天大学 A kind of identification of bus passenger flow influence factor and Forecasting Methodology based on space-time Geographical Weighted Regression
CN110046771A (en) * 2019-04-25 2019-07-23 河南工业大学 A kind of PM2.5 concentration prediction method and apparatus
CN111210052A (en) * 2019-12-16 2020-05-29 天津职业技术师范大学(中国职业培训指导教师进修中心) Traffic accident prediction method based on mixed geography weighted regression
CN112990609A (en) * 2021-04-30 2021-06-18 中国测绘科学研究院 Air quality prediction method based on space-time bandwidth self-adaptive geographical weighted regression

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BO HUANG: "Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices", 《RESEARCHGATE》 *
刘毛毛: "太湖流域苏锡常地区有螺环境时空变化规律研究", 《中国血吸虫病防治杂志》 *
刘璐: "2015–2017年长江江苏段流域螺情时空分析", 《中国吸血虫病防治杂志》 *
孙阳: "基于夜间灯光数据的城镇化对血吸虫病的影响", 《中国优秀硕士学位论文全文数据库》 *
蒋甜甜,杨坤: "钉螺扩散规律与监测方法研究进展", 《中国血吸虫病防治杂志》 *
蒋甜甜: "京杭大运河丹阳段及丹金溧漕河沿线钉螺扩散研究", 《中国优秀硕士学位论文全文数据库》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114580178A (en) * 2022-03-09 2022-06-03 中国科学院地理科学与资源研究所 Mosquito distribution prediction method, device, equipment and storage medium
CN114580178B (en) * 2022-03-09 2022-08-30 中国科学院地理科学与资源研究所 Mosquito distribution prediction method, device, equipment and storage medium
CN116383774A (en) * 2023-06-06 2023-07-04 中国科学院精密测量科学与技术创新研究院 Positioning method for oncomelania living environment in water network area
CN116383774B (en) * 2023-06-06 2023-09-12 中国科学院精密测量科学与技术创新研究院 Positioning method for oncomelania living environment in water network area

Similar Documents

Publication Publication Date Title
Wadoux Using deep learning for multivariate mapping of soil with quantified uncertainty
Jat et al. Modelling of urban growth using spatial analysis techniques: a case study of Ajmer city (India)
Panagos et al. Estimating the soil organic carbon content for European NUTS2 regions based on LUCAS data collection
Jin et al. Downscaling AMSR-2 soil moisture data with geographically weighted area-to-area regression kriging
Murphy et al. Representing genetic variation as continuous surfaces: an approach for identifying spatial dependency in landscape genetic studies
CN105243435B (en) A kind of soil moisture content prediction technique based on deep learning cellular Automation Model
Rocchini et al. Remotely sensed spatial heterogeneity as an exploratory tool for taxonomic and functional diversity study
CN109493119B (en) POI data-based urban business center identification method and system
Falahatkar et al. Integration of remote sensing data and GIS for prediction of land cover map
CN114091613B (en) Forest biomass estimation method based on high-score joint networking data
CN113901348A (en) Oncomelania snail distribution influence factor identification and prediction method based on mathematical model
CN111937016B (en) City internal poverty-poor space measuring method and system based on street view picture and machine learning
DomaÇ et al. Integration of environmental variables with satellite images in regional scale vegetation classification
van Oort et al. Spatial variability in classification accuracy of agricultural crops in the Dutch national land-cover database
Gervasoni et al. Convolutional neural networks for disaggregated population mapping using open data
Lin et al. Geostatistical approaches and optimal additional sampling schemes for spatial patterns and future sampling of bird diversity
CN113255961A (en) Lake water environment monitoring site optimized layout method based on time sequence multi-source spectrum remote sensing data
Gu et al. Environmental monitoring and landscape design of green city based on remote sensing image and improved neural network
Tassinari et al. Wide-area spatial analysis: A first methodological contribution for the study of changes in the rural built environment
CN115345069A (en) Lake water volume estimation method based on maximum water depth record and machine learning
CN110716998A (en) Method for spatializing fine-scale population data
CN110321528B (en) Hyperspectral image soil heavy metal concentration assessment method based on semi-supervised geospatial regression analysis
CN115829163B (en) Multi-mode integration-based runoff prediction method and system for middle and lower reaches of Yangtze river
Mubea et al. Spatial effects of varying model coefficients in urban growth modeling in Nairobi, Kenya
CN105160065B (en) Remote sensing information method for evaluating similarity based on topological relation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination