CN110019633B - Line density statistical method and system based on ArcGIS secondary development - Google Patents


Info

Publication number
CN110019633B
Authority
CN
China
Prior art keywords
value
statistical
mth
result
radius
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810803200.9A
Other languages
Chinese (zh)
Other versions
CN110019633A (en)
Inventor
辜智慧
张艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University filed Critical Shenzhen University
Priority to CN201810803200.9A priority Critical patent/CN110019633B/en
Publication of CN110019633A publication Critical patent/CN110019633A/en
Application granted granted Critical
Publication of CN110019633B publication Critical patent/CN110019633B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Optimization (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Remote Sensing (AREA)
  • Artificial Intelligence (AREA)
  • Operations Research (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Algebra (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a line density statistical method and system based on ArcGIS secondary development, applied in the ArcGIS spatial analysis tool. The method counts the number of simple lines whose start point coordinate and end point coordinate both lie within the spatial region centered on the midpoint coordinate of the nth simple line with the value of the mth search radius as its radius, giving the nth first statistical result of the mth group; calculates the mth minimum clustering result with significance according to a statistical principle and the N first statistical results of the mth group; and performs density statistics on the N simple lines according to the M minimum clustering results and the N first statistical results of the mth group to obtain a density result set. Applied in the ArcGIS spatial analysis tool, the method and system can solve the technical problem that the tool lacks a line density statistical tool.

Description

Line density statistical method and system based on ArcGIS secondary development
Technical Field
The invention relates to the field of Origin-Destination (OD) line processing, in particular to a line density statistical method and system based on ArcGIS secondary development.
Background
The OD matrix is one of the most important sources of information for strategic planning and traffic network management. Traditionally, city planning and traffic engineering relied on household questionnaires or censuses conducted every 5-10 years, together with road surveys, to develop OD matrix estimation methods. In recent years, improvements in big data and tracking facilities have made it possible to collect large amounts of travel data for moving objects. However, because OD flows intersect and overlap heavily, the approach used in previous OD matrix studies, point statistics on administrative or traffic spatial units, quickly becomes illegible as the amount of data increases.
ArcGIS, a full-featured geospatial analysis package, offers only weak support for processing OD lines. Only two related algorithms are available: line density analysis, and OD matrix calculation or trajectory generation. The former calculates the length of OD lines within each grid cell, according to the grid settings, as the OD line density; its results are very hard to interpret because it counts a virtual passing frequency and length rather than the actual route. For the latter, OD matrix calculation is simple to implement but only adds some related attributes to the OD lines and cannot express spatial relationships clearly, especially when massive data are analyzed. OD trajectory generation is relatively difficult to implement in ArcGIS, although extracting actual trajectories is important when the research problem relates to traffic analysis. However, when the concern is spatial relationships or areas of particular interest, such as finding the most closely related employment and residential centers, an OD line density statistical tool is more important, and this is what the ArcGIS spatial analysis tool lacks.
Disclosure of Invention
The main object of the invention is to provide a line density statistical method and system that can solve the technical problem that the ArcGIS spatial analysis tool in the prior art lacks a line density statistical tool.
In order to achieve the above object, a first aspect of the present invention provides a line density statistical method based on ArcGIS secondary development, wherein the method is applied in an ArcGIS spatial analysis tool, and the method includes:
step 101, acquiring a simple line data set, and calculating the value of the mth search radius by a preset radius formula from the value of the minimum search radius, the value of the radius increment and the value of the number of cycles, wherein the simple line data set comprises N simple lines each having a start point coordinate and an end point coordinate, N is a positive integer, m takes the values 1 to M in sequence, and M is a positive integer;
step 102, counting the number of simple lines whose start point coordinate and end point coordinate both lie within the spatial region centered on the midpoint coordinate of the nth simple line with the value of the mth search radius as its radius, to obtain the nth first statistical result of the mth group, wherein n takes the values 1 to N in sequence;
step 103, calculating the mth minimum clustering result with significance according to a statistical principle and the N first statistical results of the mth group;
and step 104, performing density statistics on the N simple lines according to the M minimum clustering results and the N first statistical results of the mth group to obtain a density result set.
In order to achieve the above object, a second aspect of the present invention provides a line density statistical system based on ArcGIS secondary development, wherein the system is applied in an ArcGIS spatial analysis tool, and the system comprises:
the acquisition and calculation module is used for acquiring a simple line data set and calculating the value of the mth search radius by a preset radius formula from the value of the minimum search radius, the value of the radius increment and the value of the number of cycles, wherein the simple line data set comprises N simple lines each having a start point coordinate and an end point coordinate, N is a positive integer, m takes the values 1 to M in sequence, and M is a positive integer;
the first statistical module is used for counting the number of simple lines whose start point coordinate and end point coordinate both lie within the spatial region centered on the midpoint coordinate of the nth simple line with the value of the mth search radius as its radius, to obtain the nth first statistical result of the mth group, wherein n takes the values 1 to N in sequence;
the calculation module is used for calculating the mth minimum clustering result with significance according to a statistical principle and the N first statistical results of the mth group;
and the density statistics module is used for performing density statistics on the N simple lines according to the M minimum clustering results and the N first statistical results of the mth group to obtain a density result set.
The invention provides a line density statistical method and system based on ArcGIS secondary development, applied in the ArcGIS spatial analysis tool. The method counts the number of simple lines whose start point coordinate and end point coordinate both lie within the spatial region centered on the midpoint coordinate of the nth simple line with the value of the mth search radius as its radius, giving the nth first statistical result of the mth group; calculates the mth minimum clustering result with significance according to a statistical principle and the N first statistical results of the mth group; and performs density statistics on the N simple lines according to the M minimum clustering results and the N first statistical results of the mth group to obtain a density result set. Applied in the ArcGIS spatial analysis tool, the method and system can solve the technical problem that the tool lacks a line density statistical tool.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a line density statistical method based on ArcGIS secondary development according to a first embodiment of the present invention;
FIG. 2 is a schematic flow chart illustrating a refinement step of step 103 in the first embodiment of the present invention;
FIG. 3 is a schematic flow chart illustrating a refinement step of step 104 in the first embodiment of the present invention;
FIG. 4 is a schematic diagram of a spatial region formed based on an nth simple line and an mth search radius according to a first embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a line density statistical system based on ArcGIS secondary development according to a second embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a refinement module of the calculation module 203 in the second embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a refinement module of the density statistics module 204 in the second embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The prior art has the technical problem that the ArcGIS spatial analysis tool lacks a line density statistical tool.
In order to solve this technical problem, the invention provides a line density statistical method and system based on ArcGIS secondary development, applied in the ArcGIS spatial analysis tool. The method counts the number of simple lines whose start point coordinate and end point coordinate both lie within the spatial region centered on the midpoint coordinate of the nth simple line with the value of the mth search radius as its radius, giving the nth first statistical result of the mth group; calculates the mth minimum clustering result with significance according to a statistical principle and the N first statistical results of the mth group; and performs density statistics on the N simple lines according to the M minimum clustering results and the N first statistical results of the mth group to obtain a density result set. Applied in the ArcGIS spatial analysis tool, the method and system can solve the technical problem that the tool lacks a line density statistical tool.
Fig. 1 is a schematic flow chart of a line density statistical method based on ArcGIS secondary development according to a first embodiment of the present invention. Specifically, the method is applied in an ArcGIS spatial analysis tool and comprises the following steps:
step 101, acquiring a simple line data set, and calculating the value of the mth search radius by a preset radius formula from the value of the minimum search radius, the value of the radius increment and the value of the number of cycles, wherein the simple line data set comprises N simple lines each having a start point coordinate and an end point coordinate, N is a positive integer, m takes the values 1 to M in sequence, and M is a positive integer;
further, the radius formula is:
$r_m = r_1 + (i - 1)\Delta r$
where $r_m$ denotes the mth search radius, $r_1$ denotes the minimum search radius, which is also the 1st search radius, $i$ denotes the number of cycles (so that $i = m$ for the mth search radius), and $\Delta r$ denotes the radius increment.
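The radius formula can be sketched in a few lines of Python. This is only an illustration: the function name `search_radii` and its parameter names are hypothetical, and it assumes the cycle index runs from 1 to M so that the mth cycle yields the mth radius.

```python
def search_radii(r1: float, delta_r: float, cycles: int) -> list[float]:
    """Return [r_1, ..., r_M] using r_m = r_1 + (m - 1) * delta_r.

    r1      -- minimum search radius (also the 1st search radius)
    delta_r -- radius increment (may be 0)
    cycles  -- number of cycles M (assumed to be a positive integer)
    """
    if cycles < 1:
        raise ValueError("cycles must be a positive integer")
    return [r1 + (m - 1) * delta_r for m in range(1, cycles + 1)]


# Four cycles starting at 500 m and growing by 100 m per cycle.
print(search_radii(500.0, 100.0, 4))
```

With a radius increment of 0 or a single cycle, the list collapses to the fixed-radius case described later in the text.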
step 102, counting the number of simple lines whose start point coordinate and end point coordinate both lie within the spatial region centered on the midpoint coordinate of the nth simple line with the value of the mth search radius as its radius, to obtain the nth first statistical result of the mth group, wherein n takes the values 1 to N in sequence;
step 103, calculating the mth minimum clustering result with significance according to a statistical principle and the N first statistical results of the mth group;
specifically, please refer to fig. 2, which is a schematic flow chart of the refinement of step 103 in the first embodiment of the present invention. The refinement comprises:
step 1031, acquiring the configured significance level value, and testing the distribution formed by the N first statistical results in the mth group;
step 1032, if the distribution formed by the N first statistical results in the mth group is a normal distribution, calculating the mth minimum clustering result by a normal distribution formula based on the significance level value;
and step 1033, if the distribution formed by the N first statistical results in the mth group is a pareto distribution, calculating the mth minimum clustering result by a pareto formula based on the significance level value.
Further, the normal distribution formula is:
$\mathrm{minlines} = \mathrm{average}(Nls) + r \cdot \mathrm{SD}(Nls)$
where minlines represents the minimum clustering result, average represents the average function, Nls represents the N first statistical results in the mth group, r is a parameter related to the significance level value (r is 2.58 when the significance level value is 99% and 1.96 when it is 95%), and SD represents the standard deviation function.
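As a minimal sketch of the normal distribution formula, the snippet below computes the threshold from a list of first statistical results. The function name `min_lines_normal` and the lookup table `R_VALUES` are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say whether SD is the population or the sample form.

```python
import statistics

# r values given in the text for the two supported significance level values.
R_VALUES = {0.95: 1.96, 0.99: 2.58}


def min_lines_normal(counts, significance=0.95):
    """minlines = average(Nls) + r * SD(Nls).

    counts       -- the N first statistical results of one group (Nls)
    significance -- 0.95 or 0.99
    Assumption: SD is the population standard deviation.
    """
    r = R_VALUES[significance]
    return statistics.mean(counts) + r * statistics.pstdev(counts)


counts = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, population SD 2
print(min_lines_normal(counts, 0.95))
print(min_lines_normal(counts, 0.99))
```

A higher significance level value gives a larger r and therefore a stricter threshold, so fewer lines qualify as cluster centers.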
Further, the pareto formula is:
[Formula image BDA0001737581390000061 is not reproduced in this text.]
where p is a parameter related to the significance level value (p is less than 0.01 when the significance level value is 99% and less than 0.05 when it is 95%), $x_m$ represents the minimum clustering result, x represents the N first statistical results in the mth group, and α represents the regression coefficient, which is a positive parameter.
Step 104, performing density statistics on the N simple lines according to the M minimum clustering results and the N first statistical results of the mth group to obtain a density result set.
Specifically, please refer to fig. 3, which is a schematic flow chart of the refinement of step 104 in the first embodiment of the present invention. The refinement comprises:
step 1041, counting the number of unmarked simple lines within the spatial region centered on the midpoint coordinate of the ath unmarked simple line with the value of the bth search radius as its radius, to obtain the ath second statistical result of the bth group, wherein a is a positive integer taking the values 1 to A in sequence and A is the number of unmarked simple lines in the simple line data set; the mth maximum clustering result is calculated according to the statistical principle and the N first statistical results of the mth group, the 1st to Mth maximum clustering results are searched for the cth search radius at which the cth maximum clustering result is smaller than the cth minimum clustering result, the initial value of b is c-1, and c takes one of the values 1 to M;
step 1042, extracting the target statistical result with the largest value from the A second statistical results of the bth group, and judging whether the target statistical result is larger than the bth minimum clustering result;
step 1043, if the target statistical result is greater than the bth minimum clustering result, setting i = i + 1, querying the target simple line corresponding to the target statistical result, marking the target simple line and the unmarked simple lines whose start point coordinate and end point coordinate lie within the spatial region centered on the midpoint coordinate of the target simple line with the value of the bth search radius as its radius, and recording the target statistical result as the ith density result, wherein the initial value of i is 0;
step 1044, determining whether the number of unmarked simple lines is 0; if not, returning to step 1041, and if the number of unmarked simple lines is 0, obtaining a density result set based on the i density results;
step 1045, if the target statistical result is less than or equal to the bth minimum clustering result: if b is greater than 1, setting b = b - 1 and returning to step 1041; if b is less than or equal to 1, obtaining a density result set based on the i density results.
It is emphasized that the initial value of b, c-1, is determined from statistical characteristics. Specifically, the mth maximum clustering result is calculated according to a statistical principle and the N first statistical results of the mth group, where m takes the values 1 to M in sequence, so that M maximum clustering results are obtained in total. The 1st to Mth maximum clustering results are then searched in sequence; at the first c for which the cth maximum clustering result is smaller than the cth minimum clustering result, the initial search radius for step 104 (which is also its largest search radius) is the (c-1)th search radius. That is, the initial value of b is c-1, where c takes one of the values 1 to M. If none of the M maximum clustering results is smaller than the corresponding minimum clustering result, the initial value of b is M.
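The selection of the starting radius index can be sketched as follows. The function name `initial_b` is hypothetical, and the two input lists are assumed to have been computed already, since the text does not give a formula for the maximum clustering result.

```python
def initial_b(max_results, min_results):
    """Return the 1-based index b of the first search radius used in step 104.

    max_results[m - 1] -- mth maximum clustering result (assumed precomputed)
    min_results[m - 1] -- mth minimum clustering result
    The first c with max_results[c - 1] < min_results[c - 1] gives b = c - 1;
    if no such c exists, b = M.
    Note: c = 1 yields b = 0; the text does not spell out that case, so the
    caller has to decide how to treat it.
    """
    M = len(min_results)
    for c in range(1, M + 1):
        if max_results[c - 1] < min_results[c - 1]:
            return c - 1
    return M


# M = 10 and the condition first holds at c = 6, so b starts at 5.
max_results = [9, 9, 9, 9, 9, 3, 9, 9, 9, 9]
min_results = [5] * 10
print(initial_b(max_results, min_results))
```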
Further, in step 1045, if the target statistical result is less than or equal to the bth minimum clustering result and b is greater than 1, b is set to b - 1 and the method returns to step 1041. Each time this step is performed, the search radius decreases by one radius increment. For example, if M is 10 and c is 6, the initial value of b is 5. All 10 search radii take part in steps 101 to 103; in step 104, density statistics are first performed with the 5th search radius, and once the target statistical result is less than or equal to the 5th minimum clustering result, density statistics are performed with the 4th search radius, and so on in a loop until the number of unmarked simple lines is 0 or the 1st search radius (the minimum search radius) has been processed.
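The loop formed by steps 1041 to 1045 can be sketched as a greedy procedure. This is an illustration under stated assumptions, not the patented implementation: the names `midpoint`, `inside` and `density_statistics` are hypothetical, a line is taken to lie in a region when both of its endpoints are within the radius (as in step 102), the boundary is treated as inside, and ties for the largest count go to the lowest line index.

```python
import math


def midpoint(line):
    (x1, y1), (x2, y2) = line
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def inside(line, centre, radius):
    """True if both endpoints of `line` lie within `radius` of `centre`."""
    return all(math.hypot(x - centre[0], y - centre[1]) <= radius for x, y in line)


def density_statistics(lines, radii, min_results, b):
    """Greedy sketch of steps 1041-1045.

    lines       -- list of ((x1, y1), (x2, y2)) simple lines
    radii       -- search radii, radii[m - 1] is the mth radius
    min_results -- minimum clustering results, indexed like radii
    b           -- 1-based index of the starting search radius
    Returns a list of (target line index, radius, count) density results.
    """
    unmarked = set(range(len(lines)))
    results = []
    while unmarked and b >= 1:
        radius = radii[b - 1]
        # Step 1041: second statistical result for every unmarked line.
        best_line, best_members = None, []
        for a in sorted(unmarked):
            centre = midpoint(lines[a])
            members = [k for k in unmarked if inside(lines[k], centre, radius)]
            # Step 1042: keep the largest count (first one wins on ties).
            if best_line is None or len(members) > len(best_members):
                best_line, best_members = a, members
        if len(best_members) > min_results[b - 1]:
            # Step 1043: record the density result and mark the lines used.
            results.append((best_line, radius, len(best_members)))
            unmarked -= set(best_members) | {best_line}
        else:
            # Step 1045: fall back to the next smaller search radius.
            b -= 1
    return results


lines = [
    ((0, 0), (2, 0)), ((0, 0.5), (2, 0.5)), ((0, 1), (2, 1)),  # cluster of 3
    ((10, 10), (12, 10)), ((10, 10.5), (12, 10.5)),            # cluster of 2
    ((50, 50), (52, 50)),                                      # isolated line
]
print(density_statistics(lines, radii=[2.0, 3.0], min_results=[1.5, 1.5], b=2))
```

Because every marked line is removed from `unmarked`, no line contributes to more than one density result, and each pass either marks at least one line or lowers b, so the loop always terminates.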
Please refer to fig. 4, which is a schematic diagram of a spatial region formed based on the nth simple line and the mth search radius according to the first embodiment of the present invention. Wherein, the solid line arrow represents a simple line, the dashed line arrow represents an mth search radius, the simple line where the starting point of the dashed line arrow is located represents an nth simple line, the dashed line circle represents a space region formed by taking the midpoint coordinate of the nth simple line as the center of a circle and the value of the mth search radius as the radius, the number of the simple lines of which the starting point coordinate and the end point coordinate are both in the dashed line circle is counted, the number is the nth first statistical result of the mth group, and taking fig. 4 as an example, the nth first statistical result of the mth group is 5 (including the nth simple line).
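The counting illustrated by fig. 4 can be sketched as a brute-force pass over all pairs of lines. The names `midpoint` and `first_statistics` are hypothetical, plain planar coordinates are assumed, and treating a point exactly on the circle as inside is an assumption the text does not settle.

```python
import math


def midpoint(line):
    (x1, y1), (x2, y2) = line
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def first_statistics(lines, radius):
    """For each line n, count the lines whose start AND end points both lie
    within `radius` of the midpoint of line n (line n itself is counted
    whenever its own endpoints fall inside the circle)."""
    results = []
    for centre_line in lines:
        cx, cy = midpoint(centre_line)
        count = 0
        for (sx, sy), (ex, ey) in lines:
            start_in = math.hypot(sx - cx, sy - cy) <= radius
            end_in = math.hypot(ex - cx, ey - cy) <= radius
            if start_in and end_in:
                count += 1
        results.append(count)
    return results


lines = [((0, 0), (2, 0)), ((0, 1), (2, 1)), ((10, 10), (12, 10))]
print(first_statistics(lines, radius=2.0))
```

The double loop costs O(N²) distance checks per search radius; for large data sets a spatial index would be the usual remedy, though the text does not describe one.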
Further, for parameter selection, an operator manually inputs the value of the minimum search radius, the value of the radius increment and the value of the number of cycles. Once these values are acquired, the value of the mth search radius is calculated, and the N simple lines are each processed statistically to obtain M groups of N first statistical results.
It is emphasized that the radius increment may or may not be 0 and the number of cycles may or may not be 1; that is, the invention offers two ways of choosing parameters. In the first, the operator manually enters a value for the minimum search radius together with a radius increment of 0 and/or a number of cycles of 1. By the radius formula, $r_m$ is then constant at $r_1$, i.e. the search radius is fixed and m equals 1. This approach suits cases where the simple lines are well understood or the purpose of the analysis is explicit, so that the search radius and the significance level value can be assigned directly and subjectively. For example, if a dedicated bus route is to be set up by finding the most strongly connected areas in living line data, the search radius may be set to 500 meters, which is generally the range of influence of a bus stop, and the minimum clustering result may be set to the minimum number of persons served by a bus. In the second, the operator manually enters a value for the minimum search radius, a non-zero radius increment and a number of cycles other than 1, so that the search radius varies. This approach suits cases where the simple lines are not well understood or the purpose of the analysis is unclear, in which the density results obtained with a single specified search radius may not be optimal. Specifying the minimum search radius, a non-zero radius increment and a number of cycles other than 1 then enables the search radius to be extracted automatically, so that a good density result can be obtained even without prior knowledge.
Further, the calculation according to the statistical principle and the N first statistical results of the mth group aims to obtain the mth minimum clustering result with significance. It begins by testing the distribution formed by the N first statistical results in the mth group. Most data distributions are normal, so if the N first statistical results in the mth group are normally distributed, the mth minimum clustering result is calculated by the normal distribution formula based on the significance level value; the operator may select the significance level value from the two values 95% and 99%, or enter it manually. If the distribution formed by the N first statistical results in the mth group is not normal but follows a power law, this reflects the fact that most OD lines in space are generally dispersed and only a few cluster together. In that case a pareto distribution (one kind of power law distribution) can be used to test the distribution formed by the N first statistical results in the mth group, and the mth minimum clustering result is calculated by the pareto formula. In the pareto formula, α represents a regression coefficient, is a positive parameter, and can be derived from the power law model $f(x) = c\,x^{-(\alpha + 1)}$, where c represents a regression coefficient. When the search radius is small, most first statistical results have the value 1 and α is large. As the search radius increases, the number of first statistical results with the value 1 decreases, and α decreases.
When α is smaller than 1, the expected value of a random variable following the pareto distribution is infinite because the tail of the distribution is too heavy, and the distribution is treated as meaningless for this purpose. Therefore, when α is smaller than 1, there is no minimum clustering result and no center line can be found.
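For reference, the standard Pareto (type I) distribution with scale $x_m$ and shape $\alpha$ has the tail probability and mean shown below. That the patent's pareto formula corresponds to this standard form is an assumption, since the formula image is not reproduced in the text; the density is written with the same exponent as the power law model $f(x) = c\,x^{-(\alpha+1)}$.

```latex
f(x) = \frac{\alpha\, x_m^{\alpha}}{x^{\alpha + 1}}, \qquad
P(X > x) = \left(\frac{x_m}{x}\right)^{\alpha}, \qquad x \ge x_m
\\[6pt]
\mathbb{E}[X] = \int_{x_m}^{\infty} x\, f(x)\, dx =
\begin{cases}
\dfrac{\alpha\, x_m}{\alpha - 1}, & \alpha > 1 \\[8pt]
\infty, & \alpha \le 1
\end{cases}
```

Under this standard form the mean already diverges at α = 1, which is consistent with the statement that no usable result exists when α is smaller than 1.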
It should be noted that if the distribution formed by the N first statistical results in the mth group follows another type of distribution, the calculation can use the corresponding distribution formula. The distribution formed by the N first statistical results in the mth group is tested, and the formula corresponding to that distribution is used, in order to find a suitable probability distribution from which a minimum clustering result with a high significance level can be extracted.
Furthermore, after the minimum clustering results with significance are obtained, density statistics must be performed on the N simple lines. The specific process is shown in fig. 3. Note that step 102 counts the number of simple lines whose start point coordinate and end point coordinate both lie within the spatial region centered on the midpoint coordinate of the nth simple line with the value of the mth search radius as its radius, to obtain the nth first statistical result of the mth group, where n takes the values 1 to N in sequence. Within each of the M groups obtained in this step, the N first statistical results may count the same lines more than once. For example, the 1st first statistical result in the mth group may count the 1st, 2nd, 3rd and 4th simple lines, while the 2nd first statistical result in the mth group counts the 1st, 2nd, 4th, 5th and 6th simple lines. The first statistical results obtained in step 102 are therefore not accurate density results. The N first statistical results of each of the M groups are used to calculate the mth minimum clustering result with significance, and density statistics are then performed on the N simple lines based on the M minimum clustering results and the N first statistical results of the mth group to obtain the density result set. The density results of step 104 do not overlap in this way, because the target simple line and the unmarked simple lines within its spatial region are marked, and marked simple lines no longer take part in subsequent rounds of density statistics.
In this embodiment of the invention, the number of simple lines whose start point coordinate and end point coordinate both lie within the spatial region centered on the midpoint coordinate of the nth simple line with the value of the mth search radius as its radius is counted to obtain the nth first statistical result of the mth group, where m takes the values 1 to M in sequence and n takes the values 1 to N in sequence; the mth minimum clustering result with significance is calculated according to the statistical principle and the N first statistical results of the mth group; and density statistics are performed on the N simple lines according to the M minimum clustering results and the N first statistical results of the mth group to obtain the density result set. Applied in the ArcGIS spatial analysis tool, the method and system can solve the technical problem that the tool lacks a line density statistical tool.
Fig. 5 is a schematic structural diagram of a line density statistical system based on ArcGIS secondary development according to a second embodiment of the present invention. Specifically, the system is applied in an ArcGIS spatial analysis tool and comprises:
the acquisition and calculation module 201 is configured to acquire a simple line data set and calculate the value of the mth search radius by a preset radius formula from the value of the minimum search radius, the value of the radius increment and the value of the number of cycles, where the simple line data set includes N simple lines each having a start point coordinate and an end point coordinate, N is a positive integer, m takes the values 1 to M in sequence, and M is a positive integer;
further, the radius formula is:
$r_m = r_1 + (i - 1)\Delta r$
where $r_m$ denotes the m-th search radius, $r_1$ denotes the minimum search radius, which is also the 1st search radius, $i$ denotes the number of cycles, and $\Delta r$ denotes the radius increment.
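A minimal sketch of the radius formula, assuming that the cycle counter i coincides with the index m of the search radius (the text does not state this explicitly). The function name `search_radii` and its parameter names are hypothetical.

```python
def search_radii(r1, delta_r, cycles):
    """Return [r_1, ..., r_M] using r_m = r_1 + (m - 1) * delta_r.

    r1      -- minimum search radius (also the 1st search radius)
    delta_r -- radius increment
    cycles  -- number of cycles M (assumed equal to the number of radii)
    """
    return [r1 + (m - 1) * delta_r for m in range(1, cycles + 1)]


radii = search_radii(100.0, 50.0, 4)
```

For a minimum radius of 100, an increment of 50 and four cycles this gives the radii 100, 150, 200 and 250.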
The first statistical module 202 is configured to count the simple lines whose start point and end point coordinates both lie within the spatial region centered on the midpoint of the n-th simple line, with the value of the m-th search radius as its radius, to obtain the n-th first statistical result of the m-th group, where n takes the values 1 to N in turn;
the calculating module 203 is configured to calculate an mth minimum clustering result with significance according to a statistical principle and the mth group of N first statistical results;
specifically, please refer to Fig. 6, which is a schematic structural diagram of the refinement modules of the calculation module 203 according to the second embodiment of the present invention. The refinement modules comprise:
an obtaining test module 2031, configured to obtain the configured significance level value and to test the distribution of the data formed by the N first statistical results in the m-th group;
a first calculating module 2032, configured to calculate the m-th minimum clustering result according to a normal distribution formula, based on the significance level value, if the data formed by the N first statistical results in the m-th group follow a normal distribution;
the second calculating module 2033 is configured to calculate the m-th minimum clustering result according to a Pareto formula, based on the significance level value, if the data formed by the N first statistical results in the m-th group follow a Pareto distribution.
Further, the normal distribution formula is:
minlines=average(Nls)+r*SD(Nls)
where minlines denotes the m-th minimum clustering result, average denotes the mean function, Nls denotes the N first statistical results in the m-th group, r is a parameter related to the significance level value (r is 2.58 when the significance level value is 99% and 1.96 when it is 95%), and SD denotes the standard deviation function;
the pareto formula is:
[Formula image: Figure BDA0001737581390000121]
where p is a parameter related to the significance level value (p is less than 0.01 when the significance level value is 99% and less than 0.05 when it is 95%), $x_m$ denotes the m-th minimum clustering result, x denotes the N first statistical results in the m-th group, and α denotes the regression coefficient, which is a positive parameter.
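The normal distribution formula above, minlines = average(Nls) + r*SD(Nls), can be sketched in Python as follows. The text does not say whether SD is the population or the sample standard deviation; this sketch assumes the population form (`statistics.pstdev`). The function name and the lookup table are hypothetical. The Pareto branch is not sketched, because its formula is only available as an image.

```python
from statistics import mean, pstdev

# r values stated in the text for the two significance level values.
R_BY_LEVEL = {0.99: 2.58, 0.95: 1.96}


def min_lines_normal(first_results, level=0.95):
    """m-th minimum clustering result for normally distributed counts.

    first_results -- the N first statistical results of one group (Nls)
    level         -- significance level value, 0.95 or 0.99
    Assumption: SD is the population standard deviation.
    """
    r = R_BY_LEVEL[level]
    return mean(first_results) + r * pstdev(first_results)


counts = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5, population SD 2
threshold_95 = min_lines_normal(counts, 0.95)
threshold_99 = min_lines_normal(counts, 0.99)
```

For these counts the threshold is 5 + 1.96 * 2 = 8.92 at 95% and 5 + 2.58 * 2 = 10.16 at 99%, so only a circle containing at least 9 (or 11) lines would exceed it.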
The density statistics module 204 is configured to perform density statistics on the N simple lines according to the M minimum clustering results and the N first statistical results of the m-th group to obtain a density result set.
Specifically, please refer to Fig. 7, which is a schematic structural diagram of the refinement modules of the density statistics module 204 according to the second embodiment of the present invention. The refinement modules comprise:
the second statistical module 2041 is configured to count the unmarked simple lines within the spatial region centered on the midpoint of the a-th unmarked simple line, with the value of the b-th search radius as its radius, to obtain the a-th second statistical result of the b-th group, where a is a positive integer taking the values 1 to A in turn and A is the number of unmarked simple lines in the simple line data set; the m-th maximum clustering result is calculated from the N first statistical results of the m-th group according to statistical principles, the 1st to M-th maximum clustering results are searched for the first c-th search radius at which the c-th maximum clustering result is smaller than the c-th minimum clustering result, the initial value of b is c-1, and c is one of the values 1 to M;
the extraction and judgment module 2042 is configured to extract the target statistical result with the largest value from the A second statistical results of the b-th group, and to judge whether the target statistical result is greater than the b-th minimum clustering result;
the query marking module 2043 is configured to, if the target statistical result is greater than the b-th minimum clustering result, set i = i + 1, query the target simple line corresponding to the target statistical result, mark the target simple line and the unmarked simple lines within the spatial region centered on the midpoint of the target simple line, with the value of the b-th search radius as its radius, and record the target statistical result as the i-th density result, where the initial value of i is 0;
the judgment processing module 2044 is configured to judge whether the number of unmarked simple lines is 0, to return to the second statistical module 2041 if it is not 0, and to obtain a density result set based on the i density results if it is 0;
an obtaining module 2045, configured to, if the target statistical result is less than or equal to the b-th minimum clustering result, set b = b - 1 and return to the second statistical module 2041 when b is greater than 1, and obtain a density result set based on the i density results when b is less than or equal to 1.
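The interaction of modules 2041 to 2045 can be sketched as the loop below. This is only an illustration under several assumptions: a line counts as lying in a region when both of its endpoints are inside the circle, the initial value of b is passed in as `start_b` rather than derived from the maximum clustering results, ties for the largest count go to the lowest line index, and each density result is stored as a (target index, radius, count) tuple. All names are hypothetical.

```python
from math import hypot


def _mid(line):
    (x1, y1), (x2, y2) = line
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def _in_circle(line, center, radius):
    # Assumption: both endpoints must lie inside (or on) the circle.
    return all(hypot(x - center[0], y - center[1]) <= radius for x, y in line)


def density_statistics(lines, radii, min_clusters, start_b):
    """radii[b-1] and min_clusters[b-1] hold the b-th search radius and the
    b-th minimum clustering result; start_b is the 1-based initial b."""
    b = start_b
    unmarked = set(range(len(lines)))
    density_results = []  # the i-th entry is the i-th density result
    while unmarked:       # module 2044: stop once every line is marked
        radius = radii[b - 1]
        order = sorted(unmarked)
        # Module 2041: second statistics, computed over unmarked lines only.
        members = {
            a: [k for k in order if _in_circle(lines[k], _mid(lines[a]), radius)]
            for a in order
        }
        # Module 2042: target statistical result with the largest value.
        target = max(order, key=lambda a: len(members[a]))
        count = len(members[target])
        if count > min_clusters[b - 1]:
            # Module 2043: record the result, mark the region and the target.
            density_results.append((target, radius, count))
            unmarked -= set(members[target]) | {target}
        elif b > 1:
            b -= 1        # module 2045: retry with the next smaller radius
        else:
            break         # module 2045: b <= 1, finish
    return density_results


demo_lines = [
    ((0, 0), (2, 0)), ((0, 1), (2, 1)), ((0, 2), (2, 2)),
    ((100, 100), (102, 100)), ((100, 101), (102, 101)),
    ((50, 50), (52, 50)),
]
demo_result = density_statistics(demo_lines, [2.0, 5.0], [1.5, 2.5], start_b=2)
```

Because marked lines are removed from `unmarked` before the next pass, no line contributes to more than one density result. In the demo, the three lines near the origin form the first result at radius 5.0, the two lines near (100, 100) form the second at radius 2.0, and the isolated line never exceeds the threshold.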
For the description of the embodiment of the present invention, reference may be made to the related description of the first embodiment of the present invention, and further description is omitted here.
In this embodiment of the invention, the simple lines within the spatial region centered on the midpoint of the n-th simple line, with the value of the m-th search radius as its radius, are counted to obtain the n-th first statistical result of the m-th group, where m takes the values 1 to M and n takes the values 1 to N in turn. The m-th significant minimum clustering result is calculated from the N first statistical results of the m-th group according to statistical principles, and density statistics are performed on the N simple lines according to the M minimum clustering results and the N first statistical results of the m-th group to obtain the density result set. The method and the system are applied to the ArcGIS spatial analysis tool and can solve the technical problem that the ArcGIS spatial analysis tool lacks a line density statistical tool.
The method and the system can be used to mine common origin-destination (OD) data, such as migration, telephone and world trade flows, in order to synthesize large volumes of flow data, extract the main patterns, confirm known structures and discover unknown ones. Because the method and the system can identify hot links between different regions, they can also be used to predict spatial corridors and to characterize current spatial relationships.
It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of combined acts, but those skilled in the art should understand that the present invention is not limited by the described order of acts, since some steps may be performed in other orders or simultaneously according to the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the invention.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The foregoing describes the line density statistical method and system based on ArcGIS secondary development provided by the present invention. Those skilled in the art may, following the ideas of the embodiments of the present invention, make changes to the specific implementation and the scope of application.

Claims (4)

1. A line density statistical method based on ArcGIS secondary development, characterized in that the method is applied to an ArcGIS spatial analysis tool and comprises the following steps:
step 101, obtaining a simple line data set, and calculating the value of the m-th search radius from the value of the minimum search radius, the value of the radius increment and the number of cycles according to a preset radius formula, where the simple line data set includes N simple lines each having a start point coordinate and an end point coordinate, N is a positive integer, m takes the values 1 to M in turn, M is a positive integer, and the radius formula is:
$r_m = r_1 + (i - 1)\Delta r$
where $r_m$ denotes the m-th search radius, $r_1$ denotes the minimum search radius, which is also the 1st search radius, $i$ denotes the number of cycles, and $\Delta r$ denotes the radius increment;
step 102, counting the simple lines whose start point and end point coordinates both lie within the spatial region centered on the midpoint of the n-th simple line, with the value of the m-th search radius as its radius, to obtain the n-th first statistical result of the m-th group, where n takes the values 1 to N in turn;
step 1031, obtaining the configured significance level value, and testing the distribution of the data formed by the N first statistical results in the m-th group;
step 1032, if the data distribution formed by the N first statistical results in the mth group meets normal distribution, calculating according to a normal distribution formula based on the significance level value to obtain an mth minimum clustering result;
step 1033, if the data distribution formed by the N first statistical results in the mth group meets pareto distribution, calculating according to a pareto formula based on the significance level value to obtain an mth minimum clustering result;
step 1041, counting the unmarked simple lines within the spatial region centered on the midpoint of the a-th unmarked simple line, with the value of the b-th search radius as its radius, to obtain the a-th second statistical result of the b-th group, where a is a positive integer taking the values 1 to A in turn and A is the number of unmarked simple lines in the simple line data set; the m-th maximum clustering result is calculated from the N first statistical results of the m-th group according to statistical principles, the 1st to M-th maximum clustering results are searched for the first c-th search radius at which the c-th maximum clustering result is smaller than the c-th minimum clustering result, the initial value of b is c-1, and c is one of the values 1 to M;
step 1042, extracting the target statistical result with the largest value from the A second statistical results of the b-th group, and judging whether the target statistical result is greater than the b-th minimum clustering result;
step 1043, if the target statistical result is greater than the b-th minimum clustering result, setting i = i + 1, querying the target simple line corresponding to the target statistical result, marking the target simple line and the unmarked simple lines whose start point and end point coordinates lie within the spatial region centered on the midpoint of the target simple line, with the value of the b-th search radius as its radius, and recording the target statistical result as the i-th density result, where the initial value of i is 0;
step 1044, judging whether the number of unmarked simple lines is 0; if it is not 0, returning to step 1041, and if it is 0, obtaining a density result set based on the i density results;
step 1045, if the target statistical result is less than or equal to the b-th minimum clustering result, setting b = b - 1 and returning to step 1041 when b is greater than 1, and obtaining a density result set based on the i density results when b is less than or equal to 1.
2. The method of claim 1,
the normal distribution formula is as follows:
minlines=average(Nls)+r*SD(Nls)
where minlines denotes the m-th minimum clustering result, average denotes the mean function, Nls denotes the N first statistical results in the m-th group, r is a parameter related to the significance level value (r is 2.58 when the significance level value is 99% and 1.96 when it is 95%), and SD denotes the standard deviation function;
the pareto formula is:
[Formula image: Figure FDA0002989287730000031]
where p is a parameter related to the significance level value (p is less than 0.01 when the significance level value is 99% and less than 0.05 when it is 95%), $x_m$ denotes the m-th minimum clustering result, x denotes the N first statistical results in the m-th group, and α denotes a regression coefficient, which is a positive parameter.
3. A line density statistical system based on ArcGIS secondary development, characterized in that the system is applied to an ArcGIS spatial analysis tool and comprises:
the acquisition and calculation module is used for acquiring a simple line data set and calculating the value of the m-th search radius from the value of the minimum search radius, the value of the radius increment and the number of cycles according to a preset radius formula, where the simple line data set includes N simple lines each having a start point coordinate and an end point coordinate, N is a positive integer, m takes the values 1 to M in turn, M is a positive integer, and the radius formula is:
$r_m = r_1 + (i - 1)\Delta r$
where $r_m$ denotes the m-th search radius, $r_1$ denotes the minimum search radius, which is also the 1st search radius, $i$ denotes the number of cycles, and $\Delta r$ denotes the radius increment;
the first statistical module is used for counting the simple lines whose start point and end point coordinates both lie within the spatial region centered on the midpoint of the n-th simple line, with the value of the m-th search radius as its radius, to obtain the n-th first statistical result of the m-th group, where n takes the values 1 to N in turn;
the computing module is used for computing the m-th significant minimum clustering result according to statistical principles and the N first statistical results of the m-th group;
the specific modules of the computing module comprise:
the acquisition testing module is used for acquiring the configured significance level value and testing the distribution condition of data distribution formed by the N first statistical results in the mth group;
the first calculation module is used for calculating an mth minimum clustering result according to a normal distribution formula based on the significance level value if data distribution formed by the N first statistical results in the mth group meets normal distribution;
the second calculation module is used for calculating an mth minimum clustering result according to a pareto formula based on the significance level value if data distribution formed by the N first statistical results in the mth group meets the pareto distribution;
the density statistics module is used for performing density statistics on the N simple lines according to the M minimum clustering results and the N first statistical results of the m-th group to obtain a density result set;
the density statistic module comprises the following specific modules:
the second statistical module is used for counting the unmarked simple lines within the spatial region centered on the midpoint of the a-th unmarked simple line, with the value of the b-th search radius as its radius, to obtain the a-th second statistical result of the b-th group, where a is a positive integer taking the values 1 to A in turn and A is the number of unmarked simple lines in the simple line data set; the m-th maximum clustering result is calculated from the N first statistical results of the m-th group according to statistical principles, the 1st to M-th maximum clustering results are searched for the first c-th search radius at which the c-th maximum clustering result is smaller than the c-th minimum clustering result, the initial value of b is c-1, and c is one of the values 1 to M;
the extraction and judgment module is used for extracting the target statistical result with the largest value from the A second statistical results of the b-th group and judging whether the target statistical result is greater than the b-th minimum clustering result;
a query marking module, configured to, if the target statistical result is greater than the b-th minimum clustering result, set i = i + 1, query the target simple line corresponding to the target statistical result, mark the target simple line and the unmarked simple lines within the spatial region centered on the midpoint of the target simple line, with the value of the b-th search radius as its radius, and record the target statistical result as the i-th density result, where the initial value of i is 0;
the judgment processing module is used for judging whether the number of unmarked simple lines is 0, returning to the second statistical module if it is not 0, and obtaining a density result set based on the i density results if it is 0;
and an obtaining module, configured to, if the target statistical result is less than or equal to the b-th minimum clustering result, set b = b - 1 and return to the second statistical module when b is greater than 1, and obtain a density result set based on the i density results when b is less than or equal to 1.
4. The system of claim 3,
the normal distribution formula is as follows:
minlines=average(Nls)+r*SD(Nls)
where minlines denotes the m-th minimum clustering result, average denotes the mean function, Nls denotes the N first statistical results in the m-th group, r is a parameter related to the significance level value (r is 2.58 when the significance level value is 99% and 1.96 when it is 95%), and SD denotes the standard deviation function;
the pareto formula is:
[Formula image: Figure FDA0002989287730000061]
where p is a parameter related to the significance level value (p is less than 0.01 when the significance level value is 99% and less than 0.05 when it is 95%), $x_m$ denotes the m-th minimum clustering result, x denotes the N first statistical results in the m-th group, and α denotes a regression coefficient, which is a positive parameter.
CN201810803200.9A 2018-07-20 2018-07-20 Line density statistical method and system based on ArcGIS secondary development Active CN110019633B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810803200.9A CN110019633B (en) 2018-07-20 2018-07-20 Line density statistical method and system based on ArcGIS secondary development

Publications (2)

Publication Number Publication Date
CN110019633A CN110019633A (en) 2019-07-16
CN110019633B true CN110019633B (en) 2021-08-10

Family

ID=67188364

Country Status (1)

Country Link
CN (1) CN110019633B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006085602A (en) * 2004-09-17 2006-03-30 Gosei:Kk Traffic analysis system
CN102298640A (en) * 2011-09-14 2011-12-28 清华大学 Method for preprocessing map display data
CN103927312A (en) * 2013-01-15 2014-07-16 中芯国际集成电路制造(上海)有限公司 Automatic classification method and system for failure information of CIS (contact image sensor)
CN104484870A (en) * 2014-11-25 2015-04-01 北京航空航天大学 Calibration aircraft positioning method
CN105913658A (en) * 2016-05-18 2016-08-31 杭州智诚惠通科技有限公司 Method for estimating OD position and OD matrix by means of traffic flow

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170024412A1 (en) * 2015-07-17 2017-01-26 Environmental Systems Research Institute (ESRI) Geo-event processor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
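Research on secondary development based on ArcGIS spatial analysis tools (基于ArcGIS空间分析工具的二次开发研究); Wang Haohua (王浩骅) et al.; Journal of People's Public Security University of China (《中国人民公安大学学报》); 2013-03-31; pp. 87-91 *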

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant