CN111651502B - City functional area identification method based on multi-subspace model - Google Patents
- Publication number
- CN111651502B (application CN202010484901.8A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- functional
- functional area
- subspace
- geographic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Probability & Statistics with Applications (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Remote Sensing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a city functional area identification method based on a multi-subspace model, which comprises the following steps: acquiring taxi track data and check-in data in a research area; constructing a partition-oriented time sequence feature matrix C based on visit purpose; inputting the time sequence feature matrix C to a sparse subspace clustering algorithm, and calculating the correspondence between the geographic units and the urban functional areas; and acquiring the salient feature locations of each functional area, and thereby identifying the main function of each functional area. The method uses the human activity information provided by geographic big data and, by relying on a multi-subspace model, overcomes the defects of the prior art, so that urban functional areas can be identified more accurately. It also analyzes the uniqueness and abundance of each functional area from the geometric properties of the subspaces, providing fine-grained quantitative indicators for the management and development of urban functional areas.
Description
Technical Field
The invention belongs to the technical field of geographic spatial information identification, relates to an urban geographic information identification method, and particularly relates to an urban functional area identification method based on a multi-subspace model.
Background
Urban spatial structure is a core research topic of urban geographic informatics and a concentrated reflection of human-land relations: urban space is shaped by human activities and in turn influences people's production and daily activities. It therefore bears on urban planning, site selection, and travel and place recommendation. In urban spatial structure analysis, the distribution of urban functional areas is the result, expressed in geographic space, of many interacting factors.
Urban functional areas can be analyzed in many ways, for example by social surveys, but collecting survey data is time-consuming and labor-intensive, the analysis may be strongly influenced by subjective factors, and, most importantly, such methods cannot directly reflect human activity, which is a key factor in urban development. With the rapid development of mobile communication, Internet and satellite positioning technologies, mobile devices with positioning functions generate electronic footprints that are genuine records of residents' activities, making it possible to explore urban functional areas from the perspective of human activity. Existing methods use social media check-in data, mobile phone data and taxi track data to detect urban functional areas.
The models used in the prior art to analyze such data are not fully developed. The general steps are as follows. First, when processing geospatial big data, the time sequence features of human activity are mapped onto manually delineated geographic units, so that each geographic unit is expressed as a vector stored in a high-dimensional vector space. Next, the geographic units are given a feature expression by algorithms such as singular value decomposition, latent semantic analysis or latent Dirichlet allocation. Finally, the units are clustered by the similarity of their feature expressions, and each cluster represents one functional area, which yields the distribution of urban functional areas. However, these models have the following disadvantages.
First, during feature expression, some algorithms make strict assumptions about the features, for example that the samples share a single set of features or obey the same distribution. Because feature expression reduces the samples from a high-dimensional space to one low-dimensional subspace, these algorithms can be called single subspace algorithms. Their strict assumptions make it easy to obtain feature patterns, and the functional area distribution can then be obtained by clustering on the relation between samples and features; but samples whose information carries little weight are marginalized after feature expression, which makes the clustering result inaccurate. Moreover, functional areas differ in their features, so a single set of features cannot describe every area concisely and accurately. When the data are large and the feature patterns complex, the assumptions of a single subspace model limit feature mining, and such models cannot handle more complex data.
Second, these models ignore the geometric meaning of the vector space. The geometrical properties of the subspace are related to the characteristics of the urban functional area, and the prior art has ignored discussion and consideration of this.
Disclosure of Invention
In view of the above, the present invention provides a method for identifying an urban functional area based on a multi-subspace model, which utilizes human activity information provided by geographic big data, overcomes the defects in the prior art by using the multi-subspace model, and can identify the urban functional area more accurately.
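To achieve this aim, the method for identifying urban functional areas based on a multi-subspace model comprises the following steps:
step 1, acquiring taxi track data and check-in data in a research area;
step 2, constructing a partition-oriented time sequence feature matrix C based on visit purpose;
step 3, inputting the time sequence feature matrix C to a sparse subspace clustering algorithm, and calculating the correspondence between the geographic units and the urban functional areas;
step 4, acquiring the salient feature locations of each functional area, and thereby identifying the main function of each functional area.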
Specifically, the process of constructing the time sequence feature matrix C in step 2 includes the following steps:
step 201, dividing the research area to obtain N geographic units;
step 202, preprocessing the taxi track data, eliminating abnormal points, extracting the end point and the arrival time of each journey, and mapping the end point and the geographic unit to obtain the visit record of the geographic unit;
step 203, matching the check-in data records with visit records of the geographic units, and classifying the purpose of each visit;
step 204, constructing a time sequence feature matrix C with M rows and N columns, which represents the human activity dynamics carried by the geographic units over a period of time, wherein M = T × D, T is the number of time segments, D is the number of visit-purpose categories, and each column of C gives the number of visits to the corresponding geographic unit for each purpose in each time segment.
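Steps 201 to 204 can be illustrated with a minimal sketch that counts classified visits into the matrix C. The `Visit` record, the function name `build_feature_matrix` and the purpose-major row layout (all T rows of one purpose, then the next purpose) are assumptions made for illustration; the source only fixes the shape M = T × D by N.

```python
from collections import namedtuple

# Hypothetical record: one classified visit to a geographic unit.
Visit = namedtuple("Visit", ["unit", "slot", "purpose"])


def build_feature_matrix(visits, n_units, n_slots, purposes):
    """Return C as a list of M = T * D rows, each holding N visit counts."""
    purpose_index = {name: d for d, name in enumerate(purposes)}
    n_rows = n_slots * len(purposes)
    C = [[0] * n_units for _ in range(n_rows)]
    for v in visits:
        # Assumed layout: rows d*T .. d*T + T - 1 belong to purpose d.
        row = purpose_index[v.purpose] * n_slots + v.slot
        C[row][v.unit] += 1
    return C


visits = [
    Visit(unit=0, slot=8, purpose="work"),
    Visit(unit=0, slot=8, purpose="work"),
    Visit(unit=1, slot=19, purpose="dining"),
]
C = build_feature_matrix(visits, n_units=2, n_slots=24, purposes=["work", "dining"])
print(len(C), len(C[0]))  # number of rows (T * D) and columns (N)
```

With this layout, a column of C can later be reshaped directly into a D-row, T-column activity table.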
Specifically, the sparse subspace clustering algorithm described in step 3 includes the following steps:
step 301, solving for a coefficient matrix Z of size N × N that minimizes the $\ell_1$ norm under the following constraints:
$$\min_{Z} \|Z\|_1 \quad \text{s.t.} \quad CZ = C,\ Z_{ii} = 0$$
wherein $\|\cdot\|_1$ denotes the $\ell_1$ norm and $Z_{ii}$ is the element in the i-th row and i-th column of Z; minimizing the $\ell_1$ norm makes Z sparse, so that the time sequence features of each geographic unit are represented only by linear combinations of the time sequence features of other geographic units in the same subspace;
step 302, establishing a similarity matrix of the data from the coefficient matrix, $W = |Z| + |Z|^{T}$, wherein W has size N × N and each entry is the similarity between the time sequence features of the two geographic units given by its row and column indices;
the similarity matrix W is block diagonal, that is, non-zero sub-matrices appear only along the main diagonal and all other sub-matrices are zero; each non-zero sub-matrix corresponds to one subspace; geographic units within the same subspace have highly similar time sequence features, whereas units in different subspaces differ markedly, so each subspace is an urban functional area to be detected;
step 303, calculating the number of subspaces from the normalized Laplacian matrix L of the similarity matrix W, $L = I - D^{-1/2} W D^{-1/2}$, wherein I is the identity matrix and D is the diagonal degree matrix with $D_{jj} = \sum_i W_{ij}$, $W_{ij}$ being the element in the i-th row and j-th column of W; sorting the eigenvalues of L in ascending order and calculating the difference $\lambda_{k+1} - \lambda_k$ between every two adjacent eigenvalues, the k with the largest difference being the number of subspaces, i.e. the number of urban functional areas to be detected;
step 304, applying K-means clustering to the similarity matrix W with the number of clusters set to the k obtained in step 303, which gives the correspondence between the geographic units and the k clusters, i.e. the k urban functional areas, and completes the detection of the urban functional areas.
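Steps 301 and 302 can be sketched as follows. The source states an equality-constrained $\ell_1$ problem but does not name a solver; this sketch swaps in a Lasso relaxation solved by iterative soft thresholding (ISTA), which is an assumption, as are the function names and the parameters `lam` and `n_iter`.

```python
import numpy as np


def sparse_self_representation(C, lam=0.05, n_iter=500):
    """Approximate min ||Z||_1 s.t. CZ = C, Z_ii = 0 with a Lasso relaxation (ISTA)."""
    C = np.asarray(C, dtype=float)
    n = C.shape[1]
    G = C.T @ C
    step = 1.0 / np.linalg.norm(C, 2) ** 2  # 1 / Lipschitz constant of the gradient
    Z = np.zeros((n, n))
    for _ in range(n_iter):
        grad = G @ Z - G  # gradient of 0.5 * ||C Z - C||_F^2
        Z = Z - step * grad
        Z = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)  # soft threshold
        np.fill_diagonal(Z, 0.0)  # a unit may not represent itself
    return Z


def similarity_matrix(Z):
    """W = |Z| + |Z|^T, symmetric and non-negative."""
    A = np.abs(Z)
    return A + A.T


# Toy data: columns 0-1 lie on one line, columns 2-3 on an orthogonal line.
C = np.array([[1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])
Z = sparse_self_representation(C)
W = similarity_matrix(Z)
print(np.round(W, 2))  # entries linking the two groups stay zero
```

On this toy input the two groups are orthogonal, so W comes out block diagonal, which is the structure step 302 relies on. Real data is noisier, and the relaxation weight `lam` would need tuning.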
Specifically, obtaining the salient feature locations of each functional area in step 4 includes: using the correspondence from step 304, extracting from the similarity matrix W generated in step 302 the subspace matrices $S_1, \ldots, S_i, \ldots, S_k$ corresponding to the urban functional areas, and performing principal component analysis on each to obtain the eigenvectors $[e_1, e_2, \ldots, e_p, \ldots, e_M]_i$, called the feature locations of $S_i$; the first r eigenvectors $[e_1, e_2, \ldots, e_r]_i$, whose cumulative eigenvalue percentage exceeds 90%, are the salient feature locations of $S_i$.
Specifically, identifying the main function of each functional area includes reshaping each salient feature location of the area into a matrix with D rows and T columns, in which each row shows how the activity level for one of the D visit purposes changes over the T time segments; this gives the main activity pattern of the functional area, the area is labeled with the most active function in that pattern, and the identification of urban functional areas is complete.
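The reshaping step can be sketched as below. Two things are assumptions rather than statements from the source: the feature vector is laid out purpose by purpose (T entries per purpose), and "most active" is measured as the largest total absolute loading in a row. The function name `dominant_purpose` is hypothetical.

```python
import numpy as np


def dominant_purpose(feature, purposes, n_slots):
    """Reshape an M-vector into D rows x T columns and pick the most active row."""
    pattern = np.asarray(feature, dtype=float).reshape(len(purposes), n_slots)
    # Assumed activity measure: total absolute loading of each purpose row.
    activity = np.abs(pattern).sum(axis=1)
    return pattern, purposes[int(np.argmax(activity))]


purposes = ["home", "work"]
feature = [0.1, 0.2, 0.1,   # home activity over T = 3 time segments
           0.5, 0.6, 0.4]   # work activity over the same segments
pattern, label = dominant_purpose(feature, purposes, n_slots=3)
print(pattern.shape, label)
```

Other activity measures, such as the peak value per row, would be equally plausible readings of the text.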
Further, the method for identifying the urban functional area further comprises the following steps:
step 6, calculating the uniqueness of each functional area, and ranking the functional areas by uniqueness.
Specifically, the similarity of functional areas is calculated from the principal angles between their corresponding subspaces. For the subspaces $S_k$ and $S_l$ of any two functional areas, the similarity $\mathrm{aff}(S_k, S_l)$ is computed from the principal angles $\theta^{(i)}$ between the two subspaces, wherein $\cos\theta^{(i)}$ is the i-th largest singular value of $U_k^{T} U_l$, $U_k$ and $U_l$ are orthonormal bases of $S_k$ and $S_l$ respectively, and $d_k \wedge d_l$ denotes the smaller of the dimensions $d_k$ and $d_l$ of $S_k$ and $S_l$.
Specifically, the uniqueness of a functional area is inversely related to its similarity to the other areas: if the similarity between subspaces is high, the functions of the corresponding areas are largely alike and their uniqueness is low. The uniqueness of each functional area $S_i$ is calculated from its similarity to the remaining areas, where k is the total number of functional areas and $S_{-i}$ denotes the functional areas other than $S_i$.
Further, the method for identifying the urban functional area further comprises the following steps:
step 7, calculating the abundance of each functional area, and ranking the functional areas by abundance.
Specifically, the abundance of a functional area is related to the reconstruction error of its salient feature locations. The error compares $C(S_i)$, the matrix formed by the original vectors belonging to subspace $S_i$, with the matrix reconstructed from the salient feature locations of $S_i$, and is measured with the Frobenius norm $\|\cdot\|_F$.
The method provides a model based on multiple subspaces, on the view that urban functional areas have multiple sets of features. When the spatio-temporal activity information of geographic units is expressed as vectors, the vector samples lie in a high-dimensional space formed by a union of subspaces. Geographic units in the same subspace carry similar human activity dynamics and can be clustered into one functional area, so urban functional areas are identified by finding the subspaces. The uniqueness and abundance of each functional area are then analyzed from the geometric properties of the subspaces, providing fine-grained quantitative indicators for the management and development of urban functional areas.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a schematic flow chart of an embodiment of the method of the present invention.
FIG. 3 is a similarity matrix obtained by using sparse subspace clustering in an embodiment of the present invention;
FIG. 4 shows the results of detecting urban functional areas according to an embodiment of the present invention;
FIG. 5 shows the functional activity levels of the salient feature locations of each functional area in an embodiment of the present invention;
FIG. 6 illustrates the similarity of functional areas calculated according to an embodiment of the present invention;
FIG. 7 shows the uniqueness and abundance of the functional regions calculated by the example of the present invention.
Detailed Description
The present invention is further illustrated by the following examples and the accompanying drawings, but the present invention is not limited thereto in any way, and any modifications or alterations based on the teaching of the present invention are within the scope of the present invention.
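As shown in FIG. 1, a city functional area identification method based on a multi-subspace model includes the following steps:
step 1, acquiring taxi track data and check-in data in a research area;
step 2, constructing a partition-oriented time sequence feature matrix C based on visit purpose;
step 3, inputting the time sequence feature matrix C to a sparse subspace clustering algorithm, and calculating the correspondence between the geographic units and the urban functional areas;
step 4, acquiring the salient feature locations of each functional area, and thereby identifying the main function of each functional area.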
Specifically, the process of constructing the time sequence feature matrix C in step 2 includes the following steps:
step 201, dividing the research area to obtain N geographic units;
step 202, preprocessing the taxi track data, eliminating abnormal points, extracting the end point and the arrival time of each journey, and mapping the end point and the geographic unit to obtain the visit record of the geographic unit;
step 203, matching the check-in data records with visit records of the geographic units, and classifying the purpose of each visit;
step 204, constructing a time sequence feature matrix C with M rows and N columns, which represents the human activity dynamics carried by the geographic units over a period of time, wherein M = T × D, T is the number of time segments, D is the number of visit-purpose categories, and each column of C gives the number of visits to the corresponding geographic unit for each purpose in each time segment.
Specifically, the sparse subspace clustering algorithm described in step 3 includes the following steps:
step 301, solving for a coefficient matrix Z of size N × N that minimizes the $\ell_1$ norm under the following constraints:
$$\min_{Z} \|Z\|_1 \quad \text{s.t.} \quad CZ = C,\ Z_{ii} = 0$$
wherein $\|\cdot\|_1$ denotes the $\ell_1$ norm and $Z_{ii}$ is the element in the i-th row and i-th column of Z; minimizing the $\ell_1$ norm makes Z sparse, so that the time sequence features of each geographic unit are represented only by linear combinations of the time sequence features of other geographic units in the same subspace;
step 302, establishing a similarity matrix of the data from the coefficient matrix, $W = |Z| + |Z|^{T}$, wherein W has size N × N and each entry is the similarity between the time sequence features of the two geographic units given by its row and column indices;
the similarity matrix W is block diagonal, that is, non-zero sub-matrices appear only along the main diagonal and all other sub-matrices are zero; each non-zero sub-matrix corresponds to one subspace; geographic units within the same subspace have highly similar time sequence features, whereas units in different subspaces differ markedly, so each subspace is an urban functional area to be detected;
step 303, calculating the number of subspaces from the normalized Laplacian matrix L of the similarity matrix W, $L = I - D^{-1/2} W D^{-1/2}$, wherein I is the identity matrix and D is the diagonal degree matrix with $D_{jj} = \sum_i W_{ij}$; sorting the eigenvalues of L in ascending order and calculating the difference $\lambda_{k+1} - \lambda_k$ between every two adjacent eigenvalues, the k with the largest difference being the number of subspaces, i.e. the number of urban functional areas to be detected;
step 304, applying K-means clustering to the similarity matrix W with the number of clusters set to the k obtained in step 303, which gives the correspondence between the geographic units and the k clusters, i.e. the k urban functional areas, and completes the detection of the urban functional areas.
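The eigengap rule of step 303 can be sketched in a few lines. The function name `estimate_num_subspaces` is hypothetical, and the sketch assumes every geographic unit has a non-zero degree (no all-zero row in W).

```python
import numpy as np


def estimate_num_subspaces(W):
    """Pick k at the largest gap between consecutive eigenvalues of the normalized Laplacian."""
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=0)  # D_jj = sum_i W_ij
    d_inv_sqrt = 1.0 / np.sqrt(degree)  # assumes no zero degrees
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(L))  # ascending order
    gaps = np.diff(eigvals)  # gaps[k-1] = lambda_{k+1} - lambda_k (1-based k)
    return int(np.argmax(gaps)) + 1


# Toy similarity matrix: two fully connected groups of three units each.
block = np.ones((3, 3)) - np.eye(3)
W_toy = np.kron(np.eye(2), block)
k = estimate_num_subspaces(W_toy)
print(k)
```

For a perfectly block-diagonal W the Laplacian has one zero eigenvalue per block, so the largest gap falls right after them.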
Specifically, obtaining the salient feature locations of each functional area in step 4 includes: using the correspondence from step 304, extracting from the similarity matrix W generated in step 302 the subspace matrices $S_1, \ldots, S_i, \ldots, S_k$ corresponding to the urban functional areas, and performing principal component analysis on each to obtain the eigenvectors $[e_1, e_2, \ldots, e_p, \ldots, e_M]_i$, called the feature locations of $S_i$; the first r eigenvectors $[e_1, e_2, \ldots, e_r]_i$, whose cumulative eigenvalue percentage exceeds 90%, are the salient feature locations of $S_i$.
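The 90% cumulative-eigenvalue rule can be sketched with a principal component analysis computed through the SVD. This sketch assumes the subspace matrix S is given as an M × n array whose columns are the M-dimensional vectors of the n geographic units in one functional area; that layout and the function name `salient_features` are assumptions, not details confirmed by the source.

```python
import numpy as np


def salient_features(S, threshold=0.90):
    """Return the leading eigenvectors whose cumulative eigenvalue share exceeds threshold."""
    S = np.asarray(S, dtype=float)
    centered = S - S.mean(axis=1, keepdims=True)  # center each of the M variables
    U, sing, _ = np.linalg.svd(centered, full_matrices=False)
    eigvals = sing ** 2  # proportional to covariance eigenvalues, descending
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # First position where the cumulative share is strictly above the threshold.
    r = int(np.searchsorted(ratio, threshold, side="right")) + 1
    return U[:, :r], ratio[:r]


# Toy area: four units whose 4-dimensional vectors all follow one pattern.
S_toy = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 2.0, -2.0])
E, share = salient_features(S_toy)
print(E.shape, np.round(share, 3))
```

The columns of `E` play the role of $[e_1, \ldots, e_r]_i$ in the text above.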
Specifically, identifying the main function of each functional area includes reshaping each salient feature location of the area into a matrix with D rows and T columns, in which each row shows how the activity level for one of the D visit purposes changes over the T time segments; this gives the main activity pattern of the functional area, and the most active function in that pattern is taken as the main function of the area.
Further, the method for identifying the urban functional area further comprises the following steps:
step 6, calculating the uniqueness of each functional area, and ranking the functional areas by uniqueness.
Specifically, the similarity of functional areas is calculated from the principal angles between their corresponding subspaces. For the subspaces $S_k$ and $S_l$ of any two functional areas, the similarity is computed from the principal angles $\theta^{(i)}$ between the two subspaces, wherein $\cos\theta^{(i)}$ is the i-th largest singular value of $U_k^{T} U_l$, $U_k$ and $U_l$ are orthonormal bases of $S_k$ and $S_l$ respectively, and $d_k \wedge d_l$ denotes the smaller of the dimensions $d_k$ and $d_l$ of $S_k$ and $S_l$.
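The principal-angle computation can be sketched as follows. The similarity formula itself did not survive in this text, so the normalized form used here, the square root of the mean squared cosine over the $d_k \wedge d_l$ principal angles, is an assumption based on one common definition of subspace affinity; the function names are hypothetical.

```python
import numpy as np


def orthonormal_basis(S, tol=1e-10):
    """Orthonormal basis of the column space of S, obtained from the SVD."""
    U, sing, _ = np.linalg.svd(np.asarray(S, dtype=float), full_matrices=False)
    return U[:, sing > tol * sing.max()]


def subspace_affinity(S_k, S_l):
    """Assumed normalized affinity: sqrt(mean of cos^2 of the principal angles)."""
    U_k, U_l = orthonormal_basis(S_k), orthonormal_basis(S_l)
    # Singular values of U_k^T U_l are the cosines of the principal angles.
    cosines = np.linalg.svd(U_k.T @ U_l, compute_uv=False)
    d = min(U_k.shape[1], U_l.shape[1])  # d_k ^ d_l
    cosines = np.clip(cosines[:d], 0.0, 1.0)
    return float(np.sqrt(np.sum(cosines ** 2) / d))


plane_xy = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
axis_z = np.array([[0.0], [0.0], [1.0]])
line_in_xy = np.array([[1.0], [1.0], [0.0]])
print(subspace_affinity(plane_xy, axis_z), subspace_affinity(plane_xy, line_in_xy))
```

Under this assumed definition the value is 0 for orthogonal subspaces and 1 when one subspace contains the other.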
Specifically, the uniqueness of a functional area is inversely related to its similarity to the other areas: if the similarity between subspaces is high, the functions of the corresponding areas are largely alike and their uniqueness is low. The uniqueness of each functional area $S_i$ is calculated from its similarity to the remaining areas, where k is the total number of functional areas and $S_{-i}$ denotes the functional areas other than $S_i$.
Further, the method for identifying the urban functional area further comprises the following steps:
step 7, calculating the abundance of each functional area, and ranking the functional areas by abundance.
Specifically, the abundance of a functional area is related to the reconstruction error of its salient feature locations. The error compares $C(S_i)$, the matrix formed by the original vectors belonging to subspace $S_i$, with the matrix reconstructed from the salient feature locations of $S_i$.
The reconstruction error measures how far the matrix restored from the salient feature locations deviates from the original subspace matrix. The larger the error, the more feature locations beyond the dominant salient ones are needed to describe the dynamics within the functional area. Abundance thus reflects the richness of people's activity patterns in the area and the functional development that supports them.
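A minimal sketch of such a reconstruction error follows. The exact formula is not legible in this text, so the sketch assumes the reconstruction is the orthogonal projection of the original vectors onto the span of the salient feature locations, with the error taken as the Frobenius norm of the residual. The function name and the argument layout (units as columns, orthonormal features as columns of `E`) are hypothetical.

```python
import numpy as np


def reconstruction_error(C_i, E):
    """Frobenius norm of C_i minus its projection onto the columns of E."""
    C_i = np.asarray(C_i, dtype=float)
    E = np.asarray(E, dtype=float)  # assumed to have orthonormal columns
    C_hat = E @ (E.T @ C_i)  # assumed reconstruction: orthogonal projection
    return float(np.linalg.norm(C_i - C_hat, "fro"))


C_area = np.array([[3.0, 0.0],
                   [0.0, 4.0],
                   [0.0, 0.0]])
one_feature = np.array([[1.0], [0.0], [0.0]])
two_features = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
print(reconstruction_error(C_area, one_feature),
      reconstruction_error(C_area, two_features))
```

Keeping more feature locations can only lower this error, which matches the reading that a large error signals activity patterns the dominant features do not capture.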
As shown in the flow chart of FIG. 2, the experiment of the present invention includes the following steps.
(1) Data processing
Step 1.1: The main urban area of Shanghai is selected as the research area and divided into 500 m × 500 m grid cells; after water cells are removed, 3166 geographic units remain.
Step 1.2: GPS track data from 6600 taxis in Shanghai are preprocessed, abnormal points are eliminated, the end point and arrival time of each journey are extracted, and the end points are mapped to the geographic units, giving 7852724 arrival records.
Step 1.3: The check-in data records are matched with the visit records of the geographic units, and the purpose of each visit is classified into one of six types: home, traffic, work, dining, entertainment and others (places such as parks, museums and libraries).
Step 1.4: dividing one day by hours to obtain 24 time periods, and counting the times of visiting each geographic unit in the 24 time periods for each purpose (total number 6) to obtain a time sequence characteristic matrix C with 144 rows and 3166 columns.
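The mapping of journey end points to 500 m grid cells in steps 1.1 and 1.2 can be sketched as below. The sketch assumes the end points are already in a projected coordinate system measured in meters (raw GPS longitude and latitude would need projecting first); the function name `cell_index` and the row-major cell numbering are hypothetical.

```python
import math


def cell_index(x, y, x_min, y_min, n_cols, n_rows, cell_size=500.0):
    """Return the row-major index of the grid cell containing (x, y), or None if outside."""
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y - y_min) / cell_size)
    if not (0 <= col < n_cols and 0 <= row < n_rows):
        return None  # end point falls outside the research area
    return row * n_cols + col


# A 4 x 3 grid whose lower-left corner is at the origin.
inside = cell_index(1250.0, 600.0, x_min=0.0, y_min=0.0, n_cols=4, n_rows=3)
outside = cell_index(2500.0, 600.0, x_min=0.0, y_min=0.0, n_cols=4, n_rows=3)
print(inside, outside)
```

Cells covering water would then be dropped and the remaining cells renumbered to obtain the final geographic units.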
(2) Urban functional area identification
Step 2.1: The time sequence feature matrix C is input to the sparse subspace clustering algorithm to obtain the similarity matrix W. FIG. 3 visualizes W, which reveals the similarity between geographic units: non-zero similarity values are colored black, and five diagonal blocks are visible, indicating that the number of urban functional areas is 5.
Step 2.2: The number of subspaces is computed from the normalized Laplacian matrix L of W, $L = I - D^{-1/2} W D^{-1/2}$, where I is the identity matrix and D is the diagonal degree matrix with $D_{jj} = \sum_i W_{ij}$. The eigenvalues of L are sorted in ascending order and the difference $\lambda_{k+1} - \lambda_k$ between every two adjacent eigenvalues is calculated; the largest difference occurs at k = 5, so the number of subspaces (urban functional areas) is 5, consistent with the reading of FIG. 3 in step 2.1.
Step 2.3: therefore, the K-means clustering method is used for W, the clustering number is set to 5, and the urban functional area detection is completed to obtain urban functional areas 1, 2, 3, 4 and 5. The results of the visualization of the clustering results on the map are shown in fig. 4, where it can be seen that the central area is mainly covered by the functional area 5.
Step 2.4: The main function of a functional area is determined by its salient activity features. To determine the actual function of each detected area, principal component analysis is performed on the subspace matrix of each functional area to obtain its feature locations. The first 5 eigenvalues account for more than 90% of the total in functional areas 1, 2, 3 and 4, but less than 90% in functional area 5. The basis vectors corresponding to the first 5 eigenvalues are therefore used as the salient feature locations of each functional area, except that the basis vectors corresponding to the first 10 eigenvalues are used when analyzing functional area 5.
Step 2.5: Each salient feature location of each functional area is reshaped into a matrix of 6 rows and 24 columns, in which each row shows how the activity level changes over 24 hours for one purpose: home (H), traffic (Tr), work (W), dining (D), entertainment (E) and others (O, places such as parks, museums and libraries). The salient feature locations of all functional areas are shown in FIG. 5. As the figure shows, home activity (H) is the most active in the salient feature locations of functional area 1, dining activity (D) ranks second, and entertainment activity (E) is also prominent, so functional area 1 can be regarded as a residential area with well-developed dining and entertainment facilities. Similarly, functional area 2 is dominated by traffic activity (Tr) and is a transport hub; functional area 3 is dominated by work activity (W) and is a work area; for functional area 5, judged on its first 10 salient feature locations, dining (D) and entertainment (E) activities are the most active, so it is regarded as a commercial area; and functional area 4 corresponds to other functions, such as parks, museums and gas stations.
(3) Urban functional area analysis
Step 3.1: The closeness of the subspaces, i.e. the similarity of the functional areas, is calculated from the principal angles between the subspaces (see FIG. 6); the similarity of a functional area with itself is not calculated and is set to 0. The residential and commercial areas in FIG. 6 have the highest similarity, because residential areas tend to have dining and entertainment facilities and the commercial areas in FIG. 3 are themselves interspersed with many residential areas.
Step 3.2: The uniqueness of each functional area is calculated from the similarities, and the results are shown in FIG. 7. Uniqueness is high overall, indicating that the functional areas of the research area differ markedly from one another. The residential and commercial areas have low uniqueness, which is consistent with the result of step 3.1.
Step 3.3: The abundance of each functional area is calculated, and the results are shown in FIG. 7. The area providing other services has the largest reconstruction error, meaning that its activity patterns are the most complex: it contains many kinds of facilities whose dynamic activity patterns differ widely. The residential and commercial areas have the smallest reconstruction errors, because they concentrate on living and on dining and entertainment respectively, so their dynamic activity patterns are relatively uniform.
As the summary and the embodiments show, to solve the problems of the prior art the invention provides a multi-subspace model that treats a city functional area as having multiple groups of features. When the spatio-temporal activity information of each geographic unit is expressed as a vector, the vector samples lie in a high-dimensional space formed by a union of subspaces. Geographic units located in the same subspace carry similar dynamic features of human activity and can be clustered into one functional area, so city functional areas are identified by searching for the subspaces. The uniqueness and abundance of each functional area are then analyzed from the geometric properties of the subspaces, providing fine-grained quantitative indicators for the management and development of city functional areas.
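The vector representation of a geographic unit mentioned above can be sketched in plain Python as follows. The sketch assumes that each visit has already been reduced to an integer triple (unit index, time slot, purpose index) and that the rows of the matrix are ordered purpose by purpose. Both the record format and the function name `build_feature_matrix` are assumptions for illustration.

```python
def build_feature_matrix(visits, n_units, n_slots=24, n_purposes=6):
    """Count visits into a matrix with n_slots * n_purposes rows, n_units columns.

    visits: iterable of (unit, slot, purpose) integer triples, one per visit.
    Assumed row layout: row index = purpose * n_slots + slot.
    """
    n_rows = n_slots * n_purposes
    C = [[0] * n_units for _ in range(n_rows)]
    for unit, slot, purpose in visits:
        in_range = (0 <= unit < n_units and 0 <= slot < n_slots
                    and 0 <= purpose < n_purposes)
        if not in_range:
            continue  # ignore records that fall outside the study area or day
        C[purpose * n_slots + slot][unit] += 1
    return C
```

Each column of the result is then the activity vector of one geographic unit, which is the sample that the subspace search operates on.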
Claims (3)
1. A city functional area identification method based on a multi-subspace model is characterized by comprising the following steps:
step 1, taxi track data and check-in data in a research area are obtained;
step 2, constructing a partition-oriented time sequence characteristic matrix C based on the visit purpose;
step 3, inputting the time sequence characteristic matrix C to a sparse subspace clustering algorithm, and calculating to obtain the corresponding relation between the geographic unit and the urban functional area;
step 4, acquiring the salient feature locations of each functional area, and further identifying the main function of each functional area;
the construction process of the time sequence characteristic matrix C in the step 2 comprises the following steps:
step 201, dividing the research area to obtain N geographic units;
step 202, preprocessing the taxi track data, eliminating abnormal points, extracting the end point and the arrival time of each journey, and mapping each end point to a geographic unit to obtain the visit records of the geographic units;
step 203, matching the check-in data records with visit records of the geographic units, and classifying the purpose of each visit;
step 204, constructing a time sequence characteristic matrix C with M rows and N columns, wherein the time sequence characteristic matrix C represents the dynamics of human activity carried by the geographic units over a period of time, $M = T \times D$, T represents the number of divided time segments, D represents the number of categories of visit purposes, and each column in C represents the number of people visiting the corresponding geographic unit in different time segments for different purposes;
the sparse subspace clustering algorithm in the step 3 comprises the following steps:
step 301, solving a coefficient matrix Z of size N × N, wherein the matrix Z is obtained by $\ell_1$ minimization under constraints:
$$\min_{Z} \|Z\|_1 \quad \text{s.t.} \quad CZ = C,\ Z_{ii} = 0$$
wherein $\|\cdot\|_1$ denotes the $\ell_1$ norm and $Z_{ii}$ is the value of the element in the i-th row and i-th column of the matrix Z; the $\ell_1$ norm minimization makes the coefficient matrix Z sparse, so that the time sequence characteristics of each geographic unit are represented only by a linear combination of the time sequence characteristics of other geographic units in the same subspace;
step 302, establishing a similarity matrix W of the data from the coefficient matrix Z, wherein the size of W is N × N and each value in the matrix is the similarity, in terms of time sequence characteristics, of the geographic units corresponding to its indices;
step 303, calculating the number of subspaces by using the normalized Laplacian matrix L of the similarity matrix W, where $L = I - D^{-1/2} W D^{-1/2}$, I is the identity matrix and D is the diagonal degree matrix with $D_{jj} = \sum_i W_{ij}$; sorting the eigenvalues of L in ascending order and calculating the difference $\lambda_{k+1} - \lambda_k$ between every two adjacent eigenvalues, the k corresponding to the maximum difference being the required number of subspaces, i.e. the number of city functional areas to be detected; $W_{ij}$ denotes the value of the element in the i-th row and j-th column of the matrix W;
step 304, applying a K-means clustering method to the similarity matrix W with the number of clusters set to the k obtained in step 303, obtaining the correspondence between the geographic units and the k categories, i.e. the correspondence between the geographic units and the k city functional areas, and completing the detection of the city functional areas;
the obtaining of the salient feature locations of each functional area in step 4 comprises: using the correspondence obtained in step 304, extracting from the similarity matrix W generated in step 302 the subspace matrices $S_1, \ldots, S_i, \ldots, S_k$ corresponding to the city functional areas, and performing principal component analysis on each to obtain the eigenvectors $[e_1, e_2, \ldots, e_p, \ldots, e_M]_i$ of $S_i$; the first r eigenvectors $[e_1, e_2, \ldots, e_r]_i$, whose cumulative eigenvalue percentage is higher than 90%, are the salient feature locations of $S_i$;
the main function identification of each functional area comprises: reshaping each salient feature location of each functional area into a matrix with D rows and T columns, wherein each row represents the change in activity level of the feature location for one of the D visit purposes over the T time segments, to obtain the main activity pattern of the functional area; and labeling the functional area with the most active function in the main activity pattern, thereby completing the city functional area identification.
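Steps 302 and 303 can be sketched in NumPy as follows. The sketch assumes the symmetrisation $W = |Z| + |Z|^{T}$ that is common in sparse subspace clustering, which this text does not spell out, and the function names are illustrative. Solving the $\ell_1$ problem of step 301 and the K-means clustering of step 304 are left out.

```python
import numpy as np


def similarity_from_coefficients(Z):
    """Symmetrise the coefficient matrix. Assumed form: W = |Z| + |Z|^T."""
    A = np.abs(Z)
    return A + A.T


def normalized_laplacian(W):
    """L = I - D^{-1/2} W D^{-1/2}, with D the diagonal degree matrix of W."""
    degrees = W.sum(axis=0)
    inv_sqrt = np.zeros_like(degrees, dtype=float)
    nonzero = degrees > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degrees[nonzero])  # isolated nodes stay 0
    return np.eye(W.shape[0]) - (inv_sqrt[:, None] * W) * inv_sqrt[None, :]


def estimate_num_subspaces(W):
    """Return the k that maximises lambda_{k+1} - lambda_k (ascending order)."""
    eigvals = np.linalg.eigvalsh(normalized_laplacian(W))  # sorted ascending
    gaps = np.diff(eigvals)
    return int(np.argmax(gaps)) + 1
```

For a similarity matrix made of well separated blocks, the Laplacian has one near-zero eigenvalue per block, so the largest gap falls right after them and the returned k equals the number of blocks.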
2. The urban functional area identification method according to claim 1, wherein said urban functional area identification method further comprises the steps of:
step 5, calculating the similarity of each functional area;
step 6, calculating the uniqueness of each functional area, and ranking the functional areas by uniqueness;
the similarity of the functional areas is calculated according to the principal angles between the corresponding subspaces, and the similarity $\mathrm{aff}(S_k, S_l)$ of the subspaces $S_k$ and $S_l$ corresponding to any two functional areas is calculated as follows,
wherein $\cos\theta^{(i)}$ is the i-th largest singular value of $U_k^{T} U_l$, $U_k$ and $U_l$ are orthogonal bases of $S_k$ and $S_l$ respectively, $\theta^{(i)}$ is the i-th principal angle between the subspaces, and $d_k \wedge d_l$ denotes the smaller of the dimensions $d_k$ and $d_l$ of $S_k$ and $S_l$;
the uniqueness of a functional area is inversely related to its similarity to the others: if the similarity between subspaces is high, the functions of the corresponding functional areas are largely similar and the uniqueness of the functional areas is low; the uniqueness of each functional area $S_i$ is calculated as follows:
where k is the total number of functional areas and $S_{-i}$ denotes the functional areas other than $S_i$.
3. The urban functional area identification method according to claim 2, wherein said urban functional area identification method further comprises the steps of:
step 7, calculating the abundance of each functional area, and ranking the functional areas by abundance;
the abundance of a functional area is related to the reconstruction error of the salient feature locations of that functional area, which is calculated as follows:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010484901.8A CN111651502B (en) | 2020-06-01 | 2020-06-01 | City functional area identification method based on multi-subspace model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111651502A (en) | 2020-09-11 |
CN111651502B (en) | 2021-09-14 |
Family
ID=72344015
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010484901.8A Active CN111651502B (en) | 2020-06-01 | 2020-06-01 | City functional area identification method based on multi-subspace model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111651502B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112559909B (en) * | 2020-12-18 | 2022-06-21 | 浙江工业大学 | Business area discovery method based on GCN embedded spatial clustering model |
CN113343781B (en) * | 2021-05-17 | 2022-02-01 | 武汉大学 | City functional area identification method using remote sensing data and taxi track data |
CN113806419B (en) * | 2021-08-26 | 2024-04-12 | 西北大学 | Urban area function recognition model and recognition method based on space-time big data |
CN113902185B (en) * | 2021-09-30 | 2023-10-31 | 北京百度网讯科技有限公司 | Determination method and device for regional land property, electronic equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105117595A (en) * | 2015-08-19 | 2015-12-02 | 大连理工大学 | Floating car data based private car travel data integration method |
CN108764193A (en) * | 2018-06-04 | 2018-11-06 | 北京师范大学 | Merge the city function limited region dividing method of POI and remote sensing image |
CN108876475A (en) * | 2018-07-12 | 2018-11-23 | 青岛理工大学 | City functional area identification method based on interest point acquisition, server and storage medium |
CN110298500A (en) * | 2019-06-19 | 2019-10-01 | 大连理工大学 | A kind of urban transportation track data set creation method based on taxi car data and city road network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10515101B2 (en) * | 2016-04-19 | 2019-12-24 | Strava, Inc. | Determining clusters of similar activities |
Non-Patent Citations (5)
Title |
---|
Latent spatio-temporal activity structures: a new approach to inferring intra-urban functional regions via social media check-in data;Ye Zhi等;《Geo-spatial Information Science》;20160517;第94-105页 * |
基于Landsat与DMSP-OLS的非监督城区提取方法研究;柯文聪等;《测绘与空间地理信息》;20181231;第183-186页 * |
基于出租车和POI数据的城市土地利用现状变化研究;刘旭;《中国优秀硕士学位论文全文数据库 基础科技辑》;20200415;第A008-106页 * |
基于签到数据的城市热点功能区识别研究;宁鹏飞等;《测绘地理信息》;20180430;第110-114页 * |
基于轨迹和兴趣点数据的城市功能区动态识别与时变规律可视分析;张慧杰等;《计算机辅助设计与图形学学报》;20180930;第1728-1739页 * |
Also Published As
Publication number | Publication date |
---|---|
CN111651502A (en) | 2020-09-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111651502B (en) | City functional area identification method based on multi-subspace model | |
CN112949413B (en) | City landscape element classification and locality measurement method based on street view picture | |
CN111651545A (en) | Urban marginal area extraction method based on multi-source data fusion | |
CN110264709A (en) | The prediction technique of the magnitude of traffic flow of road based on figure convolutional network | |
CN109493119B (en) | POI data-based urban business center identification method and system | |
CN110674858B (en) | Traffic public opinion detection method based on space-time correlation and big data mining | |
CN106991510A (en) | A kind of method based on the traffic accident of spatial-temporal distribution characteristic predicted city | |
CN112819207A (en) | Geological disaster space prediction method, system and storage medium based on similarity measurement | |
CN109840272B (en) | Method for predicting user demand of shared electric automobile station | |
CN108898244B (en) | Digital signage position recommendation method coupled with multi-source elements | |
CN106227884A (en) | A kind of recommendation method of calling a taxi online based on collaborative filtering | |
CN112598165A (en) | Private car data-based urban functional area transfer flow prediction method and device | |
CN112561401A (en) | City vitality measurement and characterization method and system based on multi-source big data | |
Renigier-Biłozor et al. | Modern challenges of property market analysis-homogeneous areas determination | |
CN114444794A (en) | Travel intention prediction method based on double-intention diagram embedded network | |
CN116542708A (en) | Intelligent high-quality business gate shop-form recommendation and grading scoring method thereof | |
CN118195172A (en) | Method, device, medium and product for determining urban integration degree of railway passenger station | |
Qiu et al. | RPSBPT: A route planning scheme with best profit for taxi | |
Bajat et al. | Spatial hedonic modeling of housing prices using auxiliary maps | |
CN117408167A (en) | Debris flow disaster vulnerability prediction method based on deep neural network | |
Lin et al. | Enhancing Urban Land Use Identification Using Urban Morphology | |
Keskin et al. | Cohort fertility heterogeneity during the fertility decline period in Turkey | |
CN117076788A (en) | Personnel foothold point location analysis method and system based on multidimensional sensing data | |
Haining | Data problems in spatial econometric modeling | |
CN114881309B (en) | Method for measuring characteristic correlation between urban activity and carbon evacuation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||