CN111625549A - Real estate registration space data user building rapid fuzzy matching method - Google Patents

Info

Publication number
CN111625549A
Authority
CN
China
Prior art keywords
house
similarity
data
graph
matching
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010356805.5A
Other languages
Chinese (zh)
Other versions
CN111625549B (en)
Inventor
黄锦丞
朱江洪
刘越岩
王宇
宫子强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Application filed by China University of Geosciences
Priority to CN202010356805.5A
Publication of CN111625549A
Application granted
Publication of CN111625549B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2425 Iterative querying; Query formulation based on the results of a preceding query
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F16/29 Geographical information databases

Abstract

The invention provides a rapid fuzzy matching method for household buildings in real estate registration spatial data, comprising the following steps: first acquiring data, then preprocessing the data, and finally processing the data, namely fuzzy matching of elements, in which each layered household graph is quickly matched and associated in turn. The beneficial effects of the invention are as follows: the rapid fuzzy matching model matches the natural building base outline of a layered household graph against the real estate registration spatial data set, so that layered household graphs with uncertain geographic position information are quickly matched and associated with the real estate registration spatial data set, which improves efficiency and reduces time and labor costs while preserving accuracy.

Description

Real estate registration space data user building rapid fuzzy matching method
Technical Field
The invention relates to the technical field of computer graphics matching, and in particular to a rapid fuzzy matching method for household buildings in real estate registration spatial data.
Background
In accordance with the notices and technical standards on unified real estate registration issued by the state, local governments have been collecting existing real estate registration spatial data and working to establish a scientific and feasible technical method for cleaning and integrating real estate registration data, so as to form a real estate registration spatial data association model of "one building, one floor, one household, one certificate". For historical reasons, existing property information comprises house ownership information and house survey information. With the splitting and merging of streets and changes of street names in recent years, the house location information contained in the house survey information has become inaccurate or missing and can no longer be matched with the original real estate registration spatial data set. To meet the demands of rapid social development in the information age and to support real estate ownership registration and certification, layered household graphs in the real estate register that have not been brought under graphic database management need to be matched with the house survey information and geographic position information in the real estate registration spatial data set, so that household-level real estate registration spatial data are associated with ownership information and a complete, unified and internally linked real estate registration database is formed.
At present, the usual way to find the current address of a house whose location information has changed is manual visual matching: staff must manually compare the original layered household graphs with the existing real estate registration spatial data set of their administrative district, which consumes a great deal of time and labor. Automatic matching of the location spatial data of household buildings is therefore of great practical significance.
Disclosure of Invention
To solve the above problems, the invention provides a rapid fuzzy matching method for household buildings in real estate registration spatial data. The method comprises the following steps:
S101: data acquisition: acquiring the existing layered household graphs, the real estate registration spatial data set and the place name address information;
S102: data preprocessing: screening out the layered household graphs that cannot be associated with address information in the place name address database, and vectorizing the screened graphs to obtain the shape index SI, the elongation AR, the house area S, the floor number N and the jurisdiction of each layered household graph;
according to the jurisdiction of each layered household graph, extracting all house data within that jurisdiction from the real estate registration spatial data set as the objects to be matched in the element matching stage of that graph, and setting an error tolerance range Δ;
S103: data processing, namely element fuzzy matching: performing quick matching association for each layered household graph in turn, with the house area S, the floor number N, the shape index SI and the elongation AR of the house in each layered household graph taken as similarity factors E for which similarity is calculated one by one.
Further, in step S102, the shape index SI and the elongation AR of the vector graphic corresponding to each layered household graph are calculated automatically in ArcGIS software, and the house area S, the floor number N and the jurisdiction are extracted.
Further, in step S103, the similarity calculation comprises five stages in sequence: area similarity matching, floor number similarity matching, shape similarity matching, elongation similarity matching and total similarity matching.
Further, in step S103, for a layered household graph that cannot be associated with address information, the element fuzzy matching proceeds as follows:
Stage S1, area similarity matching: extracting all house data within the jurisdiction of the layered household graph from the real estate registration spatial data set as the first house data set to be matched of the layered household graph, and calculating the area similarity SF_S between each extracted house data item and the layered household graph according to the following formula:
[Formula image BDA0002473741290000021: area similarity SF_S in terms of S_1 and S_2]
In the above formula, S_1 is the house area of the layered household graph that cannot be associated with address information, and S_2 is the area of a house data item within the jurisdiction of the layered household graph in the real estate registration spatial data set. The house area is calculated as follows:
[Formula image BDA0002473741290000022: house area S in terms of the turning point coordinates (X_k, Y_k)]
where (X_k, Y_k) are the coordinates of the k-th turning point of the vector graphic; k = 1, 2, ..., n indexes the 1st to n-th turning points; and n is the total number of turning points of the vector graphic;
traversing all house data within the jurisdiction of the layered household graph to obtain the area similarity between each house data item and the house of the layered household graph, keeping all house data whose area similarity is within the error tolerance range Δ as the second house data set to be matched, which takes part in stage S2 matching, and removing the house data whose area similarity is not within Δ;
Stage S2, floor number similarity matching: calculating the floor number similarity SF_F between each house data item in the second house data set to be matched and the house in the layered household graph according to the following formula:
[Formula image BDA0002473741290000031: floor number similarity SF_F in terms of F_1 and F_2]
In the above formula, F_1 is the number of floors of the house in the layered household graph, and F_2 is the number of floors of a house data item in the second house data set to be matched;
traversing all house data in the second house data set to be matched to obtain the floor number similarity between the house in the layered household graph and each house data item, keeping all house data whose floor number similarity is within the error tolerance range Δ as the third house data set to be matched, which takes part in stage S3 matching, and removing the house data whose floor number similarity is not within Δ;
Stage S3, shape similarity matching: calculating the shape similarity SF_SI between each house data item in the third house data set to be matched and the house in the layered household graph according to the following formula:
[Formula image BDA0002473741290000032: shape similarity SF_SI in terms of SI_1 and SI_2]
In the above formula, SI_1 is the shape index of the house in the layered household graph, and SI_2 is the shape index of a house data item in the third house data set to be matched. The shape index SI is calculated as follows:
[Formula images BDA0002473741290000033 and BDA0002473741290000034: shape index SI in terms of the perimeter L and the area S, and perimeter L in terms of the turning point coordinates (X_i, Y_i)]
In the above formulas, L is the perimeter of the vector graphic and S is its area; (X_i, Y_i) are the coordinates of the i-th turning point of the vector graphic; and n is the total number of turning points of the vector graphic;
traversing all house data in the third house data set to be matched to obtain the shape similarity between the house in the layered household graph and each house data item, keeping all house data whose shape similarity is within the error tolerance range Δ as the fourth house data set to be matched, which takes part in stage S4 matching, and removing the house data whose shape similarity is not within Δ;
Stage S4, elongation similarity matching: calculating the elongation similarity SF_AR between each house data item in the fourth house data set to be matched and the house in the layered household graph according to the following formula:
[Formula image BDA0002473741290000041: elongation similarity SF_AR in terms of AR_1 and AR_2]
In the above formula, AR_1 is the elongation of the house in the layered household graph, and AR_2 is the elongation of a house data item in the fourth house data set to be matched. The elongation is calculated as follows:
[Formula image BDA0002473741290000042: elongation AR as the ratio of L to W]
In the above formula, AR (aspect ratio) is the elongation, that is, the aspect ratio of the minimum bounding rectangle of the vector graphic; L is the length of the minimum bounding rectangle and W is its width;
traversing all house data in the fourth house data set to be matched to obtain the elongation similarity between the house in the layered household graph and each house data item, keeping all house data whose elongation similarity is within the error tolerance range Δ as the fifth house data set to be matched, which takes part in stage S5 matching, and removing the house data whose elongation similarity is not within Δ;
Stage S5, total similarity matching: calculating the total similarity SM between each house data item in the fifth house data set to be matched and the house in the layered household graph according to the following formula:
[Formula image BDA0002473741290000043: total similarity SM in terms of the weight coefficients A_i and the similarities SF_i, i = 1, ..., n]
In the above formula, n = 4 is the total number of similarity factors; A_i is the weight coefficient of the i-th similarity factor and is a preset value; SF_i is the similarity of the i-th similarity factor; and i = 1, 2, 3, 4 corresponds to the four similarity factors of house area, floor number, shape index and elongation respectively;
traversing all house data in the fifth house data set to be matched to obtain the total similarity between the house in the layered household graph and each house data item, sorting all total similarities in descending order, and selecting the house data with the largest total similarity as the element fuzzy matching result of the layered household graph.
Further, if the fifth house data set to be matched contains only one house data item, that item is taken as the element fuzzy matching result of the layered household graph.
Further, the data processing also comprises a data output stage, in which technicians consult the relevant files according to the element fuzzy matching result to determine the location of the house in the corresponding layered household graph, verify the location information by field investigation, and finally sort the element fuzzy matching results and enter them into the real estate registration database.
In step S102, the error tolerance range Δ is [0.9, 1.1].
The beneficial effects of the technical scheme provided by the invention are as follows: the rapid fuzzy matching model matches the natural building base outline of a layered household graph against the real estate registration spatial data set, so that layered household graphs with unknown geographic position information are quickly matched and associated with the real estate registration spatial data set while matching accuracy is preserved, which improves working efficiency and reduces time and labor costs.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a flowchart of the rapid fuzzy matching method for household buildings in real estate registration spatial data in an embodiment of the present invention;
FIG. 2 is a detailed flowchart of the rapid fuzzy matching method for household buildings in real estate registration spatial data in an embodiment of the present invention.
Detailed Description
For a clearer understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
An embodiment of the invention provides a rapid fuzzy matching method for household buildings in real estate registration spatial data. Referring to FIG. 1, which is a flowchart of the method in this embodiment, the method comprises the following steps:
S101: data acquisition: acquiring the existing layered household graphs, the real estate registration spatial data set and the place name address information;
S102: data preprocessing: screening out the layered household graphs that cannot be associated with place name address information, and vectorizing the screened graphs to obtain the shape index SI, the elongation AR (aspect ratio), the house area S, the floor number N and the jurisdiction of each layered household graph;
according to the jurisdiction of each layered household graph, extracting all house data within that jurisdiction from the real estate registration spatial data set as the objects to be matched in the element matching stage of that graph, and setting the error tolerance range Δ to [0.9, 1.1];
S103: data processing, namely element fuzzy matching: performing quick matching association for each layered household graph in turn, with the house area S, the floor number N, the shape index SI and the elongation AR of the house in each layered household graph taken as similarity factors E for which similarity is calculated one by one. The similarity calculation comprises five stages in sequence: area similarity matching, floor number similarity matching, shape similarity matching, elongation similarity matching and total similarity matching.
In step S102, the shape index SI and the elongation AR of the vector graphic corresponding to each layered household graph are calculated automatically in ArcGIS software, and the house area S, the floor number N and the jurisdiction are extracted.
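The staged screening in S103 can be sketched in Python as a cascade of filters, where each stage only sees the records that survived the previous one. This is a minimal sketch, not the patented implementation. The record layout (dictionaries keyed by `area`, `floors`, `shape_index`, `elongation`), the function name `cascade_filter` and the sample values are hypothetical. The use of a plain candidate-to-query ratio as the per-factor similarity is also an assumption, because the similarity formulas appear only as images in the source.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]

# Screening order of stages S1-S4 (hypothetical field names).
FACTORS = ("area", "floors", "shape_index", "elongation")


def cascade_filter(
    query: Record,
    candidates: Dict[str, Record],
    delta: Tuple[float, float] = (0.9, 1.1),
) -> Tuple[List[str], List[int]]:
    """Filter candidates factor by factor; return survivors and stage sizes."""
    low, high = delta
    remaining = dict(candidates)
    sizes = []
    for factor in FACTORS:
        # ASSUMPTION: similarity is the ratio candidate / query.
        remaining = {
            house_id: rec
            for house_id, rec in remaining.items()
            if low <= rec[factor] / query[factor] <= high
        }
        sizes.append(len(remaining))
    return list(remaining), sizes


# Hypothetical query graph and candidate houses in the same jurisdiction.
query = {"area": 100.0, "floors": 6, "shape_index": 1.2, "elongation": 1.5}
candidates = {
    "h1": {"area": 98.0, "floors": 6, "shape_index": 1.22, "elongation": 1.48},
    "h2": {"area": 150.0, "floors": 6, "shape_index": 1.2, "elongation": 1.5},
    "h3": {"area": 101.0, "floors": 9, "shape_index": 1.2, "elongation": 1.5},
    "h4": {"area": 100.0, "floors": 6, "shape_index": 1.6, "elongation": 1.5},
}
survivors, stage_sizes = cascade_filter(query, candidates)
print(survivors, stage_sizes)
```

Running the cheap attribute comparisons first shrinks the candidate set before the geometry-dependent factors are compared, which is the ordering the five-stage sequence describes.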
FIG. 2 is a detailed flowchart of the rapid fuzzy matching method for household buildings in real estate registration spatial data in an embodiment of the present invention. In step S103, for a layered household graph that cannot be associated with address information, the element fuzzy matching proceeds as follows:
Stage S1, area similarity matching: extracting all house data within the jurisdiction of the layered household graph from the real estate registration spatial data set as the first house data set to be matched of the layered household graph, and calculating the area similarity SF_S between each extracted house data item and the layered household graph according to the following formula:
[Formula image BDA0002473741290000061: area similarity SF_S in terms of S_1 and S_2]
In the above formula, S_1 is the house area of the layered household graph that cannot be associated with place name address information, and S_2 is the area of a house data item within the jurisdiction of the layered household graph in the real estate registration spatial data set. The house area is calculated as follows:
[Formula image BDA0002473741290000062: house area S in terms of the turning point coordinates (X_k, Y_k)]
where (X_k, Y_k) are the coordinates of the k-th turning point of the vector graphic; k = 1, 2, ..., n indexes the 1st to n-th turning points; and n is the total number of turning points of the vector graphic.
All house data within the jurisdiction of the layered household graph are traversed to obtain the area similarity between each house data item and the house of the layered household graph. All house data whose area similarity is within the error tolerance range Δ are kept as the second house data set to be matched, which takes part in stage S2 matching; house data whose area similarity is not within Δ are removed.
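Stage S1 can be illustrated with a short Python sketch. The area function uses the standard shoelace formula over the turning points, which is one common way to compute a polygon area from vertex coordinates; the source's own area formula is an image and may be written differently. The ratio form of `area_similarity`, the function names and the sample outlines are assumptions for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(points: Sequence[Point]) -> float:
    """Area of a simple polygon from its turning points (shoelace formula)."""
    n = len(points)
    twice_area = 0.0
    for k in range(n):
        x_k, y_k = points[k]
        x_next, y_next = points[(k + 1) % n]  # wrap around to the first point
        twice_area += x_k * y_next - x_next * y_k
    return abs(twice_area) / 2.0


def area_similarity(s1: float, s2: float) -> float:
    # ASSUMPTION: similarity is the ratio of candidate area to query area.
    return s2 / s1


def stage_s1(
    query_outline: Sequence[Point],
    first_set: Dict[str, Sequence[Point]],
    delta: Tuple[float, float] = (0.9, 1.1),
) -> List[str]:
    """Keep the candidates whose area similarity lies within delta."""
    low, high = delta
    s1 = polygon_area(query_outline)
    return [
        house_id
        for house_id, outline in first_set.items()
        if low <= area_similarity(s1, polygon_area(outline)) <= high
    ]


# Hypothetical outlines: the query is 10 x 8, "A" is 9 x 9, "B" is 20 x 8.
query_outline = [(0, 0), (10, 0), (10, 8), (0, 8)]
first_set = {
    "A": [(0, 0), (9, 0), (9, 9), (0, 9)],
    "B": [(0, 0), (20, 0), (20, 8), (0, 8)],
}
second_set = stage_s1(query_outline, first_set)
print(second_set)
```

Here candidate "A" (area 81 against 80) stays in the second data set, while "B" (area 160) is removed.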
Stage S2, floor number similarity matching: calculating the floor number similarity SF_F between each house data item in the second house data set to be matched and the house in the layered household graph according to the following formula:
[Formula image BDA0002473741290000063: floor number similarity SF_F in terms of F_1 and F_2]
In the above formula, F_1 is the number of floors of the house in the layered household graph, and F_2 is the number of floors of a house data item in the second house data set to be matched. The floor attributes of the layered household graph and of the real estate registration spatial data set may be inconsistent, so an error tolerance range Δ needs to be set for the number of floors. Using the extracted floor numbers, all floor fields are traversed, and the floor number of each record in the real estate registration spatial data set is checked against the error tolerance range; if it is within the range, the selected record is kept and output, otherwise the selected record is deleted.
All house data in the second house data set to be matched are traversed to obtain the floor number similarity between the house in the layered household graph and each house data item. All house data whose floor number similarity is within the error tolerance range Δ are kept as the third house data set to be matched, which takes part in stage S3 matching; house data whose floor number similarity is not within Δ are removed.
Stage S3, shape similarity matching: calculating the shape similarity SF_SI between each house data item in the third house data set to be matched and the house in the layered household graph according to the following formula:
[Formula image BDA0002473741290000071: shape similarity SF_SI in terms of SI_1 and SI_2]
In the above formula, SI_1 is the shape index of the house in the layered household graph, and SI_2 is the shape index of a house data item in the third house data set to be matched. The shape index SI is calculated as follows:
[Formula images BDA0002473741290000072 and BDA0002473741290000073: shape index SI in terms of the perimeter L and the area S, and perimeter L in terms of the turning point coordinates (X_i, Y_i)]
In the above formulas, L is the perimeter of the vector graphic and S is its area; (X_i, Y_i) are the coordinates of the i-th turning point of the vector graphic; and n is the total number of turning points of the vector graphic. The closer SI is to 1, the less the polygon of the layered household graph deviates from the standard shape; otherwise the shape is more complicated. The closer SF_SI is to 1, the more similar the two polygons are. Differences between measuring instruments, inconsistent measurement standards and similar causes introduce measurement errors, and property maps are not drawn entirely consistently (for example in how balconies and garbage chutes are represented); such drawing differences have some influence on the overall area. First, the perimeter L and the area S of the layered household graph and of each graphic in the real estate registration spatial data set are calculated and substituted into the formula to obtain the respective shape indices. Then whether the SF_SI of each record in the real estate registration spatial data set is within the error tolerance range is judged; if not, the selected record is deleted, otherwise the record is kept.
All house data in the third house data set to be matched are traversed to obtain the shape similarity between the house in the layered household graph and each house data item. All house data whose shape similarity is within the error tolerance range Δ are kept as the fourth house data set to be matched, which takes part in stage S4 matching; house data whose shape similarity is not within Δ are removed.
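The shape index step can be sketched as follows. The source's SI formula is an image, so the normalization used here, SI = L / (4 * sqrt(S)), which gives exactly 1 for a square, is an assumption chosen only to agree with the statement that values near 1 indicate a shape close to the standard one. The ratio form of `shape_similarity` and all function names are likewise hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def perimeter(points: Sequence[Point]) -> float:
    """Sum of the edge lengths between consecutive turning points."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x_i, y_i = points[i]
        x_next, y_next = points[(i + 1) % n]  # close the ring
        total += math.hypot(x_next - x_i, y_next - y_i)
    return total


def area(points: Sequence[Point]) -> float:
    """Polygon area by the shoelace formula."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1]
        - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def shape_index(points: Sequence[Point]) -> float:
    # ASSUMPTION: square-normalized index, equal to 1.0 for a square.
    return perimeter(points) / (4.0 * math.sqrt(area(points)))


def shape_similarity(si_1: float, si_2: float) -> float:
    # ASSUMPTION: similarity is the ratio of candidate SI to query SI.
    return si_2 / si_1


square = [(0, 0), (10, 0), (10, 10), (0, 10)]  # L = 40, S = 100
strip = [(0, 0), (20, 0), (20, 5), (0, 5)]     # L = 50, S = 100
si_square = shape_index(square)
si_strip = shape_index(strip)
sf_si = shape_similarity(si_square, si_strip)
print(si_square, si_strip, sf_si)
```

The two outlines have the same area, so stage S1 alone could not separate them; the shape index does, because the strip's longer perimeter pushes its similarity outside a tolerance range of [0.9, 1.1].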
Stage S4, elongation similarity matching: calculating the elongation similarity SF_AR between each house data item in the fourth house data set to be matched and the house in the layered household graph according to the following formula:
[Formula image BDA0002473741290000081: elongation similarity SF_AR in terms of AR_1 and AR_2]
In the above formula, AR_1 is the elongation of the house in the layered household graph, and AR_2 is the elongation of a house data item in the fourth house data set to be matched. The elongation is calculated as follows:
[Formula image BDA0002473741290000082: elongation AR as the ratio of L to W]
In the above formula, AR (aspect ratio) is the elongation, that is, the aspect ratio of the minimum bounding rectangle of the vector graphic; L is the length of the minimum bounding rectangle and W is its width.
All house data in the fourth house data set to be matched are traversed to obtain the elongation similarity between the house in the layered household graph and each house data item. All house data whose elongation similarity is within the error tolerance range Δ are kept as the fifth house data set to be matched, which takes part in stage S5 matching; house data whose elongation similarity is not within Δ are removed.
When layered household graphs were surveyed and drawn, true north was not always drawn pointing up, so the north direction of houses is marked inconsistently across drawings. During vector extraction the direction marked on a drawing cannot be recognized, so the extracted graphic is not necessarily correctly oriented, and some graphics can be regarded as the result of rotation by some angle. To eliminate matching errors caused by such rotation, the elongation of the polygon is chosen as a similarity factor. It is rotation invariant: however a graphic is rotated, its elongation stays fixed as long as the general shape of the polygon is unchanged, so it meets the accuracy requirement of the model well. First, the center point of each graphic is calculated and all coordinate points are rotated about it; the area of the circumscribed rectangle is calculated during the rotation until it reaches its minimum, at which point the circumscribed rectangle is the minimum circumscribed rectangle. The aspect ratio of the minimum circumscribed rectangle is then calculated, and whether the aspect ratio of each record in the real estate registration spatial data set is within the error tolerance range is judged; if not, the selected record is deleted, otherwise the record is kept.
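The rotate-and-measure procedure for the minimum circumscribed rectangle can be sketched as a brute-force angular sweep. This is a minimal sketch under stated assumptions: the center point is taken as the mean of the turning points, the sweep uses a fixed 1 degree step (so the result is approximate for outlines whose best orientation falls between steps), and the function names are hypothetical. The source does not specify the step size or how the center is defined.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def elongation(points: Sequence[Point], step_deg: float = 1.0) -> float:
    """Aspect ratio (length / width) of the smallest axis-aligned bounding
    rectangle found while rotating the outline about its center point."""
    # ASSUMPTION: the center point is the mean of the turning points.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    best = None  # (rectangle area, length, width)
    # Sweeping 0..90 degrees is enough: a rectangle repeats every 90 degrees.
    for i in range(int(round(90.0 / step_deg))):
        theta = math.radians(i * step_deg)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        xs = [(x - cx) * cos_t - (y - cy) * sin_t for x, y in points]
        ys = [(x - cx) * sin_t + (y - cy) * cos_t for x, y in points]
        w, h = max(xs) - min(xs), max(ys) - min(ys)
        if best is None or w * h < best[0]:
            best = (w * h, max(w, h), min(w, h))
    _, length, width = best
    if width == 0.0:
        raise ValueError("degenerate outline: bounding rectangle has zero width")
    return length / width


def rotate_about_origin(points: Sequence[Point], degrees: float) -> List[Point]:
    """Helper for the demonstration: rotate an outline about (0, 0)."""
    t = math.radians(degrees)
    return [
        (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))
        for x, y in points
    ]


# A 4 x 2 outline and the same outline drawn with a different "north".
rectangle = [(-2.0, -1.0), (2.0, -1.0), (2.0, 1.0), (-2.0, 1.0)]
turned = rotate_about_origin(rectangle, 30.0)
ar_original = elongation(rectangle)
ar_turned = elongation(turned)
print(ar_original, ar_turned)
```

Both outlines yield an elongation of about 2, which illustrates the rotation invariance the paragraph relies on. A production implementation would more likely use a GIS routine for the minimum bounding rectangle than a fixed-step sweep.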
Stage S5, total similarity matching: historical layered household graphs and the real estate registration spatial data set may differ in their graphic data because of inconsistent drawing standards. To guard against this while preserving accuracy, four parameters are selected as similarity factors E, and the similarity between a layered household graph and the real estate registration spatial data set is calculated by comparing their similarity factors E. The weights of the similarity factors are assigned according to the importance of each factor's contribution in the model, and the total similarity SM of the model match is calculated according to the following formula:
[Formula image BDA0002473741290000091: total similarity SM in terms of the weight coefficients A_i and the similarities SF_i, i = 1, ..., n]
In the above formula, n = 4 is the total number of similarity factors; A_i is the weight coefficient of the i-th similarity factor and is a preset value; SF_i is the similarity of the i-th similarity factor; and i = 1, 2, 3, 4 corresponds to the four similarity factors of house area, floor number, shape index and elongation respectively.
All house data in the fifth house data set to be matched are traversed to obtain the total similarity between the house in the layered household graph and each house data item. All total similarities are sorted in descending order, and the house data with the largest total similarity is selected as the element fuzzy matching result of the layered household graph. If the fifth house data set to be matched contains only one house data item, that item is taken as the element fuzzy matching result of the layered household graph.
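The final selection can be sketched as a weighted combination followed by a maximum. The source's SM formula is an image; the sketch assumes a plain weighted sum of A_i and SF_i over the four factors, which matches the symbol definitions but is not confirmed by the text. The weight values, candidate identifiers and per-factor similarities below are made-up illustrative numbers, and the per-factor similarities are taken as given inputs rather than computed.

```python
from typing import Dict, Sequence


def total_similarity(similarities: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum over the similarity factors.

    ASSUMPTION: SM = sum(A_i * SF_i); the source formula is not reproduced.
    """
    if len(similarities) != len(weights):
        raise ValueError("one weight is required per similarity factor")
    return sum(a_i * sf_i for a_i, sf_i in zip(weights, similarities))


def best_match(fifth_set: Dict[str, Sequence[float]], weights: Sequence[float]) -> str:
    """Return the id of the candidate with the largest total similarity."""
    if not fifth_set:
        raise ValueError("no candidates survived the screening stages")
    if len(fifth_set) == 1:
        # A single survivor is taken directly as the matching result.
        return next(iter(fifth_set))
    scores = {hid: total_similarity(sf, weights) for hid, sf in fifth_set.items()}
    return max(scores, key=scores.get)


# Hypothetical preset weights for area, floor number, shape index, elongation.
weights = (0.4, 0.2, 0.2, 0.2)
fifth_set = {
    "h1": (0.98, 1.00, 0.95, 0.97),
    "h7": (0.92, 1.00, 0.99, 0.91),
}
scores = {hid: total_similarity(sf, weights) for hid, sf in fifth_set.items()}
result = best_match(fifth_set, weights)
print(result, scores)
```

Sorting all scores in descending order and taking the first entry, as the text describes, gives the same result as taking the maximum directly.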
Finally, in the data output stage, technicians consult the relevant files according to the element fuzzy matching result to determine the location of the house in the corresponding layered household graph, verify the location information by field investigation, and then sort the element fuzzy matching results and enter them into the real estate spatial database.
In the embodiment of the invention, rapid fuzzy matching is performed on nine experimental data sets, and the accuracy and efficiency of the model are evaluated by comparing its performance across the nine data sets.
To improve both the accuracy and the running efficiency of the model, five candidate error tolerance ranges were first tested, using the fifth data set as experimental data. The results are shown in Table 1. Table 1 shows that matching accuracy first rises and then falls. When the error tolerance range is too large, accuracy declines slowly as incorrect graphics begin to appear in the matching results. When the error tolerance range is too small, accuracy drops sharply, because the model is a fuzzy matching model rather than an exact one: some error always exists between graphics, so exact matching is impossible and only the similarity of graphics can be measured. Based on the results in Table 1, the embodiment of the invention selects 0.9-1.1 as the error tolerance range of the model, which gives very high accuracy with relatively little matching time and thus meets the accuracy and efficiency requirements of the model.
Table 1. Rapid fuzzy matching error tolerance range query results
[Table images BDA0002473741290000092 and BDA0002473741290000101 not reproduced]
The layered household graphs of the nine data sets are queried and matched in their respective real estate registration space data sets. The nine layered household graphs differ greatly in shape and are all irregular geometric figures. The fast fuzzy matching query results are shown in Table 2. To better reflect how the fast fuzzy matching model runs, the nine data sets differ in matching quantity. The fast fuzzy matching model performs well on all nine data sets. Similarity is graded on a five-level scale from low to high at equal intervals of 20%. Every matching similarity exceeds 70%, and the average matching similarity is 82.186%, which falls in the high similarity category.
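The five-level grading at equal 20% intervals can be expressed as a small Python helper. This is only a sketch: the names of the three intermediate levels and the handling of interval boundaries are assumptions, since the text names only the high category explicitly.

```python
# Level names other than "high" are assumed for illustration.
LEVELS = ["low", "relatively low", "medium", "relatively high", "high"]


def similarity_level(percent: float) -> str:
    """Map a similarity percentage in [0, 100] to one of five 20%-wide levels."""
    if not 0 <= percent <= 100:
        raise ValueError("similarity must be between 0 and 100 percent")
    # 100% is folded into the top level instead of opening a sixth bucket.
    index = min(int(percent // 20), len(LEVELS) - 1)
    return LEVELS[index]
```

Under this grading, the reported average of 82.186% maps to the top level.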
Table 2. Fast fuzzy matching query results
Figure BDA0002473741290000102
In terms of matching accuracy, as shown in Table 3, with the error tolerance range set to 0.9-1.1, the results selected for the nine data sets are fully consistent with the actual results, and the matching accuracy reaches 100% in every case. This fully meets the accuracy requirement of the model and lays the data association groundwork for the subsequent cleaning of real estate registration space data.
In terms of matching efficiency, the average time to complete one match across the nine data sets is 16.8208 s, whereas manual matching takes about 1 day per data item. Compared with manual matching, automatic computer matching is faster and more convenient.
Table 3. Fast fuzzy matching accuracy and efficiency query results
Figure BDA0002473741290000103
Figure BDA0002473741290000111
Referring to FIG. 2, FIG. 2 is a flow chart of the Python-based implementation according to the embodiment of the present invention.
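A rough Python sketch of the staged matching flow is given below for orientation. It is not the code behind FIG. 2. It assumes that each stage filters on the ratio of the target value to the candidate value, and that the final score is a weighted mean of symmetric min/max ratios; the `House` record, the equal weights, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class House:
    house_id: str
    area: float          # house area S
    floors: int          # floor number N
    shape_index: float   # shape index SI
    elongation: float    # elongation AR

# Stage order S1..S4: area, floor number, shape index, elongation.
FACTORS = ("area", "floors", "shape_index", "elongation")


def ratio(a: float, b: float) -> float:
    """Target-to-candidate ratio; an undefined ratio is pushed outside any range."""
    return a / b if b else float("inf")


def cascade_match(target: House, candidates: Sequence[House],
                  low: float = 0.9, high: float = 1.1,
                  weights: Sequence[float] = (0.25, 0.25, 0.25, 0.25)) -> Optional[House]:
    """Filter candidates factor by factor, then pick the highest weighted score."""
    remaining: List[House] = list(candidates)
    for factor in FACTORS:
        t = getattr(target, factor)
        remaining = [c for c in remaining if low <= ratio(t, getattr(c, factor)) <= high]
    if not remaining:
        return None

    def score(c: House) -> float:
        # Symmetric similarity min/max in (0, 1]; an assumed scoring form.
        parts = []
        for factor in FACTORS:
            t, v = getattr(target, factor), getattr(c, factor)
            parts.append(min(t, v) / max(t, v))
        return sum(w * p for w, p in zip(weights, parts))

    return max(remaining, key=score)
```

Because each stage only sees the survivors of the previous one, the more expensive comparisons run on progressively smaller candidate sets.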
The invention has the following beneficial effects: in the technical solution provided by the invention, the fast fuzzy matching model matches the natural building base outer contours of the layered household graph and of the house property registration space data set. While ensuring matching accuracy, layered household graphs that lack specific geographic position information are quickly matched and associated with the house property registration space data set, which improves efficiency and reduces manual labor.
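For illustration, the geometric quantities derived from an outer contour can be computed from its turning point coordinates as in the Python sketch below. The area uses the standard shoelace formula and the perimeter sums the edge lengths. The shape index form (perimeter relative to a circle of equal area) and the use of an axis-aligned bounding box in place of the minimum circumscribed rectangle are simplifying assumptions, not the formulas of the embodiment.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Shoelace formula over the turning points of a closed contour."""
    n = len(points)
    twice = sum(points[k][0] * points[(k + 1) % n][1]
                - points[(k + 1) % n][0] * points[k][1]
                for k in range(n))
    return abs(twice) / 2.0


def polygon_perimeter(points: List[Point]) -> float:
    """Sum of the edge lengths, closing the contour back to the first point."""
    n = len(points)
    return sum(math.dist(points[k], points[(k + 1) % n]) for k in range(n))


def shape_index(points: List[Point]) -> float:
    """Assumed form: L / (2 * sqrt(pi * S)); equals 1 for a circle."""
    return polygon_perimeter(points) / (2.0 * math.sqrt(math.pi * polygon_area(points)))


def bbox_aspect_ratio(points: List[Point]) -> float:
    """Long side over short side of the axis-aligned bounding box (a simplification)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    if min(width, height) == 0:
        raise ValueError("degenerate contour")
    return max(width, height) / min(width, height)
```

For a contour that is rotated relative to the coordinate axes, a true minimum circumscribed rectangle would be needed to obtain a rotation-independent elongation.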
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (7)

1. A fast fuzzy matching method for a real estate registration space data user building, characterized by comprising the following steps:
S101: data acquisition: acquiring the existing layered household graphs, the real estate registration space data set, and place name address information;
S102: data preprocessing: screening out the layered household graphs that cannot be associated with address information in the place name address information, and vectorizing the screened layered household graphs to obtain the shape index SI, the elongation AR, the house area S, the floor number N, and the jurisdiction of each layered household graph;
according to the jurisdiction of each layered household graph, extracting all house data in that jurisdiction from the house registration space data set as the objects to be matched in the element matching stage of that layered household graph, and setting an error tolerance range delta;
S103: data processing, namely element fuzzy matching: performing fast matching association for each layered household graph separately, taking the house area S, the floor number N, the shape index SI, and the elongation AR of the house in each layered household graph as similarity factors E and calculating the similarities one by one.
2. The fast fuzzy matching method for the real estate registration space data user building as claimed in claim 1, characterized in that: in step S102, the shape index SI and the elongation AR of the vector graphic corresponding to each layered household graph are calculated automatically in ArcGIS software, and the house area S, the floor number N, and the jurisdiction of the house are extracted.
3. The fast fuzzy matching method for the real estate registration space data user building as claimed in claim 1, characterized in that: in step S103, the house area S, the floor number N, the shape index SI, and the elongation AR of the house in each layered household graph are used as similarity factors E to perform similarity calculation one by one in 5 stages, which are, in order: area similarity matching, floor number similarity matching, shape similarity matching, elongation similarity matching, and total similarity matching.
4. The fast fuzzy matching method for the real estate registration space data user building as claimed in claim 1, characterized in that: in step S103, for a layered household graph that cannot be associated with place name information, the specific element fuzzy matching method is as follows:
Stage S1, area similarity matching: extracting all house data in the jurisdiction of the layered household graph from the house registration space data set as the first house data set to be matched of the layered household graph; and calculating the area similarity SF_S between each extracted house data and the layered household graph according to the following formula:
Figure FDA0002473741280000021
In the above formula, S_1 is the house area of the layered household graph that cannot be associated with address information, and S_2 is the area of a house data record in the jurisdiction of the layered household graph in the house registration space data set; the house area is calculated as follows:
Figure FDA0002473741280000022
wherein (X_k, Y_k) are the coordinates of the k-th turning point of the vector graphic; k = 1, 2, ..., n indexes the 1st to n-th turning points of the vector graphic; and n is the total number of turning points of the vector graphic;
traversing all the house data in the jurisdiction of the layered household graph to obtain the area similarity between each house data in the jurisdiction and the house of the layered household graph, and keeping all the house data whose area similarity is within the error tolerance range delta as the second house data set to be matched of the layered household graph, which participates in the matching at stage S2; removing the house data whose area similarity is not within the error tolerance range delta;
Stage S2, floor number similarity matching: calculating the floor number similarity SF_F between each house data in the second house data set to be matched and the house in the layered household graph according to the following formula:
Figure FDA0002473741280000023
In the above formula, F_1 is the number of floors of the house in the layered household graph, and F_2 is the number of floors of the house data in the second house data set to be matched;
traversing all the house data in the second house data set to be matched to obtain the floor number similarity between the house in the hierarchical household graph and each house data in the second house data set to be matched, and keeping all the house data with the floor number similarity within an error tolerance range delta to be used as a third house data set to be matched of the hierarchical household graph to participate in matching at the stage S3; removing the house data of which the floor number similarity is not within the error tolerance range delta;
Stage S3, shape similarity matching: calculating the shape similarity SF_SI between each house data in the third house data set to be matched and the house in the layered household graph according to the following formula:
Figure FDA0002473741280000024
In the above formula, SI_1 is the shape index of the house in the layered household graph, and SI_2 is the shape index of the house data in the third house data set to be matched; the shape index SI is calculated as follows:
Figure FDA0002473741280000031
Figure FDA0002473741280000032
In the above formulas, L is the perimeter of the vector graphic and S is the area of the vector graphic; (X_i, Y_i) are the coordinates of the i-th turning point of the vector graphic; n is the total number of turning points of the vector graphic;
traversing all the house data in the third house data set to be matched to obtain the shape similarity between the house in the hierarchical household graph and each house data in the third house data set to be matched, and keeping all the house data with the shape similarity within an error tolerance range delta to be used as a fourth house data set to be matched of the hierarchical household graph to participate in the matching at the stage S4; removing the house data with the shape similarity not within the error tolerance range delta;
Stage S4, elongation similarity matching: calculating the elongation similarity SF_AR between each house data in the fourth house data set to be matched and the house in the layered household graph according to the following formula:
Figure FDA0002473741280000033
In the above formula, AR_1 is the elongation of the house in the layered household graph, and AR_2 is the elongation of the house data in the fourth house data set to be matched; the elongation is calculated as follows:
Figure FDA0002473741280000034
In the above formula, AR (aspect ratio) is the elongation, that is, the aspect ratio of the minimum circumscribed rectangle of the vector graphic; L is the length of the minimum circumscribed rectangle of the vector graphic, and W is the width of the minimum circumscribed rectangle of the vector graphic;
traversing all the house data in the fourth house data set to be matched to obtain the elongation similarity between the house in the layered household graph and each house data in the fourth house data set to be matched, and keeping all the house data whose elongation similarity is within the error tolerance range delta as the fifth house data set to be matched of the layered household graph, which participates in the matching at stage S5; removing the house data whose elongation similarity is not within the error tolerance range delta;
Stage S5, total similarity matching: calculating the total similarity SM between each house data in the fifth house data set to be matched and the house in the layered household graph according to the following formula:
Figure FDA0002473741280000035
In the above formula, n = 4 is the total number of similarity factors; A_i is the weight coefficient of the i-th similarity factor and is a preset value; SF_i is the similarity of the i-th similarity factor; i = 1, 2, 3, 4 correspond respectively to the four similarity factors of house area, floor number, shape index, and elongation;
traversing all the house data in the fifth house data set to be matched to obtain the total similarity between the house in the layered household graph and each house data in the fifth house data set to be matched, sorting all the total similarities from largest to smallest, and selecting the house data with the largest total similarity as the element fuzzy matching result of the layered household graph.
5. The fast fuzzy matching method for the real estate registration space data user building as claimed in claim 4, characterized in that: if only one house data exists in the fifth house data set to be matched, that house data is taken as the element fuzzy matching result of the layered household graph.
6. The fast fuzzy matching method for the real estate registration space data user building as claimed in claim 4, characterized in that: the data processing further comprises: in the data output stage, technicians consult the relevant files according to the element fuzzy matching result to determine the location of the house in the corresponding layered household graph, carry out field investigation to verify the location information, and finally associate the real estate registration right information with the natural building data in the real estate registration database.
7. The fast fuzzy matching method for the real estate registration space data user building as claimed in claim 1, characterized in that: in step S102, the error tolerance range Δ is [0.9, 1.1].
CN202010356805.5A 2020-04-29 2020-04-29 Rapid fuzzy matching method for real estate registration space data user landing Active CN111625549B (en)

Publications (2)

Publication Number Publication Date
CN111625549A true CN111625549A (en) 2020-09-04
CN111625549B CN111625549B (en) 2023-09-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant