CN113360789A - Interest point data processing method and device, electronic equipment and storage medium - Google Patents

Publication number
CN113360789A
Authority
CN
China
Prior art keywords
information
administrative
point
interest data
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110602863.6A
Other languages
Chinese (zh)
Inventor
杨浩铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd
Priority to CN202110602863.6A
Publication of CN113360789A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Navigation (AREA)

Abstract

The application discloses a point of interest data processing method and device, electronic equipment and a storage medium, belonging to the field of computers. The point of interest data processing method comprises: obtaining point of interest data, wherein the point of interest data comprises an address field and an administrative area field; acquiring at least one level of administrative region information from the address field; matching the at least one level of administrative region information with the information of the administrative area field in the point of interest data to obtain a matching result; and updating the point of interest data according to the matching result.

Description

Interest point data processing method and device, electronic equipment and storage medium
Technical Field
The application belongs to the field of computers, and particularly relates to a method and a device for processing point of interest data, electronic equipment and a storage medium.
Background
A Point of Interest (POI) generally refers to any geographic object that can be abstracted as a point in space. Common points of interest include schools, subway stations, hospitals and shops. Point of interest data represents real geographic entities in virtual spaces such as maps, and provides data support for functions such as geographic query, positioning, display and personalized recommendation. The accuracy of point of interest data is therefore the basis of these higher-level functions.
However, point of interest data may be obtained from disparate sources, which can result in conflicts between fields of the same record. For example, suppose the point of interest data includes an address field and a city field, where the address field is "Futian Zhong Road, Futian District, Shenzhen City" and the city field is "Guangzhou City". "Shenzhen City" in the address field then conflicts with "Guangzhou City" in the city field. Such low-quality point of interest data impairs the query and use of the point of interest.
Disclosure of Invention
The embodiments of the application aim to provide a point of interest data processing method and device, electronic equipment and a storage medium, which can solve the problem that low-quality point of interest data impairs the query and use of points of interest.
In a first aspect, an embodiment of the present application provides a method for processing point of interest data, including:
obtaining point of interest data, wherein the point of interest data comprises an address field and an administrative area field;
acquiring at least one level of administrative region information from the address field;
matching the at least one level of administrative region information with the information of the administrative area field in the point of interest data to obtain a matching result;
and updating the point of interest data according to the matching result.
In a second aspect, an embodiment of the present application provides a point of interest data processing apparatus, including:
a first acquisition module, configured to obtain point of interest data, wherein the point of interest data comprises an address field and an administrative area field;
a second acquisition module, configured to acquire at least one level of administrative region information from the address field;
a matching module, configured to match the at least one level of administrative region information with the information of the administrative area field in the point of interest data to obtain a matching result;
and an updating module, configured to update the point of interest data according to the matching result.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored on the memory and executable on the processor, and when executed by the processor, the program or instructions implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.
In the embodiments of the application, point of interest data is first obtained; administrative region information is then acquired from the address field of the point of interest data and matched with the information of the administrative area field to obtain a matching result; finally, the point of interest data is updated according to the matching result. Errors in the point of interest data can thereby be corrected, which improves the quality of the data and provides data support for functions such as position query and positioning that rely on it.
Drawings
Fig. 1 is a schematic flowchart of an embodiment of a point of interest data processing method provided in the present application.
Fig. 2 is a schematic diagram of an embodiment of acquiring administrative area information by using a prefix tree according to the present application.
Fig. 3 is a schematic diagram of another embodiment of acquiring administrative area information by using a prefix tree according to the present application.
Fig. 4 is a schematic diagram of an embodiment of administrative area information conflicts provided herein.
Fig. 5 is a schematic diagram of another embodiment of administrative area information conflicts provided herein.
Fig. 6 is a schematic diagram of an embodiment of the processing rules for conflict handling provided herein.
Fig. 7 is a schematic diagram of an embodiment of a conflict between the province field and the city field provided herein.
Fig. 8 is a schematic diagram of an embodiment of uniformly setting the prefecture and county information in the prefecture and county information provided by the present application.
Fig. 9 is a flowchart illustrating a point of interest data processing method according to another embodiment of the present application.
FIG. 10 is a schematic structural diagram of an embodiment of a point of interest data processing apparatus provided in the present application.
Fig. 11 is a schematic structural diagram of an embodiment of an electronic device provided in the present application.
Fig. 12 is a schematic structural diagram of another embodiment of an electronic device provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present disclosure.
The terms "first", "second" and the like in the description and claims of the present application are used to distinguish between similar elements and not necessarily to describe a particular sequential or chronological order. It will be appreciated that terms so used are interchangeable under appropriate circumstances, so that embodiments of the application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second" and the like are generally of one type, and their number is not limited; for example, the first object can be one or more than one. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally means that the preceding and succeeding objects are in an "or" relationship.
Because point of interest data is obtained from sources of uneven quality, its correctness is threatened by problems such as wrong or missing longitude and latitude, wrong or missing addresses, and wrong or missing administrative divisions. Common errors in the point of interest data are as follows:
1. The information of the province, city and district fields in the point of interest data conflicts with the information of the address field. For example, referring to Table 1, "Shenzhen City" in the address conflicts with "Guangzhou City" in the province, city and district fields.
TABLE 1
(Table 1 appears as image BDA0003093189410000031 in the original publication.)
2. The longitude and latitude in the point of interest data conflict with the information of the province, city and district fields. For example, referring to Table 2, the longitude and latitude do not fall within "Futian District, Shenzhen City, Guangdong Province" as given by those fields.
TABLE 2
(Table 2 appears as image BDA0003093189410000032 in the original publication.)
3. The longitude and latitude in the point of interest data conflict with the information of the address field. For example, referring to Table 3, the longitude and latitude do not fall at the No. 136 street address in Futian District given by the address field.
TABLE 3
(Table 3 appears as image BDA0003093189410000041 in the original publication.)
These errors affect the query and use of the point of interest data, so detecting whether point of interest data contains them is a prerequisite for using the data effectively.
In the related art, the processing of point of interest data focuses on geocoding (generating the corresponding longitude and latitude from the address in the point of interest data) and reverse geocoding (generating the corresponding address from the longitude and latitude in the point of interest data). However, each of these methods trusts the information of a single field (longitude and latitude, or address) and does not detect conflicts between the longitude and latitude and the address. In practice, a large amount of point of interest data has an address that does not match the position given by its longitude and latitude.
For example, if the address is "Kanglu, Futian District, Shenzhen City, Guangdong Province" while the position corresponding to the longitude and latitude is in Jinzhong City, Shanxi Province, the error can be recognized by a simple application of geocoding and reverse geocoding. More complicated conflicts, however, are not recognized well. For example, the address in the point of interest data is "Futian District, Shenzhen City, Guangdong Province", the province, city and district fields are "Guangdong Province", "Guangzhou City" and "Tianhe District" respectively, and the position corresponding to the longitude and latitude lies in Guangzhou City, Guangdong Province. The related art does not address how conflicts among these three sources of information should be defined, identified and resolved.
In addition, the related art includes a scheme for processing point of interest data, but that scheme assumes by default that the information of the administrative area field is accurate and mainly detects conflicts between the longitude and latitude and the address field, for example whether the road name, road number, point of interest name or named area in the address is near the longitude and latitude. It does not check the consistency of the administrative area fields with the address, the longitude and latitude, and other fields. Only when the administrative region information in the address field matches the information of the administrative area field can the quality of the point of interest data be ensured, so that functions such as data query, positioning, display and personalized recommendation work well with the point of interest.
In order to improve the quality of the point of interest data and provide data support for functions such as position query and positioning, the present application provides a point of interest data processing method. Fig. 1 is a schematic flowchart of an embodiment of the point of interest data processing method provided by the present application. As shown in Fig. 1, the method includes:
S102, obtaining point of interest data, wherein the point of interest data comprises an address field and an administrative area field.
The point of interest data may include, for example, the following fields: name, address, province, city, district, province code, city code and district code, where province, city and district are administrative area fields. For example, the point of interest data is shown in Table 4:
TABLE 4
(Table 4 appears as image BDA0003093189410000051 in the original publication.)
The point of interest data processing method further includes:
S104, acquiring at least one level of administrative region information from the address field, for example the province, city and district information in the address field;
S106, matching the at least one level of administrative region information with the information of the administrative area field in the point of interest data to obtain a matching result.
S106 may specifically include: matching the administrative region information of each level with the information of the corresponding administrative area field in the point of interest data to obtain a matching result. For example, when the province information "Guangdong Province", the city information "Shenzhen City" and the district information "Futian District" are acquired from the address field shown in Table 4, S106 matches the province "Guangdong Province" from the address field against "Guangdong Province" in the province field shown in Table 4. Similarly, the city from the address field is matched against the information in the city field shown in Table 4, and the district from the address field against the information in the district field shown in Table 4, to obtain a matching result.
The point of interest data processing method further includes:
S108, updating the point of interest data according to the matching result.
As an example, S108 may specifically include: when the administrative region information of any level does not match the information of the corresponding administrative area field, updating the point of interest data so that, in the updated data, the administrative region information of that level matches the information of the corresponding administrative area field.
In the embodiments of the application, point of interest data is first obtained; administrative region information is then acquired from the address field of the point of interest data and matched with the information of the administrative area field to obtain a matching result; finally, the point of interest data is updated according to the matching result. Errors in the point of interest data can thereby be corrected, which improves the quality of the data and provides data support for functions such as position query and positioning that rely on it.
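Steps S106 and S108 can be illustrated with a short sketch. It is only a sketch under stated assumptions: the `Poi` class, the functions `match_levels` and `update_poi`, and the sample record are hypothetical, the Chinese strings are assumed renderings of the names used in the examples, and the policy of overwriting a mismatching field with the value taken from the address is an assumption, since the text above only requires that the two agree after the update.

```python
from dataclasses import dataclass, replace

# Administrative area fields, named after the field list given for Table 4.
LEVELS = ("province", "city", "district")


@dataclass(frozen=True)
class Poi:
    """Hypothetical record holding a subset of the point of interest fields."""
    name: str
    address: str
    province: str
    city: str
    district: str


def match_levels(poi, extracted):
    """S106: compare each extracted level with the corresponding field."""
    return {
        level: getattr(poi, level) == value
        for level, value in extracted.items()
        if level in LEVELS
    }


def update_poi(poi, extracted, result):
    """S108: rewrite every mismatching field (assumed policy: address wins)."""
    fixes = {level: extracted[level] for level, ok in result.items() if not ok}
    return replace(poi, **fixes)


# Sample record: the address says Shenzhen City, the city field says Guangzhou City.
poi = Poi(
    name="示例学校",              # hypothetical name ("Example School")
    address="广东省深圳市福田区",  # Guangdong Province, Shenzhen City, Futian District
    province="广东省",
    city="广州市",                # Guangzhou City: conflicts with the address
    district="福田区",
)
# What S104 would return for this address (hard-coded for the sketch).
extracted = {"province": "广东省", "city": "深圳市", "district": "福田区"}

result = match_levels(poi, extracted)
updated = update_poi(poi, extracted, result)
print(result)
print(updated.city)
```

Keeping the matching result as a per-level mapping makes it possible to apply a different update policy later without changing the matching step.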
In one or more embodiments of the present application, S104 may include:
performing word segmentation on the information of the address field in the point of interest data to obtain segmented words; matching the segmented words with the information corresponding to nodes of a preset prefix tree, wherein the preset prefix tree contains administrative region information, and the information corresponding to multiple nodes on the same path of the preset prefix tree forms one piece of administrative region information; and, when a segmented word matches the information corresponding to a node of the preset prefix tree, determining administrative region information according to the segmented word.
In the embodiments of the application, the administrative region information in the address field is acquired by means of a prefix tree. Compared with identifying administrative region information by named entity recognition based on machine learning, deep learning and the like, a prefix tree does not require a large amount of labeled data for training. A prefix tree also keeps the acquired administrative region information up to date. Administrative divisions are frequently renamed, merged or abolished, and a trained machine learning model has difficulty keeping pace: it may extract outdated administrative division information, it has a certain extraction error rate, and slight perturbations in the data can strongly affect its output. For example, if the address field contains "Qingyuan International School", a model may wrongly extract "Qingyuan" as administrative region information. Acquiring the administrative region information by means of a prefix tree can reduce this error rate.
The prefix tree is constructed from the latest administrative divisions, which ensures that the extracted administrative region information is the division information currently used by the country rather than outdated information. In addition, not all addresses are written in a standardized form; for example, the province, city or district in some addresses is abbreviated or miswritten. Some abbreviated and miswritten forms of administrative region information are shown in Table 5 below:
TABLE 5
Shorthand or miswritten form    Standard form
Shenzhen                        Shenzhen City
Guangdong                       Guangdong Province
Guangxi Province                Guangxi Zhuang Autonomous Region
Inner Mongolia                  Inner Mongolia Autonomous Region
Enshi Prefecture                Enshi Tujia and Miao Autonomous Prefecture
Aksu                            Aksu Prefecture
In order to improve the accuracy of acquiring administrative region information from the address field, as an example, determining the administrative region information according to the segmented word may specifically include: when the segmented word is standardized administrative region information, determining the segmented word itself as one level of administrative region information; and when the segmented word is not standardized administrative region information, taking the standardized administrative region information corresponding to the segmented word as one level of administrative region information. Whether the segmented word is standardized can be determined by querying whether it appears in a table similar to Table 5 above.
As another example, determining the administrative region information according to the segmented word may specifically include: determining the administrative region information corresponding to the segmented word according to the node of the preset prefix tree whose information matches the segmented word.
As shown in Fig. 2 and Fig. 3, the arrows mark the positions at which segmented words are matched in the preset prefix tree. The arrow pointing to "Shenzhen City" in Fig. 2 and the arrow pointing to "Enshi Tujia and Miao Autonomous Prefecture" in Fig. 3 indicate the administrative region information determined from the segmented words.
Referring to Fig. 2, when the segmented word "Shenzhen" is obtained from the address field, it matches "Shenzhen" at a node of the preset prefix tree. The information "Shenzhen City" corresponding to the leaf node on the same path as that node is then obtained, and the administrative region information is determined to be "Shenzhen City". If the segmented word "Shenzhen City" is obtained from the address field instead, it matches "Shenzhen City" in the preset prefix tree, and the administrative region information is likewise determined to be "Shenzhen City".
Fig. 3 is similar to Fig. 2: when the segmented word "Enshi Prefecture" or "Enshi Tujia and Miao Autonomous Prefecture" is obtained from the address field, the administrative region information determined from the prefix tree is "Enshi Tujia and Miao Autonomous Prefecture".
The prefix tree can therefore match the full name of administrative region information as well as its abbreviated and miswritten forms, which improves the accuracy of acquiring administrative region information from the address field.
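The lookup described for Fig. 2 and Fig. 3 can be sketched as a character-level prefix tree. This is a minimal sketch rather than the implementation of the application: the names `TrieNode`, `RegionTrie`, `add` and `longest_match` are hypothetical, the Chinese strings are assumed renderings of the names in the examples, and each shorthand form is stored as its own path that ends in the standard name, which simplifies the step of following the path to the leaf node.

```python
class TrieNode:
    def __init__(self):
        self.children = {}          # next character -> TrieNode
        self.standard_name = None   # set where a known spelling ends


class RegionTrie:
    """Character-level prefix tree over spellings of administrative regions."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, spelling, standard_name):
        """Store a spelling (full name or shorthand) with its standard name."""
        node = self.root
        for ch in spelling:
            node = node.children.setdefault(ch, TrieNode())
        node.standard_name = standard_name

    def longest_match(self, text, start=0):
        """Return (standard_name, end_index) for the longest known spelling
        beginning at text[start], or None if no spelling matches there."""
        node, best = self.root, None
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.standard_name is not None:
                best = (node.standard_name, i + 1)
        return best


trie = RegionTrie()
# Full names map to themselves; shorthand forms map to the full name.
trie.add("深圳市", "深圳市")                            # Shenzhen City
trie.add("深圳", "深圳市")                              # Shenzhen
trie.add("恩施土家族苗族自治州", "恩施土家族苗族自治州")  # Enshi Tujia and Miao Autonomous Prefecture
trie.add("恩施州", "恩施土家族苗族自治州")               # Enshi Prefecture

print(trie.longest_match("深圳福田区"))     # shorthand at the start of the text
print(trie.longest_match("深圳市福田区"))   # full name at the start of the text
```

Because the tree is built from a plain list of names, refreshing it after a renaming or merger of administrative divisions only requires rebuilding it from the updated list.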
In one or more embodiments of the present application, S104 may include:
acquiring at least one level of administrative region information in the address field, starting from the initial position of the information of the address field, wherein, whenever a piece of administrative region information is acquired from the address field, the administrative region information of the next lower level is acquired from the part of the address field located after it.
As one example, each character in the information of the address field is indexed; after one level of administrative region information is acquired from the address field, the next level is acquired from the text that follows it, according to the index of its ending character in the address field.
Suppose the information of the address field is "No. 136 Kanglu, Meilin Street, Futian District, Shenzhen City, Guangdong Province", written in the original order from province down to house number. The first character of the address field ("Guang") is given index 1, the second character ("dong") index 2, and so on.
Administrative region information is then acquired starting from character 1, which yields the province information "Guangdong Province". Since the province in an address is generally followed by the city name, the city information "Shenzhen City" is acquired from the text after "Guangdong Province", located by the index of the ending character of "Guangdong Province".
After the city information "Shenzhen City" is acquired, since the city in an address is generally followed by the district or county, the district information "Futian District" is acquired from the text after "Shenzhen City", located by the index of the ending character of "Shenzhen City".
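The index-driven, level-by-level extraction above can be sketched as follows. The sketch makes several assumptions: the small `GAZETTEER` dictionary stands in for the preset prefix tree, the function name `extract_levels` is hypothetical, the Chinese strings are assumed renderings of the names in the example, and indices are 0-based whereas the text numbers characters from 1.

```python
# Hypothetical gazetteer: known names per level, from highest to lowest level.
GAZETTEER = {
    "province": ["广东省"],            # Guangdong Province
    "city": ["深圳市", "广州市"],       # Shenzhen City, Guangzhou City
    "district": ["福田区", "天河区"],   # Futian District, Tianhe District
}
LEVELS = ("province", "city", "district")


def extract_levels(address):
    """Scan the address once per level; each level is searched only in the
    text that follows the end of the previously found, higher level."""
    found = {}
    cursor = 0  # index just past the ending character of the last match
    for level in LEVELS:
        hits = []
        for name in GAZETTEER[level]:
            pos = address.find(name, cursor)
            if pos != -1:
                hits.append((pos, name))
        if hits:
            pos, name = min(hits)  # earliest occurrence after the cursor
            found[level] = name
            cursor = pos + len(name)
    return found


# Guangdong Province, Shenzhen City, Futian District, Meilin Street
print(extract_levels("广东省深圳市福田区梅林街道"))
```

Moving the cursor forward after each match means a district name that happens to appear before the city name is not picked up as the district.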
However, two pieces of administrative region information acquired from the address field may conflict. For example, referring to Fig. 4, two district names, "Jiulong" and "Jiulongpo District", are acquired from the address field shown in Fig. 4, and their positions in the address field overlap: "Jiulong" occupies the 4th and 5th characters of the address field, "Jiulongpo District" occupies the 4th to 7th characters, and the overlap is the 4th and 5th characters. Two district names are thus found in the same address field, so the district names "Jiulong" and "Jiulongpo District" conflict.
Referring to Fig. 5, the district name "Jiulong" and a road name beginning with "Jiulong" are acquired from the address field shown in Fig. 5; the position of the district name overlaps that of the road name in the address field, so the district name and the road name conflict.
In order to determine whether the administrative region information obtained from the address field conflicts, in one or more embodiments of the present application, after S104, the point of interest data processing method may further include:
when the at least one level of administrative region information contains multiple pieces of administrative region information, determining whether the positions of any two of those pieces overlap in the address field;
and if they overlap, performing invalidation processing on the two pieces of administrative region information, or acquiring administrative region information from the address field again according to the two pieces, wherein the invalidation processing includes: recording the two pieces as invalid administrative region information, or deleting the two pieces.
As an example, the two pieces of administrative region information may be any two different items among the following: province name, city name, district name, street name, road name. Alternatively, they may be two items of the same kind: two province names, two city names, two district names, two street names, or two road names.
As an example, when the positions of two pieces of administrative region information overlap, both pieces may be recorded as invalid administrative region information, and invalid administrative region information does not participate in the matching in the subsequent S106.
As another example, when the positions of two pieces of administrative region information overlap, administrative region information may be acquired from the address field again according to the two pieces. For example, of the two pieces whose positions overlap, the one with more characters is determined to be valid administrative region information and can participate in the matching in the subsequent S106, and the one with fewer characters is determined to be invalid administrative region information.
For example, with continued reference to Fig. 4, when the two district names "Jiulong" and "Jiulongpo District" conflict, the district name "Jiulongpo District", which has more characters, is selected as the administrative region information that participates in the matching in S106. With continued reference to Fig. 5, when the district name "Jiulong" conflicts with the road name beginning with "Jiulong", the road name, which has more characters, is selected as the administrative region information that participates in the matching in S106.
The embodiments of the application can thus determine whether the positions of the pieces of administrative region information acquired from the address field overlap and, if they do, either invalidate the two overlapping pieces or acquire the administrative region information again. The administrative region information finally obtained is therefore more accurate, the point of interest data can be updated more accurately, and the accuracy of the updated point of interest data is ensured.
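The overlap check and the "keep the longer name" rule can be sketched as below. The `Mention` type and the function names are hypothetical, the sample address is an assumption chosen so that the positions agree with the description of Fig. 4 (the 4th to 7th characters), and positions are stored as 0-based half-open ranges rather than the 1-based character numbers used in the text.

```python
from itertools import combinations
from typing import NamedTuple


class Mention(NamedTuple):
    """One piece of administrative region information found in an address."""
    text: str
    start: int  # 0-based index of the first character
    end: int    # index one past the last character


def overlaps(a, b):
    """True if the two character ranges share at least one position."""
    return a.start < b.end and b.start < a.end


def resolve_overlaps(mentions):
    """For every overlapping pair, mark the mention with fewer characters as
    invalid; on a tie the later mention of the pair is marked invalid."""
    invalid = set()
    for a, b in combinations(mentions, 2):
        if overlaps(a, b):
            invalid.add(a if len(a.text) < len(b.text) else b)
    return [m for m in mentions if m not in invalid]


# Assumed sample address: "重庆市九龙坡区" (Chongqing City, Jiulongpo District).
mentions = [
    Mention("重庆市", 0, 3),     # characters 1-3 in the 1-based numbering
    Mention("九龙", 3, 5),       # "Jiulong": characters 4-5
    Mention("九龙坡区", 3, 7),   # "Jiulongpo District": characters 4-7
]
valid = resolve_overlaps(mentions)
print([m.text for m in valid])
```

The other option described above, invalidating both overlapping pieces, would add both `a` and `b` to `invalid` instead of only the shorter one.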
In one or more embodiments of the present application, after S104, the method for processing point of interest data may further include:
determining whether the administrative district information is in a preset standardized administrative district information base;
under the condition that the administrative district information is not in the standardized administrative district information base, determining whether the administrative district information is in a preset non-standardized administrative district information base;
and under the condition that the administrative district information is in the non-standardized administrative district information base, carrying out standardized processing on the administrative district information to obtain standardized administrative district information.
The standardized administrative district information base stores administrative district information that complies with the regulations and contains no abbreviated or erroneous names. For example, the standardized administrative district information base stores "Inner Mongolia Autonomous Region" but does not store the abbreviation "Inner Mongolia". For another example, it stores "Guangxi Zhuang Autonomous Region" and does not store the erroneous expression "Guangxi Province".
The information in the non-standardized administrative district information base may be abbreviated or miswritten administrative district information, such as the administrative district information in Table 5 above.
When the administrative district information is in the non-standardized administrative district information base, the administrative district information is not standardized and needs to be standardized. As an example, standardizing the administrative district information to obtain standardized administrative district information may specifically include: acquiring, from a preset correspondence between non-standardized administrative district information and standardized administrative district information, the standardized administrative district information corresponding to the administrative district information.
For example, if the administrative district information "Guangxi Province" is acquired from the address field and is not in the standardized administrative district information base but is in the non-standardized administrative district information base, the standardized administrative district information corresponding to "Guangxi Province" is acquired, namely "Guangxi Zhuang Autonomous Region".
After the administrative district information is standardized, the standardized administrative district information is used as the administrative district information acquired from the address field and is matched with the information corresponding to the administrative district field.
In the embodiment of the application, the administrative area information is standardized, so that whether the interest points are wrong or not can be identified more accurately by using the standardized administrative area information.
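A minimal sketch of the two-stage lookup described above, assuming the standardized base is a set and the non-standardized base is a mapping to standardized names. The container names and the choice to return `None` for a name found in neither base are assumptions made for illustration; the text does not specify that case.

```python
from typing import Optional

# Hypothetical contents of the standardized information base.
STANDARDIZED = {
    "Inner Mongolia Autonomous Region",
    "Guangxi Zhuang Autonomous Region",
    "Guangdong Province",
}

# Hypothetical non-standardized base, stored together with the preset
# correspondence to the standardized name.
NON_STANDARDIZED = {
    "Inner Mongolia": "Inner Mongolia Autonomous Region",
    "Guangxi Province": "Guangxi Zhuang Autonomous Region",
}


def standardize(name: str) -> Optional[str]:
    """Return the standardized name, or None if the name is unknown."""
    if name in STANDARDIZED:
        return name  # already standardized, nothing to do
    # Not standardized: look for a known abbreviation or miswriting.
    return NON_STANDARDIZED.get(name)
```

The returned value would replace the name parsed from the address field before it is compared with the administrative district field.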
In one or more embodiments of the present application, S108 may specifically include:
under the condition that any first administrative district information in the at least one level of administrative district information does not match the information corresponding to the administrative district field, uniformly setting the first administrative district information and the information corresponding to the administrative district field according to a preconfigured information processing rule, so that after the uniform setting the first administrative district information matches the information corresponding to the administrative district field.
The preconfigured information processing rule may include: under the condition that the first administrative district information does not match the information of the corresponding administrative district field, resetting the information of the corresponding administrative district field according to the first administrative district information.
Alternatively, the preconfigured information processing rule may include: under the condition that the first administrative district information does not match the information of the corresponding administrative district field, resetting the first administrative district information according to the information of the administrative district field.
Besides uniformly setting the first administrative district information and the information of the corresponding administrative district field, the mismatch between the two may also be recorded as an error.
How to use the pre-configured information processing rule is exemplarily illustrated by fig. 6.
As shown in fig. 6, the information of the province field in the point-of-interest data is "Guangxi Province", which becomes "Guangxi Zhuang Autonomous Region" after standardization, while the province extracted from the information of the address field is "Guangdong Province"; the information of the province field therefore conflicts with the province in the information of the address field.
In this case, if the pre-configuration specifies that the province field prevails when a conflict is encountered, the province name of the province field is directly used as the province in the subsequent flow. If the pre-configuration specifies that the province in the address field prevails when a conflict is encountered, the province in the address field is directly used as the province in the subsequent flow. If the pre-configuration specifies that an error is to be recorded when a conflict is encountered, the conflict between the province in the information of the province field and the province in the information of the address field is written into an error recording module.
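The three pre-configured behaviours can be sketched as a small policy switch. The enum, the function name, the error key, and the decision to return `None` when only an error is recorded are all assumptions for illustration; the text does not say which value continues in the flow in the error-recording case.

```python
from enum import Enum
from typing import Dict, Optional


class ConflictPolicy(Enum):
    PREFER_FIELD = "prefer_field"      # the administrative district field prevails
    PREFER_ADDRESS = "prefer_address"  # the value parsed from the address prevails
    RECORD_ERROR = "record_error"      # write the conflict to the error record


def resolve_conflict(
    field_name: str,
    field_value: str,
    address_value: str,
    policy: ConflictPolicy,
    errors: Dict[str, str],
) -> Optional[str]:
    """Return the value to use downstream, recording an error if configured."""
    if field_value == address_value:
        return field_value  # no conflict
    if policy is ConflictPolicy.PREFER_FIELD:
        return field_value
    if policy is ConflictPolicy.PREFER_ADDRESS:
        return address_value
    # RECORD_ERROR: store "error type" -> "field name" and return nothing.
    errors["conflict error"] = field_name
    return None
```

For the fig. 6 example, calling the function with the province field value "Guangxi Zhuang Autonomous Region", the address value "Guangdong Province", and `ConflictPolicy.PREFER_ADDRESS` would carry "Guangdong Province" forward.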
In one or more embodiments of the present application, S108 may include:
under the condition that second administrative region information in at least one level of administrative region information is matched with information of a first field in the point-of-interest data, determining whether the second administrative region information conflicts with information of a second field in the point-of-interest data, wherein the first field is an administrative region field corresponding to the second administrative region information, and the second field is an administrative region field except the first field;
in the event that a conflict is determined to exist, the point of interest data is updated. For example, based on one of the second administrative area information and the information of the second field, the other information is reset so that the two do not conflict with each other.
For example, as shown in fig. 7, the information "Guangxi Province" in the province field is standardized to "Guangxi Zhuang Autonomous Region", and the standardized information does not match the province "Guangdong Province" in the address field; that is, a province conflict occurs.
In addition, the information "south china" in the city field does not match the city "Dongguan City" in the address field; that is, a city conflict occurs. In this case, it is assumed that the address field is used as the reference, so the city is uniformly set to the city in the address field, namely "Dongguan City".
However, after the province and the city are uniformly set, it is found that "Dongguan City" is not within the jurisdiction of "Guangxi Zhuang Autonomous Region"; that is, the province and the city still conflict after the uniform setting.
Whether the second administrative district information conflicts with the information of the second field in the point of interest data can be determined through the embodiment of the application, and the second administrative district information can be district information or any administrative district information.
Determining whether the second administrative area information conflicts with information of a second field in the point of interest data may specifically include: determining whether one of the two administrative districts is in the administration range of the other administrative district, wherein the two administrative districts comprise an administrative district corresponding to the second administrative district information and an administrative district corresponding to the second field; in the case where one administrative district is not within the jurisdiction of another administrative district, it can be determined that there is a conflict between the two.
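The jurisdiction test can be sketched as a membership check against a table of which lower-level districts each higher-level district administers. The table contents and function names below are hypothetical and cover only a few entries for illustration.

```python
from typing import Dict, Set

# Hypothetical jurisdiction table: higher-level district -> districts it administers.
JURISDICTION: Dict[str, Set[str]] = {
    "Guangdong Province": {"Dongguan City", "Shenzhen City", "Guangzhou City"},
    "Guangxi Zhuang Autonomous Region": {"Nanning City"},
}


def in_jurisdiction(child: str, parent: str, table: Dict[str, Set[str]]) -> bool:
    """True if `child` is administered by `parent` according to `table`."""
    return child in table.get(parent, set())


def has_conflict(child: str, parent: str, table: Dict[str, Set[str]]) -> bool:
    """Two districts conflict when the lower one is outside the higher one."""
    return not in_jurisdiction(child, parent, table)
```

With this table, a city of "Dongguan City" paired with "Guangxi Zhuang Autonomous Region" is reported as a conflict, while the same city paired with "Guangdong Province" is not.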
The following is an exemplary description of embodiments of the present application.
In one example, assume that the second administrative district information is the county name in the address and the information of the first field is the county name corresponding to the county field. In the case that the county name in the address is consistent with the county name corresponding to the county field, a multi-source trust manner may be used to determine whether the county name in the address conflicts with the city name or the province name, where the priority of the city is higher than the priority of the province. That is, it is first determined whether the county in the address is within the jurisdiction of the city corresponding to the city field, and then whether the county in the address is within the jurisdiction of the province corresponding to the province field.
Determining, in the multi-source trust manner, whether the county name conflicts with the city name or the province name specifically includes the following steps:
determining whether the county in the address is within the jurisdiction of the city corresponding to the city field; if so, the county in the address prevails and participates in the subsequent process; if not, recording that the point of interest data has a conflict error;
under the condition that the point-of-interest data contains no city corresponding to the city field, determining whether the county in the address is within the jurisdiction of the province corresponding to the province field; if so, the county in the address prevails and participates in the subsequent process; if not, recording that the point of interest data has a conflict error;
under the condition that the point-of-interest data contains neither a city corresponding to the city field nor a province corresponding to the province field, no further data source is available for conflict detection, and the county in the address prevails.
The multi-source trust mode is adopted in the judgment, and under the condition that the county in the address does not conflict with the information of a plurality of sources, the county in the address can be considered to be trustable.
When the county in the address conflicts with information from a certain source (for example, a city corresponding to the city field or a province corresponding to the province field), it may be recorded that there is a conflict error in the point of interest data, or output adjustment may be performed according to a conflict configuration.
Performing output adjustment according to the conflict configuration, which may specifically include:
If the configuration specifies that the county field prevails when a conflict is encountered, the county name corresponding to the county field is passed on directly as the county name for the subsequent flow. In addition, the corresponding county code can be looked up by combining the province name, the city name and the county name.
If the configuration specifies that an error is to be recorded when a conflict is encountered, the conflict is recorded, and the error of the point of interest data and the error reason are output without performing the next judgment.
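The multi-source trust check for a county can be sketched as below: the city is consulted first, the province only when no city is available, and the county is trusted when neither source exists. The function name, the two lookup tables, and the error strings are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def check_county(
    county: str,
    city: Optional[str],
    province: Optional[str],
    counties_by_city: Dict[str, Set[str]],
    counties_by_province: Dict[str, Set[str]],
) -> Tuple[bool, Optional[str]]:
    """Return (trusted, error_reason) for the county parsed from the address."""
    if city:
        # The city has higher priority than the province.
        if county in counties_by_city.get(city, set()):
            return True, None
        return False, "county/city conflict"
    if province:
        # No city available: fall back to the province.
        if county in counties_by_province.get(province, set()):
            return True, None
        return False, "county/province conflict"
    # No further source to check against: the county in the address prevails.
    return True, None
```

When the first element of the result is `False`, the caller would either record the error or apply the conflict configuration described above.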
In one or more embodiments of the present application, the method for processing point of interest data may further include:
determining whether the point of interest data comprises administrative region codes or not and whether the at least one level of administrative region information comprises target administrative region information corresponding to the administrative region codes or not;
under the condition that the point-of-interest data comprises an administrative region code and the at least one level of administrative region information comprises the target administrative region information, resetting one of the administrative region code and the target administrative region information;
under the condition that the point-of-interest data comprises administrative region codes and the target administrative region information is not included in the information of at least one stage of administrative region, adding the target administrative region information into the information of the address field according to the administrative region codes;
and under the condition that the point of interest data does not include administrative region codes and the at least one level of administrative region information includes target administrative region information, adding administrative region codes in the point of interest data according to the target administrative region information.
The following describes embodiments of the present application by taking the case where the administrative district code is a district/county code as an example.
As shown in fig. 8, in the case where district/county information (hereinafter, district information) is obtained from the address field and a district/county code (hereinafter, district code) is included in the point-of-interest data, the district code is used first to look up the corresponding district information, because the district code is more accurate than the district information; for example, the same district may carry different district names owing to typing mistakes or renaming. If no corresponding district information is found, the district information in the address field is used to look up the corresponding district code. If a corresponding district code is found, the found district code prevails; if not, the district information and the district code are both set to null.
And in the case that the region-county information does not exist in the address field and the region code is included in the point-of-interest data, searching the corresponding region-county information by using the region code.
And under the condition that the address field has region and county information and the point of interest data has no region code, searching the corresponding region code by using the region and county information.
When the address field does not have the district information and the point of interest data does not have the district code, it indicates that the district information and the district code cannot be uniformly set, and both the district information and the district code are set to be null.
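The four cases above can be folded into one function, sketched here under simplifying assumptions: the lookup tables are keyed by the bare district name and code (the application combines province, city and district names), and a code or name that cannot be found in its table is treated as a failed lookup. All names are hypothetical.

```python
from typing import Dict, Optional, Tuple


def unify_district(
    name: Optional[str],
    code: Optional[str],
    code_by_name: Dict[str, str],
    name_by_code: Dict[str, str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return a consistent (district name, district code) pair, or (None, None)."""
    if code is not None:
        # The code is considered more reliable, so it is tried first.
        found_name = name_by_code.get(code)
        if found_name is not None:
            return found_name, code
    if name is not None:
        # Code missing or unknown: fall back to the name from the address.
        found_code = code_by_name.get(name)
        if found_code is not None:
            return name, found_code
    # Nothing usable: both values are set to null.
    return None, None
```

Because the function accepts `None` for either input, it covers the cases where only the code, only the district information, or neither is present.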
The district information or the district code can be looked up in a preset correspondence between district information and district codes. Taking "Guangdong Province" and "Hunan Province" as examples, the data structure of the correspondence between district information and district codes may be as follows:
{"Guangdong Province": {"Shenzhen City": {"Futian District": "440304", "Longgang District": "440307", …},
"Guangzhou City": {"Tianhe District": "440107", "Yuexiu District": "440104", …}},
"Hunan Province": {"Changde City": {"Wuling District": "430702", "Shimen County": "430726"},
"Yueyang City": {"Huarong County": "430623", "Junshan District": "430611"}}, …}
The above is the correspondence between district information and district codes for "Guangdong Province" and "Hunan Province"; the data structure for other provinces is similar and is not described herein again.
When the province name, the city name and the district name are known, the corresponding district code can be looked up in the correspondence between district information and district codes of each province, which unifies the district information and the district code. This way of unifying copes well with null values: it does not require that both the district code and the district information be present.
After unification, only the district code is focused in the subsequent steps, and meanwhile, the city code and the province code can be conveniently deduced according to the natural attribute of the district code.
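A minimal sketch of the nested lookup and of deriving the higher-level codes from a district code. It assumes six-digit codes in which the first two digits identify the province and the first four the city, which is one reading of the "natural attribute" mentioned above; the table excerpt and function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Table = Dict[str, Dict[str, Dict[str, str]]]

# Small excerpt in the province -> city -> district -> code layout.
TABLE: Table = {
    "Guangdong Province": {
        "Shenzhen City": {"Futian District": "440304", "Longgang District": "440307"},
    },
    "Hunan Province": {
        "Yueyang City": {"Huarong County": "430623", "Junshan District": "430611"},
    },
}


def lookup_district_code(
    table: Table, province: str, city: str, district: str
) -> Optional[str]:
    """Walk province -> city -> district; return None if any level is missing."""
    return table.get(province, {}).get(city, {}).get(district)


def parent_codes(district_code: str) -> Tuple[str, str]:
    """Return (city_code, province_code) by zero-padding the code prefixes.

    Assumes the six-digit layout described in the lead-in.
    """
    return district_code[:4] + "00", district_code[:2] + "0000"
```

For example, looking up "Futian District" under "Shenzhen City" in "Guangdong Province" yields "440304", from which the city code "440300" and the province code "440000" follow under the stated assumption.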
In one or more embodiments of the present application, the point-of-interest data may further include a longitude and a latitude corresponding to the longitude and latitude field and an administrative region code corresponding to the administrative region code field.
S108 may include:
under the condition that each level of administrative district information is matched with the information of the corresponding administrative district field, acquiring the administrative district where the position corresponding to the longitude and latitude is located; and under the condition that the administrative district where the position corresponding to the longitude and latitude is located does not match the administrative region code, updating the point-of-interest data. For example, if the administrative district at the position corresponding to the longitude and latitude is Tianhe District, Guangzhou City, while the administrative region code is that of Futian District, Shenzhen City, then Tianhe District of Guangzhou City does not match the administrative region code, and the point-of-interest data is determined to be in error.
As one example, the administrative district code includes a district code.
In the related art, acquiring an administrative area where the longitude and latitude corresponding position is located may specifically include: and searching the counties where the positions corresponding to the longitude and latitude are located in the plurality of counties by utilizing the longitude and latitude. However, this searching method needs to traverse multiple counties, and the obtaining process is slow.
In order to improve efficiency of obtaining an administrative area where the longitude and latitude corresponding position is located, in one or more embodiments of the present application, obtaining the administrative area where the longitude and latitude corresponding position is located may specifically include:
searching a target provincial region where the longitude and the latitude are located in a plurality of preset provincial regions, wherein the target provincial region comprises a plurality of city-level regions; searching a target city-level region where the longitude and the latitude are located in a plurality of city-level regions, wherein the target city-level region comprises a plurality of county-level regions; and searching a target county-level area where the longitude and latitude are located in a plurality of county-level areas to obtain an administrative district where the position corresponding to the longitude and latitude is located.
The provincial region may be a region formed by a provincial boundary line, or may be a circumscribed rectangular region formed based on the leftmost, rightmost, uppermost, and lowermost points on the provincial boundary line. The circumscribed rectangular region may approximately represent the boundary of the province.
Similarly, the city-level region may be a region formed by the city boundary line or a circumscribed rectangular region of the city boundary line. The county-level region may be a region formed by county boundary lines or a circumscribed rectangular region of the county boundary lines.
In the embodiment of the application, compared with the related-art approach of traversing every county, the hierarchical search improves search efficiency without adding a large amount of extra boundary data. In particular, the search speed may be increased by a factor of 10 to 15.
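The hierarchical search can be sketched with circumscribed rectangles at each level. This is an illustration under assumptions: the `Region` type and `locate` function are hypothetical, every region is represented only by its bounding box, and the coordinates in the usage example are made up. Because bounding boxes of neighbouring regions can overlap, the search moves on to the next candidate when a branch yields no county.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Region:
    name: str
    bbox: Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)
    children: List["Region"] = field(default_factory=list)

    def contains(self, lon: float, lat: float) -> bool:
        min_lon, min_lat, max_lon, max_lat = self.bbox
        return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def locate(provinces: List[Region], lon: float, lat: float) -> Optional[List[str]]:
    """Return [province, city, county] names for the point, or None."""
    for province in provinces:
        if not province.contains(lon, lat):
            continue  # skip every city and county of this province at once
        for city in province.children:
            if not city.contains(lon, lat):
                continue
            for county in city.children:
                if county.contains(lon, lat):
                    return [province.name, city.name, county.name]
    return None


# Made-up rectangles purely for demonstration.
demo = [
    Region("P1", (0, 0, 10, 10), [
        Region("C1", (0, 0, 5, 10), [
            Region("D1", (0, 0, 5, 5)),
            Region("D2", (0, 5, 5, 10)),
        ]),
        Region("C2", (5, 0, 10, 10), [Region("D3", (5, 0, 10, 10))]),
    ]),
]
```

Only the counties of the matching city are tested, rather than every county in the data set, which is where the saving over a flat traversal comes from.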
In order to better explain the point of interest data processing method provided by the present application, an exemplary description is given below with reference to the point of interest data processing method shown in fig. 9.
FIG. 9 is a flowchart illustrating a point of interest data processing method according to another embodiment of the present application. As shown in fig. 9, the method for processing point of interest data includes:
S202, detecting the longitude and latitude.
S202 may specifically include: performing missing value detection on the longitude and the latitude in the point of interest data, and, under the condition that at least one of the longitude and the latitude does not exist in the point of interest data, executing S216 to write the corresponding missing field and the error information into the error recording module. If the longitude and latitude are in error, subsequent steps that require longitude and latitude filling or checking can be skipped.
The error information recorded by the error recording module is stored as key-value pairs, in the following form:
{"error type A": "field name 1",
"error type B": "field name 2",
"error type C": "field name 3",
……}
The error type recorded by the error recording module is a predefined error type, and the field name is a field name carried in the data.
In addition, if the longitude and latitude are not missing, whether the point position where the longitude and latitude is located falls within the national boundary is judged. If the latitude and longitude information falls outside the country boundary, S216 is executed to write the latitude and longitude error information into the error recording module. And if the longitude and latitude are wrong, all subsequent links needing longitude and latitude filling or checking can be skipped.
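The longitude and latitude detection in S202 can be sketched as follows. The field names, the error type strings, and especially the rectangle standing in for the national boundary are assumptions for illustration; a real check would test the point against actual boundary data.

```python
from typing import Dict

# Illustrative rectangle (min_lon, min_lat, max_lon, max_lat); not an
# authoritative national boundary.
HYPOTHETICAL_BOUNDS = (73.0, 3.0, 136.0, 54.0)


def check_lon_lat(poi: dict, errors: Dict[str, str]) -> bool:
    """Return True if longitude/latitude are usable; otherwise record an error."""
    missing = [f for f in ("longitude", "latitude") if poi.get(f) is None]
    if missing:
        # Stored in the "error type": "field name" form used by the error record.
        errors["missing value"] = ",".join(missing)
        return False
    min_lon, min_lat, max_lon, max_lat = HYPOTHETICAL_BOUNDS
    inside = (
        min_lon <= poi["longitude"] <= max_lon
        and min_lat <= poi["latitude"] <= max_lat
    )
    if not inside:
        errors["out of boundary"] = "longitude,latitude"
    return inside
```

A `False` result signals that the later steps that fill in or check longitude and latitude can be skipped for this record.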
The method for processing the point of interest data further comprises the following steps:
S204, parsing the province, city and county in the address field;
S206, judging whether the province in the province field conflicts with the province in the address field; if not, directly executing S208, and if so, executing S216 and then S208;
S208, judging whether the city in the city field conflicts with the city in the address field; if not, directly executing S210, and if so, executing S216 and then S210;
S210, unifying the county in the address field and the county code in the county code field;
S212, judging whether the county in the county field conflicts with the county in the address field; if not, directly executing S214, and if so, executing S216 and then S214;
S214, judging whether the county in the county field conflicts with the longitude and latitude; if not, directly executing S218, and if so, executing S216 and then S218;
S216, recording errors, for example, recording the data errors found at each step of the processing, such as province, city and county name errors, conflict errors, and longitude and latitude errors;
S218, clearing errors.
Wherein, S218 specifically includes: and acquiring the errors recorded in the S216 and summarizing the errors.
In S218, if the district code is not null and the above detection is passed, the district name, the city code, the city name, the province code and the province name can be deduced from the district code, yielding data with unified spatial information. Previously recorded errors relating to the province, city and district, such as province name errors and city name errors, can then be cleared.
Under the condition that the area code is null and the city name is not null, the city code, province code and province name can be reversely deduced according to the city name. Errors that have occurred previously with respect to the province can then be cleared away.
If the area code and the city name are null, the province code is deduced according to the province name.
Therefore, by uniformly setting the district and county codes in the address field and the district and county code field and reversely deducing the information of the administrative district of the previous level according to the information of the administrative district, the unification of the information such as the address field, the administrative district field and the code field in the point of interest data is realized, and the correction of the error of the wrong point of interest data is realized.
The method for processing the point of interest data further comprises the following steps:
s220, outputting the error reason according to the error clearing result.
As one example, in S220, in addition to outputting the error cause, the point-of-interest data after error correction may be output.
For example, the original point of interest data is shown in table 6, and the corrected point of interest data obtained after correcting the error of the original point of interest data is shown in table 7.
TABLE 6
Figure BDA0003093189410000161
TABLE 7
Figure BDA0003093189410000171
In the embodiment of the application, data with conflicting spatial attributes can be detected quickly and accurately. This addresses the problems of incomplete use of information and of conflicts between pieces of information in conventional data processing: the address, the administrative district fields and the longitude and latitude of the data can all be used for judgment and supplementation, so that the value of each field of the data is fully exploited. In addition, a fault-tolerant prefix tree can be used to obtain information such as the province, city and county in the address field, and the multi-source trust manner and the conflict configuration reduce conflict misidentification, allow weakly conflicting data to be corrected, and allow the output mode to be adjusted flexibly.
It should be noted that, in the point-of-interest data processing method provided in the embodiment of the present application, the execution main body may be the point-of-interest data processing apparatus, or a control module in the point-of-interest data processing apparatus for executing the point-of-interest data processing method. In the embodiment of the present application, a method for executing point of interest data processing by using a point of interest data processing device is taken as an example, and the point of interest data processing device provided in the embodiment of the present application is described.
FIG. 10 is a schematic structural diagram of an embodiment of a point of interest data processing apparatus provided in the present application. As shown in fig. 10, the point-of-interest data processing apparatus 300 includes:
a first obtaining module 302, configured to obtain point-of-interest data, where the point-of-interest data includes an address field and an administrative area field;
a second obtaining module 304, configured to obtain information of at least one stage of administrative districts in the address field;
a matching module 306, configured to match the information of the at least one level of administrative region with the information of the administrative region field in the point of interest data to obtain a matching result;
and an updating module 308, configured to update the point of interest data according to the matching result.
In the embodiment of the application, point-of-interest data is obtained firstly, then administrative region information is obtained from an address field in the point-of-interest data, and the administrative region information is matched with information of an administrative region field in the point-of-interest data to obtain a matching result; and then, updating the point of interest data according to the matching result. Therefore, errors in the point of interest data can be corrected, the quality of the point of interest data is improved, and data support is provided for functions of position query, positioning and the like based on the point of interest data.
In one or more embodiments of the present application, the second obtaining module 304 may include:
the word segmentation unit is used for segmenting the information of the address field in the interest point data to obtain segmented words;
the matching unit is used for matching the word segmentation words with information corresponding to nodes on a preset prefix tree, wherein the preset prefix tree comprises administrative region information, and the administrative region information is formed by information corresponding to a plurality of nodes on the same path of the preset prefix tree;
and the first determining unit is used for determining the administrative region information according to the word segmentation words under the condition that the word segmentation words are matched with the information corresponding to the nodes on the preset prefix tree.
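The prefix-tree matching performed by these units can be sketched as below. This is a simplified illustration, not the application's implementation: it treats each character as a token instead of using a word segmenter, omits the fault tolerance mentioned elsewhere in the text, and keeps only the longest match at each position. All class and function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Set when the path from the root to this node spells a full name.
        self.name: Optional[str] = None


def build_trie(names: Iterable[str]) -> TrieNode:
    root = TrieNode()
    for name in names:
        node = root
        for token in name:
            node = node.children.setdefault(token, TrieNode())
        node.name = name
    return root


def find_regions(tokens: List[str], root: TrieNode) -> List[Tuple[int, int, str]]:
    """Return (start, end, name) for the longest match found at each position."""
    found: List[Tuple[int, int, str]] = []
    i = 0
    while i < len(tokens):
        node, j, best = root, i, None
        while j < len(tokens) and tokens[j] in node.children:
            node = node.children[tokens[j]]
            j += 1
            if node.name is not None:
                best = (i, j, node.name)
        if best is not None:
            found.append(best)
            i = best[1]  # continue after the matched name
        else:
            i += 1
    return found


trie = build_trie(["广东省", "深圳市", "福田区"])
regions = find_regions(list("广东省深圳市福田区某路1号"), trie)
```

Each returned tuple carries the position of the name in the address, which is the information the overlap check in the first determining module relies on.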
In one or more embodiments of the present application, the point of interest data processing apparatus 300 may further include:
the first determining module is used for determining whether the positions of any two pieces of administrative area information in the at least one level of administrative area information are overlapped or not under the condition that the number of the administrative area information of the at least one level of administrative area information is multiple;
and the processing module is used for carrying out invalidation processing on the two administrative district information if the administrative district information is overlapped, or reacquiring the administrative district information from the address field according to the two administrative district information, wherein the invalidation processing comprises the following steps: recording the two administrative district information as invalid administrative district information, or deleting the two administrative district information.
In one or more embodiments of the present application, the point of interest data processing apparatus 300 may further include:
the second determining module is used for determining whether the point-of-interest data comprises administrative region codes or not and whether the at least one level of administrative region information comprises target administrative region information corresponding to the administrative region codes or not;
the setting module is used for resetting one of the administrative region code and the target administrative region information under the condition that the point-of-interest data comprises the administrative region code and the at least one level of administrative region information comprises the target administrative region information;
the first adding module is used for adding target administrative region information into the information of the address field according to the administrative region codes under the condition that the point-of-interest data comprises the administrative region codes and the target administrative region information is not included in the information of at least one stage of administrative region;
and the second adding module is used for adding the administrative region codes into the point-of-interest data according to the target administrative region information under the condition that the point-of-interest data does not include the administrative region codes and the at least one level of administrative region information includes the target administrative region information.
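The three cases handled by the setting module and the two adding modules can be sketched as follows. This is only an illustration under stated assumptions: a single city level, a two-entry lookup table, the `trust_code` switch, and the way the name is added to the address are all hypothetical choices, not details from the application.

```python
from typing import Optional, Tuple

# Illustrative lookup table (assumption): administrative region code -> city name.
CODE_TO_NAME = {"440100": "Guangzhou", "440300": "Shenzhen"}
NAME_TO_CODE = {name: code for code, name in CODE_TO_NAME.items()}


def reconcile(adcode: Optional[str], address: str,
              extracted_city: Optional[str],
              trust_code: bool = True) -> Tuple[Optional[str], str]:
    """Return (adcode, address) after applying the three cases."""
    if adcode is not None and extracted_city is not None:
        # Case 1: both present, so reset one according to the other.
        if trust_code:
            address = address.replace(extracted_city, CODE_TO_NAME[adcode], 1)
        else:
            adcode = NAME_TO_CODE[extracted_city]
    elif adcode is not None:
        # Case 2: code present, name absent, so add the name to the address.
        address = CODE_TO_NAME[adcode] + " " + address
    elif extracted_city is not None:
        # Case 3: name present, code absent, so add the code.
        adcode = NAME_TO_CODE[extracted_city]
    return adcode, address
```

Codes or names missing from the table raise `KeyError` in this sketch; a real system would need a complete table and a policy for unknown values.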
In one or more embodiments of the present application, the update module 308 can include:
the second determining unit is used for determining whether the second administrative area information conflicts with information of a second field in the point-of-interest data under the condition that the second administrative area information in the at least one level of administrative area information is matched with the information of the first field in the point-of-interest data, wherein the first field is an administrative area field corresponding to the second administrative area information, and the second field is an administrative area field except the first field;
and the first updating unit is used for updating the point of interest data under the condition that the conflict is determined to exist.
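A hedged sketch of this conflict check, assuming the first field is a city field and the second field is a province field: the `PARENT_PROVINCE` table and the choice to overwrite the province on conflict are assumptions made for illustration, since the text above only states that the point of interest data is updated.

```python
from typing import Dict, Optional

# Illustrative parent table (assumption): city name -> province it belongs to.
PARENT_PROVINCE = {"Shenzhen": "Guangdong", "Changsha": "Hunan"}


def fix_province_conflict(poi: Dict[str, Optional[str]],
                          extracted_city: str) -> Dict[str, Optional[str]]:
    """If the extracted city matches the city field, check it against the province field."""
    updated = dict(poi)
    if updated.get("city") != extracted_city:
        # The first field does not match; this sketch leaves that case untouched.
        return updated
    expected = PARENT_PROVINCE.get(extracted_city)
    if expected is not None and updated.get("province") != expected:
        # Conflict: the province field disagrees with the city's parent province.
        updated["province"] = expected
    return updated
```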
In one or more embodiments of the present application, the point-of-interest data further includes a longitude and latitude corresponding to the longitude and latitude field and an administrative region code corresponding to the administrative region code field; the update module 308 may include:
the second acquisition unit is used for acquiring the administrative regions where the longitude and latitude corresponding positions are located under the condition that the information of each level of administrative region is matched with the information of the corresponding administrative region field;
and the second updating unit is used for updating the point-of-interest data under the condition that the administrative region where the longitude and latitude corresponding positions are located is not matched with the administrative region codes.
In one or more embodiments of the present application, the administrative area code comprises a prefecture code; the second acquisition unit is used for:
searching a target provincial region where the longitude and the latitude are located in a plurality of preset provincial regions, wherein the target provincial region comprises a plurality of city-level regions;
searching a target city-level region where the longitude and the latitude are located in a plurality of city-level regions, wherein the target city-level region comprises a plurality of county-level regions;
and searching a target county-level area where the longitude and latitude are located in a plurality of county-level areas to obtain an administrative district where the position corresponding to the longitude and latitude is located.
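The level-by-level search above can be sketched as a descent through nested regions, testing only the children of the region found at the previous level. The `Region` type, the polygon boundaries, and the ray-casting containment test are assumptions for illustration; the application does not state how containment is tested.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


@dataclass
class Region:
    name: str
    boundary: List[Point]                      # polygon vertices in order
    children: List["Region"] = field(default_factory=list)


def contains(boundary: List[Point], p: Point) -> bool:
    """Ray-casting point-in-polygon test (points on an edge are not handled specially)."""
    x, y = p
    inside = False
    n = len(boundary)
    for i in range(n):
        x1, y1 = boundary[i]
        x2, y2 = boundary[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def locate(provinces: List[Region], p: Point) -> Optional[List[str]]:
    """Return [province, city, county] names for p, or None if any level misses."""
    path: List[str] = []
    candidates = provinces
    for _ in range(3):  # provincial level, city level, county level
        hit = next((r for r in candidates if contains(r.boundary, p)), None)
        if hit is None:
            return None
        path.append(hit.name)
        candidates = hit.children
    return path
```

Searching only inside the matched parent keeps the number of county-level boundaries tested small compared with scanning every county-level region directly.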
The interest point data processing apparatus in the embodiment of the present application may be an apparatus, and may also be a component, an integrated circuit, or a chip in a terminal. The device can be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like, and the embodiments of the present application are not particularly limited.
The point of interest data processing apparatus in the embodiment of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, and the embodiments of the present application are not specifically limited in this respect.
The device for processing point of interest data provided in the embodiment of the present application can implement each process implemented by the method embodiment of fig. 1 or fig. 9, and is not described here again to avoid repetition.
As shown in fig. 11, the electronic device 400 includes a processor 401, a memory 402, and a program or an instruction stored in the memory 402 and executable on the processor 401, where the program or the instruction is executed by the processor 401 to implement the processes of the above-mentioned interest point data processing method embodiment, and can achieve the same technical effects, and no further description is given here to avoid repetition.
It should be noted that the electronic devices in the embodiments of the present application include the mobile electronic device and the non-mobile electronic device described above.
Fig. 12 is a schematic structural diagram of another embodiment of an electronic device provided in the present application.
As shown in fig. 12, the electronic device 500 includes, but is not limited to: a radio frequency unit 501, a network module 502, an audio output unit 503, an input unit 504, a sensor 505, a display unit 506, a user input unit 507, an interface unit 508, a memory 509, a processor 510, and the like.
Those skilled in the art will appreciate that the electronic device 500 may further include a power supply (e.g., a battery) for supplying power to various components, and the power supply may be logically connected to the processor 510 via a power management system, so as to implement functions of managing charging, discharging, and power consumption via the power management system. The electronic device structure shown in fig. 12 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than those shown, combine some components, or arrange the components differently, which is not described in detail here.
Wherein processor 510 is configured to:
obtaining point of interest data, wherein the point of interest data comprises an address field and an administrative area field;
acquiring at least one stage of administrative region information in the address field;
matching the information of at least one stage of administrative region with the information of the administrative region field in the interest point data to obtain a matching result;
and updating the point of interest data according to the matching result.
In the embodiment of the application, point-of-interest data is obtained firstly, then administrative region information is obtained from an address field in the point-of-interest data, and the administrative region information is matched with information of an administrative region field in the point-of-interest data to obtain a matching result; and then, updating the point of interest data according to the matching result. Therefore, errors in the point of interest data can be corrected, the quality of the point of interest data is improved, and data support is provided for functions of position query, positioning and the like based on the point of interest data.
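The four steps summarized above can be sketched end to end as follows. This is a simplified illustration, not the method of the application: the `GAZETTEER` table, the substring-based extraction, and the policy of overwriting a mismatching field with the value found in the address are all assumptions.

```python
from typing import Dict

# Illustrative gazetteer (assumption): administrative level -> known names.
GAZETTEER = {
    "province": ["Guangdong", "Hunan"],
    "city": ["Shenzhen", "Changsha"],
}


def extract_regions(address: str) -> Dict[str, str]:
    """Acquire administrative region information from the address field."""
    found = {}
    for level, names in GAZETTEER.items():
        for name in names:
            if name in address:
                found[level] = name
                break
    return found


def match_regions(poi: Dict[str, str], extracted: Dict[str, str]) -> Dict[str, bool]:
    """Compare each extracted level with the corresponding administrative field."""
    return {level: poi.get(level) == name for level, name in extracted.items()}


def update_poi(poi: Dict[str, str], extracted: Dict[str, str],
               result: Dict[str, bool]) -> Dict[str, str]:
    """Overwrite fields that failed to match (an assumed update policy)."""
    updated = dict(poi)
    for level, matched in result.items():
        if not matched:
            updated[level] = extracted[level]
    return updated
```

For a record whose address names Shenzhen but whose city field holds Changsha, the sketch reports a mismatch at the city level and rewrites only that field.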
In one or more embodiments of the present application, processor 510 is specifically configured to:
segmenting words of the information of the address field in the interest point data to obtain segmented words;
matching the word segmentation words with information corresponding to nodes on a preset prefix tree, wherein the preset prefix tree comprises administrative region information, and the information corresponding to a plurality of nodes on the same path of the preset prefix tree forms the administrative region information;
and under the condition that the word segmentation words are matched with the information corresponding to the nodes on the preset prefix tree, determining administrative region information according to the word segmentation words.
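The prefix-tree matching above can be sketched as follows, assuming one tree node per administrative level so that a root-to-leaf path spells out a province, city, and county. Whitespace tokenization stands in for the word segmentation step, and all names are hypothetical; a real address in Chinese would need a proper word segmenter.

```python
from typing import Dict, List


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}


def build_trie(paths: List[List[str]]) -> TrieNode:
    """Each path is one administrative hierarchy, e.g. [province, city, county]."""
    root = TrieNode()
    for path in paths:
        node = root
        for name in path:
            node = node.children.setdefault(name, TrieNode())
    return root


def extract_path(root: TrieNode, tokens: List[str]) -> List[str]:
    """Walk the tree with the segmented words, keeping the words that match a node."""
    node = root
    matched: List[str] = []
    for token in tokens:
        if token in node.children:
            node = node.children[token]
            matched.append(token)
    return matched
```

In this sketch a word is accepted only if it is a child of the last matched node, so an address that omits the province yields no match; handling such omissions would need additional entry points into the tree.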
In one or more embodiments of the present application, processor 510 is further configured to:
under the condition that the number of administrative district information of at least one level of administrative district information is multiple, determining whether the positions of any two pieces of administrative district information in the at least one level of administrative district information are overlapped in an address field;
and if the two administrative area information are overlapped, performing invalidation processing on the two pieces of administrative area information, or acquiring the administrative area information from the address field again according to the two pieces of administrative area information, wherein the invalidation processing comprises the following steps: recording the two administrative district information as invalid administrative district information, or deleting the two administrative district information.
In one or more embodiments of the present application, processor 510 is further configured to:
determining whether the point of interest data comprises administrative region codes or not and whether the at least one level of administrative region information comprises target administrative region information corresponding to the administrative region codes or not;
under the condition that the point-of-interest data comprises administrative region codes and at least one level of administrative region information comprises target administrative region information, resetting one of the administrative region code and the target administrative region information according to the other;
under the condition that the point-of-interest data comprises administrative region codes and the target administrative region information is not included in the information of at least one stage of administrative region, adding the target administrative region information into the information of the address field according to the administrative region codes;
and under the condition that the point of interest data does not include administrative region codes and the at least one level of administrative region information includes target administrative region information, adding administrative region codes in the point of interest data according to the target administrative region information.
In one or more embodiments of the present application, processor 510 is specifically configured to:
under the condition that second administrative region information in at least one level of administrative region information is matched with information of a first field in the point-of-interest data, determining whether the second administrative region information conflicts with information of a second field in the point-of-interest data, wherein the first field is an administrative region field corresponding to the second administrative region information, and the second field is an administrative region field except the first field;
in the event that a conflict is determined to exist, the point of interest data is updated.
In one or more embodiments of the present application, the point-of-interest data further includes a longitude and latitude corresponding to the longitude and latitude field and an administrative region code corresponding to the administrative region code field; processor 510 is specifically configured to:
under the condition that each level of administrative district information is matched with the information of the corresponding administrative district field, acquiring the administrative district where the longitude and latitude corresponding position is located; and under the condition that the administrative region where the position corresponding to the longitude and latitude is located is not matched with the administrative region code, determining that the data of the point of interest is wrong.
In one or more embodiments of the present application, the administrative area code comprises a prefecture code; processor 510 is specifically configured to:
searching a target provincial region where the longitude and the latitude are located in a plurality of preset provincial regions, wherein the target provincial region comprises a plurality of city-level regions; searching a target city-level region where the longitude and the latitude are located in a plurality of city-level regions, wherein the target city-level region comprises a plurality of county-level regions; and searching a target county-level area where the longitude and latitude are located in a plurality of county-level areas to obtain an administrative district where the position corresponding to the longitude and latitude is located.
It should be understood that in the embodiment of the present application, the input unit 504 may include a Graphics Processing Unit (GPU) 5041 and a microphone 5042, and the graphics processing unit 5041 processes image data of still pictures or videos obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The display unit 506 may include a display panel 5061, and the display panel 5061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 507 includes a touch panel 5071 and other input devices 5072. The touch panel 5071, also referred to as a touch screen, may include two parts: a touch detection device and a touch controller. Other input devices 5072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in further detail herein. The memory 509 may be used to store software programs as well as various data including, but not limited to, application programs and operating systems. Processor 510 may integrate an application processor, which primarily handles operating systems, user interfaces, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 510.
The embodiments of the present application further provide a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the above-mentioned point-of-interest data processing method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not described here again.
The processor is the processor in the electronic device in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to execute a program or an instruction to implement each process of the above-mentioned embodiment of the point of interest data processing method, and can achieve the same technical effect, and in order to avoid repetition, the description is omitted here.
It should be understood that the chips mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order based on the functions involved, e.g., the methods described may be performed in an order different than that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, or a network device) to execute the method of the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (16)

1. A method for processing point of interest data is characterized by comprising the following steps:
obtaining point of interest data, wherein the point of interest data comprises an address field and an administrative area field;
acquiring at least one stage of administrative region information in the address field;
matching the information of the at least one stage of administrative region with the information of the administrative region field in the point of interest data to obtain a matching result;
and updating the point of interest data according to the matching result.
2. The method of claim 1, wherein obtaining at least one level of administrative area information in the address field comprises:
segmenting words of the information of the address field in the interest point data to obtain segmented words;
matching the word segmentation words with information corresponding to nodes on a preset prefix tree, wherein the preset prefix tree comprises administrative region information, and the administrative region information is formed by information corresponding to a plurality of nodes on the same path of the preset prefix tree;
and under the condition that the word segmentation words are matched with the information corresponding to the nodes on the preset prefix tree, determining administrative region information according to the word segmentation words.
3. The method of claim 1, wherein after obtaining at least one level of administrative area information in the address field, the method further comprises:
under the condition that the number of the administrative district information of the at least one level of administrative district information is multiple, determining whether the positions of any two pieces of administrative district information in the at least one level of administrative district information in the address field are overlapped;
and if the two pieces of administrative area information overlap, performing invalidation processing on the two pieces of administrative area information, or acquiring the administrative area information from the address field again according to the two pieces of administrative area information, wherein the invalidation processing comprises: recording the two pieces of administrative district information as invalid administrative district information, or deleting the two pieces of administrative district information.
4. The method of claim 1, wherein after obtaining at least one level of administrative area information in the address field, the method further comprises:
determining whether the point of interest data comprises administrative region codes or not and whether the at least one level of administrative region information comprises target administrative region information corresponding to the administrative region codes or not;
under the condition that the point-of-interest data comprises the administrative region code and the at least one level of administrative region information comprises the target administrative region information, resetting one of the administrative region code and the target administrative region information according to the other;
under the condition that the point-of-interest data comprises the administrative region code and the target administrative region information is not included in the at least one level of administrative region information, adding the target administrative region information into the information of the address field according to the administrative region code;
and under the condition that the point of interest data does not comprise the administrative region code and the at least one level of administrative region information comprises the target administrative region information, adding the administrative region code into the point of interest data according to the target administrative region information.
5. The method of claim 1, wherein the updating the point of interest data according to the matching result comprises:
under the condition that second administrative region information in the at least one level of administrative region information is matched with information of a first field in the point-of-interest data, determining whether the second administrative region information conflicts with information of a second field in the point-of-interest data, wherein the first field is an administrative region field corresponding to the second administrative region information, and the second field is an administrative region field except the first field;
and updating the point of interest data in the case of determining that the conflict exists.
6. The method of claim 1, wherein the point-of-interest data further comprises a longitude and latitude corresponding to the longitude and latitude field and an administrative region code corresponding to the administrative region code field;
the updating the point of interest data according to the matching result comprises the following steps:
under the condition that the administrative district information of each level is matched with the information of the corresponding administrative district field, acquiring the administrative district where the longitude and latitude corresponding position is located;
and under the condition that the administrative region where the longitude and latitude corresponding positions are located is not matched with the administrative region codes, updating the point of interest data.
7. The method of claim 6, wherein the administrative district code comprises a district code;
the acquiring of the administrative area where the longitude and latitude corresponding position is located includes:
searching a target provincial region where the longitude and the latitude are located in a plurality of preset provincial regions, wherein the target provincial region comprises a plurality of city-level regions;
searching a target city-level region where the longitude and the latitude are located in the plurality of city-level regions, wherein the target city-level region comprises a plurality of county-level regions;
and searching a target county-level area where the longitude and latitude are located in the plurality of county-level areas to obtain an administrative district where the position corresponding to the longitude and latitude is located.
8. An apparatus for processing point of interest data, comprising:
the first acquisition module is used for acquiring point-of-interest data, wherein the point-of-interest data comprises an address field and an administrative area field;
the second acquisition module is used for acquiring at least one stage of administrative region information in the address field;
the matching module is used for matching the information of the at least one stage of administrative region with the information of the administrative region field in the point of interest data to obtain a matching result;
and the updating module is used for updating the point of interest data according to the matching result.
9. The apparatus of claim 8, wherein the second obtaining module comprises:
the word segmentation unit is used for segmenting the information of the address field in the interest point data to obtain segmented words;
the matching unit is used for matching the word segmentation words with information corresponding to nodes on a preset prefix tree, wherein the preset prefix tree comprises administrative region information, and the administrative region information is formed by information corresponding to a plurality of nodes on the same path of the preset prefix tree;
and the first determining unit is used for determining administrative region information according to the word segmentation words under the condition that the word segmentation words are matched with the information corresponding to the nodes on the preset prefix tree.
10. The apparatus of claim 8, further comprising:
the first determining module is used for determining whether the positions of any two pieces of administrative area information in the at least one level of administrative area information in the address field are overlapped or not under the condition that the number of the administrative area information of the at least one level of administrative area information is multiple;
and the processing module is used for performing invalidation processing on the two pieces of administrative district information if they overlap, or acquiring the administrative district information from the address field again according to the two pieces of administrative district information, wherein the invalidation processing comprises: recording the two pieces of administrative district information as invalid administrative district information, or deleting the two pieces of administrative district information.
11. The apparatus of claim 8, further comprising:
the second determining module is used for determining whether the point of interest data comprises administrative region codes or not and whether the at least one level of administrative region information comprises target administrative region information corresponding to the administrative region codes or not;
the setting module is used for resetting one of the administrative region code and the target administrative region information according to the other, under the condition that the point-of-interest data comprises the administrative region code and the target administrative region information is included in the at least one level of administrative region information;
a first adding module, configured to add the target administrative district information to the information of the address field according to the administrative district code under the condition that the point-of-interest data includes the administrative district code and the target administrative district information is not included in the at least one level of administrative district information;
and the second adding module is used for adding the administrative region codes into the point of interest data according to the target administrative region information under the condition that the point of interest data does not include the administrative region codes and the at least one level of administrative region information includes the target administrative region information.
12. The apparatus of claim 8, wherein the update module comprises:
a second determining unit, configured to determine whether second administrative area information in the at least one level of administrative area information conflicts with information of a second field in the point of interest data when the second administrative area information is matched with information of the first field in the point of interest data, where the first field is an administrative area field corresponding to the second administrative area information, and the second field is an administrative area field other than the first field;
and the first updating unit is used for updating the point of interest data under the condition that the conflict is determined to exist.
13. The apparatus of claim 8, wherein the point-of-interest data further comprises a longitude and latitude corresponding to the longitude and latitude field and an administrative region code corresponding to the administrative region code field;
the update module includes:
the second acquisition unit is used for acquiring the administrative regions where the longitude and latitude corresponding positions are located under the condition that the administrative region information of each level is matched with the information of the corresponding administrative region field;
and the second updating unit is used for updating the point of interest data under the condition that the administrative district where the longitude and latitude corresponding position is located is not matched with the administrative district code.
14. The apparatus of claim 13, wherein the administrative district code comprises a district code;
the second obtaining unit is configured to:
searching a target provincial region where the longitude and the latitude are located in a plurality of preset provincial regions, wherein the target provincial region comprises a plurality of city-level regions;
searching a target city-level region where the longitude and the latitude are located in the plurality of city-level regions, wherein the target city-level region comprises a plurality of county-level regions;
and searching a target county-level area where the longitude and latitude are located in the plurality of county-level areas to obtain an administrative district where the position corresponding to the longitude and latitude is located.
15. An electronic device comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, the program or instructions, when executed by the processor, implementing the steps of the point-of-interest data processing method according to any one of claims 1 to 7.
16. A readable storage medium, on which a program or instructions are stored, which when executed by a processor, implement the steps of the point-of-interest data processing method according to any one of claims 1 to 7.
CN202110602863.6A 2021-05-31 2021-05-31 Interest point data processing method and device, electronic equipment and storage medium Pending CN113360789A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110602863.6A CN113360789A (en) 2021-05-31 2021-05-31 Interest point data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110602863.6A CN113360789A (en) 2021-05-31 2021-05-31 Interest point data processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113360789A (en) 2021-09-07

Family

ID=77530684

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202110602863.6A Pending CN113360789A (en) 2021-05-31 2021-05-31 Interest point data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113360789A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115408616A (en) * 2022-09-14 2022-11-29 何日妹 Big data analysis method for cloud service push and cloud service push system
CN117874371A (en) * 2024-03-11 2024-04-12 园测信息科技股份有限公司 Method, system, medium and equipment for inquiring point of interest storage under administrative division
CN117874371B (en) * 2024-03-11 2024-05-31 园测信息科技股份有限公司 Method, system, medium and equipment for inquiring point of interest storage under administrative division

Similar Documents

Publication Title
US11423457B2 (en) User interface and geo-parsing data structure
CN107656913B (en) Map interest point address extraction method, map interest point address extraction device, server and storage medium
US9442905B1 (en) Detecting neighborhoods from geocoded web documents
CN110688434B (en) Method, device, equipment and medium for processing interest points
WO2014137820A2 (en) Systems and methods for associating microposts with geographic locations
CN112528174A (en) Address finishing and complementing method based on knowledge graph and multiple matching and application
CN103324749B (en) A kind of spatialization parsing based on received text address and method for correcting error
Brindley et al. Generating vague neighbourhoods through data mining of passive web data
Moura et al. Reference data enhancement for geographic information retrieval using linked data
CN113360789A (en) Interest point data processing method and device, electronic equipment and storage medium
US11030224B2 (en) Data import and reconciliation
CN111858613B (en) Service data retrieval method
CN110909110A (en) Address standardization method and device, storage medium and processor
US11023465B2 (en) Cross-asset data modeling in multi-asset databases
Koukoletsos A Framework for Quality Evaluation of VGI linear datasets
CN111949845A (en) Method, apparatus, computer device and storage medium for processing mapping information
CN116431625A (en) Positioning analysis method and device for geographic entity and computer equipment
CN104156364A (en) Display method and device of map search result
CN109241208B (en) Address positioning method, address monitoring method, information processing method and device
WO2021246954A1 (en) Processing apparatus and method for determining road names
CN114996600B (en) Multi-temporal image management database data writing and reading method and device
Li et al. Automatic construction and visualization of address models
CN115495469B (en) Method and device for updating chart file and electronic equipment
Abrol et al. MapIt: a case study for location driven knowledge discovery and mining
CN112861532A (en) Address standardization processing method, device and equipment and online search system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination