CN105786922A - Method and equipment for determining missing electronic map data - Google Patents
- Publication number: CN105786922A
- Application number: CN201410830042.8A
- Authority
- CN
- China
- Prior art keywords
- address
- coded
- missing
- electronic map
- map data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Navigation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and equipment for determining missing electronic map data. The method comprises the following steps: when geocoding of an address to be coded fails, determining the address to be coded as missing electronic map data and storing it in a preset missing database; then recombining the address fragments of the address to be coded, according to their position order in that address, to obtain at least one new address containing a part of those address fragments; geocoding the new address; and, if the geocoding fails, determining the new address as missing electronic map data. With this scheme, missing electronic map data is determined by means of geocoding, which provides a basis for subsequently supplementing the electronic map data, so that the electronic map database can be completed in a targeted manner and the success rate of subsequent geocoding is further increased.
Description
Technical Field
The invention relates to the technical field of navigation electronic maps, in particular to a method and equipment for determining missing electronic map data.
Background
Geocoding, also known as address matching, refers to assigning codes that identify the location and attributes of points, lines, and areas. Through geocoding, Chinese address description information or place name description information can be converted into a specific coordinate position point on the earth's surface. This is implemented as follows: the Chinese address description information or place name description information is matched against standard map data in an electronic map database and, when a match is found, the coordinate position point of the matched standard map data is taken as the coordinate position point of the description information. When positioning or searching in an electronic map, address description information input by a user often has to be converted into a specific coordinate position point through geocoding. However, geocoding may fail because the address description information input by the user contains errors, or because the described address is a new address that has not yet been added to the electronic map data.
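The matching described above can be illustrated with a minimal sketch. It assumes that the standard map data is an in-memory dictionary keyed by the exact address string; the function name `geocode`, the dictionary `STANDARD_MAP_DATA` and the coordinates are placeholders chosen for illustration and are not taken from the patent.

```python
from typing import Dict, Optional, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude)


def geocode(address: str, standard_map_data: Dict[str, Coordinate]) -> Optional[Coordinate]:
    """Return the coordinate of the matching standard map entry, or None on failure."""
    return standard_map_data.get(address.strip())


# Hypothetical standard map data with placeholder coordinates.
STANDARD_MAP_DATA = {"xx city xx district xx road": (1.0, 2.0)}

print(geocode("xx city xx district xx road", STANDARD_MAP_DATA))  # match found
print(geocode("xx city xx district yy road", STANDARD_MAP_DATA))  # no match: geocoding fails
```

A real geocoder would match far more loosely than an exact dictionary lookup; the sketch only shows the success and failure outcomes that the rest of the description relies on.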
At present, when address information input by a user is geocoded successfully, the coordinate position point obtained is fed back directly; when geocoding fails, the failure is fed back directly. Address information input by users is likely to describe location points that users will query again, so it has considerable value. If such address information is left unprocessed (for example, its longitude and latitude coordinates are not collected), geocoding will fail again whenever the same address information is submitted later, which hinders both the enrichment of the electronic map database and subsequent geocoding.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for determining missing electronic map data, which determine the missing electronic map data through geocoding and provide a basis for subsequently perfecting an electronic map database.
A method for determining missing electronic map data comprises the following steps:
receiving a geocoding request carrying an address to be coded;
performing word segmentation on the address to be coded in the geocoding request to obtain an address fragment forming the address to be coded;
geocoding an address fragment corresponding to the address to be coded, determining the address to be coded as missing electronic map data when the geocoding fails, and storing the missing electronic map data in a preset missing database;
and recombining the address segments corresponding to the address to be coded according to the position sequence of the address segments in the address to be coded to obtain at least one new address containing a part of the address segments corresponding to the address to be coded, geocoding the new address, and determining the new address as missing electronic map data and storing the missing electronic map data in a preset missing database if the geocoding fails.
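The four steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: `segment` and `geocode` are hypothetical callbacks supplied by the caller, the missing database is modelled as a Python set, fragments are joined without a separator, and the new addresses are taken to be the leading prefixes of the fragment list.

```python
from typing import Callable, List, Set


def handle_request(address: str,
                   segment: Callable[[str], List[str]],
                   geocode: Callable[[str], bool],
                   missing_db: Set[str]) -> bool:
    """Process one geocoding request; return True if the full address geocodes."""
    fragments = segment(address)
    full_address = "".join(fragments)
    if geocode(full_address):
        return True
    # Geocoding failed: the address itself is missing electronic map data.
    missing_db.add(full_address)
    # Recombine the fragments in their original order into shorter new addresses.
    for end in range(1, len(fragments)):
        candidate = "".join(fragments[:end])
        if not geocode(candidate):
            missing_db.add(candidate)
    return False
```

For example, with `segment=str.split`, the input `"C1 C2 C3"` and a geocoder that only knows `C1`, the set would end up holding `C1C2C3` and `C1C2`.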
A missing electronic map data determination apparatus comprising:
the receiving module is used for receiving a geocoding request carrying an address to be coded;
the word segmentation module is used for segmenting the address to be coded in the geocoding request to obtain an address fragment forming the address to be coded;
the first missing electronic map data determining module is used for carrying out geographic coding on the address fragment corresponding to the address to be coded, determining the address to be coded as missing electronic map data when the geographic coding fails, storing the missing electronic map data into a preset missing database, and triggering the second missing electronic map data determining module;
and the second missing electronic map data determining module is used for recombining the address segments corresponding to the addresses to be coded according to the position sequence of the address segments in the addresses to be coded to obtain at least one new address containing a part of the address segments corresponding to the addresses to be coded, geocoding the new address, and determining the new address as missing electronic map data and storing the missing electronic map data in a preset missing database if the geocoding fails.
The invention has the following beneficial effects:
According to the method provided by the embodiment of the invention, when a geocoding request carrying an address to be coded is received, word segmentation is performed on the address to be coded to obtain the address fragments forming it; the address fragments are geocoded and, when the geocoding fails, the address to be coded is determined as missing electronic map data and stored in a preset missing database; the address fragments are then recombined according to their position order in the address to be coded to obtain at least one new address containing a part of the address fragments, the new address is geocoded and, if the geocoding fails, the new address is determined as missing electronic map data and stored in the preset missing database. In this way, missing electronic map data is determined by means of geocoding, both for the address to be coded itself and for the new addresses recombined from its address fragments. The missing electronic map data stored in the preset missing database provides a basis for subsequently supplementing the electronic map data, so that the electronic map database can be perfected in a targeted manner and the success rate of subsequent geocoding is further improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a method for determining missing electronic map data according to an embodiment of the present invention;
Fig. 2 is a second flowchart of a method for determining missing electronic map data according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a device for determining missing electronic map data according to an embodiment of the present invention.
Detailed Description
In order to achieve the purpose of the present invention, embodiments of the present invention provide a method and an apparatus for determining missing electronic map data. When a geocoding request carrying an address to be coded is received, word segmentation is performed on the address to be coded to obtain the address fragments forming it; the address fragments are geocoded and, when the geocoding fails, the address to be coded is determined as missing electronic map data and stored in a preset missing database; the address fragments are then recombined according to their position order in the address to be coded to obtain at least one new address containing a part of the address fragments, the new address is geocoded and, if the geocoding fails, the new address is determined as missing electronic map data and stored in the preset missing database. In this way, missing electronic map data is determined by means of geocoding, both for the address to be coded itself and for the new addresses recombined from its address fragments. The missing electronic map data stored in the preset missing database provides a basis for subsequently supplementing the electronic map data, so that the electronic map database can be perfected in a targeted manner and the success rate of subsequent geocoding is further improved.
Various embodiments of the present invention are described in further detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Embodiment one:
Fig. 1 is a flowchart of a method for determining missing electronic map data according to an embodiment of the present invention. The method includes:
Step 101: receiving a geocoding request carrying an address to be coded.
Wherein, the geocoding request comprises an address to be coded.
Preferably, to improve geocoding efficiency and load balance, the geocoding system in the embodiment of the present invention is provided with a plurality of independent geocoding servers with the same function. When the geocoding system receives a plurality of geocoding requests, it either distributes them evenly among the geocoding servers for parallel processing, or sends each geocoding request to a geocoding server that is currently idle.
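The two dispatch strategies can be sketched as follows. The patent does not say how the even distribution is achieved; round-robin assignment is assumed here as one possible reading, and the class `GeocodingServer` and both function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List, Optional, Tuple


@dataclass
class GeocodingServer:
    """Hypothetical stand-in for one independent geocoding server."""
    name: str
    busy: bool = False


def distribute_evenly(requests: List[str],
                      servers: List[GeocodingServer]) -> List[Tuple[str, str]]:
    """Assign requests to servers in round-robin order (assumed strategy)."""
    rotation = cycle(servers)
    return [(request, next(rotation).name) for request in requests]


def pick_idle(servers: List[GeocodingServer]) -> Optional[GeocodingServer]:
    """Return the first server that is currently idle, or None if all are busy."""
    return next((server for server in servers if not server.busy), None)
```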
Step 102: performing word segmentation on the address to be coded in the geocoding request to obtain the address fragments forming the address to be coded.
Preferably, since the address to be coded is typically entered by a user, it may not be standardized, for example "at Beijing XX street XXXXX building" or "near Beijing Western monograph mall". To normalize the user input, the method further includes the following step before the word segmentation of step 102: determining the invalid words in the address to be coded according to a preset invalid word bank, and deleting them from the address to be coded. Step 102 then performs word segmentation on the address to be coded after the invalid words have been deleted.
Preferably, after the invalid words in the address to be coded are deleted, normalization operations are further performed on the address to be coded, such as unifying the case of numerals and unifying the case of letters.
It should be noted that an invalid word in the embodiment of the present invention is a word that is associated with an address but cannot identify one, for example "home", "find hotel" or "where". In practical applications, if the address to be coded in the geocoding request is found, according to the preset invalid word bank, to contain invalid words such as "go home" or "find a hotel", these words cannot specify a concrete address and the geocoding server cannot recognize them, so they are deleted from the address to be coded.
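The preprocessing described above (deleting invalid words, then normalizing) might look like the following sketch. The word list `INVALID_WORDS` is a made-up example of a preset invalid word bank. The normalization shown (Unicode NFKC folding of full-width digits and letters, followed by upper-casing) is an assumption standing in for the numeral and letter case unification, which the patent does not specify in detail.

```python
import unicodedata
from typing import Iterable

# Hypothetical invalid word bank; the patent only says that it is preset.
INVALID_WORDS = ("find hotel", "go home", "where")


def remove_invalid_words(address: str, invalid_words: Iterable[str] = INVALID_WORDS) -> str:
    """Delete every invalid word, longest first, then tidy the leftover whitespace."""
    for word in sorted(invalid_words, key=len, reverse=True):
        address = address.replace(word, "")
    return " ".join(address.split())


def normalize(address: str) -> str:
    """Fold full-width digits and letters to ASCII and unify letter case (assumed rules)."""
    return unicodedata.normalize("NFKC", address).upper()
```

Calling `normalize(remove_invalid_words(text))` before word segmentation mirrors the order given in the description.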
Preferably, the address to be coded input by the user may be a standard address or a non-standard address. A non-standard address may include an administrative district name, a road name, a residential community name and the like, but also contains some fuzzy words. For example, the address to be coded may be descriptive, such as "opposite to 10 # Suzhou street in Haizhou district, Beijing city", or ambiguous, such as "near a high-tech building in the Hai lake district, Beijing". A standard address is a phrase that accurately determines the address position and consists of a combination of segmented words for administrative district names, road names and residential community names, for example: "xx city xx district xx road No. xx courtyard xx building". In the embodiment of the invention, different word segmentation modes are adopted for standard and non-standard addresses, so that each address to be coded is segmented in the more suitable mode and the segmentation is more accurate. Therefore, before step 102 performs word segmentation on the address to be coded, the following step may further be included:
Matching the address to be coded against a preset non-standard address lexicon; if the matching succeeds, determining that the address to be coded is a non-standard address; if the matching fails, determining that it is a standard address.
In the embodiment of the invention, the non-standard address lexicon contains fuzzy words such as "nearby", "opposite" and "beside". The matching succeeds if any word in the non-standard address lexicon occurs in the address to be coded.
In this case, the word segmentation of the address to be coded in step 102 specifically includes:
when the address to be coded is a non-standard address, performing word segmentation on the address to be coded according to a preset general word segmentation word bank;
and when the address to be coded is a standard address, performing word segmentation on the address to be coded according to a preset standard word segmentation word bank.
In the embodiment of the invention, the general word segmentation word bank is a preset word bank containing the words (such as technical terms and proper nouns) of all industries; the standard word segmentation word bank is a preset word bank of words related to the geographic information industry.
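The classification and the choice of word bank can be sketched as follows. The patent does not name a segmentation algorithm; forward maximum matching is used here purely as an assumed, commonly used dictionary-based technique, and the lexicon contents and all function names are illustrative.

```python
from typing import Collection, List

# Illustrative non-standard address lexicon of fuzzy words.
NON_STANDARD_WORDS = ("nearby", "near", "opposite", "beside")


def is_non_standard(address: str, lexicon: Collection[str] = NON_STANDARD_WORDS) -> bool:
    """The address is non-standard if any fuzzy word occurs in it."""
    return any(word in address for word in lexicon)


def segment(address: str, lexicon: Collection[str]) -> List[str]:
    """Forward maximum matching: at each position take the longest lexicon entry.

    Characters not covered by the lexicon become single-character fragments.
    """
    max_len = max((len(word) for word in lexicon), default=1)
    fragments, i = [], 0
    while i < len(address):
        for size in range(min(max_len, len(address) - i), 0, -1):
            piece = address[i:i + size]
            if size == 1 or piece in lexicon:
                fragments.append(piece)
                i += size
                break
    return fragments


def segment_address(address: str,
                    general_lexicon: Collection[str],
                    standard_lexicon: Collection[str]) -> List[str]:
    """Pick the word bank by address type, then segment."""
    lexicon = general_lexicon if is_non_standard(address) else standard_lexicon
    return segment(address, lexicon)
```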
Step 103: geocoding the address fragments corresponding to the address to be coded.
Step 104: when the geocoding fails, determining the address to be coded as missing electronic map data and storing it in a preset missing database.
Step 105: recombining the address fragments corresponding to the address to be coded according to their position order in the address to be coded to obtain at least one new address containing a part of those address fragments; geocoding the new address; and, if the geocoding fails, determining the new address as missing electronic map data and storing it in the preset missing database.
Step 105 can be implemented in either of the following two modes:
Mode 1: traversing the address fragments of the address to be coded, starting from the first address fragment, and executing the following operations each time an address fragment is traversed, until the last address fragment has been traversed: combining the currently traversed address fragment with the previously traversed address fragments into a new address according to their position order in the address to be coded; geocoding the new address; and, if the geocoding fails, determining the new address as missing electronic map data and storing it in the preset missing database.
For example, the address to be coded C1C2C3C4 consists, in order, of the address fragments C1, C2, C3 and C4, which are traversed starting from the first: on traversing C1, C1 is geocoded as a new address and, if the geocoding fails, C1 is taken as missing electronic map data; on traversing C2, C1C2 is geocoded as a new address and, if the geocoding fails, C1C2 is taken as missing electronic map data; on traversing C3, C1C2C3 is geocoded as a new address and, if the geocoding fails, C1C2C3 is taken as missing electronic map data.
Preferably, since an address fragment nearer the front of the address to be coded has a higher administrative level than one nearer the back, the following operations may be performed instead: traversing from the first address fragment, C1 is geocoded as a new address; if the geocoding fails, C1 is taken as missing electronic map data and the new addresses C1C2 and C1C2C3 are directly determined as missing electronic map data as well, so that all three new addresses are missing; if the geocoding succeeds, C2 is traversed and C1C2 is geocoded; if that geocoding fails, C1C2 is taken as missing electronic map data and C1C2C3 is directly determined as missing electronic map data; if it succeeds, C3 is traversed in the same way.
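Mode 1 and its preferred variant can be sketched as below. The sketch assumes that `geocode` is a caller-supplied callback returning True on success, that fragments are joined without a separator, and that the full address is excluded because step 104 has already handled it, as in the C1C2C3C4 example. The function name and the `stop_early` flag are illustrative.

```python
from typing import Callable, List, Sequence


def find_missing_forward(fragments: Sequence[str],
                         geocode: Callable[[str], bool],
                         stop_early: bool = True) -> List[str]:
    """Test the growing prefixes C1, C1C2, ... and collect those that fail.

    With stop_early, once a prefix fails, every longer prefix is recorded as
    missing without being geocoded (the preferred variant).
    """
    missing = []
    for end in range(1, len(fragments)):
        prefix = "".join(fragments[:end])
        if geocode(prefix):
            continue
        missing.append(prefix)
        if stop_early:
            # A failed higher-level prefix implies all longer prefixes are missing too.
            missing.extend("".join(fragments[:k]) for k in range(end + 1, len(fragments)))
            break
    return missing
```

The early stop saves geocoding calls: if the geocoder knows only C1, the variant geocodes C1 and C1C2 and then records C1C2C3 without a third call.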
Mode 2: removing address fragments one at a time in reverse order, starting from the last address fragment of the address to be coded, and executing the following operations each time an address fragment is removed, until only the first address fragment remains: taking the address formed by the remaining address fragments as a new address; geocoding the new address; and, if the geocoding fails, determining the new address as missing electronic map data and storing it in the preset missing database.
For example, the address to be coded C1C2C3C4 consists, in order, of the address fragments C1, C2, C3 and C4, and fragments are removed starting from the last: the last address fragment C4 is deleted from C1C2C3C4 to obtain the new address C1C2C3, which is geocoded; if the geocoding fails, the last address fragment C3 is deleted from C1C2C3 to obtain the new address C1C2, which is geocoded; if that geocoding fails, the last address fragment C2 is deleted to obtain the new address C1, which is geocoded; and so on.
Preferably, the following operations may also be performed: the last address fragment C4 is deleted from C1C2C3C4 to obtain the new address C1C2C3, which is geocoded; if the geocoding succeeds, the flow ends; if it fails, the last address fragment C3 is deleted from C1C2C3 to obtain the new address C1C2, which is geocoded; if that geocoding succeeds, the flow ends; if it fails, the last address fragment C2 is deleted to obtain the new address C1, which is geocoded; and so on.
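Mode 2 can be sketched in the same style. As before, `geocode` is a hypothetical callback returning True on success and fragments are joined without a separator; the `stop_on_success` flag switches between the basic mode and the preferred variant that ends the flow at the first successful geocoding.

```python
from typing import Callable, List, Sequence


def find_missing_backward(fragments: Sequence[str],
                          geocode: Callable[[str], bool],
                          stop_on_success: bool = True) -> List[str]:
    """Drop trailing fragments one at a time and collect the new addresses that fail."""
    missing = []
    for end in range(len(fragments) - 1, 0, -1):
        candidate = "".join(fragments[:end])
        if geocode(candidate):
            if stop_on_success:
                break  # shorter addresses are assumed to be present as well
            continue
        missing.append(candidate)
    return missing
```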
Preferably, in the embodiment of the present invention, when the address to be coded is determined as missing electronic map data, a parent-child relationship is established among the missing electronic map data determined from that address. For example, if the address to be coded is C1C2C3C4 and C1C2, C1C2C3 and C1C2C3C4 are all determined as missing electronic map data, then, since a new address with fewer address fragments covers a wider geographic area, the parent-child relationship is established as follows: C1C2 is the parent node, C1C2C3 is a child node of C1C2, and C1C2C3C4 is a child node of C1C2C3.
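One way to represent this parent-child relationship is a mapping from each missing address to its parent. The sketch below assumes each missing address is kept as a tuple of its fragments and takes the parent to be the longest other missing address that is a proper prefix of it; the function name `link_parents` is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Address = Tuple[str, ...]


def link_parents(missing: List[Address]) -> Dict[Address, Optional[Address]]:
    """Map each missing address to its parent, or to None for a top-level node."""
    parents: Dict[Address, Optional[Address]] = {}
    for address in missing:
        candidates = [other for other in missing
                      if len(other) < len(address) and address[:len(other)] == other]
        parents[address] = max(candidates, key=len, default=None)
    return parents
```

For the example above, C1C2 maps to None (it is the parent node), C1C2C3 maps to C1C2, and C1C2C3C4 maps to C1C2C3.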
Preferably, to provide a better basis for perfecting the electronic map database, so that collection personnel can give priority to supplementing the more important missing electronic map data, the method flow shown in Fig. 1 may further include the following steps 106 to 108, as shown in Fig. 2:
Step 106: determining a data missing type of the missing electronic map data.
The data missing types include: administrative district type, road type, residential community type, house number type and building number type. Their priorities, from high to low, are: administrative district type, road type, residential community type, house number type, building number type.
In step 106, the data missing type of the missing electronic map data may be determined as follows: judging the type of the last address fragment of the missing electronic map data, comparing it with the data missing types, and taking the matching data missing type as the data missing type of the missing electronic map data. For example: if the last address fragment of the missing electronic map data is an administrative district, as in "xx city xx district", the data missing type is the administrative district type; if the missing electronic map data is "xx city xx district xx road", the data missing type is the road type; if it is "xx city xx district xx road xx residential community", the data missing type is the residential community type; if it is "xx city xx district xx residential community No. xx", the data missing type is the house number type.
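Step 106 can be sketched as a lookup on the last address fragment. The patent does not state how the type of a fragment is judged; the English suffix rules below are invented for illustration only, and a real system would presumably rely on its own lexicon.

```python
from enum import IntEnum
from typing import Optional, Sequence


class MissingType(IntEnum):
    """Data missing types; a smaller value means a higher priority."""
    ADMINISTRATIVE_DISTRICT = 1
    ROAD = 2
    RESIDENTIAL_COMMUNITY = 3
    HOUSE_NUMBER = 4
    BUILDING_NUMBER = 5


# Hypothetical suffix rules used to judge the type of a fragment.
SUFFIX_RULES = (
    (("city", "district"), MissingType.ADMINISTRATIVE_DISTRICT),
    (("road", "street"), MissingType.ROAD),
    (("community",), MissingType.RESIDENTIAL_COMMUNITY),
    (("number",), MissingType.HOUSE_NUMBER),
    (("building",), MissingType.BUILDING_NUMBER),
)


def missing_type(fragments: Sequence[str]) -> Optional[MissingType]:
    """Classify missing data by its last address fragment; None if no rule applies."""
    if not fragments:
        return None
    last = fragments[-1].lower()
    for suffixes, kind in SUFFIX_RULES:
        if last.endswith(suffixes):
            return kind
    return None
```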
Step 107: determining the importance of the missing electronic map data according to the priority of the data missing type and the missing frequency of the missing electronic map data.
The higher the priority of the data missing type of the missing electronic map data, the greater its importance; the more frequently the electronic map data is missing, the greater its importance.
In step 107, the importance may specifically be determined as follows: the higher the priority of the data missing type of the missing electronic map data, the higher its importance; for the same data missing type, the higher the missing frequency, the higher the importance.
Step 108: sorting the missing electronic map data in the missing database in descending order of importance, and storing the sorted missing electronic map data in a preset collection database.
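Steps 107 and 108 amount to a two-key sort: first by the priority of the data missing type, then by missing frequency within the same type. A minimal sketch, assuming each record is a tuple of (address, type priority, missing count), where priority 1 is the highest and the record layout is illustrative:

```python
from typing import List, Tuple

Record = Tuple[str, int, int]  # (address, type priority with 1 = highest, missing count)


def rank_missing(records: List[Record]) -> List[Record]:
    """Sort by type priority first, then by descending missing frequency."""
    return sorted(records, key=lambda record: (record[1], -record[2]))
```

The sorted list is what would be written to the collection database, so that collection personnel see the most important gaps first.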
With the scheme of the first embodiment of the invention, when a geocoding request carrying an address to be coded is received, word segmentation is performed on the address to be coded to obtain the address fragments forming it; the address fragments are geocoded and, when the geocoding fails, the address to be coded is determined as missing electronic map data and stored in a preset missing database; the address fragments are then recombined according to their position order in the address to be coded to obtain at least one new address containing a part of the address fragments, the new address is geocoded and, if the geocoding fails, the new address is determined as missing electronic map data and stored in the preset missing database. In this way, missing electronic map data is determined by means of geocoding, both for the address to be coded itself and for the new addresses recombined from its address fragments. The missing electronic map data stored in the preset missing database provides a basis for subsequently supplementing the electronic map data, so that the electronic map database can be perfected in a targeted manner and the success rate of subsequent geocoding is further improved.
Embodiment two:
Fig. 3 is a schematic structural diagram of a missing electronic map data determination apparatus according to the second embodiment of the present invention. The apparatus comprises: a receiving module 31, a word segmentation module 32, a first missing electronic map data determination module 33, and a second missing electronic map data determination module 34, wherein:
a receiving module 31, configured to receive a geocoding request carrying an address to be coded;
a word segmentation module 32, configured to perform word segmentation on the address to be coded in the geocoding request to obtain an address fragment forming the address to be coded;
the first missing electronic map data determining module 33 is configured to perform geocoding on the address segment corresponding to the address to be coded, determine the address to be coded as missing electronic map data when the geocoding fails, store the missing electronic map data in a preset missing database, and trigger the second missing electronic map data determining module 34;
and a second missing electronic map data determining module 34, configured to recombine the address segments corresponding to the addresses to be coded according to the position sequence of the address segments in the addresses to be coded, obtain at least one new address including a partial address segment corresponding to the addresses to be coded, geocode the new address, determine the new address as missing electronic map data if the geocoding fails, and store the missing electronic map data in a preset missing database.
Optionally, the determining device further includes: a deletion module 35, wherein:
a deleting module 35, configured to determine an invalid word in the address to be encoded according to a preset invalid word bank before the word segmentation module 32 performs word segmentation on the address to be encoded in the geocoding request, and delete the invalid word in the address to be encoded; and triggering the word segmentation module 32 aiming at the address to be coded after the invalid word is deleted.
Optionally, the determining device further includes: a matching module 36, wherein:
the matching module 36 is configured to match the address to be coded with a preset non-standard address lexicon before the word segmentation module 32 performs word segmentation on the address to be coded in the geocoding request, determine that the address to be coded is a non-standard address if matching is successful, and determine that the address to be coded is a standard address if matching is failed;
the word segmentation module 32 is specifically configured to: when the address to be coded is a non-standard address, performing word segmentation on the address to be coded according to a preset general word segmentation word bank; and when the address to be coded is a standard address, performing word segmentation on the address to be coded according to a preset standard word segmentation word bank.
Specifically, the second missing electronic map data determining module 34 is configured to:
traverse the address fragments of the address to be coded starting from the first address fragment, and, each time an address fragment is traversed, combine the previously traversed address fragments and the currently traversed address fragment into a new address according to their position sequence in the address to be coded; geocode the new address, and, if the geocoding fails, determine the new address as missing electronic map data and store it in a preset missing database, until the last address fragment of the address to be coded has been traversed;
or,
remove address fragments one at a time in reverse order, starting from the last address fragment of the address to be coded, and, each time an address fragment is removed, take the address formed by the remaining address fragments as a new address; geocode the new address, and, if the geocoding fails, determine the new address as missing electronic map data and store it in a preset missing database, until the first address fragment of the address to be coded is reached.
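The two recombination modes can be sketched as below. This is a hedged illustration, not the patented implementation: `geocode` is a hypothetical callable that returns coordinates or `None` on failure, the missing database is modeled as a plain list, and the reverse mode is interpreted as stopping once only the first fragment remains.

```python
from typing import Callable, Optional


def prefix_addresses(fragments: list[str]) -> list[str]:
    """Forward traversal: fragments 1..k joined in original order, for each k."""
    return ["".join(fragments[:k]) for k in range(1, len(fragments) + 1)]


def truncated_addresses(fragments: list[str]) -> list[str]:
    """Reverse mode: drop trailing fragments one at a time (assumed stop: one left)."""
    return ["".join(fragments[:k]) for k in range(len(fragments) - 1, 0, -1)]


def find_missing(
    fragments: list[str],
    geocode: Callable[[str], Optional[tuple]],
    missing_db: list[str],
    forward: bool = True,
) -> list[str]:
    """Geocode each recombined address and record the ones that fail."""
    candidates = prefix_addresses(fragments) if forward else truncated_addresses(fragments)
    for address in candidates:
        if geocode(address) is None:  # geocoding failed
            missing_db.append(address)
    return missing_db


# Hypothetical map data: only the city level is known.
known = {"北京市": (116.4, 39.9)}
fragments = ["北京市", "朝阳区", "望京街"]
print(find_missing(fragments, known.get, [], forward=True))
print(find_missing(fragments, known.get, [], forward=False))
```

Both modes generate prefixes of the original address, so the position sequence of the fragments is always preserved; they differ only in the order in which the prefixes are tried and in whether the full address is included.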
Optionally, the determining device may further include: a data missing type determining module 37, an importance determining module 38, and a sorting module 39, wherein:
a data missing type determining module 37, configured to determine a data missing type of the missing electronic map data;
the importance determining module 38 is configured to determine the importance of the missing electronic map data according to the priority of the data missing type and the missing frequency of the missing electronic map data;
and the sorting module 39 is configured to sort the missing electronic map data in the missing database according to the order of importance from high to low, and store the sorted missing electronic map data in the missing database into a preset collection database.
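A small sketch of how the importance ranking might work. The passage above does not give a formula, so the score used here (type priority multiplied by missing frequency) is purely an illustrative assumption, as are the missing type names, the `TYPE_PRIORITY` weights, and the function name `rank_missing`.

```python
from collections import Counter

# Hypothetical priorities per data missing type (higher means more important).
TYPE_PRIORITY = {"district": 3, "road": 2, "poi": 1}


def rank_missing(records, type_priority=TYPE_PRIORITY):
    """Rank missing data by an assumed score: type priority x missing frequency.

    records: iterable of (address, missing_type), one entry per failed geocoding.
    Returns a list of (address, missing_type, score), highest score first.
    """
    frequency = Counter(records)
    scored = [
        (address, missing_type, type_priority.get(missing_type, 0) * count)
        for (address, missing_type), count in frequency.items()
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored


records = (
    [("A Road", "road")] * 3
    + [("B Plaza", "poi")] * 4
    + [("C District", "district")]
)
for address, missing_type, score in rank_missing(records):
    print(address, missing_type, score)
```

The sorted list corresponds to what would be written to the collection database, so that the most important gaps are collected first.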
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus (device), or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (10)
1. A method for determining missing electronic map data, comprising:
receiving a geocoding request carrying an address to be coded;
performing word segmentation on the address to be coded in the geocoding request to obtain the address fragments forming the address to be coded;
geocoding the address fragments corresponding to the address to be coded, and, when the geocoding fails, determining the address to be coded as missing electronic map data and storing it in a preset missing database;
and recombining the address fragments corresponding to the address to be coded according to the position sequence of the address fragments in the address to be coded to obtain at least one new address containing some of the address fragments corresponding to the address to be coded, geocoding the new address, and, if the geocoding fails, determining the new address as missing electronic map data and storing it in the preset missing database.
2. The method of claim 1, wherein, before the word segmentation is performed on the address to be coded in the geocoding request, the method further comprises:
determining the invalid words in the address to be coded according to a preset invalid word bank, and deleting the invalid words from the address to be coded;
and, for the address to be coded after the invalid words are deleted, performing the step of performing word segmentation on the address to be coded in the geocoding request to obtain the address fragments forming the address to be coded.
3. The method of claim 1, wherein, before the word segmentation is performed on the address to be coded in the geocoding request, the method further comprises:
matching the address to be coded with a preset non-standard address lexicon, determining that the address to be coded is a non-standard address if the matching succeeds, and determining that the address to be coded is a standard address if the matching fails;
wherein performing word segmentation on the address to be coded in the geocoding request specifically comprises:
when the address to be coded is a non-standard address, performing word segmentation on the address to be coded according to a preset general word segmentation word bank;
and when the address to be coded is a standard address, performing word segmentation on the address to be coded according to a preset standard word segmentation word bank.
4. The method according to any one of claims 1 to 3, wherein recombining the address fragments corresponding to the address to be coded according to the position sequence of the address fragments in the address to be coded to obtain at least one new address containing some of the address fragments corresponding to the address to be coded, geocoding the new address, and, if the geocoding fails, determining the new address as missing electronic map data and storing it in a preset missing database, specifically comprises:
traversing the address fragments of the address to be coded starting from the first address fragment, and, each time an address fragment is traversed, combining the previously traversed address fragments and the currently traversed address fragment into a new address according to their position sequence in the address to be coded; geocoding the new address, and, if the geocoding fails, determining the new address as missing electronic map data and storing it in a preset missing database, until the last address fragment of the address to be coded has been traversed;
or,
removing address fragments one at a time in reverse order, starting from the last address fragment of the address to be coded, and, each time an address fragment is removed, taking the address formed by the remaining address fragments as a new address; geocoding the new address, and, if the geocoding fails, determining the new address as missing electronic map data and storing it in a preset missing database, until the first address fragment of the address to be coded is reached.
5. The method of any of claims 1 to 3, further comprising:
determining a data missing type of the missing electronic map data;
determining the importance of the missing electronic map data according to the priority of the data missing type and the missing frequency of the missing electronic map data;
and sorting the missing electronic map data in the missing database in descending order of importance, and storing the sorted missing electronic map data into a preset acquisition database.
6. A missing electronic map data determination apparatus, comprising:
a receiving module, configured to receive a geocoding request carrying an address to be coded;
a word segmentation module, configured to perform word segmentation on the address to be coded in the geocoding request to obtain the address fragments forming the address to be coded;
a first missing electronic map data determining module, configured to geocode the address fragments corresponding to the address to be coded, and, when the geocoding fails, determine the address to be coded as missing electronic map data, store it in a preset missing database, and trigger a second missing electronic map data determining module;
and the second missing electronic map data determining module, configured to recombine the address fragments corresponding to the address to be coded according to the position sequence of the address fragments in the address to be coded to obtain at least one new address containing some of the address fragments corresponding to the address to be coded, geocode the new address, and, if the geocoding fails, determine the new address as missing electronic map data and store it in the preset missing database.
7. The determination device of claim 6, wherein the determination device further comprises a deletion module, wherein:
the deletion module is configured to, before the word segmentation module performs word segmentation on the address to be coded in the geocoding request, determine the invalid words in the address to be coded according to a preset invalid word bank and delete them from the address to be coded; and to trigger the word segmentation module for the address to be coded after the invalid words are deleted.
8. The determination device of claim 6, wherein the determination device further comprises a matching module, wherein:
the matching module is configured to match the address to be coded with a preset non-standard address lexicon before the word segmentation module performs word segmentation on the address to be coded in the geocoding request, determine that the address to be coded is a non-standard address if the matching succeeds, and determine that the address to be coded is a standard address if the matching fails;
the word segmentation module is specifically configured to: when the address to be coded is a non-standard address, perform word segmentation on the address to be coded according to a preset general word segmentation lexicon; and when the address to be coded is a standard address, perform word segmentation on the address to be coded according to a preset standard word segmentation lexicon.
9. The determination device according to any one of claims 6 to 8, wherein
the second missing electronic map data determining module is specifically configured to:
traverse the address fragments of the address to be coded starting from the first address fragment, and, each time an address fragment is traversed, combine the previously traversed address fragments and the currently traversed address fragment into a new address according to their position sequence in the address to be coded; geocode the new address, and, if the geocoding fails, determine the new address as missing electronic map data and store it in a preset missing database, until the last address fragment of the address to be coded has been traversed;
or,
remove address fragments one at a time in reverse order, starting from the last address fragment of the address to be coded, and, each time an address fragment is removed, take the address formed by the remaining address fragments as a new address; geocode the new address, and, if the geocoding fails, determine the new address as missing electronic map data and store it in a preset missing database, until the first address fragment of the address to be coded is reached.
10. The determination device according to any one of claims 6 to 8, further comprising:
the data missing type determining module is used for determining the data missing type of the missing electronic map data;
the importance determining module is used for determining the importance of the missing electronic map data according to the priority of the data missing type and the missing frequency of the missing electronic map data;
and the sorting module is used for sorting the missing electronic map data in the missing database from high to low according to the importance degree and storing the sorted missing electronic map data in the missing database into a preset acquisition database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410830042.8A CN105786922B (en) | 2014-12-25 | 2014-12-25 | Method and device for determining missing electronic map data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105786922A | 2016-07-20 |
CN105786922B | 2020-02-14 |
Family ID: 56388913
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201410830042.8A (CN105786922B) | Active | 2014-12-25 | 2014-12-25 | Method and device for determining missing electronic map data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105786922B (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110087695A1 (en) * | 2009-10-09 | 2011-04-14 | Verizon Patent And Licensing Inc. | Apparatuses, methods and systems for a truncated postal code smart address parser |
CN101882163A (en) * | 2010-06-30 | 2010-11-10 | 中国科学院地理科学与资源研究所 | Fuzzy Chinese address geographic evaluation method based on matching rule |
CN102446186A (en) * | 2010-10-13 | 2012-05-09 | 上海众恒信息产业股份有限公司 | Chinese geographic coding and decoding method and device adopting same |
US20130046604A1 (en) * | 2011-08-17 | 2013-02-21 | Bank Of America Corporation | Virtual loyalty card program |
CN102567492A (en) * | 2011-12-22 | 2012-07-11 | 哈尔滨工程大学 | Method for sea-land vector map data integration and fusion |
CN104166651A (en) * | 2013-05-16 | 2014-11-26 | 阿里巴巴集团控股有限公司 | Data searching method and device based on integration of data objects in same classes |
CN104216895A (en) * | 2013-05-31 | 2014-12-17 | 高德软件有限公司 | Method and device for generating POI data |
CN103699623A (en) * | 2013-12-19 | 2014-04-02 | 百度在线网络技术(北京)有限公司 | Geo-coding realizing method and device |
CN103914544A (en) * | 2014-04-03 | 2014-07-09 | 浙江大学 | Method for quickly matching Chinese addresses in multi-level manner on basis of address feature words |
Non-Patent Citations (1)
Title |
---|
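谭侃侃 (Tan Kankan): "Rule-based Chinese address word segmentation and matching method" (基于规则的中文地址分词与匹配方法), China Master's Theses Full-text Database, Basic Sciences * |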
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110688851A (en) * | 2019-09-26 | 2020-01-14 | 税友软件集团股份有限公司 | Method, device and medium for extracting key information of address text |
CN110688851B (en) * | 2019-09-26 | 2023-07-28 | 亿企赢网络科技有限公司 | Method, device and medium for extracting key information of address text |
Also Published As
Publication number | Publication date |
---|---|
CN105786922B (en) | 2020-02-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | C06 | Publication | |
 | PB01 | Publication | |
 | C10 | Entry into substantive examination | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | TR01 | Transfer of patent right | Effective date of registration: 20200417. Address after: 310012 Room 508, Floor 5, Building 4, No. 699, Wangshang Road, Changhe Street, Binjiang District, Hangzhou City, Zhejiang Province. Patentee after: Alibaba (China) Co.,Ltd. Address before: 102200, No. 8, No., Changsheng Road, Changping District science and Technology Park, Beijing, China. 1-5. Patentee before: AUTONAVI SOFTWARE Co.,Ltd. |