CN107656913B - Map interest point address extraction method, map interest point address extraction device, server and storage medium - Google Patents

Info

Publication number
CN107656913B
Authority
CN
China
Prior art keywords
address
waybill
interest point
map
candidate set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710922733.4A
Other languages
Chinese (zh)
Other versions
CN107656913A (en)
Inventor
宋宽
王海南
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201710922733.4A
Publication of CN107656913A
Application granted
Publication of CN107656913B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities

Abstract

The embodiment of the invention discloses a map interest point address extraction method, device, server and storage medium. The method comprises the following steps: selecting, from all waybill addresses of express waybills, the waybill addresses containing a map interest point name to obtain an address candidate set, wherein the address candidate set records the correspondence between each waybill address and its map interest point; extracting, by using a natural language processing technology, an address fragment meeting a preset address standard from each waybill address in the address candidate set; and combining each extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point. The embodiment of the invention avoids the high cost and poor efficiency of manually verifying and completing map interest point addresses, verifies and completes those addresses quickly and accurately, and can improve both the accuracy of map information and the user experience.

Description

Map interest point address extraction method, map interest point address extraction device, server and storage medium
Technical Field
The embodiment of the invention relates to an electronic map information mining technology, in particular to a map interest point address extraction method, a map interest point address extraction device, a server and a storage medium.
Background
An electronic map contains a large number of location points, such as the restaurants, hotels, scenic spots and toll booths marked on the map. These location points are the points of interest that a user may query or want to reach. The address of a point of interest is among the data users care about most, so the electronic map needs to show a detailed, structured address that helps the user map the position of the point of interest to the real world from the address description. However, because the number of points of interest in an electronic map is huge, the addresses of some points of interest are incomplete or incorrect, which can mislead users of the electronic map and degrades the user experience.
In the prior art, incomplete point-of-interest addresses in an electronic map can only be checked and supplemented manually. That approach is costly and inefficient: completing the addresses of the points of interest on an electronic map takes a large amount of manpower and time.
Disclosure of Invention
The embodiment of the invention provides a map interest point address extraction method, device, server and storage medium for completing and correcting map interest point addresses efficiently, so that the addresses become more complete and accurate, the experience of map users improves, and the cost of completing interest point addresses is reduced.
In a first aspect, an embodiment of the present invention provides a method for extracting a map interest point address, where the method includes:
selecting, from all waybill addresses of express waybills, the waybill addresses containing a map interest point name to obtain an address candidate set, wherein the address candidate set records the correspondence between each waybill address and its map interest point;
extracting, by using a natural language processing technology, an address fragment meeting a preset address standard from each waybill address in the address candidate set;
and combining each extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point.
In a second aspect, an embodiment of the present invention further provides an apparatus for extracting a map point of interest address, where the apparatus includes:
the address candidate set establishing module is used for selecting waybill addresses containing map interest point names from all waybill addresses of the express waybill to obtain an address candidate set, wherein the address candidate set records the corresponding relation between the waybill addresses and the map interest points;
the address fragment extraction module is used for extracting an address fragment meeting a preset address standard from each waybill address in the address candidate set by utilizing a natural language processing technology;
and the complete address combination module is used for combining the extracted address fragment with the city and administrative division corresponding to the map interest point corresponding to the waybill address to obtain the complete address of the map interest point.
In a third aspect, an embodiment of the present invention further provides a server, where the server includes:
one or more processors;
storage means for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for extracting the map interest point address according to any one of the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the method for extracting an address of a point of interest of a map according to any one of the embodiments of the present invention.
According to the embodiment of the invention, the address fragment extracted from an express waybill address that contains the map interest point name is combined with the city and administrative division of the map interest point corresponding to that waybill address to obtain the complete address of the map interest point. This avoids the high cost and poor efficiency of manually verifying and completing map interest point addresses, verifies and completes those addresses quickly and accurately, and improves both the accuracy of map information and the user experience.
Drawings
Fig. 1 is a flowchart of a method for extracting a map point of interest address according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a map interest point address extraction method according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a map interest point address extracting apparatus according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a server in the fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a method for extracting a map point of interest address according to an embodiment of the present invention, where the embodiment is applicable to a situation of completing electronic map information. As shown in fig. 1, the method specifically includes:
S110, selecting, from all waybill addresses of express waybills, the waybill addresses containing a map interest point name to obtain an address candidate set, wherein the address candidate set records the correspondence between each waybill address and its map interest point.
Waybill addresses are the addresses written on express waybills. Express waybills contain rich address information, and the waybill addresses are the information source from which addresses are extracted in the embodiment of the invention. An interest point is a marked place on the map, such as a restaurant, a hotel, a market, a scenic spot or a toll station, that a user of the electronic map may search for or want to reach. The interest point data is stored in a database of the server or in a cloud database and includes information such as the name and coordinate position of each interest point. The electronic map should show the user as detailed and structured an address as possible, to help the user find the destination in the real world from the address description.
Specifically, when the address of an interest point needs to be completed, the name of the interest point is compared with the waybill addresses. If a waybill address contains the name of the interest point, that waybill address is added to the address candidate set as a candidate address, and the correspondence between the waybill address and the map interest point is recorded. The correspondence indicates which interest point an address corresponds to; in the embodiment of the present invention, each address in the address candidate set corresponds to one interest point.
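The name-containment selection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the made-up interest point names and addresses, and the rule of keeping the longest matching name when an address contains several names are all assumptions introduced here.

```python
from typing import List, Optional, Tuple


def match_poi(address: str, poi_names: List[str]) -> Optional[str]:
    """Return the interest point name contained in the address, if any.

    Assumption: when several names match, the longest one is kept so that
    each candidate address corresponds to exactly one interest point.
    """
    matches = [name for name in poi_names if name in address]
    return max(matches, key=len) if matches else None


def build_candidate_set(
    waybill_addresses: List[str], poi_names: List[str]
) -> List[Tuple[str, str]]:
    """Collect (waybill address, interest point name) pairs."""
    candidate_set = []
    for address in waybill_addresses:
        name = match_poi(address, poi_names)
        if name is not None:
            candidate_set.append((address, name))
    return candidate_set


# Hypothetical data, for illustration only.
poi_names = ["晨光酒店", "蓝湖商场"]
waybill_addresses = [
    "北京市海淀区学院路5号晨光酒店前台",
    "朝阳区建国路8号蓝湖商场3层",
    "北京市海淀区学院路7号",
]
candidate_set = build_candidate_set(waybill_addresses, poi_names)
```

The third address contains no interest point name, so it never enters the candidate set. A production system matching millions of waybills against a large interest point database would likely need an index or multi-pattern matcher rather than this nested scan.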
Preferably, selecting the waybill addresses containing a map interest point name from all waybill addresses of express waybills further includes, before the address candidate set is obtained:
respectively acquiring the coordinates of a waybill address containing a map interest point name and of the corresponding interest point in the electronic map;
judging, according to the coordinates, whether the distance between the waybill address containing the map interest point name and the corresponding interest point in the electronic map exceeds a preset threshold;
and taking the waybill addresses whose distance does not exceed the preset threshold as the waybill addresses in the address candidate set.
Specifically, the distance between each waybill address in the candidate address set and its interest point must be within a preset range. The preset range can be configured according to actual conditions, and a suitable range improves the accuracy of the candidate waybill addresses. For example, if the distance between a selected candidate waybill address and the coordinate position of the interest point exceeds 50 meters, the address information extracted from that waybill address cannot represent the address of the interest point, and its deviation may mislead a user trying to locate the interest point.
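A minimal sketch of the distance check follows. The source does not say how the waybill coordinates are obtained or which distance formula is used; this sketch assumes both positions are available as (latitude, longitude) pairs in degrees and uses the haversine great-circle distance. The function names, the sample coordinates, and the 50-meter default (taken from the example above) are illustrative.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def within_threshold(
    waybill_coord: Tuple[float, float],
    poi_coord: Tuple[float, float],
    threshold_m: float = 50.0,
) -> bool:
    """True if the waybill address is close enough to stay in the candidate set."""
    return haversine_m(waybill_coord, poi_coord) <= threshold_m


# Hypothetical coordinates, for illustration only.
poi_coord = (39.9847, 116.3075)
near_waybill = (39.9849, 116.3076)  # a few tens of meters away
far_waybill = (39.9900, 116.3075)   # several hundred meters away

keep_near = within_threshold(near_waybill, poi_coord)
keep_far = within_threshold(far_waybill, poi_coord)
```

Only waybill addresses for which `within_threshold` returns `True` would be kept; the threshold is a parameter because the text leaves the preset range configurable.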
S120, extracting an address fragment meeting a preset address standard from each waybill address in the address candidate set by using a natural language processing technology.
In form, a waybill address text is a character string composed of Chinese characters (including punctuation marks and the like). Specifically, the address fragment meeting the preset address standard is extracted from the address character string by natural language processing methods such as word segmentation and semantic analysis. The preset address standard in the embodiment of the invention means that the address description is detailed down to the road, street and house number. Of course, the address standard may also be set according to actual needs, and this application does not limit this.
S130, combining each extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point.
Specifically, correcting and supplementing an interest point address mainly means supplementing the detailed address description at the road level and below. Because the city or administrative division may be described incorrectly or irregularly in the waybill address, the city and administrative division information stored in the database for the interest point is combined directly with the extracted address fragment to obtain the complete, detailed description of the interest point address.
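The combination step can be sketched as below. The record type, its field names, and the plain concatenation order (city, then administrative division, then fragment) are assumptions for illustration; the source only states that the stored city and administrative division are combined with the extracted fragment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoiRecord:
    """Hypothetical interest point record as it might be stored in the database."""
    name: str
    city: str
    district: str  # administrative division


def compose_full_address(poi: PoiRecord, fragment: str) -> str:
    """Prefix the road-level fragment with the city and division stored for the POI.

    Any city or division text in the waybill itself is deliberately not used,
    since it may be wrong or irregular.
    """
    return f"{poi.city}{poi.district}{fragment}"


# Hypothetical data, for illustration only.
poi = PoiRecord(name="晨光酒店", city="北京市", district="海淀区")
full_address = compose_full_address(poi, "学院路5号")
```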
Preferably, when several of the extracted address fragments correspond to the same target interest point, combining the extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point, further includes:
taking the address fragment that occurs most often among those fragments as the target address fragment of the target interest point;
and combining the target address fragment with the city and administrative division corresponding to the target interest point to obtain the complete address of the target interest point.
Specifically, because the number of waybill addresses is large and users describe addresses in different ways, several waybill addresses, some with identical address details, may correspond to the same target interest point. For example, after natural language processing (i.e., word segmentation and word attribute labeling) of the waybill addresses in the address candidate set, if several waybill addresses all include address details such as street and house number and all correspond to the same target interest point, several address fragments can be extracted from them. From these fragments, the one that occurs most often is selected as the target address fragment of the target interest point, and this selection is taken to be the most accurate address description fragment. The target address fragment is then combined with the city and administrative division corresponding to the target interest point to obtain the complete address of the target interest point, which is the result of completing the target interest point address.
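Choosing the most frequent fragment amounts to a majority vote, sketched here with `collections.Counter`. The function name and sample fragments are hypothetical, and the source does not say how ties are resolved; in this sketch a tie goes to the fragment seen first, which is how `Counter.most_common` orders equal counts.

```python
from collections import Counter
from typing import List, Optional


def pick_target_fragment(fragments: List[str]) -> Optional[str]:
    """Return the fragment that occurs most often, or None if there are none."""
    if not fragments:
        return None
    fragment, _count = Counter(fragments).most_common(1)[0]
    return fragment


# Hypothetical fragments extracted for one target interest point.
fragments = ["学院路5号", "学院路5号", "学院路", "学院路5号", "学院路7号"]
target_fragment = pick_target_fragment(fragments)
```

Exact string counting treats "学院路5号" and "学院路 5 号" as different fragments, so some normalization of whitespace and digit forms before voting would probably be needed in practice.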
In the technical scheme of this embodiment, qualifying waybill addresses are selected, qualifying address fragments are extracted from them by using a natural language processing technology, and the fragments are combined with the city and administrative division information of the interest points to obtain complete, detailed address information. When several extracted address fragments correspond to the same target interest point, the fragment that occurs most often is selected as the final address fragment. This avoids the high cost and poor efficiency of manually verifying and completing interest point addresses, and improves both the accuracy of map information and the user experience.
Example two
Fig. 2 is a flowchart of a map interest point address extraction method according to a second embodiment of the present invention, which further optimizes the method on the basis of the first embodiment. As shown in fig. 2, the method includes:
S210, selecting, from all waybill addresses of express waybills, the waybill addresses containing a map interest point name to obtain an address candidate set, wherein the address candidate set records the correspondence between each waybill address and its map interest point.
S220, performing word segmentation and word attribute labeling on each waybill address in the address candidate set to obtain the segmented words of each address and their address attributes.
Specifically, an address is usually a combination of country, province, city, county, town, village, road, street and house number segments. Using the characters that mark the country, province, city, county, town, village, road, street and house number as keywords, each waybill address is segmented into words, and the address attribute of each segmented word is labeled through semantic analysis, that is, whether the segmented address fragment is the country, province, city, county, town, village, road, street or house number segment.
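As a rough stand-in for the segmentation and labeling step, the sketch below splits an address at suffix characters and labels each piece by its final character. It is a keyword heuristic only, not the semantic analysis the source refers to: the suffix table (including the district suffix 区, which the list above does not name), the attribute names, and the sample address are assumptions. A pure suffix rule also mis-splits names that contain a suffix character internally, which is one reason a real system would need a trained segmenter.

```python
import re
from typing import List, Tuple

# Hypothetical mapping from a trailing keyword character to an address attribute.
SUFFIX_TO_ATTRIBUTE = {
    "省": "province",
    "市": "city",
    "区": "district",
    "县": "county",
    "镇": "town",
    "村": "village",
    "路": "road",
    "街": "street",
    "号": "house_number",
}

# Shortest run of characters that ends in one of the suffix characters.
TOKEN_PATTERN = re.compile("(.+?[" + "".join(SUFFIX_TO_ATTRIBUTE) + "])")


def segment_and_label(address: str) -> List[Tuple[str, str]]:
    """Split an address into (segmented word, address attribute) pairs."""
    tokens = []
    end = 0
    for match in TOKEN_PATTERN.finditer(address):
        word = match.group(1)
        tokens.append((word, SUFFIX_TO_ATTRIBUTE[word[-1]]))
        end = match.end()
    if end < len(address):
        # Trailing text with no address keyword, e.g. the interest point name.
        tokens.append((address[end:], "other"))
    return tokens


# Hypothetical address, for illustration only.
labeled = segment_and_label("北京市海淀区学院路5号晨光酒店")
```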
S230, selecting from the address candidate set, according to the segmented words of each address and their address attributes, the waybill addresses whose segmented-word address attributes meet the preset address standard, to obtain a target address candidate set.
Specifically, after the waybill addresses in the address candidate set are segmented, the most detailed address descriptions are selected according to the segmentation results and address attributes and stored in the target address candidate set. For example, suppose the address candidate set holds 100 waybill addresses containing the name of the interest point. After word segmentation and semantic analysis, it can be determined that some waybill addresses are detailed only to the county level, some to the street level, and some down to a specific house number. The waybill addresses that meet the preset address standard and are described in the most detail are then stored as target addresses in the target address candidate set.
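One way to read this screening step is sketched below: rank each address by the finest attribute it contains, drop addresses that do not reach the preset standard, and keep those at the finest level found. The numeric ranks, the choice of road level as the standard, and the sample data are assumptions; the input is taken to be already labeled, so the sketch does not depend on any particular segmenter.

```python
from typing import Dict, List

# Hypothetical ranking of address attributes from coarse to fine.
DETAIL_RANK = {
    "country": 1, "province": 2, "city": 3, "county": 4, "district": 4,
    "town": 5, "village": 6, "road": 7, "street": 7, "house_number": 8,
}


def detail_level(attributes: List[str]) -> int:
    """Rank of the finest attribute in an address (0 if none is recognized)."""
    return max((DETAIL_RANK.get(attr, 0) for attr in attributes), default=0)


def select_target_candidates(
    labeled: Dict[str, List[str]], standard: str = "road"
) -> List[str]:
    """Keep addresses that meet the standard and share the finest level found."""
    threshold = DETAIL_RANK[standard]
    levels = {address: detail_level(attrs) for address, attrs in labeled.items()}
    qualifying = {a: lvl for a, lvl in levels.items() if lvl >= threshold}
    if not qualifying:
        return []
    finest = max(qualifying.values())
    return [a for a, lvl in qualifying.items() if lvl == finest]


# Hypothetical labeled candidates for one interest point.
labeled_candidates = {
    "北京市海淀区晨光酒店": ["city", "district", "other"],
    "北京市海淀区学院路晨光酒店": ["city", "district", "road", "other"],
    "北京市海淀区学院路5号晨光酒店": ["city", "district", "road", "house_number", "other"],
    "海淀区学院路5号晨光酒店前台": ["district", "road", "house_number", "other"],
}
targets = select_target_candidates(labeled_candidates)
```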
S240, extracting, for each waybill address in the target address candidate set, an address fragment meeting the preset address standard according to the segmented words and their address attributes.
Specifically, this operation extracts address fragments from the screened waybill addresses, that is, the fragments covering the road and the more detailed address parts that follow it; the corresponding fragments are extracted according to the segmented words and their address attributes.
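Given labeled words, extracting the road-level fragment can be sketched as taking the first road or street word and the detail words that directly follow it. The attribute names and the stopping rule (stop at the first word that is not road, street or house number) are assumptions made for this illustration.

```python
from typing import List, Optional, Tuple

ROAD_LEVEL = {"road", "street"}
FRAGMENT_LEVEL = ROAD_LEVEL | {"house_number"}


def extract_fragment(tokens: List[Tuple[str, str]]) -> Optional[str]:
    """Join the words from the first road/street word through the details after it."""
    start = next(
        (i for i, (_, attr) in enumerate(tokens) if attr in ROAD_LEVEL), None
    )
    if start is None:
        return None  # the address never reaches road level
    parts = []
    for word, attr in tokens[start:]:
        if attr not in FRAGMENT_LEVEL:
            break  # e.g. the interest point name or a recipient note
        parts.append(word)
    return "".join(parts)


# Hypothetical labeled address, for illustration only.
tokens = [
    ("北京市", "city"),
    ("海淀区", "district"),
    ("学院路", "road"),
    ("5号", "house_number"),
    ("晨光酒店", "other"),
]
fragment = extract_fragment(tokens)
```

The city and district words are dropped here on purpose, because the later combination step takes them from the interest point record instead.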
S250, combining each extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point.
In the technical scheme of this embodiment, word segmentation and word attribute labeling are performed on the waybill addresses, qualifying address fragments are extracted from the qualifying waybill addresses, and the fragments are combined with the city and administrative division information of the interest points to obtain complete, detailed address information. This avoids the high cost and poor efficiency of manually verifying and completing interest point addresses, and improves both the accuracy of map information and the user experience.
Example three
Fig. 3 is a schematic structural diagram of a map interest point address extracting device in a third embodiment of the present invention. As shown in fig. 3, the apparatus for extracting a map interest point address includes:
an address candidate set establishing module 310, configured to select, from all waybill addresses of express waybills, the waybill addresses containing a map interest point name to obtain an address candidate set, where the correspondence between each waybill address and its map interest point is recorded in the address candidate set;
an address fragment extracting module 320, configured to extract, by using a natural language processing technology, an address fragment meeting a preset address standard from each waybill address in the address candidate set;
and the complete address combination module 330 is configured to combine the extracted address fragment with the city and administrative division corresponding to the map interest point corresponding to the waybill address to obtain a complete address of the map interest point.
Further, the address candidate set creating module 310 includes:
the address information acquisition unit is used for respectively acquiring, before the address candidate set is obtained, the coordinates of the waybill address containing the map interest point name and of the corresponding interest point in the electronic map;
the distance judgment unit is used for judging, according to the coordinates, whether the distance between the waybill address containing the map interest point name and the corresponding interest point in the electronic map exceeds a preset threshold;
and the candidate address selection unit is used for taking the waybill addresses whose distance does not exceed the preset threshold as the waybill addresses in the address candidate set.
Further, the address fragment extracting module 320 is specifically configured to:
performing word segmentation and word attribute labeling on each waybill address in the address candidate set to obtain the segmented words of each address and their address attributes;
selecting from the address candidate set, according to the segmented words of each address and their address attributes, the waybill addresses whose segmented-word address attributes meet the preset address standard, to obtain a target address candidate set;
and extracting, for each waybill address in the target address candidate set, an address fragment meeting the preset address standard according to the segmented words and their address attributes.
Further, when several of the extracted address fragments correspond to the same target interest point, the complete address combination module 330 is further configured to:
taking the address fragment that occurs most often among those fragments as the target address fragment of the target interest point;
and combining the target address fragment with the city and administrative divisions corresponding to the target interest points to obtain the complete address of the target interest points.
The map interest point address extraction device provided by the embodiment of the invention can execute the map interest point address extraction method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 4 is a schematic structural diagram of a server in the fourth embodiment of the present invention. FIG. 4 illustrates a block diagram of an exemplary server 412 suitable for use in implementing embodiments of the present invention. The server 412 shown in fig. 4 is only an example and should not bring any limitations to the function and scope of use of the embodiments of the present invention.
As shown in FIG. 4, the server 412 is in the form of a general purpose computing device. Components of server 412 may include, but are not limited to: one or more processors or processing units 416, a system memory 428, and a bus 418 that couples the various system components including the system memory 428 and the processing unit 416.
Bus 418 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Server 412 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by server 412 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 428 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)430 and/or cache memory 432. The server 412 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 434 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 418 by one or more data media interfaces. Memory 428 can include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 440 having a set (at least one) of program modules 442 may be stored, for instance, in memory 428. Such program modules 442 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these, or some combination thereof, may include an implementation of a network environment. The program modules 442 generally perform the functions and/or methodologies of the described embodiments of the invention.
The server 412 may also communicate with one or more external devices 414 (e.g., keyboard, pointing device, display 424, etc.), with one or more devices that enable a user to interact with the server 412, and/or with any devices (e.g., network card, modem, etc.) that enable the server 412 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 422. Also, server 412 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet) through network adapter 420. As shown, network adapter 420 communicates with the other modules of server 412 over bus 418. It should be appreciated that although not shown in FIG. 4, other hardware and/or software modules may be used in conjunction with the server 412, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 416 executes various functional applications and data processing by running programs stored in the system memory 428, for example, implementing a map interest point address extracting method provided by the embodiment of the present invention, the method includes:
selecting, from all waybill addresses of express waybills, the waybill addresses containing a map interest point name to obtain an address candidate set, wherein the address candidate set records the correspondence between each waybill address and its map interest point;
extracting, by using a natural language processing technology, an address fragment meeting a preset address standard from each waybill address in the address candidate set;
and combining each extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point.
Example five
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a method for extracting a map point of interest address, where the method includes:
selecting, from all waybill addresses of express waybills, the waybill addresses containing a map interest point name to obtain an address candidate set, wherein the address candidate set records the correspondence between each waybill address and its map interest point;
extracting, by using a natural language processing technology, an address fragment meeting a preset address standard from each waybill address in the address candidate set;
and combining each extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for extracting a map interest point address, characterized by comprising the following steps:
selecting waybill addresses that contain a map interest point name from all waybill addresses of express waybills to obtain an address candidate set, wherein the address candidate set records the correspondence between each waybill address and its map interest point;
extracting an address fragment meeting a preset address standard from each waybill address in the address candidate set by using natural language processing techniques;
and combining each extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point.
2. The method for extracting a map interest point address according to claim 1, wherein selecting the waybill addresses that contain a map interest point name from all waybill addresses of the express waybills further comprises, before the address candidate set is obtained:
acquiring, respectively, the coordinates in an electronic map of a waybill address containing a map interest point name and of the corresponding interest point;
judging, according to the coordinates, whether the distance between the waybill address containing the map interest point name and the corresponding interest point in the electronic map exceeds a preset threshold;
and taking the waybill addresses whose distance does not exceed the preset threshold as the waybill addresses in the address candidate set.
3. The method for extracting a map interest point address according to claim 1, wherein extracting an address fragment meeting a preset address standard from each waybill address in the address candidate set by using natural language processing techniques comprises:
performing word segmentation and word attribute labeling on each waybill address in the address candidate set to obtain the word segments of each address and their address attributes;
selecting from the address candidate set, according to the word segments and address attributes of each address, the waybill addresses whose word segments have address attributes meeting the preset address standard, to obtain a target address candidate set;
and extracting, for each waybill address in the target address candidate set, an address fragment that meets the preset address standard according to its word segments and their address attributes.
4. The method for extracting a map interest point address according to claim 1, wherein, when a plurality of the extracted address fragments correspond to the same target interest point, combining each extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point, further comprises:
taking the address fragment that occurs most often among those address fragments as the target address fragment of the target interest point;
and combining the target address fragment with the city and administrative division corresponding to the target interest point to obtain the complete address of the target interest point.
5. An apparatus for extracting a map interest point address, comprising:
an address candidate set establishing module, configured to select waybill addresses that contain a map interest point name from all waybill addresses of express waybills to obtain an address candidate set, wherein the address candidate set records the correspondence between each waybill address and its map interest point;
an address fragment extraction module, configured to extract an address fragment meeting a preset address standard from each waybill address in the address candidate set by using natural language processing techniques;
and a complete address combination module, configured to combine each extracted address fragment with the city and administrative division of the map interest point corresponding to the waybill address to which the address fragment belongs, to obtain the complete address of the map interest point.
6. The apparatus for extracting a map interest point address according to claim 5, wherein the address candidate set establishing module comprises:
an address information acquisition unit, configured to acquire, respectively, before the address candidate set is obtained, the coordinates in an electronic map of a waybill address containing a map interest point name and of the corresponding interest point;
a distance judgment unit, configured to judge, according to the coordinates, whether the distance between the waybill address containing the map interest point name and the corresponding interest point in the electronic map exceeds a preset threshold;
and a candidate address selection unit, configured to take the waybill addresses whose distance does not exceed the preset threshold as the waybill addresses in the address candidate set.
7. The apparatus for extracting a map interest point address according to claim 5, wherein the address fragment extraction module is specifically configured to:
perform word segmentation and word attribute labeling on each waybill address in the address candidate set to obtain the word segments of each address and their address attributes;
select from the address candidate set, according to the word segments and address attributes of each address, the waybill addresses whose word segments have address attributes meeting the preset address standard, to obtain a target address candidate set;
and extract, for each waybill address in the target address candidate set, an address fragment that meets the preset address standard according to its word segments and their address attributes.
8. The apparatus for extracting a map interest point address according to claim 5, wherein, when a plurality of the extracted address fragments correspond to the same target interest point, the complete address combination module is further configured to:
take the address fragment that occurs most often among those address fragments as the target address fragment of the target interest point;
and combine the target address fragment with the city and administrative division corresponding to the target interest point to obtain the complete address of the target interest point.
9. A server, characterized in that the server comprises:
one or more processors;
a storage device for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for extracting a map interest point address according to any one of claims 1-4.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method for extracting a map interest point address according to any one of claims 1-4.
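
For illustration, the distance check of claim 2 and the most-frequent-fragment selection of claim 4 can be sketched in Python as follows. This is a sketch under assumptions rather than the claimed implementation: the haversine formula is one possible way to compute the distance from coordinates, and the function names, the 500 m default threshold and the sample fragments are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(a: Coordinate, b: Coordinate) -> float:
    """Great-circle distance in metres between two (lat, lon) points."""
    radius_m = 6_371_000.0  # mean Earth radius
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(h))


def within_threshold(waybill_xy: Coordinate, poi_xy: Coordinate,
                     threshold_m: float = 500.0) -> bool:
    """Claim 2 check: keep a waybill address only if its distance to the
    interest point does not exceed the preset threshold (assumed 500 m)."""
    return haversine_m(waybill_xy, poi_xy) <= threshold_m


def most_common_fragment(fragments: Iterable[str]) -> str:
    """Claim 4 selection: the fragment that occurs most often among the
    fragments extracted for one interest point. Assumes a non-empty input;
    ties go to the fragment that was seen first."""
    return Counter(fragments).most_common(1)[0][0]
```

Filtering by distance before extraction discards waybill addresses that merely mention an interest point name but lie elsewhere, and the frequency count then lets the remaining waybills vote on a single fragment per interest point.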
CN201710922733.4A 2017-09-30 2017-09-30 Map interest point address extraction method, map interest point address extraction device, server and storage medium Active CN107656913B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710922733.4A CN107656913B (en) 2017-09-30 2017-09-30 Map interest point address extraction method, map interest point address extraction device, server and storage medium

Publications (2)

Publication Number Publication Date
CN107656913A CN107656913A (en) 2018-02-02
CN107656913B true CN107656913B (en) 2021-03-23

Family

ID=61117611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710922733.4A Active CN107656913B (en) 2017-09-30 2017-09-30 Map interest point address extraction method, map interest point address extraction device, server and storage medium

Country Status (1)

Country Link
CN (1) CN107656913B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110556049B (en) * 2018-06-04 2021-11-12 百度在线网络技术(北京)有限公司 Map data processing method, device, server and storage medium
CN110716992B (en) * 2018-06-27 2022-05-27 百度在线网络技术(北京)有限公司 Method and device for recommending name of point of interest
CN110874442A (en) * 2018-08-31 2020-03-10 阿里巴巴集团控股有限公司 Method, apparatus, device and medium for processing information
CN110968654B (en) * 2018-09-29 2023-10-20 阿里巴巴集团控股有限公司 Address category determining method, equipment and system for text data
CN109389119B (en) * 2018-10-23 2021-10-26 百度在线网络技术(北京)有限公司 Method, device, equipment and medium for determining interest point region
CN111325022B (en) * 2018-11-28 2023-11-03 北京京东振世信息技术有限公司 Method and device for identifying hierarchical address
CN111460054B (en) * 2019-01-21 2023-06-30 阿里巴巴集团控股有限公司 Address data processing method and device, equipment and storage medium
CN111460057B (en) * 2019-01-22 2023-06-27 阿里巴巴集团控股有限公司 POI (Point of interest) coordinate determining method, device and equipment
CN111723165A (en) * 2019-03-18 2020-09-29 阿里巴巴集团控股有限公司 Address interest point determining method, device and system
CN110175216B (en) * 2019-05-15 2021-05-11 腾讯科技(深圳)有限公司 Coordinate error correction method and device and computer equipment
CN111984747A (en) * 2019-05-21 2020-11-24 丰图科技(深圳)有限公司 Method, device and equipment for acquiring geographic information data
CN110457706B (en) * 2019-08-15 2023-08-22 腾讯科技(深圳)有限公司 Point-of-interest name selection model training method, using method, device and storage medium
CN113706065A (en) * 2020-05-22 2021-11-26 百度在线网络技术(北京)有限公司 Goods classification method, device, equipment and storage medium
CN111782741A (en) * 2020-06-04 2020-10-16 汉海信息技术(上海)有限公司 Interest point mining method and device, electronic equipment and storage medium
CN111723172A (en) * 2020-06-10 2020-09-29 广东世纪高通科技有限公司 Data fusion method and device
CN112488194A (en) * 2020-11-30 2021-03-12 上海寻梦信息技术有限公司 Address abbreviation generation method, model training method and related equipment
CN112966192B (en) * 2021-02-09 2023-10-27 北京百度网讯科技有限公司 Regional address naming method, apparatus, electronic device and readable storage medium
CN113190640B (en) * 2021-05-20 2023-02-07 拉扎斯网络科技(上海)有限公司 Method and device for processing point of interest data
CN113935293B (en) * 2021-12-16 2022-03-22 湖南四方天箭信息科技有限公司 Address splitting and complementing method and device, computer equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218375A (en) * 2012-01-20 2013-07-24 北京四维图新科技股份有限公司 POI (Point of Interest) information supplementing method and device
CN104899243A (en) * 2015-03-31 2015-09-09 北京奇虎科技有限公司 Method and apparatus for detecting accuracy of POI (Point of Interest) data
KR101556743B1 (en) * 2014-04-07 2015-10-02 주식회사 케이티 Apparatus and method for generating poi information based on web collection
CN105160031A (en) * 2015-09-30 2015-12-16 北京奇虎科技有限公司 Mining method and device for map point of interest (POI) data
CN105760360A (en) * 2014-12-16 2016-07-13 高德软件有限公司 Address correction method and device
CN106919569A (en) * 2015-12-24 2017-07-04 北京四维图新科技股份有限公司 Method and device for obtaining administrative division information of a point of interest (POI)
CN106919567A (en) * 2015-12-24 2017-07-04 北京四维图新科技股份有限公司 Method and device for processing point of interest (POI) addresses

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10452763B2 (en) * 2007-03-08 2019-10-22 Oath Inc. Autocomplete for integrating diverse methods of electronic communication
CN104216895B (en) * 2013-05-31 2018-01-30 高德软件有限公司 Method and device for generating POI data
CN106156145A (en) * 2015-04-13 2016-11-23 阿里巴巴集团控股有限公司 Method and device for managing address data
CN106682175A (en) * 2016-12-29 2017-05-17 华南师范大学 Method and system for matching address
CN106874384B (en) * 2017-01-10 2020-12-04 航天精一(广东)信息科技有限公司 Heterogeneous address standard conversion and matching method

Also Published As

Publication number Publication date
CN107656913A (en) 2018-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant