WO2007027608A2 - Local search - Google Patents

Local search

Info

Publication number
WO2007027608A2
Authority
WO
WIPO (PCT)
Prior art keywords
address
local search
data
information
search query
Prior art date
Application number
PCT/US2006/033537
Other languages
French (fr)
Other versions
WO2007027608A3 (en)
Inventor
Kun Shing Luk
Huican Zhu
Hongjun Zhu
Original Assignee
Google Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google Inc.
Priority to EP06802480A (EP1934829A4)
Priority to CA002620770A (CA2620770A1)
Priority to JP2008529167A (JP2009506459A)
Priority to CN200680040129.9A (CN101313300B)
Priority to BRPI0615323-2A (BRPI0615323A2)
Publication of WO2007027608A2
Publication of WO2007027608A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Definitions

  • Implementations described herein relate generally to information retrieval, and, more particularly, to identifying local search results.
  • The World Wide Web ("web") contains a vast amount of information. Locating a desired portion of the information, however, may be challenging. This problem may be compounded because the amount of information on the web and the number of new users inexperienced at web searching are growing rapidly.
  • Search systems attempt to return hyperlinks to web pages in which a user is interested.
  • search systems base their determination of the user's interest on search terms (called a search query) entered by the user.
  • the goal of the search system may be to provide links to high quality, relevant results (e.g., web pages) to the user based on the search query.
  • the search system accomplishes this by matching the terms in the search query to a corpus of pre-stored web pages. Web pages that contain the user's search terms may be referred to as "hits" or "search results" and may be returned to the user as links.
  • Local search systems attempt to return relevant web pages and/or business listings within a specific geographic area. In some countries, detailed address information is not available for some businesses. As a result, performing local searches may be difficult.
  • a method may include receiving yellow page data, third-party map provider data, and document data in response to a local search query, and geocoding at least one of the yellow page data, the third-party map provider data, and the document data to assign a geographic identifier to and to match at least one address within the local search query.
  • the method may also include indexing the geocoded data to identify business information and location information corresponding to the local search query, and providing local search results and a third-party map based on the identified business information and location information.
  • a method of geocoding based on a local search query may include receiving third party map provider data and yellow page data, generating an address based on the local search query, parsing the address, locating longest matching prefixes in the address to identify at least one portion of the address, and locating a combination in the address to verify the address.
  • a method of indexing based on a local search query may include preprocessing yellow page data to a predetermined format, extracting business information from document data, storing the business information in a repository, and indexing address information from third party map provider data.
  • the method may also include clustering the yellow page data and the third party map provider data, and highlighting snippets in the document data.
  • the method may include setting a search distance for the local search query to a predetermined distance.
  • the method may further include setting the search distance to approximately a maximum distance from a centroid of the bound location to corners of the bound location.
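  • The search-distance rule above can be sketched as follows. This is only an illustration under stated assumptions: the bound location is taken to be an axis-aligned latitude/longitude box, distance is measured with the haversine formula, and the names `Bounds`, `haversine_km`, and `search_distance_km` are hypothetical and do not appear in the source.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


@dataclass(frozen=True)
class Bounds:
    """Hypothetical axis-aligned bound location, in degrees."""
    south: float
    west: float
    north: float
    east: float

    def centroid(self):
        # Midpoint of the box; adequate for small regions that do not
        # cross the antimeridian.
        return ((self.south + self.north) / 2.0, (self.west + self.east) / 2.0)

    def corners(self):
        return [(self.south, self.west), (self.south, self.east),
                (self.north, self.west), (self.north, self.east)]


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def search_distance_km(bounds):
    """Maximum distance from the centroid of the bounds to any of its corners."""
    center = bounds.centroid()
    return max(haversine_km(center, corner) for corner in bounds.corners())
```

  • Taking the maximum over all four corners, rather than a single diagonal, keeps the whole bound location inside the search radius even though the box is not a perfect rectangle on the sphere.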
  • a method may include generating a list of synonyms related to a local search query, rewriting the local search query to expand the query, generating local search results based on the expanded query, obtaining a map from a third party map provider based on the local search results, and generating advertisements based on geographical information related to the local search results.
  • a system may include an indexer to receive third party map provider data, yellow page data, and document data, preprocess the yellow page data to determine business information, extract business information from the document data, identify location information in a local search query, and index address data of the third party map provider data.
  • the system may also include a geocoder to receive information from the indexer, and assign geographic identifiers, and a front end server to receive information from the geocoder, rewrite the local search query, obtain a map from the third party map provider data, and generate local search results based on the local search query.
  • a system may include means for receiving yellow page data, third-party map provider data, and document data in response to a local search query, and means for geocoding at least one of the yellow page data, the third-party map provider data, and the document data to assign a geographic identifier to and to match at least one address within the local search query.
  • the system may also include means for indexing the geocoded data to identify business information and location information corresponding to the local search query, and means for providing local search results and a third-party map based on the identified business information and location information.
  • a system may include a memory to store a group of instructions, and a processor to execute instructions in the memory.
  • the processor may identify a location associated with a local search query, identify local search results relevant to the local search query and associated with the identified location, identify an identifier for each of a group of the local search results, and receive from a third party map provider a map associated with the identified location, where the map identifies a position of at least one local search result in the group of local search results.
  • a method may include receiving a local search query, identifying a location associated with the local search query, identifying a set of search results relevant to the local search query and associated with the identified location, and identifying an identifier for each of a group of the search results.
  • the method may also include providing the identifier for each of the group of the search results to a third party map provider, and receiving from the third party map provider a map associated with the identified location, where the map identifies a position of at least one search result in the group of search results.
  • a method may include generating a list of tokens, identifying a potential address within a web document, and parsing the potential address from a beginning to determine whether the potential address includes a token associated with a city. The method may also include further parsing the potential address to determine whether the potential address includes a token associated with a district, identifying a longest-matching token in the potential address after the token associated with the city or the token associated with the district, and determining whether the potential address is an actual address based on the token associated with the city, the token associated with the district, and the identified longest-matching token.
  • Fig. 1 is a diagram of an overview of an exemplary implementation described herein;
  • Fig. 2 is a diagram of an exemplary network in which systems and methods described herein may be implemented;
  • Fig. 3 is an exemplary diagram of a client or server within the exemplary network of Fig. 2;
  • Fig. 4 is a functional block diagram of an exemplary system for identifying local search results and providing a map associated with identified locations;
  • Fig. 5 is an exemplary diagram of an index/document repository of the exemplary system of Fig. 4;
  • Fig. 6 is an exemplary diagram of a geocoder of the exemplary system of Fig. 4;
  • Fig. 7 is an exemplary diagram of an indexer of the exemplary system of Fig. 4;
  • Fig. 8 is an exemplary diagram of a front end server of the exemplary system of Fig. 4;
  • Fig. 9 is a diagram of exemplary local search results and a map generated by the exemplary system of Fig. 4; and
  • Figs. 10A-10D are flowcharts of an exemplary process for identifying local search results and providing a map associated with identified locations.
  • map data and yellow page data may not be available from a single provider and must be obtained from several different providers. Due to export restrictions, it may not be possible to get detailed map data to render the map of an area or to get the actual latitude and longitude of addresses within the area. As a result, address approximation may be used for geocoding of addresses.
  • the local results page may include a list of relevant results and a pointer to a map provider's server (third party).
  • the map provider may be responsible for generating the map displayed to the user.
  • Implementations described herein may identify local search results and generate a map associated with identified locations.
  • a system may receive a local search query input by a user, and may identify a location associated with the local search query.
  • the system may identify a set of local search results (e.g., search results "A" through "H") that may be related to the local search query and may be associated with the identified location.
  • the local search results may include links to documents that may be related to the local search query.
  • the system may identify an identifier for a group of local search results, and may provide the identifier to a map provider.
  • the system may receive a map associated with the identified location from the map provider.
  • the map may identify a position of at least one search result (e.g., search result "A" as shown in Fig. 1) in the group of local search results.
  • A "document," as the term is used herein, is to be broadly interpreted to include any machine-readable and machine-storable work product.
  • a document may include, for example, an e-mail, a file, a combination of files, one or more files with embedded links to other files, a news group posting, a blog, a web advertisement, etc.
  • a common document is a web page. Web pages often include textual information and may include embedded information (such as meta information, images, hyperlinks, etc.) and/or embedded instructions (such as JavaScript, etc.).
  • A "link," as the term is used herein, is to be broadly interpreted to include any reference to/from a document from/to another document or another part of the same document.
  • FIG. 2 is an exemplary diagram of a network 200 in which systems and methods described herein may be implemented.
  • Network 200 may include multiple clients 210 connected to multiple servers 220-240 via a network 250.
  • Two clients 210 and three servers 220-240 have been illustrated as connected to network 250 for simplicity. In practice, there may be more or fewer clients and servers.
  • a client may perform one or more functions of a server and/or a server may perform one or more functions of a client.
  • Clients 210 may include client entities.
  • An entity may be defined as a device, such as a personal computer, a wireless telephone, a personal digital assistant (PDA), a laptop, or another type of computation or communication device, a thread or process running on one of these devices, and/or an object executable by one of these devices.
  • Servers 220-240 may include server entities that gather, process, search, and/or maintain documents.
  • server 220 may include a local search system 225 usable by clients 210.
  • Server 220 may crawl a corpus of documents, index the documents, and store information associated with the documents in a repository of documents. Any combination of servers 220-240 may implement local search system 225 to identify local search results and provide a map associated with identified locations.
  • Although servers 220-240 are shown as separate entities, it may be possible for one or more of servers 220-240 to perform one or more of the functions of another one or more of servers 220-240. For example, it may be possible that two or more of servers 220-240 are implemented as a single server. It may also be possible for a single one of servers 220-240 to be implemented as two or more separate (and possibly distributed) devices.
  • Network 250 may include a local area network (LAN), a wide area network (WAN), a telephone network, such as the Public Switched Telephone Network (PSTN) or a cellular network, an intranet, the Internet, or a combination of networks.
  • Clients 210 and servers 220-240 may connect to network 250 via wired, wireless, and/or optical connections.
  • Fig. 3 is an exemplary diagram of a client or server entity (hereinafter called “client/server entity”), which may correspond to one or more of clients 210 and servers 220-240.
  • the client/server entity may include a bus 310, a processor 320, a main memory 330, a read only memory (ROM) 340, a storage device 350, an input device 360, an output device 370, and a communication interface 380.
  • Bus 310 may include a path that permits communication among the elements of the client/server entity.
  • Processor 320 may include a processor, microprocessor, or processing logic that may interpret and execute instructions.
  • Main memory 330 may include a random access memory (RAM) or another type of dynamic storage device that may store information and instructions for execution by processor 320.
  • ROM 340 may include a ROM device or another type of static storage device that may store static information and instructions for use by processor 320.
  • Storage device 350 may include a magnetic and/or optical recording medium and its corresponding drive.
  • Input device 360 may include a mechanism that permits an operator to input information to the client/server entity, such as a keyboard, a mouse, a pen, voice recognition and/or biometric mechanisms, etc.
  • Output device 370 may include a mechanism that outputs information to the operator, including a display, a printer, a speaker, etc.
  • Communication interface 380 may include any transceiver-like mechanism that enables the client/server entity to communicate with other devices and/or systems.
  • communication interface 380 may include mechanisms for communicating with another device or system via a network, such as network 250.
  • the client/server entity may perform certain operations, as will be described in detail below. The client/server entity may perform these operations in response to processor 320 executing software instructions contained in a computer-readable medium, such as memory 330.
  • a computer-readable medium may be defined as a physical or logical memory device and/or carrier wave.
  • the software instructions may be read into memory 330 from another computer-readable medium, such as data storage device 350, or from another device via communication interface 380.
  • the software instructions contained in memory 330 may cause processor 320 to perform processes that will be described later.
  • hardwired circuitry may be used in place of or in combination with software instructions to implement processes described herein.
  • implementations described herein are not limited to any specific combination of hardware circuitry and software.
  • EXEMPLARY LOCAL SEARCH SYSTEM
  • Fig. 4 is a functional block diagram of an exemplary system (e.g., local search system 225) for identifying local search results and providing a map associated with identified locations.
  • The functions described below may be performed by a server (e.g., server 220), a portion of server 220, or a combination of servers (e.g., servers 220-240). One or more of these functions may also be performed by an entity separate from server 220, such as a client (e.g., client 210), a computer associated with server 220, or one of servers 230 or 240.
  • system 225 may include map provider data 400, yellow page data 405, web document data 410, an index/document repository 415, address and points of interest (POI) identification (ID) information 420, address fingerprint (FP) and POI ID mapping information 425, a geocoder 430, an indexer 435, a front end server 440, map Uniform Resource Locator (URL) information 445, and local search results 450.
  • Map provider data 400, yellow page data 405, and web document data 410 may be provided to indexer 435, and map provider data 400 may further be used to derive address/POI ID information 420 and address FP and POI ID mapping information 425.
  • Indexer 435 may connect to index/document repository 415 and geocoder 430, and address/POI ID information 420 may be provided to geocoder 430.
  • Front end server 440 may connect to geocoder 430 and index/document repository 415, and may receive address FP and POI ID mapping information 425.
  • Front end server 440 may generate map URL information 445 and local search results 450.
  • System 225 may alternatively include other connections and/or component interrelations not shown in Fig. 4.
  • geocoder 430 may receive address/POI ID information 420 and/or information from indexer 435, may assign geographic identifiers (e.g., locations, coordinates, etc.) to objects, and may provide outputs to front end server 440.
  • Indexer 435 may receive map provider data 400, yellow page data 405, and/or web document data 410, may preprocess yellow page data 405 to determine business information, extract business information from web document data 410, identify location information in a search query, index the address data of map provider data 400, and may provide outputs to index/document repository 415 and/or geocoder 430.
  • Front end server 440 may receive address FP and POI ID information 425, information from geocoder 430, and/or information from index/document repository 415, may rewrite the search query, and may generate map URL information 445, local search results 450, and/or geographical information for use by an advertisements (ads) server (not shown).
  • Map provider data 400 may include a variety of information.
  • a third party map provider may provide a set of POIs (e.g., businesses and other places of interest, such as museums, parks, hospitals, schools, etc.) and their addresses for cities in China.
  • Map provider data 400 may be updated periodically (e.g., daily, weekly, monthly, etc.), and the POI ID may change.
  • map provider data 400 may include the following information: (1) a normal POI that may contain a business name, address, telephone number, and grid index; (2) a road POI that may contain a street name and grid index of the center of the street; and/or (3) a postal code POI that may contain postal codes and a grid index of the approximate center of the postal codes.
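  • The three kinds of POI records listed above can be modeled as simple record types. The sketch below is illustrative only: the class and field names are hypothetical, the grid index is assumed to be an opaque string, and the telephone number is made optional because the text notes that some fields may be missing.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class NormalPOI:
    """Business-style POI: name, address, telephone number, and grid index."""
    business_name: str
    address: str
    grid_index: str                  # assumed to be an opaque string
    telephone: Optional[str] = None  # may be absent from the provider data


@dataclass(frozen=True)
class RoadPOI:
    """Road POI: street name plus the grid index of the street's center."""
    street_name: str
    center_grid_index: str


@dataclass(frozen=True)
class PostalCodePOI:
    """Postal code POI: code plus the grid index of its approximate center."""
    postal_code: str
    center_grid_index: str


POI = Union[NormalPOI, RoadPOI, PostalCodePOI]


def grid_index_of(poi: POI) -> str:
    """Return the grid index used to position any kind of POI on the map."""
    if isinstance(poi, NormalPOI):
        return poi.grid_index
    return poi.center_grid_index
```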
  • map provider data 400 may contain one part of an address but may be missing another part of the address.
  • the street number for example, may be missing from map provider data 400 due to restrictions in China (e.g., only 20-30% of the POIs may have street numbers).
  • map provider data 400 may include only the street name of the POI or some higher level of abstraction. Also, or alternatively, the telephone number may be
  • Yellow page data 405 may include a variety of information and may be received from multiple sources (e.g., third party providers).
  • yellow page data 405 may include
  • yellow page data 405 may not contain any latitude and longitude information.
  • complete (or substantially complete) information regarding the POIs (e.g., address, telephone number, and map position information)
  • map provider data 400 may be deduced from map provider data 400 and
  • Web document data 410 may include a variety of information.
  • web document data 410 may include an e-mail, a file, a combination of files, one or more files with embedded links to other files, a news group posting, a blog, a web advertisement, a web page (which may include textual information and may include embedded information (such as meta information, images, hyperlinks, etc.) and/or embedded instructions (such as JavaScript, etc.)), business information (e.g., address, telephone number, etc.), etc.
  • Web document data 410 may also include documents located based on a local search query.
  • Address/POI ID Information
  • Address/POI ID information 420 may include a variety of information.
  • address/POI ID information 420 may include addresses extracted from map provider data 400, POI IDs extracted from map provider data 400, etc.
  • address/POI ID information 420 may include the following information regarding Chinese addresses: address data provided by map provider data 400 (e.g., addresses of POIs, street centers); POIs (e.g., schools, parks, buildings,
  • Address FP and POI ID Mapping Information
  • Address FP and POI ID information 425 may include a variety of information.
  • address FP/POI ID information 425 may include a mapping between address fingerprints (FPs) and corresponding POI IDs, etc. The mapping between the address FPs and corresponding POI IDs may be used by front end server 440 to look up the POI ID for each local search result, as described in detail below.
  • POI IDs may be directly derived without mapping between address FPs and corresponding POI IDs.
  • information 425 may include the directly derived POI IDs.
  • Figs. 5-8 are exemplary diagrams of some components of system 225 of Fig. 4.
  • Index/Document Repository
  • Index/document repository 415 may be provided in a single storage device (e.g., main memory 330, ROM 340, and/or storage device 350). Index/document repository 415, as shown in Fig. 5, may store a variety of information related to documents, yellow page data 405, and/or map provider data 400. For example, in one implementation, index/document repository 415 may store the following information regarding Chinese addresses: address data provided by map provider data 400 (e.g., addresses of POIs, street centers); POIs (e.g., schools, parks, buildings, hospitals, etc.); postal code centers (i.e., the center of an area covered by a postal code); additional cities that may not be included in the list provided by map provider data 400; etc. In another implementation, as shown in Fig. 5, index/document repository 415 may include a [province] field 500, a [city] field 510, a [district] field 520, and a [street] field 530 (which may additionally or alternatively include an address field and/or a POI name field).
  • the following examples may correspond to Chinese address information stored in index/document repository 415:
  • index/document repository 415 may include an address fingerprint (FP) field 540 and an FP accuracy field 550.
  • Each address may be treated as a point by geocoder 430 with an address FP and FP accuracy (e.g., buildings may have a higher FP accuracy than street centers).
  • An address FP may be a fingerprint generated based on the address (e.g., a hash value that may be generated based on the address) that may be used to look up the POI ID for displaying the correct map.
  • an FP generator 570 may receive an address (e.g., [province] field 500, [city] field 510, [district] field 520, [street] field 530, etc.), and may generate an address FP (e.g., [FP] field 540) and an FP accuracy (e.g., [FP accuracy] field 550) based on the address.
  • the address FP and FP accuracy may be used by a POI ID lookup table 580 to look up the POI ID for displaying the correct map (e.g., with local search results).
  • FP generator 570 and/or POI ID lookup table 580 may be provided in geocoder 430, indexer 435, or front end server 440.
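  • The fingerprint and lookup steps above can be sketched as follows. The source only says that the fingerprint may be a hash value generated from the address; the SHA-1 hash, the normalization rules, the 64-bit truncation, the numeric accuracy levels, and the names `address_fingerprint`, `FP_ACCURACY`, and `lookup_poi_id` are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, Optional

# Hypothetical accuracy levels: the text only states that buildings may
# have a higher FP accuracy than street centers.
FP_ACCURACY = {"building": 3, "street_center": 2, "postal_code_center": 1}


def address_fingerprint(province: str, city: str, district: str, street: str) -> int:
    """Hash the normalized address fields into a 64-bit integer fingerprint."""
    # Collapse runs of whitespace and lowercase each field so that trivially
    # different spellings of the same address share a fingerprint.
    parts = [" ".join(p.split()).lower() for p in (province, city, district, street)]
    canonical = "|".join(parts)
    digest = hashlib.sha1(canonical.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def lookup_poi_id(table: Dict[int, str], fp: int) -> Optional[str]:
    """Return the POI ID mapped to this fingerprint, or None if unknown."""
    return table.get(fp)
```

  • A plain dictionary stands in for POI ID lookup table 580 here; a production table would also need to handle fingerprints whose POI IDs change when the map provider data is updated.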
  • index/document repository 415 may include a [zcode] field 560 associated with the address.
  • An associated zcode may include a code like a postal code, which in the case of China may be computed from the six-digit administrative code for districts in China.
  • Locations, such as JiangSu province, BeiJing City, and BeiJing City XiCheng District, may be computed by geocoder 430 as bounds that contain all the point locations within them.
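  • Computing a region as the bounds of its point locations can be sketched as below. This assumes points are (latitude, longitude) pairs and that the bounds are a simple axis-aligned box that does not cross the antimeridian; the function names and the sample coordinates are hypothetical.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]                  # (latitude, longitude)
Bounds = Tuple[float, float, float, float]   # (south, west, north, east)


def bounds_of(points: Iterable[Point]) -> Bounds:
    """Smallest axis-aligned box that contains every point location."""
    pts = list(points)
    if not pts:
        raise ValueError("a location needs at least one point")
    lats = [lat for lat, _ in pts]
    lons = [lon for _, lon in pts]
    return (min(lats), min(lons), max(lats), max(lons))


def contains(bounds: Bounds, point: Point) -> bool:
    """True if the point lies inside or on the edge of the bounds."""
    south, west, north, east = bounds
    lat, lon = point
    return south <= lat <= north and west <= lon <= east
```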
  • Variants of the addresses may also be determined and stored in index/document repository 415 to increase recall.
  • single character synonyms for the provinces and cities may be identified (e.g., Shanghai City -> Lu; HeBei province -> Qi). Additionally or alternatively, portions of the addresses may be omitted (e.g., Shanghai City -> Shanghai; HaiDian District -> HaiDian; HuaiHai Central Road -> HuaiHai Road). Additionally or alternatively, synonyms may be included for famous places (e.g., Shanghai City Temple -> Old City Temple; LiuRong Temple -> LiuRong Ta).
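  • The three kinds of variants above can be sketched as a small expansion function. The synonym tables below contain only the examples given in the text; the rule names, the choice of omittable suffixes, and the `address_variants` function are assumptions for illustration, and romanized names stand in for the Chinese originals.

```python
# Examples taken from the text; real tables would be far larger.
SINGLE_CHAR_SYNONYMS = {"Shanghai City": "Lu", "HeBei province": "Qi"}
PLACE_SYNONYMS = {
    "Shanghai City Temple": "Old City Temple",
    "LiuRong Temple": "LiuRong Ta",
}
# Hypothetical list of suffixes that may be dropped from a name.
OMITTABLE_SUFFIXES = (" City", " District")


def address_variants(name):
    """Return the name followed by its distinct variants, in a stable order."""
    variants = [name]

    def add(candidate):
        # Skip missing candidates and anything already recorded.
        if candidate and candidate not in variants:
            variants.append(candidate)

    add(SINGLE_CHAR_SYNONYMS.get(name))
    for suffix in OMITTABLE_SUFFIXES:
        if name.endswith(suffix):
            add(name[: -len(suffix)])
    # "HuaiHai Central Road" -> "HuaiHai Road": drop the qualifier "Central".
    if name.endswith(" Central Road"):
        add(name[: -len(" Central Road")] + " Road")
    add(PLACE_SYNONYMS.get(name))
    return variants
```

  • Indexing every variant alongside the original name is what raises recall: a query written as "Shanghai" or "Lu" can then match a record stored as "Shanghai City".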
  • index/document repository 415 may include any information that may be useful to identify local search results.
  • Geocoder 430 may perform a variety of tasks to aid in the identification of local search results.
  • geocoder 430 may assign geographic identifiers (e.g., locations, coordinates, etc.) to objects.
  • geocoder 430 may match addresses to addresses in index/document repository 415.
  • Output from geocoder 430 may be used for several purposes.
  • yellow page data 405 may be parsed to determine business information (e.g., address, telephone number, email address, facsimile number, hours of operation, etc.) for POIs, business information may be extracted from web document data 410, and/or location information provided in a search query may be identified.
  • indexer 435 may need to determine the map position by looking at Chinese addresses. This may implicate a variety of issues. For example, accurate position and detailed address information for most of the addresses in China are not readily available.
  • Chinese addresses do not have a well defined format and hierarchy. The possible components of a Chinese address may include city, district, town, village, road, street, street number, and building.
  • Yellow page data 405 may be in free form (i.e., not conforming to any particular form or format), especially in rural areas, and for some POIs, there are no corresponding address components available. Exemplary Chinese addresses may include: ShenZhen City HeGang Town AnLiang Village AnLiang Road 172 Number JingCheng Building;
  • Geocoder 430 may best match addresses in yellow page data 405 with addresses provided in map provider data 400, and obtain the most accurate map position possible.
  • Geocoder 430 may include an address list generator 600, a parser 610, a longest matching prefix locator 620, a combination locator 630, a query geocoding unit 640, etc.
  • the components of geocoder 430 may perform a number of tasks for each address (e.g., ShenZhen City HeGang Town AnLiang Village AnLiang Road 172 Number JingCheng Building).
  • Address list generator 600 may receive map provider data 400 and yellow page data 405, and may generate an address list (also referred to as tokens) based on the provinces, cities, districts, street names, addresses, POI names, or combinations thereof from map provider data 400 and/or yellow page data 405. For each address, parser 610 may attempt to parse the city from the beginning of the address. For example, "ShenZhen City" may be parsed by parser 610 from the address. Parser 610 may also attempt to parse the district from the address. For example, since a district is absent from the exemplary address described above, parser 610 may not be able to parse the district from the address.
  • Longest matching prefix locator 620 of geocoder 430 may be used to locate further portions of the address. For example, longest matching prefix locator 620 may attempt to locate the longest matching prefix (e.g., token) from the address list. This may fail if there is not a specific token in the address list (e.g., if "HeGang Town" were not in the address list). Longest matching prefix locator 620 may advance to the end of the word "Town" if it is present in the address list.
  • longest matching prefix locator 620 may identify the token as a good match (e.g., "HeGang Town" may be identified as a good match). Longest matching prefix locator 620 may try again to locate the longest matching prefix (e.g., token) from the address list. If this fails, then longest matching prefix locator 620 may advance to the end of the word "Village" if it is present (e.g., "AnLiang Village"). Longest matching prefix locator 620 may repeat the process again to locate the longest matching prefix (e.g., token) from the address list.
  • longest matching prefix locator 620 may attempt to match street names (e.g., "Road" or "Street"). Longest matching prefix locator 620 may advance to the end of the word(s) "Road" or "Street," if they are present (e.g., "AnLiang Road"). Longest matching prefix locator 620 may also advance to the end of the word "Number," if it is present (e.g., "172 Number"). Finally, longest matching prefix locator 620 may attempt to locate the longest matching prefix (e.g., token) from the POI names. This may provide matches for names of buildings, schools, parks, etc. (e.g., "JingCheng Building").
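  • The longest-matching-prefix walk described above can be sketched as follows. This is a simplified illustration, not the patented procedure: romanized, space-separated words stand in for Chinese text, the marker words are tried as a single set rather than in the staged Town, Village, Road/Street, Number order, and the names `longest_prefix_end`, `marker_end`, and `parse_address` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# Words that end an address component when no token matches.
MARKERS = ("Town", "Village", "Road", "Street", "Number")


def longest_prefix_end(words: List[str], start: int, tokens: Set[str]) -> Optional[int]:
    """End index of the longest token that is a prefix of words[start:], or None."""
    for end in range(len(words), start, -1):  # try the longest candidate first
        if " ".join(words[start:end]) in tokens:
            return end
    return None


def marker_end(words: List[str], start: int) -> Optional[int]:
    """Index just past the next marker word at or after start, or None."""
    for i in range(start, len(words)):
        if words[i] in MARKERS:
            return i + 1
    return None


def parse_address(address: str, tokens: Set[str]) -> Tuple[List[Tuple[str, bool]], str]:
    """Split an address into (component, found_in_token_list) pairs.

    Returns the components plus any trailing text that could not be parsed.
    """
    words = address.split()
    parts: List[Tuple[str, bool]] = []
    pos = 0
    while pos < len(words):
        end = longest_prefix_end(words, pos, tokens)
        in_list = end is not None
        if end is None:
            # No token matches here: advance to the end of the next marker word.
            end = marker_end(words, pos)
        if end is None:
            break  # nothing recognizable remains
        parts.append((" ".join(words[pos:end]), in_list))
        pos = end
    return parts, " ".join(words[pos:])
```

  • Applied to the exemplary address with a token list containing "ShenZhen City", "HeGang Town", "AnLiang Road", and "JingCheng Building", the sketch marks those four components as good matches and recovers "AnLiang Village" and "172 Number" through the marker fallback.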
  • Combination locator 630 may locate particular combinations so that POI names may be located first, then various levels of the addresses may be located, and, finally, districts or cities may be located. Such a locating arrangement may guarantee that the most specific address may be obtained by combination locator 630.
  • combination locator 630 may locate the following exemplary combinations: city + district + POI name; city + district + address located by longest matching prefix locator 620 (e.g., "Road" or "Street"); city + district + address located by longest matching prefix locator 620 (e.g., "Village"); city + district + address located by longest matching prefix locator 620 (e.g., "Town"); city + district; and/or city.
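  • The most-specific-first ordering of the combinations above can be sketched as follows. The dictionary keys (`city`, `district`, `poi`, `street`, `village`, `town`), the space-joined lookup key, the dictionary index, and the rule of simply skipping a missing district are all assumptions made for illustration.

```python
from typing import Dict, Iterator, Optional, Tuple

# Address levels ordered from most specific to least specific.
LEVELS = ("poi", "street", "village", "town")


def candidate_keys(parsed: Dict[str, str]) -> Iterator[str]:
    """Yield lookup keys from the most specific combination to the least."""
    # City and district form the shared prefix; a missing one is skipped.
    base = [parsed[k] for k in ("city", "district") if parsed.get(k)]
    for level in LEVELS:
        if parsed.get(level):
            yield " ".join(base + [parsed[level]])
    if parsed.get("district"):
        yield " ".join(base)      # city + district
    if parsed.get("city"):
        yield parsed["city"]      # city alone


def locate(parsed: Dict[str, str], index: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (key, location) for the first, most specific key in the index."""
    for key in candidate_keys(parsed):
        if key in index:
            return key, index[key]
    return None
```

  • Because the keys are tried in order, a street-level match is returned even when the same index also holds a coarser entry for the city.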
  • the location part of the search query may be sent to a map server owned by a map provider. If the location query contains Chinese, Japanese, or Korean (CJK) characters, the map server may send it to query geocoding unit 640 for geocoding.
  • query geocoding unit 640 may present suggestions to the user. For example, if the location query is "History Museum,” the query geocoding unit 640 may present the following suggestions to the user: “Do you want to look for TianJian City HeDong District History Museum, or ShangHai City PuDong District ShangHai History Museum?"
  • Query geocoding unit 640 may compute the score of a search result based on the number of points in the search result location. For example, if ChangChuan City ChaoYang District scores less than Beijing City ChaoYang District, then Beijing City ChaoYang District may be displayed when the user's query location is "ChaoYang District.”
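A minimal sketch of scoring by the number of points follows. It assumes each candidate location carries a count of the points it contains; the function name, the suffix match used to find candidates, and the counts in the usage note are illustrative rather than taken from the description.

```python
def pick_location(query_location, candidates):
    """Pick the candidate with the most points for an ambiguous query.

    candidates is a list of (full_name, point_count) pairs. A candidate
    matches when its full name ends with the query location (an assumed,
    simplified matching rule).
    """
    matches = [c for c in candidates if c[0].endswith(query_location)]
    if not matches:
        return None
    # The score is simply the number of points in the location.
    return max(matches, key=lambda c: c[1])[0]
```

With made-up counts of 1,200 points for "ChangChuan City ChaoYang District" and 9,500 for "Beijing City ChaoYang District", a query for "ChaoYang District" would select the Beijing district.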
  • Although Fig. 6 shows exemplary tasks performed by geocoder 430, in other implementations, geocoder 430 may perform additional or different tasks that may be used to identify local search results. Furthermore, although Fig. 6 shows the components of geocoder 430 as interconnected, in other implementations, the components of geocoder 430 may be separate, non-interconnected components.
Indexer
  • Indexer 435 may perform a variety of tasks to aid in the identification of local search results.
  • indexer 435 may preprocess yellow page data 405 to determine business information (e.g., address, telephone number, email address, facsimile number, hours of operation, etc.) for POIs, may extract business information from web document data 410, and/or may identify location information in a search query.
  • Indexer 435 may include a map data indexer 700, a yellow page data preprocessor 710, a business information extractor 720, a distance flattener 730, a clusterer 740, a business information repository 750, a snippet highlighter 760, etc.
  • Map data indexer 700 may index address data from map provider data 400, which may include map position information.
  • Geocoder 430 may attempt to geocode the address again based on the indexed address data, and clusterer 740 (described below) may set the cluster position if the cluster position is different from the given position, but only if the cluster position is within a predetermined distance threshold (e.g., three kilometers) of the given position. Such an arrangement may be used for improving clustering, as described in detail below.
  • Yellow page data preprocessor 710 may receive yellow page data 405 from the different providers, and may preprocess yellow page data 405 to a common format. This formatted data may be provided to geocoder 430 during indexing, and geocoder 430 may attempt to geocode the address.
  • If the address can be geocoded to building or street level, the address may be indexed as a normal entry. If the address can be geocoded to city or district level, the address may be indexed as an entry with an approximate position. During scoring, the entry may be treated as if it is at least twenty kilometers from the centroid (i.e., essentially having its score demoted). If the address cannot be geocoded, the address may be treated as an entry without a position. During indexing, if the entry without a position can be clustered with another entry (e.g., using its telephone number), the entry may be retained. Otherwise, the entry may be discarded.
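The three outcomes of geocoding a yellow page address can be sketched as follows. The enum, the level strings, and the function names are hypothetical; only the building/street versus city/district distinction, the twenty-kilometer floor, and the retain-or-discard rule come from the description.

```python
from enum import Enum


class EntryKind(Enum):
    NORMAL = "normal"            # geocoded to building or street level
    APPROXIMATE = "approximate"  # geocoded to city or district level
    NO_POSITION = "no_position"  # could not be geocoded


APPROX_MIN_DISTANCE_KM = 20.0  # floor applied to approximate entries


def classify_entry(geocode_level):
    """Map a geocode level (assumed to be a string or None) to an entry kind."""
    if geocode_level in ("building", "street"):
        return EntryKind.NORMAL
    if geocode_level in ("city", "district"):
        return EntryKind.APPROXIMATE
    return EntryKind.NO_POSITION


def keep_entry(kind, clusters_with_other):
    """An entry without a position is kept only if it clusters with another."""
    return kind is not EntryKind.NO_POSITION or clusters_with_other


def scoring_distance_km(kind, measured_km):
    """Treat approximate entries as at least twenty kilometers away."""
    if kind is EntryKind.APPROXIMATE:
        return max(measured_km, APPROX_MIN_DISTANCE_KM)
    return measured_km
```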
  • Business information extractor 720 may extract business information (e.g., addresses and telephone numbers) from web document data 410 using a variety of techniques. The techniques may be modified based on the different formatting of addresses and telephone numbers in China and Japan.
  • business information extractor 720 may include a classifier that may analyze documents with addresses to determine business information associated with the addresses based on a statistical model.
  • the documents analyzed by the classifier may include documents with addresses for which there is no corresponding yellow page data 405 and/or documents with addresses for which there is possibly incorrect yellow page data 405.
  • the functions performed by the classifier may differ based on whether the business information corresponds to business name (title) information or telephone number information. Yet other functions may be performed when the business information includes information other than business name or telephone number information.
  • a business name (title) may be identified by analyzing terms near the address and determining the probability that each term is part of the title.
  • a confidence score may be assigned to each candidate title that is identified.
  • a telephone number may be associated with an address by identifying a set of candidate telephone numbers in the document. It may be determined, based on the statistical model, the probability that each of the candidate telephone numbers is associated with the address given the prediction regarding the preceding candidate telephone number and given a window of terms (e.g., looking at a predetermined number of terms to the left and/or right) around the candidate telephone number. Confidence scores may be assigned to the candidate telephone numbers based on their determined probabilities. Optionally, a best telephone number for the address may be determined. The telephone number may then be associated with the address to form or supplement a business listing.
  • business information extractor 720 may include a location extractor which may be included as part of a search engine.
  • the location extractor may receive a search query and determine whether the search query includes a geographic reference. When the search query includes a geographic reference, the location extractor may separate the geographic reference from the search terms in the query and send them to a local search engine. When the search query does not include a geographic reference, the location extractor may forward the search terms to a web search engine that may include a traditional web search engine that returns a set of documents related to a search query.
  • the local search engine may include a specialized search engine, such as a business listings search engine.
  • the local search engine may receive the search terms and the geographic reference of a search query from the location extractor.
  • the local search engine may identify a set of documents that match the search query (i.e., documents that contain the set of search terms of the search query) by comparing the search terms to documents in a document corpus relating to the geographic area associated with the geographic reference.
  • the local search engine may score the identified documents, sort them based on their scores, and output them as a list of search results.
  • the location extractor may determine unambiguous addresses (e.g., cities) in a search query by setting a variable i equal to one, and performing a search for the name of a city for each city(i) in a list of cities. The number of search results for this search may be counted as countcity. A search may also be performed for the name of the city with the name of the corresponding province. The number of search results for this search may be counted as countcity/province.
  • It may then be determined whether countcity/province is at least X% (where X is a number greater than zero) of countcity.
  • If countcity/province is at least X% of countcity, the city may be considered an "unambiguous" city.
  • An "unambiguous city” may refer to a city whose name can be used alone in a search query and it will be understood that the user intended the city and not something else.
  • If countcity/province is not at least X% of countcity, then it may be determined whether there are any more cities on the list. If there are more cities on the list, then the variable i may be incremented by one and the next city in the list of cities may be evaluated.
  • the documents of the search results may be analyzed to identify any postal codes that they contain.
  • the postal codes may be identified using a pattern matching technique and verified by comparing them to a list of postal codes. It may then be determined whether the postal codes correspond to postal codes associated with city(i).
  • the number of documents that contain postal codes associated with city(i) may be counted as countpostal. It may be determined whether countpostal is at least X% (e.g., 5%) of countcity. When countpostal is at least X% of countcity, then the city may be considered an unambiguous city.
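The two threshold tests can be sketched as a single predicate. This is an illustration under stated assumptions: the function and parameter names are hypothetical, the same threshold X is used for both tests, and the two tests are treated as alternatives, which the description does not state explicitly. The 5% default mirrors the example given for the postal code test.

```python
def is_unambiguous_city(count_city, count_city_province, count_postal,
                        threshold_pct=5.0):
    """Decide whether a city name can stand alone in a search query.

    count_city: results for the city name alone (countcity).
    count_city_province: results for city name plus province
        (countcity/province).
    count_postal: result documents containing a postal code of the
        city (countpostal).
    """
    if count_city <= 0:
        return False  # no evidence either way
    needed = count_city * threshold_pct / 100.0
    if count_city_province >= needed:
        return True
    return count_postal >= needed
```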
  • business information extractor 720 may identify a geographically relevant document.
  • a geographically relevant document may generally refer to any document that, in some manner, has been determined to have particular relevance to a geographical location.
  • Business listings (e.g., yellow page listings) may be examples of geographically relevant documents.
  • Other documents, such as web documents, may also have particular geographical relevance.
  • a business may have a home page, may be the subject of a document that comments on or reviews the business, or may be mentioned by a web page that in some other way relates to the business.
  • the particular geographic location with which a document is associated may be determined in a number of ways, such as from the postal address or from other geographic signals.
  • the geographic region associated with the geographically relevant document may be mapped to a corresponding location identifier. Additional location identifiers may be determined for the document. In particular, location identifiers corresponding to surrounding regions within a predetermined range may also be determined. Each geographically relevant document may be indexed as if the document included the location identifiers associated with the document's region as well as the identified surrounding regions.
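One way to picture indexing a document under its own region and the surrounding regions is with a regular grid. The grid representation is purely an assumption for illustration: the description does not say how regions or location identifiers are encoded, and the names below are hypothetical.

```python
def location_identifiers(cell, radius=1):
    """Return the identifiers to index for a document located in cell.

    cell is an (x, y) grid index standing in for a location identifier;
    radius is the predetermined range, measured in cells. The result
    contains the document's own cell and every cell within the range.
    """
    x, y = cell
    return {
        (x + dx, y + dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
    }
```

A query geocoded to any of the returned cells would then match the document through an ordinary index lookup, with no distance computation at query time.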
  • Although Fig. 7 shows business information extractor 720 as part of indexer 435, in other implementations, business information extractor 720 may be separate from indexer 435.
  • the information extracted by business information extractor 720 may be provided in business information repository 750.
  • Business information repository 750 may include a variety of information, e.g., the documents from which business information has been extracted by business information extractor 720.
  • Business information repository 750, together with the extracted business information, may be provided within indexer 435.
  • Although Fig. 7 shows business information repository 750 as part of indexer 435, in other implementations, business information repository 750 may be separate from indexer 435.
  • Distance flattener 730 may set a search radius or distance for a local search query.
  • each local search query may be geocoded by geocoder 430 to a particular location.
  • Each location may be a point location (e.g., buildings, famous tourist places, schools, street centers, etc.) or a bound location (e.g., districts, cities, provinces, etc.).
  • If the location is a point location, distance flattener 730 may set the search radius to a predetermined distance (e.g., approximately five kilometers around the point).
  • If the location is a bound location, distance flattener 730 may set the search radius to approximately the maximum distance from a centroid of the location to the corners. Scores of search results in a zcode set associated with the location (i.e., the set of zcodes making up the location) may be promoted. In this way, when a user searches near a district name, the top results may be within that district.
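The radius rule for point and bound locations can be sketched as follows. The dictionary layout of a location, the use of (latitude, longitude) pairs, and the haversine great-circle distance are assumptions; the description only specifies a fixed radius for points and the maximum centroid-to-corner distance for bound locations.

```python
import math

POINT_RADIUS_KM = 5.0     # exemplary fixed radius around a point location
EARTH_RADIUS_KM = 6371.0  # mean Earth radius used by the haversine formula


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def search_radius_km(location):
    """Return the search radius for a geocoded location.

    location is a hypothetical dict with a "kind" of "point" or "bound";
    bound locations also carry a "centroid" and a list of "corners".
    """
    if location["kind"] == "point":
        return POINT_RADIUS_KM
    centroid = location["centroid"]
    return max(haversine_km(centroid, corner) for corner in location["corners"])
```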
  • Clusterer 740 may cluster map provider data 400 and yellow page data 405 together.
  • Much of map provider data 400 may include accurate position information so that it may be displayed accurately on a map, but may not include detailed address or telephone number information (e.g., it might include an address without a street number and/or may be missing a telephone number).
  • yellow page data 405 may include detailed address and telephone number information, but may not include accurate position information.
  • map provider data 400 may include a source (e.g., map provider), a title (e.g., "Beijing University”), an address (e.g., "Beijing City HaiDian District YiHe Yuan Road”), and/or a POI ID (for map display) (e.g., "A1234567”).
  • yellow page data 405 may include a source (e.g., yellow page data provider), a title (e.g., "Beijing University”), an address (e.g., "Beijing City HaiDian District YiHe Yuan Road 5 Number”), and/or a telephone number (e.g., "010-62752114"). If these entries are clustered together by clusterer 740, then front end server 440 may be able to provide the user with detailed address and telephone number information, as well as an accurate position on a map.
  • the position obtained from geocoding (e.g., with geocoder 430) the address from yellow page data 405 may be an approximation and may be far away from the accurate position provided by map provider data 400.
  • the same business from two providers may be located in much different neighborhoods, and thus may not be clustered together by clusterer 740.
  • the solution to this may include geocoding (e.g., with geocoder 430) the address from map provider data 400 to a cluster position.
  • the cluster position may be used for neighborhood generation as well as for clustering by clusterer 740.
  • the actual position provided by map provider data 400 may then be used for map display.
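The clustering idea above can be sketched as follows: both entries are keyed by a cluster position derived by geocoding their addresses, while the POI ID from the map provider is kept for display. The entry layout, the function names, and the use of the title together with the cluster position as the clustering key are assumptions for illustration.

```python
def merge_entries(map_entry, yellow_entry):
    """Combine a map provider entry and a yellow page entry for one business."""
    return {
        "title": map_entry["title"],
        "address": yellow_entry["address"],      # detailed address
        "telephone": yellow_entry["telephone"],  # from yellow page data
        "poi_id": map_entry["poi_id"],           # accurate position for display
    }


def cluster(map_entries, yellow_entries, cluster_position):
    """Cluster entries that share a title and a cluster position.

    cluster_position(address) is a hypothetical stand-in for geocoding an
    address; it must return a hashable value.
    """
    by_key = {}
    for m in map_entries:
        by_key[(m["title"], cluster_position(m["address"]))] = m
    merged = []
    for y in yellow_entries:
        m = by_key.get((y["title"], cluster_position(y["address"])))
        if m is not None:
            merged.append(merge_entries(m, y))
    return merged
```

Because both addresses pass through the same geocoding step, an address with and without a street number can land on the same cluster position even when the provider positions differ.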
  • Snippet highlighter 760 may highlight snippets in documents (e.g., web documents). Web document snippet highlighting may typically be accomplished using term offsets in the documents. However, a CJK document does not use spaces as delimiters, so a long paragraph of text may need to be segmented to obtain the terms. In order to highlight specific terms, the entire document may therefore need to be segmented to obtain the corresponding terms, which may be inefficient. Instead, snippet highlighter 760 may store byte offsets rather than term offsets to identify an address or telephone number (or some other business information) in a web document during indexing. Snippet highlighter 760 may use the byte offsets to perform highlighting, and no segmentation may be required. Although Fig. 7 shows snippet highlighter 760 as part of indexer 435, in other implementations, snippet highlighter 760 may be separate from indexer 435 and/or included in another component (e.g., within front end server 440).
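Byte-offset highlighting can be sketched as follows, assuming documents are stored as UTF-8. The function names and the `<b>` tags are illustrative; the point is that the span recorded at indexing time can be applied later without segmenting the text into terms.

```python
def find_byte_span(document, phrase):
    """Return the (start, end) byte offsets of phrase in the UTF-8 document.

    This stands in for the offsets that would be recorded during indexing.
    """
    data = document.encode("utf-8")
    needle = phrase.encode("utf-8")
    start = data.find(needle)
    if start < 0:
        return None
    return start, start + len(needle)


def highlight(document, span, open_tag="<b>", close_tag="</b>"):
    """Wrap the bytes in span with tags; no term segmentation is needed."""
    data = document.encode("utf-8")
    start, end = span
    marked = (data[:start] + open_tag.encode("utf-8") + data[start:end]
              + close_tag.encode("utf-8") + data[end:])
    return marked.decode("utf-8")
```

For a document such as "北京大学电话010-62752114", the telephone number starts at byte 18 because each of the six preceding characters occupies three bytes in UTF-8.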
  • Although Fig. 7 shows exemplary tasks performed by indexer 435, in other implementations, indexer 435 may perform additional or different tasks that may be used to identify local search results. Furthermore, although Fig. 7 shows the components of indexer 435 as interconnected, in other implementations, the components of indexer 435 may be separate, non-interconnected components.
  • Front end server 440 may perform a variety of tasks to aid in the identification of local search results.
  • front end server 440 may include a query rewriter 800, a local search generator 810, a map generator 820, a geographical information generator 830, etc.
  • Although Fig. 8 shows exemplary tasks performed by front end server 440, in other implementations, front end server 440 may perform additional or different tasks that may be used to aid in the identification of local search results.
  • query rewriter 800 may perform a variety of tasks. For example, since the number of web document clusters may be much smaller for Chinese data compared to English data, a match in a title or category may be used to return valid results.
  • query rewriter 800 may generate a list of synonyms for each of the categories.
  • Query rewriter 800 may also rewrite each local search query to expand the query to a couple of search terms that may be joined by an "OR" operand.
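Synonym expansion joined by "OR" can be sketched as follows. The synonym table, the parenthesized output syntax, and the function name are assumptions; the description states only that a query may be expanded to several terms joined by "OR".

```python
def rewrite_query(query, synonyms):
    """Expand a query term into an OR of the term and its synonyms.

    synonyms maps a category term to a list of alternative terms
    (a hypothetical, hand-built table).
    """
    terms = [query]
    for alternative in synonyms.get(query, []):
        if alternative not in terms:
            terms.append(alternative)
    if len(terms) == 1:
        return query  # nothing to expand
    return "(" + " OR ".join(terms) + ")"
```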
  • Local search generator 810 may generate local search results 450.
  • local search generator 810 may generate results corresponding to relevant web pages and/or business listings within a specific geographic area, based on the address information obtained and processed by local search system 225.
  • Local search results 450 may be displayed (e.g., to the user who input the local search query) on a display (e.g., output device 370).
  • Map generator 820 may generate a map relating to local search results 450. For example, in order to display the map for the results, map generator 820 may construct an iframe (i.e., a floating frame inserted within a web page) on the search result page, and may execute a post action to a URL provided by the map provider.
  • the parameters in the post may contain the following fields for each search result: a title; an address; a telephone number; POI ID (for displaying the point on the map); and accuracy.
  • Locations that may be geocoded to a building level may be marked as accurate on the map by map generator 820.
  • Locations that may be geocoded to a street level (or some higher level of abstraction) may be marked as estimated on the map by map generator 820.
  • accurate locations 920 may be identified by one marker color (e.g., green) and estimated locations 930 may be identified by another marker color (e.g., red).
  • While two levels of accuracy are shown in Fig. 9, additional levels of accuracy may be used in other implementations consistent with principles of the invention.
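The per-result fields sent to the map provider, together with the two accuracy levels, can be sketched as follows. The field keys, the "geocode_level" input, and the function names are hypothetical; the field list and the exemplary green and red markers follow the description.

```python
# Exemplary marker colors for the two accuracy levels.
MARKER_COLORS = {"accurate": "green", "estimated": "red"}


def accuracy_for(geocode_level):
    """Building-level geocodes are accurate; coarser levels are estimated."""
    return "accurate" if geocode_level == "building" else "estimated"


def post_fields(result):
    """Build the fields posted to the map provider for one search result."""
    return {
        "title": result["title"],
        "address": result["address"],
        "telephone": result["telephone"],
        "poi_id": result["poi_id"],  # identifies the point on the map
        "accuracy": accuracy_for(result["geocode_level"]),
    }
```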
  • the fingerprint of the address (e.g., address FP 540) may be stored together with its associated accuracy (e.g., FP accuracy 550).
  • the mapping between the address fingerprint and the POI ID (e.g., address FP/POI ID mapping information 425) may be used by map generator 820 to lookup the POI ID for each search result.
  • the POI ID may be used by map generator 820 to identify a position on the map (e.g., map 900) for the map provider so that the map provider may show the position on the map provided within the result page.
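The fingerprint-to-POI-ID lookup can be sketched as a dictionary keyed by an address fingerprint. The fingerprint function here (whitespace removal followed by a truncated SHA-1 digest) is an assumption chosen for illustration; the description does not specify how fingerprints are computed.

```python
import hashlib


def address_fingerprint(address):
    """Return a short, stable fingerprint of an address (assumed scheme)."""
    normalized = "".join(address.split())  # drop all whitespace
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:16]


def build_mapping(records):
    """records is an iterable of (address, poi_id, accuracy) triples."""
    return {
        address_fingerprint(address): (poi_id, accuracy)
        for address, poi_id, accuracy in records
    }


def lookup_poi(mapping, address):
    """Return (poi_id, accuracy) for an address, or None if unknown."""
    return mapping.get(address_fingerprint(address))
```

Storing the accuracy next to the POI ID lets one lookup supply both the map position and the marker style for a search result.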
  • geographical information generator 830 may provide geographical information to ads server 840 in a variety of ways.
  • each province or self-administered city may have a two digit code, which may be the second part of the region code defined by ISO-3166-2.
  • Geocoder 430 may index this code (e.g., "CN-dd") with each address and pass it to front end server 440 for every successfully geocoded address.
  • Geographical information generator 830 may send the code to ads server 840 as a geo-region-code.
  • advertisers may bid on keyword pairs of <keyword, location>, instead of using geo-targeting because, for web searching, geo-targeting may not work as well in Asian countries as it does in the United States.
  • the inventory for ads may be larger for such keyword pairs.
  • geographical information generator 830 may concatenate keywords with the locations entered by the search query, and may use these concatenations as the keywords sent to ads server 840. For example, when a user searches for "restaurant” near "Beijing," the keyword sent to ads server 840 by geographical information generator 830 may be "restaurant Beijing.” Such an arrangement may be provided for both Japan and China.
  • geographical information generator 830 may determine whether the search query includes a geographic reference. If the search query does not include a geographic reference, then regular advertisements may be presented by ads server 840. However, it may be determined whether an indicator of the user's location, such as the user's IP address, is available. When an indicator of the user's location is available, then local advertisements may be presented based on the user's location.
  • geographical information generator 830 may determine whether the geographic reference corresponds to a city name alone (i.e., without any other geographic information, such as no province information). If the search query includes a geographic reference other than a city name alone, then local advertisements may be presented.
  • geographical information generator 830 may determine whether the city corresponds to an unambiguous city. If the city does not correspond to an unambiguous city, then regular advertisements may be presented. If the city corresponds to an unambiguous city, then geographical information generator 830 may determine whether the city name with one or more other search terms of the query appear on a blacklist. A blacklist may be maintained for unambiguous city names that, when combined with one or more words, mean something other than their respective cities. If the city name with one or more other search terms of the query appears on the blacklist, then regular advertisements may be presented. If the city name with one or more other search terms of the query does not appear on the blacklist, then local advertisements may be presented based on the geographic reference of the query.
  • information concerning the user's location may be used by geographical information generator 830 to determine whether that location is within a predetermined distance of the location corresponding to the geographic reference. If the user's location is within the predetermined distance, then local advertisements may be presented. If the user's location is outside the predetermined distance, however, then regular advertisements may be presented.
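The choice between regular and local advertisements can be sketched as a short decision function. The return labels, the parameter names, and the representation of the blacklist as (city, term) pairs are assumptions, and the optional check of the user's distance from the referenced location is left out to keep the sketch short.

```python
def choose_ads(geo_ref, city_only, query_terms, unambiguous_cities, blacklist,
               user_location_known=False):
    """Return "regular", "local:user", or "local:query".

    geo_ref: geographic reference found in the query, or None.
    city_only: True when geo_ref is a city name with no other geography.
    blacklist: set of (city, term) pairs that mean something other than
        the city (hypothetical representation).
    """
    if geo_ref is None:
        # No geography in the query: fall back to the user's location.
        return "local:user" if user_location_known else "regular"
    if not city_only:
        return "local:query"
    if geo_ref not in unambiguous_cities:
        return "regular"
    if any((geo_ref, term) in blacklist for term in query_terms):
        return "regular"
    return "local:query"
```

As a made-up example, a blacklist entry pairing "Beijing" with "duck" would keep a query for the dish from triggering Beijing-specific local advertisements.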
  • Front end server 440 may perform some tasks that may be specific to China. For example, front end server 440 may hide driving directions, provide a display unit
  • Figs. 10A-10D are flowcharts of an exemplary process for identifying local search results and providing a map associated with identified locations.
  • a process 1000 for identifying local search results and providing a map associated with identified locations may begin with receipt of yellow page, map provider, and document data (block 1005).
  • map provider data 400, yellow page data 405, and web document data 410 may be provided to indexer 435, and map provider data 400 may further be used to derive address/POI ID information 420 and address FP/POI ID mapping information 425
  • Process 1000 may perform geocoding on the data (block 1010). For example, in one implementation described above in connection with Fig.
  • address list generator 600 of geocoder 430 may receive map provider data 400 and yellow page data 405, and may generate an address list based on the provinces, cities, districts, street names, addresses, POI names, or combinations thereof from map provider data 400 and/or yellow page data 405. Parser 610 of geocoder 430 may attempt to parse the city and/or district from the beginning of each address. Longest matching prefix locator 620 of geocoder 430 may be used to locate further portions of each address. Combination locator 630 of geocoder 430 may locate particular combinations so that POI names may be located first, then various levels of the addresses may be located, and, finally, districts or cities may be located.
  • Query geocoding unit 640 of geocoder 430 may compute the score of a search result based on the number of points in the search result location.
  • process 1000 may perform indexing on the data (block 1015).
  • map data indexer 700 of indexer 435 may index address data from map provider data 400, which may include map position information.
  • Yellow page data preprocessor 710 of indexer 435 may receive yellow page data 405 from the different providers, and may preprocess yellow page data 405 to a common format.
  • Business information extractor 720 of indexer 435 may extract business information (e.g., addresses and telephone numbers) from web document data 410 using a variety of techniques.
  • the techniques may be modified based on the different formatting of addresses and telephone numbers in China and Japan.
  • the extracted business information may be provided in business information repository 750.
  • Distance flattener 730 of indexer 435 may set a search radius for a local search query.
  • Clusterer 740 of indexer 435 may cluster map provider data 400 and yellow page data 405 together.
  • Snippet highlighter 760 of indexer 435 may highlight snippets in documents (e.g., web documents).
  • Process 1000 may generate local search results and may provide a map URL (block 1020).
  • local search generator 810 of front end server 440 may generate local search results 450 (e.g., relevant web pages and/or business listings within a specific geographic area, based on the address information obtained and processed by local search system 225).
  • Map generator 820 of front end server 440 may generate a map relating to local search results 450.
  • map generator 820 may construct an iframe (i.e., a floating frame inserted within a web page) on the search result page, and may execute a post action to a URL provided by the map provider.
  • Process block 1010 (Fig. 10A) of process 1000 may include the blocks shown in Fig. 10B.
  • process block 1010 may generate an address list (block 1025).
  • address list generator 600 of geocoder 430 may receive map provider data 400 and yellow page data 405, and may generate an address list (tokens) based on the provinces, cities, districts, street names, addresses, POI names, or combinations thereof from map provider data 400 and/or yellow page data 405.
  • Process block 1010 may parse each address in the address list (block 1030). For example, in one implementation described above in connection with Fig. 6, for each address, parser 610 of geocoder 430 may attempt to parse the city and the district from the address.
  • process block 1010 may locate the longest matching prefixes from each address in the address list (block 1035). For example, in one implementation described above in connection with Fig. 6, longest matching prefix locator 620 of geocoder 430 may be used to locate further portions of the address (e.g., a town, a village, a road or street, number, POI names, etc.). If the longest matching prefixes are not located in the address (block 1040 - NO), then process block 1010 may continue to locate further portions of the address (block 1035).
  • process block 1010 may locate combinations in each address (block 1045). For example, in one implementation described above in connection with Fig. 6, combination locator 630 of geocoder 430 may locate particular combinations so that POI names may be located first, then various levels of the addresses may be located, and, finally, districts or cities may be located. Such a locating arrangement may guarantee or verify that the most specific possible address may be obtained by combination locator 630.
  • Process block 1015 (Fig. 10A) of process 1000 may include the blocks shown in Fig. 10C.
  • process block 1015 may preprocess the yellow page data (block 1050).
  • yellow page data preprocessor 710 of indexer 435 may receive yellow page data 405 from the different providers, and may preprocess yellow page data 405 to a common (or predetermined) format. This formatted data may be provided to geocoder 430 during indexing, and geocoder 430 may attempt to geocode the address. If the address may be geocoded to building or street level, the address may be indexed as a normal entry. If the address may be geocoded to city or district level, the address may be indexed as an entry with an approximate position.
  • Process block 1015 may extract business information from documents (block 1055) and may store the business information (block 1060).
  • business information extractor 720 of indexer 435 may extract business information (e.g., addresses and telephone numbers) from web document data 410 using a variety of techniques. The techniques may be modified based on the different formatting of addresses and telephone numbers in China and Japan.
  • the extracted business information may be provided in business information repository 750.
  • Business information repository 750 may include a variety of information, e.g., the documents from which business information has been extracted by business information extractor 720.
  • Process block 1015 may index address data received from a map provider (block 1065).
  • map data indexer 700 of indexer 435 may index address data from map provider data 400, which may include map position information.
  • Geocoder 430 may attempt to geocode the address again based on the indexed address data, and clusterer 740 may set the cluster position if the cluster position is different from the given position, but only if the cluster position is within a predetermined distance threshold (e.g., three kilometers) of the given position.
  • process block 1015 may cluster yellow page data and map provider data (block 1070). For example, in one implementation described above in connection with Fig.
  • clusterer 740 of indexer 435 may cluster map provider data 400 and yellow page data 405 together. If these entries are clustered together by clusterer 740, then front end server 440 may be able to provide the user with detailed address and telephone number information, as well as an accurate position on a map.
  • Process block 1015 may highlight snippets provided in documents (block 1075). For example, in one implementation described above in connection with Fig. 7, snippet highlighter 760 of indexer 435 may highlight snippets in documents (e.g., web documents). Snippet highlighter 760 may store byte offsets instead of term offsets to identify an address or telephone number (or some other business information) in a web document during indexing. Snippet highlighter 760 may use the byte offset to perform highlighting and no segmentation may be required.
  • process block 1015 may set a search distance (block 1080).
  • distance flattener 730 of indexer 435 may set a search radius for a local search query geocoded by geocoder 430 to a particular location.
  • Each location may be a point location (e.g., buildings, famous tourist places, schools, street centers, etc.) or a bound location (e.g., districts, cities, provinces, etc.).
  • If the location is a point location, distance flattener 730 may set the search radius to a predetermined distance (e.g., approximately five kilometers around the point).
  • If the location is a bound location, distance flattener 730 may set the search radius to approximately the maximum distance from a centroid of the location to the corners.
Exemplary Front End Server Process
  • Process block 1020 (Fig. 10A) of process 1000 may include the blocks shown in Fig. 10D.
  • process block 1020 may rewrite a search query (block 1085).
  • query rewriter 800 of front end server 440 may generate a list of synonyms for each of the categories.
  • Query rewriter 800 may also rewrite each local search query to expand the query to a couple of search terms that may be joined by an "OR" operand.
  • Process block 1020 may generate local search results based on the search query, and may generate a map showing location(s) of the search results (block 1090).
  • local search generator 810 of front end server 440 may generate local search results (e.g., relevant web pages and/or business listings within a specific geographic area, based on the address information obtained and processed by local search system 225).
  • Map generator 820 of front end server 440 may generate a map relating to local search results 450.
  • map generator 820 may construct an iframe (i.e., a floating frame inserted within a web page) on the search result page, and may execute a post action to a URL provided by the map provider.
  • the POI ID may be used by map generator 820 to identify a position on the map (e.g., map 900) for the map provider so that the map provider may show the position on the map provided within the result page.
  • process block 1020 may generate geographical information for an ads server (block 1095).
  • geographical information generator 830 of front end server 440 may provide geographical information to ads server 840.
  • geographical information generator 830 may send the region code to ads server 840 as a geo-region-code.
  • geographical information generator 830 may concatenate keywords with the locations entered by the user input search query, and may use these concatenations as the keywords sent to ads server 840.
  • CONCLUSION Implementations described herein may provide systems and methods for identifying local search results and generating a map associated with identified locations.
  • the system may receive a local search query input by a user, and may identify a location associated with the local search query.
  • the system may identify a set of local search results that may be related to the local search query and may be associated with the identified location.
  • the local search results may include links to documents that may be related to the local search query.
  • the system may identify an identifier for a group of local search results, and may provide the identifier to a map provider.
  • the system may receive a map associated with the identified location from the map provider.
  • the map may identify a position of at least one search result in the group of local search results.
  • map data and yellow page data may be utilized from several different providers to identify local search results and generate a map associated with identified locations.
  • the map may be conveniently displayed with the local search results.
  • Such an arrangement avoids generation of local search results and a pointer to a third-party map provider's server.
  • the map may provide detailed map data based on the yellow page data. This may make it possible to generate a map that includes detailed map data in countries where export restrictions may limit the availability of detailed map data to render the map or may limit the availability of the actual latitude and longitude of addresses within the area.
  • server 220 may perform most, if not all, of the acts described with regard to the processing of Figs. 10A-10D.
  • one or more, or all, of the acts may be performed by another entity, such as another server 230 and/or 240 or client 210.
  • geocoder 430 may attempt to locate the closest point for an address to be geocoded. For example, suppose that the points "1 ABC Street" and "10 ABC Street" are identified by the map provider. When trying to geocode the address "3 ABC Street," geocoder 430 may return the location of "1 ABC Street," which is the closest point to "3 ABC Street." In another alternative approach, geocoder 430 may attempt to interpolate a point.
  • geocoder 430 may determine that the address of "3 ABC Street” is at grid index (3, 6), based upon interpolation.
  • the POI ID may be stored with the location data from the map provider.
  • the POI IDs may be returned by front end server 440 during serving time.
  • the POI IDs may change in different versions of the map provider data. Storing the POI ID in the index makes the index dependent upon the data from the map provider.
  • the addresses of the search results may be geocoded during serving time and geocoder 430 may be requested to provide the closest matching points.
  • Geocoder 430 may return the POI IDs of the points.
  • the requests for the closest matching points may be sent as batches (e.g., batches of ten) of geocoding requests so the performance impact may be small.
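
The closest-point approach and the batching of geocoding requests described in the bullets above can be illustrated with a minimal Python sketch. The table of known points, the POI ID strings, and the function names are hypothetical and are not taken from the specification; the sketch also assumes that "closest" can be approximated by the difference between street numbers on the same street.

```python
import re
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical map-provider points for one street: street number -> POI ID.
KNOWN_POINTS: Dict[str, Dict[int, str]] = {
    "ABC Street": {1: "poi-001", 10: "poi-010"},
}


def split_address(address: str) -> Optional[Tuple[int, str]]:
    """Split '3 ABC Street' into (3, 'ABC Street'); None without a leading number."""
    match = re.match(r"\s*(\d+)\s+(.+?)\s*$", address)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def closest_point(address: str) -> Optional[str]:
    """Return the POI ID of the known point with the nearest street number."""
    parsed = split_address(address)
    if parsed is None:
        return None
    number, street = parsed
    points = KNOWN_POINTS.get(street)
    if not points:
        return None
    # Ties are broken in favor of the lower street number (an assumption).
    nearest = min(points, key=lambda known: (abs(known - number), known))
    return points[nearest]


def batches(items: List[str], size: int = 10) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` geocoding requests."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    print(closest_point("3 ABC Street"))
    for batch in batches(["%d ABC Street" % n for n in range(1, 26)]):
        print(len(batch), [closest_point(a) for a in batch][:3])
```

Resolving POI IDs at serving time in this way keeps the index independent of a particular version of the map provider data, at the cost of one lookup per batch of results.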

Abstract

A system receives yellow page data, map provider data, and document data in response to a local search query, and geocodes the data to assign a geographic identifier and to match at least one address associated with the local search query. The system also indexes the geocoded data to determine business information and location information associated with the local search query. The system further provides local search results and a map based on the indexed data (Figure 1).

Description

LOCAL SEARCH
BACKGROUND
Implementations described herein relate generally to information retrieval, and, more particularly, to identifying local search results. The World Wide Web ("web") contains a vast amount of information. Locating a desired portion of the information, however, may be challenging. This problem may be compounded because the amount of information on the web and the number of new users inexperienced at web searching are growing rapidly.
Search systems attempt to return hyperlinks to web pages in which a user is interested. Generally, search systems base their determination of the user's interest on search terms (called a search query) entered by the user. The goal of the search system may be to provide links to high quality, relevant results (e.g., web pages) to the user based on the search query. Typically, the search system accomplishes this by matching the terms in the search query to a corpus of pre-stored web pages. Web pages that contain the user's search terms may be referred to as "hits" or "search results" and may be returned to the user as links.
Local search systems attempt to return relevant web pages and/or business listings within a specific geographic area. In some countries, detailed address information is not available for some businesses. As a result, performing local searches may be difficult.
SUMMARY
According to one aspect, a method may include receiving yellow page data, third-party map provider data, and document data in response to a local search query, and geocoding at least one of the yellow page data, the third-party map provider data, and the document data to assign a geographic identifier to and to match at least one address within the local search query. The method may also include indexing the geocoded data to identify business information and location information corresponding to the local search query, and providing local search results and a third-party map based on the identified business information and location information.
According to another aspect, a method of geocoding based on a local search query may include receiving third party map provider data and yellow page data, generating an address based on the local search query, parsing the address, locating longest matching prefixes in the address to identify at least one portion of the address, and locating a combination in the address to verify the address.
According to yet another aspect, a method of indexing based on a local search query may include preprocessing yellow page data to a predetermined format, extracting business information from document data, storing the business information in a repository, and indexing address information from third party map provider data. The method may also include clustering the yellow page data and the third party map provider data, and highlighting snippets in the document data. For a point location, the method may include setting a search distance for the local search query to a predetermined distance. For a bound location, the method may further include setting the search distance to approximately a maximum distance from a centroid of the bound location to corners of the bound location.
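The search-distance rule for point and bound locations may be sketched as follows. This is only an illustration: the `Location` class, the planar kilometer coordinates, and the function name are assumptions, and the five-kilometer default is the example value given elsewhere in this description for point locations.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in kilometers; planar coordinates are assumed

POINT_RADIUS_KM = 5.0  # example predetermined distance for point locations


@dataclass
class Location:
    centroid: Point
    corners: Optional[List[Point]] = None  # None for a point location


def search_radius_km(location: Location) -> float:
    """Return the search radius for a geocoded location."""
    if not location.corners:
        # Point location (building, school, street center): fixed radius.
        return POINT_RADIUS_KM
    # Bound location (district, city, province): farthest corner from the centroid.
    return max(math.dist(location.centroid, corner) for corner in location.corners)


if __name__ == "__main__":
    school = Location(centroid=(2.0, 3.0))
    district = Location(
        centroid=(0.0, 0.0),
        corners=[(6.0, 8.0), (-6.0, 8.0), (6.0, -8.0), (-6.0, -8.0)],
    )
    print(search_radius_km(school), search_radius_km(district))
```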
According to a further aspect, a method may include generating a list of synonyms related to a local search query, rewriting the local search query to expand the query, generating local search results based on the expanded query, obtaining a map from a third party map provider based on the local search results, and generating advertisements based on geographical information related to the local search results.
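As a rough illustration of synonym-based query expansion, the following Python sketch replaces each term that has known synonyms with an OR-joined group. The synonym table, the parenthesized output syntax, and the function name are hypothetical; the specification does not define a concrete query syntax.

```python
from typing import Dict, List

# Hypothetical synonym lists per category term (not from the specification).
CATEGORY_SYNONYMS: Dict[str, List[str]] = {
    "restaurant": ["restaurant", "diner", "eatery"],
    "hotel": ["hotel", "inn"],
}


def rewrite_query(query: str) -> str:
    """Expand each term that has synonyms into an OR-joined group."""
    rewritten = []
    for term in query.split():
        synonyms = CATEGORY_SYNONYMS.get(term.lower())
        if synonyms:
            rewritten.append("(" + " OR ".join(synonyms) + ")")
        else:
            # Terms without synonyms (e.g., location names) pass through unchanged.
            rewritten.append(term)
    return " ".join(rewritten)


if __name__ == "__main__":
    print(rewrite_query("restaurant HaiDian"))
```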
According to another aspect, a system may include an indexer to receive third party map provider data, yellow page data, and document data, preprocess the yellow page data to determine business information, extract business information from the document data, identify location information in a local search query, and index address data of the third party map provider data. The system may also include a geocoder to receive information from the indexer, and assign geographic identifiers, and a front end server to receive information from the geocoder, rewrite the local search query, obtain a map from the third party map provider data, and generate local search results based on the local search query.
According to yet another aspect, a system may include means for receiving yellow page data, third-party map provider data, and document data in response to a local search query, and means for geocoding at least one of the yellow page data, the third-party map provider data, and the document data to assign a geographic identifier to and to match at least one address within the local search query. The system may also include means for indexing the geocoded data to identify business information and location information corresponding to the local search query, and means for providing local search results and a third-party map based on the identified business information and location information.
According to a further aspect, a system may include a memory to store a group of instructions, and a processor to execute instructions in the memory. The processor may identify a location associated with a local search query, identify local search results relevant to the local search query and associated with the identified location, identify an identifier for each of a group of the local search results, and receive from a third party map provider a map associated with the identified location, where the map identifies a position of at least one local search result in the group of local search results.
According to a still further aspect, a method may include receiving a local search query, identifying a location associated with the local search query, identifying a set of search results relevant to the local search query and associated with the identified location, and identifying an identifier for each of a group of the search results. The method may also include providing the identifier for each of the group of the search results to a third party map provider, and receiving from the third party map provider a map associated with the identified location, where the map identifies a position of at least one search result in the group of search results.
According to another aspect, a method may include generating a list of tokens, identifying a potential address within a web document, and parsing the potential address from a beginning to determine whether the potential address includes a token associated with a city. The method may also include further parsing the potential address to determine whether the potential address includes a token associated with a district, identifying a longest-matching token in the potential address after the token associated with the city or the token associated with the district, and determining whether the potential address is an actual address based on the token associated with the city, the token associated with the district, and the identified longest-matching token.
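The token-based address check may be sketched in Python as shown below. The token lists are small hypothetical samples built from address examples in this description, and the final acceptance rule (a city or district token plus a street-level token) is an assumption made for illustration; the text says only that the decision is based on those three tokens.

```python
from typing import Optional, Sequence

# Hypothetical token lists; a real list would be generated from address data.
CITY_TOKENS = ["Beijing City", "ShenZhen City"]
DISTRICT_TOKENS = ["HaiDian District", "XiCheng District"]
STREET_TOKENS = ["YiHe", "YiHe Yuan Road", "BeiLiShi Road"]


def longest_prefix(text: str, tokens: Sequence[str]) -> Optional[str]:
    """Return the longest token that `text` starts with, or None."""
    matches = [token for token in tokens if text.startswith(token)]
    return max(matches, key=len) if matches else None


def is_actual_address(candidate: str) -> bool:
    """Parse from the beginning: city, then district, then longest street token."""
    rest = candidate.strip()
    city = longest_prefix(rest, CITY_TOKENS)
    if city:
        rest = rest[len(city):].lstrip()
    district = longest_prefix(rest, DISTRICT_TOKENS)
    if district:
        rest = rest[len(district):].lstrip()
    street = longest_prefix(rest, STREET_TOKENS)
    # Assumed rule: require a city or district token and a street-level token.
    return bool((city or district) and street)


if __name__ == "__main__":
    print(is_actual_address("Beijing City HaiDian District YiHe Yuan Road 5 Number"))
    print(is_actual_address("Beijing City is large"))
```

Choosing the longest match matters when one token is a prefix of another: here "YiHe Yuan Road" is preferred over the shorter "YiHe".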
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate one or more embodiments of the invention and, together with the description, explain aspects of the invention. In the drawings:
Fig. 1 is a diagram of an overview of an exemplary implementation described herein;
Fig. 2 is a diagram of an exemplary network in which systems and methods described herein may be implemented;
Fig. 3 is an exemplary diagram of a client or server within the exemplary network of Fig. 2;
Fig. 4 is a functional block diagram of an exemplary system for identifying local search results and providing a map associated with identified locations;
Fig. 5 is an exemplary diagram of an index/document repository of the exemplary system of Fig. 4;
Fig. 6 is an exemplary diagram of a geocoder of the exemplary system of Fig. 4;
Fig. 7 is an exemplary diagram of an indexer of the exemplary system of Fig. 4;
Fig. 8 is an exemplary diagram of a front end server of the exemplary system of Fig. 4;
Fig. 9 is a diagram of exemplary local search results and a map generated by the exemplary system of Fig. 4; and
Figs. 10A-10D is a flowchart of an exemplary process for identifying local search results and providing a map associated with identified locations.
DETAILED DESCRIPTION
The following detailed description of the invention refers to the accompanying drawings. The same reference numbers in different drawings may identify the same or similar elements. Also, the following detailed description does not limit the invention.
In some countries, like China, for example, map data and yellow page data may not be available from a single provider and must be obtained from several different providers. Due to export restrictions, it may not be possible to get detailed map data to render the map of an area or to get the actual latitude and longitude of addresses within the area. As a result, address approximation may be used for geocoding of addresses. When results from a local search are sent to a user, the local results page may include a list of relevant results and a pointer to a map provider's server (third party). The map provider may be responsible for generating the map displayed to the user.
OVERVIEW
Implementations described herein may identify local search results and generate a map associated with identified locations. For example, in one implementation, as shown in Fig. 1, a system may receive a local search query input by a user, and may identify a location associated with the local search query. The system may identify a set of local search results (e.g., search results "A" through "H") that may be related to the local search query and may be associated with the identified location. The local search results may include links to documents that may be related to the local search query. The system may identify an identifier for a group of local search results, and may provide the identifier to a map provider. The system may receive a map associated with the identified location from the map provider. The map may identify a position of at least one search result (e.g., search result "A" as shown in Fig. 1) in the group of local search results.
A "document," as the term is used herein, is to be broadly interpreted to include any machine-readable and machine-storable work product. A document may include, for example, an e-mail, a file, a combination of files, one or more files with embedded links to other files, a news group posting, a blog, a web advertisement, etc. In the context of the Internet, a common document is a web page. Web pages often include textual information and may include embedded information (such as meta information, images, hyperlinks, etc.) and/or embedded instructions (such as Javascript, etc.).
A "link," as the term is used herein, is to be broadly interpreted to include any reference to/from a document from/to another document or another part of the same document.
EXEMPLARY NETWORK CONFIGURATION
Fig. 2 is an exemplary diagram of a network 200 in which systems and methods described herein may be implemented. Network 200 may include multiple clients 210 connected to multiple servers 220-240 via a network 250. Two clients 210 and three servers 220-240 have been illustrated as connected to network 250 for simplicity. In practice, there may be more or fewer clients and servers. Also, in some instances, a client may perform one or more functions of a server and/or a server may perform one or more functions of a client.
Clients 210 may include client entities. An entity may be defined as a device, such as a personal computer, a wireless telephone, a personal digital assistant (PDA), a lap top, or another type of computation or communication device, a thread or process running on one of these devices, and/or an object executable by one of these devices. Servers 220-240 may include server entities that gather, process, search, and/or maintain documents.
In one implementation, server 220 may include a local search system 225 usable by clients 210. Server 220 may crawl a corpus of documents, index the documents, and store information associated with the documents in a repository of documents. Any combination of servers 220-240 may implement local search system 225 to identify local search results and provide a map associated with identified locations.
While servers 220-240 are shown as separate entities, it may be possible for one or more of servers 220-240 to perform one or more of the functions of another one or more of servers 220-240. For example, it may be possible that two or more of servers 220-240 are implemented as a single server. It may also be possible for a single one of servers 220-240 to be implemented as two or more separate (and possibly distributed) devices.
Network 250 may include a local area network (LAN), a wide area network (WAN), a telephone network, such as the Public Switched Telephone Network (PSTN) or a cellular network, an intranet, the Internet, or a combination of networks. Clients 210 and servers 220-240 may connect to network 250 via wired, wireless, and/or optical connections.
EXEMPLARY CLIENT/SERVER ARCHITECTURE
Fig. 3 is an exemplary diagram of a client or server entity (hereinafter called "client/server entity"), which may correspond to one or more of clients 210 and servers 220-240. The client/server entity may include a bus 310, a processor 320, a main memory 330, a read only memory (ROM) 340, a storage device 350, an input device 360, an output device 370, and a communication interface 380. Bus 310 may include a path that permits communication among the elements of the client/server entity.
Processor 320 may include a processor, microprocessor, or processing logic that may interpret and execute instructions. Main memory 330 may include a random access memory (RAM) or another type of dynamic storage device that may store information and instructions for execution by processor 320. ROM 340 may include a ROM device or another type of static storage device that may store static information and instructions for use by processor 320. Storage device 350 may include a magnetic and/or optical recording medium and its corresponding drive.
Input device 360 may include a mechanism that permits an operator to input information to the client/server entity, such as a keyboard, a mouse, a pen, voice recognition and/or biometric mechanisms, etc. Output device 370 may include a mechanism that outputs information to the operator, including a display, a printer, a speaker, etc. Communication interface 380 may include any transceiver-like mechanism that enables the client/server entity to communicate with other devices and/or systems. For example, communication interface 380 may include mechanisms for communicating with another device or system via a network, such as network 250.
The client/server entity may perform certain operations, as will be described in detail below. The client/server entity may perform these operations in response to processor 320 executing software instructions contained in a computer-readable medium, such as memory 330. A computer-readable medium may be defined as a physical or logical memory device and/or carrier wave.
The software instructions may be read into memory 330 from another computer-readable medium, such as data storage device 350, or from another device via communication interface 380. The software instructions contained in memory 330 may cause processor 320 to perform processes that will be described later. Alternatively, hardwired circuitry may be used in place of or in combination with software instructions to implement processes described herein. Thus, implementations described herein are not limited to any specific combination of hardware circuitry and software.
EXEMPLARY LOCAL SEARCH SYSTEM
Fig. 4 is a functional block diagram of an exemplary system (e.g., local search system 225) for identifying local search results and providing a map associated with identified locations. According to one implementation, one or more of the functions of system 225, as described below, may be performed by a server (e.g., server 220), a portion of server 220, or a combination of servers (e.g., servers 220-240). According to another implementation, one or more of these functions may be performed by an entity separate from server 220, such as a client (e.g., client 210), a computer associated with server 220 or one of servers 230 or 240.
As shown in Fig. 4, system 225 may include map provider data 400, yellow page data 405, web document data 410, an index/document repository 415, address and points of interest (POI) identification (ID) information 420, address fingerprint (FP) and POI ID mapping information 425, a geocoder 430, an indexer 435, a front end server 440, map Uniform Resource Locator (URL) information 445, and local search results 450. Map provider data 400, yellow page data 405, and web document data 410 may be provided to indexer 435, and map provider data 400 may further be used to derive address/POI ID information 420 and address FP and POI ID mapping information 425. Indexer 435 may connect to index/document repository 415 and geocoder 430, and address/POI ID information 420 may be provided to geocoder 430. Front end server 440 may connect to geocoder 430 and index/document repository 415, and may receive address FP and POI ID mapping information 425. Front end server 440 may generate map URL information 445 and local search results 450. System 225 may alternatively include other connections and/or component interrelations not shown in Fig. 4.
Generally, geocoder 430 may receive address/POI ID information 420 and/or information from indexer 435, may assign geographic identifiers (e.g., locations, coordinates, etc.) to objects, and may provide outputs to front end server 440. Indexer 435 may receive map provider data 400, yellow page data 405, and/or web document data 410, may preprocess yellow page data 405 to determine business information, extract business information from web document data 410, identify location information in a search query, index the address data of map provider data 400, and may provide outputs to index/document repository 415 and/or geocoder 430. Front end server 440 may receive address FP and POI ID mapping information 425, information from geocoder 430, and/or information from index/document repository 415, may rewrite the search query, and may generate map URL information 445, local search results 450, and/or geographical information for use by an advertisements (ads) server (not shown).
Map Provider Data
There may be two kinds of data for Chinese address information: map data and yellow page data. Map provider data 400 may include a variety of information. For example, in one implementation, a third party map provider may provide a set of POIs (e.g., businesses and other places of interest, such as museums, parks, hospitals, schools, etc.) and their addresses for cities in China. There may be associated POI ID information for each POI. Due to legal restrictions, the map provider may not provide latitude and longitude information for each POI. They may, however, divide a map into grids (e.g., three-hundred meter by three-hundred meter grids) and may provide a grid index for each POI and a program to compute the distance between the grids. Map provider data 400 may be updated periodically (e.g., daily, weekly, monthly, etc.), and, during each update, the POI ID may change.
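To make the grid-based representation concrete, the following Python sketch approximates the distance between two POIs from their grid indices alone. It is not the map provider's program: the (column, row) index layout, the straight-line distance between cell centers, and the function names are all assumptions; only the three-hundred meter cell size comes from the example above.

```python
import math
from typing import Tuple

GRID_SIZE_M = 300.0  # example cell size: three-hundred meter by three-hundred meter grids

GridIndex = Tuple[int, int]  # (column, row); this layout is an assumption


def grid_distance_m(a: GridIndex, b: GridIndex) -> float:
    """Approximate straight-line distance between the centers of two grid cells."""
    return GRID_SIZE_M * math.hypot(a[0] - b[0], a[1] - b[1])


def within_radius(a: GridIndex, b: GridIndex, radius_m: float) -> bool:
    """Check whether cell `b` lies within `radius_m` meters of cell `a`."""
    return grid_distance_m(a, b) <= radius_m


if __name__ == "__main__":
    print(grid_distance_m((0, 0), (3, 4)))
    print(within_radius((0, 0), (3, 4), 5000.0))
```

Because positions are known only to the nearest cell, any distance computed this way carries an error on the order of the cell size.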
In another implementation, map provider data 400 may include the following information: (1) a normal POI that may contain a business name, address, telephone number, and grid index; (2) a road POI that may contain a street name and grid index of the center of the street; and/or (3) a postal code POI that may contain postal codes and a grid index of the approximate center of the postal codes.
In still another implementation, there may be issues with map provider data 400. For example, map provider data 400 may contain one part of an address but may be missing another part of the address. The street number, for example, may be missing from map provider data 400 due to restrictions in China (e.g., only 20-30% of the POIs may have street numbers). As a result, map provider data 400 may include only the street name of the POI or some higher level of abstraction. Also, or alternatively, the telephone number may be missing for some of the POIs of map provider data 400. Also, or alternatively, the addresses provided by map provider data 400 may be improperly formatted.
Yellow Page Data
Yellow page data 405 may include a variety of information and may be received from multiple sources (e.g., third party providers). For example, in one implementation, yellow page data 405 may include address (e.g., county, city, province) and/or telephone number information for POIs, business names, email addresses, facsimile numbers, web site addresses, CEO names, business descriptions, SIC-style categories, types of businesses, etc. In another implementation, yellow page data 405 may not contain any latitude and longitude information. As a result, complete (or substantially complete) information regarding the POIs (e.g., address, telephone number, and map position information) may be deduced from map provider data 400 and yellow page data 405.
Web Document Data
Web document data 410 may include a variety of information. For example, in one implementation, web document data 410 may include an e-mail, a file, a combination of files, one or more files with embedded links to other files, a news group posting, a blog, a web advertisement, a web page (which may include textual information and may include embedded information (such as meta information, images, hyperlinks, etc.) and/or embedded instructions (such as Javascript, etc.)), business information (e.g., address, telephone number, etc.), etc. Web document data 410 may also include documents located based on a local search query.
Address/POI ID Information
Address/POI ID information 420 may include a variety of information. For example, in one implementation, address/POI ID information 420 may include addresses extracted from map provider data 400, POI IDs extracted from map provider data 400, etc. In another implementation, address/POI ID information 420 may include the following information regarding Chinese addresses: address data provided by map provider data 400 (e.g., addresses of POIs, street centers); POIs (e.g., schools, parks, buildings, hospitals, etc.); postal code centers (i.e., the center of an area covered by a postal code); additional cities that may not be included in the list provided by map provider data 400, etc.
Address FP and POI ID Mapping Information
Address FP and POI ID mapping information 425 may include a variety of information. For example, in one implementation, address FP/POI ID mapping information 425 may include a mapping between address fingerprints (FPs) and corresponding POI IDs, etc. The mapping between the address FPs and corresponding POI IDs may be used by front end server 440 to look up the POI ID for each local search result, as described in detail below. In an alternative implementation, POI IDs may be directly derived without mapping between address FPs and corresponding POI IDs. In such an alternative, information 425 may include the directly derived POI IDs.
Although exemplary information included in map provider data 400, yellow page data 405, web document data 410, address/POI ID information 420, and address FP/POI ID mapping information 425 has been described above, in other implementations, additional or different information about addresses that may be useful to identify local search results may be included in the exemplary information. Figs. 5-8 are exemplary diagrams of some components of system 225 of Fig. 4.
Index/Document Repository
Index/document repository 415 may be provided in a single storage device (e.g., main memory 330, ROM 340, and/or storage device 350). Index/document repository 415, as shown in Fig. 5, may store a variety of information related to documents, yellow page data 405, and/or map provider data 400. For example, in one implementation, index/document repository 415 may store the following information regarding Chinese addresses: address data provided by map provider data 400 (e.g., addresses of POIs, street centers); POIs (e.g., schools, parks, buildings, hospitals, etc.); postal code centers (i.e., the center of an area covered by a postal code); additional cities that may not be included in the list provided by map provider data 400; etc. In another implementation, as shown in Fig. 5, index/document repository 415 may include a [province] field 500, a [city] field 510, a [district] field 520, and a [street] field 530 (which may additionally or alternatively include an address field and/or a POI name field). The following examples may correspond to Chinese address information stored in index/document repository 415:
[Beijing City] [XiCheng District] [BeiLiShi Road];
[Beijing City] [HaiDian District] [YiHe Yuan Road 5 Number];
[Beijing City] [HaiDian District] [BeiJing University]; and
[JiangSu Province] [NanJing City] [GuLou District] [HanZhong Road].
In still another implementation, index/document repository 415 may include an address fingerprint (FP) field 540 and an FP accuracy field 550. Each address may be treated as a point by geocoder 430 with an address FP and FP accuracy (e.g., buildings may have a higher FP accuracy than street centers). An address FP may be a fingerprint generated based on the address (e.g., a hash value that may be generated based on the address) that may be used to lookup the POI ID for displaying the correct map. For example, an FP generator 570 may receive an address (e.g., [province] field 500, [city] field 510, [district] field 520, [street] field 530, etc.), and may generate an address FP (e.g., [FP] field 540) and a FP accuracy (e.g., [FP] accuracy field 550) based on the address. The address FP and FP accuracy may be used by a POI ID lookup table 580 to lookup the POI ID for displaying the correct map (e.g., with local search results). FP generator 570 and/or POI ID lookup table 580 may be provided in geocoder 430, indexer 435, or front end server 440.
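A minimal sketch of address fingerprinting and POI ID lookup is shown below. The normalization (trimming and lowercasing), the use of a truncated SHA-256 digest as the hash, the POI ID value, and the function names are all assumptions made for illustration; the text says only that the fingerprint may be a hash value generated based on the address.

```python
import hashlib
from typing import Dict, Optional, Sequence


def address_fingerprint(components: Sequence[str]) -> str:
    """Hash the normalized address components into a short hexadecimal fingerprint."""
    normalized = "|".join(part.strip().lower() for part in components if part.strip())
    # SHA-256 truncated to 16 hex characters is an arbitrary choice for this sketch.
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


# Hypothetical lookup table from address fingerprint to POI ID.
POI_ID_LOOKUP: Dict[str, str] = {
    address_fingerprint(["Beijing City", "HaiDian District", "BeiJing University"]): "poi-12345",
}


def lookup_poi_id(components: Sequence[str]) -> Optional[str]:
    """Return the POI ID for an address, or None if its fingerprint is unknown."""
    return POI_ID_LOOKUP.get(address_fingerprint(components))


if __name__ == "__main__":
    print(lookup_poi_id(["Beijing City", "HaiDian District", "BeiJing University"]))
    print(lookup_poi_id(["Beijing City", "XiCheng District", "BeiLiShi Road"]))
```

Keying the table by fingerprint rather than by raw address text means that superficially different spellings of the same address can resolve to the same POI ID, provided the normalization step maps them to the same string.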
In still a further implementation, index/document repository 415 may include a [zcode] field 560 associated with the address. An associated zcode may include a code like a postal code, which in the case of China may be computed from the six digit administrative code for districts in China. Locations, such as JiangSu Province, BeiJing City, BeiJing City XiCheng District, may be computed by geocoder 430 as bounds that contain all the point locations within them.
Variants of the addresses may also be determined and stored in index/document repository 415 to increase recall. For example, single character synonyms for the provinces and cities may be identified (e.g., Shanghai City -> Lu; HeBei Province -> Qi). Additionally or alternatively, portions of the addresses may be omitted (e.g., Shanghai City -> Shanghai; HaiDian District -> HaiDian; HuaiHai Central Road -> HuaiHai Road). Additionally or alternatively, synonyms may be included for famous places (e.g., Shanghai City Temple -> Old City Temple; LiuRong Temple -> LiuRong Ta).
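The generation of address variants may be sketched as follows. The suffix list, the synonym table, and the function name are hypothetical and are populated only with examples mentioned above; a production system would presumably operate on Chinese-language text rather than romanized names.

```python
from typing import Dict, List, Tuple

# Administrative suffixes that may be omitted (an assumed, incomplete list).
OPTIONAL_SUFFIXES: Tuple[str, ...] = (" City", " District", " Province")

# Hypothetical synonym table filled with examples from the description.
SYNONYMS: Dict[str, List[str]] = {
    "Shanghai City": ["Lu"],
    "HeBei Province": ["Qi"],
    "LiuRong Temple": ["LiuRong Ta"],
}


def address_variants(component: str) -> List[str]:
    """Return the component followed by shortened forms and known synonyms."""
    variants = [component]
    for suffix in OPTIONAL_SUFFIXES:
        if component.endswith(suffix):
            # Omit the administrative suffix, e.g. "HaiDian District" -> "HaiDian".
            variants.append(component[: -len(suffix)])
    variants.extend(SYNONYMS.get(component, []))
    return variants


if __name__ == "__main__":
    for name in ("Shanghai City", "HaiDian District", "HanZhong Road"):
        print(name, "->", address_variants(name))
```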
Although Fig. 5 shows exemplary information included in index/document repository 415, in other implementations, index/document repository 415 may include any information that may be useful to identify local search results.
Geocoder
Geocoder 430, as shown in Fig. 6, may perform a variety of tasks to aid in the identification of local search results. In one implementation, for example, geocoder 430 may assign geographic identifiers (e.g., locations, coordinates, etc.) to objects. For example, geocoder 430 may match addresses to addresses in index/document repository 415. Output from geocoder 430 may be used for several purposes. For example, yellow page data 405 may be parsed to determine business information (e.g., address, telephone number, email address, facsimile number, hours of operation, etc.) for POIs, business information may be extracted from web document data 410, and/or location information provided in a search query may be identified.
In order to index yellow page data 405, indexer 435 may need to determine the map position by looking at Chinese addresses. This may implicate a variety of issues. For example, accurate position and detailed address information for most of the addresses in China are not readily available. Chinese addresses do not have a well defined format and hierarchy. The possible components of a Chinese address may include city, district, town, village, road, street, street number, and building. Yellow page data 405 may be in free form (i.e., not conforming to any particular form or format), especially in rural areas, and for some POIs, there are no corresponding address components available. Exemplary Chinese addresses may include:
ShenZhen City HeGang Town AnLiang Village AnLiang Road 172 Number JingCheng Building;
ShenZhen City LongHua Town MingZhiMingLe Office Complex;
ShenZhen City FuHong Road ShiMao Square Block A 17 Floor; and
Beijing City HaiDian District SiJiQingXiaoFu.
Geocoder 430 may best match addresses in yellow page data 405 with addresses provided in map provider data 400, and obtain the most accurate map position possible. Geocoder 430 may include an address list generator 600, a parser 610, a longest matching prefix locator 620, a combination locator 630, a query geocoding unit 640, etc. The components of geocoder 430 may perform a number of tasks for each address (e.g., ShenZhen City HeGang Town AnLiang Village AnLiang Road 172 Number JingCheng Building).
Address list generator 600 may receive map provider data 400 and yellow page data 405, and may generate an address list (also referred to as tokens) based on the provinces, cities, districts, street names, addresses, POI names, or combinations thereof from map provider data 400 and/or yellow page data 405. For each address, parser 610 may attempt to parse the city from the beginning of the address. For example, "ShenZhen City" may be parsed by parser 610 from the address. Parser 610 may also attempt to parse the district from the address. For example, since a district is absent from the exemplary address described above, parser 610 may not be able to parse the district from the address.
Longest matching prefix locator 620 of geocoder 430 may be used to locate further portions of the address. For example, longest matching prefix locator 620 may attempt to locate the longest matching prefix (e.g., token) from the address list. This may fail if there is not a specific token in the address list (e.g., if "HeGang Town" were not in the address list). Longest matching prefix locator 620 may advance to the end of the word "Town" if it is present in the address list. If an entire token (e.g., "HeGang Town AnLiang Village AnLiang Road") is present in the address list, longest matching prefix locator 620 may identify the token as a good match (e.g., "HeGang Town" may be identified as a good match). Longest matching prefix locator 620 may try again to locate the longest matching prefix (e.g., token) from the address list. If this fails, then longest matching prefix locator 620 may advance to the end of the word "Village" if it is present (e.g., "AnLiang Village"). Longest matching prefix locator 620 may repeat the process again to locate the longest matching prefix (e.g., token) from the address list. This time, longest matching prefix locator 620 may attempt to match street names (e.g., "Road" or "Street"). Longest matching prefix locator 620 may advance to the end of the word(s) "Road" or "Street," if they are present (e.g., "AnLiang Road"). Longest matching prefix locator 620 may also advance to the end of the word "Number," if it is present (e.g., "172 Number"). Finally, longest matching prefix locator 620 may attempt to locate the longest matching prefix (e.g., token) from the POI names. This may provide matches for names of buildings, schools, parks, etc. (e.g., "JingCheng Building").
Combination locator 630 may locate particular combinations so that POI names may be located first, then various levels of the addresses may be located, and, finally, districts or cities may be located.
Such a locating arrangement may guarantee that the most specific address may be obtained by combination locator 630. For example, combination locator 630 may locate the following exemplary combinations: city + district + POI name; city + district + address located by longest matching prefix locator 620 (e.g., "Road" or "Street"); city + district + address located by longest matching prefix locator 620 (e.g., "Village"); city + district + address located by longest matching prefix locator 620 (e.g., "Town"); city + district; and/or city.
When a local search query is entered by a user, the location part of the search query may be sent to a map server owned by a map provider. If the location query contains Chinese, Japanese, or Korean (CJK) characters, the map server may send it to query geocoding unit 640 for geocoding. If the location entered by the user is ambiguous (e.g., when the score of the first result from the geocoder is less than twice the score of the second result), query geocoding unit 640 may present suggestions to the user. For example, if the location query is "History Museum," the query geocoding unit 640 may present the following suggestions to the user: "Do you want to look for TianJian City HeDong District History Museum, or ShangHai City PuDong District ShangHai History Museum?"
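Returning to the tokenization performed by longest matching prefix locator 620, the greedy matching might be sketched as follows. This is a simplification under stated assumptions: the romanized address is split on spaces (real Chinese addresses have no such delimiters), an unmatched word is simply skipped rather than advancing to a marker word such as "Town" or "Village", and the function name `longest_prefix_tokens` is hypothetical.

```python
def longest_prefix_tokens(address, known_tokens):
    """Greedily split an address into the longest tokens found in known_tokens."""
    words = address.split()
    matched = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so the most specific token wins.
        for j in range(len(words), i, -1):
            candidate = " ".join(words[i:j])
            if candidate in known_tokens:
                matched.append(candidate)
                i = j
                break
        else:
            # No known token starts here; skip one word (a simplification).
            i += 1
    return matched


address_list = {
    "ShenZhen City", "HeGang Town", "AnLiang Village",
    "AnLiang Road", "JingCheng Building",
}
tokens = longest_prefix_tokens(
    "ShenZhen City HeGang Town AnLiang Village AnLiang Road 172 Number JingCheng Building",
    address_list,
)
```

The resulting token list could then be tried in the combination order described above, from the most specific (POI name) down to the city alone.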
Query geocoding unit 640 may compute the score of a search result based on the number of points in the search result location. For example, if ChangChuan City ChaoYang District scores less than Beijing City ChaoYang District, then Beijing City ChaoYang District may be displayed when the user's query location is "ChaoYang District."
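The ambiguity test described above (the top score being less than twice the second score) might be sketched as follows. The function name `resolve_location` and the point counts in the usage example are hypothetical; the score is taken to be the number of points in each candidate location, as the text suggests.

```python
def resolve_location(candidates):
    """Pick a location from (name, score) pairs, or return suggestions if ambiguous.

    Returns (best_name, suggestions); exactly one of the two is non-empty
    unless there are no candidates at all.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if not ranked:
        return None, []
    # Ambiguous when the first score is less than twice the second score.
    if len(ranked) > 1 and ranked[0][1] < 2 * ranked[1][1]:
        return None, [name for name, _ in ranked]
    return ranked[0][0], []


best, suggestions = resolve_location([
    ("Beijing City ChaoYang District", 5000),    # hypothetical point counts
    ("ChangChuan City ChaoYang District", 900),
])
```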
Although Fig. 6 shows exemplary tasks performed by geocoder 430, in other implementations, geocoder 430 may perform additional or different tasks that may be used to identify local search results. Furthermore, although Fig. 6 shows the components of geocoder 430 as interconnected, in other implementations, the components of geocoder 430 may be separate, non-interconnected components.
Indexer
Indexer 435, as shown in Fig. 7, may perform a variety of tasks to aid in the identification of local search results. In one implementation, for example, indexer 435 may preprocess yellow page data 405 to determine business information (e.g., address, telephone number, email address, facsimile number, hours of operation, etc.) for POIs, may extract business information from web document data 410, and/or may identify location information in a search query. Indexer 435 may include a map data indexer 700, a yellow page data preprocessor 710, a business information extractor 720, a distance flattener 730, a clusterer 740, a business information repository 750, a snippet highlighter 760, etc.
Map data indexer 700 may index address data from map provider data 400, which may include map position information. Geocoder 430 may attempt to geocode the address again based on the indexed address data, and clusterer 740 (described below) may set the cluster position if the cluster position is different from the given position, but may not set the cluster position above a predetermined distance threshold (e.g., within three kilometers). Such an arrangement may be used for improving clustering, as described in detail below.
Yellow page data preprocessor 710 may receive yellow page data 405 from the different providers, and may preprocess yellow page data 405 to a common format. This formatted data may be provided to geocoder 430 during indexing, and geocoder 430 may attempt to geocode the address. If the address can be geocoded to building or street level, the address may be indexed as a normal entry. If the address can be geocoded to city or district level, the address may be indexed as an entry with an approximate position. During scoring, the entry may be treated as if it is at least twenty kilometers from the centroid (i.e., essentially having its score demoted). If the address cannot be geocoded, the address may be treated as an entry without a position. During indexing, if the entry without a position may be clustered with another entry (e.g., using its telephone number), the entry may be retained. Otherwise, the entry may be discarded.
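The handling of yellow page entries by geocoding level might be sketched as follows. The level names, return labels, and function names are assumptions chosen for illustration; only the twenty kilometer floor and the retain-or-discard rule come from the text.

```python
APPROX_MIN_DISTANCE_KM = 20.0  # floor applied to approximate entries during scoring


def classify_entry(geocode_level, clusters_with_other=False):
    """Classify an entry by the finest level at which its address was geocoded.

    geocode_level is one of "building", "street", "city", "district", or None
    (hypothetical labels). clusters_with_other says whether an entry without a
    position can be clustered with another entry, e.g. by telephone number.
    """
    if geocode_level in ("building", "street"):
        return "normal"
    if geocode_level in ("city", "district"):
        return "approximate"
    return "no_position" if clusters_with_other else "discard"


def scoring_distance_km(kind, measured_km):
    """Treat approximate entries as at least twenty kilometers from the centroid."""
    if kind == "approximate":
        return max(measured_km, APPROX_MIN_DISTANCE_KM)
    return measured_km
```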
Business information extractor 720 may extract business information (e.g., addresses and telephone numbers) from web document data 410 using a variety of techniques. The techniques may be modified based on the different formatting of addresses and telephone numbers in China and Japan.
For example, in a first technique, business information extractor 720 may include a classifier that may analyze documents with addresses to determine business information associated with the addresses based on a statistical model. The documents analyzed by the classifier may include documents with addresses for which there is no corresponding yellow page data 405 and/or documents with addresses for which there is possibly incorrect yellow page data 405. The functions performed by the classifier may differ based on whether the business information corresponds to business name (title) information or telephone number information. Yet other functions may be performed when the business information includes information other than business name or telephone number information. A business name (title) may be identified by analyzing terms near the address and determining the probability that each term is part of the title. A confidence score may be assigned to each candidate title that is identified.
A telephone number may be associated with an address by identifying a set of candidate telephone numbers in the document. The probability that each of the candidate telephone numbers is associated with the address may be determined, based on the statistical model, given the prediction regarding the preceding candidate telephone number and given a window of terms (e.g., looking at a predetermined number of terms to the left and/or right) around the candidate telephone number. Confidence scores may be assigned to the candidate telephone numbers based on their determined probabilities. Optionally, a best telephone number for the address may be determined. The telephone number may then be associated with the address to form or supplement a business listing.
In a second exemplary technique, business information extractor 720 may include a location extractor which may be included as part of a search engine. The location extractor may receive a search query and determine whether the search query includes a geographic reference. When the search query includes a geographic reference, the location extractor may separate the geographic reference from the search terms in the query and send them to a local search engine. When the search query does not include a geographic reference, the location extractor may forward the search terms to a web search engine that may include a traditional web search engine that returns a set of documents related to a search query. The local search engine may include a specialized search engine, such as a business listings search engine. In operation, the local search engine may receive the search terms and the geographic reference of a search query from the location extractor. The local search engine may identify a set of documents that match the search query (i.e., documents that contain the set of search terms of the search query) by comparing the search terms to documents in a document corpus relating to the geographic area associated with the geographic reference. The local search engine may score the identified documents, sort them based on their scores, and output them as a list of search results.
The location extractor may determine unambiguous addresses (e.g., cities) in a search query by setting a variable i equal to one, and performing a search for the name of a city for each city(i) in a list of cities. The number of search results for this search may be counted as countcity. A search may also be performed for the name of the city with the name of the corresponding province. The number of search results for this search may be counted as countcity/province.
It may then be determined whether countcity/province is at least X% (where X is a number greater than zero) of countcity. When countcity/province is at least X% of countcity, then the city may be considered an "unambiguous" city. An "unambiguous city" may refer to a city whose name can be used alone in a search query and it will be understood that the user intended the city and not something else. When countcity/province is not at least X% of countcity, then it may be determined whether there are any more cities on the list. If there are more cities on the list, then the variable i may be incremented by one and the next city in the list of cities may be evaluated.
The documents of the search results may be analyzed to identify any postal codes that they contain. The postal codes may be identified using a pattern matching technique and verified by comparing them to a list of postal codes. It may then be determined whether the postal codes correspond to postal codes associated with city(i). The number of documents that contain postal codes associated with city(i) may be counted as countpostal. It may be determined whether countpostal is at least X% (e.g., 5%) of countcity. When countpostal is at least X% of countcity, then the city may be considered an unambiguous city.
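The two threshold tests for an unambiguous city might be sketched as follows. The function name is hypothetical, and a single threshold X is assumed for both tests; the text gives 5% only as an example for the postal code test, so the default value is an assumption.

```python
def is_unambiguous_city(count_city, count_city_province, count_postal, threshold_pct=5.0):
    """Decide whether a city name alone can be taken to mean the city.

    count_city:          results for a search on the city name alone
    count_city_province: results for the city name together with its province
    count_postal:        results containing postal codes associated with the city
    threshold_pct:       X, as a percentage of count_city (assumed value)
    """
    if count_city <= 0:
        return False
    threshold = count_city * threshold_pct / 100.0
    # The city passes if either test meets the threshold.
    return count_city_province >= threshold or count_postal >= threshold
```

For example, with 1,000 results for the city name, 60 results that also mention the province would satisfy a 5% threshold of 50.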
In a third exemplary technique, business information extractor 720 may identify a geographically relevant document. A geographically relevant document, as used herein, may generally refer to any document that, in some manner, has been determined to have particular relevance to a geographical location. A business listing (e.g., a yellow page listing) may be considered a geographically relevant document that is relevant to the geographic region defined by the address of the business. Other documents, such as web documents, may also have particular geographical relevance. For example, a business may have a home page, may be the subject of a document that comments on or reviews the business, or may be mentioned by a web page that in some other way relates to the business. The particular geographic location with which a document is associated may be determined in a number of ways, such as from the postal address or from other geographic signals.
The geographic region associated with the geographically relevant document may be mapped to a corresponding location identifier. Additional location identifiers may be determined for the document. In particular, location identifiers corresponding to surrounding regions within a predetermined range may also be determined. Each geographically relevant document may be indexed as if the document included the location identifiers associated with the document's region as well as the identified surrounding regions.
Although Fig. 7 shows business information extractor 720 as part of indexer 435, in other implementations, business information extractor 720 may be separate from indexer 435. To ensure that the set of composite documents may be available at indexing, the information extracted by business information extractor 720 may be provided in business information repository 750. Business information repository 750 may include a variety of information, e.g., the documents from which business information has been extracted by business information extractor 720. Business information repository 750, together with the extracted business information, may be provided within indexer 435. Although Fig. 7 shows business information repository 750 as part of indexer 435, in other implementations, business information repository 750 may be separate from indexer 435.
Distance flattener 730 may set a search radius or distance for a local search query. For example, in one implementation, each local search query may be geocoded by geocoder 430 to a particular location. Each location may be a point location (e.g., buildings, famous tourist places, schools, street centers, etc.) or a bound location (e.g., districts, cities, provinces, etc.). For point locations, distance flattener 730 may set the search radius to a predetermined distance (e.g., approximately five kilometers around the point). For bound locations, distance flattener 730 may set the search radius to approximately the maximum distance from a centroid of the location to the corners. Scores of search results in a zcode set associated with the location (i.e., the set of zcodes making up the location) may be promoted. In this way, when a user searches near a district name, the top results may be within that district.
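The radius selection might be sketched as follows. To keep the example short, coordinates are assumed to be kilometers on a local planar projection rather than latitude and longitude, and the dictionary layout and function name are hypothetical.

```python
import math

POINT_RADIUS_KM = 5.0  # the approximate five kilometer example from the text


def search_radius_km(location):
    """Return the search radius for a geocoded location.

    location is a dict with "kind" ("point" or "bound"); bound locations also
    carry "centroid" (x, y) and "corners" [(x, y), ...] in planar kilometers.
    """
    if location["kind"] == "point":
        return POINT_RADIUS_KM
    cx, cy = location["centroid"]
    # Maximum distance from the centroid to any corner of the bound.
    return max(math.hypot(x - cx, y - cy) for x, y in location["corners"])


district = {
    "kind": "bound",
    "centroid": (0.0, 0.0),
    "corners": [(3.0, 4.0), (-3.0, 4.0), (-3.0, -4.0), (3.0, -4.0)],
}
```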
Clusterer 740 may cluster map provider data 400 and yellow page data 405 together. Much of map provider data 400 may include accurate position information so that it may be displayed accurately on a map, but may not include detailed address or telephone number information (e.g., it might include an address without a street number and/or may be missing a telephone number). On the other hand, yellow page data 405 may include detailed address and telephone number information, but may not include accurate position information. For example, map provider data 400 may include a source (e.g., map provider), a title (e.g., "Beijing University"), an address (e.g., "Beijing City HaiDian District YiHe Yuan Road"), and/or a POI ID (for map display) (e.g., "A1234567"). Yellow page data 405, by contrast, may include a source (e.g., yellow page data provider), a title (e.g., "Beijing University"), an address (e.g., "Beijing City HaiDian District YiHe Yuan Road 5 Number"), and/or a telephone number (e.g., "010-62752114"). If these entries are clustered together by clusterer 740, then front end server 440 may be able to provide the user with detailed address and telephone number information, as well as an accurate position on a map.
The position obtained from geocoding (e.g., with geocoder 430) the address from yellow page data 405 may be an approximation and may be far away from the accurate position provided by map provider data 400. In this case, the same business from two providers may be located in much different neighborhoods, and thus may not be clustered together by clusterer 740. The solution to this may include geocoding (e.g., with geocoder 430) the address from map provider data 400 to a cluster position. The cluster position may be used for neighborhood generation as well as for clustering by clusterer 740. The actual position provided by map provider data 400 may then be used for map display.
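The separation between cluster position and display position might be sketched as follows. The matching rule (equal titles and cluster positions within `max_km`), the field names, the planar kilometer coordinates, and the one kilometer default are all assumptions; the text does not specify how clusterer 740 decides that two entries belong together.

```python
import math


def should_cluster(map_entry, yp_entry, max_km=1.0):
    """Cluster two listings when titles match and their cluster positions are close."""
    if map_entry["title"] != yp_entry["title"]:
        return False
    (x1, y1), (x2, y2) = map_entry["cluster_pos"], yp_entry["cluster_pos"]
    return math.hypot(x1 - x2, y1 - y2) <= max_km


def merge(map_entry, yp_entry):
    """Combine the detailed yellow page fields with the map provider's position."""
    return {
        "title": map_entry["title"],
        # Keep the more detailed (longer) of the two addresses.
        "address": max(map_entry["address"], yp_entry["address"], key=len),
        "telephone": yp_entry.get("telephone"),
        "poi_id": map_entry.get("poi_id"),
        # The map provider's actual position is what gets displayed.
        "display_pos": map_entry["actual_pos"],
    }


map_entry = {
    "title": "Beijing University",
    "address": "Beijing City HaiDian District YiHe Yuan Road",
    "poi_id": "A1234567",
    "actual_pos": (12.4, 7.9),   # accurate position from the map provider
    "cluster_pos": (12.0, 8.0),  # position obtained by geocoding the address
}
yp_entry = {
    "title": "Beijing University",
    "address": "Beijing City HaiDian District YiHe Yuan Road 5 Number",
    "telephone": "010-62752114",
    "cluster_pos": (12.0, 8.0),
}
```

Because both cluster positions come from geocoding nearly the same address text, they may coincide even when the map provider's accurate position lies some distance away.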
Snippet highlighter 760 may highlight snippets in documents (e.g., web documents). Web document snippet highlighting may typically be accomplished by term offsets in the documents. Since a CJK document does not use spaces as delimiters (i.e., a long paragraph of text may need to be segmented to obtain the terms), in order to highlight specific terms, the entire document may need to be segmented to obtain the corresponding terms, which may be inefficient. However, snippet highlighter 760 may store byte offsets instead of term offsets to identify an address or telephone number (or some other business information) in a web document during indexing. Snippet highlighter 760 may use the byte offset to perform highlighting and no segmentation may be required. Although Fig. 7 shows snippet highlighter 760 as part of indexer 435, in other implementations, snippet highlighter 760 may be separate from indexer 435 and/or included in another component (e.g., within front end server 440).
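Highlighting by byte offset might be sketched as follows. The function name, the `<b>` tags, and the UTF-8 encoding are assumptions; the spans are taken to be non-overlapping (start, end) byte offsets recorded at indexing time.

```python
def highlight(document_bytes, spans, open_tag=b"<b>", close_tag=b"</b>"):
    """Wrap each (start, end) byte span in tags without segmenting the text."""
    out = []
    cursor = 0
    for start, end in sorted(spans):
        out.append(document_bytes[cursor:start])
        out.append(open_tag)
        out.append(document_bytes[start:end])
        out.append(close_tag)
        cursor = end
    out.append(document_bytes[cursor:])
    return b"".join(out)


# At indexing time, the byte offsets of the telephone number are recorded.
doc = "电话010-62752114北京".encode("utf-8")
phone = b"010-62752114"
start = doc.find(phone)
span = (start, start + len(phone))

# At serving time, highlighting needs only the stored span.
highlighted = highlight(doc, [span]).decode("utf-8")
```

Since the offsets index directly into the stored bytes, the serving path never has to run a CJK word segmenter over the document.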
Although Fig. 7 shows exemplary tasks performed by indexer 435, in other implementations, indexer 435 may perform additional or different tasks that may be used to identify local search results. Furthermore, although Fig. 7 shows the components of indexer 435 as interconnected, in other implementations, the components of indexer 435 may be separate, non-interconnected components.
Front End Server
Front end server 440, as shown in Fig. 8, may perform a variety of tasks to aid in the identification of local search results. In one implementation, for example, front end server 440 may include a query rewriter 800, a local search generator 810, a map generator 820, a geographical information generator 830, etc. Although Fig. 8 shows exemplary tasks performed by front end server 440, in other implementations, front end server 440 may perform additional or different tasks that may be used to aid in the identification of local search results.
In one implementation, query rewriter 800 may perform a variety of tasks. For example, since the number of web document clusters may be much smaller for Chinese data compared to English data, a match in a title or category may be used to return valid results. Also, in Chinese there may be several synonyms referring to the same term. For example, the term "restaurant" may have several common synonyms in Chinese. In addition, the different providers of yellow page data 405 may use different names to represent the same category. In order to provide the most recall for the search query entered by the user, query rewriter 800 may generate a list of synonyms for each of the categories. Query rewriter 800 may also rewrite each local search query to expand the query to a couple of search terms that may be joined by an "OR" operator.
Local search generator 810 may generate local search results 450. For example, local search generator 810 may generate results corresponding to relevant web pages and/or business listings within a specific geographic area, based on the address information obtained and processed by local search system 225. Local search results 450 may be displayed (e.g., to the user who input the local search query) on a display (e.g., output device 370).
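The synonym expansion performed by query rewriter 800 might be sketched as follows. The synonym table uses English stand-ins for the Chinese synonyms the text mentions, and the table contents, function name, and parenthesized output syntax are assumptions.

```python
# Hypothetical synonym table; English stand-ins for Chinese category synonyms.
CATEGORY_SYNONYMS = {
    "restaurant": ["restaurant", "eatery", "diner"],
}


def rewrite_query(term):
    """Expand a search term into its synonyms joined by OR."""
    synonyms = CATEGORY_SYNONYMS.get(term, [term])
    if len(synonyms) == 1:
        return synonyms[0]
    return "(" + " OR ".join(synonyms) + ")"
```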
Map generator 820 may generate a map relating to local search results 450. For example, in order to display the map for the results, map generator 820 may construct an iframe (i.e., a floating frame inserted within a web page) on the search result page, and may execute a post action to a URL provided by the map provider. The parameters in the post may contain the following fields for each search result: a title; an address; a telephone number; POI ID (for displaying the point on the map); and accuracy. Locations that may be geocoded to a building level may be marked as accurate on the map by map generator 820. Locations that may be geocoded to a street level (or some higher level of abstraction) may be marked as estimated on the map by map generator 820. Fig. 9 illustrates how these two levels of accuracy may be differentiated on a map 900 generated based on local search results 910. For example, accurate locations 920 may be identified by one marker color (e.g., green) and estimated locations 930 may be identified by another marker color (e.g., red). While two levels of accuracy (accurate and estimated) are shown in Fig. 9, additional levels of accuracy may be used in other implementations consistent with principles of the invention.
During indexing by indexer 435, the fingerprint of the address (e.g., address FP 540) may be stored together with its associated accuracy (e.g., FP accuracy 550). The mapping between the address fingerprint and the POI ID (e.g., address FP/POI ID mapping information 425) may be used by map generator 820 to look up the POI ID for each search result. The POI ID may be used by map generator 820 to identify a position on the map (e.g., map 900) for the map provider so that the map provider may show the position on the map provided within the result page.
In order to display local ads for China, geographical information generator 830 may provide geographical information to ads server 840 in a variety of ways. For example, in one implementation, each province or self-administered city may have a two digit code, which may be the second part of the region code defined by ISO-3166-2. Geocoder 430 may index this code (e.g., "CN-dd") with each address and pass it to front end server 440 for every successfully geocoded address. Geographical information generator 830 may send the code to ads server 840 as a geo-region-code.
In another implementation, advertisers may bid on keyword pairs of <keyword, location>, instead of using geo-targeting because, for web searching, geo-targeting may not work as well in Asian countries as it does in the United States. The inventory for ads may be larger for such keyword pairs. As a result, geographical information generator 830 may concatenate keywords with the locations entered in the search query, and may use these concatenations as the keywords sent to ads server 840. For example, when a user searches for "restaurant" near "Beijing," the keyword sent to ads server 840 by geographical information generator 830 may be "restaurant Beijing." Such an arrangement may be provided for both Japan and China.
In still another implementation, geographical information generator 830 may determine whether the search query includes a geographic reference. If the search query does not include a geographic reference, then regular advertisements may be presented by ads server 840. However, it may be determined whether an indicator of the user's location, such as the user's IP address, is available. When an indicator of the user's location is available, then local advertisements may be presented based on the user's location.
If, on the other hand, the search query includes a geographic reference, geographical information generator 830 may determine whether the geographic reference corresponds to a city name alone (i.e., without any other geographic information, such as no province information). If the search query includes a geographic reference other than a city name alone, then local advertisements may be presented.
If the search query includes a geographic reference corresponding to a city name alone, then geographical information generator 830 may determine whether the city corresponds to an unambiguous city. If the city does not correspond to an unambiguous city, then regular advertisements may be presented. If the city corresponds to an unambiguous city, then geographical information generator 830 may determine whether the city name with one or more other search terms of the query appear on a blacklist. A blacklist may be maintained for unambiguous city names that, when combined with one or more words, mean something other than their respective cities. If the city name with one or more other search terms of the query appears on the blacklist, then regular advertisements may be presented. If the city name with one or more other search terms of the query does not appear on the blacklist, then local advertisements may be presented based on the geographic reference of the query.
If local ("targeted") advertisements are to be presented, information concerning the user's location (e.g., the user's IP address) may be used by geographical information generator 830 to determine whether that location is within a predetermined distance of the location corresponding to the geographic reference. If the user's location is within the predetermined distance, then local advertisements may be presented. If the user's location is outside the predetermined distance, however, then regular advertisements may be presented.
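The decision flow for choosing between regular and local advertisements might be sketched as follows. The function name, argument names, and the 100 kilometer default are hypothetical. The text does not say what happens when the query has a geographic reference but the user's location is unknown, so the sketch assumes the distance check is simply skipped in that case.

```python
def choose_ads(geo_ref, user_location_known=False, city_unambiguous=False,
               on_blacklist=False, user_distance_km=None, max_distance_km=100.0):
    """Return "local" or "regular" for a search query.

    geo_ref: None (no geographic reference), "city_only", or "other".
    user_distance_km: distance from the user to the referenced location,
    or None when the user's location is unknown (check skipped; an assumption).
    """
    if geo_ref is None:
        # No geographic reference: fall back to the user's own location, if any.
        return "local" if user_location_known else "regular"
    if geo_ref == "city_only":
        # A bare city name must be unambiguous and must not hit the blacklist.
        if not city_unambiguous or on_blacklist:
            return "regular"
    # Local ads are planned; confirm the user is near the referenced location.
    if user_distance_km is not None and user_distance_km > max_distance_km:
        return "regular"
    return "local"
```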
Front end server 440 (or components of front end server 440) may perform some tasks that may be specific to China. For example, front end server 440 may hide driving directions, provide a display unit (kilometer versus mile) that may be country dependent, perform filtering for sensitive keywords when the user is in China, provide specific formatting of Chinese addresses and telephone numbers for display, show the geocoded location on top of the map, and/or round the distance to 0.5 kilometers instead of 0.1 kilometers and remove the direction.
EXEMPLARY PROCESS
Figs. 10A-10D are a flowchart of an exemplary process for identifying local search results and providing a map associated with identified locations.
Exemplary Local Search Result And Map Generation Process
As shown in Fig. 10A, a process 1000 for identifying local search results and providing a map associated with identified locations may begin with receipt of yellow page, map provider, and document data (block 1005). For example, in one implementation described above in connection with Fig. 4, map provider data 400, yellow page data 405, and web document data 410 may be provided to indexer 435, and map provider data 400 may further be used to derive address/POI ID information 420 and address FP/POI ID mapping information 425.
Process 1000 may perform geocoding on the data (block 1010). For example, in one implementation described above in connection with Fig. 6, address list generator 600 of geocoder 430 may receive map provider data 400 and yellow page data 405, and may generate an address list based on the provinces, cities, districts, street names, addresses, POI names, or combinations thereof from map provider data 400 and/or yellow page data 405. Parser 610 of geocoder 430 may attempt to parse the city and/or district from the beginning of each address. Longest matching prefix locator 620 of geocoder 430 may be used to locate further portions of each address. Combination locator 630 of geocoder 430 may locate particular combinations so that POI names may be located first, then various levels of the addresses may be located, and, finally, districts or cities may be located. Query geocoding unit 640 of geocoder 430 may compute the score of a search result based on the number of points in the search result location.
As further shown in Fig. 10A, process 1000 may perform indexing on the data (block 1015). For example, in one implementation described above in connection with Fig. 7, map data indexer 700 of indexer 435 may index address data from map provider data 400, which may include map position information.
Yellow page data preprocessor 710 of indexer 435 may receive yellow page data 405 from the different providers, and may preprocess yellow page data 405 to a common format. Business information extractor 720 of indexer 435 may extract business information (e.g., addresses and telephone numbers) from web document data 410 using a variety of techniques. The techniques may be modified based on the different formatting of addresses and telephone numbers in China and Japan. The extracted business information may be provided in business information repository 750. Distance flattener 730 of indexer 435 may set a search radius for a local search query. Clusterer 740 of indexer 435 may cluster map provider data 400 and yellow page data 405 together. Snippet highlighter 760 of indexer 435 may highlight snippets in documents (e.g., web documents).
Process 1000 may generate local search results and may provide a map URL (block 1020). For example, in one implementation described above in connection with Fig. 8, local search generator 810 of front end server 440 may generate local search results 450 (e.g., relevant web pages and/or business listings within a specific geographic area, based on the address information obtained and processed by local search system 225). Map generator 820 of front end server 440 may generate a map relating to local search results 450. For example, in order to display the map for the results, map generator 820 may construct an iframe (i.e., a floating frame inserted within a web page) on the search result page, and may execute a post action to a URL provided by the map provider.
Exemplary Geocoding Process
Process block 1010 (Fig. 10A) of process 1000 may include the blocks shown in Fig. 10B. Thus, process block 1010 may generate an address list (block 1025). For example, in one implementation described above in connection with Fig. 6, address list generator 600 of geocoder 430 may receive map provider data 400 and yellow page data 405, and may generate an address list (tokens) based on the provinces, cities, districts, street names, addresses, POI names, or combinations thereof from map provider data 400 and/or yellow page data 405. Process block 1010 may parse each address in the address list (block 1030). For example, in one implementation described above in connection with Fig. 6, for each address, parser 610 of geocoder 430 may attempt to parse the city and the district from the address.
As further shown in Fig. 10B, process block 1010 may locate the longest matching prefixes from each address in the address list (block 1035). For example, in one implementation described above in connection with Fig. 6, longest matching prefix locator 620 of geocoder 430 may be used to locate further portions of the address (e.g., a town, a village, a road or street, number, POI names, etc.). If the longest matching prefixes are not located in the address (block 1040 - NO), then process block 1010 may continue to locate further portions of the address (block 1035).
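The loop between blocks 1035 and 1040 can be illustrated with a minimal sketch, assuming the address list is available as a plain set of token strings. The function names and the sample tokens below are hypothetical and are not taken from the described implementation; the sketch only shows the general idea of repeatedly peeling the longest matching token off the front of an address.

```python
def longest_prefix(text, tokens):
    """Return the longest non-empty token that `text` starts with, or None."""
    best = None
    for token in tokens:
        if token and text.startswith(token):
            if best is None or len(token) > len(best):
                best = token
    return best


def split_address(address, tokens):
    """Greedily peel longest-matching tokens off the front of `address`.

    Returns the matched portions and whatever text could not be matched.
    """
    parts = []
    rest = address
    while rest:
        token = longest_prefix(rest, tokens)
        if token is None:
            break  # no token matches the remaining text
        parts.append(token)
        rest = rest[len(token):]
    return parts, rest


# Hypothetical token list; a real one would come from the address list generator.
TOKENS = {"北京市", "海淀区", "中关村", "中关村大街", "27号"}
parts, rest = split_address("北京市海淀区中关村大街27号", TOKENS)
```

Here both "中关村" and "中关村大街" match at the third step, and the longer token is preferred, which is the behavior the longest-matching-prefix rule is meant to provide.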
If the longest matching prefixes are located in the address (block 1040 - YES), then process block 1010 may locate combinations in each address (block 1045). For example, in one implementation described above in connection with Fig. 6, combination locator 630 of geocoder 430 may locate particular combinations so that POI names may be located first, then various levels of the addresses may be located, and, finally, districts or cities may be located. Such a locating arrangement may guarantee or verify that the most specific possible address may be obtained by combination locator 630.
Exemplary Indexing Process
Process block 1015 (Fig. 10A) of process 1000 may include the blocks shown in Fig. 10C. Thus, process block 1015 may preprocess the yellow page data (block 1050). For example, in one implementation described above in connection with Fig. 7, yellow page data preprocessor 710 of indexer 435 may receive yellow page data 405 from the different providers, and may preprocess yellow page data 405 to a common (or predetermined) format. This formatted data may be provided to geocoder 430 during indexing, and geocoder 430 may attempt to geocode the address. If the address may be geocoded to building or street level, the address may be indexed as a normal entry. If the address may be geocoded to city or district level, the address may be indexed as an entry with an approximate position.
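The distinction between a normal entry and an entry with an approximate position can be sketched as follows. This is an illustration only: the level names, the IndexEntry type, and the handling of other levels are assumptions, since the description does not specify a data model.

```python
from dataclasses import dataclass

# Assumed labels for the geocoding levels named in the text.
PRECISE_LEVELS = {"building", "street"}
APPROXIMATE_LEVELS = {"city", "district"}


@dataclass
class IndexEntry:
    address: str
    level: str
    approximate: bool


def make_index_entry(address, level):
    """Build an index entry whose `approximate` flag depends on the geocoding level."""
    if level in PRECISE_LEVELS:
        return IndexEntry(address, level, approximate=False)  # normal entry
    if level in APPROXIMATE_LEVELS:
        return IndexEntry(address, level, approximate=True)   # approximate position
    return None  # the text does not say how other levels are handled
```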
Process block 1015 may extract business information from documents (block 1055) and may store the business information (block 1060). For example, in one implementation described above in connection with Fig. 7, business information extractor 720 of indexer 435 may extract business information (e.g., addresses and telephone numbers) from web document data 410 using a variety of techniques. The techniques may be modified based on the different formatting of addresses and telephone numbers in China and Japan. In another implementation described above in connection with Fig. 7, the extracted business information may be provided in business information repository 750. Business information repository 750 may include a variety of information, e.g., the documents from which business information has been extracted by business information extractor 720.
Process block 1015 may index address data received from a map provider (block 1065). For example, in one implementation described above in connection with Fig. 7, map data indexer 700 of indexer 435 may index address data from map provider data 400, which may include map position information. Geocoder 430 may attempt to geocode the address again based on the indexed address data, and clusterer 740 may set the cluster position if the cluster position is different from the given position, but may not set the cluster position above a predetermined distance threshold (e.g., within three kilometers).
As further shown in Fig. 10C, process block 1015 may cluster yellow page data and map provider data (block 1070). For example, in one implementation described above in connection with Fig. 7, clusterer 740 of indexer 435 may cluster map provider data 400 and yellow page data 405 together. If these entries are clustered together by clusterer 740, then front end server 440 may be able to provide the user with detailed address and telephone number information, as well as an accurate position on a map.
Process block 1015 may highlight snippets provided in documents (block 1075). For example, in one implementation described above in connection with Fig. 7, snippet highlighter 760 of indexer 435 may highlight snippets in documents (e.g., web documents). Snippet highlighter 760 may store byte offsets instead of term offsets to identify an address or telephone number (or some other business information) in a web document during indexing. Snippet highlighter 760 may use the byte offset to perform highlighting and no segmentation may be required.
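The byte-offset approach to highlighting can be illustrated with a short sketch, assuming UTF-8 encoded documents and simple bold tags as the highlight markup. Both of those choices, along with the function names and the sample text, are assumptions made for illustration. The point is that once a byte span is stored at indexing time, highlighting is a slice operation and the text never has to be segmented into terms.

```python
def find_byte_span(document: bytes, phrase: str, encoding: str = "utf-8"):
    """Locate `phrase` in the encoded document and return (start, end) byte offsets."""
    needle = phrase.encode(encoding)
    start = document.find(needle)
    if start < 0:
        return None
    return start, start + len(needle)


def highlight(document: bytes, span, encoding: str = "utf-8") -> str:
    """Wrap the stored byte span in <b>...</b> without tokenizing the text."""
    start, end = span
    marked = document[:start] + b"<b>" + document[start:end] + b"</b>" + document[end:]
    return marked.decode(encoding)


# Indexing time: record where the telephone number sits in the raw bytes.
doc = "地址：北京市海淀区中关村大街27号 电话：010-12345678".encode("utf-8")
span = find_byte_span(doc, "010-12345678")

# Serving time: highlight using only the stored offsets.
snippet = highlight(doc, span)
```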
As further shown in Fig. 10C, process block 1015 may set a search distance (block 1080). For example, in one implementation described above in connection with Fig. 7, distance flattener 730 of indexer 435 may set a search radius for a local search query geocoded by geocoder 430 to a particular location. Each location may be a point location (e.g., buildings, famous tourist places, schools, street centers, etc.) or a bound location (e.g., districts, cities, provinces, etc.). For point locations, distance flattener 730 may set the search radius to a predetermined distance (e.g., approximately five kilometers around the point). For bound locations, distance flattener 730 may set the search radius to approximately the maximum distance from a centroid of the location to the corners.
Exemplary Front End Server Process
Process block 1020 (Fig. 10A) of process 1000 may include the blocks shown in Fig. 10D. Thus, process block 1020 may rewrite a search query (block 1085). For example, in one implementation described above in connection with Fig. 8, there may be several synonyms referring to the same term in Chinese. In addition, the different providers of yellow page data 405 may use different names to represent the same category. In order to provide the most recall for the search query entered by the user, query rewriter 800 of front end server 440 may generate a list of synonyms for each of the categories. Query rewriter 800 may also rewrite each local search query, expanding the query into several search terms joined by an "OR" operand.
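The synonym-based rewriting can be sketched as below. The synonym table, the function name, and the parenthesized output syntax are hypothetical; the description only states that a query may be expanded into several terms joined by "OR".

```python
# Hypothetical synonym table: each category maps to names that different
# providers might use for it.
SYNONYMS = {
    "酒店": ["酒店", "宾馆", "饭店"],
}


def rewrite_query(term, synonyms=SYNONYMS):
    """Expand a query term into an OR-joined group of its known synonyms."""
    variants = synonyms.get(term, [term])
    if len(variants) == 1:
        return variants[0]  # nothing to expand
    return "(" + " OR ".join(variants) + ")"
```

With this table, rewrite_query("酒店") returns "(酒店 OR 宾馆 OR 饭店)", while a term with no known synonyms is returned unchanged.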
Process block 1020 may generate local search results based on the search query, and may generate a map showing location(s) of the search results (block 1090). For example, in one implementation described above in connection with Fig. 8, local search generator 810 of front end server 440 may generate local search results (e.g., relevant web pages and/or business listings within a specific geographic area, based on the address information obtained and processed by local search system 225). In another implementation described above in connection with Fig. 8, map generator 820 of front end server 440 may generate a map relating to local search results 450. For example, in order to display the map for the results, map generator 820 may construct an iframe (i.e., a floating frame inserted within a web page) on the search result page, and may execute a post action to a URL provided by the map provider. The POI ID may be used by map generator 820 to identify a position on the map (e.g., map 900) for the map provider so that the map provider may show the position on the map provided within the result page.
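One way the iframe and post action might fit together is sketched below: the result page contains an iframe plus a form that targets it, and the form carries the POI IDs to the map provider's URL. The URL, the field name poi_id, the frame name, and the frame dimensions are all hypothetical, since the description does not give the map provider's actual interface.

```python
from html import escape


def map_frame_html(post_url, poi_ids, frame_name="map_frame"):
    """Return HTML for an iframe and a POST form that loads the map into it."""
    fields = "\n".join(
        f'  <input type="hidden" name="poi_id" value="{escape(str(p), quote=True)}">'
        for p in poi_ids
    )
    name = escape(frame_name, quote=True)
    return (
        f'<iframe name="{name}" width="400" height="300"></iframe>\n'
        f'<form method="post" action="{escape(post_url, quote=True)}" target="{name}">\n'
        f"{fields}\n"
        "</form>"
    )


# Placeholder URL and IDs for illustration only.
page_fragment = map_frame_html("https://maps.example.com/render", [101, 202])
```

The form still has to be submitted (for example by a small script on the result page) for the post action to occur; that step is left out of the sketch.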
As further shown in Fig. 10D, process block 1020 may generate geographical information for an ads server (block 1095). For example, in one implementation described above in connection with Fig. 8, in order to display local ads for China, geographical information generator 830 of front end server 440 may provide geographical information to ads server 840. For example, geographical information generator 830 may send the region code to ads server 840 as a geo-region-code. In another example, geographical information generator 830 may concatenate keywords with the locations entered in the user's search query, and may use these concatenations as the keywords sent to ads server 840.
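The two kinds of geographical information mentioned above can be sketched together as a single request payload. Combining them in one payload, the dictionary layout, the concatenation order, and the sample region code are assumptions; the description presents the region code and the concatenated keywords as separate examples and does not define a wire format.

```python
def ads_request(keywords, location, region_code):
    """Build a hypothetical ads-server payload with both forms of geo information."""
    return {
        # Region code passed through as a geo-region-code.
        "geo-region-code": region_code,
        # Each keyword concatenated with the location from the query.
        "keywords": [f"{location} {keyword}" for keyword in keywords],
    }


payload = ads_request(["pizza", "delivery"], "Beijing", "CN-11")
```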
CONCLUSION
Implementations described herein may provide systems and methods for identifying local search results and generating a map associated with identified locations. The system may receive a local search query input by a user, and may identify a location associated with the local search query. The system may identify a set of local search results that may be related to the local search query and may be associated with the identified location. The local search results may include links to documents that may be related to the local search query. The system may identify an identifier for a group of local search results, and may provide the identifier to a map provider. The system may receive a map associated with the identified location from the map provider. The map may identify a position of at least one search result in the group of local search results.
The described implementations provide one or more of the following advantages. For example, map data and yellow page data may be utilized from several different providers to identify local search results and generate a map associated with identified locations. The map may be conveniently displayed with the local search results. Such an arrangement avoids generation of local search results and a pointer to a third-party map provider's server.
In another example, the map may provide detailed map data based on the yellow page data. This may make it possible to generate a map that includes detailed map data in countries where export restrictions may limit the availability of detailed map data to render the map or may limit the availability of the actual latitude and longitude of addresses within the area.
The foregoing description of embodiments of the present invention provides illustration and description, but is not intended to be exhaustive or to limit the invention to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention. For example, while series of acts have been described with regard to Figs. 10A-10D, the order of the acts may be modified in other implementations consistent with principles of the invention. Further, non-dependent acts may be performed in parallel. In one implementation, server 220 may perform most, if not all, of the acts described with regard to the processing of Figs. 10A-10D. In another implementation, one or more, or all, of the acts may be performed by another entity, such as another server 230 and/or 240 or client 210.
In another example, alternative approaches may be utilized for geocoding addresses that are not provided by the map provider. In one alternative approach, geocoder 430 may attempt to locate the closest point for an address to be geocoded. For example, suppose that the points "1 ABC Street" and "10 ABC Street" are identified by the map provider. When trying to geocode the address "3 ABC Street," geocoder 430 may return the location of "1 ABC Street," which is the closest point to "3 ABC Street." In another alternative approach, geocoder 430 may attempt to interpolate a point. For example, if "1 ABC Street" is at grid index (0, 0) and "10 ABC Street" is at grid index (10, 20), geocoder 430 may determine that the address of "3 ABC Street" is at grid index (3, 6), based upon interpolation.
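Both alternatives can be sketched with a small table of known points, assuming street numbers are integers and grid indices are coordinate pairs. The table and function names are hypothetical. The interpolation shown is plain linear interpolation between the two bracketing house numbers, which is one possible reading of the example; for number 3 it gives roughly (2.2, 4.4), whereas the (3, 6) in the text is what scaling by 3/10 of the higher point would give, and the description does not say which convention is intended.

```python
# Known points on "ABC Street": house number -> grid index.
KNOWN = {1: (0, 0), 10: (10, 20)}


def nearest_point(number, known=KNOWN):
    """Return the grid index of the known house number closest to `number`."""
    closest = min(known, key=lambda n: abs(n - number))
    return known[closest]


def interpolate(number, known=KNOWN):
    """Linearly interpolate a grid index between the bracketing known numbers.

    Raises ValueError if `number` lies outside the range of known numbers.
    """
    lo = max(n for n in known if n <= number)
    hi = min(n for n in known if n >= number)
    if lo == hi:
        return known[lo]
    t = (number - lo) / (hi - lo)
    (x0, y0), (x1, y1) = known[lo], known[hi]
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
```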
In still another example, alternative approaches may be utilized for obtaining the POI IDs to draw the map. In one alternative approach, the POI ID may be stored with the location data from the map provider. The POI IDs may be returned by front end server 440 during serving time. The POI IDs may change in different versions of the map provider data. Storing the POI ID in the index makes the index dependent upon the data from the map provider. In another alternative approach, the addresses of the search results may be geocoded during serving time and geocoder 430 may be requested to provide the closest matching points. Geocoder 430 may return the POI IDs of the points. The requests for the closest matching points may be sent as batches (e.g., batches of ten) of geocoding requests so the performance impact may be small.
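Batching the serving-time geocoding requests can be sketched as follows. The geocode_batch callable stands in for whatever interface geocoder 430 exposes, which the description does not specify, and the function names are hypothetical.

```python
def batches(items, size=10):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def geocode_in_batches(addresses, geocode_batch, size=10):
    """Send addresses to the geocoder one batch at a time and collect POI IDs.

    `geocode_batch` is assumed to take a list of addresses and return a list
    of POI IDs in the same order.
    """
    poi_ids = []
    for batch in batches(addresses, size):
        poi_ids.extend(geocode_batch(batch))
    return poi_ids
```

For 23 result addresses and the default batch size, this makes three calls (10, 10, and 3 addresses) rather than 23 individual requests.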
In a further example, while the systems and methods were described in terms of a Chinese local search, in other implementations, some of the techniques described herein may equally apply to local searching in other countries.
It will be apparent to one of ordinary skill in the art that aspects of the invention, as described above, may be implemented in many different forms of software, firmware, and hardware in the implementations illustrated in the figures. The actual software code or specialized control hardware used to implement aspects consistent with principles of the invention is not limiting of the invention. Thus, the operation and behavior of the aspects were described without reference to the specific software code, it being understood that one of ordinary skill in the art would be able to design software and control hardware to implement the aspects based on the description herein.
No element, act, or instruction used in the present application should be construed as critical or essential to the invention unless explicitly described as such. Also, as used herein, the article "a" is intended to include one or more items. Where only one item is intended, the term "one" or similar language is used. Further, the phrase "based on" is intended to mean "based, at least in part, on" unless explicitly stated otherwise.

Claims

WHAT IS CLAIMED IS:
1. A method comprising: receiving yellow page data, third-party map provider data, and document data in response to a local search query; geocoding at least one of the yellow page data, the third-party map provider data, and the document data to assign a geographic identifier to and to match at least one address within the local search query; indexing the geocoded data to identify business information and location information corresponding to the local search query; and providing local search results and a third-party map based on the identified business information and location information.
2. The method of claim 1, wherein the yellow page data comprises at least one of address information for points of interest (POIs), telephone number information for POIs, a business name, an email address, a facsimile number, a web site address, a CEO name, a business description, a SIC-style category, or a type of business.
3. The method of claim 1, wherein the third party map provider data comprises a point of interest (POI) and an address of the POI.
4. The method of claim 3, wherein the third party map provider data further comprises a grid for the POI.
5. The method of claim 1, wherein the third party map provider data comprises at least one of: a normal point of interest (POI) that includes a business name, an address, a telephone number, and a grid index, a road POI that includes a street name and a grid index of a center of the street, or a postal code POI that includes a postal code and a grid index of an approximate center of the postal code.
6. The method of claim 1, wherein the geocoding comprises: generating an address list based on the local search query; parsing each address in the address list; locating longest matching prefixes in each address to identify portions of each address; and locating combinations in each address to verify each address.
7. The method of claim 1, wherein the indexing comprises: preprocessing the yellow page data to a predetermined format; extracting the business information from the document data; indexing address information from the third party map provider data; and clustering the yellow page data and the third party map provider data.
8. The method of claim 7, wherein the indexing further comprises: storing the business information in a repository; highlighting snippets in the document data; and setting a search distance for the local search query.
9. The method of claim 1, wherein the providing comprises: generating a list of synonyms related to the local search query; rewriting the local search query to expand the query; generating the local search results based on the expanded query; obtaining the third party map based on the local search results; and generating advertisements based on geographical information related to the local search results.
10. The method of claim 1, wherein the location information is determined from the third party map provider data and the yellow page data.
11. A method of geocoding based on a local search query, comprising: receiving third party map provider data and yellow page data; generating an address based on the local search query; parsing the address; locating longest matching prefixes in the address to identify at least one portion of the address; and locating a combination in the address to verify the address.
12. The method of claim 11, further comprising: generating the address based on information included in the third party map provider data or the yellow page data.
13. The method of claim 11, further comprising: parsing a city and a district from the address.
14. The method of claim 11, wherein the portion of the address comprises at least one of a town, a village, a road, a street, a number, or a name of a point of interest.
15. The method of claim 11, further comprising: locating the combination in the address so that a name of a point of interest is located first, other portions of the address are located second, and a district or a city is located third.
16. The method of claim 11, further comprising: indexing the address as an exact position if the address includes a building or a street portion.
17. The method of claim 11, further comprising: indexing the address to an approximate position if the address includes a city portion or a district portion.
18. A method of indexing based on a local search query, comprising: preprocessing yellow page data to a predetermined format; extracting business information from document data; storing the business information in a repository; indexing address information from third party map provider data; clustering the yellow page data and the third party map provider data; highlighting snippets in the document data; for a point location, setting a search distance for the local search query to a predetermined distance; and for a bound location, setting the search distance to approximately a maximum distance from a centroid of the bound location to corners of the bound location.
19. The method of claim 18, further comprising: preprocessing the yellow page data to a predetermined format.
20. The method of claim 18, wherein the business information comprises at least one of a business address or a business telephone number.
21. The method of claim 20, further comprising: modifying the extraction of the business information based on format differences of the business address or the business telephone number.
22. The method of claim 18, further comprising: storing documents associated with the business information in the repository.
23. The method of claim 18, further comprising: geocoding an address based on the indexed address information; and determining a new cluster position based on the geocoded address if the new cluster position is different than a previous cluster position.
24. The method of claim 18, further comprising: highlighting snippets in the document data based on byte offsets that are stored to identify the business information in the document data.
25. A method comprising: generating a list of synonyms related to a local search query; rewriting the local search query to expand the query; generating local search results based on the expanded query; obtaining a map from a third party map provider based on the local search results; and generating advertisements based on geographical information related to the local search results.
26. The method of claim 25, further comprising: rewriting the local search query by coupling search terms with an "OR" operand.
27. The method of claim 25, wherein the obtaining the map comprises: constructing a map frame with the local search results; executing a post action to a URL provided by the third party map provider; and identifying a position on the map to enable the third party map provider to show the position on the map.
28. The method of claim 25, further comprising: generating the advertisements based on a region code related to the local search results.
29. The method of claim 25, further comprising: concatenating a keyword with a location related to the local search query; and generating the advertisements based on the concatenated keyword.
30. A system comprising: an indexer to receive third party map provider data, yellow page data, and document data, preprocess the yellow page data to determine business information, extract business information from the document data, identify location information in a local search query, and index address data of the third party map provider data; a geocoder to receive information from the indexer, and assign geographic identifiers; and a front end server to receive information from the geocoder, rewrite the local search query, obtain a map from the third party map provider data, and generate local search results based on the local search query.
31. The system of claim 30, further comprising: an index/document repository that stores at least one of address data provided by the third party map provider data, points of interest information, postal code centers, or cities not included in the third party map provider data.
32. The system of claim 30, wherein the indexer preprocesses the yellow page data to a predetermined format.
33. The system of claim 30, wherein the indexer sets a search distance for the local search query.
34. The system of claim 30, wherein the indexer clusters the third party map provider data and the yellow page data.
35. The system of claim 30, wherein the indexer highlights snippets in the document data.
36. The system of claim 30, wherein the geocoder receives the third party map provider data and the yellow page data, and generates an address list from the third party map provider data and the yellow page data.
38. The system of claim 37, wherein the geocoder locates combinations within the address.
39. The system of claim 30, wherein the geocoder computes the scores of each local search result based on a number of points in a search result location.
40. The system of claim 30, wherein the front end server generates a list of synonyms related to the local search query.
41. The system of claim 30, wherein the front end server generates an estimated location and an accurate location on the map for at least one local search result.
42. The system of claim 30, wherein the front end server generates advertisements based on geographical information related to the local search results.
43. A system comprising: means for receiving yellow page data, third-party map provider data, and document data in response to a local search query; means for geocoding at least one of the yellow page data, the third-party map provider data, and the document data to assign a geographic identifier to and to match at least one address within the local search query; means for indexing the geocoded data to identify business information and location information corresponding to the local search query; and means for providing local search results and a third-party map based on the identified business information and location information.
44. A system comprising: a memory to store a plurality of instructions; and a processor to execute instructions in the memory to: identify a location associated with a local search query, identify local search results relevant to the local search query and associated with the identified location, identify an identifier for each of a group of the local search results, and receive from a third party map provider a map associated with the identified location, where the map identifies a position of at least one local search result in the group of local search results.
45. A method comprising: receiving a local search query; identifying a location associated with the local search query; identifying a set of search results relevant to the local search query and associated with the identified location; identifying an identifier for each of a group of the search results; providing the identifiers to a third party map provider; and receiving from the third party map provider a map associated with the identified location, where the map identifies a position of at least one search result in the group of search results.
46. A method comprising: generating a list of tokens; identifying a potential address within a web document; parsing the potential address from a beginning to determine whether the potential address includes a token associated with a city; further parsing the potential address to determine whether the potential address includes a token associated with a district; identifying a longest-matching token in the potential address after the token associated with the city or the token associated with the district; and determining whether the potential address is an actual address based on the token associated with the city, the token associated with the district, and the identified longest-matching token.
PCT/US2006/033537 2005-08-30 2006-08-30 Local search WO2007027608A2 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP06802480A EP1934829A4 (en) 2005-08-30 2006-08-30 Local search
CA002620770A CA2620770A1 (en) 2005-08-30 2006-08-30 Local search
JP2008529167A JP2009506459A (en) 2005-08-30 2006-08-30 Local search
CN200680040129.9A CN101313300B (en) 2005-08-30 2006-08-30 Local search
BRPI0615323-2A BRPI0615323A2 (en) 2005-08-30 2006-08-30 local search, geocoding and indexing methods and local search systems

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US71214605P 2005-08-30 2005-08-30
US60/712,146 2005-08-30

Publications (2)

Publication Number Publication Date
WO2007027608A2 true WO2007027608A2 (en) 2007-03-08
WO2007027608A3 WO2007027608A3 (en) 2007-08-30

Family

ID=37809410

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/033537 WO2007027608A2 (en) 2005-08-30 2006-08-30 Local search

Country Status (7)

Country Link
EP (1) EP1934829A4 (en)
JP (1) JP2009506459A (en)
KR (1) KR100985450B1 (en)
CN (1) CN101313300B (en)
BR (1) BRPI0615323A2 (en)
CA (1) CA2620770A1 (en)
WO (1) WO2007027608A2 (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009009518A1 (en) * 2007-07-09 2009-01-15 Google Inc. Interpreting local search queries
WO2009137967A1 (en) * 2008-05-16 2009-11-19 Shanghai Hewlett-Packard Co., Ltd Provisioning a geographical image for retrieval
WO2009145438A1 (en) * 2008-03-31 2009-12-03 팅크웨어(주) Method and system for advertisement of map using virtual poi (point of interest)
CN102479229A (en) * 2010-11-29 2012-05-30 北京四维图新科技股份有限公司 Method and system for generating point of interest (POI) data
US8260775B2 (en) 2010-01-12 2012-09-04 Microsoft Corporation Geotemporal search
JP2012256356A (en) * 2012-08-15 2012-12-27 Zenrin Datacom Co Ltd Document data evaluation method, document data evaluation device, document data selection method, document data selection device, database generation method, database generation device, and computer program
WO2013079767A1 (en) * 2011-10-18 2013-06-06 Nokia Corporation Methods and apparatuses for facilitating interaction with a geohash-indexed data set
US8682646B2 (en) 2008-06-04 2014-03-25 Microsoft Corporation Semantic relationship-based location description parsing
US20140236689A1 (en) * 2011-02-11 2014-08-21 Thinkware Systems Corporation Method and system for advertisement of map using virtual poi (point of interest)
US8958817B1 (en) 2012-01-19 2015-02-17 Google Inc. Weighted-distance spatial indexing
CN105808715A (en) * 2016-03-07 2016-07-27 武汉大学 Method for establishing map per location
WO2017097230A1 (en) * 2015-12-09 2017-06-15 北京奇虎科技有限公司 Method and apparatus for displaying map searching result
WO2020245437A1 (en) * 2019-06-06 2020-12-10 Deepreach Method for generating a composite visibility indicator for an entity, system
WO2022076081A1 (en) * 2020-10-06 2022-04-14 SafeGraph, Inc. Systems and methods for generating multi-part place identifiers
US11561943B2 (en) 2018-12-11 2023-01-24 SafeGraph, Inc. Feature-based deduplication of metadata for places
US11762914B2 (en) 2020-10-06 2023-09-19 SafeGraph, Inc. Systems and methods for matching multi-part place identifiers
US11899696B2 (en) 2020-10-06 2024-02-13 SafeGraph, Inc. Systems and methods for generating multi-part place identifiers

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8060555B2 (en) 2005-08-17 2011-11-15 Canada Post Corporation Electronic content management systems and methods
US8306973B2 (en) * 2009-04-06 2012-11-06 Google Inc. Method for generating location targeted, keyword-triggered, ads and generating user recallable layer-based ads
KR100925278B1 (en) * 2009-04-29 2009-11-05 (주)지오투정보기술 System for geocoding using digital map and method therefor
KR101289082B1 (en) * 2009-09-02 2013-07-22 한국전자통신연구원 System and method for providing area information service
US20110131500A1 (en) * 2009-11-30 2011-06-02 Google Inc. System and method of providing enhanced listings
CN102004793B (en) * 2010-12-08 2013-09-04 厦门雅迅网络股份有限公司 POI (Point Of Interest) position inquiry index file based on grid space and information inquiry method
US9047103B2 (en) 2010-12-21 2015-06-02 Microsoft Technology Licensing, Llc Resource index identifying multiple resource instances and selecting most appropriate UI resource instance based on weighted resource request conditions
US8495570B2 (en) 2010-12-23 2013-07-23 Microsoft Corporation Resource deployment based on conditions
US9495371B2 (en) * 2010-12-28 2016-11-15 Microsoft Technology Licensing, Llc Unified access to resources
CN102622349B (en) * 2011-01-26 2014-10-22 北京四维图新科技股份有限公司 Processing method and processing device of spatial position information database
CN102222084B (en) * 2011-05-13 2014-02-19 北京百度网讯科技有限公司 Method and device for displaying retrieval result on map
KR101303869B1 (en) * 2011-10-20 2013-09-04 경북대학교 산학협력단 System and method for example-based place search
CN103150309B (en) * 2011-12-07 2016-03-30 清华大学 A kind of direction in space perception map interest point search method and system
CN103049481B (en) * 2012-11-29 2016-03-02 百度在线网络技术(北京)有限公司 A kind of searching method and search equipment
KR101499842B1 (en) * 2013-12-06 2015-03-10 아주대학교산학협력단 Method and Apparatus for searching for data object
US9465811B2 (en) * 2014-03-20 2016-10-11 Facebook, Inc. Polygon-based indexing of places
US20160092518A1 (en) * 2014-09-25 2016-03-31 Microsoft Corporation Dynamic results
CN104899243B (en) * 2015-03-31 2016-09-07 北京安云世纪科技有限公司 The method and device of detection point of interest POI data accuracy
CN104699838B (en) * 2015-04-01 2018-08-17 姚林 A kind of Webpage search method for pushing, and more site searches combined method
US9787557B2 (en) * 2015-04-28 2017-10-10 Google Inc. Determining semantic place names from location reports
CN105005577A (en) * 2015-05-08 2015-10-28 裴克铭管理咨询(上海)有限公司 Address matching method
CN105120072A (en) * 2015-07-17 2015-12-02 广东欧珀移动通信有限公司 Method and device for screening yellow page telephone numbers
CN106897302B (en) * 2015-12-18 2020-03-31 北京四维图新科技股份有限公司 Method and device for updating point of interest
CN107292302B (en) * 2016-03-31 2021-05-14 阿里巴巴(中国)有限公司 Method and system for detecting interest points in picture
CN106304109B (en) * 2016-07-28 2019-09-17 中国科学院软件研究所 Method for generating a shortwave broadcasting resource scheduling scheme based on local search
CN106534246A (en) * 2016-08-31 2017-03-22 成都数联铭品科技有限公司 Peripheral enterprise search system based on location service
CN106341471A (en) * 2016-08-31 2017-01-18 成都数联铭品科技有限公司 Peripheral target geographic information acquiring and searching method for position service
KR101896543B1 (en) * 2017-11-13 2018-09-07 (주) 알트소프트 Local box advertisement service system able to share banner advertisements between local box businesses
CN108427710B (en) * 2018-01-26 2020-05-08 金蝶软件(中国)有限公司 Enterprise data visualization processing method, server and storage medium
CN110580270A (en) * 2018-06-07 2019-12-17 北京京东尚科信息技术有限公司 Address output method and system, computer system, and computer-readable storage medium
CN110619088B (en) * 2019-05-23 2022-04-19 北京无限光场科技有限公司 Method and apparatus for processing information
CN110619086B (en) * 2019-05-23 2022-02-25 北京无限光场科技有限公司 Method and apparatus for processing information
CN110619087B (en) * 2019-05-23 2022-04-15 北京无限光场科技有限公司 Method and apparatus for processing information
CN113568951A (en) * 2021-07-30 2021-10-29 拉扎斯网络科技(上海)有限公司 Data mining and processing method and device, storage medium and electronic equipment

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5938721A (en) 1996-10-24 1999-08-17 Trimble Navigation Limited Position based personal digital assistant
US6701307B2 (en) * 1998-10-28 2004-03-02 Microsoft Corporation Method and apparatus of expanding web searching capabilities
EP1269357A4 (en) * 2000-02-22 2005-10-12 Metacarta Inc Spatially coding and displaying information
JP2002063196A (en) * 2000-03-06 2002-02-28 Katsuyoshi Nagashima Device for automatically retrieving internet information, and method for the same
JP2002082982A (en) * 2000-09-06 2002-03-22 Nippon Telegr & Teleph Corp <Ntt> Device and method for providing information and recording medium with information providing program recorded thereon
JP2005078206A (en) * 2003-08-28 2005-03-24 Canon Inc On-line print sales system and on-line print sales method
US6934634B1 (en) * 2003-09-22 2005-08-23 Google Inc. Address geocoding
JP2005149073A (en) * 2003-11-14 2005-06-09 Matsushita Electric Ind Co Ltd Data retrieval device
WO2006028478A1 (en) * 2003-11-25 2006-03-16 Google Inc. Assigning geographic location identifiers to web pages

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of EP1934829A4 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101785002B (en) * 2007-07-09 2014-04-09 谷歌公司 Interpreting local search queries
WO2009009518A1 (en) * 2007-07-09 2009-01-15 Google Inc. Interpreting local search queries
US7917490B2 (en) * 2007-07-09 2011-03-29 Google Inc. Interpreting local search queries
US8156099B2 (en) * 2007-07-09 2012-04-10 Google Inc. Interpreting local search queries
US20110131092A1 (en) * 2008-03-31 2011-06-02 Thinkwaresystem Corp. Method and system for advertisement of map using virtual poi (point of interest)
KR100997873B1 (en) * 2008-03-31 2010-12-02 팅크웨어(주) Advertisement method and system of map using virtual point of interest
WO2009145438A1 (en) * 2008-03-31 2009-12-03 팅크웨어(주) Method and system for advertisement of map using virtual poi (point of interest)
CN102027468B (en) * 2008-05-16 2014-04-23 上海惠普有限公司 Provisioning a geographical image for retrieval
WO2009137967A1 (en) * 2008-05-16 2009-11-19 Shanghai Hewlett-Packard Co., Ltd Provisioning a geographical image for retrieval
US8682646B2 (en) 2008-06-04 2014-03-25 Microsoft Corporation Semantic relationship-based location description parsing
US8260775B2 (en) 2010-01-12 2012-09-04 Microsoft Corporation Geotemporal search
CN102479229A (en) * 2010-11-29 2012-05-30 北京四维图新科技股份有限公司 Method and system for generating point of interest (POI) data
US20140236689A1 (en) * 2011-02-11 2014-08-21 Thinkware Systems Corporation Method and system for advertisement of map using virtual poi (point of interest)
WO2013079767A1 (en) * 2011-10-18 2013-06-06 Nokia Corporation Methods and apparatuses for facilitating interaction with a geohash-indexed data set
US8983953B2 (en) 2011-10-18 2015-03-17 Nokia Corporation Methods and apparatuses for facilitating interaction with a geohash-indexed data set
US8958817B1 (en) 2012-01-19 2015-02-17 Google Inc. Weighted-distance spatial indexing
JP2012256356A (en) * 2012-08-15 2012-12-27 Zenrin Datacom Co Ltd Document data evaluation method, document data evaluation device, document data selection method, document data selection device, database generation method, database generation device, and computer program
WO2017097230A1 (en) * 2015-12-09 2017-06-15 北京奇虎科技有限公司 Method and apparatus for displaying map searching result
CN105808715A (en) * 2016-03-07 2016-07-27 武汉大学 Method for establishing map per location
US11561943B2 (en) 2018-12-11 2023-01-24 SafeGraph, Inc. Feature-based deduplication of metadata for places
WO2020245437A1 (en) * 2019-06-06 2020-12-10 Deepreach Method for generating a composite visibility indicator for an entity, system
FR3097064A1 (en) * 2019-06-06 2020-12-11 Deepreach PROCESS FOR GENERATING A COMPOSITE VISIBILITY INDICATOR OF AN ENTITY, SYSTEM
US11874889B2 (en) 2019-06-06 2024-01-16 Deepreach Method for generating a composite visibility indicator for an entity, system
US11899696B2 (en) 2020-10-06 2024-02-13 SafeGraph, Inc. Systems and methods for generating multi-part place identifiers
WO2022076081A1 (en) * 2020-10-06 2022-04-14 SafeGraph, Inc. Systems and methods for generating multi-part place identifiers
US11762914B2 (en) 2020-10-06 2023-09-19 SafeGraph, Inc. Systems and methods for matching multi-part place identifiers

Also Published As

Publication number Publication date
WO2007027608A3 (en) 2007-08-30
CA2620770A1 (en) 2007-03-08
BRPI0615323A2 (en) 2011-05-17
KR100985450B1 (en) 2010-10-07
JP2009506459A (en) 2009-02-12
EP1934829A4 (en) 2012-04-18
CN101313300B (en) 2014-11-12
CN101313300A (en) 2008-11-26
EP1934829A2 (en) 2008-06-25
KR20080040044A (en) 2008-05-07

Similar Documents

Publication Publication Date Title
KR100985450B1 (en) Local search
US6934634B1 (en) Address geocoding
US7483881B2 (en) Determining unambiguous geographic references
KR100814667B1 (en) Systems and methods for clustering search results
CA2845194C (en) Classification of ambiguous geographic references
US9189496B2 (en) Indexing documents according to geographical relevance
US7231405B2 (en) Method and apparatus of indexing web pages of a web site for geographical searching based on user location
Borges et al. Discovering geographic locations in web pages using urban addresses
JP2005182817A (en) Query recognizer
CA2548948C (en) Assigning geographic location identifiers to web pages
Watters et al. GeoSearcher: Location‐based ranking of search engine results
Tabarcea et al. Framework for location-aware search engine
EP2763052A1 (en) Search method and information management device
EP1138007A1 (en) System and method for finding near matches among records in databases
Asadi et al. Using local popularity of web resources for geo-ranking of search engine results
JP2006508466A (en) Method for registering website information in search engine and website search service method using the same
Jameel et al. Compounded uniqueness level: Geo-location indexing using address parser
Watters et al. GeoSearcher: Geospatial ranking of search engine results

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase Ref document number: 200680040129.9 Country of ref document: CN
121 EP: the EPO has been informed by WIPO that EP was designated in this application
ENP Entry into the national phase in: Ref document number: 2620770 Country of ref document: CA
WWE WIPO information: entry into national phase Ref document number: 2008529167 Country of ref document: JP
NENP Non-entry into the national phase in: Ref country code: DE
WWE WIPO information: entry into national phase Ref document number: 2006802480 Country of ref document: EP
WWE WIPO information: entry into national phase Ref document number: 583/MUMNP/2008 Country of ref document: IN Ref document number: 1020087007591 Country of ref document: KR
ENP Entry into the national phase in: Ref document number: PI0615323 Country of ref document: BR Kind code of ref document: A2 Effective date: 20080229