CN110019645B - Index library construction method, search method and device - Google Patents

Info

Publication number
CN110019645B
Authority
CN
China
Prior art keywords
poi
index
information
character
inverted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710901601.3A
Other languages
Chinese (zh)
Other versions
CN110019645A (en)
Inventor
谭鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Network Technology Co.,Ltd.
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201710901601.3A
Publication of CN110019645A
Application granted
Publication of CN110019645B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/319 Inverted lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide an index library construction method, a search method, and corresponding devices. The index library construction method comprises: scanning a data source of points of interest (Poi) and determining attribute information and city information of each Poi; creating an inverted index of the Poi from the attribute information of each Poi; segmenting the inverted index of the Poi according to the city information to obtain index blocks based on the city information; and building a Poi index library from the index blocks. Because the Poi index library is built from city-based index blocks, each Poi is stored according to the city to which it belongs. In a subsequent search, only the index block of the city to which the sought Poi belongs needs to be searched, which narrows the search range and improves search efficiency.

Description

Index library construction method, search method and device
Technical Field
The present invention relates to the field of search technologies, and in particular, to an index library construction method, an index library-based search method, an index library construction apparatus, an index library-based search apparatus, a device, and a readable storage medium.
Background
With the rapid development of the mobile internet, more and more users obtain services over the network, such as information services based on points of interest (Point of Interest, Poi). A Poi is a data item of interest, for example a building such as a restaurant, scenic spot, or school, or road information such as the Kyowa expressway.
Currently, more and more users use mobile terminals to search for Poi on maps. A conventional Poi search scheme generally determines a Poi name from the query word input by the user and searches for matching Poi results at different positions on the map, so that the results can be offered to the user for selection. These Poi results may lie in the city where the user is currently located or in surrounding cities. Because the existing scheme searches mainly by Poi name, the search range is too large, which reduces search efficiency.
Disclosure of Invention
The technical problem to be solved by the embodiments of the invention is to provide an index library construction method and a search method based on the index library, so as to improve search efficiency.
Correspondingly, the embodiments of the invention also provide an index library construction device, a search device based on the index library, an apparatus, and a readable storage medium, which ensure the implementation and application of the above methods.
To solve the above problem, an embodiment of the invention discloses an index library construction method, comprising: scanning a data source of points of interest (Poi) and determining attribute information and city information of each Poi; creating an inverted index of the Poi from the attribute information of each Poi; segmenting the inverted index of the Poi according to the city information to obtain index blocks based on the city information; and building a Poi index library from the index blocks.
Optionally, the attribute information is determined based on a query domain of the Poi, and the inverted index of the Poi includes an inverted index based on single characters. Creating the inverted index of the Poi from the attribute information of each Poi comprises: extracting address domain information and/or name domain information of the Poi from the attribute information of the Poi; and counting the name characters contained in the name domain information and the address characters contained in the address domain information of each Poi, and determining the inverted index based on single characters.
Optionally, building the Poi index library from the index blocks comprises: constructing a correspondence between the inverted index of the Poi and the index blocks, the inverted index comprising at least one of a Poi name index and a Poi address index; and building the Poi index library based on the correspondence between the inverted index of the Poi and the index blocks.
Optionally, counting the name characters contained in the name domain information of each Poi and determining the inverted index based on single characters comprises: counting the name characters contained in the name domain information and determining the frequency of each name character; determining an inverted list for each name character according to its frequency, the inverted list comprising the name character number, the character position, and the Poi heat; and, for the name domain information of each Poi, constructing a Poi name index based on the inverted lists of the name characters.
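As an illustration of the single-character name index described above, the following is a minimal sketch rather than the patented implementation. It assumes that the "name character number" means the number of characters in the Poi name, that positions are 0-based, and that each inverted list is ordered by descending Poi heat; the names `Posting` and `build_name_index` and the `(poi_id, name, heat)` input shape are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    """One entry of a character's inverted list (field layout is an assumption)."""
    poi_id: int      # hypothetical Poi identifier
    char_count: int  # number of characters in the Poi name
    position: int    # 0-based position of the character within the name
    heat: float      # Poi heat (popularity score)


def build_name_index(pois):
    """Build a single-character inverted index from (poi_id, name, heat) tuples.

    Returns (index, frequency): index maps each character to its inverted
    list, frequency maps each character to its number of occurrences.
    """
    index = defaultdict(list)
    for poi_id, name, heat in pois:
        for position, ch in enumerate(name):
            index[ch].append(Posting(poi_id, len(name), position, heat))
    # Assumed ordering: hottest Poi first, ties broken by id and position.
    for postings in index.values():
        postings.sort(key=lambda p: (-p.heat, p.poi_id, p.position))
    frequency = {ch: len(postings) for ch, postings in index.items()}
    return dict(index), frequency
```

The same routine could be applied to address strings to obtain a Poi address index.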
Optionally, counting the address characters contained in the address domain information of each Poi and determining the inverted index based on single characters comprises: counting the address characters contained in the address domain information and determining the frequency of each address character; determining an inverted list for each address character according to its frequency, the inverted list comprising the address character number, the character position, and the Poi heat; and, for the address domain information of each Poi, constructing a Poi address index based on the inverted lists of the address characters.
Optionally, segmenting the inverted index of the Poi according to the city information to obtain index blocks based on the city information comprises: segmenting the inverted index of the Poi according to the city information to obtain the Poi inverted index corresponding to each piece of city information; and generating a corresponding index block from the Poi inverted indexes that correspond to the same city information.
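The city-based segmentation can be sketched as follows. This is a simplified illustration, assuming the inverted index maps each character to a list of Poi identifiers and that a separate mapping gives the city of each Poi; `split_index_by_city` and both input shapes are hypothetical.

```python
from collections import defaultdict


def split_index_by_city(inverted_index, poi_city):
    """Split a character -> [poi_id] index into one index block per city.

    inverted_index: dict mapping a character to a list of Poi ids.
    poi_city: dict mapping a Poi id to the city it belongs to.
    Returns a dict: city -> (character -> [poi_id]).
    """
    blocks = defaultdict(lambda: defaultdict(list))
    for ch, poi_ids in inverted_index.items():
        for poi_id in poi_ids:
            # Each posting is routed to the block of the Poi's own city.
            blocks[poi_city[poi_id]][ch].append(poi_id)
    return {city: dict(chars) for city, chars in blocks.items()}
```

The resulting mapping from city to block plays the role of the correspondence between inverted indexes and index blocks from which the index library is built.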
Optionally, before the inverted index based on single characters is determined, the method further comprises: acquiring historical search data, the historical search data comprising input method record data of users, webpage click history data, and map click history data; and comprehensively analyzing the input method record data, the webpage click history data, and the map click history data in the historical search data to obtain the Poi heat corresponding to each character.
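The text does not specify how the three data sources are combined. Purely as an assumption for illustration, the sketch below normalizes each source by its maximum count and takes a weighted sum per Poi; the function name `poi_heat`, the per-Poi count dictionaries, and the weights are all hypothetical.

```python
def poi_heat(input_counts, web_clicks, map_clicks, weights=(0.2, 0.3, 0.5)):
    """Combine three count dictionaries (poi_id -> count) into a heat score.

    Each source is scaled to [0, 1] by its own maximum, then weighted.
    The weights are illustrative assumptions, not values from the source.
    """
    heat = {}
    for counts, weight in zip((input_counts, web_clicks, map_clicks), weights):
        peak = max(counts.values(), default=0)
        if peak == 0:
            continue  # an empty or all-zero source contributes nothing
        for poi_id, n in counts.items():
            heat[poi_id] = heat.get(poi_id, 0.0) + weight * n / peak
    return heat
```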
An embodiment of the invention also discloses a search method based on the index library, comprising: receiving query information input by a user, the query information including character information of a point of interest (Poi) entered by the user during a search; identifying Poi character information from the query information, and determining the city information to which the Poi belongs based on the current geographic position of the user; querying the index library according to the city information to which the Poi belongs to obtain a target inverted index corresponding to the Poi character information, wherein the index library comprises index blocks based on city information and the index blocks contain the inverted index of the Poi; and generating at least one Poi result from the target inverted index and displaying the Poi result.
Optionally, identifying the Poi character information from the query information comprises: performing word segmentation on the query information to obtain at least one word segmentation result; performing part-of-speech tagging on the word segmentation result to obtain corresponding attribute information; and analyzing the word segmentation result and the attribute information to determine the Poi character information, wherein the Poi character information comprises name character information and/or address character information.
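The segmentation and tagging steps can be pictured with a toy example. The sketch below swaps in forward maximum matching against a small tagged lexicon, which is a common baseline and not necessarily the segmenter or tagger used by the invention; the lexicon contents, the tag names `"name"` and `"address"`, and both function names are hypothetical.

```python
def segment_and_tag(query, lexicon, max_len=4):
    """Forward maximum matching: greedily take the longest lexicon word.

    lexicon maps a word to its attribute tag, e.g. "name" or "address".
    Unmatched single characters are emitted with the tag "unknown".
    """
    tokens = []
    i = 0
    while i < len(query):
        for size in range(min(max_len, len(query) - i), 0, -1):
            word = query[i:i + size]
            if word in lexicon or size == 1:
                tokens.append((word, lexicon.get(word, "unknown")))
                i += size
                break
    return tokens


def poi_character_info(tokens):
    """Collect name characters and address characters from tagged tokens."""
    info = {"name": "", "address": ""}
    for word, tag in tokens:
        if tag in info:
            info[tag] += word
    return info
```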
Optionally, querying the index library according to the city information to which the Poi belongs to obtain the target inverted index corresponding to the Poi character information comprises: searching the index library for a target index block corresponding to the city information to which the Poi belongs; querying, in the target index block, the inverted lists of the characters contained in the Poi character information; and determining the target inverted index from the inverted lists found, wherein the target inverted index comprises a name domain index and/or an address domain index.
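A minimal sketch of this lookup follows, assuming the index library is a nested mapping of city to character to Poi ids and that a candidate must contain every query character (intersection of inverted lists). Both the data layout and the intersection rule are assumptions, and `query_block` is a hypothetical name.

```python
def query_block(index_library, city, query_chars):
    """Look up query characters only inside the index block of one city.

    index_library: dict city -> (character -> list of Poi ids).
    Returns the set of Poi ids whose entry contains every query character.
    """
    block = index_library.get(city, {})  # only this city's block is touched
    result = None
    for ch in query_chars:
        ids = set(block.get(ch, ()))
        result = ids if result is None else result & ids
        if not result:
            break  # no candidate can match all characters
    return result or set()
```

Because only one city's block is consulted, the cost of the lookup depends on the size of that block rather than on the whole library.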
Optionally, generating at least one Poi result from the target inverted index comprises: fusing the name domain index and the address domain index to obtain a fused Poi index; and generating at least one Poi result based on the inverted lists of the characters corresponding to the fused Poi index, each inverted list comprising the character number, the character position, and the Poi heat.
Optionally, displaying the Poi result comprises: for each Poi result, acquiring the corresponding Poi heat and character position from the inverted lists of the characters; sorting the Poi results according to the acquired Poi heat, the character positions, and the query information, and determining the rank of each Poi result; and displaying the Poi results in rank order.
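Fusion and ranking can be sketched together. The example below assumes fusion is a union of name-domain and address-domain hits (a name hit taking precedence when a Poi appears in both) and that results are ordered by descending heat, then by earliest matching character position; these rules and the name `rank_results` are illustrative assumptions, since the text does not fix a scoring formula.

```python
def rank_results(name_hits, address_hits):
    """Fuse name-domain and address-domain hits and rank the Poi ids.

    Each argument maps poi_id -> (heat, first_match_position).
    Returns Poi ids ordered by heat (descending), then match position
    (ascending), then id for a stable, deterministic order.
    """
    fused = dict(address_hits)
    fused.update(name_hits)  # assumed rule: name-domain hit wins on overlap
    return sorted(fused, key=lambda pid: (-fused[pid][0], fused[pid][1], pid))
```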
An embodiment of the invention also discloses an index library construction device, comprising: a data source scanning module, configured to scan a data source of points of interest (Poi) and determine attribute information and city information of each Poi; an index creation module, configured to create an inverted index of the Poi from the attribute information of each Poi; an index segmentation module, configured to segment the inverted index of the Poi according to the city information to obtain index blocks based on the city information; and an index library building module, configured to build a Poi index library from the index blocks.
Optionally, the attribute information is determined based on a query domain of the Poi, and the inverted index of the Poi includes an inverted index based on single characters. The index creation module comprises: an attribute information extraction submodule, configured to extract address domain information and/or name domain information of the Poi from the attribute information of the Poi; and an inverted index determining submodule, configured to count the name characters contained in the name domain information and the address characters contained in the address domain information of each Poi, and to determine the inverted index based on single characters.
Optionally, the index library building module comprises: a correspondence construction submodule, configured to construct a correspondence between the inverted index of the Poi and the index blocks, the inverted index comprising at least one of a Poi name index and a Poi address index; and an index library construction submodule, configured to build the Poi index library based on the correspondence between the inverted index of the Poi and the index blocks.
Optionally, the inverted index determining submodule comprises: a name character counting unit, configured to count the name characters contained in the name domain information and determine the frequency of each name character; a first determining unit, configured to determine an inverted list for each name character according to its frequency, the inverted list comprising the name character number, the character position, and the Poi heat; and a name index building unit, configured to construct, for the name domain information of each Poi, a Poi name index based on the inverted lists of the name characters.
Optionally, the inverted index determining submodule comprises: an address character counting unit, configured to count the address characters contained in the address domain information and determine the frequency of each address character; a second determining unit, configured to determine an inverted list for each address character according to its frequency, the inverted list comprising the address character number, the character position, and the Poi heat; and an address index building unit, configured to construct, for the address domain information of each Poi, a Poi address index based on the inverted lists of the address characters.
Optionally, the index segmentation module comprises: an inverted index segmentation submodule, configured to segment the inverted index of the Poi according to the city information to obtain the Poi inverted index corresponding to each piece of city information; and an index block generation submodule, configured to generate a corresponding index block from the Poi inverted indexes that correspond to the same city information.
Optionally, the device further comprises: a historical data acquisition module, configured to acquire historical search data, the historical search data comprising input method record data of users, webpage click history data, and map click history data; and a comprehensive analysis module, configured to comprehensively analyze the input method record data, the webpage click history data, and the map click history data in the historical search data to obtain the Poi heat corresponding to each character.
An embodiment of the invention also discloses a search device based on the index library, comprising: an information receiving module, configured to receive query information input by a user, the query information including character information of a point of interest (Poi) entered by the user during a search; a Poi identification module, configured to identify Poi character information from the query information and determine the city information to which the Poi belongs based on the current geographic position of the user; an index query module, configured to query the index library according to the city information to which the Poi belongs to obtain a target inverted index corresponding to the Poi character information, wherein the index library comprises index blocks based on city information and the index blocks contain the inverted index of the Poi; and a result display module, configured to generate at least one Poi result from the target inverted index and display the Poi result.
Optionally, the Poi identification module comprises: a word segmentation submodule, configured to perform word segmentation on the query information to obtain at least one word segmentation result; a tagging submodule, configured to perform part-of-speech tagging on the word segmentation result to obtain corresponding attribute information; and an analysis submodule, configured to analyze the word segmentation result and the attribute information to determine the Poi character information, wherein the Poi character information comprises name character information and/or address character information.
Optionally, the index query module comprises: a block searching submodule, configured to search the index library for a target index block corresponding to the city information to which the Poi belongs; a list searching submodule, configured to query, in the target index block, the inverted lists of the characters contained in the Poi character information; and an index determining submodule, configured to determine the target inverted index from the inverted lists found, wherein the target inverted index comprises a name domain index and/or an address domain index.
Optionally, the result display module comprises: an index fusion submodule, configured to fuse the name domain index and the address domain index to obtain a fused Poi index; and a result generation submodule, configured to generate at least one Poi result based on the inverted lists of the characters corresponding to the fused Poi index, each inverted list comprising the character number, the character position, and the Poi heat.
Optionally, the result display module comprises: an obtaining submodule, configured to acquire, for each Poi result, the corresponding Poi heat and character position from the inverted lists of the characters; a sorting submodule, configured to sort the Poi results according to the acquired Poi heat, the character positions, and the query information, and to determine the rank of each Poi result; and a display submodule, configured to display the Poi results in rank order.
An embodiment of the invention also discloses an apparatus comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for: scanning a data source of points of interest (Poi) and determining attribute information and city information of each Poi; creating an inverted index of the Poi from the attribute information of each Poi; segmenting the inverted index of the Poi according to the city information to obtain index blocks based on the city information; and building a Poi index library from the index blocks.
Optionally, the attribute information is determined based on a query domain of the Poi, and the inverted index of the Poi includes an inverted index based on single characters. Creating the inverted index of the Poi from the attribute information of each Poi comprises: extracting address domain information and/or name domain information of the Poi from the attribute information of the Poi; and counting the name characters contained in the name domain information and the address characters contained in the address domain information of each Poi, and determining the inverted index based on single characters.
Optionally, building the Poi index library from the index blocks comprises: constructing a correspondence between the inverted index of the Poi and the index blocks, the inverted index comprising at least one of a Poi name index and a Poi address index; and building the Poi index library based on the correspondence between the inverted index of the Poi and the index blocks.
Optionally, counting the name characters contained in the name domain information of each Poi and determining the inverted index based on single characters comprises: counting the name characters contained in the name domain information and determining the frequency of each name character; determining an inverted list for each name character according to its frequency, the inverted list comprising the name character number, the character position, and the Poi heat; and, for the name domain information of each Poi, constructing a Poi name index based on the inverted lists of the name characters.
Optionally, counting the address characters contained in the address domain information of each Poi and determining the inverted index based on single characters comprises: counting the address characters contained in the address domain information and determining the frequency of each address character; determining an inverted list for each address character according to its frequency, the inverted list comprising the address character number, the character position, and the Poi heat; and, for the address domain information of each Poi, constructing a Poi address index based on the inverted lists of the address characters.
Optionally, segmenting the inverted index of the Poi according to the city information to obtain index blocks based on the city information comprises: segmenting the inverted index of the Poi according to the city information to obtain the Poi inverted index corresponding to each piece of city information; and generating a corresponding index block from the Poi inverted indexes that correspond to the same city information.
Optionally, the one or more programs executed by the one or more processors further comprise instructions for, before the inverted index based on single characters is determined: acquiring historical search data, the historical search data comprising input method record data of users, webpage click history data, and map click history data; and comprehensively analyzing the input method record data, the webpage click history data, and the map click history data in the historical search data to obtain the Poi heat corresponding to each character.
An embodiment of the invention also discloses an apparatus comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for: receiving query information input by a user, the query information including character information of a point of interest (Poi) entered by the user during a search; identifying Poi character information from the query information, and determining the city information to which the Poi belongs based on the current geographic position of the user; querying an index library according to the city information to which the Poi belongs to obtain a target inverted index corresponding to the Poi character information, wherein the index library comprises index blocks based on city information and the index blocks contain the inverted index of the Poi; and generating at least one Poi result from the target inverted index and displaying the Poi result.
Optionally, identifying the Poi character information from the query information comprises: performing word segmentation on the query information to obtain at least one word segmentation result; performing part-of-speech tagging on the word segmentation result to obtain corresponding attribute information; and analyzing the word segmentation result and the attribute information to determine the Poi character information, wherein the Poi character information comprises name character information and/or address character information.
Optionally, querying the index library according to the city information to which the Poi belongs to obtain the target inverted index corresponding to the Poi character information comprises: searching the index library for a target index block corresponding to the city information to which the Poi belongs; querying, in the target index block, the inverted lists of the characters contained in the Poi character information; and determining the target inverted index from the inverted lists found, wherein the target inverted index comprises a name domain index and/or an address domain index.
Optionally, generating at least one Poi result from the target inverted index comprises: fusing the name domain index and the address domain index to obtain a fused Poi index; and generating at least one Poi result based on the inverted lists of the characters corresponding to the fused Poi index, each inverted list comprising the character number, the character position, and the Poi heat.
Optionally, displaying the Poi result comprises: for each Poi result, acquiring the corresponding Poi heat and character position from the inverted lists of the characters; sorting the Poi results according to the acquired Poi heat, the character positions, and the query information, and determining the rank of each Poi result; and displaying the Poi results in rank order.
An embodiment of the invention also discloses a readable storage medium; when the instructions in the storage medium are executed by a processor of an apparatus, the apparatus is enabled to perform one or more of the index library construction methods of the embodiments of the invention.
An embodiment of the invention also discloses a readable storage medium; when the instructions in the storage medium are executed by a processor of an apparatus, the apparatus is enabled to perform one or more of the index library-based search methods of the embodiments of the invention.
The embodiment of the invention has the following advantages:
After the Poi inverted index is created, it can be segmented according to the city information of each Poi to obtain index blocks based on the city information. In other words, the created Poi inverted index is divided into city-based index blocks according to the city to which each Poi belongs, and a Poi index library is built from these blocks, so that each Poi is stored according to its city. In a subsequent search, only the index block of the city to which the sought Poi belongs needs to be searched, rather than the whole Poi index library, which narrows the search range and improves search efficiency.
In addition, when creating the Poi inverted index, the embodiments of the invention may create Poi inverted indexes for different query domains, for example a Poi name index for the name domain information and a Poi address index for the address domain information. A Poi search can then be performed over different query domains, and the Poi inverted indexes found for those domains can be merged, for example by merging the Poi address index and the Poi name index. This enlarges the search range, increases the search recall rate, and further improves search efficiency.
Drawings
FIG. 1 is a flowchart illustrating the steps of an embodiment of an index library construction method of the present invention;
FIG. 2 is a flowchart illustrating the steps of an alternative embodiment of an index library construction method of the present invention;
FIG. 3 is a schematic diagram of an inverted list based on a single name character in an example of the invention;
FIG. 4 is a schematic diagram of separately creating a Poi address index and a Poi name index in an example of the invention;
FIG. 5 is a schematic diagram of creating a pinyin index in an example of the invention;
FIG. 6 is a schematic diagram of partitioning an inverted index in an example of the invention;
FIG. 7 is a flowchart illustrating the steps of an embodiment of an index library-based search method of the present invention;
FIG. 8 is a flowchart illustrating the steps of an alternative embodiment of an index library-based search method of the present invention;
FIG. 9 is a schematic diagram of the connection of a user query intent understanding system, a query presentation system, and an index service system in an example application of the present invention;
FIG. 10 is a block diagram illustrating the structure of an embodiment of an index library construction device of the present invention;
FIG. 11 is a block diagram illustrating the structure of an embodiment of an index library-based search device of the present invention;
FIG. 12 is a block diagram illustrating the structure of an apparatus according to an exemplary embodiment;
FIG. 13 is a schematic structural diagram of a server in an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
In the process of searching Poi by the user, the requirement of the user on the region is relatively clear. One of the core concepts of the embodiment of the invention is as follows: after the Poi inverted index is created, partitioning the Poi inverted index according to the city Poi belongs to obtain index partitions based on the city; therefore, in the subsequent searching process, the index blocks can be searched according to the city to which Poi needing to be searched belongs, namely, the searching range is narrowed, and the searching efficiency can be improved.
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a method for establishing an index library according to the present invention is shown, which may specifically include the following steps:
Step 102, scanning the data source of the point of interest (Poi), and determining attribute information and city information of each Poi.
In particular implementations, the data sources of Poi may include Poi name data, Poi address data, Poi heat data, Poi click data, Poi affiliations, an error correction dictionary repository, a synonym repository, user click history data, and the like. The embodiment of the invention can obtain the global information of Poi by scanning the various data sources of Poi. The global information of a Poi may include attribute information and city information of the Poi. The city information may be used to indicate the city to which the Poi belongs; cities may be distinguished according to administrative divisions, which is not limited in this embodiment of the present invention. The attribute information may be used to represent attributes of the point of interest, and may include, for example, name domain information, address domain information, type domain information, synonym domain information, and so forth.
The name domain information may include information related to the Poi name, such as the Poi name itself, the number of characters contained in the Poi name, the positions of the characters, and the like. The address domain information may include information related to the Poi address, and specifically may include the longitude and latitude of the Poi, detailed address information of the Poi, and the like; for example, when Poi is "do 1 th courtyard of zhongguan search for X building 15 level", the address domain information of Poi may be "do 1 th courtyard of zhongguan". The type domain information may include information related to the category to which the point of interest belongs, and specifically may include service information corresponding to that category, such as a service industry code, a service industry name, and the like. The synonym domain information may include aliases, abbreviations, and former names of the Poi name, as well as word aliases, and the like.
For example, the abbreviation "qing hua" of the point of interest "qing hua university" may be used as synonym domain information of the point of interest "qing hua university". As another example, two differently written names of the "development and improvement committee" (for example, its full name and its shortened form) may express the same meaning, so either Poi name may be used as the synonym domain information of the other. As another example, where the point of interest "chinese restaurant" has an alias, that alias may be used as the synonym domain information of the point of interest "chinese restaurant". As yet another example, the former name "xiangfan city" of the point of interest "xiangyang city" may be used as the synonym domain information of the point of interest "xiangyang city", and so on.
Of course, the attribute information of Poi may include other information, such as peripheral landmark information and accessory information, which is not limited in this embodiment of the present invention. The peripheral landmark information may be used to characterize the surroundings of the geographic location of the Poi, such as peripheral city information characterizing the city to which the point of interest belongs. The accessory information may be used to determine the Poi heat. The Poi heat may be a composite heat weight obtained by comprehensively analyzing the user's input method records, web page click history, and map click history.
Step 104, creating an inverted index of Poi based on the attribute information of each Poi.
In the embodiment of the present invention, an inverted index of Poi may be created for the attribute information of Poi, so that the Poi corresponding to the user input can be found through the inverted index in subsequent search processing. For example, a Poi address index may be created for the address domain information of Poi, so that the Poi corresponding to the address characters entered by the user can be looked up in the Poi address index in a subsequent search; as another example, a Poi name index may be created for the name domain information of Poi, so that the Poi corresponding to the name characters entered by the user can be looked up in the Poi name index in a subsequent search.
Step 106, partitioning the inverted index of Poi according to the city information to obtain index blocks based on the city information.
In the embodiment of the present invention, the city to which a Poi belongs can be determined according to the city information of the Poi. In the process of creating the inverted index, the created inverted index of Poi can be divided into index blocks corresponding to the respective cities, according to the city to which each Poi belongs, so that index blocks based on the city information are obtained.
Step 108, building a Poi index library according to the index blocks.
In the embodiment of the present invention, the index blocks corresponding to different cities may be stored in a database, which constitutes the Poi index library. Specifically, the inverted indexes of Poi can be stored according to the index blocks corresponding to different cities, so as to construct the Poi index library. The Poi index library may include one or more index blocks and the correspondences between the index blocks and the city information.
To sum up, after the inverted index of Poi is created, it can be partitioned according to the city information of each Poi into city-based index blocks, and a Poi index library can be constructed from these index blocks, so that the index data of each Poi is stored according to the city to which the Poi belongs. In the subsequent search process, only the index block corresponding to the relevant city needs to be searched in the Poi index library; the whole Poi index library does not need to be searched, so the search range is reduced and the search efficiency is improved.
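Steps 102 to 108 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the record layout `(poi_id, characters, city_code)`, the sample records `P1` and `P2`, and the function names are all hypothetical, and a posting here holds only a Poi ID and a character position.

```python
from collections import defaultdict

# Hypothetical scanned records: (poi_id, characters, city_code).
POIS = [
    ("P1", ["a", "b"], "01"),
    ("P2", ["a", "c"], "02"),
]

def create_inverted_index(pois):
    """Step 104: map each character to the (poi_id, position) pairs where it occurs."""
    index = defaultdict(list)
    for poi_id, chars, _city in pois:
        for position, ch in enumerate(chars, start=1):
            index[ch].append((poi_id, position))
    return index

def partition_by_city(index, city_of):
    """Step 106: split every posting list into per-city index blocks."""
    blocks = defaultdict(lambda: defaultdict(list))
    for ch, postings in index.items():
        for poi_id, position in postings:
            blocks[city_of[poi_id]][ch].append((poi_id, position))
    return blocks

def build_index_library(pois):
    """Steps 102-108: scan, index, partition, and store blocks keyed by city code."""
    city_of = {poi_id: city for poi_id, _chars, city in pois}
    blocks = partition_by_city(create_inverted_index(pois), city_of)
    return {city: dict(block) for city, block in blocks.items()}

library = build_index_library(POIS)
```

In this sketch the character "a" occurs in both records, but after partitioning each city block only holds the posting of the Poi that belongs to that city, which is what lets a later search skip the other blocks.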
Referring to fig. 2, a flowchart illustrating steps of an alternative embodiment of the index repository establishing method of the present invention is shown, which may specifically include the following steps:
Step 202, scanning the data source of the point of interest (Poi), and determining attribute information and city information of each Poi.
As one example of the invention, a document may be constructed by scanning the data source of Poi. The document may include the global information of the one or more Poi scanned. The global information of a point of interest may include the city to which the Poi belongs, its category, the Poi heat, the number of characters in the Poi, and the positions of the characters.
For example, after scanning five Poi "fox search network building", "fox search media building", "dog search technology company", "tianjin coastal international airport" and "tianjin university", a document may be constructed based on the global information of these five Poi, as shown in table 1 below:
Poi number    Poi name                                Poi city    Poi heat
D10001481     fox search network building             01          10000
D10001482     fox search media building               01          9000
D10001483     dog search technology company           01          9600
D10001493     tianjin coastal international airport   02          600
D10001494     tianjin university                      02          900
TABLE 1
Wherein, the Poi number may be a unique code (ID) of a Poi and may be used to identify the Poi; for example, Poi number D10001481 is adopted in table 1 to identify "fox search network building". The Poi city number may be used to indicate the city to which the Poi belongs; as shown in table 1, the city of "fox search network building", "fox search media building" and "dog search technology company" is Beijing, with city number 01 corresponding to Beijing, and the city of "tianjin coastal international airport" and "tianjin university" is Tianjin, with city number 02 corresponding to Tianjin.
Step 204, creating an inverted index of Poi based on the attribute information of each Poi.
In a particular implementation, an inverted index of Poi may be created based on the characters contained in Poi. The inverted index of Poi may include: an inverted index based on a single character, an inverted index based on a word, an inverted index based on a phrase, and the like, which is not limited in the embodiments of the present invention. Wherein, the inverted index based on a single character may refer to an index created for each single character of Poi; the word-based inverted index may refer to an index created for the words contained in Poi; the phrase-based inverted index may refer to an index created for the corresponding phrases in Poi.
In an embodiment of the present invention, corresponding indexes may be created for the various attribute information of Poi. The attribute information may be determined based on the query domain of Poi; for example, the address domain information of Poi may be determined based on the address domain, the name domain information of Poi may be determined based on the name domain, and so on. In a specific implementation, the attribute information for which an index is to be created may be referred to as specific attribute information, so that the inverted index of Poi is created with respect to the specific attribute information. The specific attribute information may include at least one of: address domain information and name domain information.
In an alternative embodiment of the present invention, the attribute information is determined based on the query domain of Poi. The query domain of Poi may include a name domain and/or an address domain. Creating an inverted index of Poi based on the attribute information of each Poi may specifically include: extracting the address domain information and/or name domain information of Poi from the attribute information of Poi; and counting the name characters included in the name domain information and the address characters included in the address domain information of each Poi, and determining an inverted index based on a single character.
For example, in connection with the above example, an inverted index of Poi may be constructed by cyclically scanning the Poi contained in the constructed document. Specifically, a character dictionary may include all the characters for which an index needs to be created. In the process of constructing the inverted index, the characters contained in the specific attribute information of each Poi in the document can be split according to the characters in the character dictionary to generate a character set; the characters in the character set can then be counted to determine the frequency corresponding to each character, and an inverted linked list based on a single character can be created based on that frequency. The inverted linked list may include information such as a character number, a frequency, a character position, and a Poi heat, and may be represented as <character ID, frequency, character position, Poi heat>. The character ID may represent the number of a single character, and may have an offset relationship with the Poi ID, so that the representation of the inverted linked list may be set to <Poi ID, frequency, character position, Poi heat> based on the offset relationship between the character ID and the Poi ID, which is not limited in this embodiment. The frequency may represent the frequency or number of times a single character appears in Poi; the character position may represent the position at which a single character appears in a Poi; the Poi heat may represent the heat of the Poi in a user-demand-oriented sense.
In one embodiment of the present invention, the offset relationship between the character ID and the Poi ID of the character contained in Poi can be expressed as follows:
δ = (A1 + A2 + A3 + A4 + … + An) + X
Specifically, δ may represent a Poi ID; (A1 + A2 + A3 + A4 + … + An) may represent the sum of the character IDs of all characters contained in Poi, where A1 may represent the character ID of the first character in Poi, A2 may represent the character ID of the second character in Poi, A3 may represent the character ID of the third character in Poi, and so on, and An may represent the character ID of the nth character in Poi, n being an integer; X may represent an offset value that may be set according to the characters contained in Poi, which is not limited by the embodiments of the present invention.
For example, when the character IDs of the individual characters in the "fox search network building" are β1, β2, β5, β6, β7, and β21, respectively, and the offset value X is X1, δ = D10001481 is obtained by calculating (β1 + β2 + β5 + β6 + β7 + β21) + X1, so the Poi ID of the "fox search network building" is determined to be D10001481; for another example, when the character IDs of the individual characters in the "fox search media building" are β1, β2, β3, β4, β7, and β21, respectively, and the offset value X is X2, δ = D10001482 is obtained by calculating (β1 + β2 + β3 + β4 + β7 + β21) + X2, so the Poi ID of the "fox search media building" is determined to be D10001482, and so on.
When the inverted index is created, the Poi heat and the user click history data corresponding to each Poi can be recorded at the same time and merged into the inverted index of Poi. Therefore, when Poi are recommended to a user based on the query words input by the user, the recommended Poi can be ranked based on the Poi heat and the user click history data, so that a more accurate Poi ranking result is obtained and the recall rate is improved.
Optionally, the Poi heat may be obtained by aggregating the user's click frequency, search frequency, and Poi input frequency. The click frequency may represent the number of times the user clicked a Poi, such as the number of times the user clicked a certain Poi on a map, which may be obtained from a user click history log; the search frequency may represent the number of times the user searched for a Poi, such as the number of times the user searched for a certain Poi on a map, which may be obtained from a user search history log; the Poi input frequency may represent the number of times the user entered a Poi, such as the number of times the user entered a certain Poi in the input method. The user click history data may be data obtained from a click history log of the Poi smart prompt.
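The description does not specify how the three frequencies are combined. As one possible reading, the following sketch uses a weighted sum; the weighted-sum form, the function name `composite_heat`, and the default weights are assumptions made purely for illustration.

```python
def composite_heat(click_count, search_count, input_count, weights=(1.0, 1.0, 1.0)):
    """Combine the three counts into one heat value.

    The weighted sum is an assumed aggregation; the source only states that
    the click, search, and input frequencies are integrated.
    """
    w_click, w_search, w_input = weights
    return w_click * click_count + w_search * search_count + w_input * input_count
```

With equal weights the heat is simply the total number of recorded interactions; unequal weights would let, for instance, map clicks count more than input-method entries.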
It should be noted that the Poi smart prompt refers to an intelligent prompt for search keywords. Specifically, when the user has input only a few search characters, the system offers the Poi results, navigation queries, or route queries that the user most likely wants, so as to improve the user's search experience.
After the inverted linked lists of the individual characters are created, an inverted index of Poi may be constructed from them. Optionally, the entries of a single-character inverted linked list can be sorted according to the Poi heat they contain, and the inverted linked list can then be stored in that order, so that the inverted index of Poi can subsequently be searched in order of Poi heat, which improves the search efficiency. Of course, the single-character inverted linked list may also be sorted in other manners; for example, it may be sorted based on the Poi ID and the character position in the inverted linked list, which is not limited in this embodiment of the present invention.
In an optional embodiment of the present invention, before determining the inverted index based on the single character, the method may further comprise: acquiring historical search data, wherein the historical search data comprises input method record data of a user, web page click history data and map click history data; and comprehensively analyzing the input method record data, the web page click history data and the map click history data in the historical search data to obtain the Poi heat corresponding to each character. Wherein the input method record data may be used to determine the number of times each Poi was entered by the user; the web page click history data may be used to determine the number of times the user searched for and/or clicked each Poi on web pages; the map click history data may be used to determine the number of times the user searched for and/or clicked each Poi in the map application.
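The two orderings mentioned above can be sketched as follows, assuming each entry is a tuple in the <Poi ID, frequency, character position, Poi heat> form. The Poi IDs and heat values are taken from Table 1; the frequency and position fields, and the function names, are illustrative.

```python
# Entries in the form (poi_id, frequency, character_position, poi_heat).
POSTINGS = [
    ("D10001481", 3, 1, 10000),
    ("D10001482", 3, 1, 9000),
    ("D10001483", 3, 1, 9600),
]

def sort_by_heat(postings):
    """Order entries from hottest to coldest Poi."""
    return sorted(postings, key=lambda entry: entry[3], reverse=True)

def sort_by_id_and_position(postings):
    """Alternative order: by Poi ID, then by character position."""
    return sorted(postings, key=lambda entry: (entry[0], entry[2]))

by_heat = sort_by_heat(POSTINGS)
by_id = sort_by_id_and_position(POSTINGS)
```

Storing a list in heat order means a search that only needs the top few results can stop reading early, whereas the ID order makes it easier to merge the lists of several characters.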
As an example of the present invention, an inverted linked list based on a single name character may be created according to the name characters contained in each Poi name in table 1, as shown in fig. 3. Taking the name character "search" as an example, the frequency corresponding to the name character "search" is 3, so it can be determined that "search" appears in 3 Poi names; the inverted linked list corresponding to the name character "search" includes: <D10001481, 3, 1, 10000>, <D10001482, 3, 1, 9000> and <D10001483, 3, 1, 9600>; wherein <D10001481, 3, 1, 10000> may indicate that "search" appears in the Poi name "fox search network building", that "search" is the first name character of that Poi name, and that the corresponding Poi heat is 10000.
In an alternative embodiment of the present invention, counting the name characters contained in the name domain information of each Poi, and determining the inverted index based on a single character, includes: counting the name characters contained in the name domain information, and determining the frequency corresponding to each name character; determining an inverted linked list of each name character according to the frequency corresponding to each name character, wherein the inverted linked list comprises: name character number, character position, and Poi heat; and, for the name domain information of each Poi, constructing a Poi name index based on the inverted linked lists of the name characters.
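The construction of the single-character inverted linked lists for the three Beijing entries of Table 1 can be sketched as follows. This is an illustrative sketch only: each name character is represented by a pinyin syllable as a stand-in (the syllables for the first name follow the string "sou-hu-wang-luo-da-sha" given in this description; the remaining syllables are assumed), and frequency is taken to be the number of Poi names containing the character, as in the "search" example.

```python
from collections import defaultdict

# (poi_id, name characters, city code, heat); syllables stand in for characters.
POIS = [
    ("D10001481", ["sou", "hu", "wang", "luo", "da", "sha"], "01", 10000),
    ("D10001482", ["sou", "hu", "mei", "ti", "da", "sha"], "01", 9000),   # assumed syllables
    ("D10001483", ["sou", "gou", "ke", "ji", "gong", "si"], "01", 9600),  # assumed syllables
]

def build_inverted_lists(pois):
    """Return {character: [(poi_id, frequency, position, heat), ...]}."""
    occurrences = defaultdict(list)
    for poi_id, chars, _city, heat in pois:
        for position, ch in enumerate(chars, start=1):
            occurrences[ch].append((poi_id, position, heat))
    inverted = {}
    for ch, hits in occurrences.items():
        # Frequency: number of distinct Poi names in which the character appears.
        frequency = len({poi_id for poi_id, _pos, _heat in hits})
        inverted[ch] = [(poi_id, frequency, pos, heat) for poi_id, pos, heat in hits]
    return inverted

inverted = build_inverted_lists(POIS)
```

Here `inverted["sou"]` holds three entries with frequency 3 and position 1, matching the list given for the name character "search".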
Taking the construction of the Poi name index for "search fox" as an example, as shown in fig. 3, based on the single-character inverted linked lists, it may be determined that the inverted linked list entries corresponding to "search fox" include <D10001481, 3, 1, 10000>, <D10001482, 3, 1, 9000>, <D10001481, 2, 2, 10000> and <D10001482, 2, 2, 9000>; further, based on the Poi ID and character position of the entries, <D10001481, 3, 1, 10000> and <D10001481, 2, 2, 10000> are combined to form the name index of one Poi whose name contains "search fox", and <D10001482, 3, 1, 9000> and <D10001482, 2, 2, 9000> are combined to form the name index of the other Poi whose name contains "search fox".
In particular implementations, one Poi may have both name domain information and address domain information. In creating the inverted index of Poi, not only may a Poi name index be created for the name domain information, but a Poi address index may also be created for the address domain information, as shown in FIG. 4. Optionally, counting the address characters included in the address domain information of each Poi, and determining an inverted index based on a single character, may include: counting the address characters contained in the address domain information, and determining the frequency corresponding to each address character; determining an inverted linked list of each address character according to the frequency corresponding to each address character, wherein the inverted linked list comprises: address character number, character position, and Poi heat; and, for the address domain information of each Poi, constructing a Poi address index based on the inverted linked lists of the address characters.
Since Poi names and Poi addresses belong to different query domains, the Poi name index and the Poi address index may be created separately when creating the inverted index. Taking the point of interest "fox search network building" as an example, when the address of the "fox search network building" is "zhongguan dongduo No. 1 yard No. 9", the Poi address "zhongguan dongduo No. 1 yard No. 9" can be used as address domain information, and a Poi address index corresponding to the "fox search network building" can be constructed based on the characters contained in that Poi address; likewise, the Poi name "fox search network building" can be used as name domain information, and the Poi name index can be constructed based on the characters contained in that Poi name.
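Combining entries by Poi ID and character position, as in the "search fox" example, can be sketched as a consecutive-position match. The function name `match_sequence` and the syllable keys "sou" and "hu" (stand-ins for the two characters) are illustrative assumptions; the entry values are those of the example above.

```python
# Single-character lists in the form (poi_id, frequency, position, heat).
INVERTED = {
    "sou": [("D10001481", 3, 1, 10000), ("D10001482", 3, 1, 9000), ("D10001483", 3, 1, 9600)],
    "hu": [("D10001481", 2, 2, 10000), ("D10001482", 2, 2, 9000)],
}

def match_sequence(inverted, chars):
    """Return (poi_id, heat) pairs for Poi whose names contain `chars` consecutively."""
    if not chars:
        return []
    # Candidate matches keyed by (poi_id, position of the first character).
    starts = {(pid, pos): heat for pid, _freq, pos, heat in inverted.get(chars[0], [])}
    for offset, ch in enumerate(chars[1:], start=1):
        seen = {(pid, pos - offset) for pid, _freq, pos, _heat in inverted.get(ch, [])}
        starts = {key: heat for key, heat in starts.items() if key in seen}
    best = {pid: heat for (pid, _pos), heat in starts.items()}
    return sorted(best.items(), key=lambda item: item[1], reverse=True)

result = match_sequence(INVERTED, ["sou", "hu"])
```

The third Poi (D10001483) contains the first character but not the second, so it is dropped, leaving the two Poi whose names begin with "search fox".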
During daily input, the user often inputs complete or incomplete pinyin character strings, such as "beijing". For this case, the embodiment of the present invention may convert Poi containing Chinese characters into the corresponding pinyin character strings during the process of creating the inverted index, so as to create corresponding pinyin indexes for the pinyin character strings. For example, as shown in FIG. 5, the Chinese characters contained in the Poi name "fox search network building" are converted into the corresponding pinyin string "sou-hu-wang-luo-da-sha"; the converted pinyin string can then be split into single pinyin syllables, namely "sou", "hu", "wang", "luo", "da" and "sha", each corresponding to a single Chinese character. Then, a pinyin prefix index tree can be created for the single pinyin syllables according to the prefix tree principle, so that the pinyin index corresponding to each single Chinese character can be generated based on the prefix index tree. Therefore, in the search process, the corresponding Poi results can be recommended to the user for an input pinyin string according to the pinyin index, which improves the search efficiency and the recall rate.
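A prefix tree over single pinyin syllables can be sketched as follows. The class names, the choice to store Poi IDs at the node where a syllable ends, and the prefix lookup method are assumptions for illustration; the description only states that a prefix index tree is built from the single pinyin syllables.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # letter -> TrieNode
        self.poi_ids = set()  # Poi whose name contains the syllable ending here

class PinyinTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, syllable, poi_id):
        """Add one pinyin syllable of a Poi name, letter by letter."""
        node = self.root
        for letter in syllable:
            node = node.children.setdefault(letter, TrieNode())
        node.poi_ids.add(poi_id)

    def search_prefix(self, prefix):
        """Return Poi IDs for every stored syllable that starts with `prefix`."""
        node = self.root
        for letter in prefix:
            node = node.children.get(letter)
            if node is None:
                return set()
        found, stack = set(), [node]
        while stack:
            current = stack.pop()
            found |= current.poi_ids
            stack.extend(current.children.values())
        return found

trie = PinyinTrie()
for syllable in "sou-hu-wang-luo-da-sha".split("-"):
    trie.insert(syllable, "D10001481")
```

Because lookup walks the tree by prefix, an incomplete input such as "so" still reaches the syllable "sou" and therefore the Poi that contains it.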
Step 206, partitioning the inverted indexes of Poi according to the city information to obtain the inverted indexes of Poi corresponding to each piece of city information.
Step 208, generating corresponding index blocks based on the inverted indexes of Poi corresponding to the same city information.
As an example of the present invention, after the inverted indexes of all scanned Poi are created, all the created inverted indexes may be stored in one inverted file; the inverted index of each Poi in the inverted file can then be divided into the index blocks of different cities according to the city to which the Poi belongs, where the cities may be prefecture-level cities as defined by the administrative divisions. For example, as shown in fig. 6, the inverted indexes in the inverted file are divided according to the city to which each Poi belongs, so that the inverted linked lists corresponding to Poi belonging to Beijing can be divided into the Beijing index block, the inverted linked lists corresponding to Poi belonging to Shenzhen can be divided into the Shenzhen index block, and the inverted linked lists corresponding to Poi belonging to other cities can be divided into the index blocks of those cities.
Step 210, constructing a correspondence between the inverted index of Poi and the index blocks, where the inverted index includes at least one of: a Poi name index and a Poi address index.
Specifically, after the inverted index of Poi is divided into index blocks, a correspondence between the inverted index of each Poi in an index block and that index block may be constructed. For example, since the city to which the point of interest "fox search network building" belongs is Beijing, after the Poi name index and the Poi address index of the point of interest "fox search network building" are divided into the Beijing index block, the correspondence between the Poi address index of "fox search network building" and the Beijing index block can be constructed, and the correspondence between the Poi name index of "fox search network building" and the Beijing index block can be constructed.
Step 212, constructing a Poi index library based on the correspondence between the inverted index of Poi and the index blocks.
After the correspondence between the inverted index of Poi and the index blocks is constructed, the inverted index of Poi and the city information of Poi may be stored based on that correspondence, so as to construct the Poi index library. For example, based on the correspondence between the inverted index of Poi and the index blocks, the inverted linked lists corresponding to the characters included in a Poi and the city information of the city to which the Poi belongs may be stored, and the Poi index library may be generated. Therefore, in the Poi search process, only the indexes in the index block of the city to which the Poi belongs need to be searched, so the search range is narrowed and the search efficiency is improved.
To sum up, in the process of creating the inverted index of Poi, the embodiment of the present invention may create an inverted index based on a single character for each character included in Poi, and divide the created inverted index into the corresponding index blocks for storage according to the city to which each Poi belongs, thereby constructing a Poi index library. In the subsequent Poi search process, the corresponding index block can be found in the Poi index library according to the city to which the Poi belongs; the corresponding target inverted index can then be found in that index block based on the characters contained in the Poi character information input by the user; and one or more Poi results can be generated and displayed according to the target inverted index, so that Poi results are recommended to the user to meet the user's requirements.
Referring to fig. 7, a flowchart illustrating steps of an embodiment of a search method based on an index repository according to the present invention is shown, which may specifically include the following steps:
Step 702, receiving query information input by a user, where the query information includes character information of a point of interest (Poi) input by the user in a search process.
Specifically, after the user inputs query information on a terminal, the terminal may send the query information to the server, so that the server receives the query information input by the user. For example, when the user wants to obtain a certain Poi, the user may input information about that Poi at the terminal; the terminal takes the input as query information and sends it to the server, which receives the query information and executes step 704.
Step 704, identifying Poi character information from the query information, and determining the city information to which the Poi belongs based on the current geographic position of the user.
In the embodiment of the invention, the server can perform Poi identification and Poi attribute labeling on the query information input by the user to identify the Poi character information contained in the query information. The Poi character information may include character information corresponding to different query domains, such as address character information corresponding to the address domain, name character information corresponding to the name domain, and the like. The character form of the Poi character information can be determined according to the input mode of the user; for example, when the user uses a Chinese input method, the characters of the Poi character information can be Chinese characters; as another example, when the user uses an English input method, the characters of the Poi character information may be pinyin characters, and the like, which is not limited in this embodiment of the present invention.
Meanwhile, the server may obtain the current geographic location of the user through the terminal, and may further determine the city information to which the Poi belongs based on that location. This city information may be used to characterize the city of the Poi that the user needs to obtain. For example, when the user searches for a Poi in the Haidian District of Beijing, the city code 01 corresponding to Beijing can be used as the city information to which the Poi belongs, and the Poi search is performed in the index block corresponding to Beijing; as another example, when the user performs a Poi search at Tianjin University, the city code 02 corresponding to Tianjin may be used as the city information to which the searched Poi belongs, and the Poi search may be performed in the index block corresponding to Tianjin.
Step 706, querying an index library according to the city information to which the Poi belongs to obtain a target inverted index corresponding to the Poi character information, wherein the index library comprises index blocks based on the city information, and the index blocks comprise the inverted index of Poi.
After determining the city information to which the Poi belongs, the server may query the pre-constructed index library according to the city information, find the index block corresponding to that city information, and determine the found index block as the target index block, so as to search for the inverted index corresponding to the Poi character information within the target index block.
After the inverted index corresponding to the Poi character information is found, it can be used as the target inverted index corresponding to the Poi character information; that is, the target inverted index is obtained from the target index block alone, so the search range is reduced and the search efficiency is improved.
For example, in combination with the inverted linked list based on a single character shown in fig. 3, the inverted linked list entries corresponding to "search fox", namely <D10001481, 3, 1, 10000>, <D10001482, 3, 1, 9000>, <D10001481, 2, 2, 10000> and <D10001482, 2, 2, 9000>, may be found in the Beijing index block, and these entries may then be determined as the target inverted index.
Step 708, generating at least one Poi result according to the target inverted index, and displaying the Poi result.
After determining the target inverted index, the server may construct one or more Poi results based on the characters corresponding to the target inverted index, and may return Poi results to the terminal used by the user, through which the Poi results are presented.
For example, in combination with the above example, based on <D10001481, 3, 1, 10000> and <D10001481, 2, 2, 10000>, the Poi "fox search network building" corresponding to Poi number D10001481 can be taken as one Poi result; meanwhile, based on <D10001482, 3, 1, 9000> and <D10001482, 2, 2, 9000>, the Poi "fox search media building" corresponding to Poi number D10001482 can be taken as another Poi result; the Poi "fox search media building" and the Poi "fox search network building" can then be presented on the terminal screen as recommendations to the user to meet the user's needs.
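Steps 706 and 708 can be sketched as a lookup that first selects the index block by city code and then intersects the entries of the query characters. The library layout `{city_code: {character: entries}}`, the function name `search`, the syllable keys standing in for the characters, and ranking by heat are illustrative assumptions; the entry values are those of the example above.

```python
# Index library: city code -> {character: [(poi_id, frequency, position, heat), ...]}.
LIBRARY = {
    "01": {  # Beijing index block
        "sou": [("D10001481", 3, 1, 10000), ("D10001482", 3, 1, 9000), ("D10001483", 3, 1, 9600)],
        "hu": [("D10001481", 2, 2, 10000), ("D10001482", 2, 2, 9000)],
    },
    "02": {},  # Tianjin index block (entries omitted in this sketch)
}

def search(library, city_code, chars):
    """Return Poi IDs in the city's block that contain every query character."""
    block = library.get(city_code, {})  # only this block is consulted
    hits, heat = None, {}
    for ch in chars:
        entries = block.get(ch, [])
        ids = {entry[0] for entry in entries}
        heat.update({entry[0]: entry[3] for entry in entries})
        hits = ids if hits is None else hits & ids
    return sorted(hits or (), key=lambda poi_id: heat[poi_id], reverse=True)

results = search(LIBRARY, "01", ["sou", "hu"])
```

The same query issued with city code 02 returns nothing, because the Tianjin block is the only one consulted and it holds no entries for these characters.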
In practice, when inputting the query information, the user can input Poi character information belonging to different query domains to trigger the server to perform the Poi search in different query domains, so that the search coverage can be widened and the search efficiency can be improved. The query domain may include, but is not limited to, an address domain and a name domain. For example, when the query information input by the user is "manyfold No. 1 search fox building", the "manyfold No. 1" may be recognized as address character information, and the "search fox building" may be recognized as name character information; that is, a mixed query request covering different query domains may occur. For such mixed query requests, the embodiment of the invention can perform a merged query on the Poi index according to the Poi character information corresponding to the different query domains, so that the inverted indexes of Poi corresponding to the different query domains can be queried.
Referring to fig. 8, a flowchart illustrating steps of an alternative embodiment of the search method based on the index repository according to the present invention is shown, which may specifically include the following steps:
Step 802, receiving query information input by a user, where the query information includes character information of a point of interest Poi input by the user in a search process.
Step 804, identifying Poi character information from the query information, and determining Poi city information based on the current geographic position of the user.
In this embodiment of the present invention, optionally, identifying Poi character information from the query information may include: performing word segmentation on the query information to obtain at least one word segmentation result; performing part-of-speech tagging on the word segmentation result to obtain corresponding attribute information; and analyzing the word segmentation result together with the attribute information to determine the Poi character information, wherein the Poi character information comprises name character information and/or address character information. In a specific implementation, the query information input by the user can be segmented according to preset parts of speech to obtain at least one word segmentation result; each word segmentation result can then be tagged with a part of speech, the attribute information corresponding to each word segmentation result can be determined from the tagged part of speech, and the word segmentation results can be analyzed based on the attribute information to determine the Poi character information corresponding to each query domain. For example, based on the name domain information, the characters contained in the Poi name can be extracted from the word segmentation results, yielding the name character information; similarly, based on the address domain information, the characters contained in the Poi address can be extracted from the word segmentation results, yielding the address character information.
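The three stages above (word segmentation, part-of-speech tagging, analysis into name and address character information) can be illustrated with a minimal sketch. The lexicon contents, the tag names, the greedy longest-match strategy, and the rule that road words and numbers belong to the address domain are all assumptions made for illustration; the description does not prescribe a particular segmentation algorithm.

```python
# Hypothetical lexicon: word -> preset part of speech.
LEXICON = {
    "chengfu lu": "road_word",
    "search fox": "brand_word",
    "building": "common_suffix",
}
# Assumed rule: these parts of speech belong to the address domain.
ADDRESS_TAGS = {"road_word", "number", "administrative_division"}


def segment(query):
    """Greedy longest-match segmentation over whitespace-separated units."""
    units = query.lower().split()
    tokens, i = [], 0
    while i < len(units):
        for j in range(len(units), i, -1):
            candidate = " ".join(units[i:j])
            # Fall back to a single unit when nothing in the lexicon matches.
            if candidate in LEXICON or j == i + 1:
                tokens.append(candidate)
                i = j
                break
    return tokens


def tag(tokens):
    """Attach a part of speech to each word segmentation result."""
    return [(tok, "number" if tok.isdigit() else LEXICON.get(tok, "unknown"))
            for tok in tokens]


def analyze(tagged):
    """Split the tagged tokens into address and name character information."""
    address = [tok for tok, pos in tagged if pos in ADDRESS_TAGS]
    name = [tok for tok, pos in tagged if pos not in ADDRESS_TAGS]
    return {"address": " ".join(address), "name": " ".join(name)}


print(analyze(tag(segment("Chengfu Lu 1 search fox building"))))
```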
Step 806, searching a target index block corresponding to the city information to which the Poi belongs in an index database.
In the embodiment of the present invention, according to the correspondence between the city information and the index blocks, the target index block corresponding to the city information to which Poi belongs may be searched in the index database, so as to query the target inverted index corresponding to the Poi character information in the target index block.
For example, when a user in Beijing uses a terminal to input the query information "search fox" on the operation interface of a map application, the server may identify the query information "search fox" as Poi character information, so as to search for the Poi that the user needs according to "search fox", and may determine the city number 01 corresponding to Beijing as the city information of that Poi, so as to search for the inverted index corresponding to "search fox" in the Beijing index block corresponding to city number 01.
Step 808, querying, in the target index block, the inverted linked lists of the characters matching the Poi character information.
Step 810, determining a target inverted index according to the inverted linked lists of the matched characters, wherein the target inverted index comprises a name domain index and/or an address domain index.
In the embodiment of the present invention, the inverted linked list of each character contained in the Poi character information may be searched in the target index block, so that the inverted linked lists of the characters corresponding to the different query domains are obtained, and the target inverted indexes corresponding to the different query domains may then be determined from them. For example, the name characters contained in the name character information of the Poi may be searched in the target index block, and the name domain index of the Poi on the name domain may be determined from the inverted linked lists of the name characters found; similarly, the address characters contained in the address character information of the Poi may be searched in the target index block, and the address domain index of the Poi on the address domain may be determined from the inverted linked lists of the address characters found.
Step 812, generating at least one Poi result according to the target inverted index, and displaying the Poi result.
When a compound-domain query occurs, that is, when the query information input by the user contains Poi character information corresponding to different query domains, the embodiment of the present invention may, after determining the target inverted indexes corresponding to the different query domains, merge the queried target inverted indexes according to the Poi ID, and generate the Poi results to be recommended based on the merged target inverted indexes, thereby reducing the number of Poi results recommended. Optionally, in a case that the queried target inverted index includes an address domain index corresponding to the address domain and a name domain index corresponding to the name domain, generating at least one Poi result according to the target inverted index includes: fusing the name domain index and the address domain index to obtain a fused Poi index; and generating at least one Poi result based on the inverted linked lists of the characters corresponding to the fused Poi index, each inverted linked list comprising: character number, character position, and Poi heat.
For example, after the user inputs "Chengfu Lu 1 search fox building", the server determines the address domain index corresponding to "Chengfu Lu 1" and the name domain index corresponding to "search fox building", and merges the address domain index and the name domain index to obtain the Poi index corresponding to "Chengfu Lu 1 search fox building". The Poi index may include the inverted linked list corresponding to each Chinese character contained in "Chengfu Lu 1 search fox building". Therefore, the Poi result corresponding to "Chengfu Lu 1 search fox building" can be formed based on the inverted linked list corresponding to each Chinese character it contains, for example by displaying the geographic position of the point of interest "Chengfu Lu 1 search fox building" on the interface of a map application, or by displaying one or more navigation routes from the current geographic position of the user to the geographic position of that point of interest.
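A minimal sketch of merging a name domain index and an address domain index by Poi ID is given below. The dictionary layout, the posting tuples, and the choice to keep only Poi that appear in both domains (an intersection) are assumptions; the description states only that the indexes are merged according to the Poi ID and that this reduces the number of recommended results.

```python
def fuse_indexes(name_index, address_index):
    """Merge two per-domain indexes by Poi ID, keeping Poi found in both."""
    fused = {}
    for poi_id, name_postings in name_index.items():
        if poi_id in address_index:
            fused[poi_id] = {
                "name": name_postings,
                "address": address_index[poi_id],
            }
    return fused


# Hypothetical query results; postings are (character number, position, heat).
name_index = {
    "D10001481": [(3, 1, 10000), (2, 2, 10000)],
    "D10001482": [(3, 1, 9000), (2, 2, 9000)],
}
address_index = {
    "D10001481": [(7, 1, 10000)],
    "D20000001": [(7, 1, 500)],
}

fused = fuse_indexes(name_index, address_index)
print(list(fused))
```

Only the Poi matched in both the name domain and the address domain survives the merge, which is how a mixed query narrows the candidate set.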
In the embodiment of the present invention, when two or more Poi results are generated, the Poi results are sorted according to Poi heat to obtain a corresponding order, so that the Poi results can be displayed in that order, that is, the Poi result with the higher Poi heat is recommended to the user first. Optionally, presenting the Poi results may include: for each Poi result, acquiring the corresponding Poi heat and character positions from the inverted linked lists of the characters; sorting the Poi results according to the acquired Poi heat, the character positions and the query information, and determining the order corresponding to each Poi result; and presenting the Poi results in that order. Specifically, the embodiment of the present invention may determine the position of each character within the Poi based on the character position and the Poi ID in the inverted linked list, determine the degree of association between the characters contained in the Poi result and the query information based on those positions, and then sort the generated Poi results by the degree of association and the Poi heat, so that the Poi results are displayed in the resulting order; that is, Poi results with a higher Poi heat and a higher degree of association are recommended to the user, thereby improving the recall rate corresponding to the query information.
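The ordering by degree of association and Poi heat can be sketched as follows. The way the degree of association is computed here (the share of adjacent query characters that are also adjacent and in the same order within the Poi) is an assumption chosen for illustration, as are the result dictionary layout and the Poi ID "D20000001".

```python
def association(positions):
    """Share of adjacent query characters that are consecutive in the Poi.

    `positions` lists, in query order, the position of each query character
    inside the Poi, as read from the inverted linked lists.
    """
    if len(positions) < 2:
        return 1.0
    hits = sum(1 for a, b in zip(positions, positions[1:]) if b - a == 1)
    return hits / (len(positions) - 1)


def rank(results):
    """Sort by degree of association first, then by Poi heat, both descending."""
    return sorted(
        results,
        key=lambda r: (association(r["positions"]), r["heat"]),
        reverse=True,
    )


candidates = [
    {"poi_id": "D10001482", "heat": 9000, "positions": [1, 2]},
    {"poi_id": "D20000001", "heat": 20000, "positions": [3, 1]},
    {"poi_id": "D10001481", "heat": 10000, "positions": [1, 2]},
]
for result in rank(candidates):
    print(result["poi_id"])
```

In this sketch a very hot Poi whose characters appear out of order still ranks below Poi that match the query order, and heat breaks ties among equally associated results.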
As an application of the present invention, the search method based on the index library provided in the embodiment of the present invention may be applied to a user query intention understanding system and a query presentation system, and the index library establishment method may be applied to an index service system. The user query intention understanding system can be used as a system for judging preliminary requirements of a user, can be used for receiving query information input by the user, can identify Poi character information such as Poi partial input information, Poi complete input information and Poi error input information from the query information input by the user, and can determine user requirements such as Poi address query requirements, navigation requirements and route query requirements, so that the query presentation system can search Poi results corresponding to the identified Poi character information according to the user requirements to provide the optimal Poi results for the user.
For example, as shown in fig. 9, after receiving query information input by a user, the user query intention understanding system 910 may perform word segmentation on the query information through the word segmentation module 912 to obtain one or more word segmentation results, where each word segmentation result may include one or more characters; the part-of-speech tagging module 914 may then label the obtained word segmentation results with Poi attributes according to the preset parts of speech, that is, determine the attribute information of the Poi. The preset parts of speech may include stop words, navigation jump words, bus jump words, brand chain words, administrative division words, bus stops, numbers, simple points of interest, common suffix words, subway jump words, classification brand words, road words, direction words, letters, and the like, which are not limited in this embodiment of the present invention. Alternatively, the parts of speech may be set according to the words contained in a lexicon. The lexicon may be obtained by analyzing user logs, and may include a Poi name lexicon, a synonym lexicon, a category and brand lexicon, and the like. Optionally, when the query information input by the user may refer to more than one Poi, for example when the query information includes query words corresponding to several kinds of attribute information, the probability of the currently labeled Poi attribute may be determined in combination with the query information, so that the subsequent ranking can take the probability of the Poi attribute into account and give reasonable Poi results.
In addition, the user query intention understanding system 910 may further analyze the results generated by the word segmentation module 912 and the part-of-speech tagging module 914 through the query analysis module 916, for example by analyzing the word segmentation results based on a preset rule template and the tagged attribute information to obtain a reasonable query logic, and may send the query logic to the query presentation system 920, so as to trigger the query presentation system 920 to perform the query according to the query logic. For example, according to the query logic, the query presentation system 920 may query Poi from incomplete input information, filter stop words, handle navigation jump words, handle subway jump words, query Poi by address, query classification brands, and so forth.
Specifically, the query presentation system 920 may match, in the index library of the index service system 930, the Poi inverted index corresponding to the Poi character information according to the Poi character information and the user requirement sent by the user query intention understanding system 910, and may then generate Poi results based on the matched Poi inverted index and present the Poi results in a reasonable manner. The presentation may include Poi presentation, structured presentation, aggregate presentation, master-child node presentation, navigation jump and route jump presentation, and the like. The structured presentation may represent the presentation of facilities belonging to a Poi, such as "door", "parking lot", "entrance", "ticket office", "Block A and Block B", and the like.
The aggregate presentation may represent an aggregated presentation of the Poi results that the user may need. Specifically, when the user has input only a few characters, the user query intention understanding system may determine many possible user requirements, and the query presentation system may present the Poi results most likely to be required in an aggregated manner, thereby improving the user's input efficiency. For example, when the user inputs the pinyin "jiao", the query presentation system may obtain, through aggregation and ranking, words such as "transportation bank", "educational training", "dumpling hall", and "church" for the user to choose from.
The master-child node presentation may be used to present the affiliations and correlations of a brand or category. For example, the category "hotel" includes "fast hotel", "star hotel", "youth hotel", and so on; if the query information input by the user is "hotel", the query presentation system can give the subclasses belonging to hotel, such as the three categories "fast hotel", "star hotel" and "youth hotel"; if the query information input by the user is "fast hotel", the query presentation system may give results for hotel brands such as "seven days", "like home", and the like.
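The parent-to-child lookup in this example can be sketched with a simple mapping. The `CATEGORY_TREE` structure and the `child_nodes` helper are hypothetical; the entries are taken from the example in the text.

```python
# Hypothetical category tree built from the example: parent -> child nodes.
CATEGORY_TREE = {
    "hotel": ["fast hotel", "star hotel", "youth hotel"],
    "fast hotel": ["seven days", "like home"],
}


def child_nodes(query):
    """Return the child categories or brands to present for a query."""
    return CATEGORY_TREE.get(query.strip().lower(), [])


print(child_nodes("hotel"))
print(child_nodes("fast hotel"))
```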
The navigation jump and route jump display refers to displaying an interface jumping to a navigation page or a navigation route page. Specifically, when the user query intention understanding system 910 determines that the user needs to perform navigation and route query by analyzing the query words input by the user, the query presentation system 920 may provide a jump interface according to the searched Poi result, so as to quickly meet the user's requirement.
In the sorting process, the query presentation system 920 may sort the generated Poi results based on factors such as word order, word closeness, word affiliation, regularized weighting, and Poi click rate, so as to improve the recall rate and the search efficiency. The word order can represent the position of a word within the Poi; for example, it can be determined by using the character position and the Poi ID in the inverted linked list. The word closeness may be calculated from the edit distance between words. The word affiliation may represent the administrative-division affiliation of a word and its synonym proximity. The regularized weighting can indicate that the weight of a specific word needs to be adjusted for certain user requirements: for example, when the user inputs a sequence of numbers, the ranking weight of bus lines can be raised; when the query information input by the user contains words in parentheses, the ranking weight of those words can be lowered; and when processing a cross-city query, that is, when results for the current city need to be balanced against results from other cities, the cross-city popularity data may be weighted, and so on.
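Since word closeness is described as being derived from edit distance, a short sketch of that calculation may help. The edit distance below is the standard Levenshtein distance; the normalization into a closeness value between 0 and 1 is an assumption, as the description does not say how the distance is converted.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closeness(a, b):
    """Assumed normalization: 1.0 for identical strings, 0.0 for disjoint ones."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


print(edit_distance("kitten", "sitting"))
print(closeness("abc", "abd"))
```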
In one embodiment, a fitting model may be created based on word order, word closeness, word affiliation, the regularized weighting factor, and Poi click rate, so that the generated Poi results can be comprehensively scored through the fitting model to obtain a corresponding comprehensive score value (its symbol is given as image BDA0001423236520000251 in the original publication). For example, the created fitting model can be represented by the following formula:
[Formula given as image BDA0001423236520000252 in the original publication; not reproduced here]
wherein the parameter x may represent word closeness, the parameter y may represent word order, the parameter z may represent word affiliation, p may represent the regularized weighting factor, and t may represent the Poi click rate; a may represent the weight of word closeness, b the weight of word order, and c the weight of word affiliation. It should be noted that a, b, and c may be set by regression based on a supervised learning method, for example by means of a preset regression model, and the embodiment of the present invention is not limited thereto.
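To make the roles of these parameters concrete, the sketch below scores Poi results with one possible functional form. The published formula is available only as an image, so the form used here (a weighted sum of x, y and z, scaled by p and by t) is purely an assumption for illustration and should not be read as the formula of the invention; the weights and feature values are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Features:
    x: float  # word closeness
    y: float  # word order
    z: float  # word affiliation
    p: float  # regularized weighting factor
    t: float  # Poi click rate


def composite_score(f, a, b, c):
    """ASSUMED form: weighted sum of x, y, z scaled by p and t."""
    return (a * f.x + b * f.y + c * f.z) * f.p * f.t


# Hypothetical weights, as might be obtained from a regression model.
A, B, C = 0.5, 0.3, 0.2

candidates = {
    "poi_1": Features(x=0.9, y=0.8, z=0.2, p=1.0, t=0.1),
    "poi_2": Features(x=0.6, y=0.6, z=0.9, p=1.0, t=0.6),
}
ranking = sorted(candidates,
                 key=lambda k: composite_score(candidates[k], A, B, C),
                 reverse=True)
print(ranking)
```

Under this assumed form, a result with weaker literal similarity but a much higher click rate can outrank a closer literal match, which mirrors the reordering described in the example that follows.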
For example, after the user inputs the query information "Beijing traffic administration", the Poi results that the query presentation system finds in the index library of the index service system may include: A "West City traffic administration", B "Beijing Public Security Bureau traffic administration (Fuchengmen North Street)", C "Beijing Public Security Bureau traffic administration", and D "Yongfeng base hot spring traffic administration". When the user query intention understanding system interprets the query information as a search for a "traffic administration" within Beijing, that is, when the identified Poi character information is "traffic administration", it can be determined that the word closeness and word order of "traffic administration" with respect to D "Yongfeng base hot spring traffic administration" are higher, that those with respect to C "Beijing Public Security Bureau traffic administration" are also relatively high, and that those with respect to A "West City traffic administration" and B "Beijing Public Security Bureau traffic administration (Fuchengmen North Street)" are lower. If the results are ranked only by comparing the query information with the literal characters of the Poi results, the order of the Poi results can be determined as: D "Yongfeng base hot spring traffic administration", C "Beijing Public Security Bureau traffic administration", A "West City traffic administration", B "Beijing Public Security Bureau traffic administration (Fuchengmen North Street)".
When the user query intention understanding system interprets the query information as referring to "Beijing city traffic administration", that is, when the identified Poi character information is "Beijing city traffic administration", it can be determined that the word affiliation between "Beijing city traffic administration" and both C "Beijing Public Security Bureau traffic administration" and B "Beijing Public Security Bureau traffic administration (Fuchengmen North Street)" is higher. If the results are ranked only by comparing the query information with the literal characters of the Poi results, the order of the Poi results can be determined as: C "Beijing Public Security Bureau traffic administration", B "Beijing Public Security Bureau traffic administration (Fuchengmen North Street)", A "West City traffic administration", and D "Yongfeng base hot spring traffic administration". By applying the fitting model in this example to comprehensively score the Poi results, the final order of the Poi results can be determined as: A "West City traffic administration", B "Beijing Public Security Bureau traffic administration (Fuchengmen North Street)", C "Beijing Public Security Bureau traffic administration", and D "Yongfeng base hot spring traffic administration". Because the Poi results are ranked in combination with the Poi click volume, a more accurate ranking is obtained, and which Poi result better meets the user's needs can be considered from the user's perspective, thereby improving the recall rate.
The index service system 930 may be configured to scan various data sources, create an inverted index based on a single character, and divide the created inverted index into index partitions corresponding to different cities for storage; it may also merge and sort the Poi indexes found by the query presentation system 920. The data sources may include Poi name data, Poi address data, Poi heat data, Poi click data, Poi affiliation libraries, error correction dictionary libraries, synonym libraries, user click history data, and so forth.
In a specific implementation, the Poi address data and the Poi affiliation library can both be used to describe the specific address of a Poi. The Poi address data need not describe the administrative division to which the Poi belongs and may be used specifically to describe the detailed location of the Poi; for example, the Poi address data "kindred scientific building D seat of qing east way 1 kindergarten, hai lake, beijing city" describes the detailed address of a Poi. The Poi affiliation library may include data indicating the administrative division to which the Poi belongs, which may include the five levels of administrative units "province-city-county (district)-town-street". With the Poi affiliation library, the index service system 930 can accurately obtain the affiliation of a Poi and, at the same time, allow the user query intention understanding system 910 to correct administrative divisions that the user has entered incorrectly.
The error correction dictionary library can be a lexicon of common similar words, easily mistaken words and fuzzy pronunciations obtained by mining and analyzing user input logs. By counting the logs, the lexicon can give the probability of the user's most likely input errors, so that the index service system 930 can correct erroneous information input by the user through the error correction dictionary library. This addresses the low recall rate that arises when erroneous user input cannot be corrected by an existing homophone error correction model.
The synonym library may contain Poi name aliases, acronyms, names of great use, word aliases, and the like. By increasing the coverage of the synonym library, the recognition rate of Poi by the user query intention understanding system 910 can be increased, and the accuracy of labeling can be increased.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 10, a block diagram of an embodiment of an index library constructing apparatus according to the present invention is shown, which may specifically include the following modules:
the data source scanning module 1002 is configured to scan a data source of the point of interest Poi, and determine attribute information and city information of each Poi;
an index creation module 1004 for creating the inverted index of the Poi according to the attribute information of each Poi;
the index segmentation module 1006 is configured to segment the inverted index of the Poi according to the city information to obtain index blocks based on the city information;
and an index base building module 1008, configured to build a Poi index base according to the index blocks.
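The four modules above can be read as a small pipeline: scan, create the inverted index, split it by city, and assemble the index base. The sketch below follows that reading under stated assumptions: the record fields (`poi_id`, `name`, `city`, `heat`), the posting layout, and the function names are hypothetical, and only the name domain is indexed to keep the example short.

```python
from collections import defaultdict


def create_inverted_index(pois):
    """Single-character inverted index: character -> list of postings."""
    index = defaultdict(list)
    for poi in pois:
        for position, ch in enumerate(poi["name"], start=1):
            index[ch].append({
                "poi_id": poi["poi_id"],
                "position": position,
                "heat": poi["heat"],
                "city": poi["city"],
            })
    return index


def split_by_city(index):
    """Index blocks: city -> character -> postings belonging to that city."""
    blocks = defaultdict(lambda: defaultdict(list))
    for ch, postings in index.items():
        for posting in postings:
            blocks[posting["city"]][ch].append(posting)
    return blocks


def build_index_library(pois):
    """Assemble the index base from the per-city index blocks."""
    blocks = split_by_city(create_inverted_index(pois))
    return {city: dict(block) for city, block in blocks.items()}


# Hypothetical scanned data source; each letter stands in for one character.
POIS = [
    {"poi_id": "P1", "name": "ab", "city": "01", "heat": 10},
    {"poi_id": "P2", "name": "ac", "city": "02", "heat": 5},
]
library = build_index_library(POIS)
print(sorted(library), sorted(library["01"]))
```

Keeping one block per city means a query only has to touch the postings of the city it resolves to.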
In an alternative implementation of the invention, the attribute information is determined based on the query domain of the Poi. The inverted index of the Poi may include an inverted index based on a single character. The index creation module 1004 may include the following sub-modules:
an attribute information extraction submodule for extracting the address domain information and/or name domain information of the Poi from the attribute information of the Poi;
and the inverted index determining submodule is used for counting the name characters contained in the name domain information of the Poi and the address characters contained in the address domain information, and determining an inverted index based on a single character.
In an alternative implementation of the present invention, the index repository construction module 1008 may include the following sub-modules:
a correspondence construction submodule, configured to construct a correspondence between the Poi inverted index and the index blocks, where the inverted index includes at least one of: a Poi name index and a Poi address index;
and the index base constructing submodule is used for constructing the Poi index base based on the correspondence between the Poi inverted index and the index blocks.
In an alternative implementation of the present invention, the inverted index determining submodule may include the following units:
a name character counting unit, configured to count the name characters included in the name domain information, and determine the frequency corresponding to each name character;
a first determining unit, configured to determine the inverted linked list of each name character according to the frequency corresponding to each name character, where the inverted linked list includes: name character number, character position, and Poi heat;
and the name index building unit is used for building the Poi name index based on the inverted linked lists of the name characters for the name domain information of each Poi.
In an alternative implementation of the present invention, the inverted index determining submodule may include the following units:
the address character counting unit is used for counting the address characters contained in the address domain information and determining the frequency corresponding to each address character;
a second determining unit, configured to determine the inverted linked list of each address character according to the frequency corresponding to each address character, where the inverted linked list includes: address character number, character position, and Poi heat;
and the address index building unit is used for building the Poi address index based on the inverted linked lists of the address characters for the address domain information of each Poi.
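The counting and determining units can be illustrated with the sketch below, which works the same way for the name domain and the address domain. The record fields, the function name `build_domain_index`, and in particular the rule that character numbers are assigned in order of descending frequency are assumptions; the description says only that the inverted linked lists are determined according to the character frequencies.

```python
from collections import Counter


def build_domain_index(pois, field):
    """Count characters of one domain and build its single-character index."""
    # Counting unit: frequency of every character in the chosen domain.
    freq = Counter(ch for poi in pois for ch in poi[field])
    # Assumed numbering rule: most frequent character gets number 1.
    numbers = {ch: n for n, (ch, _) in enumerate(freq.most_common(), start=1)}
    # Determining unit: postings of (Poi ID, character number, position, heat).
    index = {}
    for poi in pois:
        for position, ch in enumerate(poi[field], start=1):
            index.setdefault(ch, []).append(
                (poi["poi_id"], numbers[ch], position, poi["heat"])
            )
    return freq, index


# Hypothetical records; each letter stands in for one character.
POI_RECORDS = [
    {"poi_id": "P1", "name": "ab", "address": "xy", "heat": 10},
    {"poi_id": "P2", "name": "ac", "address": "xz", "heat": 5},
]
name_freq, name_index = build_domain_index(POI_RECORDS, "name")
address_freq, address_index = build_domain_index(POI_RECORDS, "address")
print(name_freq["a"], name_index["a"])
```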
In an alternative implementation of the invention, the index segmentation module 1006 may include the following sub-modules:
the inverted index segmentation submodule is used for segmenting the inverted indexes of the Poi according to the city information to obtain the Poi inverted indexes corresponding to each piece of city information;
and the index block generation submodule is used for generating corresponding index blocks based on the inverted indexes of the Poi corresponding to the same city information.
In an optional implementation of the present invention, the apparatus may further include the following modules:
the historical data acquisition module is used for acquiring historical search data, and the historical search data comprises input method record data of a user, webpage click historical data and map click historical data;
and the comprehensive analysis module is used for comprehensively analyzing the input method record data, the webpage click history data and the map click history data in the historical search data to obtain Poi heat corresponding to each character.
Referring to fig. 11, a block diagram of a searching apparatus based on an index repository according to an embodiment of the present invention is shown, which may specifically include the following modules:
an information receiving module 1102, configured to receive query information input by a user, where the query information includes character information of a point of interest Poi input by the user in a search process;
a Poi recognition module 1104, configured to identify Poi character information from the query information, and determine Poi city information based on the current geographic location of the user;
an index query module 1106, configured to query an index library according to the city information to which the Poi belongs, to obtain a target inverted index corresponding to the Poi character information, where the index library includes index partitions based on city information, and the index partitions include an inverted index of Poi;
a result displaying module 1108, configured to generate at least one Poi result according to the target inverted index, and display the Poi result.
In an alternative implementation of the present invention, the Poi recognition module 1104 may include the following sub-modules:
the word segmentation sub-module is used for performing word segmentation on the query information to obtain at least one word segmentation result;
the labeling submodule is used for performing part-of-speech labeling on the word segmentation result to obtain corresponding attribute information;
and the analysis submodule is used for analyzing according to the word segmentation result and the attribute information to determine Poi character information, wherein the Poi character information comprises name character information and/or address character information.
In an alternative implementation of the present invention, index query module 1106 may include the following sub-modules:
the block searching submodule is used for searching a target index block corresponding to the city information to which the Poi belongs in an index database;
the linked list searching submodule is used for searching the inverted linked lists of the characters contained in the Poi character information in the target index block;
and the index determining submodule is used for determining a target inverted index according to the queried inverted linked lists, wherein the target inverted index comprises a name domain index and/or an address domain index.
In an alternative implementation of the present invention, the result presentation module 1108 may include the following sub-modules:
the index fusion submodule is used for fusing the name domain index and the address domain index to obtain a fused Poi index;
a result generation submodule, configured to generate at least one Poi result based on the inverted linked lists of the characters corresponding to the fused Poi index, each inverted linked list including: character number, character position, and Poi heat.
In an alternative implementation of the present invention, the result presentation module 1108 may include the following sub-modules:
the obtaining submodule is used for obtaining, for each Poi result, the corresponding Poi heat and character positions from the inverted linked lists of the characters;
the sequencing submodule is used for sequencing the Poi results according to the obtained Poi heat, character positions and the query information, and determining the corresponding arrangement sequence of the Poi results;
and the display submodule is used for displaying the Poi results according to the arrangement sequence.
Since the device embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
Fig. 12 is a block diagram illustrating the structure of an apparatus 1200 according to an example embodiment. For example, device 1200 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like; or may be a server-side device, such as a server.
Referring to fig. 12, device 1200 may include one or more of the following components: processing component 1202, memory 1204, power supply component 1206, multimedia component 1208, audio component 1210, input/output (I/O) interface 1212, sensor component 1214, and communication component 1216.
The processing component 1202 generally controls overall operation of the device 1200, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1202 may include one or more processors 1220 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 1202 can include one or more modules that facilitate interaction between the processing component 1202 and other components. For example, the processing component 1202 can include a multimedia module to facilitate interaction between the multimedia component 1208 and the processing component 1202.
The memory 1204 is configured to store various types of data to support operation at the device 1200. Examples of such data include instructions for any application or method operating on device 1200, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1204 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 1206 provides power to the various components of the device 1200. The power component 1206 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 1200.
The multimedia component 1208 includes a screen that provides an output interface between the device 1200 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1208 includes a front-facing camera and/or a rear-facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 1200 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or may have focal length and optical zoom capability.
The audio component 1210 is configured to output and/or input audio signals. For example, the audio component 1210 includes a microphone (MIC) configured to receive external audio signals when the device 1200 is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may further be stored in the memory 1204 or transmitted via the communication component 1216. In some embodiments, the audio component 1210 further includes a speaker for outputting audio signals.
The I/O interface 1212 provides an interface between the processing component 1202 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 1214 includes one or more sensors for providing status assessments of various aspects of the device 1200. For example, the sensor component 1214 may detect an open/closed state of the device 1200 and the relative positioning of components, such as the display and keypad of the device 1200. The sensor component 1214 may also detect a change in the position of the device 1200 or of a component of the device 1200, the presence or absence of user contact with the device 1200, the orientation or acceleration/deceleration of the device 1200, and a change in the temperature of the device 1200. The sensor component 1214 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 1214 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1214 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1216 is configured to facilitate wired or wireless communication between the device 1200 and other devices. The device 1200 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1216 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1216 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 1200 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided that includes instructions, such as the memory 1204 that includes instructions, that are executable by the processor 1220 of the device 1200 to perform the above-described methods. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium, wherein instructions in the storage medium, when executed by a processor of a device, enable the device to perform an index library construction method, the method comprising: scanning a data source of points of interest (Poi), and determining attribute information and city information of each Poi; creating an inverted index of the Poi according to the attribute information of each Poi; segmenting the inverted index of the Poi according to the city information to obtain index blocks based on the city information; and constructing a Poi index library according to the index blocks.
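As a non-limiting illustration, the construction steps above might be sketched as follows. The dictionary keys `id`, `name` and `city`, the function names, and the use of only the name field are assumptions made for this example and are not taken from the embodiments.

```python
from collections import defaultdict


def create_inverted_index(pois):
    """Map each single character of a Poi name to postings (poi_id, position, city)."""
    index = defaultdict(list)
    for poi in pois:
        for position, ch in enumerate(poi["name"]):
            index[ch].append((poi["id"], position, poi["city"]))
    return index


def split_by_city(index):
    """Segment the inverted index into one index block per city."""
    blocks = defaultdict(lambda: defaultdict(list))
    for ch, postings in index.items():
        for poi_id, position, city in postings:
            blocks[city][ch].append((poi_id, position))
    return {city: dict(block) for city, block in blocks.items()}


def build_index_library(pois):
    """Index library: city -> character -> list of (poi_id, position)."""
    return split_by_city(create_inverted_index(pois))
```

In this sketch a query for a given city only ever touches that city's block, which is the practical point of segmenting the index by city information.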
Optionally, the attribute information is determined based on a query domain of the Poi, and the inverted index of the Poi includes an inverted index based on a single character. Creating the inverted index of the Poi according to the attribute information of each Poi comprises: extracting address domain information and/or name domain information of the Poi from the attribute information of the Poi; and counting the name characters contained in the name domain information and the address characters contained in the address domain information of each Poi, and determining the inverted index based on a single character.
Optionally, constructing the Poi index library according to the index blocks comprises: constructing a correspondence between the Poi inverted index and the index blocks, the inverted index comprising at least one of a Poi name index and a Poi address index; and constructing the Poi index library based on the correspondence between the Poi inverted index and the index blocks.
Optionally, counting the name characters contained in the name domain information of each Poi and determining the inverted index based on a single character comprises: counting the name characters contained in the name domain information, and determining the frequency corresponding to each name character; determining an inverted list of each name character according to the frequency corresponding to each name character, wherein the inverted list comprises: name character number, character position, and Poi heat; and, for the name domain information of each Poi, constructing a Poi name index based on the inverted lists of the name characters.
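A minimal sketch of such a name index is given below for illustration only. The `poi_id` field, the reading of "name character number" as the length of the name, and the dictionary keys are assumptions; the text does not specify how the character frequency shapes the inverted list, so the frequency is simply returned alongside the index.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    poi_id: int      # hypothetical identifier, not named in the text
    char_count: int  # "name character number", read here as the name length
    position: int    # position of the character within the name
    heat: float      # Poi heat


def build_name_index(pois):
    """Return (frequency of each name character, character -> inverted list)."""
    freq = Counter(ch for poi in pois for ch in poi["name"])
    index = defaultdict(list)
    for poi in pois:
        name = poi["name"]
        for position, ch in enumerate(name):
            index[ch].append(Posting(poi["id"], len(name), position, poi["heat"]))
    return freq, dict(index)
```

An address index could be built the same way from the address domain information.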
Optionally, counting the address characters contained in the address domain information of each Poi and determining the inverted index based on a single character comprises: counting the address characters contained in the address domain information, and determining the frequency corresponding to each address character; determining an inverted list of each address character according to the frequency corresponding to each address character, wherein the inverted list comprises: address character number, character position, and Poi heat; and, for the address domain information of each Poi, constructing a Poi address index based on the inverted lists of the address characters.
Optionally, segmenting the inverted index of the Poi according to the city information to obtain index blocks based on the city information comprises: segmenting the inverted indexes of the Poi according to the city information to obtain the Poi inverted indexes corresponding to each piece of city information; and generating a corresponding index block based on the Poi inverted indexes corresponding to the same city information.
Optionally, before determining the inverted index based on a single character, the method further comprises: acquiring historical search data, wherein the historical search data comprises input method record data of a user, webpage click history data and map click history data; and comprehensively analyzing the input method record data, the webpage click history data and the map click history data in the historical search data to obtain the Poi heat corresponding to each character.
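The text does not state how the three data sources are combined. Purely as an assumed example, the heat could be a weighted sum of per-Poi counts normalized by the largest value; the weights, the function name and the per-Poi (rather than per-character) granularity are all hypothetical.

```python
def poi_heat(input_counts, web_clicks, map_clicks, weights=(0.2, 0.3, 0.5)):
    """Combine three count dictionaries (poi_id -> count) into a heat in [0, 1].

    The weighted sum and the max-normalization are assumptions for illustration.
    """
    w_input, w_web, w_map = weights
    poi_ids = set(input_counts) | set(web_clicks) | set(map_clicks)
    raw = {
        poi_id: w_input * input_counts.get(poi_id, 0)
        + w_web * web_clicks.get(poi_id, 0)
        + w_map * map_clicks.get(poi_id, 0)
        for poi_id in poi_ids
    }
    top = max(raw.values(), default=0)
    return {poi_id: (value / top if top else 0.0) for poi_id, value in raw.items()}
```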
A non-transitory computer readable storage medium, wherein instructions in the storage medium, when executed by a processor of a device, enable the device to perform a search method based on an index library, the method comprising: receiving query information input by a user, wherein the query information comprises character information of a point of interest (Poi) input by the user in a searching process; identifying Poi character information from the query information, and determining the city information to which the Poi belongs based on the current geographic position of the user; querying an index library according to the city information to which the Poi belongs to obtain a target inverted index corresponding to the Poi character information, wherein the index library comprises index blocks based on the city information, and the index blocks comprise Poi inverted indexes; and generating at least one Poi result according to the target inverted index, and displaying the Poi result.
Optionally, identifying the Poi character information from the query information comprises: performing word segmentation on the query information to obtain at least one word segmentation result; performing part-of-speech tagging on the word segmentation result to obtain corresponding attribute information; and analyzing the word segmentation result and the attribute information to determine the Poi character information, wherein the Poi character information comprises name character information and/or address character information.
Optionally, querying the index library according to the city information to which the Poi belongs to obtain the target inverted index corresponding to the Poi character information comprises: searching the index library for a target index block corresponding to the city information to which the Poi belongs; querying, in the target index block, the inverted lists of the characters contained in the Poi character information; and determining the target inverted index according to the queried inverted lists, wherein the target inverted index comprises a name domain index and/or an address domain index.
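For illustration, the lookup above might be sketched as follows, assuming the index library is a nested mapping of city to character to a list of `(poi_id, position)` postings. Intersecting the candidate sets of all query characters is one possible matching rule chosen for this example; the text does not prescribe it.

```python
def lookup(library, city, query):
    """Return ids of Pois in the city's index block whose postings cover every query character."""
    block = library.get(city, {})  # target index block for the city
    inverted_lists = [block.get(ch, []) for ch in query]
    if not inverted_lists:
        return set()
    candidates = {poi_id for poi_id, _ in inverted_lists[0]}
    for postings in inverted_lists[1:]:
        candidates &= {poi_id for poi_id, _ in postings}
    return candidates
```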
Optionally, generating at least one Poi result according to the target inverted index comprises: fusing the name domain index and the address domain index to obtain a fused Poi index; and generating at least one Poi result based on the inverted lists of the characters corresponding to the fused Poi index, the inverted list comprising: character number, character position, and Poi heat.
Optionally, displaying the Poi result comprises: for each Poi result, obtaining the corresponding Poi heat and character position from the inverted list of the characters; sorting the Poi results according to the obtained Poi heat, the character positions and the query information, and determining the ranking order of the Poi results; and displaying the Poi results in the ranking order.
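The exact ranking function is not given in the text. One assumed way to combine the character positions, the query information and the Poi heat is to rank first by how many query characters occur at the same position in the Poi name, and then by heat; the scoring rule and the dictionary keys below are hypothetical.

```python
def rank_results(results, query):
    """Sort Poi results by positional match with the query, then by Poi heat (descending)."""

    def score(result):
        name = result["name"]
        hits = sum(
            1 for i, ch in enumerate(query) if i < len(name) and name[i] == ch
        )
        position_score = hits / len(query) if query else 0.0
        return (position_score, result["heat"])

    return sorted(results, key=score, reverse=True)
```

With this rule a less popular Poi whose name begins with the query characters is ranked ahead of a more popular Poi that only contains them elsewhere.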
Fig. 13 is a schematic structural diagram of a server in an embodiment of the present invention. The server 1300 may vary widely in configuration or performance and may include one or more Central Processing Units (CPUs) 1322 (e.g., one or more processors) and memory 1332, one or more storage media 1330 (e.g., one or more mass storage devices) storing applications 1342 or data 1344. Memory 1332 and storage medium 1330 may be, among other things, transitory or persistent storage. The program stored on the storage medium 1330 may include one or more modules (not shown), each of which may include a sequence of instructions operating on a server. Still further, the central processor 1322 may be arranged in communication with the storage medium 1330, executing a sequence of instruction operations in the storage medium 1330 on the server 1300.
The server 1300 may also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input-output interfaces 1358, one or more keyboards 1356, and/or one or more operating systems 1341 such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The embodiments in the present specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The index library construction method and apparatus, the index-library-based search method and apparatus, the device, and the readable storage medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the invention, and the description of the above embodiments is intended only to aid understanding of the method and core idea of the invention. Meanwhile, a person skilled in the art may, following the idea of the invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (38)

1. An index library construction method, characterized by comprising:
scanning a data source of points of interest (Poi), and determining attribute information and city information of each Poi;
creating an inverted index of the Poi according to the attribute information of each Poi;
segmenting the inverted index of the Poi according to the city information to obtain index blocks based on the city information;
storing the inverted indexes of the Poi according to the index blocks corresponding to different city information to construct a Poi index library;
the method further comprising:
recording the Poi heat and user click history data corresponding to each Poi, and merging the Poi heat and the user click history data into the inverted index of the Poi, so that the Poi can be sorted based on the Poi heat and the user click history data.
2. The method of claim 1, wherein the attribute information is determined based on a query domain of the Poi, the inverted index of the Poi comprises an inverted index based on a single character, and creating the inverted index of the Poi according to the attribute information of each Poi comprises:
extracting address domain information and/or name domain information of the Poi from the attribute information of the Poi;
counting the name characters contained in the name domain information and the address characters contained in the address domain information of each Poi, and determining the inverted index based on a single character.
3. The method of claim 2, wherein constructing the Poi index library according to the index blocks comprises:
constructing a correspondence between the Poi inverted index and the index blocks, the inverted index comprising at least one of: a Poi name index and a Poi address index;
and constructing the Poi index library based on the correspondence between the Poi inverted index and the index blocks.
4. The method according to claim 3, wherein counting the name characters contained in the name domain information of each Poi and determining the inverted index based on a single character comprises:
counting the name characters contained in the name domain information, and determining the frequency corresponding to each name character;
determining an inverted list of each name character according to the frequency corresponding to each name character, wherein the inverted list comprises: name character number, character position, and Poi heat;
for the name domain information of each Poi, constructing a Poi name index based on the inverted lists of the name characters.
5. The method according to claim 3, wherein counting the address characters contained in the address domain information of each Poi and determining the inverted index based on a single character comprises:
counting the address characters contained in the address domain information, and determining the frequency corresponding to each address character;
determining an inverted list of each address character according to the frequency corresponding to each address character, wherein the inverted list comprises: address character number, character position, and Poi heat;
for the address domain information of each Poi, constructing a Poi address index based on the inverted lists of the address characters.
6. The method of claim 1, wherein segmenting the inverted index of the Poi according to the city information to obtain index blocks based on the city information comprises:
segmenting the inverted indexes of the Poi according to the city information to obtain the Poi inverted indexes corresponding to each piece of city information;
generating a corresponding index block based on the Poi inverted indexes corresponding to the same city information.
7. The method of claim 2, wherein prior to determining the inverted index based on the single character, the method further comprises:
acquiring historical search data, wherein the historical search data comprises input method record data of a user, webpage click history data and map click history data;
and comprehensively analyzing the input method record data, the webpage click history data and the map click history data in the historical search data to obtain Poi heat corresponding to each character.
8. A search method based on an index library, characterized by comprising:
receiving query information input by a user, wherein the query information comprises character information of a point of interest (Poi) input by the user in a searching process;
identifying Poi character information from the query information, and determining the city information to which the Poi belongs based on the current geographic position of the user;
querying an index library according to the city information to which the Poi belongs to obtain a target inverted index corresponding to the Poi character information, wherein the index library is constructed by storing the inverted indexes of the Poi according to index blocks corresponding to different city information, the index library comprises index blocks based on the city information, and the index blocks comprise the inverted indexes of the Poi; the Poi heat and user click history data corresponding to the Poi are merged into the inverted index of the Poi;
generating at least one Poi result according to the target inverted index, and displaying the Poi result;
wherein displaying the Poi result comprises:
displaying the Poi after sorting based on the Poi heat and the user click history data.
9. The method of claim 8, wherein the identifying Poi character information from the query information comprises:
performing word segmentation on the query information to obtain at least one word segmentation result;
performing part-of-speech tagging on the word segmentation result to obtain corresponding attribute information;
and analyzing according to the word segmentation result and the attribute information to determine Poi character information, wherein the Poi character information comprises name character information and/or address character information.
10. The method of claim 8, wherein querying the index library according to the city information to which the Poi belongs to obtain the target inverted index corresponding to the Poi character information comprises:
searching the index library for a target index block corresponding to the city information to which the Poi belongs;
querying, in the target index block, the inverted lists of the characters contained in the Poi character information;
and determining the target inverted index according to the queried inverted lists, wherein the target inverted index comprises a name domain index and/or an address domain index.
11. The method of claim 10, wherein generating at least one Poi result according to the target inverted index comprises:
fusing the name domain index and the address domain index to obtain a fused Poi index;
generating at least one Poi result based on the inverted lists of the characters corresponding to the fused Poi index, the inverted list comprising: character number, character position, and Poi heat.
12. The method of claim 11, wherein displaying the Poi result comprises:
for each Poi result, obtaining the corresponding Poi heat and character position from the inverted list of the characters;
sorting the Poi results according to the obtained Poi heat, the character positions and the query information, and determining the ranking order of the Poi results;
displaying the Poi results in the ranking order.
13. An index library construction apparatus, comprising:
a data source scanning module, configured to scan a data source of points of interest (Poi) and determine attribute information and city information of each Poi;
an index creation module, configured to create an inverted index of the Poi according to the attribute information of each Poi;
an index segmentation module, configured to segment the inverted index of the Poi according to the city information to obtain index blocks based on the city information;
an index library construction module, configured to store the inverted indexes of the Poi according to the index blocks corresponding to different city information to construct a Poi index library; the module is further configured to record the Poi heat and user click history data corresponding to each Poi, and to merge the Poi heat and the user click history data into the inverted index of the Poi, so that the Poi can be sorted based on the Poi heat and the user click history data.
14. The apparatus of claim 13, wherein the attribute information is determined based on a query domain of the Poi, the inverted index of the Poi comprises an inverted index based on a single character, and the index creation module comprises:
an attribute information extraction submodule, configured to extract address domain information and/or name domain information of the Poi from the attribute information of the Poi;
and an inverted index determination submodule, configured to count the name characters contained in the name domain information and the address characters contained in the address domain information of each Poi, and to determine the inverted index based on a single character.
15. The apparatus of claim 14, wherein the index library construction module comprises:
a correspondence construction submodule, configured to construct a correspondence between the Poi inverted index and the index blocks, wherein the inverted index comprises at least one of: a Poi name index and a Poi address index;
and an index library construction submodule, configured to construct the Poi index library based on the correspondence between the Poi inverted index and the index blocks.
16. The apparatus of claim 15, wherein the inverted index determination submodule comprises:
a name character counting unit, configured to count the name characters contained in the name domain information and determine the frequency corresponding to each name character;
a first determining unit, configured to determine an inverted list of each name character according to the frequency corresponding to each name character, wherein the inverted list comprises: name character number, character position, and Poi heat;
and a name index construction unit, configured to construct, for the name domain information of each Poi, a Poi name index based on the inverted lists of the name characters.
17. The apparatus of claim 15, wherein the inverted index determination submodule comprises:
an address character counting unit, configured to count the address characters contained in the address domain information and determine the frequency corresponding to each address character;
a second determining unit, configured to determine an inverted list of each address character according to the frequency corresponding to each address character, wherein the inverted list comprises: address character number, character position, and Poi heat;
and an address index construction unit, configured to construct, for the address domain information of each Poi, a Poi address index based on the inverted lists of the address characters.
18. The apparatus of claim 13, wherein the index segmentation module comprises:
an inverted index segmentation submodule, configured to segment the inverted indexes of the Poi according to the city information to obtain the Poi inverted indexes corresponding to each piece of city information;
and an index block generation submodule, configured to generate a corresponding index block based on the Poi inverted indexes corresponding to the same city information.
19. The apparatus of claim 14, further comprising:
the historical data acquisition module is used for acquiring historical search data, and the historical search data comprises input method record data of a user, webpage click historical data and map click historical data;
and the comprehensive analysis module is used for comprehensively analyzing the input method record data, the webpage click history data and the map click history data in the historical search data to obtain Poi heat corresponding to each character.
20. A search apparatus based on an index library, comprising:
an information receiving module, configured to receive query information input by a user, wherein the query information comprises character information of a point of interest (Poi) input by the user in a searching process;
a Poi recognition module, configured to identify Poi character information from the query information and determine the city information to which the Poi belongs based on the current geographic position of the user;
an index query module, configured to query an index library according to the city information to which the Poi belongs to obtain a target inverted index corresponding to the Poi character information, wherein the index library is constructed by storing the inverted indexes of the Poi according to index blocks corresponding to different city information, the index library comprises index blocks based on the city information, and the index blocks comprise the inverted indexes of the Poi; the Poi heat and user click history data corresponding to the Poi are merged into the inverted index of the Poi;
a result display module, configured to generate at least one Poi result according to the target inverted index and display the Poi result; and further configured to display the Poi after sorting based on the Poi heat and the user click history data.
21. The apparatus of claim 20, wherein the Poi recognition module comprises:
the word segmentation sub-module is used for performing word segmentation on the query information to obtain at least one word segmentation result;
the labeling submodule is used for performing part-of-speech labeling on the word segmentation result to obtain corresponding attribute information;
and the analysis submodule is used for analyzing according to the word segmentation result and the attribute information to determine Poi character information, wherein the Poi character information comprises name character information and/or address character information.
22. The apparatus of claim 20, wherein the index query module comprises:
the block searching submodule is used for searching the index library for a target index block corresponding to the city information to which the Poi belongs;
the list searching submodule is used for querying, in the target index block, the inverted list of each character contained in the Poi character information;
and the index determining submodule is used for determining a target inverted index according to the queried inverted lists, wherein the target inverted index comprises a name domain index and/or an address domain index.
23. The apparatus of claim 22, wherein the results presentation module comprises:
the index fusion submodule is used for fusing the name domain index and the address domain index to obtain a fused Poi index;
a result generation submodule, configured to generate at least one Poi result based on the inverted lists of the characters corresponding to the fused Poi index, each inverted list including: character number, character position, and Poi heat.
24. The apparatus of claim 23, wherein the results presentation module comprises:
the obtaining submodule is used for obtaining, for each Poi result, the corresponding Poi heat and character positions from the inverted list of the characters;
the sequencing submodule is used for sequencing the Poi results according to the obtained Poi heat, character positions and the query information, and determining the corresponding arrangement sequence of the Poi results;
and the display submodule is used for displaying the Poi results according to the arrangement sequence.
25. An index repository construction apparatus comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured for execution by one or more processors, the one or more programs including instructions for:
scanning a data source of the interest point Poi, and determining attribute information and city information of each Poi;
creating an inverted index of Poi from the attribute information of each Poi;
segmenting the inverted index of Poi according to the city information to obtain index blocks based on the city information;
storing the inverted index of Poi according to the index blocks corresponding to different city information to construct a Poi index library;
recording Poi heat and user click history data corresponding to each Poi, and merging the Poi heat and the user click history data into the inverted index of Poi, so as to sort Poi based on the Poi heat and the user click history data.
26. The apparatus of claim 25, wherein the attribute information is determined based on a query domain of Poi, the inverted index of Poi comprises an inverted index based on a single character, and said creating an inverted index of Poi from the attribute information of each Poi comprises:
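The construction steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the record fields (`id`, `name`, `city`), the function names, and the use of in-memory dictionaries are all assumptions made for illustration, and only the name field is indexed.

```python
from collections import defaultdict

def create_inverted_index(pois):
    """Map each single character to postings that record where it occurs."""
    index = defaultdict(list)
    for poi in pois:
        for position, char in enumerate(poi["name"]):
            index[char].append(
                {"poi_id": poi["id"], "city": poi["city"], "position": position}
            )
    return index

def partition_by_city(index):
    """Split the inverted index into one index block per city."""
    blocks = defaultdict(lambda: defaultdict(list))
    for char, postings in index.items():
        for posting in postings:
            blocks[posting["city"]][char].append(posting)
    return blocks

def merge_signals(blocks, heat, clicks):
    """Attach Poi heat and click counts to every posting (defaults are assumed)."""
    for block in blocks.values():
        for postings in block.values():
            for posting in postings:
                posting["heat"] = heat.get(posting["poi_id"], 0.0)
                posting["clicks"] = clicks.get(posting["poi_id"], 0)
    return blocks

# Hypothetical sample data.
sample_pois = [
    {"id": 1, "name": "Sogou", "city": "Beijing"},
    {"id": 2, "name": "Sohu", "city": "Beijing"},
    {"id": 3, "name": "Bund", "city": "Shanghai"},
]
library = merge_signals(
    partition_by_city(create_inverted_index(sample_pois)),
    heat={1: 0.9, 2: 0.7, 3: 0.8},
    clicks={1: 12},
)
```

Keeping one block per city means a query that has already been resolved to a city only touches that city's postings.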
extracting Poi address domain information and/or name domain information from the attribute information of Poi;
counting the name characters contained in the name domain information of each Poi and the address characters contained in the address domain information of each Poi, and determining an inverted index based on a single character.
27. The apparatus of claim 26, wherein said constructing a Poi index library from the index blocks comprises:
constructing a correspondence between the inverted index of Poi and the index blocks, the inverted index comprising at least one of: a Poi name index and a Poi address index;
and constructing the Poi index library based on the correspondence between the inverted index of Poi and the index blocks.
28. The apparatus according to claim 27, wherein said counting the name characters contained in the name domain information of each Poi and determining an inverted index based on a single character comprises:
counting name characters contained in the name domain information, and determining the frequency corresponding to each name character;
determining an inverted list of each name character according to the frequency corresponding to each name character, wherein the inverted list comprises: name character number, character position, and Poi heat;
for the name domain information of each Poi, constructing a Poi name index based on the inverted lists of the name characters.
29. The apparatus according to claim 27, wherein said counting the address characters contained in the address domain information of each Poi and determining an inverted index based on a single character comprises:
counting address characters contained in the address domain information, and determining the frequency corresponding to each address character;
determining an inverted list of each address character according to the frequency corresponding to each address character, wherein the inverted list comprises: address character number, character position, and Poi heat;
for the address domain information of each Poi, constructing a Poi address index based on the inverted lists of the address characters.
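As a rough illustration of the per-character inverted lists described for the name and address domains, the following Python sketch builds one index per field. The claims do not define "character number" precisely; here it is assumed to be the number of times the character occurs in the field, and the `Posting` class and `build_field_index` function are hypothetical names.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass
class Posting:
    poi_id: int
    count: int        # occurrences of the character in this field (assumed meaning)
    positions: list   # character offsets within the field
    heat: float       # Poi heat carried inside the inverted list

def build_field_index(records, field):
    """Return (inverted lists per character, overall character frequency)."""
    frequency = Counter()
    index = defaultdict(list)
    for record in records:
        offsets_by_char = defaultdict(list)
        for offset, char in enumerate(record[field]):
            offsets_by_char[char].append(offset)
        for char, offsets in offsets_by_char.items():
            frequency[char] += len(offsets)
            index[char].append(
                Posting(record["id"], len(offsets), offsets, record["heat"])
            )
    return dict(index), frequency

# Hypothetical sample records.
records = [
    {"id": 1, "name": "Sogou", "address": "Haidian", "heat": 0.9},
    {"id": 2, "name": "Sohu", "address": "Haidian", "heat": 0.7},
]
name_index, name_frequency = build_field_index(records, "name")
address_index, address_frequency = build_field_index(records, "address")
```

Calling the same function once per field yields a separate name index and address index, which matches the "and/or" structure of the claims.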
30. The apparatus of claim 25, wherein said segmenting the inverted index of Poi according to the city information to obtain index blocks based on the city information comprises:
segmenting the inverted index of Poi according to the city information to obtain the inverted index of Poi corresponding to each piece of city information;
generating a corresponding index block based on the inverted index of Poi corresponding to the same city information.
31. The apparatus of claim 26, wherein the one or more programs further include instructions for, prior to said determining an inverted index based on a single character:
acquiring historical search data, wherein the historical search data comprises input method record data of a user, webpage click history data and map click history data;
and comprehensively analyzing the input method record data, the webpage click history data and the map click history data in the historical search data to obtain Poi heat corresponding to each character.
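The claims do not say how the three data sources are "comprehensively analyzed". One plausible reading, shown purely as an assumption, is a weighted sum of per-source counts normalized to the range 0 to 1. The function name, the per-Poi count dictionaries, and the weights below are all hypothetical.

```python
def poi_heat(input_counts, web_clicks, map_clicks, weights=(0.2, 0.3, 0.5)):
    """Combine three count dictionaries {poi_id: count} into one heat score.

    The weights are illustrative only; the source does not specify them.
    """
    def normalise(counts):
        peak = max(counts.values(), default=0)
        return {key: (value / peak if peak else 0.0) for key, value in counts.items()}

    sources = [normalise(c) for c in (input_counts, web_clicks, map_clicks)]
    poi_ids = set().union(*sources)  # every Poi seen in any source
    return {
        poi_id: sum(w * source.get(poi_id, 0.0) for w, source in zip(weights, sources))
        for poi_id in poi_ids
    }

# Hypothetical counts per Poi id.
heat = poi_heat(
    input_counts={1: 10, 2: 5},
    web_clicks={1: 4},
    map_clicks={2: 8},
)
```

Normalizing each source first keeps a high-volume source from dominating the score simply because its raw counts are larger.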
32. An index library-based search apparatus comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured for execution by one or more processors, the one or more programs including instructions for:
receiving query information input by a user, wherein the query information comprises character information of a point of interest Poi input by the user in a searching process;
identifying Poi character information from the query information, and determining the city information to which the Poi belongs based on the current geographic position of the user;
querying an index library according to the city information to which the Poi belongs to obtain a target inverted index corresponding to the Poi character information, wherein the index library is constructed by storing the inverted index of Poi according to index blocks corresponding to different city information, the index library comprises index blocks based on the city information, and the index blocks comprise the inverted index of Poi; Poi heat and user click history data corresponding to Poi are merged into the inverted index of Poi;
generating at least one Poi result according to the target inverted index, and displaying the Poi result, wherein Poi is ranked for display based on the Poi heat and the user click history data.
33. The apparatus as claimed in claim 32, wherein said identifying Poi character information from said query information comprises:
performing word segmentation on the query information to obtain at least one word segmentation result;
performing part-of-speech tagging on the word segmentation result to obtain corresponding attribute information;
and analyzing according to the word segmentation result and the attribute information to determine Poi character information, wherein the Poi character information comprises name character information and/or address character information.
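The recognition steps (word segmentation, part-of-speech tagging, analysis) can be sketched with a toy lexicon. Whitespace splitting and dictionary lookup are stand-ins chosen for brevity and are assumptions; a real system for Chinese queries would use a proper segmenter and tagger, and the tag names below are hypothetical.

```python
# Hypothetical lexicon mapping tokens to coarse tags (an assumption for illustration).
LEXICON = {
    "Sogou": "poi_name",
    "Zhongguancun": "address",
    "near": "other",
}

def recognise_poi(query, lexicon=LEXICON):
    """Split the query, tag each token, and keep name and address tokens."""
    tokens = query.split()                                        # stand-in for word segmentation
    tagged = [(token, lexicon.get(token, "other")) for token in tokens]  # stand-in for tagging
    return {
        "name": "".join(token for token, tag in tagged if tag == "poi_name"),
        "address": "".join(token for token, tag in tagged if tag == "address"),
    }

poi_characters = recognise_poi("Sogou near Zhongguancun")
```

The returned dictionary separates name character information from address character information, so each can later be looked up in its own domain index.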
34. The apparatus as claimed in claim 32, wherein said querying an index library according to the city information to which the Poi belongs to obtain the target inverted index corresponding to the Poi character information comprises:
searching the index library for a target index block corresponding to the city information to which the Poi belongs;
querying, in the target index block, the inverted list of each character contained in the Poi character information;
and determining a target inverted index according to the queried inverted lists, wherein the target inverted index comprises a name domain index and/or an address domain index.
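A minimal Python sketch of this lookup follows. The nested layout `{city: {field: {char: [(poi_id, position), ...]}}}` and the choice to intersect the inverted lists of all query characters are assumptions; the claims only state that the inverted lists are queried inside the target block.

```python
def query_block(library, city, text):
    """Return {field: set of poi ids} whose field contains every query character."""
    block = library.get(city, {})          # target index block for the city
    target = {}
    for field, index in block.items():     # e.g. "name" and "address" domain indexes
        candidates = None
        for char in text:
            ids = {posting[0] for posting in index.get(char, [])}
            candidates = ids if candidates is None else candidates & ids
        if candidates:                      # skip fields with no match
            target[field] = candidates
    return target

# Hypothetical library with one city block.
library = {
    "Beijing": {
        "name": {"S": [(1, 0), (2, 0)], "o": [(1, 1), (2, 1)], "g": [(1, 2)]},
        "address": {"H": [(1, 0), (2, 0)]},
    }
}
hits = query_block(library, "Beijing", "Sog")
```

Because the city is resolved first, an unknown city returns an empty result without scanning any other block.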
35. The apparatus as claimed in claim 34, wherein said generating at least one Poi result from said target inverted index comprises:
fusing the name domain index and the address domain index to obtain a fused Poi index;
generating at least one Poi result based on the inverted lists of the characters corresponding to the fused Poi index, each inverted list comprising: character number, character position, and Poi heat.
36. The apparatus of claim 35, wherein said presenting Poi results comprises:
for each Poi result, acquiring corresponding Poi heat and character position from the inverted list of the characters;
sorting the Poi results according to the obtained Poi heat, character positions and the query information, and determining a sorting sequence corresponding to each Poi result;
displaying the Poi results according to the sorting sequence.
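The sorting step can be sketched as follows. The claims name the inputs (Poi heat, character positions, query information) but not the formula, so the ordering used here is an assumption: more matched query characters first, then earlier match position, then higher heat. The result layout and function name are hypothetical.

```python
def rank_poi_results(results, query):
    """Sort Poi results by query coverage, earliest match position, then heat."""
    def sort_key(result):
        positions = result["positions"]    # {char: [offsets]}, offsets assumed non-empty
        coverage = sum(1 for char in set(query) if char in positions)
        first = min((min(offsets) for offsets in positions.values()), default=float("inf"))
        return (-coverage, first, -result["heat"])
    return sorted(results, key=sort_key)

# Hypothetical results assembled from the inverted lists.
results = [
    {"poi_id": 1, "heat": 0.5, "positions": {"S": [0], "o": [1]}},
    {"poi_id": 2, "heat": 0.9, "positions": {"S": [0], "o": [1]}},
    {"poi_id": 3, "heat": 1.0, "positions": {"o": [2]}},
]
ranked = rank_poi_results(results, "So")
```

With this key, heat only breaks ties between results that match the query equally well, which is one of several reasonable ways to combine the three signals.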
37. A readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of a device, enable the device to perform the index repository construction method according to any of the method claims 1-7.
38. A readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of a device, enable the device to perform the index repository based search method according to any of the method claims 8-12.
CN201710901601.3A 2017-09-28 2017-09-28 Index library construction method, search method and device Active CN110019645B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710901601.3A CN110019645B (en) 2017-09-28 2017-09-28 Index library construction method, search method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710901601.3A CN110019645B (en) 2017-09-28 2017-09-28 Index library construction method, search method and device

Publications (2)

Publication Number Publication Date
CN110019645A CN110019645A (en) 2019-07-16
CN110019645B CN110019645B (en) 2022-04-19

Family

ID=67186336

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710901601.3A Active CN110019645B (en) 2017-09-28 2017-09-28 Index library construction method, search method and device

Country Status (1)

Country Link
CN (1) CN110019645B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765262A (en) * 2019-09-24 2020-02-07 北京嘀嘀无限科技发展有限公司 POI text retrieval method and device and electronic equipment
CN111008625B (en) * 2019-12-06 2023-07-18 建信金融科技有限责任公司 Address correction method, device, equipment and storage medium
CN112197779A (en) * 2020-09-14 2021-01-08 汉海信息技术(上海)有限公司 Navigation path planning method and device and printing equipment
CN112214573A (en) * 2020-10-30 2021-01-12 数贸科技(北京)有限公司 Information search system, method, computing device, and computer storage medium
CN112685540A (en) * 2021-01-07 2021-04-20 深圳市欢太科技有限公司 Search method, search device, storage medium and terminal
CN113743054A (en) * 2021-08-17 2021-12-03 上海明略人工智能(集团)有限公司 Alphabet vector learning method, system, storage medium and electronic device
CN113672627B (en) * 2021-09-08 2023-08-18 湖南惠农科技有限公司 Method and device for constructing index of elastic search engine
CN114661688B (en) * 2022-03-25 2023-09-19 马上消费金融股份有限公司 Address error correction method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147795A (en) * 2010-02-05 2011-08-10 北京四维图新科技股份有限公司 Method and device for searching points of interest as well as navigation system
CN102456055A (en) * 2010-10-28 2012-05-16 腾讯科技(深圳)有限公司 Method and device for retrieving interest points
CN102831224A (en) * 2012-08-24 2012-12-19 北京百度网讯科技有限公司 Creating method for data index base and searching suggest generation method and device
CN103577442A (en) * 2012-07-30 2014-02-12 腾讯科技(深圳)有限公司 Method and device for calculating map data importance

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070276845A1 (en) * 2006-05-12 2007-11-29 Tele Atlas North America, Inc. Locality indexes and method for indexing localities
CN102024375A (en) * 2009-09-23 2011-04-20 深圳市天翼方向科技有限公司 Special formatting and calculation method of point of interest (POI) data of electronic map
US8620947B2 (en) * 2010-02-08 2013-12-31 Navteq B.V. Full text search in navigation systems
CN103185581B (en) * 2011-12-28 2017-03-08 上海博泰悦臻电子设备制造有限公司 Information presentation device, the reminding method of POI search result
EP2620749B1 (en) * 2012-01-26 2017-07-26 Hyundai Motor Company Indexing system of spatial information for combined SOI object and content
CN103714092A (en) * 2012-09-29 2014-04-09 北京百度网讯科技有限公司 Geographic position searching method and geographic position searching device
CN102944243B (en) * 2012-11-16 2016-12-21 沈阳美行科技有限公司 A kind of map datum can be with the method for incremental update
CN103902626B (en) * 2012-12-30 2018-04-06 上海合众思壮科技有限公司 Interest point search method, device and the equipment with the device
KR101558685B1 (en) * 2013-12-09 2015-10-07 현대자동차주식회사 System And Method for providing a Communication Service, and Vehicle supporting the same
US10097973B2 (en) * 2015-05-27 2018-10-09 Apple Inc. Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device
CN106874287B (en) * 2015-12-11 2020-06-30 北京四维图新科技股份有限公司 Method and device for processing POI address codes

Also Published As

Publication number Publication date
CN110019645A (en) 2019-07-16

Similar Documents

Publication Publication Date Title
CN110019645B (en) Index library construction method, search method and device
US11698261B2 (en) Method, apparatus, computer device and storage medium for determining POI alias
US10051030B2 (en) Interactive searching and recommanding method and apparatus
CN110619076B (en) Search term recommendation method and device, computer and storage medium
CN105701254B (en) Information processing method and device for information processing
CN107589855B (en) Method and device for recommending candidate words according to geographic positions
US11861516B2 (en) Methods and system for associating locations with annotations
US20120296865A1 (en) Terminal device and word stock update method thereof
US9811564B2 (en) POI information providing system, POI information providing device, POI information output device, POI information providing method, and program therefor
CN110232137B (en) Data processing method and device and electronic equipment
US11176143B2 (en) Location-aware content detection
KR20130090612A (en) Method and system for providing location based contents by analyzing keywords on social network service
CN111984749A (en) Method and device for ordering interest points
US20140280053A1 (en) Contextual socially aware local search
CN104850238A (en) Method and device for sorting candidate items generated by input method
CN112417318B (en) Method and device for determining states of interest points, electronic equipment and medium
CN108241690A (en) A kind of data processing method and device, a kind of device for data processing
CN111694906A (en) Method, device and equipment for displaying POI in map
CN103955480A (en) Method and equipment for determining target object information corresponding to user
US20090186631A1 (en) Location Based Information Related to Preferences
US20200043074A1 (en) Apparatus and method of recommending items based on areas
CN108241678B (en) Method and device for mining point of interest data
CN103853437A (en) Candidate item obtaining method and device
CN112861532B (en) Address standardization processing method, device, equipment and online searching system
CN105488204A (en) Searching method, searching apparatus and apparatus for searching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220921

Address after: 100084. Room 9, floor 01, cyber building, building 9, building 1, Zhongguancun East Road, Haidian District, Beijing

Patentee after: BEIJING SOGOU TECHNOLOGY DEVELOPMENT Co.,Ltd.

Patentee after: Beijing Sogou Network Technology Co.,Ltd.

Address before: 100084. Room 9, floor 01, cyber building, building 9, building 1, Zhongguancun East Road, Haidian District, Beijing

Patentee before: BEIJING SOGOU TECHNOLOGY DEVELOPMENT Co.,Ltd.