WO2008005808A2 - Adaptive index with variable compression - Google Patents
- Publication number
- WO2008005808A2 (PCT/US2007/072411)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- search
- key
- computer
- objects
- tree
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/02—Input arrangements using manually operated switches, e.g. using keyboards or dials
- G06F3/023—Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
- G06F3/0233—Character input methods
- G06F3/0237—Character input methods using prediction or retrieval techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/322—Trees
Definitions
- a number of applications can use stored geographic data to provide mapping services for a user.
- the applications that can be implemented for mobile or stationary systems can include map rendering, spatial object search, geocoding or geo-lookup, path search, direction and positioning.
- Object search, particularly object search by a string key, can be used for these applications.
- Figure 1 shows a map-based system of one embodiment of the present invention.
- Figures 2A-2B show systems with and without indexing.
- Figures 3A-3B show a short leaf node and a long leaf node.
- Figures 4A-4B show a tree system of one embodiment.
- Figure 5 is a flowchart of a method of one embodiment.
- Figures 6A-6C illustrate the operation of embodiments of the method of figure 5.
- Figure 7 illustrates an example where nodes contain indications of other search criteria, such as exclusion or inclusion information.
- FIG. 8 illustrates the use of a single object store with multiple trees.
- Figures 9A-9B illustrate the use of an API to select a key structure for the tree.
- One embodiment of the present invention is a computer-implemented method for adaptive construction of a search system for searching objects by their string key.
- the search system can include objects that reside in the object store 108, and the tree 102 that can be constructed using the objects with their string key structure.
- the tree 102 can be based on a trie, or prefix tree, which is an ordered tree data structure that is used to index objects where the keys are strings that accommodate a specific search method.
- the trie facilitates retrieval of the choice of next characters, given a partial string key input.
- a description of tries is given in Donald Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0. Section 6.3: Digital Searching, pp. 492-512.
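As a point of reference, an uncompressed trie that answers "which characters may come next?" can be sketched as follows. This is a minimal illustration, not the patented structure; the names `TrieNode`, `trie_insert` and `trie_next_chars` and the sample keys are assumptions made for the example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> child node
        self.is_key = False  # True if a complete key ends at this node


def trie_insert(root, key):
    """Add `key` to the trie, creating one node per character."""
    node = root
    for ch in key:
        node = node.children.setdefault(ch, TrieNode())
    node.is_key = True


def trie_next_chars(root, prefix):
    """Return the sorted characters that can follow `prefix` ([] if no key starts with it)."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    return sorted(node.children)


root = TrieNode()
for name in ["PINE", "PINTO", "PINNACLE", "POPLAR"]:  # hypothetical keys
    trie_insert(root, name)

print(trie_next_chars(root, "PIN"))  # ['E', 'N', 'T']
print(trie_next_chars(root, "P"))    # ['I', 'O']
```

A full trie of this kind spends one node per character, which is the storage cost the compression described here aims to reduce.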
- the trie can be reconstructed to be adaptable to restrictive storage requirements through a series of steps that manipulate key prefixes and minimize the number of nodes and leaves.
- most leaf nodes of the tree 102 can be associated with multiple objects in the object store, which can mean that only a portion of the full key is searched in the tree before obtaining a group of objects from the object store.
- the tree 102 can be a variable-scale compression of a full trie.
- Tree storage can be minimized based on given compression criteria.
- An adaptable search method can be used to retrieve objects via the search tree and said object store. The search can adapt to the tree structure resulting from compression and a given user interface.
- the object store 108 models spatial objects in the real world that can be searched using a string key.
- the real world spatial objects can include cities, streets in the city, intersections, points of interest (POIs), or another type of object that can be associated with a string key.
- objects can be stored in the leaves of the trie.
- a separate object store can be constructed for a given type of object as either fixed length or variable length storage. Storage for variable length objects can be constructed with a fixed length object offset directory. Such an object store can accommodate search and scrolling of objects.
- the object store entries can be determined by the unique key of each object, wherein the order of objects in the store can be determined by the object's sorting key. The object store can distinguish between components of the search key.
- the search tree 102 can be constructed such that the reference to each object in the store can be found in a tree leaf using the object's search key.
- the search key structure of the tree can indicate spatial objects, such as streets, street intersections, Points of Interest (POIs) or other elements or attributes of objects.
- the order in which components of the key for a given class of objects are concatenated indicates a specific search method for this class of objects.
- the order of concatenation of the key components can be the simple mechanism by which the system designer can rapidly prototype and experiment with various user interfaces embodied in a multitude of string key definitions for a given class of objects.
- An API can be implemented and used in an application to aid in the construction of the search key and thus the search method, to allow designers to produce and experiment with a variety of user interfaces for the search of an object.
- a GUI can be built using such an API to define and select one or more key structures for the search methods on a given class of objects, thereby imposing an appropriate order on the object store and tree(s).
- the API can give system designers the flexibility to change a search system's interface with the ease typically associated with RDBMS technology, without relying on a relational database management system, which may have impractical storage requirements for some environments.
- a designer can assess a variety of user interfaces and underlying search methods.
- Figures 9A and 9B show the construction of trees and object stores using an API 902.
- a designer can select key components and order from the data 904.
- data fields of the data 904 can be used as key components.
- the key structure and the data can be used to construct a tree and object store that can be accessed via a user interface, which implements a search method expressed in the key structure.
- Figure 9A shows an example where the key structure is given by CITY/STREET. This means that the user interface 906 is adapted to receive data in this order.
- Figure 9B shows an example when the key structure is given by STREET/CITY with the user interface 908 adapted to receive data in this order.
- a score can be generated for search tree size, object store size, memory requirements, and best and worst case search performance. This can give the system designer a tool to balance various requirements by comparing scores of different implementations.
- For example, the amount of compression can affect the performance of the system. High levels of compression can mean that more objects need to be obtained from the object store and analyzed. Low levels of compression can result in large storage requirements for the tree.
- the variable compression criteria, which regulate tree construction and can maximize the number of objects referenced by a leaf node, can be tuned to reasonably balance performance, memory, and storage use for the ultimate application. In one embodiment, the compression criteria regulate a minimum number of objects under any branch of the tree.
- the tree can include leaf nodes that reference multiple objects in an object store 108 as the result of variable compression.
- the objects or object references in a leaf node can have different key values, with the common prefix matching that of the leaf's parent node. This can mean a more complex search algorithm, which augments a partial search on the tree by following the object references to the object store to complete the search, as opposed to the straightforward search on the original non-compressed trie structure.
- the leaf nodes can be distinguished as a short leaf node or a long leaf node.
- a short leaf references the first object in a contiguous list, together with the number of objects referenced.
- a long leaf can reference an arbitrary list of objects by storing a count of references, and a direct reference for each object in the list.
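The two leaf forms can be sketched as small record types. This is only an illustration under stated assumptions: the class names `ShortLeaf` and `LongLeaf`, the helper `make_leaf`, and the use of integer object IDs are hypothetical, and in the long leaf the stored count is implied by the length of the ID list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShortLeaf:
    first_id: int  # ID of the first object in a contiguous run
    count: int     # number of objects in the run

    def object_ids(self):
        return list(range(self.first_id, self.first_id + self.count))


@dataclass
class LongLeaf:
    ids: List[int]  # one direct reference per object; count == len(ids)

    def object_ids(self):
        return list(self.ids)


def make_leaf(ids):
    """Use the compact short form when the IDs form a contiguous ascending run."""
    ids = list(ids)
    if ids and ids == list(range(ids[0], ids[0] + len(ids))):
        return ShortLeaf(ids[0], len(ids))
    return LongLeaf(ids)


print(make_leaf([4, 5, 6]))  # ShortLeaf(first_id=4, count=3)
print(make_leaf([2, 9, 5]))  # LongLeaf(ids=[2, 9, 5])
```

The short form costs two integers regardless of how many objects it covers, while the long form grows with the number of references.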
- the search can include finding a leaf node 110 based on a search key and locating a set of matches among the objects referenced by the leaf node.
- a user inputs a search string character by character and the application searches the tree 102 to indicate a set of valid next input characters, until the search string is complete or the user requests a set of objects that match a partial key.
- the application can provide a display indicating the valid next characters, or otherwise output the valid next characters.
- a user inputs an entire or a partial search string.
- the tree that supports such searches can store the key prefix string at each tree node, with the shortest at the root and the full search key at the leaf.
- the search tree is compressed by reducing the node's key prefix to store only its own extension of the parent's prefix, such that the actual key prefix of a node is obtained by concatenating all key prefix strings on the path from the root to this node with the node's stored key prefix.
- the search tree is further compressed by collapsing nodes with a single child.
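The two compression steps above (each node stores only its own extension of the parent's prefix, and single-child nodes are collapsed) can be sketched as follows. The `Node` class, the helper names, and the sample keys are assumptions for illustration; the sketch keeps nodes where a complete key ends, which the text does not specify.

```python
class Node:
    def __init__(self, label=""):
        self.label = label   # only this node's extension of its parent's prefix
        self.children = []
        self.is_key = False  # True if a complete key ends at this node


def build_trie(keys):
    """Uncompressed trie: one node per character."""
    root = Node()
    for key in keys:
        node = root
        for ch in key:
            child = next((c for c in node.children if c.label == ch), None)
            if child is None:
                child = Node(ch)
                node.children.append(child)
            node = child
        node.is_key = True
    return root


def collapse(node, is_root=True):
    """Fold each non-key, single-child node into its child, working bottom-up."""
    for child in node.children:
        collapse(child, is_root=False)
    if not is_root and not node.is_key and len(node.children) == 1:
        only = node.children[0]
        node.label += only.label
        node.is_key = only.is_key
        node.children = only.children
    return node


def count_nodes(node):
    return 1 + sum(count_nodes(c) for c in node.children)


def keys_of(node, prefix=""):
    """Rebuild full keys by concatenating labels on the path from the root."""
    prefix += node.label
    found = [prefix] if node.is_key else []
    for child in node.children:
        found.extend(keys_of(child, prefix))
    return found


root = build_trie(["PINE", "PINTO", "POPLAR"])  # hypothetical keys
before = count_nodes(root)
collapse(root)
after = count_nodes(root)
print(before, after, sorted(keys_of(root)))  # 12 6 ['PINE', 'PINTO', 'POPLAR']
```

After collapsing, the labels along the path root, "P", "IN", "TO" concatenate back to "PINTO", so no key information is lost while the node count is halved in this example.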
- One embodiment of the present invention is a system comprising an application 104 with a map display 106 and a search system including a tree 102 and the object store 108.
- the tree 102 can be constructed with nodes associated with a key structure.
- the tree 102 can be compressed by reducing each node's prefix.
- the tree 102 can include leaf nodes that store objects, when the class of objects is intended to be accessed via a single search method.
- the tree 102 can include leaf nodes that contain references to objects in the object store, when that class of objects is intended to be accessed via more than one search method.
- the tree 102 can include leaf nodes that reference multiple objects in an object store 108, as the result of variable compression.
- the search can include searching to find a leaf node 110 based on a search key and checking the objects indicated by the leaf node.
- the system 100 can have a user interface 110 that can receive input from the user and produce an output.
- One exemplary output can be a next-character indication that shows the valid next characters.
- the set of available next characters can be determined from a search of the tree 102 and/or object store 108 as discussed below.
- Figures 2A and 2B show examples of short leaf nodes that contain a pointer (ID) to the first object and a number of objects that can be retrieved from an object store with a minimum number of read operations.
- Figure 2A shows an example where objects are stored in an object store 202 with fixed-size entries, requiring a single read (provided there is sufficient memory).
- Figure 2B shows an example where objects are stored in an object store 204 with variable-size entries. In that case, an offset array 206 with fixed-size offsets can be used to limit the number of read operations to two (provided there is sufficient memory).
- object data can be obtained for the number of objects given by the count.
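The one-read and two-read cases can be sketched with in-memory byte buffers standing in for storage. Everything concrete here is an assumption made for illustration: the 16-byte record size, NUL padding, 4-byte little-endian offsets, and an offset directory that holds one extra trailing entry marking the end of the store.

```python
import io
import struct

RECORD_SIZE = 16  # hypothetical fixed entry size in bytes


def read_fixed(store, first_id, count):
    """One read: fixed-size entries start at first_id * RECORD_SIZE."""
    store.seek(first_id * RECORD_SIZE)
    blob = store.read(count * RECORD_SIZE)
    return [blob[i:i + RECORD_SIZE].rstrip(b"\0").decode()
            for i in range(0, len(blob), RECORD_SIZE)]


def read_variable(offsets, store, first_id, count):
    """Two reads: one from the fixed-size offset directory, one from the store."""
    offsets.seek(first_id * 4)
    bounds = struct.unpack(f"<{count + 1}I", offsets.read((count + 1) * 4))
    store.seek(bounds[0])
    blob = store.read(bounds[-1] - bounds[0])
    return [blob[a - bounds[0]:b - bounds[0]].decode()
            for a, b in zip(bounds, bounds[1:])]


names = ["PINE RIDGE", "PINE VALLEY", "PINEBROOK", "PINECONE"]
encoded = [n.encode() for n in names]

# Fixed-size store: every entry padded to RECORD_SIZE bytes.
fixed_store = io.BytesIO(b"".join(e.ljust(RECORD_SIZE, b"\0") for e in encoded))

# Variable-size store plus offset directory (len(names) + 1 entries).
offs = [0]
for e in encoded:
    offs.append(offs[-1] + len(e))
offset_dir = io.BytesIO(struct.pack(f"<{len(offs)}I", *offs))
var_store = io.BytesIO(b"".join(encoded))

print(read_fixed(fixed_store, 1, 2))               # ['PINE VALLEY', 'PINEBROOK']
print(read_variable(offset_dir, var_store, 1, 2))  # ['PINE VALLEY', 'PINEBROOK']
```

In both cases the short leaf's (ID, count) pair is enough to fetch the whole group without visiting each object individually.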
- Short leaf nodes reduce storage requirements for the tree. This can be valuable for mobile geographic applications implemented on resource-constrained systems.
- In one embodiment, consecutive objects can be stored in a short leaf as shown in figure 3A.
- the short leaf node can contain an ID and a count. The order of the objects in the object store is arranged in the order indicated by the key structure.
- Figure 3B shows a long leaf.
- the long leaf can be used to point to non-consecutive objects with individual pointers for each object.
- Figure 8 shows an example with two trees 802 and 804 pointing to objects in the same object store 806.
- the two trees can be associated with two different input elements in a user interface.
- the tree whose search key structure follows the order of objects in the store 806 can use short leaf nodes to point to consecutive objects.
- the other trees can use long leaf nodes. Long leaves increase storage requirements. The number of read operations on long leaf objects is proportional to the number of objects in a long leaf.
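A small sketch can show why only the tree that follows the store order gets short leaves. The sample objects, the `group_ids` and `is_contiguous` helpers, and the choice of a STREET/CITY store order are assumptions for illustration.

```python
# Objects are (street, city) pairs; the store is sorted in STREET/CITY order.
store = sorted([
    ("Pine", "Oakland"),
    ("Kensington", "San Francisco"),
    ("Pine", "San Francisco"),
    ("Main", "Oakland"),
])


def group_ids(store, field_index):
    """Map each value of one key component to the IDs of the objects that carry it."""
    groups = {}
    for obj_id, obj in enumerate(store):
        groups.setdefault(obj[field_index], []).append(obj_id)
    return groups


def is_contiguous(ids):
    return ids == list(range(ids[0], ids[0] + len(ids)))


by_street = group_ids(store, 0)  # tree that follows the store order
by_city = group_ids(store, 1)    # second tree over the same store

print(by_street)  # {'Kensington': [0], 'Main': [1], 'Pine': [2, 3]}
print(by_city)    # {'San Francisco': [0, 3], 'Oakland': [1, 2]}
print(all(is_contiguous(ids) for ids in by_street.values()))  # True
print(is_contiguous(by_city["San Francisco"]))                # False
```

Groups under the street-first tree are contiguous runs and fit an (ID, count) pair, whereas a city-first tree over the same store has to list scattered IDs one by one.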
- Figure 4A illustrates an example of how to get the set of "next available characters" in one embodiment. If a user has input "PIN", the next available characters can be obtained by checking the prefixes of the children nodes of node 402.
- Figure 4B shows a system where a leaf node 404 references multiple consecutive objects in the object store.
- the names of objects corresponding to node 404 are obtained from the object store and analyzed to get the "next character" information.
- the names PINE RIDGE, PINE VALLEY, PINEBROOK, PINECONE, PINNACLE, PINTAIL, PINTO of objects associated with leaf node 404 all start with the user input "PIN". These names can be analyzed to obtain the valid next characters {'E','N','T'}, which can be output to the user.
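The analysis of a leaf's object names can be sketched in a few lines. The function name `next_chars_from_objects` is hypothetical; the names are the ones listed for leaf node 404.

```python
def next_chars_from_objects(names, typed):
    """Collect the character that follows `typed` in every name that starts with it."""
    return sorted({name[len(typed)]
                   for name in names
                   if name.startswith(typed) and len(name) > len(typed)})


leaf_404_names = ["PINE RIDGE", "PINE VALLEY", "PINEBROOK", "PINECONE",
                  "PINNACLE", "PINTAIL", "PINTO"]

print(next_chars_from_objects(leaf_404_names, "PIN"))   # ['E', 'N', 'T']
print(next_chars_from_objects(leaf_404_names, "PINE"))  # [' ', 'B', 'C']
```

Because the leaf has no children to inspect, the cost of this step grows with the number of objects the leaf references, which is one reason the compression level affects search performance.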
- leaf nodes need not have associated key information. This can mean that the leaf nodes will have the same key prefix as their parent node. This can allow the objects or object references to be easily combined into leaf nodes for the most efficient packing.
- the objects referenced by the leaf node can be accessed and then analyzed to determine the next character and to implement scrolling.
- the tree can have the leaf nodes at different levels of the tree.
- One embodiment of the present invention is a computer-implemented method of constructing a tree comprising a list of keys following a key structure, constructing a full tree structure and then pruning it, by combining nodes such that most leaf nodes are associated with multiple objects.
- Compression techniques can include maximizing leaf node references to objects to minimize the storage overhead required for each node, based on given criteria.
- Figure 5 shows an exemplary flowchart of one embodiment.
- a key structure is determined.
- An exemplary key structure for a street name can be "street-name?city-name", where "?" is a delimiting character. For example, "Kensington?San Francisco".
- An exemplary key structure for an intersection may be "street1name?street2name?city-name". Objects can be duplicated in the object store so that either order can be used to search for street intersections.
- In step 504, a list of keys for the objects based on the key structure can be determined.
- the key structure can also determine the order of the objects in the object store.
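Building the key list from a chosen key structure can be sketched as follows. The "?" delimiter comes from the exemplary keys above; the field names, sample records, and the helpers `make_key` and `build_sorted_keys` are assumptions for illustration.

```python
DELIMITER = "?"  # delimiting character from the exemplary key structure


def make_key(record, key_structure):
    """Concatenate the chosen fields of `record` in the chosen order."""
    return DELIMITER.join(record[field] for field in key_structure)


def build_sorted_keys(records, key_structure):
    """Key list in the order that would also govern the object store."""
    return sorted(make_key(r, key_structure) for r in records)


records = [  # hypothetical data fields
    {"CITY": "San Francisco", "STREET": "Kensington"},
    {"CITY": "Oakland", "STREET": "Pine"},
    {"CITY": "San Francisco", "STREET": "Pine"},
]

street_first = build_sorted_keys(records, ("STREET", "CITY"))
city_first = build_sorted_keys(records, ("CITY", "STREET"))

print(street_first)
# ['Kensington?San Francisco', 'Pine?Oakland', 'Pine?San Francisco']
print(city_first)
# ['Oakland?Pine', 'San Francisco?Kensington', 'San Francisco?Pine']
```

Swapping the order of the tuple is the only change needed to move from a street-first to a city-first search method, which is the kind of quick experiment the key-structure API is meant to support.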
- In step 506, a full node structure can be created based on the list of keys. This full node structure can be compressed as shown in steps 508, 510 and 512 to reduce the size of the tree by reducing the number of nodes and leaves. Exemplary steps are also shown in Figures 6A-6C.
- In figure 6A, node 604 with a single child node 602 is combined with the child node 602 to form node 606. Node 606 is associated with multiple characters in the search string.
- FIG. 6B shows an example of a compressing step.
- each grandchild node is checked to see whether it can be combined with another grandchild node. In one example, if both grandchild nodes have fewer than a given number of associated objects (such as 16 in one implementation), the tree is pruned to accommodate this criterion.
- nodes 610, 612 and 614 are combined together to form node 616.
- Figure 6C shows a case where node 620 is split into nodes 622 and 624 to keep the number of objects associated in each leaf node below a maximum size (such as 63 in one embodiment).
- The above example shows the steps as distinct.
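The merge and split criteria can be sketched on object counts alone. The thresholds 16 and 63 come from the text; treating sibling leaves as a list of counts, merging greedily from left to right, and reading 63 as the largest allowed leaf size are simplifying assumptions, not the patented procedure.

```python
MIN_OBJECTS = 16  # leaves below this size are candidates for merging
MAX_OBJECTS = 63  # assumed largest number of objects allowed in one leaf


def merge_small_leaves(leaf_sizes):
    """Merge adjacent sibling leaves while both sides hold fewer than MIN_OBJECTS objects."""
    merged = []
    for size in leaf_sizes:
        if merged and merged[-1] < MIN_OBJECTS and size < MIN_OBJECTS:
            merged[-1] += size
        else:
            merged.append(size)
    return merged


def split_large_leaves(leaf_sizes):
    """Split any leaf holding more than MAX_OBJECTS objects into several leaves."""
    out = []
    for size in leaf_sizes:
        while size > MAX_OBJECTS:
            out.append(MAX_OBJECTS)
            size -= MAX_OBJECTS
        out.append(size)
    return out


print(merge_small_leaves([3, 5, 4, 40, 2]))  # [12, 40, 2]
print(split_large_leaves([12, 150, 2]))      # [12, 63, 63, 24, 2]
```

Raising `MIN_OBJECTS` yields fewer, fuller leaves and a smaller tree at the cost of more object-store work per search, which is the trade-off the variable compression criteria are meant to tune.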
- tree nodes can store indications of other search criteria.
- a search or other operation on the tree can use the indications to determine whether the node and its offspring nodes or associated objects need to be further analyzed.
- the indications can be used to implement an n-dimensional search.
- the searches can be filtered by object attributes such as a category or a city.
- the indications can include indications of object categories that are not found among the node's offspring and/or indications of object categories that are included in at least one of its offspring.
- the searches can be filtered by a city.
- a user can search a point of interest by name, refined by a specified object category, and further refined by the name of the city where it resides.
- a character search for a point of interest could eliminate from the search path the nodes that exclude a category, such as fast food.
- the nodes can store category exclusion or inclusion information to simplify and speed up a search for a specific category.
- the exclusion information can indicate that no object associated with the node is in the category.
- the inclusion information can indicate that there is an object associated with the node in the category.
- FIG. 7 shows one example.
- a search on the tree segment can stop at node 702 if the search is for a restaurant and at node 704 if the search is for a gas station.
- the indications of other search criteria, such as exclusion information, can be implemented at the time of creation of the node tree.
- the tree of figure 7 can be used for a multi-dimensional search.
- the key information can be checked for a first dimension of the search and the search criteria information can be checked for additional dimensions of the search.
- the user interface can include checkboxes or the like to receive user input for additional search criteria indicated on the tree, for example object categories.
- the search can use the category information to determine which nodes to examine in the search. In the example of figure 7, if the user was looking for a gas station and had input a "P", "1" would not be shown as a next available character because node 704 excludes gas stations.
- the search criteria can be a code associated with certain nodes to indicate the categories not found among the node's offspring or the like.
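Category-based pruning of the next-character set can be sketched as follows. The `Node` class, the category strings, and the child labels are hypothetical and do not reproduce the contents of figure 7; each node is assumed to carry the set of categories absent from its whole subtree.

```python
class Node:
    def __init__(self, label, excluded=(), children=()):
        self.label = label
        self.excluded = frozenset(excluded)  # categories found nowhere below this node
        self.children = list(children)


def next_chars(node, category):
    """First characters of child labels whose subtrees may still contain `category`."""
    return sorted(child.label[0]
                  for child in node.children
                  if category not in child.excluded)


# Hypothetical tree segment: under "P", the "A" branch holds no gas stations.
p_node = Node("P", children=[
    Node("A", excluded={"gas station"}),
    Node("E", excluded={"restaurant"}),
    Node("O"),
])

print(next_chars(p_node, "gas station"))  # ['E', 'O']
print(next_chars(p_node, "restaurant"))   # ['A', 'O']
```

Because the exclusion sets are consulted before descending, a whole branch can be skipped without reading any of its objects from the object store.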
- the objects in the object store can also have associated category information, so the two-dimensional search can involve both the nodes of the tree and the objects in the object store.
- the API used to select the key structure can be used to add additional search criteria to the tree to enable the multi-dimensional search.
- One embodiment may be implemented using a conventional general-purpose or a specialized digital computer or microprocessor(s) programmed according to the teachings of the present disclosure, as will be apparent to those skilled in the computer art.
- Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art.
- the invention may also be implemented by the preparation of integrated circuits or by interconnecting an appropriate network of conventional component circuits, as will be readily apparent to those skilled in the art.
- One embodiment includes a computer program product which is a storage medium (media) having instructions stored thereon/in which can be used to program a computer to perform any of the features present herein.
- the storage medium can include, but is not limited to, any type of disk including floppy disks, optical discs, DVD, CD-ROMs, micro drive, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, flash memory, or any media or device suitable for storing instructions and/or data. Stored on any one of the computer readable medium (media), the present invention includes software for controlling both the hardware of the general purpose/specialized computer or microprocessor, and for enabling the computer or microprocessor to interact with a human user or other mechanism utilizing the results of the present invention.
- Such software may include, but is not limited to, device drivers, operating systems, execution environments/containers, and user applications.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002655011A CA2655011A1 (en) | 2006-06-30 | 2007-06-28 | Adaptive index with variable compression |
AU2007269283A AU2007269283A1 (en) | 2006-06-30 | 2007-06-28 | Adaptive index with variable compression |
BRPI0712822-3A BRPI0712822A2 (en) | 2006-06-30 | 2007-06-28 | Adaptable index with variable compression |
JP2009518560A JP2009543224A (en) | 2006-06-30 | 2007-06-28 | Adaptive index with variable compression |
EP07799151A EP2035973A4 (en) | 2006-06-30 | 2007-06-28 | Adaptive index with variable compression |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US80636706P | 2006-06-30 | 2006-06-30 | |
US80636606P | 2006-06-30 | 2006-06-30 | |
US60/806,366 | 2006-06-30 | ||
US60/806,367 | 2006-06-30 | ||
US11/770,058 US20080016066A1 (en) | 2006-06-30 | 2007-06-28 | Adaptive index with variable compression |
US11/770,426 US20080040384A1 (en) | 2006-06-30 | 2007-06-28 | Nearest search on adaptive index with variable compression |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2008005808A2 (en) | 2008-01-10 |
WO2008005808A3 (en) | 2008-10-16 |
Family
ID=38895344
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2007/072411 WO2008005808A2 (en) | 2006-06-30 | 2007-06-28 | Adaptive index with variable compression |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2008005808A2 (en) |
-
2007
- 2007-06-28 WO PCT/US2007/072411 patent/WO2008005808A2/en active Application Filing
Non-Patent Citations (1)
Title |
---|
See references of EP2035973A4 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2575054A1 (en) * | 2011-09-30 | 2013-04-03 | Harman Becker Automotive Systems GmbH | Method of generating search trees and navigation device |
WO2013045642A1 (en) * | 2011-09-30 | 2013-04-04 | Harman Becker Automotive Systems Gmbh | Method of generating search trees and navigation device |
US9904742B2 (en) | 2011-09-30 | 2018-02-27 | Harman Becker Automotive Systems Gmbh | Method of generating search trees and navigation device |
Also Published As
Publication number | Publication date |
---|---|
WO2008005808A3 (en) | 2008-10-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20080016066A1 (en) | Adaptive index with variable compression | |
JP5342958B2 (en) | How to query the structure of compressed data | |
EP1360616B1 (en) | Database system and query optimiser | |
US20050278378A1 (en) | Systems and methods of geographical text indexing | |
WO2003098477A1 (en) | Search and presentation engine | |
EP2241983A1 (en) | Method for searching objects in a database | |
EP3367268A1 (en) | Spatially coding and displaying information | |
CN106503223B (en) | online house source searching method and device combining position and keyword information | |
US20120265778A1 (en) | Fuzzy searching in a geocoding application | |
JP2002501256A (en) | Database device | |
US8700661B2 (en) | Full text search using R-trees | |
US20060101353A1 (en) | Multi-pane navigation model for graphical user interfaces | |
WO2008005808A2 (en) | Adaptive index with variable compression | |
CN101467149A (en) | Adaptive index with variable compression | |
US7359904B2 (en) | Method to efficiently process and present possible arrangements of a set of contiguous peer-to-peer links | |
US8745035B1 (en) | Multistage pipeline for feeding joined tables to a search system | |
US6625599B1 (en) | Method and apparatus for data searching and computer-readable medium for supplying program instructions | |
Azez | R-tree for spatial data structure | |
JPH09305619A (en) | Hierarchical index retrieving device and document retrieving method | |
KR20060078185A (en) | Information searching system for car navigation system | |
Kruckenberg et al. | Index Concepts | |
WO2002089005A2 (en) | Data retrieval method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 200780022043.8 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 07799151 Country of ref document: EP Kind code of ref document: A2 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2007799151 Country of ref document: EP |
|
ENP | Entry into the national phase in: |
Ref document number: 2655011 Country of ref document: CA |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2007269283 Country of ref document: AU |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2009518560 Country of ref document: JP Ref document number: 1020087030264 Country of ref document: KR |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2008149110 Country of ref document: RU |
|
WWE | Wipo information: entry into national phase |
Ref document number: 10372/DELNP/2008 Country of ref document: IN |
|
NENP | Non-entry into the national phase in: |
Ref country code: DE |
|
ENP | Entry into the national phase in: |
Ref document number: 2007269283 Country of ref document: AU Date of ref document: 20070628 Kind code of ref document: A |
|
ENP | Entry into the national phase in: |
Ref document number: PI0712822 Country of ref document: BR Kind code of ref document: A2 Effective date: 20081212 |