WO2020215436A1 - Search method for spatial keyword queries applied to electronic maps - Google Patents

Search method for spatial keyword queries applied to electronic maps

Info

Publication number
WO2020215436A1
WO2020215436A1 PCT/CN2019/088770 CN2019088770W WO2020215436A1 WO 2020215436 A1 WO2020215436 A1 WO 2020215436A1 CN 2019088770 W CN2019088770 W CN 2019088770W WO 2020215436 A1 WO2020215436 A1 WO 2020215436A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
query
node
construct
spatial
Prior art date
Application number
PCT/CN2019/088770
Other languages
English (en)
French (fr)
Inventor
姚斌
过敏意
陈全
张建锋
林昊
Original Assignee
上海交通大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海交通大学
Publication of WO2020215436A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9537: Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Definitions

  • The invention belongs to the technical field of positioning and specifically relates to a search method, running on the Spark platform, for spatial keyword queries applied to electronic maps.
  • A spatial keyword query takes the user's geographic location and multiple query keywords as parameters and returns spatial objects that are spatially and textually relevant to those parameters.
  • Constructing an effective index structure can greatly improve query efficiency.
  • A spatial index is a data structure that arranges the position, size and shape information of objects according to a certain structure.
  • The most advanced solutions for exact spatial keyword queries are all based on space-first index structures. The problem with these solutions is that a typical spatial text object has at least dozens of keywords.
  • Space-first structures are very inefficient when indexing spatial text objects that have dozens of keywords on average.
  • In addition, space-first structures use string matching to prune irrelevant nodes, which may be ineffective for keywords that appear frequently; in that case many child nodes still have to be visited. How to develop a new search method for spatial keyword queries that improves keyword indexing efficiency during the query and saves system resources is therefore a direction that those skilled in the art need to study.
  • R-tree: an extension of the B-tree to multi-dimensional space. It partitions spatial objects by range, and each node corresponds to a region and a disk page. The disk page of a non-leaf node stores the region ranges of all its child nodes, and the regions of all child nodes of a non-leaf node fall within its own region.
  • IR-tree: a model built on the inverted index and the R-tree index, in which text similarity is computed through the inverted index.
  • BFIR-tree: an IR-tree implemented for massive data processing.
  • CBFIR-tree: a dynamic BFIR-tree.
  • S2I-V structure: a model structure in which keywords of different frequencies are processed differently.
  • eBRQ: range query based on keyword containment.
  • aBRQ: k-nearest-neighbor query based on approximate keyword containment.
  • false positive: the false-detection rate.
  • KNN algorithm: the nearest-neighbor algorithm, one of the simplest methods in data mining classification.
  • I-Node: a leaf R-tree node that stores an inverted list mapping each keyword to the spatial keyword objects containing it.
  • The technical problem to be solved by the present invention is to provide a search method for spatial keyword queries applied to electronic maps that improves keyword indexing efficiency and saves system resources.
  • A search method for spatial keyword queries applied to electronic maps, comprising the following steps: S1: read each record in the data set to build the index, and go to step S2 for each keyword of a single record; S2: compare the frequency of the keyword with the frequency threshold; if the frequency of the keyword is lower than the frequency threshold, go to step S7, otherwise go to step S3; S3: construct a leaf node u: let the set of points contained in u be up, map each keyword t to the list of objects containing t to build the inverted list of u, and collect the vocabulary of u to build the Bloom filter of the parent node; S4: construct a non-leaf node p: let the entries of p be {c1, …, cf}, where f is the maximum number of entries a node can hold; the child nodes pointed to by the entries of p form the vocabulary of node p, and each keyword is inserted into the initialized Bloom filter; S5: construct the root node, completing the construction of the Bloom-filter-based IR-tree; S6: build the query index of the Bloom-filter-based IR-tree structure; S7: build an R-tree data query structure for the query keywords.
  • Keywords of different frequencies are processed differently. Specifically, keywords that appear frequently are pruned during the search by means of Bloom filters, while keywords that appear infrequently are mapped directly into an R-tree data structure.
  • The KNN algorithm is used to implement the eBKQ query in step S61.
  • The frequency threshold in step S1 is an adjustable value.
  • The R-tree is a mature data query structure for processing multi-dimensional data.
  • The object space is partitioned by range, and each node corresponds to a region and a disk page. The disk page of a non-leaf node stores the region ranges of all its child nodes, and the regions of all child nodes of a non-leaf node fall within its own region; the disk page of a leaf node stores the bounding rectangles of all spatial objects within its region.
  • During a spatial keyword search, the present invention uses Bloom filters to filter out most of the child nodes in the R-tree and then verifies the remaining child nodes by exact matching. This avoids traversing all nodes on every access to the R-tree, thereby improving keyword indexing efficiency and saving system resources.
  • Fig. 1 is a schematic diagram of the workflow of Embodiment 1;
  • Fig. 2 is a schematic diagram of the influence on the query area of the present solution as the query percentage gradually increases;
  • Fig. 3 is a schematic diagram of the influence on the query area of the present solution as the number of keywords gradually increases.
  • A search method for spatial keyword queries applied to electronic maps according to the present invention comprises the following steps:
  • S1: read each record in the data set to build the index, and go to step S2 for each keyword of a single record;
  • S2: compare the frequency of the keyword with the frequency threshold; if the frequency of the keyword is lower than the frequency threshold, go to step S7, otherwise go to step S3;
  • S3: construct a leaf node u: let the set of points contained in u be up, map each keyword t to the list of objects containing t to build the inverted list of u, and collect the vocabulary of u to build the Bloom filter of the parent node;
  • In step S61, the KNN algorithm is used to implement the eBKQ query: a priority queue is maintained, ordered by the distance of each record to the given query location. Records that pass the text match are added to the queue; dequeued records are then added to the final result, and the search stops once k results have been obtained or the queue is empty.
  • A Bloom filter maps a set S of m elements to an n-bit binary array (denoted {B[1], …, B[n]}, with every bit initialized to 0). The Bloom filter is based on a hash function family H of k independent hash functions, each of which maps every element of a given element space U to a random number v ∈ [1, n]. Each element of S is mapped through the k hash functions to the corresponding values, and the corresponding positions of the binary array are set to 1. To query whether an element t is in S, check whether the bits B[h_i(t)] (i ∈ [1, k]) corresponding to t in the Bloom filter array are all 1. If they are all 1, t is in S with high probability; otherwise t is definitely not in S.
  • By choosing a suitable number of hash functions k and a suitable size for the binary array B, a low error rate of the Bloom filter can be ensured and the efficient pruning ability of the B-Node can be realized.
  • Setting multiple query keywords can further reduce the false positive probability of the B-Node. Assume that a Bloom filter indexes n elements using k hash functions and an m-bit binary array, and that the current query contains s keywords; the false positive probability is then given by the formula in the description (image PCTCN2019088770-appb-000001).
  • The experiments were performed on a cluster of 17 nodes with three configurations: (1) 8 machines with a 6-core Intel Xeon E5-2603 v3 1.60GHz processor and 20GB RAM; (2) 2 machines with a 6-core Intel Xeon E5-2620 2.00GHz processor and 56GB RAM; (3) 7 machines with a 6-core Intel Xeon E5-2609 1.90GHz processor and 16GB RAM.
  • Each slave node uses 15GB of memory and all 6 available cores for the computation. All nodes run Ubuntu 14.04.2 LTS with Hadoop 2.4.1 and Spark 1.3.0 installed. The experiments were carried out on two real, large-scale data sets.
  • The influence of the query area is shown by gradually increasing the query percentage from 1% to 20%. All four structures show only a slow degradation in performance (that is, in system throughput and average latency). A larger query area usually introduces a higher search cost, but thanks to the additional pruning provided by text matching, the cost increases only slightly.
  • The BFIR-tree performs as well as the IR-tree.
  • The BFIR-tree is superior to the IR-tree in terms of space overhead.
  • The CBFIR-tree is essentially similar to the BFIR-tree, and the performance gap between them is very small.
  • When the number of keywords increases, the performance of S2I-V improves accordingly.
  • When the number of query keywords increases, S2I-V achieves a significant performance improvement by exploiting the pruning ability of infrequent keywords.
  • The technical solution of the present invention is suitable for location-based service applications such as Dazhong Dianping (大众点评).

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

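A search method for spatial keyword queries applied to electronic maps, comprising the following steps: S1: read the number of keywords to be queried; if there are multiple keywords to be queried, go to step S2, otherwise go to step S7; S2: compare the frequency of the keyword to be queried with a frequency threshold; if the keyword is a low-frequency keyword, go to step S7, otherwise go to step S3; S3: construct a leaf node u: map each keyword t to the list of objects containing t to build the inverted list of u, and collect the vocabulary of u to build the Bloom filter of the parent node; S4: construct a non-leaf node p: the child nodes pointed to by the entries of p form the vocabulary of node p, which is inserted into the initialized Bloom filter; S5: construct the Bloom-filter-based IR-tree; S6: build the query index of the IR-tree; S7: build an R-tree query structure for the keywords to be queried. The method improves keyword indexing efficiency and saves system resources.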

Description

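Search method for spatial keyword queries applied to electronic maps
Technical Field
The invention belongs to the technical field of positioning and specifically relates to a search method, running on the Spark platform, for spatial keyword queries applied to electronic maps.
Background Art
In recent years, with the development of communication technology and the widespread use of mobile terminals, location-based social services have emerged in large numbers. A spatial keyword query takes the user's geographic location and multiple query keywords as parameters and returns spatial objects that are spatially and textually relevant to those parameters. Constructing an effective index structure can greatly improve query efficiency. A spatial index is a data structure that arranges the position, size and shape information of objects according to a certain structure. The most advanced solutions for exact spatial keyword queries are all based on space-first index structures. The problem with these solutions is that a typical spatial text object has at least dozens of keywords, and space-first structures are very inefficient when indexing spatial text objects that have dozens of keywords on average. In addition, space-first structures use string matching to prune irrelevant nodes, which may be ineffective for keywords that appear frequently; in that case many child nodes still have to be visited. How to develop a new search method for spatial keyword queries that improves keyword indexing efficiency during the query and saves system resources is therefore a direction that those skilled in the art need to study.
The abbreviations used in this application are explained below:
R-tree: an extension of the B-tree to multi-dimensional space. It partitions spatial objects by range, and each node corresponds to a region and a disk page. The disk page of a non-leaf node stores the region ranges of all its child nodes, and the regions of all child nodes of a non-leaf node fall within its own region.
IR-tree: a model built on the inverted index and the R-tree index, in which text similarity is computed through the inverted index.
BFIR-tree: an IR-tree implemented for massive data processing.
CBFIR-tree: a dynamic BFIR-tree.
S2I-V structure: a model structure in which keywords of different frequencies are processed differently.
eBRQ: range query based on keyword containment.
aBRQ: k-nearest-neighbor query based on approximate keyword containment.
false positive: the false-detection rate.
KNN algorithm: the nearest-neighbor algorithm, one of the simplest methods in data mining classification.
I-Node: a leaf R-tree node that stores an inverted list mapping each keyword to the spatial keyword objects containing it.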
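Summary of the Invention
The technical problem to be solved by the present invention is to provide a search method for spatial keyword queries applied to electronic maps that improves keyword indexing efficiency and saves system resources.
The technical solution adopted is as follows:
A search method for spatial keyword queries applied to electronic maps, comprising the following steps: S1: read each record in the data set to build the index, and go to step S2 for each keyword of a single record; S2: compare the frequency of the keyword with the frequency threshold; if the frequency of the keyword is lower than the frequency threshold, go to step S7, otherwise go to step S3; S3: construct a leaf node u: let the set of points contained in u be up, map each keyword t to the list of objects containing t to build the inverted list of u, and collect the vocabulary of u to build the Bloom filter of the parent node; S4: construct a non-leaf node p: let the entries of p be {c1, …, cf}, where f is the maximum number of entries a node can hold; the child nodes pointed to by the entries of p form the vocabulary of node p, and each keyword is inserted into the initialized Bloom filter; S5: construct the root node, completing the construction of the Bloom-filter-based IR-tree; S6: build the query index of the Bloom-filter-based IR-tree structure; S7: build an R-tree data query structure for the query keywords.
With this technical solution, keywords of different frequencies are handled differently. Specifically, keywords that appear frequently are pruned during the search by means of Bloom filters, while keywords that appear infrequently are mapped directly into an R-tree data structure.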
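The frequency split of steps S1 and S2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `split_by_frequency`, the record layout `(object_id, (x, y), keywords)` and the definition of frequency as "number of objects containing the keyword" are all hypothetical choices made for the example.

```python
from collections import Counter


def split_by_frequency(objects, threshold):
    """Partition the vocabulary into frequent and infrequent keywords.

    objects:   iterable of (object_id, (x, y), keywords) records (assumed layout).
    threshold: adjustable cutoff; frequency here means the number of
               objects containing the keyword (an assumption).
    """
    freq = Counter()
    for _oid, _loc, keywords in objects:
        freq.update(set(keywords))  # count each keyword once per object
    # Below the threshold -> R-tree branch (S7); otherwise -> Bloom-filter IR-tree (S3).
    frequent = {t for t, c in freq.items() if c >= threshold}
    infrequent = set(freq) - frequent
    return frequent, infrequent


objects = [
    (1, (0.0, 0.0), ["cafe", "wifi"]),
    (2, (1.0, 2.0), ["cafe", "bakery"]),
    (3, (3.0, 1.0), ["cafe", "museum"]),
]
frequent, infrequent = split_by_frequency(objects, threshold=2)
```

With this toy data only `cafe` reaches the threshold, so it would be indexed in the Bloom-filter-based IR-tree, while the remaining keywords would go to the R-tree structure.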
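Preferably, in the above search method for spatial keyword queries applied to electronic maps, step S6 comprises the following steps: S61: given an eBKQ query of the form eBKQ = {Qs = (τ, ε), Qt}, where Qs is the spatial condition and Qt is a set of keywords, check for the current node whether Qs lies within the query region; if it does, go to step S62; if it does not, recursively check the child nodes of the node; S62: check whether each keyword in Qt is present in the Bloom filter of the node; if not, prune the node; if so, go to step S63; S63: map each keyword to its corresponding record list and intersect these lists to obtain the final solution set.
More preferably, in the above search method for spatial keyword queries, the KNN algorithm is used to implement the eBKQ query in step S61.
Further preferably, in the above search method for spatial keyword queries applied to electronic maps, the frequency threshold in step S1 is an adjustable value.
In the above solution, the R-tree is a mature data query structure for processing multi-dimensional data. The object space is partitioned by range, and each node corresponds to a region and a disk page. The disk page of a non-leaf node stores the region ranges of all its child nodes, and the regions of all child nodes of a non-leaf node fall within its own region; the disk page of a leaf node stores the bounding rectangles of all spatial objects within its region.
At the same time, during a spatial keyword search the present invention uses Bloom filters to filter out most of the child nodes in the R-tree and then verifies the remaining child nodes by exact matching. This avoids traversing all nodes on every access to the R-tree, thereby improving keyword indexing efficiency and saving system resources.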
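The pruning idea described above (spatial check, Bloom-filter check, then exact verification through inverted lists) can be illustrated with a small in-memory sketch. Everything named here (`Bloom`, `Node`, `make_leaf`, `make_parent`, `query`, the filter sizes, and the choice to attach a filter to every node) is an assumption made for illustration; the source does not specify these details.

```python
import hashlib
from dataclasses import dataclass, field


class Bloom:
    """Tiny Bloom filter; sizes are illustrative, not taken from the patent."""

    def __init__(self, n_bits=256, n_hashes=4):
        self.n_bits, self.n_hashes, self.bits = n_bits, n_hashes, 0

    def _positions(self, item):
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


@dataclass
class Node:
    mbr: tuple                                     # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)
    inverted: dict = field(default_factory=dict)   # leaf: keyword -> {id: (x, y)}
    bloom: Bloom = field(default_factory=Bloom)


def make_leaf(mbr, objects):
    leaf = Node(mbr)
    for oid, loc, keywords in objects:
        for t in keywords:
            leaf.inverted.setdefault(t, {})[oid] = loc  # inverted list of the leaf
            leaf.bloom.add(t)
    return leaf


def make_parent(children):
    mbr = (min(c.mbr[0] for c in children), min(c.mbr[1] for c in children),
           max(c.mbr[2] for c in children), max(c.mbr[3] for c in children))
    parent = Node(mbr, children=list(children))
    for c in children:
        parent.bloom.bits |= c.bloom.bits  # union of equally sized filters
    return parent


def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(region, p):
    return region[0] <= p[0] <= region[2] and region[1] <= p[1] <= region[3]


def query(node, region, keywords):
    if not intersects(node.mbr, region):                        # spatial pruning
        return set()
    if not all(node.bloom.might_contain(t) for t in keywords):  # Bloom pruning
        return set()
    if node.children:
        found = set()
        for child in node.children:
            found |= query(child, region, keywords)
        return found
    # Leaf: intersect the inverted lists, then verify each location exactly.
    lists = [node.inverted.get(t, {}) for t in keywords]
    if not lists:
        return set()
    ids = set(lists[0]).intersection(*lists[1:])
    return {oid for oid in ids if contains(region, lists[0][oid])}


leaf_a = make_leaf((0, 0, 2, 2), [(1, (1, 1), ["cafe", "wifi"]), (2, (2, 2), ["cafe"])])
leaf_b = make_leaf((5, 5, 8, 8), [(3, (6, 6), ["cafe", "wifi"])])
root = make_parent([leaf_a, leaf_b])
```

Because the leaf step checks the inverted lists exactly, a Bloom-filter false positive can only cost an extra node visit; it cannot add a wrong object to the result.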
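Brief Description of the Drawings
The present invention is described in further detail below with reference to the drawings and specific embodiments:
Fig. 1 is a schematic diagram of the workflow of Embodiment 1;
Fig. 2 is a schematic diagram of the influence on the query area of the present solution as the query percentage gradually increases;
Fig. 3 is a schematic diagram of the influence on the query area of the present solution as the number of keywords gradually increases.
Detailed Description
To explain the technical solution of the present invention more clearly, it is described further below with reference to the embodiments.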
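Embodiment 1:
As shown in Fig. 1, a search method for spatial keyword queries applied to electronic maps according to the present invention comprises the following steps:
S1: read each record in the data set to build the index, and go to step S2 for each keyword of a single record;
S2: compare the frequency of the keyword with the frequency threshold; if the frequency of the keyword is lower than the frequency threshold, go to step S7, otherwise go to step S3;
S3: construct a leaf node u: let the set of points contained in u be up, map each keyword t to the list of objects containing t to build the inverted list of u, and collect the vocabulary of u to build the Bloom filter of the parent node;
S4: construct a non-leaf node p: let the entries of p be {c1, …, cf}, where f is the maximum number of entries a node can hold; the child nodes pointed to by the entries of p form the vocabulary of node p, and each keyword is inserted into the initialized Bloom filter;
S5: construct the root node, completing the construction of the Bloom-filter-based IR-tree;
S61: given an eBKQ query of the form eBKQ = {Qs = (τ, ε), Qt}, where Qs is the spatial condition and Qt is a set of keywords, check for the current node whether Qs lies within the query region; if it does, go to step S62; if it does not, recursively check the child nodes of the node;
S62: check whether each keyword in Qt is present in the Bloom filter of the node; if not, prune the node; if so, go to step S63;
S63: map each keyword to its corresponding record list and intersect these lists to obtain the final solution set.
S7: build an R-tree data query structure for the query keywords.
In step S61, the KNN algorithm is used to implement the eBKQ query: a priority queue is maintained, ordered by the distance of each record to the given query location. Records that pass the text match are added to the queue; dequeued records are then added to the final result, and the search stops once k results have been obtained or the queue is empty.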
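The priority-queue procedure just described can be sketched in a few lines. This is a simplified illustration, assuming a flat list of records rather than a tree traversal, Euclidean distance, and "text match" meaning that a record contains every query keyword; the function name `knn_keyword_query` and the record layout are hypothetical.

```python
import heapq
import math


def knn_keyword_query(records, location, keywords, k):
    """Return the ids of the k nearest records that contain all query keywords.

    records: iterable of (object_id, (x, y), keywords) tuples (assumed layout).
    """
    wanted = set(keywords)
    queue = []  # min-heap ordered by distance to the query location
    for oid, loc, kws in records:
        if wanted <= set(kws):  # text match: record contains every keyword
            heapq.heappush(queue, (math.dist(loc, location), oid))
    result = []
    while queue and len(result) < k:  # stop at k results or an empty queue
        _distance, oid = heapq.heappop(queue)
        result.append(oid)
    return result


records = [
    (1, (0.0, 0.0), ["cafe"]),
    (2, (1.0, 1.0), ["cafe", "wifi"]),
    (3, (5.0, 5.0), ["cafe", "wifi"]),
    (4, (0.5, 0.5), ["wifi"]),
]
nearest = knn_keyword_query(records, (0.0, 0.0), ["cafe", "wifi"], k=1)
```

In a tree-based index the same queue would normally hold nodes as well as records, so that subtrees are expanded in distance order; that refinement is omitted here.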
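In the above process:
A Bloom filter maps a set S of m elements to an n-bit binary array (denoted {B[1], …, B[n]}, with every bit initialized to 0). The Bloom filter is based on a hash function family H of k independent hash functions, each of which maps every element of a given element space U to a random number v ∈ [1, n]. Each element of S is mapped through the k hash functions to the corresponding values, and the corresponding positions of the binary array are set to 1. To query whether an element t is in S, check whether the bits B[h_i(t)] (i ∈ [1, k]) corresponding to t in the Bloom filter array are all 1. If they are all 1, t is in S with high probability; otherwise t is definitely not in S.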
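A Bloom filter of the kind described in this paragraph can be sketched as follows. The class name `BloomFilter`, the use of salted SHA-256 digests to simulate k independent hash functions, and the 0-based bit positions are assumptions of this example, not details given in the source.

```python
import hashlib


class BloomFilter:
    """n-bit array with k hash functions, as in the description above."""

    def __init__(self, n_bits, n_hashes):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, all initialized to 0

    def _positions(self, item):
        # Simulate k independent hash functions by salting one digest with i.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # All k bits set: probably present. Any bit clear: definitely absent.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(n_bits=1024, n_hashes=7)
for keyword in ["cafe", "wifi"]:
    bf.add(keyword)
```

A `True` from `might_contain` can be a false positive, whereas a `False` is always correct; this asymmetry is what makes the filter safe to use for pruning.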
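By choosing a suitable number of hash functions k and a suitable size for the binary array B, a low error rate of the Bloom filter can be ensured and the efficient pruning ability of the B-Node can be realized. In addition, setting multiple query keywords can further reduce the false positive probability of the B-Node. Assume that a Bloom filter indexes n elements using k hash functions and an m-bit binary array, and that the current query contains s keywords; the false positive probability is then: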
Figure PCTCN2019088770-appb-000001
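When m/n = 10 (ten array bits per indexed element) and k = 7, the false positive probability is only about (0.008)^s. Moreover, the I-Node, as the final checking step, guarantees the correctness of the BFIR-tree, which means that it has 100% recall.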
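The formula itself survives in this text only as an image reference. Assuming it is the standard Bloom filter estimate (an assumption, since the source does not reproduce it), with n indexed elements, an m-bit array, k hash functions and s query keywords treated as independent, it would read:

```latex
P_{\mathrm{fp}}
  = \left( 1 - \left( 1 - \frac{1}{m} \right)^{kn} \right)^{ks}
  \approx \left( 1 - e^{-kn/m} \right)^{ks}

% Worked check with ten array bits per indexed element and k = 7:
%   e^{-7/10} \approx 0.4966, \quad 1 - 0.4966 = 0.5034, \quad 0.5034^{7} \approx 0.0082
% so P_{\mathrm{fp}} \approx (0.0082)^{s}
```

The worked check reproduces the per-keyword value of roughly 0.008 quoted in the text, which supports (but does not prove) that this is the intended formula.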
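Comparison experiment:
The experiments were performed on a cluster of 17 nodes with three configurations: (1) 8 machines with a 6-core Intel Xeon E5-2603 v3 1.60GHz processor and 20GB RAM; (2) 2 machines with a 6-core Intel Xeon E5-2620 2.00GHz processor and 56GB RAM; (3) 7 machines with a 6-core Intel Xeon E5-2609 1.90GHz processor and 16GB RAM. One machine of type (2) was chosen as the master node, and the other machines served as slave nodes. Each slave node uses 15GB of memory and all 6 available cores for the computation. All nodes run Ubuntu 14.04.2 LTS with Hadoop 2.4.1 and Spark 1.3.0 installed. The experiments were carried out on two real, large-scale data sets.
As shown in Fig. 2:
The influence of the query area is shown by gradually increasing the query percentage from 1% to 20%. All four structures show only a slow degradation in performance (that is, in system throughput and average latency). A larger query area usually introduces a higher search cost, but thanks to the additional pruning provided by text matching, the cost increases only slightly.
As shown in Fig. 3:
The BFIR-tree performs as well as the IR-tree, and it is superior to the IR-tree in terms of space overhead. The CBFIR-tree is essentially similar to the BFIR-tree, and the performance gap between them is very small. When the number of keywords increases, the performance of S2I-V improves accordingly. When the user issues a spatial keyword query with only a single keyword, S2I-V behaves similarly to the other three structures. When the number of query keywords increases, S2I-V achieves a significant performance improvement by exploiting the pruning ability of infrequent keywords.
The technical solution of the present invention is therefore suitable for location-based service applications such as Dazhong Dianping (大众点评).
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall be that of the claims.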

Claims (4)

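  1. A search method for spatial keyword queries applied to electronic maps, characterized by comprising the following steps:
    S1: read each record in the data set to build the index, and go to step S2 for each keyword of a single record;
    S2: compare the frequency of the keyword with the frequency threshold; if the frequency of the keyword is lower than the frequency threshold, go to step S7, otherwise go to step S3;
    S3: construct a leaf node u: let the set of points contained in u be up, map each keyword t to the list of objects containing t to build the inverted list of u, and collect the vocabulary of u to build the Bloom filter of the parent node;
    S4: construct a non-leaf node p: let the entries of p be {c1, …, cf}, where f is the maximum number of entries a node can hold; the child nodes pointed to by the entries of p form the vocabulary of node p, and each keyword is inserted into the initialized Bloom filter;
    S5: construct the root node, completing the construction of the Bloom-filter-based IR-tree;
    S6: build the query index of the Bloom-filter-based IR-tree structure;
    S7: build an R-tree data query structure for the query keywords.
  2. The search method for spatial keyword queries applied to electronic maps according to claim 1, characterized in that step S6 comprises the following steps:
    S61: given an eBKQ query of the form eBKQ = {Qs = (τ, ε), Qt}, where Qs is the spatial condition and Qt is a set of keywords, check for the current node whether Qs lies within the query region; if it does, go to step S62; if it does not, recursively check the child nodes of the node;
    S62: check whether each keyword in Qt is present in the Bloom filter of the node; if not, prune the node; if so, go to step S63;
    S63: map each keyword to its corresponding record list and intersect these lists to obtain the final solution set.
  3. The search method for spatial keyword queries according to claim 2, characterized in that the KNN algorithm is used to implement the eBKQ query in step S61.
  4. The search method for spatial keyword queries according to claim 1, characterized in that the frequency threshold in step S1 is an adjustable value.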
PCT/CN2019/088770 2019-04-24 2019-05-28 Search method for spatial keyword queries applied to electronic maps WO2020215436A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910333874.1A CN110069592A (zh) 2019-04-24 2019-04-24 Search method for spatial keyword queries applied to electronic maps
CN201910333874.1 2019-04-24

Publications (1)

Publication Number Publication Date
WO2020215436A1 (zh) 2020-10-29

Family

ID=67368627

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088770 WO2020215436A1 (zh) 2019-04-24 2019-05-28 Search method for spatial keyword queries applied to electronic maps

Country Status (2)

Country Link
CN (1) CN110069592A (zh)
WO (1) WO2020215436A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116821279B (zh) * 2023-06-06 2024-06-07 哈尔滨理工大学 Spatial keyword query method and system with excluded keywords

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
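CN102722553A (zh) * 2012-05-24 2012-10-10 浙江大学 Distributed inverted index organization method based on user log analysis
CN102722526A (zh) * 2012-05-16 2012-10-10 成都信息工程学院 Method for identifying duplicate and near-duplicate web pages based on part-of-speech classification statistics
CN103279560A (zh) * 2013-06-13 2013-09-04 清华大学 Continuous keyword query method based on safe regions
CN106874516A (zh) * 2017-03-15 2017-06-20 电子科技大学 Efficient ciphertext retrieval method based on KCB tree and Bloom filter in cloud storage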
US20180004786A1 (en) * 2016-06-29 2018-01-04 EMC IP Holding Company LLC Incremental bloom filter rebuild for b+ trees under multi-version concurrency control

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
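CN104090962B (zh) * 2014-07-14 2017-03-29 西北工业大学 Nested query method for massive distributed databases
US9910878B2 (en) * 2014-07-21 2018-03-06 Oracle International Corporation Methods for processing within-distance queries
CN104536984B (zh) * 2014-12-08 2017-10-13 北京邮电大学 Verification method and system for spatial text top-k queries in outsourced databases
CN107391636B (zh) * 2017-07-10 2020-06-09 江苏省现代企业信息化应用支撑软件工程技术研发中心 Top-m reverse nearest neighbor spatial keyword query method
CN107491497B (zh) * 2017-07-25 2020-08-11 福州大学 Multi-user multi-keyword ranked searchable encryption system supporting queries in any language
CN108337085B (zh) * 2018-01-03 2020-11-13 西安电子科技大学 Construction method for approximate nearest neighbor retrieval supporting dynamic updates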

Also Published As

Publication number Publication date
CN110069592A (zh) 2019-07-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19926133

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.02.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19926133

Country of ref document: EP

Kind code of ref document: A1