CN106055674B - Top-k dominating query method based on metric space in a distributed environment - Google Patents


Info

Publication number
CN106055674B
Authority
CN
China
Prior art keywords
ann
skyband
dominates
metric space
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610393610.1A
Other languages
Chinese (zh)
Other versions
CN106055674A (en)
Inventor
何洁月
罗浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN201610393610.1A priority Critical patent/CN106055674B/en
Publication of CN106055674A publication Critical patent/CN106055674A/en
Application granted granted Critical
Publication of CN106055674B publication Critical patent/CN106055674B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24532Query optimisation of parallel queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a top-k dominating query method based on metric space in a distributed environment, comprising the following steps in order. Step 1: a query input set Q and a distance function d() in the metric space are given, where d() measures the distance between the data objects and the query objects in Q. Step 2: based on step 1, a parallel algorithm based on the set ANN and the k-skyband is proposed. By making full use of the parallel computation available across the nodes of a distributed environment, and by pruning and sorting, the method greatly improves the performance of metric-space top-k dominating queries on large data sets, speeds up queries, and supports user decision-making.

Description

A top-k dominating query method based on metric space in a distributed environment
Technical field
The present invention relates to a query method, and in particular to a parallel top-k dominating query method based on metric space for massive data sets in a distributed environment.
Background art
The top-k dominating query based on metric space is an important type of complex query that is attracting increasing attention. It returns, from a massive multidimensional data set, the portion of the data that meets the user's needs. Such queries support user decision-making and are widely used in fields such as web search, multimedia retrieval, and e-commerce. The query does not require the user to supply a scoring function, and the size of the result set is controllable: it computes the domination score of each object and returns the k objects with the highest domination scores.
The top-k dominating query based on metric space is defined as follows. Let $O = \{o_1, o_2, \ldots, o_n\}$ denote the set of all data objects, where $o_i$ is the i-th data object; each data object has D dimensions and is a point in the space. For a top-k dominating query in a metric space, Q denotes the query input set and d() denotes the distance function of the metric space. This distance function can be user-defined, for example the shortest path in a graph, the maximum flow in a network, or the Manhattan distance. k is the number of results with the highest domination scores to return. Dominance is defined as follows: for $o_i \in O$ and $o_{i'} \in O$, the symbol $\prec$ denotes the dominance relation between two objects, and $o_i \prec o_{i'}$ holds if and only if:
$$\forall q_j \in Q:\ d(o_i, q_j) \le d(o_{i'}, q_j) \quad\text{and}\quad \exists q_j \in Q:\ d(o_i, q_j) < d(o_{i'}, q_j)$$
Given a data object $o_i \in O$, the domination score dscore of $o_i$ is the number of objects in the entire data set that it dominates:
$$\mathrm{dscore}(o_i) = |\{\, o_j \in O \mid o_i \prec o_j \,\}|$$
The top-k dominating query based on metric space only needs to return the k objects with the highest domination scores; it is a dynamic top-k dominating query. Tiakas E et al. first proposed the concept, but studied it only in the traditional single-machine setting. As data sets grow rapidly, traditional single-machine algorithms hit a performance bottleneck, and the M-tree index structure used by Tiakas E et al. is unsuitable for large data sets because it leads to a large amount of data redundancy. Research on parallel top-k dominating algorithms based on metric space is therefore urgently needed.
Summary of the invention
Object of the invention: the object of the present invention is to overcome the deficiencies of the prior art by providing a parallel top-k dominating query method based on metric space in a distributed environment.
Technical solution: the parallel top-k dominating query method based on metric space in a distributed environment according to the present invention comprises the following steps, executed in order:
(1) A query input data object set Q and a distance function d() in the metric space are given; d() is used to measure the distance between the data objects in O and the query input data object set Q;
(2) Based on step (1), a parallel algorithm based on the set ANN and the k-skyband is proposed, as follows:
(21) Pruning with ANN(Q, k):
Using the distance function d() and the query input Q, the distances between every data object and the query input objects are computed as Deal_Data_RDD and stored in the partitions. Each partition then independently computes its own ANN(Q, k) in parallel, and the per-partition ANN(Q, k) results are finally reduced to the global ANN(Q, k) through the reduce interface. The global ANN(Q, k) is broadcast to every node and used to filter the original data set, yielding the candidate set KANN(Q, k)_RDD, which is guaranteed to contain the final top-k dominating result set D. The filtering rule is to keep the objects that are not dominated by the objects in ANN(Q, k);
(22) Pruning with the k-skyband:
Because the resulting KANN(Q, k)_RDD may be very large, directly computing the domination scores of all of its objects would still be time-consuming. The k-skyband of KANN(Q, k)_RDD is therefore computed as a further pruning step, yielding the final candidate set GlobalCandidate(k-skyband);
(23) Obtaining the top-k dominating result:
The domination scores of all objects in GlobalCandidate(k-skyband) are computed, and the k objects with the highest domination scores are returned as the top-k dominating result.
Further, in step (21), because the ANN(Q, k) of an individual partition is not necessarily the global ANN(Q, k), the distances in the per-partition ANN(Q, k) results must be compared one by one to obtain the global ANN(Q, k).
Further, step (23) is carried out as follows: the Cartesian product of the candidate set obtained in step (22) and the original data set is computed, and the ReduceByKey API provided by Spark is then used to obtain the domination score of each candidate.
Beneficial effects: the present invention provides a top-k dominating query based on metric space in a distributed environment and proposes three distributed algorithms for solving it. By making full use of the parallel computation available across the nodes of a distributed environment, and by pruning and sorting, it greatly improves the performance of metric-space top-k dominating queries on large data sets, speeds up queries, and supports user decision-making. Specifically, it has the following advantages:
(1) A parallel skyline computation method is proposed in which every partition computes its skyline at the same time, so that the skyline can be solved quickly to obtain the top-k dominating result set;
(2) A parallel k-skyband computation method is proposed in which each partition computes its k-skyband independently, without affecting the others; by exploiting the properties of the k-skyband, the result is obtained without iteration;
(3) A method is proposed that first prunes with the set ANN and then computes the k-skyband in parallel. The pruning is effective and reduces the number of comparisons between data objects, which speeds up queries.
Brief description of the drawings
Fig. 1 is a flow chart of the DAKDA algorithm of the present invention;
Fig. 2 is a schematic diagram of the effect of k on the query in the embodiment;
Fig. 3 is a schematic diagram of the effect of m on the query in the embodiment;
Fig. 4 is a schematic diagram of the effect of c on the query in the embodiment;
Fig. 5 is a scalability comparison of the algorithms in the present invention;
Fig. 6 is a diagram of the distributed processing of the present invention;
Fig. 7 is an example diagram of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is described in detail below, but the scope of protection of the present invention is not limited to the embodiment.
The symbols and parameters used below are defined in Table 1:
Table 1: Symbol description
Definition 1 (KNN(q, k)): Given a data set O, a metric function d(), and an object o ∈ O, the k nearest neighbors of o are denoted KNN(o, k), i.e. the k objects nearest to o.
Definition 2 (ANN(Q, k)): Given a data set O, a metric function d(), and a set of query input objects $Q = \{q_1, q_2, \ldots, q_m\}$, ANN(Q, k) denotes the k objects nearest to Q. The choice of aggregate distance function affects the query; common aggregate distance functions include the minimum, the maximum, and the average.
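Definition 2 can be illustrated with a minimal Python sketch. The function names `aggregate_distance` and `ann`, the Euclidean default, and the use of `sum` as the default aggregate (which ranks objects the same way as the average) are assumptions made for illustration; the patent does not state which aggregate its implementation uses.

```python
import heapq
import math

def aggregate_distance(o, Q, d=math.dist, agg=sum):
    """Aggregate the distances from object o to every query object in Q.

    agg may be sum (same ranking as the average), max, or min.
    """
    return agg(d(o, q) for q in Q)

def ann(O, Q, k, d=math.dist, agg=sum):
    """Return the k objects of O with the smallest aggregate distance to Q."""
    return heapq.nsmallest(k, O, key=lambda o: aggregate_distance(o, Q, d, agg))

# Hypothetical toy data: five 2-D points and two query points.
O = [(1, 0), (1, 1), (5, 5), (3, 0), (9, 9)]
Q = [(0, 0), (2, 0)]
nearest_by_sum = ann(O, Q, k=2)
nearest_by_max = ann(O, Q, k=1, agg=max)
```

Swapping `agg` changes which objects count as nearest to Q, which is the sense in which the choice of aggregate function affects the query.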
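Definition 3 (dominance in metric space): Let (O, d()) be a metric space and let $Q = \{q_1, q_2, \ldots, q_m\}$ be a set of query input objects. For an object o ∈ O, the set of its distances to all objects in Q is:
$$\mathrm{Adist}(o, Q) = \{\, d(o, q_1),\ d(o, q_2),\ \ldots,\ d(o, q_m) \,\}$$
For an object p ∈ O, $o \prec p$ holds if and only if:
$$\forall q_j \in Q:\ d(o, q_j) \le d(p, q_j) \quad\text{and}\quad \exists q_j \in Q:\ d(o, q_j) < d(p, q_j)$$
That is, dominance is determined by comparing distances.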
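The dominance test of Definition 3 can be sketched in a few lines of Python. The helper names `adist` and `dominates`, the Manhattan distance, and the sample points are illustrative assumptions; the test itself assumes the usual reading of metric-space dominance (no farther from any query object, strictly closer to at least one).

```python
def manhattan(a, b):
    """Manhattan distance between two points of equal dimension."""
    return sum(abs(x - y) for x, y in zip(a, b))

def adist(o, Q, d):
    """Distance vector of object o to every query object in Q."""
    return [d(o, q) for q in Q]

def dominates(a, b):
    """True if distance vector a dominates distance vector b."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

Q = [(0, 0), (4, 0)]
o = adist((1, 0), Q, manhattan)  # [1, 3]
p = adist((2, 2), Q, manhattan)  # [4, 4]
r = adist((3, 0), Q, manhattan)  # [3, 1]
```

Here `o` dominates `p`, while `o` and `r` are incomparable because each is closer to a different query object.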
Definition 4 (top-k dominating query based on metric space): Given a set of query inputs Q and a distance function d(), and the dominance relation of the metric space, the domination score of a data object $o_i \in O$ is:
$$\mathrm{dscore}(o_i) = |\{\, p \in O \mid o_i \prec p \,\}|$$
where $o_i \prec p$ denotes that $o_i$ dominates p. The k objects with the highest domination scores are returned; they form the result set of the top-k dominating query based on metric space.
As shown in Fig. 7, in the top-k dominating query based on metric space of this embodiment, the query input is $Q = \{q_1, q_2\}$ and the distance function d() is the Euclidean distance. The top-1 dominating result is $o_1$, because the distances from $o_1$ to $q_1$ and $q_2$ are smaller than those of all points outside the circle (including points on the circle), and only $o_2$ is not dominated by $o_1$ (because the distance from $o_2$ to $q_1$ is smaller than the distance from $o_1$ to $q_1$). If there are n data objects in the space, the domination score of $o_1$ is dscore($o_1$) = n − 1, whereas $o_2$ fails to dominate at least $o_1$ and $o_3$, so dscore($o_2$) ≤ n − 2. Hence dscore($o_1$) > dscore($o_2$), and the top-1 dominating result is $o_1$.
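A brute-force version of Definition 4 makes the scoring concrete. The sketch below is not the patent's algorithm; it simply counts dominated objects for every object. The coordinates are hypothetical and are not taken from Fig. 7; they are only chosen so that, as in the example, `o2` is closer to `q1` than `o1` is and is therefore not dominated by `o1`.

```python
import math

def dominates(a, b):
    """True if distance vector a dominates distance vector b."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def dscores(objects, Q, d=math.dist):
    """Map each object name to the number of objects it dominates."""
    vectors = {name: [d(pos, q) for q in Q] for name, pos in objects.items()}
    return {name: sum(dominates(v, u)
                      for other, u in vectors.items() if other != name)
            for name, v in vectors.items()}

def top_k_dominating(objects, Q, k, d=math.dist):
    """Return the names of the k objects with the highest domination score."""
    scores = dscores(objects, Q, d)
    return sorted(scores, key=lambda name: -scores[name])[:k]

# Hypothetical data: two query points and five objects, Euclidean distance.
Q = [(0, 0), (4, 0)]
objects = {"o1": (2, 0), "o2": (1, 0), "o3": (2, 3), "o4": (5, 2), "o5": (-1, 4)}
scores = dscores(objects, Q)
best_two = top_k_dominating(objects, Q, k=2)
```

This quadratic scan over the whole data set is the cost that the pruning steps described later aim to avoid.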
Definition 5 (k-skyband): Over the entire data space, the k-skyband is the set of all objects o ∈ O that are dominated by at most k − 1 objects:
$$k\text{-skyband} = \{\, o \in O \;:\; |\{\, p \in O \mid p \prec o \,\}| \le k - 1 \,\}$$
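As a minimal sketch of Definition 5, the k-skyband can be computed by counting the dominators of each object. The function name `k_skyband` and the integer distance vectors are illustrative assumptions; the vectors stand for precomputed distances to two query objects.

```python
def dominates(a, b):
    """True if distance vector a dominates distance vector b."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def k_skyband(vectors, k):
    """Return the indices of the vectors dominated by at most k - 1 others."""
    band = []
    for i, v in enumerate(vectors):
        dominators = sum(1 for j, u in enumerate(vectors)
                         if j != i and dominates(u, v))
        if dominators <= k - 1:
            band.append(i)
    return band

# Hypothetical distance vectors (distance to q1, distance to q2).
vectors = [[2, 2], [1, 3], [4, 4], [5, 3], [4, 6]]
skyline = k_skyband(vectors, 1)   # the 1-skyband is the ordinary skyline
band_3 = k_skyband(vectors, 3)
```

With k = 1 the definition reduces to the skyline, and the band can only grow as k grows.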
Theorem 1: The top-k dominating result set satisfies $D \subseteq k\text{-skyband}$.
Proof (by contradiction): Suppose there is an object $o_1 \in D$ that is dominated by more than k − 1 objects. Then there are at least k objects whose domination score satisfies dscore ≥ dscore($o_1$) + 1, so $o_1 \notin D$, a contradiction. Therefore $D \subseteq k\text{-skyband}$. Q.E.D.
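The proof uses, without spelling it out, the fact that a dominator always has a strictly higher domination score than the object it dominates. A short derivation of that step, assuming dominance is defined as in Definition 3 (no farther from any query object and strictly closer to at least one), with a, o, p denoting arbitrary objects of O:

```latex
\begin{align*}
a \prec o \ \text{and}\ o \prec p
  &\;\Rightarrow\; \forall q \in Q:\ d(a,q) \le d(o,q) \le d(p,q)
     \ \text{and}\ \exists q \in Q:\ d(a,q) < d(o,q) \le d(p,q) \\
  &\;\Rightarrow\; a \prec p \\[4pt]
a \prec o
  &\;\Rightarrow\; \{\, p \in O \mid o \prec p \,\} \cup \{o\}
     \subseteq \{\, p \in O \mid a \prec p \,\} \\
  &\;\Rightarrow\; \mathrm{dscore}(a) \ge \mathrm{dscore}(o) + 1
\end{align*}
```

The last step holds because o does not dominate itself, so adding o to the set enlarges it by exactly one element. An object with at least k dominators is therefore outranked by at least k objects and cannot belong to D.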
Theorem 2: Given the query input set Q and the k objects $\{o_1, o_2, \ldots, o_k\} \subseteq O$ of ANN(Q, k), let KANN(Q, k) be the set formed by the objects that are not dominated by these objects ($\nprec$ denotes "does not dominate"), where KANN(Q, k) includes the objects of ANN(Q, k) themselves. Then the top-k dominating result set satisfies $D \subseteq \mathrm{KANN}(Q, k)$.
Proof: Let o be the 1-nearest-neighbor object ANN(Q, 1) of the query set Q. Because every object outside 1ANN(Q, 1) is dominated by o, the top-1 dominating object must be in 1ANN(Q, 1). If the top-1 dominating object is not o, then by the above the object with the second-highest domination score must also be in 1ANN(Q, 1); if the top-1 dominating object is o, then the object with the second-highest domination score must be in 2ANN(Q, 2). Continuing in the same way, the top-k dominating result set satisfies $D \subseteq \mathrm{KANN}(Q, k)$. Q.E.D.
All of the algorithms below are implemented on the Spark platform:
(1) Top-k dominating algorithm based on the skyline (DSDA)
In the existing DSDA, the data set is first randomly distributed across the nodes. The Mappartition interface of Spark is then used, with the skyline algorithm implemented inside it, so that the skyline of each partition is obtained. Finally, the per-partition skylines are compared pairwise to obtain the global skyline, and the object with the highest domination score in the skyline is returned as a top-k dominating result. Repeating this loop k times yields the final result set.
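The DSDA loop can be sketched in plain Python, with list slices standing in for Spark partitions. This is only an illustration under stated assumptions: the names `skyline` and `dsda_top_k`, the round-robin partitioning, and the choice to remove each selected object before the next round are not taken from the patent, and no Spark API is used.

```python
def dominates(a, b):
    """True if distance vector a dominates distance vector b."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def skyline(items):
    """items: list of (id, vector). Keep those not dominated by any other."""
    return [(i, v) for i, v in items
            if not any(dominates(u, v) for j, u in items if j != i)]

def dsda_top_k(items, k, num_partitions=2):
    """Pick one result per round from the global skyline, k rounds in total."""
    remaining = list(items)
    result = []
    for _round in range(min(k, len(items))):
        # Each simulated partition computes its local skyline independently.
        parts = [remaining[p::num_partitions] for p in range(num_partitions)]
        local = [s for part in parts for s in skyline(part)]
        # Merging the local skylines gives the skyline of the remaining data.
        global_sky = skyline(local)
        # Score skyline objects against the full data set and keep the best.
        best = max(global_sky,
                   key=lambda iv: sum(dominates(iv[1], u) for _id, u in items))
        result.append(best[0])
        remaining = [iv for iv in remaining if iv[0] != best[0]]
    return result

# Hypothetical distance vectors to two query objects.
items = [("A", [2, 2]), ("B", [1, 3]), ("C", [4, 4]), ("D", [5, 2]), ("E", [4, 6])]
top3 = dsda_top_k(items, k=3)
```

The skyline is recomputed in every round, which is why the running time of this scheme grows with k.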
(2) Top-k dominating algorithm based on the k-skyband (DKDA)
The existing DKDA is parallelized on the Spark cluster, and the idea of the parallel algorithm is similar to that of the skyline-based one. By Theorem 1, the top-k dominating result set satisfies $D \subseteq k\text{-skyband}$, so the k-skyband is computed first, and the k objects with the highest domination scores in the k-skyband are then returned as the top-k dominating result set.
The data set is first randomly distributed across the nodes. The Mappartition interface of Spark is then used, with the k-skyband algorithm implemented inside it, so that the k-skyband of each partition is obtained. Finally, the per-partition k-skybands are compared pairwise to obtain the global k-skyband, and the k objects with the highest domination scores in the k-skyband are returned as the top-k dominating result set. Compared with the skyline method, this method has the advantage of not needing k iterations, but computing the k-skyband of the original data set is very time-consuming.
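A plain-Python sketch of the DKDA idea follows, again with list slices standing in for Spark partitions. The names `k_skyband` and `dkda_top_k` and the round-robin partitioning are assumptions for illustration. The merge step recomputes the k-skyband over the union of the local k-skybands, which is one way to realize the pairwise comparison described above.

```python
def dominates(a, b):
    """True if distance vector a dominates distance vector b."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def k_skyband(items, k):
    """items: list of (id, vector). Keep those with at most k - 1 dominators."""
    return [(i, v) for i, v in items
            if sum(dominates(u, v) for j, u in items if j != i) <= k - 1]

def dkda_top_k(items, k, num_partitions=2):
    """Local k-skybands, one merge, then score the surviving candidates."""
    parts = [items[p::num_partitions] for p in range(num_partitions)]
    local = [c for part in parts for c in k_skyband(part, k)]
    candidates = k_skyband(local, k)
    # Domination scores are counted against the full data set.
    scored = [(sum(dominates(v, u) for _id, u in items), i)
              for i, v in candidates]
    scored.sort(key=lambda s: -s[0])
    return [i for _score, i in scored[:k]]

# Hypothetical distance vectors to two query objects.
items = [("A", [2, 2]), ("B", [1, 3]), ("C", [4, 4]), ("D", [5, 2]), ("E", [4, 6])]
top2 = dkda_top_k(items, k=2)
```

Unlike the skyline-based loop, a single pass over the candidates suffices here, at the price of computing a k-skyband over each full partition.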
(3) Parallel top-k dominating algorithm based on set ANN pruning and the k-skyband (DAKDA)
Because algorithm 1 needs k iterations, its query time grows as k grows, and because algorithm 2 spends a long time computing the k-skyband of the original data set, the present invention introduces pruning.
In the present invention, by Theorem 1 the top-k dominating result set satisfies $D \subseteq k\text{-skyband}$, and by Theorem 2 it satisfies $D \subseteq \mathrm{KANN}(Q, k)$. Because computing the k-skyband is more time-consuming than computing KANN(Q, k), the set ANN is first used to remove data that cannot be candidates, yielding the candidate set KANN(Q, k); the k-skyband of KANN(Q, k) is then computed; and finally the k results with the highest domination scores in the k-skyband are returned as the top-k dominating result. The steps are shown in Fig. 1:
Step 1: Pruning with ANN(Q, k)
As shown in stage one of Fig. 1, the data are first processed using the distance function d() and the query input Q to obtain the distances between each object and the query objects, Deal_Data_RDD, which are stored in the partitions. ANN(Q, k) is then computed in each partition, and the global ANN(Q, k) is finally obtained. The global ANN(Q, k) is used to filter the original data set to obtain the candidate set KANN(Q, k)_RDD. By Theorem 2, KANN(Q, k)_RDD is guaranteed to contain the final top-k dominating result set D.
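Stage one can be sketched in plain Python as a local top-k per partition, a pairwise reduce, and a filter. Several points here are assumptions rather than the patent's stated implementation: the names `local_ann`, `merge_ann`, and `ann_prune`; the use of the sum as the aggregate distance; and, in particular, the filter rule. The sketch uses a conservative reading of "not dominated by the objects in ANN(Q, k)": an object is discarded only if every one of the k ANN members dominates it, which is safe because such an object has at least k dominators.

```python
import heapq
from functools import reduce

def dominates(a, b):
    """True if distance vector a dominates distance vector b."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def local_ann(partition, k):
    """k items of one partition with the smallest summed distance to Q."""
    return heapq.nsmallest(k, partition, key=lambda iv: sum(iv[1]))

def merge_ann(a, b, k):
    """Combine two partial ANN lists and keep the k nearest overall."""
    return heapq.nsmallest(k, a + b, key=lambda iv: sum(iv[1]))

def ann_prune(partitions, k):
    """Return (global ANN, candidate set) for partitioned (id, vector) data."""
    local = [local_ann(p, k) for p in partitions]
    global_ann = reduce(lambda a, b: merge_ann(a, b, k), local)
    # Assumed rule: drop an object only if all k ANN members dominate it.
    kept = [iv for p in partitions for iv in p
            if not all(dominates(a[1], iv[1]) for a in global_ann)]
    return global_ann, kept

# Hypothetical precomputed distance vectors, already split into two partitions.
partitions = [
    [("A", [2, 2]), ("C", [4, 4]), ("E", [4, 6])],
    [("B", [1, 3]), ("D", [5, 2])],
]
global_ann, candidates = ann_prune(partitions, k=2)
```

In a Spark setting the `global_ann` list would be the value broadcast to the nodes before filtering; here it is simply a local variable.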
Step 2: Pruning with the k-skyband
As shown in stage two of Fig. 1, because the resulting KANN(Q, k)_RDD may be very large, directly computing the domination scores of all of its objects would still be time-consuming. The k-skyband of KANN(Q, k)_RDD is therefore computed as a further pruning step, yielding the final candidate set GlobalCandidate(k-skyband). By Theorem 1, GlobalCandidate(k-skyband) is guaranteed to contain the final top-k dominating result set D.
Step 3: Obtaining the top-k dominating result set
As shown in stage three of Fig. 1, the Cartesian product of the candidate set and the original data set is computed to form <key, value> pairs, where key is a candidate object and value is 1 if the candidate dominates the paired object of the original data set and 0 otherwise. Finally, the ReduceByKey API is used to obtain the domination scores of all objects in GlobalCandidate(k-skyband), and the k objects with the highest domination scores are returned.
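The scoring in stage three can be imitated in plain Python: `itertools.product` plays the role of the Cartesian product and a dictionary accumulation plays the role of a reduce-by-key. The names `score_candidates` and `top_k` and the sample data are illustrative assumptions; no Spark call is made.

```python
from collections import defaultdict
from itertools import product

def dominates(a, b):
    """True if distance vector a dominates distance vector b."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def score_candidates(candidates, dataset):
    """Emit (candidate id, 0 or 1) per pair, then sum the values per key."""
    pairs = ((cid, 1 if dominates(cv, ov) else 0)
             for (cid, cv), (_oid, ov) in product(candidates, dataset))
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def top_k(scores, k):
    """Ids of the k candidates with the highest domination score."""
    return sorted(scores, key=lambda key: -scores[key])[:k]

# Hypothetical data: full data set and the candidates left after pruning.
dataset = [("A", [2, 2]), ("B", [1, 3]), ("C", [4, 4]), ("D", [5, 2]), ("E", [4, 6])]
candidates = [("A", [2, 2]), ("B", [1, 3]), ("D", [5, 2])]
scores = score_candidates(candidates, dataset)
result = top_k(scores, k=2)
```

Only the candidates are keys, so the number of comparisons is the candidate count times the data set size rather than the square of the data set size.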
Embodiment 1:
This embodiment is carried out on a Spark distributed cluster of 7 nodes. Spark is built on Hadoop and uses Hadoop's YARN resource manager and the HDFS file storage system. Of the 7 nodes, the master node serves both as the Driver node and as a worker node; the remaining 6 nodes are worker nodes. All algorithms are written in Scala. The basic configuration is shown in Table 2:
Table 2: Experimental environment configuration
As shown in Figs. 2 to 5, the experiments evaluate the three algorithms DSDA, DKDA, and DAKDA in the following respects: the effect of the number of partitions num on query time (to select a reasonable number of partitions), the effect of the number of returned results k on the query, the effect of the size of the query input set Q on query time, a comparison of the candidate sets of the algorithms, and the scalability of the algorithms. The default parameter settings of the experiments are shown in Table 3, where the coverage rate c = (radius of the smallest circle covering the input Q) / (radius of the smallest circle covering the entire data set).
Table 3: Default experimental parameter configuration
A large real-world data set is analyzed first: the ZILLOW data set. The original data set has 2,245,109 records; because some records have missing attribute values, the data set contains 1,771,107 records after those are deleted. It has 5 attributes in total, and the Manhattan distance is used as the distance function of the metric space. The detailed procedure is shown in Fig. 1. As shown in Fig. 6, the data set is distributed evenly across the slave nodes, each node then independently executes the algorithm proposed above to obtain a candidate set, and the results are finally aggregated to obtain the top-k dominating result set.
With m = 5, experiment 1 evaluates how the performance of each algorithm changes with the number of returned results k. As shown in Fig. 2, the DSDA algorithm varies markedly with k while the DAKDA algorithm varies little, which shows that DSDA is more sensitive to k.
With k = 10, experiment 2 evaluates how the performance of each algorithm changes with the size m of the query set Q. Fig. 3 shows that the query time of the DKDA algorithm increases sharply as m increases.
With k = 10 and m = 5, experiment 3 evaluates how the performance of each algorithm changes with the coverage rate c of the query set Q. As shown in Fig. 4, the DSDA algorithm is the slowest when the coverage rate is large. The scalability of the method of the present invention is shown in Fig. 5.
Embodiment 1 above shows that, for a given data set, and based on the user's query input and the distance function given in the metric space, the present invention provides a parallel top-k dominating scheme suitable for large data sets. Exploiting the property that the k-skyband contains the top-k dominating result set, it first prunes with the aggregate k-nearest neighbors to obtain a candidate set, then computes the k-skyband of the candidate set, and finally solves for the top-k dominating result.
Compared with the traditional approach of solving top-k dominating queries with the skyline, and with the approach that uses the k-skyband alone, this method based on the k-skyband and the set ANN screens the data and reduces the number of comparisons between data objects, which speeds up queries. The present invention is implemented in parallel on the Spark platform. Existing research on top-k dominating queries based on metric space consists of single-machine algorithms, whereas the algorithm proposed by the present invention is parallel and much faster than a single machine, a conclusion that the results of Embodiment 1 also support. The present invention thus parallelizes the traditional skyline-based and k-skyband-based methods, queries faster, and is suitable for both larger input sets and massive data sets.

Claims (3)

1. A top-k dominating query method based on metric space in a distributed environment, characterized in that it comprises the following steps, executed in order:
(1) A query input data object set Q and a distance function d() in the metric space are given; d() is used to measure the distance between the data objects in O and the query input data object set Q;
(2) Based on step (1), a parallel algorithm based on the set ANN and the k-skyband is proposed, as follows:
(21) Pruning with ANN(Q, k):
Using the distance function d() and the query input Q, the distances between every data object and the query input objects are computed as Deal_Data_RDD and stored in the partitions; each partition then independently computes its own ANN(Q, k) in parallel, and the per-partition ANN(Q, k) results are finally reduced to the global ANN(Q, k) through the reduce interface; the global ANN(Q, k) is broadcast to every node and used to filter the original data set, yielding the candidate set KANN(Q, k)_RDD, which is guaranteed to contain the final top-k dominating result set D;
(22) Pruning with the k-skyband: the k-skyband of KANN(Q, k)_RDD is computed as a further pruning step, yielding the final candidate set GlobalCandidate(k-skyband);
(23) Obtaining the top-k dominating result set:
The domination scores of all objects in GlobalCandidate(k-skyband) are computed, and the k objects with the highest domination scores are returned as the top-k dominating result;
wherein KNN(q, k) refers to the k nearest neighbors of a data object q, i.e. the k objects nearest to q, and ANN(Q, k) refers to the k nearest neighbors of the query set Q, i.e. the k objects nearest to Q.
2. The top-k dominating query method based on metric space in a distributed environment according to claim 1, characterized in that: in step (21), because the ANN(Q, k) of an individual partition is not necessarily the global ANN(Q, k), the distances in the per-partition ANN(Q, k) results are compared one by one to obtain the global ANN(Q, k).
3. The top-k dominating query method based on metric space in a distributed environment according to claim 1, characterized in that step (23) is carried out as follows: the Cartesian product of the candidate set obtained in step (22) and the original data set is computed, and the ReduceByKey API provided by Spark is then used to obtain the domination score of each candidate.
CN201610393610.1A 2016-06-03 2016-06-03 A kind of top-k under distributed environment based on metric space dominates querying method Active CN106055674B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610393610.1A CN106055674B (en) 2016-06-03 2016-06-03 A kind of top-k under distributed environment based on metric space dominates querying method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610393610.1A CN106055674B (en) 2016-06-03 2016-06-03 A kind of top-k under distributed environment based on metric space dominates querying method

Publications (2)

Publication Number Publication Date
CN106055674A CN106055674A (en) 2016-10-26
CN106055674B true CN106055674B (en) 2019-05-31

Family

ID=57170263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610393610.1A Active CN106055674B (en) 2016-06-03 2016-06-03 A kind of top-k under distributed environment based on metric space dominates querying method

Country Status (1)

Country Link
CN (1) CN106055674B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273464B (en) * 2017-06-02 2020-05-12 浙江大学 Distributed measurement similarity query processing method based on publish/subscribe mode
CN110245022B (en) * 2019-06-21 2021-11-12 齐鲁工业大学 Parallel Skyline processing method and system under mass data
CN113065036B (en) * 2021-04-14 2021-11-16 深圳大学 Method and device for measuring performance of space supporting point and related components

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799681A (en) * 2012-07-24 2012-11-28 河海大学 Top-k query method oriented to any data segment
CN103970871A (en) * 2014-05-12 2014-08-06 华中科技大学 Method and system for inquiring file metadata in storage system based on provenance information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Efficient processing of top-k dominating queries;Daichi Amagata等;《world wide web》;20150429;23:1-38
Processing Top-k Dominating Queries in Metric Spaces;Tiakas E等;《ACM Transactions on Database Systems(TODS)》;20160131;第40卷(第4期);1-33

Also Published As

Publication number Publication date
CN106055674A (en) 2016-10-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant