CN110347676A - Uncertain temporal data management and querying method based on relationship R tree - Google Patents

Info

Publication number
CN110347676A
Authority
CN
China
Prior art keywords
tree
relationship
uncertain
node
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910504660.6A
Other languages
Chinese (zh)
Other versions
CN110347676B (en)
Inventor
许建秋
韦建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN201910504660.6A
Publication of CN110347676A
Application granted
Publication of CN110347676B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for managing and querying uncertain temporal data based on a relationship R tree. The method belongs to the database field and is implemented in the extensible moving objects database SECONDO. Given a large number of intervals whose start and end points are uncertain but whose lengths are fixed, the method builds these intervals into a relationship R tree. It manages the weight attribute together with the uncertain time attribute, so that during a query the index can take the effect of weight on the result into account, which improves query efficiency. The method also calculates intersection probabilities and finally returns the k data items with the largest weights that are most likely to intersect the query data.

Description

Uncertain temporal data management and querying method based on relationship R tree
Technical field
The invention belongs to the field of data processing technology, and in particular relates to a method for managing and querying uncertain temporal data based on a relationship R tree.
Background art
As applications develop, the objects that data must represent become increasingly complex: alongside certain data there is also uncertain data. In project planning, for example, an expected completion date is often loosely defined, as in "the project will be completed within the next three to six months". When temporal variables are described as uncertain temporal information, the resulting data is closer to human intuition and better matches the real world. Managing such uncertain temporal data well is essential to using it efficiently.
Uncertain temporal data consists of an uncertain start point and an uncertain end point, and is represented as follows:
<<x1,x2>,<y1,y2>>.
Therefore, most temporal indexes for such data use spatial indexing algorithms: <x1, x2> and <y1, y2> are typically mapped to the four vertices of a rectangle in space. The most common approach combines R tree techniques with uncertain temporal indexing. However, this spatial representation of valid time contains an invalid region, which can affect query results, and it manages only the time attribute, not the weight. We therefore propose a method for managing and querying uncertain temporal data based on a relationship R tree, which manages the weight together with the uncertain temporal data.
Summary of the invention
Objective of the invention: to eliminate the influence of the invalid region when managing uncertain temporal data, the present invention provides a method for managing and querying uncertain temporal data based on a relationship R tree.
Technical solution: in a method for managing and querying uncertain temporal data based on a relationship R tree, temporal data with uncertain start and end points is first managed on the basis of a fixed interval length; a relationship R tree is then constructed, and queries proceed in order of the weights of the temporal data. The method includes the following steps:
(1) Generate uncertain temporal data and build the R tree: given interval parameters, generate the original uncertain temporal data intervals, then build these intervals into an R tree from the rectangles constructed from their time attribute and weight;
(2) Relation management of the relationship R tree: traverse the R tree built in step (1) to obtain, for each node, the relation of its entry indexes sorted by weight in descending order, and store this relation using a supplementary structure;
(3) Top-k query over uncertain temporal data: taking the relationship R tree obtained in step (2) and the value to be queried as input, compare the query with each node starting from the root; when the child nodes of the current node are to be accessed, select the next node to access according to the relation between node nodeid and weight stored in the relationship R tree.
Further, step (1) includes: according to the range and weight of each uncertain interval, taking the range as the x-axis and the weight as the y-axis, extending the weight upward and downward so that the uncertain interval becomes a rectangle, and then constructing an R tree from the resulting rectangles.
Further, when the weight of an uncertain interval is extended upward and downward, the extension uses a minimal value chosen so that it does not affect the query result.
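The rectangle construction in step (1) can be sketched as follows. This is a minimal illustration, not the SECONDO implementation: the names `Rect` and `interval_to_rect` and the default value of `eps` are assumptions introduced here, since the text does not state the actual extension value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: x is the time range, y is the weight."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float


def interval_to_rect(range_lo, range_hi, weight, eps=1e-6):
    """Turn an uncertain interval into a thin rectangle.

    range_lo, range_hi: the range within which the interval may lie (x-axis).
    weight: the weight of the interval (y-axis).
    eps: hypothetical small extension applied above and below the weight.
    """
    if range_lo > range_hi:
        raise ValueError("range_lo must not exceed range_hi")
    if eps <= 0:
        raise ValueError("eps must be positive")
    return Rect(range_lo, range_hi, weight - eps, weight + eps)
```

Choosing `eps` smaller than the gap between any two distinct weights keeps rectangles of different weights from overlapping on the y-axis, which is one way to read the requirement that the extension must not affect the query result.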
Further, the supplementary structures in step (2) are a B-tree and an array;
In step (2), the relation between each node and the index numbers of its entries is obtained by traversing the constructed R tree and is then managed, so that nodes can be accessed in descending order of the weights of the entries in the current node; the relation stores weights and index numbers. When the relation is managed with a B-tree, the B-tree is built with the nodeid values in the relation table obtained from the traversal as keys; the id of the current R tree node is used to find its position in the B-tree and obtain the index numbers and weights of its child nodes. When the relation is managed with an array, an array with as many elements as the R tree has nodes is created, with the nodeid of a node as the array index and the tupleid in the relation as the array content; the mapping rule is that the nodeid of each node corresponds to its access order in the traversal of the R tree.
Further, in step (3), when the nodes of the R tree are accessed, nodes whose uncertain data has larger weights are accessed first.
Preferably, the supplementary structures in step (2) are a B-tree and an array. In step (2), the relation between each node and the indexes of its entries is obtained by traversing the constructed R tree and is then managed, so that nodes are accessed in descending order of the weights of the entries in the current node. The relation is expressed as follows:
Rel(tuple(int: tupleid, int: nodeid, list: entries)), list = <(w1, index1), ..., (wn, indexn)>;
where a node of the B-tree is denoted n(nodeid, L), L = <(w1, index1), ..., (wn, indexn)>;
When the relation is managed with an array, an array with as many elements as the R tree has nodes is created, with the nodeid of a node as the array index and the tupleid in the relation as the array content; the mapping rule is that the nodeid of each node corresponds to its access order in the traversal of the R tree, expressed as follows:
R-Array[nodeid] = tupleid.
In step (3), when the nodes of the R tree are accessed, nodes whose uncertain data has larger weights are accessed first.
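The relation Rel and the mapping R-Array[nodeid] = tupleid can be illustrated with a small sketch. It rests on several assumptions that are not in the text: a toy `Node` class stands in for an R tree node, the traversal is breadth-first, nodeid values run from 0 to n-1, and each entry already carries its weight.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Node:
    nodeid: int
    entries: list  # (weight, child) pairs; child is None in a leaf


def build_relation(root):
    """Traverse the tree and return (relation, r_array).

    relation: list of (tupleid, nodeid, [(weight, index), ...]) tuples,
              with each entry list sorted by weight in descending order.
    r_array:  list where r_array[nodeid] == tupleid.
    """
    relation = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        ranked = sorted(
            ((w, idx) for idx, (w, _) in enumerate(node.entries)),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # tupleid is the position of the node in the traversal order
        relation.append((len(relation), node.nodeid, ranked))
        queue.extend(child for _, child in node.entries if child is not None)
    r_array = [None] * len(relation)
    for tupleid, nodeid, _ in relation:
        r_array[nodeid] = tupleid
    return relation, r_array
```

With this layout, the weight-ordered entry list of a node is reached in constant time as `relation[r_array[nodeid]][2]`.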
The method of the present invention for managing and querying uncertain temporal data based on a relationship R tree works as follows. The uncertain temporal data is first turned into rectangles according to its time attribute and weight, and these rectangles are built into an R tree. The R tree is then traversed to obtain, for each node, the relation of its entry indexes sorted by weight in descending order, and this relation is stored using a supplementary structure (B-tree and array). When a query asks which uncertain temporal data intersects the given data, nodes can thus be accessed in descending order of weight.
Beneficial effects: compared with the prior art, the method of the present invention manages the weight attribute of the data in addition to its uncertainty, and eliminates the invalid region. In a top-k query, the child nodes of the currently accessed node can be accessed in descending order of data weight, and the probability that the query data intersects the current data can be calculated.
Brief description of the drawings
Fig. 1 shows the representation format of the uncertain temporal data of the present invention;
Fig. 2 is a two-dimensional representation of the uncertain temporal data in the embodiment;
Fig. 3 shows the three cases in which two intervals intersect in the embodiment;
Fig. 4 is the basic structure diagram of the relationship R tree of the present invention;
Fig. 5 shows the relation that the relationship R tree of the present invention needs to maintain;
Fig. 6 shows the relation maintained with a B-tree in the embodiment;
Fig. 7 shows the relation maintained with an array in the embodiment.
Detailed description of the embodiments
To describe the disclosed technical solution in detail, it is further explained below with reference to the accompanying drawings and specific embodiments.
The invention discloses a method for managing and querying uncertain temporal data based on a relationship R tree, used to manage uncertain temporal data in the extensible moving objects database SECONDO. The uncertain temporal data is first turned into rectangles according to its time attribute and weight, and these rectangles are built into an R tree. The R tree is then traversed to obtain, for each node, the relation of its entry indexes sorted by weight in descending order, and this relation is stored using a supplementary structure (B-tree and array). When a query asks which uncertain temporal data intersects the given data, nodes can thus be accessed in descending order of weight.
(1) Uncertain temporal data generation and tree construction;
The present invention considers a special case of the data. For testing and practical needs, uncertain temporal data must be generated in advance. We represent temporal data as intervals, which the system generates automatically from the following interval parameters:
1) The minimum and maximum interval values, which bound the interval values so that they remain controllable; in the present invention the minimum is 1 and the maximum is 100000;
2) The interval length and interval weight; the length is the fixed length of each interval, which in the present invention is a random value within the interval range;
3) The number of intervals, which in this experiment is 2000000.
These values serve only the needs of the experiment and can be adjusted at any time according to experimental conditions. An R tree is then built from the generated temporal data.
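A generator for such test data might look like the sketch below. Only the bounds 1 and 100000 and the count 2000000 come from the text; the function name, the tuple layout, the integer weights, and the `max_weight` parameter are assumptions made for illustration, because the text does not specify how weights are drawn.

```python
import random


def generate_uncertain_intervals(count, lo=1, hi=100_000, max_weight=100, seed=None):
    """Generate `count` uncertain intervals as (range_lo, range_hi, length, weight).

    The interval of fixed `length` may lie anywhere inside [range_lo, range_hi].
    max_weight is a hypothetical upper bound for the randomly drawn weight.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(count):
        range_lo = rng.randint(lo, hi - 1)
        range_hi = rng.randint(range_lo + 1, hi)
        # the length is a random value that fits inside the range
        length = rng.randint(1, range_hi - range_lo)
        weight = rng.randint(1, max_weight)
        data.append((range_lo, range_hi, length, weight))
    return data
```

Passing a fixed `seed` makes a run reproducible, which helps when comparing index variants on the same data.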
(2) Relation management of the relationship R tree;
For the generated R tree, we traverse from the root node and record the nodeid of each node together with the ids and weights of its entries. Once the tree has been traversed, we have, for each node, the relation of its entry indexes sorted by weight in descending order. We propose two ways to manage this relation. The first uses a B-tree, built with the nodeid values in the relation table obtained from the traversal as keys. The second uses an array: an array with as many elements as the R tree has nodes is created, with the nodeid of a node as the array index and the tupleid in the relation as the array content, mapped one to one.
(3) Top-k query over uncertain temporal data;
Given the uncertain temporal data and an arbitrary query range, the query finds the k intervals with the largest weights that are most likely to intersect that range. We search the relationship R tree that has been built. Starting from the root node, we test whether a node intersects the query interval; if it does, we go on to access its child nodes. At this point the child nodes are no longer accessed in the index order of the R tree nodes, but in descending order of weight as stored in the relation table.
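The weight-first traversal described above can be sketched on a toy tree. The sketch makes assumptions beyond the text: `Entry`, `Node`, `weight_order`, and `top_k` are hypothetical names, the weight of an inner entry is taken to be the largest weight in its subtree, the early stop relies on that bound, and intersection probability is left out so that only the access order is shown.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Entry:
    lo: float                      # lower bound of the time range
    hi: float                      # upper bound of the time range
    weight: float                  # leaf: item weight; inner: max weight below (assumption)
    child: "Node | None" = None    # None marks a leaf entry


@dataclass
class Node:
    nodeid: int
    entries: list


def weight_order(root):
    """Map each nodeid to its entry indexes sorted by weight, largest first."""
    order, stack = {}, [root]
    while stack:
        node = stack.pop()
        order[node.nodeid] = sorted(
            range(len(node.entries)),
            key=lambda i: node.entries[i].weight,
            reverse=True,
        )
        stack.extend(e.child for e in node.entries if e.child is not None)
    return order


def top_k(root, order, a, b, k):
    """Return up to k (weight, lo, hi) items whose range overlaps [a, b]."""
    if k <= 0:
        return []
    best = []  # min-heap holding the k heaviest hits found so far

    def visit(node):
        for idx in order[node.nodeid]:
            e = node.entries[idx]
            if len(best) == k and e.weight <= best[0][0]:
                break  # every remaining entry in this node is lighter still
            if e.hi < a or e.lo > b:
                continue  # cannot intersect the query range
            if e.child is None:
                heapq.heappush(best, (e.weight, e.lo, e.hi))
                if len(best) > k:
                    heapq.heappop(best)
            else:
                visit(e.child)

    visit(root)
    return sorted(best, reverse=True)
```

Because entries are visited heaviest first, the loop over a node can stop as soon as the current entry cannot beat the lightest of the k results already collected.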
Specifically, the method of the present invention manages the weight together with the time attribute, and the index can take the effect of weight on the query result into account. A query therefore needs to know the relative weights of the nodes, so alongside the R tree we manage the relation between the nodeid of each node and the ids and weights of its entries. The main steps are as follows:
(1) Uncertain temporal data generation and tree construction;
The present invention constructs a series of uncertain temporal data that meets the requirements and stores it in the extensible database system SECONDO. For experimental convenience, we fix some basic parameters of the data; the kinds of parameters and their meanings are described in detail above.
Fig. 1 shows the representation of a series of generated uncertain temporal data, and the construction of the rectangles is shown in Fig. 2. The dashed line represents the range within which the uncertain data can move, and the y-axis represents the weight. We extend the weight upward and downward by a very small amount so that the one-dimensional line segment becomes a two-dimensional rectangle, and these rectangles are then built into an R tree, as shown in Fig. 4.
(2) Relation management of the relationship R tree;
For the generated R tree, we traverse from the root node and record the nodeid of each node together with the ids and weights of its entries. Once the tree has been traversed, we have, for each node, the relation of its entry indexes sorted by weight in descending order, as shown in Fig. 5. We propose two ways to manage this relation. The first uses a B-tree, built with the nodeid values in the relation table obtained from the traversal as keys; the id of the current R tree node is used to find its position in the B-tree and obtain the index numbers and weights of its child nodes, as shown in Fig. 6. The second uses an array: an array with as many elements as the R tree has nodes is created, with the nodeid of a node as the array index and the tupleid in the relation as the array content; the mapping rule is that the nodeid of each node corresponds to its access order in the traversal of the R tree, as shown in Fig. 7.
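The keyed lookup that the B-tree provides can be imitated with a sorted key list and binary search. This is a stand-in for illustration only, not a B-tree and not the SECONDO B-tree: the class name `SortedKeyIndex` and the row layout (nodeid, [(weight, index), ...]) are assumptions introduced here.

```python
import bisect


class SortedKeyIndex:
    """Ordered index keyed by nodeid, looked up by binary search."""

    def __init__(self, relation):
        # relation: iterable of (nodeid, [(weight, index), ...]) rows
        rows = sorted(relation, key=lambda row: row[0])
        self._keys = [nodeid for nodeid, _ in rows]
        self._lists = [entries for _, entries in rows]

    def lookup(self, nodeid):
        """Return the weight-ordered (weight, index) list stored for nodeid."""
        pos = bisect.bisect_left(self._keys, nodeid)
        if pos == len(self._keys) or self._keys[pos] != nodeid:
            raise KeyError(nodeid)
        return self._lists[pos]
```

Unlike the array variant, a keyed structure of this kind does not require nodeid values to be dense, at the cost of a logarithmic lookup instead of a direct index.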
(3) Top-k query over uncertain temporal data;
Another important purpose of the relationship R tree is to quickly find the k intervals with the largest weights that are most likely to intersect the query range. The present invention provides a method for calculating the intersection probability of uncertain temporal data and a weight-first search method. The intersection of two intervals falls into four cases, as shown in Fig. 3, and the probability that the two intervals intersect can be written as:
where (a, b) is the query interval, (s, e) is the uncertain interval, L is the range of the uncertain data, and l is the length of the uncertain data. During a query we start from the root node and test whether a node intersects the query interval; if it does, we go on to access its child nodes. At this point the child nodes are no longer accessed in the index order of the R tree nodes, but in descending order of weight as stored in the relation table. For example, for the node whose nodeid is x in Fig. 4, a standard R tree accesses its child nodes in the order E1, E2. Because the relationship R tree manages the relation between node numbers and weights, the nodes can instead be accessed in descending order of weight using the information stored in the B-tree or array constructed in (2), that is, E2, E1.
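The probability function itself does not appear in this text, so the sketch below is not the patent's formula. It shows one way to compute an intersection probability under an explicit assumption: the interval of fixed length l is placed uniformly at random inside its range L. The function and parameter names are hypothetical.

```python
def intersection_probability(a, b, range_lo, range_hi, length):
    """Probability that an interval of fixed `length`, whose start is uniform
    on [range_lo, range_hi - length], intersects the query interval [a, b].

    The uniform placement is an assumption made for this sketch.
    """
    slack = range_hi - range_lo - length  # how far the start point can move
    if slack < 0:
        raise ValueError("length must fit inside the range")
    if slack == 0:
        # the interval is fixed, so the answer is certain either way
        return 1.0 if (range_lo <= b and range_hi >= a) else 0.0
    # [s, s + length] meets [a, b] exactly when a - length <= s <= b
    lo = max(range_lo, a - length)
    hi = min(range_hi - length, b)
    return max(0.0, hi - lo) / slack
```

For a range of [0, 10] and a length of 4, the start point moves over [0, 6]; a query of [8, 9] is met only by starts in [4, 6], which gives a probability of one third under this assumption.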

Claims (6)

1. A method for managing and querying uncertain temporal data based on a relationship R tree, characterized in that: temporal data with uncertain start and end points is first managed on the basis of a fixed interval length; a relationship R tree is then constructed, and queries proceed in order of the weights of the temporal data; the method includes the following steps:
(1) Generate uncertain temporal data and build the R tree: given interval parameters, generate the original uncertain temporal data intervals, then build these intervals into an R tree from the rectangles constructed from their time attribute and weight;
(2) Relation management of the relationship R tree: traverse the R tree built in step (1) to obtain, for each node, the relation of its entry indexes sorted by weight in descending order, and store this relation using a supplementary structure;
(3) Top-k query over uncertain temporal data: taking the relationship R tree obtained in step (2) and the value to be queried as input, compare the query with each node starting from the root; when the child nodes of the current node are to be accessed, select the next node to access according to the relation between node nodeid and weight stored in the relationship R tree.
2. The method for managing and querying uncertain temporal data based on a relationship R tree according to claim 1, characterized in that: step (1) includes, according to the range and weight of each uncertain interval, taking the range as the x-axis and the weight as the y-axis, extending the weight upward and downward so that the uncertain interval becomes a rectangle, and then constructing an R tree from the resulting rectangles.
3. The method for managing and querying uncertain temporal data based on a relationship R tree according to claim 2, characterized in that: in step (1), the upward and downward extension of the weight of an uncertain interval is calculated using a minimal value.
4. The method for managing and querying uncertain temporal data based on a relationship R tree according to claim 1, characterized in that: the supplementary structures in step (2) are a B-tree and an array.
5. The method for managing and querying uncertain temporal data based on a relationship R tree according to claim 1, characterized in that: in step (2), the relation between each node and the indexes of its entries is obtained by traversing the constructed R tree and is then managed, so that nodes are accessed in descending order of the weights of the entries in the current node; the relation stores weights and index numbers;
When the relation is managed with a B-tree, the B-tree is built with the nodeid values in the relation table obtained from the traversal as keys, and the id of the current R tree node is used to find its position in the B-tree and obtain the index numbers and weights of its child nodes;
When the relation is managed with an array, an array with as many elements as the R tree has nodes is created, with the nodeid of a node as the array index and the tupleid in the relation as the array content; the mapping rule is that the nodeid of each node corresponds to its access order in the traversal of the R tree.
6. The method for managing and querying uncertain temporal data based on a relationship R tree according to claim 1, characterized in that: in step (3), when the nodes of the R tree are accessed, nodes whose uncertain data has larger weights are accessed first.
CN201910504660.6A 2019-06-11 2019-06-11 Uncertainty tense data management and query method based on relation R tree Active CN110347676B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910504660.6A CN110347676B (en) 2019-06-11 2019-06-11 Uncertainty tense data management and query method based on relation R tree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910504660.6A CN110347676B (en) 2019-06-11 2019-06-11 Uncertainty tense data management and query method based on relation R tree

Publications (2)

Publication Number Publication Date
CN110347676A true CN110347676A (en) 2019-10-18
CN110347676B CN110347676B (en) 2021-07-27

Family

ID=68181813

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910504660.6A Active CN110347676B (en) 2019-06-11 2019-06-11 Uncertainty tense data management and query method based on relation R tree

Country Status (1)

Country Link
CN (1) CN110347676B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100036865A1 (en) * 2008-08-07 2010-02-11 Yahoo! Inc. Method For Generating Score-Optimal R-Trees
CN102810118A (en) * 2012-07-05 2012-12-05 上海电力学院 K nearest neighbor search method for variable weight network
CN103455531A (en) * 2013-02-01 2013-12-18 深圳信息职业技术学院 Parallel indexing method supporting real-time biased query of high dimensional data
US20180246896A1 (en) * 2017-02-24 2018-08-30 Microsoft Technology Licensing, Llc Corpus Specific Generative Query Completion Assistant
CN108829804A (en) * 2018-06-05 2018-11-16 洛阳师范学院 Based on the high dimensional data similarity join querying method and device apart from partition tree

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
鲍金玲 et al.: "Nearest neighbor query techniques in road network environments" (路网环境下的最近邻查询技术), Journal of Software (软件学报) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723093A (en) * 2020-06-17 2020-09-29 江苏海平面数据科技有限公司 Uncertain interval data query method based on data division
CN115098616A (en) * 2022-07-25 2022-09-23 北京国科恒通科技股份有限公司 Multi-temporal spatial data storage and query methods, devices and storage medium
CN115098616B (en) * 2022-07-25 2022-12-02 北京国科恒通科技股份有限公司 Multi-temporal spatial data storage and query methods, devices and storage medium

Also Published As

Publication number Publication date
CN110347676B (en) 2021-07-27

Similar Documents

Publication Publication Date Title
CN106372114B (en) A kind of on-line analysing processing system and method based on big data
CN106528773B (en) Map computing system and method based on Spark platform supporting spatial data management
CN108038136A (en) The method for building up and graph inquiring method of Company Knowledge collection of illustrative plates based on graph model
CN104361113B (en) A kind of OLAP query optimization method under internal memory flash memory mixing memory module
CN104252528B (en) Big data secondary index establishing method based on identifier space mapping
WO2005106717A1 (en) Partial query caching
CN110147377A (en) General polling algorithm based on secondary index under extensive spatial data environment
CN106383830B (en) Data retrieval method and equipment
CN110175175A (en) Secondary index and range query algorithm between a kind of distributed space based on SPARK
CN113407810B (en) City information and service integration system and method based on big data
CN110347676A (en) Uncertain temporal data management and querying method based on relationship R tree
CN109635037A (en) A kind of the fragment storage method and device of relationship type distributed data base
CN109542846A (en) A kind of Internet of Things vulnerability information management system based on data virtualization
CN111639075A (en) Non-relational database vector data management method based on flattened R tree
CN112925789A (en) Spark-based space vector data memory storage query method and system
JPH10240765A (en) Method for retrieving similar object and device therefor
CN103377236B (en) A kind of Connection inquiring method and system for distributed data base
KR101255639B1 (en) Column-oriented database system and join process method using join index thereof
CN110032676B (en) SPARQL query optimization method and system based on predicate association
Broutin et al. Partial match queries in random quadtrees
JP4440246B2 (en) Spatial index method
CN116383247A (en) Large-scale graph data efficient query method
US12026162B2 (en) Data query method and apparatus, computing device, and storage medium
CN107273464B (en) Distributed measurement similarity query processing method based on publish/subscribe mode
CN112800056A (en) Multi-layer index construction method based on multi-granularity space-time data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant