CN100472537C - Resource space model storage and access method - Google Patents


Publication number
CN100472537C
Authority
CN
China
Prior art keywords
resource
bit string
space model
concept
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2007101176398A
Other languages
Chinese (zh)
Other versions
CN101071444A (en
Inventor
诸葛海
何超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dongfangjianyu Institute of Concrete Science & Technology Limited Compan
Beijing Xinao Concrete Group Co.,Ltd.
Beijing Xinhang Building Material Group Co., Ltd.
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CNB2007101176398A
Publication of CN101071444A
Application granted
Publication of CN100472537C
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for storing a resource space model, comprising: inputting a resource space model; applying bit-string coding to the axes and concepts of the resource space model to obtain the corresponding multidimensional bit-string space, where the bit-string coding preserves the hierarchical relations between concepts; storing the mapping between each axis or concept and its bit-string code; and, according to the input resource space model, initializing the index tree corresponding to the multidimensional bit-string space to obtain an underlying index tree. The invention also discloses a method for accessing a resource space model. The invention fully preserves the classification semantics of resources and improves the efficiency of semantic resource retrieval.

Description

Resource space model storage and access method
Technical field
The present invention relates to the field of database technology, and in particular to a method for storing and accessing a resource space model.
Background art
Storing information resources is one of the basic problems of computer science and a core technology for helping people process information. Common storage modes can be divided into unstructured storage (such as plain text), semi-structured storage (such as HTML and XML documents), and structured storage (such as relational database tables). In unstructured storage the semantics of the information resources are usually implicit and fuzzy. In semi-structured storage the semantics are only partly explicit; XML, for example, uses tags to indicate the semantics of parts of a document. In structured storage the semantics of resources are explicit; a relational database table, for example, regulates the storage and retrieval of resources according to predefined attributes.
The Resource Space Model is a newer method for structured resource representation whose main feature is that it organizes resources by classification semantics. A resource space is made up of several axes. Each axis represents one semantic feature used to classify resources, and the classes are represented by concepts on the axis. The concepts on an axis may be hierarchically related: a child concept represents a further subdivision of the resources that belong to its parent concept. For example, consider the following resource space:
Resource space (sex (male, female), major (computer (software, hardware), history (ancient history, modern history, contemporary history))).
This space has two axes, "sex" and "major". The "sex" axis has the classification concepts "male" and "female". The first-level classification concepts on the "major" axis are "computer" and "history", and the resources that belong to "computer" can be further subdivided by its sub-concepts "software" and "hardware". A point in the resource space represents one semantic category and holds all resources that belong to that category. The resource space model is described in detail in Reference 1: Hai Zhuge, "Resource space model, its design method and applications", Journal of Systems and Software, Volume 72, Issue 1, June 2004, Pages 71-81.
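As an illustration only, the example resource space can be written down as a nested structure. The sketch below uses English names for the example concepts and assumes that a classification point picks exactly one concept (at any level) from each axis; the names `resource_space`, `concepts`, and `points` are hypothetical and do not come from the patent.

```python
from itertools import product

# Nested dicts: axis -> concept -> sub-concepts (an empty dict marks a leaf).
resource_space = {
    "sex": {"male": {}, "female": {}},
    "major": {
        "computer": {"software": {}, "hardware": {}},
        "history": {
            "ancient history": {},
            "modern history": {},
            "contemporary history": {},
        },
    },
}

def concepts(tree):
    """Yield every concept in a concept tree, parents before children."""
    for name, children in tree.items():
        yield name
        yield from concepts(children)

def points(space):
    """Yield classification points: one concept chosen from each axis (assumption)."""
    axes = list(space)
    for combo in product(*(list(concepts(space[axis])) for axis in axes)):
        yield dict(zip(axes, combo))

all_points = list(points(resource_space))
print(len(all_points))   # 2 concepts on "sex" times 7 concepts on "major"
print(all_points[0])
```

Each dictionary produced by `points` names one semantic category under which resources could be filed.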
The resource space model has a fairly complete normal-form theory to guide its construction, but it still lacks an effective storage mechanism that preserves its classification semantics on a storage medium. Compared with other space models in the prior art, one advantage of the resource space model is that it maintains the classification semantics of resources, so any storage of the model on a storage medium should retain that classification semantic information. The mechanisms used in the prior art to store other space models cannot do so. Take the multidimensional space model, which is fairly close to the resource space model, as an example: storing a multidimensional space model requires the coordinates on each dimension to be linearly ordered, whereas in the resource space model the coordinates (concepts) on each dimension (axis) have only hierarchical relations and no linear order. Existing multidimensional space storage methods therefore cannot be applied directly to the resource space model. If the commonly used relational database table is used as the storage mechanism instead, the hierarchical semantic relations between concepts are lost, and multi-attribute queries are inefficient.
Given these characteristics of the resource space model, a method for storing the resource space model, together with a corresponding access method, is needed.
Summary of the invention
The object of the present invention is to overcome the defects that existing storage methods cannot store the hierarchical semantic relations between concepts in a resource space model and that multi-attribute queries are inefficient, and thereby to provide a method for storing a resource space model and a method for accessing the stored resource space model.
To achieve this object, the invention provides a resource space model storage method comprising the following steps:
Step 1), inputting a resource space model; the resource space model contains axes, each axis carries concepts that represent resource classes, and the concepts on each axis are hierarchically related;
Step 2), applying bit-string coding to the axes and concepts of the resource space model to obtain the corresponding multidimensional bit-string space, where the bit-string coding preserves the hierarchical relations between the concepts;
Step 3), saving the mapping between each axis or concept and its bit-string code;
Step 4), according to the input resource space model, initializing the index tree corresponding to the multidimensional bit-string space to obtain an underlying index tree.
In the above technical scheme, between step 2) and step 3), lossless compression coding is applied to the bit-string codes of the concepts.
The lossless compression coding replaces a sequence of consecutive 1 bits in a bit-string code with the number of 1s it contains.
In the above technical scheme, in step 2), the bit-string coding uses binary-tree coding.
In the binary-tree coding, the forest of concept trees on each axis of the resource space model is first converted into a binary tree, and the resulting forest of binary trees is then converted into a single binary tree; finally, each edge of the single binary tree that points to a left subtree is assigned a 0 bit and each edge that points to a right subtree is assigned a 1 bit. For any concept on the single binary tree, the values of the edges on the path from the root node to the concept's node, concatenated, form the bit-string code of that concept.
In the above technical scheme, in step 4), the initialization comprises setting the dimension of the multidimensional bit-string space, the size of the contiguous disk space that holds a single leaf node of the index tree, and the size of the contiguous disk space that holds a single intermediate node of the index tree, and setting the values of the page load factor, the node split factor, and the node reinsertion factor.
The invention also provides a resource space model access method comprising the following steps:
Step a), the user issues a resource access request, and the request is converted, by means of the mapping between axes or concepts and bit-string codes, into a form that the multidimensional bit-string space understands;
Step b), according to the resource access request, searching from the root node of the index tree corresponding to the multidimensional bit-string space for the leaf node relevant to the request, and determining the position of that leaf node;
Step c), carrying out the resource access request on that leaf node;
Step d), returning the result of the resource access request, using the mapping between axes or concepts and bit-string codes to convert the result into the resource representation of the resource space model, and displaying it to the user.
In the above technical scheme, the resource access request includes resource insertion requests, resource deletion requests, resource modification requests, resource range query requests, and resource exact query requests.
In the above technical scheme, before the user issues a resource access request, it is also necessary to define classification regions in the multidimensional bit-string space, set the proximity measure in the multidimensional bit-string space, define the optimization objectives of the index tree, and determine the method for selecting the optimal subregion and the method for splitting a classification region.
Classification regions in the multidimensional bit-string space are defined as follows: a classification region corresponds to a region of the multidimensional bit-string space whose projection on each dimension is a subset of the set of all concepts on that dimension; classification regions are produced dynamically while the index tree is built, and the classification points in the same classification region are close in classification semantics.
The proximity measure in the multidimensional bit-string space refers to the degree of similarity of classification semantics in that space. It covers the proximity between classification points, between classification regions, and between a classification point and a classification region. The similarity of classification semantics is obtained by computing the shortest path length between same-axis concepts on the concept hierarchy tree.
In the above technical scheme, searching for the leaf node relevant to the resource access request means: starting from the root node and, at each step, entering the branch node that corresponds to the optimal subregion of the current node according to the optimal-subregion selection method, until a leaf node of the tree is reached.
The optimal subregion is the subregion that, after the resource is inserted, shows the smallest overall growth in spatial volume, spatial overlap volume, and spatial perimeter.
The advantages of the invention are:
1. The invention uses the resource space model to store resources and can fully preserve the classification semantics of the resources.
2. The storage method saves resources on disk according to their classification semantics, which improves the efficiency of semantic resource retrieval.
3. The storage method applies bit-string coding to the axes and concepts of the resource space model, so semantic relations such as ancestor-descendant, parent-child, and sibling relations can be determined from the bit-string codes of the concepts alone.
4. The storage method provides optional compression coding of concept bit-string codes, which prevents long codes from occupying too much storage.
5. The storage method defines a semantic distance between concepts. This distance reflects how people understand the semantic relation between concepts, and it can easily be computed from the concepts' bit-string codes.
6. By applying bit-string coding to concepts, the invention converts the resource space model into an underlying multidimensional bit-string space, on which resource operations such as insertion, deletion, modification, exact query, and range query can be carried out efficiently.
7. The multidimensional bit-string space uses classification regions to organize resources with similar classification semantics, and each classification region corresponds to one contiguous block of storage on disk, so resource operations are more efficient.
8. Based on the semantic distance between concepts, the invention defines the proximity between classification points, between classification regions, and between a classification point and a classification region in the multidimensional bit-string space. Optimization criteria designed on this definition of proximity guide the selection of the optimal subregion during resource access and the splitting of classification regions, so that resources are distributed more compactly within classification regions and blank space is reduced, which shortens resource operations.
Description of drawings
Fig. 1 is a schematic diagram of a resource space model in one embodiment;
Fig. 2 is a schematic diagram of the resource space model of Fig. 1 represented as concept hierarchy trees;
Fig. 3 is a schematic diagram of binary-tree coding applied to the resource space model of Fig. 1;
Fig. 4 is a schematic diagram of how each concept of the resource space model of Fig. 1 and its bit-string code are stored in a disk file;
Fig. 5 is a schematic diagram of the projection range of a classification region on an axis;
Fig. 6 shows the resource space model storage method of the present invention;
Fig. 7 shows the resource space model access method of the present invention.
Detailed description
The present invention is described in further detail below with reference to the drawings and specific embodiments:
To meet the requirement that the hierarchical semantic information on the axes be retained when a resource space model is stored, the invention encodes the resource space model with binary-tree coding and obtains a bit-string code for each concept in the model. This forms an underlying multidimensional bit-string space, on which the operations on resources are then carried out. Because the resource space model and the multidimensional bit-string space differ, the operating instructions of the resource space model must also be converted into a form that the multidimensional bit-string space understands, and the results obtained in the multidimensional bit-string space must be converted back into the resource representation of the resource space model.
The implementation of the method is described below:
Fig. 1 is a schematic diagram of a resource space model RS(X(C1(C11, C12, C13), C2(C21, C22)), Y(C3(C31, C32), C4(C41, C42))). As the figure shows, this is a two-dimensional resource space model. The first level of the X axis has two concepts, C1 and C2; C1 is further subdivided into C11, C12, and C13, and C2 into C21 and C22. The first level of the Y axis also has two concepts, C3 and C4; C3 is further subdivided into C31 and C32, and C4 into C41 and C42. In this resource space, C2 is the parent concept of C21 and C22, and C21 and C22 are siblings. The hierarchical relations between the concepts of the resource space model can be represented by the concept hierarchy trees of Fig. 2.
To store a resource space model, the model must first be encoded. Fig. 3 shows how the resource space model of Fig. 1 is encoded. The coding used in Fig. 3 is binary-tree coding: the forest of concept trees on each axis is first converted into a binary tree, and the resulting forest of binary trees is then converted into a single binary tree; finally, each edge of the binary tree that points to a left subtree is assigned a 0 bit and each edge that points to a right subtree is assigned a 1 bit. For any concept on the binary tree, the values of the edges on the path from the root node to the concept's node, concatenated, form the bit-string code of that concept. For example, the bit-string code of concept C31 is 100. Besides the binary-tree coding used in this embodiment, other embodiments may assign a 1 bit to edges pointing to a left subtree and a 0 bit to edges pointing to a right subtree. Binary-tree coding of the resource space model yields the corresponding multidimensional bit-string space. Because this space logically lies below the resource space model, it is usually called the underlying multidimensional bit-string space.
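The coding step can be sketched as follows. The sketch assumes the usual first-child/next-sibling conversion of a forest into a binary tree (first child on the left edge, next sibling on the right edge) and assumes that the axes themselves are chained as siblings, with the first axis as the root of the single binary tree. The description does not spell out these details, but under these assumptions the sketch reproduces the three codes it does state (C31 = 100, C12 = 001, C32 = 1001). The function name `encode` and the nested-dict input format are hypothetical, and concept names are assumed to be unique.

```python
def encode(space):
    """Return {name: bit string} for every axis and concept.

    `space` maps each axis name to a nested dict of concepts, in order.
    The edge to a node's first child is a left edge (bit 0) and the edge
    to its next sibling is a right edge (bit 1).
    """
    codes = {}

    def visit(siblings, first_code):
        code = first_code
        for name, children in siblings.items():
            codes[name] = code
            if children:
                visit(children, code + "0")   # step down to the first child
            code += "1"                       # step across to the next sibling

    visit(space, "")                          # the first axis is the root
    return codes

rs = {
    "X": {"C1": {"C11": {}, "C12": {}, "C13": {}}, "C2": {"C21": {}, "C22": {}}},
    "Y": {"C3": {"C31": {}, "C32": {}}, "C4": {"C41": {}, "C42": {}}},
}
codes = encode(rs)
print(codes["C31"], codes["C12"], codes["C32"])   # 100 001 1001
```

Swapping the two bit values in `visit` gives the alternative embodiment in which left edges carry 1 and right edges carry 0.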
In the coding procedure above, converting a forest into a binary tree and converting a forest of binary trees into a single binary tree are both mature prior art, so their implementation is not explained in this embodiment.
After the resource space model has been encoded, the number of concepts at the same level determines whether the bit-string codes of the concepts need compression coding. As the coding procedure shows, in the bit-string code of a concept a 0 bit represents one concept subdivision and a 1 bit represents one sibling concept at the corresponding level. The depth of the concept hierarchy in a resource space model is usually small, but the number of sibling concepts at one level may be very large, so the number of 1 bits is usually the main cause of long bit-string codes. To save storage space and speed up retrieval, the invention can decide, according to the number of same-level concepts in the resource space model, whether to compress the bit-string codes of the concepts.
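Because a 0 bit marks a step down to a first child and a 1 bit marks a step across to the next sibling, hierarchical relations can be read directly from the codes. The following is a minimal sketch under the same first-child/next-sibling assumption; the helper names `parent`, `is_sibling`, and `is_ancestor` are hypothetical, and the rules are derived here rather than quoted from the patent.

```python
def parent(code):
    """Code of the parent concept, or None if the code denotes an axis."""
    stripped = code.rstrip("1")       # walk back to the first sibling
    if not stripped.endswith("0"):
        return None                   # top level reached: an axis has no parent
    return stripped[:-1]              # undo the step down from the parent

def is_sibling(a, b):
    """True if a and b are distinct concepts that share a parent."""
    return (a != b and parent(a) is not None
            and a.rstrip("1") == b.rstrip("1"))

def is_ancestor(a, b):
    """True if a is a proper ancestor of b in the concept hierarchy."""
    return b.startswith(a + "0")

print(parent("001"))             # code 001 is C12; its parent has code 0
print(is_sibling("00", "0011"))  # differ only by trailing 1 bits
print(is_ancestor("0", "01"))    # 01 is the next sibling of 0, not a descendant
```

In the two-axis example of Fig. 1, the code 100 of C31 gives the parent code 10 and then the axis code 1, which under the stated assumption is the Y axis.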
Compression of a concept's bit-string code uses lossless compression coding, which can be implemented in several ways. In one embodiment, each sequence of more than 32 consecutive 1 bits is replaced by the number of 1s it contains. For example, the bit-string sequence "0-11111111-11111111-11111111-11111111-111-00-11011" is compressed to 0-"35"-00-11011. Other embodiments may use a different threshold, for example replacing each sequence of more than 8 consecutive 1 bits by the number of 1s it contains. In this compression coding, a 32-bit integer value can represent at most 2^31-1 consecutive 1 bits. The compressed bit-string code therefore needs at most 33 integer values, and far fewer in practice; in tests, the average length of compressed bit-string codes did not exceed 160 bits. Compressing the bit-string codes of concepts helps overcome the problem that overly long codes are hard to store.
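The run-length replacement can be sketched as below. The token list used here (strings for literal bits, integers for run lengths) is only an illustrative in-memory form; the patent does not specify how the compressed code is laid out on disk, and the names `compress` and `decompress` are hypothetical.

```python
from itertools import groupby

def compress(bits, threshold=32):
    """Replace each run of more than `threshold` 1 bits with its length.

    Returns a list of tokens: str for literal bits, int for a run length.
    """
    tokens, literal = [], ""
    for bit, group in groupby(bits):
        run = len(list(group))
        if bit == "1" and run > threshold:
            if literal:
                tokens.append(literal)
                literal = ""
            tokens.append(run)
        else:
            literal += bit * run
    if literal:
        tokens.append(literal)
    return tokens

def decompress(tokens):
    """Expand run lengths back into 1 bits; the inverse of compress."""
    return "".join("1" * t if isinstance(t, int) else t for t in tokens)

code = "0" + "1" * 35 + "00" + "11011"
packed = compress(code)
print(packed)                      # ['0', 35, '0011011']
assert decompress(packed) == code  # the coding is lossless
```

Passing `threshold=8` gives the variant in which runs of more than 8 consecutive 1 bits are replaced.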
After the axes and concepts of the resource space model have been bit-string coded, the mapping between the axes and concepts and their bit-string codes must also be saved. The mapping is the correspondence between each concept (or axis) and its bit-string code. The mechanism that saves it ensures that the bit-string code of a given concept, or the concept for a given bit-string code, can be obtained efficiently. Within one resource space model, the mappings between all concepts (axes) and their bit-string codes are kept in a single file. When the size of this file does not exceed a predetermined value (such as 32 MB), the file is stored directly as a disk file and is read into memory in one pass before the user performs resource operations; the two-way mapping and efficient two-way lookup are then implemented in memory with existing mature techniques (such as hash tables). When the file size exceeds the predetermined value, an index of the mapping is built at the head of the file, and lookups between concepts and bit-string codes use this external index. The index can be built with known techniques.
Fig. 4 shows how each concept of the resource space model of Fig. 1 and its bit-string code are stored in a disk file. This storage file is called the pattern string file, and the relation between concepts and bit-string codes can be read directly from it.
After the resource space model has been converted into a multidimensional bit-string space by the coding procedure above, the index tree corresponding to that space must be initialized. Initialization comprises setting the dimension of the space, the size of the contiguous disk space (called a page) that holds a single leaf node of the index tree, and the size of the contiguous disk space that holds a single intermediate node (called an index node), and setting values such as the page load factor, the node split factor, and the node reinsertion factor. The parameters set during initialization affect subsequent operations such as resource insertion and node splitting, as well as the speed at which a single index node is read from external storage. Different parameter settings also produce different index trees.
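The initialization parameters could be grouped in a small configuration object such as the one sketched below. The class name, the field names, every default value, and the interpretation of each factor given in the comments are assumptions made for illustration; the patent names the parameters but does not give values or exact definitions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexTreeConfig:
    """Initialization parameters for the underlying index tree.

    Every default below is a placeholder chosen for illustration only.
    """
    dimensions: int                    # number of axes in the bit-string space
    page_size: int = 4096              # bytes of contiguous disk space per leaf node (page)
    index_node_size: int = 4096        # bytes of contiguous disk space per index node
    page_load_factor: float = 0.7      # assumed meaning: target fill ratio of a page
    node_split_factor: float = 0.4     # assumed meaning: minimum share of entries per half
    node_reinsert_factor: float = 0.3  # assumed meaning: share of entries reinserted

    def __post_init__(self):
        if self.dimensions < 1:
            raise ValueError("dimensions must be at least 1")
        for name in ("page_load_factor", "node_split_factor", "node_reinsert_factor"):
            value = getattr(self, name)
            if not 0.0 < value <= 1.0:
                raise ValueError(f"{name} must be in (0, 1]")

config = IndexTreeConfig(dimensions=2)   # the two-axis space of Fig. 1
print(config)
```

Making the object immutable reflects the remark that different settings produce different index trees: the parameters are fixed once the tree has been created.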
The index tree of the underlying multidimensional bit-string space used in this step is a new spatial index tree designed specifically for the multidimensional bit-string space and differs from existing spatial index trees. Its main characteristic is that the index is built on the multidimensional hierarchical classification semantics of the resource space model: the bit-string coordinates on a dimension are related by hierarchical semantics rather than by a linear order, whereas in the prior art the coordinates on a dimension are linearly ordered. The index tree of the underlying multidimensional bit-string space is also called the underlying index tree.
Once a resource space model has been stored in a computer system by the method above, the invention also provides a method for accessing the resource space model, so that the various operations of a database based on the resource space model can be carried out.
When a user wants to access a database based on the resource space model, the user first issues a resource access request, and the request is converted, by means of the mapping between concepts and bit-string codes, into a form that the underlying multidimensional bit-string space understands. By the nature of the resource space model, there is no linear order among its axes, nor among the sibling concepts on each axis, whereas in the underlying multidimensional bit-string space the axes and the same-level concepts are each fixed in some arbitrary order. A resource access request expressed in terms of the resource space model must therefore be converted into a form that the underlying multidimensional bit-string space understands before the corresponding resource operation can be completed there. For example, for the resource space model of Fig. 1, suppose the user wants to insert the resource r(Y=C32, X=C12): LOCATION, where (Y=C32, X=C12) is the classification semantic description of resource r and LOCATION is the position of resource r, which may be a file path name in a stand-alone environment or a uniform resource identifier in a network environment. By looking up the pattern string file of Fig. 4, this resource access request is converted into the canonical form suited to the underlying multidimensional bit-string space (whose first dimension is X and second dimension is Y), r(001, 1001): LOCATION, where 001 is the bit-string code of C12 and 1001 is the bit-string code of C32.
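The conversion in this example can be sketched with a pair of in-memory hash tables. Only the two codes stated in the text are used; the names `AXIS_ORDER`, `CODE_OF`, `CONCEPT_OF`, `to_bitstring_point`, and `to_classification` are hypothetical.

```python
AXIS_ORDER = ("X", "Y")                    # fixed dimension order of the bit-string space
CODE_OF = {"C12": "001", "C32": "1001"}    # excerpt of the concept-to-code mapping
CONCEPT_OF = {code: name for name, code in CODE_OF.items()}   # reverse direction

def to_bitstring_point(classification):
    """Translate {axis: concept} into a tuple of codes in AXIS_ORDER."""
    return tuple(CODE_OF[classification[axis]] for axis in AXIS_ORDER)

def to_classification(point):
    """Translate a tuple of codes back into {axis: concept}."""
    return {axis: CONCEPT_OF[code] for axis, code in zip(AXIS_ORDER, point)}

request = {"Y": "C32", "X": "C12"}         # axes given in any order by the user
point = to_bitstring_point(request)
print(point)                               # ('001', '1001')
print(to_classification(point))            # {'X': 'C12', 'Y': 'C32'}
```

The reverse table is what allows results found in the bit-string space to be shown to the user in terms of the resource space model.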
According to the type of resource access request issued by the user, the database based on the resource space model calls the insertion, deletion, modification, or query interface of the underlying index tree to carry out the operation.
Before these resource operations are carried out in the underlying multidimensional bit-string space, the concepts involved in them are explained.
A. Defining classification regions in the underlying multidimensional bit-string space: once a classification region is defined, it is possible to determine whether a concept lies in it. A classification region corresponds to a part of the multidimensional bit-string space and contains resources with similar classification semantics. The projection range of a classification region on each dimension corresponds to a set of concepts on that axis of the resource space model. In Fig. 5, suppose the dark nodes are all the concepts on some axis that correspond to the resources in some classification region; the projection range of the region on that axis is then the set of concepts corresponding to all shaded nodes and dark nodes. In Fig. 5, Cs is the dark node that comes first in a preorder traversal of the concept hierarchy tree and Ce is the dark node that comes last, so the projection range of the classification region on this axis, in terms of the preorder traversal of the concept hierarchy tree, is [Cs, Ce]. The invention defines the projection range in this way to avoid representing the concepts of a region's resources on an axis as a set, because a set would make the representation of classification regions in the upper-level nodes of the index tree too complex. Under this definition, the concepts within one range are also close in classification semantics, which keeps the projection range compact and reduces the probability that classification regions intersect. From the bit-string codes of the concepts Cs, Ce, and C, it can be determined whether C belongs to the range [Cs, Ce], and hence whether a concept belongs to a given classification region.
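As a simplified illustration of the preorder range [Cs, Ce], and not the patent's own test, note that if codes are built with 0 for the first-child edge and 1 for the next-sibling edge (an assumption), then preorder position in the concept hierarchy tree coincides with plain lexicographic order of the codes, with a prefix sorting before its extensions. The X-axis codes below are derived under that assumption, and the function name `in_preorder_range` is hypothetical. The patent's exact membership conditions may treat boundary cases differently.

```python
def in_preorder_range(t, s, e):
    """True if code t lies between codes s and e in preorder, inclusive."""
    return s <= t <= e   # str comparison: '0' < '1', and a prefix sorts first

# Codes of the X-axis concepts of Fig. 1, derived under the stated assumption.
x_axis = {"C1": "0", "C11": "00", "C12": "001", "C13": "0011",
          "C2": "01", "C21": "010", "C22": "0101"}

preorder = sorted(x_axis, key=x_axis.get)
print(preorder)   # ['C1', 'C11', 'C12', 'C13', 'C2', 'C21', 'C22']

inside = [name for name in x_axis
          if in_preorder_range(x_axis[name], x_axis["C11"], x_axis["C13"])]
print(inside)     # ['C11', 'C12', 'C13']
```

Storing only the two end codes per dimension keeps the description of a region small, which matches the motivation given for using a range instead of a set.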
Thereby the above-mentioned Bit String coding of notion relatively of passing through is judged a certain notion the specific implementation method in a certain specification area is as follows: make that s is notion C sBit String coding, e is notion C eBit String coding, t is the Bit String coding of notion C.If r is the longest common prefix of s and e, s ' is that s removes the remaining Bit String of a r, e ' is that e removes the remaining Bit String of a r, t ' is that t removes the remaining Bit String of a r, x is the longest common prefix of t ' and s ', and y is the longest common prefix of t ' and e ', and p is that t ' removes the remaining Bit String of an x, q is that s ' removes the remaining Bit String of an x, and m is that t ' removes the remaining Bit String of a y.When following condition that and if only if was set up, C belonged to scope [C s, C e]:
(1) r is the prefix of t and is not equal to t;
(2) if t ' starts with 0, then x equals s ', and perhaps p starts with 0 with 1 beginning and q; If t ' is with 1 beginning, then when t ' be not the prefix of e ' or t ' when equaling e ', m is empty string or starts with 0.
" vicinity " module in B, definition bottom multidimensional Bit String space: this standard has reflected the semantic degree of approximation of classification, and it is divided into the vicinity between the classification point, the vicinity between specification area, the vicinity between classification point and specification area.Upward the semantic similarity (or being called semantic distance) between notion is relevant with axle for the semantic degree of approximation of classification, and the semantic similarity between the coaxial notion (or semantic distance) equals their shortest path length on the concept hierarchy tree.Semantic distance on the axle between notion has following feature:
(a) for two concepts C and C', if C is an ancestor of C', then the more remote a descendant C' is, the greater their semantic distance, and vice versa;
(b) if C and C' are not in an ancestor-descendant relationship, let C'' be their nearest common ancestor; then the more remote a descendant of C'' the concept C' is, the greater the semantic distance between C and C', and likewise the more remote a descendant of C'' the concept C is, the greater the semantic distance between C and C';
(c) if C and C' are siblings, then the semantic distance from any other concept C'' to C should equal the semantic distance from C'' to C';
(d) semantic distance is symmetric, i.e. the semantic distance from C to C' equals the semantic distance from C' to C;
(e) semantic distance is non-negative, i.e. the semantic distance between any C and C' is never less than 0;
(f) semantic distance is strictly positive: if the semantic distance between C and C' equals 0, then C and C' are the same concept; conversely, only identical concepts have a semantic distance of 0.
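The shortest path length between two concepts in a concept hierarchy tree can be computed by walking both concepts up to their nearest common ancestor. The sketch below assumes the tree is given as a child-to-parent dictionary; the names `path_to_root` and `semantic_distance` and the example concepts are hypothetical and serve only as illustration.

```python
def path_to_root(parent: dict, node: str) -> list:
    """Nodes visited when walking from node up to the root, node first."""
    path = [node]
    while parent.get(node) is not None:
        node = parent[node]
        path.append(node)
    return path


def semantic_distance(parent: dict, a: str, b: str) -> int:
    """Shortest path length between concepts a and b in the concept tree."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(parent, a))}
    for steps_from_b, n in enumerate(path_to_root(parent, b)):
        if n in steps_from_a:          # first shared node = nearest common ancestor
            return steps_from_a[n] + steps_from_b
    raise ValueError("concepts are not in the same tree")


# Hypothetical concept hierarchy on one axis (child -> parent).
parent = {
    "science": None,
    "physics": "science",
    "biology": "science",
    "optics": "physics",
    "genetics": "biology",
}
```

With this tree, the distance between "physics" and "optics" is 1, between "physics" and "biology" is 2, and between "optics" and "genetics" is 4; the result is symmetric and is 0 only for identical concepts, in line with properties (d) to (f).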
The semantic similarity between concepts on the same axis can be used to compute the proximity between classification semantics. For example, the proximity between two classification points can be expressed as the sum of their semantic distances on each dimension. To compute the proximity between a classification point and a classification region, first compute the semantic distance between the classification point and the projection of the classification region on each dimension, then sum these semantic distances. Computing the proximity between two classification regions can be reduced to computing the proximity between a classification point and a classification region. The proximity computation may also be implemented differently from this embodiment; for example, when computing the proximity between a classification point and a classification region, one may first compute the semantic distance between the point and the region's projection on each dimension and then take the square root of the sum of the squares of these distances.
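The aggregation over dimensions might be sketched as follows. This is an illustration under stated assumptions: each concept is represented by its path from the root of its axis (a tuple), so that the tree distance follows from the shared prefix; the distance from a concept to a projection is taken as the minimum distance to any concept in that projection, which the text does not specify; and all function names are hypothetical.

```python
import math


def tree_distance(a: tuple, b: tuple) -> int:
    """Shortest path length between two concepts given as root-to-node paths."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return len(a) + len(b) - 2 * common


def point_proximity(p, q) -> int:
    """Sum of per-dimension semantic distances between two classification points."""
    return sum(tree_distance(a, b) for a, b in zip(p, q))


def point_region_proximity(p, region, euclidean: bool = False) -> float:
    """Proximity between a point and a region (one set of concepts per dimension)."""
    # Assumption: distance to a projection = distance to its nearest concept.
    per_dim = [min(tree_distance(a, c) for c in proj) for a, proj in zip(p, region)]
    if euclidean:
        # Alternative from the text: root of the sum of squares.
        return math.sqrt(sum(d * d for d in per_dim))
    return sum(per_dim)
```

The `euclidean` flag switches between the summation used in this embodiment and the root-of-sum-of-squares alternative mentioned above.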
C. Define the optimization objective of the underlying index tree in the multidimensional bit string space. The underlying index tree is optimized so that fewer nodes need to be visited when searching for resources, which improves search efficiency. Three kinds of indicator measure the optimization objective: the volume of a classification region, the overlap volume between classification regions, and the perimeter of a classification region. Computing these three indicators is mature prior art and can be done using the semantic distance defined above.
Based on this optimization objective, the present invention provides a method for selecting the optimal subregion when accessing resources in the multidimensional bit string space: the optimal subregion is the subregion whose overall growth in spatial volume, spatial overlap volume and spatial perimeter is smallest after it is extended to contain the inserted resource. The present invention also provides the split criterion for classification regions in the multidimensional bit string space: the two sub-classification regions produced by a split should be far apart, their total volume should be small, their overlap volume should be small, and their total perimeter should be short. According to this split criterion, a classification region is split as follows:
Step 1: from all sub-classification regions, choose the two that are farthest apart as seeds;
Step 2: assign the remaining sub-classification regions one by one to one of the seeds according to the split criterion;
Step 3: the sub-classification regions belonging to the first seed form the first classification region produced by the split, and those belonging to the second seed form the second.
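The three steps can be sketched generically. In the sketch below, `split_region` is a hypothetical name, and the caller supplies a `distance` function between sub-regions and a `growth` function that stands in for the split criterion (the combined effect on volume, overlap volume and perimeter); neither signature comes from the patent.

```python
from itertools import combinations


def split_region(subregions, distance, growth):
    """Split a list of sub-regions into two groups around the two farthest seeds."""
    # Step 1: pick the pair of sub-regions that are farthest apart as seeds.
    i, j = max(
        combinations(range(len(subregions)), 2),
        key=lambda ij: distance(subregions[ij[0]], subregions[ij[1]]),
    )
    first, second = [subregions[i]], [subregions[j]]
    # Step 2: assign every remaining sub-region to the seed group it enlarges least.
    for k, sub in enumerate(subregions):
        if k in (i, j):
            continue
        if growth(first, sub) <= growth(second, sub):
            first.append(sub)
        else:
            second.append(sub)
    # Step 3: each group forms one of the two regions produced by the split.
    return first, second


# Toy one-dimensional stand-ins: sub-regions are (low, high) intervals.
def center_distance(a, b):
    return abs((a[0] + a[1]) / 2 - (b[0] + b[1]) / 2)


def span_growth(group, sub):
    low = min(g[0] for g in group)
    high = max(g[1] for g in group)
    return (max(high, sub[1]) - min(low, sub[0])) - (high - low)
```

Called on the intervals (0, 1), (2, 3), (10, 11) and (12, 14) with these toy functions, the seeds are (0, 1) and (12, 14), and the remaining intervals join the nearer seed. A real implementation would replace the interval stand-ins with classification regions and the proximity measure defined earlier.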
This embodiment gives an optimal splitting method based on the split criterion, but other splitting methods may be used in practice. For example, still based on the split criterion described in this embodiment, first select any two sub-classification regions as seeds, then assign the remaining sub-classification regions one by one to a seed according to the split criterion, and finally split the whole region into two regions according to the seed to which each sub-classification region belongs. This splitting process does not require the two seed sub-regions to be the farthest apart.
With the classification regions, the proximity measure and the index tree optimization objective of the multidimensional bit string space defined, and the optimal subregion selection method and the classification region splitting method determined, operations such as inserting, deleting and querying resources are described in detail below.
When a resource is inserted into the multidimensional bit string space, note that each leaf node of the underlying index tree contains several classification points and each classification point contains some resources. The leaf node into which the resource can be inserted is therefore located in the underlying index tree first, and the resource is then inserted into the relevant classification point of that leaf node. The process is as follows. Given a resource, start from the root node of the underlying index tree and, at each step, enter the branch node corresponding to the optimal subregion of the current node, chosen by the optimal subregion selection method described above, until a leaf node of the underlying index tree is reached. In the leaf node found, look up the classification point to which the given resource belongs. If it is found, insert the resource into that classification point; otherwise create a new classification point corresponding to the given resource and add the resource to it. Creating a classification point may cause the leaf node to overflow. If there is no overflow, insertion ends. If there is overflow, the leaf node is split into two leaf nodes and the newly generated leaf node is added to the parent node; if the parent node also overflows, it is handled in the same way, up to the root node. If the root node is also split, a new root node is created whose children are the two nodes obtained by splitting the former root node.
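The descent to a leaf and the insertion into a classification point might look like the sketch below. It is a simplification under several assumptions: a classification region is modelled as one set of concept codes per dimension, the `growth` function (the number of dimensions whose projection must be extended) is only a stand-in for the combined growth in volume, overlap volume and perimeter, and node splitting on overflow is left out; the function merely reports whether the leaf has overflowed. The names `Node`, `growth`, `insert` and `capacity` are hypothetical.

```python
class Node:
    """Index tree node; a node without children is a leaf."""

    def __init__(self, region, children=None):
        self.region = region            # one set of concept codes per dimension
        self.children = children or []
        self.points = {}                # leaf only: classification point -> resources


def growth(region, point):
    """Stand-in cost: how many projections must grow to cover the point."""
    return sum(1 for proj, code in zip(region, point) if code not in proj)


def insert(root, point, resource, capacity=4):
    """Insert a resource at a classification point; return (leaf, overflowed)."""
    node = root
    while True:
        # Extend the current region so that it covers the new point.
        for proj, code in zip(node.region, point):
            proj.add(code)
        if not node.children:
            break
        # Enter the child whose region needs the least growth (optimal subregion).
        node = min(node.children, key=lambda child: growth(child.region, point))
    # Reuse the classification point if it exists, otherwise create it.
    node.points.setdefault(point, []).append(resource)
    return node, len(node.points) > capacity
```

When the second return value is true, the caller would split the leaf according to the split criterion and propagate the split towards the root as described above.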
When a resource is deleted from the multidimensional bit string space, the leaf node holding the resource is likewise located in the underlying index tree first; the relevant classification point is then looked up in that leaf node and the resource is deleted from it. The process is as follows. Given a resource, first find the leaf node containing it: starting from the root node of the underlying index tree, select the child node whose classification region contains the given resource, until a leaf node is reached. If at some point none of the classification regions of the current node's children contains the classification point corresponding to the given resource, the resource does not exist and deletion ends. If no classification point in the leaf node found equals the classification point of the given resource, the resource does not exist and deletion ends. If the leaf node found contains the classification point corresponding to the given resource but that classification point does not store the given resource, deletion ends. If the classification point in the leaf node holds the resource, the resource is deleted. If the classification point that contained the given resource holds no resources after the deletion, the classification point is deleted as well. Deleting a classification point may cause the storage page of the leaf node to underflow. If there is no underflow, deletion ends; otherwise all classification points in the leaf node are reinserted into the tree and the leaf node is deleted from its parent node. If this causes the parent node to underflow, all child nodes of the parent node are reinserted into the second-to-last layer of the tree, i.e. the layer where these child nodes originally resided. This process is carried out recursively up to the root node.
Modifying a resource in the multidimensional bit string space is similar to deleting one, so its implementation is not described again here.
Queries on resources in the multidimensional bit string space comprise range queries and exact queries. The resource deletion process already includes an exact query for the resource, so the implementation of the exact query is not described again here.
A range query takes a specified range and checks whether the resources being queried lie within it. The process is as follows. Given the specified range, start from the root node of the tree and select the child nodes whose classification regions intersect the specified range, until a leaf node is reached; such a leaf node is a node that may contain the resources being queried. If during the search none of the classification regions of the current node's children intersects the given range, the resource does not exist and the query returns an empty result. If the leaf node found contains no classification point within the given range, the query returns an empty result; otherwise the query returns all resources in all classification points that lie within the given range.
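The range query traversal can be sketched recursively. As an assumption for illustration, both a classification region and the query range are modelled as one set of concept codes per dimension, so that two regions intersect when their projections share at least one code on every dimension; `Node` and `range_query` are hypothetical names.

```python
class Node:
    """Index tree node; a node without children is a leaf."""

    def __init__(self, region, children=None):
        self.region = region            # one set of concept codes per dimension
        self.children = children or []
        self.points = {}                # leaf only: classification point -> resources


def range_query(node, query):
    """Return all resources whose classification point lies in the query range."""
    # Prune subtrees whose region does not intersect the range on some dimension.
    if not all(proj & q for proj, q in zip(node.region, query)):
        return []
    if not node.children:
        # Leaf: keep only the classification points inside the range.
        return [
            res
            for point, resources in node.points.items()
            if all(code in q for code, q in zip(point, query))
            for res in resources
        ]
    results = []
    for child in node.children:
        results.extend(range_query(child, query))
    return results
```

An empty list corresponds to the empty query result described above, whether the search was pruned at an intermediate node or no classification point in a leaf fell within the range.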
After a resource operation completes, for operations whose result must be returned to the user, such as an exact query or a range query, the result is translated into the resource representation of the resource space model using the mapping between axes and concepts and their bit string codes, and is displayed to the user.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements of the technical solution of the present invention that do not depart from its spirit and scope shall all fall within the scope of the claims of the present invention.

Claims (13)

1. A resource space model storage method, comprising the following steps:
Step 1) inputting a resource space model, the resource space model comprising axes, each axis carrying concepts that represent resource classifications, and the concepts on each axis having a hierarchical relationship;
Step 2) encoding the axes and concepts of the resource space model as bit strings to obtain a corresponding multidimensional bit string space, the bit string encoding preserving the hierarchical relationship between the concepts;
Step 3) saving the mapping between each axis or concept and its bit string code;
Step 4) initializing, according to the input resource space model, the index tree corresponding to the multidimensional bit string space, to obtain an underlying index tree.
2. The resource space model storage method according to claim 1, characterized in that, between step 2) and step 3), the bit string codes of the concepts are losslessly compressed.
3. The resource space model storage method according to claim 2, characterized in that the lossless compression replaces each run of consecutive 1 bits in a bit string code with the number of 1s it contains.
4. The resource space model storage method according to claim 1 or 2, characterized in that, in step 2), the bit string encoding uses binary tree encoding.
5. The resource space model storage method according to claim 4, characterized in that, in the binary tree encoding, the forest of concept trees on each axis of the resource space model is first converted into a binary tree, and the resulting forest of binary trees is then converted into a single binary tree; finally, on the single binary tree, each edge pointing to a left subtree is assigned bit 0 and each edge pointing to a right subtree is assigned bit 1; for any concept on the single binary tree, the values of the edges traversed from the root node to the node of the concept, concatenated, form the bit string code of the concept.
6. The resource space model storage method according to claim 1 or 2, characterized in that, in step 4), the initialization comprises setting the number of dimensions of the multidimensional bit string space, the size of the contiguous disk space that stores a single leaf node of the index tree, the size of the contiguous disk space that stores a single intermediate node of the index tree, and the values of the page load factor, the node split factor and the node reinsertion factor.
7. A method for accessing a resource space model stored according to any one of claims 1-6, comprising the following steps:
Step a) a user submits a resource access request, and the resource access request is converted, through the mapping between axes or concepts and bit string codes, into a form understood by the multidimensional bit string space;
Step b) according to the resource access request, searching from the root node of the index tree corresponding to the multidimensional bit string space for the leaf node relevant to the resource access request, and determining the position of that leaf node;
Step c) carrying out the resource access request on the leaf node;
Step d) returning the result of the resource access request, converting the result into the resource representation of the resource space model using the mapping between axes or concepts and bit string codes, and displaying it to the user.
8. The resource space model access method according to claim 7, characterized in that the resource access request comprises a resource insertion request, a resource deletion request, a resource modification request, a resource range query request and a resource exact query request.
9. The resource space model access method according to claim 7, characterized in that, before the user submits a resource access request, the classification regions in the multidimensional bit string space are defined, the proximity measure in the multidimensional bit string space is set, the optimization objective of the index tree is defined, and the optimal subregion selection method and the classification region splitting method are determined.
10. The resource space model access method according to claim 9, characterized in that defining the classification regions in the multidimensional bit string space comprises: a classification region corresponds to a region of the multidimensional bit string space whose projection on each dimension is a subset of the set of all concepts of that dimension; classification regions are produced dynamically as the index tree is generated, and classification points within the same classification region are semantically close in classification.
11. The resource space model access method according to claim 9, characterized in that the proximity measure set in the multidimensional bit string space is the semantic closeness of classifications in the multidimensional bit string space, which comprises proximity between classification points, proximity between classification regions, and proximity between a classification point and a classification region; the semantic closeness of classifications is obtained by computing the shortest path length between concepts of the same axis in the concept hierarchy tree.
12. The resource space model access method according to claim 7, characterized in that searching for the leaf node relevant to the resource access request means: starting from the root node, at each step entering the branch node corresponding to the optimal subregion of the current node according to the optimal subregion selection method, until a leaf node of the tree is reached.
13. The resource space model access method according to claim 12, characterized in that the optimal subregion is the subregion whose overall growth in spatial volume, spatial overlap volume and spatial perimeter is smallest after it is extended to contain the inserted resource.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2007101176398A CN100472537C (en) 2007-06-20 2007-06-20 Resource space model storage and access method

Publications (2)

Publication Number Publication Date
CN101071444A CN101071444A (en) 2007-11-14
CN100472537C true CN100472537C (en) 2009-03-25

Family

ID=38898667

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2007101176398A Expired - Fee Related CN100472537C (en) 2007-06-20 2007-06-20 Resource space model storage and access method

Country Status (1)

Country Link
CN (1) CN100472537C (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: BEIJING XIN'AO CONCRETE GROUP CO., LTD. BEIJING DO

Owner name: BEIJING XINHANG BUILDING MATERIALS GROUP CO., LTD.

Free format text: FORMER OWNER: INST. OF COMPUTING TECHNOLOGY, CHINESE ACADEMY OF SCIENCES

Effective date: 20110110

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100080 NO. 6, KEXUEYUAN SOUTH ROAD, ZHONGGUANCUN, HAIDIAN DISTRICT, BEIJING TO: 101118 GUANTOU VILLAGE, SONGZHUANG TOWN, TONGZHOU DISTRICT, BEIJING

TR01 Transfer of patent right

Effective date of registration: 20110110

Address after: 101118 Guantou Village, Songzhuang Town, Tongzhou District, Beijing

Co-patentee after: Beijing Xinao Concrete Group Co.,Ltd.

Patentee after: Beijing Xinhang Building Material Group Co., Ltd.

Co-patentee after: Beijing Dongfangjianyu Institute of Concrete Science & Technology Limited Compan

Address before: 100080 No. 6, Kexueyuan South Road, Zhongguancun, Haidian District, Beijing

Patentee before: Institute of Computing Technology, Chinese Academy of Sciences

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090325

Termination date: 20180620