CN103488639B - A kind of querying method of XML data - Google Patents
- Publication number
- CN103488639B (application CN201210192018.7A)
- Authority
- CN
- China
- Prior art keywords
- xml
- node
- layer
- xpath
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
- G06F16/83—Querying
- G06F16/835—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
- G06F16/81—Indexing, e.g. XML tags; Data structures therefor; Storage structures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Document Processing Apparatus (AREA)
Abstract
The present invention provides a query method for XML data, comprising the steps of: 1) storing the XML data in Native XML mode, the storage structure comprising an internal node layer, which stores the nodes of the XML tree, with XML elements encoded using the DDE encoding scheme; a leaf node layer, which stores the text data of the leaf nodes of the XML tree; and an inverted layer, which stores an inverted index over the internal node layer; 2) according to an input XPath query statement, fetching from the inverted layer the element sequences corresponding to the nodes of the XPath, and merge-sorting them with a loser tree; 3) pushing and popping the merge-sorted XML elements in order, and obtaining the query result from a buffer. The invention can process XPaths containing the keyword "OR" and the wildcard "*", and does so with high efficiency.
Description
Technical field
The invention belongs to the field of database technology and relates to methods for storing and querying semi-structured XML data, in particular to an XML data query method that effectively supports the XML query language XPath.
Background art
As more and more application systems use XML as the standard format for publishing and exchanging data, the volume of XML data is expanding rapidly. A recent report issued by IDC (Internet Data Center) shows that 29% of the IT departments of the 500 enterprises surveyed already use XML documents and XML databases extensively. How to manage XML data effectively has become a problem in urgent need of a solution.
Quickly and accurately finding all elements in an XML database that match an XPath is the core operation of XML query processing. For example, for the XPath expression book[title='XML']//author[fn='Jane' AND ln='Doe'], an author node matched by the expression must satisfy: 1) it has a child node fn whose content is 'Jane'; 2) it has a child node ln whose content is 'Doe'; 3) it is a descendant of a book node, and that book node has a title node whose content is 'XML'.
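The three matching conditions can be tried out with a short Python sketch. It uses the standard-library ElementTree module, whose XPath subset has no AND keyword, so the conjunction is written as two chained predicates. The sample document and the wrapping library and chapter elements are hypothetical and serve only as an illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the tag names book, title, author,
# fn and ln come from the XPath in the text.
DOC = """
<library>
  <book>
    <title>XML</title>
    <chapter>
      <author><fn>Jane</fn><ln>Doe</ln></author>
      <author><fn>John</fn><ln>Doe</ln></author>
    </chapter>
  </book>
  <book>
    <title>SQL</title>
    <author><fn>Jane</fn><ln>Doe</ln></author>
  </book>
</library>
"""

root = ET.fromstring(DOC)

# [fn='Jane'][ln='Doe'] keeps only authors that satisfy both predicates,
# which is the AND of conditions 1) and 2); the book[title='XML']// prefix
# enforces condition 3).
matches = root.findall(".//book[title='XML']//author[fn='Jane'][ln='Doe']")
names = [(m.findtext("fn"), m.findtext("ln")) for m in matches]
print(names)
```

Only the Jane Doe author inside the book titled XML is selected; the author in the SQL book fails condition 3).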
Typical XML pattern-matching methods include the TurboXPath algorithm, developed for DB2 to process XML data streams, and the TwigStack algorithm, proposed by academia in 2002.
In the TwigStack algorithm, each node q on the XPath is associated with Tq and Sq. Tq is an element sequence: q is a tag name on the XPath, and Tq holds all elements of the XML document whose name matches q, arranged in document order. Sq is an element stack that stores elements whose name matches q; when the element being processed lies past the closing tag of an element in the stack, that stack element is popped. The algorithm operates only on the elements in Tq and skips irrelevant XML elements, so its I/O efficiency is high. However, TwigStack cannot handle two cases. The first is an XPath containing the wildcard "*", such as //a/*[b]/c: because TwigStack uses interval encoding, even when element a is known to be two levels above elements b and c, it cannot determine whether b and c have the same parent. The second is that TwigStack can only process XPaths whose twig branches are in an 'AND' relationship, such as //a[b AND c]/d; it cannot process the keyword 'OR', as in //a[b OR c]/d.
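The wildcard limitation can be illustrated with a small sketch. It assumes a common (start, end, level) layout for interval codes; the two tiny documents and the helper names are hypothetical. The checks available on interval codes give identical answers for both documents, although b and c are siblings in only one of them.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    start: int   # position of the opening tag
    end: int     # position of the closing tag
    level: int   # depth in the XML tree

def is_ancestor(anc: Interval, desc: Interval) -> bool:
    # Containment of intervals is the ancestor-descendant test.
    return anc.start < desc.start and desc.end < anc.end

def facts(a: Interval, b: Interval, c: Interval) -> tuple:
    """Containment and level-difference facts about a, b and c."""
    return (
        is_ancestor(a, b), b.level - a.level,
        is_ancestor(a, c), c.level - a.level,
        is_ancestor(b, c), is_ancestor(c, b),
    )

# Document 1: <a><x><b/><c/></x></a>         (b and c share the parent x)
doc1 = dict(a=Interval(1, 8, 1), b=Interval(3, 4, 3), c=Interval(5, 6, 3))
# Document 2: <a><x><b/></x><y><c/></y></a>  (b and c have different parents)
doc2 = dict(a=Interval(1, 10, 1), b=Interval(3, 4, 3), c=Interval(7, 8, 3))

same_answers = facts(**doc1) == facts(**doc2)
print(same_answers)
```

In both documents a is an ancestor of b and of c at a level difference of 2, so these checks alone cannot decide whether //a/*[b]/c matches.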
TurboXPath is the query-matching algorithm that DB2 uses for XML streams. It uses neither an index nor an encoding; the XML elements in the stream arrive in document order, and it can readily process XPaths containing the keyword 'OR'. TurboXPath is functionally more complete, but for XML data stored in a database it scans the XML document from beginning to end, so its I/O cost is very high, especially for large XML documents.
Summary of the invention
The object of the invention is to address the problems of the prior art by providing a new query method for XML data that can process XPaths containing the keyword "OR" and the wildcard "*", and does so with high efficiency.
To achieve this object, the invention adopts the following technical solution:
A query method for XML data, comprising the steps of:
1) storing the XML data in Native XML mode, the storage structure comprising: an internal node layer, which stores the nodes of the XML tree arranged in document order, the XML elements being encoded with the DDE encoding scheme; a leaf node layer, which stores the text data of the leaf nodes of the XML tree; and an inverted layer, which stores an inverted index over the internal node layer, each index entry being the sequence of elements with the same tag name, arranged in document order;
2) according to an input XPath query statement, fetching from the inverted layer the element sequences corresponding to the nodes of the XPath, and merge-sorting them with a loser tree;
3) pushing and popping the merge-sorted XML elements in order, and obtaining the query result from a buffer.
Further, in the internal node layer, each record includes: the integer identifier to which the node name is mapped, the DDE code, and the node type.
Further, in the inverted layer, the information for each element includes: the element type, the address of the element in the internal node layer, and the DDE code.
Further, the internal node layer points to the leaf node layer by pointers.
Further, merge-sorting with the loser tree means comparing the DDE codes of two elements to determine their relative document order, the earlier element being the winner and the later element the loser.
Further, each node q in the XPath has two data structures: an element sequence Tq and a stack Sq; Tq holds all elements in the XML document whose name matches q, arranged in document order; Sq stores elements whose name matches q and undergoes the push and pop operations.
Further, on a push operation, only the ancestors of the new element are retained in the stacks, so that all elements in the stacks are in ancestor-descendant relationships.
Further, if element e is to be pushed onto stack SE and the parent of node E on the XPath is A, the conditions for pushing e are:
A) SA contains an element that has not been unlinked, where unlinking means deleting a record that is not an ancestor of e from the linked list connecting all elements in the stacks;
B) e is a child of the element in SA that has not been unlinked and is nearest the stack top;
C) the type of e is the same as the type of E on the XPath.
Further, when the wildcard "*" appears in the XPath, three new axes are derived: the grandparent-child axis, the absolute ancestor-descendant axis, and the special ancestor-descendant axis; the XPath containing the wildcard "*" is equivalently rewritten using these three axes.
The XML data query method of the invention solves the problem that the TwigStack method cannot support XPaths containing the keyword "OR" and the wildcard "*". For query processing of XML data in a database, it has the same I/O efficiency as the TwigStack method and is more efficient than the TurboXPath method. At present, more and more application systems use XML as the standard format for publishing and exchanging data, and the volume of XML data is expanding rapidly. Fields such as finance, healthcare, e-government, and news already use their own XML standards to exchange data between departments and between enterprises. The method of the invention can be widely applied in these fields to query and manage XML data efficiently.
Brief description of the drawings
Fig. 1 is a flowchart of the steps of the XML data query method of an embodiment of the invention.
Fig. 2 is a schematic diagram of the Native XML storage scheme of the embodiment.
Fig. 3 is a schematic diagram of the internal node layer in Fig. 2.
Fig. 4 is a flowchart of the query for //a[//c]/b in the embodiment.
Fig. 5 is a schematic diagram of the push and pop operations for the query //a[//c]/b in the embodiment.
Fig. 6 is a schematic diagram of the push and pop operations for the query //a/*[c]/b in the embodiment.
Detailed description of the embodiments
The invention is described in detail below through specific embodiments, with reference to the accompanying drawings.
Fig. 1 is a flowchart of the XML data query method of the invention. The specific steps are:
1) Store the XML data in the database in Native XML mode.
The XML data query method of the invention is a holistic twig join method. Compared with earlier structural joins, holistic twig joins avoid producing large numbers of useless intermediate results. The method is built on Native XML storage, with XML elements encoded using the DDE encoding scheme. The native storage mechanism preserves the document order of XML elements, so the subdocument rooted at an element can be retrieved from the physical address of the element's opening tag. DDE codes are used to determine the common structural relationships between XML elements (ancestor-descendant, parent-child, sibling, and so on).
The storage design of the invention has three layers: the internal node layer, the leaf node layer, and the inverted layer, as shown in Fig. 2.
A) Internal node layer
The nodes of the XML tree are stored in the internal node layer in document order. Each record in this layer is one XML tree node and includes the integer identifier tagID to which the node name is mapped (compact to store and cheap to compare), the DDE code, the node type (element, attribute, or text), and so on. Fig. 3 shows a simple example of the internal node layer, in which (a) is the XML tree and (b) is the sequential storage corresponding to (a); records beginning with "/" are closing-tag records. The two leaf nodes "Database" and "25.00" appear here as pointers; their actual content is stored in the leaf node layer.
XML encoding is used to determine the structural relationships between XML elements. The TwigStack algorithm cannot process XPaths containing the wildcard "*" because the interval encoding it uses cannot determine the sibling axis. The invention uses DDE encoding, which has the following advantages over interval encoding:
Every axis that interval encoding can determine, DDE can also determine, and DDE can additionally determine the sibling axis, which interval encoding cannot;
DDE encoding supports updates to the XML document well: when the document changes, the existing codes do not need to change, which interval encoding cannot achieve.
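The structural tests that a prefix-based code makes possible can be sketched as follows. The sketch uses plain Dewey-style labels (tuples of child ordinals) as a stand-in and assumes that, for a document that has not been updated, DDE labels can be compared in the same way; the arithmetic DDE uses to absorb insertions is deliberately left out, and all function names are hypothetical.

```python
# A label is a tuple of child ordinals, e.g. (1, 2, 1) is the first child
# of the second child of the root. This is a simplified stand-in for DDE.

def is_ancestor(anc: tuple, desc: tuple) -> bool:
    # An ancestor's label is a proper prefix of the descendant's label.
    return len(anc) < len(desc) and desc[:len(anc)] == anc

def is_parent(par: tuple, child: tuple) -> bool:
    return len(child) == len(par) + 1 and child[:-1] == par

def is_sibling(x: tuple, y: tuple) -> bool:
    # Siblings share the parent prefix; interval codes cannot express this.
    return x != y and len(x) == len(y) and x[:-1] == y[:-1]

def level_diff(anc: tuple, desc: tuple) -> int:
    # The label length carries the level information.
    return len(desc) - len(anc)

def precedes(x: tuple, y: tuple) -> bool:
    # Tuple comparison yields document order: a prefix sorts before its
    # extensions, so an ancestor precedes its descendants.
    return x < y

a, x_node, b, c, other_c = (1,), (1, 1), (1, 1, 1), (1, 1, 2), (1, 2, 1)
print(is_sibling(b, c), is_sibling(b, other_c), level_diff(a, b))
```

With these labels, b and c under the same parent are recognised as siblings, while b and other_c, which sit under different children of a, are not.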
B) Leaf node layer
This layer stores the text data of the leaf nodes of the XML tree; each record stores the text data of one leaf node. Pointers in the internal node layer point here, and the physical page holding the text data is found through these pointers.
C) Inverted layer
The inverted layer is similar to the inverted index in an IR system. In the inverted layer, all elements with the same tag name form a sequence arranged in document order. The information for each element in a sequence includes its address in the internal node layer, the element type, the DDE code, and so on. The information stored in the inverted layer is sufficient to complete query matching for an XPath; the address of a matched element is then used to fetch, from the internal node layer, the subdocument between the element's opening and closing tags. In the inverted layer shown in Fig. 2, E1, E2, and E3 denote XML element information.
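The three layers can be pictured with an in-memory sketch. Everything here is an assumption made for illustration: the record fields, the use of list positions as addresses and leaf pointers, the Dewey-style tuples standing in for DDE codes, and the sample document, which only borrows the "Database" and "25.00" leaf values mentioned above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_layers(xml_text):
    tag_ids = {}                  # tag name -> integer tagID
    internal = []                 # internal node layer, in document order
    leaves = []                   # leaf node layer: text of leaf nodes
    inverted = defaultdict(list)  # inverted layer: tag -> [(address, label)]

    def visit(elem, label):
        tag_id = tag_ids.setdefault(elem.tag, len(tag_ids))
        # The address is the position of the opening record.
        inverted[elem.tag].append((len(internal), label))
        internal.append({"tagID": tag_id, "label": label, "type": "element"})
        text = (elem.text or "").strip()
        offset = 0
        if text:
            # The internal layer keeps only a pointer into the leaf layer.
            leaves.append(text)
            internal.append({"label": label + (1,), "type": "text",
                             "leaf": len(leaves) - 1})
            offset = 1
        for i, child in enumerate(elem, start=1 + offset):
            visit(child, label + (i,))
        # Closing-tag record, like the "/" records of Fig. 3(b).
        internal.append({"tagID": tag_id, "type": "close"})

    visit(ET.fromstring(xml_text), (1,))
    return internal, leaves, dict(inverted), tag_ids

internal, leaves, inverted, tag_ids = build_layers(
    "<book><title>Database</title><price>25.00</price></book>")
print(inverted["price"], leaves)
```

Because each inverted entry is appended during a pre-order traversal, every per-tag sequence is already in document order, which is what the merge step relies on.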
2) According to the input XPath query statement, fetch from the inverted layer the element sequences corresponding to the nodes on the XPath, and merge-sort them with a loser tree.
Each node q on the XPath has two data structures: an element sequence Tq and a stack Sq. Tq holds all elements in the XML document whose name matches q, arranged in document order. Sq stores, while the algorithm runs, elements whose name matches q; when a new element is pushed, the elements that are not its ancestors are popped.
The method of the invention may be called the "TurboStack" method. Suppose the XPath has n nodes q1, q2, ..., qn. The sequence Tqi (1 ≤ i ≤ n) for each node is obtained from the inverted layer, and the XML elements in Tqi are ordered. The input of the TurboStack method is a stream of XML elements in document order, so the n element sequences Tq1, Tq2, ..., Tqn must be merge-sorted into a single sequence in document order, which serves as the input of the algorithm.
The merge sort can be performed with a loser tree that relies on DDE codes: comparing the codes dde1 and dde2 of two elements determines their relative document order; the earlier element is the winner and the later element is the loser.
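The effect of the n-way merge can be shown with a short sketch. Note the substitution: Python's standard library has no loser tree, so the sketch uses heapq.merge, a heap-based k-way merge that produces the same merged order. The labels are Dewey-style tuples standing in for DDE codes, and the three sequences are hypothetical, chosen to reproduce the element order used in the worked example for //a[//c]/b.

```python
import heapq

# Each sequence is already in document order, as the inverted layer
# guarantees. Entries are (label, tag) pairs; tuples compare by label first.
T_a = [((1,), "a"), ((1, 1), "a")]
T_c = [((1, 1, 1), "c")]
T_b = [((1, 1, 2), "b"), ((1, 2), "b")]

# heapq.merge lazily merges sorted inputs into one sorted stream.
merged = list(heapq.merge(T_a, T_c, T_b))
tags = [tag for _, tag in merged]
print(tags)
```

The merged stream is first a, second a, c, first b, second b. A loser tree would reach the same order with fewer comparisons per element, which matters when n is large.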
3) Push and pop the merge-sorted elements in order, and obtain the query result from the buffer.
The overall execution flow of the method is denoted Algorithm 1. Its functions are as follows: ConstructStack(q) creates the stack for node q; GetStream(q) obtains the element sequence of node q from the underlying storage; MultiMergeSort(XPath) merge-sorts the Tq of all nodes on the XPath; getPopElement(e) selects from the stacks the elements that are not ancestors of e; match(e) decides whether e can be pushed.
Pushing and popping are described in detail below.
3.1) Push
Once all Tq have been merge-sorted so that the elements are in document order, the elements are pushed one after another. Pushing element e corresponds to encountering the opening tag of e in the XML document. The ancestors of e remain in the stacks because, by the structure of the XML tree, their closing tags have not yet been scanned. Elements that were pushed before e but are not ancestors of e have already had their closing tags passed and should be popped.
A push therefore retains only the ancestors of the new element, and all elements in the stacks are in ancestor-descendant relationships. All elements in the stacks are connected by a linked list called the Last Push List, abbreviated LPL. After a new element e is pushed, it is placed at the head of the LPL. Before e is pushed, each record Ei is examined in turn starting from the head of the LPL, and Ei is compared with e using their DDE codes. If Ei is not an ancestor of e, Ei is deleted from the LPL, which is called unlinking; otherwise the comparison stops. Unlinking marks the element as logically popped, although it is not actually removed from its stack.
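The LPL maintenance described above can be sketched in a few lines. The sketch models the LPL as a deque of labels with the head at index 0, uses Dewey-style tuples in place of DDE codes, and ignores the per-node stacks entirely; the function name push_with_lpl is hypothetical.

```python
from collections import deque

def is_ancestor(anc: tuple, desc: tuple) -> bool:
    return len(anc) < len(desc) and desc[:len(anc)] == anc

def push_with_lpl(lpl: deque, label: tuple) -> list:
    """Unlink non-ancestors from the LPL head, then link the new element.

    Returns the labels that were unlinked, most recently pushed first.
    """
    unlinked = []
    # Scan from the head and stop at the first ancestor of the new element.
    while lpl and not is_ancestor(lpl[0], label):
        unlinked.append(lpl.popleft())
    lpl.appendleft(label)
    return unlinked

lpl = deque()
stream = [(1,), (1, 1), (1, 1, 1), (1, 1, 2), (1, 2)]
trace = [push_with_lpl(lpl, label) for label in stream]
print(trace, list(lpl))
```

Stopping at the first ancestor is enough because the LPL always holds a chain of ancestors: once one record is an ancestor of the new element, every record behind it is too.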
If e is to be pushed onto SE and the parent of E on the XPath is A, the conditions checked by match(e) in Algorithm 1 are:
1. SA still contains an element that has not been unlinked;
2. if the axis between A and E is a parent-child axis and a1 is the element in SA that has not been unlinked and is nearest the stack top, then e must be a child of a1;
3. the type of e is the same as the type of E on the XPath (E may be an element node or an attribute node).
If E is the root node of the XPath, e can be pushed as long as it satisfies condition 3.
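The three conditions can be written as one predicate. This is a minimal sketch rather than the patent's match(e): the Record class, the linked flag (true while the record is still on the LPL), the axis strings "PC" and "AD", and the Dewey-style labels are all assumptions introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class Record:
    label: tuple          # Dewey-style stand-in for the DDE code
    node_type: str        # "element" or "attribute"
    linked: bool = True   # False once the record has been unlinked

def is_parent(par: tuple, child: tuple) -> bool:
    return len(child) == len(par) + 1 and child[:-1] == par

def can_push(e, query_type, parent_stack, axis, is_root=False):
    # Condition 3: the node type must match the query node.
    if e.node_type != query_type:
        return False
    if is_root:
        return True
    # Condition 1: the parent stack still holds a record on the LPL.
    # After LPL pruning every such record is an ancestor of e, which is
    # all an ancestor-descendant axis requires.
    linked = [r for r in parent_stack if r.linked]
    if not linked:
        return False
    # Condition 2: for a parent-child axis, e must be a child of the
    # linked record nearest the stack top (the end of the list).
    if axis == "PC":
        return is_parent(linked[-1].label, e.label)
    return True

S_A = [Record((1,), "element"), Record((1, 1), "element", linked=False)]
print(can_push(Record((1, 2), "element"), "element", S_A, "PC"))
```

In the usage line, (1, 1) has been unlinked, so the nearest linked record is (1,), and the new element (1, 2) is its child.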
A newly pushed record contains four pieces of information:
1. the element information (tagID, DDE, type);
2. the pointer PLPL, which points to the next record on the LPL;
3. the pointer Pstack, which points to the record in SA that is still on the LPL and nearest the stack top;
4. the matching status bit status, initially false. It is set to true once the record satisfies the structural requirements that the XPath places on it.
If E is a leaf node of the XPath, status is initialized to true.
If E is the output node of the XPath, e is placed in the result buffer outputBuffer.
The execution flow of the push is denoted Algorithm 2. Its functions are as follows: Push(e, SE) places element e on top of SE; Lappend_head(e, LPL) places element e at the head of the linked list LPL; Lappend(e, outputBuffer) places element e in the output buffer outputBuffer.
3.2) Pop
When a record in a stack is unlinked, its PLPL is set to NULL, but the record itself is not popped.
On the XPath, every node except the leaf nodes has child nodes; the stack of a child node is called a sub-stack of the parent node. When an element in the parent node's stack is unlinked, elements of the sub-stacks are popped.
Suppose node A has two child nodes B and C, with a parent-child axis between A and B and an ancestor-descendant axis between A and C. Suppose stack SA currently holds two records a1 and a2, with a2 on top, and a2 is about to be unlinked. The record a2 is not popped; what is popped are records in SB and SC.
First, it must be determined whether record a2 satisfies the structural requirements that the XPath places on query node A. This is the work of the function matchStructure(e) in Algorithm 3:
A) The Pstack pointers of records b1, b2, ..., bn in SB point to a2, and the Pstack pointers of records c1, c2, ..., cm in SC also point to a2. These records have already been unlinked, and their status bits were determined when they were unlinked.
B) If b and c are in an AND relationship, then
a2->status = (b1->status || ... || bn->status) && (c1->status || ... || cm->status);
if b and c are in an OR relationship, then
a2->status = (b1->status || ... || bn->status) || (c1->status || ... || cm->status).
If n = 0 or m = 0, that is, no record in SB or SC points to a2, the status contribution of SB or SC is treated as false.
In SB, b1, b2, ..., bn are all popped, because they cannot be children of a1. In SC, the records among c1, c2, ..., cm whose status is false are popped, and the Pstack pointers of the remaining records are redirected to a1, because they are also descendants of a1.
If a2->status is now false, the records in the output buffer that are descendants of a2 are deleted from the buffer.
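The status computation in step B) reduces to a few boolean operations. The sketch below is only that computation, not the full matchStructure(e); the function name combine_status and the list-of-booleans inputs are assumptions. The lists hold the status bits of the records in SB and SC whose Pstack points to the record being unlinked.

```python
def combine_status(b_statuses, c_statuses, relation):
    """Status of a parent record from the statuses of its B and C records."""
    # any([]) is False, which matches the rule that a sub-stack with no
    # record pointing to the parent contributes false.
    b_ok = any(b_statuses)
    c_ok = any(c_statuses)
    if relation == "AND":
        return b_ok and c_ok
    if relation == "OR":
        return b_ok or c_ok
    raise ValueError(f"unknown relation: {relation!r}")

print(combine_status([False, True], [], "AND"),
      combine_status([False, True], [], "OR"))
```

The usage line shows the difference that lets the method support OR: with no matching c record the AND query fails, while the OR query succeeds on the strength of the b records alone.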
The algorithm terminates when Troot, the sequence of the XPath root node root, is empty and all records in Sroot have been unlinked; the elements in outputBuffer are then the query result.
The execution flow of the pop is denoted Algorithm 3. Its functions are as follows: IsEmpty(LPL, SE) checks whether SE still contains an element that has not been unlinked and returns true if it does not; Delete_Stacks(SE, e) deletes the descendants of e from the sub-stacks of SE; Delete_InterResult(outputBuffer, e) deletes the descendant elements of e from the output buffer; stack_top(SE) returns the element in the stack that has not been unlinked and is nearest the stack top; childStack(SE) returns all sub-stacks of SE; descendants(SC, e) returns the elements in stack SC that are descendants of e; PC(E, C) checks whether E and C are in a parent-child relationship; AD(E, C) checks whether E and C are in an ancestor-descendant relationship.
3.3) XPaths with the wildcard "*"
The common XPath axes include the ancestor-descendant (AD) axis and the parent-child (PC) axis. When the wildcard "*" appears in an XPath, three new axes are derived:
A) The grandparent-child (GPC) axis. For example, in a/*/c, a and c are in a grandparent-child relationship two levels apart; in a/*/*/c, they are three levels apart. The invention writes the GPC axis as /n, where n is an integer giving the number of levels between the two nodes.
B) The absolute ancestor-descendant (AAD) axis. For example, in a/*//c or a//*//c, a and c are in an absolute ancestor-descendant relationship at least two levels apart. The AAD axis is written //n, where n is an integer giving the minimum number of levels between the two nodes.
C) The special ancestor-descendant (SAD) axis. For example, in a//*/c, a and c are in a special ancestor-descendant relationship at least two levels apart. The SAD axis is written ///n, where n is an integer giving the minimum number of levels between the two nodes.
Why distinguish AAD from SAD? In a//*[//d]//c, the axes from a to d and from a to c are both AAD, and d and c are unrelated. In a//*[d]/c, the axes from a to d and from a to c are both SAD, and d and c are siblings. In a//*[d]//c, the axis from a to d is SAD and the axis from a to c is AAD, and d and c are unrelated.
GPC, AAD, and SAD are special cases of AD. They can easily be determined with DDE codes, because a DDE code carries level information.
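The level test behind the three axes can be sketched as follows, again with Dewey-style tuples standing in for DDE codes so that the label length gives the level. The function name and the axis strings are hypothetical. The sketch covers only the level relationship between two elements: AAD and SAD share the same test here, and the sibling requirement that separates them applies to pairs of branch elements and is not checked.

```python
def is_ancestor(anc: tuple, desc: tuple) -> bool:
    return len(anc) < len(desc) and desc[:len(anc)] == anc

def satisfies_axis(anc: tuple, desc: tuple, axis: str, n: int) -> bool:
    """Check the level relationship required by a GPC, AAD or SAD axis."""
    if not is_ancestor(anc, desc):
        return False
    diff = len(desc) - len(anc)   # number of levels between the elements
    if axis == "GPC":             # /n : exactly n levels apart
        return diff == n
    if axis in ("AAD", "SAD"):    # //n and ///n : at least n levels apart
        return diff >= n
    raise ValueError(f"unknown axis: {axis!r}")

a, c_two_below, c_three_below = (1,), (1, 1, 1), (1, 1, 1, 1)
print(satisfies_axis(a, c_two_below, "GPC", 2),
      satisfies_axis(a, c_three_below, "GPC", 2),
      satisfies_axis(a, c_three_below, "AAD", 2))
```

For a/*/c (GPC with n = 2) only the element exactly two levels below a qualifies, while for a/*//c (AAD with n = 2) the element three levels below qualifies as well.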
XPaths containing the wildcard "*" are equivalently rewritten using the GPC, SAD, and AAD axes.
When "*" appears in the XPath as a branching node, as in a/*[d]/c//e, the XPath is rewritten as a[/2d]/2c//e. Because d and c must also be siblings, this case needs special attention when the three new axes are processed.
Suppose a new element is to be pushed onto stack Sb, the parent of b on the XPath is a, and the axis between a and b is one of GPC, SAD, and AAD. The conditions for pushing are:
A) Sa must contain an element that has not been unlinked;
B) Sa contains an element whose level relationship with the new element is the one required by the axis;
C) the type of the new element meets the requirement of b.
If the axes from a to b and from a to c are both /n or both ///n with equal n, then b and c must be siblings. Suppose element a1 is now unlinked, and b1, b2, ..., bn and c1, c2, ..., cm are descendants of a1. To compute whether a1 matches, the siblings are first grouped, for example (b1, c1, c2), (b2, b3, c3, c4), ..., where the elements within each pair of parentheses are siblings. Then:
If b and c are in an AND relationship, a1->status = [b1->status && (c1->status || c2->status)] || [(b2->status || b3->status) && (c3->status || c4->status)] || ...;
if b and c are in an OR relationship, a1->status = (b1->status || c1->status || c2->status) || (b2->status || b3->status || c3->status || c4->status) || ....
If no sibling group can be formed, a1->status = false.
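The sibling grouping can be sketched by keying each record on its parent label. The sketch assumes Dewey-style tuples in place of DDE codes, so that dropping the last component of a label gives the parent, and represents each record as a (label, status) pair. Treating a group that contains only b records or only c records as a valid group under OR is an interpretation, not something the text states.

```python
from collections import defaultdict

def status_with_siblings(b_records, c_records, relation):
    """Status of a1 when its b and c descendants must be siblings."""
    if relation not in ("AND", "OR"):
        raise ValueError(f"unknown relation: {relation!r}")
    # Group the status bits by parent label: [b statuses, c statuses].
    groups = defaultdict(lambda: ([], []))
    for label, status in b_records:
        groups[label[:-1]][0].append(status)
    for label, status in c_records:
        groups[label[:-1]][1].append(status)
    for b_st, c_st in groups.values():
        if relation == "AND":
            ok = any(b_st) and any(c_st)
        else:
            ok = any(b_st) or any(c_st)
        if ok:
            return True   # one matching sibling group is enough
    return False          # also covers the case of no groups at all

siblings = status_with_siblings([((1, 1, 1), True)], [((1, 1, 2), True)], "AND")
cousins = status_with_siblings([((1, 1, 1), True)], [((1, 2, 1), True)], "AND")
print(siblings, cousins)
```

The two calls differ only in where c sits: under the same parent as b the AND query matches, under a different parent it does not, which is the distinction interval encoding cannot make.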
Whether the records in stacks Sb and Sc stay in the stack or are deleted is decided as follows: the GPC axis is handled like the PC axis, and the SAD and AAD axes are handled like the AD axis.
Fig. 4 is the query flowchart for the example //a[//c]/b, in which (a) merge-sorts all element sequences, (b) processes each element in order, and (c) places the matching results in the buffer. Fig. 5 is a schematic diagram of the push and pop operations for the example of Fig. 4. Each record in a stack has three parts: the element information on the left; the matching status bit status at the upper right, where F denotes false and T denotes true; and the pointer Pstack at the lower right. The bottom of the figure shows how the LPL changes. The steps of the query for //a[//c]/b are as follows:
Step 1: fetch the element sequences Ta, Tc, and Tb of nodes a, c, and b from the inverted layer.
Step 2: merge-sort Ta, Tc, and Tb with a loser tree, obtaining the element sequence: first a, second a, c, first b, second b.
Step 3: push and pop these five elements in order:
1) The first three elements are in ancestor-descendant relationships and are pushed in order. Because c is a leaf node, the status of c is true.
2) The first b is pushed. Checking the LPL shows that the closing tag of c has been passed, so c is unlinked. Because b is a leaf node, the status of b is true. Because b is the output node, the first b is placed in the output buffer.
3) The second b is pushed. Checking the LPL shows that the closing tags of the first b and the second a have been passed, so they are unlinked. When the second a is unlinked, its status becomes true, because the status of its child nodes b and c is true. Unlinking the second a pops the first b, because it is not a child of the first a; c is not popped, because it is a descendant of the first a, and the Pstack of c is redirected to the first a. Because b is the output node, the second b is also placed in the output buffer.
4) Finally, all elements have been processed, and the elements remaining on the LPL are unlinked. When the first a is unlinked, its status becomes true, because the status of its child nodes b and c is true. The stacks Sb and Sc are then emptied, because Sa no longer holds any element.
Step 4: the output buffer contains two results.
Fig. 6 shows a query example for //a/*[c]/b, which requires that the node matched by the wildcard "*" have two child nodes c and b, so that c and b are siblings. The query steps are as follows:
Step 1: equivalently rewrite the XPath as //a[/2c]/2b; that is, a has two grandchild nodes b and c, and b and c must be siblings.
Step 2: fetch the element sequences Ta, Tc, and Tb of nodes a, c, and b from the inverted layer.
Step 3: merge-sort Ta, Tc, and Tb with a loser tree, obtaining the element sequence: first a, second a, c, first b, second b.
Step 4: push and pop these five elements in order:
1) The first three elements are in ancestor-descendant relationships and are pushed in order. Because c is a grandchild of the first a, the Pstack of c points to the first a. Because c is a leaf node, the status of c is true.
2) The first b is pushed. Checking the LPL shows that the closing tag of c has been passed, so c is unlinked. The first b is a grandchild of the first a, so the Pstack of b points to the first a. Because b is a leaf node, the status of b is true. Because b is the output node, the first b is placed in the output buffer.
3) The second b is about to be pushed. Checking the LPL shows that the closing tags of the first b and the second a have been passed, so they are unlinked. When the second a is unlinked, its status remains false, because it has no grandchild elements c and b. Only the first a in Sa is still valid at this point, but the second b is not its grandchild, so the second b does not satisfy the push condition.
4) Finally, all elements have been processed, and the elements remaining on the LPL are unlinked. When the first a is unlinked, its status becomes true, because the status of its grandchild elements b and c is true and b and c are siblings. The stacks Sb and Sc are then emptied, because Sa no longer holds any element.
Step 5: the output buffer contains one result.
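The result counts of the two worked examples can be cross-checked by brute force. The document below is an assumption: it is the smallest tree consistent with the narrated element order and relationships (the second a inside the first, c and the first b as children of the second a, the second b as a child of the first a), and the id attributes are added only to tell the elements apart. The predicate [//c] is read, as in the text, as "has a c descendant". This is a plain tree walk, not the stack-based method.

```python
import xml.etree.ElementTree as ET

# Hypothetical document inferred from the narrative of Figs. 5 and 6.
DOC = "<a id='a1'><a id='a2'><c/><b id='b1'/></a><b id='b2'/></a>"
a1 = ET.fromstring(DOC)

def query_a_desc_c_child_b(root):
    """//a[//c]/b: b children of every a that has a c descendant."""
    hits = []
    for a in root.iter("a"):                     # includes root itself
        if next(a.iter("c"), None) is not None:  # some c below this a
            hits.extend(ch.get("id") for ch in a if ch.tag == "b")
    return sorted(hits)

def query_a_star_c_b(root):
    """//a/*[c]/b, evaluated with ElementTree's XPath subset."""
    # Wrap the document so that the outermost a is also reachable by .//a
    wrapper = ET.Element("wrapper")
    wrapper.append(root)
    return [b.get("id") for b in wrapper.findall(".//a/*[c]/b")]

first = query_a_desc_c_child_b(a1)
second = query_a_star_c_b(a1)
print(first, second)
```

The first query returns both b elements and the second returns only the first b, in agreement with the two results and one result reported for the examples.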
The above embodiments merely illustrate the technical solution of the invention and do not limit it. A person of ordinary skill in the art may modify the technical solution or substitute equivalents without departing from the spirit and scope of the invention; the scope of protection of the invention is defined by the claims.
Claims (7)
1. A query method for XML data, comprising the steps of:
1) storing the XML data in Native XML mode, the storage structure comprising: an internal node layer, which stores the nodes of the XML tree arranged in document order, the XML elements being encoded with the DDE encoding scheme; a leaf node layer, which stores the text data of the leaf nodes of the XML tree; and an inverted layer, which stores an inverted index over the internal node layer, each index entry being the sequence of elements with the same tag name, arranged in document order;
2) according to an input XPath query statement, fetching from the inverted layer the element sequences corresponding to the nodes of the XPath, and merge-sorting them with a loser tree; wherein merge-sorting with the loser tree comprises comparing the DDE codes of two elements to determine their relative document order, the earlier element being the winner and the later element the loser; and wherein, when the wildcard "*" appears in the XPath, three new axes are derived, namely the grandparent-child axis, the absolute ancestor-descendant axis, and the special ancestor-descendant axis, and the XPath containing the wildcard "*" is equivalently rewritten using the three new axes;
3) pushing and popping the merge-sorted XML elements in order, and obtaining the query result from a buffer.
2. The method of claim 1, characterized in that, in the internal node layer, each record includes: the integer identifier to which the node name is mapped, the DDE code, and the node type.
3. The method of claim 1, characterized in that, in the inverted layer, the information for each element includes: the element type, the address of the element in the internal node layer, and the DDE code.
4. The method of claim 1, characterized in that the internal node layer points to the leaf node layer by pointers.
5. The method of claim 1, characterized in that each node q in the XPath has two data structures: an element sequence Tq and a stack Sq; Tq holds all elements in the XML document whose name matches q, arranged in document order; Sq stores elements whose name matches q and undergoes the push and pop operations.
6. The method of claim 1, characterized in that, on a push operation, only the ancestors of the new element are retained in the stacks, so that all elements in the stacks are in ancestor-descendant relationships.
7. The method of claim 6, characterized in that, if element e is to be pushed onto stack SE and the parent of node E on the XPath is A, the conditions for pushing e are:
A) SA contains an element that has not been unlinked, where unlinking means deleting a record that is not an ancestor of e from the linked list connecting all elements in the stacks;
B) e is a child of the element in SA that has not been unlinked and is nearest the stack top;
C) the type of e is the same as the type of E on the XPath.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210192018.7A CN103488639B (en) | 2012-06-11 | 2012-06-11 | A kind of querying method of XML data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210192018.7A CN103488639B (en) | 2012-06-11 | 2012-06-11 | A kind of querying method of XML data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103488639A CN103488639A (en) | 2014-01-01 |
CN103488639B true CN103488639B (en) | 2016-12-07 |
Family
ID=49828879
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210192018.7A (Expired - Fee Related), CN103488639B (en) | 2012-06-11 | 2012-06-11 | A kind of querying method of XML data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103488639B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105677740A (en) * | 2015-12-29 | 2016-06-15 | 中国民用航空上海航空器适航审定中心 | Method for matching entity-based text data and XML files |
CN108614808B (en) * | 2016-12-12 | 2020-09-04 | 北大方正集团有限公司 | Typesetting method and typesetting device for XML (extensive markup language) document |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1584884A (en) * | 2003-08-20 | 2005-02-23 | 富士通株式会社 | Apparatus and method for searching data of structured document |
CN101010674A (en) * | 2004-06-16 | 2007-08-01 | 甲骨文国际公司 | Efficient extraction of XML content stored in a LOB |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6804677B2 (en) * | 2001-02-26 | 2004-10-12 | Ori Software Development Ltd. | Encoding semi-structured data for efficient search and browsing |
Also Published As
Publication number | Publication date |
---|---|
CN103488639A (en) | 2014-01-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Meier et al. | Nosql databases | |
US8738667B2 (en) | Mapping of data from XML to SQL | |
JP5255605B2 (en) | Registry-driven interoperability and document exchange | |
US6804677B2 (en) | Encoding semi-structured data for efficient search and browsing | |
US20060218160A1 (en) | Change control management of XML documents | |
US20070061706A1 (en) | Mapping property hierarchies to schemas | |
US20060173865A1 (en) | System and method of translating a relational database into an XML document and vice versa | |
US20040060006A1 (en) | XML-DB transactional update scheme | |
US20080301168A1 (en) | Generating database schemas for relational and markup language data from a conceptual model | |
WO2001061566A1 (en) | System and method for automatic loading of an xml document defined by a document-type definition into a relational database including the generation of a relational schema therefor | |
CN102033954A (en) | Full text retrieval inquiry index method for extensible markup language document in relational database | |
CN101661481A (en) | XML data storing method, method and device thereof for executing XML query | |
US9805112B2 (en) | Method and structure for managing multiple electronic forms and their records using a static database | |
CN102214243A (en) | Version management system for x extensible business reporting language (XBRL) classification standard | |
CN109871473A (en) | A kind of method of pair of project file and Database full-text search document | |
US9037553B2 (en) | System and method for efficient maintenance of indexes for XML files | |
Koupil et al. | A universal approach for multi-model schema inference | |
EP2425382B1 (en) | Method and device for improved ontology engineering | |
CN103488639B (en) | A kind of querying method of XML data | |
CN105550176A (en) | Basic mapping method for relational database and XML | |
Koupil et al. | Schema inference for multi-model data | |
Barbosa et al. | Efficient incremental validation of XML documents after composite updates | |
Moro et al. | Schema advisor for hybrid relational-XML DBMS | |
WO2010147453A1 (en) | System and method for designing a gui for an application program | |
Cavalieri et al. | On the reduction of sequences of XML document and schema update operations |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20161207; Termination date: 20190611 |