CN1711534A - Scalably accessing data in an arbitrarily large document - Google Patents

Scalably accessing data in an arbitrarily large document

Info

Publication number
CN1711534A
Authority
CN
China
Prior art keywords
document
data structure
specific part
xml
permanent storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2003801027567A
Other languages
Chinese (zh)
Other versions
CN100432993C (en)
Inventor
Sivasankaran Chandrasekar
Ravi Murthy
Nipun Agarwal
Eric Sedlar
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oracle International Corp
Publication of CN1711534A
Application granted
Publication of CN100432993C
Anticipated expiration
Expired - Lifetime

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Techniques for accessing data that resides in a document on a computer-readable medium, by a device whose resources are limited in amount, include determining usage for each of a plurality of portions of the document that consume the device resources. Each portion may be accessed independently of a different portion of the document. Based on the usage, a particular portion of the document is selected to cease consuming the device resources. The device resources consumed by the particular portion are released. The techniques allow a document-processing device with limited resources to scale up to process a large document that would otherwise exceed the available resources. This capability is an advantage when first inserting a large XML document, which cannot be fully manifested in available memory, as multiple loadable units into a database or other persistent store.

Description

Scalably accessing data in an arbitrarily large document
Cross-reference to related applications
This application claims the benefit of provisional application 60/424,543, filed on November 6, 2002, the entire contents of which are hereby incorporated by reference.
Technical field
The present invention relates to techniques for accessing data from an arbitrarily large document in a way that scales with the limited resources available to a device, and in particular to processing data held in an XML document whose manifestation would be larger than available memory.
Background
The volume of information exchanged in electronic business keeps growing. Businesses that exchange information have recognized the need for a common standard for representing data. Extensible Markup Language (XML) is rapidly becoming that common standard.
XML describes and provides structure to a body of data, such as a file or a data packet, referred to herein as an XML document. The XML standard provides for tags that delimit sections of an XML document, referred to as XML elements or simply "elements".
An element may contain various types of data, including element attributes and other elements. An element contained in another element is referred to as a child element of that element. By defining elements that contain attributes and child elements, an XML document defines hierarchical parent-child relationships among the elements, their child elements, and their attributes.
The term node refers to individual elements and element attributes in an XML document. An XML document thus defines a hierarchy of nodes with parent-child relationships. Such a hierarchy is referred to herein as a node tree or node hierarchy.
The term attribute is used herein to refer to a discrete part or component of a structure, such as a data structure or an object that belongs to an object type under an object-oriented methodology. An attribute may itself be a complex structure that includes one or more other attributes as its members. The XML standard provides for element attributes in the form of name-value pairs. The term attribute as used herein includes element attributes but is not limited to them.
Industry standards define structures for representing XML documents. One such standard is the Document Object Model (DOM), published by the World Wide Web Consortium (W3C).
For a computer to operate on an XML document, an in-memory representation of the document is generated. Typically, an XML document is loaded from a storage device (for example, a disk that stores a file containing the XML entity) or from data received over a communication channel, to generate an in-memory data structure that represents the XML document. The in-memory data structure is manipulated by software and programs executed by computer processes. The process of loading an XML document and generating its in-memory representation is referred to as manifesting the XML document. Typically, applications access and manipulate the in-memory data structure created by manifestation through an API.
In conventional approaches, when an XML document is manifested, the entire document is manifested. XML documents can be very large, and manifesting them can therefore require a very large amount of memory. Some XML documents are so large that the memory needed to manifest them far exceeds the memory allocated for the purpose, and may far exceed the capacity of many computers.
Based on the foregoing, there is a need for a mechanism that reduces the amount of memory needed to manifest an XML document.
In one approach, described in Pannala cited above, an XML document is decomposed into multiple loadable units that can be stored separately in database objects of a database system. When a process then attempts to manifest data from the XML document, only the loadable units that contain the data of interest are loaded from the database into memory; the entire XML document is not manifested. A loadable unit is a group of one or more nodes in an XML document. When one node in a loadable unit is manifested, all nodes in that loadable unit are manifested. A loadable unit may, but need not, correspond to the way node content is organized in persistent storage.
The Pannala system serves many purposes, but some operations cannot take advantage of separately stored and separately loaded units. Such operations still require excessive amounts of memory; they do not scale to large XML documents. They include operations that manifest units of interest across many or all of the loadable units of a large XML document, and operations that initially insert an entire large XML document into persistent storage, such as a database.
Based on the foregoing, there is a need for techniques that reduce the memory required by operations that involve enough loadable units of an XML document to exceed available memory.
In addition, the Pannala approach assumes that the contents of a loadable unit in memory do not change, so that the loadable unit can always be replaced by reloading it from persistent storage. In many operations, however, the contents of one or more loadable units in memory differ from the contents stored separately in persistent storage. For example, during initial insertion into a database system, none of the loadable units first loaded into memory yet resides in the database in persistent storage as a separately stored unit. Such loadable units are called "dirty". The Pannala approach does not accommodate dirty loadable units.
Based on the foregoing, there is also a need for techniques that preserve information during operations that involve dirty loadable units of an XML document.
The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, the approaches described in this section are not to be considered prior art to the claims merely because they appear in this background section.
Description of drawings
The present invention is illustrated by way of example, and not by way of limitation, in the accompanying drawings, in which like reference numerals refer to similar elements and in which:
Fig. 1 is a block diagram that illustrates structures used by a server to process an XML document that exceeds available memory, according to an embodiment;
Fig. 2 is a flow diagram that illustrates a high-level method for processing an XML document that exceeds available memory, according to an embodiment; and
Fig. 3 is a block diagram that illustrates a computer system upon which an embodiment of the invention may be implemented.
Detailed description
A method and an article of manufacture are described for accessing data in a large document by a device with limited resources. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.
Embodiments of the invention are described herein in the context of a database server in which manifesting all, or the needed parts, of an XML document in memory would exceed the available memory. However, the invention is not limited to this context. Other embodiments may apply to a server that is not a database server but provides other services based on XML documents. In other embodiments, the XML document may be input to a stand-alone application that is not a server, that is, an application that does not respond to requests from a separate client process. In some embodiments, the document accessed is not an XML document but a document containing multiple structures according to another markup language. In some embodiments, the limited resource is not memory but some other resource, such as a buffer for an input or output device on the host of the application that processes the document, or the bandwidth of a network connection over which the document is sent.
In an illustrative embodiment, separately accessible portions of an XML document (for example, the "loadable units" of Pannala, abbreviated LU) are loaded into memory, and the usage of those portions is monitored. When additional memory is needed, the memory allocated to one or more of these portions is released based on the usage. For example, the memory allocated to the least recently used portion is released. A portion that loses the memory allocated to it is said to be "unloaded" from memory. In some embodiments, if the unloaded portion is dirty (for example, it contains data that has not been persistently stored as a separate loadable unit in one or more containers of a database), the portion is persistently stored before its memory is reallocated to other portions of the document. Because each loadable unit can be accessed independently in persistent storage, it can be retrieved from persistent storage later if it is needed again. In some embodiments, the portion is persistently stored directly in object-relational data structures of the database; in some embodiments, the portion is persistently stored in a temporary file in persistent storage.
The techniques allow a document-processing device with limited resources to scale up to process a large document that would otherwise exceed the available resources. This capability is an advantage when first inserting a large XML document, which cannot be fully manifested in available memory, as multiple loadable units into a database or other persistent store. It is also an advantage after the different portions of the large XML document have been separately inserted into persistent storage, if an operation involves more portions than can be manifested in memory at one time, or if an operation modifies the contents of loadable units. Examples of such operations are formatting an entire document for presentation in a graphical interface using an XML stylesheet, and editing arbitrary elements of a large XML document.
Architectural overview
Fig. 1 is a block diagram that illustrates data structures used by a server in an XML processing system 100 to process an XML document that exceeds available memory, according to an embodiment. System 100 includes a database server 120, which includes server fast memory 150 allocated for processing XML documents. The term fast memory, as used herein, refers to the memory most readily available to a processor for storing data and instructions outside the processor itself. With current technology, memory used for this purpose has fast response times, but data does not persist in it if power to the memory is lost. Fast memory is more expensive than other types of memory and is therefore scarcer. Server 120 also includes one or more persistent storage devices that provide server persistent storage 130. With current technology, such persistent storage retains data when power is lost, but it is too slow to serve as the processor's fast memory. Persistent storage 130 includes a storage area reserved for temporary storage 132 and a storage area for database storage space 140.
Database storage space 140 includes storage for one or more object-relational structures 144a, 144b, and others indicated by ellipsis 146, collectively referred to below as object-relational structures 144. The object-relational structures 144 store data objects in one or more relational database constructs, such as tables, rows, columns, and stored procedures.
System 100 processes an XML document type data structure 102, which defines the attributes of each type of element that may be used by an XML document that is an instance of the type. The data structure may reside on any computer-readable medium that can be read by server 120, such as a removable magnetic or optical disk, or a communication channel, as described in more detail in a later section. Database storage space 140 includes a mapping 142 between the elements or element attributes used in an XML document and the data objects in one or more object-relational structures 144. System 100 generates the contents of mapping 142 based on the contents of XML document type data structure 102, using any mechanism known in the art. In some embodiments, the object-relational structures are themselves defined and created based on the contents of XML document type data structure 102. In an embodiment, the mechanism described in Pannala is used to map the elements and attributes of XML document type data structure 102 to data objects of the object-relational structures 144.
XML document 110 is an instance of the type defined in data structure 102. XML document 110 may include one or more elements, each with one or more attributes as defined in document type data structure 102, and each attribute may itself be another element. XML document 110 therefore establishes a particular node hierarchy in which each child node represents an attribute of its parent node and each attribute has a value appropriate to its type.
For purposes of illustration, assume that XML document 110 has multiple attributes, including one represented by element 112a and another represented by element 112b. Assume further that element 112a has multiple attributes, including those represented by elements 114a, 114b, and 115 and by other elements indicated by ellipsis 113. Assume also that elements 114a and 114b may include attributes that contain other elements (not shown). Assume likewise that element 115 is a block of data that is not further divided or defined by XML document type data structure 102; such an element is called an "opaque" element and can be very large. For example, opaque element 115 may contain a long text that sets out the terms and conditions of a purchase contract, or a character string that represents a genetic code sequence. Assume also that element 112b has multiple attributes, including those represented by elements 116 and 117 and by other elements indicated by ellipsis 118.
The server builds one or more loadable unit (LU) data structures (for example, 152a, 152b, and others indicated by ellipsis 153, collectively referred to below as LU data structures 152). During processing, data for one or more nodes of XML document 110 is converted into LUs in memory 150 for storage in persistent storage 130. For some LUs, the values of the attributes are included in the LU. For other LUs, the values of some attributes, such as child elements, are not included in the LU; instead, each such attribute is itself a different LU. A locator indicates the location of the different LU in persistent storage 130, and the locator for the different LU is included in the parent LU. An LU therefore contains enough information to load the rest of the XML tree beyond the nodes in the parent LU. An LU indicated by a locator in a parent LU is sometimes called an "out-of-line" LU, because it is often stored in one or more data structures different from the data structure that stores the parent LU.
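The relationship between a parent LU and its out-of-line children can be sketched in Python as follows. This is only an illustration: the class names `Locator` and `LoadableUnit`, their fields, and the sample names such as `element_114a` and `tmp_store_132` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Locator:
    """Hypothetical locator: where an out-of-line LU lives in persistent storage."""
    container: str  # e.g. a temporary file name or a table name (assumption)
    offset: int     # e.g. a byte offset or a row number (assumption)


@dataclass
class LoadableUnit:
    """A parent LU keeps inline attribute values and locators for child LUs."""
    name: str
    inline_attrs: Dict[str, str] = field(default_factory=dict)
    # child LU name -> locator, or None while the child has not been stored yet
    child_locators: Dict[str, Optional[Locator]] = field(default_factory=dict)

    def unresolved_children(self):
        """Names of child LUs that do not yet have a persistent location."""
        return sorted(n for n, loc in self.child_locators.items() if loc is None)


# The LU for element 112a holds locators, not values, for 114a, 114b and 115.
lu_b = LoadableUnit(
    name="element_112a",
    inline_attrs={"id": "42"},
    child_locators={"element_114a": None, "element_114b": None, "element_115": None},
)
# Once a child has been written out, its locator can be filled in.
lu_b.child_locators["element_114a"] = Locator(container="tmp_store_132", offset=0)
```

Keeping only a small locator per child is what lets the parent stay resident while large children, such as an opaque element, are stored elsewhere.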
Server 120 also builds a usage data structure 156 that records the usage of the data in the LU data structures 152. For example, usage data structure 156 includes a first data item that indicates whether processing in server 120 of the data in a particular LU data structure 152 in memory 150 has completed. In some embodiments, the usage data structure includes a second data item that indicates the relative time at which the data in the particular LU data structure 152 was last used.
Server 120 also builds a locator data structure 158. Locator data structure 158 holds locators; each locator indicates where a particular LU is stored in persistent storage 130 and is associated with an LU identifier (for example, an LU name). For example, a locator indicates a file name and an offset of data written as a large object (LOB) in temporary storage 132. In another embodiment, a locator indicates a data item, such as a row, in one or more object-relational structures 144 in database storage space 140. The object-relational structures 144 may include database LOB structures among other structures.
Functional overview
Fig. 2 is a flow diagram that illustrates a high-level method 200 for processing an XML document that exceeds available memory, according to an embodiment. Although the steps are shown in Fig. 2 in a particular order, in other embodiments the steps may be performed in a different order or may overlap in time.
In step 202, the loadable units for a document are determined based on the element types defined in XML document type data structure 102. The XML standard defines a schema language for specifying the structure of XML documents. This language may be used in data structure 102. The structure defined in XML document type data structure 102 is referred to below as an XML schema. Database server 120 can interpret XML document type data structure 102 and create or modify the object-relational structures 144 needed to support the XML schema. For example, the XML schema construct "<complexType>" is mapped to an object type in the database. Additional user annotations in the XML schema indicate special storage parameters, so that certain portions of an XML document are stored in additional tables, large objects (LOBs), and other database containers. In other embodiments, a different server can interpret XML document type data structure 102 and create or modify other data containers needed to support the XML schema.
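To make the role of "complexType" definitions concrete, the following Python sketch lists the named complex types in a small schema, which are the kind of definitions a server might map to object types. The sample schema, the type names `OrderType` and `ItemType`, and the variable names are assumptions made for illustration; how a database server actually performs the mapping is not shown here.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# A tiny, made-up XML schema used only for this illustration.
schema_text = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="item" type="ItemType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ItemType">
    <xs:sequence>
      <xs:element name="sku" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

root = ET.fromstring(schema_text)

# Collect the names of all complexType definitions in document order.
candidate_types = [ct.get("name") for ct in root.iter(f"{{{XS}}}complexType")]

# For each complex type, record the child elements it declares.
children_by_type = {
    ct.get("name"): [el.get("name") for el in ct.iter(f"{{{XS}}}element")]
    for ct in root.iter(f"{{{XS}}}complexType")
}
```

In a sketch like this, each entry of `candidate_types` would be a candidate for its own storage container, and hence for its own loadable unit.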
In step 210, a loadable unit (LU) is loaded into the fast memory 150 allocated for processing the XML document and is stored in an LU data structure 152. The LU may be obtained from the database in database storage space 140, from temporary storage 132, or from XML document 110 on some other computer-readable medium. When an LU that uses locators for one or more attributes is loaded from temporary storage 132 or from XML document 110, locator values for the positions of those attributes in the database may not yet have been created; in that case, a temporary value is used for the locator, such as the name of the first element of the corresponding LU.
For example, for purposes of illustration, assume that element 112a corresponds to one loadable unit and that elements 114a, 114b, and 115 are each a different loadable unit. The LU for element 112a therefore uses locators rather than the actual values of the attributes that correspond to elements 114a, 114b, and 115. When element 112a is read by server 120 from a new XML document 110, an LU data structure 152 (for example, 152b) is created in memory 150 to hold this element, but elements 114a, 114b, and 115 have not yet been read or stored in persistent storage 130, so no locators are yet defined for the LU data structures that will hold those LUs. The LU data structures for those elements are created when the elements are read.
In step 220, the usage of the LUs held in the in-memory LU data structures 152 is determined and stored in usage data structure 156. In one embodiment, usage data structure 156 includes, for each LU data structure in memory 150, an LU memory address field (designated "MEM_ADDR" below), a count field (designated "COUNT" below), and a time field (designated "TIME" below). In some embodiments, usage data structure 156 is separate from the LU data structures 152 and contains a separate record for each LU data structure 152. When an LU data structure 152 is created in memory 150, a row with values for these fields is added to usage data structure 156. The value of the MEM_ADDR field indicates where the LU data structure 152 begins in memory 150. The value of the COUNT field is set to "1", indicating that one process is using the LU data structure (in this case, the process that is creating it). The value of the TIME field is set to the current system time to indicate when the LU data structure was last used.
In other embodiments, more or fewer fields are used in usage data structure 156. For example, in some embodiments a size field (designated "SIZE" below) is included for each LU data structure 152 to indicate its size. In some embodiments, a dirty flag field (designated "DIRTY" below) is included for each LU data structure 152 to indicate whether the contents of the LU are dirty, that is, whether they may differ from the contents of the LU in persistent storage 130. The DIRTY field holds one of two values: one indicates that the corresponding LU is dirty, and the other indicates that it is not.
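One possible shape for a record holding the MEM_ADDR, COUNT, TIME, SIZE, and DIRTY fields is sketched below in Python. The class name `UsageRecord`, the use of megabytes for SIZE, the monotonic clock for TIME, and the sample address are assumptions for illustration only.

```python
import time
from dataclasses import dataclass, field


@dataclass
class UsageRecord:
    """One row of a hypothetical usage table, one row per LU data structure."""
    mem_addr: int     # MEM_ADDR: where the LU data structure begins in memory
    size_mb: float    # SIZE: size of the LU data structure (megabytes, by assumption)
    dirty: bool       # DIRTY: True if contents may differ from persistent storage
    count: int = 1    # COUNT: 1 on creation, because the creating process is using it
    last_used: float = field(default_factory=time.monotonic)  # TIME: last use


# A row is added when the LU data structure is created in memory.
usage_table = {}
usage_table["LU-A"] = UsageRecord(mem_addr=0x1000, size_mb=0.01, dirty=True)

# Total memory consumed is the sum of the SIZE values.
total_mb = sum(rec.size_mb for rec in usage_table.values())
```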
In some embodiments, usage data structure 156 is part of the LU data structure 152. In such embodiments, the MEM_ADDR field can be omitted, and when the LU data structure 152 is created in memory 150, values are stored in the other fields of usage data structure 156 as described above.
Whenever a process of server 120 begins to use an LU data structure 152, the value of the COUNT field for that LU is incremented by 1 and the value of the TIME field for that LU is updated. The start of use of an LU data structure by a process is referred to below as a "touch" of that LU data structure. Whenever a process of server 120 finishes using an LU data structure 152, the value of the COUNT field for that LU is decremented by 1 and the value of the TIME field for that LU is updated. For example, a process that simply loads data from XML document 110 causes the corresponding COUNT value to be 1 when the LU data structure 152 is created, and the COUNT value drops to 0 when the LU is fully loaded into the data structure 152. A parent LU is considered fully loaded only once the LUs that correspond to its attributes have been loaded.
In an illustrative embodiment, usage data structure 156 includes a list of unloadable units. When the value of a COUNT field reaches 0, the corresponding LU data structure is added to the list. Any method may be used to indicate an LU in the list. In one example, an LU in the list is indicated by its memory address. In other embodiments, a COUNT value other than 0 can be used to qualify an LU data structure for inclusion in the list of unloadable units. Embodiments that use this list need not maintain the TIME field in usage data structure 156, because LU data structures are added to the list in order from the earliest to the latest time at which they became unloadable.
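The COUNT bookkeeping and the ordered list of units whose COUNT has reached 0 can be sketched in Python as follows. The class `UsageTracker` and its method names are hypothetical. Removing an LU from the list when it is touched again is an assumption of this sketch; the text above only states when an LU is added.

```python
from collections import OrderedDict


class UsageTracker:
    """Hypothetical COUNT bookkeeping with an ordered list of idle LUs."""

    def __init__(self):
        self.count = {}                  # LU name -> COUNT
        self.unloadable = OrderedDict()  # idle LU names, oldest first

    def create(self, name):
        # The creating process is using the LU, so COUNT starts at 1.
        self.count[name] = 1

    def touch(self, name):
        # A process starts using the LU: COUNT goes up by 1.
        self.count[name] += 1
        # Assumption: an LU in use again is no longer a candidate for unloading.
        self.unloadable.pop(name, None)

    def release(self, name):
        # A process finishes using the LU: COUNT goes down by 1.
        self.count[name] -= 1
        if self.count[name] == 0:
            # Appended at the newest end, so the list stays ordered by idle time.
            self.unloadable[name] = None


tracker = UsageTracker()
tracker.create("LU-A")
tracker.create("LU-B")
tracker.release("LU-B")   # LU-B becomes idle first
tracker.release("LU-A")   # then LU-A
tracker.touch("LU-B")     # LU-B is used again and leaves the list
tracker.release("LU-B")   # LU-B becomes idle again, now after LU-A
idle_order = list(tracker.unloadable)
```

Because insertion order already reflects when each LU became idle, no timestamp comparison is needed to find the oldest candidate.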
In step 230, it is determined whether a condition for releasing the memory allocated to one or more LU data structures 152 is satisfied. In an illustrative embodiment, the condition is that the total amount of memory 150 used by the LU data structures 152 exceeds a threshold. The threshold is usually chosen to be less than the total memory allocated. For example, the condition may be that the LU data structures 152 consume more than 75% of memory 150. The total memory consumed can be obtained by summing the values of the SIZE fields in usage data structure 156. If it is determined in step 230 that the condition is not satisfied, control passes back to step 210 to load the next LU into fast memory. If it is determined in step 230 that the condition is satisfied, control passes to step 240.
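The release condition of step 230 amounts to a comparison between the summed SIZE values and a threshold. A minimal Python sketch follows; the function name `should_release` and its parameters are hypothetical, and the 75% default simply mirrors the example in the text.

```python
def should_release(sizes_mb, allocated_mb, threshold_fraction=0.75):
    """Return True when the LUs in memory consume more than the threshold.

    sizes_mb: SIZE values of the LU data structures currently in memory.
    allocated_mb: total fast memory allocated for processing the document.
    threshold_fraction: fraction of the allocation that triggers unloading.
    """
    return sum(sizes_mb) > allocated_mb * threshold_fraction


# With 2 MB allocated, the 75% threshold is 1.5 MB.
small_load = should_release([0.01], allocated_mb=2.0)       # far below threshold
heavy_load = should_release([0.9, 0.7], allocated_mb=2.0)   # about 1.6 MB in use
```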
In step 240, one or more LU data structures are selected for unloading based on the usage. For example, the least recently used LU data structure is selected. The least recently used LU data structure can be found by looking for the LU data structure 152 with the earliest value in its TIME field.
To avoid unloading an LU that a process is still using, some embodiments consider only LU data structures with a COUNT value of 0 for unloading. In some such embodiments, the least recently used LU is determined only from the list of unloadable units described above; it is the first LU data structure indicated in that list.
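Selecting the least recently used LU among those with a COUNT of 0 can be sketched in Python as below. The function name `select_victim`, the dictionary layout of the records, and the sample TIME values are assumptions for illustration.

```python
def select_victim(records):
    """Pick the idle LU (COUNT == 0) with the earliest TIME, or None if none is idle.

    records: iterable of dicts with hypothetical keys NAME, COUNT and TIME.
    """
    idle = [r for r in records if r["COUNT"] == 0]
    if not idle:
        return None  # every LU is still in use, so nothing can be unloaded
    return min(idle, key=lambda r: r["TIME"])["NAME"]


records = [
    {"NAME": "LU-A", "COUNT": 1, "TIME": 10.0},  # oldest, but still in use
    {"NAME": "LU-C", "COUNT": 0, "TIME": 12.0},  # idle, least recently used
    {"NAME": "LU-D", "COUNT": 0, "TIME": 15.0},  # idle, used more recently
]
victim = select_victim(records)
no_victim = select_victim([{"NAME": "LU-A", "COUNT": 1, "TIME": 10.0}])
```

Note that LU-A has the earliest TIME but is skipped because its COUNT is not 0.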
In step 250, the memory allocated to the selected LU data structure is released. In some embodiments, step 250 includes steps 252, 254, and 256.
In step 252, it is determined whether the contents of the selected LU data structure 152 are already stored separately in persistent storage. In embodiments with a DIRTY field associated with each LU data structure 152, step 252 can be performed by determining whether the DIRTY field indicates that the LU data structure 152 is not dirty. If the selected LU data structure 152 is determined not to be dirty, control passes to step 256, described below, without writing the contents to persistent storage.
If it is determined in step 252 that the selected LU data structure is dirty, control passes to step 254. In step 254, the LU contents in the LU data structure 152 are written to an LU data structure in persistent storage 130. In some embodiments, the LU is written to a data structure in temporary storage 132. In some embodiments, the LU is written to an object-relational structure 144 in database storage space 140. In some embodiments, a locator is returned that indicates the position of the LU in persistent storage 130, so that the LU can be reloaded into memory 150 at a later time. In the illustrated embodiment, the returned locator is stored in locator data structure 158, where it is associated with an LU identifier, such as the LU name or the name of the first corresponding XML element.
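Writing a dirty LU to a temporary file and recording a locator for later reloading might look like the following Python sketch. The class `TempLUStore`, the tuple layout of the locator (file name, offset, length), and the sample contents are assumptions; the text above only says that a locator may indicate a file name and an offset.

```python
import os
import tempfile


class TempLUStore:
    """Hypothetical stand-in for temporary storage plus a locator table."""

    def __init__(self):
        fd, self.path = tempfile.mkstemp(suffix=".lu")
        os.close(fd)
        self.locators = {}  # LU name -> (file name, offset, length)

    def write_lu(self, name, content):
        """Append the LU bytes to the file and remember where they went."""
        offset = os.path.getsize(self.path)  # append at the current end of file
        with open(self.path, "ab") as f:
            f.write(content)
        self.locators[name] = (self.path, offset, len(content))
        return self.locators[name]

    def read_lu(self, name):
        """Reload the LU bytes using the stored locator."""
        path, offset, length = self.locators[name]
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(length)

    def close(self):
        os.remove(self.path)


store = TempLUStore()
store.write_lu("LU-A", b"<doc id='1'/>")
loc_b = store.write_lu("LU-B", b"<a>payload</a>")
reloaded = store.read_lu("LU-B")
store.close()
```

Storing the length alongside the offset is a convenience of this sketch so that each LU can be read back independently of its neighbors in the file.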
In step 256, the memory allocated to the selected LU data structure 152 in fast memory 150 is released (de-allocated) and becomes available for a different LU data structure. In some embodiments, this step includes deleting the usage information for the selected LU data structure from usage data structure 156. Control then returns to step 230 to determine whether the condition for releasing memory is still satisfied.
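Steps 230 through 256 can be combined into one loop, as in the Python sketch below. The class `LUCache`, its method names, the in-memory dictionary that stands in for persistent storage, and the sizes used for LU-C and LU-D are all assumptions for illustration; this is not the patented implementation.

```python
from collections import OrderedDict


class LUCache:
    """Hypothetical sketch of the unload loop (steps 230 to 256)."""

    def __init__(self, threshold_mb):
        self.threshold_mb = threshold_mb
        self.resident = {}               # LU name -> content, size, count, dirty
        self.unloadable = OrderedDict()  # LU names with COUNT == 0, oldest first
        self.persistent = {}             # stand-in for persistent storage
        self.locators = {}               # LU name -> key in persistent storage

    def load(self, name, content, size_mb, dirty):
        self.resident[name] = {"content": content, "size": size_mb,
                               "count": 1, "dirty": dirty}

    def done(self, name):
        lu = self.resident[name]
        lu["count"] -= 1
        if lu["count"] == 0:
            self.unloadable[name] = None

    def used_mb(self):
        return sum(lu["size"] for lu in self.resident.values())

    def reclaim(self):
        unloaded = []
        # Step 230: keep going while the release condition holds.
        while self.used_mb() > self.threshold_mb and self.unloadable:
            # Step 240: the oldest idle LU is the least recently used.
            name, _ = self.unloadable.popitem(last=False)
            lu = self.resident[name]
            if lu["dirty"]:                            # step 252
                self.persistent[name] = lu["content"]  # step 254
                self.locators[name] = name
            del self.resident[name]                    # step 256
            unloaded.append(name)
        return unloaded


cache = LUCache(threshold_mb=1.5)
cache.load("LU-A", b"<doc>", 0.01, dirty=True)
cache.load("LU-B", b"<a>", 0.1, dirty=True)
cache.load("LU-C", b"<c/>", 0.9, dirty=True)
cache.load("LU-D", b"<d/>", 0.6, dirty=False)
cache.done("LU-C")
cache.done("LU-D")
unloaded = cache.reclaim()
```

In this run only LU-C is unloaded: it is the oldest idle LU, it is dirty so it is written out first, and removing it brings usage back under the threshold.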
These techniques allow a device with limited resources (for example, limited fast memory) to scale up to process arbitrarily large documents. This capability is an advantage when first inserting a large XML document, which cannot be fully manifested in available memory, as multiple loadable units into a database or other persistent store. It is also an advantage after the different portions of a large XML document have been separately inserted into persistent storage, if an operation involves more portions than can be manifested in memory at one time, for example when an entire XML document is processed to apply the styles specified in an XML stylesheet (XSL) document. Illustrative embodiments of both cases are described in the following two sections.
Inserting an XML document into a database
To illustrate this case, assume that in step 202 the object-relational structures 144 in the database are created or modified according to the XML schema in XML document type data structure 102. Assume also that an XML document 110, received from some external source (for example, over a communication channel or from a removable optical disk), is to be inserted into the database. Assume further that server fast memory 150 has been allocated for processing document 110. Assume also that the COUNT, SIZE, and DIRTY fields of usage data structure 156 are included in each LU data structure, and that the part of usage data structure 156 outside the LU data structures 152 consists of the list of unloadable units, which indicates the LU data structures whose COUNT value is 0. An LU data structure is identified in the list by the memory address of the first byte of the LU data structure 152. Assume also that the fast memory 150 allocated to the document has a capacity of 2 megabytes (2 MB) and that the threshold for unloading loadable units is 1.5 MB.
In step 210, the server 120 creates an LU data structure 152a, hereinafter called "LU-A", to store the highest node in the XML hierarchy, the document-level node of document 110. The COUNT field is initialized to the value 1. For purposes of illustration, assume that the DIRTY field is initialized to the value 1 to indicate a dirty LU data structure. The LU data structure LU-A 152a is dirty because its contents have not been stored persistently as an LU. The SIZE field is initialized to the minimum size for a document-level LU, which includes enough space for the values of the attributes and for locators for the minimum number of elements 112a, 112b, and so on, expected according to the schema, up to the end of document 110. For purposes of illustration, assume that the SIZE of LU-A is 0.01 MB. The server 120 processes the first few lines of XML document 110 and loads the attribute values of the document into LU-A 152a. Then, before the locators for the LUs associated with elements 112a, 112b, and so on can be determined, the server proceeds to the lines of the XML document that begin element 112a. Therefore the loading of LU-A is not finished, and its COUNT remains set at 1.
In step 220, it is determined that the total memory used by the LU data structures 152 is the SIZE of LU-A. The unloadable-unit list in the usage data structure 156 is empty. In step 230, it is determined that the amount of memory in use (0.01 MB) does not exceed the threshold of 1.5 MB, and control passes back to step 210 to begin loading the next LU.
On the next pass through step 210, the server 120 creates a second loadable-unit data structure 152b, hereinafter called "LU-B", to store the nodes associated with element 112a. The COUNT and DIRTY fields are each initialized to the value 1. The SIZE field is initialized to the minimum size of an LU for an element 112, which includes enough space for the values of the attributes and for locators for the minimum number of elements 114a, 114b, 115, and so on, expected according to the schema, up to the end of element 112a. For purposes of illustration, assume that the SIZE of LU-B is 0.1 MB. The server 120 processes the first few lines of element 112a and loads the attribute values of the element into LU-B 152b. Then, before the locators for the LUs associated with elements 114a, 114b, and 115 can be determined, the server proceeds to the lines of the XML document that begin element 114a. Therefore the loading of LU-B is not finished, and its COUNT remains set at 1.
On the next pass through step 220, it is determined that the total memory used by the LU data structures 152 is the sum of the SIZE values of LU-A and LU-B. The unloadable-unit list in the usage data structure 156 is empty. On the next pass through step 230, it is determined that the amount of memory in use (0.11 MB) does not exceed the threshold of 1.5 MB, and control returns to step 210 to begin loading the next LU.
This process continues with the subsequent LUs associated with the child elements of 112a, including 114a, 114b, and 115. For purposes of illustration, assume that elements 114a, 114b, and 115 contain no child elements, and that the SIZE field values of the LUs for these three elements are 0.2 MB, 0.2 MB, and 1.1 MB, respectively. Assume also that, because of the presence of elements 114b and 115, extra locators have been added to LU-B, so that the SIZE field value of LU-B has increased to 0.11 MB. As each element is completely loaded into its LU data structure in memory 150 (LU-C, LU-D, and LU-E, respectively; not shown), the COUNT field value of that LU data structure is decremented to 0, and the addresses of the three LU data structures are added to the unloadable-unit list in the usage data structure 156.
On the next pass through step 220, it is determined that the total memory used by the LU data structures 152 is the sum of the SIZE field values of LU-A, LU-B, LU-C, LU-D, and LU-E. The unloadable-unit list in the usage data structure 156 contains the memory addresses of LU-C, LU-D, and LU-E. On the next pass through step 230, it is determined that the amount of memory in use (1.62 MB) does exceed the threshold of 1.5 MB, and control passes to step 240 to select an LU data structure 152 to unload from memory 150.
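The check made in steps 220 and 230 amounts to summing the SIZE fields and comparing the total with the threshold. The following Python sketch reproduces the figures of this example; the function names are hypothetical, and sizes are expressed in whole kilobytes (assuming 1 MB = 1000 KB) to avoid floating-point rounding.

```python
THRESHOLD_KB = 1500  # the 1.5 MB threshold of the example

# SIZE of each LU data structure resident in memory, in KB
sizes_kb = {"LU-A": 10, "LU-B": 110, "LU-C": 200, "LU-D": 200, "LU-E": 1100}


def total_kb(sizes):
    """Step 220: total memory used by all resident LU data structures."""
    return sum(sizes.values())


def must_unload(sizes, threshold_kb=THRESHOLD_KB):
    """Step 230: the release condition holds when the total exceeds the threshold."""
    return total_kb(sizes) > threshold_kb


print(total_kb(sizes_kb), must_unload(sizes_kb))  # 1620 True
```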
In step 240, the least recently used LU data structure in the unloadable-unit list is selected. Of the LU data structures in the list, LU-C (corresponding to element 114a) is the least recently used. Therefore LU-C is selected as the LU data structure to unload. In other embodiments, other selection criteria may be used. For example, the LU data structure with the largest SIZE value, LU-E (1.1 MB), could be selected. The appropriate choice depends on how the system is used. The most recently used LU is considered the most likely to be used again, and the least recently used LU the least likely to be used again. Selecting the least recently used LU therefore avoids unloading an LU that is more likely to be loaded again.
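The two selection criteria mentioned for step 240 can be compared with a short Python sketch. It is illustrative only: the Unit tuple and the last_used field (a logical timestamp, smaller meaning older) are assumptions introduced here, since the text does not say how recency is recorded.

```python
from collections import namedtuple

# last_used is a hypothetical logical clock value; smaller means older
Unit = namedtuple("Unit", "name size_kb last_used")

unloadable = [
    Unit("LU-C", 200, 1),
    Unit("LU-D", 200, 2),
    Unit("LU-E", 1100, 3),
]


def pick_lru(units):
    """Select the least recently used unit."""
    return min(units, key=lambda u: u.last_used)


def pick_largest(units):
    """Alternative criterion: select the unit with the largest SIZE."""
    return max(units, key=lambda u: u.size_kb)


print(pick_lru(unloadable).name)      # LU-C
print(pick_largest(unloadable).name)  # LU-E
```

Choosing by size frees the most memory per unload, while choosing by recency tends to reduce the number of reloads.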
In step 252, it is determined whether LU-C is dirty. Because the value of the DIRTY field indicates that LU-C is dirty, control passes to step 254. In step 254, LU-C is written to the object-relational structures 144 of the database, and in the process a locator for LU-C, referred to herein as "L-C", is returned. The server 120 writes the value "L-C" into the locator data structure 158 in association with the identifier of the LU formed for element 114a. Any LU data structure 152 in memory that holds an unresolved reference to element 114a as an attribute resolves that reference using the locator "L-C". Any LU data structure that thereby receives its last unresolved locator has its COUNT field value decremented. If the COUNT value associated with an LU data structure reaches 0, that LU data structure is added to the unloadable-unit list. In step 256, the memory allocated to LU-C is released so that it can be allocated to another LU data structure. Also in step 256, the usage information for LU-C in the usage data structure 156 is deleted.
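Steps 252 through 256 for a dirty LU can be sketched as follows. This Python fragment is a simplified illustration, not the implementation of the embodiment: the LU class, the pending set of unresolved child references, the dictionary used as a persistent store, and the way the locator string is derived are all assumptions made for the example.

```python
class LU:
    def __init__(self, name, count=0, dirty=True, pending=()):
        self.name = name
        self.count = count           # COUNT: outstanding operations
        self.dirty = dirty           # DIRTY: not yet in persistent storage
        self.pending = set(pending)  # children whose locators are unresolved
        self.locators = {}           # child name -> locator


def unload(lu, resident, store, unloadable):
    """Sketch of steps 252-256 for one selected LU."""
    if lu.dirty:                                 # step 252
        locator = lu.name.replace("LU", "L", 1)  # e.g. "LU-C" -> "L-C"
        store[locator] = lu                      # step 254: write out
        lu.dirty = False
        for other in resident.values():
            if lu.name in other.pending:
                other.pending.discard(lu.name)
                other.locators[lu.name] = locator
                if not other.pending:            # last unresolved locator
                    other.count -= 1
                    if other.count == 0:
                        unloadable.append(other.name)
    del resident[lu.name]                        # step 256: free the memory
    unloadable.remove(lu.name)                   # and drop its usage info


lu_b = LU("LU-B", count=1, pending={"LU-C", "LU-D", "LU-E"})
lu_c = LU("LU-C")
resident = {"LU-B": lu_b, "LU-C": lu_c}
store, unloadable = {}, ["LU-C"]
unload(lu_c, resident, store, unloadable)
```

After the call, LU-B holds the locator "L-C" but still waits on two other children, so its COUNT stays at 1 and it is not yet unloadable.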
Control then returns to step 230 to determine whether memory usage still exceeds the threshold. The total memory used by the LU data structures 152 is the sum of the SIZE field values of LU-A, LU-B, LU-D, and LU-E (without LU-C). The unloadable-unit list in the usage data structure 156 contains the memory addresses of LU-D and LU-E (without LU-C). It is determined that this amount of memory (1.42 MB) does not exceed the threshold of 1.5 MB, and control returns to step 210 to load the next loadable unit into memory 150.
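The loop through steps 230, 240, and 256 can be summarized in a few lines of Python. The sketch below is a simplification under stated assumptions: it omits the dirty check and write of steps 252 and 254, treats the unloadable-unit list as ordered from least to most recently used, and uses whole kilobytes with 1 MB = 1000 KB. The function name is hypothetical.

```python
def unload_until_under(resident, unloadable, threshold_kb):
    """Unload LUs, oldest first, until usage no longer exceeds the threshold.

    resident:   dict mapping LU name -> SIZE in KB
    unloadable: list of LU names with COUNT == 0, least recently used first
    """
    unloaded = []
    while sum(resident.values()) > threshold_kb and unloadable:
        victim = unloadable.pop(0)  # step 240: least recently used
        del resident[victim]        # step 256: release its memory
        unloaded.append(victim)
    return unloaded


resident = {"LU-A": 10, "LU-B": 110, "LU-C": 200, "LU-D": 200, "LU-E": 1100}
unloadable = ["LU-C", "LU-D", "LU-E"]
unloaded = unload_until_under(resident, unloadable, threshold_kb=1500)
print(unloaded, sum(resident.values()))  # ['LU-C'] 1420
```

The loop also stops when the unloadable list is empty, because LUs with outstanding operations cannot be released.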
Thus an XML document of arbitrary size can be inserted into the database by a server that has only a limited amount of fast memory allocated to the document.
In some embodiments, the XML document is a transient document that is used only for a short time and is not stored persistently in the database. In such embodiments the steps are similar, except that in step 254 the LU is written to a data structure in temporary storage instead of to persistent storage. In one such embodiment, the temporary storage data structure is a LOB file with byte offsets at which the individual LUs begin, and the mapping 142 is still stored in the database.
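One way to picture a temporary store addressed by byte offsets is the following Python sketch. It is only an analogy: an in-memory io.BytesIO buffer stands in for the LOB file, the TempLob class is hypothetical, and the (offset, length) pair used as a locator is an assumption, since the text states only that the file records the byte offsets at which the LUs begin.

```python
import io


class TempLob:
    """Append-only byte store; a locator is an (offset, length) pair."""

    def __init__(self):
        self._buf = io.BytesIO()  # stand-in for a temporary LOB file

    def append(self, payload: bytes):
        """Write one serialized LU at the end and return its locator."""
        offset = self._buf.seek(0, io.SEEK_END)
        self._buf.write(payload)
        return offset, len(payload)

    def read(self, locator):
        """Read back the LU that the locator points to."""
        offset, length = locator
        self._buf.seek(offset)
        return self._buf.read(length)


lob = TempLob()
loc_c = lob.append(b"<item id='114a'/>")
loc_d = lob.append(b"<item id='114b'/>")
print(loc_d, lob.read(loc_d))  # (17, 17) b"<item id='114b'/>"
```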
Processing an XML document from a database
To illustrate this case, the assumptions of the previous case are reused, except that XML document 110 is not obtained from an external source; it already resides in the database. Therefore each LU that serves as a parent LU already has locators defined for its child LUs. Assume also that the entire XML document is to be operated on to form a presentation styled according to an XSL document, and that the styled result is to be sent to a display device.
In step 210, the server 120 creates the first loadable-unit data structure 152a, called "LU-A", to store the highest node in the XML hierarchy, the document-level node of document 110. The COUNT field is initialized to the value 1. The DIRTY field is initialized to the value 0 to indicate an LU data structure that is not dirty. The LU data structure is not dirty because its contents were obtained from persistent storage in the database and have not been changed. The SIZE field is initialized to the actual size of the document-level LU. The server 120 styles the first few attributes of XML document 110 according to the XSL document and sends the result to the destination display device. The server must then begin styling element 112a before the entire document has been styled. Therefore the styling of LU-A is not finished, and its COUNT remains set at 1.
As above, in step 220 it is determined that the total memory used by the LU data structures 152 is the SIZE of LU-A. The unloadable-unit list in the usage data structure 156 is empty. In step 230, it is determined that the amount of memory in use (0.01 MB) does not exceed the threshold of 1.5 MB, and control returns to step 210 to begin loading the next LU.
On the next pass through step 210, the server 120 creates the second loadable-unit data structure 152b, called "LU-B", to store the nodes associated with element 112a. The COUNT and DIRTY fields are initialized to 1 and 0, respectively. The SIZE field is initialized to the actual size of the LU for element 112. For purposes of illustration, assume that the SIZE of LU-B is 0.11 MB. The server 120 styles the first few attributes of element 112a and sends the result to the destination display device. Before it can finish styling element 112a, the server must style the contents that correspond to element 114a. Therefore the styling of LU-B is not finished, and its COUNT remains set at 1.
On the next pass through step 220, it is determined that the total memory used by the LU data structures 152 is the sum of the SIZE values of LU-A and LU-B. The unloadable-unit list in the usage data structure 156 is empty. On the next pass through step 230, it is determined that the amount of memory in use (0.12 MB) does not exceed the threshold of 1.5 MB, and control returns to step 210 to begin loading the next LU.
Processing continues with the subsequent LUs associated with the child elements of 112a, including 114a, 114b, and 115. As above, for purposes of illustration, assume that elements 114a, 114b, and 115 contain no child elements, and that the SIZE field values for the three elements are 0.2 MB, 0.2 MB, and 1.1 MB, respectively. As each is completely loaded into its LU data structure in memory (LU-C, LU-D, and LU-E, respectively; not shown), its COUNT is incremented to 1; when styling begins, COUNT is incremented again to 2. When styling finishes, COUNT is decremented to 1, and when the result has been sent to the destination display device, COUNT is decremented again to 0. As the COUNT for each of the three LU data structures drops to 0, the address of each is added to the unloadable-unit list in the usage data structure 156.
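The rise and fall of COUNT described here behaves like a reference count. The Python sketch below models it with a context manager; the LU class, the in_use helper, and the nesting of the two scopes are assumptions chosen for illustration and are not taken from the specification.

```python
from contextlib import contextmanager


class LU:
    def __init__(self, name):
        self.name = name
        self.count = 0  # COUNT: operations not yet completed on this LU


@contextmanager
def in_use(lu, unloadable):
    """Hold one outstanding operation on an LU for the duration of a block."""
    lu.count += 1
    try:
        yield lu
    finally:
        lu.count -= 1
        if lu.count == 0:  # nothing outstanding: the LU may be unloaded
            unloadable.append(lu.name)


unloadable = []
lu_c = LU("LU-C")
history = []
with in_use(lu_c, unloadable):      # loaded, result not yet sent: COUNT 1
    history.append(lu_c.count)
    with in_use(lu_c, unloadable):  # styling in progress: COUNT 2
        history.append(lu_c.count)
    history.append(lu_c.count)      # styling finished: COUNT 1
# result sent to the display device: COUNT 0, LU-C becomes unloadable
print(history, lu_c.count, unloadable)  # [1, 2, 1] 0 ['LU-C']
```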
On the next pass through step 220, it is determined that the total memory used by the LU data structures 152 is the sum of the SIZE field values of LU-A, LU-B, LU-C, LU-D, and LU-E. The unloadable-unit list in the usage data structure 156 contains the memory addresses of LU-C, LU-D, and LU-E. On the next pass through step 230, it is determined that the amount of memory in use (1.62 MB) does exceed the threshold of 1.5 MB, and control passes to step 240 to select an LU data structure 152 to unload from memory 150.
In step 240, the least recently used LU data structure in the unloadable-unit list is selected. Of the LU data structures in the list, LU-C (corresponding to element 114a) is the least recently used. Therefore LU-C is selected as the LU data structure to unload.
In step 252, it is determined whether LU-C is dirty. Because the DIRTY field value indicates that LU-C is not dirty, control passes directly to step 256. In step 256, the memory allocated to LU-C is released so that it can be allocated to another LU data structure, and the usage information for LU-C in the usage data structure 156 is deleted.
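The difference from the insertion case is that a clean LU skips the write of step 254. A minimal Python sketch of that branch follows; the release function, the set of dirty names, and the dictionaries standing in for memory and for the persistent store are hypothetical.

```python
def release(name, resident, dirty, store):
    """Free one LU; write it out first only if it is dirty."""
    wrote = False
    if name in dirty:                 # step 252: is the LU dirty?
        store[name] = resident[name]  # step 254: write to persistent storage
        dirty.discard(name)
        wrote = True
    del resident[name]                # step 256: release the memory
    return wrote


resident = {"LU-C": b"<item/>"}
dirty = set()  # LU-C was read from the database and has not been changed
store = {}
wrote = release("LU-C", resident, dirty, store)
print(wrote, store, resident)  # False {} {}
```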
Control then returns to step 230 to determine whether memory usage still exceeds the threshold. It is determined that the total memory used by the LU data structures 152 is the sum of the SIZE field values of LU-A, LU-B, LU-D, and LU-E (without LU-C). The unloadable-unit list in the usage data structure 156 contains the memory addresses of LU-D and LU-E (without LU-C). On this pass through step 230, it is determined that the amount of memory in use (1.42 MB) does not exceed the threshold of 1.5 MB, and control returns to step 210 to load the next loadable unit into memory 150.
Thus an XML document of arbitrary size from the database can be processed by a server that has only a limited amount of fast memory allocated to the document.
Hardware Overview
FIG. 3 is a block diagram that illustrates a computer system 300 upon which an embodiment of the invention may be implemented. Computer system 300 includes a bus 302 or other communication mechanism for communicating information, and a processor 304 coupled with bus 302 for processing information. Computer system 300 also includes a main memory 306, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 302 for storing information and instructions to be executed by processor 304. Main memory 306 may also be used for storing temporary variables or other intermediate information during execution of instructions by processor 304. Computer system 300 further includes a read-only memory (ROM) 308 or other static storage device coupled to bus 302 for storing static information and instructions for processor 304. A storage device 310, such as a magnetic disk or optical disk, is coupled to bus 302 for storing information and instructions.
Computer system 300 may be coupled via bus 302 to a display 312, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 314, including alphanumeric and other keys, is coupled to bus 302 for communicating information and command selections to processor 304. Another type of user input device is cursor control 316, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor 304 and for controlling cursor movement on display 312. This input device typically has two degrees of freedom in two axes, a first axis (for example, x) and a second axis (for example, y), which allows the device to specify positions in a plane.
The invention is related to the use of computer system 300 for implementing the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 300 in response to processor 304 executing one or more sequences of one or more instructions contained in main memory 306. Such instructions may be read into main memory 306 from another computer-readable medium, such as storage device 310. Execution of the sequences of instructions contained in main memory 306 causes processor 304 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.
The term "computer-readable medium" as used herein refers to any medium that participates in providing instructions to processor 304 for execution. Such a medium may take many forms, including but not limited to non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as storage device 310. Volatile media include dynamic memory, such as main memory 306. Transmission media include coaxial cables, copper wire, and fiber optics, including the wires that make up bus 302. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infrared data communications.
Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, or any other magnetic medium; a CD-ROM or any other optical medium; punch cards, paper tape, or any other physical medium with patterns of holes; a RAM, a PROM, an EPROM, a FLASH-EPROM, or any other memory chip or cartridge; a carrier wave as described hereinafter; or any other medium from which a computer can read.
Various forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to processor 304 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send them over a telephone line using a modem. A modem local to computer system 300 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector can receive the data carried in the infrared signal, and appropriate circuitry can place the data on bus 302. Bus 302 carries the data to main memory 306, from which processor 304 retrieves and executes the instructions. The instructions received by main memory 306 may optionally be stored on storage device 310 either before or after execution by processor 304.
Computer system 300 also includes a communication interface 318 coupled to bus 302. Communication interface 318 provides a two-way data communication coupling to a network link 320 that is connected to a local network 322. For example, communication interface 318 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 318 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 318 sends and receives electrical, electromagnetic, or optical signals that carry digital data streams representing various types of information.
Network link 320 typically provides data communication through one or more networks to other data devices. For example, network link 320 may provide a connection through local network 322 to a host computer 324 or to data equipment operated by an Internet Service Provider (ISP) 326. ISP 326 in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the "Internet" 328. Local network 322 and Internet 328 both use electrical, electromagnetic, or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 320 and through communication interface 318, which carry the digital data to and from computer system 300, are exemplary forms of carrier waves transporting the information.
Computer system 300 can send messages and receive data, including program code, through the network(s), network link 320, and communication interface 318. In the Internet example, a server 330 might transmit requested code for an application program through Internet 328, ISP 326, local network 322, and communication interface 318.
The received code may be executed by processor 304 as it is received, and/or stored in storage device 310 or other non-volatile storage for later execution. In this manner, computer system 300 may obtain application code in the form of a carrier wave.
In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims (19)

1. A method for accessing, by a device that uses device resources of a limited resource amount, data residing in a document on a computer-readable medium, the document having content structured according to a markup language, the method comprising the steps of:
for each portion of a plurality of portions of the document, determining a usage that indicates how much of the portion is to be accessed;
wherein each portion is based on one or more constructs of the markup language;
based on the usage, selecting a particular portion of the document to cease consuming the device resources; and
releasing the device resources consumed by the particular portion.
2. The method according to claim 1, wherein the step of selecting the particular portion comprises selecting, from among a plurality of portions that consume the device resources, the particular portion that is least recently used.
3. The method according to claim 1, wherein:
the method further comprises the step of determining whether a condition for releasing resources is satisfied; and
the step of selecting the particular portion is performed only when it is determined that the condition for releasing resources is satisfied.
4. The method according to claim 3, wherein:
the step of determining usage further comprises determining a total resource usage by all portions of the document that consume the device resources; and
the condition for releasing resources comprises the total resource usage exceeding a threshold resource amount that is less than the limited resource amount.
5. The method according to claim 1, wherein:
the step of determining usage further comprises determining a number of uncompleted operations by the device on each portion that consumes the device resources; and
the step of selecting the particular portion comprises determining that the number of uncompleted operations on the particular portion is less than a minimum number.
6. The method according to claim 5, wherein the minimum number is 1.
7. The method according to claim 5, wherein the step of selecting the particular portion further comprises the step of selecting the least recently used particular portion from among a plurality of portions for which the number of uncompleted operations is less than the minimum number.
8. The method according to claim 1, wherein the step of releasing the device resources consumed by the particular portion comprises the steps of:
determining whether the content of the particular portion resides in persistent storage separately from a different portion of the document; and
if it is determined that the content does not reside in persistent storage separately from the different portion, then writing the content to persistent storage separately from the different portion before releasing the device resources consumed by the particular portion.
9. The method according to claim 8, wherein the step of releasing the device resources consumed by the particular portion further comprises: if it is determined that the content resides in persistent storage separately from the different portion, performing the step of releasing the device resources consumed by the particular portion without writing the content to persistent storage.
10. The method according to claim 8, wherein the persistent storage is a file in a file system.
11. The method according to claim 8, wherein the persistent storage is a database object in a database system.
12. The method according to claim 1, further comprising the steps of:
determining hierarchical elements of the document from a type definition document associated with the document; and
determining the plurality of portions of the document according to the hierarchical elements.
13. The method according to claim 1, wherein the document is an Extensible Markup Language (XML) document.
14. The method according to claim 12, wherein the document is an XML document and the type definition document is a Document Type Definition (DTD) document.
15. The method according to claim 12, wherein the document is an XML document and the type definition document is an XML schema document.
16. The method according to claim 8, wherein the step of releasing the device resources further comprises the step of returning a reference to the particular portion in the persistent storage.
17. A method for performing an operation on a document having content structured according to a markup language, the method comprising the steps of:
determining that the operation involves a plurality of portions of the document, including a first set of one or more portions and a second set of one or more portions;
during execution of the operation, performing the steps of:
loading the first set of one or more portions of the document into volatile memory;
before the operation is completed, selecting at least one portion of the first set of portions to cease consuming volatile memory; and
before the operation is completed and after the at least one portion is selected, releasing the volatile memory that holds the at least one portion, so that the second set of one or more portions of the document can be loaded into volatile memory.
18. The method according to claim 17, wherein:
the document is an Extensible Markup Language (XML) document having a size that exceeds the volatile memory of a computer device;
the operation comprises loading the document into the volatile memory of the computer device by receiving a data stream that represents the XML document into the volatile memory of the computer device;
the step of selecting at least one portion comprises determining a particular portion of the plurality of portions of the document according to one or more XML constructs;
the method further comprises the steps of:
before all of the XML document has been received into the volatile memory, storing the particular portion separately in persistent storage; and
in the volatile memory, associating a locator for the particular portion with an XML construct that corresponds to a parent node of at least one XML construct on which the particular portion is based.
19. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, cause the one or more processors to perform the method recited in any one of claims 1 to 18.
CNB2003801027567A 2002-11-06 2003-11-06 Scalably accessing data in an arbitrarily large document Expired - Lifetime CN100432993C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US42454302P 2002-11-06 2002-11-06
US60/424,543 2002-11-06
US10/306,130 2002-11-26

Publications (2)

Publication Number Publication Date
CN1711534A true CN1711534A (en) 2005-12-21
CN100432993C CN100432993C (en) 2008-11-12

Family

ID=35707232

Family Applications (3)

Application Number Title Priority Date Filing Date
CNB2003801044295A Expired - Lifetime CN100351791C (en) 2002-11-06 2003-11-06 Techniques for supporting application-specific access controls with a separate server
CNB2003801027567A Expired - Lifetime CN100432993C (en) 2002-11-06 2003-11-06 Scalably accessing data in an arbitrarily large document
CNB2003801071860A Expired - Lifetime CN100429654C (en) 2002-11-06 2003-11-06 Techniques for managing multiple hierarchies of data from a single interface

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CNB2003801044295A Expired - Lifetime CN100351791C (en) 2002-11-06 2003-11-06 Techniques for supporting application-specific access controls with a separate server

Family Applications After (1)

Application Number Title Priority Date Filing Date
CNB2003801071860A Expired - Lifetime CN100429654C (en) 2002-11-06 2003-11-06 Techniques for managing multiple hierarchies of data from a single interface

Country Status (1)

Country Link
CN (3) CN100351791C (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186636A (en) * 2011-12-31 2013-07-03 北大方正集团有限公司 Method and system for loading readable file in mobile equipment
CN103208136A (en) * 2012-07-06 2013-07-17 北京中盈高科信息技术有限公司 Three dimensional image processing method and electronic device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103207813B (en) * 2012-01-11 2018-08-14 华为技术有限公司 The method and apparatus for managing resource
JP6645508B2 (en) * 2015-11-04 2020-02-14 富士通株式会社 Structure analysis method and structure analysis program

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5696898A (en) * 1995-06-06 1997-12-09 Lucent Technologies Inc. System and method for database access control
GB2319705B (en) * 1996-11-21 2001-01-24 Motorola Ltd Arrangement for encryption/decryption of data and data carrier incorporating same
US6192476B1 (en) * 1997-12-11 2001-02-20 Sun Microsystems, Inc. Controlling access to a resource
US6449652B1 (en) * 1999-01-04 2002-09-10 Emc Corporation Method and apparatus for providing secure access to a computer system resource
US6721727B2 (en) * 1999-12-02 2004-04-13 International Business Machines Corporation XML documents stored as column data
US20020056025A1 (en) * 2000-11-07 2002-05-09 Qiu Chaoxin C. Systems and methods for management of memory
US6542911B2 (en) * 2001-03-01 2003-04-01 Sun Microsystems, Inc. Method and apparatus for freeing memory from an extensible markup language document object model tree active in an application cache
CN1313950C (en) * 2001-11-29 2007-05-02 上海复旦光华信息科技股份有限公司 Centralized domain user authorization and management system

Also Published As

Publication number Publication date
CN100351791C (en) 2007-11-28
CN1717656A (en) 2006-01-04
CN100432993C (en) 2008-11-12
CN100429654C (en) 2008-10-29
CN1729467A (en) 2006-02-01

Similar Documents

Publication Publication Date Title
EP1559035B1 (en) Scalable access to data in an arbitrarily large document
AU2007254441B2 (en) Efficient piece-wise updates of binary encoded XML data
US8156149B2 (en) Composite nested streams
US7873899B2 (en) Mapping schemes for creating and storing electronic documents
US20050050092A1 (en) Direct loading of semistructured data
AU2006304109B2 (en) Managing relationships between resources stored within a repository
CN101819596B (en) Memory-based XML script buffer
EP1647905A1 (en) Method and system for mapping of XML schema data into relational data structures
US8209361B2 (en) Techniques for efficient and scalable processing of complex sets of XML schemas
JP4787617B2 (en) Techniques for supporting application-specific access control using separate servers
CN102314506B (en) Based on the distributed buffering district management method of dynamic index
CN102419838B (en) The service of project information after merging is provided
CN102243660A (en) Data access method and device
JP5011311B2 (en) Method and mechanism for loading an XML document into memory
KR20110010736A (en) Paging hierarchical data
US20120224482A1 (en) Credit feedback system for parallel data flow control
US11977548B2 (en) Allocating partitions for executing operations of a query
US8538980B1 (en) Accessing forms using a metadata registry
CN1853167A (en) System and method for dynamic content processing with extendable provisioning
JP6754696B2 (en) Systems and methods to support data type conversion in heterogeneous computing environments
CN1711534A (en) Scalably accessing data in an arbitrarily large document
US11249916B2 (en) Single producer single consumer buffering in database systems
CN101989280A (en) Method and system for managing configuration resources
CN116893788A (en) Metadata processing method, hardware acceleration network card, system and readable storage medium
CN104298562A (en) Resource management method and system for game development

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CX01 Expiry of patent term

Granted publication date: 20081112
