CN103544262A - XML-based stream page release method and system - Google Patents


Info

Publication number
CN103544262A
Authority
CN
China
Prior art keywords
xml document
document
size
xml
reconstruction processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310484727.7A
Other languages
Chinese (zh)
Other versions
CN103544262B (en)
Inventor
王冬雪
麻锐
孟利民
王辉
张标标
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yinjiang Technology Co.,Ltd.
Original Assignee
Enjoyor Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Enjoyor Co Ltd
Priority to CN201310484727.7A
Publication of CN103544262A
Application granted
Publication of CN103544262B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/84 Mapping; Conversion
    • G06F16/88 Mark-up to mark-up conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986 Document structures and storage, e.g. HTML extensions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses an XML-based stream page release method and system. The method comprises: (1) streaming an XML input document that meets a predetermined segmenting condition, i.e. segmenting and reconstructing it, and repeating the streaming where necessary; (2) fast paging an XML input document that meets a predetermined partitioning condition, i.e. applying multiple rounds of binary-tree partitioning and reconstruction; (3) converting the input document, according to a conversion style sheet provided by a terminal device, into documents of other standard formats for output; (4) transmitting the documents of the different standard formats to the corresponding terminal devices. The system comprises a streaming unit, a fast paging unit, an XSLT converter and a release server; the streaming unit comprises a segmenter and a reconstructor, and the fast paging unit comprises a partitioner and a reconstructor. The method and system are applicable to very large XML documents, improve the reliability and fault tolerance of conversion, and are flexible and widely applicable.

Description

XML-based stream page release method and system
Technical field
The present invention relates to an XML-based paging release method and system.
Background
With the rapid development of information technology, more and more enterprises and institutions need to handle massive data, such as the medical data of hospitals, the traffic data of transport departments, the power data of electricity authorities, the planning data of planning bureaus, the hydrological and water-conservancy data of water authorities, and the meteorological data of weather bureaus. These data are often stored on servers as XML, and users obtain the data simply by accessing the documents on the server. However, when users access those documents through different terminal devices such as PCs, handheld devices and smartphones, the documents must be format-converted before the data can be received and displayed correctly, because terminals differ in display format, software system and reading format. At present, the main tools for XML document format conversion are DOM, SAX and XSLT. XSLT is among the most popular XML format conversion technologies today; it is powerful and its working principle is fairly simple, as shown in Fig. 1.
During conversion, the XML source document must first be parsed into a DOM tree held in memory, so an oversized document inevitably causes memory overflow. As a result, users reading large data on terminal devices such as PCs, handheld devices and smartphones often cannot correctly receive and display the data, because memory is insufficient or the screen is too small.
Moreover, conventional paging implements only the function of a segmenter, i.e. it segments the input document iteratively, so none of the resulting small XML documents is well-formed. The subsequent conversion steps are therefore not independent of one another, and reliability and fault tolerance are poor; in addition, iterative processing greatly reduces segmentation speed.
Summary of the invention
To overcome the shortcomings of existing XML-based paging release methods and systems, which are unsuitable for oversized XML documents and for settings with high demands on conversion reliability, fault tolerance, flexibility and applicability, the invention provides an XML-based stream page release method and system suited to such settings.
The technical solution adopted by the invention to solve the technical problem is:
An XML-based stream page release method, comprising the following steps:
(1) Streaming process:
For each large input XML document, the streaming unit first checks its size. If the document size does not exceed the predefined segment read threshold, i.e. $T_s \le T_m$, the document proceeds to step (2). Otherwise, if $T_s > T_m$, the streaming unit segments and reconstructs the document, producing two well-formed XML documents: one of size $T_m$ and one of size $T_s - T_m$. The former is passed to step (2); the latter is returned to the streaming unit to be checked, segmented and reconstructed again;
(2) Fast paging process:
If the size of an XML document $F_{s0,1}$ far exceeds the memory limit $T$ of the terminal device, i.e. $T_{s0,1} \gg T$, a first round of partitioning and reconstruction is applied to $F_{s0,1}$, producing two new well-formed XML documents $F_{s1,1}$ and $F_{s1,2}$. These two documents are then checked and, if needed, processed in a second round: if both still satisfy the partitioning condition, $T_{s1,1} \gg T$ and $T_{s1,2} \gg T$, they are partitioned and reconstructed simultaneously, producing four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$ and $F_{s2,4}$. Checking, partitioning and reconstruction repeat in this way until every XML document produced in some round is no larger than the memory limit of the terminal device, at which point the process ends;
(3) XSLT conversion: using the conversion style sheet provided by the terminal device, the input document is converted into a document in another standard format for output;
(4) Release: the documents in their different standard formats are sent to the corresponding terminal devices.
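The streaming decision in step (1) can be sketched as a loop over document sizes. This is a minimal illustration only: it assumes sizes are plain integers and ignores the small size change that reconstruction introduces, and the function name `stream_segment_sizes` is hypothetical rather than taken from the source.

```python
def stream_segment_sizes(t_s: int, t_m: int) -> list[int]:
    """Return the sizes of the documents that step (1) hands to step (2).

    t_s: size of the input document (T_s)
    t_m: segment read threshold (T_m)
    """
    if t_m <= 0:
        raise ValueError("threshold T_m must be positive")
    sizes = []
    remaining = t_s
    # While the remainder exceeds T_m, split off one segment of size T_m
    # and send the rest back through the streaming unit.
    while remaining > t_m:
        sizes.append(t_m)
        remaining -= t_m
    # The last piece satisfies T_s <= T_m and goes straight to step (2).
    sizes.append(remaining)
    return sizes
```

For a document of size 10 and a threshold of 4, the sketch yields segments of sizes 4, 4 and 2.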
Further, in step (1), the streaming process comprises a segmentation process and a reconstruction process. The segmentation process is as follows:
Suppose there is an XML document $F_s$ of size $T_s$, and the maximum memory available to the streaming unit is $T_m$. If the document is much larger than that maximum, i.e. $T_s \gg T_m$, or equivalently $T_s \approx pT_m$ with $p \gg 1$, the segmenter in the streaming unit segments it in three steps:
First, read the XML document $F_s$;
Second, set the segment read threshold $T_d = T_m$;
Third, perform the segmentation, producing two XML documents that are not well-formed:
1. $F_{s1}$, of size $T_{s1} = T_d = T_m$;
2. $F_{s2}$, of size $T_{s2} = T_s - T_d = T_s - T_m$.
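To see why the two pieces produced by the third step are not well-formed, the following sketch cuts a small document at a fixed offset and checks each piece with Python's standard `xml.etree.ElementTree` parser. It assumes sizes are measured in characters of an in-memory string; `split_at` and `is_well_formed` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def split_at(text: str, t_d: int) -> tuple[str, str]:
    """Cut the document at the segment read threshold T_d."""
    return text[:t_d], text[t_d:]


def is_well_formed(text: str) -> bool:
    """Report whether the text parses as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


doc = "<a><b>1</b><b>2</b></a>"
f_s1, f_s2 = split_at(doc, 9)  # the cut lands inside the end tag "</b>"
```

Here `f_s1` ends with a truncated end tag and `f_s2` begins with its remainder, so neither piece parses on its own, which is the situation the reconstruction process is meant to repair.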
Further, the reconstruction process comprises two stages, preliminary reconstruction and final reconstruction, as follows:
1.1) Read the XML document $F_{s1}$ generated in the third step;
1.2) Move the pointer to the end of the document;
1.3) Search toward the start of the document for the opening delimiter "</" of an end tag, and record its position as $L_1$;
1.4) From $L_1$, search toward the end of the document for the matching closing delimiter ">", and record its position as $L_2$. There are two cases:
If the closing delimiter ">" is found, $L_2$ is its position;
Otherwise, the end tag was truncated by the segmentation. Move the pointer to $L_1$ and repeat step 1.3) to obtain a new $L_1$, then repeat step 1.4) to obtain a new $L_2$; this new $L_2$ is the actual position of the closing delimiter in this case;
1.5) Move the incomplete data caused by the segmentation from the tail of $F_{s1}$ to the head of $F_{s2}$. This yields an $F_{s1}$ with the incomplete data removed and an $F_{s2}$ with the incomplete data added;
1.6) Obtain the tag names of all ancestor nodes left unclosed by the segmentation:
1.6.1) Set a read flag, flag = True; when the length of the value read is less than or equal to 0, set flag = False;
1.6.2) Read the $F_{s1}$ produced in step 1.5) and add the tag name of every node, except empty tags, to a list;
1.6.3) Count the distinct elements in the list and the number of occurrences of each. In a well-formed XML document every start tag is paired with an end tag and every empty tag is self-closed, so the elements with an odd count are, apart from the first element, the names of the ancestor nodes left unclosed by the segmentation. Put the odd-count tag names into a second list, preserving their original order;
1.7) Rebuild the two XML documents $F_{s1}$ and $F_{s2}$ from step 1.5) into well-formed XML documents:
1.7.1) Append the elements of the list from step 1.6.3), in reverse order and as end tags, to the tail of $F_{s1}$;
1.7.2) Prepend the elements of the list from step 1.6.3), except the first element, in original order and as start tags, to the head of $F_{s2}$;
1.7.3) Prepend the first element of the list from step 1.6.3), i.e. the declaration tag name, as a start tag, to the head of the $F_{s2}$ obtained in step 1.7.2).
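Steps 1.2) to 1.5) above amount to finding the last complete end tag in $F_{s1}$ and moving everything after it to $F_{s2}$. A minimal sketch follows, assuming the documents are in-memory strings and that "</" never appears inside comments or CDATA sections; the function name and the handling of a segment with no end tag at all are assumptions, not part of the source.

```python
def move_incomplete_tail(f_s1: str, f_s2: str) -> tuple[str, str]:
    """Cut f_s1 after its last complete end tag; prepend the rest to f_s2."""
    pointer = len(f_s1)                      # step 1.2: start at the tail
    while True:
        # Step 1.3: nearest "</" lying entirely before the pointer (L1).
        l1 = f_s1.rfind("</", 0, pointer)
        if l1 == -1:
            # Assumption: with no complete end tag, treat the whole
            # segment as incomplete data and move all of it.
            return "", f_s1 + f_s2
        # Step 1.4: the ">" that closes this end tag (L2), if present.
        l2 = f_s1.find(">", l1)
        if l2 != -1:
            cut = l2 + 1
            # Step 1.5: move the incomplete tail to the head of f_s2.
            return f_s1[:cut], f_s1[cut:] + f_s2
        # The end tag itself was truncated: retry from L1.
        pointer = l1
```

Applied to the pieces `"<a><b>1</b><b>2</"` and `"b></a>"`, the sketch keeps `"<a><b>1</b>"` in the first document and moves `"<b>2</"` in front of the second.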
Further still, in step (2), suppose there is an XML document $F_{s0,1}$ of size $T_{s0,1}$, and the memory limit of the terminal device is $T$. If the document is much larger than that limit, i.e. $T_{s0,1} \gg T$, or equivalently $T_{s0,1} \approx qT$ with $q \gg 1$, the first round of partitioning and reconstruction is applied to it as follows:
2.1) Read the XML document $F_{s0,1}$;
2.2) Set a partitioning threshold $T_{f0,1} = T_{s0,1}/2$;
2.3) Perform the first round of partitioning, which consists of a single segmentation pass and yields two XML documents that are not well-formed:
1. $F_{s1,1}$, of size $T_{s1,1} = T_{f0,1}$;
2. $F_{s1,2}$, of size $T_{s1,2} = T_{f0,1}$;
2.4) Reconstruct the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.3), which yields two new well-formed XML documents:
1. $F_{s1,1}$, of size $T_{s1,1} \approx T_{f0,1}$;
2. $F_{s1,2}$, of size $T_{s1,2} \approx T_{f0,1}$;
This completes the first round of partitioning and reconstruction. Next, two partitioning thresholds $T_{f1,1} = T_{s1,1}/2$ and $T_{f1,2} = T_{s1,2}/2$ are set, and a second round of partitioning and reconstruction is applied to the two XML documents $F_{s1,1}$ and $F_{s1,2}$ from step 2.4), yielding four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$, $F_{s2,4}$ of sizes $T_{f1,1}$, $T_{f1,1}$, $T_{f1,2}$, $T_{f1,2}$ respectively. Continuing in the same way, $2^{n-1}$ partitioning thresholds $T_{f(n-1),k} = T_{s(n-1),k}/2$, $k = 1, \dots, 2^{n-1}$, are set and the $2^{n-1}$ XML documents $F_{s(n-1),k}$, $k = 1, \dots, 2^{n-1}$, undergo the $n$-th round of partitioning and reconstruction, yielding $2^n$ well-formed XML documents $F_{sn,k}$, $k = 1, \dots, 2^n$. At this point every document is no larger than the memory limit of the terminal device, i.e. $T_{sn,k} \le T$, $k = 1, \dots, 2^n$; the partitioning condition no longer holds, and partitioning and reconstruction end.
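The round-by-round behaviour of the fast paging process can be simulated on sizes alone. The sketch below assumes each round halves every document (as closely as integer sizes allow) and that reconstruction adds no size, which simplifies the "approximately equal" sizes in the text; `paging_rounds` is a hypothetical name.

```python
def paging_rounds(t_s0: int, t: int) -> tuple[int, list[int]]:
    """Simulate binary-tree partitioning on document sizes.

    t_s0: size of the input document (T_s0,1)
    t:    memory limit of the terminal device (T)
    Returns the number of rounds and the final document sizes.
    """
    if t <= 0:
        raise ValueError("memory limit T must be positive")
    docs = [t_s0]
    rounds = 0
    # Keep partitioning while any document still exceeds the limit;
    # every document of the current round is split in the same round.
    while any(size > t for size in docs):
        docs = [half for size in docs for half in (size - size // 2, size // 2)]
        rounds += 1
    return rounds, docs
```

With a document of size 100 and a limit of 30, two rounds produce four documents of size 25, mirroring the tree of Fig. 8 in which round $n$ holds $2^n$ documents.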
An XML-based stream page release system, comprising:
A streaming unit: for each large input XML document, the streaming unit first checks its size. If the document size does not exceed the predefined segment read threshold, i.e. $T_s \le T_m$, the document is handed to the fast paging unit. Otherwise, if $T_s > T_m$, the streaming unit segments and reconstructs the document, producing two well-formed XML documents: one of size $T_m$ and one of size $T_s - T_m$. The former is passed to the fast paging unit; the latter is returned to the streaming unit to be checked, segmented and reconstructed again;
A fast paging unit: if the size of an XML document $F_{s0,1}$ far exceeds the memory limit $T$ of the terminal device, i.e. $T_{s0,1} \gg T$, a first round of partitioning and reconstruction is applied to $F_{s0,1}$, producing two new well-formed XML documents $F_{s1,1}$ and $F_{s1,2}$. These two documents are then checked and, if needed, processed in a second round: if both still satisfy the partitioning condition, $T_{s1,1} \gg T$ and $T_{s1,2} \gg T$, they are partitioned and reconstructed simultaneously, producing four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$ and $F_{s2,4}$. Checking, partitioning and reconstruction repeat in this way until every XML document produced in some round is no larger than the memory limit of the terminal device, at which point the process ends;
An XSLT converter: converts the input document, using the conversion style sheet provided by the terminal device, into a document in another standard format for output;
A release server: sends the documents in their different standard formats to the corresponding terminal devices.
Further, the streaming unit comprises a segmenter and a reconstructor, wherein:
In the segmenter, suppose there is an XML document $F_s$ of size $T_s$, and the maximum memory available to the streaming unit is $T_m$. If the document is much larger than that maximum, i.e. $T_s \gg T_m$, or equivalently $T_s \approx pT_m$ with $p \gg 1$, the segmenter in the streaming unit segments it in three steps:
First, read the XML document $F_s$;
Second, set the segment read threshold $T_d = T_m$;
Third, perform the segmentation, producing two XML documents that are not well-formed:
1. $F_{s1}$, of size $T_{s1} = T_d = T_m$;
2. $F_{s2}$, of size $T_{s2} = T_s - T_d = T_s - T_m$.
Further, in the reconstructor, processing comprises two stages, preliminary reconstruction and final reconstruction, as follows:
1.1) Read the XML document $F_{s1}$ generated in the third step;
1.2) Move the pointer to the end of the document;
1.3) Search toward the start of the document for the opening delimiter "</" of an end tag, and record its position as $L_1$;
1.4) From $L_1$, search toward the end of the document for the matching closing delimiter ">", and record its position as $L_2$. There are two cases:
If the closing delimiter ">" is found, $L_2$ is its position;
Otherwise, the end tag was truncated by the segmentation. Move the pointer to $L_1$ and repeat step 1.3) to obtain a new $L_1$, then repeat step 1.4) to obtain a new $L_2$; this new $L_2$ is the actual position of the closing delimiter in this case;
1.5) Move the incomplete data caused by the segmentation from the tail of $F_{s1}$ to the head of $F_{s2}$. This yields an $F_{s1}$ with the incomplete data removed and an $F_{s2}$ with the incomplete data added;
1.6) Obtain the tag names of all ancestor nodes left unclosed by the segmentation:
1.6.1) Set a read flag, flag = True; when the length of the value read is less than or equal to 0, set flag = False;
1.6.2) Read the $F_{s1}$ produced in step 1.5) and add the tag name of every node, except empty tags, to a list;
1.6.3) Count the distinct elements in the list and the number of occurrences of each. In a well-formed XML document every start tag is paired with an end tag and every empty tag is self-closed, so the elements with an odd count are, apart from the first element, the names of the ancestor nodes left unclosed by the segmentation. Put the odd-count tag names into a second list, preserving their original order;
1.7) Rebuild the two XML documents $F_{s1}$ and $F_{s2}$ from step 1.5) into well-formed XML documents:
1.7.1) Append the elements of the list from step 1.6.3), in reverse order and as end tags, to the tail of $F_{s1}$;
1.7.2) Prepend the elements of the list from step 1.6.3), except the first element, in original order and as start tags, to the head of $F_{s2}$;
1.7.3) Prepend the first element of the list from step 1.6.3), i.e. the declaration tag name, as a start tag, to the head of the $F_{s2}$ obtained in step 1.7.2).
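Steps 1.6) and 1.7) can be sketched as follows, under several stated assumptions: documents are in-memory strings, tag names are extracted with a simple regular expression, and the XML declaration is left out of the list rather than handled as its first element. The odd-count rule is implemented as described; note that it misses an ancestor nested inside an element of the same name, where a stack of open tags would be more robust. All function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Matches start, end and empty tags; group 2 is the tag name and
# group 3 is "/" for an empty (self-closed) tag. Declarations such as
# <?xml ...?> do not match and are therefore skipped.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)[^<>]*?(/?)>")


def missing_ancestors(f_s1: str) -> list[str]:
    """Step 1.6: tag names with an odd count, in original order."""
    names = [m.group(2) for m in TAG.finditer(f_s1) if not m.group(3)]
    counts = Counter(names)
    missing = []
    for name in names:
        if counts[name] % 2 == 1 and name not in missing:
            missing.append(name)
    return missing


def reconstruct(f_s1: str, f_s2: str) -> tuple[str, str]:
    """Step 1.7: close the open ancestors in f_s1, reopen them in f_s2."""
    missing = missing_ancestors(f_s1)
    closed = f_s1 + "".join(f"</{name}>" for name in reversed(missing))
    reopened = "".join(f"<{name}>" for name in missing) + f_s2
    return closed, reopened


page1, page2 = reconstruct("<root><items><item>1</item>",
                           "<item>2</item></items></root>")
ET.fromstring(page1)  # both pages now parse independently
ET.fromstring(page2)
```

In this example `root` and `items` each occur once in the first piece, so they are appended as end tags to the first page and prepended as start tags to the second.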
Further still, the fast paging unit comprises a partitioner and a reconstructor, wherein:
In the partitioner, suppose there is an XML document $F_{s0,1}$ of size $T_{s0,1}$, and the memory limit of the terminal device is $T$. If the document is much larger than that limit, i.e. $T_{s0,1} \gg T$, or equivalently $T_{s0,1} \approx qT$ with $q \gg 1$, the first round of partitioning and reconstruction is applied to it as follows:
2.1) Read the XML document $F_{s0,1}$;
2.2) Set a partitioning threshold $T_{f0,1} = T_{s0,1}/2$;
2.3) Perform the first round of partitioning, which consists of a single segmentation pass and yields two XML documents that are not well-formed:
1. $F_{s1,1}$, of size $T_{s1,1} = T_{f0,1}$;
2. $F_{s1,2}$, of size $T_{s1,2} = T_{f0,1}$;
2.4) Reconstruct, in the reconstructor, the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.3), which yields two new well-formed XML documents:
1. $F_{s1,1}$, of size $T_{s1,1} \approx T_{f0,1}$;
2. $F_{s1,2}$, of size $T_{s1,2} \approx T_{f0,1}$;
This completes the first round of partitioning and reconstruction. Next, two partitioning thresholds $T_{f1,1} = T_{s1,1}/2$ and $T_{f1,2} = T_{s1,2}/2$ are set, and a second round of partitioning and reconstruction is applied to the two XML documents $F_{s1,1}$ and $F_{s1,2}$ from step 2.4), yielding four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$, $F_{s2,4}$ of sizes $T_{f1,1}$, $T_{f1,1}$, $T_{f1,2}$, $T_{f1,2}$ respectively. Continuing in the same way, $2^{n-1}$ partitioning thresholds $T_{f(n-1),k} = T_{s(n-1),k}/2$, $k = 1, \dots, 2^{n-1}$, are set and the $2^{n-1}$ XML documents $F_{s(n-1),k}$, $k = 1, \dots, 2^{n-1}$, undergo the $n$-th round of partitioning and reconstruction, yielding $2^n$ well-formed XML documents $F_{sn,k}$, $k = 1, \dots, 2^n$. At this point every document is no larger than the memory limit of the terminal device, i.e. $T_{sn,k} \le T$, $k = 1, \dots, 2^n$; the partitioning condition no longer holds, and partitioning and reconstruction end.
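As a rough check on how many rounds the partitioner needs, assume (a simplification not stated in the source) that every round halves each document exactly and that reconstruction adds negligible size. With $T_{s0,1} \approx qT$, the sizes after $n$ rounds and the stopping condition then give:

```latex
T_{sn,k} \approx \frac{T_{s0,1}}{2^{n}}, \qquad
T_{sn,k} \le T
\;\Longleftrightarrow\;
2^{n} \ge \frac{T_{s0,1}}{T} \approx q
\;\Longrightarrow\;
n = \lceil \log_{2} q \rceil .
```

For example, with $q = 100$ this gives $n = \lceil \log_2 100 \rceil = 7$ rounds and $2^7 = 128$ documents, each of size about $0.78\,T$.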
The technical concept of the invention is as follows: a streaming unit and a fast paging unit divide a large XML document into multiple small XML documents, none exceeding the memory limit of the terminal device, and an XSLT converter then performs the conversion. From this analysis, the process of reading large data page by page can be drawn as shown in Fig. 2.
As Fig. 2 shows, the process is carried out jointly by an XML file server, a streaming paging server and a release server. The XML file server stores and sends heterogeneous XML documents; the streaming paging server handles the partitioning, reconstruction and conversion of XML documents; the release server sends small XML documents in the appropriate standard format to terminal devices such as PCs, handheld devices and smartphones, provided these devices send a data exchange request to the XSLT converter and supply their own conversion style sheets.
The beneficial effects of the invention are mainly as follows: (1) The system uses the streaming paging server to page the data, so that data can be sent to terminal devices in the form of "pages", solving the problem that end-user devices cannot correctly receive or display data because of limits on memory, screen size and specifications;
(2) The system can serve multiple users at once, i.e. several different kinds of terminals with different format standards, or several terminals of the same kind, simultaneously requesting access to XML documents on the file server; it is flexible and widely applicable.
(3) The system can use the streaming paging server to page and convert an XML document of any size into a format and size that the terminal device can receive and display; that is, the system applies to XML documents of any size and is generally applicable.
(4) Compared with a conventional paging release system, which implements only the segmenter function, this system adds a reconstructor, so that every XML document output is well-formed, improving the reliability and fault tolerance of parsing and conversion; it also adds a fast paging unit, making paging faster and more efficient.
On this basis, the system adds a reconstructor, whose purpose is to convert all of these non-well-formed small XML documents into well-formed ones, making the subsequent parsing and conversion possible;
The system also adds a fast paging unit, which uses a binary-tree-based partitioning algorithm to rapidly partition and reconstruct all of the well-formed small XML documents above, speeding up paging overall and reaching the expected paging result sooner.
(5) At every stage of the system, including the streaming unit, the fast paging unit and the XSLT converter, all XML documents output are well-formed and can be parsed and converted independently. A processing or transmission error in any one document does not affect the correct parsing and conversion of the others; that is, an error is isolated in the faulty document and does not spread to correct ones. Once the faulty document is corrected, it can be combined with the other correct documents into a complete document, which effectively improves the fault tolerance of parsing and format conversion.
(6) Because all XML documents received by the end-user device are also well-formed, the device can quickly split, parse and assemble them.
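The fault isolation described in effect (5) follows from each page being parseable on its own. A minimal sketch using Python's standard `xml.etree.ElementTree` is shown below; `parse_pages` is a hypothetical helper, and the damaged page simply stands in for a processing or transmission error.

```python
import xml.etree.ElementTree as ET


def parse_pages(pages: list[str]) -> tuple[dict[int, ET.Element], dict[int, str]]:
    """Parse every page independently; collect failures separately."""
    parsed: dict[int, ET.Element] = {}
    failed: dict[int, str] = {}
    for index, page in enumerate(pages):
        try:
            parsed[index] = ET.fromstring(page)
        except ET.ParseError as error:
            # The error stays with this page; other pages are unaffected.
            failed[index] = str(error)
    return parsed, failed


pages = [
    "<r><a>1</a></r>",
    "<r><a>2</a",          # damaged page
    "<r><a>3</a></r>",
]
parsed, failed = parse_pages(pages)
```

Only the damaged page ends up in `failed`; once it is corrected and parsed again, it can be combined with the pages already in `parsed`.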
Brief description of the drawings
Fig. 1 is a schematic diagram of the XSLT conversion principle.
Fig. 2 is a diagram of the paged reading process for large data.
Fig. 3 is a block diagram of the stream page release system.
Fig. 4 is a block diagram of the streaming unit.
Fig. 5 is a block diagram of the fast paging unit.
Fig. 6 is a schematic diagram of streaming, in which (a) shows the case where the document size does not exceed the predefined segment read threshold and (b) shows the case where it does.
Fig. 7 is a flow chart of the reconstruction of $F_{s1}$ and $F_{s2}$.
Fig. 8 is a schematic diagram of the binary-tree-based fast paging algorithm, in which each oval node represents an XML document used in the corresponding round.
Fig. 9 is a flow chart of the binary-tree-based fast paging process.
Fig. 10 is a flow chart of fast paging.
Fig. 11 is a flow chart of the first round of partitioning and reconstruction.
Detailed description
The invention is further described below with reference to the accompanying drawings.
Embodiment 1
With reference to Figs. 1 to 11, an XML-based stream page release method comprises the following steps:
(1) Streaming process:
For each large input XML document, the streaming unit first checks its size. If the document size does not exceed the predefined segment read threshold, i.e. $T_s \le T_m$, the document proceeds to step (2). Otherwise, if $T_s > T_m$, the streaming unit segments and reconstructs the document, producing two well-formed XML documents: one of size $T_m$ and one of size $T_s - T_m$. The former is passed to step (2); the latter is returned to the streaming unit to be checked, segmented and reconstructed again;
(2) Fast paging process:
If the size of an XML document $F_{s0,1}$ far exceeds the memory limit $T$ of the terminal device, i.e. $T_{s0,1} \gg T$, a first round of partitioning and reconstruction is applied to $F_{s0,1}$, producing two new well-formed XML documents $F_{s1,1}$ and $F_{s1,2}$. These two documents are then checked and, if needed, processed in a second round: if both still satisfy the partitioning condition, $T_{s1,1} \gg T$ and $T_{s1,2} \gg T$, they are partitioned and reconstructed simultaneously, producing four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$ and $F_{s2,4}$. Checking, partitioning and reconstruction repeat in this way until every XML document produced in some round is no larger than the memory limit of the terminal device, at which point the process ends;
(3) XSLT conversion: using the conversion style sheet provided by the terminal device, the input document is converted into a document in another standard format for output;
(4) Release: the documents in their different standard formats are sent to the corresponding terminal devices.
Further, in step (1), the streaming process comprises a segmentation process and a reconstruction process. The segmentation process is as follows:
Suppose there is an XML document $F_s$ of size $T_s$, and the maximum memory available to the streaming unit is $T_m$. If the document is much larger than that maximum, i.e. $T_s \gg T_m$, or equivalently $T_s \approx pT_m$ with $p \gg 1$, the segmenter in the streaming unit segments it in three steps:
First, read the XML document $F_s$;
Second, set the segment read threshold $T_d = T_m$;
Third, perform the segmentation, producing two XML documents that are not well-formed:
1. $F_{s1}$, of size $T_{s1} = T_d = T_m$;
2. $F_{s2}$, of size $T_{s2} = T_s - T_d = T_s - T_m$.
Further, the reconstruction process comprises two steps, preliminary reconstruction and re-reconstruction, as follows:
1.1) Read the XML document $F_{s1}$ generated in the third step;
1.2) Position the pointer at the end of the document;
1.3) Search toward the start of the document for the opening marker "</" of an end tag, and record its position as $L_1$;
1.4) From $L_1$, search toward the end of the document for the corresponding closing marker ">", and record its position as $L_2$. There are two possibilities:
If the closing marker ">" is found, $L_2$ is the position of that marker;
Otherwise, if the closing marker ">" is not found, position the pointer at $L_1$ and perform step 1.3) again to obtain a new $L_1$, then perform step 1.4) to obtain a new $L_2$; this new $L_2$ is the actual position of the closing marker in this case;
1.5) Move the incomplete data produced by the split from the tail of $F_{s1}$ to the head of $F_{s2}$. This yields the XML document $F_{s1}$ with the incomplete data removed and the XML document $F_{s2}$ with the incomplete data added;
1.6) Obtain the tag names of all ancestor nodes left unclosed by the split:
1.6.1) Set a read flag, flag = True; when the length of the value read is less than or equal to 0, set flag = False;
1.6.2) Read the XML document $F_{s1}$ generated in step 1.5), from which the incomplete data has been removed, and add the tag name of each node, except empty-element tags, to a list;
1.6.3) Count the distinct elements in the list and the number of occurrences of each. Because start tags and end tags in a well-formed XML document must be paired and empty-element tags must be closed, every element with an odd count, except the first element, is the tag name of an ancestor node left unclosed by the split. Put these tag names into another list, keeping their original order in the list unchanged;
1.7) Make the two XML documents $F_{s1}$ and $F_{s2}$ generated in step 1.5) well-formed:
1.7.1) Append the elements of the list obtained in step 1.6.3), as end tags and in reverse order, to the tail of the XML document $F_{s1}$ from which the incomplete data was removed;
1.7.2) Prepend the elements of the list obtained in step 1.6.3), except the first element, as start tags and in forward order, to the head of the XML document $F_{s2}$ to which the incomplete data was added;
1.7.3) Prepend the first element of the list obtained in step 1.6.3), i.e. the declaration tag name, as a start tag, to the head of the XML document $F_{s2}$ obtained in step 1.7.2).
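Steps 1.2) to 1.5) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `move_incomplete_tail` is hypothetical, positions are character offsets in a string, and the sketch assumes that "</" and ">" never occur inside comments, CDATA sections or attribute values.

```python
from typing import Tuple


def move_incomplete_tail(fs1: str, fs2: str) -> Tuple[str, str]:
    """Cut fs1 after its last complete end tag and prepend the remainder to fs2."""
    pointer = len(fs1)                      # step 1.2: start at the tail
    while True:
        l1 = fs1.rfind("</", 0, pointer)    # step 1.3: nearest "</" before the pointer
        if l1 == -1:
            raise ValueError("fs1 contains no complete end tag")
        l2 = fs1.find(">", l1)              # step 1.4: closing ">" after L1
        if l2 != -1:
            break                           # the end tag is complete
        pointer = l1                        # truncated end tag: retry from L1
    cut = l2 + 1
    return fs1[:cut], fs1[cut:] + fs2       # step 1.5: move the incomplete tail


# Hypothetical fragments: the split fell in the middle of "</b>".
head, tail = move_incomplete_tail("<root><a>1</a><b>2</", "b></root>")
print(head)  # <root><a>1</a>
print(tail)  # <b>2</b></root>
```

In the example the first "</" found has no closing ">", so the search is repeated from $L_1$ and the cut lands after "</a>", which is the second case described in step 1.4).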
Further still, in step (2), suppose an XML document $F_{s0,1}$ has size $T_{s0,1}$ and the memory required by the terminal device is $T$. If the document is much larger than that memory, i.e. $T_{s0,1} \gg T$, or equivalently $T_{s0,1} \approx qT$ with $q \gg 1$, the first round of splitting and reconstruction is performed on it as follows:
2.1) Read the XML document $F_{s0,1}$;
2.2) Set one split threshold $T_{f0,1} = T_{s0,1}/2$;
2.3) Perform the first round of splitting, which consists of a single split and yields two XML documents that are not well-formed:
1. $F_{s1,1}$, whose size is denoted $T_{s1,1}$, with $T_{s1,1} = T_{f0,1}$;
2. $F_{s1,2}$, whose size is denoted $T_{s1,2}$, with $T_{s1,2} = T_{f0,1}$;
2.4) Reconstruct the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.3), which yields two new well-formed XML documents:
1. $F_{s1,1}$, whose size is denoted $T_{s1,1}$, with $T_{s1,1} \approx T_{f0,1}$;
2. $F_{s1,2}$, whose size is denoted $T_{s1,2}$, with $T_{s1,2} \approx T_{f0,1}$;
This completes the first round of splitting and reconstruction. Next, two split thresholds $T_{f1,1} = T_{s1,1}/2$ and $T_{f1,2} = T_{s1,2}/2$ are set, and a second round of splitting and reconstruction is performed on the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.4), which yields four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$, $F_{s2,4}$ whose sizes equal $T_{f1,1}$, $T_{f1,1}$, $T_{f1,2}$, $T_{f1,2}$ respectively. Continuing in the same way, $2^{n-1}$ split thresholds $T_{f(n-1),k} = T_{s(n-1),k}/2$, $k = 1, \ldots, 2^{n-1}$, are set, and the $n$-th round of splitting and reconstruction is performed on the $2^{n-1}$ XML documents $F_{s(n-1),k}$, $k = 1, \ldots, 2^{n-1}$, which yields $2^n$ well-formed XML documents $F_{sn,k}$, $k = 1, \ldots, 2^n$. At this point every document is no larger than the memory required by the terminal device, i.e. $T_{sn,k} \le T$ for $k = 1, \ldots, 2^n$; the split condition no longer holds, and splitting and reconstruction end.
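As a worked illustration, the number of rounds can be estimated under two idealizing assumptions that the source does not state explicitly: each round halves every document exactly, and the bytes added by reconstruction are negligible.

```latex
T_{sn,k} \approx \frac{T_{s0,1}}{2^{n}} \le T
\quad\Longrightarrow\quad
n \ge \log_2 \frac{T_{s0,1}}{T} = \log_2 q,
\qquad
n_{\min} = \lceil \log_2 q \rceil .
```

For example, with $q = 10$ this gives $n_{\min} = 4$ rounds and $2^4 = 16$ documents, each of size about $0.625\,T$.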
Embodiment 2
With reference to Fig. 1 to Fig. 11, an XML-based streaming paging publishing system comprises:
Streaming processor: for each large input XML document, the streaming processor first checks its size. If the document size does not exceed the preset segment read threshold, i.e. $T_s \le T_m$, the document is handed to the fast pager. Otherwise, if the document size exceeds the preset segment read threshold, i.e. $T_s > T_m$, the streaming processor segments and reconstructs the document, producing two well-formed XML documents, one of size $T_m$ and the other of size $T_s - T_m$; the former is sent to the fast pager, and the latter is returned to the streaming processor to be checked, segmented and reconstructed again;
Fast pager: if the size of the XML document $F_{s0,1}$ far exceeds the memory $T$ required by the terminal device, i.e. $T_{s0,1} \gg T$, a first round of splitting and reconstruction is performed on $F_{s0,1}$, generating two new well-formed XML documents $F_{s1,1}$ and $F_{s1,2}$. The two new documents are then checked and put through a second round of splitting and reconstruction: if $F_{s1,1}$ and $F_{s1,2}$ still satisfy the split condition, $T_{s1,1} \gg T$ and $T_{s1,2} \gg T$, both are split and reconstructed at the same time, generating four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$ and $F_{s2,4}$. Checking, splitting and reconstruction are repeated in this way until every XML document generated in some round is no larger than the memory required by the terminal device, at which point splitting and reconstruction end;
XSLT converter: converts the input document into an output document in another standard format, according to the transformation stylesheet supplied by the terminal device;
Publishing server: sends the documents in the various standard formats to the corresponding terminal devices.
In this embodiment, the streaming paging publishing system consists of two parts, a streaming paging server (comprising the streaming processor, the fast pager and the XSLT converter) and a publishing server, as shown in Fig. 3. The streaming paging server first splits, reconstructs and converts a large XML document, generating multiple small well-formed target XML documents whose size and data format depend, respectively, on the available memory and the transformation stylesheet supplied by the terminal device. The publishing server then sends these small XML documents to the corresponding terminal device as "pages".
The streaming processor consists of a segmenter and a reconstructor, as shown in Fig. 4. It first checks the size of the input XML document. If the document does not satisfy the preset segmentation condition, it is sent to the fast pager without any processing. Otherwise, the document is read in segments, with the amount of data read at a time kept within the preset segment read threshold, and the reconstructor then reconstructs the new XML documents holding that data so that every XML document output by the streaming processor is well-formed.
The fast pager consists of a splitter and a reconstructor, as shown in Fig. 5. It first checks the size of the input XML document. If the document satisfies the preset split condition, it is split in binary-tree fashion, and all new XML documents generated by the split are then reconstructed. Note that the reconstruction process here is identical to that of the streaming processor.
The XSLT converter converts the input document into an output document in another standard format, according to the transformation stylesheet supplied by the terminal device. The publishing server sends the documents in the various standard formats to the corresponding terminal devices.
Streaming is carried out by the streaming processor. For each large input XML document, the streaming processor first checks its size. If the document size does not exceed the preset segment read threshold, i.e. $T_s \le T_m$, the streaming processor outputs the document to the fast pager without any processing, as shown in Fig. 6a. Otherwise, if the document size exceeds the preset segment read threshold, i.e. $T_s > T_m$, the streaming processor segments and reconstructs the document, producing two well-formed XML documents, one of size approximately $T_m$ and the other of size approximately $T_s - T_m$; the former is sent to the fast pager, and the latter is returned to the streaming processor to be checked, segmented and reconstructed again, as shown in Fig. 6b.
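The control flow of this loop can be sketched on document sizes alone. The function name `stream_segments` is hypothetical, and the sketch assumes, for illustration only, that reconstruction does not change the sizes (the text says they are only approximately $T_m$ and $T_s - T_m$).

```python
from typing import Iterator


def stream_segments(t_s: int, t_m: int) -> Iterator[int]:
    """Yield the sizes of the documents handed to the fast pager, in order."""
    if t_m <= 0:
        raise ValueError("t_m must be positive")
    remaining = t_s
    while remaining > t_m:   # T_s > T_m: segment and reconstruct
        yield t_m            # first part, about T_m, goes to the fast pager
        remaining -= t_m     # second part, about T_s - T_m, is checked again
    if remaining > 0:
        yield remaining      # T_s <= T_m: passed through unchanged


print(list(stream_segments(t_s=1000, t_m=300)))  # [300, 300, 300, 100]
```

Because only one segment of at most $T_m$ is in flight at a time, the peak working size stays bounded by the threshold regardless of the size of the input.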
Segmentation process: suppose an XML document $F_s$ has size $T_s$ and the maximum memory available to the streaming processor is $T_m$. If the document is much larger than that memory, i.e. $T_s \gg T_m$, or equivalently $T_s \approx pT_m$ with $p \gg 1$, we use the segmenter in the streaming processor to segment it in the following three steps:
First, read the XML document $F_s$.
Second, set the segment read threshold $T_d = T_m$.
Third, perform the segmentation, generating two XML documents that are not well-formed:
1. $F_{s1}$, whose size is denoted $T_{s1}$, with $T_{s1} = T_d = T_m$;
2. $F_{s2}$, whose size is denoted $T_{s2}$, with $T_{s2} = T_s - T_d = T_s - T_m$.
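The three steps amount to a cut at a fixed offset. The sketch below uses a hypothetical helper `segment` on an in-memory byte string; a real segmenter would read from a file or stream, which is an assumption left out here for brevity.

```python
from typing import Tuple


def segment(document: bytes, t_d: int) -> Tuple[bytes, bytes]:
    """Split a document at the segment read threshold T_d, ignoring XML structure."""
    if t_d <= 0:
        raise ValueError("t_d must be positive")
    return document[:t_d], document[t_d:]


# Hypothetical sample document; the threshold of 40 bytes is arbitrary.
doc = b'<?xml version="1.0"?><root><item>alpha</item><item>beta</item></root>'
fs1, fs2 = segment(doc, 40)
print(fs1)
print(fs2)
```

Since the cut ignores tag boundaries, neither part is generally well-formed, which is why the reconstruction process follows.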
The reconstruction process is more involved. It comprises two steps, preliminary reconstruction and re-reconstruction; its flow is shown in Fig. 7 and is implemented as follows:
1.1) Read the XML document $F_{s1}$ generated in the third step.
1.2) Position the pointer at the end of the document.
1.3) Search toward the start of the document for the opening marker "</" of an end tag, and record its position as $L_1$.
1.4) From $L_1$, search toward the end of the document for the corresponding closing marker ">", and record its position as $L_2$. There are two possibilities:
If the closing marker ">" is found, $L_2$ is the position of that marker;
Otherwise, if the closing marker ">" is not found, position the pointer at $L_1$ and perform step 1.3) again to obtain a new $L_1$, then perform step 1.4) to obtain a new $L_2$; this new $L_2$ is the actual position of the closing marker in this case.
1.5) Move the incomplete data produced by the split from the tail of $F_{s1}$ to the head of $F_{s2}$. This yields the XML document $F_{s1}$ with the incomplete data removed and the XML document $F_{s2}$ with the incomplete data added.
1.6) Obtain the tag names of all ancestor nodes left unclosed by the split:
1.6.1) Set a read flag, flag = True; when the length of the value read is less than or equal to 0, set flag = False;
1.6.2) Read the XML document $F_{s1}$ generated in step 1.5), from which the incomplete data has been removed, and add the tag name of each node, except empty-element tags, to a list;
1.6.3) Count the distinct elements in the list and the number of occurrences of each. Because start tags and end tags in a well-formed XML document must be paired and empty-element tags must be closed, every element with an odd count, except the first element, is the tag name of an ancestor node left unclosed by the split. Put these tag names into another list. Note that their original order in the list must be kept unchanged.
1.7) Make the two XML documents $F_{s1}$ and $F_{s2}$ generated in step 1.5) well-formed:
1.7.1) Append the elements of the list obtained in step 1.6.3), as end tags and in reverse order, to the tail of the XML document $F_{s1}$ from which the incomplete data was removed;
1.7.2) Prepend the elements of the list obtained in step 1.6.3), except the first element, as start tags and in forward order, to the head of the XML document $F_{s2}$ to which the incomplete data was added;
1.7.3) Prepend the first element of the list obtained in step 1.6.3), i.e. the declaration tag name, as a start tag, to the head of the XML document $F_{s2}$ obtained in step 1.7.2).
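Steps 1.6) and 1.7) can be sketched as below, assuming the incomplete tail has already been moved as in step 1.5). The names `missing_ancestors` and `rebuild` and the two regular expressions are illustrative choices, not taken from the source. The sketch also departs from the text in one respect: it extracts the XML declaration separately instead of treating it as the first list element. Re-opened start tags carry no attributes, since the described method records tag names only.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List, Tuple

DECLARATION = re.compile(r"<\?xml[^>]*\?>")
# Start, end and empty-element tags; group 3 is "/" for empty-element tags.
TAG = re.compile(r"<(/?)([A-Za-z_:][\w.:\-]*)[^<>]*?(/?)>")


def missing_ancestors(fs1: str) -> List[str]:
    """Step 1.6: tag names with an odd count, in order of first appearance."""
    names = [m.group(2) for m in TAG.finditer(fs1) if not m.group(3)]
    counts = Counter(names)
    ordered_unique = list(dict.fromkeys(names))
    return [name for name in ordered_unique if counts[name] % 2 == 1]


def rebuild(fs1: str, fs2: str) -> Tuple[str, str]:
    """Step 1.7: close fs1 and re-open fs2 so that both are well-formed."""
    match = DECLARATION.match(fs1)
    declaration = match.group(0) if match else ""
    ancestors = missing_ancestors(fs1)
    closing = "".join(f"</{name}>" for name in reversed(ancestors))  # 1.7.1
    opening = "".join(f"<{name}>" for name in ancestors)             # 1.7.2
    return fs1 + closing, declaration + opening + fs2                # 1.7.3


# Hypothetical fragments after step 1.5).
fs1 = '<?xml version="1.0"?><root><list><item>alpha</item>'
fs2 = "<item>beta</item></list></root>"
page1, page2 = rebuild(fs1, fs2)
ET.fromstring(page1)  # raises ParseError if page1 is not well-formed
ET.fromstring(page2)
print(page1)
print(page2)
```

One limitation of the odd-count rule is worth noting: if an element name is nested inside an unclosed element of the same name, the two open tags give an even count and the ancestors are missed. A stack of open tags would avoid this, at the cost of departing from the counting method described above.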
Fast paging is carried out by the fast pager. Its core is the binary-tree splitting algorithm, a special parallel splitting algorithm that is faster and more efficient than a traditional iterative splitting algorithm.
Fast paging algorithm based on a binary tree: the basic principle is to first split one XML document and reconstruct it into two new documents, then split each of these two new XML documents and reconstruct it into two further documents, giving four XML documents in total, and so on; splitting ends when none of the resulting documents satisfies the split condition. The principle is illustrated by the sketch in Fig. 8.
To explain the basic principle of the binary-tree fast paging algorithm further, Fig. 8 can be converted into the document evolution diagram shown in Fig. 9, in which each document is named $F_{sn,k}$. The subscript $n$ denotes the round of splitting and reconstruction; for example, $n = 1$ means the document was generated in the first round. The subscript $k$ denotes the index of the document among those generated in that round, and its range depends on $n$: $k = 1, \ldots, 2^n$. For example, when $n = 1$, $k$ takes the values 1 and 2: $F_{s1,1}$ is the first document generated by the first round of splitting and reconstruction, and $F_{s1,2}$ is the second.
From Fig. 9 we can derive the overall flow of fast paging, shown in Fig. 10. Because the size of the XML document $F_{s0,1}$ far exceeds the memory $T$ required by the terminal device, i.e. $T_{s0,1} \gg T$, we perform a first round of splitting and reconstruction on it, generating two new well-formed XML documents $F_{s1,1}$ and $F_{s1,2}$. The two new documents are then checked and put through a second round of splitting and reconstruction: if $F_{s1,1}$ and $F_{s1,2}$ still satisfy the split condition, $T_{s1,1} \gg T$ and $T_{s1,2} \gg T$, both are split and reconstructed at the same time, generating four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$ and $F_{s2,4}$. Checking, splitting and reconstruction are repeated in this way until every XML document generated in some round is no larger than the memory required by the terminal device, at which point splitting and reconstruction end.
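The round-by-round flow can be sketched on sizes only. The function name `paginate_sizes` is hypothetical, and the sketch assumes exact halving with no reconstruction overhead. It also uses the stopping test "size greater than $T$", since the text states the split condition loosely as "far exceeds" but ends when every document is no larger than $T$.

```python
from typing import Dict, List


def paginate_sizes(t_s01: float, t: float) -> Dict[int, List[float]]:
    """Map each round n to the sizes of its 2**n documents F_{sn,k}."""
    if t <= 0:
        raise ValueError("t must be positive")
    rounds: Dict[int, List[float]] = {0: [t_s01]}
    n = 0
    # Keep splitting while some document of the current round exceeds T.
    while any(size > t for size in rounds[n]):
        # Every document of round n yields two documents of half its size.
        rounds[n + 1] = [size / 2 for size in rounds[n] for _ in range(2)]
        n += 1
    return rounds


for n, sizes in paginate_sizes(1000, 150).items():
    print(f"round {n}: {len(sizes)} documents of size {sizes[0]}")
```

All documents of a round are independent of one another, which is what allows a round to be processed in parallel as the text describes.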
Binary-tree splitting and reconstruction process: suppose an XML document $F_{s0,1}$ has size $T_{s0,1}$ and the memory required by the terminal device is $T$. If the document is much larger than that memory, i.e. $T_{s0,1} \gg T$, or equivalently $T_{s0,1} \approx qT$ with $q \gg 1$, we use the fast pager, comprising the splitter and the reconstructor, to perform the first round of splitting and reconstruction on this XML document. The flow is shown in Fig. 11 and comprises the following four steps:
2.1) Read the XML document $F_{s0,1}$;
2.2) Set one split threshold $T_{f0,1} = T_{s0,1}/2$;
2.3) Perform the first round of splitting, which consists of a single split and yields two XML documents that are not well-formed:
1. $F_{s1,1}$, whose size is denoted $T_{s1,1}$, with $T_{s1,1} = T_{f0,1}$;
2. $F_{s1,2}$, whose size is denoted $T_{s1,2}$, with $T_{s1,2} = T_{f0,1}$;
2.4) Reconstruct the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.3), which yields two new well-formed XML documents:
1. $F_{s1,1}$, whose size is denoted $T_{s1,1}$, with $T_{s1,1} \approx T_{f0,1}$;
2. $F_{s1,2}$, whose size is denoted $T_{s1,2}$, with $T_{s1,2} \approx T_{f0,1}$;
This completes the first round of splitting and reconstruction. Next, we set two split thresholds $T_{f1,1} = T_{s1,1}/2$ and $T_{f1,2} = T_{s1,2}/2$ and perform a second round of splitting and reconstruction on the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.4), which yields four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$, $F_{s2,4}$ whose sizes are approximately $T_{f1,1}$, $T_{f1,1}$, $T_{f1,2}$, $T_{f1,2}$ respectively. Continuing in the same way, we set $2^{n-1}$ split thresholds $T_{f(n-1),k} = T_{s(n-1),k}/2$, $k = 1, \ldots, 2^{n-1}$, and perform the $n$-th round of splitting and reconstruction on the $2^{n-1}$ XML documents $F_{s(n-1),k}$, $k = 1, \ldots, 2^{n-1}$, which yields $2^n$ well-formed XML documents $F_{sn,k}$, $k = 1, \ldots, 2^n$. At this point every document is no larger than the memory required by the terminal device, i.e. $T_{sn,k} \le T$ for $k = 1, \ldots, 2^n$; the split condition no longer holds, and splitting and reconstruction end.

Claims (8)

1. An XML-based streaming paging publishing method, characterized in that the publishing method comprises the following steps:
(1) Streaming process:
For each large input XML document, the streaming processor first checks its size. If the document size does not exceed the preset segment read threshold, i.e. $T_s \le T_m$, processing proceeds to step (2). Otherwise, if the document size exceeds the preset segment read threshold, i.e. $T_s > T_m$, the streaming processor segments and reconstructs the document, producing two well-formed XML documents, one of size $T_m$ and the other of size $T_s - T_m$; the former is passed to step (2), and the latter is returned to the streaming processor to be checked, segmented and reconstructed again;
(2) Fast paging process:
If the size of the XML document $F_{s0,1}$ far exceeds the memory $T$ required by the terminal device, i.e. $T_{s0,1} \gg T$, a first round of splitting and reconstruction is performed on $F_{s0,1}$, generating two new well-formed XML documents $F_{s1,1}$ and $F_{s1,2}$. The two new documents are then checked and put through a second round of splitting and reconstruction: if $F_{s1,1}$ and $F_{s1,2}$ still satisfy the split condition, $T_{s1,1} \gg T$ and $T_{s1,2} \gg T$, both are split and reconstructed at the same time, generating four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$ and $F_{s2,4}$. Checking, splitting and reconstruction are repeated in this way until every XML document generated in some round is no larger than the memory required by the terminal device, at which point splitting and reconstruction end;
(3) XSLT conversion process: the input document is converted into an output document in another standard format, according to the transformation stylesheet supplied by the terminal device;
(4) Publishing process: the documents in the various standard formats are sent to the corresponding terminal devices.
2. The XML-based streaming paging publishing method as claimed in claim 1, characterized in that in step (1) the streaming process comprises a segmentation process and a reconstruction process, the segmentation process being:
Suppose an XML document $F_s$ has size $T_s$ and the maximum memory available to the streaming processor is $T_m$. If the document is much larger than that memory, i.e. $T_s \gg T_m$, or equivalently $T_s \approx pT_m$ with $p \gg 1$, the segmenter in the streaming processor segments it in the following three steps:
First, read the XML document $F_s$;
Second, set the segment read threshold $T_d = T_m$;
Third, perform the segmentation, generating two XML documents that are not well-formed:
1. $F_{s1}$, whose size is denoted $T_{s1}$, with $T_{s1} = T_d = T_m$;
2. $F_{s2}$, whose size is denoted $T_{s2}$, with $T_{s2} = T_s - T_d = T_s - T_m$.
3. The XML-based streaming paging publishing method as claimed in claim 2, characterized in that the reconstruction process comprises two steps, preliminary reconstruction and re-reconstruction, as follows:
1.1) Read the XML document $F_{s1}$ generated in the third step;
1.2) Position the pointer at the end of the document;
1.3) Search toward the start of the document for the opening marker "</" of an end tag, and record its position as $L_1$;
1.4) From $L_1$, search toward the end of the document for the corresponding closing marker ">", and record its position as $L_2$. There are two possibilities:
If the closing marker ">" is found, $L_2$ is the position of that marker;
Otherwise, if the closing marker ">" is not found, position the pointer at $L_1$ and perform step 1.3) again to obtain a new $L_1$, then perform step 1.4) to obtain a new $L_2$; this new $L_2$ is the actual position of the closing marker in this case;
1.5) Move the incomplete data produced by the split from the tail of $F_{s1}$ to the head of $F_{s2}$. This yields the XML document $F_{s1}$ with the incomplete data removed and the XML document $F_{s2}$ with the incomplete data added;
1.6) Obtain the tag names of all ancestor nodes left unclosed by the split:
1.6.1) Set a read flag, flag = True; when the length of the value read is less than or equal to 0, set flag = False;
1.6.2) Read the XML document $F_{s1}$ generated in step 1.5), from which the incomplete data has been removed, and add the tag name of each node, except empty-element tags, to a list;
1.6.3) Count the distinct elements in the list and the number of occurrences of each. Because start tags and end tags in a well-formed XML document must be paired and empty-element tags must be closed, every element with an odd count, except the first element, is the tag name of an ancestor node left unclosed by the split. Put these tag names into another list, keeping their original order in the list unchanged;
1.7) Make the two XML documents $F_{s1}$ and $F_{s2}$ generated in step 1.5) well-formed:
1.7.1) Append the elements of the list obtained in step 1.6.3), as end tags and in reverse order, to the tail of the XML document $F_{s1}$ from which the incomplete data was removed;
1.7.2) Prepend the elements of the list obtained in step 1.6.3), except the first element, as start tags and in forward order, to the head of the XML document $F_{s2}$ to which the incomplete data was added;
1.7.3) Prepend the first element of the list obtained in step 1.6.3), i.e. the declaration tag name, as a start tag, to the head of the XML document $F_{s2}$ obtained in step 1.7.2).
4. The XML-based streaming paging publishing method as claimed in any one of claims 1 to 3, characterized in that in step (2), suppose an XML document $F_{s0,1}$ has size $T_{s0,1}$ and the memory required by the terminal device is $T$. If the document is much larger than that memory, i.e. $T_{s0,1} \gg T$, or equivalently $T_{s0,1} \approx qT$ with $q \gg 1$, the first round of splitting and reconstruction is performed on it as follows:
2.1) Read the XML document $F_{s0,1}$;
2.2) Set one split threshold $T_{f0,1} = T_{s0,1}/2$;
2.3) Perform the first round of splitting, which consists of a single split and yields two XML documents that are not well-formed:
1. $F_{s1,1}$, whose size is denoted $T_{s1,1}$, with $T_{s1,1} = T_{f0,1}$;
2. $F_{s1,2}$, whose size is denoted $T_{s1,2}$, with $T_{s1,2} = T_{f0,1}$;
2.4) Reconstruct the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.3), which yields two new well-formed XML documents:
1. $F_{s1,1}$, whose size is denoted $T_{s1,1}$, with $T_{s1,1} \approx T_{f0,1}$;
2. $F_{s1,2}$, whose size is denoted $T_{s1,2}$, with $T_{s1,2} \approx T_{f0,1}$;
This completes the first round of splitting and reconstruction. Next, two split thresholds $T_{f1,1} = T_{s1,1}/2$ and $T_{f1,2} = T_{s1,2}/2$ are set, and a second round of splitting and reconstruction is performed on the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.4), which yields four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$, $F_{s2,4}$ whose sizes equal $T_{f1,1}$, $T_{f1,1}$, $T_{f1,2}$, $T_{f1,2}$ respectively. Continuing in the same way, $2^{n-1}$ split thresholds $T_{f(n-1),k} = T_{s(n-1),k}/2$, $k = 1, \ldots, 2^{n-1}$, are set, and the $n$-th round of splitting and reconstruction is performed on the $2^{n-1}$ XML documents $F_{s(n-1),k}$, $k = 1, \ldots, 2^{n-1}$, which yields $2^n$ well-formed XML documents $F_{sn,k}$, $k = 1, \ldots, 2^n$. At this point every document is no larger than the memory required by the terminal device, i.e. $T_{sn,k} \le T$ for $k = 1, \ldots, 2^n$; the split condition no longer holds, and splitting and reconstruction end.
5. An XML-based streaming paging publishing system, characterized in that the publishing system comprises:
Streaming processor: for each large input XML document, the streaming processor first checks its size. If the document size does not exceed the preset segment read threshold, i.e. $T_s \le T_m$, the document is handed to the fast pager. Otherwise, if the document size exceeds the preset segment read threshold, i.e. $T_s > T_m$, the streaming processor segments and reconstructs the document, producing two well-formed XML documents, one of size $T_m$ and the other of size $T_s - T_m$; the former is sent to the fast pager, and the latter is returned to the streaming processor to be checked, segmented and reconstructed again;
Fast pager: if the size of the XML document $F_{s0,1}$ far exceeds the memory $T$ required by the terminal device, i.e. $T_{s0,1} \gg T$, a first round of splitting and reconstruction is performed on $F_{s0,1}$, generating two new well-formed XML documents $F_{s1,1}$ and $F_{s1,2}$. The two new documents are then checked and put through a second round of splitting and reconstruction: if $F_{s1,1}$ and $F_{s1,2}$ still satisfy the split condition, $T_{s1,1} \gg T$ and $T_{s1,2} \gg T$, both are split and reconstructed at the same time, generating four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$ and $F_{s2,4}$. Checking, splitting and reconstruction are repeated in this way until every XML document generated in some round is no larger than the memory required by the terminal device, at which point splitting and reconstruction end;
XSLT converter: converts the input document into an output document in another standard format, according to the transformation stylesheet supplied by the terminal device;
Publishing server: sends the documents in the various standard formats to the corresponding terminal devices.
6. The XML-based streaming paging publishing system as claimed in claim 5, characterized in that the streaming processor comprises a segmenter and a reconstructor, wherein,
In the segmenter, suppose an XML document $F_s$ has size $T_s$ and the maximum memory available to the streaming processor is $T_m$. If the document is much larger than that memory, i.e. $T_s \gg T_m$, or equivalently $T_s \approx pT_m$ with $p \gg 1$, the segmenter in the streaming processor segments it in the following three steps:
First, read the XML document $F_s$;
Second, set the segment read threshold $T_d = T_m$;
Third, perform the segmentation, generating two XML documents that are not well-formed:
1. $F_{s1}$, whose size is denoted $T_{s1}$, with $T_{s1} = T_d = T_m$;
2. $F_{s2}$, whose size is denoted $T_{s2}$, with $T_{s2} = T_s - T_d = T_s - T_m$.
7. The XML-based streaming paging publishing system as claimed in claim 6, characterized in that in the reconstructor the processing comprises two steps, preliminary reconstruction and re-reconstruction, as follows:
1.1) Read the XML document $F_{s1}$ generated in the third step;
1.2) Position the pointer at the end of the document;
1.3) Search toward the start of the document for the opening marker "</" of an end tag, and record its position as $L_1$;
1.4) From $L_1$, search toward the end of the document for the corresponding closing marker ">", and record its position as $L_2$. There are two possibilities:
If the closing marker ">" is found, $L_2$ is the position of that marker;
Otherwise, if the closing marker ">" is not found, position the pointer at $L_1$ and perform step 1.3) again to obtain a new $L_1$, then perform step 1.4) to obtain a new $L_2$; this new $L_2$ is the actual position of the closing marker in this case;
1.5) Move the incomplete data produced by the split from the tail of $F_{s1}$ to the head of $F_{s2}$. This yields the XML document $F_{s1}$ with the incomplete data removed and the XML document $F_{s2}$ with the incomplete data added;
1.6) Obtain the tag names of all ancestor nodes left unclosed by the split:
1.6.1) Set a read flag, flag = True; when the length of the value read is less than or equal to 0, set flag = False;
1.6.2) Read the XML document $F_{s1}$ generated in step 1.5), from which the incomplete data has been removed, and add the tag name of each node, except empty-element tags, to a list;
1.6.3) Count the distinct elements in the list and the number of occurrences of each. Because start tags and end tags in a well-formed XML document must be paired and empty-element tags must be closed, every element with an odd count, except the first element, is the tag name of an ancestor node left unclosed by the split. Put these tag names into another list, keeping their original order in the list unchanged;
1.7) Make the two XML documents $F_{s1}$ and $F_{s2}$ generated in step 1.5) well-formed:
1.7.1) Append the elements of the list obtained in step 1.6.3), as end tags and in reverse order, to the tail of the XML document $F_{s1}$ from which the incomplete data was removed;
1.7.2) Prepend the elements of the list obtained in step 1.6.3), except the first element, as start tags and in forward order, to the head of the XML document $F_{s2}$ to which the incomplete data was added;
1.7.3) Prepend the first element of the list obtained in step 1.6.3), i.e. the declaration tag name, as a start tag, to the head of the XML document $F_{s2}$ obtained in step 1.7.2).
8. a kind of streaming paging delivery system based on XML as described in one of claim 5~7, is characterized in that: described quick paging device comprises dispenser and reconstructor, wherein,
In described dispenser, suppose to have now an XML document F s0,1, its size is T s0,1, in the demand of terminal device, save as T, if XML document is very large, be far longer than the demand internal memory of terminal device, i.e. T s0,1> > T, in other words, satisfies condition: T s0,1≈ qT, q > > 1, carries out cutting apart of the first round and reconstruction processing to this XML document, and process is as follows:
2.1) read XML document F s0,1;
2.2) set a segmentation threshold
2.3) Perform the first round of splitting, which comprises a single splitting pass and yields two XML documents that are not well-formed:
1. $F_{s1,1}$, whose size is denoted $T_{s1,1}$, with $T_{s1,1} = T_{f0,1}$;
2. $F_{s1,2}$, whose size is denoted $T_{s1,2}$, with $T_{s1,2} = T_{f0,1}$;
2.4) Reconstruct the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.3) in the reconstructor, which yields two new well-formed XML documents:
1. $F_{s1,1}$, whose size is denoted $T_{s1,1}$, with $T_{s1,1} \approx T_{f0,1}$;
2. $F_{s1,2}$, whose size is denoted $T_{s1,2}$, with $T_{s1,2} \approx T_{f0,1}$;
This completes the first round of splitting and reconstruction. Next, two segmentation thresholds are set,
Figure FDA0000396669640000081
and
Figure FDA0000396669640000082
and the two XML documents $F_{s1,1}$ and $F_{s1,2}$ generated in step 2.4) undergo a second round of splitting and reconstruction, which yields four well-formed documents $F_{s2,1}$, $F_{s2,2}$, $F_{s2,3}$, $F_{s2,4}$ whose sizes equal $T_{f1,1}$, $T_{f1,1}$, $T_{f1,2}$, $T_{f1,2}$ respectively. By analogy, $2^{n-1}$ segmentation thresholds are set and the $n$-th round of splitting and reconstruction is performed on the $2^{n-1}$ XML documents $F_{s(n-1),k}$, $k = 1, \ldots, 2^{n-1}$, which yields $2^{n}$ well-formed XML documents $F_{sn,k}$, $k = 1, \ldots, 2^{n}$. At this point the size of every document no longer exceeds the required memory of the terminal device, i.e. $T_{sn,k} \le T$ for $k = 1, \ldots, 2^{n}$, so the splitting condition is no longer met and the splitting and reconstruction process ends.
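The round-by-round behaviour described in claim 8 can be sketched numerically. The threshold formulas are not reproduced in this text, so the sketch assumes that every threshold halves its document, which is an inference from both parts of a split being stated to have the same size. The function name `split_rounds` is hypothetical, and the small size overhead added by reconstruction (the ≈ in step 2.4) is ignored.

```python
def split_rounds(total_size, device_memory):
    """Count the rounds needed until every part fits in device memory.

    Assumption: each round splits every document into two equal halves.
    Returns (number_of_rounds, list_of_final_sizes).
    """
    if total_size <= 0 or device_memory <= 0:
        raise ValueError("sizes must be positive")
    sizes = [total_size]
    rounds = 0
    # Splitting condition: some document still exceeds the required memory T.
    while max(sizes) > device_memory:
        # Round n turns 2^(n-1) documents into 2^n documents.
        sizes = [half for size in sizes for half in (size / 2, size / 2)]
        rounds += 1
    return rounds, sizes
```

With a 1000-unit document and a 128-unit memory requirement, the sizes go 1000, 500, 250, 125, so three rounds produce eight documents of 125 units each.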
CN201310484727.7A 2013-10-16 2013-10-16 XML-based stream page release method and system Active CN103544262B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310484727.7A CN103544262B (en) 2013-10-16 2013-10-16 XML-based stream page release method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310484727.7A CN103544262B (en) 2013-10-16 2013-10-16 XML-based stream page release method and system

Publications (2)

Publication Number Publication Date
CN103544262A true CN103544262A (en) 2014-01-29
CN103544262B CN103544262B (en) 2017-01-11

Family

ID=49967714

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310484727.7A Active CN103544262B (en) 2013-10-16 2013-10-16 XML-based stream page release method and system

Country Status (1)

Country Link
CN (1) CN103544262B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018129852A1 (en) * 2017-01-11 2018-07-19 深圳大普微电子科技有限公司 Hardware system for data conversion, and memory
CN109597980A (en) * 2018-12-07 2019-04-09 万兴科技股份有限公司 PDF document dividing method, device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090089658A1 (en) * 2007-09-27 2009-04-02 The Research Foundation, State University Of New York Parallel approach to xml parsing
US20090157715A1 (en) * 2007-12-18 2009-06-18 Ravi Murthy Managing large collection of interlinked xml documents
US20110072319A1 (en) * 2009-09-24 2011-03-24 International Business Machines Corporation Parallel Processing of ETL Jobs Involving Extensible Markup Language Documents
US20120078929A1 (en) * 2010-09-29 2012-03-29 International Business Machines Corporation Utilizing Metadata Generated During XML Creation to Enable Parallel XML Processing
CN103020176A (en) * 2012-11-28 2013-04-03 方跃坚 Data block dividing method in XML parsing and XML parsing method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
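Sun Jing et al.: "Research on splitting and dynamic loading of large XML files", Computer Engineering and Applications (《计算机工程与应用》), no. 16, 30 June 2003 (2003-06-30) *
Ma Xuetao et al.: "A chunked parsing method for XML files based on the VTD-XML model", Computer Applications and Software (《计算机应用于软件》), vol. 28, no. 1, 31 January 2011 (2011-01-31) *
Ma Yongping: "Research and application of XML document conversion technology", China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库·信息科技辑》), no. 02, 15 August 2007 (2007-08-15), pages 138 - 100 *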

Also Published As

Publication number Publication date
CN103544262B (en) 2017-01-11

Similar Documents

Publication Publication Date Title
CN110784419B (en) Method and system for visualizing professional railway electric service data
CN102325188B (en) Method for realizing webpage browsing on a mobile terminal and system thereof
CN105827733B (en) A kind of method, apparatus and electronic equipment of propelling data
US10984194B2 (en) Efficient publish subscribe broadcast using binary delta streams
Gil et al. Impacts of data interchange formats on energy consumption and performance in smartphones
CN102163233A (en) Method and system for converting webpage markup language format
US20060107206A1 (en) Form related data reduction
CN112016290A (en) Automatic document typesetting method, device, equipment and storage medium
CN103500196A (en) EXCEL data export method and export device in multi-concurrence large data volume environment
CN112650529B (en) System and method for configurable generation of mobile terminal APP codes
CN102982010A (en) Method and device for abstracting document structure
CN109408780A (en) A kind of method that Excel file is converted to JSON file
JP2011501272A (en) Method, apparatus and system for providing local and online data services
CN102103587A (en) Method and device for converting form
CN103345522B (en) Displaying processing, methods of exhibiting and the device of data
US20110093510A1 (en) Methods and systems for serially transmitting records in xml format
CN103544262A (en) XML-based stream page release method and system
CN102622344A (en) Control method and control system for picture batch uploading facing to Mediawiki
CN104021216B (en) Message proxy server and information publish subscription method and system
CN103544260A (en) Method for converting large XML (extensive makeup language) document
CN106126299B (en) Service plug-in processing method and device
EP2869216A1 (en) Related content retrieval device and related content retrieval method
CN101799890B (en) Certificate data processing method and system
CN112948474B (en) Data export method, device, equipment and computer readable storage medium
JP6095487B2 (en) Question answering apparatus and question answering method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province

Patentee after: Yinjiang Technology Co.,Ltd.

Address before: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province

Patentee before: ENJOYOR Co.,Ltd.