EP1567942A2 - Method for encoding an xml-based document - Google Patents

Method for encoding an xml-based document

Info

Publication number
EP1567942A2
EP1567942A2 EP03789106A EP03789106A EP1567942A2 EP 1567942 A2 EP1567942 A2 EP 1567942A2 EP 03789106 A EP03789106 A EP 03789106A EP 03789106 A EP03789106 A EP 03789106A EP 1567942 A2 EP1567942 A2 EP 1567942A2
Authority
EP
European Patent Office
Prior art keywords
codes
xml
content
mixed
document
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
EP03789106A
Other languages
German (de)
French (fr)
Inventor
Jörg Heuer
Andreas Hutter
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens AG
Original Assignee
Siemens AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from DE10339971A (DE10339971A1)
Application filed by Siemens AG
Publication of EP1567942A2

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N21/2353 Processing of additional data, e.g. scrambling of additional data or processing content descriptors specifically adapted to content descriptors, e.g. coding, compressing or processing of metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/143 Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream

Definitions

  • the invention relates to a method for coding an XML-based document that contains content according to an XML schema language definition, a corresponding decoding method and corresponding coding and decoding devices.
  • XML extensible markup language
  • XML schema A more detailed description of the XML schema as well as the structures, data types and content models used in it can be found in references [1], [2] and [3].
  • the object of the invention is therefore to provide a method for coding XML-based documents that enables simpler access to coded textual content of the "Complex Type" data type with the "mixed" content model.
  • a coded binary representation of an XML-based document is generated by assigning binary structure codes to the contents of the document via coding tables, with structure codes being assigned to textual contents of a "Complex Type" data type with the "mixed" content model.
  • the structure codes are the schema branch codes (SBC) defined in section 7.6.1 of document [4]. The assignment of structure codes to the contents of the document as described in [4] enables the position of these contents in the structure of the XML document to be signaled or addressed.
  • the invention essentially consists in treating the textual content of a "complex type" with the "mixed" content model like an element declaration in the type definition when codes are assigned. Accordingly, in addition to the elements declared in a type definition, a fixed structure code is also assigned to the textual content if a mixed content model is defined for the type, which means that textual content is addressed in the coded data stream so that it can be accessed without having to decode the entire data stream.
  • the structure codes are assigned to the textual contents of a "complex type" data type with the "mixed" content model exclusively via OperandTBC coding tables.
  • "position codes" are additionally assigned to the textual contents of a "complex type" data type with the "mixed" content model. These are the position codes described in section 7.6.5.5 of document [4]. Since a "Complex Type" data type with the "mixed" content model can contain several textual contents, this conveys where the textual contents are located within the data type.
  • “single element position codes” and / or “multiple element position codes” are used in the assignment of the "position codes”. These position codes are described in more detail in the publication [4], section 7.6.5.5.
  • Single element position codes are used in particular when no "model group" in the type definition of the "complex type" can occur more than once in the XML schema definition. A definition of the "model group" can be found in document [2].
  • the single element position code determines the position of a content with respect to a particular particle in an instantiation of a data type. A definition for particles can also be found in document [2].
  • the single element position code is encoded on the assumption that the textual content is declared at most MPA + 1 times, where MPA is the number of all possible particle instantiations in this data type.
  • a multiple element position code is used if "model groups" in the definition of the "complex type" can occur more than once in the XML schema definition.
  • the multiple element position code is coded on the assumption that a total of 2 * MPA + 1 positions can be addressed, this code representing the position of the content with respect to all particles in an instantiation of a data type.
  • the position codes are coded with codes of variable length, in particular with the code vluimsbf5, which is described in document [4], section 4.3.
  • the invention also comprises a decoding method with which a binary representation of an XML-based document coded according to the coding method described above is decoded.
  • binary representations of textual contents of a "Complex Type" data type with the "mixed" content model, to which structure codes (SBC) were assigned during coding, are converted into the textual contents of the XML-based document assigned to the structure codes (SBC).
  • the structure codes (SBC) are assigned via OperandTBC coding tables.
  • SBC structure codes
  • binary representations of textual content of a "Complex Type" data type with the "mixed” content model, addressed with “Position Codes”, are converted into textual content at the assigned position.
  • the "Position Codes" can in turn include "Single Element Position Codes" and/or "Multiple Element Position Codes".
  • These position codes are the same position codes as are defined in relation to the coding method.
  • the "position codes” can also be coded with codes of variable length, these codes being decoded when the position codes are converted into textual content.
  • the position codes are preferably coded with the code vluimsbf5.
  • the invention further comprises an encoding and decoding method, which comprises the encoding method according to the invention and the decoding method according to the invention.
  • the invention relates to a device for coding XML-based documents with which the coding method according to the invention can be carried out, the device comprising a storage means in which at least one assignment of a textual content of a data type "complex type” with the content model "mixed" to a structure code is stored.
  • the invention relates to a device for decoding a coded binary representation of an XML-based document, the device being set up in such a way that the decoding method according to the invention can be carried out.
  • the device comprises a storage means in which at least one assignment of a structure code to a textual content of a data type "complex type” with the content model "mixed” is stored.
  • the invention relates to a device for coding and decoding an XML-based document, comprising the above-described coding device according to the invention and the above-described decoding device according to the invention.
  • Figure 1 is a schematic diagram of an encoding and decoding system according to the invention with encoder and decoder;
  • FIG. 2 shows an XML schema definition in which, among other things, a data type “complex type” with a content model “mixed” is defined;
  • FIG. 3 shows an XML document in which an element “MixedElement” declared in the XML schema definition of FIG. 2 is instantiated;
  • FIG. 4 shows a graphical representation of the structure of the element “MixedElement” instantiated in the XML document of FIG. 3;
  • FIG. 5 shows an illustration to explain the assignment of structure codes in the case of data types “complex type” with content model “mixed”;
  • FIG. 6 shows an illustration to explain the assignment of "position codes" for "complex type" data types with content model "mixed".
  • FIG. 1 shows an example of a coding and decoding system with an encoder ENC and a decoder DEC, with which XML documents DOC are coded or decoded.
  • Both the encoder and the decoder have a so-called XML schema S, in which the elements and types of the XML document used for communication are declared and defined.
  • Code tables CT are generated from the schema S via corresponding schema compilations SC in the encoder and decoder. When the XML document DOC is encoded, binary codes are assigned to the contents of the XML document via the code tables. This creates a binary representation BDOC of the document DOC, which can be decoded again in the decoder using the code table CT.
  • the method according to the invention is characterized in that binary structure codes are assigned to textual contents of a "complex type" data type with the "mixed" content model. This enables the textual data to be filtered out of the binary representation BDOC without the entire binary representation BDOC having to be decoded.
  • FIG. 2 shows an example of a schema S, an element with the name “Example” being declared in lines 4 to 10 of this schema, which in turn contains an element of the name "MixedElement" of the type "MixedType". In the lines 12 to 17 the type "MixedType” is defined. This is a “complex type” data type with the content model "mixed", which can be seen in particular from line 12.
  • the "MixedType” type contains two elements with the names "firstElement” and “secondElement", both of which are of the type "string”.
  • FIG. 3 shows an instantiation of the "MixedElement” element in an XML document. Since the content model "mixed" can contain textual content in the form of strings, textual content can occur before, after or between the first and second elements “firstElement” and “secondElement". In the example in FIG. 3, a total of three textual contents occur.
  • Any document based on the XML language can be represented by a so-called tree structure, the contents of the XML document forming nodes in the tree structure and so-called context paths leading to these nodes.
  • Binary structure codes are assigned to the nodes of the tree structure during coding.
  • for the element node "MixedElement" shown in FIG. 4, a structure code is assigned for the parent node and for each of the elements "firstElement" and "secondElement" connected to the node of the element "MixedElement".
  • not only is a structure code assigned for the parent node and the elements "firstElement" and "secondElement", but a structure code is also assigned for the textual content. This is illustrated in FIG. 5: code 00 is assigned to the parent node, code 01 is assigned to the textual content, and codes 10 and 11 are assigned to "firstElement" and "secondElement", respectively.
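
The code assignment 00, 01, 10, 11 described in the last entry can be sketched as follows. This is a minimal illustration, not the SBC assignment procedure of document [4]: the function name `build_code_table`, the placeholder branch names `<parent>` and `<text>`, and the fixed-width numbering in declaration order are all assumptions made for the example.

```python
from math import ceil, log2

def build_code_table(element_names, mixed):
    """Assign fixed-length binary codes to the branches of one complex type.

    Hypothetical helper: the branch order (parent, text, declared elements)
    simply mirrors the example codes 00, 01, 10, 11 given in the text.
    """
    branches = ["<parent>"]
    if mixed:
        # Treat textual content like one more declared element.
        branches.append("<text>")
    branches.extend(element_names)
    # Smallest bit width that can distinguish all branches.
    width = max(1, ceil(log2(len(branches))))
    return {name: format(i, f"0{width}b") for i, name in enumerate(branches)}

table = build_code_table(["firstElement", "secondElement"], mixed=True)
for name, code in table.items():
    print(code, name)
```

With `mixed=False` the same helper produces no entry for textual content, which corresponds to the situation the invention sets out to improve.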

Abstract

The invention relates to a method for encoding an XML-based document (DOC) whose contents conform to an XML schema language definition. In this method, an encoded binary representation (BDOC) of the document is produced by assigning binary structure codes (SBC) to the contents of the document using coding tables (CT), with structure codes (SBC) being assigned to textual contents of a "complex type" data type with the "mixed" content model.

Description

Method for coding an XML-based document
The invention relates to a method for coding an XML-based document that contains content conforming to an XML schema language definition, to a corresponding decoding method, and to corresponding coding and decoding devices.
XML (extensible markup language) is a language that enables a structured description of the contents of a document by means of XML schema language definitions. A more detailed description of XML Schema and of the structures, data types and content models used in it can be found in references [1], [2] and [3].
Methods for coding XML-based documents in which the document is converted into a coded binary representation are known from the prior art. For example, document [4], which was produced in the course of developing an MPEG-7 coding standard, describes methods for encoding and decoding XML-based documents.
The prior-art methods for generating a binary representation of XML-based documents have disadvantages when coding "Complex Type" data types with the "mixed" content model, because in addition to elements these data types can contain textual content, which can only be reconstructed by decoding the entire data stream. A more detailed description of the "Complex Type" data type and of the "mixed" content model can be found in document [1].
The object of the invention is therefore to provide a method for coding XML-based documents that enables simpler access to coded textual content of the "Complex Type" data type with the "mixed" content model.
This object is achieved by the independent claims. Further developments of the invention are defined in the dependent claims.
In the coding method according to the invention, a coded binary representation of an XML-based document is generated by assigning binary structure codes to the contents of the document via coding tables, with structure codes being assigned to textual contents of a "Complex Type" data type with the "mixed" content model. The structure codes are the schema branch codes (SBC) defined in section 7.6.1 of document [4]. The assignment of structure codes to contents of the document described in [4] allows the position of these contents in the structure of the XML document to be signaled or addressed.
The invention essentially consists in treating the textual content of a "Complex Type" with the "mixed" content model like an element declaration in the type definition when codes are assigned. Accordingly, for coding, a fixed structure code is assigned not only to the elements declared in a type definition but also to the textual content whenever a mixed content model is defined for the type. Textual contents are thereby addressed in the coded data stream, so that they can be accessed without the entire data stream having to be decoded.
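
To make the benefit concrete, the following sketch filters textual content out of a coded sequence by looking only at structure codes. It is a toy model under stated assumptions: the record layout (a code string paired with payload bytes), the code values, and the names `TEXT_CODE`, `encode` and `extract_text` are hypothetical and do not reflect the bitstream format of document [4].

```python
TEXT_CODE = "01"  # hypothetical structure code reserved for textual content

def encode(nodes, table):
    """Turn (branch name, string value) pairs into (code, payload bytes) records."""
    return [(table[name], value.encode("utf-8")) for name, value in nodes]

def extract_text(records):
    """Return only the textual contents; payloads of other records stay undecoded."""
    return [payload.decode("utf-8") for code, payload in records if code == TEXT_CODE]

# Hypothetical code table in which text has its own entry next to the elements.
table = {"<text>": "01", "firstElement": "10", "secondElement": "11"}
nodes = [
    ("<text>", "Intro "),
    ("firstElement", "one"),
    ("<text>", " middle "),
    ("secondElement", "two"),
    ("<text>", " outro"),
]
records = encode(nodes, table)
print(extract_text(records))
```

Because the text carries its own code, the filter never has to interpret the element payloads, which is the access pattern the paragraph above describes.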
In a preferred embodiment of the coding method according to the invention, the structure codes are assigned to the textual contents of a "Complex Type" data type with the "mixed" content model exclusively via OperandTBC coding tables. These coding tables define the codes of the OperandTBCs, i.e. the TBCs (TBC = Tree Branch Code) of the operand nodes. A precise description and the definitions of OperandTBCs and operand nodes can be found in sections 7.6.1 and 7.6.5.2 of document [4].
In a particularly preferred embodiment, "Position Codes" are additionally assigned to the textual contents of a "Complex Type" data type with the "mixed" content model. These are the Position Codes described in more detail in section 7.6.5.5 of document [4]. Since a "Complex Type" data type with the "mixed" content model can contain several textual contents, this conveys where the textual contents are located within the data type.
In a particularly preferred embodiment, "Single Element Position Codes" and/or "Multiple Element Position Codes" are used when assigning the "Position Codes". These Position Codes are described in more detail in document [4], section 7.6.5.5. Single Element Position Codes are used in particular when no "Model Group" in the type definition of the "Complex Type" can occur more than once in the XML schema definition. A definition of the "Model Group" can be found in document [2]. The Single Element Position Code determines the position of a content with respect to a respective particle in an instantiation of a data type. A definition of particles can likewise be found in document [2]. The Single Element Position Code is coded on the assumption that the textual content is declared at most MPA+1 times, where MPA denotes the number of all particle instantiations possible in this data type. A Multiple Element Position Code is used when "Model Groups" in the definition of the "Complex Type" can occur more than once in the XML schema definition. The Multiple Element Position Code is coded on the assumption that a total of 2*MPA+1 positions can be addressed, this code representing the position of the content with respect to all particles in an instantiation of a data type.
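
The two address-space sizes can be illustrated for a type with two element particles (MPA = 2). The sketch below is one plausible reading, not the procedure of section 7.6.5.5 of document [4]: it assumes that a text position can be expressed as the number of element particles preceding the text, and all function names and the sample child list are hypothetical.

```python
def single_element_positions(mpa):
    """Addressable text positions when no model group can repeat: MPA + 1."""
    return mpa + 1

def multiple_element_positions(mpa):
    """Addressable positions when model groups can repeat: 2 * MPA + 1."""
    return 2 * mpa + 1

def text_positions(children):
    """Pair each text chunk with the count of element particles before it."""
    positions = []
    seen_elements = 0
    for kind, value in children:
        if kind == "text":
            positions.append((seen_elements, value))
        else:
            seen_elements += 1
    return positions

mpa = 2  # assumed: two element particles, e.g. firstElement and secondElement
children = [
    ("text", "a"),
    ("element", "firstElement"),
    ("text", "b"),
    ("element", "secondElement"),
    ("text", "c"),
]
print(single_element_positions(mpa), multiple_element_positions(mpa))
print(text_positions(children))
```

For MPA = 2 the single-element case has three slots (before, between and after the two elements), which matches the MPA+1 bound stated above.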
In a further preferred embodiment, the Position Codes are coded with variable-length codes, in particular with the code vluimsbf5, which is described in document [4], section 4.3.
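
A variable-length code of this kind can be sketched as follows. The sketch assumes the usual reading of vluimsbf5 in the MPEG-7 Systems specification: the value is written in n groups of 4 bits, most significant bit first, preceded by n-1 one bits and a terminating zero bit. That reading should be checked against section 4.3 of document [4]; the function names are hypothetical.

```python
def encode_vluimsbf5(value):
    """Encode a non-negative integer as a bit string (assumed vluimsbf5 layout)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    # Number of 4-bit groups needed for the value (at least one).
    chunks = max(1, (value.bit_length() + 3) // 4)
    # Extension prefix: chunks - 1 ones followed by a single zero.
    prefix = "1" * (chunks - 1) + "0"
    return prefix + format(value, f"0{chunks * 4}b")

def decode_vluimsbf5(bits):
    """Decode one value from the start of a bit string; return (value, bits used)."""
    chunks = bits.index("0") + 1      # position of the first zero gives the group count
    start = chunks                    # payload starts right after the prefix
    end = start + chunks * 4
    return int(bits[start:end], 2), end

for n in (0, 5, 15, 16, 300):
    code = encode_vluimsbf5(n)
    print(n, code, decode_vluimsbf5(code))
```

Small position values therefore cost 5 bits, and each additional 4-bit group costs one extra prefix bit.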
In addition to the coding method described above, the invention also comprises a decoding method with which a binary representation of an XML-based document coded according to the coding method described above is decoded. In this decoding method, binary representations of textual contents of a "Complex Type" data type with the "mixed" content model, to which structure codes (SBC) were assigned during coding, are converted into the textual contents of the XML-based document assigned to the structure codes (SBC).
Analogously to the coding method, in a preferred embodiment the structure codes (SBC) are assigned via OperandTBC coding tables.
In a preferred embodiment, binary representations of textual contents of a "Complex Type" data type with the "mixed" content model that are addressed with "Position Codes" are furthermore converted into textual contents at the assigned position. The "Position Codes" can in turn comprise "Single Element Position Codes" and/or "Multiple Element Position Codes". These Position Codes are the same Position Codes as defined in relation to the coding method. Analogously to the coding method, the "Position Codes" can also be coded with variable-length codes, these codes being decoded when the Position Codes are converted into textual contents. The Position Codes are preferably coded with the code vluimsbf5.
In addition to the coding and decoding methods described above, the invention further comprises a coding and decoding method that comprises the coding method according to the invention and the decoding method according to the invention.
The invention furthermore relates to a device for coding XML-based documents with which the coding method according to the invention can be carried out, the device comprising a storage means in which at least one assignment of a textual content of a "Complex Type" data type with the "mixed" content model to a structure code is stored. Analogously, the invention relates to a device for decoding a coded binary representation of an XML-based document, the device being set up in such a way that the decoding method according to the invention can be carried out. The device comprises a storage means in which at least one assignment of a structure code to a textual content of a "Complex Type" data type with the "mixed" content model is stored.
The invention furthermore relates to a device for coding and decoding an XML-based document, comprising the coding device according to the invention described above and the decoding device according to the invention described above.
Exemplary embodiments of the invention are explained below with reference to the accompanying drawings.
The figures show:
Figure 1: a schematic diagram of a coding and decoding system according to the invention with encoder and decoder;
Figure 2: an XML schema definition in which, among other things, a "Complex Type" data type with the "mixed" content model is defined;
Figure 3: an XML document in which an element "MixedElement" declared in the XML schema definition of Fig. 2 is instantiated;
Figure 4: a graphical representation of the structure of the element "MixedElement" instantiated in the XML document of Fig. 3;
Figure 5: an illustration explaining the assignment of structure codes for "Complex Type" data types with the "mixed" content model; and
Figure 6: an illustration explaining the assignment of "Position Codes" for "Complex Type" data types with the "mixed" content model.
Figure 1 shows, by way of example, an encoding and decoding system with an encoder ENC and a decoder DEC, with which XML documents DOC are encoded and decoded, respectively. Both the encoder and the decoder have a so-called XML schema S, in which the elements and types of the XML document used for communication are declared and defined. From the schema S, code tables CT are generated in the encoder and in the decoder by corresponding schema compilations SC. When the XML document DOC is encoded, binary codes are assigned to the contents of the XML document via the code tables. This produces a binary representation BDOC of the document DOC, which can be decoded again in the decoder using the code table CT.
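The arrangement of Figure 1 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a schema compilation amounts to numbering the declared names and giving each a fixed-length binary code; the function names `compile_code_table`, `encode` and `decode` and the list `schema_symbols` are hypothetical and do not come from the patent or from any standard. The point it shows is only that encoder and decoder derive identical tables from the same schema S, so no table has to be transmitted with BDOC.

```python
def compile_code_table(symbols):
    """Hypothetical schema compilation: give each declared name a fixed-length code."""
    width = max(1, (len(symbols) - 1).bit_length())
    return {name: format(index, f"0{width}b") for index, name in enumerate(symbols)}


def encode(names, table):
    """Concatenate the binary codes of the given names into one bit string."""
    return "".join(table[name] for name in names)


def decode(bits, table, width):
    """Split the bit string into fixed-width codes and map them back to names."""
    reverse = {code: name for name, code in table.items()}
    return [reverse[bits[i:i + width]] for i in range(0, len(bits), width)]


# Names assumed to be declared in the shared schema S (illustrative only).
schema_symbols = ["Example", "MixedElement", "firstElement", "secondElement"]

encoder_table = compile_code_table(schema_symbols)  # built in the encoder
decoder_table = compile_code_table(schema_symbols)  # built independently in the decoder

bdoc = encode(["Example", "MixedElement", "firstElement"], encoder_table)
decoded = decode(bdoc, decoder_table, width=2)
```

Because both sides compile the same schema, `decoded` reproduces the original name sequence.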
The method according to the invention is characterized in that binary structure codes are assigned to textual contents of a "Complex Type" data type with the content model "mixed". This makes it possible to filter the textual data out of the binary representation BDOC without having to decode the entire binary representation BDOC.
Figure 2 shows an example schema S. In lines 4 to 10 of this schema, an element named "Example" is declared, which in turn contains an element named "MixedElement" of type "MixedType". The type "MixedType" is defined in lines 12 to 17. It is a "Complex Type" data type with the content model "mixed", as can be seen in particular from line 12. The type "MixedType" contains two elements named "firstElement" and "secondElement", both of type "string".
Figure 3 shows an instantiation of the element "MixedElement" in an XML document. Since the content model "mixed" permits textual content in the form of strings, textual content can occur before, after, or between the first and second elements "firstElement" and "secondElement". In the example of Figure 3, three textual contents occur in total.
Figure 4 again illustrates the structure of the element "MixedElement" instantiated in Figure 3, this time as a tree. Five further nodes depend on the topmost MixedElement/mixedType node in a first hierarchy level; these comprise both the textual contents and the elements "firstElement" and "secondElement". In a second hierarchy level, the elements "firstElement" and "secondElement" additionally contain the corresponding contents "Content of firstElement" and "Content of secondElement".
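The five first-level nodes described for Figure 4 can be reproduced with Python's standard `xml.etree.ElementTree` module. The instance below is an assumption: the figures are not reproduced in this text, so the three text strings "text A", "text B" and "text C" are placeholders, and only the element names and the two element contents are taken from the description. The helper `child_nodes` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder instance of "MixedElement"; the three text strings are invented.
xml_doc = (
    "<MixedElement>text A"
    "<firstElement>Content of firstElement</firstElement>text B"
    "<secondElement>Content of secondElement</secondElement>text C"
    "</MixedElement>"
)


def child_nodes(element):
    """List the first-level nodes of a mixed-content element in document order."""
    nodes = []
    if element.text:  # text before the first child element
        nodes.append(("text", element.text))
    for child in element:
        nodes.append(("element", child.tag))
        if child.tail:  # text directly following this child element
            nodes.append(("text", child.tail))
    return nodes


root = ET.fromstring(xml_doc)
nodes = child_nodes(root)
```

ElementTree stores text before the first child in `text` and text after a child in that child's `tail`, which is why the helper reads both attributes to recover the three text nodes alongside the two element nodes.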
Any document based on the XML language can be represented by a so-called tree structure, in which the contents of the XML document form nodes and so-called context paths lead to these nodes. During encoding, binary structure codes are assigned to the nodes of the tree structure. According to the prior art, for the element node "MixedElement" shown in Figure 4, a structure code is assigned to the parent node and to each of the elements "firstElement" and "secondElement". The parent node here is the node connected to the node of the element "MixedElement" in the next higher hierarchy level. In contrast, the method according to the invention assigns a structure code not only to the parent node and the elements "firstElement" and "secondElement" but also to the textual content. This is illustrated in Figure 5, where code 00 is assigned to the parent node, code 01 to the textual content, and codes 10 and 11 to "firstElement" and "secondElement", respectively.
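The benefit of giving textual content its own structure code can be sketched in Python as follows. The code values 00, 01, 10 and 11 are those named for Figure 5; everything else is an assumption made for illustration, in particular the representation of the encoded children as a list of (code, payload) pairs, the placeholder text strings, and the names `STRUCTURE_CODES`, `children`, `records` and `text_only`. The actual bit layout of BDOC is not modelled.

```python
# Structure codes as described for Figure 5.
STRUCTURE_CODES = {
    "parent": "00",
    "text": "01",
    "firstElement": "10",
    "secondElement": "11",
}

# Children of "MixedElement" in document order; the text strings are placeholders.
children = [
    ("text", "text A"),
    ("firstElement", "Content of firstElement"),
    ("text", "text B"),
    ("secondElement", "Content of secondElement"),
    ("text", "text C"),
]

# Simplified stand-in for the encoded stream: one (structure code, payload) pair per node.
records = [(STRUCTURE_CODES[kind], payload) for kind, payload in children]

# Filtering by the text code alone selects the textual content;
# the payloads of the element nodes are never inspected.
text_only = [payload for code, payload in records if code == STRUCTURE_CODES["text"]]
```

Without a dedicated code for text, a filter of this kind would have no marker to select on, which is the gap in the prior art that the paragraph above describes.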
In the method according to the invention, it is also possible to assign "Position Codes" to the individual textual contents, as shown in Figure 6. Since textual content can occur at a total of three positions, three "Position Codes" are needed for this; according to Figure 6, the codes 00, 01, and 10 are used.
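As a small worked example of the position codes, the Python sketch below numbers the three possible text positions with fixed-length two-bit codes, which yields the values 00, 01 and 10 named for Figure 6. The fixed-length scheme, the function name `position_codes`, and the placeholder text strings are assumptions made for illustration only.

```python
def position_codes(n_positions):
    """Fixed-length binary codes for n text positions (illustrative scheme)."""
    width = max(1, (n_positions - 1).bit_length())
    return [format(index, f"0{width}b") for index in range(n_positions)]


# Three possible text positions: before "firstElement", between the two
# elements, and after "secondElement". The text strings are placeholders.
texts = ["text A", "text B", "text C"]

codes = position_codes(len(texts))
text_by_position = dict(zip(codes, texts))
```

With such a mapping, a decoder that has extracted a text node can place it at the correct position among the child elements, here by looking up its two-bit code.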
Bibliography:
[1] http://www.w3.org/TR/2001/REC-xmlschema-0-20010502/
[2] http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/
[3] http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/
[4] ISO/IEC FDIS 15938-1, "Information Technology - Multimedia Content Description Interface - Part 1: Systems", Geneva, 2002

Claims

1. Method for encoding an XML-based document (DOC) that contains contents in accordance with an XML schema language definition, in which: an encoded binary representation (BDOC) of the document is generated by assigning binary structure codes (SBC) to the contents of the document via coding tables (CT), wherein structure codes (SBC) are assigned to textual contents of a "Complex Type" data type with the content model "mixed".
2. Method according to claim 1, in which the structure codes (SBC) are assigned to the textual contents of a "Complex Type" data type with the content model "mixed" exclusively via OperandTBC coding tables.
3. Method according to claim 1 or 2, in which "Position Codes" are additionally assigned to the textual contents of a "Complex Type" data type with the content model "mixed".
4. Method according to claim 3, in which "Single Element Position Codes" (SPC) and/or "Multiple Element Position Codes" (MPC) are used in assigning the "Position Codes".
5. Method according to claim 3 or 4, in which the "Position Codes" are encoded with variable-length codes.
6. Method according to claim 5, in which the "Position Codes" are encoded with the code vluimsbf5.
7. Method for decoding a binary representation of an XML-based document (DOC) encoded with a method according to one of the preceding claims, in which binary representations of textual contents of a "Complex Type" data type with the content model "mixed", to which structure codes (SBC) were assigned during encoding, are converted into the textual contents of the XML-based document assigned to the structure codes (SBC).
8. Method according to claim 7, in which the assignment by structure codes (SBC) is carried out via OperandTBC coding tables.
9. Method according to claim 7 or 8 for decoding a binary representation of an XML-based document (DOC) encoded with a method according to one of claims 3 to 6, in which, furthermore, binary representations of textual contents of a "Complex Type" data type with the content model "mixed" that are addressed by "Position Codes" are converted into textual contents at the assigned position.
10. Method according to claim 9, in which the "Position Codes" comprise "Single Element Position Codes" (SPC) and/or "Multiple Element Position Codes" (MPC).
11. Method according to claim 9 or 10, in which the "Position Codes" are encoded with variable-length codes.
12. Method according to claim 11, in which the "Position Codes" are encoded with the code vluimsbf5.
13. Method for encoding and decoding XML-based documents, comprising a method according to one of claims 1 to 6 and a method according to one of claims 7 to 12.
14. Device for encoding XML-based documents in accordance with a method according to one of claims 1 to 6, comprising a storage means in which at least one assignment of a textual content of a "Complex Type" data type with the content model "mixed" to a structure code (SBC) is stored.
15. Device for decoding an encoded binary representation of an XML-based document in accordance with a method according to one of claims 7 to 12, comprising a storage means in which at least one assignment of a structure code (SBC) to a textual content of a "Complex Type" data type with the content model "mixed" is stored.
16. Device for encoding and decoding an XML-based document (DOC), comprising the device according to claim 14 and the device according to claim 15.
EP03789106A 2002-12-03 2003-12-01 Method for encoding an xml-based document Ceased EP1567942A2 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
DE10256455 2002-12-03
DE10256455 2002-12-03
DE10339971 2003-08-29
DE10339971A DE10339971A1 (en) 2002-12-03 2003-08-29 Method for coding an XML-based document
PCT/EP2003/013511 WO2004051502A2 (en) 2002-12-03 2003-12-01 Method for encoding an xml-based document

Publications (1)

Publication Number Publication Date
EP1567942A2 (en) 2005-08-31

Family

ID=32471494

Family Applications (1)

Application Number Title Priority Date Filing Date
EP03789106A Ceased EP1567942A2 (en) 2002-12-03 2003-12-01 Method for encoding an xml-based document

Country Status (3)

Country Link
EP (1) EP1567942A2 (en)
AU (1) AU2003293743A1 (en)
WO (1) WO2004051502A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7747558B2 (en) 2007-06-07 2010-06-29 Motorola, Inc. Method and apparatus to bind media with metadata using standard metadata headers

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BING L. ET AL: "An Architecture for Multidatabase Systems Based on Corba and XML", 12TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, 3 September 2001 (2001-09-03), PISCATAWAY, NJ, USA, XP010558716 *
BOX D. ET AL: "Essential XML - Beyond Markup", September 2000, ADDISON-WESLEY *

Also Published As

Publication number Publication date
WO2004051502A2 (en) 2004-06-17
AU2003293743A1 (en) 2004-06-23
WO2004051502A3 (en) 2005-02-03

Similar Documents

Publication Publication Date Title
EP1522028B9 (en) Method and devices for encoding/decoding structured documents, especially xml documents
EP2197213B1 (en) Method for improving the functionality of the binary representation of MPEG-7 and other XML-based content descriptions
DE60107964T2 (en) DEVICE FOR CODING AND DECODING STRUCTURED DOCUMENTS
DE60225785T2 (en) PROCESS FOR CODING AND DECODING A PATH IN THE TREE STRUCTURE OF A STRUCTURED DOCUMENT
WO2006005646A1 (en) Method for encoding an xml document, decoding method, encoding and decoding method, coding device, and encoding and decoding device
EP1645133B1 (en) Method for coding structured documents
WO2003001811A1 (en) System for the improved encoding/decoding of structured, particularly xml-based, documents and methods and devices for the improved encoding/decoding of binary representations of such documents
EP1561281A2 (en) Method for the creation of a bit stream from an indexing tree
EP1616274B1 (en) Method for encoding a structured document
EP1400124B1 (en) Method for improving the functions of the binary representation of mpeg-7 and other xml-based content descriptions
DE10339971A1 (en) Method for coding an XML-based document
EP1567942A2 (en) Method for encoding an xml-based document
EP0763920A2 (en) Method for the encoding or decoding of protocol data units
WO2003001404A2 (en) Method for rapidly searching elements or attributes or for rapidly filtering fragments in binary representations of structured documents
EP1717958A2 (en) Method for coding positions of data elements in a data structure
EP0828368B1 (en) Method and system for accessing a multimedia document
DE10351897A1 (en) Method for coding structured documents
DE102004044164A1 (en) Method and device for coding XML documents
DE10248758B4 (en) Methods and devices for encoding / decoding XML documents
EP1787474A1 (en) Method for encoding an xml-based document
DE3523247A1 (en) Device for the data reduction of binary data streams

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20050422

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL LT LV MK

DAX Request for extension of the european patent (deleted)
RIN1 Information on inventor provided before grant (corrected)

Inventor name: HUTTER, ANDREAS

Inventor name: HEUER, JOERG

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED

18R Application refused

Effective date: 20081007