EP1344151A1 - Method for dividing structured documents into several parts - Google Patents

Method for dividing structured documents into several parts

Info

Publication number
EP1344151A1
EP1344151A1 EP01271587A EP01271587A EP1344151A1 EP 1344151 A1 EP1344151 A1 EP 1344151A1 EP 01271587 A EP01271587 A EP 01271587A EP 01271587 A EP01271587 A EP 01271587A EP 1344151 A1 EP1344151 A1 EP 1344151A1
Authority
EP
European Patent Office
Prior art keywords
information
document
main
transmitted
secondary part
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01271587A
Other languages
German (de)
French (fr)
Inventor
Claude Seyrat
Cédric Thienot
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Expway SA
Original Assignee
Expway SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Expway SA
Publication of EP1344151A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986: Document structures and storage, e.g. HTML extensions
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99941: Database schema or data structure
    • Y10S707/99942: Manipulating data structure, e.g. compression, compaction, compilation

Definitions

  • the present invention relates to a method for dividing structured documents into several parts.
  • the documents thus manipulated and transmitted contain several types of information integrated into a structure.
  • a structured document is a collection of sets of information, each associated with a type and attributes, and composed together according to mainly hierarchical relationships. These documents use a structuring language such as SGML, HTML, XML, making it possible in particular to distinguish the different subsets of information composing the document. In contrast, in a so-called linear document, the content information of the document is mixed with the presentation and typing information.
  • a structured document includes separation marks for the different sets of information in the document.
  • in the case of SGML, XML or HTML formats, these marks, called "tags", are of the form "<XXXX>" and "</XXXX>", the first mark indicating the start of a set of information "XXXX" and the second the end of this set.
  • a set of information can be composed of several sets of lower level information.
  • a structured document presents a hierarchical or tree structure diagram, each node representing a set of information and being connected to a node of higher hierarchical level representing a set of information which contains the sets of information of lower level.
  • the nodes located at the end of the branch of this tree structure represent sets of information containing data of a predefined type, which cannot be broken down into subsets of information.
  • a structured document contains separation marks represented in the form of textual or binary data, these marks delimiting sets or subsets of information which may themselves contain other subsets of information delimited by marks.
  • a structured document is associated with a so-called structure schema, which defines, in the form of rules, the structure and the type of information of each set of information in the document.
  • a schema is made up of nested groups of structures of information sets; these groups can be ordered sequences, groups of alternative elements or groups of necessary elements, ordered or unordered.
  • when a structured document must be transmitted, it is compressed beforehand, so as to minimize the volume of data to be transmitted.
  • the document structuring data are also compressed, given that the recipient of the document is assumed to know the structure schema of the document beforehand and can use it to determine at every instant which set of information it will receive next. It is therefore essential that the structure of the transmitted document exactly match the structure schema that the recipient plans to use for receiving and decoding the document; failing this, the recipient cannot determine, in particular, the type of the data transmitted, and is therefore unable to decode them and reconstruct the original document.
  • the structured documents to be transmitted tend to become more and more voluminous. It is envisaged, for example, to transmit or broadcast in this way complete descriptions of cinematographic works or television programs.
  • the object of the present invention is to eliminate this drawback.
  • This objective is achieved by providing a method for dividing a structured document having a hierarchical structure defined by a structure schema, this document grouping together a main set of information including subsets of information, at least some of which may include information subsets of lower hierarchical level, each information subset being associated with a respective type of information.
  • this method comprises the steps of dividing the document into parts that can be handled separately (a main part and at least one secondary part), and assigning a predefined value to the type of information of each subset of information removed from a set of information of higher hierarchical level.
  • each part is understandable in itself and can be decoded, regardless of the division chosen. Furthermore, when such a part is transmitted and the transmission fails, the rest of the document remains valid and the part which is not transmitted correctly can be retransmitted without the need to retransmit the entire document. Furthermore, it is not necessary to have the main and secondary parts upstream of a part in order to be able to decode the latter, since each part is valid and understandable in itself. Thanks to these provisions, a transmitted document can be enriched and modified over time.
  • the document comprises a header which is inserted into each part, this header comprising an indicator whose value indicates whether the document is complete or not.
  • each part comprises a header comprising information giving the location of the part in the hierarchical structure of the document.
  • Said location information of the secondary part in the hierarchical structure of the document advantageously describes a path in this structure, defining the position of the secondary part in the document.
  • Said path can be defined absolutely relative to the main information set of the document. It can also be defined relatively with respect to the position of a last transmitted secondary part.
  • each type of information set to the predefined value is followed by a reference to the secondary part containing the subset of information associated with that type of information, said information on the location of the secondary part in the hierarchical structure of the document being the reference of said secondary part.
  • This method can further comprise the transmission of several parts of the document associated with the same location in the structure.
  • the last part transmitted replaces the previous one which is associated with the same location.
  • the header of each part includes information specifying a method of processing the part with respect to a part associated with the same location in the structure.
  • the structured document is for example of the SGML, XML or HTML type.
  • FIG. 1 represents a tree structure in which each node symbolizes a set or subset of information from a structured document which is normally transmitted in one go;
  • Figure 2 shows the structured document shown in Figure 1 cut into several parts, each of which can be transmitted separately according to the invention
  • Figure 3 shows in more detail the structure of the information contained in a structured document
  • FIG. 4 represents another tree structure illustrating a method of defining the position of a part of the structure, transmitted separately from the rest of the structure.
  • FIG. 1 represents a tree structure comprising a root node 1 decomposed into three nodes of lower rank, of which the first node 1.1 is not decomposed into nodes of lower rank, the second node 1.2 consists of two nodes 1.2.1 and 1.2.2, and the third node 1.3 consists of a single node 1.3.1.
  • the two nodes 1.2.1 and 1.2.2 of the second node 1.2 are attached respectively to one node 1.2.1.1 and to two nodes 1.2.2.1 and 1.2.2.2 of lower rank.
  • This structure represents a structured document D comprising a header H in which are defined a certain number of parameters defining the coding and representation format of the document, and a main body B gathering the information and sets of information constituting the document.
  • a structured document can be transmitted in several separate parts P1, P2, P3, namely a main part P1 and secondary parts P2, P3 which are attached to the main part (Figure 2). Such a transmission is preferably carried out after each part to be transmitted separately has been compressed in an appropriate manner.
  • Each part of the document whether compressed or not, includes a header H, H2, H3 and a main body B1, B2, B3.
  • a main document body B comprises a data header DH and one or more data bodies DB each gathering the information from a subset of information in the document.
  • the DH data header may include a field K making it possible to remove any ambiguity when decoding the document, in particular by giving a number making it possible to define the following set of information, and/or a field containing the number N of occurrences of the DB data body.
  • each DB data body can include a T field indicating the type of information it contains, an L field giving the length of this information in number of bits or bytes, an A field gathering attributes of the information subset and a Val field containing the value or content of the information subset.
  • the Val field can itself contain a DH data header field and one or more fields containing a DB data body.
  • the field T containing the type of information of a DB data body that is not transmitted, or that is removed from the document, receives a predefined value indicating that the following subset of information is not transmitted.
  • This particular predefined value of type of information is for example chosen to be equal to 0 in the case of a document in compressed form, the values of the other types of information being different from 0. If this predefined value appears in the transmitted document, the length L field and the A and Val fields which normally follow the type of information, do not appear in the transmitted data. Consequently, following a type of information equal to the predefined value, the DH header of the next set of information is found in the document or an end of document indicator.
  • Parts P1, P2 and P3 can be transmitted separately one or more times. For this purpose they have a header H, H2, H3 comprising first of all a parameter indicating that the document is not complete, followed by a definition of the location of the transmitted part in the tree structure of the complete document.
  • a structured document can be enriched and modified over time.
  • the transmission of the main part P1 is not necessary since, thanks to the definition of the location appearing in the header of the secondary parts, the processing unit which receives the transmitted secondary parts can determine the location of the received part in the document structure and thus decode it.
  • the document can be split up so that the main part does not contain any useful data, and so that the whole document can be reconstructed from the secondary parts and their location in the document structure.
  • the header H, H2, H3 of the parts P1, P2, P3 can include information specifying a method of processing the part with respect to a part already transmitted and associated with the same location in the structure: for example, whether the transmitted part must replace the part already transmitted for that location, must be ignored if it already appears in the received document, or must be merged with the part already transmitted for that location.
  • this definition of location can include the name of all the upper nodes up to the root node R, possibly associated with a sequence number relative to the upper node.
  • the first node of the first node of the third node of the first node attached to the root node can be referenced as follows: /c/a[last]/b(1)/d
  • the definition of the location of the transmitted document part P2, P3 can simply include a reference to the document part, this reference having been transmitted beforehand in the main part P1 of the document, for example following the predefined value indicating that the following subset of information is not transmitted.
  • the document or the parts P1, P2, P3 of the document to be transmitted are compressed beforehand.
  • the structure information and the content information are distinguished, certain parts of the document possibly comprising no content information.
  • the structure information consists of all the fields except the Val value fields, when these are not structured, that is to say are not decomposable into structured information subsets.
  • these are the Val fields of the information subsets 1.1, 1.2.1.1, 1.2.2.1, 1.2.2.2, and 1.3.1, located at the lower ends of the branches of the tree structure of the document.
  • the compression processing proper consists, for example, in sequentially reading the part of the document to be compressed, applying an appropriate compression algorithm to the structure information, and applying a compression algorithm adapted to the type of information whenever a non-decomposable Val field is encountered while reading the document part. It should be noted that, in the compressed document or document part, the structure information and the content information appear in the same order as in the original uncompressed document. A statistical compression algorithm, such as Zip, can also be applied.
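
The data-body layout and the Zip-style compression mentioned in the list above can be illustrated with a minimal Python sketch. It is not the encoding of the patent: the field widths (one byte for T, two bytes for L), the omission of the A attribute field, and the function names `encode_body`, `encode_removed` and `decode` are all assumptions made for illustration. `zlib` (DEFLATE, the algorithm commonly used inside Zip archives) stands in for the statistical compressor.

```python
import struct
import zlib

REMOVED = 0  # predefined type value: "the following subset is not transmitted"

def encode_body(type_id: int, value: bytes) -> bytes:
    """Serialise one data body as T (1 byte), L (2 bytes, big-endian), Val.
    Field widths are assumptions; the attribute field A is left out."""
    return struct.pack(">BH", type_id, len(value)) + value

def encode_removed() -> bytes:
    """A removed subset: only the predefined type, with no L, A or Val field."""
    return struct.pack(">B", REMOVED)

def decode(stream: bytes):
    """Yield (type_id, value) pairs; value is None for a removed subset."""
    pos = 0
    while pos < len(stream):
        type_id = stream[pos]
        pos += 1
        if type_id == REMOVED:
            yield REMOVED, None      # nothing follows the predefined value
            continue
        (length,) = struct.unpack_from(">H", stream, pos)
        pos += 2
        yield type_id, stream[pos:pos + length]
        pos += length

# Structure and content stay in their original order before compression.
part = encode_body(1, b"title") + encode_removed() + encode_body(2, b"text")
packed = zlib.compress(part)
decoded = list(decode(zlib.decompress(packed)))
```

Because a removed subset is written as the type field alone, the decoder can skip straight to the next data body, which matches the behaviour described for the predefined value.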

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention concerns a method applicable to a structured document (D) having a hierarchical structure defined by a structural schema, and assembling a main data set (1) including data subsets (1.1, 1.2, 1.3, ..., 1.2.2.2), which themselves can include data subsets of lower hierarchical level, each data subset being associated with a respective type of data. Said method comprises the steps of: dividing the document into parts (P1, P2, P3) capable of being handled separately, namely a main part (P1) and at least one secondary part (P2, P3), the main part containing at least the main data set (1), and the secondary part containing a data subset (1.2.1, 1.2.2) which is removed from the main data set, each secondary part being related to the main part or to another secondary part; and assigning a predefined value to the type of data of each data subset (1.2.1, 1.2.2) removed from a data set of higher hierarchical level (1.2).

Description

METHOD FOR DIVIDING STRUCTURED DOCUMENTS INTO MULTIPLE PARTS.
The present invention relates to a method for dividing structured documents into several parts.
It applies in particular, but not exclusively, to the manipulation, transmission, storage and playback of structured multimedia documents, images or sequences of video or digital images, cinematographic works or video programs, and more generally to any transfer of such documents between processing units interconnected by data transmission networks, or between a processing unit and a storage unit, or between a processing unit and a reproduction unit such as a television set in the case of video programs.
More and more frequently, the documents thus manipulated and transmitted contain several types of information integrated into a structure.
A structured document is a collection of sets of information, each associated with a type and attributes, and composed together according to mainly hierarchical relationships. These documents use a structuring language such as SGML, HTML or XML, which makes it possible in particular to distinguish the different subsets of information composing the document. In contrast, in a so-called linear document, the content information of the document is mixed with the presentation and typing information.
A structured document includes marks separating the different sets of information in the document. In the case of SGML, XML or HTML formats, these marks, called "tags", are of the form "<XXXX>" and "</XXXX>", the first mark indicating the start of a set of information "XXXX" and the second the end of this set. A set of information can be composed of several sets of lower-level information. A structured document thus has a hierarchical or tree structure, each node representing a set of information and being connected to a node of higher hierarchical level representing the set of information that contains the lower-level sets. The nodes located at the ends of the branches of this tree structure represent sets of information containing data of a predefined type, which cannot be broken down into subsets of information.
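
As an illustration of this tree view, the following minimal Python sketch parses a small XML document with the standard `xml.etree.ElementTree` module and lists the nodes at the ends of the branches. The element names (`film`, `title`, `cast`, `actor`) and the helper `leaves` are invented for the example and do not come from the patent.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: each <XXXX>...</XXXX> pair delimits one set of information.
XML = ("<film><title>Example</title>"
       "<cast><actor>A</actor><actor>B</actor></cast></film>")

def leaves(node, path=""):
    """Yield (path, text) for every node that has no child elements,
    i.e. the sets of information that cannot be broken down further."""
    here = f"{path}/{node.tag}"
    children = list(node)
    if not children:
        yield here, node.text
    for child in children:
        yield from leaves(child, here)

root = ET.fromstring(XML)
leaf_list = list(leaves(root))
```

Here `film` is the main set of information, `cast` is an intermediate node, and `title` and the two `actor` elements are the leaves holding the actual data.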
Thus, a structured document contains separation marks represented in the form of textual or binary data, these marks delimiting sets or subsets of information which may themselves contain other subsets of information delimited by marks.
A structured document is associated with a so-called structure schema, which defines, in the form of rules, the structure and the type of information of each set of information in the document. A schema is made up of nested groups of structures of information sets; these groups can be ordered sequences, groups of alternative elements or groups of necessary elements, ordered or unordered.
At present, when a structured document must be transmitted, it is compressed beforehand, so as to minimize the volume of data to be transmitted. To make such compression more efficient, the document structuring data are also compressed, given that the recipient of the document is assumed to know the structure schema of the document beforehand and can use it to determine at every instant which set of information it will receive next. It is therefore essential that the structure of the transmitted document exactly match the structure schema that the recipient plans to use for receiving and decoding the document; failing this, the recipient cannot determine, in particular, the type of the data transmitted, and is therefore unable to decode them and reconstruct the original document.
However, the structured documents to be transmitted tend to become more and more voluminous. It is envisaged, for example, to transmit or broadcast in this way complete descriptions of cinematographic works or television programs.
In this context, if a transmission error occurs during the transmission of a document, the recipient may no longer be able to determine which subset is being transmitted, so that the entire document must be transmitted again. In addition, if a cinematographic sequence is to be transmitted and simultaneously displayed on a screen, it may be necessary to respect time slots for the transmission of the various elements of the sequence. Some elements of the sequence must also be able to be transmitted several times, so that a recipient who was not connected when transmission of the sequence began can receive and display the end of it.
It may also be necessary to replace one document part with another, these two parts having the same structure schema.
Retransmitting the entire document would considerably increase the volume of information to be transmitted. It is therefore desirable to be able to divide a document into several parts that are transmitted separately. It turns out that current transmission methods do not allow a document to be transmitted partially.
The object of the present invention is to eliminate this drawback. This objective is achieved by providing a method for dividing a structured document having a hierarchical structure defined by a structure schema, this document grouping together a main set of information including subsets of information, at least some of which may include information subsets of lower hierarchical level, each information subset being associated with a respective type of information.
According to the invention, this method comprises the steps of:
- dividing the document into parts that can be handled separately, namely a main part and at least one secondary part, the main part containing at least the main information set, and the secondary part containing a subset of information that is removed from the main information set, each secondary part being attached to the main part or to another secondary part, and
- assigning a predefined value to the type of information of each subset of information removed from a set of information of higher hierarchical level.
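
The two steps above can be sketched in Python as follows. This is only a hedged illustration: the `Node` class, the `divide` function, the numeric type identifiers and the use of child-index tuples to designate the subsets to remove are all assumptions, and the predefined value is taken to be 0 as in the example given for compressed documents.

```python
from dataclasses import dataclass, field

REMOVED = 0  # predefined type value marking a removed subset

@dataclass
class Node:
    type_id: int                      # type of information (non-zero for real types)
    value: object = None              # leaf content, if any
    children: list = field(default_factory=list)

def divide(root, cut_paths):
    """Split `root` into a main part and secondary parts.

    `cut_paths` is a set of child-index tuples; (1, 0) means
    "first child of the second child of the root".
    Returns (main_part, {path: secondary_subtree}); `root` is not modified.
    """
    secondary = {}

    def copy(node, path):
        if path in cut_paths:
            secondary[path] = node    # the subtree travels as its own part
            return Node(REMOVED)      # placeholder left in the parent set
        return Node(node.type_id, node.value,
                    [copy(c, path + (i,)) for i, c in enumerate(node.children)])

    return copy(root, ()), secondary

# Tree shaped like FIG. 1; the type identifiers are hypothetical.
doc = Node(1, children=[
    Node(2, "a"),                                        # 1.1
    Node(3, children=[                                   # 1.2
        Node(4, children=[Node(5, "b")]),                # 1.2.1
        Node(4, children=[Node(5, "c"), Node(5, "d")]),  # 1.2.2
    ]),
    Node(6, children=[Node(5, "e")]),                    # 1.3
])
main, parts = divide(doc, {(1, 0), (1, 1)})
```

After the call, `main` still contains node 1.2, but its two children are placeholders of type 0, and `parts` holds the subtrees 1.2.1 and 1.2.2 as secondary parts. This simple version does not cut a secondary part into further secondary parts.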
In this way, each part is understandable in itself and can be decoded, regardless of the division chosen. Furthermore, when such a part is transmitted and the transmission fails, the rest of the document remains valid, and the part that was not transmitted correctly can be retransmitted without retransmitting the entire document. Nor is it necessary to have the main and secondary parts upstream of a given part in order to decode it, since each part is valid and understandable in itself. Thanks to these provisions, a transmitted document can be enriched and modified over time.
Advantageously, the document comprises a header which is inserted into each part, this header comprising an indicator whose value indicates whether or not the document is complete.
According to a feature of the invention, each part comprises a header containing information giving the location of the part in the hierarchical structure of the document.
Said location information of the secondary part in the hierarchical structure of the document advantageously describes a path in this structure, defining the position of the secondary part in the document.
Said path can be defined in absolute terms, relative to the main information set of the document. It can also be defined in relative terms, with respect to the position of the last secondary part transmitted.
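
The difference between an absolute and a relative path can be sketched as follows. The text does not specify how a relative path is encoded, so the pair "(levels to go up, steps to go down)" used here is purely an assumption, as are the function names `to_relative` and `to_absolute` and the step names in the example.

```python
def to_relative(previous, target):
    """Encode `target` relative to `previous` as (levels_up, steps_down).
    Both arguments are absolute paths given as tuples of steps."""
    common = 0
    for a, b in zip(previous, target):
        if a != b:
            break
        common += 1
    return len(previous) - common, target[common:]

def to_absolute(previous, relative):
    """Rebuild the absolute path from the last transmitted position."""
    levels_up, steps_down = relative
    return previous[:len(previous) - levels_up] + steps_down

# Hypothetical positions of two secondary parts transmitted in sequence.
last_sent = ("film", "cast", "actor[1]")
next_part = ("film", "cast", "actor[2]")

rel = to_relative(last_sent, next_part)
rebuilt = to_absolute(last_sent, rel)
```

A relative path is shorter when consecutive parts are neighbours in the tree, but it can only be resolved by a receiver that knows the position of the previously transmitted part; an absolute path has no such dependency.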
Alternativement, chaque type d'informations affecté à la valeur prédéfinie est suivi d'une référence à la partie secondaire contenant le sous-ensemble d'informations associé au type d'informations, ladite information d'emplacement de la partie secondaire dans la structure hiérarchique du document étant la référence de ladite partie secondaire.Alternatively, each type of information assigned to the predefined value is followed by a reference to the secondary part containing the subset of information associated with the type of information, said information on the location of the secondary part in the hierarchical structure. of the document being the reference of said secondary part.
Ce procédé peut en outre comprendre la transmission de plusieurs parties du document associées au même emplacement dans la structure. Dans ce cas, la dernière partie transmise remplace la précédente qui est associée au même emplacement.This method can further comprise the transmission of several parts of the document associated with the same location in the structure. In this case, the last part transmitted replaces the previous one which is associated with the same location.
On peut prévoir également que l'entête de chaque partie comprend une information précisant un mode de traitement de la partie par rapport à une partie associée au même emplacement dans la structure.We can also provide that the header of each part includes information specifying a method of processing the part with respect to a part associated with the same location in the structure.
Le document structuré est par exemple de type SGML, XML ou HTML.The structured document is for example of the SGML, XML or HTML type.
A preferred embodiment of the invention will be described below, by way of non-limiting example, with reference to the appended drawings, in which:
Figure 1 represents a tree structure in which each node symbolizes a set or subset of information of a structured document that is normally transmitted in one go;
Figure 2 shows the structured document of Figure 1 divided into several parts, each of which can be transmitted separately according to the invention;
Figure 3 shows in more detail the structure of the information contained in a structured document;
Figure 4 represents another tree structure illustrating a method of defining the position of a part of the structure that is transmitted separately from the rest of the structure.
Figure 1 represents a tree structure comprising a root node 1 decomposed into three lower-rank nodes: the first node 1.1 is not decomposed further, the second node 1.2 consists of two nodes 1.2.1 and 1.2.2, and the third node 1.3 consists of a single node 1.3.1. The two nodes 1.2.1 and 1.2.2 of the second node 1.2 are attached, respectively, to one lower-rank node 1.2.1.1 and to two lower-rank nodes 1.2.2.1 and 1.2.2.2.
This structure represents a structured document D comprising a header H, in which a number of parameters defining the coding and representation format of the document are set, and a main body B gathering the information and information sets that make up the document.
According to the invention, a structured document can be transmitted in several separate parts P1, P2, P3, namely a main part P1 and secondary parts P2, P3 which are attached to the main part (Figure 2). Such a transmission is preferably carried out after each part to be transmitted separately has been suitably compressed. Each document part, whether compressed or not, comprises a header H, H2, H3 and a main body B1, B2, B3. As shown in Figure 3, a main document body B comprises a data header DH and one or more data bodies DB, each gathering the information of one information subset of the document. The data header DH may comprise a field K that removes any ambiguity when the document is decoded, in particular by giving a number that identifies the information set that follows, and/or a field containing the number N of occurrences of the data body DB. Depending on the format used, each data body DB may comprise a field T indicating the type of information it contains, a field L giving the length of this information in bits or bytes, a field A gathering the attributes of the information subset, and a field Val containing the value or content of the information subset. Since the document is structured as a tree, the field Val may itself contain a data header field DH and one or more fields containing a data body DB.
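The nesting of DH and DB fields described above can be pictured with a small data model. This is only an illustrative sketch: the class names, the Python types and the helper `leaf_values` are assumptions made for this example, and the field L is left out because it can be derived from Val.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class DataHeader:
    """Data header DH: optional disambiguating number K and occurrence count N."""
    k: Optional[int] = None
    n: Optional[int] = None


@dataclass
class DataBody:
    """Data body DB: information type T, attributes A and value Val."""
    t: int
    a: dict = field(default_factory=dict)
    # Val is either raw content or a nested body (DH followed by DB fields).
    val: Union[bytes, "Body"] = b""


@dataclass
class Body:
    """A body: one data header DH followed by one or more data bodies DB."""
    dh: DataHeader
    dbs: List[DataBody] = field(default_factory=list)


def leaf_values(body: Body) -> list:
    """Collect the unstructured Val fields in document order."""
    out = []
    for db in body.dbs:
        if isinstance(db.val, Body):
            out.extend(leaf_values(db.val))  # Val is itself structured
        else:
            out.append(db.val)
    return out


# Hypothetical two-level document used only to exercise the model.
inner = Body(DataHeader(n=2), [DataBody(t=3, val=b"x"), DataBody(t=4, val=b"y")])
doc = Body(DataHeader(k=1, n=2),
           [DataBody(t=1, val=b"leaf"), DataBody(t=2, a={"id": "n2"}, val=inner)])
```

Calling `leaf_values(doc)` walks the tree depth-first, which mirrors the statement that the content sits in the Val fields at the ends of the branches.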
It should be noted in this regard that, in the structure diagram shown in Figure 1, the information contained in the document is gathered in the nodes 1.1, 1.2.1.1, 1.2.2.1, 1.2.2.2 and 1.3.1 located at the ends of the branches, as well as in the attribute fields A of the subsets symbolized by all the nodes of the document.
According to the invention, when such a document, whether previously compressed or not, is to be transmitted only in part, the field T containing the information type of a data body DB that is not transmitted, or that is removed from the document, receives a predefined value indicating that the information subset that follows is not transmitted. This predefined information type value is, for example, chosen equal to 0 in the case of a document in compressed form, the values of the other information types being different from 0. If this predefined value appears in the transmitted document, the length field L and the fields A and Val that normally follow the information type do not appear in the transmitted data. Consequently, an information type equal to the predefined value is followed either by the header DH of the next information set in the document or by an end-of-document indicator.
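The effect of the predefined type value can be sketched as a toy serializer. The dictionary layout, the function name `encode` and the flat list of `(field, value)` pairs are assumptions made for illustration; only the rule itself (T set to the predefined value, with L, A and Val omitted) comes from the description. For brevity the sketch does not emit L for structured nodes.

```python
REMOVED = 0  # predefined type value; the description suggests 0 for compressed documents


def encode(node, removed):
    """Flatten `node` into (field, value) pairs.

    A subtree whose id is in `removed` shrinks to a lone T = REMOVED marker:
    no L, A or Val field follows it.
    """
    if node["id"] in removed:
        return [("T", REMOVED)]
    attrs = node.get("a", {})
    val = node["val"]
    if isinstance(val, bytes):
        # Unstructured content: T, L, A, Val in that order.
        return [("T", node["t"]), ("L", len(val)), ("A", attrs), ("Val", val)]
    # Structured content: a data header carrying N, then each child data body.
    fields = [("T", node["t"]), ("A", attrs), ("DH", len(val))]
    for child in val:
        fields += encode(child, removed)
    return fields


# Node 1.2 of Figure 1, reduced to its two children with made-up types and content.
node_1_2 = {
    "id": "1.2", "t": 2,
    "val": [
        {"id": "1.2.1", "t": 3, "val": b"abc"},
        {"id": "1.2.2", "t": 4, "val": b"de"},
    ],
}

main_part = encode(node_1_2, removed={"1.2.1"})      # 1.2.1 is withheld
secondary = encode(node_1_2["val"][0], removed=set())  # 1.2.1 sent on its own
```

In `main_part` the withheld subset occupies a single marker, so a decoder reading T = 0 knows to move straight on to the next data body.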
A parameter may be added to the header H of the document indicating whether or not the document is transmitted in full, so as to tell the recipient whether the document it is receiving is complete.
The parts P1, P2 and P3 can be transmitted separately, once or several times. For this purpose they have a header H, H2, H3 comprising, first, a parameter indicating that the document is not complete, followed by a definition of the location of the transmitted part in the tree structure of the complete document.
In this way, a structured document can be enriched and modified over time.
It should be noted that transmitting the main part P1 is not necessary: thanks to the location definition in the header of the secondary parts, the processing unit that receives the transmitted secondary parts can determine the location of each received part in the structure of the document and thus decode it. In addition, the document can be divided in such a way that the main part contains no useful data and that the whole document can be reconstructed from the secondary parts and their locations in the structure of the document.
In addition, the header H, H2, H3 of the parts P1, P2, P3 may include information specifying how the part is to be processed with respect to an already transmitted part associated with the same location in the structure: for example, whether the transmitted part must replace the already transmitted part associated with the same location, must be ignored if it already appears in the received document, or must be merged with the already transmitted part associated with the same location.
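On the receiving side, the three processing modes could be handled along the following lines. This is a sketch under stated assumptions: the names `Mode` and `receive`, the use of a dictionary keyed by location, and above all the merge rule (a field-by-field dictionary update) are choices made for this example; the description does not say how a merge is carried out.

```python
from enum import Enum


class Mode(Enum):
    REPLACE = "replace"              # the new part supersedes the stored one
    KEEP_EXISTING = "keep_existing"  # the new part is ignored if one is stored
    MERGE = "merge"                  # the new part is combined with the stored one


def receive(store, location, content, mode=Mode.REPLACE):
    """Apply one received part to `store`, a dict mapping location -> fields."""
    if location not in store or mode is Mode.REPLACE:
        store[location] = dict(content)
    elif mode is Mode.MERGE:
        # Assumed merge rule: newer fields win, older fields are kept otherwise.
        store[location] = {**store[location], **content}
    # Mode.KEEP_EXISTING with a stored part: nothing to do.
    return store


store = {}
receive(store, "/c/a", {"x": 1})
receive(store, "/c/a", {"y": 2}, Mode.MERGE)
receive(store, "/c/a", {"x": 9}, Mode.KEEP_EXISTING)
```

After these three calls the entry for `/c/a` holds both `x` and `y`, and the last call leaves it untouched.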
As illustrated in Figure 4, this location definition may comprise the names of all the higher nodes up to the root node R, optionally associated with a sequence number relative to the higher node. For example, the first node of the first node of the third node of the first node attached to the root node (marked in Figure 4 by a succession of arrows starting from the root node R) can be referenced as follows: /c/a[last]/b[1]/d
This notation designates the node of type "d" connected to the first node of type "b", itself connected to the last node of type "a", itself connected to the node of type "c", which is connected directly to the root node R. Other parts of the document can then be transmitted either using the absolute definition method (relative to the root node R) described above or, advantageously, using a relative definition method. Thus, for example, the third node connected to the same immediately higher node as the previous node can be referenced as follows:
../e[2]
This notation indicates that reference is made to the second node of type "e" connected to the same immediately higher-level node, which is referenced by the notation "../". This second method is evidently more compact than the first.
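A receiver could resolve both kinds of path with a short tree walk. The sketch below is illustrative only: the `Node` class, the `resolve` function and the example tree (including the sibling types "f" and "e") are assumptions, and the supported syntax is limited to `name`, `name[k]`, `name[last]` and `..`.

```python
import re


class Node:
    def __init__(self, name, children=()):
        self.name, self.parent, self.children = name, None, list(children)
        for child in self.children:
            child.parent = self


# One path step: a node type, optionally followed by [k] or [last].
STEP = re.compile(r"^(\w+)(?:\[(\d+|last)\])?$")


def resolve(root, path, current=None):
    """Resolve an absolute path from `root`, or a relative one from `current`."""
    node = root if path.startswith("/") else current
    for step in path.strip("/").split("/"):
        if step == "..":
            node = node.parent  # move to the immediately higher node
            continue
        name, index = STEP.match(step).groups()
        same_type = [c for c in node.children if c.name == name]
        if index == "last":
            node = same_type[-1]
        else:
            node = same_type[int(index or 1) - 1]  # no index means the first
    return node


# Hypothetical tree in the spirit of Figure 4.
d, e1, e2 = Node("d"), Node("e"), Node("e")
b = Node("b", [d, e1, e2])
a_last = Node("a", [b])
c = Node("c", [Node("a"), Node("f"), a_last])
root = Node("R", [c])

first = resolve(root, "/c/a[last]/b[1]/d")     # absolute definition
second = resolve(root, "../e[2]", current=first)  # relative to the previous part
```

The relative form only needs the position of the previously transmitted part, which is why it is shorter than repeating the full path from the root.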
Alternatively, the definition of the location of the transmitted document part P2, P3 may simply comprise a reference to the document part, this reference having previously been transmitted in the main part P1 of the document, for example immediately after the predefined value indicating that the information subset that follows is not transmitted.
Preferably, the document or the document parts P1, P2, P3 to be transmitted are compressed beforehand. To this end, structure information and content information are advantageously distinguished in each document part, some document parts possibly containing no content information at all. Thus, in the example of Figures 2 and 3, the structure information consists of all the fields except the value fields Val when these are unstructured, that is, when they cannot be decomposed into structured information subsets. In the example of Figure 2, these unstructured fields are the Val fields of the information subsets 1.1, 1.2.1.1, 1.2.2.1, 1.2.2.2 and 1.3.1, located at the lower ends of the branches of the document's tree structure.
The compression processing proper consists, for example, in reading the document part to be compressed sequentially, applying a suitable compression algorithm to the structure information, and applying a compression algorithm adapted to the information type whenever a non-decomposable Val field is encountered while reading the document part. It should be noted that, in the compressed document or document part, the structure information and the content information appear in the same order as in the original uncompressed document. A statistical compression algorithm, such as Zip, can equally well be applied.
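The sequential, order-preserving scheme can be sketched as follows. Everything specific here is an assumption: the `(kind, type, payload)` tuples, the codec table and its type names, and the use of Python's `zlib` (DEFLATE) as a stand-in for both the structure algorithm and the text codec. The description does not name the actual algorithms beyond mentioning Zip as one option.

```python
import zlib

# Hypothetical codec table: one (compress, decompress) pair per content type T.
CONTENT_CODECS = {
    "text": (zlib.compress, zlib.decompress),
    "raw": (bytes, bytes),  # payload stored as-is, e.g. already compressed media
}


def compress_part(fields):
    """Compress fields one by one, in reading order.

    `fields` is an ordered list of (kind, info_type, payload) tuples where
    kind is "structure" or "content".
    """
    out = []
    for kind, info_type, payload in fields:
        if kind == "structure":
            packed = zlib.compress(payload)  # stand-in structure codec
        else:
            packed = CONTENT_CODECS[info_type][0](payload)  # codec chosen by type
        out.append((kind, info_type, packed))
    return out


def decompress_part(fields):
    """Inverse of compress_part; the field order is unchanged."""
    out = []
    for kind, info_type, packed in fields:
        if kind == "structure":
            payload = zlib.decompress(packed)
        else:
            payload = CONTENT_CODECS[info_type][1](packed)
        out.append((kind, info_type, payload))
    return out


part = [
    ("structure", None, b"T=2 A={} DH=2"),
    ("content", "text", b"hello " * 20),
    ("structure", None, b"T=4 L=4"),
    ("content", "raw", b"\x00\x01\x02\x03"),
]
packed = compress_part(part)
```

Because each field is compressed where it is read, structure and content stay interleaved exactly as in the uncompressed part, and a round trip through `decompress_part` restores the original list.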

Claims

1. Method for dividing a structured document (D) having a hierarchical structure defined by a structure schema, this document gathering a structured main information set (1) that includes information subsets (1.1, 1.2, 1.3, ..., 1.2.2.2), at least some of the information subsets being structured and able to include information subsets of a lower hierarchical level, each information subset being associated, in the information set of the level above, with a respective information type (T), characterized in that it comprises the steps of:
- dividing the document into structured parts (P1, P2, P3) that can be handled individually, namely a main part (P1) and at least one secondary part (P2, P3), the main part containing at least the main information set (1), and the secondary part containing an information subset (1.2.1, 1.2.2) that is removed from the main information set, each secondary part being attached to the main part or to another secondary part, and
- assigning, in each information set (1.2) from which at least one information subset has been removed, a predefined value to the information type (T) of each removed information subset (1.2.1, 1.2.2).
2. Method according to claim 1, characterized in that the document (D) comprises a header (H) which is inserted into each part (P1, P2, P3) removed from the document, this header comprising an indicator whose value indicates whether or not the document is complete.
3. Method according to claim 1 or 2, characterized in that each part (P1, P2, P3) removed from the document comprises a header (H, H2, H3) containing information giving the location of the part in the hierarchical structure of the document.
4. Method according to claim 3, characterized in that said information on the location of the secondary part in the hierarchical structure of the document describes a path in this structure, defining the position of the secondary part in the document.
5. Method according to claim 4, characterized in that said path is defined absolutely, relative to the main information set of the document.
6. Method according to claim 4, characterized in that, each secondary part removed from the main document being transmitted separately from the main part of the document, said path is defined relatively, with respect to the position of the last secondary part transmitted.
7. Method according to claim 3, characterized in that each information type (T) assigned the predefined value and appearing in an information set is followed by a reference to the secondary part (P2, P3) containing the information subset removed from the information set, said information on the location of the secondary part in the hierarchical structure of the document being the reference of said secondary part.
8. Method according to one of claims 1 to 7, characterized in that it further comprises transmitting several parts of the document associated with the same location in the structure, the last part transmitted replacing the previously transmitted part of the document associated with the same location in the structure.
9. Method according to one of claims 1 to 7, characterized in that it further comprises transmitting several parts of the document associated with the same location in the structure, the header of each part comprising information indicating how the part is to be processed with respect to an already transmitted part associated with the same location in the structure.
10. Method according to one of claims 1 to 9, characterized in that the main part and the secondary parts removed from the main part are compressed and then transmitted separately.
11. Method according to claim 10, characterized in that, each information set and subset comprising structure information and content information, the structure information is compressed using a structure information compression algorithm and the content information is compressed using an algorithm adapted to the type (T) of content information, the structure and content information appearing in the compressed document part in the same order as in the corresponding uncompressed document part.
12. Method according to one of claims 1 to 11, characterized in that the document is of the SGML, XML or HTML type.
EP01271587A 2000-12-18 2001-12-14 Method for dividing structured documents into several parts Withdrawn EP1344151A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FR0016507A FR2818409B1 (en) 2000-12-18 2000-12-18 METHOD FOR DIVIDING STRUCTURED DOCUMENTS INTO MULTIPLE PARTS
FR0016507 2000-12-18
PCT/FR2001/004008 WO2002050708A1 (en) 2000-12-18 2001-12-14 Method for dividing structured documents into several parts

Publications (1)

Publication Number Publication Date
EP1344151A1 true EP1344151A1 (en) 2003-09-17

Family

ID=8857802

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01271587A Withdrawn EP1344151A1 (en) 2000-12-18 2001-12-14 Method for dividing structured documents into several parts

Country Status (6)

Country Link
US (2) US7275060B2 (en)
EP (1) EP1344151A1 (en)
JP (1) JP4145144B2 (en)
AU (1) AU2002219311A1 (en)
FR (1) FR2818409B1 (en)
WO (1) WO2002050708A1 (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3944014B2 (en) * 2002-07-09 2007-07-11 株式会社東芝 Document editing method, document editing system, and document processing program
US7838430B2 (en) * 2003-10-28 2010-11-23 Applied Materials, Inc. Plasma control using dual cathode frequency mixing
US7464330B2 (en) * 2003-12-09 2008-12-09 Microsoft Corporation Context-free document portions with alternate formats
US7418652B2 (en) * 2004-04-30 2008-08-26 Microsoft Corporation Method and apparatus for interleaving parts of a document
US7549118B2 (en) * 2004-04-30 2009-06-16 Microsoft Corporation Methods and systems for defining documents with selectable and/or sequenceable parts
US7383500B2 (en) * 2004-04-30 2008-06-03 Microsoft Corporation Methods and systems for building packages that contain pre-paginated documents
US8661332B2 (en) * 2004-04-30 2014-02-25 Microsoft Corporation Method and apparatus for document processing
US7359902B2 (en) * 2004-04-30 2008-04-15 Microsoft Corporation Method and apparatus for maintaining relationships between parts in a package
US7755786B2 (en) 2004-05-03 2010-07-13 Microsoft Corporation Systems and methods for support of various processing capabilities
US8363232B2 (en) * 2004-05-03 2013-01-29 Microsoft Corporation Strategies for simultaneous peripheral operations on-line using hierarchically structured job information
US7580948B2 (en) * 2004-05-03 2009-08-25 Microsoft Corporation Spooling strategies using structured job information
US7519899B2 (en) 2004-05-03 2009-04-14 Microsoft Corporation Planar mapping of graphical elements
US8243317B2 (en) * 2004-05-03 2012-08-14 Microsoft Corporation Hierarchical arrangement for spooling job data
US7617450B2 (en) 2004-09-30 2009-11-10 Microsoft Corporation Method, system, and computer-readable medium for creating, inserting, and reusing document parts in an electronic document
US7584111B2 (en) * 2004-11-19 2009-09-01 Microsoft Corporation Time polynomial Arrow-Debreu market equilibrium
US7617451B2 (en) * 2004-12-20 2009-11-10 Microsoft Corporation Structuring data for word processing documents
US7617229B2 (en) * 2004-12-20 2009-11-10 Microsoft Corporation Management and use of data in a computer-generated document
US20060136816A1 (en) * 2004-12-20 2006-06-22 Microsoft Corporation File formats, methods, and computer program products for representing documents
US7770180B2 (en) * 2004-12-21 2010-08-03 Microsoft Corporation Exposing embedded data in a computer-generated document
US7752632B2 (en) * 2004-12-21 2010-07-06 Microsoft Corporation Method and system for exposing nested data in a computer-generated document in a transparent manner
US8111694B2 (en) 2005-03-23 2012-02-07 Nokia Corporation Implicit signaling for split-toi for service guide
US20060277452A1 (en) * 2005-06-03 2006-12-07 Microsoft Corporation Structuring data for presentation documents
US20070022128A1 (en) * 2005-06-03 2007-01-25 Microsoft Corporation Structuring data for spreadsheet documents
US8176414B1 (en) * 2005-09-30 2012-05-08 Google Inc. Document division method and system
WO2007038844A1 (en) * 2005-10-06 2007-04-12 Smart Internet Technology Crc Pty Ltd Methods and systems for facilitating access to a schema
JP5570202B2 (en) * 2009-12-16 2014-08-13 キヤノン株式会社 Structured document analysis apparatus, structured document analysis method, and computer program
JP5480034B2 (en) * 2010-06-24 2014-04-23 インターナショナル・ビジネス・マシーンズ・コーポレーション Method, program and system for dividing tree structure of structured document

Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5142689A (en) * 1982-09-27 1992-08-25 Siemens Nixdorf Informationssysteme AG Process for the preparation of the connection of one of several data processor devices to a centrally synchronized multiple line system
JPH08255155A (en) * 1995-03-16 1996-10-01 Fuji Xerox Co Ltd Device and method for full-text registered word retrieval
JP3724847B2 (en) * 1995-06-05 2005-12-07 株式会社日立製作所 Structured document difference extraction method and apparatus
WO1997034240A1 (en) * 1996-03-15 1997-09-18 University Of Massachusetts Compact tree for storage and retrieval of structured hypermedia documents
US6061697A (en) * 1996-09-11 2000-05-09 Fujitsu Limited SGML type document managing apparatus and managing method
JPH10143403A (en) * 1996-11-12 1998-05-29 Fujitsu Ltd Information management device and information management program storage medium
WO1998037655A1 (en) * 1996-12-20 1998-08-27 Financial Services Technology Consortium Method and system for processing electronic documents
JPH1185750A (en) * 1997-07-08 1999-03-30 Hitachi Ltd Structured document processing method, structured document processor and computer-readable recording medium recorded with structured document processing program
US6119123A (en) * 1997-12-02 2000-09-12 U.S. Philips Corporation Apparatus and method for optimizing keyframe and blob retrieval and storage
EP0928070A3 (en) * 1997-12-29 2000-11-08 Phone.Com Inc. Compression of documents with markup language that preserves syntactical structure
JP3657424B2 (en) * 1998-03-20 2005-06-08 松下電器産業株式会社 Center device and terminal device for broadcasting program information
US6304578B1 (en) * 1998-05-01 2001-10-16 Lucent Technologies Inc. Packet routing and queuing at the headend of shared data channel
JP2000083059A (en) 1998-07-06 2000-03-21 Jisedai Joho Hoso System Kenkyusho:Kk Index information distributing method, index information distributing device, retrieving device and computer readable recording medium recording program for functioning computer as each means of those devices
JP3460597B2 (en) * 1998-09-22 2003-10-27 日本電気株式会社 Compound document management system, compound document structure management method, and recording medium storing compound document structure management program
JP4003854B2 (en) * 1998-09-28 2007-11-07 富士通株式会社 Data compression apparatus, decompression apparatus and method thereof
CA2255047A1 (en) * 1998-11-30 2000-05-30 Ibm Canada Limited-Ibm Canada Limitee Comparison of hierarchical structures and merging of differences
JP4141556B2 (en) * 1998-12-18 2008-08-27 株式会社日立製作所 Structured document management method, apparatus for implementing the method, and medium storing the processing program
US6377957B1 (en) * 1998-12-29 2002-04-23 Sun Microsystems, Inc. Propogating updates efficiently in hierarchically structured date
US6311187B1 (en) * 1998-12-29 2001-10-30 Sun Microsystems, Inc. Propogating updates efficiently in hierarchically structured data under a push model
US6635089B1 (en) * 1999-01-13 2003-10-21 International Business Machines Corporation Method for producing composite XML document object model trees using dynamic data retrievals
JP2000224257A (en) * 1999-01-29 2000-08-11 Jisedai Joho Hoso System Kenkyusho:Kk Transmitter and receiver
TW428146B (en) * 1999-05-05 2001-04-01 Inventec Corp Data file updating method by increment
US6671853B1 (en) * 1999-07-15 2003-12-30 International Business Machines Corporation Method and system for selectively streaming markup language documents
US6996770B1 (en) * 1999-07-26 2006-02-07 Microsoft Corporation Methods and systems for preparing extensible markup language (XML) documents and for responding to XML requests
US6966027B1 (en) * 1999-10-04 2005-11-15 Koninklijke Philips Electronics N.V. Method and apparatus for streaming XML content
GB2363217B (en) * 2000-06-06 2002-05-08 Oracle Corp Data file processing
US6826726B2 (en) * 2000-08-18 2004-11-30 Vaultus Mobile Technologies, Inc. Remote document updating system using XML and DOM
US6850948B1 (en) * 2000-10-30 2005-02-01 Koninklijke Philips Electronics N.V. Method and apparatus for compressing textual documents

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0250708A1 *

Also Published As

Publication number Publication date
AU2002219311A1 (en) 2002-07-01
JP4145144B2 (en) 2008-09-03
FR2818409B1 (en) 2003-03-14
US20040054669A1 (en) 2004-03-18
US7275060B2 (en) 2007-09-25
US20070277096A1 (en) 2007-11-29
JP2004524606A (en) 2004-08-12
WO2002050708A1 (en) 2002-06-27
FR2818409A1 (en) 2002-06-21

Similar Documents

Publication Publication Date Title
EP1344151A1 (en) Method for dividing structured documents into several parts
EP1358583B1 (en) Method for encoding and decoding a path in the tree structure of a structured document
WO2002063776A2 (en) Method for compressing/decompressing a structured document
EP1316220B1 (en) Method for compressing/decompressing structured documents
FR2945363A1 (en) METHOD AND DEVICE FOR CODING A STRUCTURAL DOCUMENT
EP2015587B1 (en) Method of storing a multimedia object in memory, associated data structure and terminal
EP1592015A2 (en) Method and device for recording or playing back a data stream
FR2936623A1 (en) METHOD FOR ENCODING AND DECODING A STRUCTURED DOCUMENT, CORRESPONDING DEVICES
FR2924244A1 (en) METHOD AND DEVICE FOR ENCODING AND DECODING INFORMATION
EP0593341A1 (en) Query optimisation help method of a relational database management system and resulting syntactic analysis method
FR2931271A1 (en) METHOD AND DEVICE FOR CODING A STRUCTURED DOCUMENT AND METHOD AND DEVICE FOR DECODING A DOCUMENT SO CODED
FR2933793A1 (en) METHODS OF ENCODING AND DECODING, BY REFERENCING, VALUES IN A STRUCTURED DOCUMENT, AND ASSOCIATED SYSTEMS.
FR2926378A1 (en) METHOD AND PROCESSING DEVICE FOR ENCODING A HIERARCHISED DATA DOCUMENT
FR2930661A1 (en) METHOD FOR ACCESSING A PART OR MODIFYING A PART OF A BINARY XML DOCUMENT, ASSOCIATED DEVICES
FR2853797A1 (en) METHOD AND DEVICE FOR PRE-PROCESSING REQUESTS LINKED TO A DIGITAL SIGNAL IN A CUSTOMER-SERVER ARCHITECTURE
WO2017055771A1 (en) Method for encoding streams of video data based on groups of pictures (gop)
WO2002003245A1 (en) Method for storing xml-format information objects in a relational database
FR2821458A1 (en) SCHEME, SYNTAX ANALYSIS METHOD, AND METHOD FOR GENERATING A BINARY STREAM FROM A SCHEME
BE1013153A3 (en) Method and system for information collection.
WO2008074855A1 (en) Method of despatching multimedia products to at least one multimedia unit, method of processing these multimedia products and a multimedia unit for implementing these methods
EP1532549A1 (en) Database model for hierarchical data formats
EP2225853B1 (en) Improved message-based communication system monitor
EP1999649A2 (en) Method of generating a file describing a bit stream, corresponding device and computer program product
Berkman La littérature algorithmique: frontière entre auteur et lecteur
WO2011064493A1 (en) Encoding method and device with error correction suitable for transaction marking

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20030620

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

RIN1 Information on inventor provided before grant (corrected)

Inventor name: THIENOT, CEDRIC

Inventor name: SEYRAT, CLAUDE

17Q First examination report despatched

Effective date: 20031112

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20170701