JP2009093631A

JP2009093631A - Document-encoding apparatus and document-encoding method

Info

Publication number: JP2009093631A
Application number: JP2008226380A
Authority: JP
Inventors: Toru Ishizaki; 透石嵜
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2007-09-20
Filing date: 2008-09-03
Publication date: 2009-04-30
Anticipated expiration: 2028-09-03
Also published as: JP5207886B2

Abstract

PROBLEM TO BE SOLVED: To provide a technique for improving the compression efficiency with respect to a structured document, where a plurality of decimals having fewer digits are described without having to impose a heavy load on analysis processing of the structured document. SOLUTION: Digit counts c' beyond the decimal point of attribute values in a structured document are acquired (S404). The detected attribute values are transformed into numeric character strings representing integers, by manipulating the decimal point positions of the attribute values, according to the maximum number of acquired digits (S406). The transformed numeric character strings, and the maximum number of digits C are encoded (S407). COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a technique for encoding structured documents.

Under the XML specification formulated by the W3C, data described in XML is generally encoded with a character encoding scheme such as UTF-8 or UTF-16. When the data described as attribute values or element content is non-character data such as integers or decimals, encoding it as characters makes it larger than the original data and lengthens the analysis (parsing) time.

Binary XML technologies such as the Fast Infoset specification (ISO/IEC 24824-1) established by ISO can encode attribute values and element content in XML data not only as characters but also in their native data types, such as integers and decimals. This reduces data size and shortens analysis processing time.

Patent Document 1 discloses compression in which character strings that appear repeatedly among the element names, attribute names, and numerical values of XML data are replaced with shorter byte strings.
Patent Document 1: JP 2005-215950 A

However, for data that includes coordinate values, such as graphics, maps, and drawings, the conventional techniques may fail to achieve any compression.

To shorten analysis processing time, conventional binary XML technology encodes decimal values in IEEE 754 format (IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std 754-1985), a data format used on computers. Encoding in IEEE 754 format requires at least 4 bytes, so no compression is obtained for decimals with few digits, such as -0.1 or 0.2. Consequently, for a structured document such as SVG in which many decimals with few digits are described, the analysis processing time can be shortened but little compression is achieved.
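The size argument above can be checked with a minimal sketch. It assumes the text form is stored as UTF-8 and the binary form as IEEE 754 single precision; the helper names `text_size` and `float32_size` are illustrative and do not come from the disclosure.

```python
import struct

def text_size(value: str) -> int:
    """Bytes needed to store the decimal as UTF-8 text."""
    return len(value.encode("utf-8"))

def float32_size(value: str) -> int:
    """Bytes needed to store the same value as an IEEE 754 single."""
    return len(struct.pack(">f", float(value)))

# (text bytes, IEEE 754 bytes) for a few short decimals.
sizes = {v: (text_size(v), float32_size(v)) for v in ["-0.1", "0.2", "0.01"]}
```

For these inputs the binary form is never smaller than the text form, which is the situation the invention addresses.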

The present invention has been made in view of the above problems. Its object is to provide a technique that improves compression efficiency for structured documents containing many decimals with few digits, without imposing a heavy load on the analysis processing of the structured document.

To achieve this object, a document encoding apparatus of the present invention has, for example, the following configuration.

That is, a document encoding apparatus that encodes a structured document comprises:
detecting means for detecting each attribute value in the structured document;
acquisition means for acquiring the number of digits after the decimal point of each attribute value detected by the detecting means;
conversion means for converting each attribute value detected by the detecting means into a numeric character string representing an integer value, by manipulating the decimal point position of each attribute value according to the maximum of the digit counts acquired by the acquisition means; and
encoding means for encoding each numeric character string produced by the conversion means together with the maximum number of digits.

In another configuration, a document encoding apparatus that encodes a structured document comprises:
detecting means for detecting each attribute value in the structured document;
calculation means for calculating difference values between attribute values, in order from the head, in the arrangement order of the attribute values detected by the detecting means;
acquisition means for acquiring the number of digits after the decimal point of each difference value calculated by the calculation means;
conversion means for converting each difference value calculated by the calculation means into a numeric character string representing an integer value, by manipulating the decimal point position of each difference value according to the maximum of the digit counts acquired by the acquisition means; and
encoding means for encoding each numeric character string produced by the conversion means, the attribute value at the head position in the arrangement order of the attribute values detected by the detecting means, and the maximum number of digits,
wherein the encoding means encodes the attribute value at the head position in IEEE 754 format.
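One possible reading of this difference-based configuration is sketched below. It is an assumption-laden illustration: the function name `delta_encode`, the use of Python's `decimal` module for exact differences, and the sample input values are all hypothetical and are not taken from the disclosure or its figures.

```python
import struct
from decimal import Decimal

def delta_encode(values):
    """Return (head value as float32 bytes, C, scaled integer differences)."""
    numbers = [Decimal(v) for v in values]
    # Differences between neighbouring values, taken in order from the head.
    diffs = [b - a for a, b in zip(numbers, numbers[1:])]
    # Digits after the decimal point of each difference; C is their maximum.
    max_digits = max((max(0, -d.as_tuple().exponent) for d in diffs), default=0)
    # Shift the decimal point C places so every difference becomes an integer.
    scaled = [int(d.scaleb(max_digits)) for d in diffs]
    # The head value itself is kept in IEEE 754 single precision.
    head = struct.pack(">f", float(numbers[0]))
    return head, max_digits, scaled

# Hypothetical coordinate sequence used only for illustration.
head, max_digits, scaled = delta_encode(["198.784", "198.9", "199.25"])
```

Here the differences 0.116 and 0.35 share C = 3 and become the integers 116 and 350, while the head value occupies 4 bytes.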

To achieve the object of the present invention, a document encoding method of the present invention has, for example, the following configuration.

That is, a document encoding method performed by a document encoding apparatus that encodes a structured document comprises:
a detection step of detecting each attribute value in the structured document;
an acquisition step of acquiring the number of digits after the decimal point of each attribute value detected in the detection step;
a conversion step of converting each attribute value detected in the detection step into a numeric character string representing an integer value, by manipulating the decimal point position of each attribute value according to the maximum of the digit counts acquired in the acquisition step; and
an encoding step of encoding each numeric character string produced in the conversion step together with the maximum number of digits.

In another configuration, a document encoding method performed by a document encoding apparatus that encodes a structured document comprises:
a detection step of detecting each attribute value in the structured document;
a calculation step of calculating difference values between attribute values, in order from the head, in the arrangement order of the attribute values detected in the detection step;
an acquisition step of acquiring the number of digits after the decimal point of each difference value calculated in the calculation step;
a conversion step of converting each difference value calculated in the calculation step into a numeric character string representing an integer value, by manipulating the decimal point position of each difference value according to the maximum of the digit counts acquired in the acquisition step; and
an encoding step of encoding each numeric character string produced in the conversion step, the attribute value at the head position in the arrangement order of the attribute values detected in the detection step, and the maximum number of digits,
wherein, in the encoding step, the attribute value at the head position is encoded in IEEE 754 format.

According to the configuration of the present invention, compression efficiency can be improved for structured documents containing many decimals with few digits, without imposing a heavy load on the analysis processing of the structured document.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

[First Embodiment]
FIG. 1 is a block diagram showing an example hardware configuration of a computer applicable to the document encoding apparatus according to the present embodiment. The configuration of an apparatus applicable to the document encoding apparatus according to the present embodiment is not limited to that shown in FIG. 1, and those skilled in the art can conceive of various modifications. Furthermore, the document encoding apparatus according to the present embodiment need not be realized by a single apparatus; it may be realized by a plurality of apparatuses operating cooperatively. In that case, the apparatuses are connected to one another via a network such as a LAN.

In FIG. 1, a CPU 101 controls the entire computer 100 using programs and data stored in a ROM 102 and a RAM 103, and executes the processes described below as being performed by the computer 100.

The ROM 102 stores setting data for the computer 100, a boot program, parameter data that does not need to be changed, and the like.

The RAM 103 has an area for temporarily storing programs and data loaded from an external storage device 104, data received from outside via a network interface 106, and the like. The RAM 103 also has a work area that the CPU 101 uses when executing various processes.

The external storage device 104 is a large-capacity information storage device typified by a hard disk drive. It stores an OS (operating system) as well as the programs and data that cause the CPU 101 to execute the processes described below as being performed by the computer 100. It also stores, as files, the structured document data to be processed as described below. The programs and data stored in the external storage device 104 are loaded into the RAM 103 as appropriate under the control of the CPU 101 and are then processed by the CPU 101.

An input interface 105 comprises a keyboard, a mouse, and the like. By operating it, the operator of the computer 100 can input various instructions to the CPU 101.

The network interface 106 connects the computer 100 to a LAN, the Internet, or the like, and the computer 100 can perform data communication with external devices via the network interface 106.

Reference numeral 107 denotes a bus that connects the above units.

The storage devices connectable to the computer 100 are not limited to the external storage device 104; the following may also be provided in the computer 100: a memory card; a flexible disk (FD) or an optical disk such as a Compact Disk (CD) that can be attached to and detached from the computer 100; a magnetic or optical card; an IC card; and the like.

Next, the structured document encoding process performed by the computer 100 is described.

Devices with limited hardware resources, such as mobile phones, digital cameras, and printers, require both smaller XML data (as structured documents) and faster parsing. An encoding technique called binary XML has conventionally been used to address these requirements. Binary XML encodes the structure of XML data, such as elements and attributes, into binary data, and encodes the values of elements and attributes in their native data types, such as integers and decimals. Using binary data makes the data smaller and parsing faster than character encodings such as UTF-8 and UTF-16.

However, a problem arises with structured documents in which many decimals with few digits are described, such as those conforming to the SVG specification (http://www.w3.org/Graphics/SVG) formulated by the W3C.

FIG. 2 shows an example of SVG. In the SVG shown in FIG. 2, many numerical values are described in the keyTimes attribute. These values indicate the timing of animation processing and are decimal values from 0 to 1 that express a proportion of the overall duration. There is no limit on the number of digits, but the values are usually specified as decimals with few digits, such as "0.01" and "0.02" in FIG. 2. In addition, in the SVG shown in FIG. 2, many real values representing coordinate information are described in the d attribute of the path element. The fractional parts of these real values likewise have no limit on the number of digits, but they are usually specified as real values with short fractional parts, such as "3.183" and "-0.911" in FIG. 2.

When binary XML is applied to the SVG of FIG. 2, real values can be encoded in IEEE 754 format. As shown in FIG. 3, the IEEE 754 format consists of a sign, an exponent part, and a mantissa part, and requires at least 4 bytes. FIG. 3 shows the example of 0.01, for which the numeric character string and the IEEE 754 representation have the same length.
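The sign, exponent, and mantissa fields of the 0.01 example can be inspected with a short sketch. It assumes single precision (1 sign bit, 8 exponent bits, 23 mantissa bits); the function name `float32_fields` is illustrative, and FIG. 3 itself is not reproduced here.

```python
import struct

def float32_fields(value: float):
    """Split an IEEE 754 single into (raw bits, sign, biased exponent, mantissa)."""
    # Pack as a big-endian float32, then reinterpret the 4 bytes as an unsigned int.
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # biased by 127
    mantissa = bits & 0x7FFFFF       # 23 stored fraction bits
    return bits, sign, exponent, mantissa

bits, sign, exponent, mantissa = float32_fields(0.01)
```

The value occupies exactly 4 bytes, the same as the 4 characters of the text "0.01".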

The SVG of FIG. 2 contains numeric character strings of roughly 3 to 6 bytes each. When a document contains many such values, applying binary XML shortens the analysis processing time but hardly reduces the data size.

FIG. 4 is a flowchart of the process for encoding SVG. The description of the flowchart of FIG. 4 uses the SVG shown in FIG. 2 as the example. The programs and data that cause the CPU 101 to execute the process according to the flowchart of FIG. 4, and the SVG data serving as the structured document, are stored in the external storage device 104. The CPU 101 loads these programs, data, and SVG data into the RAM 103 and executes the process using them. The computer 100 thereby executes the process according to the flowchart of FIG. 4, described below.

Before the process according to the flowchart of FIG. 4 is executed, the portion of the SVG data to be encoded is designated in advance. No designation is needed if all values are to be encoded.

In the present embodiment, the keyTimes attribute of the SVG data in FIG. 2, in which decimal values with few digits appear consecutively, is designated as the encoding target. That is, the keyTimes attribute of the animateTransform element is designated. The delimiter ";" that separates the decimal values listed in the attribute value is also designated.

Next, the SVG data of FIG. 2 is loaded from the external storage device 104 into the RAM 103. The method by which the SVG data of FIG. 2 is obtained in the RAM 103 is not particularly limited.

The process according to the flowchart of FIG. 4 then starts.

First, in step S401, the SVG data input to the RAM 103 as the encoding target (that is, the structured document) is referred to sequentially, and it is determined whether the referenced portion is the keyTimes attribute of the animateTransform element. If it is not, the process proceeds to step S402. In step S402, the referenced portion is encoded with conventional binary XML encoding. In the example of FIG. 2, the value of the d attribute is encoded with conventional binary XML encoding.

If, on the other hand, the determination in step S401 finds that the referenced portion is the keyTimes attribute of the animateTransform element, the process proceeds to step S403.

In step S403, it is checked whether all numeric character strings (attribute values) in the keyTimes attribute of the animateTransform element have been extracted. If all of them have been extracted, the process proceeds to step S405; otherwise, the process proceeds to step S404.

In step S404, a numeric character string (attribute value) that has not yet been extracted from the keyTimes attribute of the animateTransform element is first extracted. The extraction uses the delimiter ";" designated in advance. In FIG. 2, the values 0, 0.01, 0.02, 0.1, and 1 are separated by ";", so these numeric character strings are extracted. Alternatively, without using a designated delimiter, the character string of the attribute value can be checked sequentially from the beginning, and each run of characters that forms a valid numeric character string can be detected as one numeric character string.

The number of digits c' after the decimal point of the extracted numeric character string is then acquired. The digit count c' can be calculated by counting the characters that follow the decimal point character '.' up to the end of the attribute value.

In other words, when the process advances from step S403 to step S405, the number of digits c' after the decimal point has been acquired for every numeric character string (attribute value) in the keyTimes attribute of the animateTransform element.

In step S405, the maximum value (maximum number of digits) C is obtained from among the digit counts c' acquired in step S404 for all numeric character strings (attribute values) in the keyTimes attribute of the animateTransform element.
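Steps S404 and S405 can be sketched as follows. This is a minimal illustration that assumes the keyTimes attribute value is the string "0;0.01;0.02;0.1;1" (FIG. 2 is not reproduced here); the helper names `split_values` and `fraction_digits` are hypothetical.

```python
def split_values(attribute_value: str, delimiter: str = ";") -> list:
    """Extract the numeric character strings separated by the designated delimiter."""
    return [tok.strip() for tok in attribute_value.split(delimiter) if tok.strip()]

def fraction_digits(numeric: str) -> int:
    """Count the characters after the decimal point (c'); 0 if there is none."""
    _, dot, fraction = numeric.partition(".")
    return len(fraction) if dot else 0

key_times = "0;0.01;0.02;0.1;1"                         # assumed attribute value
values = split_values(key_times)                        # S404: extraction
digit_counts = [fraction_digits(v) for v in values]     # S404: c' per value
max_digits = max(digit_counts)                          # S405: C
```

For this input the digit counts are 0, 2, 2, 1, and 0, so C = 2, matching the value used in the embodiment.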

After the numeric character strings have been extracted, the integer encoding method for the extracted attribute values (decimal values) is determined. To turn a decimal value into an integer, the decimal point position must be temporarily moved (manipulated), and the moved decimal point is restored at analysis time. This processing is an overhead compared with encoding in IEEE 754 format. To reduce this overhead, the present embodiment uses the same integer encoding format throughout a structure portion designated in advance. If the overhead is negligible, different integer encoding formats may be used.

Each detected decimal value can be made an integer by moving its decimal point at least C places toward the low-order side in decimal notation. The method of encoding the integer value obtained by multiplying the decimal value by 10^C is therefore adopted as the common integer encoding method for this structure portion. In the present embodiment, C = 2 as shown in FIG. 5, so the method of encoding the integer value obtained by multiplying the decimal value by 10^2 is adopted as the common integer encoding method for this structure portion. FIG. 5 illustrates the encoding of each numeric character string.

Once the integer encoding method has been determined, each detected decimal value is encoded. One way to do this is to convert the decimal numeric character string to IEEE 754 format, multiply it by a power of 10 to obtain an integer, and encode the result. However, the intermediate conversion to IEEE 754 format carries a large overhead, so in the present embodiment the decimal value is converted to an integer while still in the form of a numeric character string.

Accordingly, in step S406, a numeric character string representing the integer value obtained by multiplying the numeric character string by 10^C is generated. In other words, the decimal point is removed from the numeric character string, and (C - c') zeros are appended at the low-order end of the resulting integer, where c' is the digit count acquired for that numeric character string and C is the maximum number of digits.

Performing the process of step S406 on all numeric character strings yields the conversions shown in FIG. 5: "0" → "000", "0.01" → "001", "0.02" → "002", "0.1" → "010", and "1" → "100".
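The string-level conversion of step S406 can be sketched as below. The function name `to_integer_string` is hypothetical; the logic follows the description above (drop the decimal point, then append C - c' zeros) and never converts the text to a floating-point number.

```python
def to_integer_string(numeric: str, max_digits: int) -> str:
    """Multiply a decimal string by 10**max_digits using only string operations."""
    integer_part, _, fraction = numeric.partition(".")
    # Remove the decimal point, then pad with (C - c') zeros at the low-order end.
    return integer_part + fraction + "0" * (max_digits - len(fraction))

converted = [to_integer_string(v, 2) for v in ["0", "0.01", "0.02", "0.1", "1"]]
```

With C = 2 this reproduces the conversions listed for FIG. 5.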

Finally, in step S407, the numeric character strings generated in step S406 for all numeric character strings are encoded together with the maximum number of digits C. In the present embodiment, they are encoded as binary XML. When encoded, they are described as part of the other data.

Conventionally, as shown in FIG. 5, binary XML encodes an attribute by describing, in order, a code indicating the attribute structure (0x9c), a code indicating the data type of the attribute value, and the attribute value. Each decimal value is encoded either as a numeric character string or in IEEE 754 format, and the data type indicates which format was used.

The data types are defined as shown in FIG. 6: 0x17 for a character string, 0x1a for IEEE 754 single precision (float), and 0x1b for double precision (double). In the present embodiment, the data type definitions are extended as shown in FIG. 6 so that 0x3* denotes a 1-byte integer, with 0x31 meaning 10 to the first power, 0x32 meaning 10 to the second power, and so on. Because the present embodiment uses 10 to the second power, the data type is 0x32. Using this data type, the values are encoded as binary XML as shown in FIG. 5. The data type of the attribute value is described as 0x32, and the decimal values are encoded as the 1-byte integers "0" → 0x00, "0.01" → 0x01, "0.02" → 0x02, "0.1" → 0x0a, and "1" → 0x64. In this way, the data size can be reduced even for decimal values with few digits.
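A minimal sketch of the value encoding in step S407 follows. It assumes the data type byte is formed as 0x30 plus C and that each scaled value fits in one unsigned byte; the surrounding binary XML framing (such as the 0x9c attribute structure code) and the exact byte layout of FIG. 5 are not reproduced, and the function name `encode_scaled` is hypothetical.

```python
def encode_scaled(integer_strings, max_digits: int) -> bytes:
    """Emit the extended data type byte followed by one byte per scaled value."""
    if not 1 <= max_digits <= 0x0F:
        raise ValueError("this sketch supports 1 to 15 fractional digits")
    data_type = 0x30 | max_digits   # 0x31 = 10^1, 0x32 = 10^2, ...
    # bytes() raises ValueError for values outside 0..255, so values that
    # do not fit in a 1-byte integer are rejected rather than truncated.
    payload = bytes(int(s) for s in integer_strings)
    return bytes([data_type]) + payload

encoded = encode_scaled(["000", "001", "002", "010", "100"], 2)
```

The five keyTimes values then take one byte each after a single shared data type byte, compared with 4 bytes each in IEEE 754 single precision.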

The integer conversion referred to here also covers treating a decimal value as a fixed-point number and encoding the fixed decimal point position together with the numerical value.

On the decoding side, the integer encoding method is described in the data type of the encoded data, so the decoder performs the inverse conversion according to the described method and obtains the original decimal value. In the present embodiment, the integer encoding method is uniform within a structure portion, so the overhead can also be reduced by carrying out a series of calculations on the integers first and performing the inverse conversion afterward.
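The inverse conversion on the decoding side might look like the following sketch. It assumes the same simplified layout as above (one data type byte 0x3N followed by unsigned 1-byte integers) and uses Python's `decimal` module so that the restored values are exact; the function name `decode_scaled` is hypothetical.

```python
from decimal import Decimal

def decode_scaled(encoded: bytes) -> list:
    """Read the data type byte, then move the decimal point back C places."""
    data_type, payload = encoded[0], encoded[1:]
    if data_type & 0xF0 != 0x30:
        raise ValueError("not an extended 1-byte integer data type")
    max_digits = data_type & 0x0F
    # scaleb(-C) shifts the decimal point C places toward the high-order side.
    return [Decimal(b).scaleb(-max_digits) for b in payload]

decoded = decode_scaled(bytes([0x32, 0x00, 0x01, 0x02, 0x0A, 0x64]))
```

The restored values are numerically equal to the originals, although the textual form may differ (for example, 1 comes back as 1.00).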

［第２の実施形態］
本実施形態では、図２に示したＳＶＧのd属性に対して、第１の実施形態で説明した符号化方法を適用する場合について説明する。keyTimes属性と同様に、d属性にも桁数の小さな小数部を含む実数値が多く記述されている。ところがd属性には、図２に示すように、”198.784”,”59.762”というようなkeyTimes属性よりも若干桁数の大きな小数部を含む実数値が含まれる。このような場合、整数符号化した方がサイズが大きくなり、ＩＥＥＥ７５４形式で符号化した方がよい場合がある。そこで本実施形態では、符号化する対象の数値に応じてＩＥＥＥ７５４形式と整数符号化を切り替えて用いる場合について説明する。 [Second Embodiment]
In the present embodiment, a case will be described in which the encoding method described in the first embodiment is applied to the d attribute of the SVG illustrated in FIG. As with the keyTimes attribute, many real values including a decimal part with a small number of digits are described in the d attribute. However, as shown in FIG. 2, the d attribute includes a real value including a decimal part having a slightly larger number of digits than the keyTimes attribute such as “198.784” and “59.762”. In such a case, integer encoding may increase the size, and encoding in the IEEE 754 format may be better. Therefore, in the present embodiment, a case will be described in which the IEEE754 format and the integer encoding are switched according to the numerical value to be encoded.

本実施形態に係る符号化処理のフローチャートは基本的には図４に示したフローチャートとほぼ同様であるので、以下、本実施形態に係る符号化処理について、図４を用いて説明する。 Since the flowchart of the encoding process according to the present embodiment is basically the same as that shown in FIG. 4, the encoding process according to the present embodiment will be described below with reference to FIG.

先ず、図４のフローチャートに従った処理を実行する前に、ＳＶＧのデータにおいて、符号化対象部分を予め指定しておく。全ての実数値を符号化対象とするならば特に指定する必要はない。 First, before executing the processing according to the flowchart of FIG. 4, the encoding target portion is designated in advance in the SVG data. If all real values are to be encoded, there is no need to specify them.

本実施形態では、図２のＳＶＧのデータにおいて、path要素のd属性を、符号化対象として指定したものとする。また、属性値として列挙されている実数値の区切り文字“Ｍ”、“ｃ”、“,”を指定する。 In the present embodiment, it is assumed that the d attribute of the path element is designated as an encoding target in the SVG data of FIG. Also, real-valued delimiters “M”, “c”, “,” listed as attribute values are designated.

First, in step S401, the SVG data input to the RAM 103 as the encoding target is referenced sequentially, and it is determined whether the referenced portion is the d attribute of a path element. If it is not, the process proceeds to step S402, where the referenced portion is encoded as binary XML in the conventional manner.

If, on the other hand, the determination in step S401 shows that the referenced portion is the d attribute of a path element, the process proceeds to step S403.

In step S403, it is checked whether all numeric character strings (attribute values) in the d attribute of the path element have been extracted. If all of them have been extracted, the process proceeds to step S405; otherwise, the process proceeds to step S404.

In step S404, a numeric character string (attribute value) that has not yet been extracted is first extracted from the d attribute of the path element. The delimiters "M", "c", and "," designated in advance are used for the extraction. In the present embodiment, the numeric character strings to be extracted are 198.784, 59.762, 3.183, -0.911, 4.972, -2.825, 5.366, and -5.742. As in the first embodiment, numeric character strings can also be detected without using specially designated delimiters.

Then, the number of digits c' after the decimal point of the extracted numeric character string is acquired.
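
The extraction and digit counting of step S404 can be sketched as follows. This is a minimal illustration, not the apparatus itself: the function names are hypothetical, and the d attribute string is an assumed reconstruction that is merely consistent with the values and delimiters listed above (FIG. 2 is not reproduced here).

```python
import re

def extract_numeric_strings(attr_value, delimiters=("M", "c", ",")):
    """Split an attribute value on the designated delimiters; keep non-empty tokens."""
    pattern = "|".join(re.escape(d) for d in delimiters)
    return [tok.strip() for tok in re.split(pattern, attr_value) if tok.strip()]

def fraction_digits(numeric_string):
    """Return c', the number of digits after the decimal point."""
    _, _, frac = numeric_string.partition(".")
    return len(frac)

# Assumed d attribute (hypothetical), built from the values listed in the text.
d = "M198.784,59.762c3.183,-0.911,4.972,-2.825,5.366,-5.742"
tokens = extract_numeric_strings(d)
digits = [fraction_digits(t) for t in tokens]
print(tokens)
print(digits)
```

Working on the character strings directly avoids any floating-point parsing at this stage, which matches the text's emphasis on numeric character strings.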

In other words, when the process proceeds from step S403 to step S405, the number of digits c' after the decimal point has been acquired for all numeric character strings (attribute values) in the d attribute of the path element.

In step S405, the maximum value (maximum number of digits) C is obtained from the digit counts c' acquired in step S404 for all numeric character strings (attribute values) in the d attribute of the path element.

In step S405 the value of C is determined; in the present embodiment, unlike the first embodiment, it is determined as follows.

In drawing processing such as SVG rendering, the required numerical precision can be decided in advance. In the present embodiment, therefore, C is set to 3, the number of digits after the decimal point designated in advance as the numerical precision.

Here, before the encoding method is decided, it is determined whether integer encoding should be applied. A threshold is set in advance for this determination.

Integer encoding is pointless if the result is larger than it would be with IEEE 754 encoding. A value encoded in IEEE 754 format occupies at least 4 bytes, so the encoded size is reduced whenever the integer fits in fewer than 4 bytes. In the present embodiment, integer encoding is applied when the value can be encoded as an integer of 2 bytes or less, which gives a sufficient size reduction. The threshold could be specified in bits rather than bytes, but because that would require shift operations during parsing, bytes are used in the present embodiment.

Using two's complement representation, a 2-byte integer can represent values from -32768 to 32767. Integer encoding should therefore be applied when the real value, once converted to an integer, lies between -32768 and 32767. In other words, this numerical range is determined by the size set in advance as the data size of the encoding result.

Next, in step S406, as in the first embodiment, a numeric character string representing the integer value obtained by multiplying the numeric character string by 10^C is generated. Performing step S406 for all numeric character strings yields the conversion results shown in FIG. 7: "198.784" → "198784", "59.762" → "59762", "3.183" → "3183", "-0.911" → "-0911", "4.972" → "4972", "-2.825" → "-2825", "5.366" → "5366", and "-5.742" → "-5742".
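
A small sketch of the step S406 conversion, assuming it is carried out purely on character strings by moving the decimal point C places to the right. The function name is hypothetical, and padding with zeros for values that have fewer than C fraction digits is an assumption of this sketch.

```python
def shift_decimal_point(numeric_string: str, c: int) -> str:
    """Return the string for value * 10**c by moving the decimal point c places right."""
    sign = "-" if numeric_string.startswith("-") else ""
    whole, _, frac = numeric_string.lstrip("+-").partition(".")
    if len(frac) > c:
        raise ValueError("value has more fraction digits than C")
    # Pad with zeros when the value has fewer than c fraction digits (assumption).
    return sign + whole + frac.ljust(c, "0")

sources = ["198.784", "59.762", "3.183", "-0.911",
           "4.972", "-2.825", "5.366", "-5.742"]
converted = [shift_decimal_point(s, 3) for s in sources]
print(converted)
```

Note that "-0.911" becomes "-0911", with the leading zero retained, exactly as in the conversion list in the text.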

Next, it is determined whether the numerical value represented by each converted numeric character string falls within the range -32768 to 32767. This can be determined by comparing numeric character strings with each other or by comparing their lengths.

Numeric character strings that fall within the range are encoded as binary XML as they are, as in the first embodiment. For those that do not, the numeric character string before the conversion in step S406 (the conversion source) is encoded as binary XML in IEEE 754 format. In the present embodiment, as shown in FIG. 7, the coordinates 198.784 and 59.762 of the drawing command M are determined to be encoded in IEEE 754 format, and the other real values are determined to be integer encoded.
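
The range-based selection can be illustrated as below. For brevity this sketch compares integer values rather than comparing character strings or their lengths as the text suggests; the function name and the returned labels are hypothetical, and each input is assumed to have at most C fraction digits.

```python
from decimal import Decimal

INT16_MIN, INT16_MAX = -32768, 32767  # range of a 2-byte two's complement integer

def choose_encoding(numeric_string, c=3):
    """Pick 'int16' when value * 10**c fits in 2 bytes, otherwise 'ieee754'."""
    scaled = int(Decimal(numeric_string).scaleb(c))
    return "int16" if INT16_MIN <= scaled <= INT16_MAX else "ieee754"

sources = ["198.784", "59.762", "3.183", "-0.911",
           "4.972", "-2.825", "5.366", "-5.742"]
choices = [choose_encoding(s) for s in sources]
for s, choice in zip(sources, choices):
    print(s, choice)
```

With C = 3, only 198.784 and 59.762 scale to integers outside the 2-byte range (198784 and 59762), which agrees with the determination described for FIG. 7.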

This determination need not use the method described above; it can also be made using information described in the SVG. The d attribute of an SVG path element lists drawing commands together with the numerical information they require. An uppercase drawing command such as M or C denotes absolute coordinates, and a lowercase one such as c denotes relative coordinates. As FIG. 2 shows, absolute values tend to have more digits than relative values. Therefore, identification characters that indicate absolute and relative representation are designated in advance, and values are determined to be encoded in IEEE 754 format if absolute and to be integer encoded if relative.
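
The command-based determination might look like the following sketch. The tokenizing regular expression, the function name, and the sample d attribute are all assumptions made for illustration; only the uppercase/lowercase rule comes from the text.

```python
import re

# A drawing command letter, or a signed decimal number (simplified pattern).
TOKEN = re.compile(r"([A-Za-z])|(-?\d+(?:\.\d+)?)")

def classify_by_command(d):
    """Tag each number with the encoding implied by the preceding drawing command."""
    result, mode = [], None
    for m in TOKEN.finditer(d):
        if m.group(1):
            # Uppercase command: absolute values; lowercase command: relative values.
            mode = "ieee754" if m.group(1).isupper() else "int"
        else:
            result.append((m.group(2), mode))
    return result

# Assumed d attribute (hypothetical), built from the values listed in the text.
d = "M198.784,59.762c3.183,-0.911,4.972,-2.825,5.366,-5.742"
tagged = classify_by_command(d)
print(tagged)
```

This heuristic needs no numeric comparison at all, at the cost of occasionally choosing a non-optimal encoding.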

Because this determination is not exact, some misjudgments are possible. Even so, a misjudgment only means that the encoding method of the present embodiment is not applied optimally in some places; processing is unaffected and a sufficient effect is still obtained.

In the present embodiment, both methods give the same result.

Finally, in step S407, the numeric character strings generated in step S406, the numeric character strings before conversion, and the maximum number of digits C are encoded for all numeric character strings. In the present embodiment they are encoded as binary XML, and the encoded result is written as part of the other data.

Encoding as binary XML using the data type definitions of FIG. 6 gives the result shown in FIG. 7. The data type of the decimal values of the drawing command M is 0x1a, and the data type of the other decimal values is 0x43, indicating a 2-byte integer scaled by 10 to the third power. The values are encoded as follows: 198.784 → 0x4346c8b4, 59.762 → 0x426f0c4a, 3.183 → 0x0c6f, -0.911 → 0xfc71, 4.972 → 0x136c, -2.825 → 0xf4f7, 5.366 → 0x14f6, and -5.742 → 0xe992.
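
The byte values above can be reproduced with a short sketch. It assumes single-precision IEEE 754 and big-endian byte order, which is what the hexadecimal values in the text suggest; the function names are hypothetical, and the data type bytes of FIG. 6 are not modeled.

```python
import struct
from decimal import Decimal

def encode_ieee754(numeric_string):
    """Encode as a 4-byte single-precision float (big-endian assumed)."""
    return struct.pack(">f", float(numeric_string))

def encode_scaled_int16(numeric_string, c):
    """Encode value * 10**c as a 2-byte two's complement integer (big-endian assumed)."""
    return struct.pack(">h", int(Decimal(numeric_string).scaleb(c)))

absolute = ["198.784", "59.762"]                      # coordinates of command M
relative = ["3.183", "-0.911", "4.972", "-2.825", "5.366", "-5.742"]

encoded = ([encode_ieee754(s).hex() for s in absolute]
           + [encode_scaled_int16(s, 3).hex() for s in relative])
print(encoded)
```

The payload shrinks from 8 x 4 = 32 bytes (all IEEE 754) to 2 x 4 + 6 x 2 = 20 bytes, not counting the data type bytes.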

As in the first embodiment, the decoding side performs the inverse conversion according to the data type of the encoded data and obtains the original real values.
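
A possible shape of the inverse conversion is sketched below. The function names are hypothetical, big-endian byte order is assumed, and rounding the IEEE 754 value back to C digits is an assumption of this sketch rather than something the text specifies.

```python
import struct
from decimal import Decimal

def decode_scaled_int16(payload, c):
    """Read a 2-byte integer and move the decimal point c places to the left."""
    (n,) = struct.unpack(">h", payload)
    return str(Decimal(n).scaleb(-c))

def decode_ieee754(payload, c):
    """Read a 4-byte float and format it with c fraction digits (assumption)."""
    (x,) = struct.unpack(">f", payload)
    return f"{x:.{c}f}"

print(decode_ieee754(bytes.fromhex("4346c8b4"), 3))
print(decode_scaled_int16(bytes.fromhex("fc71"), 3))
```

Because the integer path uses exact decimal arithmetic, it restores the original character string without rounding error.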

[Third Embodiment]
In the present embodiment, the encoding method described in the first embodiment is applied to the values attribute of the SVG shown in FIG. 2. The values attribute also contains many numerical values. They list the successive values that a quantity takes during an animation and are used together with the keyTimes attribute. The values of the values attribute are specified to be written as absolute values. Unlike the first and second embodiments, many of them, such as -418 and -446.71, exceed 2 bytes when integer encoded. The present embodiment therefore describes encoding for values written as absolute values.

The encoding process according to the present embodiment is described below.

First, the portion of the SVG data to be encoded is designated in advance. If all values are to be encoded, no designation is necessary.

In the present embodiment, the values attribute of the animateTransform element in the SVG data of FIG. 2 is designated as the encoding target. The delimiters "," and ";" that separate the real values listed in the attribute value are also designated.

First, the SVG data input to the RAM 103 as the encoding target is referenced sequentially, and it is determined whether the referenced portion is the values attribute of an animateTransform element. If it is not, the referenced portion is encoded as binary XML in the conventional manner.

If it is the values attribute of an animateTransform element, it is checked whether all numeric character strings (attribute values) in that attribute have been extracted. If not, a numeric character string (attribute value) that has not yet been extracted is extracted from the values attribute. The delimiters "," and ";" designated in advance are used for the extraction. In the present embodiment, the numeric character strings to be extracted are -68, -418, -101.49, -446.71, -126.52, -465.24, -139.49, -469.29, -139.8, and -458.14. As in the first embodiment, numeric character strings can also be detected without using specially designated delimiters.

Next, for the sets of numeric character strings separated by the delimiters, the difference between each set and the preceding set (the difference between attribute values) is calculated in order from the first set. This yields (-33.49, -28.71), (-25.03, -18.53), (-12.97, -4.05), and (-0.31, 11.15).
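
The difference calculation can be checked with a few lines of exact decimal arithmetic. The values attribute string below is an assumed reconstruction from the listed numbers and delimiters, and the function names are hypothetical.

```python
from decimal import Decimal

def split_sets(values_attr):
    """Split on ';' into sets, then on ',' into exact decimal components."""
    return [tuple(Decimal(x) for x in s.split(","))
            for s in values_attr.split(";") if s.strip()]

def set_differences(sets):
    """Component-wise difference between each set and the one before it."""
    return [tuple(b - a for a, b in zip(prev, cur))
            for prev, cur in zip(sets, sets[1:])]

# Assumed values attribute (hypothetical), built from the values listed in the text.
values = "-68,-418;-101.49,-446.71;-126.52,-465.24;-139.49,-469.29;-139.8,-458.14"
sets = split_sets(values)
diffs = set_differences(sets)
print(diffs)
```

Using Decimal instead of float keeps results such as -0.31 exact, so the later digit count is not disturbed by binary rounding.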

These difference values are the targets of integer encoding. The set at the head of the sequence, (-68, -418), is an absolute value and is therefore encoded in IEEE 754 format.

Then, the number of digits c' after the decimal point of each difference value is acquired.

Next, the maximum value (maximum number of digits) C is obtained from the digit counts c' acquired for the difference values. A numeric character string representing the integer value obtained by multiplying each difference value by 10^C is then generated. Finally, in step S407, the set at the head position, the difference values, and the maximum number of digits C are encoded. In the present embodiment as well, they are encoded as binary XML.

FIG. 10 shows the result of extending the data type definitions shown in FIG. 6. Specifically, the extension covers the case in which differences are calculated and then integer encoded, which is expressed by setting the first bit to 1. In the present embodiment, as shown in FIG. 8, the data type of the first coordinates is 0x1a and the data type of the remaining coordinates is 0xc2, indicating a 2-byte integer scaled by 10 to the second power. The encoded values are 0xc2880000, 0xc3d10000, 0xf2eb, 0xf4c9, 0xf639, 0xf8c3, 0xfaef, 0xfe6b, 0xffe1, and 0x045b.
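
The encoded values listed above can be reproduced as follows, assuming single-precision IEEE 754 for the head set, 2-byte two's complement integers for the differences, and big-endian byte order. The helper name is hypothetical, and the data type bytes (0x1a, 0xc2) are not modeled.

```python
import struct
from decimal import Decimal

head = ("-68", "-418")                                 # first set, kept absolute
diffs = ["-33.49", "-28.71", "-25.03", "-18.53",
         "-12.97", "-4.05", "-0.31", "11.15"]          # differences from the text

def fraction_digits(numeric_string):
    """Number of digits after the decimal point (c')."""
    return max(0, -Decimal(numeric_string).as_tuple().exponent)

C = max(fraction_digits(s) for s in diffs)
head_hex = [struct.pack(">f", float(s)).hex() for s in head]
diff_hex = [struct.pack(">h", int(Decimal(s).scaleb(C))).hex() for s in diffs]
print(C, head_hex, diff_hex)
```

Here the ten values occupy 2 x 4 + 8 x 2 = 24 bytes instead of 40 bytes when every value is stored as a 4-byte float.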

In this way, the data size can be reduced even for decimals written as absolute values.

As in the first embodiment, the decoding side performs the inverse conversion according to the data type of the encoded data and obtains the original decimal values.

[Fourth Embodiment]
In the present embodiment, the XML data to be encoded is converted in advance into a data format that is easy to integer encode.

The SVG specification defines the transform attribute. It converts the coordinate system of the coordinate values written in the document and can describe a transformation to be applied before drawing.

FIG. 9 shows an example of the attribute value of the d attribute of a path element in SVG. As shown in FIG. 9, a plurality of real values are written in the d attribute of the path element.

In the encoding process according to the present embodiment, steps S401 to S405 of the flowchart of FIG. 4 are first performed in the same way as in the first embodiment.

The real values written in the d attribute of FIG. 9 have at most three digits after the decimal point, so C is fixed at 3 in step S405. Each real value is then encoded not in IEEE 754 format but as the integer value obtained by multiplying it by 10^3.

To restore each value, it must be multiplied by 0.001. A transform attribute is therefore added to the SVG data so that drawing is performed correctly. scale(X) describes a transformation that multiplies coordinate values by X; since the factor is 0.001, transform=scale(0.001) is written. The attribute can be added to a container element g, as in Example 1 of FIG. 9, or to a graphic element path, as in Example 2.
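
The pre-editing step can be sketched as a string rewrite. The regular expression, the function name, and the sample path data are assumptions for illustration (FIG. 9 is not reproduced here); the sketch also assumes that every number has at most C fraction digits.

```python
import re
from decimal import Decimal

NUMBER = re.compile(r"-?\d+(?:\.\d+)?")  # simplified pattern for signed decimals

def to_integer_path(d, c):
    """Multiply every number in the path data by 10**c and build the inverse scale."""
    scaled_d = NUMBER.sub(lambda m: str(int(Decimal(m.group(0)).scaleb(c))), d)
    factor = format(Decimal(1).scaleb(-c), "f")
    return scaled_d, f"scale({factor})"

# Assumed path data (hypothetical).
d = "M198.784,59.762c3.183,-0.911"
scaled_d, transform = to_integer_path(d, 3)
element = f'<path transform="{transform}" d="{scaled_d}"/>'
print(element)
```

Because scaling is linear, the same factor restores both absolute and relative coordinates, so a single transform on the path or on an enclosing g element is enough.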

The added transform attribute is encoded with the ordinary binary XML encoding method.

As described above, when the specification of the application data, such as SVG, supports designating a numerical transformation, the proposed method can be applied independently of the binary encoding format by editing the data in advance.

The embodiments described above may be combined as appropriate; the processing of the respective embodiments may be executed in parallel, or switched as appropriate according to various conditions.

[Other Embodiments]
The present invention can be embodied as, for example, a system, an apparatus, a method, a program, or a computer-readable storage medium. Specifically, it may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device.

The present invention can also be achieved as follows. A software program that realizes the functions of the embodiments described above (a program corresponding to the flowcharts described above) is supplied to a system or apparatus directly or remotely, and a computer of that system or apparatus reads and executes the supplied program code.

Accordingly, the program code installed in a computer in order to realize the functional processing of the present invention on that computer itself realizes the present invention. In other words, the present invention also includes the computer program itself for realizing the functional processing of the present invention.

In that case, as long as it has the functions of a program, it may take the form of object code, a program executed by an interpreter, script data supplied to an OS, or the like.

Recording media for supplying the program include, for example, a floppy (registered trademark) disk, hard disk, optical disk, magneto-optical disk, MO, CD-ROM, CD-R, CD-RW, magnetic tape, nonvolatile memory card, ROM, and DVD (DVD-ROM, DVD-R).

The program can also be supplied as follows. A browser on a client computer is used to connect to a website on the Internet, and the computer program of the present invention itself, or a compressed file that includes an automatic installation function, is downloaded from the website to a recording medium such as a hard disk. The program can also be supplied by dividing the program code that constitutes the program of the present invention into a plurality of files and downloading each file from a different website. In other words, a WWW server that allows a plurality of users to download the program files for realizing the functional processing of the present invention on a computer is also included in the present invention.

The program of the present invention may also be encrypted, stored on a storage medium such as a CD-ROM, and distributed to users; users who satisfy predetermined conditions are then allowed to download key information for decryption from a website via the Internet. Using the key information, the encrypted program is executed and installed on a computer.

The functions of the embodiments described above can also be realized when a computer executes the program it has read. In addition, an OS or the like running on the computer may perform part or all of the actual processing based on the instructions of the program, and the functions of the embodiments described above can be realized by that processing as well.

The functions of the embodiments described above are also realized by the following processing. The program read from the recording medium is written to a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer. Then, based on the instructions of the program, a CPU or the like provided on the function expansion board or in the function expansion unit performs part or all of the actual processing, and the functions of the embodiments described above are realized by that processing.

FIG. 1 is a block diagram showing an example hardware configuration of a computer applicable to the document encoding apparatus according to the first embodiment of the present invention.
FIG. 2 is a diagram showing an example of SVG.
FIG. 3 is a diagram explaining the IEEE 754 format.
FIG. 4 is a flowchart of the process of encoding SVG.
FIG. 5 is a diagram explaining the encoding of each numeric character string.
FIG. 6 is a diagram explaining the data type definitions.
FIG. 7 is a diagram explaining a conversion result.
FIG. 8 is a diagram explaining a conversion result.
FIG. 9 is a diagram showing an example of the attribute value of the d attribute of a path element in SVG.
FIG. 10 is a diagram explaining the result of extending the data type definitions shown in FIG. 6.

Claims

1. A document encoding apparatus for encoding a structured document, comprising:
detecting means for detecting each attribute value in the structured document;
acquiring means for acquiring the number of digits after the decimal point of each attribute value detected by the detecting means;
converting means for converting each attribute value detected by the detecting means into a numeric character string representing an integer value by manipulating the decimal point position of the attribute value according to the maximum number of digits acquired by the acquiring means; and
encoding means for encoding each numeric character string produced by the converting means and the maximum number of digits.

2. The document encoding apparatus according to claim 1, wherein the converting means generates a numeric character string representing the integer value obtained by multiplying an attribute value by 10^C, where C is the maximum number of digits.

3. The document encoding apparatus according to claim 1, wherein, among the numeric character strings produced by the converting means, for a numeric character string representing a numerical value that does not fall within a numerical range determined based on a size set in advance as the data size of the encoding result, the encoding means encodes the conversion-source numeric character string in IEEE 754 format.

4. A document encoding apparatus for encoding a structured document, comprising:
detecting means for detecting each attribute value in the structured document;
calculating means for calculating difference values between the attribute values detected by the detecting means, in order from the head of their arrangement order;
acquiring means for acquiring the number of digits after the decimal point of each difference value calculated by the calculating means;
converting means for converting each difference value calculated by the calculating means into a numeric character string representing an integer value by manipulating the decimal point position of the difference value according to the maximum number of digits acquired by the acquiring means; and
encoding means for encoding each numeric character string produced by the converting means, the attribute value at the head position in the arrangement order of the attribute values detected by the detecting means, and the maximum number of digits,
wherein the encoding means encodes the attribute value at the head position in IEEE 754 format.

5. The document encoding apparatus according to any one of claims 1 to 4, wherein the encoding means performs encoding as binary XML.

6. The document encoding apparatus according to claim 1, wherein the structured document is SVG.

7. A document encoding method performed by a document encoding apparatus for encoding a structured document, comprising:
a detecting step of detecting each attribute value in the structured document;
an acquiring step of acquiring the number of digits after the decimal point of each attribute value detected in the detecting step;
a converting step of converting each attribute value detected in the detecting step into a numeric character string representing an integer value by manipulating the decimal point position of the attribute value according to the maximum number of digits acquired in the acquiring step; and
an encoding step of encoding each numeric character string produced in the converting step and the maximum number of digits.

8. A document encoding method performed by a document encoding apparatus for encoding a structured document, comprising:
a detecting step of detecting each attribute value in the structured document;
a calculating step of calculating difference values between the attribute values detected in the detecting step, in order from the head of their arrangement order;
an acquiring step of acquiring the number of digits after the decimal point of each difference value calculated in the calculating step;
a converting step of converting each difference value calculated in the calculating step into a numeric character string representing an integer value by manipulating the decimal point position of the difference value according to the maximum number of digits acquired in the acquiring step; and
an encoding step of encoding each numeric character string produced in the converting step, the attribute value at the head position in the arrangement order of the attribute values detected in the detecting step, and the maximum number of digits,
wherein, in the encoding step, the attribute value at the head position is encoded in IEEE 754 format.

9. A program for causing a computer to execute the document encoding method according to claim 7 or 8.

10. A computer-readable storage medium storing the program according to claim 9.