JP2000357083A - Method and device for processing document and storage medium

Info

Publication number: JP2000357083A
Application number: JP2000121910A
Authority: JP
Inventors: Stephen Savitzky (サビッキーステフェン); Gregory Wolf (ウォルフグレゴリー)
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1999-04-30
Filing date: 2000-04-24
Publication date: 2000-12-26

Abstract

PROBLEM TO BE SOLVED: To provide a document processing method and device in which a document and the processing associated with it are combined in a single structure. SOLUTION: An agency 400 interposed between a client 402 and a server operates agents 401 that process the documents exchanged between the client 402 and the server. Each agent 401 is a collection of active documents, and an active document is a structured document containing text and/or behavior. Active documents operate on a network in the contexts of character strings, streams, and parse trees. Programs can be embedded in them and, because active documents are structured documents, those programs have the same syntax as the documents.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION

The present invention relates to the field of document processing. More specifically, one aspect of the present invention provides an improved method and apparatus for processing documents, text, and programs in a uniform and easily maintained manner.

[0002]

BACKGROUND OF THE INVENTION

The present invention relates to the field of document processing. More specifically, one aspect of the present invention provides an improved method and apparatus for processing documents, text, and programs in a uniform and easily maintained manner.

[0003] With the growth of the Internet, a global network that interconnects many smaller computer networks and individual computers, many software development and distribution schemes take advantage of the fact that large numbers of people are interconnected. For example, by using programs written in the Java (registered trademark) programming language developed by Sun Microsystems (Mountain View, California, USA), a software provider can place a copy of a program on the Internet, and many Internet clients can run that program as if it resided on the client machine.

[0004] As used herein, an "Internet server" is a computer, or group of computers, that is connected to the Internet and responds to requests made of it. An "Internet client" is a computer, or group of computers, that is connected to the Internet and sends requests to an Internet server. A single computer or group of computers may act as a client for one set of requests and as a server for another set of requests.

[0005] Several protocols are in common use for handling requests and the responses to them, depending on the nature of the request. For example, FTP (File Transfer Protocol) is a protocol used by a client to request a file from a server. HTTP (Hypertext Transfer Protocol) is a protocol used by a client to request a hypertext document, and by a server to return the requested document or to transfer invoked objects. Hypertext documents linked to other hypertext documents, as viewed with an HTTP browser, have come to be known collectively as the "World Wide Web" or the "Web". These protocols usually run on top of a lower-level protocol known as TCP/IP (Transmission Control Protocol/Internet Protocol). Each of these protocols is described in detail in the existing literature and is not described further here.

[0006] HTTP has evolved from a protocol for transferring static, pre-existing hypertext documents into one that lets a server generate hypertext documents on the fly, based on the nature and parameters of the client's request, the session "state" that the server itself maintains for that client, and various other factors. For example, instead of requesting a static, pre-existing hypertext page stored on a server, a client can direct a request to a script, such as a CGI (Common Gateway Interface) script. With such a script, the client sends a request to the server. Although a request may specify either a static document or a script, the server determines that this request is for a script, executes the script in order to respond, and returns the script's output as the result of the request.

[0007] FIG. 1 illustrates how such a script system operates. FIG. 1 shows a browser 12 and a server 14, on which a server supervisor 20 runs. The server supervisor 20 handles input and output with the browser 12 (the client in this example), accesses one or more forms 22 and CGI scripts such as script 24, and accumulates the output 26 of script 24 for transmission to the browser. Although not shown, a network or the Internet may be interposed between the browser 12 and the server 14.

[0008] Examples of the form 22 and the script 24 of FIG. 1 are shown in FIGS. 2 and 3, respectively. In operation, browser 12 requests a reference from server 14, and server 14 interprets the reference as a request for form 22. As shown in FIG. 2, form 22 is a form that asks the user of browser 12 for a name and a telephone number. Form 22 is sent to browser 12, which presents the user with the appropriate form to fill in. Browser 12 displays form 22 according to the instructions it contains. In this example, those instructions take the form of text tagged in HTML (HyperText Markup Language, a subset of Standard Generalized Markup Language, "SGML").

[0009] When server 14 is given the completed form, it passes the completed form to script 24. In this example, script 24 is called "phone.cgi" and is referenced in form 22. As can be seen from FIG. 3, script 24 is written in the scripting language known as PERL. A reader who knows PERL can work out the output 26 of script 24 for a given completed form: it is "Thank you. Your entry was:" followed by the name and telephone number that were entered. The script also appends the input data to a file called "phonebook.txt".
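The behavior attributed to script 24 can be sketched in a few lines. The following is a minimal Python illustration, not the PERL script of FIG. 3 (which is not reproduced here). The field names `name` and `phone`, the function name `handle_phone_form`, and the tab-separated file layout are assumptions made for the example.

```python
def handle_phone_form(fields, phonebook_path="phonebook.txt"):
    """Mimic the described script: record the entry, then build the reply text."""
    # Field names are assumed; the real form 22 (FIG. 2) may use others.
    name = fields.get("name", "")
    phone = fields.get("phone", "")

    # Append the submitted data to the phone book file.
    with open(phonebook_path, "a", encoding="utf-8") as phonebook:
        phonebook.write(f"{name}\t{phone}\n")

    # The reply is the fixed greeting followed by the submitted values.
    return f"Thank you. Your entry was: {name} {phone}"
```

Even in this small form, the script must agree with the form on the names of its fields, which is the coordination cost the description goes on to discuss.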

[0010]

PROBLEMS TO BE SOLVED BY THE INVENTION

One problem with such an approach is that coordinating the creation of the form with the creation of the script requires two different skill sets and, in many cases, two separate product developers. The form developer may be a technical writer versed in HTML, but that writer must coordinate with the programmer who writes the PERL code so that the variable names and fields in the form match the variable names and inputs in the script. The same coordination is needed with other languages, such as Java and C.

[0011] It can thus be seen that there is a need for an improved method and apparatus that integrate a document with the behavior (program) associated with it. The present invention seeks to provide such an improved method and apparatus.

[0012]

MEANS FOR SOLVING THE PROBLEMS

According to the present invention, a document and the processing associated with it are combined by structuring the document according to a common structure that is applicable both to the document itself and to the processing applied to it.

[0013] In a client-server document processing system that is one embodiment of the present invention, an agency is interposed between a client and a server, and this agency operates one or more agents that process the documents exchanged between the client and the server. Each agent is a collection of active documents, and an active document is a structured document containing text and/or behavior. An agent can also be regarded as a software object whose behavior is specified by active documents.

[0014] Active documents operate on a network in the contexts of character strings, streams, and parse trees. Programs can be embedded in them and, because active documents are structured documents, those programs have the same syntax as the documents. Furthermore, because the documents are structured, their elements can be used as data structures.

[0015] One application of this document processing system is networked office appliances on the Web that use standard protocols together with a software agency combining the functions of a client, a server, and a proxy.

[0016] One advantage of the active document and agency system is that, because high-level functions can be implemented as active documents, the client software and server software need only provide low-level functions. When development is done in an active document language, a single unified language is used to specify both the content (data) and the processing (behavior) of a document, which makes document-oriented computing easy to implement.

[0017] Another advantage of such an active-document-based document processing system is that the agents themselves can be written as active documents.

[0018] The nature and advantages of the present invention will be more fully understood from the description of the embodiments. To aid that understanding, the methods and other subject matter encompassed by the present invention are enumerated below.

[0019] (1) A document processing method for processing an input structured document to generate an output document as the processing result, comprising:
a) initializing an input parser, including initializing a parser cursor to point to the first element of the input structured document;
b) initializing an element processor by coupling an element input of the element processor to the input parser, so as to receive a current element from the input parser, and coupling an element output of the element processor to an output generator;
c) initializing a definition table coupled to the element processor;
d) receiving a sequence of elements from the input parser and processing each element by the following steps 1) to 6):
  1) inputting an element from the element input of the element processor;
  2) determining whether the element is an active element or a passive element;
  3) if the element is a passive element, evaluating the passive element using applicable definitions obtained from the definition table and passing the result to the element output;
  4) if the element is an active element, performing the following steps i) and ii):
    i) coupling the element output to an input of an active element queue; and
    ii) evaluating the active element using applicable definitions obtained from the definition table and passing the result to the element output;
  5) if the active element queue is not empty, coupling the element input to an output of the active element queue; and
  6) if the active element queue is empty and the element input is not coupled to the input parser, coupling the element input to the input parser; and
e) generating the output document using the output generator.
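The element loop of item (1) can be illustrated with a short Python sketch. It is only one possible reading, under stated assumptions: an element is modeled as a string (text or an `&name;` entity reference) or a `(tag, attributes)` tuple; the definition table is a plain dictionary; active elements are those whose tag has an entry in a hypothetical `handlers` dictionary; and queued results are placed at the front of the queue so that document order is preserved. None of these names come from the source.

```python
from collections import deque


def expand(element, definitions):
    """Evaluate a passive string: replace '&name;' with its definition, if any."""
    if element.startswith("&") and element.endswith(";"):
        return definitions.get(element[1:-1], element)
    return element


def process_document(elements, definitions, handlers):
    """Run steps d)1) to d)6) over a flat sequence of elements."""
    parser = iter(elements)   # stands in for the input parser
    queue = deque()           # the active element queue
    output = []               # stands in for the output generator
    while True:
        if queue:             # step 5: read from the queue while it is non-empty
            element = queue.popleft()
        else:                 # step 6: otherwise read from the input parser
            element = next(parser, None)
            if element is None:
                break
        if isinstance(element, tuple) and element[0] in handlers:
            tag, attributes = element
            # Step 4: the active element's result is queued and re-read as input.
            result = handlers[tag](attributes, definitions)
            queue.extendleft(reversed(result))
        elif isinstance(element, tuple):
            output.append(f"<{element[0]}>")             # passive tag: pass through
        else:
            output.append(expand(element, definitions))  # step 3
    return "".join(output)


def define(attributes, definitions):
    """Hypothetical active tag: add an entry to the definition table."""
    definitions[attributes["name"]] = attributes["value"]
    return []


def repeat(attributes, definitions):
    """Hypothetical active tag: emit the same text several times."""
    return [attributes["text"]] * int(attributes["count"])


HANDLERS = {"define": define, "repeat": repeat}
DOCUMENT = [
    "Hello ",
    ("define", {"name": "who", "value": "world"}),
    "&who;",
    ("repeat", {"text": "!", "count": "2"}),
    ("br", {}),
]
print(process_document(DOCUMENT, {}, HANDLERS))
```

Because the result of an active element is fed back through the same loop, an active element may itself produce further active elements. The sketch does nothing to guard against a handler that does so without end.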

[0020] (2) The document processing method of (1), wherein the output document is a null document when every processed element evaluates to a null string.

[0021] (3) The document processing method of (1), wherein the input structured document, the output document, and the definition table are each expressed in a common structured document format.

[0022] (4) The document processing method of (3), wherein the common structured document format is SGML.

[0023] (5) The document processing method of (1), wherein the step of determining whether an element is an active element or a passive element comprises: a) comparing an element identifier with an active tag table; b) identifying the element as an active element when a matching active tag is found in the active tag table; and c) identifying the element as a passive element when no matching active tag is found in the active tag table.

[0024] (6) The document processing method of (5), wherein the comparing step is a step of comparing the element identifier with the active tags found in the definition table.
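The lookup described in items (5) and (6) amounts to a set-membership test. The sketch below assumes, purely for illustration, that the definition table is a dictionary keyed by tag name and that each definition carries an `active` flag; neither detail is stated in the source.

```python
def build_active_tag_table(definition_table):
    """Collect the tags whose definitions are marked active (one reading of item (6))."""
    return {tag for tag, definition in definition_table.items()
            if definition.get("active")}


def classify(element_identifier, active_tags):
    """Item (5): an element is active only if its identifier matches an active tag."""
    return "active" if element_identifier in active_tags else "passive"


# Hypothetical definition table: one active tag and one ordinary entry.
DEFINITIONS = {
    "repeat": {"active": True},
    "copyright": {"active": False, "value": "(c)"},
}
ACTIVE_TAGS = build_active_tag_table(DEFINITIONS)
print(classify("repeat", ACTIVE_TAGS), classify("p", ACTIVE_TAGS))
```

Deriving the table from the definitions keeps the two from drifting apart, which may be one motivation for item (6).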

[0025] (7) A document processing method wherein the step of evaluating a passive element comprises: a) replacing tokens in the passive element with applicable definitions obtained from the definition table; and b) if the passive element contains a primitive expression, evaluating that primitive expression.

[0026] (8) A document processing method wherein the step of evaluating a passive element comprises: a) matching the element against a set of passive actor criteria; and b) when the element matches the criteria of a passive actor, invoking a passive actor method for each matching passive actor.

[0027] (9) The document processing method of (1), wherein the step of processing an active element comprises: a) replacing at least the passive tokens in the active element with applicable definitions obtained from the definition table; b) if the active element contains a primitive expression, evaluating that primitive expression; and c) invoking an active actor action method with the active element as a parameter.

[0028] (10) A document processing method for processing an input structured document with an active document interpreter to generate an output document, comprising:
a) parsing the input document to obtain a parse sequence consisting of a sequence of elements including start tags, end tags, and entity references;
b) for each element encountered in the parse sequence, performing the following steps 1) to 6):
  1) determining whether the element is a passive element or an active element, where an active element is an element that has an associated handler in a handler database maintained by the active document interpreter;
  2) determining whether child elements of the element are present in the parse sequence;
  3) when the element is a passive element and has no child elements, passing the passive element to an output process;
  4) when the element is a passive element and has child elements, processing the child elements by step b), repeating that processing as necessary;
  5) when the element is an active element and has child elements, processing the child elements by the following steps i) and ii):
    i) processing the child elements by step b), repeating that processing as necessary; and
    ii) sending the elements output in step b)5)i) to a storage queue; and
  6) if the storage queue is not empty, using the storage queue, in place of the parsed input document, as the source of the parse sequence until the storage queue is empty; and
c) in the output process, processing each element output to the output process according to a defined set of output rules.
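The nested processing of item (10) lends itself to a recursive sketch. The Python below is an illustrative reading, not the interpreter itself. It assumes that a node is a string or a `(tag, children)` tuple, that the handler database is a dictionary from tag to function, and that a handler receives the processed children and returns new nodes to be processed in turn. The storage queue is modeled by the list `stored`.

```python
def interpret(nodes, handlers, out):
    """Process nodes in document order, appending output strings to `out`."""
    for node in nodes:
        if isinstance(node, str):
            out.append(node)                       # plain text is passive
            continue
        tag, children = node
        if tag in handlers:                        # step 1: a handler exists, so active
            stored = []                            # stands in for the storage queue
            interpret(children, handlers, stored)  # step 5 i): process the children
            # Steps 5 ii) and 6): the handler's result becomes the new source.
            interpret(handlers[tag](stored), handlers, out)
        else:                                      # passive element
            out.append(f"<{tag}>")
            interpret(children, handlers, out)     # step 4: recurse into children
            out.append(f"</{tag}>")
    return out


# Hypothetical handler database with a single active tag.
HANDLER_DB = {"upper": lambda parts: ["".join(parts).upper()]}
TREE = [("p", ["hi ", ("upper", ["there"])])]
print("".join(interpret(TREE, HANDLER_DB, [])))
```

Recursion replaces the explicit switching of the parse-sequence source described in step 6). An iterative version with a real queue would behave the same way for well-nested input.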

[0029] (11) The document processing method of (10), wherein the step of passing a passive element to the output process further includes: a) determining whether an entity reference for the passive element is defined in an entity definition set; and b) when the entity reference is defined in the entity definition set, outputting the corresponding entity value of the defined entity reference.

[0030] (12) The document processing method of (10), wherein steps a), b), and c) are performed substantially concurrently, so that the input structured document can be at least partially processed without reference to the entire input structured document.
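One common way to obtain the overlap described in item (12) is to chain lazy stages, so that output begins before the input has been read in full. The sketch below uses Python generators for this. The whitespace tokenizer and the function names `parse`, `process`, and `emit` are simplifications assumed for the example and do not reflect the source's parser.

```python
def parse(chunks, log):
    """Stage a): yield tokens as each chunk of input arrives."""
    for chunk in chunks:
        log.append(f"read {chunk!r}")
        yield from chunk.split()


def process(tokens, definitions):
    """Stage b): replace defined tokens, one at a time."""
    for token in tokens:
        yield definitions.get(token, token)


def emit(tokens, log):
    """Stage c): hand each processed token to the output."""
    for token in tokens:
        log.append(f"emit {token!r}")


events = []
emit(process(parse(["a b", "c"], events), {"b": "B"}), events)
print(events)
```

The event log shows tokens from the first chunk being emitted before the second chunk is read, which is the property the item describes.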

[0031] (13) The document processing method of (10), wherein steps a), b), and c) are performed sequentially.

[0032] (14) The document processing method of (10), wherein the processing operations of steps a), b), and c) are expressed in the same syntax as the input structured document.

[0033] (15) The document processing method of (10), wherein the handler database is expressed in the same syntax as the input structured document.

[0034] (16) A document processing apparatus (active document interpreter) that processes an input structured document represented by a parse tree and generates an output document, comprising:
a) an input document tree traverser that, when requested, traverses the parse tree and outputs a current element, where the current element is the element indicated by a cursor in the parse tree;
b) an element processor, coupled to the input document tree traverser, for processing the elements output by the input document tree traverser;
c) an output document tree constructor, coupled to the element processor, for building an output document tree from the elements output by the element processor;
d) a first element evaluator, within the element processor, for determining whether an input element is an active element or a passive element;
e) a first router, within the element processor, for sending passive elements to an output stage of the element processor and active elements to an element queue;
f) a second element evaluator, within the element processor, for finding in a definition table the corresponding entity replacement value for the input element; and
g) a second router, within the element processor, for sending elements from the element queue to the second element evaluator and to the output of the element processor when the element queue is not empty, and for sending elements from the input to the element processor when the element processor is empty.

[0035] (17) A document processing method for processing an input document to obtain an output document, comprising:
a) parsing the input document to obtain a sequence consisting of element start tags, element end tags, entity references, and character strings;
b) converting the input document into an instruction sequence by:
  1) replacing each defined entity reference with an instruction sequence that, when executed, outputs the defined replacement value;
  2) replacing each element recognized as an active element with an instruction sequence that, when executed, takes as input the output of the instructions corresponding to the attributes and content of that element and passes it to an instruction sequence derived by this method from the definition of that active element;
  3) replacing each element recognized as representing a primitive operation with an instruction sequence that, when executed, takes as input the output of the instructions corresponding to the attributes and content of that element and performs the indicated primitive operation on them; and
  4) replacing each unrecognized element start tag, element end tag, entity reference, and character string with an instruction sequence that outputs it; and
c) executing the resulting instruction sequence to produce the output document.
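Item (17) describes compiling a document into instructions and then running them, rather than interpreting it directly. A minimal Python sketch of that two-phase idea follows, with closures standing in for instruction sequences. The token model (strings, `&name;` references, and `(primitive, operands)` tuples) and every name in the code are assumptions. User-defined active elements (sub-step 2) are omitted, and every tuple is assumed to name a known primitive.

```python
def compile_document(tokens, entities, primitives):
    """Phase b): turn each token into a zero-argument instruction."""
    program = []
    for token in tokens:
        if isinstance(token, tuple):
            # Primitive operation: compile the operands, then apply the primitive.
            name, operands = token
            operand_program = compile_document(operands, entities, primitives)
            operation = primitives[name]
            program.append(
                lambda sub=operand_program, op=operation: op(run(sub)))
        elif token.startswith("&") and token.endswith(";") and token[1:-1] in entities:
            # Defined entity reference: output its replacement value when run.
            program.append(lambda key=token[1:-1]: entities[key])
        else:
            # Anything unrecognized is output as is.
            program.append(lambda text=token: text)
    return program


def run(program):
    """Phase c): execute the instructions in order and join their output."""
    return "".join(instruction() for instruction in program)


# Hypothetical primitive: add up whitespace-separated integers.
PRIMITIVES = {"sum": lambda text: str(sum(int(part) for part in text.split()))}
PROGRAM = compile_document(
    ["Total: ", ("sum", ["1 ", "&two;"]), "&unknown;"], {"two": "2"}, PRIMITIVES)
print(run(PROGRAM))
```

A compiled program can be run repeatedly without re-parsing the document. Here entity values are looked up at run time, so a later change to `entities` would be reflected on the next run.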

[0036] (18) A computer-readable storage medium on which is recorded a program for causing a computer to execute the processing of the document processing method of any one of (1) to (15) or of (17).

[0037] (19) A computer-readable storage medium on which is recorded a program for causing a computer to implement the functions of the document processing apparatus of (16).

[0038]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Preferred embodiments of the present invention and several variations are described below with reference to the accompanying drawings. It will be apparent to those skilled in the art that many of the details of the embodiments described below can be changed without departing from the scope of the invention.

[0039] A document-oriented computing system looks like a network of agencies that exchange documents with one another. One example is the Personal Information Agency (PIA) described in U.S. patent application Ser. No. 08/718,858, filed September 24, 1996 (hereinafter "Savitzky").

[0040] In such a system, input devices, output devices, and storage devices are all simply composite documents, which makes them easy to handle. An input device is a source of documents. From the standpoint of an agent requesting documents from it, an input device may look like a single document that is constantly updated, an ever-growing collection of documents, or a client that constantly sends documents to one or more destinations. An output device is a sink for documents and may look like an updatable document, for example a growing collection of documents, or like a client that constantly requests documents from one or more sources. A storage device has aspects of both an input device and an output device: documents can be sent to a storage device and retrieved later.

[0041] The PIA apparatus disclosed in Savitzky can be used to implement document processing as described below, but it should be understood that other apparatus may be used in place of the Savitzky PIA apparatus. Furthermore, the document processing described below can also be used by a document processing agency that does not have all of the components or functions of the PIA disclosed in Savitzky.

[0042] FIG. 4 is a block diagram of a Personal Information Agency (PIA) that processes active documents. As shown, PIA 400 includes several agents that process the active documents exchanged between client 402 and a server (not shown) or other nodes on the network (not shown). Client 402 as used here is, for example, a browser, and the active documents passed from client 402 usually take the form of document requests or entered results.

[0043] FIG. 4 shows how an active document 404 is used in place of the form 22 and the script 24 shown in FIGS. 2 and 3, respectively. The contents of active document 404 are shown in FIG. 6.

[0044] Active document 404 is used by agent 401 to process the information when it is received, and it is also used to display the user input form in browser 402. FIGS. 7 and 8 show two browser displays 602 and 604. Browser display 602 is the form as it would appear for user input, and browser display 604 is the active document processing result as it would appear on the browser screen after form 602 is submitted. Both displays are generated by the same active document 404.

[0045] Because active documents allow behavior to be embedded in a document, the resulting single-component document is easy to manage. Forms can be made self-interpreting, and tables and other data can be made self-maintaining, because they are kept in the same document as the code needed to maintain them. Taking this one step further, an entire system (for example, an entire agent) can be implemented as a collection of active documents, with nothing else required.

【００４６】好適な実施例では、アクティブ文書は、Ｓ
ＧＭＬの構文規約に従った、追加タグを含み得るＨＴＭ
Ｌ文書である。こうすることにより、普通のテキスト・
エディタで文書を編集可能となる。このような”文書の
コンピュータ処理”の手法では、データとメソッドはＨ
ＴＭＬ文書の部品にすぎない。例えば、ＨＴＭＬのリス
ト（＜ｕｌ＞タグと＜ｏｌ＞タグで示される）は、プロ
グラムによってリスト又は配列として用いられる。定義
リスト（＜ｄl＞タグ）は、（＜ｄｔ＞タグ付きテキス
ト）という用語が（＜ｄｄ＞タグ付きテキスト）という
データに相当するという点で、結合配列を兼ねる。同様
に、表（＜table＞タグ）はデータベース表として使用
できる。このようにしているのは、既存のテキスト構造
をプログラム・データとして利用することを考慮したか
らである。In the preferred embodiment, an active document is an HTML document that follows SGML syntax conventions and may include additional tags. This allows the document to be edited with an ordinary text editor. In this "computer processing of documents" approach, data and methods are simply parts of an HTML document. For example, HTML lists (indicated by <ul> and <ol> tags) are used by programs as lists or arrays. A definition list (<dl> tag) doubles as an associative array, in that the term (<dt>-tagged text) corresponds to the data (<dd>-tagged text). Similarly, tables (<table> tags) can be used as database tables. This is done so that existing text structures can be used as program data.
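The idea of reading ordinary HTML structures as program data can be sketched as follows. This is only an illustration using Python's standard html.parser module, not the patent's implementation; the names StructureReader and read_structures are hypothetical.

```python
from html.parser import HTMLParser


class StructureReader(HTMLParser):
    """Collects <li>, <dt>/<dd> and <tr>/<td> text as plain Python data."""

    def __init__(self):
        super().__init__()
        self.items = []     # <ul>/<ol> items, used as a list or array
        self.mapping = {}   # <dl> terms and data, used as an associative array
        self.rows = []      # <table> rows, used as a database-style table
        self._tag = None    # tag of the element currently being read
        self._term = None   # most recent <dt> text

    def handle_starttag(self, tag, attrs):
        self._tag = tag
        if tag == "tr":
            self.rows.append([])

    def handle_endtag(self, tag):
        self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "li":
            self.items.append(text)
        elif self._tag == "dt":
            self._term = text
        elif self._tag == "dd" and self._term is not None:
            self.mapping[self._term] = text
        elif self._tag in ("td", "th"):
            if not self.rows:          # cell without an enclosing <tr>
                self.rows.append([])
            self.rows[-1].append(text)


def read_structures(html):
    """Return (list items, definition mapping, table rows) found in html."""
    reader = StructureReader()
    reader.feed(html)
    reader.close()
    return reader.items, reader.mapping, reader.rows
```

The sketch assumes explicit end tags and flat (non-nested) structures; it is meant only to show that the same markup a browser renders can be consumed directly as a list, a mapping, or a table.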

【００４７】一般に、アクティブ文書中の全てのデータ
はＳＧＭＬの”要素”の形式であって、要素は、（１）
１つの文字列、（２）開始タグ、（３）終了タグ又は
（４）エンティティ参照である。ここに示した例では、
開始タグは、”＜”文字と、識別子（”タグ”）と、可
変数の”属性＝値”の対と、”＞”文字とからなる。終
了タグは、”＜”文字と、タグと、”＞”文字とからな
る。エンティティは、”＆”文字と、識別子と、”；”
文字とからなる。開始タグと対応した終了タグ、それら
に囲まれた内容とからなる構造は、”要素”と呼ばれ
る。タグとテキストは”トークン”と総称される。普通
文字の文字列はテキストと呼ばれるが、タグとエンティ
ティはマークアップと呼ばれる。操作の中には、テキス
トだけに作用するもの、あるいはマークアップだけに作
用するものがある。エンティティは、著作権記号を表す
ＨＴＭＬの”＆ｃｏｐｙ；”のような、特殊な記法であ
り、変数又はマクロとして使用できる。そのようなもの
であるから、エンティティは定義可能であり、再定義可
能であり、また展開可能である（すなわち、エンティテ
ィは、その現在の定義によって置き換えることができ
る）。エンティティは、”反復変数”としても使用でき
る。反復変数を使用する一例は、リストの各項目をテキ
スト・ブロックに埋め込みたい場合である。In general, all data in an active document takes the form of SGML "elements", where an element is (1) a character string, (2) a start tag, (3) an end tag, or (4) an entity reference. In the examples shown here, a start tag consists of a "<" character, an identifier (the "tag"), a variable number of "attribute=value" pairs, and a ">" character. An end tag consists of a "<" character, the tag, and a ">" character. An entity consists of an "&" character, an identifier, and a ";" character. The structure made up of a start tag, its matching end tag, and the content enclosed between them is called an "element". Tags and text are collectively called "tokens". A string of ordinary characters is called text, while tags and entities are called markup. Some operations act only on text, and others only on markup. An entity is a special notation, such as HTML's "&copy;" for the copyright symbol, and can be used as a variable or a macro. As such, an entity can be defined, redefined, and expanded (that is, replaced by its current definition). Entities can also be used as "iteration variables". One use of an iteration variable is to embed each item of a list in a block of text.
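As a rough illustration of the token kinds just described, the following sketch splits a string into start tags, end tags, entity references, and text. It is an assumption-laden simplification rather than a full SGML scanner: the pattern TOKEN_RE and the function tokenize are hypothetical names, and end tags are recognized in the usual HTML form beginning with "</".

```python
import re

# One alternative per token kind; a stray "<" or "&" falls through to text.
TOKEN_RE = re.compile(
    r"(?P<end></[A-Za-z][\w.-]*\s*>)"
    r"|(?P<start><[A-Za-z][\w.-]*(?:\s+[^<>]*)?>)"
    r"|(?P<entity>&[A-Za-z][\w.-]*;)"
    r"|(?P<text>[^<&]+|[<&])"
)


def tokenize(source):
    """Return a list of (kind, text) pairs covering the whole input."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]
```

In this sketch, tags and entities are the markup tokens and everything else is text, which mirrors the text/markup distinction drawn above.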

【００４８】好適な実施例では、文書中のトークンのど
のサブセットも、アクティブにしたい”アクター”と呼
ばれるソフトウェア・オブジェクトに関連付けることが
できる。各アクターは、ＳＧＭＬの内容、プリミティブ
・ハンドル（サブルーチン）の参照のどちらかを含む。
内容によって置換される要素が文書中に現れた時に、ア
クターが展開され、すなわちハンドル・サブルーチンが
呼び出され、エンティティがその要素の属性及び内容と
結合される。アクターは、”タグセット”と呼ばれるグ
ループにまとめることができる。In the preferred embodiment, any subset of the tokens in a document can be associated with software objects, called "actors", that make those tokens active. Each actor contains either SGML content or a reference to a primitive handle (a subroutine). When a corresponding element appears in the document, the actor is expanded: the element is replaced by the actor's content, or the handle subroutine is called, with entities bound to the attributes and content of that element. Actors can be collected into groups called "tag sets".

【００４９】アクターは２通りの方法でトークンと関連
付けられる。”アクティブ”アクターは、それを対応要
素のタグ識別子と関連づけするハッシュ・テーブルのエ
ントリーを有する。どのアクティブ・アクターも要素と
合致しない場合には、トークンの特徴（通常は属性）を
検査する一致基準のリストに関し”パッシブ”アクター
集合の各アクターが順に調べられる。例えば、”if”ア
クターはアクティブで開始タグ＜if＞と一致する。”_f
oreach”アクターはパッシブで、”foreach”属性を含
む開始タグのどれとも一致する。もう１つの可能なアク
ターは”_eval_perl_”アクターであり、これもパッシ
ブで、”perl”の値を持つ”language”属性を含むどの
開始タグとも一致する。Actors are associated with tokens in two ways. An "active" actor has an entry in a hash table that associates it with the tag identifier of its corresponding element. If no active actor matches an element, each actor in the set of "passive" actors is examined in turn against its list of match criteria, which test features of the token (usually attributes). For example, the "if" actor is active and matches the start tag <if>. The "_foreach" actor is passive and matches any start tag that contains a "foreach" attribute. Another possible actor is the "_eval_perl_" actor, which is also passive and matches any start tag containing a "language" attribute whose value is "perl".
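The two-stage lookup described above (hash table first, then an ordered scan of passive actors) might look like the following sketch. The classes Actor and Tagset and the helper demo_tagset are hypothetical; only the actor names "if", "_foreach", and "_eval_perl_" come from the text.

```python
class Actor:
    def __init__(self, name, matches=None):
        self.name = name
        self.matches = matches  # predicate over attributes; passive actors only


class Tagset:
    def __init__(self):
        self.active = {}    # tag identifier -> actor (the hash table)
        self.passive = []   # actors whose criteria are tried in order

    def add_active(self, tag, actor):
        self.active[tag.lower()] = actor

    def add_passive(self, actor):
        self.passive.append(actor)

    def find_actors(self, tag, attrs):
        """Return the active actor for tag, else every matching passive actor."""
        actor = self.active.get(tag.lower())
        if actor is not None:
            return [actor]
        return [a for a in self.passive if a.matches(attrs)]


def demo_tagset():
    """Build a tag set holding the three example actors named in the text."""
    ts = Tagset()
    ts.add_active("if", Actor("if"))
    ts.add_passive(Actor("_foreach", lambda attrs: "foreach" in attrs))
    ts.add_passive(
        Actor("_eval_perl_", lambda attrs: attrs.get("language") == "perl")
    )
    return ts
```

Passive actors are consulted only when the hash lookup fails, so the common case of a known tag costs a single dictionary access.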

【００５０】ＰＩＡはアクティブ文書で使用するための
標準タグセットを定義しているが、一部のエージェント
は、この標準タグセットとともにロードするための特殊
なタグセットを定義し得る。ＨＴＭＬの構文解析（パー
シング）、フォーマッティング、さらには他の文書型の
ＳＧＭＬ文書のＨＴＭＬへの翻訳といった異種の文書処
理のために、全く異なったタグセットも定義できる。Although the PIA defines a standard set of tags for use in active documents, some agents may define a special set of tags to load with this standard set of tags. Completely different tag sets can be defined for heterogeneous document processing, such as HTML parsing, formatting, and even translating SGML documents of other document types into HTML.

【００５１】上述の概念は、プログラム言語の解釈及び
文書のフォーマッティングの両方と密接に関連してい
る。アクティブ文書は２つの性格を持つため、テクニカ
ル・ライターはアクターをマクロに関連付け、エンティ
ティを略語及びフォーマッティング又は活字組処理とし
ての文書評価処理と関連付け、他方、プログラマはそう
ではなく、アクターを機能に関連付け、エンティティを
変数に関連付けるかもしれない。The concepts described above are closely related both to the interpretation of programming languages and to document formatting. Because active documents have this dual nature, a technical writer might think of actors as macros, of entities as abbreviations, and of document evaluation as formatting or typesetting, whereas a programmer might instead think of actors as functions and of entities as variables.

【００５２】図９は、アクティブ文書を解釈するために
利用可能な一般化した処理のフローチャートである。ア
クティブ文書を解釈するオブジェクトは、アクティブ文
書インタプリタ（ＡＤＩ）と呼ばれる。ＡＤＩは、ほと
んどのエージェントのためのアクション及びハンドル・
メソッドを実装している。ＡＤＩは、入力文書を処理し
て出力文書を作成し、また、場合によっては、ファイル
の読み書きといった、出力文書生成以外の付随的な動作
を行わせる。FIG. 9 is a flowchart of a generalized process that can be used to interpret an active document. The object that interprets an active document is called the active document interpreter (ADI). The ADI implements the action and handle methods for most agents. It processes an input document to produce an output document and, in some cases, performs side effects other than generating the output document, such as reading and writing files.

【００５３】ＡＤＩは、入力スタック、パース・スタッ
ク、出力キュー、それと、カレント・トークンＴ、”ス
トリーミング／パーシング”フラグ、”処理／引用”フ
ラグ、カレント・タグセット及び”ハンドラ”アクター
のリストのための値を保持する状態メモリから構成され
る。入力スタック上の各項目は、トークンか、トークン
・シーケンス中の次トークンを返すオブジェクトであ
る。トークンが要求されるとに、ＡＤＩは入力スタック
のトップの項目を調べる。そのスタックに使用できるト
ークンがあれば次のトークンが要求されるが、使用でき
るトークンがなければ、スタックはポップされ、次項目
が調べられる。The ADI consists of an input stack, a parse stack, an output queue, and a state memory that holds values for the current token T, a "streaming/parsing" flag, a "processing/quoting" flag, the current tag set, and a list of "handler" actors. Each item on the input stack is either a token or an object that returns the next token in a token sequence. When a token is requested, the ADI examines the item at the top of the input stack. If that item has a token available, the next token is requested from it; if it has none, the stack is popped and the next item is examined.
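The input stack behavior described above can be sketched as follows. This is a minimal illustration, assuming that tokens are plain values and that "an object that returns the next token" is modeled as a Python iterator; the class name InputStack is hypothetical.

```python
class InputStack:
    """Stack whose items are either single tokens or token iterators."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def next_token(self):
        """Return the next available token, or None when input is exhausted."""
        while self._items:
            top = self._items[-1]
            if hasattr(top, "__next__"):     # a token source
                try:
                    return next(top)
                except StopIteration:
                    self._items.pop()        # exhausted: examine the next item
            else:                            # a single token
                return self._items.pop()
        return None
```

Pushing a token (or a whole token sequence) on top of a partially consumed source is what lets later steps re-read or expand tokens without disturbing the underlying scanner.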

【００５４】入力スタックは、通常、”スキャナ”オブ
ジェクトという１つのトークンを保持するように初期化
される。スキャナ・オブジェクトは、入力されたファィ
ル又は文字ストリームをトークンに分割する。入力スタ
ックもまた、要素のソフトウェア表現をトラバースして
（すなわち、辿って）、その要素の開始タグ、内容及び
終了タグを返すオブジェクトを保持するように初期化さ
れるであろう。The input stack is normally initialized to hold a single item, a "scanner" object, which splits an input file or character stream into tokens. The input stack may also be initialized to hold an object that traverses the software representation of an element and returns that element's start tag, content, and end tag.

【００５５】ＡＤＩは、入力スタックから１度に１つの
トークンを読み込む。トークンは、開始タグであること
も、終了タグであることも、あるいは、文字列、空タ
グ、エンティティ参照、パース木のような完全なトーク
ンであることもある。最初、処理はストリーミング状態
であり、トークンは受け取られるとすぐに処理される。
これと別の処理状態は、パーシング状態である。読み込
まれたトークンが開始タグであるときには、そのトーク
ンは、パーシング／ストリーミング・フラグ及び引用フ
ラグとともにパース・スタックにプッシュされる。読み
込まれたトークンがパース・スタックのトップにある開
始タグと一致する終了タグであるときには、パース・ス
タックのトップのものがポップされ、そして、状態が”
パーシング”ならばカレント・トークンが入れ替えられ
る。そのトークンがパース・スタックのトップにあるも
のと一致しない終了タグであるときには、そのトークン
は入力スタックにプッシュされ、その省かれた終了タグ
はあたかも省略されなかったかのように生成される。処
理はストリーミング状態に戻り、”ストリーミング”フ
ラグがポップされる。The ADI reads one token at a time from the input stack. A token may be a start tag, an end tag, or a complete token such as a string, an empty tag, an entity reference, or a parse tree. Initially, processing is in the streaming state, in which each token is processed as soon as it is received. The other processing state is the parsing state. When the token read is a start tag, it is pushed onto the parse stack along with the parsing/streaming flag and the quoting flag. When the token read is an end tag that matches the start tag at the top of the parse stack, the top of the parse stack is popped and, if the state is "parsing", the current token is replaced by it. When the token is an end tag that does not match the item at the top of the parse stack, the token is pushed back onto the input stack and the omitted end tag is generated as though it had not been omitted. Processing then returns to the streaming state, and the "streaming" flag is popped.
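The handling of omitted end tags can be illustrated in isolation. The sketch below is a simplification under stated assumptions: tokens are (kind, name) tuples, flags are ignored, and a stray end tag with no open element is simply dropped, which the text does not specify. The function name close_omitted is hypothetical.

```python
def close_omitted(tokens):
    """Return tokens with any omitted end tags generated in place."""
    pending = list(reversed(tokens))   # input stack; top is the list's end
    parse_stack = []                   # names of currently open elements
    out = []
    while pending:
        kind, name = pending.pop()
        if kind == "start":
            parse_stack.append(name)
        elif kind == "end":
            if not parse_stack:
                continue                        # stray end tag (assumption: drop)
            top = parse_stack.pop()
            if top != name:
                pending.append((kind, name))    # push back; it is re-read later
                name = top                      # generate the omitted end tag
        out.append((kind, name))
    return out
```

For example, a list item left open before `</ul>` is closed first, and the `</ul>` token is then read again against the now-matching top of the parse stack.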

【００５６】そして、カレント・トークンＴは次のよう
に処理される。そのトークンがエンティティ参照であ
り、かつ、その値がカレント・エンティティ・テーブル
に定義されているならば、そのトークンはその値（テキ
スト、要素又はトークン・リスト）で置き換えられる。
次に、カレント・トークンＴがトークン・リスト又は内
容を持つ要素であるならば、そのトークンは、その開始
タグ、内容及び終了タグを個別に処理できるようにする
ため入力スタックにプッシュされる。カレント・トーク
ンがそのようなものでなく、エンティティ参照でもトー
クン・リストでもなければ、カレント・トークンは開始
タグ、終了タグ、空タグ、テキストのいずれかである。
カレント・トークンが開始タグか空タグならば、その属
性中のどのエンティティも展開される（すなわち、それ
らの値で置き換えられる）。The current token T is then processed as follows. If the token is an entity reference and its value is defined in the current entity table, the token is replaced by that value (text, an element, or a token list). Next, if the current token T is a token list or an element with content, it is pushed onto the input stack so that its start tag, content, and end tag can be processed individually. If the current token is none of these, being neither an entity reference nor a token list, it is a start tag, an end tag, an empty tag, or text. If the current token is a start tag or an empty tag, any entities in its attributes are expanded (that is, replaced by their values).

【００５７】次に、そのトークンのタグ識別子が、アク
ティブ・タグのカレント・タグセット・テーブルにおい
て検索される。一致するアクターが見つかったときに
は、カレント・トークンとＡＤＩ自体をパラメータとし
て、そのアクターのact_on()メソッドが呼び出される。
一致するアクターが見つからない場合には、トークンＴ
と一致する基準を持つアクターがないかカレント・タグ
セットのパッシブ・アクター・リストが調べられ、ま
た、そのような各アクターのact_on()メソッドが呼び出
される。Next, the token's tag identifier is looked up in the current tag set's table of active tags. If a matching actor is found, its act_on() method is called with the current token and the ADI itself as parameters. If no matching actor is found, the current tag set's list of passive actors is searched for actors whose criteria match the token T, and the act_on() method of each such actor is called.

【００５８】act_on()メソッドは、ＡＤＩのパーシング
／ストリーミング・フラグと処理／引用フラグを変更
し、あるいは、カレント・トークンＴを別のトークン又
はヌル値（トークンの削除）で置き換えるであろう。こ
のメソッドはまた、”ハンドラ”リストに１つ以上のア
クターをプッシュするであろう。act_on()メソッドの処
理後、ハンドラ・リストが空でなく、かつ、処理／引用
フラグが処理を指示しているときには、ハンドラ・リス
ト中の各アクターのハンドル・メソッドが呼び出され
る。この時点でもトークンＴがヌルでなければ、パーシ
ング／ストリーミング・フラグが調べられる。パーシン
グの状態ならば、パース・スタックのトップにあるトー
クン（要素）の内容にＴが追加される。ストリーミング
の状態ならば、Ｔは出力キューに追加される。The act_on() method may change the ADI's parsing/streaming flag and processing/quoting flag, or replace the current token T with another token or with a null value (deleting the token). It may also push one or more actors onto the "handler" list. After act_on() processing, if the handler list is not empty and the processing/quoting flag indicates processing, the handle method of each actor in the handler list is called. If the token T is still non-null at this point, the parsing/streaming flag is examined. In the parsing state, T is appended to the content of the token (element) at the top of the parse stack. In the streaming state, T is appended to the output queue.
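The per-token sequence of actor lookup, act_on(), handler calls, and final routing can be sketched as below. This is a hedged illustration, not the ADI itself: tokens are plain dicts, the ADI class is reduced to the state named in the text, and the actors Drop and Mark are hypothetical examples invented for the sketch.

```python
class ADI:
    """Minimal interpreter state; tokens are plain dicts in this sketch."""

    def __init__(self, active, passive=()):
        self.active = active           # tag identifier -> actor
        self.passive = list(passive)   # actors exposing matches(token)
        self.parsing = False           # parsing/streaming flag
        self.quoting = False           # processing/quoting flag
        self.token = None              # current token T
        self.handlers = []             # "handler" actors for this token
        self.parse_stack = []          # open elements collecting content
        self.output = []               # output queue

    def step(self, token):
        self.token, self.handlers = token, []
        actor = self.active.get(token.get("tag"))
        if actor is not None:
            matched = [actor]
        else:
            matched = [a for a in self.passive if a.matches(token)]
        for a in matched:
            a.act_on(self.token, self)
        if self.token is not None and self.handlers and not self.quoting:
            for a in self.handlers:
                a.handle(self)
        if self.token is None:
            return                     # the token was deleted
        if self.parsing:
            self.parse_stack[-1]["content"].append(self.token)
        else:
            self.output.append(self.token)


class Drop:
    """Hypothetical actor that deletes its token."""
    def act_on(self, token, adi):
        adi.token = None


class Mark:
    """Hypothetical actor that registers itself as a handler."""
    def act_on(self, token, adi):
        adi.handlers.append(self)

    def handle(self, adi):
        adi.token = dict(adi.token, handled=True)
```

Splitting the work between act_on() and handle() lets an actor decide cheaply, at the start tag, whether anything needs to run, while the processing/quoting flag can still suppress the handler afterwards.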

【００５９】以上の手順の重要な点は、パッシブ文書で
ある文書をアクティブ文書と同じ方法で処理できるが、
パッシブ文書は、単に処理されて出力され（パーシング
／ストリーミング・フラグがストリーミングに設定され
ているとき）、あるいは、解析されてパース木が生成さ
れ（パーシング／ストリーミング・フラグがパーシング
に設定されているとき）、一方、アクティブ文書はそれ
に含まれているメソッドを実行させるということであ
る。アクターを含むアクティブ文書の場合、処理はもっ
と複雑である。An important point about the above procedure is that a passive document can be processed in exactly the same way as an active document. A passive document is simply processed and output (when the parsing/streaming flag is set to streaming) or parsed into a parse tree (when the flag is set to parsing), whereas an active document causes the methods it contains to be executed. For active documents containing actors, the processing is more involved.

【００６０】次に、アクティブ文書インタプリタの別の
実装について説明する。この実装においては、ＡＩＤに
対する入力はパース木のイテレータ（iterator）であ
る。ここで、パース木はノードの集合であり、入力文書
を表している。パース木の一例は、ＷＷＷコンソーシア
ムの文書オブジェクト・モデルに記述されている（例え
ば、＜http://www.w3.org/TR/PR-DOM-Level-1/＞を参
照）。Another implementation of the active document interpreter is described next. In this implementation, the input to the ADI is a parse tree iterator. Here, a parse tree is a set of nodes representing the input document. An example of a parse tree is described in the World Wide Web Consortium's Document Object Model (see, for example, <http://www.w3.org/TR/PR-DOM-Level-1/>).

【００６１】イテレータは文書のパース木のルート・ノ
ードに初期設定され、ＡＤＩは、その入力イテレータを
カレント入力ノードの次の兄弟ノード又は最初の子ノー
ドへ進めることができる。カレント入力ノードは前記第
１実施例におけるカレント・トークンと全く同様に扱わ
れる。次の兄弟ノードが存在しないことで、”終了タ
グ”条件がわかる。The iterator is initialized to the root node of the document's parse tree, and the ADI can advance its input iterator to the next sibling node or first child node of the current input node. The current input node is handled in exactly the same way as the current token in the first embodiment. The absence of the next sibling node indicates the "end tag" condition.

【００６２】ＡＩＤの出力はパース木コンストラクタで
ある。かかるコンストラクタに対する処理は、新たな子
ノードをカレント・ノードに追加すること、新たな子ノ
ードをカレント・ノードに追加して、その子ノードをカ
レント・ノードにすること、及びカレント・ノードの親
ノードを新たなカレント・ノードにすることである。Ａ
ＤＩは、その入力木を再帰的にトラバースすることによ
って動作する。通常の（非アクティブの）ノードは、出
力木にコピーされる。アクティブ・ノードは、新たなＡ
ＤＩをインスタンス化することによって展開される。こ
の新たなＡＤＩの出力は、その”文脈”と同じである
が、入力はそれらアクティブ・ノードの定義のイテレー
タである。元のアクティブ・ノードの属性及び内容は、
その展開結果の範囲内でアクセス可能なエンティティ名
と結合される。The output of the ADI is a parse tree constructor. The operations on such a constructor are: adding a new child node to the current node; adding a new child node to the current node and making that child the current node; and making the parent of the current node the new current node. The ADI operates by recursively traversing its input tree. Ordinary (non-active) nodes are copied to the output tree. Active nodes are expanded by instantiating a new ADI. The output of this new ADI is the same as that of its "context", but its input is an iterator over the definition of the active node. The attributes and content of the original active node are bound to entity names that are accessible within the expansion.
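A compact sketch of this tree-based variant follows. It makes several simplifying assumptions: the "new ADI" for an active node is modeled as a recursive call that shares the output constructor, only the entity name "content" is bound, and the node tags "#text" and "#entity" are conventions invented for the sketch. Node, TreeBuilder, and expand are hypothetical names.

```python
class Node:
    def __init__(self, tag, text="", children=()):
        self.tag, self.text = tag, text
        self.children = list(children)
        self.parent = None


class TreeBuilder:
    """Output constructor offering the three operations listed in the text."""

    def __init__(self):
        self.root = self.current = Node("#root")

    def add_child(self, node):
        node.parent = self.current
        self.current.children.append(node)

    def start_child(self, node):       # add a child and descend into it
        self.add_child(node)
        self.current = node

    def end_child(self):               # return to the parent
        self.current = self.current.parent


def expand(node, definitions, out, entities=None):
    """Copy node to out, expanding active nodes and entity references."""
    entities = entities or {}
    if node.tag == "#entity":
        for child in entities.get(node.text, []):
            expand(child, definitions, out, entities)
    elif node.tag in definitions:
        # Active node: interpret its definition with the same output,
        # binding the name "content" to the original node's content.
        bound = {"content": node.children}
        for child in definitions[node.tag]:
            expand(child, definitions, out, bound)
    elif not node.children:
        out.add_child(Node(node.tag, node.text))
    else:
        out.start_child(Node(node.tag, node.text))
        for child in node.children:
            expand(child, definitions, out, entities)
        out.end_child()
```

Because the builder only ever appends and moves one level at a time, the same expand routine could drive a constructor that writes nodes out immediately instead of keeping a tree.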

【００６３】入力イテレータと出力コンストラクは前向
きにしか働かないため、木全体をＡＤＩに常駐させるこ
となく、そのようなトラバースをエミュレートするパー
サによってファイルを処理することが可能である。同様
に、出力コンストラクタは、実パース木を構築すること
なく、通過した各ノードの内部表現を出力することがで
きる。Because the input iterator and the output constructor only move forward, a file can be processed by a parser that emulates such a traversal, without the entire tree ever being resident in the ADI. Similarly, the output constructor can output the internal representation of each node as it passes, without building an actual parse tree.

【００６４】フィルタ処理のような特定の文書処理タス
クのために、他の特化した入力イテレータと出力コンス
トラクタを利用することができる。他の多くの等価な実
装、例えば、アクティブ文書のパース木中の各ノードを
所望の展開結果を出力するプログラム言語による等価な
命令シーケンスに変換するような実装が可能であること
は、当業者には明白であろう。このような実装では、最
初のパスでアクティブ文書がプログラムに変換され、次
のパスで、このプログラムが実行されて処理操作が遂行
される。Other specialized input iterators and output constructors can be used for particular document processing tasks such as filtering. It will be apparent to those skilled in the art that many other equivalent implementations are possible, for example one that converts each node in the parse tree of an active document into an equivalent sequence of instructions, in some programming language, that outputs the desired expansion. In such an implementation, a first pass converts the active document into a program, and a second pass executes that program to carry out the processing.

【００６５】ここで、以下で使用されるいくつかの用語
を理解するため、発明者等が開発したアクティブ文書処
理用のInterForm言語の詳細を説明する。Here, the InterForm language for active document processing developed by the inventors will be described in detail in order to understand some terms used below.

【００６６】入力するトークンを変換することを”パー
シング”と言う。パーシングの結果として、ＡＤＩは、
開始タグとその内容からなり完全な要素を形成する複合
オブジェクトを作成する。要素、属性又はエンティティ
を新たなデータで置換する処理は、その結果が常にでは
ないが多くの場合に元のものより大きいことから、”展
開”と呼ぶ。アクターに要素に対する処理を実行させる
ことを、要素に対するアクターの”適用”と言う。ま
た、エンティティ名又は属性名と値を関連付けること
を、名前の値への”結合”と言う。値は、１文字からマ
ークアップ付きの文書全体まで、何でもよい。入力トー
クンのストリームを処理すること（アクターのパーシン
グ、展開及び適用）を、ストリームの”評価”と言う。Converting incoming tokens is called "parsing". As a result of parsing, the ADI creates composite objects, each consisting of a start tag and its content and forming a complete element. Replacing an element, attribute, or entity with new data is called "expansion", because the result is often, though not always, larger than the original. Having an actor perform an operation on an element is called "applying" the actor to the element. Associating an entity name or attribute name with a value is called "binding" the name to the value. A value may be anything from a single character to an entire document with markup. Processing a stream of input tokens (parsing, expansion, and application of actors) is called "evaluating" the stream.

【００６７】ここで説明する好適実施例において、Inte
rForm（登録商標）言語の構文はＳＧＭＬ標準構文から
派生したＨＴＭＬをベースにして出発しているため、In
terForm文書は、適切な文書型定義（ＤＴＤ）を有する
ＳＧＭＬパーサによって解析できる。ＨＴＭＬエディタ
が非標準的なエンティティと要素タグを考慮しているな
らば、そのＨＴＭＬエディタはInterForm文書を処理で
きるであろう。ＳＧＭＬバリデイタ（validator）を、
適切なＤＴＤとともに、InterForm文書の構文構造の検
証のために用いることができる。In the preferred embodiment described here, the syntax of the InterForm® language starts from HTML, which is itself derived from standard SGML syntax, so an InterForm document can be parsed by an SGML parser that has an appropriate document type definition (DTD). An HTML editor that tolerates non-standard entities and element tags should be able to process InterForm documents. An SGML validator, together with an appropriate DTD, can be used to verify the syntactic structure of InterForm documents.

【００６８】少し異なった２種類の名前（識別子）がIn
terFormシステムで用いられる。第１の種類の名前は要
素タグと属性のためのものであり、第２の種類の名前は
エンティティのためのものである。要素タグと属性の名
前は、大文字と小文字の区別がなく、１つの文字（lett
er）で始まり、文字（letter）、数字、”．”（ピリオ
ド）及び”-”（ハイフン）の任意の並びからなる。エ
ンティティの名前は、同様の文字セットを使うが、大文
字と小文字が区別される。名前がＵＲＬ（Uniform Reso
urce Locator）の参照であるかファイル名である場合、
その名前の文字セットと形式は、例えばＵＲＬ又はファ
イル名に適用可能な外部ルールセットによって決めら
れ、InterForm言語には拘束されない。Two slightly different kinds of names (identifiers) are used in the InterForm system. The first kind is for element tags and attributes; the second is for entities. Element tag and attribute names are case-insensitive, begin with a letter, and consist of any sequence of letters, digits, "." (period), and "-" (hyphen). Entity names use a similar character set but are case-sensitive. When a name is a reference to a URL (Uniform Resource Locator) or is a file name, its character set and format are determined by external rules, for example those applicable to URLs or file names, and are not constrained by the InterForm language.
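The naming rules above are simple enough to state as a pattern. The sketch below is one possible reading: the pattern is applied only to element tag and attribute names, since the text does not say whether entity names must also begin with a letter. NAME_RE, is_valid_name, same_tag, and same_entity are hypothetical names.

```python
import re

# A letter, then any run of letters, digits, periods, and hyphens.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9.-]*\Z")


def is_valid_name(name):
    """True if name is a well-formed element tag or attribute name."""
    return NAME_RE.match(name) is not None


def same_tag(a, b):
    """Tag and attribute names compare without regard to case."""
    return a.lower() == b.lower()


def same_entity(a, b):
    """Entity names are case-sensitive."""
    return a == b
```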

【００６９】”if”アクターや”set”アクターのよう
な一部のアクターは、タグと直接対応する。慣例とし
て、タグと直接対応しないアクターは、ダッシュ（”
-”）で始まり、かつ、好ましくはダッシュで終わる名
前を付けるべきである。ダッシュは、アクターの名前の
中で語の区切りのためにも用いられる。多くの場合、あ
る名前を持つエージェントが実行中であるか調べる”ag
ent-running”に見られるように、最初の語は作用を受
けるＰＩＡデータ構造の名前を表す。Some actors, such as the "if" and "set" actors, correspond directly to tags. By convention, an actor that does not correspond directly to a tag should be given a name that begins with a dash ("-") and preferably also ends with one. Dashes are also used to separate words within an actor's name. In many cases the first word names the PIA data structure that is operated on, as in "agent-running", which tests whether an agent with a given name is running.

【００７０】”get”アクターなどの多くのアクター
は、”＜get entity＞”や”＜getactor＞”といった、
振る舞いを修正する非常に多様な特殊な属性を許容す
る。タグと属性が”．”で区切られた”get.entity”の
ような名前を持つ、多くの特化されたアクターのバージ
ョンが存在する。この”.”を使用すると、ＡＤＩがと
るべきアクションを決定するための作業量が少なくて済
も、いくぶん効率が良い。要するに、この”get”アク
ターはデータ構造を指定できるということである。Many actors, such as the "get" actor, accept a wide variety of special attributes that modify their behavior, as in "<get entity>" or "<get actor>". Many such actors also have specialized versions with names such as "get.entity", in which the tag and the attribute are separated by ".". Using the "." form is somewhat more efficient, because the ADI has less work to do to determine which action to take. In short, the "get" actor can specify a data structure.

【００７１】エンティティ参照は以下のように定義され
る：エンティティ : := ’&’ パス ’i’? パス : := 識別子 [’.’ パス] ? 識別子 := [文字 | 数字 | ’-’]+ すなわち、エンティティ参照は、アンパサンド（”
＆”）を含み、その後に名前が続き、さらにセミコロン
（”；”）が続く。このセミコロンは、名前の中では許
されない文字がエンティティ名の後に続いている場合に
は省略可能であるが、いずれにしてもセミコロンを含め
るのが好ましいやり方である。このことは、エンティテ
ィ名において重要である。エンティティ名は、ピリオド
で区切られたいくつかのサブネームからなる。これらの
サブネームは、”名前空間”のシーケンス（エージェン
ト、＜table＞要素又は＜dl＞要素、フォームなど）へ
のパス（ファイルのパスに類似したもの）を構成する。
名前空間がまったく与えられない場合には、ＡＤＩは、
ローカル定義エンティティ（例えば、＜repeat＞タグで
定義されたリスト要素エレメント・エンティティ）と、
解釈対象のアクティブ文書に関連付けられた最上位（グ
ローバル）エンティティ・テーブルを探索する。ある名
前空間でパスが終わった場合、その名前空間が、文脈に
よって記述リスト又は照会文字列として返される。An entity reference is defined as follows:
entity ::= '&' path ';'?
path ::= identifier ['.' path]?
identifier ::= [letter | digit | '-']+
That is, an entity reference consists of an ampersand ("&"), followed by a name, followed by a semicolon (";"). The semicolon may be omitted if the entity name is followed by a character that is not permitted in a name, but including the semicolon in every case is the preferred practice. This is important for entity names. An entity name consists of several subnames separated by periods. These subnames form a path (similar to a file path) through a sequence of "namespaces" (agents, <table> or <dl> elements, forms, and so on). If no namespace is given, the ADI searches the locally defined entities (for example, the list-element entity defined by a <repeat> tag) and the top-level (global) entity table associated with the active document being interpreted. If a path ends at a namespace, that namespace is returned as a description list or as a query string, depending on the context.
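The path-style lookup can be sketched as follows, assuming that namespaces are modeled as nested dictionaries and that local entities take precedence over the global table; both are assumptions made for illustration. The names REF_RE, parse_reference, and lookup are hypothetical, as are the sample namespace contents.

```python
import re

# '&' path ';'?  where path is identifiers separated by periods.
REF_RE = re.compile(r"&([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*);?")


def parse_reference(text):
    """Return the list of subnames in an entity reference, or None."""
    m = REF_RE.fullmatch(text)
    return m.group(1).split(".") if m else None


def lookup(path, local, global_table):
    """Resolve a list of subnames, local entities first, then global ones."""
    first, *rest = path
    if first in local:
        value = local[first]
    elif first in global_table:
        value = global_table[first]
    else:
        return None
    for name in rest:
        if not isinstance(value, dict) or name not in value:
            return None
        value = value[name]
    return value   # a whole namespace (dict) if the path ends at one
```

When the path stops at a namespace, this sketch returns the dictionary itself; rendering it as a description list or a query string would be left to the caller, depending on context.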

【００７２】デフォルトでは、ローカル定義エンティテ
ィとして、図１３に示すエンティティのほか、現在日、
日付、時刻や、カレント・エージェント、カレントＰＩ
Ａ及びカレント・ユーザを識別するための変数がある。
図１４に示したような、いくつかの名前空間が予め定義
されている。By default, the locally defined entities include, in addition to the entities shown in FIG. 13, variables that identify the current day, date, and time, as well as the current agent, the current PIA, and the current user. Several namespaces are predefined, as shown in FIG. 14.

【００７３】開始タグだけの要素は”空”要素と呼ばれ
る。したがって、空要素は、そのタグによって識別でき
る。一般的に、属性と識別子とは、空白を介在させず”
＝”で区切られる。An element that has only a start tag is called an "empty" element. An empty element can therefore be identified by its tag. In general, attributes and identifiers are separated by "=" with no intervening whitespace.

【００７４】高度な実装では、”entity-style hierarc
hical lookup on actor”タグが用意される。このタグ
は、＜dl＞リストをオブジェクトとして機能させること
ができる。開始タグ中の属性の順番は、処理のあいだ保
持されるけれども、予め定義されたアクターや標準ＨＴ
ＭＬタグのどれにとっても重要な意味はない。In an advanced implementation, an "entity-style hierarchical lookup on actor" tag is provided. This tag allows a <dl> list to function as an object. The order of the attributes in a start tag is preserved during processing, but it has no significance for any of the predefined actors or standard HTML tags.

【００７５】以下に述べる＜repeat＞アクターなど、い
くつかのアクターはリストを処理する。リストは項目の
並びであり、通常、＜li＞要素で表現される。リスト
は、普通のプログラム言語の配列やデータベース言語の
集合に非常に似た働きをする。次の要素は単純リストと
して取り扱われる：＜ul＞,＜o1＞, ＜table＞, ＜tr＞＜table＞は、実はリストのリストである。また、記述
リスト（＜dl＞）は、２要素（＜dt＞，＜dd＞）リスト
のリスト、又は、名前と値が交互に並んだ単純、フラッ
トなリストとして扱うことができる。テキスト文字列、
特に属性は、＜split＞アクターを用いてホワイトスペ
ースで分割することによりリストとして扱うこともでき
る。要するに、どのようなトークン・シーケンスも、ホ
ワイトスペースかトークン境界で分割することによって
リストとして扱うことができる。Some actors, such as the <repeat> actor described below, operate on lists. A list is a sequence of items, normally represented by <li> elements. Lists behave much like arrays in conventional programming languages or sets in database languages. The following elements are treated as simple lists: <ul>, <ol>, <table>, <tr>. A <table> is in fact a list of lists. A description list (<dl>) can be treated either as a list of two-element (<dt>, <dd>) lists or as a simple, flat list of alternating names and values. Text strings, attributes in particular, can also be treated as lists by splitting them on whitespace with the <split> actor. In short, any token sequence can be treated as a list by splitting it on whitespace or on token boundaries.

【００７６】アクティブ文書に関連付け情報を持たせる
こともできる。関連付けによって、＜dt＞要素と＜tabl
e＞要素はキー＝値で関連付けされたものとして扱われ
る。表のキーは通常、行の先頭項目にすぎないが、その
前の行に空白の先頭要素（通常、＜th＞タグ）を持つ行
が追加されている。ラベル付けされた列（通常、先頭の
＜th＞要素行で示される）を持つ表は、関連付けされた
リストに変換することができる。An active document can also carry associations. As associations, <dt> elements and <table> elements are treated as key=value pairs. The key of a table row is normally just the first item in the row, but a row whose first element (usually a <th> tag) is blank is appended to the preceding row. A table with labeled columns (normally indicated by a leading row of <th> elements) can be converted into a list of associations.

【００７７】また、適当な形式の文字列はどれも関連付
け情報として用いることができる。対になったものは
（照会文字列におけるように）ホワイトスペース又は”
＆”文字によって区切られるが、名前と値は通常、”
＝”文字で区切られる。普通のリストは、項目を関連付
けして対にする＜associate＞アクターを用いることに
より、関連付け情報として用いることもできる。どのよ
うな要素も（属性及び値の）関連付け情報に変換するこ
とができ、その際、必要に応じそのタグと内容は特殊キ
ー（通常、”-tag-”及び”-content-”）と関連付けさ
れるが、これは属性と間違われることはない。これも＜
associate＞アクターを使用して行われる。In addition, any suitably formatted string can be used as an association. Pairs are separated by whitespace or by "&" characters (as in a query string), and names and values are normally separated by "=" characters. An ordinary list can also be used as an association by means of the <associate> actor, which pairs up its items. Any element can be converted into an association (of its attributes and values); where necessary, its tag and content are associated with special keys (normally "-tag-" and "-content-") that cannot be mistaken for attributes. This too is done with the <associate> actor.
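The three ways of obtaining an association described above can be sketched as plain functions. This is an illustration only: the function names are hypothetical, and the reading of "pairs up its items" as alternating key and value is an assumption.

```python
import re


def parse_association(text):
    """Parse 'a=1&b=2' or 'a=1 b=2' into a dict of names to values."""
    pairs = {}
    for part in re.split(r"[\s&]+", text.strip()):
        if not part:
            continue
        name, _, value = part.partition("=")
        pairs[name] = value
    return pairs


def associate(items):
    """Pair up a flat list as key, value, key, value (assumed ordering)."""
    return dict(zip(items[0::2], items[1::2]))


def element_to_association(tag, attrs, content):
    """Turn an element into an association, using the special keys."""
    result = dict(attrs)
    result["-tag-"] = tag          # cannot clash: names must start with a letter
    result["-content-"] = content
    return result
```

The special keys begin with a dash, which ordinary attribute names cannot do, so they never collide with a real attribute.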

【００７８】各アクターは、その定義時に（結局は、そ
のアクターの本文の内部の要素を利用する時に）、図１
５に示すような属性として認定可能なオンライン文書文
字列を持ってもよい。アクターの文書をアクセスするた
めに用いられるアクターの例を図１６に示す。Each actor may have online documentation strings, which can be specified as attributes when the actor is defined (and eventually by means of elements inside the actor's body), as shown in FIG. 15. FIG. 16 shows examples of actors used to access an actor's documentation.

【００７９】ここで、上に述べた概念の具体例について
図１０の（ａ）乃至（ｅ）を参照して説明する。図１０
の（ａ）は、文書中に現れた時に、あるアクターをイン
スタンス化するテキスト部分８０２である。このテキス
ト部分８０２は、開始タグ８０４と命令群８０６と終了
タグ８０８からなっている。A concrete example of the concepts described above is now given with reference to FIGS. 10(a) to 10(e). FIG. 10(a) shows a text portion 802 that instantiates an actor when it appears in a document. The text portion 802 consists of a start tag 804, instructions 806, and an end tag 808.

【００８０】開始タグ８０４は、あるアクター（本例で
は”demo”）をアクティブにするとともに、そのquoted
属性及びdscr属性を定義する。命令群８０６は要素８０
２の内容（図１０（ａ））を構成し、”demo”アクター
の定義を含む。この命令群は、”demo”アクターの”ha
ndle”メソッドによって実行（展開）される。図示のよ
うに、命令群８０６は、（＜protect-result markup＞
＜/protect-result＞タグによって）内容をマーク付け
を変更することなく出力し、それに続けてボールド体
の”＝”と内容の展開結果を出力するものである。The start tag 804 activates an actor (here, "demo") and defines its quoted and dscr attributes. The instructions 806 form the content of element 802 (FIG. 10(a)) and contain the definition of the "demo" actor. These instructions are executed (expanded) by the "handle" method of the "demo" actor. As shown, the instructions 806 output the content without changing its markup (by means of the <protect-result markup> </protect-result> tags), followed by a bold "=" and then the expanded content.

【００８１】図１０の（ｂ）は処理対象の入力文書８１
０を示す。ＡＤＩは、入力文書８１０の処理時に、開始
タグ＜demo＞に出会うと、同タグをそのパース・スタッ
クにプッシュすることは前述した通りである。＜demo＞
要素内の要素も終了タグ＜/demo＞に出会うまでパース
・スタックにプッシュされ、その後に、＜demo＞要素は
その”handle”メソッドの呼び出しによって展開され
る。FIG. 10(b) shows an input document 810 to be processed. As described above, when the ADI encounters the start tag <demo> while processing the input document 810, it pushes that tag onto its parse stack. The elements inside the <demo> element are likewise pushed onto the parse stack until the end tag </demo> is encountered, after which the <demo> element is expanded by calling its "handle" method.

【００８２】＜demo＞要素の処理の場合、同要素の定義
の内容はそれ自体が＜repeat＞要素であるので、ＡＤＩ
はその入力スタックに同内容をプッシュし、また名前”
content”を＜demo＞要素の内容に結合することによっ
て、＜demo＞要素の定義内容を処理する。そして、そこ
から処理が継続する。＜demo＞要素の内容を処理する場
合、その内容が出力され、次に図１０の（ａ）に関連し
て説明したように同内容が評価される。その結果、図１
０の（ｃ）に示すような出力文書が得られる。同内容は
＜repeat＞要素であるので、同要素は、まず変更される
ことなく出力され、ついで展開される。＜repeat＞要素
はｌｉｓｔ属性”a b c”を持ち、＜repeat＞要素の内
容は”&li; &li”であるから、”&li; &li;”の展開結
果はリスト項目の後に空白が続き、さらにその後にリス
ト項目が続いたものであり、したがって、＜repeat＞要
素の展開結果は”a ab bc c”である。この展開の再帰
性に注意されたい。図１０の（ａ）から（ｃ）の文書は
いずれも同一フォーマットであることにも留意された
い。例えば、図１０の（ｃ）に示した出力文書は、その
ままＡＤＩに入力文書として渡すことも可能である。To process the <demo> element, the ADI processes the content of the element's definition by pushing that content onto its input stack and binding the name "content" to the content of the <demo> element, which is itself a <repeat> element. Processing continues from there. When the content of the <demo> element is processed, that content is first output and then evaluated, as described with reference to FIG. 10(a). The result is the output document shown in FIG. 10(c). Because the content is a <repeat> element, it is first output unchanged and then expanded. The <repeat> element has the list attribute "a b c" and the content "&li; &li;", and "&li; &li;" expands to a list item, followed by a space, followed by the list item again; the expansion of the <repeat> element is therefore "a ab bc c". Note the recursive nature of this expansion. Note also that the documents in FIGS. 10(a) to 10(c) all share the same format. For example, the output document shown in FIG. 10(c) could itself be passed to the ADI as an input document.
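The arithmetic of the <repeat> expansion can be checked with a few lines of code. This is a minimal sketch, assuming the list attribute is split on whitespace and the per-item results are concatenated with no separator; the function name expand_repeat is hypothetical.

```python
def expand_repeat(list_attr, body):
    """Expand body once per list item, binding the entity &li; to the item."""
    return "".join(body.replace("&li;", item) for item in list_attr.split())
```

With the list "a b c" and the body "&li; &li;", the three per-item results are "a a", "b b", and "c c"; concatenated without separators they give "a ab bc c", matching the expansion described above.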

【００８３】図１０の（ｄ）と（ｅ）は、demoアクター
のもう１つの使用例を説明するものである。図１０の
（ｄ）は、リスト項目の参照を内容とする＜ｏｌ＞（順
序リスト）要素を持った入力文書を表す。これら要素の
再帰的展開の場合、”＜li＞&li;”は、”＜li＞”の後
に＜ol＞要素の属性から選択された、”a b c”なる値
を持つリスト項目が続いたものに展開される。これらト
ークンは次に標準のＨＴＭＬマークアップとして処理さ
れ、図１０の（ｅ）に示す出力文書が作成され。図１０
の（ｅ）は、list属性の値から見つかったトークンの順
序リストを示している。FIGS. 10(d) and 10(e) illustrate another use of the demo actor. FIG. 10(d) shows an input document containing an <ol> (ordered list) element whose content is a reference to a list item. In the recursive expansion of these elements, "<li>&li;" expands to "<li>" followed by a list item taken from the value "a b c" of the <ol> element's attribute. These tokens are then processed as standard HTML markup to produce the output document shown in FIG. 10(e), which shows an ordered list of the tokens found in the value of the list attribute.

【００８４】さて、アクティブ文書プロセッサの別の実
施例もしくは考え方を図１１に示す。このアクティブ文
書プロセッサは入力文書９００を受け取り、それを入力
パーサ９０２に与えるが、この入力文書９００は好まし
くはアクティブ文書であるが解析可能などのような文書
でもよい。入力パーサ９０２は、入力文書９００のカレ
ント・ノード９０６（ａ）をノード・プロセッサ９０４
に与えるように接続されている。入力プロセッサ９０２
はまた、以下に述べる様々なコマンドをノード・プロセ
ッサ９０４より受け取るように接続されている。ノード
・プロセッサ９０４は、接続ジェネレータ９１２と接続
され、カレント・ノード９０６（ａ）又はその処理結果
であるノード９０６（ｂ）を、その処理のための命令群
９１０とともに与える。出力ジェネレータ９１２は、こ
れらノード９０６（ｂ）を受け取ると、入力文書９００
を処理することにより得られる出力文書の部分部分を出
力する。この出力は、文字列構造体へ（文字列変数、固
定長記憶要素など）、又は、フィルタへ（パイプ入力、
ストリーミング入力など）送られるであろう。FIG. 11 shows another embodiment of, or way of viewing, the active document processor. The active document processor receives an input document 900 and passes it to an input parser 902. The input document 900 is preferably an active document, but may be any parsable document. The input parser 902 is connected so as to supply the current node 906(a) of the input document 900 to a node processor 904. The input parser 902 is also connected so as to receive from the node processor 904 the various commands described below. The node processor 904 is connected to an output generator 912 and supplies it with the current node 906(a), or with a node 906(b) resulting from processing that node, together with instructions 910 for processing it. On receiving these nodes 906(b), the output generator 912 outputs successive portions of the output document obtained by processing the input document 900. This output may be sent to a string structure (a string variable, a fixed-length storage element, and so on) or to a filter (pipe input, streaming input, and so on).

【００８５】ノード・プロセッサ９０４が入力パーサ９
０２に対し発行するコマンドとしては、次のものがあ
る。 move to next sibling node（兄弟ノードへの移動） move to first child node（最初の子ノードへの移動） move to parent node（親ノードへの移動） tell whether a next sibling exists（次の兄弟ノード
の有無判定） tell whether a child exists（子の有無判定） tell whether a parent exists（親の有無判定） get current node's type（カレント・ノードの型の取
得） get current node's tag（カレント・ノードのタグの取
得） get current node's attributes（カレント・ノードの属
性の取得）The node processor 904 issues the following commands to the input parser 902:
move to next sibling node
move to first child node
move to parent node
tell whether a next sibling exists
tell whether a child exists
tell whether a parent exists
get current node's type
get current node's tag
get current node's attributes

【００８６】入力パーサ９０２は、”move”コマンドに
対しては、入力文書の木をトラバースして新たなカレン
ト・ノードを出力が、ノード・プロセッサ９０４は”te
ll”コマンド又は”get”コマンドに対しては、問い合
わせの回答を返す。このパーシング方法では、入力文書
は、その全体を解析する必要がないので、どのような大
きさであってよい。それどころか、木を作成する必要さ
えない。In response to a "move" command, the input parser 902 traverses the tree of the input document and outputs a new current node; in response to a "tell" or "get" command, the answer to the query is returned. With this parsing method, the input document need not be parsed in its entirety, so it may be of any size. Indeed, the tree need not even be built.

【００８７】ここに説明したことから明らかなよう
に、”move to next sibling”コマンド、”move to fi
rst child”コマンド及び”move to parent”コマンド
を用いることによって、文書の一部しか利用できない場
合でも、文書木の全体をトラバースすることができる。
図１２は、構造化文書から木が構築される様子を説明す
るものである。図１２を見れば、”move right”、”mo
ve down”及び”move up”の３つのmoveコマンドを思い
つくであろう。これらコマンドが正しい順序で発行され
るならば、ノード・プロセッサは、渡されたノードを全
て受け取ることができる。たとえ、その入力が終わりの
ない（例えばストリーミング）入力文書であったとして
も同様である。これは、文書木の全体はノード・プロセ
ッサのローカル記憶に格納する必要がないからである。
図１２に示されるように、構造化文書９００は、その中
に出現する様々な要素をノードとするパース木に変換で
きる。この例では、パース木の葉は入力文書９００のテ
キスト要素である。As is clear from the above description, by using the "move to next sibling", "move to first child" and "move to parent" commands, the entire document tree can be traversed even when only part of the document is available. FIG. 12 illustrates how a tree is constructed from a structured document. FIG. 12 suggests three move commands: "move right", "move down" and "move up". If these commands are issued in the correct order, the node processor can receive every node passed to it. This holds even if the input is an endless (e.g., streaming) input document, because the entire document tree need not be stored in the local storage of the node processor. As shown in FIG. 12, the structured document 900 can be converted into a parse tree whose nodes are the various elements appearing in it. In this example, the leaves of the parse tree are the text elements of the input document 900.
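
The move, tell and get commands, and the traversal they make possible, can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation described in this publication: the `Node` and `Cursor` classes, their method names, the `#text` tag for text leaves, and the sample document are all hypothetical. For brevity the cursor walks an in-memory tree, whereas an input parser of the kind described need not build the whole tree. The `traverse` function uses only the three move commands and the tell commands, and keeps no state other than a depth counter.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """One parse-tree node: an element, or a text leaf tagged '#text'."""
    type: str
    tag: str
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


class Cursor:
    """Hypothetical input-parser cursor exposing move/tell/get commands."""

    def __init__(self, root: Node) -> None:
        self.current = root

    def _siblings(self) -> List[Node]:
        parent = self.current.parent
        return parent.children if parent is not None else [self.current]

    # "tell" commands
    def has_next_sibling(self) -> bool:
        sibs = self._siblings()
        return sibs.index(self.current) + 1 < len(sibs)

    def has_child(self) -> bool:
        return bool(self.current.children)

    def has_parent(self) -> bool:
        return self.current.parent is not None

    # "move" commands
    def move_next_sibling(self) -> None:   # "move right"
        sibs = self._siblings()
        self.current = sibs[sibs.index(self.current) + 1]

    def move_first_child(self) -> None:    # "move down"
        self.current = self.current.children[0]

    def move_parent(self) -> None:         # "move up"
        self.current = self.current.parent

    # "get" commands
    def node_type(self) -> str:
        return self.current.type

    def tag(self) -> str:
        return self.current.tag

    def attributes(self) -> Dict[str, str]:
        return self.current.attributes


def traverse(cursor: Cursor) -> List[str]:
    """Visit every node in document order using only cursor commands."""
    visited = [cursor.tag()]
    depth = 0
    while True:
        if cursor.has_child():
            cursor.move_first_child()
            depth += 1
        else:
            # Climb until some ancestor has a next sibling, or we reach the start.
            while depth > 0 and not cursor.has_next_sibling():
                cursor.move_parent()
                depth -= 1
            if depth == 0:
                return visited
            cursor.move_next_sibling()
        visited.append(cursor.tag())


def build_sample() -> Node:
    """Tree for <doc><h1>Title</h1><p>Hello <b>world</b></p></doc>."""
    doc = Node("element", "doc")
    doc.add(Node("element", "h1")).add(Node("text", "#text"))
    p = doc.add(Node("element", "p"))
    p.add(Node("text", "#text"))
    p.add(Node("element", "b")).add(Node("text", "#text"))
    return doc


print(traverse(Cursor(build_sample())))
```

Because the traversal never looks further than the current node, its parent, and its next sibling, the same loop would work over a cursor that parses its input lazily.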

【００８８】再び図１１を参照する。命令群９１０には
次の命令を含めることができる。 output node（ノードを出力） start (open) a new node（新たなノードを開始させる
（開く）） end a currently open node（現在開かれているノード
を終了させる）これらの命令を使用すれば、入力文書と同じフォーマッ
トの出力文書を生成することができる。図１１に示すア
ーキテクチャは同一フォーマット（同一の条件と規約に
従う）の出力文書に対応できるので、ノード・プロセッ
サをネストさせ又は連鎖させることができることに留意
されたい。例えば、１つのパーサ−プロセッサ−ジェネ
レータの出力を、別のパーサ−プロセッサ−ジェネレー
タの入力としてもよいであろう。これは、単一のパーサ
−プロセッサ−ジェネレータで、中間処理後のアクティ
ブ文書のデータ構造を格納するような、中間的な非処理
ステップが必要な場合に好適かもしれない。入力文書が
プログラムを表す場合、そのプログラムは、出力がそれ
自体と同じフォーマットのプログラムであろう。Referring again to FIG. 11, the instruction group 910 can include the following instructions:
output node
start (open) a new node
end a currently open node
Using these instructions, an output document in the same format as the input document can be generated. Note that because the architecture shown in FIG. 11 can handle output documents of the same format (subject to the same conditions and conventions), node processors can be nested or chained. For example, the output of one parser-processor-generator could be the input of another parser-processor-generator. This may be suitable where a single parser-processor-generator would require an intermediate non-processing step, such as storing the data structure of the active document after intermediate processing. If the input document represents a program, that program will be one whose output is in the same format as the program itself.
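
The chaining of parser-processor-generator stages can be sketched as follows. This is a hedged illustration only: the instruction encoding as `(opcode, payload)` tuples, the function names `parse`, `rename_tag`, `upper_text` and `generate`, and the sample document are assumptions made for this example. Python's standard `html.parser.HTMLParser` stands in for the input parser, and element attributes are dropped to keep the sketch short. Each stage emits only "start", "end" and "output" instructions, so the generated string is again a parseable document and can be fed to a second stage.

```python
import html
from html.parser import HTMLParser
from typing import Callable, Iterable, Iterator, List, Tuple

# Assumed encoding: ("start", tag), ("end", tag) or ("output", text).
Instruction = Tuple[str, str]


class _Collector(HTMLParser):
    """Turns parser callbacks into a flat list of instructions."""

    def __init__(self) -> None:
        super().__init__()
        self.instructions: List[Instruction] = []

    def handle_starttag(self, tag, attrs):
        self.instructions.append(("start", tag))   # attributes ignored here

    def handle_endtag(self, tag):
        self.instructions.append(("end", tag))

    def handle_data(self, data):
        self.instructions.append(("output", data))


def parse(document: str) -> List[Instruction]:
    """Input-parser stand-in: document text -> instruction list."""
    collector = _Collector()
    collector.feed(document)
    collector.close()
    return collector.instructions


def rename_tag(old: str, new: str) -> Callable[[Iterable[Instruction]], Iterator[Instruction]]:
    """Node-processor stand-in that renames one element type."""
    def processor(stream: Iterable[Instruction]) -> Iterator[Instruction]:
        for op, payload in stream:
            if op in ("start", "end") and payload == old:
                yield (op, new)
            else:
                yield (op, payload)
    return processor


def upper_text(stream: Iterable[Instruction]) -> Iterator[Instruction]:
    """Node-processor stand-in that upper-cases text nodes."""
    for op, payload in stream:
        yield (op, payload.upper() if op == "output" else payload)


def generate(stream: Iterable[Instruction]) -> str:
    """Output-generator stand-in: instructions -> document text."""
    parts: List[str] = []
    for op, payload in stream:
        if op == "start":
            parts.append(f"<{payload}>")
        elif op == "end":
            parts.append(f"</{payload}>")
        else:
            parts.append(html.escape(payload, quote=False))
    return "".join(parts)


source = "<p>Hello <b>world</b></p>"
stage1 = generate(rename_tag("b", "strong")(parse(source)))
stage2 = generate(upper_text(parse(stage1)))   # second stage reads first stage's output
print(stage1)  # <p>Hello <strong>world</strong></p>
print(stage2)  # <p>HELLO <strong>WORLD</strong></p>
```

Writing each processor as a generator means a stage can begin emitting output before its input is exhausted; only the `parse` stand-in here buffers the whole document.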

【００８９】以上に述べたことは、説明のためのもので
あって限定を意図したものではない。以上の説明から、
当業者には本発明の多くの変形が明白となろう。１例に
過ぎないが、ＡＤＩは、単純な文書処理タスクのための
スタンドアロン形式と、ＰＩＡ（personal information
agency）システムに埋め込まれた形式の両方をとる。
後者の形式では、ＡＤＩは、インターネットやその他の
ソースから検索された文書の解析及び修正、並びにユー
ザに直接提示するための文書の生成のために利用され
る。スタンドアロンのＡＤＩは、例えば、最新の文書と
それ自体のアクター及びタグセットの例を生成するため
に用いられるであろう。The foregoing is illustrative and not intended to be limiting. From the above description, many variations of the invention will be apparent to those skilled in the art. By way of example only, the ADI takes both a stand-alone form for simple document processing tasks and a form embedded in a PIA (personal information agency) system. In the latter form, the ADI is used to parse and modify documents retrieved from the Internet and other sources, and to generate documents for direct presentation to a user. A stand-alone ADI would be used, for example, to generate up-to-date documents and examples of its own actors and tag sets.

【００９０】[0090]

【発明の効果】以上に詳細に説明したように、本発明に
よれば、その実現が望まれていた、文書と、それに関連
付けられた振る舞い（プログラム）を統合できる、改良
した方法及び装置を実現可能である。特に、アクティブ
文書とエージェンシー・システムの組合せにより、高水
準の機能をアクティブ文書として実装できるため、クラ
イアント用ソフトウェアとサーバ用ソフトウェアでは低
水準の機能に対応するだけでよく、また、アクティブ文
書言語により開発を行う場合、文書の内容（データ）と
処理（振る舞い)を特定するために１つの統一された言
語が使用されるので、文書向けコンピュータ処理を容易
に実装することができ、さらに、エージェント自体をア
クティブ文書として記述できる、等々の効果を得られ
る。As described in detail above, the present invention makes it possible to realize the long-desired improved method and apparatus that can integrate a document with the behavior (program) associated with it. In particular, by combining active documents with an agency system, high-level functions can be implemented as active documents, so client and server software need only support low-level functions. Moreover, when development is done in the active document language, a single unified language is used to specify both the content (data) and the processing (behavior) of a document, so document-oriented computer processing can be implemented easily. Furthermore, the agent itself can be written as an active document. These and other effects are obtained.

[Brief description of the drawings]

【図１】文書とプログラムに別々の構文を用いる文書シ
ステムのブロック図である。FIG. 1 is a block diagram of a document system that uses different syntax for documents and programs.

【図２】図１中のフォーム２２の内容を示す図である。FIG. 2 is a diagram showing the contents of the form 22 in FIG. 1.

【図３】図１中のＣＧＩスクリプトの内容を示す図であ
る。FIG. 3 is a diagram showing the contents of the CGI script in FIG. 1.

【図４】パーソナル・インフォメーション・エージェン
シーのブロック図である。FIG. 4 is a block diagram of a personal information agency.

【図５】パーソナル・インフォメーション・エージェン
シーの動作説明のための図である。FIG. 5 is a diagram for explaining the operation of a personal information agency.

【図６】図５中のアクティブ文書の内容を示す図であ
る。FIG. 6 is a diagram showing the contents of the active document in FIG. 5.

【図７】ユーザ入力のためのブラウザ表示の一例を示す
図である。FIG. 7 is a diagram showing an example of a browser display for user input.

【図８】ユーザ入力後のブラウザ表示の一例を示す図で
ある。FIG. 8 is a diagram showing an example of a browser display after user input.

【図９】一般化したアクティブ文書の解釈処理のフロー
チャートである。FIG. 9 is a flowchart of a generalized active document interpretation process.

【図１０】アクティブ文書インタプリタにより処理され
る入出力文書の説明図である。FIG. 10 is an explanatory diagram of input and output documents processed by the active document interpreter.

【図１１】本発明による他の実施例のブロック図であ
る。FIG. 11 is a block diagram of another embodiment according to the present invention.

【図１２】図１１中の入力パーサによるトラバースのた
めに生成されるパース木の説明図である。FIG. 12 is an explanatory diagram of a parse tree generated for traversal by the input parser in FIG. 11.

【図１３】ローカル定義エンティティの説明図である。FIG. 13 is an explanatory diagram of locally defined entities.

【図１４】予め定義される名前空間の説明図である。FIG. 14 is an explanatory diagram of predefined namespaces.

【図１５】アクターの属性の説明図である。FIG. 15 is an explanatory diagram of an attribute of an actor.

【図１６】文書アクセスのためのアクターの説明図であ
る。FIG. 16 is an explanatory diagram of an actor for document access.

[Explanation of symbols]

１２ブラウザ（クライアント）１４サーバ２０サーバ・スーパバイザ２２フォーム２４スクリプト４００ＰＩＡ（パーソナル・インフォメーション・エ
ージェンシー）４０１エージェント４０２クライアント（ブラウザ）４０４アクティブ文書
12 Browser (client)
14 Server
20 Server supervisor
22 Form
24 Script
400 PIA (personal information agency)
401 Agent
402 Client (browser)
404 Active document

フロントページの続き (72)発明者グレゴリーウォルフアメリカ合衆国カリフォルニア州 94025 メンローパークスィート 115 サンドヒルロード 2882 リコーコーポレーション内 Continued from the front page: (72) Inventor Gregory Wolff, c/o Ricoh Corporation, 2882 Sand Hill Road, Suite 115, Menlo Park, California 94025, United States

Claims

[Claims]

1. A method of processing an input structured document and generating an output document that is the result of the processing, comprising: a) initializing an input parser by initializing a parser cursor to point to a first element of the input structured document; b) initializing an element processor by coupling an element input of the element processor to the input parser, so as to receive a current element from the input parser, and coupling an element output of the element processor to an output generator; c) initializing a definition table coupled to the element processor; d) receiving a sequence of elements from the input parser and processing each element by the following steps 1) to 6): 1) inputting an element from the element input of the element processor; 2) determining whether the element is an active element or a passive element; 3) if the element is a passive element, evaluating the passive element using any applicable definition obtained from the definition table and passing the result to the element output; 4) if the element is an active element, performing the following steps i) and ii): i) coupling the element output to an input of an active element queue; and ii) evaluating the active element using any applicable definition obtained from the definition table and passing the result to the element output; 5) coupling the element input to the output of the active element queue if the active element queue is not empty; and 6) coupling the element input to the input parser if the active element queue is empty and the element input is not coupled to the input parser; and e) generating the output document using the output generator.

2. The document processing method according to claim 1, wherein the output document is a null document when each processed element is evaluated as a null character string.

3. The document processing method according to claim 1, wherein the input structured document, the output document, and the definition table are each represented by a common structured document format.

4. The document processing method according to claim 3, wherein the common structured document format is SGML.

5. The document processing method according to claim 1, wherein the step of determining whether the element is an active element or a passive element comprises: a) comparing an element identifier with an active tag table; b) identifying the element as an active element when a matching active tag is found in the active tag table; and c) identifying the element as a passive element when no matching active tag is found in the active tag table.

6. The document processing method according to claim 5, wherein said comparing step is a step of comparing an element identifier with an active tag found in said definition table.

7. The step of evaluating the passive element includes: a) replacing tokens in the passive element with applicable definitions obtained from the definition table; and b) evaluating the primitive expression, if any, in the passive element.

8. The document processing method according to claim 1, wherein the step of evaluating the passive element comprises: a) matching the element against a set of passive actor criteria; and b) when the element matches the criteria of a passive actor, calling a passive actor method for each matching passive actor.

9. The document processing method according to claim 1, wherein the processing of the active element comprises: a) replacing at least the passive tokens in the active element with applicable definitions obtained from the definition table; b) evaluating a primitive expression, if any, in the active element; and c) calling an active actor action method with the active element as a parameter.

10. A document processing method for processing an input structured document by an active document interpreter to generate an output document, comprising: a) parsing the input document into a parse sequence of elements including start tags, end tags, and entity references; b) performing the following steps 1) to 6) for each element encountered in the parse sequence: 1) determining whether the element is a passive element or an active element, wherein an active element is an element having an associated handler in a handler database maintained by the active document interpreter; 2) determining whether a child element of the element is in the parse sequence; 3) when the element is a passive element and has no child elements, passing the passive element to an output process; 4) when the element is a passive element and has child elements, processing the child elements by step b), repeating the processing as necessary; 5) when the element is an active element and has child elements, processing the child elements by the following steps i) and ii): i) processing the child elements according to step b), repeating the processing as necessary; and ii) sending the elements output in step 5) i) to a storage queue; and 6) if the storage queue is not empty, using the storage queue as the source of the parse sequence, instead of the parsed input document, until the storage queue is empty; and c) processing, in the output process, each element output to the output process in accordance with a defined set of output rules.

11. The document processing method according to claim 10, wherein the step of passing the passive element to the output process further comprises: a) determining whether an entity reference for the passive element is defined in an entity definition set; and b) when the entity reference is defined in the entity definition set, outputting the corresponding entity value of the defined entity reference.

12. The document processing method according to claim 10, wherein steps a), b) and c) are performed substantially simultaneously, so that the input structured document can be at least partially processed without reference to the entire input structured document.

13. The document processing method according to claim 10, wherein steps a), b) and c) are performed sequentially.

14. The document processing method according to claim 10, wherein the processing operations of steps a), b) and c) are expressed in the same syntax as the input structured document.

15. The document processing method according to claim 10, wherein the handler database is expressed by the same syntax as the input structured document.

16. An apparatus for processing an input structured document represented by a parse tree and generating an output document, comprising: a) an input document tree traverser that traverses the parse tree and outputs a current element when requested, wherein the current element is the element in the parse tree pointed to by a cursor; b) an element processor coupled to the input document tree traverser for processing the elements output by the input document tree traverser; c) an output document tree constructor coupled to the element processor for building an output document tree from the elements output by the element processor; d) a first element evaluator inside the element processor for determining whether an input element is an active element or a passive element; e) a first router inside the element processor for sending passive elements to an output stage of the element processor and active elements to an element queue; f) a second element evaluator inside the element processor for finding, in a definition table, a corresponding entity replacement value for the input element; and g) a second router inside the element processor for sending an element from the element queue to the second element evaluator and the output of the element processor when the element queue is not empty, and for sending an element from the input to the element processor when the element queue is empty.

17. A method of processing an input document to obtain an output document, comprising: a) parsing the input document to obtain a sequence consisting of element start tags, element end tags, entity references, and character strings; b) converting the input document into an instruction sequence by: 1) replacing each defined entity reference with an instruction sequence that, when executed, outputs the defined replacement value; 2) replacing each element identified as an active element with an instruction sequence that, when executed, takes as input the output of the instructions corresponding to the attributes and content of the element and passes it to the instruction sequence derived by this method from the definition of the active element; 3) replacing each element recognized as representing a primitive operation with an instruction sequence that, when executed, takes as input the output of the instructions corresponding to the attributes and content of the element and performs the primitive operation on them; and 4) replacing each unrecognized element start tag, element end tag, entity reference, and character string with an instruction sequence that outputs it; and c) executing the obtained instruction sequence to generate an output document.

18. A computer-readable storage medium having recorded thereon a program for causing a computer to execute the processing of the document processing method according to claim 1.

19. A computer-readable storage medium having recorded thereon a program for causing a computer to realize the functions of the document processing apparatus according to claim 16.