JPH09282208A

JPH09282208A - Table generating method

Info

Publication number: JPH09282208A
Application number: JP8087942A
Authority: JP
Inventors: Tetsuo Tanaka; 哲雄田中
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1996-04-10
Filing date: 1996-04-10
Publication date: 1997-10-31
Anticipated expiration: 2016-04-10
Also published as: JP3489326B2

Abstract

PROBLEM TO BE SOLVED: To make data easier to handle by defining file and record formats, extracting the character string corresponding to each record, and determining item values from the defined correspondences and the item types. SOLUTION: The name of a record format / file format definition is input, and the definition is retrieved from a record format / file format definition table 109. A target text file is input, and the character strings corresponding to records are identified in that file. The character strings corresponding to items are extracted from each identified record string. The value of each item is determined from the extracted character string, the correspondence between item names and the character strings corresponding to items, and the item type. Thus, simply by inputting the file and record formats, and without writing any program, a table can be generated from a text file in which multiple items appear repeatedly.

Description

Detailed Description of the Invention

[0001]

[Technical field of the invention] The present invention relates to a method of generating a table consisting of multiple records, each having multiple items, from a text file in which multiple items appear repeatedly and which is processed by a computer such as a workstation or a personal computer.

[0002]

[Prior art] Lists of homogeneous data composed of multiple items are often saved as text files on a computer. Examples include e-mail archives, lists of bibliographic data consisting of author names, document titles, journal names, publication dates and the like, and lists of function specifications consisting of function names, argument names, argument types, functionality and the like. Spreadsheet software, on the other hand, makes such lists of homogeneous data easy to manipulate. For example, it becomes easy to sort by the value of a certain item, or to extract only those entries whose value for a certain item satisfies a given condition. However, the file formats that spreadsheet software can handle are fixed. To handle the text files described above in spreadsheet software, processing is required such as combining all the files into a single file and separating the items with a character prescribed by the spreadsheet software, such as a comma.

[0003] Such text file processing can be done with a text editor such as the one described in "Yabuki, Miyagi, Tomita, Emacs used for the first time, Technopress, 1995". Using the editor's cut-and-paste function, the user selects the character string to be moved or deleted, cuts the selected string, and, when moving it, pastes the cut string at the destination.

[0004] Alternatively, a text-processing program can be written in a programming language such as those described in "Haruhisa Ishida, UNIX, Kyoritsu Shuppan Co., Ltd., 1983". Executing that program extracts only the required character strings from a text file and converts them into tabular data delimited by commas or spaces.

[0005]

[Problems to be solved by the invention] With the conventional text-editor approach, the user must select, cut, and paste text one piece at a time. The more data there is, the more operations the user must perform.

[0006] With the programming-language approach, the user must understand the grammar of the programming language and, for each kind of file, write a program that conforms to that grammar and performs the desired processing.

[0007] The present invention has been made to eliminate these inconveniences. Its object is to extract records composed of multiple items from one or more text files and to automatically generate a table composed of the extracted records. Converting multiple pieces of data that share the same structure into table form makes the data easy to manipulate. For example, it becomes easy to sort by the value of a certain item, or to extract only those entries whose value for a certain item satisfies a given condition.

[0008]

[Means for solving the problems] To achieve the above object, the table generation method of the present invention comprises: a step of inputting the file and record formats, namely, the start and end character string patterns of a record, used to identify the character string corresponding to a record within a file, the item delimiter character string patterns, used to identify the character strings corresponding to items within the character string corresponding to a record, the item names constituting a record, the item types, the correspondence between item names and text file attributes, and the correspondence between item names and the character strings corresponding to items in the text file; a step of extracting the attributes of the text file to be processed; a step of determining item values from the extracted attributes, the correspondence between item names and text file attributes, and the item types; a step of inputting the target text file; a step of extracting the character string corresponding to a record from the input text, based on the method of identifying the character string corresponding to a record; a step of extracting the character strings corresponding to items from the character string corresponding to the record, based on the method of identifying the character strings corresponding to the items of the record; and a step of determining item values from the extracted character strings, the correspondence between item names and the character strings corresponding to items, and the item types.

[0009] Hereinafter, a character string representing an item is called an item character string, and a character string representing a separator between items is called a delimiter character string.

[0010]

[Embodiments of the invention] Embodiments of the present invention are described below with reference to the drawings.

[0011] FIG. 1 shows an example of the configuration of the table generation system of the present invention. In FIG. 1, 101 is a central processing unit, 111 is a main storage device, 102 is a display that shows the processing results of the central processing unit 101, 103 and 104 are a keyboard and a mouse that accept input from the user, and 105 is an external storage device such as a hard disk. The external storage device stores a table generation program 106, a file format registration program 107, text files 108 to be processed, a record format / file format definition table 109 that holds the results registered by the file format registration program 107, and a basic control program 110. The programs and data stored in the external storage device 105 are transferred to the main storage device 111 and processed by the central processing unit 101.

[0012] FIG. 2 shows an example of the processing flow of the table generation program 106. FIG. 3 shows the details of the record generation steps (steps 209 and 211) in FIG. 2. FIG. 4 shows the details of the step in FIG. 3 that reads one record's worth of text from the file into a buffer (step 310). FIG. 5 shows the details of the step in FIG. 3 that divides the buffer into delimiter character strings and item character strings (step 311).

[0013] FIG. 6 shows an example of the processing flow of the file format registration program 107.

[0014] FIG. 7 shows an example of the dialog that the file format registration program 107 displays on the display 102 in the step of displaying the record format / file format definition dialog (step 601). In FIG. 7, 700 is the file format / record format definition dialog; 701 is a definition name input area; 702 is a selection button for choosing whether one file contains information for only one record or for multiple records; 703 is an input area for the record start pattern; 704 is a selection button for choosing whether or not the record includes the pattern entered in 703; 705 is an input area for the record end pattern; 706 is a selection button for choosing whether or not the record includes the pattern entered in 705; 707 is an area for defining the sequence of delimiter character strings and item character strings; 710 is a selection button for choosing whether the order of delimiter character strings and item character strings is fixed or variable; 711 is an input area for the record format and the item identification method; 718 is an OK button for registering the entered content; and 719 is a CANCEL button for ending the program without registering the input.

Area 707 consists of an input area 708 for the kind of entry (either "item name" or "delimiter character string pattern" is entered) and an input area 709 for the item name or delimiter character string pattern. Item names are never entered consecutively in 707; a delimiter character string is always placed between item names. Area 711 is a list of rows, each consisting of an item name input area 712, an item type input area 713 ("integer type", "floating point type", "date type", etc. is entered), an item determination method input area 714 ("from attribute" or "from content" is entered), an attribute name input area 715 ("last update date/time", "creator name", "file name", or "directory name" is entered), an item identification method input area 716 ("from preceding delimiter character string", "from following delimiter character string", or "from item character string pattern" is entered), and a character string pattern input area 717.

Areas 703 to 706 are filled in only when "multiple records per file" is selected in 702. The attribute name input area 715 is filled in only when "from attribute" is entered in the determination method 714. Areas 716 and 717 are filled in only when "from content" is entered in 714 and "variable" is selected in 710. For each item whose 714 is "from content", the same name as its item name 712 is entered in 708. The patterns in 703, 705, 709, and 717 are entered as regular expressions. Regular expressions are described in detail in "Shinji Kono, Introductory Perl, pp. 46-54, ASCII Corporation, 1994".

[0015] FIG. 8 shows an example of the structure of the record format / file format definition table 109. In FIG. 8, the record format / file format definition table 109 is a list of rows, each consisting of a definition name 801; a flag 802 indicating whether one file contains information for only one record or for multiple records (TRUE when only one record is contained, FALSE when multiple records are contained); a record start pattern definition 803; a record end pattern definition 806; a definition 810 of the sequence of delimiter character strings and items; and a pointer 814 to a record format table 818. The record start pattern definition 803 consists of a pattern 804 expressed as a regular expression and a flag 805 indicating whether or not the record includes that pattern (TRUE if it does, FALSE if it does not). Similarly, 806 consists of a pattern 807 expressed as a regular expression and a flag 808 indicating whether or not the record includes that pattern (TRUE if it does, FALSE if it does not). The sequence definition 810 consists of a pointer 812 to a delimiter character string / item name list 815 and a flag 813 indicating whether the order of the items is fixed or variable (TRUE when fixed, FALSE when variable). Definitions 803 and 806 are given only when 802 is FALSE.

The delimiter character string / item name list 815 is a list of rows, each consisting of a kind 816 and an item name or delimiter character string pattern 817. The kind 816 identifies whether the content of 817 is an item name or a delimiter character string pattern.

The record format table 818 is a list of rows, each consisting of an item name 819; an item type 820; a determination method 821 indicating whether the item value is determined from the content of the file or from the attributes of the file; an attribute name 822 indicating, when the determination method 821 is "from attribute", which of the file attributes "update date/time", "creator", "file name", and "directory name" the value is determined from; an item identification method 823 indicating, when the determination method 821 is "from content", how the item value is determined ("by preceding delimiter character string", "by following delimiter character string", or "by item character string pattern"); and a delimiter character string pattern or item character string pattern 824 used for the identification.
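The table layout of FIG. 8 might be modelled roughly as follows. This is only a sketch: the class and field names, the string constants, and the mail-archive example values are illustrative assumptions and do not appear in the specification; the comments give the reference numerals each field is meant to stand for.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SequenceEntry:
    """One row of the delimiter character string / item name list 815."""
    kind: str    # 816: "item" or "delimiter" (hypothetical constants)
    value: str   # 817: item name, or delimiter pattern as a regular expression


@dataclass
class ItemSpec:
    """One row of the record format table 818."""
    name: str                        # 819
    type_name: str                   # 820: e.g. "string", "integer", "date"
    source: str                      # 821: "attribute" or "content"
    attribute: Optional[str] = None  # 822: used when source == "attribute"
    method: Optional[str] = None     # 823: used when source == "content"
    pattern: Optional[str] = None    # 824: regular expression for 823


@dataclass
class FormatDefinition:
    """One row of the record format / file format definition table 109."""
    name: str                             # 801
    single_record: bool                   # 802: True = one record per file
    start_pattern: Optional[str] = None   # 804 (only when 802 is False)
    start_included: bool = True           # 805
    end_pattern: Optional[str] = None     # 807 (only when 802 is False)
    end_included: bool = True             # 808
    sequence: List[SequenceEntry] = field(default_factory=list)  # 815 via 812
    fixed_order: bool = True              # 813
    items: List[ItemSpec] = field(default_factory=list)          # 818 via 814


# Hypothetical definition for a mail archive with one "subject" item.
mail_archive = FormatDefinition(
    name="mail archive",
    single_record=False,
    start_pattern=r"(?m)^From ",
    end_pattern=r"(?m)^From ",
    end_included=False,
    sequence=[
        SequenceEntry("delimiter", r"Subject: "),
        SequenceEntry("item", "subject"),
        SequenceEntry("delimiter", r"\n"),
    ],
    items=[
        ItemSpec("archive", "string", "attribute", attribute="file name"),
        ItemSpec("subject", "string", "content"),
    ],
)
```

Plain lists stand in for the pointers 812 and 814, since the specification only requires that each definition row can reach its own list 815 and table 818.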

[0016] FIG. 9 shows an example of a table generated by the table generation method of the present invention. Table 91 is a list of records 92, each composed of items 93 having the item names 819 defined in the record format table 818.

[0017] FIG. 10 shows an example of the result of the step in FIG. 3 that divides the character string in the buffer into delimiter character strings and item character strings (step 311). The result of the division is a concatenation of delimiter character strings 1001 and item character strings 1002.

[0018] Next, the processing flow of the table generation program 106 is described with reference to FIG. 2.

[0019] First, a record format / file format definition name is input, and the definition with that name is looked up in the record format / file format definition table 109 and read (step 201). Next, an empty table 91 containing all the item names 819 of the record format table 818 pointed to by the record format pointer 814 is generated (step 202), and the names of one or more text files to be processed are input (step 203). The following processing is then performed on every file whose name was input. That is, it is determined whether all of the named files have been processed (step 204); if they have, processing ends. Otherwise, one of the unprocessed files is selected (step 205), and its attributes, namely its last update date/time, creator name, file name, and the name of the directory in which it is stored, are recorded (step 206). Next, flag 802 is used to determine whether the file contains a single record (step 207). If the file contains multiple records, the following processing is repeated until the end of the file. That is, it is determined whether the file has been processed to the end (step 208); if so, processing returns to step 204. If not, a record 92 is generated from the unprocessed portion (step 209), the generated record 92 is inserted into the table 91 generated in step 202 (step 210), and processing returns to step 208. If it is determined in step 207 that the file contains only one record, a record 92 is generated from the file (step 211), the generated record 92 is inserted into the table 91 generated in step 202 (step 212), and processing returns to step 204.
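The control flow of FIG. 2 can be sketched as a pair of nested loops. The sketch below makes several simplifying assumptions that are not in the specification: files are passed in as an in-memory mapping from file name to text, only the "file name" attribute is recorded, and record generation (steps 209 and 211) is delegated to a hypothetical callback `read_record`. The function names and the line-per-record reader are likewise illustrative.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]
# Hypothetical stand-in for steps 209/211: given the file text and a position,
# return (record, next position), or None when no further record exists.
RecordReader = Callable[[str, int], Optional[Tuple[Record, int]]]


def generate_table(item_names: List[str], files: Dict[str, str],
                   single_record: bool, read_record: RecordReader) -> List[Record]:
    table: List[Record] = []                     # step 202: empty table 91
    for file_name, text in files.items():        # steps 204-205: next file
        attributes = {"file name": file_name}    # step 206 (one attribute only)
        pos = 0
        while True:                              # step 208: until end of file
            result = read_record(text, pos)      # step 209 or 211
            if result is None:
                break
            record, pos = result
            # Keep only the declared item names; content wins over attributes.
            row = {n: record.get(n, attributes.get(n, "")) for n in item_names}
            table.append(row)                    # step 210 or 212
            if single_record:                    # flag 802: one record per file
                break
    return table


def read_line(text: str, pos: int) -> Optional[Tuple[Record, int]]:
    """Toy reader that treats every line as one record with a single item."""
    if pos >= len(text):
        return None
    end = text.find("\n", pos)
    if end == -1:
        end = len(text)
    return {"line": text[pos:end]}, end + 1


table = generate_table(["file name", "line"],
                       {"a.txt": "x\ny\n", "b.txt": "z"},
                       single_record=False, read_record=read_line)
```

With `single_record=True` the inner loop runs once per file, which corresponds to the branch through steps 211 and 212.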

[0020] Next, the details of the record generation steps in FIG. 2 (steps 209 and 211) are described with reference to FIG. 3.

[0021] First, it is determined whether all rows of the record format table 818 whose determination method 821 is "from attribute" have been processed (step 301). If not, the following processing is performed on one of the unprocessed rows. That is, one unprocessed row in the record format table 818 is selected (step 302). If the attribute name 822 of that row is "last update date/time" (step 303), the last update date/time of the file becomes the item 93 having that row's item name 819 (step 304), and processing returns to step 301. If the attribute name 822 of that row is "file name" (step 305), the file name becomes the item 93 having that row's item name 819 (step 306), and processing returns to step 301. If the attribute name 822 of that row is "directory name" (step 307), the name of the directory in which the file is stored becomes the item 93 having that row's item name 819 (step 308), and processing returns to step 301. If the attribute name 822 of that row is none of the above, it must be "creator name", so the creator name of the file becomes the item 93 having that row's item name 819 (step 309), and processing returns to step 301.

When step 301 finds that all rows in the record format table 818 whose 821 is "from attribute" have been processed, one record's worth of text is read from the file into a buffer (step 310), and the buffer content is divided into delimiter character strings 1001, which represent separators, and item character strings 1002, which represent items (step 311). Next, i is set to 1 (step 312), and it is determined whether i is less than or equal to the number of item names in the delimiter character string / item name list 815 (step 313); if it is not, processing ends. If it is, flag 813 is used to determine whether the order of the items is fixed (step 314). If the order is fixed, the i-th item character string among the strings divided in step 311 becomes the item having the i-th item name in the delimiter character string / item name list 815 (step 315). If the order is variable, the i-th item name 817 in the delimiter character string / item name list 815 is taken as N, and the row L of the record format table 818 whose item name 819 equals N is looked up (step 316). If the item identification method 823 of L is "by preceding delimiter character string" (step 317), a delimiter character string 1001 matching the character string pattern 824 of L is searched for in the buffer divided in step 311, and the item character string 1002 that follows that delimiter character string 1001 becomes the item 93 having item name N (step 318). If the item identification method 823 of L is "by following delimiter character string" (step 319), a delimiter character string 1001 matching the character string pattern 824 of L is searched for in the buffer divided in step 311, and the item character string 1002 that precedes that delimiter character string 1001 becomes the item 93 having item name N (step 320). If the item identification method 823 of L is neither of these, that is, "by item character string pattern", an item character string 1002 matching the character string pattern 824 of L is searched for in the buffer divided in step 311, and that item character string 1002 becomes the item 93 having item name N (step 321). Then i is incremented by 1 (step 322), and processing returns to step 313. In steps 304, 306, 308, 309, 318, 320, and 321, when an attribute or an item character string is made into an item, it is converted to the type 820 given in the record format table 818.

Next, the details of the step in FIG. 3 that reads one record's worth of text from the file into the buffer (step 310) are described with reference to FIG. 4.
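The item assignment loop of steps 312 to 322 might look like the following sketch. It assumes the buffer has already been divided into a list of `("delimiter", text)` and `("item", text)` pairs, and that each row of the record format table is a plain dict with hypothetical keys `method`, `pattern`, and `type`; the method constants `"preceding"`, `"following"`, and `"pattern"` and all function names are illustrative, not taken from the specification. Full-string regular expression matching is also an assumption.

```python
import re
from typing import Dict, List, Optional, Tuple

Part = Tuple[str, str]  # ("delimiter" | "item", text), i.e. the output of step 311


def convert(text: str, type_name: str):
    """Convert an item character string to its declared type 820."""
    if type_name == "integer":
        return int(text)
    if type_name == "float":
        return float(text)
    return text


def neighbor_item(parts: List[Part], pattern: str, offset: int) -> Optional[str]:
    """Find a delimiter matching `pattern`; return the item next to it."""
    for idx, (kind, text) in enumerate(parts):
        if kind == "delimiter" and re.fullmatch(pattern, text):
            j = idx + offset
            if 0 <= j < len(parts) and parts[j][0] == "item":
                return parts[j][1]
    return None


def assign_items(parts: List[Part], item_names: List[str],
                 specs: Dict[str, dict], fixed_order: bool) -> dict:
    record = {}
    item_texts = [t for kind, t in parts if kind == "item"]
    for i, name in enumerate(item_names):             # steps 312, 313, 322
        spec = specs[name]                            # row L (step 316)
        text = None
        if fixed_order:                               # step 314
            if i < len(item_texts):
                text = item_texts[i]                  # step 315
        elif spec["method"] == "preceding":           # steps 317-318
            text = neighbor_item(parts, spec["pattern"], +1)
        elif spec["method"] == "following":           # steps 319-320
            text = neighbor_item(parts, spec["pattern"], -1)
        else:                                         # step 321
            text = next((t for kind, t in parts
                         if kind == "item" and re.fullmatch(spec["pattern"], t)),
                        None)
        if text is not None:
            record[name] = convert(text, spec["type"])
    return record


parts = [("delimiter", "From: "), ("item", "alice"),
         ("delimiter", "\nSize: "), ("item", "42"), ("delimiter", "\n")]
specs = {
    "sender": {"method": "preceding", "pattern": r"From: ", "type": "string"},
    "size": {"method": "following", "pattern": r"\n", "type": "integer"},
}
record = assign_items(parts, ["sender", "size"], specs, fixed_order=False)
```

In the fixed-order case the position of an item string alone decides which item name it receives, so the `method` and `pattern` entries are not consulted.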

[0022] First, flag 805 is used to determine whether the record information includes the record start pattern (step 401). If it does, the file is skipped up to just before the character string matching pattern 804 (step 402). If it does not, the file is skipped up to and including the character string matching pattern 804 (step 403). Next, it is determined whether the record information includes the record end pattern (step 404). If it does, the file is read up to and including the character string matching the pattern and stored in the buffer (step 405). If it does not, the file is read up to just before the character string matching the pattern and stored in the buffer (step 406).
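The four cases of FIG. 4 reduce to choosing which end of each regular expression match bounds the record. The sketch below works on the whole file text in memory rather than on a stream, and treats a missing end pattern as "the record runs to the end of the file"; both are assumptions, as are the function and parameter names. It also assumes the start pattern never matches an empty string, so that each call advances.

```python
import re
from typing import Optional, Tuple


def read_record(text: str, pos: int,
                start_pattern: str, start_included: bool,
                end_pattern: str, end_included: bool) -> Optional[Tuple[str, int]]:
    """Return (record text, next position), or None when no record remains."""
    start = re.compile(start_pattern).search(text, pos)
    if start is None:
        return None
    # Steps 401-403: begin at the match if it belongs to the record (flag 805),
    # otherwise just after it.
    begin = start.start() if start_included else start.end()
    # Search for the end pattern after the start match so the two never overlap.
    end = re.compile(end_pattern).search(text, start.end())
    if end is None:
        stop = len(text)  # assumption: the last record runs to the end of the file
    else:
        # Steps 404-406: include or exclude the end match (flag 808).
        stop = end.end() if end_included else end.start()
    return text[begin:stop], stop


archive = "From a\nhello\nFrom b\nbye\n"
first = read_record(archive, 0, r"(?m)^From ", True, r"(?m)^From ", False)
```

Leaving the end pattern out of the record means the next call starts at that same match, which suits formats such as the mail archive above, where one line both ends a record and begins the next.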

[0023] Next, the details of the step in FIG. 3 that divides the buffer content into delimiter character strings and item character strings (step 311) are described with reference to FIG. 5.

[0024] First, a cursor indicating the processing position in the buffer is set to the start of the buffer, and i and j are each set to 1 (step 501). Next, it is determined whether the cursor is at the end of the buffer (step 502); if it is, processing ends. If it is not, flag 813 is used to determine whether the order of the items is fixed (step 503). If the order is fixed, the character string matching the i-th delimiter character string pattern becomes the i-th delimiter character string 1001 (step 504). If the order is not fixed, the first character string at or after the cursor position that matches any of the delimiter character string patterns in 815 becomes the i-th delimiter character string. Next, if i is 2 or greater and the (i-1)-th and i-th delimiter character strings are not adjacent (step 506), the character string between the (i-1)-th and i-th delimiter character strings becomes the j-th item character string, and j is incremented by 1 (step 507). Then the cursor is moved to immediately after the i-th delimiter character string, and i is incremented by 1 (step 508).
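A minimal sketch of the division in FIG. 5 follows. It returns the alternating sequence of FIG. 10 as a list of `("delimiter", text)` and `("item", text)` pairs. The function name, the pair representation, the stop conditions for "no match found" and "empty match", and the tie-break by list order when two patterns match at the same position are all assumptions that the specification does not spell out.

```python
import re
from typing import List, Tuple


def split_buffer(buffer: str, delimiter_patterns: List[str],
                 fixed_order: bool) -> List[Tuple[str, str]]:
    parts: List[Tuple[str, str]] = []
    cursor = 0                                   # step 501
    prev_end = None                              # end of the previous delimiter
    i = 0
    while cursor < len(buffer):                  # step 502
        if fixed_order:                          # step 503
            if i >= len(delimiter_patterns):
                break
            match = re.compile(delimiter_patterns[i]).search(buffer, cursor)  # step 504
        else:
            # Variable order: take the earliest match of any delimiter pattern.
            found = (re.compile(p).search(buffer, cursor) for p in delimiter_patterns)
            candidates = [m for m in found if m is not None]
            match = min(candidates, key=lambda m: m.start(), default=None)
        if match is None or match.start() == match.end():
            break                                # assumption: stop if nothing usable matches
        if prev_end is not None and match.start() > prev_end:     # step 506
            parts.append(("item", buffer[prev_end:match.start()]))  # step 507
        parts.append(("delimiter", match.group()))
        prev_end = cursor = match.end()          # step 508
        i += 1
    return parts


buffer = "From: alice\nSize: 42\n"
patterns = [r"From: ", r"\nSize: ", r"\n"]
parts = split_buffer(buffer, patterns, fixed_order=True)
```

As in the described flow, text before the first delimiter and after the last one is not turned into an item, because an item string is only ever taken from between two delimiter strings.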

[0025] Next, the processing of the file format registration program 107 is described with reference to FIG. 6.

[0026] First, the record format / file format definition dialog is displayed on the display 102 (step 601), and input from the user is accepted (step 602). If the input is a press of a selection button (702, 704, 706, 710) (step 603), a mark indicating the selection is shown (step 604), and processing returns to step 602. If the input is text entered into an input area (701, 703, 705, 708, 711) (step 605), the entered content is displayed in the input area (step 606), and processing returns to step 602. If the input is a press of the OK button 718 (step 607), the record format / file format definition table is updated from the entered content (step 607), and processing ends. For any other input, that is, a press of the CANCEL button 719, processing ends.

[0027] In step 604, one row is added to the record format / file format definition table 109, and the value of each field is set as follows. The content of 701 becomes the value of 801. If "one record per file" is selected in 702, 802 is set to TRUE; if "multiple records per file" is selected, 802 is set to FALSE. The content of 703 becomes the value of 804. If "from" is selected in 704, 805 is set to TRUE; if "from after" is selected, 805 is set to FALSE. The content of 705 becomes the value of 807. If "from" is selected in 706, 808 is set to TRUE; if "from after" is selected, 808 is set to FALSE. The content of 707 is used as is as the delimiter character string / item name list 815, and a pointer to it becomes the value of 812. If "fixed" is selected in 710, 813 is set to TRUE; if "variable" is selected, 813 is set to FALSE. The content of 711 is used as is as the record format table 818, and a pointer to it becomes the value of 814.
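The field-by-field mapping described above can be written as a single function. The sketch assumes the dialog state arrives as a plain dict; the key names (`definition_name`, `records_per_file`, `start_choice`, and so on), the string constants, and the function name are all hypothetical, and the comments show which reference numerals each entry is meant to correspond to.

```python
from typing import Any, Dict


def build_definition_row(dialog: Dict[str, Any]) -> Dict[str, Any]:
    """Map the dialog inputs of FIG. 7 to one row of the table in FIG. 8."""
    return {
        "name": dialog["definition_name"],                       # 701 -> 801
        "single_record": dialog["records_per_file"] == "one",    # 702 -> 802
        "start_pattern": dialog.get("start_pattern"),            # 703 -> 804
        "start_included": dialog.get("start_choice") == "from",  # 704 -> 805
        "end_pattern": dialog.get("end_pattern"),                # 705 -> 807
        "end_included": dialog.get("end_choice") == "from",      # 706 -> 808
        "sequence": list(dialog["sequence"]),                    # 707 -> 815 (via 812)
        "fixed_order": dialog["order"] == "fixed",               # 710 -> 813
        "items": list(dialog["items"]),                          # 711 -> 818 (via 814)
    }


row = build_definition_row({
    "definition_name": "mail archive",
    "records_per_file": "multiple",
    "start_pattern": r"^From ",
    "start_choice": "from",
    "end_pattern": r"^From ",
    "end_choice": "from after",
    "sequence": [("delimiter", "Subject: "), ("item", "subject"), ("delimiter", r"\n")],
    "order": "fixed",
    "items": [{"name": "subject", "type": "string", "source": "content"}],
})
```

Copying the sequence and item lists, rather than storing references to the dialog's own lists, keeps a registered definition from changing if the dialog is edited afterwards; the specification itself only says the content is used as is.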

[0028]

[Effects of the invention] According to the present invention, a table can be generated from a text file in which multiple items appear repeatedly simply by inputting the file format and the record format, without writing a program in a programming language. The user therefore only needs to input an amount of information proportional to the number of items in one record, and as long as the record format does not change, the user's operations do not increase even when the amount of data increases.

【００２９】[0029]

Further, according to the present invention, the file format definition and the record format definition can be saved once they have been input, so the definition information does not have to be input again when files of the same format are processed.

【００３０】[0030]

Further, according to the present invention, not only the contents of a text file but also the attributes of the file can be used as items of a record.
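
As a rough illustration of using file attributes as record items, the Python sketch below reads a few attributes with the standard library and maps them to item names. The function name `file_attribute_items`, the attribute keys `"name"`, `"location"` and `"modified"`, and the shape of the `attribute_to_item` mapping are assumptions for this example. The creator's name is omitted because retrieving it depends on the platform.

```python
import os
from datetime import datetime


def file_attribute_items(path, attribute_to_item):
    """Return {item name: attribute value} for the requested file attributes.

    attribute_to_item maps a hypothetical attribute key ("name",
    "location", "modified") to the item name it should fill.
    """
    st = os.stat(path)
    attributes = {
        "name": os.path.basename(path),
        "location": os.path.dirname(os.path.abspath(path)),
        "modified": datetime.fromtimestamp(st.st_mtime),  # time of last update
    }
    return {item: attributes[attr] for attr, item in attribute_to_item.items()}
```

The resulting dictionary can be merged into each record extracted from the same file, so that every row carries, for example, the name of the file it came from.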

[Brief description of drawings]

FIG. 1 shows the system configuration.

FIG. 2 is a processing flow of the table generation program 106.

FIG. 3 is a detailed processing flow of steps 209 and 211 in FIG. 2.

FIG. 4 is a detailed processing flow of step 310 in FIG. 3.

FIG. 5 is a detailed processing flow of step 311 in FIG. 3.

FIG. 6 is a processing flow of the file format registration program 107.

FIG. 7 shows the structure of the record format / file format definition dialog.

FIG. 8 shows the record format / file format definition table.

FIG. 9 shows a table.

FIG. 10 shows an example of the result of step 311 in FIG. 3.

[Explanation of symbols]

91 ... table, 92 ... record, 93 ... item, 101 ... central processing unit, 102 ... display, 103 ... keyboard, 104 ... mouse, 105 ... external storage device, 106 ... table generation program, 107 ... file format registration program, 108 ... text to be processed, 109 ... record format / file format definition table, 110 ... basic control program, 111 ... main storage device, 700 ... file format / record format definition dialog, 815 ... delimiter string / item name list, 818 ... record format table, 1001 ... delimiter string, 1002 ... item string.

Claims

[Claims]

1. In a system for generating a table consisting of a plurality of records each having a plurality of items from one or more text files in which a plurality of items appear repeatedly, a table generation method comprising: a step of inputting a definition of the format of the text file and of the record, the definition consisting of information for identifying a character string corresponding to a record included in the text file, information for identifying a character string corresponding to an item of a record included in the character string corresponding to the record, the names of the items forming the record, the types of the items, a correspondence between item names and attributes of the text file, and a correspondence between item names and character strings corresponding to items included in the text file; a step of inputting a text file; a step of extracting attributes of the text file; a step of determining the value of an item from the attribute extracted in the preceding step, the correspondence between item names and attributes of the text file, and the type of the item; a step of extracting a character string corresponding to a record from the input text based on the method of identifying a character string corresponding to a record; a step of extracting a character string corresponding to an item from the character string corresponding to the record extracted in the preceding step, based on the method of identifying a character string corresponding to an item of a record; and a step of determining the value of an item from the extracted character string, the correspondence between item names and character strings corresponding to items, and the type of the item.

2. The table generation method according to claim 1, further comprising: storage means for storing information; a step of storing, in the storage means, the definition input in the step of inputting the definition of the format of the text file and the record; and a step of selecting one of a plurality of definitions stored in the storage means, wherein the step of determining the value of an item from the attribute, the step of extracting a character string corresponding to a record, the step of extracting a character string corresponding to an item, and the step of determining the value of an item from the extracted character string are performed based on the content of the selected definition.

3. The table generation method according to claim 1 or 2, wherein the method of identifying a character string corresponding to a record in a file is a pattern of the character string at the beginning of the record and a pattern of the character string at the end of the record.

4. The table generation method according to any one of claims 1 to 3, wherein the information for identifying the character string corresponding to an item of a record within the character string corresponding to the record is a pattern of a character string delimiting the items.

5. The table generation method according to claim 1, wherein the attributes of the file are the date and time of the last update of the file, the name of the creator of the file, the name of the file, and the location where the file is stored.

6. The table generation method according to claim 1, wherein the correspondence between the name of an item and the character string corresponding to the item in a text file is a pair consisting of a pattern of a character string delimiting the item before or after the item and the item name, or a pair consisting of a pattern of the character string corresponding to the item and the item name.