JP3414319B2

JP3414319B2 - Data retrieval apparatus, method and recording medium

Info

Publication number: JP3414319B2
Application number: JP12305899A
Authority: JP
Inventors: 要一中嶋
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1999-04-28
Filing date: 1999-04-28
Publication date: 2003-06-09
Anticipated expiration: 2019-04-28
Also published as: JP2000311088A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a data retrieval apparatus, method, and recording medium that retrieve data by checking whether it matches semantically.

[0002]

[Description of the Related Art] A conventional character string replacement system reads the contents of a text such as a source program and, working in order from the beginning of the text, checks whether any character string matches a specified character string. Whenever a match is found, the system replaces it with the designated replacement string, and it repeats this process until the end of the text.

[0003] In the conventional system, however, the specified character string is compared with candidate strings purely as sequences of characters, with no awareness of the concept of a "word". As a result, strings that should not be replaced also become replacement targets. For example, an attempt to replace “かき” (kaki) with “くり” (kuri) also replaces the “かき” inside the string “かきくけこ” (kakikukeko).

[0004] To address this, Japanese Patent Laid-Open No. Hei 3-150677 proposes a character string search method in which a program is divided into words (lexical tokens) and compared with the search string word by word. With this technique, an attempt to replace “かき” with “くり” no longer treats the “かき” inside “かきくけこ” as a replacement target.
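
The difference between the two approaches can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that whitespace is the only word separator; the function names `replace_substrings` and `replace_words` are hypothetical and do not come from the cited publication.

```python
import re

def replace_substrings(text, target, replacement):
    """Conventional approach: replace every occurrence, ignoring word boundaries."""
    return text.replace(target, replacement)

def replace_words(text, target, replacement):
    """Word-unit approach: replace only tokens that equal the target exactly."""
    # The capturing group keeps the separators so the text can be rebuilt as-is.
    parts = re.split(r"(\s+)", text)
    return "".join(replacement if part == target else part for part in parts)

sample = "かき かきくけこ"
naive_result = replace_substrings(sample, "かき", "くり")
word_result = replace_words(sample, "かき", "くり")
print(naive_result)  # the かき inside かきくけこ is also rewritten
print(word_result)   # only the standalone かき is rewritten
```

A real tokenizer would follow the separator rules of the language being processed rather than splitting on whitespace alone.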

[0005]

[Problem to Be Solved by the Invention] Even with the technique described in the above publication, however, a character string that is semantically identical to the specified string but expressed differently is not treated as a replacement target. For example, let “う OF た” denote a hierarchical relationship in which “た” (ta) exists above “う” (u), and let “う OF な OF た” denote a hierarchical relationship in which “な” (na) exists above “う” and “た” exists above “な”.

[0006] Here, “う OF た” and “う OF な OF た” can be recognized as denoting the same object regardless of the difference in their syntactic description. They are semantically identical, so both should be eligible for replacement. In conventional systems (including the one described in the above publication), however, the two expressions do not match as strings, so one cannot be handled as a replacement target when the other is specified. Thus there is also the problem that a character string that should be replaced is not replaced.
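
One way to picture this kind of semantic match is the following Python sketch. It assumes a COBOL-style qualification rule, namely that a reference may omit intermediate levels as long as the named levels appear in order above the lowest item. The rule as coded here and the name `same_target` are illustrative assumptions, not text taken from the specification.

```python
def same_target(reference, full_path):
    """Return True if `reference` can denote the item described by `full_path`.

    Both arguments list names from the lowest level upward, so
    "う OF な OF た" is written as ["う", "な", "た"].
    """
    if not reference or not full_path or reference[0] != full_path[0]:
        return False
    # Each qualifier must be found, in order, among the remaining ancestors.
    ancestors = iter(full_path[1:])
    return all(qualifier in ancestors for qualifier in reference[1:])

short_form = ["う", "た"]        # う OF た
long_form = ["う", "な", "た"]   # う OF な OF た
print(same_target(short_form, long_form))
```

A plain string comparison of the two expressions fails, whereas a comparison of the hierarchy they describe succeeds.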

[0007] The present invention has been made to solve the above problems of the prior art. Its object is to provide a data retrieval apparatus and method capable of retrieving data that semantically matches desired data, such as data to be replaced with other data, and a recording medium on which a program for executing the method is recorded.

[0008]

[Means for Solving the Problem] To achieve the above object, a data retrieval apparatus according to a first aspect of the present invention comprises: first analysis means for parsing, on the basis of a predetermined rule, a first character string that is written according to the predetermined rule with its hierarchical relationships made explicit; second analysis means for parsing, on the basis of the predetermined rule, a second character string that is to be searched for; first collation means for checking whether the analysis result of the first analysis means contains a portion that can be recognized as denoting the same object as the analysis result of the second analysis means and that therefore matches it semantically; and first extraction means for extracting, as the retrieved character string, the character string in the first character string that corresponds to the portion found to match semantically by the first collation means.

[0009] In this data retrieval apparatus, the first collation means checks whether the first and second character strings match on the basis of their parse results rather than their surface representation as character strings. This allows the first extraction means to extract, as the retrieved character string, a string that has the same meaning even if it is expressed differently.

[0010] The data retrieval apparatus may further comprise first designation means for designating a third character string with which the second character string is to be replaced, and first replacement means for replacing the character string extracted by the first extraction means with the third character string designated by the first designation means.

[0011] Here, the first extraction means may notify the first replacement means of the position, within the first character string, of the character string corresponding to the portion found to match semantically by the first collation means. In this case, the first replacement means can perform the replacement with the third character string on the basis of the position notified by the first extraction means.

[0012] Furthermore, the first analysis means may associate, with the analysis result of the first character string, information for performing the replacement with the third character string designated by the designation means. In this case, the first replacement means can perform the replacement with the third character string on the basis of the information associated with the analysis result of the first character string.

[0013] In the data retrieval apparatus, the first and second analysis means can divide the first and second character strings, respectively, into words and relate the resulting words to one another in a predetermined data structure to form the respective analysis results. In this case, the first collation means can check for a semantically matching portion by comparing the analysis results of the first and second analysis means word by word.

[0014] The first and second character strings may each contain a plurality of words written in a hierarchical structure. In this case, the first and second analysis means link the hierarchically written words into a list, in order from the higher-level or the lower-level word, to form the respective analysis results, and the first collation means can check for a semantically matching portion while following the respective lists.

[0015] Furthermore, the first analysis means may store the highest-level or lowest-level word of the hierarchical structure at a predetermined storage location determined by a hash value obtained by a hash method. The first collation means can then obtain a hash value from the highest-level or lowest-level word among the hierarchically written words in the second character string and, starting from the word of the first character string stored at the location given by that hash value, follow the list in order while checking for a semantically matching portion.
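
The hash-based storage and lookup described above might be sketched in Python as follows. The table size, the hash function (sum of code points modulo the table size), and the names `bucket_index`, `build_table`, and `lookup` are all assumptions chosen for illustration; the specification does not state which hash method is used.

```python
TABLE_SIZE = 8  # assumed size; the specification does not give one

def bucket_index(word, size=TABLE_SIZE):
    """Toy hash: sum of the code points of the word, modulo the table size."""
    return sum(ord(ch) for ch in word) % size

def build_table(lowest_words, size=TABLE_SIZE):
    """Store each lowest-level word, with its position, in the bucket for its hash."""
    table = [[] for _ in range(size)]
    for position, word in enumerate(lowest_words):
        table[bucket_index(word, size)].append((word, position))
    return table

def lookup(table, word):
    """Visit only the bucket for `word` and keep entries whose name really matches."""
    bucket = table[bucket_index(word, len(table))]
    return [position for stored, position in bucket if stored == word]

table = build_table(["あ", "か", "あ", "さ"])
print(lookup(table, "あ"))  # positions of the entries named あ
```

Because different words can share a bucket, the stored name is still compared with the query word before an entry is accepted.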

[0016] In the data retrieval apparatus, the first character string may be written in a predetermined programming language. In this case, the first and second analysis means can parse the first and second character strings, respectively, on the basis of the specification of that programming language.

[0017] When the first character string is a data definition in the predetermined programming language, the data retrieval apparatus may further comprise: operand analysis means for dividing a fourth character string, which is a processing description in the predetermined programming language, into operands; third analysis means for parsing the operands produced by the operand analysis means; second collation means for checking whether the analysis result of an operand produced by the third analysis means contains a portion that semantically matches the analysis result of the first analysis means; and second extraction means for extracting, as the retrieved character string, the character string in the fourth character string that corresponds to the portion found to match semantically by the first collation means.

[0018] Here, the third analysis means can restrict parsing to those operands produced by the operand analysis means that are not reserved words.

[0019] An apparatus that further analyzes the fourth character string in this way may further comprise second designation means for designating a fifth character string with which the fourth character string is to be replaced, and second replacement means for replacing the character string extracted by the second extraction means with the fifth character string designated by the second designation means.

[0020] To achieve the above object, a data retrieval method according to a second aspect of the present invention comprises: a first analysis step of parsing, on the basis of a predetermined rule, a first character string that is written according to the predetermined rule with its hierarchical relationships made explicit; a second analysis step of parsing, on the basis of the predetermined rule, a second character string that is to be searched for; a collation step of checking whether the analysis result of the first analysis step contains a portion that can be recognized as denoting the same object as the analysis result of the second analysis step and that therefore matches it semantically; and an extraction step of extracting, as the retrieved character string, the character string in the first character string that corresponds to the portion found to match semantically in the collation step.

[0021] The data retrieval method may further comprise a designation step of designating a third character string with which the second character string is to be replaced, and a replacement step of replacing the character string extracted in the extraction step with the third character string designated in the designation step.

[0022] To achieve the above object, a computer-readable recording medium according to a third aspect of the present invention records a program for causing a computer to execute: a first analysis step of parsing, on the basis of a predetermined rule, a first character string that is written according to the predetermined rule with its hierarchical relationships made explicit; a second analysis step of parsing, on the basis of the predetermined rule, a second character string that is to be searched for; a collation step of checking whether the analysis result of the first analysis step contains a portion that can be recognized as denoting the same object as the analysis result of the second analysis step and that therefore matches it semantically; and an extraction step of extracting, as the retrieved character string, the character string in the first character string that corresponds to the portion found to match semantically in the collation step.

[0023] The computer-readable recording medium may record a program for causing the computer to further execute a designation step of designating a third character string with which the second character string is to be replaced, and a replacement step of replacing the character string extracted in the extraction step with the third character string designated in the designation step.

[0024]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the accompanying drawings.

[0025] [First Embodiment] FIG. 1 is a functional block diagram showing the data replacement system according to this embodiment. As shown in the figure, the system is given a source program 1 containing character strings to be replaced, and a parameter 3 containing the character string to be replaced and the character string that replaces it. The system includes a character string replacement function 2, which replaces character strings in the source program 1 in accordance with the given parameter 3.

[0026] The source program 1 is written in a predetermined programming language (a high-level language) and includes a data definition section 11 and a processing description section 12. The parameter 3 specifies the character string to be replaced (the target string) and the character string that takes its place (the replacement string).

[0027] The character string replacement function 2 comprises a data definition dictionary 21, a data analysis unit 22, a data collation unit 23, an operand analysis unit 24, a character string replacement unit 25, and a data/processing collation unit 26. Within the character string replacement function 2, control passes in the order data analysis unit 22 → data collation unit 23 → operand analysis unit 24 → character string replacement unit 25.

[0028] In the data definition dictionary 21, one entry is registered for each individually defined data item. The dictionary is built so that entries are registered with a hierarchical structure that mirrors the hierarchical structure of the data. For example, even if a data item named “あ” (a) is defined more than once, the dictionary represents the definition whose parent is “か” (ka) and the definition whose parent is “さ” (sa) as different data items.

[0029] The data analysis unit 22 parses the data definition section 11 of the source program 1 and registers the data definitions in the data definition dictionary 21 as a hierarchical structure of relationships between words (lexical tokens). This is referred to as building the data definition dictionary 21.

[0030] After the data definition dictionary 21 has been built, the data collation unit 23 obtains the target string (the character string to be replaced) given in the parameter 3 and parses it. The data collation unit 23 compares the parse result of the target string with the parse results registered in the data definition dictionary 21 and checks whether the dictionary contains a data definition that can be recognized as denoting the same object, and hence matches semantically, regardless of any difference in syntactic description. For each data definition found to match the target string semantically, the data collation unit 23 sets information in the data definition dictionary 21 indicating that the data item is a replacement target.

[0031] After the target string has been collated, the operand analysis unit 24 performs lexical analysis on the processing description section 12 of the source program 1 and divides the processing description into syntactically independent units (operands).

[0032] The data/processing collation unit 26 receives the operands produced by the operand analysis unit 24 and checks whether each operand that is not a reserved word of the programming language of the source program 1 matches a data name in the data definition dictionary 21.
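
As a rough illustration of operand splitting and reserved-word filtering, the following Python sketch divides one statement into operands and keeps only those that are not reserved words. The reserved-word set is a tiny illustrative subset, the sample statement is invented, and `split_operands` and `data_operands` are hypothetical names; none of this is taken from the specification or from a full COBOL grammar.

```python
RESERVED_WORDS = {"MOVE", "TO", "ADD", "DISPLAY"}  # illustrative subset only

def split_operands(statement):
    """Split a statement into operands, keeping "name OF name" chains together."""
    words = statement.strip().rstrip(".．").split()
    operands, current = [], []
    for word in words:
        if word.upper() in RESERVED_WORDS:
            if current:
                operands.append(" ".join(current))
                current = []
            operands.append(word)
        elif word.upper() == "OF" or (current and current[-1].upper() == "OF"):
            current.append(word)  # continue a qualified name
        else:
            if current:
                operands.append(" ".join(current))
            current = [word]      # start a new operand
    if current:
        operands.append(" ".join(current))
    return operands

def data_operands(statement):
    """Keep only the operands that are not reserved words."""
    return [op for op in split_operands(statement) if op.upper() not in RESERVED_WORDS]

statement = "MOVE あ OF さ TO い."
print(split_operands(statement))
print(data_operands(statement))
```

Only the operands returned by `data_operands` would then be compared with the data names in the dictionary.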

[0033] For each operand successfully matched in the collation performed by the data/processing collation unit 26, the character string replacement unit 25 replaces the operand in the processing description section 12 of the source program 1 with the replacement string given in the parameter 3. Likewise, for each data item successfully matched against the target string given in the parameter 3 in the collation performed by the data collation unit 23, the character string replacement unit 25 replaces it with the replacement string given in the parameter 3. That is, the successfully matched data items in the data definition section 11 of the source program 1 are replaced with the replacement string.

[0034] Next, the hardware configuration that implements the functions of the system described above is explained with reference to the block diagram of FIG. 2. As shown, the hardware comprises a secondary storage device 51, a main memory 52, an input unit 53, an output unit 54, and a central processing unit (hereinafter, CPU) 55, all of which are connected to one another via a bus 56.

[0035] The secondary storage device 51 consists of, for example, a hard disk, and stores the source program 1, a program for carrying out the character string replacement function 2, and the data definition dictionary 21 in file form.

[0036] The main memory 52 consists of, for example, RAM. Allocated in it are storage areas 52A and 52B, which hold the source program 1 and the program for carrying out the character string replacement function 2 read from the secondary storage device 51, and a work area 52C, which temporarily holds data needed during replacement processing. The data definition dictionary 21 is created in the work area 52C and saved to the secondary storage device 51 as needed. The parameter 3 and its parse result are also held temporarily in the work area 52C.

[0037] The input unit 53 consists of a keyboard, a mouse, and the like, and is used, for example, to specify the parameter 3. The output unit 54 consists of a display device and displays, for example, the source program 1, the specified parameter 3, and the source program 1 after replacement processing.

[0038] The CPU 55 controls each of the above components via the bus 56, loads the program from the secondary storage device 51 into the main memory 52, and executes it, thereby implementing the functions included in the character string replacement function 2.

[0039] The operation of each part of the data replacement system according to this embodiment is described below in processing order. The description takes as an example the case in which the source program 1 is written in COBOL.

[0040] Under the COBOL language specification, data is defined between the point marked “DATA DIVISION.” and the point marked “PROCEDURE DIVISION.”, so the text between them is taken as the data definition section 11. Processing is written after the point marked “PROCEDURE DIVISION.”, so the text that follows is taken as the processing description section 12.
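
The division of the source into the two sections can be sketched in Python as follows. The sample program text is invented for illustration, the markers are matched as exact strings (real COBOL source allows more variation in spacing and case), and `split_divisions` is a hypothetical name.

```python
DATA_MARKER = "DATA DIVISION."
PROCEDURE_MARKER = "PROCEDURE DIVISION."

def split_divisions(source):
    """Return (data definition section, processing description section).

    Raises ValueError if either marker is missing.
    """
    data_start = source.index(DATA_MARKER) + len(DATA_MARKER)
    procedure_start = source.index(PROCEDURE_MARKER)
    data_definition = source[data_start:procedure_start]
    processing_description = source[procedure_start + len(PROCEDURE_MARKER):]
    return data_definition, processing_description

SAMPLE_SOURCE = """IDENTIFICATION DIVISION.
PROGRAM-ID. SAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 か.
  02 あ PIC X.
PROCEDURE DIVISION.
    MOVE SPACE TO あ OF か.
"""

data_section, processing_section = split_divisions(SAMPLE_SOURCE)
print(data_section)
print(processing_section)
```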

[0041] Data definitions and processing descriptions in COBOL are explained using “あ△OF△さ．” as an example. Here △ denotes a blank. It is written only for convenience of explanation and does not appear in actual COBOL code. The description has the following meaning.

[0042] (1) A blank is treated as a separator.
(2) “OF” is a word that expresses a hierarchical relationship. The example above expresses a hierarchical relationship in which “さ” exists above “あ”.
(3) A period (.) marks the end of a statement. In the example above, the operand is thereby determined to be “あ OF さ”.
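
The three rules above can be applied in a short Python sketch that turns a qualified name into a list of names ordered from the lowest level upward. The function name `parse_qualified_name` and the error handling are illustrative assumptions.

```python
def parse_qualified_name(text):
    """Parse "name OF name OF ..." into a list of names, lowest level first."""
    body = text.strip()
    # Rule (3): a trailing period ends the statement and is not part of the operand.
    if body.endswith((".", "．")):
        body = body[:-1]
    # Rule (1): blanks separate the words.
    words = body.split()
    names, connectors = words[0::2], words[1::2]
    # Rule (2): every connector between two names must be the word OF.
    well_formed = len(words) % 2 == 1 and all(c.upper() == "OF" for c in connectors)
    if not well_formed:
        raise ValueError(f"not a qualified name: {text!r}")
    return names

print(parse_qualified_name("あ OF さ."))
```

The resulting list is a convenient input for a later comparison against a hierarchy of data definitions.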

[0043] Next, the processing of the data analysis unit 22 is described with reference to FIG. 3. In accordance with the COBOL specification, the data analysis unit 22 processes the data definition section 11, the part of the source program 1 in which data names are defined. The data analysis unit 22 registers the data defined as text in the data definition section 11 in the in-memory data definition dictionary 21, in a form that is easy to use in the subsequent collation processing.

[0044] A registration format that is easy to use in the subsequent collation processing has features such as the following: (1) each entry has pointers through which a subordinate data item and the data item it is subordinate to can refer to each other, so that the data hierarchy can be traversed; and (2) a hash value is computed from each data name by a hash method, and each entry has a pointer through which entries with the same hash value can be visited in succession, so that data names can be looked up quickly.

[0045] Consider, for example, the case in which the data analysis unit 22 analyzes the character strings (descriptions) written in the data definition section 11 in FIG. 3. By parsing the description “01 か. // 02 あ.” (where // denotes a line break), the data analysis unit 22 obtains the analysis result that the word “か” exists above the word “あ” in the hierarchy.

[0046] The data analysis unit 22 then attaches to each of the “あ” and “か” entries a pointer representing the hierarchical relationship, and attaches to the “あ” entry a pointer corresponding to the hash value computed from its data name (“あ”) by the hash method.

[0047] When the second and subsequent character strings (descriptions) are analyzed in the same way, hierarchical information such as that shown in the data definition dictionary 21 in FIG. 3 is obtained.
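
The construction of such a dictionary from level-numbered definitions might look like the following Python sketch. The sample definitions are invented (the contents of FIG. 3 are not reproduced here), a Python `dict` keyed by name stands in for the hash chains, and the names `Entry` and `build_dictionary` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(eq=False)
class Entry:
    name: str
    level: int
    parent: Optional["Entry"] = None                          # upward pointer
    children: list = field(default_factory=list, repr=False)  # downward pointers

def build_dictionary(definition_lines):
    """Build entries linked by hierarchy, plus a name index for fast lookup."""
    entries, by_name, stack = [], {}, []
    for line in definition_lines:
        words = line.strip().rstrip(".．").split()
        if len(words) < 2 or not words[0].isdigit():
            continue  # not a level-numbered definition
        level, name = int(words[0]), words[1]
        # Entries at the same or a deeper level cannot be the parent.
        while stack and stack[-1].level >= level:
            stack.pop()
        parent = stack[-1] if stack else None
        entry = Entry(name, level, parent)
        if parent is not None:
            parent.children.append(entry)
        stack.append(entry)
        entries.append(entry)
        by_name.setdefault(name, []).append(entry)
    return entries, by_name

lines = ["01 か.", "  02 あ.", "01 さ.", "  02 あ."]
entries, by_name = build_dictionary(lines)
print([entry.parent.name for entry in by_name["あ"]])
```

The two entries named あ remain distinct because each one points to a different parent.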

[0048] Next, the processing of the data collation unit 23 is described with reference to FIG. 4. The data collation unit 23 determines which data definition in the data definition dictionary 21 corresponds to the target string (the character string to be replaced) given in the parameter 3.

[0049] The data collation unit 23 parses the target string in accordance with the specification of the language of the source program 1 (here, COBOL) and computes the hash value of the first word produced by the parse. It then scans, one after another, the data definitions in the data definition dictionary 21 whose names have that hash value, and provisionally selects an entry whose data name matches the first word. If the target string expresses a hierarchical structure, the data collation unit 23 then checks whether that structure matches the hierarchical structure in the data definition dictionary 21.

[0050] For example, when the target string is “あ OF さ”, whether “さ” exists as a higher-level data item of “あ” is determined by following the hierarchy pointers in the data definition dictionary 21 upward from the entry named “あ” and comparing the data name at each level with the higher-level name in the target string (“さ”). If an entry named “さ” exists above the entry named “あ”, that “あ” entry is confirmed as a replacement target.
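
The upward walk described above can be sketched in Python as follows. The class `Node`, the functions `matches` and `collate`, and the `is_target` flag (standing in for the "replacement target" information set in the dictionary) are illustrative assumptions. Skipping intermediate levels while walking upward is also an assumption, made so that a shortened reference can still match.

```python
class Node:
    """One dictionary entry with an upward pointer to its parent."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.is_target = False  # set when the entry is confirmed as a replacement target

def matches(node, qualifiers):
    """Walk upward from `node`, looking for each qualifier in order."""
    current = node.parent
    for wanted in qualifiers:
        while current is not None and current.name != wanted:
            current = current.parent
        if current is None:
            return False
        current = current.parent
    return True

def collate(candidates, target_words):
    """Mark and return the candidates that the target string can denote."""
    lowest, qualifiers = target_words[0], target_words[1:]
    hits = []
    for node in candidates:
        if node.name == lowest and matches(node, qualifiers):
            node.is_target = True
            hits.append(node)
    return hits

ka, sa = Node("か"), Node("さ")
a_under_ka, a_under_sa = Node("あ", ka), Node("あ", sa)
hits = collate([a_under_ka, a_under_sa], ["あ", "さ"])  # target string: あ OF さ
print([hit.parent.name for hit in hits])
```

In this sketch the entry under か is rejected and the entry under さ is accepted and flagged.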

[0051] In FIG. 4, the symbol “×” indicates that the “あ” of “あ OF さ” from the parameter 3 was first collated with the “あ” that exists below “か” in the data definition dictionary 21, but the character string expressed by that hierarchy did not semantically match “あ OF さ”. The symbol “○” indicates that the “あ” of “あ OF さ” from the parameter 3 was next collated with the “あ” that exists below “さ” in the data definition dictionary 21, and the character string expressed by that hierarchy semantically matched “あ OF さ”. The symbols “×” and “○” have the same meanings in the figures described later.

[0052] The data collating unit 23 sets, in the data definition dictionary 21, information indicating that the data (data name) confirmed as described above is a replacement target.

[0053] In the example shown in FIG. 4, the symbol “◎” is attached to the confirmed data name (“あ”); this symbol denotes the information indicating that the data is a replacement target. The symbol “◎” has the same meaning in the figures described later.

[0054] The data collating unit 23 further notifies the character string replacing unit 25 of the start position and end position of the semantically matched character string in the data definition section 11, and instructs it to replace the corresponding data definition in the data definition section 11.

[0055] Next, the processing of the operand analysis unit 24 is described with reference to FIG. 5. The operand analysis unit 24 processes the process description section 12, the part of the source program 1 in which process descriptions are defined in accordance with the COBOL language. The operand analysis unit 24 performs the following processing.

[0056]
(1) Based on the separators defined in the language specification, separate and determine the “words” in the process description written as a character string in the process description section 12.
(2) When a separated “word” is a user-defined word rather than a reserved word of the language specification, use the word “OF”, which expresses a hierarchical relationship in COBOL, as a clue to determine the range of the character string as an “operand” according to the hierarchical relationship of the user-defined words in the process description.
(3) Instruct the data/process collating unit 26 to judge, for each determined “operand”, whether it is a replacement target.

[0057] The following describes, for the case where the character string “MOVE△100△TO△あ△OF△さ．” (△ denotes a blank character) is written in the process description, how the portion “あ△OF△さ．” from “あ” onward is analyzed.

[0058] The operand analysis unit 24 first looks for a separator. In COBOL a blank is treated as a separator, so “あ” is first determined as a “word” (token). Because “OF” follows, the unit judges that the next word, “さ”, expresses a hierarchical relationship. After “さ” there is a period (．), which marks the end of a statement in COBOL, so the operand is determined as the range “あ△OF△さ”.
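The operand determination walked through above can be approximated with a short tokenizer. This is a sketch under stated assumptions rather than the patented procedure: the `RESERVED` set is a tiny hypothetical subset of COBOL reserved words, numeric literals are skipped by a simple digit check, and only blanks and the terminating period are treated as separators.

```python
RESERVED = {"MOVE", "TO", "OF"}  # assumed, deliberately tiny subset of reserved words


def extract_operands(statement: str):
    """Split a statement into words and group user words linked by OF.

    Returns a list of operands, each a list of words from the lowest
    level upward, e.g. [["あ", "さ"]] for "あ OF さ".
    """
    body = statement.strip().rstrip(".．")  # the period ends the statement
    words = body.split()                    # any run of blanks is one separator
    operands = []
    i = 0
    while i < len(words):
        word = words[i]
        if word.upper() in RESERVED or word[0].isdigit():
            i += 1                          # skip reserved words and numeric literals
            continue
        operand = [word]
        i += 1
        # Keep extending the operand while an "OF <word>" pair follows.
        while i + 1 < len(words) and words[i].upper() == "OF":
            operand.append(words[i + 1])
            i += 2
        operands.append(operand)
    return operands
```

Because splitting is done on runs of blanks, `extract_operands("MOVE 100 TO あ OF さ．")` and the same statement with extra blanks before “OF” produce the same operand.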

[0059] Having determined the operand in this way, the operand analysis unit 24 passes the operand “あ△OF△さ” to the data/process collating unit 26, which judges whether it is a replacement target.

[0060] Next, the processing of the data/process collating unit 26 is described with reference to FIG. 6. The data/process collating unit 26 divides the operand passed from the operand analysis unit 24 into words and parses it in the same way that the data collating unit 23 parses the character string to be replaced given as parameter 3. It then obtains the hash value of the first word of the operand, scans one after another the data definitions in the data definition dictionary 21 whose names have that hash value, and provisionally selects the data whose data name matches the first word.

[0061] For example, when the first word of the operand is “あ”, the unit follows, one after another, the pointers that reference data with the same hash value as “あ”, collates each data name, and provisionally selects the data whose data name is “あ”.

[0062] After that, if the operand expresses a hierarchical structure, the data/process collating unit 26 checks whether this hierarchical structure matches the hierarchical structure in the data definition dictionary 21.

[0063] For example, when the operand is “あ OF さ”, whether “さ” exists as higher-level data of “あ” is determined in the data definition dictionary 21 by following the pointers that reference the hierarchical structure upward from the data name “あ” and collating the data name at each level with the data name indicating “あ”. If a data name “さ” exists above the data name “あ”, that data name is confirmed as a replacement target.

[0064] In this way, the character string indicating an operand is collated with the data definition dictionary 21 word by word rather than as a whole string at once. Consequently, even when the number of blank characters between “あ” and “OF” differs from place to place, as in the two descriptions “あ△OF△さ” and “あ△△OF△さ” (△ denotes a blank character), the syntactic check shows that both descriptions denote the same data name.

[0065] Likewise, even when the hierarchy is specified differently, as in “う OF た” (two levels) and “う OF な OF た” (three levels), the semantic check shows that both descriptions denote the same data name.
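A small sketch may make the two equivalences concrete: differing blank counts disappear once the operand is split into words, and a partially qualified name resolves to the same dictionary entry as a fully qualified one. The `Entry` class and the functions `words_of` and `resolve` are hypothetical names, and the rule that unnamed intermediate levels may be skipped is an assumption drawn from the “う OF た” example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    name: str
    parent: Optional["Entry"] = None


def words_of(operand: str):
    """Split on blanks and drop the connective OF, keeping only data names."""
    return [w for w in operand.split() if w.upper() != "OF"]


def resolve(entries, operand: str):
    """Return the entries whose name and ancestors are consistent with operand."""
    first, *qualifiers = words_of(operand)
    hits = []
    for entry in entries:
        if entry.name != first:
            continue
        node = entry.parent
        matched = True
        for q in qualifiers:
            # Skip hierarchy levels that the operand does not mention.
            while node is not None and node.name != q:
                node = node.parent
            if node is None:
                matched = False
                break
            node = node.parent
        if matched:
            hits.append(entry)
    return hits
```

With a hierarchy た > な > う, both `resolve(entries, "う OF た")` and `resolve(entries, "う OF な OF た")` return the single entry for “う”.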

[0066] If the data confirmed in this way carries the information, set during the collation processing by the data collating unit 23, that it is a replacement target (the symbol “◎” in the example of FIG. 6), the operand becomes a replacement target (a character string to be replaced).

[0067] In this case, the data/process collating unit 26 specifies to the character string replacing unit 25 the start position and end position of the operand in the source program 1, and instructs it to replace the corresponding process description in the process description section 12.

[0068] Next, the processing of the character string replacing unit 25 is described with reference to FIG. 7. Both the data collating unit 23 and the data/process collating unit 26 instruct the character string replacing unit 25 to perform character string replacement. In either case, the instruction gives the start position and end position, in the source program 1, of the character string to be replaced, together with the replacement character string.

[0069] The character string replacing unit 25 accumulates these replacement instructions until the processing of the data collating unit 23 and the data/process collating unit 26 is complete, and then sorts them in descending order of start position. It then replaces each character string in the source program 1 indicated by a start position and end position with the replacement word given as parameter 3.

[0070] The replacement instructions are sorted in descending order of start position for the following reason. When a replacement changes the length of a character string, the text after the replaced string shifts relative to its original position. By replacing from the end of the source program 1, the replacement position moves steadily toward the beginning, so later replacements need not account for shifts caused by earlier ones. Consequently, the process description section 12 is replaced first, and the data definition section 11 afterwards.
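The benefit of processing the instructions from the end of the text can be seen in a minimal Python sketch. It assumes flat character offsets with an exclusive end position, whereas the patent describes positions as line and column pairs; the function name `apply_replacements` is hypothetical.

```python
def apply_replacements(text: str, instructions):
    """Apply (start, end, replacement) instructions, end exclusive.

    Instructions are processed in descending order of start position, so
    an edit never shifts the offsets of the edits still to be applied.
    """
    ordered = sorted(instructions, key=lambda ins: ins[0], reverse=True)
    for start, end, new in ordered:
        text = text[:start] + new + text[end:]
    return text
```

For example, replacing both occurrences of `aa` in `"aa b aa"` with the longer string `XYZ` works with the offsets computed on the original text, because the later occurrence is rewritten first.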

[0071] In FIG. 7, in the process description section 12, “あ OF さ” at “line 10, column 30 to line 10, column 31” is shown replaced with “ア OF さ”, and in the data definition section 11, “あ” at “line 5, column 15 to line 5, column 16” is shown replaced with “ア”.

[0072] As described above, according to this embodiment, a character string in the source program 1 is judged to be an appropriate replacement target only after this has been confirmed semantically, so character string replacement in units of data names can be performed accurately and without omission.

[0073] Moreover, because the character strings to be replaced are narrowed down accurately as described above, no manual work is needed after the replacement process has been started to restore character strings that were replaced by mistake. Therefore, even when replacing character strings in a plurality of source programs 1, the operator only has to specify parameter 3 at the outset, and the whole job can be done at once with very little effort.

[0074] [Second Embodiment] FIG. 8 is a functional block diagram showing the configuration of a character string retrieval apparatus according to the second embodiment. In the first embodiment, the replacement word referred to by the character string replacing unit 25 is passed directly as parameter 3. This embodiment differs in that the information about the replacement word is first stored in the data definition dictionary 21 and then given to the character string replacing unit 25 as part of each replacement instruction. Accordingly, the processing of the data collating unit 23 and the character string replacing unit 25 differs somewhat from that of the first embodiment, and these components are described below.

[0075] The part in which the data collating unit 23 determines which data definition in the data definition dictionary 21 corresponds to the character string to be replaced given in parameter 3 is basically the same as the processing described with reference to FIG. 4. However, the processing of the data collating unit 23 differs from the first embodiment in the following two points.

[0076] In the example shown in FIG. 4, the only information set in the data definition dictionary 21 for the confirmed data was the information indicating that it is a replacement target (the information indicated by the symbol “◎” in FIG. 4). In this embodiment, in addition to this information, the data collating unit 23 sets in the data definition dictionary 21 information 21A indicating what the replacement word is, as shown in FIG. 9. For example, when the replacement character string is “ア OF さ”, the information 21A indicates “ア”.

[0077] In addition, the content that the data collating unit 23 previously specified to the character string replacing unit 25 was only the start position and end position of the character string to be replaced (for example, “あ OF さ”). In this embodiment, the data collating unit 23 also notifies the character string replacing unit 25 of the replacement character string (for example, the “ア” of “ア OF さ”) as part of the instruction.

[0078] The character string replacing unit 25 differs from the case described with reference to FIG. 7 in one point. In the example shown in FIG. 7, the replacement character string is obtained directly from parameter 3. In this embodiment, as shown in FIG. 10, the information about the replacement character string (for example, “ア” in the case of “ア OF さ”), which has first been stored from parameter 3 into the data definition dictionary 21, is given to the character string replacing unit 25 with each replacement instruction.

[0079] According to this embodiment, the information on the character string to be replaced and the information on the replacement character string are stored as a pair in the data definition dictionary 21, so multiple different parameters 3 (pairs of a character string before replacement and a character string after replacement) can be specified at the same time. This embodiment also provides the same effects as the first embodiment.
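The idea of keeping the replacement word next to the dictionary entry, so that several replacement pairs can be active at once, might look like the following sketch. The field `replacement` stands in for the information 21A; the class and function names, and the second pair “う” to “ウ”, are hypothetical additions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    name: str
    parent: Optional["Entry"] = None
    replacement: Optional[str] = None  # plays the role of information 21A


def register(entries, target: str, parent_name: str, new_name: str) -> None:
    """Record the replacement word on the entry `target` located under `parent_name`."""
    for e in entries:
        if e.name == target and e.parent is not None and e.parent.name == parent_name:
            e.replacement = new_name


def instruction_for(entry: Entry, start: int, end: int):
    """Build a replacement instruction that carries its own replacement word."""
    if entry.replacement is None:
        return None  # not a replacement target
    return (start, end, entry.replacement)
```

Each instruction now names its own replacement word, so the replacing step no longer needs to consult a single global parameter.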

[0080] [Third Embodiment] The first and second embodiments were described using the replacement of data defined in a hierarchical structure as an example. This embodiment, in contrast, is described using the replacement of mathematical expressions as an example, taking a program containing expressions (1) to (3) as shown in FIG. 11.

[0081] When a processing device that executes a predetermined program, such as a compiler, reads expressions (1) to (3) and parses them, it obtains the syntax trees shown in FIGS. 12(a) to 12(c). As shown in FIG. 12(a), the syntax tree of expression (1) has “=” as its top-level parent node, “a” as the left child node, and the subtree 101 as the right child node. This means that whenever a subtree semantically identical to the subtree 101 appears after expression (1), it can be replaced with “a”; the processing device therefore directs that any syntax tree semantically matching the subtree 101 be replaced with “a”.

[0082] The subtree 101 has “+” as its parent node, “b” as the left child node, and “c” as the right child node. Because the parent node is “+”, the left and right child nodes have no order; the subtree simply has “b” and “c” as child nodes.

[0083] As shown in FIG. 12(b), the syntax tree of expression (2) contains the subtree 102. The subtree 102 is identical to the subtree 101 and can therefore be replaced with “a”. The processing device thus rewrites expression (2) as “d = e + a”.

[0084] As shown in FIG. 12(c), the syntax tree of expression (3) contains the subtree 103. Because the parent node of the subtree 103 is “+”, its left and right child nodes have no order, and it has “b” and “c” as child nodes. The subtree 103 therefore semantically matches the subtree 101, and the processing device rewrites expression (3) as “f = g + a”.
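One way to realize "no order between the children of +" is to bring every subtree into a canonical form before comparing. The sketch below does this in Python with nested tuples `(operator, left, right)`. The tuple representation, the `COMMUTATIVE` set, and the exact shapes assumed for expressions (2) and (3), namely `d = e + (b + c)` and `f = g + (c + b)`, are assumptions, since FIG. 11 is not reproduced in the text.

```python
COMMUTATIVE = {"+", "*"}  # operators whose operands may be reordered (assumed set)


def canonical(tree):
    """Return a form in which operand order under commutative operators is fixed."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    l, r = canonical(left), canonical(right)
    if op in COMMUTATIVE and repr(r) < repr(l):
        l, r = r, l  # sort the two children into a stable order
    return (op, l, r)


def substitute(tree, pattern, name):
    """Replace every subtree semantically equal to `pattern` with variable `name`."""
    if canonical(tree) == canonical(pattern):
        return name
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return (op, substitute(left, pattern, name), substitute(right, pattern, name))
```

With the pattern `("+", "b", "c")`, both `("+", "e", ("+", "b", "c"))` and `("+", "g", ("+", "c", "b"))` have their inner subtree rewritten to `"a"`. The sketch only handles reordering of two children; it does not regroup longer sums.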

[0085] As a comparative example, if the same replacement is performed according to the conventional technique, expression (2) can be rewritten as “d = e + a”, because it contains the “b + c” of expression (1) in the same order. However, the “c + b” contained in expression (3), although semantically the same as the “b + c” of expression (1), does not match it as a character string and therefore cannot be replaced.

[0086] Consequently, when the program of FIG. 11 is executed after replacement, the number of addition steps is three in this embodiment, whereas the conventional example requires four. In other words, this embodiment allows the program to be optimized into a more efficient form, which can improve its execution speed.
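The step counts quoted above can be checked by counting the “+” nodes in each version of the program. The Python sketch below assumes, as FIG. 11 is not reproduced here, that the three expressions are `a = b + c`, `d = e + (b + c)` and `f = g + (c + b)`; the tuple representation and function names are hypothetical.

```python
def count_additions(tree) -> int:
    """Count the '+' nodes in a syntax tree given as nested tuples."""
    if isinstance(tree, str):
        return 0
    op, left, right = tree
    return (op == "+") + count_additions(left) + count_additions(right)


def total_additions(program) -> int:
    return sum(count_additions(expr) for expr in program)


expr1 = ("=", "a", ("+", "b", "c"))
expr3_unchanged = ("=", "f", ("+", "g", ("+", "c", "b")))

# Semantic matching rewrites both later expressions.
this_embodiment = [expr1, ("=", "d", ("+", "e", "a")), ("=", "f", ("+", "g", "a"))]
# Plain string matching rewrites expression (2) only.
conventional = [expr1, ("=", "d", ("+", "e", "a")), expr3_unchanged]
```

Under these assumptions, `total_additions` gives one addition per expression for the semantically rewritten program, and one extra addition for the conventional result because “c + b” survives in expression (3).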

[0087] [Modifications of the Embodiments] The present invention is not limited to the first to third embodiments described above; various modifications and applications are possible. Modifications of the above embodiments that are applicable to the present invention are described below.

[0088] The first and second embodiments were described using, as an example, the replacement of a character string defining a hierarchical structure in a program written in COBOL with another character string. The third embodiment was described using, as an example, the case where part of a mathematical expression in a program written in a predetermined programming language has a value that has already been computed, and that part is replaced with the variable to which the earlier result was assigned. However, the programming languages to which the present invention can be applied are not limited to COBOL; other high-level languages may be used. Furthermore, a character string in text written in a natural language and a specified character string may each be subjected to morphological analysis, and the results compared to judge whether the character string in the text is a replacement target.

[0089] The first to third embodiments were described using, as an example, the case where a character string judged from the specified character string to be a replacement target is replaced with another character string. However, the present invention can be applied to any case of searching for a character string that semantically matches a specified character string, even when no replacement is performed.

[0090] The program for executing the processing described in the first to third embodiments (the program that implements the character string replacement function 2) can also be stored on a computer-readable recording medium such as a CD-ROM and distributed.

[0091]

[Effects of the Invention] As described above, according to the present invention, character strings that have the same meaning even though they differ in their textual expression can be retrieved, for example as character strings to be replaced with other character strings.

[Brief description of drawings]

FIG. 1 is a functional block diagram showing the configuration of a data replacement system according to a first embodiment of the present invention.

FIG. 2 is a block diagram showing the hardware configuration that implements the data replacement system shown in FIG. 1.

FIG. 3 is a diagram for explaining the processing of the data analysis unit.

FIG. 4 is a diagram for explaining the processing of the data collating unit.

FIG. 5 is a diagram for explaining the processing of the operand analysis unit.

FIG. 6 is a diagram for explaining the processing of the data/process collating unit.

FIG. 7 is a diagram for explaining the processing of the character string replacing unit.

FIG. 8 is a functional block diagram showing the configuration of a data replacement system according to a second embodiment of the present invention.

FIG. 9 is a diagram for explaining the processing of the data collating unit shown in FIG. 8.

FIG. 10 is a diagram for explaining the processing of the character string replacing unit shown in FIG. 8.

FIG. 11 shows example mathematical expressions in a program to which the third embodiment of the present invention is applied.

FIGS. 12(a) to 12(c) are diagrams showing the syntax trees obtained by parsing the respective expressions of FIG. 11.

[Explanation of symbols]

2 Character string replacement function
21 Data definition dictionary
22 Data analysis unit
23 Data collating unit
24 Operand analysis unit
25 Character string replacing unit
26 Data/process collating unit
51 Secondary storage device
52 Main memory
53 Input unit
54 Output unit
55 Central processing unit (CPU)
56 Bus

Claims

(57) [Claims]

1. A data retrieval apparatus comprising: first analysis means for parsing, based on a predetermined rule, a first character string in which a hierarchical relationship is explicitly described according to the predetermined rule; second analysis means for parsing, based on the predetermined rule, a second character string to be searched for; first collating means for checking whether the analysis result of the first analysis means contains a semantically matching portion, that is, a portion recognizable as denoting the same object as the analysis result of the second analysis means; and first extracting means for extracting, as a retrieved character string, the character string in the first character string corresponding to the portion found to match semantically in the collation by the first collating means.

2. The data retrieval apparatus according to claim 1, further comprising: first instructing means for designating a third character string with which the second character string is to be replaced; and first replacing means for replacing the character string extracted by the first extracting means with the third character string designated by the first instructing means.

3. The data retrieval apparatus as described, wherein the first extracting means notifies the first replacing means of the position, within the first character string, of the character string corresponding to the portion found to match semantically in the collation by the first collating means, and the first replacing means performs the replacement with the third character string based on the position of the character string notified by the first extracting means.

4. The data retrieval apparatus as described, wherein the first analysis means associates, with the analysis result of the first character string, information for performing the replacement with the third character string designated by the instructing means, and the first replacing means performs the replacement with the third character string based on the information associated with the analysis result of the first character string.

5. The data retrieval apparatus according to any one of claims 1 to 4, wherein the first and second analysis means divide the first and second character strings into words, respectively, and associate the divided words with a predetermined data structure to form the respective analysis results, and the first collating means compares the analysis results of the first and second analysis means word by word to check whether there is a semantically matching portion.

6. The data retrieval apparatus according to claim 5, wherein the first and second character strings each include a plurality of words described in a hierarchical structure, the first and second analysis means link the words described in the hierarchical structure in a list, in order toward the higher or the lower level, to form the respective analysis results, and the first collating means checks whether there is a semantically matching portion by tracing the analysis results of the first and second analysis means.

7. The data retrieval apparatus according to claim 6, wherein the first analysis means stores the highest or lowest word of the hierarchical structure at a predetermined storage position according to a hash value obtained by a hashing method, and the first collating means obtains the hash value from the highest or lowest word among the words described in the hierarchical structure in the second character string and, starting from the word in the first character string stored at the position of that hash value, checks whether there is a semantically matching portion while tracing the list in order.

8. The data retrieval apparatus according to claim 1, wherein the first character string is written in a predetermined programming language, and the first and second analysis means parse the first and second character strings, respectively, based on the specification of the predetermined programming language.

9. The data retrieval apparatus according to claim 8, wherein the first character string is a data definition in the predetermined programming language, the apparatus further comprising: operand analysis means for dividing a fourth character string, which is a process description in the predetermined programming language, into operands; third analysis means for parsing the operands divided by the operand analysis means; second collating means for checking whether the analysis result of the first analysis means contains a portion that semantically matches the analysis result of an operand by the third analysis means; and second extracting means for extracting, as a retrieved character string, the character string in the fourth character string corresponding to the portion found to match semantically in the collation by the first collating means.

10. The data retrieval apparatus according to claim 9, wherein the third analysis means parses only those operands, among the operands divided by the operand analysis means, that are not reserved words.

11. The data retrieval apparatus according to claim 9 or 10, further comprising: second instructing means for designating a fifth character string with which the fourth character string is to be replaced; and second replacing means for replacing the character string extracted by the second extracting means with the fifth character string designated by the second instructing means.

12. A data retrieval method comprising: a first analysis step of parsing, based on a predetermined rule, a first character string in which a hierarchical relationship is explicitly described according to the predetermined rule; a second analysis step of parsing, based on the predetermined rule, a second character string to be searched for; a collating step of checking whether the analysis result of the first analysis step contains a semantically matching portion, that is, a portion recognizable as denoting the same object as the analysis result of the second analysis step; and an extracting step of extracting, as a retrieved character string, the character string in the first character string corresponding to the portion found to match semantically in the collating step.

13. The data retrieval method according to claim 12, further comprising: an instructing step of designating a third character string with which the second character string is to be replaced; and a replacing step of replacing the character string extracted in the extracting step with the third character string designated in the instructing step.

14. A computer-readable recording medium recording a program for causing a computer to execute: a first analysis step of parsing, based on a predetermined rule, a first character string in which a hierarchical relationship is explicitly described according to the predetermined rule; a second analysis step of parsing, based on the predetermined rule, a second character string to be searched for; a collating step of checking whether the analysis result of the first analysis step contains a semantically matching portion, that is, a portion recognizable as denoting the same object as the analysis result of the second analysis step; and an extracting step of extracting, as a retrieved character string, the character string in the first character string corresponding to the portion found to match semantically in the collating step.

15. The computer-readable recording medium according to claim 14, further recording a program for causing the computer to execute: an instructing step of designating a third character string with which the second character string is to be replaced; and a replacing step of replacing the character string extracted in the extracting step with the third character string designated in the instructing step.