JP2007102540A

JP2007102540A - Character string conversion device and character string conversion program

Info

Publication number: JP2007102540A
Application number: JP2005292350A
Authority: JP
Inventors: Toshinobu Sawada; 敏伸澤田
Original assignee: Hitachi Software Engineering Co Ltd
Current assignee: Hitachi Software Engineering Co Ltd
Priority date: 2005-10-05
Filing date: 2005-10-05
Publication date: 2007-04-19

Abstract

PROBLEM TO BE SOLVED: To provide a character string conversion device that includes mask processing means for masking a character string corresponding to personal information, and that can restore the original character string from the converted character string while reducing the risk of leaking personal information.
SOLUTION: The character string conversion device stores, in a program memory 30, a plurality of programs 31-35 as conversion processing means and a plurality of tables 36-38 as conversion character lists. The device determines whether each character in a designated mask range matches a character contained in one of the tables and, when it does, converts that character to another character in the corresponding table according to a predefined conversion rule. As restoration processing, each converted character string is restored to the character string before conversion according to the conversion rule.
COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to a character string conversion device, and more particularly to character string conversion for protecting personal information such as personal names.

Conventionally, as a technique for keeping personal names and the like private in order to protect personal information, a configuration is known that includes means for converting a character string indicating a personal name or the like in an electronic document into another character string as mask processing (see, for example, Patent Document 1).
Patent Document 1: JP 2002-149638 A

In the configuration described in Patent Document 1, character strings to be masked are extracted based on a proper noun dictionary, and each extracted character string is converted into another character string so that the personal name or the like cannot be identified. For example, the personal name “山田太郎” (Taro Yamada) is converted into a character string such as “＊＊＊＊” or “人名１” (person name 1).

However, the configuration described in Patent Document 1 does not consider restoring a converted character string to the original character string. That is, when all character strings indicating personal names or the like are converted into a common character string (for example, “＊＊＊＊”), the original character string cannot be restored. The original electronic document must therefore be retained in order to check the content before conversion, and the risk of leaking personal information cannot be reduced.
Even when a personal name is converted to “person name 1” or the like, at least the correspondence between the character string before conversion (the name) and the character string after conversion (“person name 1” or the like) must be stored, which carries substantially the same leakage risk as retaining the electronic document before conversion.

The present invention is intended to solve the above problem. Its object is to provide a character string conversion device, having conversion processing means for a character string designated as a conversion target because it corresponds to personal information or the like, that can restore the original character string from the converted character string and can also reduce the risk of leaking personal information.

In order to solve the above problem, the character string conversion device of the present invention is a character string conversion device that converts a character string in an electronic document, designated as a conversion target, into another character string. The device comprises: a character list in which conversion characters and identifiers that uniquely identify each conversion character are defined in advance; means for accepting input of a conversion instruction, searching the character list for each character included in the character string to be converted, and converting the character to another character in the character list according to a predefined conversion rule, based on the identifier assigned to the conversion target character in the character list; and means for accepting input of a restoration instruction and restoring the converted character to the conversion target character included in the character list by a restoration rule corresponding to the conversion rule.
The conversion rule may be one arbitrarily selected from a plurality of predefined conversion rules, or a combination of them.
The character string conversion program of the present invention is a character string conversion program that causes a computer to execute processing for converting a character string in an electronic document, designated as a conversion target, into another character string. The program causes the computer to execute: processing of accepting input of a conversion instruction, searching a character list in which conversion characters are stored in advance for each character included in the character string to be converted, and converting the character to another character in the character list according to a predefined conversion rule, based on the identifier assigned to the conversion target character in the character list; and processing of accepting input of a restoration instruction and restoring the converted character to the conversion target character included in the character list by a restoration rule corresponding to the conversion rule.
The conversion rule may be one arbitrarily selected from a plurality of predefined conversion rules, or a combination of them.

With the above configuration, for an electronic document in which character strings indicating personal information or the like have been converted, the present invention makes it possible to restore each converted character string to the character string before conversion based on the character list and the conversion rule. Therefore, by sharing the character list and the conversion rule, a plurality of terminals can check the pre-conversion content of a document in which personal information or the like has been masked.
As a result, for example, when a message entered on a mobile terminal or a PC (personal computer) is exchanged between terminals by e-mail or the like, the sender converts the original text of the entered character string with this character string conversion device and sends the converted message, and the receiver restores the received message with the character string conversion device. This helps reduce the possibility that the content of a message in transit is leaked when a network such as the Internet is used as the transmission path.

A character string conversion device according to an embodiment of the present invention is described below with reference to the drawings.
FIG. 1 is a block diagram showing a schematic configuration of the character string conversion device according to the embodiment.
The character string conversion device according to the present embodiment comprises an input device 10 for entering characters; a central processing unit 20 that executes character string conversion processing; a program memory (ROM) 30 that stores the programs, management tables, and other data needed for processing in the central processing unit 20; a data memory (RAM) 40 serving as temporary storage for input data from the input device 10 and for intermediate and result data of the central processing unit 20; and an output device 50 that outputs the character string created by the central processing unit 20. The device performs conversion processing on a character string designated as a conversion target in an electronic document.
For the character string conversion processing, the program memory 30 stores, as conversion processing means, a character string conversion program 31, a single character conversion program 32, a hiragana/katakana conversion program 33, a personal name kanji conversion program 34, and a JIS level 1 and level 2 kanji conversion program 35. As conversion character lists, it stores a hiragana/katakana table 36, a personal name kanji table 37, and a JIS level 1 and level 2 kanji table 38.
The hiragana/katakana table 36 stores in advance the Japanese hiragana and katakana characters together with numbers corresponding to their vowels and consonants.
The personal name kanji table 37 stores in advance 2,232 characters: the joyo (common-use) kanji and the characters of the appended table of kanji for personal names, which may be used in personal names under the Family Register Act and related regulations.
The JIS level 1 and level 2 kanji table 38 stores in advance the kanji characters defined by the JIS standard.
The data memory 40 stores a pre-conversion character string 41 entered from the input device, a post-conversion character string 42 produced by the conversion processing, and a mask/restoration flag 43 indicating whether the entered character string is to be masked or restored.
The character string conversion device according to the present embodiment determines whether each character in the designated mask range matches a character contained in one of the tables and, if it does, converts the character to another character in that table according to a predefined conversion rule. As restoration processing, each converted character string is restored to the character string before conversion according to a restoration rule corresponding to the conversion rule.

FIG. 2 is a diagram showing an example of the hiragana/katakana table 36 and a conversion example using that table.
In the hiragana/katakana table 36 shown in this example, each character is identified by the combination of a number corresponding to its consonant (01 to 11) and a number corresponding to its vowel (01 to 05).
For example, “た” (ta) in the pre-conversion character string 210 “たなかいちろう” (Tanaka Ichirou) has consonant “04” and vowel “01”, and is therefore represented by “0401”.
In the conversion example, the conversion rule is defined as counting up the number corresponding to the consonant.
In this case, “た” (ta) in the pre-conversion character string 210 is converted to “な” (na). Converting the remaining characters in the same way turns the pre-conversion character string 210 “たなかいちろう” into the post-conversion character string 220 “なはさきにをく”.
In the restoration process, the reverse of the conversion rule is applied; that is, the number corresponding to the consonant is counted down.
Although only hiragana is shown in this example, katakana is converted in the same way using a table with the same data structure.
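The consonant shift described for FIG. 2 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the table layout (row order, the gaps in the ya and wa rows, and placing ん at consonant 11) and the names `ROWS`, `shift`, `mask`, and `restore` are assumptions, because the figure itself is not reproduced here. The source also does not say what happens when the incremented cell is empty; this sketch simply keeps stepping until it reaches an occupied cell, which keeps the mapping reversible.

```python
# Assumed layout: one row per consonant (01-11), one column per vowel (01-05).
# "-" marks a cell with no character; this layout is a guess at FIG. 2.
ROWS = [
    "あ い う え お", "か き く け こ", "さ し す せ そ", "た ち つ て と",
    "な に ぬ ね の", "は ひ ふ へ ほ", "ま み む め も", "や - ゆ - よ",
    "ら り る れ ろ", "わ - - - を", "ん - - - -",
]

# character -> (consonant number, vowel number), both 1-based
TABLE = {}
for consonant, row in enumerate(ROWS, start=1):
    for vowel, ch in enumerate(row.split(), start=1):
        if ch != "-":
            TABLE[ch] = (consonant, vowel)
CELLS = {code: ch for ch, code in TABLE.items()}


def shift(ch: str, step: int) -> str:
    """Move one consonant row up (+1) or down (-1), wrapping around."""
    if ch not in TABLE:
        return ch  # characters outside the table are left unchanged
    consonant, vowel = TABLE[ch]
    while True:
        consonant = (consonant - 1 + step) % len(ROWS) + 1
        if (consonant, vowel) in CELLS:  # skip empty cells (assumption)
            return CELLS[(consonant, vowel)]


def mask(text: str) -> str:
    return "".join(shift(ch, +1) for ch in text)


def restore(text: str) -> str:
    return "".join(shift(ch, -1) for ch in text)


print(mask("たなかいちろう"))     # なはさきにをく
print(restore("なはさきにをく"))  # たなかいちろう
```

Because every occupied cell in a vowel column maps to the next occupied cell in the same column, counting down undoes counting up without storing any per-document correspondence data.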

FIG. 3 is a diagram showing an example of the personal name kanji table and a conversion example using that table.
In the personal name kanji table 300 shown in this example, the personal name kanji are arranged in character code order and an identification number is assigned to each character.
In the conversion example, the conversion rule is defined as counting up the identification number.
In this case, “田” in the pre-conversion character string 310 “田中一郎” (Ichiro Tanaka) is converted to “電”. Converting the remaining characters in the same way turns the pre-conversion character string 310 “田中一郎” into the post-conversion character string 320 “電仲壱楼”.
In the restoration process, the conversion is performed by counting down the identification number.

FIG. 4 is a diagram showing an example of the JIS level 1 and level 2 kanji table and a conversion example using that table.
In the JIS level 1 and level 2 kanji table 400 shown in this example, the kanji are arranged in JIS character code order, and each entry carries an identification flag indicating whether the kanji is also included in the personal name kanji table.
In the conversion example, the conversion rule is defined as counting up the JIS character code and, if the resulting character is included in the personal name kanji table, counting up the JIS character code again.
In this case, “唖” in the pre-conversion character string 410 “唖阿逢渥” is converted to “娃”. Converting the remaining characters in the same way turns the pre-conversion character string 410 “唖阿逢渥” into the post-conversion character string 420 “娃挨葵旭”.
In the restoration process, the conversion is performed by counting down the JIS character code.
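To see the code ordering that FIG. 4 relies on, the JIS X 0208 code of a character can be derived in Python from its EUC-JP encoding by clearing the high bit of each byte. The helper name `jis_code` is hypothetical, and using the `euc_jp` codec is a convenience of this sketch rather than something the source describes.

```python
def jis_code(ch: str) -> int:
    """Return the two-byte JIS X 0208 code of a single character."""
    encoded = ch.encode("euc_jp")
    # JIS X 0208 characters occupy exactly two bytes, both 0xA1 or above.
    if len(encoded) != 2 or encoded[0] < 0xA1 or encoded[1] < 0xA1:
        raise ValueError(f"{ch!r} is not a JIS X 0208 two-byte character")
    return ((encoded[0] & 0x7F) << 8) | (encoded[1] & 0x7F)


print(f"{jis_code('唖'):04X}")  # 3022
print(f"{jis_code('娃'):04X}")  # 3023
```

Note that valid codes are not contiguous integers (each byte runs from 0x21 to 0x7E), so a practical table would more likely step through an ordered list of characters than add one to the raw code value.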

Based on the above configuration, the processing performed by the character string conversion device according to the present embodiment is described below.
FIG. 5 is a flowchart showing the processing of the character string conversion program 31 of the character string conversion device.
The character string conversion program 31 accepts input of the character string designated as the conversion target and of a mask or restoration instruction, and saves them in the data memory 40 as the pre-conversion character string 41 and the mask/restoration flag 43, respectively (step 501).
It then calculates the number of characters in the input string (step 502) and performs the following processing one character at a time.
It determines whether all of the input characters have been converted (step 503). If not, it has the single character conversion program 32 process the next unprocessed character (step 504), stores the converted character in the data memory 40 as part of the post-conversion character string 42, and returns to the determination in step 503.
If the determination in step 503 finds that all of the input characters have been converted, it outputs the post-conversion character string 42 (step 505) and ends the processing.

FIG. 6 is a flowchart showing the processing of the single character conversion program 32.
The single character conversion program 32 searches the hiragana/katakana table 36 for the processing target character passed from the character string conversion program 31 (step 601).
It checks the result of the search in step 601 (step 602). If the character is in the hiragana/katakana table 36, it has the hiragana/katakana conversion program 33 process the character (step 603).
If the determination in step 602 finds that the character is not in the hiragana/katakana table 36, it searches the personal name kanji table 37 (step 604).
It checks the result of the search in step 604 (step 605). If the character is in the personal name kanji table 37, it has the personal name kanji conversion program 34 process the character (step 606).
If the determination in step 605 finds that the character is not in the personal name kanji table 37, it searches the JIS level 1 and level 2 kanji table 38 (step 607).
It checks the result of the search in step 607 (step 608). If the character is in the JIS level 1 and level 2 kanji table 38, it has the JIS level 1 and level 2 kanji conversion program 35 process the character (step 609).
If the determination in step 608 finds that the character is not in the JIS level 1 and level 2 kanji table 38, the processing ends without any conversion (step 610).
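The per-character loop of FIG. 5 and the table cascade of FIG. 6 can be sketched together as follows. The miniature tables and the function names are hypothetical, and a plain shift by one position within each table stands in for the table-specific conversion programs 33 to 35; only the control flow (try each table in order, leave unmatched characters unchanged) follows the description.

```python
# Hypothetical miniature tables, checked in the order used in FIG. 6.
TABLES = [
    ("hiragana/katakana", "あいうえお"),
    ("personal name kanji", "田電"),
    ("JIS level 1/2 kanji", "唖娃"),
]


def convert_char(ch: str, masking: bool) -> str:
    """Dispatch one character to the first table that contains it."""
    step = 1 if masking else -1
    for _name, chars in TABLES:
        pos = chars.find(ch)
        if pos >= 0:
            # Placeholder rule: move one position, wrapping at the ends.
            return chars[(pos + step) % len(chars)]
    return ch  # found in no table: no conversion (step 610)


def convert_string(text: str, masking: bool) -> str:
    """Convert a string one character at a time (steps 502-505)."""
    return "".join(convert_char(ch, masking) for ch in text)


masked = convert_string("あ田唖X", masking=True)
print(masked)                                 # い電娃X
print(convert_string(masked, masking=False))  # あ田唖X
```

The `masking` argument plays the role of the mask/restoration flag 43: the same routine serves both directions, and only the step direction changes.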

FIG. 7 is a flowchart showing the processing of the hiragana/katakana conversion program 33.
The hiragana/katakana conversion program 33 obtains the consonant + vowel value from the hiragana/katakana table 36 for the processing target character passed from the single character conversion program 32 (step 701). For example, in the hiragana/katakana table 36 shown in FIG. 2, when the processing target character is “た” (ta), the consonant + vowel value is “0401”.
Next, it checks the mask/restoration flag 43 in the data memory 40 (step 702). If the flag indicates a mask instruction, it counts up the consonant value (step 703). In the above example, the value becomes “0501”.
Next, it compares the consonant value counted up in step 703 with the range of the hiragana/katakana table 36 (step 704). If the value exceeds the range of the table, it sets the consonant value to the first value (step 705). In the example of FIG. 2, if the counted-up value is “1201”, it becomes “0101”.
Based on the consonant + vowel value calculated in step 703 or step 705, it obtains the corresponding character from the hiragana/katakana table 36 (step 706). In the above example, the value is “0501”, so “な” (na) is obtained as the corresponding character.
Finally, it replaces the processing target character with the character obtained in step 706 (step 707) and passes it to the character string conversion program 31.
If the determination in step 702 finds a restoration instruction, it counts down the consonant value (step 708). In the above example, if the character is “た” (ta), the consonant + vowel value is “0401”, so it becomes “0301”.
Next, as in step 704, it compares the value with the range of the hiragana/katakana table 36 (step 709). If the counted-down consonant value falls outside the range of the table, it sets the consonant value to the last value (step 710). In the example of FIG. 2, if the counted-down value is “0001”, it becomes “1101”.
It then performs steps 706 and 707 based on the consonant + vowel value calculated in step 708 or step 710.
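The arithmetic on the four-digit consonant + vowel value in FIG. 7 can be sketched as below. The function name `step_code` and the string representation of the value are assumptions; the range 01 to 11 and the wraparound values are those given in the description.

```python
CONSONANT_MIN, CONSONANT_MAX = 1, 11  # consonant numbers 01-11 (FIG. 2)


def step_code(code: str, masking: bool) -> str:
    """Count the consonant part of a 'CCVV' code up (mask) or down (restore)."""
    consonant, vowel = int(code[:2]), code[2:]
    consonant += 1 if masking else -1          # step 703 or step 708
    if consonant > CONSONANT_MAX:              # steps 704-705
        consonant = CONSONANT_MIN
    elif consonant < CONSONANT_MIN:            # steps 709-710
        consonant = CONSONANT_MAX
    return f"{consonant:02d}{vowel}"


print(step_code("0401", masking=True))   # 0501
print(step_code("1101", masking=True))   # 0101
print(step_code("0401", masking=False))  # 0301
print(step_code("0101", masking=False))  # 1101
```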

FIG. 8 is a flowchart showing the processing of the personal name kanji conversion program 34.
The personal name kanji conversion program 34 obtains the identification number from the personal name kanji table 37 for the processing target character passed from the single character conversion program 32 (step 801). For example, in the personal name kanji table shown in FIG. 3, when the processing target character is “田”, the identification number is “1350”.
Next, it checks the mask/restoration flag 43 in the data memory 40 (step 802). If the flag indicates a mask instruction, it counts up the identification number (step 803). In the above example, the value becomes “1351”.
Next, it compares the identification number counted up in step 803 with the range of the personal name kanji table 37 (step 804). If the value exceeds the range of the table, it sets the identification number to the first value (step 805). In the example of FIG. 3, if the counted-up value is “2233”, it becomes “0001”.
Based on the identification number calculated in step 803 or step 805, it obtains the corresponding character from the personal name kanji table 37 (step 806). In the above example, the value is “1351”, so “電” is obtained as the corresponding character.
Finally, it replaces the processing target character with the character obtained in step 806 (step 807) and passes it to the character string conversion program 31.
If the determination in step 802 finds a restoration instruction, it counts down the identification number (step 808). In the above example, if the character is “田”, the identification number is “1350”, so it becomes “1349”.
Next, as in step 804, it compares the value with the range of the personal name kanji table 37 (step 809). If the counted-down identification number falls outside the range of the table, it sets the identification number to the last value of the personal name kanji table 37 (step 810). In the example of FIG. 3, if the counted-down value is “0000”, it becomes “2232”.
It then performs steps 806 and 807 based on the identification number calculated in step 808 or step 810.
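The identification number update of FIG. 8 reduces to stepping through a 1-based cyclic range. The sketch below uses modular arithmetic in place of the explicit range checks in steps 804, 805, 809, and 810; the function name `step_id` is hypothetical, and the table size of 2,232 is the figure given for the personal name kanji table 37.

```python
TABLE_SIZE = 2232  # number of entries in the personal name kanji table 37


def step_id(number: int, masking: bool, size: int = TABLE_SIZE) -> int:
    """Count a 1-based identification number up or down with wraparound."""
    step = 1 if masking else -1
    return (number - 1 + step) % size + 1


print(step_id(1350, masking=True))   # 1351
print(step_id(2232, masking=True))   # 1
print(step_id(1350, masking=False))  # 1349
print(step_id(1, masking=False))     # 2232
```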

FIG. 9 is a flowchart showing the processing of the JIS level 1 and level 2 kanji conversion program 35.
The JIS level 1 and level 2 kanji conversion program 35 obtains the corresponding character code (hexadecimal) from the JIS level 1 and level 2 kanji table 38 for the processing target character passed from the single character conversion program 32 (step 901). For example, in the JIS level 1 and level 2 kanji table shown in FIG. 4, when the processing target character is “唖”, the character code is “3022”.
Next, it checks the mask/restoration flag 43 in the data memory 40 (step 902). If the flag indicates a mask instruction, it counts up the character code (step 903). In the above example, the value becomes “3023”.
Next, it compares the character code counted up in step 903 with the range of the JIS level 1 and level 2 kanji table 38 (step 904). If the value exceeds the range of the table, it sets the character code to the first value (step 905). In the example of FIG. 4, if the counted-up character code is “7427”, it becomes “3021”.
It determines whether the personal name kanji identification flag is set for the character code obtained in step 903 or step 905 (step 906). If the flag is set, it repeats the processing from step 903.
If the personal name kanji identification flag is not set, it obtains the character corresponding to the character code from the JIS level 1 and level 2 kanji table 38 (step 907). In the above example, the character code is “3023”, so “娃” is obtained as the corresponding character.
Finally, it replaces the processing target character with the character obtained in step 907 (step 908) and passes it to the character string conversion program 31.
If the determination in step 902 finds a restoration instruction, it counts down the character code (step 909). In the above example, if the character is “唖”, the character code is “3022”, so it becomes “3021”.
Next, as in step 904, it compares the value with the range of the JIS level 1 and level 2 kanji table 38 (step 910). If the counted-down character code falls outside the range of the table, it sets the character code to the last value (step 911). In the example of FIG. 4, if the counted-down character code is “3020”, it becomes “7426”.
It determines whether the personal name kanji identification flag is set for the character code obtained in step 909 or step 911 (step 912). If the flag is set, it repeats the processing from step 909.
If the personal name kanji identification flag is not set, it performs steps 907 and 908.
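The skip rule of FIG. 9 can be sketched on a small excerpt of the table. The 16-character excerpt, the choice of which characters carry the personal name kanji flag, and the names `CHARS`, `FLAGGED`, and `convert_char` are all assumptions made so that the example reproduces the conversion shown for FIG. 4; the real table covers all level 1 and level 2 kanji and its flags come from the personal name kanji table 37. The excerpt wraps around at its own ends only to keep the sketch short.

```python
# Hypothetical excerpt of the table in JIS code order, starting at code 3021.
CHARS = "亜唖娃阿哀愛挨姶逢葵茜穐悪握渥旭"
# Hypothetical flags: characters assumed to be in the personal name kanji table.
FLAGGED = set("亜哀愛悪握")


def convert_char(ch: str, masking: bool) -> str:
    """Step to the next (mask) or previous (restore) unflagged character."""
    if ch not in CHARS or ch in FLAGGED:
        return ch  # flagged characters are handled by the personal name table
    step = 1 if masking else -1
    i = CHARS.index(ch)
    while True:
        i = (i + step) % len(CHARS)   # count up or down, wrapping (904/910)
        if CHARS[i] not in FLAGGED:   # flag check (906/912)
            return CHARS[i]


def convert(text: str, masking: bool) -> str:
    return "".join(convert_char(ch, masking) for ch in text)


print(convert("唖阿逢渥", masking=True))   # 娃挨葵旭
print(convert("娃挨葵旭", masking=False))  # 唖阿逢渥
```

Since flagged characters are never inputs or outputs of this routine, it acts as a cyclic shift over the unflagged characters, which is why counting down restores exactly what counting up produced.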

As described above, the character string conversion device according to the present embodiment masks character strings indicating personal information or the like according to a predefined character list and conversion rule, so the character string before conversion can be restored from the converted character string.

なお、本発明の文字列変換装置における文字リスト及び変換規則は、前記実施の形態に示すものに限らず、種々のものを定義することが可能である。この場合、変換規則は複数定義されたものの中から任意に選択可能としてもよく、また、複数の変換規則を組み合せて変換処理を行うこととしてもよい。 The character list and conversion rules in the character string conversion device of the present invention are not limited to those shown in the above-described embodiment, and various types can be defined. In this case, conversion rules may be arbitrarily selected from a plurality of conversion rules, or a conversion process may be performed by combining a plurality of conversion rules.
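
As a rough illustration of combining a plurality of conversion rules, the sketch below models each rule as a pair of functions (convert, restore) acting on a position in the character list. The rules `shift_rule` and `mirror_rule`, the five-character list `CHARS`, and all function names are hypothetical and do not appear in the embodiment. Restoration applies the inverse rules in reverse order.

```python
from typing import Callable, List, Tuple

# A rule is a (convert, restore) pair mapping (index, list size) -> index.
IndexFn = Callable[[int, int], int]
Rule = Tuple[IndexFn, IndexFn]


def shift_rule(k: int) -> Rule:
    """Hypothetical rule: move k positions forward in the list, cyclically."""
    return (lambda i, n: (i + k) % n, lambda i, n: (i - k) % n)


def mirror_rule() -> Rule:
    """Hypothetical rule: reflect the position; it is its own inverse."""
    def reflect(i: int, n: int) -> int:
        return n - 1 - i
    return (reflect, reflect)


def convert(text: str, chars: str, rules: List[Rule]) -> str:
    out = []
    for ch in text:
        if ch not in chars:           # characters outside the list are left as-is
            out.append(ch)
            continue
        i = chars.index(ch)
        for forward, _ in rules:      # apply the rules in the given order
            i = forward(i, len(chars))
        out.append(chars[i])
    return "".join(out)


def restore(text: str, chars: str, rules: List[Rule]) -> str:
    out = []
    for ch in text:
        if ch not in chars:
            out.append(ch)
            continue
        i = chars.index(ch)
        for _, inverse in reversed(rules):  # undo the rules in reverse order
            i = inverse(i, len(chars))
        out.append(chars[i])
    return "".join(out)


CHARS = "あいうえお"                      # toy character list
RULES = [shift_rule(1), mirror_rule()]   # a combination of two rules

masked = convert("あい", CHARS, RULES)
print(masked, restore(masked, CHARS, RULES))
```

Keeping each rule paired with its inverse is one way to ensure that any combination of rules remains restorable, which matches the requirement that a restoration rule correspond to the conversion rule.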

本発明の一実施の形態に係る文字列変換装置の概略構成を示すブロック図である。 A block diagram showing the schematic configuration of a character string conversion device according to an embodiment of the present invention.
ひらがなカタカナテーブルとそれを用いた文字列変換例を示す図である。 A diagram showing the hiragana katakana table and an example of character string conversion using it.
人名用漢字テーブルとそれを用いた文字列変換例を示す図である。 A diagram showing the personal name kanji table and an example of character string conversion using it.
ＪＩＳ第１第２水準漢字テーブルとそれを用いた文字列変換例を示す図である。 A diagram showing the JIS first and second level kanji table and an example of character string conversion using it.
文字列変換プログラムの行う処理を示すフローチャートである。 A flowchart showing the processing performed by the character string conversion program.
一文字変換プログラムの行う処理を示すフローチャートである。 A flowchart showing the processing performed by the single character conversion program.
ひらがなカタカナ変換プログラムの行う処理を示すフローチャートである。 A flowchart showing the processing performed by the hiragana katakana conversion program.
人名用漢字変換プログラムの行う処理を示すフローチャートである。 A flowchart showing the processing performed by the personal name kanji conversion program.
ＪＩＳ第１第２水準漢字変換プログラムの行う処理を示すフローチャートである。 A flowchart showing the processing performed by the JIS first and second level kanji conversion program.

Explanation of symbols

１０入力装置、２０中央処理装置、３０プログラムメモリ、３１文字列変換プログラム、３２一文字変換プログラム、３３ひらがなカタカナ変換プログラム、３４人名用漢字変換プログラム、３５ＪＩＳ第１第２水準漢字変換プログラム、３６ひらがなカタカナテーブル、３７人名用漢字テーブル、３８ＪＩＳ第１第２水準漢字テーブル、４０データメモリ、４１変換前文字列、４２変換後文字列、４３マスク・復元フラグ、５０出力装置。
10 input device, 20 central processing unit, 30 program memory, 31 character string conversion program, 32 single character conversion program, 33 hiragana katakana conversion program, 34 personal name kanji conversion program, 35 JIS first and second level kanji conversion program, 36 hiragana katakana table, 37 personal name kanji table, 38 JIS first and second level kanji table, 40 data memory, 41 pre-conversion character string, 42 post-conversion character string, 43 mask/restore flag, 50 output device.

Claims

A character string conversion device for converting a character string in an electronic document designated as a conversion processing target into another character string, the device comprising:
a character list that defines conversion characters and identifiers that uniquely identify each conversion character;
means for accepting input of a conversion instruction, searching the character list for each character included in the character string to be converted, and converting that character to another character in the character list according to a predefined conversion rule, based on the identifier assigned to that character in the character list; and
means for accepting input of a restoration instruction and restoring the converted character to the conversion target character included in the character list according to a restoration rule corresponding to the conversion rule.

The character string conversion device according to claim 1, wherein the conversion rule is arbitrarily selected from a plurality of predefined conversion rules or a combination thereof.

A character string conversion program that causes a computer to execute processing for converting a character string in an electronic document designated as a conversion processing target into another character string, the program causing the computer to execute:
a process of accepting input of a conversion instruction, searching a character list in which conversion characters are stored in advance for each character included in the character string to be converted, and converting that character to another character in the character list according to a predefined conversion rule, based on an identifier assigned to that character in the character list; and
a process of accepting input of a restoration instruction and restoring the converted character to the conversion target character included in the character list according to a restoration rule corresponding to the conversion rule.

The character string conversion program according to claim 3, wherein the conversion rule is arbitrarily selected from a plurality of predefined conversion rules or a combination thereof.