JP5336779B2

JP5336779B2 - Information processing apparatus for performing character string conversion, character string conversion method, program, and information processing system

Info

Publication number: JP5336779B2
Application number: JP2008168087A
Authority: JP
Inventors: 剛志福田
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2008-06-27
Filing date: 2008-06-27
Publication date: 2013-11-06
Anticipated expiration: 2028-06-27
Also published as: JP2010009329A

Description

The present invention relates to character string conversion technology, and more specifically to an information processing apparatus, a character string conversion method, a program, and an information processing system that convert a character string written in source characters into a character string of another language, such as alphabetic characters.

In recent years, with the globalization of economic activity, the development of transportation, and the spread of the Internet, it is often necessary to process multiple languages such as Japanese, English, Chinese, and Korean at the same time. Many business tasks require handling multiple languages. Personal names, for example, include many words that are specific to names, and the time available to decide on a spelling is often limited. When a name from the Indo-European language area is already written in alphabetic characters, no alphabet conversion is needed. However, various problems arise when an alphabetic spelling must be generated from a notation such as Japanese, either for a personal name derived from a language outside the Indo-European area, or for an Indo-European name that has first been written in another language such as Japanese, Chinese, or Korean.

For example, Japanese katakana is a phonetic script. Consider converting the kana string 「らいと」 or 「ライト」 (raito) into alphabetic characters. For the katakana spelling 「ライト」, there are many possible identical or similar alphabetic spellings, such as "right", "light", "write", and "wright".

When an Indo-European personal name is written in katakana, an English speaker pronounces "Henry", for example, as 「ヘンリー」 (Henrī), and that katakana spelling is assigned. A French speaker, however, pronounces the same alphabetic spelling "Henry" as 「アンリ」 (Anri), so a different katakana spelling is assigned even though the original alphabetic spelling is identical. When such an alphabetic spelling has been converted to katakana and another person converts the katakana back into an alphabetic spelling, a unique conversion is not necessarily obtained.

Words used in a language include, in addition to personal names, place names, coined words, and compound words. It is possible to look up the spelling of such words in a dictionary each time. In voice calls such as telephone calls, however, processing must in most cases be done in real time on the basis of the spoken word, and consulting a dictionary each time may not be possible. As a result, errors such as misspellings and mishearings have sometimes occurred.

Techniques for performing the alphabet conversion described above are known. For example, Japanese Patent Laid-Open No. 8-339376 (Patent Document 1) discloses an apparatus and system for efficiently searching, with katakana words, for foreign-language words registered in a database. Patent Document 1 uses a phonetic symbol/katakana correspondence table that stores the correspondence between phonetic symbols and katakana characters, and phonetic symbol-to-katakana conversion means that uses this table to convert the phonetic symbols of registration data, consisting of a foreign-language word and its phonetic symbols entered from a registration data input unit, into a katakana word. The system of Patent Document 1 calculates a similarity Ri between the katakana search keyword and each katakana word registered in the database, and outputs as search results the foreign-language words corresponding to katakana words whose word similarity Ri is equal to or greater than a specified value.
Patent Document 1: JP-A-8-339376

As described above, the technique of Patent Document 1 performs katakana-to-foreign-language conversion using a conversion table, and therefore requires dictionary maintenance. In addition, the accuracy of katakana-to-alphabet conversion depends on the accuracy of the dictionary, and word similarity is computed by character comparison between katakana strings. Consequently, conversion is sometimes not sufficiently accurate, for example because of the diversity of katakana spellings, or because letters that are not pronounced (silent letters) cause different alphabetic spellings to yield the same katakana spelling.

The drawbacks of the prior art described above stem from calculating similarity by comparing katakana with katakana. Moreover, acquiring a search keyword, calculating similarity, and then referring to pronunciations according to the similarity to search for foreign-language words is not sufficient for achieving real-time responsiveness, in terms of the consumption of hardware resources such as memory allocated to tables, microprocessor occupancy such as search time, and search accuracy.

The present invention has been made in view of the problems of the prior art described above. An object of the present invention is to provide an information processing apparatus, a character string conversion method, a program, and an information processing system that make it possible to convert a source character string into the corresponding character string of another language, such as alphabetic characters, by directly associating source character strings such as katakana, hiragana, and Hangul with the other language.

A further object of the present invention is to provide an information processing apparatus, a character string conversion method, a program, and an information processing system capable of converting a source character string such as katakana, hiragana, or Hangul into the maximum-likelihood character string of another language, such as alphabetic characters, corresponding to the source character string.

A further object of the present invention is to provide an information processing apparatus, a character string conversion method, a program, and an information processing system that associate phonemes of another language, such as alphabetic phonemes, with a source character string such as katakana, hiragana, or Hangul, and that enable conversion to the maximum-likelihood alphabetic character string using a probability model.

The present invention has been made in view of the problems of the prior art. It defines a cost for the correspondence between phonemes of the source character string and phonemes of the target character string, and adopts an alignment cost that characterizes the difference in phoneme characteristics between the two strings. The alignment cost allows linguistic differences in phoneme characteristics between the source and target character strings to be reflected in the conversion. In the alignment process, the phoneme sequence of the source character string and the phoneme sequence of the target character string are used to generate a path map with each sequence as an axis. In the path map, a cell is defined by a unit phoneme of the source character string and the unit phoneme of the target character string associated with it. Three unit paths are specified, corresponding to three directions: along each of the two axes, and diagonal to the axes. A different cost is assigned to each unit path, the costs are determined so as to give the shortest path on the path map, that is, the minimum-cost path, and a cost model is created.

The cost model thus created is then used for automatic alignment of unaligned cases. In automatic alignment, the phoneme sequence of the source character string is determined, multiple path maps are generated from it and from the phoneme sequences of target character strings that may correspond to it, and the minimum-cost path is searched for. The minimum-cost path found gives the alignment of the target character string with the source character string. In the present invention, the path search may be executed using the Viterbi algorithm.
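The minimum-cost path search over a path map can be sketched as a small dynamic program. This is only an illustration under stated assumptions: the function name `min_cost_alignment`, the integer cost values, and the segmentation of "right" into the units R, I, GH, T are hypothetical and do not come from the patent, and a plain dynamic-programming shortest path stands in for whatever search the embodiment actually uses.

```python
def min_cost_alignment(src, dst, pair_cost, skip_src_cost, skip_dst_cost,
                       default_pair_cost):
    """Return (total_cost, path) for the cheapest path through the path map.

    src and dst are the phoneme lists on the two axes. Three unit paths are
    allowed: diagonal (pair a source phoneme with a target phoneme), along
    the source axis (source phoneme with no target), and along the target
    axis (target phoneme with no source, e.g. a silent letter group).
    All cost values are hypothetical inputs supplied by the caller.
    """
    n, m = len(src), len(dst)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            steps = []
            if i > 0 and j > 0:
                c = pair_cost.get((src[i - 1], dst[j - 1]), default_pair_cost)
                steps.append((cost[i - 1][j - 1] + c, (i - 1, j - 1)))
            if i > 0:
                steps.append((cost[i - 1][j] + skip_src_cost, (i - 1, j)))
            if j > 0:
                steps.append((cost[i][j - 1] + skip_dst_cost, (i, j - 1)))
            if steps:
                cost[i][j], back[i][j] = min(steps, key=lambda s: s[0])
    # Walk back from the last cell to recover the aligned pairs.
    path, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        path.append((src[i - 1] if pi < i else None,
                     dst[j - 1] if pj < j else None))
        i, j = pi, pj
    return cost[n][m], path[::-1]


# Hypothetical costs: a cheap diagonal step for plausible pairs only.
costs = {("ラ", "R"): 1, ("イ", "I"): 1, ("ト", "T"): 1}
total, path = min_cost_alignment(["ラ", "イ", "ト"], ["R", "I", "GH", "T"],
                                 costs, skip_src_cost=5, skip_dst_cost=3,
                                 default_pair_cost=10)
print(total, path)
```

With these illustrative costs, the cheapest path pairs each kana phoneme with one alphabetic unit and takes a target-axis step for the unpronounced "GH", which is the kind of phoneme-count difference the three unit paths are meant to absorb.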

Furthermore, the present invention uses the results of the automatic alignment described above to generate a probability model in which the source-character phonemes are the observation sequence and the target character string is the state transition sequence. The probability model is generated as a transition probability table that associates, with source-character phonemes or phonemes of the target character string, the conversion probability π_i between a source-character phoneme and its corresponding target phoneme, and the transition probability P_ij that leads to the conversion probability π_j for converting the phoneme that follows in the source character string into a target phoneme.

To perform character string conversion, the source character string is acquired and decomposed into phonemes. Using the decomposition result and the probability model, the conversion likelihood χ is calculated for each character string of the other language that can be generated from the source character string by state transitions, and the character string of the other language giving the maximum conversion likelihood χ is output as the conversion result.
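Selecting the target phoneme sequence with the maximum conversion likelihood can be sketched as follows. The sketch makes several assumptions that are not stated in the patent: the likelihood is taken to be the product of conversion and transition probabilities, the search is a Viterbi-style dynamic program over target phonemes, the input is non-empty, and every probability value and table shape (`conv_prob`, `trans_prob`) is invented for illustration.

```python
def best_conversion(kana_phonemes, conv_prob, trans_prob):
    """Return (likelihood, target_phonemes) for the most likely conversion.

    conv_prob[k][a]    : probability of converting source phoneme k to
                         target phoneme a (hypothetical table shape)
    trans_prob[(a, b)] : probability that target phoneme b follows a
    """
    # best[a] = (likelihood, sequence) of the best partial result ending in a
    best = {a: (p, [a]) for a, p in conv_prob[kana_phonemes[0]].items()}
    for k in kana_phonemes[1:]:
        step = {}
        for a, p in conv_prob[k].items():
            step[a] = max(
                ((score * trans_prob.get((prev, a), 0.0) * p, seq + [a])
                 for prev, (score, seq) in best.items()),
                key=lambda t: t[0],
            )
        best = step
    return max(best.values(), key=lambda t: t[0])


# Invented probabilities for the kana string 「ライト」.
conv_prob = {
    "ラ": {"R": 0.5, "L": 0.5},
    "イ": {"IGH": 0.5, "I": 0.5},
    "ト": {"T": 0.5, "TE": 0.5},
}
trans_prob = {
    ("R", "IGH"): 0.5, ("L", "IGH"): 0.25, ("R", "I"): 0.25, ("L", "I"): 0.25,
    ("IGH", "T"): 1.0, ("I", "TE"): 0.5, ("I", "T"): 0.125,
}
likelihood, phonemes = best_conversion(["ラ", "イ", "ト"], conv_prob, trans_prob)
print("".join(phonemes), likelihood)
```

Keeping only the best partial sequence per ending phoneme avoids enumerating every candidate string, which matters when each kana phoneme has several possible alphabetic counterparts.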

Multiple probability models can also be used, one for each of several different languages, so that the language type can be estimated from a given source character string and the string can be converted into the estimated language.

Furthermore, the present invention provides a character string conversion method and a program, executable by an information processing apparatus, for causing the information processing apparatus to execute the above processing.

The present invention can be implemented as a web server that provides a character string conversion service to web clients over a network.

According to the present invention, it is possible to provide an information processing apparatus, a character string conversion method, a program, and an information processing system that flexibly accommodate linguistic differences in phoneme decomposition between the source character string and the target language, improve conversion accuracy by converting directly between phonemes of the source character string and phonemes of the target character string, flexibly accommodate linguistic diversity, and do not waste hardware resources.

The present invention is described below by way of embodiments, but it is not limited to the embodiments described below. FIG. 1 is a functional block diagram of the information processing apparatus 100 of this embodiment. The information processing apparatus 100 can be implemented as a personal computer, a workstation, or a dedicated server machine.

When the information processing apparatus 100 is implemented as a dedicated server machine, the microprocessor can be a CISC-architecture microprocessor such as a PENTIUM (registered trademark) or PENTIUM (registered trademark) compatible chip, or a RISC-architecture microprocessor such as POWERPC (registered trademark), and may be single-core or multi-core. In that case, the operating system (OS) can be WINDOWS (registered trademark) 200X, UNIX (registered trademark), LINUX (registered trademark), or the like.

When implemented as a dedicated server machine, the information processing apparatus 100 also executes server programs such as CGI, servlets, APACHE, and IIS, implemented in programming languages such as C++, JAVA (registered trademark), JAVA (registered trademark) BEANS, PERL, and RUBY, and processes various requests via a network (not shown). When implemented as a dedicated server machine, the information processing apparatus 100 can be a web server. It can also be a dedicated server that enables distributed computing using CORBA (Common Object Request Broker Architecture).

When the information processing apparatus 100 is implemented using a personal computer or a workstation, the microprocessor (MPU) may include any known single-core or dual-core processor. In this embodiment, the information processing apparatus 100 may run any operating system, such as WINDOWS (registered trademark), UNIX (registered trademark), LINUX (registered trademark), or MAC OS (registered trademark). When functioning as a web client, the information processing apparatus 100 can access a web server over the HTTP protocol using browser software such as Internet Explorer (registered trademark), Mozilla, Opera, or Netscape (registered trademark) Navigator.

The functional blocks of the information processing apparatus 100 of FIG. 1 are described in detail below. The information processing apparatus 100 of this embodiment uses, as the source language, phonetic character strings including katakana, hiragana, and Hangul. As the other language, languages of the Indo-Western European area can be used, such as alphabetic, Islamic, Hebrew, Slavic, and Hindu languages. Below, for concreteness, the source language is taken to be kana, including hiragana and katakana, and the other language is taken to be alphabetic. Each functional unit described below is realized as functional means of the information processing apparatus 100 by loading a program into memory or the like and having a CPU or microprocessor execute it.

The information processing apparatus 100 includes an information processing apparatus main body 110, an input unit 112, and an input/output interface 114. In the embodiment shown in FIG. 1, a keyboard or the like can be used as the input unit 112. The input unit 112 receives a kana character string as the source character string and sends it via the input/output interface 114 to the kana character acquisition unit 116, which serves as the source character acquisition unit. The information processing apparatus 100 shown in FIG. 1 also includes a display device, but its description is omitted because it is not related to the gist of this embodiment.

The kana character acquisition unit 116 sends the received kana character string to the phoneme decomposition unit 120. The phoneme decomposition unit 120 refers to the phoneme data storage unit 126 and decomposes the received kana character string into a phoneme sequence used as the unit of conversion processing. For example, the kana character string 「インフォメーション」 ("information") is decomposed into the source-character phonemes 「イ」, 「ン」, 「フォ」, 「メー」, 「ショ」, and 「ン」. In this embodiment, decomposition into source-character phonemes takes the smallest unit that can stand alone as kana as the minimum unit; when a long vowel mark or a geminate consonant mark (sokuon) is present, it is joined to the immediately preceding kana character and registered as a single phoneme.
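The decomposition rule can be sketched in a few lines. The function name `decompose_kana` and the exact set of attaching characters are assumptions: the text names only the long vowel mark and the sokuon, and the small kana are included here because the 「フォ」 and 「ショ」 units in the example imply them. The phoneme data storage unit 126 is not modeled.

```python
# Characters that cannot stand alone and attach to the preceding kana.
# Assumed set: long vowel mark, sokuon, and small katakana.
ATTACHING = set("ーッァィゥェォャュョヮ")


def decompose_kana(text):
    """Split a katakana string into source-character phoneme units."""
    units = []
    for ch in text:
        if units and ch in ATTACHING:
            units[-1] += ch      # join to the immediately preceding kana
        else:
            units.append(ch)     # start a new unit
    return units


print(decompose_kana("インフォメーション"))
```

For the example in the text this yields the six units 「イ」, 「ン」, 「フォ」, 「メー」, 「ショ」, 「ン」. Hiragana input would need the corresponding hiragana characters added to the set.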

The phoneme data can be created as a data structure that registers, for kana character strings and alphabetic character strings, the phonemes that should serve as the linguistic minimum units, for example by associating with each kana phoneme the alphabetic phonemes that should be classified as corresponding to it. The phoneme data can be registered in the phoneme data storage unit 126 in advance.

When another script such as Hangul is used as the source characters, source-character phonemes can be registered in units that are appropriate from a linguistic viewpoint. When the source language is not kana and the other language is not alphabetic, the phonemes of the source language and the phonemes of the other language are associated with each other and registered in the phoneme data storage unit 126. The source-character phonemes and target phonemes are used as the units that define cells in the path search performed during learning, described later.

When phoneme decomposition of the kana character string is complete, the phoneme decomposition unit 120 sends the result to the conversion likelihood calculation unit 122. The conversion likelihood calculation unit 122 looks up the probability model storage unit 128 and calculates the conversion likelihood. The probability model storage unit 128 stores a probability model, which registers the conversion probability of target phonemes, that is, phonemes of the language into which characters are converted, corresponding to a particular kana character. The conversion probability is, for example, the probability of associating the kana phoneme 「イ」 with the alphabetic phoneme "I".

The probability model also registers transition probabilities between consecutive phonemes. A transition probability is, for example, the probability that "N" appears as the phoneme following the alphabetic phoneme "I". In the specific embodiment described, the probability π of associating an alphabetic phoneme with a kana phoneme and the transition probability P with which the phoneme sequence corresponding to consecutive kana characters continues are registered, for example, as a transition probability table.

The conversion likelihood calculation unit 122 refers to the probability model storage unit 128 and forward-processes the acquired kana phoneme sequence: it obtains the conversion probability π_1 of the first kana character, then the conversion probability π_2 of the immediately following kana character and the transition probability P_12 between the first and the following kana characters. This processing is repeated until the end of the phoneme sequence, and the resulting probability values are combined by accumulation, multiplication, or another appropriate formula to calculate the conversion likelihood χ from the head to the tail of the kana phoneme sequence and alphabetic phoneme sequence.
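For one candidate alphabetic phoneme sequence, the forward pass can be sketched as below. Multiplication is chosen here as the combining formula, which is only one of the options the text allows; the function name, the table shapes, and the probability values are assumptions made for illustration.

```python
def conversion_likelihood(kana_phonemes, alpha_phonemes, conv_prob, trans_prob):
    """Multiply pi_1, P_12, pi_2, P_23, ... along one candidate sequence.

    conv_prob[k][a]    : conversion probability of kana phoneme k to
                         alphabetic phoneme a (hypothetical table shape)
    trans_prob[(a, b)] : probability that alphabetic phoneme b follows a
    Missing table entries are treated as probability 0.
    """
    if len(kana_phonemes) != len(alpha_phonemes):
        raise ValueError("sequences must have the same length")
    chi = 1.0
    for idx, (k, a) in enumerate(zip(kana_phonemes, alpha_phonemes)):
        chi *= conv_prob.get(k, {}).get(a, 0.0)                    # pi
        if idx > 0:
            chi *= trans_prob.get((alpha_phonemes[idx - 1], a), 0.0)  # P
    return chi


# Invented values: pi(イ->I) = 0.5, pi(ン->N) = 0.5, P(I->N) = 0.5
conv_prob = {"イ": {"I": 0.5}, "ン": {"N": 0.5}}
trans_prob = {("I", "N"): 0.5}
print(conversion_likelihood(["イ", "ン"], ["I", "N"], conv_prob, trans_prob))
```

For long sequences a sum of log probabilities is usually preferred over a raw product to avoid underflow; that variant would rank candidates identically.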

On being notified that the conversion likelihood calculation unit 122 has finished its probability calculation, the maximum likelihood phoneme sequence determination unit 118 searches the result list created by the conversion likelihood calculation unit 122. In the embodiment described, it determines the alphabetic sequence with the largest conversion likelihood χ as the maximum-likelihood alphabetic character string candidate and outputs it to the result output unit 130 as the conversion result. In a preferred embodiment, the output is displayed on the desktop screen of the display device, and the user can make a hard copy as needed.

The phoneme data stored in the phoneme data storage unit 126 and the transition probabilities stored in the probability model storage unit 128 can be created by the preprocessor 124 before character conversion is executed and registered in a hard disk device (not shown), EEPROM, EPROM, or the like. In other embodiments, the preprocessor 124 need not be implemented in the information processing apparatus 100, as long as the phoneme data and the transition probability data can be read as runtime data from a hard disk device or the like into the RAM of the information processing apparatus 100 when the program is executed.

FIG. 2 shows the functional block configuration 200 of the preprocessor 124 included in the information processing apparatus 100. Aligned cases and unaligned cases are input to the preprocessor 124 by a system administrator or developer. An aligned case is defined as a data set in which kana phonemes and alphabetic phonemes have already been associated. An unaligned case is defined as a data set consisting of an alphabetic character string and a kana character string that are to be associated with each other; it is used as training data for improving the accuracy of the probability model by means of the cost model described later.

On receiving the aligned cases 210, the preprocessor 124 passes them to the cost model generation unit 214. The cost model generation unit 214 uses the kana phoneme to alphabetic phoneme correspondences in the data of the aligned cases 210 to calculate the cost of each correspondence. It performs this cost calculation over the entire set of acquired aligned cases 210, generates from the aligned cases a cost model for associating kana phonemes with alphabetic phonemes, and stores the model in an appropriate storage area.

The alignment processing unit 216 uses the generated cost model to align the unaligned cases 212 and calculates the inter-phoneme costs given by the cost model. It further generates multiple different path maps by associating with the kana phonemes the multiple alphabetic phonemes that can correspond to them, applies the cost model to each path map, and determines the lowest-cost alignment. This produces training cases by alignment, and the results are stored in the alignment result storage unit 220.

When the alignment processing unit 216 has finished, the probability model generation unit 218 extracts the results from the alignment result storage unit 220, counts the number of occurrences of each alignment case, and generates, for kana phoneme and alphabetic phoneme pairs, the conversion probability π of the alphabetic phoneme corresponding to a kana phoneme and the transition probability P between phonemes in sequence. In the embodiment described, the generated values, namely the kana conversion probability π for an alphabetic phoneme and the transition probability P to the following alphabetic phoneme, are registered in the probability model storage unit 128 in the form of a transition probability table.
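Turning occurrence counts into the two probability tables can be sketched as follows. The text says only that occurrences of alignment cases are counted; normalizing the counts into relative frequencies, the function name `estimate_probabilities`, and the sample alignments are assumptions made for this illustration.

```python
from collections import Counter, defaultdict


def estimate_probabilities(alignments):
    """Estimate conversion and transition probabilities by counting.

    alignments: list of aligned cases, each a list of
                (kana_phoneme, alphabetic_phoneme) pairs.
    Returns (conv_prob, trans_prob) where
      conv_prob[k][a]    = count(k -> a) / count(k)
      trans_prob[(a, b)] = count(a followed by b) / count(a followed by anything)
    """
    pair_counts = defaultdict(Counter)
    next_counts = defaultdict(Counter)
    for pairs in alignments:
        prev = None
        for kana, alpha in pairs:
            pair_counts[kana][alpha] += 1
            if prev is not None:
                next_counts[prev][alpha] += 1
            prev = alpha
    conv_prob = {
        k: {a: c / sum(cnt.values()) for a, c in cnt.items()}
        for k, cnt in pair_counts.items()
    }
    trans_prob = {
        (a, b): c / sum(cnt.values())
        for a, cnt in next_counts.items()
        for b, c in cnt.items()
    }
    return conv_prob, trans_prob


# Invented alignment results, for illustration only.
alignments = [
    [("ラ", "R"), ("イ", "I")],
    [("ラ", "L"), ("イ", "I")],
    [("ラ", "R"), ("イ", "IGH")],
    [("ラ", "R"), ("イ", "I")],
]
conv_prob, trans_prob = estimate_probabilities(alignments)
print(conv_prob["ラ"], trans_prob)
```

Unseen pairs get no entry at all in this sketch; a practical table would probably need some form of smoothing so that a single missing count does not force a likelihood of zero.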

FIG. 3 is a flowchart of an embodiment of the probability model generation process. The process shown in FIG. 3 corresponds to the processing executed by the preprocessor 124 shown in FIG. 2. The process starts at step S300, and in step S301 a set of alignment cases is acquired. In step S302, the first aligned case is taken from the aligned cases and its alignment cost is calculated. The alignment cost is calculated using path costs assigned in consideration of the correspondence between alphabetic phonemes and kana phonemes.

To calculate the alignment cost, a path map is generated for the kana phoneme and alphabetic phoneme pairs associated in the aligned case, and kana phoneme to alphabetic phoneme cells are assigned on the path map. Unit paths along the cells are then assigned under fixed rules, a route is traced along the unit paths from head to tail, and the costs that the rules assign along that route are summed. The alignment cost calculation is described in more detail later.
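Summing unit-path costs along the route of an already aligned case can be sketched as below. The representation of an aligned case as a list of pairs with `None` marking an axis-direction step, the function name, and all cost values are assumptions for illustration; the patent does not specify them at this point.

```python
def alignment_cost(aligned_pairs, pair_cost, skip_src_cost, skip_dst_cost,
                   default_pair_cost):
    """Sum the unit-path costs along the route of one aligned case.

    aligned_pairs: list of (kana_phoneme, alphabetic_phoneme) pairs, where
                   None on one side marks a step along the other axis.
    """
    total = 0
    for kana, alpha in aligned_pairs:
        if kana is None:
            total += skip_dst_cost      # step along the alphabetic axis
        elif alpha is None:
            total += skip_src_cost      # step along the kana axis
        else:                           # diagonal step through a cell
            total += pair_cost.get((kana, alpha), default_pair_cost)
    return total


# Hypothetical costs and a hand-aligned case for 「ライト」 / "right".
pair_cost = {("ラ", "R"): 1, ("イ", "I"): 1, ("ト", "T"): 1}
case = [("ラ", "R"), ("イ", "I"), (None, "GH"), ("ト", "T")]
print(alignment_cost(case, pair_cost, skip_src_cost=5, skip_dst_cost=3,
                     default_pair_cost=10))
```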

In step S303, it is determined whether the cost has been determined for all elements of the aligned case set. If not (no), the process branches back to step S302, and cost calculations are accumulated until the entire aligned case set has been processed. The aligned cases can be prepared by the program developer as part of creating the program's data.

If it is determined in step S303 that cost calculation has finished for all elements (yes), a cost model for kana phoneme-alphabet phoneme association is generated in step S304 with reference to the results of the cost calculation. The cost model provides a measure of the validity of an association between successive phonemes, including any increase or decrease in the number of phonemes caused by alphabet conversion.

In step S305, the preprocessor 124 acquires the unaligned case set. In step S306, the information processing apparatus 100 executes alignment processing using the cost model, generates an automatically aligned set, and stores it in an appropriate storage area. More specifically, in the automatic alignment process, the alphabet phonemes extracted from the aligned cases when the cost model was generated are assigned to the kana phonemes of an unaligned case with reference to the cost model, producing a plurality of path maps, and the path cost from the first phoneme to the last phoneme is calculated for each path map. After the path cost has been calculated for all path maps, the alphabet phonemes that give the minimum cost for that unaligned case are assigned.

In step S307, it is determined whether alignment has been completed for all elements of the unaligned case set. If not (no), the process branches back to step S306 and the automatic alignment process is repeated. If alignment has been completed for all elements (yes), the kana phoneme-alphabet phoneme conversion probabilities and the transition probabilities between phonemes are calculated in step S308, and a probability model is generated. Various forms of probability model are conceivable; one can be generated by registering, for each kana phoneme and alphabet phoneme registered in the cost model, its conversion probability and the transition probability between the preceding and subsequent phonemes. In step S309, the generated probability model is registered in the probability model storage unit 128, and the probability model generation process ends in step S310.
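The counting performed in step S308 can be sketched roughly as follows. This is a minimal illustration only: it assumes that each aligned case is a list of (kana phoneme, alphabet phoneme) pairs and that the probabilities are plain relative frequencies. The function name `build_probability_model`, the data layout, and the sample cases are hypothetical and are not taken from the specification.

```python
from collections import Counter, defaultdict

def build_probability_model(aligned_cases):
    """Estimate pi and P by relative frequency (an assumed estimator).

    aligned_cases: list of cases; each case is a list of
    (kana_phoneme, alphabet_phoneme) pairs in reading order.
    Returns (pi, P) where
      pi[alpha][kana]      ~ probability that alpha is written as kana
      P[alpha][next_alpha] ~ probability that next_alpha follows alpha
    """
    pair_counts = defaultdict(Counter)   # alpha -> kana -> count
    next_counts = defaultdict(Counter)   # alpha -> next alpha -> count
    for case in aligned_cases:
        for kana, alpha in case:
            pair_counts[alpha][kana] += 1
        # Count each adjacent pair of alphabet phonemes in the case.
        for (_, a), (_, b) in zip(case, case[1:]):
            next_counts[a][b] += 1

    def normalize(table):
        # Turn raw counts into probabilities that sum to 1 per row.
        return {key: {x: n / sum(row.values()) for x, n in row.items()}
                for key, row in table.items()}

    return normalize(pair_counts), normalize(next_counts)

# Hypothetical toy data, for illustration only.
cases = [
    [("イ", "I"), ("ン", "N")],
    [("イ", "I"), ("ン", "N"), ("ク", "K")],
    [("アイ", "I"), ("ス", "CE")],
]
pi, P = build_probability_model(cases)
print(pi["I"])
print(P["I"])
```

A nested dictionary keyed by the preceding alphabet phoneme mirrors the table layout the description gives for the probability model, but any equivalent table structure would serve.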

The processing above is executed by the preprocessor 124. In an embodiment that does not implement the preprocessor 124, the probability model can be stored, at installation time, in an appropriate storage area of the hard disk device from a CD-ROM, DVD-ROM, or the like, as execution data of the program that performs character conversion. When program execution starts, the probability model is read from the hard disk device into high-speed access memory such as RAM and used by the program. When the information processing apparatus 100 includes the preprocessor 124, the probability model is registered directly on the hard disk device and is read into high-speed access memory such as RAM when the program runs. In either embodiment, the phoneme data and the probability model are stored in the RAM or the like of the information processing apparatus 100 at the point the program is invoked, and are used by the program.

FIG. 4 shows the aligned case set 400 that the preprocessor 124 acquires in step S301 of FIG. 3, together with the data structure of its elements. The aligned case set is collected and selected on the program developer's side as base data for creating the program's execution data, and is generated by associating kana with alphabet letters phoneme by phoneme. In the embodiment shown in FIG. 4, katakana are associated phoneme by phoneme with alphabet strings such as information, george, smith, and clinton. In FIG. 4 the associations are delimited by brackets (「」), but any delimiter may be used, such as a space, comma, colon, semicolon, or slash, as long as the preprocessor 124 can identify it and it is a code other than kana or alphabet characters.
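As an illustration of how such a delimited aligned case might be read, the following sketch parses one line into (kana phoneme, alphabet phoneme) pairs. The line format used here (pairs separated by spaces, with a slash between the kana and alphabet parts) is an assumption made for the example; the exact layout of FIG. 4 is not reproduced in the text, and `parse_aligned_case` is a hypothetical helper.

```python
def parse_aligned_case(line, pair_sep=" ", item_sep="/"):
    """Split one aligned case into (kana, alphabet) phoneme pairs.

    Assumed format: "kana/alpha kana/alpha ...". Both separators are
    configurable because the description allows any delimiter that is
    neither a kana nor an alphabet character.
    """
    pairs = []
    for chunk in line.split(pair_sep):
        if not chunk:
            continue  # tolerate repeated separators
        kana, alpha = chunk.split(item_sep)
        pairs.append((kana, alpha))
    return pairs

case = parse_aligned_case("イ/I ン/N フォ/FOR メー/MA ショ/TIO ン/N")
kana_string = "".join(k for k, _ in case)
alphabet_string = "".join(a for _, a in case)
print(kana_string, alphabet_string)
```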

The more elements the aligned case set shown in FIG. 4 contains, the more accurate the generated cost model becomes, enabling more accurate character string conversion. It is also preferable to select, as the aligned case set, pairs of alphabet strings and kana strings in which characteristic cost values appear as often as possible during conversion: for example, strings such as "ex", "ox", "ign", and "kno", where the number of phonemes changes after conversion, and strings containing so-called silent letters, which are written but not pronounced.

FIG. 5 is a conceptual diagram of the process, executed in step S302 of FIG. 3, that calculates the alignment cost for an aligned case. The alignment cost is calculated by generating a path map 500 from the alphabet phonemes associated with the kana phonemes. Calculating alignment costs from aligned cases is a process for producing a cost set suited to assigning alphabet phonemes to kana phonemes optimally, particularly when the numbers of phonemes do not match in conversion.

Cases where phonemes cannot be matched one to one between kana and the alphabet can be illustrated concretely. "IN" converts to the kana phonemes "イ" + "ン", so the phoneme counts match with no excess or shortfall. By contrast, the alphabet string "OX" is pronounced in katakana as "オッ" + "ク" + "ス", so there are more katakana characters than alphabet characters, and the association must take the change in phoneme count into account. Conversely, "FOR" consists of the alphabet phonemes "F" + "O" + "R" but is the single kana phoneme "フォ", so the alphabet side has more phonemes.

To perform this association, the present embodiment defines, on the path map 500, cells 510 specified by the unit phonemes of each string, from the beginning to the end of each string, and defines unit paths along the cells. In this embodiment the unit paths are shown as diagonal paths 520, vertical paths 530, and horizontal paths 540. A cost is assigned to each way of advancing along the path, and the cost model is handled as a path search problem of finding the minimum-cost path.

In the embodiment described, the ways of advancing in the path search are limited to the following three rules.
(1) When a kana phoneme and an alphabet phoneme are associated with each other, advance along the diagonal path 520 that crosses the cell diagonally.
(2) When the kana phoneme is shorter than the alphabet phoneme, advance along the vertical path 530 of the cell.
(3) When the kana phoneme is longer than the alphabet phoneme, advance along the horizontal path 540 of the cell.

For a diagonal path there is no problem with the association, so the minimum cost C_min is assigned. An association that advances along a vertical path corresponds to cases such as a silent letter in an Indo-European language, or a kana geminate consonant or long vowel arising from syllable structure; the match is weaker but not a serious mismatch, so a cost C_med higher than that of an exact match is assigned. Conversely, because the same phoneme often appears more than once in the same word, in both the alphabet and kana, the horizontal path should cost more than the other paths in order to exclude associations between inappropriately distant phonemes. On the other hand, as noted above, there are cases such as the alphabet string "OX" where the number of kana characters grows within linguistically reasonable bounds, so excluding associations that include horizontal paths altogether would be inappropriate. The horizontal path is therefore assigned the cost C_max, higher than the other two. Based on these path search rules, the cost of an aligned case is calculated using formula (1) below.

In formula (1), C_TOTAL is the total score, and the suffixes a, b, and c are the numbers of diagonal, vertical, and horizontal paths, respectively, counted per cell.
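Formula (1) itself is not reproduced in this text. One form consistent with the definitions given here, offered as an assumption rather than as the original notation, weights each path type by its count:

```latex
C_{\mathrm{TOTAL}} = a \, C_{\min} + b \, C_{\mathrm{med}} + c \, C_{\max},
\qquad C_{\min} < C_{\mathrm{med}} < C_{\max}
```

Here a, b, and c are the numbers of diagonal, vertical, and horizontal unit paths on the route, and the ordering of the three cost values follows from the cost assignments described for each path type.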

For the elements of the aligned case set, the path is uniquely determined, so the total cost is analyzed for each aligned case, and the cost values C_min, C_med, and C_max are set so that the total cost divided by the number of phonemes is lowest for a perfectly matching path, intermediate for a path that includes a vertical path, and highest for a path that includes a horizontal path.

In the embodiment described above, fixed costs are assigned to the vertical, horizontal, and diagonal directions. In another embodiment, a different (horizontal) cost can be assigned per alphabet letter, a different (vertical) cost per katakana character, and a different (diagonal) cost per alphabet-kana pair. In that embodiment, with the cost of an edge x on the path (whether vertical, horizontal, or diagonal) denoted C_x, the total score given by formula (1′) below can be used in place of formula (1).
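Formula (1′) is likewise not reproduced in this text. Under the definition of C_x given here, a plausible form, stated as an assumption, is the sum of the edge costs along the path:

```latex
C_{\mathrm{TOTAL}} = \sum_{x \,\in\, \text{path}} C_x
```

With C_x fixed to one of three constants according to the direction of edge x, this reduces to the fixed-cost case.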

The path cost calculation is explained with reference to FIG. 5. In the embodiment shown there, the kana string is インフォメーション and the alphabet string is INFORMATION. In the aligned case, the string is aligned in advance as イ=I, ン=N, フォ=FOR, メー=MA, ショ=TIO, ン=N, so vertical paths 530 are uniquely assigned for "FOR", "MA", and "TIO", and diagonal paths 520 are assigned elsewhere.
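The total cost of this aligned case can be worked through with a short sketch. It rests on two assumptions that the text does not confirm: that the first letter of each alphabet chunk is reached by a diagonal step and each additional letter by a vertical step, and that the illustrative cost values are 1, 2, and 3 for diagonal, vertical, and horizontal steps. FIG. 5 may count steps differently. The helper names are hypothetical.

```python
def path_counts(aligned_pairs):
    """Count diagonal (a), vertical (b) and horizontal (c) steps.

    Assumption: a non-empty alphabet chunk costs one diagonal step plus
    one vertical step per extra letter; a kana phoneme with no letter of
    its own (empty chunk) costs one horizontal step.
    """
    a = b = c = 0
    for _kana, alpha in aligned_pairs:
        if alpha:
            a += 1
            b += len(alpha) - 1
        else:
            c += 1
    return a, b, c

def total_cost(aligned_pairs, c_min=1, c_med=2, c_max=3):
    # Weighted sum of step counts with illustrative cost values.
    a, b, c = path_counts(aligned_pairs)
    return a * c_min + b * c_med + c * c_max

information = [("イ", "I"), ("ン", "N"), ("フォ", "FOR"),
               ("メー", "MA"), ("ショ", "TIO"), ("ン", "N")]
print(path_counts(information), total_cost(information))
```

Under these assumptions the six kana phonemes give six diagonal steps, and the extra letters in "FOR", "MA", and "TIO" give five vertical steps.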

In steps S305 and S306 of FIG. 3, on the other hand, a path search is executed on the unaligned case set, and the alignment that gives the minimum total cost is taken as the optimal association and learned so that the conversion probabilities and transition probabilities can be calculated. An unaligned case is defined as a data set {alphabet string, kana string} to be associated. The case where the unaligned case is インフォメーション ("information") is described below with reference to FIG. 5.

For the kana string インフォメーション, referring to the phoneme data yields the kana phoneme sequence "イ", "ン", "フォ", "メー", "ショ", "ン". For the alphabet string, the phoneme data storage unit 126 is looked up, the alphabet phonemes that were registered when the aligned cases were processed and are associated with the same kana phonemes are enumerated to generate phoneme sequences, and a plurality of path maps are created.

The path cost is then calculated for each of these path maps using the rules described with FIG. 5. For example, during alignment by path search in step S306, paths 560 and 570 are explored in addition to path 550. Because the calculated C_TOTAL is smallest for path 550, path 550 is adopted as the optimal path, as in the aligned case, and paths 560 and 570 are discarded.
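A minimum-cost path search of this kind can be sketched as a dynamic program over the grid of kana phonemes and alphabet letters, in the style of an edit-distance calculation. This is an illustrative substitute, not the procedure of the embodiment: it treats the alphabet side letter by letter, allows a diagonal step only for pairs listed in a hypothetical `compatible` set, and uses the illustrative costs 1, 2, and 3.

```python
C_MIN, C_MED, C_MAX = 1, 2, 3  # illustrative diagonal/vertical/horizontal costs

def align(kana, letters, compatible):
    """Return (minimum total cost, list of moves) for one unaligned case.

    diag : consume one kana phoneme and one letter (pair must be compatible)
    vert : consume one letter only (kana side shorter)
    horiz: consume one kana phoneme only (kana side longer)
    """
    inf = float("inf")
    n, m = len(kana), len(letters)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            steps = []
            if i < n and j < m and (kana[i], letters[j]) in compatible:
                steps.append((i + 1, j + 1, C_MIN, "diag"))
            if j < m:
                steps.append((i, j + 1, C_MED, "vert"))
            if i < n:
                steps.append((i + 1, j, C_MAX, "horiz"))
            for ni, nj, c, name in steps:
                if cost[i][j] + c < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + c
                    back[ni][nj] = (i, j, name)
    # Walk the back-pointers from the end cell to recover the path.
    moves, i, j = [], n, m
    while (i, j) != (0, 0):
        i, j, name = back[i][j]
        moves.append(name)
    return cost[n][m], moves[::-1]

kana = ["イ", "ン", "フォ", "メー", "ショ", "ン"]
compatible = {("イ", "I"), ("ン", "N"), ("フォ", "F"), ("メー", "M"), ("ショ", "T")}
best_cost, moves = align(kana, list("INFORMATION"), compatible)
print(best_cost)  # 16
print(moves)
```

Because every cell is reached only from its left, upper, and upper-left neighbours, a single row-by-row pass is enough, and candidate paths that cost more are discarded implicitly rather than enumerated one by one.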

Besides the embodiment described with FIG. 5, the path search for alignment can also be executed by applying, for example, the Viterbi algorithm with appropriate conditions set.

The probability model generation process of this embodiment is described further. In step S308 of FIG. 3, the assignment probability π of the alphabet phoneme corresponding to each kana phoneme and the transition probability P between consecutive phonemes are generated by statistical analysis of the learned kana phoneme-alphabet phoneme associations. FIG. 6 is a conceptual diagram illustrating an embodiment of alignment learning using the unaligned case set. By analyzing the aligned cases, the information processing apparatus 100 has generated the cost model 650 and stored it in an appropriate storage area. For the kana phoneme ア, the cost model 650 records that the preceding alphabet character may be a, e, o, r, or u, and, when the subsequent character is e, h, or r, registers costs of 1 and 2, respectively (corresponding, for example, to aero and our (アワ), respectively). Costs are registered in the same way for the kana phoneme イ.

Here, for purposes of illustration, cost = 1 is the cost of a diagonal path, cost = 2 is the cost of a vertical path, and cost = 3 is the cost of a horizontal path. The cost model 650 is applied to the candidate alignments of each unaligned case to calculate the total cost, and, as described above, the path that minimizes the total cost among the possible paths is learned as the kana phoneme-alphabet phoneme association. Then, as shown in FIG. 6, the alignment results accumulated through learning are used to generate the probability model 700 as a transition probability table.

FIG. 7 shows the data structure of an embodiment of the probability model 700 generated by the alignment learning of FIG. 6. In the probability model 700 shown in FIG. 7, each alphabet phoneme is associated with the kana phonemes that correspond to it and their association probabilities π, and with the subsequent alphabet phonemes and their transition probabilities P. The kana phonemes associated with the subsequent alphabet phonemes are omitted from FIG. 7.

As shown in FIG. 7, when the preceding alphabet phoneme is "a", the kana phonemes "ア", "エ", "エー", "ヤ", "アッ", and so on are registered, each with its conversion probability π. The fields to the right register the subsequent alphabet phonemes, such as "n", "l", "c", "s", and "li", together with the transition probability P that each appears after the preceding alphabet phoneme "a".

This completes the phoneme data generation and transition probability generation executed by the preprocessor 124 shown in FIGS. 1 and 2. The information processing apparatus 100 uses the probability model 700 generated by the above processing to execute kana string-to-alphabet string conversion. The character string conversion process executed by the information processing apparatus is described below with reference to FIGS. 1 and 8.

The state transition diagram 800 describes the association with alphabet phonemes, based on the probability model shown in FIG. 7, as state transitions in a hidden Markov model. As shown in the state transition diagram 800, in the hidden Markov model the probability that the kana phoneme "イ", which belongs to the observation sequence, is associated with the alphabet phoneme "I", which belongs to the state transition sequence, is determined as π_1 by looking up the probability model 700. After the first phoneme has been associated, the kana phoneme "ン" is processed: the probability model 700 is looked up for "N", which follows "イ", to obtain the values of P_12 and π_2, and thereafter π_3, P_23, and so on are obtained in turn by looking up the probability model 700.

In FIG. 8, P_ij is the probability that a transition to state S_j occurs from state S_i, and π_i is the probability of conversion to the destination phoneme x in state S_i; these can be formulated by formula (2) below.

The conversion likelihood calculation unit 122 then takes the logarithm of each acquired probability value and calculates the conversion likelihood χ from the beginning to the end of the kana string using formula (2) below.
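The formulas referred to here are not reproduced in this text. Forms consistent with the definitions of P_ij, π_i, and χ, offered as assumptions and written for a kana phoneme sequence of length T, are:

```latex
P_{ij} = \Pr\left(S_j \mid S_i\right), \qquad
\pi_i(x) = \Pr\left(x \mid S_i\right)

\chi = \sum_{t=1}^{T} \log \pi_t + \sum_{t=1}^{T-1} \log P_{t,\,t+1}
```

Summing logarithms rather than multiplying probabilities keeps χ numerically stable for long strings, and a larger χ corresponds to a more probable conversion.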

The maximum likelihood phoneme sequence determination unit 118 shown in FIG. 1 then acquires the conversion likelihood χ calculated by the conversion likelihood calculation unit 122 and obtains, as the state transition sequence, the alphabet phoneme sequence that maximizes χ. It passes that sequence to the alphabet output unit 130 as the maximum likelihood phoneme sequence, and the character string conversion process ends.
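The search for the alphabet phoneme sequence that maximizes χ can be sketched with the standard Viterbi algorithm. The sketch below is an assumption about one possible implementation, not the embodiment itself: it treats alphabet phonemes as hidden states and kana phonemes as observations, takes no separate initial-state probability, and uses invented toy values for π and P. The function name `most_likely_sequence` is hypothetical, and the kana sequence is assumed to be non-empty.

```python
import math

def most_likely_sequence(kana_seq, pi, P):
    """Viterbi search maximizing the sum of log pi and log P terms.

    pi[state][kana]      : conversion probability (state = alphabet phoneme)
    P[prev_state][state] : transition probability between alphabet phonemes
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    states = list(pi)
    # best[s] = (score of the best path ending in s, that path)
    best = {s: (log(pi[s].get(kana_seq[0], 0.0)), [s]) for s in states}
    for kana in kana_seq[1:]:
        new_best = {}
        for s in states:
            # Pick the predecessor that gives the highest score into s.
            score, path = max(
                ((best[p][0] + log(P.get(p, {}).get(s, 0.0)), best[p][1])
                 for p in states),
                key=lambda item: item[0],
            )
            new_best[s] = (score + log(pi[s].get(kana, 0.0)), path + [s])
        best = new_best
    score, path = max(best.values(), key=lambda item: item[0])
    return path, score

# Invented toy probabilities, for illustration only.
toy_pi = {"I": {"イ": 0.8, "アイ": 0.2},
          "E": {"イ": 0.3, "エ": 0.7},
          "N": {"ン": 1.0}}
toy_P = {"I": {"N": 0.6, "E": 0.4},
         "E": {"N": 0.2, "I": 0.8},
         "N": {"I": 0.5, "E": 0.5}}
path, chi = most_likely_sequence(["イ", "ン"], toy_pi, toy_P)
print(path)  # ['I', 'N']
```

Keeping only the best path into each state at every step avoids enumerating all candidate sequences, which is what makes the maximum-likelihood search tractable for longer kana strings.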

FIG. 9 shows an embodiment of an information processing system 900 in which the information processing apparatus 100 of this embodiment is implemented as a web server 910 and which can additionally handle multiple Indo-Western European languages. The information processing system 900 of FIG. 9 is described as not including a function corresponding to the preprocessor 124, with the probability model stored in the hard disk device 990 as execution data together with the program for executing character string conversion.

The server 910 can also be configured to include a function corresponding to the preprocessor 124. In that case, it can receive, via the network, aligned case sets and unaligned case sets suited to a particular user and provide a character string conversion service customized per user, for example by field of specialization.

The information processing system 900 is described below with reference to FIG. 9. The information processing system 900 provides a web service. The web server 910 includes a network adapter 930, a CGI (Common Gateway Interface) 940 for adapting various requests to the format of the server program, and a kana character acquisition unit 950. The network adapter 930 includes a network interface card (NIC), receives requests via a network 920 such as the Internet, and returns the processing results of the web server 910 via the network 920 to a remotely connected web client (not shown).

The web server 910 also manages various databases 990. The database 990 includes a language type estimation morpheme dictionary 990a, used to estimate the Indo-Western European language type of a string expressed in kana, and a per-language probability model storage unit 990b.

The kana character acquisition unit 950 acquires, via the CGI 940, the kana string contained in a character string conversion request from the web client. The kana character acquisition unit 950 looks up the language type estimation morpheme dictionary 990a, searches for characteristic forms corresponding to the kana string, and estimates the language type. In another embodiment, the user can include data specifying the language type in the character string conversion request; in that case, the kana character acquisition unit 950 sends that data to the phoneme decomposition unit 960 and other units, and character string conversion is executed.

On acquiring the kana string and the language type data, the kana character acquisition unit 950 passes the kana string to the phoneme decomposition unit 960. The phoneme decomposition unit 960 decomposes the received kana string into phonemes by looking up the phoneme data storage unit and passes the result to the conversion likelihood calculation unit 970. The conversion likelihood calculation unit 970 retrieves the probability model registered for the estimated language type from the per-language probability model storage unit 990b and calculates the transition probabilities.

The maximum likelihood character type sequence determination unit 980 is implemented with the same functions as the maximum likelihood alphabet sequence determination unit 118 shown in FIG. 1, and determines the phoneme sequence that gives the maximum value of the conversion likelihood χ calculated by the conversion likelihood calculation unit 970 as the maximum likelihood character type sequence. It then returns the result, the character type sequence corresponding to the kana string the web client requested, to the web client via the network adapter 930.

In another embodiment, when the kana string is a personal name, the maximum likelihood character type sequence determination unit 980 can send the generated character string sequence to a GNA (Global Name Analytics) server 995 to execute a personal name search. An example of GNA is the system or server described at http://publibfp.boulder.ibm.com/epubs/pdf/c1912860.pdf.

The information processing system 900 shown in FIG. 9 can identify the language type of a kana string sent by a user and convert it into the corresponding Indo-Western European language, providing broad language coverage. In an implementation that includes the preprocessor, character string conversion can be customized for the user, and an efficient web service can be provided without creating a separate web server or probability model for each user.

Furthermore, when the converted Indo-Western European sequence is a personal name, the system can serve as an input interface to the personal name search system 995, so search results can be returned efficiently for important applications such as individual search, name matching, and money laundering detection.

FIG. 10 is a flowchart of an embodiment of the character conversion process executed by the web server 910 of this embodiment. The process shown in FIG. 10 depicts the kana-to-alphabet character string conversion request and the alignment request (block B) side by side, but each request can be processed on its own.

The process of FIG. 10 starts at step S1000. In step S1001, the web server 910 receives a request containing either a kana string alone or both a kana string and an alphabet string. A request containing only a kana string is a character string conversion request and, as described above, may be accompanied by data for estimating the language type. A request containing both a kana string and an alphabet string is an alignment request: the path search shown in FIG. 5 is executed, and the alignment pair with the minimum total cost is returned as the response.

In step S1002, the kana string is decomposed into phonemes and, if an alphabet string is present, the alphabet string is decomposed as well. In step S1003, it is determined whether both a kana string and an alphabet string are present. If both are present (yes), the received request is judged to be an alignment request and the process branches to block B. If the request does not contain both (no), the likelihood χ is calculated in step S1004 using the probability model corresponding to the language type. In step S1005, the string with the greatest likelihood χ is obtained. In step S1006, the conversion result is produced as a structured document in a format suitable for display, such as RSS or a table, and output. The process ends in step S1007.
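The branch taken in step S1003 can be sketched as a small dispatch function. The request is modelled here as a dictionary with the hypothetical keys `kana` and `alphabet`; the specification does not define a request format, so both the keys and the function name `classify_request` are assumptions made for illustration.

```python
def classify_request(request):
    """Decide which branch of the FIG. 10 flow a request would take.

    Returns "alignment" when both strings are present (block B) and
    "conversion" when only a kana string is present.
    """
    kana = request.get("kana")
    alphabet = request.get("alphabet")
    if not kana:
        # Both request types described in the text include a kana string.
        raise ValueError("a kana string is required")
    if alphabet:
        return "alignment"   # block B: steps S1008 to S1009
    return "conversion"      # steps S1004 to S1006

print(classify_request({"kana": "インフォメーション"}))
print(classify_request({"kana": "インフォメーション", "alphabet": "INFORMATION"}))
```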

As described above, in this embodiment kana phonemes and alphabet phonemes are associated directly through transition probabilities, and the maximum likelihood character string conversion is performed using the hidden Markov model (HMM) method. Hardware resources are therefore used efficiently, and high-accuracy search becomes possible in a more direct manner. In addition, simply generating a probability model for each language enables flexible character string conversion according to the language type.
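The maximum likelihood conversion described above can be illustrated with a small Viterbi search. The sketch below is not the implementation of the embodiment: the function name `viterbi_convert`, the table layout, the start symbol `<s>`, and all probability values are assumptions chosen only to make the example runnable.

```python
import math


def viterbi_convert(kana_phonemes, emit, trans, start="<s>"):
    """Return (best alphabet phoneme sequence, its likelihood).

    emit[k][a]       : assumed conversion probability of kana phoneme k to
                       alphabet phoneme a
    trans[(prev, a)] : assumed transition probability between consecutive
                       alphabet phonemes
    """
    # best[state] = (log-likelihood, path) of the best path ending in state
    best = {start: (0.0, [])}
    for k in kana_phonemes:
        step = {}
        for a, p_emit in emit.get(k, {}).items():
            for prev, (logp, path) in best.items():
                p_trans = trans.get((prev, a), 0.0)
                if p_emit <= 0.0 or p_trans <= 0.0:
                    continue  # impossible step under this model
                score = logp + math.log(p_emit) + math.log(p_trans)
                if a not in step or score > step[a][0]:
                    step[a] = (score, path + [a])
        if not step:
            raise ValueError(f"no conversion path for phoneme {k!r}")
        best = step
    logp, path = max(best.values(), key=lambda entry: entry[0])
    return path, math.exp(logp)


# Toy tables (illustrative values, not taken from the embodiment).
emit = {"マ": {"ma": 1.0}, "ン": {"n": 0.7, "m": 0.3}}
trans = {("<s>", "ma"): 1.0, ("ma", "n"): 0.8, ("ma", "m"): 0.2}
path, likelihood = viterbi_convert(["マ", "ン"], emit, trans)
print("".join(path), round(likelihood, 4))
```

Working in log space avoids numerical underflow on long phoneme sequences; the likelihood is converted back to a probability only at the end.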

The process of block B corresponds to an alignment request. In step S1008, a route search is performed using the result of the phoneme decomposition, and the alignment with the minimum total cost is determined. Then, in step S1009, a structured document is created in a format for displaying the search result and is output, and the process ends in step S1007. The process described for block B is the same as the automatic alignment process for unaligned cases described with reference to FIG. 5.
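The minimum-cost route search of step S1008 can be sketched as a dynamic program over a grid of kana phonemes and alphabet characters. The following is an illustrative sketch under stated assumptions: the function `align`, the cost table, the default penalty, and the limit of three alphabet characters per kana phoneme are all hypothetical and are not specified by the embodiment.

```python
def align(kana, alpha, cost, max_len=3, default=5.0):
    """Align each kana phoneme with 1..max_len alphabet characters.

    Returns (list of (kana phoneme, alphabet chunk) pairs, total cost) for
    the route with the minimum total cost.
    cost[(k, chunk)] is an assumed unit cost; unseen pairs cost `default`.
    """
    inf = float("inf")
    n, m = len(kana), len(alpha)
    # dp[i][j]: minimum cost of aligning kana[:i] with alpha[:j]
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[0] * (m + 1) for _ in range(n + 1)]  # chunk length used to reach (i, j)
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for length in range(1, min(max_len, j) + 1):
                prev = dp[i - 1][j - length]
                if prev == inf:
                    continue
                total = prev + cost.get((kana[i - 1], alpha[j - length:j]), default)
                if total < dp[i][j]:
                    dp[i][j] = total
                    back[i][j] = length
    if dp[n][m] == inf:
        raise ValueError("no alignment exists for these inputs")
    # Trace the cheapest route back from the end to the beginning.
    pairs, j = [], m
    for i in range(n, 0, -1):
        length = back[i][j]
        pairs.append((kana[i - 1], alpha[j - length:j]))
        j -= length
    pairs.reverse()
    return pairs, dp[n][m]


# Toy cost model (illustrative values only).
cost = {("マ", "ma"): 0.1, ("ン", "n"): 0.1, ("チェ", "che"): 0.2,
        ("ス", "s"): 0.1, ("ター", "ter"): 0.2}
pairs, total = align(["マ", "ン", "チェ", "ス", "ター"], "manchester", cost)
print(pairs, round(total, 2))
```

Each grid cell is visited once, so the search is linear in the product of the two sequence lengths for a fixed `max_len`.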

The block B embodiment described above can be used when a web client wants to check an alignment, or when an administrator of the web server 910 or the like wants to check the accuracy of the alignment process; it can thus be used to verify and calibrate alignment accuracy.

The character string conversion process of this embodiment is described with reference to FIGS. 11 to 16. FIG. 11 shows an embodiment of a graphical user interface (GUI) 1100 displayed on the web client 925 in the character string conversion method of this embodiment. The GUI 1100 indicates that it performs katakana-to-romaji conversion. Romaji is a notation system, specified by the Hepburn system, the Kunrei system, and the like, for writing Japanese in alphabetic letters corresponding to its pronunciation; in effect, conversion to the alphabet is performed.

The GUI 1100 displays a field 1110 for entering kana and a field 1120 for specifying the upper limit on the number of romaji conversion candidates to be retrieved. After entering the character string and the upper limit in the respective fields, the user clicks the "OK" button to send a character string conversion request to the web server 910. The web server 910 executes the character conversion process described above and returns the processing result to the web client 925. In the embodiment shown in FIG. 11, the kana character string is "インフォメーション" ("information").
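The effect of the upper limit entered in field 1120 can be sketched as a simple top-N selection over scored candidates. The function `top_candidates` and the sample scores below are hypothetical and serve only as an illustration.

```python
import heapq


def top_candidates(scored, limit):
    """Keep at most `limit` (text, likelihood) pairs, highest likelihood first."""
    if limit < 1:
        raise ValueError("the upper limit must be at least 1")
    return heapq.nlargest(limit, scored, key=lambda pair: pair[1])


# Illustrative scores only; they are not results of the embodiment.
scored = [("a", 0.1), ("b", 0.5), ("c", 0.3)]
print(top_candidates(scored, 2))
```

`heapq.nlargest` returns the selected items already sorted in descending order of the key, so no separate sort is needed before display.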

FIG. 12 shows an embodiment of a GUI 1200 displayed by the web client 925 upon receiving the processing result from the web server 910. As shown in FIG. 12, the web server 910 creates the conversion result in RSS format so that it can easily be referenced and searched later, creates a structured document such as HTML or XML, and sends it to the web client 925. FIG. 12 shows that the kana to be converted is "インフォメーション", and the subsequent lines display the alphabet conversion results together with the value of the conversion likelihood χ. As shown in FIG. 12, the alphabetic character string with the highest likelihood is "information", indicating that the character conversion is performed with sufficient accuracy.
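The creation of an RSS-format result document can be sketched with the Python standard library. The element layout below (one `item` per candidate, the likelihood in `description`) is an assumption made for illustration; the embodiment does not specify the exact RSS structure, and the likelihood assigned to the second candidate is a made-up value.

```python
import xml.etree.ElementTree as ET


def results_to_rss(kana, candidates):
    """Serialize (text, likelihood) candidates as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Conversion results for {kana}"
    # List candidates in descending order of likelihood.
    for text, likelihood in sorted(candidates, key=lambda c: c[1], reverse=True):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = text
        ET.SubElement(item, "description").text = f"likelihood={likelihood:.4f}"
    return ET.tostring(rss, encoding="unicode")


# The value 0.05 is illustrative only.
doc = results_to_rss("マンチェスター", [("mancester", 0.05), ("manchester", 0.22)])
print(doc)
```

Building the document through an XML library rather than string concatenation ensures that any special characters in the candidates are escaped correctly.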

FIG. 13 shows a GUI 1300 that displays another embodiment of alphabet conversion, for the case where the kana character string is "マンチェスター" ("Manchester"). FIG. 13 shows that the correct result is given for the input kana character string "マンチェスター" with a likelihood of about χ = 0.22. It also shows that a sufficient likelihood difference is secured relative to "mancester", which has the second-highest likelihood.

FIG. 14 shows a GUI 1400 of yet another embodiment provided by the web server 910. The embodiment of FIG. 14 is an embodiment of the alignment process of this embodiment, and is executed when the web server 910 receives an alignment request that includes both a kana character string and an alphabetic character string. When the user enters a kana character string and an alphabetic character string in the GUI 1400 and clicks the "OK" button, an alignment request including "インフォメーション" and "information" is sent. On acquiring the character strings, the web server 910 performs phoneme decomposition and executes the route search shown in FIG. 5.

As a result of the route search, the web server 910 finds the association between kana phonemes and alphabet phonemes with the minimum total cost, describes that association in RSS format as the alignment result, and creates a structured document.

FIG. 15 shows a GUI 1500 that displays the alignment result sent from the web server 910 to the web client 925. As shown in FIG. 15, the alignment result with the minimum total cost is displayed in RSS format, indicating that good alignment accuracy is obtained. FIG. 16 shows the alignment result generated when "マンチェスター" and "manchester" are entered in a similar alignment request. As shown in FIG. 16, the kana phonemes and alphabet phonemes are associated well, which shows that the character string conversion process of this embodiment, which associates kana phonemes and alphabet phonemes directly, also enables high-accuracy alphabetic character string conversion or alphabetic character string search.

The above functions of this embodiment can be realized by a device-executable program written in an object-oriented programming language or the like, such as C++, Java (registered trademark), JavaBeans (registered trademark), Java (registered trademark) Applet, JavaScript (registered trademark), Perl, or Ruby. The program can be stored in and distributed on a device-readable recording medium such as a hard disk device, CD-ROM, MO, flexible disk, EEPROM, or EPROM, and can be transmitted over a network in a format usable by other devices.

Although this embodiment has been described above, the present invention is not limited to the embodiment described. It may be modified within the scope conceivable by those skilled in the art, including other embodiments, additions, changes, and deletions, and any such aspect is included in the scope of the present invention as long as it provides the operation and effects of the present invention.

FIG. 1 is a functional block diagram of the information processing apparatus of this embodiment.
FIG. 2 is a functional block diagram of the preprocessor shown in FIG. 1.
FIG. 3 is a flowchart of the probability model generation process.
FIG. 4 is a diagram showing an embodiment of the data structure of an aligned case set.
FIG. 5 is a diagram showing an embodiment of a route map.
FIG. 6 is a conceptual diagram explaining an embodiment of alignment learning using an unaligned case set.
FIG. 7 is a diagram showing the data structure of an embodiment of the probability model generated by the alignment learning of FIG. 6.
FIG. 8 is a state transition diagram describing the association with alphabet phonemes as state transitions in a hidden Markov model.
FIG. 9 is a functional block diagram of the information processing system of this embodiment.
FIG. 10 is a flowchart of an embodiment of the character conversion process executed by the web server of this embodiment.
FIG. 11 is a diagram showing an embodiment of the GUI displayed on the web client in the character string conversion method of this embodiment.
FIG. 12 is a diagram showing an embodiment of the GUI displayed by the web client that has received the web server's processing result for the kana character string "インフォメーション".
FIG. 13 is a diagram showing a GUI that displays another embodiment of alphabet conversion, for the case where the kana character string is "マンチェスター".
FIG. 14 is a diagram showing a GUI of yet another embodiment provided by the web server.
FIG. 15 is a diagram showing a GUI that displays the alignment result sent from the web server to the web client.
FIG. 16 is a diagram showing a GUI that displays the alignment result sent from the web server to the web client.

Explanation of symbols

100 … information processing apparatus, 112 … input unit, 114 … input/output interface, 116 … kana character acquisition unit, 118 … maximum likelihood phoneme sequence determination unit, 120 … phoneme decomposition unit, 122 … conversion likelihood calculation unit, 124 … preprocessor, 126 … phoneme data storage unit, 128 … probability model storage unit, 200 … preprocessor (functional block), 210 … aligned cases, 212 … unaligned cases, cost model generation unit, 216 … alignment processing unit, 220 … alignment result storage unit, 222 … probability model generation unit, 400 … aligned case set, 500 … route map, 600 … unaligned case set, 650 … cost model, 700 … probability model, 800 … state transition diagram, 900 … information processing system

Claims

An information processing apparatus for converting an original character string into a character string of another language, the apparatus comprising:
an original character acquisition unit that acquires the original character string;
a phoneme decomposition unit that decomposes the acquired original character string into original-character phonemes with reference to phoneme data of the original characters;
a conversion likelihood calculation unit that, for the original-character phoneme sequence generated by the phoneme decomposition unit, refers to a probability model that corresponds to the original-character phonemes and is generated by learning, and calculates a conversion likelihood using transition probabilities for consecutive phoneme sequences; and
a maximum likelihood phoneme sequence determination unit that determines and outputs a maximum likelihood phoneme sequence of the other language with reference to the conversion likelihood calculated by the conversion likelihood calculation unit,
wherein the probability model is a transition probability table that registers transition probabilities for consecutive phoneme sequences, the table being generated by generating a cost model that associates the original-character phonemes with conversion-destination phonemes using aligned cases in which the phonemes of the original character string and of the other language are aligned, and by learning the alignment of unaligned original character strings according to the cost model.

The information processing apparatus according to claim 1, wherein the conversion likelihood calculation unit calculates the conversion likelihood to a conversion-destination phoneme sequence according to a hidden Markov model that uses the conversion probabilities of the conversion-destination phonemes appearing between the beginning and the end of the original character string and the transition probabilities between consecutive conversion-destination phonemes.

The information processing apparatus according to claim 2, wherein the alignment is learned by a route search along cells each having a phoneme as a unit.

The information processing apparatus according to claim 3, wherein the route search uses the Viterbi algorithm or a search that proceeds from the beginning to the end of the original character string, accumulates along the route the costs assigned to the unit routes along the cells, and finds the route having the smallest total cost.

The information processing apparatus according to claim 1, further comprising a preprocessor that generates the probability model, wherein the preprocessor includes: a cost model generation unit that generates a cost model by associating the original-character phonemes with the conversion-destination phonemes using aligned cases in which the phonemes of the original character string and of the other language are aligned; an alignment processing unit that learns the alignment of unaligned original character strings according to the cost model in order to generate the probability model; and a probability model generation unit that calculates the transition probabilities using the output of the alignment processing unit and registers them as a transition probability table.

A character string conversion method executed by an information processing apparatus for converting an original character string into a character string of another language, the method causing the information processing apparatus to execute the steps of:
reading phoneme data from a phoneme data storage unit that registers phoneme data of the original characters, and reading, from a probability model storage unit, a probability model that corresponds to the original-character phonemes and is generated by learning;
acquiring the original character string;
decomposing the acquired original character string into original-character phonemes with reference to the phoneme data;
processing the original-character phoneme sequence generated by the decomposing step from the beginning to the end of the original character string with reference to the probability model, and calculating a conversion likelihood using the transition probabilities for consecutive phoneme sequences; and
determining and outputting a maximum likelihood phoneme sequence with reference to the conversion likelihood calculated by the calculating step,
wherein the probability model is a transition probability table that registers transition probabilities for consecutive phoneme sequences, the table being generated by generating a cost model that associates the original-character phonemes with conversion-destination phonemes using aligned cases in which the phonemes of the original character string and of the other language are aligned, and by learning the alignment of unaligned original character strings according to the cost model.

The character string conversion method according to claim 6, wherein the calculating step includes calculating the conversion likelihood to a conversion-destination phoneme sequence according to a hidden Markov model that uses the conversion probabilities of the conversion-destination phonemes appearing between the beginning and the end of the original character string and the transition probabilities between consecutive conversion-destination phonemes.

The character string conversion method according to claim 7, further comprising the step of generating the probability model in advance as a transition probability table by generating a cost model that associates the original-character phonemes with the conversion-destination phonemes using aligned cases in which the phonemes of the original character string and of the other language are aligned, and by learning and registering the alignment of unaligned original character strings according to the cost model.

The pre-generating step includes a Viterbi algorithm or a cost assigned to a unit route along a cell in units of each phoneme from the beginning to the end of the original character string. The character string conversion method according to claim 8, further comprising a step of searching for a route so as to give the route having a minimum total cost by integrating using a number.

A computer-executable program for causing an information processing apparatus to execute a character string conversion method for converting an original character string into a character string of another language, the program causing the information processing apparatus to execute the steps of:
generating in advance a probability model as a transition probability table that registers transition probabilities for consecutive phoneme sequences, by generating a cost model that associates original-character phonemes with conversion-destination phonemes using aligned cases in which the phonemes of the original character string and of the other language are aligned, and by learning and registering the alignment of unaligned original character strings according to the cost model;
reading phoneme data from a phoneme data storage unit that registers phoneme data of the original characters, and reading, from a probability model storage unit, the probability model that corresponds to the original-character phonemes and is generated by learning;
acquiring the original character string;
decomposing the acquired original character string into original-character phonemes with reference to the phoneme data;
processing the original-character phoneme sequence generated by the decomposing step from the beginning to the end of the original character string with reference to the probability model, and calculating a conversion likelihood using the transition probabilities for consecutive phoneme sequences; and
determining and outputting a maximum likelihood phoneme sequence with reference to the conversion likelihood calculated by the calculating step.

The program according to claim 10, wherein the calculating step includes calculating the conversion likelihood to a conversion-destination phoneme sequence according to a hidden Markov model that uses the conversion probabilities of the conversion-destination phonemes appearing between the beginning and the end of the original character string and the transition probabilities between consecutive conversion-destination phonemes.

The pre-generating step includes a Viterbi algorithm or a cost assigned to a unit route along a cell in units of each phoneme from the beginning to the end of the original character string. The program according to claim 11, comprising a step of searching for a route so as to accumulate the number using a number and to give the route having a minimum total cost.

An information processing system that performs character string conversion via a network, the information processing system comprising:
a network adapter that receives a request including an original character string via the network;
an original character string acquisition unit that acquires the original character string from the request;
a phoneme decomposition unit that decomposes the original character string into original-character phonemes;
a conversion likelihood calculation unit that, for the original-character phoneme sequence generated by the phoneme decomposition unit, refers to a probability model that corresponds to the original-character phonemes and is generated by learning, and calculates a conversion likelihood using transition probabilities for consecutive phoneme sequences; and
a maximum likelihood phoneme sequence determination unit that determines a maximum likelihood phoneme sequence with reference to the conversion likelihood calculated by the conversion likelihood calculation unit and outputs it via the network,
wherein the probability model is a transition probability table that registers transition probabilities for consecutive phoneme sequences, the table being generated by generating a cost model that associates the original-character phonemes with conversion-destination phonemes using aligned cases in which the phonemes of the original character string and of the other language targeted by the character string conversion are aligned, and by learning the alignment of unaligned original character strings according to the cost model.

The information processing system according to claim 13, further comprising: a language type estimation morpheme dictionary for estimating, in response to the request, the language into which the original character string is to be converted; and a language-specific probability model storage unit that registers the probability model for each estimated language, wherein the maximum likelihood phoneme sequence determination unit outputs, for the original character string, the character string of the language estimated from the request.