JP6311367B2

JP6311367B2 - User dictionary management device, user dictionary management method, and user dictionary management program

Info

Publication number: JP6311367B2
Application number: JP2014048331A
Authority: JP
Inventors: 貢三浦
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2014-03-12
Filing date: 2014-03-12
Publication date: 2018-04-18
Anticipated expiration: 2034-03-12
Also published as: JP2015172854A

Description

The present invention relates to a user dictionary management device and the like that manage user dictionary information referred to by a language conversion processing device when it performs language conversion processing.

In recent years, a variety of language conversion processing devices have come into use that perform language conversion processing such as Japanese-to-English translation or kana-to-kanji conversion. When performing language conversion processing, these devices refer to a conversion dictionary containing dictionary data that associates a conversion target word (the word before conversion) with conversion candidate words (the words after conversion). Conversion dictionaries include a common dictionary shared by all users of the language conversion processing device and user dictionaries used individually by each user. The common dictionary covers conversion target words that users use in common. A user dictionary, by contrast, covers conversion target words that are not registered in the common dictionary and that an individual user needs because of the nature of the documents that user converts.

In a typical language conversion processing device, new dictionary data is registered in the user dictionary each time a user performs language conversion processing. As the dictionary data registered in the user dictionary grows, the accuracy of the language conversion processing performed by that user improves. Expectations are therefore rising for techniques that register dictionary data in user dictionaries more efficiently.
As a related technique, Patent Document 1 discloses a system that calculates the similarity between a first user dictionary and a second user dictionary and, when the similarity is equal to or greater than a threshold, registers in the second user dictionary the dictionary data that is included in the first user dictionary but not in the second user dictionary.

JP 2007-080019 A

In language conversion processing, a conversion target word usually does not have a single fixed conversion candidate word. Even within the language conversion processing performed by one user, the same conversion target word is converted into different words depending on the time and situation. A typical conversion dictionary therefore holds, as its dictionary data, conversion instruction records that each associate one conversion target word with one or more conversion candidate words.

When several conversion candidate words are associated with one conversion target word, the probability of conversion into each candidate differs. Efficiency of the language conversion processing therefore improves when the conversion instruction record stores each conversion candidate word in association with a conversion priority value based on the probability of conversion into that candidate. For example, a kana-kanji conversion system displays the candidate kanji on the screen in descending order of conversion priority, which lets the user perform kana-kanji conversion efficiently.
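The conversion instruction record described above can be pictured as a small data structure. The following Python sketch is only an illustration: the class name `ConversionRecord`, its fields, and the sample kana and kanji are assumptions made here and do not come from the patent text. It stores candidates in priority order so that position 1 is the highest priority.

```python
from dataclasses import dataclass, field


@dataclass
class ConversionRecord:
    """One conversion instruction record: a target word and its ranked candidates."""

    target: str
    # Candidates are kept in priority order; index 0 holds priority 1 (highest).
    candidates: list[str] = field(default_factory=list)

    def priority_of(self, candidate: str) -> int:
        """Return the 1-based conversion priority of a candidate."""
        return self.candidates.index(candidate) + 1


# Hypothetical kana-kanji example: one reading with three candidate spellings.
record = ConversionRecord("かんじ", ["漢字", "感じ", "幹事"])
for rank, word in enumerate(record.candidates, start=1):
    print(rank, word)
```

Keeping the priority implicit in list position is one possible design; an explicit numeric priority field per candidate would serve equally well.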

In the technique disclosed in Patent Document 1, when dictionary data registered in the first user dictionary is to be newly registered in the second user dictionary, the dictionary data is registered across the board if the similarity between the first and second user dictionaries is equal to or greater than the threshold, and is not registered at all if the similarity is less than the threshold. However, even when the similarity is less than the threshold, it does not follow that the language conversion processing performed by the user of the second user dictionary will never use dictionary data registered in the first user dictionary. That processing may still use such dictionary data, albeit with a low conversion priority. The technique disclosed in Patent Document 1 therefore cannot be said to offer sufficient flexibility when registering, from another user dictionary, dictionary data that is not yet registered in a particular user dictionary.

A main object of the present invention is to provide a user dictionary management device, a user dictionary management method, and a user dictionary management program that solve this problem.

A user dictionary management device according to the present invention comprises: calculation means for calculating, based on a predetermined criterion, a value indicating the similarity between first and second user dictionary information among a plurality of pieces of user dictionary information, each of which is referred to by a language conversion processing device when it performs language conversion processing on user document information and contains conversion instruction records that store, in association with one another, a conversion target word (the word before conversion), one or more conversion candidate words (the words after conversion), and a value indicating the conversion priority of each conversion candidate word; and registration means for registering the information indicated by a conversion instruction record contained in the first user dictionary information into the second user dictionary information, in association with a conversion priority value determined based on the value indicating the similarity.

In another aspect for achieving the above object, a user dictionary management method according to the present invention comprises, by an information processing device: calculating, based on a predetermined criterion, a value indicating the similarity between first and second user dictionary information among a plurality of pieces of user dictionary information, each of which is referred to by a language conversion processing device when it performs language conversion processing on user document information and contains conversion instruction records that store, in association with one another, a conversion target word (the word before conversion), one or more conversion candidate words (the words after conversion), and a value indicating the conversion priority of each conversion candidate word; and registering the information indicated by a conversion instruction record contained in the first user dictionary information into the second user dictionary information, in association with a conversion priority value determined based on the value indicating the similarity.

In a further aspect for achieving the above object, a user dictionary management program according to the present invention causes a computer to execute: a calculation process of calculating, based on a predetermined criterion, a value indicating the similarity between first and second user dictionary information among a plurality of pieces of user dictionary information, each of which is referred to by a language conversion processing device when it performs language conversion processing on user document information and contains conversion instruction records that store, in association with one another, a conversion target word (the word before conversion), one or more conversion candidate words (the words after conversion), and a value indicating the conversion priority of each conversion candidate word; and a registration process of registering the information indicated by a conversion instruction record contained in the first user dictionary information into the second user dictionary information, in association with a conversion priority value determined based on the value indicating the similarity.

Furthermore, the present invention can also be realized by a computer-readable non-volatile storage medium storing the user dictionary management program (computer program).

The present invention makes it possible to register dictionary information in a user dictionary used by a language conversion processing device efficiently and flexibly.

A block diagram showing the configuration of the user dictionary management system according to the first embodiment of the present invention.
A flowchart showing the operation of the user dictionary management system according to the first embodiment of the present invention.
A diagram illustrating the similarity between input document data according to the first embodiment of the present invention.
A diagram showing a configuration example of the similarity management information according to the first embodiment of the present invention.
A diagram showing a configuration example of the priority management information according to the first embodiment of the present invention.
A diagram showing an example of registering dictionary data in user dictionary data according to the first embodiment of the present invention.
A block diagram showing the configuration of the user dictionary management device according to the second embodiment of the present invention.
A block diagram showing the configuration of an information processing device capable of implementing the user dictionary management device according to each embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

<First Embodiment>
FIG. 1 is a block diagram conceptually showing the configuration of the user dictionary management system 1 according to the first embodiment. The user dictionary management system 1 according to the present embodiment includes a user dictionary management device 10, a language conversion processing device 20, a user input document data storage unit 30, and a user output document data storage unit 40.

In the embodiment described below, processing for five users (users A to E) is described as an example. The user input document data storage unit 30 stores input document data (input document information) 300 to 304, which are the input data used when the five users A to E perform language conversion processing with the user dictionary management system 1. That is:
User A: input document data 300,
User B: input document data 301,
User C: input document data 302,
User D: input document data 303,
User E: input document data 304.

Note that the number of users of the user dictionary management system 1 is not limited to five; five users are merely an example. The user input document data storage unit 30 is a storage device such as an electronic memory or a magnetic disk.

The user output document data storage unit 40 stores output document data 400 to 404, which are the data output by the language conversion processing device 20 after performing language conversion processing on the input document data 300 to 304. That is:
User A: output document data 400,
User B: output document data 401,
User C: output document data 402,
User D: output document data 403,
User E: output document data 404.

The user output document data storage unit 40 is a storage device such as an electronic memory or a magnetic disk.

The language conversion processing device 20 performs language conversion processing on each of the input document data 300 to 304 and outputs the results as the output document data 400 to 404, respectively. The language conversion processing device 20 may, for example, perform translation such as English-to-Japanese translation, or it may perform kana-kanji conversion.

The language conversion processing device 20 includes a user dictionary data storage unit 21. The user dictionary data storage unit 21 stores dictionary data (dictionary information) 210 to 214, which are the user dictionaries used when users A to E perform language conversion processing with the user dictionary management system 1. That is:
User A: dictionary data 210,
User B: dictionary data 211,
User C: dictionary data 212,
User D: dictionary data 213,
User E: dictionary data 214.

When performing language conversion processing on the input document data 300 to 304, the language conversion processing device 20 refers to the dictionary data 210 to 214, respectively. The language conversion processing device 20 is an information processing device such as a general-purpose server, or one dedicated to language conversion processing. The user dictionary data storage unit 21 is a storage device such as an electronic memory or a magnetic disk.

The user dictionary management device 10 is a device that updates and manages the dictionary data 210 to 214. The user dictionary management device 10 includes a calculation unit 11 and a registration unit 12. The calculation unit 11 and the registration unit 12 may be electronic circuits, or they may be realized by a computer program and a processor that operates according to that program.

For the input document data 300 to 304, the calculation unit 11 calculates, based on a predetermined criterion, a value indicating the similarity between each pair of input document data.

FIG. 3 illustrates the similarity between input document data according to the present embodiment. Suppose that the input document data 300 to 302 of users A to C are the English sentences shown in FIG. 3 and that the language conversion processing device 20 performs English-to-Japanese translation. Among the words in these English sentences, the calculation unit 11 detects the words that appear in at least two of the input document data 300 to 302.

In the example shown in FIG. 3, the two words "Philippines" and "typhoon" appear in both the input document data 300 of user A and the input document data 301 of user B. No other word appears in two or more of the input document data 300 to 302. The value indicating the similarity between the input document data 300 of user A and the input document data 301 of user B, which share these two words, is therefore large. In contrast, the values indicating the similarity between the input document data 300 of user A and the input document data 302 of user C, and between the input document data 301 of user B and the input document data 302 of user C, are small. The calculation unit 11 calculates the value indicating the similarity based on, for example, the proportion of the two input document data that is made up of the words they share.
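The shared-word ratio described above could be computed along the following lines. This is a minimal sketch, not the patented implementation: the text does not fix an exact formula, so the Jaccard ratio (shared words divided by all distinct words in both documents) is an assumption, as are the function names and the three sample sentences, which stand in for the contents of FIG. 3.

```python
import re


def words(text: str) -> set[str]:
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def shared_word_ratio(doc_a: str, doc_b: str) -> float:
    """Shared words divided by all distinct words in both documents (Jaccard)."""
    a, b = words(doc_a), words(doc_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical stand-ins for the input document data of users A, B and C.
doc_a = "A typhoon hit the Philippines"
doc_b = "Philippines typhoon damage report"
doc_c = "Stock prices rose sharply today"

print(shared_word_ratio(doc_a, doc_b))  # two shared words out of seven distinct
print(shared_word_ratio(doc_a, doc_c))  # no shared words
```

In practice, common function words such as "the" would probably be filtered out before comparison; that step is omitted here for brevity.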

Note that the calculation unit 11 may calculate the value indicating the similarity by creating document vectors with the tf-idf (term frequency-inverse document frequency) algorithm. Alternatively, the calculation unit 11 may calculate the value indicating the similarity using latent semantic analysis (Latent Semantic Analysis).
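As an illustration of the tf-idf alternative, the sketch below builds tf-idf document vectors with the Python standard library and compares them by cosine similarity. The weighting variant (raw term frequency divided by document length, times log(N/df)), the use of cosine similarity, and the sample token lists are all assumptions; the patent names tf-idf but does not specify these details.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list[list[str]]) -> list[dict[str, float]]:
    """Build one sparse tf-idf vector (term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical tokenized documents for users A, B and C.
docs = [
    "typhoon hit philippines".split(),
    "philippines typhoon damage".split(),
    "stock prices rose".split(),
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Documents A and B share two terms and so score above zero, while A and C share none and score exactly zero.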

The calculation unit 11 calculates the value indicating the similarity for every pair of input document data among the input document data 300 to 304 and generates similarity management information 110 containing the results. FIG. 4 shows a configuration example of the similarity management information 110. In FIG. 4, the values in the similarity management information 110 are expressed as percentages. In the example shown in FIG. 4, the value indicating the similarity between the input document data 300 of user A and the input document data 301 of user B is 80%.
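Generating one similarity value per pair of users can be sketched as follows. The table layout (a dictionary keyed by an unordered pair of user names), the function names, the placeholder metric, and the sample documents are assumptions for illustration; FIG. 4 itself is not reproduced here.

```python
from itertools import combinations
from typing import Callable


def build_similarity_table(
    docs: dict[str, str], similarity: Callable[[str, str], float]
) -> dict[frozenset[str], float]:
    """Compute one similarity value for every unordered pair of users."""
    return {
        frozenset((u, v)): similarity(docs[u], docs[v])
        for u, v in combinations(sorted(docs), 2)
    }


def word_overlap(a: str, b: str) -> float:
    """Placeholder metric: shared words over all distinct words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


# Hypothetical input documents for three users.
docs = {
    "A": "typhoon hit philippines",
    "B": "philippines typhoon damage",
    "C": "stock prices rose",
}
table = build_similarity_table(docs, word_overlap)
print(table[frozenset(("A", "B"))])
```

With the five users of the embodiment, the same function would produce ten entries, one for each pair.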

Note that the calculation unit 11 may detect that at least one of the input document data 300 to 304 or at least one of the dictionary data 210 to 214 has been updated, and generate the similarity management information 110 each time it detects such an update. Alternatively, the calculation unit 11 may generate the similarity management information 110 periodically or in response to an instruction from a system administrator or the like.

The registration unit 12 generates priority management information 120 based on the similarity management information 110 generated by the calculation unit 11. When the registration unit 12 registers, from the user dictionary data of other users, dictionary data that is not yet registered in the user dictionary data of a particular user, the priority management information 120 indicates which users' dictionary data should be registered with a higher conversion priority.

FIG. 5 shows a configuration example of the priority management information 120. In the similarity management information 110 shown in FIG. 4, the values indicating the similarity between the input document data 300 of user A and the input document data 301 to 304 are 80%, 10%, 41%, and 5%, respectively. Arranging the input document data 301 to 304 in descending order of similarity to the input document data 300 of user A therefore gives the input document data 301 of user B, the input document data 303 of user D, the input document data 302 of user C, and the input document data 304 of user E. Accordingly, the registration unit 12 generates a record in the priority management information 120 indicating that the conversion priority for user A is in the order user B, user D, user C, user E. The registration unit 12 generates records in the priority management information 120 for users B to E in the same way.
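Deriving the priority record for user A from the similarity values quoted above amounts to a descending sort. The sketch below uses the four percentages given in the text; the function name and the dictionary layout are assumptions.

```python
# Similarity of user A's input document data to each other user's, in percent,
# as quoted in the text for FIG. 4.
similarity_to_a = {"B": 80, "C": 10, "D": 41, "E": 5}


def priority_order(similarities: dict[str, float]) -> list[str]:
    """Return the other users sorted by descending similarity."""
    return sorted(similarities, key=lambda user: similarities[user], reverse=True)


print(priority_order(similarity_to_a))  # ['B', 'D', 'C', 'E']
```

The text does not say how ties should be broken; this sketch simply keeps tied users in their original order because Python's sort is stable.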

Based on the generated priority management information 120, the registration unit 12 registers unregistered dictionary data in the dictionary data 210 to 214. FIG. 6 shows an example of registering dictionary data in user dictionary data.

FIG. 6 shows an example in which the registration unit 12 registers, in the dictionary data 210 of user A, dictionary data from the dictionary data 211 to 214 that is not yet registered there. The dictionary data 210 to 214 contain conversion instruction records that associate a conversion target word with conversion candidate words. Each conversion instruction record stores one or more words as conversion candidate words, each in association with a conversion priority.

Assume that the dictionary data 210 of user A defines translated word 1-1 as the conversion candidate word for original word 1, a conversion target word, and defines translated word 2-1 and translated word 2-2, in descending order of conversion priority, as the conversion candidate words for original word 2, another conversion target word. The registration unit 12 searches the dictionary data 211 to 214 for combinations of a conversion target word and a conversion candidate word that the dictionary data 210 of user A does not define.

In the example shown in FIG. 6, as conversion candidate words for original word 1 that the dictionary data 210 of user A does not define, the dictionary data 212 of user C defines translated word 1-3, the dictionary data 213 of user D defines translated word 1-2, and the dictionary data 214 of user E defines translated word 1-4. The registration unit 12 refers to the priority management information 120 and confirms that, for user A, the conversion priority of these users is in the order user D, user C, user E.

The registration unit 12 then adds translated word 1-2, defined by the dictionary data 213 of user D, to the dictionary data 210 of user A as the conversion candidate word with the second conversion priority for original word 1. It adds translated word 1-3, defined by the dictionary data 212 of user C, as the conversion candidate word with the third conversion priority for original word 1. It adds translated word 1-4, defined by the dictionary data 214 of user E, as the conversion candidate word with the fourth conversion priority for original word 1.

In the example shown in FIG. 6, the dictionary data 211 of user B and the dictionary data 213 of user D also define original word 3, a conversion target word that the dictionary data 210 of user A does not define. The dictionary data 211 of user B defines translated word 3-1 as the conversion candidate word for original word 3, and the dictionary data 213 of user D defines translated word 3-2. The registration unit 12 refers to the priority management information 120 and confirms that, for user A, the conversion priority of these users is in the order user B, user D.

The registration unit 12 then adds original word 3 to the dictionary data 210 of user A as a conversion target word. It adds translated word 3-1, defined by the dictionary data 211 of user B, as the conversion candidate word with the first conversion priority for original word 3, and adds translated word 3-2, defined by the dictionary data 213 of user D, as the conversion candidate word with the second conversion priority for original word 3.

Furthermore, although not shown in FIG. 6, consider a case in which, for a particular conversion target word, at least one of the dictionary data 211 to 214 defines several conversion candidate words with different conversion priorities that are undefined in the dictionary data 210 of user A. In this case, the registration unit 12 adds those conversion candidate words to the dictionary data 210 of user A as conversion candidate words for that conversion target word while preserving the order given by their conversion priorities.
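The registration behavior walked through for FIG. 6 can be sketched as a merge that visits the other users' dictionaries in priority order. This is an illustrative sketch under assumptions: dictionaries are modeled as plain mappings from a conversion target word to a priority-ordered list of candidates, new candidates are appended after the existing ones (consistent with translated word 1-2 becoming priority 2 in the example), and the short names such as `source1` and `t1-1` are stand-ins for "original word 1" and "translated word 1-1".

```python
def register_missing(
    target: dict[str, list[str]],
    sources: dict[str, dict[str, list[str]]],
    priority: list[str],
) -> None:
    """Append candidates missing from `target`, visiting sources in priority order."""
    for user in priority:
        for word, candidates in sources[user].items():
            ranked = target.setdefault(word, [])
            for candidate in candidates:  # keeps the source's own ordering
                if candidate not in ranked:
                    ranked.append(candidate)


# Dictionary data of user A and of users B to E, following the FIG. 6 example.
user_a = {"source1": ["t1-1"], "source2": ["t2-1", "t2-2"]}
others = {
    "B": {"source3": ["t3-1"]},
    "C": {"source1": ["t1-3"]},
    "D": {"source1": ["t1-2"], "source3": ["t3-2"]},
    "E": {"source1": ["t1-4"]},
}
register_missing(user_a, others, ["B", "D", "C", "E"])
print(user_a)
```

After the call, `source1` holds `t1-1` through `t1-4` in that order and `source3` holds `t3-1` then `t3-2`, matching the priorities described in the text.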

The above has described, with reference to FIG. 6, an example of the operation in which the registration unit 12 registers, in the dictionary data 210 of user A, dictionary data from the dictionary data 211 to 214 that is not yet registered there. When registering unregistered dictionary data in the dictionary data 211 to 214 of users B to E, the registration unit 12 performs the same processing as for the dictionary data 210 of user A.

Note that the registration unit 12 may register dictionary data each time the calculation unit 11 generates the similarity management information 110 upon detecting that at least one of the input document data 300 to 304 or at least one of the dictionary data 210 to 214 has been updated. Alternatively, the registration unit 12 may register dictionary data periodically or in response to an instruction from a system administrator or the like. In doing so, the registration unit 12 may register dictionary data in synchronization with the timing at which the calculation unit 11 generates the similarity management information 110, or it may do so asynchronously.

Next, the operation (processing) of the user dictionary management system 1 according to the present embodiment is described in detail with reference to the flowchart of FIG. 2.

The calculation unit 11 generates the similarity management information 110 on the basis of the input document data 300 to 304 of users A to E (step S101). The registration unit 12 generates the priority management information 120 on the basis of the similarity management information 110 (step S102).

The process enters a loop, running up to step S107, over X (where X is one of the letters A to E) (step S103). The registration unit 12 checks whether any combination of a conversion target word and a conversion candidate word that is not registered in the dictionary data of user X exists in the dictionary data of the other users (step S104).

If no such combination exists in the dictionary data of the other users (No in step S105), the process proceeds to step S107. If such a combination exists (Yes in step S105), the registration unit 12 registers the unregistered combination of conversion target word and conversion candidate word in the dictionary data of user X on the basis of the information indicated by the priority management information 120 (step S106).

If X is not E, the registration unit 12 executes the processing from step S103 for the next X; if X is E, the entire process ends (step S107).
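The flow of steps S102 to S107 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the in-memory dictionary model, the function names `build_priority_info` and `register_missing`, the representation of the priority management information 120 as a per-user list of source users ordered by similarity, and the choice to place newly registered candidates after the candidates user X already has are all hypothetical.

```python
from typing import Dict, List

# Hypothetical model: target word -> candidate words, highest priority first.
UserDict = Dict[str, List[str]]


def build_priority_info(similarity: Dict[str, Dict[str, float]]) -> Dict[str, List[str]]:
    """S102 (assumed form): for each user, list the other users, most similar first."""
    return {
        user: sorted(others, key=others.get, reverse=True)
        for user, others in similarity.items()
    }


def register_missing(dictionaries: Dict[str, UserDict],
                     priority_info: Dict[str, List[str]]) -> None:
    """S103 to S107: for each user X, register pairs found only in other users' data."""
    # Assumption: work from a snapshot so that pairs copied during this pass
    # are not propagated again as if they were a source user's own entries.
    snapshot = {
        user: {word: list(cands) for word, cands in data.items()}
        for user, data in dictionaries.items()
    }
    for user_x, own in dictionaries.items():                 # S103: loop over X
        for source in priority_info[user_x]:                 # most similar source first
            for word, candidates in snapshot[source].items():
                registered = own.setdefault(word, [])
                for cand in candidates:                      # source order is preserved
                    if cand not in registered:               # S104/S105: unregistered pair?
                        registered.append(cand)              # S106: register in X's data
    # S107: the loop ends after the last user has been processed.
```

With three users whose dictionaries map the same reading to different candidates, user A ends up with its own candidate first, followed by the candidate from the more similar user and then the one from the less similar user.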

The user dictionary management system 1 according to the present embodiment can register dictionary information in the user dictionaries used by the language conversion processing device efficiently and flexibly. This is because the calculation unit 11 calculates the similarity between the pieces of user dictionary data, and the registration unit 12 registers dictionary data that is not yet registered in a specific piece of user dictionary data, taken from the other user dictionary data, in that specific user dictionary data in association with a conversion priority value determined on the basis of the similarity value.

In a language conversion processing system, registering dictionary data that is missing from the user dictionary of a specific user from the user dictionary of another user improves the accuracy of the language conversion processing performed for that specific user. To avoid registering unnecessary dictionary data in the destination user dictionary, some systems perform the registration only when the similarity between the destination and source user dictionaries is equal to or greater than a threshold. Such a system, however, may exclude dictionary data that the user of the destination dictionary could actually use, and therefore lacks flexibility in registering dictionary data.

In contrast, the user dictionary management system 1 according to the present embodiment does not use the similarity value as a criterion for deciding whether to register dictionary data in the destination user dictionary. It uses the similarity value as a criterion for determining the conversion priority that the dictionary data receives in the destination user dictionary. That is, the registration unit 12 according to the present embodiment registers dictionary data from a source user dictionary with a high similarity value so that its conversion priority value in the destination user dictionary is high. Dictionary data from a source user dictionary with a low similarity value is not excluded; the registration unit 12 registers it so that its conversion priority value in the destination user dictionary is low. As a result, the user dictionary management system 1 according to the present embodiment can register dictionary data in the user dictionaries used by the language conversion processing device efficiently and flexibly.
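The difference between the two policies can be illustrated with a short sketch. It assumes, purely for illustration, that each source user contributes one candidate word for the same conversion target word; the function names and the data layout are hypothetical.

```python
from typing import Dict, List


def merge_by_threshold(candidate_by_source: Dict[str, str],
                       similarity: Dict[str, float],
                       threshold: float) -> List[str]:
    """Threshold policy: sources below the threshold contribute nothing."""
    return [
        cand for source, cand in candidate_by_source.items()
        if similarity[source] >= threshold
    ]


def merge_by_priority(candidate_by_source: Dict[str, str],
                      similarity: Dict[str, float]) -> List[str]:
    """Priority policy: every source contributes; similarity only decides the order."""
    ordered = sorted(candidate_by_source, key=lambda s: similarity[s], reverse=True)
    return [candidate_by_source[source] for source in ordered]
```

Under the threshold policy a candidate from a weakly similar user is lost entirely, whereas under the priority policy it is still offered, only later in the candidate list.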

The calculation unit 11 according to the present embodiment calculates the similarity values between the pieces of input document data 300 to 304, but it may instead calculate the similarity values between the users' dictionary data on the basis of the dictionary data 210 to 214.
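The criterion by which the similarity value is computed is not fixed by this passage. As a minimal sketch, assuming that each user's input document data (or dictionary data) has already been reduced to a set of conversion target words, one possible criterion is the share of words that two users have in common. The Jaccard-style ratio, the function names, and the nested-dictionary layout standing in for the similarity management information 110 are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Set


def similarity(words_a: Set[str], words_b: Set[str]) -> float:
    """Assumed criterion: shared words divided by all words used by either user."""
    union = words_a | words_b
    if not union:
        return 0.0  # two empty inputs are treated as not similar
    return len(words_a & words_b) / len(union)


def build_similarity_info(words_by_user: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """Compute a symmetric user-by-user similarity table."""
    table: Dict[str, Dict[str, float]] = {user: {} for user in words_by_user}
    for a, b in combinations(words_by_user, 2):
        value = similarity(words_by_user[a], words_by_user[b])
        table[a][b] = value
        table[b][a] = value
    return table
```

The same function works whether the word sets are extracted from input documents or from dictionary entries, which mirrors the two alternatives mentioned above.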

The user dictionary management system 1 according to the present embodiment can also perform dynamic registration, in which dictionary data is registered in the user dictionaries each time at least one of the input document data 300 to 304, or at least one of the dictionary data 210 to 214, is updated. Alternatively, it can perform static registration, in which dictionary data is registered periodically or when triggered by an instruction from a system administrator or the like. Dynamic registration is suitable when the accuracy of the language conversion processing is to be raised as far as possible, and static registration is suitable when the system load caused by the registration processing is to be reduced. In other words, the user dictionary management system 1 can be operated flexibly, in accordance with the requirements placed on the system, with respect to the registration of dictionary data in the user dictionaries.

The user dictionary management system 1 according to the present embodiment may be constructed as a cloud service type system in which all the components shown in FIG. 1 are contained in a server device. Alternatively, the user dictionary management system 1 may be constructed as a system in which the user dictionary data, the user input document data, and the user output document data are held in the client terminal device used by each user. In this case, the user dictionary management system 1 may calculate the similarity values between the pieces of user input document data on the server device side, or on the client terminal device side by performing peer-to-peer communication among the client terminal devices.

<Second Embodiment>
FIG. 7 is a block diagram conceptually showing the configuration of the user dictionary management device 50 of the second embodiment.

The user dictionary management device 50 of the present embodiment includes a calculation unit 51 and a registration unit 52.

The calculation unit 51 calculates, on the basis of a predetermined criterion, a value indicating the similarity between first user dictionary information 600 and second user dictionary information 601 among a plurality of pieces of user dictionary information. Each piece of user dictionary information includes conversion instruction records. A conversion instruction record is a record that the language conversion processing device 60 refers to when performing language conversion processing on user document information. Each conversion instruction record stores, in association with one another, a conversion target word, which is a word before conversion, one or more conversion candidate words, which are words after conversion, and a value indicating the conversion priority of each conversion candidate word.
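One way to picture a conversion instruction record is as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical, and it assumes that a larger priority value means the candidate is preferred.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    word: str       # conversion candidate word (word after conversion)
    priority: int   # conversion priority value; assumed: larger means preferred


@dataclass
class ConversionInstructionRecord:
    target_word: str                                   # word before conversion
    candidates: List[Candidate] = field(default_factory=list)

    def ranked(self) -> List[str]:
        """Return the candidate words, highest conversion priority first."""
        ordered = sorted(self.candidates, key=lambda c: c.priority, reverse=True)
        return [c.word for c in ordered]
```

A piece of user dictionary information would then be a collection of such records, one per conversion target word.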

The registration unit 52 registers the information indicated by a conversion instruction record included in the first user dictionary information 600 in the second user dictionary information 601, in association with a conversion priority value determined on the basis of the similarity value.

The user dictionary management device 50 according to the present embodiment can register dictionary information in the user dictionaries used by the language conversion processing device efficiently and flexibly. This is because the calculation unit 51 calculates the similarity between the pieces of user dictionary information, and the registration unit 52 registers dictionary information that is not yet registered in a specific piece of user dictionary information, taken from the other user dictionary information, in that specific user dictionary information in association with a conversion priority value determined on the basis of the similarity value.

<Hardware Configuration Example>
In each of the embodiments described above, the units shown in FIG. 1 and FIG. 7 can be implemented by dedicated hardware (HW; electronic circuits). At least the calculation units 11 and 51 and the registration units 12 and 52 can also be regarded as functional (processing) units of a software program (software modules). However, the division of the units shown in these drawings is made for convenience of explanation, and various configurations are conceivable for an actual implementation. An example of the hardware environment in this case is described with reference to FIG. 8.

FIG. 8 illustrates the configuration of an information processing device 900 (computer) capable of implementing the user dictionary management device according to the exemplary embodiments of the present invention. That is, FIG. 8 shows the configuration of a computer (information processing device) that can implement the user dictionary management devices shown in FIG. 1 and FIG. 7, and represents a hardware environment in which the functions of the embodiments described above can be realized.

The information processing device 900 shown in FIG. 8 is a general-purpose computer that includes a CPU 901, a ROM (Read Only Memory) 902, a RAM (Random Access Memory) 903, a hard disk 904 (storage device), a communication interface 905 with external devices (interface is hereinafter abbreviated as "I/F"), a reader/writer 908 capable of reading and writing data stored in a storage medium 907 such as a CD-ROM (Compact Disc Read Only Memory), and an input/output interface 909, all connected via a bus 906 (communication line).

The present invention, described above using the embodiments as examples, is achieved by supplying the information processing device 900 shown in FIG. 8 with a computer program capable of implementing the functions of the calculation units 11 and 51 and the registration units 12 and 52 in the block diagrams (FIG. 1 and FIG. 7), or the functions of the flowchart (FIG. 2), referred to in the description of the embodiments, and then having the CPU 901 of that hardware read, interpret, and execute the computer program. The computer program supplied to the device may be stored in a readable and writable volatile memory (RAM 903) or in a nonvolatile storage device such as the hard disk 904.

In the above case, the computer program can be supplied to the hardware by procedures that are now common, such as installing it in the device via a storage medium 907 such as a CD-ROM, or downloading it from outside via a communication line such as the Internet. In such a case, the present invention can be regarded as being constituted by the code of the computer program or by the storage medium 907 in which that code is stored.

The present invention has been described above using the embodiments as exemplary examples. However, the present invention is not limited to the embodiments described above. That is, various modes that can be understood by those skilled in the art may be applied to the present invention within its scope.

Description of Symbols
1 User dictionary management system
10 User dictionary management device
11 Calculation unit
110 Similarity management information
12 Registration unit
120 Priority management information
20 Language conversion processing device
21 User dictionary data storage unit
210 to 214 Dictionary data
30 User input document data storage unit
300 to 304 Input document data
40 User output document data storage unit
400 to 404 Output document data
50 User dictionary management device
51 Calculation unit
52 Registration unit
60 Language conversion processing device
600 First user dictionary information
601 Second user dictionary information
900 Information processing device
901 CPU
902 ROM
903 RAM
904 Hard disk
905 Communication interface
906 Bus
907 Storage medium
908 Reader/writer
909 Input/output interface

Claims

A user dictionary management device comprising:
calculation means for calculating, based on a predetermined criterion, a value indicating a similarity between first user dictionary information and second user dictionary information among a plurality of pieces of user dictionary information, each of which includes a conversion instruction record that a language conversion processing device refers to when performing language conversion processing on user document information and that stores, in association with one another, a conversion target word that is a word before conversion, one or more conversion candidate words that are words after conversion, and a value indicating a conversion priority for the conversion candidate words; and
registration means for, when conversion candidate words with different values that are registered in a plurality of pieces of the first user dictionary information for a specific conversion target word are registered in the second user dictionary information, registering each of the conversion candidate words such that the value indicated by the conversion priority is higher in descending order of the similarity of the first user dictionary information.

The user dictionary management device according to claim 1, wherein
the calculation means detects that the first user dictionary information has been updated and calculates the value indicating the similarity, and
the registration means updates the second user dictionary information each time the calculation means detects that the first user dictionary information has been updated.

The user dictionary management device according to claim 1 or 2, wherein
the calculation means calculates the value indicating the similarity at a first predetermined time, and
the registration means updates the second user dictionary information at a second predetermined time.

The user dictionary management device according to any one of claims 1 to 3, wherein
the calculation means calculates the value indicating the similarity on the basis of first user document information converted using the first user dictionary information and second user document information converted using the second user dictionary information.

The user dictionary management device according to claim 4, wherein
the calculation means calculates the value indicating the similarity on the basis of the proportion of the first and second user document information that is accounted for by conversion target words contained in common in the first and second user document information.

A user dictionary management method comprising, by an information processing device:
calculating, based on a predetermined criterion, a value indicating a similarity between first user dictionary information and second user dictionary information among a plurality of pieces of user dictionary information, each of which includes a conversion instruction record that a language conversion processing device refers to when performing language conversion processing on user document information and that stores, in association with one another, a conversion target word that is a word before conversion, one or more conversion candidate words that are words after conversion, and a value indicating a conversion priority for the conversion candidate words; and
when conversion candidate words with different values that are registered in a plurality of pieces of the first user dictionary information for a specific conversion target word are registered in the second user dictionary information, registering each of the conversion candidate words such that the value indicated by the conversion priority is higher in descending order of the similarity of the first user dictionary information.

A user dictionary management program that causes a computer to execute:
a calculation process of calculating, based on a predetermined criterion, a value indicating a similarity between first user dictionary information and second user dictionary information among a plurality of pieces of user dictionary information, each of which includes a conversion instruction record that a language conversion processing device refers to when performing language conversion processing on user document information and that stores, in association with one another, a conversion target word that is a word before conversion, one or more conversion candidate words that are words after conversion, and a value indicating a conversion priority for the conversion candidate words; and
a registration process of, when conversion candidate words with different values that are registered in a plurality of pieces of the first user dictionary information for a specific conversion target word are registered in the second user dictionary information, registering each of the conversion candidate words such that the value indicated by the conversion priority is higher in descending order of the similarity of the first user dictionary information.