JP2014149786A - Natural language analysis processing device, method, and program

Info

Publication number: JP2014149786A
Application number: JP2013019563A
Authority: JP
Inventors: Jun Suzuki; 潤鈴木
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2013-02-04
Filing date: 2013-02-04
Publication date: 2014-08-21
Anticipated expiration: 2033-02-04
Also published as: JP5886220B2

Abstract

PROBLEM TO BE SOLVED: To perform natural language analysis processing accurately on each of successively input character strings while suppressing an increase in the amount of calculation. SOLUTION: A minimal subproblem generation section 22 decomposes the language analysis problem for one input unit into minimal subproblems, which together form a set of minimal subproblems. A group creation section 26 creates S groups, each consisting of a subset of the full set of minimal subproblems, and assigns them to S calculation nodes 30. In each calculation node 30, the processing of a parameter updating section 32, a partial solution update section 34, a first flag determination section 36, a synchronization section 38, a constraint update section 40, and a second flag determination section 42 is repeated until the value of the constraint parameter is determined to have converged to the optimum value.

Description

The present invention relates to a natural language analysis processing device, method, and program, and more particularly to a natural language analysis processing device, method, and program for performing natural language analysis processing on an input document.

A natural language is a language that humans ordinarily use, such as Japanese or English. Techniques for analyzing documents written in natural language grammatically and semantically are academically significant from a linguistic viewpoint, for example for understanding the origin and structure of the language. In recent years, various services that automatically analyze human-written documents grammatically and semantically and build on the results have been deployed, mainly on the web. Examples include translation sites, reputation analysis sites for people and products, and sites that summarize a specific event. These services analyze digitized human-written documents grammatically and semantically inside the system and provide the actual service on top of that analysis. In this sense, grammatical and semantic analysis of natural language is the core technology of these services and has come to occupy a very important position in the information processing field.

Natural language analysis covers a wide range, from surface-level analyses such as word segmentation and part-of-speech estimation to more advanced analyses such as estimating dependency relations between words and clauses. Examples include sentence segmentation, which estimates sentence boundaries in a document; word segmentation, which estimates word boundaries; part-of-speech tagging, which estimates the part of speech of each word; and dependency parsing, which estimates relations between words and clauses. Examples are shown in FIG. 9.

Language analysis problems such as part-of-speech tagging and dependency parsing are classified as problems with “interdependence”. In part-of-speech tagging, for example, the part of speech to assign to a word is determined by the parts of speech of the preceding, following, and surrounding words, and those parts of speech in turn depend on their own neighbors. Suppose there is a rule that part of speech B must not appear after part of speech A. If the part of speech of each word is estimated independently, there is no guarantee that the result satisfies this rule. In other words, independent estimation does not necessarily yield overall consistency, so the task becomes a search for the best part-of-speech sequence over the whole document. In particular, when sentence segmentation, word segmentation, part-of-speech estimation, and dependency parsing are to be solved jointly, the problems are intricately intertwined, and searching for the best output is very difficult.
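The consistency issue described above can be illustrated with a toy example. This is a minimal sketch, not the patent's method: the two tags, the integer scores, and the helper names (`is_consistent`, `independent_tags`, `best_consistent_tags`) are all assumptions made for illustration, and the exhaustive search is only practical for very short inputs.

```python
from itertools import product

TAGS = ["A", "B"]
FORBIDDEN = {("A", "B")}  # assumed rule: tag B must not directly follow tag A

# Hypothetical per-word scores, one dict per word (higher is better).
scores = [{"A": 9, "B": 1}, {"A": 4, "B": 6}, {"A": 3, "B": 7}]

def is_consistent(seq):
    """True if no adjacent pair of tags violates the rule."""
    return all(pair not in FORBIDDEN for pair in zip(seq, seq[1:]))

def independent_tags(scores):
    """Pick the best tag for each word in isolation."""
    return [max(s, key=s.get) for s in scores]

def best_consistent_tags(scores):
    """Exhaustively search all tag sequences that satisfy the rule."""
    candidates = (seq for seq in product(TAGS, repeat=len(scores))
                  if is_consistent(seq))
    return list(max(candidates,
                    key=lambda seq: sum(s[t] for s, t in zip(scores, seq))))

independent = independent_tags(scores)  # ['A', 'B', 'B'], which breaks the rule
joint = best_consistent_tags(scores)    # ['A', 'A', 'A'], the best valid sequence
```

Independent estimation maximizes each word's score but produces the forbidden pair (A, B); the joint search gives up some per-word score to keep the whole sequence valid.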

To solve these problems, methods that estimate grammatical and semantic analysis results using algorithms such as history-based sequential analysis (so-called stack decoding), dynamic programming, and integer programming are widely used (Non-Patent Documents 1 to 3).

Non-Patent Document 1: Joakim Nivre. Algorithms for Deterministic Incremental Dependency Parsing. Computational Linguistics, 2008.
Non-Patent Document 2: F. Sha and F. Pereira. Shallow Parsing with Conditional Random Fields. Proc. of HLT/NAACL, 2003.
Non-Patent Document 3: Dan Roth and Wen-tau Yih. Global Inference for Entity and Relation Identification via a Linear Programming Formulation. Introduction to Statistical Relational Learning, 2007.

History-based sequential analysis makes each estimate using the estimation results obtained so far (see FIG. 10). Compared with solving each problem independently, it can keep the output consistent. However, because the next result is decided on the assumption that the preceding results are correct, an error in a preceding result is carried into the subsequent analysis, so errors propagate.
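Error propagation in a history-based analyzer can be seen in a small greedy tagger. This is only a sketch under assumed inputs: the tags, the rule that B must not follow A, the scores, and the name `greedy_tags` are hypothetical and not taken from the patent or its figures.

```python
TAGS = ["A", "B"]
FORBIDDEN = {("A", "B")}  # assumed rule: tag B must not directly follow tag A

# Hypothetical per-word scores (higher is better).
scores = [{"A": 6, "B": 5}, {"A": 1, "B": 9}, {"A": 1, "B": 9}]

def greedy_tags(scores):
    """Tag left to right, trusting every earlier decision."""
    history = []
    for word_scores in scores:
        # Only tags compatible with the previous decision are considered.
        allowed = [t for t in TAGS
                   if not history or (history[-1], t) not in FORBIDDEN]
        history.append(max(allowed, key=word_scores.get))
    return history

greedy = greedy_tags(scores)
greedy_total = sum(s[t] for s, t in zip(scores, greedy))
# greedy is ['A', 'A', 'A'] with total 8, while ['B', 'B', 'B'] is also
# valid and would score 5 + 9 + 9 = 23.
```

The output is consistent with the rule, but the slightly better first choice (A over B) forces poor choices for the remaining words, which is the propagation effect the text describes.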

To address this problem, methods based on dynamic programming (see FIG. 11) and on integer programming have come into use. Because these methods select the output that is best overall, they do not suffer from mid-analysis errors in the way history-based sequential analysis does. However, the amount of calculation needed to select the globally best output is far larger than for sequential analysis.
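As one concrete instance of the dynamic programming approach, the sketch below uses a Viterbi-style recursion. The text does not say which dynamic program is intended, so the choice of Viterbi, the tags, the forbidden-pair rule, the scores, and the function name are all assumptions.

```python
TAGS = ["A", "B"]
FORBIDDEN = {("A", "B")}  # assumed rule: tag B must not directly follow tag A

# Hypothetical per-word scores (higher is better).
scores = [{"A": 6, "B": 5}, {"A": 1, "B": 9}, {"A": 1, "B": 9}]

def viterbi(scores):
    """Return (best total score, best tag sequence) over all valid sequences."""
    # best[t] holds the best valid prefix that ends in tag t.
    best = {t: (scores[0][t], [t]) for t in TAGS}
    for word_scores in scores[1:]:
        new_best = {}
        for cur in TAGS:
            candidates = [
                (prev_score + word_scores[cur], path + [cur])
                for prev, (prev_score, path) in best.items()
                if (prev, cur) not in FORBIDDEN
            ]
            # A tag with no valid predecessor gets an unreachable score.
            new_best[cur] = (max(candidates, key=lambda c: c[0])
                             if candidates else (float("-inf"), []))
        best = new_best
    return max(best.values(), key=lambda c: c[0])

best_score, best_path = viterbi(scores)  # 23 and ['B', 'B', 'B']
```

Every tag is kept alive at every position, so an early low-scoring choice can still win overall. The cost is that work grows with the number of tag pairs at each position, and the whole table depends on the full input.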

Consider in particular a system that, for text arriving sequentially as in streaming input, processes the text in order of arrival and outputs results sequentially. Applications of natural language analysis include machine translation, document proofreading, and online monitoring of illegal and harmful information. When such applications run online and in real time, translation includes simultaneous interpretation systems in particular, and proofreading includes online proofreading while a document is being written in a non-native language. To realize such services, it is an important requirement that text can be processed sequentially and in real time.

Where sequential processing of sequential input is required, history-based methods are the most suitable. Conversely, when an analysis method based on integer programming is used for sequential processing, the optimization currently has to be redone every time new input arrives, which is very difficult in terms of the amount of calculation.

In general, methodologies that perform global optimization, such as integer programming, achieve higher analysis accuracy than history-based methodologies. As described above, however, global optimization methodologies are problematic for sequential processing in terms of the amount of calculation.

The present invention has been made to solve the above problems, and its object is to provide a natural language analysis processing device, method, and program that can perform natural language analysis processing accurately on sequentially input character strings while suppressing an increase in the amount of calculation.

To achieve the above object, the natural language analysis processing device according to the first invention performs natural language analysis processing, including language analysis processing, on an input character string obtained by concatenating sequentially input character strings each consisting of at least one character. The device includes the following:

subproblem generation means for sequentially generating, for each input of a character unit or character string unit, the subproblems obtained when the problem of performing the language analysis processing on the input character string is decomposed into subproblems that perform the language analysis processing in predefined character units or character string units;

subproblem storage means for storing the set of subproblems sequentially generated by the subproblem generation means;

S calculation nodes (S is a natural number of 2 or more) that perform the natural language analysis processing each time a subproblem is generated by the subproblem generation means; and

group creation means for, each time a subproblem is generated by the subproblem generation means, creating S groups, each consisting of an arbitrary subset of the full set of subproblems stored in the subproblem storage means, such that each subproblem belongs to at least two groups, and assigning the groups to the S calculation nodes.

Each of the S calculation nodes includes the following:

partial solution update means for updating the solution of each subproblem in the subset, of the full set of subproblems, that forms the group assigned by the group creation means, based on a constraint that the solutions of each minimal subproblem belonging to two or more groups agree;

group inactive state setting means for setting a group inactive state, which indicates that the update processing by the partial solution update means of the calculation node is not performed, when the solutions of the subproblems of the subset updated by the partial solution update means all match the previously updated solutions of those subproblems;

constraint active state setting means for, for each subproblem, cancelling the constraint inactive state set for the constraint parameter when the solution of the subproblem updated by the partial solution update means does not match the previously updated solution, where the constraint parameter is the solution at which the solutions of the subproblem are made to agree under the constraint, and the constraint inactive state indicates that the constraint parameter is not updated;

synchronization means for notifying the other calculation nodes of the solutions of the subproblems of the subset updated by the partial solution update means, and acquiring the solutions of the subproblems of the subsets notified from the other calculation nodes;

constraint update means for updating, for each subproblem, the constraint parameter of the subproblem based on the solutions of the subproblems of the subsets of the other calculation nodes acquired by the synchronization means and the solutions of the subproblems of the subset updated by the partial solution update means;

group active state setting means for, for each subproblem, cancelling the group inactive state set for the calculation node to which a group containing the subproblem is assigned when the constraint parameter updated by the constraint update means does not match its previously updated value;

constraint inactive state setting means for, for each subproblem, setting the constraint inactive state for the constraint parameter of the subproblem when the constraint parameter updated by the constraint update means matches its previously updated value; and

convergence determination means for determining whether the values of the constraint parameters have converged and, until they are determined to have converged, repeating the update by the partial solution update means, the setting by the group inactive state setting means, the setting by the constraint active state setting means, the notification and acquisition by the synchronization means, the update by the constraint update means, the setting by the group active state setting means, and the setting by the constraint inactive state setting means.

A calculation node for which the group inactive state is set does not perform the update by the partial solution update means. The synchronization means of such a node notifies the other calculation nodes of the solutions of the subproblems of the subset that were last updated by the partial solution update means. The constraint update means does not update a constraint parameter for which the constraint inactive state is set.
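The grouping requirement of the first invention (S groups, with every subproblem belonging to at least two of them) can be sketched as follows. The claim allows arbitrary subsets, so the round-robin rule used here, the function name `make_groups`, and the use of integer indices for subproblems are assumptions made only for illustration.

```python
def make_groups(num_subproblems, num_nodes):
    """Assign each subproblem index to two different groups (one group per node)."""
    if num_nodes < 2:
        raise ValueError("S must be a natural number of 2 or more")
    groups = [set() for _ in range(num_nodes)]
    for i in range(num_subproblems):
        # With num_nodes >= 2 these two group indices are always distinct.
        groups[i % num_nodes].add(i)
        groups[(i + 1) % num_nodes].add(i)
    return groups

groups = make_groups(num_subproblems=5, num_nodes=3)
# groups == [{0, 2, 3}, {0, 1, 3, 4}, {1, 2, 4}]
membership = {i: sum(i in g for g in groups) for i in range(5)}
# every subproblem appears in exactly two groups
```

Because the groups are recreated each time a subproblem is generated, a function like this would be called again with the enlarged subproblem count whenever new input arrives.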

The natural language analysis method according to the second invention is a natural language analysis processing method in a natural language analysis processing device that includes subproblem generation means, subproblem storage means, S calculation nodes (S is a natural number of 2 or more), and group creation means, and that performs natural language analysis processing, including language analysis processing, on an input character string obtained by concatenating sequentially input character strings each consisting of at least one character. The method includes the following:

the subproblem generation means sequentially generates, for each input of a character unit or character string unit, the subproblems obtained when the problem of performing the language analysis processing on the input character string is decomposed into subproblems that perform the language analysis processing in predefined character units or character string units;

the subproblem storage means stores the set of subproblems sequentially generated by the subproblem generation means;

the group creation means, each time a subproblem is generated by the subproblem generation means, creates S groups, each consisting of an arbitrary subset of the full set of subproblems stored in the subproblem storage means, such that each subproblem belongs to at least two groups, and assigns the groups to the S calculation nodes; and

the S calculation nodes perform the natural language analysis processing.

Performing the natural language analysis processing by each of the S calculation nodes includes the following:

partial solution update means updates the solution of each subproblem in the subset, of the full set of subproblems, that forms the group assigned by the group creation means, based on a constraint that the solutions of each minimal subproblem belonging to two or more groups agree;

group inactive state setting means sets a group inactive state, which indicates that the update processing by the partial solution update means of the calculation node is not performed, when the solutions of the subproblems of the subset updated by the partial solution update means all match the previously updated solutions of those subproblems;

constraint active state setting means, for each subproblem, cancels the constraint inactive state set for the constraint parameter when the solution of the subproblem updated by the partial solution update means does not match the previously updated solution, where the constraint parameter is the solution at which the solutions of the subproblem are made to agree under the constraint, and the constraint inactive state indicates that the constraint parameter is not updated;

synchronization means notifies the other calculation nodes of the solutions of the subproblems of the subset updated by the partial solution update means, and acquires the solutions of the subproblems of the subsets notified from the other calculation nodes;

constraint update means updates, for each subproblem, the constraint parameter of the subproblem based on the solutions of the subproblems of the subsets of the other calculation nodes acquired by the synchronization means and the solutions of the subproblems of the subset updated by the partial solution update means;

group active state setting means, for each subproblem, cancels the group inactive state set for the calculation node to which a group containing the subproblem is assigned when the constraint parameter updated by the constraint update means does not match its previously updated value;

constraint inactive state setting means, for each subproblem, sets the constraint inactive state for the constraint parameter of the subproblem when the constraint parameter updated by the constraint update means matches its previously updated value; and

convergence determination means determines whether the values of the constraint parameters have converged and, until they are determined to have converged, the update by the partial solution update means, the setting by the group inactive state setting means, the setting by the constraint active state setting means, the notification and acquisition by the synchronization means, the update by the constraint update means, the setting by the group active state setting means, and the setting by the constraint inactive state setting means are repeated.

A calculation node for which the group inactive state is set does not perform the update by the partial solution update means. The synchronization means of such a node notifies the other calculation nodes of the solutions of the subproblems of the subset that were last updated by the partial solution update means. The constraint update means does not update a constraint parameter for which the constraint inactive state is set.

According to the first and second inventions, the problem of performing natural language analysis processing on a sequentially input character string is divided into a set of subproblems, and groups are created. Based on the constraint that the solutions of a subproblem belonging to two or more groups agree, each calculation node updates the solutions for its assigned subset of subproblems, depending on whether the inactive state is set for the node's group, and updates the constraint parameters using the subproblem solutions acquired from the other calculation nodes, depending on whether the inactive state is set for each constraint parameter. Repeating this until convergence makes it possible to perform natural language analysis processing accurately on sequentially input character strings while suppressing an increase in the amount of calculation.
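The flag-controlled iteration summarized above can be sketched on a toy continuous problem. Everything specific here is an assumption for illustration: each node is given a hypothetical preferred value per subproblem with a quadratic local cost, the local update is a scaled-multiplier consensus step, "matches the previous value" is tested with a tolerance `tol`, and nodes share memory instead of exchanging messages. The patent's actual objective and update rules are not given in this passage.

```python
def consensus_with_flags(targets, rho=1.0, tol=1e-9, max_iters=10_000):
    """targets[s][i] is the value node s prefers for subproblem i in its group."""
    nodes = range(len(targets))
    subproblems = sorted({i for t in targets for i in t})
    x = [{i: 0.0 for i in t} for t in targets]  # local subproblem solutions
    u = [{i: 0.0 for i in t} for t in targets]  # scaled multipliers
    z = {i: 0.0 for i in subproblems}           # constraint parameters
    group_inactive = [False] * len(targets)
    constraint_inactive = {i: False for i in subproblems}

    for iteration in range(1, max_iters + 1):
        # Local update, skipped by nodes whose group is inactive.
        for s in nodes:
            if group_inactive[s]:
                continue  # the last x[s] and u[s] stay visible to the others
            unchanged = True
            for i, preferred in targets[s].items():
                u[s][i] += x[s][i] - z[i]
                new_x = (preferred + rho * (z[i] - u[s][i])) / (1.0 + rho)
                if abs(new_x - x[s][i]) > tol:
                    unchanged = False
                    constraint_inactive[i] = False  # reactivate this constraint
                x[s][i] = new_x
            group_inactive[s] = unchanged

        # Constraint update, skipped for inactive constraint parameters.
        for i in subproblems:
            if constraint_inactive[i]:
                continue
            owners = [s for s in nodes if i in targets[s]]
            new_z = sum(x[s][i] + u[s][i] for s in owners) / len(owners)
            if abs(new_z - z[i]) > tol:
                for s in owners:
                    group_inactive[s] = False  # reactivate the owning groups
            else:
                constraint_inactive[i] = True
            z[i] = new_z

        if all(group_inactive) and all(constraint_inactive.values()):
            return z, iteration
    return z, max_iters

# Two nodes share subproblems 0 and 1; they disagree on 0 and agree on 1.
z, iterations = consensus_with_flags([{0: 1.0, 1: 4.0}, {0: 3.0, 1: 4.0}])
```

In this toy setting the constraint parameters settle near the average of the nodes' preferred values (about 2.0 and 4.0). The point of the flags is that a node or a constraint parameter whose value has stopped changing does no further work until something it depends on changes again.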

In the first and second inventions, the following configuration is also possible. For the subset of subproblems of the group assigned by the group creation means, the partial solution update means uses the previously updated Lagrange multipliers, the solutions of the subproblems in the subset, and the constraint parameters to update the Lagrange multipliers so as to optimize the value of a predetermined objective function, and then uses the updated Lagrange multipliers to update the solutions of the subproblems in the subset so as to optimize the value of the objective function. The synchronization means notifies the other calculation nodes of the Lagrange multipliers and the solutions of the subproblems of the subset updated by the partial solution update means, and acquires the Lagrange multipliers and the solutions of the minimal subproblems of the subsets notified from the other calculation nodes. The constraint update means updates the constraint parameter for each subproblem so as to optimize the value of the objective function, based on the Lagrange multipliers and the subproblem solutions of the other calculation nodes acquired by the synchronization means.
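One standard formulation that follows the update order described here (multipliers first, then solutions, then constraint parameters) is a consensus form of the augmented Lagrangian method. The passage does not state the objective function, so the local costs f_s, the penalty weight ρ, and the index sets G_s (subproblems in group s) and S_i (nodes whose group contains subproblem i) below are assumptions, not the patent's own definitions.

```latex
% Assumed problem: each node s holds a local copy x_{s,i} of every subproblem
% i in its group G_s, and all copies must equal the constraint parameter z_i.
\begin{align*}
\min_{x,\,z}\ & \sum_{s=1}^{S} f_s(x_s)
  \quad \text{s.t.}\quad x_{s,i} = z_i \quad \forall s,\ \forall i \in G_s \\[4pt]
L_\rho(x, z, \lambda) &= \sum_{s=1}^{S} \Bigl[ f_s(x_s)
  + \sum_{i \in G_s} \Bigl( \lambda_{s,i}\,(x_{s,i} - z_i)
  + \tfrac{\rho}{2}\,(x_{s,i} - z_i)^2 \Bigr) \Bigr] \\[4pt]
\lambda_{s,i}^{k+1} &= \lambda_{s,i}^{k} + \rho\,\bigl(x_{s,i}^{k} - z_i^{k}\bigr) \\
x_s^{k+1} &= \operatorname*{arg\,min}_{x_s}\ f_s(x_s)
  + \sum_{i \in G_s} \Bigl( \lambda_{s,i}^{k+1}\,(x_{s,i} - z_i^{k})
  + \tfrac{\rho}{2}\,(x_{s,i} - z_i^{k})^2 \Bigr) \\
z_i^{k+1} &= \frac{1}{|S_i|} \sum_{s \in S_i}
  \Bigl( x_{s,i}^{k+1} + \tfrac{1}{\rho}\,\lambda_{s,i}^{k+1} \Bigr)
\end{align*}
```

The first two updates use only quantities local to node s, which is why they can run in parallel on the calculation nodes. The last update needs the multipliers and solutions from every node that holds subproblem i, which corresponds to the exchange performed by the synchronization means.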

The natural language analysis processing device according to the third invention performs natural language analysis processing, including language analysis processing, on an input character string obtained by concatenating sequentially input character strings each consisting of at least one character. The device includes the following:

subproblem generation means for sequentially generating, for each input of a character unit or character string unit, the subproblems obtained when the problem of performing the language analysis processing on the input character string is decomposed into subproblems that perform the language analysis processing in predefined character units or character string units;

subproblem storage means for storing the set of subproblems sequentially generated by the subproblem generation means;

S calculation nodes (S is a natural number of 2 or more) that perform the natural language analysis processing each time a subproblem is generated by the subproblem generation means;

group creation means for, each time a subproblem is generated by the subproblem generation means, creating S groups, each consisting of an arbitrary subset of the full set of subproblems stored in the subproblem storage means, such that each subproblem belongs to at least two groups, and assigning the groups to the S calculation nodes;

constraint update means;

group active state setting means;

constraint inactive state setting means; and

convergence determination means.

Each of the S calculation nodes includes the following:

partial solution update means for updating the solution of each subproblem in the subset, of the full set of subproblems, that forms the group assigned by the group creation means, based on a constraint that the solutions of each minimal subproblem belonging to two or more groups agree;

group inactive state setting means for setting a group inactive state, which indicates that the update processing by the partial solution update means of the calculation node is not performed, when the solutions of the subproblems of the subset updated by the partial solution update means all match the previously updated solutions of those subproblems; and

constraint active state setting means for, for each subproblem, cancelling the constraint inactive state set for the constraint parameter when the solution of the subproblem updated by the partial solution update means does not match the previously updated solution, where the constraint parameter is the solution at which the solutions of the subproblem are made to agree under the constraint, and the constraint inactive state indicates that the constraint parameter is not updated.

The constraint update means updates, for each subproblem, the constraint parameter of the subproblem based on the solutions of the subproblems of the subset of each calculation node and the solutions of the subproblems of the subset updated by the partial solution update means.

The group active state setting means, for each subproblem, cancels the group inactive state set for the calculation node to which a group containing the subproblem is assigned when the constraint parameter updated by the constraint update means does not match its previously updated value.

The constraint inactive state setting means, for each subproblem, sets the constraint inactive state for the constraint parameter of the subproblem when the constraint parameter updated by the constraint update means matches its previously updated value.

The convergence determination means determines whether the values of the constraint parameters have converged and, until they are determined to have converged, repeats the update by the partial solution update means, the setting by the group inactive state setting means, the setting by the constraint active state setting means, the notification and acquisition by the synchronization means, the update by the constraint update means, the setting by the group active state setting means, and the setting by the constraint inactive state setting means.

A calculation node for which the group inactive state is set does not perform the update by the partial solution update means. The constraint update means does not update a constraint parameter for which the constraint inactive state is set. The solutions of the subproblems of the subset at a calculation node for which the group inactive state is set are taken to be the solutions last updated by the partial solution update means.

A natural language analysis processing method according to a fourth aspect of the invention is performed in a natural language analysis processing apparatus that includes subproblem generation means, subproblem storage means, S calculation nodes (S is a natural number of 2 or more), group creation means, constraint condition update means, group active state setting means, constraint condition inactive state setting means, and convergence determination means, and that performs natural language analysis processing, including language analysis processing, on an input character string obtained by concatenating character strings that are input sequentially, each consisting of at least one character.

In the method, the subproblem generation means, when the problem of performing the language analysis processing on the input character string is decomposed into subproblems that perform the language analysis processing in predefined character units or character string units, sequentially generates those subproblems each time a character unit or character string unit is input. The subproblem storage means stores the set of subproblems sequentially generated by the subproblem generation means. The group creation means, each time a subproblem is generated, creates S groups, each an arbitrary subset of the full set of subproblems stored in the subproblem storage means, such that every subproblem belongs to at least two groups, and assigns the groups to the S calculation nodes. The S calculation nodes perform the natural language analysis processing, and the constraint condition update means, the group active state setting means, the constraint condition inactive state setting means, and the convergence determination means perform their respective updating, setting, and determination.

Performing the natural language analysis processing at each of the S calculation nodes includes three steps. First, the partial decomposition update means updates the solution of each subproblem in the subset assigned by the group creation means, based on the constraint condition that the solutions of each minimal subproblem belonging to two or more groups match. Second, the group inactive state setting means sets a group inactive state, which indicates that the node's partial decomposition update means performs no update processing, when the updated solutions of all subproblems in the subset match the previously updated solutions. Third, the constraint condition active state setting means, for each subproblem whose updated solution does not match its previously updated solution, cancels the constraint condition inactive state set for that subproblem's constraint parameter. Here, the constraint parameter is the solution obtained when the solutions of the subproblem are made to match under the constraint condition, and the constraint condition inactive state indicates that the constraint parameter is not updated.

The constraint condition update means updates the constraint parameter of each subproblem based on the solutions updated by the partial decomposition update means of each calculation node. The group active state setting means, for each subproblem whose updated constraint parameter does not match its previously updated value, cancels the group inactive state set for the calculation nodes assigned the groups to which that subproblem belongs. The constraint condition inactive state setting means, for each subproblem whose updated constraint parameter matches its previously updated value, sets the constraint condition inactive state for that constraint parameter. The convergence determination means determines whether the values of the constraint parameters have converged. Until convergence is determined, the method repeats the update by the partial decomposition update means of each calculation node, the setting by the group inactive state setting means, the setting by the constraint condition active state setting means, the notification and acquisition by the synchronization means, the update by the constraint condition update means, the setting by the group active state setting means, and the setting by the constraint condition inactive state setting means.

A calculation node for which the group inactive state is set does not perform the update by the partial decomposition update means, and the constraint condition update means does not update a constraint parameter for which the constraint condition inactive state is set. For a calculation node in the group inactive state, the solutions of the subproblems in its subset are taken to be those last updated by the partial decomposition update means.

According to the third and fourth aspects, the problem of performing natural language analysis processing on a sequentially input character string is divided into a set of subproblems, from which groups are created. Based on the constraint condition that the solutions of a subproblem belonging to two or more groups match, each calculation node updates the solutions for its assigned subset of subproblems according to the active or inactive state set for its group. The constraint parameters are then updated, according to the active or inactive state set for each parameter, using the solutions updated at each calculation node. Repeating these steps until convergence suppresses the growth in the amount of calculation while allowing accurate natural language analysis processing for each sequentially input character string.

According to the third aspect, the partial decomposition update means may, for the subset of subproblems of the group assigned by the group creation means, update the Lagrange multipliers so as to optimize the value of a predetermined objective function, using the previously updated Lagrange multipliers, the solutions of the subproblems in the subset, and the constraint parameters. It may then use the updated Lagrange multipliers to update the solutions of the subproblems in the subset so as to optimize the value of the objective function. The constraint condition update means may update the constraint parameter of each subproblem so as to optimize the value of the objective function, based on the Lagrange multipliers and the subproblem solutions updated by the partial decomposition update means of each calculation node.

The program of the present invention is a program for causing a computer to function as each of the means constituting the natural language analysis processing apparatus according to claims 1 to 4.

As described above, the natural language analysis processing apparatus, method, and program of the present invention can perform natural language analysis processing accurately on sequentially input character strings while suppressing the growth in the amount of calculation.

A diagram showing an example of the difference between the "sequential processing method" and the "batch processing method".
A diagram for explaining how groups of minimal subproblems are created.
A schematic diagram showing the configuration of the natural language analysis processing apparatus according to the first embodiment of the present invention.
A flowchart showing the natural language analysis processing routine at a calculation node of the natural language analysis processing apparatus according to the first embodiment of the present invention.
A schematic diagram showing another example of the natural language analysis processing apparatus according to the first embodiment of the present invention.
A schematic diagram showing the configuration of the natural language analysis processing apparatus according to the second embodiment of the present invention.
A schematic diagram showing the configuration of the language analysis control device according to the second embodiment of the present invention.
A schematic diagram showing the configuration of the language analysis device according to the second embodiment of the present invention.
A diagram showing examples of word segmentation, part-of-speech tagging, named entity extraction, phrase (bunsetsu) segmentation, and dependencies between phrases.
A diagram showing the history-based sequential analysis method in the prior art.
A diagram showing an example of analysis based on dynamic programming in the prior art.

The outline of the invention is described in detail below.

<Outline of the invention>
The present invention realizes sequential analysis while preserving global optimality by using the dual decomposition with augmented Lagrangian relaxation (DD-ADMM) method, a methodology for solving integer programming problems proposed in Non-Patent Document 4 (Andre F. T. Martins, Mario A. T. Figueiredo, Pedro M. Q. Aguiar, Noah A. Smith, Eric P. Xing. An Augmented Lagrangian Approach to Constrained MAP Inference. Proc. of ICML, 2011.). The defining feature of the DD-ADMM method is that it solves the actual problem by dividing it into small groups. An important property that follows from this feature is that each group can be solved independently of the other groups, which makes the computation comparatively easy.

Moreover, because the groups can be solved independently, solving one group does not directly affect the analysis results of the other groups. A partial analysis result can therefore be obtained even without knowing the groups that future input will generate.

The present invention reduces the computational cost by reusing the previous processing results for sequential input. It further introduces an active/inactive state control unit for the groups, which controls, during the search for the optimal solution, whether each group needs to be computed at computational cost or whether its computation can be skipped.

As an embodiment of the present invention, consider a system in which a Japanese morphological analyzer, used in an online real-time Japanese document proofreading system, outputs analysis results sequentially as a person inputs a document.

What is input at one time in sequential input is the character string produced by one confirmed kana-kanji conversion. It is therefore at least one character and at most one whole document. Typically, a character string input at one time is about one to three words, or about one phrase (bunsetsu). Because its length is variable, the character string input at one time is called "one input unit" here. Processing proceeds as follows.

- Start state: Nothing has been input (initial state).
- Process 1: The user inputs one input unit to the system.
- Process 2: The system accepts the input unit and analyzes it together with the text input so far.
- Process 3: The obtained analysis result is output.
- Process 4: The system waits for the next input.
- (Processes 1 to 4 are then repeated.)
- End state: The user's input has ended (no explicit end declaration is required).
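The control flow of Processes 1 to 4 can be sketched in Python as follows. This is a minimal illustration only: `run_sequential_analysis` and `split_on_whitespace` are hypothetical names, the analyzer is a stand-in that splits on whitespace, and the sketch simply re-analyzes the accumulated text on every input, so it does not show the reuse of previous results that the invention aims at.

```python
from typing import Callable, Iterable, Iterator, List


def run_sequential_analysis(
    input_units: Iterable[str],
    analyze: Callable[[str], List[str]],
) -> Iterator[List[str]]:
    """Yield one analysis result per input unit (hypothetical helper)."""
    text_so_far = ""                  # start state: nothing has been input
    for unit in input_units:          # Process 1: one input unit arrives
        text_so_far += unit           # Process 2: combine with earlier text ...
        result = analyze(text_so_far)  # ... and analyze the whole text
        yield result                  # Process 3: output the analysis result
        # Process 4: the loop waits for the next input unit
    # end state: the input iterable is exhausted


def split_on_whitespace(text: str) -> List[str]:
    """Placeholder analyzer (an assumption, not the patent's analyzer)."""
    return text.split()


if __name__ == "__main__":
    for analysis in run_sequential_analysis(
        ["natural ", "language ", "analysis"], split_on_whitespace
    ):
        print(analysis)
```

Because the analyzer is passed in as a function, the same loop could drive any analysis step that accepts the text received so far.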

The present invention corresponds to the processing unit of Process 2 above, which presents the optimal natural language analysis result each time an input unit is input. The processing method corresponding to Process 2 is described below.

First, the natural language analysis processing apparatus based on "sequential processing", which is the subject of the present invention, is defined. For contrast with the "sequential processing method", the method that starts processing only after the entire input text has been input is called the "batch processing method". The document finally input is assumed to be exactly the same under both methods. The two methods differ only in how the input arrives.

The sequential processing method assumes that the input is divided over several points in time and arrives part by part, one input unit at each time. FIG. 1 shows the difference between the "sequential processing method" and the "batch processing method".

Input units are assumed to arrive until a final time T (0 ≤ t ≤ T). Let t be the current time, t − 1 the time one step earlier, and t = 0 the state in which nothing has been input (initial state).

Note that T = 1 means the entire input arrived at a single time, so the processing is the same as in the batch processing method. In other words, the sequential processing method is a framework that indirectly subsumes the batch processing method. If sequential processing is possible, batch situations can also be handled by it, which is an advantage of the method.

Conversely, handling sequential processing with the batch processing method is difficult as a methodology. A naive application is possible, but it amounts to rerunning the batch processing method each time an input unit arrives, which in general is computationally prohibitive.

As the comparison so far shows, what a sequential processing method requires is a way to process efficiently by making effective use of the processing results from the previous time. A conventional history-based method, for example, assumes that the analysis results up to the previous time are correct and uses them to process the input unit newly arrived at the current time.

The present invention is based on the solution method of batch processing. Even in an intermediate state where the whole input has not yet arrived, however, it obtains the optimal solution at that point, and for the input unit newly input at the current time it obtains a new solution efficiently by making good use of the processing results from the previous time. The batch processing method on which the invention is based is described next.

In the present invention, the minimum unit of processing is determined in advance. The minimum unit of processing is the smallest problem that can be defined when formulating the morphological analysis processing. Because this minimum-unit problem is a part of the problem to be solved, it is called a "minimal subproblem" here. The index of a minimal subproblem is denoted r, and the set of indices of all minimal subproblems is denoted R. The problem to be solved thus consists of |R| minimal subproblems. A minimal subproblem is one example of a subproblem.

Next, treating one input unit as a single input, morphological analysis is defined by a set of minimal subproblems. Let the solution space of these problems be a set Y of vectors, and let one output obtained for a given input be a vector z ∈ Y. The solution of a minimal subproblem then corresponds to one element z_i of the vector.

Next, in the present invention, groups are formed from several minimal subproblems, and the problem is expressed as a set of groups. Groups may share minimal subproblems. Groups can be formed in any way as long as two conditions hold. First, the groups together cover the entire problem to be solved. Second, each minimal subproblem belongs to more than one group. Each group is regarded as one part of the overall problem, so each group is solved as a single optimization problem.

Suppose that S groups have been created. Let Y_s denote the subspace of the solution space covered by the s-th group, so that Y_s ⊆ Y, and let z_s(r) denote the solution of the r-th minimal subproblem in the s-th group. FIG. 2 shows an example of group creation.
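The two grouping conditions can be checked mechanically. The following Python sketch is an illustration under assumptions, not the grouping scheme of the patent: the round-robin assignment in `make_overlapping_groups` and both function names are hypothetical, chosen only because they make every subproblem index fall into exactly two groups when S is 2 or more.

```python
from typing import List, Set


def make_overlapping_groups(num_subproblems: int, num_groups: int) -> List[Set[int]]:
    """Place subproblem r in groups (r mod S) and ((r + 1) mod S).

    Hypothetical assignment rule; with S >= 2 the two group numbers differ,
    so every subproblem belongs to exactly two groups.
    """
    groups: List[Set[int]] = [set() for _ in range(num_groups)]
    for r in range(num_subproblems):
        groups[r % num_groups].add(r)
        groups[(r + 1) % num_groups].add(r)
    return groups


def satisfies_group_conditions(groups: List[Set[int]], num_subproblems: int) -> bool:
    """Check coverage of all subproblems and membership in at least two groups."""
    membership = [0] * num_subproblems
    for group in groups:
        for r in group:
            membership[r] += 1
    covers_everything = all(count >= 1 for count in membership)
    shared_by_two = all(count >= 2 for count in membership)
    return covers_everything and shared_by_two


if __name__ == "__main__":
    groups = make_overlapping_groups(num_subproblems=6, num_groups=3)
    print(groups, satisfies_group_conditions(groups, 6))
```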

Let R_s be the set of indices of the minimal subproblems included in the s-th group. Next, let R'_s be a set satisfying R ⊆ ∪_{s=1}^{S} R'_s and R_s = R'_s ∩ R. This is an extension for cases where one wants to form groups that also include auxiliary subproblems, which are not minimal subproblems but are useful for solving the overall problem. That is, the indices in the set R'_s \ R are the indices of such auxiliary subproblems.

Let f_s be a function that scores the quality of the output obtained in the s-th group, where a larger value of f_s(z_s) is assumed to indicate a better output. The overall problem of natural language analysis processing can then be formulated as the optimization problem shown in equation (1) below.
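The display for equation (1) is not reproduced in this text. A form consistent with the surrounding definitions (score functions f_s, group solution spaces Y_s, and the agreement constraint z_s(r) = u(r)) is sketched below. It is a reconstruction under those assumptions, not the equation as printed in the patent.

```latex
\begin{equation*}
\max_{z,\,u}\ \sum_{s=1}^{S} f_s(z_s)
\quad \text{subject to} \quad
z_s \in Y_s, \qquad
z_s(r) = u(r) \quad \forall s \in \{1,\dots,S\},\ \forall r \in R_s
\tag{1}
\end{equation*}
```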

Here, z_s is the set of solutions of the minimal subproblems included in group s, and z is the full set of the z_s(r), each of which takes the value 0 or 1 and represents the solution of a minimal subproblem. In this optimization problem, each of the S groups independently selects the partial solution that appears best as its output z_s(r) for the minimal subproblems it handles, subject to the constraint that each group's partial solution agrees with the vector u(r). The key point of the formulation is the constraint z_s(r) = u(r): for every group s that contains the r-th minimal subproblem, the value z_s(r) must equal u(r). In other words, each group determines its solution z_s(r) independently, but the constraint restricts the solutions to those that are consistent as a whole. The objective function for optimization is obtained from this constraint and Lagrangian relaxation. Equation (1) asks for the set of minimal-subproblem solutions z_s(r) that maximizes the score defined by arbitrary score functions f_s, on the condition that the problem to be solved can be divided into S arbitrary groups. The solution is thus the one that maximizes the score given by the linear sum over the S groups.

In an actual morphological analysis problem, word segmentation, for example, can be regarded as an integer programming problem: each position between characters is a candidate split point, and the problem is to decide whether the text is split there (z_s(r) = 1) or not (z_s(r) = 0). Part-of-speech assignment is the problem of selecting one part of speech for each word from a predefined set. One minimal-subproblem solution z_s(r) is assigned to each part of speech, and the integer programming problem is solved under the constraint that exactly one of them is 1 and all the others are 0. Like part-of-speech assignment, estimating readings and base forms is the problem of selecting one of the candidate readings or base forms of each word. One minimal-subproblem solution z_s(r) is assigned to each candidate, and the problem is again solved under the constraint that exactly one of them is 1 and all the others are 0. The solution that best satisfies these segmentation and assignment problems as a whole is selected as the result of morphological analysis.
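The 0/1 encodings described above can be made concrete with a small Python sketch. The function names are hypothetical, and the sketch only illustrates the encoding itself (one binary value per gap between characters, and a one-hot vector for a choice among candidates), not how the analyzer scores or solves the problem.

```python
from typing import List


def words_to_split_vector(words: List[str]) -> List[int]:
    """Encode a segmentation as one 0/1 value per gap between adjacent characters.

    Assumes every word is non-empty. z[i] == 1 means the text is split
    between character i and character i + 1.
    """
    text_length = sum(len(word) for word in words)
    z = [0] * max(text_length - 1, 0)
    position = 0
    for word in words[:-1]:
        position += len(word)
        z[position - 1] = 1
    return z


def split_vector_to_words(text: str, z: List[int]) -> List[str]:
    """Decode a split vector back into the list of words."""
    words, start = [], 0
    for gap, is_split in enumerate(z):
        if is_split == 1:
            words.append(text[start:gap + 1])
            start = gap + 1
    words.append(text[start:])
    return words


def is_one_hot(z: List[int]) -> bool:
    """Constraint for tag selection: exactly one entry is 1, all others are 0."""
    return all(value in (0, 1) for value in z) and sum(z) == 1


if __name__ == "__main__":
    z = words_to_split_vector(["自然", "言語"])
    print(z, split_vector_to_words("自然言語", z))
```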

When the integer programming problem is actually solved, (augmented) Lagrangian relaxation is used to obtain the optimization objective function of equation (2) below.

Next, as described in Non-Patent Document 5 (Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 2011.), the augmented Lagrangian term (ρ/2)(z_s(r) − u(r))^2 is added, which transforms the problem into the quadratic form shown in equation (3) below and makes it easier to solve.

At the optimum, this coincides with the original problem. To simplify the subsequent derivation, let λ_s(r) = −ρα_s(r), which gives equation (4) below.
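The displays for equations (2) to (4) are not reproduced in this text. The sketch below gives one set of forms consistent with the surrounding description, offered as a reconstruction under assumptions rather than the equations as printed. In particular, the sign of the multiplier term is an assumption, chosen so that the later statement that u(r) equals the average of z_s(r) plus the average of α_s(r) follows from setting the derivative with respect to u(r) to zero.

```latex
\begin{align*}
L(z,u,\lambda)
  &= \sum_{s=1}^{S} f_s(z_s)
   + \sum_{s=1}^{S}\sum_{r \in R_s} \lambda_s(r)\,\bigl(z_s(r) - u(r)\bigr)
   \tag{2} \\
L_\rho(z,u,\lambda)
  &= \sum_{s=1}^{S} f_s(z_s)
   + \sum_{s=1}^{S}\sum_{r \in R_s} \lambda_s(r)\,\bigl(z_s(r) - u(r)\bigr)
   - \frac{\rho}{2}\sum_{s=1}^{S}\sum_{r \in R_s} \bigl(z_s(r) - u(r)\bigr)^2
   \tag{3} \\
L(z,u,\alpha)
  &= \sum_{s=1}^{S} f_s(z_s)
   - \rho\sum_{s=1}^{S}\sum_{r \in R_s} \alpha_s(r)\,\bigl(z_s(r) - u(r)\bigr)
   - \frac{\rho}{2}\sum_{s=1}^{S}\sum_{r \in R_s} \bigl(z_s(r) - u(r)\bigr)^2
   \tag{4}
\end{align*}
```

Equation (4) is equation (3) with the substitution λ_s(r) = −ρα_s(r). The quadratic term vanishes whenever the constraint z_s(r) = u(r) holds, which is why the augmented problem agrees with the original one at the optimum.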

The actual solution is obtained by the following iterative procedure.
- Initialization: Set the iteration counter k = 0.
- Process 1: Update the Lagrange multipliers α^k_s(r).
- Process 2: Estimate the solution z^k_s(r) of each minimal subproblem, independently for each minimal subproblem.
- Process 3: Update the constraint parameter u^k(r) (the average of the solutions of each minimal subproblem).
- Process 4: Test for convergence. If converged, set û(r) = u^k(r). Otherwise, set k = k + 1 and return to Process 1.
- Output: Output û.
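The iteration can be sketched in Python on a toy problem. Everything concrete here is an assumption made for illustration: the function name `dd_admm`, the scaled-multiplier updates in the style of standard consensus ADMM, linear scores θ_s(r), and a z-update that brute-forces a small explicit candidate set per group instead of using a real decoder. It is not the implementation described in the patent.

```python
from typing import List, Sequence, Tuple


def dd_admm(
    groups: Sequence[Sequence[int]],               # groups[s]: indices r in group s
    theta: Sequence[Sequence[float]],              # theta[s][i]: score of groups[s][i]
    feasible: Sequence[Sequence[Tuple[int, ...]]],  # feasible[s]: candidate 0/1 tuples
    num_subproblems: int,
    rho: float = 1.0,
    eps: float = 1e-9,
    max_iter: int = 100,
) -> List[float]:
    """Toy DD-ADMM loop; assumes every index r appears in at least one group."""
    num_groups = len(groups)
    z = [[0.0] * len(g) for g in groups]
    alpha = [[0.0] * len(g) for g in groups]
    u = [0.0] * num_subproblems
    for _ in range(max_iter):
        # Process 1: multiplier update, alpha <- alpha + (z - u)
        for s in range(num_groups):
            for i, r in enumerate(groups[s]):
                alpha[s][i] += z[s][i] - u[r]
        # Process 2: each group picks its best candidate independently
        for s in range(num_groups):
            def score(y: Tuple[int, ...], s: int = s) -> float:
                linear = sum(t * yi for t, yi in zip(theta[s], y))
                penalty = sum(
                    (yi - u[r] + a) ** 2
                    for yi, r, a in zip(y, groups[s], alpha[s])
                )
                return linear - 0.5 * rho * penalty
            z[s] = [float(v) for v in max(feasible[s], key=score)]
        # Process 3: u(r) = average over groups of z_s(r) + alpha_s(r)
        u_prev = u[:]
        total = [0.0] * num_subproblems
        count = [0] * num_subproblems
        for s in range(num_groups):
            for i, r in enumerate(groups[s]):
                total[r] += z[s][i] + alpha[s][i]
                count[r] += 1
        u = [total[r] / count[r] for r in range(num_subproblems)]
        # Process 4: stop when groups agree with u and u has stopped moving
        primal = sum(
            (z[s][i] - u[r]) ** 2
            for s in range(num_groups)
            for i, r in enumerate(groups[s])
        )
        dual = sum((a - b) ** 2 for a, b in zip(u, u_prev))
        if primal <= eps and dual <= eps:
            break
    return u


if __name__ == "__main__":
    one_hot = [(1, 0), (0, 1)]
    print(dd_admm([[0, 1], [0, 1]], [[2.0, 0.0], [0.0, 0.4]], [one_hot, one_hot], 2))
```

In this toy run, two groups share both subproblems and initially disagree (the first prefers (1, 0), the second (0, 1)). The multipliers pull them toward the choice with the higher total score, (1, 0).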

Processes 1 to 4 above are described in detail below.

In Process 1, the Lagrange multipliers α_s(r) of each calculation node are updated. With z and u fixed, the direction toward the optimal value of each α_s(r) is given by the partial derivative of the objective function L(z, u, α) with respect to α_s(r).

From the relationship in equation (5), the following update formula is obtained.

The update formula of equation (6) can be computed independently for each minimal subproblem.
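The displays for equations (5) and (6) are not reproduced in this text. Assuming the objective has the scaled form L(z, u, α) = Σ_s f_s(z_s) − ρ Σ_s Σ_r α_s(r)(z_s(r) − u(r)) − (ρ/2) Σ_s Σ_r (z_s(r) − u(r))^2, a plausible reconstruction is the following. The step size 1/ρ is also an assumption, taken from the standard scaled form of ADMM.

```latex
\begin{align*}
\frac{\partial L(z,u,\alpha)}{\partial \alpha_s(r)}
  &= -\rho\,\bigl(z_s(r) - u(r)\bigr)
  \tag{5} \\
\alpha_s^{k}(r)
  &= \alpha_s^{k-1}(r) + z_s^{k-1}(r) - u^{k-1}(r)
  \tag{6}
\end{align*}
```

Since z is chosen to maximize L, the multipliers move against the gradient in (5). With step size 1/ρ this gives α − (1/ρ)(−ρ(z − u)) = α + (z − u), which is (6). Each α_s(r) depends only on its own z_s(r) and u(r), so the update can run independently for each minimal subproblem.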

Process 2 estimates the solution z^k_s(r) of each minimal subproblem. At iteration k, with α and u fixed, the optimal z_s(r) is obtained by finding the z_s(r) that maximizes the objective function L(z, u, α).

ここで、ｆ_ｓは下記（８）式であると仮定する。 Here, it is assumed that f _s is the following equation (8).

ここで、上記（８）式のθ_ｓ（ｒ）は、グループｓでのｒに関するスコアを表わしており、この値は外部から与えられるものとする。つまり、この最適化の中では定数として扱われる。定義に従って、上記（４）式からｚに関係する項のみ取り出すと下記（９）式になる。 Here, θ _s (r) in the above equation (8) represents a score regarding r in the group s, and this value is given from the outside. In other words, it is treated as a constant in this optimization. According to the definition, when only the term related to z is extracted from the above equation (4), the following equation (9) is obtained.

上記（７）式の計算も各グループｓ毎に計算することが可能である。つまり、各グループｓ毎に、下記（１１）式の最大化問題を解けばよい。 The calculation of equation (7) can also be calculated for each group s. That is, the maximization problem of the following equation (11) may be solved for each group s.
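As a rough illustration of a per-group maximization, the sketch below enumerates the binary assignments of one group and keeps the best feasible one. Equation (11) is not reproduced in this text, so the objective (the linear score of equation (8) plus a quadratic penalty, as in augmented Lagrangian methods) is an assumption, and `best_group_solution`, `feasible`, and the "exactly one candidate selected" constraint in the usage note are hypothetical. Brute-force enumeration is only viable for very small groups.

```python
from itertools import product


def best_group_solution(theta_s, u, alpha_s, rho=1.0, feasible=lambda z: True):
    """Brute-force argmax over binary assignments of one group's variables.

    theta_s, alpha_s: dict r -> value for this group; u: dict r -> value.
    feasible: hypothetical predicate encoding the group's own constraints.
    Raises ValueError if no assignment is feasible.
    """
    rs = sorted(theta_s)

    def objective(bits):
        # ASSUMED form: linear score minus quadratic disagreement penalty.
        return sum(
            theta_s[r] * b - 0.5 * rho * (b - u[r] + alpha_s[r]) ** 2
            for r, b in zip(rs, bits)
        )

    candidates = [
        bits
        for bits in product((0, 1), repeat=len(rs))
        if feasible(dict(zip(rs, bits)))
    ]
    best = max(candidates, key=objective)
    return dict(zip(rs, best))
```

With scores `{"a": 2.0, "b": 1.0}`, all-zero `u` and `alpha_s`, and a constraint that exactly one variable is selected, the function picks `a`. Because each call touches only one group's variables, the calls for different groups can run independently, which is the property the text relies on.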

次に、処理３について説明すると、処理３として、制約条件パラメータｕ（ｒ）を更新する。ｚとαを固定したときｕの最適解は、目的関数Ｌ（ｚ、ｕ、α）に対して、ｕ（ｒ）に関する偏微分の値が０になる点である。その関係から下記（１２）式の関係式が得られる。 Next, processing 3 will be described. As processing 3, the constraint parameter u (r) is updated. The optimal solution for u when z and α are fixed is that the partial differential value for u (r) becomes 0 with respect to the objective function L (z, u, α). From the relationship, the following equation (12) is obtained.

つまり、ｕはｚ_ｓ（ｒ）の平均とα_ｓ（ｒ）の平均を足したものを意味し、制約条件パラメータｕ（ｒ）の更新には、すべてのα_ｓ（ｒ）とｚ_ｓ（ｒ）が必要である。 That is, u(r) is the sum of the average of z_s(r) and the average of α_s(r), so updating the constraint parameter u(r) requires all α_s(r) and z_s(r).
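Read literally, the verbal description above corresponds to a relation of the following shape. Equation (12) itself is not reproduced in this text, so this is a reconstruction from the prose only: the notation S(r) for the set of groups whose minimal subproblems share r is hypothetical, and any scaling of the α term (for example by the parameter ρ) cannot be recovered from the description.

```latex
u(r) \;=\; \frac{1}{|S(r)|} \sum_{s \in S(r)} z_s(r)
      \;+\; \frac{1}{|S(r)|} \sum_{s \in S(r)} \alpha_s(r)
```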

次に、処理４について説明すると、処理４として、処理３で得られたｕが最適値になっているか（収束しているか）判定する。二つの小さな正の実数ε_１、ε_２を与え、下記（１３）式及び（１４）式を満たした際に収束したと判定する（非特許文献５を参照）。 Next, the process 4 will be described. As the process 4, it is determined whether u obtained in the process 3 is an optimal value (has converged). Two small positive real numbers ε ₁ and ε ₂ are given, and it is determined that convergence has occurred when the following equations (13) and (14) are satisfied (see Non-Patent Document 5).

収束判定で、収束していなかった場合は、ｋ＝ｋ＋１として処理１に戻る。最適値に収束していると判定された場合は、繰り返し処理を終了し、解ｕを出力する。以上が一括処理の説明である。 If the convergence is not determined in the convergence determination, the process returns to process 1 with k = k + 1. If it is determined that it has converged to the optimum value, the iterative process is terminated and the solution u is output. The above is the description of the batch processing.

次に、上述の言語解析問題を整数計画問題で定式化し、双対分解＋拡張ラグランジュ緩和により問題を解く方法で、逐次処理を実現する方法について説明する。 Next, a description will be given of a method for realizing the sequential processing by formulating the above-mentioned language analysis problem by an integer programming problem and solving the problem by dual decomposition + extended Lagrangian relaxation.

上述の一括処理では、最適化変数はすべて０などで初期化して最適化を開始するが、本発明では、基本的に、逐次入力に効率的に対応するために、前時刻の処理結果を初期値とする。概要として、前時刻までの処理結果に関しては、現時刻で新しく入力された一入力単位の解析に影響を受けない部分に関しては、解析結果は前時刻で得られたものと同じものが得られ、現時刻で新しく入力された一入力単位と依存関係のある部分の解析結果は、改めて解析結果が評価される、という処理になる。このような方法により、前時刻の最適化の結果を効率的に再利用できるので、毎入力毎に最適化を行う方法と比べて実際の処理時間を大幅に削減できる。 In the batch processing described above, all optimization variables are initialized to 0 or the like before optimization starts. In the present invention, by contrast, the processing result at the previous time is basically used as the initial value in order to handle sequential input efficiently. In outline, for the parts of the result up to the previous time that are not affected by the analysis of the one input unit newly input at the current time, the same analysis result as at the previous time is obtained, while the parts that depend on the newly input unit are evaluated again. Because the result of the optimization at the previous time is reused efficiently in this way, the actual processing time can be reduced greatly compared with a method that performs the full optimization for every input.

このような効率的な処理は、双対分解＋ラグランジュ緩和法の「個別の部分問題は独立に解くことができる」という性質によって効果的に導入できる。一方、整数計画法のソルバーなどでは、このような性質をもたないため、部分的、効率的かつ逐次的に解を修正していく処理には適さない。上記（１）式を逐次処理用に拡張すると下記（１５）式のように示せる。 Such an efficient processing can be effectively introduced by the property of “the individual subproblem can be solved independently” of the dual decomposition + Lagrange relaxation method. On the other hand, integer programming solvers and the like do not have such a property, and are not suitable for processing a solution that is partially, efficiently, and sequentially corrected. When the above equation (1) is expanded for sequential processing, it can be expressed as the following equation (15).

時刻ｔまでに入力されたテキストで生成されるグループの総数をＳ_ｔと書く。また、最終時刻Ｔで最後まで入力がなされた場合に最終的に生成されるグループの総数をＳ_Ｔとする。ただし、これは、上記（１）式中のＳと等しい値Ｓ＝Ｓ_Ｔであり、逐次処理をしない場合に一括して入力を与えた際に得られるグループの総数と一致することとする。 Let S_t denote the total number of groups generated from the text input up to time t, and let S_T denote the total number of groups finally generated when the input has been completed at the final time T. Here S_T equals S in equation (1) above (S = S_T); that is, it matches the total number of groups obtained when the entire input is given at once without sequential processing.

このとき、時刻ｔに新たに生成されたグループのスコアの総和

の最大値（上記（１５）式の右辺第二項）を求めるだけの問題であれば、従来の履歴ベースの方法と等価である。つまり、履歴ベースの方法との差異は、時刻ｔ−１までに生成されたグループのスコアの最大値も同時に考慮しつつ、現在時刻ｔまでのすべてのグループのスコアの最大値を求めることである。ただし、特に工夫がなければ、これは一括処理方式を単に時刻毎に繰り返し実行することと等価になってしまう。そのため、本発明では、時刻ｔ−１までの処理結果をうまく利用することで、単に繰り返し実行するより大幅に計算量を少なくして逐次処理を可能とする。 If the problem were only to find the maximum (the second term on the right-hand side of equation (15) above) of the total score of the groups newly generated at time t, it would be equivalent to the conventional history-based method. In other words, the difference from the history-based method is that the maximum of the scores of all groups up to the current time t is found while also taking into account the maximum of the scores of the groups generated up to time t−1. Without further measures, however, this is equivalent to simply re-running the batch method at every time step. The present invention therefore makes good use of the processing results up to time t−1, enabling sequential processing with far less computation than plain repeated execution.

まず、最も単純な方法としては、時刻ｔ−１の最適解を得た状態を時刻ｔの初期値にして利用する方法が考えられる。時刻ｔ−１と時刻ｔの最適解が近ければ、この方法は直感的にうまく働くと考えられる。ただし、時刻変化によって変数や制約が増減しているので、単純に前時刻の最終状態を初期状態としても効果があるかは必ずしも自明ではない。また、最適化アルゴリズムによっては、初期値があまり効果的に働かない場合もある。 First, as the simplest method, a method of using the state obtained at the optimum solution at time t-1 as the initial value at time t can be considered. If the optimal solutions at time t-1 and time t are close, this method is considered to work intuitively and well. However, since variables and constraints are increased or decreased due to time changes, it is not always obvious whether the final state at the previous time is simply the initial state. Also, depending on the optimization algorithm, the initial value may not work very effectively.

しかし、本発明で利用する双対分解に基づく最適化アルゴリズムは、他の最適化アルゴリズムと比べ、各制約とその変数が他の制約やそこで扱われる変数と独立になるため、時刻ｔ−１から時刻ｔに変化した際に増減するグループの最適化変数ｚ_ｓ（ｒ）は他の変数との整合性等を考慮せず容易に追加／削除できるという利点がある。また、最適解を求めるアルゴリズムも変数を固定しての反復計算なので、徐々に最適解に近づくアルゴリズムである点も時刻ｔ−１の状態を利用する効果が大きいと期待できる。 However, in the dual-decomposition-based optimization algorithm used in the present invention, unlike other optimization algorithms, each constraint and its variables are independent of the other constraints and the variables they handle. This has the advantage that the optimization variables z_s(r) of groups that appear or disappear when the time changes from t−1 to t can easily be added or deleted without considering consistency with other variables. In addition, because the algorithm finds the optimal solution by iterating with some variables held fixed and thus approaches the optimum gradually, reusing the state at time t−1 can be expected to be highly effective.

そのため、手続きとしては、時刻ｔ−１から時刻ｔに変化し、整数計画問題Ｐ_ｔ−１から、整数計画問題Ｐ_ｔに移行した際に、まずＰ_ｔ−１との制約（部分問題）の観点での差分を取得する。このとき、追加に関しては、ｚ_ｓ（ｒ）＝０、α_ｓ（ｒ）＝０とし、新たにｒが追加されたのであればｕ（ｒ）＝０として、新たな最適化変数を生成する。逆に、削除の時は、ｓに関するすべてのｚ_ｓ（ｒ）、α_ｓ（ｒ）を削除し、削除したｚ_ｓ（ｒ）が最後のｕ（ｒ）に関する要素であれば、ｕ（ｒ）も削除する。これらの増減したグループ以外のグループおよび等式制約の変数は、Ｐ_ｔ−１の最終状態を引き継ぐものとする。 The procedure is therefore as follows. When the time changes from t−1 to t and the integer programming problem P_{t−1} is replaced by P_t, the difference from P_{t−1} in terms of constraints (subproblems) is obtained first. For an addition, new optimization variables are created with z_s(r) = 0 and α_s(r) = 0, and with u(r) = 0 if r itself is newly added. Conversely, for a deletion, all z_s(r) and α_s(r) belonging to s are deleted, and if a deleted z_s(r) was the last element associated with u(r), then u(r) is deleted as well. The variables of all groups and equality constraints other than those added or deleted inherit the final state of P_{t−1}.
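The add/delete/inherit rules above can be sketched as a small state-transfer function. This is a minimal sketch under assumed data structures: the nested dictionaries and the names `carry_over_state`, `prev`, and `new_problem` are hypothetical and do not come from the patent.

```python
def carry_over_state(prev, new_problem):
    """Build the initial state for P_t from the final state of P_{t-1}.

    prev: {"z": {s: {r: value}}, "alpha": {s: {r: value}}, "u": {r: value}}
    new_problem: {s: iterable of r} describing the groups of P_t.
    """
    z, alpha, u = {}, {}, {}
    for s, rs in new_problem.items():
        z[s], alpha[s] = {}, {}
        for r in rs:
            # Surviving variables inherit their final values from P_{t-1};
            # newly added z_s(r) and alpha_s(r) start at 0.
            z[s][r] = prev["z"].get(s, {}).get(r, 0)
            alpha[s][r] = prev["alpha"].get(s, {}).get(r, 0.0)
            # u(r) is inherited if r already existed, else starts at 0.
            u[r] = prev["u"].get(r, 0.0)
    # Groups absent from new_problem are simply not copied (deletion), and
    # u(r) disappears automatically once no remaining group refers to r.
    return {"z": z, "alpha": alpha, "u": u}
```

Because only the variables of added or deleted groups are touched, the cost of the transfer is proportional to the size of the problem description rather than to the number of optimization iterations already spent on P_{t−1}.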

非特許文献６（Terry Koo, Alexander M. Rush, Michael Collins, Tommi Jaakkola, and David Sontag. Dual decomposition for parsing with non-projective head automata. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 1288−1298, Cambridge, MA, October 2010. Association for Computational Linguistics.）及び非特許文献７（Andre Martins, Noah Smith, Mario Figueiredo, and Pedro Aguiar. Dual decomposition with many overlapping components. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 238−249. Association for Computational Linguistics, July 2011.）では、最適化中に多くの最適化変数が一定の値で固定する状態が続く点に着目し、そういった変数を最適化から外すことにより最適化速度を向上させる方法を採用している。本発明では、こういった方法を更に発展させ、より緻密な最適化変数の状態制御の方法を提案する。結果として、提案する状態制御法は、本発明が取り上げる動的変化整数計画問題系列のような、最適化変数や制約が動的に変化する状況でその効果を発揮する方法である。 Non-Patent Document 6 (Terry Koo, Alexander M. Rush, Michael Collins, Tommi Jaakkola, and David Sontag. Dual decomposition for parsing with non-projective head automata. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 1288-1298, Cambridge, MA, October 2010. Association for Computational Linguistics.) and Non-Patent Document 7 (Andre Martins, Noah Smith, Mario Figueiredo, and Pedro Aguiar. Dual decomposition with many overlapping components. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 238-249. Association for Computational Linguistics, July 2011.) focus on the fact that many optimization variables remain fixed at a constant value during optimization, and accelerate the optimization by removing such variables from it. The present invention develops these methods further and proposes a finer-grained method of controlling the state of optimization variables. As a result, the proposed state control method is effective in situations where optimization variables and constraints change dynamically, such as the dynamically changing sequence of integer programming problems addressed by the present invention.

まず、最適化中の不要な計算を削減する仕組みとして、各グループｓと等式制約の変数ｕ（ｒ）毎に、それぞれ「活性」「非活性」の二値の状態を持つフラグＦ_ｓとＦ_ｒを付与する。「活性」状態（Ｆ_ｓ＝１またはＦ_ｒ＝１）とは、ｓまたはｒは、通常の最適化処理を行うことを意味し、「非活性」状態（Ｆ_ｓ＝０またはＦ_ｒ＝０）とは、ｓまたはｒは、最適化処理の計算から除外される事を意味する。 First, as a mechanism for reducing unnecessary computation during optimization, flags F_s and F_r, each taking one of the two states "active" and "inactive", are attached to each group s and to each equality-constraint variable u(r), respectively. The "active" state (F_s = 1 or F_r = 1) means that s or r undergoes normal optimization processing, and the "inactive" state (F_s = 0 or F_r = 0) means that s or r is excluded from the optimization computation.

まず、上記（６）式〜（１４）式で表されるアルゴリズムは、上記（６）式〜（１１）式と、上記（１２）式〜（１４）式という二種類の計算ブロックで表現でき、それぞれ一つのｆоｒ文の形で書き直すことが出来る。また、上記（６）式〜（１１）式の処理では、ｕ（ｒ）を定数として計算するので、各グループｓ毎に独立な変数ｚ_ｓ（ｒ）とα_ｓ（ｒ）のみが変数となる計算となり、グループｓの単位で計算することが出来る。一方、上記（１２）式〜（１４）式の処理では、ｚ_ｓ（ｒ）は定数として計算するので、すべて等式制約に基づく変数ｕ（ｒ）の単位で計算することが出来る。Ｆ_ｓ、Ｆ_ｒのフラグは以下の条件で切り替わる。 First, the algorithm expressed by equations (6) to (14) above can be expressed as two calculation blocks, equations (6) to (11) and equations (12) to (14), each of which can be rewritten as a single for loop. In the processing of equations (6) to (11), u(r) is treated as a constant, so only the variables z_s(r) and α_s(r), which are independent for each group s, vary, and the computation can be carried out per group s. In the processing of equations (12) to (14), on the other hand, z_s(r) is treated as a constant, so the computation can be carried out entirely per equality-constraint variable u(r). The flags F_s and F_r switch under the following conditions.

・Ｆ_ｓ＝１ → Ｆ_ｓ＝０
対応するグループｓに含まれるすべての変数ｚ_ｓ（ｒ）がｚ^ｋ _ｓ（ｒ）＝ｚ^ｋ−１ _ｓ（ｒ）となったとき、自身のグループｓは収束したと仮定。以降、Ｆ_ｓ＝０が続く限り、グループｓに対する上記（６）式〜（１１）式で表される処理は行われない。
・F_s = 1 → F_s = 0
When every variable z_s(r) in the corresponding group s satisfies z^k_s(r) = z^{k−1}_s(r), the group s is assumed to have converged. Thereafter, as long as F_s = 0 holds, the processing of equations (6) to (11) above is not performed for group s.

・Ｆ_ｒ＝１ → Ｆ_ｒ＝０
対応するｒに対しｕ^ｋ（ｒ）＝ｕ^ｋ−１（ｒ）となったとき、自身の変数ｕ（ｒ）は収束したと仮定する。以降、Ｆ_ｒ＝０が続く限り、ｕ（ｒ）に対する上記（１２）式〜（１４）式の処理は行わない。ただし、収束判定用の値はキャッシュしておき、非活性状態が続いても、キャッシュした値を用いることで適切に収束判定用の値を計算する。
・F_r = 1 → F_r = 0
When u^k(r) = u^{k−1}(r) for the corresponding r, the variable u(r) is assumed to have converged. Thereafter, as long as F_r = 0 holds, the processing of equations (12) to (14) above is not performed for u(r). However, the values used for the convergence test are cached, so that even while the inactive state continues, the convergence test can still be computed correctly from the cached values.

上記２つの状態以降の条件は比較的緩めに設定してある。よって、かなりアグレッシブに本来収束していない場合も非活性状態へ移行する可能性がある。この問題を回避するために非活性状態から活性状態に戻す処理を下記に示す。 The conditions for the two state transitions above are set relatively loosely. A group or variable may therefore move to the inactive state quite aggressively, even when it has not actually converged. To avoid this problem, the processing that returns an element from the inactive state to the active state is shown below.

・Ｆ_ｓ＝０ → Ｆ_ｓ＝１
Ｒ_ｓ∋ｒとする。制約条件パラメータｕ^ｋ−１（ｒ）≠ｕ^ｋ（ｒ）のとき、つまりグループｓに含まれる極小部分問題の解ｚ_ｓ（ｒ）と共有関係にある制約条件パラメータｕ（ｒ）が更新されたことを意味する。つまり、グループの解が制約条件パラメータｕ（ｒ）に一致していなかったことを示しているので、再度最適値を推定するためにグループｓを活性状態に戻す。
・F_s = 0 → F_s = 1
Let r ∈ R_s. The condition u^{k−1}(r) ≠ u^k(r) means that the constraint parameter u(r) shared with the solution z_s(r) of a minimal subproblem in group s has been updated. This indicates that the group's solution did not agree with the constraint parameter u(r), so group s is returned to the active state in order to estimate the optimal value again.

・Ｆ_ｒ＝０ → Ｆ_ｒ＝１
あるグループｓの中のｚ_ｓ（ｒ）がｚ^ｋ−１ _ｓ（ｒ）≠ｚ^ｋ _ｓ（ｒ）のとき、すなわち、制約条件パラメータｕ（ｒ）の値が変化する可能性があるので、再度最適解を評価するために活性状態にする。
・F_r = 0 → F_r = 1
When z^{k−1}_s(r) ≠ z^k_s(r) for some z_s(r) in some group s, the value of the constraint parameter u(r) may change, so u(r) is set to the active state in order to evaluate the optimal solution again.
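The four transition rules can be sketched as two small functions, one run after the group-side update and one after the constraint-side update. This is a minimal sketch: the function names, the dictionary layout, and the `members` map (r to the groups that contain r) are hypothetical, and exact equality is used for the comparisons because that is how the rules are stated in the text.

```python
def first_flag_check(z_prev, z_new, F_s, F_r):
    """Run after the z update. Mutates F_s and F_r in place."""
    for s in z_new:
        changed = [r for r in z_new[s] if z_new[s][r] != z_prev[s][r]]
        if not changed:
            F_s[s] = 0      # F_s = 1 -> 0: the group is assumed converged
        for r in changed:
            F_r[r] = 1      # F_r = 0 -> 1: u(r) may change, re-evaluate it


def second_flag_check(u_prev, u_new, members, F_s, F_r):
    """Run after the u update. Mutates F_s and F_r in place."""
    for r in u_new:
        if u_new[r] == u_prev[r]:
            F_r[r] = 0      # F_r = 1 -> 0: u(r) is assumed converged
        else:
            for s in members[r]:
                F_s[s] = 1  # F_s = 0 -> 1: groups sharing r must re-solve
```

Note that each function deactivates only flags of its own category and activates only flags of the other category, which mirrors the design point that the state control should not get stuck.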

活性状態、非活性状態のポイントは、フラグを非活性化する際には、自分自身のフラグしか変更することができないという点、及び活性化する際には、自分とは別カテゴリのフラグしか変更することが出来ないという点である。これにより、状態制御がスタックすることがないように設計されている。 The point of active state and inactive state is that when you deactivate a flag, you can only change your own flag, and when you activate it, you change only a flag in a different category from your own. It is a point that cannot be done. This is designed so that the state control does not get stuck.

このように、非活性状態から活性状態へ変化させる処理を導入することで、動的変化整数計画問題系列の時刻変化による制約や最適化変数の増減に対応することが可能となる。つまり、容易に非活性状態から活性状態へ移行することが可能な枠組みを持つため、最適化変数の増減など周囲の状況が変化した場合でも、単純なフラグ管理で最小の計算で最適化を行うことができる。 Introducing the processing that changes the inactive state to the active state in this way makes it possible to handle the addition and removal of constraints and optimization variables caused by time changes in the dynamically changing sequence of integer programming problems. In other words, because the framework allows an easy transition from the inactive state to the active state, optimization can be carried out with simple flag management and minimal computation even when the surrounding conditions change, for example when optimization variables are added or removed.

＜第１の実施の形態＞
＜システム構成＞
本発明の第１の実施の形態に係る自然言語解析処理装置１００は、逐次与えられる自然言語解析処理の対象の一入力単位の文字列を逐次入力として受け取り、逐次入力された文字列を連結した入力文字列に対する自然言語解析処理の結果である日本語形態素解析の結果を逐次出力する。この自然言語解析処理装置１００は、ＣＰＵと、ＲＡＭと、後述する自然言語解析処理ルーチンを実行するためのプログラムを記憶したＲＯＭとを備えたコンピュータで構成され、機能的には次に示すように構成されている。図３に示すように、自然言語解析処理装置１００は、入力部１０と、演算部２０と、出力部５０とを備えている。 <First Embodiment>
<System configuration>
The natural language analysis processing apparatus 100 according to the first embodiment of the present invention receives, as sequential input, character strings of one input unit each that are the target of natural language analysis processing, and sequentially outputs the result of Japanese morphological analysis, which is the result of natural language analysis processing for the input character string formed by concatenating the character strings input so far. The natural language analysis processing apparatus 100 is configured by a computer including a CPU, a RAM, and a ROM that stores a program for executing a natural language analysis processing routine described later, and is functionally configured as follows. As shown in FIG. 3, the natural language analysis processing apparatus 100 includes an input unit 10, a calculation unit 20, and an output unit 50.

入力部１０は、逐次入力された一入力単位の文字列を逐次受け付ける。また、入力部１０は、人手により入力された計算ノード数Ｓとパラメータρを逐次受け付ける。 The input unit 10 sequentially receives a character string of one input unit that is sequentially input. Further, the input unit 10 sequentially receives the calculation node number S and the parameter ρ input manually.

演算部２０は、極小部分問題生成部２２と、極小部分問題記憶部２４と、グループ作成部２６と、Ｓ個の計算ノード３０_１〜３０_ｓと、パラメータ記憶部４６と、を備えている。なお、計算ノード３０_１〜３０_ｓのうちの任意の計算ノードを示す場合には、計算ノード３０と称することとする。 The computing unit 20 includes a minimal subproblem generator 22, a minimal subproblem storage unit 24, a group creation unit 26, S calculation nodes 30 ₁ to 30 _s, and a parameter storage unit 46. In addition, when showing arbitrary calculation nodes among calculation nodes 30 ₁ to 30 _s , they are referred to as calculation nodes 30.

極小部分問題生成部２２は、入力部１０により逐次入力された文字列を連結した入力文字列に対して言語解析処理を行う問題を分割して得られる予め定義された極小部分問題を、逐次入力された一入力単位の文字列毎に生成し、極小部分問題の集合を生成する。 The minimal subproblem generation unit 22 generates, for each sequentially input character string of one input unit, the predefined minimal subproblems obtained by dividing the problem of performing language analysis processing on the input character string formed by concatenating the character strings sequentially input through the input unit 10, and thereby generates a set of minimal subproblems.

極小部分問題記憶部２４は、極小部分問題生成部２２において逐次生成された極小部分問題の集合を記憶している。 The minimal subproblem storage unit 24 stores a set of minimal subproblems sequentially generated by the minimal subproblem generation unit 22.

グループ作成部２６は、極小部分問題生成部２２によって極小部分問題が逐次生成される毎に、極小部分問題記憶部２４に記憶されている極小部分問題の集合について、各極小部分問題が少なくとも２以上のグループに属するように、Ｓ個のグループを作成し、Ｓ個の計算ノード３０_１〜３０_ｓに割り当てる。また、グループ作成部２６は、入力されたパラメータρを、Ｓ個の計算ノード３０_１〜３０_Ｓの各々に通知する。なお、極小部分問題記憶部２４に記憶されている各極小部分問題の各々については、前回割り当てられたグループと同様のグループに所属するようにグループを作成する。 Each time minimal subproblems are generated by the minimal subproblem generation unit 22, the group creation unit 26 creates S groups from the set of minimal subproblems stored in the minimal subproblem storage unit 24 so that each minimal subproblem belongs to at least two groups, and assigns the groups to the S calculation nodes 30_1 to 30_S. The group creation unit 26 also notifies each of the S calculation nodes 30_1 to 30_S of the input parameter ρ. For each minimal subproblem already stored in the minimal subproblem storage unit 24, the groups are created so that it belongs to the same groups as those previously assigned.
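The two requirements on grouping stated above (every minimal subproblem belongs to at least two groups, and previously assigned subproblems keep their groups) can be illustrated with the sketch below. The text does not specify how groups are chosen, so the round-robin rule here is purely an assumption for illustration, and `assign_groups`, `previous`, and the subproblem labels are hypothetical.

```python
def assign_groups(subproblems, S, previous=None):
    """Return (assignment, groups).

    assignment: r -> tuple of two group indices; groups: list of S lists.
    ASSUMPTION: new subproblems are placed round-robin into two adjacent
    groups; the patent text does not prescribe this rule.
    """
    if S < 2:
        raise ValueError("each subproblem must belong to at least two groups")
    previous = previous or {}
    # Subproblems seen before keep the groups they were assigned last time.
    assignment = {r: previous[r] for r in subproblems if r in previous}
    next_idx = len(assignment)
    for r in subproblems:
        if r not in assignment:
            assignment[r] = (next_idx % S, (next_idx + 1) % S)
            next_idx += 1
    groups = [[] for _ in range(S)]
    for r, gs in assignment.items():
        for g in gs:
            groups[g].append(r)
    return assignment, groups
```

Calling the function again with an extended list of subproblems and the earlier `assignment` as `previous` leaves the existing memberships untouched and only places the new subproblems.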

Ｓ個の計算ノード３０_１〜３０_ｓの各々は、パラメータ更新部３２と、部分解更新部３４と、第１フラグ判定部３６と、同期部３８と、制約条件更新部４０と、第２フラグ判定部４２と、収束判定部４４と、を備えている。各Ｓ個のパラメータ更新部３２と、部分解更新部３４と、第１フラグ判定部３６と、同期部３８と、制約条件更新部４０と、第２フラグ判定部４２と、収束判定部４４と、が存在することになるが、同様の機能を有する処理部は同じ番号で表している。また、Ｓ個の計算ノード３０_１〜３０_ｓの各々は、極小部分問題生成部２２によって極小部分問題が逐次生成される毎に、パラメータ更新部３２、部分解更新部３４、第１フラグ判定部３６、同期部３８、制約条件更新部４０、第２フラグ判定部４２、及び収束判定部４４による各処理を行う。なお、第１フラグ判定部３６は、グループ非活性状態設定手段と、制約条件活性状態設定手段の一例であり、第２フラグ判定部４２は、グループ活性状態設定手段と、制約条件非活性状態設定手段の一例である。 Each of the S calculation nodes 30_1 to 30_S includes a parameter update unit 32, a partial solution update unit 34, a first flag determination unit 36, a synchronization unit 38, a constraint condition update unit 40, a second flag determination unit 42, and a convergence determination unit 44. There are thus S instances of each of these units, but processing units with the same function are denoted by the same number. Each time minimal subproblems are generated by the minimal subproblem generation unit 22, each of the S calculation nodes 30_1 to 30_S performs the processing of the parameter update unit 32, the partial solution update unit 34, the first flag determination unit 36, the synchronization unit 38, the constraint condition update unit 40, the second flag determination unit 42, and the convergence determination unit 44. The first flag determination unit 36 is an example of group inactive state setting means and constraint condition active state setting means, and the second flag determination unit 42 is an example of group active state setting means and constraint condition inactive state setting means.

パラメータ更新部３２は、グループ作成部２６によって割り当てられた自グループｓに含まれる各極小部分問題ｒに対するラグランジュ未定乗数α_ｓ（ｒ）を更新する。具体的には、各極小部分問題ｒに対して、上記（６）式に従って、ラグランジュ未定乗数α_ｓ（ｒ）を更新する。上記（６）式の更新式は、各計算ノード３０で独立に計算できるため、他の計算のノード３０と通信などを行う必要がない。なお、ラグランジュ未定乗数α_ｓ（ｒ）の初期値は、パラメータ記憶部４６に記憶されている前回の自然言語解析処理ルーチンの実行の結果により得られたラグランジュ未定乗数α_ｓ（ｒ）の値とする。自然言語解析処理ルーチンが一度も実行されていない場合には、ラグランジュ未定乗数α_ｓ（ｒ）の初期値は０とする。また、自グループｓのグループフラグＦ_ｓ＝０の場合、パラメータ更新部３２は、各極小部分問題ｒに対するラグランジュ未定乗数α_ｓ（ｒ）の更新を行わず、初期値、又は前回更新された値をラグランジュ未定乗数α_ｓ（ｒ）として保持する。 The parameter update unit 32 updates the Lagrange undetermined multiplier α_s(r) for each minimal subproblem r included in the own group s assigned by the group creation unit 26. Specifically, α_s(r) is updated for each minimal subproblem r according to equation (6) above. Because the update of equation (6) can be computed independently at each calculation node 30, no communication with the other calculation nodes 30 is needed. The initial value of α_s(r) is the value of α_s(r) obtained by the previous execution of the natural language analysis processing routine and stored in the parameter storage unit 46. If the routine has never been executed, the initial value of α_s(r) is 0. When the group flag of the own group s is F_s = 0, the parameter update unit 32 does not update α_s(r) for any minimal subproblem r and keeps the initial value or the previously updated value as α_s(r).

部分解更新部３４は、グループ作成部２６によって割り当てられた自グループｓに含まれる極小部分問題ｒの部分集合について、当該計算ノード３０_ｓにおける各極小部分問題の解ｚ_ｓ（ｒ）を更新する。具体的には、上記（１１）式に示す最大化問題を解くことにより各極小部分問題の解を更新する。なお、各極小部分問題の解ｚ_ｓ（ｒ）の初期値は、パラメータ記憶部４６に記憶されている前回の自然言語解析処理ルーチンの実行の結果により得られた各極小部分問題の解ｚ_ｓ（ｒ）の値とする。自然言語解析処理ルーチンが一度も実行されていない場合には、各極小部分問題の解ｚ_ｓ（ｒ）の初期値は０とする。また、自グループｓのグループフラグＦ_ｓ＝０の場合、部分解更新部３４は、各極小部分問題の解ｚ_ｓ（ｒ）の解の更新を行わず、初期値、又は前回更新された値を各極小部分問題の解ｚ_ｓ（ｒ）の値として保持する。 The partial solution update unit 34 updates the solution z_s(r) of each minimal subproblem at the calculation node 30_s for the subset of minimal subproblems r included in the own group s assigned by the group creation unit 26. Specifically, the solution of each minimal subproblem is updated by solving the maximization problem of equation (11) above. The initial value of z_s(r) is the value of z_s(r) obtained by the previous execution of the natural language analysis processing routine and stored in the parameter storage unit 46. If the routine has never been executed, the initial value of z_s(r) is 0. When the group flag of the own group s is F_s = 0, the partial solution update unit 34 does not update z_s(r) and keeps the initial value or the previously updated value as z_s(r).

第１フラグ判定部３６は、部分解更新部３４により得られた自グループｓに含まれる各極小部分問題の解ｚ^ｋ _ｓ（ｒ）が、前回得られた解ｚ^ｋ−１ _ｓ（ｒ）と一致するか否かを判定する。自グループｓに含まれる全極小部分問題についてｚ^ｋ _ｓ（ｒ）＝ｚ^ｋ−１ _ｓ（ｒ）の場合、計算ノード３０_ｓに割り当てられたグループｓは収束したとして、自グループｓのグループフラグＦ_ｓに非活性化を表す０の値を付与する。また、自グループｓに含まれる各極小部分問題ｒについて、ｚ^ｋ _ｓ（ｒ）≠ｚ^ｋ−１ _ｓ（ｒ）となる極小部分問題ｒが存在する場合、当該極小部分問題ｒの制約条件パラメータｕ（ｒ）の値が変化する可能性があるとして、ｕ（ｒ）に対する制約フラグＦ_ｒに活性化を表す１の値を付与する。 The first flag determination unit 36 determines whether the solution z^k_s(r) of each minimal subproblem in the own group s, obtained by the partial solution update unit 34, matches the previously obtained solution z^{k−1}_s(r). If z^k_s(r) = z^{k−1}_s(r) for all minimal subproblems in the own group s, the group s assigned to the calculation node 30_s is regarded as converged, and the group flag F_s of the own group s is set to 0, indicating deactivation. If, among the minimal subproblems r in the own group s, there is one with z^k_s(r) ≠ z^{k−1}_s(r), the value of the constraint parameter u(r) of that minimal subproblem may change, so the constraint flag F_r for u(r) is set to 1, indicating activation.

同期部３８は、当該計算ノード３０で今回更新され、又は保持されたラグランジュ未定乗数α_ｓ（ｒ）及び各極小部分問題の解ｚ_ｓ（ｒ）を、自分以外のすべての計算ノード３０_ｉへ通知する。また、同期部３８は、他の計算ノード３０_ｉすべてから通知された、今回更新され、または保持されたラグランジュ未定乗数α_ｓ（ｒ）及び各極小部分問題の解ｚ_ｓ（ｒ）を受け取る。この処理によって、個々の計算ノード３０はすべての計算ノード３０_ｓの持つラグランジュ未定乗数α_ｓ（ｒ）と各極小部分問題の解ｚ_ｓ（ｒ）の値を取得することができる。 The synchronization unit 38 sends the Lagrange undetermined multiplier α_s(r) and the solutions z_s(r) of the minimal subproblems that were updated or held at this calculation node 30 in the current iteration to all the other calculation nodes 30_i. The synchronization unit 38 also receives the Lagrange undetermined multipliers α_s(r) and the solutions z_s(r), updated or held in the current iteration, that are sent by all the other calculation nodes 30_i. Through this processing, each calculation node 30 obtains the values of α_s(r) and z_s(r) held by all calculation nodes 30_s.

制約条件更新部４０は、他の計算ノード３０_ｓすべてから受け取ったラグランジュ未定乗数α_ｓ（ｒ）と各極小部分問題の解ｚ_ｓ（ｒ）を用いて、上記（１２）式により、制約条件に従って極小部分問題の解を一致させるときの解を示す制約条件パラメータｕ（ｒ）を、全ての極小部分問題ｒの各々について更新する。なお、ｕ（ｒ）の初期値は、パラメータ記憶部４６に記憶されている前回の自然言語解析処理ルーチンの実行の結果により得られたｕ（ｒ）の値とする。自然言語解析処理ルーチンが一度も実行されていない場合には、制約条件パラメータｕ（ｒ）の初期値は０とする。また、制約フラグＦ_ｒ＝０の場合、制約条件更新部４０は、ｕ（ｒ）の更新を行わず、初期値、又は前回更新された値をｕ（ｒ）の値として保持する。 Using the Lagrange undetermined multipliers α_s(r) and the solutions z_s(r) of the minimal subproblems received from all the other calculation nodes 30_s, the constraint condition update unit 40 updates, by equation (12) above and for each of all the minimal subproblems r, the constraint parameter u(r), which represents the solution when the solutions of the minimal subproblems are made to agree under the constraint. The initial value of u(r) is the value of u(r) obtained by the previous execution of the natural language analysis processing routine and stored in the parameter storage unit 46. If the routine has never been executed, the initial value of u(r) is 0. When the constraint flag is F_r = 0, the constraint condition update unit 40 does not update u(r) and keeps the initial value or the previously updated value as u(r).

また、個々の計算ノード３０で独立にｕ（ｒ）を求めているが、得られるｕ（ｒ）はすべての計算ノード３０で一致する。処理方法としては、任意のひとつの計算ノード３０でｕ（ｒ）を計算し、そのあとに各計算ノード３０に通知するといった処理を行うようにしてもよい。ただし、その場合には、選択された計算ノード３０の計算が終了し、結果が通知されるまで、それ以外の計算ノードは待機する必要がある。本実施の形態では、個々の計算ノード３０で同じ計算を行う方式をとった場合を例に説明する。 In addition, although u (r) is obtained independently at each computation node 30, the obtained u (r) matches at all computation nodes 30. As a processing method, u (r) may be calculated by one arbitrary calculation node 30 and then notified to each calculation node 30. In this case, however, the other calculation nodes need to wait until the calculation of the selected calculation node 30 is completed and the result is notified. In the present embodiment, a case will be described as an example where a method of performing the same calculation in each calculation node 30 is taken.
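The replicated computation described above, in which every node exchanges its values and then computes the same u(r) locally, can be illustrated with a single-process simulation. This is only a sketch: `all_gather` and `update_u` are hypothetical names, the exchange is simulated with dictionaries rather than real inter-node communication, and the averaging rule follows the verbal description of equation (12), which is not reproduced in this text.

```python
def all_gather(local_states):
    """Simulate the synchronization step.

    local_states: node s -> (z_s, alpha_s), each a dict r -> value.
    Returns node s -> that node's copy of every node's (z, alpha).
    """
    return {
        s: {t: (dict(z), dict(a)) for t, (z, a) in local_states.items()}
        for s in local_states
    }


def update_u(view):
    """Compute u(r) from one node's view of all (z, alpha) pairs."""
    sums, counts = {}, {}
    for z, alpha in view.values():
        for r in z:
            sums[r] = sums.get(r, 0.0) + z[r] + alpha[r]
            counts[r] = counts.get(r, 0) + 1
    # Average of z_s(r) plus average of alpha_s(r) over the groups sharing r.
    return {r: sums[r] / counts[r] for r in sums}
```

Since every node applies the same deterministic function to the same gathered data, the nodes obtain identical u(r) without a second round of communication, which is the trade-off the text describes against computing u(r) on one node and broadcasting it.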

第２フラグ判定部４２は、全ての極小部分問題ｒの各々について、制約条件更新部４０により得られたｕ（ｒ）がｕ^ｋ−１（ｒ）と一致するか否かを判定する。ｕ^ｋ（ｒ）＝ｕ^ｋ−１（ｒ）の場合、自身の変数ｕ（ｒ）は収束したとして、制約フラグＦ_ｒに非活性化を表す０の値を付与する。また、ｕ^ｋ（ｒ）≠ｕ^ｋ−１（ｒ）の場合、制約式ｕ（ｒ）の値が更新されたとして、当該極小部分問題ｒを含むグループの解が制約条件パラメータｕ（ｒ）と一致しないことになるので、当該極小部分問題ｒを含むグループｓのグループフラグＦ_ｓに活性化を示す１の値を付与する。 For each of all the minimal subproblems r, the second flag determination unit 42 determines whether u^k(r) obtained by the constraint condition update unit 40 matches u^{k−1}(r). If u^k(r) = u^{k−1}(r), the variable u(r) is regarded as converged and the constraint flag F_r is set to 0, indicating deactivation. If u^k(r) ≠ u^{k−1}(r), the value of u(r) has been updated, which means that the solution of a group containing the minimal subproblem r does not agree with the constraint parameter u(r); the group flag F_s of each group s containing the minimal subproblem r is therefore set to 1, indicating activation.

収束判定部４４は、上記（１３）式、及び（１４）式を用いて、制約条件更新部４０で得られたパラメータｕが収束して最適値になっているか判定する。具体的には、二つの小さな正の実数ε_１、ε_２を与え、上記（１３）式、及び（１４）式を満たした場合に、ｕが最適値に収束したと判定する。 The convergence determination unit 44 uses equations (13) and (14) above to determine whether the parameter u obtained by the constraint condition update unit 40 has converged to the optimal value. Specifically, two small positive real numbers ε_1 and ε_2 are given, and u is determined to have converged to the optimal value when equations (13) and (14) above are satisfied.

収束判定で、最適値に収束していない場合は、ｋ＝ｋ＋１として、パラメータ更新部３２による処理に戻る。最適値に収束していると判定された場合は、繰り返し処理を終了する。 In the convergence determination, if it has not converged to the optimum value, k = k + 1 is set, and the process returns to the parameter updating unit 32. If it is determined that the value has converged to the optimum value, the iterative process is terminated.

この収束判定の処理もｕ（ｒ）と同様に任意のひとつの計算ノード３０で行い、その結果を全体に通知するようにしてもよい。しかし、同期処理が必要となるため、本実施の形態では、収束判定もすべて計算ノード３０で個別に行い、収束と判定されれば処理を終了する場合を例に説明する。この収束判定も、すべての計算ノードで結果が必ず一致するため、ここに判定を行っても結果は同じになる。 This convergence determination, like the computation of u(r), could be performed at any one calculation node 30 with the result notified to all nodes. That would require synchronization, however, so this embodiment is described using the case where every calculation node 30 performs the convergence determination individually and ends the processing when convergence is determined. Because the result of the determination is always the same at all calculation nodes, performing it individually yields the same result.

収束判定部４４は、ｕ（ｒ）が最適値に収束したと判定された場合、その時点で得られた各極小部分問題ｒの解ｕ（ｒ）を組み合わせて、形態素解析の結果を生成し、自然言語解析処理の結果として出力部５０により出力する。また、その時点で得られた各極小部分問題ｒの解ｕ（ｒ）、各計算ノード３０において更新されたグループｓに含まれる各極小部分問題ｒの解ｚ_ｓ（ｒ）、各計算ノード３０において更新されたグループｓのラグランジュ未定乗数α_ｓ（ｒ）の各々の値をパラメータ記憶部４６に記憶する。なお、本実施の形態では、任意の一つの計算ノード３０から、自然言語解析処理の結果が出力される場合を例に説明したが、全ての計算ノード３０から、自然言語解析処理の結果が出力されてもよい。 When u(r) is determined to have converged to the optimal value, the convergence determination unit 44 combines the solutions u(r) of the minimal subproblems r obtained at that point to generate the morphological analysis result, which is output by the output unit 50 as the result of the natural language analysis processing. It also stores in the parameter storage unit 46 the solution u(r) of each minimal subproblem r obtained at that point, the solution z_s(r) of each minimal subproblem r in group s updated at each calculation node 30, and the Lagrange undetermined multiplier α_s(r) of group s updated at each calculation node 30. Although this embodiment is described using the case where the result of the natural language analysis processing is output from any one calculation node 30, the result may instead be output from all the calculation nodes 30.

The parameter storage unit 46 stores the solution u(r) of each minimal subproblem r obtained at the computation node 30, the updated solution z_s(r) of each minimal subproblem r in group s, and the updated Lagrange multiplier α_s(r) of group s.

<Operation of the Natural Language Analysis Processing Device>
Next, the operation of the natural language analysis processing device 100 according to the first embodiment of the present invention is described. Each time a character string of one input unit to be analyzed is input to the device 100, the minimal subproblem generation unit 22 generates the minimal subproblems for analyzing that character string and adds them to the set of minimal subproblems stored in the minimal subproblem storage unit 24. Each time minimal subproblems are generated, the group creation unit 26 creates S groups from the set of minimal subproblems stored in the minimal subproblem storage unit 24 and assigns them to the S computation nodes 30_1 to 30_S.

Then, each time minimal subproblems are generated, each constraint parameter u^0(r) is initialized to the value that was obtained by the previous execution of the natural language analysis processing routine and stored in the parameter storage unit 46. If the routine has never been executed, each u^0(r) is initialized to 0. The constraint parameter u^0(r) of a newly generated minimal subproblem r is likewise initialized to 0. In addition, the group flag F_s of each group s is set to 1, representing the active state, and the constraint flag F_r of the constraint parameter u(r) of each minimal subproblem r is set to 1, also representing the active state. Each computation node 30 of the natural language analysis processing device 100 then executes the natural language analysis processing routine shown in FIG. 4. The following describes execution by the computation node 30_s.

First, in step S100, each Lagrange multiplier α_s^0(r) and each solution z_s^0(r) for the subset of minimal subproblems r assigned to the own group s is initialized to the value that was obtained by the previous execution of the natural language analysis processing routine and stored in the parameter storage unit 46. If the routine has never been executed, each α_s^0(r) and each z_s^0(r) is initialized to 0. For a newly generated minimal subproblem r, α_s^0(r) and z_s^0(r) are likewise initialized to 0.
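The warm-start initialization described for u^0(r), α_s^0(r), and z_s^0(r) can be sketched as follows. This is only a minimal illustration: the function name `warm_start` and the dictionary used in place of the parameter storage unit 46 are hypothetical and do not come from the specification.

```python
def warm_start(subproblem_ids, stored):
    """Return initial values for the given minimal subproblems.

    `stored` is a hypothetical stand-in for the parameter storage unit 46:
    a dict mapping a subproblem id r to the value kept from the previous run.
    Subproblems with no stored value (first run, or newly generated) start at 0.
    """
    return {r: stored.get(r, 0.0) for r in subproblem_ids}


# Previous run covered r1 and r2; r3 was generated from the newest input unit.
previous_u = {"r1": 1.0, "r2": 0.0}
u0 = warm_start(["r1", "r2", "r3"], previous_u)
print(u0)
```

The same helper would apply unchanged to the multipliers and the per-group solutions, since all three follow the same rule of reusing stored values and defaulting to 0.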

Next, in step S102, the variable k, which counts iterations, is initialized to 1.

Next, in step S104, it is determined whether the group flag F_s of the own group s is 0. If F_s is 0, the process proceeds to step S118; if F_s is not 0 (that is, if it is 1), the process proceeds to step S106.

Next, in step S106, the parameter update unit 32 updates the Lagrange multiplier α_s^k(r) for each minimal subproblem r in the own group s according to equation (6) above. The update is based either on the values set in step S100, namely α_s^0(r), z_s^0(r), and u^0(r), or on the previously updated values α_s^{k-1}(r), z_s^{k-1}(r), and u^{k-1}(r).
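As a rough illustration of steps S104 and S106, the sketch below updates the multipliers of one group and skips the update when the group is inactive. Equation (6) is not reproduced in this passage, so the step used here, α + ρ(z − u), is an assumption based on the usual augmented Lagrangian form and may differ from the actual equation. The function and argument names are hypothetical.

```python
def update_multipliers(alpha_prev, z_prev, u_prev, rho, group_active):
    """One step-S106-style update for a single group s.

    Assumption: the multiplier step has the usual augmented-Lagrangian form
    alpha + rho * (z - u); the patent's own equation (6) may differ.
    When the group is inactive (flag 0), the previous values are kept as-is.
    """
    if not group_active:
        return dict(alpha_prev)
    return {r: alpha_prev[r] + rho * (z_prev[r] - u_prev[r]) for r in alpha_prev}


alpha = update_multipliers(
    alpha_prev={"r1": 0.0, "r2": 0.5},
    z_prev={"r1": 1.0, "r2": 0.0},
    u_prev={"r1": 0.0, "r2": 0.0},
    rho=0.5,
    group_active=True,
)
print(alpha)
```

With an inactive group the function returns a copy of the previous multipliers, which mirrors the jump from step S104 directly to step S118.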

Next, in step S108, the partial solution update unit 34 updates the solution z_s^k(r) for each minimal subproblem r in the own group s according to equation (11) above. The update is based on the Lagrange multiplier α_s^k(r) updated in step S106 and either the solution z_s^0(r) set in step S100 or the previously updated solution z_s^{k-1}(r).

Next, in step S110, the first flag determination unit 36 determines, based on the solutions z_s^k(r) of the own group s updated in step S108, whether the solution z_s^k(r) of every minimal subproblem r in the own group s equals the previously updated value z_s^{k-1}(r). If all values are equal, the process proceeds to step S112; otherwise, it proceeds to step S114.

Next, in step S112, the group flag F_s of the own group s is set to 0, representing the inactive state.

Next, in step S114, the first flag determination unit 36 determines, based on the solutions z_s^k(r) of the own group s updated in step S108, whether any minimal subproblem r in the own group s has a solution z_s^k(r) that differs from the previously updated value z_s^{k-1}(r). If such a solution exists, the process proceeds to step S116; otherwise, it proceeds to step S118.

Next, in step S116, for each minimal subproblem r of the own group s whose solution z_s^k(r) was found in step S114 to differ from the previously updated solution, the constraint flag F_r of the constraint parameter u(r) is set to 1, representing the active state.
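The flag handling in steps S110 to S116 can be summarized in a few lines. The sketch below is illustrative only; the function name, the dictionary representation of the solutions, and the return convention are assumptions made for the example.

```python
def first_flag_determination(z_prev, z_new, constraint_flags):
    """Sketch of steps S110 to S116 for one group s.

    Returns the new group flag (0 = inactive, 1 = active) and an updated copy
    of the constraint flags F_r. All names here are illustrative.
    """
    changed = [r for r in z_new if z_new[r] != z_prev[r]]
    flags = dict(constraint_flags)
    if not changed:
        return 0, flags          # S112: every solution is unchanged, deactivate the group
    for r in changed:
        flags[r] = 1             # S116: reactivate u(r) for each changed solution
    return 1, flags


unchanged = first_flag_determination({"r1": 1, "r2": 0}, {"r1": 1, "r2": 0}, {"r1": 0, "r2": 0})
moved = first_flag_determination({"r1": 1, "r2": 0}, {"r1": 1, "r2": 1}, {"r1": 0, "r2": 0})
print(unchanged, moved)
```

A group whose solutions all stay the same is switched off, while any changed solution switches the corresponding constraint parameter back on so that it is recomputed in the later constraint update.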

Next, in step S118, the synchronization unit 38 notifies the other computation nodes 30 of the Lagrange multipliers α_s^k(r) of the own group s updated in step S106 and the solutions z_s^k(r) of the own group s updated in step S108. It also obtains the updated Lagrange multipliers α_i^k(r) and solutions z_i^k(r) from all other computation nodes 30_i (i = 1, ..., s-1, s+1, ..., S). If steps S106 and S108 were skipped because the own group s is inactive, the values α_s^{k-1}(r) and z_s^{k-1}(r) held at that point are notified to the other computation nodes 30 as the k-th values α_s^k(r) and z_s^k(r).
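The exchange in step S118 amounts to an all-to-all gather of each group's multipliers and solutions. The sketch below simulates that exchange inside a single process; it is a hypothetical stand-in, since the specification does not state how the nodes communicate, and a real system would use some form of message passing between nodes.

```python
def synchronize(local_state):
    """Simulate step S118 in a single process.

    `local_state[s]` holds the (alpha, z) pair that node s currently has,
    whether freshly updated or carried over because the group was inactive.
    After the exchange, every node sees every group's pair.
    """
    gathered = {s: (dict(alpha), dict(z)) for s, (alpha, z) in local_state.items()}
    return {s: gathered for s in local_state}


views = synchronize({
    1: ({"r1": 0.5}, {"r1": 1}),
    2: ({"r1": -0.5}, {"r1": 0}),
})
print(views[1][2])
```

Because every node ends up with the same view, later steps that depend only on this shared data produce identical results on all nodes.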

Next, in step S120, the constraint update unit 40 updates, according to equation (12) above, only those constraint parameters u^k(r) whose constraint flag F_r is 1. The update uses the Lagrange multipliers α_s^k(r) and solutions z_s^k(r) of the own group s (either as updated in steps S106 and S108 or as notified to the other computation nodes 30 in step S118), together with the Lagrange multipliers α_i^k(r) and solutions z_i^k(r) obtained from all other computation nodes 30_i in step S118. For a constraint parameter u(r) whose constraint flag F_r is 0, the value u^{k-1}(r) held at that point is used as the k-th constraint parameter u^k(r).

Next, in step S122, the second flag determination unit 42 determines, for each of the minimal subproblems r, whether the constraint parameter u^k(r) equals the previously updated constraint parameter u^{k-1}(r). For each minimal subproblem r whose u^k(r) is unchanged, the processing of step S124 described below is executed; for each minimal subproblem r whose u^k(r) has changed, the processing of step S126 described below is executed.

Next, in step S124, the constraint flag F_r of each constraint parameter u^k(r) that equals the previously updated constraint parameter u^{k-1}(r) is set to 0, representing the inactive state.

Next, in step S126, the second flag determination unit 42 sets to 1, representing the active state, the group flag F_s of every group s that contains at least one minimal subproblem r whose constraint parameter u^k(r) differs from the previously updated constraint parameter u^{k-1}(r).
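Steps S120 to S126 can be sketched together as one function. Equation (12) is not reproduced in this passage, so the update rule below (averaging z_s(r) + α_s(r)/ρ over the groups that contain r) is an assumption borrowed from common consensus formulations and may differ from the actual equation. All names are hypothetical, and each subproblem is assumed to belong to at least one group in `members`.

```python
def update_constraints(u_prev, gathered, members, constraint_flags, rho):
    """Sketch of steps S120 to S126.

    Assumption: u(r) is set to the average of z_s(r) + alpha_s(r) / rho over
    the groups s that contain r, a common consensus rule; the patent's
    equation (12) may differ. `members[s]` is the set of subproblem ids in s,
    and `gathered[s]` is the (alpha, z) pair of group s after synchronization.
    """
    u_new, flags, wake = dict(u_prev), dict(constraint_flags), set()
    for r in u_prev:
        if flags[r] == 0:
            continue                      # S120: inactive u(r) keeps its previous value
        owners = [s for s in members if r in members[s]]
        u_new[r] = sum(gathered[s][1][r] + gathered[s][0][r] / rho for s in owners) / len(owners)
        if u_new[r] == u_prev[r]:
            flags[r] = 0                  # S124: unchanged, so deactivate F_r
        else:
            wake.update(owners)           # S126: changed, so activate every group containing r
    return u_new, flags, wake


members = {1: {"r1", "r2"}, 2: {"r1", "r2"}}
gathered = {
    1: ({"r1": 0.0, "r2": 0.0}, {"r1": 1.0, "r2": 0.0}),
    2: ({"r1": 0.0, "r2": 0.0}, {"r1": 1.0, "r2": 0.0}),
}
result = update_constraints({"r1": 0.0, "r2": 0.0}, gathered, members, {"r1": 1, "r2": 1}, rho=1.0)
print(result)
```

The returned set lists the groups whose group flag would be set back to 1, which is how a change in a shared constraint parameter reactivates groups that had previously been skipped.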

Next, in step S128, it is determined according to equations (13) and (14) above whether all of the constraint parameters u^k(r) have converged to their optimal values. The determination is based on the solutions z_s^k(r) of all minimal subproblems r, as updated or notified, and on all constraint parameters u^k(r), as updated or held. If equations (13) and (14) are not satisfied, it is determined that convergence has not been reached, the process proceeds to step S132, the variable k is incremented by 1, and the process returns to step S104. If equations (13) and (14) are satisfied, it is determined that convergence has been reached, and the process proceeds to step S132.
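A convergence test of the kind performed in step S128 might look like the sketch below. Equations (13) and (14) are not reproduced in this passage, so the two conditions used here (every group's solution agrees with the shared value, and the shared value did not change in this iteration) are assumptions chosen for illustration. The function name and data layout are hypothetical.

```python
def has_converged(u_new, u_prev, gathered, members):
    """Illustrative convergence test for step S128.

    Assumption: convergence means (a) every group's solution z_s(r) agrees
    with the shared value u(r) and (b) u did not change in this iteration.
    The patent's equations (13) and (14) are not reproduced here and may differ.
    `gathered[s]` is the (alpha, z) pair of group s; `members[s]` lists its ids.
    """
    agree = all(gathered[s][1][r] == u_new[r] for s in members for r in members[s])
    stable = all(u_new[r] == u_prev[r] for r in u_new)
    return agree and stable


members = {1: {"r1"}, 2: {"r1"}}
agreeing = {1: ({"r1": 0.0}, {"r1": 1.0}), 2: ({"r1": 0.0}, {"r1": 1.0})}
split = {1: ({"r1": 0.0}, {"r1": 1.0}), 2: ({"r1": 0.0}, {"r1": 0.0})}
print(has_converged({"r1": 1.0}, {"r1": 1.0}, agreeing, members))
print(has_converged({"r1": 1.0}, {"r1": 1.0}, split, members))
```

Since the test reads only data that every node holds after synchronization, each node can evaluate it locally and reach the same verdict.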

Next, in step S132, the Lagrange multipliers α_s^k(r) of the own group s notified and obtained in step S118, the solutions z_s^k(r) of the minimal subproblems r, and all constraint parameters u^k(r) finally updated or held in step S120 are stored in the parameter storage unit 46. The result of the natural language analysis processing is then generated from the constraint parameters u^k(r) finally updated or held in step S120 and output by the output unit 50, and the natural language analysis processing routine ends.

As described above, the natural language analysis processing device according to the first embodiment divides the problem of analyzing sequentially input character strings into a set of subproblems and creates groups from them. Under the constraint that the solutions of a subproblem belonging to two or more groups must agree, each computation node updates the solutions of its assigned subset of subproblems, depending on whether its group is set to the inactive state. It then updates the constraint parameters using the subproblem solutions obtained from the other computation nodes, depending on whether each constraint parameter is set to the inactive state. Repeating these updates until convergence suppresses the growth in computation and allows sequentially input character strings to be analyzed accurately.

Natural language analysis is generally formulated as a discrete optimization problem, and obtaining its optimal solution is equivalent to obtaining the analysis result. On that premise, solving the natural language analysis problem as an integer programming problem using dual decomposition combined with the augmented Lagrangian relaxation method allows the problem to be decomposed into subproblems that are solved independently. Active/inactive control is realized by exploiting this property.

Changing a group from the active state to the inactive state makes it possible to skip the processing of groups whose values no longer change during optimization, which greatly reduces the amount of computation. Conversely, changing from the inactive state to the active state handles the case in which a new input at a later time causes the optimal solution to differ from that of the previous time: values that were not being updated while inactive can be updated again once they are made active. This active/inactive control thus makes it possible to find the optimal solution with the minimum necessary computation and to follow changes in the optimal solution over time at minimal computational cost.

Even in an environment where input arrives sequentially, such as streaming input, processing can continue while presenting the optimal solution for the input received so far. This provides a foundation for real-time information processing systems such as simultaneous interpretation. Because analysis accuracy is considerably higher than with history-based methods, the performance of applications that rely on natural language processing can be raised as well. Reusing previous results together with active/inactive control removes computation that would be redundant in sequential processing and makes that processing efficient. Moreover, because the solution obtained is the optimal solution for the text input so far, the best analysis result available at each point in time is obtained sequentially.

Sequential analysis also allows optimization when only some of the optimization variables of the actual problem are known, yielding the optimal solution for that partial set of variables. In a setting where the optimization variables of the integer programming problem become available gradually over time, the optimal solution of the previous time can be reused to obtain the optimal solution of the next time efficiently. The optimal solution finally obtained for the whole problem is essentially the same whether the optimization variables are added sequentially or the optimization problem is solved all at once in the usual way, so the analysis accuracy is equivalent to that of batch processing.

Although the above embodiment is described for the case in which each computation node 30 includes the constraint update unit 40, the second flag determination unit 42, and the convergence determination unit 44, the invention is not limited to this. For example, as shown in FIG. 5, each computation node 30 may include the parameter update unit 32, the partial solution update unit 34, and the first flag determination unit 36, while the calculation unit 20 includes one each of the constraint update unit 40, the second flag determination unit 42, and the convergence determination unit 44. In this case, the constraint update unit 40 updates the constraint parameters u^k(r) using the Lagrange multipliers α_s^k(r) and solutions z_s^k(r) obtained at all computation nodes 30_s. The second flag determination unit 42 assigns values representing the active or inactive state to the relevant flags F_r and F_s according to whether each u^k(r) equals the previously updated or held value u^{k-1}(r). The convergence determination unit 44 determines whether the constraint parameters u^k(r) have converged to their optimal values. If convergence has not been reached, the constraint parameters u^k(r) are notified to each computation node 30 and processing returns to the parameter update unit 32.

<Second Embodiment>
Next, a second embodiment is described. Parts configured in the same way as in the first embodiment are given the same reference numerals, and their description is omitted.

The second embodiment differs from the first in that the parameter updates are performed by distributed parallel computation across a plurality of language analysis devices connected by a network.

As shown in FIG. 6, the natural language analysis processing system 200 according to the second embodiment includes a language analysis control device 201 and S language analysis devices 202_1 to 202_S, which are connected to one another via a network 203. An arbitrary one of the language analysis devices 202_1 to 202_S is referred to simply as the language analysis device 202.

As shown in FIG. 7, the language analysis control device 201 includes an input unit 10, a calculation unit 220, and an output unit 230.

The calculation unit 220 includes the minimal subproblem generation unit 22, the minimal subproblem storage unit 24, and the group creation unit 26.

Each time the minimal subproblem generation unit 22 generates minimal subproblems, the group creation unit 26 creates S groups from the set of minimal subproblems stored in the minimal subproblem storage unit 24 such that each minimal subproblem belongs to at least two groups, and transmits the groups to the S language analysis devices 202_1 to 202_S via the network 203. The group creation unit 26 also transmits the input parameter ρ to each of the S language analysis devices 202_1 to 202_S via the network 203. For the minimal subproblems already stored in the minimal subproblem storage unit 24, the groups are created so that the group configuration is the same as in the previous assignment.
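One way to satisfy the two requirements stated here (membership in at least two groups, and an unchanged configuration for subproblems that were assigned earlier) is sketched below. The round-robin rule for new subproblems, the zero-based group indices, and all names are assumptions made for the example; the specification does not prescribe a particular assignment rule.

```python
def assign_groups(all_ids, previous, num_groups):
    """Assign each minimal subproblem to two of `num_groups` groups.

    `previous` maps already-known ids to their earlier group pair, which is
    kept unchanged. New ids get a round-robin pair; this particular rule is an
    assumption, since the text only requires membership in at least two groups.
    """
    if num_groups < 2:
        raise ValueError("need at least two groups")
    assignment = dict(previous)
    next_slot = len(assignment)
    for r in all_ids:
        if r not in assignment:
            first = next_slot % num_groups
            assignment[r] = (first, (first + 1) % num_groups)
            next_slot += 1
    return assignment


groups = assign_groups(["r1", "r2", "r3"], {"r1": (0, 1)}, num_groups=3)
print(groups)
```

Keeping earlier pairs fixed means the values stored from the previous run remain attached to the same groups, which is what allows them to be reused as initial values.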

As shown in FIG. 8, each of the S language analysis devices 202_1 to 202_S includes an input unit 240, a calculation unit 250, and an output unit 260.

The input unit 240 receives the subset of minimal subproblems in the own group s transmitted from the language analysis control device 201. It also receives information transmitted from the other language analysis devices 202 via the network 203.

The calculation unit 250 includes the parameter update unit 32, the partial solution update unit 34, the first flag determination unit 36, the synchronization unit 38, the constraint update unit 40, the second flag determination unit 42, the convergence determination unit 44, and the parameter storage unit 46.

The parameter update unit 32 updates the Lagrange multiplier α_s(r) for each minimal subproblem r in the own group s transmitted to the language analysis device 202. When the group flag F_s of the own group s is 0, the parameter update unit 32 does not update α_s(r) and instead keeps the initial value or the previously updated value as α_s(r).

The partial solution update unit 34 updates the solution z_s(r) of each minimal subproblem in the subset of minimal subproblems r in the own group s transmitted to the language analysis device 202. When the group flag F_s of the own group s is 0, the partial solution update unit 34 does not update z_s(r) and instead keeps the initial value or the previously updated value as z_s(r).

The synchronization unit 38 transmits α_s(r) and z_s(r), as updated or held in the current iteration by the language analysis device 202_s, to all other language analysis devices 202 via the network 203. It also receives the Lagrange multipliers α_i(r) and solutions z_i(r), as updated or held in the current iteration, from all other language analysis devices 202_i. Through this exchange, each language analysis device 202 obtains the values of α_s(r) and z_s(r) held by all language analysis devices 202_s.

The constraint update unit 40 updates the constraint parameter u(r) for each minimal subproblem r according to equation (12) above, using the Lagrange multipliers α_i(r) and solutions z_i(r) received from all other language analysis devices 202_i.

The convergence determination unit 44 determines whether the constraint parameters u(r) have converged to their optimal values. When convergence is determined, it combines the constraint parameters u(r) obtained at that point to generate the result of the natural language analysis processing, which the output unit 260 transmits to the language analysis control device 201. It also stores in the parameter storage unit 46 the solutions u(r) of all minimal subproblems r obtained at that point, the solution z_s(r) of each minimal subproblem r in the own group s, and the Lagrange multipliers α_s(r) of the own group s.

<Operation of the Natural Language Analysis Processing System>
Next, the operation of the natural language analysis processing system 200 according to the second embodiment is described. When a character string of one input unit to be analyzed is input to the language analysis control device 201, the language analysis control device 201 decomposes the problem of analyzing that string into minimal subproblems and generates a set of minimal subproblems. The group creation unit 26 then creates S groups from the union of the newly generated set and the set of minimal subproblems stored in the minimal subproblem storage unit 24, and transmits the groups via the network 203 to the S language analysis devices 202, thereby assigning one group to each device.

Each language analysis device 202 then executes the natural language analysis processing routine shown in FIG. 4.

The result of the natural language analysis processing, generated by at least one language analysis device 202 by combining the finally updated constraint parameters u(r), is transmitted to the language analysis control device 201 via the network 203. The language analysis control device 201 outputs the result received from the language analysis device 202 through the output unit 230.

As described above, in the natural language analysis processing system according to the second embodiment, a plurality of language analysis devices connected via a network perform the natural language analysis processing by distributed parallel processing, which speeds up the processing.

The present invention is not limited to the embodiments described above, and various modifications and applications are possible without departing from the gist of the invention.

In the above embodiments, a case has been described in which, each time a character string of one input unit is input, a new problem is generated sequentially and added to the existing minimal subproblems; however, the present invention is not limited to this. A new minimal subproblem may instead be created by modifying the most recently created minimal subproblem using the newly input character string of one input unit.

In the above embodiments, a case has also been described in which each computation node sets the activation of group flags and the deactivation of constraint flags for all groups and all constraint parameters; however, the present invention is not limited to this. Each computation node may instead set activation only for the group flag of its own group, and set deactivation only for the constraint parameters related to the minimal subproblems belonging to its own group.

In the above embodiments, a case has also been described in which each computation node updates all of the constraint parameters; however, the present invention is not limited to this. Each computation node may instead update only the constraint parameters related to the minimal subproblems belonging to its own group.
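The interplay of group flags and constraint flags discussed in these modifications can be sketched with a toy problem. The sketch below rests on several assumptions that are not part of the specification: the function name `run_consensus`, a quadratic local objective per node, an ADMM-style scaled multiplier update, and a numerical tolerance `tol` standing in for the "solutions match" test. It illustrates only the flag bookkeeping, with each node touching only the flags of its own subproblems.

```python
from typing import Dict, List, Tuple


def run_consensus(targets: List[Dict[str, float]], rho: float = 1.0,
                  tol: float = 1e-9, max_iters: int = 1000
                  ) -> Tuple[Dict[str, float], int, int]:
    """targets[s][p]: value node s prefers for subproblem p (assumed toy objective)."""
    num_nodes = len(targets)
    x = [{p: 0.0 for p in t} for t in targets]   # local solutions per node
    w = [{p: 0.0 for p in t} for t in targets]   # scaled multipliers per node
    owners: Dict[str, List[int]] = {}            # nodes whose group holds p
    for s in range(num_nodes):
        for p in targets[s]:
            owners.setdefault(p, []).append(s)
    u = {p: 0.0 for p in owners}                 # constraint parameters
    group_inactive = [False] * num_nodes
    constraint_inactive = {p: False for p in owners}
    skipped = 0                                  # node updates avoided by flags

    for iteration in range(1, max_iters + 1):
        for s in range(num_nodes):
            if group_inactive[s]:                # inactive group: keep last solution
                skipped += 1
                continue
            unchanged = True
            for p, a in targets[s].items():
                w[s][p] += x[s][p] - u[p]        # multiplier update first
                new_x = (a + rho * (u[p] - w[s][p])) / (1.0 + rho)
                if abs(new_x - x[s][p]) > tol:   # solution changed
                    unchanged = False
                    constraint_inactive[p] = False   # reactivate the constraint
                x[s][p] = new_x
            group_inactive[s] = unchanged        # all solutions matched last time
        # Synchronization is implicit here: all nodes share this process's memory.
        for p, members in owners.items():
            if constraint_inactive[p]:           # inactive constraint: skip update
                continue
            new_u = sum(x[s][p] + w[s][p] for s in members) / len(members)
            if abs(new_u - u[p]) > tol:          # parameter changed
                for s in members:
                    group_inactive[s] = False    # reactivate groups holding p
            else:
                constraint_inactive[p] = True
            u[p] = new_u
        if all(group_inactive) and all(constraint_inactive.values()):
            return u, iteration, skipped
    return u, max_iters, skipped


# Three nodes; every subproblem is held by exactly two of them.
example_targets = [{"p0": 1.0, "p1": 4.0},
                   {"p0": 3.0, "p2": 0.0},
                   {"p1": 8.0, "p2": 2.0}]
u, iterations, skipped = run_consensus(example_targets)
```

For this toy objective each constraint parameter settles near the average of the values preferred by the nodes that hold the subproblem, and `skipped` counts the node updates that the group flag avoided. In a real analyzer the local update would be the language analysis of the node's subproblems, and the match test would compare discrete solutions directly.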

The natural language analysis processing device 100 and the natural language analysis processing system 200 described above each contain a computer system. When a WWW system is used, the "computer system" also includes a homepage providing environment (or display environment).

Although this specification describes embodiments in which the program is installed in advance, the program can also be provided stored on a computer-readable recording medium, or provided via a network. Each unit of the natural language analysis processing device 100 of the present embodiments may also be implemented in hardware.

DESCRIPTION OF SYMBOLS
10 Input unit
20 Calculation unit
22 Minimal subproblem generation unit
24 Minimal subproblem storage unit
26 Group creation unit
30 Computation node
32 Parameter update unit
34 Partial decomposition update unit
36 First flag determination unit
38 Synchronization unit
40 Constraint condition update unit
42 Second flag determination unit
44 Convergence determination unit
46 Parameter storage unit
50 Output unit
100 Natural language analysis processing device
200 Natural language analysis processing system
201 Language analysis control device
202 Language analysis device
203 Network
220 Calculation unit
230 Output unit
240 Input unit
250 Calculation unit
260 Output unit

Claims

A natural language analysis processing device that performs natural language analysis processing, including language analysis processing, on an input character string obtained by concatenating sequentially input character strings each consisting of at least one character, the device comprising:
subproblem generation means for sequentially generating, each time a character unit or character string unit is input, the subproblems obtained when the problem of performing the language analysis processing on the input character string is decomposed into subproblems of performing the language analysis processing in predefined character units or character string units;
subproblem storage means for storing the set of subproblems sequentially generated by the subproblem generation means;
S computation nodes (S is a natural number of 2 or more) that perform the natural language analysis processing each time the subproblems are sequentially generated by the subproblem generation means; and
group creation means for, each time the subproblems are sequentially generated by the subproblem generation means, creating, from the set of subproblems stored in the subproblem storage means, S groups each consisting of an arbitrary subset of the entire set of subproblems such that each subproblem belongs to at least two groups, and assigning the S groups to the S computation nodes,
wherein each of the S computation nodes includes:
partial decomposition update means for updating, for the subset of subproblems of the group assigned by the group creation means, the solution of each subproblem of the subset based on a constraint condition that the solutions of each minimal subproblem belonging to two or more groups match;
group inactive state setting means for setting, when the solutions of the subproblems of the subset updated by the partial decomposition update means all match the solutions of the subproblems of the subset updated the previous time, a group inactive state indicating that the update processing by the partial decomposition update means of the computation node is not to be performed;
constraint condition active state setting means for canceling, for each subproblem, when the solution of the subproblem updated by the partial decomposition update means does not match the solution of the subproblem updated the previous time, a constraint condition inactive state set for a constraint parameter, the constraint parameter being the solution obtained when the solutions of the subproblem are made to match in accordance with the constraint condition, and the constraint condition inactive state indicating that the update processing of the constraint parameter is not to be performed;
synchronization means for notifying the other computation nodes of the solutions of the subproblems of the subset updated by the partial decomposition update means, and acquiring the solutions of the subproblems of the subsets notified from the other computation nodes;
constraint condition update means for updating, for each subproblem, the constraint parameter of the subproblem based on the solutions of the subproblems of the subsets of the other computation nodes acquired by the synchronization means and the solutions of the subproblems of the subset updated by the partial decomposition update means;
group active state setting means for canceling, for each subproblem, when the constraint parameter updated by the constraint condition update means does not match the value of the constraint parameter updated the previous time, the group inactive state set for the computation node to which a group containing the subproblem is assigned;
constraint condition inactive state setting means for setting, for each subproblem, when the constraint parameter updated by the constraint condition update means matches the value of the constraint parameter updated the previous time, the constraint condition inactive state for the constraint parameter of the subproblem; and
convergence determination means for determining whether the value of the constraint parameter has converged and, until it is determined that the value of the constraint parameter has converged, causing the update by the partial decomposition update means, the setting by the group inactive state setting means, the setting by the constraint condition active state setting means, the notification and acquisition by the synchronization means, the update by the constraint condition update means, the setting by the group active state setting means, and the setting by the constraint condition inactive state setting means to be repeated,
wherein a computation node for which the group inactive state is set does not perform the update by the partial decomposition update means,
the synchronization means of a computation node for which the group inactive state is set notifies the other computation nodes of the solutions of the subproblems of the subset last updated by the partial decomposition update means, and
the constraint condition update means does not update a constraint parameter for which the constraint condition inactive state is set.

The natural language analysis processing device according to claim 1, wherein
the partial decomposition update means, for the subset of subproblems of the group assigned by the group creation means, updates the Lagrange multipliers so as to optimize the value of a predetermined objective function, based on the Lagrange multipliers updated the previous time, the solutions of the subproblems of the subset, and the constraint parameters, and updates the solution of each subproblem of the subset, using the updated Lagrange multipliers, so as to optimize the value of the objective function,
the synchronization means notifies the other computation nodes of the Lagrange multipliers updated by the partial decomposition update means and the solutions of the subproblems of the subset, and acquires the Lagrange multipliers and the solutions of the minimal subproblems of the subsets notified from the other computation nodes, and
the constraint condition update means updates the constraint parameter for each subproblem so as to optimize the value of the objective function, based on the Lagrange multipliers of the other computation nodes acquired by the synchronization means and the solutions of the subproblems of the subsets.
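For orientation only, the order of updates recited in this claim (multiplier, then solution, then constraint parameter) resembles a consensus-style augmented Lagrangian iteration. The formulas below are an assumed textbook form, not the objective function of the specification: the local objective $f_s$, the penalty weight $\rho$, the minimization direction, and the set $G_p$ of nodes whose groups contain subproblem $p$ are all hypothetical symbols introduced for this illustration.

```latex
% Assumed consensus form for one subproblem p shared by the nodes s in G_p.
\begin{align}
\lambda_{s}^{(t+1)} &= \lambda_{s}^{(t)} + \rho \left( x_{s}^{(t)} - u^{(t)} \right) \\
x_{s}^{(t+1)} &= \operatorname*{arg\,min}_{x} \; f_{s}(x)
    + \lambda_{s}^{(t+1)\top} \left( x - u^{(t)} \right)
    + \frac{\rho}{2} \left\lVert x - u^{(t)} \right\rVert^{2} \\
u^{(t+1)} &= \frac{1}{\lvert G_p \rvert} \sum_{s \in G_p}
    \left( x_{s}^{(t+1)} + \frac{1}{\rho} \lambda_{s}^{(t+1)} \right)
\end{align}
```

Under these assumptions, the third line is the value of $u$ that minimizes the sum of the multiplier and penalty terms over the nodes in $G_p$, which is why the constraint parameter update needs both the solutions and the multipliers of the other nodes.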

A natural language analysis processing device that performs natural language analysis processing, including language analysis processing, on an input character string obtained by concatenating sequentially input character strings each consisting of at least one character, the device comprising:
subproblem generation means for sequentially generating, each time a character unit or character string unit is input, the subproblems obtained when the problem of performing the language analysis processing on the input character string is decomposed into subproblems of performing the language analysis processing in predefined character units or character string units;
subproblem storage means for storing the set of subproblems sequentially generated by the subproblem generation means;
S computation nodes (S is a natural number of 2 or more) that perform the natural language analysis processing each time the subproblems are sequentially generated by the subproblem generation means;
group creation means for, each time the subproblems are sequentially generated by the subproblem generation means, creating, from the set of subproblems stored in the subproblem storage means, S groups each consisting of an arbitrary subset of the entire set of subproblems such that each subproblem belongs to at least two groups, and assigning the S groups to the S computation nodes; and
constraint condition update means, group active state setting means, constraint condition inactive state setting means, and convergence determination means,
wherein each of the S computation nodes includes:
partial decomposition update means for updating, for the subset of subproblems of the group assigned by the group creation means, the solution of each subproblem of the subset based on a constraint condition that the solutions of each minimal subproblem belonging to two or more groups match;
group inactive state setting means for setting, when the solutions of the subproblems of the subset updated by the partial decomposition update means all match the solutions of the subproblems of the subset updated the previous time, a group inactive state indicating that the update processing by the partial decomposition update means of the computation node is not to be performed; and
constraint condition active state setting means for canceling, for each subproblem, when the solution of the subproblem updated by the partial decomposition update means does not match the solution of the subproblem updated the previous time, a constraint condition inactive state set for a constraint parameter, the constraint parameter being the solution obtained when the solutions of the subproblem are made to match in accordance with the constraint condition, and the constraint condition inactive state indicating that the update processing of the constraint parameter is not to be performed,
the constraint condition update means updates, for each subproblem, the constraint parameter of the subproblem based on the solutions of the subproblems of the subset updated by the partial decomposition update means of each computation node,
the group active state setting means cancels, for each subproblem, when the constraint parameter updated by the constraint condition update means does not match the value of the constraint parameter updated the previous time, the group inactive state set for the computation node to which a group containing the subproblem is assigned,
the constraint condition inactive state setting means sets, for each subproblem, when the constraint parameter updated by the constraint condition update means matches the value of the constraint parameter updated the previous time, the constraint condition inactive state for the constraint parameter of the subproblem,
the convergence determination means determines whether the value of the constraint parameter has converged and, until it is determined that the value of the constraint parameter has converged, causes the update by the partial decomposition update means, the setting by the group inactive state setting means, the setting by the constraint condition active state setting means, the notification and acquisition by the synchronization means, the update by the constraint condition update means, the setting by the group active state setting means, and the setting by the constraint condition inactive state setting means to be repeated,
a computation node for which the group inactive state is set does not perform the update by the partial decomposition update means,
the constraint condition update means does not update a constraint parameter for which the constraint condition inactive state is set, and
the solution of each subproblem of the subset in a computation node for which the group inactive state is set is the solution of that subproblem last updated by the partial decomposition update means.

The natural language analysis processing device according to claim 3, wherein
the partial decomposition update means, for the subset of subproblems of the group assigned by the group creation means, updates the Lagrange multipliers so as to optimize the value of a predetermined objective function, based on the Lagrange multipliers updated the previous time, the solutions of the subproblems of the subset, and the constraint parameters, and updates the solution of each subproblem of the subset, using the updated Lagrange multipliers, so as to optimize the value of the objective function, and
the constraint condition update means updates the constraint parameter for each subproblem so as to optimize the value of the objective function, based on the Lagrange multipliers updated by the partial decomposition update means of each computation node and the solutions of the subproblems of the subsets.

A natural language analysis processing method in a natural language analysis processing device that includes subproblem generation means, subproblem storage means, S computation nodes (S is a natural number of 2 or more), and group creation means, and that performs natural language analysis processing, including language analysis processing, on an input character string obtained by concatenating sequentially input character strings each consisting of at least one character, the method comprising:
sequentially generating, by the subproblem generation means, each time a character unit or character string unit is input, the subproblems obtained when the problem of performing the language analysis processing on the input character string is decomposed into subproblems of performing the language analysis processing in predefined character units or character string units;
storing, by the subproblem storage means, the set of subproblems sequentially generated by the subproblem generation means;
each time the subproblems are sequentially generated by the subproblem generation means, creating, by the group creation means, from the set of subproblems stored in the subproblem storage means, S groups each consisting of an arbitrary subset of the entire set of subproblems such that each subproblem belongs to at least two groups, and assigning the S groups to the S computation nodes; and
performing the natural language analysis processing by the S computation nodes,
wherein performing the natural language analysis processing by each of the S computation nodes includes:
updating, by partial decomposition update means, for the subset of subproblems of the group assigned by the group creation means, the solution of each subproblem of the subset based on a constraint condition that the solutions of each minimal subproblem belonging to two or more groups match;
setting, by group inactive state setting means, when the solutions of the subproblems of the subset updated by the partial decomposition update means all match the solutions of the subproblems of the subset updated the previous time, a group inactive state indicating that the update processing by the partial decomposition update means of the computation node is not to be performed;
canceling, by constraint condition active state setting means, for each subproblem, when the solution of the subproblem updated by the partial decomposition update means does not match the solution of the subproblem updated the previous time, a constraint condition inactive state set for a constraint parameter, the constraint parameter being the solution obtained when the solutions of the subproblem are made to match in accordance with the constraint condition, and the constraint condition inactive state indicating that the update processing of the constraint parameter is not to be performed;
notifying, by synchronization means, the other computation nodes of the solutions of the subproblems of the subset updated by the partial decomposition update means, and acquiring the solutions of the subproblems of the subsets notified from the other computation nodes;
updating, by constraint condition update means, for each subproblem, the constraint parameter of the subproblem based on the solutions of the subproblems of the subsets of the other computation nodes acquired by the synchronization means and the solutions of the subproblems of the subset updated by the partial decomposition update means;
canceling, by group active state setting means, for each subproblem, when the constraint parameter updated by the constraint condition update means does not match the value of the constraint parameter updated the previous time, the group inactive state set for the computation node to which a group containing the subproblem is assigned;
setting, by constraint condition inactive state setting means, for each subproblem, when the constraint parameter updated by the constraint condition update means matches the value of the constraint parameter updated the previous time, the constraint condition inactive state for the constraint parameter of the subproblem; and
determining, by convergence determination means, whether the value of the constraint parameter has converged and, until it is determined that the value of the constraint parameter has converged, repeating the update by the partial decomposition update means, the setting by the group inactive state setting means, the setting by the constraint condition active state setting means, the notification and acquisition by the synchronization means, the update by the constraint condition update means, the setting by the group active state setting means, and the setting by the constraint condition inactive state setting means,
wherein a computation node for which the group inactive state is set does not perform the update by the partial decomposition update means,
the synchronization means of a computation node for which the group inactive state is set notifies the other computation nodes of the solutions of the subproblems of the subset last updated by the partial decomposition update means, and
the constraint condition update means does not update a constraint parameter for which the constraint condition inactive state is set.

The natural language analysis processing method according to claim 5, wherein
the partial decomposition update means, for the subset of subproblems of the group assigned by the group creation means, updates the Lagrange multipliers so as to optimize the value of a predetermined objective function, based on the Lagrange multipliers updated the previous time, the solutions of the subproblems of the subset, and the constraint parameters, and updates the solution of each subproblem of the subset, using the updated Lagrange multipliers, so as to optimize the value of the objective function,
the synchronization means notifies the other computation nodes of the Lagrange multipliers updated by the partial decomposition update means and the solutions of the subproblems of the subset, and acquires the Lagrange multipliers and the solutions of the minimal subproblems of the subsets notified from the other computation nodes, and
the constraint condition update means updates the constraint parameter for each subproblem so as to optimize the value of the objective function, based on the Lagrange multipliers of the other computation nodes acquired by the synchronization means and the solutions of the subproblems of the subsets.

A natural language analysis processing method in a natural language analysis processing device that includes subproblem generation means, subproblem storage means, S computation nodes (S is a natural number of 2 or more), group creation means, constraint condition update means, group active state setting means, constraint condition inactive state setting means, and convergence determination means, and that performs natural language analysis processing, including language analysis processing, on an input character string obtained by concatenating sequentially input character strings each consisting of at least one character, the method comprising:
sequentially generating, by the subproblem generation means, each time a character unit or character string unit is input, the subproblems obtained when the problem of performing the language analysis processing on the input character string is decomposed into subproblems of performing the language analysis processing in predefined character units or character string units;
storing, by the subproblem storage means, the set of subproblems sequentially generated by the subproblem generation means;
each time the subproblems are sequentially generated by the subproblem generation means, creating, by the group creation means, from the set of subproblems stored in the subproblem storage means, S groups each consisting of an arbitrary subset of the entire set of subproblems such that each subproblem belongs to at least two groups, and assigning the S groups to the S computation nodes;
performing the natural language analysis processing by the S computation nodes;
updating by the constraint condition update means;
setting by the group active state setting means;
setting by the constraint condition inactive state setting means; and
determining by the convergence determination means,
wherein performing the natural language analysis processing by each of the S computation nodes includes:
updating, by partial decomposition update means, for the subset of subproblems of the group assigned by the group creation means, the solution of each subproblem of the subset based on a constraint condition that the solutions of each minimal subproblem belonging to two or more groups match;
setting, by group inactive state setting means, when the solutions of the subproblems of the subset updated by the partial decomposition update means all match the solutions of the subproblems of the subset updated the previous time, a group inactive state indicating that the update processing by the partial decomposition update means of the computation node is not to be performed; and
canceling, by constraint condition active state setting means, for each subproblem, when the solution of the subproblem updated by the partial decomposition update means does not match the solution of the subproblem updated the previous time, a constraint condition inactive state set for a constraint parameter, the constraint parameter being the solution obtained when the solutions of the subproblem are made to match in accordance with the constraint condition, and the constraint condition inactive state indicating that the update processing of the constraint parameter is not to be performed,
the updating by the constraint condition update means includes updating, for each subproblem, the constraint parameter of the subproblem based on the solutions of the subproblems of the subset updated by the partial decomposition update means of each computation node,
the setting by the group active state setting means includes canceling, for each subproblem, when the constraint parameter updated by the constraint condition update means does not match the value of the constraint parameter updated the previous time, the group inactive state set for the computation node to which a group containing the subproblem is assigned,
the setting by the constraint condition inactive state setting means includes setting, for each subproblem, when the constraint parameter updated by the constraint condition update means matches the value of the constraint parameter updated the previous time, the constraint condition inactive state for the constraint parameter of the subproblem,
the determining by the convergence determination means includes determining whether the value of the constraint parameter has converged and, until it is determined that the value of the constraint parameter has converged, repeating, for each computation node, the update by the partial decomposition update means, the setting by the group inactive state setting means, the setting by the constraint condition active state setting means, the notification and acquisition by the synchronization means, the update by the constraint condition update means, the setting by the group active state setting means, and the setting by the constraint condition inactive state setting means,
a computation node for which the group inactive state is set does not perform the update by the partial decomposition update means,
the constraint condition update means does not update a constraint parameter for which the constraint condition inactive state is set, and
the solution of each subproblem of the subset in a computation node for which the group inactive state is set is the solution of that subproblem last updated by the partial decomposition update means.

A program for causing a computer to function as each means of the natural language analysis processing device according to any one of claims 1 to 4.