JPH09212511A - Natural language processor

Info

Publication number: JPH09212511A
Application number: JP8018741A
Authority: JP
Inventors: Hiroshi Yasuhara; 宏安原
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1996-02-05
Filing date: 1996-02-05
Publication date: 1997-08-15

Abstract

PROBLEM TO BE SOLVED: To obtain a correct processing result without repeating the same mistake, by rewriting the priority information of the element that derived a processing result according to whether the result produced by the processing unit is correct or incorrect. SOLUTION: The update learning unit 7 displays the analysis result. If the user judges that the dependency of a clause is incorrect, the user either selects the rule that appears correct from the grammar rules already stored in the result storage unit 6 or inputs a new grammar rule. The priority flag 5C of the current rule is then set to off, while the priority flag 5C of the rule that gives the correct dependency is set to on. Once a rule has been updated and learned, whether its priority flag 5C is on or off takes effect from the next analysis onward. Resetting the priority flag 5C to on or off imposes a smaller computational load than changing the probability of a rule every time, and the learning effect is more immediate.

Description

Detailed Description of the Invention

【０００１】[0001]

[Technical Field of the Invention] The present invention relates to a natural language processing apparatus and can be applied to, for example, a machine translation system or a natural language interface.

【０００２】[0002]

[Prior Art] In conventional natural language processing, completely correct analysis and generation are generally not guaranteed because of the ambiguity inherent in natural language. Various research institutions have therefore been studying techniques for resolving this ambiguity. One such technique is described in the literature (Keh-Yih SU and Jing-Shin CHANG: SEMANTIC AND SYNTACTIC ASPECTS OF SCORE FUNCTION, pp642-644, COLING88, 1988). In the technique described there, a value such as a probability is assigned to each grammar rule, and rules with a higher probability are used preferentially during analysis or generation.

【０００３】[0003]

[Problems to Be Solved by the Invention] With the above technique, however, each time analysis or generation is executed, the probability of every rule that caused an erroneous decision in the analysis or generation result must be changed, and the resulting computational load is very large. Moreover, even if the probabilities are changed according to the result, there is no guarantee that the change will take effect immediately in the next analysis or generation, so the technique also falls short in terms of immediacy.

[0004] For this reason, a natural language processing method with a smaller computational load and better immediacy is desired.

【０００５】[0005]

[Means for Solving the Problems] To solve this problem, in the present invention a natural language processing apparatus comprises: a storage device that stores two pieces of information, probability information and priority information, for each element constituting the natural language processing rules; a processing unit that processes an input language using the elements of the language resources; and an update learning unit that, when the processing result of the processing unit is incorrect, rewrites the priority information of the element that derived that result to a lower priority and rewrites the priority information of the element that leads the processing to the correct result to a higher priority.

[0006] As a result, when an input sentence with the same structure as one that was previously processed incorrectly is input, the element whose priority information was rewritten to a higher priority in the earlier processing is selected in the current processing, and the correct processing result is obtained without repeating the same error.

[0007] The computational burden of rewriting the priority information is small, and because the rewrite is effective from the next processing onward, immediacy is also ensured.

[0008] In addition, in the present invention a natural language processing apparatus comprises: a storage device that stores two pieces of information, probability information and frequency information, for each element constituting the natural language processing rules; a processing unit that processes an input language using the elements of the language resources; and an update learning unit that, when the processing result of the processing unit is incorrect, decreases the value of the frequency information representing the number of times the element that derived that result has been adopted and, when the processing result is correct, increases the value of the frequency information representing the number of times the element that derived that result has been adopted.

[0009] Because only the frequency information, not the probability information, is rewritten when one processing operation ends, the burden on the processing unit is lower than when the probability information is recalculated and rewritten, and the processing speed can be improved.

[0010] Furthermore, in the present invention a natural language processing apparatus comprises: a storage device that stores two pieces of information, probability information and reliability information, for each element constituting the natural language processing rules; a processing unit that processes an input language using the elements of the language resources; and an update learning unit that, when the processing result of the processing unit is likely to be incorrect, lowers the reliability recorded in the reliability information of the element that derived that result.

[0011] Elements whose reliability information indicates low reliability can be excluded from subsequent processing of the input language, which improves the accuracy of the processing results accordingly.

【００１２】[0012]

[Embodiments of the Invention] An embodiment of the natural language processing method and apparatus according to the present invention is described in detail below with reference to the drawings.

[0013] (1) Configuration of the Natural Language Processing Apparatus
FIG. 1 is a block diagram showing the schematic configuration of the natural language processing apparatus according to this embodiment. In practice, this natural language processing apparatus 1 is an information processing apparatus such as a workstation or personal computer equipped with peripheral devices: input devices such as a keyboard and mouse, output devices such as a display and printer, and auxiliary storage such as a hard disk. Viewed in terms of its natural language processing functions, its configuration can be divided into the functional blocks shown in FIG. 1.

[0014] The natural language processing apparatus 1 consists of an input unit 2, a processing unit 3, a dictionary storage unit 4, a rule storage unit 5, a result storage unit 6, and an update learning unit 7.

[0015] Of these, the input unit 2 is used to input the natural language or intermediate language that serves as the input language. For analysis, the natural language itself is input, for example 「机に本を置く。」 ("Put a book on the desk."). For generation, an intermediate language is input, for example 「机−場所−置く」 ("desk - place - put") and 「本−対象−置く」 ("book - object - put"). The input unit 2 is also used to input instructions that individually modify the information associated with the rules used in analysis or generation.

[0016] The processing unit 3 uses the dictionary stored in the dictionary storage unit 4 and the rules stored in the rule storage unit 5 to analyze input natural language or to generate natural language from an intermediate language. The processing result of the processing unit 3 is stored in the result storage unit 6 together with all the rules used during processing, so that it can be read out as needed when the result is checked.

[0017] The dictionary storage unit 4 stores a dictionary, which is a set of dictionary entries. A dictionary entry defines the information about one word and describes, for example, the word's grammatical and semantic properties.

[0018] The rule storage unit 5 stores language resources such as the analysis rules and generation rules used in analysis and generation. Each element of the language resources stored in the rule storage unit 5, that is, each analysis rule and generation rule, is associated with four pieces of information: probability information 5A, frequency information 5B, a priority flag 5C, and a reliability flag 5D. These can be used to decide which rule to apply during analysis or generation.

[0019] The probability information 5A represents the probability that the corresponding rule is applied. The frequency information 5B indicates the number of times the corresponding rule has been applied. The priority flag 5C represents the level of priority: on means high priority and off means low priority. The reliability flag 5D represents the level of reliability: on indicates that the rule is likely to be wrong, and off indicates that it is likely to be correct.
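The four pieces of information attached to each rule can be pictured as a small record. The following is only an illustrative sketch: the class name `RuleInfo`, the field names, and the use of `None` for the NULL state are assumptions made here, not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleInfo:
    """Per-rule bookkeeping mirroring items 5A to 5D (names are hypothetical)."""
    rule_id: str                         # illustrative identifier, e.g. "R_alpha_1"
    probability: Optional[float] = None  # 5A: None stands for the NULL (reset) state
    frequency: int = 0                   # 5B: number of times the rule has been applied
    priority: bool = False               # 5C: True (on) means high priority
    suspect: bool = False                # 5D: True (on) means the rule is likely wrong


# A rule that has been applied 12 times and has not been flagged in any way.
rule = RuleInfo("R_alpha_1", probability=0.6, frequency=12)
```

Note that the reliability flag is inverted relative to what its name might suggest: on marks a rule as doubtful, which is why the sketch calls the field `suspect`.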

[0020] When the user checks a processing result after processing has finished, the update learning unit 7 reads the processing result, together with all the rules used in obtaining it, from the result storage unit 6 and displays them on the screen. When the user instructs that the information of a rule stored in the rule storage unit 5 be updated or that a rule be newly registered, the update learning unit 7 rewrites or adds the necessary information. The update or registration command is given to the update learning unit 7 by the processing unit 3, which operates on the basis of input from the input unit 2. The update learning unit 7 performs one of the following five processes.

[0021] The first process registers a new rule because the correct rule does not exist in the rule storage unit 5. When the correct rule is newly registered, its probability information 5A is set to NULL (the reset state), its frequency information 5B is set to "1", its priority flag 5C is set to on, and its reliability flag 5D is set to off.

[0022] The second process updates only the information of the current rule, which is likely to be wrong. The frequency information 5B of the current rule is decreased by "1", its priority flag 5C is turned off, and its reliability flag 5D is turned on.

[0023] The third process updates the current rule because a more appropriate rule exists. The priority flag 5C of the current rule is turned off.

[0024] The fourth process updates the information of the rule selected as more appropriate than the current rule, when that rule already exists in the rule storage unit 5. Its frequency information 5B is increased by "1", its priority flag 5C is turned on, and its reliability flag 5D is turned off.

[0025] The fifth process leaves everything as it is and does nothing.
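The five processes amount to five small state updates on a rule's record. The sketch below restates them in Python under stated assumptions: the `RuleInfo` record, the function names, and the dictionary used as the rule store are all hypothetical, and the text does not say what happens when a frequency would drop below zero, so no clamping is applied here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RuleInfo:
    probability: Optional[float] = None  # 5A, None = NULL (reset state)
    frequency: int = 0                   # 5B
    priority: bool = False               # 5C, True = on
    suspect: bool = False                # 5D, True = on (likely wrong)


def register_new(store: Dict[str, RuleInfo], rule_id: str) -> RuleInfo:
    """Process 1: register a rule that was missing from the rule store."""
    store[rule_id] = RuleInfo(probability=None, frequency=1, priority=True, suspect=False)
    return store[rule_id]


def mark_suspect(info: RuleInfo) -> None:
    """Process 2: the current rule is likely wrong."""
    info.frequency -= 1
    info.priority = False
    info.suspect = True


def demote(info: RuleInfo) -> None:
    """Process 3: the current rule is acceptable but a better one exists."""
    info.priority = False


def promote(info: RuleInfo) -> None:
    """Process 4: an existing rule was chosen as the more appropriate one."""
    info.frequency += 1
    info.priority = True
    info.suspect = False


def keep(info: RuleInfo) -> None:
    """Process 5: leave everything as it is."""


rule_store: Dict[str, RuleInfo] = {}
fresh = register_new(rule_store, "R_beta_k")
```

None of the five updates touches the probability of an existing rule, which is the point of the design: each correction costs a constant number of field writes.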

[0026] (2) Example of Processing Operation
The following describes concretely how the natural language processing apparatus 1 of this embodiment processes natural language and how rules are learned.

[0027] Analysis and generation do not differ in their basic principle of operation; in both cases the optimal rule is selected from among several candidates. The description below is therefore based on the natural language analysis procedure shown in FIG. 2.

[0028] First, natural language is input from the input unit 2 one sentence at a time. As shown in step SP1, the processing unit 3 then starts analyzing the input sentence using the dictionary stored in the dictionary storage unit 4 and the rules stored in the rule storage unit 5. Here the input sentence is assumed to be 「彼は黒い服を着た少年を連れてきた。」 ("He brought a boy dressed in black clothes.").

[0029] As shown in FIG. 3, the processing unit 3 uses the rules stored in advance in the rule storage unit 5 to analyze the structure of the input sentence, that is, which clause (bunsetsu) modifies which clause and in what relation. In this case there are two possible modification relations of the agent type: relation 11, in which 「彼は」 ("he") depends on 「着た」 ("wore"), and relation 12, in which 「彼は」 depends on 「連れてきた」 ("brought"). The processing unit 3 decides which of these to apply using the probability information 5A and the other rule information. All the rules used in analyzing the input sentence are stored in the result storage unit 6; that is, both the rule set {Rα1, Rα2, ...} that gives relation 11 and the rule set {Rβ1, Rβ2, ...} that gives relation 12 are stored. The number of elements in a rule set depends on the grammar used for analysis and may be one. When no rule applies, the set is empty.

[0030] The analysis results obtained in this way can be checked in two ways: interactively, one sentence at a time, or collectively, after several sentences have been analyzed. In either case the analysis result for each input sentence is stored in the result storage unit 6.

[0031] The description below assumes that relation 11, shown by the solid line in FIG. 3, was adopted as the analysis result by the processing unit 3.

[0032] When the analysis is finished, the process moves to step SP2, and the update learning unit 7 displays relation 11, the analysis result, on a display screen (not shown). In the next step SP3, the user is asked, clause by clause in order, whether the analysis result is correct.

[0033] If the dependency relations of all the clauses in the sentence (for example, the dependency rule Rαi (i = 1, 2, ...) for the clause 「彼は」) are correct, an affirmative result is obtained, and the process returns to step SP1 and moves on to the next input sentence. If the user instructs that the frequency information be updated at this point, "1" is added in step SP11 to the frequency information 5B of each rule that gave a correct dependency relation.

[0034] If, on the other hand, the user judges that the dependency relation of a clause is wrong, a negative result is obtained and the process proceeds to step SP4, which waits for the user to select or input a more correct grammar rule. In this example the process takes this path, because the clause 「彼は」 should correctly depend on 「連れてきた。」 rather than on 「着た」.

[0035] In step SP4, the user either selects the rule that appears correct from the grammar rules already stored in the result storage unit 6 or inputs a new grammar rule.

[0036] In the following step SP5, the user is asked under what conditions the rule corrected in this way should be recorded so that it can be used in future analysis.

[0037] In step SP5, one of four choices is first made: register the input rule as a new rule (step SP6); update the information of only the current rule, which is likely to be wrong (step SP7); update the current rule because, although it is correct, a more appropriate rule exists (step SP8); or do nothing and move on to the next clause.

[0038] When the current rule is updated because it is correct but a more appropriate rule exists, one of two further choices is made depending on whether the more appropriate rule exists in the rule storage unit 5. If the more appropriate rule (the learned rule) exists in the rule storage unit 5, the existing rule is updated; if it does not, the rule is newly registered.

[0039] In this example, the dependency of 「彼は」 in relation 11 is not appropriate, and the dependency of relation 12 was selected in step SP4, so the process of step SP8 is selected. The update learning unit 7 then sets the priority flag 5C of the current rule Rαi to off, as shown in FIG. 4.

[0040] In this example, the rule that gives the correct dependency, relation 12, already exists in the rule storage unit 5, so an affirmative result is obtained in step SP9, which follows step SP8, and the process proceeds to step SP10. The frequency information 5B of the rule Rβj is increased by "1" from "Fβj" to "Fβj+1", its priority flag 5C is turned on, and its reliability flag 5D is turned off.

[0041] In the example above, the rule that gives the correct dependency relation existed in the rule storage unit 5, so the process moved from step SP9 to step SP10. If it does not exist, the process moves from step SP9 to step SP6, and the newly input rule is registered together with its probability information 5A and other information. FIG. 5 shows this case.

[0042] That is, with the priority flag 5C of the current rule updated to off, the probability information 5A of the newly registered rule Rβk is set to NULL (the reset state), its frequency information 5B is set to "1", its priority flag 5C is set to on, and its reliability flag 5D is set to off.

[0043] These processes are repeated for every clause in turn.
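The per-clause review in steps SP3 to SP11 can be read as a dispatch on the user's answer. The following sketch traces that flow for one clause. The function `review_clause`, the choice labels, and the rule identifiers are hypothetical names introduced for illustration, and the user interaction of steps SP2 to SP5 is reduced to plain arguments.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RuleInfo:
    probability: Optional[float] = None  # 5A, None = NULL (reset state)
    frequency: int = 0                   # 5B
    priority: bool = False               # 5C, True = on
    suspect: bool = False                # 5D, True = on (likely wrong)


def review_clause(store: Dict[str, RuleInfo], current_id: str, choice: str,
                  chosen_id: Optional[str] = None, update_frequency: bool = True) -> None:
    """Apply the user's decision for one clause (labels are illustrative)."""
    current = store[current_id]
    if choice == "correct":                  # SP3: dependency judged correct
        if update_frequency:                 # SP11: optional count-up
            current.frequency += 1
    elif choice == "register":               # SP6: register the input rule
        store[chosen_id] = RuleInfo(None, 1, True, False)
    elif choice == "suspect":                # SP7: current rule likely wrong
        current.frequency -= 1
        current.priority = False
        current.suspect = True
    elif choice == "better":                 # SP8: a more appropriate rule exists
        current.priority = False
        if chosen_id in store:               # SP9 affirmative -> SP10
            better = store[chosen_id]
            better.frequency += 1
            better.priority = True
            better.suspect = False
        else:                                # SP9 negative -> SP6
            store[chosen_id] = RuleInfo(None, 1, True, False)
    elif choice == "skip":                   # do nothing, go to the next clause
        pass
    else:
        raise ValueError(f"unknown choice: {choice}")


# The worked example: relation 11 (R_alpha_i) was applied, relation 12 (R_beta_j) is right.
rules = {
    "R_alpha_i": RuleInfo(0.6, 10, True, False),
    "R_beta_j": RuleInfo(0.4, 7, False, False),
}
review_clause(rules, "R_alpha_i", "better", "R_beta_j")
```

After the call, `R_alpha_i` keeps its probability and frequency but loses its priority flag, while `R_beta_j` has frequency 8 and its priority flag on, matching the state change described for FIG. 4.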

[0044] As described above, according to this embodiment, once a rule has been updated and learned, whether its priority flag 5C is on or off takes effect from the next analysis onward, and the rule set {Rβ1, Rβ2, ...} whose priority flag is on is applied in preference to {Rα1, Rα2, ...}. Resetting the priority flag 5C to on or off imposes a smaller computational load than changing the probability of a rule every time. Because the learning effect is also immediate, the result is a natural language processing apparatus that is very easy to use.
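One way to see why the flag takes effect immediately is to rank candidates by the flag first and by probability second. The text does not specify how the priority flag and the probability are combined, so the ordering below is an assumption, as are the function name `select_rule`, the tuple layout, and the treatment of a NULL probability as 0.0.

```python
from typing import List, Optional, Tuple

# (rule id, priority flag 5C, probability 5A or None for NULL); layout is hypothetical.
Candidate = Tuple[str, bool, Optional[float]]


def select_rule(candidates: List[Candidate]) -> Optional[str]:
    """Pick the rule to apply: priority flag first, probability as tie-breaker."""
    if not candidates:
        return None  # empty rule set: nothing applicable

    def rank(candidate: Candidate) -> Tuple[bool, float]:
        _, priority, probability = candidate
        # Assumption: a NULL probability ranks as 0.0 among equally flagged rules.
        return (priority, probability if probability is not None else 0.0)

    return max(candidates, key=rank)[0]


# R_beta_j was just promoted; its probability has not been recalculated yet.
candidates = [("R_alpha_i", False, 0.6), ("R_beta_j", True, None)]
chosen = select_rule(candidates)
```

Under this ordering the newly promoted rule wins on the very next analysis even though its probability is still lower or unset, which is the immediacy the paragraph above describes.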

[0045] In addition, because this natural language processing apparatus stores the content of the reliability flag 5D in the rule storage unit 5, flagged rules can be examined as inspection targets when the rules are readjusted later, which further improves the accuracy of the rules.
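As a small illustration of how the stored reliability flag might be used later, the sketch below lists flagged rules for inspection and filters them out of the usable set. The function names and the dictionary of flags are assumptions; the text only says that flagged rules can be reviewed and that low-reliability elements can be kept out of later processing.

```python
from typing import Dict, List


def rules_to_review(reliability_flags: Dict[str, bool]) -> List[str]:
    """Rules whose reliability flag 5D is on (likely wrong), for later inspection."""
    return sorted(rule_id for rule_id, flag_on in reliability_flags.items() if flag_on)


def usable_rules(reliability_flags: Dict[str, bool]) -> List[str]:
    """Rules whose reliability flag 5D is off, which remain candidates for processing."""
    return sorted(rule_id for rule_id, flag_on in reliability_flags.items() if not flag_on)


# Hypothetical flag values: True means the flag is on.
flags = {"R_alpha_1": True, "R_alpha_2": False, "R_beta_1": False}
review_list = rules_to_review(flags)
```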

[0046] (3) Other Embodiments
The embodiment above describes increasing or decreasing the value of the frequency information 5B by "1" in steps SP7, SP10, and SP11. However, to prevent the value of the frequency information 5B from being counted up meaninglessly when the same sentence is analyzed repeatedly, for example during debugging, the increase or decrease may instead be decided according to whether the user permits the frequency information 5B to be updated; the value need not always be increased or decreased by "1".

[0047] The embodiment above does not describe how the probability information 5A of each rule is updated. The probability information 5A may be recalculated from the frequency information 5B when the user instructs a recalculation or when a preset number of times has been reached. This reduces the burden of calculating the probability information 5A every time.

[0048] Furthermore, the embodiment above describes storing four pieces of information (probability information 5A, frequency information 5B, priority flag 5C, and reliability flag 5D) in the rule storage unit 5 and analyzing the input sentence using all four. The invention can also be applied when any combination of one or more of these is stored and the input sentence is analyzed on that basis.

[0049] For example, the invention can be applied when only two pieces of information, the probability information 5A and the priority flag 5C, are stored. Even then, if candidates whose priority flag 5C is on are given preference among several candidates, the new rule is applied preferentially when an input sentence of similar structure is analyzed, so a pattern that failed once can be processed correctly the next time.

[0050] Furthermore, although the embodiment above describes learning the rules used in analysis or generation, the invention is not limited to rule learning and can also be applied widely to learning dictionary data. For example, it can be applied in the same way to learning the case grammar of verbs. Consider the dependency analysis of 「山に登って景色を見た。」 ("I climbed the mountain and saw the scenery."). The invention can be applied when analyzing that 「山」 ("mountain") depends on 「登る」 ("climb") from the relation between the two case grammars for 「登る」 (「名詞（人間、動物）が」, a human or animal noun marked with が, and 「名詞（高いところ）に」, a high-place noun marked with に) and the two case grammars for 「見る」 ("see") (「名詞（人間、生物）が」, a human or living-thing noun marked with が, and 「名詞（対象物）を」, an object noun marked with を). These four case grammars for the verbs can be treated in the same way as the rules in the embodiment above, and the frequency information 5B, the priority flag 5C, and so on can be applied to them as in this embodiment.
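The case-grammar example can be made concrete with a toy matcher: a noun phrase may attach to a verb only if the verb has a case slot with the same particle that accepts the noun's semantic class. The data layout, the class labels, the noun-to-class table, and the function name `candidate_heads` are all assumptions for illustration; a real dictionary would be far richer, and each slot could carry the same frequency and flag information as a rule.

```python
from typing import Dict, List, Set, Tuple

# Case slots per verb: (particle, accepted semantic classes). Labels are hypothetical.
CASE_FRAMES: Dict[str, List[Tuple[str, Set[str]]]] = {
    "登る": [("が", {"human", "animal"}), ("に", {"high_place"})],   # climb
    "見る": [("が", {"human", "living_thing"}), ("を", {"object"})],  # see
}

# Assumed semantic classes for the nouns in the example sentence.
NOUN_CLASS: Dict[str, str] = {"山": "high_place", "景色": "object"}


def candidate_heads(noun: str, particle: str, verbs: List[str]) -> List[str]:
    """Verbs whose case frame has a slot matching the particle and the noun's class."""
    noun_class = NOUN_CLASS[noun]
    return [
        verb
        for verb in verbs
        if any(p == particle and noun_class in classes for p, classes in CASE_FRAMES[verb])
    ]


# 山に could in principle attach to either verb that follows it in the sentence.
heads_for_yama = candidate_heads("山", "に", ["登る", "見る"])
```

Here only 「登る」 has a に slot that accepts a high-place noun, so 「山に」 is attached to 「登る」 rather than to 「見る」.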

【００５１】[0051]

[Effects of the Invention] As described above, according to the present invention, two pieces of information, probability information and priority information, are associated with each element constituting the natural language processing rules. When the result of processing the input language by the processing unit is incorrect, the priority information of the element that derived that result is rewritten to a lower priority; when the processing result is correct, the priority information of the element that derived that result is rewritten to a higher priority. Consequently, when an input sentence with the same structure as one that was previously processed incorrectly is input, the element whose priority information was rewritten to a higher priority in the earlier processing is selected preferentially in the current processing. This realizes a natural language processing apparatus that obtains the correct processing result without repeating the same error.

[0052] Also, as described above, according to the present invention, two pieces of information, probability information and frequency information, are associated with each element constituting the natural language processing rules. When the processing result of the processing unit is incorrect, the value of the frequency information representing the number of times the element that derived that result has been adopted is decreased; when the processing result is correct, that value is increased. Because only the frequency information is rewritten each time one processing operation ends, rather than the probability information, the burden on the processing unit is lower than when the probability information is recalculated and rewritten. This realizes a natural language processing apparatus with improved processing speed.

[0053] Furthermore, as described above, according to the present invention, two pieces of information, probability information and reliability information, are associated with each element constituting the natural language processing rules. When the processing result of the processing unit is highly likely to be incorrect, the reliability indicated by the reliability information of the element that derived the result is lowered. Elements whose reliability is low can then be excluded from subsequent processing of the input language, so that a natural language processing device with correspondingly more accurate processing results can be realized.
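
One way to picture the reliability mechanism is the sketch below, in which each rule carries a numeric reliability that is lowered when its result is judged likely to be wrong, and rules below a cutoff are skipped in later processing. The numeric scale, the penalty of 0.2, the cutoff of 0.5, and the rule names are all hypothetical; the specification only says that low-reliability elements are not used.

```python
RELIABILITY_CUTOFF = 0.5  # hypothetical threshold below which a rule is skipped
PENALTY = 0.2             # hypothetical amount by which reliability is lowered


def penalize(reliability, rule):
    """Lower the reliability of the rule that derived a probably-wrong result."""
    reliability[rule] = max(0.0, reliability[rule] - PENALTY)


def usable_rules(reliability):
    """Return the rules still eligible for the next processing run."""
    return [rule for rule, value in reliability.items() if value >= RELIABILITY_CUTOFF]


reliability = {"rule_a": 0.6, "rule_b": 0.9}
before = usable_rules(reliability)
penalize(reliability, "rule_a")   # result derived by rule_a looked incorrect
after = usable_rules(reliability)
print(before, after)
```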

[Brief description of drawings]

FIG. 1 is a block diagram showing a natural language processing device according to an embodiment.

FIG. 2 is a flowchart of the operations executed by the natural language processing device of the embodiment.

FIG. 3 is an explanatory diagram of a dependency analysis result and the corresponding rule set in the embodiment.

FIG. 4 is an explanatory diagram showing the state change in the embodiment when the rule to be updated already exists among the analysis rules.

FIG. 5 is an explanatory diagram showing the state change in the embodiment when the rule to be updated does not exist among the analysis rules.

[Explanation of symbols]

1: natural language processing device, 2: input unit, 3: processing unit, 4: dictionary storage unit, 5: rule storage unit, 6: result storage unit, 7: update learning unit.

Claims

[Claims]

1. A natural language processing device comprising: a storage device that stores, for each element constituting a natural language processing rule, two pieces of information, probability information and priority information; a processing unit that processes an input language using the elements of the language resource; and an update learning unit that, when the processing result of the processing unit is incorrect, rewrites the priority information of the element that derived the result to a lower priority and rewrites the priority information of an element that leads the processing of the processing unit to a correct result to a higher priority.

2. The natural language processing device described above, wherein, even when the processing result is not incorrect, if an element more appropriate than the element that derived the result exists, the priority information is rewritten so that the priority of the element that derived the result becomes lower.

3. A natural language processing device comprising: a storage device that stores, for each element constituting a natural language processing rule, two pieces of information, probability information and frequency information; a processing unit that processes an input language using the elements of the language resource; and an update learning unit that, when the processing result of the processing unit is incorrect, decreases the value of the frequency information indicating the number of times the element that derived the result has been adopted, and, when the processing result of the processing unit is correct, increases the value of the frequency information indicating the number of times the element that derived the result has been adopted.

4. The natural language processing device according to claim 3, wherein the probability information is recalculated based on the value of the frequency information when the number of times the input language has been processed reaches a predetermined number.

5. A natural language processing device comprising: a storage device that stores, for each element constituting a natural language processing rule, two pieces of information, probability information and reliability information; a processing unit that processes an input language using the elements of the language resource; and an update learning unit that, when the processing result of the processing unit is highly likely to be incorrect, lowers the reliability indicated by the reliability information corresponding to the element that derived the result.