JPWO2018229937A1

JPWO2018229937A1 - Intention estimation apparatus and intention estimation method

Info

Publication number: JPWO2018229937A1
Application number: JP2019514140A
Authority: JP
Inventors: ▲イ▼ 景; 悠介小路
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2017-06-15
Filing date: 2017-06-15
Publication date: 2019-07-11
Anticipated expiration: 2037-06-15
Also published as: WO2018229937A1; JP6632764B2

Abstract

The device includes: a morphological analysis unit (103) that analyzes the morphemes contained in an acquired character string; an intention number estimation unit (106) that estimates the number of intentions in the character string and, according to the estimated number, determines whether the character string is a single-intention character string containing only one intention or a multiple-intention character string containing a plurality of intentions; a single intention estimation unit (108) that, when the intention number estimation unit (106) determines that the character string is a single-intention character string, estimates the intention of that character string as a single intention from the morphemes analyzed by the morphological analysis unit (103), using a single intention estimation model in which a degree of association with morphemes is associated with each intention; a compound intention estimation unit (110) that, when the intention number estimation unit (106) determines that the character string is a multiple-intention character string, estimates a plurality of intentions for that character string from the morphemes analyzed by the morphological analysis unit (103), using a compound intention estimation model in which a degree of association with morphemes is associated with each of a plurality of intentions; and an estimation result integration unit (111) that integrates the plurality of intentions estimated by the compound intention estimation unit (110) into a compound intention.

Description

The present invention relates to an intention estimation device and an intention estimation method that recognize an input character string and estimate a user's intention.

Intention estimation devices are known that perform speech recognition on a user's utterance, convert it into a character string, and estimate from that character string the user's intention, that is, which operation the user wants to perform. Because a single utterance may contain a plurality of intentions (hereinafter also called a multiple-intention utterance), an intention estimation device is required to be able to estimate the intentions of multiple-intention utterances.

For example, in the supervised-learning method disclosed in Non-Patent Document 1, a character string is represented in a form called a bag of words, a classifier (intention understanding model) such as a support vector machine or a log-linear model (maximum entropy model) is trained with the bag of words as the feature, and the intention is estimated from probability values calculated using the training result. With this method, the speaker's intentions can be estimated even when a single character string has a parallel structure, such as "Search for ramen shops and Chinese food.", which contains both the intention "search for ramen shops" and the intention "search for Chinese food".

Hiroya Takamura, "Introduction to Machine Learning for Natural Language Processing" (言語処理のための機械学習入門), 5th edition, Corona Publishing Co., Ltd., August 5, 2010, pp. 99-146

When the intention estimation method disclosed in Non-Patent Document 1 is applied to cases where one utterance may contain a plurality of intentions, a separate model is trained for each intention, and the determination results from the individual models are integrated at run time.
However, with such a method, intention estimation is performed with each of the plurality of models even when the utterance contains only one intention (hereinafter also called a single-intention utterance). As a result, a plurality of intentions may be estimated and output, and the overall accuracy of intention estimation may decrease.

The present invention has been made to solve the above problem, and its object is to provide an intention estimation device that can estimate intentions accurately even when the acquired character string may be either a single-intention character string or a multiple-intention character string.

The intention estimation device according to the present invention includes: a morphological analysis unit that analyzes the morphemes contained in an acquired character string; an intention number estimation unit that estimates the number of intentions in the character string and, according to the estimated number, determines whether the character string is a single-intention character string containing only one intention or a multiple-intention character string containing a plurality of intentions; a single intention estimation unit that, when the intention number estimation unit determines that the character string is a single-intention character string, estimates the intention of that character string as a single intention from the morphemes analyzed by the morphological analysis unit, using a single intention estimation model in which a degree of association with morphemes is associated with each intention; a compound intention estimation unit that, when the intention number estimation unit determines that the character string is a multiple-intention character string, estimates a plurality of intentions for that character string from the morphemes analyzed by the morphological analysis unit, using a compound intention estimation model in which a degree of association with morphemes is associated with each of a plurality of intentions; and an estimation result integration unit that integrates the plurality of intentions estimated by the compound intention estimation unit into a compound intention.

According to the present invention, the accuracy of estimating the user's intention can be improved.

A diagram showing a configuration example of the intention estimation device according to Embodiment 1.
A diagram showing an example of the intention number estimation model in Embodiment 1.
A diagram showing an example of the single intention estimation model in Embodiment 1.
A diagram showing an example of the compound intention estimation model in Embodiment 1.
FIGS. 5A and 5B are diagrams showing examples of the hardware configuration of the intention estimation device according to Embodiment 1.
A diagram showing a configuration example of the intention number estimation model generation device of Embodiment 1.
A diagram showing an example of the learning data stored in the learning data storage unit in Embodiment 1.
A flowchart explaining the process by which the intention number estimation model generation device generates the intention number estimation model in Embodiment 1.
A diagram showing an example of a dialogue between a user and the navigation device in Embodiment 1.
A flowchart explaining the operation of the intention estimation device according to Embodiment 1.
A flowchart explaining the operation of the intention number estimation unit in step ST1004 of FIG. 10 in Embodiment 1.
A diagram showing an example of the scores of dependency information for each number of intentions, acquired by the intention number estimation unit in Embodiment 1.
A diagram showing the formula that the intention number estimation unit uses to calculate the final score in Embodiment 1.
A diagram showing an example of the final score for each number of intentions, calculated by the intention number estimation unit in Embodiment 1.
A diagram showing an example of the final score for each number of intentions, calculated by the intention number estimation unit in Embodiment 1.
An example of the determination results of the user's intentions that the compound intention estimation unit output as estimation results in Embodiment 1.
A diagram showing an example of the integration result of the intentions integrated by the estimation result integration unit in Embodiment 1.
A diagram showing a configuration example of the intention estimation device according to Embodiment 2.
A diagram showing an example of a dialogue between a user and the navigation device in Embodiment 2.
A flowchart explaining the operation of the intention estimation device in Embodiment 2.
An example of the determination results of the user's intentions determined by the compound intention estimation unit in Embodiment 2.
A diagram showing an example of the integration result of the intentions integrated by the estimation result integration unit in Embodiment 2.
A diagram showing an example of the content of the final intention estimation result generated by the estimation result selection unit in Embodiment 2.

Embodiments of the present invention are described in detail below with reference to the drawings.
Embodiment 1.

As an example, the intention estimation device 1 according to Embodiment 1 is mounted on a navigation device that provides route guidance and the like to a user such as the driver of a vehicle. It estimates the user's intention from the content of the user's utterance and controls the navigation device so that it performs the operation corresponding to the estimated intention. The intention estimation device 1 may instead be connected to the navigation device via a network or the like.
Mounting on a navigation device is only an example. The intention estimation device 1 according to Embodiment 1 is not limited to users of navigation devices. It can serve as the intention estimation device for any device that accepts information input by a user through speech or the like and operates according to that information, in order to estimate the intention of that device's user.

FIG. 1 is a diagram showing a configuration example of the intention estimation device 1 according to Embodiment 1.
As shown in FIG. 1, the intention estimation device 1 includes a voice reception unit 101, a speech recognition unit 102, a morphological analysis unit 103, a dependency analysis unit 104, an intention number estimation model storage unit 105, an intention number estimation unit 106, a single intention estimation model storage unit 107, a single intention estimation unit 108, a compound intention estimation model storage unit 109, a compound intention estimation unit 110, an estimation result integration unit 111, a command execution unit 112, a response generation unit 113, and a notification control unit 114.
In Embodiment 1, as shown in FIG. 1, the intention number estimation model storage unit 105, the single intention estimation model storage unit 107, and the compound intention estimation model storage unit 109 are provided in the intention estimation device 1. However, these storage units may instead be provided outside the intention estimation device 1, in a location that the intention estimation device 1 can reference.

The voice reception unit 101 receives voice that includes the user's utterance. The voice reception unit 101 outputs information on the received voice to the speech recognition unit 102.

The speech recognition unit 102 performs speech recognition on the voice data corresponding to the voice received by the voice reception unit 101 and converts it into a character string. The speech recognition unit 102 outputs the character string to the morphological analysis unit 103.

The morphological analysis unit 103 performs morphological analysis on the character string output from the speech recognition unit 102.
Morphological analysis is an existing natural language processing technique that divides a character string into morphemes, the smallest units that carry meaning in a language, and assigns a part of speech to each by using a dictionary. For example, when morphological analysis is performed on the character string "東京タワーへ行く" ("go to Tokyo Tower"), the character string is divided into morphemes such as "東京タワー (Tokyo Tower) / proper noun, へ (to) / case particle, 行く (go) / verb".
The morphological analysis unit 103 outputs the morphological analysis result to the dependency analysis unit 104 and the intention number estimation unit 106.
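The segmentation in the example above can be illustrated with a toy longest-match tokenizer. This is only a sketch: the dictionary entries, the function name `analyze`, and the longest-match strategy are assumptions for illustration, not the analyzer used by the morphological analysis unit 103. Practical analyzers rely on large dictionaries and statistical models.

```python
# Toy dictionary: surface form -> part of speech (hypothetical entries).
TOY_DICTIONARY = {
    "東京タワー": "proper noun",
    "東京": "proper noun",
    "へ": "case particle",
    "行く": "verb",
}


def analyze(text, dictionary=TOY_DICTIONARY):
    """Split text into (surface, part_of_speech) pairs by longest match."""
    morphemes = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first so "東京タワー" wins over "東京".
        for end in range(len(text), pos, -1):
            surface = text[pos:end]
            if surface in dictionary:
                morphemes.append((surface, dictionary[surface]))
                pos = end
                break
        else:
            # No dictionary entry starts here: emit one unknown character.
            morphemes.append((text[pos], "unknown"))
            pos += 1
    return morphemes


print(analyze("東京タワーへ行く"))
```

The result is a list of (surface, part of speech) pairs, which is the kind of structure that downstream units could consume as the morphological analysis result.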

The dependency analysis unit 104 analyzes the relationships between morphemes in the character string after morphological analysis by the morphological analysis unit 103 and generates dependency information. Here, the relationships between morphemes are the dependency relationships among the morphemes contained in the character string. A dependency relationship is a relationship between morphemes such as "operation target" or "parallel relation". As the dependency analysis method, the dependency analysis unit 104 may use an existing method such as shift-reduce parsing or a spanning-tree method.
The dependency analysis unit 104 outputs the result of analyzing the relationships between morphemes to the intention number estimation unit 106 as dependency information.

The intention number estimation model storage unit 105 stores an intention number estimation model. The intention number estimation model is a model for estimating the number of intentions with dependency information as the feature.

FIG. 2 is a diagram showing an example of the intention number estimation model in Embodiment 1.
In the intention number estimation model illustrated in FIG. 2, the degree of association between each number of intentions and each piece of dependency information is described as a score.
In Embodiment 1, dependency information is expressed as a relationship between morphemes and its number of occurrences, joined by "_".
For example, as in FIG. 2, when a pair of morphemes in a "parallel relation" appears once in one character string, the dependency information is "parallel relation_1".
Among the pieces of dependency information shown in FIG. 2, "operation target_1" indicates that one character string contains only one pair of morphemes in an "operation target" relationship, so the number of intentions is also "1" in many cases. Therefore, as shown in FIG. 2, for "operation target_1" the score for the number of intentions "1" is higher than the scores for "2" and "3". In contrast, for both "parallel relation_1" and "operation target_2" the number of intentions is likely to be 2 or more, so the scores for the numbers of intentions "2" and "3" are higher than the score for "1". In this way, the intention number estimation model assigns a higher score the higher the degree of association between a number of intentions and a piece of dependency information.
For ease of explanation, FIG. 2 shows only three numbers of intentions: "1", "2", and "3".
In Embodiment 1, the number of the user's intentions is estimated by a statistical method using an intention number estimation model such as the one illustrated in FIG. 2.
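The feature format and the score lookup described for FIG. 2 can be sketched as follows. The score values are invented for illustration, and combining scores by simple summation is an assumption, because the formula that the intention number estimation unit actually uses is defined in a later figure. The names `INTENT_COUNT_MODEL`, `dependency_features`, and `estimate_intent_count` are hypothetical.

```python
from collections import Counter

# Hypothetical scores in the spirit of FIG. 2:
# dependency feature -> {number of intentions: score}.
INTENT_COUNT_MODEL = {
    "operation target_1": {1: 0.6, 2: 0.2, 3: 0.2},
    "parallel relation_1": {1: 0.1, 2: 0.5, 3: 0.4},
    "operation target_2": {1: 0.1, 2: 0.6, 3: 0.3},
}


def dependency_features(relations):
    """Turn a list of relation names into 'relation_count' feature strings."""
    counts = Counter(relations)
    return [f"{relation}_{n}" for relation, n in sorted(counts.items())]


def estimate_intent_count(relations, model=INTENT_COUNT_MODEL):
    """Pick the number of intentions with the highest summed score (assumption)."""
    totals = {}
    for feature in dependency_features(relations):
        for count, score in model.get(feature, {}).items():
            totals[count] = totals.get(count, 0.0) + score
    if not totals:
        return 1  # Assumption: fall back to one intention when nothing matches.
    return max(totals, key=totals.get)


print(dependency_features(["parallel relation"]))
print(estimate_intent_count(["parallel relation", "operation target", "operation target"]))
```

With these invented scores, a string containing one parallel relation and two operation-target relations is scored highest for two intentions.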

The intention number estimation unit 106 estimates the number of intentions contained in the character string from the dependency information output by the dependency analysis unit 104, using the intention number estimation model stored in the intention number estimation model storage unit 105. The specific method by which the intention number estimation unit 106 estimates the number of intentions is described later.
According to the estimated number of intentions, the intention number estimation unit 106 determines whether the character string based on the voice received by the voice reception unit 101 is a single-intention utterance or a multiple-intention utterance, and outputs the morphological analysis result of the character string, output by the morphological analysis unit 103, to the single intention estimation unit 108 or to the compound intention estimation unit 110 according to that determination. Specifically, when the intention number estimation unit 106 determines that the character string is a single-intention character string from a single-intention utterance, it outputs the morphological analysis result to the single intention estimation unit 108. When it determines that the character string is a multiple-intention utterance, it outputs the morphological analysis result to the compound intention estimation unit 110.
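The routing decision described above can be sketched as a small dispatch function. The function name `route` and the callable parameters are hypothetical; the sketch only shows that exactly one of the two estimators receives the morphological analysis result.

```python
def route(morphemes, intent_count, single_estimator, compound_estimator):
    """Send the morphemes to exactly one estimator based on the estimated count."""
    if intent_count <= 1:
        # Single-intention character string: use the single intention estimator.
        return ("single", single_estimator(morphemes))
    # Multiple-intention character string: use the compound intention estimator.
    return ("compound", compound_estimator(morphemes))


# Usage with stand-in estimators (placeholders, not the patent's models).
print(route(["go"], 1, lambda m: "A", lambda m: ["A", "B"]))
print(route(["go", "meal"], 2, lambda m: "A", lambda m: ["A", "B"]))
```

Because the single intention estimator is never invoked for multiple-intention strings and vice versa, a single-intention utterance cannot yield several intentions, which is the problem the document attributes to the conventional approach.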

In Embodiment 1, the number of intentions is estimated by a statistical method using the intention number estimation model, but the estimation is not limited to this. Instead of a statistical method, correspondences between dependency information and numbers of intentions may be prepared in advance as rules and used to estimate the number of intentions. For example, the number of intentions can be estimated by a rule such as "If the character string contains exactly one 'parallel relation' between a facility name and a facility type, the number of intentions contained in the character string is 2."
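The example rule can be written down directly. In this sketch the tuple layout `(relation, left_category, right_category)`, the function name, and the default of one intention when the rule does not fire are assumptions made for illustration.

```python
def rule_based_intent_count(relations):
    """relations: list of (relation, left_category, right_category) tuples."""
    parallel = [r for r in relations if r[0] == "parallel relation"]
    if len(parallel) == 1:
        _, left, right = parallel[0]
        # Rule from the text: one parallel relation between a facility name
        # and a facility type means two intentions.
        if {left, right} == {"facility name", "facility type"}:
            return 2
    return 1  # Assumption: default to one intention when no rule matches.


print(rule_based_intent_count(
    [("parallel relation", "facility name", "facility type")]
))
```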

As the intention estimation method in Embodiment 1, described later, the maximum entropy method can be used, for example. When estimating intentions, the single intention estimation unit and the compound intention estimation unit use a statistical method to estimate, from a large number of morpheme and intention pairs collected in advance, how likely each intention is for the input morphemes.

The single intention estimation model storage unit 107 stores an intention estimation model for estimating intentions with morphemes as features. An intention can be expressed in a form such as "<main intention>[<slot name>=<slot value>, ...]". Here, the main intention indicates the category or function of the intention. In the navigation device example, the main intention corresponds to an upper-level command, such as destination setting or listening to music, that is issued in response to the input the user makes first, for example by operating an input device (not shown).
The slot name and slot value indicate the information needed to carry out the main intention. For example, the intention contained in the character string "search for nearby restaurants" has the main intention "nearby search", the slot name "facility type", and the slot value "restaurant". Therefore, the intention contained in the character string "search for nearby restaurants" can be expressed as "nearby search[facility type=restaurant]".
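The "<main intention>[<slot name>=<slot value>, ...]" notation maps naturally onto a small record type. The class name `Intent` and its field names are hypothetical; the sketch only shows one way to hold a main intention with its slots and render it in the notation used in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    main_intent: str
    slots: tuple = ()  # Tuple of (slot_name, slot_value) pairs.

    def __str__(self):
        slot_text = ", ".join(f"{name}={value}" for name, value in self.slots)
        return f"{self.main_intent}[{slot_text}]"


intent = Intent("nearby search", (("facility type", "restaurant"),))
print(intent)
```

Making the record immutable lets intentions be used as dictionary keys, which is convenient when scores are stored per intention.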

FIG. 3 is a diagram showing an example of the single intention estimation model in Embodiment 1.
As shown in FIG. 3, the single intention estimation model represents the score of each morpheme for intentions such as "destination setting[facility=○○]" (○○ is a specific facility name; the same applies hereinafter) or "nearby search[facility type=restaurant]". In the single intention estimation model of Embodiment 1, the score of a morpheme for an intention is the degree of association between the intention and the morpheme: the higher the degree of association, the higher the score. As shown in FIG. 3, the single intention estimation model is created by learning the degree of association between intentions and morphemes, and associates a degree of association with morphemes with each intention.
For example, as shown in FIG. 3, for the morphemes "go" and "destination" the user is likely to intend destination setting, so under the intention "destination setting[facility=○○]" the scores of "go" and "destination" are higher than those of other morphemes. For the morphemes "delicious" and "meal", the user is likely to intend a search for nearby restaurants, so under the intention "nearby search[facility type=restaurant]" the scores of "delicious" and "meal" are higher than those of other morphemes.

The single intention estimation unit 108 estimates the user's intention from the morphological analysis result of the character string output by the morphological analysis unit 103, using the single intention estimation model stored in the single intention estimation model storage unit 107. Specifically, using the single intention estimation model, the single intention estimation unit 108 estimates as the user's intention the intention whose score for the morphemes obtained by the morphological analysis unit 103 is the highest. The single intention estimation unit 108 outputs the estimation result to the command execution unit 112 as a single intention estimation result.
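Selecting the highest-scoring intention can be sketched as an argmax over a score table like the one in FIG. 3. The score values are invented, and combining morpheme scores by summation is an assumption; a trained maximum entropy model would compute probabilities instead. `SINGLE_INTENT_MODEL` and `estimate_single_intent` are hypothetical names.

```python
# Hypothetical scores in the spirit of FIG. 3: intention -> {morpheme: score}.
SINGLE_INTENT_MODEL = {
    "destination setting[facility=○○]": {
        "go": 0.8, "destination": 0.9, "delicious": 0.1, "meal": 0.1,
    },
    "nearby search[facility type=restaurant]": {
        "go": 0.2, "destination": 0.1, "delicious": 0.9, "meal": 0.8,
    },
}


def estimate_single_intent(morphemes, model=SINGLE_INTENT_MODEL):
    """Return the one intention whose summed morpheme score is highest."""
    def total(intent):
        return sum(model[intent].get(m, 0.0) for m in morphemes)
    return max(model, key=total)


print(estimate_single_intent(["delicious", "meal"]))
```

Only one intention is ever returned, regardless of how many intentions score well.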

The compound intention estimation model storage unit 109 stores a compound intention estimation model created by training a separate model for each intention. For each intention, the compound intention estimation model is trained by a statistical method with the learning data of the estimation-target intention as positive examples and the learning data of all other intentions as negative examples. It is a model for making a binary decision on whether or not an input belongs to the estimation-target intention.
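The positive and negative labeling described above is the usual one-vs-rest setup, and the construction of the per-intention training sets can be sketched as follows. The function name and the `(morphemes, intent)` example format are assumptions; the actual learning step is omitted.

```python
def one_vs_rest_datasets(examples):
    """examples: list of (morphemes, intent).

    Returns {target_intent: [(morphemes, label)]}, where label is 1 for
    examples of the target intention and 0 for all other intentions.
    """
    intents = sorted({intent for _, intent in examples})
    return {
        target: [
            (morphemes, 1 if intent == target else 0)
            for morphemes, intent in examples
        ]
        for target in intents
    }


data = [(["go"], "destination setting"), (["meal"], "nearby search")]
print(one_vs_rest_datasets(data)["nearby search"])
```

Each per-intention dataset can then be used to train one binary classifier, giving one model per intention.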

FIG. 4 is a diagram showing an example of the compound intention estimation model in Embodiment 1.
The compound intention estimation model includes a plurality of intention estimation models for determination, one generated for each intention.
For ease of explanation, FIG. 4 shows an example with three intentions: "destination setting[facility=○○]" (see FIG. 4A), "nearby search[facility type=restaurant]" (see FIG. 4B), and "add waypoint[facility=○○]" (see FIG. 4C). In the compound intention estimation model of Embodiment 1, the score of a morpheme for an intention is the degree of association between the intention and the morpheme: the higher the degree of association, the higher the score. As shown in FIG. 4, the compound intention estimation model is created by separately learning, for each of a plurality of intentions, the degree of association between the intention and morphemes, and associates a degree of association with morphemes with each intention.

The compound intention estimation unit 110 uses the compound intention estimation model stored in the compound intention estimation model storage unit 109 and, for each intention estimation model for determination, determines on the basis of the morphological analysis result of the character string output by the morphological analysis unit 103 whether the character string based on the voice received by the voice reception unit 101 corresponds to the intention in question. Specifically, for each intention estimation model for determination, the compound intention estimation unit 110 determines whether the score that associates the morphemes obtained by the morphological analysis unit 103 with the intention is equal to or greater than a preset threshold, and thereby determines whether the character string corresponds to that intention.
The compound intention estimation unit 110 outputs the determination result of each intention estimation model for determination included in the compound intention estimation model to the estimation result integration unit 111 as an estimation result.
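The per-intention threshold determination described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the model contents, the threshold value, the function name, and the choice to sum morpheme scores into one score per intention are all assumptions made for the example.

```python
# Hypothetical per-intention models: morpheme -> association score.
# Names and numbers are illustrative assumptions, not values from FIG. 4.
DETERMINATION_MODELS = {
    "destination_setting": {"go": 0.6, "want": 0.2, "to": 0.1},
    "peripheral_search": {"find": 0.5, "restaurant": 0.4},
    "waypoint_addition": {"via": 0.6, "stop": 0.3},
}
THRESHOLD = 0.5  # assumed preset threshold


def judge_intentions(morphemes, models=DETERMINATION_MODELS, threshold=THRESHOLD):
    """Return {intention: bool}, one independent yes/no judgement per model."""
    results = {}
    for intention, scores in models.items():
        # Assumption: the score for an intention is the sum of its morpheme scores.
        total = sum(scores.get(m, 0.0) for m in morphemes)
        results[intention] = total >= threshold
    return results


judgements = judge_intentions(["go", "to", "via"])
```

Because each model is judged independently, several intentions can be positive for the same character string, which is what allows a compound result.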

The estimation result integration unit 111 integrates the estimation results, output by the compound intention estimation unit 110, of the intention estimation models for determination included in the compound intention estimation model.
The estimation result integration unit 111 outputs the integrated result of the estimated intentions to the command execution unit 112 as a compound intention estimation result.
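One way to picture the integration step is as collecting every positive determination into a single result. The sketch below assumes that the per-model results arrive as a mapping from an intention name to a boolean; the function name and the data shape are hypothetical.

```python
def integrate_results(judgements):
    """Collect every intention judged positive into one compound result."""
    return [intention for intention, hit in judgements.items() if hit]


compound = integrate_results(
    {"destination_setting": True, "peripheral_search": False, "waypoint_addition": True}
)
```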

The command execution unit 112 causes the command processing unit of the navigation device to execute the corresponding command on the basis of the single intention estimation result output from the single intention estimation unit 108 or the compound intention estimation result output from the estimation result integration unit 111. For example, when, in response to the user's utterance "Find a good restaurant", the single intention estimation unit 108 estimates the intention "peripheral search [facility type = restaurant]" and outputs it as a single intention estimation result, the command execution unit 112 causes the command processing unit of the navigation device to execute a command to search for nearby restaurants.
The command execution unit 112 outputs execution operation information indicating the content of the command executed by the command processing unit to the response generation unit 113.
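The mapping from an estimated intention to a command can be sketched as a dispatch table. The handler functions, the slot names, and the shape of the returned execution operation information are assumptions introduced for illustration; the real command processing unit belongs to the navigation device.

```python
def search_nearby(slots):
    # Stand-in for the navigation device's peripheral search command.
    return f"search nearby: {slots['facility_type']}"


def set_destination(slots):
    # Stand-in for the navigation device's destination setting command.
    return f"set destination: {slots['facility']}"


COMMANDS = {
    "peripheral_search": search_nearby,
    "destination_setting": set_destination,
}


def execute(intention, slots):
    """Run the command for an intention and return execution operation info."""
    executed = COMMANDS[intention](slots)
    return {"intention": intention, "slots": slots, "executed": executed}


info = execute("peripheral_search", {"facility_type": "restaurant"})
```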

The response generation unit 113 generates, on the basis of the execution operation information output from the command execution unit 112, response data corresponding to the command that the command execution unit 112 caused the command processing unit to execute. The response data may be generated as text data or as voice data.
When the response generation unit 113 generates the response data as voice data, it may generate, for example, voice data for outputting synthesized speech such as "Nearby restaurants have been found. Please select one from the list."
The response generation unit 113 outputs the generated response data to the notification control unit 114.

The notification control unit 114 causes an output device, such as a speaker of the navigation device, to output the response data output from the response generation unit 113, thereby notifying the user. That is, the notification control unit 114 controls the output device so as to notify the user that the command has been executed by the command processing unit. The notification may take any form that the user can perceive, such as notification by display, by sound, or by vibration.

Next, the hardware configuration of the intention estimation device 1 according to Embodiment 1 will be described.
FIGS. 5A and 5B are diagrams showing examples of the hardware configuration of the intention estimation device 1 according to Embodiment 1 of the present invention.
In Embodiment 1 of the present invention, the functions of the speech recognition unit 102, the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the compound intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 are realized by a processing circuit 501. That is, the intention estimation device 1 includes the processing circuit 501 for controlling processing that estimates the user's intention on the basis of information on the received user utterance, or processing that causes a machine command corresponding to the estimated intention to be executed and notified.
The processing circuit 501 may be dedicated hardware as shown in FIG. 5A, or may be a CPU (Central Processing Unit) 506 that executes a program stored in a memory 505 as shown in FIG. 5B.

When the processing circuit 501 is dedicated hardware, the processing circuit 501 corresponds to, for example, a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a combination of these.

When the processing circuit 501 is the CPU 506, the functions of the speech recognition unit 102, the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the compound intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 are realized by software, firmware, or a combination of software and firmware. That is, each of the units listed above is realized by a processing circuit such as the CPU 506, which executes programs stored in an HDD (Hard Disk Drive) 502, the memory 505, or the like, or a system LSI (Large-Scale Integration). The programs stored in the HDD 502, the memory 505, or the like can also be said to cause a computer to execute the procedures and methods of each of the units listed above. Here, the memory 505 corresponds to, for example, a nonvolatile or volatile semiconductor memory such as a RAM (Random Access Memory), a ROM (Read Only Memory), a flash memory, an EPROM (Erasable Programmable Read Only Memory), or an EEPROM (Electrically Erasable Programmable Read-Only Memory), or to a magnetic disk, a flexible disk, an optical disc, a compact disc, a MiniDisc, a DVD (Digital Versatile Disc), or the like.

Note that the functions of the speech recognition unit 102, the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the compound intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 may be realized partly by dedicated hardware and partly by software or firmware. For example, the function of the speech recognition unit 102 can be realized by the processing circuit 501 as dedicated hardware, while the functions of the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the compound intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 can be realized by the processing circuit reading and executing a program stored in the memory 505.
The intention number estimation model storage unit 105, the single intention estimation model storage unit 107, and the compound intention estimation model storage unit 109 use, for example, the HDD 502. This is merely an example; the intention number estimation model storage unit 105, the single intention estimation model storage unit 107, and the compound intention estimation model storage unit 109 may instead be constituted by a DVD, the memory 505, or the like.
The intention estimation device 1 also has an input interface device 503 and an output interface device 504 that communicate with external equipment such as the navigation device.
The voice reception unit 101 is constituted by the input interface device 503.

Next, the operation of the intention estimation device 1 according to Embodiment 1 will be described.
First, the operation for generating the intention number estimation model, which is a prerequisite for the operation in which the intention estimation device 1 estimates the user's intention, will be described.
Here, the intention number estimation model is assumed to be generated by an intention number estimation model generation device 2 that is separate from the intention estimation device 1.

FIG. 6 is a diagram showing a configuration example of the intention number estimation model generation device 2 according to Embodiment 1.
As shown in FIG. 6, the intention number estimation model generation device 2 includes a learning data storage unit 115, the morphological analysis unit 103, the dependency analysis unit 104, and an intention number estimation model generation unit 116.
The configurations and operations of the morphological analysis unit 103 and the dependency analysis unit 104 are the same as those described with reference to FIG. 1 and other figures, so they are given the same reference numerals and redundant description is omitted.

The learning data storage unit 115 stores correspondences between character strings and numbers of intentions as learning data. Although the intention number estimation model generation device 2 is described here as including the learning data storage unit 115, this is not a limitation; the learning data storage unit 115 may be provided outside the intention number estimation model generation device 2, at a location that the intention number estimation model generation device 2 can refer to.

FIG. 7 is a diagram showing an example of the learning data stored in the learning data storage unit 115 in Embodiment 1.
As shown in FIG. 7, the learning data is data in which the corresponding number of intentions is assigned to each utterance example sentence (hereinafter, "utterance example"), that is, an example sentence of a character string output as speech by utterance or the like. For example, the number of intentions "1" is assigned to utterance example 701, "I want to go to ○○".
The learning data is created in advance by the model creator or the like. For a plurality of utterance examples, the model creator or the like creates learning data in which a number of intentions is assigned in advance to each utterance example, and stores it in the learning data storage unit 115.

On the basis of the learning data stored in the learning data storage unit 115 and the result of the analysis of relationships between morphemes by the dependency analysis unit 104, the intention number estimation model generation unit 116 learns, by a statistical method, the number of intentions corresponding to each utterance example, and generates an intention number estimation model (see FIG. 2) indicating the correspondence between dependency information and numbers of intentions. The intention number estimation model generation unit 116 stores the generated intention number estimation model in the intention number estimation model storage unit 105.

FIG. 8 is a flowchart for explaining the processing in which the intention number estimation model generation device 2 generates the intention number estimation model in Embodiment 1.
First, the morphological analysis unit 103 performs morphological analysis on each example sentence of the learning data stored in the learning data storage unit 115 (step ST801). For example, in the case of utterance example 701 in FIG. 7, the morphological analysis unit 103 performs morphological analysis on "○○へ行きたい" ("I want to go to ○○") and obtains the morphological analysis result "○○/noun, へ/case particle, 行き/verb, たい/auxiliary verb". The morphological analysis unit 103 outputs the morphological analysis result to the dependency analysis unit 104.

On the basis of the morphological analysis result output from the morphological analysis unit 103, the dependency analysis unit 104 performs dependency analysis using the morphemes analyzed by the morphological analysis unit 103 (step ST802). For example, in the case of utterance example 701, the dependency analysis unit 104 performs dependency analysis on the morphemes "○○", "へ", "行き", and "たい". From these morphemes, the dependency analysis unit 104 obtains "operation target" as the result of analyzing the relationship between morphemes, attaches the count to that result, and outputs "operation target_1" to the intention number estimation model generation unit 116 as dependency information.

On the basis of the dependency information output by the dependency analysis unit 104, the intention number estimation model generation unit 116 generates the intention number estimation model using the learning data stored in the learning data storage unit 115 (step ST803). For example, in the case of utterance example 701, "I want to go to ○○", the dependency information is "operation target_1", and the number of intentions included in the learning data is "1", as shown in FIG. 7. Accordingly, when utterance example 701 is used, the intention number estimation model generation unit 116 learns so that, for the dependency information "operation target_1", the score of the number of intentions "1" becomes higher than the scores of the other numbers of intentions. The intention number estimation model generation unit 116 performs the same processing as steps ST801 to ST803 above on all utterance examples included in the learning data, and finally generates an intention number estimation model such as that shown in FIG. 2.
The intention number estimation model generation unit 116 then stores the generated intention number estimation model in the intention number estimation model storage unit 105. The intention number estimation model storage unit 105 is provided, for example, at a location that the intention number estimation model generation device 2 can access via a network.
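The text specifies only "a statistical method" for this learning step. Purely for illustration, the sketch below swaps in a smoothed relative-frequency estimate, which has the stated property that a feature seen often with a given number of intentions receives a higher score for that number. The training pairs, feature names, smoothing constant, and function name are all assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical training pairs: (dependency features, labelled number of intentions).
TRAINING = [
    (["action_target_1"], 1),
    (["action_target_1"], 1),
    (["action_target_2", "parallel_1"], 2),
    (["action_target_3", "parallel_2"], 3),
]


def train(examples, alpha=1.0):
    """Estimate model[n_intents][feature] as an add-alpha relative frequency."""
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, n_intents in examples:
        feature_counts[n_intents].update(features)
        vocabulary.update(features)
    model = {}
    for n_intents, counts in feature_counts.items():
        total = sum(counts.values()) + alpha * len(vocabulary)
        model[n_intents] = {f: (counts[f] + alpha) / total for f in vocabulary}
    return model


model = train(TRAINING)
```

With this toy data, `model[1]["action_target_1"]` is larger than `model[2]["action_target_1"]`, mirroring the behaviour described for utterance example 701.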

Although the intention number estimation model generation unit 116 is described here as using all dependency information output from the dependency analysis unit 104 as feature amounts for estimating the number of intentions, its configuration is not limited to this. The intention number estimation model generation unit 116 may be configured to select feature amounts according to an explicit rule, such as "use parallel relations only" or "use operation targets only", or to use a statistical method so that only the dependency information that is highly effective for estimating the number of intentions is used.

Also, although the intention number estimation model generation device 2, which is separate from the intention estimation device 1, is described here as generating the intention number estimation model and storing it in the intention number estimation model storage unit 105, this is not a limitation; the intention estimation device 1 may itself generate the intention number estimation model and store it in the intention number estimation model storage unit 105. In that case, the intention estimation device 1 further includes the learning data storage unit 115 and the intention number estimation model generation unit 116 in addition to the configuration described with reference to FIG. 1. The learning data storage unit 115 may be provided outside the intention estimation device 1, at a location that the intention estimation device 1 can refer to.

Next, on the premise that the intention number estimation model has been generated as described above and stored in the intention number estimation model storage unit 105, the intention estimation processing performed by the intention estimation device 1 according to Embodiment 1 using that model will be described.

FIG. 9 is a diagram showing an example of a dialogue between the user and the navigation device in Embodiment 1.
FIG. 10 is a flowchart for explaining the operation of the intention estimation device 1 according to Embodiment 1.

First, as shown in FIG. 9, the navigation device outputs the voice "Please speak after the beep." from, for example, a speaker of the navigation device (S1). Specifically, a voice control unit (not shown) of the intention estimation device 1 causes the navigation device to output the voice "Please speak after the beep."
When the navigation device outputs the voice "Please speak after the beep.", the user responds by uttering "I want to go to ○○." (U1). In FIG. 9, voice that the navigation device outputs on instruction from the intention estimation device 1 is denoted "S", and utterances from the user are denoted "U".

When the user utters "I want to go to ○○" (U1), the voice reception unit 101 receives the voice of that utterance. The speech recognition unit 102 performs speech recognition processing on the voice received by the voice reception unit 101 (step ST1001) and converts the voice into a character string. The speech recognition unit 102 outputs the converted character string to the morphological analysis unit 103.
The morphological analysis unit 103 performs morphological analysis processing on the character string output from the speech recognition unit 102 (step ST1002). For example, the morphological analysis unit 103 obtains the morphemes "○○", "へ", "行き", and "たい", and outputs information on these morphemes to the dependency analysis unit 104 and the intention number estimation unit 106 as the morphological analysis result.

The dependency analysis unit 104 performs dependency analysis processing on the morphological analysis result output from the morphological analysis unit 103 (step ST1003). For example, because the morpheme "○○" is the target of the operation "行き" ("go"), the dependency analysis unit 104 analyzes the character string output from the speech recognition unit 102 as having the inter-morpheme relationship "operation target". Since there is one "operation target", the dependency analysis unit 104 analyzes the result as "operation target_1". The dependency analysis unit 104 then outputs the analysis result "operation target_1" to the intention number estimation unit 106 as dependency information.
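The way a relationship label and its count are combined into dependency information such as "operation target_1" can be sketched as below. This assumes that the dependency analysis has already produced a list of relationship labels; the label strings and the function name are hypothetical, and the dependency analysis itself is not shown.

```python
from collections import Counter


def to_features(relations):
    """Turn relationship labels into 'label_count' dependency features."""
    counts = Counter(relations)
    return sorted(f"{label}_{n}" for label, n in counts.items())


single = to_features(["action_target"])
double = to_features(["action_target", "parallel", "action_target"])
```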

Using the dependency information "operation target_1" output from the dependency analysis unit 104 in step ST1003 as a feature amount, the intention number estimation unit 106 estimates the number of intentions with the intention number estimation model stored in the intention number estimation model storage unit 105 (step ST1004). The estimation of the number of intentions by the intention number estimation unit 106 will be described in detail with reference to FIG. 11.

FIG. 11 is a flowchart for explaining the operation of the intention number estimation unit 106 in step ST1004 of FIG. 10.
First, the intention number estimation unit 106 collates the dependency information output from the dependency analysis unit 104 with the intention number estimation model, and obtains the score of each piece of dependency information for each number of intentions (step ST1101).

FIG. 12 is a diagram showing an example of the scores of dependency information for each number of intentions that the intention number estimation unit 106 obtains in Embodiment 1.
As shown in FIG. 12, when the dependency information used as the feature amount is "operation target_1", the intention number estimation unit 106 obtains, for example, 0.2 as the score of the feature amount "operation target_1" for the number of intentions "1". The intention number estimation unit 106 likewise obtains the score of the feature amount "operation target_1" for the other numbers of intentions.

Next, on the basis of the scores obtained in step ST1101, the intention number estimation unit 106 calculates the final score of each number of intentions for the estimation target, that is, the single character string whose number of intentions is to be estimated (step ST1102). In Embodiment 1, the final score obtained by the intention number estimation unit 106 is, for each number of intentions, the product of the scores of all pieces of dependency information for that number of intentions. In other words, the final score is, for each number of intentions, the product of the scores of all feature amounts used for estimating the number of intentions.
FIG. 13 is a diagram showing the formula that the intention number estimation unit 106 uses to calculate the final score in Embodiment 1.
In FIG. 13, S is the final score of the number of intentions for which the final score is being calculated (hereinafter, the "target number of intentions") among the plural numbers of intentions for the estimation target. Si is the score of the i-th feature amount for the target number of intentions.
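FIG. 13 itself is not reproduced in this text. Going by the description of the final score as a product over all feature amounts, the formula can be written as follows, where n (the number of feature amounts used for the estimation) is a symbol introduced here for illustration:

```latex
S = \prod_{i=1}^{n} S_i
```

With a single feature amount (n = 1), this reduces to S = S_1.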

FIG. 14 is a diagram showing an example of the final score of each number of intentions calculated by the intention number estimation unit 106 in Embodiment 1.
The intention number estimation unit 106 calculates the final scores shown in FIG. 14 using the formula shown in FIG. 13. In this example, the only dependency information serving as a feature amount is "operation target_1", so the final score is the same as the score for the feature amount "operation target_1".
As shown in FIG. 14, for the number of intentions "1", the score of the feature amount "operation target_1" is 0.2, so the final score S is also 0.2. The intention number estimation unit 106 likewise calculates the final score for each of the other numbers of intentions.

The description now returns to the flowchart of FIG. 11.
The intention number estimation unit 106 estimates the number of intentions on the basis of the final score of each number of intentions calculated in step ST1102 (step ST1103). Specifically, among the numbers of intentions calculated for the estimation target, the intention number estimation unit 106 estimates the one with the highest final score to be the number of intentions of the estimation target.
Here, the intention number estimation unit 106 estimates the number of intentions to be "1".
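Steps ST1101 to ST1103 (score lookup, product, and selection of the highest final score) can be sketched as below. Only the score 0.2 for the number of intentions "1" comes from the text; every other number, the fallback value for unseen features, and all names are assumptions made for the example.

```python
import math

# score table: number of intentions -> {feature: score}.
# Only the 0.2 entry is taken from the text; the rest are made-up placeholders.
SCORES = {
    1: {"action_target_1": 0.2},
    2: {"action_target_1": 0.01},
    3: {"action_target_1": 0.01},
}


def final_scores(features, scores=SCORES, default=1e-6):
    """Multiply the scores of all features for each candidate number of intentions."""
    return {
        n: math.prod(table.get(f, default) for f in features)
        for n, table in scores.items()
    }


def estimate_count(features, scores=SCORES):
    """Return the number of intentions with the highest final score."""
    finals = final_scores(features, scores)
    return max(finals, key=finals.get)


finals = final_scores(["action_target_1"])
estimated = estimate_count(["action_target_1"])
```

In practice, a product of many small scores can underflow, so an implementation might sum logarithms instead; that is a design choice outside what the text describes.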

The description now returns to the flowchart of FIG. 10.
The intention number estimation unit 106 determines whether the number of intentions estimated in step ST1004 is greater than 1 (step ST1005).
If the estimated number of intentions is greater than 1 in step ST1005 ("YES" in step ST1005), the processing proceeds to steps ST1010 to ST1014. The details of the processing from step ST1010 onward for the case where the estimated number of intentions is greater than 1 will be described later with a specific example.

If the estimated number of intentions is 1 or less in step ST1005 ("NO" in step ST1005), the process proceeds to step ST1006.
For example, for U1 in FIG. 9, the intention number estimation unit 106 estimated the number of intentions to be "1", so the process proceeds to step ST1006.
In step ST1006, the intention number estimation unit 106 outputs to the single intention estimation unit 108 the character string that is the result of the morphological analysis performed by the morphological analysis unit 103 in step ST1002. The single intention estimation unit 108 then uses the single intention estimation model (see FIG. 3) stored in the single intention estimation model storage unit 107 to estimate the user's intention for that character string, that is, for the single-intention utterance (step ST1006). For example, when the character string is "I want to go to ○○.", it estimates "destination setting [facility = ○○]" as the user's intention. Specifically, using the single intention estimation model, the single intention estimation unit 108 takes the intention that yields the highest score for the morphological analysis result produced by the morphological analysis unit 103 as the user's intention.
The single intention estimation unit 108 outputs this result to the command execution unit 112 as the single intention estimation result.
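The highest-score selection in step ST1006 can be sketched as follows. This is only an illustration: the layout of `model` (intention mapped to per-morpheme scores), the use of a plain sum, and all numeric values are assumptions, since the contents of FIG. 3 are not reproduced in this passage.

```python
def estimate_single_intent(morphemes, model):
    """Pick the intention whose summed morpheme scores are highest.

    `model` maps intention -> {morpheme: score}. Both this layout and the
    plain sum are assumptions made for illustration, not taken from FIG. 3.
    """
    totals = {
        intent: sum(weights.get(m, 0.0) for m in morphemes)
        for intent, weights in model.items()
    }
    return max(totals, key=totals.get)


# Hypothetical scores for two intentions.
toy_model = {
    "destination setting": {"行き": 0.6, "たい": 0.2},
    "route change": {"高速道路": 0.7},
}
intent = estimate_single_intent(["○○", "へ", "行き", "たい"], toy_model)
```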

The command execution unit 112 causes the command processing unit of the navigation device to execute the command corresponding to the single intention estimation result output from the single intention estimation unit 108 in step ST1006 (step ST1007). For example, the command execution unit 112 causes the command processing unit of the navigation device to set facility ○○ as the destination.
The command execution unit 112 also outputs execution operation information, which indicates the content of the command executed in step ST1007, to the response generation unit 113.

Based on the execution operation information output from the command execution unit 112 in step ST1007, the response generation unit 113 generates response data corresponding to the command that the command execution unit 112 caused the command processing unit to execute (step ST1008). The response generation unit 113 outputs the generated response data to the notification control unit 114.

The notification control unit 114 causes, for example, a speaker of the navigation device to output speech based on the response data output from the response generation unit 113 in step ST1008 (step ST1009). As a result, as shown at "S2" in FIG. 9, speech such as "○○ has been set as the destination." is output, notifying the user of the executed command.

Next, suppose that the user utters "△△も寄って、高速道路を選択して。" ("Also stop by △△, and select the expressway."), as shown at "U2" in FIG. 9. The operation of the intention estimation device 1 in this case is described with reference to FIG. 10.
When the user speaks as shown at "U2", the voice receiving unit 101 receives the speech, and the speech recognition unit 102 performs speech recognition processing on it (step ST1001) and converts it into a character string. The speech recognition unit 102 outputs the character string to the morphological analysis unit 103 and the intention number estimation unit 106.
The morphological analysis unit 103 performs morphological analysis processing on the character string output from the speech recognition unit 102 (step ST1002). For example, the morphological analysis unit 103 obtains the morphemes "△△", "も" (also), "寄っ" (stop by), "て", "高速道路" (expressway), "を", "選択し" (select), and "て", and outputs this morpheme information to the dependency analysis unit 104 as the morphological analysis result.

Next, the dependency analysis unit 104 performs dependency analysis processing on the morphological analysis result output from the morphological analysis unit 103 (step ST1003). Here, "△△" is the target of the action "寄っ" (stop by), "高速道路" (expressway) is the target of the action "選択" (select), and the actions "stop by" and "select" are in a parallel relation. The dependency analysis unit 104 therefore outputs the analysis results "operation target_2" and "parallel relation_1" to the intention number estimation unit 106 as dependency information.
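As a rough illustration of how the dependency information for "U2" could be reduced to count features, the Python sketch below tallies relation types. The triple format `(dependent, head, relation)` and the function name `dependency_features` are assumptions for illustration; the description does not say how the dependency analysis unit 104 represents its results internally.

```python
from collections import Counter


def dependency_features(relations):
    """Turn (dependent, head, relation) triples into '<relation>_<count>' features."""
    counts = Counter(relation for _, _, relation in relations)
    return [f"{relation}_{n}" for relation, n in sorted(counts.items())]


# Relations described for utterance U2.
u2_relations = [
    ("△△", "寄っ", "operation target"),
    ("高速道路", "選択", "operation target"),
    ("寄っ", "選択", "parallel relation"),
]
features = dependency_features(u2_relations)
```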

Using the acquired dependency information "operation target_2" and "parallel relation_1" as features, the intention number estimation unit 106 estimates the number of intentions with the intention number estimation model stored in the intention number estimation model storage unit 105 (step ST1004).
The specific operation of step ST1004 is as described in detail above with reference to FIG. 11. First, as in the case of "U1", the intention number estimation unit 106 checks the dependency information output from the dependency analysis unit 104 against the intention number estimation model and obtains the score of each piece of dependency information for each intention count (see step ST1101 in FIG. 11).
Next, the intention number estimation unit 106 calculates the final score for each candidate intention count using the formula shown in FIG. 13 (see step ST1102 in FIG. 11).

FIG. 15 shows an example of the final score of each intention count calculated by the intention number estimation unit 106 in the first embodiment.
Using the formula shown in FIG. 13, the intention number estimation unit 106 calculates the final scores shown in FIG. 15 for the user's utterance "U2". Here, for the intention count "1", the score of the feature "operation target_2" is 0.01 and the score of "parallel relation_1" is 0.01. As a result, the intention number estimation unit 106 calculates the final score S of the intention count "1" for the utterance "U2" as 1e-4 (= 0.0001). The intention number estimation unit 106 calculates the final scores of the other intention counts for the utterance "U2" in the same way.
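The formula of FIG. 13 is not reproduced in this passage, but both worked values (a single score of 0.2 giving S = 0.2, and 0.01 and 0.01 giving S = 1e-4) are consistent with multiplying the per-feature scores. The Python sketch below assumes that product form; the function name `final_score` is hypothetical.

```python
import math


def final_score(feature_scores):
    """Multiply the per-feature scores (assumed form of the FIG. 13 formula)."""
    return math.prod(feature_scores)


u1_count1 = final_score([0.2])         # values from the FIG. 14 example
u2_count1 = final_score([0.01, 0.01])  # values from the FIG. 15 example
```

With many features a product of small scores shrinks quickly, so an implementation might sum logarithms instead; that is a design note, not something the description states.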

The intention number estimation unit 106 estimates the number of intentions based on the calculated final score of each intention count (see step ST1103 in FIG. 11). Specifically, among the candidate intention counts, the intention number estimation unit 106 takes the count with the highest final score, "2", as the estimated number of intentions.

Returning to the flowchart of FIG. 10:
The intention number estimation unit 106 determines whether the number of intentions estimated in step ST1004 is greater than 1 (step ST1005).
If the estimated number of intentions is greater than 1 in step ST1005 ("YES" in step ST1005), the process proceeds to step ST1010.
Here, the estimated number of intentions is "2", which is greater than 1 ("YES" in step ST1005), so the process proceeds to step ST1010.
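The branch at step ST1005 is a simple comparison against 1. As a minimal sketch (the function name `next_step` is an assumption; the step labels are those of FIG. 10):

```python
def next_step(intent_count):
    """Mirror the ST1005 branch: more than one intention goes to ST1010."""
    return "ST1010" if intent_count > 1 else "ST1006"


u1_branch = next_step(1)  # single-intention path
u2_branch = next_step(2)  # compound-intention path
```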

In step ST1010, the intention number estimation unit 106 outputs to the compound intention estimation unit 110 the character string that is the result of the morphological analysis performed by the morphological analysis unit 103 in step ST1002. The compound intention estimation unit 110 then uses the compound intention estimation model (see FIG. 4) stored in the compound intention estimation model storage unit 109 to estimate the user's intentions for that character string, that is, for the multiple-intention utterance (step ST1010).

FIG. 16 shows an example of the determination results for the user's intentions obtained by the compound intention estimation unit 110 in the first embodiment.
For ease of explanation, FIG. 16 assumes that the compound intention estimation model stored in the compound intention estimation model storage unit 109 consists of three models: a determination intention estimation model for the intention "waypoint addition [facility = △△]", one for the intention "route change [expressway priority]", and one for the intention "destination setting [facility = △△]". That is, the compound intention estimation unit 110 determines whether the character string resulting from the morphological analysis by the morphological analysis unit 103 corresponds to each of these three intentions. When the intention estimation score of an intention evaluated with one of these three determination intention estimation models exceeds 0.5, the intention number estimation unit 106 determines that intention to be an applicable intention.
The intention estimation score is a probability value calculated from the sum of the scores of the morphemes. The intention estimation scores within each determination intention estimation model therefore sum to 1.
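The description says only that the intention estimation score is a probability derived from summed morpheme scores and that the scores within one determination model total 1. One common way to obtain such values is a softmax over the per-class sums; the Python sketch below uses that purely as an assumption, with hypothetical morpheme scores and a hypothetical function name.

```python
import math


def intention_probabilities(morpheme_scores):
    """Softmax over per-class sums of morpheme scores.

    `morpheme_scores` maps class label -> list of morpheme scores. The softmax
    is an assumed normalization; the source does not specify one.
    """
    sums = {label: sum(scores) for label, scores in morpheme_scores.items()}
    peak = max(sums.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in sums.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Hypothetical morpheme scores for one determination model (intention vs. other).
probs = intention_probabilities({
    "waypoint addition": [0.9, 0.4],
    "other intention": [0.1, 0.1],
})
```

Because each determination model has exactly two outcomes here (the intention or "other intention"), the two probabilities always total 1, which matches the property stated above.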

FIG. 16A shows the determination result of the determination intention estimation model for the intention "waypoint addition [facility = △△]". The compound intention estimation unit 110 obtains 0.75 as the intention estimation score of this intention. Because the score exceeds 0.5, the compound intention estimation unit 110 determines that "waypoint addition [facility = △△]" is an applicable intention of the character string of "U2".
FIG. 16B shows the determination result of the determination intention estimation model for the intention "route change [expressway priority]". Because the intention estimation score is 0.7, which exceeds 0.5 (see FIG. 16B), the compound intention estimation unit 110 determines that "route change [expressway priority]" is also an applicable intention of the character string of "U2".
FIG. 16C shows the determination result of the determination intention estimation model for the intention "destination setting [facility = △△]". Because the intention estimation score of "destination setting [facility = △△]" is 0.5 or less, the compound intention estimation unit 110 determines that the applicable intention of the character string of "U2" is not "destination setting [facility = △△]" but "other intention".
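The three determinations for "U2" apply the same threshold rule independently. A minimal Python sketch, assuming a helper named `determine`; the scores 0.75 and 0.7 come from FIGS. 16A and 16B as described, while 0.3 is a made-up stand-in for the FIG. 16C score, which the text gives only as "0.5 or less".

```python
OTHER = "other intention"


def determine(intent, score, threshold=0.5):
    """Return the intention if its score exceeds the threshold, else OTHER."""
    return intent if score > threshold else OTHER


u2_results = [
    determine("waypoint addition [facility = △△]", 0.75),   # FIG. 16A
    determine("route change [expressway priority]", 0.7),    # FIG. 16B
    determine("destination setting [facility = △△]", 0.3),  # 0.3 is hypothetical
]
```

Note that the comparison is strict: a score of exactly 0.5 falls into "other intention", in line with the "0.5 or less" wording.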

The compound intention estimation unit 110 outputs the applicable intentions obtained from the three intention estimation models shown in FIGS. 16A to 16C, namely "waypoint addition [facility = △△]", "route change [expressway priority]", and "other intention", to the estimation result integration unit 111 as the intention estimation result.

The estimation result integration unit 111 integrates the applicable intentions by adding to the integration result every applicable intention other than "other intention" among those output as the intention estimation result from the compound intention estimation unit 110 in step ST1010 (step ST1011).

As shown in FIG. 16A, the determination result of the determination intention estimation model for the intention "waypoint addition [facility = △△]" is "waypoint addition [facility = △△]", so the estimation result integration unit 111 adds "waypoint addition [facility = △△]" to the integration result. The estimation result integration unit 111 likewise adds the intention "route change [expressway priority]" to the integration result.
On the other hand, as shown in FIG. 16C, the determination result of the determination intention estimation model for the intention "destination setting [facility = △△]" is "other intention", so the estimation result integration unit 111 adds neither "destination setting [facility = △△]" nor "other intention" to the integration result.
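The integration in step ST1011 is in effect a filter that drops "other intention". A short Python sketch under that reading (the function name `integrate` and the list representation are assumptions):

```python
OTHER = "other intention"


def integrate(applicable_intents):
    """Keep every applicable intention except OTHER, preserving order."""
    return [intent for intent in applicable_intents if intent != OTHER]


integrated = integrate([
    "waypoint addition [facility = △△]",
    "route change [expressway priority]",
    OTHER,
])
```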

FIG. 17 shows an example of the integration result of the intentions integrated by the estimation result integration unit 111 in the first embodiment.
The estimation result integration unit 111 outputs the integration result of the estimated intentions to the command execution unit 112 as the compound intention estimation result.

The command execution unit 112 causes the command processing unit of the navigation device to execute the commands corresponding to the compound intention estimation result output from the compound intention estimation unit 110 in step ST1011 (step ST1012). For example, the command execution unit 112 causes the command processing unit of the navigation device to add facility △△ as a waypoint. The command execution unit 112 also causes the command processing unit of the navigation device to change the route to expressway priority.
The command execution unit 112 also outputs execution operation information, which indicates the content of the commands executed in step ST1012, to the response generation unit 113.

Based on the execution operation information output from the command execution unit 112 in step ST1012, the response generation unit 113 generates response data corresponding to the commands that the command execution unit 112 caused the command processing unit to execute (step ST1013). The response generation unit 113 outputs the generated response data to the notification control unit 114.

The notification control unit 114 causes, for example, a speaker of the navigation device to output speech based on the response data output from the response generation unit 113 in step ST1013 (step ST1014). As a result, as shown at "S3" in FIG. 9, speech such as "△△ has been added as a waypoint." and "The route has been set to expressway priority." is output, notifying the user of the executed commands.

As described above, according to the first embodiment, the intention estimation device 1 includes: a morphological analysis unit 103 that analyzes the morphemes contained in an acquired character string; an intention number estimation unit 106 that estimates the number of intentions for the character string and, according to the estimated number, determines whether the character string is a single-intention character string (single-intention utterance) containing only one intention or a multiple-intention character string (multiple-intention utterance) containing a plurality of intentions; a single intention estimation unit 108 that, when the intention number estimation unit 106 determines that the character string is a single-intention character string, estimates the intention for that character string as a single intention, based on the morphemes analyzed by the morphological analysis unit 103, using a single intention estimation model in which a degree of association with morphemes is associated with each intention; a compound intention estimation unit 110 that, when the intention number estimation unit 106 determines that the character string is a multiple-intention character string, estimates a plurality of intentions for that character string, based on the morphemes analyzed by the morphological analysis unit 103, using a compound intention estimation information model in which a degree of association with morphemes is associated with each of a plurality of intentions; and an estimation result integration unit 111 that integrates the plurality of intentions estimated by the compound intention estimation unit 110 as a compound intention. This makes it possible to estimate intentions accurately even when the acquired character string may be either a single-intention or a multiple-intention character string.

Second Embodiment
In the first embodiment, when it is estimated from the user's utterance that the user has two or more intentions, the estimation result integration unit 111 integrates the compound intention estimation results estimated by the compound intention estimation unit 110, and the command execution unit 112 causes the navigation device to execute the commands corresponding to the integrated compound intention estimation result.
The second embodiment further describes a configuration in which an upper limit is set on the number of intentions in the compound intention estimation result estimated by the compound intention estimation unit 110.
The second embodiment of the present invention is described below with reference to the drawings.

FIG. 18 shows a configuration example of the intention estimation device 1B according to the second embodiment.
The intention estimation device 1B of the second embodiment differs from the intention estimation device 1 described with reference to FIG. 1 in the first embodiment in that it includes an estimation result selection unit 117. The rest of the configuration of the intention estimation device 1B is the same as that of the intention estimation device 1 described with reference to FIG. 1 in the first embodiment, so components identical to those of the intention estimation device 1 are given the same reference numerals as in FIG. 1 and their description is not repeated.
In the second embodiment, the estimation result integration unit 111 outputs the compound intention estimation result, which is the integration result of the estimated intentions, to the estimation result selection unit 117. In doing so, the estimation result integration unit 111 includes the intention estimation scores in the compound intention estimation result output to the estimation result selection unit 117.
Also in the second embodiment, the intention number estimation unit 106 outputs information on the estimated number of intentions to the estimation result selection unit 117.

Taking the number of intentions output from the intention number estimation unit 106 as the upper limit on the number of intentions to output, the estimation result selection unit 117 selects the intentions to be used as the estimation result from the compound intention estimation result output from the estimation result integration unit 111, in descending order of intention estimation score. The specific method of selecting the estimated intentions is described later.
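The selection performed by the estimation result selection unit 117 can be read as a top-N cut by score, with N set to the estimated number of intentions. The Python sketch below illustrates that reading only; the function name `select_top_intents` and the list-of-pairs representation are assumptions. The three scores are the ones this embodiment gives for FIGS. 21A to 21C, and the limit of 2 is the intention count estimated for utterance U01.

```python
def select_top_intents(scored_intents, limit):
    """Return up to `limit` intentions, highest intention estimation score first."""
    ranked = sorted(scored_intents, key=lambda pair: pair[1], reverse=True)
    return [intent for intent, _ in ranked[:max(limit, 0)]]


u01_integrated = [
    ("waypoint deletion [facility = ○○]", 0.65),
    ("nearby search [facility type = convenience store]", 0.7),
    ("route deletion", 0.55),
]
selected = select_top_intents(u01_integrated, limit=2)
```

Python's `sorted` is stable, so intentions with equal scores keep their input order; the description does not say how such ties should be handled.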

The operation of the intention estimation device 1B in the second embodiment is described next.
FIG. 19 shows an example of a dialogue between the user and the navigation device in the second embodiment.
FIG. 20 is a flowchart for explaining the operation of the intention estimation device 1B in the second embodiment.

First, as shown in FIG. 19, the navigation device outputs the speech "Please speak after the beep." from, for example, a speaker of the navigation device (S01). Specifically, the voice control unit (not shown) of the intention estimation device 1B causes the navigation device to output the speech "Please speak after the beep."
When the navigation device outputs the speech "Please speak after the beep.", the user responds by uttering "○○は寄らなくていい、近くにコンビニある？" ("No need to stop by ○○. Is there a convenience store nearby?") (U01). Here, as shown in FIG. 19, speech that the navigation device outputs on instruction from the intention estimation device 1B is denoted "S", and an utterance from the user is denoted "U".

The description below follows the flowchart of FIG. 20. The specific operations of steps ST2001 to ST2011 and steps ST2013 to ST2015 in FIG. 20 are the same as those of steps ST1001 to ST1014 in FIG. 10 described in the first embodiment.

First, the voice receiving unit 101 receives the user's speech, the speech recognition unit 102 performs speech recognition processing on the received speech to convert it into a character string, and the morphological analysis unit 103 performs morphological analysis processing on the character string (steps ST2001 and ST2002). For example, the morphological analysis unit 103 obtains the morphemes "○○", "は", "寄ら" (stop by), "なく" (not), "て", "いい" (fine), "近く" (nearby), "に", "コンビニ" (convenience store), and "ある" (there is), and outputs this morpheme information to the dependency analysis unit 104 and the intention number estimation unit 106 as the morphological analysis result.
Next, the dependency analysis unit 104 performs dependency analysis processing on the character string (step ST2003). For example, "○○" is the target of the action "寄ら" (stop by), "コンビニ" (convenience store) is the target of the action "ある" (there is), and the actions "いい" (fine) and "ある" (there is) are in a parallel relation. The dependency analysis unit 104 therefore outputs the analysis results "operation target_2" and "parallel relation_1" to the intention number estimation unit 106 as dependency information.
The intention number estimation unit 106 then estimates the number of intentions using the dependency information output from the dependency analysis unit 104 (step ST2004). Here, the number of intentions estimated by the intention number estimation unit 106 is "2" (see step ST1104 in FIG. 11 described in the first embodiment), which is greater than 1 ("YES" in step ST2005), so the process moves to step ST2010 and the subsequent steps. The processing up to this point is the same as steps ST1001 to ST1005 in FIG. 10 described in the first embodiment.

In step ST2010, the intention number estimation unit 106 outputs to the compound intention estimation unit 110 the character string that is the result of the morphological analysis by the morphological analysis unit 103. The compound intention estimation unit 110 then estimates the user's intentions for the multiple-intention utterance.

FIG. 21 shows an example of the determination results for the user's intentions determined by the compound intention estimation unit 110 in the second embodiment.
For ease of explanation, FIG. 21 assumes that the compound intention estimation model stored in the compound intention estimation model storage unit 109 consists of three models: a determination intention estimation model for the intention "waypoint deletion [facility = ○○]", one for the intention "nearby search [facility type = convenience store]", and one for the intention "route deletion". As in the first embodiment, when the intention estimation score of an intention evaluated with one of these three determination intention estimation models exceeds 0.5, the intention number estimation unit 106 determines that intention to be an applicable intention.

In FIG. 21, FIG. 21A shows the determination result of the determination intention estimation model for the intention "waypoint deletion [facility = ○○]". The composite intention estimation unit 110 obtains an intention estimation score of 0.65 for this intention. Because the intention estimation score exceeds 0.5, the composite intention estimation unit 110 determines that the intention "waypoint deletion [facility = ○○]" is a corresponding intention of the character string "U01".
FIG. 21B shows the determination result of the determination intention estimation model for the intention "nearby search [facility type = convenience store]", and FIG. 21C shows the determination result of the determination intention estimation model for the intention "route deletion". Because the intention estimation score is 0.7 and exceeds 0.5 (see FIG. 21B), the composite intention estimation unit 110 determines that the intention "nearby search [facility type = convenience store]" is also a corresponding intention of the character string "U01". Likewise, because the intention estimation score is 0.55 and exceeds 0.5 (see FIG. 21C), the composite intention estimation unit 110 determines that "route deletion" is also a corresponding intention of the character string "U01".
The composite intention estimation unit 110 outputs the corresponding intentions obtained with the three intention estimation models shown in FIGS. 21A to 21C, namely "waypoint deletion [facility = ○○]", "nearby search [facility type = convenience store]", and "route deletion", to the estimation result integration unit 111.
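The per-model determination described for FIGS. 21A to 21C can be illustrated with a minimal Python sketch. The function name `judge` and the dictionary layout are hypothetical and do not come from the specification, and returning "other intention" when the score does not exceed the threshold is an assumption made for illustration; only the 0.5 threshold and the three scores are taken from the text.

```python
from typing import Dict, List

OTHER_INTENTION = "other intention"  # label assumed for a model whose score does not exceed the threshold
THRESHOLD = 0.5                      # threshold stated in the description


def judge(intention: str, score: float, threshold: float = THRESHOLD) -> str:
    """Return the intention if its score strictly exceeds the threshold, else the 'other' label."""
    return intention if score > threshold else OTHER_INTENTION


# Intention estimation scores shown in FIGS. 21A to 21C for the character string "U01".
fig21_scores: Dict[str, float] = {
    "waypoint deletion [facility = ○○]": 0.65,
    "nearby search [facility type = convenience store]": 0.7,
    "route deletion": 0.55,
}

# One binary judgment per determination intention estimation model.
judged: List[str] = [judge(intent, score) for intent, score in fig21_scores.items()]
print(judged)
```

With the scores of FIG. 21, all three models exceed the threshold, so all three intentions are passed on as corresponding intentions.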

Of the plurality of corresponding intentions output as intention estimation results from the composite intention estimation unit 110 in step ST2010, the estimation result integration unit 111 adds those other than "other intention" to the integration result, thereby integrating the corresponding intentions (step ST2011).

As shown in FIG. 21A, the determination result of the determination intention estimation model for the intention "waypoint deletion [facility = ○○]" is the intention "waypoint deletion [facility = ○○]", so the estimation result integration unit 111 adds the intention "waypoint deletion [facility = ○○]" to the integration result. Also, as shown in FIGS. 21B and 21C, the determination result of the determination intention estimation model for the intention "nearby search [facility type = convenience store]" is "nearby search [facility type = convenience store]", and the determination result of the determination intention estimation model for the intention "route deletion" is "route deletion", so the estimation result integration unit 111 likewise adds "nearby search [facility type = convenience store]" and "route deletion" to the integration result. In the second embodiment, the estimation result integration unit 111 also adds the intention estimation scores to the integration result.
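As a rough illustration of step ST2011, the following Python sketch keeps every judgment other than "other intention" together with its intention estimation score. The function name `integrate`, the tuple layout, and the fourth entry in the sample data are hypothetical and serve only to show the filtering; they are not part of the specification.

```python
from typing import List, Tuple

OTHER_INTENTION = "other intention"

Judgment = Tuple[str, float]  # (determined intention, intention estimation score)


def integrate(judgments: List[Judgment]) -> List[Judgment]:
    """Keep every judgment except 'other intention', preserving its score."""
    return [(label, score) for label, score in judgments if label != OTHER_INTENTION]


judgments: List[Judgment] = [
    ("waypoint deletion [facility = ○○]", 0.65),
    ("nearby search [facility type = convenience store]", 0.7),
    ("route deletion", 0.55),
    (OTHER_INTENTION, 0.9),  # hypothetical extra model result, not present in FIG. 21
]

integration_result = integrate(judgments)
print(integration_result)
```

Carrying the scores along in the integration result is what later allows the intentions to be ranked.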

FIG. 22 is a diagram showing an example of the integration result of the intentions integrated by the estimation result integration unit 111 in the second embodiment.
The estimation result integration unit 111 outputs the integration result of the estimated intentions to the estimation result selection unit 117 as the composite intention estimation result.

For the composite intention estimation result output from the estimation result integration unit 111 in step ST2011, the estimation result selection unit 117 takes the number of intentions output from the intention number estimation unit 106 in step ST2004 as the upper limit on the number of output intentions, selects intentions in descending order of intention estimation score from the composite intention estimation result, and sets the selected intentions as the final intention estimation result (step ST2012).
Specifically, the estimation result selection unit 117 uses the number of intentions output from the intention number estimation unit 106 as the upper limit on the number of output intentions and, using the intention estimation score as the criterion, selects only the estimated intentions with the highest intention estimation scores.

Here, in step ST2004, the intention number estimation unit 106 estimated the number of intentions to be two. The estimation result selection unit 117 therefore limits the number of final intention estimation results to two or fewer. The integration result produced by the estimation result integration unit 111 contains three intentions: "waypoint deletion [facility = ○○]", "nearby search [facility type = convenience store]", and "route deletion".
As shown in FIG. 22, the intention estimation scores are 0.65 for "waypoint deletion [facility = ○○]", 0.7 for "nearby search [facility type = convenience store]", and 0.55 for "route deletion".
Because the estimation result selection unit 117 uses the number of intentions output from the intention number estimation unit 106 as the upper limit and outputs the two intentions with the highest intention estimation scores in the composite intention estimation result, it selects "waypoint deletion [facility = ○○]" and "nearby search [facility type = convenience store]" as the final intention estimation result.
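The selection in step ST2012 amounts to a top-N cut by score, where N is the estimated number of intentions. A minimal Python sketch follows; the function name `select_final` and the tuple layout are hypothetical, and the handling of tied scores (original order is kept) is an assumption, since the specification does not address ties.

```python
from typing import List, Tuple

ScoredIntention = Tuple[str, float]  # (intention, intention estimation score)


def select_final(integrated: List[ScoredIntention], intention_count: int) -> List[ScoredIntention]:
    """Return at most `intention_count` intentions, highest score first."""
    ranked = sorted(integrated, key=lambda pair: pair[1], reverse=True)
    return ranked[:intention_count]


# Integration result of FIG. 22 and the intention count estimated in step ST2004.
integrated: List[ScoredIntention] = [
    ("waypoint deletion [facility = ○○]", 0.65),
    ("nearby search [facility type = convenience store]", 0.7),
    ("route deletion", 0.55),
]
final_result = select_final(integrated, intention_count=2)
print(final_result)
```

With these inputs, "route deletion" (0.55) is dropped and the two higher-scoring intentions remain. If fewer intentions than the upper limit are available, the slice simply returns all of them.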

In this way, in the intention estimation device 1B, the estimation result selection unit 117 removes "route deletion" from the composite intention estimation result. This suppresses the output of superfluous intention estimation results and improves the accuracy of intention estimation compared with the case where no upper limit is placed on the composite intention estimation result. As a result, a more appropriate final intention estimation result can be obtained.
FIG. 23 is a diagram showing an example of the content of the final intention estimation result generated by the estimation result selection unit 117 in the second embodiment.
The estimation result selection unit 117 outputs the final intention estimation result to the command execution unit 112.

The command execution unit 112 causes the command processing unit of the navigation device to execute the commands corresponding to the final intention estimation result output from the estimation result selection unit 117 in step ST2012 (step ST2013). For example, the command execution unit 112 causes the command processing unit of the navigation device to execute a command for deleting the waypoint and a command for searching for nearby convenience stores.
The response generation unit 113 then generates response data corresponding to the commands that the command execution unit 112 caused the command processing unit to execute (step ST2014), and the notification control unit 114 outputs the response data generated by the response generation unit 113 from a speaker provided in the navigation device (step ST2015). As a result, as shown in "S02" of FIG. 19, speech such as "Waypoint ○○ has been deleted." and "Searching for nearby convenience stores. Please select from the list." is output, and the user can be notified of the executed commands. The specific operation is the same as steps ST1012 to ST1014 of FIG. 10 described in the first embodiment.

As described above, according to the second embodiment, in addition to the configuration of the intention estimation device 1 according to the first embodiment, the device includes the estimation result selection unit 117, which, with the number of intentions estimated by the intention number estimation unit 106 as the upper limit, selects, from among the plurality of intentions integrated by the estimation result integration unit 111, the intentions ranked highest by the intention estimation score calculated when the intention number estimation unit 106 estimated the number of intentions, and sets them as the composite intention. The number of intentions obtained by the intention number estimation unit 106 is thereby used to set an output upper limit on the composite intention estimation result obtained by the estimation result integration unit 111, which suppresses the output of inappropriate intention estimation results and further improves the accuracy of the final integration result.

Note that some of the functions of the intention estimation devices 1 and 1B described above may be executed by another device. For example, some of the functions may be executed by an externally provided server or by a portable terminal such as a smartphone or tablet.

In the first and second embodiments described above, the intention estimation devices 1 and 1B estimate the user's intention based on speech uttered by the user, but the information from which the user's intention is estimated is not limited to this. For example, the intention estimation devices 1 and 1B may accept a character string entered by the user with an input device such as a keyboard and estimate the user's intention based on that character string.

Within the scope of the invention, the embodiments may be freely combined, any component of each embodiment may be modified, and any component of each embodiment may be omitted.

Because the intention estimation device according to the present invention is configured to improve the accuracy of estimating the intention of a character string, it can be applied to intention estimation devices and the like that recognize an input character string and estimate the user's intention.

1, 1B: intention estimation device; 2: intention number estimation model generation device; 101: speech reception unit; 102: speech recognition unit; 103: morphological analysis unit; 104: dependency analysis unit; 105: intention number estimation model storage unit; 106: intention number estimation unit; 107: single intention estimation model storage unit; 108: single intention estimation unit; 109: composite intention estimation model storage unit; 110: composite intention estimation unit; 111: estimation result integration unit; 112: command execution unit; 113: response generation unit; 114: notification control unit; 115: learning data storage unit; 116: intention number estimation model generation unit; 117: estimation result selection unit; 501: processing circuit; 502: HDD; 503: input interface device; 504: output interface device; 505: memory; 506: CPU.

Claims

An intention estimation device comprising:
a morphological analysis unit that analyzes the morphemes contained in an acquired character string;
an intention number estimation unit that estimates the number of intentions for the character string and determines, according to the estimated number of intentions, whether the character string is a single-intention character string containing only one intention or a multiple-intention character string containing a plurality of intentions;
a single intention estimation unit that, when the intention number estimation unit determines that the character string is a single-intention character string, estimates the intention for the single-intention character string as a single intention, based on the morphemes analyzed by the morphological analysis unit, using a single intention estimation model in which a degree of association with morphemes is associated with each intention;
a composite intention estimation unit that, when the intention number estimation unit determines that the character string is a multiple-intention character string, estimates a plurality of intentions for the multiple-intention character string, based on the morphemes analyzed by the morphological analysis unit, using a composite intention estimation model in which a degree of association with morphemes is associated with each of a plurality of intentions; and
an estimation result integration unit that integrates the plurality of intentions estimated by the composite intention estimation unit as a composite intention.

The intention estimation device according to claim 1, further comprising a dependency analysis unit that analyzes the relationships between the morphemes contained in the character string, based on the morphemes analyzed by the morphological analysis unit, and generates dependency information,
wherein the intention number estimation unit estimates the number of intentions for the character string based on the dependency information generated by the dependency analysis unit.

The intention estimation device, wherein the intention number estimation unit estimates the number of intentions for the character string using an intention number estimation model that takes the dependency information as a feature and has learned the correspondence between dependency information and the number of intentions.

The intention estimation device according to claim 1, further comprising an estimation result selection unit that, with the number of intentions estimated by the intention number estimation unit as an upper limit, selects, from among the plurality of intentions integrated by the estimation result integration unit, the intentions ranked highest by the intention estimation score calculated by the intention number estimation unit when estimating the number of intentions, and sets the selected intentions as the composite intention.

An intention estimation method comprising the steps of:
a morphological analysis unit analyzing the morphemes contained in an acquired character string;
an intention number estimation unit estimating the number of intentions for the character string and determining, according to the estimated number of intentions, whether the character string is a single-intention character string containing only one intention or a multiple-intention character string containing a plurality of intentions;
a single intention estimation unit estimating, when the intention number estimation unit determines that the character string is a single-intention character string, the intention for the single-intention character string as a single intention, based on the morphemes analyzed by the morphological analysis unit, using a single intention estimation model in which a degree of association with morphemes is associated with each intention;
a composite intention estimation unit estimating, when the intention number estimation unit determines that the character string is a multiple-intention character string, a plurality of intentions for the multiple-intention character string, based on the morphemes analyzed by the morphological analysis unit, using a composite intention estimation model in which a degree of association with morphemes is associated with each of a plurality of intentions; and
an estimation result integration unit integrating the plurality of intentions estimated by the composite intention estimation unit as a composite intention.

The intention estimation method according to claim 5, further comprising a step in which a dependency analysis unit analyzes the relationships between the morphemes contained in the character string, based on the morphemes analyzed by the morphological analysis unit, and generates dependency information,
wherein the intention number estimation unit estimates the number of intentions for the character string based on the dependency information generated by the dependency analysis unit.

The intention estimation method according to claim 6, further comprising a step in which the intention number estimation unit estimates the number of intentions for the character string using an intention number estimation model that takes the dependency information as a feature and has learned the correspondence between dependency information and the number of intentions.

The intention estimation method according to claim 5, further comprising a step in which an estimation result selection unit, with the number of intentions estimated by the intention number estimation unit as an upper limit, selects, from among the plurality of intentions integrated by the estimation result integration unit, the intentions ranked highest by the intention estimation score calculated by the intention number estimation unit when estimating the number of intentions, and sets the selected intentions as the composite intention.