WO2018229937A1 - Intention inference device and intention inference method - Google Patents

Intention inference device and intention inference method

Info

Publication number
WO2018229937A1
Authority
WO
WIPO (PCT)
Prior art keywords
intention
estimation
unit
character string
intentions
Prior art date
Application number
PCT/JP2017/022144
Other languages
French (fr)
Japanese (ja)
Inventor
▲イ▼ 景
悠介 小路
Original Assignee
Mitsubishi Electric Corporation (三菱電機株式会社)
Application filed by Mitsubishi Electric Corporation (三菱電機株式会社)
Priority to JP2019514140A priority Critical patent/JP6632764B2/en
Priority to PCT/JP2017/022144 priority patent/WO2018229937A1/en
Publication of WO2018229937A1 publication Critical patent/WO2018229937A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis

Definitions

  • the present invention relates to an intention estimation device and an intention estimation method for recognizing an input character string and estimating a user's intention.
  • An intention estimation device is known that recognizes speech uttered by a user, converts it into a character string, and estimates from the character string the user's intention as to what operation to perform. Since one utterance may contain a plurality of intentions (hereinafter also referred to as a multi-intention utterance), the intention estimation device is required to be able to estimate intentions for such multi-intention utterances.
  • In Non-Patent Document 1, a character string is expressed in a format called Bag of words, and the Bag of words is used as a feature quantity to learn a classifier (intention understanding model) such as a support vector machine or a log-linear model (also called a maximum entropy model); the intention is then estimated based on probability values calculated using the learning result.
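As a sketch of the Bag-of-words representation referred to above (a hypothetical illustration; the vocabulary and token list below are invented, not taken from Non-Patent Document 1), a character string can be reduced to a vector of word counts that a classifier such as a maximum entropy model then consumes:

```python
from collections import Counter

# Illustrative vocabulary; a real system derives this from training data.
VOCABULARY = ["search", "ramen", "shop", "chinese", "food"]

def bag_of_words(tokens):
    """Map a token list to a fixed-length count vector over VOCABULARY.

    Word order is discarded; only occurrence counts remain, which is
    what makes this a Bag-of-words feature quantity.
    """
    counts = Counter(tokens)
    return [counts[word] for word in VOCABULARY]

# "search for ramen shop" -> [1, 1, 1, 0, 0]
vector = bag_of_words(["search", "ramen", "shop"])
```

The classifier itself (support vector machine or log-linear model) then operates on such vectors rather than on raw text.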
  • For example, one character string such as "search for ramen shop and Chinese food" includes both the intention "search for ramen shop" and the intention "search for Chinese food"; even for a character string having such a structure, the intention of the speaker or the like is estimated.
  • When the intention estimation method disclosed in Non-Patent Document 1 is applied to the case where a plurality of intentions can be included in one utterance, a separate model is learned for each intention, and the determination results of the models are integrated.
  • The present invention has been made to solve the above-described problems, and an object thereof is to provide an intention estimation device that can accurately estimate the intention even when the acquired character string may be either a single-intention character string or a multi-intention character string.
  • The intention estimation apparatus includes: a morpheme analysis unit that analyzes the morphemes included in an acquired character string; an intention number estimation unit that estimates the number of intentions in the character string and, according to the estimated number, determines whether the character string is a single-intention character string including only one intention or a multi-intention character string including a plurality of intentions; a single intention estimation unit that, when the character string is determined to be a single-intention character string, estimates the intention of the character string based on the analyzed morphemes using a single intention estimation model in which a degree of association with each morpheme is associated with each intention; a multiple intention estimation unit that, when the character string is determined to be a multi-intention character string, estimates a plurality of intentions based on the analyzed morphemes using a multiple intention estimation model in which a degree of association with each morpheme is associated with each of a plurality of intentions; and an estimation result integration unit that integrates the plurality of intentions estimated by the multiple intention estimation unit into a composite intention.
  • FIG. 1 is a diagram illustrating a configuration example of an intention estimation apparatus according to Embodiment 1.
  • FIG. 2 is a diagram illustrating an example of an intention number estimation model in Embodiment 1.
  • FIG. 3 is a diagram illustrating an example of a single intention estimation model in Embodiment 1.
  • FIG. 4 is a diagram illustrating an example of a composite intention estimation model in Embodiment 1.
  • FIGS. 5A and 5B are diagrams illustrating an example of a hardware configuration of the intention estimation apparatus according to Embodiment 1.
  • FIG. 6 is a diagram illustrating a configuration example of an intention number estimation model generation device according to Embodiment 1.
  • FIG. 8 is a flowchart for explaining processing in which the intention number estimation model generation device generates an intention number estimation model in Embodiment 1.
  • FIG. 9 is a diagram illustrating an example of a dialogue performed between a user and a navigation device in Embodiment 1.
  • FIG. 10 is a flowchart for explaining the operation of the intention estimation apparatus according to Embodiment 1.
  • It is a flowchart for explaining an operation in Embodiment 1.
  • It is a diagram showing an example of the final score for each intention number calculated by the intention number estimation unit in Embodiment 1.
  • It is a diagram showing an example of the final score for each intention number calculated by the intention number estimation unit in Embodiment 1.
  • It is a diagram showing an example of the determination result of the user's intention, which is the estimation result of the composite intention estimation unit, in Embodiment 1.
  • It is a diagram showing the integration result of the intentions integrated by the estimation result integration unit in Embodiment 1.
  • It is a diagram illustrating a configuration example of an intention estimation apparatus according to Embodiment 2.
  • It is a diagram showing an example of a dialogue in Embodiment 2.
  • It is an example of the determination result of the user's intention determined by the composite intention estimation unit in Embodiment 2.
  • It is a diagram showing an example of the integration result of the intentions integrated by the estimation result integration unit in Embodiment 2.
  • It is a diagram showing an example of the content of the final intention estimation result generated in Embodiment 2.
  • Embodiment 1.
  • The intention estimation device 1 is installed, as an example, in a navigation device that performs route guidance for a user such as a vehicle driver; it estimates the user's intention from the content of the user's utterance and performs control that causes the navigation device to execute an operation according to the estimated intention.
  • the intention estimation device 1 may be connected to the navigation device via a network or the like.
  • The example of mounting in a navigation device is only an example; the intention estimation device 1 according to Embodiment 1 is not limited to users of a navigation device, and can be applied to any device that accepts information input by user utterance or the like and performs an operation corresponding to that information, to estimate the intention of the user of that device.
  • FIG. 1 is a diagram illustrating a configuration example of an intention estimation apparatus 1 according to the first embodiment.
  • The intention estimation device 1 includes a voice reception unit 101, a voice recognition unit 102, a morpheme analysis unit 103, a dependency analysis unit 104, an intention number estimation model storage unit 105, an intention number estimation unit 106, a single intention estimation model storage unit 107, a single intention estimation unit 108, a composite intention estimation model storage unit 109, a composite intention estimation unit 110, an estimation result integration unit 111, a command execution unit 112, a response generation unit 113, and a notification control unit 114.
  • In the first embodiment, as shown in FIG. 1, the intention estimation device 1 includes the intention number estimation model storage unit 105, the single intention estimation model storage unit 107, and the composite intention estimation model storage unit 109. However, this is not a limitation; these storage units may instead be provided outside the intention estimation device 1 in a place that the intention estimation device 1 can refer to.
  • the voice reception unit 101 receives a voice including a user's utterance.
  • the voice reception unit 101 outputs the received voice information to the voice recognition unit 102.
  • the voice recognition unit 102 recognizes voice data corresponding to the voice received by the voice reception unit 101 and converts it into a character string.
  • the voice recognition unit 102 outputs the character string to the morpheme analysis unit 103.
  • the morpheme analysis unit 103 performs morpheme analysis on the character string output from the speech recognition unit 102.
  • morpheme analysis is an existing natural language processing technique in which a character string is divided into morphemes that are the smallest units having meaning as a language, and parts of speech are given using a dictionary. For example, when a morphological analysis is performed on a character string “go to Tokyo Tower”, the character string is divided into morphemes such as “Tokyo Tower / proprietary noun, he / case particle, go / verb”.
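A toy stand-in for this step might look as follows (a real system would use a dictionary-based morphological analyser; the lookup table and tokens here are purely hypothetical):

```python
# Toy part-of-speech dictionary standing in for a real morpheme dictionary.
TOY_DICTIONARY = {
    "Tokyo Tower": "proper noun",
    "to": "case particle",
    "go": "verb",
}

def analyze_morphemes(tokens):
    """Attach a part of speech to each already-segmented morpheme."""
    return [(token, TOY_DICTIONARY.get(token, "unknown")) for token in tokens]
```

Real morphological analysis must also find the morpheme boundaries themselves, which is the hard part this sketch skips.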
  • the morpheme analysis unit 103 outputs the morpheme analysis result to the dependency analysis unit 104 and the intention number estimation unit 106.
  • the dependency analysis unit 104 analyzes the relationship between morphemes with respect to the character string after the morpheme analysis by the morpheme analysis unit 103, and generates dependency information.
  • the relationship between morphemes is a dependency relationship of morphemes included in a character string.
  • the dependency relationship refers to a relationship between morphemes such as “operation target” and “parallel relationship”.
  • the dependency analysis unit 104 may use an existing analysis method such as Shift-reduce or spanning tree as a dependency analysis method.
  • the dependency analysis unit 104 outputs the analysis result of the relationship between morphemes to the intention number estimation unit 106 as dependency information.
  • the intention number estimation model storage unit 105 stores an intention number estimation model.
  • the intention number estimation model is a model for estimating the number of intentions using dependency information as a feature amount.
  • FIG. 2 is a diagram illustrating an example of an intention number estimation model in the first embodiment.
  • the degree of association between each intention number and the dependency information is described as a score.
  • the dependency information is expressed in a form in which the relationship between the morphemes and the number of appearances thereof are connected by “_”. For example, as shown in FIG. 2, when a set of morphemes having a “parallel relationship” appears once in one character string, the dependency information is “parallel relationship_1”.
  • For example, "operation target_1" indicates that there is only one set of morphemes having the relationship "operation target" in one character string. A character string containing a single intention often yields the dependency information "operation target_1"; therefore, as shown in FIG. 2, the score of "operation target_1" for the intention number "1" is set high.
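The "relation_count" encoding described above can be sketched as follows (the relation names are taken from the examples in the text; the function name is an invented illustration):

```python
from collections import Counter

def encode_dependency_features(relations):
    """Turn a list of detected morpheme relations into 'relation_count'
    feature strings, e.g. one 'parallel relationship' pair in a character
    string -> 'parallel relationship_1'."""
    counts = Counter(relations)
    return {f"{relation}_{n}" for relation, n in counts.items()}
```

Each character string thus yields a small set of dependency-information features that the intention number estimation model scores.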
  • The intention number of the user is estimated by a statistical method using the intention number estimation model illustrated in FIG. 2.
  • The intention number estimation unit 106 estimates the number of intentions included in the character string using the intention number estimation model stored in the intention number estimation model storage unit 105, based on the dependency information output from the dependency analysis unit 104. A specific method of intention number estimation by the intention number estimation unit 106 will be described later.
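One plausible reading of this scoring scheme (the table values below are invented for illustration, not taken from FIG. 2) is to sum, for each candidate intention number, the model scores of the observed dependency features and pick the highest total:

```python
# Hypothetical score table in the shape of the intention number estimation
# model: candidate intention number -> dependency feature -> score.
INTENT_COUNT_MODEL = {
    1: {"operation target_1": 0.9, "parallel relationship_1": 0.1},
    2: {"operation target_1": 0.3, "parallel relationship_1": 0.8},
}

def estimate_intention_count(features):
    """Return the intention number whose summed feature scores are largest."""
    totals = {
        count: sum(scores.get(f, 0.0) for f in features)
        for count, scores in INTENT_COUNT_MODEL.items()
    }
    return max(totals, key=totals.get)
```

With these invented scores, a character string whose dependency information contains one parallel relationship would be judged to carry two intentions.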
  • the intention number estimation unit 106 determines whether the character string based on the voice received by the voice reception unit 101 is a single intention utterance or a multi-intention utterance according to the estimated number of intentions. In response, the morpheme analysis result of the character string output by the morpheme analysis unit 103 is output to the single intention estimation unit 108 or the composite intention estimation unit 110.
  • When the intention number estimation unit 106 determines that the character string based on the voice received by the voice reception unit 101 is a single-intention character string based on a single-intention utterance, it outputs the morphological analysis result of the character string produced by the morpheme analysis unit 103 to the single intention estimation unit 108. When it determines that the character string is a multi-intention character string based on a multi-intention utterance, it outputs the morphological analysis result of the character string produced by the morpheme analysis unit 103 to the composite intention estimation unit 110.
  • the intention number is estimated by a statistical method using the intention number estimation model, but the present invention is not limited to this.
  • A correspondence relationship between dependency information and the number of intentions may be prepared in advance as rules, and the number of intentions may be estimated from those rules. For example, the number of intentions can be estimated by a rule such as: if there is exactly one "parallel relationship" between a facility name and a facility type in the character string, the number of intentions included in the character string is "2".
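Such a rule table could be as simple as the following (the rules are invented examples in the spirit of the one given in the text):

```python
def rule_based_intention_count(dependency_counts):
    """Estimate the number of intentions from hand-written rules.

    dependency_counts maps a relation name to how often it appears in the
    character string, e.g. {"parallel relationship": 1}.
    """
    # Example rule from the text: one parallel relationship (e.g. between a
    # facility name and a facility type) implies two intentions.
    if dependency_counts.get("parallel relationship", 0) == 1:
        return 2
    # Invented fallback: otherwise assume a single intention.
    return 1
```

Rules trade the need for learning data against coverage: every dependency pattern must be anticipated by hand.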
  • a maximum entropy method can be used as the intention estimation method in the first embodiment, which will be described later.
  • The single intention estimation unit and the composite intention estimation unit use statistical methods to estimate the likelihood of the intention corresponding to the input morphemes, based on sets of morphemes and intentions collected in advance.
  • the single intention estimation model storage unit 107 stores an intention estimation model for performing intention estimation using morphemes as feature quantities.
  • the main intention indicates the classification or function of the intention.
  • The main intention is an upper-layer intention, such as destination setting or music playback, generated in response to an input made by the user first operating an input device (not shown).
  • the slot name and the slot value indicate information necessary for executing the main intention.
  • FIG. 3 is a diagram illustrating an example of a single intention estimation model in the first embodiment.
  • In the single intention estimation model, the score of each morpheme with respect to an intention represents the degree of association between the intention and that morpheme; the higher the degree of association, the higher the score is set, as illustrated in FIG. 3.
  • The single intention estimation unit 108 estimates the user's intention using the single intention estimation model stored in the single intention estimation model storage unit 107, based on the morphological analysis result of the character string output from the morphological analysis unit 103. Specifically, the single intention estimation unit 108 uses the single intention estimation model to select the intention for which the total score of the morphemes analyzed by the morpheme analysis unit 103 is largest. The single intention estimation unit 108 outputs the estimation result to the command execution unit 112 as a single intention estimation result.
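This selection can be sketched as an argmax over per-intention score sums (the intentions, morphemes, and weights below are invented for illustration):

```python
# Hypothetical single intention estimation model: intention -> morpheme -> score.
SINGLE_INTENT_MODEL = {
    "destination setting": {"Tokyo Tower": 0.9, "go": 0.8},
    "music playback": {"play": 0.9, "song": 0.7},
}

def estimate_single_intention(morphemes):
    """Return the intention whose morpheme scores sum to the largest value."""
    totals = {
        intent: sum(weights.get(m, 0.0) for m in morphemes)
        for intent, weights in SINGLE_INTENT_MODEL.items()
    }
    return max(totals, key=totals.get)
```

Exactly one intention is always returned, which is why this path is reserved for character strings judged to be single-intention.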
  • the compound intention estimation model storage unit 109 stores a compound intention estimation model created by learning different models for each intention.
  • The compound intention estimation model is created, for each intention, by statistical learning in which the learning data of the intention to be estimated is treated as positive examples and the learning data of all other intentions as negative examples; each resulting model determines whether or not an input belongs to its target intention.
  • FIG. 4 is a diagram illustrating an example of the composite intention estimation model in the first embodiment.
  • The composite intention estimation model includes a plurality of intention estimation models for determination, generated one for each intention.
  • the score of each morpheme with respect to the intention is the degree of association between the intention and each morpheme.
  • That is, the compound intention estimation model is created separately for each of a plurality of intentions by learning the degree of association between the intention and morphemes, so that a degree of association with each morpheme is associated with each intention.
  • The composite intention estimation unit 110 uses the composite intention estimation model stored in the composite intention estimation model storage unit 109 to determine, for each intention estimation model for determination, whether or not the character string based on the voice received by the voice reception unit 101 has the corresponding intention, based on the morphological analysis result of the character string output from the morphological analysis unit 103. Specifically, for each intention estimation model for determination, the composite intention estimation unit 110 determines whether the score associated with the analyzed morphemes and the intention is greater than or equal to a preset threshold, and thereby whether or not the character string has the corresponding intention. The composite intention estimation unit 110 outputs the determination result for each intention estimation model for determination included in the composite intention estimation model to the estimation result integration unit 111 as an estimation result.
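In contrast to the single-intention argmax, each per-intention model here makes an independent thresholded yes/no decision, so several intentions can fire for one character string. A sketch (threshold value and weights are invented; the text only specifies "a preset threshold"):

```python
THRESHOLD = 1.0  # invented value for illustration

# Hypothetical per-intention determination models: intention -> morpheme -> score.
PER_INTENT_MODELS = {
    "search ramen shop": {"search": 0.4, "ramen": 0.9},
    "search Chinese food": {"search": 0.4, "chinese": 0.9},
    "music playback": {"play": 0.9},
}

def estimate_multiple_intentions(morphemes):
    """Return every intention whose summed morpheme score meets the threshold."""
    detected = []
    for intent, weights in PER_INTENT_MODELS.items():
        score = sum(weights.get(m, 0.0) for m in morphemes)
        if score >= THRESHOLD:
            detected.append(intent)
    return detected
```

For "search for ramen shop and Chinese food", both search intentions exceed the threshold and both are returned, ready for integration.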
  • The estimation result integration unit 111 integrates the estimation results for each intention estimation model for determination included in the compound intention estimation model output from the compound intention estimation unit 110.
  • the estimation result integration unit 111 outputs the estimated intention integration result to the command execution unit 112 as a composite intention estimation result.
  • Based on the execution operation information output from the command execution unit 112, the response generation unit 113 generates response data corresponding to the command that the command execution unit 112 has caused the command processing unit to execute.
  • the response data may be generated in the form of text data or in the form of audio data.
  • When the response generation unit 113 generates the response data in the form of voice data, it is sufficient to generate voice data for outputting a synthesized sound such as "Searching for nearby restaurants. Please select from the list."
  • the response generation unit 113 outputs the generated response data to the notification control unit 114.
  • the notification control unit 114 outputs the response data output from the response generation unit 113 from, for example, an output device such as a speaker included in the navigation device, and notifies the user. That is, the notification control unit 114 controls the output device to notify the user that the command has been executed by the command processing unit.
  • the notification mode may be anything as long as the user can recognize the notification, such as notification by display, notification by voice, or notification by vibration.
  • FIGS. 5A and 5B are diagrams showing an example of the hardware configuration of the intention estimation apparatus 1 according to Embodiment 1 of the present invention.
  • The functions of the speech recognition unit 102, the morpheme analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the composite intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 are realized by a processing circuit 501.
  • That is, the intention estimation device 1 includes a processing circuit for controlling the process of estimating the user's intention and the process of executing and notifying a machine command corresponding to the estimated intention, based on the received information related to the user's utterance.
  • the processing circuit 501 may be dedicated hardware as shown in FIG. 5A or may be a CPU (Central Processing Unit) 506 that executes a program stored in the memory 505 as shown in FIG. 5B.
  • When the processing circuit 501 is dedicated hardware, the processing circuit 501 corresponds to, for example, a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a combination of these.
  • When the processing circuit 501 is the CPU 506, the functions of the speech recognition unit 102, the morpheme analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the composite intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 are realized by software, firmware, or a combination of software and firmware.
  • The software and firmware are described as programs and stored in an HDD (Hard Disk Drive) 502, the memory 505, or the like, and the functions of the above units are realized by the CPU 506 that executes the stored programs, or by a processing circuit such as a system LSI (Large-Scale Integration).
  • That is, the programs stored in the HDD 502, the memory 505, and the like cause a computer to execute the procedures and methods of the speech recognition unit 102, the morpheme analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the composite intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114.
  • The memory 505 corresponds to, for example, a nonvolatile or volatile semiconductor memory such as a RAM (Random Access Memory), a ROM (Read Only Memory), a flash memory, an EPROM (Erasable Programmable Read Only Memory), or an EEPROM (Electrically Erasable Programmable Read Only Memory), or to a magnetic disk, a flexible disk, an optical disk, a compact disc, a mini disc, or a DVD (Digital Versatile Disc).
  • Note that some of the functions of the speech recognition unit 102, the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the composite intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 may be realized by dedicated hardware and others by software or firmware.
  • For example, the function of the speech recognition unit 102 can be realized by the processing circuit 501 as dedicated hardware, while the functions of the morpheme analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the composite intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 can be realized by the processing circuit reading and executing a program stored in the memory 505.
  • The intention number estimation model storage unit 105, the single intention estimation model storage unit 107, and the composite intention estimation model storage unit 109 use, for example, the HDD 502. This is merely an example; these storage units may also be configured by a DVD, the memory 505, or the like.
  • the intention estimation device 1 includes an input interface device 503 and an output interface device 504 that communicate with an external device such as a navigation device.
  • the voice reception unit 101 includes an input interface device 503.
  • the operation of the intention estimation apparatus 1 according to Embodiment 1 will be described.
  • First, the operation related to the generation process of the intention number estimation model, which is a prerequisite for the operation of estimating the user's intention in the intention estimation device 1, will be described.
  • the generation process of the intention number estimation model is performed by the intention number estimation model generation apparatus 2 that is different from the intention estimation apparatus 1.
  • FIG. 6 is a diagram illustrating a configuration example of the intention number estimation model generation device 2 according to the first embodiment.
  • the intention number estimation model generation apparatus 2 includes a learning data storage unit 115, a morpheme analysis unit 103, a dependency analysis unit 104, and an intention number estimation model generation unit 116, as shown in FIG.
  • the configurations and operations of the morpheme analysis unit 103 and the dependency analysis unit 104 are the same as the configurations and operations of the morpheme analysis unit 103 and the dependency analysis unit 104 described with reference to FIG. A duplicate description is omitted.
  • the learning data storage unit 115 stores the correspondence between the character string and the number of intentions as learning data.
  • In the first embodiment, the intention number estimation model generation device 2 includes the learning data storage unit 115. However, this is not a limitation; the learning data storage unit 115 may be provided outside the intention number estimation model generation device 2 in a place that the generation device can refer to.
  • FIG. 7 is a diagram illustrating an example of learning data stored in the learning data storage unit 115 in the first embodiment.
  • The learning data is data in which an example sentence of a character string produced by speech or the like (hereinafter referred to as an utterance sentence example) is given a corresponding intention number.
  • the intention number “1” is given to the utterance sentence example 701 “I want to go to XX”.
  • the learning data is created in advance by a model creator or the like.
  • a model creator or the like creates learning data to which an intention number is assigned in advance for each utterance sentence example for a plurality of utterance sentence examples, and stores the learning data in the learning data storage unit 115.
  • The intention number estimation model generation unit 116 calculates, by a statistical method, the number of intentions corresponding to each utterance sentence example based on the learning data stored in the learning data storage unit 115 and the analysis results of the relationships between morphemes by the dependency analysis unit 104, and generates an intention number estimation model (see FIG. 2) indicating the correspondence between dependency information and the number of intentions.
  • the intention number estimation model generation unit 116 stores the generated intention number estimation model in the intention number estimation model storage unit 105.
  • FIG. 8 is a flowchart for explaining processing in which the intention number estimation model generation device 2 generates an intention number estimation model in the first embodiment.
  • The morphological analysis unit 103 performs morphological analysis on each sentence example of the learning data stored in the learning data storage unit 115 (step ST801). For example, in the case of the utterance sentence example 701 in FIG. 7, the morphological analysis unit 103 performs morphological analysis on "I want to go to XX" and obtains the result of dividing it into morphemes.
  • the morpheme analysis unit 103 outputs the morpheme analysis result to the dependency analysis unit 104.
  • The dependency analysis unit 104 performs dependency analysis using the morphemes analyzed by the morpheme analysis unit 103, based on the morphological analysis result output from the morpheme analysis unit 103 (step ST802). For example, in the case of the utterance sentence example 701, the dependency analysis unit 104 performs dependency analysis on the morphemes "OO", "HE", "GO", and "TAI", obtains the relationship "operation target" between the morphemes, appends the number of appearances to the analysis result, and outputs "operation target_1" as dependency information to the intention number estimation model generation unit 116.
  • The intention number estimation model generation unit 116 generates an intention number estimation model using the learning data stored in the learning data storage unit 115, based on the dependency information output by the dependency analysis unit 104 (step ST803). For example, in the case of the utterance sentence example 701 "I want to go to XX", the dependency information is "operation target_1" and the number of intentions given in the learning data is "1", as shown in FIG. 7. Therefore, when the utterance sentence example 701 is used, the intention number estimation model generation unit 116 learns so that the score of the intention number "1" for the dependency information "operation target_1" becomes higher than the scores of the other intention numbers.
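A deliberately naive training loop in this spirit (the text specifies a statistical method; the simple additive update below is used only to show the direction of learning, and the step size is invented):

```python
from collections import defaultdict

def train_intention_count_model(examples, step=0.1):
    """For each (dependency features, true intention number) pair, raise the
    score of every observed feature under the true count, so that the true
    count ends up scoring higher than the alternatives for those features.
    """
    model = defaultdict(lambda: defaultdict(float))
    for features, true_count in examples:
        for feature in features:
            model[true_count][feature] += step
    return {count: dict(scores) for count, scores in model.items()}

# Utterance sentence example 701 "I want to go to XX": dependency information
# {"operation target_1"}, intention number 1.
model = train_intention_count_model([({"operation target_1"}, 1)])
```

A statistical learner such as a maximum entropy model would additionally push the other intention numbers' scores down relative to the true one, rather than only raising the true count's scores.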
  • the intention number estimation model generation unit 116 performs the same processing as the above-described steps ST801 to ST803 on all utterance sentence examples included in the learning data, and finally generates an intention number estimation model as shown in FIG. Generate. Then, the intention number estimation model generation unit 116 stores the generated intention number estimation model in the intention number estimation model storage unit 105. Note that the intention number estimation model storage unit 105 is provided at a location accessible by the intention number estimation model generation device 2 via a network, for example.
  • in the above description, the intention number estimation model generation unit 116 uses all the dependency information output from the dependency analysis unit 104 as feature quantities for estimating the number of intentions.
  • however, the configuration is not limited to this.
  • the intention number estimation model generation unit 116 may instead select feature quantities by defining an explicit rule such as “use only parallel relationships” or “use only operation targets”, or may use only dependency information that, by a statistical method, is found to be highly effective for estimating the number of intentions.
  • in the above description, the intention number estimation model generation device 2, which is different from the intention estimation device 1, generates the intention number estimation model and stores it in the intention number estimation model storage unit 105.
  • however, the intention estimation device 1 may itself generate the intention number estimation model and store it in the intention number estimation model storage unit 105.
  • in that case, the intention estimation device 1 further includes the learning data storage unit 115 and the intention number estimation model generation unit 116 in addition to the configuration described with reference to FIG. 1. Note that the learning data storage unit 115 may be provided outside the intention estimation device 1 at a location that the intention estimation device 1 can refer to.
  • next, the operation of the intention estimation device 1 according to Embodiment 1 related to the intention estimation process using the intention number estimation model will be described.
  • FIG. 9 is a diagram illustrating an example of a dialogue performed between the user and the navigation device in the first embodiment.
  • FIG. 10 is a flowchart for explaining the operation of the intention estimation apparatus 1 according to the first embodiment.
  • the navigation device outputs, for example, a voice “Please speak when it beeps” from a speaker included in the navigation device (S1).
  • specifically, a voice control unit (not shown) of the intention estimation device 1 causes the navigation device to output the voice “Please speak when it beeps.”
  • when the navigation device outputs the voice “Please speak when it beeps”, the user utters “I want to go to XX” in response (U1).
  • the voice output by the navigation device in response to an instruction from the intention estimation device 1 is represented as “S”
  • the utterance from the user is represented as “U”.
  • the voice receiving unit 101 receives the voice of the utterance.
  • the voice recognition unit 102 performs voice recognition processing on the voice received by the voice receiving unit 101 (step ST1001), and converts the voice into a character string.
  • the voice recognition unit 102 outputs the converted character string to the morpheme analysis unit 103.
  • the morpheme analysis unit 103 performs morpheme analysis processing on the character string output from the speech recognition unit 102 (step ST1002). For example, the morpheme analysis unit 103 obtains the morphemes “OO”, “HE”, “GO”, and “TAI”, and outputs the morpheme analysis result to the dependency analysis unit 104 and the intention number estimation unit 106.
  • the dependency analysis unit 104 performs dependency analysis processing on the morpheme analysis result output from the morpheme analysis unit 103 (step ST1003). For example, since the morpheme “XX” is the target of the operation “go”, the dependency analysis unit 104 analyzes that the character string output from the speech recognition unit 102 has an “operation target” relationship between morphemes. Further, since there is one “operation target”, the dependency analysis unit 104 analyzes this as “operation target_1”. Then, the dependency analysis unit 104 outputs the analysis result “operation target_1” as dependency information to the intention number estimation unit 106.
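  The dependency information produced in step ST1003, a relation type plus its count such as “operation target_1”, can be derived from an already-parsed relation list. A minimal sketch follows; the relation tuples and names are illustrative, not the device's internal format:

```python
from collections import Counter

def dependency_features(relations):
    """Turn parsed dependency relations, e.g. ('operation_target', 'XX', 'go'),
    into features of the form '<relation type>_<count>'."""
    by_type = Counter(rel_type for rel_type, *_ in relations)
    return [f"{t}_{n}" for t, n in sorted(by_type.items())]

# "I want to go to XX": a single operation-target relation.
single = dependency_features([("operation_target", "XX", "go")])
# A two-intention utterance: two operation targets and one parallel relation.
multi = dependency_features([
    ("operation_target", "A", "stop"),
    ("operation_target", "highway", "select"),
    ("parallel_relation", "stop", "select"),
])
```

The first call yields `operation_target_1` and the second yields `operation_target_2` and `parallel_relation_1`, matching the two dependency-information examples in the text.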
  • the intention number estimation unit 106 collates the dependency information “operation target_1” output from the dependency analysis unit 104 in step ST1003, as a feature quantity, with the intention number estimation model stored in the intention number estimation model storage unit 105, and estimates the number of intentions (step ST1004).
  • the intention number estimation operation by the intention number estimation unit 106 will be described in detail with reference to FIG.
  • FIG. 11 is a flowchart for explaining the operation of intention number estimation section 106 in step ST1004 of FIG.
  • the intention number estimation unit 106 collates the dependency information output from the dependency analysis unit 104 with the intention number estimation model, and acquires a score of each dependency information for each intention number (step ST1101).
  • FIG. 12 is a diagram illustrating an example of the dependency information score for each intention number acquired by the intention number estimation unit 106 in the first embodiment.
  • when the dependency information serving as the feature quantity is “operation target_1”, the intention number estimation unit 106 acquires, for example, 0.2 as the score of the feature quantity “operation target_1” for the intention number “1”.
  • the intention number estimation unit 106 similarly obtains the score of the feature quantity “operation target_1” for other intention numbers.
  • next, the intention number estimation unit 106 calculates the final score of each intention number for the estimation target, which is the single character string to be estimated, based on the scores acquired in step ST1101 (step ST1102).
  • the final score obtained by the intention number estimation unit 106 is, for each intention number, the product of the scores of all the feature quantities (dependency information) used for estimating that intention number.
  • FIG. 13 is a diagram illustrating the calculation formula used by the intention number estimation unit 106 to calculate the final score in the first embodiment. In FIG. 13, S is the final score of a certain intention number (hereinafter referred to as the target intention number) that is the final score calculation target among the plurality of intention numbers for the estimation target, and Si is the score of the i-th feature quantity with respect to the target intention number. That is, S is the product of the scores Si of all the feature quantities for the target intention number.
  • FIG. 14 is a diagram showing an example of the final score of each intention number calculated by the intention number estimation unit 106 in the first embodiment.
  • the intention number estimation unit 106 calculates the final score shown in FIG. 14 using the calculation formula shown in FIG. 13. In this example, there is only one piece of dependency information serving as the feature quantity, “operation target_1”, so the final score equals the score corresponding to the feature quantity “operation target_1”. As shown in FIG. 14, for the intention number “1”, the score of the feature quantity “operation target_1” is 0.2 and the final score S is 0.2. Similarly, the intention number estimation unit 106 calculates the final score for each of the other intention numbers.
  • the intention number estimation unit 106 estimates the number of intentions based on the final score of each intention number calculated in step ST1102 (step ST1103). Specifically, the intention number estimation unit 106 estimates the number of intentions having the highest final score among the calculated number of intentions of the estimation target as the number of intentions of the estimation target. Here, the intention number estimation unit 106 estimates the intention number “1” as the intention number.
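  Steps ST1101 to ST1103 can be sketched as follows: look up each feature's score per candidate intention number, take the product as the final score S (the formula of FIG. 13), and return the candidate with the highest S. The model value 0.2 mirrors the example above; the other values are hypothetical.

```python
from math import prod

def estimate_intention_number(features, model, candidates=(1, 2, 3)):
    """Final score S for each candidate number of intentions is the product
    of the per-feature scores S_i; the candidate with the highest S wins."""
    scores = {n: prod(model[f].get(n, 0.0) for f in features)
              for n in candidates}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical per-feature scores loosely following FIG. 12/14:
model = {"operation_target_1": {1: 0.2, 2: 0.05, 3: 0.01}}
n, s = estimate_intention_number(["operation_target_1"], model)  # n = 1
```

With a single feature the final score equals that feature's score, exactly as noted for FIG. 14.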
  • the intention number estimation unit 106 determines whether the intention number is larger than 1 as a result of estimating the intention number in step ST1004 (step ST1005).
  • in step ST1005, when the estimated number of intentions is larger than 1 (“YES” in step ST1005), the process proceeds to steps ST1010 to ST1014. Details of the processing from step ST1010 onward in this case will be described later with a specific example.
  • in step ST1005, when the estimated number of intentions is 1 or less (“NO” in step ST1005), the process proceeds to step ST1006.
  • here, as a result of the estimation by the intention number estimation unit 106, the number of intentions is “1”, so the process proceeds to step ST1006.
  • in step ST1006, the intention number estimation unit 106 outputs the character string that is the morpheme analysis result obtained by the morpheme analysis unit 103 in step ST1002 to the single intention estimation unit 108. Then, the single intention estimation unit 108, using the single intention estimation model stored in the single intention estimation model storage unit 107, estimates the intention with the largest score for the morpheme analysis result of the character string as the user's intention (step ST1006).
  • the single intention estimation unit 108 outputs the intention estimation result to the command execution unit 112 as a single intention estimation result.
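  A sketch of the single intention estimation of step ST1006, under the simplifying assumption that the single intention estimation model maps each intention to per-morpheme association scores; the intention labels and weights below are invented for illustration only.

```python
def estimate_single_intention(morphemes, single_intention_model):
    """Return the intention whose summed morpheme-association scores
    over the morpheme analysis result are largest."""
    def score(intention):
        weights = single_intention_model[intention]
        return sum(weights.get(m, 0.0) for m in morphemes)
    return max(single_intention_model, key=score)

# Hypothetical model: each intention with its morpheme association scores.
model = {
    "destination_setting[facility=XX]": {"XX": 0.6, "go": 0.3},
    "route_change[highway_priority]":   {"highway": 0.7, "select": 0.2},
}
intent = estimate_single_intention(["XX", "he", "go", "tai"], model)
```

For the morphemes of “I want to go to XX”, the destination-setting intention accumulates the highest score and is returned.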
  • the command execution unit 112 causes the command processing unit of the navigation device to execute a command corresponding to the single intention estimation result output from the single intention estimation unit 108 in step ST1006 (step ST1007).
  • the command execution unit 112 causes the command processing unit of the navigation device to execute an operation of setting the facility XX as the destination.
  • the command execution unit 112 outputs execution operation information indicating the content of the command executed in step ST1007 to the response generation unit 113.
  • the response generation unit 113 generates response data corresponding to the command executed by the command processing unit of the navigation device, based on the execution operation information output from the command execution unit 112 in step ST1007 (step ST1008).
  • the response generation unit 113 outputs the generated response data to the notification control unit 114.
  • the notification control unit 114 outputs voice based on the response data output from the response generation unit 113 in step ST1008, for example, from a speaker included in the navigation device (step ST1009). As a result, as shown in “S2” of FIG. 9, a voice such as “XX has been set as the destination” is output, and the user can be notified of the executed command.
  • the operation of the intention estimation device 1 in this case will also be described along the flowchart of FIG. 10.
  • first, the voice receiving unit 101 receives the voice of the utterance, and the voice recognition unit 102 performs voice recognition processing on the received voice and converts it into a character string (step ST1001). The voice recognition unit 102 outputs the converted character string to the morpheme analysis unit 103 and the intention number estimation unit 106.
  • the morpheme analysis unit 103 performs a morpheme analysis process on the character string output from the speech recognition unit 102 (step ST1002).
  • for example, the morpheme analysis unit 103 obtains the morphemes “ ⁇ ”, “mo”, “stop”, “te”, “highway”, “choose”, “select”, and “te”, and outputs information on the morphemes to the dependency analysis unit 104 as a morpheme analysis result.
  • the dependency analysis unit 104 performs dependency analysis processing on the morpheme analysis result output from the morpheme analysis unit 103 (step ST1003).
  • for example, “ ⁇ ” is the target of the operation “stop”, “highway” is the target of the operation “select”, and the operations “stop” and “select” are in a parallel relationship.
  • therefore, the dependency analysis unit 104 outputs the analysis results “operation target_2” and “parallel relationship_1” as dependency information to the intention number estimation unit 106.
  • the intention number estimation unit 106 collates the acquired dependency information “operation target_2” and “parallel relationship_1”, as feature quantities, with the intention number estimation model stored in the intention number estimation model storage unit 105, and estimates the number of intentions (step ST1004).
  • the specific operation of step ST1004 is as described in detail above with reference to FIG. 11.
  • that is, the intention number estimation unit 106 collates the dependency information output from the dependency analysis unit 104 with the intention number estimation model, and acquires the score of each piece of dependency information for each intention number (see step ST1101 in FIG. 11). Subsequently, the intention number estimation unit 106 calculates the final score for each intention number to be estimated using the calculation formula shown in FIG. 13 (see step ST1102 in FIG. 11).
  • FIG. 15 is a diagram showing an example of the final score of each intention number calculated by the intention number estimation unit 106 in the first embodiment.
  • the intention number estimation unit 106 calculates the final score shown in FIG. 15 for the utterance “U2” by the user, using the calculation formula shown in FIG. 13.
  • the score of the feature quantity “operation target_2” is 0.01, and the score of “parallel relationship_1” is 0.01.
  • the intention number estimation unit 106 calculates a final score for each of the other intention numbers for the utterance “U2”.
  • the intention number estimation unit 106 estimates the number of intentions based on the calculated final score of each intention number (see step ST1103 in FIG. 11). Specifically, the intention number estimation unit 106 estimates the number of intentions “2” having the highest final score among the calculated number of intentions of the estimation target as the number of intentions of the estimation target.
  • the intention number estimation unit 106 determines whether the intention number is larger than 1 as a result of estimating the intention number in step ST1004 (step ST1005).
  • in step ST1005, since the estimated number of intentions is larger than 1 (“YES” in step ST1005), the process proceeds to step ST1010.
  • in step ST1010, the intention number estimation unit 106 outputs the character string that is the morpheme analysis result obtained by the morpheme analysis unit 103 in step ST1002 to the composite intention estimation unit 110.
  • the composite intention estimation unit 110 estimates the user's intentions with respect to the character string, that is, the multi-intention utterance sentence, based on its morpheme analysis result, using the composite intention estimation model (see FIG. 4) stored in the composite intention estimation model storage unit 109 (step ST1010).
  • FIG. 16 is an example of a determination result of the user's intention, which is the estimation result by the composite intention estimation unit 110 in the first embodiment.
  • here, the composite intention estimation unit 110 determines that an intention whose intention estimation score, obtained using each of the above three intention estimation models for determination, exceeds 0.5 is a corresponding intention.
  • note that the intention estimation score is a probability value calculated based on the sum of the scores of each morpheme. Accordingly, the sum of the intention estimation scores in each intention estimation model for determination is “1”.
  • FIG. 16B shows the determination result of the intention estimation model for determination for the intention “route change [highway priority]”.
  • since the intention estimation score is 0.7 and exceeds 0.5, the composite intention estimation unit 110 determines the intention “route change [highway priority]” to be a corresponding intention (see FIG. 16B).
  • the estimation result integration unit 111 integrates, among the plurality of corresponding intentions output as the intention estimation results from the composite intention estimation unit 110 in step ST1010, the corresponding intentions other than “other intentions” by adding them to the integration result (step ST1011).
  • the estimation result integration unit 111 adds the intention “route change [highway priority]” to the integration result.
  • FIG. 17 is a diagram illustrating an example of an intention integration result integrated by the estimation result integration unit 111 in the first embodiment.
  • the estimation result integration unit 111 outputs the estimated intention integration result to the command execution unit 112 as a composite intention estimation result.
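  The flow of steps ST1010 and ST1011 can be sketched as one determination model per intention plus an integration step that keeps every corresponding intention except “other intentions”. The per-intention scoring below is a stand-in (a real model would output a calibrated probability) and the labels and weights are hypothetical.

```python
THRESHOLD = 0.5  # an intention whose score exceeds 0.5 is "corresponding"

def estimate_composite_intentions(morphemes, per_intention_models):
    """Run one determination model per intention and return its score."""
    return {intention: sum(weights.get(m, 0.0) for m in morphemes)
            for intention, weights in per_intention_models.items()}

def integrate(results):
    """Keep every corresponding intention except 'other', as the
    estimation result integration unit does."""
    return [i for i, s in results.items() if s > THRESHOLD and i != "other"]

# Hypothetical per-intention determination models:
models = {
    "waypoint_addition": {"stop": 0.4, "A": 0.3},
    "route_change[highway_priority]": {"highway": 0.4, "select": 0.3},
    "other": {"um": 0.9},
}
scores = estimate_composite_intentions(["A", "stop", "highway", "select"], models)
integrated = integrate(scores)
```

For the two-intention utterance, both the waypoint-addition and the route-change intentions exceed the threshold and survive integration, while “other” is discarded.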
  • the command execution unit 112 causes the command processing unit of the navigation device to execute commands corresponding to the composite intention estimation result output from the estimation result integration unit 111 in step ST1011 (step ST1012).
  • the command execution unit 112 causes the command processing unit of the navigation device to execute an operation of adding the facility ⁇ to the waypoint.
  • the command execution unit 112 causes the command processing unit of the navigation device to execute an operation of changing the route to the highway priority.
  • the command execution unit 112 outputs execution operation information indicating the content of the commands executed in step ST1012 to the response generation unit 113.
  • the response generation unit 113 generates response data corresponding to the commands executed by the command processing unit of the navigation device, based on the execution operation information output from the command execution unit 112 in step ST1012 (step ST1013).
  • the response generation unit 113 outputs the generated response data to the notification control unit 114.
  • the notification control unit 114 outputs voice based on the response data output from the response generation unit 113 in step ST1013, for example, from a speaker included in the navigation device (step ST1014).
  • as a result, voices such as “ ⁇ ⁇ has been added to the waypoints” and “the route has been changed to give priority to the expressway” are output, and the user can be notified of the executed commands.
  • as described above, the intention estimation device 1 according to Embodiment 1 includes: the morpheme analysis unit 103 that analyzes the morphemes included in an acquired character string; the intention number estimation unit 106 that estimates the number of intentions for the character string and, according to the estimated number, determines whether the character string is a single intention character string (single intention utterance) including only one intention or a multiple intention character string (multi-intention utterance) including a plurality of intentions; the single intention estimation unit 108 that, when the intention number estimation unit 106 determines that the character string is a single intention character string, estimates one intention for the single intention character string based on the morphemes analyzed by the morpheme analysis unit 103, using a single intention estimation model in which a degree of association with morphemes is associated with each intention; the composite intention estimation unit 110 that, when the intention number estimation unit 106 determines that the character string is a multiple intention character string, estimates a plurality of intentions for the multiple intention character string based on the morphemes analyzed by the morpheme analysis unit 103, using a composite intention estimation model in which a degree of association with morphemes is associated with each of a plurality of intentions; and the estimation result integration unit 111 that integrates the plurality of intentions estimated by the composite intention estimation unit 110 as a composite intention.
  • Embodiment 2.
  • in Embodiment 1, when it is estimated from the user's utterance that the number of the user's intentions is 2 or more, the estimation result integration unit 111 integrates the composite intention estimation results estimated by the composite intention estimation unit 110, and the command execution unit 112 causes the navigation device to execute commands corresponding to the integrated composite intention estimation result.
  • in Embodiment 2, a configuration in which an upper limit is set on the number of intentions of the composite intention estimation result estimated by the composite intention estimation unit 110 will be described. Embodiment 2 of the present invention will be described below with reference to the drawings.
  • FIG. 18 is a diagram illustrating a configuration example of the intention estimation apparatus 1B according to the second embodiment.
  • the intention estimation device 1B according to Embodiment 2 differs from the intention estimation device 1 described with reference to FIG. 1 in Embodiment 1 in that an estimation result selection unit 117 is provided. Since the other configuration of the intention estimation device 1B is the same as that of the intention estimation device 1 described with reference to FIG. 1 in Embodiment 1, the same components as those of the intention estimation device 1 are denoted by the same reference numerals as in FIG. 1, and duplicate description is omitted.
  • the estimation result integration unit 111 outputs a combined intention estimation result, which is an integration result of the estimated intention, to the estimation result selection unit 117.
  • the estimation result integration unit 111 also outputs the intention estimation score to the estimation result selection unit 117 by including it in the combined intention estimation result.
  • the intention number estimation unit 106 outputs information on the estimated intention number to the estimation result selection unit 117.
  • the estimation result selection unit 117 selects the intentions to be output as the estimation result from the top of the intention estimation scores of the composite intention estimation result output from the estimation result integration unit 111, using the intention number output from the intention number estimation unit 106 as the upper limit on the number of intentions to output. A specific method for selecting the estimated intentions will be described later.
  • FIG. 19 is a diagram illustrating an example of a dialogue performed between the user and the navigation device in the second embodiment.
  • FIG. 20 is a flowchart for explaining the operation of the intention estimation device 1B in the second embodiment.
  • the navigation device outputs, for example, a voice “Please speak when you hear a beep” from a speaker included in the navigation device (S01).
  • specifically, a voice control unit (not shown) of the intention estimation device 1B causes the navigation device to output the voice “Please speak when you hear a beep.”
  • when the navigation device outputs the voice, the user utters “Oh, you don't need to stop by XX. Is there a convenience store nearby?” in response (U01).
  • hereinafter, the voice output by the navigation device in response to an instruction from the intention estimation device 1B is represented as “S”
  • the utterance from the user is represented as “U”.
  • the specific operations of steps ST2001 to ST2011 and steps ST2013 to ST2015 of FIG. 20 are the same as the specific operations of steps ST1001 to ST1011 and steps ST1012 to ST1014 of FIG. 10 described in Embodiment 1.
  • that is, the voice receiving unit 101 receives the voice uttered by the user, the voice recognition unit 102 performs voice recognition processing on the received voice and converts it into a character string, and the morpheme analysis unit 103 performs morpheme analysis processing on the character string (steps ST2001 and ST2002). For example, the morpheme analysis unit 103 obtains the morphemes “XX”, “HA”, “Yorai”, “None”, “Te”, “Good”, “Near”, “Ni”, “Convenience store”, and “A”, and outputs information on the morphemes to the dependency analysis unit 104 and the intention number estimation unit 106 as a morpheme analysis result.
  • next, the dependency analysis unit 104 performs dependency analysis processing on the character string (step ST2003). For example, the dependency analysis unit 104 analyzes that “XX” is the target of the operation “stop by”, “convenience store” is the target of the operation “exist”, and the two operations are in a parallel relationship.
  • therefore, the dependency analysis unit 104 outputs the analysis results “operation target_2” and “parallel relationship_1” as dependency information to the intention number estimation unit 106.
  • intention number estimation section 106 estimates the number of intentions (step ST2004).
  • here, the number of intentions estimated by the intention number estimation unit 106 is “2” (see FIG. 11).
  • in step ST2005, since the estimated number of intentions is larger than “1”, the process proceeds to the processing from step ST2010 onward.
  • the steps up to this point are the same as steps ST1001 to ST1005 in FIG. 10 described in Embodiment 1.
  • in step ST2010, the intention number estimation unit 106 outputs the character string that is the morpheme analysis result obtained by the morpheme analysis unit 103 to the composite intention estimation unit 110. Then, the composite intention estimation unit 110 estimates the user's intentions with respect to the multi-intention utterance.
  • FIG. 21 is an example of a determination result of the user's intention determined by the composite intention estimation unit 110 in the second embodiment.
  • here, the composite intention estimation unit 110 determines that an intention whose intention estimation score, obtained using each of the above three intention estimation models for determination, exceeds 0.5 is a corresponding intention.
  • FIG. 21C shows the determination result of the intention estimation model for determination for the intention “route deletion”.
  • the estimation result integration unit 111 integrates, among the plurality of corresponding intentions output as the intention estimation results from the composite intention estimation unit 110 in step ST2010, the corresponding intentions other than “other intentions” by adding them to the integration result (step ST2011).
  • FIG. 22 is a diagram illustrating an example of an intention integration result integrated by the estimation result integration unit 111 in the second embodiment.
  • the estimation result integration unit 111 outputs the estimated intention integration result to the estimation result selection unit 117 as a composite intention estimation result.
  • the estimation result selection unit 117 selects intentions from the top of the intention estimation scores of the composite intention estimation result output from the estimation result integration unit 111 in step ST2011, using the intention number output from the intention number estimation unit 106 in step ST2004 as the upper limit on the number of intentions to output, and sets the selected estimated intentions as the final intention estimation result (step ST2012).
  • that is, the estimation result selection unit 117 selects only the estimated intentions with higher intention estimation scores, using the intention number output from the intention number estimation unit 106 as the output upper limit and the intention estimation score as the selection criterion.
  • in step ST2004, the intention number estimation unit 106 estimated the number of intentions as “2”. Therefore, the estimation result selection unit 117 limits the number of final intention estimation results to “2” or less.
  • the estimation result selection unit 117 sets the intention number output from the intention number estimation unit 106 as the output upper limit, selects the top two intentions by intention estimation score from the composite intention estimation result, and outputs them as the final intention estimation result.
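  The selection of step ST2012 is a top-k cut: rank the integrated intentions by intention estimation score and keep at most the estimated number of intentions. A minimal sketch with hypothetical intention labels and scores:

```python
def select_final_intentions(composite_results, intention_number):
    """Keep at most `intention_number` intentions, taken from the top of
    the intention estimation scores (estimation result selection unit)."""
    ranked = sorted(composite_results.items(),
                    key=lambda kv: kv[1], reverse=True)
    return [intention for intention, _ in ranked[:intention_number]]

# Hypothetical integrated result with three corresponding intentions:
results = {
    "waypoint_deletion": 0.8,
    "periphery_search[convenience_store]": 0.7,
    "route_deletion": 0.6,
}
final = select_final_intentions(results, intention_number=2)
```

With the estimated intention number “2” as the upper limit, the lowest-scoring third intention is dropped from the final intention estimation result.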
  • FIG. 23 is a diagram illustrating an example of the content of the final intention estimation result generated by the estimation result selection unit 117 in the second embodiment.
  • the estimation result selection unit 117 outputs the final intention estimation result to the command execution unit 112.
  • the command execution unit 112 causes the command processing unit of the navigation device to execute commands corresponding to the final intention estimation result output from the estimation result selection unit 117 in step ST2012 (step ST2013). For example, the command execution unit 112 causes the command processing unit of the navigation device to execute a command for deleting a waypoint and a command for searching for a nearby convenience store. Further, the response generation unit 113 generates response data corresponding to the commands executed by the command execution unit 112 (step ST2014), and the notification control unit 114 outputs voice based on the response data generated by the response generation unit 113, for example, from a speaker included in the navigation device (step ST2015). As a result, as shown in “S02” of FIG. 19, the user is notified of the executed commands.
  • as described above, according to Embodiment 2, the estimation result selection unit 117 is provided, which selects, from the composite intention estimation result integrated by the estimation result integration unit 111, the intentions with higher intention estimation scores, with the intention number estimated by the intention number estimation unit 106 as the upper limit, and outputs them as the composite intention.
  • thereby, an output upper limit is set for the composite intention estimation result obtained by the estimation result integration unit 111, and the output of inappropriate intention estimation results is suppressed, so that the accuracy of the final integration result is further improved.
  • note that some of the functions of the intention estimation devices 1 and 1B described so far may be executed by other devices.
  • for example, some of the functions may be executed by a server provided outside, or by a mobile terminal such as a smartphone or a tablet.
  • in the above, the intention estimation devices 1 and 1B estimate the user's intention based on the voice produced by the user's utterance.
  • however, the input is not limited to voice.
  • for example, the intention estimation devices 1 and 1B may accept a character string input by the user using an input device such as a keyboard, and estimate the user's intention based on that character string.
  • since the intention estimation device according to the present invention is configured to improve the accuracy of estimating the intention of a character string, it can be applied to an intention estimation device that recognizes an input character string and estimates a user's intention.
  • 1, 1B intention estimation device, 2 intention number estimation model generation device, 101 voice receiving unit, 102 voice recognition unit, 103 morpheme analysis unit, 104 dependency analysis unit, 105 intention number estimation model storage unit, 106 intention number estimation unit, 107 single intention estimation model storage unit, 108 single intention estimation unit, 109 composite intention estimation model storage unit, 110 composite intention estimation unit, 111 estimation result integration unit, 112 command execution unit, 113 response generation unit, 114 notification control unit, 115 learning data storage unit, 116 intention number estimation model generation unit, 117 estimation result selection unit, 501 processing circuit, 502 HDD, 503 input interface device, 504 output interface device, 505 memory, 506 CPU.

Abstract

A device comprising: a morphological analysis unit (103) for, on the basis of an acquired character string, carrying out an analysis of morphemes included in the character string; an intention quantity estimation unit (106) for estimating an intention quantity with respect to the character string, and according to the estimated intention quantity, determining whether the character string is a single-intention character string including only one intention, or a multiple-intention character string including a plurality of intentions; a single-intention inference unit (108) for inferring, if the intention quantity inference unit (106) determines that the character string is a single-intention character string, only one intention as the intention with respect to the single-intention character string, on the basis of the morphemes analyzed by the morphological analysis unit (103) and using single-intention inference models associated with degrees of relevance with the morphemes for each intention; a composite intention inference unit (110) for inferring, if the intention quantity inference unit (106) determines that the character string is a multiple-intention character string, a plurality of intentions with respect to the multiple-intention character string, on the basis of the morphemes analyzed by the morphological analysis unit (103) and using multiple-intention inference information models associated with degrees of relevance with the morphemes for each of the plurality of intentions; and an inference result integration unit (111) for integrating the plurality of inferences inferred by the multiple-intention inference unit (110) as a composite intention.

Description

Intention estimation device and intention estimation method
 The present invention relates to an intention estimation device and an intention estimation method for recognizing an input character string and estimating a user's intention.
 Conventionally, there has been known an intention estimation device that recognizes speech uttered by a user, converts it into a character string, and estimates from the character string the user's intention as to what operation the user wants to perform. Because a single utterance may include a plurality of intentions (hereinafter also referred to as a multi-intention utterance), the intention estimation device is required to be able to estimate intentions for multi-intention utterances.
 For example, in the supervised-learning method disclosed in Non-Patent Document 1, a character string is represented in a format called bag of words, and a classifier (intention understanding model) such as a support vector machine or a log-linear model (maximum entropy model) is trained with the bag of words as features; the intention is then estimated on the basis of probability values calculated using the trained model. According to this method, the intention of a speaker or the like can be estimated even when a single character string has a parallel structure containing a plurality of intentions, as in "Search for a ramen shop and Chinese food," which includes the intention "search for a ramen shop" and the intention "search for Chinese food."
 When the intention estimation method disclosed in Non-Patent Document 1 is applied to the case where a single utterance can include a plurality of intentions, a separate model is learned for each intention, and the determination results based on the individual models are integrated at execution time.
 However, in the above method of integrating determination results based on a plurality of models for one utterance at execution time, intention estimation is performed with each of the plurality of models even when the utterance includes only one intention (hereinafter also referred to as a single-intention utterance). As a result, a plurality of intentions may be estimated and output, and the overall intention estimation accuracy may be lowered.
 The present invention has been made to solve the above-described problems, and an object thereof is to provide an intention estimation device capable of accurately estimating an intention even when an acquired character string may be either a single-intention character string or a multi-intention character string.
 An intention estimation device according to the present invention includes: a morpheme analysis unit that analyzes morphemes included in an acquired character string; an intention number estimation unit that estimates the number of intentions in the character string and, according to the estimated number, determines whether the character string is a single-intention character string including only one intention or a multi-intention character string including a plurality of intentions; a single intention estimation unit that, when the intention number estimation unit determines that the character string is a single-intention character string, estimates the single intention of the character string on the basis of the morphemes analyzed by the morpheme analysis unit, using a single intention estimation model in which each intention is associated with degrees of relevance to morphemes; a composite intention estimation unit that, when the intention number estimation unit determines that the character string is a multi-intention character string, estimates a plurality of intentions for the character string on the basis of the morphemes analyzed by the morpheme analysis unit, using a composite intention estimation model in which each of a plurality of intentions is associated with degrees of relevance to morphemes; and an estimation result integration unit that integrates the plurality of intentions estimated by the composite intention estimation unit into a composite intention.
 According to the present invention, the accuracy of estimating a user's intention can be improved.
FIG. 1 is a diagram illustrating a configuration example of the intention estimation device according to Embodiment 1.
FIG. 2 is a diagram illustrating an example of the intention number estimation model in Embodiment 1.
FIG. 3 is a diagram illustrating an example of the single intention estimation model in Embodiment 1.
FIG. 4 is a diagram illustrating an example of the composite intention estimation model in Embodiment 1.
FIG. 5A and FIG. 5B are diagrams illustrating an example of the hardware configuration of the intention estimation device according to Embodiment 1.
FIG. 6 is a diagram illustrating a configuration example of the intention number estimation model generation device of Embodiment 1.
FIG. 7 is a diagram illustrating an example of the learning data stored in the learning data storage unit in Embodiment 1.
FIG. 8 is a flowchart for explaining the processing in which the intention number estimation model generation device generates the intention number estimation model in Embodiment 1.
FIG. 9 is a diagram illustrating an example of a dialogue between the user and the navigation device in Embodiment 1.
FIG. 10 is a flowchart for explaining the operation of the intention estimation device according to Embodiment 1.
FIG. 11 is a flowchart for explaining the operation of the intention number estimation unit in step ST1004 of FIG. 10 in Embodiment 1.
FIG. 12 is a diagram illustrating an example of the dependency information scores for each intention number acquired by the intention number estimation unit in Embodiment 1.
FIG. 13 is a diagram illustrating the calculation formula used by the intention number estimation unit to calculate the final score in Embodiment 1.
FIG. 14 is a diagram illustrating an example of the final score of each intention number calculated by the intention number estimation unit in Embodiment 1.
FIG. 15 is a diagram illustrating another example of the final score of each intention number calculated by the intention number estimation unit in Embodiment 1.
FIG. 16 is a diagram illustrating an example of the determination results of the user's intentions produced as estimation results by the composite intention estimation unit in Embodiment 1.
FIG. 17 is a diagram illustrating an example of the integration result of the intentions integrated by the estimation result integration unit in Embodiment 1.
FIG. 18 is a diagram illustrating a configuration example of the intention estimation device according to Embodiment 2.
FIG. 19 is a diagram illustrating an example of a dialogue between the user and the navigation device in Embodiment 2.
FIG. 20 is a flowchart for explaining the operation of the intention estimation device in Embodiment 2.
FIG. 21 is a diagram illustrating an example of the determination results of the user's intentions determined by the composite intention estimation unit in Embodiment 2.
FIG. 22 is a diagram illustrating an example of the integration result of the intentions integrated by the estimation result integration unit in Embodiment 2.
FIG. 23 is a diagram illustrating an example of the contents of the final intention estimation result generated by the estimation result selection unit in Embodiment 2.
 Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

Embodiment 1.
 As an example, the intention estimation device 1 according to Embodiment 1 is mounted on a navigation device that provides route guidance and the like to a user such as the driver of a vehicle; it estimates the user's intention from the content of the user's utterance and controls the navigation device to execute an operation corresponding to the estimated intention. The intention estimation device 1 may instead be connected to the navigation device via a network or the like.
 The example of being mounted on a navigation device is merely one example. The intention estimation device 1 according to Embodiment 1 is not limited to users of navigation devices, and can be applied as an intention estimation device that estimates the intention of the user of any device that accepts information input by the user through utterances or the like and operates in accordance with the accepted information.
 FIG. 1 is a diagram illustrating a configuration example of the intention estimation device 1 according to Embodiment 1.
 As shown in FIG. 1, the intention estimation device 1 includes a speech reception unit 101, a speech recognition unit 102, a morpheme analysis unit 103, a dependency analysis unit 104, an intention number estimation model storage unit 105, an intention number estimation unit 106, a single intention estimation model storage unit 107, a single intention estimation unit 108, a composite intention estimation model storage unit 109, a composite intention estimation unit 110, an estimation result integration unit 111, a command execution unit 112, a response generation unit 113, and a notification control unit 114.
 In Embodiment 1, as shown in FIG. 1, the intention number estimation model storage unit 105, the single intention estimation model storage unit 107, and the composite intention estimation model storage unit 109 are provided in the intention estimation device 1. However, this is not a limitation; these storage units may instead be provided outside the intention estimation device 1, at a location the intention estimation device 1 can refer to.
 The speech reception unit 101 receives speech including a user's utterance, and outputs information on the received speech to the speech recognition unit 102.
 The speech recognition unit 102 performs speech recognition on the speech data corresponding to the speech received by the speech reception unit 101, converts it into a character string, and outputs the character string to the morpheme analysis unit 103.
 The morpheme analysis unit 103 performs morphological analysis on the character string output from the speech recognition unit 102.
 Here, morphological analysis is an existing natural-language-processing technique that divides a character string into morphemes, the smallest units that carry meaning in a language, and assigns a part of speech to each morpheme using a dictionary. For example, when morphological analysis is performed on the character string "go to Tokyo Tower," the character string is divided into morphemes such as "Tokyo Tower / proper noun, to / case particle, go / verb."
 The morpheme analysis unit 103 outputs the morphological analysis result to the dependency analysis unit 104 and the intention number estimation unit 106.
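The segmentation step above can be illustrated with a minimal sketch. The dictionary entries and part-of-speech labels below are hypothetical stand-ins for a real morphological dictionary (a production system would use a full Japanese morphological analyzer); the sketch uses greedy longest-match segmentation only to show the shape of the output:

```python
# Minimal sketch of the morphological-analysis step (hypothetical dictionary;
# a real system would use a full morphological analyzer).
DICTIONARY = {
    "東京タワー": "proper noun",   # "Tokyo Tower"
    "へ": "case particle",         # "to"
    "行く": "verb",                # "go"
}

def analyze_morphemes(text):
    """Greedy longest-match segmentation against the dictionary."""
    morphemes = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest candidate first
            if text[i:j] in DICTIONARY:
                morphemes.append((text[i:j], DICTIONARY[text[i:j]]))
                i = j
                break
        else:
            morphemes.append((text[i], "unknown"))  # unmatched character
            i += 1
    return morphemes
```

For the utterance "東京タワーへ行く" ("go to Tokyo Tower"), `analyze_morphemes` returns `[("東京タワー", "proper noun"), ("へ", "case particle"), ("行く", "verb")]`, matching the example in the text.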
 The dependency analysis unit 104 analyzes the relationships between morphemes in the character string after the morphological analysis by the morpheme analysis unit 103, and generates dependency information. Here, a relationship between morphemes is a dependency relationship between the morphemes included in the character string, such as "operation target" or "parallel relationship." As the dependency analysis method, the dependency analysis unit 104 may use an existing method such as shift-reduce parsing or spanning-tree parsing.
 The dependency analysis unit 104 outputs the analysis result of the relationships between morphemes to the intention number estimation unit 106 as dependency information.
 The intention number estimation model storage unit 105 stores an intention number estimation model, which is a model for estimating the number of intentions using dependency information as features.
 FIG. 2 is a diagram illustrating an example of the intention number estimation model in Embodiment 1.
 In the intention number estimation model illustrated in FIG. 2, the degree of association between each number of intentions and each item of dependency information is described as a score.
 In Embodiment 1, an item of dependency information is expressed as a relationship between morphemes and its number of occurrences, connected by "_". For example, as shown in FIG. 2, when a pair of morphemes in a "parallel relationship" appears once in a character string, the dependency information is "parallel relationship_1."
 Among the dependency information shown in FIG. 2, "operation target_1" indicates that there is only one pair of morphemes in the "operation target" relationship in the character string, so the number of intentions is often "1." Therefore, as shown in FIG. 2, for "operation target_1," the score for the intention number "1" is higher than the scores for the intention numbers "2" and "3." In contrast, for "parallel relationship_1" and "operation target_2," the number of intentions is likely to be two or more, so the scores for the intention numbers "2" and "3" are higher than the score for the intention number "1." In this way, in the intention number estimation model, the higher the degree of association between a number of intentions and an item of dependency information, the higher the score that is set.
 For ease of explanation, FIG. 2 shows only three intention numbers: "1," "2," and "3."
 In Embodiment 1, the number of intentions of the user is estimated by a statistical method using the intention number estimation model illustrated in FIG. 2.
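As a rough sketch of how such a model can be applied, the snippet below scores each candidate intention number by summing the scores of the observed dependency features and picks the highest-scoring count. The feature names and score values are illustrative, in the style of FIG. 2, and the actual final-score formula of FIG. 13 is not reproduced here; a simple additive combination is assumed:

```python
# Illustrative intention-number estimation model in the style of FIG. 2.
# Feature names and scores are hypothetical; the real final-score formula
# (FIG. 13) is not reproduced here — a simple sum of scores is assumed.
INTENT_COUNT_MODEL = {
    "parallel relationship_1": {1: 0.1, 2: 0.6, 3: 0.3},
    "operation target_1":      {1: 0.7, 2: 0.2, 3: 0.1},
    "operation target_2":      {1: 0.1, 2: 0.5, 3: 0.4},
}

def estimate_intent_count(dependency_features):
    """Sum the model scores of the observed features per candidate count."""
    totals = {n: 0.0 for n in (1, 2, 3)}
    for feat in dependency_features:
        for n, score in INTENT_COUNT_MODEL.get(feat, {}).items():
            totals[n] += score
    return max(totals, key=totals.get)  # candidate count with the best total
```

For example, a character string whose dependency information is `["parallel relationship_1", "operation target_2"]` would be estimated to contain two intentions under these illustrative scores.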
 The intention number estimation unit 106 estimates the number of intentions included in the character string on the basis of the dependency information output from the dependency analysis unit 104, using the intention number estimation model stored in the intention number estimation model storage unit 105. A specific method of intention number estimation by the intention number estimation unit 106 will be described later.
 According to the estimated number of intentions, the intention number estimation unit 106 determines whether the character string based on the speech received by the speech reception unit 101 is a single-intention utterance or a multi-intention utterance, and, according to the determination result, outputs the morphological analysis result of the character string output by the morpheme analysis unit 103 to either the single intention estimation unit 108 or the composite intention estimation unit 110. Specifically, when the intention number estimation unit 106 determines that the character string based on the received speech is a single-intention character string based on a single-intention utterance, it outputs the morphological analysis result of the character string to the single intention estimation unit 108. When it determines that the character string is based on a multi-intention utterance, it outputs the morphological analysis result of the character string to the composite intention estimation unit 110.
 In Embodiment 1, the number of intentions is estimated by a statistical method using the intention number estimation model, but the invention is not limited to this. Instead of a statistical method, correspondences between dependency information and numbers of intentions may be prepared in advance as rules, and the number of intentions may be estimated from these rules. For example, the number of intentions can be estimated by a rule such as: "If a character string contains exactly one 'parallel relationship' between a facility name and a facility type, the number of intentions included in the character string is 2."
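The rule-based alternative quoted above can be sketched as follows. The representation of dependency items and the fallback of one intention are assumptions introduced for illustration; the text specifies only the single rule:

```python
def estimate_intent_count_by_rule(dependency_items):
    """Rule-based intention-number estimation (sketch).

    dependency_items: list of (relation, pair) tuples, where pair is the set
    of categories of the two related morphemes — an assumed representation.
    """
    parallels = [rel for rel, pair in dependency_items
                 if rel == "parallel relationship"
                 and pair == {"facility name", "facility type"}]
    if len(parallels) == 1:
        return 2  # the rule quoted in the text
    return 1      # assumed fallback; the text does not specify a default
```

Under this sketch, an utterance whose dependency information contains one facility-name/facility-type parallel relationship is judged to contain two intentions.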
 As the intention estimation method in Embodiment 1, described later, the maximum entropy method can be used, for example. At estimation time, the single intention estimation unit and the composite intention estimation unit use a statistical technique to estimate, from a large set of morpheme-intention pairs collected in advance, how likely each intention is for the input morphemes.
 The single intention estimation model storage unit 107 stores an intention estimation model for performing intention estimation using morphemes as features. An intention can be expressed in a form such as "<main intention>[<slot name>=<slot value>, ...]." Here, the main intention indicates the classification or function of the intention. In the example of the navigation device, the main intention corresponds to an upper-layer command issued, for example, in response to an input the user first makes by operating an input device (not shown), such as setting a destination or playing music.
 The slot name and slot value indicate information necessary for executing the main intention. For example, for the intention included in the character string "search for nearby restaurants," the main intention is "nearby search," the slot name is "facility type," and the slot value is "restaurant." Therefore, the intention included in the character string "search for nearby restaurants" can be expressed as "nearby search[facility type=restaurant]."
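The intention notation above can be rendered mechanically. The helper below is a hypothetical illustration of building the "<main intention>[<slot name>=<slot value>, ...]" form, not a function of the described device:

```python
def format_intent(main_intention, slots):
    """Render an intention as '<main intention>[<slot name>=<slot value>, ...]'."""
    inner = ", ".join(f"{name}={value}" for name, value in slots.items())
    return f"{main_intention}[{inner}]"
```

For instance, `format_intent("nearby search", {"facility type": "restaurant"})` yields the string "nearby search[facility type=restaurant]" used in the example above.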
 FIG. 3 is a diagram illustrating an example of the single intention estimation model in Embodiment 1.
 As shown in FIG. 3, the single intention estimation model represents the score of each morpheme for intentions such as "destination setting[facility=XX]" (where XX is a specific facility name; the same applies hereinafter) and "nearby search[facility type=restaurant]." In the single intention estimation model of Embodiment 1, the score of a morpheme for an intention is the degree of association between the intention and the morpheme: the higher the degree of association, the higher the score that is set. As shown in FIG. 3, the single intention estimation model is created by learning the degrees of association between intentions and morphemes, and associates each intention with its degrees of relevance to morphemes.
 For example, as shown in FIG. 3, for the morphemes "go" and "destination," the user is highly likely to intend destination setting, so the scores of "go" and "destination" for the intention "destination setting[facility=XX]" are higher than the scores of other morphemes. On the other hand, for the morphemes "delicious" and "meal," the user is highly likely to intend a search for nearby restaurants, so the scores of "delicious" and "meal" for the intention "nearby search[facility type=restaurant]" are higher than the scores of other morphemes.
 The single intention estimation unit 108 estimates the user's intention on the basis of the morphological analysis result of the character string output by the morpheme analysis unit 103, using the single intention estimation model stored in the single intention estimation model storage unit 107. Specifically, using the single intention estimation model, the single intention estimation unit 108 estimates, as the user's intention, the intention for which the score associated with the morphemes analyzed by the morpheme analysis unit 103 is the largest. The single intention estimation unit 108 outputs the estimation result to the command execution unit 112 as a single intention estimation result.
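The selection performed here can be sketched as an argmax over summed morpheme scores. The intention names follow the examples in the text, but the score values are hypothetical, in the style of FIG. 3; a real model would be trained, for example, by the maximum entropy method:

```python
# Illustrative single intention estimation model in the style of FIG. 3
# (scores are hypothetical, not trained values).
SINGLE_INTENT_MODEL = {
    "destination setting[facility=XX]": {
        "go": 0.8, "destination": 0.7, "delicious": 0.1, "meal": 0.1,
    },
    "nearby search[facility type=restaurant]": {
        "go": 0.2, "destination": 0.1, "delicious": 0.9, "meal": 0.8,
    },
}

def estimate_single_intent(morphemes):
    """Return the intention whose morpheme scores sum to the largest value."""
    def total(intent):
        weights = SINGLE_INTENT_MODEL[intent]
        return sum(weights.get(m, 0.0) for m in morphemes)
    return max(SINGLE_INTENT_MODEL, key=total)
```

Under these illustrative scores, the morphemes "delicious" and "meal" select the intention "nearby search[facility type=restaurant]," while "go" and "destination" select "destination setting[facility=XX]."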
 The composite intention estimation model storage unit 109 stores a composite intention estimation model created by learning a separate model for each intention. The composite intention estimation model is created by statistical learning in which, for each intention, the learning data of the estimation target intention is treated as positive examples and the learning data of all other intentions as negative examples; it is a model for making a binary decision as to whether an input belongs to the estimation target intention.
 FIG. 4 is a diagram illustrating an example of the composite intention estimation model in Embodiment 1.
 The composite intention estimation model includes a plurality of judgment intention estimation models, one generated for each intention.
 In FIG. 4, for ease of explanation, three intentions are shown as examples: "destination setting[facility=XX]" (see FIG. 4A), "nearby search[facility type=restaurant]" (see FIG. 4B), and "waypoint addition[facility=XX]" (see FIG. 4C). In the composite intention estimation model of Embodiment 1, the score of a morpheme for an intention is the degree of association between the intention and the morpheme: the higher the degree of association, the higher the score that is set. As shown in FIG. 4, the composite intention estimation model is created by learning the degrees of association between intentions and morphemes separately for each of the plurality of intentions, and associates each intention with its degrees of relevance to morphemes.
 Using the composite intention estimation model stored in the composite intention estimation model storage unit 109, the composite intention estimation unit 110 determines, for each judgment intention estimation model, whether the character string based on the speech received by the speech reception unit 101 has the corresponding intention, on the basis of the morphological analysis result of the character string output by the morpheme analysis unit 103. Specifically, for each judgment intention estimation model, the composite intention estimation unit 110 determines whether the score associated with the morphemes analyzed by the morpheme analysis unit 103 is equal to or greater than a preset threshold, and thereby determines whether the character string has the corresponding intention.
 The composite intention estimation unit 110 outputs the determination result of each judgment intention estimation model included in the composite intention estimation model to the estimation result integration unit 111 as an estimation result.
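The per-model threshold decision can be sketched as follows. The three judgment models mirror the intentions of FIG. 4, but the score values and the threshold are illustrative assumptions; the text does not give concrete numbers:

```python
# Illustrative judgment intention estimation models in the style of FIG. 4
# (one binary model per intention; scores and threshold are hypothetical).
JUDGMENT_MODELS = {
    "destination setting[facility=XX]":        {"go": 0.8, "destination": 0.7},
    "nearby search[facility type=restaurant]": {"delicious": 0.9, "meal": 0.8},
    "waypoint addition[facility=XX]":          {"via": 0.8, "stop": 0.6},
}
THRESHOLD = 1.0  # assumed value for the preset threshold

def estimate_composite_intents(morphemes):
    """For each judgment model, decide whether its score reaches the threshold."""
    results = {}
    for intent, weights in JUDGMENT_MODELS.items():
        score = sum(weights.get(m, 0.0) for m in morphemes)
        results[intent] = score >= THRESHOLD
    return results
```

The integration step can then collect the intentions judged positive, e.g. `[i for i, hit in estimate_composite_intents(morphemes).items() if hit]`, so that a multi-intention utterance yields several intentions at once.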
 The estimation result integration unit 111 integrates the estimation results, output by the composite intention estimation unit 110, of the individual judgment intention estimation models included in the composite intention estimation model.
 The estimation result integration unit 111 outputs the integration result of the estimated intentions to the command execution unit 112 as a composite intention estimation result.
 The command execution unit 112 causes the command processing unit of the navigation device to execute the command corresponding to the single intention estimation result output from the single intention estimation unit 108 or the composite intention estimation result output from the estimation result integration unit 111. For example, when the single intention estimation unit 108 estimates the intention "nearby search[facility type=restaurant]" for the user's utterance "Find a delicious restaurant" and outputs it as a single intention estimation result, the command execution unit 112 causes the command processing unit of the navigation device to execute a command to search for nearby restaurants.
 The command execution unit 112 outputs execution operation information indicating the contents of the command executed by the command processing unit to the response generation unit 113.
 On the basis of the execution operation information output from the command execution unit 112, the response generation unit 113 generates response data corresponding to the command that the command execution unit 112 caused the command processing unit to execute. The response data may be generated in the form of text data or in the form of speech data.
 When the response generation unit 113 generates the response data in the form of speech data, it may generate speech data for outputting a synthesized voice such as "Nearby restaurants have been found. Please select from the list."
 The response generation unit 113 outputs the generated response data to the notification control unit 114.
The notification control unit 114 outputs the response data output from the response generation unit 113 through an output device, for example a speaker included in the navigation device, thereby notifying the user. That is, the notification control unit 114 controls the output device so as to notify the user that the command has been executed by the command processing unit. Any mode of notification that the user can recognize may be used, such as notification by display, notification by voice, or notification by vibration.
Next, the hardware configuration of the intention estimation device 1 according to Embodiment 1 will be described.
FIGS. 5A and 5B are diagrams showing examples of the hardware configuration of the intention estimation device 1 according to Embodiment 1 of the present invention.
In Embodiment 1 of the present invention, the functions of the speech recognition unit 102, the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the compound intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 are implemented by a processing circuit 501. That is, the intention estimation device 1 includes the processing circuit 501 for controlling the process of estimating the user's intention based on the received information about the user's utterance, or the process of executing a machine command corresponding to the estimated intention and notifying the user of it.
The processing circuit 501 may be dedicated hardware as shown in FIG. 5A, or a CPU (Central Processing Unit) 506 that executes a program stored in a memory 505 as shown in FIG. 5B.
When the processing circuit 501 is dedicated hardware, the processing circuit 501 corresponds to, for example, a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a combination of these.
When the processing circuit 501 is the CPU 506, the functions of the speech recognition unit 102, the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the compound intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 are implemented by software, firmware, or a combination of software and firmware. That is, these units are implemented by a processing circuit such as the CPU 506, which executes programs stored in an HDD (Hard Disk Drive) 502, the memory 505, or the like, or a system LSI (Large-Scale Integration).
The programs stored in the HDD 502, the memory 505, or the like can also be said to cause a computer to execute the procedures and methods of the speech recognition unit 102, the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the compound intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114. Here, the memory 505 corresponds to, for example, a nonvolatile or volatile semiconductor memory such as a RAM (Random Access Memory), a ROM (Read Only Memory), a flash memory, an EPROM (Erasable Programmable Read Only Memory), or an EEPROM (Electrically Erasable Programmable Read-Only Memory), or to a magnetic disk, a flexible disk, an optical disc, a compact disc, a mini disc, a DVD (Digital Versatile Disc), or the like.
Note that some of the functions of the speech recognition unit 102, the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the compound intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 may be implemented by dedicated hardware, and the others by software or firmware. For example, the function of the speech recognition unit 102 may be implemented by the processing circuit 501 as dedicated hardware, while the functions of the morphological analysis unit 103, the dependency analysis unit 104, the intention number estimation unit 106, the single intention estimation unit 108, the compound intention estimation unit 110, the estimation result integration unit 111, the command execution unit 112, the response generation unit 113, and the notification control unit 114 may be implemented by a processing circuit reading out and executing programs stored in the memory 505.
The intention number estimation model storage unit 105, the single intention estimation model storage unit 107, and the compound intention estimation model storage unit 109 use, for example, the HDD 502. This is merely an example; the intention number estimation model storage unit 105, the single intention estimation model storage unit 107, and the compound intention estimation model storage unit 109 may instead be constituted by a DVD, the memory 505, or the like.
The intention estimation device 1 also includes an input interface device 503 and an output interface device 504 that communicate with external equipment such as the navigation device.
The voice reception unit 101 is constituted by the input interface device 503.
Next, the operation of the intention estimation device 1 according to Embodiment 1 will be described.
First, the operation related to the generation of the intention number estimation model, which is a prerequisite for the operation of estimating the user's intention in the intention estimation device 1, will be described.
Here, it is assumed that the intention number estimation model is generated by an intention number estimation model generation device 2 separate from the intention estimation device 1.
FIG. 6 is a diagram showing a configuration example of the intention number estimation model generation device 2 of Embodiment 1.
As shown in FIG. 6, the intention number estimation model generation device 2 includes a learning data storage unit 115, the morphological analysis unit 103, the dependency analysis unit 104, and an intention number estimation model generation unit 116.
The configurations and operations of the morphological analysis unit 103 and the dependency analysis unit 104 are the same as those described with reference to FIG. 1 and the like; the same reference numerals are therefore used and duplicate description is omitted.
The learning data storage unit 115 stores correspondences between character strings and numbers of intentions as learning data. Although the intention number estimation model generation device 2 is described here as including the learning data storage unit 115, this is not a limitation; the learning data storage unit 115 may instead be provided outside the intention number estimation model generation device 2, in a location that the intention number estimation model generation device 2 can refer to.
Here, FIG. 7 is a diagram showing an example of the learning data stored in the learning data storage unit 115 in Embodiment 1.
As shown in FIG. 7, the learning data is data in which a number of intentions is assigned to each utterance sentence example (hereinafter, utterance sentence example), that is, to each example of a character string produced as speech or the like. For example, the intention number "1" is assigned to the utterance sentence example 701 "I want to go to ○○".
The learning data is created in advance by the creator of the model or the like. The model creator prepares, for a plurality of utterance sentence examples, learning data in which a number of intentions is assigned to each example in advance, and stores it in the learning data storage unit 115.
Based on the learning data stored in the learning data storage unit 115 and on the analysis results of the relationships between morphemes obtained by the dependency analysis unit 104, the intention number estimation model generation unit 116 statistically learns the number of intentions corresponding to each utterance sentence example, and generates an intention number estimation model (see FIG. 2) indicating the correspondence between dependency information and numbers of intentions. The intention number estimation model generation unit 116 stores the generated intention number estimation model in the intention number estimation model storage unit 105.
FIG. 8 is a flowchart for explaining the process by which the intention number estimation model generation device 2 generates the intention number estimation model in Embodiment 1.
First, the morphological analysis unit 103 performs morphological analysis on each sentence example of the learning data stored in the learning data storage unit 115 (step ST801). For example, in the case of the utterance sentence example 701 in FIG. 7, the morphological analysis unit 103 performs morphological analysis on "○○へ行きたい" ("I want to go to ○○") and obtains the morphological analysis result "○○/noun, へ/case particle, 行き/verb, たい/auxiliary verb". The morphological analysis unit 103 outputs the morphological analysis result to the dependency analysis unit 104.
Based on the morphological analysis result output from the morphological analysis unit 103, the dependency analysis unit 104 performs dependency analysis using the morphemes analyzed by the morphological analysis unit 103 (step ST802). For example, in the case of the utterance sentence example 701, the dependency analysis unit 104 performs dependency analysis on the morphemes "○○", "へ", "行き", and "たい". From these morphemes, the dependency analysis unit 104 obtains "action target" as the analysis result of the relationship between morphemes, appends the count to that analysis result, and outputs "action target_1" to the intention number estimation model generation unit 116 as dependency information.
Based on the dependency information output by the dependency analysis unit 104, the intention number estimation model generation unit 116 generates the intention number estimation model using the learning data stored in the learning data storage unit 115 (step ST803). For example, in the case of the utterance sentence example 701 "I want to go to ○○", the dependency information is "action target_1", and the number of intentions included in the learning data is "1", as shown in FIG. 7. Therefore, when the utterance sentence example 701 is used, the intention number estimation model generation unit 116 learns so that, for the dependency information "action target_1", the score of the intention number "1" becomes higher than the scores of the other intention numbers. The intention number estimation model generation unit 116 performs the same processing as steps ST801 to ST803 above for all the utterance sentence examples included in the learning data, and finally generates an intention number estimation model as shown in FIG. 2.
Then, the intention number estimation model generation unit 116 stores the generated intention number estimation model in the intention number estimation model storage unit 105. The intention number estimation model storage unit 105 is provided in a location accessible to the intention number estimation model generation device 2, for example via a network.
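The learning step of steps ST801 to ST803 can be illustrated by the following sketch. The patent only states that the correspondence is learned "by a statistical method", so the estimator below (relative frequency with add-one smoothing over labeled examples) is an assumption chosen purely for illustration; the feature strings mirror the dependency information used in the text.

```python
from collections import Counter, defaultdict

# Hedged sketch of intention-number model generation: for each dependency
# feature, estimate a score per intention count from labeled examples.
# Relative frequency with add-one smoothing is an illustrative choice,
# not the patent's actual statistical method.

def train_intention_count_model(examples, counts=(1, 2, 3)):
    """examples: list of (dependency_features, intention_count) pairs."""
    freq = defaultdict(Counter)  # feature -> Counter over intention counts
    for features, n in examples:
        for f in features:
            freq[f][n] += 1
    model = {}
    for f, c in freq.items():
        total = sum(c[n] + 1 for n in counts)  # add-one smoothing
        model[f] = {n: (c[n] + 1) / total for n in counts}
    return model

examples = [
    (["action_target_1"], 1),                          # "I want to go to ○○"
    (["action_target_2", "parallel_relation_1"], 2),   # two-intention example
]
model = train_intention_count_model(examples)
```

With these two examples, the feature "action_target_1" ends up scoring highest for intention count 1, which is the behavior the text describes for utterance sentence example 701.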
Although the intention number estimation model generation unit 116 is described here as using all the dependency information output from the dependency analysis unit 104 as features for intention number estimation, the configuration of the intention number estimation model generation unit 116 is not limited to this. The intention number estimation model generation unit 116 may be configured to select features according to an explicit rule, such as "use only parallel relations" or "use only action targets", or to use only the dependency information that is, by a statistical method, found most effective for intention number estimation.
Further, although the intention number estimation model generation device 2, separate from the intention estimation device 1, is described here as generating the intention number estimation model and storing it in the intention number estimation model storage unit 105, this is not a limitation; the intention estimation device 1 may itself generate the intention number estimation model and store it in the intention number estimation model storage unit 105. In that case, the intention estimation device 1 further includes the learning data storage unit 115 and the intention number estimation model generation unit 116 in addition to the configuration described with reference to FIG. 1. The learning data storage unit 115 may be provided outside the intention estimation device 1, in a location that the intention estimation device 1 can refer to.
Next, on the premise that the intention number estimation model has been generated as described above and stored in the intention number estimation model storage unit 105, the operation related to the intention estimation processing of the intention estimation device 1 according to Embodiment 1, which uses that intention number estimation model, will be described.
Here, FIG. 9 is a diagram showing an example of a dialogue between the user and the navigation device in Embodiment 1.
FIG. 10 is a flowchart for explaining the operation of the intention estimation device 1 according to Embodiment 1.
First, as shown in FIG. 9, the navigation device outputs the voice "Please speak after the beep." from, for example, a speaker included in the navigation device (S1). Specifically, a voice control unit (not shown) of the intention estimation device 1 causes the navigation device to output the voice "Please speak after the beep."
When the navigation device outputs this voice, the user utters "I want to go to ○○." in response (U1). In FIG. 9, the voice that the navigation device outputs in response to an instruction from the intention estimation device 1 is denoted by "S", and an utterance from the user is denoted by "U".
When the user utters "I want to go to ○○" (U1), the voice reception unit 101 receives the voice of the utterance. The speech recognition unit 102 performs speech recognition processing on the voice received by the voice reception unit 101 (step ST1001) and converts the voice into a character string. The speech recognition unit 102 outputs the converted character string to the morphological analysis unit 103.
The morphological analysis unit 103 performs morphological analysis processing on the character string output from the speech recognition unit 102 (step ST1002). For example, the morphological analysis unit 103 obtains the morphemes "○○", "へ", "行き", and "たい", and outputs the information on these morphemes to the dependency analysis unit 104 and the intention number estimation unit 106 as a morphological analysis result.
The dependency analysis unit 104 performs dependency analysis processing on the morphological analysis result output from the morphological analysis unit 103 (step ST1003). For example, since the morpheme "○○" is the target of the action "行き" (go), the dependency analysis unit 104 analyzes that the character string output from the speech recognition unit 102 contains the inter-morpheme relationship "action target". Further, since there is one "action target", the dependency analysis unit 104 analyzes it as "action target_1". The dependency analysis unit 104 then outputs the analysis result "action target_1" to the intention number estimation unit 106 as dependency information.
Using the dependency information "action target_1" output from the dependency analysis unit 104 in step ST1003 as a feature, the intention number estimation unit 106 estimates the number of intentions with the intention number estimation model stored in the intention number estimation model storage unit 105 (step ST1004). The intention number estimation operation by the intention number estimation unit 106 will be described in detail with reference to FIG. 11.
FIG. 11 is a flowchart for explaining the operation of the intention number estimation unit 106 in step ST1004 of FIG. 10.
First, the intention number estimation unit 106 collates the dependency information output from the dependency analysis unit 104 against the intention number estimation model, and obtains the score of each piece of dependency information for each number of intentions (step ST1101).
Here, FIG. 12 is a diagram showing an example of the dependency information scores for each number of intentions obtained by the intention number estimation unit 106 in Embodiment 1.
As shown in FIG. 12, when the dependency information used as a feature is "action target_1", the intention number estimation unit 106 obtains, for example, 0.2 as the score of the feature "action target_1" for the intention number "1". The intention number estimation unit 106 likewise obtains the score of the feature "action target_1" for each of the other intention numbers.
Next, based on the scores for each number of intentions obtained in step ST1101, the intention number estimation unit 106 calculates the final score of each number of intentions for the estimation target, that is, the one character string whose number of intentions is to be estimated (step ST1102). In Embodiment 1, the final score calculated by the intention number estimation unit 106 is, for each number of intentions, the product of all the dependency information scores for that number of intentions. That is, the final score is, for each number of intentions, the product of the scores of all the features used for intention number estimation with respect to that number of intentions.
FIG. 13 is a diagram showing the formula used by the intention number estimation unit 106 to calculate the final score in Embodiment 1.
In FIG. 13, S is the final score, among the plural numbers of intentions for the estimation target, of the one number of intentions for which the final score is being calculated (hereinafter, the target intention number). Si is the score of the i-th feature for the target intention number.
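FIG. 13 itself is not reproduced here, but from the description above (the final score is the product of all feature scores for the target intention number), its formula presumably takes the following form, where n is the number of features used:

```latex
S \;=\; \prod_{i=1}^{n} S_i
```

This is a reconstruction from the surrounding text, with S and Si as defined above.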
FIG. 14 is a diagram showing an example of the final score of each number of intentions calculated by the intention number estimation unit 106 in Embodiment 1.
The intention number estimation unit 106 calculates the final scores shown in FIG. 14 using the formula shown in FIG. 13. In this example, the only dependency information used as a feature is "action target_1", so the final score equals the score corresponding to the feature "action target_1".
As shown in FIG. 14, for the intention number "1", the score of the feature "action target_1" is 0.2, and the final score S is therefore also 0.2. The intention number estimation unit 106 likewise calculates the final score for each of the other intention numbers.
Returning to the flowchart of FIG. 11.
The intention number estimation unit 106 estimates the number of intentions based on the final score of each number of intentions calculated in step ST1102 (step ST1103). Specifically, among the numbers of intentions calculated for the estimation target, the intention number estimation unit 106 takes the number of intentions with the highest final score as the estimated number of intentions for the estimation target.
Here, the intention number estimation unit 106 estimates the number of intentions to be "1".
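Steps ST1101 to ST1103 can be condensed into the following sketch, assuming the intention number estimation model is held as a mapping from feature string to per-intention-count scores. The 0.2 entry mirrors the illustrative value from FIG. 12; the remaining score values are made up for the example.

```python
from math import prod

# Minimal sketch of intention-number estimation (steps ST1101-ST1103),
# under the assumption that the model maps each dependency feature to a
# {intention_count: score} dictionary.

def estimate_intention_count(features, model, counts=(1, 2, 3)):
    # ST1101: look up each feature's score for every intention count.
    # ST1102: final score = product of those feature scores.
    final = {n: prod(model[f][n] for f in features) for n in counts}
    # ST1103: the intention count with the highest final score wins.
    return max(final, key=final.get), final

model = {"action_target_1": {1: 0.2, 2: 0.05, 3: 0.01}}  # 0.2 as in FIG. 12
best, final = estimate_intention_count(["action_target_1"], model)
```

With a single feature the final score equals that feature's score (here 0.2 for count 1), matching the FIG. 14 example, and `best` is 1.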
Returning to the flowchart of FIG. 10.
As a result of estimating the number of intentions in step ST1004, the intention number estimation unit 106 determines whether the number of intentions is larger than 1 (step ST1005).
If the estimated number of intentions is larger than 1 in step ST1005 ("YES" in step ST1005), the process proceeds to steps ST1010 to ST1014. The details of the processing from step ST1010 onward when the estimated number of intentions is larger than 1 will be described later with a specific example.
If the estimated number of intentions is 1 or less in step ST1005 ("NO" in step ST1005), the process proceeds to step ST1006.
For example, in the case of U1 in FIG. 9, the number of intentions estimated by the intention number estimation unit 106 is "1", so the process proceeds to step ST1006.
In step ST1006, the intention number estimation unit 106 outputs to the single intention estimation unit 108 the character string that is the morphological analysis result produced by the morphological analysis unit 103 in step ST1002. Using the single intention estimation model (see FIG. 3) stored in the single intention estimation model storage unit 107, the single intention estimation unit 108 then estimates the user's intention for the character string that is the morphological analysis result, that is, for the single-intention utterance sentence (step ST1006). For example, when the character string is "I want to go to ○○.", it estimates "destination setting [facility = ○○]" as the user's intention. Specifically, using the single intention estimation model, the single intention estimation unit 108 estimates, as the user's intention, the intention whose score for the morphological analysis result of the character string produced by the morphological analysis unit 103 is the highest.
The single intention estimation unit 108 outputs this intention estimation result to the command execution unit 112 as the single intention estimation result.
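The highest-score selection just described can be sketched as follows. The per-morpheme weights stand in for the contents of the single intention estimation model of FIG. 3, which is not reproduced here; both the weights and the intention labels are illustrative assumptions.

```python
# Hedged sketch of single-intention estimation: score every candidate
# intention against the morphemes of the utterance and return the one
# with the highest score. Model contents are illustrative only.

def estimate_single_intention(morphemes, model):
    """model: intention label -> {morpheme: weight}."""
    scores = {
        intention: sum(weights.get(m, 0.0) for m in morphemes)
        for intention, weights in model.items()
    }
    return max(scores, key=scores.get)

model = {
    "destination_setting": {"行き": 1.0, "たい": 0.5},
    "nearby_search": {"探し": 1.0},
}
# "○○へ行きたい" -> morphemes ○○ / へ / 行き / たい
print(estimate_single_intention(["○○", "へ", "行き", "たい"], model))
```

For the U1 morphemes, only the "destination_setting" weights fire, so that intention is selected, mirroring the "destination setting [facility = ○○]" example in the text.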
The command execution unit 112 causes the command processing unit of the navigation device to execute the command corresponding to the single intention estimation result output from the single intention estimation unit 108 in step ST1006 (step ST1007). For example, the command execution unit 112 causes the command processing unit of the navigation device to execute the operation of setting the facility ○○ as the destination.
The command execution unit 112 also outputs execution operation information, indicating the contents of the command executed in step ST1007, to the response generation unit 113.
Based on the execution operation information output from the command execution unit 112 in step ST1007, the response generation unit 113 generates response data corresponding to the command that the command execution unit 112 caused the command processing unit to execute (step ST1008). The response generation unit 113 outputs the generated response data to the notification control unit 114.
The notification control unit 114 outputs voice based on the response data output from the response generation unit 113 in step ST1008 from, for example, a speaker included in the navigation device (step ST1009). As a result, as shown by "S2" in FIG. 9, a voice such as "XX has been set as the destination." is output, and the user can be notified of the executed command.
Next, as indicated by "U2" in FIG. 9, assume that the user utters "Also stop by △△, and select the expressway.". The operation of the intention estimation apparatus 1 in this case will be described with reference to FIG. 10.
When the user utters as indicated by "U2", the voice receiving unit 101 receives the voice of the utterance, and the voice recognition unit 102 performs voice recognition processing on the received voice (step ST1001) and converts it into a character string. The voice recognition unit 102 outputs the converted character string to the morphological analysis unit 103 and the intention number estimation unit 106.
The morphological analysis unit 103 performs morphological analysis processing on the character string output from the voice recognition unit 102 (step ST1002). For example, the morphological analysis unit 103 obtains the morphemes "△△", "mo", "stop by", "te", "expressway", "wo", "select", and "te", and outputs information on these morphemes to the dependency analysis unit 104 as a morphological analysis result.
Next, the dependency analysis unit 104 performs dependency analysis processing on the morphological analysis result output from the morphological analysis unit 103 (step ST1003). Here, "△△" is the target of the action "stop by", "expressway" is the target of the action "select", and the actions "stop by" and "select" are in a parallel relationship. The dependency analysis unit 104 therefore outputs the analysis results "operation target_2" and "parallel relationship_1" to the intention number estimation unit 106 as dependency information.
The intention number estimation unit 106 uses the acquired dependency information "operation target_2" and "parallel relationship_1" as feature values and estimates the number of intentions using the intention number estimation model stored in the intention number estimation model storage unit 105 (step ST1004).
The specific operation of step ST1004 is as described above in detail with reference to FIG. 11. First, as in the case of "U1", the intention number estimation unit 106 collates the dependency information output from the dependency analysis unit 104 against the intention number estimation model and acquires the score of each piece of dependency information for each number of intentions (see step ST1101 in FIG. 11).
Subsequently, the intention number estimation unit 106 calculates a final score for each candidate number of intentions using the calculation formula shown in FIG. 13 (see step ST1102 in FIG. 11).
FIG. 15 is a diagram showing an example of the final score of each number of intentions calculated by the intention number estimation unit 106 in the first embodiment.
The intention number estimation unit 106 calculates the final scores shown in FIG. 15 for the user's utterance "U2" using the calculation formula shown in FIG. 13. Here, for the number of intentions "1", the score of the feature value "operation target_2" is 0.01 and the score of "parallel relationship_1" is 0.01. As a result, the intention number estimation unit 106 calculates the final score S of the number of intentions "1" for the utterance "U2" as 1e-4 (= 0.0001). Similarly, the intention number estimation unit 106 calculates a final score for each of the other candidate numbers of intentions for the utterance "U2".
The intention number estimation unit 106 estimates the number of intentions based on the calculated final score of each number of intentions (see step ST1103 in FIG. 11). Specifically, the intention number estimation unit 106 estimates the number of intentions "2", which has the highest final score among the candidate numbers of intentions, as the number of intentions of the estimation target.
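The two steps above can be sketched as follows, taking the final score of a candidate number of intentions as the product of its per-feature scores, which is consistent with the example values in the text (0.01 × 0.01 = 1e-4). The model values below are illustrative placeholders, not the actual contents of FIG. 12 or FIG. 15:

```python
# Hypothetical intention number estimation model:
# number of intentions -> dependency feature -> score.
INTENT_COUNT_MODEL = {
    1: {"operation target_2": 0.01, "parallel relationship_1": 0.01},
    2: {"operation target_2": 0.40, "parallel relationship_1": 0.30},
    3: {"operation target_2": 0.05, "parallel relationship_1": 0.02},
}

def final_score(n, features):
    """Final score of intent count n: product of the per-feature scores."""
    s = 1.0
    for f in features:
        s *= INTENT_COUNT_MODEL[n].get(f, 1e-6)  # small back-off for unseen features
    return s

def estimate_intention_count(features):
    """Pick the candidate number of intentions with the highest final score."""
    return max(INTENT_COUNT_MODEL, key=lambda n: final_score(n, features))

features = ["operation target_2", "parallel relationship_1"]
print(final_score(1, features))            # ≈ 1e-4 (0.01 × 0.01)
print(estimate_intention_count(features))  # -> 2
```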
Returning to the flowchart of FIG. 10.
As a result of estimating the number of intentions in step ST1004, the intention number estimation unit 106 determines whether the number of intentions is larger than 1 (step ST1005).
If the estimated number of intentions is larger than 1 in step ST1005 ("YES" in step ST1005), the process proceeds to step ST1010.
Here, the estimated number of intentions is "2", which is larger than 1 ("YES" in step ST1005), so the process proceeds to step ST1010.
In step ST1010, the intention number estimation unit 106 outputs the character string that is the morphological analysis result obtained by the morphological analysis unit 103 in step ST1002 to the composite intention estimation unit 110. The composite intention estimation unit 110 then uses the composite intention estimation model (see FIG. 4) stored in the composite intention estimation model storage unit 109 to estimate the user's intentions for the character string of the morphological analysis result, that is, for the multi-intention utterance sentence (step ST1010).
Here, FIG. 16 shows an example of the determination results of the user's intentions obtained as the estimation result by the composite intention estimation unit 110 in the first embodiment.
In FIG. 16, for ease of explanation, it is assumed that the composite intention estimation model stored in the composite intention estimation model storage unit 109 consists of three determination intention estimation models: one for the intention "waypoint addition [facility = △△]", one for the intention "route change [expressway priority]", and one for the intention "destination setting [facility = △△]". That is, the composite intention estimation unit 110 determines whether the character string of the morphological analysis result by the morphological analysis unit 103 corresponds to each of these three intentions. When the intention estimation score for an intention determined using one of the three determination intention estimation models exceeds 0.5, the composite intention estimation unit 110 determines that the intention is an applicable intention.
Note that the intention estimation score is a probability value calculated from the sum of the scores of the individual morphemes. Accordingly, the intention estimation scores in each determination intention estimation model sum to 1.
In FIG. 16, FIG. 16A shows the determination result of the determination intention estimation model for the intention "waypoint addition [facility = △△]". The composite intention estimation unit 110 obtains 0.75 as the intention estimation score of the intention "waypoint addition [facility = △△]". Since this intention estimation score exceeds 0.5, the composite intention estimation unit 110 determines that the intention "waypoint addition [facility = △△]" is an applicable intention of the character string "U2".
FIG. 16B shows the determination result of the determination intention estimation model for the intention "route change [expressway priority]". Since the intention estimation score is 0.7 and exceeds 0.5 (see FIG. 16B), the composite intention estimation unit 110 determines that the intention "route change [expressway priority]" is also an applicable intention of the character string "U2".
FIG. 16C shows the determination result of the determination intention estimation model for the intention "destination setting [facility = △△]". Since the intention estimation score of the intention "destination setting [facility = △△]" is 0.5 or less, the composite intention estimation unit 110 determines that the applicable intention of the character string "U2" is not "destination setting [facility = △△]" but "other intention".
The composite intention estimation unit 110 outputs the applicable intentions obtained by the three determination intention estimation models shown in FIGS. 16A to 16C, namely "waypoint addition [facility = △△]", "route change [expressway priority]", and "other intention", to the estimation result integration unit 111 as the intention estimation result.
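A minimal sketch of this per-intention determination step, assuming each determination intention estimation model reduces to a single probability score per candidate intention. The first two scores mirror the FIG. 16 example; the third (0.45) is an assumed value, since the text only states that the score is 0.5 or less:

```python
THRESHOLD = 0.5

def determine_intentions(scores):
    """Per-intention determination: an intention whose estimation score
    exceeds 0.5 is applicable; otherwise that model reports 'other intention'."""
    results = []
    for intention, score in scores.items():
        results.append(intention if score > THRESHOLD else "other intention")
    return results

# Scores patterned on the FIG. 16 example.
scores = {
    "waypoint addition [facility=△△]": 0.75,
    "route change [expressway priority]": 0.70,
    "destination setting [facility=△△]": 0.45,  # assumed; text says only "0.5 or less"
}
print(determine_intentions(scores))
# -> ['waypoint addition [facility=△△]', 'route change [expressway priority]', 'other intention']
```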
The estimation result integration unit 111 integrates the applicable intentions by adding, to the integration result, the applicable intentions other than "other intention" among the plurality of applicable intentions output as the intention estimation result from the composite intention estimation unit 110 in step ST1010 (step ST1011).
As shown in FIG. 16A, the determination result of the determination intention estimation model for the intention "waypoint addition [facility = △△]" is the intention "waypoint addition [facility = △△]", so the estimation result integration unit 111 adds the intention "waypoint addition [facility = △△]" to the integration result. Likewise, the estimation result integration unit 111 adds the intention "route change [expressway priority]" to the integration result.
On the other hand, as shown in FIG. 16C, the determination result of the determination intention estimation model for the intention "destination setting [facility = △△]" is "other intention", so the estimation result integration unit 111 adds neither the intention "destination setting [facility = △△]" nor "other intention" to the integration result.
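The integration step above then amounts to filtering out the "other intention" entries (a sketch; the input list reuses the determination results of the FIG. 16 example):

```python
def integrate_intentions(determination_results):
    """Keep every applicable intention except 'other intention'."""
    return [i for i in determination_results if i != "other intention"]

results = ["waypoint addition [facility=△△]",
           "route change [expressway priority]",
           "other intention"]
print(integrate_intentions(results))
# -> ['waypoint addition [facility=△△]', 'route change [expressway priority]']
```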
FIG. 17 is a diagram showing an example of the integration result of the intentions integrated by the estimation result integration unit 111 in the first embodiment.
The estimation result integration unit 111 outputs the integration result of the estimated intentions to the command execution unit 112 as the composite intention estimation result.
The command execution unit 112 causes the command processing unit of the navigation device to execute the commands corresponding to the composite intention estimation result output from the estimation result integration unit 111 in step ST1011 (step ST1012). For example, the command execution unit 112 causes the command processing unit of the navigation device to execute an operation of adding the facility △△ as a waypoint, and an operation of changing the route to expressway priority.
The command execution unit 112 also outputs execution operation information indicating the content of the commands executed in step ST1012 to the response generation unit 113.
The response generation unit 113 generates response data corresponding to the commands that the command execution unit 112 caused the command processing unit to execute, based on the execution operation information output from the command execution unit 112 in step ST1012 (step ST1013). The response generation unit 113 outputs the generated response data to the notification control unit 114.
The notification control unit 114 outputs voice based on the response data output from the response generation unit 113 in step ST1013 from, for example, a speaker included in the navigation device (step ST1014). As a result, as shown by "S3" in FIG. 9, voices such as "△△ has been added as a waypoint." and "The route has been changed to expressway priority." are output, and the user can be notified of the executed commands.
As described above, according to the first embodiment, the intention estimation apparatus 1 is configured to include: the morphological analysis unit 103 that analyzes the morphemes included in an acquired character string; the intention number estimation unit 106 that estimates the number of intentions for the character string and, according to the estimated number of intentions, determines whether the character string is a single-intention character string (single-intention utterance) including only one intention or a multi-intention character string (multi-intention utterance) including a plurality of intentions; the single intention estimation unit 108 that, when the intention number estimation unit 106 determines that the character string is a single-intention character string, estimates the intention of the single-intention character string as a single intention based on the morphemes analyzed by the morphological analysis unit 103, using the single intention estimation model in which a degree of association with morphemes is associated with each intention; the composite intention estimation unit 110 that, when the intention number estimation unit 106 determines that the character string is a multi-intention character string, estimates a plurality of intentions for the multi-intention character string based on the morphemes analyzed by the morphological analysis unit 103, using the composite intention estimation model in which a degree of association with morphemes is associated with each of a plurality of intentions; and the estimation result integration unit 111 that integrates the plurality of intentions estimated by the composite intention estimation unit 110 as a composite intention. With this configuration, even when the acquired character string may be either a single-intention character string or a multi-intention character string, the intention can be estimated with high accuracy.
Embodiment 2.
In Embodiment 1, when it is estimated from the user's utterance that the user has two or more intentions, the estimation result integration unit 111 integrates the composite intention estimation results estimated by the composite intention estimation unit 110, and the command execution unit 112 causes the navigation device to execute the commands corresponding to the integrated composite intention estimation result.
In this Embodiment 2, an embodiment in which an upper limit is further set on the number of intentions in the composite intention estimation result estimated by the composite intention estimation unit 110 will be described.
Hereinafter, Embodiment 2 of the present invention will be described with reference to the drawings.
FIG. 18 is a diagram showing a configuration example of an intention estimation apparatus 1B according to Embodiment 2.
The intention estimation apparatus 1B of Embodiment 2 differs from the intention estimation apparatus 1 described with reference to FIG. 1 in Embodiment 1 in that it includes an estimation result selection unit 117. The other components of the intention estimation apparatus 1B are the same as those of the intention estimation apparatus 1 described with reference to FIG. 1 in Embodiment 1, so the same components are denoted by the same reference numerals as in FIG. 1 and redundant description is omitted.
In Embodiment 2, the estimation result integration unit 111 outputs the composite intention estimation result, which is the integration result of the estimated intentions, to the estimation result selection unit 117. At this time, the estimation result integration unit 111 also includes the intention estimation scores in the composite intention estimation result output to the estimation result selection unit 117.
Also, in Embodiment 2, the intention number estimation unit 106 outputs information on the estimated number of intentions to the estimation result selection unit 117.
The estimation result selection unit 117 selects, from the composite intention estimation result output from the estimation result integration unit 111, the intentions to be output as the estimation result in descending order of intention estimation score, using the number of intentions output from the intention number estimation unit 106 as the upper limit on the number of output intentions. A specific method of selecting the estimated intentions will be described later.
The operation of the intention estimation apparatus 1B according to Embodiment 2 will be described.
FIG. 19 is a diagram showing an example of a dialogue performed between the user and the navigation device in Embodiment 2.
FIG. 20 is a flowchart for explaining the operation of the intention estimation apparatus 1B in Embodiment 2.
First, as shown in FIG. 19, the navigation device outputs a voice saying "Please speak after the beep." from, for example, a speaker included in the navigation device (S01). Specifically, a voice control unit (not shown) of the intention estimation apparatus 1B causes the navigation device to output the voice "Please speak after the beep.".
When the navigation device outputs the voice "Please speak after the beep.", the user utters "I don't need to stop by ○○; is there a convenience store nearby?" in response (U01). As shown in FIG. 19, the voice that the navigation device outputs in response to an instruction from the intention estimation apparatus 1B is denoted by "S", and an utterance from the user is denoted by "U".
The following description proceeds along the flowchart of FIG. 20. The specific operations of steps ST2001 to ST2011 and steps ST2013 to ST2015 in FIG. 20 are the same as the specific operations of steps ST1001 to ST1014 in FIG. 10 described in Embodiment 1.
First, the voice receiving unit 101 receives the voice of the user's utterance, the voice recognition unit 102 performs voice recognition processing on the received voice and converts it into a character string, and the morphological analysis unit 103 performs morphological analysis processing on the character string (steps ST2001 and ST2002). For example, the morphological analysis unit 103 obtains the morphemes "○○", "wa", "stop by", "not", "te", "need", "nearby", "ni", "convenience store", and "is there", and outputs information on these morphemes to the dependency analysis unit 104 and the intention number estimation unit 106 as a morphological analysis result.
Next, the dependency analysis unit 104 performs dependency analysis processing on the character string (step ST2003). For example, "○○" is the target of the action "stop by", "convenience store" is the target of the action "is there", and the actions "need (not)" and "is there" are in a parallel relationship, so the dependency analysis unit 104 outputs the analysis results "operation target_2" and "parallel relationship_1" to the intention number estimation unit 106 as dependency information.
Then, using the dependency information output from the dependency analysis unit 104, the intention number estimation unit 106 estimates the number of intentions (step ST2004). Here, the number of intentions estimated by the intention number estimation unit 106 is "2" (see step ST1104 in FIG. 11 described in Embodiment 1), and the estimated number of intentions is larger than "1" ("YES" in step ST2005), so the process proceeds to step ST2010 and the subsequent steps. The steps so far are the same as steps ST1001 to ST1005 in FIG. 10 described in Embodiment 1.
In step ST2010, the intention number estimation unit 106 outputs the character string that is the morphological analysis result obtained by the morphological analysis unit 103 to the composite intention estimation unit 110. The composite intention estimation unit 110 then estimates the user's intentions for the multi-intention utterance sentence.
Here, FIG. 21 shows an example of the determination results of the user's intentions determined by the composite intention estimation unit 110 in Embodiment 2.
In FIG. 21, for ease of explanation, it is assumed that the composite intention estimation model stored in the composite intention estimation model storage unit 109 consists of three determination intention estimation models: one for the intention "waypoint deletion [facility = ○○]", one for the intention "nearby search [facility type = convenience store]", and one for the intention "route deletion". As in Embodiment 1, when the intention estimation score for an intention determined using one of the three determination intention estimation models exceeds 0.5, the composite intention estimation unit 110 determines that the intention is an applicable intention.
In FIG. 21, FIG. 21A shows the determination result of the determination intention estimation model for the intention "waypoint deletion [facility = ○○]". The composite intention estimation unit 110 obtains 0.65 as the intention estimation score of the intention "waypoint deletion [facility = ○○]". Since this intention estimation score exceeds 0.5, the composite intention estimation unit 110 determines that the intention "waypoint deletion [facility = ○○]" is an applicable intention of the character string "U01".
FIG. 21B shows the determination result of the determination intention estimation model for the intention "nearby search [facility type = convenience store]", and FIG. 21C shows the determination result of the determination intention estimation model for the intention "route deletion". Since the intention estimation score is 0.7 and exceeds 0.5 (see FIG. 21B), the composite intention estimation unit 110 determines that the intention "nearby search [facility type = convenience store]" is also an applicable intention of the character string "U01". Since the intention estimation score is 0.55 and also exceeds 0.5 (see FIG. 21C), the composite intention estimation unit 110 determines that "route deletion" is likewise an applicable intention of the character string "U01".
The composite intention estimation unit 110 outputs the applicable intentions obtained by the three determination intention estimation models shown in FIGS. 21A to 21C, namely "waypoint deletion [facility = ○○]", "nearby search [facility type = convenience store]", and "route deletion", to the estimation result integration unit 111.
The estimation result integration unit 111 integrates the applicable intentions by adding, to the integration result, the applicable intentions other than "other intention" among the plurality of applicable intentions output as the intention estimation result from the composite intention estimation unit 110 in step ST2010 (step ST2011).
As shown in FIG. 21A, the determination result of the determination intention estimation model for the intention "waypoint deletion [facility = ○○]" is the intention "waypoint deletion [facility = ○○]", so the estimation result integration unit 111 adds the intention "waypoint deletion [facility = ○○]" to the integration result. Also, as shown in FIGS. 21B and 21C, the determination result of the determination intention estimation model for the intention "nearby search [facility type = convenience store]" is "nearby search [facility type = convenience store]" and the determination result of the determination intention estimation model for the intention "route deletion" is "route deletion", so the estimation result integration unit 111 similarly adds "nearby search [facility type = convenience store]" and "route deletion" to the integration result. In Embodiment 2, the estimation result integration unit 111 also adds the intention estimation scores to the integration result.
FIG. 22 is a diagram showing an example of the integration result of the intentions integrated by the estimation result integration unit 111 in Embodiment 2.
The estimation result integration unit 111 outputs the integration result of the estimated intentions to the estimation result selection unit 117 as the composite intention estimation result.
The estimation result selection unit 117 selects, from the composite intention estimation result output from the estimation result integration unit 111 in step ST2011, the intentions to be output as the estimation result in descending order of intention estimation score, using the number of intentions output from the intention number estimation unit 106 in step ST2004 as the upper limit on the number of output intentions, and sets the selected estimated intentions as the final intention estimation result (step ST2012).
Specifically, the estimation result selection unit 117 uses the number of intentions output from the intention number estimation unit 106 as the upper limit on the number of output intentions and, using the intention estimation score as the criterion, selects only the estimated intentions with the highest intention estimation scores.
Here, in step ST2004, the intention number estimation unit 106 estimated the number of intentions to be "2". The estimation result selection unit 117 therefore limits the number of final intention estimation results to "2" or fewer. The integration result produced by the estimation result integration unit 111 contains three intentions: "waypoint deletion [facility = ○○]", "vicinity search [facility type = convenience store]", and "route deletion".
As shown in FIG. 22, the intention estimation scores are "0.65" for "waypoint deletion [facility = ○○]", "0.7" for "vicinity search [facility type = convenience store]", and "0.55" for "route deletion".
Since the estimation result selection unit 117 uses the number of intentions output from the intention number estimation unit 106 as the output upper limit, selects the top two intentions of the compound intention estimation result by intention estimation score, and outputs them as the final intention estimation result, it selects "waypoint deletion [facility = ○○]" and "vicinity search [facility type = convenience store]" as the final intention estimation result.
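The selection step described above can be sketched as a simple score-capped ranking. The following is an illustrative sketch only, not code from the patent; the function and variable names are assumptions, and the intention labels and scores are taken from the example in FIG. 22.

```python
# Hypothetical sketch of the estimation result selection unit 117:
# keep at most `intention_count` intentions (the upper limit estimated
# by the intention number estimation unit 106), highest score first.

def select_final_intentions(compound_result, intention_count):
    """Rank (intention, score) pairs by score and keep the top ones."""
    ranked = sorted(compound_result, key=lambda item: item[1], reverse=True)
    return ranked[:intention_count]

compound_result = [
    ("waypoint deletion [facility = ○○]", 0.65),
    ("vicinity search [facility type = convenience store]", 0.7),
    ("route deletion", 0.55),
]

final = select_final_intentions(compound_result, intention_count=2)
# "route deletion" (score 0.55) falls below the cap and is dropped;
# the two remaining intentions form the final intention estimation result.
```

With the cap of two, the lowest-scoring intention is discarded exactly as in the worked example.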
In this way, in the intention estimation apparatus 1B, the estimation result selection unit 117 deletes "route deletion" from the compound intention estimation result, suppressing the output of superfluous intention estimation results and further improving the accuracy of intention estimation compared with the case where no upper limit is placed on the compound intention estimation result. As a result, a more appropriate final intention estimation result can be obtained.
FIG. 23 is a diagram illustrating an example of the content of the final intention estimation result generated by the estimation result selection unit 117 in the second embodiment.
The estimation result selection unit 117 outputs the final intention estimation result to the command execution unit 112.
The command execution unit 112 causes the command processing unit of the navigation device to execute the commands corresponding to the final intention estimation result output from the estimation result selection unit 117 in step ST2012 (step ST2013). For example, the command execution unit 112 causes the command processing unit of the navigation device to execute a command for deleting the waypoint and a command for searching for nearby convenience stores.
The response generation unit 113 then generates response data corresponding to the commands that the command execution unit 112 caused the command processing unit to execute (step ST2014), and the notification control unit 114 outputs the response data generated by the response generation unit 113 from a speaker included in the navigation device (step ST2015). As a result, as shown in "S02" of FIG. 19, voice responses such as "Waypoint ○○ has been deleted." and "Searching for nearby convenience stores. Please select from the list." are output, so that the user can be notified of the executed commands. The specific operation is the same as steps ST1012 to ST1014 of FIG. 10 described in the first embodiment.
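How the command execution unit 112 and the response generation unit 113 might act on the final intention estimation result can be sketched as a lookup from intentions to commands and response templates. This is an assumption for illustration only; the patent does not specify concrete command names, APIs, or response wording beyond the example utterances.

```python
# Illustrative stand-ins for the command processing unit's commands and
# the response generation unit's templates (all names are hypothetical).
COMMANDS = {
    "waypoint deletion": "delete_waypoint",
    "vicinity search": "search_nearby",
}

RESPONSES = {
    "waypoint deletion": "Waypoint {facility} has been deleted.",
    "vicinity search": "Searching for nearby {facility_type}s. Please select from the list.",
}

def execute_and_respond(final_intentions):
    """For each (intention, slots) pair, look up the command to run and
    fill in the matching response template."""
    responses = []
    for intent, slots in final_intentions:
        command = COMMANDS[intent]  # would be handed to the command processing unit
        responses.append(RESPONSES[intent].format(**slots))
    return responses

messages = execute_and_respond([
    ("waypoint deletion", {"facility": "○○"}),
    ("vicinity search", {"facility_type": "convenience store"}),
])
```

The resulting messages correspond to the spoken notifications of "S02" in FIG. 19.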
As described above, according to the second embodiment, in addition to the configuration of the intention estimation apparatus 1 according to the first embodiment, the apparatus includes the estimation result selection unit 117, which, with the number of intentions estimated by the intention number estimation unit 106 as an upper limit, selects from the plurality of intentions integrated by the estimation result integration unit 111 those with the highest intention estimation scores calculated when the intention number estimation unit 106 estimated the number of intentions, and sets them as the compound intention. Thus, by using the intention-count result obtained by the intention number estimation unit 106 to set an output upper limit on the compound intention estimation result obtained by the estimation result integration unit 111, the output of inappropriate intention estimation results can be suppressed, and the accuracy of the final integration result is further improved.
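The overall flow summarized above can be sketched as follows. This is a minimal sketch under stated assumptions: the model functions are stubs standing in for the learned single-intention and compound intention estimation models, and all names, morphemes, and scores are illustrative rather than taken from the patent.

```python
# Sketch of the flow: the estimated number of intentions decides whether
# the single-intention or compound intention estimation path is taken,
# and the selection step caps the integrated result at that number.

def estimate_intentions(morphemes, estimate_count, single_model, compound_model):
    count = estimate_count(morphemes)
    if count <= 1:
        # single-intention character string: one intention, used directly
        return [single_model(morphemes)]
    # multiple-intention character string: estimate, integrate, then cap
    integrated = compound_model(morphemes)
    ranked = sorted(integrated, key=lambda item: item[1], reverse=True)
    return ranked[:count]

result = estimate_intentions(
    ["waypoint", "delete", "convenience store", "search"],
    estimate_count=lambda m: 2,  # stub for the intention number estimation unit
    single_model=lambda m: ("waypoint deletion", 1.0),
    compound_model=lambda m: [
        ("waypoint deletion", 0.65),
        ("vicinity search", 0.7),
        ("route deletion", 0.55),
    ],
)
# result holds the two highest-scoring intentions
```

The count-dependent branch mirrors the single-intention/compound-intention split of the first embodiment, with the cap of the second embodiment applied on the compound path.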
Note that some of the functions of the intention estimation apparatuses 1 and 1B described so far may be executed by other devices. For example, some of the functions may be executed by an externally provided server, or by a mobile terminal such as a smartphone or a tablet.
In the first and second embodiments described above, the intention estimation apparatuses 1 and 1B estimate the user's intention on the basis of speech uttered by the user, but the information from which the user's intention is estimated is not limited to this. For example, the intention estimation apparatuses 1 and 1B may accept a character string entered by the user with an input device such as a keyboard and estimate the user's intention on the basis of that character string.
Within the scope of the present invention, the embodiments may be freely combined, any component of each embodiment may be modified, and any component of each embodiment may be omitted.
Since the intention estimation apparatus according to the present invention is configured to improve the accuracy of estimating the intention of a character string, it can be applied to an intention estimation apparatus or the like that recognizes an input character string and estimates the user's intention.
1, 1B intention estimation apparatus; 2 intention number estimation model generation apparatus; 101 speech reception unit; 102 speech recognition unit; 103 morphological analysis unit; 104 dependency analysis unit; 105 intention number estimation model storage unit; 106 intention number estimation unit; 107 single-intention estimation model storage unit; 108 single-intention estimation unit; 109 compound intention estimation model storage unit; 110 compound intention estimation unit; 111 estimation result integration unit; 112 command execution unit; 113 response generation unit; 114 notification control unit; 115 learning data storage unit; 116 intention number estimation model generation unit; 117 estimation result selection unit; 501 processing circuit; 502 HDD; 503 input interface device; 504 output interface device; 505 memory; 506 CPU.

Claims (8)

1. An intention estimation apparatus comprising:
   a morphological analysis unit that analyzes, on the basis of an acquired character string, the morphemes included in the character string;
   an intention number estimation unit that estimates the number of intentions for the character string and determines, in accordance with the estimated number of intentions, whether the character string is a single-intention character string including only one intention or a multiple-intention character string including a plurality of intentions;
   a single-intention estimation unit that, when the intention number estimation unit determines that the character string is a single-intention character string, estimates the intention of the single-intention character string as a single intention on the basis of the morphemes analyzed by the morphological analysis unit, using a single-intention estimation model in which each intention is associated with degrees of relevance to morphemes;
   a compound intention estimation unit that, when the intention number estimation unit determines that the character string is a multiple-intention character string, estimates a plurality of intentions for the multiple-intention character string on the basis of the morphemes analyzed by the morphological analysis unit, using a compound intention estimation model in which each of a plurality of intentions is associated with degrees of relevance to morphemes; and
   an estimation result integration unit that integrates the plurality of intentions estimated by the compound intention estimation unit as a compound intention.
2. The intention estimation apparatus according to claim 1, further comprising a dependency analysis unit that analyzes the relationships between the morphemes included in the character string on the basis of the morphemes analyzed by the morphological analysis unit and generates dependency information,
   wherein the intention number estimation unit estimates the number of intentions for the character string on the basis of the dependency information generated by the dependency analysis unit.
3. The intention estimation apparatus according to claim 2, wherein the intention number estimation unit estimates the number of intentions for the character string using an intention number estimation model that takes the dependency information as a feature and has learned the correspondence between dependency information and the number of intentions.
4. The intention estimation apparatus according to claim 1, further comprising an estimation result selection unit that, with the number of intentions estimated by the intention number estimation unit as an upper limit, selects, from the plurality of intentions integrated by the estimation result integration unit, the intentions with the highest intention estimation scores calculated when the intention number estimation unit estimated the number of intentions, and sets them as the compound intention.
5. An intention estimation method comprising:
   a step in which a morphological analysis unit analyzes, on the basis of an acquired character string, the morphemes included in the character string;
   a step in which an intention number estimation unit estimates the number of intentions for the character string and determines, in accordance with the estimated number of intentions, whether the character string is a single-intention character string including only one intention or a multiple-intention character string including a plurality of intentions;
   a step in which a single-intention estimation unit, when the intention number estimation unit determines that the character string is a single-intention character string, estimates the intention of the single-intention character string as a single intention on the basis of the morphemes analyzed by the morphological analysis unit, using a single-intention estimation model in which each intention is associated with degrees of relevance to morphemes;
   a step in which a compound intention estimation unit, when the intention number estimation unit determines that the character string is a multiple-intention character string, estimates a plurality of intentions for the multiple-intention character string on the basis of the morphemes analyzed by the morphological analysis unit, using a compound intention estimation model in which each of a plurality of intentions is associated with degrees of relevance to morphemes; and
   a step in which an estimation result integration unit integrates the plurality of intentions estimated by the compound intention estimation unit as a compound intention.
6. The intention estimation method according to claim 5, further comprising a step in which a dependency analysis unit analyzes the relationships between the morphemes included in the character string on the basis of the morphemes analyzed by the morphological analysis unit and generates dependency information,
   wherein the intention number estimation unit performs a step of estimating the number of intentions for the character string on the basis of the dependency information generated by the dependency analysis unit.
7. The intention estimation method according to claim 6, wherein the intention number estimation unit performs a step of estimating the number of intentions for the character string using an intention number estimation model that takes the dependency information as a feature and has learned the correspondence between dependency information and the number of intentions.
8. The intention estimation method according to claim 5, further comprising a step in which an estimation result selection unit, with the number of intentions estimated by the intention number estimation unit as an upper limit, selects, from the plurality of intentions integrated by the estimation result integration unit, the intentions with the highest intention estimation scores calculated when the intention number estimation unit estimated the number of intentions, and sets them as the compound intention.
PCT/JP2017/022144 2017-06-15 2017-06-15 Intention inference device and intention inference method WO2018229937A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2019514140A JP6632764B2 (en) 2017-06-15 2017-06-15 Intention estimation device and intention estimation method
PCT/JP2017/022144 WO2018229937A1 (en) 2017-06-15 2017-06-15 Intention inference device and intention inference method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2017/022144 WO2018229937A1 (en) 2017-06-15 2017-06-15 Intention inference device and intention inference method

Publications (1)

Publication Number Publication Date
WO2018229937A1 true WO2018229937A1 (en) 2018-12-20

Family

ID=64659078

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/022144 WO2018229937A1 (en) 2017-06-15 2017-06-15 Intention inference device and intention inference method

Country Status (2)

Country Link
JP (1) JP6632764B2 (en)
WO (1) WO2018229937A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102490519B1 (en) * 2022-07-21 2023-01-19 주식회사 라피치 Automatic response system and method with privacy protection function to encrypt in response to sender's text data
WO2023208504A1 (en) 2022-04-29 2023-11-02 Siemens Mobility GmbH Vehicle having a fuel cell system and a treatment device for treating the process water

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000200273A (en) * 1998-11-04 2000-07-18 Atr Interpreting Telecommunications Res Lab Speaking intention recognizing device
WO2016120904A1 (en) * 2015-01-28 2016-08-04 三菱電機株式会社 Intent deduction device and intent deduction method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MASAYUKI SHIRAKI ET AL.: "Investigation of statistical techniques for intention understanding in spoken dialog", IPSJ SIG NOTES, vol. 2004, no. 15, 7 February 2004 (2004-02-07), pages 69 - 74 *


Also Published As

Publication number Publication date
JP6632764B2 (en) 2020-01-22
JPWO2018229937A1 (en) 2019-07-11


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17913941

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019514140

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17913941

Country of ref document: EP

Kind code of ref document: A1