WO2021131737A1 - Information processing device, information processing method, and information processing program - Google Patents

Information processing device, information processing method, and information processing program

Info

Publication number
WO2021131737A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
user
information processing
utterance
thing
Application number
PCT/JP2020/045993
Other languages
French (fr)
Japanese (ja)
Inventor
英樹 野間
直矢 村松
Original Assignee
ソニーグループ株式会社
Application filed by ソニーグループ株式会社
Publication of WO2021131737A1


Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06F ELECTRIC DIGITAL DATA PROCESSING
                • G06F 40/00 Handling natural language data
                    • G06F 40/20 Natural language analysis
                        • G06F 40/279 Recognition of textual entities
        • G10 MUSICAL INSTRUMENTS; ACOUSTICS
            • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
                • G10L 13/00 Speech synthesis; Text to speech systems
                    • G10L 13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
                • G10L 15/00 Speech recognition
                    • G10L 15/08 Speech classification or search
                        • G10L 15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
                    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • The present invention relates to an information processing device, an information processing method, and an information processing program.
  • Patent Document 1 discloses a technique of calculating an expected value of the user's attention to output information and controlling information output based on that expected value.
  • An information processing apparatus according to one form of the present disclosure includes: an acquisition unit that acquires utterance information indicating an utterance by a user; and a providing unit that provides expression information to be expressed by a robot device, based on the utterance information acquired by the acquisition unit and on user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's emotion toward the specific thing.
  • 1. Embodiment
  •  1-1. Outline of information processing according to the embodiment
  •  1-2. Configuration of the information processing device according to the embodiment
  •  1-3. Information processing procedure according to the embodiment
  • 2. Effects of this disclosure
  • 3. Hardware configuration
  • FIG. 1 is a diagram showing an example of information processing according to the embodiment of the present disclosure.
  • The information processing shown in FIG. 1 is performed by the robot device 10 and the information processing device 100.
  • The robot device 10 can be any of various devices that perform autonomous operation based on environment recognition.
  • For example, the robot device 10 according to the present embodiment is an ellipsoidal agent-type robot device that travels autonomously on wheels.
  • The robot device 10 realizes various kinds of communication, including information presentation, by performing autonomous operations according to, for example, the user, the surroundings, and its own situation.
  • The robot device 10 may be a small robot having a size and weight that allow a user to lift it easily with one hand.
  • The information processing device 100 acquires utterance information indicating an utterance by the user. Based on the acquired utterance information and on user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's emotion toward that thing, the information processing device 100 provides expression information to be expressed by the robot device 10.
  • In the example shown in FIG. 1, the information processing device 100 acquires voice data of the user's utterance from the robot device 10. Subsequently, the information processing device 100 acquires the voice recognition result of the acquired voice data (for example, "I ate curry yesterday"). It then decomposes the voice recognition result into morphemes by morphological analysis, obtaining, for example, "yesterday: noun, tense", "curry: noun, food", "ate: verb", and "yo: particle".
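To make this step concrete, here is a minimal sketch of the morpheme decomposition, assuming a toy whitespace tokenizer and a hand-written lexicon; a real system would use an actual morphological analyzer (for Japanese, e.g. MeCab), and the output format shown is an illustration, not the patent's.

```python
from typing import List, Tuple

# Toy lexicon mirroring the example morphemes in the text.
LEXICON = {
    "yesterday": ("noun", "tense"),
    "curry": ("noun", "food"),
    "ate": ("verb", ""),
    "yo": ("particle", ""),
}

def analyze(utterance: str) -> List[Tuple[str, str, str]]:
    """Decompose a recognized utterance into (surface, part of speech, category)."""
    morphemes = []
    for surface in utterance.lower().rstrip(".!?").split():
        pos, category = LEXICON.get(surface, ("unknown", ""))
        morphemes.append((surface, pos, category))
    return morphemes

print(analyze("I ate curry yesterday"))
# [('i', 'unknown', ''), ('ate', 'verb', ''), ('curry', 'noun', 'food'),
#  ('yesterday', 'noun', 'tense')]
```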
  • Further, the information processing device 100 acquires the voice direction (for example, "45 degrees to the left"), the user's face identification result (for example, face ID "U1"), and image data from the robot device 10. It then matches the voice direction (45 degrees to the left) against the position of the user in the image data (the user with face ID "U1" is 45 degrees to the left) to identify the user who is the speaker of the utterance.
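The matching of sound-source direction against detected face positions can be pictured as below; the angle convention, tolerance value, and data shapes are assumptions for illustration, not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DetectedFace:
    face_id: str       # e.g. "U1", from the robot's face identification
    angle_deg: float   # direction of the face as seen from the robot

def identify_speaker(voice_angle_deg: float,
                     faces: List[DetectedFace],
                     tolerance_deg: float = 10.0) -> Optional[str]:
    """Match the sound-source direction against face positions in the image."""
    if not faces:
        return None
    best = min(faces, key=lambda f: abs(f.angle_deg - voice_angle_deg))
    if abs(best.angle_deg - voice_angle_deg) <= tolerance_deg:
        return best.face_id
    return None

# Voice arrived from 45 degrees to the left (negative = left, an assumption);
# face "U1" was detected in roughly the same direction.
print(identify_speaker(-45.0, [DetectedFace("U1", -44.0)]))  # -> U1
```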
  • When the information processing device 100 has identified the speaker, it acquires that user's preference information. For example, as the preference information of user U1, it acquires the information that the food user U1 likes is "curry". The information processing device 100 then inputs the character string "curry", the preference target of user U1, into the sympathy model and acquires the character string "delicious" as the output data of the sympathy model.
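The patent describes the sympathy model only as a learned mapping from a preference string to an empathic word, so a lookup table stands in for the trained model in this sketch; the entries mirror the examples given in the text.

```python
# Stand-in for the trained sympathy model: preference target in, empathic word out.
EMPATHY_MODEL = {
    "curry": "delicious",
    "futsal": "fun",
    "coriander": "sorry",
}

def empathic_word(preference_target: str) -> str:
    # Fallback for unseen inputs is an assumption, not from the patent.
    return EMPATHY_MODEL.get(preference_target, "interesting")

print(empathic_word("curry"))  # -> delicious
```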
  • Subsequently, the information processing device 100 acquires an utterance sentence template.
  • For example, the information processing device 100 acquires the utterance sentence template "X is Y, isn't it?" (X is a character string indicating the user's preference target; Y is a character string output from the sympathy model).
  • Having acquired the utterance sentence template "X is Y, isn't it?", the information processing device 100 applies the character string "curry" (the preference target of user U1) to X and the character string "delicious" (the output of the sympathy model) to Y, generating the response sentence "Curry is delicious, isn't it?".
  • Further, since the utterance by user U1, "I ate curry yesterday", contains "curry", a food that user U1 likes, the information processing device 100 estimates that the emotion of user U1, who made the utterance, is a positive emotion.
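A hedged sketch of the two operations just described, template filling and the polarity rule (positive if the utterance mentions a liked thing, negative if a disliked one); the tokenization is a toy stand-in.

```python
from typing import Set

TEMPLATE = "{x} is {y}, isn't it?"

def fill_template(x: str, y: str) -> str:
    return TEMPLATE.format(x=x.capitalize(), y=y)

def estimate_polarity(utterance: str, liked: Set[str], disliked: Set[str]) -> str:
    words = set(utterance.lower().rstrip(".!?").split())
    if words & {w.lower() for w in liked}:
        return "positive"
    if words & {w.lower() for w in disliked}:
        return "negative"
    return "neutral"

print(fill_template("curry", "delicious"))
# -> Curry is delicious, isn't it?
print(estimate_polarity("I ate curry yesterday", {"curry"}, {"coriander"}))
# -> positive
```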
  • When the information processing device 100 has generated the response sentence, it provides the robot device 10 with the response sentence "Curry is delicious, isn't it?". When it has estimated the emotion of user U1, it also provides the robot device 10 with information on behavior based on the estimated emotion. For example, if the information processing device 100 estimates that the emotion of user U1 is positive, it provides the robot device 10 with information indicating a bright tone as the tone of the voice with which the robot device 10 outputs the response sentence, information indicating a smile as the facial expression of the robot device 10, and, if the robot device 10 moves while outputting the response sentence, information indicating a quick speed as its operating speed.
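The mapping from estimated emotion to robot behavior might look like the following; the parameter names and the neutral fallback are illustrative assumptions.

```python
from typing import Dict

BEHAVIOR: Dict[str, Dict[str, str]] = {
    "positive": {"voice_tone": "bright", "expression": "smile", "motion_speed": "quick"},
    "negative": {"voice_tone": "dark", "expression": "sad face", "motion_speed": "slow"},
}

def behavior_for(polarity: str) -> Dict[str, str]:
    return BEHAVIOR.get(polarity, {"voice_tone": "neutral",
                                   "expression": "neutral",
                                   "motion_speed": "normal"})

print(behavior_for("positive"))
# {'voice_tone': 'bright', 'expression': 'smile', 'motion_speed': 'quick'}
```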
  • FIG. 2 is a block diagram showing an example of a schematic configuration of the information processing apparatus according to the embodiment of the present disclosure.
  • In the example of FIG. 2, the information processing system 1 has a robot device 10 and an information processing device 100.
  • The information processing device 100 includes an acquisition unit 131 and a generation unit 132.
  • The acquisition unit 131 includes a voice recognizer, a speaker identifier, a morphological analyzer, and a preprocessing unit.
  • The voice recognizer acquires voice data from the robot device 10, recognizes the speech in the voice data, and outputs the voice recognition result to the preprocessing unit.
  • The speaker identifier acquires the voice direction, the face identification result, and image data from the robot device 10. With these, it identifies the speaker and outputs the speaker identification result to the preprocessing unit.
  • The preprocessing unit acquires the voice recognition result from the voice recognizer and outputs it to the morphological analyzer.
  • The morphological analyzer acquires the voice recognition result from the preprocessing unit, performs morphological analysis on it to decompose it into morphemes, and outputs the morphemes back to the preprocessing unit.
  • The preprocessing unit also acquires the speaker identification result from the speaker identifier. It then refers to the preference information database 121 and acquires the identified speaker's preference information from that database.
  • Having acquired the morphemes and the speaker's preference information, the preprocessing unit outputs them to the generation unit 132.
  • The generation unit 132 acquires the morphemes and the speaker's preference information from the preprocessing unit. Based on the acquired morphemes and preference information, it generates an utterance sentence and estimates the speaker's emotion.
  • FIG. 3 is a diagram showing a configuration example of the information processing device according to the embodiment of the present disclosure.
  • As shown in FIG. 3, the information processing device 100 according to the embodiment of the present disclosure includes a communication unit 110, a storage unit 120, and a control unit 130.
  • The communication unit 110 is realized by, for example, a NIC. It is connected to the network N by wire or wirelessly and transmits and receives information to and from, for example, the robot device 10.
  • The storage unit 120 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or by a storage device such as a hard disk or an optical disk.
  • For example, the storage unit 120 stores the information processing program according to the embodiment.
  • The storage unit 120 has a preference information database 121, a model information database 122, and a template database 123.
  • The preference information database 121 stores various information related to the user's preferences.
  • An example of the preference information database according to the embodiment will be described with reference to FIG. 4.
  • FIG. 4 is a diagram showing an example of a preference information database according to the embodiment of the present disclosure.
  • In the example shown in FIG. 4, the preference information database 121 has items such as "user ID", "name", "favorite food", "disliked food", "recent dissatisfaction", "things that make the user sad", "hobby", and "hometown".
  • "User ID" indicates identification information that identifies the user.
  • "Name" indicates the name of the user.
  • "Favorite food" indicates food that the user likes.
  • "Disliked food" indicates food that the user dislikes.
  • "Recent dissatisfaction" indicates something the user has recently been dissatisfied about.
  • "Things that make the user sad" indicates things that make the user sad.
  • "Hobby" indicates the user's hobby.
  • "Hometown" indicates the user's hometown.
  • In the first record of FIG. 4, the name of the user identified by the user ID "U1" (user U1) is "A".
  • The food that user U1 likes is "curry".
  • The food that user U1 dislikes is "coriander".
  • What user U1 has recently been dissatisfied about is the "tax increase".
  • What makes user U1 sad is a "birthday".
  • The hobby of user U1 is "futsal".
  • The hometown of user U1 is "Tokyo".
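One plausible representation of the FIG. 4 record for user U1 is sketched below; the field names are translations of the database items, not a schema published in the patent.

```python
from dataclasses import dataclass

@dataclass
class PreferenceRecord:
    user_id: str
    name: str
    favorite_food: str
    disliked_food: str
    recent_dissatisfaction: str
    saddening_thing: str
    hobby: str
    hometown: str

u1 = PreferenceRecord(
    user_id="U1", name="A",
    favorite_food="curry", disliked_food="coriander",
    recent_dissatisfaction="tax increase", saddening_thing="birthday",
    hobby="futsal", hometown="Tokyo",
)
print(u1.favorite_food)  # -> curry
```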
  • The model information database 122 stores various information related to the sympathy model. Specifically, it stores information about a learning model trained so that, when a character string indicating the user's preference information is input, it outputs a character string corresponding to that preference information. For example, the model information database 122 stores identification information that identifies the sympathy model in association with the model data of the sympathy model.
  • The template database 123 stores various information related to templates. Specifically, the template database 123 stores the utterance sentence template "X is Y, isn't it?" (X is a character string indicating the user's preference target; Y is a character string output from the sympathy model).
  • The control unit 130 is realized by a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or the like executing various programs (corresponding to an example of an information processing program) stored in a storage device inside the information processing device 100, with a RAM as a work area. The control unit 130 may also be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).
  • The control unit 130 has an acquisition unit 131, a generation unit 132, and a providing unit 133, and realizes or executes the information processing functions and operations described below.
  • The internal configuration of the control unit 130 is not limited to the configuration shown in FIG. 3 and may be any other configuration that performs the information processing described later.
  • The acquisition unit 131 acquires utterance information indicating an utterance by the user. Specifically, the acquisition unit 131 acquires voice data of the user's utterance from the robot device 10 and then acquires the voice recognition result of the acquired voice data (for example, "I ate curry yesterday"). It then decomposes the voice recognition result into morphemes by morphological analysis, obtaining, for example, "yesterday: noun, tense", "curry: noun, food", "ate: verb", and "yo: particle".
  • Further, the acquisition unit 131 acquires the voice direction (for example, "45 degrees to the left"), the user's face identification result (for example, face ID "U1"), and image data from the robot device 10. It then collates the voice direction (45 degrees to the left) with the position of the user in the image data (the user with face ID "U1" is 45 degrees to the left) to identify the user who is the speaker of the utterance.
  • The acquisition unit 131 also acquires the user's preference information. Specifically, when the acquisition unit 131 has identified a user, it refers to the preference information database 121 and acquires the preference information of the identified user. For example, as the preference information of user U1, the acquisition unit 131 acquires the information that the food user U1 likes is "curry". In this way, the acquisition unit 131 acquires user information about the user in which thing information indicating a specific thing (for example, "curry") is associated with emotion information indicating the user's emotion toward that thing (for example, "a food that the user likes").
  • The generation unit 132 generates a character string to be applied to the utterance sentence template. Specifically, when the user's preference information is acquired by the acquisition unit 131, the generation unit 132 inputs the acquired preference information into the sympathy model and generates, as the output data of the sympathy model, a character string to be applied to the "Y" part of the utterance sentence template "X is Y, isn't it?". For example, the generation unit 132 inputs the character string "curry", the preference target of user U1, into the sympathy model and generates the character string "delicious" as the output data of the sympathy model.
  • The generation unit 132 also acquires the utterance sentence template.
  • For example, the generation unit 132 acquires the utterance sentence template "X is Y, isn't it?" (X is a character string indicating the user's preference target; Y is a character string output from the sympathy model).
  • When the speaker's preference information is acquired by the acquisition unit 131, the generation unit 132 generates response information indicating a response to the utterance based on the speaker's preference information. For example, the generation unit 132 generates an utterance sentence to be output by the robot device based on the speaker's preference information.
  • Since the utterance by user U1, "I ate curry yesterday", contains "curry", a food that user U1 likes, the generation unit 132 estimates that the emotion of user U1, who made the utterance, is a positive emotion. In this way, the generation unit 132 estimates the speaker's emotion based on the utterance information acquired by the acquisition unit 131 and the speaker's preference information. For example, the generation unit 132 estimates that the speaker's emotion is positive when the utterance information acquired by the acquisition unit 131 includes the speaker's preference target.
  • Conversely, when the utterance by user U1, "Coriander was served at lunch", includes "coriander", a food that user U1 dislikes, the generation unit 132 estimates that the emotion of user U1, who made the utterance, is a negative emotion. The generation unit 132 then inputs the character string "coriander", an object that user U1 dislikes, into the sympathy model and generates the character string "sorry" as the output data of the sympathy model.
  • Similarly, when the utterance by user U1, "I'm going to see a futsal game next time", includes "futsal", the hobby of user U1, the generation unit 132 estimates that the emotion of user U1 is a positive emotion. The generation unit 132 inputs the character string "futsal", the hobby of user U1, into the sympathy model and generates the character string "fun" as the output data of the sympathy model.
  • The providing unit 133 provides the expression information to be expressed by the robot device based on the utterance information acquired by the acquisition unit 131 and on the user information about the user in which thing information indicating a specific thing is linked with emotion information indicating the user's emotion toward that thing. Specifically, the providing unit 133 provides expression information that is response information indicating a response to the utterance; more specifically, expression information that is the sentence of the utterance to be output by the robot device. For example, the providing unit 133 provides expression information showing sympathy based on the emotion information associated with the thing information included in the utterance information. The providing unit 133 provides the expression information generated by the generation unit 132. In the example shown in FIG. 1, the providing unit 133 provides the robot device 10 with the response sentence "Curry is delicious, isn't it?" generated by the generation unit 132.
  • When the utterance information includes a plurality of pieces of thing information, the providing unit 133 provides expression information showing sympathy based on the emotion information associated with at least one of them.
  • For example, suppose the acquisition unit 131 acquires the utterance "I ate curry while watching the rabbit at the zoo".
  • In that case, the providing unit 133 provides expression information associated with at least one of "curry" and "rabbit", such as "Curry, I like it!" or "Rabbits, I dislike them a little".
  • Alternatively, the providing unit 133 provides expression information showing sympathy based on the emotion information associated with the thing information included in the bunsetsu (phrase unit) that does not modify any other bunsetsu among the bunsetsu included in the utterance information, as sketched below.
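A hedged sketch of this clause-based rule: among several recognized things, prefer the one in the bunsetsu that modifies no other bunsetsu, i.e. the head of the dependency structure. The dependency annotations below are invented for illustration.

```python
from typing import List, Optional, Tuple

# Each bunsetsu: (text, index of the bunsetsu it modifies, None for the head).
Bunsetsu = Tuple[str, Optional[int]]

def head_thing(bunsetsu_list: List[Bunsetsu], things: List[str]) -> Optional[str]:
    for text, modifies in bunsetsu_list:
        if modifies is None:  # the head bunsetsu modifies nothing else
            for thing in things:
                if thing in text:
                    return thing
    return None

# "I ate curry while watching the rabbit at the zoo": the "watching the
# rabbit" phrase modifies "ate curry", so "ate curry" is the head.
clauses = [("at the zoo", 2), ("watching the rabbit", 2), ("ate curry", None)]
print(head_thing(clauses, ["rabbit", "curry"]))  # -> curry
```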
  • Further, the providing unit 133 provides the robot device 10 with information on behavior based on the estimated emotion. Specifically, the providing unit 133 provides expression information indicating the tone of the voice output by the robot device. For example, if the emotion of user U1 is estimated to be positive, the providing unit 133 provides the robot device 10 with information indicating a bright tone as the tone of the voice with which the robot device 10 outputs the response sentence. Conversely, if the emotion of user U1 is estimated to be negative, the providing unit 133 provides information indicating a dark tone.
  • The providing unit 133 also provides expression information indicating the facial expression of the robot device. For example, if the emotion of user U1 is estimated to be positive, the providing unit 133 provides the robot device 10 with information indicating a smile; if negative, information indicating a sad face.
  • The providing unit 133 also provides expression information indicating the operating speed of the robot device. For example, if the robot device 10 moves while outputting the response sentence and the emotion of user U1 is estimated to be positive, the providing unit 133 provides the robot device 10 with information indicating a quick speed as its operating speed; if negative, information indicating a slow speed.
  • FIG. 5 is a flowchart showing an information processing procedure according to the embodiment of the present disclosure.
  • As shown in FIG. 5, the information processing device 100 acquires the morphemes of the voice recognition result (step S101).
  • The information processing device 100 acquires the speaker's preference information based on the speaker identification result (step S102).
  • The information processing device 100 estimates the speaker's emotion based on the acquired morphemes and the speaker's preference information, and generates the utterance sentence of the robot device 10 (step S103).
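A self-contained miniature of steps S101 to S103, under the same illustrative assumptions as the sketches above (toy tokenizer, fixed empathy outputs); speaker identification (S102) is reduced to a preference lookup keyed by the identified speaker.

```python
from typing import Dict, Set, Tuple

EMPATHY = {"curry": "delicious", "coriander": "sorry", "futsal": "fun"}

def handle_utterance(text: str, prefs: Dict[str, Set[str]]) -> Tuple[str, str]:
    words = set(text.lower().rstrip(".!?").split())        # S101 stand-in
    liked, disliked = prefs.get("liked", set()), prefs.get("disliked", set())
    hit_liked, hit_disliked = words & liked, words & disliked
    if hit_liked:                                          # S103: estimate + generate
        thing, polarity = hit_liked.pop(), "positive"
    elif hit_disliked:
        thing, polarity = hit_disliked.pop(), "negative"
    else:
        return "I see.", "neutral"
    return f"{thing.capitalize()} is {EMPATHY.get(thing, 'something')}, isn't it?", polarity

# Preferences that S102 would look up for the identified speaker U1.
print(handle_utterance("I ate curry yesterday",
                       {"liked": {"curry"}, "disliked": {"coriander"}}))
# -> ('Curry is delicious, isn't it?', 'positive')
```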
  • As described above, the information processing device 100 includes an acquisition unit 131 and a providing unit 133.
  • The acquisition unit 131 acquires utterance information indicating an utterance by the user.
  • The providing unit 133 provides the expression information to be expressed by the robot device based on the utterance information acquired by the acquisition unit 131 and on the user information about the user in which thing information indicating a specific thing is linked with emotion information indicating the user's emotion toward that thing.
  • This enables the robot device to express itself in a way that matches the emotion of the other party, so the information processing device 100 can expand the range of communication with the user.
  • The providing unit 133 provides the expression information as response information indicating a response to the utterance.
  • This enables the robot device to respond according to the emotion of the other party, so the information processing device 100 can expand the range of communication with the user.
  • The providing unit 133 provides expression information showing sympathy based on the emotion information associated with the thing information included in the utterance information.
  • This enables the robot device to estimate the user's emotion based on, for example, the user's preference information included in the utterance and to express sympathy for the estimated emotion, so the range of communication with the user can be expanded.
  • When the utterance information includes a plurality of pieces of thing information, the providing unit 133 provides expression information showing sympathy based on the emotion information associated with at least one of them.
  • This enables the robot device to make an appropriate response according to the emotion of the other party, so the range of communication with the user can be expanded.
  • Alternatively, the providing unit 133 provides expression information showing sympathy based on the emotion information associated with the thing information included in the bunsetsu that does not modify any other bunsetsu among the bunsetsu included in the utterance information.
  • This likewise enables the robot device to make an appropriate response according to the emotion of the other party, expanding the range of communication with the user.
  • The providing unit 133 provides the expression information that is the text of the utterance output by the robot device.
  • This enables the robot device to make an appropriate utterance according to the emotion of the other party.
  • The providing unit 133 provides the expression information indicating the tone of the voice output by the robot device.
  • The providing unit 133 provides the expression information indicating the facial expression of the robot device.
  • The providing unit 133 provides the expression information indicating the operating speed of the robot device.
  • FIG. 6 is a hardware configuration diagram showing an example of a computer 1000 that realizes the functions of an information processing device such as the information processing device 100.
  • The computer 1000 includes a CPU 1100, a RAM 1200, a ROM (Read Only Memory) 1300, an HDD (Hard Disk Drive) 1400, a communication interface 1500, and an input/output interface 1600. The parts of the computer 1000 are connected by a bus 1050.
  • The CPU 1100 operates based on programs stored in the ROM 1300 or the HDD 1400 and controls each part. For example, the CPU 1100 loads a program stored in the ROM 1300 or the HDD 1400 into the RAM 1200 and executes processing corresponding to the various programs.
  • The ROM 1300 stores a boot program such as a BIOS (Basic Input Output System) executed by the CPU 1100 when the computer 1000 starts, programs that depend on the hardware of the computer 1000, and the like.
  • The HDD 1400 is a computer-readable recording medium that non-transiently records programs executed by the CPU 1100 and data used by those programs.
  • Specifically, the HDD 1400 is a recording medium that records the information processing program according to the present disclosure, which is an example of the program data 1450.
  • The communication interface 1500 is an interface for connecting the computer 1000 to an external network 1550 (for example, the Internet).
  • The CPU 1100 receives data from other devices and transmits data generated by the CPU 1100 to other devices via the communication interface 1500.
  • The input/output interface 1600 is an interface for connecting an input/output device 1650 to the computer 1000.
  • The CPU 1100 receives data from an input device such as a keyboard or mouse via the input/output interface 1600 and transmits data to an output device such as a display, a speaker, or a printer via the same interface. The input/output interface 1600 may also function as a media interface for reading a program or the like recorded on a predetermined recording medium.
  • The media are, for example, optical recording media such as a DVD (Digital Versatile Disc) or PD (Phase change rewritable Disk), magneto-optical recording media such as an MO (Magneto-Optical disk), tape media, magnetic recording media, or semiconductor memories.
  • For example, when the computer 1000 functions as the information processing device 100 according to the embodiment, the CPU 1100 of the computer 1000 realizes the functions of the control unit 130 and the like by executing the information processing program loaded into the RAM 1200.
  • The HDD 1400 stores the information processing program according to the present disclosure and the data in the storage unit 120.
  • The CPU 1100 reads the program data 1450 from the HDD 1400 and executes it, but as another example, these programs may be acquired from another device via the external network 1550.
  • The present technology can also have the following configurations.
  • (1) An information processing device comprising: an acquisition unit that acquires utterance information indicating an utterance by a user; and a providing unit that provides expression information to be expressed by a robot device based on the utterance information acquired by the acquisition unit and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's emotion toward the specific thing.
  • (2) The information processing device according to (1), wherein the providing unit provides the expression information that is response information indicating a response to the utterance.
  • (3) The information processing device according to (1) or (2), wherein the providing unit provides the expression information showing sympathy based on the emotion information associated with the thing information included in the utterance information.
  • (4) The information processing device according to any one of (1) to (3), wherein, when the utterance information includes a plurality of pieces of the thing information, the providing unit provides the expression information showing sympathy based on the emotion information associated with at least one of the plurality of pieces of the thing information.
  • (5) The information processing device according to any one of (1) to (4), wherein, when the utterance information includes a plurality of pieces of the thing information, the providing unit provides the expression information showing sympathy based on the emotion information associated with the thing information included in the bunsetsu that does not modify any other bunsetsu among the bunsetsu included in the utterance information.
  • (6) The information processing device according to any one of (1) to (5), wherein the providing unit provides the expression information that is the text of the utterance output by the robot device.
  • (7) The information processing device according to any one of (1) to (6), wherein the providing unit provides the expression information indicating the tone of the voice output by the robot device.
  • (8) The information processing device according to any one of (1) to (7), wherein the providing unit provides the expression information indicating the facial expression of the robot device.
  • (9) The information processing device according to any one of (1) to (8), wherein the providing unit provides the expression information indicating the operating speed of the robot device.
  • (10) An information processing method in which a computer executes processing of: acquiring utterance information indicating an utterance by a user; and providing expression information to be expressed by a robot device based on the acquired utterance information and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's emotion toward the specific thing.
  • (11) An information processing program that causes a computer to execute: an acquisition procedure of acquiring utterance information indicating an utterance by a user; and a providing procedure of providing expression information to be expressed by a robot device based on the utterance information acquired by the acquisition procedure and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's emotion toward the specific thing.
  • 1 Information processing system; 10 Robot device; 100 Information processing device; 110 Communication unit; 120 Storage unit; 121 Preference information database; 122 Model information database; 123 Template database; 130 Control unit; 131 Acquisition unit; 132 Generation unit; 133 Providing unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Manipulator (AREA)

Abstract

An information processing device (100) according to the present application comprises an acquisition unit (131) and a provision unit (133). The acquisition unit (131) acquires speech information indicating speech uttered by a user. The provision unit (133) provides expression information to be expressed by a robot device on the basis of the speech information acquired by the acquisition unit (131) and user information pertaining to the user in which thing information indicating a specific thing and emotion information indicating the user's emotion toward the specific thing are associated with each other. The provision unit (133) provides expression information that is response information indicating a response to the speech.

Description

Information processing device, information processing method, and information processing program
 The present invention relates to an information processing device, an information processing method, and an information processing program.
 In recent years, various devices that respond to user actions have become widespread. Such devices include agents that present answers to inquiries from users. For example, Patent Document 1 discloses a technique of calculating an expected value of the user's attention to output information and controlling information output based on that expected value.
Patent Document 1: Japanese Unexamined Patent Publication No. 2015-132878
 Incidentally, recent agents tend to place more importance on communication with users, beyond simple information presentation. However, it is hard to say that sufficient communication occurs with a device that merely responds to a user's action, as described in Patent Document 1.
 Therefore, the present disclosure proposes an information processing device, an information processing method, and an information processing program that can expand the range of communication with the user.
 To solve the above problems, an information processing device according to one form of the present disclosure includes: an acquisition unit that acquires utterance information indicating an utterance by a user; and a providing unit that provides expression information to be expressed by a robot device based on the utterance information acquired by the acquisition unit and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's emotion toward the specific thing.
 FIG. 1 is a diagram showing an example of information processing according to the embodiment of the present disclosure. FIG. 2 is a block diagram showing an example of the schematic configuration of the information processing device according to the embodiment. FIG. 3 is a diagram showing a configuration example of the information processing device according to the embodiment. FIG. 4 is a diagram showing an example of the preference information database according to the embodiment. FIG. 5 is a flowchart showing the information processing procedure according to the embodiment. FIG. 6 is a hardware configuration diagram showing an example of a computer that realizes the functions of the information processing device.
 Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. In each of the following embodiments, the same parts are designated by the same reference numerals, and duplicate description is omitted.
 The present disclosure will be described in the following order.
 1. Embodiment
  1-1. Outline of information processing according to the embodiment
  1-2. Configuration of the information processing device according to the embodiment
  1-3. Information processing procedure according to the embodiment
 2. Effects of this disclosure
 3. Hardware configuration
[1. Embodiment]
[1-1. Outline of information processing according to the embodiment]
 First, the outline of information processing according to the embodiment of the present disclosure will be described with reference to FIG. 1. FIG. 1 is a diagram showing an example of information processing according to the embodiment of the present disclosure. The information processing shown in FIG. 1 is performed by the robot device 10 and the information processing device 100.
 The robot device 10 can be any of various devices that perform autonomous operation based on environment recognition. For example, the robot device 10 according to the present embodiment is an ellipsoidal agent-type robot device that travels autonomously on wheels. The robot device 10 realizes various kinds of communication, including information presentation, by performing autonomous operations according to, for example, the user, the surroundings, and its own situation. The robot device 10 may be a small robot having a size and weight that allow a user to lift it easily with one hand.
 The information processing device 100 acquires utterance information indicating an utterance by the user. Based on the acquired utterance information and on user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's emotion toward that thing, the information processing device 100 provides expression information to be expressed by the robot device 10.
 In the example shown in FIG. 1, the information processing device 100 acquires voice data of the user's utterance from the robot device 10. Subsequently, it acquires the voice recognition result of the acquired voice data (for example, "I ate curry yesterday") and decomposes the result into morphemes by morphological analysis, obtaining, for example, "yesterday: noun, tense", "curry: noun, food", "ate: verb", and "yo: particle".
 Further, the information processing device 100 acquires the voice direction (for example, "45 degrees to the left"), the user's face identification result (for example, face ID "U1"), and image data from the robot device 10. It then matches the voice direction (45 degrees to the left) against the position of the user in the image data (the user with face ID "U1" is 45 degrees to the left) to identify the user who is the speaker of the utterance.
 When the information processing device 100 has identified the speaker, it acquires that user's preference information. For example, as the preference information of user U1, it acquires the information that the food user U1 likes is "curry". It then inputs the character string "curry", the preference target of user U1, into the sympathy model and acquires the character string "delicious" as the model's output.
 Subsequently, the information processing device 100 acquires an utterance sentence template, for example "X is Y, isn't it?" (X is a character string indicating the user's preference target; Y is a character string output from the sympathy model). Having acquired the template, the information processing device 100 applies "curry" to X and "delicious" to Y, generating the response sentence "Curry is delicious, isn't it?". Further, since the utterance "I ate curry yesterday" contains "curry", a food that user U1 likes, the information processing device 100 estimates that the emotion of user U1 is a positive emotion.
 When the information processing device 100 has generated the response sentence, it provides the robot device 10 with the response sentence "Curry is delicious, isn't it?". When it has estimated the emotion of user U1, it also provides the robot device 10 with information on behavior based on the estimated emotion: for example, information indicating a bright tone for the voice of the response sentence, information indicating a smile as the facial expression of the robot device 10, and, if the robot device 10 moves while outputting the response sentence, information indicating a quick speed as its operating speed.
[1-2. Configuration of the information processing device according to the embodiment]
 Next, an example of the schematic configuration of the information processing device according to the embodiment of the present disclosure will be described with reference to FIG. 2. FIG. 2 is a block diagram showing an example of the schematic configuration of the information processing device according to the embodiment of the present disclosure. In the example of FIG. 2, the information processing system 1 has a robot device 10 and an information processing device 100.
 In the example shown in FIG. 2, the information processing device 100 includes an acquisition unit 131 and a generation unit 132. The acquisition unit 131 includes a voice recognizer, a speaker identifier, a morphological analyzer, and a preprocessing unit.
 The voice recognizer acquires voice data from the robot device 10, recognizes the speech in the voice data, and outputs the voice recognition result to the preprocessing unit.
 The speaker identifier acquires the voice direction, the face identification result, and image data from the robot device 10, identifies the speaker from them, and outputs the speaker identification result to the preprocessing unit.
 The preprocessing unit acquires the voice recognition result from the voice recognizer and outputs it to the morphological analyzer.
 The morphological analyzer acquires the voice recognition result from the preprocessing unit, performs morphological analysis to decompose it into morphemes, and outputs the morphemes back to the preprocessing unit.
 The preprocessing unit also acquires the speaker identification result from the speaker identifier, refers to the preference information database 121, and acquires the identified speaker's preference information from that database.
 Having acquired the morphemes and the speaker's preference information, the preprocessing unit outputs them to the generation unit 132.
 The generation unit 132 acquires the morphemes and the speaker's preference information from the preprocessing unit and, based on them, generates an utterance sentence and estimates the speaker's emotion.
 Next, the configuration of the information processing device according to the embodiment of the present disclosure will be described with reference to FIG. 3. FIG. 3 is a diagram showing a configuration example of the information processing device according to the embodiment of the present disclosure. As shown in FIG. 3, the information processing device 100 according to the embodiment of the present disclosure includes a communication unit 110, a storage unit 120, and a control unit 130.
(Communication unit 110)
 The communication unit 110 is realized by, for example, a NIC. It is connected to the network N by wire or wirelessly and transmits and receives information to and from, for example, the robot device 10.
(Storage unit 120)
 The storage unit 120 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or by a storage device such as a hard disk or an optical disk. For example, the storage unit 120 stores the information processing program according to the embodiment. As shown in FIG. 3, the storage unit 120 has a preference information database 121, a model information database 122, and a template database 123.
(嗜好情報データベース121)
 嗜好情報データベース121は、利用者の嗜好に関する各種情報を記憶する。図4を用いて、実施形態に係る嗜好情報データベースの一例について説明する。図4は、本開示の実施形態に係る嗜好情報データベースの一例を示す図である。図4に示す例では、嗜好情報データベース121は、「ユーザID」、「名前」、「好きな食べ物」、「嫌いな食べ物」、「最近不満に思ったこと」、「悲しくなるもの」、「趣味」、「出身地」といった項目を有する。
(Preference information database 121)
The preference information database 121 stores various information related to the user's preference. An example of the preference information database according to the embodiment will be described with reference to FIG. FIG. 4 is a diagram showing an example of a preference information database according to the embodiment of the present disclosure. In the example shown in FIG. 4, the preference information database 121 has a "user ID", a "name", a "favorite food", a "disliked food", a "recently dissatisfied", a "sad thing", and a "sad thing". It has items such as "hobbies" and "hometown".
 「ユーザID」は、ユーザを識別する識別情報を示す。「名前」は、ユーザの名前を示す。「好きな食べ物」は、ユーザが好きな食べ物を示す。「嫌いな食べ物」は、ユーザが嫌いな食べ物を示す。「最近不満に思ったこと」は、ユーザが最近不満に思ったことを示す。「悲しくなるもの」は、ユーザが悲しくなるものを示す。「趣味」は、ユーザの趣味を示す。「出身地」は、ユーザの出身地を示す。 "User ID" indicates identification information that identifies the user. "Name" indicates the name of the user. "Favorite food" indicates the food that the user likes. "Disliked food" indicates food that the user dislikes. "Recently dissatisfied" indicates that the user has recently been dissatisfied. "What makes you sad" indicates what makes the user sad. "Hobby" indicates a user's hobby. "Hometown" indicates the hometown of the user.
In the example shown in the first record of FIG. 4, the name of the user identified by the user ID "U1" (user U1) is "A". User U1's favorite food is "curry", and the food user U1 dislikes is "coriander". What user U1 has recently been dissatisfied with is the "tax increase", and what makes user U1 sad is a "birthday". User U1's hobby is "futsal", and user U1's hometown is "Tokyo".
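The patent does not specify a concrete schema for database 121, so the following is only a minimal sketch of how the record above could be modeled; the class name, field names, and the variable u1 are all hypothetical:

```python
from dataclasses import dataclass

@dataclass
class PreferenceRecord:
    """One row of the preference information database 121 (illustrative schema)."""
    user_id: str
    name: str
    favorite_food: str
    disliked_food: str
    recent_dissatisfaction: str
    sad_thing: str
    hobby: str
    hometown: str

# The first record of FIG. 4, expressed in this hypothetical schema.
u1 = PreferenceRecord(
    user_id="U1", name="A",
    favorite_food="curry", disliked_food="coriander",
    recent_dissatisfaction="tax increase", sad_thing="birthday",
    hobby="futsal", hometown="Tokyo",
)
```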
(Model information database 122)
The model information database 122 stores various kinds of information about the empathy model. Specifically, the model information database 122 stores information about a learning model trained so that, when a character string indicating the user's preference information is input, it outputs a character string corresponding to that preference information. For example, the model information database 122 stores identification information that identifies an empathy model in association with the model data of that empathy model.
(Template database 123)
The template database 123 stores various kinds of information about templates. Specifically, the template database 123 stores the utterance sentence template "X is Y, isn't it?", where X is a character string indicating the target of the user's preference and Y is a character string output by the empathy model.
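As an illustration, filling this template can be sketched as follows; the constant TEMPLATE and the function name fill_template are assumptions, not names taken from the patent:

```python
# A minimal sketch of the single utterance template held in database 123.
TEMPLATE = "{X} is {Y}, isn't it?"

def fill_template(x: str, y: str) -> str:
    """Substitute the preference target X and the empathy-model output Y."""
    return TEMPLATE.format(X=x, Y=y)

print(fill_template("curry", "delicious"))  # -> curry is delicious, isn't it?
```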
(Control unit 130)
The control unit 130 is realized by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing various programs (corresponding to an example of the information processing program) stored in a storage device inside the information processing device 100, using a RAM as a work area. The control unit 130 may also be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).
As shown in FIG. 3, the control unit 130 has an acquisition unit 131, a generation unit 132, and a providing unit 133, and realizes or executes the information processing functions and operations described below. The internal configuration of the control unit 130 is not limited to the configuration shown in FIG. 3 and may be any other configuration that performs the information processing described later.
(Acquisition unit 131)
The acquisition unit 131 acquires utterance information indicating an utterance by a user. Specifically, the acquisition unit 131 acquires voice data of the user's utterance from the robot device 10 and then acquires the speech recognition result of that voice data (for example, "I ate curry yesterday"). The acquisition unit 131 then decomposes the speech recognition result into morphemes by morphological analysis. For example, the acquisition unit 131 obtains morphemes such as "yesterday: noun, tense", "curry: noun, food", "ate: verb", and "yo: particle".
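As a rough illustration of this step (a real system would use a full morphological analyzer such as MeCab for Japanese; the lookup table and function below are purely hypothetical stand-ins):

```python
# Illustrative stand-in for a morphological analyzer: a tiny tag dictionary.
MORPHEME_TAGS = {
    "yesterday": ("noun", "tense"),
    "curry": ("noun", "food"),
    "ate": ("verb",),
    "yo": ("particle",),
}

def analyze(recognized_text: str) -> list[tuple[str, tuple[str, ...]]]:
    """Decompose a speech recognition result into (surface form, tags) pairs."""
    return [(w, MORPHEME_TAGS.get(w, ("unknown",)))
            for w in recognized_text.lower().split()]

print(analyze("I ate curry yesterday"))
```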
The acquisition unit 131 also acquires the voice direction (for example, "45 degrees to the left"), the user's face identification result (for example, the face ID "U1"), and image data from the robot device 10. The acquisition unit 131 then matches the voice direction acquired from the robot device 10 (45 degrees to the left) against the user's position in the image data (the user with the face ID "U1" is in the direction 45 degrees to the left) to identify the user who is the speaker of the utterance.
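A minimal sketch of this matching step follows; representing directions as signed degrees and the 10-degree tolerance are both assumptions:

```python
# Match the voice direction against the directions of detected faces.
def identify_speaker(voice_direction_deg: float,
                     face_directions_deg: dict[str, float],
                     tolerance_deg: float = 10.0) -> str | None:
    """Return the face ID closest to the voice direction, within a tolerance."""
    best_id, best_diff = None, tolerance_deg
    for face_id, face_deg in face_directions_deg.items():
        diff = abs(voice_direction_deg - face_deg)
        if diff <= best_diff:
            best_id, best_diff = face_id, diff
    return best_id

# The user with face ID "U1" sits 45 degrees to the left (negative angles = left).
print(identify_speaker(-45.0, {"U1": -45.0, "U2": 30.0}))  # -> U1
```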
Having identified the user who is the speaker, the acquisition unit 131 acquires that user's preference information. Specifically, the acquisition unit 131 refers to the preference information database 121 and acquires the preference information of the identified user. For example, the acquisition unit 131 acquires, as user U1's preference information, the information that user U1's favorite food is "curry". In this way, the acquisition unit 131 acquires user information about the user in which thing information indicating a specific thing (for example, "curry") is associated with emotion information indicating the user's feelings toward that specific thing (for example, "food the user likes").
(Generation unit 132)
The generation unit 132 generates the character string to be inserted into the utterance sentence template. Specifically, when the acquisition unit 131 has acquired the user's preference information, the generation unit 132 inputs the acquired preference information into the empathy model and generates, as the output data of the empathy model, the character string to be inserted into the "Y" part of the "X is Y, isn't it?" template. For example, the generation unit 132 inputs the character string "curry", the target of user U1's preference, into the empathy model and generates the character string "delicious" as the output data of the empathy model.
The generation unit 132 then acquires the utterance sentence template. For example, the generation unit 132 acquires the template "X is Y, isn't it?" (where X is a character string indicating the user's preference target and Y is a character string output by the empathy model). Having acquired the template, the generation unit 132 inserts the character string "curry", user U1's preference target, into X and the character string "delicious", the output data of the empathy model, into Y to generate the response sentence "Curry is delicious, isn't it?". In this way, when the speaker's preference information has been acquired by the acquisition unit 131, the generation unit 132 generates, based on that preference information, response information indicating a response to the utterance. For example, the generation unit 132 generates the utterance sentence to be output by the robot device based on the speaker's preference information.
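The following sketch combines these two steps, reusing fill_template from the earlier snippet; the dictionary EMPATHY_LOOKUP is only a stand-in for the trained empathy model held in database 122, whose actual outputs the patent does not enumerate:

```python
# Stand-in for the empathy model: preference target -> empathetic word.
EMPATHY_LOOKUP = {
    "curry": "delicious",
    "coriander": "disappointing",
    "futsal": "fun",
}

def generate_response(preference_target: str) -> str:
    """Run the (stubbed) empathy model on X and fill the template with X and Y."""
    y = EMPATHY_LOOKUP.get(preference_target, "something")
    return fill_template(preference_target, y)

print(generate_response("curry"))  # -> curry is delicious, isn't it?
```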
Furthermore, because user U1's utterance "I ate curry yesterday" contains "curry", a food user U1 likes, the generation unit 132 estimates that the emotion of user U1, who made the utterance, is positive. In this way, the generation unit 132 estimates the speaker's emotion based on the utterance information acquired by the acquisition unit 131 and the speaker's preference information. For example, when the utterance information acquired by the acquisition unit 131 contains the speaker's preference target, the generation unit 132 estimates that the speaker's emotion is positive.
Conversely, when user U1's utterance "Coriander came with lunch today" contains "coriander", a food user U1 dislikes, the generation unit 132 estimates that the emotion of user U1 is negative. The generation unit 132 also inputs the character string "coriander", a thing user U1 dislikes, into the empathy model and generates the character string "disappointing" as its output. On acquiring the template "X is Y, isn't it?", the generation unit 132 inserts "coriander" into X and "disappointing" into Y to generate the response sentence "Coriander is disappointing, isn't it?".
Similarly, when user U1's utterance "I'm going to watch a futsal match soon" contains "futsal", user U1's hobby, the generation unit 132 estimates that the emotion of user U1 is positive. The generation unit 132 inputs the character string "futsal", the target of user U1's hobby, into the empathy model and generates the character string "fun" as its output. On acquiring the template "X is Y, isn't it?", the generation unit 132 inserts "futsal" into X and "fun" into Y to generate the response sentence "Futsal is fun, isn't it?".
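A sketch of this estimation rule, assuming the PreferenceRecord from the earlier snippet; treating the remaining fields as negative targets and falling back to "neutral" are assumptions the patent does not state:

```python
def estimate_emotion(utterance: str, prefs: "PreferenceRecord") -> str:
    """Estimate the speaker's emotion from which preference targets the utterance contains."""
    positive_targets = {prefs.favorite_food, prefs.hobby}
    negative_targets = {prefs.disliked_food, prefs.recent_dissatisfaction, prefs.sad_thing}
    if any(t in utterance for t in positive_targets):
        return "positive"
    if any(t in utterance for t in negative_targets):
        return "negative"
    return "neutral"

print(estimate_emotion("I ate curry yesterday", u1))  # -> positive
```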
(Providing unit 133)
The providing unit 133 provides the expression information to be expressed by the robot device, based on the utterance information acquired by the acquisition unit 131 and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's feelings toward the specific thing. Specifically, the providing unit 133 provides expression information that is response information indicating a response to the utterance. More specifically, the providing unit 133 provides expression information that is the utterance sentence to be output by the robot device. For example, the providing unit 133 provides expression information showing empathy based on the emotion information associated with the thing information included in the utterance information. The providing unit 133 provides the expression information generated by the generation unit 132. In the example shown in FIG. 1, the providing unit 133 provides the robot device 10 with the response sentence "Curry is delicious, isn't it?" generated by the generation unit 132.
When the utterance information contains a plurality of pieces of thing information, the providing unit 133 provides expression information showing empathy based on the emotion information associated with at least one of them. For example, suppose the speaker "likes" "curry" and "dislikes" "rabbits", and the acquisition unit 131 acquires the utterance "I ate curry while watching the rabbits at the zoo". The providing unit 133 then provides a response sentence showing empathy based on the "like" or "dislike" emotion associated with at least one of "curry" and "rabbit", such as "Curry, that's nice!" or "A rabbit? I don't really like those".
Alternatively, when the utterance information contains a plurality of pieces of thing information, the providing unit 133 provides expression information showing empathy based on the emotion information associated with the thing information contained in the clause that, among the clauses of the utterance, does not modify any other clause. In the example above, the clause "while watching the rabbits" modifies the clause "I ate curry", whereas "I ate curry" does not modify any other clause. The providing unit 133 therefore provides "Curry, that's nice!", a response sentence showing empathy based on the "like" emotion associated with "curry" in the clause "I ate curry".
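A sketch of selecting the thing in the head clause follows; representing each clause as (text, index of the clause it modifies, or None) is an assumption about the output of a dependency parser, which the patent does not specify:

```python
def pick_head_thing(clauses: list[tuple[str, int | None]],
                    known_things: set[str]) -> str | None:
    """Return a known thing found in the clause that modifies no other clause."""
    for text, modifies in clauses:
        if modifies is None:  # the head clause
            for thing in known_things:
                if thing in text:
                    return thing
    return None

# "while watching the rabbits" modifies clause 1; "I ate curry" is the head.
clauses = [("while watching the rabbits", 1), ("I ate curry", None)]
print(pick_head_thing(clauses, {"curry", "rabbit"}))  # -> curry
```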
Having estimated user U1's emotion, the providing unit 133 also provides the robot device 10 with information about behavior based on the estimated emotion. Specifically, the providing unit 133 provides expression information that is information indicating the tone of the voice output by the robot device. For example, when user U1's emotion is estimated to be positive, the providing unit 133 provides the robot device 10 with information indicating a bright tone as the tone of the voice in which the response sentence is output. Conversely, when user U1's emotion is estimated to be negative, the providing unit 133 provides the robot device 10 with information indicating a subdued tone.
The providing unit 133 also provides expression information that is information indicating the facial expression of the robot device. For example, when user U1's emotion is estimated to be positive, the providing unit 133 provides the robot device 10 with information indicating a smile as the facial expression of the robot device 10. Conversely, when user U1's emotion is estimated to be negative, the providing unit 133 provides the robot device 10 with information indicating a sad face.
The providing unit 133 also provides expression information that is information indicating the operating speed of the robot device. For example, when the robot device 10 moves while outputting the response sentence and user U1's emotion is estimated to be positive, the providing unit 133 provides the robot device 10 with information indicating a quick motion as the operating speed. Conversely, when user U1's emotion is estimated to be negative, the providing unit 133 provides the robot device 10 with information indicating a slow motion.
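The three kinds of behavior information can be pictured as one payload per estimated emotion; the field names and concrete values below are assumptions for illustration only:

```python
def behavior_for(emotion: str) -> dict[str, str]:
    """Map an estimated emotion to voice tone, facial expression, and motion speed."""
    if emotion == "positive":
        return {"voice_tone": "bright", "expression": "smile", "motion_speed": "quick"}
    if emotion == "negative":
        return {"voice_tone": "subdued", "expression": "sad", "motion_speed": "slow"}
    return {"voice_tone": "neutral", "expression": "neutral", "motion_speed": "normal"}

print(behavior_for("positive"))
```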
[1-3. Information processing procedure according to the embodiment]
Next, the information processing procedure according to the embodiment of the present disclosure will be described with reference to FIG. 5. FIG. 5 is a flowchart showing the information processing procedure according to the embodiment of the present disclosure. In the example shown in FIG. 5, the information processing device 100 acquires the morphemes of the speech recognition result (step S101). Next, the information processing device 100 acquires the speaker's preference information based on the speaker identification result (step S102). The information processing device 100 then estimates the speaker's emotion based on the acquired morphemes and the speaker's preference information, and generates an utterance sentence for the robot device 10 (step S103).
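Putting the steps of FIG. 5 together, the whole flow could be sketched as follows, composing the helpers from the earlier snippets (all of which are hypothetical); PREFS stands in for the preference information database 121:

```python
PREFS = {"U1": u1}  # stand-in for the preference information database 121

def handle_utterance(recognized_text: str, speaker_id: str):
    morphemes = analyze(recognized_text)                 # S101: morphemes of the recognition result
    prefs = PREFS[speaker_id]                            # S102: speaker's preference information
    emotion = estimate_emotion(recognized_text, prefs)   # S103: estimate the speaker's emotion...
    mentioned = [t for t in (prefs.favorite_food, prefs.disliked_food, prefs.hobby)
                 if t in recognized_text]                # ...and pick the mentioned target
    response = generate_response(mentioned[0]) if mentioned else ""
    return morphemes, response, behavior_for(emotion)

print(handle_utterance("I ate curry yesterday", "U1"))
```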
[2. Effects of the present disclosure]
As described above, the information processing device 100 according to the present disclosure includes the acquisition unit 131 and the providing unit 133. The acquisition unit 131 acquires utterance information indicating an utterance by a user. The providing unit 133 provides the expression information to be expressed by the robot device, based on the utterance information acquired by the acquisition unit 131 and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's feelings toward the specific thing.
This enables the robot device to express itself in a way that matches the other party's emotions, so the information processing device 100 can broaden the range of communication with the user.
The providing unit 133 provides expression information that is response information indicating a response to the utterance.
This enables the robot device to respond in a way that matches the other party's emotions, so the information processing device 100 can broaden the range of communication with the user.
The providing unit 133 provides expression information showing empathy based on the emotion information associated with the thing information included in the utterance information.
This enables the robot device to estimate the user's emotion based on, for example, the user's preference information contained in the utterance and to express empathy with the estimated emotion, so the information processing device 100 can broaden the range of communication with the user.
When the utterance information contains a plurality of pieces of thing information, the providing unit 133 provides expression information showing empathy based on the emotion information associated with at least one of them.
This enables the robot device to make an appropriate response that matches the other party's emotions, so the information processing device 100 can broaden the range of communication with the user.
When the utterance information contains a plurality of pieces of thing information, the providing unit 133 provides expression information showing empathy based on the emotion information associated with the thing information contained in the clause that does not modify any other clause of the utterance.
This likewise enables the robot device to make an appropriate response that matches the other party's emotions, broadening the range of communication with the user.
The providing unit 133 provides expression information that is the utterance sentence output by the robot device.
This enables the robot device to make an appropriate utterance that matches the other party's emotions.
The providing unit 133 provides expression information that is information indicating the tone of the voice output by the robot device.
This enables the robot device to speak in a voice tone appropriate to the other party's emotions.
The providing unit 133 provides expression information that is information indicating the facial expression of the robot device.
This enables the robot device to show a facial expression appropriate to the other party's emotions.
The providing unit 133 provides expression information that is information indicating the operating speed of the robot device.
This enables the robot device to move in a manner appropriate to the other party's emotions.
[3. Hardware configuration]
Information devices such as the information processing device 100 according to the embodiment and modifications described above are realized by, for example, a computer 1000 configured as shown in FIG. 6. FIG. 6 is a hardware configuration diagram showing an example of the computer 1000 that realizes the functions of an information processing device such as the information processing device 100. The following takes the information processing device 100 according to the embodiment as an example. The computer 1000 includes a CPU 1100, a RAM 1200, a ROM (Read Only Memory) 1300, an HDD (Hard Disk Drive) 1400, a communication interface 1500, and an input/output interface 1600. The units of the computer 1000 are connected by a bus 1050.
The CPU 1100 operates based on programs stored in the ROM 1300 or the HDD 1400 and controls each unit. For example, the CPU 1100 loads a program stored in the ROM 1300 or the HDD 1400 into the RAM 1200 and executes processing corresponding to the various programs.
The ROM 1300 stores a boot program such as a BIOS (Basic Input Output System) executed by the CPU 1100 when the computer 1000 starts, programs that depend on the hardware of the computer 1000, and the like.
The HDD 1400 is a computer-readable recording medium that non-temporarily records programs executed by the CPU 1100 and the data used by those programs. Specifically, the HDD 1400 is a recording medium that records the information processing program according to the present disclosure, which is an example of the program data 1450.
The communication interface 1500 is an interface for connecting the computer 1000 to an external network 1550 (for example, the Internet). For example, the CPU 1100 receives data from other devices and transmits data generated by the CPU 1100 to other devices via the communication interface 1500.
The input/output interface 1600 is an interface for connecting an input/output device 1650 to the computer 1000. For example, the CPU 1100 receives data from an input device such as a keyboard or a mouse via the input/output interface 1600, and transmits data to an output device such as a display, a speaker, or a printer via the input/output interface 1600. The input/output interface 1600 may also function as a media interface that reads programs and the like recorded on a predetermined recording medium. Such media are, for example, optical recording media such as a DVD (Digital Versatile Disc) or a PD (Phase change rewritable Disk), magneto-optical recording media such as an MO (Magneto-Optical disk), tape media, magnetic recording media, and semiconductor memories.
For example, when the computer 1000 functions as the information processing device 100 according to the embodiment, the CPU 1100 of the computer 1000 realizes the functions of the control unit 130 and the like by executing the information processing program loaded on the RAM 1200. The HDD 1400 also stores the information processing program according to the present disclosure and the data in the storage unit 120. The CPU 1100 reads the program data 1450 from the HDD 1400 and executes it, but as another example these programs may be acquired from another device via the external network 1550.
The present technology can also take the following configurations.
(1)
An information processing device comprising:
an acquisition unit that acquires utterance information indicating an utterance by a user; and
a providing unit that provides expression information to be expressed by a robot device, based on the utterance information acquired by the acquisition unit and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's feelings toward the specific thing.
(2)
The information processing device according to (1), wherein the providing unit provides the expression information that is response information indicating a response to the utterance.
(3)
The information processing device according to (1) or (2), wherein the providing unit provides the expression information showing empathy based on the emotion information associated with the thing information included in the utterance information.
(4)
The information processing device according to any one of (1) to (3), wherein, when the utterance information includes a plurality of pieces of the thing information, the providing unit provides the expression information showing empathy based on the emotion information associated with at least one of the plurality of pieces of the thing information.
(5)
The information processing device according to any one of (1) to (4), wherein, when the utterance information includes a plurality of pieces of the thing information, the providing unit provides the expression information showing empathy based on the emotion information associated with the thing information included in the clause that, among the clauses included in the utterance information, does not modify any other clause.
(6)
The information processing device according to any one of (1) to (5), wherein the providing unit provides the expression information that is a sentence of an utterance output by the robot device.
(7)
The information processing device according to any one of (1) to (6), wherein the providing unit provides the expression information that is information indicating the tone of the voice output by the robot device.
(8)
The information processing device according to any one of (1) to (7), wherein the providing unit provides the expression information that is information indicating the facial expression of the robot device.
(9)
The information processing device according to any one of (1) to (8), wherein the providing unit provides the expression information that is information indicating the operating speed of the robot device.
(10)
An information processing method executing processing of:
acquiring utterance information indicating an utterance by a user; and
providing expression information to be expressed by a robot device, based on the acquired utterance information and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's feelings toward the specific thing.
(11)
An information processing program for causing a computer to execute:
an acquisition procedure of acquiring utterance information indicating an utterance by a user; and
a providing procedure of providing expression information to be expressed by a robot device, based on the utterance information acquired by the acquisition procedure and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's feelings toward the specific thing.
1 Information processing system
10 Robot device
100 Information processing device
110 Communication unit
120 Storage unit
121 Preference information database
122 Model information database
123 Template database
130 Control unit
131 Acquisition unit
132 Generation unit
133 Providing unit

Claims (11)

1. An information processing device comprising:
an acquisition unit that acquires utterance information indicating an utterance by a user; and
a providing unit that provides expression information to be expressed by a robot device, based on the utterance information acquired by the acquisition unit and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's feelings toward the specific thing.
2. The information processing device according to claim 1, wherein the providing unit provides the expression information that is response information indicating a response to the utterance.
3. The information processing device according to claim 1, wherein the providing unit provides the expression information showing empathy based on the emotion information associated with the thing information included in the utterance information.
4. The information processing device according to claim 1, wherein, when the utterance information includes a plurality of pieces of the thing information, the providing unit provides the expression information showing empathy based on the emotion information associated with at least one of the plurality of pieces of the thing information.
5. The information processing device according to claim 1, wherein, when the utterance information includes a plurality of pieces of the thing information, the providing unit provides the expression information showing empathy based on the emotion information associated with the thing information included in the clause that, among the clauses included in the utterance information, does not modify any other clause.
6. The information processing device according to claim 1, wherein the providing unit provides the expression information that is a sentence of an utterance output by the robot device.
7. The information processing device according to claim 1, wherein the providing unit provides the expression information that is information indicating the tone of the voice output by the robot device.
8. The information processing device according to claim 1, wherein the providing unit provides the expression information that is information indicating the facial expression of the robot device.
9. The information processing device according to claim 1, wherein the providing unit provides the expression information that is information indicating the operating speed of the robot device.
10. An information processing method executing processing of:
acquiring utterance information indicating an utterance by a user; and
providing expression information to be expressed by a robot device, based on the acquired utterance information and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's feelings toward the specific thing.
11. An information processing program for causing a computer to execute:
an acquisition procedure of acquiring utterance information indicating an utterance by a user; and
a providing procedure of providing expression information to be expressed by a robot device, based on the utterance information acquired by the acquisition procedure and user information about the user in which thing information indicating a specific thing is associated with emotion information indicating the user's feelings toward the specific thing.
PCT/JP2020/045993 2019-12-27 2020-12-10 Information processing device, information processing method, and information processing program WO2021131737A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019239051 2019-12-27
JP2019-239051 2019-12-27

Publications (1)

Publication Number Publication Date
WO2021131737A1 (en)

Family

ID=76575476

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/045993 WO2021131737A1 (en) 2019-12-27 2020-12-10 Information processing device, information processing method, and information processing program

Country Status (1)

Country Link
WO (1) WO2021131737A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004086001A (en) * 2002-08-28 2004-03-18 Sony Corp Conversation processing system, conversation processing method, and computer program
JP2006178063A (en) * 2004-12-21 2006-07-06 Toyota Central Res & Dev Lab Inc Interactive processing device
JP2015148701A (en) * 2014-02-06 2015-08-20 日本電信電話株式会社 Robot control device, robot control method and robot control program
JP2016536630A * 2013-10-01 2016-11-24 Softbank Robotics Europe Method for dialogue between machine such as humanoid robot and human speaker, computer program product, and humanoid robot for executing the method
JP2018054866A (en) * 2016-09-29 2018-04-05 トヨタ自動車株式会社 Voice interactive apparatus and voice interactive method
JP2019175432A (en) * 2018-03-26 2019-10-10 カシオ計算機株式会社 Dialogue control device, dialogue system, dialogue control method, and program

Similar Documents

Publication Publication Date Title
JP6719739B2 (en) Dialogue method, dialogue system, dialogue device, and program
US9355092B2 (en) Human-like response emulator
KR20170027705A (en) Methods and systems of handling a dialog with a robot
JP2016536630A (en) Method for dialogue between machine such as humanoid robot and human speaker, computer program product, and humanoid robot for executing the method
CN108470188B (en) Interaction method based on image analysis and electronic equipment
JP7371135B2 (en) Speaker recognition using speaker specific speech models
KR102418558B1 (en) English speaking teaching method using interactive artificial intelligence avatar, device and system therefor
Katayama et al. Situation-aware emotion regulation of conversational agents with kinetic earables
CA2835368A1 (en) System and method for providing a dialog with a user
Catania et al. CORK: A COnversational agent framewoRK exploiting both rational and emotional intelligence
Ritschel et al. Multimodal joke generation and paralinguistic personalization for a socially-aware robot
US20220253609A1 (en) Social Agent Personalized and Driven by User Intent
US11682318B2 (en) Methods and systems for assisting pronunciation correction
WO2021131737A1 (en) Information processing device, information processing method, and information processing program
JP7029351B2 (en) How to generate OOS text and the device that does it
Boonstra Introduction to conversational AI
Planet et al. Children’s emotion recognition from spontaneous speech using a reduced set of acoustic and linguistic features
US20220180762A1 (en) Computer assisted linguistic training including machine learning
WO2021186525A1 (en) Utterance generation device, utterance generation method, and program
JP2021149664A (en) Output apparatus, output method, and output program
JP6176137B2 (en) Spoken dialogue apparatus, spoken dialogue system, and program
DeMara et al. Towards interactive training with an avatar-based human-computer interface
WO2021064947A1 (en) Interaction method, interaction system, interaction device, and program
Feng et al. A platform for building mobile virtual humans
López et al. Lifeline dialogues with roberta

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20907462

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20907462

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP