WO2022085474A1

WO2022085474A1 - Information processing method

Info

Publication number: WO2022085474A1
Application number: PCT/JP2021/037243
Authority: WO
Inventors: 洋矢羽田; 孝啓西; 正真遠間; 敏康杉尾; ジャンジャスパーヴァンデンバーグ; デイビッドマイケルデュフィー; バーナデットエリオットボウマン
Original assignee: パナソニックＩｐマネジメント株式会社
Priority date: 2020-10-20
Filing date: 2021-10-07
Publication date: 2022-04-28

Abstract

This information processing method is for an information provision system for communicating with a user, and includes: acquiring first speech information including a first question from a user together with a device ID; acquiring communication policy information including a hearing level or a sight level corresponding to the user; and specifying an output format corresponding to the hearing level or the sight level and outputting, to a speaker or a display, first answer information indicating a first answer to the first question.

Description

Information processing method

This disclosure relates to an information processing method in an information providing system including a device for communicating with a user.

Patent Document 1 discloses a method in which, at the time of communication, the knowledge level of the user transmitting information is compared with the knowledge level of the user receiving it. When the receiving user's knowledge level is lower than the transmitting user's, the event for which the receiving user's knowledge level is highest is selected, the quantitative value of the degree of the input event is converted into a quantitative value of the degree of the selected event, and the result is transmitted to the receiving user.

Patent Document 1: Japanese Unexamined Patent Publication No. 2014-71711

In an information providing system including a device that communicates with a user, it is desired to realize communication according to the characteristics of the user.

The information processing method according to one aspect of the present disclosure is an information processing method in an information providing system including a device that communicates with a user, the device including a microphone and a speaker. The method includes: acquiring first voice information, which is acquired by the microphone of the device and includes a first question from the user, together with a device ID that identifies the device; acquiring, based on the device ID, communication policy information including a hearing level corresponding to the user, where the hearing level is set for the user, via the user's information communication terminal or the device, as an output format for output from the speaker, and is set in correspondence with a user ID that identifies the user, and the user ID is associated with the device ID in the information providing system; generating first answer information indicating a first answer to the first question; and specifying an output format corresponding to the hearing level and outputting the first answer information to the speaker of the device. The output format includes a volume, and the volume when the hearing level is a first level is higher than the volume when the hearing level is a second level higher than the first level.

According to one aspect of the present disclosure, in an information providing system including a device for communicating with a user, communication according to the characteristics of the user can be realized.

A diagram showing an example of the overall configuration of the information system according to the embodiment of the present disclosure.
A block diagram showing an example of the configuration of the information system according to the embodiment of the present disclosure.
A table showing an example of the correlation between means of communication and the related communication abilities.
A table showing an example of the correlation between the communication policy and the user's language level.
A table showing an example of the correlation between the communication policy and the user's knowledge level.
A table showing an example of the correlation between the communication policy and the user's visual level.
A table showing an example of the correlation between the communication policy and the user's hearing level.
A table showing an example of the response method according to communication ability.
A sequence diagram showing an example of performing the initial setting of communication.
A sequence diagram showing an example of performing the initial setting of communication.
A table showing various parameters set in the initial setting.
A sequence diagram showing an example of the response method for answering a question from a user.
A sequence diagram showing an example of the response method for answering a question from a user.
A flowchart showing an example of responding according to the communication policy.
A sequence diagram showing an example of asking the user a question and measuring communication ability from the answer.
A flowchart showing an example of asking the user a question and updating communication ability from the answer.
An example of a table used when asking the user a question and updating communication ability from the answer.
An example of a table used when the user's language level is subdivided into multiple languages and managed.
An example of a table used when the user's knowledge level is subdivided into multiple fields and managed.
A sequence diagram showing an example of performing the initial setting of communication.
A sequence diagram showing an example of the response method for answering a question from a user.
A sequence diagram showing an example of asking the user a question and measuring communication ability from the answer.
A sequence diagram showing an example of estimating the user's degree of understanding from a communication and intervening with support as appropriate.
A flowchart showing an example of estimating the user's degree of understanding from a communication and intervening with support as appropriate.
A sequence diagram showing an example of summarizing before the end of a communication.
A sequence diagram showing an example of summarizing after the end of a communication.
A sequence diagram showing an example of summarizing only for the user after the end of a communication.
A sequence diagram showing an example of summarizing only for the user after the end of a communication.
An example of a table used to support the user's task execution.
A sequence diagram showing an example of supporting the user's task execution.

(Background to this disclosure)
We receive a great deal of information in our daily lives, and we also send out a great deal. Whether a user can correctly understand information received from television, magazines, or conversations with other people depends largely on the content of the information, how it is expressed, how it is communicated, and the user's perceptual abilities. In reality, however, it is difficult for the user receiving the information to indicate clearly whether or not it has been understood. There are various reasons for this. For example, the user may not have enough time to ask questions because of time constraints; the user may be unfamiliar with the language used and unable to keep up with the conversation; the user may lack the prerequisite knowledge about what is being said; or the user's sight or hearing may have weakened so that the information cannot be fully perceived.

The lack of understanding of the recipient of these communications can even lead to major life-threatening problems because the necessary information did not reach the people who needed it.

In addition, there are not only issues related to information exchange in daily life, but also tasks that must be carried out on a daily basis. However, there are many tasks in daily life, and it is difficult to carry out all of them reliably.

The problems that can occur in carrying out such daily life may become even greater in the future due to the decline in cognitive ability due to aging and the lack of language ability required by internationalization.

The communication support device according to one aspect of the present disclosure is an information communication terminal that communicates with a user. The terminal records a recommended method (policy) for smooth communication with the user, which is determined by continuously measuring the user's language level, knowledge level, visual level, and/or auditory level. The terminal includes a sensor that detects a first communication newly received by the user. The first communication received by the user is compared with the recommended method (policy). When it is determined that the first communication does not match the recommended method (policy), individual communication for the user is generated by changing at least one of the language expression method, the required knowledge system, the display method of character information, and the output method of voice information of the first communication detected by the sensor, and the individual communication is transmitted to the user.

As a result, the user's understanding of a received communication can be supported according to the user's abilities in any of the following cases: the linguistic expression of the information constituting the communication is difficult for the user; the user lacks the prerequisite knowledge needed to understand the information; the user's sight is insufficient to recognize the character information contained in the communication; or the user's hearing is insufficient to recognize the voice information contained in the communication. The support follows the recommended method (policy) for smooth communication that is normally used between the user and the information communication terminal: the language expression is simplified, the prerequisite knowledge is simplified, character information is displayed in a form that is easier to recognize visually, and voice information is output in a form that is easier to hear.

The information processing method according to the first aspect of the present disclosure is an information processing method in an information providing system including a device that communicates with a user, the device including a microphone and a speaker. The method includes: acquiring first voice information, which is acquired by the microphone of the device and includes a first question from the user, together with a device ID that identifies the device; acquiring, based on the device ID, communication policy information including a hearing level corresponding to the user, where the hearing level is set for the user, via the user's information communication terminal or the device, as an output format for output from the speaker, and is set in correspondence with a user ID that identifies the user, and the user ID is associated with the device ID in the information providing system; generating first answer information indicating a first answer to the first question; and specifying an output format corresponding to the hearing level and outputting the first answer information to the speaker of the device. The output format includes a volume, and the volume when the hearing level is a first level is higher than the volume when the hearing level is a second level higher than the first level. Here, the first level and the second level of the hearing (auditory) level may be referred to as a first auditory level and a second auditory level, respectively.

According to the above configuration, when a user asks a question to the device, the answer to the question can be output from the speaker of the device in an output format according to the user's hearing level. As a result, it is possible to provide the answer to the user in an output format that the user can listen to more easily than in the case where the answer is provided in a uniform output format.

In the information processing method of the second aspect according to the first aspect, the output format may further include at least one of speed and clarity.
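The lookup chain described in the first and second aspects (device ID, then user ID, then hearing level, then output format) can be illustrated with a minimal Python sketch. All identifiers, the three-level scale, and the numeric volume and speed values below are assumptions made for illustration; the disclosure only requires that the volume at a lower hearing level be higher than at a higher hearing level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerFormat:
    volume: int   # assumed 0-100 scale
    speed: float  # assumed speech-rate multiplier (1.0 = normal)


# Hypothetical associations held by the information providing system.
DEVICE_TO_USER = {"device-001": "user-A"}
USER_HEARING_LEVEL = {"user-A": 1}  # a lower level means weaker hearing

# Hypothetical table: output format for each hearing level.
FORMAT_BY_HEARING_LEVEL = {
    1: SpeakerFormat(volume=90, speed=0.8),
    2: SpeakerFormat(volume=70, speed=0.9),
    3: SpeakerFormat(volume=50, speed=1.0),
}


def speaker_format_for_device(device_id: str) -> SpeakerFormat:
    """Resolve device ID -> user ID -> hearing level -> output format."""
    user_id = DEVICE_TO_USER[device_id]
    hearing_level = USER_HEARING_LEVEL[user_id]
    return FORMAT_BY_HEARING_LEVEL[hearing_level]
```

In this sketch the answer text itself is unchanged; only the way it is played back from the speaker depends on the hearing level.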

The information processing method according to the third aspect of the present disclosure is an information processing method in an information providing system including a device that communicates with a user, the device including a microphone and a display. The method includes: acquiring first voice information, which is acquired by the microphone of the device and includes a first question from the user, together with a device ID that identifies the device; acquiring, based on the device ID, communication policy information including a visual level corresponding to the user, where the visual level is set for the user, via the user's information communication terminal or the device, as an output format for output from the display, and is set in correspondence with a user ID that identifies the user, and the user ID is associated with the device ID in the information providing system; generating first answer information indicating a first answer to the first question; and specifying an output format corresponding to the visual level and outputting the first answer information to the display of the device. The output format includes a display size of characters, and the display size of characters when the visual level is a first level is larger than the display size of characters when the visual level is a second level higher than the first level. Here, the first level and the second level of the visual level may be referred to as a first visual level and a second visual level, respectively.

According to the above configuration, when a user asks a question to the device, the answer to the question can be output from the display of the device in an output format according to the user's visual level. As a result, the answer can be provided to the user in an output format that can be more easily viewed by the user, as compared with the case where the answer is provided in a uniform output format.

In the information processing method of the fourth aspect according to the third aspect, the output format may further include at least one of modification of character edging, color arrangement of characters, and arrangement of characters.
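As a rough illustration of the third and fourth aspects, the following Python sketch derives a display format from a visual level. The function name, the point sizes, the linear step, and the rule for when outlining and high-contrast colors are applied are all hypothetical; the only property taken from the text is that a lower visual level yields a larger character size.

```python
def display_format(visual_level: int) -> dict:
    """Return an assumed display format for a visual level (1 = weakest sight)."""
    if visual_level < 1:
        raise ValueError("visual_level must be 1 or greater")
    # Assumed rule: start at 36 pt and shrink by 8 pt per level, never below 12 pt.
    font_size_pt = max(12, 36 - 8 * (visual_level - 1))
    return {
        "font_size_pt": font_size_pt,
        # Assumed rule: edging and high-contrast colors only for the lowest levels.
        "outlined": visual_level <= 2,
        "high_contrast": visual_level <= 2,
    }
```

For example, under these assumptions `display_format(1)` asks for 36 pt outlined text, while `display_format(4)` asks for plain 12 pt text.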

In the information processing method of the fifth aspect according to any one of the first to fourth aspects, the communication policy information may further include a knowledge level corresponding to the user. The knowledge level is a response level set for the user via the user's information communication terminal or the device, and is set in correspondence with the user ID. The first answer information may be generated according to the knowledge level, based on the first voice information and the communication policy information.

According to the above configuration, when a user asks a question to the device, an answer according to the user's knowledge level can be output from the speaker of the device. This allows the user to be provided with a personalized answer created for the user. As a result, it is possible to provide the user with an answer that is easier for the user to understand than when a uniformly created general answer is provided.

In the information processing method of the sixth aspect according to the fifth aspect, the knowledge level may be set for each field. Further, the field may include at least one of social common sense, formal science, natural science, social science, humanities, and applied science.

In the information processing method of the seventh aspect according to the fifth aspect, a first number of technical terms included in a 1-1 answer to the first question when the knowledge level is a first level may be less than a second number of technical terms included in a 1-2 answer to the first question when the knowledge level is a second level higher than the first level. Here, the first level and the second level of the knowledge level may be referred to as a first knowledge level and a second knowledge level, respectively.

In the information processing method of the eighth aspect according to the fifth aspect, an average first length of the sentences included in the 1-1 answer to the first question when the knowledge level is the first level may be shorter than an average second length of the sentences included in the 1-2 answer to the first question when the knowledge level is the second level higher than the first level.

In the information processing method of the ninth aspect according to the fifth aspect, a first total number of characters of the 1-1 answer to the first question when the knowledge level is the first level may be less than a second total number of characters of the 1-2 answer to the first question when the knowledge level is the second level higher than the first level.
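The three relationships in the seventh to ninth aspects (fewer technical terms, shorter average sentences, and fewer total characters at the lower knowledge level) can be checked mechanically. The Python sketch below is illustrative only: the glossary, the two sample answers, and the helper names are hypothetical and do not come from the disclosure.

```python
import re

# Hypothetical glossary of technical terms.
TECHNICAL_TERMS = {"hypertension", "systolic", "antihypertensive"}

# Hypothetical answers to the same question, keyed by knowledge level.
ANSWERS = {
    1: "Your blood pressure is high. Take your medicine every day.",
    2: ("You have hypertension because your systolic pressure is elevated, "
        "so continue the antihypertensive medication daily."),
}


def count_technical_terms(text: str) -> int:
    """Count words in the text that appear in the glossary."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for word in words if word in TECHNICAL_TERMS)


def average_sentence_length(text: str) -> float:
    """Average number of characters per sentence (split on ., ! and ?)."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(len(s) for s in sentences) / len(sentences)


def select_answer(knowledge_level: int) -> str:
    """Pick the prepared answer for the user's knowledge level."""
    return ANSWERS[knowledge_level]
```

A system that generates answers dynamically could use the same three measurements to verify that an answer produced for a lower knowledge level is in fact simpler than the one produced for a higher level.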

In the information processing method of the tenth aspect according to the fifth aspect, the communication policy information may further include a language level corresponding to the user. The language level is set for the user via the user's information communication terminal or the device, and is set in correspondence with the user ID. The first answer information may be generated according to the knowledge level and the language level.

In the information processing method of the eleventh aspect according to the fifth aspect, second voice information including a second question may be output to the device so that it is output to the user from the speaker of the device. Unlike the first question, the second question is used for updating the knowledge level. Second answer information indicating the user's second answer to the second question may be acquired from the device, and the knowledge level may be updated based on the second answer information.

In the information processing method of the twelfth aspect according to the fifth aspect, the knowledge level may be updated based on the correctness of the second answer.

In the information processing method of the thirteenth aspect according to the fifth aspect, the knowledge level may be updated based on the time required from the output of the second voice information to the acquisition of the second answer information.
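The eleventh to thirteenth aspects state that the knowledge level may be updated from the correctness of the second answer and from the time taken to answer, without specifying a rule. The following Python sketch shows one possible rule purely as an assumption: the one-step adjustment, the five-second threshold, and the 1-to-5 level range are hypothetical.

```python
def update_knowledge_level(
    level: int,
    correct: bool,
    response_seconds: float,
    *,
    fast_threshold: float = 5.0,  # assumed cutoff for a "quick" answer
    min_level: int = 1,           # assumed lowest level
    max_level: int = 5,           # assumed highest level
) -> int:
    """Return the updated knowledge level after one check question."""
    if not correct:
        level -= 1  # a wrong answer lowers the level
    elif response_seconds <= fast_threshold:
        level += 1  # a quick, correct answer raises the level
    # a correct but slow answer leaves the level unchanged
    return max(min_level, min(max_level, level))
```

The clamping at the end keeps the level within the assumed range, so repeated correct answers at the top level, or repeated wrong answers at the bottom level, have no further effect.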

An information processing method according to another aspect of the present disclosure is an information processing method in an information providing system including a device that communicates with a plurality of users including a first user and a second user, the device including a microphone and a speaker. The method includes: acquiring voice information, which is acquired by the microphone of the device and includes a question, together with a device ID that identifies the device; and determining, using the voice information and a speaker identification database, whether the question is from the first user or the second user. The speaker identification database manages a feature amount of the first user's voice in association with a first user ID that identifies the first user, and a feature amount of the second user's voice in association with a second user ID that identifies the second user. The first user ID and the second user ID are associated with the device ID in the information providing system. (i) When the question is determined to be from the first user, first communication policy information including a first knowledge level corresponding to the first user is acquired. The first knowledge level is a response level set for the first user via the first user's information communication terminal or the device, and is associated with the first user ID. First answer information indicating an answer to the question is generated according to the first knowledge level, based on the voice information and the first communication policy information, and the first answer information is output to the speaker of the device. (ii) When the question is determined to be from the second user, second communication policy information including a second knowledge level corresponding to the second user is acquired. The second knowledge level is a response level set for the second user via the first user's information communication terminal, the second user's information communication terminal, or the device, and is associated with the second user ID. Second answer information indicating an answer to the question is generated according to the second knowledge level, based on the voice information and the second communication policy information, and the second answer information is output to the speaker of the device.

According to the above configuration, when the first user or the second user asks the device a question, it is determined which of the two asked it, and a different answer can be provided to each user according to that user's knowledge level. In other words, even if the first user and the second user ask the same question, the answer can be changed for each user according to that user's knowledge level. As a result, each user can be given an answer that is easier for that user to understand than a uniformly created general answer.

Further, according to the above configuration, the voice information is acquired together with the device ID, and the first user ID and the second user ID are associated with the device ID in the information providing system. Therefore, when determining who the speaker is using the voice information and the speaker identification database, the device ID can be used to extract the users associated with that device ID as candidate speakers. Because the data requiring collation can be narrowed down in this way, the speaker can be identified efficiently without collating against the voice feature amounts of all users managed in the speaker identification database.
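The narrowing of candidate speakers by device ID can be sketched in Python as follows. The three-element feature vectors, the user and device identifiers, and the use of cosine similarity as the matching score are assumptions for illustration; the disclosure does not specify how voice feature amounts are represented or compared.

```python
import math

# Hypothetical speaker identification database: user ID -> voice feature vector.
SPEAKER_DB = {
    "user-1": [0.9, 0.1, 0.0],
    "user-2": [0.1, 0.9, 0.1],
    "user-3": [0.9, 0.1, 0.05],  # associated with a different device
}

# Hypothetical association of device IDs with user IDs.
DEVICE_USERS = {
    "device-001": ["user-1", "user-2"],
    "device-002": ["user-3"],
}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def identify_speaker(device_id: str, voice_features: list[float]) -> str:
    """Compare only against users associated with the device, not the whole database."""
    candidates = DEVICE_USERS[device_id]
    return max(
        candidates,
        key=lambda user_id: cosine_similarity(SPEAKER_DB[user_id], voice_features),
    )
```

Here a voice captured by `device-001` is never compared with `user-3`, even though that user's features are similar to those of `user-1`, which reflects the reduced collation workload described above.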

An information processing method according to another aspect of the present disclosure is an information processing method in an information providing system including a device that communicates with a user, the device including a microphone and a speaker. The method includes: acquiring first voice information, which is acquired by the microphone of the device and includes a first question from the user, together with a device ID that identifies the device; acquiring, based on the device ID, communication policy information including a knowledge level corresponding to the user, where the knowledge level is a response level set for the user via the user's information communication terminal or the device and is set in correspondence with a user ID that identifies the user, and the user ID is associated with the device ID in the information providing system; generating, based on the first voice information and the communication policy information, first answer information indicating a first answer to the first question according to the knowledge level; and outputting the first answer information to the speaker of the device.

A device according to another aspect of the present disclosure is a device that communicates with a user visually or auditorily. The device records a recommended method (policy) for smooth communication with the user, which is determined by measuring or setting at least one of the user's language level, knowledge level, visual level, and auditory level. By communicating with the user according to the recommended method (policy), the device achieves smooth communication that is personalized for the user and easy for the user to understand.

The present disclosure can also be realized as a program that causes a computer to execute each characteristic configuration included in such an information processing method, or as a communication support system operated by this program. Needless to say, such a computer program can be distributed via a computer-readable non-transitory recording medium such as a CD-ROM, or via a communication network such as the Internet.

Note that all of the embodiments described below show a specific example of the present disclosure. The numerical values, shapes, components, steps, order of steps, etc. shown in the following embodiments are examples, and are not intended to limit the present disclosure. Further, among the components in the following embodiments, the components not described in the independent claim indicating the highest level concept are described as arbitrary components. Moreover, in all the embodiments, each content can be combined.

(Embodiment)
It is expected that the Internet will continue to spread through our society and that various sensors will become commonplace. As a result, it is expected that our society will be able to digitize information on the conditions and activities of individuals, as well as information on entire cities, including buildings and transportation networks, and make use of it on computer systems. Digitized personal data (personal information) is stored in the cloud via a communication network, managed by an information bank as big data, and used for various purposes for individuals.

Such an advanced information society is called Society 5.0 in Japan. It is a society in which both economic development and the resolution of social issues are expected to be achieved through an information infrastructure (cyber-physical system) that highly integrates real space (physical space), which is the material world surrounding individuals, and virtual space (cyberspace), in which computers cooperate with each other to perform various processes related to the physical space.

In such a highly information-oriented society, by analyzing the communication (including the acquisition and provision of information and the way it is expressed) and behavior of individuals in various daily situations, and by analyzing big data including accumulated personal information, it becomes possible to provide each individual with the information he or she needs, using the communication method considered most suitable for that individual in each situation.

In the following, on the premise of an advanced information society in which such a cyber-physical system operates, and with the theme of supporting daily life in a way that stays close to the individual, modes for comprehensively individualizing communication and for supporting the tasks of daily life are described.

FIG. 1 is a diagram showing an example of the overall configuration of the information system according to the embodiment of the present disclosure. In FIG. 1, the upper half shows the cyberspace containing the clouds, and the lower half shows the physical space containing people and things. In this disclosure, the objects related to the user, who is the main recipient of assistance with communication and task execution in daily life, are placed in the center of the figure. Objects related to another user who has frequent daily contact with the user (hereinafter referred to as a cohabitant, to distinguish this person from the user) are placed at the left end. Objects related to a communication partner (a doctor in the figure) with whom the user communicates for a specific purpose in social life are placed on the right side.

The user uses a personal information communication terminal 99, an information communication terminal 100 that can also be used by a person other than the user, such as a cohabitant, and an information source 102. The information communication terminal 99 may be, for example, a smartphone owned by the user or a personal computer. The information communication terminal 100 may be, for example, a robot that communicates with the user in the user's living space, a smart speaker, or a wearable device such as a smart watch, smart glasses, hearables, or smart clothes. It may also be a smartphone or a personal computer used by the user. The information communication terminal 100 may have a function of communicating with the user by voice, a function of communicating with the user by displaying video or characters, and a function of communicating with the user by using touch, gestures, facial expressions, or the like.

The information source 102 is, for example, a television or a magazine, which is a tool for a user to acquire information on a daily basis. Users live every day while using information communication terminals (99, 100) and information sources 102. Each of these information communication terminals is connected to a cloud 101 that stores and manages user information and information about the device via a wide area communication network.

On the other hand, the cohabitant has a personal information communication terminal 99 and uses the information communication terminal 100 and the information source 102 together with the user. The information communication terminal 100 and the information source 102 described as shared assets in this figure are things that may be used by both the user and the cohabitant. For example, the information communication terminal 100 may be used by a user or by a cohabitant.

When the doctor diagnoses the user's illness, the doctor inputs the diagnosis result as a medical record using the information communication terminal 110. The input medical record information is stored and managed in the cloud 111, which stores medical record information and is connected to the information communication terminal 110 by the wide area communication network. Here, a doctor is used as an example of a communication partner, but this disclosure is not limited to this; the partner may be any person who communicates directly with the user, such as a lawyer, a police officer, a friend, or a neighbor.

FIG. 2 is a block diagram showing an example of the configuration of the information system according to the embodiment of the present disclosure. The information communication terminals (99, 100) respectively include: sensors 203, 213 and video/audio output units 206, 217 for communicating with the user by video information and voice information; operation units 205, 216 that accept button presses and touch operations from the user; arithmetic units 202, 212 that perform information processing carried out in the information communication terminals (99, 100), such as voice recognition, voice synthesis, information retrieval, and information drawing; memories 204, 215 that hold data used by the arithmetic units 202, 212; and communication units 201, 211 for performing information communication with computers on the network. Further, when the information communication terminal 100 takes the form of a robot, it may express gestures and facial expressions, for smooth communication with the user, by means of the movable unit 214 and the video/audio output unit 217.

The cloud 101 that manages user information includes a communication unit 221 for performing information communication with computers on the network, a memory 223 that records user information, and a calculation unit 222 that performs information processing such as shaping and outputting user information in response to external requests.

Similarly, the cloud 111 that manages the information handled by the other party includes a communication unit 241 for performing information communication with computers on the network, a memory 243 that records the information handled by the other party, and a calculation unit 242 that performs information processing such as inputting and outputting the information handled by the other party in response to external requests.

The information communication terminal 110 handled by the other party includes an operation unit 234 that accepts button presses and touch operations from the other party, an arithmetic unit 232 that performs information processing such as information retrieval and information drawing performed in the information communication terminal 110, a memory 233 that holds data used by the arithmetic unit 232, and a communication unit 231 for performing information communication with computers on the network.

The description now returns to FIG. 1. When a smartphone is used as the information communication terminal 100, the user communicates via an application running on the terminal. For example, the application may display the history of communication with the user as text, as in a short chat-style conversation, and may display an avatar as the conversation partner. The robot described above is a physical device that actually exists in physical space, whereas the avatar drawn by the application is graphic data displayed on the screen of the smartphone (video/audio output unit 217). Although the forms differ, there is no essential difference in how they communicate with the user, so the information communication terminal 100 is described below as a robot.

Further, the information communication terminal 100 communicates with the cloud 101 that manages user information via a network that is a wide area communication network. The cloud 101 that manages user information manages an index for specifying a user's communication ability, a conversation history, and a user's daily task, which will be described later.

From the information sources 102 that the user encounters in daily life, such as television and magazines, the user may come across things that the user could not understand or wants to know more about. The user can ask the information communication terminal (99, 100) that the user uses daily about such things, or search for them using the information communication terminal (99, 100).

In the embodiment of the present disclosure, the information communication terminal 100 and the cloud 101 that manages user information are described as separate entities, but the present disclosure is not limited to this. The user information may be stored inside the information communication terminal 100, without using the cloud 101 that manages user information. In this case, since the user information is not acquired from the cloud 101 via the network, the terminal may be usable even where communication is unavailable, and the risk of leakage of the user information may be reduced.

Meanwhile, the other party (doctor) who communicates with the user is described using the example in the figure. The doctor, as the user's communication partner, examines the user and inputs the examination result via the information communication terminal 110, which is a medical record input terminal. The medical record is managed, as information handled by the other party, in the cloud 111, which is a medical record storage cloud.

The cloud 101 that manages user information communicates with the medical record storage cloud 111 and can acquire or access information on the user's own examination results, prescriptions, and medical history. If the cloud 101 that manages user information is regarded as an information bank, this corresponds to a Society 5.0 type world in which individuals have autonomy and initiative over their personal information and can utilize their own personal information (personal data) for the purpose of returning value to themselves or to society. Such a system uses personal authentication technology based on biometric information, technology for preventing leakage of personal information through distributed encrypted management, and secure computation technology that allows arithmetic processing on data while it remains encrypted, so that the data can serve a wide range of needs while the security of personal information is maintained. The details of those technologies are omitted in this disclosure.

Hereinafter, a mode is described in which the user communicates smoothly with the other party (for example, a doctor) via the information communication terminal 100, which is well acquainted with the user's communication ability. In addition to supporting communication with others, a mode is also described in which the information communication terminal 100, taking advantage of its position as a device the user uses daily, stays close to the user and supports the execution of the user's daily tasks.

FIG. 3 is a table showing an example of the correlation between the means of communication between the user and the information communication terminal 100 and the communication ability of the related user.

As shown in the top row of the table, when the information communication terminal 100 to be used is a robot, an AI speaker mainly for voice communication, an earphone for hearing support, or the like, voice may be used as the means of communication with the user. When only voice is used as the means of communication, the elements related to the specific communication method recommended when the information communication terminal 100 communicates with the user (hereinafter referred to as a communication policy, or simply a policy) are the user's language level, knowledge level, and auditory level (the communication ability elements marked Yes in the table). The user's visual level is not a relevant communication ability element in this case, so it is marked No in the table.

It can be said that the policy is a guideline on how to express the information to be conveyed, which should be observed in order to smoothly convey certain information to the target user.

Similarly, as shown in the second row of the table, when the information communication terminal 100 to be used is a smartphone, an AI speaker with a monitor, a system in which an AI speaker and a television are linked, or the like, voice and text may be used as the means of communication with the user. In this case, the elements relevant to the communication policy of the information communication terminal 100 are the user's language level, knowledge level, visual level, and auditory level.

Similarly, as shown in the bottom row of the table, when the information communication terminal 100 to be used is a smartphone, a smart watch, a television, or the like, only text (characters) may be used as the means of communication with the user. In this case, the elements relevant to the communication policy of the information communication terminal 100 are the user's language level, knowledge level, and visual level.

As described above, the means of communication that can be used differ depending on the form of the information communication terminal 100 to be used and the functions to be used. Accordingly, the policy for communication between the user and the information communication terminal 100 is determined by the user's language level and knowledge level, together with the visual level and/or the auditory level, whichever applies to the means of communication used.
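By way of illustration only, the correspondence of FIG. 3 described above can be sketched in Python as a lookup from the means of communication to the communication ability elements that affect the policy. The dictionary keys, element names, and function names below are assumptions introduced for this sketch and do not appear in the disclosure.

```python
# Communication ability elements relevant to the policy for each means of
# communication, following the three rows described for FIG. 3.
# Key strings and element names are illustrative assumptions.
RELEVANT_ELEMENTS = {
    "voice": {"language", "knowledge", "hearing"},
    "voice+text": {"language", "knowledge", "visual", "hearing"},
    "text": {"language", "knowledge", "visual"},
}


def relevant_elements(means: str) -> set:
    """Return the ability elements that affect the policy for this means."""
    return RELEVANT_ELEMENTS[means]


def is_relevant(means: str, element: str) -> bool:
    """True where the table would say Yes, False where it would say No."""
    return element in relevant_elements(means)
```

For example, `is_relevant("voice", "visual")` is false, matching the No entry for the visual level in the voice-only row.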

FIG. 4 is a table showing an example of the correlation between the communication policy and the language level of the user. As shown here, the language level is divided into multiple stages, such as 1 to 5. Language level 1 indicates the least linguistic knowledge and language level 5 the most. Therefore, when the language level is 1, the policy is set to use only basic, frequently used words. Conversely, when the language level is 5, the policy is set to use a variety of words used in a wide range of situations.

For example, the words used at each language level may be set so that a higher language level requires more linguistic knowledge: words learned by first-grade elementary school students at language level 1, by third-grade elementary school students at language level 2, by sixth-grade elementary school students at language level 3, by third-year junior high school students at language level 4, and by third-year high school students at language level 5.

FIG. 5 is a table showing an example of the correlation between the communication policy and the knowledge level of the user. As shown here, the knowledge level is divided into a plurality of stages, such as 1 to 5. Knowledge level 1 indicates the least knowledge and knowledge level 5 the most. Therefore, when the knowledge level is 1, the policy is set to use only basic, frequently used knowledge. Conversely, when the knowledge level is 5, the policy is set to use a variety of knowledge used in a wide range of situations.

For example, the knowledge used at each knowledge level may be set so that a higher knowledge level requires a greater amount of knowledge: knowledge learned by first-grade elementary school students at knowledge level 1, by third-grade elementary school students at knowledge level 2, by sixth-grade elementary school students at knowledge level 3, by third-year junior high school students at knowledge level 4, and by third-year high school students at knowledge level 5. In addition, the higher the knowledge level of the policy applied to an answer, the greater the number of technical terms contained in the answer may be, the longer the average length of the sentences contained in the answer may be, or the greater the total number of characters in the answer may be.
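As a minimal sketch of how the grade-based levels above might be applied, the following Python code keeps only terms whose level does not exceed the user's level. The per-term level tags in `TERM_LEVELS` and the function name are hypothetical; the disclosure does not say how individual words or items of knowledge are tagged.

```python
# School grade corresponding to each level, as in the examples given for
# the language level and the knowledge level.
GRADE_BY_LEVEL = {
    1: "elementary school, 1st grade",
    2: "elementary school, 3rd grade",
    3: "elementary school, 6th grade",
    4: "junior high school, 3rd year",
    5: "high school, 3rd year",
}

# Hypothetical tags giving the lowest level at which each term may be used.
TERM_LEVELS = {"sick": 1, "gene": 2, "virus": 4, "amplify": 4}


def usable_terms(term_levels: dict, user_level: int) -> list:
    """Return the terms that a user at user_level is expected to know."""
    if user_level not in GRADE_BY_LEVEL:
        raise ValueError(f"level must be 1 to 5, got {user_level}")
    return [term for term, level in term_levels.items() if level <= user_level]
```

With these assumed tags, a user at level 2 would be limited to "sick" and "gene", while a user at level 4 could also be given "virus" and "amplify".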

FIG. 6 is a table showing an example of the correlation between the communication policy and the visual level of the user. As shown here, the visual level is divided into a plurality of stages, such as 1 to 5. Visual level 1 indicates the lowest visual cognitive ability and visual level 5 the highest. Therefore, when the visual level is 1, the policy is set so that characters and sentences are displayed in the design with the highest visibility. The design here refers to the display size of characters, the decoration of character outlines, the color scheme of characters, the arrangement of characters, and the like. Conversely, when the visual level is 5, the policy is set to display characters using a general design.

For example, the character design used at each visual level may be set so that the displayed characters become larger in stages as the visual level decreases: a general design (character size = 16) at visual level 5, a slightly more visible design (character size = 22) at visual level 4, a highly visible design (character size = 26) at visual level 3, a more highly visible design (character size = 32) at visual level 2, and the most visible design (character size = 36) at visual level 1.
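The example character sizes above can be expressed, purely as an illustrative sketch, as a lookup table in Python. The sizes are the ones given in the text; the constant and function names are assumptions.

```python
# Character size for each visual level, using the example values in the text.
CHARACTER_SIZE_BY_VISUAL_LEVEL = {5: 16, 4: 22, 3: 26, 2: 32, 1: 36}


def character_size(visual_level: int) -> int:
    """Return the display character size for a visual level of 1 to 5."""
    if visual_level not in CHARACTER_SIZE_BY_VISUAL_LEVEL:
        raise ValueError(f"visual level must be 1 to 5, got {visual_level}")
    return CHARACTER_SIZE_BY_VISUAL_LEVEL[visual_level]
```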

FIG. 7 is a table showing an example of the correlation between the communication policy and the user's hearing level. As shown here, the hearing level is divided into a plurality of stages, such as 1 to 5. Hearing level 1 indicates the lowest auditory cognitive ability and hearing level 5 the highest. Therefore, when the hearing level is 1, the policy is set to output the voice at the slowest speed, with the greatest clarity, and/or at the loudest volume. Clarity here means that the pronunciation is distinct, with few ambiguous sounds, so that the speech is easy to hear. Conversely, when the hearing level is 5, the policy is set to output the voice at a general speed, clarity, and volume.

For example, the voice output used at each hearing level may be set so that, as the hearing level decreases, the voice is output more slowly and/or at a louder volume in stages so that it is easier to hear: voice output with general audibility (speech speed = 100 words / minute, volume = 50 dB) at hearing level 5, slightly easier-to-hear voice output (speech speed = 90 words / minute, volume = 60 dB) at hearing level 4, easy-to-hear voice output (speech speed = 80 words / minute, volume = 65 dB) at hearing level 3, easier-to-hear voice output (speech speed = 70 words / minute, volume = 70 dB) at hearing level 2, and the easiest-to-hear voice output (speech speed = 60 words / minute, volume = 75 dB) at hearing level 1.
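As an illustrative sketch only, the example speech speeds and volumes above can be held in a small Python table keyed by hearing level. The values are those given in the text; the class, constant, and function names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechFormat:
    words_per_minute: int  # speech speed
    volume_db: int         # output volume in dB


# Output format for each hearing level, using the example values in the text.
SPEECH_FORMAT_BY_HEARING_LEVEL = {
    5: SpeechFormat(100, 50),
    4: SpeechFormat(90, 60),
    3: SpeechFormat(80, 65),
    2: SpeechFormat(70, 70),
    1: SpeechFormat(60, 75),
}


def speech_format(hearing_level: int) -> SpeechFormat:
    """Return the speech speed and volume for a hearing level of 1 to 5."""
    if hearing_level not in SPEECH_FORMAT_BY_HEARING_LEVEL:
        raise ValueError(f"hearing level must be 1 to 5, got {hearing_level}")
    return SPEECH_FORMAT_BY_HEARING_LEVEL[hearing_level]
```

In this table a lower hearing level always yields a slower speech speed and a louder volume, which matches the staged behavior described above.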

FIG. 8 is a table showing specific examples of communication policies between the user and the information communication terminal 100 according to the communication ability of the user. This table shows how the answer of the information communication terminal 100, which has received the question "What is a PCR test?" from the user, changes depending on the policy, that is, on the means of communication used (voice, text, video) and the user's communication ability (language level, knowledge level, visual level, hearing level).

From now on, the policy will be expressed in the following function notation using the five elements that determine it. In other words, the policy is an output value that changes according to the input value of the communication method used, the language level, the knowledge level, the visual level, and the auditory level of the user. Communication ability that is not involved in the policy is expressed as 0.

Policy = f(means of communication used, language level, knowledge level, visual level, auditory level)

The top row of the table is an example of responding to the user's question "What is a PCR test?" with policy = f(voice, 1, 1, 0, 2). Following the examples above, since the language level is 1, words learned by first-grade elementary school students are used, and since the knowledge level is 1, the answer sentence "It is a method to check if you are sick" is composed within the range of knowledge learned by first-grade elementary school students. Since the hearing level is 2, the policy is to output the voice so that it is easier to hear, at a speech speed of 70 words / minute and a volume of 70 dB.
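The function notation above can be illustrated with a small Python data class holding the five elements, with 0 standing for an ability that is not involved in the policy. This is a sketch under assumptions: the class name, field names, and validation are introduced here for illustration and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    means: str      # means of communication used, e.g. "voice"
    language: int   # language level, 1 to 5 (0 = not involved)
    knowledge: int  # knowledge level, 1 to 5 (0 = not involved)
    visual: int     # visual level, 1 to 5 (0 = not involved)
    hearing: int    # auditory level, 1 to 5 (0 = not involved)

    def __post_init__(self):
        # Reject values outside the 0 to 5 range used in the examples.
        for name in ("language", "knowledge", "visual", "hearing"):
            level = getattr(self, name)
            if not 0 <= level <= 5:
                raise ValueError(f"{name} level must be 0 to 5, got {level}")

    def __str__(self):
        return (f"f({self.means}, {self.language}, {self.knowledge}, "
                f"{self.visual}, {self.hearing})")


top_row = Policy("voice", 1, 1, 0, 2)
print(top_row)  # prints: f(voice, 1, 1, 0, 2)
```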

Similarly, the second row of the table is an example of responding to the same question with policy = f(voice, 4, 4, 0, 4). In this case, the response uses more advanced language and broader knowledge, and is spoken somewhat slowly. For example, the response is given by outputting the voice "This is a test method that amplifies the genetic information of the virus to check whether it is infected with the virus" somewhat slowly (speech speed = 90 words / minute).

Similarly, the third row of the table is an example of responding to the same question with policy = f(characters, 4, 4, 4, 0). In this case, since the language level and the knowledge level are the same as in the second row, the same answer sentence as in the second row is given, displayed in a slightly more visible design. For example, the response is given by displaying "This is a test method that amplifies the genetic information of a virus to check whether it is infected with a virus" in a slightly more visible design (character size = 22).

The same applies to the subsequent rows. In the fifth row of the table, the same question is answered with policy = f(character + video + audio, 2, 2, 2, 2). In this case, the answer sentence "This is a method to check for diseases by genes" is displayed in a more highly visible design (character size = 32, character modification = bold), and furigana is added to the kanji for easy reading. Further, a supplementary illustration video (800) explains the method of checking for diseases by genes using a simplified two-step image video.

Also, for a foreigner who is not very familiar with Japanese, the Japanese language level may be set to 1, while communication may be easier if the same policy as for a typical person is applied for the knowledge level and the visual level. The bottom row of the table is such a case. For a user who does not know the katakana notation for "virus" or the Japanese characters for "idenshi" (gene), the explanation combines the English notation (Virus) with widely known abbreviations (RNA, DNA), which makes it possible to give a response that supports understanding according to the communication ability of the user.

FIG. 9 is a sequence diagram showing an example of the initial setting of communication. In particular, it shows a flow for initially setting up the information communication terminal 100 using the user's information communication terminal 99. Hereinafter, an example in which the information communication terminal 99 is a smartphone is described, and the information communication terminal 99 is also referred to as the smartphone 99. Likewise, an example in which the information communication terminal 100 is a robot is described, and the information communication terminal 100 is also referred to as the robot 100.

In step S901, the user operates an application for setting up the robot 100, installed on the smartphone 99, to start the initial setting. The smartphone 99 opens an ad hoc communication session with the robot 100, and in step S902, the robot 100 transmits its identification information (robot ID) to the smartphone 99. This may be performed in a connection form such as the ad hoc mode of Wi-Fi (registered trademark) wireless communication, by using a beacon signal of Bluetooth (registered trademark), or by another communication method or communication mode. Alternatively, the user may input the robot ID into the smartphone 99, or a code indicating the robot ID (for example, a QR code (registered trademark)) may be read and set by the camera of the smartphone 99.

In step S903, the user newly creates a user account, and at that time registers, in the application, a user ID that serves as the identification information of the user. Further, in step S904, the user registers information (user identification information) that allows the robot 100 or the cloud 101 to recognize that the communication partner is the user when, for example, the user talks to the robot 100. The user identification information may be, for example, a wakeup word used only by the user for the robot 100, a physical feature of the user (face, fingerprint, etc.), or a voice feature of the user.

Finally, in step S905, the user registers a policy for the robot 100 to communicate with the user. This is setting information (that is, communication policy information) including at least one of the above-mentioned language level, knowledge level, visual level, and auditory level. Depending on the application that performs the initial setting, the levels may be set while confirming which level the user prefers with reference to specific communication examples, or the application may give the user a simple test and suggest a recommended level for the user to set.

When the information necessary for the initial setting of the user has been collected, in step S906, the setting application on the smartphone 99 transmits the user ID, the user identification information, the policy, and the robot ID of the robot 100 used by the user to the cloud 101 that manages user information. In step S907, the cloud 101 that has received this information records it in the memory 223 and registers the user. Then, in step S908, the cloud 101 notifies the smartphone 99 of the completion of registration.

In step S909, the user confirms via the video/audio output unit 206 of the smartphone 99 that the user registration succeeded, and then makes any additional registrations. Here, the user goes on to register a cohabitant. The initial setting of the cohabitant is performed in the same way as the user registration above. In the account setting of the cohabitant, a user ID that serves as the identification information of the cohabitant, user identification information for the robot 100 or the cloud 101 to recognize that the communication partner is the cohabitant, and a policy for the robot 100 to communicate with the cohabitant are registered in the application of the smartphone 99 (steps S910 to S915).

When the information required for the initial setting of the cohabitant has been collected, the setting application on the smartphone 99 transmits the cohabitant's user ID, user identification information, and policy, together with the robot ID of the robot used by the cohabitant, to the cloud 101 that manages user information. Upon receiving this, the cloud 101 records the information in the memory 223 and registers the cohabitant, and then notifies the smartphone 99 of the completion of registration. With this, the user completes the initial setting for the user and the cohabitant (step S916).

Registration of these users may be performed individually rather than consecutively. Further, the registration of the cohabitant may be performed by the cohabitant personally using the cohabitant's own smartphone instead of the user's smartphone. Here, since the user and the cohabitant share one robot 100, a plurality of users may be registered in association with that robot 100. Further, it is important that, at registration, the user identification information for identifying the communication partner and the policy to be used for that partner are registered, so that the robot 100 can respond appropriately depending on whether the person communicating with it is the user or the cohabitant.

FIG. 10 is a sequence diagram showing an example of the initial setting of communication. The difference from FIG. 9 is that the initial setting is performed directly on the robot 100 without going through the smartphone 99. The information registered in the initial setting is exactly the same as in the smartphone-based procedure. When the user starts the initial setting on the robot 100, the user registers the user's own user ID, user identification information, and policy via the robot 100 (steps S1001 to S1004). In step S1005, the robot 100 notifies the cloud 101 that manages user information of this initial setting information of the user together with the robot ID, and in step S1007, the user registration is completed. The user then goes on to perform the initial setting of the cohabitant via the robot 100 in the same way.

Even in this case, the robot ID that identifies the robot 100 and the initial setting values (user ID, user identification information, policy) for each user are registered in the cloud 101 as a pair (steps S1008 to S1015).

FIG. 11 is a table showing various parameters set in the initial settings shown by the flowcharts of FIGS. 9 and 10. Since this table is used when generating communication with a user or a cohabitant, it is recorded and managed in the cloud 101 that manages user information.

In the table, one row corresponds to one record. For example, the record in the first row holds information on the user with user ID (0001), who uses the robot 100 with robot ID (00001000). It records three pieces of information used as user identification information for the robot 100 to identify this user, and the policy applied to this user (f(character + voice, 3, 3, 3, 3)).

The three pieces of information contained in the user identification information are a wakeup word unique to this user (Einstein), used when this user talks to the robot 100; a physical feature amount of the user (for example, a facial feature vector); and a voice feature amount of the user (voice feature vector). The feature amounts are used by the robot 100 or the cloud 101 to identify that this user is the one communicating.

Another user, a cohabitant, is also registered for the robot 100 with the same robot ID (00001000). The cohabitant has user ID (0002), and the record registers a unique wakeup word (Razaford) used only by the cohabitant, the physical features of the cohabitant, the voice features of the cohabitant, and the policy for the cohabitant (f(character + voice, 4, 4, 5, 5)). Other robots and users are recorded in the same way.

When one user has exclusive use of one robot 100, it is not necessary to identify with whom the robot 100 is communicating. However, when the same robot 100 is used by a plurality of people, the robot 100 or the cloud 101 can determine, by comparison with the registered user identification information, whether it is talking to the user with user ID 0001, the cohabitant with user ID 0002, or someone other than these two. As a result, the robot 100 can communicate using the policy set for the person it is talking to.
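As a minimal sketch of the table of FIG. 11 and of identification among several users of one robot, the following Python code stores the two example records and looks a speaker up by wakeup word. The record layout, the function names, and the rule that an utterance begins with the wakeup word are assumptions made for illustration. The physical and voice feature vectors are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserRecord:
    robot_id: str
    user_id: str
    wakeup_word: str
    policy: tuple  # (means, language, knowledge, visual, hearing)


# The two example records described for FIG. 11.
RECORDS = [
    UserRecord("00001000", "0001", "Einstein", ("character + voice", 3, 3, 3, 3)),
    UserRecord("00001000", "0002", "Razaford", ("character + voice", 4, 4, 5, 5)),
]


def users_of_robot(records, robot_id):
    """Return all records registered for the given robot ID."""
    return [record for record in records if record.robot_id == robot_id]


def identify_by_wakeup_word(records, robot_id, utterance):
    """Return the record whose wakeup word starts the utterance, or None
    when the speaker is someone other than the registered users."""
    text = utterance.strip().lower()
    for record in users_of_robot(records, robot_id):
        if text.startswith(record.wakeup_word.lower()):
            return record
    return None
```

For instance, `identify_by_wakeup_word(RECORDS, "00001000", "Einstein, what is a PCR test?")` returns the record for user ID 0001, whose policy can then be applied to the answer.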

FIG. 12 is a sequence diagram showing an example of a response method for answering a question from the user. This figure shows a flow in which, prompted by information from the information source 102, the user asks the robot 100 a question by voice, and the robot 100 answers the user by voice according to the policy set for the user. Here, it is assumed that the initial setting of the user has been completed in advance.

For example, in step S1201, suppose that the user does not understand the meaning of the term "PCR test" used in the explanation "How many new infections are by PCR test" that appears in news information from the information source 102. In step S1202, the user asks the robot 100 orally (by voice), "What is a PCR test?" The robot 100 acquires the voice information of the user's question with the sensor 213, and in step S1203, transmits the voice information and its own robot ID to the cloud 101 that manages user information via the communication unit 211.

In step S1204, the cloud 101 refers to the database of FIG. 11, that is, the speaker identification database, and extracts the users who use the robot with the received robot ID. It then determines, by collation with the user identification information, which of the extracted users made the utterance. Here, the speaker may be identified by comparing the voice feature amount of the received voice information with the registered voice feature amounts. Further, the voice information may be converted into an utterance character string by speech recognition, and intent analysis may be performed on the utterance character string to estimate that the utterance is a question. Here, the cloud 101 estimates from the voice features that the speaker is the user, and estimates that the utterance is a question.

When the speaker is identified by comparison with the voice features and no registered user satisfies a predetermined degree of matching, the utterance may be treated as an utterance by an unregistered user and a default policy may be used.
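The speaker identification with a fallback to a default policy can be sketched in Python as follows. This is only an illustration under assumptions: the disclosure does not specify how the degree of matching is computed, so cosine similarity between voice feature vectors is used here as a stand-in, and the threshold value and the default policy are hypothetical.

```python
import math

# Hypothetical values; the text only says "predetermined degree of matching"
# and "default policy" without giving concrete numbers.
MATCH_THRESHOLD = 0.8
DEFAULT_POLICY = ("voice", 3, 3, 0, 3)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(voice_vector, candidates, threshold=MATCH_THRESHOLD):
    """candidates maps user ID to (registered voice vector, policy).

    Returns (user ID, policy) for the best match at or above the threshold,
    or (None, DEFAULT_POLICY) when the speaker is treated as unregistered.
    """
    best_id, best_score = None, threshold
    for user_id, (registered_vector, _policy) in candidates.items():
        score = cosine_similarity(voice_vector, registered_vector)
        if score >= best_score:
            best_id, best_score = user_id, score
    if best_id is None:
        return None, DEFAULT_POLICY
    return best_id, candidates[best_id][1]
```

The candidates passed in would be the users extracted for the received robot ID, so the comparison is limited to the people registered for that robot.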

Next, in step S1205, the cloud 101 refers to the user's policy in the database and determines the answer content (answer character string) according to the language level and knowledge level set for the user, and the presentation method (speech speed, volume, etc.) according to the auditory level set for the user. Here, assuming that the user's policy is f(voice, 4, 4, 0, 4), it is determined that the answer character string "This is a test method that amplifies the genetic information of the virus to check whether it is infected with the virus now." is output as a voice that is slightly easier to hear (for example, speech speed = 90 words / minute, volume = 60 dB). Then, in step S1206, the cloud 101 requests the robot 100 to answer with the determined answer content (answer character string) and the determined presentation method or output format (speech speed, volume) as the answer according to the user's policy.

In step S1207, the robot 100 converts the response character string into an audio signal from the received response content (answer character string) and the presentation method, and responds to the user by voice via the video / audio output unit 217. At this time, the voice output is controlled so that the above answer voice is slightly easier to hear. As a result, the user can hear the answer according to the user's language level, knowledge level, and auditory level, and can understand the contents well (step S1208). As a result, the user can immediately and accurately understand the news information.

After that, in step S1209, the cloud 101 requests the robot 100 to ask the user for an evaluation of the policy applied to this answer, specifying the inquiry content (inquiry character string) and the presentation method (speech speed, volume). In step S1210, based on this request, the robot 100 converts the inquiry character string into an audio signal and asks the user via the video/audio output unit 217, using the specified presentation method. This evaluation request may be made when at least 3 seconds and at most 10 seconds have elapsed since the robot 100 finished its last reply with no further question from the user. In this way, the evaluation request is made at a time when it is clear that the conversation has ended without a follow-up question from the user (3 seconds or more), and while the user still remembers whether the answer was understood and easy to understand (10 seconds or less). As a result, accurate feedback can be obtained without burdening or bothering the user. Of course, the setting of 3 seconds or more and 10 seconds or less is an example, and other time ranges may be used, such as 1 second or more and 30 seconds or less, or 2 seconds or more and 7 seconds or less.
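The timing condition for the evaluation request can be sketched as a simple check in Python. The 3 second and 10 second bounds are the example values from the text; the function and parameter names are assumptions.

```python
def should_request_evaluation(seconds_since_last_reply: float,
                              user_asked_again: bool,
                              min_wait: float = 3.0,
                              max_wait: float = 10.0) -> bool:
    """Decide whether to ask the user to evaluate the last answer.

    The request is made only if the user has not asked a further question
    and the elapsed time since the last reply lies within the window.
    """
    if user_asked_again:
        return False
    return min_wait <= seconds_since_last_reply <= max_wait
```

The other example ranges can be obtained by passing different bounds, for instance `min_wait=2.0, max_wait=7.0`.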

The robot 100 transfers the user's evaluation feedback (voice) on the response policy to the cloud 101 together with its own robot ID, either after converting the user's utterance into text information (a character string) or as the voice information spoken by the user as it is (steps S1211, S1212). Upon receiving the user's evaluation information, in step S1213, the cloud 101 records, in association with one another, the date and time of the user's question, the content of the question, the policy used when the robot 100 answered, the content of the answer, and the user's evaluation information on the answer (a value quantifying the user's evaluation).

In this way, by continuously measuring the user's communication ability (language level, knowledge level, visual level, auditory level), changes in the user's communication ability can be detected promptly, and the policy can be updated to a new one accordingly, or the user can be prompted to update the policy. In this figure, in step S1214, the policy is updated to a new policy according to a change in the communication ability of the user, and in steps S1215 and S1216, the user is notified of the update via the robot 100.

As explained above, changes in the user's communication ability are determined by continuously measuring the communication exchanged between the user and the robot 100, and the policy is updated as necessary. Since the policy is determined by the user's language level, knowledge level, visual level, and auditory level, the robot 100 constantly checks and evaluates the user's language level, knowledge level, visual level, and auditory level in daily communication with the user.

For example, the robot 100 asks a question about knowledge that exceeds the knowledge level of the policy currently applied to check the user's knowledge level, or displays the characters a little smaller to check the user's visual level. Alternatively, the user's communication ability may be constantly grasped and reflected in the policy by giving an answer with a slightly smaller voice and checking the user's hearing level.

On the other hand, if the robot 100 can specify the information transmitted by the information source 102 and the expression method thereof, the communication ability of the user may be set from that information. For example, if the TV program the user is watching is a pre-election debate between US presidential candidates in English, the user's English language level may be estimated to be 5 and the knowledge level to be 5, and the hearing level may be estimated from the distance between the user and the TV and / or the output volume information of the TV.

Similarly, if the TV program that the user is watching is an animation for children, mainly for the lower grades of elementary school, the user's language level may be estimated to be 1 and the knowledge level to be 1, and the hearing level may be estimated from the distance between the user and the TV and / or the output volume information of the TV. In addition, if the books that the user is reading on the robot 100 or a tablet terminal are specialized books or papers, or if the destinations accessed via the Internet are documents or URL addresses, such as those of the Cabinet Office or the Japan Patent Office, that require advanced knowledge to understand, the user may be assumed to have a sufficiently advanced body of knowledge. In that case, the language level may be estimated to be 5 and the knowledge level to be 5, and the visual level may be estimated from the character display size on the robot 100 or the tablet terminal.

Since the robot 100 is close to the user on a daily basis and measures the communication ability of the user, when it detects a sudden decrease in language level or knowledge level that does not normally occur, it may ask the user whether he or she is feeling unwell, or may contact the user's family, the insurance company to which the user is subscribed, a security company, the family doctor, or the like and urge them to check whether the user is in any physical danger. These judgments can be made more accurately by combining them with sensor information that can detect the amount of activity and with biometric information, for example from a smart watch worn by the user.

In the above description, the request from the cloud 101 to the robot 100 having a specific robot ID to evaluate the user's response policy and the message in which the evaluation result is returned from the robot 100 to the cloud 101 are linked as a question and answer by the robot ID included in the messages. However, the present disclosure is not limited to this. A unique message ID for identifying the message may be given to the evaluation request message from the cloud 101 to the robot 100 and to the evaluation result message from the robot 100 to the cloud 101, and the question and answer may be managed in association with each other by that message ID.
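As a non-limiting illustration, the message-ID variant described above may be sketched as follows. The class name `EvaluationTracker`, the dictionary keys, and the use of a UUID as the message ID are assumptions introduced for this sketch and are not specified in the present disclosure.

```python
import uuid


class EvaluationTracker:
    """Cloud-side bookkeeping that pairs evaluation requests with results."""

    def __init__(self):
        # message_id -> (robot_id, question); hypothetical in-memory store
        self._pending = {}

    def send_request(self, robot_id, question):
        """Create an evaluation request carrying a unique message ID."""
        message_id = str(uuid.uuid4())
        self._pending[message_id] = (robot_id, question)
        return {"message_id": message_id, "robot_id": robot_id, "question": question}

    def receive_result(self, result):
        """Match a returned evaluation to its request via the message ID."""
        robot_id, question = self._pending.pop(result["message_id"])
        return {"robot_id": robot_id, "question": question,
                "evaluation": result["evaluation"]}


tracker = EvaluationTracker()
request = tracker.send_request("robot-001", "How was this answer?")
# The robot echoes the message ID back with the quantified evaluation.
reply = {"message_id": request["message_id"], "evaluation": 4}
record = tracker.receive_result(reply)
```

Because each request carries its own ID, several evaluation requests to the same robot ID can be outstanding at once without ambiguity.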

As explained above, the voice information spoken by the user, the character string thereof, the answer content reflecting the user's policy, and the presentation method are exchanged between the robot 100 and the cloud 101. On the other hand, the user ID and the user identification information, which are the information for identifying the individual user, are not exchanged. Instead, the robot ID of the robot 100 communicating with the cloud 101 is used. Since the robot ID and the user's initial setting information (user ID, user identification information, policy) are managed in association with each other in the memory 223 of the cloud 101 at the time of initial setting, the robot ID used in the communication between the robot 100 and the cloud 101 makes it possible to narrow down the persons communicating with the robot 100 to a few people, and the user identification information makes it possible to identify the user among them with high accuracy.
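The two-stage identification described above (narrowing by robot ID, then matching user identification information) may be sketched as follows. The registry contents, the identifiers, and the use of a wake word as the only identification feature are hypothetical simplifications for illustration.

```python
# Hypothetical cloud-side registry: robot ID -> users registered at initial setting.
REGISTRY = {
    "robot-001": [
        {"user_id": "user-A", "wake_word": "hello robo", "policy": ("voice", 3, 3, 0, 3)},
        {"user_id": "user-B", "wake_word": "hey buddy", "policy": ("voice", 2, 2, 0, 2)},
    ],
}


def identify_user(robot_id, utterance):
    """Narrow candidates by robot ID, then match the user's wake word."""
    candidates = REGISTRY.get(robot_id, [])  # typically only a few people
    text = utterance.lower()
    for user in candidates:
        if text.startswith(user["wake_word"]):
            return user
    return None  # unregistered speaker or unknown robot
```

In this sketch the robot sends only its robot ID and the utterance; the user ID is looked up on the cloud side and is never transmitted by the robot.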

FIG. 13 is a sequence diagram showing an example of a response method for answering a question from a user. The difference between FIGS. 12 and 13 is that the robot 100 responds to the user using characters, video, and voice. Others are the same as in FIG. 12, so the same parts are designated by the same reference numerals and the description thereof will be omitted.

The user policy set in the cloud 101 is f (character + video + voice, 2, 2, 2, 2), and here, not only voice but also text and video are set to be used for the answer. The processing is the same as in FIG. 12 until the robot 100 receives a question from the user, transfers it to the cloud 101, and the answer content and the presentation method are determined by the cloud 101.

Next, in step S1306, an answer using characters, video, and voice is generated based on the user's policy set in the cloud 101. The user's language level, knowledge level, visual level, and auditory level are all set to 2, so a relatively easy answer content and presentation method are required. Therefore, as an answer according to the user's policy, the cloud 101 generates, based on the set language level and knowledge level, the answer character string "This is a method of investigating a disease by a gene", which uses only basic words and knowledge. Furthermore, based on the set visual level and auditory level, the cloud 101 instructs the robot 100 to display this character string together with the illustration 800, which explains it with a more visible design (adjusted character size and color scheme, ruby annotations for Chinese characters, etc.), and at the same time to output the voice more slowly and at a louder volume.

Upon receiving this, in step S1307, the robot 100 displays the illustration 800 via the video / audio output unit 217 as an answer according to the user's policy, and outputs the answer character string "This is a method of investigating a disease by a gene" as audio via the video / audio output unit 217, more slowly and at a louder volume. Since the user's visual level is set to 2, the character string included in the illustration 800 is displayed on the video / audio output unit 217 of the robot 100 with a more visible design (character size = 32).

The video / audio output unit 217 may use a projection mapping technique to display the illustration 800 near the user in a more visible design. When the image is projected into the space near the user in this way, it may be made larger than the video / audio output unit 217 of the robot 100 so that the user can easily see it. As a result, characters and images can be displayed larger than on the video / audio output unit 217, which makes them easier for the user to see.

FIG. 14 is a flowchart showing an example of responding according to the currently applied policy. As described above, when the robot 100 receives a question from the user in step S1401, in step S1402, the robot 100 and / or the cloud 101 generates the answer to the question (for example, the text of the answer) and its expression method (for example, the speed and volume at which the text of the answer is read aloud, and the size and color scheme for displaying the text of the answer) according to the current policy, and responds to the user. In this way, by responding with a policy suitable for the user through daily communication with the user, it is possible to realize a robot 100 that performs individually optimized communication according to the communication ability of the user.
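For illustration only, deriving an expression method from a policy of the form f (means, language level, knowledge level, visual level, auditory level) may be sketched as follows. Only the character size of 32 at visual level 2 and the tendency toward louder, slower speech at lower auditory levels come from the description; every other table value, and all names, are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    means: str       # e.g. "voice" or "character + video + voice"
    language: int
    knowledge: int
    visual: int      # 0 when visual output is not used
    auditory: int


# Hypothetical lookup tables; only FONT_SIZE[2] == 32 is taken from the text.
FONT_SIZE = {1: 40, 2: 32, 3: 24, 4: 18, 5: 14}
VOLUME = {1: 90, 2: 80, 3: 70, 4: 60, 5: 50}             # lower level -> louder
SPEECH_RATE = {1: 0.7, 2: 0.8, 3: 1.0, 4: 1.1, 5: 1.2}   # lower level -> slower


def presentation_for(policy):
    """Return output settings for the communication means in the policy."""
    settings = {}
    if "character" in policy.means or "video" in policy.means:
        settings["font_size"] = FONT_SIZE[policy.visual]
    if "voice" in policy.means:
        settings["volume"] = VOLUME[policy.auditory]
        settings["speech_rate"] = SPEECH_RATE[policy.auditory]
    return settings
```

For a voice-only policy such as f (voice, 3, 3, 0, 3), the visual table is never consulted, which is why a visual level of 0 is harmless in this sketch.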

FIG. 15 is a sequence diagram showing an example in which the robot 100 asks a question to the user and measures the communication ability of the user based on the answer. For example, the robot 100 asks the user the question "If the PCR test is negative, isn't it infected?" by voice, and detects that the user answers "I'm not infected." by voice. When the robot 100 spontaneously and actively starts this series of exchanges based on the current policy (the current communication ability of the user), the user's communication ability can be continuously measured from various angles. The above example answer is generally correct, but it is not a perfect answer. To give the user a more accurate understanding, supplementary information, such as the fact that there are few pathogens immediately after infection and they are difficult to detect by a PCR test, and that a PCR test may give a false negative, may be returned as feedback to the answer.

In the example of this figure, in step S1501, the robot 100 constantly senses the user state and determines whether the user is in a relaxed state and can afford to answer a question from the robot 100. In step S1502, the cloud 101 that manages user information instructs the robot 100 to confirm the current user state at a predetermined timing, for example, once every three months, and in step S1503, the cloud 101 acquires information about the current user state from the robot 100.

In step S1504, the cloud 101 ends the process when it determines, based on the acquired user state and the question frequency so far, that the timing is inappropriate for a quiz question. When the cloud 101 determines in step S1504 that the timing is appropriate for a quiz question, the cloud 101 instructs the robot 100 to give the quiz question. At this time, the cloud 101 specifies the question content (voice signal or text information of the question) and the presentation method (speech speed, volume, etc.) to the robot 100. In step S1505, in accordance with the received instruction, the robot 100 presents the designated question content to the user via the video / audio output unit 217 by the designated presentation method. The robot 100 transfers the user's answer to the cloud 101 as an audio signal (or after converting it into character information) (steps S1506 and S1507).
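The timing decision of step S1504 may be sketched, under stated assumptions, as follows. The function name, the boolean `user_relaxed` input, and the approximation of "once every three months" as a 90-day minimum interval are assumptions for this sketch.

```python
from datetime import datetime, timedelta


def should_ask_quiz(user_relaxed, last_quiz_at, now,
                    min_interval=timedelta(days=90)):
    """Decide whether now is an appropriate timing for a quiz question.

    user_relaxed: result of sensing the user state (True if the user can
                  afford to answer a question).
    last_quiz_at: datetime of the previous quiz, or None if none was given.
    """
    if not user_relaxed:
        return False          # inappropriate timing: end the process
    if last_quiz_at is None:
        return True           # no quiz has been given yet
    return now - last_quiz_at >= min_interval
```

For example, `should_ask_quiz(True, datetime(2024, 1, 1), datetime(2024, 4, 15))` evaluates to `True`, because more than 90 days have passed since the previous quiz.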

In step S1508, the cloud 101 provides the robot 100 with feedback voice information for the received answer (for example, whether the answer was correct or incorrect and, in the case of an incorrect answer, exemplary answer information) together with presentation information. In step S1509, the robot 100 gives voice feedback to the user accordingly.

In step S1510, the cloud 101 records, in association with each other, at least two of the following: the date and time of the question asked to the user, its content, the policy used when the robot 100 asked the question, the content of the user's answer to the question, and the environment surrounding the user or the robot 100 when the question was asked (intensity of ambient noise, distance between the user and the robot 100, time required for the user to answer).

The cloud 101 re-determines the user's language level, knowledge level, visual level, and auditory level from the user's answer to the question, determines whether or not an update is necessary, and updates to a new policy if necessary (steps S1511, S1512).

In this way, by giving questions to evaluate the user's communication ability (language level, knowledge level, visual level, auditory level) and continuously measuring the user's responses, a change in the user's communication ability can be detected immediately, and accordingly the policy can be updated to a new one or the user can be recommended to update the policy.

In this example, the question concerns the accuracy of the PCR test, so the knowledge level of the user is being confirmed. However, depending on the policy used when asking the question, it is also possible to check the visual level, for example by using small characters, or the auditory level by asking the question at a low volume. Multiple elements that make up communication ability (language level, knowledge level, visual level, auditory level) can also be judged at the same time by a single question.

In the above example, the explanation was given using voice communication, but the present disclosure is not limited to this, and questions may be given using visual information (or in combination with voice information). In that case, the question content generated by the cloud 101 is expressed by characters and video information, and the presentation method adjusts the visibility, such as the size and color scheme of the characters, according to the user's policy. By asking questions using visual information, it is possible to confirm whether the characters and images used in the questions can be smoothly visually recognized by the user. In addition, the user's visual level can be updated appropriately by confirming, when giving feedback on the answer, whether the characters and images in the question were difficult to see. For example, if the user found them a little hard to see or read, the user's policy may be adjusted (the visual level lowered) so that text and video information with higher visibility is used in future communication using visual information.

FIG. 16 is a flowchart showing an example in which the robot 100 asks a question to the user and updates the communication ability and / or the policy of the user according to the answer.

First, in step S1601, the robot 100 sets a policy to be used for a predetermined question with reference to the policy currently applied to the user, and gives a question according to the policy used for the question. The policy used for the question specifies the level of difficulty and the presentation method of the question, in the same way as the policy currently set for the user. In step S1602, the robot 100 subsequently senses the user's answer. When the answer is completed, in step S1603, it is determined whether the questioning is finished or is to be continued. If the questioning is to be continued, the process proceeds to No, a policy for the next predetermined question is set, and a question is given again.

When the questioning is finished, the process proceeds to Yes, and in step S1604, a policy suitable for the user is determined based on the policies used for the questions and the correctness of the answers. For example, among the question policies for which the user's correct answer rate is 95% or more, the policy that requires the highest communication ability may be selected. If the following test results are obtained, it is determined that policy = f (voice, 4, 3, 0, 3) is suitable for the user.

Question policy = f (voice, 3, 3, 0, 3): correct answer rate 98%
Question policy = f (voice, 4, 3, 0, 3): correct answer rate 96%
Question policy = f (voice, 3, 4, 0, 3): correct answer rate 84%
Question policy = f (voice, 3, 3, 0, 4): correct answer rate 78%
When a policy suitable for the user is determined, in step S1605, it is determined whether or not it is the same as the policy currently used in daily communication with the user. If they are the same, the process proceeds to Yes, the policy is not updated, and the process ends. On the other hand, if they are not the same, the process proceeds to No, the user is notified in step S1606 that the policy is to be changed, the policy is updated in step S1607, and the process ends. Alternatively, a change to the new policy may be proposed to the user, and the policy may be updated after obtaining the user's consent.

In this way, the robot 100 voluntarily asks questions according to a plurality of policies, and by confirming the correct answer rate for each of the plurality of policies, it is possible to estimate what policy is suitable for the user. By applying this as a new policy, it is possible to smoothly proceed with daily communication between the user and the robot 100 and without stress to the user.
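The selection rule of step S1604 may be sketched as follows, using the four test results given above. Measuring the "required communication ability" of a policy as the sum of its four levels is an assumption of this sketch; the description does not define how policies are ranked.

```python
def select_policy(results, threshold=0.95):
    """Pick the most demanding policy whose correct answer rate meets the threshold.

    results: dict mapping a policy tuple (means, language, knowledge,
             visual, auditory) to the user's correct answer rate.
    """
    passing = [policy for policy, rate in results.items() if rate >= threshold]
    if not passing:
        return None
    # Assumption: a larger sum of levels means higher required ability.
    return max(passing, key=lambda policy: sum(policy[1:]))


results = {
    ("voice", 3, 3, 0, 3): 0.98,
    ("voice", 4, 3, 0, 3): 0.96,
    ("voice", 3, 4, 0, 3): 0.84,
    ("voice", 3, 3, 0, 4): 0.78,
}
best = select_policy(results)  # ("voice", 4, 3, 0, 3)
```

Two policies pass the 95% threshold, and f (voice, 4, 3, 0, 3) is chosen because its levels sum to 10 rather than 9, which matches the result stated in the description.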

FIG. 17 shows an example of a table used when the robot 100 asks a question to the user and updates the communication ability (policy) of the user based on the answer. Here, it is assumed that the currently applied policy is f (character + voice, 3, 3, 3, 3).

Since both text and voice can be used as communication means, for each communication means, questions are given with each element of the communication ability of the currently applied policy changed by -1, ±0, and +1, and an appropriate policy is determined from the answers. For example, when questions are given with the language level changed to 2, 3, and 4, the other levels (knowledge level, visual level, and auditory level) are all fixed at 3, the currently applied value. The same is true for the other tests. In addition to whether the answer was correct or incorrect (correct / incorrect), the time taken to answer (answer time) and the number of times the question was requested again because the question text could not be read within the specified time or the question voice could not be heard (number of questions) are also recorded. From these test results, each element of communication ability may be set as follows.

First, for the questions in which only the language level was changed to 2, 3 and 4, the user answered correctly within a consistent answer time and without asking for the question to be repeated. Therefore, regarding the language level, no problem is detected for the user at any of 2, 3 and 4, and in order to convey information most appropriately, the highest language level = 4 is set as the newly recommended language level.

For the questions in which only the knowledge level was changed to 2, 3 and 4, the questions with knowledge level = 2 or 3 were answered correctly, but the question with knowledge level = 4 took a relatively long time to answer and was answered incorrectly. Therefore, the newly recommended knowledge level = 3 is set.

For the questions in which only the visual level was changed to 2, 3 and 4, the answer time became longer as the visual level increased, even though the other conditions did not change. This is probably because it took time for the user to visually recognize the question text. Therefore, in order to ensure readability, the visual level is set to 2, at which the answer time is close to the average answer time.

For the questions in which the hearing level was changed to 2, 3 and 4, it can be seen that the question was repeated once only for the question with hearing level = 4. This is probably because the user could not hear the question voice. Therefore, the hearing level is set to 3 so as to ensure ease of hearing.

From the results of these tests, the applied policy = f (character + voice, 3, 3, 3, 3) may be updated to the newly recommended policy = f (character + voice, 4, 3, 2, 3).

The policy determination method explained here is just an example that is easy to understand for convenience of explanation. For example, even in the same policy, by repeating a plurality of questions and measuring a plurality of answers, a policy may be determined according to the more accurately measured communication ability of the user.
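One possible simplification of the per-element determination described for FIG. 17 is sketched below. The table values of FIG. 17 are not reproduced in the text, so the trial data here are hypothetical numbers chosen only to be consistent with the described outcome; the single rule (highest level answered correctly, without a repeated question, within a time limit) and the 6-second limit are likewise assumptions.

```python
def recommend_level(trials, max_answer_time):
    """Return the highest tested level the user handled without difficulty.

    trials: list of (level, correct, answer_time_s, repeat_count).
    Falls back to the lowest tested level if none qualifies (assumption).
    """
    ok = [level for level, correct, answer_time, repeats in trials
          if correct and repeats == 0 and answer_time <= max_answer_time]
    return max(ok) if ok else min(level for level, *_ in trials)


# Hypothetical trial data, not taken from FIG. 17.
trials = {
    "language":  [(2, True, 5.0, 0), (3, True, 5.0, 0), (4, True, 5.0, 0)],
    "knowledge": [(2, True, 5.0, 0), (3, True, 5.5, 0), (4, False, 12.0, 0)],
    "visual":    [(2, True, 5.0, 0), (3, True, 8.0, 0), (4, True, 11.0, 0)],
    "auditory":  [(2, True, 5.0, 0), (3, True, 5.0, 0), (4, True, 6.0, 1)],
}
recommended = tuple(recommend_level(trials[name], max_answer_time=6.0)
                    for name in ("language", "knowledge", "visual", "auditory"))
# recommended == (4, 3, 2, 3)
```

With these inputs the sketch reproduces the recommended policy f (character + voice, 4, 3, 2, 3), although the description itself reasons about each element separately rather than with one uniform rule.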

FIG. 18 is an example of a table used when the language level of a user is subdivided into a plurality of languages and managed. Here, the language level of the user is recorded and managed for each of English, Chinese, Hindi, Japanese, Spanish, and Arabic. These language levels may be set by the user by self-report, or may be evaluated and set according to a score such as a language proficiency test.

As shown in the table, in this example, the user has knowledge of English at a level used in everyday situations (English language level = 3) and of Japanese at a level used in a slightly wider range of situations (Japanese language level = 4). The languages marked with "-" under "Language level for each language" in the table are not at a level where communication can be established, indicating that they cannot be used in practice. In this way, by having the user set, or having the robot 100 measure, the communication ability (language level) for each type of language used for communication, communication can be achieved according to the user's language level in each language.
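A per-language table of this kind may be represented, for illustration, as follows. The levels for English and Japanese are taken from the description; marking all four remaining languages as unusable ("-", represented here as `None`) is an assumption, since the text does not state which languages carry that mark.

```python
# None stands for the "-" mark (communication cannot be established).
LANGUAGE_LEVELS = {
    "English": 3,
    "Chinese": None,   # assumption
    "Hindi": None,     # assumption
    "Japanese": 4,
    "Spanish": None,   # assumption
    "Arabic": None,    # assumption
}


def usable_languages(levels):
    """Return usable languages ordered from highest to lowest level."""
    usable = [(lang, lv) for lang, lv in levels.items() if lv is not None]
    return [lang for lang, lv in sorted(usable, key=lambda item: -item[1])]
```

Under these assumptions, `usable_languages(LANGUAGE_LEVELS)` yields `["Japanese", "English"]`, so a system could default to Japanese and fall back to English.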

Here, the language level for each language used by the robot 100 is set in order to support smooth communication by the user, but it can also be used for other purposes. For example, a user whose English language level is currently 3 can intentionally set communication with the robot 100 to English and practice English conversation, aiming at a higher English language level. When the robot 100 tests the user's English proficiency, the user can learn his or her own English proficiency. Further, it is conceivable that the robot 100 raises the English language level used for communication as the user's English ability improves.

FIG. 19 is an example of a table used when the knowledge level of a user is subdivided into a plurality of fields and managed. In the above explanation, the knowledge level is expressed by one aggregated value as one element of the communication ability of the user. However, even among users who are judged to have the same knowledge level, knowledge is considered to differ from field to field. For instance, the question of the robot 100 illustrated in FIG. 15, "If the PCR test is negative, is it not infected?", clearly asks about medical knowledge. It may be inappropriate to express the user's knowledge level comprehensively as a single value, especially for non-medical themes, depending only on whether or not this question is answered correctly.

Therefore, as shown in the figure, it is conceivable to set and manage the knowledge level for each field. For example, common sense represents the general knowledge of society that everyone should know. The remaining fields are categorized academically: formal science may represent knowledge about mathematics and statistics, natural science knowledge about physics and chemistry, social science knowledge about politics and economics, humanities knowledge about philosophy and history, and applied science knowledge about engineering and medicine.

By setting or measuring the knowledge level for each field via the robot 100 in this way, it is possible to deal with the bias of knowledge of each user. In this example, the knowledge level of the user is 4 overall, but it is 5 for applied science and 3 for the humanities. By setting and changing the policy according to the knowledge level of the field to which the current topic belongs, it becomes possible to support user communication in detail.
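A field-specific lookup of the knowledge level may be sketched as follows. The three values (overall 4, applied science 5, humanities 3) are taken from the description; falling back to the overall level for fields without a recorded value is an assumption of this sketch.

```python
KNOWLEDGE_LEVELS = {
    "overall": 4,
    "applied science": 5,
    "humanities": 3,
}


def knowledge_level(field, levels=KNOWLEDGE_LEVELS):
    """Return the level for the topic's field, or the overall level if unknown."""
    return levels.get(field, levels["overall"])
```

With this lookup, a medical topic would be answered at level 5 and a history topic at level 3, while a field with no recorded value (for example, natural science in this sketch) would use the overall level 4.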

In addition, in order to support the smooth communication of users, the knowledge level for each field was set here, but this can also be used for other purposes. For example, the user currently has a social science knowledge level of 3, but the robot 100 tries to incorporate political and economic topics in daily communication with the user, and provides political and economic news information to the user every day. It is also possible to acquire a higher level of social science knowledge by introducing them. When the robot 100 tests the user on the social sciences, the user can know the knowledge level of his / her social science field. Further, it is conceivable that the robot 100 raises the knowledge level in the social science field to a higher level for communication according to the knowledge of the user in the social science field.

Note that the subdivision of the knowledge level is not limited to academic fields as described above, and may be set according to knowledge of specific fields such as music, movies, animation, and painting. Further, it may be set in more detail, separately for Japanese rock, Japanese pop, American rock, American pop, and so on.

FIG. 20 is a sequence diagram showing an example of initializing communication. In the above explanation, the answer content and the presentation method according to the communication ability of the user are generated by the cloud 101, but here, a system configuration in which they are generated by the robot 100 is described. As a result, the personal information of the user is not managed at all in the cloud 101, and the personal information is not exchanged on the Internet, so that this service can be used more safely. As shown in the figure, the range in which personal information or information closely related to personal information is handled includes only the user, the smartphone 99, and the robot 100, and does not include the cloud 101.

The difference from FIG. 9 is that when the user makes initial settings using the application of the smartphone 99, it is the robot 100 that registers the initial setting information instead of the cloud 101. After the initial setting information (user ID, user identification information, policy) of the user is confirmed (steps S2001 to S2005), in step S2006, the smartphone notifies the robot 100 of the initial setting information and causes the robot 100 to register the initial setting information.

In step S2008, the robot 100 records the user's initial setting information in the memory 215, and in step S2009, returns a notification of registration completion to the smartphone 99.

At the same time, the smartphone 99 also encrypts the initial setting information of this user and records it in the memory 204 of the smartphone 99 (step S2007). This can be used when the robot 100 breaks down, or when the user's initial setting information is set in the new robot 100 when the robot 100 is replaced.

The user who has received the notification of the completion of user registration via the video / audio output unit 206 of the smartphone 99 ends the initial setting (step S2010). After that, the smartphone 99 notifies the cloud 101 of the robot ID and the user ID, which are part of the user's initial settings (the minimum information required for maintenance purposes and not closely related to personal information), and the pair is registered (steps S2011, S2012). In other words, only the initial setting information excluding the user identification information and the policy, which are closely related to the user's personal information, is registered in the cloud 101.
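The division of the initial setting information between the robot 100 and the cloud 101 may be sketched as follows. The field names and sample values are hypothetical; the sketch only illustrates that the cloud receives the robot ID and user ID pair and nothing else.

```python
# Fields that may leave the home network (assumed names for this sketch).
CLOUD_FIELDS = ("robot_id", "user_id")


def split_initial_settings(settings):
    """Return (robot_part, cloud_part) of the initial setting information."""
    robot_part = dict(settings)                          # robot keeps everything
    cloud_part = {key: settings[key] for key in CLOUD_FIELDS}
    return robot_part, cloud_part


settings = {
    "robot_id": "robot-001",
    "user_id": "user-A",
    "user_identification": {"wake_word": "hello robo"},  # stays on the robot
    "policy": ("voice", 3, 3, 0, 3),                     # stays on the robot
}
robot_part, cloud_part = split_initial_settings(settings)
```

Using an explicit allow-list for the cloud payload, rather than removing sensitive keys, means that any field added later stays on the robot unless it is deliberately listed.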

The cloud 101 does not share information related to personal information, such as questions asked by the user to the robot 100 and the contents of conversations, but maintenance information and the like may be sent to the cloud 101 as appropriate from the smartphone 99 and / or the robot 100 in order to improve service quality.

The process of initializing the roommate as the second user is the same as the process of initializing the first user, so it is omitted here.

In the above description, the form in which the smartphone 99 and the robot 100 are explicitly used separately is described, but the present disclosure is not limited to this. As shown in FIG. 10, the initial setting may be performed using only the robot 100, and, as shown in FIG. 20, the initial setting information may be registered in the robot 100. The robot 100 referred to here may be replaced by a smartphone 99 owned by the user. For example, a user who uses the smartphone 99 as a mobile phone every day may, at home, store the smartphone in a robot-type case and use the smartphone 99 as if it were the robot 100 described above. By equipping this case with not only a smartphone charging function but also a microphone, a speaker, and a projection mapping function, usage like that of a robot or smart speaker can be enabled simply by storing the smartphone in the case.

FIG. 21 is a sequence diagram showing an example of a response method for answering a question from a user. Unlike FIG. 12, in which the cloud 101 generates an answer according to the user's policy, this figure shows an embodiment in which the robot 100 generates an answer according to the user's policy. Therefore, there is no exchange of information with the cloud 101 in the conversation processing with the user. Since the robot 100 manages the initial setting information (user identification information and policies) closely related to the user's personal information and realizes a conversation with the user by internal processing, there is an advantage that the risk of leakage of the user's personal information can be reduced. Hereinafter, the same parts as those in FIG. 12 will be designated by the same reference numerals, and the description thereof will be omitted.

In step S2102, the user asks the robot 100 "What is a PCR test?" orally (by voice). In step S2103, the robot 100 acquires the voice information of the question asked by the user with the sensor 213, determines which of the preset users made the utterance by matching it against the user identification information recorded in the memory 215 of the robot 100 (at least one of the unique wakeup word, physical features, and voice features), and estimates that the user (rather than a co-resident or an unregistered third party) is asking the question. If the user is fixed to one person, this speaker identification step may be omitted on the assumption that the utterance is that of the default user.

In step S2104, the robot 100 then reads the user's policy from the memory 215 and determines the answer content (answer character string) according to the language level and knowledge level set for the user, and the presentation method (speech speed, volume, etc.) according to the auditory level set for the user. Further, in step S2105, the robot 100 gives the answer according to the user's policy by voice via the video / audio output unit 217, and the user understands the PCR test from the answer.

After that, in step S2107, the robot 100 asks the user by voice to evaluate the policy applied to this answer. Based on the evaluation voice from the user, the robot 100 links and records at least two of the following: the date and time of the question asked by the user, its content, the policy applied when the robot 100 answered, the content of the answer, and the user's evaluation information for the answer (a value that quantifies the user's evaluation) (steps S2108 and S2109).

Further, if it is determined in step S2110 that the user's evaluation can be improved by updating the policy, the robot 100 updates the policy and records it in the memory 215, and at the same time, in step S2111, notifies the user that the policy is updated to a new one.

Although the robot 100 may access information on the Internet when generating an answer, the user's personal information (user identification information, policy, etc.) is not used in that access, so the user's personal information is not leaked. However, when the user explicitly permits the use of personal information, personal information including the user's initial setting information and membership information of a specific group may be registered at the time of initial setting so that the robot 100 can use it. This applies, for example, when accessing articles and event information managed by academic societies, news information distributed by news distribution companies, the user's cash and securities information managed by financial institutions, or the acquisition / usage status of point services.

Although the embodiment in which the cloud 101 manages the policy and generates the answer and the embodiment in which the robot 100 manages the policy and generates the answer have been described, the cloud 101 and the robot 100 may cooperate to generate the answer.

For example, the user's initial setting information is recorded and managed in the memory 215 of the robot 100. If there is a part corresponding to personal information from the content of the question or conversation received from the user, the robot 100 hides or anonymizes the part, and then converts it into a general question content that cannot identify the user's personal information. Send to cloud 101. The cloud 101 returns a general answer to the robot 100. It is conceivable that the robot 100 applies the user's policy to the received general answer content, converts it into an answer that is easy for the user to understand, and then presents it to the user.
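The hiding step performed by the robot 100 before sending a question to the cloud 101 may be sketched as follows. How personal information is detected is not specified in the description; this sketch simply assumes that the robot holds a list of personal terms for the user, and the placeholder string and example names are hypothetical.

```python
import re


def anonymize(question, personal_terms, placeholder="[REDACTED]"):
    """Replace every known personal term in the question with a placeholder."""
    result = question
    for term in personal_terms:
        # re.escape keeps terms containing punctuation from acting as patterns.
        result = re.sub(re.escape(term), placeholder, result, flags=re.IGNORECASE)
    return result


general_question = anonymize(
    "Can Taro Yamada take a PCR test in Osaka?",
    personal_terms=["Taro Yamada", "Osaka"],
)
# general_question == "Can [REDACTED] take a PCR test in [REDACTED]?"
```

A fixed term list is only a stand-in; a practical system would likely need more robust detection, which is outside the scope of this sketch.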

In this case, depending on the currently set policy (especially the user's language level, knowledge level, visual level, and auditory level), the linguistic expression of the answer content, the prerequisite knowledge, the display size and color scheme of characters and images, and the speech speed and volume are adjusted. For example, suppose the general answer is the specific answer content for language level = 4 and knowledge level = 4 in the second line of FIG. 8, "a test method that amplifies the genetic information possessed by the virus to check whether it is currently infected with the virus". If the user's currently set levels are language level = 2 and knowledge level = 2, the general answer content may be summarized or converted into a plain expression such as "a method of investigating a disease by gene" as in the fourth line, and if they are language level = 1 and knowledge level = 1, into an expression such as "How to check if you are sick" as in the first line, based on the set policy, and then presented to the user.
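Selecting an answer expression by level may be sketched as follows, using the three expressions quoted above. Treating the lower of the language level and knowledge level as the effective level, and choosing the most detailed expression not exceeding it, are assumptions of this sketch; a real system would generate or summarize text rather than look it up.

```python
# (minimum level, answer expression), taken from the examples quoted above.
ANSWERS = [
    (1, "How to check if you are sick"),
    (2, "A method of investigating a disease by gene"),
    (4, "A test method that amplifies the genetic information possessed by "
        "the virus to check whether it is currently infected with the virus"),
]


def answer_for(language_level, knowledge_level):
    """Return the most detailed expression the user's levels allow."""
    effective = min(language_level, knowledge_level)   # assumption
    eligible = [item for item in ANSWERS if item[0] <= effective]
    if not eligible:
        return ANSWERS[0][1]                           # plainest expression
    return max(eligible, key=lambda item: item[0])[1]
```

For a user at language level 3 and knowledge level 3, which has no dedicated expression here, the sketch returns the level-2 expression rather than risking one that is too difficult.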

FIG. 22 is a sequence diagram showing an example in which a question is put to the user and the user's communication ability is measured from the answer. Unlike the embodiment of FIG. 15, in which the cloud 101 generates questions according to the user's policy, in this figure the robot 100 generates the questions according to the user's policy. Therefore, no information is exchanged with the cloud 101 during the series of conversation processing with the user. Because the robot 100 manages the initial setting information (user identification information and policies) closely related to the user's personal information and carries out the conversation with the user by internal processing, there is the advantage that the risk of leaking the user's personal information can be reduced. Hereinafter, this figure will be described while omitting the description of the same part as in FIG.

In the example of this figure, in step S2201, the robot 100 continuously senses the user state and determines whether the user is relaxed and has the leeway to answer a question from the robot 100. The robot 100 tests the user's communication ability in order to update the user's policy at a predetermined timing, for example, once every three months.

In step S2202, the robot 100 ends the process when it determines, based on the user state and the frequency of questions so far, that the timing is inappropriate for a quiz question. When it determines that the timing is appropriate, the robot 100 determines the question content (at least one of text, video, and audio information representing the question) and the presentation method (character display size, color scheme, speech speed, volume, etc.). In step S2203, the robot 100 presents the determined question content to the user via the video / audio output unit 217 using the determined presentation method. In step S2204, the robot 100 acquires the user's answer as an audio signal via the sensor 213.

The robot 100 gives the user feedback on the received answer using any of text, video, and audio information (steps S2205 and S2206). The feedback includes information such as whether the answer was correct.

In step S2207, the robot 100 records, in association with each other, at least two of the following: the date and time at which the question was asked, the content of the question, the policy used when the robot 100 asked the question, the content of the user's answer to the question, and the environment surrounding the user or the robot 100 when the question was asked (intensity of ambient noise, distance between the user and the robot 100, time required for the user to answer).

The robot 100 re-determines the user's language level, knowledge level, visual level, and auditory level from the user's answer to the question, determines whether an update is necessary, and updates to a new policy if necessary (steps S2208 and S2209).

In this way, questions are asked to evaluate the user's communication ability (language level, knowledge level, visual level, auditory level), and the user's responses are measured continuously so that a change in communication ability is detected promptly and the policy is updated, or the user is advised to update it, accordingly. This can be carried out securely, without sending the user's identification information or policy over the Internet.

FIG. 23 is a sequence diagram showing an example in which, while the user is communicating with a doctor who is the conversation partner, the user's degree of understanding is estimated from the communication and supportive intervention is performed as appropriate. Here, it is assumed that the robot 100 shares the same experience as the user; that is, the robot 100 also perceives what the user sees and hears. This is the case, for example, when the robot 100 is near the user, or when the user communicates with the doctor via the robot 100 or the smartphone 99, as in a remote diagnosis.

In step S2301, conversation 1, spoken to the user by the doctor, is conveyed to the robot 100 as well as to the user. In step S2302, the robot 100 transmits conversation 1 to the cloud 101 that manages the user information, either as voice data or after converting the voice into text data. In step S2303, the user responds to conversation 1 by addressing conversation 2 to the doctor. Conversation 2 is also conveyed to the robot 100. In step S2304, the robot 100 transmits conversation 2 to the cloud 101, again either as voice data or after converting the voice into text data.

The cloud 101 determines the policy to which conversation 1 corresponds and compares it with the currently applied policy, and / or estimates from the content of conversation 2 whether the user is likely to have understood conversation 1. Specifically, the cloud 101 may decide that communication support is needed when conversation 1 is more difficult than the currently applied policy, when conversation 2 is an ambiguous response, or when the user takes a long time to respond.
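The three conditions named above can be combined as in the following minimal sketch. How difficulty is scored, which phrases count as ambiguous, and the 5-second threshold are all assumptions introduced for illustration; the description does not specify them.

```python
# Hypothetical markers of an ambiguous reply and an assumed delay threshold.
AMBIGUOUS_MARKERS = ("i don't think", "maybe", "not sure")
SLOW_RESPONSE_SECONDS = 5.0


def needs_support(utterance_level, policy_level, reply, reply_delay_seconds):
    """Return True if any of the three conditions for communication support holds."""
    too_difficult = utterance_level > policy_level
    ambiguous = any(marker in reply.lower() for marker in AMBIGUOUS_MARKERS)
    slow = reply_delay_seconds >= SLOW_RESPONSE_SECONDS
    return too_difficult or ambiguous or slow
```

For example, a reply such as "I don't think ..." would trigger support under these assumed markers even when the other party's utterance is within the user's policy level.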

Comparing the policy to which conversation 1 corresponds with the currently applied policy means judging the other party's utterance against the policy that serves as the guideline for smooth everyday communication between the user and the robot 100. The purpose is to determine whether the other party spoke in a way that is sufficiently easy to understand given the user's communication ability. In the example of this figure, it is determined that conversation 1 and conversation 2 do not require such communication support, so the cloud 101 listens to these conversations without intervening (step S2305).

Subsequently, in step S2306, conversation 3, spoken to the user by the doctor (for example, "Have you ever had anaphylaxis?"), is conveyed to the robot 100 as well as to the user. In step S2307, the robot 100 similarly transmits conversation 3 to the cloud 101. In step S2308, the user responds to conversation 3 by addressing conversation 4 (for example, "I don't think ...") to the doctor. Conversation 4 is also conveyed to the robot 100. In step S2309, the robot 100 similarly transmits conversation 4 to the cloud 101.

Similarly, the cloud 101 determines the policy to which conversation 3 corresponds and compares it with the currently applied policy, and / or estimates from the content of conversation 4 whether the user is likely to have understood conversation 3. Here, it is assumed that the policy to which conversation 3 corresponds is more difficult than the currently applied policy, and / or that conversation 4 is an ambiguous response. The cloud 101 therefore determines that communication support is necessary. To help the user understand conversation 3, it creates conversation 5 (for example, "there is no particularly strong allergy other than pollen") by accessing the user's allergy test result information managed by the cloud 101, and intervenes in the conversation between the user and the doctor (step S2310). Conversation 5 is created according to the currently applied policy so that the user correctly understands conversation 3 and communication proceeds smoothly.

In step S2311, the cloud 101 instructs the robot 100 to utter conversation 5, and in step S2312, the robot 100 utters conversation 5 to the doctor and the user. In response, in step S2313, the doctor replies to the user and the robot 100 with conversation 6 (for example, "OK"). In step S2314, the robot 100 similarly transmits conversation 6 to the cloud 101, and in step S2315, the cloud 101 determines that no additional communication intervention is required. The necessity of this intervention can be determined in the same manner as the evaluation for conversation 1.

Conversation 7 and conversation 8 are handled in the same way as conversation 1 and conversation 2, and their description is therefore omitted (steps S2316 to S2320).

In the above description, the necessity of communication support for conversation 1 and conversation 2 is determined based on the policy to which conversation 1 corresponds and the response given in conversation 2, but the present disclosure is not limited to this. The intervention decision may also be made by detecting the user's outward reactions (for example, tilting the head or waving a hand) or the user's biometric information (for example, brain waves, eye movement, respiration, heart rate (variability), etc.). In this case, at least one of the following may be used as the basis for the decision: the policy corresponding to the other party's utterance, the content of the user's response, the time the user took to respond to the other party's utterance, the user's outward reaction, and the user's biometric information.

In the above explanation, the cloud 101 determines the necessity of communication support every time the user responds, but the present disclosure is not limited to this. For example, this determination may be made continuously during communication, and communication support may be provided whenever it becomes necessary. For example, support may be provided before the user responds. Alternatively, it may be provided when a period in which no one is communicating (here, a period in which both the user and the doctor are silent) reaches a predetermined length (for example, 3 seconds or more).
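The silence-based timing can be sketched as a simple threshold check. The 3-second value is the example given above; the function name and the use of timestamps in seconds are assumptions.

```python
SILENCE_THRESHOLD_SECONDS = 3.0  # example value from the description


def silence_reached(last_utterance_end, now, threshold=SILENCE_THRESHOLD_SECONDS):
    """Return True once nobody has spoken for at least `threshold` seconds."""
    return (now - last_utterance_end) >= threshold
```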

FIG. 24 is a flowchart showing an example of estimating the degree of understanding of the user from the communication in which the user participates and appropriately intervening in support.

In step S2401, the robot 100 senses, from conversation or chat, whether the user is communicating with someone. In step S2402, while the communication is in progress, the language level of the words used in the conversation or chat is determined.

In step S2403, if the language level used in the conversation or chat is equal to or lower than the language level of the currently applied policy, the process proceeds to Yes and sensing continues until the conversation ends (Yes in step S2406). Otherwise, the process proceeds to No, and in step S2404, the parts whose language level is too high are replaced according to the currently applied policy. Then, in step S2405, the robot 100 conveys the content of the conversation or chat to the user in an easy-to-understand manner according to the policy, and continues sensing until the conversation ends.

In this way, the content of the communication (conversation or chat) in which the user participates is analyzed and compared with the currently applied policy, and any part that is likely to be difficult for the user to understand is automatically replaced with an easier-to-understand expression.
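Steps S2403 and S2404 can be illustrated with a word-level sketch. The vocabulary table, its level values, and the function name `simplify` are hypothetical; the description does not state how language levels are assigned to words.

```python
# Hypothetical vocabulary: word -> (language level, plainer replacement).
VOCABULARY = {
    "anaphylaxis": (4, "a strong allergic reaction"),
}
PUNCTUATION = ".,?!"


def simplify(utterance: str, policy_language_level: int):
    """Replace words above the policy's language level; report whether any changed."""
    words, replaced = [], False
    for word in utterance.split():
        core = word.rstrip(PUNCTUATION)
        suffix = word[len(core):]  # keep trailing punctuation
        level, plain = VOCABULARY.get(core.lower(), (1, None))
        if plain is not None and level > policy_language_level:
            words.append(plain + suffix)
            replaced = True
        else:
            words.append(word)
    return " ".join(words), replaced
```

When the second return value is false, the utterance was already within the policy's language level and no support is needed for it.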

As an example in the scene of FIG. 23, when the user and the doctor are having a conversation, a communication-support application on the smartphone judges, by comparison with the currently applied policy, whether the conversation content is sufficiently easy for the user to understand. If the application determines that the user would find it difficult to understand or to respond accurately, it displays information that supports understanding on the smartphone screen. Specifically, the avatar displayed by the application presents lines such as "Anaphylaxis is a strong allergic reaction." and "According to an allergy test conducted three years ago, there is no strong allergy other than sugi (Japanese cedar) and cypress pollen." in accordance with the currently applied policy (for example, high visibility, a large font size, and only basic words). By checking this supplementary information on the smartphone during the conversation with the doctor, the user can answer the doctor's questions confidently and give correct information smoothly without interrupting the communication.

Although the above flowchart is explained only in terms of the language level, the same applies to the knowledge level, the visual level, and the auditory level. The policy applicable to the conversation or chat (the knowledge level, visual level, and auditory level required to understand it) is determined and compared with the currently applied knowledge level, visual level, and auditory level. If it is determined that the user would find it difficult to understand the content and respond with confidence, understanding is supported by converting the conversation or chat according to the currently applied policy and conveying it in the same way as above.

FIG. 25 is a sequence diagram showing an example in which, unlike FIG. 23, the robot 100 summarizes the communication so far immediately before the communication ends. As shown in the figure, the user and the doctor who is the conversation partner exchange conversations 1 to 6. Because the processing up to this point overlaps with that described for FIG. 23, its explanation is omitted; note, however, that the cloud 101 analyzes the communication during the conversation and records its progress.

In step S2515, the cloud 101 determines, as a result of the communication analysis, that the conversation between the user and the doctor is about to end. In step S2516, it summarizes the conversation so far and organizes the conclusion. In step S2517, it expresses the summary and / or conclusion as conversation 7 (for example, "Take two antihypertensive drugs after a meal?") according to the policy, and requests the robot 100 to confirm it with the user and the doctor. In step S2518, the robot 100 utters the received conversation 7 and confirms whether the user and the doctor share a common understanding.

The approaching end of the conversation may be detected by methods such as the following: the context and flow of the conversation; the user or the other party summarizing the conversation so far; the user or the other party saying goodbye; the user starting to leave the room where the other party is; or the robot 100 asking the user or the other party whether the conversation is ending.

In step S2519, the doctor or the user responds to the summary or conclusion presented in conversation 7 with conversation 8 (for example, "yes, yes"). In step S2520, conversation 8 is also transmitted from the robot 100 to the cloud 101 as before. In step S2521, if conversation 8 clearly agrees with conversation 7, the cloud 101 ends the process. Otherwise, it generates conversation 9 to request an explanation and, in step S2522, instructs the robot 100 to utter it.

In step S2523, the robot 100 utters the received conversation 9 and asks the user and the doctor to correct or explain the summary or conclusion. In step S2524, the user or the doctor responds with conversation 10. In step S2526, if conversation 10 has clarified the summary and conclusion, the cloud 101 ends the process. Otherwise, it continues to ask for an explanation of the summary or conclusion, as with conversation 9.
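The confirmation loop of FIG. 25 can be sketched as follows. The callables `utter`, `listen`, and `agrees`, the wording of the follow-up request, and the `max_rounds` limit are assumptions for illustration; the description itself repeats the request until the summary and conclusion are clear.

```python
def confirm_summary(summary, utter, listen, agrees, max_rounds=3):
    """Utter the summary and repeat the request until a reply clearly agrees."""
    utter(summary)                       # corresponds to conversation 7
    for _ in range(max_rounds):
        reply = listen()                 # conversation 8, conversation 10, ...
        if agrees(reply):
            return True                  # common understanding confirmed
        utter("Could you correct or explain the summary?")  # like conversation 9
    return False                         # still unclear after the assumed limit
```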

FIG. 26 is a sequence diagram showing an example in which, unlike FIG. 25, the robot 100 summarizes the communication so far after the communication has ended. As shown in the figure, the user and the doctor who is the conversation partner exchange conversations 1 to 6. Because the processing up to this point overlaps with that described with reference to FIG. 25, its description is omitted.

In step S2616, the robot 100 detects that the conversation ended with conversation 6. The end may be detected by methods such as the following: the context and flow of the conversation; the user or the other party summarizing the conversation so far; the user or the other party saying goodbye; the user leaving the room where the other party is; a certain period of time elapsing after the user stops talking with the other party; or the robot 100 confirming with the user or the other party that the conversation has ended.

After confirming the end of the conversation, the cloud 101 conveys a summary and / or the conclusion of the conversation so far to the user via the robot 100, according to the policy currently applied to the user (steps S2617 to S2619). This makes it easy for the user to recall and correctly understand the content and result of the communication with the other party (step S2620).

FIG. 27 is a sequence diagram showing an example in which the summary is given only to the user after the communication has ended. As shown in the figure, the user and the doctor who is the conversation partner exchange conversations 1 to 6. Because the processing up to this point overlaps with that described with reference to FIG. 25, its description is omitted.

In step S2716, when the user's diagnosis is completed, the doctor inputs a medical record or a prescription into the information communication terminal 110 for medical record input. The medical record and the prescription are transmitted from the information communication terminal 110 to the medical record storage cloud 111, which manages the information used by the other party, and in step S2717, the cloud 111 records them. Further, in step S2718, the cloud 111 shares the medical record and prescription information with the cloud 101 that manages the user information. The cloud 101 stores this information securely.

The cloud 101 that manages the user information determines, from the input of new medical record or prescription information, that the conversation (diagnosis) between the user and the doctor has ended. Then, when it determines through the robot 100 that the user is relaxed and in a setting where privacy is assured, the cloud 101 has the robot 100 prompt the user to confirm the doctor's diagnosis and the prescription, expressed according to the currently applied policy (for example, "Because the blood pressure is high, I have to take two medicines after eating") (steps S2719 to S2721). This makes it easy for the user to recall and correctly understand the content and result of the communication with the other party (the doctor in this example) (step S2722).

In the above, the cloud 101 that manages the user information acts like an information bank in which various personal information about the user is recorded and managed. For example, by giving consent in advance, the user can arrange for personal information recorded in the medical record storage cloud 111 to be shared immediately with the cloud 101 whenever it is added or corrected. With this setting (prior permission of the person concerned), information including various personal information about the user is gathered in the cloud 101 and can be used for various purposes. An example of such use, in which the robot 100 supports the taking of medicines, will be described later with reference to FIG. 29.

FIG. 28 is a sequence diagram showing an example in which the summary is given only to the user after the communication has ended. As shown in the figure, the user and the doctor who is the conversation partner exchange conversations 1 to 6, but the robot 100 cannot detect the content of this conversation. This happens, for example, when the user goes to the hospital for a diagnosis and the robot 100 remains at the user's home.

The processing up to this point is almost the same as that described for FIG. 27, so its explanation is omitted. The difference is that the robot 100 cannot detect the communication between the user and the doctor. Therefore, neither the robot 100 nor the cloud 101 that manages the user information holds any information about the conversation.

In step S2807, the doctor inputs a medical record or prescription into the information communication terminal 110 for medical record input (not shown). In step S2808, these are recorded in the medical record storage cloud 111 that manages the information used by the other party. Until this information is shared with the cloud 101 that manages the user information in step S2809, no information is input to the robot 100 or the cloud 101.

However, once the medical examination result and the information about the prescription of the drug are input to the cloud 101 that manages the user information, the subsequent processing can proceed in the same manner as in FIG. 27.

FIG. 29 is an example of a table used when supporting the task execution of the user. As shown in the table, users have multiple routine tasks. These tasks are identified by a number for each task.

The task name of task 1 is "garbage removal", the date and time when this task occurs is "every Wednesday at 9 am", the place where this task occurs is "home", and the content of this task is "throw away the household trash". As shown in the check column, whether the user has performed task 1 can be determined by, for example, recognizing images captured through the video / audio sensing unit of the robot 100 or another imaging device to check whether the user took the garbage out and returned home without it, and by using the capture date and time.

Similarly, the task name of task 2 is "feeding", the date and time when this task occurs is "every day at 10 am", the place where this task occurs is indefinite, and the content of this task is "feed the pet". As shown in the check column, whether the user has performed task 2 can be determined by, for example, recognizing images captured through the camera of the robot 100 to check whether the user put pet food on a plate and the pet ate it, and by using the capture date and time.

Similarly, the task name of task 3 is "taking medicine", the date and time when this task occurs is "13:00 every day", the place where this task occurs is indefinite, and the content of this task is "take the medicine". Whether the user has performed task 3 can be determined by, for example, recognizing images captured by the camera of the robot 100 to check whether the user has taken all the medicines, and by using the capture date and time.
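The table of FIG. 29, as described above, could be held in a structure like the following sketch. The class name, the field names, and the shortened wording of the check items are assumptions; only the three tasks and their schedules come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutineTask:
    number: int      # identifies the task
    name: str
    schedule: str    # date and time when the task occurs
    place: str
    content: str
    check: str       # how execution is confirmed (check column)


TASKS = [
    RoutineTask(1, "garbage removal", "every Wednesday at 9 am", "home",
                "throw away the household trash",
                "user took the garbage out and returned home without it"),
    RoutineTask(2, "feeding", "every day at 10 am", "indefinite",
                "feed the pet",
                "user put pet food on a plate and the pet ate it"),
    RoutineTask(3, "taking medicine", "13:00 every day", "indefinite",
                "take the medicine",
                "user has taken all the medicines"),
]
```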

Although image recognition has been described as the means of confirming task execution, other methods may be used. In addition, depending on what is to be confirmed, the user may be asked to cooperate. For example, to check that all the medicines are taken with no omissions or duplications, the user may place all the medicines on the palm and show them to the robot 100 (video / audio sensing unit) before taking them. Alternatively, the robot 100 may ask the user to show all the medicines before taking them.

FIG. 30 is a sequence diagram showing an example of supporting the user's daily task execution. Here, the execution support for the three tasks illustrated in FIG. 29 will be described on the assumption that the sequence of FIG. 30 starts on a Wednesday morning.

Here, it is assumed that the robot 100 continuously senses the user state (for example, the user's biological information, activity amount, place, posture, etc.) (step S3001). Further, it is assumed that the cloud 101 that manages the user information periodically monitors for tasks performed by the user and for the arrival of the start time of a registered task as shown in FIG. 29 (step S3002).

In step S3003, it is assumed that the user executes task 1 before the scheduled time for task 1 (garbage removal) (9:00 am on Wednesday). In step S3004, the robot 100 detects and confirms whether the task 1 has been performed according to the confirmation items in the check column. When the robot 100 detects that the execution of the task 1 is completed, in step S3005, the robot 100 notifies the cloud 101 that the execution is completed. In step S3006, the cloud 101 records the execution of task 1.

Next, in step S3007, the cloud 101 detects that time has passed and the scheduled time for task 2 (feeding), 10 am every day, has come. In step S3008, the cloud 101 instructs the robot 100 to confirm the execution of task 2. In step S3009, the robot 100 asks the user whether task 2 has been executed. As a result, the user is reminded of task 2 and executes it (step S3010).

In step S3011, the robot 100 detects and confirms that the user has executed task 2 according to the confirmation items in the check column, and in step S3012, the robot 100 notifies the cloud 101 that the execution is completed. In step S3013, the cloud 101 records the execution of task 2.

Next, in step S3014, the cloud 101 detects that time has passed and the scheduled time for task 3 (taking medicine), 13:00 every day, has come. In step S3015, the cloud 101 instructs the robot 100 to confirm the execution of task 3. In step S3016, the robot 100 asks the user whether task 3 has been executed. As a result, the user is reminded of task 3 and executes it (step S3017).

In this example, unlike task 1 and task 2, it is assumed that the robot 100 cannot detect and confirm the execution of task 3 according to the confirmation items in the check column (step S3018). For example, the case where the medicine is taken in a place invisible to the robot 100 corresponds to this.

In step S3019, if the cloud 101 has not received a completion notification even after a predetermined time has elapsed since it instructed the robot 100 to confirm the execution of task 3, it instructs the robot 100 again, in step S3020, to confirm the execution of task 3. In step S3021, the robot 100 asks the user whether the execution of task 3 is completed, and in step S3022, the user replies that it has been completed. In step S3023, the robot 100 receives this answer and notifies the cloud 101 of it. In step S3024, the cloud 101 records the execution of task 3.
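The two confirmation paths of FIG. 30 (detection by the robot 100, and a direct question to the user when no completion notification arrives) can be sketched as follows. The function and parameter names are hypothetical, and the waiting period is abstracted away: `robot_detects` is assumed to return False when no completion was observed within the predetermined time.

```python
def supervise_task(task, robot_detects, user_confirms, execution_log):
    """Cloud-side confirmation of one scheduled task; returns how it was confirmed."""
    # First instruction: the robot prompts the user and watches for execution.
    if robot_detects(task):
        execution_log.append((task, "detected"))
        return "detected"
    # No completion notice arrived in time, so the robot asks the user directly.
    if user_confirms(task):
        execution_log.append((task, "reported by user"))
        return "reported by user"
    return None  # execution could not be confirmed; nothing is recorded
```

Recording how each execution was confirmed is a design choice of this sketch, not something the description requires.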

In this way, the user's daily tasks are registered in the format shown in FIG. 29, and the process of confirming whether the tasks are executed at the scheduled time as shown in FIG. 30 is carried out using the robot 100 and the cloud 101. This allows the user to perform miscellaneous and often forgotten tasks without omission.

Because of aging or dementia, the user may be unsure whether he / she has already performed a certain task. In this case, the user may ask the robot 100 whether the task has been performed and confirm it. Alternatively, when the user forgets having performed the task and tries to perform it again, the robot 100 may detect the user's action and inform the user that the task has already been performed. Since all the execution records of these tasks are stored in the cloud 101, the robot 100 can answer the user or convey the execution status of a task based on that data.
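Answering such a question from the stored execution records could look like the following sketch. The record format (task name paired with a completion time), the sample entries, and the function name are assumptions for illustration.

```python
from datetime import date, datetime


def was_performed(records, task_name, day):
    """records: iterable of (task name, completion datetime) pairs."""
    return any(name == task_name and done_at.date() == day
               for name, done_at in records)


# Hypothetical execution records as they might be stored in the cloud 101.
records = [
    ("garbage removal", datetime(2021, 10, 6, 8, 40)),
    ("feeding", datetime(2021, 10, 6, 10, 5)),
]
```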

In the above description, the information communication terminal 100 and the cloud 101 that manages user information are described as separate entities that communicate via a network, but the user information may instead be recorded and managed inside the information communication terminal 100. In this case, data corresponding to the user's personal information, such as conversation contents and communication ability, need not be exchanged via the network, which has the advantage of reducing the cost of information communication and information management and the risk of leakage.

This disclosure is useful as a technique that enables smooth communication with users.

99: Information and communication terminals (smartphones, personal computers, etc.)
100: Information and communication terminals (robots, smart speakers, wearable devices, etc.)
101: Cloud that manages user information
102: Information source
110: Information communication terminal (medical record input terminal)
111: Cloud that manages information handled by the other party (medical record storage cloud)

Claims

An information processing method in an information providing system including a device for communicating with a user, wherein the device includes a microphone and a speaker.
The first voice information including the first question from the user acquired by the microphone of the apparatus is acquired together with the apparatus ID for identifying the apparatus.
Based on the device ID, communication policy information including the hearing level corresponding to the user is acquired, the hearing level being an output format for output from the speaker that is set for the user via the user's information communication terminal or the device and is set in association with the user ID that identifies the user, and the user ID being associated with the device ID in the information providing system.
Generate the first answer information indicating the first answer to the first question,
Including that the first answer information is output to the speaker of the apparatus by designating an output format corresponding to the hearing level.
The output format includes volume and
The volume when the hearing level is the first level is louder than the volume when the hearing level is the second level higher than the first level.
Information processing method.
The output format further comprises at least one of speed or clarity.
The information processing method according to claim 1.
The communication policy information further includes a knowledge level corresponding to the user.
The knowledge level is a response level set for the user via the information communication terminal of the user or the device, and is set in association with the user ID.
The first answer information is generated according to the knowledge level based on the first voice information and the communication policy information.
The information processing method according to claim 1.
The knowledge level is set for each field and
The field includes at least one of social common sense, formal science, natural science, social science, humanities, and applied science.
The information processing method according to claim 3.
When the knowledge level is the first level, the first number of technical terms included in the 1-1 answer to the first question is
It is less than the second number of technical terms contained in the 1-2 answers to the first question when the knowledge level is a second level higher than the first level.
The information processing method according to claim 3.
When the knowledge level is the first level, the average first length of the sentences included in the 1-1 answer to the first question is
It is shorter than the average second length of the sentences contained in the 1-2 answers to the first question when the knowledge level is the second level higher than the first level.
The information processing method according to claim 3.
When the knowledge level is the first level, the total number of first characters of the 1-1 answer to the first question is
It is less than the total number of second characters in the 1-2 answers to the first question when the knowledge level is a second level higher than the first level.
The information processing method according to claim 3.
The communication policy information further includes a language level corresponding to the user.
The language level is a language level set for the user via the information communication terminal of the user or the device, and is set in association with the user ID.
The first answer information is generated according to the knowledge level and the language level.
The information processing method according to claim 3.
Second voice information including a second question is output to the device so that it is output to the user from the speaker of the device, the second question being different from the first question and used for updating the knowledge level.
Second answer information indicating the user's second answer to the second question is acquired from the device.
The knowledge level is updated based on the second answer information.
The information processing method according to claim 3.
The knowledge level is updated based on the correctness of the second answer or based on the time required from the output of the second voice information to the acquisition of the second answer information.
The information processing method according to claim 9.
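The claim leaves the update rule open, stating only that the knowledge level is updated based on the correctness of the second answer or on the response time. The sketch below shows one hypothetical rule that uses both signals together. The function name `update_knowledge_level`, the 5-second threshold, the step size of one level, and the level bounds of 1 to 5 are assumptions made for illustration.

```python
def update_knowledge_level(
    level: int,
    is_correct: bool,
    response_seconds: float,
    fast_threshold: float = 5.0,  # hypothetical cutoff for a "quick" answer
    min_level: int = 1,           # hypothetical lower bound
    max_level: int = 5,           # hypothetical upper bound
) -> int:
    """Return the updated knowledge level after one check question.

    Hypothetical rule: a quick correct answer raises the level by one,
    an incorrect answer lowers it by one, a slow correct answer keeps it.
    """
    if is_correct and response_seconds <= fast_threshold:
        level += 1
    elif not is_correct:
        level -= 1
    # Clamp the result to the assumed level range.
    return max(min_level, min(max_level, level))
```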
An information processing method in an information providing system including a device for communicating with a user, wherein the device includes a microphone and a display.
First voice information including a first question from the user, acquired by the microphone of the device, is acquired together with a device ID that identifies the device.
Based on the device ID, communication policy information including a visual level corresponding to the user is acquired. The visual level is an output format for output from the display, is set for the user via the user's information communication terminal or the device, and is set in association with a user ID that identifies the user, and the user ID is associated with the device ID in the information providing system.
First answer information indicating a first answer to the first question is generated.
An output format corresponding to the visual level is specified, and the first answer information is output to the display of the device.
The output format includes the display size of characters.
The display size of characters when the visual level is a first level is larger than the display size of characters when the visual level is a second level higher than the first level.
Information processing method.
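As a sketch of the size relationship stated in this claim, the character display size could be derived from the visual level by a simple decreasing function. The function name `character_size_for_visual_level`, the point sizes, and the three-level scale are assumptions made for illustration. The only property taken from the claim is that a lower visual level yields larger characters.

```python
def character_size_for_visual_level(
    level: int,
    base_pt: int = 16,   # hypothetical size at the highest visual level
    step_pt: int = 8,    # hypothetical increase per level below the highest
    max_level: int = 3,  # hypothetical number of visual levels
) -> int:
    """Return a character size in points that grows as the visual level drops."""
    if not 1 <= level <= max_level:
        raise ValueError(f"unsupported visual level: {level}")
    return base_pt + step_pt * (max_level - level)
```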
The output format further comprises at least one of character border modification, character color scheme, or character arrangement.
The information processing method according to claim 11.
The communication policy information further includes a knowledge level corresponding to the user.
The knowledge level is a response level set for the user via the information communication terminal of the user or the device, and is set in association with the user ID.
The first answer information is generated according to the knowledge level based on the first voice information and the communication policy information.
The information processing method according to claim 11.
The knowledge level is set for each field, and
the fields include at least one of social common sense, formal science, natural science, social science, humanities, and applied science.
The information processing method according to claim 13.
When the knowledge level is a first level, the first number of technical terms included in the 1-1 answer to the first question is less than the second number of technical terms included in the 1-2 answer to the first question when the knowledge level is a second level higher than the first level.
The information processing method according to claim 13.
When the knowledge level is a first level, the first average length of the sentences included in the 1-1 answer to the first question is shorter than the second average length of the sentences included in the 1-2 answer to the first question when the knowledge level is a second level higher than the first level.
The information processing method according to claim 13.
When the knowledge level is a first level, the first total number of characters in the 1-1 answer to the first question is less than the second total number of characters in the 1-2 answer to the first question when the knowledge level is a second level higher than the first level.
The information processing method according to claim 13.
The communication policy information further includes a language level corresponding to the user.
The language level is a language level set for the user via the information communication terminal of the user or the device, and is set in association with the user ID.
The first answer information is generated according to the knowledge level and the language level.
The information processing method according to claim 13.
The device further comprises a speaker.
Second voice information including a second question is output to the device so that it is output to the user from the speaker of the device, the second question being different from the first question and used for updating the knowledge level.
Second answer information indicating the user's second answer to the second question is acquired from the device.
The knowledge level is updated based on the second answer information.
The information processing method according to claim 13.
The knowledge level is updated based on the correctness of the second answer or based on the time required from the output of the second voice information to the acquisition of the second answer information.
The information processing method according to claim 19.