WO2015104883A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number
WO2015104883A1
WO2015104883A1 (PCT/JP2014/078111)
Authority
WO
WIPO (PCT)
Prior art keywords
information
user
expected value
output
attention
Prior art date
Application number
PCT/JP2014/078111
Other languages
French (fr)
Japanese (ja)
Inventor
淳己 大村
道成 河野
麗子 桐原
智 朝川
伊藤 洋子
Original Assignee
Sony Corporation
Priority date
Filing date
Publication date
Application filed by Sony Corporation
Publication of WO2015104883A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01Indexing scheme relating to G06F3/01
    • G06F2203/011Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns

Definitions

  • This disclosure relates to an information processing apparatus, an information processing method, and a program.
  • In Patent Document 1, morphological analysis is performed on each of a predetermined number of generated utterance content candidates, and independent words are extracted from each candidate. When the generated candidates include a candidate indicating silence or a candidate that contains no independent word, the input utterance is ignored, and the display device and the speaker are controlled so as not to output a response. As a result, a more appropriate response can be made when the input voice is rejected.
  • Patent Document 2 discloses a method and apparatus for quickly and accurately managing a dialogue between a human and an agent using voice information, facial expression information, and delay time information, and a voice dialogue system using the method and apparatus.
  • In this voice dialogue system, first dialogue order information is generated using dialogue information analyzed from the voice uttered by the user, together with facial expression information analyzed from an image of the user's face.
  • In these techniques, the presence or absence of a system response is controlled based on the content of the user's utterance, so the presentation of information to the user by voice can be optimized to some extent.
  • However, the user's utterances and facial expressions are only fragmentary material for estimating whether the information presented by the system at that time is appropriate for the user, so there was still room for improvement in this optimization.
  • Accordingly, the present disclosure proposes a new and improved information processing apparatus, information processing method, and program capable of responding more flexibly to user actions.
  • According to the present disclosure, there is provided an information processing apparatus including a processor configured to acquire data indicating a user's action, calculate, based on the acquired data, an expected value of attention directed to information output to the user, and provide the expected value for output control of the information.
  • According to the present disclosure, there is also provided an information processing method including acquiring data indicating a user's action, calculating, based on the acquired data, an expected value of attention directed to information output to the user, and providing the expected value for output control of the information.
  • According to the present disclosure, there is further provided a program for causing a computer to realize functions of acquiring data indicating a user's action, calculating, based on the acquired data, an expected value of attention directed to information output to the user, and providing the expected value for output control of the information. A minimal sketch of this pipeline is shown below.
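  • The following is a minimal sketch of the acquire-calculate-provide pipeline just described. It is not taken from the publication; all names (ActionData, AttentionEstimator, control_output) and the threshold value are illustrative assumptions.

```python
# Minimal sketch (not from the publication) of the acquire -> calculate -> provide
# pipeline described above. All names and values are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable


@dataclass
class ActionData:
    source: str    # e.g. "camera", "microphone", "action_recognition"
    payload: dict  # raw observation, e.g. {"gaze_on_terminal": True}


class AttentionEstimator:
    def __init__(self, calc_rule: Callable[[ActionData], float]):
        self.calc_rule = calc_rule  # stands in for the calculation rule DB 131

    def expected_attention(self, data: ActionData) -> float:
        # Calculate an expected value of attention from the acquired data.
        return self.calc_rule(data)


def control_output(expected: float, message: str, threshold: float = 5.0) -> None:
    # Provide the expected value for output control: output or suppress.
    if expected >= threshold:
        print(message)  # output the information
    # else: suppress (the information could be stocked instead)


estimator = AttentionEstimator(lambda d: 8.0 if d.payload.get("gaze_on_terminal") else 2.0)
data = ActionData("camera", {"gaze_on_terminal": True})
control_output(estimator.expected_attention(data), "Lunch at 12:00 today.")
```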
  • FIG. 1 is a block diagram illustrating a schematic functional configuration of a system according to an embodiment of the present disclosure.
  • FIG. 2 is a diagram illustrating a device configuration example of the system according to the embodiment.
  • FIG. 3 is a diagram illustrating an example of a calculation rule DB according to the embodiment.
  • FIG. 4 is a flowchart illustrating an example of selection of a display method according to the embodiment.
  • FIG. 5 is a flowchart illustrating an example of selection of an output method according to the embodiment.
  • FIG. 6 is a diagram showing the example of FIG. 5 more concretely.
  • FIG. 7 is a diagram illustrating an example of display color selection according to the embodiment.
  • FIG. 8 is a diagram for describing a first example of information stock according to the embodiment.
  • FIG. 9 is a diagram for describing a second example of information stock according to the embodiment.
  • FIG. 10 is a flowchart showing processing for the stock of information according to the embodiment.
  • FIG. 11 is a block diagram illustrating a hardware configuration example of an information processing apparatus according to the embodiment.
  • FIG. 1 is a block diagram illustrating a schematic functional configuration of a system according to an embodiment of the present disclosure.
  • Referring to FIG. 1, the system 10 includes a camera 101, a sensor 103, a microphone 105, an action data acquisition unit 107, an action DB 109, an action recognition server 111, an attention expectation value calculation unit 113, and an output control unit 115.
  • System 10 is used to present information to the user.
  • the system 10 may be for continuously presenting information to the same user via a terminal device worn or carried by the user.
  • Alternatively, the system 10 may present information via a stationary terminal device to unspecified users (some of whom may be identified) present in its vicinity.
  • The camera 101 photographs the user of the system 10. For example, the camera 101 can acquire image data indicating user actions.
  • The sensor 103 comprises various sensors directed at the user. For example, the sensor 103 includes an acceleration sensor, a gyro sensor, a geomagnetic sensor, a GPS receiver, and the like mounted on a terminal device worn or carried by the user. The sensor 103 may also include an ultrasonic sensor or an infrared sensor. For example, the sensor 103 can acquire sensor data indicating user actions.
  • The microphone 105 picks up sound generated in the vicinity of the user. For example, the microphone 105 acquires audio data indicating user actions. The microphone 105 may constitute a microphone array so that the direction of a sound source can be specified based on the audio data. Data acquired by some or all of the camera 101, the sensor 103, and the microphone 105 is provided to the action data acquisition unit 107.
  • The action data acquisition unit 107 may also acquire user action recognition information from the action recognition server 111. The action recognition server 111 may be included in the system 10, or may be a service outside the system 10.
  • The action recognition server 111 recognizes the user's behavior based on, for example, various known action recognition technologies. Data acquired by the sensor 103, the camera 101, and the microphone 105 of the system 10 can be used for this action recognition; in this case, the data is transmitted separately from the sensor 103, the camera 101, and the microphone 105 to the action recognition server 111 (paths not shown). The action recognition server 111 may also recognize the user's behavior based on sensor data acquired by a terminal device or the like outside the system 10.
  • The action data acquisition unit 107 is a data acquisition and management function realized by, for example, a processor of an information processing apparatus, and acquires and manages various data indicating user actions. As described above, data indicating user actions can be provided from the camera 101, the sensor 103, the microphone 105, and/or the action recognition server 111. Furthermore, the action data acquisition unit 107 may acquire information on operations performed by the user on a terminal device included in the system 10 or on other terminal devices; in this case, for example, keywords of information searches using a web browser, information on content used by the user, and the like can be acquired.
  • The action data acquisition unit 107 stores the provided data (hereinafter also referred to as action data) in the action DB 109 as necessary, and then provides the data to the attention expectation value calculation unit 113. The action data acquisition unit 107 may also provide action data to the feedback analysis unit 129.
  • Next, the user actions indicated by the action data will be described. For example, a user action may include the user's movement or facial expression. In this case, the attention expectation value calculation unit 113 can calculate the expected value of the user's attention directed to information to be output, based on the latest (or recent) action data.
  • A user action may also include a reaction to information output from the display 123 or the speaker 125. Such a reaction is used by the attention expectation value calculation unit 113 to calculate the expected value of the user's attention when information is next output, or by the feedback analysis unit 129 to verify the validity of the calculated expected value.
  • The action DB 109 is a database realized by, for example, a memory or storage of an information processing apparatus, and stores the action data acquired by the action data acquisition unit 107 temporarily or continuously.
  • The action data acquired by the action data acquisition unit 107 may be provided to the attention expectation value calculation unit 113 after being temporarily stored in the action DB 109, or without being stored in the action DB 109. In this case, the attention expectation value calculation unit 113 calculates the expected value based on the user's latest action.
  • Alternatively, the action data acquired by the action data acquisition unit 107 may be stored continuously in the action DB 109. In this case, the action data acquisition unit 107 reads data for a necessary period from the action data stored in the action DB 109 and provides it to the attention expectation value calculation unit 113, which calculates the expected value based on the history of the user's actions.
  • The attention expectation value calculation unit 113 is an arithmetic function realized by, for example, the processor of an information processing apparatus, and calculates, based on the action data acquired by the action data acquisition unit 107, an expected value of the attention directed to information output to the user. Here, attention means the degree of the user's attention directed to the output information. The attention expectation value calculation unit 113 provides the calculated expected value of attention to the output control unit 115 for output control of information.
  • The attention directed to the output information can vary depending on the user's situation. For example, when the user turns his or her gaze toward a terminal device including the display 123, or calls out to the terminal device, much attention is expected to be paid to the output information. On the other hand, when the user is in conversation with another user or is riding a train, there is a high possibility that little attention will be paid to information output by voice (it will not be heard).
  • In such cases, the attention expectation value calculation unit 113 estimates the user's situation based on the action data provided from the camera 101, the sensor 103, the microphone 105, the action recognition server 111, and the like, and can calculate an expected value of attention according to the situation. More specifically, the attention expectation value calculation unit 113 may calculate the expected value by referring to data in the calculation rule DB 131 that associates user actions with expected values of attention.
  • The attention directed to the output information may also vary depending on the content of the output information. For example, when a keyword in which the user is interested is displayed on the display 123 or output from the speaker 125, the user is expected to pay much attention to the output information.
  • In this case, the attention expectation value calculation unit 113 acquires the content of the information to be output from the information generation unit 117 and compares it with the user's interests estimated from the action data, for example keywords of information searches executed by the user in the past; if they match or have something in common, the expected value of attention can be raised. More specifically, the attention expectation value calculation unit 113 may calculate the expected value by referring to data in the calculation rule DB 131 in which actions related to information content are associated with expected values of attention.
  • When calculating the expected value of the user's attention based on the action data as described above, the attention expectation value calculation unit 113 may use only action data indicating the user's latest action, or may use action data indicating a history of the user's actions.
  • When the expected value is calculated based on the latest action, the amount of computation can be reduced, and the expected value can be calculated quickly with a low load. When the expected value is calculated based on the action history, the expected value can be calculated taking the context of the user's actions into account, which improves its accuracy. For example, when the expected value is calculated based on the action history, the expected value calculated by the attention expectation value calculation unit 113 may differ even if the user's latest action is the same.
  • Furthermore, the attention expectation value calculation unit 113 may correct the expected value of attention based on the accuracy of estimating the user's action from the action data. As described above, the action data includes, for example, image data of the user provided by the camera 101, sensing data of the user provided by the sensor 103, and audio data of the user's vicinity provided by the microphone 105. The attention expectation value calculation unit 113 estimates the user's action based on these data, but the accuracy of the estimation may vary from moment to moment.
  • The output control unit 115 is an arithmetic function realized by, for example, the processor of an information processing apparatus, and controls the output of information to the user based on the expected value of the user's attention calculated by the attention expectation value calculation unit 113. More specifically, the output control unit 115 may determine whether or not to output information based on the expected value of the user's attention. For example, for certain information generated by the information generation unit 117, the output control unit 115 refers to the expected value calculated by the attention expectation value calculation unit 113, suppresses output of the information when the expected value falls below a threshold value, and executes output of the information otherwise. The output control unit 115 may also select an information output method based on the expected value of the user's attention.
  • For example, with reference to the expected value calculated by the attention expectation value calculation unit 113, the output control unit 115 may output information as an image when the expected value is below a threshold value, and output it as sound when it is not.
  • The output control unit 115 may also output information that was not output because the expected value of the user's attention was low, once the expected value becomes high. As described above, the output control unit 115 suppresses the output of information when, for example, the expected value of the user's attention calculated by the attention expectation value calculation unit 113 falls below a threshold value. At such a time, the user may simply be temporarily busy and may wish to be provided with the suppressed information a little later. In such a case, the output control unit 115 may acquire the information that the information generation unit 117 temporarily stored in the information cache DB 119 as a result of the earlier suppression, and output it to the user.
  • The information generation unit 117 is an arithmetic function realized by, for example, the processor of an information processing apparatus, and generates information to be output to the user via the output control unit 115, for example based on information provided from the information server 121.
  • The information server 121 may be included in the system 10, or may be a service external to the system 10. For example, the information server 121 cooperates with the action recognition server 111 to push information for supporting the user's behavior (such as spot information near the user's current location, the user's schedule information, or traffic information) to the information generation unit 117. Also, the information server 121 may push notifications, such as an incoming message to the user or delivery of new information, to the information generation unit 117 in cooperation with other services provided on the terminal device. Alternatively, the information server 121 may transmit such information and notifications in response to a request transmitted automatically by the information generation unit 117 (regardless of a user operation).
  • The information generated by the information generation unit 117 is output to the user from the display 123, the speaker 125, and/or the other output device 127 under the control of the output control unit 115. Accordingly, the information generation unit 117 generates image data to be displayed on the display 123, audio data to be output by the speaker 125, and/or a control signal for operating the other output device 127. The information generation unit 117 may receive such data or signals from the information server 121 and output them as they are, or may generate them based on information received from the information server 121.
  • When an information output method is selected by the output control unit 115, the information generation unit 117 can generate information according to the selected output method. For example, when the output control unit 115 determines to output information via the display 123, the information generation unit 117 generates image data; when the output control unit 115 determines to output information via the speaker 125, the information generation unit 117 generates audio data. When output is suppressed, the information generation unit 117 temporarily stores the generated information in the information cache DB 119.
  • The information cache DB 119 is a database realized by, for example, a memory or storage of an information processing apparatus, and temporarily stores information generated by the information generation unit 117. In the information cache DB 119, information that has not been output as described above is stored, and output information may also be stored for a predetermined period, for example in case re-output is requested by the user.
  • As described above, the calculation of the expected value of the user's attention by the attention expectation value calculation unit 113 may be performed based on the content of the information to be output. In this case, the generated information, or information indicating its content, may be provided to the attention expectation value calculation unit 113 prior to output.
  • The display 123 displays images for the user of the system 10, and the speaker 125 outputs sound toward the user. The other output device 127 can include, for example, a light (illumination) and a vibrator, as described later. The output control unit 115 controls the output of information via these output devices based on the expected value of the user's attention calculated by the attention expectation value calculation unit 113.
  • The camera 101, the sensor 103, and/or the microphone 105 (hereinafter also collectively referred to as the input device) and the display 123, the speaker 125, and/or the other output device 127 (hereinafter also collectively referred to as the output device) are provided in, for example, the same terminal device, or in terminal devices whose positional relationship is fixed.
  • The feedback analysis unit 129 is an arithmetic function realized by, for example, the processor of an information processing apparatus, and analyzes user actions as feedback on the information output control performed by the output control unit 115. For example, when information is output under the control of the output control unit 115 based on the expected value calculated by the attention expectation value calculation unit 113, whether the output control was appropriate can be inferred from data indicating the user's reaction to the information. For example, when the user completely ignores the output information, it is estimated that the user's actual attention was lower than the calculated expected value.
  • The feedback analysis unit 129 may modify the calculation rules stored in the calculation rule DB 131 based on the analysis result. The feedback analysis unit 129 may also correct parameters and the like used in the calculation processing of the attention expectation value calculation unit 113 based on the analysis result.
  • The calculation rule DB 131 is a database realized by, for example, a memory or storage of an information processing apparatus, and stores data that associates user actions with expected values of attention. The data in the calculation rule DB 131 may be prepared in advance, and may also be corrected based on the results of analysis by the feedback analysis unit 129. When such corrections are repeated, it can be said that the data in the calculation rule DB 131 is formed by learning based on the user's reactions to output information.
  • For example, the attention expectation value calculation unit 113 refers to the calculation rule DB 131 based on the user action indicated by the action data acquired by the action data acquisition unit 107, and acquires a score indicating the expected value of attention. When a plurality of actions are detected, the attention expectation value calculation unit 113 may refer to the calculation rule DB 131 for each action and calculate the expected value of attention by weighting and adding the scores obtained. For example, a plurality of actions, such as "riding a train" and "in conversation with another user", may be detected redundantly based on different input data. A simple sketch of such a weighted combination is shown below.
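  • The following is a hypothetical sketch of the weighted addition just described, assuming the calculation rule DB can be reduced to a score table. The scores and weights are illustrative, not values from the publication.

```python
# Hypothetical sketch: when several actions are detected at once, each
# contributes its attention score with a weight (e.g. detection confidence).
RULE_DB = {          # action -> attention score (0..10), cf. calculation rule DB 131
    "calling": 9.0,
    "conversation": 0.5,
    "riding_train": 1.0,
}

def expected_attention(detected: dict[str, float]) -> float:
    """detected maps an action name to a weight (e.g. detection confidence)."""
    total_weight = sum(detected.values())
    if total_weight == 0:
        return 5.0  # neutral average when nothing is detected
    weighted = sum(RULE_DB[action] * w for action, w in detected.items())
    return weighted / total_weight

# "riding a train" and "in conversation" detected redundantly from different inputs:
print(expected_attention({"riding_train": 0.7, "conversation": 0.9}))  # low value
```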
  • FIG. 2 is a diagram illustrating a device configuration example of a system according to an embodiment of the present disclosure.
  • As illustrated, the system 10 may include a terminal device 151 and a server 152. The server 152 may include a plurality of servers, like the servers 152a and 152b in the illustrated example.
  • The terminal device 151 has, for example, a function of outputting information to the user, a function of acquiring data indicating the user's actions, and a function of exchanging information and data with the server 152. The terminal device 151 can be, for example, a smartphone, a wearable terminal, a tablet terminal, a personal computer, a television, a game machine, or the like. The terminal device 151 may be carried by a specific user, or may be a stationary type used by unspecified users. The terminal device 151 is realized by, for example, the hardware configuration of an information processing apparatus described later.
  • The server 152 has, for example, a function of processing data received from the terminal device 151 and a function of transmitting information to be output by the terminal device 151. The server 152 is realized by, for example, one or more server devices on a network, each of which is realized by the hardware configuration of an information processing apparatus described later.
  • In the system 10, for example, the camera 101, the sensor 103, the microphone 105, the display 123, the speaker 125, and the other output device 127 are realized in the terminal device 151, and the other functional configurations are realized in the server 152.
  • When the server 152 includes a plurality of servers, for example, the action data acquisition unit 107 may be realized in the server 152b, and the attention expectation value calculation unit 113 and the output control unit 115 may be realized in the server 152a.
  • Alternatively, only the action recognition server 111 and the information server 121 may be realized in the server 152, and the other functional configurations may be realized in the terminal device 151. In this case, the information processing apparatus according to the present embodiment, for example the information processing apparatus implementing the attention expectation value calculation unit 113 and the output control unit 115, can be the terminal device 151.
  • The server 152 may include a plurality of servers (not limited to two as in the illustrated example; there may be three or more). Alternatively, the entire functional configuration of the system 10 shown in FIG. 1 (excluding the action recognition server 111 and the information server 121) may be realized in the terminal device 151. In this case, the system 10 need not include the server 152.
  • FIG. 3 is a diagram illustrating an example of a calculation rule DB according to an embodiment of the present disclosure.
  • FIG. 3 shows records 131a to 131e, which associate an action, an attention score, a source, and a condition, as examples of the data stored in the calculation rule DB.
  • The action is the user action specified when the action data acquired by the action data acquisition unit 107 satisfies a predetermined condition. The attention score is a score corresponding to the expected value of attention directed to information output to the user when each action is specified. The source is the provider of the action data used to specify each action, and the condition is the condition that the action data provided by the source should satisfy in order for each action to be specified.
  • For example, the record 131a defines that the user action "turning the gaze" is specified when the camera 101 detects that the user has turned his or her gaze toward the terminal device, and that an attention score of 8.0 is given in that case. In the illustrated example, the attention score is defined in the range of 0 to 10, so 8.0 represents a relatively high expected value of attention. Various known techniques can be used for the image processing executed in the attention expectation value calculation unit 113 to detect the user's gaze, so a detailed description is omitted.
  • Similarly, the record 131b defines that the user action "calling" is specified when the user's utterance is detected by the microphone 105 and no utterance of another user is detected, and that an attention score of 9.0 is given. Various known techniques can likewise be used for the audio processing executed in the attention expectation value calculation unit 113 to distinguish the user's utterance from the utterances of other users, so a detailed description is omitted.
  • The attention expectation value calculation unit 113 may also detect the content of the user's utterance using various known techniques. In this case, for example, the condition for specifying the user action "calling" may be defined as the user uttering a predetermined calling phrase, for example "Hey".
  • Similarly, the record 131c, the record 131d, the record 131e, and other records each associate a user action, an attention score, and the source and condition for specifying the action.
  • For example, the record 131c defines that when the utterance of another user is detected in addition to the user's own utterance, the different action "conversation (with another user)" is specified, with a much lower attention score (0.5 for "conversation" compared to 9.0 for "calling").
  • The record 131d defines that the action "riding a train" is specified when the user's action of riding a train is recognized based on action recognition data provided from the action recognition server 111.
  • The record 131e defines that the user action "searched in the past" is specified when the content of the information scheduled to be output by the information generation unit 117 includes a search keyword indicated by the search history acquired by the action data acquisition unit 107.
  • A record such as the record 131e, which associates an action related to the information content with an expected value of attention, may be stored in the calculation rule DB 131 in a common format with the data that associates other actions with expected values of attention. Alternatively, the data that associates information content with expected values of attention may be stored in the calculation rule DB 131 in a format different from the data that associates actions with expected values of attention. A data-structure sketch of such records is shown below.
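  • The following sketch renders records like 131a to 131e as a data structure, mirroring the fields of FIG. 3 (action, attention score, source, condition). Modeling the condition as a predicate over action data is our assumption, not the publication's format.

```python
# Hypothetical rendering of calculation rule DB records (cf. FIG. 3).
from dataclasses import dataclass
from typing import Callable

@dataclass
class CalcRule:
    action: str
    score: float                      # attention score, 0..10
    source: str                       # device or service providing the action data
    condition: Callable[[dict], bool] # predicate over the action data (assumption)

RULES = [
    CalcRule("turning the gaze", 8.0, "camera",
             lambda d: d.get("gaze_on_terminal", False)),
    CalcRule("calling", 9.0, "microphone",
             lambda d: d.get("user_speech", False) and not d.get("other_speech", False)),
    CalcRule("conversation", 0.5, "microphone",
             lambda d: d.get("user_speech", False) and d.get("other_speech", False)),
]

def specify_actions(action_data: dict) -> list[CalcRule]:
    # Return every rule whose condition the action data satisfies.
    return [r for r in RULES if r.condition(action_data)]

for rule in specify_actions({"user_speech": True, "other_speech": True}):
    print(rule.action, rule.score)  # -> conversation 0.5
```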
  • The attention expectation value calculation unit 113 calculates the expected value of attention directed to information output to the user with reference to a calculation rule DB 131 like the one in the illustrated example. The attention expectation value calculation unit 113 may use the attention score in the illustrated example as the expected value of attention as it is, or, when a plurality of actions are detected redundantly, may calculate the expected value of attention by weighting and adding the attention scores.
  • Furthermore, the attention expectation value calculation unit 113 may estimate the user's action based on the action data and adjust the expected value of attention based on the accuracy of the estimation. More specifically, the expected value of attention may be adjusted to approach the average value when the estimation accuracy is low. For example, for an action whose attention score is higher than the average value (5.0), such as "turning the gaze" or "calling", the attention score may be temporarily lowered when the accuracy of the estimation based on the action data is determined to be low (for example, when the image analysis indicates that the probability that the user is looking at the terminal device is dominant but not very high).
  • Conversely, for an action whose attention score is lower than the average value, the attention score may be temporarily raised when the accuracy of the estimation based on the action data is determined to be low. This processing treats the specified action as less reliable when the estimation accuracy is low, and brings the calculated expected value of attention closer to the value used when no action is specified. In another example, the attention score may simply be lowered uniformly when the estimation accuracy is low. A sketch of this adjustment appears below.
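  • The following sketch illustrates one plausible form of this adjustment, pulling the score toward the average (5.0) in proportion to the estimation confidence. The linear interpolation is an assumption; the publication gives no formula.

```python
# Sketch of the accuracy-based adjustment: low estimation confidence pulls the
# attention score toward the average, i.e. toward "no action specified".
AVERAGE = 5.0

def adjust_score(score: float, confidence: float) -> float:
    """confidence in [0, 1]: 1.0 keeps the score, 0.0 yields the average."""
    return AVERAGE + confidence * (score - AVERAGE)

print(adjust_score(9.0, 0.9))  # "calling" detected confidently -> 8.6
print(adjust_score(9.0, 0.3))  # detected with low confidence   -> 6.2
print(adjust_score(0.5, 0.3))  # low-score action, low confidence -> 3.65
```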
  • Next, specific examples of calculating the expected value of the user's attention in the present embodiment will be described. The following examples may be realized, for example, by the processing logic of the attention expectation value calculation unit 113, or by the data stored in the calculation rule DB 131.
  • For example, the attention expectation value calculation unit 113 may raise the calculated expected value of attention when the user's utterance (call) includes a specific word or phrase. More specifically, the calculated expected value of attention is raised when utterances such as "Hurry", "Hey", or "Answer me" are combined with another action (for example, turning the gaze). The expected value of attention can also be raised by a specific command (such as one including the name of the system) or by the action of pointing at the terminal device. Using this, the user can control the system so that it responds readily to calls.
  • The attention expectation value calculation unit 113 may also calculate the expected value of attention based on the surrounding environment estimated as part of the user's action. More specifically, for example, when it is estimated that the user is alone, the attention expectation value calculation unit 113 may raise the calculated expected value of attention. Since a user is unlikely to talk to himself or herself except when on the phone, when the user's utterance is detected while alone, it is estimated to be more likely a call to the system than when another user is present. On the other hand, the attention expectation value calculation unit 113 may lower the calculated expected value when the user's utterance is detected in a noisy environment, for example with television sound or train noise. Alternatively, the calculated expected value need not be lowered even in a noisy environment.
  • The attention expectation value calculation unit 113 may also raise the calculated expected value of attention when the user repeatedly utters the same content. In this case, it is highly likely that the user is requesting some kind of response from the system, so regardless of whether there was an audio output from the system beforehand, the expected value of the user's attention directed to output information is estimated to be high.
  • The attention expectation value calculation unit 113 may also raise the expected value of attention calculated based on subsequent user actions when one or more dialogues have already taken place between the system and the user. This is because user actions following an interaction with the system are presumed likely to be related to the information output from the system. However, when the user leaves the terminal device that provides information after the dialogue, for example, the expected value of attention can be low.
  • The attention expectation value calculation unit 113 may also raise the calculated expected value of attention when the user engages in a characteristic dialogue with the system. More specifically, when the user uses a dialect, speaks loudly, or adds a particular keyword at the beginning or end of an utterance, it is presumed that the user spoke in a manner directed at the system, and the expected value of attention can be high.
  • The attention expectation value calculation unit 113 may also calculate the expected value of attention for each output method according to the user's state. For example, when it is specified from the action recognition result that the user is working, in a meeting, or moving on a train, the attention expectation value calculation unit 113 may lower the expected value of attention calculated for audio output. On the other hand, in this case, the attention expectation value calculation unit 113 may raise the expected value of attention calculated for the user's gestures or simple actions (for example, tapping or shaking the terminal device). When the user is sleeping, it is presumed that there is no intentional action from the user, so the expected value of attention calculated when any action is detected may be lowered or uniformly set to 0. However, this does not apply when, for example, the user's sleep state, such as sleep phase, pulse, or depth of sleep, is being detected.
  • The attention expectation value calculation unit 113 may also correct the calculated expected value of attention according to words included in the user's utterance. For example, when a specific person (for example, a family member, a friend, or the user's boss) or specific content (for example, an anniversary, the return date of a borrowed book, or the submission date of official documents) is included in the user's utterance, the attention expectation value calculation unit 113 may determine that the user is having a conversation of high importance and raise the calculated expected value of attention. Thereby, for example, important information that the user should not forget can be brought to the user's attention again.
  • FIG. 4 is a flowchart illustrating an example of selection of a display method according to an embodiment of the present disclosure.
  • In the example of FIG. 4, the output control unit 115 first determines whether or not the expected value of attention calculated by the attention expectation value calculation unit 113 exceeds a first threshold th1 (S101).
  • When the expected value exceeds the first threshold th1, the output control unit 115 causes the display 123 to display information in a window shown in the forefront (S103). This is the processing when the user's attention to the output information is estimated to be highest (the user pays close attention to the output information). If information is displayed in the foreground window, the user can obtain a large amount of information immediately.
  • When the expected value does not exceed the first threshold th1, the output control unit 115 next determines whether or not the expected value exceeds a second threshold th2 (S105). Here, the second threshold th2 is smaller than the first threshold th1.
  • When the expected value exceeds the second threshold th2, the output control unit 115 causes the display 123 to display information in a pop-up window (S107). This is the processing when the user's attention to the output information is estimated to be moderate (the user may or may not pay attention to the output information). If information is displayed in a pop-up window, it does not get in the way even if the user does not need the information.
  • When the expected value does not exceed the second threshold th2 either, the output control unit 115 ends the process without outputting information; that is, the output control unit 115 suppresses the output of information. This is the processing when the user's attention to the output information is estimated to be low (the user pays little attention to the output information, which may even be a nuisance). Information that is not output in this case may be stored in the information cache DB 119 and output later. A code sketch covering this selection together with that of FIG. 5 follows the description of FIG. 5.
  • FIG. 5 is a flowchart illustrating an example of selecting an output method according to an embodiment of the present disclosure.
  • In the example of FIG. 5, the output control unit 115 first determines whether or not the expected value of attention calculated by the attention expectation value calculation unit 113 exceeds a first threshold th1 (S151).
  • When the expected value exceeds the first threshold th1, the output control unit 115 causes both the display 123 and the speaker 125 to output information (S153). This is the processing when the user's attention to the output information is estimated to be highest (the user pays close attention to the output information). If information is output using both the display 123 and the speaker 125, the user can obtain a large amount of information in a short time.
  • When the expected value does not exceed the first threshold th1, the output control unit 115 next determines whether or not the expected value exceeds a second threshold th2 (S155). Here, the second threshold th2 is smaller than the first threshold th1. When the expected value exceeds the second threshold th2, the output control unit 115 outputs information using only the display 123 (S157). This is the processing when the user's attention to the output information is estimated to be moderate (the user may or may not pay attention to the output information). If information is output using only the display 123, it does not get in the way even if the user does not need the information.
  • When the expected value does not exceed the second threshold th2 either, the output control unit 115 ends the process without outputting information; that is, the output control unit 115 suppresses the output of information. This is the processing when the user's attention to the output information is estimated to be low (the user pays little attention to the output information, which may even be a nuisance). As in the example of FIG. 4, information that was not output may be stored in the information cache DB 119 and output later. A sketch of this two-threshold selection is shown below.
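  • The following sketch covers the two-threshold selection of both FIG. 4 and FIG. 5. The threshold values and return labels are assumptions; the publication defines only the ordering th2 < th1 and the three outcomes.

```python
# Sketch of the two-threshold output selection in FIGS. 4 and 5.
TH1, TH2 = 7.0, 4.0  # assumed values; th2 < th1

def select_display_method(expected: float) -> str:  # FIG. 4
    if expected > TH1:
        return "foreground window"  # S103: highest attention expected
    if expected > TH2:
        return "pop-up window"      # S107: moderate attention expected
    return "suppress"               # no output; information may be stocked

def select_output_method(expected: float) -> str:   # FIG. 5
    if expected > TH1:
        return "display + speaker"  # S153
    if expected > TH2:
        return "display only"       # S157
    return "suppress"

print(select_display_method(8.5))  # foreground window
print(select_output_method(5.0))   # display only
```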
  • FIG. 6 is a diagram showing the example of FIG. 5 more specifically.
  • In the example of FIG. 6, the user is first in conversation with another user. The attention expectation value calculation unit 113 specifies the user's action based on, for example, audio data acquired by the microphone 105, and calculates a relatively low expected value by referring to data in the calculation rule DB 131 such as that illustrated in FIG. 3. In the illustrated example, this expected value falls between the first threshold th1 and the second threshold th2. Therefore, the processing of S157 in the flowchart of FIG. 5 is executed, and information is output using only the display 123.
  • Next, the attention expectation value calculation unit 113 specifies a user action based on, for example, audio data acquired by the microphone 105, and similarly calculates a relatively high expected value by referring to the data in the calculation rule DB 131. In the illustrated example, this expected value exceeds the first threshold th1. Therefore, the processing of S153 in the flowchart of FIG. 5 is executed, and information is output using both the display on the display 123 and the sound 125v output from the speaker 125, as shown in (c).
  • In the illustrated example, information corresponding to the content of the user's conversation is output. Such information is generated by, for example, the information generation unit 117 specifying the content of the user's utterance based on the audio data acquired by the microphone 105 and acquiring information related to that content from the information server 121. The attention expectation value calculation unit 113 may raise the calculated expected value of attention when the content of the information to be output is included in the content of the user's utterance.
  • FIG. 7 is a diagram illustrating an example of display color selection according to an embodiment of the present disclosure.
  • In the example of FIG. 7, the user is walking in the city wearing a bracelet-type wearable terminal device, and passes near a certain store (SHOP). This store is related to a search keyword ("Italian") used in an information search the user executed earlier (it is an Italian restaurant).
  • In this case, the information generation unit 117 estimates, from the user's position information specified by the GPS receiver included in the sensor 103 and the user's information search history previously acquired by the action data acquisition unit 107, that the store may be an object of interest to the user, and generates information for notifying the user that the store (SHOP) is nearby.
  • In another example, the relationship between the store (SHOP) and the user may be estimated using user profile information held by an external service. For example, a restaurant information service holds the user's bookmarks of store information, a search history of store information, store information registered by other users with similar attributes, and the like; a social media service holds evaluations and the like that the user has expressed on social media about services and stores. The information generation unit 117 may estimate the relationship between the store (SHOP) and the user based on such information, and generate information for notifying the user that the store (SHOP) is nearby based further on the user's position information.
  • In the illustrated example, the notification information to the user is output by a light (illumination) included in the other output device 127. Here, the output control unit 115 may change the display color of the light, as shown in (a) to (c) of FIG. 7, according to the expected value of attention calculated by the attention expectation value calculation unit 113.
  • For example, the output control unit 115 may light the illumination in a conspicuous color when the expected value of the user's attention is high, and light it in a plain color, or not light it at all, when the expected value is low. Conversely, the output control unit 115 may light the illumination in a plain color when the expected value of the user's attention is high (the user is likely to notice it anyway), and in a conspicuous color when the expected value is low (the user is likely not to have noticed yet). A sketch of such a color mapping is shown below.
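  • The following sketch illustrates such a color mapping. The concrete colors and the threshold are assumptions, since the publication distinguishes only "conspicuous" and "plain" colors.

```python
# Sketch of the display-color selection of FIG. 7, covering both strategies:
# conspicuous when attention is high, or (conversely) conspicuous when the
# user has likely not noticed yet.
def notification_color(expected: float, threshold: float = 5.0,
                       alert_when_unnoticed: bool = False) -> str:
    high = expected >= threshold
    if alert_when_unnoticed:
        # Conversely: conspicuous when the user has likely NOT noticed yet.
        return "plain gray" if high else "conspicuous red"
    return "conspicuous red" if high else "plain gray"

print(notification_color(8.0))                             # conspicuous red
print(notification_color(2.0, alert_when_unnoticed=True))  # conspicuous red
```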
  • In FIGS. 8 and 9, the state of communication with the user as detected by the system is represented by an indicator, shown at the lower right of each figure. The indicator may actually be displayed, for example by a light provided as the other output device 127 of the terminal device, or it may be interpreted as being shown in FIGS. 8 and 9 for explanation only (not actually displayed).
  • FIG. 8 is a diagram for describing a first example of information stock according to an embodiment of the present disclosure.
  • In the example of FIG. 8, the system first detects that the user is speaking. However, the system could not correctly detect the content of the user's utterance, and the expected value of the user's attention to output information was calculated to be low, so the system did not output the information and stocked it instead.
  • Next, the user notices that there is no response from the system and calls out to it ("Hey!"). The system, correctly detecting this call from the user, estimates that the expected value of the user's attention is high, and outputs the stocked information. More specifically, the system provides the information to the user by means of the sound 125v output from the speaker 125: "I'm sorry. Today is lunch in Osaki."
  • FIG. 9 is a diagram for describing a second example of information stock according to an embodiment of the present disclosure.
  • In the example of FIG. 9 as well, the system first detects that the user is speaking. Here, the system estimated that the user was in conversation with another user and that the expected value of the user's attention to output information was low, so the information generated by the information generation unit 117 was not output and was stocked.
  • In this case, the user did not actually need information from the system (the user was merely asking about today's schedule in conversation with another user), so the system's decision to stock the information was correct.
  • FIG. 10 is a flowchart showing processing for the stock of information according to an embodiment of the present disclosure.
  • In the example of FIG. 10, the output control unit 115 first determines whether or not the expected value of attention calculated by the attention expectation value calculation unit 113 exceeds a first threshold th1 (S201).
  • When the expected value exceeds the first threshold th1, the output control unit 115 further determines whether or not there is stocked information in the information cache DB 119 (S203). When stocked information exists, the output control unit 115 outputs the stocked information (S205); this is, for example, the processing shown in (c) of the example of FIG. 8. When no stocked information exists, the output control unit 115 outputs the information generated by the information generation unit 117 (S207).
  • When the expected value does not exceed the first threshold th1, the output control unit 115 determines whether or not the expected value exceeds a second threshold th2 (S209). Here, the second threshold th2 is smaller than the first threshold th1.
  • When the expected value exceeds the second threshold th2, the output control unit 115 outputs the information generated by the information generation unit 117 (S207). That is, in the illustrated example, when the expected value of attention is between the first threshold th1 and the second threshold th2, the stocked information is not output, but newly generated information is output.
  • When the expected value does not exceed the second threshold th2 either, the output control unit 115 stocks the information (S211). A sketch of this flow is shown below.
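  • The following sketch traces the flow of FIG. 10 (S201 to S211), with a simple list standing in for the information cache DB 119. The threshold values are assumptions.

```python
# Sketch of the information-stock flow of FIG. 10.
TH1, TH2 = 7.0, 4.0
stock: list[str] = []  # stands in for the information cache DB 119

def handle(expected: float, new_info: str) -> None:
    if expected > TH1:                        # S201: yes
        if stock:                             # S203: stocked information exists
            print("output:", stock.pop(0))    # S205 (handling of new_info here
        else:                                 #  is not specified in FIG. 10)
            print("output:", new_info)        # S207
    elif expected > TH2:                      # S209: yes
        print("output:", new_info)            # S207 (stock is kept, not output)
    else:
        stock.append(new_info)                # S211: stock the information

handle(2.0, "Today is lunch in Osaki.")  # stocked (user in conversation)
handle(9.0, "New message arrived.")      # outputs the stocked lunch information
```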
  • The information stock examples described above can be modified in various ways.
  • Possible actions by which the user retrieves stocked information from the system include, for example, calling out, saying the same thing again, staring at (looking at) the terminal device, turning the face toward it, operating a button on the terminal device, clapping hands, and falling silent (waiting for a response from the system). Some or all of these actions may be registered as actions for retrieving stocked information.
  • When outputting stocked information, the system may add an apology message for not having responded correctly (for example, the system response in (c) of the example of FIG. 8). Alternatively, the system may confirm the recognized user action, for example by asking "Weren't you talking to another user?", and output the stocked information when the user still requests the information.
  • Alternatively, the stocked information may be displayed as a list on the display 123 so that the user can select which information to output.
  • The stocked information can be discarded when a predetermined time has elapsed, as in the example of FIG. 9 described above, and that time can be set arbitrarily. For example, depending on the content of the information, a time from several minutes to several hours or several days may be set as the time until the stocked information is discarded.
  • FIG. 11 is a block diagram illustrating a hardware configuration example of the information processing apparatus according to the embodiment of the present disclosure.
  • The illustrated information processing apparatus 900 can realize, for example, the terminal device or the server in the embodiment described above.
  • The information processing apparatus 900 includes a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 903, and a RAM (Random Access Memory) 905. The information processing apparatus 900 may also include a host bus 907, a bridge 909, an external bus 911, an interface 913, an input device 915, an output device 917, a storage device 919, a drive 921, a connection port 923, and a communication device 925. Furthermore, the information processing apparatus 900 may include an imaging device 933 and a sensor 935 as necessary. The information processing apparatus 900 may include a processing circuit such as a DSP (Digital Signal Processor) or an ASIC (Application Specific Integrated Circuit) instead of, or in addition to, the CPU 901.
  • The CPU 901 functions as an arithmetic processing device and a control device, and controls all or part of the operations in the information processing apparatus 900 according to various programs recorded in the ROM 903, the RAM 905, the storage device 919, or a removable recording medium 927. The ROM 903 stores programs, calculation parameters, and the like used by the CPU 901. The RAM 905 temporarily stores programs used in the execution of the CPU 901, parameters that change as appropriate during that execution, and the like. The CPU 901, the ROM 903, and the RAM 905 are connected to one another by a host bus 907 configured by an internal bus such as a CPU bus. The host bus 907 is further connected to an external bus 911 such as a PCI (Peripheral Component Interconnect/Interface) bus via a bridge 909.
  • The input device 915 is a device operated by the user, such as a mouse, a keyboard, a touch panel, buttons, switches, and levers. The input device 915 may be, for example, a remote control device using infrared rays or other radio waves, or an externally connected device 929 such as a mobile phone that supports the operation of the information processing apparatus 900. The input device 915 includes an input control circuit that generates an input signal based on information input by the user and outputs the input signal to the CPU 901. By operating the input device 915, the user inputs various data to the information processing apparatus 900 and instructs it to perform processing operations.
  • The output device 917 is a device that can notify the user of acquired information visually or audibly. The output device 917 can be, for example, a display device such as an LCD (Liquid Crystal Display), a PDP (Plasma Display Panel), or an organic EL (Electro-Luminescence) display, an audio output device such as a speaker or headphones, or a printer device. The output device 917 outputs results obtained by the processing of the information processing apparatus 900 as video such as text or images, or as audio such as voice or other sounds.
  • The storage device 919 is a data storage device configured as an example of a storage unit of the information processing apparatus 900. The storage device 919 is, for example, a magnetic storage device such as an HDD (Hard Disk Drive), a semiconductor storage device, an optical storage device, or a magneto-optical storage device. The storage device 919 stores programs executed by the CPU 901, various data, various data acquired from outside, and the like.
  • The drive 921 is a reader/writer for a removable recording medium 927 such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, and is built into or externally attached to the information processing apparatus 900. The drive 921 reads information recorded on the attached removable recording medium 927 and outputs it to the RAM 905. The drive 921 also writes records to the attached removable recording medium 927.
  • The connection port 923 is a port for connecting a device directly to the information processing apparatus 900. The connection port 923 can be, for example, a USB (Universal Serial Bus) port, an IEEE 1394 port, or a SCSI (Small Computer System Interface) port. The connection port 923 may also be an RS-232C port, an optical audio terminal, an HDMI (registered trademark) (High-Definition Multimedia Interface) port, or the like.
  • The communication device 925 is a communication interface configured by, for example, a communication device for connecting to a communication network 931. The communication device 925 can be, for example, a communication card for a wired or wireless LAN (Local Area Network), Bluetooth (registered trademark), or WUSB (Wireless USB). The communication device 925 may also be a router for optical communication, a router for ADSL (Asymmetric Digital Subscriber Line), or a modem for various kinds of communication. The communication device 925 transmits and receives signals and the like to and from the Internet and other communication devices using a predetermined protocol such as TCP/IP. The communication network 931 connected to the communication device 925 is a network connected by wire or wirelessly, and is, for example, the Internet, a home LAN, infrared communication, radio wave communication, or satellite communication.
  • The imaging device 933 is a device that captures images of real space using various members, such as an imaging element, for example a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, and a lens for controlling the formation of a subject image on the imaging element, and generates captured images. The imaging device 933 may capture still images or moving images.
  • The sensor 935 comprises various sensors, such as an acceleration sensor, a gyro sensor, a geomagnetic sensor, an optical sensor, and a sound sensor. The sensor 935 acquires information about the state of the information processing apparatus 900 itself, such as the attitude of the information processing apparatus 900, and information about the surrounding environment of the information processing apparatus 900, such as the brightness and noise around it. The sensor 935 may also include a GPS sensor that receives GPS (Global Positioning System) signals and measures the latitude, longitude, and altitude of the apparatus.
  • Each component described above may be configured using a general-purpose member, or may be configured by hardware specialized for the function of each component. Such a configuration can be appropriately changed according to the technical level at the time of implementation.
  • Embodiments of the present disclosure include, for example, an information processing device (terminal device or server) as described above, a system, an information processing method executed by the information processing device or system, a program for causing the information processing device to function, and a non-transitory tangible medium on which the program is recorded.
(1) An information processing apparatus including a processor configured to acquire data indicating a user's action, calculate, based on the acquired data, an expected value of attention directed to information output to the user, and provide the expected value for output control of the information.
(2) The information processing apparatus according to (1), wherein the processor calculates the expected value based on a latest action of the user.
(3) The information processing apparatus according to (1) or (2), wherein the processor calculates the expected value based on a history of actions of the user.
The information processing apparatus, wherein the processor is further configured to correct the expected value calculation rule based on data indicating the reaction.
(7) The information processing apparatus according to any one of (1) to (6), wherein the expected value is provided to determine whether or not to output the information.
(8) The information processing apparatus, wherein the processor further executes output control of the information and, when the expected value is high, outputs information that was not output because the expected value was low.
(9) The information processing apparatus according to any one of (1) to (8), wherein the expected value is provided to select an output method of the information.
(10) The information processing apparatus according to any one of (1) to (9), wherein the processor estimates the action of the user based on the acquired data and adjusts the expected value based on the accuracy of the estimation.
(11) The information processing apparatus according to (10), wherein the processor brings the expected value closer to an average value when the estimation accuracy is low.
(12) The information processing apparatus according to any one of (1) to (11), wherein the processor increases the expected value when an utterance of a specific phrase is included in the user's action.
(13) The information processing apparatus according to any one of (1) to (12), wherein the processor calculates the expected value based on the user's surrounding environment estimated as the user's action.
(14) An information processing method including, by a processor: acquiring data indicating a user's action; calculating, based on the acquired data, an expected value of attention directed to information output to the user; and providing the expected value for output control of the information.
(15) A program for causing a computer to realize functions of acquiring data indicating a user's action, calculating, based on the acquired data, an expected value of attention directed to information output to the user, and providing the expected value for output control of the information.

Abstract

In order to implement more flexible responses to user actions, this invention provides an information processing device containing a processor that is configured so as to: acquire data that represents a user action; compute, on the basis of the acquired data, an expected value for the level of attention that will be paid to information outputted to the user; and provide said expected value for the purposes of controlling the output of said information.

Description

Information processing apparatus, information processing method, and program
The present disclosure relates to an information processing apparatus, an information processing method, and a program.
Various technologies for optimizing the presentation of information from a system to a user have been proposed. For example, Patent Literature 1 describes a technique in which morphological analysis is performed on each of a predetermined number of generated utterance content candidates to extract independent words, and when the generated candidates include a candidate indicating silence or a candidate containing no independent word, the input utterance voice is ignored and the display device and speaker are controlled so as not to output a response. This makes it possible to respond more appropriately when rejecting input speech.
Patent Literature 2 describes a method and apparatus for quickly and accurately managing a dialogue between a human and an agent using voice information, facial expression information, and delay time information, and a voice dialogue system using them. More specifically, in the voice dialogue system, the following steps are executed: generating first dialogue order information using dialogue information analyzed from the voice uttered by the user; generating second dialogue order information using facial expression information analyzed from an image of the user's face; and determining the final dialogue order using the first dialogue order information, the second dialogue order information, system state information, the presence or absence of voice input by the user, and the user's non-response time.
Patent Literature 1: JP 2010-151941 A
Patent Literature 2: JP 2004-206704 A
In the technique described in Patent Literature 1 above, whether the system responds is controlled based on the content of the user's utterance. In the technique described in Patent Literature 2, whether the system speaks is determined according to the user's utterance, facial expression, and delay time. Such techniques can, for example, optimize the presentation of information to the user by voice to some extent. However, a user's utterances and facial expressions are only fragmentary material for estimating whether the information presented by the system at that moment is appropriate for the user, so there was still room for improvement in optimizing information presentation with such techniques.
Therefore, the present disclosure proposes a new and improved information processing apparatus, information processing method, and program capable of realizing a more flexible response to user actions.
According to the present disclosure, there is provided an information processing apparatus including a processor configured to acquire data indicating a user's action, calculate, based on the acquired data, an expected value of attention directed to information output to the user, and provide the expected value for output control of the information.
Further, according to the present disclosure, there is provided an information processing method including, by a processor: acquiring data indicating a user's action; calculating, based on the acquired data, an expected value of attention directed to information output to the user; and providing the expected value for output control of the information.
Further, according to the present disclosure, there is provided a program for causing a computer to realize functions of acquiring data indicating a user's action, calculating, based on the acquired data, an expected value of attention directed to information output to the user, and providing the expected value for output control of the information.
As described above, according to the present disclosure, a more flexible response to a user's action can be realized.
Note that the above effects are not necessarily limiting; together with or in place of the above effects, any of the effects shown in this specification, or other effects that can be grasped from this specification, may be achieved.
FIG. 1 is a block diagram illustrating a schematic functional configuration of a system according to an embodiment of the present disclosure.
FIG. 2 is a diagram illustrating a device configuration example of the system according to an embodiment of the present disclosure.
FIG. 3 is a diagram illustrating an example of a calculation rule DB according to an embodiment of the present disclosure.
FIG. 4 is a flowchart illustrating an example of selection of a display method according to an embodiment of the present disclosure.
FIG. 5 is a flowchart illustrating an example of selection of an output method according to an embodiment of the present disclosure.
FIG. 6 is a diagram illustrating the example of FIG. 5 more specifically.
FIG. 7 is a diagram illustrating an example of display color selection according to an embodiment of the present disclosure.
FIG. 8 is a diagram for describing a first example of stocking information according to an embodiment of the present disclosure.
FIG. 9 is a diagram for describing a second example of stocking information according to an embodiment of the present disclosure.
FIG. 10 is a flowchart illustrating processing for stocking information according to an embodiment of the present disclosure.
FIG. 11 is a block diagram illustrating a hardware configuration example of an information processing apparatus according to an embodiment of the present disclosure.
Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. Note that, in this specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted.
The description will be given in the following order.
1. System configuration
2. Examples of calculation rules
3. Examples of output control
 3-1. Example of selecting the display method
 3-2. Example of selecting the output method
 3-3. Other examples
4. Example of stocking information
5. Hardware configuration
6. Supplement
(1. System configuration)
FIG. 1 is a block diagram illustrating a schematic functional configuration of a system according to an embodiment of the present disclosure. Referring to FIG. 1, a system 10 includes a camera 101, a sensor 103, a microphone 105, an action data acquisition unit 107, an action DB 109, an action recognition server 111, an attention expectation value calculation unit 113, an output control unit 115, an information generation unit 117, an information cache DB 119, an information server 121, a display 123, a speaker 125, another output device 127, a feedback analysis unit 129, and a calculation rule DB 131.
The system 10 is used to present information to a user. For example, the system 10 may continuously present information to the same user via a terminal device worn or carried by that user. Alternatively, the system 10 may present information, via a stationary terminal device, to unspecified users who happen to be in its vicinity (some of whom may be identifiable).
The camera 101 can photograph a user of the system 10 and can acquire image data indicating the user's action. The sensor 103 comprises various sensors that target the user for sensing. For example, the sensor 103 includes an acceleration sensor, a gyro sensor, a geomagnetic sensor, a GPS receiver, and the like mounted on a terminal device worn or carried by the user, and may also include an ultrasonic sensor, an infrared sensor, or the like. The sensor 103 can acquire sensor data indicating the user's action. The microphone 105 can pick up sound generated in the vicinity of the user and acquires audio data indicating the user's action. The microphone 105 may be configured as a microphone array so that the direction of a sound source can be specified based on the audio data. Data acquired by some or all of the camera 101, the sensor 103, and the microphone 105 is provided to the action data acquisition unit 107.
Furthermore, the action data acquisition unit 107 may acquire the user's action recognition information from the action recognition server 111. The action recognition server 111 may be included in the system 10 or may be a service outside the system 10. The action recognition server 111 recognizes the user's behavior based on, for example, various behavior recognition technologies. For the behavior recognition, data acquired by, for example, the sensor 103, the camera 101, and the microphone 105 of the system 10 can be used; in this case, the data is separately transmitted from the sensor 103, the camera 101, and the microphone 105 to the action recognition server 111 (not shown). Alternatively, the action recognition server 111 may recognize the user's behavior based on sensor data acquired by a terminal device or the like outside the system 10.
The action data acquisition unit 107 is a data acquisition/management function realized by, for example, a processor of an information processing apparatus, and acquires and manages various data indicating the user's actions. As described above, data indicating the user's actions can be provided from the camera 101, the sensor 103, the microphone 105, and/or the action recognition server 111. Furthermore, the action data acquisition unit 107 may acquire information on operations performed by the user on a terminal device included in the system 10 or on another terminal device; in this case, for example, keywords of information searches using a web browser, information on content used by the user, and the like can be acquired. The action data acquisition unit 107 stores the provided data (hereinafter also referred to as action data) in the action DB 109 as necessary, and then provides it to the attention expectation value calculation unit 113. The action data acquisition unit 107 may also provide the action data to the feedback analysis unit 129.
Here, the user's actions indicated by the action data will be described. For example, a user's action may include the user's motion or facial expression. In this case, for example, the attention expectation value calculation unit 113 may calculate, based on the latest or most recent action data, the expected value of the user's attention directed to information that is about to be output. A user's action may also include a reaction to information output from the display 123 or the speaker 125. In this case, based on the action data indicating the reaction, the attention expectation value calculation unit 113 calculates the expected value of the user's attention for the next time information is output, or the feedback analysis unit 129 verifies the validity of the expected value calculation.
The action DB 109 is a database realized by, for example, a memory or storage of an information processing apparatus, in which the action data acquired by the action data acquisition unit 107 is stored temporarily or continuously. For example, the action data acquired by the action data acquisition unit 107 may be provided to the attention expectation value calculation unit 113 after being temporarily stored in the action DB 109, or without being stored in the action DB 109; in this case, the attention expectation value calculation unit 113 calculates the expected value based on the user's latest action. Alternatively, the action data acquired by the action data acquisition unit 107 may be continuously accumulated in the action DB 109; in this case, the action data acquisition unit 107 reads out data for a necessary period from the action data accumulated in the action DB 109 and provides it to the attention expectation value calculation unit 113, which then calculates the expected value based on the history of the user's actions.
The attention expectation value calculation unit 113 is an arithmetic function realized by, for example, a processor of an information processing apparatus, and calculates, based on the action data acquired by the action data acquisition unit 107, an expected value of the attention directed to information output to the user. Here, attention means the degree of the user's attention directed to the output information. The attention expectation value calculation unit 113 provides the calculated expected value of attention to the output control unit 115 for output control of the information.
Here, the attention directed to output information can vary depending on the user's situation. For example, when the user directs his or her gaze toward a terminal device including the display 123 or calls out to the terminal device, much attention is expected to be paid to the output information. On the other hand, when the user is in conversation with another user or riding a train, there is a high possibility that little attention will be paid, particularly to information output by voice (it may not even be heard). The attention expectation value calculation unit 113 can, for example, estimate the user's situation based on action data provided from the camera 101, the sensor 103, the microphone 105, the action recognition server 111, and the like, and calculate an expected value of attention according to the situation. The attention expectation value calculation unit 113 may calculate the expected value by referring to data in the calculation rule DB 131 that associates user actions with expected values of attention.
The attention directed to output information can also vary depending on the content of the output information. For example, if a keyword in which the user is interested is displayed on the display 123 or output from the speaker 125, the user is expected to pay much attention to the output information. The attention expectation value calculation unit 113 can acquire the content of the information scheduled to be output from the information generation unit 117, compare that content with content in which the user is estimated to be interested based on action data (for example, keywords of information searches executed by the user in the past), and, if there is a match or a common point, raise the expected value of attention. The attention expectation value calculation unit 113 may calculate the expected value by referring to data in the calculation rule DB 131 that associates actions related to information content with expected values of attention.
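As a rough sketch of this content-based adjustment (the function name, the boost amount, and the score ceiling are illustrative assumptions; the disclosure does not specify a concrete implementation):

def boost_for_content(base_score, info_keywords, past_search_keywords):
    # If the information to be output shares a keyword with the user's
    # past search keywords, raise the expected attention value.
    # The boost amount (+2.0) and the ceiling (10.0, matching the 0-10
    # score range of the calculation rule DB example) are assumptions.
    if set(info_keywords) & set(past_search_keywords):
        return min(base_score + 2.0, 10.0)
    return base_score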
Note that, in calculating the expected value of the user's attention based on the action data as described above, the attention expectation value calculation unit 113 may use only action data indicating the user's latest action, or may use action data indicating the history of the user's actions. When the expected value is calculated based on the latest action, the amount of computation is reduced and the expected value can be calculated quickly with a low load. On the other hand, when the expected value is calculated based on the action history, the expected value can be calculated in light of the context of the user's actions, which improves the accuracy of the expected value. For example, when the expected value is calculated based on the action history, the expected values of attention calculated by the attention expectation value calculation unit 113 may differ even if the user's latest action is the same.
Furthermore, in calculating the expected value of the user's attention based on the action data, the attention expectation value calculation unit 113 may correct the expected value based on the accuracy of the estimation of the user's action from the action data. The action data includes, for example, image data of the user provided by the camera 101, sensing data of the user provided by the sensor 103, and audio data from the vicinity of the user provided by the microphone 105. The attention expectation value calculation unit 113 estimates the user's action based on these data, but the accuracy of the estimation may vary from time to time.
The output control unit 115 is an arithmetic function realized by, for example, a processor of an information processing apparatus, and controls the output of information to the user based on the expected value of the user's attention calculated by the attention expectation value calculation unit 113. More specifically, the output control unit 115 may determine whether or not to output information based on the expected value of the user's attention. For example, for some piece of information generated by the information generation unit 117, the output control unit 115 may refer to the expected value calculated by the attention expectation value calculation unit 113, suppress the output of the information when the expected value falls below a threshold, and execute the output otherwise. The output control unit 115 may also select an information output method based on the expected value of the user's attention. For example, when information generated by the information generation unit 117 could be output either as an image via the display 123 or as audio via the speaker 125, the output control unit 115 may refer to the expected value calculated by the attention expectation value calculation unit 113, output the information as an image when the expected value falls below a threshold, and output the information as audio otherwise.
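The two decisions described above could be sketched as follows (the concrete threshold values and return labels are assumptions; the disclosure states only that output is suppressed below a threshold and that image or audio output is chosen relative to a threshold):

def control_output(expected_value, output_threshold=3.0, audio_threshold=6.0):
    # Decision 1: suppress output entirely when the expected value is
    # below the output threshold (the information may be cached instead).
    if expected_value < output_threshold:
        return None
    # Decision 2: below the audio threshold, output as an image on the
    # display 123; otherwise output as audio via the speaker 125.
    return "image" if expected_value < audio_threshold else "audio"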
Furthermore, the output control unit 115 may output, when the expected value is high, information that was not output because the expected value of the user's attention was low. As described above, the output control unit 115 suppresses the output of information when, for example, the expected value of the user's attention calculated by the attention expectation value calculation unit 113 falls below a threshold. At that moment the user may, for example, be temporarily busy, and may still want the suppressed information to be provided a little later. In such a case, the output control unit 115 may acquire the information that the information generation unit 117 temporarily stored in the information cache DB 119 as a result of the earlier suppression, and output it to the user.
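A minimal sketch of this stock-and-release behavior of the information cache DB 119 (the class name and the release threshold are assumptions):

from collections import deque

class InformationCache:
    # Suppressed information is stocked and released later, once the
    # expected attention value becomes high.
    def __init__(self):
        self._stock = deque()

    def suppress(self, information):
        # Called when output is suppressed because the expected value is low.
        self._stock.append(information)

    def flush_if_attentive(self, expected_value, threshold=7.0):
        # Release all stocked items once the expected value reaches the
        # (assumed) threshold; otherwise keep holding them.
        if expected_value < threshold:
            return []
        released = list(self._stock)
        self._stock.clear()
        return released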
The information generation unit 117 is an arithmetic function realized by, for example, a processor of an information processing apparatus, and generates information to be output to the user via the output control unit 115 based on, for example, information provided from the information server 121. The information server 121 may be included in the system 10 or may be a service outside the system 10. For example, the information server 121 may cooperate with the action recognition server 111 to push information for supporting the user's behavior (such as information on spots near the user's current location, the user's schedule information, or traffic information) to the information generation unit 117. Also, for example, the information server 121 may cooperate with other services provided on the terminal device to push notifications, such as incoming messages for the user or delivery of new information, to the information generation unit 117. Alternatively, the information server 121 may transmit such information and notifications in response to requests that the information generation unit 117 transmits automatically (without user operation).
As described above, the information generated by the information generation unit 117 is output to the user from the display 123, the speaker 125, and/or the other output device 127 under the control of the output control unit 115. The information generation unit 117 therefore generates image data to be displayed by the display 123, audio data to be output by the speaker 125, and/or control signals for operating the other output device 127. Note that the information generation unit 117 may receive such data or signals from the information server 121 and output them as they are, or may generate such data or signals based on information received from the information server 121. When the output control unit 115 controls the information output method based on the expected value of the user's attention, the information generation unit 117 can generate information according to the selected output method: for example, when the output control unit 115 decides to output information via the display 123, the information generation unit 117 can generate image data, and when the output control unit 115 decides to output information via the speaker 125, it can generate audio data.
In addition, when the output control unit 115 decides to suppress the output of information based on the expected value of the user's attention, the information generation unit 117 temporarily stores the generated information in the information cache DB 119. The information cache DB 119 is a database realized by, for example, a memory or storage of an information processing apparatus, and temporarily stores information generated by the information generation unit 117. Besides information that was not output as described above, output information may also be stored in the information cache DB 119 for a predetermined period, for example in case the user requests re-output. Also, as described above, since the calculation of the expected value of the user's attention by the attention expectation value calculation unit 113 may be performed based on the content of the information scheduled to be output, the information generation unit 117 may provide the generated information, or information indicating its content, to the attention expectation value calculation unit 113 prior to output.
The display 123 displays images for the user of the system 10. The speaker 125 outputs audio toward the user. The other output device 127 can include, for example, illumination, a vibrator, and the like, as described later. As described above, the output control unit 115 controls the output of information via these output devices based on the expected value of the user's attention calculated by the attention expectation value calculation unit 113. Here, the camera 101, the sensor 103, and/or the microphone 105 (hereinafter also collectively referred to as input devices) acquire data indicating the user's actions in order to calculate the expected value of the user's attention directed to information output from the display 123, the speaker 125, and/or the other output device 127 (hereinafter also collectively referred to as output devices). It is therefore desirable that the input devices and the output devices be provided in, for example, the same terminal device, or in terminal devices whose positional relationship with each other is fixed.
The feedback analysis unit 129 is an arithmetic function realized by, for example, a processor of an information processing apparatus, and analyzes the user's actions as feedback on the control of information output by the output control unit 115. For example, when information is output under the control of the output control unit 115 based on the expected value calculated by the attention expectation value calculation unit 113, whether the output control was appropriate can be inferred from data indicating the user's reaction to that information. For example, if the user completely ignored the output information, it can be inferred that the user's actual attention was lower than the calculated expected value. The feedback analysis unit 129 may modify the calculation rules stored in the calculation rule DB 131 based on the result of the analysis, and may also modify parameters and the like used in the calculation processing of the attention expectation value calculation unit 113.
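One plausible form of this rule correction, shown only as an illustration (the disclosure does not specify an update rule; the moving-average form and the learning rate here are assumptions):

def update_rule_score(stored_score, observed_attention, learning_rate=0.1):
    # Nudge the attention score stored in the calculation rule DB toward
    # the attention actually inferred from the user's reaction.
    # An ignored output would yield a low observed_attention, pulling the
    # stored score down over repeated corrections.
    return stored_score + learning_rate * (observed_attention - stored_score)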
The calculation rule DB 131 is a database realized by, for example, a memory or storage of an information processing apparatus, and stores data that associates user actions with expected values of attention. The data in the calculation rule DB 131 may, for example, be prepared in advance, and may further be modified based on the results of analysis by the feedback analysis unit 129. When such modifications are repeated, the data in the calculation rule DB 131 can be said to be formed by learning based on the user's reactions to output information. For example, the attention expectation value calculation unit 113 refers to the calculation rule DB 131 based on the user's action indicated by the action data acquired by the action data acquisition unit 107, and acquires a score indicating the expected value of attention. At this time, the attention expectation value calculation unit 113 may refer to the calculation rule DB 131 based on a plurality of user actions and calculate the expected value of attention by weighting and adding the plurality of acquired scores. This is because a plurality of actions may be detected in an overlapping manner based on, for example, different input data, such as "riding a train" and "in conversation with another user".
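A minimal sketch of this weighted combination (normalizing by the weight sum, which keeps the result on the 0-10 score scale, is an assumption; the disclosure says only that the scores may be weighted and added):

def expected_attention(detected_actions, rule_scores, weights=None):
    # rule_scores maps an action name to its attention score (0-10), as
    # in the calculation rule DB 131; weights maps an action name to its
    # relative weight. Unknown actions contribute a score of 0.
    weights = weights or {}
    total = 0.0
    weight_sum = 0.0
    for action in detected_actions:
        w = weights.get(action, 1.0)
        total += w * rule_scores.get(action, 0.0)
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0

# e.g. expected_attention(["riding a train", "conversation"],
#                         {"riding a train": 2.0, "conversation": 0.5})
# (the score for "riding a train" is an assumed value)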
FIG. 2 is a diagram illustrating a device configuration example of a system according to an embodiment of the present disclosure. Referring to FIG. 2, the system 10 can include a terminal device 151 and a server 152. The server 152 may include a plurality of servers, like the servers 152a and 152b in the illustrated example.
The terminal device 151 has, for example, a function of outputting information to the user, a function of acquiring data indicating the user's actions, and a function of exchanging information and data with the server 152. The terminal device 151 can be, for example, a smartphone, a wearable terminal, a tablet terminal, a personal computer, a television, a game machine, or the like. The terminal device 151 may be carried by a specific user, or may be a stationary device used by unspecified users. The terminal device 151 is realized by, for example, the hardware configuration of an information processing apparatus described later.
The server 152 has, for example, a function of processing data received from the terminal device 151 and a function of transmitting information to be output by the terminal device 151. The server 152 is realized by, for example, one or more server devices on a network, each of which is realized by the hardware configuration of an information processing apparatus described later.
For example, among the functional configurations shown in FIG. 1, the camera 101, the sensor 103, the microphone 105, the display 123, the speaker 125, and the other output device 127 may be realized in the terminal device 151, and the remaining functional configurations in the server 152. In this case, functions may be distributed among a plurality of servers; for example, the action data acquisition unit 107 may be realized in the server 152b, while the attention expectation value calculation unit 113 and the output control unit 115 are realized in the server 152a.
In another example, among the functional configurations shown in FIG. 1, the action recognition server 111 and the information server 121 may be realized in the server 152, and the remaining functional configurations in the terminal device 151. Thus, in the system 10, the information processing apparatus according to the present embodiment, for example the information processing apparatus realizing the attention expectation value calculation unit 113 and the output control unit 115, may be the terminal device 151 or the server 152. As described above, the server 152 may include a plurality of servers (not limited to two as in the illustrated example; there may be three or more).
Furthermore, for example, when the action data acquisition unit 107 does not acquire data from the action recognition server 111 and the information generation unit 117 does not acquire data from the information server 121, the entire functional configuration of the system 10 shown in FIG. 1 (excluding those two servers) may be realized in the terminal device 151. In this case, the system 10 need not include the server 152.
(2. Examples of calculation rules)
FIG. 3 is a diagram illustrating an example of a calculation rule DB according to an embodiment of the present disclosure. FIG. 3 shows records 131a to 131e, each associating an action, an attention score, a source, and a condition, as an example of the data stored in the calculation rule DB.
An action is a user action that is specified when the action data acquired by the action data acquisition unit 107 satisfies a predetermined condition. An attention score is a score corresponding to the expected value of attention directed to information output to the user when the corresponding action is specified. A source provides the action data for specifying the corresponding action, and a condition is the condition that the action data provided by the source should satisfy in order for the action to be specified. For example, the record 131a defines that the user action "directing the gaze" is specified when the camera 101 detects that the user has directed his or her gaze toward the terminal device, and that an attention score of 8.0 is given. In the illustrated example, attention scores are defined in the range of 0 to 10, so 8.0 means a relatively high expected value of attention. Detailed description of the image processing for detecting the user's gaze, executed for example in the attention expectation value calculation unit 113, is omitted because various known techniques can be used.
Also, for example, the record 131b defines that the user action "calling out" is specified when the microphone 105 detects the user's utterance and no utterance of another user is detected, and that an attention score of 9.0 is given. Detailed description of the audio processing for distinguishing the user's uttered voice from the uttered voices of other users, executed for example in the attention expectation value calculation unit 113, is omitted because various known techniques can be used. The attention expectation value calculation unit 113 may likewise detect the content of the user's utterance using various known techniques; in this case, for example, the condition for specifying the user action "calling out" may define that the user utters a predetermined calling phrase, such as "hey" or "hello".
In the same way, the records 131c, 131d, 131e, and other records not shown associate a user action with an attention score and with the source and condition for specifying it. For example, the record 131c defines that, even when the user's utterance is detected by the microphone 105 as in the "calling out" case of the record 131b, if another user's utterance is also detected at the same time, a different action, "conversation (with another user)", is specified, and the attention score is lower ("conversation" is 0.5 versus 9.0 for "calling out"). The record 131d defines that the action "riding a train" is specified when the user's behavior "riding a train" is recognized based on behavior recognition data provided from the action recognition server 111.
The record 131e defines that the user action "searched in the past" is specified when the content of the information scheduled to be output by the information generation unit 117 includes a search keyword indicated by the search history acquired by the action data acquisition unit 107. In the illustrated example, records that associate actions related to information content with expected values of attention, such as the record 131e, are stored in the calculation rule DB 131 in a format common with the data associating other actions with expected values of attention; in another example, data associating information content with expected values of attention may be stored in the calculation rule DB 131 in a format different from the data associating actions with expected values of attention.
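For illustration, the records of FIG. 3 might be represented as data like the following (the scores of 8.0, 9.0, and 0.5 are given in the text; the scores for records 131d and 131e are not stated in the disclosure and are placeholders):

CALCULATION_RULES = [
    {"record": "131a", "action": "directing the gaze", "score": 8.0,
     "source": "camera 101",
     "condition": "gaze toward the terminal device detected"},
    {"record": "131b", "action": "calling out", "score": 9.0,
     "source": "microphone 105",
     "condition": "user's utterance detected without another user's utterance"},
    {"record": "131c", "action": "conversation", "score": 0.5,
     "source": "microphone 105",
     "condition": "user's and another user's utterances detected together"},
    {"record": "131d", "action": "riding a train", "score": 2.0,  # assumed
     "source": "action recognition server 111",
     "condition": "behavior 'riding a train' recognized"},
    {"record": "131e", "action": "searched in the past", "score": 7.0,  # assumed
     "source": "search history (action data acquisition unit 107)",
     "condition": "output information contains a past search keyword"},
]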
As described above, the attention expectation value calculation unit 113 refers to a calculation rule DB 131 like the illustrated example and calculates the expected value of attention directed to information output to the user. At this time, the attention expectation value calculation unit 113 may, for example, use the attention score in the illustrated example as the expected value of attention as it is, or, when a plurality of actions are detected in an overlapping manner, may calculate the expected value of attention by weighting and adding the attention scores.
Furthermore, the attention expectation value calculation unit 113 may estimate the user's action based on the action data and adjust the expected value of attention based on the accuracy of the estimation. In this case, the expected value of attention may be adjusted toward the average value when the estimation accuracy is low. For example, for actions whose attention score is higher than the average value (assumed here to be 5.0), such as "directing the gaze" or "calling out", the attention score may be temporarily lowered when the accuracy of the estimation based on the action data is judged to be low (for example, when the image analysis result indicates that the probability that the user is directing his or her gaze toward the terminal device is dominant but not very high). Conversely, for actions whose attention score is lower than the average value, such as "conversation" or "riding a train", the attention score may be temporarily raised when the accuracy of the estimation based on the action data is judged to be low. This processing treats the reliability of the specified action as low when the estimation accuracy is low, and brings the calculated expected value of attention closer to the value that would be calculated if no action had been specified. In another example, when the estimation accuracy is low, the attention score may be uniformly and temporarily lowered.
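A minimal sketch of this adjustment toward the average (treating the estimation accuracy as a confidence value in [0, 1] and blending linearly is an assumption; the average of 5.0 follows the text's example):

def adjust_for_estimation_accuracy(score, confidence, average=5.0):
    # confidence = 1.0 leaves the score unchanged; confidence = 0.0
    # returns the average, as if no action had been specified.
    return average + confidence * (score - average)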
Specific examples of the calculation of the expected value of the user's attention in the present embodiment will now be described further. The following specific examples may be realized, for example, by the processing logic of the attention expectation value calculation unit 113, or by data stored in the calculation rule DB 131.
For example, the attention expectation value calculation unit 113 may raise the calculated expected value of attention when a specific phrase is included in the user's utterance (call). More specifically, the calculated expected value of attention is raised by combining utterances such as "hurry", "hey", or "answer me" with other actions (for example, directing the gaze). In addition, the expected value of attention can be raised by a specific command (such as a command including the name of the system) or by the action of pointing at the terminal device. Using this, the user can control the system so that it responds to calls more readily.
Also, for example, the attention expectation value calculation unit 113 may calculate the expected value of attention based on the surrounding environment estimated as the user's action. More specifically, for example, the attention expectation value calculation unit 113 may raise the calculated expected value of attention when it is estimated that the user is alone. When a user is alone, the possibility of talking to oneself is low, except, for example, when on the phone, so if the user's utterance is detected, it is more likely to be a call to the system than when other users are present. On the other hand, in an environment with much noise, such as television sound or train noise, the attention expectation value calculation unit 113 may lower the calculated expected value even when the user's utterance is detected, because even if information is provided by voice via the speaker 125, the user's attention is unlikely to be directed to it. However, when beamforming, noise canceling, and the like are sufficiently possible in the audio output from the speaker 125, the calculated expected value need not be lowered even in a noisy environment.
Also, for example, the attention expectation value calculation unit 113 may raise the calculated expected value of attention when the user utters the same content repeatedly. In this case, the user is highly likely to be requesting some response from the system, so whether or not there was audio output from the system immediately before, the expected value of the attention the user will direct to output information is estimated to be high.
Also, for example, when one or more dialogues have already occurred between the system and the user, the attention expectation value calculation unit 113 may raise the expected value of attention calculated based on the user's subsequent actions. This is because a user's actions after a dialogue with the system are estimated to be highly likely to relate to the information output from the system. However, for example, when the user moves away from the terminal device providing the information after the dialogue, the expected value of attention can become low.
Also, for example, the attention expectation value calculation unit 113 may raise the calculated expected value of attention when the user interacts with the system in a characteristic way. More specifically, when the user uses a dialect, speaks loudly, or attaches some keyword to the beginning or end of an utterance, it is estimated that the user made the utterance with a feature meant to get through to the system, so the expected value of attention can be high.
Also, for example, the attention expectation value calculation unit 113 may calculate an expected value of attention for each output method according to the user's state. For example, when it is specified from behavior recognition results or the like that the user is working, in a meeting, or traveling by train, the attention expectation value calculation unit 113 may lower the expected value of attention calculated for audio output. On the other hand, in this case, the attention expectation value calculation unit 113 may raise the expected value of attention calculated for the user's gestures and simple actions (for example, tapping or shaking the terminal device). If the user is sleeping, it is presumed that there is no intentional action from the user, so the expected value of attention calculated when some action is detected may be lowered, or uniformly set to 0. However, this does not apply when a log of the user's sleep, for example sleeping posture, sleep talking, pulse, or sleep level, is being detected.
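As an illustration only (all state labels and numeric values here are placeholders reflecting the tendencies described above, not values from the disclosure):

def per_method_attention(user_state):
    # Per-output-method expected attention values by user state.
    if user_state == "sleeping":
        # No intentional action is presumed during sleep.
        return {"audio": 0.0, "display": 0.0, "gesture": 0.0}
    if user_state in ("working", "in_meeting", "riding_train"):
        # Audio output scored low; gestures and simple actions scored high.
        return {"audio": 1.0, "display": 5.0, "gesture": 8.0}
    return {"audio": 5.0, "display": 5.0, "gesture": 5.0}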
 Also, for example, the expected attention value calculation unit 113 may correct the calculated expected value of attention according to the words included in the user's utterance. For example, when the user's utterance includes words relating to a specific person (for example, a family member, a friend, or a boss at work) or specific content (for example, an anniversary, the return date of a borrowed book, or the submission date of official documents), the unit may determine that the user is having a highly important conversation and raise the calculated expected value of attention. This makes it possible, for example, to remind the user of important information that must not be forgotten.
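 Taken together, the adjustments in this section amount to a small rule set applied to a base expected value. The sketch below illustrates one possible shape of such a rule set in Python; the context fields, the weights, and the clamping to [0, 1] are illustrative assumptions and do not correspond to the actual calculation rule DB 131.

```python
from dataclasses import dataclass

@dataclass
class Context:
    users_present: int = 1            # estimated number of people around the user
    noisy: bool = False               # e.g. TV sound or train noise detected
    noise_robust_audio: bool = False  # beamforming / noise canceling available
    repeated_utterance: bool = False  # the same content was spoken again
    prior_dialogue: bool = False      # one or more exchanges already occurred
    characteristic_speech: bool = False  # dialect, raised voice, keyword affix
    important_phrases: bool = False   # e.g. family member, anniversary, due date
    sleeping: bool = False            # no sleep-log use case assumed here

def expected_attention(base: float, ctx: Context) -> float:
    """Apply the adjustments described above to a base value in [0, 1]."""
    if ctx.sleeping:
        return 0.0                    # intentional actions are unlikely during sleep
    v = base
    if ctx.users_present == 1:
        v += 0.2                      # an utterance is likely addressed to the system
    if ctx.noisy and not ctx.noise_robust_audio:
        v -= 0.2                      # audio output is unlikely to reach the user
    if ctx.repeated_utterance:
        v += 0.3                      # the user is probably waiting for a response
    if ctx.prior_dialogue:
        v += 0.1                      # follow-up actions likely relate to the output
    if ctx.characteristic_speech:
        v += 0.2                      # the utterance was deliberately marked for the system
    if ctx.important_phrases:
        v += 0.1                      # high-importance conversation, worth a reminder
    return min(max(v, 0.0), 1.0)
```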
 (3. Example of output control)
 (3-1. Example of display method selection)
 FIG. 4 is a flowchart illustrating an example of display method selection according to an embodiment of the present disclosure. Referring to FIG. 4, the output control unit 115 first determines whether the expected value of attention calculated by the expected attention value calculation unit 113 exceeds a first threshold th1 (S101). If the expected value exceeds the first threshold th1, the output control unit 115 causes the display 123 to display the information in the frontmost window (S103). This is the processing for the case where the user's attention to the output information is estimated to be highest (the user is paying close attention to the output information). Displaying the information in the frontmost window lets the user obtain a large amount of information immediately.
 On the other hand, when the expected value of attention does not exceed the first threshold th1 in S101, the output control unit 115 further determines whether the expected value exceeds a second threshold th2 (S105). The second threshold th2 is smaller than the first threshold th1. If the expected value exceeds the second threshold th2, the output control unit 115 causes the display 123 to display the information in a pop-up window (S107). This is the processing for the case where the user's attention to the output information is estimated to be moderate (the user may or may not pay attention to the output information). Displaying the information in a pop-up window keeps it from getting in the way even when the user does not need it.
 On the other hand, when the expected value of attention does not exceed the second threshold th2 in S105, the output control unit 115 ends the process without outputting the information. That is, the output control unit 115 suppresses the output of the information. This is the processing for the case where the user's attention to the output information is estimated to be low (the user would pay little attention to the output information, which might instead be a nuisance). Information that is not output in this case may be stored in the information cache DB 119 and output later.
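 The two-threshold flow of FIG. 4 can be summarized in a few lines. The sketch below assumes illustrative threshold values th1 = 0.7 and th2 = 0.4 and hypothetical labels for the display methods; the disclosure itself does not fix these values.

```python
# Sketch of the FIG. 4 flow; threshold values and labels are assumptions.
TH1, TH2 = 0.7, 0.4   # first and second thresholds, with th1 > th2

def select_display_method(expected_value: float) -> str:
    if expected_value > TH1:
        return "foreground_window"  # S103: show in the frontmost window
    if expected_value > TH2:
        return "popup_window"       # S107: unobtrusive pop-up
    return "suppress"               # withhold output (may be cached for later)
```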
 (3-2. Example of output method selection)
 FIG. 5 is a flowchart illustrating an example of output method selection according to an embodiment of the present disclosure. Referring to FIG. 5, the output control unit 115 first determines whether the expected value of attention calculated by the expected attention value calculation unit 113 exceeds the first threshold th1 (S151). If the expected value exceeds the first threshold th1, the output control unit 115 causes both the display 123 and the speaker 125 to output the information (S153). This is the processing for the case where the user's attention to the output information is estimated to be highest (the user is paying close attention to the output information). Outputting the information through both the display 123 and the speaker 125 lets the user obtain a large amount of information in a short time.
 On the other hand, when the expected value of attention does not exceed the first threshold th1 in S151, the output control unit 115 further determines whether the expected value exceeds the second threshold th2 (S155). The second threshold th2 is smaller than the first threshold th1. If the expected value exceeds the second threshold th2, the output control unit 115 outputs the information using only the display 123 (S157). This is the processing for the case where the user's attention to the output information is estimated to be moderate (the user may or may not pay attention to the output information). Outputting the information through the display 123 alone keeps it from getting in the way even when the user does not need it.
 On the other hand, when the expected value of attention does not exceed the second threshold th2 in S155, the output control unit 115 ends the process without outputting the information. That is, the output control unit 115 suppresses the output of the information. This is the processing for the case where the user's attention to the output information is estimated to be low (the user would pay little attention to the output information, which might instead be a nuisance). As in the example of FIG. 4, information that was not output may be stored in the information cache DB 119 and output later.
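 FIG. 5 follows the same two-threshold pattern, selecting output channels instead of window styles. A minimal sketch, reusing the illustrative thresholds from the previous example:

```python
# Sketch of the FIG. 5 flow; the channel names are hypothetical labels.
def select_output_method(expected_value: float) -> list[str]:
    if expected_value > TH1:
        return ["display", "speaker"]  # S153: both channels, maximum bandwidth
    if expected_value > TH2:
        return ["display"]             # S157: visual only, less intrusive
    return []                          # suppress; information may be stocked
```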
 FIG. 6 illustrates the example of FIG. 5 more concretely. As shown in (a), the user is having a conversation with another user. In this case, the expected attention value calculation unit 113 identifies the user's action based on, for example, voice data acquired by the microphone 105 and, referring to the data of the calculation rule DB 131 such as that shown in FIG. 3, calculates a relatively low expected value. In the illustrated example, this expected value falls between the first threshold th1 and the second threshold th2. Therefore, the process of S157 in the flowchart of FIG. 5 is executed, and the information is output using only the display 123, as shown in (b).
 Here, the user takes an interest in the information displayed on the display 123 and, as shown in (c), calls out to the terminal device, "Show me more of that!" The expected attention value calculation unit 113 identifies the user's action based on, for example, voice data acquired by the microphone 105 and, likewise referring to the data of the calculation rule DB 131, calculates a relatively high expected value. In the illustrated example, this expected value exceeds the first threshold th1. Therefore, the process of S153 in the flowchart of FIG. 5 is executed, and the information is output using both the display on the display 123 and the sound 125v output from the speaker 125, as shown in (c).
 In the example of FIG. 6, information corresponding to the content of the user's conversation (about Italian restaurants) is output. Such information is generated, for example, by the information generation unit 117 identifying the content of the user's utterance based on the voice data acquired by the microphone 105 and acquiring information related to that content from the information server 121. Various known techniques can be used for the speech processing that identifies the utterance content, so a detailed description is omitted. In this case, the expected attention value calculation unit 113 may also raise the calculated expected value of attention when the content of the information scheduled to be output is included in the content of the user's utterance.
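 One simple way to realize the boost just described is to check for term overlap between the user's utterance and the information scheduled for output. The sketch below is an illustrative assumption; the whitespace tokenization and the boost amount are not specified in the disclosure.

```python
# Hypothetical sketch: raise the expected value when the pending output
# shares terms with the user's utterance (e.g. "italian" in both).
def boost_for_content_overlap(expected_value: float,
                              utterance: str,
                              pending_info: str,
                              boost: float = 0.2) -> float:
    utterance_terms = set(utterance.lower().split())
    info_terms = set(pending_info.lower().split())
    if utterance_terms & info_terms:
        expected_value = min(expected_value + boost, 1.0)
    return expected_value
```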
 (3-3. Other examples)
 FIG. 7 is a diagram illustrating an example of display color selection according to an embodiment of the present disclosure. Referring to FIG. 7, the user is walking through town wearing a bracelet-type wearable terminal device. Here, the user passes near a certain store (SHOP). This store is a store (an Italian restaurant) related to a search keyword ("Italian") from an information search the user executed earlier. In this case, for example, the information generation unit 117 generates information notifying the user that the store (SHOP) is nearby, based on the user's position information specified by the GPS receiver included in the sensor 103 and on the relationship between the store (SHOP) and the user (the store may be an object of the user's interest) estimated from the user's information search history previously acquired by the action data acquisition unit 107. Alternatively, the relationship between the store (SHOP) and the user may be estimated using the user's profile information held by an external service. For example, a restaurant information service holds bookmarks of store information, a search history of store information, store information registered by other users with similar attributes, and the like. A social media service holds information such as evaluations that the user has expressed on social media about services and stores. The information generation unit 117 may estimate the relationship between the store (SHOP) and the user based on such information, for example, and generate information notifying the user that the store (SHOP) is nearby based further on the user's position information.
 In the illustrated example, the notification information for the user is output by an illumination included in the other output device 127. The output control unit 115 may change the display color of the illumination, as shown in (a) to (c) of FIG. 7, according to the expected value of attention calculated by the expected attention value calculation unit 113. For example, the output control unit 115 may light the illumination in a conspicuous color when the expected value of the user's attention is high, and light it in a subdued color, or not at all, when the expected value is low. Alternatively, the output control unit 115 may light the illumination in a subdued color when the expected value of the user's attention is high (since the user has probably already noticed) and in a conspicuous color when the expected value is low (since the user probably has not noticed yet).
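 As a rough sketch, this color choice can be expressed as a mapping from the expected value, with a flag selecting between the two strategies above; the color values and the reuse of the illustrative threshold TH2 are assumptions.

```python
# Hypothetical sketch of the FIG. 7 display-color selection.
def illumination_color(expected_value: float, assume_noticed: bool = False) -> str:
    conspicuous, plain = "#FF3B30", "#8E8E93"  # vivid red vs. muted gray
    if assume_noticed:
        # High expectation means the user probably noticed already: stay plain.
        return plain if expected_value > TH2 else conspicuous
    return conspicuous if expected_value > TH2 else plain
```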
 (4. Example of stocking information)
 Next, an example of stocking information according to an embodiment of the present disclosure will be described with reference to FIGS. 8 and 9. In FIGS. 8 and 9 below, the state of communication with the user as detected by the system is represented by an indicator shown at the lower right of each figure. The indicator may actually be displayed, for example by an illumination provided in the terminal device as the other output device 127, or it may be interpreted as an explanatory notation for FIGS. 8 and 9 (not actually displayed).
 FIG. 8 is a diagram for describing a first example of stocking information according to an embodiment of the present disclosure. In the example of FIG. 8, as shown in (a), while the user is saying "What's my schedule today?", the system detects that the user is speaking. Here, as shown in (b), the system failed to detect the content of the user's utterance correctly and calculated a low expected value for the user's attention to the output information, so it stocked the information generated by the information generation unit 117 instead of outputting it. The user notices that there is no response from the system and calls out, "Hey!" As shown in (c), the system correctly detects the user's call, estimates that the expected value of the user's attention is high, and outputs the stocked information. More specifically, the system provides the information to the user by the voice 125v output from the speaker 125: "My apologies. Today you have lunch in Osaki."
 FIG. 9 is a diagram for describing a second example of stocking information according to an embodiment of the present disclosure. In the example of FIG. 9, as shown in (a), while the user is saying "What's my schedule today?", the system detects that the user is speaking. Then, as shown in (b), another user conversing with the user replies "I'm free", so the system infers from the fact that the user is in conversation with another user that the expected value of attention to the output information is low, and stocks the information generated by the information generation unit 117 without outputting it. Unlike the example of FIG. 8, the user did not actually need information from the system (the user was merely asking about today's schedule in a conversation with another user), so the system's decision to stock the information was correct. Thereafter, when a predetermined time has elapsed, the system discards the stocked information as no longer needed and returns to the steady state, as shown in (c). The reason the information is stocked rather than discarded at the point of (b) is to keep the system able to output the information in case the detection of a conversation with another user was erroneous and it turns out the user actually needed the information (for example, if the user calls out as in (b) of FIG. 8).
 FIG. 10 is a flowchart showing a process for stocking information according to an embodiment of the present disclosure. Referring to FIG. 10, the output control unit 115 first determines whether the expected value of attention calculated by the expected attention value calculation unit 113 exceeds the first threshold th1 (S201). If the expected value exceeds the first threshold th1, the output control unit 115 further determines whether there is information stocked in the information cache DB 119 (S203). If there is stocked information, the output control unit 115 outputs the stocked information (S205). This is, for example, the process shown in (c) of the example of FIG. 8. Subsequently, if there is other information generated by the information generation unit 117 (which may be newer), the output control unit 115 outputs that information as well (S207).
 On the other hand, when the expected value of attention does not exceed the first threshold th1 in S201, the output control unit 115 further determines whether the expected value exceeds the second threshold th2 (S209). The second threshold th2 is smaller than the first threshold th1. If the expected value exceeds the second threshold th2, the output control unit 115 outputs the information generated by the information generation unit 117 (S207). That is, in the illustrated example, when the expected value of attention is between the first threshold th1 and the second threshold th2, the stocked information is not output, but, for example, information newly generated by the information generation unit 117 is output. On the other hand, when the expected value of attention does not exceed the second threshold th2 in S209, the output control unit 115 stocks the information (S211).
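 The FIG. 10 flow, together with the expiry behavior described further below, might be sketched as follows. The plain in-memory list standing in for the information cache DB 119, the default retention time, and the reuse of the illustrative thresholds are all assumptions, not the disclosed implementation.

```python
import time

stock: list[tuple[float, str]] = []   # (stocked_at, information)

def control_output(expected_value: float, new_info: str | None,
                   ttl_seconds: float = 3600.0) -> list[str]:
    now = time.time()
    # Discard stocked items whose retention period has elapsed (FIG. 9 (c)).
    stock[:] = [(t, i) for t, i in stock if now - t < ttl_seconds]
    out: list[str] = []
    if expected_value > TH1:            # S201: high attention expected
        out += [i for _, i in stock]    # S205: flush stocked information first
        stock.clear()
        if new_info:
            out.append(new_info)        # S207: then any newly generated info
    elif expected_value > TH2:          # S209: moderate attention
        if new_info:
            out.append(new_info)        # S207: new info only, stock stays put
    elif new_info:
        stock.append((now, new_info))   # S211: stock the information for later
    return out
```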
 Various modifications of the information stocking examples above are possible. For example, when information has been stocked (for example, by mistake), the actions by which the user can pull the stocked information out of the system may include calling out, saying the same thing again, gazing at the device (directing one's line of sight toward it), turning one's face toward it, operating a button of the terminal device, clapping, or falling silent (waiting for the system's response). These actions may be registered in the system as actions for retrieving stocked information. In this case, when outputting the stocked information, the system may add a message apologizing for having failed to respond correctly (for example, the system's response in (c) of the example of FIG. 8). Also, as a step before displaying the stocked information, the system may display the user action it has recognized (for example, "Weren't you in a conversation with another user?") and output the stocked information if the user nevertheless requests it.
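 A registry of such retrieval actions could be as simple as a set lookup; the action names below are hypothetical labels for the behaviors listed above.

```python
# Hypothetical registry of actions that pull stocked information back out.
RETRIEVAL_ACTIONS = {
    "call_out", "repeat_utterance", "gaze", "turn_face",
    "press_button", "clap", "wait_in_silence",
}

def should_flush_stock(detected_action: str) -> bool:
    return detected_action in RETRIEVAL_ACTIONS
```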
 There may also be more than one piece of stocked information. In that case, for example, the stocked information may be listed on the display 123 so that the user can select which information to output. As in the example of FIG. 9 above, stocked information can be discarded once a predetermined time has elapsed, and that time can be set arbitrarily. For example, depending on the content of the information, a period from several minutes to several hours or several days may be set as the time until the stocked information is discarded.
 (5. Hardware configuration)
 Next, the hardware configuration of an information processing apparatus according to an embodiment of the present disclosure will be described with reference to FIG. 11. FIG. 11 is a block diagram illustrating an example hardware configuration of the information processing apparatus according to an embodiment of the present disclosure. The illustrated information processing apparatus 900 can realize, for example, the terminal device or the server in the embodiments described above.
 The information processing apparatus 900 includes a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 903, and a RAM (Random Access Memory) 905. The information processing apparatus 900 may also include a host bus 907, a bridge 909, an external bus 911, an interface 913, an input device 915, an output device 917, a storage device 919, a drive 921, a connection port 923, and a communication device 925. Furthermore, the information processing apparatus 900 may include an imaging device 933 and a sensor 935 as necessary. The information processing apparatus 900 may have, instead of or together with the CPU 901, a processing circuit such as a DSP (Digital Signal Processor) or an ASIC (Application Specific Integrated Circuit).
 The CPU 901 functions as an arithmetic processing device and a control device, and controls all or part of the operation of the information processing apparatus 900 according to various programs recorded in the ROM 903, the RAM 905, the storage device 919, or a removable recording medium 927. The ROM 903 stores programs, calculation parameters, and the like used by the CPU 901. The RAM 905 temporarily stores programs used in the execution of the CPU 901 and parameters that change as appropriate during that execution. The CPU 901, the ROM 903, and the RAM 905 are connected to one another by a host bus 907 formed by an internal bus such as a CPU bus. The host bus 907 is further connected via a bridge 909 to an external bus 911 such as a PCI (Peripheral Component Interconnect/Interface) bus.
 The input device 915 is a device operated by the user, such as a mouse, a keyboard, a touch panel, buttons, switches, and levers. The input device 915 may be, for example, a remote control device using infrared or other radio waves, or an externally connected device 929 such as a mobile phone that supports the operation of the information processing apparatus 900. The input device 915 includes an input control circuit that generates an input signal based on the information input by the user and outputs it to the CPU 901. By operating the input device 915, the user inputs various data to the information processing apparatus 900 and instructs it to perform processing operations.
 The output device 917 is a device capable of notifying the user of acquired information visually or audibly. The output device 917 can be, for example, a display device such as an LCD (Liquid Crystal Display), a PDP (Plasma Display Panel), or an organic EL (Electro-Luminescence) display, an audio output device such as a speaker or headphones, or a printer device. The output device 917 outputs the results obtained by the processing of the information processing apparatus 900 as video such as text or images, or as sound such as voice or audio.
 The storage device 919 is a data storage device configured as an example of the storage unit of the information processing apparatus 900. The storage device 919 is formed by, for example, a magnetic storage device such as an HDD (Hard Disk Drive), a semiconductor storage device, an optical storage device, or a magneto-optical storage device. The storage device 919 stores programs executed by the CPU 901, various data, various data acquired from the outside, and the like.
 The drive 921 is a reader/writer for a removable recording medium 927 such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, and is built into or externally attached to the information processing apparatus 900. The drive 921 reads information recorded on the mounted removable recording medium 927 and outputs it to the RAM 905. The drive 921 also writes records to the mounted removable recording medium 927.
 The connection port 923 is a port for connecting a device directly to the information processing apparatus 900. The connection port 923 can be, for example, a USB (Universal Serial Bus) port, an IEEE 1394 port, or a SCSI (Small Computer System Interface) port. The connection port 923 may also be an RS-232C port, an optical audio terminal, an HDMI (registered trademark) (High-Definition Multimedia Interface) port, or the like. By connecting the externally connected device 929 to the connection port 923, various data can be exchanged between the information processing apparatus 900 and the externally connected device 929.
 The communication device 925 is a communication interface formed by, for example, a communication device for connecting to a communication network 931. The communication device 925 can be, for example, a communication card for wired or wireless LAN (Local Area Network), Bluetooth (registered trademark), or WUSB (Wireless USB). The communication device 925 may also be a router for optical communication, a router for ADSL (Asymmetric Digital Subscriber Line), or a modem for various kinds of communication. The communication device 925 transmits and receives signals and the like to and from, for example, the Internet and other communication devices using a predetermined protocol such as TCP/IP. The communication network 931 connected to the communication device 925 is a network connected by wire or wirelessly, such as the Internet, a home LAN, infrared communication, radio-wave communication, or satellite communication.
 The imaging device 933 is a device that images real space and generates a captured image using various members, such as an imaging element, for example a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, and a lens for controlling the formation of a subject image on the imaging element. The imaging device 933 may capture still images or moving images.
 The sensor 935 comprises various sensors such as an acceleration sensor, a gyro sensor, a geomagnetic sensor, an optical sensor, and a sound sensor. The sensor 935 acquires information about the state of the information processing apparatus 900 itself, such as the attitude of its housing, and information about the surrounding environment of the information processing apparatus 900, such as the brightness and noise around it. The sensor 935 may also include a GPS sensor that receives GPS (Global Positioning System) signals and measures the latitude, longitude, and altitude of the apparatus.
 An example of the hardware configuration of the information processing apparatus 900 has been shown above. Each of the components described above may be configured using general-purpose members or by hardware specialized for the function of that component. Such a configuration can be changed as appropriate according to the technical level at the time of implementation.
 (6. Supplement)
 Embodiments of the present disclosure may include, for example, an information processing apparatus (a terminal device or a server) as described above, a system, an information processing method executed by the information processing apparatus or the system, a program for causing the information processing apparatus to function, and a non-transitory tangible medium on which the program is recorded.
 The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings, but the technical scope of the present disclosure is not limited to these examples. It is evident that a person having ordinary knowledge in the technical field of the present disclosure can conceive of various changes and modifications within the scope of the technical ideas described in the claims, and it is understood that these naturally belong to the technical scope of the present disclosure.
 The effects described in this specification are merely explanatory or illustrative and are not limiting. That is, the technology according to the present disclosure can exhibit, together with or instead of the above effects, other effects that are apparent to those skilled in the art from the description of this specification.
 The following configurations also belong to the technical scope of the present disclosure.
 (1) An information processing apparatus including a processor configured to: acquire data indicating an action of a user; calculate, based on the acquired data, an expected value of attention directed to information output to the user; and provide the expected value for output control of the information.
 (2) The information processing apparatus according to (1), wherein the processor calculates the expected value based on the latest action of the user.
 (3) The information processing apparatus according to (1) or (2), wherein the processor calculates the expected value based on a history of actions of the user.
 (4) The information processing apparatus according to any one of (1) to (3), wherein the action of the user includes a motion or facial expression of the user.
 (5) The information processing apparatus according to (4), wherein the action of the user includes a reaction to the information already output.
 (6) The information processing apparatus according to (5), wherein the processor is further configured to correct a calculation rule for the expected value based on data indicating the reaction.
 (7) The information processing apparatus according to any one of (1) to (6), wherein the expected value is provided in order to decide whether to output the information.
 (8) The information processing apparatus according to (7), wherein the processor further executes output control of the information and outputs, when the expected value is high, the information that was not output because the expected value was low.
 (9) The information processing apparatus according to any one of (1) to (8), wherein the expected value is provided in order to select an output method for the information.
 (10) The information processing apparatus according to any one of (1) to (9), wherein the processor infers the action of the user based on the acquired data and adjusts the expected value based on the accuracy of the inference.
 (11) The information processing apparatus according to (10), wherein the processor brings the expected value closer to an average value when the accuracy of the inference is low.
 (12) The information processing apparatus according to any one of (1) to (11), wherein the processor raises the expected value when the action of the user includes the utterance of a specific phrase.
 (13) The information processing apparatus according to any one of (1) to (12), wherein the processor calculates the expected value based on the user's surrounding environment inferred as the action of the user.
 (14) An information processing method including, by a processor: acquiring data indicating an action of a user; calculating, based on the acquired data, an expected value of attention directed to information output to the user; and providing the expected value for output control of the information.
 (15) A program for causing a computer to realize the functions of: acquiring data indicating an action of a user; calculating, based on the acquired data, an expected value of attention directed to information output to the user; and providing the expected value for output control of the information.
 10  System
 101  Camera
 103  Sensor
 105  Microphone
 107  Action data acquisition unit
 109  Action DB
 113  Expected attention value calculation unit
 115  Output control unit
 117  Information generation unit
 119  Information cache DB
 123  Display
 125  Speaker
 127  Other output device
 129  Feedback analysis unit
 131  Calculation rule DB

Claims (15)

  1.  An information processing apparatus comprising a processor configured to:
      acquire data indicating an action of a user;
      calculate, based on the acquired data, an expected value of attention directed to information output to the user; and
      provide the expected value for output control of the information.
  2.  The information processing apparatus according to claim 1, wherein the processor calculates the expected value based on the latest action of the user.
  3.  The information processing apparatus according to claim 1, wherein the processor calculates the expected value based on a history of actions of the user.
  4.  The information processing apparatus according to claim 1, wherein the action of the user includes a motion or facial expression of the user.
  5.  The information processing apparatus according to claim 4, wherein the action of the user includes a reaction to the information already output.
  6.  The information processing apparatus according to claim 5, wherein the processor is further configured to correct a calculation rule for the expected value based on data indicating the reaction.
  7.  The information processing apparatus according to claim 1, wherein the expected value is provided in order to decide whether to output the information.
  8.  The information processing apparatus according to claim 7, wherein the processor further executes output control of the information and outputs, when the expected value is high, the information that was not output because the expected value was low.
  9.  The information processing apparatus according to claim 1, wherein the expected value is provided in order to select an output method for the information.
  10.  The information processing apparatus according to claim 1, wherein the processor infers the action of the user based on the acquired data and adjusts the expected value based on the accuracy of the inference.
  11.  The information processing apparatus according to claim 10, wherein the processor brings the expected value closer to an average value when the accuracy of the inference is low.
  12.  The information processing apparatus according to claim 1, wherein the processor raises the expected value when the action of the user includes the utterance of a specific phrase.
  13.  The information processing apparatus according to claim 1, wherein the processor calculates the expected value based on the user's surrounding environment inferred as the action of the user.
  14.  An information processing method comprising, by a processor:
      acquiring data indicating an action of a user;
      calculating, based on the acquired data, an expected value of attention directed to information output to the user; and
      providing the expected value for output control of the information.
  15.  A program for causing a computer to realize the functions of:
      acquiring data indicating an action of a user;
      calculating, based on the acquired data, an expected value of attention directed to information output to the user; and
      providing the expected value for output control of the information.
PCT/JP2014/078111 2014-01-09 2014-10-22 Information processing device, information processing method, and program WO2015104883A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2014002536A JP2015132878A (en) 2014-01-09 2014-01-09 Information processing device, information processing method and program
JP2014-002536 2014-01-09

Publications (1)

Publication Number Publication Date
WO2015104883A1 true WO2015104883A1 (en) 2015-07-16

Family

ID=53523725

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/078111 WO2015104883A1 (en) 2014-01-09 2014-10-22 Information processing device, information processing method, and program

Country Status (2)

Country Link
JP (1) JP2015132878A (en)
WO (1) WO2015104883A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11183167B2 (en) 2017-03-24 2021-11-23 Sony Corporation Determining an output position of a subject in a notification based on attention acquisition difficulty

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020004137A (en) * 2018-06-28 2020-01-09 エヌ・ティ・ティ・コミュニケーションズ株式会社 Evaluation device, evaluation method, and evaluation program
JP7120060B2 (en) 2019-02-06 2022-08-17 トヨタ自動車株式会社 VOICE DIALOGUE DEVICE, CONTROL DEVICE AND CONTROL PROGRAM FOR VOICE DIALOGUE DEVICE
WO2021039191A1 (en) 2019-08-27 2021-03-04 ソニー株式会社 Information processing device, method for controlling same, and program
WO2021039190A1 (en) 2019-08-27 2021-03-04 ソニー株式会社 Information processing device, method for controlling same, and program

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009223187A (en) * 2008-03-18 2009-10-01 Pioneer Electronic Corp Display content controller, display content control method and display content control method program
JP2011135419A (en) * 2009-12-25 2011-07-07 Fujitsu Ten Ltd Data communication system, on-vehicle machine, communication terminal, server device, program, and data communication method
JP2011239247A (en) * 2010-05-12 2011-11-24 Nippon Hoso Kyokai <Nhk> Digital broadcast receiver and related information presentation program

Also Published As

Publication number Publication date
JP2015132878A (en) 2015-07-23

Similar Documents

Publication Publication Date Title
AU2018241137B2 (en) Dynamic thresholds for always listening speech trigger
AU2019200295B2 (en) Far-field extension for digital assistant services
US10373617B2 (en) Reducing the need for manual start/end-pointing and trigger phrases
WO2015104883A1 (en) Information processing device, information processing method, and program
US20230074406A1 (en) Using large language model(s) in generating automated assistant response(s
JPWO2019098038A1 (en) Information processing device and information processing method
US11244682B2 (en) Information processing device and information processing method
KR102356623B1 (en) Virtual assistant electronic device and control method thereof
US20200327890A1 (en) Information processing device and information processing method
US11250873B2 (en) Information processing device and information processing method
WO2023038654A1 (en) Using large language model(s) in generating automated assistant response(s)
WO2018139036A1 (en) Information processing device, information processing method, and program
WO2017175442A1 (en) Information processing device and information processing method
WO2019146187A1 (en) Information processing device and information processing method
US20200234187A1 (en) Information processing apparatus, information processing method, and program
JP2021113835A (en) Voice processing device and voice processing method
US11430429B2 (en) Information processing apparatus and information processing method
US20200342870A1 (en) Information processing device and information processing method
WO2018139050A1 (en) Information processing device, information processing method, and program
US20230367960A1 (en) Summarization based on timing data
US11803352B2 (en) Information processing apparatus and information processing method
US11386870B2 (en) Information processing apparatus and information processing method
US20230306968A1 (en) Digital assistant for providing real-time social intelligence
WO2019054009A1 (en) Information processing device, information processing method and program
JP2021119642A (en) Information processing device, information processing method, and recording medium

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 14878120; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 14878120; Country of ref document: EP; Kind code of ref document: A1)