WO2017130474A1 - Information processing device, information processing method, and program - Google Patents
- Publication number
- WO2017130474A1 (international application PCT/JP2016/080485)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information processing
- information
- processing apparatus
- present
- content
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/34—Browsing; Visualisation therefor
- G06F16/345—Summarisation for human users
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Definitions
- the present disclosure relates to an information processing apparatus, an information processing method, and a program.
- When a person who speaks (hereinafter referred to as a “speaker”) speaks, it is difficult for the speaker to utter only what he or she wants to convey.
- This disclosure proposes a new and improved information processing apparatus, information processing method, and program capable of summarizing the content of an utterance.
- According to the present disclosure, there is provided an information processing apparatus including a processing unit that performs a summarization process for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating a weight related to the summary.
- According to the present disclosure, there is also provided an information processing method executed by an information processing apparatus, the method including a step of performing a summarization process for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating a weight related to the summary.
- According to the present disclosure, there is also provided a program for causing a computer to implement a function of performing a summarization process for summarizing the content of utterances indicated by voice information based on a user's utterances, on the basis of acquired information indicating a weight related to the summary.
- the information processing method according to the present embodiment will be described by dividing it into a first information processing method and a second information processing method.
- a case where the same information processing apparatus performs both the processing related to the first information processing method and the processing related to the second information processing method will be mainly described.
- the information processing apparatus that performs the process according to the above may be different from the information processing apparatus that performs the process according to the second information processing method.
- a person who is a target of processing related to the information processing method according to the present embodiment is indicated as “user”.
- Examples of the user include a “speaker (or a person who can be a speaker)” (when a first information processing method described later is performed) and an “operator of an operation device related to notification” (when a second information processing method described later is performed).
- the information processing apparatus performs processing for summarizing the content of the utterance (hereinafter referred to as “summarization processing”) as processing related to the first information processing method.
- the information processing apparatus summarizes the content of the utterance indicated by the voice information based on the user's utterance based on the information indicating the weight related to the acquired summary.
- In the summarization, for example, the content of an utterance is selected based on the weight related to the summary, or a part of the content of the utterance is extracted based on the weight related to the summary.
- The information indicating the weight related to the summary includes, for example, data indicating the weight related to the summary, which is stored in a table (or database; hereinafter the same applies) for setting the weight related to the summary described later. The information indicating the weight related to the summary may also be data indicating that the weight related to the summary is relatively large or small. The information indicating the weight related to the summary is acquired, for example, by referring to the table for setting the weight related to the summary described later.
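The weight-based selection and extraction described above can be sketched as a simple scoring pass over the words of an utterance. This is a minimal illustrative sketch, not the disclosed implementation: the weight values, the vocabulary, and the fixed `keep` count are all assumptions for illustration.

```python
# Minimal sketch of weight-based extractive summarization.
# The weight table, the example utterance, and the `keep` count are
# illustrative assumptions, not values from the disclosure.

def summarize(words, weights, keep=3):
    """Keep the `keep` highest-weighted words, preserving original order."""
    # Rank word positions by their summary weight (unknown words score 0).
    ranked = sorted(range(len(words)),
                    key=lambda i: weights.get(words[i], 0.0),
                    reverse=True)
    kept = sorted(ranked[:keep])  # restore utterance order
    return [words[i] for i in kept]

# Hypothetical weights favoring time and place vocabulary.
weights = {"station": 0.9, "10am": 0.8, "meet": 0.7}
utterance = "please meet me at the station tomorrow at 10am".split()
print(summarize(utterance, weights))  # ['meet', 'station', '10am']
```

The sketch keeps only the words the weight table marks as important, which matches the stated goal that more important words end up in the summarized utterance.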
- the voice information according to the present embodiment is voice data including voice based on the utterance of the speaker.
- the voice information according to the present embodiment is generated when a voice input device such as a microphone picks up voice based on the utterance of the speaker.
- the audio information according to the present embodiment may be information obtained by converting an analog signal generated according to the audio picked up by the audio input device into a digital signal by an AD (Analog-to-Digital) converter.
- The voice input device (or the voice input device and the AD converter) may be included in the information processing apparatus according to the present embodiment, or may be a device external to the information processing apparatus according to the present embodiment.
- The content of the utterance indicated by the voice information includes, for example, a character string indicated by text data obtained as a result of arbitrary voice recognition processing performed on the voice information (hereinafter referred to as “voice text information”).
- the information processing apparatus recognizes the character string indicated by the voice text information as the content of the utterance indicated by the voice information, and summarizes the character string indicated by the voice text information.
- the voice recognition processing for the voice information may be performed by the information processing apparatus according to the present embodiment, or may be performed by an external device of the information processing apparatus according to the present embodiment.
- When the information processing apparatus according to the present embodiment performs the speech recognition process, the information processing apparatus summarizes the character string indicated by the speech text information obtained as a result of performing the speech recognition process on the acquired speech information.
- When an external device of the information processing apparatus according to the present embodiment performs the speech recognition processing, the information processing apparatus according to the present embodiment summarizes the character string indicated by the speech text information acquired from the external device.
- The voice recognition process may be performed repeatedly, for example, periodically or aperiodically, or may be performed in response to a predetermined trigger, such as the timing at which the voice information is acquired.
- the voice recognition process may be performed when a predetermined operation such as a voice recognition start operation related to the summary is performed, for example.
- The weight related to the summary according to the present embodiment is an index for extracting more important words (in other words, words that the speaker will want to convey) from the content of the utterance indicated by the voice information. Because the content of the utterance indicated by the voice information is summarized based on the weight related to the summary, more important words corresponding to that weight are included in the content of the summarized utterance.
- The weight related to the summary according to the present embodiment is set based on at least one (one or two or more) of, for example, the voice information, information about the user, information about the application, information about the environment, and information about the device, as shown below.
- the information on the user includes, for example, at least one of user status information indicating the user status and user operation information based on the user operation.
- Examples of the user state include an action taken by the user (including an operation such as a gesture) and the emotional state of the user.
- The user state is estimated by an arbitrary action estimation process or an arbitrary emotion estimation process using one or more of user biometric information obtained from an arbitrary biosensor, a detection result of a motion sensor such as an acceleration sensor or an angular velocity sensor, and a captured image captured by an imaging device.
- the processing related to the estimation of the user state may be performed by the information processing apparatus according to the present embodiment, or may be performed by an external device of the information processing apparatus according to the present embodiment.
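As a concrete illustration of action estimation from a motion-sensor detection result, the sketch below classifies a coarse user action from accelerometer samples. The magnitude thresholds and the action labels are assumptions for illustration only; the disclosure leaves the estimation process arbitrary.

```python
# Illustrative sketch: estimating a coarse user action from accelerometer
# readings. The thresholds (0.5, 3.0 m/s^2) and labels are assumptions,
# not part of the disclosure, which permits any estimation process.
import math

def estimate_action(accel_samples):
    """accel_samples: list of (x, y, z) accelerations in m/s^2."""
    mags = [math.sqrt(x * x + y * y + z * z) for (x, y, z) in accel_samples]
    mean = sum(mags) / len(mags)
    # Deviation of the mean magnitude from gravity (~9.8 m/s^2)
    # serves as a crude activity measure.
    activity = abs(mean - 9.8)
    if activity < 0.5:
        return "still"
    if activity < 3.0:
        return "walking"
    return "running"

print(estimate_action([(0.0, 0.0, 9.8)] * 10))  # "still"
```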
- examples of user operations include various operations such as a speech recognition start operation related to summarization and an operation for starting a predetermined application.
- the information about the application indicates, for example, the execution state of the application.
- the information regarding the environment indicates, for example, a situation around the user (or a situation where the user is placed).
- Examples of the information regarding the environment include data indicating the level of noise around the user.
- The level of noise around the user is specified, for example, by extracting non-speech components from the voice information generated by a microphone and performing threshold processing using one or two or more thresholds for level classification.
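The threshold processing for level classification mentioned above can be sketched as follows. The concrete threshold values and the three class labels are assumptions for illustration; the disclosure only requires one or more thresholds.

```python
# Illustrative sketch of classifying an ambient noise level with thresholds.
# The threshold values and the class names are assumptions, not values
# from the disclosure.

def classify_noise(level, thresholds=(30.0, 60.0)):
    """Map a measured non-speech level to 'low' / 'medium' / 'high'."""
    low, high = thresholds
    if level < low:
        return "low"
    if level < high:
        return "medium"
    return "high"

print(classify_noise(45.0))  # "medium" under the assumed thresholds
```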
- the processing related to the acquisition of information related to the environment as described above may be performed by the information processing apparatus according to the present embodiment, or may be performed by an external apparatus of the information processing apparatus according to the present embodiment.
- the information about the device indicates, for example, one or both of the device type and the device state.
- Examples of the state of the device include a processing load of a processor included in the device.
- the content of the utterance indicated by the voice information is summarized by performing the summarization process according to the first information processing method. Therefore, the content of the utterance of the speaker indicated by the voice information can be simplified.
- Because the content of the utterance is summarized based on the weights related to the summary set as described above, more important words corresponding to the weights related to the summary are included in the content of the summarized utterance.
- the information processing apparatus performs processing for controlling notification of notification content (hereinafter referred to as “notification control processing”) based on summary information as processing related to the second information processing method.
- the summary information indicates the content of the summarized utterance corresponding to the voice information based on the utterance of the first user.
- the summary information is obtained, for example, by performing summary processing according to the first information processing method.
- The content of the summarized utterance indicated by the summary information is not limited to the above, and may be obtained by any method capable of summarizing the utterance content indicated by the voice information based on the user's utterance.
- the summary information indicates the content of the summarized utterance obtained by performing the summary processing according to the first information processing method.
- the information processing apparatus controls the notification of the notification content to the second user.
- The notification content for the second user may be, for example, the content of the summarized utterance indicated by the summary information, or may be notification content different from the content of the summarized utterance, such as the translated content of the summarized utterance.
- The first user according to the present embodiment and the second user according to the present embodiment may be different or the same. As an example of the case where the first user and the second user are different, there is a case where the first user is a speaker and the second user is a communication partner. As an example of the case where the first user and the second user are the same, there is a case where the same speaker is both the first user and the second user.
- the information processing apparatus causes notification contents to be notified by one or both of notification by a visual method and notification by an auditory method, for example.
- the information processing apparatus causes the notification content to be displayed on the display screen of the display device.
- The information processing apparatus according to the present embodiment displays the notification content on the display screen of the display device by, for example, transmitting to the display device a display control signal including display data corresponding to the notification content and a display command.
- Examples of the display device having the display screen on which the notification content is displayed include a display device that constitutes a display unit (described later) included in the information processing apparatus according to the present embodiment, and a display device external to the information processing apparatus according to the present embodiment.
- When the display device on which the notification content is displayed is an external display device, the information processing apparatus according to the present embodiment transmits the display control signal to the external display device via, for example, a communication unit (described later) included in the information processing apparatus according to the present embodiment or a communication device external to the information processing apparatus according to the present embodiment.
- The information processing apparatus according to the present embodiment notifies the notification content by, for example, outputting the notification content as sound (which may include music) from a sound output device such as a speaker.
- In this case, the information processing apparatus according to the present embodiment causes the audio output device to output the sound by, for example, transmitting to the audio output device an audio output control signal including audio data indicating the sound corresponding to the notification content and an audio output command.
- The audio output device that outputs the notification content as sound may be, for example, an audio output device included in the information processing apparatus according to the present embodiment, or may be an audio output device external to the information processing apparatus according to the present embodiment.
- When the audio output device that outputs the notification content as sound is an external audio output device, the information processing apparatus according to the present embodiment transmits the audio output control signal to the external audio output device via, for example, a communication unit (described later) included in the information processing apparatus according to the present embodiment or a communication device external to the information processing apparatus according to the present embodiment.
- the notification method of the notification content in the information processing apparatus according to the present embodiment is not limited to one or both of the notification method using the visual method and the notification method using the auditory method.
- the information processing apparatus according to the present embodiment can also notify a break in the notification content by a tactile notification method, for example, by vibrating a vibration device.
- the notification content based on the summarized utterance content obtained by the summarization process according to the first information processing method is notified.
- The content of the summarized utterance obtained by the summary processing according to the first information processing method is a summary result that can further reduce the possibility of occurrence of an “event caused by the difficulty of uttering only the content that the speaker wants to convey”.
- Because the notification content is notified, it is possible to further reduce the possibility of occurrence of an “event caused by the difficulty of uttering only the content that the speaker wants to convey”, such as “the communication partner needs time to understand the content that the speaker wants to convey” or “translation takes time”.
- The processes related to the information processing method according to the present embodiment are not limited to the summary process related to the first information processing method and the notification control process related to the second information processing method described above.
- The process related to the information processing method according to the present embodiment may further include a process of translating the content of the utterance summarized by the summary process according to the first information processing method into another language (hereinafter referred to as “translation process”).
- In the translation process, the content of the summarized utterance is translated from a first language corresponding to the voice information based on the utterance into a second language different from the first language.
- Hereinafter, the translated content of the summarized utterance obtained by performing the translation process is referred to as the “translation result”.
- the translation processing according to the present embodiment may be performed as part of the processing according to the first information processing method, or may be performed as part of the processing according to the second information processing method.
- The process related to the information processing method according to the present embodiment may further include a recording control process for recording, on an arbitrary recording medium, one or both of the result of the summary processing according to the first information processing method and the result of the translation processing according to the present embodiment.
- In the recording control process, for example, “one or both of the result of the summary processing according to the first information processing method and the result of the translation processing according to the present embodiment” and information related to the user, such as position information corresponding to the user or the user's biometric information obtained from an arbitrary biosensor (described later), may be associated with each other and recorded as a log.
- the use case to which the information processing method according to this embodiment is applied is not limited to “conversation support”.
- The information processing method according to the present embodiment can be applied to any use case in which the content of the utterance indicated by the voice information can be summarized, such as the following: a “meeting with minutes”, realized by summarizing the utterances indicated by voice information representing the voices of a meeting, generated by an IC (Integrated Circuit) recorder or the like.
- 1 to 5 are explanatory diagrams for explaining an example of a use case to which the information processing method according to this embodiment is applied.
- In FIGS. 1, 2, and 5, the person indicated by “U1” corresponds to the user according to the present embodiment, and in FIGS. 2 and 5, the person indicated by “U2” corresponds to the partner with whom the user U1 communicates.
- Hereinafter, the person indicated by “U1” in FIGS. 1, 2, and 5 is referred to as “user U1”, and the person indicated by “U2” in FIGS. 2 and 5 is referred to as “communication partner U2”.
- a case where the native language of the communication partner U2 is Japanese is taken as an example.
- FIG. 1, FIG. 2, and FIG. 5 show an example in which the user U1 is wearing an eyewear type device having a display screen.
- An audio input device such as a microphone, an audio output device such as a speaker, and an imaging device are connected to the eyewear type apparatus worn by the user U1 shown in FIGS. 1, 2, and 5.
- Examples of the information processing apparatus according to the present embodiment include a wearable apparatus used by being worn on the body of the user U1, such as the eyewear type apparatus shown in FIG. 1, a communication device such as a smartphone, and a computer such as a server.
- the information processing apparatus sets a weight related to a summary, for example, by using a table for setting a weight related to the summary.
- The table for setting the weight related to the summary may be stored in a storage unit (described later) included in the information processing apparatus according to the present embodiment, or may be stored in a recording medium external to the information processing apparatus according to the present embodiment.
- the information processing apparatus according to the present embodiment uses, for example, a table for setting a weight related to summarization by appropriately referring to a storage unit (described later) or an external recording medium.
- the information processing apparatus can set the weight related to the summary by determining the weight related to the summary using an arbitrary algorithm for determining the weight related to the summary, for example.
- 6 to 8 are explanatory diagrams showing examples of tables for setting the weights related to the summary according to the present embodiment.
- FIG. 6 shows an example of a table for specifying the weight related to the summary, and shows an example of the table weighted for each type of weight related to the summary for each registered vocabulary.
- the combination indicated by the value “1” corresponds to the weighted combination.
- the combination indicated by the value “0” corresponds to the combination that is not weighted.
- FIGS. 7 and 8 show examples of tables for specifying the types of weights related to the summary.
- FIG. 7 shows an example of a table in which schedule contents specified from the state of the schedule application (or schedule contents estimated from the state of the schedule application) and weight types related to the summary are associated with each other.
- FIG. 8 shows an example of a table in which user behavior (an example of a user state) is associated with a summary weight type.
- The information processing apparatus according to the present embodiment sets the weight related to the summary by using, as the tables for setting the weight related to the summary, both a table for specifying the type of weight related to the summary, as shown in FIGS. 7 and 8, and a table for specifying the weight related to the summary, as shown in FIG. 6.
- It goes without saying that the table for specifying the type of weight related to the summary according to the present embodiment is not limited to the examples shown in FIGS. 7 and 8, and that the table for specifying the weight related to the summary is not limited to the example shown in FIG. 6. The table for setting the weight related to the summary according to the present embodiment may also be provided for each language, such as Japanese, English, and Chinese.
- When the information processing apparatus according to the present embodiment determines the type of weight related to the summary based on at least one of, for example, the voice information, information about the user, information about the application, information about the environment, and information about the device, it is also possible to set the weight related to the summary using only the table for specifying the weight related to the summary as shown in FIG. 6.
- The information processing apparatus according to the present embodiment determines the type of weight related to the summary by, for example, selecting from the table for specifying the weight related to the summary illustrated in FIG. 6 the type of weight related to a recognition result based on at least one of the voice information, information about the user, information about the application, information about the environment, and information about the device. Then, the information processing apparatus according to the present embodiment refers to, for example, the table for specifying the weight related to the summary illustrated in FIG. 6, and sets a weight for the vocabulary corresponding to each combination of the determined weight type and vocabulary indicated by the value “1”.
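The two-step lookup described above, from a recognized state to a weight type (in the style of FIGS. 7 and 8) and then from the weight type to the flagged vocabulary (in the style of FIG. 6), can be sketched as follows. The table contents below are invented placeholders, not the actual entries of the patent's figures.

```python
# Sketch of the two-table lookup: state -> weight type -> weighted vocabulary.
# All table contents are invented placeholders, not the patent's figures.

# Analogous to FIGS. 7/8: a recognized user state maps to a weight type.
TYPE_TABLE = {"moving": "time", "in game": "game term", "meal": "cooking"}

# Analogous to FIG. 6: (vocabulary, weight type) -> 1 (weighted) or 0 (not).
WEIGHT_TABLE = {
    ("AM", "time"): 1,
    ("when", "time"): 1,
    ("AM", "game term"): 0,
    ("item", "game term"): 1,
}

def weighted_vocab(user_action):
    """Return the vocabulary entries weighted for the recognized action."""
    weight_type = TYPE_TABLE[user_action]
    return sorted(vocab for (vocab, wtype), flag in WEIGHT_TABLE.items()
                  if wtype == weight_type and flag == 1)

print(weighted_vocab("moving"))  # ['AM', 'when'] with these placeholder tables
```

With these placeholder tables, recognizing that the user is “moving” selects the “time” weight type, so vocabulary such as “AM” and “when” receives a weight, mirroring the worked example in the text.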
- the information processing apparatus sets a weight related to the summary by performing, for example, any one of the following processes (a-1) to (a-5).
- examples relating to the setting of weights related to summarization are not limited to the examples shown in (a-1) to (a-5) below.
- the information processing apparatus can set a weight related to summarization according to a language recognized based on voice information.
- Examples of weight settings for summarization according to language include “if the language recognized based on the speech information is Japanese, increase the weight of verbs” and “if the language recognized based on the speech information is English, increase the weight of nouns”.
- The information processing apparatus according to the present embodiment may also set a weight related to the summary according to the situation around the user indicated by the information regarding the environment, and a weight related to the summary according to the content indicated by the information regarding the device (for example, the device type).
- (A-1) First example of setting of weight related to summary: an example of setting of weight related to summary based on user status indicated by user status information included in information related to user
- When the user U1 uses a device such as a smartphone, the information processing apparatus according to the present embodiment recognizes, for example, that the user U1 is moving toward a destination. Then, the information processing apparatus according to the present embodiment sets the weight related to the summary corresponding to the recognition result by referring to the table for setting the weight related to the summary.
- Based on the recognition result, obtained as described above, that the user U1 is moving toward the destination, the information processing apparatus according to the present embodiment specifies “time”, which corresponds to the action “moving”, as the type of weight related to the summary from the table for specifying the type of weight related to the summary illustrated in FIG. 8. Then, the information processing apparatus according to the present embodiment refers to the table for specifying the weight related to the summary illustrated in FIG. 6, and sets a weight for the vocabulary corresponding to each combination of the specified weight type and vocabulary indicated by the value “1”. When the table for specifying the weight related to the summary shown in FIG. 6 is used, weights are set for the vocabulary “AM”, “when”, and so on.
- When the user U1 operates a device such as a smartphone and starts a game application, the information processing apparatus according to the present embodiment recognizes that the user U1 is playing a game. Then, the information processing apparatus according to the present embodiment sets the weight related to the summary corresponding to the recognition result by referring to the table for setting the weight related to the summary.
- Based on the recognition result, obtained as described above, that the user U1 is playing a game, the information processing apparatus according to the present embodiment specifies “game term”, which corresponds to the action “in game”, as the type of weight related to the summary from the table for specifying the type of weight related to the summary illustrated in FIG. 8. Then, the information processing apparatus according to the present embodiment refers to the table for specifying the weight related to the summary illustrated in FIG. 6, and sets a weight for the vocabulary corresponding to each combination of the determined weight type and vocabulary indicated by the value “1”.
- Based on the recognition result, obtained as described above, that the user U1 is in the game, the information processing apparatus according to the present embodiment can also determine, as the type of weight related to the summary, a type included in the table for specifying the weight related to the summary illustrated in FIG. 6 that relates to the recognition result, such as “game term”. In this case as well, the information processing apparatus according to the present embodiment refers to the table for specifying the weight related to the summary illustrated in FIG. 6, and sets a weight for the vocabulary corresponding to each combination of the determined weight type and vocabulary indicated by the value “1”.
- The information processing apparatus according to the present embodiment can also set the weight related to the summary based on the recognition result of the state of the user U1 estimated based on, for example, the detection result of a motion sensor, such as an acceleration sensor or an angular velocity sensor, provided in an apparatus such as a smartphone used by the user U1.
- For example, based on a recognition result that the user U1 is having a meal, “cooking”, which corresponds to the action “meal”, is specified as the type of weight related to the summary from the table for specifying the type of weight related to the summary illustrated in FIG. 8.
- Then, the information processing apparatus according to the present embodiment refers to the table for specifying the weight related to the summary illustrated in FIG. 6, and sets a weight for each vocabulary item corresponding to a combination of the determined weight type and vocabulary whose value is indicated by “1”.
- (A-2) Second example of weight setting for summarization: an example of setting weight for summarization based on voice information
- the information processing apparatus sets weights for summarization based on voice information.
- the information processing apparatus determines the type of weight related to the summary, for example, as follows based on the audio information.
- When the average frequency band of the voice indicated by the voice information is, for example, 300 to 550 [Hz]: “male” is determined as the type of weight related to the summary.
- When the average frequency band of the voice indicated by the voice information is, for example, 400 to 700 [Hz]: “female” is determined as the type of weight related to the summary.
- When the sound pressure and sound volume of the voice indicated by the voice information are equal to or higher than the set first threshold value, or are larger than the first threshold value: one or both of “anger” and “joy” are determined as the type of weight related to the summary.
- Examples of the first threshold value include a fixed value such as 72 [dB]. Examples of the second threshold value include a fixed value such as 54 [dB].
- The first threshold value and the second threshold value may change dynamically depending on the distance between a user such as the user U1 and a communication partner such as the communication partner U2.
- For example, the threshold values may be changed in such a manner that the threshold value is raised by 6 [dB] every time the distance decreases by 0.5 [m] and lowered by 6 [dB] every time the distance increases by 0.5 [m].
- The distance may be estimated by, for example, arbitrary image processing on a captured image captured by an imaging device, or may be acquired by a distance sensor. When the distance is estimated, the processing related to the distance estimation may be performed by the information processing apparatus according to the present embodiment or by an apparatus external to the information processing apparatus according to the present embodiment.
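- The dynamic threshold rule described above can be sketched as follows. The 6 [dB] per 0.5 [m] step follows the example in the text, while the linear function shape and the notion of a base distance are assumptions of this sketch.

```python
def adjusted_threshold(base_db, base_distance_m, current_distance_m,
                       step_m=0.5, step_db=6.0):
    """Adjust a sound-pressure threshold with distance.

    The threshold is raised by step_db for every step_m the communication
    partner moves closer than the base distance, and lowered by step_db
    for every step_m the partner moves farther away.
    """
    steps = (base_distance_m - current_distance_m) / step_m
    return base_db + steps * step_db
```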
- the third threshold value and the fourth threshold value may be fixed values set in advance, or may be variable values that can be changed based on a user operation or the like.
- When an emotion (e.g., anger, joy, or sadness) is estimated based on the voice information, it is also possible to determine the type of weight related to the summary corresponding to the estimated emotion.
- Further, the information processing apparatus according to the present embodiment may change the strength of the weight related to the emotion based on, for example, the rate of change of the fundamental frequency obtained from the voice information, the rate of change of the sound, or the like.
- The information processing apparatus according to the present embodiment may determine the type of weight related to the summary using a table for specifying the type of weight related to the summary, or may set the weight related to the summary using only the table for specifying the weight related to the summary as shown in FIG. 6.
- In this case, the information processing apparatus according to the present embodiment refers to the table for specifying the weight related to the summary as illustrated in FIG. 6, as in the first example shown in (a-1), and sets a weight for the vocabulary corresponding to each combination of the specified weight type and vocabulary whose value is indicated by “1”.
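- The determination of weight types from voice features in (A-2) might look as follows in outline. The frequency bands and the 72 [dB] first threshold come from the examples above; resolving the overlap of the male/female bands (400 to 550 [Hz]) by returning both candidates is a design assumption of this sketch.

```python
FIRST_THRESHOLD_DB = 72.0  # assumed fixed value from the text

def weight_types_from_voice(avg_freq_hz, sound_pressure_db):
    """Determine candidate summary weight types from simple voice features."""
    types = []
    if 300.0 <= avg_freq_hz <= 550.0:
        types.append("male")
    if 400.0 <= avg_freq_hz <= 700.0:
        types.append("female")
    # Sound pressure at or above the first threshold suggests anger/joy.
    if sound_pressure_db >= FIRST_THRESHOLD_DB:
        types.extend(["anger", "joy"])
    return types
```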
- (A-3) Third example of weight setting for summarization: an example of setting the weight related to the summary based on the execution state of the application indicated by the information regarding the application. The information processing apparatus according to the present embodiment sets the weight related to the summary based on the execution state of the application.
- For example, when the user U1 operates a device such as a smartphone to start a schedule application and confirms a destination, the information processing apparatus according to the present embodiment specifies, based on the execution state of the schedule application, “time” and “location”, which correspond to the schedule content “place move (biz)”, as the types of weight related to the summary from the table for specifying the type of weight related to the summary. Then, the information processing apparatus according to the present embodiment refers to the table for specifying the weight related to the summary illustrated in FIG. 6, and sets a weight for each vocabulary item corresponding to a combination of the specified weight type and vocabulary whose value is indicated by “1”. When the table for specifying the weight related to the summary shown in FIG. 6 is used, weights are set for the vocabulary items “AM”, “Shibuya”, “when”, “where”, and so on.
- the information processing apparatus can determine the type of weight related to the summary based on the properties of the application being executed, for example, and set the weight related to the summary as described below.
- the map application is executed: “Time”, “Location”, “Person name”, etc. are determined as the types of weights related to the summary.
- the transfer guidance application is executed: “Time”, “Place”, “Train”, etc. are determined as the types of weights related to the summary.
- When an application for smoothly asking questions in a hearing about Japan is being executed: “question”, “Japan”, etc. are determined as the types of weights related to the summary.
- (A-4) Fourth example of weight setting for summarization: an example of setting the weight related to the summary based on the user's operation indicated by user operation information included in the information related to the user. The information processing apparatus according to the present embodiment sets the weight related to the summary based on the user's operation.
- For example, the information processing apparatus according to the present embodiment determines, as the type of weight related to the summary, the type selected by an operation of selecting the type of weight related to the summary (an example of the user's operation).
- Further, when a predetermined operation such as a speech recognition start operation related to the summary is performed, the information processing apparatus according to the present embodiment may automatically set the type of weight related to the summary that is associated in advance with the predetermined operation. As an example, when a speech recognition start operation related to the summary is performed, “question” or the like is determined as the type of weight related to the summary.
- In this case, the information processing apparatus according to the present embodiment refers to the table for specifying the weight related to the summary as illustrated in FIG. 6, as in the first example shown in (a-1), and sets a weight for the vocabulary corresponding to each combination of the specified weight type and vocabulary whose value is indicated by “1”.
- The information processing apparatus according to the present embodiment can also set the weight related to the summary by combining two or more of (a-1) to (a-4) above.
- The information processing apparatus according to the present embodiment performs the summarization process according to the first information processing method, and summarizes, for example, the content of the utterance indicated by voice information generated by a microphone connected to the eyewear-type apparatus illustrated in FIG. 1.
- the information processing apparatus summarizes a character string indicated by voice text information based on voice information, for example.
- The information processing apparatus according to the present embodiment summarizes the content of the utterance by, for example, an objective function using the weight related to the summary set by the processing shown in (a), as shown in Equation 1 below.
- In Equation 1, W is the weight related to the summary.
- a_i shown in Equation 1 is a parameter for adjusting the contribution ratio of each weight related to the summary, and takes, for example, a real number from 0 to 1.
- z_{y_i} is a binary variable indicating “1” if the phrase y_i is included in the summary and “0” if the phrase y_i is not included.
- Note that the information processing apparatus according to the present embodiment is not limited to the method using the objective function shown in Equation 1; any method capable of summarizing the content of the utterance using the set weight related to the summary can be used.
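- Since Equation 1 itself appears only as a figure, the following is a hedged sketch of one way to maximize the weighted sum of a_i * W_i * z_i under a length budget. A greedy score-per-character heuristic stands in for exact optimization, and the phrase lists and scores are illustrative.

```python
def summarize(phrases, weights, contributions, max_chars):
    """Greedy sketch of maximizing sum(a_i * W_i * z_i) under a length budget.

    phrases: candidate phrases y_i; weights: W_i for each phrase;
    contributions: a_i in [0, 1]; z_i is implicitly 1 for selected phrases.
    """
    # Order candidates by weighted score per character (a greedy heuristic).
    order = sorted(
        range(len(phrases)),
        key=lambda i: (contributions[i] * weights[i]) / max(len(phrases[i]), 1),
        reverse=True,
    )
    selected, used = [], 0
    for i in order:
        if used + len(phrases[i]) <= max_chars:
            selected.append(i)
            used += len(phrases[i])
    # Preserve the original phrase order in the output.
    return [phrases[i] for i in sorted(selected)]
```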
- FIG. 3 shows an example of the result of the summary process according to the first information processing method.
- FIG. 3A shows an example of the content of an utterance before it is summarized.
- B of FIG. 3 shows an example of the content of the summarized utterance, and C of FIG. 3 shows another example of the content of the summarized utterance.
- When the content of the utterance is summarized as shown in B of FIG. 3, the content of the utterance is simplified compared to the content before summarization. Therefore, summarizing the content of the utterance as shown in B of FIG. 3 can increase the possibility that the communication partner U2 understands what the user U1 is asking, even if the communication partner U2 cannot fully understand English.
- C of FIG. 3 shows an example in which the information processing apparatus according to the present embodiment further performs morphological analysis on the summary result illustrated in B of FIG. 3 and generates divided text by combining morphemes based on the result of the morphological analysis.
- For example, when the language of the character string indicated by the voice text information corresponding to the content of the utterance is Japanese, the information processing apparatus according to the present embodiment generates divided text in units in which morphemes other than the main parts of speech (nouns, verbs, adjectives, and adverbs) are combined. For example, when the language of the character string indicated by the voice text information corresponding to the content of the utterance is English, the information processing apparatus according to the present embodiment further generates divided text in units of 5W1H.
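- One possible reading of the divided-text rule for Japanese (attach morphemes other than the main parts of speech to a neighboring main part of speech) is sketched below. The input format of (surface, part-of-speech) pairs is a hypothetical stand-in for the output of any morphological analyzer; no specific analyzer is assumed.

```python
MAIN_POS = {"noun", "verb", "adjective", "adverb"}

def divide_text(morphemes):
    """Group morphemes into divided-text units.

    Each unit starts at a main-part-of-speech morpheme and absorbs the
    non-main morphemes (particles, auxiliaries, etc.) that follow it.
    morphemes: list of (surface, part_of_speech) pairs.
    """
    units, current = [], ""
    for surface, pos in morphemes:
        if pos in MAIN_POS and current:
            units.append(current)  # close the previous unit
            current = surface
        else:
            current += surface
    if current:
        units.append(current)
    return units
```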
- When the content of the utterance is summarized as shown in C of FIG. 3, the content of the utterance is simplified more than in the summary result shown in B of FIG. 3. Therefore, summarizing the content of the utterance as shown in C of FIG. 3 can further increase the possibility that the communication partner U2 understands what the user U1 is asking, compared with the case of obtaining the summary result shown in B of FIG. 3, even if the communication partner U2 cannot fully understand English.
- the information processing apparatus may further translate the content of the utterance summarized by the summary process shown in (b) above into another language, for example. As described above, the information processing apparatus according to the present embodiment translates the first language corresponding to the utterance into a second language different from the first language.
- The information processing apparatus according to the present embodiment specifies, for example, the position where the user U1 exists, and when the language of the character string indicated by the voice text information corresponding to the content of the utterance differs from the official language at the specified position, translates the content of the summarized utterance into the official language.
- The position where the user U1 exists is specified based on, for example, position information acquired from a wearable device worn by the user U1, such as the eyewear-type device shown in FIG. 1, or a communication device such as a smartphone possessed by the user U1. Examples of the position information include data indicating the detection result of a device capable of specifying a position, such as a GNSS (Global Navigation Satellite System) device (or data indicating the estimation result of a device capable of estimating the position by an arbitrary method).
- As another example, when the language of the character string indicated by the voice text information corresponding to the content of the utterance differs from a set language, the information processing apparatus according to the present embodiment may translate the summarized content of the utterance into the set language.
- The information processing apparatus according to the present embodiment translates the content of the summarized utterance into another language by processing based on an arbitrary algorithm capable of translation into another language.
- FIG. 4 shows an example of the result of the translation processing according to this embodiment.
- FIG. 4A shows the summary result shown in FIG. 3C as an example of the content of the summarized speech before being translated.
- B of FIG. 4 shows an example in which the summary result shown in C of FIG. 3 is translated into another language by the translation process; specifically, an example in which the summary result shown in C of FIG. 3 is translated into Japanese is shown.
- Hereinafter, a translation result obtained by translating divided text, such as the summary result shown in C of FIG. 3, may be referred to as “divided translation text”.
- As shown in B of FIG. 4, the summarized content of the utterance is translated into Japanese, which is the native language of the communication partner U2, so that the possibility that the communication partner U2 can understand what the user U1 is asking can be further increased compared with the case where the content of the summarized utterance is not translated.
- (D) An example of the notification control process according to the second information processing method. The information processing apparatus according to the present embodiment notifies the content of the utterance indicated by the voice information summarized by the summarization process shown in (b) above. Further, when the content of the summarized utterance has been translated into another language by further performing the translation process shown in (c) above, the information processing apparatus according to the present embodiment notifies the translation result.
- The information processing apparatus according to the present embodiment sets the summarized content of the utterance (or the translation result) as the notification content by, for example, one or both of notification by a visual method and notification by an auditory method.
- FIG. 5 shows an example of the result of the notification control process according to the present embodiment.
- FIG. 5 shows an example in which the translation result is audibly notified by outputting a voice indicating the translation result from a voice output device connected to the eyewear-type device worn by the user U1.
- FIG. 5 shows an example in which the translation result shown in B of FIG. 4 is notified.
- FIG. 5 shows an example in which, based on the voice information, the sound pressure at the location corresponding to the utterance location where the sound pressure is strong (the “why” portion shown in FIG. 5) is made stronger than at other locations.
- FIG. 5 also shows an example in which, when the voice indicating the translation result is output, the divisions of the divided text are notified by inserting sound feedback, as indicated by the symbol “S” in FIG. 5.
- notification realized by the notification control process according to the second information processing method is not limited to the example shown in FIG. Another example of the notification realized by the notification control process according to the second information processing method will be described later.
- the content (translation result) of the summarized utterance translated into Japanese which is the native language of the communication partner U2
- As described above, the information processing apparatus according to the present embodiment summarizes the content of the utterance indicated by the voice information based on the user's utterance, on the basis of the information indicating the weight related to the summary.
- As described above, the weight related to the summary is set based on, for example, one or more of the voice information, the user's state, the execution state of an application, and the user's operation. Further, as described above, the information processing apparatus according to the present embodiment summarizes the content of the utterance by an objective function using the set weight related to the summary, as shown in Equation 1 above, for example.
- the information processing apparatus can perform one or more of the following processes (1) to (3), for example, as the summary process.
- Examples of the start conditions for the summary processing according to the present embodiment include the following:
- a condition related to a non-speech period during which no utterance continues
- a condition related to the state of speech recognition for acquiring the content of the utterance from the voice information
- a condition related to the content of the utterance
- a condition related to the elapsed time since the voice information was obtained
- FIGS. 9A to 9C are explanatory diagrams for explaining an example of the summary processing according to the first information processing method, and show an outline of the start timing of the summary processing.
- FIGS. 9A to 9C show an outline of processing in each start condition.
- (1-1) First example of the start condition: an example in which the start condition is a condition related to a non-speech period
- Examples of the condition related to a non-speech period include a condition related to the length of the non-speech period.
- When the predetermined start condition is a condition related to a non-speech period, the information processing apparatus according to the present embodiment determines that the start condition is satisfied when the non-speech period exceeds a set predetermined period, or when the non-speech period is equal to or longer than the set predetermined period.
- the period according to the first example of the start condition may be a fixed period that is set in advance, or may be a variable period that can be changed based on a user operation or the like.
- The “silent period” shown in A of FIG. 9A corresponds to the non-speech period.
- Specifically, the information processing apparatus according to the present embodiment detects, based on the voice information, a voice section in which voice is present. Then, when a silent period exceeding the set period, or a silent period equal to or longer than the set period, is detected after the voice section is detected, the information processing apparatus according to the present embodiment starts the summarization process, with this detection serving as a trigger for starting the summarization process (hereinafter referred to as a “summary trigger”).
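- The first start condition (a silent period after a voice section exceeding a set period) can be sketched over per-frame voice-activity flags as follows; the frame length and the 1500 [ms] threshold are assumed values, not figures from the text.

```python
def detect_summary_trigger(frame_is_voice, frame_ms=100, silence_threshold_ms=1500):
    """Return the frame index at which the summary trigger fires, or None.

    frame_is_voice: booleans per audio frame (True = voice present).
    The trigger fires only after a voice section has been observed and the
    subsequent silent run reaches the set period.
    """
    seen_voice = False
    silent_ms = 0
    for idx, is_voice in enumerate(frame_is_voice):
        if is_voice:
            seen_voice = True
            silent_ms = 0  # any voice resets the silent run
        elif seen_voice:
            silent_ms += frame_ms
            if silent_ms >= silence_threshold_ms:
                return idx
    return None
```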
- (1-2) Second example of the start condition: an example in which the start condition is a first condition related to the state of voice recognition
- Examples of the first condition related to the state of voice recognition include a condition related to detection of a voice recognition stop request.
- When the predetermined start condition is the first condition related to the state of voice recognition, the information processing apparatus according to the present embodiment determines that the start condition is satisfied based on the detection of a voice recognition stop request.
- the information processing apparatus according to the present embodiment determines that the start condition is satisfied, for example, when a voice recognition stop request is detected.
- The information processing apparatus according to the present embodiment starts the summarization process, as a summary trigger, when, after voice recognition is started based on the “speech recognition start operation” illustrated in B of FIG. 9A, a speech recognition stop request including a speech recognition stop command based on the “speech recognition stop operation” is detected.
- the voice recognition start operation and the voice recognition stop operation include an operation on an arbitrary UI (User Interface) related to voice recognition.
- a speech recognition stop request is not limited to being obtained based on the speech recognition stop operation.
- For example, a speech recognition stop request may be generated by a device that performs speech recognition processing when an error occurs during the speech recognition processing or when interrupt processing is entered during the speech recognition processing.
- (1-3) Third example of the start condition: an example in which the start condition is a second condition related to the state of voice recognition
- Examples of the second condition related to the state of voice recognition include a condition related to completion of voice recognition.
- the information processing apparatus determines that the start condition is satisfied based on the completion of the voice recognition.
- the information processing apparatus determines that the start condition is satisfied, for example, when the completion of voice recognition is detected.
- More specifically, the information processing apparatus according to the present embodiment starts the summarization process, as a summary trigger, when the result of the voice recognition process is obtained, as indicated by “acquisition of speech recognition result” in A of FIG.
- (1-4) Fourth example of the start condition: an example in which the start condition is a first condition related to the content of the utterance
- Examples of the first condition related to the content of the utterance include a condition related to detection of a predetermined word based on the content of the utterance indicated by the voice information.
- When the predetermined start condition is the first condition related to the content of the utterance, the information processing apparatus according to the present embodiment determines that the start condition is satisfied based on the detection of a predetermined word from the content of the utterance indicated by the voice information. More specifically, the information processing apparatus according to the present embodiment determines that the start condition is satisfied, for example, when a predetermined word is detected from the content of the utterance indicated by the voice information.
- Examples of the predetermined word relating to the first condition regarding the content of the utterance include a word called a filler word.
- The predetermined words related to the first condition related to the content of the utterance may be preset fixed words that cannot be added, deleted, or changed, or may be words that can be added, deleted, or changed based on a user operation or the like.
- “Et” shown in B of FIG. 9B corresponds to an example of a filler word (an example of the predetermined word).
- The information processing apparatus according to the present embodiment starts the summarization process using, as a summary trigger, for example, the case where a filler word is detected from the character string indicated by the voice text information obtained based on the voice information.
- (1-5) Fifth example of the start condition: an example in which the start condition is a second condition related to the content of the utterance
- Examples of the second condition related to the content of the utterance include a condition related to detection of stagnation based on the content of the utterance indicated by the voice information.
- the information processing apparatus determines that the start condition is satisfied based on the detection of stagnation based on the voice information.
- the information processing apparatus according to the present embodiment determines that the start condition is satisfied, for example, when stagnation is detected based on audio information.
- The information processing apparatus according to the present embodiment detects stagnation based on the voice information by an arbitrary method capable of detecting or estimating stagnation based on the voice information, such as “a method of detecting a voiced pause (including syllable extension) from the voice information” or “a method of detecting, from the character string indicated by the voice text information obtained based on the voice information, words associated with stagnation”.
- Then, the information processing apparatus according to the present embodiment starts the summarization process using, as a summary trigger, for example, the case where it is estimated that there is stagnation.
- (1-6) Sixth example of the start condition: an example in which the start condition is a condition related to the elapsed time since the voice information was obtained
- Examples of the condition related to the elapsed time since the voice information was obtained include a condition related to the length of the elapsed time.
- When the predetermined start condition is a condition related to the elapsed time since the voice information was obtained, the information processing apparatus according to the present embodiment determines that the start condition is satisfied when the elapsed time exceeds a predetermined period, or when the elapsed time is equal to or longer than the predetermined period.
- the period according to the sixth example of the start condition may be a fixed period set in advance or a variable period that can be changed based on a user operation or the like.
- More specifically, the information processing apparatus according to the present embodiment starts the summarization process using, as a summary trigger, for example, the case where a predetermined time has elapsed since it was detected that the voice information was obtained.
- The start condition may be a condition combining two or more of the start condition according to the first example shown in (1-1) to the start condition according to the sixth example shown in (1-6) above.
- In this case, the information processing apparatus according to the present embodiment starts the summarization process using, as a summary trigger, the case where any one of the combined start conditions is satisfied.
- When it is determined that a set exclusion condition for the summary processing (hereinafter referred to as a “summary exclusion condition”) is satisfied, the information processing apparatus according to the present embodiment does not perform the summarization process.
- the summary exclusion condition for example, a condition related to gesture detection can be given.
- the information processing apparatus determines that the summary exclusion condition is satisfied when a predetermined gesture that has been set is detected.
- the predetermined gesture related to the summary exclusion condition may be a fixed gesture set in advance, or may be added, deleted, or changed based on a user operation or the like.
- The information processing apparatus according to the present embodiment determines whether or not a predetermined gesture related to the summary exclusion condition has been performed by, for example, performing image processing on a captured image obtained by imaging with an imaging device, or by estimating a motion based on the detection result of a motion sensor such as an acceleration sensor or an angular velocity sensor.
- summary exclusion condition according to the present embodiment is not limited to the above-described conditions related to gesture detection.
- For example, the summary exclusion condition according to the present embodiment may be an arbitrary condition set as the summary exclusion condition, such as “an operation for invalidating the function of performing the summary processing, such as pressing a button for invalidating that function, has been performed” or “the processing load of the information processing apparatus according to the present embodiment is larger than a set threshold”.
- The information processing apparatus according to the present embodiment changes the level of the summary of the content of the utterance (or the degree of summarization of the content of the utterance; the same applies hereinafter) based on one or both of the utterance period specified based on the voice information and the number of characters specified based on the voice information. In other words, the information processing apparatus according to the present embodiment changes the level of the summary of the content of the utterance based on at least one of the utterance period specified based on the voice information and the number of characters specified based on the voice information.
- the information processing apparatus changes the level of utterance content summarization by, for example, limiting the number of characters indicated by the summarized utterance content.
- the information processing apparatus limits the number of characters indicated by the summarized utterance content, for example, by preventing the number of characters indicated by the summarized utterance content from exceeding the set upper limit value. .
- By limiting the number of characters indicated by the summarized content of the utterance, it is possible to automatically reduce the number of characters indicated by the summarized content of the utterance, that is, the amount of the summary.
- the utterance period is specified, for example, by detecting a voice section in which voice is present based on voice information. Further, the number of characters corresponding to the utterance is specified by counting the number of characters in the character string indicated by the speech text information based on the speech information.
- When changing the level of the summary of the content of the utterance based on the utterance period, the information processing apparatus according to the present embodiment changes the level of the summary, for example, when the utterance period exceeds a set predetermined period, or when the utterance period is equal to or longer than the set predetermined period.
- The above period used when changing the level of the summary of the content of the utterance based on the utterance period may be a fixed period set in advance, or may be a variable period that can be changed based on a user operation or the like.
- When changing the level of the summary of the content of the utterance based on the number of characters specified based on the voice information, the information processing apparatus according to the present embodiment changes the level of the summary, for example, when the number of characters is larger than a set threshold, or when the number of characters is equal to or larger than the set threshold.
- The threshold used when changing the level of the summary of the content of the utterance based on the number of characters specified based on the voice information may be a preset fixed threshold, or may be a variable threshold that can be changed based on a user operation or the like.
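- A minimal sketch of changing the summary level by limiting the character count might look as follows. All concrete thresholds and limits here are illustrative assumptions; the text only states that the level changes when the utterance period or character count exceeds a set threshold, and that the summarized content is kept within an upper limit.

```python
def summary_char_limit(utterance_period_s, char_count,
                       period_threshold_s=10.0, char_threshold=100,
                       default_limit=80, reduced_limit=40):
    """Pick an upper limit on summary characters from period / char count."""
    if utterance_period_s > period_threshold_s or char_count > char_threshold:
        return reduced_limit  # long input -> summarize more aggressively
    return default_limit

def truncate_summary(summary, limit):
    """Keep the summarized content within the limit (simple truncation sketch)."""
    return summary if len(summary) <= limit else summary[:limit]
```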
- The information processing apparatus according to the present embodiment can further perform translation processing for translating the content of the utterance summarized by the summarization processing according to the first information processing method into another language. As described above, the information processing apparatus according to the present embodiment translates the first language corresponding to the utterance into a second language different from the first language.
- In the translation processing according to the present embodiment, the reliability of the translation result may be set for each translation unit.
- The translation unit is the unit in which translation is performed in the translation processing.
- Examples of the translation unit include a set fixed unit, such as each word or each of one or more phrases.
- the translation unit may be dynamically set according to, for example, a language (first language) corresponding to the utterance.
- the translation unit may be changeable based on, for example, a user setting operation.
- The reliability of the translation result is, for example, an index indicating the certainty of the translation result, expressed as a value from 0 [%] (indicating the lowest reliability) to 100 [%] (indicating the highest reliability).
- The reliability of the translation result is obtained, for example, using an arbitrary machine learning result, such as a result of machine learning that uses feedback on past translation results. Note that the reliability of the translation result is not limited to being obtained using machine learning, and may be obtained by any method capable of estimating the certainty of the translation result.
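One possible representation of a per-translation-unit reliability, as described above, is sketched below. The dataclass, field names, and sample values are assumptions for illustration; the embodiment only requires that a reliability be set for each translation unit.

```python
from dataclasses import dataclass

@dataclass
class TranslationUnit:
    source: str         # text in the first language corresponding to the utterance
    translated: str     # text in the second language
    reliability: float  # certainty of the translation result, 0 (lowest) to 100 (highest)

# Hypothetical sample data in the spirit of the figures ("Recommendation", "Asakusa").
units = [
    TranslationUnit("おすすめ", "Recommendation", 92.0),
    TranslationUnit("浅草", "Asakusa", 35.0),
]
```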
- the information processing apparatus can perform, for example, one or both of the following (i) and (ii) as the translation processing.
- The information processing apparatus according to the present embodiment does not perform the translation process when it determines that a set exclusion condition for the translation process is satisfied.
- Exclusion conditions for the translation process according to the present embodiment include, for example, conditions relating to gesture detection.
- The information processing apparatus according to the present embodiment determines that the exclusion condition for the translation process is satisfied when a set predetermined gesture is detected.
- the predetermined gesture related to the translation processing may be a fixed gesture set in advance, or may be added, deleted, or changed based on a user operation or the like.
- Examples of the fixed gesture set in advance include gestures related to non-verbal communication, such as hand signs and other hand gestures.
- The information processing apparatus according to the present embodiment determines whether or not a predetermined gesture related to the translation process has been performed, for example, by performing image processing on a captured image obtained by imaging with an imaging device, or by estimating a motion based on the detection result of a motion sensor such as an acceleration sensor or an angular velocity sensor.
- exclusion conditions for the translation processing according to the present embodiment are not limited to the conditions relating to gesture detection as described above.
- The exclusion condition for the translation process according to the present embodiment may be an arbitrary condition set as an exclusion condition for the translation process, such as “an operation for invalidating the function of performing the translation process, such as pressing a button for invalidating that function, is detected” or “the processing load of the information processing apparatus according to the present embodiment has become larger than a set threshold”.
- the exclusion conditions for the translation processing according to the present embodiment may be the same conditions as the summary exclusion conditions according to the above-described embodiment, or may be different conditions.
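The exclusion conditions above (gesture detection, a disabling operation, and processing load) can be combined into a single check, sketched below under assumed names and an assumed load threshold.

```python
def translation_excluded(gesture_detected: bool,
                         disable_operation_detected: bool,
                         processing_load: float,
                         load_threshold: float = 0.9) -> bool:
    """Return True when any set exclusion condition is satisfied,
    in which case the translation process is not performed."""
    return (gesture_detected
            or disable_operation_detected
            or processing_load > load_threshold)
```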
- the information processing apparatus can also retranslate content translated into another language into a language before translation.
- For example, when an operation for performing the retranslation process is detected, such as the pressing of a button for performing retranslation, the information processing apparatus according to the present embodiment retranslates the translated content into the language before translation.
- the retranslation trigger is not limited to the detection of the operation for performing the retranslation processing as described above.
- the information processing apparatus according to the present embodiment can automatically perform retranslation based on the reliability of the translation result set for each translation unit.
- For example, the information processing apparatus according to the present embodiment performs retranslation, using as a trigger the case where the reliability of the translation result set for each translation unit is equal to or less than a set threshold, or less than the set threshold.
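The reliability-triggered retranslation described above can be sketched as follows. `retranslate` is a hypothetical callback standing in for the actual translation engine, and the threshold value is an assumption.

```python
def retranslate_low_confidence(units, retranslate, threshold=50.0):
    """units: list of (text, reliability) pairs for each translation unit.
    Returns the list with low-reliability entries retranslated via the
    `retranslate` callback, which returns a new (text, reliability) pair."""
    result = []
    for text, reliability in units:
        if reliability <= threshold:
            result.append(retranslate(text))  # retranslation is triggered here
        else:
            result.append((text, reliability))
    return result
```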
- the information processing apparatus may perform a summary process using the result of the re-translation.
- For example, when words included in the retranslated content are also included in the utterance content indicated by voice information acquired after the retranslation, the information processing apparatus according to the present embodiment includes those words in the summarized utterance content.
- The summarization process using the result of retranslation as described above realizes, for example, an adjustment such that “when the same words as before the retranslation appear in the content uttered by the user, those words are not deleted from the summary corresponding to the utterance”.
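The adjustment described above, keeping words that appear in the retranslated content from being deleted by summarization, can be sketched as follows; representing the utterance as a list of word tokens is an assumption for illustration.

```python
def summarize_keeping_retranslated(utterance_words, candidate_summary_words,
                                   retranslated_words):
    """Return the candidate summary, extended with any utterance words that
    also appear in the retranslated content but were dropped by summarization."""
    keep = set(retranslated_words)
    summary = list(candidate_summary_words)
    for word in utterance_words:
        if word in keep and word not in summary:
            summary.append(word)  # protected from deletion by the adjustment
    return summary
```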
- The information processing apparatus according to the present embodiment notifies the utterance content indicated by the voice information that has been summarized by the summarization process according to the first information processing method.
- the information processing apparatus notifies the translation result.
- the information processing apparatus notifies the notification content by one or both of notification by a visual method and notification by an auditory method, for example.
- FIG. 10 is an explanatory diagram showing an example of notification by a visual method realized by the notification control process according to the second information processing method.
- FIG. 10 shows an example when the information processing apparatus according to the present embodiment displays the translation result on the display screen of the smartphone.
- the information processing apparatus can perform one or more of the following processes (I) to (VII) as the notification control process, for example.
- In the following, the case where the information processing apparatus according to the present embodiment notifies the translation result will be described as an example.
- the information processing apparatus according to the present embodiment can also notify the content of the summarized utterance before translation in the same manner as when the translation result is notified.
- FIG. 11 to FIG. 21 are explanatory diagrams for explaining an example of the notification control processing according to the second information processing method.
- an example of the notification control process according to the second information processing method will be described with reference to FIGS. 11 to 21 as appropriate.
- Notification in the word order of the translated language: the information processing apparatus according to the present embodiment notifies the translation result in a word order corresponding to the other, translated language.
- the word order corresponding to the other translated languages may be a fixed word order set in advance, or may be changeable based on a user operation or the like.
- The information processing apparatus according to the present embodiment notifies the translation result based on the reliability for each translation unit by performing, for example, one or both of the following processes (II-1) and (II-2).
- When the translation result is displayed visually on the display screen of a display device, the information processing apparatus according to the present embodiment realizes preferential notification of translation results with high reliability by the manner in which the translation results are displayed.
- When the translation result is audibly notified by voice from a voice output device, the information processing apparatus according to the present embodiment may realize preferential notification of translation results with high reliability by the order of notification, for example.
- An example of the notification realized by the notification control process based on the reliability for each translation unit according to the first example will be described, taking as an example the case where the translation result is displayed visually on the display screen of the display device.
- FIG. 11 shows a first example in the case where the translation result is displayed on the display screen of the display device, and shows an example in which the translation result with high reliability is notified preferentially.
- “Recommendation”, “Sightseeing”, “Directions”, “Tell me”, and “Asakusa” correspond to the translation results for each translation unit.
- FIG. 11 shows an example in which lower reliability is set in the order of “recommended”, “sightseeing”, “direction”, “tell me”, and “Asakusa”.
- The information processing apparatus according to the present embodiment generates a display in which the translation results for each translation unit are hierarchically arranged in order of decreasing reliability, and displays it on the display screen.
- hierarchical display is realized by threshold processing using, for example, reliability for each translation unit and one or more thresholds related to determination of the hierarchy to be displayed.
- the threshold for hierarchical display may be a fixed value set in advance, or may be a variable value that can be changed based on a user operation or the like.
- When displaying the translation results for a plurality of translation units on the same hierarchy, the information processing apparatus according to the present embodiment displays them in a predetermined order, for example, “arranged from left to right in the display screen area corresponding to the hierarchy, in order of decreasing reliability”.
- When, as a result of the threshold processing, there are a plurality of translation results whose reliability is greater than a predetermined threshold, or a plurality of translation results whose reliability is equal to or higher than the predetermined threshold, the information processing apparatus according to the present embodiment may display the plurality of translation results together in a predetermined area of the display screen, for example, as shown in B of FIG. 11.
- The predetermined threshold may be, for example, one of the one or more thresholds used for the threshold processing.
- examples of the predetermined area include “display screen area corresponding to a hierarchy associated with the threshold processing based on the predetermined threshold”.
- By the display shown in FIG. 11, “the translation result for each translation unit for which a high reliability (corresponding to a score) in the translation process is set is displayed higher, and translation results for translation units whose reliability exceeds a predetermined threshold are displayed together” is realized.
- The display example in the case where the translation result with high reliability is preferentially notified is not limited to the example shown in FIG. 11.
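The threshold processing that assigns translation units to display hierarchies, with high-reliability results ordered first within a hierarchy, can be sketched as follows. The two threshold values are assumptions.

```python
def layer_by_reliability(units, thresholds=(80.0, 50.0)):
    """units: list of (text, reliability) pairs.
    Returns a list of display layers (top layer first): a unit is placed in
    the first layer whose threshold its reliability exceeds, otherwise in the
    bottom layer. Within a layer, units are arranged in order of decreasing
    reliability (e.g. left to right on the display screen)."""
    layers = [[] for _ in range(len(thresholds) + 1)]
    for text, reliability in units:
        for i, th in enumerate(thresholds):
            if reliability > th:
                layers[i].append((text, reliability))
                break
        else:
            layers[-1].append((text, reliability))
    for layer in layers:
        layer.sort(key=lambda u: -u[1])
    return layers
```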
- When the translation result is displayed visually on the display screen of the display device, the information processing apparatus according to the present embodiment realizes notification emphasized according to the reliability by the manner of display.
- When the translation result is audibly notified by voice from the voice output device, the information processing apparatus according to the present embodiment may realize notification emphasized according to the reliability, for example, by changing the sound pressure, the volume, and the like of the voice based on the reliability.
- An example of the notification realized by the notification control process based on the reliability for each translation unit according to the second example will be described, taking as an example the case where the translation result is displayed visually on the display screen of the display device.
- the information processing apparatus emphasizes and displays the translation result according to the reliability by, for example, “displaying each translation result for each translation unit in a size corresponding to the reliability”.
- FIG. 12 shows a second example in which the translation result is displayed on the display screen of the display device, and shows a first example in which the translation result is displayed in an emphasized manner according to the reliability.
- “Recommendation”, “Sightseeing”, “Directions”, “Tell me”, and “Asakusa” correspond to the translation results for each translation unit.
- FIG. 12 shows an example in which lower reliability is set in the order of “recommended”, “sightseeing”, “direction”, “tell me”, and “Asakusa”.
- FIG. 12 shows an example in which, in addition to the notification control process based on the reliability for each translation unit according to the first example, the information processing apparatus according to the present embodiment displays each translation result for each translation unit in a size corresponding to the reliability. Note that, when performing the notification control process based on the reliability for each translation unit according to the second example, the information processing apparatus according to the present embodiment need not, of course, preferentially notify translation results with high reliability as in the hierarchical display shown in FIG. 11.
- The information processing apparatus according to the present embodiment displays each translation result for each translation unit in a size corresponding to the reliability, for example, as shown in FIG. 12.
- The information processing apparatus according to the present embodiment refers to, for example, “a table (or database) in which the reliability is associated with the display size used when displaying a translation result for each translation unit on the display screen”, and displays the translation result for each translation unit in a size corresponding to the reliability.
- By the display shown in FIG. 12, “the translation result for each translation unit for which a high reliability (corresponding to a score) in the translation process is set is displayed higher, and its size is changed so that it is more conspicuous” is realized.
- The display example in the case of displaying each translation result for each translation unit in a size corresponding to the reliability is not limited to the example shown in FIG. 12.
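The table-based association of reliability with display size described above can be sketched as a simple banded lookup. The band boundaries and the pixel/point values are assumptions standing in for the table (or database) mentioned in the text.

```python
# Assumed reliability bands and display sizes; each row pairs a lower bound
# (exclusive) with the size to use when reliability exceeds that bound.
SIZE_TABLE = [
    (80.0, {"box_px": 160, "font_pt": 24}),  # reliability > 80
    (50.0, {"box_px": 120, "font_pt": 18}),  # reliability > 50
    (0.0,  {"box_px": 80,  "font_pt": 12}),  # otherwise
]

def display_size_for(reliability: float) -> dict:
    """Return the display size associated with the given reliability."""
    for lower_bound, size in SIZE_TABLE:
        if reliability > lower_bound:
            return size
    return SIZE_TABLE[-1][1]
```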
- The information processing apparatus according to the present embodiment may also emphasize the translation result according to the reliability by “displaying each translation result for each translation unit so that translation results with high reliability are displayed in the foreground”.
- FIG. 13 shows a third example in which the translation result is displayed on the display screen of the display device, and shows a second example in which the translation result is displayed in an emphasized manner according to the reliability.
- “Recommendation”, “Sightseeing”, “Direction”, “Tell me”, “Asakusa”, etc. correspond to the translation results for each translation unit.
- FIG. 13 shows an example in which lower reliability is set in the order of “recommendation”, “sightseeing”, “direction”, “tell me”, “Asakusa”, and so on.
- FIG. 13 shows an example in which, in addition to the notification control process based on the reliability for each translation unit according to the first example, the information processing apparatus according to the present embodiment displays translation results with higher reliability on the near side of the display screen.
- Note that the information processing apparatus according to the present embodiment need not, of course, preferentially notify translation results with high reliability as in the hierarchical display shown in FIG. 11.
- The information processing apparatus according to the present embodiment displays translation results with high reliability on the near side of the display screen, for example, as follows.
- The information processing apparatus according to the present embodiment refers to, for example, “a table (or database) in which the reliability is associated with the coordinate value in the depth direction used when displaying the translation result for each translation unit on the display screen”, and displays each translation result for each translation unit so that translation results with high reliability are displayed on the near side of the display screen.
- By the display shown in FIG. 13, “the translation result for each translation unit for which a high reliability (corresponding to a score) in the translation process is set is displayed on the near side in the depth direction of the display screen, so that translation results for which a high reliability is set are more conspicuous” is realized.
- Needless to say, the display example in the case of displaying each translation result for each translation unit so that translation results with high reliability are displayed in the foreground of the display screen is not limited to the example shown in FIG. 13.
- The information processing apparatus according to the present embodiment may also emphasize the translation result according to the reliability, for example, by “displaying each translation result for each translation unit in one or both of a color according to the reliability and a transparency according to the reliability”.
- FIG. 14 shows a fourth example when the translation result is displayed on the display screen of the display device, and shows a third example when the translation result is displayed in an emphasized manner according to the reliability.
- “recommendation”, “tourism”, “direction”, “tell me”, and “Asakusa” correspond to the translation results for each translation unit.
- FIG. 14 shows an example in which lower reliability is set in the order of “recommendation”, “tourism”, “direction”, “tell me”, and “Asakusa”.
- FIG. 14 shows an example in which, in addition to the notification control process based on the reliability for each translation unit according to the first example, the information processing apparatus according to the present embodiment further displays the translation result for each translation unit in one or both of a color according to the reliability and a transparency according to the reliability.
- Note that the information processing apparatus according to the present embodiment need not, of course, preferentially notify translation results with high reliability as in the hierarchical display shown in FIG. 11.
- the information processing apparatus displays each translation result for each translation unit in a color corresponding to the reliability.
- the information processing apparatus according to the present embodiment may display each translation result for each translation unit with a transparency according to the reliability.
- the information processing apparatus according to the present embodiment can display, for example, each translation result for each translation unit with a color according to the reliability and a transparency according to the reliability.
- The information processing apparatus according to the present embodiment refers to, for example, “a table (or database) in which the reliability, the color used when displaying the translation result for each translation unit on the display screen, and the transparency used when displaying the translation result for each translation unit on the display screen are associated with one another”, and displays each translation result for each translation unit in one or both of a color corresponding to the reliability and a transparency corresponding to the reliability.
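The association of reliability with color and transparency can also be sketched as a linear interpolation rather than a table; the endpoint gray levels and the alpha range below are assumptions, not values from the embodiment.

```python
def color_and_alpha(reliability: float):
    """Map reliability 0..100 to an (r, g, b) color between light gray (low)
    and black (high), and to an alpha in 0.2..1.0, so that more reliable
    translation results are darker and more opaque."""
    t = max(0.0, min(reliability, 100.0)) / 100.0
    gray = int(round(180 * (1.0 - t)))  # 180 -> light gray, 0 -> black
    alpha = 0.2 + 0.8 * t
    return (gray, gray, gray), alpha
```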
- Notification control processing based on audio information
- The information processing apparatus according to the present embodiment controls how the notification content is displayed, based on the voice information.
- The information processing apparatus according to the present embodiment controls how the notification content is displayed based on the voice information, for example, by “displaying the notification content in a size corresponding to the sound pressure or volume specified from the voice information”.
- The information processing apparatus according to the present embodiment refers to, for example, “a table (or database) in which the sound pressure or volume, the display size used when displaying the divided text, and the font size are associated with one another”, and displays the notification content in a size corresponding to the sound pressure or volume specified from the voice information.
- In the same manner as when controlling how the notification content is displayed, the information processing apparatus according to the present embodiment can control how the translation result is displayed based on the voice information.
- FIG. 15 shows a fifth example in which the translation result is displayed on the display screen of the display device, and shows an example in which the translation result is displayed with emphasis based on the audio information.
- “recommendation”, “tourism”, “direction”, “tell me”, and “Asakusa” correspond to the translation results for each translation unit.
- FIG. 15 shows an example when the sound pressure or volume is lower in the order of “Tell me”, “Direction”, “Recommendation”, “Sightseeing”, and “Asakusa”, for example.
- The information processing apparatus according to the present embodiment displays the translation result for each translation unit (the translated, summarized utterance content) in a size corresponding to the sound pressure or volume specified from the voice information.
- The information processing apparatus according to the present embodiment refers to, for example, “a table (or database) in which the sound pressure or volume, the display size used when displaying the translation result for each translation unit, and the font size are associated with one another”, and displays the translation result in a size corresponding to the sound pressure or volume specified from the voice information.
- By the display shown in FIG. 15, “the font and the display size are made larger so that portions with a higher sound pressure (or volume) are more conspicuous” is realized.
- The display example in the case of controlling how the content is displayed based on the voice information is not limited to the example shown in FIG. 15.
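Displaying text in a size corresponding to the sound pressure or volume specified from the voice information can be sketched as a linear mapping from a decibel range to a font-size range; the ranges below are assumptions.

```python
def font_size_from_volume(volume_db: float,
                          min_db: float = 40.0, max_db: float = 80.0,
                          min_pt: int = 12, max_pt: int = 36) -> int:
    """Map a measured volume (dB) onto a font size in points, clamping to the
    assumed [min_db, max_db] range so louder portions are displayed larger."""
    t = (volume_db - min_db) / (max_db - min_db)
    t = max(0.0, min(t, 1.0))
    return round(min_pt + t * (max_pt - min_pt))
```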
- Examples of the operations performed on the display screen include arbitrary operations that can be performed on the screen, such as operations using operation input devices (for example, buttons, direction keys, a mouse, and a keyboard) and operations on the display screen itself (when the display device is a touch panel).
- The information processing apparatus according to the present embodiment changes the content displayed on the display screen based on an operation performed on the display screen, for example, by performing one or both of the following processes (IV-1) and (IV-2).
- (IV-1) First Example of Notification Control Processing Based on Operation Performed on Display Screen
- The information processing apparatus according to the present embodiment changes the content displayed on the display screen based on an operation performed on the display screen. Examples of changing the content displayed on the display screen according to the present embodiment include one or more of the following:
- Changing the display position of the notification content on the display screen (or changing the display position of the translation result on the display screen)
- Deleting a part of the notification content displayed on the display screen (or deleting a part of the translation result displayed on the display screen)
- The information processing apparatus according to the present embodiment can change the display position of the notification content on the display screen (or the display position of the translation result on the display screen) based on an operation performed on the display screen; for example, the content to be presented to the communication partner can be changed manually.
- The information processing apparatus according to the present embodiment can also delete a part of the notification content displayed on the display screen (or a part of the translation result displayed on the display screen) based on an operation performed on the display screen; for example, a translation result in which a mistranslation has occurred can be deleted manually.
- FIGS. 16A to 16C show examples of display screens in the case where contents displayed on the display screen are changed based on operations performed on the display screen.
- FIG. 16A shows an example of a display when the translation result for each translation unit by the translation process is re-translated.
- FIG. 16B shows an example of display in the case where a part of the translation result (translated summarized utterance content) for each translation unit displayed on the display screen is deleted.
- FIG. 16C shows an example of display when the display position of the translation result (translated summarized utterance content) for each translation unit displayed on the display screen is changed.
- a case where the user desires to delete “recommendation” which is a part of the translation result for each translation unit displayed on the display screen is taken as an example.
- When the user selects “Recommendation”, a window W for selecting whether or not to delete it is displayed, as shown in A of FIG. 16B.
- When deletion is selected, “Recommendation”, which is a part of the translation result, is deleted, as shown in B of FIG. 16B.
- the example of deleting a part of the translation result for each translation unit displayed on the display screen is not limited to the example shown in FIG. 16B.
- a case where the user desires to change the display position of “Recommend” and “Tell me” in the translation results for each translation unit displayed on the display screen will be described as an example.
- When the user selects “Tell me” as indicated by reference numeral O1 in A of FIG. 16C and then designates the position indicated by reference numeral O2 in B of FIG. 16C by a drag operation, the display positions of “Recommendation” and “Tell me” are switched, as shown in B of FIG. 16C.
- the example of changing the display position of the translation result for each translation unit displayed on the display screen is not limited to the example shown in FIG. 16C.
- When one part of the notification content is displayed on the display screen, the information processing apparatus according to the present embodiment changes the content displayed on the display screen based on an operation performed on the display screen.
- The information processing apparatus according to the present embodiment changes the content displayed on the display screen, for example, by changing the displayed notification content from the one part to another part.
- FIGS. 17 and 18 show examples of display screens in the case of changing the translation result (the translated, summarized utterance content) for each translation unit based on an operation performed on the display screen.
- FIG. 17 shows an example of a display screen in which the content displayed on the display screen can be changed by a slider-type UI.
- FIG. 18 shows an example of a display screen in which the content displayed on the display screen can be changed by a revolver type UI whose display changes by rotating in the depth direction of the display screen.
- A case where the user desires to change the content displayed on the display screen is taken as an example.
- The user operates the slider-type UI, for example by touching an arbitrary part of the slider shown in A of FIG. 17, to change the translation result displayed on the display screen from one part to another part.
- Similarly, a case where the user desires to change the content displayed on the display screen is taken as an example.
- The user operates the revolver-type UI, for example by performing a flick operation as indicated by reference numeral O1 in FIG. 18, to change the translation result displayed on the display screen from one part to another part.
- the example of changing the translation result displayed on the display screen is not limited to the examples shown in FIGS.
- The information processing apparatus according to the present embodiment may audibly notify the translation result from a voice output device based on an operation by voice.
- FIG. 19 shows an example of the case where the translation result is audibly notified based on the voice operation.
- FIG. 19 shows an example in which the content to be notified to the communication partner is selected from the translation results for each translation unit by the translation processing based on the operation by voice.
- When the translation results for each translation unit by the translation process are “recommendation”, “tourism”, “direction”, and “tell me”, the information processing apparatus according to the present embodiment notifies the retranslated result by voice, as indicated by reference numeral “I1” in A of FIG. 19. At this time, the information processing apparatus according to the present embodiment may insert sound feedback, as indicated by the symbol “S” in A of FIG. 19, at the divisions of the divided text.
- When a selection operation by voice as indicated by the symbol “O” in B of FIG. 19 is detected after the retranslated result has been notified by voice, the information processing apparatus according to the present embodiment outputs, from the voice output device, a voice indicating the translation result corresponding to the selection operation by voice, as indicated by reference numeral “I2”.
- B in FIG. 19 shows an example of a selection operation by voice for designating a number to be notified to the communication partner.
- the example of the selection operation by voice according to the present embodiment is not limited to the example described above.
- FIG. 20 shows another example in the case where the translation result is audibly notified based on the voice operation.
- FIG. 20 shows an example in which the content to be notified to the communication partner is excluded from the translation results for each translation unit by the translation processing based on the voice operation.
- When the translation results for each translation unit by the translation process are “recommendation”, “tourism”, “direction”, and “tell me”, the information processing apparatus according to the present embodiment notifies the retranslated result by voice, as indicated by reference numeral “I1” in A of FIG. 20. Note that the information processing apparatus according to the present embodiment may insert sound feedback at the divisions of the divided text, as in A of FIG. 19.
- When an exclusion operation by voice as indicated by the symbol “O” in B of FIG. 20 is detected after the retranslated result has been notified by voice, the information processing apparatus according to the present embodiment outputs, from the voice output device, a voice indicating the translation results that reflect the exclusion operation by voice, as indicated by reference numeral “I2”.
- B in FIG. 20 shows an example of an excluding operation by voice for designating a number that does not require notification to the communication partner.
- the example of the audio exclusion operation according to the present embodiment is not limited to the example described above.
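The selection operation by voice (FIG. 19) and the exclusion operation by voice (FIG. 20), both of which designate a number, can be sketched together as follows. The recognized command format ("N" to select, "not N" to exclude) is an assumption for illustration.

```python
def apply_voice_command(results, command: str):
    """results: list of translation-result strings, numbered from 1 in the
    order they were read out. A command of the form 'N' keeps only item N
    (selection operation); 'not N' removes item N (exclusion operation)."""
    tokens = command.split()
    if tokens[0] == "not":            # exclusion operation, as in FIG. 20
        index = int(tokens[1]) - 1
        return [r for i, r in enumerate(results) if i != index]
    index = int(tokens[0]) - 1        # selection operation, as in FIG. 19
    return [results[index]]
```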
- The information processing apparatus according to the present embodiment can also dynamically control the notification order of the notification content.
- the information processing apparatus controls the notification order of notification contents based on at least one of information corresponding to the first user and information corresponding to the second user, for example.
- the information corresponding to the first user includes, for example, at least one of information regarding the first user, information regarding the application, and information regarding the device.
- the information corresponding to the second user includes at least one of information on the second user, information on the application, and information on the device.
- the information related to the first user indicates, for example, one or both of the situation where the first user is placed and the state of the first user.
- the information related to the second user indicates, for example, one or both of the situation where the second user is placed and the state of the second user.
- the information related to the application indicates, for example, the execution state of the application.
- the information about the device indicates one or both of the device type and the device state, for example.
- the process of estimating the situation where the user is placed may be performed by the information processing apparatus according to the present embodiment, or may be performed by an external device of the information processing apparatus according to the present embodiment.
- the user state is estimated by, for example, an arbitrary behavior estimation process or an arbitrary emotion estimation process using one or more of the user's biological information, the detection result of a motion sensor, a captured image captured by an imaging device, and the like.
- FIG. 21 shows an example of display when the notification order is dynamically controlled.
- FIG. 21A shows an example of the case where the translation result (translated summarized utterance content) for each translation unit is displayed based on the state of the user.
- B of FIG. 21 shows an example of a case where the translation result for each translation unit by the translation process is displayed based on the execution state of an application.
- FIG. 21C shows an example of a case where the translation result for each translation unit by the translation process is displayed based on the situation where the user is placed.
- FIG. 21A shows an example of display based on the state of the user when the translation results for each translation unit are “recommended”, “tourist”, “direction”, and “tell me”.
- the information processing apparatus according to the present embodiment preferentially displays the verb, for example by displaying it on the leftmost side of the display screen, as illustrated in A of FIG. 21.
- the information processing apparatus specifies the notification order by referring to, for example, “a table (or database) in which the user status and information indicating the display order are associated with each other”.
- FIG. 21B shows an example of display based on the execution state of the application when the translation results for each translation unit are “Hokkaido”, “Origin”, “Delicious”, and “Fish”.
- when the type of application being executed is recognized as a “meal browser”, the information processing apparatus according to the present embodiment preferentially displays the adjective, for example by displaying it on the leftmost side of the display screen, as shown in B of FIG. 21.
- the information processing apparatus according to the present embodiment specifies the notification order by referring to, for example, “a table (or database) in which an application type and information indicating a display order are associated with each other”.
- FIG. 21C shows an example of display based on the situation where the user is placed when the translation results for each translation unit are “Hurry”, “Shibuya”, “Collecting”, and “No time”.
- when noise detected from the voice information (for example, sound other than the voice based on the speech) is larger than a set threshold, the information processing apparatus according to the present embodiment recognizes that the user is placed in a noisy situation. Then, the information processing apparatus according to the present embodiment preferentially displays the noun (or proper noun), for example by displaying it on the leftmost side of the display screen, as shown in C of FIG. 21.
- the information processing apparatus according to the present embodiment specifies the notification order by referring to, for example, “a table (or database) in which an environment where a user is placed and information indicating a display order are associated”.
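The table-based specification of the notification order described above can be sketched as a plain lookup; the context keys and part-of-speech orders below are hypothetical illustrations patterned on the examples of FIG. 21 (the specification does not give the actual table contents):

```python
# Sketch of the "table (or database) in which a context and information
# indicating a display order are associated". All keys and orders are
# hypothetical stand-ins.
DISPLAY_ORDER_TABLE = {
    ("user_state", "walking"): ["verb", "noun", "adjective", "adverb"],
    ("app_type", "meal_browser"): ["adjective", "noun", "verb", "adverb"],
    ("environment", "noisy"): ["noun", "verb", "adjective", "adverb"],
}

def order_translation_units(units, context_kind, context_value):
    """Sort (text, part_of_speech) pairs into the display order for the context.

    Units whose part of speech is not listed keep their relative order at the end.
    """
    order = DISPLAY_ORDER_TABLE.get((context_kind, context_value))
    if order is None:
        return list(units)  # no table entry: keep the original order
    rank = {pos: i for i, pos in enumerate(order)}
    # sorted() is stable, so units with equal rank keep their original order
    return sorted(units, key=lambda unit: rank.get(unit[1], len(order)))
```

Under this sketch, in the noisy situation of C of FIG. 21 the nouns move to the head of the order, i.e., to the leftmost position of the display screen.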
- when the notification order is dynamically controlled based on two or more of the situation where the user is placed, the state of the user, and the execution state of the application (that is, when the notification order is dynamically controlled based on a plurality of pieces of information), the information processing apparatus according to the present embodiment specifies the notification order based on, for example, the priority (or degree of priority) set for each of the situation where the user is placed, the state of the user, and the execution state of the application.
- the information processing apparatus according to the present embodiment causes the notification content corresponding to an index having a high priority (or degree of priority) to be preferentially notified.
- FIG. 21 shows an example of the notification by the visual method.
- the information processing apparatus according to the present embodiment can also perform the notification by the auditory method.
- the information processing apparatus according to the present embodiment can also dynamically control the notification order based on the information about the device.
- as an example of dynamically controlling the notification order based on the information about the device, dynamically controlling the notification order according to the processing load of the processor can be cited.
- the information processing apparatus according to the present embodiment can also dynamically control the information amount of the notification content.
- the information processing apparatus according to the present embodiment dynamically controls the information amount of the notification content based on, for example, one or more of the summary information, the information corresponding to the first user, the information corresponding to the second user, and the voice information. Examples of the dynamic change of the information amount include the following (VII-1) to (VII-5). Needless to say, examples of dynamically changing the information amount are not limited to the examples shown in the following (VII-1) to (VII-5).
- (VII-1) Example of dynamic change of notification content based on summary information
- the information processing apparatus according to the present embodiment does not notify an instruction word (or the translation result of the instruction word) when, for example, instruction words such as “that” and “it” are included in the content of the summarized utterance indicated by the summary information.
- the information processing apparatus according to the present embodiment does not notify a word corresponding to a greeting (or the translation result of the word corresponding to the greeting) when, for example, the content of the summarized utterance indicated by the summary information includes the word corresponding to the greeting.
- (VII-2) Example of dynamic change of notification content based on information corresponding to first user
- the information processing apparatus according to the present embodiment reduces the amount of information when notifying the notification content when, for example, the facial expression of the first user is determined to be laughter.
- the information processing apparatus according to the present embodiment does not notify the notification content when, for example, it is determined that the first user's line of sight is facing upward (an example of a case where it is determined that the first user's utterance is close to a monologue).
- the information processing apparatus according to the present embodiment does not notify the notification content when a gesture (for example, a pointing gesture) corresponding to an instruction word such as “that”, “it”, or “this” is detected.
- the information processing apparatus according to the present embodiment notifies all of the notification contents when, for example, it is determined that the first user is placed in a situation where the noise is high.
- (VII-3) Example of dynamic change of notification content based on information corresponding to second user
- the information processing apparatus according to the present embodiment reduces the amount of information when notifying the notification content when, for example, the facial expression of the second user is determined to be laughter.
- the information processing apparatus according to the present embodiment increases the amount of information when notifying the notification content when it is determined that the second user may not understand the utterance content (for example, when it is determined that the second user's line of sight is not directed at the first user).
- the information processing apparatus according to the present embodiment reduces the amount of information when notifying the notification content when, for example, it is determined that the second user is yawning (for example, a case where it is determined that the second user is bored).
- the information processing apparatus according to the present embodiment increases the amount of information when notifying the notification content when, for example, it is determined that the second user has nodded or given a back-channel response.
- the information processing apparatus according to the present embodiment increases the amount of information when notifying the notification content when it is determined that the size of the pupil of the second user is larger than a predetermined size, or is equal to or greater than the predetermined size (an example of a case where it is determined that the second user is interested).
- the information processing apparatus according to the present embodiment increases the amount of information when notifying the notification content when it is determined that the second user may not understand the utterance content (for example, when it is determined that the second user's hand is not moving).
- the information processing apparatus according to the present embodiment increases the amount of information when notifying the notification content when, for example, it is determined that the second user's body is tilted forward (an example of a case where it is determined that the second user is interested).
- the information processing apparatus according to the present embodiment notifies all of the notification contents when, for example, it is determined that the second user is placed in a situation where the noise is high.
- (VII-4) Example of dynamic change of notification content based on voice information
- the information processing apparatus according to the present embodiment does not notify the notification content when the volume of the utterance detected from the voice information is larger than a predetermined threshold, or when the volume of the utterance is equal to or higher than the predetermined threshold.
- the information processing apparatus according to the present embodiment notifies some or all of the notification contents when, for example, the volume of the utterance detected from the voice information is smaller than the predetermined threshold, or when the volume of the utterance is equal to or lower than the predetermined threshold.
- (VII-5) Example of dynamic change of notification content based on a combination of a plurality of pieces of information
- the information processing apparatus according to the present embodiment increases the amount of information to be notified when, for example, the first user and the second user are different and it is determined that the line of sight of the first user matches the line of sight of the second user (an example of dynamic change of the notification content based on the information corresponding to the first user and the information corresponding to the second user).
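The dynamic control of the information amount in examples (VII-1) to (VII-3) above can be sketched as a rule-based filter. The word lists, flags, and reduction rule below are hypothetical illustrations; the specification only states that the amount is increased or reduced, not how:

```python
# Hypothetical instruction-word and greeting lists for example (VII-1).
INSTRUCTION_WORDS = {"that", "it", "this"}
GREETINGS = {"hello", "good morning"}

def control_information_amount(summary_words, speaker_laughing=False,
                               listener_interested=False, pointing_gesture=False):
    """Return the words of the notification content after amount control."""
    # (VII-1): instruction words and greetings are not notified.
    words = [w for w in summary_words
             if w.lower() not in INSTRUCTION_WORDS | GREETINGS]
    if pointing_gesture:
        # (VII-2): a gesture corresponding to an instruction word was
        # detected, so nothing is notified.
        return []
    if speaker_laughing and not listener_interested:
        # (VII-2): reduce the amount of information, here by keeping
        # only the first half of the words.
        return words[: max(1, len(words) // 2)]
    return words
```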
- FIGS. 22 to 33 are flowcharts showing an example of processing related to the information processing method according to the present embodiment.
- an example of processing according to the information processing method according to the present embodiment will be described with reference to FIGS. 22 to 33 as appropriate.
- the information processing apparatus sets a weight related to summarization (hereinafter, sometimes referred to as “weight for summarization function” or simply “weight”) (S100, presetting).
- the information processing apparatus according to the present embodiment sets the weight related to the summary by determining the weight and holding the determined weight in a recording medium such as a storage unit (described later).
- An example of the process in step S100 is the process shown in FIG. 23.
- the information processing apparatus acquires data indicating schedule contents from a schedule application (S200).
- the information processing apparatus according to the present embodiment refers to the behavior recognized from the acquired data indicating the schedule contents and to a table for specifying the type of weight related to the summary illustrated in FIG. 5 (hereinafter referred to as “behavior information summary weight table”), and determines the type of weight related to the summary (S202).
- the information processing apparatus according to the present embodiment determines the weight related to the summary (S204) by referring to the type of weight related to the summary determined in step S202 and a table for specifying the weight related to the summary illustrated in FIG. 6 (hereinafter referred to as “summary table”).
- the information processing apparatus according to the present embodiment performs, for example, the process illustrated in FIG. 23 as the process of step S100 in FIG. 22. Needless to say, the process of step S100 in FIG. 22 is not limited to the process shown in FIG. 23.
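The flow of steps S200 to S204 amounts to two table lookups, which can be sketched as plain dictionaries. The behaviors, weight types, and weight values below are hypothetical stand-ins, not the actual contents of the tables in the drawings:

```python
# Hypothetical behavior information summary weight table (S202):
# recognized behavior -> type of weight related to the summary.
BEHAVIOR_WEIGHT_TYPE_TABLE = {
    "travel": "tourism",
    "meeting": "business",
}
# Hypothetical summary table (S204):
# type of weight -> the weight related to the summary itself.
SUMMARY_TABLE = {
    "tourism": {"place_noun": 2.0, "verb": 1.5},
    "business": {"proper_noun": 2.0, "verb": 1.2},
}

def set_summary_weight_from_schedule(schedule_behavior):
    """Determine the weight related to the summary from a recognized behavior."""
    weight_type = BEHAVIOR_WEIGHT_TYPE_TABLE.get(schedule_behavior)
    if weight_type is None:
        return None  # no recognized behavior: no weight is determined
    return SUMMARY_TABLE.get(weight_type)
```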
- the information processing apparatus validates voice input by, for example, starting an application related to voice input (S102).
- the information processing apparatus determines whether audio information has been acquired (S104). If it is not determined in step S104 that the voice information has been acquired, the information processing apparatus according to the present embodiment does not proceed with the processes in and after step S106 until it is determined that the voice information has been acquired, for example.
- the information processing apparatus analyzes the voice information (S106).
- the information processing apparatus according to the present embodiment obtains, for example, sound pressure, pitch, average frequency band, and the like by analyzing audio information.
- the information processing apparatus according to the present embodiment holds the audio information in a recording medium such as a storage unit (described later) (S108).
- the information processing apparatus sets a weight related to summarization based on voice information or the like (S110).
- An example of the process in step S110 is the process shown in FIG. 24.
- the information processing apparatus according to the present embodiment sets a weight related to the summary based on, for example, the average frequency band of the voice indicated by the voice information (hereinafter sometimes referred to as “input voice”) (S300).
- An example of the process in step S300 is the process shown in FIG. 25.
- although FIG. 24 shows an example in which the process of step S302 is performed after the process of step S300, the process of step S110 of FIG. 22 is not limited to the process shown in FIG. 24. For example, the information processing apparatus according to the present embodiment can perform the process of step S300 after the process of step S302, and can also perform the process of step S300 and the process of step S302 in parallel.
- the information processing apparatus determines whether or not the average frequency band of voice is 300 [Hz] to 550 [Hz] (S400).
- if it is determined in step S400 that the average frequency band of the voice is 300 [Hz] to 550 [Hz], the information processing apparatus according to the present embodiment determines “male” as the type of weight related to the summary (S402).
- if it is not determined in step S400 that the average frequency band of the voice is 300 [Hz] to 550 [Hz], the information processing apparatus according to the present embodiment determines whether or not the average frequency band of the voice is 400 [Hz] to 700 [Hz] (S404).
- if it is determined in step S404 that the average frequency band of the voice is 400 [Hz] to 700 [Hz], the information processing apparatus according to the present embodiment determines “female” as the type of weight related to the summary (S406).
- if it is not determined in step S404 that the average frequency band of the voice is 400 [Hz] to 700 [Hz], the information processing apparatus according to the present embodiment does not determine the weight related to the summary.
- the information processing apparatus according to the present embodiment performs, for example, the process shown in FIG. 25 as the process of step S300 of FIG. 24. Needless to say, the process of step S300 in FIG. 24 is not limited to the process shown in FIG. 25.
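The determination of steps S400 to S406 can be sketched directly from the flow. Note that the two bands overlap between 400 and 550 Hz; because the checks are applied in flowchart order, the sketch resolves that range to “male”:

```python
def weight_type_from_average_frequency(avg_hz):
    """Steps S400-S406: determine the type of weight related to the summary
    from the average frequency band of the input voice."""
    if 300 <= avg_hz <= 550:
        return "male"    # S402
    if 400 <= avg_hz <= 700:
        return "female"  # S406
    return None          # neither band: no weight is determined
```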
- as another part of the process of step S110 in FIG. 22, the information processing apparatus according to the present embodiment sets a weight related to the summary based on the sound pressure of the voice indicated by the voice information (S302).
- An example of the processing in step S302 is the processing shown in FIG. 26.
- the information processing apparatus according to the present embodiment determines a threshold value related to sound pressure based on the distance between the user who is the speaker and the communication partner (S500).
- An example of the process in step S500 is the process shown in FIG. 27.
- the information processing apparatus according to the present embodiment acquires the distance D to the current communication partner by image recognition based on the captured image captured by the imaging device (S600).
- the information processing apparatus according to the present embodiment performs, for example, the calculation of Equation 2 below (S602).
- the information processing apparatus according to the present embodiment performs the calculation of Equation 3 below, and determines the threshold values related to sound pressure by adjusting the threshold value VPWR_thresh_upper related to sound pressure and the threshold value VPWR_thresh_lowre related to sound pressure (S604).
- the information processing apparatus according to the present embodiment performs, for example, the process shown in FIG. 27 as the process of step S500 of FIG. 26. Needless to say, the process of step S500 in FIG. 26 is not limited to the process shown in FIG. 27.
- the information processing apparatus determines whether or not the sound pressure of the sound indicated by the sound information is greater than or equal to a threshold VPWR_thresh_upper related to the sound pressure (S502).
- if it is determined in step S502 that the sound pressure of the voice indicated by the voice information is equal to or higher than the threshold VPWR_thresh_upper related to the sound pressure, the information processing apparatus according to the present embodiment determines “anger” and “joy” as the types of weight related to the summary (S504).
- if it is not determined in step S502 that the sound pressure of the voice indicated by the voice information is equal to or higher than the threshold VPWR_thresh_upper related to the sound pressure, the information processing apparatus according to the present embodiment determines whether or not the sound pressure of the voice indicated by the voice information is equal to or lower than the threshold value VPWR_thresh_lowre related to the sound pressure (S506).
- if it is determined in step S506 that the sound pressure of the voice indicated by the voice information is equal to or lower than the threshold VPWR_thresh_lowre related to the sound pressure, the information processing apparatus according to the present embodiment determines “sadness”, “uncomfortable”, “pain”, and “anxiety” as the types of weight related to the summary (S508).
- if it is not determined in step S506 that the sound pressure of the voice indicated by the voice information is equal to or lower than the threshold VPWR_thresh_lowre related to the sound pressure, the information processing apparatus according to the present embodiment does not determine the weight related to the summary.
- the information processing apparatus according to the present embodiment performs, for example, the process shown in FIG. 26 as the process of step S302 of FIG. 24. Needless to say, the process of step S302 in FIG. 24 is not limited to the process shown in FIG. 26.
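The branching of steps S502 to S508 can be sketched as follows. The identifier VPWR_thresh_lowre is kept as it appears in the text; the threshold values themselves would come from step S500 and are passed in here:

```python
def weight_types_from_sound_pressure(vpwr, vpwr_thresh_upper, vpwr_thresh_lowre):
    """Steps S502-S508: map the sound pressure of the input voice to
    emotion weight types for the summary."""
    if vpwr >= vpwr_thresh_upper:
        return ["anger", "joy"]                                 # S504
    if vpwr <= vpwr_thresh_lowre:
        return ["sadness", "uncomfortable", "pain", "anxiety"]  # S508
    return []  # between the thresholds: no weight is determined
```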
- next, the description of the example of the process of step S110 in FIG. 22 is continued.
- the information processing apparatus according to the present embodiment analyzes the voice information and holds the number of moras and the locations of the accents (S304). Note that the process of step S304 may be performed in the process of step S106 of FIG. 22.
- the information processing apparatus according to the present embodiment performs, for example, the process shown in FIG. 24 as the process of step S110 of FIG. 22. Needless to say, the process of step S110 in FIG. 22 is not limited to the process shown in FIG. 24.
- the information processing apparatus performs voice recognition on voice information (S112).
- the voice text information is acquired by performing the process of step S112.
- step S112 When the process of step S112 is performed, the information processing apparatus according to the present embodiment sets a weight related to the summary based on the speech recognition result and the like (S114).
- An example of the process in step S114 is the process shown in FIG. 28.
- the information processing apparatus sets a weight for summarization based on the language of the character string indicated by the speech text information (S700).
- An example of the process in step S700 is the process shown in FIG. 29.
- although FIG. 28 shows an example in which the processing of steps S704 to S710 is performed after the processing of steps S700 and S702, the processing of step S114 of FIG. 22 is not limited to the processing shown in FIG. 28.
- since the processes of steps S700 and S702 and the processes of steps S704 to S710 are independent processes, the information processing apparatus according to the present embodiment can perform the processes of steps S700 and S702 after the processes of steps S704 to S710, or can perform the processes of steps S700 and S702 and the processes of steps S704 to S710 in parallel.
- the information processing apparatus estimates the language of the character string indicated by the voice text information (S800).
- the information processing apparatus estimates a language by a process related to an arbitrary method capable of estimating a language from a character string, such as estimation based on matching with a language dictionary.
- the information processing apparatus determines whether or not the estimated language is Japanese (S802).
- if it is determined in step S802 that the estimated language is Japanese, the information processing apparatus according to the present embodiment determines the weight related to the summary so that the weight of “Japanese verbs” is high (S804).
- if it is not determined in step S802 that the estimated language is Japanese, the information processing apparatus according to the present embodiment determines whether the estimated language is English (S806).
- if it is determined in step S806 that the estimated language is English, the information processing apparatus according to the present embodiment determines the weight related to the summary so that the weight of “English nouns and verbs” is high (S808).
- if it is not determined in step S806 that the estimated language is English, the information processing apparatus according to the present embodiment does not determine the weight related to the summary.
- the information processing apparatus according to the present embodiment performs, for example, the process shown in FIG. 29 as the process of step S700 of FIG. 28. Needless to say, the process of step S700 of FIG. 28 is not limited to the process shown in FIG. 29.
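The language-dependent weighting of steps S802 to S808 can be sketched as a small dispatch; the returned structure is a hypothetical encoding of “raise the weight of these parts of speech”:

```python
def summary_weight_from_language(language):
    """Steps S802-S808: determine the weight related to the summary so that
    parts of speech typical of the estimated language are weighted highly."""
    if language == "japanese":
        return {"japanese_verb": "high"}                         # S804
    if language == "english":
        return {"english_noun": "high", "english_verb": "high"}  # S808
    return None  # other languages: no weight is determined
```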
- next, the description of the example of the process in step S114 in FIG. 22 is continued.
- the information processing apparatus according to the present embodiment analyzes the voice information and holds the number of moras and the locations of the accents (S702). Note that the process of step S702 may be performed in the process of step S106 of FIG. 22.
- the information processing apparatus according to the present embodiment divides the character string indicated by the voice text information (hereinafter sometimes referred to as “voice text result”) into morpheme units by natural language processing, and links each morpheme with the corresponding analysis result of the voice information (S704).
- the information processing apparatus estimates an emotion based on the analysis result of the voice information linked in units of morphemes in step S704 (S706).
- the information processing apparatus according to the present embodiment estimates the emotion by an arbitrary method capable of estimating an emotion by using the analysis result of the voice information, such as a method of using a table in which an analysis result of voice information and an emotion are associated with each other.
- the information processing apparatus according to the present embodiment determines the strength of the weight related to the summary (the strength of the weight related to the emotion) based on the analysis result of the voice information linked in units of morphemes in step S704 (S708).
- the information processing apparatus determines the strength of the weight related to the summary based on the change rate of the fundamental frequency, the change rate of the sound, and the change rate of the utterance time in the analysis result of the speech information.
- the information processing apparatus according to the present embodiment determines the strength of the weight related to the summary by an arbitrary method capable of determining the strength by using the analysis result of the voice information, such as a method of using a table in which an analysis result of voice information and a strength of the summary weight are associated with each other.
- the information processing apparatus determines a summary weight based on the emotion estimated in step S706 (S710). Further, the information processing apparatus according to the present embodiment may adjust the weight related to the summary determined based on the estimated emotion by the strength of the weight related to the summary determined in step S708.
- the information processing apparatus according to the present embodiment performs, for example, the process shown in FIG. 28 as the process of step S114 of FIG. 22. Needless to say, the process of step S114 in FIG. 22 is not limited to the process shown in FIG. 28.
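Assuming the emotion-to-weight association is a simple lookup and the strength determined in step S708 acts as a scale factor (one possible reading of the adjustment described above), steps S706 to S710 might be sketched as follows. The table contents are hypothetical:

```python
# Hypothetical association between an estimated emotion and the weight
# related to the summary; the actual table is not given in the text.
EMOTION_WEIGHT_TABLE = {
    "anger": {"verb": 2.0},
    "joy": {"adjective": 2.0},
}

def summary_weight_from_emotion(estimated_emotion, strength):
    """Steps S706-S710: determine the summary weight from the estimated
    emotion (S710) and adjust it by the strength determined in S708."""
    base = EMOTION_WEIGHT_TABLE.get(estimated_emotion)
    if base is None:
        return None  # unknown emotion: no weight is determined
    return {pos: weight * strength for pos, weight in base.items()}
```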
- the information processing apparatus performs a summarization process based on the weights related to the summaries determined in steps S100, S110, and S114 (S116).
- when the process of step S116 is performed, the information processing apparatus according to the present embodiment determines whether or not to perform the translation process (S118).
- if it is not determined in step S118 that the translation process is to be performed, the information processing apparatus according to the present embodiment notifies the summary result by the notification control process (S120).
- if it is determined in step S118 that the translation process is to be performed, the information processing apparatus according to the present embodiment performs the translation process on the summary result and notifies the translation result by the notification control process (S122).
- An example of the process of step S122 is the process shown in FIG. 30.
- the information processing apparatus performs morphological analysis, for example, by performing natural language processing on the summary result (S900).
- the information processing apparatus according to the present embodiment generates divided texts in which a main part of speech (noun, verb, adjective, or adverb) and other morphemes are combined, until no unprocessed summary result remains (S902).
- the information processing apparatus determines whether or not the language of the summary result is English (S904).
- if it is not determined in step S904 that the language of the summary result is English, the information processing apparatus according to the present embodiment performs the process of step S908 described later.
- if it is determined in step S904 that the language of the summary result is English, the information processing apparatus according to the present embodiment sets words corresponding to 5W1H as divided texts (S906).
- when it is not determined in step S904 that the language of the summary result is English, or when the process of step S906 is performed, the information processing apparatus according to the present embodiment performs the translation process on each divided text, and holds the translation result linked with the part-of-speech information of the original text before translation (S908).
- the information processing apparatus determines whether or not the language of the divided translation text (an example of the translation result) is English (S910).
- if it is determined in step S910 that the language of the divided translation text is English, the information processing apparatus according to the present embodiment determines the notification order in English (S912).
- An example of the process in step S912 is the process shown in FIG. 31.
- the information processing apparatus determines whether there is a divided translated text to be processed (S1000).
- the divided translation text to be processed in step S1000 corresponds to an unprocessed translation result among the translation results for each translation unit.
- the information processing apparatus according to the present embodiment determines that there is a divided translation text to be processed when, for example, there is an unprocessed translation result, and determines that there is no divided translation text to be processed when there is no unprocessed translation result.
- if it is determined in step S1000 that there is a divided translation text to be processed, the information processing apparatus according to the present embodiment acquires the divided translation text to be processed next (S1002).
- the information processing apparatus according to the present embodiment determines whether or not the divided translation text to be processed includes a noun (S1004).
- if it is determined in step S1004 that the divided translation text to be processed includes a noun, the information processing apparatus according to the present embodiment sets the priority to the maximum value “5” (S1006). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1000.
- if it is not determined in step S1004 that the divided translation text to be processed includes a noun, the information processing apparatus according to the present embodiment determines whether or not the divided translation text to be processed includes a verb (S1008).
- if it is determined in step S1008 that the divided translation text to be processed includes a verb, the information processing apparatus according to the present embodiment sets the priority to “4” (S1010). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1000.
- if it is not determined in step S1008 that the divided translation text to be processed includes a verb, the information processing apparatus according to the present embodiment determines whether or not the divided translation text to be processed includes an adjective (S1012).
- if it is determined in step S1012 that the divided translation text to be processed includes an adjective, the information processing apparatus according to the present embodiment sets the priority to “3” (S1014). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1000.
- if it is not determined in step S1012 that the divided translation text to be processed includes an adjective, the information processing apparatus according to the present embodiment determines whether or not the divided translation text to be processed includes an adverb (S1016).
- if it is determined in step S1016 that the divided translation text to be processed includes an adverb, the information processing apparatus according to the present embodiment sets the priority to “2” (S1018). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1000.
- if it is not determined in step S1016 that the divided translation text to be processed includes an adverb, the information processing apparatus according to the present embodiment sets the priority to the minimum value “1” (S1020). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1000.
- if it is not determined in step S1000 that there is a divided translation text to be processed, the information processing apparatus according to the present embodiment sorts the notification order according to the set priorities (S1022).
- The information processing apparatus performs, for example, the process illustrated in FIG. 31 as the process of step S912 in FIG. 30. Needless to say, the process of step S912 in FIG. 30 is not limited to the process shown in FIG. 31.
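The priority-setting loop of FIG. 31 can be sketched as follows. This is a hypothetical illustration, not the patented implementation: each divided translated text is assigned a priority from the part of speech it contains (noun = 5, verb = 4, adjective = 3, adverb = 2, otherwise 1), and the notification order is then sorted by descending priority (step S1022). How the part-of-speech tags are obtained (the tagger) is outside this sketch.

```python
# Hypothetical sketch of the English notification-order process of FIG. 31.
# Priority table per steps S1004-S1018 (noun-first order).
EN_PRIORITY = [("noun", 5), ("verb", 4), ("adjective", 3), ("adverb", 2)]

def priority(pos_tags, table=EN_PRIORITY):
    """Priority of one divided translated text from its part-of-speech tags."""
    for pos, prio in table:
        if pos in pos_tags:
            return prio
    return 1  # minimum value "1" (step S1020)

def notification_order(divided_texts):
    """Sort divided translated texts by descending priority (step S1022)."""
    return sorted(divided_texts, key=lambda t: priority(t["pos"]), reverse=True)

# Illustrative data; the "pos" sets are assumed outputs of a POS tagger.
texts = [
    {"text": "very",  "pos": {"adverb"}},
    {"text": "run",   "pos": {"verb"}},
    {"text": "store", "pos": {"noun", "verb"}},
]
print([t["text"] for t in notification_order(texts)])  # ['store', 'run', 'very']
```

Note that a text containing several parts of speech takes the highest-ranked one, because the flow of FIG. 31 checks nouns before verbs and so on.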
- If it is determined in step S910 that the language is Japanese, the information processing apparatus according to the present embodiment determines the notification order for Japanese (S914).
- An example of the process in step S914 is the process shown in FIG. 32.
- The information processing apparatus determines whether or not there is a divided translated text to be processed, similarly to step S1000 of FIG. 31 (S1100).
- The divided translated text to be processed in step S1100 corresponds to an unprocessed translation result among the translation results for each translation unit.
- When it is determined in step S1100 that there is a divided translated text to be processed, the information processing apparatus according to the present embodiment acquires the divided translated text to be processed next (S1102).
- The information processing apparatus determines whether or not the divided translated text to be processed includes a verb (S1104).
- When it is determined in step S1104 that the divided translated text to be processed includes a verb, the information processing apparatus according to the present embodiment sets the priority to the maximum value “5” (S1106). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1100.
- If it is not determined in step S1104 that the divided translated text to be processed includes a verb, the information processing apparatus according to the present embodiment determines whether the divided translated text to be processed includes a noun (S1108).
- When it is determined in step S1108 that the divided translated text to be processed includes a noun, the information processing apparatus according to the present embodiment sets the priority to “4” (S1110). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1100.
- If it is not determined in step S1108 that the divided translated text to be processed includes a noun, the information processing apparatus according to the present embodiment determines whether the divided translated text to be processed includes an adjective (S1112).
- When it is determined in step S1112 that the divided translated text to be processed includes an adjective, the information processing apparatus according to the present embodiment sets the priority to “3” (S1114). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1100.
- If it is not determined in step S1112 that the divided translated text to be processed includes an adjective, the information processing apparatus according to the present embodiment determines whether the divided translated text to be processed includes an adverb (S1116).
- When it is determined in step S1116 that the divided translated text to be processed includes an adverb, the information processing apparatus according to the present embodiment sets the priority to “2” (S1118). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1100.
- If it is not determined in step S1116 that the divided translated text to be processed includes an adverb, the information processing apparatus according to the present embodiment sets the priority to the minimum value “1” (S1120). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1100.
- If it is not determined in step S1100 that there is a divided translated text to be processed, the information processing apparatus sorts the notification order according to the set priorities (S1122).
- The information processing apparatus performs, for example, the process illustrated in FIG. 32 as the process of step S914 in FIG. 30. Needless to say, the process of step S914 in FIG. 30 is not limited to the process shown in FIG. 32.
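As a rough, hypothetical illustration of FIG. 32: the Japanese notification order follows the same flow as the English one, but with a verb-first priority table (verb = 5, noun = 4, adjective = 3, adverb = 2, otherwise 1). The part-of-speech tags below are assumed to come from a separate morphological-analysis step.

```python
# Hypothetical sketch of the priority assignment of FIG. 32 (Japanese order).
# Only the table differs from the English case: the verb is ranked highest.
JA_PRIORITY = [("verb", 5), ("noun", 4), ("adjective", 3), ("adverb", 2)]

def ja_priority(pos_tags):
    """Priority of one divided translated text from its part-of-speech tags."""
    for pos, prio in JA_PRIORITY:
        if pos in pos_tags:
            return prio
    return 1  # minimum value "1" (step S1120)

# A text containing both a verb and a noun takes the verb's priority.
print(ja_priority({"noun", "verb"}))  # 5
```

This table swap is the only difference from the FIG. 31 flow; sorting by descending priority (step S1122) is unchanged.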
- The information processing apparatus according to the present embodiment then causes, by the notification control process, notification of the divided translated texts for which the notification order has been determined (S916).
- An example of the process in step S916 is the process shown in FIG. 33.
- the information processing apparatus determines whether or not there is a divided translated text to be processed, similar to step S1000 in FIG. 31 (S1200).
- the divided translation text to be processed in step S1200 corresponds to an unprocessed translation result among the translation results for each translation unit.
- When it is determined in step S1200 that there is a divided translated text to be processed, the information processing apparatus according to the present embodiment acquires the divided translated text to be processed next (S1202).
- the information processing apparatus acquires the sound pressure from the speech information corresponding to the divided translated text to be processed, and increases the sound pressure of the divided translated text to be processed for output (S1204).
- the information processing apparatus determines whether or not the divided translated text output in step S1204 is the last divided translated text (S1206).
- For example, when there is an unprocessed translation result, the information processing apparatus determines that the text is not the last divided translated text; when there is no unprocessed translation result, it determines that the text is the last divided translated text.
- If it is not determined in step S1206 that the text is the last divided translated text, the information processing apparatus according to the present embodiment outputs a “beep” sound as sound feedback for notifying that the notification will continue (S1208). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1200.
- If it is determined in step S1206 that the text is the last divided translated text, the information processing apparatus according to the present embodiment outputs a “beep” sound as sound feedback for notifying the end (S1210). Then, the information processing apparatus according to the present embodiment repeats the processing from step S1200.
- If it is not determined in step S1200 that there is a divided translated text to be processed, the information processing apparatus according to the present embodiment ends the process of FIG. 33.
- The information processing apparatus performs, for example, the process shown in FIG. 33 as the process of step S916 in FIG. 30. Needless to say, the process of step S916 in FIG. 30 is not limited to the process shown in FIG. 33.
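The notification flow of FIG. 33 amounts to: output each divided translated text with raised sound pressure, then play one feedback sound while more texts follow and a different one at the end. A minimal hypothetical sketch, with real audio output replaced by an event list for illustration:

```python
# Hypothetical sketch of the notification control process of FIG. 33.
# Audio output is modeled as a list of events rather than actual sound.

def notify(divided_texts):
    """Return the sequence of output events for the divided translated texts."""
    events = []
    last = len(divided_texts) - 1
    for i, text in enumerate(divided_texts):
        # Step S1204: output the text with increased sound pressure.
        events.append(("speak", text))
        if i < last:
            # Step S1208: sound feedback meaning "more will follow".
            events.append(("beep", "continue"))
        else:
            # Step S1210: sound feedback meaning "end of notification".
            events.append(("beep", "end"))
    return events

print(notify(["Keep", "to the right"]))
# [('speak', 'Keep'), ('beep', 'continue'), ('speak', 'to the right'), ('beep', 'end')]
```

The distinct feedback sounds let the listener know, without reading anything, whether further divided translated texts are coming.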
- The use cases described with reference to FIGS. 1 to 5 can be realized by performing the processes described above. Needless to say, the processing related to the information processing method according to the present embodiment is not limited to those processes.
- FIG. 34 is a block diagram illustrating an example of the configuration of the information processing apparatus 100 according to the present embodiment.
- the information processing apparatus 100 includes, for example, a communication unit 102 and a control unit 104.
- The information processing apparatus 100 may further include, for example, a ROM (Read Only Memory, not shown), a RAM (Random Access Memory, not shown), a storage unit (not shown), an operation unit (not shown) operable by the user of the information processing apparatus 100, and a display unit (not shown) that displays various screens on a display screen.
- the information processing apparatus 100 connects the above constituent elements by, for example, a bus as a data transmission path.
- the information processing apparatus 100 is driven by, for example, power supplied from an internal power supply such as a battery provided in the information processing apparatus 100, or power supplied from a connected external power supply.
- a ROM (not shown) stores control data such as a program used by the control unit 104 and calculation parameters.
- a RAM (not shown) temporarily stores a program executed by the control unit 104.
- the storage unit (not shown) is a storage unit included in the information processing apparatus 100.
- The storage unit (not shown) stores various data, such as a table for setting weights related to summarization, other data related to the information processing method according to the present embodiment, and various applications.
- examples of the storage unit (not shown) include a magnetic recording medium such as a hard disk, and a non-volatile memory such as a flash memory. Further, the storage unit (not shown) may be detachable from the information processing apparatus 100.
- Examples of the operation unit (not shown) include an operation input device described later.
- Examples of the display unit (not shown) include a display device described later.
- FIG. 35 is an explanatory diagram illustrating an example of a hardware configuration of the information processing apparatus 100 according to the present embodiment.
- the information processing apparatus 100 includes, for example, an MPU 150, a ROM 152, a RAM 154, a recording medium 156, an input / output interface 158, an operation input device 160, a display device 162, and a communication interface 164.
- the information processing apparatus 100 connects each component with a bus 166 as a data transmission path, for example.
- The MPU 150 includes, for example, one or more processors configured with an arithmetic circuit such as an MPU, various processing circuits, and the like, and functions as the control unit 104 that controls the entire information processing apparatus 100. The MPU 150 also plays the role of, for example, the processing unit 110 described later in the information processing apparatus 100.
- the processing unit 110 may be configured with a dedicated (or general-purpose) circuit (for example, a processor separate from the MPU 150) that can realize the processing of the processing unit 110.
- the ROM 152 stores programs used by the MPU 150, control data such as calculation parameters, and the like.
- the RAM 154 temporarily stores a program executed by the MPU 150, for example.
- The recording medium 156 functions as a storage unit (not shown) and stores various data, such as a table for setting weights related to summarization and other data related to the information processing method according to the present embodiment, as well as various applications.
- examples of the recording medium 156 include a magnetic recording medium such as a hard disk and a non-volatile memory such as a flash memory. Further, the recording medium 156 may be detachable from the information processing apparatus 100.
- the input / output interface 158 connects, for example, the operation input device 160 and the display device 162.
- the operation input device 160 functions as an operation unit (not shown)
- the display device 162 functions as a display unit (not shown).
- Examples of the input/output interface 158 include a USB (Universal Serial Bus) terminal, a DVI (Digital Visual Interface) terminal, an HDMI (High-Definition Multimedia Interface) (registered trademark) terminal, and various processing circuits.
- the operation input device 160 is provided on the information processing apparatus 100, for example, and is connected to the input / output interface 158 inside the information processing apparatus 100.
- Examples of the operation input device 160 include a button, a direction key, a rotary selector such as a jog dial, or a combination thereof.
- the display device 162 is provided on the information processing apparatus 100, for example, and is connected to the input / output interface 158 inside the information processing apparatus 100.
- Examples of the display device 162 include a liquid crystal display (Liquid Crystal Display) and an organic EL display (Organic Electro-Luminescence Display; also called an OLED display (Organic Light Emitting Diode Display)).
- the input / output interface 158 can be connected to an external device such as an operation input device (for example, a keyboard or a mouse) external to the information processing apparatus 100 or an external display device.
- the display device 162 may be a device capable of display and user operation, such as a touch panel.
- The communication interface 164 is a communication unit included in the information processing apparatus 100, and functions as the communication unit 102 for performing wireless or wired communication with, for example, an external apparatus or an external device via a network (or directly).
- Examples of the communication interface 164 include a communication antenna and an RF (Radio Frequency) circuit (wireless communication), an IEEE 802.15.1 port and a transmission/reception circuit (wireless communication), an IEEE 802.11 port and a transmission/reception circuit (wireless communication), and a LAN (Local Area Network) terminal and a transmission/reception circuit (wired communication).
- the information processing apparatus 100 performs a process related to the information processing method according to the present embodiment, for example, with the configuration illustrated in FIG. Note that the hardware configuration of the information processing apparatus 100 according to the present embodiment is not limited to the configuration illustrated in FIG.
- the information processing apparatus 100 may not include the communication interface 164 when communicating with an external apparatus or the like via a connected external communication device.
- the communication interface 164 may be configured to be able to communicate with one or more external devices or the like by a plurality of communication methods.
- the information processing apparatus 100 can have a configuration that does not include the recording medium 156, the operation input device 160, and the display device 162, for example.
- The information processing apparatus 100 may further include, for example, one or more of various sensors such as a motion sensor or a biological sensor, a voice input device such as a microphone, a voice output device such as a speaker, a vibration device, and an imaging device.
- part or all of the configuration shown in FIG. 35 may be realized by one or two or more ICs.
- the communication unit 102 is a communication unit included in the information processing apparatus 100, and performs wireless or wired communication with, for example, an external apparatus or an external device via a network (or directly).
- the communication of the communication unit 102 is controlled by the control unit 104, for example.
- examples of the communication unit 102 include a communication antenna and an RF circuit, a LAN terminal, and a transmission / reception circuit, but the configuration of the communication unit 102 is not limited to the above.
- the communication unit 102 can have a configuration corresponding to an arbitrary standard capable of performing communication such as a USB terminal and a transmission / reception circuit, or an arbitrary configuration capable of communicating with an external device via a network.
- the communication unit 102 may be configured to be able to communicate with one or more external devices or the like by a plurality of communication methods.
- the control unit 104 is configured by, for example, an MPU and plays a role of controlling the entire information processing apparatus 100.
- the control unit 104 includes, for example, a processing unit 110 and plays a role of leading the processing related to the information processing method according to the present embodiment.
- the processing unit 110 plays a role of leading one or both of the processing related to the first information processing method and the processing related to the second information processing method.
- When performing the process related to the first information processing method described above, the processing unit 110 performs a summarization process for summarizing the content of the utterance indicated by the voice information, based on the acquired information indicating the weight related to the summary.
- the processing unit 110 performs, for example, the process described in [3-1] as the summary process.
- When performing the processing related to the second information processing method described above, the processing unit 110 performs a notification control process for controlling notification of notification contents based on the summary information.
- the processing unit 110 performs, for example, the process described in [3-3] as the notification control process.
- The processing unit 110 may further perform a translation process for translating the content of the utterance summarized by the summarization process into another language.
- the processing unit 110 performs, for example, the process described in [3-2] as the translation process.
- the processing unit 110 can notify the translation result by the notification control process.
- The processing unit 110 can also perform various other processes related to the information processing method according to the present embodiment, such as a process related to speech recognition, a process related to speech analysis, a process related to estimation of a user's state, and a process related to estimation of the distance between a user and a communication partner.
- Various processes related to the information processing method according to the present embodiment may be performed in an external device of the information processing apparatus 100.
- With the configuration shown in FIG. 34, for example, the information processing apparatus 100 performs the processing related to the information processing method according to the present embodiment (for example, one or both of “the summarization process related to the first information processing method and the notification control process related to the second information processing method,” optionally together with the translation process).
- Therefore, with the configuration shown in FIG. 34, for example, the information processing apparatus 100 can summarize the content of the utterance.
- In addition, with the configuration shown in FIG. 34, for example, the information processing apparatus 100 can notify the content of the summarized utterance.
- the information processing apparatus 100 can achieve the effects that are achieved by performing the processing related to the information processing method according to the present embodiment as described above.
- The configuration of the information processing apparatus according to the present embodiment is not limited to the configuration shown in FIG. 34.
- the information processing apparatus can include the processing unit 110 illustrated in FIG. 34 separately from the control unit 104 (for example, realized by another processing circuit). Further, for example, the summary processing according to the first information processing method, the notification control processing according to the second information processing method, and the translation processing according to the present embodiment may be performed in a distributed manner by a plurality of processing circuits.
- The summarization process related to the first information processing method, the notification control process related to the second information processing method, and the translation process according to the present embodiment are obtained by dividing, for convenience, the processing related to the information processing method according to the present embodiment. Therefore, the configuration for realizing the processing related to the information processing method according to the present embodiment is not limited to the configuration shown in FIG. 34, and a configuration according to another way of dividing the processing related to the information processing method according to the present embodiment may be adopted.
- When communicating with an external device via an external communication device having the same function and configuration as the communication unit 102, the information processing apparatus does not have to include the communication unit 102.
- the information processing apparatus has been described as the present embodiment, but the present embodiment is not limited to such a form.
- The present embodiment can be applied to various devices capable of performing the processing related to the information processing method according to the present embodiment (one or both of the processing related to the first information processing method and the processing related to the second information processing method), such as a “computer such as a PC (Personal Computer) or a server,” an “arbitrary wearable device used by being worn on a user's body, such as an eyewear-type device, a watch-type device, or a bracelet-type device,” a “communication device such as a smartphone,” a “tablet-type device,” a “game machine,” or a “mobile object such as an automobile.”
- the present embodiment can be applied to a processing IC that can be incorporated in the above-described device, for example.
- the information processing apparatus may be applied to a processing system that is premised on connection to a network (or communication between apparatuses), such as cloud computing.
- Examples of a processing system in which the processing related to the information processing method according to the present embodiment is performed include a system in which the summarization process and the translation process related to the first information processing method are performed by one apparatus constituting the processing system, and the notification control process related to the second information processing method is performed by another apparatus constituting the processing system.
- (Program according to this embodiment) [I] Program related to the first information processing method: A program for causing a computer to function as the information processing apparatus according to the present embodiment that performs the processing related to the first information processing method (for example, a program capable of executing the processing related to the first information processing method, such as “the summarization process related to the first information processing method” or “the summarization process related to the first information processing method and the translation process related to the present embodiment”) is executed by a processor or the like in the computer, whereby the content of the utterance can be summarized.
- In addition, when the program for causing a computer to function as the information processing apparatus according to the present embodiment that performs the processing related to the first information processing method is executed by a processor or the like in the computer, the effects produced by the processing related to the first information processing method described above can be achieved.
- [II] Program related to the second information processing method: A program for causing a computer to function as the information processing apparatus according to the present embodiment that performs the processing related to the second information processing method (for example, a program capable of executing the processing related to the second information processing method, such as “the notification control process related to the second information processing method” or “the translation process related to the present embodiment and the notification control process related to the second information processing method”) is executed by a processor or the like in the computer, whereby the content of the summarized utterance can be notified.
- In addition, when the program for causing a computer to function as the information processing apparatus according to the present embodiment that performs the processing related to the second information processing method is executed by a processor or the like in the computer, the effects produced by the processing related to the second information processing method described above can be achieved.
- The program related to the information processing method according to the present embodiment may include one or both of the program related to the first information processing method and the program related to the second information processing method.
- Although a program (a program that executes one or both of the processing related to the first information processing method and the processing related to the second information processing method) for causing a computer to function as the information processing apparatus according to the present embodiment has been described above, the present embodiment can also provide a recording medium in which the program is stored.
- (1) An information processing apparatus including a processing unit that performs a summarization process for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating a weight related to the summary.
- (2) The information processing apparatus according to (1), wherein the processing unit performs the summarization process when it is determined that a predetermined start condition is satisfied.
- (3) The information processing apparatus according to (2), wherein the start condition is a condition related to a non-speech period in which a state in which no speech is made continues, and the processing unit determines that the start condition is satisfied when the non-speech period exceeds a predetermined period or when the non-speech period becomes equal to or longer than the predetermined period.
- (4) The information processing apparatus according to (2) or (3), wherein the start condition is a condition related to a state of speech recognition for acquiring the content of the utterance from the voice information, and the processing unit determines that the start condition is satisfied based on detection of a stop request for the speech recognition.
- (5) The information processing apparatus according to any one of (2) to (4), wherein the start condition is a condition related to a state of speech recognition for acquiring the content of the utterance from the voice information, and the processing unit determines that the start condition is satisfied based on detection of completion of the speech recognition.
- (6) The information processing apparatus according to any one of (2) to (5), wherein the start condition is a condition related to the content of the utterance, and the processing unit determines that the start condition is satisfied based on detection of a predetermined word from the content of the utterance indicated by the voice information.
- (7) The information processing apparatus according to any one of (2) to (6), wherein the start condition is a condition related to the content of the utterance, and the processing unit determines that the start condition is satisfied based on detection of hesitation based on the voice information.
- (8) The information processing apparatus according to any one of (2) to (7), wherein the start condition is a condition related to an elapsed time after the voice information is obtained, and the processing unit determines that the start condition is satisfied when the elapsed time exceeds a predetermined period or when the elapsed time becomes equal to or longer than the predetermined period.
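As a hypothetical illustration of the start conditions above (the non-speech-period condition and the elapsed-time condition), a summarization process could be triggered once either quantity reaches a predetermined period. The threshold values below are illustrative assumptions, not values stated in this document:

```python
# Hypothetical sketch: start-condition check for the summarization process.
# non_speech_s: seconds during which no speech has been made.
# elapsed_s: seconds elapsed since the voice information was obtained.
# The default limits are illustrative assumptions only.

def start_condition_met(non_speech_s, elapsed_s,
                        non_speech_limit_s=2.0, elapsed_limit_s=10.0):
    """True when either start condition is satisfied."""
    return non_speech_s >= non_speech_limit_s or elapsed_s >= elapsed_limit_s

print(start_condition_met(2.5, 1.0))   # True: non-speech period reached the limit
print(start_condition_met(0.3, 1.0))   # False: neither condition satisfied
print(start_condition_met(0.0, 12.0))  # True: elapsed time reached the limit
```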
- (10) The information processing apparatus according to (9), wherein the summary exclusion condition is a condition related to gesture detection, and the processing unit determines that the summary exclusion condition is satisfied when a predetermined gesture is detected.
- The information processing apparatus according to any one of (10) to (10), wherein the processing unit changes a summary level of the content of the utterance based on at least one of an utterance period specified based on the voice information and the number of characters specified based on the voice information.
- (13) The information processing apparatus according to any one of (1) to (12), wherein the processing unit sets the weight related to the summary based on at least one of the voice information, information about a user, information about an application, information about an environment, and information about a device.
- the information processing apparatus according to (13), wherein the information about the user includes at least one of the user status information and the user operation information.
- the processing unit further performs a translation process for translating the content of the speech summarized by the summary process into another language.
- the processing unit does not perform the translation process when it is determined that a predetermined translation exclusion condition is satisfied.
- The information processing apparatus according to (15) or (16), wherein the processing unit re-translates the content translated into the other language by the translation process into the language before translation, and, when a word included in the re-translated content is present in the content of the utterance indicated by the voice information acquired after the re-translation, includes that word in the content of the summarized utterance.
- the information processing apparatus according to any one of (1) to (17), wherein the processing unit further performs notification control processing for controlling notification of the content of the summarized utterance.
- An information processing method executed by an information processing apparatus, the method comprising performing a summarization process for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating a weight related to the summary.
Description
- In the following, description will be given in the following order.
- 1. Information processing method according to this embodiment
- 2. Information processing apparatus according to this embodiment
- 3. Program according to this embodiment
- (Information processing method according to this embodiment) First, the information processing method according to the present embodiment will be described. In the following, the case where the information processing apparatus according to the present embodiment performs the processing related to the information processing method according to the present embodiment is taken as an example.
[1] Outline of the information processing method according to the present embodiment
[1-1] Outline of the first information processing method
As described above, one conceivable way to further reduce the likelihood of the "event caused by the difficulty of uttering only the content that the speaker wants to convey" is to make the content of the speaker's utterance more concise.
[1-2] Outline of the second information processing method
By performing the summarization process according to the first information processing method, the summarized content of the utterance indicated by the voice information can be obtained.
[1-3] Other processes related to the information processing method according to the present embodiment
The processes related to the information processing method according to the present embodiment are not limited to the summarization process according to the first information processing method and the notification control process according to the second information processing method described above.
[2] Example of a use case to which the information processing method according to the present embodiment is applied
Next, an example of the processing related to the information processing method according to the present embodiment will be described along with an example of a use case to which the method is applied. In the following, the case where the information processing method according to the present embodiment is applied to "conversation support" (including cases where translation is performed, as described later) is described as such a use case.
The use cases to which the information processing method according to the present embodiment is applied are not limited to "conversation support". For example, the information processing method according to the present embodiment can be applied to any use case in which the content of an utterance indicated by voice information can be summarized, such as the following:
・"Meeting transcription" realized by summarizing the content of utterances indicated by voice information representing the audio of a meeting, generated by an IC (Integrated Circuit) recorder or the like
・"Automatic program telop creation" realized by summarizing the content of utterances indicated by voice information representing the audio of a television program
・One or both of "automatic conference telop creation" and "conference transcription" realized by summarizing the content of utterances indicated by voice information representing the audio of a video conference
(a) Example of processing related to setting a weight related to summarization
The information processing apparatus according to the present embodiment sets a weight related to summarization by, for example, using a table for setting weights related to summarization. The table for setting weights related to summarization may be stored in a storage unit (described later) included in the information processing apparatus according to the present embodiment, or in a recording medium external to the information processing apparatus. The information processing apparatus uses the table by referring, as appropriate, to the storage unit (described later) or the external recording medium.
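The table-driven weight setting described above can be sketched as a plain lookup. Everything below (the function name, the table keys, and the weight types) is a hypothetical illustration, not an implementation taken from this publication:

```python
# Hypothetical sketch of a table for setting weights related to summarization.
# A recognized condition (e.g., a user state) is the key; the value lists
# the weight types to emphasize when summarizing.
WEIGHT_TABLE = {
    "moving_to_destination": ["time", "place"],
    "in_meeting": ["person name", "time"],
}

def set_summary_weight(recognized_condition, table=WEIGHT_TABLE):
    """Return the weight types for a recognized condition, or an empty list."""
    return table.get(recognized_condition, [])
```

The table itself could equally live on an external recording medium and be loaded on demand; the lookup logic stays the same.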
(a-1) First example of setting a weight related to summarization: setting based on the user state indicated by user state information included in the information related to the user
For example, when the user U1 operates a device such as a smartphone to launch a schedule application and confirms a destination, the information processing apparatus according to the present embodiment recognizes that the user U1 is moving toward the destination. The information processing apparatus then sets the weight related to summarization corresponding to the recognition result by referring to the table for setting weights related to summarization.
(a-2) Second example of setting a weight related to summarization: setting based on the voice information
The information processing apparatus according to the present embodiment sets the weight related to summarization based on the voice information.
The information processing apparatus according to the present embodiment determines the type of weight related to summarization based on the voice information, for example, as follows:
・When the average frequency band of the voice indicated by the voice information is, for example, 300 to 550 [Hz]: "male" is determined as the type of weight related to summarization.
・When the average frequency band of the voice is, for example, 400 to 700 [Hz]: "female" is determined.
・When the sound pressure or volume of the voice is equal to or higher than a set first threshold, or larger than the first threshold: one or both of "anger" and "joy" are determined.
・When the sound pressure or volume of the voice is equal to or lower than a set second threshold, or smaller than the second threshold: one or more of "sadness", "discomfort", "pain", and "anxiety" are determined.
・When the pitch of the voice or the speech rate (the amount of phonemes per unit time) is larger than a set third threshold, or equal to or higher than the third threshold: "excitement" is determined.
・When the pitch of the voice or the speech rate is smaller than a set fourth threshold, or equal to or lower than the fourth threshold: "calm" is determined.
(a-3) Third example of setting a weight related to summarization: setting based on the execution state of an application indicated by information related to the application
The information processing apparatus according to the present embodiment sets the weight related to summarization based on the execution state of an application. For example, it can determine the type of weight related to summarization from the properties of the application being executed, and set the weight as follows:
・When a map application is running: "time", "place", "person name", and the like are determined as the types of weight related to summarization.
・When a transit guidance application is running: "time", "place", "train", and the like are determined.
・When an application for smoothly asking questions about Japan is running: "question", "Japan", and the like are determined.
(a-4) Fourth example of setting a weight related to summarization: setting based on a user operation indicated by user operation information included in the information related to the user
The information processing apparatus according to the present embodiment sets the weight related to summarization based on the user's operation.
(a-5) Fifth example of setting a weight related to summarization
The information processing apparatus according to the present embodiment can set the weight related to summarization by combining two or more of (a-1) to (a-4) above.
(b) Example of the summarization process according to the first information processing method
For example, assume a case where the user U1, wanting to throw away trash at a station while moving toward a destination, finds no trash box there and asks the communication partner U2 in English about "the reason why there is no trash box at the station" (FIGS. 1 and 2).
(c) Example of the translation process
The information processing apparatus according to the present embodiment may, for example, further translate the content of the utterance summarized by the summarization process shown in (b) above into another language. As described above, the information processing apparatus translates the first language corresponding to the utterance into a second language different from the first language.
(d) Example of the notification control process according to the second information processing method
The information processing apparatus according to the present embodiment causes the content of the utterance indicated by the voice information, summarized by the summarization process shown in (b) above, to be notified. When the summarized utterance content has been translated into another language by further performing the translation process shown in (c) above, the information processing apparatus causes the translation result to be notified.
[3] Processing related to the information processing method according to the present embodiment
Next, the processing related to the information processing method according to the present embodiment will be described more specifically. In the following, the summarization process according to the first information processing method, the translation process according to the present embodiment, and the notification control process according to the second information processing method are described.
[3-1] Summarization process according to the first information processing method
The information processing apparatus according to the present embodiment summarizes the content of the utterance indicated by voice information based on the user's utterance, based on information indicating a weight related to summarization.
(1) First example of the summarization process: start timing of the summarization process
The information processing apparatus according to the present embodiment performs the summarization process when it determines that a set predetermined start condition is satisfied.
Examples of the start condition for the summarization process according to the present embodiment include the following:
・A condition related to the no-speech period during which the state of no utterance continues
・A condition related to the state of the speech recognition used to acquire the utterance content from the voice information
・A condition related to the content of the utterance
・A condition related to the elapsed time since the voice information was obtained
(1-1) First example of the start condition: a condition related to the no-speech period
An example of a condition related to the no-speech period is a condition on the length of the no-speech period. When the predetermined start condition is a condition related to the no-speech period, the information processing apparatus according to the present embodiment determines that the start condition is satisfied when the no-speech period exceeds a set predetermined period, or when the no-speech period becomes equal to or longer than the set predetermined period.
(1-2) Second example of the start condition: a first condition related to the state of speech recognition
An example of a first condition related to the state of speech recognition is a condition related to the detection of a request to stop speech recognition. When the predetermined start condition is this first condition, the information processing apparatus according to the present embodiment determines that the start condition is satisfied based on the detection of a request to stop speech recognition; for example, it determines that the start condition is satisfied when such a stop request is detected.
(1-3) Third example of the start condition: a second condition related to the state of speech recognition
An example of a second condition related to the state of speech recognition is a condition related to the completion of speech recognition. When the predetermined start condition is this second condition, the information processing apparatus according to the present embodiment determines that the start condition is satisfied based on the detection of the completion of speech recognition; for example, it determines that the start condition is satisfied when the completion of speech recognition is detected.
(1-4) Fourth example of the start condition: a first condition related to the content of the utterance
An example of a first condition related to the content of the utterance is a condition related to the detection of a predetermined word from the utterance content indicated by the voice information. When the predetermined start condition is this first condition, the information processing apparatus according to the present embodiment determines that the start condition is satisfied based on the detection of a predetermined word from the utterance content indicated by the voice information; for example, it determines that the start condition is satisfied when such a word is detected.
(1-5) Fifth example of the start condition: a second condition related to the content of the utterance
An example of a second condition related to the content of the utterance is a condition related to the detection of disfluency (hesitation) from the utterance content indicated by the voice information. When the predetermined start condition is this second condition, the information processing apparatus according to the present embodiment determines that the start condition is satisfied based on the detection of disfluency from the voice information; for example, it determines that the start condition is satisfied when disfluency is detected.
(1-6) Sixth example of the start condition: a condition related to the elapsed time since the voice information was obtained
An example of such a condition is a condition on the length of the elapsed time. When the predetermined start condition is a condition related to the elapsed time since the voice information was obtained, the information processing apparatus according to the present embodiment determines that the start condition is satisfied when the elapsed time exceeds a set predetermined period, or when the elapsed time becomes equal to or longer than the set predetermined period.
(1-7) Seventh example of the start condition
The start condition may be a condition combining two or more of the start conditions according to the first example shown in (1-1) through the sixth example shown in (1-6) above. In that case, the information processing apparatus according to the present embodiment starts the summarization process, treating the satisfaction of any one of the combined start conditions as a summarization trigger.
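One possible sketch of combining start conditions as in (1-7), where satisfying any one condition acts as the summarization trigger. The condition functions, the 2-second no-speech limit, and the predetermined words are all hypothetical stand-ins for the conditions listed above:

```python
import time

def no_speech_exceeded(last_utterance_ts, limit_s=2.0, now=None):
    """Condition (1-1): the no-speech period exceeds a set period."""
    now = time.monotonic() if now is None else now
    return now - last_utterance_ts > limit_s

def stop_requested(events):
    """Condition (1-2): a request to stop speech recognition was detected."""
    return "stop_request" in events

def predetermined_word_detected(utterance, words=("done", "over")):
    """Condition (1-4): a predetermined word appears in the utterance content."""
    return any(w in utterance for w in words)

def summarization_trigger(last_utterance_ts, events, utterance, now=None):
    """(1-7): fire when ANY of the combined start conditions is satisfied."""
    return (no_speech_exceeded(last_utterance_ts, now=now)
            or stop_requested(events)
            or predetermined_word_detected(utterance))
```

Since any one satisfied condition triggers the process, the conditions can be evaluated lazily in order of cost, cheapest first.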
(2) Second example of the summarization process: exception processing in which the summarization process is not performed
The information processing apparatus according to the present embodiment does not perform the summarization process when it determines that a set exclusion condition for the summarization process (hereinafter referred to as a "summarization exclusion condition") is satisfied.
(3) Third example of the summarization process: processing for dynamically changing the summarization level
The information processing apparatus according to the present embodiment changes the level of summarization of the utterance content (or the degree of summarization of the utterance content; the same applies hereinafter) based on one or both of the utterance period specified based on the voice information and the number of characters specified based on the voice information. In other words, the information processing apparatus changes the level of summarization of the utterance content based on at least one of the utterance period and the number of characters specified based on the voice information.
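A minimal sketch of choosing a summarization level from the utterance period and the character count; the three-level scheme and the numeric limits are illustrative assumptions, not values from this publication:

```python
def summarization_level(utterance_seconds, char_count,
                        time_limits=(5.0, 15.0), char_limits=(40, 120)):
    """Pick a summarization level (0 = none, 2 = strongest) from the
    utterance period and the number of characters; longer or wordier
    input is summarized more aggressively. The limits are illustrative."""
    level = 0
    if utterance_seconds > time_limits[1] or char_count > char_limits[1]:
        level = 2
    elif utterance_seconds > time_limits[0] or char_count > char_limits[0]:
        level = 1
    return level
```

Either signal alone (period or character count) is enough to raise the level, matching the "one or both" wording above.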
[3-2] Translation process according to the present embodiment
As shown in (c) above, the information processing apparatus according to the present embodiment can further perform a translation process for translating the content of the utterance summarized by the summarization process according to the first information processing method into another language. As described above, the information processing apparatus translates the first language corresponding to the utterance into a second language different from the first language.
(i) First example of the translation process: exception processing in which the translation process is not performed
The information processing apparatus according to the present embodiment does not perform the translation process when it determines that a set exclusion condition for the translation process is satisfied.
(ii) Second example of the translation process: processing involving re-translation
The information processing apparatus according to the present embodiment can also re-translate content that has been translated into another language back into the language before translation.
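As noted in the claim-style items above, a word of the re-translated content that also appears in utterance content acquired after the re-translation can be carried into the summarized utterance content. A minimal sketch of that overlap check, assuming simple whitespace tokenization:

```python
def words_to_keep(retranslated, later_utterance):
    """Return the words of the re-translated content that also appear in the
    utterance content acquired after re-translation; such words are carried
    into the summarized utterance content (whitespace tokenization only)."""
    back = set(retranslated.lower().split())
    later = set(later_utterance.lower().split())
    return sorted(back & later)
```

For Japanese or other unsegmented text, a morphological analyzer would replace the whitespace split; the set intersection itself is unchanged.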
[3-3] Notification control process according to the second information processing method
The information processing apparatus according to the present embodiment causes the content of the utterance indicated by the voice information, summarized by the summarization process according to the first information processing method, to be notified.
(I) First example of the notification control process: notification in the word order of the translated language
The information processing apparatus according to the present embodiment causes the translation result to be notified in the word order corresponding to the other language into which the content was translated.
For example, when the content of the utterance is summarized into the divided text shown in C of FIG. 3 by the summarization process and the other language is English, the information processing apparatus according to the present embodiment causes the translation result to be notified in the following order:
・Nouns
・Verbs
・Adjectives
・Adverbs
・Others
Further, for example, when the content of the utterance is summarized into the divided text shown in C of FIG. 3 by the summarization process and the other language is Japanese, the information processing apparatus according to the present embodiment causes the translation result to be notified in the following order:
・Verbs
・Nouns
・Adjectives
・Adverbs
・Others
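A sketch of notifying translation units in the word order of the target language. The two order tables mirror the lists above; the `(part_of_speech, word)` data representation is an assumption:

```python
# Word-order tables per target language, mirroring the lists above:
# English notifies nouns first, Japanese notifies verbs first.
POS_ORDER = {
    "en": ["noun", "verb", "adjective", "adverb", "other"],
    "ja": ["verb", "noun", "adjective", "adverb", "other"],
}

def order_translation_units(units, target_lang):
    """Sort (part_of_speech, word) translation units into the notification
    order for the target language; the stable sort keeps ties in input
    order, and unknown parts of speech fall to the end."""
    rank = {pos: i for i, pos in enumerate(POS_ORDER[target_lang])}
    return sorted(units, key=lambda u: rank.get(u[0], len(rank)))
```

The same translation units can thus be re-ordered per target language without re-translating anything.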
(II) Second example of the notification control process: notification control based on the reliability of each translation unit
As described above, in the translation process, a reliability can be set for the translation result of each translation unit. When a reliability is set for each translation unit in the translation process, the information processing apparatus according to the present embodiment causes the translation result to be notified based on the reliability of each translation unit in the summarized utterance content.
(II-1) First example of notification control based on the reliability of each translation unit
The information processing apparatus according to the present embodiment causes translation results with high reliability to be notified preferentially.
(II-2) Second example of notification control based on the reliability of each translation unit
The information processing apparatus according to the present embodiment causes the translation result to be notified in a manner emphasized according to its reliability.
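A sketch combining (II-1) and (II-2): order translation units by reliability and flag the ones to emphasize. The `(text, reliability)` representation and the 0.8 emphasis threshold are illustrative assumptions:

```python
def notify_by_reliability(units, emphasis_threshold=0.8):
    """Order translation units so that higher-reliability results are
    notified first, and mark units at or above the threshold for
    emphasized display. Each unit is a (text, reliability) pair."""
    ordered = sorted(units, key=lambda u: u[1], reverse=True)
    return [(text, rel, rel >= emphasis_threshold) for text, rel in ordered]
```

A display layer could render the emphasized flag as, for example, a larger font or highlighted color.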
(III) Third example of the notification control process: notification control based on the voice information
When the notification content is notified visually by being displayed on the display screen of a display device, the information processing apparatus according to the present embodiment controls how the notification content is displayed based on the voice information.
(IV) Fourth example of the notification control process: notification control based on an operation performed on the display screen
When the notification content is notified visually by being displayed on the display screen of a display device, the information processing apparatus according to the present embodiment changes the content displayed on the display screen based on an operation performed on the display screen.
(IV-1) First example of notification control based on an operation performed on the display screen
The information processing apparatus according to the present embodiment changes the content displayed on the display screen based on an operation performed on the display screen. Examples of such changes according to the present embodiment include one or more of the following:
・Changing the display position of the notification content on the display screen (or changing the display position of the translation result on the display screen)
・Deleting part of the notification content displayed on the display screen (or deleting part of the translation result displayed on the display screen)
(IV-2) Second example of notification control based on an operation performed on the display screen
When the summarized utterance content (or the translation result) is displayed on the display screen of a display device as the notification content, the summarized utterance content (or the translation result) may not fit on one screen. In such a case, the information processing apparatus according to the present embodiment displays one portion of the notification content on the display screen.
(V) Fifth example of the notification control process: notification control based on a voice operation
The information processing apparatus according to the present embodiment may cause the translation result to be notified audibly, by voice from a voice output device, based on a voice operation.
(VI) Sixth example of the notification control process: notification control for dynamically controlling the notification order
The information processing apparatus according to the present embodiment can also dynamically control the notification order of the notification content.
FIG. 21A shows an example of display based on the state of the user when the translation results for the respective translation units are "recommended", "tourist", "direction", and "tell me".
(VII) Seventh example of the notification control process: notification control for dynamically controlling the notification content
The information processing apparatus according to the present embodiment can also dynamically control the amount of information in the notification content.
(VII-1) Example of dynamically changing the notification content based on the summary information
・When the summarized utterance content indicated by the summary information includes a demonstrative such as "that" or "it", the information processing apparatus according to the present embodiment does not cause the demonstrative (or its translation result) to be notified.
・When the summarized utterance content indicated by the summary information includes a word corresponding to a greeting, the information processing apparatus according to the present embodiment does not cause the word corresponding to the greeting (or its translation result) to be notified.
(VII-2) Example of dynamically changing the notification content based on information corresponding to the first user
・When the facial expression of the first user is determined to be laughter, the information processing apparatus according to the present embodiment reduces the amount of information when the notification content is notified.
・When the line of sight of the first user is determined to be facing upward (an example of a case determined to be close to talking to oneself), the information processing apparatus does not cause the notification content to be notified.
・When a gesture corresponding to a demonstrative such as "that", "it", or "this" (for example, a pointing gesture) is detected, the information processing apparatus does not cause the notification content to be notified.
・When the first user is determined to be in a noisy environment, the information processing apparatus causes the entire notification content to be notified.
(VII-3) Example of dynamic change of notification content based on information corresponding to the second user
- When the facial expression of the second user is determined to be laughter, the information processing apparatus according to the present embodiment, for example, reduces the amount of information used when causing the notification content to be notified.
- When the second user is a communication partner and it is determined that the second user may not understand the content of the utterance (for example, when the line of sight of the second user is determined not to be directed toward the first user), the information processing apparatus according to the present embodiment, for example, increases the amount of information used when causing the notification content to be notified.
- When the second user is a communication partner and is determined to be yawning (for example, when the second user is determined to be bored), the information processing apparatus according to the present embodiment, for example, reduces the amount of information used when causing the notification content to be notified.
- When the second user is a communication partner and is determined to have nodded or given a backchannel response, the information processing apparatus according to the present embodiment, for example, increases the amount of information used when causing the notification content to be notified.
- When the second user is a communication partner and the size of the second user's pupils is determined to be larger than a predetermined size, or equal to or larger than the predetermined size (one example of a case determined to indicate interest), the information processing apparatus according to the present embodiment, for example, increases the amount of information used when causing the notification content to be notified.
- When the second user is a communication partner and it is determined that the second user may not understand the content of the utterance (for example, when the second user's hands are determined not to be moving), the information processing apparatus according to the present embodiment, for example, increases the amount of information used when causing the notification content to be notified.
- When the second user is a communication partner and the second user's body is determined to be leaning forward (one example of a case determined to indicate interest), the information processing apparatus according to the present embodiment, for example, increases the amount of information used when causing the notification content to be notified.
- When the second user is determined to be in a noisy environment, the information processing apparatus according to the present embodiment, for example, causes all of the notification content to be notified.
(VII-4) Example of dynamic change of notification content based on voice information
- When the volume of the utterance detected from the voice information is greater than a predetermined threshold, or when the volume of the utterance is equal to or greater than the predetermined threshold, the information processing apparatus according to the present embodiment does not, for example, cause the notification content to be notified.
- When the volume of the utterance detected from the voice information is greater than a predetermined threshold, or when the volume of the utterance is equal to or greater than the predetermined threshold, the information processing apparatus according to the present embodiment, for example, causes part or all of the notification content to be notified.
(VII-5) Example of dynamic change of notification content based on a combination of a plurality of pieces of information
- When the first user and the second user are different and the line of sight of the first user is determined to have met the line of sight of the second user, the information processing apparatus according to the present embodiment, for example, increases the amount of information used when causing the notification content to be notified (one example of dynamically changing the notification content on the basis of both information corresponding to the first user and information corresponding to the second user).
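For illustration only, the dynamic changes described in (VII-1) to (VII-5) above can be sketched as a small rule function that decides whether and how much to notify. All names, word lists, thresholds, and data structures below are assumptions made for this sketch; they do not appear in the embodiment itself:

```python
from dataclasses import dataclass

# Hypothetical observation record; every field name here is illustrative,
# not taken from the disclosed embodiment.
@dataclass
class Observations:
    summary_words: list          # words in the summarized utterance
    speaker_expression: str      # e.g. "laughter", "neutral"
    speaker_gaze_up: bool        # True when close to a monologue
    pointing_gesture: bool       # gesture matching "that"/"it"/"this"
    listener_interested: bool    # forward lean, dilated pupils, nodding
    noisy_environment: bool      # noisy surroundings for either user
    gazes_met: bool              # first and second users' gazes met

# Assumed example word lists for (VII-1)-style suppression.
DEMONSTRATIVES = {"that", "it", "this"}
GREETINGS = {"hello", "good morning", "goodbye"}

def decide_notification(obs: Observations) -> dict:
    """Return whether to notify and a detail level in [0.0, 1.0]."""
    # (VII-1): drop demonstratives and greeting words from the summary.
    words = [w for w in obs.summary_words
             if w.lower() not in DEMONSTRATIVES | GREETINGS]
    # (VII-2): speaker-side suppression (monologue-like gaze, pointing
    # gesture accompanying a demonstrative) cancels the notification.
    if obs.speaker_gaze_up or obs.pointing_gesture:
        return {"notify": False, "words": [], "detail": 0.0}
    detail = 1.0
    if obs.speaker_expression == "laughter":
        detail -= 0.3            # reduce the amount of information
    # (VII-3)/(VII-5): listener interest or mutual gaze raises detail.
    if obs.listener_interested or obs.gazes_met:
        detail = 1.0
    # Noisy environment: notify all of the notification content.
    if obs.noisy_environment:
        detail = 1.0
    return {"notify": bool(words), "words": words, "detail": detail}
```

A real implementation would feed these fields from gaze tracking, gesture recognition, and audio analysis; the sketch only shows how the rules compose.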
[4] Specific Example of Processing According to the Information Processing Method According to the Present Embodiment
Next, a specific example of the processing according to the information processing method according to the present embodiment described above will be given. Hereinafter, an example of the processing in the use case described with reference to FIGS. 1 to 5 is shown as a specific example of the processing according to the information processing method according to the present embodiment.
[5] Example of Effects Produced by Using the Information Processing Method According to the Present Embodiment
When the information processing apparatus according to the present embodiment performs the processing according to the information processing method according to the present embodiment, for example, the following effects are produced. Needless to say, the effects produced by using the information processing method according to the present embodiment are not limited to those shown below.
- Even when the speaker speaks in a disorganized manner, only the main points are translated, so that what the speaker wants to convey can be communicated to the receiver.
- Because only the main points are translated, the receiver's confirmation time can be shortened, and smooth translated communication can be realized.
- In some cases, the amount of text subject to the translation processing can be drastically reduced, which makes it possible to improve the accuracy of the translation itself.
- Because the content of the utterance is summarized before being translated, the receiver does not receive unnecessary words and can understand more easily. As a result, people who are not proficient in foreign languages can be encouraged to speak across the language barrier.
(Information processing apparatus according to the present embodiment)
Next, an example of the configuration of the information processing apparatus according to the present embodiment, which is capable of performing the processing according to the information processing method according to the present embodiment described above, will be described. Hereinafter, as an example of the configuration of the information processing apparatus according to the present embodiment, an example of an information processing apparatus capable of performing one or both of the processing according to the first information processing method described above and the processing according to the second information processing method described above is shown.
[Hardware Configuration Example of Information Processing Apparatus 100]
FIG. 35 is an explanatory diagram illustrating an example of the hardware configuration of the information processing apparatus 100 according to the present embodiment. The information processing apparatus 100 includes, for example, an MPU 150, a ROM 152, a RAM 154, a recording medium 156, an input/output interface 158, an operation input device 160, a display device 162, and a communication interface 164. The components of the information processing apparatus 100 are connected to one another, for example, by a bus 166 serving as a data transmission path.
(Program according to the present embodiment)
[I] Program according to the first information processing method (computer program)
A program for causing a computer to function as the information processing apparatus according to the present embodiment that performs the processing according to the first information processing method (for example, a program capable of executing the processing according to the first information processing method, such as "the summarization processing according to the first information processing method" or "the summarization processing according to the first information processing method and the translation processing according to the present embodiment") is executed by a processor or the like in the computer, whereby the content of an utterance can be summarized.
[II] Program according to the second information processing method
A program for causing a computer to function as the information processing apparatus according to the present embodiment that performs the processing according to the second information processing method (for example, a program capable of executing the processing according to the second information processing method, such as "the notification control processing according to the second information processing method" or "the translation processing according to the present embodiment and the notification control processing according to the second information processing method") is executed by a processor or the like in the computer, whereby the content of a summarized utterance can be notified.
[III] Program according to the information processing method according to the present embodiment
The program according to the information processing method according to the present embodiment may include both the program according to the first information processing method and the program according to the second information processing method.
The following configurations also belong to the technical scope of the present disclosure.
(1)
An information processing apparatus including a processing unit that performs summarization processing for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating weights relating to summarization.
(2)
The information processing apparatus according to (1), in which the processing unit performs the summarization processing when determining that a predetermined start condition is satisfied.
(3)
The information processing apparatus according to (2), in which the start condition is a condition relating to a non-speech period during which a state in which no utterance is made continues, and the processing unit determines that the start condition is satisfied when the non-speech period exceeds a predetermined period or when the non-speech period becomes equal to or longer than the predetermined period.
(4)
The information processing apparatus according to (2) or (3), in which the start condition is a condition relating to a state of speech recognition for acquiring the content of an utterance from the voice information, and the processing unit determines that the start condition is satisfied on the basis of detection of a request to stop the speech recognition.
(5)
The information processing apparatus according to any one of (2) to (4), in which the start condition is a condition relating to a state of speech recognition for acquiring the content of an utterance from the voice information, and the processing unit determines that the start condition is satisfied on the basis of detection of completion of the speech recognition.
(6)
The information processing apparatus according to any one of (2) to (5), in which the start condition is a condition relating to the content of the utterance, and the processing unit determines that the start condition is satisfied on the basis of detection of a predetermined word in the content of the utterance indicated by the voice information.
(7)
The information processing apparatus according to any one of (2) to (6), in which the start condition is a condition relating to the content of the utterance, and the processing unit determines that the start condition is satisfied on the basis of detection of hesitation in speech on the basis of the voice information.
(8)
The information processing apparatus according to any one of (2) to (7), in which the start condition is a condition relating to the time elapsed since the voice information was obtained, and the processing unit determines that the start condition is satisfied when the elapsed time exceeds a predetermined period or when the elapsed time becomes equal to or longer than the predetermined period.
(9)
The information processing apparatus according to any one of (1) to (8), in which the processing unit does not perform the summarization processing when determining that a predetermined summarization exclusion condition is satisfied.
(10)
The information processing apparatus according to (9), in which the summarization exclusion condition is a condition relating to gesture detection, and the processing unit determines that the summarization exclusion condition is satisfied when a predetermined gesture is detected.
(11)
The information processing apparatus according to any one of (1) to (10), in which the processing unit changes the level of summarization of the content of the utterance on the basis of at least one of an utterance period specified on the basis of the voice information and a number of characters specified on the basis of the voice information.
(12)
The information processing apparatus according to (11), in which the processing unit changes the level of summarization of the content of the utterance by limiting the number of characters in the summarized content of the utterance.
(13)
The information processing apparatus according to any one of (1) to (12), in which the processing unit sets the weights relating to summarization on the basis of at least one of the voice information, information about the user, information about an application, information about the environment, and information about a device.
(14)
The information processing apparatus according to (13), in which the information about the user includes at least one of state information of the user and operation information of the user.
(15)
The information processing apparatus according to any one of (1) to (14), in which the processing unit further performs translation processing for translating the content of the utterance summarized by the summarization processing into another language.
(16)
The information processing apparatus according to (15), in which the processing unit does not perform the translation processing when determining that a predetermined translation exclusion condition is satisfied.
(17)
The information processing apparatus according to (15) or (16), in which the processing unit retranslates the content translated into the other language by the translation processing back into the language before translation, and, when a word included in the retranslated content is present in the content of the utterance indicated by the voice information acquired after the retranslation, includes the word included in the retranslated content in the summarized content of the utterance.
(18)
The information processing apparatus according to any one of (1) to (17), in which the processing unit further performs notification control processing for controlling notification of the summarized content of the utterance.
(19)
An information processing method executed by an information processing apparatus, the method including a step of performing summarization processing for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating weights relating to summarization.
(20)
A program for causing a computer to realize a function of performing summarization processing for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating weights relating to summarization.
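For illustration only, the start conditions of configurations (3) and (8) above (a non-speech period, or the time elapsed since the voice information was obtained, reaching a predetermined period) can be sketched as follows. The class name, method names, and threshold values are assumptions made for this sketch, not values from the disclosure:

```python
class SummarizationTrigger:
    """Illustrative check of two start conditions for summarization
    processing: a non-speech period exceeding a predetermined period
    (configuration (3)) or the elapsed time since voice information was
    obtained exceeding a predetermined period (configuration (8))."""

    def __init__(self, silence_threshold_s=2.0, elapsed_threshold_s=30.0):
        # Assumed example thresholds; the text only says "predetermined".
        self.silence_threshold_s = silence_threshold_s
        self.elapsed_threshold_s = elapsed_threshold_s
        self.voice_obtained_at = None   # first time voice info was obtained
        self.last_speech_at = None      # last time speech was detected

    def on_voice_information(self, now: float) -> None:
        """Record that speech was detected at time `now` (seconds)."""
        if self.voice_obtained_at is None:
            self.voice_obtained_at = now
        self.last_speech_at = now

    def start_condition_met(self, now: float) -> bool:
        """True when either start condition is satisfied."""
        if self.voice_obtained_at is None:
            return False
        silent_for = now - self.last_speech_at
        elapsed = now - self.voice_obtained_at
        return (silent_for >= self.silence_threshold_s
                or elapsed >= self.elapsed_threshold_s)
```

The other start conditions (a stop request, completion of speech recognition, a predetermined word, or detected hesitation) would be additional boolean inputs to the same check.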
DESCRIPTION OF REFERENCE NUMERALS
102 Communication unit
104 Control unit
110 Processing unit
Claims (20)
- An information processing apparatus comprising a processing unit that performs summarization processing for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating weights relating to summarization.
- The information processing apparatus according to claim 1, wherein the processing unit performs the summarization processing when determining that a predetermined start condition is satisfied.
- The information processing apparatus according to claim 2, wherein the start condition is a condition relating to a non-speech period during which a state in which no utterance is made continues, and the processing unit determines that the start condition is satisfied when the non-speech period exceeds a predetermined period or when the non-speech period becomes equal to or longer than the predetermined period.
- The information processing apparatus according to claim 2, wherein the start condition is a condition relating to a state of speech recognition for acquiring the content of an utterance from the voice information, and the processing unit determines that the start condition is satisfied on the basis of detection of a request to stop the speech recognition.
- The information processing apparatus according to claim 2, wherein the start condition is a condition relating to a state of speech recognition for acquiring the content of an utterance from the voice information, and the processing unit determines that the start condition is satisfied on the basis of detection of completion of the speech recognition.
- The information processing apparatus according to claim 2, wherein the start condition is a condition relating to the content of the utterance, and the processing unit determines that the start condition is satisfied on the basis of detection of a predetermined word in the content of the utterance indicated by the voice information.
- The information processing apparatus according to claim 2, wherein the start condition is a condition relating to the content of the utterance, and the processing unit determines that the start condition is satisfied on the basis of detection of hesitation in speech on the basis of the voice information.
- The information processing apparatus according to claim 2, wherein the start condition is a condition relating to the time elapsed since the voice information was obtained, and the processing unit determines that the start condition is satisfied when the elapsed time exceeds a predetermined period or when the elapsed time becomes equal to or longer than the predetermined period.
- The information processing apparatus according to claim 1, wherein the processing unit does not perform the summarization processing when determining that a predetermined summarization exclusion condition is satisfied.
- The information processing apparatus according to claim 9, wherein the summarization exclusion condition is a condition relating to gesture detection, and the processing unit determines that the summarization exclusion condition is satisfied when a predetermined gesture is detected.
- The information processing apparatus according to claim 1, wherein the processing unit changes the level of summarization of the content of the utterance on the basis of at least one of an utterance period specified on the basis of the voice information and a number of characters specified on the basis of the voice information.
- The information processing apparatus according to claim 11, wherein the processing unit changes the level of summarization of the content of the utterance by limiting the number of characters in the summarized content of the utterance.
- The information processing apparatus according to claim 1, wherein the processing unit sets the weights relating to summarization on the basis of at least one of the voice information, information about the user, information about an application, information about the environment, and information about a device.
- The information processing apparatus according to claim 13, wherein the information about the user includes at least one of state information of the user and operation information of the user.
- The information processing apparatus according to claim 1, wherein the processing unit further performs translation processing for translating the content of the utterance summarized by the summarization processing into another language.
- The information processing apparatus according to claim 15, wherein the processing unit does not perform the translation processing when determining that a predetermined translation exclusion condition is satisfied.
- The information processing apparatus according to claim 15, wherein the processing unit retranslates the content translated into the other language by the translation processing back into the language before translation, and, when a word included in the retranslated content is present in the content of the utterance indicated by the voice information acquired after the retranslation, includes the word included in the retranslated content in the summarized content of the utterance.
- The information processing apparatus according to claim 1, wherein the processing unit further performs notification control processing for controlling notification of the summarized content of the utterance.
- An information processing method executed by an information processing apparatus, the method comprising a step of performing summarization processing for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating weights relating to summarization.
- A program for causing a computer to realize a function of performing summarization processing for summarizing the content of an utterance indicated by voice information based on a user's utterance, on the basis of acquired information indicating weights relating to summarization.
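For illustration only, the retranslation step of claim 17 (translate the summary into another language, translate it back, and carry words from the back-translated content into the summary when they also appear in a subsequent utterance) can be sketched as a pure function over word lists. The function name and arguments are assumptions; no real translation engine is invoked here:

```python
def refine_summary_with_retranslation(summary_words, retranslated_words,
                                      next_utterance_words):
    """Sketch of the claim-17 refinement: words that appear both in the
    back-translated content and in the utterance obtained after the
    retranslation are added to the summarized content of the utterance.
    All inputs are plain lists of words produced by assumed upstream
    speech-recognition and translation stages."""
    carried_over = [w for w in retranslated_words
                    if w in next_utterance_words and w not in summary_words]
    return summary_words + carried_over
```

In a full pipeline the `retranslated_words` list would come from running the translation processing forward and then backward; the sketch only shows the membership check that decides which words survive into the summary.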
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP16888059.9A EP3410432A4 (en) | 2016-01-25 | 2016-10-14 | Information processing device, information processing method, and program |
JP2017563679A JP6841239B2 (en) | 2016-01-25 | 2016-10-14 | Information processing equipment, information processing methods, and programs |
US16/068,987 US11120063B2 (en) | 2016-01-25 | 2016-10-14 | Information processing apparatus and information processing method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016011224 | 2016-01-25 | ||
JP2016-011224 | 2016-01-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017130474A1 true WO2017130474A1 (en) | 2017-08-03 |
Family
ID=59397722
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2016/080485 WO2017130474A1 (en) | 2016-01-25 | 2016-10-14 | Information processing device, information processing method, and program |
Country Status (4)
Country | Link |
---|---|
US (1) | US11120063B2 (en) |
EP (1) | EP3410432A4 (en) |
JP (1) | JP6841239B2 (en) |
WO (1) | WO2017130474A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2019082981A (en) * | 2017-10-30 | 2019-05-30 | 株式会社テクノリンク | Inter-different language communication assisting device and system |
WO2020111880A1 (en) * | 2018-11-30 | 2020-06-04 | Samsung Electronics Co., Ltd. | User authentication method and apparatus |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2019101754A (en) * | 2017-12-01 | 2019-06-24 | キヤノン株式会社 | Summarization device and method for controlling the same, summarization system, and program |
KR102530391B1 (en) * | 2018-01-25 | 2023-05-09 | 삼성전자주식회사 | Application processor including low power voice trigger system with external interrupt, electronic device including the same and method of operating the same |
JP7131077B2 (en) * | 2018-05-24 | 2022-09-06 | カシオ計算機株式会社 | CONVERSATION DEVICE, ROBOT, CONVERSATION DEVICE CONTROL METHOD AND PROGRAM |
US11429795B2 (en) * | 2020-01-13 | 2022-08-30 | International Business Machines Corporation | Machine translation integrated with user analysis |
CN112085090A (en) * | 2020-09-07 | 2020-12-15 | 百度在线网络技术(北京)有限公司 | Translation method and device and electronic equipment |
KR20230067321A (en) * | 2021-11-09 | 2023-05-16 | 삼성전자주식회사 | Electronic device and controlling method of electronic device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000089789A (en) * | 1998-09-08 | 2000-03-31 | Fujitsu Ltd | Voice recognition device and recording medium |
JP2006058567A (en) * | 2004-08-19 | 2006-03-02 | Ntt Docomo Inc | Voice information summarizing system and voice information summarizing method |
JP2007156888A (en) * | 2005-12-06 | 2007-06-21 | Oki Electric Ind Co Ltd | Information presentation system and information presentation program |
JP2010256391A (en) * | 2009-04-21 | 2010-11-11 | Takeshi Hanamura | Voice information processing device |
WO2012023450A1 (en) * | 2010-08-19 | 2012-02-23 | 日本電気株式会社 | Text processing system, text processing method, and text processing program |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9552354B1 (en) * | 2003-09-05 | 2017-01-24 | Spoken Traslation Inc. | Method and apparatus for cross-lingual communication |
US7624093B2 (en) * | 2006-01-25 | 2009-11-24 | Fameball, Inc. | Method and system for automatic summarization and digest of celebrity news |
US7885807B2 (en) * | 2006-10-18 | 2011-02-08 | Hierodiction Software Gmbh | Text analysis, transliteration and translation method and apparatus for hieroglypic, hieratic, and demotic texts from ancient Egyptian |
US8682661B1 (en) * | 2010-08-31 | 2014-03-25 | Google Inc. | Robust speech recognition |
US10235346B2 (en) * | 2012-04-06 | 2019-03-19 | Hmbay Patents Llc | Method and apparatus for inbound message summarization using message clustering and message placeholders |
US20150348538A1 (en) * | 2013-03-14 | 2015-12-03 | Aliphcom | Speech summary and action item generation |
KR20140119841A (en) * | 2013-03-27 | 2014-10-13 | 한국전자통신연구원 | Method for verifying translation by using animation and apparatus thereof |
CN106462909B (en) * | 2013-12-20 | 2020-07-10 | 罗伯特·博世有限公司 | System and method for enabling contextually relevant and user-centric presentation of content for conversations |
US10409919B2 (en) * | 2015-09-28 | 2019-09-10 | Konica Minolta Laboratory U.S.A., Inc. | Language translation for display device |
US10043517B2 (en) * | 2015-12-09 | 2018-08-07 | International Business Machines Corporation | Audio-based event interaction analytics |
JP6604836B2 (en) * | 2015-12-14 | 2019-11-13 | 株式会社日立製作所 | Dialog text summarization apparatus and method |
-
2016
- 2016-10-14 WO PCT/JP2016/080485 patent/WO2017130474A1/en active Application Filing
- 2016-10-14 EP EP16888059.9A patent/EP3410432A4/en not_active Withdrawn
- 2016-10-14 JP JP2017563679A patent/JP6841239B2/en active Active
- 2016-10-14 US US16/068,987 patent/US11120063B2/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000089789A (en) * | 1998-09-08 | 2000-03-31 | Fujitsu Ltd | Voice recognition device and recording medium |
JP2006058567A (en) * | 2004-08-19 | 2006-03-02 | NTT Docomo, Inc. | Voice information summarizing system and voice information summarizing method |
JP2007156888A (en) * | 2005-12-06 | 2007-06-21 | Oki Electric Ind Co Ltd | Information presentation system and information presentation program |
JP2010256391A (en) * | 2009-04-21 | 2010-11-11 | Takeshi Hanamura | Voice information processing device |
WO2012023450A1 (en) * | 2010-08-19 | 2012-02-23 | NEC Corporation | Text processing system, text processing method, and text processing program |
Non-Patent Citations (5)
Title |
---|
OHNO T., ET AL.: "Real-time Captioning based on Simultaneous Summarization of Spoken Monologue", INFORMATION PROCESSING SOCIETY OF JAPAN, SIG NOTES, no. 73, 7 August 2006 (2006-08-07), pages 51 - 56, XP009507779 *
See also references of EP3410432A4 * |
SEIICHI YAMAMOTO: "Present state and future works of spoken language translation technologies", IEICE TECHNICAL REPORT, vol. 100, no. 523, 15 December 2000 (2000-12-15), pages 49 - 54, XP009507789 * |
SHOGO HATA ET AL.: "Sentence Boundary Detection Focused on Confidence Measure of Automatic Speech Recognition", IPSJ SIG NOTES, vol. 2009 -SL, no. 20, 15 February 2010 (2010-02-15), pages 1 - 6, XP009507777 * |
TATSUNORI MORI: "A Term Weighting Method based on Information Gain Ratio for Summarizing Documents retrieved by IR Systems", JOURNAL OF NATURAL LANGUAGE PROCESSING, vol. 9, no. 4, 10 July 2002 (2002-07-10), pages 3 - 32, XP055403101 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2019082981A (en) * | 2017-10-30 | 2019-05-30 | Techno Link Co., Ltd. | Inter-different language communication assisting device and system |
WO2020111880A1 (en) * | 2018-11-30 | 2020-06-04 | Samsung Electronics Co., Ltd. | User authentication method and apparatus |
US11443750B2 (en) | 2018-11-30 | 2022-09-13 | Samsung Electronics Co., Ltd. | User authentication method and apparatus |
Also Published As
Publication number | Publication date |
---|---|
US20190019511A1 (en) | 2019-01-17 |
JPWO2017130474A1 (en) | 2018-11-22 |
JP6841239B2 (en) | 2021-03-10 |
EP3410432A1 (en) | 2018-12-05 |
US11120063B2 (en) | 2021-09-14 |
EP3410432A4 (en) | 2019-01-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017130474A1 (en) | Information processing device, information processing method, and program | |
US20220230374A1 (en) | User interface for generating expressive content | |
KR102197869B1 (en) | Natural assistant interaction | |
CN110288994B (en) | Detecting triggering of a digital assistant | |
JP6701066B2 (en) | Dynamic phrase expansion of language input | |
US20180330729A1 (en) | Text normalization based on a data-driven learning network | |
KR20230015413A (en) | Digital Assistant User Interfaces and Response Modes | |
JP2023022150A (en) | Bidirectional speech translation system, bidirectional speech translation method and program | |
KR20090129192A (en) | Mobile terminal and voice recognition method | |
CN110992927B (en) | Audio generation method, device, computer readable storage medium and computing equipment | |
KR102193029B1 (en) | Display apparatus and method for performing videotelephony using the same | |
US9558733B1 (en) | Audibly indicating secondary content with spoken text | |
KR101819458B1 (en) | Voice recognition apparatus and system | |
CN110612567A (en) | Low latency intelligent automated assistant | |
TW201510774A (en) | Apparatus and method for selecting a control object by voice recognition | |
CN108628819B (en) | Processing method and device for processing | |
KR102123059B1 (en) | User-specific acoustic models | |
WO2017130483A1 (en) | Information processing device, information processing method, and program | |
US9865250B1 (en) | Audibly indicating secondary content with spoken text | |
WO2019073668A1 (en) | Information processing device, information processing method, and program | |
JP5008248B2 (en) | Display processing apparatus, display processing method, display processing program, and recording medium | |
CN112099721A (en) | Digital assistant user interface and response mode | |
JP2005222316A (en) | Conversation support device, conference support system, reception work support system, and program | |
JP2018072509A (en) | Voice reading device, voice reading system, voice reading method and program | |
KR100777569B1 (en) | The speech recognition method and apparatus using multimodal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application |
Ref document number: 16888059 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2017563679 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2016888059 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2016888059 Country of ref document: EP Effective date: 20180827 |