WO2020049826A1 - Information processing device - Google Patents

Information processing device

Info

Publication number
WO2020049826A1
Authority
WO
WIPO (PCT)
Prior art keywords
content
information
user input
unit
information processing
Application number
PCT/JP2019/023630
Other languages
French (fr)
Japanese (ja)
Inventor
田中 彰
充弘 小形
昇悟 池田
広樹 石塚
翔 七尾
誠 村﨑
Original Assignee
NTT DOCOMO, INC.
Application filed by NTT DOCOMO, INC.
Priority to JP2020541024A priority Critical patent/JPWO2020049826A1/en
Publication of WO2020049826A1 publication Critical patent/WO2020049826A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16: Sound input; Sound output
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/08: Speech classification or search
    • G10L 15/10: Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • the present invention relates to an information processing device.
  • a voice agent function interprets a voice input, such as a voice command issued by a user, and executes a process instructed by the voice.
  • a voice input processing device that enables the use of simplified voice commands has been proposed (for example, Patent Document 1).
  • when this type of voice input processing device receives a simplified voice command, for example, it issues predetermined commands for various controls with reference to an operation history, that is, a history of operation information that associates at least part of the content of the voice command with the operation content.
  • an information processing apparatus includes an acquisition unit that acquires content information regarding content, and an interpretation unit that interprets a user input in a natural language to an application that processes the content, based on the content information.
  • the usability of the information processing device can be improved.
  • FIG. 1 is a block diagram illustrating the overall configuration of an information processing apparatus according to a first embodiment of the present invention.
  • FIG. 2 is an explanatory diagram illustrating an example of content information.
  • FIG. 3 is an explanatory diagram illustrating an example in which the interpretation of a user input is uniquely specified.
  • FIG. 4 is an explanatory diagram illustrating another example in which the interpretation of a user input is uniquely specified.
  • FIG. 5 is an explanatory diagram illustrating an example in which the interpretation of a user input is not uniquely specified.
  • FIG. 6 is a flowchart illustrating an example of the operation of the information processing apparatus illustrated in FIG. 1.
  • FIG. 7 is a block diagram illustrating the overall configuration of an information processing apparatus according to a second embodiment of the present invention.
  • FIG. 8 is an explanatory diagram illustrating an example of the operation of the information processing apparatus illustrated in FIG. 7.
  • FIG. 9 is a block diagram illustrating the overall configuration of an information processing apparatus according to a third embodiment of the present invention.
  • FIG. 10 is an explanatory diagram illustrating an example of the relationship between content information and the output mode of response information.
  • FIG. 11 is a flowchart illustrating an example of the operation of the information processing apparatus illustrated in FIG. 9.
  • FIG. 1 is a block diagram illustrating an overall configuration of an information processing apparatus 10 according to the first embodiment of the present invention.
  • a smartphone is assumed as the information processing device 10.
  • any portable information processing device can be adopted as the information processing device 10, and may be, for example, a notebook computer, a wearable terminal, a tablet terminal, or the like.
  • the information processing device 10 is realized by a computer system including a processing device 100, a storage device 140, an input device 150, an output device 160, and a communication device 170.
  • a plurality of elements of the information processing device 10 are mutually connected by a single or a plurality of buses.
  • the term “apparatus” in this specification may be replaced with another term such as a circuit, a device, or a unit.
  • each of the plurality of elements of the information processing device 10 may be configured by a single device or a plurality of devices. Alternatively, some elements of the information processing device 10 may be omitted.
  • the processing device 100 is a processor that controls the entire information processing device 10, and is configured by, for example, a single chip or a plurality of chips.
  • the processing device 100 includes, for example, a CPU (Central Processing Unit) including an interface with peripheral devices, an arithmetic device, registers, and the like. Note that some or all of the functions of the processing device 100 may be implemented by hardware such as a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array).
  • the processing device 100 executes various types of processing in parallel or sequentially.
  • the processing device 100 functions as the agent unit 110 by, for example, reading out and executing the control program PR from the storage device 140.
  • the agent unit 110 interprets a user input, that is, an input from the user in a natural language, and executes a process according to the user input.
  • the user input is, for example, an instruction or a question from the user in a natural language.
  • the method of user input in a natural language is not particularly limited, as long as the information processing apparatus 10 can convert the content of the user input into text or the like and interpret it.
  • for example, the user input may be given by voice, text, or the like.
  • the acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118 shown in the agent unit 110 of FIG. 1 are examples of functional blocks of the agent unit 110. That is, the information processing apparatus 10 includes the acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118.
  • the acquisition unit 112 acquires content information on content.
  • the acquisition unit 112 acquires content information on content being processed by an application that is in a state of receiving a user input.
  • the content that is being processed by the application that is in a state of accepting user input is also referred to as valid content.
  • the acquisition unit 112 specifies an application that is being executed by the information processing apparatus 10, and specifies valid content based on the name of the application or a file that is being processed by the application. Then, the obtaining unit 112 obtains content information on valid content.
  • for example, when the user is watching a movie using the information processing device 10, the acquisition unit 112 specifies the movie as the valid content. Further, for example, when the user is creating an outgoing mail message using the information processing device 10, the acquisition unit 112 specifies the mail as the valid content. Then, the acquisition unit 112 acquires content information on the valid content.
  • note that a mail is treated as a type of content because the user refers to it.
  • the content information has one or a plurality of parameters determined according to the type of the content. For example, when the valid content is a TV (television) program, the acquisition unit 112 acquires a plurality of parameters including the title (program information) of the TV program. An example of the content information will be described with reference to FIG. 2.
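  • to make the structure concrete, the following is a minimal sketch, not part of the patent disclosure, of how content information with type-dependent parameters might be represented. The Python names (ContentInfo, parameter keys such as "full_screen") are assumptions modeled on the example of FIG. 2.

```python
from dataclasses import dataclass, field

@dataclass
class ContentInfo:
    """Content information: a content type plus one or more parameters
    determined according to that type (cf. FIG. 2)."""
    content_type: str        # e.g. "movie", "tv_program", "mail", "map"
    parameters: dict = field(default_factory=dict)

# Example: content information for a movie being played back.
# The parameter keys are hypothetical names for the items in FIG. 2.
movie_info = ContentInfo(
    content_type="movie",
    parameters={
        "title": "Example Movie",   # hypothetical title
        "subtitles": 1,             # 1 = subtitles present, 0 = absent
        "full_screen": True,        # window size: full screen or reduced
        "audio_muted": False,       # presence or absence of audio mute
        "earphones_connected": True,
    },
)
```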
  • the interpretation unit 114 interprets a user input to the application that processes the valid content, based on the content information. For example, the interpretation unit 114 interprets the content of the user input based on the parameters included in the content information. For example, when the valid content is a TV program and the title, which is one of the plurality of parameters related to the TV program, indicates a baseball broadcast, and the user asks “What other games?”, the user input is interpreted as a search for the results of other baseball games or a search for the progress of other baseball games.
  • the control command issuing unit 116 issues a control command according to the user input, based on the result of interpretation by the interpretation unit 114. For example, when the user input “What other games?” is interpreted as a search for the results or the progress of other baseball games, the control command issuing unit 116 issues a control command for retrieving the results or the progress of the other games from information included in the data broadcast and the like. When the control command is issued, the search is executed and the search result is acquired by the response information generation unit 118.
  • the response information generation unit 118 generates response information to the user input based on the result of interpretation by the interpretation unit 114.
  • the response information to the user input is, for example, information indicating that the instruction from the user has been received, information indicating the execution result of a process responding to the instruction, or information indicating the answer to the user's question. For example, if the user input “What other games?” is interpreted as a search for the results or the progress of other baseball games, the response information generation unit 118 generates response information indicating the search result. As a result, for example, the results or the progress of the other games are displayed as text on a display 162 described later, based on the response information.
  • when the result of interpretation by the interpretation unit 114 includes a plurality of interpretations, the response information generation unit 118 generates response information for confirming which of the plurality of interpretations corresponds to the content of the user input. That is, when the interpretation of the user input is not uniquely specified, the response information generation unit 118 generates response information for asking the user about the content of the user input. An example in which the interpretation of the user input is not uniquely specified will be described with reference to FIG. 5.
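  • as an illustration of how the interpretation unit 114 might narrow the meaning of an ambiguous input such as “increase” using these parameters, here is a minimal sketch building on the ContentInfo sketch above. The rule set is an assumption consistent with FIGS. 3 to 5, not the patent's actual algorithm.

```python
def interpret(user_input: str, info: ContentInfo) -> list[str]:
    """Return the candidate interpretations of a user input that remain
    plausible given the content information. A single-element result
    means the interpretation is uniquely specified."""
    candidates = []
    if user_input == "increase":
        p = info.parameters
        # Enlarging the screen is plausible only if the window is not
        # already shown full screen.
        if not p.get("full_screen", False):
            candidates.append("enlarge the screen")
        # Raising the volume is plausible only if audio is being output.
        if not p.get("audio_muted", True):
            candidates.append("increase the volume")
    return candidates
```

  • under this sketch, the full-screen, unmuted state of FIG. 4 leaves only “increase the volume”, while the reduced, unmuted display of FIG. 5 leaves two candidates, which triggers the confirmation question described above.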
  • the storage device 140 is a recording medium that can be read by the processing device 100, and stores a plurality of programs including the control program PR executed by the processing device 100, and various data used by the processing device 100.
  • the storage device 140 may be constituted by at least one of a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), and a RAM (Random Access Memory).
  • the input device 150 is an input device that receives an external input.
  • the input device 150 includes a microphone 152 that receives a voice input operation and an operation unit 154 that receives an operation by a user.
  • the input device 150 transfers the user input received by the microphone 152 or the operation unit 154 to the agent unit 110.
  • the microphone 152 receives, for example, a user input such as an instruction or a question from the user by voice.
  • the operation unit 154 is a device (for example, a keyboard, a mouse, a switch, a button, or the like) for inputting information used by the information processing device 10 to the processing device 100, and accepts a user input such as an instruction or a question from the user.
  • operation unit 154 receives an operation for inputting codes such as numerals and characters to processing device 100 and an operation for selecting an icon displayed on display 162.
  • a touch panel that detects contact with the display surface of the display 162 is suitable as the operation unit 154.
  • the operation unit 154 may include a plurality of operators that can be operated by the user.
  • the input device 150 may include a sensor that detects a movement or the like of the information processing device 10 itself.
  • the output device 160 is an output device that performs output to the outside.
  • the output device 160 includes a display 162, a speaker 164, and a light emitting unit 166.
  • the display 162 is an example of a display device, and displays various images under the control of the processing device 100.
  • the display 162 displays an image such as text or an icon indicating response information.
  • various display panels such as a liquid crystal display panel and an organic EL (Electro Luminescence) display panel are suitably used.
  • the speaker 164 outputs various sounds under the control of the processing device 100.
  • the speaker 164 outputs sound such as voice or music indicating response information.
  • the light emitting unit 166 has a light emitting element such as an LED (Light Emitting Diode), and emits various lights under the control of the processing device 100. For example, the processing device 100 turns on or blinks the light emitting unit 166 according to the content of the response information.
  • the communication device 170 is a device that communicates with another device via a mobile communication network or a network such as the Internet.
  • the communication device 170 is also described as, for example, a network device, a network controller, a network card, or a communication module. Next, an example of content information will be described with reference to FIG. 2.
  • FIG. 2 is an explanatory diagram showing an example of content information.
  • the content information has, for example, a content type and one or more parameters determined according to the content type.
  • a plurality of parameters are determined according to the type of content.
  • when the content is a movie or a TV program, the parameters are, for example, information indicating the title of the movie or the TV program, the presence or absence of subtitles, the window size, the presence or absence of audio mute, the presence or absence of earphone connection, and the like.
  • when the parameter indicating the presence or absence of subtitles is set to a first value (for example, the value “1”), it indicates that subtitles are present; when it is set to a second value (for example, the value “0”), it indicates that there are no subtitles.
  • the window size indicates, for example, whether the window displaying the movie or the TV program is in full-screen display or reduced display.
  • when the content is music, the parameters are, for example, information indicating the title of the song, whether or not lyrics are displayed, the window size, the presence or absence of earphone connection, and the like.
  • in this case, the window size indicates, for example, whether the window displaying the operation buttons for controlling music reproduction and the like is in full-screen display or reduced display.
  • when the content is a mail, the parameters are, for example, information indicating the type of the active window, the window size, the presence or absence of audio mute, the presence or absence of earphone connection, and the like.
  • the types of active windows are, for example, a received mail list screen, a received mail display screen, and an outgoing mail creation screen.
  • in this case, the window size indicates, for example, whether the active window is in full-screen display or reduced display.
  • when the content is a map, the parameters are, for example, information indicating the display magnification of the map, the window size, the presence or absence of audio mute, the presence or absence of earphone connection, and the like.
  • in this case, the window size indicates, for example, whether the window displaying the map is in full-screen display or reduced display.
  • when the content is a game, the parameters are, for example, information indicating the title of the game, the type of the game screen, the window size, the presence or absence of audio mute, the presence or absence of earphone connection, and the like.
  • the types of game screens include, for example, a battle scene, an item selection scene, and a screen displaying game results.
  • in this case, the window size indicates, for example, whether the window displaying the game is in full-screen display or reduced display.
  • for another type of game, the parameters may be, for example, information indicating the title of the game, the window size, the presence or absence of earphone connection, and the like; in this case as well, the window size indicates whether the window displaying the game is in full-screen display or reduced display.
  • the type of content and the parameters determined according to the type of content are not limited to the example shown in FIG.
  • the parameter may be one piece of information indicating whether or not the earphone is connected. That is, the number of parameters included in the content information may be one.
  • FIG. 3 is an explanatory diagram showing an example in which the interpretation of the user input is uniquely specified.
  • FIG. 3 shows an example of the operation of the information processing apparatus 10 when the user instructs “increase” by voice while watching a movie using the information processing apparatus 10.
  • the user input “increase” has two possible meanings: enlarging the screen and increasing the volume.
  • in the example of FIG. 3, the interpretation unit 114 uniquely interprets the user input “increase” as enlarging the screen.
  • specifically, the acquisition unit 112 specifies the movie as the valid content and acquires a plurality of parameters including the title of the movie (for example, the parameters illustrated in FIG. 2). Then, the interpretation unit 114 uniquely interprets the user input “increase” as enlarging the screen, based on the parameters and the like acquired by the acquisition unit 112.
  • since the result of interpretation by the interpretation unit 114 is to enlarge the screen, the response information generation unit 118 generates response information indicating that the screen will be enlarged as the response to the user input. As a result, for example, a text stating “Make full screen” is displayed on the display 162.
  • the control command issuing unit 116 issues a control command for displaying the movie on the full screen. As a result, the movie is displayed on the entire screen of the display 162.
  • the user may utter an instruction or the like after calling the information processing apparatus 10 with a predetermined word.
  • the input device 150 of the information processing device 10 may receive, as a user input, a voice following a call for a predetermined word.
  • the information processing device 10 can easily determine whether or not the word uttered by the user is an input to the information processing device 10 by detecting the presence or absence of a call for a predetermined word.
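  • a minimal sketch of this gating follows; the wake word and helper names are hypothetical and not taken from the patent.

```python
WAKE_WORD = "agent"  # hypothetical predetermined word

def accept_user_input(utterance: str) -> str | None:
    """Treat speech as a user input only when it follows the
    predetermined call word; otherwise ignore it."""
    head, _, rest = utterance.partition(" ")
    if head.lower() == WAKE_WORD and rest:
        return rest   # e.g. "agent increase" -> "increase"
    return None       # not addressed to the information processing device
```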
  • the user input is not limited to the voice input, and may be, for example, text.
  • for example, when the user inputs an instruction or a question as text, the information processing apparatus 10 may search for a route from the current position to Shibuya and display the search result as text on the display 162.
  • the operation of the information processing apparatus 10 is described with an example where the user input is a voice, but the user input is not limited to the voice input.
  • FIG. 4 is an explanatory diagram showing another example in which the interpretation of the user input is uniquely specified.
  • FIG. 4 shows an example of the operation of the information processing apparatus 10 when the user instructs “increase” by voice while watching a movie using the information processing apparatus 10.
  • the parameters included in the content information regarding the movie indicate audio output (no mute), connection of earphones, and full-screen display.
  • in this case, the interpretation unit 114 uniquely interprets the user input “increase” as increasing the volume.
  • specifically, the acquisition unit 112 specifies the movie as the valid content and acquires a plurality of parameters including the title of the movie (for example, the parameters illustrated in FIG. 2). Then, the interpretation unit 114 uniquely interprets the user input “increase” as increasing the volume, based on the parameters and the like acquired by the acquisition unit 112.
  • since the result of interpretation by the interpretation unit 114 is to increase the volume, the response information generation unit 118 generates response information indicating that the volume will be increased as the response to the user input. As a result, for example, a text stating “Increase the volume” is displayed on the display 162.
  • the control command issuing unit 116 then issues a control command to increase the volume. As a result, the volume of the movie being played by the information processing device 10 increases.
  • the response information when the interpretation of the user input is uniquely specified is not limited to the examples illustrated in FIGS. 3 and 4.
  • the information processing apparatus 10 may display a text “OK” on the display 162 as a response to the user input.
  • the information processing apparatus 10 can uniquely specify the interpretation of the user input based on the content information on the valid content. For this reason, usability of the information processing apparatus 10 can be improved.
  • An example in which the interpretation of the user input cannot be uniquely specified even by referring to the content information regarding the valid content will be described with reference to FIG.
  • FIG. 5 is an explanatory diagram showing an example where the interpretation of the user input is not uniquely specified.
  • FIG. 5 shows an example of the operation of the information processing device 10 when the user instructs “increase” by voice while watching a movie using the information processing device 10.
  • the parameters included in the content information regarding the movie indicate audio output (no mute), connection of earphones, and reduced display.
  • in this case, the interpretation unit 114 interprets the user input “increase” ambiguously, as either enlarging the screen or increasing the volume.
  • specifically, the acquisition unit 112 specifies the movie as the valid content and acquires a plurality of parameters including the title of the movie (for example, the parameters illustrated in FIG. 2). Then, the interpretation unit 114 interprets the user input “increase” ambiguously, as either enlarging the screen or increasing the volume, based on the parameters and the like acquired by the acquisition unit 112.
  • because the interpretation is ambiguous, the response information generation unit 118 generates response information asking the user which interpretation corresponds to the content of the user input. For example, the response information generation unit 118 generates response information for confirming the content of the user input, such as “Is the screen to be increased, or the volume?”. As a result, a text stating “Is the screen to be increased, or the volume?” is displayed on the display 162.
  • when the user answers “screen” by voice in response to the question “Is the screen to be increased, or the volume?”, the interpretation unit 114 uniquely interprets the first user input “increase” as enlarging the screen.
  • the response information generation unit 118 generates response information indicating that the user's instruction is to be executed.
  • the control command issuing unit 116 issues a control command for displaying a movie on the entire screen. As a result, for example, the text “OK” is displayed on the display 162, and the movie is displayed on the entire screen of the display 162.
  • as described above, even when the interpretation of a user input is not uniquely specified from the content information, the information processing apparatus 10 can identify the interpretation of the input by using response information that asks the user which of the plurality of interpretations corresponds to the content of the user input. For this reason, the usability of the information processing apparatus 10 can be improved. A sketch of this exchange follows.
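  • the two-turn exchange of FIG. 5 could be sketched as follows, reusing the interpret() sketch above; the ask callable, which poses a question and returns the user's answer, is a hypothetical stand-in for the response information and input devices.

```python
def resolve(user_input: str, info: ContentInfo, ask) -> str:
    """Determine a unique interpretation, asking the user to choose
    when several interpretations remain (cf. FIG. 5)."""
    candidates = interpret(user_input, info)
    while len(candidates) > 1:
        # e.g. "Which do you mean: enlarge the screen or increase the volume?"
        answer = ask(f"Which do you mean: {' or '.join(candidates)}?")
        # Keep only the candidates mentioned in the user's answer,
        # e.g. the answer "screen" keeps "enlarge the screen".
        candidates = [c for c in candidates if answer in c]
    return candidates[0] if candidates else "unknown"
```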
  • FIG. 6 is a flowchart showing an example of the operation of the information processing apparatus 10 shown in FIG. 1. The operation illustrated in FIG. 6 is an example of a control method of the information processing device 10.
  • in step S100, the processing device 100 determines whether or not there is a user input. For example, the processing device 100 determines whether the input device 150 has received a user input. When there is a user input, that is, when the input device 150 has received a user input, the operation of the information processing device 10 proceeds to step S110. On the other hand, when there is no user input, that is, when the input device 150 has not received a user input, the operation of the information processing device 10 returns to step S100.
  • the information processing apparatus 10 waits for the execution of the processing in step S110 until the input device 150 receives a user input.
  • when the input device 150 receives a user input (for example, a voice input such as “increase” in FIGS. 3, 4, and 5), the information processing device 10 executes the process of step S110.
  • in step S110, the processing device 100 functions as the acquisition unit 112 and specifies the valid content.
  • for example, the acquisition unit 112 specifies the movie being watched, the mail being created, the map being viewed, the action game being played, or the music game being played as the valid content, according to the application in use.
  • in step S120, the processing device 100 functions as the acquisition unit 112 and acquires content information including one or more parameters indicating the state of the valid content.
  • for example, when the valid content is a movie, the acquisition unit 112 acquires parameters indicating the title of the movie, the presence or absence of subtitles, the window size, the presence or absence of audio mute, and the presence or absence of earphone connection as the parameters to be included in the content information.
  • in step S130, the processing device 100 functions as the interpretation unit 114 and interprets the content of the user input based on the content information acquired in step S120.
  • in the example shown in FIG. 3, the interpretation unit 114 uniquely interprets the user input “increase” as enlarging the screen, based on the parameters and the like included in the content information.
  • in the example shown in FIG. 4, the interpretation unit 114 uniquely interprets the user input “increase” as increasing the volume, based on the parameters and the like included in the content information.
  • in the example shown in FIG. 5, the interpretation unit 114 interprets the user input “increase” ambiguously, as either enlarging the screen or increasing the volume, based on the parameters and the like included in the content information.
  • in step S140, the processing device 100 functions as the response information generation unit 118 and determines whether the content of the user input interpreted in step S130 is ambiguous.
  • the response information generation unit 118 determines whether the interpretation result of the user input by the interpretation unit 114 includes a plurality of interpretations.
  • in the examples shown in FIGS. 3 and 4, the result of interpretation by the interpretation unit 114 indicates one interpretation, so the content of the user input is uniquely specified. Therefore, in the examples shown in FIGS. 3 and 4, the response information generation unit 118 determines that the content of the user input is not ambiguous. In the example shown in FIG. 5, on the other hand, the result of interpretation by the interpretation unit 114 includes a plurality of interpretations, so the content of the user input is not uniquely specified. Therefore, in the example shown in FIG. 5, the response information generation unit 118 determines that the content of the user input is ambiguous.
  • the determination as to whether the content of the user input interpreted in step S130 is ambiguous may be executed by a functional block other than the response information generation unit 118; for example, the interpretation unit 114 may make this determination. If the content of the user input is ambiguous, that is, if the result of interpretation by the interpretation unit 114 includes a plurality of interpretations, the operation of the information processing apparatus 10 proceeds to step S142. On the other hand, when the content of the user input is not ambiguous, that is, when the content of the user input is uniquely specified, the operation of the information processing apparatus 10 proceeds to step S150.
  • in step S142, the processing device 100 functions as the response information generation unit 118 and generates, based on the interpretation result of step S130, response information for asking the user about the content of the user input. Then, the information processing device 10 outputs the generated response information.
  • for example, in the example shown in FIG. 5, the response information generation unit 118 generates, from the interpretation result of the user input “increase” (the two interpretations of enlarging the screen and increasing the volume), response information for asking the user about the content of the user input, such as “Is the screen to be increased, or the volume?”. Then, the information processing apparatus 10 displays a text stating “Is the screen to be increased, or the volume?” on the display 162.
  • in step S144, the processing device 100 functions as the interpretation unit 114 and determines the interpretation of the content of the user input based on the response to the response information output in step S142.
  • for example, when the interpretation unit 114 receives the answer “screen” from the user in response to the question “Is the screen to be increased, or the volume?”, the interpretation unit 114 determines the interpretation of the user input “increase” to be enlarging the screen.
  • in step S150, the processing device 100 functions as the response information generation unit 118 and generates response information to the user input based on the result of interpreting the content of the user input. Then, the information processing device 10 outputs the generated response information. For example, if the content of the user input interpreted in step S130 is ambiguous, the response information generation unit 118 generates response information based on the interpretation of the content of the user input determined in step S144. Further, for example, when the content of the user input interpreted in step S130 is unique, the response information generation unit 118 generates response information according to the content of the user input interpreted in step S130.
  • in the example shown in FIG. 3, the user input “increase” is uniquely interpreted as enlarging the screen, so the response information generation unit 118 generates response information indicating that the screen will be enlarged.
  • the information processing device 10 displays a text stating “Change to full screen” on the display 162 based on the generated response information.
  • in the example shown in FIG. 4, the user input “increase” is uniquely interpreted as increasing the volume, so the response information generation unit 118 generates response information indicating that the volume will be increased.
  • the information processing device 10 displays a text indicating “increase the volume” on the display 162 based on the generated response information.
  • in the example shown in FIG. 5, the interpretation of the user input “increase” is determined to be enlarging the screen, based on the response to the response information asking the user about the content of the user input.
  • the information processing device 10 displays a text “OK” on the display 162 based on the generated response information.
  • in step S160, the processing device 100 functions as the control command issuing unit 116 and issues a control command corresponding to the user input based on the result of interpreting the content of the user input.
  • for example, in the examples shown in FIGS. 3 and 5, since the content of the user input “increase” is interpreted as enlarging the screen, the control command issuing unit 116 issues a control command for displaying the movie on the full screen. As a result, the movie is displayed on the entire screen of the display 162.
  • in the example shown in FIG. 4, the control command issuing unit 116 issues a control command to increase the volume. As a result, the volume of the movie being played by the information processing device 10 increases.
  • in step S170, the processing device 100 functions as the response information generation unit 118 and generates response information to the user input based on the execution result of the control command issued in step S160. Then, the information processing device 10 outputs the generated response information.
  • the response to the user input ends by executing the control command issued in step S160.
  • the end of the response to the user input is not limited to the execution of the control command issued in step S160. For example, when the content of the user input is a route search to a destination, the response to the user input ends by outputting the result of the route search.
  • when the content of the user input is a route search to a destination, a control command for executing the route search is issued in step S160, and the response information generation unit 118 generates response information indicating the route to the destination based on the result of the route search. Then, the information processing apparatus 10 outputs the route to the destination by one or both of text and voice, and ends the response to the user input.
  • the operation of the information processing device 10 is not limited to the example illustrated in FIG.
  • a series of processes in steps S142 and S144 may be repeated until the interpretation of the content of the user input is determined.
  • one of the processes of steps S150 and S170 may be omitted according to the content of the user input.
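  • putting the steps together, the flow of FIG. 6 could be sketched as a single pass of a loop like the following; all of the callables are hypothetical stand-ins for the devices described above, and interpret() and resolve() are the sketches given earlier.

```python
def agent_step(get_input, acquire_content_info, ask, issue_command, respond):
    """One pass through the flow of FIG. 6 (S100 to S170)."""
    user_input = get_input()                       # S100: check for a user input
    if user_input is None:
        return                                     # no input yet; keep waiting
    info = acquire_content_info()                  # S110/S120: specify the valid content
                                                   # and acquire its content information
    candidates = interpret(user_input, info)       # S130: interpret the user input
    if len(candidates) > 1:                        # S140: is the interpretation ambiguous?
        meaning = resolve(user_input, info, ask)   # S142/S144: ask the user and decide
    else:
        meaning = candidates[0] if candidates else "unknown"
    respond(f"OK: {meaning}")                      # S150: output response information
    result = issue_command(meaning)                # S160: issue the control command
    respond(result)                                # S170: report the execution result
```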
  • as described above, the information processing device 10 includes the acquisition unit 112, which acquires content information regarding content, and the interpretation unit 114, which interprets a user input (an input from the user in a natural language) to an application that processes the content, based on the content information.
  • the information processing device 10 interprets the content of the user input based on the content information on the valid content. Therefore, for example, when an ambiguous instruction is issued from the user, it is possible to reduce the possibility that the user's instruction is not specified, and to reduce the execution of a process different from the user's intention. As a result, the usability of the information processing device 10 can be improved.
  • the information processing apparatus 10 further includes the control command issuing unit 116, which issues a control command according to the user input based on the result of interpretation by the interpretation unit 114. For example, even when an ambiguous instruction is issued by the user, the instruction is uniquely interpreted by the interpretation unit 114 based on the content information, so the issuance of a control command for a process different from the user's intention can be reduced.
  • the information processing apparatus 10 also includes the response information generation unit 118, which generates response information to the user input based on the result of interpretation by the interpretation unit 114. For example, even when an ambiguous instruction is issued by the user, the instruction is uniquely interpreted by the interpretation unit 114 based on the content information, so the generation of response information for an instruction different from the user's intention can be reduced.
  • when the result of interpretation by the interpretation unit 114 includes a plurality of interpretations, the response information generation unit 118 generates response information asking the user which of the plurality of interpretations corresponds to the content of the user input. For example, even when the content of the user input cannot be uniquely specified from the content information on the valid content, the information processing apparatus 10 can specify the content of the user input by using this response information.
  • FIG. 7 is a block diagram showing the overall configuration of the information processing apparatus 10 according to the second embodiment of the present invention. Elements that are the same as or similar to the elements described in FIGS. 1 to 6 are denoted by the same reference numerals, and detailed description is omitted.
  • the information processing device 10 shown in FIG. 7 has the same configuration as the information processing device 10 shown in FIG. 1.
  • the information processing device 10 is realized by a computer system including a processing device 100, a storage device 140, an input device 150, an output device 160, and a communication device 170.
  • a plurality of elements of the information processing device 10 are mutually connected by a single or a plurality of buses.
  • each of the plurality of elements of the information processing device 10 may be configured by a single device or a plurality of devices. Alternatively, some elements of the information processing device 10 may be omitted.
  • the processing device 100 shown in FIG. 7 is the same as or similar to the processing device 100 shown in FIG. 1, except that a control program PRa is executed instead of the control program PR shown in FIG. 1.
  • the processing device 100 functions as the agent unit 110a by reading and executing the control program PRa from the storage device 140.
  • the agent unit 110a interprets a user input, that is, an input from the user in a natural language, and executes a process according to the user input, similarly to the agent unit 110 illustrated in FIG. 1.
  • the acquisition unit 112a, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118 shown in the agent unit 110a of FIG. 7 are examples of functional blocks of the agent unit 110a. That is, the information processing apparatus 10 includes the acquisition unit 112a, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118.
  • the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118 illustrated in FIG. 7 are the same as those illustrated in FIG. 1. Therefore, the following description focuses on the acquisition unit 112a.
  • when a plurality of windows are displayed on the display 162, the acquisition unit 112a specifies the content corresponding to the active window, that is, the window that receives a user input, from among the plurality of windows, and acquires content information on that content. For example, when the window displaying a map is in the active state among the plurality of windows displayed on the display 162, the acquisition unit 112a specifies the display of the map as the valid content and acquires content information on the valid content.
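  • a minimal sketch of this selection, building on the ContentInfo sketch from the first embodiment and using hypothetical names:

```python
from dataclasses import dataclass

@dataclass
class Window:
    content_info: ContentInfo   # ContentInfo as sketched earlier
    active: bool = False        # True if this window currently accepts user input

def valid_content(windows: list[Window]) -> ContentInfo | None:
    """Sketch of acquisition unit 112a: among a plurality of windows,
    the content of the active window is treated as the valid content."""
    for w in windows:
        if w.active:
            return w.content_info
    return None                 # no active window, so no valid content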
  • FIG. 8 is an explanatory diagram showing an example of the operation of the information processing apparatus 10 shown in FIG.
  • FIG. 8 shows an example of the operation of the information processing apparatus 10 when two windows WD (WD10 and WD20) are displayed on the display 162.
  • a movie is displayed in window WD10, and a mail is displayed in window WD20.
  • the dark shaded upper portion in the window WD in FIG. 8 indicates the active window WD for receiving a user's input.
  • in the state C1, the acquisition unit 112a specifies the window WD10 as the active window WD that receives a user input. Then, the acquisition unit 112a specifies the movie reproduced in the window WD10 as the valid content and acquires content information on the movie. In the state C2, the acquisition unit 112a specifies the window WD20 as the active window WD that receives a user input. Then, the acquisition unit 112a specifies the mail displayed in the window WD20 as the valid content and acquires content information on the mail.
  • as described above, in the second embodiment, the acquisition unit 112a identifies the content corresponding to the active window WD, which receives a user input among the plurality of windows WD, as the valid content. For this reason, even when a plurality of windows WD are displayed on the display 162, the information processing apparatus 10 can interpret the user input based on the content information on the valid content. Therefore, even when a plurality of windows WD are displayed on the display 162, the usability of the information processing apparatus 10 can be improved.
  • FIG. 9 is a block diagram showing the overall configuration of the information processing apparatus 10 according to the third embodiment of the present invention. Elements that are the same as or similar to the elements described with reference to FIGS. 1 to 8 are given the same reference numerals, and detailed descriptions thereof will be omitted.
  • the information processing apparatus 10 shown in FIG. 9 has the same configuration as the information processing apparatus 10 shown in FIG. 1 except that an output device 160A is provided instead of the output device 160 shown in FIG.
  • the information processing device 10 is realized by a computer system including the processing device 100, the storage device 140, the input device 150, the output device 160A, and the communication device 170.
  • a plurality of elements of the information processing device 10 are mutually connected by a single or a plurality of buses.
  • each of the plurality of elements of the information processing device 10 may be configured by a single device or a plurality of devices. Alternatively, some elements of the information processing device 10 may be omitted.
  • the output device 160A has the same configuration as the output device 160 shown in FIG. 1 except that the output device 160A has the vibration generating unit 168. That is, the output device 160A includes the display 162, the speaker 164, the light emitting unit 166, and the vibration generating unit 168.
  • the vibration generator 168 is, for example, a vibrator, and vibrates under the control of the processing device 100. Specifically, the processing device 100 vibrates the information processing device 10 by vibrating the vibration generating unit 168 according to the content of the response information. The processing device 100 may set the pattern of the vibration according to the content of the response information to a pattern different from the pattern of the vibration indicating the incoming call or the like.
  • the processing device 100 shown in FIG. 9 is the same as or similar to the processing device 100 shown in FIG. 1, except that a control program PRb is executed instead of the control program PR shown in FIG. 1.
  • the processing device 100 functions as the agent unit 110b, the display data generation unit 120, and the sound data generation unit 130 by reading and executing the control program PRb from the storage device 140.
  • the agent unit 110b interprets a user input, that is, an input from the user in a natural language, and executes a process according to the user input, similarly to the agent unit 110 illustrated in FIG. 1.
  • the acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, the response information generation unit 118, and the output mode determination unit 119 shown in the agent unit 110b of FIG. 9 are examples of functional blocks of the agent unit 110b. That is, the information processing apparatus 10 includes the acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, the response information generation unit 118, and the output mode determination unit 119.
  • the acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118 illustrated in FIG. 9 are the same as those illustrated in FIG. 1. Therefore, the following description focuses on the output mode determination unit 119, the display data generation unit 120, and the sound data generation unit 130.
  • the output mode determining unit 119 determines the output mode of the response information based on the content information on the valid content. For example, the output mode determination unit 119 selects an output mode of response information from output mode candidates including a plurality of output modes based on the content information.
  • the output mode candidates include, for example, an output mode in which the response information is output as an image, an output mode in which the response information is output as a sound, an output mode in which the response information is output as vibration, and an output mode in which light corresponding to the content of the response information is output.
  • the output mode of outputting the response information as an image may include, for example, an output mode of displaying the content of the response information in text and an output mode of displaying an icon corresponding to the content of the response information.
  • the output mode of outputting the response information as a sound includes, for example, an output mode in which a text indicating the content of the response information is read aloud, and an output mode in which musical elements that can identify the content of the response information, such as a melody, harmony, rhythm (or tempo), and timbre, are output.
  • when the output mode determination unit 119 determines that the response information is to be output as an image, the display data generation unit 120 generates display data such as a text or an icon indicating the content of the response information. Then, the display data generation unit 120 transfers the generated display data to the display 162.
  • when the output mode determination unit 119 determines that the response information is to be output as a sound, the sound data generation unit 130 generates sound data indicating the content of the response information.
  • the sound data is, for example, sound data that reads out text indicating the content of the response information, or sound data that includes a musical element capable of identifying the content of the response information.
  • the sound data generation unit 130 transfers the generated sound data to the speaker 164.
  • the function block of the agent unit 110b is not limited to the example shown in FIG.
  • the agent unit 110b may include the acquisition unit 112a illustrated in FIG. 7. Next, an example of the relationship between the content information and the output mode of the response information will be described with reference to FIG. 10.
  • FIG. 10 is an explanatory diagram illustrating an example of a relationship between content information and an output mode of response information. Note that the relationship between the content information and the output mode of the response information is not limited to the example illustrated in FIG. In FIG. 10, information indicated by one of the plurality of parameters is extracted and described. For example, when the type of content is a movie or a TV program, parameter information indicating the presence or absence of subtitles is described, and when the type of content is mail, parameter information indicating a window size is described.
  • for example, when the type of content is a movie or a TV program, text is selected as the output mode of the response information. By responding to the user input with a text display, the information processing device 10 can prevent the sound of the movie or the like from becoming difficult to hear.
  • when subtitles are present, the information processing apparatus 10 displays the content of the response information in a typeface different from that of the movie subtitles, so that it can easily be distinguished whether a text displayed on the display 162 indicates the content of the response information or a subtitle of the movie.
  • the information processing apparatus 10 also displays the content of the response information at a position on the display 162 that does not overlap the subtitles, thereby preventing the subtitles of the movie from becoming difficult to see.
  • when the type of content is a mail and the mail is displayed on the full screen, voice is selected as the output mode of the response information. By responding to the user input by voice, the information processing apparatus 10 can prevent the text and the like of the mail from becoming difficult to read. For example, if the output mode of the response information were text and the text indicating the content of the response information were displayed over the text of the mail, the text of the mail would become difficult to read.
  • when the type of content is a mail and the display is reduced, both voice and text are selected as the output modes of the response information.
  • the information processing apparatus 10 can reliably convey the contents of the response information to the user, as compared with the case of responding only with voice. Further, the information processing apparatus 10 displays the content of the response information in an area different from the display area of the mail on the display 162, so that it is possible to prevent the text of the mail from being difficult to read. For example, when the text indicating the content of the response information is displayed in the display area of the mail, if the text indicating the content of the response information is displayed over the text of the mail, the text of the mail becomes difficult to read.
  • when the type of content is a map and the map is displayed on the full screen, voice is selected as the output mode of the response information. If the output mode of the response information were text, the displayed map could become difficult to see.
  • when the type of content is a map and the display is reduced, both voice and text are selected as the output modes of the response information. In this case, the content of the response information can be conveyed to the user more reliably than when responding only by voice. Further, the information processing apparatus 10 displays the content of the response information in an area of the display 162 different from the display area of the map, thereby preventing the displayed map from becoming difficult to see.
  • when the type of content is an action game and the game is displayed on the full screen, voice is selected as the output mode of the response information. If the output mode of the response information were text, the game screen could become difficult to see.
  • when the type of content is an action game and the display is reduced, both voice and text are selected as the output modes of the response information. In this case, the content of the response information can be conveyed to the user more reliably than when responding only by voice. Further, the information processing apparatus 10 displays the content of the response information in an area of the display 162 different from the display area of the action game, thereby preventing the game screen and the like from becoming difficult to see.
  • when the type of content is a music game, text is selected as the output mode of the response information. By responding to the user input with a text display, the information processing apparatus 10 can prevent the sound of the game from becoming difficult to hear and can suppress interference with the progress of the music game. If the output mode of the response information were voice, the voice conveying the content of the response information would overlap the sound of the game, making both the content of the response information and the sound of the game difficult to hear.
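  • the relationships described above could be condensed into a selection function like the following sketch; the content-type strings and parameter keys are the hypothetical names used in the earlier sketches, and the rules are a reading of FIG. 10 rather than the patent's actual logic.

```python
def determine_output_mode(info: ContentInfo) -> set[str]:
    """Sketch of output mode determination unit 119: choose the output
    mode(s) of the response information from the content information."""
    full_screen = info.parameters.get("full_screen", False)
    if info.content_type in ("movie", "tv_program", "music_game"):
        return {"text"}   # a voice response would obscure the content's audio
    if info.content_type in ("mail", "map", "action_game"):
        # full screen: a text response would obscure the display, so use voice;
        # reduced display: use both voice and text, drawn in a free area
        return {"voice"} if full_screen else {"voice", "text"}
    return {"voice", "text"}  # default for content types not covered above
```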
  • FIG. 11 is a flowchart showing an example of the operation of the information processing apparatus 10 shown in FIG.
  • the operation illustrated in FIG. 11 is an example of a control method of the information processing device 10.
  • the operation illustrated in FIG. 11 is the same as or similar to the operation illustrated in FIG. 6 except that the process of step S132 is added to the operation illustrated in FIG. Therefore, in FIG. 11, the operation of the information processing apparatus 10 will be described focusing on the processing of step S132.
  • the process of step S132 is executed, for example, after the process of step S130 is executed.
  • in step S132, the processing device 100 functions as the output mode determination unit 119 and determines the output mode of the response information based on the content information acquired in step S120. For example, in steps S142, S150, and S170, the response information is output in the output mode determined in step S132. After the process of step S132 is executed, the process of step S140 is executed.
  • the operation of the information processing device 10 is not limited to the example illustrated in FIG.
  • the output mode determination unit 119 may determine the output mode of the response information in consideration of one or both of the content of the user input and the content of the response information, in addition to the content information. That is, the output mode determination unit 119 may determine the output mode of the response information based on the content information and the content of the user input, or based on the content information, the content of the user input, and the content of the response information. For example, when the content of the user input is a highly urgent request, or when the content of the response information is something the user should reliably notice (for example, highly urgent content), both text and voice may be selected as the output mode of the response information.
  • the information processing apparatus 10 may also convey the response information to the user by vibrating the vibration generation unit 168.
  • alternatively, the information processing apparatus 10 may convey the response information to the user by turning the light emitting unit 166, such as an LED, on or off, or by outputting a short sound from the speaker 164.
  • the information processing device 10 includes an output mode determination unit 119 that determines the output mode of the response information based on the content information.
  • the information processing apparatus 10 can change the output mode of the response information to the user input in accordance with the content information regarding the valid content. Therefore, as described with reference to FIG. 10, usability of the information processing apparatus 10 can be improved.
  • In the above description, among the plurality of windows WD displayed on the display 162, the content corresponding to the window WD in the active state is specified as the valid content.
  • However, the valid content is not limited to the content corresponding to the window WD in the active state.
  • For example, the acquisition unit 112 may specify, as the valid content, the content with the highest predetermined priority among the plurality of contents respectively corresponding to the plurality of windows WD.
  • For example, in the state C2 in FIG. 8, the acquisition unit 112a may specify the movie as the valid content instead of the mail in the active state (see the sketch below).
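The following is a minimal sketch contrasting the two selection policies just described: the acquisition unit 112 picking the content of the active window, and the acquisition unit 112a picking the content with the highest predetermined priority. The window records and the priority table are invented for illustration.

```python
# Hypothetical window records: each window has a content type and an active flag.
windows = [
    {"content": "mail", "active": True},
    {"content": "movie", "active": False},
]

# Invented priority table: a larger value means a higher priority.
PRIORITY = {"movie": 2, "mail": 1}

def valid_content_by_active_window(windows):
    # Policy of the acquisition unit 112: the content of the active window is valid.
    return next(w["content"] for w in windows if w["active"])

def valid_content_by_priority(windows):
    # Policy of the acquisition unit 112a: the highest-priority content is valid,
    # regardless of which window is active.
    return max(windows, key=lambda w: PRIORITY[w["content"]])["content"]

print(valid_content_by_active_window(windows))  # -> mail
print(valid_content_by_priority(windows))       # -> movie
```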
  • In the above description, the output devices 160 and 160A include the light emitting unit 166, and the light emitting unit 166 outputs light corresponding to the content of the response information.
  • However, the light emitting unit 166 may be omitted from the output devices 160 and 160A.
  • Further, the output device 160 may include the vibration generating unit 168 in a case where the output mode candidates include outputting the response information by vibration.
  • the information processing device 10 may include an auxiliary storage device.
  • The auxiliary storage device is a recording medium readable by the processing device 100, for example, an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disk (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory (for example, a card, a stick, or a key drive), a floppy (registered trademark) disk, or a magnetic strip.
  • the auxiliary storage device may be called a storage.
  • The storage device 140 is a recording medium readable by the processing device 100, such as a ROM or a RAM.
  • Suitable storage media also include a flexible disk, a magneto-optical disk (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory device (for example, a card, a stick, or a key drive), a CD-ROM (Compact Disc ROM), a register, a removable disk, a hard disk, a floppy (registered trademark) disk, a magnetic strip, a database, a server, and other appropriate storage media.
  • The program may be transmitted from a network via a telecommunication line.
  • LTE (Long Term Evolution)
  • LTE-A (LTE-Advanced)
  • SUPER 3G
  • IMT-Advanced
  • 4G (4th generation mobile communication system)
  • 5G (5th generation mobile communication system)
  • FRA (Future Radio Access)
  • NR (New Radio)
  • W-CDMA (registered trademark)
  • GSM (registered trademark)
  • CDMA2000
  • UMB (Ultra Mobile Broadband)
  • a system using IEEE 802.11 (Wi-Fi (registered trademark))
  • a system using IEEE 802.16 (WiMAX (registered trademark))
  • IEEE 802.20
  • UWB (Ultra-WideBand)
  • Bluetooth (registered trademark)
  • a plurality of systems may be combined (for example, a combination of at least one of LTE and LTE-A with 5G) and applied.
  • The signal may be a message.
  • Input and output information and the like may be stored in a specific place (for example, a memory) or may be managed using a management table. Input and output information and the like can be overwritten, updated, or appended. Output information and the like may be deleted. Input information and the like may be transmitted to another device.
  • The determination may be made based on a value represented by one bit (0 or 1), may be made based on a Boolean value (true or false), or may be made by comparing numerical values (for example, comparison with a predetermined value).
  • each function illustrated in FIGS. 1, 7 and 9 is realized by an arbitrary combination of at least one of hardware and software.
  • The method of implementing each functional block is not particularly limited. That is, each functional block may be realized using one device that is physically or logically coupled, or using two or more devices that are physically or logically separated and connected directly or indirectly (for example, by wire, wirelessly, or the like).
  • The functional block may be realized by combining the one device or the plurality of devices with software.
  • Software, regardless of whether it is called software, firmware, middleware, microcode, a hardware description language, or another name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, and the like.
  • Software, instructions, information, and the like may be transmitted and received via a transmission medium. For example, when software is transmitted from a website, a server, or another remote source using at least one of a wired technology (coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or the like) and a wireless technology (infrared, microwave, or the like), at least one of these wired and wireless technologies is included within the definition of a transmission medium.
  • The terms "system" and "network" are used interchangeably.
  • The information, parameters, and the like described in the present disclosure may be represented using absolute values, may be represented using relative values from a predetermined value, or may be represented using other corresponding information. The names used for the parameters described above are not limiting in any way. Further, equations and the like using these parameters may differ from those explicitly disclosed in the present disclosure.
  • The terms "connected" and "coupled", and any variations thereof, mean any direct or indirect connection or coupling between two or more elements, and may include the presence of one or more intermediate elements between two elements that are "connected" or "coupled" to each other.
  • the coupling or connection between the elements may be physical, logical, or a combination thereof.
  • "Connection" may be read as "access".
  • When used in the present disclosure, two elements can be considered to be "connected" or "coupled" to each other using at least one of one or more wires, cables, and printed electrical connections, and, as some non-limiting and non-exhaustive examples, using electromagnetic energy having wavelengths in the radio frequency, microwave, and optical (both visible and invisible) regions, and the like.
  • The terms "determining" and "deciding" as used in the present disclosure may encompass a wide variety of operations.
  • "Determining" and "deciding" may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up, searching, or inquiring (for example, looking up in a table, a database, or another data structure), and ascertaining as "determining" or "deciding".
  • "Determining" and "deciding" may include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), inputting, outputting, and accessing (for example, accessing data in a memory) as "determining" or "deciding".
  • "Determining" and "deciding" may include regarding resolving, selecting, choosing, establishing, comparing, and the like as "determining" or "deciding".
  • That is, "determining" and "deciding" may include regarding some operation as "determining" or "deciding".
  • "Determining (deciding)" may be read as "assuming", "expecting", "considering", or the like.
  • Each aspect/embodiment described in the present disclosure may be used alone, may be used in combination, or may be switched and used in accordance with execution. Further, the notification of predetermined information (for example, the notification of "X") is not limited to being performed explicitly, and may be performed implicitly (for example, by not performing the notification of the predetermined information).
  • DESCRIPTION OF SYMBOLS: 10: information processing apparatus; 100: processing apparatus; 110, 110a, 110b: agent unit; 112, 112a: acquisition unit; 114: interpretation unit; 116: control command issuing unit; 118: response information generation unit; 119: output mode determining unit; 120: display data generation unit; 130: sound data generation unit; 140: storage device; 150: input device; 152: microphone; 154: operation unit; 160, 160A: output device; 162: display; 164: speaker; 166: light emitting unit; 168: vibration generating unit; 170: communication device; WD10, WD20: window.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)
  • Machine Translation (AREA)

Abstract

An information processing device that comprises: an acquisition part that acquires content information about content; and an analysis part that, on the basis of the content information, analyzes user input that is made in natural language to an application that processes the content.

Description

Information processing device

The present invention relates to an information processing device.

There is known an information processing apparatus including a voice agent function that interprets a voice input, such as a voice command issued by a user, and executes a process instructed by voice. For example, a voice input processing device that enables the use of simplified voice commands has been proposed (for example, Patent Document 1). When this type of voice input processing device receives a simplified voice command, it refers to an operation history, which is a history of operation information associating at least a part of the content of the voice command with operation content, and issues predetermined commands for various controls.

Patent Document 1: JP-A-2017-146437

However, when an ambiguous instruction is issued by the user, the user's instruction may not be identifiable even by referring to the operation history. For this reason, an information processing apparatus employing a conventional voice agent function may be unable to execute the process intended by the user when it receives an ambiguous instruction from the user. Therefore, the usability of an information processing apparatus employing a conventional voice agent function or the like is not necessarily good.

In order to solve the above problems, an information processing apparatus according to a preferred aspect of the present invention includes an acquisition unit that acquires content information on content, and an interpretation unit that interprets a user input in a natural language to an application processing the content, based on the content information.

According to the present invention, the usability of the information processing device can be improved.
FIG. 1 is a block diagram illustrating the overall configuration of an information processing apparatus according to a first embodiment of the present invention.
FIG. 2 is an explanatory diagram illustrating an example of content information.
FIG. 3 is an explanatory diagram illustrating an example in which the interpretation of a user input is uniquely specified.
FIG. 4 is an explanatory diagram illustrating another example in which the interpretation of a user input is uniquely specified.
FIG. 5 is an explanatory diagram illustrating an example in which the interpretation of a user input is not uniquely specified.
FIG. 6 is a flowchart illustrating an example of the operation of the information processing apparatus illustrated in FIG. 1.
FIG. 7 is a block diagram illustrating the overall configuration of an information processing apparatus according to a second embodiment of the present invention.
FIG. 8 is an explanatory diagram illustrating an example of the operation of the information processing apparatus illustrated in FIG. 7.
FIG. 9 is a block diagram illustrating the overall configuration of an information processing apparatus according to a third embodiment of the present invention.
FIG. 10 is an explanatory diagram illustrating an example of the relationship between content information and the output mode of response information.
FIG. 11 is a flowchart illustrating an example of the operation of the information processing apparatus illustrated in FIG. 9.
[1. First Embodiment]
FIG. 1 is a block diagram illustrating the overall configuration of an information processing apparatus 10 according to the first embodiment of the present invention. In the following description, a smartphone is assumed as the information processing apparatus 10. However, any portable information processing apparatus can be adopted as the information processing apparatus 10; for example, it may be a notebook computer, a wearable terminal, a tablet terminal, or the like.
As illustrated in FIG. 1, the information processing apparatus 10 is realized by a computer system including a processing device 100, a storage device 140, an input device 150, an output device 160, and a communication device 170. The elements of the information processing apparatus 10 are mutually connected by a single bus or a plurality of buses. Note that the term "apparatus" in this specification may be replaced with another term such as a circuit, a device, or a unit. Each of the elements of the information processing apparatus 10 may be configured by a single device or by a plurality of devices. Alternatively, some elements of the information processing apparatus 10 may be omitted.
The processing device 100 is a processor that controls the entire information processing apparatus 10 and is configured by, for example, a single chip or a plurality of chips. The processing device 100 is configured by, for example, a central processing unit (CPU) including an interface with peripheral devices, an arithmetic unit, registers, and the like. Note that some or all of the functions of the processing device 100 may be realized by hardware such as a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The processing device 100 executes various types of processing in parallel or sequentially.
The processing device 100 functions as the agent unit 110 by, for example, reading out and executing the control program PR from the storage device 140. The agent unit 110 interprets a user input, which is an input from the user in a natural language, and executes processing according to the user input. The user input is, for example, an instruction or a question from the user in a natural language. The method of user input (the method by which the user inputs in a natural language) is not particularly limited as long as the information processing apparatus 10 can convert the content of the user input into text or the like and interpret it; for example, input by voice or by text is applicable.
The acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118 shown in the agent unit 110 of FIG. 1 are examples of functional blocks of the agent unit 110. That is, the information processing apparatus 10 includes the acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118.
The acquisition unit 112 acquires content information on content. For example, the acquisition unit 112 acquires content information on the content being processed by an application that is in a state of accepting user input. In the following, the content being processed by an application that is in a state of accepting user input is also referred to as valid content. For example, the acquisition unit 112 identifies the application being executed by the information processing apparatus 10 and identifies the valid content based on the name of the application, the file being processed by the application, or the like. The acquisition unit 112 then acquires content information on the valid content.
For example, when the user is watching a movie using the information processing apparatus 10, the acquisition unit 112 identifies the movie as the valid content. Also, for example, when the user is composing an outgoing mail message using the information processing apparatus 10, the acquisition unit 112 identifies the mail as the valid content. The acquisition unit 112 then acquires content information on the valid content. In this specification, because the user refers to mail, mail is treated as a type of content. The content information has one or more parameters determined according to the type of the content. For example, when the valid content is a TV (television) program, the acquisition unit 112 acquires a plurality of parameters including the title of the TV program (program information). An example of the content information is described with reference to FIG. 2.
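As a rough sketch of the behavior of the acquisition unit 112 described above, the following Python fragment identifies the valid content as the content handled by the application currently accepting user input. The structure of the running-application records and their field names are assumptions made for illustration only.

```python
def identify_valid_content(running_apps):
    """Sketch: the valid content is the content processed by the application
    that is in a state of accepting user input."""
    for app in running_apps:
        if app["accepts_input"]:
            return app["content_type"]
    return None

# Hypothetical state: the user is watching a movie while a mail app idles.
apps = [
    {"name": "mailer", "accepts_input": False, "content_type": "mail"},
    {"name": "player", "accepts_input": True, "content_type": "movie"},
]
print(identify_valid_content(apps))  # -> movie
```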
The interpretation unit 114 interprets a user input to the application processing the valid content, based on the content information. For example, the interpretation unit 114 interprets the content of the user input based on the parameters included in the content information. For example, when the valid content is a TV program and the title, which is one of the parameters related to the TV program, indicates a baseball broadcast, if the user asks "What about the other games?", the interpretation unit 114 interprets the user input "What about the other games?" as a search for the results or progress of other baseball games.
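A minimal sketch of this interpretation step might look as follows. The rule that maps the question to a search over other baseball games is a simplified stand-in for whatever natural-language understanding the interpretation unit 114 actually performs, and all key names are assumptions.

```python
def interpret(user_input, content_info):
    """Sketch: interpret a user input using parameters of the valid content.

    If a TV program whose title indicates a baseball broadcast is active,
    "What about the other games?" is read as a request to look up the
    results or progress of other baseball games.
    """
    if (content_info.get("type") == "tv_program"
            and "baseball" in content_info.get("title", "").lower()
            and "other games" in user_input.lower()):
        return ["search_other_baseball_games"]
    return []

info = {"type": "tv_program", "title": "Baseball live: Giants vs Tigers"}
print(interpret("What about the other games?", info))
# -> ['search_other_baseball_games']
```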
The control command issuing unit 116 issues a control command according to the user input, based on the interpretation result of the user input by the interpretation unit 114. For example, when the user input "What about the other games?" is interpreted by the interpretation unit 114 as a search for the results or progress of other baseball games, the control command issuing unit 116 issues a control command for searching for the results or progress of other baseball games from information included in data broadcasting or the like. By issuing this control command, the results or progress of other baseball games are retrieved, and the search result is acquired by the response information generation unit 118.
The response information generation unit 118 generates response information to the user input based on the interpretation result of the user input by the interpretation unit 114. The response information to the user input is, for example, information indicating that an instruction from the user has been accepted, information indicating the execution result of processing in response to an instruction from the user, information indicating an answer to a question from the user, or the like. For example, when the user input "What about the other games?" is interpreted by the interpretation unit 114 as a search for the results or progress of other baseball games, the response information generation unit 118 generates response information indicating the search result. As a result, for example, based on the response information, the results or progress of other baseball games are displayed as text on a display 162 described later.
When the interpretation result of the user input by the interpretation unit 114 includes a plurality of interpretations, the response information generation unit 118 generates response information for confirming which of the plurality of interpretations applies to the content of the user input. That is, when the interpretation of the user input is not uniquely specified, the response information generation unit 118 generates response information for asking the user about the content of the user input. An example in which the interpretation of the user input is not uniquely specified is described with reference to FIG. 5.
The storage device 140 is a recording medium readable by the processing device 100, and stores a plurality of programs including the control program PR executed by the processing device 100 and various data used by the processing device 100. The storage device 140 may be constituted by at least one of, for example, a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), and a RAM (Random Access Memory).
The input device 150 is an input device that accepts input from the outside. For example, the input device 150 includes a microphone 152 that accepts voice input operations and an operation unit 154 that accepts operations by the user. The input device 150 transfers the user input accepted by the microphone 152, the operation unit 154, or the like to the agent unit 110.
The microphone 152 accepts, for example, a user input such as an instruction or a question from the user by voice. The operation unit 154 is a device (for example, a keyboard, a mouse, switches, buttons, or the like) for inputting information used by the information processing apparatus 10 to the processing device 100, and accepts user inputs such as instructions or questions from the user. Specifically, the operation unit 154 accepts operations for inputting symbols such as numerals and characters to the processing device 100 and operations for selecting icons displayed on the display 162. For example, a touch panel that detects contact with the display surface of the display 162 is suitable as the operation unit 154. Note that the operation unit 154 may include a plurality of operators that can be operated by the user. The input device 150 may also include a sensor that detects a movement or the like of the information processing apparatus 10 itself.
The output device 160 is an output device that performs output to the outside. For example, the output device 160 includes a display 162, a speaker 164, and a light emitting unit 166. The display 162 is an example of a display device and displays various images under the control of the processing device 100. For example, the display 162 displays images such as text or icons indicating response information. As the display 162, various display panels such as a liquid crystal display panel and an organic EL (Electro Luminescence) display panel are suitably used.

The speaker 164 outputs various sounds under the control of the processing device 100. For example, the speaker 164 outputs sounds such as voice or music representing response information.

The light emitting unit 166 has a light emitting element such as an LED (Light Emitting Diode) and emits various kinds of light under the control of the processing device 100. For example, the processing device 100 turns on or blinks the light emitting unit 166 according to the content of the response information.

The communication device 170 is a device that communicates with other devices via a mobile communication network or a network such as the Internet. The communication device 170 is also referred to as, for example, a network device, a network controller, a network card, or a communication module. Next, an example of content information is described with reference to FIG. 2.
FIG. 2 is an explanatory diagram showing an example of content information. The content information has, for example, a content type and one or more parameters determined according to the content type. In the example shown in FIG. 2, a plurality of parameters are determined according to the type of content. By using a plurality of parameters to interpret the content of the user input, the user's instruction can be identified more efficiently than when one parameter is used, even when an ambiguous instruction is issued by the user.
When the type of content is a movie or a TV program, the parameters are, for example, pieces of information indicating the title of the movie or TV program, the presence or absence of subtitles, the window size, whether the audio is muted, whether earphones are connected, and so on. For example, when the parameter indicating the presence or absence of subtitles is set to a first value (for example, the value "1"), it indicates that subtitles are present, and when it is set to a second value different from the first value (for example, the value "0"), it indicates that there are no subtitles. The window size indicates, for example, whether the window displaying the movie or TV program is displayed full screen or reduced.

When the type of content is music, the parameters are, for example, pieces of information indicating the title of the song, whether lyrics are displayed, the window size, whether earphones are connected, and so on. The window size indicates, for example, whether the window displaying the operation buttons for controlling music playback and the like is displayed full screen or reduced.

When the type of content is mail, the parameters are, for example, pieces of information indicating the type of the window in the active state, the window size, whether the audio is muted, whether earphones are connected, and so on. The types of windows in the active state are, for example, a received mail list screen, a received mail display screen, an outgoing mail composition screen, and the like. The window size indicates, for example, whether the window in the active state is displayed full screen or reduced.

When the type of content is a map, the parameters are, for example, pieces of information indicating the display magnification of the map, the window size, whether the audio is muted, whether earphones are connected, and so on. The window size indicates, for example, whether the window displaying the map is displayed full screen or reduced.

When the type of content is an action game such as a battle game, the parameters are, for example, pieces of information indicating the title of the game, the type of game screen, the window size, whether the audio is muted, whether earphones are connected, and so on. The types of game screens are, for example, a fighting scene, an item selection scene, a screen displaying the game result, and the like. The window size indicates, for example, whether the window displaying the game is displayed full screen or reduced.

When the type of content is a music game, the parameters are, for example, pieces of information indicating the title of the game, the window size, whether earphones are connected, and so on. The window size indicates, for example, whether the window displaying the game is displayed full screen or reduced.

Note that the type of content and the parameters determined according to the type of content are not limited to the example shown in FIG. 2. For example, when the type of content is music, the parameter may be a single piece of information indicating whether earphones are connected. That is, the content information may include only one parameter.
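Collected as a data structure, the parameter sets described above might look like the following sketch. The key names are invented for illustration and merely mirror FIG. 2 as described in the text.

```python
# Illustrative mapping from content type to the parameters that make up its
# content information (cf. FIG. 2). Key names are assumptions, not
# identifiers taken from the patent.
CONTENT_PARAMETERS = {
    "movie":       ["title", "subtitles", "window_size", "muted", "earphones"],
    "tv_program":  ["title", "subtitles", "window_size", "muted", "earphones"],
    "music":       ["title", "lyrics_displayed", "window_size", "earphones"],
    "mail":        ["active_window_kind", "window_size", "muted", "earphones"],
    "map":         ["zoom_level", "window_size", "muted", "earphones"],
    "action_game": ["title", "screen_kind", "window_size", "muted", "earphones"],
    "music_game":  ["title", "window_size", "earphones"],
}
```

As the preceding paragraph notes, a type may also carry just one parameter; for music, the list could shrink to earphone connection alone.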
FIG. 3 is an explanatory diagram showing an example in which the interpretation of a user input is uniquely specified. FIG. 3 shows an example of the operation of the information processing apparatus 10 when the user instructs "Make it bigger" by voice while watching a movie using the information processing apparatus 10. When the valid content is a movie, the user input "Make it bigger" has two possible meanings: enlarging the screen and increasing the volume. In the example shown in FIG. 3, it is assumed that the parameters included in the content information on the movie indicate that the audio is muted, earphones are not connected, the display is reduced, and so on. Therefore, in the example shown in FIG. 3, the interpretation unit 114 uniquely interprets the user input "Make it bigger" as enlarging the screen.

For example, since the user is watching a movie using the information processing apparatus 10, the acquisition unit 112 identifies the movie as the valid content and acquires a plurality of parameters including the title of the movie (for example, the parameters shown in FIG. 2). The interpretation unit 114 then uniquely interprets the user input "Make it bigger" as enlarging the screen, based on the parameters and the like acquired by the acquisition unit 112.

Since the interpretation result of the interpretation unit 114 is to enlarge the screen, the response information generation unit 118 generates response information indicating that the screen will be enlarged, as the response information to the user input. As a result, for example, the text "Switching to full screen" is displayed on the display 162.

Further, since the interpretation result of the interpretation unit 114 is to enlarge the screen, the control command issuing unit 116 issues a control command for displaying the movie on the full screen. As a result, the movie is displayed on the entire screen of the display 162.
Note that the user may utter an instruction or the like after calling the information processing apparatus 10 with a predetermined word. For example, the input device 150 of the information processing apparatus 10 may accept, as the user input, the voice that follows the call with the predetermined word. In this case, the information processing apparatus 10 can easily determine whether the words uttered by the user are an input to the information processing apparatus 10 by detecting the presence or absence of the call with the predetermined word.
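A minimal sketch of this call-word handling follows, assuming a hypothetical wake word "Hello Agent"; the patent names no specific word.

```python
def extract_user_input(utterance, wake_word="Hello Agent"):
    """Sketch of accepting only speech that follows a predetermined call word.

    The wake word "Hello Agent" is invented for illustration.
    """
    text = utterance.strip()
    if text.startswith(wake_word):
        # Everything after the call word is treated as the user input.
        return text[len(wake_word):].strip() or None
    # Speech without the call word is not treated as input to the apparatus.
    return None

print(extract_user_input("Hello Agent make it bigger"))  # -> "make it bigger"
print(extract_user_input("make it bigger"))              # -> None
```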
The user input is not limited to voice input and may be, for example, text. For example, if the user enters the text "I want to go to Shibuya" via the operation unit 154 while watching a movie, the information processing apparatus 10 may search for a route from the current position to Shibuya and display the search result as text on the display 162. In FIG. 4 and subsequent figures as well, the operation of the information processing apparatus 10 is described taking the case where the user input is voice as an example, but the user input is not limited to voice input.
FIG. 4 is an explanatory diagram showing another example in which the interpretation of a user input is uniquely specified. FIG. 4 shows an example of the operation of the information processing apparatus 10 when the user instructs "Make it bigger" by voice while watching a movie using the information processing apparatus 10. In the example shown in FIG. 4, it is assumed that the parameters included in the content information on the movie indicate that the audio is output (not muted), earphones are connected, and the display is full screen. Therefore, in the example shown in FIG. 4, the interpretation unit 114 uniquely interprets the user input "Make it bigger" as increasing the volume.

For example, since the user is watching a movie using the information processing apparatus 10, the acquisition unit 112 identifies the movie as the valid content and acquires a plurality of parameters including the title of the movie (for example, the parameters shown in FIG. 2). The interpretation unit 114 then uniquely interprets the user input "Make it bigger" as increasing the volume, based on the parameters and the like acquired by the acquisition unit 112.

Since the interpretation result of the interpretation unit 114 is to increase the volume, the response information generation unit 118 generates response information indicating that the volume will be increased, as the response information to the user input. As a result, for example, the text "Increasing the volume" is displayed on the display 162.

Further, since the interpretation result of the interpretation unit 114 is to increase the volume, the control command issuing unit 116 issues a control command to increase the volume. As a result, the volume of the movie being played back by the information processing apparatus 10 increases.
Note that the response information in the case where the interpretation of the user input is uniquely specified is not limited to the examples shown in FIGS. 3 and 4. For example, the information processing apparatus 10 may display the text "Understood" on the display 162 as a response to the user input.

As described with reference to FIGS. 3 and 4, even when the content of the user input has a plurality of meanings, the information processing apparatus 10 can uniquely specify the interpretation of the user input based on the content information on the valid content. The usability of the information processing apparatus 10 can therefore be improved. An example in which the interpretation of the user input cannot be uniquely specified even by referring to the content information on the valid content is described with reference to FIG. 5.
FIG. 5 is an explanatory diagram showing an example in which the interpretation of a user input is not uniquely specified. FIG. 5 shows an example of the operation of the information processing apparatus 10 when the user instructs "Make it bigger" by voice while watching a movie using the information processing apparatus 10. In the example shown in FIG. 5, it is assumed that the parameters included in the content information on the movie indicate that the audio is output (not muted), earphones are connected, and the display is reduced. Therefore, in the example shown in FIG. 5, the interpretation unit 114 interprets the user input "Make it bigger" ambiguously, as either enlarging the screen or increasing the volume.

For example, since the user is watching a movie using the information processing apparatus 10, the acquisition unit 112 identifies the movie as the valid content and acquires a plurality of parameters including the title of the movie (for example, the parameters shown in FIG. 2). The interpretation unit 114 then interprets the user input "Make it bigger" ambiguously, as either enlarging the screen or increasing the volume, based on the parameters and the like acquired by the acquisition unit 112.

Since the interpretation by the interpretation unit 114 of the user input "Make it bigger" includes a plurality of interpretations (enlarging the screen and increasing the volume), the response information generation unit 118 generates response information asking the user which of the plurality of interpretations applies to the content of the user input. For example, the response information generation unit 118 generates response information for specifying the content of the user input, such as "Do you want to enlarge the screen or increase the volume?". As a result, the text "Do you want to enlarge the screen or increase the volume?" is displayed on the display 162.

In the example shown in FIG. 5, the user answers "The screen" by voice in response to the question "Do you want to enlarge the screen or increase the volume?". The interpretation unit 114 therefore uniquely interprets the first user input "Make it bigger" as enlarging the screen. The response information generation unit 118 then generates response information indicating that the user's instruction will be executed, and the control command issuing unit 116 issues a control command for displaying the movie on the full screen. As a result, for example, the text "Understood" is displayed on the display 162, and the movie is displayed on the entire screen of the display 162.

Even when the interpretation of the user input by the interpretation unit 114 includes a plurality of interpretations, the information processing apparatus 10 can specify the interpretation of the user input by using response information asking the user which of the plurality of interpretations applies to the content of the user input. The usability of the information processing apparatus 10 can therefore be improved.
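Putting the three examples together, a minimal sketch of this interpretation and clarification behavior might look as follows. The rules are a simplified reading of FIGS. 3 to 5 (enlarging only makes sense for a reduced window; raising the volume only when sound is audible), not the patent's actual logic, and all names are illustrative.

```python
def interpret_make_it_bigger(content_info):
    """Sketch of how the interpretation unit 114 might narrow down the user
    input "Make it bigger" for a movie, following FIGS. 3 to 5."""
    candidates = []
    # Enlarging the screen only makes sense if the window is reduced (FIG. 3).
    if content_info.get("window_size") == "reduced":
        candidates.append("enlarge_screen")
    # Increasing the volume only makes sense if sound is audible (FIG. 4).
    if not content_info.get("muted") and content_info.get("earphones"):
        candidates.append("increase_volume")
    return candidates

def respond(content_info):
    candidates = interpret_make_it_bigger(content_info)
    if len(candidates) == 1:
        return f"executing: {candidates[0]}"
    # Ambiguous case (FIG. 5): ask the user which interpretation applies.
    return "Do you want to enlarge the screen or increase the volume?"

# FIG. 3: muted, no earphones, reduced display -> enlarge the screen.
print(respond({"muted": True, "earphones": False, "window_size": "reduced"}))
# FIG. 5: sound on, earphones, reduced display -> ask a clarifying question.
print(respond({"muted": False, "earphones": True, "window_size": "reduced"}))
```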
FIG. 6 is a flowchart showing an example of the operation of the information processing apparatus shown in FIG. 1. The operation shown in FIG. 6 is an example of a control method of the information processing apparatus 10.

In step S100, the processing device 100 determines whether there is a user input. For example, the processing device 100 determines whether the input device 150 has accepted a user input. When there is a user input, that is, when the input device 150 has accepted a user input, the operation of the information processing apparatus 10 proceeds to step S110. On the other hand, when there is no user input, that is, when the input device 150 has not accepted a user input, the operation of the information processing apparatus 10 returns to step S100.

That is, the information processing apparatus 10 waits to execute the process of step S110 until the input device 150 accepts a user input. In other words, when the input device 150 accepts a user input (for example, a voice input such as "Make it bigger" in FIGS. 3, 4, and 5), the information processing apparatus 10 executes the process of step S110.
In step S110, the processing device 100 functions as the acquisition unit 112 and identifies the valid content. In the examples shown in FIGS. 3 to 5, since the user is watching a movie using the information processing apparatus 10, the acquisition unit 112 identifies the movie as the valid content. Note that, for example, when the user is viewing mail using the information processing apparatus 10, the mail is identified as the valid content, and when the user is viewing a map using the information processing apparatus 10, the map is identified as the valid content. Also, for example, when the user is playing an action game using the information processing apparatus 10, the action game is identified as the valid content, and when the user is playing a music game using the information processing apparatus 10, the music game is identified as the valid content.

Next, in step S120, the processing device 100 functions as the acquisition unit 112 and acquires content information including one or more parameters indicating the state of the valid content. For example, when the valid content identified in step S110 is a movie, the acquisition unit 112 acquires, as parameters to be included in the content information, parameters indicating the title of the movie, the presence or absence of subtitles, the window size, whether the audio is muted, whether earphones are connected, and so on.

Next, in step S130, the processing device 100 functions as the interpretation unit 114 and interprets the content of the user input based on the content information acquired in step S120. In the example shown in FIG. 3, the interpretation unit 114 uniquely interprets the user input "Make it bigger" as enlarging the screen, based on the parameters and the like included in the content information. In the example shown in FIG. 4, the interpretation unit 114 uniquely interprets the user input "Make it bigger" as increasing the volume, based on the parameters and the like included in the content information. In the example shown in FIG. 5, the interpretation unit 114 interprets the user input "Make it bigger" ambiguously, as either enlarging the screen or increasing the volume, based on the parameters and the like included in the content information.

Next, in step S140, the processing device 100 functions as the response information generation unit 118 and determines whether the content of the user input interpreted in step S130 is ambiguous. For example, the response information generation unit 118 determines whether the interpretation result of the user input by the interpretation unit 114 includes a plurality of interpretations.

In the examples shown in FIGS. 3 and 4, the interpretation result of the user input by the interpretation unit 114 indicates one interpretation, so the content of the user input is uniquely specified. Therefore, in the examples shown in FIGS. 3 and 4, the response information generation unit 118 determines that the content of the user input is not ambiguous. In the example shown in FIG. 5, the interpretation result of the user input by the interpretation unit 114 includes a plurality of interpretations, so the content of the user input is not uniquely specified. Therefore, in the example shown in FIG. 5, the response information generation unit 118 determines that the content of the user input is ambiguous.

Note that the determination as to whether the content of the user input interpreted in step S130 is ambiguous may be executed by a functional block other than the response information generation unit 118. For example, the interpretation unit 114 may determine whether the content of the user input interpreted in step S130 is ambiguous. When the content of the user input is ambiguous, that is, when the interpretation result of the user input by the interpretation unit 114 includes a plurality of interpretations, the operation of the information processing apparatus 10 proceeds to step S142. On the other hand, when the content of the user input is not ambiguous, that is, when the content of the user input is uniquely specified, the operation of the information processing apparatus 10 proceeds to step S150.
 ステップS142では、処理装置100は、応答情報生成部118として機能し、ユーザ入力の内容をユーザに尋ねる応答情報をステップS130の解釈結果に基づいて生成する。そして、情報処理装置10は、生成した応答情報を出力する。図5に示した例では、応答情報生成部118は、ユーザ入力である「大きくして」の解釈結果(画面を大きくすること及び音量を大きくすることの2通りの解釈)に基づいて、「大きくするのは画面ですか?音量ですか?」等のユーザ入力の内容をユーザに尋ねる応答情報を生成する。そして、情報処理装置10は、「大きくするのは画面ですか?音量ですか?」と記載したテキストをディスプレイ162に表示する。 In step S142, the processing device 100 functions as the response information generation unit 118, and generates response information for asking the user about the contents of the user input based on the interpretation result in step S130. Then, the information processing device 10 outputs the generated response information. In the example illustrated in FIG. 5, the response information generation unit 118 uses the interpretation result of “increase” as the user input (two interpretations of increasing the screen and increasing the volume). Response information for asking the user about the contents of the user input such as "Is the screen to be increased or the volume?" Then, the information processing apparatus 10 displays a text stating “Is the screen to be increased or the volume?” On the display 162.
 次に、ステップS144では、処理装置100は、解釈部114として機能し、ユーザ入力の内容の解釈をステップS142で出力した応答情報に対する回答に基づいて決定する。図5に示した例では、解釈部114は、「大きくするのは画面ですか?音量ですか?」の問いに対して、「画面」との回答をユーザから受けたため、最初のユーザ入力である「大きくして」の内容の解釈を、画面を大きくすることに決定する。ステップS144の処理が実行された後、情報処理装置10の動作は、ステップS150に移る。 Next, in step S144, the processing device 100 functions as the interpretation unit 114, and determines the interpretation of the content of the user input based on the response to the response information output in step S142. In the example illustrated in FIG. 5, the interpreting unit 114 receives the answer “Screen” from the user in response to the question “Do you want to increase the screen or volume?” The interpretation of the content of "enlarge" is determined to enlarge the screen. After the process of step S144 is performed, the operation of the information processing device 10 proceeds to step S150.
 ステップS150では、処理装置100は、応答情報生成部118として機能し、ユーザ入力に対する応答情報を、ユーザ入力の内容の解釈結果に基づいて生成する。そして、情報処理装置10は、生成した応答情報を出力する。例えば、ステップS130において解釈したユーザ入力の内容が多義的である場合、応答情報生成部118は、ステップS144で決定したユーザ入力の内容の解釈に基づいて、応答情報を生成する。また、例えば、ステップS130において解釈したユーザ入力の内容が一意的である場合、応答情報生成部118は、ステップS130において解釈したユーザ入力の内容に応じて、応答情報を生成する。 In step S150, the processing device 100 functions as the response information generation unit 118, and generates response information to the user input based on the interpretation result of the content of the user input. Then, the information processing device 10 outputs the generated response information. For example, if the content of the user input interpreted in step S130 is ambiguous, the response information generation unit 118 generates response information based on the interpretation of the content of the user input determined in step S144. Further, for example, when the content of the user input interpreted in step S130 is unique, the response information generation unit 118 generates response information according to the content of the user input interpreted in step S130.
 In the example illustrated in FIG. 3, the user input "make it bigger" is uniquely interpreted as enlarging the screen, so the response information generation unit 118 generates response information indicating that the screen will be enlarged as the response information to the user input. Based on the generated response information, the information processing apparatus 10 then displays the text "Switching to full screen" on the display 162. In the example illustrated in FIG. 4, the user input "make it bigger" is uniquely interpreted as raising the volume, so the response information generation unit 118 generates response information indicating that the volume will be raised as the response information to the user input. Based on the generated response information, the information processing apparatus 10 then displays the text "Turning up the volume" on the display 162.
 In the example illustrated in FIG. 5, the interpretation of the user input "make it bigger" is determined to be enlarging the screen based on the answer to the response information that asked the user about the content of the user input, so the response information generation unit 118 generates response information indicating that the user's instruction will be executed. Based on the generated response information, the information processing apparatus 10 then displays the text "Understood" on the display 162.
 Next, in step S160, the processing device 100 functions as the control command issuing unit 116 and generates a control command corresponding to the user input based on the interpretation result of the content of the user input. In the examples illustrated in FIGS. 3 and 5, the content of the user input "make it bigger" is interpreted as enlarging the screen, so the control command issuing unit 116 issues a control command that displays the movie in full screen. As a result, the movie is displayed on the entire screen of the display 162. In the example illustrated in FIG. 4, the content of the user input "make it bigger" is interpreted as raising the volume, so the control command issuing unit 116 issues a control command that raises the volume. As a result, the volume of the movie being played by the information processing apparatus 10 increases.
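 As a sketch of step S160 under the same hypothetical naming (the publication does not specify any command format), the control command issuing unit can be modeled as a lookup from the uniquely determined interpretation to a concrete command:

```python
# Hypothetical command table; entries and argument names are assumptions.
COMMANDS = {
    "enlarge_screen": {"command": "set_fullscreen", "args": {"enabled": True}},
    "raise_volume": {"command": "adjust_volume", "args": {"delta": +10}},
}

def issue_control_command(action: str) -> dict:
    # Step S160: issue the command that realizes the resolved interpretation.
    return COMMANDS[action]

assert issue_control_command("enlarge_screen")["command"] == "set_fullscreen"
```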
 Next, in step S170, the processing device 100 functions as the response information generation unit 118 and generates response information to the user input based on the execution result of the control command issued in step S160. The information processing apparatus 10 then outputs the generated response information. In the examples illustrated in FIGS. 3, 4, and 5, the response to the user input ends with the execution of the control command issued in step S160. However, the end of the response to the user input is not limited to the execution of the control command issued in step S160. For example, when the content of the user input is a route search to a destination, the response to the user input ends when the result of the route search is output.
 For example, when the content of the user input is a route search to a destination, a control command for executing the route search to the destination is issued in step S160, so the response information generation unit 118 generates response information indicating the route to the destination based on the result of the route search. The information processing apparatus 10 then outputs the route to the destination as text, voice, or both, and ends the response to the user input.
 Note that the operation of the information processing apparatus 10 is not limited to the example illustrated in FIG. 6. For example, if the interpretation of the content of the user input is not determined in step S144, the series of processes in steps S142 and S144 may be repeated until the interpretation of the content of the user input is determined. Also, for example, one of the processes of steps S150 and S170 may be omitted according to the content of the user input.
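 The repetition of steps S142 and S144 mentioned above can be sketched as a loop that asks again until exactly one reading survives. The `ask` callback and the turn budget below are assumptions standing in for the question-and-answer round trip, which the publication leaves unspecified:

```python
from typing import Callable, Optional

def clarify_until_resolved(candidates: list[dict],
                           ask: Callable[[str], str],
                           max_turns: int = 3) -> Optional[dict]:
    # Repeat the S142/S144 exchange until one interpretation remains
    # or the turn budget runs out.
    while len(candidates) > 1 and max_turns > 0:
        question = ("Is it the "
                    + " or the ".join(c["target"] for c in candidates)
                    + " that you want to increase?")
        answer = ask(question)
        filtered = [c for c in candidates if c["target"] in answer.lower()]
        candidates = filtered or candidates  # an unhelpful answer keeps all
        max_turns -= 1
    return candidates[0] if len(candidates) == 1 else None

resolved = clarify_until_resolved(
    [{"target": "screen"}, {"target": "volume"}],
    ask=lambda q: "the screen")  # a canned answer stands in for the user
assert resolved == {"target": "screen"}
```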
 As described above, in the first embodiment, the information processing apparatus 10 includes the acquisition unit 112, which acquires content information about content, and the interpretation unit 114, which interprets user input (natural-language input by the user) to an application that processes the content, based on the content information. The information processing apparatus 10 interprets the content of the user input based on content information about the valid content. Therefore, for example, when an ambiguous instruction is issued by the user, it is possible to reduce cases in which the user's instruction cannot be identified and cases in which a process different from the user's intention is executed. As a result, the usability of the information processing apparatus 10 can be improved.
 The information processing apparatus 10 also includes the control command issuing unit 116, which issues a control command corresponding to the user input based on the result of interpretation of the user input by the interpretation unit 114. For example, even when an ambiguous instruction is issued by the user, the user's instruction is uniquely interpreted by the interpretation unit 114 based on the content information, which reduces cases in which a control command for a process different from the user's intention is issued.
 The information processing apparatus 10 also includes the response information generation unit 118, which generates response information to the user input based on the result of interpretation of the user input by the interpretation unit 114. For example, even when an ambiguous instruction is issued by the user, the user's instruction is uniquely interpreted by the interpretation unit 114 based on the content information, which reduces cases in which response information is generated for an instruction different from the user's intention.
 Furthermore, when the result of interpretation of the user input by the interpretation unit 114 includes a plurality of interpretations, the response information generation unit 118 generates response information that asks the user which of the plurality of interpretations applies to the content of the user input. For example, when the content of the user input cannot be uniquely identified even using content information about the valid content, the information processing apparatus 10 can identify the content of the user input by using response information that asks the user about it.
[2. Second Embodiment]
 The main difference between the second embodiment and the above-described first embodiment is that the agent unit 110a shown in FIG. 7 has an acquisition unit 112a instead of the acquisition unit 112 shown in FIG. 1.
 FIG. 7 is a block diagram showing the overall configuration of the information processing apparatus 10 according to the second embodiment of the present invention. Elements that are the same as or similar to the elements described in FIGS. 1 to 6 are denoted by the same reference numerals, and detailed description thereof is omitted.
 The information processing apparatus 10 shown in FIG. 7 has the same configuration as the information processing apparatus 10 shown in FIG. 1. For example, the information processing apparatus 10 is realized by a computer system including a processing device 100, a storage device 140, an input device 150, an output device 160, and a communication device 170. The elements of the information processing apparatus 10 are interconnected by one or more buses. Each of the elements of the information processing apparatus 10 may be configured by one or more devices, and some elements of the information processing apparatus 10 may be omitted.
 The processing device 100 shown in FIG. 7 is the same as or similar to the processing device 100 shown in FIG. 1 except that it executes a control program PRa instead of the control program PR shown in FIG. 1. For example, the processing device 100 functions as the agent unit 110a by reading the control program PRa from the storage device 140 and executing it.
 Like the agent unit 110 shown in FIG. 1, the agent unit 110a interprets user input, which is the user's input in a natural language, and executes processing according to the user input. The acquisition unit 112a, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118 shown in the agent unit 110a of FIG. 7 are examples of functional blocks of the agent unit 110a. That is, the information processing apparatus 10 includes the acquisition unit 112a, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118. The interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118 shown in FIG. 7 are the same as those shown in FIG. 1. The following description therefore focuses on the acquisition unit 112a.
 When a plurality of windows are displayed on the display 162, the acquisition unit 112a identifies, among the plurality of windows, the content corresponding to the active window, that is, the window that accepts the user's input. The acquisition unit 112a then acquires content information about the content corresponding to the active window. For example, when, among the plurality of windows displayed on the display 162, the window displaying a map is active, the acquisition unit 112a identifies the display of the map as the valid content. The acquisition unit 112a then acquires content information about the valid content.
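 A rough Python sketch of this behavior of the acquisition unit 112a (the window data layout is an assumption; the publication defines no window API) picks the content information of whichever window is currently active:

```python
from typing import Optional

def content_info_for_active_window(windows: list[dict]) -> Optional[dict]:
    # Return the content information of the window that currently
    # accepts user input; None if no window is active.
    for window in windows:
        if window["active"]:
            return window["content_info"]
    return None

windows = [
    {"id": "WD10", "active": True,
     "content_info": {"type": "movie", "subtitles": False}},
    {"id": "WD20", "active": False,
     "content_info": {"type": "mail", "size": "reduced"}},
]
assert content_info_for_active_window(windows)["type"] == "movie"
```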
 FIG. 8 is an explanatory diagram showing an example of the operation of the information processing apparatus 10 shown in FIG. 7. FIG. 8 shows an example of the operation of the information processing apparatus 10 when two windows WD (WD10 and WD20) are displayed on the display 162. A movie is displayed in window WD10, and a mail is displayed in window WD20. The dark shading at the top of a window WD in FIG. 8 indicates the active window WD, which accepts the user's input.
 In state C1, the acquisition unit 112a identifies window WD10 as the active window WD that accepts the user's input. The acquisition unit 112a then identifies the movie being played in window WD10 as the valid content and accordingly acquires content information about the movie. In state C2, the acquisition unit 112a identifies window WD20 as the active window WD that accepts the user's input. The acquisition unit 112a then identifies the mail displayed in window WD20 as the valid content and accordingly acquires content information about the mail.
 As described above, the second embodiment also provides the same effects as the first embodiment. In addition, in the second embodiment, when a plurality of windows WD are displayed on the display 162, the acquisition unit 112a identifies, among the plurality of windows WD, the content corresponding to the active window WD that accepts the user's input as the valid content. Therefore, even when a plurality of windows WD are displayed on the display 162, the information processing apparatus 10 can interpret the user input based on content information about the valid content, and the usability of the information processing apparatus 10 can be improved.
[3. Third Embodiment]
 The main difference between the third embodiment and the above-described first embodiment is that the output mode of the response information is determined based on content information about the valid content.
 FIG. 9 is a block diagram showing the overall configuration of the information processing apparatus 10 according to the third embodiment of the present invention. Elements that are the same as or similar to the elements described in FIGS. 1 to 8 are denoted by the same reference numerals, and detailed description thereof is omitted.
 The information processing apparatus 10 shown in FIG. 9 has the same configuration as the information processing apparatus 10 shown in FIG. 1 except that it has an output device 160A instead of the output device 160 shown in FIG. 1. For example, the information processing apparatus 10 is realized by a computer system including the processing device 100, the storage device 140, the input device 150, the output device 160A, and the communication device 170. The elements of the information processing apparatus 10 are interconnected by one or more buses. Each of the elements of the information processing apparatus 10 may be configured by one or more devices, and some elements of the information processing apparatus 10 may be omitted.
 The output device 160A has the same configuration as the output device 160 shown in FIG. 1 except that it has a vibration generation unit 168. That is, the output device 160A includes the display 162, the speaker 164, the light emitting unit 166, and the vibration generation unit 168. The vibration generation unit 168 is, for example, a vibrator, and vibrates under the control of the processing device 100. Specifically, the processing device 100 vibrates the information processing apparatus 10 by vibrating the vibration generation unit 168 according to the content of the response information. The processing device 100 may make the vibration pattern corresponding to the content of the response information different from the vibration pattern that notifies of an incoming call and the like.
 The processing device 100 shown in FIG. 9 is the same as or similar to the processing device 100 shown in FIG. 1 except that it executes a control program PRb instead of the control program PR shown in FIG. 1. For example, the processing device 100 functions as the agent unit 110b, the display data generation unit 120, and the sound data generation unit 130 by reading the control program PRb from the storage device 140 and executing it.
 Like the agent unit 110 shown in FIG. 1, the agent unit 110b interprets user input, which is the user's input in a natural language, and executes processing according to the user input. The acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, the response information generation unit 118, and the output mode determination unit 119 shown in the agent unit 110b of FIG. 9 are examples of functional blocks of the agent unit 110b. That is, the information processing apparatus 10 includes the acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, the response information generation unit 118, and the output mode determination unit 119. The acquisition unit 112, the interpretation unit 114, the control command issuing unit 116, and the response information generation unit 118 shown in FIG. 9 are the same as those shown in FIG. 1. The following description therefore focuses on the output mode determination unit 119, the display data generation unit 120, and the sound data generation unit 130.
 The output mode determination unit 119 determines the output mode of the response information based on content information about the valid content. For example, the output mode determination unit 119 selects the output mode of the response information from output mode candidates including a plurality of output modes, based on the content information. The output mode candidates include, for example, two or more of the following: an output mode in which the response information is output as an image, an output mode in which the response information is output as sound, an output mode in which the response information is output as vibration, and an output mode in which light corresponding to the content of the response information is output.
 The output mode in which the response information is output as an image may include, for example, an output mode in which the content of the response information is displayed as text and an output mode in which an icon corresponding to the content of the response information is displayed. The output mode in which the response information is output as sound may include, for example, an output mode in which text indicating the content of the response information is read aloud and an output mode in which the content of the response information is output using musical elements, such as melody, harmony, rhythm (or tempo), and timbre, that make the content identifiable.
 When the output mode determination unit 119 determines that the output mode of the response information is the mode in which the response information is output as an image, the display data generation unit 120 generates display data, such as text or an icon, indicating the content of the response information. The display data generation unit 120 then transfers the generated display data to the display 162.
 When the output mode determination unit 119 determines that the output mode of the response information is the mode in which the response information is output as sound, the sound data generation unit 130 generates sound data indicating the content of the response information. The sound data is, for example, sound data that reads aloud text indicating the content of the response information, or sound data including musical elements that make the content of the response information identifiable. The sound data generation unit 130 transfers the generated sound data to the speaker 164.
 The functional blocks of the agent unit 110b are not limited to the example shown in FIG. 9. For example, the agent unit 110b may include the acquisition unit 112a shown in FIG. 7 instead of the acquisition unit 112. Next, an example of the relationship between the content information and the output mode of the response information will be described with reference to FIG. 10.
 FIG. 10 is an explanatory diagram showing an example of the relationship between the content information and the output mode of the response information. The relationship between the content information and the output mode of the response information is not limited to the example shown in FIG. 10. FIG. 10 shows an excerpt of the information indicated by one of the plurality of parameters. For example, when the type of content is a movie or a TV program, FIG. 10 shows the information of the parameter indicating the presence or absence of subtitles, and when the type of content is mail, it shows the information of the parameter indicating the window size.
 For example, when the type of content is a movie or a TV program and there are no subtitles, text is selected as the output mode of the response information. By responding to the user input with a text display, the information processing apparatus 10 can prevent the audio of the movie or the like from becoming difficult to hear.
 When the type of content is a movie or a TV program and there are subtitles, text in a typeface different from that of the subtitles is selected as the output mode of the response information. By displaying the content of the response information in a typeface different from that of the movie's subtitles, the information processing apparatus 10 makes it easy to distinguish whether text displayed on the display 162 indicates the content of the response information or is a subtitle of the movie. In addition, by displaying the content of the response information at a position on the display 162 that does not overlap the subtitles, the information processing apparatus 10 can prevent the subtitles of the movie from becoming difficult to see.
 When the type of content is mail displayed in full screen, voice is selected as the output mode of the response information. By responding to the user input by voice, the information processing apparatus 10 can prevent the text of the mail and the like from becoming difficult to read. For example, if the output mode of the response information were text and the text indicating the content of the response information were displayed over the text of the mail, the text of the mail would become difficult to read.
 When the type of content is mail displayed in a reduced size, both voice and text are selected as the output modes of the response information. By responding to the user input with both voice and text, the information processing apparatus 10 can convey the content of the response information to the user more reliably than when responding by voice alone. In addition, by displaying the content of the response information in an area of the display 162 different from the display area of the mail, the information processing apparatus 10 can prevent the text of the mail from becoming difficult to read. For example, if the text indicating the content of the response information were displayed in the display area of the mail, overlapping the text of the mail, the text of the mail would become difficult to read.
 When the type of content is a map displayed in full screen, voice is selected as the output mode of the response information. In this case, the displayed map can be prevented from becoming difficult to see. For example, if the output mode of the response information were text and the text indicating the content of the response information were displayed over the map, the displayed map would become difficult to see. When the type of content is a map displayed in a reduced size, both voice and text are selected as the output modes of the response information. In this case, the content of the response information can be conveyed to the user more reliably than when responding by voice alone. By displaying the content of the response information in an area of the display 162 different from the display area of the map, the information processing apparatus 10 can prevent the displayed map from becoming difficult to see.
 When the type of content is an action game displayed in full screen, voice is selected as the output mode of the response information. In this case, the game screen and the like can be prevented from becoming difficult to see, and the progress of the action game can be kept unhindered. For example, if the output mode of the response information were text and the text indicating the content of the response information were displayed over the game screen, the game screen would become difficult to see. When the type of content is an action game displayed in a reduced size, both voice and text are selected as the output modes of the response information. In this case, the content of the response information can be conveyed to the user more reliably than when responding by voice alone. By displaying the content of the response information in an area of the display 162 different from the display area of the action game, the information processing apparatus 10 can prevent the game screen and the like from becoming difficult to see.
 When the type of content is a music game, text is selected as the output mode of the response information for both full-screen and reduced display. By responding to the user input with text, the information processing apparatus 10 can prevent the sounds of the game from becoming difficult to hear and can keep the progress of the music game unhindered. For example, if the output mode of the response information were voice and the voice conveying the content of the response information overlapped the sounds of the game, both the content of the response information and the sounds of the game would become difficult to hear.
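 The selection rules of FIG. 10 can be paraphrased as a small lookup table keyed by the content type and the excerpted parameter (subtitle presence or window size). The keys and mode names below are hypothetical stand-ins for whatever internal representation the apparatus uses:

```python
# (content type, excerpted parameter) -> output mode(s), per FIG. 10.
OUTPUT_MODE_RULES = {
    ("movie_or_tv", "no_subtitles"): ["text"],
    ("movie_or_tv", "subtitles"): ["text_in_distinct_typeface"],
    ("mail", "fullscreen"): ["voice"],
    ("mail", "reduced"): ["voice", "text"],
    ("map", "fullscreen"): ["voice"],
    ("map", "reduced"): ["voice", "text"],
    ("action_game", "fullscreen"): ["voice"],
    ("action_game", "reduced"): ["voice", "text"],
    ("music_game", "fullscreen"): ["text"],
    ("music_game", "reduced"): ["text"],
}

def decide_output_mode(content_type: str, parameter: str) -> list[str]:
    # Step S132: select the output mode from the candidates based on
    # the content information.
    return OUTPUT_MODE_RULES[(content_type, parameter)]

assert decide_output_mode("map", "fullscreen") == ["voice"]
assert decide_output_mode("mail", "reduced") == ["voice", "text"]
```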
 FIG. 11 is a flowchart showing an example of the operation of the information processing apparatus 10 shown in FIG. 9. The operation shown in FIG. 11 is an example of a control method of the information processing apparatus 10. The operation shown in FIG. 11 is the same as or similar to the operation shown in FIG. 6 except that the process of step S132 is added. The following description of the operation of the information processing apparatus 10 therefore focuses on the process of step S132. The process of step S132 is executed, for example, after the process of step S130 is executed.
 In step S132, the processing device 100 functions as the output mode determination unit 119 and determines the output mode of the response information based on the content information acquired in step S120. For example, in steps S142, S150, and S170, the response information is output in the output mode determined in step S132. After the process of step S132 is executed, the process of step S140 is executed.
 The operation of the information processing apparatus 10 is not limited to the example shown in FIG. 11. For example, the process of step S132 may be executed before the process of step S130 as long as it is executed after the process of step S120. Also, for example, the output mode determination unit 119 may determine the output mode of the response information in consideration of the content of the user input, the content of the response information, or both, in addition to the content information. That is, the output mode determination unit 119 may determine the output mode of the response information based on the content information and the content of the user input, or based on the content information, the content of the user input, and the content of the response information. For example, when the content of the user input is a highly urgent request or the like, or when the content of the response information is something the user should reliably notice (for example, highly urgent content), both text and voice may be selected as the output modes of the response information.
 Also, for example, when the response information conveys simple content, such as an acknowledgment of the user's instruction, the information processing apparatus 10 may convey the response information to the user by vibrating the vibration generation unit 168. Alternatively, when the response information conveys simple content, the information processing apparatus 10 may convey the response information to the user by lighting or blinking the light emitting unit 166, such as an LED, or by outputting a short sound from the speaker 164.
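 These considerations can be layered on top of the table-driven choice above. The following sketch (the flags are assumptions, not disclosed parameters) forces text plus voice for urgent content and downgrades a simple acknowledgment to a vibration, an LED blink, or a short sound:

```python
def apply_overrides(base_modes: list[str], urgent: bool = False,
                    simple_ack: bool = False) -> list[str]:
    # Urgent requests or responses are conveyed by both text and voice;
    # a simple acknowledgment may be signaled without either of them.
    if urgent:
        return ["text", "voice"]
    if simple_ack:
        return ["vibration"]  # alternatively ["led"] or ["short_sound"]
    return base_modes

assert apply_overrides(["voice"], urgent=True) == ["text", "voice"]
assert apply_overrides(["text"], simple_ack=True) == ["vibration"]
```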
 As described above, the third embodiment also provides the same effects as the first embodiment. In addition, in the third embodiment, the information processing apparatus 10 includes the output mode determination unit 119, which determines the output mode of the response information based on the content information. For example, the information processing apparatus 10 can change the output mode of the response information to the user input according to content information about the valid content. Therefore, as described with reference to FIG. 10, the usability of the information processing apparatus 10 can be improved.
[4. Modifications]
 The present invention is not limited to the embodiments exemplified above. Specific modifications are exemplified below. Two or more aspects arbitrarily selected from the following examples may be combined.
[First Modification]
 In the above-described second embodiment, among the plurality of windows WD displayed on the display 162, the content corresponding to the active window WD is identified as the valid content; however, the valid content is not limited to the content corresponding to the active window WD. For example, when no operation is performed on the active window WD for a predetermined time or longer, the acquisition unit 112a may identify, among the plurality of contents respectively corresponding to the plurality of windows WD, the content with the highest predetermined priority as the valid content. For example, in state C2 of FIG. 8, suppose that after creating and sending an outgoing mail, the user continues watching the movie displayed in window WD10 without operating the mail for the predetermined time or longer. If the priority of the movie is higher than the priority of the mail, the acquisition unit 112a may identify the movie, rather than the mail in the active window, as the valid content.
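 A sketch of this fallback, under the assumption of a per-window priority value and a last-operation timestamp (neither of which the publication specifies concretely):

```python
import time
from typing import Optional

def effective_content(windows: list[dict], idle_threshold_s: float = 60.0,
                      now: Optional[float] = None) -> str:
    # If the active window has been idle for the predetermined time,
    # fall back to the highest-priority content among all windows.
    now = time.time() if now is None else now
    active = next(w for w in windows if w["active"])
    if now - active["last_operated"] < idle_threshold_s:
        return active["content"]
    return max(windows, key=lambda w: w["priority"])["content"]

windows = [
    {"content": "movie", "active": False, "priority": 2, "last_operated": 0.0},
    {"content": "mail", "active": True, "priority": 1, "last_operated": 0.0},
]
# The mail window has been idle past the threshold; the movie wins:
assert effective_content(windows, idle_threshold_s=60.0, now=120.0) == "movie"
```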
[Second Modification]
 In each of the above-described first to third embodiments, the output devices 160 and 160A have the light emitting unit 166; however, when, for example, the output mode of outputting light corresponding to the content of the response information from the light emitting unit 166 is not included in the output mode candidates, the light emitting unit 166 may be omitted from the output devices 160 and 160A. The output device 160 may also include the vibration generation unit 168 when, for example, the output mode of outputting the response information as vibration is included in the output mode candidates.
[Third Modification]
 The information processing apparatus 10 may include an auxiliary storage device. The auxiliary storage device is a recording medium readable by the processing device 100 and may be configured by at least one of, for example, an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disk (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory (for example, a card, a stick, or a key drive), a floppy (registered trademark) disk, and a magnetic strip. The auxiliary storage device may be called storage.
[5. Others]
 (1) In the above-described embodiments, the storage device 140 is a recording medium readable by the processing device 100, exemplified by a ROM and a RAM, but it may also be a flexible disk, a magneto-optical disk (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory device (for example, a card, a stick, or a key drive), a CD-ROM (Compact Disc ROM), a register, a removable disk, a hard disk, a floppy (registered trademark) disk, a magnetic strip, a database, a server, or another suitable storage medium. The program may also be transmitted from a network or a communication network via a telecommunication line.
 (2) The above-described embodiments may be applied to at least one of systems using LTE (Long Term Evolution), LTE-A (LTE-Advanced), SUPER 3G, IMT-Advanced, 4G (4th generation mobile communication system), 5G (5th generation mobile communication system), FRA (Future Radio Access), NR (New Radio), W-CDMA (registered trademark), GSM (registered trademark), CDMA2000, UMB (Ultra Mobile Broadband), IEEE 802.11 (Wi-Fi (registered trademark)), IEEE 802.16 (WiMAX (registered trademark)), IEEE 802.20, UWB (Ultra-WideBand), Bluetooth (registered trademark), or other appropriate systems, and to next-generation systems extended based on these. A plurality of systems may also be combined and applied (for example, a combination of at least one of LTE and LTE-A with 5G).
 The terms described in the present disclosure and the terms necessary for understanding the present disclosure may be replaced with terms having the same or similar meanings. For example, a signal may be a message.
 (3) In the above-described embodiments, input and output information and the like may be stored in a specific place (for example, a memory) or may be managed using a management table. Input and output information and the like can be overwritten, updated, or appended. Output information and the like may be deleted. Input information and the like may be transmitted to another device.
 (4) In the above-described embodiments, a determination may be made based on a value represented by one bit (0 or 1), based on a Boolean value (true or false), or based on a comparison of numerical values (for example, a comparison with a predetermined value).
 (5) The order of the processing procedures, sequences, flowcharts, and the like exemplified in the above-described embodiments may be changed as long as no contradiction arises. For example, for the methods described in the present disclosure, the elements of the various steps are presented in an exemplary order, and the methods are not limited to the specific order presented.
 (6) Each function illustrated in FIGS. 1, 7, and 9 is realized by an arbitrary combination of at least one of hardware and software. The method of realizing each functional block is not particularly limited. That is, each functional block may be realized using one physically or logically coupled device, or using two or more physically or logically separated devices connected directly or indirectly (for example, by wire or wirelessly). A functional block may be realized by combining the one device or the plurality of devices with software.
 (7) The programs exemplified in the above-described embodiments should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, threads of execution, procedures, functions, and the like, regardless of whether they are called software, firmware, middleware, microcode, hardware description language, or by another name.
 Software, instructions, information, and the like may also be transmitted and received via a transmission medium. For example, when software is transmitted from a website, a server, or another remote source using at least one of wired technologies (coaxial cable, optical fiber cable, twisted pair, digital subscriber line (DSL), etc.) and wireless technologies (infrared, microwave, etc.), at least one of these wired and wireless technologies is included within the definition of a transmission medium.
 (8) In each of the above-described embodiments, the terms "system" and "network" are used interchangeably.
 (9) The information, parameters, and the like described in the present disclosure may be represented using absolute values, using relative values from predetermined values, or using other corresponding information. The names used for the parameters described above are not limiting in any respect. Furthermore, equations and the like using these parameters may differ from those explicitly disclosed in the present disclosure.
 (10) In the above-described embodiments, the terms "connected" and "coupled", and any variations thereof, mean any direct or indirect connection or coupling between two or more elements, and can include the presence of one or more intermediate elements between two elements that are "connected" or "coupled" to each other. The coupling or connection between elements may be physical, logical, or a combination thereof. For example, "connection" may be read as "access". As used in the present disclosure, two elements can be considered to be "connected" or "coupled" to each other using at least one of one or more electric wires, cables, and printed electrical connections, as well as, as some non-limiting and non-exhaustive examples, electromagnetic energy having wavelengths in the radio frequency region, the microwave region, and the optical (both visible and invisible) region.
 (11) In the above-described embodiments, the phrase "based on" does not mean "based only on" unless otherwise specified. In other words, the phrase "based on" means both "based only on" and "based at least on".
 (12) The terms "determining" and "deciding" as used in the present disclosure may encompass a wide variety of operations. "Determining" and "deciding" can include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up (searching or inquiring, for example, in a table, a database, or another data structure), or ascertaining as "determining" or "deciding". "Determining" and "deciding" can also include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), input, output, or accessing (for example, accessing data in a memory) as "determining" or "deciding". Furthermore, "determining" and "deciding" can include regarding resolving, selecting, choosing, establishing, comparing, and the like as "determining" or "deciding". That is, "determining" and "deciding" can include regarding some operation as "determined" or "decided". "Determining (deciding)" may also be read as "assuming", "expecting", "considering", and the like.
 (13) In the above-described embodiments, where "include", "including", and variations thereof are used, these terms, like the term "comprising", are intended to be inclusive. Furthermore, the term "or" as used in the present disclosure is not intended to mean an exclusive disjunction.
 (14) In the present disclosure, where articles such as a, an, and the in English are added by translation, the present disclosure may include cases in which the nouns following these articles are plural.
 (15) Each aspect or embodiment described in the present disclosure may be used alone, in combination, or switched according to execution. Notification of predetermined information (for example, notification of "being X") is not limited to explicit notification, and may be performed implicitly (for example, by not notifying the predetermined information).
 Although the present disclosure has been described in detail above, it is obvious to those skilled in the art that the present disclosure is not limited to the embodiments described herein. The present disclosure can be implemented with modifications and changes without departing from the spirit and scope of the present disclosure defined by the claims. The description of the present disclosure is therefore intended to be illustrative and has no restrictive meaning with respect to the present disclosure.
 DESCRIPTION OF REFERENCE NUMERALS: 10... information processing apparatus, 100... processing device, 110, 110a, 110b... agent unit, 112, 112a... acquisition unit, 114... interpretation unit, 116... control command issuing unit, 118... response information generation unit, 119... output mode determination unit, 120... display data generation unit, 130... sound data generation unit, 140... storage device, 150... input device, 152... microphone, 154... operation unit, 160, 160A... output device, 162... display, 164... speaker, 166... light emitting unit, 168... vibration generation unit, 170... communication device, WD10, WD20... window.

Claims (7)

  1. An information processing apparatus comprising: an acquisition unit that acquires content information about content; and an interpretation unit that interprets user input in a natural language to an application that processes the content, based on the content information.
  2. The information processing apparatus according to claim 1, further comprising a control command issuing unit that issues a control command corresponding to the user input based on a result of interpretation of the user input by the interpretation unit.
  3. The information processing apparatus according to claim 1 or 2, further comprising a response information generation unit that generates response information to the user input based on a result of interpretation of the user input by the interpretation unit.
  4. The information processing apparatus according to claim 3, wherein, when the result of interpretation of the user input by the interpretation unit includes a plurality of interpretations, the response information generation unit generates the response information asking a user which of the plurality of interpretations applies to the content of the user input.
  5. The information processing apparatus according to claim 3 or 4, further comprising an output mode determination unit that determines an output mode of the response information based on the content information.
  6. The information processing apparatus according to any one of claims 1 to 5, wherein, when a plurality of windows are displayed on a display device, the acquisition unit acquires the content information about the content corresponding to an active window, among the plurality of windows, that accepts a user's input.
  7. The information processing apparatus according to any one of claims 1 to 6, wherein the content information has a plurality of parameters determined according to a type of the content.
PCT/JP2019/023630 2018-09-06 2019-06-14 Information processing device WO2020049826A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2020541024A JPWO2020049826A1 (en) 2018-09-06 2019-06-14 Information processing device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2018166791 2018-09-06
JP2018-166791 2018-09-06

Publications (1)

Publication Number Publication Date
WO2020049826A1 (en)

Family

ID=69722020

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/023630 WO2020049826A1 (en) 2018-09-06 2019-06-14 Information processing device

Country Status (2)

Country Link
JP (1) JPWO2020049826A1 (en)
WO (1) WO2020049826A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001249685A (en) * 2000-03-03 2001-09-14 Alpine Electronics Inc Speech dialog device
JP2003263188A (en) * 2002-01-29 2003-09-19 Samsung Electronics Co Ltd Voice command interpreter with dialog focus tracking function, its method and computer readable recording medium with the method recorded

Also Published As

Publication number Publication date
JPWO2020049826A1 (en) 2021-09-24

Similar Documents

Publication Title
EP1646168A2 (en) Method and apparatus for providing a user control interface in audio multistreaming
KR102545837B1 (en) Display arraratus, background music providing method thereof and background music providing system
JP2018537795A (en) Automatic execution of user interaction on computing devices
US10468004B2 (en) Information processing method, terminal device and computer storage medium
CN111033610A (en) Electronic device and voice recognition method
WO2020259133A1 (en) Method and device for recording chorus section, electronic apparatus, and readable medium
JP2014109897A (en) Information processing device and content retrieval method
WO2011037253A1 (en) Display system
JP2014049140A (en) Method and apparatus for providing intelligent service using input characters in user device
JP2011139405A (en) Information processor, information processing method, program, control object device, and information processing system
JP2020042745A (en) Electronic device, control method thereof, and program thereof
WO2020049826A1 (en) Information processing device
KR20140141026A (en) display apparatus and search result displaying method thereof
KR20140111574A (en) Apparatus and method for performing an action according to an audio command
WO2020049827A1 (en) Information processing device
US8942534B2 (en) Information processing apparatus, information processing method, program, and information processing system
JP7429194B2 (en) Dialogue device and dialogue program
CN103984691A (en) Information processing apparatus, information processing method, and program
JP2024509824A (en) Document editing methods, equipment, devices and storage media
JP2015049752A (en) Information processing device, method and program
KR20180010955A (en) Electric device and method for controlling thereof
US20080125174A1 (en) Portable devices for providing acoustic source information, apparatuses for providing acoustic source information, and methods of providing acoustic source information
WO2019235100A1 (en) Interactive device
CN108874976A (en) Search for content recommendation method, device, terminal device and storage medium
JP6917821B2 (en) Playback device, program and playback method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 19857286
    Country of ref document: EP
    Kind code of ref document: A1

ENP Entry into the national phase
    Ref document number: 2020541024
    Country of ref document: JP
    Kind code of ref document: A

NENP Non-entry into the national phase
    Ref country code: DE

122 Ep: pct application non-entry in european phase
    Ref document number: 19857286
    Country of ref document: EP
    Kind code of ref document: A1