WO2016151956A1 - Information processing system and information processing method (情報処理システムおよび情報処理方法)
- Publication number
- WO2016151956A1 (PCT/JP2015/084293)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- voice recognition
- information processing
- control unit
- user
- processing system
- Prior art date
Classifications
- G10L15/01—Assessment or evaluation of speech recognition systems
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
- G06F3/0425—Digitisers characterised by opto-electronic transducing means, using a single imaging device like a video camera for tracking the absolute position of a single or a plurality of objects with respect to an imaged reference surface
- G06F3/04817—Interaction techniques based on graphical user interfaces [GUI] using icons
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using a touch-screen or digitiser, e.g. input of commands through traced gestures
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
- G06T7/20—Analysis of motion
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L15/28—Constructional details of speech recognition systems
Definitions
- This disclosure relates to an information processing system and an information processing method.
- In some cases the user may want to continue the voice recognition process for the sound information. It is therefore desirable to provide a technique that allows the user to easily instruct whether or not to continue the voice recognition process for the sound information.
- According to the present disclosure, an information processing system is provided that includes a recognition control unit that controls a voice recognition unit so that voice recognition processing is performed by the voice recognition unit on sound information input from a sound collection unit, and that controls whether or not to continue the voice recognition processing based on a user gesture detected at a predetermined timing.
- According to the present disclosure, an information processing method is also provided that includes controlling, by a processor, a voice recognition unit so that voice recognition processing is performed on sound information input from a sound collection unit, and controlling whether or not to continue the voice recognition processing based on a user gesture detected at a predetermined timing.
- As described above, the present disclosure provides a technique that allows the user to easily instruct whether or not to continue the speech recognition process for sound information.
- Note that the above effects are not necessarily limiting; together with or in place of the above effects, any of the effects shown in this specification, or other effects that can be grasped from this specification, may be achieved.
- FIG. 3 is a block diagram illustrating a functional configuration example of an information processing system according to an embodiment of the present disclosure.
- FIG. 4 is a diagram showing an example of screen transition from the display of the initial screen to the detection of the activation trigger for the speech recognition process. FIG. 5 is a diagram for explaining a case where the user enters a silent state after uttering all of the utterance contents to be subjected to voice recognition processing. FIG. 6 is a diagram for explaining a case where the silent state begins before the user has finished uttering all of the utterance contents to be subjected to voice recognition processing.
- FIG. 10 is a flowchart illustrating the overall operation flow of the information processing system according to the embodiment of the present disclosure. FIG. 11 is a diagram showing Modification 1 of the configuration of the information processing system. FIGS. 12 to 15 are diagrams showing Modification 2 of the configuration of the information processing system.
- In this specification and the drawings, a plurality of constituent elements having substantially the same functional configuration may be distinguished by adding different numerals after the same reference numeral. However, when it is not necessary to particularly distinguish such constituent elements, only the same reference numeral is given.
- FIG. 1 is a diagram for explaining speech recognition processing in a general system.
- In this specification, the terms "voice" (voice or speech) and "sound" are used separately.
- The term "utterance" indicates a state in which the user is uttering voice, and "silence" indicates a state in which sound information is collected only at a volume smaller than a threshold.
- In a general system (hereinafter also simply referred to as "system"), when an operation for selecting a speech recognition start operation object G14 for starting speech recognition processing is input by the user, the operation is detected as an activation trigger for the voice recognition process, and a sound collection start screen G91 is displayed (time T91).
- When the sound collection start screen G91 is displayed, the user starts speaking (time T92), and the system performs voice recognition processing on the collected sound information while collecting sound with the microphone (S91).
- When the utterance section Ha ends (time T93), a silent state begins. When the system detects a section Ma (hereinafter also referred to as a "silent section") in which the volume of the sound information collected by the microphone remains below a reference volume for a predetermined target time (time T94), it executes a predetermined execution operation based on the result of the voice recognition process performed on the sound information collected in the utterance section Ha (S92).
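- As a rough illustration of the silent-section detection described above (a sketch, not the patent's implementation), the check can be framed as frame-based volume monitoring; the frame length, RMS volume measure, and threshold values below are assumptions made for the example:

```python
import numpy as np

def detect_silent_section(frames, reference_volume, target_time_s, frame_s=0.02):
    """Return the index of the frame at which a silent section Ma is detected,
    i.e. the collected volume has stayed below reference_volume for
    target_time_s, or None if no silent section occurs."""
    needed = int(target_time_s / frame_s)  # number of consecutive quiet frames
    run = 0
    for i, frame in enumerate(frames):
        volume = np.sqrt(np.mean(np.square(frame)))  # RMS volume of the frame
        if volume < reference_volume:
            run += 1
            if run >= needed:
                return i  # silence lasted the target time: silent section
        else:
            run = 0  # a louder frame restarts the count
    return None
```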
- the execution operation based on the result of the speech recognition process is not particularly limited.
- For example, the execution operation based on the result of the speech recognition process may include any one of: an operation of outputting a search result corresponding to the character string obtained as a result of the speech recognition process; an operation of outputting that character string itself; an operation of outputting processing result candidates obtained during the speech recognition process; and an operation of outputting a character string for replying to the utterance content extracted from that character string.
- The method of extracting the utterance content from the character string obtained as a result of the speech recognition process is not limited. For example, the utterance content may be extracted by performing natural language processing (for example, language analysis, semantic analysis, and the like) on the character string obtained as a result of the speech recognition process.
- During the execution operation, the system displays a screen G92 indicating that the execution operation is being processed. When the execution operation is completed, the system displays a screen G93 indicating the result of the execution operation. In the example shown, "collar", "bid", and "kick" are included in the screen G93 as search results corresponding to the character string obtained by the speech recognition process.
- When a silent section is detected, the speech recognition process for the sound information collected by the microphone is temporarily stopped. Therefore, if a user pauses to consider the utterance content while speaking, the pause may be detected as a silent section, and the voice recognition process may be performed on only part of the utterance content the user intended.
- Alternatively, the user may forget the utterance content mid-utterance, or may suddenly be occupied by something other than the utterance (for example, an emergency situation may occur while driving a car). The moment at which the utterance stops for such a reason may likewise be detected as a silent section, so the speech recognition process may again cover only part of the utterance content the user intended.
- Therefore, the present specification proposes a technique that allows the user to easily instruct whether or not to continue the speech recognition process for the sound information collected by the microphone.
- FIG. 2 is a diagram illustrating a configuration example of the information processing system 10 according to the embodiment of the present disclosure.
- the information processing system 10 according to the embodiment of the present disclosure includes an image input unit 110, an operation input unit 115, a sound collection unit 120, and an output unit 130.
- the information processing system 10 can perform voice recognition processing on voices uttered by a user U (hereinafter also simply referred to as “user”).
- the image input unit 110 has a function of inputting an image.
- the image input unit 110 includes two cameras embedded in the table Tbl.
- the number of cameras included in the image input unit 110 is not particularly limited as long as it is one or more. In such a case, the position where each of the one or more cameras included in the image input unit 110 is provided is not particularly limited.
- the one or more cameras may include a monocular camera or a stereo camera.
- the operation input unit 115 has a function of inputting a user U operation.
- the operation input unit 115 includes one camera suspended from the ceiling that exists above the table Tbl.
- the position where the camera included in the operation input unit 115 is provided is not particularly limited.
- the camera may include a monocular camera or a stereo camera.
- The operation input unit 115 need not be a camera as long as it has a function of inputting the operation of the user U.
- the operation input unit 115 may be a touch panel or a hardware button.
- the output unit 130 has a function of displaying a screen on the table Tbl.
- the output unit 130 is suspended from the ceiling above the table Tbl.
- the position where the output unit 130 is provided is not particularly limited.
- The output unit 130 may be a projector capable of projecting the screen onto the top surface of the table Tbl, or may be a display of another type, as long as it has a function of displaying the screen.
- the display surface of the screen may be other than the top surface of the table Tbl.
- the display surface of the screen may be a wall, a building, a floor surface, the ground, or a ceiling.
- the display surface of the screen may be a non-planar surface such as a curtain fold, or may be a surface in another place.
- When the output unit 130 has a display surface, the display surface of the screen may be the display surface of the output unit 130.
- the sound collection unit 120 has a function of collecting sound.
- the sound collection unit 120 includes a total of six microphones including three microphones existing above the table Tbl and three microphones existing on the upper surface of the table Tbl.
- the number of microphones included in the sound collection unit 120 is not particularly limited as long as it is one or more. In such a case, the position at which each of the one or more microphones included in the sound collection unit 120 is provided is not particularly limited.
- the arrival direction of the sound can be estimated based on sound information collected by each of the plurality of microphones. Further, if the sound collection unit 120 includes a microphone having directivity, the direction of arrival of sound can be estimated based on sound information collected by the microphone having directivity.
- FIG. 3 is a block diagram illustrating a functional configuration example of the information processing system 10 according to the embodiment of the present disclosure.
- As illustrated in FIG. 3, the information processing system 10 according to the embodiment of the present disclosure includes an image input unit 110, an operation input unit 115, a sound collection unit 120, an output unit 130, and an information processing device 140 (hereinafter also referred to as the "control unit 140").
- the information processing apparatus 140 executes control of each unit of the information processing system 10. For example, the information processing apparatus 140 generates information output from the output unit 130. Further, for example, the information processing apparatus 140 reflects information input by the image input unit 110, the operation input unit 115, and the sound collection unit 120 in information output from the output unit 130. As illustrated in FIG. 3, the information processing apparatus 140 includes an input image acquisition unit 141, a sound information acquisition unit 142, an operation detection unit 143, a recognition control unit 144, a voice recognition unit 145, and an output control unit 146. With. Details of these functional blocks will be described later.
- The information processing apparatus 140 may be configured by, for example, a CPU (Central Processing Unit). When the information processing apparatus 140 is configured by a processing device such as a CPU, the processing device can be configured by an electronic circuit.
- The recognition control unit 144 controls the voice recognition unit 145 so that voice recognition processing is performed by the voice recognition unit 145 on the sound information input from the sound collection unit 120, and controls whether or not to continue the voice recognition process based on a user gesture detected at a predetermined timing.
- the user can easily instruct whether or not to continue the speech recognition process for the sound information.
- the recognition control unit 144 controls whether or not to continue the speech recognition process based on the user's line of sight.
- the method for detecting the user's line of sight is not particularly limited.
- the operation detection unit 143 can detect the user's line of sight by analyzing the image input by the operation input unit 115.
- the viewpoint can be calculated as the intersection of the line of sight and the screen.
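- As a geometric sketch of this calculation (assuming a planar screen and a known gaze ray; the names and coordinate conventions are illustrative assumptions):

```python
import numpy as np

def viewpoint_on_screen(eye_pos, gaze_dir, screen_point, screen_normal):
    """Compute the viewpoint as the intersection of the line of sight
    (a ray from eye_pos along gaze_dir) with the screen plane."""
    eye_pos = np.asarray(eye_pos, dtype=float)
    gaze_dir = np.asarray(gaze_dir, dtype=float)
    screen_point = np.asarray(screen_point, dtype=float)
    screen_normal = np.asarray(screen_normal, dtype=float)
    denom = gaze_dir.dot(screen_normal)
    if abs(denom) < 1e-9:
        return None  # line of sight parallel to the screen: no intersection
    t = (screen_point - eye_pos).dot(screen_normal) / denom
    if t < 0:
        return None  # the screen lies behind the user
    return eye_pos + t * gaze_dir  # 3-D point on the screen plane
```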
- FIG. 4 is a diagram showing an example of screen transition from the display of the initial screen to the detection of the activation trigger for the speech recognition process.
- the output control unit 146 displays an initial screen G10-1.
- The initial screen G10-1 includes a voice recognition start operation object G14 for starting the voice recognition process and a recognized character string display field G11, which is a display field for the character string obtained by the voice recognition process (hereinafter also referred to as the "recognized character string").
- the initial screen G10-1 includes a delete all operation object G12 for deleting all recognized character strings and a confirm operation object G13 for confirming the recognized character strings.
- The initial screen G10-1 also includes a forward movement operation object G15 for returning the cursor position in the recognized character string, a backward movement operation object G16 for moving the cursor position in the recognized character string backward, and a deletion operation object G17 for deleting the character at the cursor position.
- As shown in the initial screen G10-2, when an operation for selecting the voice recognition start operation object G14 is input by the user via the operation input unit 115, the operation is detected by the operation detection unit 143 as an activation trigger for the voice recognition process (time T10).
- the output control unit 146 turns on the sound collection function of the sound collection unit 120 when the activation trigger for the speech recognition process is detected.
- In the following, an operation for selecting the speech recognition start operation object G14 is described as an example of the activation trigger for the speech recognition process, but the activation trigger is not limited to this example.
- the activation trigger for the speech recognition process may be an operation of pressing a hardware button for activating the speech recognition process.
- In that case, the voice recognition process may be performed from the start of pressing the hardware button until the press is released (Push To Talk).
- the activation trigger for the voice recognition process may be an execution of a voice recognition process activation command (for example, an utterance “voice”).
- Alternatively, the activation trigger for the voice recognition process may be a predetermined activation gesture (for example, raising a hand, swinging a hand down, or moving the face (for example, nodding or tilting the face to the left or right)).
- The activation trigger of the voice recognition process may also include the acquisition from the sound collection unit 120 of sound information whose voice likelihood exceeds a threshold. Subsequently, the user starts speaking toward the sound collection unit 120.
- Referring to FIG. 5, a case will be described in which the user enters a silent state after uttering all of the utterance contents to be subjected to voice recognition processing.
- The output control unit 146 displays a predetermined object Mu (hereinafter also referred to as the "display object"). The display object Mu may be stationary or may have movement.
- The moving direction De of the display object Mu may be determined according to the direction in which the user's uttered voice arrives at the sound collection unit 120 from its sound source.
- the estimation method of the arrival direction of the uttered voice by the user is not particularly limited.
- For example, the recognition control unit 144 may estimate the one arrival direction that matches or is similar to the finger direction of the user who performed the operation of selecting the voice recognition start operation object G14 (for example, the direction from the base of the finger to the fingertip) as the arrival direction of the uttered voice.
- the similarity range may be determined in advance.
- the finger direction may be obtained by analyzing the input image.
- Alternatively, the recognition control unit 144 may estimate the arrival direction of the sound input by the sound collection unit 120 as the arrival direction of the user's uttered voice. When there are a plurality of arrival directions, the arrival direction of the sound input first may be estimated as the arrival direction of the uttered voice, or the one arrival direction that matches or resembles the finger direction of the user who performed the operation of selecting the voice recognition start operation object G14 may be estimated as the arrival direction of the uttered voice.
- the recognition control unit 144 may estimate the arrival direction of the sound input by the sound collection unit 120 at the highest volume among the plurality of arrival directions as the arrival direction of the uttered voice by the user. In this way, the arrival direction of the uttered voice by the user can be estimated. On the other hand, the recognition control unit 144 may acquire the sound input by the sound collection unit 120 from a direction other than the arrival direction of the uttered voice by the user as noise. Therefore, the noise may include an output sound from the information processing system 10.
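- As a crude sketch of the highest-volume variant (assuming one directional microphone per candidate direction; a real system would more likely use time-difference-of-arrival or beamforming):

```python
import numpy as np

def estimate_arrival_direction(mic_signals, mic_directions):
    """Pick, among the candidate directions, the one whose microphone
    collected the sound at the highest volume."""
    volumes = [np.sqrt(np.mean(np.square(s))) for s in mic_signals]  # RMS per mic
    return mic_directions[int(np.argmax(volumes))]  # loudest direction wins
```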
- FIG. 5 shows an example in which the output control unit 146 moves the display object Mu in the direction of arrival of the uttered voice by the user (movement direction De).
- the movement of the display object Mu is not limited to such movement.
- FIG. 5 shows an example in which the movement destination of the display object Mu is the voice recognition start operation object G14.
- the movement destination of the display object Mu is not limited to this example.
- FIG. 5 shows an example in which the output control unit 146 moves the circular display objects Mu that appear one after another in accordance with the sound collection by the sound collection unit 120, but the display mode of the display object Mu is not limited to this example.
- the output control unit 146 may control various parameters of the display object Mu based on predetermined information corresponding to the sound information (for example, sound quality, sound volume, etc. of the sound information).
- the sound information used at this time may be sound information from the direction of arrival of the uttered voice by the user.
- the parameter of the display object Mu may include at least one of the shape, transparency, color, size, and movement of the display object Mu.
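- One possible mapping from sound volume to such parameters (the parameter ranges and the linear mapping are assumptions of this sketch, not values from the patent):

```python
def display_object_params(volume, max_volume=1.0):
    """Map the volume of the collected sound information to size and
    transparency of the display object Mu."""
    level = max(0.0, min(volume / max_volume, 1.0))  # normalize to [0, 1]
    return {
        "size": 10 + 40 * level,      # louder sound: larger object
        "transparency": 1.0 - level,  # louder sound: more opaque
    }
```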
- As a technique for evaluating voice likelihood from sound information, the technique described in Japanese Patent Laid-Open No. 2010-38943 or the method described in Japanese Patent Laid-Open No. 2007-328228 can be employed. In this description, the voice likelihood evaluation is performed by the output control unit 146, but it may instead be performed by a server (not shown).
- the recognition control unit 144 causes the voice recognition unit 145 to start voice recognition processing on the sound information acquired by the sound information acquisition unit 142.
- the timing for starting the speech recognition process is not limited.
- For example, the recognition control unit 144 may cause the speech recognition unit 145 to start the speech recognition process after sound information whose voice likelihood exceeds a predetermined threshold is collected, or may cause the speech recognition unit 145 to start the speech recognition process for the sound information corresponding to the display object Mu once the display object Mu reaches the speech recognition start operation object G14.
- When the user finishes speaking, the recognition control unit 144 detects the silent section (time T12). When the silent section is detected, the output control unit 146 causes the output unit 130 to output a moving object (time T13). In the example illustrated in FIG. 5, the output control unit 146 outputs the speech recognition start operation object G14 itself, given motion, as the moving object, but the moving object may be provided separately from the speech recognition start operation object G14.
- the recognition control unit 144 controls whether or not to continue the speech recognition process based on the user's viewpoint and the moving object G14. More specifically, the recognition control unit 144 controls whether or not to continue the speech recognition process based on the degree of coincidence between the user's viewpoint and the moving object G14. Details of the degree of coincidence will be described later.
- Here, since all of the utterance contents for which the user wants voice recognition processing have already been uttered, the user does not need to keep watching the moving object G14. When the degree of coincidence between the two falls below the threshold at a predetermined timing (time T15), the recognition control unit 144 may control the voice recognition unit 145 to execute the execution operation based on the result of the voice recognition process.
- the predetermined timing is not particularly limited as long as it is a timing after the moving object G14 is output by the output unit 130.
- the voice recognition unit 145 executes an execution operation based on the result of the voice recognition process according to the control of the recognition control unit 144 (time T16).
- the output control unit 146 may output the object G22 instructing to wait for voice input until the execution operation is completed while the execution operation based on the result of the speech recognition process is being performed.
- the output control unit 146 can output the result of the execution operation.
- the output control unit 146 displays the display object Mu as shown in FIG.
- the display object Mu is as described above.
- the recognition control unit 144 causes the voice recognition unit 145 to start voice recognition processing on the sound information acquired by the sound information acquisition unit 142.
- the timing for starting the speech recognition process is not particularly limited as described above.
- When the user stops speaking partway through, the recognition control unit 144 detects the silent section (time T12). When the silent section is detected, the output control unit 146 causes the output unit 130 to output a moving object (time T13). In the example illustrated in FIG. 6, the output control unit 146 outputs the speech recognition start operation object G14 itself, given motion, as the moving object, but the moving object may be provided separately from the speech recognition start operation object G14.
- Subsequently, the recognition control unit 144 controls whether or not to continue the speech recognition process based on the user's viewpoint and the moving object G14, more specifically, based on the degree of coincidence between the user's viewpoint and the moving object G14.
- the recognition control unit 144 may control the voice recognition unit 145 so as to continue the voice recognition processing when the degree of coincidence of both exceeds a threshold at a predetermined timing (time T15).
- the predetermined timing is not particularly limited as long as it is a timing after the moving object G14 is output by the output unit 130.
- the voice recognition unit 145 continues the voice recognition process for the sound information input from the sound collection unit 120 according to the control of the recognition control unit 144 (time T16). As a result, the speech recognition process once suspended is resumed.
- the output control unit 146 may start displaying the display object Mu again as shown in FIG.
- the voice recognition unit 145 may newly start a voice recognition process different from the voice recognition process that has already been started, and merge the results of the two voice recognition processes.
- Alternatively, the sound information may be buffered, and when the voice recognition process can next be started, the voice recognition process may be performed based on the buffered sound information together with the sound information input from the sound collection unit 120.
- FIG. 7 is a diagram for explaining a case where the degree of coincidence exceeds a threshold value.
- a determination region R10 corresponding to the moving object trajectory K10 is assumed.
- the determination region R10 is a region having a width W10 with reference to the locus K10 of the moving object, but the determination region R10 is not limited to such a region.
- the recognition control unit 144 may calculate, as the degree of coincidence, the ratio of the length of the user's viewpoint trajectory K20 that falls within the determination region R10 with respect to the entire length of the user's viewpoint trajectory K20. In the example shown in FIG. 7, since the degree of coincidence calculated in this way exceeds the threshold value, the recognition control unit 144 may control the voice recognition unit 145 so as to continue the voice recognition process.
- However, the method for calculating the degree of coincidence between the trajectory K10 of the moving object and the trajectory of the user's viewpoint is not limited to this example.
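- As one concrete, illustrative formulation of this degree of coincidence, the trajectories can be discretized into point sequences and the viewpoint trajectory measured segment by segment against the determination region R10:

```python
import numpy as np

def coincidence_degree(object_traj, viewpoint_traj, width):
    """Fraction of the length of the viewpoint trajectory K20 that falls
    within the determination region R10, i.e. within `width` (W10) of the
    moving-object trajectory K10. Both trajectories are sequences of 2-D
    points; this discretization is a sketch, not the patent's exact formula."""
    obj = np.asarray(object_traj, dtype=float)
    pts = np.asarray(viewpoint_traj, dtype=float)
    inside = total = 0.0
    for p, q in zip(pts[:-1], pts[1:]):
        seg = np.linalg.norm(q - p)  # length of this viewpoint segment
        mid = (p + q) / 2.0
        dist = np.min(np.linalg.norm(obj - mid, axis=1))  # distance to K10
        total += seg
        if dist <= width:
            inside += seg  # this segment lies inside the region R10
    return inside / total if total > 0 else 0.0
```

The recognition would then continue when, for example, coincidence_degree(K10, K20, W10) exceeds the threshold r_threshold used in the flowchart of FIG. 10.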
- FIG. 8 is a diagram for explaining a case where the degree of coincidence is below a threshold value.
- In the example shown in FIG. 8, since the degree of coincidence calculated as described above is below the threshold, the recognition control unit 144 may control the voice recognition unit 145 to execute the execution operation based on the result of the voice recognition process. When the degree of coincidence is equal to the threshold, the recognition control unit 144 may control the voice recognition unit 145 either to continue the voice recognition process or to execute the execution operation based on the result of the voice recognition process.
- FIG. 9 is a diagram illustrating an output example of the relationship between the degree of coincidence and the threshold value.
- the output control unit 146 may cause the output unit 130 to output a predetermined first notification object G41 when the degree of coincidence exceeds a threshold value.
- the first notification object G41 is an icon indicating that the eyes are open, but is not limited to such an example.
- On the other hand, when the degree of coincidence falls below the threshold, the output control unit 146 may cause the output unit 130 to output a predetermined second notification object G42 different from the first notification object G41.
- the second notification object G42 is an icon representing a state in which eyes are closed, but is not limited to such an example. Note that the output control unit 146 may stop the output of the moving object G14 when the degree of coincidence is lower than the threshold value over a predetermined time.
- Note that the flowchart of FIG. 10 shows merely an example of the overall operation flow of the information processing system 10 according to the embodiment of the present disclosure, and the overall operation flow is therefore not limited to this example.
- First, the operation detection unit 143 detects an activation trigger for the voice recognition process (S11), the recognition control unit 144 detects an utterance from the sound information input from the sound collection unit 120 (S12), and the voice recognition unit 145 is caused to start the voice recognition process for the sound information (S13). Subsequently, the recognition control unit 144 continues the voice recognition process until a silent section is detected ("No" in S14); when a silent section is detected ("Yes" in S14), the voice recognition process is temporarily stopped and the output control unit 146 displays the moving object (S15).
- the recognition control unit 144 obtains the user's viewpoint trajectory K20 (S16), and calculates the degree of coincidence r between the moving object trajectory K10 and the user's viewpoint trajectory K20 (S17).
- The recognition control unit 144 returns the operation to S15 until the continuation determination timing arrives ("No" in S18); when the continuation determination timing arrives ("Yes" in S18), the operation moves to S19.
- When the degree of coincidence r exceeds the threshold r_threshold ("Yes" in S19), the recognition control unit 144 continues the speech recognition process (S13); when r does not exceed r_threshold ("No" in S19), the operation shifts to the execution operation based on the result of the speech recognition process (S20), and the result of the execution operation is acquired (S21).
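- The flow of S11 to S21 can be summarized in code as follows; `system` is a hypothetical facade bundling the units described above, and all of its method names are assumptions made for this sketch (it reuses the coincidence_degree function sketched earlier):

```python
def run_voice_recognition_loop(system):
    """Sketch of the overall operation flow of FIG. 10 (steps S11 to S21)."""
    system.wait_for_activation_trigger()          # S11: activation trigger detected
    system.wait_for_utterance()                   # S12: utterance detected
    system.start_recognition()                    # S13: start recognition
    while True:
        system.wait_for_silent_section()          # S14: "Yes" once silence found
        system.pause_recognition()                # recognition temporarily stops
        while True:
            system.show_moving_object()           # S15: display moving object
            k20 = system.viewpoint_trajectory()   # S16: user's viewpoint trajectory
            r = coincidence_degree(system.k10, k20, system.w10)  # S17
            if system.continuation_timing_arrived():  # S18
                break
        if r > system.r_threshold:                # S19: user kept watching
            system.resume_recognition()           # continue recognition (S13)
        else:
            result = system.execute_operation()   # S20: execution operation
            return system.acquire_result(result)  # S21: result acquired
```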
- FIG. 11 is a diagram illustrating a first modification of the configuration of the information processing system 10.
- the output unit 130 may be included in the mobile terminal.
- The kind of portable terminal is not specifically limited; it may be a tablet terminal, a smartphone, or a mobile phone.
- the output unit 130 may be a television device
- the information processing device 140 may be a game machine
- the operation input unit 115 may be a controller that operates the game machine.
- the sound collection unit 120 and the output unit 130 may be connected to the operation input unit 115.
- the image input unit 110 and the sound collection unit 120 may be connected to the information processing apparatus 140.
- the operation input unit 115, the sound collection unit 120, and the output unit 130 may be provided in a smartphone connected to the information processing apparatus 140.
- the sound collecting unit 120 may be provided in a television device.
- FIG. 16 is a diagram illustrating a third modification of the configuration of the information processing system 10.
- the information processing apparatus 140 may be a game machine, and the operation input unit 115 may be a controller that operates the game machine.
- the output unit 130, the sound collection unit 120, and the image input unit 110 may be provided in a wearable device that is worn on the user's head.
- FIGS. 17 to 20 are diagrams showing a fourth modification of the configuration of the information processing system 10.
- the information processing system 10 may be mounted on a vehicle-mounted navigation system that can be attached to a vehicle and used by a user U who is driving the vehicle.
- the information processing system 10 may be mounted on a mobile terminal and used by a user U who is driving a car.
- the type of mobile terminal is not particularly limited.
- The operation input unit 115 may be provided in a mobile terminal, and the output unit 130, the sound collection unit 120, and the image input unit 110 may be provided in a wearable device worn on the body of the user U. As shown in FIG. 20, the information processing system 10 may be mounted on an in-vehicle navigation system built into an automobile and used by a user U driving the automobile.
- FIG. 21 is a diagram illustrating an example in which the moving object G14 is displayed in the visual field region in the three-dimensional space.
- When the output unit 130 is a see-through head mounted display, the output unit 130 may display the moving object G14 in the visual field region Vi in the three-dimensional space Re.
- FIG. 21 also shows a locus K10 of the moving object. The user can continue the speech recognition process by continuing to watch the moving object G14 displayed in this way.
- FIG. 22 is a diagram illustrating an example in which the moving object G14 is superimposed and displayed on the virtual image.
- When the output unit 130 is a television device, the output unit 130 may display the moving object G14 superimposed on a virtual image such as a game screen.
- FIG. 22 also shows the trajectory K10 of the moving object. The user can continue the speech recognition process by continuing to watch the moving object G14 displayed in this way.
- a wearable device or the like worn on the user's head may be used instead of the television apparatus.
- the recognition control unit 144 controls whether or not to continue the speech recognition process based on the user's line of sight.
- an example of controlling whether or not to continue the speech recognition process is not limited to such an example.
- the recognition control unit 144 may control whether or not to continue the speech recognition process based on the inclination of the user's head. Such an example will be described with reference to FIGS.
- the user wears the operation input unit 115 including a sensor (for example, an acceleration sensor) that can detect the tilt of the head.
- the user may be wearing a sound collection unit 120.
- the output control unit 146 turns on the sound collection function by the sound collection unit 120, and the sound information collected by the sound collection unit 120 is acquired by the sound information acquisition unit 142. Then, as shown in FIG. 23, the output control unit 146 displays the display object Mu. Subsequently, the recognition control unit 144 causes the voice recognition unit 145 to start voice recognition processing on the sound information acquired by the sound information acquisition unit 142.
- The recognition control unit 144 then detects the silent section (time T12).
- When the silent section is detected, the output control unit 146 causes the output unit 130 to output an object (for example, the voice recognition start operation object G14) indicating that the voice recognition process can be continued by tilting the head in a predetermined direction (for example, upward) (time T13).
- the recognition control unit 144 controls whether or not to continue the voice recognition process based on the tilt of the user's head.
- the recognition control unit 144 may control the voice recognition unit 145 to execute a predetermined execution operation based on the result of the voice recognition process when the tilt of the user's head is below the reference value at a predetermined timing.
- the predetermined timing is not particularly limited as long as it is after the silent section is detected.
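- A minimal sketch of the tilt check, assuming a head-worn accelerometer whose gravity reading gives the pitch angle (the axis conventions and the 20-degree reference value are assumptions of this example):

```python
import math

def head_pitch_deg(ax, ay, az):
    """Estimate the head's pitch (tilt) angle in degrees from the
    accelerometer's gravity vector."""
    return math.degrees(math.atan2(ax, math.sqrt(ay * ay + az * az)))

def should_continue_by_tilt(ax, ay, az, reference_deg=20.0):
    """Continue the voice recognition process only when the head tilt
    exceeds the reference value at the continuation-determination timing."""
    return head_pitch_deg(ax, ay, az) > reference_deg
```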
- the voice recognition unit 145 executes an execution operation based on the result of the voice recognition process according to the control of the recognition control unit 144 (time T16).
- the output control unit 146 may output the object G22 instructing to wait for voice input until the execution operation is completed while the execution operation based on the result of the speech recognition process is being performed.
- the output control unit 146 can output the result of the execution operation.
- the output control unit 146 displays the display object Mu as shown in FIG.
- the recognition control unit 144 causes the voice recognition unit 145 to start voice recognition processing on the sound information acquired by the sound information acquisition unit 142.
- The recognition control unit 144 then detects the silent section (time T12).
- When the silent section is detected, the output control unit 146 causes the output unit 130 to output an object (for example, the voice recognition start operation object G14) indicating that the voice recognition process can be continued by tilting the head in a predetermined direction (for example, upward) (time T13).
- the recognition control unit 144 controls whether or not to continue the voice recognition process based on the tilt of the user's head.
- Here, since the user has not finished speaking all of the utterance contents for which voice recognition processing is desired, the user tilts the head in the predetermined direction.
- the recognition control unit 144 may control the voice recognition unit 145 to continue the voice recognition process when the inclination of the user's head exceeds the reference value at a predetermined timing.
- the predetermined timing is not particularly limited as long as it is after the silent section is detected.
- the voice recognition unit 145 continues the voice recognition process for the sound information input from the sound collection unit 120 according to the control of the recognition control unit 144 (time T16). As a result, the speech recognition process once suspended is resumed.
- the output control unit 146 may start displaying the display object Mu again as shown in FIG.
- When the tilt of the user's head is equal to the reference value, the recognition control unit 144 may control the voice recognition unit 145 either to continue the voice recognition process or to execute the execution operation based on the result of the voice recognition process.
- the recognition control unit 144 may control whether or not to continue the voice recognition process based on the movement of the user's head. Such an example will be described with reference to FIGS.
- the user wears the operation input unit 115 including a sensor (for example, a gyro sensor) that can detect the movement of the head.
- the user may be wearing the sound collection unit 120.
- the output control unit 146 turns on the sound collection function by the sound collection unit 120, and the sound information collected by the sound collection unit 120 is acquired by the sound information acquisition unit 142. Then, as shown in FIG. 25, the output control unit 146 displays the display object Mu. Subsequently, the recognition control unit 144 causes the voice recognition unit 145 to start voice recognition processing on the sound information acquired by the sound information acquisition unit 142.
- The recognition control unit 144 then detects the silent section (time T12).
- When the silent section is detected, the output control unit 146 causes the output unit 130 to output an object (for example, the voice recognition start operation object G14) indicating that the voice recognition process can be continued by rotating the head in a predetermined direction (for example, to the right) (time T13).
- the recognition control unit 144 controls whether or not to continue the voice recognition process based on the movement of the user's head.
- Here, since all of the utterance contents for which the user wants voice recognition processing have been uttered, the user does not need to rotate the head in the predetermined direction. If the user does not rotate the head to the right, the movement of the user's head does not show the predetermined movement (rotation in the predetermined direction). The recognition control unit 144 may therefore control the voice recognition unit 145 to execute the predetermined execution operation based on the result of the voice recognition process when the movement of the user's head does not show the predetermined movement at a predetermined timing.
- the predetermined timing is not particularly limited as long as it is after the silent section is detected.
- the voice recognition unit 145 executes an execution operation based on the result of the voice recognition process according to the control of the recognition control unit 144 (time T16).
- the output control unit 146 may output the object G22 instructing to wait for voice input until the execution operation is completed while the execution operation based on the result of the speech recognition process is being performed.
- the output control unit 146 can output the result of the execution operation.
- the output control unit 146 displays the display object Mu as shown in FIG.
- the recognition control unit 144 causes the voice recognition unit 145 to start voice recognition processing on the sound information acquired by the sound information acquisition unit 142.
- The recognition control unit 144 then detects the silent section (time T12).
- When the silent section is detected, the output control unit 146 causes the output unit 130 to output an object (for example, the voice recognition start operation object G14) indicating that the voice recognition process can be continued by rotating the head in a predetermined direction (for example, to the right) (time T13).
- the recognition control unit 144 controls whether or not to continue the voice recognition process based on the movement of the user's head.
- the recognition control unit 144 may control the voice recognition unit 145 to continue the voice recognition process when the movement of the user's head shows a predetermined movement at a predetermined timing.
- the predetermined timing is not particularly limited as long as it is after the silent section is detected.
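- A minimal sketch of detecting the predetermined movement with a head-worn gyro sensor (the sign convention for rightward rotation and the 30-degree threshold are assumptions of this example):

```python
def head_rotated_right(gyro_yaw_rates_deg_s, dt_s, min_angle_deg=30.0):
    """Integrate the gyro's yaw rate over the observation window and report
    whether the head rotated to the right by at least min_angle_deg."""
    angle = sum(rate * dt_s for rate in gyro_yaw_rates_deg_s)  # integrated yaw
    return angle >= min_angle_deg  # True: continue the recognition process
```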
- the voice recognition unit 145 continues the voice recognition process for the sound information input from the sound collection unit 120 according to the control of the recognition control unit 144 (time T16). As a result, the speech recognition process once suspended is resumed.
- the output control unit 146 may start to display the display object Mu again as shown in FIG.
- The example in which the recognition control unit 144 controls whether or not to continue the speech recognition process based on the movement of the user's head has thus been described.
- FIG. 27 is a block diagram illustrating a hardware configuration example of the information processing system 10 according to the embodiment of the present disclosure.
- the information processing system 10 includes a CPU (Central Processing unit) 901, a ROM (Read Only Memory) 903, and a RAM (Random Access Memory) 905.
- the information processing system 10 may also include a host bus 907, a bridge 909, an external bus 911, an interface 913, an input device 915, an output device 917, a storage device 919, a drive 921, a connection port 923, and a communication device 925.
- the information processing system 10 may include an imaging device 933 and a sensor 935 as necessary.
- the information processing system 10 may include a processing circuit called DSP (Digital Signal Processor) or ASIC (Application Specific Integrated Circuit) instead of or in addition to the CPU 901.
- the CPU 901 functions as an arithmetic processing unit and a control unit, and controls all or part of the operation in the information processing system 10 according to various programs recorded in the ROM 903, the RAM 905, the storage device 919, or the removable recording medium 927.
- the ROM 903 stores programs and calculation parameters used by the CPU 901.
- the RAM 905 temporarily stores programs used in the execution of the CPU 901, parameters that change as appropriate during the execution, and the like.
- the CPU 901, the ROM 903, and the RAM 905 are connected to each other by a host bus 907 configured by an internal bus such as a CPU bus. Further, the host bus 907 is connected to an external bus 911 such as a PCI (Peripheral Component Interconnect / Interface) bus via a bridge 909.
- the input device 915 is a device operated by the user, such as a mouse, a keyboard, a touch panel, a button, a switch, and a lever.
- the input device 915 may include a microphone that detects the user's voice.
- the input device 915 may be, for example, a remote control device using infrared rays or other radio waves, or may be an external connection device 929 such as a mobile phone that supports the operation of the information processing system 10.
- the input device 915 includes an input control circuit that generates an input signal based on information input by the user and outputs the input signal to the CPU 901. The user operates the input device 915 to input various data to the information processing system 10 and instruct processing operations.
- An imaging device 933, which will be described later, can also function as an input device by imaging the movement of the user's hand, the user's fingers, and the like. At this time, the pointing position may be determined according to the movement of the hand or the direction of the fingers.
- the output device 917 is a device that can notify the user of the acquired information visually or audibly.
- The output device 917 is, for example, a display device such as an LCD (Liquid Crystal Display), a PDP (Plasma Display Panel), an organic EL (Electro-Luminescence) display, a projector, or a hologram display device, an audio output device such as a speaker or headphones, or a printer device.
- The output device 917 outputs the result obtained by the processing of the information processing system 10 as video, such as text or an image, or as audio, such as voice or sound.
- the output device 917 may include a light or the like to brighten the surroundings.
- the storage device 919 is a data storage device configured as an example of a storage unit of the information processing system 10.
- the storage device 919 includes, for example, a magnetic storage device such as an HDD (Hard Disk Drive), a semiconductor storage device, an optical storage device, or a magneto-optical storage device.
- the storage device 919 stores programs executed by the CPU 901, various data, data acquired from the outside, and the like.
- the drive 921 is a reader / writer for a removable recording medium 927 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, and is built in or externally attached to the information processing system 10.
- the drive 921 reads information recorded on the attached removable recording medium 927 and outputs the information to the RAM 905.
- the drive 921 also writes records to the attached removable recording medium 927.
- the connection port 923 is a port for directly connecting a device to the information processing system 10.
- the connection port 923 can be, for example, a USB (Universal Serial Bus) port, an IEEE 1394 port, a SCSI (Small Computer System Interface) port, or the like.
- the connection port 923 may be an RS-232C port, an optical audio terminal, an HDMI (registered trademark) (High-Definition Multimedia Interface) port, or the like.
- Various data can be exchanged between the information processing system 10 and the external connection device 929 by connecting the external connection device 929 to the connection port 923.
- the communication device 925 is a communication interface configured with, for example, a communication device for connecting to the communication network 931.
- the communication device 925 can be, for example, a communication card for wired or wireless LAN (Local Area Network), Bluetooth (registered trademark), or WUSB (Wireless USB).
- the communication device 925 may be a router for optical communication, a router for ADSL (Asymmetric Digital Subscriber Line), or a modem for various types of communication.
- the communication device 925 transmits and receives signals and the like to and from the Internet and other communication devices using a predetermined protocol such as TCP/IP, for example.
- the communication network 931 connected to the communication device 925 is a wired or wireless network, such as the Internet, a home LAN, infrared communication, radio wave communication, or satellite communication.
- the imaging device 933 is an apparatus that images real space and generates a captured image, using an imaging element such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor together with various members such as a lens for focusing a subject image onto the imaging element.
- the imaging device 933 may capture a still image or may capture a moving image.
- the sensor 935 is various sensors such as an acceleration sensor, a gyro sensor, a geomagnetic sensor, an optical sensor, and a sound sensor.
- the sensor 935 acquires information related to the state of the information processing system 10 itself, such as the posture of the information processing system 10, and information related to the surrounding environment of the information processing system 10, such as the brightness and noise around it.
- the sensor 935 may include a GPS sensor that receives a GPS (Global Positioning System) signal and measures the latitude, longitude, and altitude of the apparatus.
- Each component described above may be configured using a general-purpose member, or may be configured by hardware specialized for the function of each component. Such a configuration can be appropriately changed according to the technical level at the time of implementation.
- As described above, according to the embodiment of the present disclosure, the information processing system 10 is provided that includes the recognition control unit 144, which controls the voice recognition unit 145 so that voice recognition processing is performed on sound information input from the sound collection unit 120, and which controls whether or not to continue the voice recognition process based on a user gesture detected at a predetermined timing. According to this configuration, the user can easily instruct whether or not to continue the voice recognition process for the sound information.
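Reduced to a sketch, the control flow just summarized might look as follows; every callable here is a hypothetical stand-in for the units described above (voice recognition unit 145, gesture detection, execution operation), not an API defined by the disclosure.

```python
import time

def run_recognition_session(recognizer, detect_gesture, should_continue,
                            execute, poll_interval=0.1):
    """Start voice recognition when the activation trigger has been
    detected, then, at the predetermined timing, let the detected user
    gesture decide whether the session continues or is finalized into
    a predetermined execution operation (e.g. outputting a search
    result based on the recognition result)."""
    recognizer.start()                        # activation trigger detected
    while should_continue(detect_gesture()):  # gesture sampled at the predetermined timing
        time.sleep(poll_interval)             # recognition keeps consuming sound information
    execute(recognizer.finish())              # act on the voice recognition result
```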
- the output unit 130 may be a display provided in a wearable terminal (for example, a watch, glasses, etc.) other than the head mounted display.
- the output unit 130 may be a display used in the healthcare field.
- In the foregoing, examples were described in which whether or not to continue the voice recognition process is controlled based on the user's line of sight, the tilt of the user's head, and the movement of the user's head. However, the user's gesture is not limited to these examples. For example, the user's gesture may be the user's facial expression, the movement of the user's lips, the shape of the user's lips, or the open/closed state of the user's eyes.
- the output control unit 146 generates display control information for causing the output unit 130 to display content and outputs the generated display control information to the output unit 130, thereby controlling the output unit 130 so that the content is displayed.
- the contents of the display control information may be changed as appropriate according to the system configuration.
- the program for realizing the information processing apparatus 140 may be, for example, a web application.
- In that case, the display control information may be realized by a markup language such as HTML (HyperText Markup Language), SGML (Standard Generalized Markup Language), or XML (Extensible Markup Language).
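As a toy illustration of that point (the element structure below is invented, not taken from the embodiment), display control information delivered to a web-application client could be a simple HTML string:

```python
def build_display_control_info(recognized_text):
    """Return display control information as a minimal HTML document.
    The ids and structure are hypothetical; the disclosure only says a
    markup language such as HTML, SGML, or XML may be used."""
    return (
        "<html><body>"
        f"<div id='recognized-string'>{recognized_text}</div>"
        "<div id='moving-object' class='animated'></div>"
        "</body></html>"
    )
```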
- the position of each component is not particularly limited as long as the operation of the information processing system 10 described above is realized.
- the image input unit 110, the operation input unit 115, the sound collecting unit 120, the output unit 130, and the information processing device 140 may be provided in different devices connected via a network.
- For example, the information processing apparatus 140 may correspond to a server such as a web server or a cloud server, and the image input unit 110, the operation input unit 115, the sound collection unit 120, and the output unit 130 may correspond to clients connected to the server via a network.
- All of the constituent elements of the information processing apparatus 140 need not be accommodated in the same apparatus; they may be distributed across different devices. For example, the voice recognition unit 145 may reside on a server different from the information processing apparatus 140 that includes the input image acquisition unit 141, the sound information acquisition unit 142, the operation detection unit 143, the recognition control unit 144, and the output control unit 146.
- (1) An information processing system including: a recognition control unit that controls a voice recognition unit so that voice recognition processing is performed by the voice recognition unit on sound information input from a sound collection unit, wherein the recognition control unit controls whether or not to continue the voice recognition process based on a user gesture detected at a predetermined timing.
- (2) The information processing system according to (1), wherein the recognition control unit controls whether or not to continue the voice recognition process based on the line of sight of the user.
- (3) The information processing system according to (2), wherein the recognition control unit controls whether or not to continue the voice recognition process based on the viewpoint of the user and a moving object.
- (4) The information processing system according to (3), wherein the recognition control unit controls whether or not to continue the voice recognition process based on a degree of coincidence between the viewpoint of the user and the moving object.
- (5) The information processing system according to (4), wherein the recognition control unit controls the voice recognition unit to continue the voice recognition process when the degree of coincidence exceeds a threshold.
- (6) The information processing system according to (5), wherein the recognition control unit controls the voice recognition unit to execute a predetermined execution operation based on a result of the voice recognition process when the degree of coincidence falls below the threshold.
- (7) The information processing system according to any one of (4) to (6), including an output control unit that causes an output unit to output the moving object.
- (8) The information processing system according to (7), wherein the output control unit causes the output unit to output the moving object when the duration for which the volume of the sound information continuously falls below a reference volume after the voice recognition process is started reaches a predetermined target time. (A timing sketch of this condition follows this enumeration.)
- (9) The information processing system according to (7) or (8), wherein the predetermined timing is a timing after the moving object is output by the output unit.
- (10) The information processing system according to any one of (7) to (9), wherein the output control unit causes the output unit to output a predetermined first notification object when the degree of coincidence exceeds a threshold.
- (11) The information processing system according to (10), wherein the output control unit causes the output unit to output a predetermined second notification object different from the first notification object when the degree of coincidence falls below the threshold.
- (12) The information processing system according to (1), wherein the recognition control unit controls whether or not to continue the voice recognition process based on the tilt of the user's head.
- (13) The information processing system according to (12), wherein the recognition control unit controls the voice recognition unit to continue the voice recognition process when the tilt of the user's head exceeds a predetermined reference value.
- (14) The information processing system according to (13), wherein the recognition control unit controls the voice recognition unit to execute a predetermined execution operation based on a result of the voice recognition process when the tilt of the user's head falls below the reference value.
- (15) The information processing system according to (1), wherein the recognition control unit controls whether or not to continue the voice recognition process based on the movement of the user's head.
- (16) The information processing system according to (15), wherein the recognition control unit controls the voice recognition unit to continue the voice recognition process when the movement of the user's head shows a predetermined movement.
- (17) The information processing system according to (16), wherein the recognition control unit controls the voice recognition unit to execute a predetermined execution operation based on a result of the voice recognition process when the movement of the user's head does not show the predetermined movement.
- (18) The information processing system according to any one of (1) to (17), wherein the recognition control unit causes the voice recognition unit to start the voice recognition process when an activation trigger of the voice recognition process is detected.
- (19) The information processing system according to (6), wherein the execution operation includes any one of: an operation of outputting a search result according to the result of the voice recognition process, an operation of outputting the result of the voice recognition process, an operation of outputting a processing result candidate obtained in the course of the voice recognition process, and an operation of outputting a character string for replying to utterance content extracted from the result of the voice recognition process.
- (20) An information processing method including: controlling a voice recognition unit so that voice recognition processing is performed by the voice recognition unit on sound information input from a sound collection unit; and controlling, by a processor, whether or not to continue the voice recognition process based on a user gesture detected at a predetermined timing.
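As noted in item (8) above, the trigger for showing the moving object is a silence duration reaching a target time. A minimal polling sketch follows; `get_volume`, the thresholds, and the polling rate are all assumptions introduced for illustration.

```python
import time

def silence_reached_target(get_volume, reference_volume, target_time,
                           poll_interval=0.05, max_wait=30.0):
    """Return True once the volume of the collected sound information
    has stayed below reference_volume for target_time seconds in a row
    (the condition of item (8)); return False if max_wait elapses
    first. get_volume is a hypothetical callable returning the current
    volume."""
    deadline = time.monotonic() + max_wait
    silent_since = None
    while time.monotonic() < deadline:
        if get_volume() < reference_volume:
            if silent_since is None:
                silent_since = time.monotonic()
            elif time.monotonic() - silent_since >= target_time:
                return True  # output the moving object G14 now
        else:
            silent_since = None  # speech resumed; restart the clock
        time.sleep(poll_interval)
    return False
```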
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Description
0. Background
1. Embodiment of the present disclosure
1.1. System configuration example
1.2. Functional configuration example
1.3. Functional details of the information processing system
1.4. Modifications of the system configuration
1.5. Display forms of the moving object
1.6. Modifications of the user's gesture
1.7. Hardware configuration example
2. Conclusion
First, the background of the embodiment of the present disclosure will be described with reference to the drawings. FIG. 1 is a diagram for describing voice recognition processing in a typical system. In the following description, voice (or speech) and sound are used as distinct terms. Utterance denotes a state in which the user is uttering voice, and silence denotes a state in which sound information is being collected at a volume lower than a threshold.
[1.1. System configuration example]
Next, a configuration example of the information processing system 10 according to the embodiment of the present disclosure will be described with reference to the drawings. FIG. 2 is a diagram illustrating a configuration example of the information processing system 10 according to the embodiment of the present disclosure. As illustrated in FIG. 2, the information processing system 10 according to the embodiment of the present disclosure includes an image input unit 110, an operation input unit 115, a sound collection unit 120, and an output unit 130. The information processing system 10 is capable of performing voice recognition processing on voice uttered by a user U (hereinafter also simply referred to as the "user").
Next, a functional configuration example of the information processing system 10 according to the embodiment of the present disclosure will be described. FIG. 3 is a block diagram illustrating a functional configuration example of the information processing system 10 according to the embodiment of the present disclosure. As illustrated in FIG. 3, the information processing system 10 according to the embodiment of the present disclosure includes the image input unit 110, the operation input unit 115, the sound collection unit 120, the output unit 130, and an information processing device 140 (hereinafter also referred to as the "control unit 140").
Next, functional details of the information processing system 10 according to the embodiment of the present disclosure will be described. In the embodiment of the present disclosure, the recognition control unit 144 controls the voice recognition unit 145 so that voice recognition processing is performed by the voice recognition unit 145 on sound information input from the sound collection unit 120, and the recognition control unit 144 controls whether or not to continue the voice recognition processing based on a user gesture detected at a predetermined timing.
In the above, an example has been described in which the output unit 130 is a projector capable of projecting a screen onto the top surface of the table Tbl. However, the system configuration of the information processing system 10 is not limited to this example. Modifications of the system configuration of the information processing system 10 are described below. FIG. 11 is a diagram illustrating Modification 1 of the configuration of the information processing system 10. As illustrated in FIG. 11, when the information processing system 10 is a mobile terminal, the output unit 130 may be provided in the mobile terminal. The type of the mobile terminal is not particularly limited; it may be a tablet terminal, a smartphone, or a mobile phone.
The display of the moving object G14 has been described above. The display form of the moving object G14 is not particularly limited. FIG. 21 is a diagram illustrating an example in which the moving object G14 is displayed in a visual field region in a three-dimensional space. For example, as illustrated in FIG. 21, when the output unit 130 is a see-through head mounted display, the output unit 130 may display the moving object G14 in a visual field region Vi in a three-dimensional space Re. FIG. 21 also shows a trajectory K10 of the moving object. By continuing to watch the moving object G14 displayed in this way, the user can cause the voice recognition process to continue.
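The disclosure does not fix a formula for the degree of coincidence r between the user's viewpoint trajectory (K20, K21) and the trajectory K10 of the moving object; one plausible definition, sketched below under that assumption, is the fraction of frames in which the gaze point stays within a tolerance of the object.

```python
import numpy as np

def degree_of_coincidence(object_traj, gaze_traj, tolerance=0.05):
    """Compute a candidate degree of coincidence r between the moving
    object's trajectory (K10) and the user's viewpoint trajectory
    (K20/K21). Both inputs are (N, 2) arrays of per-frame normalized
    screen coordinates; the definition and tolerance are assumptions
    of this sketch."""
    distances = np.linalg.norm(object_traj - gaze_traj, axis=1)
    return float(np.mean(distances <= tolerance))

# Usage: continue recognition while r exceeds the threshold (items (5)/(6)).
# r = degree_of_coincidence(k10, k20)
# keep_recognizing = r > 0.6   # threshold value is a placeholder
```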
In the above, an example has been described in which the recognition control unit 144 controls whether or not to continue the voice recognition process based on the user's line of sight. However, the control of whether or not to continue the voice recognition process is not limited to this example. For example, the recognition control unit 144 may control whether or not to continue the voice recognition process based on the tilt of the user's head. Such an example is described with reference to FIGS. 23 and 24.
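A sketch of this head-posture variant follows; the pitch angle is assumed to come from the gyro or acceleration sensors of the sensor 935, and the reference value and the "predetermined movement" (here approximated as a nod-like dip) are placeholders, since the disclosure leaves both open.

```python
from collections import deque

class HeadGestureJudge:
    """Judge continuation of voice recognition from head posture:
    continue while the tilt exceeds the reference value (tilt variant),
    or when a nod-like dip appears in recent pitch samples (movement
    variant). Thresholds are illustrative assumptions."""

    def __init__(self, tilt_reference_deg=15.0, nod_depth_deg=-10.0):
        self.tilt_reference_deg = tilt_reference_deg
        self.nod_depth_deg = nod_depth_deg
        self.pitch_history = deque(maxlen=30)  # about 1 s at 30 Hz

    def update(self, pitch_deg):
        self.pitch_history.append(pitch_deg)

    def continue_by_tilt(self):
        # continue while the head tilt exceeds the reference value
        return bool(self.pitch_history) and \
            self.pitch_history[-1] > self.tilt_reference_deg

    def continue_by_movement(self):
        # continue when a nod-like dip is observed in the window
        return bool(self.pitch_history) and \
            min(self.pitch_history) < self.nod_depth_deg
```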
Next, a hardware configuration of the information processing system 10 according to the embodiment of the present disclosure is described with reference to FIG. 27. FIG. 27 is a block diagram illustrating a hardware configuration example of the information processing system 10 according to the embodiment of the present disclosure.
As described above, according to the embodiment of the present disclosure, there is provided the information processing system 10 including the recognition control unit 144 that controls the voice recognition unit 145 so that voice recognition processing is performed by the voice recognition unit 145 on sound information input from the sound collection unit 120, the recognition control unit 144 controlling whether or not to continue the voice recognition process based on a user gesture detected at a predetermined timing. With this configuration, the user can easily instruct whether or not to continue the voice recognition process for the sound information.
(1)
An information processing system including:
a recognition control unit that controls a voice recognition unit so that voice recognition processing is performed by the voice recognition unit on sound information input from a sound collection unit,
wherein the recognition control unit controls whether or not to continue the voice recognition process based on a user gesture detected at a predetermined timing.
(2)
The information processing system according to (1), wherein the recognition control unit controls whether or not to continue the voice recognition process based on the line of sight of the user.
(3)
The information processing system according to (2), wherein the recognition control unit controls whether or not to continue the voice recognition process based on the viewpoint of the user and a moving object.
(4)
The information processing system according to (3), wherein the recognition control unit controls whether or not to continue the voice recognition process based on a degree of coincidence between the viewpoint of the user and the moving object.
(5)
The information processing system according to (4), wherein the recognition control unit controls the voice recognition unit to continue the voice recognition process when the degree of coincidence exceeds a threshold.
(6)
The information processing system according to (5), wherein the recognition control unit controls the voice recognition unit to execute a predetermined execution operation based on a result of the voice recognition process when the degree of coincidence falls below the threshold.
(7)
The information processing system according to any one of (4) to (6), including an output control unit that causes an output unit to output the moving object.
(8)
The information processing system according to (7), wherein the output control unit causes the output unit to output the moving object when the duration for which the volume of the sound information continuously falls below a reference volume after the voice recognition process is started reaches a predetermined target time.
(9)
The information processing system according to (7) or (8), wherein the predetermined timing is a timing after the moving object is output by the output unit.
(10)
The information processing system according to any one of (7) to (9), wherein the output control unit causes the output unit to output a predetermined first notification object when the degree of coincidence exceeds a threshold.
(11)
The information processing system according to (10), wherein the output control unit causes the output unit to output a predetermined second notification object different from the first notification object when the degree of coincidence falls below the threshold.
(12)
The information processing system according to (1), wherein the recognition control unit controls whether or not to continue the voice recognition process based on the tilt of the user's head.
(13)
The information processing system according to (12), wherein the recognition control unit controls the voice recognition unit to continue the voice recognition process when the tilt of the user's head exceeds a predetermined reference value.
(14)
The information processing system according to (13), wherein the recognition control unit controls the voice recognition unit to execute a predetermined execution operation based on a result of the voice recognition process when the tilt of the user's head falls below the reference value.
(15)
The information processing system according to (1), wherein the recognition control unit controls whether or not to continue the voice recognition process based on the movement of the user's head.
(16)
The information processing system according to (15), wherein the recognition control unit controls the voice recognition unit to continue the voice recognition process when the movement of the user's head shows a predetermined movement.
(17)
The information processing system according to (16), wherein the recognition control unit controls the voice recognition unit to execute a predetermined execution operation based on a result of the voice recognition process when the movement of the user's head does not show the predetermined movement.
(18)
The information processing system according to any one of (1) to (17), wherein the recognition control unit causes the voice recognition unit to start the voice recognition process when an activation trigger of the voice recognition process is detected.
(19)
The information processing system according to (6), wherein the execution operation includes any one of: an operation of outputting a search result according to the result of the voice recognition process, an operation of outputting the result of the voice recognition process, an operation of outputting a processing result candidate obtained in the course of the voice recognition process, and an operation of outputting a character string for replying to utterance content extracted from the result of the voice recognition process.
(20)
An information processing method including: controlling a voice recognition unit so that voice recognition processing is performed by the voice recognition unit on sound information input from a sound collection unit; and controlling, by a processor, whether or not to continue the voice recognition process based on a user gesture detected at a predetermined timing.
110 Image input unit
115 Operation input unit
120 Sound collection unit
130 Output unit
140 Information processing device (control unit)
141 Input image acquisition unit
142 Sound information acquisition unit
143 Operation detection unit
144 Recognition control unit
145 Voice recognition unit
146 Output control unit
G10 Initial screen
G11 Recognized character string display field
G12 Delete-all operation object
G13 Confirmation operation object
G14 Voice recognition start operation object (moving object)
G15 Forward movement operation object
G16 Backward movement operation object
G17 Deletion operation object
K10 Trajectory of the moving object
K20, K21 Trajectories of the user's viewpoint
G41 First notification object
G42 Second notification object
r Degree of coincidence
Claims (20)
- An information processing system including: a recognition control unit that controls a voice recognition unit so that voice recognition processing is performed by the voice recognition unit on sound information input from a sound collection unit, wherein the recognition control unit controls whether or not to continue the voice recognition process based on a user gesture detected at a predetermined timing.
- The information processing system according to claim 1, wherein the recognition control unit controls whether or not to continue the voice recognition process based on the line of sight of the user.
- The information processing system according to claim 2, wherein the recognition control unit controls whether or not to continue the voice recognition process based on the viewpoint of the user and a moving object.
- The information processing system according to claim 3, wherein the recognition control unit controls whether or not to continue the voice recognition process based on a degree of coincidence between the viewpoint of the user and the moving object.
- The information processing system according to claim 4, wherein the recognition control unit controls the voice recognition unit to continue the voice recognition process when the degree of coincidence exceeds a threshold.
- The information processing system according to claim 5, wherein the recognition control unit controls the voice recognition unit to execute a predetermined execution operation based on a result of the voice recognition process when the degree of coincidence falls below the threshold.
- The information processing system according to claim 4, including an output control unit that causes an output unit to output the moving object.
- The information processing system according to claim 7, wherein the output control unit causes the output unit to output the moving object when the duration for which the volume of the sound information continuously falls below a reference volume after the voice recognition process is started reaches a predetermined target time.
- The information processing system according to claim 7, wherein the predetermined timing is a timing after the moving object is output by the output unit.
- The information processing system according to claim 7, wherein the output control unit causes the output unit to output a predetermined first notification object when the degree of coincidence exceeds a threshold.
- The information processing system according to claim 10, wherein the output control unit causes the output unit to output a predetermined second notification object different from the first notification object when the degree of coincidence falls below the threshold.
- The information processing system according to claim 1, wherein the recognition control unit controls whether or not to continue the voice recognition process based on the tilt of the user's head.
- The information processing system according to claim 12, wherein the recognition control unit controls the voice recognition unit to continue the voice recognition process when the tilt of the user's head exceeds a predetermined reference value.
- The information processing system according to claim 13, wherein the recognition control unit controls the voice recognition unit to execute a predetermined execution operation based on a result of the voice recognition process when the tilt of the user's head falls below the reference value.
- The information processing system according to claim 1, wherein the recognition control unit controls whether or not to continue the voice recognition process based on the movement of the user's head.
- The information processing system according to claim 15, wherein the recognition control unit controls the voice recognition unit to continue the voice recognition process when the movement of the user's head shows a predetermined movement.
- The information processing system according to claim 16, wherein the recognition control unit controls the voice recognition unit to execute a predetermined execution operation based on a result of the voice recognition process when the movement of the user's head does not show the predetermined movement.
- The information processing system according to claim 1, wherein the recognition control unit causes the voice recognition unit to start the voice recognition process when an activation trigger of the voice recognition process is detected.
- The information processing system according to claim 6, wherein the execution operation includes any one of: an operation of outputting a search result according to the result of the voice recognition process, an operation of outputting the result of the voice recognition process, an operation of outputting a processing result candidate obtained in the course of the voice recognition process, and an operation of outputting a character string for replying to utterance content extracted from the result of the voice recognition process.
- An information processing method including: controlling a voice recognition unit so that voice recognition processing is performed by the voice recognition unit on sound information input from a sound collection unit; and controlling, by a processor, whether or not to continue the voice recognition process based on a user gesture detected at a predetermined timing.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15886508.9A EP3276618A4 (en) | 2015-03-23 | 2015-12-07 | Information processing system and information processing method |
US15/536,299 US10475439B2 (en) | 2015-03-23 | 2015-12-07 | Information processing system and information processing method |
JP2017507338A JP6729555B2 (ja) | 2015-03-23 | 2015-12-07 | 情報処理システムおよび情報処理方法 |
CN201580077946.0A CN107430856B (zh) | 2015-03-23 | 2015-12-07 | 信息处理系统和信息处理方法 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015-059567 | 2015-03-23 | ||
JP2015059567 | 2015-03-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016151956A1 true WO2016151956A1 (ja) | 2016-09-29 |
Family
ID=56977095
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2015/084293 WO2016151956A1 (ja) | 2015-03-23 | 2015-12-07 | 情報処理システムおよび情報処理方法 |
Country Status (5)
Country | Link |
---|---|
US (1) | US10475439B2 (ja) |
EP (1) | EP3276618A4 (ja) |
JP (1) | JP6729555B2 (ja) |
CN (1) | CN107430856B (ja) |
WO (1) | WO2016151956A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2020160387A (ja) * | 2019-03-28 | 2020-10-01 | Necパーソナルコンピュータ株式会社 | 電子機器、制御方法およびプログラム |
US12094489B2 (en) | 2019-08-07 | 2024-09-17 | Magic Leap, Inc. | Voice onset detection |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102591413B1 (ko) * | 2016-11-16 | 2023-10-19 | 엘지전자 주식회사 | 이동단말기 및 그 제어방법 |
CN107919130B (zh) * | 2017-11-06 | 2021-12-17 | 百度在线网络技术(北京)有限公司 | 基于云端的语音处理方法和装置 |
US10923122B1 (en) * | 2018-12-03 | 2021-02-16 | Amazon Technologies, Inc. | Pausing automatic speech recognition |
US11151993B2 (en) * | 2018-12-28 | 2021-10-19 | Baidu Usa Llc | Activating voice commands of a smart display device based on a vision-based mechanism |
JP7351642B2 (ja) * | 2019-06-05 | 2023-09-27 | シャープ株式会社 | 音声処理システム、会議システム、音声処理方法、及び音声処理プログラム |
US12033625B2 (en) | 2021-06-16 | 2024-07-09 | Roku, Inc. | Voice control device with push-to-talk (PTT) and mute controls |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0944297A (ja) * | 1995-07-25 | 1997-02-14 | Canon Inc | データ入力方法及びその装置 |
JP2002091489A (ja) * | 2000-09-13 | 2002-03-27 | Alpine Electronics Inc | 音声認識装置 |
JP2005012377A (ja) * | 2003-06-17 | 2005-01-13 | Sharp Corp | 通信端末、通信端末の制御方法、音声認識処理装置、音声認識処理装置の制御方法、通信端末制御プログラム、通信端末制御プログラムを記録した記録媒体、音声認識処理装置制御プログラム、および、音声認識処理装置制御プログラムを記録した記録媒体 |
JP2007094104A (ja) * | 2005-09-29 | 2007-04-12 | Sony Corp | 情報処理装置および方法、並びにプログラム |
JP2010009484A (ja) * | 2008-06-30 | 2010-01-14 | Denso It Laboratory Inc | 車載機器制御装置および車載機器制御方法 |
JP2014095766A (ja) * | 2012-11-08 | 2014-05-22 | Sony Corp | 情報処理装置、情報処理方法及びプログラム |
JP2014517429A (ja) * | 2011-06-24 | 2014-07-17 | トムソン ライセンシング | ユーザの眼球の動きによって操作可能なコンピュータ装置、およびそのコンピュータ装置を操作する方法 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6243683B1 (en) * | 1998-12-29 | 2001-06-05 | Intel Corporation | Video control of speech recognition |
EP1215658A3 (en) * | 2000-12-05 | 2002-08-14 | Hewlett-Packard Company | Visual activation of voice controlled apparatus |
US6804396B2 (en) * | 2001-03-28 | 2004-10-12 | Honda Giken Kogyo Kabushiki Kaisha | Gesture recognition system |
US9250703B2 (en) * | 2006-03-06 | 2016-02-02 | Sony Computer Entertainment Inc. | Interface with gaze detection and voice input |
JP5601045B2 (ja) * | 2010-06-24 | 2014-10-08 | ソニー株式会社 | ジェスチャ認識装置、ジェスチャ認識方法およびプログラム |
CN103778359B (zh) * | 2014-01-24 | 2016-08-31 | 金硕澳门离岸商业服务有限公司 | 多媒体信息处理系统及多媒体信息处理方法 |
- 2015-12-07 US US15/536,299 patent/US10475439B2/en active Active
- 2015-12-07 EP EP15886508.9A patent/EP3276618A4/en not_active Ceased
- 2015-12-07 CN CN201580077946.0A patent/CN107430856B/zh not_active Expired - Fee Related
- 2015-12-07 WO PCT/JP2015/084293 patent/WO2016151956A1/ja active Application Filing
- 2015-12-07 JP JP2017507338A patent/JP6729555B2/ja active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0944297A (ja) * | 1995-07-25 | 1997-02-14 | Canon Inc | データ入力方法及びその装置 |
JP2002091489A (ja) * | 2000-09-13 | 2002-03-27 | Alpine Electronics Inc | 音声認識装置 |
JP2005012377A (ja) * | 2003-06-17 | 2005-01-13 | Sharp Corp | 通信端末、通信端末の制御方法、音声認識処理装置、音声認識処理装置の制御方法、通信端末制御プログラム、通信端末制御プログラムを記録した記録媒体、音声認識処理装置制御プログラム、および、音声認識処理装置制御プログラムを記録した記録媒体 |
JP2007094104A (ja) * | 2005-09-29 | 2007-04-12 | Sony Corp | 情報処理装置および方法、並びにプログラム |
JP2010009484A (ja) * | 2008-06-30 | 2010-01-14 | Denso It Laboratory Inc | 車載機器制御装置および車載機器制御方法 |
JP2014517429A (ja) * | 2011-06-24 | 2014-07-17 | トムソン ライセンシング | ユーザの眼球の動きによって操作可能なコンピュータ装置、およびそのコンピュータ装置を操作する方法 |
JP2014095766A (ja) * | 2012-11-08 | 2014-05-22 | Sony Corp | 情報処理装置、情報処理方法及びプログラム |
Non-Patent Citations (1)
Title |
---|
See also references of EP3276618A4 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2020160387A (ja) * | 2019-03-28 | 2020-10-01 | Necパーソナルコンピュータ株式会社 | 電子機器、制御方法およびプログラム |
US12094489B2 (en) | 2019-08-07 | 2024-09-17 | Magic Leap, Inc. | Voice onset detection |
Also Published As
Publication number | Publication date |
---|---|
CN107430856A (zh) | 2017-12-01 |
CN107430856B (zh) | 2021-02-19 |
JP6729555B2 (ja) | 2020-07-22 |
JPWO2016151956A1 (ja) | 2018-01-11 |
EP3276618A4 (en) | 2018-11-07 |
US10475439B2 (en) | 2019-11-12 |
EP3276618A1 (en) | 2018-01-31 |
US20170330555A1 (en) | 2017-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6729555B2 (ja) | 情報処理システムおよび情報処理方法 | |
JP6635049B2 (ja) | 情報処理装置、情報処理方法およびプログラム | |
US11093045B2 (en) | Systems and methods to augment user interaction with the environment outside of a vehicle | |
JP6848881B2 (ja) | 情報処理装置、情報処理方法、及びプログラム | |
WO2017130486A1 (ja) | 情報処理装置、情報処理方法およびプログラム | |
US10771707B2 (en) | Information processing device and information processing method | |
WO2016152200A1 (ja) | 情報処理システムおよび情報処理方法 | |
WO2019077897A1 (ja) | 情報処理装置、情報処理方法、およびプログラム | |
JP6627775B2 (ja) | 情報処理装置、情報処理方法およびプログラム | |
US10720154B2 (en) | Information processing device and method for determining whether a state of collected sound data is suitable for speech recognition | |
US10522140B2 (en) | Information processing system and information processing method | |
WO2018139036A1 (ja) | 情報処理装置、情報処理方法およびプログラム | |
JP6575518B2 (ja) | 表示制御装置、表示制御方法およびプログラム | |
JP2016109726A (ja) | 情報処理装置、情報処理方法およびプログラム | |
JP2016156877A (ja) | 情報処理装置、情報処理方法およびプログラム | |
JP2016180778A (ja) | 情報処理システムおよび情報処理方法 | |
JP2017138698A (ja) | 情報処理装置、情報処理方法およびプログラム | |
JP2016170584A (ja) | 情報処理装置、情報処理方法およびプログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15886508 Country of ref document: EP Kind code of ref document: A1 |
ENP | Entry into the national phase |
Ref document number: 2017507338 Country of ref document: JP Kind code of ref document: A |
WWE | Wipo information: entry into national phase |
Ref document number: 15536299 Country of ref document: US |
REEP | Request for entry into the european phase |
Ref document number: 2015886508 Country of ref document: EP |
NENP | Non-entry into the national phase |
Ref country code: DE |