WO2021044732A1 - Information processing device, information processing method, and storage medium - Google Patents

Information processing device, information processing method, and storage medium

Info

Publication number
WO2021044732A1
WO2021044732A1 (PCT application No. PCT/JP2020/027052)
Authority
WO
WIPO (PCT)
Prior art keywords
unit
information processing
processing device
image
real
Prior art date
Application number
PCT/JP2020/027052
Other languages
French (fr)
Japanese (ja)
Inventor
貴広 岡山
Original Assignee
ソニー株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ソニー株式会社
Publication of WO2021044732A1 publication Critical patent/WO2021044732A1/en

Classifications

    • GPHYSICS
    • G03PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03BAPPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B7/00Control of exposure by setting shutters, diaphragms or filters, separately or conjointly
    • G03B7/08Control effected solely on the basis of the response, to the intensity of the light received by the camera, of a built-in light-sensitive device
    • G03B7/091Digital circuits
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0346Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/70Circuitry for compensating brightness variation in the scene

Definitions

  • This disclosure relates to an information processing device, an information processing method, and a storage medium.
  • Patent Document 1 describes a technique for superimposing various data on a recognized finger position.
  • Patent Document 2 describes a technique for varying the superimposed operation UI according to the recognition accuracy.
  • The present disclosure has been made in view of the above points, and one object of the present disclosure is to provide an information processing device, an information processing method, and a storage medium in which settings are made so that a recognition target can be appropriately recognized.
  • The present disclosure is, for example, an information processing device including: a determination unit that determines a change in a user's gaze range with respect to a real space; an imaging control unit that sets the exposure of an imaging unit according to the change in the gaze range and controls the imaging unit to acquire an image of the real space based on the exposure setting; a recognition unit that recognizes a real object included in the image of the real space; and a display control unit that controls a display unit so that a virtual object superimposed on the real space changes based on the recognition result of the real object.
  • The present disclosure is also, for example, an information processing method in which: a determination unit determines a change in a user's gaze range with respect to a real space; an imaging control unit sets the exposure of an imaging unit according to the change in the gaze range and controls the imaging unit to acquire an image of the real space based on the exposure setting; a recognition unit recognizes a real object included in the image of the real space; and a display control unit controls a display unit so that a virtual object superimposed on the real space changes based on the recognition result of the real object.
  • The present disclosure is also, for example, a storage medium storing a program that causes a computer to execute an information processing method in which: a determination unit determines a change in a user's gaze range with respect to a real space; an imaging control unit sets the exposure of an imaging unit according to the change in the gaze range and controls the imaging unit to acquire an image of the real space based on the exposure setting; a recognition unit recognizes a real object included in the image of the real space; and a display control unit controls a display unit so that a virtual object superimposed on the real space changes based on the recognition result of the real object.
  • FIG. 1 is a diagram that is referred to when the outline of one embodiment is explained.
  • 2A and 2B are diagrams that are referred to when the outline of one embodiment is explained.
  • FIG. 3 is a diagram showing an example of the external appearance of the information processing apparatus according to the embodiment.
  • FIG. 4 is a block diagram showing an example of the internal configuration of the information processing apparatus according to the embodiment.
  • FIG. 5 is a flowchart showing a flow of processing performed by the information processing apparatus according to the embodiment.
  • FIG. 6 is a diagram referred to when a process performed by the information processing apparatus according to the embodiment is described.
  • FIG. 7 is a flowchart showing a flow of processing performed by the information processing apparatus according to the embodiment.
  • the information processing device 1 is mounted so that the left and right frames are hung on the left and right ears of the user U.
  • the information processing device 1 has an optically transparent display unit (optical see-through display) arranged in front of one or both eyes of the user U.
  • the information processing device 1 has a camera that acquires an image in the line-of-sight direction of the user U, and a real object is recognized based on the image acquired by the camera.
  • a real object is a part of a living body, specifically, a hand, a finger, a wrist, or the like.
  • Predetermined information, which is a virtual object, is superimposed and displayed on the recognized real object.
  • the predetermined information is various GUIs (Graphical User Interfaces), and specific examples thereof include a menu screen.
  • For example, when the user U moves one hand in front of the eyes, the hand is recognized, and the menu screen is superimposed and displayed on the recognized hand. The user U then performs a tap operation with the other hand or another known operation on the menu screen.
  • When the operation position on the menu screen is recognized by the information processing device 1, the processing corresponding to the operation is executed by the information processing device 1.
  • The predetermined information may be a virtual object operated by a real object, arranged in a three-dimensional coordinate system associated with the real space. In the present embodiment, such an input system can be constructed using AR (Augmented Reality) technology.
  • the display unit included in the information processing device 1 may be a display unit (video see-through display) that does not have optical transparency and displays a captured image in the front real space in real time.
  • the information processing device 1 may be a HUD (Head-Up Display) installed in the interior of a car or the like, or an HMD (Head-Mounted Display) mounted on the head.
  • the information processing device 1 may be a tabletop type display device in which an image is projected on a plane such as a table by a projection device such as a projector.
  • the information processing device 1 may be a personal computer, a smart phone, a tablet computer, a PND (Portable Navigation Device), or the like.
  • In the present embodiment, the change in the gaze range is determined, and the exposure of the camera is set according to the change in the gaze range.
  • As a result, the exposure setting is matched to the vicinity of the hand HA, which is the gaze range, so that the shape of the hand HA can be recognized more easily, as shown in FIG. 2B.
  • FIG. 3 is a diagram showing an external example of the information processing device 1 according to the present embodiment.
  • the information processing device 1 according to the present embodiment is a glasses-type wearable device.
  • the information processing device 1 has a frame 5 for holding the left image display unit 3A and the right image display unit 3B in front of the eyes, like ordinary eyeglasses.
  • the frame 5 is made of the same materials that make up ordinary eyeglasses, such as metals, alloys, plastics, and combinations thereof.
  • the left image display unit 3A and the right image display unit 3B may be an optical see-through display or a video see-through display. When it is not necessary to distinguish the individual display units, the left image display unit 3A and the right image display unit 3B are appropriately collectively referred to as the display unit 3.
  • the frame 5 is equipped with batteries, various sensors, speakers, and the like.
  • For example, the frame 5 is equipped with an outward-facing camera (an example of an imaging unit) for recognizing an object existing in the line-of-sight direction and an inward-facing camera (an example of another imaging unit) for detecting the line of sight of the user U.
  • In the present disclosure, the other imaging unit may be referred to as a second imaging unit.
  • the outward-facing camera is configured as, for example, a stereo camera 11 that simultaneously acquires a plurality of images.
  • the stereo camera 11 has two cameras (cameras 11A and 11B).
  • the inward-facing camera includes a left-eye camera 12A and a right-eye camera 12B.
  • FIG. 4 is a block diagram showing an example of the internal configuration of the information processing device 1 according to the present embodiment.
  • the information processing device 1 includes, for example, a sensor unit 10, a control unit 20, an output unit 30, and a storage unit 40. These configurations are connected via the bus 50, and commands and various data can be exchanged between the configurations.
  • the sensor unit 10 includes the stereo camera 11, the left-eye camera 12A, and the right-eye camera 12B described above.
  • the left-eye camera 12A and the right-eye camera 12B are, for example, infrared cameras.
  • the stereo camera 11 has an image sensor such as a CCD (Charge Coupled Device) or a CMOS (Complementary Metal Oxide Semiconductor).
  • the sensor unit 10 is not limited to the camera, and may include various sensors.
  • the sensor unit 10 includes a microphone, a GPS (Global Positioning System) sensor, an acceleration sensor, a visual (line of sight, gaze point, focus, blinking eye, etc.) sensor, a living body (heartbeat, body temperature, blood pressure, brain wave, etc.) sensor, and a gyro sensor. , Various sensors such as an illuminance sensor may be included.
  • the sensing data acquired by the sensor unit 10 is supplied to the control unit 20.
  • the control unit 20 has, for example, a determination unit 201, an image pickup control unit 202, a recognition unit 203, and a display control unit 204 as functional blocks.
  • the determination unit 201 determines a change in the gaze range of the user U with respect to the real space.
  • The determination unit 201 specifies, for example, the viewpoint position (gaze point) of the user U, or a predetermined range based on the viewpoint position, as the gaze range. For example, the positions of light spots indicating reflections of infrared light emitted from a plurality of infrared LEDs (Light Emitting Diodes) onto the pupils of the user U are photographed by the left-eye camera 12A and the right-eye camera 12B, respectively.
  • By identifying the pupils based on the captured images, the line-of-sight positions of the left eye and the right eye are detected.
  • A predetermined range based on the detected line-of-sight position or the detected viewpoint position is set as the gaze range.
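To make the flow above concrete, here is a minimal Python sketch of setting a square gaze range around the detected line-of-sight position. The function name, the ROI size, and the frame dimensions are illustrative assumptions rather than values from the disclosure, and the mapping from the eye cameras' images into the outward camera's coordinates is omitted:

```python
import numpy as np

def gaze_roi(left_gaze_px, right_gaze_px, roi_size=160, frame_shape=(480, 640)):
    """Set a square gaze range centered near the detected line-of-sight position.

    left_gaze_px / right_gaze_px: (x, y) gaze positions estimated from the
    pupil images of the left-eye and right-eye cameras, assumed to be already
    mapped into the outward stereo camera's pixel coordinates.
    """
    gx, gy = np.mean([left_gaze_px, right_gaze_px], axis=0)
    h, w = frame_shape
    half = roi_size // 2
    # Clamp so the range stays inside the frame.
    x0 = int(min(max(gx - half, 0), w - roi_size))
    y0 = int(min(max(gy - half, 0), h - roi_size))
    return x0, y0, roi_size, roi_size  # x, y, width, height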
  • The image pickup control unit 202 sets the exposure of the image pickup unit (the stereo camera 11 in this example) according to the change in the gaze range, and controls the stereo camera 11 to acquire an image of the real space based on the exposure setting. For example, the image pickup control unit 202 sets the exposure based on the average brightness of the gaze range. Since the average brightness can change as the gaze range changes, the image pickup control unit 202 resets the exposure setting when the gaze range changes.
  • The exposure setting includes settings related to the aperture, shutter speed, and the like, but is not limited to any specific parameter as long as the exposure is changed.
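A hedged sketch of the brightness-based exposure update follows; the mid-gray target of 118 and the idea of returning a multiplicative correction to the exposure time are assumptions for illustration, since the disclosure does not fix a particular formula:

```python
def exposure_correction(gray_frame, roi, target_mean=118.0):
    """Compare the gaze range's average brightness with a target value and
    return a factor by which the current exposure time could be scaled."""
    x, y, w, h = roi
    patch = gray_frame[y:y + h, x:x + w]
    mean_brightness = float(patch.mean()) + 1e-6  # avoid division by zero
    return target_mean / mean_brightness  # >1 brightens, <1 darkens
```

In practice the factor would be clamped and split across shutter speed, gain, and aperture, which is exactly the kind of parameter choice the text above leaves open.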
  • the recognition unit 203 recognizes a real object included in a real space image.
  • the real object is, for example, the hand of user U.
  • the recognition unit 203 analyzes the captured image acquired by the stereo camera 11 and performs a recognition process for a hand existing in the real space.
  • For example, the recognition unit 203 identifies the hand in the captured image by collating an image feature amount extracted from the captured image with image feature amounts of known real objects stored in the storage unit 40, and recognizes the position and shape of the hand in the captured image.
  • The hand in the captured image may also be recognized by other known methods, such as a method using a plurality of feature points (for example, finger joints).
  • The recognition unit 203 may analyze the images captured by the stereo camera 11 and acquire three-dimensional shape information (depth information) of the real space. For example, the recognition unit 203 may recognize the three-dimensional shape of the real space and acquire the three-dimensional shape information by performing a stereo matching method on a plurality of images simultaneously acquired by the stereo camera 11, or an SfM (Structure from Motion) method, a SLAM method, or the like on a plurality of images acquired in time series. Further, when the recognition unit 203 can acquire the three-dimensional shape information of the real space, the recognition unit 203 may recognize the three-dimensional position, shape, size, and posture of the real object.
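For the stereo-matching route mentioned above, the depth of a matched point follows from the standard triangulation relation Z = f·B/d for a rectified stereo pair. The sketch below illustrates the conversion; the focal length and baseline values are placeholders, not parameters of the stereo camera 11:

```python
def depth_from_disparity(disparity_px, focal_length_px=700.0, baseline_m=0.06):
    """Triangulate depth for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # no valid match
    return focal_length_px * baseline_m / disparity_px

# Example: a 40-pixel disparity gives 700 * 0.06 / 40 = 1.05 m.
```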
  • the recognition unit 203 may recognize the user operation based on the sensing data or the like.
  • the recognition unit 203 according to the present embodiment recognizes a gesture operation based on a continuous change in the shape of the hand.
  • the recognition unit 203 recognizes the gesture operation by cutting out the partial image including the hand, scaling the cut out partial image, temporarily storing the partial image, calculating the difference between frames, and the like.
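As an illustration of the cut-out/scale/store/difference pipeline just described, the following sketch keeps recent fixed-size hand crops and scores how much the shape changes between frames; the class, the crop size, and the use of a plain mean absolute difference are assumptions, not the disclosed method:

```python
from collections import deque
import numpy as np

class HandShapeTracker:
    """Store scaled crops of the hand from recent frames and measure the
    inter-frame difference as a cue for gesture recognition."""

    def __init__(self, size=64, history=5):
        self.size = size
        self.history = deque(maxlen=history)

    def update(self, hand_crop):
        # Nearest-neighbour resize to a fixed size (stands in for proper scaling).
        h, w = hand_crop.shape[:2]
        ys = np.linspace(0, h - 1, self.size).astype(int)
        xs = np.linspace(0, w - 1, self.size).astype(int)
        scaled = hand_crop[np.ix_(ys, xs)].astype(np.float32)
        diff = float(np.abs(scaled - self.history[-1]).mean()) if self.history else 0.0
        self.history.append(scaled)  # temporary storage of the partial image
        return diff  # larger values suggest the hand shape is changing faster
```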
  • the display control unit 204 controls the display unit 3 so that the virtual object superimposed on the real space changes based on the recognition result of the real object. For example, the display control unit 204 controls to superimpose the menu screen, which is an example of GUI, on the recognized hand. With this control, an image in which the menu screen is superimposed on the position of the hand recognized by the recognition unit 203 is presented to the user U via the display unit 3.
  • the output unit 30 outputs various information by display or voice.
  • the output unit 30 has, for example, a display unit 3 and a speaker 301.
  • the output unit 30 may have an oscillator (vibrator) or the like.
  • the storage unit 40 stores programs and data for processing by the information processing device 1.
  • the storage unit 40 stores an image feature amount used for hand recognition and a gesture pattern used for recognizing a gesture operation. Further, the storage unit 40 stores the recognition result information which is the recognition result by the recognition unit 203.
  • the recognition result information is, for example, information on a recognized object or a range of a hand, information on a recognized hand shape, or the like.
  • As the storage unit 40, a magnetic storage device such as an HDD (Hard Disk Drive), a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like can be applied.
  • the configuration of the information processing device 1 described above is merely an example, and the information processing device 1 may have another configuration.
  • the information processing device 1 may have a wireless module for communicating with an external device and a battery for supplying electric power to each configuration.
  • FIG. 5 is a flowchart showing an operation example of the information processing apparatus 1 according to the present embodiment.
  • The process described below is performed, for example, when the information processing apparatus 1 is set to a mode in which the user U moves his/her hand to the point of the line of sight, the menu screen is superimposed on the hand, and various inputs are performed using the menu screen.
  • the process described below may be performed regardless of the mode setting.
  • the line-of-sight position is detected in step ST11.
  • the images taken by the left-eye camera 12A and the right-eye camera 12B are supplied to the control unit 20.
  • the determination unit 201 of the control unit 20 detects the line-of-sight position of the user U.
  • the gaze range is set based on the line-of-sight position. Then, the process proceeds to step ST12.
  • In step ST12, a predetermined hand range is set within the gaze range.
  • the set range is set as the calculation range of the parameters related to the exposure setting.
  • the range of the hand centered on the line-of-sight position is set as the calculation range of the parameters related to the exposure setting.
  • The line-of-sight position does not have to be strictly centered; it suffices that the line-of-sight position is located near the approximate center of the set range.
  • Such processing is performed by, for example, the determination unit 201.
  • As schematically shown in FIG. 6, the predetermined hand range is a pre-calculated range indicating how large a hand held up in the line-of-sight direction will appear, based on the arm length a and hand size b of an average person.
  • a plurality of predetermined hand ranges may be prepared, and any range may be selected according to the attributes of the user U and the like. Then, the process proceeds to step ST13.
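The pre-calculation in FIG. 6 (used for the range set in step ST12) can be pictured with a pinhole-projection estimate: a hand of size b held at arm length a appears roughly f·b/a pixels across. The numbers below (focal length, arm length, hand size, margin) are illustrative assumptions only:

```python
def hand_range_px(focal_length_px=700.0, arm_length_m=0.6, hand_size_m=0.2,
                  margin=1.5):
    """Estimate how large a hand held at arm's length appears in the image,
    with a safety margin so the whole hand fits in the range."""
    side_px = focal_length_px * hand_size_m / arm_length_m  # pinhole projection
    return round(side_px * margin)

# hand_range_px() -> round(700 * 0.2 / 0.6 * 1.5) = 350 pixels per side
```

Preparing a few such pre-computed ranges and selecting one by user attributes, as the text suggests, keeps this entirely off the per-frame critical path.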
  • In step ST13, the parameters related to the exposure setting (AE (Auto Exposure) parameters) are calculated by the imaging control unit 202.
  • The image pickup control unit 202 calculates the parameters based on, for example, the average value of the brightness in the range set in step ST12. Then, the process proceeds to step ST14.
  • In step ST14, the image pickup control unit 202 performs the exposure setting by setting the parameters calculated in step ST13 in the stereo camera 11.
  • the stereo camera 11 acquires an image in real space by taking a picture based on the set exposure setting. Then, the process proceeds to step ST15.
  • In step ST15, the recognition unit 203 detects whether or not the real space image includes a hand by executing processing based on a hand recognition algorithm on the real space image acquired by the stereo camera 11. Then, the process proceeds to step ST16.
  • In step ST16, the recognition unit 203 determines whether or not the shape of the hand can be recognized. Although the hand is not recognized in the initial pass, the user U holds his/her hand over the gaze range in accordance with the mode setting, so the hand becomes recognizable in the process of step ST16.
  • the recognition result such as the shape of the hand is stored in the storage unit 40 as the recognition result information.
  • the recognition result information stored in the storage unit 40 may be used as a template or the like in the next hand recognition process. If the hand is not recognized in the process related to step ST16, the process returns to step ST12. If the hand is recognized in the process related to step ST16, the process proceeds to step ST17.
  • In step ST17, the range that is within the gaze range and includes the hand recognized in the processes related to steps ST15 and ST16 is reset as the calculation range of the parameters related to the exposure setting. Then, the process returns to step ST13, and the parameters related to the exposure setting are calculated based on the brightness of the reset range.
  • the position of the recognized hand is appropriately converted to the position of the image displayed on the display unit 3.
  • the display control unit 204 performs a process of superimposing the menu screen on the position of the hand in the image. As a result, the user U can visually recognize the menu screen superimposed on the hand via the display unit 3.
  • FIG. 7 is a flowchart showing the flow of the control process for monitoring changes in the gaze range. The process related to the flowchart shown in FIG. 7 is performed in parallel with the process related to the flowchart shown in FIG. 5.
  • In step ST21, the line-of-sight position is detected.
  • For example, the images captured by the left-eye camera 12A and the right-eye camera 12B are supplied to the control unit 20.
  • the determination unit 201 of the control unit 20 detects the line-of-sight position of the user U.
  • the gaze range is set based on the line-of-sight position. Then, the process proceeds to step ST22.
  • In step ST22, the determination unit 201 determines whether or not there has been a change in the gaze range beyond a certain level. If there is no change in the gaze range beyond a certain level, the process returns to step ST21. When there is a change in the gaze range beyond a certain level, the process proceeds to step ST23.
  • In step ST23, control for returning the process related to the flowchart shown in FIG. 5 to step ST11 is performed as an interrupt. That is, since the gaze range has changed, the parameters related to the exposure setting are recalculated for the changed gaze range so that the hand can be properly recognized within it. Specifically, the processes related to steps ST11 to ST17 described above are performed. In this way, the exposure of the stereo camera 11 is set according to the change in the gaze range, and the stereo camera 11 acquires an image of the real space based on that exposure setting. The position of the recognized hand may change as the gaze range changes; the display control unit 204 performs control so that the position of the menu screen follows the position of the hand after the change.
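A minimal sketch of the step ST22/ST23 check might look like the following, where the pixel threshold corresponding to "a change beyond a certain level" is an assumed value:

```python
import math

def gaze_changed(prev_gaze, cur_gaze, threshold_px=40.0):
    """Return True when the gaze point has moved beyond a fixed amount,
    i.e. the exposure parameters should be recalculated from step ST11."""
    if prev_gaze is None:
        return True  # no exposure has been set yet
    dx = cur_gaze[0] - prev_gaze[0]
    dy = cur_gaze[1] - prev_gaze[1]
    return math.hypot(dx, dy) > threshold_px
```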
  • As described above, according to the present embodiment, the exposure is appropriately set to match the range of the hand that is considered to exist in the gaze range. Therefore, when the user holds his/her hand over the gaze range, that is, the point of the line of sight, it is possible to prevent errors in which the hand is not recognized, even in an environment where blackout or whiteout is likely to occur. Further, since the parameters related to the exposure setting are calculated based on information (brightness in this example) within a limited range, the processing time can be shortened. When the hand is recognized, the calculation range of the parameters related to the exposure setting is reset based on the range of the recognized hand. Therefore, the calculation range of the parameters related to the exposure setting can be adjusted to accommodate individual differences in hand size and arm length.
  • The above problems could also be addressed by applying a camera with a large dynamic range or a camera capable of advanced correction processing.
  • However, in that case the cost of the information processing device 1 itself becomes high, and the information processing device 1 may become large in size.
  • Moreover, when the information processing device 1 is a wearable device or the like, low power consumption is preferable, but the above-mentioned approach may increase the power consumption and require frequent charging.
  • In contrast, according to the present embodiment, the recognition target can be appropriately recognized without using such a high-performance camera. Therefore, the cost of the information processing device 1 can be reduced, and the information processing device 1 can be downsized.
  • the image pickup control unit 202 may use a table in which the exposure setting is associated with the brightness when performing the exposure setting.
  • The image pickup control unit 202 may read out the exposure setting corresponding to the measured brightness from the table and set the read-out exposure setting in the stereo camera 11.
  • the table may be stored in the storage unit 40.
  • the table may be stored in the information processing device 1 in advance, or may be acquired from the outside via the network.
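A sketch of such a table-driven exposure setting is shown below; the brightness bins and exposure times are invented placeholder values, since the disclosure only says that exposure settings are associated with brightness:

```python
import bisect

# Hypothetical table: brightness thresholds -> exposure time in milliseconds.
BRIGHTNESS_STEPS = [32, 64, 96, 128, 160, 192, 224]
EXPOSURE_MS = [40.0, 20.0, 10.0, 5.0, 2.5, 1.2, 0.6, 0.3]

def exposure_from_table(mean_brightness):
    """Look up the exposure setting for a measured brightness; a fixed-cost
    lookup replaces recomputing the setting from scratch."""
    return EXPOSURE_MS[bisect.bisect_right(BRIGHTNESS_STEPS, mean_brightness)]

# exposure_from_table(130) -> 2.5 (falls in the 128-160 bin)
```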
  • The recognition unit 203 may obtain a reliability (hand recognition accuracy) for hand recognition. For example, the recognition unit 203 may determine that the hand in the image captured by the stereo camera 11 is recognized when the reliability is equal to or higher than a threshold value, and determine that the hand in the image captured by the stereo camera 11 is not recognized when the reliability is lower than the threshold value. Further, the recognition unit 203 may output the hand recognition result to the display control unit 204 when the reliability is equal to or higher than the threshold value, and may not output the hand recognition result to the display control unit 204 when the reliability is lower than the threshold value. That is, the GUI may be superimposed on the recognized hand position only when the reliability of hand recognition is equal to or higher than the threshold value.
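The reliability gate described above could be sketched as follows; the result dictionary layout, the 0.8 threshold, and the show_menu/hide_menu interface on the display control unit are all hypothetical stand-ins rather than interfaces defined by the disclosure:

```python
def publish_hand_result(recognition, display_control, threshold=0.8):
    """Forward the hand recognition result to the display control unit only
    when its reliability clears the threshold; otherwise the GUI stays hidden.

    `recognition` is assumed to look like:
    {"reliability": 0.92, "position": (x, y), "shape": ...}
    """
    if recognition["reliability"] >= threshold:
        display_control.show_menu(recognition["position"])
        return True
    display_control.hide_menu()  # treated as "hand not recognized"
    return False
```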
  • When the hand held up in the line-of-sight direction shakes, superimposing the GUI on the shaking hand causes the GUI to shake as well, which may give the user U a sense of discomfort.
  • With the configuration described above, the reliability is lowered when the hand shakes, so it is possible to prevent the GUI from being superimposed and displayed on the shaking hand and giving the user U a sense of discomfort.
  • the image pickup control unit 202 may switch the calculation range of the parameters related to the exposure setting according to the reliability. For example, even if the shape of the hand is recognized in the process related to step ST16, if the reliability is equal to or less than the threshold value, the process may return to step ST12. If the reliability regarding the recognition of the shape of the hand is larger than the threshold value, the process proceeds to step ST17. As a result, the calculation range of the parameters related to the exposure setting can be appropriately set.
  • the recognition unit 203 may recognize a continuous change in the shape of the hand, and further, a gesture operation based on the change in the shape of the hand may be recognized by the recognition unit 203. Then, the operation content for the GUI (for example, the menu screen) may be determined based on the gesture operation, and the processing may be performed according to the operation content.
  • the information processing device 1 may be a smartphone instead of a glasses-type wearable device.
  • the in-camera of the smartphone may be used as a camera for detecting the line of sight
  • the out-camera may be used as a camera for acquiring an image for recognizing a hand. More specifically, the in-camera is arranged on the surface (display surface) on which the display of the smartphone is arranged, and photographs the user himself / herself who is visually recognizing the display.
  • the out-camera is arranged on the surface opposite to the display surface and captures the real space in front of the user. That is, it may be considered that the shooting direction of the in-camera and the shooting direction of the out-camera are opposite.
  • the camera that captures the subject in the line-of-sight direction such as the hand may be one camera instead of the stereo camera 11.
  • The depth information may be acquired by a ToF (Time of Flight) sensor or a LiDAR (Light Detection and Ranging) sensor.
  • the user U may be a robot or the like.
  • The present disclosure can also be realized by devices, methods, programs, systems, and the like. For example, by making a program that implements the functions described in the above embodiment downloadable, a device that does not have those functions can download and install the program and thereby perform the control described in the embodiment.
  • the present disclosure can also be realized by a server that distributes such a program.
  • the items described in each embodiment and modification can be combined as appropriate.
  • the present disclosure may also adopt the following configuration.
  • (1) An information processing device including: a determination unit that determines a change in a user's gaze range with respect to a real space; an imaging control unit that sets the exposure of an imaging unit according to the change in the gaze range and controls the imaging unit to acquire an image of the real space based on the exposure setting; a recognition unit that recognizes a real object included in the image of the real space; and a display control unit that controls a display unit so that a virtual object superimposed on the real space changes based on the recognition result of the real object.
  • (2) The information processing device according to (1), wherein the imaging control unit performs the exposure setting based on the brightness of a predetermined range.
  • (3) The information processing device according to (2), wherein the predetermined range is a range having a preset size.
  • (4) The information processing device according to (2), wherein the predetermined range is within the gaze range and is the range of the real object recognized by the recognition unit.
  • (5) The information processing device according to any one of (1) to (4), wherein the imaging control unit sets the exposure of the imaging unit by using a table in which exposure settings are associated with brightness.
  • (6) The information processing device according to any one of (2) to (5), wherein the recognition unit recognizes that the image of the real space includes the real object when the recognition accuracy of the real object is equal to or higher than a threshold value, and recognizes that the image of the real space does not include the real object when the recognition accuracy of the real object is less than the threshold value.
  • (7) The information processing device according to (6), wherein the imaging control unit switches the predetermined range according to the recognition accuracy.
  • (8) The information processing device according to any one of (1) to (7), wherein the real object is a part of the user's body.
  • (9) The information processing device according to (8), wherein the real object is the hand of the user.
  • (10) The information processing device according to (9), wherein the recognition unit recognizes a continuous change in the shape of the user's hand.
  • (11) The information processing device according to any one of (1) to (10), wherein the determination unit specifies the gaze range based on a detection result of the position of the user's line of sight.
  • (12) The information processing device according to any one of (1) to (11), further including a second imaging unit that detects the line of sight and is different from the imaging unit.
  • (13) The information processing device according to (12), wherein the imaging direction of the second imaging unit is opposite to the imaging direction of the imaging unit.
  • (14) The information processing device according to any one of (1) to (13), wherein the display unit is either an optical see-through display or a video see-through display.
  • (15) The information processing device according to any one of (1) to (14), configured as a wearable device that can be attached to and detached from the human body, or as a smartphone.
  • (16) An information processing method in which: a determination unit determines a change in a user's gaze range with respect to a real space; an imaging control unit sets the exposure of an imaging unit according to the change in the gaze range and controls the imaging unit to acquire an image of the real space based on the exposure setting; a recognition unit recognizes a real object included in the image of the real space; and a display control unit controls a display unit so that a virtual object superimposed on the real space changes based on the recognition result of the real object.
  • (17) A storage medium storing a program that causes a computer to execute an information processing method in which: a determination unit determines a change in a user's gaze range with respect to a real space; an imaging control unit sets the exposure of an imaging unit according to the change in the gaze range and controls the imaging unit to acquire an image of the real space based on the exposure setting; a recognition unit recognizes a real object included in the image of the real space; and a display control unit controls a display unit so that a virtual object superimposed on the real space changes based on the recognition result of the real object.

Abstract

An information processing device having: an assessment unit for assessing a change of a user's gaze range relating to a real space; an imaging control unit for controlling an imaging unit so as to set the exposure of the imaging unit in accordance with the change of the gaze range and acquire an image of the real space on the basis of the exposure setting; a recognition unit for recognizing a real object included in the image of the real space; and a display control unit for controlling a display unit, on the basis of the real object recognition result, so that a virtual object superimposed on the real space changes.

Description

Information processing device, information processing method, and storage medium
 The present disclosure relates to an information processing device, an information processing method, and a storage medium.
 Proposals have been made for devices that perform various kinds of processing according to the recognition result for a recognition target. For example, Patent Document 1 below describes a technique for superimposing various data on a recognized finger position, and Patent Document 2 below describes a technique for varying the superimposed operation UI according to the recognition accuracy.
Japanese Unexamined Patent Publication No. 2010-257359 (Patent Document 1)
International Publication No. WO 2017/104272 (Patent Document 2)
 In such a field, it is desirable that various settings be made so that a recognition target can be appropriately recognized.
 The present disclosure has been made in view of the above points, and one object of the present disclosure is to provide an information processing device, an information processing method, and a storage medium in which settings are made so that a recognition target can be appropriately recognized.
 The present disclosure is, for example,
an information processing device including:
a determination unit that determines a change in a user's gaze range with respect to a real space;
an imaging control unit that sets the exposure of an imaging unit according to the change in the gaze range and controls the imaging unit to acquire an image of the real space based on the exposure setting;
a recognition unit that recognizes a real object included in the image of the real space; and
a display control unit that controls a display unit so that a virtual object superimposed on the real space changes based on the recognition result of the real object.
 The present disclosure is also, for example,
an information processing method in which:
a determination unit determines a change in a user's gaze range with respect to a real space;
an imaging control unit sets the exposure of an imaging unit according to the change in the gaze range and controls the imaging unit to acquire an image of the real space based on the exposure setting;
a recognition unit recognizes a real object included in the image of the real space; and
a display control unit controls a display unit so that a virtual object superimposed on the real space changes based on the recognition result of the real object.
 The present disclosure is also, for example,
a storage medium storing a program that causes a computer to execute an information processing method in which:
a determination unit determines a change in a user's gaze range with respect to a real space;
an imaging control unit sets the exposure of an imaging unit according to the change in the gaze range and controls the imaging unit to acquire an image of the real space based on the exposure setting;
a recognition unit recognizes a real object included in the image of the real space; and
a display control unit controls a display unit so that a virtual object superimposed on the real space changes based on the recognition result of the real object.
FIG. 1 is a diagram referred to when the outline of one embodiment is explained.
FIGS. 2A and 2B are diagrams referred to when the outline of one embodiment is explained.
FIG. 3 is a diagram showing an example of the external appearance of the information processing device according to the embodiment.
FIG. 4 is a block diagram showing an example of the internal configuration of the information processing device according to the embodiment.
FIG. 5 is a flowchart showing a flow of processing performed by the information processing device according to the embodiment.
FIG. 6 is a diagram referred to when processing performed by the information processing device according to the embodiment is described.
FIG. 7 is a flowchart showing a flow of processing performed by the information processing device according to the embodiment.
 Hereinafter, embodiments and the like of the present disclosure will be described with reference to the drawings. The description will be given in the following order.
<One Embodiment>
<Modification Examples>
 The embodiments and the like described below are preferred specific examples of the present disclosure, and the contents of the present disclosure are not limited to these embodiments and the like.
<One Embodiment>
[Outline of One Embodiment]
 First, an outline of one embodiment will be given. In the present embodiment, a glasses-type wearable device is described as an example of the information processing device (information processing device 1). As shown in FIG. 1, the information processing device 1 is worn so that its left and right frames rest on the left and right ears of the user U. The information processing device 1 has an optically transparent display unit (optical see-through display) arranged in front of one or both eyes of the user U. The information processing device 1 has a camera that acquires an image in the line-of-sight direction of the user U, and a real object is recognized based on the image acquired by the camera. A real object is a part of a living body, specifically a hand, a finger, a wrist, or the like. Predetermined information, which is a virtual object, is superimposed and displayed on the recognized real object. The predetermined information is various GUIs (Graphical User Interfaces), a specific example being a menu screen. For example, when the user U moves one hand in front of the eyes, the hand is recognized, and the menu screen is superimposed and displayed on the recognized hand. The user U then performs a tap operation with the other hand or another known operation on the menu screen. When the operation position on the menu screen is recognized by the information processing device 1, the processing corresponding to the operation is executed by the information processing device 1. The predetermined information may be a virtual object operated by a real object, arranged in a three-dimensional coordinate system associated with the real space. In the present embodiment, such an input system can be constructed using AR (Augmented Reality) technology.
 The display unit included in the information processing device 1 may be a display unit (video see-through display) that does not have optical transparency and displays a captured image of the real space in front in real time. The information processing device 1 may also be a HUD (Head-Up Display) installed in the interior of a car or the like, or an HMD (Head-Mounted Display) worn on the head. When the information processing device 1 is a HUD, predetermined information is superimposed on the real-space scenery visually recognized via the HUD. The information processing device 1 may also be a tabletop display device in which an image is projected onto a plane such as a table by a projection device such as a projector. The information processing device 1 may also be a personal computer, a smartphone, a tablet computer, a PND (Portable Navigation Device), or the like.
 In constructing the input system described above, the hand on which the GUI is superimposed needs to be recognized appropriately. For example, when the user U is facing a bright direction from a dark place and holds up a hand for hand recognition, the captured image of the hand is blacked out, making it difficult to recognize the correct shape of the hand. Specifically, as schematically shown in FIG. 2A, when a hand is held up on a sunny day, the exposure matches the bright sky in the background, and the tree Tr and the hand HA are blacked out. In this way, for a subject with extreme contrast between light and dark, a phenomenon called blackout or whiteout occurs, and there is a high possibility that the recognition target will not be recognized accurately.
 In this example, since the GUI is superimposed and displayed on the hand, the user U is considered to be gazing at the held-up hand. Therefore, in the present embodiment, a change in the gaze range is determined, and the exposure of the camera is set according to the change in the gaze range. As a result, the exposure setting is matched to the vicinity of the hand HA, which is the gaze range, so that the shape of the hand HA can be recognized more easily, as shown in FIG. 2B. The details of the present embodiment are further described below.
[Configuration Example of Information Processing Device]
(Example of External Appearance of Information Processing Device)
 FIG. 3 is a diagram showing an example of the external appearance of the information processing device 1 according to the present embodiment. As described above, the information processing device 1 according to the present embodiment is a glasses-type wearable device. Like ordinary eyeglasses, the information processing device 1 has a frame 5 for holding the left image display unit 3A and the right image display unit 3B in front of the eyes. The frame 5 is made of the same materials that make up ordinary eyeglasses, such as metals, alloys, plastics, and combinations thereof. The left image display unit 3A and the right image display unit 3B may each be an optical see-through display or a video see-through display. When it is not necessary to distinguish the individual display units, the left image display unit 3A and the right image display unit 3B are collectively referred to as the display unit 3 as appropriate.
 The frame 5 is equipped with a battery, various sensors, speakers, and the like. For example, the frame 5 is equipped with an outward-facing camera (an example of an imaging unit) for recognizing an object existing in the line-of-sight direction and an inward-facing camera (an example of another imaging unit) for detecting the line of sight of the user U. In the present disclosure, the other imaging unit may be referred to as a second imaging unit. The outward-facing camera is configured as, for example, a stereo camera 11 that simultaneously acquires a plurality of images. The stereo camera 11 has two cameras (cameras 11A and 11B). The inward-facing camera includes a left-eye camera 12A and a right-eye camera 12B.
(Example of Internal Configuration of Information Processing Device)
 FIG. 4 is a block diagram showing an example of the internal configuration of the information processing device 1 according to the present embodiment. The information processing device 1 includes, for example, a sensor unit 10, a control unit 20, an output unit 30, and a storage unit 40. These components are connected via a bus 50, and commands and various data can be exchanged between them.
 The sensor unit 10 includes the stereo camera 11, the left-eye camera 12A, and the right-eye camera 12B described above. The left-eye camera 12A and the right-eye camera 12B are, for example, infrared cameras. The stereo camera 11 has an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor. Of course, the sensor unit 10 is not limited to cameras and may include various sensors. For example, the sensor unit 10 may include a microphone, a GPS (Global Positioning System) sensor, an acceleration sensor, a visual sensor (line of sight, gaze point, focus, blinking, etc.), a biological sensor (heartbeat, body temperature, blood pressure, brain waves, etc.), a gyro sensor, an illuminance sensor, and other various sensors. The sensing data acquired by the sensor unit 10 is supplied to the control unit 20.
 The control unit 20 has, for example, a determination unit 201, an imaging control unit 202, a recognition unit 203, and a display control unit 204 as functional blocks. The determination unit 201 determines a change in the gaze range of the user U with respect to the real space. The determination unit 201 specifies, for example, the viewpoint position (gaze point) of the user U, or a predetermined range based on the viewpoint position, as the gaze range. For the viewpoint of the user U, for example, the positions of light spots indicating reflections of infrared light emitted from a plurality of infrared LEDs (Light Emitting Diodes) onto the pupils of the user U are photographed by the left-eye camera 12A and the right-eye camera 12B, respectively. By identifying the pupils based on the captured images, the line-of-sight positions of the left eye and the right eye are detected. A predetermined range based on the detected line-of-sight position or the detected viewpoint position is set as the gaze range.
 The imaging control unit 202 sets the exposure of the imaging unit (the stereo camera 11 in this example) according to the change in the gaze range, and controls the stereo camera 11 to acquire an image of the real space based on the exposure setting. For example, the imaging control unit 202 sets the exposure based on the average brightness of the gaze range. Since the average brightness can change as the gaze range changes, the imaging control unit 202 resets the exposure setting when the gaze range changes. The exposure setting includes settings related to the aperture, shutter speed, and the like, but is not limited to any specific parameter as long as the exposure is changed.
 The recognition unit 203 recognizes a real object included in the image of the real space. The real object is, for example, the hand of the user U. The recognition unit 203 analyzes the captured images acquired by the stereo camera 11 and performs recognition processing for a hand existing in the real space. For example, the recognition unit 203 identifies the hand in the captured image by collating an image feature amount extracted from the captured image with image feature amounts of known real objects stored in the storage unit 40, and recognizes the position and shape of the hand in the captured image. The hand in the captured image may also be recognized by other known methods, such as a method using a plurality of feature points (for example, finger joints).
 The recognition unit 203 may analyze the images captured by the stereo camera 11 and acquire three-dimensional shape information (depth information) of the real space. For example, the recognition unit 203 may recognize the three-dimensional shape of the real space and acquire the three-dimensional shape information by performing a stereo matching method on a plurality of images simultaneously acquired by the stereo camera 11, or an SfM (Structure from Motion) method, a SLAM method, or the like on a plurality of images acquired in time series. Further, when the recognition unit 203 can acquire the three-dimensional shape information of the real space, the recognition unit 203 may recognize the three-dimensional position, shape, size, and posture of the real object.
 The recognition unit 203 may recognize user operations based on the sensing data and the like. For example, the recognition unit 203 according to the present embodiment recognizes a gesture operation based on a continuous change in the shape of the hand. For example, the recognition unit 203 recognizes the gesture operation by cutting out a partial image including the hand, scaling the cut-out partial image, temporarily storing the partial image, calculating differences between frames, and so on.
The display control unit 204 controls the display unit 3 so that a virtual object superimposed on the real space changes based on the recognition result of the real object. For example, the display control unit 204 performs control to superimpose a menu screen, which is an example of a GUI, on the recognized hand. Through this control, an image in which the menu screen is superimposed at the position of the hand recognized by the recognition unit 203 is presented to the user U via the display unit 3.
The output unit 30 outputs various kinds of information by display or sound. The output unit 30 includes, for example, the display unit 3 and a speaker 301. In addition, the output unit 30 may include a vibrator or the like.
The storage unit 40 stores programs and data for processing by the information processing device 1. For example, the storage unit 40 stores image feature amounts used for hand recognition and gesture patterns used for recognizing gesture operations. The storage unit 40 also stores recognition result information, which is the result of recognition by the recognition unit 203. The recognition result information is, for example, information on the range of a recognized object or hand, information on the shape of a recognized hand, and the like. As the storage unit 40, a magnetic storage device such as an HDD (Hard Disk Drive), a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like can be used.
The configuration of the information processing device 1 described above is merely an example, and the information processing device 1 may have other configurations. For example, the information processing device 1 may include a wireless module for communicating with external devices and a battery for supplying electric power to each component.
[Operation example of the information processing device]
FIG. 5 is a flowchart showing an operation example of the information processing device 1 according to the present embodiment. The processing described below is performed, for example, when the information processing device 1 is set to a mode in which the user U moves a hand to the point of gaze, a menu screen is superimposed on the hand, and various inputs are made using the menu screen. Of course, the processing described below may be performed regardless of the mode setting.
When the processing starts, the line-of-sight position is detected in step ST11. For example, images captured by the left-eye camera 12A and the right-eye camera 12B are supplied to the control unit 20. The determination unit 201 of the control unit 20 then detects the line-of-sight position of the user U, and the gaze range is set based on the line-of-sight position. The processing then proceeds to step ST12.
In step ST12, a predetermined hand range is set within the gaze range, and the set range is used as the calculation range for the parameters related to the exposure setting. For example, a hand range centered on the line-of-sight position is set as the calculation range for the parameters related to the exposure setting. Note that the line-of-sight position does not have to be strictly at the center; it is sufficient that the line-of-sight position is near the approximate center of the set range. This processing is performed by, for example, the determination unit 201.
At this stage of the processing, it is assumed that the hand has not yet been held up in the line of sight, so the predetermined hand range is applied. The predetermined hand range is, as schematically shown in FIG. 6, a range calculated in advance from the arm length a and the hand size b of a typical person, indicating how large a hand held in the line-of-sight direction will appear. A plurality of predetermined hand ranges may be prepared, and one of them may be selected according to the attributes of the user U or the like. The processing then proceeds to step ST13.
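Under a pinhole-camera assumption (not stated in the disclosure), the precomputation behind the predetermined hand range of FIG. 6 could be sketched as follows; the focal length and the values for arm length a and hand size b are generic placeholders.

    def precomputed_hand_range(gaze_xy, f_px=700.0, arm_a_m=0.60, hand_b_m=0.18):
        """Rectangle covering a hand of size b held at arm's length a.

        A hand of physical size b at distance a subtends roughly
        f_px * b / a pixels under the pinhole model.
        """
        side_px = f_px * hand_b_m / arm_a_m
        cx, cy = gaze_xy
        # Roughly centered on the gaze point; strict centering is not
        # required, as noted above.
        return (int(cx - side_px / 2), int(cy - side_px / 2),
                int(side_px), int(side_px))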
In step ST13, parameters related to the exposure setting (AE (Auto Exposure) parameters) are calculated by the image pickup control unit 202. The image pickup control unit 202 calculates the parameters based on, for example, the average brightness in the range set in step ST12. The processing then proceeds to step ST14.
In step ST14, the parameters calculated in step ST13 are set in the stereo camera 11, whereby the exposure setting is performed by the image pickup control unit 202. The stereo camera 11 acquires an image of the real space by capturing images based on the set exposure. The processing then proceeds to step ST15.
In step ST15, the recognition unit 203 executes processing based on a hand recognition algorithm on the real-space image acquired by the stereo camera 11, thereby detecting whether or not the real-space image includes a hand. The processing then proceeds to step ST16.
In step ST16, the recognition unit 203 determines whether or not the shape of the hand has been recognized. Although the hand is not recognized in the first iteration, the user U holds a hand over the gaze range in accordance with the mode setting, so the hand can be recognized in the processing of step ST16. When the hand is recognized in step ST16, the recognition result, such as the shape of the hand, is stored in the storage unit 40 as recognition result information. The recognition result information stored in the storage unit 40 may be used as a template or the like in the next hand recognition processing. If the hand is not recognized in step ST16, the processing returns to step ST12. If the hand is recognized in step ST16, the processing proceeds to step ST17.
In step ST17, a range that is within the gaze range and that includes the hand recognized in the processing of steps ST15 and ST16 is reset as the calculation range for the parameters related to the exposure setting. The processing then returns to step ST13, and the parameters related to the exposure setting are calculated based on the brightness of the reset range.
Although not shown, when a hand is recognized in the processing of steps ST15 and ST16, the display control unit 204 appropriately converts the position of the recognized hand into a position in the image displayed on the display unit 3 and superimposes the menu screen at the position of the hand in that image. As a result, the user U can visually recognize the menu screen superimposed on the hand via the display unit 3.
Note that the viewpoint position of the user U may change during execution of the above-described processing, and the gaze range may change accordingly. In this case, control is performed so that the above-described processing returns to step ST11. FIG. 7 is a flowchart showing the flow of processing for this control. The processing of the flowchart shown in FIG. 7 is performed in parallel with the processing of the flowchart shown in FIG. 5.
In step ST21, the line-of-sight position is detected. For example, images captured by the left-eye camera 12A and the right-eye camera 12B are supplied to the control unit 20. The determination unit 201 of the control unit 20 then detects the line-of-sight position of the user U, and the gaze range is set based on the line-of-sight position. The processing then proceeds to step ST22.
In step ST22, the determination unit 201 determines whether or not the gaze range has changed by a certain amount or more. If the gaze range has not changed by the certain amount or more, the processing returns to step ST21. If the gaze range has changed by the certain amount or more, the processing proceeds to step ST23.
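A minimal sketch of the step-ST22 judgment, assuming (as this sketch does, without support from the disclosure) that the gaze range is a rectangle and that "a certain amount" means a fixed fraction of the range size:

    def gaze_changed(prev_rect, cur_rect, thresh_frac: float = 0.25) -> bool:
        """True when the gaze range moved by more than thresh_frac of its size."""
        px, py, pw, ph = prev_rect
        cx, cy, _, _ = cur_rect
        shift = ((cx - px) ** 2 + (cy - py) ** 2) ** 0.5
        return shift > thresh_frac * max(pw, ph)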
In step ST23, control for returning the processing of the flowchart shown in FIG. 5 to step ST11 is performed as an interrupt. That is, since the gaze range has changed, the parameters related to the exposure setting corresponding to the changed gaze range are recalculated so that the hand is properly recognized within the changed gaze range. Specifically, the processing of steps ST11 to ST17 described above is performed. In this way, the exposure of the stereo camera 11 is set in accordance with the change in the gaze range, and the stereo camera 11 acquires an image of the real space based on the exposure setting. The position of the recognized hand may also change as the gaze range changes, and the display control unit 204 performs control so that the position of the menu screen follows the changed position of the hand.
[Examples of effects obtained by the embodiment]
In the processing described above, an appropriate setting is made so that the exposure matches the range of the hand that is expected to exist in the gaze range. Therefore, when the user holds a hand over the gaze range at the point of the line of sight, an error caused by the hand not being recognized can be prevented even in an environment where crushed blacks or blown-out highlights are likely to occur.
Further, since the parameters related to the exposure setting are calculated based on information (brightness in this example) within a limited range, the processing time can be shortened.
When the hand is recognized, the calculation range for the parameters related to the exposure setting is reset based on the range of the recognized hand. Therefore, the calculation range for the parameters related to the exposure setting can be adjusted to accommodate individual differences in hand size and arm length.
Of course, the above-described problems could also be solved by using a camera with a large dynamic range or a camera capable of advanced correction processing. However, such an approach increases the cost of the information processing device 1 itself and may increase its size. Further, when the information processing device 1 is a wearable device or the like, low power consumption is preferable, but the above-described approach increases power consumption and may require frequent charging. In the present embodiment, the recognition target can be properly recognized without using a high-performance camera. Therefore, the cost of the information processing device 1 can be reduced, and the information processing device 1 can be made smaller.
<Modification examples>
Although the embodiment of the present disclosure has been specifically described above, the content of the present disclosure is not limited to the above-described embodiment, and various modifications based on the technical idea of the present disclosure are possible. Modification examples are described below.
In the embodiment described above, the image pickup control unit 202 may use a table in which exposure settings are associated with brightness values when performing the exposure setting. The image pickup control unit 202 may read the exposure setting corresponding to the measured brightness from the table and apply the read exposure setting to the stereo camera 11. The table may be stored in the storage unit 40. The table may be stored in the information processing device 1 in advance, or may be acquired from the outside via a network.
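One possible shape for such a table, sketched in Python; the brightness bands and the (shutter, gain) pairs below are invented for illustration and are not taken from the disclosure.

    import bisect

    BRIGHTNESS_EDGES = [32, 64, 128, 192]   # upper edges of brightness bands
    EXPOSURE_ROWS = [                        # (shutter_s, gain_db) per band
        (1 / 30, 12.0), (1 / 60, 8.0), (1 / 120, 4.0),
        (1 / 250, 2.0), (1 / 500, 0.0),
    ]

    def lookup_exposure(mean_brightness: float):
        """Pick the exposure row whose brightness band contains the mean."""
        idx = bisect.bisect_right(BRIGHTNESS_EDGES, mean_brightness)
        return EXPOSURE_ROWS[idx]

For example, lookup_exposure(140.0) falls in the fourth band and returns (1/250, 2.0); a dark gaze region selects a slower shutter and higher gain.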
In the embodiment described above, the recognition unit 203 may obtain a reliability (hand recognition accuracy) for hand recognition. For example, the recognition unit 203 may determine that a hand in the image captured by the stereo camera 11 has been recognized when the reliability is equal to or higher than a threshold, and may determine that a hand has not been recognized when the reliability is lower than the threshold. Further, the recognition unit 203 may output the hand recognition result to the display control unit 204 when the reliability is equal to or higher than the threshold, and may withhold the result when the reliability is lower than the threshold. That is, the GUI may be superimposed at the position of the recognized hand only when the reliability of hand recognition is equal to or higher than the threshold. When a hand held in the line-of-sight direction sways, superimposing the GUI on the swaying hand would make the GUI sway as well, which may give the user U a sense of discomfort. With the above-described processing, however, the reliability decreases when the hand sways, so the GUI is prevented from being superimposed on the swaying hand and the user U is not given a sense of discomfort.
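A sketch of that confidence gating; the threshold value and the render_menu() hook are assumptions introduced here.

    CONF_THRESH = 0.8   # assumed reliability threshold

    def maybe_render_menu(hand_pos, confidence: float, render_menu) -> None:
        """Overlay the menu only when hand recognition is reliable."""
        if confidence >= CONF_THRESH:
            render_menu(hand_pos)   # overlay follows the recognized hand
        # Below the threshold the result is withheld, so a swaying,
        # low-confidence hand does not drag the menu around.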
In the embodiment described above, the image pickup control unit 202 may switch the calculation range for the parameters related to the exposure setting according to the reliability. For example, even when the shape of the hand is recognized in the processing of step ST16, the processing may return to step ST12 if the reliability is equal to or lower than the threshold, and proceed to step ST17 only when the reliability is higher than the threshold. This makes it possible to set the calculation range for the parameters related to the exposure setting appropriately.
In the embodiment described above, continuous changes in the shape of the hand may be recognized by the recognition unit 203, and a gesture operation based on those changes may further be recognized by the recognition unit 203. The operation performed on the GUI (for example, the menu screen) may then be determined based on the gesture operation, and processing corresponding to that operation may be performed.
The information processing device 1 according to the embodiment described above may be a smartphone instead of a glasses-type wearable device. The in-camera of the smartphone may be used as the camera for line-of-sight detection, and the out-camera may be used as the camera for acquiring images for hand recognition. More specifically, the in-camera is arranged on the surface on which the display of the smartphone is arranged (the display surface) and photographs the user who is viewing the display. The out-camera is arranged on the surface opposite to the display surface and photographs the real space in front of the user. That is, the shooting direction of the in-camera and the shooting direction of the out-camera may be regarded as opposite.
The camera that captures a subject in the line-of-sight direction, such as the hand, may be a single camera instead of the stereo camera 11. In this case, the depth information may be acquired by a ToF (Time of Flight) sensor or a LiDAR (Light Detection and Ranging) sensor.
Although the embodiment described above uses an example in which the user U is a human, the user U may be a robot or the like.
The present disclosure can also be realized by a device, a method, a program, a system, or the like. For example, a program that performs the functions described in the above embodiment may be made downloadable, and a device that does not have those functions can download and install the program to perform the control described in the embodiment. The present disclosure can also be realized by a server that distributes such a program. In addition, the matters described in the embodiment and the modification examples can be combined as appropriate.
Note that the content of the present disclosure is not to be construed as being limited by the effects exemplified in the present disclosure.
The present disclosure may also adopt the following configuration.
(1)
An information processing device including:
a determination unit that determines a change in a gaze range of a user with respect to a real space;
an image pickup control unit that sets an exposure of an image pickup unit in accordance with the change in the gaze range and controls the image pickup unit to acquire an image of the real space based on the exposure setting;
a recognition unit that recognizes a real object included in the image of the real space; and
a display control unit that controls a display unit so that a virtual object superimposed on the real space changes based on a recognition result of the real object.
(2)
The information processing device according to (1), wherein the image pickup control unit sets the exposure of the image pickup unit based on brightness within a predetermined range set with reference to the gaze point of the user as the gaze range.
(3)
The information processing device according to (2), wherein the predetermined range is a range having a preset size.
(4)
The information processing device according to (2), wherein the predetermined range is within the gaze range and is the range of the real object recognized by the recognition unit.
(5)
The information processing device according to any one of (2) to (4), wherein the image pickup control unit sets the exposure of the image pickup unit using a table in which exposure settings are associated with brightness.
(6)
The information processing device according to any one of (2) to (5), wherein the recognition unit recognizes that the image of the real space includes the real object when the recognition accuracy of the real object is equal to or higher than a threshold, and recognizes that the image of the real space does not include the real object when the recognition accuracy of the real object is lower than the threshold.
(7)
The information processing device according to (6), wherein the image pickup control unit switches the predetermined range according to the recognition accuracy.
(8)
The information processing device according to any one of (1) to (7), wherein the real object is a part of the user's body.
(9)
The information processing device according to (8), wherein the real object is the hand of the user.
(10)
The information processing device according to (9), wherein the recognition unit recognizes continuous changes in the shape of the hand of the user.
(11)
The information processing device according to any one of (1) to (10), wherein the determination unit specifies the gaze range based on a detection result of the position of the line of sight of the user.
(12)
The information processing device according to any one of (1) to (11), further including a second image pickup unit, different from the image pickup unit, that detects the line of sight.
(13)
The information processing device according to (12), wherein the shooting direction of the second image pickup unit is opposite to the shooting direction of the image pickup unit.
(14)
The information processing device according to any one of (1) to (13), wherein the display unit is either an optical see-through display or a video see-through display.
(15)
The information processing device according to any one of (1) to (14), configured as a wearable device attachable to and detachable from a human body, or as a smartphone.
(16)
An information processing method in which:
a determination unit determines a change in a gaze range of a user with respect to a real space;
an image pickup control unit sets an exposure of an image pickup unit in accordance with the change in the gaze range and controls the image pickup unit to acquire an image of the real space based on the exposure setting;
a recognition unit recognizes a real object included in the image of the real space; and
a display control unit controls a display unit so that a virtual object superimposed on the real space changes based on a recognition result of the real object.
(17)
A storage medium storing a program that causes a computer to execute an information processing method in which:
a determination unit determines a change in a gaze range of a user with respect to a real space;
an image pickup control unit sets an exposure of an image pickup unit in accordance with the change in the gaze range and controls the image pickup unit to acquire an image of the real space based on the exposure setting;
a recognition unit recognizes a real object included in the image of the real space; and
a display control unit controls a display unit so that a virtual object superimposed on the real space changes based on a recognition result of the real object.
1 ... Information processing device
3 ... Display unit
11 ... Stereo camera
12A ... Left-eye camera
12B ... Right-eye camera
20 ... Control unit
201 ... Determination unit
202 ... Image pickup control unit
203 ... Recognition unit
204 ... Display control unit

Claims (17)

1.  An information processing device comprising:
    a determination unit that determines a change in a gaze range of a user with respect to a real space;
    an image pickup control unit that sets an exposure of an image pickup unit in accordance with the change in the gaze range and controls the image pickup unit to acquire an image of the real space based on the exposure setting;
    a recognition unit that recognizes a real object included in the image of the real space; and
    a display control unit that controls a display unit so that a virtual object superimposed on the real space changes based on a recognition result of the real object.
2.  The information processing device according to claim 1, wherein the image pickup control unit sets the exposure of the image pickup unit based on brightness within a predetermined range set with reference to a gaze point of the user as the gaze range.
3.  The information processing device according to claim 2, wherein the predetermined range is a range having a preset size.
4.  The information processing device according to claim 2, wherein the predetermined range is within the gaze range and is a range of the real object recognized by the recognition unit.
5.  The information processing device according to claim 2, wherein the image pickup control unit sets the exposure of the image pickup unit using a table in which exposure settings are associated with brightness.
6.  The information processing device according to claim 2, wherein the recognition unit recognizes that the image of the real space includes the real object when a recognition accuracy of the real object is equal to or higher than a threshold, and recognizes that the image of the real space does not include the real object when the recognition accuracy of the real object is lower than the threshold.
7.  The information processing device according to claim 6, wherein the image pickup control unit switches the predetermined range according to the recognition accuracy.
8.  The information processing device according to claim 1, wherein the real object is a part of the body of the user.
9.  The information processing device according to claim 8, wherein the real object is a hand of the user.
10.  The information processing device according to claim 9, wherein the recognition unit recognizes continuous changes in the shape of the hand of the user.
11.  The information processing device according to claim 1, wherein the determination unit specifies the gaze range based on a detection result of a position of a line of sight of the user.
12.  The information processing device according to claim 1, further comprising a second image pickup unit, different from the image pickup unit, that detects the line of sight.
13.  The information processing device according to claim 12, wherein a shooting direction of the second image pickup unit is opposite to a shooting direction of the image pickup unit.
14.  The information processing device according to claim 1, wherein the display unit is either an optical see-through display or a video see-through display.
15.  The information processing device according to claim 1, configured as a wearable device attachable to and detachable from a human body, or as a smartphone.
16.  An information processing method comprising:
    determining, by a determination unit, a change in a gaze range of a user with respect to a real space;
    setting, by an image pickup control unit, an exposure of an image pickup unit in accordance with the change in the gaze range, and controlling the image pickup unit to acquire an image of the real space based on the exposure setting;
    recognizing, by a recognition unit, a real object included in the image of the real space; and
    controlling, by a display control unit, a display unit so that a virtual object superimposed on the real space changes based on a recognition result of the real object.
17.  A storage medium storing a program that causes a computer to execute an information processing method comprising:
    determining, by a determination unit, a change in a gaze range of a user with respect to a real space;
    setting, by an image pickup control unit, an exposure of an image pickup unit in accordance with the change in the gaze range, and controlling the image pickup unit to acquire an image of the real space based on the exposure setting;
    recognizing, by a recognition unit, a real object included in the image of the real space; and
    controlling, by a display control unit, a display unit so that a virtual object superimposed on the real space changes based on a recognition result of the real object.
PCT/JP2020/027052 2019-09-06 2020-07-10 Information processing device, information processing method, and storage medium WO2021044732A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-162712 2019-09-06
JP2019162712 2019-09-06

Publications (1)

Publication Number Publication Date
WO2021044732A1 true WO2021044732A1 (en) 2021-03-11

Family

ID=74852462

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/027052 WO2021044732A1 (en) 2019-09-06 2020-07-10 Information processing device, information processing method, and storage medium

Country Status (1)

Country Link
WO (1) WO2021044732A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03107934A (en) * 1989-09-22 1991-05-08 Canon Inc Exposure controller using signal of line of sight
JPH0553043A (en) * 1991-08-27 1993-03-05 Nikon Corp Photographed object recognizing device and camera
JP2005138755A (en) * 2003-11-07 2005-06-02 Denso Corp Device and program for displaying virtual images
JP2016192122A (en) * 2015-03-31 2016-11-10 ソニー株式会社 Information processing device, information processing method, and program
JP2016208380A (en) * 2015-04-27 2016-12-08 ソニーセミコンダクタソリューションズ株式会社 Image processing apparatus, imaging apparatus, image processing method, and program


Similar Documents

Publication Publication Date Title
CN110647237B (en) Gesture-based content sharing in an artificial reality environment
CN108170279B (en) Eye movement and head movement interaction method of head display equipment
US10078377B2 (en) Six DOF mixed reality input by fusing inertial handheld controller with hand tracking
CN107111370B (en) Virtual representation of real world objects
KR20230074780A (en) Touchless photo capture in response to detected hand gestures
US10165176B2 (en) Methods, systems, and computer readable media for leveraging user gaze in user monitoring subregion selection systems
JP6095763B2 (en) Gesture registration device, gesture registration program, and gesture registration method
US9076033B1 (en) Hand-triggered head-mounted photography
US20170277257A1 (en) Gaze-based sound selection
US11320655B2 (en) Graphic interface for real-time vision enhancement
US20190227694A1 (en) Device for providing augmented reality service, and method of operating the same
US20140152558A1 (en) Direct hologram manipulation using imu
US20190212828A1 (en) Object enhancement in artificial reality via a near eye display interface
CN112334869A (en) Electronic device and control method thereof
US11487354B2 (en) Information processing apparatus, information processing method, and program
JP2023507867A (en) Artificial reality system with variable focus display for artificial reality content
US9298256B1 (en) Visual completion
JP2016224086A (en) Display device, control method of display device and program
WO2019142560A1 (en) Information processing device for guiding gaze
WO2018146922A1 (en) Information processing device, information processing method, and program
WO2020080107A1 (en) Information processing device, information processing method, and program
CN110895433A (en) Method and apparatus for user interaction in augmented reality
WO2020044916A1 (en) Information processing device, information processing method, and program
WO2021044732A1 (en) Information processing device, information processing method, and storage medium
KR20240009984A (en) Contextual visual and voice search from electronic eyewear devices

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20859715

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20859715

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP