WO2022168515A1 - Information processing device, information processing method, and program - Google Patents
Information processing device, information processing method, and program
- Publication number
- WO2022168515A1 (PCT/JP2022/000077)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/20—Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2219/00—Indexing scheme for manipulating 3D models or images for computer graphics
- G06T2219/20—Indexing scheme for editing of 3D models
- G06T2219/2012—Colour editing, changing, or manipulating; Use of colour codes
Definitions
- The present technology relates to an information processing device, an information processing method, and a program, and in particular to an information processing device, an information processing method, and a program that reflect changes in the real world on a three-dimensional (3D) map.
- In AR (Augmented Reality) technology, information presented to the user (AR content) is also called an annotation.
- Annotations are visualized as virtual objects of various forms, such as text, icons, and animations.
- Patent Document 1 proposes appropriately controlling the display of virtual objects so that disturbances in their display do not confuse the user.
- In AR technology, a 3D map of the real world is generated in order to place the AR content.
- If the real world changes, the AR content may not be displayed in the expected position, which may confuse the user.
- When there is a change in the real world, there is therefore a demand for a mechanism that can easily reflect the change in the three-dimensional map.
- The present technology was created in view of this situation, and enables a 3D map to be updated in response to changes in the real world.
- A first information processing apparatus according to one aspect of the present technology includes a detection unit that detects a change in the real world using a captured image of the real world, and an updating unit that, when the detection unit detects a change in the real world, updates a three-dimensional map generated by photographing the real world.
- A second information processing apparatus includes a detection unit that detects a gesture performed by a person in the real world using an image captured by a camera installed in the real world, and a processing unit that, when the gesture is detected by the detection unit, executes processing corresponding to the detected gesture.
- A third information processing apparatus includes a recognition unit that recognizes an attribute of a person in the real world using an image captured by a camera installed in the real world, and a providing unit that generates and provides AR (Augmented Reality) content to the person based on the recognized attribute.
- An information processing method according to one aspect of the present technology is a method in which an information processing device detects a change in the real world using a captured image of the real world and, when the change in the real world is detected, updates a three-dimensional map generated by photographing the real world.
- A program according to one aspect of the present technology causes a computer to execute processing of detecting a change in the real world using a captured image of the real world and, when the change in the real world is detected, updating a three-dimensional map generated by photographing the real world.
- In one aspect of the present technology, a captured image of the real world is used to detect changes in the real world, and when a change is detected, the 3D map generated by photographing the real world is updated.
- In another aspect, an image captured by a camera installed in the real world is used to detect a gesture performed by a person in the real world, and when the gesture is detected, processing corresponding to the detected gesture is executed.
- In yet another aspect, an image captured by a camera installed in the real world is used to recognize an attribute of a person in the real world, and AR (Augmented Reality) content to be provided to that person is generated and provided based on the recognized attribute.
- the information processing device may be an independent device, or may be an internal block that constitutes one device.
- the program can be provided by transmitting it via a transmission medium or by recording it on a recording medium.
- FIG. 1 is a diagram illustrating the configuration of an embodiment of a system to which the present technology is applied.
- FIG. 2 is a diagram showing a configuration example of an information processing device.
- FIG. 3 is a flowchart for explaining processing of an information processing device.
- FIG. 4 is a diagram showing an example of AR content.
- FIG. 5 is a diagram for explaining changes in the real world.
- FIG. 6 is a diagram for explaining updating of a 3D map.
- FIG. 7 is a flowchart for explaining processing of an information processing device.
- FIG. 8 is a diagram for explaining detection of a gesture.
- It is a diagram showing a configuration example of a personal computer.
- FIG. 1 is a diagram showing the configuration of an embodiment of an information processing system to which the present technology is applied.
- In the information processing system 11, cameras 21-1 to 21-3, an information processing device 22, and a portable terminal 23 are connected via a network 24 so as to exchange data.
- The information processing system 11 shown in FIG. 1 generates a three-dimensional map (hereinafter referred to as a 3D map), arranges AR (Augmented Reality) content on the generated 3D map, and supplies the arranged AR content to the user.
- When there is a change in the real world, the information processing system 11 also detects the change and updates the 3D map.
- The cameras 21-1 to 21-3 are simply referred to as cameras 21 when there is no need to distinguish them individually. Although three cameras 21 are shown in FIG. 1, the number of cameras 21 is not limited to three, and a plurality of cameras 21 are connected to the network 24. One information processing device 22 and one mobile terminal 23 are likewise shown in FIG. 1, but their numbers are not limited to one either.
- The network 24 is a wired and/or wireless network, for example a home network, a LAN (Local Area Network), a WAN (Wide Area Network), or a wide area network such as the Internet.
- the camera 21 captures an image and supplies the captured image (image data) to the information processing device 22 via the network 24 .
- the supplied image data may be image data of a still image or image data of a moving image.
- the information processing device 22 uses the supplied image data to generate a 3D map or update a 3D map that has been generated.
- the information processing device 22 generates AR content, arranges it at a predetermined position on the 3D map, and supplies the arranged AR content to the mobile terminal 23 .
- the information processing device 22 changes the arrangement position of the AR content or changes it to another AR content and arranges it as necessary.
- the information processing device 22 supplies AR content to the mobile terminal 23 .
- The mobile terminal 23 is a smart phone, a tablet terminal, smart glasses, a head-mounted display, or the like. For example, when the user is photographing with the mobile terminal 23 a position A in the real world for which a 3D map has been created, the AR content arranged at position A is supplied to the mobile terminal 23, and the AR content is thereby provided to the user.
- FIG. 2 is a diagram showing a functional configuration example of the information processing device 22. The information processing device 22 shown in FIG. 2 is the information processing device 22 in the first embodiment, and is described as an information processing device 22a.
- The information processing device 22a includes a camera information acquisition unit 41, a 3D map generation unit 42, a 3D map holding unit 43, an AR content generation unit 44, a change detection unit 45, a 3D map update unit 46, and an AR content provision unit 47.
- the camera information acquisition unit 41 acquires image data of images captured by the camera 21 and information such as an ID that uniquely identifies the camera 21 .
- the 3D map generation unit 42 analyzes the image based on the image data from the camera information acquisition unit 41 and generates a 3D map.
- the generated 3D map is held (recorded) in the 3D map holding unit 43 .
- the AR content generation unit 44 generates AR content to be placed on the 3D map, and places the AR content at a predetermined position on the 3D map.
- the AR content arranged on the 3D map by the AR content generation unit 44 is supplied to the mobile terminal 23 of the user by the AR content provision unit 47 .
- the AR content providing unit 47 supplies the mobile terminal 23 with AR content arranged at a position within the 3D map corresponding to the position in the real world captured by the mobile terminal 23 .
- If there is any change in the real world on which the 3D map held in the 3D map holding unit 43 is based, for example a change in the layout of shelves, the 3D map must be changed (updated) according to that change.
- an image based on the image data acquired by the camera information acquisition section 41 is analyzed by the change detection section 45, and if there is a change in the real world, the change is detected.
- The 3D map update unit 46 updates the 3D map held in the 3D map holding unit 43 so that the change in the real world is reflected.
- In step S11, the camera information acquisition unit 41 acquires image data of the image captured by the camera 21.
- A 2D camera or a 3D camera can be used as the camera 21 that captures images for generating the 3D map, and it may be a camera that acquires color images or one that acquires monochrome images.
- When the camera 21 is a 3D camera, for example, a stereo camera can be used.
- a camera that performs distance measurement using an iToF (Indirect time of flight) method or a dToF (Direct time of flight) method can be used.
- An ultrasonic sensor can also be used instead of the camera 21 for measuring the distance.
- a camera called a multispectral camera or a polarization camera can also be used.
- These cameras may be used in combination to acquire the images used when generating the 3D map.
- images from a 2D camera, images from a multispectral camera, and/or images from a polarization camera may be used to generate a 3D map.
- an image from a 3D camera, an image from a multispectral camera, and/or an image from a polarization camera may be used to generate a 3D map.
- the camera 21 may be a fixed camera fixed at a predetermined position, or may be a portable camera.
- As the fixed camera, a camera installed on a ceiling or wall, such as a so-called overhead camera, or a so-called surveillance camera can be used.
- Examples of portable cameras include scanning devices such as handheld scanners that are held in the hand for scanning, laser scanners that are placed on the ground using a tripod for scanning, and devices that are mounted on an automobile or cart and travel while scanning.
- A camera mounted on a drone, an AGV (Automated Guided Vehicle), a walking robot, or the like may also be used.
- Smartphones, smart glasses, tablet terminals, etc. can also be used as portable cameras. These terminals can also be mobile terminals 23 .
- When creating a 3D map, the camera 21 is used to photograph the real world for which the 3D map is to be created, and a large amount of image data is acquired.
- the image used to create the 3D map may be data processed so that personal information is not included.
- For example, instead of the images themselves, feature points extracted from the images may be supplied to the information processing device 22a. Even when an image is supplied to the information processing device 22a, an image in which the face of any person photographed in it has been mosaicked or otherwise processed to be unrecognizable may be supplied. By doing so, the 3D map can be created in consideration of privacy.
- the 3D map generation unit 42 uses the image data captured by the camera 21 acquired from the camera information acquisition unit 41 to generate a 3D map.
- A 3D map is generated, for example, by analyzing the images, generating point cloud data, performing stitching processing, and removing overlaps. How the 3D map is generated depends on what kind of camera is used as the camera 21, for example whether a 2D camera or a 3D camera is used; an appropriate method should be selected and applied according to the type of camera used and the type of data to be handled.
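The stitching and overlap-removal steps mentioned above can be illustrated with a minimal sketch. The function name, the voxel-grid deduplication approach, and the `voxel_size` value are illustrative assumptions for this sketch, not the method specified by the patent:

```python
import numpy as np

def merge_point_clouds(clouds, voxel_size=0.05):
    """Stitch per-image point clouds (assumed already registered in a
    common world frame) into one cloud, and remove overlapping points
    by keeping one point per occupied voxel.

    voxel_size is a hypothetical deduplication resolution in metres."""
    merged = np.vstack(clouds)                        # stitch: concatenate N x 3 arrays
    keys = np.floor(merged / voxel_size).astype(int)  # bucket each point into a voxel
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return merged[np.sort(first_idx)]                 # drop duplicate points per voxel

# Two overlapping scans of the same area
scan_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
scan_b = np.array([[1.001, 0.0, 0.0], [2.0, 0.0, 0.0]])  # 1.001 overlaps scan_a's point
cloud = merge_point_clouds([scan_a, scan_b])
print(len(cloud))  # 3: the near-duplicate point was removed
```

In a real system the registration would come from camera poses or stereo depth; here the clouds are assumed pre-aligned so only the merge step is shown.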
- AR content is generated by the AR content generation unit 44 and placed at a predetermined position on the 3D map.
- For example, AR content such as signboards and guide displays is generated and placed in the area corresponding to a wall in the 3D map, and AR content such as product descriptions is generated and placed in the area corresponding to a product shelf in the 3D map.
- the generated AR content is held in the 3D map holding unit 43 in association with the 3D map.
- the process in step S13 may be performed when the AR content is provided to the user, and the AR content suitable for providing to the user at that time may be generated.
- step S14 the AR content is supplied to the mobile terminal 23 by the AR content providing unit 47.
- The processing in step S14 is performed when the mobile terminal 23 side requests the supply of AR content, and the AR content placed at the position on the 3D map corresponding to the real-world position captured by the mobile terminal 23 is supplied.
- the AR content is supplied from the AR content providing unit 47 to the mobile terminal 23 via the network 24 and played back on the mobile terminal 23, thereby providing the AR content to the user.
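The position-driven supply described above can be sketched as a simple proximity lookup on the 3D map. The placement data, function name, and `radius` tolerance are hypothetical illustrations, not details taken from the patent:

```python
def ar_content_at(view_pos, placements, radius=2.0):
    """Return the AR content placed near the 3D-map position that
    corresponds to where the mobile terminal 23 is pointing.

    placements maps content name -> (x, y) map coordinates;
    radius (metres) is an assumed visibility range."""
    visible = []
    for content, (x, y) in placements.items():
        dist = ((x - view_pos[0]) ** 2 + (y - view_pos[1]) ** 2) ** 0.5
        if dist <= radius:
            visible.append(content)
    return visible

# Hypothetical placements: the lamp-shaped content 72 and a distant signboard
placements = {"lamp_72": (3.0, 1.0), "signboard": (20.0, 5.0)}
print(ar_content_at((3.5, 1.2), placements))  # ['lamp_72']
```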
- The upper diagram in FIG. 4 shows an example of the real world (referred to as the real world 61) photographed by the mobile terminal 23. A shelf 71 is arranged at a position A in the real world 61.
- The AR content providing unit 47 selects the AR content arranged at position A on the 3D map, corresponding to position A in the real world captured by the mobile terminal 23, and supplies it to the mobile terminal 23.
- The lower diagram in FIG. 4 shows an example of a screen on which AR content supplied to the mobile terminal 23 is displayed on the display 62 of the mobile terminal 23. The display 62 shows the shelf 71 arranged in the real world 61, and a lamp-shaped AR content 72 is displayed on the shelf 71.
- Assume that the shelf 71 was located at position A in the real world 61 at time T1 when the 3D map was generated, but that in the real world at time T2, after time T1 has passed, the shelf 71 has moved to position B as shown in the lower diagram of FIG. 5.
- If the 3D map generated at time T1 remains unchanged, the 3D map is still in the state where the shelf 71 is at position A (the state shown in the upper diagram of FIG. 5), while in the real world 61 the shelf 71 is at position B (the state shown in the lower diagram of FIG. 5). That is, there is a discrepancy between the position of the shelf 71 in the 3D map and in the real world.
- In this state, a screen appears in which the AR content 72 is displayed at the position corresponding to position A in the real world 61 and the shelf 71 is displayed at the position corresponding to position B. In this way, when there is a change in the real world, the AR content 72 may not be displayed in the correct position unless the change is reflected in the 3D map.
- the display 62 of the mobile terminal 23 displays a shelf 71 arranged at a position B in the real world 61, and AR content 72 is displayed on the shelf 71.
- In order to present the user with a screen on which the AR content 72 is arranged at the appropriate position, as shown in the lower diagram of FIG. 6, the 3D map is updated and the AR content is rearranged on the updated 3D map as described below. This prevents the AR content from being displayed in the wrong position.
- FIG. 7 is a flowchart for explaining the processing related to updating the 3D map and the AR content performed by the information processing device 22a.
- In step S21, the camera information acquisition unit 41 acquires image data and identification information for identifying the camera 21.
- After the 3D map has been created, a camera 21 continues to photograph the real world on which the 3D map is based, and images from that camera 21 are supplied to the information processing device 22a.
- a surveillance camera or a fixed-point camera can be used as the camera 21 that continuously captures the real world.
- a camera mounted on the user's portable terminal 23 can also be used.
- the camera information acquisition unit 41 acquires image data from the camera 21 capturing the real world at predetermined intervals.
- the change detection unit 45 analyzes the image data from the camera information acquisition unit 41 and determines whether or not there has been a change in the real world.
- For example, assume that the camera 21 is installed as a fixed camera at a position for photographing the real world 61 shown in FIG. 5.
- An image (video) captured by the camera 21 is supplied from such a camera 21 to the change detection section 45 via the camera information acquisition section 41 .
- the change detection unit 45 holds at least the image supplied at the previous time.
- For example, an image (hereinafter referred to as image T1) as shown in the upper diagram of FIG. 5, acquired at time T1, is held, and an image (hereinafter referred to as image T2) as shown in the lower diagram of FIG. 5 is acquired at time T2.
- the change detection unit 45 detects the shelf 71 from the image T1 supplied at time T1. Detecting the shelf 71 means, for example, detecting a feature point (described as a feature point T1) for specifying the shelf 71 as an object. At time T2, the change detection unit 45 detects feature points (described as feature points T2) of the shelf 71 from the image T2 supplied at time T2.
- The change detection unit 45 compares the feature point T1 and the feature point T2. If there is no change in the real world (in this case, if the shelf 71 has not moved), the coordinates of the feature point T1 and the feature point T2 detected from the shelf 71 do not change, so the result of comparing the positions of the feature points, in other words the difference between their positions, is a value below the threshold.
- If there is a change, the coordinates of the feature points change, so the amount of change in the position of the feature points becomes greater than or equal to the threshold. If the amount of change in the position of a feature point is greater than or equal to the threshold, it is determined that there has been a change in the real world. In the situation shown in FIG. 5, the difference between the feature point T1 and the feature point T2 is greater than or equal to the threshold, so it is determined that the real world 61 has changed.
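The threshold comparison described above can be sketched as follows. The mean-displacement metric, the pixel coordinates, and the threshold value are illustrative assumptions for this sketch, not the patent's prescribed method:

```python
import numpy as np

def real_world_changed(feat_t1, feat_t2, threshold=10.0):
    """Compare the feature-point positions of the same object (e.g. the
    shelf 71) between image T1 and image T2; a mean displacement at or
    above the threshold (hypothetical, in pixels) is treated as a change
    in the real world."""
    displacement = np.linalg.norm(feat_t2 - feat_t1, axis=1).mean()
    return displacement >= threshold

shelf_t1 = np.array([[100.0, 200.0], [140.0, 200.0]])  # shelf corners at time T1
print(real_world_changed(shelf_t1, shelf_t1 + 2.0))    # False: jitter below threshold
print(real_world_changed(shelf_t1, shelf_t1 + 80.0))   # True: the shelf has moved
```

This mirrors the flow of steps S21 and S22: when the function returns `True`, the process would proceed to the map update of step S23.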
- The images used to update the 3D map are, as described above, images from surveillance cameras, fixed-point cameras, and the like that capture the real world.
- the image used to update the 3D map may be data processed so as not to include personal information.
- For example, instead of the images themselves, feature points extracted from the images may be supplied to the information processing device 22a. Even when an image is supplied to the information processing device 22a, an image in which the face of any person photographed in it has been mosaicked or otherwise processed to be unrecognizable may be supplied. By doing so, the 3D map can be updated in consideration of privacy.
- If it is determined in step S22 that there is no change in the real world, the process returns to step S21 and the subsequent processes are repeated. That is, a series of processes continues in which the camera 21 keeps capturing images, the images are analyzed, and it is determined whether or not there is any change in the real world.
- On the other hand, if it is determined in step S22 that there has been a change in the real world, the process proceeds to step S23.
- In step S23, the 3D map updating unit 46 updates the 3D map held in the 3D map holding unit 43.
- For example, the change detection unit 45 detects the data required for creating a 3D map, such as point cloud data, from the image acquired when it is determined that there has been a change, and the portion of the 3D map corresponding to the changed part of the real world is updated using that point cloud data.
- An image captured by the camera 21 photographing the real world is analyzed, a change occurring in the real world is detected, and the 3D map is updated when the change is detected, so the time lag between a change occurring in the real world and the 3D map being updated can be shortened.
- Since the image from the fixed camera is used, the process of re-photographing the real world can be eliminated, reducing the time and labor required for re-imaging.
- the AR content is rearranged in step S24.
- The AR content generator 44 arranges AR content on the updated 3D map. By rearranging the AR content on the updated 3D map, even if there is a change in the real world 61, an image in which the AR content 72 is properly displayed for the changed real world, as shown in the lower diagram of FIG. 6, can be presented to the user.
- The case where the shelf 71 is moved has been described as an example, but the technology can also be applied to other cases, for example detecting changes in scenery.
- Assume that the video from a camera 21 photographing a road, for example a road in a town or a passage in a facility such as a shopping mall, is analyzed, and a situation occurs in which the road width is narrowed due to construction.
- In this case, the change in the real world 61, namely that the passage has narrowed, is detected, the 3D map is updated, and AR content for the narrowed road is placed on that road.
- a signboard that calls for caution such as "The road width is narrowing. Please be careful” may be provided as AR content.
- Images from surveillance cameras installed outdoors may be acquired, and changes in the landscape, such as topographical changes due to disasters, seasonal changes in plants, and changes due to construction, redevelopment, or new buildings, may be detected as changes in the real world 61. By updating the 3D map when such a change is detected, it becomes possible, for example, to perform navigation that reflects a change in terrain caused by a disaster.
- In the example described above, image data is supplied from the camera 21 to the information processing device 22a, and a change in the real world 61 is detected by the change detection unit 45; however, the camera 21 may instead be provided with a function for detecting changes.
- the camera 21 may perform processing up to detection of a change in the real world 61, and send image data or the like to the information processing device 22a only when the change is detected.
- a process of detecting feature points from an image and sending the feature points to the information processing device 22a may be performed. That is, part of the processing performed by the information processing device 22 described above may be configured to be performed on the camera 21 side.
- Such processing may be carried out by an AI (artificial intelligence) chip mounted on the camera 21.
- FIG. 8 is a diagram for explaining an embodiment (referred to as a second embodiment) in which the real world 61 is photographed by the camera 21 and changes in the real world 61 are detected.
- the camera 21 is, for example, a camera installed within a shopping mall, and photographs a predetermined area within the shopping mall.
- the camera 21 may also detect a user's gesture and perform processing corresponding to the gesture.
- a change in the real world 61 is detected, but a gesture of the user is detected as this change, and processing corresponding to the detected gesture is executed.
- FIG. 9 is a diagram showing a configuration example of the information processing device 22b when detecting a user's gesture and executing corresponding processing.
- In the information processing device 22b shown in FIG. 9, parts similar to those of the information processing device 22a shown in FIG. 2 are denoted by the same reference numerals, and descriptions thereof are omitted as appropriate.
- The information processing device 22b includes a camera information acquisition unit 41, a 3D map holding unit 43, an AR content generation unit 44, an AR content provision unit 47, a gesture detection unit 101, and a user identification unit 102.
- the information processing device 22b has a configuration in which a 3D map that has already been generated is held in the 3D map holding unit 43, and the AR content arranged in the 3D map is supplied to the mobile terminal 23.
- the information processing device 22b may be configured to include the change detection unit 45 and the 3D map update unit 46 .
- the information processing device 22a in the first embodiment described above and the information processing device 22b in the second embodiment may be combined.
- the gesture detection unit 101 analyzes image data from the camera 21 acquired via the camera information acquisition unit 41, detects gestures made by the user, and performs processing corresponding to the detected gestures.
- the user identification unit 102 identifies the user (the mobile terminal 23 of the user) who performed the gesture.
- In step S41, the gesture detection unit 101 acquires, via the camera information acquisition unit 41, image data from the camera 21.
- step S42 the gesture detection unit 101 analyzes the acquired image and detects gestures.
- a person is detected from the image to be analyzed, and if a person is detected, the person's hand is also detected.
- the movement of the hand is detected over a plurality of frames, and it is determined whether or not the movement corresponds to a predetermined gesture.
- When it is determined in step S42 that a gesture has been detected, the processing proceeds to step S43; when it is determined that no gesture has been detected, the processing returns to step S41.
- the gesture detection process continues until the gesture is detected.
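The per-frame hand tracking and gesture determination described above can be sketched minimally. The swipe classification, the normalised coordinates, and the `min_travel` threshold are assumptions made for this sketch; the patent does not specify a particular gesture vocabulary:

```python
def detect_gesture(hand_positions, min_travel=0.3):
    """Classify a hand movement tracked over several frames.

    hand_positions holds one (x, y) pair per frame, in normalised image
    coordinates; min_travel is a hypothetical minimum horizontal travel
    required before the movement counts as a gesture."""
    if len(hand_positions) < 2:
        return None
    dx = hand_positions[-1][0] - hand_positions[0][0]
    if dx >= min_travel:
        return "swipe_right"
    if dx <= -min_travel:
        return "swipe_left"
    return None  # movement does not correspond to a known gesture

frames = [(0.2, 0.5), (0.4, 0.5), (0.7, 0.5)]  # hand moving right across frames
print(detect_gesture(frames))  # swipe_right
```

A real implementation would first detect the person and hand in each frame (for example with a pose estimator) and feed the tracked hand positions into a classifier like this.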
- In step S43, the user identification unit 102 identifies the coordinates of the person (user) who made the gesture.
- the camera information acquisition unit 41 acquires information (camera ID) for uniquely identifying the camera 21 in addition to the image data. From the camera ID, the location in the real world 61 photographed by the camera 21 can be known, and the corresponding location on the 3D map can also be determined.
- For example, the coordinates of the user's position in the coordinate system set in the real world 61 are obtained. Alternatively, what is detected may be the coordinates, in the coordinate system set in the 3D map, of the position corresponding to the user's position in the real world 61.
- In step S44, the user identification unit 102 identifies the position of the mobile terminal 23. For example, the user identification unit 102 acquires an image captured by the mobile terminal 23 and extracts feature points from the acquired image. The extracted feature points are collated with the 3D map held in the 3D map holding unit 43, and positions (objects) matching the extracted feature points are identified. Through such processing, the position of the mobile terminal 23 capturing the image is identified.
- In step S45, the person and the mobile terminal 23 are linked. Specifically, a person and a mobile terminal 23 are associated with each other when the position of the person specified in step S43 and the position of the mobile terminal 23 specified in step S44 match.
- In this way, the person who performed the gesture and that person's mobile terminal 23 are linked.
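- The position matching of step S45 amounts to pairing each detected person with the nearest localized terminal, subject to a tolerance; the tolerance value and the 2-D coordinate layout below are illustrative assumptions.

```python
import math

def link(people, terminals, tol=0.5):
    """Associate each person with the nearest terminal within tol.
    people / terminals: dicts of id -> (x, y). Returns person_id -> terminal_id."""
    links = {}
    for pid, p in people.items():
        best = min(
            terminals.items(),
            key=lambda item: math.dist(p, item[1]),
            default=None,
        )
        if best is not None and math.dist(p, best[1]) <= tol:
            links[pid] = best[0]
    return links

people = {"person_a": (1.0, 2.0), "person_b": (8.0, 3.0)}
terminals = {"terminal_23": (1.1, 2.1), "terminal_99": (8.2, 2.9)}
print(link(people, terminals))
```

A person standing far from every localized terminal is simply left unlinked, which is the conservative behaviour when the two position estimates do not "match".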
- In step S46, processing corresponding to the gesture is executed. For example, if the gesture is an instruction to the mobile terminal 23, the identified mobile terminal 23 is notified of the instruction given by the user through the gesture. The mobile terminal 23 then executes processing according to the instruction supplied from the information processing device 22b.
- If the gesture is an instruction for the AR content displayed on the display 62 of the mobile terminal 23, AR content matching the instruction is set and supplied.
- In this way, a gesture can be detected from the image captured by the camera 21 and processing for that gesture can be executed. The mobile terminal 23 can also be caused to execute a process instructed by the user through a gesture.
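- The gesture-to-instruction dispatch of step S46 might look like the following sketch; the gesture names, the instruction strings, and the MobileTerminal stub are all hypothetical, since the document does not specify a protocol.

```python
# Hypothetical mapping from detected gestures to terminal instructions.
GESTURE_TO_INSTRUCTION = {
    "swipe_right": "next_content",
    "swipe_left": "previous_content",
    "raise_hand": "show_menu",
}

class MobileTerminal:
    """Stand-in for the mobile terminal 23 receiving notifications."""
    def __init__(self, terminal_id):
        self.terminal_id = terminal_id
        self.received = []

    def notify(self, instruction):
        # Stand-in for the network notification from the device side.
        self.received.append(instruction)

def handle_gesture(gesture, terminal):
    """Notify the linked terminal of the instruction, if the gesture is known."""
    instruction = GESTURE_TO_INSTRUCTION.get(gesture)
    if instruction is not None:
        terminal.notify(instruction)
    return instruction

terminal_23 = MobileTerminal("terminal_23")
print(handle_gesture("swipe_right", terminal_23))  # → next_content
```

Unknown gestures fall through without any notification, so spurious detections do not reach the terminal.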
- <Third Embodiment> As a third embodiment, a case will be described in which information about a user is detected and AR content suited to the obtained information about the user is provided.
- The camera 21 is, for example, a camera installed in a shopping mall and photographs a predetermined area within the shopping mall. Information about a user photographed by the camera 21 is acquired, and AR content is selected and provided based on the acquired information.
- User information includes user attributes, and user attributes include, for example, gender and age.
- FIG. 11 is a diagram showing a configuration example of the information processing device 22c when detecting a user's attribute and providing AR content suitable for that user.
- In FIG. 11, the same parts as those of the information processing device 22a are denoted by the same reference numerals, and redundant description is omitted.
- the information processing device 22c includes a camera information acquisition unit 41, a 3D map storage unit 43, an AR content generation unit 44, an AR content provision unit 47, a user identification unit 102, and a user attribute recognition unit 121.
- The user attribute recognition unit 121 recognizes user attributes. The attributes include information such as gender, age, family, and friends. When the user's preference information can be acquired in advance, the preference information can also be used as an attribute.
- An action history may also be used as an attribute. The action history includes, for example, standing in front of a predetermined store for a predetermined time, purchasing a product, and the like.
- The user's situation in the real world may also be used as an attribute. The user's situation in the real world is, for example, being in a crowded place.
- The information processing device 22c in FIG. 11 has a configuration in which a 3D map that has already been generated is held in the 3D map holding unit 43 and AR content placed on the 3D map is supplied to the mobile terminal 23.
- The change detection unit 45 and the 3D map update unit 46 may also be provided in the information processing device 22c.
- The information processing device 22c may also be configured to include the gesture detection unit 101.
- In step S61, the user attribute recognition unit 121 acquires image data from the camera 21 via the camera information acquisition unit 41.
- In step S62, the user attribute recognition unit 121 analyzes the acquired image data and recognizes the user's attributes. For example, attributes such as gender and age are recognized using machine learning such as deep learning. When an action history is to be recognized, the movement of a predetermined user is continuously monitored, and that action is recognized in the process of step S62.
- When the user's attributes are recognized in step S62, the process proceeds to step S63.
- Steps S63 to S65 are the same processing as steps S43 to S45 (FIG. 10), so description thereof is omitted here.
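- A minimal sketch of the bookkeeping around step S62 follows; the classifier is faked with fixed outputs, and the attribute fields and frame identifiers are assumptions, since the document specifies no model or data format.

```python
from dataclasses import dataclass, field

@dataclass
class UserAttributes:
    """Per-user record in the spirit of the attributes described above."""
    gender: str = "unknown"
    age_band: str = "unknown"
    history: list = field(default_factory=list)  # accumulated action history

def recognize(image_id, classifier):
    """classifier: callable mapping an image id to (gender, age_band)."""
    gender, age_band = classifier(image_id)
    return UserAttributes(gender=gender, age_band=age_band)

# Hypothetical classifier outputs for two camera frames.
fake_classifier = {"frame_1": ("male", "30s"), "frame_2": ("female", "20s")}.get
attrs = recognize("frame_1", fake_classifier)
attrs.history.append("stopped at shelf 71")  # action history from monitoring
print(attrs.gender, attrs.age_band, attrs.history)
```

In a real deployment the callable would wrap a trained image classifier; the record structure is what matters here, since it is what the later content-selection step consumes.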
- In step S66, the AR content generation unit 44 generates (selects) AR content suited to the recognized attributes, and the AR content provision unit 47 supplies it to the mobile terminal 23.
- For example, a user recognized as male is presented with AR content aimed at men, and a user recognized as female is presented with AR content aimed at women.
- When a user stops in front of a predetermined store, AR content related to that store is presented.
- When the degree of congestion is acquired as the user's situation in the real world, different AR content is presented depending on whether the degree of congestion is high or low.
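- The selection in step S66 can be reduced to a lookup keyed on the recognized attributes and the congestion level, as in this sketch; the catalogue entries and the congestion threshold are illustrative assumptions.

```python
# Hypothetical AR content catalogue keyed by (gender, situation).
CATALOGUE = {
    ("male", "normal"): "menswear banner",
    ("female", "normal"): "womenswear banner",
    ("male", "crowded"): "short menswear ticker",
    ("female", "crowded"): "short womenswear ticker",
}

def select_content(gender, congestion_ratio, threshold=0.7):
    """Pick AR content from the recognized attribute and the congestion level."""
    situation = "crowded" if congestion_ratio >= threshold else "normal"
    return CATALOGUE.get((gender, situation), "generic banner")

print(select_content("female", 0.4))  # → womenswear banner
print(select_content("male", 0.9))   # → short menswear ticker
```

The fallback entry covers users whose attributes could not be recognized, so some content is always supplied.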
- The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, a program that constitutes the software is installed in a computer.
- Here, the computer includes, for example, a computer built into dedicated hardware and a general-purpose personal computer capable of executing various functions by installing various programs.
- FIG. 13 is a block diagram showing an example of the hardware configuration of a computer that executes the series of processes described above by a program.
- In the computer, a CPU (Central Processing Unit) 1001, a ROM (Read Only Memory) 1002, and a RAM (Random Access Memory) 1003 are connected to one another by a bus 1004.
- An input/output interface 1005 is further connected to the bus 1004 .
- An input unit 1006 , an output unit 1007 , a storage unit 1008 , a communication unit 1009 and a drive 1010 are connected to the input/output interface 1005 .
- the input unit 1006 consists of a keyboard, mouse, microphone, and the like.
- the output unit 1007 includes a display, a speaker, and the like.
- the storage unit 1008 includes a hard disk, nonvolatile memory, and the like.
- a communication unit 1009 includes a network interface and the like.
- a drive 1010 drives a removable medium 1011 such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory.
- In the computer configured as described above, the CPU 1001 loads, for example, a program stored in the storage unit 1008 into the RAM 1003 via the input/output interface 1005 and the bus 1004 and executes it, whereby the series of processes described above is performed.
- The program executed by the computer (CPU 1001) can be provided by being recorded on the removable medium 1011 as package media, for example. The program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
- In the computer, the program can be installed in the storage unit 1008 via the input/output interface 1005 by loading the removable medium 1011 into the drive 1010. The program can also be received by the communication unit 1009 via a wired or wireless transmission medium and installed in the storage unit 1008. In addition, the program can be installed in the ROM 1002 or the storage unit 1008 in advance.
- The program executed by the computer may be a program in which processing is performed in chronological order following the order described in this specification, or a program in which processing is performed in parallel or at necessary timing, such as when a call is made.
- In this specification, a system refers to an entire apparatus composed of a plurality of devices.
- The present technology can also take the following configurations.
- (1) An information processing apparatus including: a detection unit that detects a change in the real world using an image obtained by photographing the real world; and an update unit that updates, when the detection unit detects a change in the real world, a three-dimensional map generated by photographing the real world.
- (2) The information processing apparatus according to (1), wherein the image is an image captured by a camera installed in the real world.
- (3) The information processing apparatus according to (1) or (2), wherein the change in the real world is a change in the position of an object.
- (4) The information processing apparatus according to (1) or (2), wherein the change in the real world is a change in scenery.
- (5) The information processing apparatus according to any one of (1) to (4), wherein, when the three-dimensional map is updated by the update unit, the arrangement of AR (Augmented Reality) content arranged in the three-dimensional map is also updated.
- (6) The information processing apparatus according to any one of (1) to (5), wherein the image obtained by photographing the real world is an image processed so as not to include personal information.
- (7) An information processing apparatus including: a detection unit that detects, using an image captured by a camera installed in the real world, a gesture made by a person in the real world; and a processing unit that executes, when the detection unit detects the gesture, processing corresponding to the detected gesture.
- (8) The information processing apparatus according to (7), wherein the gesture is a gesture performed by the person on a mobile terminal, and the processing unit notifies the mobile terminal that an instruction has been given to the mobile terminal by the gesture.
- (9) An information processing apparatus including: a recognition unit that recognizes, using an image captured by a camera installed in the real world, attributes of a person in the real world; and a providing unit that generates, based on the attributes recognized by the recognition unit, AR (Augmented Reality) content to be provided to the person, and provides the content.
- (10) The information processing apparatus according to (9), wherein the attributes are gender and age.
- (11) The information processing apparatus according to (9) or (10), wherein the attribute is the situation in the real world.
- (12) An information processing method in which an information processing device detects a change in the real world using an image obtained by photographing the real world and, when a change in the real world is detected, updates a three-dimensional map generated by photographing the real world.
- (13) A program for causing a computer to execute processing of detecting a change in the real world using an image obtained by photographing the real world and, when a change in the real world is detected, updating a three-dimensional map generated by photographing the real world.
- 11 information processing system 21 camera, 22 information processing device, 23 mobile terminal, 24 network, 41 camera information acquisition unit, 42 3D map generation unit, 43 3D map storage unit, 44 AR content generation unit, 45 change detection unit, 46 3D map update unit, 47 AR content provision unit, 61 real world, 62 display, 71 shelf, 72 AR content, 101 gesture detection unit, 102 user identification unit, 121 user attribute recognition unit
Description
FIG. 1 is a diagram showing the configuration of an embodiment of an information processing system to which the present technology is applied. In the information processing system 11 shown in FIG. 1, cameras 21-1 to 21-3, an information processing device 22, and a mobile terminal 23 are connected via a network 24 so that they can exchange data with one another.
FIG. 2 is a diagram showing a functional configuration example of the information processing device 22. The information processing device 22 shown in FIG. 2 is the information processing device 22 in the first embodiment and is described as the information processing device 22a.
The processing relating to the generation of the 3D map and the supply of AR content performed by the information processing device 22a will be described with reference to the flowchart of FIG. 3.
FIG. 7 is a flowchart for explaining the processing relating to the updating of the 3D map and the AR content performed by the information processing device 22a.
FIG. 8 is a diagram for explaining an embodiment (referred to as a second embodiment) in which the real world 61 is photographed by the camera 21 and a change in the real world 61 is detected.
As a third embodiment, a case will be described in which information about a user is detected and AR content suited to the obtained information about the user is provided.
The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, a program that constitutes the software is installed in a computer. Here, the computer includes a computer built into dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions by installing various programs.
(1)
An information processing apparatus including:
a detection unit that detects a change in the real world using an image obtained by photographing the real world; and
an update unit that updates, when the detection unit detects a change in the real world, a three-dimensional map generated by photographing the real world.
(2)
The information processing apparatus according to (1), wherein the image is an image captured by a camera installed in the real world.
(3)
The information processing apparatus according to (1) or (2), wherein the change in the real world is a change in the position of an object.
(4)
The information processing apparatus according to (1) or (2), wherein the change in the real world is a change in scenery.
(5)
The information processing apparatus according to any one of (1) to (4), wherein, when the three-dimensional map is updated by the update unit, the arrangement of AR (Augmented Reality) content arranged in the three-dimensional map is also updated.
(6)
The information processing apparatus according to any one of (1) to (5), wherein the image obtained by photographing the real world is an image processed so as not to include personal information.
(7)
An information processing apparatus including:
a detection unit that detects, using an image captured by a camera installed in the real world, a gesture made by a person in the real world; and
a processing unit that executes, when the detection unit detects the gesture, processing corresponding to the detected gesture.
(8)
The information processing apparatus according to (7), wherein the gesture is a gesture performed by the person on a mobile terminal, and the processing unit notifies the mobile terminal that an instruction has been given to the mobile terminal by the gesture.
(9)
An information processing apparatus including:
a recognition unit that recognizes, using an image captured by a camera installed in the real world, attributes of a person in the real world; and
a providing unit that generates, based on the attributes recognized by the recognition unit, AR (Augmented Reality) content to be provided to the person, and provides the content.
(10)
The information processing apparatus according to (9), wherein the attributes are gender and age.
(11)
The information processing apparatus according to (9) or (10), wherein the attribute is the situation in the real world.
(12)
An information processing method in which an information processing device
detects a change in the real world using an image obtained by photographing the real world and,
when a change in the real world is detected, updates a three-dimensional map generated by photographing the real world.
(13)
A program for causing a computer to execute processing of
detecting a change in the real world using an image obtained by photographing the real world and,
when a change in the real world is detected, updating a three-dimensional map generated by photographing the real world.
Claims (13)
- An information processing apparatus including: a detection unit that detects a change in the real world using an image obtained by photographing the real world; and an update unit that updates, when the detection unit detects a change in the real world, a three-dimensional map generated by photographing the real world.
- The information processing apparatus according to claim 1, wherein the image is an image captured by a camera installed in the real world.
- The information processing apparatus according to claim 1, wherein the change in the real world is a change in the position of an object.
- The information processing apparatus according to claim 1, wherein the change in the real world is a change in scenery.
- The information processing apparatus according to claim 1, wherein, when the three-dimensional map is updated by the update unit, the arrangement of AR (Augmented Reality) content arranged in the three-dimensional map is also updated.
- The information processing apparatus according to claim 1, wherein the image obtained by photographing the real world is an image processed so as not to include personal information.
- An information processing apparatus including: a detection unit that detects, using an image captured by a camera installed in the real world, a gesture made by a person in the real world; and a processing unit that executes, when the detection unit detects the gesture, processing corresponding to the detected gesture.
- The information processing apparatus according to claim 7, wherein the gesture is a gesture performed by the person on a mobile terminal, and the processing unit notifies the mobile terminal that an instruction has been given to the mobile terminal by the gesture.
- An information processing apparatus including: a recognition unit that recognizes, using an image captured by a camera installed in the real world, attributes of a person in the real world; and a providing unit that generates, based on the attributes recognized by the recognition unit, AR (Augmented Reality) content to be provided to the person, and provides the content.
- The information processing apparatus according to claim 9, wherein the attributes are gender and age.
- The information processing apparatus according to claim 9, wherein the attribute is the situation in the real world.
- An information processing method in which an information processing device detects a change in the real world using an image obtained by photographing the real world and, when a change in the real world is detected, updates a three-dimensional map generated by photographing the real world.
- A program for causing a computer to execute processing of detecting a change in the real world using an image obtained by photographing the real world and, when a change in the real world is detected, updating a three-dimensional map generated by photographing the real world.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280012469.XA CN116783617A (zh) | 2021-02-05 | 2022-01-05 | 信息处理装置、信息处理方法和程序 |
JP2022579392A JPWO2022168515A1 (ja) | 2021-02-05 | 2022-01-05 | |
EP22749394.7A EP4290468A1 (en) | 2021-02-05 | 2022-01-05 | Information processing device, information processing method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021-017389 | 2021-02-05 | ||
JP2021017389 | 2021-02-05 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022168515A1 true WO2022168515A1 (ja) | 2022-08-11 |
Family
ID=82742286
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/000077 WO2022168515A1 (ja) | 2021-02-05 | 2022-01-05 | 情報処理装置、情報処理方法、並びにプログラム |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP4290468A1 (ja) |
JP (1) | JPWO2022168515A1 (ja) |
CN (1) | CN116783617A (ja) |
TW (1) | TW202236077A (ja) |
WO (1) | WO2022168515A1 (ja) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005251170A (ja) * | 2004-01-23 | 2005-09-15 | Sony United Kingdom Ltd | 表示装置 |
JP2012221250A (ja) | 2011-04-08 | 2012-11-12 | Sony Corp | 画像処理装置、表示制御方法及びプログラム |
JP2019046291A (ja) * | 2017-09-05 | 2019-03-22 | 株式会社ソニー・インタラクティブエンタテインメント | 情報処理装置および画像表示方法 |
JP2020204708A (ja) * | 2019-06-17 | 2020-12-24 | 株式会社ジースキャン | 地図情報更新システム |
Family events (2022):
- 2022-01-05: JP application JP2022579392A (publication JPWO2022168515A1), active, pending
- 2022-01-05: WO application PCT/JP2022/000077 (publication WO2022168515A1), active, application filing
- 2022-01-05: CN application CN202280012469.XA (publication CN116783617A), active, pending
- 2022-01-05: EP application EP22749394.7A (publication EP4290468A1), active, pending
- 2022-01-26: TW application TW111103253A (publication TW202236077A), status unknown
Also Published As
Publication number | Publication date |
---|---|
CN116783617A (zh) | 2023-09-19 |
JPWO2022168515A1 (ja) | 2022-08-11 |
EP4290468A1 (en) | 2023-12-13 |
TW202236077A (zh) | 2022-09-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9661214B2 (en) | Depth determination using camera focus | |
US20190333478A1 (en) | Adaptive fiducials for image match recognition and tracking | |
US10264207B2 (en) | Method and system for creating virtual message onto a moving object and searching the same | |
CN108921894B (zh) | 对象定位方法、装置、设备和计算机可读存储介质 | |
US20170337747A1 (en) | Systems and methods for using an avatar to market a product | |
US20160119607A1 (en) | Image processing system and image processing program | |
US20170255947A1 (en) | Image processing system and image processing method | |
US11715236B2 (en) | Method and system for re-projecting and combining sensor data for visualization | |
CN110555876B (zh) | 用于确定位置的方法和装置 | |
US20200294318A1 (en) | Representation of user position, movement, and gaze in mixed reality space | |
JP2019174920A (ja) | 物品管理システム、及び物品管理プログラム | |
EP2808805A1 (en) | Method and apparatus for displaying metadata on a display and for providing metadata for display | |
US9851784B2 (en) | Movement line conversion and analysis system, method and program | |
JP2013195725A (ja) | 画像表示システム | |
WO2022168515A1 (ja) | 情報処理装置、情報処理方法、並びにプログラム | |
JP2016021097A (ja) | 画像処理装置、画像処理方法、およびプログラム | |
JP6399096B2 (ja) | 情報処理装置、表示方法およびコンピュータプログラム | |
US11341774B2 (en) | Information processing apparatus, data generation method, and non-transitory computer readable medium storing program | |
GB2513865A (en) | A method for interacting with an augmented reality scene | |
JP2011192220A (ja) | 同一人判定装置、同一人判定方法および同一人判定プログラム | |
TW201822034A (zh) | 收集系統、終端用程式以及收集方法 | |
CN111860070A (zh) | 识别发生改变的对象的方法和装置 | |
KR20220013235A (ko) | 영상 통화 수행 방법, 그 방법을 수행하는 디스플레이 기기, 및 그 방법을 수행하는 프로그램이 저장된 컴퓨터 판독 가능 저장 매체 | |
US11393197B2 (en) | System and method for quantifying augmented reality interaction | |
US20230127443A1 (en) | System for Controlling Display Device on Basis of Identified Capture Range |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22749394 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022579392 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202280012469.X Country of ref document: CN |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2022749394 Country of ref document: EP Effective date: 20230905 |