WO2021200432A1 - Shooting instruction method, shooting method, shooting instruction device, and shooting device - Google Patents
- Publication number
- WO2021200432A1 PCT/JP2021/012156
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- region
- shooting
- dimensional
- image
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three-dimensional [3D] modelling for computer graphics
- G06T17/20—Finite element generation, e.g. wire-frame surface description, tesselation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three-dimensional [3D] modelling for computer graphics
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional [3D] objects
- G06V20/647—Three-dimensional [3D] objects by matching two-dimensional images to three-dimensional objects
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/61—Control of cameras or camera modules based on recognised objects
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/63—Control of cameras or camera modules by using electronic viewfinders
- H04N23/633—Control of cameras or camera modules by using electronic viewfinders for displaying additional information relating to control or operation of the camera
- H04N23/634—Warning indications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/64—Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30244—Camera pose
Definitions
- This disclosure relates to a shooting instruction method, a shooting method, a shooting instruction device, and a shooting device.
- Patent Document 1 discloses a technique for generating a three-dimensional model of a subject by using a plurality of images obtained by photographing the subject from a plurality of viewpoints.
- An object of the present disclosure is to provide a shooting instruction method or a shooting instruction device capable of improving the accuracy of a three-dimensional model.
- the shooting instruction method is a shooting instruction method executed by a shooting instruction device, and is based on each shooting position and orientation of a plurality of images of a subject and the plurality of images.
- The designation of the first region is accepted, and at least one of the shooting position and the orientation is instructed so as to capture an image used for generating the three-dimensional model of the designated first region.
- The photographing method is executed by a photographing apparatus. A plurality of first images of a target space are captured; first three-dimensional position information of the target space is generated based on the plurality of first images and the first shooting position and orientation of each of the plurality of first images; and a second region, in which it is difficult to generate second three-dimensional position information of the target space that is more detailed than the first three-dimensional position information using the plurality of first images, is determined by using the first three-dimensional position information without generating the second three-dimensional position information.
- the present disclosure can provide a shooting instruction method or a shooting instruction device that can improve the accuracy of the three-dimensional model.
- FIG. 1 is a block diagram of the terminal device according to the first embodiment.
- FIG. 2 is a sequence diagram of the terminal device according to the first embodiment.
- FIG. 3 is a flowchart of the initial processing according to the first embodiment.
- FIG. 4 is a diagram showing an example of the initial display according to the first embodiment.
- FIG. 5 is a diagram showing an example of a method of selecting a priority designated portion according to the first embodiment.
- FIG. 6 is a diagram showing an example of a method of selecting a priority designated portion according to the first embodiment.
- FIG. 7 is a flowchart of the position / orientation estimation process according to the first embodiment.
- FIG. 8 is a flowchart of the shooting position candidate determination process according to the first embodiment.
- FIG. 9 is a diagram showing a situation in which the camera and the object according to the first embodiment are viewed from above.
- FIG. 10 is a diagram showing an example of an image obtained by each camera according to the first embodiment.
- FIG. 11 is a schematic diagram for explaining an example of determining a shooting position candidate according to the first embodiment.
- FIG. 12 is a schematic diagram for explaining an example of determining a shooting position candidate according to the first embodiment.
- FIG. 13 is a schematic diagram for explaining an example of determining a shooting position candidate according to the first embodiment.
- FIG. 14 is a flowchart of the three-dimensional reconstruction process according to the first embodiment.
- FIG. 15 is a flowchart of the display process during shooting according to the first embodiment.
- FIG. 16 is a diagram showing an example of a method of visually presenting a shooting position candidate according to the first embodiment.
- FIG. 17 is a diagram showing an example of a method of visually presenting a shooting position candidate according to the first embodiment.
- FIG. 18 is a diagram showing a display example of an alert according to the first embodiment.
- FIG. 19 is a flowchart of a shooting instruction process according to the first embodiment.
- FIG. 20 is a diagram showing a configuration of a three-dimensional reconstruction system according to the second embodiment.
- FIG. 21 is a block diagram of the photographing apparatus according to the second embodiment.
- FIG. 22 is a flowchart showing the operation of the photographing apparatus according to the second embodiment.
- FIG. 23 is a flowchart of the position / orientation estimation process according to the second embodiment.
- FIG. 24 is a flowchart of the position / posture integration process according to the second embodiment.
- FIG. 25 is a plan view showing a state of photographing in the target space according to the second embodiment.
- FIG. 26 is a diagram showing an example of an image and an example of comparison processing according to the second embodiment.
- FIG. 27 is a flowchart of the area detection process according to the second embodiment.
- FIG. 28 is a flowchart of the display process according to the second embodiment.
- FIG. 29 is a diagram showing a display example of the UI screen according to the second embodiment.
- FIG. 30 is a diagram showing an example of area information according to the second embodiment.
- FIG. 31 is a diagram showing a display example when the estimation of the position / posture according to the second embodiment fails.
- FIG. 32 is a diagram showing a display example when a low-precision region according to the second embodiment is detected.
- FIG. 33 is a diagram showing an example of instructions to the user according to the second embodiment.
- FIG. 34 is a diagram showing an example of an instruction (arrow) according to the second embodiment.
- FIG. 35 is a diagram showing an example of area information according to the second embodiment.
- FIG. 36 is a plan view showing a shooting state of the target area according to the second embodiment.
- FIG. 37 is a diagram showing an example of a photographed area when the three-dimensional point according to the second embodiment is used.
- FIG. 38 is a diagram showing an example of a photographed region when the mesh according to the second embodiment is used.
- FIG. 39 is a diagram showing an example of a depth image according to the second embodiment.
- FIG. 40 is a diagram showing an example of a photographed area when the depth image according to the second embodiment is used.
- FIG. 41 is a flowchart of the photographing method according to the second embodiment.
- the shooting instruction method is a shooting instruction method executed by a shooting instruction device, and is based on each shooting position and orientation of a plurality of images of a subject and the plurality of images.
- The designation of the first region is accepted, and at least one of the shooting position and the orientation is instructed so as to capture an image used for generating the three-dimensional model of the designated first region.
- the accuracy of the 3D model in the area required by the user can be improved preferentially, so that the accuracy of the 3D model can be improved.
- a second region where it is difficult to generate the three-dimensional model is detected, and an image which facilitates the generation of the three-dimensional model of the second region is generated.
- At least one of the shooting position and the posture is instructed so as to capture an image that facilitates the generation of the three-dimensional model of the first region.
- the image of the subject for which the attribute recognition has been executed is displayed, and the designation of the attribute is accepted in the designation of the first area.
- (i) An edge on a two-dimensional image whose angle difference from the epipolar line based on the shooting position and posture is smaller than a predetermined value is obtained, and (ii) the three-dimensional region corresponding to the edge is detected as the second region.
- In the instruction corresponding to the second region, at least one of the shooting position and the posture may be instructed so as to shoot an image in which the angle difference is larger than the predetermined value.
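- The edge test described above can be sketched as follows. This is only an illustration under stated assumptions, not the method of the disclosure: the function names, the representation of the edge and the epipolar line as 2D direction vectors, and the 10-degree default threshold are all hypothetical.

```python
import math


def angle_between_lines(d1, d2):
    """Smallest angle in degrees between two undirected 2D line directions."""
    a1 = math.atan2(d1[1], d1[0])
    a2 = math.atan2(d2[1], d2[0])
    # Lines have no orientation, so angles are compared modulo 180 degrees.
    diff = abs(a1 - a2) % math.pi
    return math.degrees(min(diff, math.pi - diff))


def is_hard_edge(edge_dir, epipolar_dir, threshold_deg=10.0):
    """True if the edge is nearly parallel to the epipolar line.

    threshold_deg stands in for the "predetermined value"; it is an
    assumed number, not one taken from the source.
    """
    return angle_between_lines(edge_dir, epipolar_dir) < threshold_deg
```

- An edge flagged by `is_hard_edge` would correspond to a candidate second region; a new viewpoint would then be chosen so that the same edge makes a larger angle with the new epipolar line.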
- the plurality of images are a plurality of frames included in the moving image currently being captured and displayed, and the instruction corresponding to the second region may be performed in real time.
- the shooting direction may be indicated in the instruction corresponding to the second region.
- the user can easily perform a suitable shooting according to the instruction.
- the photographing area may be specified.
- the user can easily perform a suitable shooting according to the instruction.
- the shooting instruction device includes a processor and a memory, and is a tertiary of the subject based on the shooting position and orientation of each of the plurality of images of the subject and the plurality of images.
- the designation of the first region is accepted, and at least one of the shooting position and the posture is instructed to shoot the image used for generating the three-dimensional model of the designated first region.
- the accuracy of the 3D model in the area required by the user can be improved preferentially, so that the accuracy of the 3D model can be improved.
- the shooting instruction method is based on the shooting positions and postures of the plurality of images of the subject and the plurality of images, and the three-dimensional model of the subject using the plurality of images.
- a region that is difficult to generate is detected, and at least one of the imaging position and the posture is instructed to capture an image that facilitates the generation of a three-dimensional model of the detected region.
- the accuracy of the 3D model can be improved.
- The shooting instruction method may further accept the designation of a priority region, and in the instruction, at least one of the shooting position and orientation may further be instructed so as to capture an image that facilitates the generation of a three-dimensional model of the designated priority region.
- the accuracy of the 3D model in the area required by the user can be improved preferentially.
- The photographing method is executed by a photographing apparatus. A plurality of first images of a target space are captured; first three-dimensional position information of the target space is generated based on the plurality of first images and the first shooting position and orientation of each of the plurality of first images; and a second region, in which it is difficult to generate second three-dimensional position information of the target space that is more detailed than the first three-dimensional position information using the plurality of first images, is determined by using the first three-dimensional position information without generating the second three-dimensional position information.
- the photographing method can determine the second region where it is difficult to generate the second three-dimensional position information by using the first three-dimensional position information without generating the second three-dimensional position information. It is possible to improve the efficiency of capturing a plurality of images for generating the second three-dimensional position information.
- The second region may be at least one of a region in which an image has not been taken and a region in which the accuracy of the second three-dimensional position information is estimated to be lower than a predetermined reference.
- the first three-dimensional position information may include a first three-dimensional point cloud
- the second three-dimensional position information may include a second three-dimensional point cloud that is denser than the first three-dimensional point cloud
- the third region of the target space corresponding to the region around the first three-dimensional point cloud may be determined, and the region other than the third region may be determined as the second region.
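- One way to picture the split into a third region (around the sparse points) and a second region (everything else) is a voxel grid. The sketch below is an assumption-laden illustration: the voxel representation, the `margin` parameter that defines "around", and all function names are hypothetical and are not taken from the disclosure.

```python
from itertools import product


def voxel_of(point, origin, size):
    """Integer voxel index of a 3D point for a grid starting at origin."""
    return tuple(int((c - o) // size) for c, o in zip(point, origin))


def split_regions(points, origin, size, dims, margin=1):
    """Return (third, second) as sets of voxel indices.

    third  : voxels within `margin` voxels of any sparse 3D point
    second : all remaining voxels of the grid
    """
    covered = set()
    offsets = range(-margin, margin + 1)
    for p in points:
        vx, vy, vz = voxel_of(p, origin, size)
        for dx, dy, dz in product(offsets, repeat=3):
            v = (vx + dx, vy + dy, vz + dz)
            # Keep only voxels that lie inside the grid.
            if all(0 <= v[i] < dims[i] for i in range(3)):
                covered.add(v)
    all_voxels = set(product(*(range(n) for n in dims)))
    return covered, all_voxels - covered
```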
- A mesh may be generated using the first three-dimensional point cloud, and a region other than a third region of the target space, the third region corresponding to the region in which the mesh is generated, may be determined to be the second region.
- the second region may be determined based on the reprojection error of the first three-dimensional point cloud.
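- As a rough illustration of a reprojection-error check, the sketch below projects each 3D point with a simple pinhole model and flags points whose error exceeds a threshold. The pinhole model without distortion, the parameter names, and the 2-pixel default threshold are assumptions made for this example; the source does not specify them.

```python
import math


def project(point, rotation, translation, focal, center):
    """Project a 3D world point to pixel coordinates (pinhole, no distortion)."""
    xc, yc, zc = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    return (focal * xc / zc + center[0], focal * yc / zc + center[1])


def reprojection_error(point, observed_px, rotation, translation, focal, center):
    """Pixel distance between the projected point and its observed position."""
    u, v = project(point, rotation, translation, focal, center)
    return math.hypot(u - observed_px[0], v - observed_px[1])


def low_accuracy_indices(points, observations, rotation, translation,
                         focal, center, max_error_px=2.0):
    """Indices of points whose reprojection error exceeds max_error_px."""
    return [
        i for i, (p, obs) in enumerate(zip(points, observations))
        if reprojection_error(p, obs, rotation, translation, focal, center) > max_error_px
    ]
```

- Points returned by `low_accuracy_indices` could be treated as belonging to a low-accuracy (second) region.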
- The first three-dimensional position information may include a depth image; based on the depth image, a region within a predetermined distance from a shooting viewpoint may be determined as a third region, and a region other than the third region may be determined as the second region.
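- The depth-based split can be sketched as a simple per-pixel threshold. In this illustration the depth image is a nested list of distances, and pixels with no measurement (0 or None) are assigned to the second region; both choices are assumptions of the example rather than details from the source.

```python
def split_by_depth(depth, max_distance):
    """Split pixel coordinates into (third, second) by distance from the viewpoint.

    depth        : 2D list of distances; 0 or None means no measurement
    max_distance : stand-in for the "predetermined distance"
    """
    third, second = [], []
    for r, row in enumerate(depth):
        for c, d in enumerate(row):
            if d and d <= max_distance:
                third.append((r, c))   # close enough to be considered captured
            else:
                second.append((r, c))  # too far or unmeasured
    return third, second
```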
- The photographing method may further align the coordinate system of the plurality of first shooting positions and postures with the coordinate system of the plurality of second shooting positions and postures by using a plurality of second images already photographed, the second shooting position and posture of each of the plurality of second images, the plurality of first images, and the plurality of first shooting positions and postures.
- the determination of the second region can be performed using the information obtained in a plurality of shootings.
- The photographing method may further display the second region, or a third region other than the second region, during the photographing of the target space.
- the second area can be presented to the user.
- information indicating the second region or the third region may be superimposed and displayed on any of the plurality of images.
- Since the position of the second region in the image can be presented to the user, the user can easily grasp the position of the second region.
- information indicating the second region or the third region may be superimposed and displayed on the map of the target space.
- Since the position of the second region in the surrounding environment can be presented to the user, the user can easily grasp the position of the second region.
- the second region and the restoration accuracy of each region included in the second region may be displayed.
- the user can grasp the restoration accuracy of each area in addition to the second area, so that appropriate shooting can be performed based on this.
- the user may be further instructed to shoot the second region.
- the user can efficiently perform appropriate shooting.
- the instruction may include at least one of a direction and a distance from the current position to the second region.
- the user can efficiently perform appropriate shooting.
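- An instruction containing a direction and a distance could be derived as in the sketch below. It assumes a 2D floor-plan coordinate system, a heading measured counter-clockwise from the +x axis, and a single target point representing the second region; none of these conventions come from the source.

```python
import math


def guidance(current_xy, heading_deg, target_xy):
    """Return (distance, turn_deg) from the current pose to the target.

    turn_deg is the signed angle to turn, in the range [-180, 180);
    positive values mean a counter-clockwise (left) turn.
    """
    dx = target_xy[0] - current_xy[0]
    dy = target_xy[1] - current_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dy, dx))
    turn = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return distance, turn
```

- The returned pair could drive an on-screen arrow (direction) together with a text label (distance).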
- the photographing apparatus includes a processor and a memory, and the processor photographs a plurality of first images of the target space using the memory, and the plurality of first images and the plurality of first images.
- the first three-dimensional position information of the target space is generated based on the first shooting position and the orientation of each of the plurality of first images, and the first three-dimensional position is used using the plurality of first images.
- a second region in which it is difficult to generate the second three-dimensional position information of the target space, which is more detailed than the information, is determined by using the first three-dimensional position information without generating the second three-dimensional position information.
- the photographing apparatus can determine the second region where it is difficult to generate the second three-dimensional position information by using the first three-dimensional position information without generating the second three-dimensional position information. It is possible to improve the efficiency of capturing a plurality of images for generating the second three-dimensional position information.
- These aspects may be realized by a system, a method, an integrated circuit, a computer program, or a recording medium such as a computer-readable CD-ROM, or by any combination of a system, a method, an integrated circuit, a computer program, and a recording medium.
- By generating a three-dimensional model using a plurality of images taken by a camera, a three-dimensional map or the like can be generated more easily than with a method that generates a three-dimensional model using laser measurement. Therefore, methods of generating a three-dimensional model from images are used for distance measurement in construction management at construction sites and the like.
- the three-dimensional model is a representation of the captured measurement target on a computer.
- the three-dimensional model has, for example, position information of each three-dimensional location on the measurement target.
- UI (user interface)
- three-dimensional reconstruction (generation of a three-dimensional model)
- instructs a shooting position or a shooting posture based on the detection result will be described.
- the work can be made more efficient.
- FIG. 1 is a block diagram of the terminal device 100 according to the present embodiment.
- the terminal device 100 has an imaging function, a function of estimating a three-dimensional position and posture during shooting, a function of determining the next shooting position candidate from the shot image, and a function of presenting the estimated shooting position candidate to the user.
- The terminal device 100 may also have a function of generating a three-dimensional model, which is a three-dimensional point cloud of the shooting environment, by performing three-dimensional reconstruction using the estimated three-dimensional position and orientation; a function of determining the next shooting position candidate by using the three-dimensional model; a function of presenting the estimated shooting position candidate to the user; and a function of transmitting and receiving at least one of a captured image, a three-dimensional position and orientation, and a three-dimensional model to and from other terminal devices and a management server.
- The terminal device 100 includes an imaging unit 101, a control unit 102, a position / orientation estimation unit 103, a three-dimensional reconstruction unit 104, an image analysis unit 105, a point cloud analysis unit 106, a communication unit 107, a UI unit 108, an image storage unit 111, a camera posture storage unit 112, and a three-dimensional model storage unit 113.
- The imaging unit 101 is an imaging device such as a camera, and acquires video (a moving image). In the following, an example in which video is mainly used will be described, but a plurality of still images may be used instead of video.
- The imaging unit 101 stores the acquired video in the image storage unit 111.
- the imaging unit 101 may capture a visible light image or an infrared image. When an infrared image is used, it is possible to take an image even in a dark environment such as at night.
- The imaging unit 101 may be a monocular camera, or may have a plurality of cameras such as a stereo camera. By using a calibrated stereo camera, the accuracy of the three-dimensional position and orientation can be improved. Even when the stereo camera is not calibrated, images with parallax can be obtained.
- the control unit 102 controls the entire image processing of the terminal device 100 and the like.
- The position / orientation estimation unit 103 estimates the three-dimensional position and orientation of the camera that captured the image, using the image stored in the image storage unit 111. The position / orientation estimation unit 103 also stores the estimated three-dimensional position and orientation in the camera posture storage unit 112. For example, the position / orientation estimation unit 103 uses image processing such as SLAM (Simultaneous Localization and Mapping) to estimate the position and orientation of the camera. Alternatively, the position / orientation estimation unit 103 may calculate the position and orientation of the camera using information obtained by various sensors (such as GPS or an acceleration sensor) included in the terminal device 100. In the former case, the position and orientation can be estimated from the information obtained by the imaging unit 101. In the latter case, the estimation can be realized with a light processing load.
- SLAM (Simultaneous Localization and Mapping)
- The three-dimensional reconstruction unit 104 generates a three-dimensional model by performing three-dimensional reconstruction using the image stored in the image storage unit 111 and the three-dimensional position and orientation stored in the camera posture storage unit 112. The three-dimensional reconstruction unit 104 also stores the generated three-dimensional model in the three-dimensional model storage unit 113.
- For example, the three-dimensional reconstruction unit 104 performs three-dimensional reconstruction using image processing typified by SfM (Structure from Motion).
- Alternatively, the three-dimensional reconstruction unit 104 may utilize stereo parallax when using images obtained by a calibrated camera such as a stereo camera. In the former case, an accurate three-dimensional model can be generated by using many images. In the latter case, a three-dimensional model can be generated at high speed with light processing.
- The image analysis unit 105 uses the image stored in the image storage unit 111 and the three-dimensional position and orientation stored in the camera posture storage unit 112 to analyze from which position shooting should be performed in order to achieve highly accurate three-dimensional reconstruction, and determines a shooting position candidate based on the analysis result. Information indicating the determined shooting position candidate is output to the UI unit 108 and presented to the user.
- The point cloud analysis unit 106 uses the image stored in the image storage unit 111, the three-dimensional position and orientation stored in the camera posture storage unit 112, and the three-dimensional model stored in the three-dimensional model storage unit 113 to determine the density of the point cloud included in the three-dimensional model.
- The point cloud analysis unit 106 determines a shooting position candidate that captures a sparse area. Further, the point cloud analysis unit 106 detects a region of the point cloud that was generated from the peripheral region of an image, where lens distortion or the like is likely to occur, and determines a shooting position candidate such that the region is captured in the center of the camera's field of view.
- the determined shooting position candidate is output to the UI unit 108 and presented to the user.
- the communication unit 107 transmits and receives the captured video, the calculated three-dimensional posture, and the three-dimensional model to and from the cloud server or other terminal device via communication.
- the UI unit 108 presents the photographed video and the image shooting position candidates determined by the image analysis unit 105 and the point cloud analysis unit 106 to the user.
- the UI unit 108 has an input function for inputting a shooting start instruction, a shooting end instruction, and a priority processing location from the user.
- FIG. 2 is a sequence diagram showing information exchange and the like in the terminal device 100.
- The lightly shaded region indicates that the imaging unit 101 is continuously shooting.
- the terminal device 100 analyzes the image and the shooting position and posture in real time, and gives a shooting instruction to the user.
- Here, real-time analysis means performing the analysis while shooting.
- Alternatively, real-time analysis means performing the analysis without generating a three-dimensional model.
- the terminal device 100 estimates the position and orientation of the camera during shooting, and determines a region that is difficult to restore based on the estimation result and the shot image.
- the terminal device 100 predicts a shooting position and posture that can secure parallax that makes it easy to restore the area, and presents the predicted shooting position and posture on the UI. Although the sequence for moving image shooting is presented here, the same processing may be performed for shooting a still image for each image.
- the UI unit 108 performs initial processing (S101). As a result, the UI unit 108 sends the shooting start signal to the imaging unit 101.
- the start process is performed, for example, by the user clicking the "shooting start” button on the display of the terminal device 100.
- the UI unit 108 performs display processing during shooting (S102). Specifically, the UI unit 108 presents a video being shot and instructions to the user.
- Upon receiving the shooting start signal, the imaging unit 101 shoots video and sends image information, which is the captured video, to the position / orientation estimation unit 103, the three-dimensional reconstruction unit 104, the image analysis unit 105, and the point cloud analysis unit 106.
- The imaging unit 101 may perform streaming transmission, in which images are transmitted as they are shot, or may collectively transmit, at regular intervals, the images captured during each interval. That is, the image information is one or a plurality of images (frames) included in the video.
- In the former case, processing can be performed as the images arrive, so that the waiting time for three-dimensional model generation can be reduced.
- In the latter case, a large amount of captured information can be utilized, so that highly accurate processing can be realized.
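- The two transmission styles can be contrasted with a pair of small generators. This is only a sketch: it groups frames by count rather than by elapsed time, which is a simplification chosen for the example, and the function names are hypothetical.

```python
def stream_frames(frames):
    """Streaming style: hand over each frame as soon as it is available."""
    for frame in frames:
        yield [frame]


def batch_frames(frames, batch_size):
    """Batched style: collect frames and hand them over in groups.

    batch_size stands in for "a regular interval"; a real device would
    more likely group frames by elapsed time.
    """
    batch = []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch
```

- In both cases the receiver gets a list of one or more frames, matching the statement that the image information is one or a plurality of frames.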
- the position / orientation estimation unit 103 first performs an input standby process at the start of shooting, and is in a state of waiting for image information from the image pickup unit 101.
- the position / orientation estimation unit 103 performs the position / orientation estimation process (S103). That is, the position / orientation estimation process is performed every one frame or a plurality of frames.
- the position / orientation estimation unit 103 transmits an estimation failure signal to the UI unit 108 in order to present the failure to the user.
- The position / orientation estimation unit 103 transmits the position / orientation information, which is the estimation result of the three-dimensional position and orientation, to the UI unit 108 so that the current three-dimensional position and orientation can be output. The position / orientation estimation unit 103 also transmits the position / orientation information to the image analysis unit 105 and the three-dimensional reconstruction unit 104.
- the image analysis unit 105 first performs an input standby process at the start of shooting, and is in a state of waiting for image information from the image pickup unit 101 and position / orientation information from the position / orientation estimation unit 103.
- the image analysis unit 105 performs the shooting position candidate determination process (S104).
- the shooting position candidate determination process may be performed frame by frame, or may be performed every fixed time (plurality of frames) (for example, every 5 seconds).
- the image analysis unit 105 determines whether the terminal device 100 is moving toward the shooting position candidate generated by the shooting position candidate determination process, and if it is, the image analysis unit 105 need not perform a new shooting position candidate determination process.
- the image analysis unit 105 judges that the terminal device 100 is moving toward the candidate if the current position and posture lie on the straight line connecting the position and posture of the image at the time the shooting position candidate was determined and the position and posture of the calculated candidate.
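- the on-the-line judgment above can be sketched as follows. This is a minimal illustration, not the method of the disclosure: it checks position only (posture is ignored), and the function name `is_moving_toward_candidate` and the tolerance `tol` are assumptions introduced here.

```python
import math

def is_moving_toward_candidate(start, candidate, current, tol=0.05):
    """Return True if `current` lies near the segment from `start` to `candidate`.

    start, candidate, current: 3D positions (x, y, z).
    tol: hypothetical distance tolerance, in the same unit as the positions.
    """
    seg = [c - s for s, c in zip(start, candidate)]
    rel = [p - s for s, p in zip(start, current)]
    seg_len_sq = sum(v * v for v in seg)
    if seg_len_sq == 0.0:
        # Candidate coincides with the start position.
        return math.dist(start, current) <= tol
    # Parameter of the projection onto the line (0 = start, 1 = candidate).
    t = sum(a * b for a, b in zip(seg, rel)) / seg_len_sq
    if t < 0.0 or t > 1.0:
        return False
    closest = [s + t * v for s, v in zip(start, seg)]
    return math.dist(closest, current) <= tol
```

- a tolerance is used because a handheld device will rarely sit exactly on the line; a real system would likely also compare posture.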
- the three-dimensional reconstruction unit 104 performs an input standby process at the start of shooting, and waits for the image information from the image pickup unit 101 and the position / orientation information of the position / orientation estimation unit 103.
- the three-dimensional reconstruction unit 104 calculates the three-dimensional model by performing the three-dimensional reconstruction process (S105).
- the three-dimensional reconstruction unit 104 transmits the calculated point cloud information, which is a three-dimensional model, to the point cloud analysis unit 106.
- the point cloud analysis unit 106 first performs an input standby process at the start of shooting, and is in a state of waiting for point cloud information from the three-dimensional reconstruction unit 104.
- the point cloud analysis unit 106 performs a shooting position candidate determination process (S106). For example, the point cloud analysis unit 106 determines how sparse or dense each part of the point cloud is and detects sparse regions.
- the point cloud analysis unit 106 determines a shooting position candidate that captures many of these sparse areas.
- the point cloud analysis unit 106 may determine a shooting position candidate by using image information or position / orientation information in addition to the point cloud information.
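- one simple way to detect sparse regions is to count points per voxel. The sketch below is only an illustration under that assumption; the disclosure does not specify the density measure, and `find_sparse_voxels`, `voxel_size`, and `min_points` are hypothetical names and parameters.

```python
from collections import Counter
import math

def find_sparse_voxels(points, voxel_size=1.0, min_points=3):
    """Return the occupied voxels that contain fewer than `min_points` points.

    points: iterable of (x, y, z) coordinates.
    Voxels are identified by integer grid indices. Completely empty voxels
    are not reported, because they never appear in the count.
    """
    counts = Counter(
        tuple(math.floor(c / voxel_size) for c in p) for p in points
    )
    return sorted(v for v, n in counts.items() if n < min_points)
```

- a shooting position candidate could then be chosen so that as many of the returned voxels as possible fall inside the camera's field of view.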
- FIG. 3 is a flowchart of the initial process (S101).
- the UI unit 108 displays the current captured image (S201).
- the UI unit 108 acquires whether or not there is a priority location that the user wants to restore preferentially (S202). For example, the UI unit 108 displays a button for designating the priority mode, and when the button is selected, determines that there is a priority location.
- the UI unit 108 displays the priority location selection screen (S204) and acquires the information of the priority location selected by the user (S205). After step S205, or when there is no priority point (No in S203), the UI unit 108 then outputs a shooting start signal to the imaging unit 101. As a result, shooting is started (S206). For example, shooting may be started when the user presses a button, or shooting may be automatically started after a lapse of a certain period of time.
- the restoration priority of the set priority location is set high when instructing the movement of the camera. According to this, no instruction is given to move the camera in order to restore an area that is difficult to restore but that the user does not need, and instructions can be directed to the area the user requires.
- FIG. 4 is a diagram showing an example of the initial display of the initial processing (S101).
- the captured image 201, the priority designation button 202 for choosing whether to designate a priority location, and the shooting start button 203 for starting shooting are displayed.
- the captured image 201 may be a still image or a video currently being captured.
- FIG. 5 is a diagram showing an example of a method of selecting a priority designated portion at the time of priority designation.
- attribute recognition of an object such as a window frame is performed, and a desired object is selected from a list of a plurality of objects included in the image by using the selection field 204.
- labels such as window frames, desks, and walls are added to each pixel by a method such as Semantic Segmentation on an image, and the target pixels are collectively selected by selecting the labels.
- FIG. 6 is a diagram showing an example of another selection method of the priority designation location at the time of priority designation.
- the priority point is selected by the user designating an arbitrary area (the area surrounded by a rectangle in the figure) by a pointer or a touch operation. It should be noted that any means may be used as long as the user can select a specific area, for example, the selection may be made by specifying a color, or a rough designation such as the right side area in the image may be made.
- an input method other than the operation on the screen may be used.
- the selection operation may be performed by voice input.
- input becomes easy when it is difficult to operate by hand, such as when wearing gloves in a cold region.
- the priority part may be added as appropriate during shooting. This makes it possible to select a part that was not shown in the initial state.
- FIG. 7 is a flowchart of the position / orientation estimation process (S103).
- the position / orientation estimation unit 103 acquires one or more images or videos from the video storage unit 111 (S301).
- the position / orientation estimation unit 103 calculates or acquires the position / orientation information, which consists of camera parameters including the three-dimensional position and orientation of the camera and the lens information (S302).
- the position / orientation estimation unit 103 calculates the position / orientation information by performing image processing such as SLAM or SfM on the image acquired in S301.
- the position / orientation estimation unit 103 stores the position / orientation information acquired in S302 in the camera attitude storage unit 112 (S303).
- the image input in S301 may be an image sequence consisting of a plurality of frames covering a certain period of time, and the processing from S302 onward may be performed on this image sequence (plurality of images).
- images may be sequentially input as in streaming, and the processing after S302 may be repeated for each input image.
- the accuracy can be improved by using information from a plurality of points in time.
- a fixed-length input delay can be guaranteed, and the waiting time required for 3D model generation can be reduced.
- FIG. 8 is a flowchart of the shooting position candidate determination process (S104).
- the image analysis unit 105 acquires a plurality of images or videos from the video storage unit 111 (S401).
- one of the acquired plurality of images is set as the key image.
- the key image is an image that serves as a reference when performing the three-dimensional reconstruction in the subsequent stage.
- the depth of each pixel of the key image is estimated using the information of an image other than the key image, and the three-dimensional reconstruction is performed using the estimated depth.
- the image analysis unit 105 acquires the position / orientation information of each of the plurality of images (S402).
- the image analysis unit 105 calculates the epipolar line between the images (between the cameras) using the position / orientation information of each image (S403).
- the image analysis unit 105 detects an edge in each image (S404). For example, the image analysis unit 105 detects an edge by a filter process such as a Sobel filter.
- the image analysis unit 105 calculates the angle between the epipolar line and the edge in each image (S405).
- the image analysis unit 105 calculates the degree of difficulty in restoring each pixel of the key image based on the angle obtained in S405 (S406).
- the restoration difficulty level may be set in a plurality of stages, or may be set in two stages of high / low. For example, if the angle is smaller than a predetermined value (for example, 5 degrees), the restoration difficulty may be set to high, and if the angle is larger than the predetermined value, the restoration difficulty may be set to low.
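- the two-stage rule above can be sketched as follows. The example assumes the epipolar line and the edge are each given as a 2D direction vector in the image; the function names are introduced here for illustration, and only the 5-degree default comes from the text.

```python
import math

def line_angle_deg(epipolar_dir, edge_dir):
    """Angle between two undirected 2D lines, in degrees within [0, 90]."""
    ex, ey = epipolar_dir
    gx, gy = edge_dir
    dot = ex * gx + ey * gy      # inner product
    cross = ex * gy - ey * gx    # 2D cross product (signed area)
    # Absolute values fold the result so that opposite directions count as parallel.
    return math.degrees(math.atan2(abs(cross), abs(dot)))

def restoration_difficulty(epipolar_dir, edge_dir, threshold_deg=5.0):
    """Return "high" when the edge is nearly parallel to the epipolar line."""
    angle = line_angle_deg(epipolar_dir, edge_dir)
    return "high" if angle < threshold_deg else "low"
```

- a zero-length vector yields an angle of 0 and is therefore classified as "high"; a multi-stage variant would replace the single threshold with several.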
- the image analysis unit 105 estimates, based on the restoration difficulty calculated in S406, from which position the region with high restoration difficulty would be easier to restore, and determines the estimated position as a shooting position candidate (S407).
- a region with high restoration difficulty is equivalent to a region lying on the same plane as the direction of movement between the cameras, and when the camera moves in a direction perpendicular to that plane, the epipolar line and the edge are no longer parallel. Therefore, if the camera is moving forward, the restoration difficulty of such a region can be lowered by moving the camera in the vertical direction or the horizontal direction.
- although the restoration difficulty level is obtained here through the processing from S401 to S406, the method is not limited to this as long as the restoration difficulty level can be calculated.
- the deterioration of image quality due to lens distortion is greater at the edges of the image than at the center of the image. Therefore, the image analysis unit 105 may determine an object that appears only on the edge of the screen in each image, and set a high degree of difficulty in restoring the area of the object. For example, the image analysis unit 105 may determine the field of view of the camera in the three-dimensional space from the position / orientation information, and determine the area that is displayed only at the edge of the screen from the overlap of the fields of view of each camera.
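- the edge-of-screen check can be illustrated with a simplified sketch. It assumes the object's pixel position in each image where it is visible is already known (for example, from projecting the field of view), and it treats a fixed fraction of the frame as the "edge"; `margin_ratio` and both function names are hypothetical.

```python
def in_border(pixel, width, height, margin_ratio=0.1):
    """True if `pixel` (x, y) falls within the outer margin of the frame."""
    x, y = pixel
    mx, my = width * margin_ratio, height * margin_ratio
    return x < mx or x >= width - mx or y < my or y >= height - my

def seen_only_at_border(projections, width, height, margin_ratio=0.1):
    """True if the object appears in at least one image and always at the border.

    projections: pixel coordinates of the object in each image where it is visible.
    """
    return bool(projections) and all(
        in_border(p, width, height, margin_ratio) for p in projections
    )
```

- regions for which `seen_only_at_border` returns True would then be given a high restoration difficulty.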
- FIG. 9 is a diagram showing a situation in which the camera and the object are viewed from above.
- FIG. 10 is a diagram showing an example of an image obtained by each camera in the situation shown in FIG.
- the epipolar line that can be calculated from the camera geometry is searched.
- NCC (Normalized Cross Correlation)
- the angle between the epipolar line and the edge on the image can be used as the difficulty level of the three-dimensional reconstruction.
- Information having the same meaning as the angle may be used.
- the epipolar line and the edge may each be treated as a vector, and the inner product of the epipolar line and the edge may be used.
- the epipolar line can be calculated using the fundamental matrix between the camera A and the camera B. This fundamental matrix can be calculated from the position / orientation information of the camera A and the camera B.
- the intrinsic matrices of the camera A and the camera B are K_A and K_B
- the relative rotation matrix of the camera B seen from the camera A is R
- the relative movement vector is T
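- for reference, the standard textbook relation between these quantities is written out below. It is not quoted from this disclosure and assumes the convention that a point X_A in the coordinates of the camera A maps to the coordinates of the camera B as X_B = R X_A + T. Under that assumption, the fundamental matrix F, the epipolar line l_B in the image of the camera B for a pixel x_A (homogeneous coordinates) of the camera A, and the epipolar constraint are:

```latex
[T]_\times =
\begin{pmatrix}
0 & -T_z & T_y \\
T_z & 0 & -T_x \\
-T_y & T_x & 0
\end{pmatrix},
\qquad
F = K_B^{-\top} \, [T]_\times \, R \, K_A^{-1},
\qquad
l_B = F \, x_A,
\qquad
x_B^{\top} F \, x_A = 0
```

- if the opposite convention is used for R and T, the roles of the camera A and the camera B in the expression are swapped accordingly.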
- FIGS. 11 to 13 are schematic views for explaining an example of determining a shooting position candidate.
- the edge having a high degree of difficulty in restoration is often a straight line or a line segment on a three-dimensional plane passing through a straight line connecting the three-dimensional positions of the camera A and the camera B.
- the straight line perpendicular to the plane passing through this straight line has a larger angle between the epipolar line and the edge in matching between the camera A and the camera B, and the difficulty of restoration becomes lower.
- the image analysis unit 105 determines the camera C as a shooting position candidate.
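- the geometric idea can be sketched as follows. This is an illustrative construction, not the procedure of the disclosure: it takes the plane through the two camera positions and one 3D point on the difficult edge, and places a candidate off that plane along its normal. The starting point (midpoint of the baseline), the `offset` distance, and the function names are assumptions.

```python
import math

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def candidate_position(cam_a, cam_b, edge_point, offset=0.2):
    """Return a position displaced from the baseline midpoint along the plane normal.

    cam_a, cam_b: 3D positions of the camera A and the camera B.
    edge_point: a 3D point on the edge with high restoration difficulty.
    offset: hypothetical displacement distance.
    """
    baseline = tuple(b - a for a, b in zip(cam_a, cam_b))
    to_edge = tuple(p - a for a, p in zip(cam_a, edge_point))
    normal = cross(baseline, to_edge)
    norm = math.sqrt(sum(n * n for n in normal))
    if norm == 0.0:
        raise ValueError("edge point lies on the baseline; the plane is undefined")
    midpoint = tuple((a + b) / 2 for a, b in zip(cam_a, cam_b))
    return tuple(m + offset * n / norm for m, n in zip(midpoint, normal))
```

- the sign of the normal is arbitrary here, so the mirrored position on the other side of the plane is an equally valid candidate.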
- the image analysis unit 105 may determine the shooting position candidate by the above method using the edge having the highest restoration difficulty as a candidate. Alternatively, the image analysis unit 105 may randomly select an edge as a candidate from the ten edges with the highest restoration difficulty.
- in the above, the camera C is calculated only from the information of one pair of cameras (camera A and camera B). When there are a plurality of edges having a high degree of difficulty in restoration, the image analysis unit 105 may, as shown in the figure, determine a shooting position candidate (camera C) for the first edge and a shooting position candidate (camera D) for the second edge, and output a path connecting the camera C and the camera D.
- the method for determining the shooting position candidate is not limited to this. Considering that the center of the screen is less affected by image quality deterioration such as distortion, the image analysis unit 105 may target an edge captured at the edge of the screen by the camera A.
- in that case, a position where that edge appears in the center of the screen may be determined as a shooting position candidate.
- FIG. 14 is a flowchart of the three-dimensional reconstruction process (S105).
- the three-dimensional reconstruction unit 104 acquires a plurality of images or videos from the video storage unit 111 (S501).
- the three-dimensional reconstruction unit 104 acquires the position / orientation information (camera parameters) of each of the plurality of images from the camera attitude storage unit 112 (S502).
- the three-dimensional reconstruction unit 104 generates a three-dimensional model by performing three-dimensional reconstruction using the acquired plurality of images and the plurality of position / orientation information (S503). For example, the three-dimensional reconstruction unit 104 performs three-dimensional reconstruction using the volume intersection (visual hull) method or SfM. Finally, the three-dimensional reconstruction unit 104 stores the generated three-dimensional model in the three-dimensional model storage unit 113 (S504).
- the processing of S503 does not have to be performed by the terminal device 100.
- the terminal device 100 transmits an image and camera parameters to a cloud server or the like.
- the cloud server generates a three-dimensional model by performing three-dimensional reconstruction.
- the terminal device 100 receives the three-dimensional model from the cloud server.
- the terminal device 100 can use a high-quality three-dimensional model regardless of the performance of the terminal device 100.
- FIG. 15 is a flowchart of the in-shooting display process (S102).
- the UI unit 108 displays the UI screen (S601).
- the UI unit 108 acquires and displays a captured image, which is the image currently being shot (S602).
- the UI unit 108 determines whether or not the shooting position candidate or the estimation failure signal has been received (S603).
- the estimation failure signal is a signal transmitted from the position / orientation estimation unit 103 when the position / orientation estimation in the position / orientation estimation unit 103 fails.
- the imaging position candidate is transmitted from the three-dimensional reconstruction unit 104 or the point cloud analysis unit 106.
- when the UI unit 108 receives the shooting position candidate (Yes in S603), it displays that there is a shooting position candidate (S604) and presents the shooting position candidate (S605). For example, the UI unit 108 may display the shooting position candidate visually via the UI, or may present it by sound using a sound output mechanism such as a speaker. Specifically, a voice instruction may be given, such as "Please lift 20 cm" when the terminal device 100 should be moved upward, or "Turn 45° to the right" when the right side should be shot. According to this, the user does not have to gaze at the screen of the terminal device 100 while shooting on the move, so that shooting can be performed safely.
- the presentation may be performed by vibration.
- a rule such as two short vibrations when moving up and one long vibration when turning to the right is predetermined, and it is possible to make a presentation according to the rule. In this case as well, since it is not necessary to gaze at the screen, safe shooting can be realized.
- when the UI unit 108 receives the estimation failure signal, it displays in S604 that the estimation has failed.
- the UI unit 108 determines whether or not there has been an instruction to end shooting (S606).
- the instruction to end the shooting may be, for example, given by operating the UI screen, or may be an instruction by voice. Alternatively, an instruction may be given by gesture input such as shaking the terminal device 100 twice.
- when there is an instruction to end shooting (Yes in S606), the UI unit 108 transmits a shooting end signal to the imaging unit 101 to notify it of the end of shooting (S607). If there is no instruction to end shooting (No in S606), the UI unit 108 repeats the processing from S601 onward.
- FIG. 16 is a diagram showing an example of a method of visually presenting a shooting position candidate.
- the shooting position candidate is a position higher than the current position and the shooting is specified from the position higher than the current position.
- the up arrow 211 is presented on the screen.
- the UI unit 108 may change the display form (color, size, etc.) of the arrow according to the distance from the current position to the shooting position candidate. For example, the UI unit 108 may display a large red arrow when the current position is far from the shooting position candidate, and may display a small green arrow as the current position approaches the shooting position candidate.
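- a distance-dependent display form could be selected as in the sketch below. The distance thresholds `far_m` and `near_m` and the intermediate yellow / medium style are assumptions added for illustration; only the large red and small green arrows come from the text.

```python
def arrow_style(distance_m, far_m=1.0, near_m=0.2):
    """Pick an arrow color and size from the distance to the shooting position candidate."""
    if distance_m >= far_m:
        return {"color": "red", "size": "large"}
    if distance_m <= near_m:
        return {"color": "green", "size": "small"}
    # Hypothetical intermediate style between the two cases named in the text.
    return {"color": "yellow", "size": "medium"}
```

- the same mapping could drive the color or thickness of a frame instead of an arrow.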
- when there is no shooting position candidate (that is, when there is no area with a high degree of restoration difficulty in the current shooting), the UI unit 108 may indicate that the current shooting is good by not displaying an arrow or by presenting a symbol.
- FIG. 17 is a diagram showing another example of a method of visually presenting a shooting position candidate.
- the UI unit 108 displays a dotted line frame 212 marking the area that will come to the center of the screen when the camera moves to the shooting position candidate, and indicates to the user to move the camera so that the dotted line frame 212 approaches the center of the screen.
- the color or thickness of the frame may indicate the distance from the current position to the shooting position candidate.
- the UI unit 108 may display a message 213 such as an alert, as shown in the figure. Further, the UI unit 108 may switch the instruction method depending on the situation. For example, the UI unit 108 may show a small indication immediately after the start of the instruction and enlarge it after a certain period of time. Alternatively, the UI unit 108 may issue an alert based on the time elapsed from the start of the instruction. For example, the UI unit 108 may issue an alert when the instruction has not been followed one minute after the start of the instruction.
- the UI unit 108 may present rough information when the current position is far from the shooting position candidate, and display a frame when the distance is short enough to be expressed on the screen.
- the display method of the shooting position candidate may be a method other than the illustrated method.
- the UI unit 108 may display shooting position candidates on map information (two-dimensional or three-dimensional). According to this, the user can intuitively grasp the direction of movement.
- an instruction may be given to a moving body equipped with a camera such as a robot or a drone.
- the function of the terminal device 100 may be included in the mobile body. That is, the moving body may move to the determined shooting position candidate and perform shooting. According to this, a highly accurate three-dimensional model can be stably generated even in an automatically controlled device.
- the information of the pixels determined to have a high degree of difficulty in restoration may be utilized when performing three-dimensional reconstruction in the terminal device 100 or in the server.
- the terminal device 100 or the server may determine a three-dimensional point reconstructed using pixels determined to have a high degree of difficulty in restoration as a region or a point having low accuracy.
- metadata indicating a region or a point having low accuracy may be added to the three-dimensional model or the three-dimensional point. According to this, it is possible to determine in the post-processing whether the generated three-dimensional point has high accuracy or low accuracy. For example, the degree of correction in the three-dimensional point filtering process can be switched according to the accuracy.
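- the metadata idea can be sketched as follows. The data layout is an assumption: each three-dimensional point is taken to remember the key-image pixel it was reconstructed from, and the `weak` / `strong` correction values are placeholders rather than values from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Point3D:
    position: tuple          # (x, y, z)
    source_pixel: tuple      # (u, v) pixel the point was reconstructed from
    low_accuracy: bool = False  # metadata flag attached to the point

def tag_low_accuracy(points, difficult_pixels):
    """Flag points reconstructed from pixels judged to have high restoration difficulty."""
    for p in points:
        p.low_accuracy = p.source_pixel in difficult_pixels
    return points

def filter_strength(point, weak=0.2, strong=0.8):
    """Choose the degree of correction for a later filtering step from the flag."""
    return strong if point.low_accuracy else weak
```

- post-processing can then read `low_accuracy` without access to the original images.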
- the photographing instruction device performs the processing shown in FIG.
- the shooting instruction device (for example, the terminal device 100) detects, based on a plurality of images of a subject and the shooting position and posture of each of the plurality of images, a region (second region) in which it is difficult to generate a three-dimensional model of the subject using the plurality of images (S701).
- the shooting instruction device instructs at least one of the shooting position and the posture so that an image that facilitates the generation of the three-dimensional model of the detected region is captured (S702). According to this, the accuracy of the three-dimensional model can be improved.
- the images that facilitate the generation of a three-dimensional model include at least one of: (1) an image of a region that is not captured from some of a plurality of shooting viewpoints; (2) an image of a region with little blur; (3) an image of a region that has many feature points because its contrast is higher than that of other regions; (4) an image of a region that is closer to the shooting viewpoint than other regions and for which, when the three-dimensional position is calculated, the error between the calculated three-dimensional position and the actual position is estimated to be small; and (5) an image of a region where the influence of lens distortion is smaller than in other regions.
- the photographing instruction device further accepts the designation of a priority area (first area), and in the instruction (S702), instructs at least one of the shooting position and the posture so that an image that facilitates the generation of the three-dimensional model of the designated priority area is captured. According to this, the accuracy of the three-dimensional model of the area required by the user can be improved preferentially.
- the shooting instruction device accepts the designation of a first region (for example, a priority region) for generating a three-dimensional model of the subject based on a plurality of images of the subject and the shooting position and posture of each of the plurality of images (S205 in FIG. 3), and instructs at least one of the shooting position and the posture so that an image used for generating the three-dimensional model of the designated first region is captured. According to this, the accuracy of the three-dimensional model of the area required by the user can be improved preferentially.
- a second region in which it is difficult to generate the three-dimensional model is detected (S701), and at least one of the shooting position and the posture is instructed so that an image that facilitates the generation of the three-dimensional model of the second region is captured (S702). In the instruction (S702) corresponding to the first region, at least one of the shooting position and the posture is instructed so that an image that facilitates the generation of the three-dimensional model of the first region is captured.
- the shooting instruction device displays an image of the subject on which the attribute recognition has been executed, and the designation of the attribute is accepted in the designation of the first area.
- the imaging instruction device instructs at least one of the imaging position and the posture so as to capture an image in which the angle difference is larger than the value.
- the plurality of images are a plurality of frames included in the moving image currently being captured and displayed, and the instruction (S702) corresponding to the second region is performed in real time. According to this, it is possible to improve the convenience of the user by giving a shooting instruction in real time.
- the shooting direction is instructed.
- the user can easily perform a suitable shooting according to the instruction.
- the direction in which the next imaging position exists with respect to the current position is presented.
- the photographing area is indicated. According to this, the user can easily perform a suitable shooting according to the instruction.
- the shooting instruction device includes a processor and a memory, and the processor performs the above processing using the memory.
- the three-dimensional model is a representation of the captured measurement target on a computer.
- the three-dimensional model has, for example, position information of each three-dimensional location on the measurement target.
- if an appropriate image is not obtained, the 3D model may fail to be reconstructed, or the accuracy of the 3D model may decrease.
- the accuracy is an error between the position information of the three-dimensional model and the actual position.
- information for assisting the shooting is presented to the user during the shooting. As a result, the user can efficiently take an appropriate image. In addition, the accuracy of the generated three-dimensional model can be improved.
- the non-photographed area may include an area that has not yet been photographed at that point in time while the target space is being photographed (for example, an area hidden by another object), and an area that has been photographed but for which a three-dimensional point could not be obtained.
- a region where 3D reconstruction (generation of a 3D model) is difficult is detected, and the detected region is presented to the user. As a result, the efficiency of photographing can be improved, and the failure of reconstruction of the three-dimensional model or the decrease in accuracy can be suppressed.
- FIG. 20 is a diagram showing a configuration of a three-dimensional reconstruction system according to the present embodiment.
- the three-dimensional reconstruction system includes a photographing device 301 and a reconstruction device 302.
- the photographing device 301 is a terminal device used by the user, and is, for example, a mobile terminal such as a tablet terminal, a smartphone, or a notebook personal computer.
- the photographing device 301 has a photographing function, a function of estimating the position and orientation of the camera (hereinafter referred to as a position / orientation), a function of displaying a photographed area, and the like. Further, the photographing device 301 sends the photographed image and the position / orientation to the reconstructing device 302 during and after the photographing.
- the image is, for example, a moving image. The image may be a plurality of still images. Further, the photographing device 301 estimates the position / orientation during photographing, determines the photographed area using at least one of the position / orientation and the three-dimensional point cloud, and presents the photographed area to the user.
- the reconstruction device 302 is, for example, a server connected to the photographing device 301 via a network or the like.
- the reconstructing device 302 acquires an image captured by the photographing device 301, and generates a three-dimensional model using the acquired image.
- the reconstructing device 302 may use the camera position / orientation estimated by the photographing device 301, or may estimate the camera position from the acquired image.
- the data transfer between the photographing device 301 and the reconstructing device 302 may be performed offline via an HDD (hard disk drive) or the like, or may be performed constantly via a network.
- the three-dimensional model generated by the reconstruction device 302 may be a three-dimensional point cloud (point cloud) in which the three-dimensional space is densely restored, or may be a set of three-dimensional meshes. Further, the three-dimensional point cloud generated by the photographing apparatus 301 is a set of three-dimensional points obtained by sparsely three-dimensionally restoring characteristic points such as corners of an object in space. That is, the three-dimensional model (three-dimensional point cloud) generated by the photographing apparatus 301 is a model having a lower spatial resolution than the three-dimensional model generated by the reconstructing apparatus 302. In other words, the three-dimensional model (three-dimensional point cloud) generated by the photographing apparatus 301 is a simpler model than the three-dimensional model generated by the reconstructing apparatus 302.
- the simple model is, for example, a model with a small amount of information, a model with easy generation, or a model with low accuracy.
- the three-dimensional model generated by the photographing apparatus 301 is a group of three-dimensional points sparser than the three-dimensional model generated by the reconstructing apparatus 302.
- FIG. 21 is a block diagram of the photographing device 301.
- the photographing device 301 includes an imaging unit 311, a position / orientation estimation unit 312, a position / orientation integration unit 313, an area detection unit 314, a UI unit 315, a control unit 316, an image storage unit 317, a position / orientation storage unit 318, and an area information storage unit 319.
- the image pickup unit 311 is an image pickup device such as a camera, and acquires an image (moving image). In the following, an example in which a moving image is mainly used will be described, but a plurality of still images may be used instead of the moving image.
- the image capturing unit 311 stores the acquired image in the image storage unit 317.
- the imaging unit 311 may capture a visible light image or a non-visible light image (for example, an infrared image). When an infrared image is used, it is possible to take an image even in a dark environment such as at night.
- the imaging unit 311 may be a monocular camera, or may have a plurality of cameras such as a stereo camera.
- the image pickup unit 311 may be a device capable of capturing a depth image such as an RGB-D sensor. In this case, since a depth image that is three-dimensional information can be acquired, the accuracy of estimating the camera position and orientation can be improved. In addition, the depth image can be used as alignment information at the time of integration of the three-dimensional posture described later.
- the control unit 316 controls the entire imaging process and the like of the photographing device 301.
- the position / orientation estimation unit 312 estimates the three-dimensional position and orientation (position / orientation) of the camera that captured the image using the image stored in the image storage unit 317. Further, the position / orientation estimation unit 312 stores the estimated position / orientation in the position / orientation storage unit 318.
- the position / orientation estimation unit 312 uses image processing such as SLAM (Simultaneous Localization and Mapping) to estimate the position / orientation.
- alternatively, the position / orientation estimation unit 312 may calculate the position and orientation of the camera using the information obtained by various sensors (such as GPS or an acceleration sensor) included in the photographing device 301. In the former case (image processing), the position and orientation can be estimated from the information from the imaging unit 311. In the latter case (sensors), the estimation can be realized with a lower processing load than image processing.
- the position / orientation integration unit 313 integrates the position / orientation of the camera estimated in each shooting when a plurality of shootings are performed in one environment, and calculates positions / orientations that can be handled in the same space. Specifically, the position / orientation integration unit 313 uses the three-dimensional coordinate axes of the position / orientation obtained in the first shooting as the reference coordinate axes. Then, the position / orientation integration unit 313 converts the coordinates of the position / orientation obtained in the second and subsequent shootings into coordinates in the space of the reference coordinate axes.
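- the coordinate conversion can be sketched as a rigid transform. The example assumes the rotation R (3x3) and translation t that align a later shooting with the reference coordinate axes are already known, for example from some alignment step; how they are obtained is not covered here, and the function names are hypothetical.

```python
def to_reference_frame(position, rotation, translation):
    """Map a camera position into the reference frame: p_ref = R p + t."""
    return tuple(
        sum(rotation[i][j] * position[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

def compose_rotation(rotation, camera_rotation):
    """Map a camera orientation into the reference frame: R_ref = R R_cam."""
    return [
        [sum(rotation[i][k] * camera_rotation[k][j] for k in range(3))
         for j in range(3)]
        for i in range(3)
    ]
```

- applying both functions to every pose of the second and subsequent shootings places all poses in the coordinate space of the first shooting.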
- the area detection unit 314 uses the image stored in the image storage unit 317 and the position / orientation stored in the position / orientation storage unit 318 to detect, in the target space, an area that cannot be three-dimensionally reconstructed or an area where the accuracy of three-dimensional reconstruction is low.
- the area in which three-dimensional reconstruction is not possible in the target space is, for example, an area in which an image has not been taken.
- the region with low accuracy of three-dimensional reconstruction is, for example, a region in which the number of images captured of the region is small (less than a predetermined number). Alternatively, the region with low accuracy is a region in which, when three-dimensional position information is generated, the error between the generated three-dimensional position information and the actual position is large. Further, the area detection unit 314 stores the information of the detected area in the area information storage unit 319.
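The image-count criterion above can be illustrated with a minimal sketch. The `(image_id, region_id)` observation format, the function name, and the default threshold of 3 are assumptions made for illustration; the source only says "less than a predetermined number".

```python
from collections import Counter

def find_low_accuracy_regions(observations, all_regions, min_images=3):
    """Classify regions by how many distinct images observed them.

    observations: iterable of (image_id, region_id) pairs (hypothetical format).
    Returns (unphotographed, low_accuracy) as sorted lists of region ids.
    """
    # Deduplicate so that one image counts once per region.
    seen = {(img, reg) for img, reg in observations}
    counts = Counter(reg for _, reg in seen)
    # No image at all: three-dimensional reconstruction is not possible.
    unphotographed = sorted(r for r in all_regions if counts[r] == 0)
    # Some images, but fewer than the threshold: low accuracy.
    low_accuracy = sorted(r for r in all_regions if 0 < counts[r] < min_images)
    return unphotographed, low_accuracy
```

How the target space is divided into regions is left open here; a voxel grid or image tiles would both fit this interface.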
- the area detection unit 314 may detect an area capable of three-dimensional reconstruction and determine that the area other than the area cannot be three-dimensionally reconstructed.
- the information stored in the area information storage unit 319 may be two-dimensional information superimposed on the image, or may be three-dimensional information such as three-dimensional coordinate information.
- the UI unit 315 presents the captured image and the area information detected by the area detection unit 314 to the user.
- the UI unit 315 has an input function for the user to input a shooting start instruction and a shooting end instruction.
- the UI unit 315 is a display with a touch panel.
- FIG. 22 is a flowchart showing the operation of the photographing device 301.
- shooting is started and stopped according to a user's instruction. Specifically, shooting is started by pressing the shooting start button on the UI.
- when the shooting start instruction is input (Yes in S801), the imaging unit 311 starts shooting.
- the captured image is stored in the image storage unit 317.
- the position / orientation estimation unit 312 calculates the position / orientation every time an image is added (S802).
- the calculated position / orientation is stored in the position / orientation storage unit 318.
- the generated three-dimensional point cloud is also saved.
- the position / attitude integration unit 313 integrates the position / attitude (S803). Specifically, the position / orientation integration unit 313 uses the position / orientation estimation result and the image to determine whether the three-dimensional coordinate space of the positions and orientations of the images taken so far and that of the newly taken image can be integrated and, if so, integrates them. That is, the position / orientation integration unit 313 converts the coordinates of the position / orientation of the newly captured image into the coordinate system of the positions and orientations obtained so far. As a result, a plurality of positions and orientations are represented in one three-dimensional coordinate space. Therefore, the data obtained in a plurality of shootings can be used in common, and the accuracy of estimating the position and orientation can be improved.
- the area detection unit 314 detects the photographed area and the like (S804). Specifically, the area detection unit 314 generates three-dimensional position information (a three-dimensional point cloud, a three-dimensional model, a depth image, etc.) using the estimation result of the position and orientation and the image, and uses the generated three-dimensional position information to detect areas where three-dimensional reconstruction is not possible or areas where the accuracy of three-dimensional reconstruction is low. Further, the area detection unit 314 stores the information of the detected area in the area information storage unit 319. Next, the UI unit 315 displays the information of the area obtained by the above processing (S805).
- these series of processes are repeated until the end of shooting (S806). For example, these processes are repeated every time an image of one or a plurality of frames is acquired.
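The repeated sequence S802 to S805 can be sketched as a simple loop. This is only an illustration of the control flow in FIG. 22: the four callables are hypothetical stand-ins for the units 312 to 315, and the end of the frame source stands in for the shooting end instruction (S806).

```python
def run_capture_session(frames, estimate_pose, integrate, detect_areas, display):
    """Run S802-S805 once per acquired frame until the frame source ends (S806)."""
    images, poses = [], []
    for frame in frames:
        images.append(frame)                      # image storage unit 317
        poses.append(estimate_pose(images))       # S802, stored as in unit 318
        integrated = integrate(poses)             # S803
        areas = detect_areas(images, integrated)  # S804
        display(frame, areas)                     # S805
    return poses
```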
- FIG. 23 is a flowchart of the position / orientation estimation process (S802).
- the position / orientation estimation unit 312 acquires an image from the image storage unit 317 (S811).
- the position / orientation estimation unit 312 calculates the position / orientation of the camera in each image using the acquired images (S812).
- the position / orientation estimation unit 312 calculates the position / orientation using image processing such as SLAM or SfM (Structure from Motion).
- when the photographing device 301 includes sensors such as GPS or an acceleration sensor, the position / orientation estimation unit 312 may estimate the position / orientation by utilizing the information obtained by these sensors.
- the position / orientation estimation unit 312 may use the result obtained by pre-calibrating as a camera parameter such as the focal length of the lens. Alternatively, the position / orientation estimation unit 312 may calculate the camera parameters at the same time as the position / orientation estimation.
- the position / orientation estimation unit 312 stores the calculated position / orientation information in the position / attitude storage unit 318 (S813). If the calculation of the position / posture information fails, the information indicating the failure may be stored in the position / posture storage unit 318. This makes it possible to know the location and time of failure, and what kind of image the failure occurred in, and this information can be utilized when re-shooting.
- FIG. 24 is a flowchart of the position / orientation integration process (S803).
- the position / orientation integration unit 313 acquires an image from the image storage unit 317 (S821).
- the position / attitude integration unit 313 acquires the current position / attitude (S822).
- the position / orientation integration unit 313 acquires at least one image and position / orientation of the captured path other than the current photographing path (S823).
- the imaged path can be generated from, for example, the time-series information of the position and orientation of the camera obtained by SLAM. Further, the information on the imaged path is stored in, for example, the position / orientation storage unit 318.
- the SLAM result is saved for each shooting trial, and in the Nth shooting (current route), the three-dimensional coordinate axes of the Nth shooting are integrated with the three-dimensional coordinate axes of the 1st to (N-1)th results (past routes).
- the position information obtained by GPS or Bluetooth may be used instead of the result of SLAM.
- the position / attitude integration unit 313 determines whether or not integration is possible (S824). Specifically, the position / orientation integration unit 313 determines whether the acquired position / orientation and image of the photographed route are similar to the current position / orientation and image; if they are similar, it determines that integration is possible, and if they are not similar, it determines that integration is not possible. More specifically, the position / orientation integration unit 313 calculates, from each image, a feature amount expressing the features of the entire image, and determines whether the images have similar viewpoints by comparing these feature amounts. Further, when the photographing device 301 has GPS and the absolute position of the photographing device 301 is known, the position / orientation integration unit 313 may use that information to identify an image taken at the same position as, or close to, the current image.
- the position / attitude integration unit 313 performs path integration processing (S825). Specifically, the position-posture integration unit 313 calculates a three-dimensional relative position between the current image and a reference image obtained by capturing a region similar to the image. The position / orientation integration unit 313 calculates the coordinates of the current image by adding the calculated three-dimensional relative position to the coordinates of the reference image.
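One way to read the path integration step is as a composition of rigid transforms. The sketch below assumes that each pose is stored as a camera-to-world rotation matrix and position, and that the relative pose of the current camera is expressed in the reference camera's frame; the source does not fix these conventions, so they are assumptions.

```python
def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def compose_pose(ref_rot, ref_pos, rel_rot, rel_pos):
    """Pose of the current image in the reference route's coordinate system.

    The relative offset is rotated into the reference frame and then added
    to the reference position; the rotations are chained.
    """
    cur_rot = mat_mul(ref_rot, rel_rot)
    cur_pos = [p + q for p, q in zip(mat_vec(ref_rot, rel_pos), ref_pos)]
    return cur_rot, cur_pos
```

With a reference camera at (1, 0, 0) rotated 90 degrees about the z axis and a relative offset of 1 m along the camera's x axis, the current camera lands at (1, 1, 0).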
- FIG. 25 is a plan view showing a state of photography in the target space.
- the route C is the route of the camera A, for which photographing has been completed and the position and orientation have been estimated. Note that the figure shows a case where the camera A exists at a predetermined position on the path C. Further, the path D is the path of the camera B, which is currently photographing. At this time, the current camera B and the camera A are taking images with the same field of view. Although an example in which two images are obtained by different cameras is shown here, two images taken by the same camera at different times may be used.
- FIG. 26 is a diagram showing an example of an image and an example of comparison processing in this case.
- the position / orientation integration unit 313 extracts feature amounts such as ORB (Oriented FAST and Rotated BRIEF) feature amounts from each image, and derives the feature amount of the entire image based on their distribution or number. For example, the position / orientation integration unit 313 clusters the features appearing in the image as in Bag of Words, and uses the histogram over the clusters as the feature amount of the entire image.
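A minimal sketch of a Bag-of-Words style histogram follows. It assumes binary descriptors (as ORB produces) represented as Python integers and a pre-built vocabulary of cluster centers in the same form; how the vocabulary is trained is not described in the source and is outside this sketch.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def bow_histogram(descriptors, vocabulary):
    """Normalized histogram of nearest vocabulary words for one image."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        # Assign each descriptor to its closest cluster center.
        nearest = min(range(len(vocabulary)),
                      key=lambda i: hamming(d, vocabulary[i]))
        hist[nearest] += 1
    total = sum(hist)
    # Normalize so that images with different feature counts are comparable.
    return [h / total for h in hist] if total else hist
```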
- the position / orientation integration unit 313 compares the feature amounts of the entire image between the images and, when it determines that the images capture the same location, performs feature point matching between the images to calculate the relative three-dimensional position between the cameras. In other words, the position / orientation integration unit 313 searches the plurality of images of the captured route for an image whose whole-image feature amount is similar to that of the current image.
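The search for a similar image can be sketched as a nearest-neighbor lookup over whole-image feature vectors. Cosine similarity and the 0.8 threshold are assumptions chosen for illustration; the source does not name a similarity measure.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def find_similar_image(current_hist, past_hists, threshold=0.8):
    """Return the id of the most similar past image, or None if none is similar.

    past_hists maps image id -> whole-image feature vector (hypothetical format).
    """
    best_id, best_score = None, threshold
    for image_id, hist in past_hists.items():
        score = cosine_similarity(current_hist, hist)
        if score >= best_score:
            best_id, best_score = image_id, score
    return best_id
```

A returned id would then be the candidate for feature point matching; `None` corresponds to judging that integration is not possible.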
- the position-posture integration unit 313 converts the three-dimensional position of the path D into the coordinate system of the path C based on this relative positional relationship. This makes it possible to represent a plurality of paths in one coordinate system. In this way, the position and orientation in a plurality of paths can be integrated.
- the position / orientation integration unit 313 may perform the integration process using the detection result.
- the position / orientation integration unit 313 may perform processing using the detection result of the sensor without performing the processing using the above image, or may use the detection result of the sensor in addition to the image.
- the position / orientation integration unit 313 may use GPS information to narrow down the images to be compared.
- the position / orientation integration unit 313 may set, as comparison targets, images whose positions are within ±0.001 degrees of the latitude / longitude of the current camera position obtained by GPS. As a result, the amount of processing can be reduced.
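The GPS narrowing step amounts to a bounding-box filter on latitude and longitude. In the sketch below, the record format (image id mapped to a `(lat, lon)` pair in degrees) is an assumption.

```python
def gps_candidates(current, records, max_delta_deg=0.001):
    """Ids of past images taken within max_delta_deg of the current position.

    current: (lat, lon) of the current camera position in degrees.
    records: dict mapping image id -> (lat, lon) in degrees (hypothetical format).
    """
    lat0, lon0 = current
    return [image_id for image_id, (lat, lon) in records.items()
            if abs(lat - lat0) <= max_delta_deg
            and abs(lon - lon0) <= max_delta_deg]
```

Only the returned candidates would then go through the more expensive image comparison.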
- FIG. 27 is a flowchart of the area detection process (S804).
- the area detection unit 314 acquires an image from the image storage unit 317 (S831).
- the area detection unit 314 acquires the position / orientation from the position / orientation storage unit 318 (S832).
- the area detection unit 314 acquires a three-dimensional point cloud indicating the three-dimensional position of the feature point generated by SLAM or the like from the position / orientation storage unit 318.
- the area detection unit 314 detects an unphotographed area using the acquired image, position / orientation, and three-dimensional point cloud (S833). Specifically, the area detection unit 314 projects the three-dimensional point cloud into the image, and determines that the periphery of each pixel on which a three-dimensional point is projected (within a predetermined distance from the pixel) is a restorable area (photographed area). The farther the projected three-dimensional point is from the shooting position, the larger the predetermined distance may be.
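The projection step can be sketched with a pinhole camera model. The sketch assumes the three-dimensional points are already expressed in the camera coordinate system (z pointing forward), that lens distortion is ignored, and that the radius grows linearly with depth; the intrinsics `fx, fy, cx, cy` and both radius parameters are hypothetical.

```python
def photographed_mask(points_cam, width, height, fx, fy, cx, cy,
                      base_radius=1.0, radius_per_meter=0.5):
    """Boolean mask (rows x cols) marking pixels near projected 3D points."""
    mask = [[False] * width for _ in range(height)]
    for x, y, z in points_cam:
        if z <= 0:
            continue  # point is behind the camera
        # Pinhole projection into pixel coordinates.
        u = fx * x / z + cx
        v = fy * y / z + cy
        # Radius of the marked neighborhood; larger for points farther away.
        r = base_radius + radius_per_meter * z
        for row in range(max(0, int(v - r)), min(height, int(v + r) + 1)):
            for col in range(max(0, int(u - r)), min(width, int(u + r) + 1)):
                if (col - u) ** 2 + (row - v) ** 2 <= r * r:
                    mask[row][col] = True
    return mask
```

Pixels left `False` correspond to the unphotographed area in this reading.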
- the area detection unit 314 may estimate the recoverable area from the parallax image. Further, when an RGB-D camera is used, the determination may be made using the obtained depth value. The details of the determination process of the photographed area will be described later. Further, the area detection unit 314 may not only determine whether the three-dimensional model can be generated, but also determine the accuracy when the three-dimensional reconstruction is performed.
- This area information may be, for example, an image in which the information of each area is superimposed on the captured image, or information in which the information of each area is arranged in a three-dimensional space such as a three-dimensional map.
- FIG. 28 is a flowchart of the display process (S805).
- the UI unit 315 confirms whether there is information to be displayed (S841). Specifically, the UI unit 315 confirms whether or not there is a newly added image in the image storage unit 317, and determines that there is display information if there is. Further, the UI unit 315 confirms whether or not there is any information newly added to the area information storage unit 319, and determines that there is display information if there is any.
- the UI unit 315 acquires display information such as images and area information (S842). Next, the UI unit 315 displays the acquired display information (S843).
- FIG. 29 is a diagram showing an example of a UI screen displayed by the UI unit 315.
- the UI screen includes a shooting start / shooting stop button 321, a shooting image 322, area information 323, and a character display area 324.
- the shooting start / shooting stop button 321 is an operation unit with which the user instructs the start and stop of shooting.
- the image being photographed 322 is an image currently being photographed.
- the area information 323 displays a photographed area, a low-precision area, and the like.
- the low-precision region is a region in which the accuracy of the generated three-dimensional model is low when the three-dimensional model is generated using the captured image.
- in the character display area 324, information on an unphotographed area or a low-precision area is indicated by characters. In addition, voice or the like may be used instead of characters.
- FIG. 30 is a diagram showing an example of area information 323.
- an image in which information indicating each region is superimposed on the image being photographed is used. Further, as the shooting viewpoint moves, the display of the area information 323 is changed in real time. As a result, the user can easily grasp the captured area and the like while referring to the image in the current camera viewpoint. For example, the captured area and the low-precision area are displayed in different colors. In addition, the area that has been photographed in the current route and the area that has been photographed in another route are displayed in different colors.
- the information indicating each region may be information superimposed on the image, or may be represented by characters or symbols. That is, the information may be any information that allows the user to visually determine each area.
- the user knows that the uncolored area and the low-precision area should be photographed, so that it is possible to avoid missing a photograph. Further, in order to execute the path integration process, it is possible to present the user with an area to be photographed so that the photographed area is continuous with the photographed area of another route.
- a display that is easy for the user to pay attention to, such as blinking, may be used for an area that the user wants to pay attention to, such as a low-precision area.
- the image on which the information of the area is superimposed may be a past image.
- the area information may be superimposed on the image being photographed 322.
- these display methods may be switched according to the type of terminal. For example, in a terminal having a small screen size such as a smartphone, area information may be superimposed on the image being photographed 322, and in a terminal having a large screen size such as a tablet terminal, the image being photographed 322 and the area information 323 may be displayed individually.
- the UI unit 315 presents, in text or voice, how many meters before the current position the low-precision region occurred. This distance can be calculated from the result of position estimation. Using character information allows the information to be conveyed to the user accurately. In addition, when voice is used, the user does not need to look away from the scene during shooting, so the notification can be made safely.
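One possible way to obtain this distance from the position estimates is to sum the path length between the position where the low-precision region was detected and the current position. The list-of-positions format and the index argument are assumptions; the source only states that the distance follows from the position estimation result.

```python
import math

def distance_back_along_path(positions, event_index):
    """Path length in meters from positions[event_index] to the latest position.

    positions: estimated camera positions (x, y, z) in time order.
    event_index: index at which the low-precision region was detected.
    """
    return sum(math.dist(positions[i], positions[i + 1])
               for i in range(event_index, len(positions) - 1))
```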
- the information of the area may be superimposed on the real space by AR (Augmented Reality) glass or HUD (Head-Up Display). According to this, the affinity with the image seen from the actual viewpoint is improved, and the part to be photographed can be intuitively presented to the user.
- FIG. 31 is a diagram showing a display example in this case.
- the image when the position / orientation estimation fails may be displayed in the area information 323.
- characters or voices prompting the resumption of shooting from that point may be presented. This allows the user to quickly redo in the event of a failure.
- the photographing device 301 may predict the failure position from the elapsed time from the failure and the moving speed, and present information such as "Please return 5 m before" by voice or text.
- the photographing device 301 may display a two-dimensional map or a three-dimensional point map and present the shooting failure position on the map.
- the photographing device 301 may detect that the user has returned to the failure position and notify the user to that effect by characters, images, sounds, vibrations, or the like. For example, it is possible to detect that the user has returned to the failure position by using the feature amount of the entire image.
- the photographing device 301 may instruct the user to shoot again and may indicate the shooting method to use.
- the photographing method is, for example, taking a larger image of the area.
- the photographing apparatus 301 may give this instruction using characters, images, sounds, vibrations, or the like.
- the quality of the acquired data can be improved, so that the accuracy of the generated 3D model can be improved.
- FIG. 32 is a diagram showing a display example when a low-precision region is detected.
- instructions are given to the user using characters.
- the character display area 324 indicates that the low precision area has been detected.
- an instruction to the user for shooting a low-precision area such as "Please go back 5 m" is displayed.
- these displays may be turned off.
- FIG. 33 is a diagram showing another example of instructions to the user.
- the arrow 325 shown in FIG. 33 may be used to present the direction and distance in which the user should move.
- FIG. 34 is a diagram showing an example of this arrow.
- the moving direction is indicated by the angle of the arrow, and the distance to the moving destination is indicated by the size of the arrow.
- the display form of the arrow may be changed according to the distance.
- the display form may be a color, or may be the presence / absence or size of an effect.
- the effect is, for example, blinking, movement, scaling, or the like.
- the darkness of the arrow may be changed. For example, the closer the distance, the larger the effect, or the darker the color of the arrow. Further, a plurality of these may be combined.
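The arrow parameters described above can be derived from the camera pose and the target position. The sketch below works in a two-dimensional plan view; the dictionary keys, the 10 m clamp, and the opacity range are all hypothetical choices, not values from the source.

```python
import math

def arrow_style(camera_xy, heading_rad, target_xy, max_distance=10.0):
    """Angle, relative length, and opacity of a guidance arrow."""
    dx = target_xy[0] - camera_xy[0]
    dy = target_xy[1] - camera_xy[1]
    distance = math.hypot(dx, dy)
    # Direction of the target relative to the camera heading, wrapped to [-pi, pi].
    angle = math.atan2(dy, dx) - heading_rad
    angle = math.atan2(math.sin(angle), math.cos(angle))
    clamped = min(distance, max_distance) / max_distance
    return {
        "angle_deg": math.degrees(angle),   # moving direction
        "length": clamped,                  # arrow size encodes distance
        "opacity": 0.3 + 0.7 * (1.0 - clamped),  # darker when closer
    }
```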
- the display is not limited to the arrow, and any display such as a triangle or a finger icon may be used as long as the display can recognize the direction.
- the area information 323 may be superimposed on the image from a third party's viewpoint such as a plan view or a perspective view instead of being superimposed on the image being photographed.
- a photographed area or the like may be superimposed and displayed on the three-dimensional map.
- FIG. 35 is a diagram showing an example of area information 323 in this case.
- FIG. 35 is a plan view of a three-dimensional map, and shows a photographed area and a low-precision area. Further, the current camera position 331 (position and direction of the photographing device 301), the current route 332, and the past route 333 are shown.
- when map information is available, the imaging device 301 may utilize it. Further, when GPS or the like can be used, the photographing device 301 may create map information based on the latitude / longitude information obtained by GPS.
- the display from a third-party perspective may be used in the same manner without a map.
- the visibility is lowered, but the user can grasp the positional relationship between the current shooting position and the low-precision area.
- the user can confirm the situation in which the area where the object is expected to be present is not captured. Therefore, even in this case, it is possible to improve the efficiency of shooting.
- map information viewed from another viewpoint may be used.
- the photographing device 301 may have a function of changing the viewpoint of the three-dimensional map.
- a UI for changing the viewpoint by the user may be used.
- the photographing apparatus 301 may display both the area information 323 shown in FIG. 29 and the like described above and the area information shown in FIG. 35, or may have a function of switching between these displays.
- in SLAM, in addition to estimating the position of the camera, a three-dimensional point cloud relating to feature points such as corners of objects in the image is generated. Since it can be determined that three-dimensional modeling is possible for the region where the three-dimensional points are generated, the photographed region can be presented on the image by projecting the three-dimensional points into the image.
- FIG. 36 is a plan view showing a shooting situation of the target area. Black circles in the figure indicate the generated three-dimensional points (feature points).
- FIG. 37 is a diagram showing an example of projecting into an image of a three-dimensional point and determining the area around the three-dimensional point as a photographed area.
- the photographing device 301 may generate a mesh by connecting three-dimensional points, and may determine that the area has been photographed using the mesh.
- FIG. 38 is a diagram showing an example of a photographed area in this case. As shown in FIG. 38, the photographing apparatus 301 may determine the area where the mesh is generated as the photographed area.
- the photographed area may be determined by using the parallax image or the depth image obtained from the stereo camera or the RGB-D sensor. According to this, since dense three-dimensional information can be obtained with a light process, the photographed area can be determined more accurately.
- although self-position estimation is performed using SLAM in the above description, the method is not limited to SLAM as long as the position and orientation of the camera can be estimated during shooting.
- the photographing device 301 may predict the accuracy at the time of restoration and display the predicted accuracy in addition to presenting the photographed area.
- the three-dimensional points are calculated from the feature points in the image.
- the projected three-dimensional point may deviate from the reference feature point. This deviation is called the reprojection error, and the accuracy can be evaluated using the reprojection error. Specifically, it can be determined that the larger the reprojection error, the lower the accuracy.
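The reprojection error can be computed as the pixel distance between the projected three-dimensional point and the feature point it was derived from. This sketch assumes a pinhole model without distortion and a point already expressed in camera coordinates; the intrinsics are hypothetical parameters.

```python
import math

def reprojection_error(point_cam, observed_px, fx, fy, cx, cy):
    """Pixel distance between a projected 3D point and its observed feature point."""
    x, y, z = point_cam
    # Project the 3D point into the image with the pinhole model.
    u = fx * x / z + cx
    v = fy * y / z + cy
    return math.hypot(u - observed_px[0], v - observed_px[1])
```

A larger return value would indicate lower accuracy for that point.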
- the photographing device 301 may express the accuracy by the color of the photographed area.
- the high precision region is represented in blue and the low precision region is represented in red.
- the accuracy may be expressed stepwise by the difference in color or the darkness. As a result, the user can easily grasp the accuracy of each area.
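A stepwise color coding from red (low accuracy) to blue (high accuracy) might look like the following. The normalization of accuracy to the range 0 to 1 and the default of five steps are assumptions for illustration.

```python
def accuracy_to_rgb(accuracy, steps=5):
    """Map accuracy in [0, 1] to an (R, G, B) color in `steps` discrete levels."""
    a = min(max(accuracy, 0.0), 1.0)
    # Quantize so that the display changes stepwise rather than continuously.
    level = round(a * (steps - 1)) / (steps - 1)
    return (round(255 * (1.0 - level)), 0, round(255 * level))
```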
- the photographing device 301 may determine whether or not the restoration is possible and evaluate the accuracy by using the depth image obtained by the RGB-D sensor or the like.
- FIG. 39 is a diagram showing an example of a depth image. In the figure, the darker the color (the denser the hatching), the farther the distance is.
- the photographing device 301 may determine that pixels up to a certain depth (for example, up to 5 m) belong to the photographed area.
- FIG. 40 is a diagram showing an example when a region having a depth up to a certain range is determined to be a photographed region. In the figure, the hatched area is determined to be the photographed area.
- the photographing device 301 may determine the accuracy of the area according to the distance to the area. That is, the photographing device 301 may determine that the closer the distance is, the higher the accuracy is. For example, the relationship between distance and accuracy may be defined linearly, or other definitions may be used.
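The depth-based determination and the linear distance-to-accuracy relationship can be combined in one pass over the depth image. The nested-list depth format with `None` for missing values and the scaling of accuracy to the range 0 to 1 are assumptions; the 5 m limit is the example value from the text.

```python
def depth_to_accuracy(depth, max_depth_m=5.0):
    """Per-pixel accuracy in [0, 1], or None where the pixel is not photographed.

    depth: rows of depth values in meters; None marks a missing value.
    """
    result = []
    for row in depth:
        out = []
        for d in row:
            if d is None or d <= 0.0 or d > max_depth_m:
                out.append(None)  # outside the photographed range
            else:
                out.append(1.0 - d / max_depth_m)  # closer means more accurate
        result.append(out)
    return result
```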
- the photographing device 301 can generate a depth image from the parallax image, so the photographed area and the accuracy may be determined by the same method as when the depth image is used. In this case, the photographing device 301 may determine that a region where the depth value could not be calculated from the parallax image cannot be restored (is not photographed). Alternatively, the photographing device 301 may estimate the depth value of a region where the depth value could not be calculated from the surrounding pixels. For example, the photographing device 301 calculates the average value of the 5 × 5 pixels centered on the target pixel.
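Filling a missing depth value from its neighborhood can be sketched as below. The sketch assumes missing values are marked `None` and that only valid neighbors contribute to the average; the source does not say how missing neighbors are handled.

```python
def fill_missing_depth(depth, window=5):
    """Replace None entries with the mean of valid values in a window x window patch."""
    h, w = len(depth), len(depth[0])
    half = window // 2
    filled = [row[:] for row in depth]
    for r in range(h):
        for c in range(w):
            if depth[r][c] is not None:
                continue
            # Collect valid depths around the target pixel, clipped at the border.
            neighbors = [depth[rr][cc]
                         for rr in range(max(0, r - half), min(h, r + half + 1))
                         for cc in range(max(0, c - half), min(w, c + half + 1))
                         if depth[rr][cc] is not None]
            if neighbors:
                filled[r][c] = sum(neighbors) / len(neighbors)
    return filled
```

Pixels with no valid neighbor stay `None` and would still be treated as not photographed.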
- the information indicating the three-dimensional position of the photographed area and the accuracy of each three-dimensional position determined above is accumulated. Since the coordinates of the position and orientation of the camera are integrated as described above, the coordinates of the obtained area information can also be integrated. As a result, information indicating the three-dimensional position of the photographed area in the target space and the accuracy of each three-dimensional position is generated.
- the photographing instruction device performs the processing shown in FIG. 41.
- the photographing device 301 captures a plurality of first images of the target space (S851), generates first three-dimensional position information of the target space (for example, a sparse three-dimensional point cloud or a depth image) based on the plurality of first images and the first photographing position and orientation of each of the plurality of first images (S852), and determines, using the first three-dimensional position information and without generating second three-dimensional position information, a second region of the target space in which it is difficult to generate, using the plurality of first images, the second three-dimensional position information, which is more detailed than the first three-dimensional position information (S853).
- the region where it is difficult to generate the three-dimensional position information includes a region where the three-dimensional position cannot be calculated and a region where the error between the three-dimensional position and the actual position is larger than a predetermined threshold value.
- the second region includes at least one of (1) a region that is not photographed from some of the plurality of photographing viewpoints, (2) a region that is photographed but has large blur, (3) a region that is photographed but has few feature points because its contrast is lower than that of other regions, (4) a region that is photographed but is farther from the photographing viewpoint than other regions, so that even if a three-dimensional position is calculated, the error between the calculated three-dimensional position and the actual position is estimated to be large, and (5) a region where the influence of lens distortion is larger than in other regions.
- the blur can be detected, for example, by obtaining the position change of the feature point with time.
- the detailed three-dimensional position information is, for example, three-dimensional position information having high spatial resolution.
- the spatial resolution of the three-dimensional position information indicates the distance between the two adjacent three-dimensional positions when the two adjacent three-dimensional positions can be discriminated as different three-dimensional positions.
- high spatial resolution means that the distance between two adjacent three-dimensional positions is small. That is, the three-dimensional position information having a high spatial resolution has more three-dimensional position information in a space of a predetermined size.
- the three-dimensional position information having a high spatial resolution may be referred to as dense three-dimensional position information
- the three-dimensional position information having a low spatial resolution may be referred to as sparse three-dimensional position information.
- the detailed three-dimensional position information may be three-dimensional position information having a large amount of information.
- the first three-dimensional position information may be distance information from one viewpoint, such as a depth image, and the second three-dimensional position information may be a three-dimensional model, such as a three-dimensional point cloud, from which distance information can be obtained from an arbitrary viewpoint.
- the target space and the subject have the same concept, and mean an area that is commonly photographed.
- the photographing apparatus 301 can determine the second region where it is difficult to generate the second three-dimensional position information by using the first three-dimensional position information without generating the second three-dimensional position information. It is possible to improve the efficiency of capturing a plurality of images for generating the second three-dimensional position information.
- the second region is at least one of a region where an image is not taken and a region where the accuracy of the second three-dimensional position information is estimated to be lower than a predetermined standard.
- the reference is, for example, a threshold value of the distance between two different three-dimensional positions. That is, the second region is a region in which, when the three-dimensional position information is generated, the difference between the generated three-dimensional position information and the actual position is larger than a predetermined threshold value.
- the first three-dimensional position information includes a first three-dimensional point cloud
- the second three-dimensional position information includes a second three-dimensional point cloud that is denser than the first three-dimensional point cloud.
- the photographing device 301 determines the third region of the target space corresponding to the region around the first three-dimensional point cloud (the region within a predetermined distance from the first three-dimensional point cloud). Then, a region other than the third region is determined to be the second region (for example, FIG. 37).
- the photographing apparatus 301 generates a mesh using the first three-dimensional point cloud, and determines, as the second region, a region other than the third region of the target space corresponding to the region where the mesh is generated (for example, FIG. 38).
- the photographing apparatus 301 determines the second region based on the reprojection error of the first three-dimensional point cloud.
- the first three-dimensional position information includes a depth image.
- the photographing device 301 determines, for example, a region within a predetermined distance from the photographing viewpoint as a third region and a region other than the third region as a second region based on the depth image. do.
- The photographing device 301 further uses a plurality of second images that have already been photographed, a second photographing position and posture for each of the plurality of second images, the plurality of first images, and the plurality of first photographing positions and postures to match the coordinate system of the plurality of first photographing positions and postures with the coordinate system of the plurality of second photographing positions and postures. According to this, the photographing device 301 can determine the second region by using information obtained from multiple photographing sessions.
- The photographing device 301 further displays the second region, or a third region other than the second region (for example, a photographed region), while the target space is being photographed (for example, FIG. 30). According to this, the photographing device 301 can present the second region to the user.
- The photographing device 301 superimposes information indicating the second region or the third region on any of the plurality of images and displays it (for example, FIG. 30). According to this, the photographing device 301 can present the position of the second region within the image, so the user can easily grasp where the second region is.
- The photographing device 301 superimposes information indicating the second region or the third region on a map of the target space and displays it (for example, FIG. 35). According to this, the photographing device 301 can present the position of the second region within the surrounding environment, so the user can easily grasp where the second region is.
- The photographing device 301 displays the second region together with the restoration accuracy (the accuracy of three-dimensional reconstruction) of each region included in the second region. According to this, the user can grasp the restoration accuracy of each region in addition to the second region, and can shoot appropriately based on it.
- The photographing device 301 further presents to the user an instruction for having the user photograph the second region (for example, FIGS. 32 and 33). According to this, the user can efficiently perform appropriate shooting.
- The instruction includes at least one of the direction and the distance from the current position to the second region (for example, FIGS. 33 and 34). According to this, the user can efficiently perform appropriate shooting.
- The photographing device includes a processor and a memory, and the processor performs the above processing using the memory.
- Although the photographing instruction device, the photographing device, and the like according to the embodiments of the present disclosure have been described above, the present disclosure is not limited to these embodiments.
- The photographing instruction device may have at least some of the processing units included in the photographing device.
- The photographing device may have at least some of the processing units included in the photographing instruction device.
- At least some of the processing units included in the photographing instruction device and at least some of the processing units included in the photographing device may be combined.
- The photographing instruction method according to the first embodiment may include at least part of the processing included in the photographing method according to the second embodiment.
- The photographing method according to the second embodiment may include at least part of the processing included in the photographing instruction method according to the first embodiment.
- At least part of the processing included in the photographing instruction method according to the first embodiment and at least part of the processing included in the photographing method according to the second embodiment may be combined.
- Each processing unit included in the photographing instruction device and the photographing device according to the above embodiments is typically realized as an LSI, which is an integrated circuit. The processing units may each be implemented as a separate chip, or some or all of them may be integrated into a single chip.
- The integrated circuit is not limited to an LSI, and may be realized by a dedicated circuit or a general-purpose processor.
- An FPGA (Field Programmable Gate Array), or a reconfigurable processor that can reconfigure the connections and settings of the circuit cells inside the LSI, may be used.
- Each component may be configured by dedicated hardware or may be realized by executing a software program suitable for that component.
- Each component may be realized by a program execution unit, such as a CPU or a processor, reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory.
- The present disclosure may be realized as a photographing instruction method or the like executed by a photographing instruction device or the like.
- The division of functional blocks in the block diagrams is an example: a plurality of functional blocks may be realized as one functional block, one functional block may be divided into a plurality of functional blocks, and some functions may be transferred to other functional blocks. Further, the functions of a plurality of functional blocks having similar functions may be processed by a single piece of hardware or software, in parallel or by time division.
- The order in which the steps in the flowcharts are executed is given as an example to explain the present disclosure concretely, and the steps may be executed in a different order. Further, some of the steps may be executed at the same time as (in parallel with) other steps.
- Although the photographing instruction device and the like according to one or more aspects have been described above based on the embodiments, the present disclosure is not limited to these embodiments. As long as they do not depart from the gist of the present disclosure, forms obtained by applying various modifications conceivable by those skilled in the art to the embodiments, and forms constructed by combining components of different embodiments, may also be included within the scope of the one or more aspects.
- The present disclosure can be applied to a photographing instruction device.
- Terminal device; 101 Imaging unit; 102 Control unit; 103 Position/orientation estimation unit; 104 Three-dimensional reconstruction unit; 105 Image analysis unit; 106 Point cloud analysis unit; 107 Communication unit; 108 UI unit; 111 Video storage unit; 112 Camera posture storage unit; 113 Three-dimensional model storage unit; 301 Imaging device; 302 Reconstruction device; 311 Imaging unit; 312 Position/orientation estimation unit; 313 Position/orientation integration unit; 314 Area detection unit; 315 UI unit; 316 Control unit; 317 Image storage unit; 318 Position/orientation storage unit; 319 Area information storage unit; 321 Shooting start/shooting stop button; 322 Shooting image; 323 Area information; 324 Character display area; 325 Arrow; 331 Camera position; 332 Current route; 333 Past route
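The description above determines the third region as the part of the target space within a predetermined distance of the first (sparse) three-dimensional point cloud, treats the remainder as the second region, and then tells the user the direction and distance to the second region. The following Python sketch illustrates that idea only; it is not the implementation of the photographing device 301. The function names `split_regions` and `guidance`, the representation of the target space as a list of candidate positions, and the convention that the heading is measured counterclockwise from the +x axis in the horizontal x-y plane are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def split_regions(
    sparse_points: Sequence[Point],
    candidates: Sequence[Point],
    max_dist: float,
) -> Tuple[List[Point], List[Point]]:
    """Split candidate positions into (second_region, third_region).

    Assumption: a candidate lying within max_dist of any point of the
    sparse point cloud belongs to the third region; every other
    candidate belongs to the second region.
    """
    second: List[Point] = []
    third: List[Point] = []
    for c in candidates:
        covered = any(math.dist(c, p) <= max_dist for p in sparse_points)
        (third if covered else second).append(c)
    return second, third


def guidance(current: Point, heading_deg: float, target: Point) -> Tuple[float, float]:
    """Return (distance, relative_turn_deg) from the current pose to target.

    Both values are computed in the horizontal x-y plane. The relative
    turn is normalized to [-180, 180); with the assumed convention a
    positive value means a counterclockwise turn.
    """
    dx = target[0] - current[0]
    dy = target[1] - current[1]
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dy, dx))
    relative = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return distance, relative


if __name__ == "__main__":
    # Hypothetical data: two sparse points and three candidate positions.
    sparse = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    grid = [(0.5, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
    second, third = split_regions(sparse, grid, max_dist=1.0)
    for target in second:
        dist, turn = guidance((0.0, 0.0, 0.0), 0.0, target)
        print(f"second-region position {target}: {dist:.1f} away, turn {turn:.0f} deg")
```

The brute-force distance check is quadratic in the number of points; a practical system would more likely use a voxel grid or a spatial index, which this sketch omits for brevity.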
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Signal Processing (AREA)
- Geometry (AREA)
- Computer Graphics (AREA)
- Medical Informatics (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Studio Devices (AREA)
- Image Processing (AREA)
- Stereoscopic And Panoramic Photography (AREA)
- Processing Or Creating Images (AREA)
Priority Applications (7)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP21780793.2A EP4131173B1 (en) | 2020-03-30 | 2021-03-24 | Imaging instruction method, imaging method, imaging instruction device, and imaging device |
| CN202180023034.0A CN115336250B (zh) | 2020-03-30 | 2021-03-24 | Shooting method and shooting device |
| EP25181151.9A EP4592965A3 (en) | 2020-03-30 | 2021-03-24 | Shooting instruction method, shooting method, shooting instruction device, and shooting device |
| CN202510900352.0A CN120475137A (zh) | 2020-03-30 | 2021-03-24 | Shooting instruction method and shooting instruction device |
| JP2022512007A JP7745167B2 (ja) | 2020-03-30 | 2021-03-24 | Shooting method and shooting device |
| US17/943,415 US12175601B2 (en) | 2020-03-30 | 2022-09-13 | Shooting method, shooting instruction method, shooting device, and shooting instruction device |
| JP2025144786A JP2025172128A (ja) | 2020-03-30 | 2025-09-01 | Shooting instruction method and shooting instruction device |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2020059676 | 2020-03-30 | ||
| JP2020-059676 | 2020-03-30 | ||
| JP2020-179647 | 2020-10-27 | ||
| JP2020179647 | 2020-10-27 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/943,415 Continuation US12175601B2 (en) | 2020-03-30 | 2022-09-13 | Shooting method, shooting instruction method, shooting device, and shooting instruction device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021200432A1 (ja) | 2021-10-07 |
Family
ID=77928625
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/012156 Ceased WO2021200432A1 (ja) | 2020-03-30 | 2021-03-24 | Shooting instruction method, shooting method, shooting instruction device, and shooting device |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US12175601B2 |
| EP (2) | EP4131173B1 |
| JP (2) | JP7745167B2 |
| CN (2) | CN120475137A |
| WO (1) | WO2021200432A1 |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2023149118A1 (ja) * | 2022-02-03 | 2023-08-10 | ソニーグループ株式会社 | Program, information processing device, and information processing method |
| WO2023234385A1 (ja) * | 2022-06-03 | 2023-12-07 | Necソリューションイノベータ株式会社 | Map generation device, map generation method, and computer-readable recording medium |
| WO2025033109A1 (ja) * | 2023-08-04 | 2025-02-13 | ソニーグループ株式会社 | Information processing device and method |
| JP7808906B1 (ja) * | 2025-09-04 | 2026-01-30 | 達也 片山 | Measurement support system for tatami-laying sections |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
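| CN115601419A (zh) * | 2021-07-07 | 2023-01-13 | 北京字跳网络技术有限公司(Cn) | Back-end optimization method, apparatus, and storage medium for simultaneous localization and mapping |
| CN115830424B (zh) * | 2023-02-09 | 2023-04-28 | 深圳酷源数联科技有限公司 | Mine waste identification method, apparatus, device, and storage medium based on fused images |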
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
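| JP2008089410A (ja) * | 2006-10-02 | 2008-04-17 | Konica Minolta Holdings Inc | Three-dimensional information acquisition system, three-dimensional information acquisition method, and program |
| JP2010256253A (ja) * | 2009-04-27 | 2010-11-11 | Topcon Corp | Image capturing device for three-dimensional measurement and method therefor |
| JP2016061687A (ja) * | 2014-09-18 | 2016-04-25 | ファナック株式会社 | Contour line measurement device and robot system |
| JP2017130146A (ja) | 2016-01-22 | 2017-07-27 | キヤノン株式会社 | Image management device, image management method, and program |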
Family Cites Families (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP3703178B2 (ja) * | 1995-09-01 | 2005-10-05 | キヤノン株式会社 | Method and device for reconstructing the shape and surface pattern of a three-dimensional scene |
| US7840042B2 (en) * | 2006-01-20 | 2010-11-23 | 3M Innovative Properties Company | Superposition for visualization of three-dimensional data acquisition |
| CN102812715B (zh) * | 2011-01-27 | 2015-08-19 | 松下电器产业株式会社 | Three-dimensional image capturing device and three-dimensional image capturing method |
| WO2013029675A1 (en) * | 2011-08-31 | 2013-03-07 | Metaio Gmbh | Method for estimating a camera motion and for determining a three-dimensional model of a real environment |
| JP6280425B2 (ja) * | 2014-04-16 | 2018-02-14 | 株式会社日立製作所 | Image processing device, image processing system, three-dimensional measuring instrument, image processing method, and image processing program |
| US10706621B2 (en) * | 2015-11-30 | 2020-07-07 | Photopotech LLC | Systems and methods for processing image information |
| CN109658515B (zh) * | 2017-10-11 | 2022-11-04 | 阿里巴巴集团控股有限公司 | Point cloud meshing method, apparatus, device, and computer storage medium |
| US10574881B2 (en) * | 2018-02-15 | 2020-02-25 | Adobe Inc. | Smart guide to capture digital images that align with a target image model |
| JP2019185283A (ja) * | 2018-04-06 | 2019-10-24 | 日本放送協会 | Three-dimensional model generation device and program therefor, and IP stereoscopic image display system |
| KR101930796B1 (ko) * | 2018-06-20 | 2018-12-19 | 주식회사 큐픽스 | Three-dimensional coordinate calculation device using images, three-dimensional coordinate calculation method, three-dimensional distance measurement device, and three-dimensional distance measurement method |
| WO2020056041A1 (en) * | 2018-09-11 | 2020-03-19 | Pointivo, Inc. | Improvements in data acquistion, processing, and output generation for use in analysis of one or a collection of physical assets of interest |
| KR102158324B1 (ko) * | 2019-05-07 | 2020-09-21 | 주식회사 맥스트 | Device and method for generating point cloud information |
2021
- 2021-03-24: EP application EP21780793.2A, publication EP4131173B1 (en), status Active
- 2021-03-24: EP application EP25181151.9A, publication EP4592965A3 (en), status Pending
- 2021-03-24: CN application CN202510900352.0A, publication CN120475137A (zh), status Pending
- 2021-03-24: CN application CN202180023034.0A, publication CN115336250B (zh), status Active
- 2021-03-24: WO application PCT/JP2021/012156, publication WO2021200432A1 (ja), status Ceased
- 2021-03-24: JP application JP2022512007A, publication JP7745167B2 (ja), status Active
2022
- 2022-09-13: US application US17/943,415, publication US12175601B2 (en), status Active
2025
- 2025-09-01: JP application JP2025144786A, publication JP2025172128A (ja), status Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4131173A4 |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2023149118A1 (ja) * | 2022-02-03 | 2023-08-10 | ソニーグループ株式会社 | Program, information processing device, and information processing method |
| WO2023234385A1 (ja) * | 2022-06-03 | 2023-12-07 | Necソリューションイノベータ株式会社 | Map generation device, map generation method, and computer-readable recording medium |
| JPWO2023234385A1 * | 2022-06-03 | 2023-12-07 | ||
| JP7744064B2 (ja) | 2022-06-03 | 2025-09-25 | Necソリューションイノベータ株式会社 | Map generation device, map generation method, and program |
| WO2025033109A1 (ja) * | 2023-08-04 | 2025-02-13 | ソニーグループ株式会社 | Information processing device and method |
| JP7808906B1 (ja) * | 2025-09-04 | 2026-01-30 | 達也 片山 | Measurement support system for tatami-laying sections |
Also Published As
| Publication number | Publication date |
|---|---|
| US12175601B2 (en) | 2024-12-24 |
| EP4131173A4 (en) | 2023-11-08 |
| JPWO2021200432A1 | 2021-10-07 |
| JP2025172128A (ja) | 2025-11-20 |
| US20230005220A1 (en) | 2023-01-05 |
| CN115336250A (zh) | 2022-11-11 |
| CN115336250B (zh) | 2025-07-04 |
| CN120475137A (zh) | 2025-08-12 |
| EP4592965A3 (en) | 2025-10-15 |
| EP4131173A1 (en) | 2023-02-08 |
| EP4592965A2 (en) | 2025-07-30 |
| JP7745167B2 (ja) | 2025-09-29 |
| EP4131173B1 (en) | 2025-07-16 |
Similar Documents
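| Publication | Publication Date | Title |
|---|---|---|
| JP7745167B2 (ja) | | Shooting method and shooting device |
| CN111983635B (zh) | | Pose determination method and apparatus, electronic device, and storage medium |
| US10116922B2 (en) | | Method and system for automatic 3-D image creation |
| CN105190229B (zh) | | Three-dimensional shape measurement device, three-dimensional shape measurement method, and three-dimensional shape measurement program |
| CN105744138B (zh) | | Fast focusing method and electronic device |
| JP7204021B2 (ja) | | Device and method for obtaining a registration error map representing the sharpness level of an image |
| JP7836988B2 (ja) | | Calculation method and calculation device |
| JP2023157799A (ja) | | Viewer control method and information processing device |
| KR20210112390A (ko) | | Shooting method, apparatus, electronic device, and storage medium |
| CN111935389B (zh) | | Method and apparatus for switching a shooting target, shooting device, and readable storage medium |
| JP5086120B2 (ja) | | Depth information acquisition method, depth information acquisition device, program, and recording medium |
| WO2023149118A1 (ja) | | Program, information processing device, and information processing method |
| CN117480784A (zh) | | Imaging system, imaging method, and program |
| JP7416736B2 (ja) | | Shooting support device, shooting support method, and shooting support program |
| WO2025249064A1 (ja) | | Information processing device, information processing method, and program |
| WO2025033166A1 (ja) | | Imaging device, imaging method, and program |
| JP2024169179A (ja) | | Information processing device and information processing method |
| WO2025187211A1 (ja) | | Information processing system, information processing method, and program |
| CN114283053A (zh) | | Method, apparatus, and device for determining a binocular point cloud |
| JP2014074999A (ja) | | Image processing device |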
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21780793; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022512007; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2021780793; Country of ref document: EP; Effective date: 20221031 |
| | WWG | Wipo information: grant in national office | Ref document number: 202180023034.0; Country of ref document: CN |
| | WWG | Wipo information: grant in national office | Ref document number: 2021780793; Country of ref document: EP |