WO2021193672A1 - Three-dimensional model generation method and three-dimensional model generation device
Three-dimensional model generation method and three-dimensional model generation device
- Publication number
- WO2021193672A1 (PCT/JP2021/012093)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- three-dimensional model
- three-dimensional
- accuracy
- search
- points
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/98—Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/90—Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/08—Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10012—Stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
Definitions
- the present disclosure relates to a three-dimensional model generation method and a three-dimensional model generation device.
- Patent Document 1 discloses a technique for generating a three-dimensional model of a subject by using a plurality of images obtained by photographing the subject from a plurality of viewpoints.
- the present disclosure provides a three-dimensional model generation method and the like that can improve the generation accuracy of the three-dimensional model.
- the three-dimensional model generation method is a three-dimensional model generation method executed by an information processing apparatus, in which a plurality of images obtained by photographing a subject from a plurality of viewpoints are acquired; a similar point similar to a first point of a first image among the plurality of images is searched for from among a plurality of second points in a search region in a second image different from the first image, the search region being based on the first point; the accuracy of the search result is calculated using the similarity between the first point and each of the plurality of second points; and a three-dimensional model of the subject is generated using the search result and the accuracy.
- the three-dimensional model generation device includes a memory and a processor, and the processor, using the memory, acquires a plurality of images obtained by photographing a subject from a plurality of viewpoints, searches, in a second image different from a first image among the plurality of images, for a similar point similar to a first point of the first image from a search area based on the first point, calculates the accuracy of a three-dimensional point generated based on the first point by using the similarity between the first point and each of a plurality of second points in the search area, and generates a three-dimensional model of the subject using the result of the search and the accuracy.
- the present disclosure may be realized as a program that causes a computer to execute the steps included in the above three-dimensional model generation method. Further, the present disclosure may be realized as a computer-readable non-transitory recording medium, such as a CD-ROM, on which the program is recorded. The present disclosure may also be realized as information, data, or signals indicating the program. The program, information, data, and signals may be distributed via a communication network such as the Internet.
- the accuracy of three-dimensional model generation can be improved.
- FIG. 1 is a diagram for explaining an outline of a three-dimensional model generation method according to an embodiment.
- FIG. 2 is a block diagram showing a characteristic configuration of the three-dimensional model generator according to the embodiment.
- FIG. 3 is a diagram for explaining the search process by the search unit.
- FIG. 4 is a diagram showing the relationship between the subject and the plurality of frames.
- FIG. 5 is a diagram showing an example when the first accuracy is lower than a predetermined accuracy.
- FIG. 6 is a diagram showing an edge along which the first accuracy tends to be lower than the predetermined accuracy in the first frame.
- FIG. 7 is a diagram showing an example when the first accuracy is higher than a predetermined accuracy.
- FIG. 8 is a diagram showing an edge for which the first accuracy tends to be higher than the predetermined accuracy in the first frame.
- FIG. 9 is a diagram for explaining a first example of the three-dimensional model generation process.
- FIG. 10 is a diagram for explaining a second example of the three-dimensional model generation process.
- FIG. 11 is a flowchart showing an example of the operation of the three-dimensional model generator.
- FIG. 12 is a flowchart showing an example of details of the generation process of step S104 by the generation unit.
- FIG. 13 is a diagram for explaining the search process by the search unit in the modified example.
- a three-dimensional model is generated by searching for similarities between a plurality of images.
- when searching another image for a similar point of one pixel in one image, an epipolar line on the other image is calculated from the geometric constraints of the cameras, and a search is performed over a plurality of pixels on the epipolar line.
- when there are a plurality of pixels on the epipolar line that are similar to the one pixel, such as when similar textures are lined up along the epipolar line, there is a problem that the accuracy of the search is lowered.
- the present disclosure provides a three-dimensional model generation method and the like that can improve the generation accuracy of the three-dimensional model.
- the three-dimensional model generation method is a three-dimensional model generation method executed by an information processing apparatus, in which a plurality of images obtained by photographing a subject from a plurality of viewpoints are acquired; a similar point similar to a first point of a first image among the plurality of images is searched for from among a plurality of second points in a search region in a second image different from the first image, the search region being based on the first point; the accuracy of the search result is calculated using the similarity between the first point and each of the plurality of second points; and a three-dimensional model of the subject is generated using the search result and the accuracy.
- since the search result and its accuracy are used to generate the three-dimensional model, for example, a search result with low accuracy can be excluded from the generation of the three-dimensional model, or a search result with high accuracy can be preferentially adopted.
- therefore, the generation accuracy of the three-dimensional model can be improved.
- further, in the search, the similar points may be searched for in each of a plurality of second images; in the calculation of the accuracy, a first accuracy may be calculated for each of a plurality of first search results corresponding to the plurality of second images; and in the generation of the three-dimensional model, the three-dimensional model may be generated using the plurality of first search results and the plurality of first accuracies.
- a three-dimensional model is generated using a plurality of first search results and a plurality of first precisions, for example, a low-precision first search result is not adopted for the generation of the three-dimensional model, or By preferentially adopting the first search result with high accuracy, it is possible to generate a three-dimensional point with higher accuracy. Therefore, the accuracy of generating the three-dimensional model can be improved.
- further, in the generation of the three-dimensional model, the three-dimensional model may be generated without using the first search result of a second image whose calculated first accuracy is lower than the predetermined accuracy.
- according to this, since a first search result whose accuracy is lower than the predetermined threshold is not adopted for the generation of the three-dimensional model, a more accurate three-dimensional point can be generated.
- further, in the generation of the three-dimensional model, N second images (N is an integer of 1 or more) may be selected from the plurality of second images in descending order of the calculated first accuracy, and the three-dimensional model may be generated using the N first search results corresponding to the selected N second images.
- according to this, since the N first search results with high accuracy are preferentially adopted for the generation of the three-dimensional model, more accurate three-dimensional points can be generated.
- further, in the generation of the three-dimensional model, a plurality of three-dimensional points corresponding to the plurality of first search results may be generated based on the plurality of first search results, the plurality of three-dimensional points may be integrated by a weighted average in which a three-dimensional point is weighted more heavily as the first accuracy of its corresponding first search result is higher, and a three-dimensional model including the generated integrated three-dimensional point may be generated.
- according to this, the plurality of three-dimensional points generated using the plurality of first search results are integrated into one three-dimensional point by an accuracy-weighted average, so a more accurate three-dimensional point can be generated.
- further, the search may be performed for each of a plurality of first points; in the calculation of the accuracy, a plurality of first accuracies may be calculated for each of the plurality of first points; and in the generation of the three-dimensional model, a three-dimensional model including a plurality of three-dimensional points, obtained by generating a three-dimensional point for each of the plurality of first points using the plurality of first search results and the plurality of first accuracies, may be generated as the three-dimensional model.
- further, in the calculation of the accuracy, for each of the plurality of first points, the sum of a plurality of values indicating the plurality of first accuracies calculated for the first point may be calculated as a second accuracy of the search result based on the first point, and in the generation of the three-dimensional model, a three-dimensional model including the plurality of three-dimensional points and a plurality of second accuracies corresponding to the plurality of three-dimensional points may be generated as the three-dimensional model.
- according to this, each of the plurality of three-dimensional points included in the three-dimensional model can be associated with a second accuracy based on the plurality of first accuracies of the plurality of search results used when the three-dimensional point was generated.
- further, one or more low-accuracy three-dimensional points, corresponding to second accuracies lower than a predetermined accuracy and located between two high-accuracy three-dimensional points corresponding to second accuracies higher than the predetermined accuracy, may be corrected with reference to the two high-accuracy three-dimensional points.
- according to this, the lower the accuracy, the larger the amount of movement allowed for the correction can be made, and a low-accuracy three-dimensional point can be corrected with reference to the high-accuracy three-dimensional points.
- therefore, the generation accuracy of the three-dimensional model can be improved.
- the search area may be an area composed of a plurality of pixels on the epipolar line corresponding to the first point.
- a pixel candidate similar to the first pixel can be effectively selected from the second image.
- the three-dimensional model generation device includes a memory and a processor, and the processor, using the memory, acquires a plurality of images obtained by photographing a subject from a plurality of viewpoints, searches, in a second image different from a first image among the plurality of images, for a similar point similar to a first point of the first image from a search region based on the first point, calculates the accuracy of a three-dimensional point generated based on the first point by using the similarity between the first point and each of a plurality of second points in the search region, and generates a three-dimensional model of the subject using the result of the search and the accuracy.
- since the search result and its accuracy are used to generate the three-dimensional model, for example, a search result with low accuracy can be excluded from the generation of the three-dimensional model, or a search result with high accuracy can be preferentially adopted.
- therefore, the generation accuracy of the three-dimensional model can be improved.
- each figure is a schematic view and is not necessarily exactly illustrated. Further, in each figure, substantially the same configuration is designated by the same reference numerals, and duplicate description may be omitted or simplified.
- FIG. 1 is a diagram for explaining the outline of the three-dimensional model generation method according to the embodiment.
- FIG. 2 is a block diagram showing a characteristic configuration of the three-dimensional model generation device 100 according to the embodiment.
- a three-dimensional model of a predetermined region is generated from a plurality of images taken from a plurality of different viewpoints using a plurality of imaging devices 301.
- the predetermined region is a region including a stationary object, a moving object such as a person, or both.
- in other words, the predetermined region is, for example, a region including at least one of a stationary object and a moving body as a subject.
- an example of a predetermined region including a stationary object and a moving body is a venue where a sports game such as basketball is held, or a space on a road where people or cars are present.
- the predetermined area may include not only a specific object as a subject but also a landscape or the like.
- FIG. 1 illustrates a case where the subject 500 is a building. Further, in the following, not only a specific object to be a subject but also a predetermined area including a landscape or the like is also simply referred to as a subject.
- the three-dimensional model generation system 400 includes an image pickup device group 300 including a plurality of image pickup devices 301, an estimation device 200, and a three-dimensional model generation device 100.
- the plurality of image pickup devices 301 are a plurality of image pickup devices that capture a predetermined area. Each of the plurality of imaging devices 301 captures a subject, and outputs the plurality of captured frames to the estimation device 200, respectively.
- the image pickup device group 300 includes two or more image pickup devices 301. Further, the plurality of imaging devices 301 capture the same subject from different viewpoints.
- a frame is, in other words, an image.
- the three-dimensional model generation system 400 is provided with the image pickup device group 300, but the present invention is not limited to this, and one image pickup device 301 may be provided.
- for example, one image pickup device 301 may be moved to photograph a subject existing in the real space, so that the one image pickup device 301 generates a multi-viewpoint image composed of a plurality of frames having different viewpoints.
- Each of the plurality of frames is a frame imaged (generated) by an imaging device 301 in which at least one of the positions and orientations of the imaging device 301 is different from each other.
- each image pickup device 301 may be a camera that generates a two-dimensional image, or may be a camera provided with a three-dimensional measurement sensor that generates a three-dimensional model.
- the plurality of image pickup devices 301 are cameras that generate two-dimensional images, respectively.
- the plurality of image pickup devices 301 may be directly connected to the estimation device 200 by wired or wireless communication so that the frames captured by each can be output to the estimation device 200, or may be indirectly connected to the estimation device 200 via a hub (not shown) such as a communication device or a server.
- the frames taken by the plurality of imaging devices 301 may be output to the estimation device 200 in real time. Alternatively, the frames may be recorded once in an external storage device such as a memory or a cloud server, and then output from the external storage device to the estimation device 200.
- the plurality of image pickup devices 301 may be fixed cameras such as surveillance cameras, mobile cameras such as video cameras, smartphones, or wearable cameras, or moving cameras such as drones with a shooting function.
- the estimation device 200 performs camera calibration by causing one or more image pickup devices 301 to take a picture of a subject from a plurality of viewpoints.
- the estimation device 200 performs camera calibration for estimating the positions and orientations of the plurality of image pickup devices 301 based on the plurality of frames taken by the plurality of image pickup devices 301, respectively.
- the posture of the image pickup device 301 indicates at least one of the imaging direction of the image pickup device 301 and the inclination of the image pickup device 301.
- the imaging direction of the imaging device 301 is the direction of the optical axis of the imaging device 301.
- the inclination of the image pickup device 301 is the rotation angle of the image pickup device 301 around the optical axis from the reference posture.
- the estimation device 200 estimates the camera parameters of the plurality of image pickup devices 301 based on the plurality of frames acquired from the plurality of image pickup devices 301.
- the camera parameters are parameters indicating the characteristics of the image pickup device 301, and include internal parameters, such as the focal length and image center of the image pickup device 301, and external parameters indicating the position (more specifically, the three-dimensional position) and posture of the image pickup device 301. That is, the position and orientation of each of the plurality of imaging devices 301 are obtained by estimating the camera parameters of each of the plurality of imaging devices 301.
- the estimation method in which the estimation device 200 estimates the position and orientation of the image pickup device 301 is not particularly limited.
- the estimation device 200 may estimate the positions and orientations of the plurality of image pickup devices 301 by using, for example, Visual SLAM (Simultaneous Localization and Mapping) technology.
- the estimation device 200 may estimate the positions and orientations of the plurality of image pickup devices 301 by using, for example, the Structure-From-Motion technique.
- the estimation device 200 uses Visual-SLAM technology or Structure-From-Motion technology to extract characteristic points as feature points from each of the plurality of frames 531 to 533 photographed by the plurality of image pickup devices 301.
- then, a feature point search is performed by extracting, from the plurality of feature points, sets of mutually similar points between the plurality of frames. Since the estimation device 200 can identify points on the subject 510 that are commonly captured in the plurality of frames 531 to 533 by searching for the feature points, it can obtain the three-dimensional coordinates of each such point on the subject 510 from the extracted sets of similar points by the principle of triangulation.
- the estimation device 200 can estimate the position and orientation of each imaging device 301 by extracting a plurality of sets of similar points and using the plurality of sets of similar points.
- the estimation device 200 may calculate three-dimensional coordinates for each set of similar points in the process of estimating the position and orientation of each image pickup device 301, and may generate a three-dimensional model 520 including a plurality of three-dimensional points indicated by the calculated three-dimensional coordinates.
- Each of the plurality of three-dimensional points indicates a position on the subject 510 in the three-dimensional space.
- the estimation device 200 obtains the position and orientation of each image pickup device 301 and map information (the three-dimensional model 520) as the estimation result.
- the three-dimensional model 520 includes three-dimensional positions of each of the plurality of three-dimensional points.
- the three-dimensional model 520 may include not only a plurality of three-dimensional positions, but also the color of each three-dimensional point, the surface shape around each three-dimensional point, and information indicating which frames were used to generate each three-dimensional point.
- the estimation device 200 may generate a three-dimensional model 520 including a sparse three-dimensional point cloud by limiting the number of sets of similar points to a predetermined number in order to speed up the estimation process. This is because the estimation device 200 can estimate the position and orientation of each image pickup device 301 with sufficient accuracy even with a predetermined number of sets of similar points.
- the predetermined number may be determined to be a number capable of estimating the position and orientation of each imaging device 301 with sufficient accuracy.
- the estimation device 200 may estimate the position and orientation of each image pickup device 301 by using a set of similar points that are similar to each other with a predetermined degree of similarity or higher. As a result, the estimation device 200 can limit the number of pairs of similarities used in the estimation process to the number of pairs that are similar at or above a predetermined degree of similarity.
- the estimation device 200 may calculate, for example, the distance between the image pickup device 301 and the subject 510 as a camera parameter based on the position and orientation of the image pickup device 301 estimated by using the above technique.
- the three-dimensional model generation system 400 includes a distance measuring sensor, and the distance between the image pickup device 301 and the subject 510 may be measured using the distance measuring sensor.
- the estimation device 200 may be directly connected to the three-dimensional model generation device 100 by wired or wireless communication, or may be indirectly connected to the three-dimensional model generation device 100 via a hub (not shown) such as a communication device or a server. The estimation device 200 outputs the plurality of frames received from the plurality of image pickup devices 301 and the estimated camera parameters of the plurality of image pickup devices 301 to the three-dimensional model generation device 100.
- the camera parameters estimated by the estimation device 200 may be output to the three-dimensional model generation device 100 in real time. Alternatively, the camera parameters may be recorded once in an external storage device such as a memory or a cloud server, and then output from the external storage device to the three-dimensional model generation device 100.
- the estimation device 200 includes at least a computer system including, for example, a control program, a processing circuit such as a processor or a logic circuit that executes the control program, and a recording device such as an internal memory storing the control program or an accessible external memory.
- the three-dimensional model generation device 100 generates a three-dimensional model of a predetermined region based on a plurality of frames captured by the plurality of imaging devices 301 and the camera parameters estimated by the estimation device 200. Specifically, the three-dimensional model generation device 100 is a device that executes three-dimensional model generation processing for generating a three-dimensional model of the subject in a virtual three-dimensional space based on the camera parameters of each of the plurality of imaging devices 301 and the plurality of frames.
- the three-dimensional model of the subject is data including the three-dimensional shape of the subject and the color of the subject, which are restored in a virtual three-dimensional space from the frame in which the actual subject is photographed.
- the three-dimensional model of the subject is a set of points indicating the three-dimensional positions of a plurality of points on the subject captured in each of a plurality of two-dimensional images taken from a plurality of different viewpoints, that is, a multi-viewpoint image taken by the plurality of imaging devices 301.
- the three-dimensional position is represented by, for example, three pieces of information: an X component, a Y component, and a Z component indicating positions on the mutually orthogonal X, Y, and Z axes.
- the information included in the plurality of points indicating three-dimensional positions may include not only the three-dimensional positions (that is, information indicating coordinates) but also information indicating the color of each point, information indicating the surface shape of each point and its surroundings, and the like.
- the information on the three-dimensional position includes information other than the information on the distance between the shooting viewpoint of the frame and the subject.
- the three-dimensional model generation device 100 includes at least a computer system including, for example, a control program, a processing circuit such as a processor or a logic circuit that executes the control program, and a recording device such as an internal memory storing the control program or an accessible external memory.
- the three-dimensional model generation device 100 is an information processing device. The functions of each processing unit of the three-dimensional model generator 100 may be realized by software or hardware.
- the three-dimensional model generation device 100 may store camera parameters in advance. In this case, the three-dimensional model generation system 400 does not have to include the estimation device 200. Further, the plurality of image pickup devices 301 may be communicably connected to the three-dimensional model generation device 100 by wireless or wired communication.
- the plurality of frames taken by the image pickup apparatus 301 may be directly output to the three-dimensional model generation apparatus 100.
- the image pickup device 301 may be directly connected to the three-dimensional model generation device 100 by, for example, wired communication or wireless communication, or the three-dimensional model generation device may be generated via a hub (not shown) such as a communication device or a server. It may be indirectly connected to the device 100.
- the three-dimensional model generation device 100 is a device that generates a three-dimensional model from a plurality of frames.
- the three-dimensional model generation device 100 includes a reception unit 110, a storage unit 120, an acquisition unit 130, a generation unit 140, and an output unit 150.
- the receiving unit 110 receives from the estimation device 200 a plurality of frames taken by the plurality of imaging devices 301 and camera parameters estimated by the estimation device 200. As a result, the receiving unit 110 acquires the first frame (first image) of the subject photographed from the first viewpoint and the second frame (second image) of the subject photographed from the second viewpoint. That is, the plurality of frames received by the receiving unit 110 include the first frame and the second frame. The receiving unit 110 may acquire the three-dimensional model 520 from the estimation device 200. The receiving unit 110 outputs a plurality of received frames and camera parameters to the storage unit 120.
- the receiving unit 110 is, for example, a communication interface for communicating with the estimation device 200.
- when the three-dimensional model generation device 100 and the estimation device 200 communicate wirelessly, the receiving unit 110 includes, for example, an antenna and a wireless communication circuit. Alternatively, when the three-dimensional model generation device 100 and the estimation device 200 communicate by wire, the receiving unit 110 includes, for example, a connector connected to a communication line and a wired communication circuit. The receiving unit 110 may also receive a plurality of frames from the plurality of imaging devices 301 without going through the estimation device 200.
- the storage unit 120 stores a plurality of frames and camera parameters received by the reception unit 110.
- the storage unit 120 may store the three-dimensional model 520 received by the reception unit 110.
- the storage unit 120 may store the processing results of each processing unit included in the three-dimensional model generation device 100.
- the storage unit 120 stores, for example, a control program executed by each processing unit included in the three-dimensional model generation device 100.
- the storage unit 120 is realized by, for example, an HDD (Hard Disk Drive), a flash memory, or the like.
- the acquisition unit 130 acquires a plurality of frames stored in the storage unit 120 and the camera parameters of each imaging device 301 from the storage unit 120 and outputs them to the generation unit 140.
- the three-dimensional model generation device 100 does not have to include the storage unit 120 and the acquisition unit 130.
- the receiving unit 110 may output the plurality of frames received from the plurality of imaging devices 301 and the camera parameters of each imaging device 301 received from the estimating device 200 to the generating unit 140.
- the generation unit 140 generates a three-dimensional model using a plurality of frames and camera parameters.
- the generation unit 140 includes a search unit 141, a calculation unit 142, and a model generation unit 143.
- the search unit 141 searches, in a second frame, for a similar point similar to the first point of the first frame among the plurality of frames, from among a plurality of second points in a search region based on the first point.
- the first point is one first pixel among the plurality of first pixels constituting the first frame.
- the search region is a region defined by the epipolar line corresponding to the first point of the first frame in each of the plurality of second frames, and is, for example, a region composed of a plurality of second points on the epipolar line.
- the plurality of second points are a plurality of second pixels included in the search area.
- Each of the first point and the second point may be a feature point or may not be a feature point.
- the search unit 141 may search, for each of the plurality of first pixels constituting the first frame, for a similar point (similar pixel) similar to the first pixel from among the plurality of second pixels in the search area of the second frame.
- that is, the search unit 141 may search for the above-mentioned similar point for each of the plurality of first pixels, or may search for the above-mentioned similar point for one first pixel. Further, the search unit 141 may search for the above-mentioned similar point in each of the plurality of second frames, or may search for the above-mentioned similar point in one second frame.
- FIG. 3 is a diagram for explaining the search process by the search unit 141.
- FIG. 3 shows an example in which the first frame 531 including the subject 510 is captured by the image pickup device 301 at the first viewpoint V1, the second frame 532 including the subject 510 is captured by the image pickup device 301 at the second viewpoint V2, and the second frame 533 including the subject 510 is captured by the image pickup device 301 at the third viewpoint V3.
- for each first pixel, the search unit 141 calculates the epipolar line obtained by projecting, onto the second frame to be processed, the straight line connecting the position of the image pickup device 301 that captured the first frame and the two-dimensional coordinates of the first pixel on the first frame. For example, as shown in FIG. 3, the search unit 141 calculates the epipolar line 552 obtained by projecting the straight line L1 connecting the first viewpoint V1 and the first pixel 541 onto the second frame 532. Further, the search unit 141 calculates the epipolar line 553 obtained by projecting the straight line L1 onto the second frame 533. Then, the search unit 141 searches for similar points similar to the first pixel 541 to be processed in the first frame 531 on the epipolar lines 552 and 553 of the second frames 532 and 533, respectively.
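- as an illustration of this projection step, the following is a minimal Python sketch that computes an epipolar line from two estimated camera projection matrices; it assumes pinhole cameras whose 3x4 projection matrices and camera center have been obtained from the camera parameters described above, and all function and variable names are ours, not the patent's.

```python
# Hedged sketch: epipolar line of a first pixel on a second frame.
# Assumes 3x4 pinhole projection matrices P1, P2 (from the estimated camera
# parameters) and the homogeneous center C1 of camera 1 (P1 @ C1 == 0).
import numpy as np

def fundamental_matrix(P1, P2, C1):
    """F such that x2^T @ F @ x1 = 0 for corresponding pixels x1, x2."""
    e2 = P2 @ C1  # epipole: camera-1 center as seen by camera 2
    e2_cross = np.array([[0.0, -e2[2], e2[1]],
                         [e2[2], 0.0, -e2[0]],
                         [-e2[1], e2[0], 0.0]])
    return e2_cross @ P2 @ np.linalg.pinv(P1)

def epipolar_line(F, first_pixel):
    """Line (a, b, c) with a*x + b*y + c = 0 on the second frame."""
    x1 = np.array([first_pixel[0], first_pixel[1], 1.0])
    line = F @ x1
    return line / np.linalg.norm(line[:2])  # scale so point-line distance is in pixels
```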
- the calculation unit 142 calculates the accuracy of the search result by using the similarity between the first pixel and each of the plurality of second pixels in the search area.
- the accuracy of the search result corresponds to, for example, the certainty of the similarities (second pixel similar to the first pixel) searched in the search area.
- the certainty of a similar point differs from its degree of similarity. For example, even if the degree of similarity is high, if a plurality of second pixels having a high degree of similarity exist in the search region, the certainty of the similar point is low. That is, the certainty of a similar point is affected by the similarity of the second pixels other than the similar point.
- specifically, the calculation unit 142 calculates, as the similarity, N(I, J) indicating the normalized cross correlation (NCC) between small areas of the first frame and the second frame to be processed, using Equation 1. N(I, J) takes a value between -1 and 1, and the closer it is to 1, the higher the similarity.
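- Equation 1 itself is not reproduced in this text; the following is a minimal Python sketch of a standard NCC between two small windows, under the assumption of grayscale frames stored as 2-D arrays with the windows fully inside the frames (the window half-size w and the variable names are ours).

```python
# Hedged sketch of an NCC similarity in the spirit of Equation 1.
import numpy as np

def ncc(frame1, frame2, p1, p2, w=3):
    """Normalized cross correlation N(I, J) in [-1, 1] between the small
    areas centered on first pixel p1 = (x, y) and second pixel p2 = (x, y)."""
    a = frame1[p1[1]-w:p1[1]+w+1, p1[0]-w:p1[0]+w+1].astype(np.float64)
    b = frame2[p2[1]-w:p2[1]+w+1, p2[0]-w:p2[0]+w+1].astype(np.float64)
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```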
- the calculation unit 142 calculates the similarity between the first pixel and each of the plurality of second pixels in the search area, thereby obtaining a plurality of similarities.
- the calculation unit 142 calculates the first sum, which is the sum of the plurality of similarities calculated for one search area, using Equation 2.
- the first sum corresponds to the first accuracy, which is an example of the accuracy of the search result. The smaller the first sum, the higher the first accuracy.
- the calculation unit 142 searches for similarities similar to the first pixel, and calculates the first accuracy for each of the plurality of second frames to be searched.
- the first accuracy indicates the accuracy when searching for similarities of the first pixel from a plurality of second pixels in the search area of the second frame to be processed.
- in Equation 2, (I, J) indicates the coordinates of the first pixel in the first frame, (X, Y) indicates the coordinates of the start point of the epipolar line in the second frame to be processed, and (S, T) indicates the coordinates of the end point of the epipolar line in the second frame to be processed. i indicates the frame number specifying the second frame to be referred to.
- the sum in Equation 2 may include only values of N(I, J) exceeding the first threshold Th1.
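- building on the hypothetical ncc() sketch above, the first sum of Equation 2 could be computed along one epipolar line as follows; the optional filtering by the first threshold Th1 mirrors the note just made, and a large first sum signals many look-alike candidates, that is, low first accuracy.

```python
# Hedged sketch of the first sum (Equation 2) for one second frame.
def first_sum(frame1, frame2, first_pixel, epipolar_pixels, th1=None, w=3):
    """epipolar_pixels: (x, y) coordinates sampled along the epipolar line."""
    scores = [ncc(frame1, frame2, first_pixel, q, w) for q in epipolar_pixels]
    if th1 is not None:
        scores = [s for s in scores if s > th1]  # keep only N(I, J) > Th1
    return sum(scores), scores  # larger sum -> lower first accuracy
```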
- FIG. 4 is a diagram showing the relationship between the subject 510 and the plurality of frames 561 to 563.
- FIG. 4 shows an example in which the optical axes of the plurality of imaging devices 301 that have been imaged are parallel to each other.
- note, however, that this is merely an example, and the optical axes of the plurality of imaging devices 301 are not limited to being parallel to each other.
- FIG. 5 is a diagram showing an example in which the first accuracy is lower than the predetermined accuracy in the first frame 561. That is, FIG. 5 shows an example in which the first sum calculated using Equation 2 exceeds the second threshold Th2 (not shown). In this way, when the first sum exceeds the second threshold Th2, the calculation unit 142 may determine that the search accuracy of the second pixel searched for the first pixel 571 is lower than the predetermined accuracy.
- FIG. 5(d) is a graph showing the relationship between the positions of the plurality of second pixels on the epipolar line 572 of the second frame 562, which is the search target for pixels similar to the first pixel 571 of the first frame 561, and the degree of similarity (matching score) between the first pixel 571 and each of the plurality of second pixels.
- the horizontal axis of the graph shown in FIG. 5(d) indicates the position of the second pixel on the epipolar line, and the vertical axis indicates the score (similarity) of the second pixel.
- FIG. 5(e) is a graph showing the relationship between the positions of the plurality of second pixels on the epipolar line 573 of the second frame 563, which is the search target for pixels similar to the first pixel 571 of the first frame 561, and the degree of similarity (matching score) between the first pixel 571 and each of the plurality of second pixels.
- the epipolar line 573 is a straight line obtained by projecting, onto the second frame 563, the straight line connecting the position of the image pickup device 301 that captured the first frame 561 and the first pixel 571 (or the point on the subject in the three-dimensional space corresponding to the first pixel 571).
- the degree of similarity is simply referred to as a score.
- in this case, many pixels similar to the first pixel 571 are arranged on the epipolar lines 572 and 573 corresponding to the first pixel 571. Therefore, many second pixels (for example, a predetermined number or more) having a matching score exceeding the first threshold Th1 are included.
- in this way, when an edge follows the epipolar line, such as the edge shown by the broken line in FIG. 6, the epipolar lines 572 and 573 are likely to contain many second pixels having a matching score exceeding the first threshold Th1.
- note that the calculation unit 142 may determine that a second pixel whose matching score exceeds the first threshold Th1 is similar to the first pixel 571.
- the horizontal axis of the graph shown in FIG. 5(d) shows the position of the second pixel for the sake of explanation, but the position of the second pixel is not indispensable for calculating the first sum.
- FIG. 7 is a diagram showing an example in which the first accuracy is equal to or higher than the predetermined accuracy in the first frame 561. That is, FIG. 7 shows an example in which the first sum calculated using Equation 2 is equal to or less than the second threshold Th2. In this way, when the first sum is equal to or less than the second threshold Th2, the calculation unit 142 may determine that the search accuracy of the second pixel searched for the first pixel 571 is equal to or higher than the predetermined accuracy.
- a specific description will be given.
- FIG. 7(d) is a graph showing the relationship between the positions of the plurality of second pixels on the epipolar line 575 of the second frame 562, which is the search target for pixels similar to the first pixel 574 of the first frame 561, and the degree of similarity (matching score) between the first pixel 574 and each of the plurality of second pixels.
- FIG. 7(e) is a graph showing the relationship between the positions of the plurality of second pixels on the epipolar line 576 of the second frame 563, which is the search target for pixels similar to the first pixel 574 of the first frame 561, and the degree of similarity (matching score) between the first pixel 574 and each of the plurality of second pixels.
- the epipolar line 576 is a straight line obtained by projecting, onto the second frame 563, the straight line connecting the position of the image pickup device 301 that captured the first frame 561 and the first pixel 574 (or the point on the subject in the three-dimensional space corresponding to the first pixel 574).
- the number of second pixels having a matching score exceeding the first threshold Th1 is small (for example, less than a predetermined number).
- in particular, when an edge intersects the epipolar line, such as the edge shown by the broken line in FIG. 8, the number of second pixels having a matching score exceeding the first threshold Th1 tends to be small (for example, less than a predetermined number).
- as described above, the larger the first sum, the larger the number of second pixels similar to the first pixel, and therefore the higher the possibility that an erroneous second pixel is matched with the first pixel. Conversely, the smaller the first sum, the smaller the number of second pixels similar to the first pixel, and the higher the possibility that an appropriate second pixel is matched with the first pixel. Therefore, the first sum can be used as an index indicating that the larger its value, the lower the accuracy of the search.
- the first accuracy may be the reciprocal of the first sum, or may be a value obtained by subtracting the first sum from a predetermined fixed value.
- note that, even when the first sum is equal to or less than the second threshold Th2, the calculation unit 142 may determine that the search accuracy is lower than the predetermined accuracy when there is no second pixel whose matching score exceeds the first threshold Th1. This is because, in that case, there is no similar point resembling the first pixel on the epipolar line.
- the calculation unit 142 calculates the first sum for each of the plurality of second frames. Therefore, a plurality of first sums can be obtained for one first pixel.
- the calculation unit 142 calculates the second sum, which is the sum of the plurality of first sums obtained for one first pixel, using Equation 3.
- the second sum corresponds to the second accuracy, which is an example of the accuracy of the search result.
- the second accuracy indicates the accuracy when similar points of the first pixel to be processed are searched for from the plurality of second frames.
- N indicates the number of a plurality of second frames to be referred to.
- the calculation unit 142 calculates a plurality of first sums for each of all the first pixels of the first frame. Then, the calculation unit 142 calculates the second sum using the plurality of first sums. The second sum can be used as an index indicating that the larger the value, the lower the second accuracy.
- the second accuracy may be the reciprocal of the second sum, or may be a value obtained by subtracting the second sum from a predetermined fixed value.
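- the following is a sketch of the second sum of Equation 3 and of the two second-accuracy variants mentioned above (the reciprocal of the sum, or a fixed value minus the sum); it assumes one first sum per referenced second frame, and the names are ours.

```python
# Hedged sketch of the second sum (Equation 3) and a derived second accuracy.
def second_sum(first_sums):
    """first_sums: the first sums of one first pixel over the N second frames."""
    return sum(first_sums)

def second_accuracy(first_sums, fixed_value=None):
    s = second_sum(first_sums)  # larger sum -> lower second accuracy
    if fixed_value is not None:
        return fixed_value - s
    return 1.0 / s if s > 0 else float("inf")
```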
- the model generation unit 143 generates a three-dimensional model using the search results by the search unit 141 and the accuracies calculated by the calculation unit 142. Specifically, the model generation unit 143 generates the three-dimensional model using a plurality of first search results, which are the results of the searches performed on each of the plurality of second frames, and the plurality of first accuracies. The model generation unit 143 generates a three-dimensional model including a plurality of three-dimensional points obtained by generating, for each of the plurality of first pixels, a three-dimensional point using the plurality of first search results and the plurality of first accuracies.
- FIG. 9 is a diagram for explaining a first example of a three-dimensional model generation process.
- the model generation unit 143 may generate the three-dimensional model without using the first search result of a second frame in which the first sum indicating the calculated first accuracy is larger than the second threshold Th2, that is, in which the first accuracy is lower than the predetermined accuracy.
- as shown in FIG. 9, the model generation unit 143 does not use the first search result of the second frame 532, whose accuracy A1 (the first sum calculated for the second frame 532) is larger than the second threshold Th2, and generates the three-dimensional point 522 using the first search result of the second frame 533, whose accuracy A2 (the first sum calculated for the second frame 533) is equal to or less than the second threshold Th2.
- that is, the model generation unit 143 excludes the first search results determined to have low accuracy and, for each pair of the first pixel and a second pixel selected as a similar point by the remaining first search results, calculates a three-dimensional point 522 represented by the three-dimensional coordinates of a point on the subject 510 by the principle of triangulation. For example, as shown in FIG. 9, the model generation unit 143 generates, as the three-dimensional point 522, the intersection of the straight line L1 connecting the first viewpoint V1 and the first pixel 541 and the straight line L3 connecting the third viewpoint V3 and the second pixel 543.
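- since viewing rays obtained from real, noisy matches rarely intersect exactly, the "intersection" of the two straight lines can in practice be taken as the midpoint of the closest points of the two rays; the following sketch shows that standard construction (our illustration, not the patent's own formulation).

```python
# Hedged sketch: triangulating a 3-D point as the midpoint of the closest
# points of two viewing rays p1(t1) = o1 + t1*d1 and p2(t2) = o2 + t2*d2.
import numpy as np

def triangulate_midpoint(o1, d1, o2, d2):
    """o1, o2: 3-D viewpoints; d1, d2: unit ray directions (numpy arrays)."""
    b = o2 - o1
    d = d1 @ d2
    denom = 1.0 - d * d
    if abs(denom) < 1e-12:  # near-parallel rays: no reliable intersection
        return None
    t1 = (d1 @ b - d * (d2 @ b)) / denom
    t2 = (d * (d1 @ b) - d2 @ b) / denom
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))
```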
- alternatively, the model generation unit 143 may select N second frames (N is an integer of 1 or more) from the plurality of second frames in descending order of the calculated first accuracy, and generate the three-dimensional model using the N first search results corresponding to the selected N second frames.
- in this case, the model generation unit 143 uses the N first search results selected in descending order of accuracy and, for each of the plurality of pairs of the first pixel and a second pixel selected as a similar point, calculates a three-dimensional point indicated by the three-dimensional coordinates of a point on the subject 510 by the principle of triangulation. When a plurality of three-dimensional points are obtained for one first pixel, the model generation unit 143 may generate one three-dimensional point corresponding to the first pixel by calculating the average of the plurality of three-dimensional points.
- the second threshold Th2 is an example of the predetermined accuracy. As described above, since a first search result whose first sum is larger than the second threshold Th2 is not adopted in the three-dimensional model generation processing, more accurate three-dimensional points can be generated.
- FIG. 10 is a diagram for explaining a second example of the three-dimensional model generation process.
- in the second example, the model generation unit 143 generates a plurality of three-dimensional points corresponding to the plurality of first search results based on the plurality of first search results. Specifically, the model generation unit 143 obtains, for each of the plurality of second frames, a pair of the first pixel and a second pixel similar to the first pixel, and generates one three-dimensional point from each pair, thereby generating a plurality of three-dimensional points. Then, the model generation unit 143 generates an integrated three-dimensional point by a weighted average of the plurality of three-dimensional points, in which a point is weighted more heavily as the first accuracy of its corresponding first search result is higher, and may generate a three-dimensional model including the generated integrated three-dimensional point.
- in the first example, the first search results with high accuracy were preferentially adopted; in the second example, as shown in FIG. 10, three-dimensional points are generated based on all the first search results.
- specifically, the model generation unit 143 generates, as the three-dimensional point 521, the intersection of the straight line L1 connecting the first viewpoint V1 and the first pixel 541 and the straight line L2 connecting the second viewpoint V2 and the second pixel 542, and generates, as the three-dimensional point 522, the intersection of the straight line L1 and the straight line L3 connecting the third viewpoint V3 and the second pixel 543.
- the plurality of three-dimensional points 521 and 522 generated using the plurality of first search results are then integrated into one three-dimensional point by performing a weighted average in which a point is weighted more heavily as its accuracy is higher. Therefore, the integrated three-dimensional point contains a larger contribution from the high-accuracy three-dimensional points, and a more accurate three-dimensional point can be generated.
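- as a sketch of this integration, the per-frame three-dimensional points of one first pixel can be combined by a weighted average whose weights grow with the first accuracy; here the first accuracy is taken as the reciprocal of each first sum, one of the definitions mentioned above, and the exact weighting is our assumption.

```python
# Hedged sketch: accuracy-weighted integration of per-frame 3-D points.
import numpy as np

def integrate_points(points, first_sums, eps=1e-9):
    pts = np.asarray(points, dtype=np.float64)      # shape (M, 3)
    weights = 1.0 / (np.asarray(first_sums, dtype=np.float64) + eps)
    weights /= weights.sum()                        # higher accuracy -> larger weight
    return (weights[:, None] * pts).sum(axis=0)     # integrated 3-D point
```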
- further, the model generation unit 143 may generate a three-dimensional model including the plurality of three-dimensional points and a plurality of second accuracies corresponding to the plurality of three-dimensional points. In this way, each of the plurality of three-dimensional points included in the three-dimensional model can be associated with a second accuracy based on the plurality of first accuracies of the plurality of search results used when the three-dimensional point was generated.
- the model generation unit 143 can filter (smooth) the three-dimensional model with the second accuracy as a weight.
- specifically, the model generation unit 143 may correct one or more low-accuracy three-dimensional points, corresponding to second accuracies equal to or less than a predetermined accuracy and located between two high-accuracy three-dimensional points corresponding to second accuracies larger than the predetermined accuracy, with reference to the two high-accuracy three-dimensional points.
- in this correction, the lower the accuracy, the larger the amount of movement allowed for the correction can be made, and a low-accuracy three-dimensional point can be corrected with reference to the high-accuracy three-dimensional points.
- therefore, the generation accuracy of the three-dimensional model can be improved.
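- the patent states only the principle of this correction; the sketch below illustrates one possible reading, in which a low-accuracy point is pulled toward the segment joining the two high-accuracy anchor points by an amount that grows as its second accuracy falls (the blending rule is our assumption).

```python
# Hedged sketch: correcting a low-accuracy 3-D point using two high-accuracy
# anchors; acc is the point's second accuracy (larger = more accurate).
import numpy as np

def correct_point(low_pt, acc, high_pt_a, high_pt_b, predetermined_accuracy):
    low_pt, a, b = (np.asarray(v, dtype=np.float64)
                    for v in (low_pt, high_pt_a, high_pt_b))
    ab = b - a
    t = np.clip((low_pt - a) @ ab / (ab @ ab), 0.0, 1.0)
    target = a + t * ab                      # projection onto the anchor segment
    strength = 1.0 - min(acc / predetermined_accuracy, 1.0)  # lower accuracy -> larger move
    return low_pt + strength * (target - low_pt)
```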
- FIG. 11 is a flowchart showing an example of the operation of the three-dimensional model generation device 100.
- the receiving unit 110 receives from the estimation device 200 a plurality of frames captured by the plurality of imaging devices 301 and the camera parameters of each imaging device 301 (S101).
- Step S101 is an example of a step of acquiring a plurality of images.
- the receiving unit 110 does not have to receive the plurality of frames and the camera parameters at one timing, and may receive each at different timings. That is, the acquisition of a plurality of frames and the acquisition of camera parameters may be performed at the same timing as each other, or may be performed at different timings from each other.
- the storage unit 120 stores a plurality of frames taken by the plurality of image pickup devices 301 received by the reception unit 110 and the camera parameters of each image pickup device 301 (S102).
- the acquisition unit 130 acquires a plurality of frames and camera parameters stored in the storage unit 120, and outputs the acquired plurality of frames and camera parameters to the generation unit 140 (S103).
- the generation unit 140 generates a three-dimensional model using a plurality of frames and camera parameters (S104).
- step S104 for generating the three-dimensional model will be described later with reference to FIG.
- the output unit 150 outputs the three-dimensional model generated by the generation unit 140 (S105).
- FIG. 12 is a flowchart showing an example of details of the generation process of step S104 by the generation unit 140.
- the generation unit 140 performs loop 1 for each frame set of the multi-viewpoint images taken at the timings corresponding to each other (S111). In loop 1, loop 2 is performed for each frame set.
- the generation unit 140 performs loop 2 for each first pixel of the first frame in the frame set to be processed (S112). In loop 2, the processes of steps S113 to S115 are performed for each first pixel.
- the search unit 141 searches, from among the plurality of second pixels on the epipolar lines corresponding to the first pixel on the plurality of second frames of the frame set to be processed, for similar points similar to the first pixel to be processed (S113). The details of step S113 are omitted because they have been described in the description of the search unit 141.
- the calculation unit 142 calculates the first accuracy of the search for similar points similar to the first pixel to be processed for each of the plurality of second frames (S114). The details of step S114 have been described in the description of the calculation unit 142 and will be omitted.
- the model generation unit 143 generates a three-dimensional model using the plurality of first search results obtained in step S113 and the plurality of first precisions obtained in step S114 (S115). The details of step S115 are omitted because they have been described in the description of the model generation unit 143.
- Loop 2 ends when the processing of steps S113 to S115 is completed for all the first pixels included in the first frame of the frame set to be processed.
- Loop 1 ends when loop 2 ends for all framesets.
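- the loop structure of FIG. 12 (S111 to S115) can be summarized by the following pseudocode-level sketch; all helper names (first_pixels, search_similar, first_accuracy, make_point) are hypothetical stand-ins for the search unit, calculation unit, and model generation unit described above.

```python
# Hedged sketch of loops 1 and 2 from FIG. 12.
def generate_model(frame_sets, camera_params):
    model = []
    for frames in frame_sets:                    # loop 1: each frame set (S111)
        first, seconds = frames[0], frames[1:]
        for p in first_pixels(first):            # loop 2: each first pixel (S112)
            results = [search_similar(first, s, p, camera_params)
                       for s in seconds]         # S113: epipolar search per second frame
            accuracies = [first_accuracy(first, s, p, r)
                          for s, r in zip(seconds, results)]  # S114: first accuracies
            model.append(make_point(p, results, accuracies))  # S115: 3-D point generation
    return model
```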
- the three-dimensional model generation method is a three-dimensional model generation method executed by an information processing apparatus, and includes: acquiring a plurality of frames obtained by photographing a subject from a plurality of viewpoints (S101); searching, in a second frame different from a first frame among the plurality of frames, for a similar point similar to a first pixel of the first frame from among a plurality of second pixels in a search region based on the first pixel (S113); calculating the accuracy of the search result using the similarity between the first pixel and each of the plurality of second pixels (S114); and generating a three-dimensional model using the search result and the accuracy (S115).
- since the search result and its accuracy are used to generate the three-dimensional model, for example, a search result with low accuracy can be excluded from the generation of the three-dimensional model, or a search result with high accuracy can be preferentially adopted.
- therefore, the generation accuracy of the three-dimensional model can be improved.
- In the search (S113) of the three-dimensional model generation method, the similar point is searched for in each of the plurality of second frames.
- In the calculation of the accuracy (S114), the first accuracy of each of the plurality of first search results corresponding to the plurality of second frames is calculated.
- In the generation of the three-dimensional model of the subject (S115), the three-dimensional model is generated using the plurality of first search results and the plurality of first accuracies.
- Since the three-dimensional model is generated using the plurality of first search results and the plurality of first accuracies, for example by not adopting first search results with low accuracy or by preferentially adopting first search results with high accuracy, three-dimensional points can be generated with higher accuracy. Therefore, the generation accuracy of the three-dimensional model can be improved.
- In the generation of the three-dimensional model, the three-dimensional model may be generated without using the first search result of a second frame whose calculated first accuracy is lower than a predetermined accuracy. According to this, since a first search result whose accuracy is lower than the predetermined threshold value is not adopted for the generation of the three-dimensional model, three-dimensional points can be generated with higher accuracy.
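- A minimal sketch of this filtering step is shown below; the function name and the pairing of results with accuracies are assumptions for illustration.

```python
def filter_by_accuracy(results, accuracies, threshold):
    """Discard first search results whose first accuracy falls below the threshold."""
    return [r for r, a in zip(results, accuracies) if a >= threshold]
```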
- In the generation of the three-dimensional model, N second frames (N is an integer of 1 or more) may be selected from the plurality of second frames in descending order of the calculated first accuracy, and the three-dimensional model may be generated using the N first search results corresponding to the selected N second frames. According to this, since the N first search results with high accuracy are preferentially adopted for the generation of the three-dimensional model, more accurate three-dimensional points can be generated.
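- The top-N selection can be sketched as follows, again with hypothetical names:

```python
def select_top_n(results, accuracies, n):
    """Keep the N first search results with the highest first accuracy."""
    order = sorted(range(len(results)), key=lambda i: accuracies[i], reverse=True)
    return [results[i] for i in order[:n]]
```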
- In the generation of the three-dimensional model, a plurality of three-dimensional points, each corresponding to one of the plurality of first search results, are generated based on the plurality of first search results; an integrated three-dimensional point is generated by taking a weighted average of the plurality of three-dimensional points in which a point is weighted more heavily as the first accuracy of the corresponding first search result is higher; and a three-dimensional model including the generated integrated three-dimensional point is generated.
- According to this, since the plurality of three-dimensional points generated using the plurality of first search results are integrated by a weighted average that is weighted more heavily for higher accuracy, a more accurate three-dimensional point can be generated.
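- A sketch of the accuracy-weighted integration, assuming the accuracies are non-negative and at least one is positive:

```python
import numpy as np

def integrate_points(points, accuracies):
    """Weighted average of candidate 3D points, weighted by first accuracy."""
    pts = np.asarray(points, dtype=float)     # shape (k, 3): one candidate per second frame
    w = np.asarray(accuracies, dtype=float)
    w = w / w.sum()                           # normalize the first accuracies into weights
    return (pts * w[:, None]).sum(axis=0)     # the integrated three-dimensional point
```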
- The search (S113) is performed for each of a plurality of first pixels.
- In the calculation of the accuracy (S114), the plurality of first accuracies are calculated for each of the plurality of first pixels.
- In the generation of the three-dimensional model (S115), a three-dimensional model including a plurality of three-dimensional points, obtained by generating a three-dimensional point for each of the plurality of first pixels using the plurality of first search results and the plurality of first accuracies, is generated as the three-dimensional model. Therefore, a more accurate three-dimensional point can be generated for each of the plurality of first pixels, and the generation accuracy of the three-dimensional model can be improved.
- In the calculation of the accuracy (S114), for each of the plurality of first pixels, the sum of the plurality of values indicating the plurality of first accuracies calculated for that first pixel (that is, the matching scores) is calculated as the second accuracy of the search result based on that first pixel.
- In the generation of the three-dimensional model, a three-dimensional model including the plurality of three-dimensional points and the second accuracies is generated as the three-dimensional model. Therefore, each of the plurality of three-dimensional points included in the three-dimensional model can be associated with a second accuracy based on the plurality of first accuracies of the plurality of search results used when that three-dimensional point was generated.
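- The second accuracy and its association with a generated point can be sketched as follows (the dictionary layout is an assumption for illustration):

```python
def second_accuracy(first_accuracies):
    """Second accuracy of a first pixel: the sum of its matching scores."""
    return float(sum(first_accuracies))

# e.g. store each integrated point together with its second accuracy:
# model.append({"point": integrate_points(pts, accs), "accuracy": second_accuracy(accs)})
```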
- In the generation of the three-dimensional model, one or more low-accuracy three-dimensional points corresponding to a second accuracy lower than a predetermined accuracy, located between two high-accuracy three-dimensional points corresponding to second accuracies higher than the predetermined accuracy, are corrected with reference to the two high-accuracy three-dimensional points.
- According to this, the lower the accuracy of a three-dimensional point, the larger the amount of movement allowed in its correction, and a low-accuracy three-dimensional point can be corrected with reference to the high-accuracy three-dimensional points.
- Therefore, the generation accuracy of the three-dimensional model can be improved.
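- One way to realize such a correction is sketched below, assuming the points are ordered along a contour so that "between" is well defined; the linear pull toward the segment joining the two high-accuracy points is an illustrative choice, not the patent's specific rule.

```python
import numpy as np

def correct_low_accuracy(points, accs, threshold):
    """Pull low-accuracy points toward the segment spanned by the nearest
    high-accuracy neighbours; lower accuracy allows a larger move."""
    pts = np.asarray(points, dtype=float)
    hi = [i for i, a in enumerate(accs) if a > threshold]
    for left, right in zip(hi, hi[1:]):
        for i in range(left + 1, right):
            t = (i - left) / (right - left)
            target = (1 - t) * pts[left] + t * pts[right]   # point on the reference segment
            pull = 1.0 - min(accs[i] / threshold, 1.0)      # lower accuracy -> stronger pull
            pts[i] = (1 - pull) * pts[i] + pull * target
    return pts
```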
- The search region is a region composed of a plurality of second pixels on the epipolar line corresponding to the first pixel. Therefore, pixel candidates similar to the first pixel can be selected efficiently from the second image.
- In the above embodiment, the three-dimensional model of the subject is generated by calculating the accuracy of the search results, but the calculation of the accuracy of the search results may be omitted.
- For example, the three-dimensional model generation method according to Modification 1 is executed by an information processing apparatus.
- In the three-dimensional model generation method according to Modification 1, a first image obtained by photographing the subject from a first viewpoint and a second image obtained by photographing the subject from a second viewpoint are acquired; in order to search for a similar point similar to a first point of the first image, a search region of the second image is specified based on the first point; the similar point is searched for in the search region by calculating similarities, each indicating the degree to which one of a plurality of second points in the search region is similar to the first point; and a three-dimensional model of the subject is generated based on the result of the search and the variation of the plurality of calculated similarities.
- The plurality of similarities in the graph shown in FIG. 7 (d) are more locally distributed than the plurality of similarities in the graph shown in FIG. 5 (d).
- In other words, the plurality of similarities in the graph shown in FIG. 7 (d) vary more, and have a larger standard deviation, than the plurality of similarities in the graph shown in FIG. 5 (d). Therefore, the calculation unit 142 calculates, for example, the standard deviation of the plurality of similarities. Then, when the calculated standard deviation is equal to or greater than a predetermined threshold value and the maximum similarity is equal to or greater than a predetermined threshold value, the model generation unit 143 uses the second pixel (similar point) having the largest similarity for the generation of the three-dimensional model. When the calculated standard deviation is less than the predetermined threshold value, the model generation unit 143 does not use the second pixel having the largest similarity for the generation of the three-dimensional model.
- The index indicating the variation of the plurality of similarities is not limited to the standard deviation.
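- The variation-based decision of Modification 1 can be sketched as follows; the two threshold values are hypothetical parameters:

```python
import numpy as np

def accept_best_match(similarities, std_threshold, score_threshold):
    """Adopt the best second pixel only when the similarity scores have a large
    variation (a sharp, unambiguous peak) and the peak itself is high enough."""
    s = np.asarray(similarities, dtype=float)
    if s.std() >= std_threshold and s.max() >= score_threshold:
        return int(s.argmax())    # index of the second pixel to use for the model
    return None                   # ambiguous match: do not use for the 3D model
```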
- In the above embodiment and Modification 1, NCC is calculated as the similarity (matching score), but the present disclosure is not limited to this.
- For example, SSD(I, J), which indicates the SSD (Sum of Squared Differences), that is, the sum of the squared differences of pixel values between small regions of the first frame and the second frame to be processed, may be calculated as the similarity using Equation 4. The smaller the value of SSD(I, J), the higher the similarity.
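- Equation 4 itself is not reproduced in this excerpt, but the standard SSD it refers to can be sketched as follows; note that, unlike NCC, a smaller value means a better match, so the score may be negated when reusing machinery that prefers larger scores.

```python
import numpy as np

def ssd(patch_i, patch_j):
    """SSD(I, J): sum of squared pixel-value differences between two small regions.
    Smaller values indicate higher similarity."""
    d = patch_i.astype(float) - patch_j.astype(float)
    return float((d * d).sum())
```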
- In the above embodiment, the search region is a region defined by the epipolar line corresponding to the first point (first pixel) of the first frame in each of the plurality of second frames, but the search region is not limited to this.
- For example, from the relationship between the positions and orientations of the image pickup devices 301 that captured the plurality of second frames 532 and 533 and a known three-dimensional model 513 (three-dimensional point cloud) acquired in advance, the search unit 141 may specify the third pixels 582 and 583 in the plurality of second frames 532 and 533 that correspond to the first pixel 581 of the first frame 531.
- The search unit 141 may then determine regions based on the third pixels 582 and 583, each including the corresponding third pixel, as the search regions 592 and 593. For example, the search unit 141 may determine rectangular regions centered on the third pixels 582 and 583 as the search regions 592 and 593.
- The search region is not limited to a rectangle, and may be a region having another specific shape such as a square or a circle.
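- A sketch of determining such a rectangular search region, assuming a 3x4 projection matrix P2 for the second frame derived from the position and orientation of its image pickup device, and a point X taken from the known three-dimensional model:

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(np.asarray(X, dtype=float), 1.0)
    return x[:2] / x[2]

def rect_search_region(P2, X, half_w=8, half_h=8):
    """Rectangle centred on the third pixel, i.e. the projection into the
    second frame of the known 3D point corresponding to the first pixel."""
    u, v = np.round(project(P2, X)).astype(int)
    return (u - half_w, v - half_h, u + half_w, v + half_h)   # left, top, right, bottom
```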
- The known three-dimensional model 513 acquired in advance may be the three-dimensional model 520 generated by the estimation device 200 described with reference to FIG.
- Each processing unit included in the three-dimensional model generation device and the like is realized by a CPU and a control program.
- Alternatively, each component of these processing units may be composed of one or more electronic circuits.
- Each of the one or more electronic circuits may be a general-purpose circuit or a dedicated circuit.
- The one or more electronic circuits may include, for example, a semiconductor device, an IC (Integrated Circuit), an LSI (Large Scale Integration), or the like.
- The IC or LSI may be integrated on one chip or on a plurality of chips.
- Although the terms IC and LSI are used here, the name changes depending on the degree of integration, and the circuit may also be called a system LSI, a VLSI (Very Large Scale Integration), or a ULSI (Ultra Large Scale Integration). An FPGA (Field Programmable Gate Array) programmed after the manufacture of the LSI can also be used for the same purpose.
- The general or specific aspects of the present disclosure may be realized by a system, an apparatus, a method, an integrated circuit, or a computer program. Alternatively, they may be realized by a computer-readable non-transitory recording medium, such as an optical disc, an HDD (Hard Disk Drive), or a semiconductor memory, in which the computer program is stored. Further, they may be realized by any combination of a system, an apparatus, a method, an integrated circuit, a computer program, and a recording medium.
- The present disclosure is applicable to a three-dimensional model generation device or a three-dimensional model generation system, and can be applied to, for example, figure creation, recognition of terrain or building structures, recognition of human behavior, generation of free-viewpoint video, and the like.
- 110 Receiving unit
- 120 Storage unit
- 130 Acquisition unit
- 140 Generation unit
- 141 Search unit
- 142 Calculation unit
- 143 Model generation unit
- 150 Output unit
- 200 Estimation device
- 300 Image pickup device group
- 301 Image pickup device
- 400 Three-dimensional model generation system
- 500, 510 Subject
- 511, 520 Three-dimensional model
- 521, 522 Three-dimensional point
- 531, 561 First frame
- 532, 533, 562, 563 Second frame
- 541, 571, 574, 581 First pixel
- 542, 543 Second pixel
- 552, 553, 572, 573, 575, 576 Epipolar line
- 582, 583 Third pixel
- 592, 593 Search region
- L1 to L3 Straight line
- V1 First viewpoint
- V2 Second viewpoint
- V3 Third viewpoint
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
Description
In the technique disclosed in Patent Document 1, a three-dimensional model is generated by searching for similar points between a plurality of images. In general, when searching another image for a point similar to a pixel of one image, an epipolar line on the other image is calculated from the geometric constraints of the cameras, and the search is performed over a plurality of pixels on the epipolar line. However, when there are a plurality of pixels on the epipolar line that are similar to the one pixel, such as when similar textures are lined up along the epipolar line, there is a problem that the accuracy of the search decreases.
[Overview]
First, an overview of the three-dimensional model generation method according to the embodiment will be described with reference to FIG. 1.
The plurality of image pickup devices 301 are a plurality of image pickup devices that photograph a predetermined area. Each of the plurality of image pickup devices 301 photographs the subject and outputs the plurality of captured frames to the estimation device 200. In the present embodiment, the image pickup device group 300 includes two or more image pickup devices 301. The plurality of image pickup devices 301 photograph the same subject from mutually different viewpoints. A frame is, in other words, an image.
The estimation device 200 performs camera calibration by causing one or more image pickup devices 301 to photograph the subject from a plurality of viewpoints. For example, the estimation device 200 performs camera calibration that estimates the positions and orientations of the plurality of image pickup devices 301 based on the plurality of frames captured by the plurality of image pickup devices 301. Here, the orientation of an image pickup device 301 indicates at least one of the shooting direction of the image pickup device 301 and the inclination of the image pickup device 301. The shooting direction of the image pickup device 301 is the direction of the optical axis of the image pickup device 301. The inclination of the image pickup device 301 is the rotation angle of the image pickup device 301 around the optical axis from a reference orientation.
The three-dimensional model generation device 100 generates a three-dimensional model of a predetermined area based on the plurality of frames captured by the plurality of image pickup devices 301 and the camera parameters estimated by the estimation device 200. Specifically, the three-dimensional model generation device 100 is a device that executes a three-dimensional model generation process of generating a three-dimensional model of the subject in a virtual three-dimensional space based on the camera parameters of each of the plurality of image pickup devices 301 and the plurality of frames.
Next, the details of the configuration of the three-dimensional model generation device 100 will be described with reference to FIG. 2.
Next, the operation of the three-dimensional model generation device 100 will be described with reference to FIG. 11. FIG. 11 is a flowchart showing an example of the operation of the three-dimensional model generation device 100.
The three-dimensional model generation method according to the present embodiment is a three-dimensional model generation method executed by an information processing apparatus: a plurality of frames obtained by photographing a subject from a plurality of viewpoints are acquired (S101); a similar point similar to a first pixel of a first frame among the plurality of frames is searched for from among a plurality of second pixels in a search region based on the first pixel in a second frame different from the first frame (S113); the accuracy of the search result is calculated using the similarities between the first pixel and each of the plurality of second pixels (S114); and a three-dimensional model is generated using the search result and the accuracy (S115).
In the three-dimensional model generation method according to the above embodiment, the three-dimensional model of the subject is generated by calculating the accuracy of the search results, but the calculation of the accuracy of the search results may be omitted. For example, the three-dimensional model generation method according to Modification 1 is executed by an information processing apparatus. In the three-dimensional model generation method according to Modification 1, a first image obtained by photographing the subject from a first viewpoint and a second image obtained by photographing the subject from a second viewpoint are acquired; in order to search for a similar point similar to a first point of the first image, a search region of the second image is specified based on the first point; the similar point is searched for in the search region by calculating similarities, each indicating the degree to which one of a plurality of second points in the search region is similar to the first point; and a three-dimensional model of the subject is generated based on the result of the search and the variation of the plurality of calculated similarities.
In the three-dimensional model generation methods according to the above embodiment and Modification 1, NCC is calculated as the similarity (matching score), but the methods are not limited to this. For example, SSD(I, J), which indicates the SSD (Sum of Squared Differences), that is, the sum of the squared differences of pixel values between small regions of the first frame and the second frame to be processed, may be calculated as the similarity using Equation 4. The smaller the value of SSD(I, J), the higher the similarity.
110 Receiving unit
120 Storage unit
130 Acquisition unit
140 Generation unit
141 Search unit
142 Calculation unit
143 Model generation unit
150 Output unit
200 Estimation device
300 Image pickup device group
301 Image pickup device
400 Three-dimensional model generation system
500, 510 Subject
511, 520 Three-dimensional model
521, 522 Three-dimensional point
531, 561 First frame
532, 533, 562, 563 Second frame
541, 571, 574, 581 First pixel
542, 543 Second pixel
552, 553, 572, 573, 575, 576 Epipolar line
582, 583 Third pixel
592, 593 Search region
L1 to L3 Straight line
V1 First viewpoint
V2 Second viewpoint
V3 Third viewpoint
Claims (11)
- A three-dimensional model generation method executed by an information processing apparatus, the method comprising: acquiring a plurality of images obtained by photographing a subject from a plurality of viewpoints; searching for a similar point similar to a first point of a first image among the plurality of images from among a plurality of second points in a search region based on the first point in a second image different from the first image; calculating an accuracy of a search result using similarities between the first point and each of the plurality of second points; and generating a three-dimensional model of the subject using the search result and the accuracy.
- The three-dimensional model generation method according to claim 1, wherein, in the search, the similar point is searched for in each of a plurality of the second images; in the calculation of the accuracy, a first accuracy of each of a plurality of first search results respectively corresponding to the plurality of second images is calculated; and in the generation of the three-dimensional model, the three-dimensional model is generated using the plurality of first search results and the plurality of first accuracies.
- The three-dimensional model generation method according to claim 2, wherein, in the generation of the three-dimensional model, the three-dimensional model is generated without using the first search result of a second image whose calculated first accuracy is lower than a predetermined accuracy.
- The three-dimensional model generation method according to claim 2, wherein, in the generation of the three-dimensional model, N second images (N is an integer of 1 or more) are selected from the plurality of second images in descending order of the calculated first accuracy, and the three-dimensional model is generated using N first search results corresponding to the selected N second images.
- The three-dimensional model generation method according to claim 2, wherein, in the generation of the three-dimensional model, a plurality of three-dimensional points each corresponding to one of the plurality of first search results are generated based on the plurality of first search results; an integrated three-dimensional point is generated by taking a weighted average of the plurality of three-dimensional points in which a three-dimensional point is weighted more heavily as the first accuracy of the corresponding first search result is higher; and the three-dimensional model including the generated integrated three-dimensional point is generated.
- The three-dimensional model generation method according to any one of claims 2 to 5, wherein the search is performed for each of a plurality of the first points; in the calculation of the accuracy, the plurality of first accuracies are calculated for each of the plurality of first points; and in the generation of the three-dimensional model, a three-dimensional model including a plurality of three-dimensional points, obtained by generating a three-dimensional point for each of the plurality of first points using the plurality of first search results and the plurality of first accuracies, is generated as the three-dimensional model.
- The three-dimensional model generation method according to claim 6, wherein, in the calculation of the accuracy, for each of the plurality of first points, the sum of a plurality of values indicating the plurality of first accuracies calculated for the first point is calculated as a second accuracy of the search result based on the first point; and in the generation of the three-dimensional model, a three-dimensional model including the plurality of three-dimensional points and a plurality of the second accuracies corresponding to the plurality of three-dimensional points is generated as the three-dimensional model.
- The three-dimensional model generation method according to claim 7, wherein, in the generation of the three-dimensional model, one or more low-accuracy three-dimensional points corresponding to a second accuracy lower than a predetermined accuracy, located between two high-accuracy three-dimensional points corresponding to second accuracies higher than the predetermined accuracy, are corrected with reference to the two high-accuracy three-dimensional points.
- The three-dimensional model generation method according to any one of claims 1 to 8, wherein the search region is a region composed of a plurality of pixels on an epipolar line corresponding to the first point.
- A three-dimensional model generation device comprising a memory and a processor, wherein the processor, using the memory, acquires a plurality of images obtained by photographing a subject from a plurality of viewpoints; searches for a similar point similar to a first point of a first image among the plurality of images from a search region based on the first point in a second image different from the first image; calculates, using similarities between the first point and each of a plurality of second points in the search region, an accuracy of a three-dimensional point generated based on the first point; and generates a three-dimensional model of the subject using a result of the search and the accuracy.
- A three-dimensional model generation method executed by an information processing apparatus, the method comprising: acquiring a first image obtained by photographing a subject from a first viewpoint and a second image obtained by photographing the subject from a second viewpoint; specifying a search region of the second image based on a first point of the first image in order to search for a similar point similar to the first point; searching for the similar point in the search region by calculating similarities each indicating a degree to which each of a plurality of second points in the search region is similar to the first point; and generating a three-dimensional model of the subject based on a result of the search and variation of the plurality of calculated similarities.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21775312.8A EP4131166A4 (en) | 2020-03-27 | 2021-03-23 | THREE-DIMENSIONAL MODEL GENERATION METHOD AND THREE-DIMENSIONAL MODEL GENERATION DEVICE |
JP2022510572A JPWO2021193672A1 (ja) | 2020-03-27 | 2021-03-23 | |
US17/943,479 US20230005216A1 (en) | 2020-03-27 | 2022-09-13 | Three-dimensional model generation method and three-dimensional model generation device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2020058979 | 2020-03-27 | ||
JP2020-058979 | 2020-03-27 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/943,479 Continuation US20230005216A1 (en) | 2020-03-27 | 2022-09-13 | Three-dimensional model generation method and three-dimensional model generation device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021193672A1 true WO2021193672A1 (ja) | 2021-09-30 |
Family
ID=77890639
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/012093 WO2021193672A1 (ja) | 2020-03-27 | 2021-03-23 | 三次元モデル生成方法及び三次元モデル生成装置 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20230005216A1 (ja) |
EP (1) | EP4131166A4 (ja) |
JP (1) | JPWO2021193672A1 (ja) |
WO (1) | WO2021193672A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023095375A1 (ja) * | 2021-11-29 | 2023-06-01 | パナソニックIpマネジメント株式会社 | 三次元モデル生成方法及び三次元モデル生成装置 |
CN116630550A (zh) * | 2023-07-21 | 2023-08-22 | 方心科技股份有限公司 | 一种基于多图片的三维模型生成方法及系统 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000171214A (ja) * | 1998-12-08 | 2000-06-23 | Meidensha Corp | 対応点検索方法及びこれを利用した三次元位置計測方法 |
JP2007132715A (ja) * | 2005-11-08 | 2007-05-31 | Univ Chuo | 三次元計測方法、計測装置、復元方法および復元装置 |
JP2011221988A (ja) * | 2010-03-24 | 2011-11-04 | National Institute Of Advanced Industrial & Technology | ステレオ画像による3次元位置姿勢計測装置、方法およびプログラム |
JP2014102805A (ja) * | 2012-11-22 | 2014-06-05 | Canon Inc | 情報処理装置、情報処理方法及びプログラム |
JP2017130146A (ja) | 2016-01-22 | 2017-07-27 | キヤノン株式会社 | 画像管理装置、画像管理方法及びプログラム |
WO2019160032A1 (ja) * | 2018-02-14 | 2019-08-22 | オムロン株式会社 | 3次元計測システム及び3次元計測方法 |
JP2020047049A (ja) * | 2018-09-20 | 2020-03-26 | ファナック株式会社 | 画像処理装置及び画像処理方法 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6647146B1 (en) * | 1997-08-05 | 2003-11-11 | Canon Kabushiki Kaisha | Image processing apparatus |
US7574067B2 (en) * | 2003-10-03 | 2009-08-11 | General Electric Company | Surface reconstruction and registration with a helmholtz reciprocal image pair |
US8072448B2 (en) * | 2008-01-15 | 2011-12-06 | Google Inc. | Three-dimensional annotations for street view data |
US9525862B2 (en) * | 2011-08-31 | 2016-12-20 | Metaio Gmbh | Method for estimating a camera motion and for determining a three-dimensional model of a real environment |
US9542773B2 (en) * | 2013-05-23 | 2017-01-10 | Google Inc. | Systems and methods for generating three-dimensional models using sensed position data |
KR102209008B1 (ko) * | 2014-02-17 | 2021-01-28 | 삼성전자주식회사 | 카메라 포즈 추정 장치 및 카메라 포즈 추정 방법 |
US9436987B2 (en) * | 2014-04-30 | 2016-09-06 | Seiko Epson Corporation | Geodesic distance based primitive segmentation and fitting for 3D modeling of non-rigid objects from 2D images |
WO2016185024A1 (en) * | 2015-05-20 | 2016-11-24 | Cognimatics Ab | Method and arrangement for calibration of cameras |
CN106887018B (zh) * | 2015-12-15 | 2021-01-05 | 株式会社理光 | 立体匹配方法、控制器和系统 |
-
2021
- 2021-03-23 JP JP2022510572A patent/JPWO2021193672A1/ja active Pending
- 2021-03-23 EP EP21775312.8A patent/EP4131166A4/en active Pending
- 2021-03-23 WO PCT/JP2021/012093 patent/WO2021193672A1/ja active Application Filing
-
2022
- 2022-09-13 US US17/943,479 patent/US20230005216A1/en active Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000171214A (ja) * | 1998-12-08 | 2000-06-23 | Meidensha Corp | 対応点検索方法及びこれを利用した三次元位置計測方法 |
JP2007132715A (ja) * | 2005-11-08 | 2007-05-31 | Univ Chuo | 三次元計測方法、計測装置、復元方法および復元装置 |
JP2011221988A (ja) * | 2010-03-24 | 2011-11-04 | National Institute Of Advanced Industrial & Technology | ステレオ画像による3次元位置姿勢計測装置、方法およびプログラム |
JP2014102805A (ja) * | 2012-11-22 | 2014-06-05 | Canon Inc | 情報処理装置、情報処理方法及びプログラム |
JP2017130146A (ja) | 2016-01-22 | 2017-07-27 | キヤノン株式会社 | 画像管理装置、画像管理方法及びプログラム |
WO2019160032A1 (ja) * | 2018-02-14 | 2019-08-22 | オムロン株式会社 | 3次元計測システム及び3次元計測方法 |
JP2020047049A (ja) * | 2018-09-20 | 2020-03-26 | ファナック株式会社 | 画像処理装置及び画像処理方法 |
Non-Patent Citations (1)
Title |
---|
See also references of EP4131166A4 |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023095375A1 (ja) * | 2021-11-29 | 2023-06-01 | パナソニックIpマネジメント株式会社 | 三次元モデル生成方法及び三次元モデル生成装置 |
CN116630550A (zh) * | 2023-07-21 | 2023-08-22 | 方心科技股份有限公司 | 一种基于多图片的三维模型生成方法及系统 |
CN116630550B (zh) * | 2023-07-21 | 2023-10-20 | 方心科技股份有限公司 | 一种基于多图片的三维模型生成方法及系统 |
Also Published As
Publication number | Publication date |
---|---|
US20230005216A1 (en) | 2023-01-05 |
EP4131166A4 (en) | 2023-05-10 |
EP4131166A1 (en) | 2023-02-08 |
JPWO2021193672A1 (ja) | 2021-09-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021140886A1 (ja) | 三次元モデル生成方法、情報処理装置およびプログラム | |
CN110176032B (zh) | 一种三维重建方法及装置 | |
CN112785702A (zh) | 一种基于2d激光雷达和双目相机紧耦合的slam方法 | |
CN109472828B (zh) | 一种定位方法、装置、电子设备及计算机可读存储介质 | |
CN110044374B (zh) | 一种基于图像特征的单目视觉测量里程的方法及里程计 | |
JP2011174879A (ja) | 位置姿勢推定装置及びその方法 | |
US20230005216A1 (en) | Three-dimensional model generation method and three-dimensional model generation device | |
JP7407428B2 (ja) | 三次元モデル生成方法及び三次元モデル生成装置 | |
US10607350B2 (en) | Method of detecting and describing features from an intensity image | |
US11062521B2 (en) | Virtuality-reality overlapping method and system | |
CN109902675B (zh) | 物体的位姿获取方法、场景重构的方法和装置 | |
CN115035235A (zh) | 三维重建方法及装置 | |
CN110825079A (zh) | 一种地图构建方法及装置 | |
US20240296621A1 (en) | Three-dimensional model generation method and three-dimensional model generation device | |
CN111798507A (zh) | 一种输电线安全距离测量方法、计算机设备和存储介质 | |
JP6922348B2 (ja) | 情報処理装置、方法、及びプログラム | |
CN111105467A (zh) | 一种图像标定方法、装置及电子设备 | |
CN112598736A (zh) | 一种基于地图构建的视觉定位方法及装置 | |
WO2021100681A1 (ja) | 三次元モデル生成方法及び三次元モデル生成装置 | |
WO2022124017A1 (ja) | 三次元モデル生成方法及び三次元モデル生成装置 | |
JP7075090B1 (ja) | 情報処理システム、及び、情報処理方法 | |
Liebold et al. | Integrated Georeferencing of LIDAR and Camera Data acquired from a moving platform | |
JP2022112168A (ja) | 情報処理装置、情報処理方法、およびプログラム | |
CN117218066A (zh) | 深度相机成像质量的评估方法与系统 | |
TW202143176A (zh) | 基於光標籤的場景重建系統 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21775312 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022510572 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2021775312 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2021775312 Country of ref document: EP Effective date: 20221027 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |