WO2023095375A1 - Three-dimensional model generation method and three-dimensional model generation device - Google Patents

Three-dimensional model generation method and three-dimensional model generation device

Info

Publication number
WO2023095375A1
WO2023095375A1 PCT/JP2022/025296 JP2022025296W WO2023095375A1 WO 2023095375 A1 WO2023095375 A1 WO 2023095375A1 JP 2022025296 W JP2022025296 W JP 2022025296W WO 2023095375 A1 WO2023095375 A1 WO 2023095375A1
Authority
WO
WIPO (PCT)
Prior art keywords
camera
distance
subject
image
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/025296
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
研翔 寺西
徹 松延
哲史 吉川
ジョージ ナダー
ヂァン ウー
ポンサック ラサン
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Intellectual Property Management Co Ltd
Original Assignee
Panasonic Intellectual Property Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panasonic Intellectual Property Management Co Ltd
Priority to CN202280076021.4A (CN118266003A, zh)
Priority to EP22898160.1A (EP4443383A4, en)
Priority to JP2023563508A (JP7692175B2, ja)
Publication of WO2023095375A1 (ja)
Priority to US18/663,702 (US20240296621A1, en)
Anticipated expiration
Ceased (current legal status)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three-dimensional [3D] modelling for computer graphics
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00 Indexing scheme for image generation or computer graphics
    • G06T2210/21 Collision detection, intersection

Definitions

  • The present disclosure relates to a 3D model generation method and a 3D model generation device.
  • Patent Document 1 discloses a technique for generating a three-dimensional model of a subject using multiple images obtained by photographing the subject from multiple viewpoints.
  • The present disclosure provides a three-dimensional model generation method and the like that can improve the generation accuracy of a three-dimensional model and reduce the processing time of three-dimensional model generation processing.
  • The 3D model generation method is executed by an information processing device and includes: acquiring subject information including a plurality of positions on a subject in a three-dimensional space; acquiring a first camera image of the subject taken from a first viewpoint and a second camera image of the subject taken from a second viewpoint; determining, based on the subject information and without using map information, a search range in the three-dimensional space that includes a first three-dimensional point on the subject corresponding to a first point in the first camera image; performing matching that searches for a similar point, that is, a point similar to the first point, in a range on the second camera image corresponding to the search range; and generating a three-dimensional model using the search result of the matching. Here, the map information is obtained by camera calibration performed by having one or more cameras photograph the subject from a plurality of viewpoints including the first viewpoint and the second viewpoint, and includes a plurality of three-dimensional points each indicating a position on the subject in the three-dimensional space.
  • A 3D model generation device includes a processor and a memory. Using the memory, the processor: acquires subject information including a plurality of positions on a subject in a three-dimensional space; acquires a first camera image of the subject taken from a first viewpoint and a second camera image of the subject taken from a second viewpoint; determines, based on the subject information and without using map information, a search range in the three-dimensional space that includes a first three-dimensional point on the subject corresponding to a first point in the first camera image; performs matching that searches for a similar point that is similar to the first point in a range on the second camera image corresponding to the search range; and generates a three-dimensional model using the search result of the matching. Here, the map information is obtained by camera calibration performed by having one or more cameras photograph the subject from a plurality of viewpoints including the first viewpoint and the second viewpoint, and includes a plurality of three-dimensional points each indicating a position on the subject in the three-dimensional space.
  • Another 3D model generation device includes a memory and a processor connected to the memory. The processor: acquires a first camera image generated by photographing a subject in a three-dimensional space from a first viewpoint and a second camera image generated by photographing the subject from a second viewpoint; searches for a second point similar to a first point of the first camera image in a search range on an epipolar line, the epipolar line being specified by projecting onto the second camera image a straight line passing through the first viewpoint and the first point; and generates a three-dimensional model based on the result of the search. The search range is set based on the position, in the three-dimensional space, of a first three-dimensional point corresponding to the first point, and the position is calculated based on a reflected wave of an electromagnetic wave emitted toward the subject.
  • The present disclosure may be implemented as a program that causes a computer to execute the steps included in the three-dimensional model generation method.
  • The present disclosure may also be implemented as a non-transitory, computer-readable recording medium, such as a CD-ROM, on which the program is recorded.
  • The present disclosure may also be realized as information, data, or signals representing the program. The program, information, data, and signals may be distributed over a communication network such as the Internet.
  • FIG. 1 is a diagram for explaining an outline of a three-dimensional model generation method according to an embodiment.
  • FIG. 2 is a block diagram showing a characteristic configuration of the 3D model generation system according to the embodiment.
  • FIG. 3 is a diagram for explaining camera calibration by the estimation device.
  • FIG. 4A is a diagram for explaining a first example of processing for selecting a target frame.
  • FIG. 4B is a diagram for explaining a second example of processing for selecting a target frame.
  • FIG. 4C is a diagram for explaining a third example of processing for selecting a target frame.
  • FIG. 4D is a diagram for explaining a fourth example of processing for selecting a target frame.
  • FIG. 4E is a diagram for explaining a fifth example of processing for selecting a target frame.
  • FIG. 5A is a diagram for explaining a problem when only the first distance information is used.
  • FIG. 5B is a diagram showing an example of estimating the position of the first three-dimensional point using the second distance information.
  • FIG. 6 is a diagram for explaining matching processing when the search range is not limited.
  • FIG. 7 is a diagram for explaining matching processing when the search range is limited.
  • FIG. 8 is a flow chart showing an example of the operation of the 3D model generation device.
  • FIG. 9 is a block diagram showing a characteristic configuration of the 3D model generation system according to Modification 1.
  • FIG. 10 is a diagram showing an example of the configuration of a camera group.
  • FIG. 11 is a flow chart showing an example of the operation of the sensor fusion device according to Modification 1.
  • FIG. 12 is a diagram for explaining an example of movement of the sensor device with respect to the subject.
  • FIG. 13 is a diagram showing an example of a three-dimensional point cloud integrated with camera images.
  • FIG. 14 is a diagram showing an example of a time series three-dimensional point group.
  • FIG. 15 is a diagram for explaining the integration of the camera image integration 3D point cloud and the time-series 3D point cloud.
  • The technique of Patent Document 1 generates a three-dimensional model by searching for similar points between a plurality of images.
  • In general, when another image is searched for a point similar to one pixel in one image, the epipolar line on the other image is calculated from the geometric constraints of the cameras, and all pixels on the epipolar line are searched. There is therefore room for improving the processing speed of the similar-point search.
  • In addition, an erroneous similar point may be found, in which case the accuracy of the search decreases.
  • The problem of finding erroneous similar points also arises when the search covers not only the epipolar line but a wider search range, such as the entire image or a predetermined area.
  • Accordingly, the present disclosure provides a three-dimensional model generation method and the like that can improve the generation accuracy of the three-dimensional model and reduce the processing time of three-dimensional model generation processing.
  • The 3D model generation method is executed by an information processing device and includes: acquiring subject information including a plurality of positions on a subject in a three-dimensional space; acquiring a first camera image of the subject taken from a first viewpoint and a second camera image of the subject taken from a second viewpoint; determining, based on the subject information and without using map information, a search range in the three-dimensional space that includes a first three-dimensional point on the subject corresponding to a first point in the first camera image; performing matching that searches for a similar point, that is, a point similar to the first point, in a range on the second camera image corresponding to the search range; and generating a three-dimensional model using the search result of the matching. Here, the map information is obtained by camera calibration performed by having one or more cameras photograph the subject from a plurality of viewpoints including the first viewpoint and the second viewpoint, and includes a plurality of three-dimensional points each indicating a position on the subject in the three-dimensional space.
  • According to this, the search range is determined based on the subject information without using the map information, and a similar point that is similar to the first point on the first camera image is searched for in the range on the second camera image that corresponds to, and is limited by, the search range. Because the similar point is searched for in a range where it is highly likely to exist according to the subject information, the accuracy of the similar-point search can be improved and the time required for the search processing can be reduced. Therefore, the generation accuracy of the three-dimensional model can be improved, and the processing time of the three-dimensional model generation processing can be shortened.
  • In the matching, the epipolar line corresponding to the first point in the second camera image may be limited to a length corresponding to the search range, and the similar point that is similar to the first point may be searched for on the limited epipolar line in the second camera image.
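  • As an illustration of limiting the epipolar line, the following minimal sketch projects the near and far ends of a depth interval into the second image. It assumes an ideal pinhole model with identical intrinsics (`f`, `cx`, `cy`) for both cameras and a known relative pose (`rotation`, `translation`) mapping first-camera coordinates to second-camera coordinates. All function and parameter names are hypothetical and do not appear in the disclosure.

```python
def backproject(pixel, depth, f, cx, cy):
    """Return the 3D point, in first-camera coordinates, at z = depth on the ray through pixel."""
    u, v = pixel
    return ((u - cx) / f * depth, (v - cy) / f * depth, depth)


def project(point, rotation, translation, f, cx, cy):
    """Project a first-camera-frame point into the second image, using x2 = R @ x1 + t."""
    x, y, z = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if z <= 0:
        raise ValueError("point is behind the second camera")
    return (f * x / z + cx, f * y / z + cy)


def limited_epipolar_segment(pixel, depth_min, depth_max, rotation, translation, f, cx, cy):
    """Return the two endpoints of the epipolar segment that corresponds to the search range."""
    near = backproject(pixel, depth_min, f, cx, cy)
    far = backproject(pixel, depth_max, f, cx, cy)
    return (
        project(near, rotation, translation, f, cx, cy),
        project(far, rotation, translation, f, cx, cy),
    )


# Assumed example: second camera shifted one unit to the right, no rotation.
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
segment = limited_epipolar_segment(
    (0.0, 0.0), 2.0, 4.0, identity, (-1.0, 0.0, 0.0), 100.0, 0.0, 0.0
)
print(segment)
```

  • Only pixels between the two returned endpoints would then be compared with the first point, rather than every pixel on the full epipolar line.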
  • The subject information may include a distance image generated by measurement by a distance image sensor. The distance image includes a plurality of pixels each having distance information indicating the distance from the distance image sensor to the subject, and in the determining, the search range may be determined based on the distance information of the pixel corresponding to the first point in the distance image.
  • According to this, because the subject information includes a distance image having a plurality of pixels associated with the plurality of pixels of the first camera image, the distance information corresponding to the first point can be easily identified. Therefore, the position of the first three-dimensional point can be estimated based on the identified distance information, and the search range can be determined with high accuracy.
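  • The lookup described above can be sketched as follows. This is a minimal illustration that assumes the distance image has the same resolution as the first camera image, is stored as a list of rows, and marks missing measurements with `None` or a non-positive value. The `margin` and `min_depth` parameters are hypothetical tuning values, not values from the disclosure.

```python
def search_range_from_depth(distance_image, pixel, margin, min_depth=1e-3):
    """Return (near, far) depths around the measured distance, or None if nothing was measured."""
    u, v = pixel
    distance = distance_image[v][u]  # row index is v, column index is u
    if distance is None or distance <= 0.0:
        return None  # no valid measurement for this pixel
    near = max(distance - margin, min_depth)  # keep the near end in front of the camera
    return (near, distance + margin)


# Assumed 2x2 distance image in metres; 0.0 and None mark missing data.
distance_image = [[3.0, 0.0], [None, 1.0]]
print(search_range_from_depth(distance_image, (0, 0), 0.5))
```

  • When the function returns `None`, a caller could, for example, fall back to searching the whole epipolar line for that pixel.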
  • The subject information may include a plurality of distance images respectively generated by measurement by a plurality of distance image sensors. Each of the plurality of distance images includes a plurality of pixels each having distance information indicating the distance from the distance image sensor that generated the distance image to the subject. The plurality of camera images include the first camera image and the second camera image. In the determining, the search range may be determined based on one or more pieces of distance information possessed by one or more pixels that each correspond to the first point in one or more distance images among the plurality of distance images.
  • According to this, because the subject information includes a plurality of distance images having a plurality of pixels associated with the plurality of pixels of the first camera image, the plurality of pieces of distance information corresponding to the first point can be easily identified.
  • The pieces of distance information identified in this way are obtained from different viewpoints. Therefore, even if some of them contain detection errors, the other pieces can be used to reduce the impact of those errors. The position of the first three-dimensional point can thus be estimated more accurately based on one or more of the plurality of pieces of distance information, and the search range can be determined with high accuracy.
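  • One simple way to keep a single bad measurement from dominating is to combine the candidates with a median. The sketch below is an assumption for illustration only, since the disclosure does not name a specific fusion rule. It also assumes that every candidate has already been converted to a depth along the viewing ray of the first camera, so that values from different viewpoints are comparable.

```python
from statistics import median


def fuse_distances(distances):
    """Return the median of the valid candidate distances, or None if none are valid."""
    valid = [d for d in distances if d is not None and d > 0.0]
    return median(valid) if valid else None


# Assumed example: the third candidate is a gross detection error.
print(fuse_distances([2.0, 2.1, 9.0]))
```

  • A median ignores an isolated outlier as long as most candidates agree, which matches the stated goal of reducing the impact of detection errors.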
  • In the determining, the search range may be determined using, as the one or more pieces of distance information, third distance information that corresponds to the first point and is calculated using two or more camera images other than the first camera image.
  • According to this, using the third distance information allows the search range to be determined with high accuracy.
  • The positions and orientations of the plurality of distance image sensors may respectively correspond to those of a plurality of cameras including the one or more cameras, and in the determining, the one or more pixels that each correspond to the first point in the one or more distance images may be identified using the positions and orientations of the plurality of cameras obtained by the camera calibration.
  • According to this, one or more pieces of distance information can be identified using the positions and orientations of the plurality of cameras obtained by camera calibration.
  • The one or more distance images may include a first distance image corresponding to the first camera image and a second distance image corresponding to the second camera image, and the second camera image may be determined from the plurality of camera images based on the number of feature points shared with the first camera image in the feature point matching of the camera calibration.
  • According to this, the second camera image to be matched against the first camera image for similar points is determined based on the number of feature points. This makes it possible to identify a second distance image for identifying one or more pieces of distance information that are highly likely to contain no error, that is, that are highly accurate.
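  • A minimal sketch of such a selection is shown below. It assumes that calibration has produced a table of matched feature point counts per frame pair and that the frame sharing the most matches with the first camera image is preferred. The disclosure only states that the number of feature points is the criterion, so the "largest count wins" rule, the table layout, and all names are hypothetical.

```python
def select_second_image(first_id, match_counts, min_matches=1):
    """Return the id of the frame sharing the most matched feature points with first_id."""
    best_id, best_count = None, min_matches - 1
    for (id_a, id_b), count in match_counts.items():
        if first_id == id_a:
            other = id_b
        elif first_id == id_b:
            other = id_a
        else:
            continue  # this pair does not involve the first camera image
        if count > best_count:
            best_id, best_count = other, count
    return best_id


# Assumed match counts between three frames.
match_counts = {("f1", "f2"): 120, ("f1", "f3"): 45, ("f2", "f3"): 300}
print(select_second_image("f1", match_counts))
```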
  • The second camera image may be determined based on a difference in shooting orientation calculated from a first position and orientation, at the time of shooting, of the camera that shot the first camera image and a second position and orientation, at the time of shooting, of the camera that shot the second camera image.
  • According to this, the second camera image to be matched against the first camera image for similar points is determined based on the difference in camera orientation. This makes it possible to identify a second distance image for identifying one or more pieces of distance information that are highly likely to contain no error, that is, that are highly accurate.
  • The second camera image may be determined based on a difference in shooting position calculated from a first position and orientation, at the time of shooting, of the camera that shot the first camera image and a second position and orientation, at the time of shooting, of the camera that shot the second camera image.
  • According to this, the second camera image to be matched against the first camera image for similar points is determined based on the difference in camera position. This makes it possible to identify a second distance image for identifying one or more pieces of distance information that are highly likely to contain no error, that is, that are highly accurate.
  • The difference between the maximum value and the minimum value of the one or more pieces of distance information may be less than a first value.
  • According to this, one or more pieces of distance information are identified for which the difference between the maximum value and the minimum value is less than the first value. This makes it possible to identify one or more pieces of distance information that are highly likely to contain no error, that is, that are highly accurate.
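  • The condition on the maximum and minimum values can be sketched as a selection step. The example below keeps the largest group of candidates whose spread is below the first value. Choosing the largest such group is an assumption made for illustration; the disclosure states only the condition itself, and the function name is hypothetical.

```python
def largest_consistent_subset(distances, first_value):
    """Return the largest group of distances whose (max - min) is below first_value."""
    if first_value <= 0:
        raise ValueError("first_value must be positive")
    values = sorted(distances)
    best = []
    start = 0
    for end in range(len(values)):
        # Shrink the window from the left until its spread is below the threshold.
        while values[end] - values[start] >= first_value:
            start += 1
        if end - start + 1 > len(best):
            best = values[start:end + 1]
    return best


# Assumed example: 5.0 is inconsistent with the other three measurements.
print(largest_consistent_subset([2.0, 5.0, 2.1, 2.05], 0.2))
```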
  • In the determining, the search range may be widened as the accuracy of the one or more pieces of distance information decreases.
  • According to this, the search range is widened as the accuracy of the one or more pieces of distance information decreases, so the search range can be determined according to the accuracy.
  • The accuracy may be regarded as higher as the number of the one or more pieces of distance information increases.
  • According to this, the search range can be narrowed as the number of pieces of distance information increases.
  • The accuracy may be regarded as higher as the variance of the one or more pieces of distance information decreases.
  • According to this, the search range can be narrowed as the variance of the pieces of distance information decreases.
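  • The two accuracy cues above (count and variance) can be combined into a single half-width for the search range. The formula below is only one possible choice under stated assumptions: the half-width shrinks in inverse proportion to the number of samples and grows linearly with their standard deviation. The parameters `base_margin`, `spread_gain`, and `max_margin` are hypothetical and are not taken from the disclosure.

```python
from math import sqrt
from statistics import pvariance


def search_margin(distances, base_margin=0.4, spread_gain=3.0, max_margin=1.0):
    """Return a half-width for the search range that grows as accuracy drops."""
    count = len(distances)
    if count == 0:
        return max_margin  # no measurement: use the widest allowed range
    spread = sqrt(pvariance(distances)) if count > 1 else 0.0
    # More samples narrow the range; a larger variance widens it.
    margin = base_margin / count + spread_gain * spread
    return min(margin, max_margin)


print(search_margin([2.0]), search_margin([2.0, 2.0]), search_margin([1.9, 2.1]))
```

  • With these assumed defaults, two identical measurements give a narrower range than a single one, and two disagreeing measurements give a wider range than two identical ones.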
  • The subject information may be generated based on two or more types of sensor information.
  • According to this, the subject information is generated based on two or more different types of sensor information, so subject information in which the loss of accuracy due to detection errors is reduced can be obtained.
  • The two or more types of sensor information may include a plurality of two-dimensional images obtained from a stereo camera and three-dimensional data obtained from a measuring instrument that emits electromagnetic waves and acquires reflected waves of the electromagnetic waves reflected by the subject.
  • According to this, the subject information is generated based on the plurality of two-dimensional images and the three-dimensional data, so that data can be obtained with high accuracy.
  • A 3D model generation device includes a processor and a memory. Using the memory, the processor: acquires subject information including a plurality of positions on a subject in a three-dimensional space; acquires a first camera image of the subject taken from a first viewpoint and a second camera image of the subject taken from a second viewpoint; determines, based on the subject information and without using map information, a search range in the three-dimensional space that includes a first three-dimensional point on the subject corresponding to a first point in the first camera image; performs matching that searches for a similar point that is similar to the first point in a range on the second camera image corresponding to the search range; and generates a three-dimensional model using the search result of the matching. Here, the map information is obtained by camera calibration performed by having one or more cameras photograph the subject from a plurality of viewpoints including the first viewpoint and the second viewpoint, and includes a plurality of three-dimensional points each indicating a position on the subject in the three-dimensional space.
  • According to this, the search range is determined based on the subject information without using the map information, and a similar point that is similar to the first point on the first camera image is searched for in the range on the second camera image that corresponds to, and is limited by, the search range. Because the similar point is searched for in a range where it is highly likely to exist according to the subject information, the accuracy of the similar-point search can be improved and the time required for the search processing can be reduced. Therefore, the generation accuracy of the three-dimensional model can be improved, and the processing time of the three-dimensional model generation processing can be shortened.
  • Another 3D model generation device includes a memory and a processor connected to the memory. The processor: acquires a first camera image generated by photographing a subject in a three-dimensional space from a first viewpoint and a second camera image generated by photographing the subject from a second viewpoint; searches for a second point similar to a first point of the first camera image in a search range on an epipolar line, the epipolar line being specified by projecting onto the second camera image a straight line passing through the first viewpoint and the first point; and generates a three-dimensional model based on the result of the search. The search range is set based on the position, in the three-dimensional space, of a first three-dimensional point corresponding to the first point, and the position is calculated based on a reflected wave of an electromagnetic wave emitted toward the subject.
  • According to this, the search range is determined based on the position of the first three-dimensional point obtained from the reflected wave of the electromagnetic wave, and a similar point that is similar to the first point on the first camera image is searched for in the range on the second camera image that corresponds to, and is limited by, the search range. Because the similar point is searched for in a range where it is highly likely to exist, the accuracy of the similar-point search can be improved and the time required for the search processing can be reduced. Therefore, the generation accuracy of the three-dimensional model can be improved, and the processing time of the three-dimensional model generation processing can be shortened.
  • The position may be obtained based on a distance image generated by a sensor that receives the reflected wave.
  • According to this, the position of the first three-dimensional point can be easily identified based on the distance image, so the search range can be determined with high accuracy.
  • Each figure is a schematic diagram and is not necessarily drawn precisely.
  • symbol is attached
  • FIG. 1 is a diagram for explaining the outline of the three-dimensional model generation method according to the embodiment.
  • FIG. 2 is a block diagram showing a characteristic configuration of the 3D model generation system according to the embodiment.
  • A three-dimensional model of a predetermined area is generated from a plurality of images taken from a plurality of different viewpoints using a plurality of cameras 310.
  • The predetermined area is an area that includes a stationary object, a moving object such as a person, or both. In other words, the predetermined area is, for example, an area that includes at least one of a stationary object and a moving object as a subject.
  • Examples of predetermined areas containing stationary objects and moving objects include venues where sports games such as basketball are held and spaces on roads where people or cars are present. Note that the predetermined area may include not only a specific object to be photographed but also a landscape or the like.
  • FIG. 1 illustrates a case where the subject 500 is a building. Hereinafter, not only a specific object to be photographed but also a predetermined area including a landscape or the like is simply referred to as a subject.
  • As shown in FIG. 2, the 3D model generation system 400 includes a camera group 300 including a plurality of cameras 310, an estimation device 200, and a 3D model generation device 100.
  • The plurality of cameras 310 are a plurality of imaging devices that photograph a predetermined area.
  • Each of the plurality of cameras 310 photographs the subject and outputs the captured frames to the estimation device 200.
  • The plurality of captured frames is also called a multi-viewpoint image.
  • The camera group 300 includes two or more cameras 310.
  • The plurality of cameras 310 shoot the same subject from different viewpoints.
  • A frame is, in other words, an image.
  • Although the 3D model generation system 400 has been described as having the camera group 300, it is not limited to this and may include a single camera 310.
  • For example, a single camera 310 may be moved to photograph a subject existing in real space at a plurality of different timings, so that the single camera 310 generates a multi-viewpoint image composed of a plurality of frames with mutually different viewpoints.
  • Each of the plurality of frames is associated with the position and orientation of the camera 310 at the timing when the frame was captured.
  • The plurality of frames are captured (generated) by cameras 310 that differ from one another in at least one of position and orientation.
  • Cameras 310 that differ in at least one of position and orientation may be realized by a plurality of cameras 310 with fixed positions and orientations, by a single camera 310 for which at least one of position and orientation is not fixed, or by a combination of cameras 310 whose positions and orientations are fixed and cameras 310 whose positions and orientations are not fixed.
  • Each camera 310 generates a camera image.
  • A camera image has a plurality of pixels arranged two-dimensionally. Each pixel of the camera image may have color information or luminance information as a pixel value.
  • Each camera 310 may also be a camera equipped with a distance image sensor 320.
  • The distance image sensor 320 generates a distance image (depth map) by measuring the distance to the subject at each pixel position.
  • A distance image has a plurality of pixels arranged two-dimensionally. Each pixel of the distance image may have, as a pixel value, distance information indicating the distance from the camera 310 to the subject at the position corresponding to the pixel.
  • A distance image is an example of subject information including a plurality of positions on a subject in a three-dimensional space.
  • The plurality of cameras 310 are cameras each equipped with a distance image sensor 320 that generates a distance image.
  • The positions and orientations of the plurality of cameras 310 and the positions and orientations of the plurality of distance image sensors 320 are in a fixed correspondence.
  • The plurality of cameras 310 generate camera images and distance images as frames.
  • The plurality of pixels of the camera image generated by each camera 310 may be associated with the plurality of pixels of the distance image generated by that camera 310.
  • The distance image sensor 320 may be a ToF (Time of Flight) camera. Alternatively, the distance image sensor 320 may be a sensor that, like the measuring device 321 described later in Modification 1, emits an electromagnetic wave, acquires a reflected wave of the electromagnetic wave reflected by an object, and thereby generates a distance image.
  • The resolution (number of pixels) of the camera image and the resolution (number of pixels) of the distance image may be the same or different.
  • When the resolutions differ, one pixel of the lower-resolution image may be associated with a plurality of pixels of the higher-resolution image.
  • The plurality of cameras 310 may generate camera images and distance images with the same resolution, or may generate camera images and distance images with mutually different resolutions.
  • The camera image and the distance image may be output from the camera 310 as a single integrated image. That is, the integrated image may be an image including a plurality of pixels each having, as pixel values, color information indicating the color of the pixel and distance information.
  • The plurality of cameras 310 may be directly connected to the estimation device 200 by wired or wireless communication so that the frames captured by each camera can be output to the estimation device 200, or may be indirectly connected to the estimation device 200 via a hub (not shown) such as a communication device or a server.
  • The frames captured by the plurality of cameras 310 may be output to the estimation device 200 in real time.
  • Alternatively, the frames may first be recorded in an external storage device such as a memory or a cloud server, and then output from the external storage device to the estimation device 200.
  • Each of the plurality of cameras 310 may be a fixed camera such as a surveillance camera, a mobile camera such as a video camera, a smartphone, or a wearable camera, or a moving camera such as a drone with a shooting function.
  • the estimating apparatus 200 performs camera calibration by causing one or more cameras 310 to photograph a subject from multiple viewpoints.
  • the estimating apparatus 200 performs camera calibration for estimating the positions and orientations of the multiple cameras 310 based on multiple frames captured by the multiple cameras 310, for example.
  • the posture of the camera 310 indicates at least one of the shooting direction of the camera 310 and the tilt of the camera 310 .
  • the shooting direction of the camera 310 is the direction of the optical axis of the camera 310 .
  • the tilt of the camera 310 is the rotation angle of the camera 310 around the optical axis from the reference posture.
  • the estimation device 200 estimates the camera parameters of the multiple cameras 310 based on multiple frames (multiple camera images) acquired from the multiple cameras 310 .
  • the camera parameters are parameters that indicate the characteristics of the camera 310, and include internal parameters such as the focal length and image center of the camera 310, and external parameters that indicate the position (more specifically, the three-dimensional position) and orientation of the camera 310. That is, the position and orientation of each of the multiple cameras 310 are obtained by estimating the camera parameters of each of the multiple cameras 310.
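  • as an illustration only, the roles of the internal and external parameters described above can be sketched with a simple pinhole model. The `Camera` class, its field names, and the world-to-camera convention are assumptions made for this sketch and are not taken from the disclosure; lens distortion is ignored.

```python
from dataclasses import dataclass


@dataclass
class Camera:
    # Internal parameters (hypothetical field names)
    fx: float  # focal length in pixels, horizontal
    fy: float  # focal length in pixels, vertical
    cx: float  # image center, horizontal
    cy: float  # image center, vertical
    # External parameters: world -> camera rotation (3x3) and translation (3)
    R: list
    t: list


def project(cam, p_world):
    """Project a 3D world point to pixel coordinates, or None if behind the camera."""
    # World -> camera coordinates: X_c = R * X_w + t
    xc = [sum(cam.R[i][j] * p_world[j] for j in range(3)) + cam.t[i] for i in range(3)]
    if xc[2] <= 0:
        return None
    u = cam.fx * xc[0] / xc[2] + cam.cx
    v = cam.fy * xc[1] / xc[2] + cam.cy
    return (u, v)
```

  • in this sketch, a point on the optical axis projects to the image center, and moving the point sideways shifts the pixel in proportion to the focal length divided by the depth.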
  • the estimation method by which the estimation device 200 estimates the position and orientation of the camera 310 is not particularly limited.
  • the estimation device 200 may estimate the positions and orientations of the multiple cameras 310 using, for example, Visual-SLAM (Simultaneous Localization and Mapping) technology.
  • the estimation device 200 may estimate the positions and orientations of the multiple cameras 310 using, for example, Structure-From-Motion technology.
  • the estimating apparatus 200 uses Visual-SLAM technology or Structure-From-Motion technology to extract feature points 541 to 543 from each of a plurality of frames 531 to 533 captured by a plurality of cameras 310, and performs a feature point search that extracts, from among the extracted feature points 541 to 543, sets of similar points that are similar between a plurality of frames. By performing the feature point search, estimation apparatus 200 can identify points on subject 510 that appear in common in the plurality of frames 531 to 533, and the three-dimensional coordinates of each such point can be determined by the principle of triangulation.
  • the estimation device 200 can estimate the position and orientation of each camera 310 by extracting a plurality of sets of similar points and using those sets.
  • the estimating apparatus 200 calculates three-dimensional coordinates for each set of similar points, and generates map information 520 including a plurality of three-dimensional points indicated by the calculated three-dimensional coordinates.
  • Each of the multiple 3D points indicates a position on the subject in the 3D space.
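  • the triangulation mentioned above can be sketched, for two views, as finding the midpoint of the shortest segment between the two viewing rays. This midpoint method is only one common way to triangulate and is an assumption of this sketch, since the disclosure refers just to the principle of triangulation; the function name and arguments are hypothetical.

```python
def triangulate_midpoint(o1, d1, o2, d2, eps=1e-12):
    """Midpoint of the shortest segment between two viewing rays.

    o1, o2: camera centers; d1, d2: ray directions through the matched pixels.
    Returns None when the rays are (nearly) parallel.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = [a - b for a, b in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None
    # Ray parameters of the closest points on each ray
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * k for o, k in zip(o1, d1)]
    p2 = [o + t * k for o, k in zip(o2, d2)]
    return [(x + y) / 2 for x, y in zip(p1, p2)]
```

  • with noise-free matches the two rays intersect and the midpoint coincides with the point on the subject; with noisy matches it gives a compromise between the two rays.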
  • the estimation device 200 obtains the position and orientation of each camera 310 and map information as estimation results. Since the obtained map information has undergone optimization processing together with the camera parameters, it is information with higher accuracy than the predetermined accuracy.
  • the map information also includes the three-dimensional position of each of the plurality of three-dimensional points. Note that the map information may include not only a plurality of 3D positions, but also the color of each 3D point, the surface shape around each 3D point, information indicating from which frame each 3D point was generated, and the like.
  • the estimation device 200 may generate map information including a sparse three-dimensional point group by limiting the number of sets of similar points to a predetermined number. This is because the estimation apparatus 200 can estimate the position and orientation of each camera 310 with sufficient accuracy even with a predetermined number of sets of similar points.
  • the predetermined number may be determined to be a number that allows the position and orientation of each camera 310 to be estimated with sufficient accuracy.
  • the estimation apparatus 200 may estimate the position and orientation of each camera 310 using sets of similar points having a predetermined degree of similarity or higher. As a result, the estimation apparatus 200 can limit the number of sets of similar points used in the estimation process to the number of sets that are similar with the predetermined degree of similarity or more.
  • the estimation device 200 may calculate the distance between the camera 310 and the subject as a camera parameter, for example, based on the position and orientation of the camera 310 estimated using the above technology.
  • the three-dimensional model generation system 400 may include a distance sensor, and the distance between the camera 310 and the subject may be measured using the distance sensor.
  • the estimating device 200 may be directly connected to the 3D model generating device 100 by wired communication or wireless communication, or may be indirectly connected to the 3D model generating device 100 via a hub (not shown) such as a communication device or a server. Thereby, the estimation device 200 outputs the frames received from the cameras 310 and the estimated camera parameters of the cameras 310 to the 3D model generation device 100.
  • the estimation result of the estimation device 200 may be output to the three-dimensional model generation device 100 in real time. Also, the estimation result may first be recorded in an external storage device such as a memory or a cloud server, and then output from the external storage device to the 3D model generation device 100.
  • the estimating device 200 includes at least a computer system that has, for example, a control program, a processing circuit such as a processor or logic circuit that executes the control program, and a recording device such as an internal memory or an accessible external memory that stores the control program.
  • the 3D model generation device 100 generates a 3D model of a predetermined area based on a plurality of frames captured by a plurality of cameras 310 and the estimation results of the estimation device 200 (the position and orientation of each camera 310). Specifically, the 3D model generating apparatus 100 is a device that executes a three-dimensional model generation process for generating a 3D model of a subject in a virtual 3D space based on the camera parameters of the plurality of cameras 310 and a plurality of frames.
  • the 3D model of the subject is data including the 3D shape of the subject and the color of the subject, restored in a virtual 3D space from the frame in which the real object of the subject was shot.
  • a 3D model of a subject is a set of points representing the 3D positions of a plurality of points on the subject captured in multiple camera images taken by a plurality of cameras 310 from multiple different viewpoints.
  • a three-dimensional position is represented, for example, by three-value information consisting of an X component, a Y component, and a Z component that indicate positions along mutually orthogonal X, Y, and Z axes.
  • the three-dimensional position is not limited to coordinates represented by a rectangular coordinate system, and may be coordinates represented by a polar coordinate system.
  • the information contained in the plurality of points indicating the three-dimensional positions may include not only the three-dimensional position (that is, information indicating coordinates), but also information indicating the color of each point, information indicating the surface shape of each point and its surroundings, and the like.
  • the three-dimensional model generation device 100 includes at least a computer system that has, for example, a control program, a processing circuit such as a processor or a logic circuit that executes the control program, and a recording device such as an internal memory or an accessible external memory that stores the control program.
  • the 3D model generation device 100 is an information processing device.
  • the function of each processing unit of the 3D model generation device 100 may be realized by software or by hardware.
  • the 3D model generation device 100 may store camera parameters in advance. In this case, the 3D model generation system 400 does not need to include the estimation device 200 . Also, the plurality of cameras 310 may be communicably connected to the 3D model generation device 100 wirelessly or by wire.
  • the plurality of frames captured by the camera 310 may be directly output to the 3D model generation device 100.
  • the camera 310 may be directly connected to the 3D model generation device 100 by, for example, wired or wireless communication, or may be indirectly connected to the 3D model generation device 100 via a hub (not shown) such as a communication device or a server.
  • the 3D model generation device 100 is a device that generates a 3D model from a plurality of frames.
  • the 3D model generation device 100 includes a reception unit 110 , a storage unit 120 , an acquisition unit 130 , a determination unit 140 , a generation unit 150 and an output unit 160 .
  • the receiving unit 110 receives from the estimating device 200 a plurality of frames captured by the plurality of cameras 310 and an estimation result including the position and orientation of each camera 310 obtained by the estimating device 200 .
  • the receiving unit 110 receives the first frame (the first camera image and the first distance image) of the subject photographed from the first viewpoint and the second frame (the second camera image and the second distance image) of the subject photographed from the second viewpoint. That is, the plurality of frames received by the receiver 110 includes the first frame and the second frame.
  • Receiving section 110 outputs the received frames and the estimation result to storage section 120 .
  • the receiving unit 110 is, for example, a communication interface for communicating with the estimating device 200.
  • the receiving unit 110 includes, for example, an antenna and a wireless communication circuit.
  • the receiving unit 110 includes, for example, a connector connected to a communication line and a wired communication circuit. Note that the receiving unit 110 may receive a plurality of frames from the plurality of cameras 310 without going through the estimating device 200 .
  • the storage unit 120 stores multiple frames and estimation results received by the receiving unit 110 . By storing a plurality of frames, the storage unit 120 stores a distance image, which is an example of subject information, included in the plurality of frames. The storage unit 120 also stores the search range calculated by the determination unit 140 . Note that the storage unit 120 may store the processing result of the processing unit included in the 3D model generation device 100 . The storage unit 120 stores, for example, a control program for causing a processing circuit to execute processing by each processing unit included in the 3D model generation device 100 . The storage unit 120 is realized by, for example, an HDD (Hard Disk Drive), flash memory, or the like.
  • the acquisition unit 130 acquires the plurality of frames stored in the storage unit 120 and the camera parameters of each camera 310 among the estimation results from the storage unit 120 and outputs them to the determination unit 140 and the generation unit 150 .
  • the 3D model generation device 100 does not have to include the storage unit 120 and the acquisition unit 130 .
  • the receiving unit 110 may output a plurality of frames received from the plurality of cameras 310 and the camera parameters of each camera 310 among the estimation results received from the estimating device 200 to the determining unit 140 and the generating unit 150.
  • the determination unit 140 associates the plurality of pixels of the camera image with the plurality of pixels of the distance image. Note that when the plurality of pixels of the camera image obtained by each camera 310 and the plurality of pixels of the distance image are associated in advance, the determination unit 140 does not need to perform this association processing.
  • the determination unit 140 determines a search range to be used for searching for multiple similar points between multiple frames based on the subject information acquired from the storage unit 120 by the acquisition unit 130, without using map information.
  • the search range is a range on the three-dimensional space including the first three-dimensional point on the object corresponding to the first point on the first frame.
  • the search range can also be said to be a range in the three-dimensional space in which the first three-dimensional point is likely to exist.
  • the search range is the range in the shooting direction from the first viewpoint where the first frame was shot.
  • the search range is used to search for a plurality of similar points between the first frame and the second frame, within the range on the second frame that corresponds to the search range; the second frame is a frame, among the plurality of frames, that is different from the first frame.
  • the second frame is a frame to be searched for points similar to those of the first frame.
  • the similar point search may be performed with any frame different from the first frame among the plurality of frames. That is, the frame selected as the second frame is not limited to one frame, and may be a plurality of frames.
  • the determining unit 140 may estimate the position of the first three-dimensional point based on the first distance information possessed by the pixel corresponding to the first point in the first distance image included in the first frame, and determine the search range based on the estimated position of the first three-dimensional point.
  • the determination unit 140 may determine a range within a predetermined distance from the estimated position of the first three-dimensional point as the search range.
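  • a minimal sketch of this step, assuming the distance information of a pixel is a depth measured along the optical axis and the camera follows a pinhole model: the pixel is back-projected to a camera-space point, and the search range is taken as a depth interval around it. The function names, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the `margin` parameter standing in for the predetermined distance are all hypothetical.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Camera-space 3D point for pixel (u, v) with the given depth along the optical axis."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def search_range(depth, margin):
    """Depth interval along the viewing ray within `margin` of the estimated point."""
    near = max(depth - margin, 0.0)  # the range cannot extend behind the camera
    far = depth + margin
    return (near, far)
```

  • for example, a pixel at the image center with a depth of 2.0 maps to a point on the optical axis, and a margin of 0.5 yields a search range from 1.5 to 2.5 along the shooting direction.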
  • the determining unit 140 may select, from among the plurality of frames other than the first frame, one or more second frames that have distance information corresponding to the position of the first three-dimensional point.
  • the determining unit 140 determines the first three-dimensional point based on not only the first distance information but also the second distance information of the pixel corresponding to the first point in the second distance image included in the second frame. Position may be estimated. Further, the determining unit 140 determines a plurality of second frames from among the plurality of frames, and selects a plurality of pixels each corresponding to the first point in the plurality of second distance images of the determined plurality of second frames. may determine the search range based on a plurality of pieces of second distance information possessed by each. At this time, the determination unit 140 may determine the search range based on a plurality of pieces of second distance information without using the first distance information.
  • the determination unit 140 may determine the search range based on one or more pieces of distance information possessed by the one or more pixels that respectively correspond to the first point in one or more distance images out of the plurality of distance images.
  • the determining unit 140 estimates the position of the first three-dimensional point using the second distance information obtained by converting the second distance information into the coordinate system of the first frame.
  • the transformation of the coordinate system is performed based on the position and orientation of the camera from which the distance information before transformation was obtained and the position and orientation of the camera from which the frame to be transformed was obtained.
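  • the coordinate transformation described above can be sketched as follows, assuming each camera pose is given as a camera-to-world rotation matrix and a camera position in world coordinates. This pose convention and the helper names are assumptions of the sketch, not details from the disclosure.

```python
import math


def mat_vec(M, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]


def to_first_frame(p_b, R_b, c_b, R_a, c_a):
    """Convert a point from camera B coordinates into camera A (first frame) coordinates.

    R_a, R_b: camera-to-world rotations; c_a, c_b: camera positions in world coordinates.
    """
    # Camera B -> world: X_w = R_b * X_b + c_b
    p_world = [x + c for x, c in zip(mat_vec(R_b, p_b), c_b)]
    # World -> camera A: X_a = R_a^T * (X_w - c_a)
    diff = [pw - c for pw, c in zip(p_world, c_a)]
    return mat_vec(transpose(R_a), diff)


def distance_from_camera(p):
    """Distance from the camera center to a point given in that camera's coordinates."""
    return math.sqrt(sum(x * x for x in p))
```

  • the converted point can then be compared directly with the point obtained from the first distance information, because both are expressed in the coordinate system of the first frame.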
  • the one or more distance information may include the first distance information or the transformed second distance information.
  • when the distance between the position of the first three-dimensional point estimated based on the first distance information and the position of the first three-dimensional point estimated based on the second distance information after conversion into the coordinate system of the first frame is less than a predetermined threshold, the determining unit 140 may estimate the midpoint between the two positions as the position of the first three-dimensional point. The determining unit 140 does not have to estimate the position of the first three-dimensional point, which is the reference of the search range, if the distance is equal to or greater than the predetermined threshold.
  • Determining unit 140 may specify, as the one or more pieces of distance information, those pieces of distance information, among the plurality of pieces of distance information corresponding to the first point in the plurality of distance images, for which the difference between the maximum value and the minimum value is less than a first value.
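  • the midpoint rule described above can be sketched as follows. The function name and the tuple representation of positions are hypothetical; both positions are assumed to be already expressed in the coordinate system of the first frame.

```python
import math


def fuse_estimates(p_first, p_second, threshold):
    """Return the midpoint of two position estimates, or None if they disagree.

    p_first: position estimated from the first distance information.
    p_second: position estimated from the converted second distance information.
    """
    if math.dist(p_first, p_second) >= threshold:
        return None  # no reference position is estimated for the search range
    return tuple((a + b) / 2 for a, b in zip(p_first, p_second))
```

  • returning `None` models the case in which the position of the first three-dimensional point is not estimated because the two measurements are too far apart.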
  • the determining unit 140 may estimate a representative value of the one or more pieces of distance information as the position of the first three-dimensional point. That is, the determination unit 140 may determine the search range based on a representative value of the one or more pieces of distance information. The representative value is, for example, an average value, a median value, a maximum value, or a minimum value. Note that the determining unit 140 does not have to estimate the position of the first three-dimensional point when the variation in the one or more pieces of distance information is large.
  • in that case, the determination unit 140 does not have to determine the search range.
  • the variation may be represented by the variance, the standard deviation, or the like of the one or more pieces of distance information.
  • the variation of the one or more pieces of distance information is large when, for example, their variance is larger than a predetermined variance or their standard deviation is larger than a predetermined standard deviation.
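  • a minimal sketch of the representative value and the variation check, using the median as the representative value and the population variance as the measure of variation. Both choices, the function name, and the `max_variance` parameter standing in for the predetermined variance are assumptions; the text equally allows an average, maximum, or minimum value and a standard deviation.

```python
import statistics


def representative_distance(distances, max_variance):
    """Median of the distance values, or None when they vary too much.

    distances: distance values for the first point, all expressed in the first frame.
    max_variance: stand-in for the predetermined variance.
    """
    if not distances:
        return None
    if len(distances) > 1 and statistics.pvariance(distances) > max_variance:
        return None  # variation is large: no position estimate, no search range
    return statistics.median(distances)
```

  • a caller would skip the search range determination for the pixel whenever the function returns `None`.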
  • the determination unit 140 identifies one or more pixels each corresponding to the first point using the positions and orientations of the cameras 310 obtained by camera calibration.
  • Two subjects 510 and cameras 311, 312, and 313 are shown in FIGS. 4A to 4E.
  • Cameras 311 , 312 , 313 are included in a plurality of cameras 310 .
  • the camera 312 is a camera that generates a first frame, which is a reference frame that serves as a basis for searching for similarities, and the cameras 311 and 313 are other cameras.
  • FIG. 4A is a diagram for explaining a first example of processing for selecting a target frame.
  • the determination unit 140 may select, as the target frame, a frame captured by the camera 311 in a second position and orientation whose shooting direction differs from that of the first position and orientation, at the time of shooting, of the camera 312 that shot the first frame by an amount included in the first range.
  • the target frame in the first example is also the second frame.
  • the determining unit 140 may select, as the target frame, a frame captured by the camera 311 in the capturing direction D1 whose difference θ from the capturing direction D2 of the camera 312 is within the first range.
  • the first range may be determined as a range having a common field of view with camera 312 .
  • the first range may be determined as a range in which the number of feature points matched with the first camera image of the first frame in the feature point matching in camera calibration is equal to or greater than the first number.
  • the first number may be a value greater than one.
  • the second camera image of the second frame serving as the target frame may be determined based on the difference in shooting direction calculated from the first position and orientation of the camera that captured the first camera image of the first frame and the second position and orientation of the camera that captured the second camera image.
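  • the first example can be sketched as a filter on the angle between optical-axis directions. The dictionary of candidate directions, the camera labels, and the `max_angle_deg` parameter standing in for the first range are hypothetical.

```python
import math


def angle_between_deg(d1, d2):
    """Angle in degrees between two direction vectors."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    cos_angle = max(-1.0, min(1.0, dot / (n1 * n2)))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


def select_by_direction(ref_direction, candidates, max_angle_deg):
    """Names of candidate cameras whose shooting direction is close to the reference.

    candidates: mapping from a camera name to its shooting direction vector.
    """
    return [
        name
        for name, direction in candidates.items()
        if angle_between_deg(ref_direction, direction) <= max_angle_deg
    ]
```

  • the same pattern applies to the third and fourth examples by replacing the angle with the difference in camera position or in camera-to-subject distance.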
  • FIG. 4B is a diagram for explaining a second example of processing for selecting a target frame.
  • the determination unit 140 may select, as the target frame, a frame captured by the camera 311 that satisfies the condition that the angular difference between the normal direction D11 of the surface of the subject at an arbitrary point on the subject and the direction D12 toward that arbitrary point is included in the second range.
  • the determination unit 140 does not have to select the first frame as the target frame.
  • the second range may be defined as an angular range in which range image sensor 320 can successfully detect the distance to the surface of the subject.
  • FIG. 4C is a diagram for explaining a third example of processing for selecting a target frame.
  • the determining unit 140 may select, as the target frame, a frame captured by the camera 311 in a second position and orientation whose shooting position differs from that of the first position and orientation, at the time of shooting, of the camera 312 that captured the first frame by an amount included in the third range.
  • the target frame in the third example is also the second frame.
  • the determination unit 140 may select, as target frames, the frames captured by the cameras 311 and 313 at positions whose distance difference ΔL from the position of the camera 312 is included in the third range.
  • a third range may be determined as a range having a common field of view with camera 312 .
  • the third range may be determined as a range in which the number of feature points matched with the first camera image of the first frame in feature point matching in camera calibration is equal to or greater than the first number.
  • the first number may be a value greater than one.
  • the second camera image of the second frame serving as the target frame may be determined based on the difference between the shooting positions calculated from the first position and orientation of the camera that captured the first camera image of the first frame and the second position and orientation of the camera that captured the second camera image.
  • FIG. 4D is a diagram for explaining a fourth example of processing for selecting a target frame.
  • Determining unit 140 may select, as the target frame, a frame captured by the camera 311 in a second position and orientation located at a distance from subject 510 whose difference from the distance between subject 510 and the first position and orientation, at the time of shooting, of camera 312 that shot the first frame falls within a fourth range.
  • the target frame in the fourth example is also the second frame.
  • the determining unit 140 may select, as target frames, the frames captured by the cameras 311 and 313 at positions separated from the subject by distances L11 and L13, whose differences from the distance L12 between the camera 312 and the subject are included in the fourth range.
  • a fourth range may be determined as a range having a common field of view with camera 312. That is, the fourth range may be determined as a range in which the number of feature points matched with the first camera image of the first frame in feature point matching in camera calibration is equal to or greater than the first number. For example, the first number may be a value greater than one.
  • FIG. 4E is a diagram for explaining a fifth example of processing for selecting a target frame.
  • the determination unit 140 may select, as the target frame, a frame in which a large area of the subject included in the first frame is also photographed. For example, the determination unit 140 may select, as the target frame, a frame having a second number or more of pieces of distance information whose difference from the first distance information at the first point in the first frame is a fifth value or less.
  • here, the distance information that is compared with the first distance information is the distance information corresponding to the first point in that frame, converted into the coordinate system of the first frame by projecting it onto the first frame.
  • distance information that differs from the first distance information by a fifth value or less is called distance information that overlaps with the first distance information.
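  • the overlap test of the fifth example can be sketched as a count of pixels whose projected distance agrees with the first distance information. The list-based representation, the use of `None` for pixels without a measurement, and the function names are assumptions of this sketch; the candidate distances are assumed to be already projected into the first frame.

```python
def count_overlapping(first_distances, projected_distances, fifth_value):
    """Count pixels whose projected distance is within `fifth_value` of the first distance."""
    count = 0
    for d_first, d_other in zip(first_distances, projected_distances):
        if d_first is None or d_other is None:
            continue  # no measurement at this pixel in one of the frames
        if abs(d_first - d_other) <= fifth_value:
            count += 1
    return count


def is_target_frame(first_distances, projected_distances, fifth_value, second_number):
    """True when the candidate frame has at least `second_number` overlapping distances."""
    overlap = count_overlapping(first_distances, projected_distances, fifth_value)
    return overlap >= second_number
```

  • a larger count indicates that a larger area of the subject in the first frame is also captured by the candidate frame.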
  • the determining unit 140 may compare the position, orientation, and angle of view of the camera 312 with the positions, orientations, and angles of view of the cameras 311 and 313, and select, as the target frame, a frame captured by a camera whose imaging range overlaps that of the camera 312 over an area exceeding a predetermined size.
  • the first range, the third range, and the fourth range are each determined as a range in which the number of feature points matched with the first camera image of the first frame in feature point matching in camera calibration is equal to or greater than the first number. Therefore, it can be said that the target frame is determined from among the plurality of camera images based on the number of feature points matched between the first camera image and the target frame in feature point matching.
  • positions and orientations of the cameras 311 to 313 used in the process of selecting the target frame of the determination unit 140 are specified by camera parameters obtained by camera calibration.
  • the determination unit 140 may select a plurality of target frames as long as the conditions for selecting target frames described in the first to fifth examples are satisfied.
  • the determination unit 140 may assign priorities to a plurality of target frames that satisfy the conditions, and select target frames up to the third number in descending order of priority.
  • the third number is a number determined so that the load applied to the similarity search processing with respect to the first frame is equal to or less than a predetermined load.
  • the priority may be set higher for a frame whose camera position is closer to the position of the camera that captured the first frame, higher for a frame in which the direction of capturing an arbitrary point of the subject is closer to the normal direction at that point, or higher for a frame whose distance from the subject is closer to the distance between the subject and the camera that captured the first frame.
  • the determination unit 140 determines a search range based on one or more representative values of distance information on a straight line passing through the first viewpoint and the first three-dimensional point. Specifically, the determination unit 140 determines, as the search range, a range of a predetermined size centered on the position on that straight line that is away from the camera 312 by the distance indicated by the representative value. More specifically, the determination unit 140 obtains the distance from the first viewpoint to the point on the subject corresponding to the position of each pixel in the first frame, based on the distance information of the first frame and the distance information of the target frame that corresponds to each pixel of the first frame and has been projected onto the first frame, and determines the size of the search range according to the obtained distance.
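  • a minimal sketch of prioritized selection, using only the first criterion (closeness of the camera position to the camera that captured the first frame). The tuple layout of the candidates and the function name are hypothetical, and `third_number` stands in for the limit described above.

```python
import math


def pick_target_frames(ref_position, candidates, third_number):
    """Select up to `third_number` frames, nearest camera first.

    candidates: list of (frame_name, camera_position) pairs.
    """
    ranked = sorted(candidates, key=lambda item: math.dist(ref_position, item[1]))
    return [name for name, _ in ranked[:third_number]]
```

  • capping the list at the third number keeps the cost of the similar point search for the first frame bounded regardless of how many frames satisfy the conditions.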
  • the search range is a search range for searching for points similar to points of pixels of the first frame from a second frame different from the first frame.
  • the determination unit 140 may make the search range determined for each of a plurality of pixels in the distance image wider as the accuracy of the one or more pieces of distance information used to estimate the position of the first three-dimensional point is lower. Specifically, the determination unit 140 may determine that the accuracy of the one or more pieces of distance information is higher as their number is larger. The one or more pieces of distance information are likely to be similar to each other, that is, to have values within a predetermined range, so the greater the number of pieces of distance information, the higher the accuracy can be judged to be. Further, the determining unit 140 may determine that the accuracy of the one or more pieces of distance information is higher as their variance is smaller.
  • the accuracy of the distance information may also be determined to be higher as the reflectance when the distance information was obtained is higher.
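  • one way to turn these accuracy cues into a search range width is sketched below. The specific formula (a base margin divided by the number of measurements, plus their standard deviation, capped at a maximum) is purely an illustrative assumption and is not given in the disclosure; it only reproduces the stated tendencies that more measurements and smaller variation lead to a narrower range.

```python
import statistics


def search_margin(distances, base_margin, max_margin):
    """Half-width of the search range for one pixel (heuristic, for illustration).

    distances: distance values available for the pixel.
    base_margin: margin used when exactly one measurement is available.
    max_margin: upper bound, also used when no measurement is available.
    """
    n = len(distances)
    if n == 0:
        return max_margin
    spread = statistics.pstdev(distances) if n > 1 else 0.0
    # More measurements narrow the range; larger variation widens it.
    return min(base_margin / n + spread, max_margin)
```

  • reflectance could be folded in as an additional factor in the same way, for example by scaling the margin up for measurements obtained with low reflectance.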
  • an example in which the determination unit 140 determines the search range based on a plurality of pieces of second distance information without using the first distance information will be described with reference to FIGS. 5A and 5B.
  • FIG. 5A is a diagram for explaining a problem when using only the first distance information.
  • FIG. 5B is a diagram showing an example of estimating the position of the first three-dimensional point using the second distance information.
  • FIG. 5A shows three subjects 513 and camera 312 . Three subjects 513 and three cameras 311, 312, 313 are shown in FIG. 5B.
  • the camera 312 is a camera that generates a first frame, which is a reference frame that serves as a basis for searching for similarities, and the cameras 311 and 313 are other cameras.
  • the thick solid line in FIG. 5A indicates the detection result with high accuracy by the range image sensor of camera 312, and the thick dashed line in FIG. 5A indicates the detection result with low accuracy by the range image sensor.
  • a detection result with high accuracy is a detection result whose accuracy is higher than a predetermined accuracy, for example a result detected with a reflectance equal to or higher than a predetermined reflectance.
  • a detection result with low accuracy is a detection result whose accuracy is lower than the predetermined accuracy, and may be, for example, a result detected with a reflectance below the predetermined reflectance.
  • the reflectance is, for example, the intensity ratio between the emitted electromagnetic wave and the acquired reflected wave.
  • the detection results from one camera 312 may include not only high-accuracy detection results but also low-accuracy detection results. Therefore, if the determination unit 140 estimates the position of the first three-dimensional point using only the distance information obtained from the detection result with low accuracy, the accuracy of the estimated position of the first three-dimensional point becomes low. A position different from the actual position may be estimated as the position of the first 3D point.
  • as shown in FIG. 5B, it is highly likely that one of the detection results obtained by the three cameras 311, 312, and 313 is a detection result with high accuracy. This possibility increases as the number of cameras increases. Therefore, the distance information of a pixel that has a low-accuracy detection result in FIG. 5A can be interpolated with a high-accuracy detection result calculated using the detection results of the cameras 311 and 313, that is, the cameras other than the camera 312 that generated the low-accuracy detection result.
  • the determination unit 140 may determine whether the accuracy of the first distance information of the first point detected by the camera 312 is lower than a predetermined accuracy and, if it is determined to be lower, interpolate the distance information of the first point by replacing the first distance information with the third distance information.
  • the third distance information is distance information corresponding to the first point, and is calculated using two camera images captured by the cameras 311 and 313 .
  • the determination unit 140 associates the two pixels that each correspond to the first point in the two camera images captured by the cameras 311 and 313, calculates the position of the first point by triangulation based on the two pixels and the positions and orientations of the cameras 311 and 313, and may calculate the third distance information based on the position of the first point. In this way, when the detection accuracy of the first distance information of the pixel corresponding to the first point in the first distance image is lower than the predetermined accuracy, the determination unit 140 may determine the search range using, as the one or more pieces of distance information, the third distance information corresponding to the first point, which is calculated using two or more camera images other than the first camera image.
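  • the replacement of a low-accuracy first distance by a triangulated third distance can be sketched as follows. This is a hedged illustration only: the midpoint triangulation method, the function names, and the representation of accuracy as a single number are assumptions, and the viewing rays through the two matched pixels are assumed to be already expressed in world coordinates.

```python
import math


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between two viewing rays.

    c1, c2: camera centers; d1, d2: ray directions through the matched pixels.
    """
    w = [p - q for p, q in zip(c1, c2)]
    a = sum(x * x for x in d1)
    b = sum(x * y for x, y in zip(d1, d2))
    c = sum(x * x for x in d2)
    d = sum(x * y for x, y in zip(d1, w))
    e = sum(x * y for x, y in zip(d2, w))
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; triangulation is undefined")
    s = (b * e - c * d) / denom  # parameter along the first ray
    t = (a * e - b * d) / denom  # parameter along the second ray
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


def interpolate_distance(first_distance, accuracy, min_accuracy,
                         reference_center, triangulated_point):
    """Keep the first distance if accurate enough, else use the third distance."""
    if accuracy >= min_accuracy:
        return first_distance
    # Third distance: from the reference camera to the triangulated point.
    return math.dist(reference_center, triangulated_point)
```

  • in this sketch, `reference_center` plays the role of the camera 312, and the two rays come from the cameras 311 and 313.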
  • the determination unit 140 may change the first frame used as a reference for searching for similarities to another frame. That is, after the frame is changed, similarity search is performed between the first frame after the change and the frames other than the first frame after the change.
  • the determination unit 140 determines whether the accuracy of the first distance information of the first point detected by the camera 312 is lower than a predetermined accuracy and, if it is determined to be lower, may interpolate the distance information of the first point by replacing the first distance information with the first transformation information.
  • the first transformation information is distance information obtained by a coordinate transformation that projects the second distance information of the first point detected by the camera 311 onto the detection result of the camera 312. In this case, the determination unit 140 may also interpolate the distance information of the first point by replacing the first distance information with distance information calculated using the first transformation information and second transformation information, where the second transformation information is obtained by a coordinate transformation that projects the second distance information of the first point detected by the camera 313 onto the detection result of the camera 312.
  • in this way, when the detection accuracy of the first distance information of the pixel corresponding to the first point in the first distance image is lower than the predetermined accuracy, the determination unit 140 may determine the search range using, as the one or more pieces of distance information, the second distance information of the pixel corresponding to the first point in the second distance image. Note that, in the case of interpolation, distance information determined to have high accuracy is used to calculate the replacement distance information.
  • the generation unit 150 generates a three-dimensional model of the subject based on the plurality of frames acquired from the storage unit 120 by the acquisition unit 130, the camera parameters, and the search range.
  • the generation unit 150 searches for a similar point that is similar to the first point on the first frame, within the range corresponding to the search range on another frame (for example, a second frame) different from the first frame.
  • the generation unit 150 limits the length of the epipolar line corresponding to the first point in the second frame to a length corresponding to the search range, and searches for the similar point on that epipolar line in the second frame.
  • the generation unit 150 searches the second frame for a similar point for each of the plurality of first pixels included in the first frame.
  • for each combination of the first frame and another of the plurality of frames, the generation unit 150 calculates the Normalized Cross Correlation (NCC) between small regions as N(I, J), and generates matching information indicating the result of matching between the frames.
  • FIG. 6 is a diagram for explaining matching processing when the search range is not limited.
  • FIG. 7 is a diagram for explaining matching processing when the search range is limited.
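  • when the pixel 572 in the first frame 571 is matched against the frame 581 without limiting the search range, the epipolar line 582 in the frame 581, which corresponds to the straight line L1 passing through the first viewpoint V1 and the pixel 572, extends across the entire frame 581 from end to end.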
  • the first frame 571 is an image obtained from the first viewpoint V1
  • the frame 581 is an image obtained from the second viewpoint V2.
  • the straight line L1 coincides with the imaging direction of the camera 311 at the first viewpoint V1.
  • Pixel 572 corresponds to point 511 on subject 510 .
  • the search for pixels of the frame 581 that are similar to the pixel 572 is performed along the unrestricted epipolar line 582.
  • as a result, the pixel 583 on the frame 581, which corresponds to a point 512 different from the point 511 of the subject 510, may be erroneously selected as the similar point. This lowers the generation accuracy of the three-dimensional model.
  • in contrast, the search range R2 is determined to be shorter than the search range R1 used when the search range is not limited. Therefore, when the pixel 572 in the first frame 571 is matched against the frame 581 within the restricted search range R2, the epipolar line 584 in the frame 581, which corresponds to the straight line L1 passing through the first viewpoint V1 and the pixel 572, is shorter than the epipolar line 582 in accordance with the search range R2.
  • the search for pixels of the frame 581 that are similar to the pixel 572 is therefore performed along the epipolar line 584, which is shorter than the epipolar line 582.
  • the number of pixels having characteristics similar to the pixel 572 can be reduced, and the possibility of determining the pixel 585 corresponding to the point 511 of the subject 510 on the frame 581 as the similar point can be increased. Therefore, it is possible to improve the generation accuracy of the three-dimensional model. Moreover, since the search range can be narrowed, the processing time for the search can be shortened.
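  • one way to obtain the shortened epipolar line is to project the two ends of the search range on the straight line L1 into the second frame. The following is a minimal sketch under assumptions not stated in this disclosure: a pinhole camera without lens distortion, a world-to-camera transform of the form Xc = R·X + t, and hypothetical function and parameter names.

```python
def project(point, R, t, fx, fy, cx, cy):
    """Project a world point into pixel coordinates of a pinhole camera."""
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)


def epipolar_segment(v1, ray_dir, d_near, d_far, R2, t2, fx, fy, cx, cy):
    """End points, in the second frame, of the epipolar line limited to the
    search range [d_near, d_far] measured along the ray from viewpoint v1."""
    near = [v + d_near * r for v, r in zip(v1, ray_dir)]
    far = [v + d_far * r for v, r in zip(v1, ray_dir)]
    return (project(near, R2, t2, fx, fy, cx, cy),
            project(far, R2, t2, fx, fy, cx, cy))
```

  • the similar-point search is then carried out only on pixels between the two returned end points, rather than along the full line across the frame.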
  • the generation unit 150 generates a three-dimensional model by performing triangulation using the position and orientation of each camera 310 and matching information. Note that matching may be performed for all combinations of two frames out of a plurality of frames.
  • here, $I_{xy}$ and $J_{xy}$ are the pixel values within the small regions of frame I and frame J, respectively, and $\overline{I_{xy}}$ and $\overline{J_{xy}}$ are the average values of the pixel values within the small regions of frame I and frame J, respectively.
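  • the expression for N(I, J) is not reproduced in this text. For reference, the standard zero-mean NCC written with the pixel values and region averages defined above is shown below; it is the textbook form and is an assumption about the exact expression intended here.

```latex
N(I, J) =
\frac{\sum_{x,y} \left(I_{xy} - \overline{I_{xy}}\right)\left(J_{xy} - \overline{J_{xy}}\right)}
     {\sqrt{\sum_{x,y} \left(I_{xy} - \overline{I_{xy}}\right)^{2} \, \sum_{x,y} \left(J_{xy} - \overline{J_{xy}}\right)^{2}}}
```

  • in this form N(I, J) lies between -1 and 1, and values closer to 1 indicate that the two small regions are more similar.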
  • the generation unit 150 generates a three-dimensional model using the search result of matching. As a result, the generation unit 150 generates a three-dimensional model that includes a plurality of three-dimensional points that are larger in number and have a higher density than the plurality of three-dimensional points included in the map information.
  • the output unit 160 outputs the three-dimensional model generated by the generation unit 150.
  • the output unit 160 includes, for example, a display device such as a display (not shown), and an antenna, a communication circuit, a connector, and the like for communicably connecting by wire or wirelessly.
  • the output unit 160 causes the display device to display the three-dimensional model by outputting the integrated three-dimensional model to the display device.
  • FIG. 8 is a flowchart showing an example of the operation of the 3D model generation device 100.
  • the reception unit 110 receives a plurality of frames captured by a plurality of cameras 310 and the camera parameters of each camera 310 from the estimation device 200 (S101). Note that the receiving unit 110 does not have to receive a plurality of frames and camera parameters at one timing, and may receive them at different timings. That is, the first acquisition step and the second acquisition step may be performed at the same timing or at different timings.
  • the storage unit 120 stores the plurality of frames captured by the plurality of cameras 310 and the camera parameters of each camera 310 received by the reception unit 110 (S102).
  • the acquisition unit 130 acquires subject information (a plurality of distance images) from a plurality of frames stored in the storage unit 120, and outputs the acquired subject information to the determination unit 140 (S103).
  • the determination unit 140 determines a search range to be used for matching multiple points between multiple frames based on the subject information acquired by the acquisition unit 130 (S104).
  • the details of step S104 have been described in the description of the processing performed by the determination unit 140, and therefore are omitted.
  • the generation unit 150 searches for a similar point that is similar to the first point on the first frame, within the range corresponding to the search range on the second frame (S105), and generates the three-dimensional model based on the search result (S106).
  • steps S105 and S106 have been explained in the description of the processing performed by the generation unit 150, and therefore are omitted.
  • the output unit 160 outputs the three-dimensional model generated by the generation unit 150 (S107).
  • the three-dimensional model generation method acquires subject information including a plurality of positions on the subject in a three-dimensional space (S103); acquires a first camera image of the subject captured from a first viewpoint and a second camera image of the subject captured from a second viewpoint (S101); determines, based on the subject information and without using map information, a search range in the three-dimensional space that includes a first three-dimensional point on the subject corresponding to a first point in the first camera image (S104); performs matching that searches for a similar point similar to the first point within the range corresponding to the search range on the second camera image (S105); and generates a three-dimensional model using the search result of the matching (S106).
  • the map information is obtained by camera calibration performed by having one or more cameras capture the subject from a plurality of viewpoints including the first viewpoint and the second viewpoint, and includes a plurality of three-dimensional points each indicating a position on the subject in the three-dimensional space. The subject information is information different from this map information.
  • the search range is determined based on the subject information without using the map information, and the similar point that is similar to the first point is searched for within the range on the second camera image that corresponds to, and is limited by, the search range.
  • because the similar point is searched for in a range where, according to the subject information, it is highly likely to exist, the accuracy of the similar-point search can be improved and the time required for the search can be reduced. Therefore, the generation accuracy of the three-dimensional model can be improved, and the processing time of the three-dimensional model generation processing can be shortened.
  • in the matching, the epipolar line corresponding to the first point in the second camera image is limited to a length corresponding to the search range, and the similar point that is similar to the first point is searched for on that epipolar line of the second camera image.
  • the subject information includes a distance image generated by measurement by the distance image sensor 320.
  • the distance image has a plurality of pixels each having distance information indicating the distance from the distance image sensor 320 to the subject.
  • the search range is determined based on the distance information possessed by the pixel corresponding to the first point in the distance image.
  • because the subject information includes a distance image having a plurality of pixels associated with the plurality of pixels of the first camera image, the distance information corresponding to the first point can be identified easily. Therefore, the position of the first three-dimensional point can be estimated based on the identified distance information, and the search range can be determined with high accuracy.
  • the subject information includes a plurality of distance images respectively generated by measurements by a plurality of distance image sensors 320.
  • Each of the plurality of distance images has a plurality of pixels each having distance information indicating the distance from the distance image sensor 320 that generated the distance image to the subject.
  • a plurality of pixels of each of the plurality of distance images are associated with a plurality of pixels of a camera image corresponding to the distance image having the plurality of pixels among the plurality of camera images.
  • the multiple camera images include a first camera image and a second camera image.
  • the search range is determined based on one or more pieces of distance information possessed by one or more pixels each corresponding to the first point in one or more distance images among the plurality of distance images.
  • because the subject information includes a plurality of distance images each having a plurality of pixels associated with the plurality of pixels of the corresponding camera image, the plurality of pieces of distance information corresponding to the first point can be identified easily.
  • the pieces of distance information identified in this way are obtained from different viewpoints. Therefore, even if some of the distance information contains detection errors, the other distance information can be used to reduce the impact of those errors. The position of the first three-dimensional point can thus be estimated more accurately based on one or more of the plurality of pieces of distance information, and the search range can be determined with high accuracy.
  • when the detection accuracy of the first distance information of the pixel corresponding to the first point in the first distance image is lower than the predetermined accuracy, the search range is determined using, as the one or more pieces of distance information, the third distance information corresponding to the first point, which is calculated using two or more camera images other than the first camera image. Therefore, even when the detection accuracy of the first distance information is low, the search range can be determined with high accuracy using the highly accurate third distance information.
  • each of the plurality of distance image sensors 320 corresponds in position and orientation to the plurality of cameras 310 including one or more cameras.
  • the multiple distance images include a first distance image corresponding to the first camera image and a second distance image corresponding to the second camera image.
  • when the detection accuracy of the first distance information of the pixel corresponding to the first point in the first distance image is lower than the predetermined accuracy, the search range is determined using, as the one or more pieces of distance information, the second distance information of the pixel corresponding to the first point in the second distance image. Therefore, even when the detection accuracy of the first distance information is low, the search range can be determined with high accuracy using the highly accurate second distance information.
  • each of the plurality of distance image sensors 320 corresponds in position and orientation to the plurality of cameras 310 including one or more cameras.
  • in the determining, the one or more pixels that each correspond to the first point in the one or more distance images are identified using the positions and orientations of the plurality of cameras obtained by camera calibration.
  • one or more pieces of distance information can be specified using the positions and orientations of multiple cameras obtained by camera calibration.
  • the one or more distance images include a first distance image corresponding to the first camera image and a second distance image corresponding to the second camera image.
  • the second camera image is determined from the plurality of camera images based on the number of feature points matched with the first camera image in the feature point matching performed in camera calibration.
  • the second camera image to be matched for similarities with the first camera image is determined based on the number of feature points. Therefore, it is possible to specify a second distance image for specifying one or more pieces of distance information that is highly likely to contain no error, that is, has high accuracy.
  • the second camera image may be determined based on the difference in orientation between a first position and orientation of the camera at the time the first camera image was captured and a second position and orientation of the camera at the time the second camera image was captured.
  • the second camera image that is the target of similarity matching with the first camera image is determined based on the difference in the pose of the camera. Therefore, it is possible to specify a second distance image for specifying one or more pieces of distance information that is highly likely to contain no error, that is, has high accuracy.
  • the second camera image may be determined based on the difference in position between a first position and orientation of the camera at the time the first camera image was captured and a second position and orientation of the camera at the time the second camera image was captured.
  • the second camera image that is the target of similarity matching with the first camera image is determined based on the difference in the positions of the cameras. Therefore, it is possible to specify a second distance image for specifying one or more pieces of distance information that is highly likely to contain no error, that is, has high accuracy.
  • the difference between the maximum value and the minimum value of one or more pieces of distance information is less than the first value.
  • one or more pieces of distance information are specified in which the difference between the maximum value and the minimum value is less than the first value. This makes it possible to specify one or more pieces of distance information that are highly likely to contain no error, that is, have high accuracy.
  • the lower the accuracy of the one or more pieces of distance information, the wider the search range is made.
  • because the search range is widened as the accuracy of the one or more pieces of distance information decreases, the search range can be determined according to the accuracy.
  • the accuracy is higher as the number of the one or more pieces of distance information is larger.
  • the search range can therefore be narrowed as the number of the one or more pieces of distance information increases.
  • the accuracy is higher as the variance of the one or more pieces of distance information is smaller.
  • the search range can therefore be narrowed as the variance of the one or more pieces of distance information decreases.
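  • the relationship between accuracy and search range width can be illustrated as follows. This is a sketch only: the specific width formula, the parameter `base_margin`, and the function name are hypothetical and are not given in this disclosure. Only the qualitative behavior (narrower range for more measurements and smaller variance, and rejection when the spread is not less than the first value) follows the description above.

```python
from statistics import pvariance


def search_range(distances, first_value, base_margin):
    """Return (near, far) along the viewing ray, or None if the distances
    are too inconsistent to be trusted.

    distances: one or more pieces of distance information for the first point.
    first_value: upper bound on (max - min) for the set to be accepted.
    base_margin: hypothetical half-width for one zero-variance measurement.
    """
    if not distances:
        raise ValueError("at least one piece of distance information is required")
    if max(distances) - min(distances) >= first_value:
        return None
    center = sum(distances) / len(distances)
    variance = pvariance(distances)
    # Hypothetical rule: wider for larger variance, narrower for more samples.
    margin = base_margin * (1.0 + variance) / len(distances)
    return (center - margin, center + margin)
```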
  • Modification 1
  • A three-dimensional model generation system 410 according to this modification will be described.
  • in this modification, a case will be described in which subject information different from the subject information described in the embodiment is used. That is, the subject information used in this modification is different from the distance image.
  • FIG. 9 is a block diagram showing the characteristic configuration of the 3D model generation system according to Modification 1.
  • the 3D model generation system 410 differs from the 3D model generation system 400 according to the embodiment mainly in that the camera group 300 further includes a measuring instrument 321 and in that a sensor fusion device 210 is provided instead of the estimation device 200. Components similar to those of the 3D model generation system 400 according to the embodiment are denoted by the same reference numerals, and descriptions thereof are omitted.
  • FIG. 10 is a diagram showing an example of the configuration of the camera group.
  • the two cameras 310 included in the camera group 300 and the measuring instrument 321 are fixed and supported by a fixing member 330 so that their positions and orientations are fixed.
  • a device that includes two cameras 310 and a measuring device 321 that have a fixed positional relationship with each other is called a sensor device.
  • the two cameras 310 constitute a stereo camera.
  • the two cameras 310 take images in synchronization with each other, and generate stereo images taken at synchronized shooting times.
  • a shooting time (time stamp) is attached to the generated stereo image.
  • Stereo images are output to the sensor fusion device 210 .
  • the two cameras 310 may capture stereo video.
  • the measuring instrument 321 generates three-dimensional data by emitting electromagnetic waves and acquiring reflected waves of the electromagnetic waves reflected by the subject. Specifically, the measuring instrument 321 measures the time from when an electromagnetic wave is emitted until it is reflected by the subject and returns to the measuring instrument 321, and calculates the distance between the measuring instrument 321 and a point on the surface of the subject using the measured time and the wavelength of the electromagnetic wave.
  • the measuring instrument 321 emits electromagnetic waves in a plurality of predetermined radial directions from a reference point of the measuring instrument 321. For example, the measuring instrument 321 emits electromagnetic waves at a first angular interval in the horizontal direction and at a second angular interval in the vertical direction.
  • the measuring device 321 can calculate the three-dimensional coordinates of a plurality of points on the subject by detecting the distance to the subject in each of a plurality of directions around the measuring device 321 . Therefore, the measuring device 321 can calculate positional information indicating a plurality of three-dimensional positions on the subject around the measuring device 321, and can generate a three-dimensional model having the positional information.
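  • converting a measured distance and its emission direction into three-dimensional coordinates can be sketched as follows. The axis convention (z up, azimuth measured in the horizontal plane from the x axis, elevation measured from the horizontal plane) and the function names are assumptions made for illustration.

```python
import math


def direction_to_point(distance, azimuth_deg, elevation_deg):
    """3D coordinates, relative to the instrument's reference point, of the
    surface point hit at the given distance and emission direction."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.cos(az)
    y = distance * math.cos(el) * math.sin(az)
    z = distance * math.sin(el)
    return (x, y, z)


def scan_to_points(measurements):
    """measurements: iterable of (distance, azimuth_deg, elevation_deg)."""
    return [direction_to_point(d, az, el) for d, az, el in measurements]
```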
  • the location information may be a 3D point cloud containing a plurality of 3D points indicating a plurality of 3D locations.
  • the measuring instrument 321 is, for example, a three-dimensional laser measuring instrument having a laser irradiation unit (not shown) that emits laser light as the electromagnetic wave, and a laser light receiving unit (not shown) that receives the reflected light of the emitted laser light reflected by the subject. The measuring instrument 321 scans the subject with the laser light by rotating or oscillating a unit including the laser irradiation unit and the laser light receiving unit about two different axes, or by placing a movable mirror (a MEMS (Micro Electro Mechanical Systems) mirror) that oscillates about two axes on the path of the emitted or received laser light. As a result, the measuring instrument 321 can generate a highly accurate and dense three-dimensional model of the subject. Note that the three-dimensional model generated here is, for example, a three-dimensional model in the world coordinate system.
  • the measuring instrument 321 acquires a three-dimensional point group by line scanning. Therefore, the measuring device 321 acquires a plurality of 3D points included in the 3D point group at different times. That is, the time measured by the measuring device 321 and the time taken by the two cameras 310 are not synchronized.
  • the instrument 321 produces a horizontally dense and vertically coarse 3D point cloud. That is, in the 3D point group obtained by the measuring device 321, the interval between vertically adjacent 3D points is wider than the interval between horizontally adjacent 3D points.
  • the measurement time at which each 3D point was measured is given in association with the 3D point.
  • the measuring instrument 321 has been described as a three-dimensional laser measuring instrument (LiDAR) that measures the distance to the subject by emitting laser light, but it is not limited to this. It may be a millimeter wave radar measuring instrument that measures the distance to the subject by emitting millimeter waves.
  • the two cameras 310 shown in FIG. 10 may be part or all of the plurality of cameras 310 included in the camera group 300 .
  • FIG. 11 is a flowchart showing an example of the operation of the sensor fusion device 210 according to Modification 1.
  • the sensor fusion device 210 acquires a stereo video and a time-series 3D point group (S201).
  • a stereo video includes a plurality of stereo images each generated in chronological order.
  • the sensor fusion device 210 calculates the position and orientation of the sensor device (S202). Specifically, the sensor fusion device 210 calculates the position and orientation of the sensor device using, from among the stereo video and the three-dimensional point cloud obtained by the sensor device, a stereo image and three-dimensional points whose shooting time and measurement time are within a predetermined time difference of each other.
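  • selecting stereo images and three-dimensional points whose times are within a predetermined time difference can be sketched as follows. The nearest-timestamp pairing rule and all names are assumptions; this disclosure only states that data within a predetermined time difference are used together.

```python
def pair_by_time(shooting_times, measurement_times, max_diff):
    """Pair each stereo image with the nearest-in-time 3D point measurement.

    Returns (image_index, measurement_index) pairs whose time difference
    is at most max_diff; images with no close measurement are skipped.
    """
    pairs = []
    if not measurement_times:
        return pairs
    for i, ts in enumerate(shooting_times):
        # Index of the measurement whose time stamp is closest to this image.
        j = min(range(len(measurement_times)),
                key=lambda k: abs(measurement_times[k] - ts))
        if abs(measurement_times[j] - ts) <= max_diff:
            pairs.append((i, j))
    return pairs
```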
  • the coordinates that serve as the reference for the position and orientation of the sensor device may be the camera coordinate origin of the left-eye camera of the stereo camera when the stereo video is used, or the coordinates of the center of rotation of the measuring instrument 321 when the time-series 3D point cloud is used. When both are used, either the camera coordinate origin of the left-eye camera or the coordinates of the center of rotation of the measuring instrument 321 may be used.
  • the sensor device may move to different positions at time t1 and time t2. FIG. 12 shows the positions of the left-eye camera 310 of the stereo camera at times t1 and t2 and the position of the measuring instrument 321 at time t1.
  • the stereo images captured at times t1 and t2 are used to generate a camera image integrated three-dimensional point cloud that exists only at characteristic locations of the subject.
  • the sensor fusion device 210 may calculate the position and orientation of the sensor device using, for example, Visual SLAM (Simultaneous Localization and Mapping) based on feature point matching between the stereo images and between time-series images.
  • the sensor fusion device 210 integrates the time-series three-dimensional point cloud using the calculated position and orientation (S203).
  • a three-dimensional point cloud obtained by integration is called a LiDAR integrated 3D point cloud.
  • in step S202, when the sensor fusion device 210 calculates the position and orientation of the sensor device using the time-series 3D point cloud, it may do so using, for example, NDT (Normal Distribution Transform) based on 3D point cloud matching. Because the time-series 3D point cloud is used, the position and orientation of the sensor device can be calculated at the same time as the LiDAR integrated 3D point cloud is generated.
  • next, a case will be described in which, in step S202, the sensor fusion device 210 calculates the position and orientation of the sensor device using both the stereo video and the time-series 3D point cloud.
  • the camera parameters, which consist of the focal length, lens distortion, and image center of each of the left-eye and right-eye cameras of the stereo camera and the relative position and orientation of the left-eye and right-eye cameras, are calculated in advance by, for example, a camera calibration method using a checkerboard.
  • the sensor fusion device 210 performs feature point matching between the stereo images and between temporally consecutive left-eye images, and calculates the three-dimensional positions of the matched points using the matching results and the camera parameters.
  • the sensor fusion device 210 performs this process for an arbitrary number of frames to generate a camera image integrated 3D point cloud.
  • the sensor fusion device 210 aligns the camera image integrated 3D point cloud with the time-series 3D point cloud acquired by the measuring instrument 321 by minimizing a cost function, and generates the subject information (S204).
  • the cost function consists of the weighted sum of two error functions, as shown in Equation 2.
  • the first error function E1 in the cost function is, as shown in FIG. 15, a reprojection error when each 3D point of the camera image integrated 3D point cloud is reprojected onto camera coordinates at two times. Camera parameters obtained in advance by camera calibration are used for the reprojection calculation. This error is calculated and summed for any three-dimensional point in any time period.
  • the second error function E2 in the cost function is obtained by transforming each 3D point of the camera image integrated 3D point cloud into the coordinate system of the time-series 3D point cloud generated by the measuring instrument 321, and then calculating the distance to the neighboring time-series 3D points of the measuring instrument 321.
  • the transformation matrices of the two coordinate spaces may be calculated from the actual positional relationship between the left-eye camera and the measuring device 321.
  • this error is calculated and summed.
  • the cost function is minimized using, as variable parameters, the camera coordinates at the two times and each element of the transformation matrix from the camera coordinate system to the coordinate system of the measuring instrument 321. The minimization may be performed by the method of least squares, the Gauss-Newton method, the Levenberg-Marquardt method, or the like.
  • the weight w may be the ratio of the number of time-series 3D points obtained by the measuring device 321 and the number of 3D points in the camera image integrated 3D point cloud.
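  • Equation 2 is not reproduced in this text. A form consistent with the description above, in which the cost is a weighted sum of the reprojection error E1 and the point-to-point distance error E2 with weight w, is shown below. The symbol E for the total cost and the placement of w on the second term are assumptions.

```latex
E = E_{1} + w \, E_{2}
```

  • under this form, a larger w gives more influence to the agreement with the time-series 3D points of the measuring instrument 321 relative to the reprojection error.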
  • the minimization process determines the time-series camera positions and orientations and the transformation matrix between the camera coordinate system and the measuring instrument coordinate system. Using these, the time-series 3D point clouds are integrated to generate a LiDAR integrated 3D point cloud as the subject information.
  • the sensor fusion device 210 outputs the generated subject information to the 3D model generation device 100.
  • subject information is generated based on two or more types of sensor information. That is, subject information is generated based on two or more different types of sensor information. In other words, object information is obtained in which a decrease in accuracy due to detection errors is reduced.
  • the two or more types of sensor information include a plurality of two-dimensional images obtained from the stereo camera, and three-dimensional data obtained from the measuring instrument 321, which emits an electromagnetic wave and acquires the reflected wave of the electromagnetic wave reflected by the subject.
  • because the subject information is generated based on the plurality of two-dimensional images and the three-dimensional data, the subject information can be obtained with high accuracy.
  • in the above description, the determination unit 140 determines the search range used to search for a plurality of similar points between a plurality of frames using the subject information (for example, a distance image) without using the map information, but the determination is not limited to this.
  • the determination unit 140 may determine the search range by switching between a first method, which determines the search range based on the distance image as described in the above embodiment, and a second method, which determines the search range based on the map information, according to the distance between the subject and the camera 310 that generates the first frame.
  • for example, when the distance between the subject and the camera 310 that generates the first frame is less than a predetermined distance (that is, when the subject is close to the camera 310), the first method may be used to determine the search range, and when that distance is equal to or greater than the predetermined distance (that is, when the subject is far from the camera 310), the second method may be used. This is because the accuracy of the map information is higher than the accuracy of the distance image of the camera 310 when the distance between the subject and the camera 310 is equal to or greater than the predetermined distance.
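  • the switching rule can be expressed compactly as follows. The function name and the string labels are hypothetical and used only for illustration.

```python
def choose_search_range_method(subject_distance: float,
                               predetermined_distance: float) -> str:
    """Select how the search range is determined for the first frame.

    Returns "distance_image" (first method) when the subject is closer than
    the predetermined distance, otherwise "map_information" (second method).
    """
    if subject_distance < predetermined_distance:
        return "distance_image"
    return "map_information"
```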
  • In the second method, the determination unit 140 uses the 3D points included in the map information to interpolate, between those 3D points, 3D points at which the subject is estimated to exist. It thereby generates three-dimensional information of the subject and determines the search range based on the generated three-dimensional information. Specifically, the determination unit 140 fills (that is, interpolates) the gaps between the 3D points of the sparse 3D point cloud in the map information with a plurality of planes, thereby estimating the rough three-dimensional position of the subject, and generates the estimation result as an estimated 3D model. For example, the 3D points of the sparse 3D point cloud may be interpolated by meshing them.
  • For each of a plurality of pixels on a projection frame obtained by projecting the estimated 3D model onto the first frame, the determination unit 140 estimates the three-dimensional position on the subject that corresponds to the pixel, expressed relative to the first viewpoint from which the first frame was captured. The determination unit 140 thereby generates an estimated distance image made up of a plurality of pixels that each contain an estimated three-dimensional position. Then, as in the first method, the determination unit 140 estimates the position of the first three-dimensional point based on the generated estimated distance image and determines the search range based on the estimated position of the first three-dimensional point.
  • In the above embodiments, each processing unit included in the 3D model generation device and the like was described as being implemented by a CPU and a control program.
  • Alternatively, each component of such a processing unit may consist of one or more electronic circuits.
  • Each of the one or more electronic circuits may be a general-purpose circuit or a dedicated circuit.
  • One or more electronic circuits may include, for example, a semiconductor device, an IC (Integrated Circuit), or an LSI (Large Scale Integration).
  • An IC or LSI may be integrated on one chip or may be integrated on a plurality of chips.
  • Although they are called ICs or LSIs here, they may also be called system LSIs, VLSIs (Very Large Scale Integration), or ULSIs (Ultra Large Scale Integration), depending on the degree of integration.
  • An FPGA (Field Programmable Gate Array) may also be used.
  • General or specific aspects of the present disclosure may be implemented as a system, a device, a method, an integrated circuit, or a computer program.
  • Alternatively, they may be implemented as a computer-readable non-transitory recording medium, such as an optical disc, an HDD (Hard Disk Drive), or a semiconductor memory, on which the computer program is stored.
  • They may also be implemented as any combination of a system, a device, a method, an integrated circuit, a computer program, and a recording medium.
  • The present disclosure can be applied to a 3D model generation device or a 3D model generation system, for example for figure creation, recognition of terrain or building structures, recognition of human actions, or generation of free-viewpoint video.
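
The switching between the first and second methods, and the plane-based interpolation used by the second method, can be illustrated with a minimal Python sketch. It is not the implementation of the disclosure. It assumes a pinhole camera (focal length `focal`, principal point `cx`, `cy`), sparse 3D points already expressed in the coordinate frame of the first viewpoint, and a single triangular patch standing in for the meshed point cloud. The function names `choose_method`, `estimate_depth`, and `depth_search_range`, and the `relative_margin` parameter, are hypothetical.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]  # (x, y, z) in the first camera's frame; z is depth


def choose_method(distance_to_subject: float, threshold: float) -> str:
    """Return "first" (distance image) when the subject is nearer than the
    threshold, otherwise "second" (map information)."""
    return "first" if distance_to_subject < threshold else "second"


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def estimate_depth(pixel: Tuple[float, float], triangle: Tuple[Vec3, Vec3, Vec3],
                   focal: float, cx: float, cy: float) -> Optional[float]:
    """Estimate the depth seen at `pixel` by intersecting its viewing ray with
    the plane through three sparse 3D points. Returns None if the ray misses
    the triangular patch."""
    p0, p1, p2 = triangle
    # Viewing ray through the pixel, scaled so that its z component is 1.
    ray = ((pixel[0] - cx) / focal, (pixel[1] - cy) / focal, 1.0)
    normal = _cross(_sub(p1, p0), _sub(p2, p0))
    denom = _dot(normal, ray)
    if abs(denom) < 1e-12:
        return None  # ray is parallel to the plane
    depth = _dot(normal, p0) / denom  # ray z is 1, so the ray scale equals depth
    if depth <= 0:
        return None  # intersection is behind the camera
    hit = (ray[0] * depth, ray[1] * depth, depth)
    # The hit point is inside the triangle if it lies on the same side of all edges.
    for a, b in ((p0, p1), (p1, p2), (p2, p0)):
        if _dot(_cross(_sub(b, a), _sub(hit, a)), normal) < -1e-9:
            return None
    return depth


def depth_search_range(depth: float, relative_margin: float = 0.1) -> Tuple[float, float]:
    """Hypothetical search range: a depth interval around the estimated depth."""
    return (depth * (1.0 - relative_margin), depth * (1.0 + relative_margin))


if __name__ == "__main__":
    patch = ((-1.0, -1.0, 5.0), (1.0, -1.0, 5.0), (0.0, 1.0, 5.0))
    if choose_method(distance_to_subject=12.0, threshold=10.0) == "second":
        z = estimate_depth((50.0, 50.0), patch, focal=100.0, cx=50.0, cy=50.0)
        if z is not None:
            print(z, depth_search_range(z))
```

Evaluating `estimate_depth` for every pixel of the first frame against every patch of the mesh would give an estimated distance image. A practical system would more likely rasterize the mesh than test each ray individually, and would map the depth interval to a segment of the epipolar line in the other frame.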

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Analysis (AREA)
PCT/JP2022/025296 2021-11-29 2022-06-24 Three-dimensional model generation method and three-dimensional model generation device Ceased WO2023095375A1 (ja)

Priority Applications (4)

Application Number Priority Date Filing Date Title
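CN202280076021.4A CN118266003A (zh) 2021-11-29 2022-06-24 Three-dimensional model generation method and three-dimensional model generation device
EP22898160.1A EP4443383A4 (en) 2021-11-29 2022-06-24 METHOD AND DEVICE FOR GENERATING THREE-DIMENSIONAL MODELS
JP2023563508A JP7692175B2 (ja) 2021-11-29 2022-06-24 Three-dimensional model generation method and three-dimensional model generation device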
US18/663,702 US20240296621A1 (en) 2021-11-29 2024-05-14 Three-dimensional model generation method and three-dimensional model generation device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-193622 2021-11-29
JP2021193622 2021-11-29

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/663,702 Continuation US20240296621A1 (en) 2021-11-29 2024-05-14 Three-dimensional model generation method and three-dimensional model generation device

Publications (1)

Publication Number Publication Date
WO2023095375A1 (ja) 2023-06-01

Family

ID=86539012

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/025296 Ceased WO2023095375A1 (ja) 2021-11-29 2022-06-24 Three-dimensional model generation method and three-dimensional model generation device

Country Status (5)

Country Link
US (1) US20240296621A1
EP (1) EP4443383A4
JP (1) JP7692175B2
CN (1) CN118266003A
WO (1) WO2023095375A1

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117136542A (zh) * 2021-03-31 2023-11-28 苹果公司 Techniques for viewing 3D photos and 3D videos
KR20240094055A (ko) * 2022-11-18 2024-06-25 현대자동차주식회사 Vehicle, vehicle control method, and vehicle driving control method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
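JP2015033047A (ja) * 2013-08-05 2015-02-16 Kddi株式会社 Depth estimation device using multiple cameras
JP2017130146A (ja) 2016-01-22 2017-07-27 キヤノン株式会社 Image management device, image management method, and program
JP2019220099A (ja) * 2018-06-22 2019-12-26 凸版印刷株式会社 Stereo matching processing device, stereo matching processing method, and program
JP2020008502A (ja) * 2018-07-11 2020-01-16 株式会社フォーディーアイズ Depth acquisition device using a polarized stereo camera, and method therefor
WO2021193672A1 (ja) * 2020-03-27 2021-09-30 パナソニックIpマネジメント株式会社 Three-dimensional model generation method and three-dimensional model generation device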

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4114107B2 (ja) * 1998-03-18 2008-07-09 ソニー株式会社 Image processing device and method, and recording medium
JP2009047498A (ja) * 2007-08-17 2009-03-05 Fujifilm Corp Stereoscopic imaging device, control method for stereoscopic imaging device, and program
EP3565259A1 (en) 2016-12-28 2019-11-06 Panasonic Intellectual Property Corporation of America Three-dimensional model distribution method, three-dimensional model receiving method, three-dimensional model distribution device, and three-dimensional model receiving device
JP6981247B2 (ja) * 2017-12-27 2021-12-15 富士通株式会社 Information processing device, information processing method, and information processing program
EP4064206B1 (en) * 2019-11-20 2026-02-25 Panasonic Intellectual Property Management Co., Ltd. Three-dimensional model generation method and three-dimensional model generation device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4443383A4

Also Published As

Publication number Publication date
JPWO2023095375A1 2023-06-01
CN118266003A (zh) 2024-06-28
US20240296621A1 (en) 2024-09-05
JP7692175B2 (ja) 2025-06-13
EP4443383A1 (en) 2024-10-09
EP4443383A4 (en) 2025-03-19

Similar Documents

Publication Publication Date Title
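JP7716712B2 (ja) Three-dimensional model generation method, information processing device, and program
JP7143225B2 (ja) Three-dimensional reconstruction method and three-dimensional reconstruction device
US8718326B2 (en) System and method for extracting three-dimensional coordinates
EP3832601B1 (en) Image processing device and three-dimensional measuring system
JP7660284B2 (ja) Three-dimensional model generation method and three-dimensional model generation device
JP2013535013A (ja) Method and apparatus for image-based positioning
WO2004044522A1 (ja) Three-dimensional shape measurement method and device therefor
JP7170230B2 (ja) Three-dimensional reconstruction method and three-dimensional reconstruction device
US20240296621A1 (en) Three-dimensional model generation method and three-dimensional model generation device
US10713810B2 (en) Information processing apparatus, method of controlling information processing apparatus, and storage medium
JP7607229B2 (ja) Three-dimensional displacement measurement method and three-dimensional displacement measurement device
JP7407428B2 (ja) Three-dimensional model generation method and three-dimensional model generation device
JPWO2016135856A1 (ja) Three-dimensional shape measurement system and measurement method therefor
JP7649978B2 (ja) Three-dimensional model generation method and three-dimensional model generation device
CN119832151A (zh) Open three-dimensional reconstruction method, automatic depth positioning method, device, and robot
US20250157079A1 (en) Information processing device, information processing method, and program
TW201947186A (zh) Three-dimensional shape measurement system, probe, imaging device, computer device, and measurement time setting method
CN117109556A (zh) Experimental system and method for measuring the relative pose of a non-cooperative space target
KR20220078447A (ko) Operating method of an image restoration device that restores low-density regions
Shojaeipour et al. Robot path obstacle locator using webcam and laser emitter
JP5409451B2 (ja) Three-dimensional change detection device
JPH11183142A (ja) Three-dimensional image capturing method and three-dimensional image capturing device
CN121962410A (zh) Three-dimensional reconstruction method and related device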
Liebold et al. Integrated Georeferencing of LIDAR and Camera Data acquired from a moving platform
Uranishi et al. Three-dimensional measurement system using a cylindrical mirror

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22898160

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023563508

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 202280076021.4

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022898160

Country of ref document: EP

Effective date: 20240701