WO2022264519A1 - Information processing device, information processing method, and computer program - Google Patents

Information processing device, information processing method, and computer program

Info

Publication number
WO2022264519A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
dimensional model
vertex
information processing
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/006697
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
俊一 本間
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Group Corp
Original Assignee
Sony Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Group Corp
Priority to US18/565,569 (US20240265660A1)
Priority to JP2023529508A (JPWO2022264519A1)
Publication of WO2022264519A1
Anticipated expiration
Legal status: Ceased (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G06T19/00 Manipulating three-dimensional [3D] models or images for computer graphics
    • G06T19/006 Mixed reality
    • G06T19/20 Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2219/00 Indexing scheme for manipulating 3D models or images for computer graphics
    • G06T2219/20 Indexing scheme for editing of 3D models
    • G06T2219/2004 Aligning objects, relative positioning of parts
    • G06T2219/2021 Shape modification

Definitions

  • the present disclosure relates to an information processing device, an information processing method, and a computer program.
  • AR: augmented reality.
  • In AR, a virtual character can stand on the ground or floor, and a virtual ball can bounce off walls and other objects.
  • This makes collision effects, such as bouncing back, and occlusion (hiding) effects possible.
  • the present disclosure has been made in view of the above-described problems, and aims to enable a 3D model to be superimposed on an object in an image with high accuracy.
  • An information processing apparatus of the present disclosure includes: a position specifying unit that acquires a first feature amount associated with a first vertex of a three-dimensional model having a plurality of first vertices and, based on the first feature amount, specifies a first position corresponding to the first vertex in a target image captured by a camera; and a processing unit that projects the three-dimensional model onto the target image and deforms the projected three-dimensional model by correcting the position at which the first vertex is projected to the first position.
  • An information processing method of the present disclosure acquires a first feature amount associated with a first vertex of a three-dimensional model having a plurality of first vertices; specifies, based on the first feature amount, a first position corresponding to the first vertex in a target image captured by a camera; projects the three-dimensional model onto the target image; and deforms the projected three-dimensional model by correcting the position at which the first vertex is projected to the first position.
  • A computer program of the present disclosure causes a computer to execute: a step of acquiring a first feature amount associated with a first vertex of a three-dimensional model having a plurality of first vertices and specifying, based on the first feature amount, a first position corresponding to the first vertex in a target image captured by a camera; and a step of projecting the three-dimensional model onto the target image and deforming the projected three-dimensional model by correcting the position at which the first vertex is projected to the first position.
  • FIG. 3 is a diagram showing an example of feature points detected from an image.
  • FIG. 4 is a diagram of a dense 3D point cloud obtained from a sparse 3D point cloud.
  • FIG. 5 is a diagram of vertices in a mesh model.
  • FIG. 6 is a diagram showing an example of a feature amount database relating to feature points of a 3D model.
  • FIG. 7 is a diagram showing an example of a model database regarding vertices and meshes of a three-dimensional model.
  • FIG. 8 is a diagram showing an example of matching between feature points in an image and feature points in a 3D model.
  • FIG. 9 is a diagram showing an example in which some of the feature points of the 3D model do not match the feature points on the image.
  • FIG. 10 is a diagram for explaining processing for detecting corresponding points of feature points of a three-dimensional model.
  • FIG. 11 is a diagram showing an example of correcting the projected positions of the feature points of the 3D model to the positions of the corresponding points.
  • FIG. 12 is a diagram showing an example in which correction processing is not performed when projecting a 3D model onto an image.
  • FIG. 13 is a diagram showing another example in which correction processing is not performed when projecting a 3D model onto an image.
  • FIG. 14 is a diagram showing an example of performing correction processing when projecting a three-dimensional model onto an image.
  • FIG. 15 is a diagram showing a further example in which correction processing is not performed when projecting a 3D model onto an image.
  • FIG. 16 is a diagram showing a further example of performing correction processing when projecting a three-dimensional model onto an image.
  • A flowchart of the processing of the information processing system according to an embodiment of the present disclosure.
  • A diagram showing an example of a hardware configuration of the information processing apparatus of the present disclosure.
  • FIG. 1 is a block diagram of an information processing system 1000 according to an embodiment of the present disclosure.
  • The information processing system 1000 includes a three-dimensional model creation device 100, a database generation device 200, a database 300, an information processing device 400, and a camera 500.
  • The three-dimensional model creation device 100 includes a feature point detection unit 110, a point cloud restoration unit 120, and a model generation unit 130.
  • The database generation device 200 includes a feature point detection unit 210, a feature amount calculation unit 220, and a database generation unit 230.
  • The database 300 includes a feature amount database 310 (first database) and a model database 320 (second database).
  • The model database 320 according to this embodiment includes two tables: a vertex table 330 and a mesh table 340 (see FIG. 7, described later).
  • The information processing device 400 includes a feature point detection unit (feature amount calculation unit) 410, a matching unit 420, a posture estimation unit 430, a processing unit 440, and a database update unit 450.
  • In this embodiment, when a pre-created three-dimensional model is projected onto a projection target in an image (target image) acquired by the camera 500, the feature amounts associated with the vertices (feature points) of the three-dimensional model are used to correct the projected positions of the vertices.
  • As a result, the shape of the two-dimensional image of the three-dimensional model projected onto the image is deformed, and the three-dimensional model is superimposed on the projection target in the image with high precision.
  • The three-dimensional model is, for example, an object that can be created by a technique such as Structure from Motion (SFM), which takes a plurality of images as input and reconstructs a three-dimensional structure.
  • The three-dimensional model is an object that is projected so as to align with a projection target (superimposition target) in an image in an AR application.
  • a three-dimensional model has a plurality of vertices (first vertices).
  • a feature amount (first feature amount) is associated with each vertex of the three-dimensional model.
  • the 3D model is represented by mesh data.
  • Mesh data is data representing a set of planes (polygons) formed by connecting three or more vertices.
  • the mesh data includes vertex data including the positions of the vertices forming each plane.
  • FIG. 2 is a diagram showing an example of a method for creating a three-dimensional model.
  • The information processing system 1000 receives images 1100 as input (see FIG. 2).
  • The input images 1100 are sent to the feature point detection unit 110 of the three-dimensional model creation device 100.
  • Each image 1100 is a still image, for example a photograph, of the object 11 to be modeled (see FIG. 3).
  • The image 1100 may also be something other than a photograph, such as a paused frame of a moving image.
  • FIG. 3 is a diagram showing one of the images 1100 and a plurality of feature points on an object included in the image. Note that FIG. 3 shows an image of an object different from that in FIG. 2(a).
  • the feature point detection unit 110 detects a plurality of feature points 12 from the image 1100 by performing feature point detection processing.
  • the feature points 12 are, for example, vertices included in the model object 11 photographed in the image 1100, and points that can be recognized from the appearance of the object 11, such as points with clear shading on the image.
  • The feature point detection unit 110 calculates a local feature amount from a local image (patch image) centered on each feature point 12 in the image 1100.
  • The feature point detection unit 110 includes a feature amount calculation unit that calculates local feature amounts.
  • The feature point detection unit 110 obtains the correspondence between feature points 12 across the images 1100 based on the local feature amounts calculated from each of the plurality of images 1100. That is, feature points 12 at the same position are identified across different images 1100 by comparing their local feature amounts. The feature point detection unit 110 can thereby acquire the three-dimensional positional relationship among the plurality of feature points, as well as the positional relationship between these feature points and the camera that captured each image.
  • The feature point detection unit 110 transmits information on the detected feature points 12 (three-dimensional positions and local feature amounts) to the point cloud restoration unit 120. When a plurality of local feature amounts are obtained for the same feature point 12 from a plurality of images 1100, the feature point detection unit 110 transmits a representative value of them as the local feature amount of that feature point 12. Alternatively, all of them, or two or more of them, may be transmitted.
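The text does not say how the representative value of several local feature amounts is chosen. As a minimal sketch, assuming the element-wise mean is an acceptable representative (the function name `representative_descriptor` and the sample vectors are hypothetical), the reduction could look like this:

```python
from statistics import fmean


def representative_descriptor(descriptors):
    """Element-wise mean of several local feature vectors of one feature point.

    Assumption: the source only says "representative value"; the mean is one
    possible choice, not necessarily the one used in the disclosure.
    """
    if not descriptors:
        raise ValueError("at least one descriptor is required")
    dim = len(descriptors[0])
    if any(len(d) != dim for d in descriptors):
        raise ValueError("descriptors must share one dimensionality")
    return [fmean(d[i] for d in descriptors) for i in range(dim)]


# Three observations of the same feature point, one per input image.
observations = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0], [2.0, 3.0, 2.0]]
rep = representative_descriptor(observations)
```

Other choices, such as the median or the observation closest to all others, would fit the same interface.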
  • The point cloud restoration unit 120 acquires the information on the plurality of feature points 12 transmitted from the feature point detection unit 110.
  • the point cloud restoration unit 120 obtains, as a sparse three-dimensional point cloud 1200, a plurality of vertices indicating three-dimensional positions obtained by projecting the plurality of feature points 12 onto a three-dimensional space.
  • FIG. 2(b) shows an example of a sparse three-dimensional point cloud 1200.
  • The point cloud restoration unit 120 may use bundle adjustment to obtain a more accurate 3D position of each feature point 13 (first vertex) of the 3D model from the sparse 3D point cloud 1200.
  • The point cloud restoration unit 120 can also create a dense 3D point cloud 1300 from the sparse 3D point cloud 1200 using a technique such as Multi-View Stereo (MVS).
  • FIG. 2(c) shows an example of a dense 3D point cloud 1300.
  • FIG. 4 shows an example of a dense 3D point cloud obtained from a sparse 3D point cloud when the object of the 3D model is the object shown in FIG. Note that the process of creating a dense three-dimensional point cloud may be omitted.
  • The point cloud restoration unit 120 transmits information on the sparse three-dimensional point cloud 1200 or the dense three-dimensional point cloud 1300 to the model generation unit 130.
  • When a dense point cloud is created, the added points (vertices) can also be treated as feature points, and their feature amounts can be obtained by interpolation from the original feature points.
  • The model generation unit 130 creates a three-dimensional model (three-dimensional model 1400) composed of mesh data (see FIG. 2).
  • The model generation unit 130 forms a plane (polygon) by connecting three points based on the positions of the three-dimensional points included in the sparse three-dimensional point cloud 1200 or the dense three-dimensional point cloud 1300.
  • the three-dimensional model creation device 100 creates mesh data by gathering the planes (polygons) to obtain a three-dimensional model.
  • FIG. 5 shows an example of feature points (each vertex forming each plane) in a three-dimensional model.
  • FIG. 6 shows a database (feature amount database) containing information on the vertices (feature points) of a 3D model, such as their three-dimensional positions and local feature amounts.
  • FIG. 7 shows a database (model database) containing information on the vertices and meshes of a three-dimensional model. These databases are generated by the database generation device 200.
  • The method by which the database generation device 200 creates the feature amount database and the model database is described below.
  • In this embodiment the three-dimensional model creation device 100 and the database generation device 200 are separate, but they may be integrated.
  • In that case, the feature amount database and the model database may be created based on the information about the feature points and the mesh that the three-dimensional model creation device 100 acquires when creating the three-dimensional model.
  • the database generation device 200 acquires the information of the three-dimensional model created by the three-dimensional model creation device 100 and the image 1100 .
  • the feature point detection unit 210 detects a position (point) on the image 1100 corresponding to each vertex (feature point) that configures the three-dimensional model. For example, the positional relationship between the camera that captured each image and the feature points of the three-dimensional model, which is acquired when the three-dimensional model is generated, may be used. Alternatively, the feature point detection unit 210 may use feature points already detected from the image by the three-dimensional model creation device 100 .
  • the feature amount calculation unit 220 calculates the local feature amount of the detected position (point) from each image 1100 in the same manner as described above.
  • The feature amount calculation unit 220 associates the calculated local feature amount with the feature point and transmits the result to the database generation unit 230.
  • The local feature amount associated with a feature point may be a representative value of the plurality of local feature amounts obtained from the plurality of images 1100. Alternatively, it may be all of the plurality of local feature amounts, or two or more selected from them. Note that the feature amount calculation unit 220 may use local feature amounts that have already been calculated by the three-dimensional model creation device 100.
  • The database generation unit 230 creates a feature amount database 310 (first database) that records information about feature points as shown in FIG. 6, and a model database 320 (second database) that records information about vertices and meshes as shown in FIG. 7.
  • The feature amount database 310 includes a column 311 for recording unique feature point IDs that identify feature points, a column 312 for recording the three-dimensional positions of feature points, and a column 313 for recording the local feature amounts of feature points.
  • The model database 320 includes a vertex table 330 containing data of the vertices forming a mesh, as shown in FIG. 7(a), and a mesh table 340, as shown in FIG. 7(b).
  • The vertex table 330 includes a column 331 for recording unique vertex IDs that identify mesh vertices, a column 332 for recording the feature point IDs corresponding to the vertices, and a column 333 for recording three-dimensional positions.
  • the mesh table 340 includes a column 341 that records unique mesh IDs for identifying meshes, and a column 342 that records the vertex IDs of vertices that make up the mesh.
  • the feature quantity database 310 and the model database 320 are associated with each other by vertex IDs. For example, when a mesh on the surface of a three-dimensional model is identified, it is possible to identify the vertices forming the mesh, the three-dimensional positions of the vertices (feature points), and the local feature amounts from the mesh ID.
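The lookup from a mesh ID to its vertices, positions, and local feature amounts can be illustrated with a small in-memory sketch. The class names, dictionary layout, and sample rows below are hypothetical; only the column roles (311 to 313, 331 to 333, 341 and 342) come from the description. The sketch joins the two databases through the feature point ID recorded in the vertex table, which is an assumption about how the association is realized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeaturePoint:
    """One row of the feature amount database 310 (key: feature point ID, column 311)."""
    position: tuple    # column 312: three-dimensional position
    descriptor: tuple  # column 313: local feature amount


@dataclass(frozen=True)
class Vertex:
    """One row of the vertex table 330 (key: vertex ID, column 331)."""
    feature_point_id: int  # column 332
    position: tuple        # column 333


# Hypothetical sample rows.
feature_db = {
    10: FeaturePoint((0.0, 0.0, 0.0), (0.1, 0.9)),
    11: FeaturePoint((1.0, 0.0, 0.0), (0.4, 0.2)),
    12: FeaturePoint((0.0, 1.0, 0.0), (0.7, 0.3)),
}
vertex_table = {
    1: Vertex(10, (0.0, 0.0, 0.0)),
    2: Vertex(11, (1.0, 0.0, 0.0)),
    3: Vertex(12, (0.0, 1.0, 0.0)),
}
# Mesh table 340: mesh ID (column 341) -> vertex IDs (column 342).
mesh_table = {100: (1, 2, 3)}


def lookup_mesh(mesh_id):
    """Resolve a mesh ID to (vertex ID, 3D position, local feature amount) triples."""
    rows = []
    for vertex_id in mesh_table[mesh_id]:
        vertex = vertex_table[vertex_id]
        feature = feature_db[vertex.feature_point_id]
        rows.append((vertex_id, vertex.position, feature.descriptor))
    return rows


triangle = lookup_mesh(100)
```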
  • the information processing device 400 performs a process of projecting a three-dimensional model onto an image captured by a camera and superimposing the three-dimensional model on the image with high accuracy.
  • the feature point detection unit 410 of the information processing device 400 in FIG. 1 acquires an image 510 (target image) captured by the camera 500 .
  • the feature point detection unit 410 detects a plurality of feature points 511_1 from the image 510 by feature point detection, and calculates the local feature amount of the feature points 511_1.
  • The feature point detection unit 410 transmits information regarding the feature points 511_1 (position information, local feature amounts, etc.) to the matching unit 420.
  • The feature points 511_1 may be obtained by performing feature point detection on the entire image 510, or by specifying an image portion corresponding to a building by semantic segmentation or the like and performing feature point detection on the specified image portion.
  • The matching unit 420 acquires the information on the feature points 511_1 detected from the image 510 (position information, local feature amounts, etc.) from the feature point detection unit 410.
  • The matching unit 420 acquires the plurality of feature points 511_2 (first vertices) of the three-dimensional model and their local feature amounts (first feature amounts) recorded in the database 300.
  • the matching unit 420 compares the local feature amount of the feature points on the three-dimensional model with the local feature amount of the feature point 511_1, and matches the corresponding feature points.
  • The matching unit 420 thereby identifies pairs of feature points that match each other. The matching unit 420 transmits information about the matched feature points to the posture estimation unit 430.
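The matching criterion is not spelled out in the text. The following minimal sketch assumes nearest-neighbour matching on Euclidean distance in feature space with an acceptance threshold; the function name `match_features`, the threshold value, and the toy descriptors are all hypothetical.

```python
import math


def match_features(image_descs, model_descs, max_distance):
    """Pair each model feature point with its nearest image feature point.

    Returns (image_index, model_index) pairs whose descriptor distance is at
    most max_distance. Model feature points without such a neighbour stay
    unmatched.
    """
    matches = []
    for j, model_desc in enumerate(model_descs):
        best_i, best_d = None, math.inf
        for i, image_desc in enumerate(image_descs):
            dist = math.dist(model_desc, image_desc)
            if dist < best_d:
                best_i, best_d = i, dist
        if best_i is not None and best_d <= max_distance:
            matches.append((best_i, j))
    return matches


image_descs = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
model_descs = [(0.1, 0.0), (5.0, 5.1), (9.0, 9.0)]
matches = match_features(image_descs, model_descs, max_distance=0.5)
```

In this toy input the third model feature point has no image feature point within the threshold, so it remains unmatched.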
  • FIG. 8 is a diagram schematically showing an example of matching feature points in an image captured by a camera and feature points in a 3D model. A situation is shown in which feature points 511_1 included in an image 510 acquired by a camera 500 and feature points 511_2 included in a three-dimensional model 900 of a building are all matched.
  • FIG. 9(a) is a diagram showing an example in which some of the feature points of the 3D model are not matched when the feature points of the 3D model and the feature points of the image are matched.
  • the feature point 511_2 in the 3D model matches the feature point 511_1 in the image, but the feature point 512_2 in the 3D model does not match.
  • FIG. 9(b) will be described later.
  • The posture estimation unit 430 estimates the pose of the camera 500 that captured the image 510. More specifically, the posture estimation unit 430 estimates the pose of the camera 500 based on a plurality of pairs (N pairs), each consisting of the two-dimensional position of a feature point on the image and the three-dimensional position of the matching feature point of the three-dimensional model.
  • For this estimation, for example, a PnP algorithm using the RANSAC (Random Sample Consensus) framework (PNP-RANSAC) can be used.
  • The feature points of the three-dimensional model included in the pairs used for the estimation correspond to the inliers in PNP-RANSAC.
  • The feature points of the 3D model included in the pairs that were not used for the estimation correspond to the outliers in PNP-RANSAC.
  • The processing unit 440 projects the three-dimensional model onto the image 510 in accordance with the estimated pose of the camera 500.
  • The positions at which the feature points of the three-dimensional model used for estimating the camera pose (inlier points) are projected onto the image 510 match, or are close to, the two-dimensional positions of the feature points on the image that are paired with those inlier points. In other words, the three-dimensional model and the image of the projection destination can be considered to match in the vicinity of the projected positions of the inlier feature points.
  • In contrast, the positions at which the feature points of the 3D model that were not used for estimating the camera pose, and the feature points that were not matched in the matching process described above, are projected onto the image 510 can be very different from where they should be in the image plane. For example, if the projected position is far from the original position on the image plane, or if there is an obstructing object between the 3D model object (real-world object) and the camera, parts of the three-dimensional model may not be projected (not shown) in the image. That is, the three-dimensional model and the image of the projection destination can be considered not to match in the vicinity of the projected positions of the outlier feature points and the unmatched feature points.
  • the feature points of the 3D model that are not used for estimating the camera pose are referred to as outlier feature points (vertices).
  • the feature points of the three-dimensional model used for estimating the pose of the camera are called inlier feature points (vertices).
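The inlier and outlier distinction can be illustrated by the consensus test applied to a candidate pose. The sketch below is not a PnP solver: it assumes a pose (rotation matrix and translation vector) is already given, uses a simple pinhole camera with a single focal length and no lens distortion, and splits the 2D-3D pairs by reprojection error. The names `project` and `split_inliers`, the threshold, and the sample values are hypothetical.

```python
import math


def project(point, rotation, translation, focal, center):
    """Pinhole projection of a 3D model point for a camera pose (R, t)."""
    xc, yc, zc = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    return (focal * xc / zc + center[0], focal * yc / zc + center[1])


def split_inliers(pairs, rotation, translation, focal, center, max_error):
    """Split (3D point, 2D point) pairs by reprojection error in pixels."""
    inliers, outliers = [], []
    for index, (point3d, point2d) in enumerate(pairs):
        projected = project(point3d, rotation, translation, focal, center)
        error = math.dist(projected, point2d)
        (inliers if error <= max_error else outliers).append(index)
    return inliers, outliers


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pairs = [
    ((0.0, 0.0, 2.0), (50.0, 50.0)),   # projects exactly onto its match
    ((1.0, 0.0, 2.0), (100.5, 50.0)),  # half a pixel off
    ((0.0, 1.0, 4.0), (80.0, 75.0)),   # 30 pixels off
]
inliers, outliers = split_inliers(
    pairs, identity, (0.0, 0.0, 0.0), focal=100.0, center=(50.0, 50.0), max_error=2.0
)
```

A RANSAC loop would repeat this test for many candidate poses and keep the pose with the largest inlier set.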
  • The processing unit 440 projects the three-dimensional model onto the image 510 captured by the camera 500, and corrects the projected positions of the outlier feature points to appropriate positions. This deforms the two-dimensional shape of the three-dimensional model projected onto the image. As a result, the three-dimensional model can be accurately superimposed on the projection target in the image.
  • the processing unit 440 functions as a processing unit that deforms the three-dimensional model projected onto the image by correcting the positions of the projection destinations of the outlier feature points in the three-dimensional model. Details of the processing unit 440 will be described below.
  • the processing unit 440 sets a region (referred to as region A) centered on the projected position of the outlier feature point in the image onto which the 3D model is projected.
  • FIG. 10 shows an example of a surrounding area A centered on the position where the outlier feature point 512_2 is projected.
  • Area A is a partial area of the image onto which the three-dimensional model is projected.
  • Area A is, for example, a rectangular area of M × M pixels.
  • the processing unit 440 calculates a local feature amount (second feature amount) for each pixel (position) in the area A. Each pixel is selected in turn, and the distance (distance in the feature space) or difference between the local feature amount of the selected pixel and the local feature amount (first feature amount) of the outlier feature point 512_2 is calculated. If the distance is equal to or less than the threshold, it is determined that the search for the corresponding points has succeeded, and if the distance is greater than the threshold, the search for the corresponding points has failed.
  • the processing unit 440 regards pixels (positions) whose distance is equal to or less than the threshold as corresponding points, that is, positions (pixels) on the image corresponding to the outlier feature point 512_2.
  • the processing unit 440 includes a position specifying unit 440A that specifies the positions of corresponding points.
  • The processing unit 440 may terminate the search when a corresponding point is detected for the first time, or may search all the pixels in the area A and adopt, as the corresponding point, the pixel with the smallest distance among the pixels at or below the threshold.
  • The position of the corresponding point found by the search is the position (first position) corresponding to the outlier feature point (first vertex) in the image captured by the camera (target image).
  • The position specifying unit 440A acquires a first feature amount associated with a first vertex of the three-dimensional model having a plurality of first vertices and, based on the acquired first feature amount, identifies a first position (corresponding point) corresponding to the first vertex in the target image captured by the camera.
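The search inside area A can be sketched as an exhaustive scan of an M × M window. This is a minimal sketch under several assumptions: M is odd, the per-pixel local feature amount is supplied by a caller-provided function `descriptor_at(u, v)`, distance is Euclidean, and the full-search variant (smallest distance at or below the threshold) is used. All names and the toy descriptor are hypothetical.

```python
import math


def find_corresponding_point(descriptor_at, center, m, target, threshold, width, height):
    """Search the M x M window around center for the best-matching pixel.

    Returns the (u, v) pixel whose descriptor is closest to target, or None
    when even the closest pixel exceeds the threshold (search failed).
    """
    u0, v0 = center
    half = m // 2
    best, best_d = None, math.inf
    # Clip the window to the image bounds.
    for v in range(max(0, v0 - half), min(height, v0 + half + 1)):
        for u in range(max(0, u0 - half), min(width, u0 + half + 1)):
            d = math.dist(descriptor_at(u, v), target)
            if d < best_d:
                best, best_d = (u, v), d
    return best if best_d <= threshold else None


def toy_descriptor(u, v):
    """Toy stand-in for a real local feature amount: the pixel coordinates."""
    return (float(u), float(v))


# The outlier feature point was projected at (3, 3); its stored descriptor
# actually matches pixel (5, 4).
found = find_corresponding_point(toy_descriptor, (3, 3), 5, (5.0, 4.0), 0.5, 7, 7)
missed = find_corresponding_point(toy_descriptor, (3, 3), 3, (5.0, 4.0), 0.5, 7, 7)
```

With the 5 × 5 window the matching pixel lies inside area A and is returned; with the 3 × 3 window it lies outside, so the search fails. The early-termination variant would return the first pixel at or below the threshold instead.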
  • the processing unit 440 transforms the projected image of the three-dimensional model by moving the projected position of the outlier feature point to the position (pixel) of the searched corresponding point.
  • The following method is also possible. The three-dimensional position of the outlier feature point is corrected in the three-dimensional model so that its projected position on the image becomes the corrected (post-movement) position. The corrected three-dimensional model is then reprojected onto the image.
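The text does not state how the three-dimensional position is chosen when correcting the model itself, and infinitely many 3D points project to the same pixel. One possible sketch, assuming the vertex keeps its depth along the camera's optical axis and a distortion-free pinhole camera, back-projects the corrected pixel to that depth. The function names and numeric values are hypothetical.

```python
import math


def to_camera(point, rotation, translation):
    """Transform a model-space point into camera coordinates: R * X + t."""
    return [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]


def project(point, rotation, translation, focal, center):
    """Pinhole projection of a model-space point onto the image."""
    xc, yc, zc = to_camera(point, rotation, translation)
    return (focal * xc / zc + center[0], focal * yc / zc + center[1])


def correct_vertex(point, target_pixel, rotation, translation, focal, center):
    """Move a vertex so that it projects onto target_pixel.

    Assumption: the vertex keeps its original camera-space depth.
    """
    depth = to_camera(point, rotation, translation)[2]
    cam = [
        (target_pixel[0] - center[0]) * depth / focal,
        (target_pixel[1] - center[1]) * depth / focal,
        depth,
    ]
    shifted = [cam[i] - translation[i] for i in range(3)]
    # A rotation matrix is orthonormal, so its inverse is its transpose.
    return tuple(sum(rotation[r][c] * shifted[r] for r in range(3)) for c in range(3))


# Camera rotated 90 degrees about the optical axis and shifted along it.
rotation = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
translation = (0.0, 0.0, 1.0)
corrected = correct_vertex((1.0, 0.0, 3.0), (80.0, 60.0), rotation, translation, 100.0, (50.0, 50.0))
u, v = project(corrected, rotation, translation, 100.0, (50.0, 50.0))
```

Reprojecting the corrected vertex lands on the requested pixel up to floating-point rounding.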
  • An example is shown in which the projected position of the feature point 512_2, an outlier of the 3D model, is corrected, changing the shape of the projected two-dimensional image.
  • the feature point after the position change is shown as feature point 511_3.
  • FIG. 12 shows an example in which the 3D model is not superimposed with high accuracy when the projected positions of the outlier feature points are not corrected.
  • The 3D model (a large 3D model) includes two sub-models (3D models 810 and 820) as parts. An example of projecting this large 3D model is shown.
  • The image includes a building 710 in the foreground and a building 720 in the background.
  • the pose of the camera is estimated using the feature points (inlier feature points) of the three-dimensional model 810 corresponding to the building 710 in the foreground. There is no outlier feature point in the 3D model 810 corresponding to the foreground, and the feature point 711 of the 3D model 810 is projected at or near the position of the corresponding point in the image.
  • the three-dimensional model 810 is superimposed on the projection target in the image with high accuracy.
  • In this example, all or some of the feature points of the 3D model 820 corresponding to the background are outliers, and the projected position of the 3D model 820 deviates from its intended projection target.
  • That is, the projection area of the background model 820 deviates greatly from the position of the building 720 onto which it should be projected, and the three-dimensional model 820 is not accurately superimposed on the image.
  • illustration of feature points (outlier feature points) of the three-dimensional model 820 is omitted.
  • FIG. 13 shows an example in which three-dimensional models are not superimposed with high accuracy when the projected positions of outlier feature points are not corrected.
  • As in FIG. 12, the 3D model (a large 3D model) includes two sub-models (3D models 810 and 820) as parts, and an example of projecting this large 3D model is shown.
  • the position of the camera is estimated using the feature points (inlier feature points) of the three-dimensional model 820 corresponding to the distant building 720 .
  • the three-dimensional model 820 is accurately superimposed on the projection target (image portion of the building in the background) in the image.
  • In this example, all or some of the feature points of the 3D model 810 corresponding to the foreground are outliers, and the projected position of the 3D model 810 deviates from its intended projection target.
  • That is, the projection area of the foreground model 810 deviates greatly from the position of the building 710 onto which it should be projected, and the three-dimensional model 810 is not accurately superimposed on the image.
  • illustration of feature points (outlier feature points) of the three-dimensional model 810 is omitted.
  • FIG. 14 shows an example in which correction processing according to this embodiment is performed in the case of the example shown in FIG.
  • The positions at which the outlier feature points of the three-dimensional model 810 are projected onto the image are corrected to the positions of the corresponding points described above.
  • The projected image of the three-dimensional model 810 is deformed, and the projected three-dimensional model 810 is accurately superimposed on the building 710 in the foreground. It should be noted that illustration of feature points in the three-dimensional models 810 and 820 is omitted in FIG.
  • FIG. 15 shows another example of projecting a 3D model without correcting the positions of outlier feature points.
  • FIG. 16 shows an example in which the positions of the feature points that are outliers in FIG. 15 are corrected.
  • FIG. 15 shows an example of projecting a large-scale three-dimensional model (including three-dimensional models 730_1 to 730_5 as part) onto an image.
  • Three-dimensional models 730_1 to 730_5 correspond to buildings 830_1 to 830_5, which are objects of projection.
  • Outlier feature points 522_1, 522_2, 522_3, and 522_5 are shown in FIG.
  • the positions of these feature points are corrected to the positions of their respective corresponding points, as shown in FIG.
  • the feature points 522_1, 522_2, 522_3, 522_5 after the positions are corrected are indicated by feature points 523_1, 523_2, 523_3, 523_5 in FIG.
  • the sub-models 730_1 to 730_5 included in the three-dimensional model are precisely superimposed on the buildings 830_1 to 830_5 that are the respective projection targets.
  • circles with no reference numerals represent feature points that are inliers.
  • The database updating unit 450 updates the positions (three-dimensional positions) of the vertices in the three-dimensional model based on the corrected positions (two-dimensional positions) of the projected feature points.
  • The database updating unit 450 changes the shape of the mesh by reflecting the corrected position information of the feature points in the mesh data of the three-dimensional model, thereby correcting the three-dimensional model itself.
  • p3 is the distance in the depth direction of the vertex cPv in the camera coordinate system.
  • K is a 3 × 3 intrinsic parameter matrix.
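The definitions of p3 and K above suggest a standard pinhole back-projection. As a minimal sketch, assuming the corrected two-dimensional position is a pixel (u, v), that K has the usual form with focal lengths fx, fy and principal point (cx, cy), and that the depth p3 of the original vertex is reused for the corrected vertex, the corrected position in the camera coordinate system could be recovered as follows. The function names and all numeric values are hypothetical and are not taken from the disclosure.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Return the camera-frame point that projects to pixel (u, v) at the given depth.

    Assumes a pinhole model with K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]],
    so this is depth * K^-1 * (u, v, 1)^T written out by hand.
    """
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def project(point, fx, fy, cx, cy):
    """Project a camera-frame point (x, y, z) to a pixel using the same K."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical values: a corrected pixel position and the vertex depth p3.
FX, FY, CX, CY = 1000.0, 1000.0, 640.0, 360.0
corrected_vertex = back_project(700.0, 400.0, 10.0, FX, FY, CX, CY)
reprojected = project(corrected_vertex, FX, FY, CX, CY)
```

Re-projecting the recovered point returns the corrected pixel, which is a quick consistency check. Expressing the point in the model coordinate system would additionally require applying the inverse of the estimated camera pose, which this sketch omits.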
  • FIG. 17 is a flowchart illustrating an example of the processing flow of the information processing system 1000 according to the embodiment of the present disclosure.
  • the feature point detection unit 410 detects a plurality of feature points from one or more images 510 acquired by the camera 500 (S1001).
  • the feature point detection unit 410 calculates local feature amounts for each of the plurality of feature points based on the image 510 (S1002).
  • Based on the calculated local feature amounts and the local feature amounts of the vertices (feature points) of the three-dimensional model recorded in the database 300, the feature point detection unit 410 matches the vertices (feature points) of the three-dimensional model with the feature points of the image 510 and generates pairs of matched feature points (S1003).
  • the pose estimation unit 430 estimates the pose of the camera 500 based on the pair of feature points (S1004).
  • The processing unit 440 projects the three-dimensional model onto the image 510 based on the estimated pose of the camera 500 (S1005). That is, the three-dimensional model is projected onto the image 510 in accordance with the estimated pose of the camera. Since the vertices (feature points) of the three-dimensional model included in the above pairs are the vertices used for estimating the camera pose, these feature points are accurately projected onto the image 510.
  • The processing unit 440 identifies either or both of the following: feature points of the three-dimensional model that were not matched with any feature point of the image 510, and feature points of the three-dimensional model belonging to pairs that were not used for estimating the pose of the camera 500.
  • the identified feature points correspond to feature points that are outliers.
  • The processing unit 440 sets a region (referred to as region A) centered on the projected position of each outlier feature point, and calculates a local feature amount for each position (point) in region A.
  • The processing unit 440 searches region A for a position (point) whose local feature amount differs from that of the outlier feature point by no more than a threshold (S1006).
  • The processing unit 440 corrects the projected position of the outlier feature point to the position found in step S1006 (S1006). As a result, the projected image of the three-dimensional model is deformed, and the three-dimensional model is superimposed on the object in the image 510 with high accuracy.
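The search over region A and the subsequent position correction described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: region A is taken to be a square of half-width `radius`, the difference between local feature amounts is taken to be the Euclidean distance, and when several positions satisfy the threshold the closest one is chosen. The callback `descriptor_at` is a hypothetical stand-in for whatever computes a local feature amount at an image position.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two local feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correct_projected_position(projected, model_descriptor, descriptor_at, radius, threshold):
    """Search region A around `projected` for the best-matching position.

    `descriptor_at(x, y)` is a hypothetical callback returning the local
    feature amount computed from the image at pixel (x, y). Returns the
    position whose descriptor is closest to `model_descriptor`, provided the
    distance is at or below `threshold`; otherwise returns `projected` unchanged.
    """
    px, py = projected
    best_pos, best_dist = projected, None
    for y in range(py - radius, py + radius + 1):
        for x in range(px - radius, px + radius + 1):
            d = descriptor_distance(descriptor_at(x, y), model_descriptor)
            if d <= threshold and (best_dist is None or d < best_dist):
                best_pos, best_dist = (x, y), d
    return best_pos


# Toy image: only pixel (12, 9) looks like the model feature point.
def toy_descriptor_at(x, y):
    return (1.0, 0.0) if (x, y) == (12, 9) else (0.0, 1.0)


corrected = correct_projected_position((10, 10), (1.0, 0.0), toy_descriptor_at, 3, 0.1)
unchanged = correct_projected_position((10, 10), (1.0, 0.0), toy_descriptor_at, 1, 0.1)
```

In the toy data the matching pixel lies inside a region of half-width 3 but outside a region of half-width 1, so the first call moves the projected position and the second leaves it where it was.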
  • According to the information processing apparatus of the present disclosure, outlier feature points are detected among the feature points of the three-dimensional model in the image 510, and the projected position of each detected feature point is corrected to the position of a pixel, among the pixels in the surrounding region, whose local feature amount is close to or the same as that of the feature point. As a result, the projected image of the three-dimensional model can be deformed, and the three-dimensional model can be superimposed with high accuracy on the projection target in the image (AR superimposition).
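The outlier detection summarized above amounts to simple set operations over the matching results. The sketch below assumes that matched pairs are available as (vertex ID, image feature point ID) tuples and that the pose estimator reports which pairs it actually used. Both data layouts, and the choice to take the union of the two outlier kinds, are assumptions made for illustration.

```python
def find_outlier_vertices(model_vertex_ids, matched_pairs, inlier_pairs):
    """Return IDs of model feature points treated as outliers.

    matched_pairs: (vertex_id, image_point_id) pairs produced by matching.
    inlier_pairs:  the subset of matched_pairs actually used for pose estimation.
    """
    matched = {vertex_id for vertex_id, _ in matched_pairs}
    used = {vertex_id for vertex_id, _ in inlier_pairs}
    unmatched = set(model_vertex_ids) - matched   # never paired with an image point
    matched_but_unused = matched - used           # paired, but rejected by the estimator
    return sorted(unmatched | matched_but_unused)


# Hypothetical example: five model vertices, three matched, two used for the pose.
outliers = find_outlier_vertices(
    [1, 2, 3, 4, 5],
    [(1, "a"), (2, "b"), (3, "c")],
    [(1, "a"), (2, "b")],
)
```

Here vertex 3 was matched but not used for pose estimation, and vertices 4 and 5 were never matched, so all three are candidates for position correction.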
  • In the embodiment described above, the information processing apparatus 400 reflected the results of correcting the positions of the feature points of the three-dimensional model in both the vertex table 330 (see FIG. 7A) in the model database and the feature amount database 310. In this modified example, the corrected positions of the feature points of the three-dimensional model are reflected only in the vertex table 330.
  • The position at which a feature point (vertex) is projected onto the camera image varies with the lens distortion of the camera and with how accurately distortion correction is performed. Therefore, if the positions of vertices corrected on the image are reflected in the feature amount database, vertex positions that were originally correct may be changed to incorrect positions.
  • The three-dimensional coordinates of the vertices in the feature amount database used for estimating the pose of the camera and the three-dimensional coordinates of the vertices in the vertex table are managed independently, and corrections to the three-dimensional positions of the vertices are reflected only in the vertex table. This makes it possible to correct only the positions of the vertices used for AR superimposition, in accordance with the characteristics of the camera.
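The independent management described in this modified example can be illustrated with two separate stores that start from the same coordinates but receive corrections differently. The class and method names below are hypothetical, and the dictionary layout is an assumption chosen for brevity rather than the structure of the actual feature amount database or vertex table.

```python
from dataclasses import dataclass, field


@dataclass
class ModelStores:
    """Two independently managed copies of vertex positions (hypothetical layout)."""
    # vertex id -> (x, y, z), used for camera pose estimation
    feature_db: dict = field(default_factory=dict)
    # vertex id -> (x, y, z), used for AR superimposition
    vertex_table: dict = field(default_factory=dict)

    def add_vertex(self, vertex_id, position):
        """Register a vertex with the same initial position in both stores."""
        self.feature_db[vertex_id] = tuple(position)
        self.vertex_table[vertex_id] = tuple(position)

    def apply_correction(self, vertex_id, corrected_position):
        """Reflect a corrected 3D position in the vertex table only."""
        self.vertex_table[vertex_id] = tuple(corrected_position)


stores = ModelStores()
stores.add_vertex(7, (1.0, 2.0, 3.0))
stores.apply_correction(7, (1.1, 2.0, 3.0))
```

After the correction, pose estimation still sees the original coordinates in `feature_db`, while superimposition uses the corrected coordinates in `vertex_table`.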
  • FIG. 18 is an example of the hardware configuration of a computer that executes a series of processes of the information processing system 1000 of the present disclosure by a program.
  • CPU 1001 , ROM 1002 and RAM 1003 are interconnected via bus 1004 .
  • An input/output interface 1005 is also connected to the bus 1004 .
  • An input unit 1006 , an output unit 1007 , a storage unit 1008 , a communication unit 1009 and a drive 1010 are connected to the input/output interface 1005 .
  • the input unit 1006 is composed of, for example, a keyboard, mouse, microphone, touch panel, input terminal, and the like.
  • the output unit 1007 includes, for example, a display, a speaker, an output terminal, and the like.
  • the storage unit 1008 is composed of, for example, a hard disk, a RAM disk, a non-volatile memory, or the like.
  • The communication unit 1009 is composed of, for example, a network interface. The drive 1010 drives removable media such as magnetic disks, optical disks, magneto-optical disks, or semiconductor memories.
  • The CPU 1001 loads, for example, a program stored in the storage unit 1008 into the RAM 1003 via the input/output interface 1005 and the bus 1004 and executes it, whereby the above-described series of processes is performed.
  • the RAM 1003 also appropriately stores data necessary for the CPU 1001 to execute various processes.
  • The program executed by the computer can be provided by being recorded on a removable medium such as a package medium.
  • the program can be installed in the storage unit 1008 via the input/output interface 1005 by loading the removable medium into the drive 1010 .
  • This program can also be provided via wired or wireless transmission media such as local area networks, the Internet, and digital satellite broadcasting.
  • the program can be received by the communication unit 1009 and installed in the storage unit 1008 .
  • the present invention is not limited to the above-described embodiments as they are, and can be embodied by modifying the constituent elements without departing from the gist of the present invention at the implementation stage. Further, various inventions can be formed by appropriate combinations of the plurality of constituent elements disclosed in the above embodiments. For example, some components may be omitted from all components shown in the embodiments. Furthermore, components across different embodiments may be combined as appropriate.
  • this disclosure can also take the following configurations.
  • Information processing device with
  • [Item 2] The information processing apparatus according to Item 1, further comprising a feature amount calculation unit that calculates a second feature amount for a position in the target image, wherein the position specifying unit specifies a position in the target image having the second feature amount whose distance from the first feature amount is equal to or less than a threshold, and sets the specified position as the first position.
  • [Item 3] The information processing apparatus according to Item 2, wherein the feature amount calculation unit calculates the second feature amount for each of a plurality of positions within a region surrounding the position at which the first vertex is projected, and the position specifying unit specifies a position having the second feature amount whose distance from the first feature amount is equal to or less than a threshold, and sets the specified position as the first position.
  • [Item 4] The surrounding region is a certain region around the projected position.
  • [Item 5] The information processing apparatus according to any one of Items 1 to 4, further comprising: a feature amount calculation unit that detects a plurality of feature points in the target image and calculates a plurality of second feature amounts for the plurality of feature points; and an estimation unit that detects the feature point having the second feature amount whose distance from the first feature amount is equal to or less than a threshold, and estimates the orientation of the camera based on a pair of the first vertex and the detected feature point, wherein the processing unit projects the three-dimensional model onto the target image based on the orientation of the camera.
  • [Item 6] The information processing apparatus according to Item 5, wherein the estimation unit estimates the orientation of the camera based on a PnP (Perspective-n-Point) algorithm.
  • [Item 7] The information processing apparatus further comprises: a first database containing the position of the first vertex and the first feature amount associated with the first vertex; a second database containing the position of the first vertex; and an updating unit that converts the first position in the target image into a position in a three-dimensional model coordinate system and updates the position of the first vertex in the second database based on the position after conversion, wherein the estimation unit estimates the orientation of the camera based on the first database, and the position specifying unit specifies the first position corresponding to the first vertex in the target image based on the second database.
  • [Item 8] The updating unit updates the position of the first vertex in the first database based on the position after conversion.
  • [Item 9] The information processing apparatus according to Item 7, wherein the updating unit does not update the position of the first vertex in the first database.
  • [Item 10] The information processing apparatus according to any one of Items 1 to 9, wherein the three-dimensional model is a model in which a plurality of feature points, detected by performing feature point detection on one or more images of an object, are set as the first vertices, and the first feature amount associated with the first vertex is a feature amount calculated for the corresponding feature point.
  • [Item 11] An information processing method comprising: obtaining a first feature amount associated with a first vertex of a three-dimensional model having a plurality of first vertices; identifying a first position corresponding to the first vertex in a target image; and deforming the three-dimensional model projected onto the target image by projecting the three-dimensional model onto the target image and correcting the position at which the first vertex is projected to the first position.
  • [Item 12] A computer program causing a computer to execute: obtaining a first feature amount associated with a first vertex of a three-dimensional model having a plurality of first vertices; identifying a first position corresponding to the first vertex in a target image; and projecting the three-dimensional model onto the target image, and deforming the three-dimensional model projected onto the target image by correcting the projected position of the first vertex to the first position.
  • REFERENCE SIGNS LIST
  • 100 three-dimensional model creation device
  • 110 feature point detection unit
  • 120 point group restoration unit
  • 130 model generation unit
  • 200 database generation device
  • 210 feature point detection unit
  • 220 feature amount calculation unit
  • 300 database
  • 310 feature amount database
  • 320 model database
  • 330 vertex table
  • 340 mesh table
  • 400 information processing device
  • 410 feature point detection unit
  • 420 matching unit
  • 430 posture estimation unit
  • 440 processing unit
  • 500 camera

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Architecture (AREA)
  • Processing Or Creating Images (AREA)
PCT/JP2022/006697 2021-06-14 2022-02-18 Information processing apparatus, information processing method, and computer program Ceased WO2022264519A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US18/565,569 US20240265660A1 (en) 2021-06-14 2022-02-18 Information processing apparatus, information processing method, and computer program
JP2023529508A JPWO2022264519A1 2021-06-14 2022-02-18

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021098991 2021-06-14
JP2021-098991 2021-06-14

Publications (1)

Publication Number Publication Date
WO2022264519A1 (ja) 2022-12-22

Family

ID=84526094

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/006697 Ceased WO2022264519A1 (ja) Information processing apparatus, information processing method, and computer program 2021-06-14 2022-02-18

Country Status (3)

Country Link
US (1) US20240265660A1
JP (1) JPWO2022264519A1
WO (1) WO2022264519A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117152553A (zh) * 2023-07-28 2023-12-01 深圳康诺思腾科技有限公司 Image label generation method, apparatus and system, medium, and computing device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
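JPH0950540A (ja) * 1995-08-09 1997-02-18 Hitachi Ltd Image generation method
JP2003141569A (ja) * 2001-10-31 2003-05-16 Canon Inc Information processing method and video compositing apparatus
JP2005227876A (ja) * 2004-02-10 2005-08-25 Canon Inc Image processing method and image processing apparatus
JP2008040913A (ja) * 2006-08-08 2008-02-21 Canon Inc Information processing method and information processing apparatus
JP2011145856A (ja) * 2010-01-14 2011-07-28 Ritsumeikan Image generation method and image generation system using mixed reality technology
JP2015228050A (ja) * 2014-05-30 2015-12-17 ソニー株式会社 Information processing apparatus and information processing method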

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
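JP6464934B2 (ja) * 2015-06-11 2019-02-06 富士通株式会社 Camera pose estimation device, camera pose estimation method, and camera pose estimation program
JP2018036901A (ja) * 2016-08-31 2018-03-08 富士通株式会社 Image processing device, image processing method, and image processing program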
US11295532B2 (en) * 2018-11-15 2022-04-05 Samsung Electronics Co., Ltd. Method and apparatus for aligning 3D model

Also Published As

Publication number Publication date
US20240265660A1 (en) 2024-08-08
JPWO2022264519A1 2022-12-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22824519

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023529508

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22824519

Country of ref document: EP

Kind code of ref document: A1