WO2023116430A1 - Method, system and storage medium for fusing video with a city information model three-dimensional scene - Google Patents

Method, system and storage medium for fusing video with a city information model three-dimensional scene

Info

Publication number
WO2023116430A1
Authority
WO
WIPO (PCT)
Prior art keywords
camera
scene
video
parameter
parameters
Prior art date
Application number
PCT/CN2022/137042
Other languages
English (en)
French (fr)
Inventor
陈彪
陈顺清
刘慧敏
Original Assignee
奥格科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 奥格科技股份有限公司
Publication of WO2023116430A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 19/00: Manipulating 3D models or images for computer graphics
    • G06T 19/20: Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30244: Camera pose
    • G06T 2219/00: Indexing scheme for manipulating 3D models or images for computer graphics
    • G06T 2219/20: Indexing scheme for editing of 3D models
    • G06T 2219/2004: Aligning objects, relative positioning of parts
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Definitions

  • The invention relates to the field of cartography, and in particular to a heuristic-algorithm-based method, system and storage medium for fusing video with a three-dimensional city information model scene.
  • Scenes based on 3D models can faithfully reproduce physical-world objects such as terrain, buildings and bridges, with high precision, true scale and high realism.
  • However, a 3D model is the product of a particular period of time and is static data, so it cannot reflect the current, up-to-date situation.
  • Real-scene 3D GIS is therefore increasingly connected to surveillance video and other Internet of Things data to meet business needs in fields such as security and transportation.
  • Methods of combining video with 3D scenes generally fall into two categories: displaying the video in a pop-up window, and fusing the video into the 3D scene. The latter is also called video fusion; users can understand the surrounding scene while watching the video, with high fidelity to reality, intuitiveness, close agreement between the video position and the real position, and easy comprehension.
  • At present, the fusion of video with 3D scenes mostly uses either manual operation or automatic mapping.
  • Manual operation requires manually aligning the video image with the 3D scene and restoring the camera information by adjusting multiple parameter values such as the camera position, orientation and depression angle.
  • This approach is inefficient and inaccurate.
  • Automatic mapping computes the projection matrix from the camera intrinsic and extrinsic parameters to achieve an accurate mapping between the video and the scene.
  • The key step in fusing video with a 3D scene is camera calibration, i.e., estimating the camera's intrinsic and extrinsic parameters.
  • The first approach (least-squares minimization of the reprojection error) has the limitation that suitable feature points are difficult to select; the second is to estimate the camera intrinsics in advance with a calibration instrument and then select at least three feature points from the current scene to estimate the camera extrinsics.
  • That second approach adds user interaction in obtaining both the intrinsic and extrinsic parameters, and the camera attitude also changes in the operating environment, especially for PTZ cameras.
  • When feature points are few, the extrinsic estimates of both approaches are strongly affected by noise: even if the reprojection error is small, the recovered camera position may still be biased.
  • In an embodiment, the method for fusing video with a city information model three-dimensional scene comprises the following steps:
  • (m_i, n_i) are the real camera-space coordinates of the feature points
  • (m'_i, n'_i) are the computed camera-space coordinates
  • k is the number of feature points
  • Step S8: judge whether the fitness function exceeds the set threshold; if it does, execute step S9; if not, the currently matched camera parameters are the global optimal solution, output the global optimal solution as the matching result, and obtain the camera position, the actual imaging point and the camera field of view to fuse the video scene with the 3D scene;
  • In an embodiment, the system for fusing video with a city information model three-dimensional scene comprises the following modules:
  • a feature point file generation module, used to calibrate spatial feature points and generate a feature point file from the video image and its coordinate file and from the 3D scene view and its coordinate file;
  • a parameter setting module, used to set the camera parameter update speed, the update direction and the number of algorithm iterations;
  • a calculation module, used to compute the camera projection matrix and the fitness function, and to compute the camera-space coordinates from the camera projection matrix and the 3D coordinates of the feature points;
  • the fitness function is defined as the average error between the real camera-space coordinates of the feature points and the solved camera-space coordinates;
  • a fitness judgment module, used to judge whether the fitness function exceeds the set threshold; if it does, the iteration completion judgment module is started; if not, the currently matched camera parameters are the global optimal solution, which is output as the matching result to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene;
  • an iteration completion judgment module, used to judge whether all parameters have completed the current iteration; if so, the solution space of all parameters is generated from the results of this iteration, and the n solutions with the smallest fitness values are selected as candidate optimal solutions, serving as the search base points for the next iteration; otherwise control returns to the parameter update module;
  • an iteration count judgment module, used to judge whether the number of algorithm iterations set by the parameter setting module has been reached; if so, the candidate optimal solution with the smallest fitness value is selected and output as the optimal solution to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene; otherwise the speed value V_n is updated and control returns to the parameter update module.
  • The storage medium of the invention has computer instructions stored thereon; when the computer instructions are executed by a processor, the steps of the method of the invention for fusing video with a city information model three-dimensional scene are carried out.
  • The present invention provides a heuristic-algorithm-based method, system and storage medium for fusing video with a 3D scene. It does not treat the computation of camera intrinsic and extrinsic parameters as an essential step; instead, it obtains the projection matrix directly from parameters such as the camera position and the observation point, and then uses a heuristic algorithm to search the parameters dynamically and find the minimum camera projection error, which lowers the difficulty and accuracy requirements of camera calibration and computation and improves the robustness of camera position matching.
  • The present invention supports adaptive search of camera parameters over multiple feature points and achieves accurate matching of video and 3D scene with fewer camera parameters. At the same time, it solves the problem that accurate camera coordinates cannot be obtained, and the coordinate error of the algorithm converges to the optimal solution faster, so camera parameters are matched automatically and the efficiency of camera parameter matching is improved.
  • FIG. 2 is a schematic diagram of the viewing frustum involved in an embodiment of the present invention.
  • The invention discloses a heuristic-algorithm-based method for fusing video with a city information model 3D scene, and mainly solves the key and difficult problem of intelligent matching of camera parameters.
  • The present invention proposes an improved heuristic algorithm that supports adaptive search of camera parameters over multiple feature points and achieves accurate matching of video and 3D scene with fewer camera parameters.
  • The present embodiment of the heuristic-algorithm-based method for fusing video with a city information model 3D scene adopts the following technical means: 1) generating a feature point file based on the consistency of the 2D and 3D scene objects; 2) selecting the initial viewing-frustum parameters and calibrating the camera to the greatest extent with the fewest parameters; 3) automatically adapting the camera parameters according to the error values of the coordinate points; and 4) selecting the optimal parameters using the coordinate-point error values and the iteration results.
  • Each camera parameter can be estimated roughly from the camera situation, and the algorithm is not affected by the accuracy of the estimate. Specifically, the method mainly includes the following steps:
  • Spatial feature points are calibrated from the video image and its coordinate file and from the 3D scene view and its coordinate file, and a feature point file is generated.
  • The feature point file contains the pixel coordinates of each feature point and the corresponding three-dimensional space coordinates.
  • Each group of corresponding points is a group of feature points, and feature point matching pairs the corresponding points together.
  • For example, if the pixel coordinates of feature point A are (m, n)
  • and its three-dimensional space coordinates are (x, y, z),
  • then (m, n) and (x, y, z) form one matching result.
  • The corresponding points, i.e., the feature points, can be calibrated either manually or automatically.
  • A 3D scene built from oblique photography data agrees closely with the video scene, so its feature points can be annotated automatically by machine learning, whereas a manually modelled scene needs manual calibration.
  • Table 1 shows an example of the annotated feature points.
  • The automatic annotation uses an image feature point matching algorithm, namely the SIFT algorithm: obtain the images to be matched and the corresponding coordinate files, the images to be matched comprising the video frame and the corresponding 3D scene view (the 2D coordinates of the video image can be read directly from the picture, and the coordinates of the 3D scene can be obtained directly from the 3D system); extract feature points; describe the feature points to obtain feature point descriptors; match the feature points; and output the feature point file.
  • For manual labelling, several rules should be followed: adjust the 3D scene to the same viewing angle as the surveillance video as far as possible, and scale the 3D scene so that it is consistent with the surveillance video; the points selected in the video must correspond one-to-one to positions in the 3D scene, and the selected feature points should be as stable and recognizable as possible; the selected points should, as far as possible, cover the four directions (up, down, left, right) and the centre of the video, and no fewer than 4 points should be selected.
  • The viewing frustum is a cone-shaped 3D volume with the camera position O as its origin, composed of the line-of-sight direction OB (the centreline of the frustum), the field of view fov (i.e., a camera intrinsic parameter), the far plane (FAR PLANE) and the near plane (NEAR PLANE); objects located between the far and near planes are visible and are imaged on the near plane, as shown in FIG. 2.
  • The key parameters that determine the viewing frustum are the camera position O(O_x, O_y, O_z), the line-of-sight direction OB and the vertical field of view fov, where the line-of-sight direction OB intersects the 3D scene at a point C(C_x, C_y, C_z), so OC can be used in place of OB. The initial camera parameters can therefore be represented by the initial viewing-frustum parameters:
  • C_x, C_y, C_z are the coordinates of the intersection point C of the line of sight OB with the 3D scene; O_x, O_y, O_z are the coordinates of the camera position O; fov is the vertical field of view of the camera.
  • The movement speed of each parameter is set to V and the movement direction to s.
  • The speed is the step size of each move of each parameter, and the movement direction defines the interval within which the parameter can move.
  • The far-plane height is FarPlaneHeight = 2 * far * tan(fov/2).
  • far is the distance from the far plane to point O
  • near is the distance from the near plane to point O
  • NearPlaneWidth is the width of the near plane
  • FarPlaneWidth is the width of the far plane
  • (m_i, n_i) are the real camera-space coordinates of the feature points
  • (m'_i, n'_i) are the computed camera-space coordinates
  • k is the number of feature points.
  • Step S8: judge whether the fitness function exceeds the set threshold. If it does, execute step S9; if not, the currently matched camera parameters are the global optimal solution, the global optimal solution is output as the matching result, the algorithm ends, and the camera position, the actual imaging point and the camera field of view are obtained, fusing the video scene with the 3D scene.
  • The value of the fitness function in steps S7 and S8 (the fitness value for short) is an index of the error between the current camera parameters and the real camera parameters: the smaller the error, the better the parameter matching, and vice versa.
  • Step S9: judge whether all parameters have completed the current iteration; if so, execute step S10, otherwise execute steps S4 to S8.
  • The purpose of step S10 is to generate the solution space of all parameters through the current round of iteration and to select from it the groups of parameters most likely to be the optimal solution, as the search base points for the next round of parameter iteration.
  • The candidate optimal solutions can be selected on the principle that the smallest fitness value gives the best parameter matching, because the graph of the fitness function is U-shaped with a single minimum of 0; the smaller the fitness value, the smaller the error of the matched parameters.
  • Embodiment 2 is based on the same inventive concept as Embodiment 1 and proposes a heuristic-algorithm-based system for fusing video with a city information model 3D scene.
  • The system comprises the following modules:
  • a feature point file generation module, used to implement step S1: calibrate spatial feature points from the video image and its coordinate file and from the 3D scene view and its coordinate file, and generate a feature point file;
  • an initialization module, used to implement step S2: initialize the viewing frustum and the camera parameters;
  • a parameter setting module, used to implement step S3: set the camera parameter update speed, the update direction and the number of algorithm iterations;
  • a calculation module, used to implement steps S5-S7: compute the camera projection matrix and the fitness function, and compute the camera-space coordinates from the camera projection matrix and the 3D coordinates of the feature points;
  • the fitness function is defined as the average error between the real camera-space coordinates of the feature points and the solved camera-space coordinates:
  • (m_i, n_i) are the real camera-space coordinates of the feature points
  • (m'_i, n'_i) are the computed camera-space coordinates
  • k is the number of feature points
  • a fitness judgment module, used to implement step S8: judge whether the fitness function exceeds the set threshold; if it does, the iteration completion judgment module is started; if not, the currently matched camera parameters are the global optimal solution, which is output as the matching result to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene;
  • an iteration completion judgment module, used to implement steps S9-S10: judge whether all parameters have completed the current iteration; if so, generate the solution space of all parameters from the results of the current iteration, and select the n solutions with the smallest fitness values as candidate optimal solutions, serving as the search base points for the next iteration; otherwise return to the parameter update module;
  • an iteration count judgment module, used to implement step S11: judge whether the number of algorithm iterations set by the parameter setting module has been reached; if so, select the candidate optimal solution with the smallest fitness value and output it as the optimal solution to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene; otherwise update the speed value V_n and return to the parameter update module.
  • Embodiment 3 is based on the same inventive concept as Embodiment 1 and proposes a corresponding storage medium on which computer instructions are stored; when the computer instructions are executed by a processor, the steps of the 3D scene fusion method of Embodiment 1 are carried out.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Architecture (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the field of cartography, and provides a heuristic-algorithm-based method, system and storage medium for fusing video with a three-dimensional city information model scene. The method comprises: generating a feature point file from a video image and a 3D scene view; initializing the viewing frustum and camera parameters; setting the camera parameter update speed, update direction and number of algorithm iterations, and updating the camera parameters; computing the camera projection matrix, camera-space coordinates and fitness function; when the fitness function does not exceed a set threshold, taking the currently matched camera parameters as the global optimal solution, and otherwise generating the solution space of all parameters from the current iteration and selecting the n solutions with the smallest fitness values as candidate optimal solutions, which serve as the search base points for the next iteration; and, once the set number of iterations is reached, outputting the candidate optimal solution with the smallest fitness value as the optimal solution. The invention improves the robustness of camera position matching and achieves intelligent matching of camera parameters.

Description

Method, system and storage medium for fusing video with a city information model three-dimensional scene
Technical Field
The present invention relates to the field of cartography, and in particular to a heuristic-algorithm-based method, system and storage medium for fusing video with a three-dimensional city information model (CIM) scene.
Background Art
In the field of real-scene 3D GIS, scenes based on 3D models can faithfully reproduce physical-world objects such as terrain, buildings and bridges, with high precision, true scale and high realism. However, a 3D model is the product of a particular period of time and is static data; it cannot reflect the current, up-to-date state of the world. To address this, real-scene 3D GIS is increasingly connected to surveillance video and other Internet of Things data to meet business needs in fields such as security and transportation. Methods of combining video with 3D scenes generally fall into two categories: displaying the video in a pop-up window, and fusing the video into the 3D scene. The latter, also called video fusion, lets users understand the surrounding scene while watching the video; it offers high fidelity to reality, intuitiveness, close agreement between the video position and the real position, and easy comprehension.
At present, the fusion of video with 3D scenes mostly uses either manual operation or automatic mapping. Manual operation requires manually aligning the video image with the 3D scene and restoring the camera information by adjusting multiple parameter values such as the camera position, orientation and depression angle; this approach is inefficient and inaccurate. Automatic mapping computes a projection matrix from the camera intrinsic and extrinsic parameters to achieve an accurate mapping between the video and the scene. The key step in fusing video with a 3D scene is camera calibration, i.e., estimating the camera's intrinsic and extrinsic parameters. There are generally two approaches. The first uses least squares to minimize the reprojection error and compute the camera projection matrix; it requires many feature points (at least 6 pairs, since the 3x4 projection matrix has 11 degrees of freedom and each point pair supplies two equations), and because the camera's field of view is limited and the model sometimes lacks 3D detail, suitable feature points are difficult to select. The second estimates the camera intrinsics in advance with a calibration instrument and then selects at least three feature points from the current scene to estimate the extrinsics; this approach adds user interaction to both the intrinsic and extrinsic estimation, and the camera attitude can also change in the operating environment, especially for PTZ cameras. Moreover, when feature points are few, the extrinsic estimates of both approaches are strongly affected by noise: even if the reprojection error is small, the recovered camera position may still be biased.
Summary of the Invention
To solve the problems of the prior art, the present invention proposes a heuristic-algorithm-based method, system and storage medium for fusing video with a city information model 3D scene. The invention does not treat the computation of camera intrinsic and extrinsic parameters as an essential step; instead, it obtains the projection matrix directly from parameters such as the camera position and the observation point, and then uses a heuristic algorithm to search the parameters dynamically and find the minimum camera projection error. This reduces the difficulty of camera calibration and computation, improves the robustness of camera position matching, improves the efficiency of camera parameter estimation, and achieves intelligent matching of camera parameters.
In an embodiment of the present invention, the method for fusing video with a city information model three-dimensional scene comprises the following steps:
S1: calibrate spatial feature points from the video image and its coordinate file and from the 3D scene view and its coordinate file, and generate a feature point file;
S2: initialize the viewing frustum and the camera parameters;
S3: set the camera parameter update speed, the update direction and the number of algorithm iterations;
S4: update the camera parameters according to the update speed and direction of each parameter, P' = P + V_n * s, where P' is the updated camera parameter, P is the camera parameter before the update, V_n is the updated speed value, and s is the movement direction of the parameter;
S5: compute the camera projection matrix;
S6: compute the camera-space coordinates from the camera projection matrix and the 3D coordinates of the feature points:
position' = Position_3D * ProjectionMatrix
where Position_3D is the three-dimensional coordinate of a feature point, giving the camera-space coordinates position' = (m'_i, n'_i), i = 1, 2, 3, ..., k;
S7: compute the fitness function, defined as the average error between the real camera-space coordinates of the feature points and the solved camera-space coordinates:
fitness = [formula image in the original: the average error, over the k feature points, between (m_i, n_i) and (m'_i, n'_i)]
where (m_i, n_i) are the real camera-space coordinates of the feature points, (m'_i, n'_i) are the computed camera-space coordinates, and k is the number of feature points;
S8: judge whether the fitness function exceeds the set threshold; if it does, execute step S9; if not, the currently matched camera parameters are the global optimal solution, output the global optimal solution as the matching result to obtain the camera position, the actual imaging point and the camera field of view, and fuse the video scene with the 3D scene;
S9: judge whether all parameters have completed the current iteration; if so, execute step S10, otherwise execute steps S4 to S8;
S10: generate the solution space of all parameters from the results of the current iteration, and select the n solutions with the smallest fitness values as candidate optimal solutions, which serve as the search base points for the next iteration;
S11: judge whether the set number of algorithm iterations iters has been reached; if so, select the candidate optimal solution with the smallest fitness value as the optimal solution output to obtain the camera position, the actual imaging point and the camera field of view, and fuse the video scene with the 3D scene; otherwise update the speed value V_n and execute steps S4 to S10 again.
In an embodiment of the present invention, the system for fusing video with a city information model three-dimensional scene comprises the following modules:
a feature point file generation module, configured to calibrate spatial feature points from the video image and its coordinate file and from the 3D scene view and its coordinate file, and to generate a feature point file;
an initialization module, configured to initialize the viewing frustum and the camera parameters;
a parameter setting module, configured to set the camera parameter update speed, the update direction and the number of algorithm iterations;
a parameter update module, configured to update the camera parameters according to the update speed and direction of each parameter, P' = P + V_n * s, where P' is the updated camera parameter, P is the camera parameter before the update, V_n is the updated speed value, and s is the movement direction of the parameter;
a calculation module, configured to compute the camera projection matrix and the fitness function, and to compute the camera-space coordinates from the camera projection matrix and the 3D coordinates of the feature points:
position' = Position_3D * ProjectionMatrix
where Position_3D is the three-dimensional coordinate of a feature point, giving the camera-space coordinates position' = (m'_i, n'_i), i = 1, 2, 3, ..., k;
the fitness function being defined as the average error between the real camera-space coordinates of the feature points and the solved camera-space coordinates:
fitness = [formula image in the original: the average error, over the k feature points, between (m_i, n_i) and (m'_i, n'_i)]
where (m_i, n_i) are the real camera-space coordinates of the feature points, (m'_i, n'_i) are the computed camera-space coordinates, and k is the number of feature points;
a fitness judgment module, configured to judge whether the fitness function exceeds the set threshold; if it does, the iteration completion judgment module is started; if not, the currently matched camera parameters are the global optimal solution, which is output as the matching result to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene;
an iteration completion judgment module, configured to judge whether all parameters have completed the current iteration; if so, the solution space of all parameters is generated from the results of the current iteration, and the n solutions with the smallest fitness values are selected as candidate optimal solutions, which serve as the search base points for the next iteration; otherwise control returns to the parameter update module;
an iteration count judgment module, configured to judge whether the number of algorithm iterations set by the parameter setting module has been reached; if so, the candidate optimal solution with the smallest fitness value is selected and output as the optimal solution to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene; otherwise the speed value V_n is updated and control returns to the parameter update module.
The storage medium of the present invention has computer instructions stored thereon; when the computer instructions are executed by a processor, the steps of the method of the present invention for fusing video with a city information model three-dimensional scene are carried out.
Compared with the prior art, the present invention achieves the following beneficial effects:
1. The present invention provides a heuristic-algorithm-based method, system and storage medium for fusing video with a 3D scene. It does not treat the computation of camera intrinsic and extrinsic parameters as an essential step; instead, it obtains the projection matrix directly from parameters such as the camera position and the observation point, and then uses a heuristic algorithm to search the parameters dynamically and find the minimum camera projection error. This lowers the difficulty and accuracy requirements of camera calibration and computation and improves the robustness of camera position matching.
2. The present invention supports adaptive search of camera parameters over multiple feature points and achieves accurate matching of video and 3D scene with relatively few camera parameters. It also solves the problem that accurate camera coordinates cannot be obtained, and the coordinate error of the algorithm converges to the optimal solution faster, so camera parameters are matched automatically and the efficiency of camera parameter matching is improved.
3. In addition, the user usually knows the approximate position of the camera and of the observation centre point, which can be used as the initial camera position; this improves the efficiency of camera parameter estimation to a certain extent.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of automatic feature point matching in the 3D scene fusion method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the viewing-frustum imaging principle involved in an embodiment of the present invention;
FIG. 3 is a schematic flowchart of intelligent camera parameter matching in the 3D scene fusion method according to an embodiment of the present invention.
Detailed Description of the Embodiments
The present invention discloses a heuristic-algorithm-based method for fusing video with a city information model 3D scene, and mainly solves the key and difficult problem of intelligent matching of camera parameters. In terms of implementation, the present invention proposes an improved heuristic algorithm that supports adaptive search of camera parameters over multiple feature points and achieves accurate matching of video and 3D scene with relatively few camera parameters.
The technical solution of the present invention is described in further detail below with reference to the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.
Embodiment 1
Referring to FIGS. 1-3, the method of this embodiment for fusing video with a city information model 3D scene based on a heuristic algorithm adopts the following technical means: 1) generating a feature point file based on the consistency of the 2D and 3D scene objects; 2) selecting the initial viewing-frustum parameters so that the camera is calibrated to the greatest extent with the fewest parameters; 3) automatically adapting the camera parameters according to the error values of the coordinate points; and 4) selecting the optimal parameters using the coordinate-point error values and the iteration results. Each camera parameter can be estimated roughly from the camera situation, and the algorithm is not affected by the accuracy of the estimate. Specifically, the method mainly comprises the following steps:
S1: Calibrate spatial feature points from the video image and its coordinate file and from the 3D scene view and its coordinate file, and generate a feature point file.
In this step, corresponding points (points with the same name) in the video image and in the 3D scene view are annotated. Each pair of corresponding points is a group of feature points, comprising the coordinate in the 3D scene, Position = (X_i, Y_i, Z_i), and the 2D image coordinate in the video, position = (m_i, n_i), where i = 1, 2, 3, ..., k; all feature point pairs together form a feature point file. Specifically, according to the video image and its coordinate file and the 3D scene view and its coordinate file, a pair of corresponding points in the video image and the 3D scene view is annotated and extracted as a group of feature points, the feature points are described and matched, and all extracted feature point pairs are assembled into the feature point file.
The specific workflow of feature point matching is as follows (a file-format sketch is given after this list):
(1) Obtain the pixel coordinates of the annotated points in the video. To do so, a frame can be captured from the video and cropped in Photoshop, and the pixel coordinates of the corresponding points, i.e., the pixel coordinates of the feature points, can be read with Photoshop.
(2) Obtain the three-dimensional space coordinates (X, Y, Z) corresponding to each feature point. These coordinates can be picked directly from the 3D information platform.
(3) Generate the feature point file. The feature point file contains the pixel coordinates of each feature point and the corresponding 3D space coordinates.
(4) Feature point matching. Each group of corresponding points is a group of feature points, and feature point matching pairs the corresponding points together. For example, if the pixel coordinates of feature point A are (m, n) and its 3D space coordinates are (x, y, z), then (m, n) and (x, y, z) form one matching result; there are as many matching results as there are groups of feature points.
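A minimal sketch of what such a feature point file might contain, assuming a JSON-style layout; the patent does not prescribe a concrete file format, and all coordinate values below are purely illustrative.

```python
# Hypothetical feature point file content: each entry pairs an annotated video pixel
# (m, n) with its corresponding 3D scene coordinate (X, Y, Z).
feature_points = [
    {"pixel": [512.0, 304.0], "xyz": [435.2, 128.7, 2.1]},  # feature point A (illustrative values)
    {"pixel": [618.0, 275.0], "xyz": [442.9, 131.3, 5.8]},
    {"pixel": [155.0, 420.0], "xyz": [401.4, 119.0, 1.6]},
    {"pixel": [820.0, 388.0], "xyz": [468.8, 140.2, 3.3]},  # at least four points are recommended
]
```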
The corresponding points, i.e., the feature points, can be calibrated either manually or automatically. A 3D scene built from oblique photography data agrees closely with the video scene, so its feature points can be annotated automatically by machine learning, whereas a manually modelled scene needs manual calibration. Table 1 gives an example of annotated feature points:
Table 1: Feature points
(The table data appear as an image in the original document and are not reproduced here.)
In this step, the automatic annotation of feature points between the oblique-photography scene and the video image uses an image feature point matching algorithm, namely the SIFT algorithm. The main procedure is as follows: obtain the images to be matched and the corresponding coordinate files, where the images to be matched comprise the video frame and the corresponding 3D scene view (the 2D coordinates of the video image can be read directly from the picture, and the coordinates of the 3D scene can be obtained directly from the 3D system); extract feature points; describe the feature points and obtain feature point descriptors; match the feature points; and output the feature point file.
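The following is a minimal sketch of this SIFT-based matching step using OpenCV. The file paths, the `scene_xyz_at` lookup that maps a scene-view pixel to its 3D coordinate, the ratio-test threshold and the output format are all illustrative assumptions rather than the patent's actual implementation.

```python
import json
import cv2  # OpenCV provides the SIFT detector used for automatic annotation

def build_feature_point_file(video_frame_path, scene_view_path, scene_xyz_at, out_path):
    """Match SIFT keypoints between a video frame and a rendered 3D-scene view,
    then pair each matched video pixel with its 3D scene coordinate."""
    img_video = cv2.imread(video_frame_path, cv2.IMREAD_GRAYSCALE)
    img_scene = cv2.imread(scene_view_path, cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    kp_v, des_v = sift.detectAndCompute(img_video, None)   # extract + describe (video)
    kp_s, des_s = sift.detectAndCompute(img_scene, None)   # extract + describe (scene view)

    # Descriptor matching with Lowe's ratio test to keep only distinctive matches
    matcher = cv2.BFMatcher()
    matches = matcher.knnMatch(des_v, des_s, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    pairs = []
    for m in good:
        mx, my = kp_v[m.queryIdx].pt          # pixel coordinates in the video frame
        sx, sy = kp_s[m.trainIdx].pt          # pixel coordinates in the scene view
        X, Y, Z = scene_xyz_at(sx, sy)        # assumed lookup: scene pixel -> 3D coordinate
        pairs.append({"pixel": [mx, my], "xyz": [X, Y, Z]})

    with open(out_path, "w") as f:            # output the feature point file
        json.dump(pairs, f, indent=2)
    return pairs
```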
For 3D scenes built by manual modelling, annotation against the video image is done manually, and several rules should be followed: adjust the 3D scene to the same viewing angle as the surveillance video as far as possible, and scale the 3D scene so that it is consistent with the surveillance video; the points selected in the video must correspond one-to-one to positions in the 3D scene, and the selected feature points should be as stable and recognizable as possible; the selected points should, as far as possible, cover the four directions (up, down, left, right) and the centre of the video, and no fewer than 4 points should be selected.
S2: Initialize the viewing frustum and the camera parameters.
The viewing frustum is a cone-shaped 3D volume with the camera position O as its origin, composed of the line-of-sight direction OB (the centreline of the frustum), the field of view fov (i.e., a camera intrinsic parameter), the far plane (FAR PLANE) and the near plane (NEAR PLANE). Objects located between the far and near planes are visible and are imaged on the near plane, as shown in FIG. 2.
The key parameters that determine the viewing frustum are the camera position O(O_x, O_y, O_z), the line-of-sight direction OB and the vertical field of view fov, where the line-of-sight direction OB intersects the 3D scene at a point C(C_x, C_y, C_z), so OC can be used in place of OB. The initial camera parameters can therefore be represented by the initial viewing-frustum parameters:
P = (C_x, C_y, C_z, O_x, O_y, O_z, fov)
where C_x, C_y, C_z are the coordinates of the intersection point C of the line of sight OB with the 3D scene, O_x, O_y, O_z are the coordinates of the camera position O, and fov is the vertical field of view of the camera.
In this embodiment the initialized viewing frustum has only these 7 parameters; the remaining parameters can be set directly. To let the camera observe as large a space as possible, the distance OA in FIG. 2 can be set as small as possible and the distance OB as large as possible. The initialized key camera parameters are sufficient to determine the viewing frustum, but the parameters are not limited to these and can be extended according to the actual situation.
S3: Set the camera parameter update speed, the update direction and the number of algorithm iterations.
In this embodiment, the movement speed of each parameter is set to V and the movement direction to s; the speed is the step size of each move of each parameter, and the movement direction defines the interval within which the parameter can move. The movement speed is expressed as
V = (V_Cx, V_Cy, V_Cz, V_Ox, V_Oy, V_Oz, V_fov)
and the updated speed is V_n = V_1 + (n-1)Δv, where n is the number of algorithm iterations and Δv is the magnitude of the speed update. The movement direction is s = [-a, a], with a an integer and s = -a, -a+1, -a+2, ..., a-2, a-1, a (excluding 0), giving 2a values in total; the search neighbourhood of each parameter is then [-(V*a), V*a]. The number of algorithm iterations is set to iters, and each group of candidate parameters can produce (2a)^7 groups of solutions per iteration.
In this embodiment, the initial parameter-update speed is V_1 = (20_Cx, 20_Cy, 20_Cz, 20_Ox, 20_Oy, 20_Oz, 10_fov); the speed increment Δv is given by a formula image in the original document; every parameter has the same movement direction s = [-5, 5], i.e., each parameter can take 5 values to the left and 5 values to the right of the candidate solution as neighbourhood values, 10 neighbourhood values in total; the number of algorithm iterations is iters = 50; and the error threshold is δ = 0.0001.
In this embodiment, each parameter can generate many new candidate parameters by moving. The movement direction s and movement speed V of a parameter determine its search neighbourhood, the movement range of each parameter needs to be set according to the actual situation, and the movement speed and direction also directly affect the convergence efficiency of the algorithm. A sketch of this neighbourhood expansion follows.
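A minimal sketch of this candidate-generation step, assuming that steps of 0 are excluded (consistent with the 2a step values and 10 neighbourhood values stated above) and that the seven parameters step independently so that one base point expands to (2a)^7 candidates. Function and variable names are illustrative.

```python
import itertools
import numpy as np

def neighbourhood(P, V_n, a=5):
    """Yield candidate parameter vectors P' = P + V_n * s, where each of the seven
    frustum/camera parameters independently takes a step s in {-a, ..., -1, 1, ..., a}.
    For a = 5 this produces (2a)^7 candidates per base point, as described above."""
    steps = [s for s in range(-a, a + 1) if s != 0]         # the 2a admissible step values
    for combo in itertools.product(steps, repeat=len(P)):   # one independent step per parameter
        yield P + V_n * np.array(combo, dtype=float)
```

Enumerating all (2a)^7 = 10,000,000 candidates per base point is expensive; an implementation might sample this neighbourhood instead, but that choice is not specified in the patent.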
S4: Update the camera parameters. According to the update speed and direction of each camera parameter, update the camera parameters as P' = P + V_n * s, where P' is the updated camera parameter and P is the camera parameter before the update.
S5: Compute the camera projection matrix ProjectionMatrix.
According to the principle of 3D imaging, the coordinates of an object in camera space equal its world coordinates multiplied by the camera's projection matrix. The projection matrix is closely tied to the viewing frustum: every update of the camera parameters produces a new viewing frustum, so the projection matrix changes as the camera parameters change. The formula used in this embodiment to compute the camera projection matrix is:
ProjectionMatrix = [formula image in the original: the perspective projection matrix expressed in terms of fov, Aspect, near and far]
where Aspect is the aspect ratio of the camera, taking the value
Aspect = NearPlaneWidth / NearPlaneHeight
or
Aspect = FarPlaneWidth / FarPlaneHeight
where the near-plane height is
NearPlaneHeight = 2 * near * tan(fov/2)
and the far-plane height is
FarPlaneHeight = 2 * far * tan(fov/2)
and far is the distance from the far plane to point O, near is the distance from the near plane to point O, NearPlaneWidth is the width of the near plane, and FarPlaneWidth is the width of the far plane.
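Since the explicit matrix appears only as a formula image in the original, the sketch below builds a standard symmetric perspective projection matrix from the same quantities (vertical fov, Aspect, near, far). This OpenGL-style form is a common realization of such a matrix and is offered as an assumption; the patent's exact element layout and sign conventions may differ.

```python
import numpy as np

def perspective_matrix(fov_deg, aspect, near, far):
    """Standard symmetric perspective projection matrix (OpenGL-style, column-vector
    convention) built from the vertical field of view, the aspect ratio and the
    near/far plane distances. A plausible stand-in for the patent's formula image,
    not its exact matrix."""
    f = 1.0 / np.tan(np.radians(fov_deg) / 2.0)  # cot(fov/2)
    return np.array([
        [f / aspect, 0.0,  0.0,                          0.0],
        [0.0,        f,    0.0,                          0.0],
        [0.0,        0.0,  (far + near) / (near - far),  2.0 * far * near / (near - far)],
        [0.0,        0.0, -1.0,                          0.0],
    ])
```

Because the matrix depends only on the current frustum, a function of this shape would be re-evaluated for every updated candidate P', as the paragraph above describes.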
S6: Compute the objective function from the camera projection matrix and the 3D coordinates of the feature points; the objective function is the camera-space coordinates:
position' = Position_3D * ProjectionMatrix
where Position_3D is the three-dimensional coordinate of a feature point, giving the camera-space coordinates position' = (m'_i, n'_i), i = 1, 2, 3, ..., k.
S7: Compute the fitness function.
In this embodiment the fitness function is defined as the average error between the real camera-space coordinates of the feature points and the solved camera-space coordinates:
fitness = [formula image in the original: the average error, over the k feature points, between (m_i, n_i) and (m'_i, n'_i)]
where (m_i, n_i) are the real camera-space coordinates of the feature points, (m'_i, n'_i) are the computed camera-space coordinates, and k is the number of feature points.
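A minimal sketch of steps S6-S7. Since the fitness formula is only described as an "average error", the mean Euclidean pixel distance is used here as one natural realization; the perspective divide, the viewport mapping and the use of a combined view-projection matrix are likewise assumptions about details the text leaves open.

```python
import numpy as np

def project_to_pixels(points_xyz, view_proj, width, height):
    """Project 3D feature points with a combined view-projection matrix, then map
    them to pixel coordinates (perspective divide + viewport mapping)."""
    pts = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])  # homogeneous coordinates
    clip = pts @ view_proj.T
    ndc = clip[:, :2] / clip[:, 3:4]                              # perspective divide
    m = (ndc[:, 0] + 1.0) * 0.5 * width                           # NDC x -> pixel column
    n = (1.0 - ndc[:, 1]) * 0.5 * height                          # NDC y -> pixel row
    return np.stack([m, n], axis=1)

def fitness(real_mn, solved_mn):
    """Average per-point error between the annotated (real) and re-projected feature
    points; smaller is better, and 0 means a perfect match."""
    diff = np.asarray(real_mn, dtype=float) - np.asarray(solved_mn, dtype=float)
    return float(np.mean(np.linalg.norm(diff, axis=1)))
```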
S8: Judge whether the fitness function exceeds the set threshold. If it does, execute step S9; if not, the currently matched camera parameters are the global optimal solution, the global optimal solution is output as the matching result, and the algorithm ends, yielding the camera position, the actual imaging point and the camera field of view and fusing the video scene with the 3D scene.
The value of the fitness function in steps S7 and S8 (the fitness value for short) is an index of the error between the current camera parameters and the real camera parameters: the smaller the error, the better the parameter matching, and vice versa.
S9: Judge whether all parameters have completed the current iteration; if so, execute step S10, otherwise execute steps S4 to S8.
S10: Generate the solution space of all parameters from the results of the current iteration, and select the n solutions with the smallest fitness values as candidate optimal solutions, which serve as the search base points for the next iteration.
The purpose of step S10 is to generate the solution space of all parameters in the current round of iteration and to select from it the groups of parameters most likely to be the optimal solution, as the search base points for the next round of parameter iteration. Candidate optimal solutions can be selected on the principle that the smallest fitness value gives the best parameter matching: the graph of the fitness function is U-shaped with a single minimum of 0, so the smaller the fitness value, the smaller the error of the matched parameters. To reduce the number of algorithm iterations and enlarge the search region, the optimal parameter candidates can be selected by sorting the fitness values, taking the groups of parameters with the smallest fitness values as candidate solutions, updating the movement speed and direction of the parameters, and searching the solution space of each group of candidate solutions.
S11: Judge whether the set number of algorithm iterations iters has been reached; if so, select the candidate optimal solution with the smallest fitness value as the optimal solution output, and the algorithm ends, yielding the camera position, the actual imaging point and the camera field of view and fusing the video scene with the 3D scene; otherwise update the speed value V_n and execute steps S4 to S10 again. A sketch of this whole search loop is given below.
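Putting steps S4-S11 together, the sketch below reuses the `neighbourhood` and `fitness` helpers from the earlier sketches; `project_fn(P, xyz)` is an assumed callback that builds the frustum defined by P and returns the projected pixel coordinates of the feature points. The early-exit threshold, the linear speed schedule and keeping `n_keep` candidates follow the description above; everything else, including the exhaustive neighbourhood enumeration (which is very large for a = 5), is illustrative.

```python
import numpy as np

def heuristic_search(feature_xyz, feature_mn, P0, V1, dV, project_fn,
                     a=5, iters=50, n_keep=5, threshold=1e-4):
    """Heuristic camera-parameter search (steps S4-S11): expand each candidate's
    neighbourhood, score every candidate with the fitness function, keep the n_keep
    best solutions as the next base points, and stop early once the fitness falls
    below the threshold."""
    candidates = [np.asarray(P0, dtype=float)]
    best_P, best_fit = candidates[0], float("inf")

    for n in range(1, iters + 1):
        V_n = np.asarray(V1, dtype=float) + (n - 1) * np.asarray(dV, dtype=float)  # speed update
        scored = []
        for base in candidates:
            for P in neighbourhood(base, V_n, a):                      # S4: P' = P + V_n * s
                fit = fitness(feature_mn, project_fn(P, feature_xyz))  # S5-S7
                scored.append((fit, P))
                if fit < best_fit:
                    best_fit, best_P = fit, P
        if best_fit <= threshold:                                      # S8: good enough, stop
            break
        scored.sort(key=lambda t: t[0])                                # S10: smallest fitness first
        candidates = [P for _, P in scored[:n_keep]]                   # base points for next round
    return best_P, best_fit                                            # S11: best candidate found
```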
Embodiment 2
Based on the same inventive concept as Embodiment 1, this embodiment proposes a heuristic-algorithm-based system for fusing video with a city information model 3D scene. The system comprises the following modules:
a feature point file generation module, configured to implement step S1: calibrate spatial feature points from the video image and its coordinate file and from the 3D scene view and its coordinate file, and generate a feature point file;
an initialization module, configured to implement step S2: initialize the viewing frustum and the camera parameters;
a parameter setting module, configured to implement step S3: set the camera parameter update speed, the update direction and the number of algorithm iterations;
a parameter update module, configured to implement step S4: update the camera parameters according to the update speed and direction of each parameter, P' = P + V_n * s, where P' is the updated camera parameter, P is the camera parameter before the update, V_n is the updated speed value, and s is the movement direction of the parameter;
a calculation module, configured to implement steps S5-S7: compute the camera projection matrix and the fitness function, and compute the camera-space coordinates from the camera projection matrix and the 3D coordinates of the feature points:
position' = Position_3D * ProjectionMatrix
where Position_3D is the three-dimensional coordinate of a feature point, giving the camera-space coordinates position' = (m'_i, n'_i), i = 1, 2, 3, ..., k;
the fitness function being defined as the average error between the real camera-space coordinates of the feature points and the solved camera-space coordinates:
fitness = [formula image in the original: the average error, over the k feature points, between (m_i, n_i) and (m'_i, n'_i)]
where (m_i, n_i) are the real camera-space coordinates of the feature points, (m'_i, n'_i) are the computed camera-space coordinates, and k is the number of feature points;
a fitness judgment module, configured to implement step S8: judge whether the fitness function exceeds the set threshold; if it does, the iteration completion judgment module is started; if not, the currently matched camera parameters are the global optimal solution, which is output as the matching result to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene;
an iteration completion judgment module, configured to implement steps S9-S10: judge whether all parameters have completed the current iteration; if so, generate the solution space of all parameters from the results of the current iteration, and select the n solutions with the smallest fitness values as candidate optimal solutions, which serve as the search base points for the next iteration; otherwise return to the parameter update module;
an iteration count judgment module, configured to implement step S11: judge whether the number of algorithm iterations set by the parameter setting module has been reached; if so, select the candidate optimal solution with the smallest fitness value and output it as the optimal solution to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene; otherwise update the speed value V_n and return to the parameter update module.
Since the implementation of the system described in this embodiment corresponds to the method of Embodiment 1, this embodiment is described only briefly; for the corresponding technical features, reference can be made to the description of the steps in Embodiment 1, which is not repeated here.
Embodiment 3
Based on the same inventive concept as Embodiment 1, this embodiment proposes a corresponding storage medium having computer instructions stored thereon; when the computer instructions are executed by a processor, the steps of the 3D scene fusion method of Embodiment 1 are carried out.
From the description of the above embodiments, those skilled in the art can clearly understand that the present application can be implemented by software plus the necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product; the computer software product can be stored in a storage medium, such as a ROM/RAM or an optical disc, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments of the present application or in parts of those embodiments.
The above embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited thereto; any change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and shall fall within the protection scope of the present invention.

Claims (10)

1. A method for fusing video with a city information model three-dimensional scene, characterized by comprising the following steps:
    S1: calibrating spatial feature points from a video image and its coordinate file and from a 3D scene view and its coordinate file, and generating a feature point file;
    S2: initializing a viewing frustum and camera parameters;
    S3: setting a camera parameter update speed, an update direction and a number of algorithm iterations;
    S4: updating the camera parameters according to the update speed and direction of each camera parameter as P' = P + V_n * s, where P' is the updated camera parameter, P is the camera parameter before the update, V_n is the updated speed value, and s is the movement direction of the parameter;
    S5: computing a camera projection matrix;
    S6: computing camera-space coordinates from the camera projection matrix and the 3D coordinates of the feature points:
    position' = Position_3D * ProjectionMatrix
    where Position_3D is the three-dimensional coordinate of a feature point, giving the camera-space coordinates position' = (m'_i, n'_i), i = 1, 2, 3, ..., k;
    S7: computing a fitness function, the fitness function being defined as the average error between the real camera-space coordinates of the feature points and the solved camera-space coordinates:
    fitness = [formula image in the original: the average error, over the k feature points, between (m_i, n_i) and (m'_i, n'_i)]
    where (m_i, n_i) are the real camera-space coordinates of the feature points, (m'_i, n'_i) are the computed camera-space coordinates, and k is the number of feature points;
    S8: judging whether the fitness function exceeds a set threshold; if it does, executing step S9; if not, the currently matched camera parameters are the global optimal solution, and the global optimal solution is output as the matching result to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene;
    S9: judging whether all parameters have completed the current iteration; if so, executing step S10, otherwise executing steps S4 to S8;
    S10: generating the solution space of all parameters from the results of the current iteration, and selecting the n solutions with the smallest fitness values as candidate optimal solutions, which serve as the search base points for the next iteration;
    S11: judging whether the set number of algorithm iterations iters has been reached; if so, selecting the candidate optimal solution with the smallest fitness value as the optimal solution output to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene; otherwise updating the speed value V_n and executing steps S4 to S10 again.
2. The method for fusing video with a city information model three-dimensional scene according to claim 1, characterized in that in step S3 the movement speed of each camera parameter is set to V and the movement direction to s, the movement speed being expressed as
    V = (V_Cx, V_Cy, V_Cz, V_Ox, V_Oy, V_Oz, V_fov)
    and the updated speed value being expressed as V_n = V_1 + (n-1)Δv, where n is the number of algorithm iterations and Δv is the magnitude of the speed update; the movement direction is s = [-a, a], with a an integer and s = -a, -a+1, -a+2, ..., 0, ..., a-2, a-1, a; the search neighbourhood of each camera parameter is then [-(V*a), V*a].
3. The method for fusing video with a city information model three-dimensional scene according to claim 1, characterized in that in step S2 the viewing frustum is a cone-shaped 3D volume with the camera position as its origin, composed of the line-of-sight direction OB, the vertical field of view fov, the far plane and the near plane; the key parameters of the viewing frustum comprise the camera position O(O_x, O_y, O_z), the line-of-sight direction OB and the vertical field of view fov, where the intersection point of the line-of-sight direction OB with the 3D scene is C(C_x, C_y, C_z), and the initial camera parameters are expressed by the initial viewing-frustum parameters as:
    P = (C_x, C_y, C_z, O_x, O_y, O_z, fov)
    where C_x, C_y, C_z are the coordinates of the intersection point C, O_x, O_y, O_z are the coordinates of the camera position O, and fov is the vertical field of view of the camera.
4. The method for fusing video with a city information model three-dimensional scene according to claim 1, characterized in that in step S5 the camera projection matrix is computed as:
    ProjectionMatrix = [formula image in the original: the perspective projection matrix expressed in terms of fov, Aspect, near and far]
    where Aspect is the aspect ratio of the camera, taking the value
    Aspect = NearPlaneWidth / NearPlaneHeight
    or
    Aspect = FarPlaneWidth / FarPlaneHeight
    where the near-plane height is
    NearPlaneHeight = 2 * near * tan(fov/2)
    and the far-plane height is
    FarPlaneHeight = 2 * far * tan(fov/2)
    and far is the distance from the far plane to point O, near is the distance from the near plane to point O, NearPlaneWidth is the width of the near plane, and FarPlaneWidth is the width of the far plane.
5. The method for fusing video with a city information model three-dimensional scene according to claim 1, characterized in that in step S10 the candidate optimal solutions are selected on the principle that the smallest value of the fitness function gives the best parameter matching.
6. The method for fusing video with a city information model three-dimensional scene according to claim 1, characterized in that in step S1 corresponding points in the video image and the 3D scene view are annotated, each pair of corresponding points being a group of feature points comprising the coordinate in the 3D scene, Position = (X_i, Y_i, Z_i), and the 2D image coordinate in the video, position = (m_i, n_i), where i = 1, 2, 3, ..., k, and all feature point pairs form a feature point file.
7. A system for fusing video with a city information model three-dimensional scene, characterized by comprising:
    a feature point file generation module, configured to calibrate spatial feature points from a video image and its coordinate file and from a 3D scene view and its coordinate file, and to generate a feature point file;
    an initialization module, configured to initialize a viewing frustum and camera parameters;
    a parameter setting module, configured to set a camera parameter update speed, an update direction and a number of algorithm iterations;
    a parameter update module, configured to update the camera parameters according to the update speed and direction of each camera parameter as P' = P + V_n * s, where P' is the updated camera parameter, P is the camera parameter before the update, V_n is the updated speed value, and s is the movement direction of the parameter;
    a calculation module, configured to compute a camera projection matrix and a fitness function, and to compute camera-space coordinates from the camera projection matrix and the 3D coordinates of the feature points:
    position' = Position_3D * ProjectionMatrix
    where Position_3D is the three-dimensional coordinate of a feature point, giving the camera-space coordinates position' = (m'_i, n'_i), i = 1, 2, 3, ..., k;
    the fitness function being defined as the average error between the real camera-space coordinates of the feature points and the solved camera-space coordinates:
    fitness = [formula image in the original: the average error, over the k feature points, between (m_i, n_i) and (m'_i, n'_i)]
    where (m_i, n_i) are the real camera-space coordinates of the feature points, (m'_i, n'_i) are the computed camera-space coordinates, and k is the number of feature points;
    a fitness judgment module, configured to judge whether the fitness function exceeds a set threshold; if it does, the iteration completion judgment module is started; if not, the currently matched camera parameters are the global optimal solution, which is output as the matching result to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene;
    an iteration completion judgment module, configured to judge whether all parameters have completed the current iteration; if so, the solution space of all parameters is generated from the results of the current iteration, and the n solutions with the smallest fitness values are selected as candidate optimal solutions, which serve as the search base points for the next iteration; otherwise control returns to the parameter update module;
    an iteration count judgment module, configured to judge whether the number of algorithm iterations set by the parameter setting module has been reached; if so, the candidate optimal solution with the smallest fitness value is selected and output as the optimal solution to obtain the camera position, the actual imaging point and the camera field of view and to fuse the video scene with the 3D scene; otherwise the speed value V_n is updated and control returns to the parameter update module.
8. The system for fusing video with a city information model three-dimensional scene according to claim 7, characterized in that, in the initialization module, the viewing frustum is a cone-shaped 3D volume with the camera position as its origin, composed of the line-of-sight direction OB, the vertical field of view fov, the far plane and the near plane; the key parameters of the viewing frustum comprise the camera position O(O_x, O_y, O_z), the line-of-sight direction OB and the vertical field of view fov, where the intersection point of the line-of-sight direction OB with the 3D scene is C(C_x, C_y, C_z), and the initial camera parameters are expressed by the initial viewing-frustum parameters as:
    P = (C_x, C_y, C_z, O_x, O_y, O_z, fov)
    where C_x, C_y, C_z are the coordinates of the intersection point C, O_x, O_y, O_z are the coordinates of the camera position O, and fov is the vertical field of view of the camera.
9. The system for fusing video with a city information model three-dimensional scene according to claim 7, characterized in that in the calculation module the camera projection matrix is computed as:
    ProjectionMatrix = [formula image in the original: the perspective projection matrix expressed in terms of fov, Aspect, near and far]
    where Aspect is the aspect ratio of the camera, taking the value
    Aspect = NearPlaneWidth / NearPlaneHeight
    or
    Aspect = FarPlaneWidth / FarPlaneHeight
    where the near-plane height is
    NearPlaneHeight = 2 * near * tan(fov/2)
    and the far-plane height is
    FarPlaneHeight = 2 * far * tan(fov/2)
    and far is the distance from the far plane to point O, near is the distance from the near plane to point O, NearPlaneWidth is the width of the near plane, and FarPlaneWidth is the width of the far plane.
10. A storage medium having computer instructions stored thereon, characterized in that, when the computer instructions are executed by a processor, the steps of the method for fusing video with a city information model three-dimensional scene according to any one of claims 1-6 are carried out.
PCT/CN2022/137042 2021-12-23 2022-12-06 视频与城市信息模型三维场景融合方法、系统及存储介质 WO2023116430A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111591333.2A CN114255285B (zh) 2021-12-23 2021-12-23 视频与城市信息模型三维场景融合方法、系统及存储介质
CN202111591333.2 2021-12-23

Publications (1)

Publication Number Publication Date
WO2023116430A1 true WO2023116430A1 (zh) 2023-06-29

Family

ID=80797196

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/137042 WO2023116430A1 (zh) 2021-12-23 2022-12-06 视频与城市信息模型三维场景融合方法、系统及存储介质

Country Status (2)

Country Link
CN (1) CN114255285B (zh)
WO (1) WO2023116430A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117470122A (zh) * 2023-11-08 2024-01-30 华中科技大学 一种钢筋骨架绑扎质量自动检查装置
CN118331474A (zh) * 2024-06-12 2024-07-12 中大智能科技股份有限公司 基于三维模型解算相机外方位元素的可交互式方法

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114255285B (zh) * 2021-12-23 2023-07-18 奥格科技股份有限公司 视频与城市信息模型三维场景融合方法、系统及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582022A (zh) * 2020-03-26 2020-08-25 深圳大学 一种移动视频与地理场景的融合方法、系统及电子设备
CN111836012A (zh) * 2020-06-28 2020-10-27 航天图景(北京)科技有限公司 基于三维场景的视频融合与视频联动方法及电子设备
CN112053446A (zh) * 2020-07-11 2020-12-08 南京国图信息产业有限公司 一种基于三维gis的实时监控视频与三维场景融合方法
WO2021227360A1 (zh) * 2020-05-14 2021-11-18 佳都新太科技股份有限公司 一种交互式视频投影方法、装置、设备及存储介质
CN114255285A (zh) * 2021-12-23 2022-03-29 奥格科技股份有限公司 视频与城市信息模型三维场景融合方法、系统及存储介质

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447869B (zh) * 2015-11-30 2019-02-12 四川华雁信息产业股份有限公司 基于粒子群优化算法的摄像机自标定方法及装置
CN108537876B (zh) * 2018-03-05 2020-10-16 清华-伯克利深圳学院筹备办公室 三维重建方法、装置、设备及存储介质
CN109035394B (zh) * 2018-08-22 2023-04-07 广东工业大学 人脸三维模型重建方法、装置、设备、系统及移动终端
CN110648363A (zh) * 2019-09-16 2020-01-03 腾讯科技(深圳)有限公司 相机姿态确定方法、装置、存储介质及电子设备
CN112258587B (zh) * 2020-10-27 2023-07-07 上海电力大学 一种基于灰狼粒子群混合算法的相机标定方法
CN112927353B (zh) * 2021-02-25 2023-05-19 电子科技大学 基于二维目标检测和模型对齐的三维场景重建方法、存储介质及终端
CN113658263B (zh) * 2021-06-17 2023-10-31 石家庄铁道大学 基于视觉场景的电磁干扰源可视化标注方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582022A (zh) * 2020-03-26 2020-08-25 深圳大学 一种移动视频与地理场景的融合方法、系统及电子设备
WO2021227360A1 (zh) * 2020-05-14 2021-11-18 佳都新太科技股份有限公司 一种交互式视频投影方法、装置、设备及存储介质
CN111836012A (zh) * 2020-06-28 2020-10-27 航天图景(北京)科技有限公司 基于三维场景的视频融合与视频联动方法及电子设备
CN112053446A (zh) * 2020-07-11 2020-12-08 南京国图信息产业有限公司 一种基于三维gis的实时监控视频与三维场景融合方法
CN114255285A (zh) * 2021-12-23 2022-03-29 奥格科技股份有限公司 视频与城市信息模型三维场景融合方法、系统及存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU ZHONGDONG, DAI ZHAOXIN; LI CHENGMING; LIU XIAOLI: "A Fast Fusion Object Determination Method for Multi-path Video and Three-dimensional GIS Scene", ACTA GEODAETICA ET CARTOGRAPHICA SINICA, vol. 49, no. 5, 15 May 2020 (2020-05-15), pages 632 - 643, XP093074365, ISSN: 1001-1595, DOI: 10.11947/j.AGCS.2020.20190282 *


Also Published As

Publication number Publication date
CN114255285B (zh) 2023-07-18
CN114255285A (zh) 2022-03-29


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22909745

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE