CN114387351A - Monocular vision calibration method and computer readable storage medium - Google Patents
Monocular vision calibration method and computer readable storage medium
- Publication number
- CN114387351A
- Authority
- CN
- China
- Prior art keywords
- information
- fitted
- pose
- semantic
- monocular
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to the technical field of navigation, and provides a monocular vision calibration method and a computer readable storage medium. According to the monocular vision calibration method, the traditional monocular vision SLAM technology and the text recognition technology are combined together, so that the accuracy of pose estimation of a monocular camera can be improved, and a semantic map convenient for a user to use can be generated.
Description
Technical Field
The invention relates to the technical field of navigation, in particular to a monocular vision calibration method and a computer readable storage medium.
Background
Visual SLAM (Simultaneous Localization and Mapping) is a technology that estimates ego-motion while simultaneously modeling the scene, and it has been widely applied in fields such as autonomous driving, augmented reality, virtual reality, and robot navigation. Visual SLAM attempts to solve the following problem: when an intelligent agent moves through an unknown environment, how can it determine its own trajectory from the images captured by a camera and construct a map of the surrounding environment? Conventional visual SLAM uses little semantic information in localization and mapping and is therefore limited in many application scenarios. Combining conventional SLAM with semantic information can improve the practicability and robustness of the system and brings it closer to the way humans explore unknown environments.
Obtaining accurate feature matches is a crucial component of monocular SLAM. Conventional monocular SLAM relies only on the matching relationships among a limited number of feature points and therefore cannot always obtain an accurate camera pose estimate. The generated map has low accuracy, and because it is a sparse point cloud map, it is of little practical use from the perspective of user interaction. It is therefore desirable to provide a monocular vision calibration method and a computer readable storage medium that solve at least the above problems.
Disclosure of Invention
It is an object of the present invention to provide a monocular vision calibration method and computer readable storage medium that at least partially overcome the deficiencies in the prior art.
According to an aspect of the present invention, there is provided a monocular vision calibration method, comprising the steps of:
acquiring an original image through a monocular camera, and extracting semantic information in the original image;
obtaining initial pose information of the monocular camera and initial coordinate information of the original image based on the semantic information and the original image;
continuously acquiring a plurality of images to be fitted according to a time sequence through the monocular camera, and obtaining pose information to be fitted of the monocular camera and coordinate information to be fitted of the images to be fitted based on a uniform velocity model, the initial pose information and the initial coordinate information;
judging the number of the images to be fitted containing the semantic information, fitting the initial pose information and the pose information to be fitted to obtain output pose information when the number of the images to be fitted containing the semantic information is not less than six, and fitting the initial coordinate information and the coordinate information to be fitted to obtain output three-dimensional scene information;
and calibrating the pose and the coordinates of the monocular camera based on the relative relationship between the output pose information and the output three-dimensional scene information.
Preferably, before the original image is acquired through the monocular camera, the method further comprises: calibrating the intrinsic parameter matrix and the distortion parameters of the monocular camera.
Preferably, the semantic information is a set of pixel points having text features in the original image.
Preferably, fitting the initial coordinate information and the coordinate information to be fitted to obtain the output three-dimensional scene information comprises: fitting a plurality of semantic planes based on the initial coordinate information and the coordinate information to be fitted; and obtaining the three-dimensional scene information by fitting based on the semantic planes.
Preferably, fitting the plurality of semantic planes based on the initial coordinate information and the coordinate information to be fitted comprises: regarding the initial coordinate information and the coordinate information to be fitted that contain the semantic information as lying on the same plane, thereby obtaining the semantic plane.
Preferably, calibrating the pose and the coordinates of the monocular camera based on the relative relationship between the output pose information and the output three-dimensional scene information comprises: obtaining a reprojection error factor based on the output three-dimensional scene information; and calibrating the coordinates of the monocular camera through the reprojection error factor.
Preferably, a distance factor is obtained based on the output three-dimensional scene information, and the pose of the monocular camera is calibrated based on the distance factor and the reprojection error factor.
Preferably, based on the distance factor and the reprojection error factor, the pose of the monocular camera is calibrated through a factor graph optimization algorithm.
Preferably, the factor graph optimization algorithm is constructed based on the g2o library.
According to another aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a monocular vision calibration method as described in any one of the above.
The invention provides a monocular vision calibration method, which comprises the steps of collecting an original image through a monocular camera, extracting semantic information in the original image, continuously collecting a plurality of images to be fitted according to a time sequence based on the semantic information and the original image, fitting to obtain output pose information and output three-dimensional scene information, and calibrating the pose and the coordinate of the monocular camera based on the relative relationship between the output pose information and the output three-dimensional scene information. According to the monocular vision calibration method, the traditional monocular vision SLAM technology and the text recognition technology are combined together, so that the accuracy of pose estimation of a monocular camera can be improved, and a semantic map convenient for a user to use can be generated.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of a monocular vision calibration method according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. For convenience of description, only portions related to the invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The monocular camera in the embodiments of the present application mainly refers to a monocular digital camera, which may take various forms such as a pinhole camera or a wide-angle camera. The application targets monocular cameras because some existing equipment, such as vehicle-mounted cameras or cameras carried by unmanned aerial vehicles, is monocular, and converting it to a binocular camera to improve positioning accuracy is costly. One purpose of the present application is to keep the existing monocular camera and upgrade the positioning performance of the whole unmanned aerial vehicle or vehicle in software, which can considerably improve positioning accuracy and provide a semantic map that the unmanned aerial vehicle or vehicle can use directly. In addition, the method can also be applied to devices such as smart glasses equipped with a monocular camera.
As shown in fig. 1, the present invention provides a monocular vision calibration method, comprising the following steps:
S101: acquiring an original image through a monocular camera, and extracting semantic information in the original image;
S102: obtaining initial pose information of the monocular camera and initial coordinate information of the original image based on the semantic information and the original image;
S103: continuously acquiring a plurality of images to be fitted according to a time sequence through the monocular camera, and obtaining pose information to be fitted of the monocular camera and coordinate information to be fitted of the images to be fitted based on a uniform velocity model, the initial pose information and the initial coordinate information;
S104: judging the number of images to be fitted containing the semantic information, fitting the initial pose information and the pose information to be fitted to obtain output pose information when the number of images to be fitted containing the semantic information is not less than six, and fitting the initial coordinate information and the coordinate information to be fitted to obtain output three-dimensional scene information;
S105: calibrating the pose and the coordinates of the monocular camera based on the relative relationship between the output pose information and the output three-dimensional scene information.
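The gating condition in step S104 can be illustrated with a minimal Python sketch. The frame representation (a dictionary with a `texts` list) and the helper names are assumptions made for illustration only; the patent does not prescribe a data layout.

```python
# Minimal sketch of the S104 gate. The frame dictionaries and helper
# names are hypothetical; only the threshold of six comes from the text.
MIN_SEMANTIC_FRAMES = 6  # "not less than six" images containing semantic information


def count_semantic_frames(frames):
    """Count frames in which at least one text region was detected."""
    return sum(1 for frame in frames if frame["texts"])


def ready_to_fit(frames):
    """Return True once enough frames carry semantic (text) information."""
    return count_semantic_frames(frames) >= MIN_SEMANTIC_FRAMES


# Ten frames, of which every second one contains a detected word.
frames = [{"id": i, "texts": ["EXIT"] if i % 2 == 0 else []} for i in range(10)]
print(ready_to_fit(frames))   # five semantic frames so far

frames.append({"id": 10, "texts": ["EXIT"]})
print(ready_to_fit(frames))   # a sixth semantic frame has arrived
```

Fitting of the pose and of the three-dimensional scene would start only after `ready_to_fit` becomes true.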
In the processing S101, the original image may be a color image or a black-and-white image, but its sharpness must be sufficient to distinguish the semantic information, in particular which pixels belong to semantic information and which piece of semantic information each pixel belongs to. The present application therefore places a certain requirement on the sharpness of the original image.
In the processing S101, the two-dimensional text information in the picture taken by the camera at the current time may be detected and extracted with software such as Mask TextSpotter. Many software packages can perform this kind of semantic recognition; they are well known to those skilled in the art and are not described in detail here. Once it is known whether each pixel in the current picture belongs to a text feature, and to which text feature it belongs, the two-dimensional text information set (L) of a single picture can, as a preferred implementation, be defined as:
$$L = \{\, l_t \mid l_t = (w_t, p_t) \,\}$$
where $l_t$ denotes the t-th piece of two-dimensional text information, consisting of the word $w_t$ contained in the text information and the vertex pixel coordinates $p_t$ of the polygonal bounding box of the text information in the picture, and $T$ denotes the total number of pieces of two-dimensional text information detected in the current picture; $T = -1$ indicates that no text information was detected in the current picture.
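One way to hold the two-dimensional text information set in code is sketched below. The class name `TextItem`, the function `build_text_set`, and the filtering rule are illustrative assumptions; the detector output format is not specified in the source.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TextItem:
    """One detected text region: a word plus its bounding polygon."""
    word: str                            # recognized word (w_t)
    polygon: List[Tuple[float, float]]   # polygon vertices in pixel coordinates (p_t)


def build_text_set(detections):
    """Turn raw (word, polygon) pairs into the per-image text set.

    Assumption for this sketch: detections with an empty word or with
    fewer than three polygon vertices are discarded.
    """
    return [TextItem(w, list(p)) for w, p in detections if w and len(p) >= 3]


detections = [
    ("EXIT", [(10, 10), (60, 10), (60, 30), (10, 30)]),
    ("", [(0, 0), (1, 0), (1, 1), (0, 1)]),   # empty recognition result, dropped
]
text_set = build_text_set(detections)
print(len(text_set), text_set[0].word)
```

An empty list plays the role of "no text information detected" in this sketch.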
In the processing S102, the initial coordinate information of the original image may be obtained by searching an existing map for the semantic information identified in the processing S101. The map may be a general navigation map or a dedicated map of a smaller scene, such as a three-dimensional navigation map of the interior of an office building. Searching by semantic information quickly yields rough initial coordinate information, for example which point on the map the original image corresponds to, although this differs somewhat from the exact position of the monocular camera because monocular positioning is inaccurate. The initial pose information may be determined in a similar way: for example, the pixels belonging to the same piece of semantic information are regarded as lying in the same plane, and the semantic information image formed after fitting is compared with a pre-stored semantic information image to obtain a rough initial pose of the current monocular camera. These techniques are well known to those skilled in the art and are not described in detail.
In the processing S103, the uniform velocity model assumes that, starting from the initial coordinate information and the initial pose, the monocular camera keeps moving at a constant velocity along its initial direction of motion, where the velocity is taken from an empirically set parameter, the average velocity over a preceding period, or the like. The acquisition and processing frequency of the monocular camera is in practice very high, and in the processing S104 output pose information can be produced as soon as six images to be fitted containing the semantic information have been acquired. The time involved is therefore very short, the intermediate velocity changes do not need to be determined precisely, and using the uniform velocity model effectively reduces the amount of computation.
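A uniform velocity prediction can be sketched as follows. This is one common formulation, assumed here for illustration: poses are 4x4 rigid transforms, and the motion between the two most recent frames is simply repeated. The source instead allows the velocity to come from an empirical parameter or a running average, so the function names and the choice of velocity estimate are assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def invert_rigid(T):
    """Invert a 4x4 rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(R_t[i][j] * t[j] for j in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def predict_pose(T_prev, T_last):
    """Constant-velocity prediction: apply the last inter-frame motion again."""
    velocity = matmul(T_last, invert_rigid(T_prev))  # motion from T_prev to T_last
    return matmul(velocity, T_last)


def translation(x, y, z):
    """Build a pure-translation pose for the example."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


T_prev = translation(0.0, 0.0, 0.0)
T_last = translation(0.1, 0.0, 0.0)
T_pred = predict_pose(T_prev, T_last)
print(T_pred[0][3])   # predicted x translation, one more step of the same size
```

The predicted pose serves only as the pose information to be fitted; it is refined later by the optimization.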
In the processing S104, the preceding steps have obtained the initial pose and initial coordinate information of the camera by initializing from the matching relationships of feature points carrying text semantic labels. The pose information to be fitted of the monocular camera at each moment is then obtained through the uniform velocity model, and new coordinate information to be fitted is generated. The method continuously checks whether the number of images to be fitted that contain the semantic information has reached six. Once it has, in a preferred implementation a random sample consensus (RANSAC) algorithm may be used to fit planes to the set of spatial points: a semantic plane is fitted first, and the output three-dimensional scene information is then obtained from the semantic plane.
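The plane-fitting step can be sketched with a small RANSAC loop. The iteration count, inlier threshold, fixed random seed, and function names are assumptions chosen for the example; the plane is returned as (n_x, n_y, n_z, d) with a unit normal and the sign convention n·X + d = 0, so |d| is the distance from the origin to the plane.

```python
import math
import random


def plane_from_points(p1, p2, p3):
    """Plane (nx, ny, nz, d) through three points, or None if they are collinear."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:
        return None
    n = [c / norm for c in n]
    d = -sum(n[i] * p1[i] for i in range(3))
    return (n[0], n[1], n[2], d)


def point_plane_distance(plane, p):
    """Absolute distance from point p to the plane (unit normal assumed)."""
    return abs(plane[0] * p[0] + plane[1] * p[1] + plane[2] * p[2] + plane[3])


def ransac_plane(points, iterations=200, threshold=0.01, seed=0):
    """Return the plane with the most inliers among random 3-point samples."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        inliers = [p for p in points if point_plane_distance(plane, p) < threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Sixteen map points on the plane z = 1, plus two outliers.
points = [(x * 0.5, y * 0.5, 1.0) for x in range(4) for y in range(4)]
points += [(0.3, 0.2, 2.0), (1.0, 1.0, -0.5)]
plane, inliers = ransac_plane(points)
print(plane, len(inliers))
```

A least-squares refit over the inliers would normally follow; it is omitted to keep the sketch short.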
In a specific implementation, the geometric expression of the fitted semantic plane may be:
$$\pi = (n_x, n_y, n_z, d)^T \tag{2.2}$$
where $\pi$ denotes the fitted semantic plane, $(n_x, n_y, n_z)^T$ is the unit normal vector of the plane, and $d$ is the distance from the origin to the plane. After the semantic plane is obtained by fitting, the semantic information is used again to build the spatial plane into the output three-dimensional scene information, which is defined as:
$$\Pi_k = \{W_k, \pi_k, Q_k, Y_k\} \tag{2.3}$$
where $\Pi_k$ denotes the k-th semantic plane, $W_k$ denotes the words contained in the semantic information corresponding to the semantic plane, $\pi_k$ is the geometric expression of the spatial plane corresponding to the k-th semantic plane, $Q_k$ is the position of the center point of the k-th semantic plane, and $Y_k$ is the set of map points belonging to the k-th semantic plane.
Therefore, output three-dimensional scene information tightly coupled with the semantic information is obtained, and various composite functions such as semantic information guidance and the like can be provided while auxiliary positioning is realized.
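The semantic plane record can be represented by a small data structure, sketched below. The class and field names are hypothetical, and taking the center point as the centroid of the map points is an assumption; the source only says that a center point is stored.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point3 = Tuple[float, float, float]


@dataclass
class SemanticPlane:
    """One entry of the output three-dimensional scene information."""
    words: List[str]                           # W_k: words on the plane
    plane: Tuple[float, float, float, float]   # pi_k: (nx, ny, nz, d)
    center: Point3                             # Q_k: center point
    map_points: List[Point3] = field(default_factory=list)  # Y_k: map points


def centroid(points):
    """Arithmetic mean of a non-empty list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def make_semantic_plane(words, plane, map_points):
    """Assemble a SemanticPlane, using the centroid as the center (assumption)."""
    return SemanticPlane(list(words), plane, centroid(map_points), list(map_points))


pts = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (0.0, 1.0, 1.0)]
sign = make_semantic_plane(["EXIT"], (0.0, 0.0, 1.0, -1.0), pts)
print(sign.words, sign.center)
```

Keeping the words next to the geometry is what allows the map to be queried by text, for example to guide a user toward a sign.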
In the processing S104, fitting the initial pose information and the pose information to be fitted to obtain the output pose information may be based on a factor graph optimization algorithm, and the semantic information in the scene used by the present invention provides supplementary cues for estimating the camera pose. On the one hand, the semantic information contains a large number of feature points, which help estimate the pose of the camera. On the other hand, the semantic information can be regarded as a plane with a specific label attribute, and the camera pose estimate can be further refined by regarding the spatial map points that belong to the same text region as lying on the same plane. To exploit both effects, the planar semantic information is tightly coupled into the factor graph optimization model, and an accurate camera pose estimate is computed by jointly optimizing the reprojection error factor and the point-to-plane distance error factor. These processes are well known to those skilled in the art and are not described in detail here.
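As a preferred implementation, before the processing S101, the intrinsic parameter matrix and the distortion parameters of the monocular camera are calibrated. A specific implementation of the calibration process is as follows:
1) Intrinsic calibration is performed on the camera to obtain its distortion parameters and its intrinsic parameter matrix. With the symbols defined below, the distortion model may be written in the standard radial-tangential form:
$$x_{distorted} = x\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2 p_1 x y + p_2 (r^2 + 2x^2)$$
$$y_{distorted} = y\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2y^2) + 2 p_2 x y$$
where $[x, y]$ are the coordinates of a point on the normalized plane, $[x_{distorted}, y_{distorted}]$ are the distorted coordinates, $k_1, k_2, k_3, p_1, p_2$ are the distortion coefficients, and $r$ is the distance from the point on the plane to the origin of the coordinate system;
$$P = \begin{bmatrix} f & 0 & O_x \\ 0 & f & O_y \\ 0 & 0 & 1 \end{bmatrix}$$
$P$ is the camera intrinsic parameter matrix, where $f$ is the camera focal length and $[O_x, O_y]$ is the principal point.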
Therefore, after the calibration of the internal parameter matrix and the distortion parameter is completed, the subsequent calibration process can be facilitated.
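How calibrated intrinsics and distortion parameters are applied can be sketched as follows, assuming the common radial-tangential distortion model with coefficients k1, k2, k3, p1, p2 and a single focal length f. The function names and numeric values are illustrative only.

```python
def distort(x, y, k1, k2, k3, p1, p2):
    """Apply radial-tangential distortion to a normalized image point (x, y)."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xd = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    yd = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return xd, yd


def project(point_cam, f, ox, oy, dist=(0.0, 0.0, 0.0, 0.0, 0.0)):
    """Project a 3D point in the camera frame to pixel coordinates.

    f is the focal length in pixels and (ox, oy) the principal point;
    dist holds (k1, k2, k3, p1, p2).
    """
    X, Y, Z = point_cam
    x, y = X / Z, Y / Z                  # normalized plane coordinates
    xd, yd = distort(x, y, *dist)
    return f * xd + ox, f * yd + oy


# Example with made-up intrinsics and no distortion.
u, v = project((0.2, -0.1, 2.0), f=500.0, ox=320.0, oy=240.0)
print(u, v)

# Pure radial distortion pushes a point at radius 1 outward by a factor 1 + k1.
print(distort(1.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0))
```

In practice the coefficients would come from a calibration routine run on images of a known target; that routine is outside the scope of this sketch.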
As a preferred implementation, in the processing S101, the semantic information is a set of pixel points having text features in the original image.
As a preferred implementation manner, in the processing S104, the fitting the initial coordinate information and the coordinate information to be fitted to obtain the output three-dimensional scene information includes: fitting based on the initial coordinate information and the coordinate information to be fitted to obtain a plurality of semantic planes; and fitting based on the semantic plane to obtain three-dimensional scene information. The fitting based on the initial coordinate information and the coordinate information to be fitted to obtain a plurality of semantic planes comprises the following steps: and regarding the initial coordinate information containing the semantic information and the coordinate information to be fitted as being on the same plane to obtain a semantic plane.
As a preferred implementation manner, in the process S105, calibrating the pose and the coordinates of the monocular camera based on the relative relationship between the output pose information and the output three-dimensional scene information includes: obtaining a reprojection error factor based on the output three-dimensional scene information; and calibrating the coordinates of the monocular camera through the reprojection error factor.
A specific implementation may be as follows. The reprojection error factor is defined as:
$$e_{i,j} = p_{i,j} - \mathrm{Proj}(T_c P_{i,j}) \tag{3.1}$$
where $e_{i,j}$ denotes the reprojection error factor, $P_{i,j}$ denotes the coordinates of the j-th spatial point at time i, $p_{i,j}$ denotes the pixel coordinates of the feature point in the image corresponding to $P_{i,j}$, $\mathrm{Proj}$ is the projection function of the camera, and $T_c$ is the camera pose estimate at the current time.
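The reprojection error of equation (3.1) can be evaluated as in the sketch below. It assumes that T_c is a 4x4 transform mapping world coordinates into the camera frame and that Proj is an undistorted pinhole projection; both choices, and the numeric values, are assumptions made for the example.

```python
def transform(T, P):
    """Apply a 4x4 rigid transform T to a 3D point P."""
    return [sum(T[i][j] * P[j] for j in range(3)) + T[i][3] for i in range(3)]


def proj(point_cam, f, ox, oy):
    """Undistorted pinhole projection of a camera-frame point to pixels."""
    X, Y, Z = point_cam
    return (f * X / Z + ox, f * Y / Z + oy)


def reprojection_error(p_obs, T_c, P_world, f, ox, oy):
    """Observed pixel minus projected pixel, as in e = p - Proj(T_c P)."""
    u, v = proj(transform(T_c, P_world), f, ox, oy)
    return (p_obs[0] - u, p_obs[1] - v)


identity = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# A point on the optical axis projects to the principal point (320, 240).
err = reprojection_error((321.0, 239.0), identity, (0.0, 0.0, 2.0),
                         f=500.0, ox=320.0, oy=240.0)
print(err)
```

Each matched feature point contributes one such two-dimensional residual to the factor graph.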
As a preferred implementation mode, a distance factor is obtained based on the output three-dimensional scene information, and the pose of the monocular camera is calibrated based on the distance factor and the reprojection error factor.
A specific implementation may be as follows. The point-to-plane distance factor ($e_{k,m}$) is defined as:
$$e_{k,m} = \pi_k^T P_{k,m} \tag{3.2}$$
where $P_{k,m}$ is the homogeneous representation of the spatial coordinates of the m-th map point belonging to the k-th semantic plane, and $\pi_k$ is the geometric expression, in the form of equation (2.2), of the k-th three-dimensional semantic plane.
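The point-to-plane distance factor reduces to a dot product between the plane vector and the homogeneous point, as the short sketch below shows. The function name and the sample values are illustrative; the plane is assumed to have a unit normal so that the result is a signed metric distance.

```python
def point_to_plane_error(plane, point):
    """Signed distance of a 3D point to a plane (nx, ny, nz, d).

    Computed as the dot product of the plane vector with the homogeneous
    point (X, Y, Z, 1); the normal is assumed to have unit length.
    """
    homogeneous = (point[0], point[1], point[2], 1.0)
    return sum(a * b for a, b in zip(plane, homogeneous))


plane_z1 = (0.0, 0.0, 1.0, -1.0)      # the plane z = 1
print(point_to_plane_error(plane_z1, (0.3, 0.2, 1.5)))   # point above the plane
print(point_to_plane_error(plane_z1, (0.3, 0.2, 1.0)))   # point on the plane
```

A map point that truly lies on its semantic plane yields a zero residual, which is what the optimization drives toward.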
As a preferred implementation, the pose of the monocular camera is calibrated through a factor graph optimization algorithm based on the distance factor and the reprojection error factor. The factor graph optimization algorithm can be constructed based on the g2o library.
A specific implementation may be as follows:
The final camera pose information is obtained by jointly optimizing the reprojection error factor and the point-to-plane distance factor, where the loss function (C) of the least-squares model in the factor graph is defined as:
$$C = \lambda_1 \sum_{i,j} \rho_h\!\left(e_{i,j}^T\, \Omega_{i,j}^{-1}\, e_{i,j}\right) + \lambda_2 \sum_{k,m} \rho_h\!\left(e_{k,m}^T\, \Omega_{k,m}^{-1}\, e_{k,m}\right)$$
where $\lambda_1$ is the weight of the reprojection error factor, $\lambda_2$ is the weight of the point-to-plane distance factor, $\rho_h$ is the robust Huber function, $e_{i,j}$ is the reprojection error factor, $e_{k,m}$ is the spatial point-to-plane distance factor, $\Omega_{i,j}$ is the covariance matrix of the reprojection error factor, and $\Omega_{k,m}$ is the covariance matrix of the point-to-plane distance factor.
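The joint cost can be evaluated numerically as in the following sketch. It assumes isotropic covariances (a pixel standard deviation `sigma_px` and a distance standard deviation `sigma_d`) and one common form of the Huber function applied to the squared, covariance-weighted error; these simplifications and all parameter values are assumptions, not details from the source.

```python
import math


def huber(s, delta=1.0):
    """Huber function of a squared error s: quadratic near zero, linear beyond delta."""
    e = math.sqrt(s)
    return s if e <= delta else 2.0 * delta * e - delta * delta


def total_cost(reproj_errors, plane_errors, lambda1=1.0, lambda2=1.0,
               sigma_px=1.0, sigma_d=0.01, delta=1.0):
    """Weighted sum of robustified reprojection and point-to-plane terms.

    reproj_errors: list of (ex, ey) pixel residuals.
    plane_errors:  list of signed point-to-plane distances.
    Covariances are assumed isotropic: sigma_px^2 * I and sigma_d^2.
    """
    c_reproj = sum(huber((ex * ex + ey * ey) / sigma_px ** 2, delta)
                   for ex, ey in reproj_errors)
    c_plane = sum(huber((d * d) / sigma_d ** 2, delta) for d in plane_errors)
    return lambda1 * c_reproj + lambda2 * c_plane


# One small pixel residual, one large (outlier) residual, one plane residual.
cost = total_cost([(0.3, 0.4), (3.0, 4.0)], [0.005])
print(cost)
```

The large residual contributes linearly rather than quadratically, which is how the Huber function limits the influence of mismatched features.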
The optimization problem for solving the camera pose parameters is defined as:
$$T_c^{*} = \arg\min_{T_c} C$$
where $T_c^{*}$ denotes the camera pose estimate that is ultimately solved for, and $T_c$ is the initial value of the camera pose estimate in the factor graph optimization. The invention uses the g2o library to construct the factor graph optimization problem and solves it with the Levenberg-Marquardt (LM) algorithm, which finally yields a camera pose estimate with low error.
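To show how a Levenberg-Marquardt iteration behaves, the toy sketch below estimates only the camera translation from reprojection residuals, using a numerical Jacobian and plain Python. It is not the g2o implementation and omits rotation, the plane factor, and the Huber function; the intrinsics, points, and parameter values are all made up for the example.

```python
def project(P, t, f=500.0, ox=320.0, oy=240.0):
    """Pinhole projection of world point P seen by a camera translated by t."""
    X, Y, Z = P[0] + t[0], P[1] + t[1], P[2] + t[2]
    return (f * X / Z + ox, f * Y / Z + oy)


def residuals(t, points, observations):
    """Stacked reprojection residuals (observed minus projected)."""
    r = []
    for P, p in zip(points, observations):
        u, v = project(P, t)
        r.extend([p[0] - u, p[1] - v])
    return r


def jacobian(t, points, observations, eps=1e-6):
    """Forward-difference Jacobian; cols[k][i] is d r_i / d t_k."""
    r0 = residuals(t, points, observations)
    cols = []
    for k in range(3):
        step = list(t)
        step[k] += eps
        r1 = residuals(step, points, observations)
        cols.append([(a - b) / eps for a, b in zip(r1, r0)])
    return r0, cols


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, 3):
            factor = M[r][c] / M[c][c]
            for k in range(c, 4):
                M[r][k] -= factor * M[c][k]
    x = [0.0, 0.0, 0.0]
    for c in (2, 1, 0):
        x[c] = (M[c][3] - sum(M[c][k] * x[k] for k in range(c + 1, 3))) / M[c][c]
    return x


def levenberg_marquardt(t0, points, observations, iterations=50, lam=1e-3):
    """Minimize the sum of squared residuals over the translation t."""
    t = list(t0)
    cost = sum(r * r for r in residuals(t, points, observations))
    for _ in range(iterations):
        r, J = jacobian(t, points, observations)
        H = [[sum(a * b for a, b in zip(J[i], J[j])) for j in range(3)] for i in range(3)]
        g = [sum(a * b for a, b in zip(J[i], r)) for i in range(3)]
        A = [[H[i][j] + (lam * H[i][i] if i == j else 0.0) for j in range(3)]
             for i in range(3)]
        step = solve3(A, [-gi for gi in g])
        t_new = [t[i] + step[i] for i in range(3)]
        new_cost = sum(x * x for x in residuals(t_new, points, observations))
        if new_cost < cost:      # accept the step and trust the model more
            t, cost, lam = t_new, new_cost, lam * 0.1
        else:                    # reject the step and increase damping
            lam *= 10.0
    return t, cost


points = [(-1.0, -1.0, 4.0), (1.0, -1.0, 5.0), (1.0, 1.0, 6.0),
          (-1.0, 1.0, 5.0), (0.0, 0.0, 4.0)]
t_true = (0.1, -0.05, 0.2)
observations = [project(P, t_true) for P in points]
t_est, final_cost = levenberg_marquardt([0.0, 0.0, 0.0], points, observations)
print(t_est, final_cost)
```

A full solver would optimize the six-degree-of-freedom pose on the manifold of rigid transforms and include both factor types; the accept-or-reject damping logic is the part this sketch is meant to illustrate.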
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The present application also provides a computer-readable medium having stored thereon a computer program which, when executed by a processor, implements a monocular vision calibration method as described above. The computer readable media may include both permanent and non-permanent, removable and non-removable media implemented in any method or technology for storage of information. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal or a carrier wave.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.
Claims (10)
1. A monocular vision calibration method is characterized by comprising the following steps:
acquiring an original image through a monocular camera, and extracting semantic information in the original image;
obtaining initial pose information of the monocular camera and initial coordinate information of the original image based on the semantic information and the original image;
continuously acquiring a plurality of images to be fitted according to a time sequence through the monocular camera, and obtaining pose information to be fitted of the monocular camera and coordinate information to be fitted of the images to be fitted based on a uniform velocity model, the initial pose information and the initial coordinate information;
judging the number of the images to be fitted containing the semantic information, fitting the initial pose information and the pose information to be fitted to obtain output pose information when the number of the images to be fitted containing the semantic information is not less than six, and fitting the initial coordinate information and the coordinate information to be fitted to obtain output three-dimensional scene information;
and calibrating the pose and the coordinates of the monocular camera based on the relative relationship between the output pose information and the output three-dimensional scene information.
2. The monocular vision calibration method of claim 1, further comprising, before acquiring the original image through the monocular camera: calibrating the intrinsic parameter matrix and the distortion parameters of the monocular camera.
3. The monocular vision calibration method of claim 1, wherein the semantic information is a set of pixel points having text features in the original image.
4. The monocular vision calibration method of claim 1, wherein fitting the initial coordinate information and the coordinate information to be fitted to obtain the output three-dimensional scene information comprises: fitting a plurality of semantic planes based on the initial coordinate information and the coordinate information to be fitted; and obtaining the three-dimensional scene information by fitting based on the semantic planes.
5. The monocular vision calibration method of claim 4, wherein fitting the plurality of semantic planes based on the initial coordinate information and the coordinate information to be fitted comprises: regarding the initial coordinate information and the coordinate information to be fitted that contain the semantic information as lying on the same plane, thereby obtaining the semantic plane.
6. The monocular vision calibration method of claim 1, wherein calibrating the pose and the coordinates of the monocular camera based on the relative relationship between the output pose information and the output three-dimensional scene information comprises: obtaining a reprojection error factor based on the output three-dimensional scene information; and calibrating the coordinates of the monocular camera through the reprojection error factor.
7. The monocular vision calibration method of claim 6, wherein a distance factor is obtained based on the output three-dimensional scene information, and the pose of the monocular camera is calibrated based on the distance factor and the reprojection error factor.
8. The monocular vision calibration method of claim 7, wherein the pose of the monocular camera is calibrated by a factor graph optimization algorithm based on the distance factor and the reprojection error factor.
9. The monocular vision calibration method of claim 8, wherein the factor graph optimization algorithm is constructed based on the g2o library.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the monocular vision calibration method according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111573714.8A CN114387351A (en) | 2021-12-21 | 2021-12-21 | Monocular vision calibration method and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114387351A (en) | 2022-04-22 |
Family
ID=81197358
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111573714.8A Pending CN114387351A (en) | 2021-12-21 | 2021-12-21 | Monocular vision calibration method and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114387351A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109815847A (en) * | 2018-12-30 | 2019-05-28 | 中国电子科技集团公司信息科学研究院 | A kind of vision SLAM method based on semantic constraint |
CN110298921A (en) * | 2019-07-05 | 2019-10-01 | 青岛中科智保科技有限公司 | The construction method and processing equipment of three-dimensional map with personage's semantic information |
WO2020156923A2 (en) * | 2019-01-30 | 2020-08-06 | Harman Becker Automotive Systems Gmbh | Map and method for creating a map |
CN111968129A (en) * | 2020-07-15 | 2020-11-20 | 上海交通大学 | Instant positioning and map construction system and method with semantic perception |
Non-Patent Citations (2)
Title |
---|
QIYI TONG et al.: "TXSLAM: A Monocular Semantic SLAM Tightly Coupled with Planar Text Features", PROCEEDINGS OF THE 2022 IEEE 25TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, 6 May 2022 (2022-05-06) * |
LI XIAOHAN et al.: "Mono-SemSLAM: A Monocular Visual SLAM Method Based on Object Semantic Information", Proceedings of the 22nd China Annual Conference on System Simulation Technology and its Application (CCSSTA 22nd 2021), 10 October 2021 (2021-10-10) * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115100643A (en) * | 2022-08-26 | 2022-09-23 | 潍坊现代农业与生态环境研究院 | Monocular vision positioning enhancement method and equipment fusing three-dimensional scene semantics |
CN115100643B (en) * | 2022-08-26 | 2022-11-11 | 潍坊现代农业与生态环境研究院 | Monocular vision positioning enhancement method and equipment fusing three-dimensional scene semantics |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110070615B (en) | Multi-camera cooperation-based panoramic vision SLAM method | |
CN109345588B (en) | Tag-based six-degree-of-freedom attitude estimation method | |
CN111325796B (en) | Method and apparatus for determining pose of vision equipment | |
CN110853075B (en) | Visual tracking positioning method based on dense point cloud and synthetic view | |
EP2833322B1 (en) | Stereo-motion method of three-dimensional (3-D) structure information extraction from a video for fusion with 3-D point cloud data | |
US8467596B2 (en) | Method and apparatus for object pose estimation | |
CN109993793B (en) | Visual positioning method and device | |
US20220319146A1 (en) | Object detection method, object detection device, terminal device, and medium | |
CN114140527A (en) | Dynamic environment binocular vision SLAM method based on semantic segmentation | |
CN114137564A (en) | Automatic indoor object identification and positioning method and device | |
CN114140539A (en) | Method and device for acquiring position of indoor object | |
CN113160315B (en) | Semantic environment map representation method based on dual quadric surface mathematical model | |
CN114387351A (en) | Monocular vision calibration method and computer readable storage medium | |
WO2021114775A1 (en) | Object detection method, object detection device, terminal device, and medium | |
KR102249381B1 (en) | System for generating spatial information of mobile device using 3D image information and method therefor | |
CN112509110A (en) | Automatic image data set acquisition and labeling framework for land confrontation intelligent agent | |
CN115908564A (en) | Storage line inspection method of automatic transportation equipment and automatic transportation equipment | |
CN115656991A (en) | Vehicle external parameter calibration method, device, equipment and storage medium | |
KR102624644B1 (en) | Method of estimating the location of a moving object using vector map | |
CN112507776A (en) | Rapid large-range semantic map construction method | |
CN112598736A (en) | Map construction based visual positioning method and device | |
Su | Vanishing points in road recognition: A review | |
Garcia et al. | A photogrammetric approach for real‐time visual SLAM applied to an omnidirectional system | |
Xu et al. | Feature selection and pose estimation from known planar objects using monocular vision | |
Zhu et al. | Toward the ghosting phenomenon in a stereo-based map with a collaborative RGB-D repair |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |