CN107909612B - Method and system for visual instant positioning and mapping based on 3D point cloud - Google Patents

Method and system for visual instant positioning and mapping based on 3D point cloud

Info

Publication number
CN107909612B
Authority
CN
China
Prior art keywords
straight line
picture frame
point cloud
frame
map
Prior art date
Legal status
Active
Application number
CN201711252235.XA
Other languages
Chinese (zh)
Other versions
CN107909612A (en
Inventor
李仕杰
林伟
Current Assignee
Uisee Technologies Beijing Co Ltd
Original Assignee
Uisee Technologies Beijing Co Ltd
Priority date
Filing date
Publication date
Application filed by Uisee Technologies Beijing Co Ltd
Priority to CN201711252235.XA
Publication of CN107909612A
Application granted
Publication of CN107909612B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00
    • G01C 21/20 Instruments for performing navigational calculations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects

Abstract

The application aims to provide a method and a system for visual instant positioning and mapping based on a 3D point cloud, the method specifically comprising: determining camera pose information of a newly acquired picture frame; detecting, based on the camera pose information, whether the picture frame is a key frame; and, if the picture frame is a key frame, fitting a 3D straight line in the map from the 3D point cloud corresponding to the picture frame. On top of existing point-feature-based vSLAM methods, the method provides a new scheme based on point and line features: the feature points extracted by the direct method show obvious gradient changes at edges, which facilitates extracting straight lines in three-dimensional space; moreover, because the straight lines are detected in the point cloud in three-dimensional space, the detection of unmatched straight lines is reduced and the computation of straight-line triangulation can be omitted.

Description

Method and system for visual instant positioning and mapping based on 3D point cloud
Technical Field
The application relates to the field of intelligent driving, and in particular to a technology for visual instant positioning and mapping based on a 3D point cloud.
Background
In simultaneous localization and mapping (SLAM), an intelligent device such as a robot starts to move from an unknown position in an unknown environment, localizes itself during the movement based on pose estimation and the map, and at the same time builds an incremental map on the basis of its own localization, thereby realizing autonomous localization and navigation of the robot. Because of its important theoretical and application value, simultaneous localization and mapping has been regarded by many scholars as a key to realizing a truly autonomous mobile robot or intelligent driving.
Compared with the earlier practice of positioning and mapping with a laser radar, positioning and mapping with a camera as the sensor has gradually become mainstream and is called visual simultaneous localization and mapping (vSLAM). Conventional vSLAM methods mainly comprise the indirect method, which is based on feature points and minimizes the reprojection error of matched points, and the direct method, which is based on pixel intensities and minimizes the photometric error. Both methods rely on the extraction and matching of point features and can handle scenes rich in texture information well.
Disclosure of Invention
One objective of the present application is to provide a method and system for visual instant positioning and mapping based on 3D point cloud.
According to one aspect of the present application, there is provided a method for visual instant positioning and mapping, the method comprising:
determining camera pose information of a newly acquired picture frame;
detecting whether the picture frame is a key frame based on the camera pose information;
and if the picture frame is a key frame, fitting and generating a 3D straight line in the map according to the 3D point cloud corresponding to the picture frame.
According to one aspect of the present application, there is provided a system for visual instant positioning and mapping, the system comprising:
the pose determining module is used for determining the camera pose information of the newly acquired picture frame;
a key frame detection module for detecting whether the picture frame is a key frame based on the camera pose information;
and the straight line fitting module is used for fitting and generating a 3D straight line in the map according to the 3D point cloud corresponding to the picture frame if the picture frame is the key frame.
According to one aspect of the present application, there is provided an apparatus for visual instant positioning and mapping, the apparatus comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to perform:
determining camera pose information of a newly acquired picture frame;
detecting whether the picture frame is a key frame based on the camera pose information;
and if the picture frame is a key frame, fitting and generating a 3D straight line in the map according to the 3D point cloud corresponding to the picture frame.
According to an aspect of the application, there is provided a computer-readable medium comprising instructions that, when executed, cause a system to:
determining camera pose information of a newly acquired picture frame;
detecting whether the picture frame is a key frame based on the camera pose information;
and if the picture frame is a key frame, fitting and generating a 3D straight line in the map according to the 3D point cloud corresponding to the picture frame.
Compared with the prior art, the method provides a new scheme based on point and line features on top of existing point-feature-based vSLAM methods: the feature points extracted by the direct method show obvious gradient changes at edges, which facilitates extracting straight lines in three-dimensional space; moreover, because the straight lines are detected in the point cloud in three-dimensional space, the detection of unmatched straight lines is reduced and the computation of straight-line triangulation can be omitted.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 illustrates a flow diagram of a method for visual instant positioning and mapping according to one embodiment of the present application;
FIG. 2 illustrates sub-steps of one step in FIG. 1;
FIG. 3 illustrates a system block diagram for visual instant positioning and mapping according to one embodiment of the present application;
fig. 4 illustrates an exemplary system according to various embodiments of the present application.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
The present application is described in further detail below with reference to the attached figures.
In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The device referred to in this application includes, but is not limited to, a user device, a network device, or a device formed by integrating a user device and a network device through a network. The user device includes, but is not limited to, any mobile electronic product capable of human-computer interaction with a user (for example, through a touch panel), such as a smart phone or a tablet computer, and the mobile electronic product may employ any operating system, such as the Android operating system or the iOS operating system. The network device includes an electronic device capable of automatically performing numerical calculation and information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like. The network device includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud of multiple servers; here, the cloud is composed of a large number of computers or network servers based on cloud computing, where cloud computing is a kind of distributed computing in which one virtual supercomputer consists of a collection of loosely coupled computers. The network includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN network, a wireless ad hoc network, and the like. Preferably, the device may also be a program running on the user device, the network device, or a device formed by integrating the user device and the network device, the touch terminal, or the network device and the touch terminal through a network.
Of course, those skilled in the art will appreciate that the foregoing is by way of example only, and that other existing or future devices, which may be suitable for use in the present application, are also encompassed within the scope of the present application and are hereby incorporated by reference.
In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
Fig. 1 shows a method for visual instant positioning and mapping according to the present application, the method comprising step S11, step S12 and step S13. In step S11, the visual instant positioning and mapping system determines the camera pose information of a newly acquired picture frame; in step S12, the visual instant positioning and mapping system detects whether the picture frame is a key frame based on the camera pose information; in step S13, if the picture frame is a key frame, the visual instant positioning and mapping system fits the 3D point cloud corresponding to the picture frame to generate a 3D straight line in the map.
Specifically, in step S11, the visual instant positioning and mapping system determines the camera pose information of the newly acquired picture frame. For example, the visual instant positioning and mapping system receives a new picture frame, matches feature points between the picture frame and other frames, and obtains the camera pose information of the current picture frame by minimizing the reprojection error according to the matching result. For another example, the visual instant positioning and mapping system receives a new picture frame, performs direct image registration between the new picture frame and the previous picture frame, and determines the camera pose information of the picture frame using an image pyramid and a coarse-to-fine tracking scheme. In the embodiments of the present disclosure, the latter approach is taken as an example, so as to acquire camera pose information with higher accuracy.
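The coarse-to-fine tracking outlined above can be sketched as follows. This is only a structural sketch: refine_pose_at_level is a hypothetical placeholder for one photometric Gauss-Newton alignment pass and is not a function defined in the patent.

```python
import cv2
import numpy as np

def track_frame(prev_img, new_img, init_pose, levels=4):
    """Coarse-to-fine direct tracking: build image pyramids and refine the
    relative camera pose from the coarsest level down to the finest one."""
    pyr_prev, pyr_new = [prev_img], [new_img]
    for _ in range(levels - 1):                     # pyramid, coarsest level last
        pyr_prev.append(cv2.pyrDown(pyr_prev[-1]))
        pyr_new.append(cv2.pyrDown(pyr_new[-1]))

    pose = np.asarray(init_pose, dtype=float)       # 4x4 rigid-body initial value
    for lvl in reversed(range(levels)):             # coarse -> fine
        pose = refine_pose_at_level(pyr_prev[lvl], pyr_new[lvl], pose)
    return pose

def refine_pose_at_level(ref_img, cur_img, pose):
    """Hypothetical placeholder: one photometric Gauss-Newton refinement pass."""
    return pose
```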
In step S12, the visual instant positioning and mapping system detects whether the picture frame is a key frame based on the camera pose information. For example, the visual instant positioning and mapping system detects whether the current picture frame is a key frame according to the camera pose information of the current picture frame and its association with other key frames.
In step S13, if the picture frame is a key frame, the visual instant positioning and mapping system fits the 3D point cloud corresponding to the picture frame to generate a 3D straight line in the map. For example, if the picture frame is detected and determined to be a key frame, the visual instant positioning and mapping system projects the points and lines in the map onto the picture frame according to the camera pose of the picture frame to generate the corresponding 3D point cloud, and generates a 3D straight line in the map by fitting the 3D point cloud.
For example, the visual instant positioning and mapping system receives a new picture frame and obtains the initial camera pose information of the current picture frame by a pixel-intensity-based direct method or a feature-point-based indirect method according to the relation between the picture frame and other picture frames. Using the initial camera pose information as an initial value, it projects the point and line features in the map onto the current picture frame and obtains a more accurate pose by computing the matched point and line features corresponding to the projected point and line features, wherein a matched point feature is the corresponding point closest to the projected point feature, and a matched line feature is a straight line detected by the LSD (Line Segment Detector) algorithm in the neighborhood of the projected line feature. The visual instant positioning and mapping system selects a plurality of key frames that are close to the current frame in time and space, and detects whether the current frame is a key frame based on the association between the camera pose information of the current frame and the plurality of key frames. If the current frame is determined to be a key frame, the visual instant positioning and mapping system projects the key points in those key frames onto the current key frame, determines the depth of each projected point using the neighborhood information around the projected point, generates a semi-dense depth map, and obtains the 3D point cloud corresponding to the picture frame. The visual instant positioning and mapping system uses the RANSAC (Random Sample Consensus) algorithm to fit a 3D straight line in three-dimensional space from the 3D point cloud of the current picture frame; in this process, the Mahalanobis distance is used in order to account for uncertainty. Then, the visual instant positioning and mapping system recovers a more accurate straight line from the inliers using the least-squares method and deletes the corresponding inliers; this recovery process runs iteratively until no more straight lines can be extracted.
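A minimal numpy sketch of the iterative RANSAC line extraction described above. The parameter values, the plain Euclidean point-to-line distance (the text additionally mentions a Mahalanobis distance that accounts for uncertainty) and the SVD-based least-squares refinement are assumptions for illustration only.

```python
import numpy as np

def fit_line_ransac(points, iters=200, inlier_thresh=0.05, min_inliers=30):
    """One RANSAC pass: sample two points, score the candidate line by
    point-to-line distance, refine the best line from its inliers."""
    if len(points) < max(2, min_inliers):
        return None, points
    rng = np.random.default_rng(0)
    best_inliers = None
    for _ in range(iters):
        a, b = points[rng.choice(len(points), 2, replace=False)]
        d = b - a
        if np.linalg.norm(d) < 1e-9:
            continue
        d = d / np.linalg.norm(d)
        diff = points - a
        # Perpendicular distance of each point to the line through a with direction d.
        dist = np.linalg.norm(diff - np.outer(diff @ d, d), axis=1)
        inliers = np.flatnonzero(dist < inlier_thresh)
        if best_inliers is None or len(inliers) > len(best_inliers):
            best_inliers = inliers
    if best_inliers is None or len(best_inliers) < min_inliers:
        return None, points
    # Least-squares refinement: line through the inlier centroid along the principal axis.
    pts = points[best_inliers]
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)
    line = (centroid, vt[0])
    remaining = np.delete(points, best_inliers, axis=0)  # delete the used inliers
    return line, remaining

def extract_lines(points):
    """Iteratively extract 3D lines from the point cloud until none is found."""
    lines = []
    while True:
        line, points = fit_line_ransac(points)
        if line is None:
            return lines
        lines.append(line)
```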
In some embodiments, the method further comprises step S14 (not shown). In step S14, the visual instant positioning and mapping system optimizes the point and line features in the map and the camera pose information. For example, the visual instant positioning and mapping system optimizes the point and line features in the map and the camera pose information of the current frame, and the optimization methods include but are not limited to: global optimization and local optimization.
For example, after the visual instant positioning and mapping system establishes new map points and lines, it performs optimization on the point and line features in the map and the second pose information of the current frame. For efficiency, the visual instant positioning and mapping system uses a sliding-window filter to locally optimize the point-line features and the second pose information, which specifically comprises the following steps:
1) for the point-line features and the second pose information, the visual instant positioning and mapping system optimizes, in a weighted manner, the photometric errors of a plurality of point features and the geometric errors of the straight lines, wherein the error of a point comprises the spatial distance error from each projected point to its corresponding point in the image, and the error of a straight line comprises the spatial distance error from the projected straight line to the corresponding straight line in the image;
2) a Huber error function and a gradient-based weight are adopted to suppress the influence of outliers, where w_p denotes the gradient-based weight, c is a constant, ∇I_p is the gradient change of the pixel value, and δ is a threshold that selects between the objective-function forms used when |p| ≤ δ and when |p| > δ (the formulas themselves appear only as images in the source; see the reconstruction after this list);
3) optimizing by adopting a Gauss-Newton optimization method;
4) the consistency of the system is ensured by adopting first-order Jacobian approximation.
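The weight and objective-function formulas in step 2) are not reproduced in the extracted text. A plausible reconstruction, assuming the gradient-dependent weighting and Huber norm commonly used in direct vSLAM pipelines (these exact forms are an assumption, not taken from the patent), is:

```latex
% Assumed gradient-based weight (c a constant, \nabla I_p the image gradient at pixel p)
w_p = \frac{c^2}{c^2 + \lVert \nabla I_p \rVert_2^2}

% Assumed Huber objective on a residual p with threshold \delta
\rho_\delta(p) =
\begin{cases}
  \tfrac{1}{2}\,p^2,                                        & \lvert p \rvert \le \delta,\\
  \delta \left( \lvert p \rvert - \tfrac{\delta}{2} \right), & \lvert p \rvert > \delta.
\end{cases}
```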
In some embodiments, step S13 further includes a sub-step S133 (not shown). In sub-step S133, the visual instant positioning and mapping system determines a 2D straight line associated with the 3D straight line according to the projection region of the 3D straight line in the picture frame; the method further comprises a step S15 (not shown), in which the visual instant positioning and mapping system presents the map according to the 3D straight line and the 2D straight line associated with it. For example, after the visual instant positioning and mapping system generates the corresponding 3D straight line from the 3D point cloud corresponding to the current picture frame, it detects a 2D straight line in the projection neighborhood of the 3D straight line on the current picture frame image and associates the 3D straight line with the corresponding 2D straight line; the visual instant positioning and mapping system then presents the 3D straight line in the map according to this association.
For example, after the visual instant positioning and mapping system generates the corresponding 3D straight line from the 3D point cloud corresponding to the current picture frame, it detects a 2D straight line using the LSD algorithm in the projection neighborhood of the 3D straight line on the current picture frame image, associates the 3D straight line with the position of the corresponding 2D straight line, and adaptively adjusts the positions of the other straight lines; then, when presenting the map, the visual instant positioning and mapping system presents the 3D straight lines in the map more accurately according to the association between the 3D straight lines and the 2D straight lines.
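A minimal sketch of this 3D-to-2D line association, assuming a pinhole camera with intrinsics K, a 3x4 pose [R|t], and a list of 2D segments already detected in the image (for example by LSD); the endpoint-distance criterion and the neighborhood threshold are illustrative assumptions, not values from the patent.

```python
import numpy as np

def project_point(K, pose, X):
    """Project a 3D point into the image with intrinsics K and a 3x4 pose [R|t]."""
    x = K @ (pose[:, :3] @ X + pose[:, 3])
    return x[:2] / x[2]

def associate_3d_line(K, pose, line3d, segments2d, max_dist=10.0):
    """Associate a 3D line (pair of 3D endpoints) with the closest detected 2D
    segment, measured by the summed distance between corresponding endpoints."""
    p0 = project_point(K, pose, np.asarray(line3d[0], dtype=float))
    p1 = project_point(K, pose, np.asarray(line3d[1], dtype=float))
    best, best_d = None, max_dist
    for seg in segments2d:                      # seg = (q0, q1), two 2D endpoints
        q0, q1 = np.asarray(seg[0], dtype=float), np.asarray(seg[1], dtype=float)
        d = min(np.linalg.norm(p0 - q0) + np.linalg.norm(p1 - q1),
                np.linalg.norm(p0 - q1) + np.linalg.norm(p1 - q0))
        if d < best_d:
            best, best_d = seg, d
    return best   # None if no 2D segment lies within the projection neighborhood
```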
In some embodiments, as shown in fig. 2, step S13 includes sub-step S131 and sub-step S132. In sub-step S131, if the picture frame is a key frame, the visual instant positioning and mapping system generates the 3D point cloud corresponding to the picture frame by projecting active points in the map onto the picture frame; in sub-step S132, the visual instant positioning and mapping system generates a 3D straight line in the map by fitting the 3D point cloud.
For example, if the current picture frame is a key frame, the visual instant positioning and mapping system takes the pose information of the previous picture frame as an initial value, computes the rigid-body transformation pose of the current frame based on the pose information of the current picture frame, projects the key points in the map onto the current frame, obtains the depth information of each projected point using the neighborhood information around the projected point, generates a semi-dense depth map of the current frame from the depth information of the key points, and thus obtains the 3D point cloud corresponding to the current frame. The visual instant positioning and mapping system uses the RANSAC algorithm to fit a 3D straight line in three-dimensional space from the 3D point cloud of the current picture frame; in this process, the Mahalanobis distance is used in order to account for uncertainty. Then, the visual instant positioning and mapping system recovers a more accurate straight line from the inliers using the least-squares method and deletes the corresponding inliers; this recovery process runs iteratively until no more straight lines can be extracted.
In some embodiments, in sub-step S132, the visual instant positioning and mapping system preprocesses the 3D point cloud and generates a 3D straight line in the map by fitting the preprocessed 3D point cloud. In some embodiments, the preprocessing of the 3D point cloud includes, but is not limited to: the visual instant positioning and mapping system generating new points in the 3D point cloud through interpolation; and the visual instant positioning and mapping system deleting, from the 3D point cloud, points whose distance to an existing straight line in the map is less than a distance threshold. For example, the visual instant positioning and mapping system preprocesses the 3D point cloud corresponding to the current frame, for example by generating new points in the 3D point cloud through interpolation, or by deleting points in the 3D point cloud whose distance to a straight line in the map is less than the distance threshold; the visual instant positioning and mapping system then performs straight-line fitting on the preprocessed 3D point cloud to obtain the corresponding 3D straight lines.
For example, the visual instant positioning and mapping system preprocesses the 3D point cloud corresponding to the current frame, for example by generating new points in the 3D point cloud through interpolation, or by extending an existing straight line in the map and deleting points in the 3D point cloud whose distance to that straight line is less than the distance threshold, so as to obtain a new 3D point cloud; the visual instant positioning and mapping system then performs straight-line fitting on the preprocessed 3D point cloud to obtain the corresponding 3D straight lines.
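A minimal sketch of this preprocessing, with two illustrative choices that the patent does not specify: midpoint insertion between nearby consecutive points as the interpolation step, and a point-to-line distance test against the existing map lines (each given as an origin and a unit direction).

```python
import numpy as np

def preprocess_cloud(points, map_lines, step=0.1, dist_thresh=0.05):
    """Point-cloud preprocessing: densify by interpolation, then drop points
    that lie too close to an existing 3D map line (origin o, unit direction d)."""
    # 1) Interpolation: insert the midpoint between consecutive nearby points.
    extra = [(p + q) / 2.0
             for p, q in zip(points[:-1], points[1:])
             if np.linalg.norm(p - q) < step]
    if extra:
        points = np.vstack([points, extra])

    # 2) Remove points whose distance to any existing map line is below the threshold.
    keep = np.ones(len(points), dtype=bool)
    for o, d in map_lines:
        diff = points - o
        dist = np.linalg.norm(diff - np.outer(diff @ d, d), axis=1)
        keep &= dist >= dist_thresh
    return points[keep]
```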
In some embodiments, step S13 further includes a sub-step S134 (not shown). In sub-step S134, the visual instant positioning and mapping system updates the coordinates of the point features in the map according to the 3D point cloud. For example, for a point feature, the visual instant positioning and mapping system recovers its depth and propagates its uncertainty by triangulation.
For example, the visual instant positioning and mapping system computes, by triangulation, the depth of the recovered point feature in the camera coordinate system of the current frame, according to a pair of matched point features in the two views, the camera pose information between the current frame and the previous frame corresponding to the point feature, and the depth of the point feature in the camera coordinate system of the previous frame image, and propagates the uncertainty information.
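A minimal two-view triangulation sketch, using a linear (DLT) solution under an assumed pinhole model with intrinsics K and 3x4 poses [R|t]; the uncertainty propagation mentioned above is not shown.

```python
import numpy as np

def triangulate_depth(K, pose_prev, pose_cur, uv_prev, uv_cur):
    """Linear (DLT) two-view triangulation: recover the 3D point from a pair of
    matched pixel observations and the two 3x4 camera poses [R|t], then return
    its depth in the current camera frame."""
    P1 = K @ pose_prev
    P2 = K @ pose_cur
    A = np.vstack([
        uv_prev[0] * P1[2] - P1[0],
        uv_prev[1] * P1[2] - P1[1],
        uv_cur[0] * P2[2] - P2[0],
        uv_cur[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    X = X[:3] / X[3]                                    # homogeneous -> Euclidean
    depth = (pose_cur[:, :3] @ X + pose_cur[:, 3])[2]   # z in the current camera frame
    return X, depth
```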
In some embodiments, in step S12, the visual instant positioning and mapping system selects a plurality of key frames, determines the key frame parameters of the picture frame based on the plurality of key frames and the camera pose information, and determines whether the picture frame is a key frame according to the key frame parameters. In some embodiments, the key frame parameters include, but are not limited to: visual field change information, camera translation change information, and exposure time change information. For example, the visual instant positioning and mapping system selects a plurality of key frames that are close to the current frame in time and space, and calculates the key frame parameters of the current picture frame according to the camera pose information of the plurality of key frames and the current picture frame, wherein the key frame parameters include but are not limited to: visual field change information, camera translation change information, and exposure time change information; the visual instant positioning and mapping system then judges whether the picture frame is a key frame according to the key frame parameters corresponding to the current picture frame.
For example, the visual instant positioning and mapping system selects a plurality of key frames that are close to the current frame in time and space according to the related information of the current frame, and determines the key frame parameters of the current picture frame based on the plurality of key frames and the second pose information of the current picture frame, wherein the key frame parameters include:
1) visual field change f (the formula appears only as an image in the source);
2) camera translation change f_t (the formula appears only as an image in the source);
3) exposure time change a (the formula appears only as an image in the source);
In formula 1), f is a distance measure, p denotes the pixel position of a key point of the current frame, and p' denotes the pixel position of the corresponding key point in the plurality of key frames; in formula 2), f_t is a distance measure, p denotes the position of a key point of the current frame, and p_t' denotes the projected position of the corresponding key point from the plurality of key frames; in formula 3), a is a parameter from the photometric calibration.
The visual instant positioning and mapping system determines these three key frame parameters of the current picture frame based on the plurality of key frames and the second pose information of the current frame, computes a weighted sum of the three parameters, and compares it with a predetermined threshold (the weighted-sum formula appears only as an image in the source; a sketch is given below), where w_f, w_ft and w_a are the weights preset by the visual instant positioning and mapping system for the visual field change information, the camera translation change information and the exposure time change, respectively. If the weighted sum of the three key frame parameters is equal to or greater than the predetermined threshold T_kf, the visual instant positioning and mapping system determines that the current picture frame is a key frame.
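A minimal sketch of this weighted-sum key-frame test, assuming the three parameters have already been computed; the weight and threshold values here are placeholders rather than values from the patent.

```python
def is_keyframe(f, f_t, a, w_f=1.0, w_ft=1.0, w_a=1.0, t_kf=1.0):
    """Key-frame decision: weighted sum of visual field change f, camera
    translation change f_t and exposure time change a against threshold T_kf."""
    return w_f * f + w_ft * f_t + w_a * a >= t_kf
```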
It should be understood by those skilled in the art that the key frame parameters are merely examples, and other elements of the key frame parameters that may exist or become known in the future are included in the scope of the present application and are incorporated by reference herein.
In some embodiments, the method further comprises step S15 (not shown). In step S15, if the picture frame is a non-key frame, the visual instant positioning and mapping system updates the coordinates of the point and line features in the map. For example, if the current frame is not a key frame, the visual instant positioning and mapping system uses a probability-based depth filter to update, based on the current picture frame, the depth values of the points and 3D line endpoints in the map.
For example, if the current frame is not a key frame, then for a point {p, u} on another key frame whose depth has not yet been determined, the epipolar line L_p corresponding to p in the current frame is found according to the second pose information, the point u' most similar to u is searched for along the epipolar line, the depth x and the uncertainty τ are obtained by triangulation, and the depth estimate of p is then updated using a Bayesian probability model. When the depth estimate of p converges, its three-dimensional coordinates are computed and added to the map.
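A minimal sketch of such a per-point depth filter. The patent refers to a Bayesian probability model; a plain Gaussian fusion of successive (depth, uncertainty) measurements is used here as a simplified stand-in, and the convergence threshold is an assumption.

```python
class DepthFilter:
    """Per-point depth filter: fuse successive (depth, variance) measurements
    obtained from epipolar search and triangulation; add the point to the map
    once the estimate has converged."""
    def __init__(self, init_depth, init_var):
        self.mu = float(init_depth)    # current depth estimate
        self.var = float(init_var)     # current depth variance

    def update(self, x, tau2):
        """Fuse a new measurement with depth x and variance tau2 (Gaussian fusion)."""
        fused_var = self.var * tau2 / (self.var + tau2)
        self.mu = fused_var * (self.mu / self.var + x / tau2)
        self.var = fused_var

    def converged(self, var_thresh=1e-4):
        return self.var < var_thresh
```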
Fig. 3 shows a system for visual instant positioning and mapping according to the present application, which includes a pose determination module 11, a key frame detection module 12 and a straight line fitting module 13: the pose determination module 11 is configured to determine the camera pose information of a newly acquired picture frame; the key frame detection module 12 is configured to detect whether the picture frame is a key frame based on the camera pose information; and the straight line fitting module 13 is configured to, if the picture frame is a key frame, fit the 3D point cloud corresponding to the picture frame to generate a 3D straight line in the map.
Specifically, the pose determination module 11 is configured to determine the camera pose information of the newly acquired picture frame. For example, the visual instant positioning and mapping system receives a new picture frame, matches feature points between the picture frame and other frames, and obtains the camera pose information of the current picture frame by minimizing the reprojection error according to the matching result. For another example, the visual instant positioning and mapping system receives a new picture frame, performs direct image registration between the new picture frame and the previous picture frame, and determines the camera pose information of the picture frame using an image pyramid and a coarse-to-fine tracking scheme. In the embodiments of the present disclosure, the latter approach is taken as an example, so as to acquire camera pose information with higher accuracy.
The key frame detection module 12 is configured to detect whether the picture frame is a key frame based on the camera pose information. For example, the visual instant positioning and mapping system detects whether the current picture frame is a key frame according to the camera pose information of the current picture frame and its association with other key frames.
The straight line fitting module 13 is configured to, if the picture frame is a key frame, fit the 3D point cloud corresponding to the picture frame to generate a 3D straight line in the map. For example, if the picture frame is detected and determined to be a key frame, the visual instant positioning and mapping system projects the points and lines in the map onto the picture frame according to the camera pose of the picture frame to generate the corresponding 3D point cloud, and generates a 3D straight line in the map by fitting the 3D point cloud.
For example, the visual instant positioning and mapping system receives a new picture frame and obtains the initial camera pose information of the current picture frame by a pixel-intensity-based direct method or a feature-point-based indirect method according to the relation between the picture frame and other picture frames. Using the initial camera pose information as an initial value, it projects the point and line features in the map onto the current picture frame and obtains a more accurate pose by computing the matched point and line features corresponding to the projected point and line features, wherein a matched point feature is the corresponding point closest to the projected point feature, and a matched line feature is a straight line detected by the LSD (Line Segment Detector) algorithm in the neighborhood of the projected line feature. The visual instant positioning and mapping system selects a plurality of key frames that are close to the current frame in time and space, and detects whether the current frame is a key frame based on the association between the camera pose information of the current frame and the plurality of key frames. If the current frame is determined to be a key frame, the visual instant positioning and mapping system projects the key points in those key frames onto the current key frame, determines the depth of each projected point using the neighborhood information around the projected point, generates a semi-dense depth map, and obtains the 3D point cloud corresponding to the picture frame. The visual instant positioning and mapping system uses the RANSAC (Random Sample Consensus) algorithm to fit a 3D straight line in three-dimensional space from the 3D point cloud of the current picture frame; in this process, the Mahalanobis distance is used in order to account for uncertainty. Then, the visual instant positioning and mapping system recovers a more accurate straight line from the inliers using the least-squares method and deletes the corresponding inliers; this recovery process runs iteratively until no more straight lines can be extracted.
In some embodiments, the system further includes an optimization module 14 (not shown). The optimization module 14 is configured to optimize the point and line features in the map and the camera pose information. For example, the visual instant positioning and mapping system optimizes the point and line features in the map and the camera pose information of the current frame, and the optimization methods include but are not limited to: global optimization and local optimization.
For example, after the visual instant positioning and mapping system establishes new map points and lines, it performs optimization on the point and line features in the map and the second pose information of the current frame. For efficiency, the visual instant positioning and mapping system uses a sliding-window filter to locally optimize the point-line features and the second pose information, which specifically comprises the following steps:
1) for the point-line features and the second pose information, the visual instant positioning and mapping system optimizes, in a weighted manner, the photometric errors of a plurality of point features and the geometric errors of the straight lines, wherein the error of a point comprises the spatial distance error from each projected point to its corresponding point in the image, and the error of a straight line comprises the spatial distance error from the projected straight line to the corresponding straight line in the image;
2) a Huber error function and a gradient-based weight are adopted to suppress the influence of outliers, where w_p denotes the gradient-based weight, c is a constant, ∇I_p is the gradient change of the pixel value, and δ is a threshold that selects between the objective-function forms used when |p| ≤ δ and when |p| > δ (the formulas appear only as images in the source and are the same as those given above for the method embodiment);
3) optimizing by adopting a Gauss-Newton optimization method;
4) the consistency of the system is ensured by adopting first-order Jacobian approximation.
In some embodiments, the straight line fitting module further comprises an association unit 133 (not shown). The association unit 133 is configured to determine a 2D straight line associated with the 3D straight line according to the projection region of the 3D straight line in the picture frame; the system further comprises a rendering module 15 (not shown), which is configured to render the map according to the 3D straight line and the 2D straight line associated with it. For example, after the visual instant positioning and mapping system generates the corresponding 3D straight line from the 3D point cloud corresponding to the current picture frame, it detects a 2D straight line in the projection neighborhood of the 3D straight line on the current picture frame image and associates the 3D straight line with the corresponding 2D straight line; the visual instant positioning and mapping system then presents the 3D straight line in the map according to this association.
For example, after the visual instant positioning and mapping system generates the corresponding 3D straight line from the 3D point cloud corresponding to the current picture frame, it detects a 2D straight line using the LSD algorithm in the projection neighborhood of the 3D straight line on the current picture frame image, associates the 3D straight line with the position of the corresponding 2D straight line, and adaptively adjusts the positions of the other straight lines; then, when presenting the map, the visual instant positioning and mapping system presents the 3D straight lines in the map more accurately according to the association between the 3D straight lines and the 2D straight lines.
In some embodiments, as shown in fig. 2, the straight line fitting module 13 includes a point cloud generating unit 131 and a straight line fitting unit 132. The point cloud generating unit 131 is configured to, if the picture frame is a key frame, generate the 3D point cloud corresponding to the picture frame by projecting active points in the map onto the picture frame; the straight line fitting unit 132 is configured to generate a 3D straight line in the map by fitting the 3D point cloud.
For example, if the current picture frame is a key frame, the visual instant positioning and mapping system takes the pose information of the previous picture frame as an initial value, computes the rigid-body transformation pose of the current frame based on the pose information of the current picture frame, projects the key points in the map onto the current frame, obtains the depth information of each projected point using the neighborhood information around the projected point, generates a semi-dense depth map of the current frame from the depth information of the key points, and thus obtains the 3D point cloud corresponding to the current frame. The visual instant positioning and mapping system uses the RANSAC algorithm to fit a 3D straight line in three-dimensional space from the 3D point cloud of the current picture frame; in this process, the Mahalanobis distance is used in order to account for uncertainty. Then, the visual instant positioning and mapping system recovers a more accurate straight line from the inliers using the least-squares method and deletes the corresponding inliers; this recovery process runs iteratively until no more straight lines can be extracted.
In some embodiments, the straight line fitting unit 132 is configured to preprocess the 3D point cloud and to generate a 3D straight line in the map by fitting the preprocessed 3D point cloud. In some embodiments, the preprocessing of the 3D point cloud includes, but is not limited to: the visual instant positioning and mapping system generating new points in the 3D point cloud through interpolation; and the visual instant positioning and mapping system deleting, from the 3D point cloud, points whose distance to an existing straight line in the map is less than a distance threshold. For example, the visual instant positioning and mapping system preprocesses the 3D point cloud corresponding to the current frame, for example by generating new points in the 3D point cloud through interpolation, or by deleting points in the 3D point cloud whose distance to a straight line in the map is less than the distance threshold; the visual instant positioning and mapping system then performs straight-line fitting on the preprocessed 3D point cloud to obtain the corresponding 3D straight lines.
For example, the visual instant positioning and mapping system preprocesses the 3D point cloud corresponding to the current frame, for example by generating new points in the 3D point cloud through interpolation, or by extending an existing straight line in the map and deleting points in the 3D point cloud whose distance to that straight line is less than the distance threshold, so as to obtain a new 3D point cloud; the visual instant positioning and mapping system then performs straight-line fitting on the preprocessed 3D point cloud to obtain the corresponding 3D straight lines.
In some embodiments, the straight line fitting module 13 further comprises a coordinate updating unit 134 (not shown). The coordinate updating unit 134 is configured to update the coordinates of the point features in the map according to the 3D point cloud. For example, for a point feature, the visual instant positioning and mapping system recovers its depth and propagates its uncertainty by triangulation.
For example, the visual instant positioning and mapping system computes, by triangulation, the depth of the recovered point feature in the camera coordinate system of the current frame, according to a pair of matched point features in the two views, the camera pose information between the current frame and the previous frame corresponding to the point feature, and the depth of the point feature in the camera coordinate system of the previous frame image, and propagates the uncertainty information.
In some embodiments, the key frame detection module 12 is configured to select a plurality of key frames, determine the key frame parameters of the picture frame based on the plurality of key frames and the camera pose information, and determine whether the picture frame is a key frame according to the key frame parameters. In some embodiments, the key frame parameters include, but are not limited to: visual field change information, camera translation change information, and exposure time change information. For example, the visual instant positioning and mapping system selects a plurality of key frames that are close to the current frame in time and space, and calculates the key frame parameters of the current picture frame according to the camera pose information of the plurality of key frames and the current picture frame, wherein the key frame parameters include but are not limited to: visual field change information, camera translation change information, and exposure time change information; the visual instant positioning and mapping system then judges whether the picture frame is a key frame according to the key frame parameters corresponding to the current picture frame.
For example, the visual instant positioning and mapping system selects a plurality of key frames that are close to the current frame in time and space according to the related information of the current frame, and determines the key frame parameters of the current picture frame based on the plurality of key frames and the second pose information of the current picture frame, wherein the key frame parameters include:
4) visual field change f (the formula appears only as an image in the source);
5) camera translation change f_t (the formula appears only as an image in the source);
6) exposure time change a (the formula appears only as an image in the source);
In formula 4), f is a distance measure, p denotes the pixel position of a key point of the current frame, and p' denotes the pixel position of the corresponding key point in the plurality of key frames; in formula 5), f_t is a distance measure, p denotes the position of a key point of the current frame, and p_t' denotes the projected position of the corresponding key point from the plurality of key frames; in formula 6), a is a parameter from the photometric calibration.
The visual instant positioning and mapping system determines these three key frame parameters of the current picture frame based on the plurality of key frames and the second pose information of the current frame, computes a weighted sum of the three parameters, and compares it with a predetermined threshold (the weighted-sum formula appears only as an image in the source), where w_f, w_ft and w_a are the weights preset by the visual instant positioning and mapping system for the visual field change information, the camera translation change information and the exposure time change, respectively. If the weighted sum of the three key frame parameters is equal to or greater than the predetermined threshold T_kf, the visual instant positioning and mapping system determines that the current picture frame is a key frame.
It should be understood by those skilled in the art that the key frame parameters are merely examples, and other elements of the key frame parameters that may exist or become known in the future are included in the scope of the present application and are incorporated by reference herein.
In some embodiments, the system further comprises a coordinate update module 15 (not shown). The coordinate update module 15 is configured to update the coordinates of the point and line features in the map if the picture frame is a non-key frame. For example, if the current frame is not a key frame, the visual instant positioning and mapping system uses a probability-based depth filter to update, based on the current picture frame, the depth values of the points and 3D line endpoints in the map.
For example, if the current frame is not a key frame, then for a point {p, u} on another key frame whose depth has not yet been determined, the epipolar line L_p corresponding to p in the current frame is found according to the second pose information, the point u' most similar to u is searched for along the epipolar line, the depth x and the uncertainty τ are obtained by triangulation, and the depth estimate of p is then updated using a Bayesian probability model. When the depth estimate of p converges, its three-dimensional coordinates are computed and added to the map.
The present application also provides a computer readable storage medium having stored thereon computer code which, when executed, performs a method as in any one of the preceding.
The present application also provides a computer program product, which when executed by a computer device, performs the method of any of the preceding claims.
The present application further provides a computer device, comprising:
one or more processors;
a memory for storing one or more computer programs;
the one or more computer programs, when executed by the one or more processors, cause the one or more processors to implement the method of any preceding claim.
In some embodiments, as shown in FIG. 4, the system 300 can be implemented as any one of the computer devices in the embodiments shown in the above figures or in other described embodiments. In some embodiments, system 300 may include one or more computer-readable media (e.g., system memory or NVM/storage 320) having instructions and one or more processors (e.g., processor(s) 305) coupled with the one or more computer-readable media and configured to execute the instructions to implement modules to perform the actions described herein.
For one embodiment, system control module 310 may include any suitable interface controllers to provide any suitable interface to at least one of processor(s) 305 and/or any suitable device or component in communication with system control module 310.
The system control module 310 may include a memory controller module 330 to provide an interface to the system memory 315. Memory controller module 330 may be a hardware module, a software module, and/or a firmware module.
System memory 315 may be used, for example, to load and store data and/or instructions for system 300. For one embodiment, system memory 315 may include any suitable volatile memory, such as suitable DRAM. In some embodiments, the system memory 315 may include a double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, system control module 310 may include one or more input/output (I/O) controllers to provide an interface to NVM/storage 320 and communication interface(s) 325.
For example, NVM/storage 320 may be used to store data and/or instructions. NVM/storage 320 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more Hard Disk Drives (HDDs), one or more Compact Disc (CD) drives, and/or one or more Digital Versatile Disc (DVD) drives).
NVM/storage 320 may include storage resources that are physically part of the device on which system 300 is installed or may be accessed by the device and not necessarily part of the device. For example, NVM/storage 320 may be accessible over a network via communication interface(s) 325.
Communication interface(s) 325 may provide an interface for system 300 to communicate over one or more networks and/or with any other suitable device. System 300 may wirelessly communicate with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols.
For one embodiment, at least one of the processor(s) 305 may be packaged together with logic for one or more controller(s) (e.g., memory controller module 330) of the system control module 310. For one embodiment, at least one of the processor(s) 305 may be packaged together with logic for one or more controller(s) of the system control module 310 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 305 may be integrated on the same die with logic for one or more controller(s) of the system control module 310. For one embodiment, at least one of the processor(s) 305 may be integrated on the same die with logic for one or more controller(s) of the system control module 310 to form a system on a chip (SoC).
In various embodiments, system 300 may be, but is not limited to being: a server, a workstation, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.). In various embodiments, system 300 may have more or fewer components and/or different architectures. For example, in some embodiments, system 300 includes one or more cameras, a keyboard, a Liquid Crystal Display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an Application Specific Integrated Circuit (ASIC), and speakers.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, for example, implemented using Application Specific Integrated Circuits (ASICs), general purpose computers or any other similar hardware devices. In one embodiment, the software programs of the present application may be executed by a processor to implement the steps or functions described above. Likewise, the software programs (including associated data structures) of the present application may be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Additionally, some of the steps or functions of the present application may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
In addition, some of the present application may be implemented as a computer program product, such as computer program instructions, which when executed by a computer, may invoke or provide methods and/or techniques in accordance with the present application through the operation of the computer. Those skilled in the art will appreciate that the form in which the computer program instructions reside on a computer-readable medium includes, but is not limited to, source files, executable files, installation package files, and the like, and that the manner in which the computer program instructions are executed by a computer includes, but is not limited to: the computer directly executes the instruction, or the computer compiles the instruction and then executes the corresponding compiled program, or the computer reads and executes the instruction, or the computer reads and installs the instruction and then executes the corresponding installed program. Computer-readable media herein can be any available computer-readable storage media or communication media that can be accessed by a computer.
Communication media includes media by which communication signals, including, for example, computer readable instructions, data structures, program modules, or other data, are transmitted from one system to another. Communication media may include conductive transmission media such as cables and wires (e.g., fiber optics, coaxial, etc.) and wireless (non-conductive transmission) media capable of propagating energy waves such as acoustic, electromagnetic, RF, microwave, and infrared. Computer readable instructions, data structures, program modules, or other data may be embodied in a modulated data signal, for example, in a wireless medium such as a carrier wave or similar mechanism such as is embodied as part of spread spectrum techniques. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. The modulation may be analog, digital or hybrid modulation techniques.
By way of example, and not limitation, computer-readable storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. For example, computer-readable storage media include, but are not limited to, volatile memory such as random access memory (RAM, DRAM, SRAM); and non-volatile memory such as flash memory, various read-only memories (ROM, PROM, EPROM, EEPROM), magnetic and ferromagnetic/ferroelectric memories (MRAM, FeRAM); and magnetic and optical storage devices (hard disk, tape, CD, DVD); or other now known media or later developed that can store computer-readable information/data for use by a computer system.
An embodiment according to the present application comprises an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform a method and/or a solution according to the aforementioned embodiments of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by a single unit or means in software or hardware. The terms first, second, etc. are used to denote names and do not imply any particular order.
Various aspects of the embodiments are defined in the claims. These and other aspects of the embodiments are specified in the following numbered clauses:
1. A method for visual instant positioning and mapping, wherein the method comprises:
determining camera pose information of a newly acquired picture frame;
detecting whether the picture frame is a key frame based on the camera pose information;
and if the picture frame is a key frame, fitting and generating a 3D straight line in the map according to the 3D point cloud corresponding to the picture frame.
2. The method of clause 1, wherein the method further comprises:
and optimizing the point-line features in the map and the camera pose information.
3. The method according to clause 1, wherein, if the picture frame is a key frame, the fitting and generating a 3D straight line in a map according to the 3D point cloud corresponding to the picture frame further comprises:
determining a 2D straight line associated with the 3D straight line according to a projection area of the 3D straight line in the picture frame;
wherein the method further comprises:
presenting the map according to the 3D straight line and a 2D straight line associated with the 3D straight line.
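As an illustration of clause 3, the sketch below (Python with NumPy) projects the two endpoints of a 3D straight line into a picture frame with a standard pinhole model; the resulting 2D segment delimits the projection area in which the associated 2D straight line can be searched. The endpoint representation, the pinhole model, and every name and value used here are assumptions for illustration, not details fixed by the patent.

import numpy as np

def project_line_endpoints(K, R, t, p_start, p_end):
    """Project a 3D line segment (world coordinates) into the image plane of a frame
    with intrinsics K and pose (R, t); the returned 2D segment outlines the projection
    area used to associate a 2D straight line with the 3D straight line."""
    def pinhole(p_world):
        p_cam = R @ p_world + t            # world -> camera coordinates
        uvw = K @ p_cam                    # camera -> homogeneous pixel coordinates
        return uvw[:2] / uvw[2]
    return pinhole(p_start), pinhole(p_end)

# Usage with made-up intrinsics and an identity pose:
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
u0, u1 = project_line_endpoints(K, np.eye(3), np.zeros(3),
                                np.array([1.0, 0.0, 5.0]),
                                np.array([1.0, 1.0, 6.0]))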
4. The method according to any one of clauses 1 to 3, wherein, if the picture frame is a key frame, the fitting and generating a 3D straight line in a map according to the 3D point cloud corresponding to the picture frame comprises:
if the picture frame is a key frame, generating a 3D point cloud corresponding to the picture frame by projecting an activation point in a map to the picture frame;
and fitting and generating a 3D straight line in the map according to the 3D point cloud.
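A minimal sketch of the selection step in clause 4, assuming a pinhole camera: the activation points of the map are projected into the key frame, and those falling in front of the camera and inside the image form the 3D point cloud corresponding to that frame. The visibility test and all parameter names below are assumptions for illustration.

import numpy as np

def cloud_for_keyframe(map_points, K, R, t, width, height):
    """Keep the map's activation points whose projection lies inside the key frame."""
    p_cam = (R @ map_points.T).T + t                       # world -> camera coordinates
    front = p_cam[:, 2] > 1e-6                             # points in front of the camera
    pts, cam = map_points[front], p_cam[front]
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                            # perspective division
    inside = ((uv[:, 0] >= 0) & (uv[:, 0] < width) &
              (uv[:, 1] >= 0) & (uv[:, 1] < height))
    return pts[inside]                                     # 3D point cloud for this frame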
5. The method of clause 4, wherein the fitting and generating a 3D straight line in the map according to the 3D point cloud comprises:
preprocessing the 3D point cloud;
and fitting and generating a 3D straight line in the map according to the preprocessed 3D point cloud.
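Clause 5 leaves the fitting step itself open; one common choice, shown below as a hedged sketch, is to fit a single 3D straight line to a (preprocessed) cluster of 3D points by taking its centroid and principal direction. Using an SVD/PCA fit here is an assumption, not the patent's prescribed estimator.

import numpy as np

def fit_line_3d(points):
    """Fit one 3D straight line to an (N, 3) array of points: the line passes through
    the centroid along the direction of largest variance (first right singular vector)."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    direction = vt[0] / np.linalg.norm(vt[0])
    return centroid, direction          # parametric line: x(s) = centroid + s * direction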
6. The method of clause 5, wherein the pre-processing the 3D point cloud comprises at least any one of:
generating a new point in the 3D point cloud by interpolation processing;
and deleting, from the 3D point cloud, points whose distance to an existing straight line in the map is less than a distance threshold.
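The two preprocessing options in clause 6 can be pictured with the sketch below: a simple midpoint interpolation densifies sparse stretches of the cloud, and points lying closer than a distance threshold to a 3D straight line already in the map are removed. The interpolation scheme, the line representation (point plus unit direction), and the numeric defaults are assumptions for illustration only.

import numpy as np

def preprocess_cloud(points, existing_lines, dist_thresh=0.05, gap=0.1):
    """Densify a 3D point cloud by midpoint interpolation and drop points that are
    already explained by an existing 3D straight line in the map."""
    # (a) Interpolation: insert a midpoint between consecutive points that are far apart.
    dense = [points[0]]
    for p, q in zip(points[:-1], points[1:]):
        if np.linalg.norm(q - p) > gap:
            dense.append((p + q) / 2.0)
        dense.append(q)
    dense = np.asarray(dense)

    # (b) Pruning: remove points whose distance to any existing line (p0, unit d)
    #     falls below the threshold.
    keep = np.ones(len(dense), dtype=bool)
    for p0, d in existing_lines:
        d = d / np.linalg.norm(d)
        diff = dense - p0
        dist = np.linalg.norm(diff - (diff @ d)[:, None] * d, axis=1)
        keep &= dist >= dist_thresh
    return dense[keep]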
7. The method according to clause 1, wherein, if the picture frame is a key frame, the fitting and generating a 3D straight line in a map according to the 3D point cloud corresponding to the picture frame further comprises:
and updating coordinates of point features in the map according to the 3D point cloud.
8. The method of clause 1, wherein the detecting whether the picture frame is a key frame based on the camera pose information comprises:
selecting a plurality of key frames, determining key frame parameters of the picture frame based on the plurality of key frames and the camera pose information, and determining whether the picture frame is a key frame according to the key frame parameters.
9. The method of clause 8, wherein the key frame parameters include at least one of view change information, camera translation change information, exposure time change information.
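Clauses 8 and 9 combine several signals into a key-frame decision. The sketch below computes one plausible score from the three parameter types named in clause 9 — view (rotation) change, camera translation change, and exposure-time change — relative to the most recent key frame; the weights, the threshold, and the use of only the latest key frame are assumptions, not values given by the patent.

import numpy as np

def keyframe_score(pose_kf, exposure_kf, pose_new, exposure_new,
                   w_view=1.0, w_trans=1.0, w_expo=1.0):
    """Combine view change, camera translation change and exposure-time change into a
    single score; poses are 4x4 camera-to-world matrices."""
    R_ref, t_ref = pose_kf[:3, :3], pose_kf[:3, 3]
    R_new, t_new = pose_new[:3, :3], pose_new[:3, 3]

    # View change: rotation angle between the two camera orientations.
    cos_a = (np.trace(R_ref.T @ R_new) - 1.0) / 2.0
    view_change = np.arccos(np.clip(cos_a, -1.0, 1.0))

    # Camera translation change: distance between the two camera centres.
    trans_change = np.linalg.norm(t_new - t_ref)

    # Exposure time change: relative change in exposure time.
    expo_change = abs(exposure_new - exposure_kf) / max(exposure_kf, 1e-6)

    return w_view * view_change + w_trans * trans_change + w_expo * expo_change

# The frame could then be treated as a key frame when the score exceeds a threshold,
# e.g. keyframe_score(...) > 1.0 (threshold chosen here purely for illustration).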
10. The method of clause 1, wherein the method further comprises:
and if the picture frame is a non-key frame, updating the coordinates of the point-line features in the map.
11. A system for visual instant positioning and mapping, wherein the system comprises:
the pose determining module is used for determining the camera pose information of the newly acquired picture frame;
a key frame detection module for detecting whether the picture frame is a key frame based on the camera pose information;
and the straight line fitting module is used for fitting and generating a 3D straight line in the map according to the 3D point cloud corresponding to the picture frame if the picture frame is a key frame.
12. The system of clause 11, wherein the system further comprises:
and the optimization module is used for optimizing the point-line features in the map and the camera pose information.
13. The system of clause 11, wherein the line fitting module further comprises:
the association unit is used for determining a 2D straight line associated with the 3D straight line according to a projection area of the 3D straight line in the picture frame;
wherein the system further comprises:
and the presenting module is used for presenting the map according to the 3D straight line and the 2D straight line associated with the 3D straight line.
14. The system of any of clauses 11-13, wherein the line fitting module comprises:
the point cloud generating unit is used for generating a 3D point cloud corresponding to the picture frame by projecting an activation point in a map to the picture frame if the picture frame is a key frame;
and the straight line fitting unit is used for fitting and generating a 3D straight line in the map according to the 3D point cloud.
15. The system of clause 14, wherein the line fitting unit is to:
preprocessing the 3D point cloud;
and fitting and generating a 3D straight line in the map according to the preprocessed 3D point cloud.
16. The system of clause 15, wherein the pre-processing of the 3D point cloud comprises at least any one of:
generating a new point in the 3D point cloud by interpolation processing;
and deleting, from the 3D point cloud, points whose distance to an existing straight line in the map is less than a distance threshold.
17. The system of clause 11, wherein the line fitting module further comprises:
and the coordinate updating unit is used for updating the coordinates of the point features in the map according to the 3D point cloud.
18. The system of clause 11, wherein the key frame detection module is to:
selecting a plurality of key frames, determining key frame parameters of the picture frame based on the plurality of key frames and the camera pose information, and determining whether the picture frame is a key frame according to the key frame parameters.
19. The system of clause 18, wherein the key frame parameters include at least one of view change information, camera translation change information, exposure time change information.
20. The system of clause 11, wherein the system further comprises:
and the coordinate updating module is used for updating the coordinates of the point-line features in the map if the picture frame is a non-key frame.
21. An apparatus for visual instant positioning and mapping, wherein the apparatus comprises:
a processor; and
a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform operations as recited in any of clauses 1-10.
22. A computer-readable medium comprising instructions that, when executed, cause a system to perform the operations recited in any of clauses 1 to 10.

Claims (20)

1. A method for visual instant positioning and mapping, wherein the method comprises:
determining camera pose information of a newly acquired picture frame;
selecting a plurality of key frames which are close to a current frame in time and space, and determining key frame parameters of the picture frame based on the key frames and the camera pose information, wherein the key frame parameters comprise view change, camera translation change and exposure time change;
determining whether the picture frame is a key frame according to the key frame parameter;
and if the picture frame is a key frame, fitting and generating a 3D straight line in the map according to the 3D point cloud corresponding to the picture frame.
2. The method of claim 1, wherein the method further comprises:
and optimizing the point-line features in the map and the camera pose information.
3. The method of claim 1, wherein, if the picture frame is a key frame, the fitting and generating a 3D straight line in a map according to the 3D point cloud corresponding to the picture frame further comprises:
determining a 2D straight line associated with the 3D straight line according to a projection area of the 3D straight line in the picture frame;
wherein the method further comprises:
presenting the map according to the 3D straight line and a 2D straight line associated with the 3D straight line.
4. The method of any one of claims 1 to 3, wherein, if the picture frame is a key frame, the fitting and generating a 3D straight line in a map according to the 3D point cloud corresponding to the picture frame comprises:
if the picture frame is a key frame, generating a 3D point cloud corresponding to the picture frame by projecting an activation point in a map to the picture frame;
and fitting and generating a 3D straight line in the map according to the 3D point cloud.
5. The method of claim 4, wherein the fitting and generating a 3D straight line in the map according to the 3D point cloud comprises:
preprocessing the 3D point cloud;
and fitting and generating a 3D straight line in the map according to the preprocessed 3D point cloud.
6. The method of claim 5, wherein the pre-processing the 3D point cloud comprises at least any one of:
generating a new point in the 3D point cloud by interpolation processing;
and deleting, from the 3D point cloud, points whose distance to an existing straight line in the map is less than a distance threshold.
7. The method of claim 1, wherein, if the picture frame is a key frame, the fitting and generating a 3D straight line in a map according to the 3D point cloud corresponding to the picture frame further comprises:
and updating the coordinates of the point features in the map according to the 3D point cloud.
8. The method of claim 1, wherein the key frame parameters include at least one of view change information, camera translation change information, exposure time change information.
9. The method of claim 1, wherein the method further comprises:
and if the picture frame is a non-key frame, updating the coordinates of the point-line features in the map.
10. A system for visual instant positioning and mapping, wherein the system comprises:
the pose determining module is used for determining the camera pose information of the newly acquired picture frame;
the key frame detection module is used for selecting a plurality of key frames which are close to the current frame in time and space, determining key frame parameters of the picture frame based on the plurality of key frames and the camera pose information, wherein the key frame parameters comprise visual field change, camera translation change and exposure time change, and determining whether the picture frame is a key frame according to the key frame parameters;
and the straight line fitting module is used for fitting and generating a 3D straight line in the map according to the 3D point cloud corresponding to the picture frame if the picture frame is a key frame.
11. The system of claim 10, wherein the system further comprises:
and the optimization module is used for optimizing the point-line features in the map and the camera pose information.
12. The system of claim 10, wherein the line fitting module further comprises:
the association unit is used for determining a 2D straight line associated with the 3D straight line according to a projection area of the 3D straight line in the picture frame;
wherein the system further comprises:
and the presenting module is used for presenting the map according to the 3D straight line and the 2D straight line associated with the 3D straight line.
13. The system of any of claims 10 to 12, wherein the line fitting module comprises:
the point cloud generating unit is used for generating a 3D point cloud corresponding to the picture frame by projecting an activation point in a map to the picture frame if the picture frame is a key frame;
and the straight line fitting unit is used for fitting and generating a 3D straight line in the map according to the 3D point cloud.
14. The system of claim 13, wherein the line fitting unit is to:
preprocessing the 3D point cloud;
and fitting and generating a 3D straight line in the map according to the preprocessed 3D point cloud.
15. The system of claim 14, wherein the pre-processing of the 3D point cloud comprises at least any one of:
generating a new point in the 3D point cloud by interpolation processing;
and deleting, from the 3D point cloud, points whose distance to an existing straight line in the map is less than a distance threshold.
16. The system of claim 10, wherein the line fitting module further comprises:
and the coordinate updating unit is used for updating the coordinates of the point features in the map according to the 3D point cloud.
17. The system of claim 10, wherein the key frame parameters include at least one of view change information, camera translation change information, exposure time change information.
18. The system of claim 10, wherein the system further comprises:
and the coordinate updating module is used for updating the coordinates of the point-line features in the map if the picture frame is a non-key frame.
19. An apparatus for visual instant positioning and mapping, wherein the apparatus comprises:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to perform the method of any of claims 1 to 9.
20. A computer readable medium comprising instructions that, when executed, cause a system to perform the method of any of claims 1 to 9.
CN201711252235.XA 2017-12-01 2017-12-01 Method and system for visual instant positioning and mapping based on 3D point cloud Active CN107909612B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711252235.XA CN107909612B (en) 2017-12-01 2017-12-01 Method and system for visual instant positioning and mapping based on 3D point cloud

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711252235.XA CN107909612B (en) 2017-12-01 2017-12-01 Method and system for visual instant positioning and mapping based on 3D point cloud

Publications (2)

Publication Number Publication Date
CN107909612A CN107909612A (en) 2018-04-13
CN107909612B true CN107909612B (en) 2021-01-29

Family

ID=61849663

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711252235.XA Active CN107909612B (en) 2017-12-01 2017-12-01 Method and system for visual instant positioning and mapping based on 3D point cloud

Country Status (1)

Country Link
CN (1) CN107909612B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108648274B (en) * 2018-05-10 2020-05-22 华南理工大学 Cognitive point cloud map creating system of visual SLAM
CN108682027A (en) * 2018-05-11 2018-10-19 北京华捷艾米科技有限公司 VSLAM realization method and systems based on point, line Fusion Features
CN110515089B (en) * 2018-05-21 2023-06-02 华创车电技术中心股份有限公司 Driving auxiliary method based on optical radar
CN109724586B (en) * 2018-08-21 2022-08-02 南京理工大学 Spacecraft relative pose measurement method integrating depth map and point cloud
CN110853085B (en) * 2018-08-21 2022-08-19 深圳地平线机器人科技有限公司 Semantic SLAM-based mapping method and device and electronic equipment
CN111062233A (en) * 2018-10-17 2020-04-24 北京地平线机器人技术研发有限公司 Marker representation acquisition method, marker representation acquisition device and electronic equipment
CN109556596A (en) * 2018-10-19 2019-04-02 北京极智嘉科技有限公司 Air navigation aid, device, equipment and storage medium based on ground texture image
CN109636897B (en) * 2018-11-23 2022-08-23 桂林电子科技大学 Octmap optimization method based on improved RGB-D SLAM
US11095900B2 (en) * 2018-12-19 2021-08-17 Sony Group Corporation Point cloud coding structure
CN111383324B (en) * 2018-12-29 2023-03-28 广州文远知行科技有限公司 Point cloud map construction method and device, computer equipment and storage medium
CN111415387B (en) * 2019-01-04 2023-12-29 南京人工智能高等研究院有限公司 Camera pose determining method and device, electronic equipment and storage medium
CN109443320A (en) * 2019-01-10 2019-03-08 轻客小觅智能科技(北京)有限公司 Binocular vision speedometer and measurement method based on direct method and line feature
KR102335389B1 (en) * 2019-01-30 2021-12-03 바이두닷컴 타임즈 테크놀로지(베이징) 컴퍼니 리미티드 Deep Learning-Based Feature Extraction for LIDAR Position Estimation of Autonomous Vehicles
CN109814572B (en) * 2019-02-20 2022-02-01 广州市山丘智能科技有限公司 Mobile robot positioning and mapping method and device, mobile robot and storage medium
CN110310326B (en) * 2019-06-28 2021-07-02 北京百度网讯科技有限公司 Visual positioning data processing method and device, terminal and computer readable storage medium
CN110533716B (en) * 2019-08-20 2022-12-02 西安电子科技大学 Semantic SLAM system and method based on 3D constraint
CN111311684B (en) * 2020-04-01 2021-02-05 亮风台(上海)信息科技有限公司 Method and equipment for initializing SLAM
CN113589306B (en) * 2020-04-30 2023-04-11 北京猎户星空科技有限公司 Positioning method, positioning device, electronic equipment and storage medium
CN111813882A (en) * 2020-06-18 2020-10-23 浙江大华技术股份有限公司 Robot map construction method, device and storage medium


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101287142A (en) * 2008-05-16 2008-10-15 清华大学 Method for converting flat video to tridimensional video based on bidirectional tracing and characteristic points correction

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446815A (en) * 2016-09-14 2017-02-22 浙江大学 Simultaneous positioning and map building method
CN106570507A (en) * 2016-10-26 2017-04-19 北京航空航天大学 Multi-angle consistent plane detection and analysis method for monocular video scene three dimensional structure

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Keyframe-Based Visual-Inertial SLAM Using Nonlinear Optimization; Stefan Leutenegger et al.; Robotics: Science and Systems (RSS); 2013-06-24; pp. 1-8 *

Also Published As

Publication number Publication date
CN107909612A (en) 2018-04-13

Similar Documents

Publication Publication Date Title
CN107909612B (en) Method and system for visual instant positioning and mapping based on 3D point cloud
CN107784671B (en) Method and system for visual instant positioning and drawing
CN110322500B (en) Optimization method and device for instant positioning and map construction, medium and electronic equipment
CN110657803B (en) Robot positioning method, device and storage device
US10839547B2 (en) Camera pose determination and tracking
CN111325796B (en) Method and apparatus for determining pose of vision equipment
US10529086B2 (en) Three-dimensional (3D) reconstructions of dynamic scenes using a reconfigurable hybrid imaging system
US9846042B2 (en) Gyroscope assisted scalable visual simultaneous localization and mapping
US11360161B2 (en) Compensating for distortion in an electromagnetic tracking system
CN111311684B (en) Method and equipment for initializing SLAM
WO2022121640A1 (en) Robot relocalization method and apparatus, and robot and readable storage medium
US10706567B2 (en) Data processing method, apparatus, system and storage media
US9213899B2 (en) Context-aware tracking of a video object using a sparse representation framework
US20190096089A1 (en) Enabling use of three-dimensonal locations of features with two-dimensional images
US10288425B2 (en) Generation of map data
US20170169603A1 (en) Method and apparatus for creating 3-dimensional model using volumetric closest point approach
CN110349212B (en) Optimization method and device for instant positioning and map construction, medium and electronic equipment
US9665978B2 (en) Consistent tessellation via topology-aware surface tracking
Lin et al. Optimizing ZNCC calculation in binocular stereo matching
Zhang et al. A new high resolution depth map estimation system using stereo vision and kinect depth sensing
CN108520543B (en) Method, equipment and storage medium for optimizing relative precision map
CN113610702A (en) Picture construction method and device, electronic equipment and storage medium
CN117132649A (en) Ship video positioning method and device for artificial intelligent Beidou satellite navigation fusion
US8374391B1 (en) Model of visibility of targets
CN111665470A (en) Positioning method and device and robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant