CN115826628B - Heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on NeRF neural network - Google Patents

Heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on NeRF neural network Download PDF

Info

Publication number
CN115826628B
CN115826628B · Application CN202310147930.9A
Authority
CN
China
Prior art keywords
module
nerf
depth
slam
aerial vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310147930.9A
Other languages
Chinese (zh)
Other versions
CN115826628A (en)
Inventor
周仁建
王聪
姚慧敏
张继花
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Aeronautic Polytechnic
Original Assignee
Chengdu Aeronautic Polytechnic
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Aeronautic Polytechnic filed Critical Chengdu Aeronautic Polytechnic
Priority to CN202310147930.9A priority Critical patent/CN115826628B/en
Publication of CN115826628A publication Critical patent/CN115826628A/en
Application granted granted Critical
Publication of CN115826628B publication Critical patent/CN115826628B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Abstract

The invention discloses a heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on a NeRF neural network, belonging to the technical field of unmanned aerial vehicle vision obstacle avoidance. The invention solves the problem that SLAM-based visual obstacle avoidance with a conventional RGB-D depth camera becomes unstable when the unmanned aerial vehicle operates in direct sunlight, and improves the reliability of autonomous unmanned aerial vehicle flight.

Description

Heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on NeRF neural network
Technical Field
The invention belongs to the technical field of unmanned aerial vehicle vision obstacle avoidance, and particularly relates to a heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on a NeRF neural network.
Background
NeRF (Neural Radiance Fields) is one of the current important research directions; the problem it addresses is how to synthesize images from new viewpoints given a set of captured photographs. Unlike traditional three-dimensional reconstruction methods, which represent a scene explicitly as point clouds, meshes, voxels and the like, NeRF takes a new approach and models the scene as a continuous 5D radiance field stored implicitly in a neural network. Only sparse multi-view images with camera poses are needed for training to obtain a neural radiance field model, from which clear pictures at arbitrary viewing angles can be rendered.
Fusing deep learning with traditional geometry is a trend in SLAM development. In previous applications, individual modules of SLAM were replaced by neural networks, such as feature extraction, feature matching, loop closure, and depth estimation. Compared with such single-point substitution, NeRF-based SLAM is a completely new framework that can replace traditional SLAM end to end, both in design methodology and in implementation architecture.
During autonomous flight, an unmanned aerial vehicle must flexibly avoid obstacles; the commonly adopted approach is SLAM map reconstruction based on a lidar-assisted RGB-D depth camera or a binocular camera.
While many three-dimensional reconstruction solutions are based on RGB-D or lidar sensors, scene reconstruction from monocular images offers a more convenient alternative. RGB-D cameras may fail under certain conditions, such as direct sunlight, and lidar is still heavier than a monocular RGB camera. In addition, a stereoscopic camera simplifies depth estimation to a one-dimensional disparity search, but relies on accurate camera calibration that easily degrades in actual operation.
Disclosure of Invention
Aiming at the above defects in the prior art, the heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on the NeRF neural network provided by the invention solve the problem that SLAM visual obstacle avoidance with a conventional RGB-D depth camera is unstable when the unmanned aerial vehicle operates in direct sunlight.
In order to achieve the aim of the invention, the invention adopts the following technical scheme: a heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on a NeRF neural network comprises a core module, and a laser radar module, an unmanned aerial vehicle flight control module, an unmanned aerial vehicle monocular camera module and an IMU inertial navigation module which are respectively connected with the core module;
the unmanned aerial vehicle monocular camera module is used for acquiring images to obtain original images;
the IMU inertial navigation module is used for acquiring the camera pose;
the core module is used for obtaining a global map according to the original image and the camera gesture;
the laser radar module is used for combining the core module according to the global map to obtain a path planning result;
and the unmanned aerial vehicle flight control module is used for controlling the unmanned aerial vehicle according to the path planning result to realize the obstacle avoidance of the unmanned aerial vehicle.
The beneficial effects of the invention are as follows: the invention acquires images through the unmanned aerial vehicle monocular camera module; an image acquisition subunit is arranged in the FPGA programmable logic submodule of the core module, and the acquired images are transmitted to the NeRF-SLAM mapping subunit. The NeRF-SLAM mapping subunit, in combination with the IMU inertial navigation module and the pose solving subunit, models the scene as a continuous 5D radiance field, so that SLAM mapping can be performed implicitly through a three-dimensional scene stored in a neural network. The mapping result is uploaded through the MicroBlaze processor unit to the APU application processor submodule to update the global map of the obstacle avoidance system; combined with the laser radar module and the laser radar SLAM unit, flight path planning is realized, and the path planning result is sent to the unmanned aerial vehicle flight control system to control the unmanned aerial vehicle, thereby realizing unmanned aerial vehicle vision obstacle avoidance. Using the unmanned aerial vehicle monocular camera module reduces the weight and volume of the unmanned aerial vehicle. Meanwhile, by adopting the concept of heterogeneous processing, the generalization capability of the NeRF-SLAM neural radiance field is improved and it does not get trapped in local minima.
Further, the core module comprises an APU application processor submodule and an FPGA programmable logic submodule connected with the APU application processor submodule;
the APU application processor submodule comprises a laser radar SLAM unit, a global map unit and a flight path planning unit which are sequentially connected; the laser radar SLAM unit is connected with the laser radar module; the global map unit is connected with the FPGA programmable logic submodule; the flight path planning unit is connected with the unmanned aerial vehicle flight control module;
the FPGA programmable logic submodule comprises a MicroBlaze processor unit and a NeRF-SLAM neural radiance field unit which are connected in sequence; the NeRF-SLAM neural radiance field unit comprises a NeRF-SLAM mapping subunit, and an image acquisition subunit and a pose solving subunit which are respectively connected with the NeRF-SLAM mapping subunit; the MicroBlaze processor unit is respectively connected with the NeRF-SLAM mapping subunit and the global map unit; the image acquisition subunit is connected with the unmanned aerial vehicle monocular camera module; and the pose solving subunit is connected with the IMU inertial navigation module.
The beneficial effects of the above further scheme are as follows: by adopting the concept of heterogeneous processing, the generalization capability of the NeRF-SLAM neural radiance field is improved, and it does not get trapped in local minima.
The invention further provides a heterogeneous unmanned aerial vehicle vision obstacle avoidance method based on a NeRF neural network, which comprises the following steps:
S1, obtaining an original image through the unmanned aerial vehicle monocular camera module;
S2, modeling the original image into a continuous 5D radiance field by utilizing the NeRF-SLAM neural radiance field unit to obtain a mapping result;
S3, transmitting the mapping result to the global map unit through the MicroBlaze processor unit to obtain a global map;
S4, obtaining a path planning result by utilizing the laser radar SLAM unit and the laser radar module according to the global map;
S5, transmitting the path planning result to the unmanned aerial vehicle flight control module by using the flight path planning unit, thereby realizing unmanned aerial vehicle vision obstacle avoidance.
The beneficial effects of the invention are as follows: the NeRF-SLAM model is optimized by utilizing the structure of the Hessian matrix, and the MLP multi-layer perceptron in the classical NeRF network is replaced by an SLFNN (single hidden layer feedforward neural network), so that an ELM (extreme learning machine) model can be conveniently constructed and the solution accelerated, improving the reliability of autonomous flight of the unmanned aerial vehicle.
Further, the step S2 specifically comprises the following steps:
S201, obtaining continuous-frame RGB images from the original image by utilizing the image acquisition subunit of the NeRF-SLAM neural radiance field unit;
S202, optimizing the NeRF-SLAM model by utilizing the structure of the Hessian matrix according to the continuous-frame RGB images to obtain an improved NeRF-SLAM model;
S203, obtaining a mapping result according to the improved NeRF-SLAM model.
The beneficial effects of the above further scheme are as follows: optimizing the NeRF-SLAM model with the structure of the Hessian matrix makes it possible to optimize the radiance field parameters while simultaneously refining the camera poses.
Further, the improved NeRF-SLAM model in step S202 includes a first single hidden layer feedforward neural network SLFNN1, a second single hidden layer feedforward neural network SLFNN2, a calculation and position encoding module, a volume rendering module, and a GRU convolution module;
the calculation and position encoding module is used for acquiring the scene position and the observation direction;
the first single hidden layer feedforward neural network SLFNN1 is used for obtaining voxel density and feature vectors according to the scene position;
the second single hidden layer feedforward neural network SLFNN2 is used for obtaining color according to the observation direction and the feature vector;
the volume rendering module is used for obtaining voxel output according to the voxel density and the color;
and the GRU convolution module is used for acquiring the output of the GRU convolution module.
The beneficial effects of the above further scheme are as follows: the MLP multi-layer perceptron in the classical NeRF network is replaced by an SLFNN (single hidden layer feedforward neural network), so that an ELM (extreme learning machine) model can be conveniently constructed and the solution accelerated.
Further, the step S202 specifically comprises the following steps:
S2021, obtaining dense optical flow, optical flow weights and the GRU convolution module output through the GRU convolution module according to the continuous-frame RGB images;
S2022, obtaining a reduced camera matrix by utilizing the Schur complement of the depth arrow-shaped block-sparse Hessian matrix according to the dense optical flow and the optical flow weights;
S2023, obtaining a first pose according to the Cholesky decomposition of the camera matrix;
S2024, obtaining depth according to the first pose;
S2025, performing block division on the Hessian matrix to obtain a block division result; the expression of the block division result is:

$$H\,\Delta x = b, \qquad H = \begin{bmatrix} C & E \\ E^{T} & P \end{bmatrix}, \quad \Delta x = \begin{bmatrix} \Delta\xi \\ \Delta d \end{bmatrix}, \quad b = \begin{bmatrix} v \\ w \end{bmatrix}$$

wherein H is the Hessian matrix; C is the block camera matrix; E is the camera/depth off-diagonal Hessian block and E^{T} is its transpose; P is the diagonal matrix corresponding to the per-pixel depths of the key frames; \Delta\xi is the incremental update on the Lie algebra of the camera poses in the SE(3) group; \Delta d is the incremental update of the per-pixel inverse depths; v is the pose residual; w is the depth residual; and b is the residual vector;
S2026, obtaining the depth marginal covariance and the pose marginal covariance according to the block division result; the expressions of the depth marginal covariance and the pose marginal covariance are as follows:

$$\Sigma_{d} = P^{-1} + P^{-T} E^{T}\,\Sigma_{T}\,E\,P^{-1}$$

$$\Sigma_{T} = \left(L L^{T}\right)^{-1}$$

wherein \Sigma_{d} is the depth marginal covariance; P^{-1} is the inverse of the diagonal matrix corresponding to the per-pixel inverse depths of the key frames and P^{-T} is its inverse transpose; \Sigma_{T} is the pose marginal covariance; L is the lower-triangular Cholesky factor and L^{T} is its transpose;
S2027, obtaining the improved NeRF-SLAM model according to the first pose, the depth, the pose marginal covariance, the depth marginal covariance and the continuous-frame RGB images.
The beneficial effects of the above further scheme are as follows: the NeRF-SLAM model is optimized with the structure of the Hessian matrix to obtain the improved NeRF-SLAM model, preparing for the subsequent mapping result.
Further, the step S203 specifically comprises the following steps:
S2031, acquiring the camera pose through the IMU inertial navigation module, and obtaining a second pose by utilizing the pose solving subunit of the NeRF-SLAM neural radiance field unit;
S2032, performing high-frequency encoding through the calculation and position encoding module according to the first pose, the depth, the pose marginal covariance, the depth marginal covariance and the second pose to obtain the scene position and the observation direction; the expression of the high-frequency encoding is:

$$\gamma(p) = \left(\sin(2^{0}\pi p), \cos(2^{0}\pi p), \ldots, \sin(2^{L-1}\pi p), \cos(2^{L-1}\pi p)\right)$$

wherein \gamma(\cdot) is the high-frequency encoding function; p is the position vector; \sin(\cdot) is the sine function; and \cos(\cdot) is the cosine function;
S2033, obtaining the voxel density and the feature vector by using the first single hidden layer feedforward neural network SLFNN1 according to the scene position;
S2034, obtaining the color by using the second single hidden layer feedforward neural network SLFNN2 according to the observation direction and the feature vector;
S2035, obtaining the voxel output through the volume rendering module according to the voxel density and the color; the expression of the voxel output is:

$$C(r) = \int_{t_{n}}^{t_{f}} T(t)\,\sigma(r(t))\,c(r(t), d)\,dt, \qquad T(t) = \exp\!\left(-\int_{t_{n}}^{t}\sigma(r(s))\,ds\right)$$

wherein C(r) is the voxel output; T(t) is the accumulated transmittance of the ray from t_{n} to t, i.e. the probability that the ray travels from t_{n} to t without hitting any particle; \sigma(\cdot) is the voxel density function; c(\cdot,\cdot) is the color function; r is the corresponding ray; d is the direction vector; t_{n} is the near boundary; t_{f} is the far boundary; t is the distance from a sample point on the camera ray to the camera optical center; and s is the position along the ray;
S2036, combining the voxel output and the GRU convolution module output, and obtaining the mapping result by utilizing the NeRF-SLAM mapping subunit of the NeRF-SLAM neural radiance field unit through local mapping, obstacle region judgment and occupancy grid mapping.
The beneficial effects of the above further scheme are as follows: the mapping result is obtained according to the improved NeRF-SLAM model, preparing for flight route planning.
Further, the loss function of the mapping result is:

$$\mathcal{L}(T, \Theta) = \mathcal{L}_{rgb}(T, \Theta) + \lambda_{D}\,\mathcal{L}_{D}(T, \Theta)$$

$$\mathcal{L}_{rgb}(T, \Theta) = \left\lVert I - \hat{I}(T, \Theta) \right\rVert^{2}$$

$$\mathcal{L}_{D}(T, \Theta) = \left\lVert D - \hat{D}(T, \Theta) \right\rVert_{\Sigma_{D}}$$

wherein \mathcal{L}(T, \Theta) is the loss function of the mapping result; T is the first pose; \Theta are the neural parameters; \mathcal{L}_{rgb} is the color loss function; I is the color image to be rendered; \hat{I} is the rendered color image; \lambda_{D} is a hyperparameter for balancing depth and color supervision; \mathcal{L}_{D} is the depth loss function; D is the depth; \Sigma_{D} is the uncertainty marginal covariance; and \hat{D} is the rendered depth.
The beneficial effects of the above further scheme are as follows: parameters are continuously adjusted according to the loss function of the mapping result, improving the reliability of autonomous flight of the unmanned aerial vehicle.
Drawings
Fig. 1 is a system configuration diagram of the present invention.
Fig. 2 is a flow chart of the method of the present invention.
Fig. 3 is a schematic structural diagram of the basic framework of the SLFNN hardware unit design in embodiment 3 of the present invention.
FIG. 4 is a training flowchart of the first single hidden layer feedforward neural network or the second single hidden layer feedforward neural network in embodiment 3 of the present invention.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding of the present invention by those skilled in the art. It should be understood, however, that the present invention is not limited to the scope of these embodiments; for those skilled in the art, all inventions making use of the inventive concept fall within the protection scope of the present invention as defined by the appended claims.
Example 1
As shown in fig. 1, the heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on the NeRF neural network comprises a core module, and a laser radar module, an unmanned aerial vehicle flight control module, an unmanned aerial vehicle monocular camera module and an IMU inertial navigation module which are respectively connected with the core module;
the unmanned aerial vehicle monocular camera module is used for acquiring images to obtain original images;
the IMU inertial navigation module is used for acquiring the camera pose;
the core module is used for obtaining a global map according to the original image and the camera gesture;
the laser radar module is used for combining the core module according to the global map to obtain a path planning result;
and the unmanned aerial vehicle flight control module is used for controlling the unmanned aerial vehicle according to the path planning result to realize the obstacle avoidance of the unmanned aerial vehicle.
The core module comprises an APU application processor submodule and an FPGA programmable logic submodule connected with the APU application processor submodule;
the APU application processor submodule comprises a laser radar SLAM unit, a global map unit and a flight path planning unit which are sequentially connected; the laser radar SLAM unit is connected with the laser radar module; the global map unit is connected with the FPGA programmable logic submodule; the flight path planning unit is connected with the unmanned aerial vehicle flight control module;
the FPGA programmable logic submodule comprises a MicroBlaze processor unit and a NeRF-SLAM neural radiance field unit which are connected in sequence; the NeRF-SLAM neural radiance field unit comprises a NeRF-SLAM mapping subunit, and an image acquisition subunit and a pose solving subunit which are respectively connected with the NeRF-SLAM mapping subunit; the MicroBlaze processor unit is respectively connected with the NeRF-SLAM mapping subunit and the global map unit; the image acquisition subunit is connected with the unmanned aerial vehicle monocular camera module; and the pose solving subunit is connected with the IMU inertial navigation module.
In this embodiment, the core module is a development board of model Zynq UltraScale+ MPSoC XCZU5EV.
The working principle of the invention is as follows: the invention acquires images through the unmanned aerial vehicle monocular camera module; an image acquisition subunit is arranged in the FPGA programmable logic submodule of the core module, and the acquired images are transmitted to the NeRF-SLAM mapping subunit. The NeRF-SLAM mapping subunit, in combination with the IMU inertial navigation module and the pose solving subunit, models the scene as a continuous 5D radiance field, so that SLAM mapping can be performed implicitly through a three-dimensional scene stored in a neural network. The mapping result is uploaded through the MicroBlaze processor unit to the APU application processor submodule to update the global map of the obstacle avoidance system; combined with the laser radar module and the laser radar SLAM unit, flight path planning is realized, and the path planning result is sent to the unmanned aerial vehicle flight control system to control the unmanned aerial vehicle, thereby realizing unmanned aerial vehicle vision obstacle avoidance.
Example 2
As shown in fig. 2, the invention provides a heterogeneous unmanned aerial vehicle vision obstacle avoidance method based on a NeRF neural network, which comprises the following steps:
S1, obtaining an original image through the unmanned aerial vehicle monocular camera module;
S2, modeling the original image into a continuous 5D radiance field by utilizing the NeRF-SLAM neural radiance field unit to obtain a mapping result;
S3, transmitting the mapping result to the global map unit through the MicroBlaze processor unit to obtain a global map;
S4, obtaining a path planning result by utilizing the laser radar SLAM unit and the laser radar module according to the global map;
S5, transmitting the path planning result to the unmanned aerial vehicle flight control module by using the flight path planning unit, thereby realizing unmanned aerial vehicle vision obstacle avoidance.
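For illustration, a minimal control-flow sketch of steps S1 to S5 follows; all class and method names (camera.capture, nerf_slam.update, planner.plan and so on) are hypothetical stand-ins, not interfaces defined by the patent.

```python
# Minimal sketch of the S1-S5 loop; every class and method name is hypothetical.
def obstacle_avoidance_step(camera, imu, nerf_slam, global_map, lidar_slam, planner, flight_ctrl):
    frame = camera.capture()                            # S1: raw image from the monocular camera module
    pose_prior = imu.read_pose()                        # IMU pose used by the pose solving subunit
    local_map = nerf_slam.update(frame, pose_prior)     # S2: model the scene as a continuous 5D radiance field
    global_map.merge(local_map)                         # S3: MicroBlaze uploads the mapping result
    path = planner.plan(global_map, lidar_slam.scan())  # S4: lidar SLAM + global map -> path planning result
    flight_ctrl.execute(path)                           # S5: flight control module performs obstacle avoidance
```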
The step S2 specifically comprises the following steps:
S201, obtaining continuous-frame RGB images from the original image by utilizing the image acquisition subunit of the NeRF-SLAM neural radiance field unit;
S202, optimizing the NeRF-SLAM model by utilizing the structure of the Hessian matrix according to the continuous-frame RGB images to obtain an improved NeRF-SLAM model;
S203, obtaining a mapping result according to the improved NeRF-SLAM model.
The improved NeRF-SLAM model in the step S202 comprises a first single hidden layer feedforward neural network SLFNN1, a second single hidden layer feedforward neural network SLFNN2, a calculation and position coding module, a volume rendering module and a GRU convolution module;
the calculation and position encoding module is used for acquiring the scene position and the observation direction;
the first single hidden layer feedforward neural network SLFNN1 is used for obtaining voxel density and feature vectors according to the scene position;
the second single hidden layer feedforward neural network SLFNN2 is used for obtaining color according to the observation direction and the feature vector;
the volume rendering module is used for obtaining voxel output according to the voxel density and the color;
and the GRU convolution module is used for acquiring the output of the GRU convolution module.
The step S202 specifically comprises the following steps:
S2021, obtaining dense optical flow, optical flow weights and the GRU convolution module output through the GRU convolution module according to the continuous-frame RGB images;
S2022, obtaining a reduced camera matrix by utilizing the Schur complement of the depth arrow-shaped block-sparse Hessian matrix according to the dense optical flow and the optical flow weights;
S2023, obtaining a first pose according to the Cholesky decomposition of the camera matrix;
S2024, obtaining depth according to the first pose;
S2025, performing block division on the Hessian matrix to obtain a block division result; the expression of the block division result is:

$$H\,\Delta x = b, \qquad H = \begin{bmatrix} C & E \\ E^{T} & P \end{bmatrix}, \quad \Delta x = \begin{bmatrix} \Delta\xi \\ \Delta d \end{bmatrix}, \quad b = \begin{bmatrix} v \\ w \end{bmatrix}$$

wherein H is the Hessian matrix; C is the block camera matrix; E is the camera/depth off-diagonal Hessian block and E^{T} is its transpose; P is the diagonal matrix corresponding to the per-pixel depths of the key frames; \Delta\xi is the incremental update on the Lie algebra of the camera poses in the SE(3) group; \Delta d is the incremental update of the per-pixel inverse depths; v is the pose residual; w is the depth residual; and b is the residual vector;
S2026, obtaining the depth marginal covariance and the pose marginal covariance according to the block division result; the expressions of the depth marginal covariance and the pose marginal covariance are as follows:

$$\Sigma_{d} = P^{-1} + P^{-T} E^{T}\,\Sigma_{T}\,E\,P^{-1}$$

$$\Sigma_{T} = \left(L L^{T}\right)^{-1}$$

wherein \Sigma_{d} is the depth marginal covariance; P^{-1} is the inverse of the diagonal matrix corresponding to the per-pixel inverse depths of the key frames and P^{-T} is its inverse transpose; \Sigma_{T} is the pose marginal covariance; L is the lower-triangular Cholesky factor and L^{T} is its transpose;
S2027, obtaining the improved NeRF-SLAM model according to the first pose, the depth, the pose marginal covariance, the depth marginal covariance and the continuous-frame RGB images.
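A NumPy/SciPy sketch of one linearized solve corresponding to steps S2022 to S2026 follows, assuming the block structure H = [[C, E], [E^T, P]] described above with a diagonal P; the function name, the array shapes, and the choice to return only the diagonal of the depth marginal covariance are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_dense_ba_step(C, E, P_diag, v, w):
    """One linearized BA step on H = [[C, E], [E^T, diag(P_diag)]], b = [v, w].
    Returns the pose update, the inverse-depth update and the two marginal covariances."""
    P_inv = 1.0 / P_diag                                   # P is diagonal, so inversion is elementwise
    S = C - E @ (P_inv[:, None] * E.T)                     # reduced camera matrix (Schur complement)
    rhs = v - E @ (P_inv * w)
    L, lower = cho_factor(S, lower=True)                   # S = L L^T
    dxi = cho_solve((L, lower), rhs)                       # pose increments on the Lie algebra of SE(3)
    dd = P_inv * (w - E.T @ dxi)                           # per-pixel inverse-depth increments
    Sigma_T = cho_solve((L, lower), np.eye(S.shape[0]))    # pose marginal covariance (L L^T)^-1
    M = P_inv[:, None] * E.T                               # P^-1 E^T (P symmetric, so P^-T = P^-1)
    Sigma_d = P_inv + np.einsum('ij,jk,ik->i', M, Sigma_T, M)  # diag of P^-1 + P^-T E^T Sigma_T E P^-1
    return dxi, dd, Sigma_T, Sigma_d
```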
The step S203 specifically comprises the following steps:
S2031, acquiring the camera pose through the IMU inertial navigation module, and obtaining a second pose by utilizing the pose solving subunit of the NeRF-SLAM neural radiance field unit;
S2032, performing high-frequency encoding through the calculation and position encoding module according to the first pose, the depth, the pose marginal covariance, the depth marginal covariance and the second pose to obtain the scene position and the observation direction; the expression of the high-frequency encoding is:

$$\gamma(p) = \left(\sin(2^{0}\pi p), \cos(2^{0}\pi p), \ldots, \sin(2^{L-1}\pi p), \cos(2^{L-1}\pi p)\right)$$

wherein \gamma(\cdot) is the high-frequency encoding function; p is the position vector; \sin(\cdot) is the sine function; and \cos(\cdot) is the cosine function;
S2033, obtaining the voxel density and the feature vector by using the first single hidden layer feedforward neural network SLFNN1 according to the scene position;
S2034, obtaining the color by using the second single hidden layer feedforward neural network SLFNN2 according to the observation direction and the feature vector;
S2035, obtaining the voxel output through the volume rendering module according to the voxel density and the color; the expression of the voxel output is:

$$C(r) = \int_{t_{n}}^{t_{f}} T(t)\,\sigma(r(t))\,c(r(t), d)\,dt, \qquad T(t) = \exp\!\left(-\int_{t_{n}}^{t}\sigma(r(s))\,ds\right)$$

wherein C(r) is the voxel output; T(t) is the accumulated transmittance of the ray from t_{n} to t, i.e. the probability that the ray travels from t_{n} to t without hitting any particle; \sigma(\cdot) is the voxel density function; c(\cdot,\cdot) is the color function; r is the corresponding ray; d is the direction vector; t_{n} is the near boundary; t_{f} is the far boundary; t is the distance from a sample point on the camera ray to the camera optical center; and s is the position along the ray;
S2036, combining the voxel output and the GRU convolution module output, and obtaining the mapping result by utilizing the NeRF-SLAM mapping subunit of the NeRF-SLAM neural radiance field unit through local mapping, obstacle region judgment and occupancy grid mapping.
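The following sketch illustrates how steps S2033 and S2034 could be evaluated with two single hidden layer feedforward networks; the hidden widths, output dimensions, activation functions and the ReLU/sigmoid squashing of density and color are illustrative assumptions, not values fixed by the patent.

```python
import numpy as np

def slfnn_forward(x, W, b, beta, act=np.tanh):
    """Single hidden layer feedforward network: y = beta^T g(W x + b)."""
    return beta.T @ act(W @ x + b)

def radiance_query(pos_enc, dir_enc, params):
    """Sketch of S2033-S2034: SLFNN1 maps the encoded position to (sigma, 256-d features),
    SLFNN2 maps (features, encoded direction) to RGB. `params` holds the two weight sets."""
    out1 = slfnn_forward(pos_enc, params["W1"], params["b1"], params["beta1"])  # beta1 -> 257 outputs
    sigma, feat = np.maximum(out1[0], 0.0), out1[1:257]       # density kept non-negative (assumption)
    rgb_in = np.concatenate([feat, dir_enc])                  # Concat merge before SLFNN2
    rgb = 1.0 / (1.0 + np.exp(-slfnn_forward(rgb_in, params["W2"], params["b2"], params["beta2"])))
    return sigma, feat, rgb
```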
The loss function of the mapping result is as follows:

$$\mathcal{L}(T, \Theta) = \mathcal{L}_{rgb}(T, \Theta) + \lambda_{D}\,\mathcal{L}_{D}(T, \Theta)$$

$$\mathcal{L}_{rgb}(T, \Theta) = \left\lVert I - \hat{I}(T, \Theta) \right\rVert^{2}$$

$$\mathcal{L}_{D}(T, \Theta) = \left\lVert D - \hat{D}(T, \Theta) \right\rVert_{\Sigma_{D}}$$

wherein \mathcal{L}(T, \Theta) is the loss function of the mapping result; T is the first pose; \Theta are the neural parameters; \mathcal{L}_{rgb} is the color loss function; I is the color image to be rendered; \hat{I} is the rendered color image; \lambda_{D} is a hyperparameter for balancing depth and color supervision; \mathcal{L}_{D} is the depth loss function; D is the depth; \Sigma_{D} is the uncertainty marginal covariance; and \hat{D} is the rendered depth.
Example 3
As shown in fig. 1, a heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on a NeRF neural network mainly comprises a core module, a laser radar module, an unmanned aerial vehicle flight control module, an unmanned aerial vehicle monocular camera module and an IMU inertial navigation module. The core module mainly comprises an APU application processor submodule and an FPGA programmable logic submodule; the core module adopts a development board of model Zynq UltraScale+ MPSoC XCZU5EV. The APU application processor submodule mainly comprises a laser radar SLAM unit, a global map unit and a flight path planning unit. The FPGA programmable logic submodule mainly comprises a MicroBlaze processor unit and a NeRF-SLAM neural radiance field unit; the NeRF-SLAM neural radiance field unit comprises a NeRF-SLAM mapping subunit, an image acquisition subunit and a pose solving subunit. The invention adopts the concept of heterogeneous processing: the FPGA programmable logic submodule is used to replace the MLP multi-layer perceptron in the NeRF-SLAM model, and the heterogeneous FPGA programmable logic resources accelerate the extreme learning machine (ELM) solution of the first single hidden layer feedforward neural network SLFNN1 and the second single hidden layer feedforward neural network SLFNN2, which improves the generalization capability of the neural radiance field of the NeRF-SLAM model and avoids getting trapped in local minima, while mapping is completed through the heterogeneous APU application processor submodule. In addition, the unmanned aerial vehicle monocular camera module is used as the SLAM mapping sensor for unmanned aerial vehicle vision obstacle avoidance, and monocular depth estimation is carried out with a dense optical flow estimation algorithm, which reduces the weight, volume and cost of the unmanned aerial vehicle and improves the reliability of autonomous flight.
In this embodiment, the MicroBlaze processor unit adopted with the improved NeRF-SLAM model is based on RISC (reduced instruction set computing), which has the advantages of high speed, reduced design cost and improved reliability.
The FPGA programmable logic submodule mainly comprises a MicroBlaze processor unit and a NeRF-SLAM neural radiance field unit. The MicroBlaze processor unit is mainly used for managing the NeRF-SLAM neural radiance field unit, completing communication with the general-purpose I/O interface (GPIO) and the high-speed peripheral I/O interface, and handling clocks; its functions comprise (1) system control, (2) data preprocessing, (3) ELM training and (4) mapping control.
Further, the MicroBlaze processor unit is connected with the BRAM through the LMB local memory bus, so that the stored neuron coefficients can be changed flexibly. Moreover, given that the ELM implementation requires a large amount of on-chip memory resources, using BRAM also helps avoid the latency caused by off-chip accesses.
Further, the MicroBlaze processor unit is connected to the EMC external memory controller through the XCL cache link to store data to external memory, such as an SD card.
Further, the MicroBlaze processor unit accesses a general purpose I/O interface (GPIO), a high speed peripheral I/O interface, and a clock via the PLB peripheral local bus.
Further, after the MicroBlaze processor unit completes data processing through the FPU floating-point unit, the data processing results are transmitted to the PS through the AXI bus.
Further, the NeRF-SLAM neural radiance field unit obtains original images from the external unmanned aerial vehicle monocular camera module, and continuous-frame image input I is obtained through the image acquisition subunit. Starting from a series of images img 1 through img n, the improved NeRF-SLAM model first calculates the dense optical flow between pairs of frames; the GRU convolution module is used to calculate a new dense optical flow, given the correlation between the frame pair and a guess of the current dense optical flow, together with a weight for each optical flow measurement.
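A conceptual sketch of this recurrent flow refinement is shown below; correlation_volume and gru_update are hypothetical callables standing in for the correlation lookup and the GRU convolution module, and the iteration count is arbitrary.

```python
# Conceptual sketch of the recurrent flow update described above; `correlation_volume`
# and `gru_update` are hypothetical stand-ins, injected as arguments so the loop is self-contained.
def refine_flow(feat_i, feat_j, flow, hidden, gru_update, correlation_volume, iters=8):
    for _ in range(iters):
        corr = correlation_volume(feat_i, feat_j, flow)         # correlation between the frame pair
        hidden, dflow, weight = gru_update(hidden, corr, flow)  # GRU predicts a flow update and weights
        flow = flow + dflow                                     # new guess of the dense optical flow
    return flow, weight
```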
Further, using these flows and weights as measurements, the improved NeRF-SLAM model adopted by the invention solves a dense bundle adjustment (BA) problem, in which the 3D geometry is parameterized as a set of inverse depth maps, one per keyframe. This parameterization leads to a very efficient way of solving the dense BA problem: the system of equations is linearized into the familiar camera/depth arrow-shaped block-sparse Hessian matrix H \in \mathbb{R}^{(c+p)\times(c+p)} and expressed as a linear least-squares problem, where c and p are the dimensions of the cameras and the points.
Further, to solve this linear least-squares problem, the Schur complement of the Hessian matrix is used to compute a reduced camera matrix, which does not depend on depth and has a much smaller dimension.
Further, the camera poses are solved by Cholesky decomposition of the reduced camera matrix, with forward and back substitution yielding the poses T. Given these poses T, the per-pixel depths can be solved. Furthermore, given the poses T and the depths D, the improved NeRF-SLAM model computes the induced optical flow and provides it again as an initial guess to the GRU convolution module.
Further, the improved NeRF-SLAM model is optimized with the structure of the Hessian matrix, and the block division is performed as follows:

$$H\,\Delta x = b, \qquad H = \begin{bmatrix} C & E \\ E^{T} & P \end{bmatrix}, \quad \Delta x = \begin{bmatrix} \Delta\xi \\ \Delta d \end{bmatrix}, \quad b = \begin{bmatrix} v \\ w \end{bmatrix}$$

wherein H is the Hessian matrix, b is the residual vector, C is the block camera matrix, and P is the diagonal matrix corresponding to the per-pixel depths of the key frames. \Delta\xi denotes the incremental update on the Lie algebra of the camera poses in the SE(3) group, and \Delta d is the incremental update of the per-pixel inverse depths. E is the camera/depth off-diagonal Hessian block, and v and w are the pose and depth residuals, respectively.
Further, from the block division of the Hessian matrix, the marginal covariances of the dense depth \Sigma_{d} and of the poses \Sigma_{T} can be computed efficiently:

$$\Sigma_{d} = P^{-1} + P^{-T} E^{T}\,\Sigma_{T}\,E\,P^{-1}$$

$$\Sigma_{T} = \left(L L^{T}\right)^{-1}$$

wherein P^{-1} is the inverse of the diagonal matrix corresponding to the per-pixel inverse depths of the key frames and P^{-T} is its inverse transpose; E is the camera/depth off-diagonal Hessian block and E^{T} is its transpose; L is the lower-triangular Cholesky factor of the reduced camera matrix and L^{T} is its transpose; and (\cdot)^{-1} denotes taking the inverse matrix.
Further, given all the information computed by the tracking module (the poses, the depths and their respective marginal covariances) together with the input RGB images, the radiance field parameters can be optimized while the camera poses are simultaneously refined.
Furthermore, the MLP multi-layer perceptron in the classical NeRF network is replaced by the first single hidden layer feedforward neural network SLFNN1 and the second single hidden layer feedforward neural network SLFNN2, which are realized on the FPGA programmable logic submodule, so that an ELM (extreme learning machine) model can be conveniently constructed and the solution accelerated.
Further, the IMU inertial navigation module collects the camera pose, a second pose Q is obtained through the pose solving subunit, and the second pose Q is sent to the calculation and position encoding module. The previously generated depth covariance \Sigma_{D} together with the depth D, and the pose covariance \Sigma_{T} together with the first pose T, are also connected to the calculation and position encoding module. The module calculates the scene position (x, y, z) of a point in the scene and the observation direction (\theta, \varphi). The scene position (x, y, z) is encoded and then sent into the first single hidden layer feedforward neural network SLFNN1; after passing through (1) the random neurons, (2) the activation function lookup table and (3) the output neurons, the maximum output selection module of SLFNN1 outputs the voxel density \sigma and a 256-dimensional feature vector. The 256-dimensional feature vector is then Concat-merged with the observation direction (\theta, \varphi) and processed by the second single hidden layer feedforward neural network SLFNN2; after (1) the random neurons, (2) the activation function lookup table and (3) the output neurons, the maximum output selection module of SLFNN2 outputs the color RGB.
Further, the MicroBlaze processor unit trains the first single hidden layer feedforward neural network SLFNN1 and the second single hidden layer feedforward neural network SLFNN2 through the fast simplex link FSL of the FPGA programmable logic submodule. SLFNN1 and SLFNN2 each consist of (1) random neurons, (2) an activation function lookup table, (3) output neurons and a maximum output selection module.
Further, in the calculation and position encoding module, the improved NeRF-SLAM model needs to reconstruct a high-definition scene from the high-frequency details of the scene. The high-frequency encoding function adopted in the encoding process is:

$$\gamma(p) = \left(\sin(2^{0}\pi p), \cos(2^{0}\pi p), \ldots, \sin(2^{L-1}\pi p), \cos(2^{L-1}\pi p)\right)$$

wherein p is the position vector, \sin(\cdot) denotes the sine, \cos(\cdot) denotes the cosine, and \pi is 3.1415926.
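A small NumPy sketch of this high-frequency encoding follows, assuming L frequency bands; the value of num_freqs is a design choice not fixed by the text.

```python
import numpy as np

def positional_encoding(p, num_freqs=10):
    """High-frequency encoding gamma(p): interleaved sin/cos terms at frequencies 2^k * pi."""
    p = np.atleast_1d(p)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi        # 2^0*pi ... 2^(L-1)*pi
    angles = np.outer(freqs, p)                          # shape (L, dim(p))
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=0).ravel()
```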
The first single hidden layer feedforward neural network SLFNN1 and the second single hidden layer feedforward neural network SLFNN2 adopt the commonly used FPGA programmable logic processing method combining parallelism and pipelining, giving the basic framework of the hardware unit design shown in fig. 3. During execution of the ELM algorithm, a series of values must be generated, such as the random weights w, the biases b, and the solved output weights \beta. The M+1 input data are digitally encoded with the random weights in the input module and fed through a multiplexer into the neuron module, which outputs the calculation result in time. The input module and the multiplexer are processed in parallel, while the neuron module and the output module are processed serially.
Further, the voxel density \sigma and the color RGB, i.e. c = (r, g, b), are connected to the volume rendering module, and the volume rendering module obtains the rendered voxel output:

$$C(r) = \int_{t_{n}}^{t_{f}} T(t)\,\sigma(r(t))\,c(r(t), d)\,dt, \qquad T(t) = \exp\!\left(-\int_{t_{n}}^{t}\sigma(r(s))\,ds\right)$$

wherein C(r) denotes the voxel output; T(t) denotes the accumulated transmittance of the ray from t_{n} to t, i.e. the probability that the ray travels from t_{n} to t without touching any particle; \sigma(\cdot) is the voxel density function; c(\cdot,\cdot) is the color function; r is the corresponding ray; and d is the direction vector.
In the HLS high-level synthesis adopted for the actual FPGA programmable logic submodule, the rendering is implemented in a discretized form:

$$\hat{C}(r) = \sum_{i=1}^{N} T_{i}\left(1 - \exp(-\sigma_{i}\delta_{i})\right) c_{i}, \qquad T_{i} = \exp\!\left(-\sum_{j=1}^{i-1}\sigma_{j}\delta_{j}\right)$$

wherein \hat{C}(r) is the discretized voxel output, T_{i} is the discretized transmittance, \sigma_{i} and c_{i} are the density and color of sample i, and \delta_{i} is the distance between consecutive samples.
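A NumPy sketch of this discretized rendering follows; padding the last interval with a large constant is a common convention and an assumption here, and the function also returns the expected depth used later for depth supervision.

```python
import numpy as np

def render_ray(sigmas, colors, t_samples):
    """Discretized volume rendering: alpha-composite per-sample densities and colors along one ray,
    returning the rendered color, the expected ray-termination depth, and the per-sample weights."""
    deltas = np.diff(t_samples, append=t_samples[-1] + 1e10)        # delta_i between consecutive samples
    alphas = 1.0 - np.exp(-sigmas * deltas)                         # opacity of each ray segment
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])  # T_i = exp(-sum_{j<i} sigma_j delta_j)
    weights = trans * alphas
    color = (weights[:, None] * colors).sum(axis=0)                 # rendered pixel color
    depth = (weights * t_samples).sum()                             # expected depth along the ray
    return color, depth, weights
```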
In this embodiment, HLS high-level synthesis refers to a process of automatically converting a logic structure described in a high-level language into a circuit model described in a low-level language.
Further, the voxel calculation result and the output of the preceding GRU convolution module are combined and output to the NeRF-SLAM mapping subunit for mapping. After the NeRF-SLAM mapping subunit performs the three steps of local mapping, obstacle region judgment and occupancy grid mapping, the mapping result is output to the MicroBlaze processor unit through the fast simplex link FSL of the FPGA programmable logic submodule.
Further, taking the uncertainty-aware loss into account, the mapping loss function is expressed as:

$$\mathcal{L}(T, \Theta) = \mathcal{L}_{rgb}(T, \Theta) + \lambda_{D}\,\mathcal{L}_{D}(T, \Theta)$$

Given the hyperparameter \lambda_{D} balancing the depth and color supervision (we set \lambda_{D} to 1.0), the loss is minimized with respect to both the poses T and the neural parameters \Theta.
Further, the depth loss function may be expressed as:

$$\mathcal{L}_{D}(T, \Theta) = \left\lVert D - \hat{D}(T, \Theta) \right\rVert_{\Sigma_{D}}$$

wherein \hat{D} is the rendered depth, and D and \Sigma_{D} are the dense depth and uncertainty estimated by the tracking module. The depth is rendered as the expected ray termination distance.
Further, the depth of each pixel is obtained by sampling three-dimensional positions along the ray of the pixel, evaluating the density \sigma_{i} of each sample i, and alpha-compositing the resulting densities, similarly to standard volume rendering. The expression for the pixel depth is:

$$\hat{D} = \sum_{i=1}^{N} T_{i}\left(1 - \exp(-\sigma_{i}\delta_{i})\right) d_{i}$$

wherein d_{i} is the depth of sample i along the ray, and \delta_{i} is the distance between consecutive samples, serving as an input variable for the pixel depth. \sigma_{i} is the volume density of sample i, generated by evaluating the density network (the MLP in the original NeRF, here SLFNN1) at the three-dimensional world coordinates of the sample.
Further, the transmittance T_{i} along the ray up to sample i is defined as:

$$T_{i} = \exp\!\left(-\sum_{j=1}^{i-1}\sigma_{j}\delta_{j}\right)$$
further, the color loss function is defined as in the original NeRF:
Figure SMS_129
wherein
Figure SMS_130
Is a rendered color image, synthesized similarly to a depth image, by using volume rendering. Each color of each pixel is also calculated by sampling along the rays of the pixel and alpha synthesizing the resulting densities and colors: />
Figure SMS_131
, wherein />
Figure SMS_132
Also->
Figure SMS_133
Transmittance of (2) and->
Figure SMS_134
Is the color estimated by ELM (extreme learning machine). For a given sampleiAt the same time, density ∈>
Figure SMS_135
And color->
Figure SMS_136
Further, the training flow of the first single hidden layer feedforward neural network SLFNN1 and the second single hidden layer feedforward neural network SLFNN2 used in the NeRF-SLAM model of the invention is shown in fig. 4. It mainly comprises an external software unit design and an internal hardware unit design: first, the training flow is started; second, the preprocessed data are input; third, the training labels are processed; fourth, the hardware unit design is entered and random weights are assigned; fifth, the hidden layer output is calculated; sixth, the generalized matrix inverse is calculated; seventh, the network output is calculated; eighth, the data accuracy is output; ninth, it is judged whether retraining is needed; if yes, the fourth to eighth steps are executed again, and if no, the training flow ends.
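The fourth to seventh steps correspond to standard extreme learning machine training; a minimal NumPy sketch follows, in which the hidden-layer width, activation function and random seed are arbitrary choices.

```python
import numpy as np

def elm_train(X, T, n_hidden=256, act=np.tanh, seed=0):
    """Extreme-learning-machine style training of a single hidden layer network:
    random input weights/biases stay fixed, output weights come from a pseudo-inverse."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_hidden, X.shape[1]))   # step 4: assign random input weights
    b = rng.standard_normal(n_hidden)                 # random biases
    H = act(X @ W.T + b)                              # step 5: hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                      # step 6: generalized (Moore-Penrose) inverse
    return W, b, beta

def elm_predict(X, W, b, beta, act=np.tanh):
    return act(X @ W.T + b) @ beta                    # step 7: network output
```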
In summary, during autonomous flight of the unmanned aerial vehicle, images are acquired through the unmanned aerial vehicle monocular camera module, an image acquisition subunit is arranged in the FPGA programmable logic submodule of the heterogeneous processor core module, and the acquired images are transmitted to the NeRF-SLAM mapping subunit. The NeRF-SLAM mapping subunit, in combination with the IMU inertial navigation module and the pose solving subunit, models the scene as a continuous 5D radiance field, and SLAM mapping can be performed implicitly through a three-dimensional scene stored in a neural network. The mapping result is uploaded through the RISC-based MicroBlaze processor unit to the APU application processor submodule to update the global map of the obstacle avoidance system; combined with the laser radar module and the laser radar SLAM unit, flight path planning is realized, and the path planning result is sent to the unmanned aerial vehicle flight control module to control the unmanned aerial vehicle.

Claims (7)

1. The heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on the NeRF neural network is characterized by comprising a core module, and a laser radar module, an unmanned aerial vehicle flight control module, an unmanned aerial vehicle monocular camera module and an IMU inertial navigation module which are respectively connected with the core module;
the unmanned aerial vehicle monocular camera module is used for acquiring images to obtain original images;
the IMU inertial navigation module is used for acquiring the camera pose;
the core module is used for obtaining a global map according to the original image and the camera pose; the core module comprises an APU application processor submodule and an FPGA programmable logic submodule connected with the APU application processor submodule;
the APU application processor submodule comprises a laser radar SLAM unit, a global map unit and a flight path planning unit which are sequentially connected; the laser radar SLAM unit is connected with the laser radar module; the global map unit is connected with the FPGA programmable logic submodule; the flight path planning unit is connected with the unmanned aerial vehicle flight control module;
the FPGA programmable logic submodule comprises a MicroBlaze processor unit and a NeRF-SLAM neural radiance field unit which are connected in sequence; the NeRF-SLAM neural radiance field unit comprises a NeRF-SLAM mapping subunit, and an image acquisition subunit and a pose solving subunit which are respectively connected with the NeRF-SLAM mapping subunit; the MicroBlaze processor unit is respectively connected with the NeRF-SLAM mapping subunit and the global map unit; the image acquisition subunit is connected with the unmanned aerial vehicle monocular camera module; the pose solving subunit is connected with the IMU inertial navigation module;
the laser radar module is used for combining the core module according to the global map to obtain a path planning result;
and the unmanned aerial vehicle flight control module is used for controlling the unmanned aerial vehicle according to the path planning result to realize the obstacle avoidance of the unmanned aerial vehicle.
2. The obstacle avoidance method of the heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on the NeRF neural network according to claim 1, characterized in that the heterogeneous unmanned aerial vehicle vision obstacle avoidance method based on the NeRF neural network comprises the following steps:
S1, obtaining an original image through the unmanned aerial vehicle monocular camera module;
S2, modeling the original image into a continuous 5D radiance field by utilizing the NeRF-SLAM neural radiance field unit to obtain a mapping result;
S3, transmitting the mapping result to the global map unit through the MicroBlaze processor unit to obtain a global map;
S4, obtaining a path planning result by utilizing the laser radar SLAM unit and the laser radar module according to the global map;
S5, transmitting the path planning result to the unmanned aerial vehicle flight control module by using the flight path planning unit, thereby realizing unmanned aerial vehicle vision obstacle avoidance.
3. The obstacle avoidance method of the heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on the NeRF neural network according to claim 2, characterized in that the step S2 specifically comprises:
S201, obtaining continuous-frame RGB images from the original image by utilizing the image acquisition subunit of the NeRF-SLAM neural radiance field unit;
S202, optimizing the NeRF-SLAM model by utilizing the structure of the Hessian matrix according to the continuous-frame RGB images to obtain an improved NeRF-SLAM model;
S203, obtaining a mapping result according to the improved NeRF-SLAM model.
4. The obstacle avoidance method of the heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on the NeRF neural network according to claim 3, characterized in that the improved NeRF-SLAM model in step S202 comprises a first single hidden layer feedforward neural network SLFNN1, a second single hidden layer feedforward neural network SLFNN2, a calculation and position encoding module, a volume rendering module and a GRU convolution module;
the calculation and position encoding module is used for acquiring the scene position and the observation direction;
the first single hidden layer feedforward neural network SLFNN1 is used for obtaining voxel density and feature vectors according to the scene position;
the second single hidden layer feedforward neural network SLFNN2 is used for obtaining color according to the observation direction and the feature vector;
the volume rendering module is used for obtaining voxel output according to the voxel density and the color;
and the GRU convolution module is used for acquiring the output of the GRU convolution module.
5. The obstacle avoidance method of the heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on the NeRF neural network according to claim 4, characterized in that the step S202 specifically comprises:
S2021, obtaining dense optical flow, optical flow weights and the GRU convolution module output through the GRU convolution module according to the continuous-frame RGB images;
S2022, obtaining a reduced camera matrix by utilizing the Schur complement of the depth arrow-shaped block-sparse Hessian matrix according to the dense optical flow and the optical flow weights;
S2023, obtaining a first pose according to the Cholesky decomposition of the camera matrix;
S2024, obtaining depth according to the first pose;
S2025, performing block division on the Hessian matrix to obtain a block division result; the expression of the block division result is:

$$H\,\Delta x = b, \qquad H = \begin{bmatrix} C & E \\ E^{T} & P \end{bmatrix}, \quad \Delta x = \begin{bmatrix} \Delta\xi \\ \Delta d \end{bmatrix}, \quad b = \begin{bmatrix} v \\ w \end{bmatrix}$$

wherein H is the Hessian matrix; C is the block camera matrix; E is the camera/depth off-diagonal Hessian block and E^{T} is its transpose; P is the diagonal matrix corresponding to the per-pixel depths of the key frames; \Delta\xi is the incremental update on the Lie algebra of the camera poses in the SE(3) group; \Delta d is the incremental update of the per-pixel inverse depths; v is the pose residual; w is the depth residual; and b is the residual vector;
S2026, obtaining the depth marginal covariance and the pose marginal covariance according to the block division result; the expressions of the depth marginal covariance and the pose marginal covariance are as follows:

$$\Sigma_{d} = P^{-1} + P^{-T} E^{T}\,\Sigma_{T}\,E\,P^{-1}$$

$$\Sigma_{T} = \left(L L^{T}\right)^{-1}$$

wherein \Sigma_{d} is the depth marginal covariance; P^{-1} is the inverse of the diagonal matrix corresponding to the per-pixel inverse depths of the key frames and P^{-T} is its inverse transpose; \Sigma_{T} is the pose marginal covariance; L is the lower-triangular Cholesky factor and L^{T} is its transpose;
S2027, obtaining the improved NeRF-SLAM model according to the first pose, the depth, the pose marginal covariance, the depth marginal covariance and the continuous-frame RGB images.
6. The obstacle avoidance method of the heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on the NeRF neural network according to claim 5, characterized in that the step S203 specifically comprises:
S2031, acquiring the camera pose through the IMU inertial navigation module, and obtaining a second pose by utilizing the pose solving subunit of the NeRF-SLAM neural radiance field unit;
S2032, performing high-frequency encoding through the calculation and position encoding module according to the first pose, the depth, the pose marginal covariance, the depth marginal covariance and the second pose to obtain the scene position and the observation direction; the expression of the high-frequency encoding is:

$$\gamma(p) = \left(\sin(2^{0}\pi p), \cos(2^{0}\pi p), \ldots, \sin(2^{L-1}\pi p), \cos(2^{L-1}\pi p)\right)$$

wherein \gamma(\cdot) is the high-frequency encoding function; p is the position vector; \sin(\cdot) is the sine function; and \cos(\cdot) is the cosine function;
S2033, obtaining the voxel density and the feature vector by using the first single hidden layer feedforward neural network SLFNN1 according to the scene position;
S2034, obtaining the color by using the second single hidden layer feedforward neural network SLFNN2 according to the observation direction and the feature vector;
S2035, obtaining the voxel output through the volume rendering module according to the voxel density and the color; the expression of the voxel output is:

$$C(r) = \int_{t_{n}}^{t_{f}} T(t)\,\sigma(r(t))\,c(r(t), d)\,dt, \qquad T(t) = \exp\!\left(-\int_{t_{n}}^{t}\sigma(r(s))\,ds\right)$$

wherein C(r) is the voxel output; T(t) is the accumulated transmittance of the ray from t_{n} to t, i.e. the probability that the ray travels from t_{n} to t without hitting any particle; \sigma(\cdot) is the voxel density function; c(\cdot,\cdot) is the color function; r is the corresponding ray; d is the direction vector; t_{n} is the near boundary; t_{f} is the far boundary; t is the distance from a sample point on the camera ray to the camera optical center; and s is the position along the ray;
S2036, combining the voxel output and the GRU convolution module output, and obtaining the mapping result by utilizing the NeRF-SLAM mapping subunit of the NeRF-SLAM neural radiance field unit through local mapping, obstacle region judgment and occupancy grid mapping.
7. The obstacle avoidance method of the heterogeneous unmanned aerial vehicle vision obstacle avoidance system based on the NeRF neural network according to claim 6, characterized in that the loss function of the mapping result is:

$$\mathcal{L}(T, \Theta) = \mathcal{L}_{rgb}(T, \Theta) + \lambda_{D}\,\mathcal{L}_{D}(T, \Theta)$$

$$\mathcal{L}_{rgb}(T, \Theta) = \left\lVert I - \hat{I}(T, \Theta) \right\rVert^{2}$$

$$\mathcal{L}_{D}(T, \Theta) = \left\lVert D - \hat{D}(T, \Theta) \right\rVert_{\Sigma_{D}}$$

wherein \mathcal{L}(T, \Theta) is the loss function of the mapping result; T is the first pose; \Theta are the neural parameters; \mathcal{L}_{rgb} is the color loss function; I is the color image to be rendered; \hat{I} is the rendered color image; \lambda_{D} is a hyperparameter for balancing depth and color supervision; \mathcal{L}_{D} is the depth loss function; D is the depth; \Sigma_{D} is the uncertainty marginal covariance; and \hat{D} is the rendered depth.
CN202310147930.9A 2023-02-22 2023-02-22 Heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on NeRF neural network Active CN115826628B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310147930.9A CN115826628B (en) 2023-02-22 2023-02-22 Heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on NeRF neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310147930.9A CN115826628B (en) 2023-02-22 2023-02-22 Heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on NeRF neural network

Publications (2)

Publication Number Publication Date
CN115826628A (en) 2023-03-21
CN115826628B (en) 2023-05-09

Family

ID=85522084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310147930.9A Active CN115826628B (en) 2023-02-22 2023-02-22 Heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on NeRF neural network

Country Status (1)

Country Link
CN (1) CN115826628B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10618673B2 (en) * 2016-04-15 2020-04-14 Massachusetts Institute Of Technology Systems and methods for dynamic planning and operation of autonomous systems using image observation and information theory
US11827352B2 (en) * 2020-05-07 2023-11-28 Skydio, Inc. Visual observer for unmanned aerial vehicles

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105059533A (en) * 2015-08-14 2015-11-18 深圳市多翼创新科技有限公司 Aircraft and landing method thereof
CN105759836A (en) * 2016-03-14 2016-07-13 武汉卓拔科技有限公司 Unmanned aerial vehicle obstacle avoidance method and device based on 3D camera
CN106168808A (en) * 2016-08-25 2016-11-30 南京邮电大学 A kind of rotor wing unmanned aerial vehicle automatic cruising method based on degree of depth study and system thereof
CN207157509U (en) * 2017-01-24 2018-03-30 刘畅 A kind of unmanned plane for gunnery training and amusement
CN107553490A (en) * 2017-09-08 2018-01-09 深圳市唯特视科技有限公司 A kind of monocular vision barrier-avoiding method based on deep learning
CN108875912A (en) * 2018-05-29 2018-11-23 天津科技大学 A kind of neural network model for image recognition
CN111540011A (en) * 2019-02-06 2020-08-14 福特全球技术公司 Hybrid metric-topology camera based positioning
CN110456805A (en) * 2019-06-24 2019-11-15 深圳慈航无人智能系统技术有限公司 A kind of UAV Intelligent tracking flight system and method
CN110873565A (en) * 2019-11-21 2020-03-10 北京航空航天大学 Unmanned aerial vehicle real-time path planning method for urban scene reconstruction
CN111831010A (en) * 2020-07-15 2020-10-27 武汉大学 Unmanned aerial vehicle obstacle avoidance flight method based on digital space slice
CN112435325A (en) * 2020-09-29 2021-03-02 北京航空航天大学 VI-SLAM and depth estimation network-based unmanned aerial vehicle scene density reconstruction method
CN113192182A (en) * 2021-04-29 2021-07-30 山东产研信息与人工智能融合研究院有限公司 Multi-sensor-based live-action reconstruction method and system
CN113405547A (en) * 2021-05-21 2021-09-17 杭州电子科技大学 Unmanned aerial vehicle navigation method based on semantic VSLAM
CN114170459A (en) * 2021-11-23 2022-03-11 杭州弥拓科技有限公司 Dynamic building identification guiding method and system based on Internet of things and electronic equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Chen-Hsuan Lin; Wei-Chiu Ma; Antonio Torralba; Simon Lucey. BARF: Bundle-Adjusting Neural Radiance Fields. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 5721-5731. *
Xu Fengfan. 3D Modeling Method Based on Neural Radiance Field Optimization. China Master's Theses Full-text Database, Information Science and Technology, 2023, pp. I138-1494. *
Nie Xinyu. Research on Visual SLAM Algorithms Based on Deep Learning. China Master's Theses Full-text Database, Information Science and Technology, 2023, pp. I138-1790. *

Also Published As

Publication number Publication date
CN115826628A (en) 2023-03-21

Similar Documents

Publication Publication Date Title
Ming et al. Deep learning for monocular depth estimation: A review
JP6745328B2 (en) Method and apparatus for recovering point cloud data
US10929654B2 (en) Three-dimensional (3D) pose estimation from a monocular camera
CN110827415B (en) All-weather unknown environment unmanned autonomous working platform
CN109087329B (en) Human body three-dimensional joint point estimation framework based on depth network and positioning method thereof
CN111898635A (en) Neural network training method, data acquisition method and device
CN111902826A (en) Positioning, mapping and network training
CN111311685A (en) Motion scene reconstruction unsupervised method based on IMU/monocular image
Zhang et al. Progressive hard-mining network for monocular depth estimation
CN115855022A (en) Performing autonomous path navigation using deep neural networks
Liu et al. Unsupervised monocular visual odometry based on confidence evaluation
CN114586078A (en) Hand posture estimation method, device, equipment and computer storage medium
CN112258565B (en) Image processing method and device
Xiong et al. THE benchmark: Transferable representation learning for monocular height estimation
Hwang et al. Lidar depth completion using color-embedded information via knowledge distillation
Huang et al. Non-model-based monocular pose estimation network for uncooperative spacecraft using convolutional neural network
Lin et al. Efficient and high-quality monocular depth estimation via gated multi-scale network
Wei et al. Triaxial squeeze attention module and mutual-exclusion loss based unsupervised monocular depth estimation
CN115826628B (en) Heterogeneous unmanned aerial vehicle vision obstacle avoidance system and method based on NeRF neural network
Zhu et al. Autonomous reinforcement control of visual underwater vehicles: Real-time experiments using computer vision
CN116079727A (en) Humanoid robot motion simulation method and device based on 3D human body posture estimation
Tang et al. Encoder-decoder structure with the feature pyramid for depth estimation from a single image
Wu et al. Object detection and localization using stereo cameras
Kim et al. Bayesian Fusion Inspired 3D Reconstruction via LiDAR-Stereo Camera Pair
Cai et al. Semantic reconstruction based on RGB image and sparse depth

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant