CN117132973B - Method and system for reconstructing and enhancing visualization of surface environment of extraterrestrial planet - Google Patents

Method and system for reconstructing and enhancing visualization of surface environment of extraterrestrial planet

Info

Publication number
CN117132973B
CN117132973B
Authority
CN
China
Prior art keywords
image
parallax
enhanced
visible light
obstacle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311403367.3A
Other languages
Chinese (zh)
Other versions
CN117132973A (en)
Inventor
陈驰
金昂
毕杰皓
杨必胜
应申
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202311403367.3A priority Critical patent/CN117132973B/en
Publication of CN117132973A publication Critical patent/CN117132973A/en
Application granted granted Critical
Publication of CN117132973B publication Critical patent/CN117132973B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05Geographic models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/768Arrangements for image or video recognition or understanding using pattern recognition or machine learning using context analysis, e.g. recognition aided by known co-occurring patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/70Labelling scene content, e.g. deriving syntactic or semantic representations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention provides a deep-learning-based method and system for reconstructing and enhancing the visualization of an extraterrestrial planetary surface environment. Obstacles in the reconstructed scene are extracted through CSF filtering and the SAM segmentation large model, and the reconstructed scene is visually enhanced. The method alleviates the unsatisfactory accuracy and visualization quality of reconstruction results caused by the complex landforms, illumination changes and lack of texture on extraterrestrial planetary surfaces, fully considers the Earth-terrain cognition habits of researchers and users, and visually enhances the reconstruction result, thereby achieving faster and more accurate visualization of the extraterrestrial planetary surface environment.

Description

Method and system for reconstructing and enhancing visualization of surface environment of extraterrestrial planet
Technical Field
The invention belongs to the technical field of deep space exploration, and particularly relates to a deep-learning-based method and system for extraterrestrial planetary surface environment reconstruction and enhanced visualization.
Background
Reconstruction of extraterrestrial planetary surface scenes first recovers three-dimensional scene information from the stereo pairs collected by an unmanned rover through a stereo matching algorithm. Depending on the matching primitive, stereo matching algorithms can be broadly divided into region-based matching and feature-based matching. Region-based matching algorithms include the normalized cross-correlation algorithm, the absolute-difference matching algorithm, and Zabih's Rank and Census transforms; feature-based matching first extracts feature points with operators such as the Förstner, SIFT and SURF operators, and then achieves dense matching with strategies such as epipolar constraints, coarse-to-fine schemes and triangular mesh constraints. With deep learning methods fusing computer vision and computer graphics theory, researchers have proposed the convolutional-neural-network-based MC-CNN-acrt architecture, the encoder-decoder-based DispNet, and the end-to-end network GC-Net. However, although current stereo matching algorithms have made progress in scene reconstruction, they may perform poorly on the complex landforms, illumination changes and texture-poor regions of extraterrestrial planetary surfaces, resulting in unsatisfactory accuracy and visualization of the reconstruction results.
In visualizing the extraterrestrial planetary surface environment, surface obstacles are a unique and important class of ground features. Failure to correctly identify and avoid an obstacle may leave the rover stuck or trapped on a terrain obstacle, or cause other irreversible damage. Accurate extraction of obstacle information is therefore critical to ensuring that the rover travels smoothly. In the field of computer vision, many researchers have achieved accurate extraction of specified targets in images. Object detection methods such as YOLOv8 and EfficientDet can identify and locate different types of targets in a two-dimensional image, and semantic segmentation algorithms such as SegViT and SegFormer can assign each pixel in the image a label of the object it represents, but they are typically limited to analyzing the surface information of the image. On the other hand, there are also many methods for extracting non-ground objects in a three-dimensional scene, such as the point cloud segmentation method PointNet++ and the point cloud filtering method CSF. These methods are suitable for extracting the three-dimensional shape and position of non-ground objects from sensor data, but they typically do not fuse image information and thus ignore the texture and appearance features associated with obstacles. At present, there is a lack of a method that fuses two-dimensional images with three-dimensional point cloud data and is particularly suited to extracting obstacles in the extraterrestrial planetary surface environment.
Disclosure of Invention
Aiming at the problems in the prior art of poor scene reconstruction quality and insufficient visualization of extraterrestrial planetary surface images captured by the navigation camera of a planetary rover, this patent provides a deep-learning-based method and system for extraterrestrial planetary surface environment reconstruction and enhanced visualization, which can reconstruct scenes in real time from collected stereo pair data and enhance their visualization in a manner consistent with human cognitive habits.
In order to solve the technical problems, the invention adopts the following technical scheme:
First, stereo image pair data of the extraterrestrial planetary surface captured by the navigation camera of a planetary rover are acquired, a deep-learning-based stereo matching network is constructed, and the surface environment is reconstructed. Then, based on the reconstructed point cloud scene of the surface environment, the center points of obstacles on the extraterrestrial planetary surface are extracted. Finally, an enhanced visualization method based on the SAM large model is proposed using the center-point keywords, optimizing the visualization of the reconstruction result.
A deep-learning-based extraterrestrial planetary surface environment reconstruction and enhanced visualization method comprises the following steps:
Step 1, obtaining visible light images captured by the navigation camera of a planetary rover, and performing color restoration and color correction on the color stereo pair data of the extraterrestrial planetary surface; constructing a real-time stereo matching scene reconstruction network, and inputting the left and right image pairs into the neural network to obtain the reconstructed scene.
Step 2, after the disparity map is obtained by the stereo matching scene reconstruction network, recovering it to three-dimensional space according to the projection relationship to obtain a point cloud; segmenting extraterrestrial planetary surface obstacles such as stones using a ground filtering algorithm, and re-projecting them onto the two-dimensional image to obtain two-dimensional center-point keywords.
Step 3, using the center-point keywords as prompts for the SAM large model and segmenting the original visible light image to obtain the stone regions; enhancing the visualization of the visible light image according to human cognition habits of Earth terrain and the requirements of deep space exploration tasks, obtaining an enhanced-visualization extraterrestrial planetary surface scene graph.
Further, the stereo matching network in step 1 specifically comprises a multi-scale feature extractor, a combined geometric encoding volume, a ConvGRU-based update operator and a spatial upsampling module.
Further, the specific implementation of the step 2 includes the following sub-steps:
Step 2.1, obtaining the disparity map from the stereo matching scene reconstruction network, and calculating a depth map from the disparity map according to the projection relationship. Each pixel in the depth map represents the depth of that point. The calculation formula is as follows:

Z(u, v) = f · B / d(u, v)

where Z(u, v) is the depth value at pixel (u, v) in the depth map, f is the focal length of the camera, B is the baseline length, and d(u, v) is the disparity value at pixel (u, v) in the disparity map.
Step 2.2, for each pixel (u, v) in the depth map, converting it into three-dimensional coordinates (X, Y, Z) according to its depth value Z(u, v), and adding it to the point cloud data structure to generate a three-dimensional point cloud. The conversion formulas are as follows:

X = (u − c_x) · Z(u, v) / f,  Y = (v − c_y) · Z(u, v) / f,  Z = Z(u, v)

where (c_x, c_y) are the coordinates of the image center and f is the focal length of the camera.
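As a minimal NumPy sketch of steps 2.1 and 2.2 under the pinhole-camera assumptions above (the function and variable names disparity_to_pointcloud, f, B, cx, cy are illustrative and not taken from the patent):

import numpy as np

def disparity_to_pointcloud(disp, f, B, cx, cy, min_disp=1e-6):
    """Convert a disparity map (H x W) to an (N, 3) point cloud.

    disp: disparity in pixels; f: focal length in pixels;
    B: stereo baseline in metres; (cx, cy): image center (principal point).
    """
    H, W = disp.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    valid = disp > min_disp                 # ignore zero/negative disparities
    Z = f * B / disp[valid]                 # step 2.1: depth from disparity
    X = (u[valid] - cx) * Z / f             # step 2.2: back-project to 3D
    Y = (v[valid] - cy) * Z / f
    return np.stack([X, Y, Z], axis=-1)     # (N, 3) point cloud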
Step 2.3, filtering the ground point cloud using the CSF ground filtering algorithm. CSF is an airborne LiDAR filtering method based on cloth simulation: the interaction between the cloth nodes and the corresponding points is simulated, and the positions of the cloth nodes determine an approximation of the ground surface. Ground points and non-ground points are then extracted from the point cloud by comparing the original points with the generated surface.
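A short sketch of step 2.3, assuming the open-source cloth-simulation-filter Python bindings of CSF are used; the parameter values below (cloth resolution, rigidness, threshold) are illustrative assumptions, not values prescribed by the patent:

import numpy as np
import CSF  # pip install cloth-simulation-filter

def split_ground(points_xyz):
    """Split an (N, 3) point cloud into ground / non-ground index arrays with CSF."""
    csf = CSF.CSF()
    csf.params.bSloopSmooth = True      # smooth slopes of the simulated cloth
    csf.params.cloth_resolution = 0.1   # metres, tune to scene scale
    csf.params.rigidness = 2            # cloth stiffness
    csf.params.class_threshold = 0.05   # point-to-cloth distance threshold
    csf.setPointCloud(points_xyz.tolist())
    ground, non_ground = CSF.VecInt(), CSF.VecInt()
    csf.do_filtering(ground, non_ground)
    return np.array(ground), np.array(non_ground)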
Step 2.4, clustering the extracted non-ground point cloud using a Euclidean clustering algorithm and calculating the center point of each cluster. The ground-obstacle center point (X, Y, Z) obtained by clustering in the point cloud is mapped back to image space (u, v), i.e. the point is re-projected onto the image. This process is implemented by the following formulas:

u = f · X / Z + c_x,  v = f · Y / Z + c_y

where f is the focal length of the camera and (c_x, c_y) are the coordinates of the image center.
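For step 2.4, the sketch below uses DBSCAN from scikit-learn as a stand-in for Euclidean clustering (both group points by a Euclidean distance threshold) and re-projects each cluster center with the formulas above; the eps/min_samples values and camera parameters are assumed:

import numpy as np
from sklearn.cluster import DBSCAN

def obstacle_center_keywords(non_ground_xyz, f, cx, cy, eps=0.15, min_samples=30):
    """Cluster non-ground points and re-project cluster centers to pixel keywords."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(non_ground_xyz)
    keywords = []
    for lab in set(labels) - {-1}:                            # -1 is DBSCAN noise
        X, Y, Z = non_ground_xyz[labels == lab].mean(axis=0)  # cluster center point
        u = f * X / Z + cx                                    # re-projection to image space
        v = f * Y / Z + cy
        keywords.append((u, v))
    return np.array(keywords)                                 # 2D center-point keywords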
Further, the specific implementation of the step 3 includes the following sub-steps:
further, constructing the image SAM big model comprises a lightweight image encoder, a mask decoder and a keyword decoder, which mainly comprise three parts, viT-Tiny, respectively.
Preferably, the enhanced visualization of the visible light image is specifically as follows:
obtaining the disparity corresponding to each pixel in the left image output by the network, and enhancing the predicted disparity d with an exponential function to enlarge the differences between nearby objects, obtaining an enhanced disparity map, where the formula is as follows:

d' = d^m

where d is the original disparity, d' is the enhanced disparity, and m is the index value of the exponential function. Experimental statistics show that a suitable fixed value of the index m gives the best effect.
The enhanced disparity map is then visualized to generate the enhanced visualization result. The disparity map is converted from an array into an RGB image using a color mapping and superimposed on the original image to obtain the enhanced-visualization disparity map. Jet mapping can be adopted as the color mapping. The superposition process is as follows:

I_out = α · I_left + β · D'_JET + b

where I_out is the output result image, I_left is the original left image, D'_JET is the RGB image generated from the enhanced disparity d' through JET mapping, α and β are the weights controlling the fusion of the two images, usually between 0 and 1 with α + β = 1, and b is a bias value.
The segmented ground obstacles are then displayed with enhancement on the enhanced-visualization disparity map: the ground-obstacle mask obtained by segmentation from the keyword center points is superimposed on the visualized disparity map. Unlike the previous step, because ground obstacles are a potentially fatal hazard to extraterrestrial planetary rovers, the stripes generated for the mask region are overlaid in the enhanced image by direct replacement rather than blending.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention provides an external planet surface environment reconstruction and enhancement visualization method based on deep learning by taking image data acquired by a binocular navigation camera of a probe vehicle as a research object and aiming at the data characteristics of the image data. And combining with the deep learning neural network, realizing real-time analysis and processing of the acquired image and data, and reconstructing the environmental information. And a segmentation method based on a SAM large model is designed by combining scene reconstruction information, and the obstacles on the surface of the extraterrestrial planet are extracted. And fusing scene reconstruction information and obstacle extraction information to visually enhance the original stereoscopic image pair. The method can better aim at the characteristics of the topography and the topography of the planet outside the earth, fully consider the cognition habit of the topography and the topography of the earth of researchers and users, and carry out visual enhancement and obstacle marking on the reconstruction result, thereby realizing the faster and more accurate visualization of the surface environment of the planet outside the earth.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
Fig. 2 is a diagram of the stereo matching neural network in an embodiment of the present invention.
FIG. 3 is a diagram of a network architecture of a SAM large model in an embodiment of the present invention.
FIG. 4 is a flow chart of visualization enhancement in an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is described below with reference to the accompanying drawings and examples.
Example 1
In this embodiment, image data collected by the binocular navigation camera of the Chang'e-3 lunar rover are selected as the research object to reconstruct and enhance the visualization of the extraterrestrial planetary surface environment and to explain the proposed method in detail. Referring to Fig. 1, an embodiment of the present invention comprises the following steps:
Step 1, obtaining visible light images captured by the navigation camera of a planetary rover, and performing color restoration and color correction on the color stereo pair data of the extraterrestrial planetary surface; constructing a real-time stereo matching scene reconstruction network, and inputting the left and right image pairs into the neural network to obtain the reconstructed scene.
Step 2, after the disparity map is obtained by the stereo matching scene reconstruction network, recovering it to three-dimensional space according to the projection relationship to obtain a point cloud; segmenting extraterrestrial planetary surface obstacles such as stones using a ground filtering algorithm, and re-projecting them onto the two-dimensional image to obtain two-dimensional center-point keywords.
Step 3, using the center-point keywords as prompts for the SAM large model and segmenting the original visible light image to obtain the stone regions; enhancing the visualization of the visible light image according to human cognition habits of Earth terrain and the requirements of deep space exploration tasks, obtaining an enhanced-visualization extraterrestrial planetary surface scene graph.
Further, the stereo matching network in step 1 specifically comprises a multi-scale feature extractor, a combined geometric encoding volume, a ConvGRU-based update operator and a spatial upsampling module.
Further, the feature extractor in step 1 includes two parts: a feature extraction network and a context network. In the feature extraction network, for the initial left and right image pair, the height, width and number of channels are denoted H, W and C respectively, so the original feature of a single image has dimensions C × H × W, where C is normally 3. First, the original left and right image pair is downsampled to 1/32 of the original size using a MobileNetV2 network pre-trained on the ImageNet dataset, and then restored to 1/4 of the original size using an upsampling module with skip connections, thereby obtaining multi-scale features at 1/4, 1/8, 1/16 and 1/32 of the input resolution. In the context network, the original left and right image pair is fed into a network structure like that of RAFT-Stereo, which consists of a series of residual blocks and downsampling layers, producing multi-scale context features at 1/4, 1/8 and 1/16 of the input image resolution with 128 channels.
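As a rough PyTorch sketch of this kind of multi-scale backbone (not the patented network itself), a pretrained MobileNetV2 can be obtained through the timm library; the layer choices, channel counts and the structure of the skip-connected upsampling head below are assumptions made for illustration:

import torch
import torch.nn as nn
import timm

class MultiScaleFeatureExtractor(nn.Module):
    """ImageNet-pretrained MobileNetV2 backbone returning 1/4 .. 1/32 features."""
    def __init__(self):
        super().__init__()
        # features_only=True yields feature maps at strides 2, 4, 8, 16, 32
        self.backbone = timm.create_model(
            "mobilenetv2_100", pretrained=True, features_only=True)
        chs = self.backbone.feature_info.channels()   # e.g. [16, 24, 32, 96, 320]
        # simple skip-connected upsampling head back to 1/4 resolution
        self.up = nn.ModuleList([
            nn.Conv2d(chs[i] + chs[i - 1], chs[i - 1], 3, padding=1)
            for i in range(len(chs) - 1, 1, -1)])

    def forward(self, img):
        feats = self.backbone(img)            # strides 2, 4, 8, 16, 32
        x = feats[-1]
        pyramid = [x]                         # collect 1/32, 1/16, 1/8, 1/4
        for i, conv in zip(range(len(feats) - 1, 1, -1), self.up):
            x = nn.functional.interpolate(x, size=feats[i - 1].shape[-2:],
                                          mode="bilinear", align_corners=False)
            x = torch.relu(conv(torch.cat([x, feats[i - 1]], dim=1)))
            pyramid.append(x)
        return pyramid[::-1]                  # [1/4, 1/8, 1/16, 1/32]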
Further, for the combined geometric encoding volume in step 1, the left and right multi-scale features generated by the feature extractor are taken as input, split into N_g (N_g = 8) groups along the channel dimension, and the correlation is calculated group by group as in equation (1):

C_corr(g, d, x, y) = (N_g / N_c) · ⟨ f_l^g(x, y), f_r^g(x − d, y) ⟩        (1)

where ⟨·,·⟩ in equation (1) denotes the inner product, d is the disparity index, and N_c is the number of feature channels. The correlation volume is further processed by a 3D regularization network R to obtain the geometric encoding volume. The 3D regularization network R is based on a lightweight 3D-UNet consisting of three downsampling blocks and three upsampling blocks. Each downsampling block consists of two 3 × 3 × 3 3D convolutions, and the numbers of channels of the three downsampling blocks are 16, 32 and 48 respectively. Each upsampling block consists of a 4 × 4 × 4 3D transposed convolution and two 3 × 3 × 3 3D convolutions. As in CoEx, the matching costs in stereo matching are excited for cost aggregation using weights calculated from the features of the left image. For a matching cost C in the cost aggregation, guided by the left-image feature F_l of matching dimension, the guided matching-cost excitation is expressed as equation (2):

C' = σ(F_l) ⊙ C        (2)

where σ is the sigmoid function and ⊙ denotes the Hadamard product. Subsequently, the disparity dimension is pooled using one-dimensional average pooling with kernel size 2 and stride 2 to form a two-level geometric encoding volume pyramid, which is combined with the all-pairs correlation cost pyramid to form the combined geometric encoding volume.
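A minimal PyTorch sketch of the group-wise correlation in equation (1), assuming left/right feature maps at 1/4 resolution; the max_disp value and tensor shapes are illustrative:

import torch

def groupwise_correlation_volume(f_left, f_right, max_disp, num_groups=8):
    """Build a (B, G, D, H, W) group-wise correlation volume as in eq. (1)."""
    B, C, H, W = f_left.shape
    ch_per_group = C // num_groups
    volume = f_left.new_zeros(B, num_groups, max_disp, H, W)
    for d in range(max_disp):
        if d == 0:
            prod = f_left * f_right
        else:
            prod = torch.zeros_like(f_left)
            prod[..., d:] = f_left[..., d:] * f_right[..., :-d]
        # mean over the channels of each group == <f_l^g, f_r^g> / (N_c / N_g)
        volume[:, :, d] = prod.view(B, num_groups, ch_per_group, H, W).mean(dim=2)
    return volume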
Further, for the ConvGRU-based update operator in step 1, the initial disparity d_0 is regressed from the geometric encoding volume C_G according to equation (3):

d_0 = Σ_{d ∈ D} d · softmax(C_G(d))        (3)

where D is the set of pre-defined disparity indices at 1/4 resolution. Starting from d_0, the disparity is iteratively updated using three levels of ConvGRU. The hidden states of the three-level ConvGRU are initialized from the multi-scale context features generated by the context network. For each iteration, the current disparity d_k is used to index the combined geometric encoding volume by linear interpolation, generating a set of geometric features G_f. The calculation formula (4) of G_f is:

G_f = { C_G(d_k + i), C_corr(d_k + i) : −r ≤ i ≤ r }        (4)

where d_k is the current disparity, r is the indexing radius, and the indexing is also applied to the pooled levels of the volume pyramid. These geometric features and the current disparity prediction d_k are passed through two encoder layers and then concatenated with d_k to form x_k. The hidden state h_k is then updated as in equation (5):

z_k = σ(Conv([h_{k−1}, x_k], W_z) + c_z)
r_k = σ(Conv([h_{k−1}, x_k], W_r) + c_r)
h̃_k = tanh(Conv([r_k ⊙ h_{k−1}, x_k], W_h) + c_h)        (5)
h_k = (1 − z_k) ⊙ h_{k−1} + z_k ⊙ h̃_k
where c_z, c_r and c_h are context features generated from the context network, Conv denotes the convolution operation, z_k is the update gate of the hidden state, and W_z, W_r, W_h are the convolution weights. The number of channels of the ConvGRU hidden state is 128, and the number of channels of the context features is also 128. The two encoders each consist of two convolutional layers. The hidden state h_k is decoded into a residual disparity Δd_k by two convolutional layers, and the current disparity d_k is then updated; the updated disparity d_{k+1} is expressed as formula (6):

d_{k+1} = d_k + Δd_k        (6)
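A compact PyTorch sketch of one ConvGRU update of equations (5)-(6); the kernel size is an illustrative choice, and the context features c_z, c_r, c_h are assumed to be precomputed by the context network:

import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """One ConvGRU step: gates computed from [h, x] plus context injections."""
    def __init__(self, hidden_ch=128, in_ch=128, k=3):
        super().__init__()
        pad = k // 2
        self.convz = nn.Conv2d(hidden_ch + in_ch, hidden_ch, k, padding=pad)
        self.convr = nn.Conv2d(hidden_ch + in_ch, hidden_ch, k, padding=pad)
        self.convh = nn.Conv2d(hidden_ch + in_ch, hidden_ch, k, padding=pad)

    def forward(self, h, x, cz, cr, ch):
        hx = torch.cat([h, x], dim=1)
        z = torch.sigmoid(self.convz(hx) + cz)                    # update gate
        r = torch.sigmoid(self.convr(hx) + cr)                    # reset gate
        h_tilde = torch.tanh(self.convh(torch.cat([r * h, x], dim=1)) + ch)
        return (1 - z) * h + z * h_tilde                          # eq. (5)

# eq. (6): d_{k+1} = d_k + delta_d_k, where delta_d_k is decoded from the
# hidden state by two convolutional layers (not shown in this sketch).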
further, the spatial upsampling module in step 1 predicts the view at 1/4 resolutionDifference of differenceThe weighted combination of (2) outputs a full resolution disparity map. The hidden states are convolved to generate features, which are then upsampled to 1/2 resolution. Up-sampled features and features in left image +.>The connection produces a weight W of 9×h×w in dimension. The full resolution disparity is output by a weighted combination of coarse resolution neighbors.
Further, the loss in the stereo matching network in step 1 includes a Smooth L1 loss on the initial disparity d_0 regressed from the GEV, as in formula (7):

L_init = SmoothL1(d_0 − d_gt)        (7)

where d_gt denotes the true disparity. The L1 losses of all N predicted disparities d_k are calculated with exponentially increasing weights, and the total loss is defined as shown in equation (8):

L = L_init + Σ_{k=1}^{N} γ^{N−k} · ‖d_k − d_gt‖_1        (8)

where γ = 0.9 and d_gt denotes the true disparity.
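The training loss of equations (7)-(8) can be written as the following PyTorch sketch; the valid-pixel masking and the max_disp value are assumptions about the training setup, not details given in the patent:

import torch
import torch.nn.functional as F

def stereo_loss(d_init, d_preds, d_gt, gamma=0.9, max_disp=192):
    """Smooth L1 on the initial disparity plus exponentially weighted L1 terms."""
    valid = (d_gt > 0) & (d_gt < max_disp)              # mask invalid ground truth
    loss = F.smooth_l1_loss(d_init[valid], d_gt[valid])                 # eq. (7)
    N = len(d_preds)
    for k, d_k in enumerate(d_preds, start=1):                          # eq. (8)
        loss = loss + gamma ** (N - k) * (d_k[valid] - d_gt[valid]).abs().mean()
    return loss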
Further, the specific implementation of the step 2 includes the following sub-steps:
Step 2.1, obtaining the disparity map from the stereo matching scene reconstruction network, and calculating a depth map from the disparity map according to the projection relationship. Each pixel in the depth map represents the depth of that point. The calculation formula is as follows:

Z(u, v) = f · B / d(u, v)

where Z(u, v) is the depth value at pixel (u, v) in the depth map, f is the focal length of the camera, B is the baseline length, and d(u, v) is the disparity value at pixel (u, v) in the disparity map.
Step 2.2, for each pixel (u, v) in the depth map, converting it into three-dimensional coordinates (X, Y, Z) according to its depth value Z(u, v), and adding it to the point cloud data structure to generate a three-dimensional point cloud. The conversion formulas are as follows:

X = (u − c_x) · Z(u, v) / f,  Y = (v − c_y) · Z(u, v) / f,  Z = Z(u, v)

where (c_x, c_y) are the coordinates of the image center and f is the focal length of the camera.
Step 2.3, filtering the ground point cloud using the CSF ground filtering algorithm. CSF is an airborne LiDAR filtering method based on cloth simulation: the interaction between the cloth nodes and the corresponding points is simulated, and the positions of the cloth nodes determine an approximation of the ground surface. Ground points and non-ground points are then extracted from the point cloud by comparing the original points with the generated surface.
Step 2.4, clustering the extracted non-ground point cloud using a Euclidean clustering algorithm and calculating the center point of each cluster. The ground-obstacle center point (X, Y, Z) obtained by clustering in the point cloud is mapped back to image space (u, v), i.e. the point is re-projected onto the image. This process is implemented by the following formulas:

u = f · X / Z + c_x,  v = f · Y / Z + c_y

where f is the focal length of the camera and (c_x, c_y) are the coordinates of the image center.
Further, the specific implementation of the step 3 includes the following sub-steps:
step 3.1, constructing an image SAM big model comprises a lightweight image encoder, a mask decoder and a keyword decoder, which mainly comprise three parts, viT-Tiny respectively.
The lightweight image encoder consists of four stages with progressively decreasing resolution. The first stage consists of convolution blocks with inverted residuals, while the remaining three stages consist of transformer modules. At the beginning of the model there are two convolution blocks with stride 2 for downsampling the resolution. Downsampling between the different stages is handled by convolution blocks with stride 2, and the stride of the last downsampling convolution is set to 1 so that the final resolution matches that of ViT.
The keyword decoder encodes the ground-obstacle center points generated by the projection. The position encoding of each point is obtained first, and a learned one-dimensional vector feature is then generated according to whether the point is foreground or background. The position encoding and the feature are fused to obtain the keyword feature of the point.
Since the mask decoder in the original SAM is already lightweight, this patent adopts its decoder architecture. The mask decoder efficiently maps the image encoding features, the center-point keyword prompt features and the output tokens to a mask. The decoder is modified from transformer decoder blocks, with a dynamic mask prediction head added after the decoder. The decoder applies self-attention and cross-attention to the prompts. After the image embedding is upsampled, an MLP maps the output tokens to a dynamic linear classifier, and finally the ground obstacles in the image are segmented.
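The center-point prompting of step 3 can be reproduced with the public segment-anything API as sketched below; the patent uses its own lightweight ViT-Tiny variant, so the "vit_b" model type and the checkpoint path here are stand-ins, not the patented model:

import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def segment_obstacles(image_rgb, center_points, ckpt="sam_vit_b.pth"):
    """Segment obstacle regions from 2D center-point keywords with SAM."""
    sam = sam_model_registry["vit_b"](checkpoint=ckpt)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)                    # H x W x 3 uint8 RGB image
    masks = []
    for (u, v) in center_points:
        m, scores, _ = predictor.predict(
            point_coords=np.array([[u, v]], dtype=np.float32),
            point_labels=np.array([1]),               # 1 = foreground point prompt
            multimask_output=False)
        masks.append(m[0])                            # boolean H x W mask
    return np.any(masks, axis=0) if masks else np.zeros(image_rgb.shape[:2], bool)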
Step 3.2, obtaining the disparity corresponding to each pixel in the left image output by the network, and enhancing the predicted disparity d with an exponential function to enlarge the differences between nearby objects, as in formula (12):

d' = d^m        (12)

where d is the original disparity, d' is the enhanced disparity, and m is the index value of the exponential function. Experimental statistics show that a suitable fixed value of the index m gives the best effect.
Step 3.3, visualizing the enhanced disparity map to generate the enhanced visualization result. The disparity map is converted from an array into an RGB image using a color mapping and superimposed on the original image. Empirically, Jet mapping can be adopted as the color mapping: compared with other color mappings, its red-yellow-blue color distribution matches human cognition better, with nearby red marking dangerous areas that require attention and distant areas appearing in darker blue. This also matches the operating rules of extraterrestrial planetary rovers, which must pay attention to whether the surrounding environment contains large slopes and whether obstacles are passable. The superposition process is as in equation (13):
I_out = α · I_left + β · D'_JET + b        (13)

where I_out is the output result image, I_left is the original left image, D'_JET is the RGB image generated from the enhanced disparity d' through JET mapping, α and β are the weights controlling the fusion of the two images, usually between 0 and 1 with α + β = 1, and b is a bias value. This patent uses empirical values for α and β. The formula is applied to an RGB image, with the same weight applied to the RGB channels of each pixel.
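Steps 3.2-3.3 correspond to the following OpenCV sketch; the exponent m and the blending weights alpha, beta and bias are placeholders for the empirical values referenced above, not values disclosed by the patent:

import cv2
import numpy as np

def enhanced_disparity_overlay(disp, left_bgr, m=2.0, alpha=0.6, beta=0.4, bias=0):
    """Exponentially enhance disparity, JET-colorize it and blend with the left image."""
    d_enh = np.power(disp.astype(np.float32), m)          # eq. (12): d' = d**m
    d_norm = cv2.normalize(d_enh, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    d_jet = cv2.applyColorMap(d_norm, cv2.COLORMAP_JET)   # array -> 3-channel JET image
    # eq. (13): I_out = alpha * I_left + beta * D'_jet + b, same weight per channel
    return cv2.addWeighted(left_bgr, alpha, d_jet, beta, bias)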
Step 3.4, displaying the segmented ground obstacles with enhancement on the enhanced-visualization disparity map. Since the JET color mapping already contains common colors such as red, yellow and blue, and red, which is usually used to mark hazard information, already appears in the map, this patent superimposes the ground-obstacle mask segmented from the keyword center points onto the enhanced-visualization disparity map as red stripes. Unlike the previous step, because ground obstacles are a potentially fatal hazard to extraterrestrial planetary rovers, the stripes generated for the mask region are overlaid in the enhanced image by direct replacement rather than blending.
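Step 3.4 can be sketched as follows: a diagonal stripe pattern is generated and written into the blended image by direct replacement wherever the obstacle mask is set. The red color is taken from the patent; the stripe period and orientation are illustrative choices:

import numpy as np

def overlay_hazard_stripes(image_bgr, mask, period=12, color=(0, 0, 255)):
    """Replace pixels of a boolean H x W obstacle mask with red diagonal stripes (BGR)."""
    h, w = mask.shape
    yy, xx = np.mgrid[0:h, 0:w]
    stripes = ((xx + yy) // (period // 2)) % 2 == 0       # diagonal stripe pattern
    out = image_bgr.copy()
    out[mask & stripes] = color                           # direct replacement, not blending
    return out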
Image data were collected with the binocular navigation camera of the Chang'e-3 lunar rover. After processing by the method, a stereo pair can be processed without human intervention within 2 seconds, yielding the enhanced visualization product, so the method can provide fast and accurate visualization results of the extraterrestrial planetary surface environment in real time. Stereo matching on the ETH3D 3D reconstruction evaluation dataset gives a pixel error rate of the matching result as low as 3.6.
Example two
Based on the same conception, this scheme also designs an extraterrestrial planetary surface environment reconstruction and enhanced visualization system, which comprises a three-dimensional reconstruction module that acquires planetary surface data and reconstructs the planetary surface scene;
a center-point keyword acquisition module that obtains a three-dimensional point cloud from the disparity data of the reconstructed scene, segments obstacles on the planetary ground, and re-projects them onto the two-dimensional image to obtain two-dimensional center-point keywords;
and an enhanced visualization module that uses the center-point keywords as prompts for the SAM large model, segments the original visible light image to obtain stone regions, and performs enhanced visualization on the depth map to obtain an enhanced-visualization extraterrestrial planetary surface scene map.
Because the apparatus described in the second embodiment of the present invention is the extraterrestrial planetary surface environment reconstruction and enhanced visualization system used to implement the method of the present invention, a person skilled in the art can understand the specific structure and variations of the system based on the method described in the first embodiment, so the details are omitted here.
Example III
Based on the same inventive concept, the invention also provides an electronic device comprising one or more processors; a storage means for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in embodiment one.
Because the device described in the third embodiment of the present invention is an electronic device for implementing the extraterrestrial planetary surface environment reconstruction and enhanced visualization method of the first embodiment, a person skilled in the art can understand the specific structure and variations of the electronic device based on the method described in the first embodiment, so the details are omitted here. All electronic devices used to implement the method of the embodiments of the present invention fall within the scope of protection of the present invention.
Example IV
Based on the same inventive concept, the present invention also provides a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the method described in embodiment one.
Because the apparatus described in the fourth embodiment of the present invention is a computer readable medium for implementing the extraterrestrial planetary surface environment reconstruction and enhanced visualization method of the first embodiment, a person skilled in the art can understand its specific structure and variations based on the method described in the first embodiment, so the details are omitted here. All such apparatus used to implement the method of the embodiments of the present invention falls within the scope of protection of the present invention.
The foregoing is a further detailed description of the invention in connection with specific embodiments, and it is not intended that the invention be limited to such description. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (9)

1. A method for reconstructing and enhancing the visualization of an extraterrestrial planetary surface environment, characterized by comprising the following steps:
step 1, acquiring visible light images using the binocular navigation camera of a planetary rover, reconstructing the planetary surface scene, and obtaining a disparity map corresponding to the visible light images;
step 2, acquiring a three-dimensional point cloud from the disparity map obtained by reconstructing the planetary surface scene; segmenting the obtained three-dimensional point cloud to obtain extraterrestrial planetary surface obstacles, re-projecting the obtained obstacles onto the visible light image to obtain the two-dimensional center points of the obstacles, and using the two-dimensional center points as keywords;
step 3, using the keywords as prompts for a SAM large model and segmenting the visible light image to obtain obstacle regions, the obstacle regions comprising stones; enhancing the visualization of the visible light image to obtain an enhanced-visualization extraterrestrial planetary surface scene graph, the specific operations being as follows:
obtaining the disparity corresponding to each pixel in the left image, and enhancing the predicted disparity with an exponential function to obtain an enhanced disparity map;
converting the enhanced disparity map from an array into an RGB image using a color mapping, and superimposing it on the visible light image to obtain an enhanced-visualization disparity map;
and superimposing the mask corresponding to the obstacle region segmented from the visible light image onto the enhanced-visualization disparity map to obtain the final enhanced visual display, wherein the stripes generated for the mask region are overlaid in the enhanced image by direct replacement.
2. The method for reconstructing and enhancing visualization of an extraterrestrial planetary surface environment according to claim 1, wherein:
in step 1, visible light images captured by the navigation camera of the planetary rover are acquired, and color restoration and color correction are performed on the color stereo pair data of the extraterrestrial planetary surface; a real-time stereo matching scene reconstruction network is constructed, and the left and right image pairs are input into the neural network to obtain the reconstructed scene.
3. The method for reconstructing and enhancing the visualization of the surface environment of an extraterrestrial planet according to claim 2, wherein: the stereo matching scene reconstruction network specifically comprises a multi-scale feature extractor, a combined geometric encoding volume, a ConvGRU-based update operator and a spatial upsampling module:
the left and right stereo pair of the visible light images first enters the multi-scale feature extractor, which extracts the individual image features and the combined multi-scale context features of the left and right stereo pair;
the image features of the individual image pair are input into the combined geometric encoding volume to obtain combined features;
the ConvGRU-based update operator then operates on the combined features, iteratively updating them to generate an initial disparity at 1/4 resolution, and the hidden state of the ConvGRU update operator is updated in combination with the combined multi-scale context features;
the spatial upsampling module outputs a full-resolution disparity map by a weighted combination of the iterative disparities generated by the ConvGRU-based update operator.
4. The method for reconstructing and enhancing the visualization of the surface environment of an extraterrestrial planet according to claim 2, wherein the specific process of step 2 is as follows:
step 2.1, after the disparity map is obtained by the stereo matching scene reconstruction network, calculating a depth map from the disparity map according to the projection relationship;
step 2.2, for each pixel in the depth map, converting it into three-dimensional coordinates according to its depth value and adding it to the point cloud data structure to generate a three-dimensional point cloud;
step 2.3, filtering the ground point cloud using the CSF ground filtering algorithm, and then extracting ground points and non-ground points from the point cloud by comparing the original points with the filtered ground point cloud;
and step 2.4, clustering the extracted non-ground point cloud using a Euclidean clustering algorithm, calculating the center point of each cluster, and re-projecting the ground-obstacle center points obtained by clustering in the point cloud onto the image by mapping them back into the two-dimensional visible-light image space.
5. The method for reconstructing and enhancing visualization of an extraterrestrial planetary surface environment according to claim 1, wherein: the SAM large model comprises a ViT-based lightweight image encoder, a mask decoder and a keyword decoder;
the visible light image is first fed into the ViT-based lightweight image encoder to extract image features; the extracted keywords are then input into the keyword decoder to obtain keyword features; the image features and the keyword features are input together into the mask decoder, and the ground-obstacle segmentation result in the image is output through the calculation of the mask decoder.
6. The method for reconstructing and enhancing visualization of an extraterrestrial planetary surface environment according to claim 1, wherein: the color mapping used in the enhanced-visualization disparity map is Jet mapping, and the superposition process is as follows:

I_out = α · I_left + β · D'_JET + b

where I_out is the output result image, I_left is the original left image, D'_JET is the RGB image generated from the enhanced disparity d' through JET mapping, α and β are the weights controlling the fusion of the two images with α + β = 1, and b is a bias value.
7. An extraterrestrial planetary surface environment reconstruction and enhancement visualization system, characterized in that:
the system comprises a three-dimensional reconstruction module, which acquires visible light images using the binocular navigation camera of a planetary rover, reconstructs the planetary surface scene and obtains the disparity map corresponding to the visible light images;
a keyword acquisition module, which acquires a three-dimensional point cloud from the disparity map obtained by reconstructing the planetary surface scene, segments the obtained three-dimensional point cloud to obtain extraterrestrial planetary surface obstacles, re-projects the obtained obstacles onto the visible light image to obtain the two-dimensional center points of the obstacles, and uses the two-dimensional center points as keywords;
and an enhanced visualization module, which uses the keywords as prompts for a SAM large model and segments the visible light image to obtain obstacle regions, the obstacle regions comprising stones, and performs enhanced visualization on the visible light image to obtain an enhanced-visualization extraterrestrial planetary surface scene graph, the specific operations being as follows:
obtaining the disparity corresponding to each pixel in the left image, and enhancing the predicted disparity with an exponential function to obtain an enhanced disparity map;
converting the enhanced disparity map from an array into an RGB image using a color mapping, and superimposing it on the visible light image to obtain an enhanced-visualization disparity map;
and superimposing the mask corresponding to the obstacle region segmented from the visible light image onto the enhanced-visualization disparity map to obtain the final enhanced visual display, wherein the stripes generated for the mask region are overlaid in the enhanced image by direct replacement.
8. An electronic device, comprising:
one or more processors;
a storage means for storing one or more programs;
when executed by the one or more processors, the one or more programs cause the one or more processors to implement the method of any one of claims 1-6.
9. A computer readable medium having a computer program stored thereon, characterized by: the program, when executed by a processor, implements the method of any of claims 1-6.
CN202311403367.3A 2023-10-27 2023-10-27 Method and system for reconstructing and enhancing visualization of surface environment of extraterrestrial planet Active CN117132973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311403367.3A CN117132973B (en) 2023-10-27 2023-10-27 Method and system for reconstructing and enhancing visualization of surface environment of extraterrestrial planet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311403367.3A CN117132973B (en) 2023-10-27 2023-10-27 Method and system for reconstructing and enhancing visualization of surface environment of extraterrestrial planet

Publications (2)

Publication Number Publication Date
CN117132973A CN117132973A (en) 2023-11-28
CN117132973B true CN117132973B (en) 2024-01-30

Family

ID=88856828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311403367.3A Active CN117132973B (en) 2023-10-27 2023-10-27 Method and system for reconstructing and enhancing visualization of surface environment of extraterrestrial planet

Country Status (1)

Country Link
CN (1) CN117132973B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117475091B (en) * 2023-12-27 2024-03-22 浙江时光坐标科技股份有限公司 High-precision 3D model generation method and system

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015024407A1 (en) * 2013-08-19 2015-02-26 国家电网公司 Binocular vision navigation system and method based on power robot
DE102016014783A1 (en) * 2016-12-02 2017-07-06 Daimler Ag Method for detecting objects
WO2021147548A1 (en) * 2020-01-20 2021-07-29 深圳市普渡科技有限公司 Three-dimensional reconstruction method, detection method and system for small obstacle, and robot and medium
CN114842340A (en) * 2022-05-13 2022-08-02 杜明芳 Robot binocular stereoscopic vision obstacle sensing method and system
CN115984494A (en) * 2022-12-13 2023-04-18 辽宁工程技术大学 Deep learning-based three-dimensional terrain reconstruction method for lunar navigation image
CN116051766A (en) * 2022-12-30 2023-05-02 北京航空航天大学 External planet surface environment reconstruction method based on nerve radiation field
CN116563377A (en) * 2023-05-26 2023-08-08 北京邮电大学 Mars rock measurement method based on hemispherical projection model
CN116630528A (en) * 2023-03-20 2023-08-22 清华大学深圳国际研究生院 Static scene reconstruction method based on neural network
CN116664855A (en) * 2023-05-23 2023-08-29 武汉大学 Deep learning three-dimensional sparse reconstruction method and system suitable for planetary probe vehicle images

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230245444A1 (en) * 2021-05-07 2023-08-03 California Institute Of Technology Unmanned aerial system (uas) autonomous terrain mapping and landing site detection

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015024407A1 (en) * 2013-08-19 2015-02-26 国家电网公司 Binocular vision navigation system and method based on power robot
DE102016014783A1 (en) * 2016-12-02 2017-07-06 Daimler Ag Method for detecting objects
WO2021147548A1 (en) * 2020-01-20 2021-07-29 深圳市普渡科技有限公司 Three-dimensional reconstruction method, detection method and system for small obstacle, and robot and medium
CN114842340A (en) * 2022-05-13 2022-08-02 杜明芳 Robot binocular stereoscopic vision obstacle sensing method and system
CN115984494A (en) * 2022-12-13 2023-04-18 辽宁工程技术大学 Deep learning-based three-dimensional terrain reconstruction method for lunar navigation image
CN116051766A (en) * 2022-12-30 2023-05-02 北京航空航天大学 External planet surface environment reconstruction method based on nerve radiation field
CN116630528A (en) * 2023-03-20 2023-08-22 清华大学深圳国际研究生院 Static scene reconstruction method based on neural network
CN116664855A (en) * 2023-05-23 2023-08-29 武汉大学 Deep learning three-dimensional sparse reconstruction method and system suitable for planetary probe vehicle images
CN116563377A (en) * 2023-05-26 2023-08-08 北京邮电大学 Mars rock measurement method based on hemispherical projection model

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
A stereo-vision system for support of planetary surface exploration; Maarten Vergauwen et al.; Machine Vision and Applications; pp. 5-14 *
An Efficient Dense Stereo Matching Method for Planetary Rover; Haichao Li et al.; IEEE Access; vol. 7; pp. 48551-48564 *
MarsSim: A high-fidelity physical and visual simulation for Mars rovers; Ruyi Zhou et al.; IEEE Transactions on Aerospace and Electronic Systems; vol. 59; pp. 1879-1892 *
Research on small celestial body parameter estimation and probe autonomous navigation based on image information; 邵巍; China Doctoral Dissertations Full-text Database, Engineering Science and Technology II; full text *
Research on autonomous obstacle detection technology for planetary landing based on passive vision; 杨世坤; China Master's Theses Full-text Database, Engineering Science and Technology II; full text *
Research on autonomous navigation and path planning technology for Mars rovers; 魏祥泉; 黄建明; 顾冬晴; 陈凤; Journal of Deep Space Exploration (03); full text *

Also Published As

Publication number Publication date
CN117132973A (en) 2023-11-28

Similar Documents

Publication Publication Date Title
Jaritz et al. Sparse and dense data with cnns: Depth completion and semantic segmentation
Guerry et al. Snapnet-r: Consistent 3d multi-view semantic labeling for robotics
Dong et al. Towards real-time monocular depth estimation for robotics: A survey
US20210149022A1 (en) Systems and methods for 3d object detection
Johnson‐Roberson et al. Generation and visualization of large‐scale three‐dimensional reconstructions from underwater robotic surveys
Chen et al. 3d point cloud processing and learning for autonomous driving
CN117132973B (en) Method and system for reconstructing and enhancing visualization of surface environment of extraterrestrial planet
Cho et al. A large RGB-D dataset for semi-supervised monocular depth estimation
Jiang et al. Determining ground elevations covered by vegetation on construction sites using drone-based orthoimage and convolutional neural network
CN115359372A (en) Unmanned aerial vehicle video moving object detection method based on optical flow network
CN114549338A (en) Method and device for generating electronic map and computer readable storage medium
US11887248B2 (en) Systems and methods for reconstructing a scene in three dimensions from a two-dimensional image
Gomez-Donoso et al. Three-dimensional reconstruction using SFM for actual pedestrian classification
Erkent et al. End-to-end learning of semantic grid estimation deep neural network with occupancy grids
CN112489119A (en) Monocular vision positioning method for enhancing reliability
Kniaz et al. Deep learning a single photo voxel model prediction from real and synthetic images
Siddiqui et al. Multi-modal depth estimation using convolutional neural networks
Ao Fully convulutional networks for street furniture identification in panorama images
EP3958167B1 (en) A method for training a neural network to deliver the viewpoints of objects using unlabeled pairs of images, and the corresponding system
JP7423500B2 (en) Information processing devices, information processing methods, programs, and vehicle control systems
Schennings Deep convolutional neural networks for real-time single frame monocular depth estimation
Endo et al. High definition map aided object detection for autonomous driving in urban areas
Tripodi et al. Automated chain for large-scale 3d reconstruction of urban scenes from satellite images
Nair et al. Annotated reconstruction of 3D spaces using drones
Qin An operational pipeline for generating digital surface models from multi-stereo satellite images for remote sensing applications

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant