CN108648264B - Underwater scene reconstruction method based on motion recovery and storage medium - Google Patents


Info

Publication number
CN108648264B
Authority
CN
China
Prior art keywords
image
patch
point
images
correlation coefficient
Prior art date
Legal status
Active
Application number
CN201810377322.6A
Other languages
Chinese (zh)
Other versions
CN108648264A (en)
Inventor
王欣 (Wang Xin)
杨熙 (Yang Xi)
Current Assignee
Jilin University
Original Assignee
Jilin University
Priority date
Filing date
Publication date
Application filed by Jilin University
Priority claimed from CN201810377322.6A
Publication of CN108648264A
Application granted
Publication of CN108648264B
Legal status: Active

Classifications

    • G06T17/00 — Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T15/00, G06T15/50 — 3D image rendering; lighting effects
    • G06T7/20, G06T7/246, G06T7/248 — Analysis of motion using feature-based methods (e.g. tracking of corners or segments), involving reference images or patches
    • G06T7/30, G06T7/33, G06T7/337 — Image registration using feature-based methods involving reference images or patches
    • G06T7/80, G06T7/85 — Camera calibration; stereo camera calibration
    • G06T7/90 — Determination of colour characteristics
    • G06T2207/10016, G06T2207/10021 — Video and image sequences; stereoscopic video and image sequences
    • G06T2207/10028 — Range image; depth image; 3D point clouds
    • G06T2207/20021 — Dividing image into blocks, subimages or windows

Abstract

The reconstruction method introduces an improved motion recovery (structure-from-motion) algorithm that extracts the motion matrices and establishes the relationships between the video images. After the redundant images have been eliminated, feature point matching and point cloud generation are carried out in two steps: feature points are first matched on the binocular image pair and, to obtain denser point cloud data, a patch is generated from each matched feature point pair; the patches are then diffused to all viewing angles to complete the reconstruction of the scene model, and finally the point cloud model is color-corrected according to the imaging characteristics of the underwater scene. The method can still produce a good reconstruction when only a few input images are available, offers good efficiency and precision, and improves the accuracy and robustness of the reconstructed scene to a certain extent.

Description

Underwater scene reconstruction method based on motion recovery and storage medium
Technical Field
The invention relates to a three-dimensional reconstruction method, and in particular to an underwater scene reconstruction method based on motion recovery (structure from motion) and a storage medium, which balances efficiency and precision.
Background
The real world is three-dimensional, and to facilitate its observation, analysis, and extension, three-dimensional models need to be reconstructed in a computer environment. In recent years, with the rapid advance of computer hardware and the steady evolution of software, methods for constructing three-dimensional models have multiplied, and related software is widely applied in fields such as medical image processing, 3D printing, computer games, virtual reality, mapping, simulated military training, and film and television entertainment. According to how the reconstruction data are obtained, three-dimensional model construction techniques mainly comprise: direct modeling with a three-dimensional modeling tool, modeling with instrument equipment, and vision-based three-dimensional reconstruction.
Modeling with a modeling tool does not require acquiring any reconstruction-related data in advance: using basic geometric shapes such as cubes and spheres provided in a dedicated modeling tool, or models and textures imported beforehand, the user transforms the surface of an initial model into a complex structural shape through a series of geometric operations. When this method is used to model a large, complex scene, the scene contains a great deal of information and very complex textures, so obtaining a high-precision scene model is very difficult and consumes a large amount of manpower and material resources, and the result is only a rough simulation and restoration of the scene to be reconstructed.
For modeling with instrument equipment, the three-dimensional scanner (3D Scanner) is currently one of the important tools for three-dimensional modeling of real objects. It can quickly convert real physical information into digital signals that a computer can process directly, thereby yielding a high-precision three-dimensional model. However, this method relies heavily on instruments to collect information, which is very difficult in large scenes such as mountains and rivers. Moreover, the instrument places high demands on the acquisition environment: there must not be too many interference sources, or a large amount of time is needed to correct for the noise, and the results are unsatisfactory when reconstructing complex scenes, particularly underwater scenes.
Vision-based three-dimensional reconstruction adopts computer vision methods to reconstruct a model and restore a scene from collected two-dimensional images or video. It places low demands on the equipment and the object to be reconstructed, reconstructs quickly, can complete reconstruction fully automatically from the input images, and is an extremely active research field in current computer graphics. Unlike the two preceding methods, its input is images of the object to be reconstructed, which are far easier to acquire than data from instrument equipment. The method is not limited by scene size or model shape, can run automatically or semi-automatically, can be conveniently integrated into everyday hardware, and is widely applicable to robot intelligence, aerial mapping, and industrial automation.
As one of the important branches of computer vision, three-dimensional reconstruction based on computer vision rests on Marr's framework of visual theory, and in recent years various scene reconstruction methods have been developed. According to the number of cameras used simultaneously, these methods can be classified into monocular, binocular, and multi-view stereo vision methods.
Monocular reconstruction algorithms reconstruct from images acquired by a single camera, either one image at a single viewing angle or multiple images at multiple viewing angles. Many mature monocular algorithms have been developed over the years. However, because less information is acquired, a monocular algorithm usually needs additional auxiliary information to complete the reconstruction, such as lighting, focus, auxiliary contours, or images from a large number of viewing angles. This places more demands on the shooting environment and method, which limits practical application.
As one of the current mainstream methods, binocular stereo vision achieves scene reconstruction from actually acquired binocular images; it resembles the process of human visual perception and observation, enjoys well-developed mathematical theory, and offers high reconstruction precision. However, existing algorithms still have many defects: binocular stereo vision based on feature point matching has high accuracy and low time complexity, but can obtain only a sparse point cloud of the scene, so the reconstruction effect is not ideal; binocular stereo vision based on pixel-level matching can obtain dense point cloud data, but its precision drops and its time complexity is far higher than that of feature point matching, taking too long for the three-dimensional reconstruction of large scenes. Moreover, the traditional binocular stereo algorithm considers only corresponding frames, so the reconstructed point clouds from different viewing angles lack interconnection and do not join naturally, which degrades the reconstruction.
In binocular stereo reconstruction, similar images or repeated regions within images still cause mismatches and other defects. Multi-view stereo reconstruction addresses these problems by adding further cameras on top of the binocular pair to provide more constraint information and improve the final reconstruction accuracy. Although multi-view stereo can reduce mismatching and edge blurring to a certain extent, each additional camera increases the number of images to process at every viewing angle, further complicates the equipment structure and physical relationships, and greatly raises the operational difficulty and cost, so the effect is not ideal.
Therefore, how to improve reconstruction efficiency and accuracy for the existing computer vision reconstruction method becomes a technical problem to be solved urgently in the prior art.
Disclosure of Invention
The invention aims to provide a scene reconstruction method based on binocular stereo vision, namely a complete underwater scene reconstruction method based on motion recovery. An improved motion recovery algorithm is introduced to extract the motion matrices and establish the relationships between the video images. After the redundant images have been eliminated, to enhance the robustness of the algorithm, feature point matching and point cloud generation are carried out in two steps: feature points are first matched on the binocular image pair and, to obtain denser point cloud data, a patch is generated from each matched feature point pair; the patches are then diffused to all viewing angles to complete the reconstruction of the scene model. Finally, the point cloud model is color-corrected according to the imaging characteristics of the underwater scene.
In order to achieve the purpose, the invention adopts the following technical scheme:
an underwater scene reconstruction method based on motion recovery, characterized by comprising the following steps: a motion matrix extraction step S110: for each newly added group of binocular images, only the first-eye image is selected and matched for feature points against the first-eye image of the previous frame, after which the motion matrix is calculated; once the motion matrix of the first-eye image has been obtained, the motion matrix of the second-eye image can be obtained from the calibration between the binocular cameras, and new tracking points are selected from it and added to the tracking point set;
redundant information removal step S120: the first frame image p1 in the first-eye video is compared with the next frame image p2; a projection matrix K is obtained from the motion matrices at the two viewing angles, and the points of p2 are mapped onto p1 through the projection matrix K by formula (1), where r2 is a point coordinate in p2 and r21 is its projection onto p1:
r21 = K · r2    (1)
by comparing the image p1 with the pixels mapped from p2, an image correlation coefficient δ between the two images is obtained and compared with a threshold value; if δ is smaller than the set threshold, the next frame p2 is not a redundant image: p1 and p2 are retained, and p2 is then compared with its adjacent next first-eye frame; otherwise p2 is determined to be a redundant image and is removed from the video image set, and p1 is compared with the first-eye frame adjacent to and following p2; this loop is repeated until the last frame has been compared, after which the redundant information removal step is repeated on the images of the second-eye video, yielding a reduced image set P that retains the scene characteristics;
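As an illustrative sketch (not the patent's own code), the redundancy test above can be written as follows; it assumes the pixels of p2 have already been mapped onto p1 by formula (1), and the binarization used for the Jaccard-style comparison is an assumption, since the patent does not spell out the exact pixel measure.

```python
import numpy as np

def jaccard_correlation(p1: np.ndarray, p2_mapped: np.ndarray, level: int = 128) -> float:
    """Image correlation coefficient (delta) computed as a Jaccard score
    over pixels brighter than `level` (the binarization level is an assumption)."""
    a = p1 >= level
    b = p2_mapped >= level
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # two empty masks are trivially identical
    return float(np.logical_and(a, b).sum() / union)

def is_redundant(p1: np.ndarray, p2_mapped: np.ndarray, threshold: float = 0.9) -> bool:
    """A frame counts as redundant when delta reaches the 0.9 threshold
    named later in the patent."""
    return bool(jaccard_correlation(p1, p2_mapped) >= threshold)
```

Frames for which `is_redundant` returns False are kept and become the reference for the next comparison.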
a scene reconstruction step S130, comprising:
an initial matching sub-step: dividing each frame image in the image set P into grid cells of β × β pixels, and computing α local maxima in each cell with the DOG operator and the Harris operator respectively as feature points; matching the first-eye image against the second-eye image with the obtained feature points to obtain a set of matched feature point pairs; for each matched pair (ml, mr), sorting the pairs from far to near by their distance to the camera lens, and generating point clouds from near to far; taking m as center, generating a patch pm of θ × θ pixels, the center of the patch being m, the normal vector of pm being the line connecting m with the optical center of the reference image's camera, and screening the generated patch pm; the screening of the generated patch pm is carried out as follows: obtaining the corresponding affine transformation parameters through the image projection matrices of patch pm, then mapping pm onto pl and pr respectively to find the coordinates of pm in pl and pr; computing the two projection images of pm on pl and pr by bilinear interpolation, and calculating an initial matching correlation coefficient ε between the two projection images with a normalized cross-correlation algorithm; if ε is larger than a threshold value, the patch is considered successfully reconstructed, is stored, and the next feature point pair is reconstructed; otherwise the patch pm is deleted and the next feature point pair is reconstructed;
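The screening criterion — comparing the two projections of a patch by normalized correlation against the threshold ε — can be sketched as follows; the function names and the 0.7 threshold are illustrative assumptions (the patent does not state the threshold's value).

```python
import numpy as np

def ncc(proj_a: np.ndarray, proj_b: np.ndarray) -> float:
    """Normalized cross-correlation between two projected patch images."""
    a = proj_a.astype(float).ravel()
    b = proj_b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def keep_patch(proj_left: np.ndarray, proj_right: np.ndarray, epsilon: float = 0.7) -> bool:
    """Keep the patch when the initial matching correlation exceeds epsilon."""
    return ncc(proj_left, proj_right) > epsilon
```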
a diffusion patch reconstruction sub-step: for each patch generated in the initial reconstruction, if no patch exists in an adjacent grid cell, or the initial matching correlation coefficient ε of the patches in the adjacent cell is smaller than that of the current patch, generating in that cell a new patch pn with the initial patch as reference; the new patch pn takes as its center point the intersection of the optical-center direction of the cell with the plane of the reference patch, and its normal vector is the same as that of the reference patch; traversing all other images in the image set P, and putting every image whose normal vector makes an angle of less than 60 degrees with that of pn into a set U(t) as a contrast image; obtaining the corresponding affine transformation parameters through the image projection matrix of the patch and the motion matrix of each image, mapping pn onto pt and onto each image in U(t), and obtaining the mapped images of pn on pt and on U(t) by bilinear interpolation; calculating their correlation coefficients ζ1 and putting all images in U(t) whose ζ1 is greater than the threshold into a set V(t); if V(t) is empty, pn is considered unobservable by the other images and unable to meet the reconstruction requirement, pn is deleted, and the next diffusible point is sought;
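Building the contrast set U(t) reduces to an angle test between unit vectors: a view is admitted when its direction makes an angle of less than 60 degrees with the patch normal pn. A minimal sketch with illustrative names:

```python
import numpy as np

def visible_views(patch_normal, view_directions, max_angle_deg: float = 60.0):
    """Return the indices of views whose direction lies within
    max_angle_deg of the patch normal, i.e. the candidate set U(t)."""
    n = np.asarray(patch_normal, dtype=float)
    n = n / np.linalg.norm(n)
    cos_limit = np.cos(np.radians(max_angle_deg))
    kept = []
    for idx, d in enumerate(view_directions):
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)
        if n @ d > cos_limit:  # dot product of unit vectors = cosine of angle
            kept.append(idx)
    return kept
```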
a color correction step S140, which includes:
compensation light removal sub-step: converting the color of the three-dimensional model from the RGB color space into the HSV color space, which better matches human color perception; then determining the information of the compensating light from background points, and removing the compensating light in HSV space;
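Per point, the conversion and the removal of the compensating light might look like the sketch below; modelling the compensating light as a constant offset subtracted from the V (brightness) channel is an assumption made for illustration.

```python
import colorsys

def rgb_point_to_hsv(r: int, g: int, b: int):
    """Convert one point-cloud color from 8-bit RGB to HSV (all components in [0, 1])."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

def remove_compensating_light(hsv, comp_value: float):
    """Subtract the brightness attributed to the artificial compensating
    light (derived from a background point) from the V channel."""
    h, s, v = hsv
    return (h, s, max(v - comp_value, 0.0))
```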
a correction sub-step according to the underwater illumination imaging model: converting the model color with the compensating light removed back into an RGB representation; the color Lλ presented by a point x on the model is then calculated as shown in formula (2):
[formula (2) appears only as an image in the original publication]
where the leading term represents the RGB model of natural light, Nλ represents the absorption rate of seawater, D the depth of the seawater, and px the refractive index at the point;
according to the sea depth D of the scene, the color Cλ of the model is obtained by formula (3):
Cλ = Lλ / (Nλ)^D,   λ ∈ {red, green, blue}    (3).
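Formula (3) undoes the per-channel exponential attenuation of seawater. A direct transcription, where the absorption-rate values in the example are hypothetical:

```python
def correct_color(observed, absorption, depth):
    """Recover the model color C from the observed color L via
    C = L / N**D, applied per channel (red, green, blue)."""
    return tuple(l / (n ** depth) for l, n in zip(observed, absorption))
```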
Optionally, in the motion matrix extraction step S110, the feature points are extracted with a Harris corner feature detection algorithm and a SIFT feature point extraction algorithm, the motion matrix is calculated with a direct linear transformation method, and the parameters are further optimized with a sparse bundle adjustment method, recovering a more accurate motion matrix by minimizing the projection error between the observed image and the predicted image.
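The linear solve for the motion (projection) matrix corresponds to the classical direct linear transform over known 3D tracking points and their image projections. The sketch below sets up the homogeneous system and solves it by SVD, leaving out the bundle-adjustment refinement; the point values in the test are synthetic.

```python
import numpy as np

def dlt_projection(points3d, points2d) -> np.ndarray:
    """Recover a 3x4 projection matrix (up to scale) from >= 6
    3D-2D correspondences via the direct linear transform."""
    rows = []
    for (X, Y, Z), (u, v) in zip(points3d, points2d):
        P = [X, Y, Z, 1.0]
        # two rows per correspondence, from the cross product x × (M·X) = 0
        rows.append([0.0, 0.0, 0.0, 0.0] + [-p for p in P] + [v * p for p in P])
        rows.append(P + [0.0, 0.0, 0.0, 0.0] + [-u * p for p in P])
    # the solution is the right singular vector of the smallest singular value
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 4)
```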
Optionally, in the redundant information removal step S120, the image correlation coefficient δ is obtained by calculating the Jaccard distance of the projected pixel points, and the threshold value compared against δ is 0.9.
Optionally, in the initial matching sub-step, the first-eye image and the second-eye image are matched with the obtained feature points as follows: each feature point obtained by the DOG operator on pl is matched against the feature points obtained by the DOG operator on pr, and each feature point obtained by the Harris operator on pl is matched against the feature points obtained by the Harris operator on pr.
Optionally, in the diffusion patch reconstruction sub-step, the patch pn generated by diffusion is also optimized and adjusted so that its correlation coefficients on the other images become as large as possible: the z coordinate of the patch center point and the inclination angle of the normal vector of pn are adjusted, and the center point and normal vector of pn are recalculated; the image sets U(t) and V(t) are updated with correlation coefficient ζ2; if the number of elements in V(t) is greater than k, where k is the required number of images in which the diffusion point is observed, the patch diffusion is considered successful and the newly generated patch pn is stored; otherwise the newly generated patch is deleted and the next possible diffusion point is considered, until no further diffusion is possible.
Optionally, in the diffusion patch reconstruction sub-step, during the optimization adjustment the method varies the normal vector within a conical space of 15 degrees, computes V(t) for several boundary values, and compares them with that of the original normal vector to select the maximum; k is preferably 3.
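Varying the normal inside the 15-degree cone amounts to tilting the unit normal by at most 15 degrees about its own axis. The sampling below (one tilt angle, one azimuth) is an illustrative assumption, since the patent fixes only the cone angle:

```python
import numpy as np

def perturb_normal(normal, tilt_deg: float, azimuth_deg: float) -> np.ndarray:
    """Tilt a unit normal by tilt_deg (<= 15 in the method) at the given
    azimuth around the original direction."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    # build an orthonormal tangent basis around n
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    t1 = np.cross(n, helper)
    t1 = t1 / np.linalg.norm(t1)
    t2 = np.cross(n, t1)
    tilt, az = np.radians(tilt_deg), np.radians(azimuth_deg)
    return np.cos(tilt) * n + np.sin(tilt) * (np.cos(az) * t1 + np.sin(az) * t2)
```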
Optionally, in the scene reconstruction step S130, after all diffusion is completed, erroneous patches are removed: in each grid cell, patches whose initial matching correlation coefficient ε is smaller than the average correlation coefficient are deleted, and patch clusters that contain only a few patches and are far from all other patches are deleted.
Optionally, in the compensating light removal sub-step, a background point that is as deep as the seabed and sufficiently far away is searched for in the video, and the brightness of this background point is used as the uniform brightness of the whole model to remove the compensating light.
Further, the present invention also discloses a storage medium for storing computer executable instructions, which is characterized in that: the computer executable instructions, when executed by a processor, perform the above-described underwater scene reconstruction method.
The method can still complete a better reconstruction result when only a few input images exist, has better efficiency and precision, and improves the accuracy and robustness of the reconstructed scene to a certain extent.
Drawings
Fig. 1 is a flow chart of a method for motion recovery based underwater scene reconstruction in accordance with a specific embodiment of the present invention;
FIG. 2 is a schematic diagram comparing a motion recovery algorithm according to an embodiment of the present invention with a conventional algorithm;
FIG. 3 is a schematic diagram of patch screening according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of patch optimization adjustment according to an embodiment of the present invention;
FIG. 5 is a schematic illustration of the principles of underwater imaging according to a specific embodiment of the present invention;
FIGS. 6(a) - (d) are four frames of an image with redundancy removed from the image in a reconstructed underwater video according to a specific embodiment of the present invention;
FIG. 7 is a model of a seabed three-dimensional point cloud obtained through scene reconstruction according to an embodiment of the present invention;
FIG. 8 is the final result of the sea bed after color correction according to a specific embodiment of the present invention;
FIG. 9 is a comparison of a reconstructed dinosaur model and a laser scanning model by using the reconstruction method of the present invention, wherein FIG. 9(a) is a laser scanning model and FIG. 9(b) is an algorithm reconstructed model of the present invention;
fig. 10 is a comparison between a temple model reconstructed by the reconstruction method of the present invention and a laser scanning model, wherein fig. 10(a) is a laser scanning model and fig. 10(b) is an algorithm reconstruction model of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
The invention discloses an underwater scene reconstruction method based on motion recovery, which comprises the following steps: in order to establish the interconnection between video images, improved motion recovery is introduced to realize the extraction of a motion matrix; after the redundant image elimination is completed, in order to enhance the robustness of the algorithm, the characteristic point matching and the point cloud generation are carried out in two steps: firstly, matching feature points on a binocular image, and generating a patch according to the matched feature points in order to obtain denser point cloud data; diffusing the surface patches to all the visual angles to complete the reconstruction of the scene model; and finally, carrying out color correction on the point cloud model according to the imaging characteristics of the underwater scene.
Referring to fig. 1, a flowchart of an underwater scene reconstruction method based on motion recovery according to the present invention is shown, in which underwater scene videos shot by a binocular camera are respectively marked as a left-eye video and a right-eye video, and are disassembled into an image set, and images are self-calibrated to eliminate lens distortion, and further includes the following steps:
motion matrix extraction step S110:
after the initialization is completed, the traditional method for extracting the motion matrix matches the added image with all the calculated images before for each frame to optimize all the motion matrices before, so that the time taken by the whole algorithm increases exponentially as the images increase.
Because the input of the invention is the ordered video frame, the upper and lower frames in the video contain a large amount of repeated information, for each newly added image, the corresponding characteristic points which can be matched by the current frame and all the images which have finished iteration are mostly concentrated on the image of the upper frame, and the matching characteristic points between the adjacent frames are enough to finish the calculation of the motion matrix, so the traditional method is optimized.
Therefore, the steps are specifically: for each group of newly added binocular images, only the first image (such as the left image) is selected to be matched with the feature points of the last frame image (the last frame left image), then the motion matrix calculation is carried out, and meanwhile, because the binocular cameras are strictly calibrated in advance, after the motion matrix of the first camera (namely, the left camera) is obtained through calculation, the motion matrix of the second camera (namely, the right camera) can be obtained according to the calibration between the binocular cameras, and new tracking points are selected from the motion matrix and added into the tracking point set.
In the embodiment of the present invention, the motion matrix of the left-eye camera is first obtained, and then the motion matrix of the right-eye camera is obtained according to the calibration between the binocular cameras, but obviously, this is merely an example, and the motion matrix of the right-eye camera may be obtained first, and then the motion matrix of the left-eye camera is obtained according to the calibration between the binocular cameras.
Further, in the invention, a Harris corner feature detection algorithm and a SIFT feature point extraction algorithm are adopted for extracting the feature points, and the Direct Linear Transformation (DLT) method is adopted for calculating the motion matrix. Since the left-eye and right-eye cameras are identical and have been calibrated, the intrinsic parameters are already determined; calculating the motion matrix therefore amounts to obtaining the extrinsic parameters of the camera. The Sparse Bundle Adjustment (SBA) method is further adopted to optimize the parameters, recovering a more accurate motion matrix by minimizing the projection error between the observed image and the predicted image; the minimization iteratively searches for the minimum of the total projection error.
Referring to fig. 2, a schematic diagram comparing a motion recovery algorithm according to an embodiment of the present invention with a conventional algorithm is shown.
Redundant information removal step S120:
due to the continuity of the video, a large number of redundant frames exist in the video, and the images are highly repeated with adjacent images, often contain no additional key information, and the whole video is directly processed, so that the processing is cumbersome and inefficient. The redundant frames are removed, so that the operation efficiency of the algorithm is greatly improved.
When redundant images are removed, the motion matrix under each visual angle is acquired in the motion matrix extracting step, so that the video images can be screened according to the similarity degree of adjacent images.
Therefore, this step is specifically: the first frame image p1 in the first-eye video is compared with the next frame image p2. A projection matrix K is obtained from the motion matrices at the two viewing angles, and the points of p2 are mapped onto p1 through the projection matrix K by formula (1), where r2 is a point coordinate in p2 and r21 is its projection onto p1:
r21 = K · r2    (1)
By comparing the image p1 with the pixels mapped from p2, an image correlation coefficient δ between the two images is obtained and compared with a threshold value. If δ is smaller than the set threshold, the next frame p2 is not a redundant image: p1 and p2 are retained, and p2 is then compared with its adjacent next left-eye frame; otherwise p2 is determined to be a redundant image and is removed from the video image set, and p1 is compared with the left-eye frame adjacent to and following p2. This cycle repeats until the last frame has been compared; the redundant information removal is then repeated for the images of the second-eye video, yielding a reduced image set P that retains the scene characteristics.
Further, the image correlation coefficient δ is obtained by calculating the Jaccard distance of the projected pixel points; in the present invention, the threshold compared with δ is 0.9, which balances efficiency and accuracy.
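As an illustration of this screening loop, the following sketch keeps a frame only when its Jaccard-based correlation with the last kept frame falls below the 0.9 threshold; the `project` callback standing in for the mapping through K, the binarization of the pixels, and the frame representation are assumptions made for the example, not details stated in the patent.

```python
import numpy as np

THRESHOLD = 0.9  # image correlation threshold used by the method


def jaccard_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Jaccard similarity of two binary masks (intersection over union)."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter / union) if union else 1.0


def remove_redundant_frames(frames, project):
    """Keep a frame only if its correlation with the last kept frame
    is below THRESHOLD.  `project(src, dst)` maps frame `src` into the
    view of frame `dst` via the projection matrix K (assumed given)."""
    kept = [frames[0]]
    for frame in frames[1:]:
        warped = project(frame, kept[-1])
        delta = jaccard_similarity(warped > 0, kept[-1] > 0)
        if delta < THRESHOLD:      # not redundant: keep it
            kept.append(frame)
        # otherwise discard `frame` and move on to the next one
    return kept
```

With an identity `project`, a frame identical to the last kept one scores δ = 1.0 and is dropped, while a clearly different frame is retained.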
Scene reconstruction step S130:
after redundancy removal, this step carries out feature point matching and patch construction. It comprises two substeps, initial matching (feature point matching and initial patch generation) and diffusion patch reconstruction, which together complete the reproduction of the scene.
an initial matching substep:
in the initial matching stage, to facilitate feature point extraction and matching, each frame image in the image set P is divided into β × β pixel grids, and α local maxima within each grid are calculated with a DoG operator and a Harris operator respectively to serve as feature points. The obtained feature points are used to match the left-eye and right-eye images. After feature point matching is completed, a set of feature point pairs is obtained. For each matched feature point pair (ml, mr), the point pairs are sorted from far to near according to their distance from the camera lens, point clouds are generated from near to far, and a patch pm of θ × θ pixels is generated with m as its center; the normal vector of pm is the line connecting m and the optical center of the reference image camera. The generated patches pm are then screened.
Because the binocular camera used in the invention is strictly calibrated and the binocular images are stereo-rectified, the spatial information of any point on the left-eye image can be computed quickly from the relationship between the two views once its corresponding point is found on the right-eye image. Moreover, this matching is highly precise, so initial matching and point cloud generation at a single viewpoint can be completed using a binocular image pair.
In the specific matching, for a pair of left and right eye images (pl, pr), the left-eye image pl is selected as the reference image and its feature points are matched against the right-eye image pr. Grid division and feature point extraction have already been performed on pl and pr, and the two images are stereo-rectified, i.e. corresponding epipolar lines lie on the same straight line. Therefore, each feature point obtained by the DoG operator on pl is matched against the feature points obtained by the DoG operator on pr, and each feature point obtained by the Harris operator on pl is matched against the feature points obtained by the Harris operator on pr.
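The per-grid selection of operator responses described above can be sketched as follows; the concrete values of β and α, and the idea of passing in a precomputed Harris or DoG response map (one call per operator), are illustrative assumptions rather than the patent's exact implementation.

```python
import numpy as np


def top_features_per_cell(response: np.ndarray, beta: int = 32, alpha: int = 4):
    """Return up to `alpha` strongest responses in every beta x beta cell.

    `response` is a per-pixel operator response map (e.g. from a Harris
    or DoG detector); the method calls this once per operator so that
    DoG points are matched against DoG points and Harris against Harris."""
    h, w = response.shape
    points = []
    for y0 in range(0, h, beta):
        for x0 in range(0, w, beta):
            cell = response[y0:y0 + beta, x0:x0 + beta]
            # indices of the alpha largest responses in this cell
            flat = np.argsort(cell, axis=None)[::-1][:alpha]
            for idx in flat:
                cy, cx = np.unravel_index(idx, cell.shape)
                points.append((y0 + cy, x0 + cx))
    return points
```

Spreading the α strongest responses over every grid cell, instead of taking global maxima, keeps the feature points evenly distributed across the image.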
The screening of a generated patch pm may specifically be as follows. Since pm, pl and pr are all known at this point, the corresponding affine transformation parameters are obtained from the image projection matrix of pm, and pm is mapped onto pl and pr respectively to find its corresponding coordinates on pl and pr. The mapped images of pm on pl and pr are computed by bilinear interpolation, and the initial matching correlation coefficient ε of the two projected images is calculated with a normalized cross-correlation algorithm. If ε is larger than the threshold, the patch is considered successfully reconstructed; it is stored and the next feature point pair is reconstructed. Otherwise the patch pm is deleted and the next feature point pair is reconstructed.
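The correlation test at the heart of this screening might look like the following sketch, where `ncc` is a plain normalized cross-correlation over the two interpolated projections; the threshold value 0.7 is an assumed placeholder, since the patent does not state the number.

```python
import numpy as np


def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized cross-correlation of two equally sized patch projections."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0


def screen_patch(proj_left: np.ndarray, proj_right: np.ndarray,
                 eps_threshold: float = 0.7) -> bool:
    """Keep the patch only if the NCC of its two projections exceeds
    the threshold (the concrete threshold value is an assumption)."""
    return ncc(proj_left, proj_right) > eps_threshold
```

NCC is invariant to affine brightness changes, which is why two projections of the same surface patch score near 1 even under different exposure.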
Referring to fig. 3, a schematic diagram of the screening of the patches is shown.
And a diffusion surface patch reconstruction substep:
to obtain a dense multi-view reconstruction result, the initial patches are used as seed points and diffused outward so that, as far as possible, at least one patch exists in every grid, thereby completing the reconstruction of the model.
The specific steps are as follows. For each patch generated in the initial reconstruction, if no patch exists in an adjacent grid, or the initial matching correlation coefficient ε of the patches in the adjacent grid is smaller than that of this patch, a new patch pn referenced to the initial patch is generated in that grid. The new patch pn takes as its center point the intersection of the grid's optical-center ray with the plane of the reference patch, its normal vector is the same as that of the reference patch, and its image pt is the image plane in which the grid lies. All other images in the image set P are traversed, and every image whose normal vector makes an angle of less than 60 degrees with the normal vector of pn is put into a set U(t) as a comparison image. Corresponding affine transformation parameters are obtained from the image projection matrix of the patch and the motion matrix of each image; pn is then mapped onto pt and onto each image in U(t), the mapped images of pn on pt and U(t) are obtained by bilinear interpolation, their correlation coefficients ζ1 are calculated, and all images in U(t) with ζ1 greater than the threshold are put into a set V(t). If V(t) is empty, pn is considered unobservable by the other images and unable to meet the reconstruction requirement; pn is deleted and the next diffusible point is examined.
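The construction of the sets U(t) and V(t) for one diffused patch can be sketched as below; the normal vectors, the per-image correlation values ζ1, and the ζ1 threshold are assumed inputs supplied for illustration.

```python
import numpy as np


def visible_set(patch_normal, image_normals, correlations,
                zeta_threshold: float = 0.6):
    """Sketch of the U(t)/V(t) construction for one diffused patch.

    U(t): indices of images whose viewing normal is within 60 degrees
    of the patch normal.  V(t): the subset of U(t) whose mapped-image
    correlation zeta_1 exceeds the threshold (threshold value assumed).
    If V(t) comes back empty, the diffused patch is rejected."""
    n = np.asarray(patch_normal, float)
    n = n / np.linalg.norm(n)
    u, v = [], []
    for i, (m, zeta) in enumerate(zip(image_normals, correlations)):
        m = np.asarray(m, float)
        m = m / np.linalg.norm(m)
        angle = np.degrees(np.arccos(np.clip(np.dot(n, m), -1.0, 1.0)))
        if angle < 60.0:
            u.append(i)
            if zeta > zeta_threshold:
                v.append(i)
    return u, v
```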
Further, in this substep the patch pn generated by diffusion is also optimized and adjusted so that its correlation coefficients on the other images become as large as possible: the z coordinate of the patch center point and the inclination angle of the normal vector of pn are adjusted, and the center point and normal vector of pn are recalculated. The image sets U(t) and V(t) are then updated; because pn has been optimized, the correlation coefficient threshold ζ2 is further increased when updating the set V(t). When the number of elements in V(t) is greater than k, where k is the number of images in which the diffusion point must be observable, that is, when the patch pn is observed by enough images at other viewpoints, the patch diffusion is considered successful and the newly generated patch pn is stored; otherwise the newly generated patch is deleted, and the next possible diffusion point is considered, until no further diffusion is possible.
In the optimization adjustment process, the normal vector is optimized by varying it within a 15-degree cone, computing V(t) for several candidate values, and comparing them with the original normal vector to select the maximum; k is preferably 3. Fig. 4 is a schematic diagram of patch optimization adjustment.
After the diffusion is completed, erroneous patches need to be removed, because redundant patches and wrongly diffused points may exist after the diffusion process. Patches whose initial matching correlation coefficient ε is smaller than the average correlation coefficient within their grid are deleted, as are patch clusters that contain only a few patches and lie far away from all other patches.
Color correction step S140:
compensation light removal substep:
the color of the three-dimensional model is converted from the RGB color space into the HSV color space, which better matches the visual characteristics of color. The information of the compensation light is then determined from background points, and the compensation light is removed in HSV space.
Specifically, since the brightness change caused by ambient light passing through seawater due to local depth variations of the seabed is slight, the brightness of the whole reconstructed seabed model should be uniform. A background point that is as deep as the seabed and sufficiently far away is sought in the video; the distance between this point and the camera's compensation light source is assumed to be large enough that the influence of the compensation light on it can be ignored, so the brightness of this point is used as the uniform brightness of the whole model to remove the compensation light.
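A minimal sketch of this brightness equalization, assuming the model colors are held as an (N, 3) HSV array with V in the last channel and that a suitable background vertex index has already been found:

```python
import numpy as np


def remove_compensation_light(hsv_colors: np.ndarray,
                              background_idx: int) -> np.ndarray:
    """Set every vertex's V (brightness) channel to the brightness of
    a distant background point, leaving hue and saturation intact.

    `hsv_colors` is an (N, 3) array of per-vertex HSV colors; the
    background point is assumed far enough from the lamp that its
    brightness is unaffected by the compensation light."""
    out = hsv_colors.copy()
    out[:, 2] = hsv_colors[background_idx, 2]
    return out
```

Working in HSV means only the V channel is touched, so the hue recovered later by the underwater imaging model is not disturbed.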
Color correction completion substep according to the underwater illumination imaging model:
after the influence of the compensation light is eliminated, the color can be corrected in RGB space through an underwater illumination imaging model.
The model color with the compensation light removed is converted back into an RGB-space representation. The color Lλ presented by a point x on the model is then calculated by formula (2):

Lλ = Aλ·px·(Nλ)^D, λ ∈ {red, green, blue} (2)

where Aλ represents the RGB model of natural light, Nλ represents the absorption rate of seawater, D is the depth of the seawater, and px represents the refractive index of the point.

According to the sea depth D of the scene, the color Cλ of the model can be obtained by formula (3):

Cλ = Lλ/(Nλ)^D, λ ∈ {red, green, blue} (3)
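Formula (3) amounts to a per-channel division by the depth-attenuation factor (Nλ)^D; a minimal sketch, assuming RGB triples as plain tuples and illustrative values for Nλ and D:

```python
def corrected_color(L, N, D):
    """C_lambda = L_lambda / N_lambda**D for each of the three channels.

    L: observed (compensation-light-free) RGB value, N: per-channel
    seawater absorption rate, D: scene depth; all values illustrative."""
    return tuple(l_val / n ** D for l_val, n in zip(L, N))
```

Channels with stronger absorption (typically red) are divided by a smaller factor raised to the depth, so the correction restores proportionally more of the colors that seawater attenuates most.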
Referring to fig. 5, a schematic diagram of the underwater imaging principle is shown.
Further, the present invention also discloses a storage medium for storing computer executable instructions, which is characterized in that: the computer executable instructions, when executed by a processor, perform the above-described underwater scene reconstruction method.
Example 1:
the underwater scene reconstruction method based on motion recovery of the present invention can be programmed in a computer language and run on a development platform to realize the above functions.
In embodiment 1, the underwater scene reconstruction method based on motion recovery was developed in C++ on the Visual Studio 2010 platform and verified on a set of seabed videos captured by a strictly calibrated binocular GoPro 2 camera.
Two avi-format videos of 7 s duration were shot by the left-eye and right-eye cameras respectively. Each video contains 210 frames of images; after the redundant images are removed, an image set to be reconstructed containing 34 frames is obtained. Figs. 6(a)-(d) show four frames of the underwater video after redundancy removal, fig. 7 shows the seabed three-dimensional point cloud model obtained by scene reconstruction from these images, and fig. 8 shows the final result of the three-dimensional point cloud model after color correction.
Example 2:
to further evaluate the effect of the underwater scene reconstruction method, two groups of multi-view model images with laser scanning data were reconstructed by the method of the invention and quantitatively compared.
Fig. 9 is a comparison between a dinosaur model reconstructed by the reconstruction method of the present invention and a laser scanning model, wherein fig. 9(a) is a laser scanning model, and fig. 9(b) is an algorithm reconstructed model of the present invention.
Fig. 10 is a comparison between a temple model reconstructed by the reconstruction method of the present invention and a laser scanning model, wherein fig. 10(a) is a laser scanning model and fig. 10(b) is an algorithm reconstruction model of the present invention.
It can be seen that the present invention achieves good reconstruction results even when only a few input images are available, i.e. the invention offers good efficiency and precision.
The invention evaluates the quality of the reconstruction algorithm through two parameters:

1. Accuracy A, i.e. the maximum distance between the point set DW of the generated model W and the laser scanning data:

A = max{|DW|} (4)

2. Matching degree C, i.e. the proportion of the point set Rd of the generated model, whose distance from the points of the laser-scan model is less than a given distance d, in the total point set R:

C = Rd/R (5)

In consideration of the characteristics of the underwater scene, the matching degree is determined at 80% and the distance d at 0.25 mm.
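Given precomputed point-to-scan distances, the two evaluation parameters reduce to a maximum and a ratio; a minimal sketch (the distance computation between model points and the laser scan is assumed to be done elsewhere):

```python
def accuracy(distances):
    """A = max distance between model points and the laser scan (formula 4)."""
    return max(distances)


def matching_degree(distances, d):
    """C = fraction of model points closer than d to the scan (formula 5)."""
    return sum(1 for x in distances if x < d) / len(distances)
```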
Table 1. Reconstruction accuracy of the quantitative calculations (the table is reproduced as an image in the original publication and is not available in this text).
In summary, the invention establishes the interrelation among video frames by extracting the motion matrices of the cameras at different viewpoints with an improved motion recovery algorithm, providing a basis for subsequent redundancy removal and scene reconstruction; an algorithm is then designed to remove redundant images according to the characteristics of underwater video, improving the efficiency of the method. Finally, color correction makes the final color of the model as close to reality as possible. Compared with other reconstruction methods in the prior art and with laser scanning results, the method can still complete a good reconstruction when only a few input images are available, offers good efficiency and precision, and improves the accuracy and robustness of the reconstructed scene to a certain extent.
It will be apparent to those skilled in the art that the various elements or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device, or alternatively, they may be implemented using program code that is executable by a computing device, such that they may be stored in a memory device and executed by a computing device, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
While the invention has been described in further detail with reference to specific preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (9)

1. An underwater scene reconstruction method based on motion recovery is characterized by comprising the following steps:
motion matrix extraction step S110: for each newly added group of binocular images, only the first-eye image frame is selected for feature point matching with the previous frame image, and motion matrix calculation is then performed; after the motion matrix of the first-eye image is calculated, the motion matrix of the second-eye image is obtained according to the calibration between the binocular cameras, and new tracking points are selected and added into the tracking point set;
redundant information removal step S120: the first frame image p1 in the first-view video is compared with the next frame image p2; a projection matrix K is obtained according to the motion matrices at the two viewpoints, and the points on p2 are mapped onto p1 through the projection matrix K by formula (1), where r2 is the coordinate of a point on p2 and r21 is its coordinate after projection onto p1,

r21 = K·r2 (1)

an image correlation coefficient δ between the two images is obtained by comparing the mapped pixels of images p1 and p2 and compared with a threshold; if δ is smaller than the set threshold, the next frame image p2 is not a redundant image, p1 and p2 are retained, and p2 is compared with the adjacent next first-eye frame; otherwise p2 is judged to be a redundant image, p2 is removed from the video image set, and p1 is compared with the adjacent next first-eye frame; this cycle is repeated until the last frame has been compared, the redundant information removal step is then repeated for the images in the second-view video, and a reduced image set P retaining the scene features is obtained;
a scene reconstruction step S130, comprising:
an initial matching substep: dividing each frame image in the image set P into β × β pixel grids, calculating α local maxima in each grid with a DoG operator and a Harris operator respectively as feature points, and matching the first-eye and second-eye images with the obtained feature points; after feature point matching is completed, a set of feature point pairs is obtained; for each matched feature point pair (ml, mr), the point pairs are sorted from far to near according to their distance from the camera lens, point clouds are generated from near to far, and a patch pm of θ × θ pixels is generated with m as its center; the normal vector of pm is the line connecting m and the optical center of the reference image camera; the generated patches pm are screened; the screening of a generated patch pm is performed as follows: the corresponding affine transformation parameters are obtained from the image projection matrix of pm, and pm is mapped onto pl and pr respectively to find its corresponding coordinates on pl and pr; the mapped images of pm on pl and pr are computed by bilinear interpolation, and the initial matching correlation coefficient ε of the two projected images is calculated with a normalized cross-correlation algorithm; if ε is larger than the threshold, the patch is considered successfully reconstructed, the patch is stored and the next feature point pair is reconstructed; otherwise the patch pm is deleted and the next feature point pair is reconstructed;
a diffusion patch reconstruction substep: for each patch generated in the initial reconstruction, if no patch exists in an adjacent grid, or the initial matching correlation coefficient of the patches in the adjacent grid is smaller than that of this patch, generating in that grid a new patch pn referenced to the initial patch; the new patch pn takes as its center point the intersection of the grid's optical-center ray with the plane of the reference patch, its normal vector is the same as that of the reference patch, and pt is the image plane in which the grid lies; all other images in the image set P are traversed, and every image whose normal vector makes an angle of less than 60 degrees with the normal vector of pn is put into a set U(t) as a comparison image; corresponding affine transformation parameters are obtained from the image projection matrix of the patch and the motion matrix of each image, pn is mapped onto pt and each image in U(t), the mapped images of pn on pt and U(t) are obtained by bilinear interpolation, their correlation coefficients ζ1 are calculated, and all images in U(t) with ζ1 greater than the threshold are put into a set V(t); if V(t) is empty, pn is considered unobservable by other images and unable to meet the reconstruction requirement, pn is deleted, and the next diffusible point is sought;
a color correction step S140, which includes:
compensation light removal substep: converting the color of the three-dimensional model from an RGB color space into an HSV color space which is more in line with the color visual characteristic, then determining the information of compensating light according to background points, and removing the compensating light in the HSV color space;
a color correction completion substep according to the underwater illumination imaging model: converting the model color without the compensation light into an RGB-space representation, wherein the color Lλ presented by a point x on the model is calculated by formula (2):

Lλ = Aλ·px·(Nλ)^D, λ ∈ {red, green, blue} (2)

where Aλ represents the RGB model of natural light, Nλ represents the absorption rate of seawater, D the depth of the seawater, and px the refractive index of the point;

obtaining the color Cλ of the model by formula (3) according to the sea depth D of the scene:

Cλ = Lλ/(Nλ)^D, λ ∈ {red, green, blue} (3).
2. The underwater scene reconstruction method according to claim 1, characterized in that:
in the motion matrix extraction step S110, the feature points are extracted with a Harris corner detection algorithm and a SIFT feature point extraction algorithm; the motion matrix is calculated with a linear transformation method, and the parameters are further optimized with a sparse bundle adjustment method, recovering a more accurate motion matrix by minimizing the projection error between the observed and predicted images.
3. The underwater scene reconstruction method according to claim 1, characterized in that:
in the redundant information removal step S120, the image correlation coefficient δ is obtained by calculating the Jaccard distance of the projected pixel points, and the threshold compared with the image correlation coefficient δ is 0.9.
4. The underwater scene reconstruction method according to claim 1, characterized in that:
in the initial matching substep, the matching of the first-eye and second-eye images with the obtained feature points is performed as follows: each feature point obtained by the DoG operator on pl is matched against the feature points obtained by the DoG operator on pr, and each feature point obtained by the Harris operator on pl is matched against the feature points obtained by the Harris operator on pr.
5. The underwater scene reconstruction method according to claim 1, characterized in that:
in the diffusion patch reconstruction substep, the patch pn generated by diffusion is also optimized and adjusted so that its correlation coefficients on the other images become as large as possible: the z coordinate of the patch center point and the inclination angle of the normal vector of pn are adjusted, and the center point and normal vector of pn are recalculated; the image sets U(t) and V(t) are updated and the correlation coefficient threshold is further increased; if the number of elements in V(t) is greater than k, where k is the number of images in which the diffusion point must be observable, the patch diffusion is considered successful and the newly generated patch pn is stored; otherwise the newly generated patch is deleted, and the next possible diffusion point is considered, until no further diffusion is possible.
6. The underwater scene reconstruction method according to claim 2, characterized in that:
in the diffusion patch reconstruction substep, during the optimization adjustment, the normal vector is optimized by varying it within a 15-degree cone, computing V(t) for several candidate values, and comparing them with the original normal vector to select the maximum; wherein k is 3.
7. The underwater scene reconstruction method according to claim 1, characterized in that:
in the scene reconstruction step S130, after all diffusion is completed, erroneous patches are also removed: patches whose initial matching correlation coefficient ε is smaller than the average correlation coefficient within their grid are deleted, and patch clusters that contain only a few patches and lie far away from all other patches are deleted.
8. The underwater scene reconstruction method according to claim 1, characterized in that:
in the compensation light removal substep, a background point that is as deep as the seabed and sufficiently far away is sought in the video, and the brightness of this background point is used as the uniform brightness of the whole model to remove the compensation light.
9. A storage medium for storing computer-executable instructions, characterized in that:
the computer executable instructions, when executed by a processor, perform the underwater scene reconstruction method of any one of claims 1-8.
CN201810377322.6A 2018-04-25 2018-04-25 Underwater scene reconstruction method based on motion recovery and storage medium Active CN108648264B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810377322.6A CN108648264B (en) 2018-04-25 2018-04-25 Underwater scene reconstruction method based on motion recovery and storage medium


Publications (2)

Publication Number Publication Date
CN108648264A CN108648264A (en) 2018-10-12
CN108648264B true CN108648264B (en) 2020-06-23

Family

ID=63747582


Country Status (1)

Country Link
CN (1) CN108648264B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110290373B (en) * 2019-03-11 2020-12-08 长春理工大学 Integrated imaging calculation reconstruction method for increasing visual angle
CN110111422B (en) * 2019-03-28 2023-03-28 浙江碧晟环境科技有限公司 Method for constructing triangular surface net at bottom of water body
CN110111413A (en) * 2019-04-08 2019-08-09 西安电子科技大学 A kind of sparse cloud three-dimension modeling method based on land and water coexistence scenario
CN110322572B (en) * 2019-06-11 2022-09-09 长江勘测规划设计研究有限责任公司 Binocular vision-based underwater culvert and tunnel inner wall three-dimensional information recovery method
CN110415332A (en) * 2019-06-21 2019-11-05 上海工程技术大学 Complex textile surface three dimensional reconstruction system and method under a kind of non-single visual angle
CN111563921B (en) * 2020-04-17 2022-03-15 西北工业大学 Underwater point cloud acquisition method based on binocular camera
CN112822478B (en) * 2020-12-31 2022-10-18 杭州电子科技大学 High-quality photo sequence acquisition method for three-dimensional reconstruction
CN113971691A (en) * 2021-09-16 2022-01-25 中国海洋大学 Underwater three-dimensional reconstruction method based on multi-view binocular structured light

Citations (3)

Publication number Priority date Publication date Assignee Title
CN102032878A (en) * 2009-09-24 2011-04-27 甄海涛 Accurate on-line measurement method based on binocular stereo vision measurement system
CN106097436A (en) * 2016-06-12 2016-11-09 广西大学 A kind of three-dimensional rebuilding method of large scene object
CN107767442A (en) * 2017-10-16 2018-03-06 浙江工业大学 A kind of foot type three-dimensional reconstruction and measuring method based on Kinect and binocular vision

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
JP2005351916A (en) * 2004-06-08 2005-12-22 Olympus Corp Binocular microscope device


Non-Patent Citations (1)

Title
Wang Xin et al., "Design of a binocular vision three-dimensional reconstruction system based on motion recovery," Optics and Precision Engineering, vol. 22, no. 5, May 2014 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant