CN113160053B - Pose information-based underwater video image restoration and splicing method


Info

Publication number: CN113160053B (application CN202110354434.1A; earlier publication CN113160053A)
Authority: CN (China)
Prior art keywords: pose information, frame, key frame, camera, image
Original language: Chinese (zh)
Inventors: 丁泉龙, 张权, 韦岗, 曹燕, 王一歌
Applicant and current assignee: South China University of Technology (SCUT)
Legal status: Active (application granted)

Classifications

    • G06T3/4038 — Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G01D21/02 — Measuring two or more variables by means not covered by a single other subclass
    • G06T5/70
    • G06T7/62 — Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T7/70 — Determining position or orientation of objects or cameras
    • G06T7/90 — Determination of colour characteristics
    • G06T2207/10016 — Video; Image sequence (indexing scheme for image analysis or enhancement; image acquisition modality)

Abstract

The invention discloses an underwater video image restoration and splicing method based on pose information, which comprises the following steps: acquiring an underwater video image with a synchronous calibration signal and the pose information of the camera through an underwater video image acquisition device; placing the video frames of the underwater video image and the pose information in one-to-one correspondence; calculating homography matrices between video frames using the pose information, obtaining the overlap area between video frames, extracting key frames according to the proportion of the overlap area, and performing time calibration on the video frames and the pose information; restoring the underwater video image with an improved shallow-water image enhancement algorithm based on adaptive parameter acquisition and relative global histogram stretching; dividing the restored key frames into several sub-modules by combining the pose information; and after the sub-modules are spliced into sub-module images, synthesizing the sub-module images into the final panoramic image. By combining pose information, the invention can quickly obtain a clear underwater panorama.

Description

Pose information-based underwater video image restoration and splicing method
Technical Field
The invention relates to the technical field of image processing, in particular to an underwater video image restoration and splicing method based on pose information.
Background
The bottom substrate, water level, water flow, abundance of floating algae and other conditions at a net cage site all influence the carrying capacity of cage culture. Good net cage site selection can provide superior survival and growth conditions for mariculture and improve the culture capacity of the water area to a certain extent. By using an unmanned ship to carry an underwater camera in an offshore shallow-water area (water depth generally 3 to 10 meters), shooting underwater video and then performing panoramic splicing, the underwater topography and landform can be obtained, which assists net cage site selection and scientific layout.
The underwater video images are acquired by an underwater camera, and before the videos are restored and spliced, key frames need to be extracted from the videos to reduce the information redundancy generated during video acquisition and reduce the amount of data to be processed. Common key frame extraction methods are based on sampling, inter-frame difference, accumulated inter-frame difference, clustering, motion information and the like. The sampling-based key frame extraction method is fast and simple to implement, but depends on the choice of sampling interval: when the interval is too small, the obtained key frames are highly redundant, and when the interval is too large, adjacent key frames may have no overlap area at all. The clustering-based key frame extraction method is sensitive to initial parameters and has high computational complexity. These methods extract key frames directly from the video without fully considering the attitude and geographic position of the underwater camera. If the video can be combined with position (GPS) information and attitude information, referred to as pose information for short (which can be acquired by a pose information recording module on the unmanned ship), this problem can be solved well.
Because of the strong attenuation caused by scattering from particles and impurities in the water and by absorption in the water medium, underwater images generally suffer from color cast, blurring and a small field of view; in addition, because of the influence of artificial light sources, underwater images also suffer from uneven illumination. Splicing the acquired underwater images directly easily leads to splicing failure, so the underwater images first need dodging (light-homogenizing) and defogging treatment. Traditional underwater image restoration methods include dark-channel-based defogging algorithms, histogram-equalization-based algorithms, homomorphic-filtering-based algorithms, wavelet-transform-based algorithms and image-fusion-based algorithms. These methods do not perform well in the case of uneven illumination caused by artificial light sources.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides an underwater video image restoration and splicing method based on pose information. First, the video frames of the underwater video image and the pose information are combined so that each video frame corresponds to one piece of pose information, and time calibration is performed on the pose information; homography matrices between video frames are calculated from the pose information, the overlap area between video frames is obtained, and key frames are extracted according to the proportion of the overlap area; an improved shallow-water image enhancement algorithm based on adaptive parameter acquisition and relative global histogram stretching is proposed to restore the underwater video image; the restored key frames are divided into several sub-modules by combining the pose information; and after the sub-modules are spliced into sub-module images, the sub-module images are spliced into the final panoramic image.
The purpose of the invention can be achieved by adopting the following technical scheme:
an underwater video image restoration and splicing method based on pose information comprises the following steps:
s1, acquiring an underwater video image with a synchronous calibration signal and pose information of a camera through an underwater video image acquisition device;
s2, carrying out one-to-one correspondence on the video frames and the pose information of the underwater video images;
s3, calculating homography matrixes among the video frames by using the pose information, further solving the overlapping area among the video frames, extracting key frames according to the proportion of the overlapping area, and performing time calibration on the video frames and the pose information;
s4, performing underwater video image restoration by adopting an improved shallow-water image enhancement algorithm based on adaptive parameter acquisition and relative global histogram stretching, and obtaining a restored key frame;
s5, dividing the restored key frame into a plurality of sub-modules by combining pose information;
and S6, splicing the sub-modules into sub-module images, and splicing the sub-module images into a final panoramic image.
Further, the underwater video image acquisition device comprises a camera module and a pose information recording module, wherein the camera module consists of a camera and a first SD card, the camera is used for shooting underwater video images, and the first SD card is used for storing the underwater video images with synchronous calibration signals; the pose information recording module consists of a main controller, a gyroscope, a GPS (global positioning system), an underwater ultrasonic distance measuring sensor, a signal lamp and a second SD (secure digital) card, wherein the main controller is used for controlling the work of other units, the gyroscope is used for acquiring the pose information of a camera, the GPS is used for acquiring the position information of the camera, the underwater ultrasonic distance measuring sensor is used for acquiring the height of the camera from the water bottom, the signal lamp is used for sending an optical signal to the camera module, and the second SD card is used for storing the pose information of the camera.
Furthermore, because the underwater video image and the pose information are not acquired by the same system, a communication mode is needed for combining the underwater video image and the pose information; common communication modes include acoustic, optical, electrical and other signals; the optical signal propagation speed is fast, the resulting error is small, and the implementation is simple, so the optical signal is used as the start and time calibration signal of the video frame and the pose information, and the step S1 is as follows:
the method comprises the steps of firstly starting a camera module to start shooting underwater video images, then starting a pose information recording module, controlling a signal lamp to give a light signal 1 to the camera before the pose information recording module starts recording pose information, then starting recording the pose information, controlling the signal lamp to give a light signal 2 to the camera at intervals, and setting Flag of current pose information to be 1.
The lighting signal 1 is used as a starting signal of a video frame and pose information during post data processing; the light signal 2 is used as a time calibration signal of video frames and pose information during post data processing.
Further, the step S2 process is as follows:
reading the underwater video image obtained in the step S1, taking the video frames detected to contain light as signal frames, taking a certain number of continuously detected signal frames as start signals, taking the last frame of all the detected signal frames as a video frame corresponding to the first piece of pose information, and then sequentially corresponding each frame of video frame to each piece of pose information one by one;
the process of judging whether the current video frame is a signal frame is as follows: uniformly divide the current video frame into four regions (upper, lower, left and right); for each region, count whether the proportion of pixels whose R-channel value is greater than a threshold alpha1 exceeds a threshold alpha2, and whether the proportion of pixels whose G-channel and B-channel values are less than alpha1 exceeds alpha2; if the value distribution of the R, G, B channels in any one region satisfies these conditions, the current video frame is a signal frame; otherwise the current video frame is not a signal frame.
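As an illustrative sketch (Python, not from the patent), the check can be written as follows; the region layout (two halves in each direction) and the requirement that the G and B conditions hold jointly are assumptions made for the example:

```python
import numpy as np

def is_signal_frame(frame_rgb: np.ndarray, alpha1: int = 150, alpha2: float = 0.8) -> bool:
    """Return True if any of the four regions of the frame looks like the red signal lamp.

    frame_rgb: H x W x 3 uint8 array in RGB order (assumed layout).
    alpha1, alpha2: thresholds from the patent (example values 150 and 0.8).
    """
    h, w, _ = frame_rgb.shape
    # Uniformly divide the frame into upper, lower, left and right regions.
    regions = [
        frame_rgb[: h // 2, :],   # upper
        frame_rgb[h // 2 :, :],   # lower
        frame_rgb[:, : w // 2],   # left
        frame_rgb[:, w // 2 :],   # right
    ]
    for region in regions:
        total = region.shape[0] * region.shape[1]
        r, g, b = region[..., 0], region[..., 1], region[..., 2]
        # Proportion of pixels whose R value exceeds alpha1 ...
        red_ratio = np.count_nonzero(r > alpha1) / total
        # ... and whose G and B values are both below alpha1 (joint condition assumed).
        gb_ratio = np.count_nonzero((g < alpha1) & (b < alpha1)) / total
        if red_ratio > alpha2 and gb_ratio > alpha2:
            return True
    return False
```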
Further, when extracting key frames, the homography matrix between video frames is obtained from the pose information and the camera imaging principle, the overlap area between video frames is then obtained, and key frames are extracted according to the proportion of the overlap area. Compared with a sampling-based key frame extraction method, this method can extract key frames quickly while ensuring that the overlap area between key frames stays within a reasonable range; compared with a clustering-based key frame extraction method, its computational complexity is small. Step S3 includes the following steps (a code sketch of this selection loop is given after step S309):
s301, taking the video frame corresponding to the first piece of pose information as a current frame, and taking the first piece of pose information as pose information of the current frame;
s302, judging whether the Flag of the pose information of the current frame is 1, if so, executing a step S303, and if not, executing a step S305;
s303, judging whether the current frame is a signal frame, if so, indicating that no time deviation occurs, executing a step S309, otherwise, indicating that the time deviation occurs, and executing a step S304;
s304, detecting N frames forwards and backwards respectively, setting the signal frame as a current frame if the signal frame is detected, executing the step S309, and ending the step S3 if the signal frame is not detected;
s305, judging whether absolute values of the roll angle and the pitch angle of the current frame are smaller than a threshold value beta1, if not, executing a step S309, and if so, executing a step S306;
s306, judging whether the key frame buffer queue is empty, if so, executing a step S308, and if not, executing a step S307;
s307, calculating the overlapping area of the current frame and the last frame in the key frame buffer queue by combining the pose information, judging whether the ratio of the overlapping area to the area of the current frame is in a beta2 range, executing a step S308 if the ratio is in a beta2 range, and executing a step S309 if the ratio is not in a beta2 range;
s308, extracting the current frame as a key frame, storing the key frame into a key frame buffer queue, and executing the step S309;
s309, judging whether a next video frame and next pose information exist, if so, reading the next video frame as a current frame, reading the next pose information as the pose information of the current frame, and repeatedly executing the step S302-the step S309, otherwise, ending the step S3.
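A minimal Python sketch of this selection loop (steps S301-S309) is given below; is_signal_frame and overlap_ratio are hypothetical stand-ins for steps S303 and S307, the dictionary-style pose records are an assumption, and the frame/pose re-alignment after step S304 is simplified:

```python
from collections import deque

def extract_key_frames(frames, poses, beta1=3.0, beta2=(0.3, 0.8), n_search=5):
    """Walk paired (frame, pose) records and collect key frames (steps S301-S309)."""
    key_frames = deque()
    idx = 0
    while idx < min(len(frames), len(poses)):
        frame, pose = frames[idx], poses[idx]
        if pose["flag"] == 1:                              # S302: calibration record
            if not is_signal_frame(frame):                 # S303: a time offset occurred
                window = range(max(0, idx - n_search), min(len(frames), idx + n_search + 1))
                found = next((j for j in window if is_signal_frame(frames[j])), None)  # S304
                if found is None:
                    break                                  # no signal frame found: stop
                idx = found                                # re-synchronize on the signal frame
        elif abs(pose["roll"]) < beta1 and abs(pose["pitch"]) < beta1:   # S305
            if not key_frames:                             # S306: buffer queue still empty
                key_frames.append((frame, pose))           # S308
            else:
                ratio = overlap_ratio(key_frames[-1][1], pose)           # S307
                if beta2[0] <= ratio <= beta2[1]:
                    key_frames.append((frame, pose))       # S308
        idx += 1                                           # S309: next frame and pose
    return list(key_frames)
```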
Further, the process of calculating the overlap area in step S307 is as follows (a code sketch is given after step S3076):
s3071, taking the last frame in the key frame buffer queue as the previous key frame, taking the shooting position of the previous key frame as the origin of coordinates, taking due west as the positive X-axis direction, due north as the positive Y-axis direction and vertically downward as the positive Z-axis direction to establish a coordinate system, and calculating the homography matrix H from the current frame to the previous key frame by combining the pose information, where H is a 3×3 matrix calculated as:

H = H_1^{-1} H_2    (1)

where H_1 is the homography matrix from the previous key frame to the projection plane, H_1^{-1} is the inverse matrix of H_1, and H_2 is the homography matrix from the current frame to the projection plane;
Since the shooting position of the previous key frame is taken as the origin of coordinates and the translation vector is therefore a zero vector, H_1 is calculated as:

H_1 = K R_1 K^{-1}    (2)

where K is the intrinsic parameter matrix of the camera, obtained by camera calibration, and R_1 is a 3×3 rotation matrix;

K = \begin{bmatrix} f/dx & 0 & u_0 \\ 0 & f/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix}    (3)

where f is the focal length of the camera; dx and dy are the physical sizes of a pixel along the two coordinate-axis directions of the key frame, in mm/pixel; (u_0, v_0) are the coordinates of the intersection of the camera optical axis with the key-frame plane, located at the center of the key frame;
R_1 is given by formula (4), the rotation matrix composed from the camera attitude when the previous key frame was shot, where \varphi_1, \theta_1, \psi_1 denote the roll angle, pitch angle and heading angle of the camera at that moment;
H_2 is calculated as:

H_2 = K R_2 \left( I - \frac{t\, n^T}{d} \right) K^{-1}    (5)

where R_2 is a 3×3 rotation matrix, I is the identity matrix, t is the three-dimensional translation vector, d is the distance from the origin of coordinates to the water-bottom plane, n = [0, 0, -1]^T is the unit normal vector of the water-bottom plane, and n^T denotes the transpose of n; R_2 is given by formula (6), the rotation matrix composed from the camera attitude when the current frame was shot, where \varphi_2, \theta_2, \psi_2 denote the roll angle, pitch angle and heading angle of the camera at that moment;
t = [x_2 - x_1, y_2 - y_1, z_2 - z_1]^T    (7)

x_2 - x_1 = (lon_2 - lon_1) \times 111110 \times \cos(lat_1 \times \pi / 180)    (8)

y_2 - y_1 = (lat_2 - lat_1) \times 111110    (9)

z_2 - z_1 = h_2 - h_1    (10)

where (x_1, y_1, z_1) are the coordinates of the shooting position of the previous key frame, (x_2, y_2, z_2) are the coordinates of the shooting position of the current frame, lon_1, lat_1, h_1 are the longitude, latitude and height above the water bottom of the previous key frame's shooting position, and lon_2, lat_2, h_2 are the longitude, latitude and height above the water bottom of the current frame's shooting position;
s3072, calculating the coordinate positions of the four vertices of the current frame after the homography transformation:

vertex = H \begin{bmatrix} 0 & width & width & 0 \\ 0 & 0 & height & height \\ 1 & 1 & 1 & 1 \end{bmatrix}    (11)

position = vertex ./ vertex(3, :)    (12)

where width is the width of the current frame, height is the height of the current frame, and vertex ./ vertex(3, :) denotes dividing each column of homogeneous coordinates by its third component, i.e. normalization, to obtain the two-dimensional plane coordinates;
s3073, counting the vertexes of the quadrangle of the current frame after homography transformation in the rectangle represented by the last key frame;
s3074, calculating the intersection point of each side of the quadrangle of the current frame after homographic transformation and each side of the previous key frame;
s3075, sequencing the obtained vertexes and intersection points clockwise or anticlockwise;
s3076, calculating the area of the polygon in the overlap area: taking any vertex of the polygon as the fixed vertex, dividing the polygon into several triangles, calculating the area of each triangle and summing them to obtain the size of the overlap area.
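For illustration, the following Python sketch assembles the homography from the pose information and computes the overlap ratio of steps S3071-S3076. It is not the patent's exact computation: the rotation composition in rotation() and the form of H_2 are assumptions (the patent's formulas (4)-(6) are only given as figures), and the polygon intersection of steps S3073-S3075 is delegated to the shapely library:

```python
import numpy as np
from shapely.geometry import Polygon

def rotation(roll, pitch, yaw):
    """Rotation matrix from roll/pitch/heading in radians (Z-Y-X composition assumed)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return rz @ ry @ rx

def overlap_ratio(key_pose, cur_pose, K, width, height):
    """Ratio of the overlap area to the current-frame area (steps S3071-S3076, sketch)."""
    # Translation from GPS/height differences, formulas (7)-(10).
    t = np.array([(cur_pose["lon"] - key_pose["lon"]) * 111110 * np.cos(np.radians(key_pose["lat"])),
                  (cur_pose["lat"] - key_pose["lat"]) * 111110,
                  cur_pose["h"] - key_pose["h"]])
    d = key_pose["h"]                        # distance from the origin to the water bottom
    n = np.array([0.0, 0.0, -1.0])           # unit normal of the water-bottom plane
    R1 = rotation(key_pose["roll"], key_pose["pitch"], key_pose["yaw"])
    R2 = rotation(cur_pose["roll"], cur_pose["pitch"], cur_pose["yaw"])
    H1 = K @ R1 @ np.linalg.inv(K)                                      # formula (2)
    H2 = K @ R2 @ (np.eye(3) - np.outer(t, n) / d) @ np.linalg.inv(K)   # formula (5), assumed form
    H = np.linalg.inv(H1) @ H2                                          # formula (1)
    # Project the four corners of the current frame, formulas (11)-(12).
    corners = np.array([[0, width, width, 0],
                        [0, 0, height, height],
                        [1, 1, 1, 1]], dtype=float)
    warped = H @ corners
    warped = (warped[:2] / warped[2]).T
    # Intersection with the previous key frame's rectangle (stands in for steps S3073-S3076).
    overlap = Polygon(warped).buffer(0).intersection(
        Polygon([(0, 0), (width, 0), (width, height), (0, height)]))
    return overlap.area / (width * height)
```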
Further, when underwater image restoration is performed on the key frame, a gamma function is used to correct the bright channel instead of the linear sliding stretching of the bright channel in the original algorithm, which gives a better dodging effect on underwater video images with uneven illumination. Step S4 is as follows (a code sketch is given after step S407):
s401, converting the key frame from an RGB color model into an HSV color model;
s402, convolving the bright channel V with Gaussian functions of different scales and weighting the results to obtain an estimate I(x, y) of the illumination component:

I(x, y) = \sum_{l=1}^{N_l} \omega_l \left[ V(x, y) * G_l(x, y) \right]    (13)

G(x_g, y_g) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{x_g^2 + y_g^2}{2\sigma^2} \right)    (14)

where (x, y) are the coordinates in the bright channel, V(x, y) is the bright channel, G(x_g, y_g) is the Gaussian function, \omega_l is the weighting coefficient of the illumination component obtained by convolving the bright channel V with the Gaussian function of the l-th scale, N_l is the total number of Gaussian functions of different scales, l = 1, 2, ..., N_l, \sigma is the standard deviation of the Gaussian function, and (x_g, y_g) are the template coordinates of the Gaussian function with the template center as the origin;
s403, correcting the bright channel V with a two-dimensional gamma function, whose formula is:

V_O(x, y) = 255 \left( \frac{V(x, y)}{255} \right)^{\gamma}    (15)

\gamma = \left( \frac{1}{2} \right)^{\frac{m - I(x, y)}{m}}    (16)

where V_O(x, y) is the corrected value of the bright channel, \gamma is the exponent used for brightness enhancement, which contains the illumination-component characteristics of the key frame, and m is the brightness mean of the illumination component;
s404, converting the key frame processed in steps S402 and S403 from the HSV color model back to the RGB color model, solving the color equalization coefficients \theta_g, \theta_b of channels G and B, and multiplying the pixel values of channels G and B by \theta_g and \theta_b respectively to perform color equalization and obtain the equalized RGB color model, where \theta_g and \theta_b are calculated as:

G_{avg} = \frac{1}{255 M N} \sum_{i=1}^{N} \sum_{j=1}^{M} I_g(i, j)    (17)

B_{avg} = \frac{1}{255 M N} \sum_{i=1}^{N} \sum_{j=1}^{M} I_b(i, j)    (18)

\theta_g = \frac{0.5}{G_{avg}}    (19)

\theta_b = \frac{0.5}{B_{avg}}    (20)

where G_{avg}, B_{avg} are the normalized mean values of channel G and channel B after restoration, M is the key-frame height, N is the key-frame width, I_g(i, j) is the value of channel G at (i, j), I_b(i, j) is the value of channel B at (i, j), i denotes the i-th column of channels G and B, and j denotes the j-th row of channels G and B;
s405, performing global histogram stretching on the equalized R, G, B channels:

P_{out} = (P_{in} - I_{min}) \times \frac{O_{max} - O_{min}}{I_{max} - I_{min}} + O_{min}    (21)

where P_{in} is the input pixel value, P_{out} is the pixel value after histogram stretching, I_{min} is the pixel value at the 0.5% position after the channel's pixel values are sorted from small to large, I_{max} is the pixel value at the 99.5% position after sorting, O_{max} is 255, and O_{min} is 0;
s406, converting the key frame processed in step S405 from the RGB color model to the CIELab color model, and modifying the color levels of channel a and channel b in the CIELab color model:

I_{out}^{\varphi} = I_{in}^{\varphi} \times \xi^{\left( 1 - |I_{in}^{\varphi}| / \delta \right)}    (22)

where I_{in}^{\varphi} is the input color value of the channel, I_{out}^{\varphi} is the output color value, \xi is an optimal experimental value, \delta is an empirical value, and \varphi denotes channel a or channel b;
and S407, converting the key frame processed in the step S406 from the CIELab color model to the RGB color model to obtain a restored key frame.
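For illustration, a condensed Python sketch of steps S401-S405 is given below. It is not the patent's exact implementation: the Gaussian scales and weights, the 0.5/mean equalization coefficients and the use of OpenCV's BGR channel ordering are assumptions, and the CIELab adjustment of steps S406-S407 is omitted for brevity:

```python
import cv2
import numpy as np

def restore_key_frame(bgr: np.ndarray) -> np.ndarray:
    """Dodging + color equalization + global histogram stretching (steps S401-S405, sketch)."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)                       # S401
    v = hsv[..., 2].astype(np.float64)

    # S402: multi-scale Gaussian estimate of the illumination component, formula (13).
    scales, weights = (15, 80, 250), (1 / 3, 1 / 3, 1 / 3)           # assumed values
    illum = sum(w * cv2.GaussianBlur(v, (0, 0), s) for w, s in zip(weights, scales))

    # S403: two-dimensional gamma correction of the bright channel, formulas (15)-(16).
    m = illum.mean()
    gamma = np.power(0.5, (m - illum) / m)
    hsv[..., 2] = np.clip(255.0 * np.power(v / 255.0, gamma), 0, 255).astype(np.uint8)
    bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)                       # back to RGB/BGR (S404)

    # S404: color equalization of the G and B channels (0.5/mean coefficients assumed).
    b, g, r = cv2.split(bgr.astype(np.float64))
    g *= 0.5 / (g.mean() / 255.0)
    b *= 0.5 / (b.mean() / 255.0)

    # S405: global histogram stretching of each channel to [0, 255], formula (21).
    out = []
    for ch in (b, g, r):
        lo, hi = np.percentile(ch, 0.5), np.percentile(ch, 99.5)
        out.append(np.clip((ch - lo) * 255.0 / max(hi - lo, 1e-6), 0, 255))
    return cv2.merge(out).astype(np.uint8)    # CIELab tweak of S406-S407 omitted
```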
Further, the step S5 process is as follows:
traversing all key frames, calculating the accumulated absolute change from the pose information of each key frame to the pose information of all other key frames, and taking the key frame with the minimum accumulated change as the globally optimal key frame, which serves as the global reference image; then recursively finding the locally optimal key frames on both sides of the globally optimal key frame and taking them as local reference images; finally, with the globally optimal key frame as the root node and the locally optimal key frames as child nodes, dividing all key frames into several sub-modules according to a binary tree structure.
Further, the accumulated absolute change from the pose information of any one key frame to the pose information of the other key frames is calculated as:

D(i_\tau) = \sum_{j_\tau = 1}^{N_\tau} \left[ \alpha \left( |\varphi_{i_\tau} - \varphi_{j_\tau}| + |\theta_{i_\tau} - \theta_{j_\tau}| \right) + \beta\, |h_{i_\tau} - h_{j_\tau}| \right]    (23)

where i_\tau \in [1, N_\tau]; \varphi_{i_\tau}, \theta_{i_\tau}, h_{i_\tau} are the roll angle, pitch angle and height above the water bottom of the i_\tau-th key frame; \varphi_{j_\tau}, \theta_{j_\tau}, h_{j_\tau} are the roll angle, pitch angle and height above the water bottom of the j_\tau-th key frame; \alpha and \beta are the weights of the influence of angle change and height change respectively; and N_\tau is the total number of key frames.
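For illustration, the reference-frame selection and binary-tree division of step S5 can be sketched in Python as follows (not from the patent; equal weights alpha = beta = 1 and dictionary-style pose records are assumptions, and formula (23) is applied within each span when choosing the local references):

```python
def pose_cost(i, indices, poses, alpha=1.0, beta=1.0):
    """Accumulated absolute pose change from key frame i to the other key frames (formula (23))."""
    return sum(alpha * (abs(poses[i]["roll"] - poses[j]["roll"])
                        + abs(poses[i]["pitch"] - poses[j]["pitch"]))
               + beta * abs(poses[i]["h"] - poses[j]["h"])
               for j in indices if j != i)

def build_reference_tree(indices, poses):
    """Pick the (globally or locally) optimal key frame of a span and recurse on both sides."""
    if not indices:
        return None
    best = min(indices, key=lambda i: pose_cost(i, indices, poses))   # global/local reference
    return {"ref": best,
            "left": build_reference_tree([i for i in indices if i < best], poses),
            "right": build_reference_tree([i for i in indices if i > best], poses)}

# Usage: tree = build_reference_tree(list(range(len(key_poses))), key_poses)
```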
Further, the step S6 process is as follows: the sub-modules at the bottom layer of the binary tree are spliced first, and splicing then proceeds upward layer by layer to obtain the final panoramic image; within a sub-module, the locally optimal key frame is used as the reference image to complete the splicing and fusion of every pair of adjacent images;
the process of splicing two adjacent images is as follows: the rough overlap area of the two adjacent images is calculated using the pose information; feature points are obtained in the overlap area with the Scale Invariant Feature Transform (SIFT) algorithm; the feature points are coarsely matched with the Best Bin First (BBF) algorithm and finely matched with the Progressive Sample Consensus (PROSAC) algorithm; the homography matrix between the two adjacent images is calculated with the Direct Linear Transform (DLT) algorithm; one image is taken as the reference image and the other as the target image, and the target image is projected onto the reference image by the homography matrix to complete the splicing of the two adjacent images;
the process of fusing two adjacent images is as follows: the optimal seam line in the overlap area of the two images to be spliced is obtained with an optimal seam line algorithm based on dynamic programming; the whole overlap region in the reference image is taken as the background image, the part of the overlap region in the target image on the target-image side of the seam line is taken as the image region to be cloned, and Poisson fusion is performed on the overlap region to complete the final image fusion.
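For illustration, a rough OpenCV sketch of this pairwise splicing and fusion is given below. It is not the patent's exact pipeline: the pose-based overlap restriction is omitted, a FLANN matcher with Lowe's ratio test stands in for BBF, RANSAC stands in for PROSAC, and cv2.seamlessClone performs the Poisson fusion without an explicit dynamic-programming seam line:

```python
import cv2
import numpy as np

def stitch_pair(ref_img, tgt_img):
    """Align a target image onto a reference image with SIFT + homography, then Poisson-fuse."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(ref_img, None)          # reference features
    kp2, des2 = sift.detectAndCompute(tgt_img, None)          # target features
    # FLANN (KD-tree) matching with Lowe's ratio test; stands in for the BBF coarse match.
    matcher = cv2.FlannBasedMatcher({"algorithm": 1, "trees": 5}, {"checks": 50})
    raw = matcher.knnMatch(des2, des1, k=2)
    good = [m for m, n in (p for p in raw if len(p) == 2) if m.distance < 0.7 * n.distance]
    src = np.float32([kp2[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # RANSAC stands in for the PROSAC fine matching; DLT underlies findHomography.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    h, w = ref_img.shape[:2]
    warped = cv2.warpPerspective(tgt_img, H, (w, h))           # project target onto reference
    # Poisson fusion of the warped target into the reference (no explicit seam-line search here).
    mask = (cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY) > 0).astype(np.uint8) * 255
    x, y, bw, bh = cv2.boundingRect(mask)
    center = (x + bw // 2, y + bh // 2)
    return cv2.seamlessClone(warped, ref_img, mask, center, cv2.NORMAL_CLONE)
```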
Compared with the prior art, the invention has the following advantages and effects:
(1) the invention combines the video frames acquired from different systems with the pose information, so that each video frame of the underwater video image corresponds to a piece of pose information, and the video frames and the pose information are subjected to time calibration.
(2) The invention calculates the homography matrix between video frames using the pose information, obtains the overlap area between video frames, and extracts key frames according to the proportion of the overlap area, which avoids splicing failures caused by too small an overlap between adjacent key frames and avoids the extra computation time caused by too large an overlap, while allowing key frames to be extracted quickly.
(3) The invention provides an improved shallow water image enhancement method based on self-adaptive parameter acquisition and relative global histogram stretching, which has good light homogenizing and defogging effects on an underwater video image with uneven illumination.
(4) The invention selects global and local projection planes from the key frame by combining the pose information, divides the key frame into a plurality of sub-modules according to the structure of the binary tree, and adopts the method of firstly splicing locally and then splicing integrally, thereby reducing the accumulated error generated by splicing multiple images.
(5) In the feature point extraction process, the overlapping area is calculated by using the pose information, the feature points are extracted in the overlapping area, the feature point extraction range is narrowed, and the time for extracting the feature points and matching the feature points is effectively shortened.
Drawings
FIG. 1 is a block diagram of an underwater video image restoration and splicing device based on pose information disclosed in the present invention;
FIG. 2 is a flowchart of the operation of a pose information-based underwater video image restoration and stitching method disclosed in the present invention;
FIG. 3 is a flow chart of the present invention for acquiring an underwater video image with a synchronized calibration signal and pose information of a camera;
FIG. 4 is a flow chart of the present invention for mapping video frames and pose information collected from different systems;
FIG. 5 is a flow chart of the present invention for obtaining keyframes in conjunction with pose information;
FIG. 6 is a flow chart of the present invention for calculating the overlap area of two images using pose information;
fig. 7 is a flow chart of underwater image restoration in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
As shown in fig. 2, the embodiment provides a pose information-based underwater video image restoration and stitching method.
S1, acquiring the underwater video image with the synchronous calibration signal and the pose information of the camera through the underwater video image acquisition device,
the underwater video image acquisition device is structurally shown in fig. 1 and comprises a camera module and a pose information recording module, wherein the camera module consists of a camera and a first SD card, the camera is used for shooting underwater video images, and the first SD card is used for storing video files with synchronous calibration signals; the pose information recording module consists of a main controller, a gyroscope, a GPS (global positioning system), an underwater ultrasonic distance measuring sensor, a signal lamp and a second SD (secure digital) card, wherein the main controller is used for controlling the work of the other units, the gyroscope is used for acquiring the pose information of the camera, the GPS is used for acquiring the position information of the camera, the underwater ultrasonic distance measuring sensor is used for acquiring the height of the camera above the water bottom, the signal lamp is used for sending an optical signal to the camera module, and the second SD card is used for storing the pose information of the camera.
The flow of step S1 is shown in fig. 3;
s101, starting a camera module to start shooting underwater video images;
s102, starting the pose information recording module, which controls the red signal lamp (red is used because underwater images present a blue-green color cast) to stay lit for 6 periods (the camera and the pose information recording module use the same frequency) and then stay off for 1 period;
s103, the pose information recording module lights the red signal lamp for 1 period, records the pose information at the same time, sets the Flag of the current pose information to 1, and sets the period count n to 0;
s104, setting the period count n = n + 1;
s105, judging whether 3000 periods have passed; if so, executing step S103, and if not, executing step S106;
s106, the pose information recording module records the current pose information and executes step S107;
s107, judging whether data collection has stopped; if not, executing step S104, and if so, ending step S1 (a timing sketch is given below).
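For illustration, the synchronization schedule of steps S101-S107 can be simulated with the following small Python sketch (hypothetical; it only reproduces the lamp/Flag timing described above):

```python
def lamp_and_flag_schedule(total_periods: int):
    """Yield (period_index, lamp_on, flag) tuples for the recording schedule of steps S101-S107."""
    period = 0
    # S102: start signal -- lamp lit for 6 periods, then off for 1 period (no pose record yet).
    for _ in range(6):
        yield period, True, None
        period += 1
    yield period, False, None
    period += 1
    while period < total_periods:
        # S103: calibration flash -- lamp lit for 1 period, pose recorded with Flag = 1.
        yield period, True, 1
        period += 1
        # S104-S106: 2999 ordinary recording periods -- lamp off, pose recorded with Flag = 0.
        for _ in range(2999):
            if period >= total_periods:
                return
            yield period, False, 0
            period += 1

# Usage: the lit frames in the video mark the start signal and the periodic time-calibration flashes.
```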
S2, carrying out one-to-one correspondence on the video frames and the pose information of the underwater video images, wherein the flow of the step S2 is shown in FIG. 4;
s201, reading the underwater video image obtained in the step S1, and taking a first frame of the video as a current video frame;
s202, judging whether the current video frame is a signal frame or not, and if so, executing a step S203; if not, executing step S204;
s203, respectively detecting 5 frames forwards and backwards, calculating the total number of the continuously detected signal frames, if the total number is greater than or equal to 6, executing a step S204, otherwise executing a step S205;
s204, taking the 7th frame after the 1st frame of the continuously detected signal frames as the current video frame, and judging whether it is a signal frame; if not, executing step S205; if so, this frame is the video frame corresponding to the first piece of pose information, each subsequent video frame is then matched one-to-one with each piece of pose information, and step S2 ends;
s205, judging whether a 6 th frame behind the current video frame exists, if so, reading the 6 th frame behind the current video frame as the current video frame, repeatedly executing the steps S202 to S205, and if not, ending the step S2;
the process of judging whether the current video frame is a signal frame is as follows:
uniformly divide the current video frame into four regions (upper, lower, left and right); for each region, count whether the proportion of pixels whose R-channel value is greater than a threshold alpha1 exceeds a threshold alpha2, and whether the proportion of pixels whose G-channel and B-channel values are less than alpha1 exceeds alpha2; if the value distribution of the R, G, B channels in any one region satisfies these conditions, the current video frame is a signal frame; otherwise, the current video frame is not a signal frame;
in this embodiment, the value of the threshold value alpha1 is 150, and the value of the threshold value alpha2 is 0.8.
S3, calculating homography matrixes among the video frames by using the pose information, further solving the overlapping area among the video frames, extracting key frames according to the proportion of the overlapping area, and performing time calibration on the video frames and the pose information; the flow of this step is shown in FIG. 5;
s301, taking the video frame corresponding to the first piece of pose information as a current frame, and taking the first piece of pose information as pose information of the current frame;
s302, judging whether the Flag of the pose information of the current frame is 1, if so, executing a step S303, and if not, executing a step S305;
s303, judging whether the current frame is a signal frame, if so, indicating that no time deviation occurs, executing a step S309, otherwise, indicating that the time deviation occurs, and executing a step S304;
s304, detecting N frames forwards and backwards respectively, setting the signal frame as a current frame if the signal frame is detected, executing the step S309, and ending the step S3 if the signal frame is not detected;
s305, judging whether the absolute values of the roll angle and the pitch angle of the current frame are smaller than a threshold value beta1; if not, executing step S309, and if so, executing step S306;
s306, judging whether the key frame buffer queue is empty, if so, executing a step S308, and if not, executing a step S307;
s307, calculating the overlapping area of the current frame and the last frame in the key frame buffer queue by combining the pose information, judging whether the ratio of the overlapping area to the area of the current frame is in a beta2 range, executing a step S308 if the ratio is in a beta2 range, and executing a step S309 if the ratio is not in a beta2 range;
s308, extracting the current frame as a key frame, storing the key frame into a key frame buffer queue, and executing the step S309;
s309, judging whether a next video frame and next pose information exist, if so, reading the next video frame as a current frame, reading the next pose information as the pose information of the current frame, and repeatedly executing the step S302-the step S309, otherwise, ending the step S3.
In this embodiment, the value of the threshold beta1 is 3, the value of beta2 ranges from 0.3 to 0.8, and N is 5.
The flow of calculating the overlap area in step S307 is shown in fig. 6, and the procedure is as follows:
s3071, taking the last frame in the key frame buffer queue as the previous key frame, taking the shooting position of the previous key frame as the origin of coordinates, taking due west as the positive X-axis direction, due north as the positive Y-axis direction and vertically downward as the positive Z-axis direction to establish a coordinate system, and calculating the homography matrix H from the current frame to the previous key frame by combining the pose information, where H is a 3×3 matrix calculated as:

H = H_1^{-1} H_2    (1)

where H_1 is the homography matrix from the previous key frame to the projection plane, H_1^{-1} is the inverse matrix of H_1, and H_2 is the homography matrix from the current frame to the projection plane;
H_1 is calculated as:

H_1 = K R_1 K^{-1}    (2)

where K is the intrinsic parameter matrix of the camera, obtained by camera calibration, and R_1 is a 3×3 rotation matrix;

K = \begin{bmatrix} f/dx & 0 & u_0 \\ 0 & f/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix}    (3)

where f is the focal length of the camera; dx and dy are the physical sizes of a pixel along the two coordinate-axis directions of the key frame, in mm/pixel; (u_0, v_0) are the coordinates of the intersection of the camera optical axis with the key-frame plane, located at the center of the key frame;
R_1 is given by formula (4), the rotation matrix composed from the camera attitude when the previous key frame was shot, where \varphi_1, \theta_1, \psi_1 denote the roll angle, pitch angle and heading angle of the camera at that moment;
H_2 is calculated as:

H_2 = K R_2 \left( I - \frac{t\, n^T}{d} \right) K^{-1}    (5)

where R_2 is a 3×3 rotation matrix, I is the identity matrix, t is the three-dimensional translation vector, d is the distance from the origin of coordinates to the water-bottom plane, n = [0, 0, -1]^T is the unit normal vector of the water-bottom plane, and n^T denotes the transpose of n; R_2 is given by formula (6), the rotation matrix composed from the camera attitude when the current frame was shot, where \varphi_2, \theta_2, \psi_2 denote the roll angle, pitch angle and heading angle of the camera at that moment;
t = [x_2 - x_1, y_2 - y_1, z_2 - z_1]^T    (7)

x_2 - x_1 = (lon_2 - lon_1) \times 111110 \times \cos(lat_1 \times \pi / 180)    (8)

y_2 - y_1 = (lat_2 - lat_1) \times 111110    (9)

z_2 - z_1 = h_2 - h_1    (10)

where (x_1, y_1, z_1) are the coordinates of the shooting position of the previous key frame, (x_2, y_2, z_2) are the coordinates of the shooting position of the current frame, lon_1, lat_1, h_1 are the longitude, latitude and height above the water bottom of the previous key frame's shooting position, and lon_2, lat_2, h_2 are the longitude, latitude and height above the water bottom of the current frame's shooting position;
s3072, calculating the coordinate positions of the four vertices of the current frame after the homography transformation:

vertex = H \begin{bmatrix} 0 & width & width & 0 \\ 0 & 0 & height & height \\ 1 & 1 & 1 & 1 \end{bmatrix}    (11)

position = vertex ./ vertex(3, :)    (12)

where width is the width of the current frame, height is the height of the current frame, and vertex ./ vertex(3, :) denotes dividing each column of homogeneous coordinates by its third component, i.e. normalization, to obtain the two-dimensional plane coordinates;
s3073, counting the vertexes of the quadrangle of the current frame after homography transformation in the rectangle represented by the last key frame;
s3074, calculating the intersection point of each side of the quadrangle of the current frame after homographic transformation and each side of the previous key frame;
s3075, sequencing the obtained vertexes and intersection points clockwise or anticlockwise;
s3076, calculating the area of the polygon in the overlap area: taking any vertex of the polygon as the fixed vertex, dividing the polygon into several triangles, calculating the area of each triangle and summing them to obtain the size of the overlap area, where the area of any triangle is calculated as:

S_{ABC} = \frac{1}{2} \left| \vec{AB} \times \vec{AC} \right|

where S_{ABC} is the area of triangle ABC, and \times denotes the cross product of vectors.
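For illustration, a small Python sketch of this triangle-fan area computation follows (it assumes the overlap polygon is convex and its vertices are already ordered as in step S3075):

```python
import numpy as np

def polygon_area(vertices) -> float:
    """Area of a convex polygon by fanning triangles from the first vertex (step S3076)."""
    pts = np.asarray(vertices, dtype=float)
    area = 0.0
    for k in range(1, len(pts) - 1):
        ab = pts[k] - pts[0]      # vector from the fixed vertex to vertex k
        ac = pts[k + 1] - pts[0]  # vector from the fixed vertex to vertex k+1
        area += 0.5 * abs(np.cross(ab, ac))   # triangle area = |AB x AC| / 2
    return area

# Example: polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)]) == 12.0
```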
S4, restoring the underwater video image by adopting an improved shallow-water image enhancement algorithm based on adaptive parameter acquisition and relative global histogram stretching to obtain a restored key frame, wherein the flow of step S4 is shown in FIG. 7;
s401, converting the key frame from an RGB color model into an HSV color model;
s402, convolving the bright channel V with Gaussian functions of different scales and weighting the results to obtain an estimate I(x, y) of the illumination component:

I(x, y) = \sum_{l=1}^{N_l} \omega_l \left[ V(x, y) * G_l(x, y) \right]    (13)

G(x_g, y_g) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{x_g^2 + y_g^2}{2\sigma^2} \right)    (14)

where (x, y) are the coordinates in the bright channel, V(x, y) is the bright channel, G(x_g, y_g) is the Gaussian function, \omega_l is the weighting coefficient of the illumination component obtained by convolving the bright channel V with the Gaussian function of the l-th scale, N_l is the total number of Gaussian functions of different scales, l = 1, 2, ..., N_l, \sigma is the standard deviation of the Gaussian function, and (x_g, y_g) are the template coordinates of the Gaussian function with the template center as the origin;
s403, correcting the bright channel V with a two-dimensional gamma function, whose formula is:

V_O(x, y) = 255 \left( \frac{V(x, y)}{255} \right)^{\gamma}    (15)

\gamma = \left( \frac{1}{2} \right)^{\frac{m - I(x, y)}{m}}    (16)

where V_O(x, y) is the corrected value of the bright channel, \gamma is the exponent used for brightness enhancement, which contains the illumination-component characteristics of the key frame, and m is the brightness mean of the illumination component;
s404, converting the key frame processed in steps S402 and S403 from the HSV color model back to the RGB color model, solving the color equalization coefficients \theta_g, \theta_b of channels G and B, and multiplying the pixel values of channels G and B by \theta_g and \theta_b respectively to perform color equalization and obtain the equalized RGB color model, where \theta_g and \theta_b are calculated as:

G_{avg} = \frac{1}{255 M N} \sum_{i=1}^{N} \sum_{j=1}^{M} I_g(i, j)    (17)

B_{avg} = \frac{1}{255 M N} \sum_{i=1}^{N} \sum_{j=1}^{M} I_b(i, j)    (18)

\theta_g = \frac{0.5}{G_{avg}}    (19)

\theta_b = \frac{0.5}{B_{avg}}    (20)

where G_{avg}, B_{avg} are the normalized mean values of channel G and channel B after restoration, M is the key-frame height, N is the key-frame width, I_g(i, j) is the value of channel G at (i, j), I_b(i, j) is the value of channel B at (i, j), i denotes the i-th column of channels G and B, and j denotes the j-th row of channels G and B;
s405, performing global histogram stretching on the equalized R, G, B channels:

P_{out} = (P_{in} - I_{min}) \times \frac{O_{max} - O_{min}}{I_{max} - I_{min}} + O_{min}    (21)

where P_{in} is the input pixel value, P_{out} is the pixel value after histogram stretching, I_{min} is the pixel value at the 0.5% position after the channel's pixel values are sorted from small to large, I_{max} is the pixel value at the 99.5% position after sorting, O_{max} is 255, and O_{min} is 0;
s406, converting the key frame processed in step S405 from the RGB color model to the CIELab color model, and modifying the color levels of channel a and channel b in the CIELab color model:

I_{out}^{\varphi} = I_{in}^{\varphi} \times \xi^{\left( 1 - |I_{in}^{\varphi}| / \delta \right)}    (22)

where I_{in}^{\varphi} is the input color value of the channel, I_{out}^{\varphi} is the output color value, \xi is an optimal experimental value, \delta is an empirical value, and \varphi denotes channel a or channel b;
and S407, converting the key frame processed in the step S406 from the CIELab color model to the RGB color model to obtain a restored key frame.
S5, dividing the restored key frame into a plurality of sub-modules by combining pose information;
traversing all key frames, calculating the accumulated absolute change from the pose information of each key frame to the pose information of all other key frames, and taking the key frame with the minimum accumulated change as the globally optimal key frame, which serves as the global reference image; then recursively finding the locally optimal key frames on both sides of the globally optimal key frame and taking them as local reference images; finally, with the globally optimal key frame as the root node and the locally optimal key frames as child nodes, dividing all key frames into several sub-modules according to a binary tree structure.
The accumulated absolute change from the pose information of any one key frame to the pose information of the other key frames is calculated as:

D(i_\tau) = \sum_{j_\tau = 1}^{N_\tau} \left[ \alpha \left( |\varphi_{i_\tau} - \varphi_{j_\tau}| + |\theta_{i_\tau} - \theta_{j_\tau}| \right) + \beta\, |h_{i_\tau} - h_{j_\tau}| \right]    (23)

where i_\tau \in [1, N_\tau]; \varphi_{i_\tau}, \theta_{i_\tau}, h_{i_\tau} are the roll angle, pitch angle and height above the water bottom of the i_\tau-th key frame; \varphi_{j_\tau}, \theta_{j_\tau}, h_{j_\tau} are the roll angle, pitch angle and height above the water bottom of the j_\tau-th key frame; \alpha and \beta are the weights of the influence of angle change and height change respectively; and N_\tau is the total number of key frames.
S6, splicing the sub-modules into sub-module images, and then splicing the sub-module images into a final panoramic image;
First, the sub-modules at the bottom layer of the binary tree are spliced, and splicing then proceeds upward layer by layer to obtain the final panoramic image; within a sub-module, the locally optimal key frame is used as the reference image to complete the splicing and fusion of every pair of adjacent images;
the process of splicing two adjacent images is as follows: the rough overlap area of the two adjacent images is calculated using the pose information; feature points are obtained in the overlap area with the Scale Invariant Feature Transform (SIFT) algorithm; the feature points are coarsely matched with the Best Bin First (BBF) algorithm and finely matched with the Progressive Sample Consensus (PROSAC) algorithm; the homography matrix between the two adjacent images is calculated with the Direct Linear Transform (DLT) algorithm; one image is taken as the reference image and the other as the target image, and the target image is projected onto the reference image by the homography matrix to complete the splicing of the two adjacent images;
the process of fusing two adjacent images is as follows: the optimal seam line in the overlap area of the two images to be spliced is obtained with an optimal seam line algorithm based on dynamic programming; the whole overlap region in the reference image is taken as the background image, the part of the overlap region in the target image on the target-image side of the seam line is taken as the image region to be cloned, and Poisson fusion is performed on the overlap region to complete the final image fusion.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (9)

1. An underwater video image restoration and splicing method based on pose information is characterized by comprising the following steps:
s1, acquiring an underwater video image with a synchronous calibration signal and pose information of a camera through an underwater video image acquisition device;
s2, carrying out one-to-one correspondence on the video frames and the pose information of the underwater video images;
s3, calculating homography matrixes among the video frames by using the pose information, further solving the overlapping area among the video frames, extracting key frames according to the proportion of the overlapping area, and performing time calibration on the video frames and the pose information;
s4, performing underwater video image restoration by adopting an improved shallow-water image enhancement algorithm based on adaptive parameter acquisition and relative global histogram stretching, and obtaining a restored key frame; the step S4 process is as follows:
s401, converting the key frame from an RGB color model into an HSV color model;
s402, convolving the bright channel V with Gaussian functions of different scales and weighting the results to obtain an estimate I(x, y) of the illumination component:

I(x, y) = \sum_{l=1}^{N_l} \omega_l \left[ V(x, y) * G_l(x, y) \right]

G(x_g, y_g) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{x_g^2 + y_g^2}{2\sigma^2} \right)

where (x, y) are the coordinates in the bright channel, V(x, y) is the bright channel, G(x_g, y_g) is the Gaussian function, \omega_l is the weighting coefficient of the illumination component obtained by convolving the bright channel V with the Gaussian function of the l-th scale, N_l is the total number of Gaussian functions of different scales, l = 1, 2, ..., N_l, \sigma is the standard deviation of the Gaussian function, and (x_g, y_g) are the template coordinates of the Gaussian function with the template center as the origin;
s403, correcting the bright channel V with a two-dimensional gamma function, whose formula is:

V_O(x, y) = 255 \left( \frac{V(x, y)}{255} \right)^{\gamma}

\gamma = \left( \frac{1}{2} \right)^{\frac{m - I(x, y)}{m}}

where V_O(x, y) is the corrected value of the bright channel, \gamma is the exponent used for brightness enhancement, which contains the illumination-component characteristics of the key frame, and m is the brightness mean of the illumination component;
s404, converting the key frame processed in steps S402 and S403 from the HSV color model back to the RGB color model, solving the color equalization coefficients \theta_g, \theta_b of channels G and B, and multiplying the pixel values of channels G and B by \theta_g and \theta_b respectively to perform color equalization and obtain the equalized RGB color model, where \theta_g and \theta_b are calculated as:

G_{avg} = \frac{1}{255 M N} \sum_{i=1}^{N} \sum_{j=1}^{M} I_g(i, j)

B_{avg} = \frac{1}{255 M N} \sum_{i=1}^{N} \sum_{j=1}^{M} I_b(i, j)

\theta_g = \frac{0.5}{G_{avg}}

\theta_b = \frac{0.5}{B_{avg}}

where G_{avg}, B_{avg} are the normalized mean values of channel G and channel B after restoration, M is the key-frame height, N is the key-frame width, I_g(i, j) is the value of channel G at (i, j), I_b(i, j) is the value of channel B at (i, j), i denotes the i-th column of channels G and B, and j denotes the j-th row of channels G and B;
s405, performing global histogram stretching on the equalized R, G, B channels:

P_{out} = (P_{in} - I_{min}) \times \frac{O_{max} - O_{min}}{I_{max} - I_{min}} + O_{min}

where P_{in} is the input pixel value, P_{out} is the pixel value after histogram stretching, I_{min} is the pixel value at the 0.5% position after the channel's pixel values are sorted from small to large, I_{max} is the pixel value at the 99.5% position after sorting, O_{max} is 255, and O_{min} is 0;
s406, converting the key frame processed in step S405 from the RGB color model to the CIELab color model, and modifying the color levels of channel a and channel b in the CIELab color model:

I_{out}^{\varphi} = I_{in}^{\varphi} \times \xi^{\left( 1 - |I_{in}^{\varphi}| / \delta \right)}

where I_{in}^{\varphi} is the input color value of the channel, I_{out}^{\varphi} is the output color value, \xi is an optimal experimental value, \delta is an empirical value, and \varphi denotes channel a or channel b;
s407, converting the key frame processed in the step S406 from the CIELab color model to an RGB color model to obtain a restored key frame;
s5, dividing the restored key frame into a plurality of sub-modules by combining pose information;
and S6, splicing the sub-modules into sub-module images, and splicing the sub-module images into a final panoramic image.
2. The pose information-based underwater video image restoration and splicing method according to claim 1, wherein the underwater video image acquisition device comprises a camera module and a pose information recording module, wherein the camera module comprises a camera and a first SD card, the camera is used for shooting underwater video images, and the first SD card is used for storing the underwater video images with synchronous calibration signals; the pose information recording module consists of a main controller, a gyroscope, a GPS (global positioning system), an underwater ultrasonic distance measuring sensor, a signal lamp and a second SD (secure digital) card, wherein the main controller is used for controlling the work of other units, the gyroscope is used for acquiring the pose information of a camera, the GPS is used for acquiring the position information of the camera, the underwater ultrasonic distance measuring sensor is used for acquiring the height of the camera from the water bottom, the signal lamp is used for sending an optical signal to the camera module, and the second SD card is used for storing the pose information of the camera.
3. The pose information-based underwater video image restoration and splicing method according to claim 2, wherein the step S1 is as follows:
the camera module is first started to begin shooting underwater video images, and then the pose information recording module is started; before the pose information recording module starts recording pose information, it controls the signal lamp to send light signal 1 to the camera and then begins recording the pose information; thereafter, at regular intervals, it controls the signal lamp to send light signal 2 to the camera and sets the Flag of the current piece of pose information to 1.
4. The pose information-based underwater video image restoration and splicing method according to claim 1, wherein the step S2 is as follows:
reading the underwater video images obtained in step S1, taking the video frames in which the light signal is detected as signal frames, treating a certain number of consecutively detected signal frames as the start signal, taking the last of all the detected signal frames as the video frame corresponding to the first piece of pose information, and then matching each subsequent video frame to each piece of pose information one by one in order;
the process of judging whether the current video frame is a signal frame is as follows (a code sketch follows this claim): the current video frame is uniformly divided into upper, lower, left and right regions; for each region, it is counted whether the proportion of pixels whose channel R value is greater than a threshold α1 in all pixels of the region is greater than a threshold α2, and whether the proportion of pixels whose channel G and B values are less than the threshold α1 in all pixels of the region is greater than the threshold α2; if the value distribution of the R, G and B channels in one region satisfies these conditions, the current video frame is a signal frame; otherwise the current video frame is not a signal frame.
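The signal-frame test of claim 4 can be sketched as below. The thresholds α1 and α2 are illustrative values, the upper/lower/left/right regions are read here as the four quadrants of the frame, and RGB channel order is assumed.

```python
import numpy as np

def is_signal_frame(frame_rgb, alpha1=200, alpha2=0.3):
    """Claim 4 signal-frame test (sketch): look for one quadrant dominated by
    strong R values with weak G and B values."""
    h, w, _ = frame_rgb.shape
    quadrants = [frame_rgb[:h // 2, :w // 2], frame_rgb[:h // 2, w // 2:],
                 frame_rgb[h // 2:, :w // 2], frame_rgb[h // 2:, w // 2:]]
    for region in quadrants:
        total = region.shape[0] * region.shape[1]
        r, g, b = region[..., 0], region[..., 1], region[..., 2]
        red_hit = np.count_nonzero(r > alpha1) / total
        dark_gb = np.count_nonzero((g < alpha1) & (b < alpha1)) / total
        # The light is taken as present when channel R is strong and channels
        # G and B are weak over a large enough share of the region's pixels.
        if red_hit > alpha2 and dark_gb > alpha2:
            return True
    return False
```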
5. The pose information-based underwater video image restoration and splicing method according to claim 4, wherein the step S3 is as follows:
s301, taking the video frame corresponding to the first piece of pose information as a current frame, and taking the first piece of pose information as pose information of the current frame;
s302, judging whether the Flag of the pose information of the current frame is 1, if so, executing a step S303, and if not, executing a step S305;
s303, judging whether the current frame is a signal frame, if so, indicating that no time deviation occurs, executing a step S309, otherwise, indicating that the time deviation occurs, and executing a step S304;
s304, searching N frames forwards and N frames backwards for a signal frame; if a signal frame is found, setting it as the current frame and executing step S309; if no signal frame is found, ending step S3;
s305, judging whether the absolute values of the roll angle and the pitch angle of the current frame are both smaller than a threshold β1; if not, executing step S309, and if so, executing step S306;
s306, judging whether the key frame buffer queue is empty, if so, executing a step S308, and if not, executing a step S307;
s307, calculating the overlapping area of the current frame and the last frame in the key frame buffer queue by combining the pose information, and judging whether the ratio of the overlapping area to the area of the current frame lies in the range β2; if so, executing step S308, otherwise executing step S309;
s308, extracting the current frame as a key frame, storing the key frame into a key frame buffer queue, and executing the step S309;
s309, judging whether a next video frame and a next piece of pose information exist; if so, reading the next video frame as the current frame and the next piece of pose information as the pose information of the current frame, and repeating steps S302 to S309; otherwise, ending step S3 (a consolidated code sketch of this selection loop follows this claim).
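The selection loop of claim 5 can be condensed into the sketch below. `frames` and `poses` are assumed to be index-aligned lists, β1 (degrees), the overlap range β2 and the search window N are illustrative values, and `is_signal_frame` / `overlap_ratio` refer to helpers sketched elsewhere in this document.

```python
def extract_key_frames(frames, poses, beta1=5.0, beta2=(0.3, 0.7), n_search=5):
    """Sketch of steps S301-S309: walk the aligned frame/pose sequence,
    re-synchronise on expected signal frames, and keep key frames whose
    attitude is near level and whose overlap with the previous key frame
    falls in the allowed range."""
    key_frames, key_poses = [], []
    idx = 0
    while idx < min(len(frames), len(poses)):
        frame, pose = frames[idx], poses[idx]
        if pose["flag"] == 1:                                   # S302
            if not is_signal_frame(frame):                      # S303: time drift detected
                window = range(max(0, idx - n_search),
                               min(len(frames), idx + n_search + 1))
                found = next((k for k in window if is_signal_frame(frames[k])), None)
                if found is None:                               # S304: give up (end of S3)
                    break
                idx = found
        elif abs(pose["roll"]) < beta1 and abs(pose["pitch"]) < beta1:   # S305
            if not key_frames:                                  # S306 -> S308
                key_frames.append(frame); key_poses.append(pose)
            else:                                               # S307
                ratio = overlap_ratio(frame, pose, key_frames[-1], key_poses[-1])
                if beta2[0] <= ratio <= beta2[1]:               # keep suitable overlaps
                    key_frames.append(frame); key_poses.append(pose)
        idx += 1                                                # S309: next frame / pose
    return key_frames, key_poses
```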
6. The pose information-based underwater video image restoration and splicing method according to claim 5, wherein the process of calculating the overlapping area in step S307 is as follows:
s3071, taking the last frame in the key frame buffer queue as the previous key frame, establishing a coordinate system with the shooting position of the previous key frame as the origin, due west as the positive X-axis direction, due north as the positive Y-axis direction and vertically downward as the positive Z-axis direction, taking the water bottom plane as the projection plane, and calculating the homography matrix H from the current frame to the previous key frame by combining the pose information, where H is a 3×3 matrix computed as follows:
H = H1^(-1) · H2        (1)

where H1 is the homography matrix from the previous key frame to the projection plane, H1^(-1) is its inverse matrix, and H2 is the homography matrix from the current frame to the projection plane;
the calculation formula of H1 is as follows:

H1 = K · R1 · K^(-1)        (2)

where K is the internal parameter matrix of the camera, obtained by camera calibration, and R1 is a 3×3 rotation matrix;
K = [ f/dx, 0, u0 ; 0, f/dy, v0 ; 0, 0, 1 ]        (3)

where f represents the focal length of the camera; dx and dy respectively represent the physical size of each pixel on the key frame along the two coordinate axis directions, in mm/pixel; and (u0, v0) are the coordinates of the intersection point of the camera's optical axis with the key frame plane, located at the center of the key frame;
R1 is constructed from the attitude angles of the camera when the previous key frame was shot; its explicit matrix (formula (4)) is rendered as an image in the original, where φ1, θ1 and ψ1 respectively represent the roll angle, pitch angle and heading angle of the camera at that moment;
the calculation formula of H2 (formula (5)) is rendered as an image in the original and is expressed in terms of K, R2, the identity matrix I, the translation vector t, the distance d and the plane normal n, where R2 is a 3×3 rotation matrix, I is the identity matrix, t is the three-dimensional translation vector, d is the distance from the coordinate origin to the water bottom plane, n is the unit normal vector of the water bottom plane, and n^T denotes the transpose of the vector n;
R2 is likewise constructed from the attitude angles of the camera when the current frame was shot; its explicit matrix (formula (6)) is rendered as an image in the original, where φ2, θ2 and ψ2 are respectively the roll angle, pitch angle and heading angle of the camera at that moment;
t = [x2 − x1, y2 − y1, z2 − z1]^T        (7)
x2-x1=(lon2-lon1)×111110×cos(lat1×π/180) (8)
y2-y1=(lat2-lat1)×111110 (9)
z2-z1=h2-h1 (10)
where (x1, y1, z1) are the coordinates of the shooting position of the previous key frame and (x2, y2, z2) the coordinates of the shooting position of the current frame; lon1, lat1 and h1 are respectively the longitude, latitude and height above the water bottom of the previous key frame's shooting position, and lon2, lat2 and h2 are respectively those of the current frame's shooting position;
s3072, calculating the coordinate positions of the four vertexes of the current frame after the homography transformation:

vertex = H · [ 0, width, width, 0 ; 0, 0, height, height ; 1, 1, 1, 1 ]        (11)
position = vertex ./ vertex(3, :)        (12)

where width and height represent the width and height of the current frame, the columns of the matrix in (11) are the homogeneous coordinates of the four frame corners, and vertex./vertex(3,:) denotes normalizing the homogeneous coordinates to obtain the two-dimensional plane coordinates;
s3073, counting the vertexes of the transformed quadrangle of the current frame that fall inside the rectangle represented by the previous key frame;
s3074, calculating the intersection points of each side of the transformed quadrangle of the current frame with each side of the previous key frame's rectangle;
s3075, sequencing the obtained vertexes and intersection points clockwise or anticlockwise;
s3076, solving for the area of the polygon in the overlapping area: taking any vertex of the polygon as a fixed point, dividing the polygon into several triangles, calculating the area of each triangle and summing them to obtain the size of the overlapping area (a consolidated code sketch of this overlap computation follows this claim).
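The pose-to-homography and overlap computation of claim 6 can be sketched as one Python/NumPy module, as referenced above. Formulas (4), (5) and (6) are rendered as images in the original, so the Euler-angle convention (Z-Y-X, i.e. heading-pitch-roll) and the plane-induced form of H2 used below are assumptions; the overlap polygon is obtained here with Sutherland-Hodgman clipping and the shoelace formula rather than the vertex/intersection enumeration of steps S3073-S3075, which yields the same area.

```python
import numpy as np

def intrinsic_matrix(f, dx, dy, u0, v0):
    """Formula (3): camera internal parameter matrix K."""
    return np.array([[f / dx, 0.0, u0],
                     [0.0, f / dy, v0],
                     [0.0, 0.0, 1.0]])

def rotation_from_angles(roll, pitch, heading):
    """Rotation matrix from roll/pitch/heading (radians).  Formulas (4) and (6)
    are images in the source; a Z-Y-X (heading-pitch-roll) convention is ASSUMED."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    ch, sh = np.cos(heading), np.sin(heading)
    rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    rz = np.array([[ch, -sh, 0], [sh, ch, 0], [0, 0, 1]])
    return rz @ ry @ rx

def translation_from_gps(lon1, lat1, h1, lon2, lat2, h2):
    """Formulas (7)-(10): metric translation between the two shooting positions."""
    dx = (lon2 - lon1) * 111110 * np.cos(lat1 * np.pi / 180)
    dy = (lat2 - lat1) * 111110
    dz = h2 - h1
    return np.array([[dx], [dy], [dz]])

def pose_homography(K, R1, R2, t, d, n=np.array([[0.0], [0.0], [1.0]])):
    """Formulas (1)-(2): H = H1^-1 H2 with H1 = K R1 K^-1.  The plane-induced
    form of H2 (formula (5) is an image in the source) is ASSUMED to be the
    textbook H2 = K (R2 - t n^T / d) K^-1."""
    K_inv = np.linalg.inv(K)
    H1 = K @ R1 @ K_inv
    H2 = K @ (R2 - (t @ n.T) / d) @ K_inv
    return np.linalg.inv(H1) @ H2

def overlap_area(H, width, height):
    """Formulas (11)-(12) and steps S3073-S3076: project the current frame's
    corners with H and intersect the result with the previous key frame's
    rectangle; clipping + shoelace stand in for the enumeration in the claim."""
    corners = np.array([[0, width, width, 0],
                        [0, 0, height, height],
                        [1, 1, 1, 1]], dtype=np.float64)
    vertex = H @ corners                       # formula (11)
    position = (vertex[:2] / vertex[2]).T      # formula (12)
    poly = [tuple(p) for p in position]
    rect = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    for a, b in zip(rect, rect[1:] + rect[:1]):
        poly = _clip_edge(poly, a, b)
        if not poly:
            return 0.0
    xs, ys = zip(*poly)
    n_pts = len(poly)
    return 0.5 * abs(sum(xs[i] * ys[(i + 1) % n_pts] - xs[(i + 1) % n_pts] * ys[i]
                         for i in range(n_pts)))

def _clip_edge(poly, a, b):
    """Sutherland-Hodgman step: keep the part of `poly` inside the half-plane to
    the left of the directed edge a -> b (the rectangle above is listed
    counter-clockwise in standard orientation)."""
    def inside(p):
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0
    def intersection(p, q):
        (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p, q, a, b
        den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
        t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / den
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    out = []
    for i, p in enumerate(poly):
        q = poly[(i + 1) % len(poly)]
        if inside(p):
            out.append(p)
            if not inside(q):
                out.append(intersection(p, q))
        elif inside(q):
            out.append(intersection(p, q))
    return out
```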
7. The pose information-based underwater video image restoration and splicing method according to claim 1, wherein the step S5 is as follows:
traversing all key frames, calculating, for each key frame, the accumulated absolute change from its pose information to the pose information of every other key frame, and taking the key frame with the minimum accumulated change as the globally optimal key frame, which serves as the global reference image; then recursively finding the locally optimal key frames on the two sides of the globally optimal key frame and taking them as local reference images; and dividing all key frames into a plurality of sub-modules according to a binary tree structure, with the globally optimal key frame as the root node and the locally optimal key frames as child nodes.
8. The method for restoring and splicing the underwater video images based on the pose information according to claim 7, wherein the formula for calculating the absolute value change size from the pose information of any one key frame to the pose information of other key frames is as follows:
the accumulated change for key frame iτ is rendered as an image in the original and accumulates, over all other key frames jτ, the weighted absolute pose differences, i.e. it is of the form

change(iτ) = Σ(jτ = 1 … Nτ) [ α·( |φiτ − φjτ| + |θiτ − θjτ| ) + β·|hiτ − hjτ| ]

where iτ ∈ [1, Nτ]; φiτ, θiτ and hiτ respectively represent the roll angle, pitch angle and height above the water bottom of the iτ-th key frame; φjτ, θjτ and hjτ respectively represent those of the jτ-th key frame; α and β respectively represent the weights given to the angle change and to the influence of the height change; and Nτ represents the total number of key frames (a code sketch follows this claim).
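A sketch of the reference-frame selection in claims 7 and 8. Because the exact weighting in the patent's formula is an image in the source, the code assumes the accumulated change for key frame iτ is the sum, over all other key frames, of α times the absolute roll and pitch differences plus β times the absolute height difference; the pose dictionaries and the recursive split are illustrative.

```python
import numpy as np

def accumulated_pose_change(poses, alpha=1.0, beta=1.0):
    """For every key frame, sum the weighted absolute differences in roll,
    pitch (weight alpha) and height above the water bottom (weight beta)
    against all other key frames."""
    roll = np.array([p["roll"] for p in poses])
    pitch = np.array([p["pitch"] for p in poses])
    height = np.array([p["height"] for p in poses])
    change = np.empty(len(poses))
    for i in range(len(poses)):
        change[i] = np.sum(alpha * (np.abs(roll - roll[i]) + np.abs(pitch - pitch[i]))
                           + beta * np.abs(height - height[i]))
    return change

def build_submodules(poses, lo=0, hi=None):
    """Claim 7: the key frame with the smallest accumulated change in a range is
    that range's (global or local) reference image; the ranges on its two sides
    are split recursively, giving the binary-tree sub-module structure."""
    if hi is None:
        hi = len(poses)
    if hi <= lo:
        return None
    best = lo + int(np.argmin(accumulated_pose_change(poses[lo:hi])))
    return {"reference": best,
            "left": build_submodules(poses, lo, best),
            "right": build_submodules(poses, best + 1, hi)}
```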
9. The pose information-based underwater video image restoration and splicing method according to claim 7, wherein the step S6 is as follows: firstly splicing the sub-modules at the bottommost layer of the binary tree and then splicing upwards layer by layer to obtain the final panoramic image; within each sub-module, the locally optimal key frame is used as the reference image to complete the splicing and fusion of every pair of adjacent images;
the process of splicing two adjacent images is as follows (see the sketch after this claim): calculating a rough overlapping area of the two adjacent images using the pose information, extracting feature points with the SIFT algorithm within the overlapping area, performing coarse matching of the feature points with the BBF algorithm, performing fine matching of the feature points with the PROSAC algorithm, calculating the homography matrix between the two adjacent images with the DLT algorithm, taking one image as the reference image and the other as the target image, and projecting the target image onto the reference image using the homography matrix to complete the splicing of the two adjacent images;
the process of fusing two adjacent images is as follows: obtaining the optimal suture line in the overlapping area of the two images to be spliced using a dynamic-programming-based optimal suture line algorithm; then taking the whole overlapping area in the reference image as the background image, taking the part of the overlapping area in the target image that lies on the target-image side of the suture line as the region to be cloned, and performing Poisson fusion on it to complete the final image fusion.
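A sketch of claim 9's pairwise splicing and fusion using OpenCV, as referenced above. FLANN's KD-tree search plays the role of the BBF coarse matching, and cv2.RHO (a PROSAC-based estimator inside cv2.findHomography) stands in for the PROSAC/DLT steps; the dynamic-programming seam search is not reproduced, so `clone_mask` is assumed to be supplied by it.

```python
import cv2
import numpy as np

def stitch_pair(ref_bgr, tgt_bgr):
    """Pairwise splicing (sketch): SIFT features, KD-tree (best-bin-first style)
    coarse matching with Lowe's ratio test, then a PROSAC-based robust
    homography fit, and projection of the target onto the reference canvas."""
    sift = cv2.SIFT_create()
    kp_r, des_r = sift.detectAndCompute(cv2.cvtColor(ref_bgr, cv2.COLOR_BGR2GRAY), None)
    kp_t, des_t = sift.detectAndCompute(cv2.cvtColor(tgt_bgr, cv2.COLOR_BGR2GRAY), None)

    matcher = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
    good = []
    for pair in matcher.knnMatch(des_t, des_r, k=2):
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good.append(pair[0])

    src = np.float32([kp_t[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_r[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RHO, 3.0)

    h, w = ref_bgr.shape[:2]
    warped = cv2.warpPerspective(tgt_bgr, H, (w, h))   # target projected onto the reference
    return warped, H

def poisson_fuse(ref_bgr, warped_tgt, clone_mask):
    """Fusion (sketch): Poisson-blend the region of the warped target lying on the
    target side of the seam into the reference image.  `clone_mask` (uint8, 255
    inside the region to clone) is assumed to come from the seam search."""
    ys, xs = np.nonzero(clone_mask)
    center = (int(xs.mean()), int(ys.mean()))
    return cv2.seamlessClone(warped_tgt, ref_bgr, clone_mask, center, cv2.NORMAL_CLONE)
```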