CN114022562A

CN114022562A - Panoramic video stitching method and device capable of keeping integrity of pedestrians

Info

Publication number: CN114022562A
Application number: CN202111238422.9A
Authority: CN
Inventors: 张�林; 郭超政; 朱安琪; 沈莹; 赵生捷
Original assignee: Tongji University
Current assignee: Tongji University
Priority date: 2021-10-25
Filing date: 2021-10-25
Publication date: 2022-02-08

Abstract

The invention relates to a panoramic video stitching method and a panoramic video stitching device for keeping integrity of pedestrians, wherein the method comprises the following steps: collecting a plurality of paths of videos by adopting a structured panoramic camera array; jointly calibrating the pose of the camera based on a light beam adjustment method; geometrically aligning video images collected by different cameras, and mapping the video images onto a uniform cylindrical standard curved surface; performing luminosity alignment on images shot by different cameras by using a two-step least square method to eliminate brightness difference; locating a pedestrian target in the image based on semantic segmentation to obtain a semantic mask; based on the semantic mask, a suture line cost function of the video image is constructed, the optimal suture line is solved by using dynamic programming, and the overlapped parts of the aligned images are fused. Compared with the prior art, the panoramic stitching result obtained by the method has better visual consistency, the practicability of the system in the monitoring field is greatly enhanced by considering semantic information, and meanwhile, the method has good calculation efficiency and can realize real-time processing of video images.

Description

Panoramic video stitching method and device capable of keeping integrity of pedestrians

Technical Field

The invention relates to the technical field of video splicing, in particular to a panoramic video splicing method and device capable of keeping integrity of pedestrians.

Background

The panoramic mosaic system is an indispensable module in monitoring or space exploration, and a horizontal view covering all surrounding visual angles can be obtained by using the structured camera array and the panoramic mosaic system, so that a viewer can know the surrounding environment immediately. With the rise of the fields of video conferences, remote education, robot navigation and the like, a single camera cannot record all targets under a large scene due to a small visual field, the expressed scene area information is limited, a high-definition wide-angle camera cannot be popularized due to high price, the requirement of each field on the large visual field is met due to the appearance of a panoramic splicing technology, and the panoramic splicing technology has important practical value. At present, the panoramic image splicing technology is widely applied to the fields of security monitoring, military operation, virtual reality technology, remote sensing image processing, automobile driving assistance and the like.

In order to provide a natural panoramic view for a viewer, the panoramic stitching system needs to align and stitch images with different viewing angles, and simultaneously needs to realize smooth transition of a plurality of images at seams, so that the viewer cannot perceive stitching traces among the images. In addition, the method is different from common image panoramic stitching, and the video stitching has higher requirements on algorithm real-time performance and robustness.

The panoramic video image is formed by splicing a group of pictures which are obtained by shooting the camera around the camera, and more comprehensive environmental information around the camera can be obtained. Due to the rotation motion of the camera, the acquired image is a two-dimensional projection of an entity scene under different camera coordinate systems, severe image distortion can be generated by directly splicing the image, and visual consistency cannot be met, so that the image to be spliced needs to be mapped to a standard plane for image splicing. The panoramic image can be divided into a spherical panoramic image, a cubic panoramic image and a cylindrical panoramic image according to different mapping plane forms of the panoramic image. The cylindrical surface panoramic model has a 360-degree horizontal visual angle, the panoramic image using the model has uniform quality and high detail and reality, and can be directly subjected to subsequent processing by using a traditional image processing algorithm, so that the cylindrical surface panoramic model is widely applied.

The panoramic stitching mainly comprises two steps. The first step is to map a plurality of images to be spliced to a uniform coordinate system for alignment, so as to facilitate subsequent image splicing, wherein the step needs to estimate the transformation relationship of the plurality of images, and usually a method based on feature point matching or a method based on camera external reference calibration is adopted, and then the images are projected to the uniform coordinate system for alignment according to the transformation relationship. And the second step is to fuse the images after the projection alignment, realize the natural transition between the adjacent images, and finally obtain the panoramic image satisfying the visual system of the eyes. Due to the existence of parallax, foreground objects in adjacent images often cannot be completely aligned, and a great research space is still provided for how to intelligently process a dynamic foreground to ensure the consistency of the dynamic foreground in a panorama. In addition, a panoramic video stitching system in practical application also relates to the problems of luminosity alignment of a plurality of camera images, instantaneity of a video stitching algorithm, poor video quality under dark light and the like, and the existing panoramic stitching system is difficult to give consideration to.

Disclosure of Invention

The invention aims to provide a panoramic video stitching method and a panoramic video stitching device for keeping integrity of pedestrians, aiming at overcoming the defects of luminosity difference, image splitting and double images in the prior art.

The purpose of the invention can be realized by the following technical scheme:

a panoramic video stitching method for keeping integrity of pedestrians comprises the following steps:

s1: adopting a plurality of cameras to construct a structured panoramic camera array and collect multiple paths of videos;

s2: placing a calibration plate in the common visual area of the adjacent cameras to form a feature point matching pair, constructing the relative poses of the adjacent cameras and the pose of the forward looking camera in a loop, and further optimizing the poses by adopting a light beam adjustment method;

s3: based on the result of pose optimization, geometrically aligning video images acquired by different cameras, and mapping pixel coordinates onto a unified cylindrical standard curved surface;

s4: after geometric alignment, constructing a luminosity alignment equation of the video images of each camera and a mean value and variance alignment equation of the overlapping area of adjacent images, and solving by adopting a two-step least square method to eliminate the brightness difference between the adjacent images;

s5: positioning a pedestrian target in each camera video image based on semantic segmentation to obtain a semantic mask;

s6: constructing a suture line cost function based on the camera video images and the semantic masks after luminosity alignment, solving an optimal suture line by using dynamic programming, and fusing the overlapped parts of the camera video images after luminosity alignment.

Further, the structured panoramic camera array comprises four cameras, the four cameras face to the front, the rear, the left and the right directions respectively, the cameras are fixed through a support, and the horizontal viewing angle of each camera is within the range of 100 degrees and 200 degrees.

Further, the constructed relative poses of the neighboring cameras include a relative pose T between the forward-looking and left-looking cameras_LFLeft view and back view cameras_BLRelative pose T between rear view and right view cameras_RBRelative pose T between right-view and forward-view cameras_FRAnd pose T of forward looking camera_FF；

The loop-back forward-looking camera pose T'_FFThe calculation expression of (a) is:

T′_FF＝T_FR·T_RB·T_BL·T_LF·T_FF。

further, the pose optimization by using the light beam adjustment method specifically comprises:

constructing a pose loss function by taking the relative pose of adjacent cameras, the pose of the loop forward-looking camera and the feature point matching pairs as variables to be optimized, and solving based on graph optimization to obtain the relative pose between the adjacent cameras after optimization;

the calculation expression of the pose loss function is as follows:

in the formula u_ijIs the pixel coordinate of the jth feature point that can be observed by the ith camera, s_ijIs the depth of the feature point, K_iIs an internal reference of the i-th camera,

is the lie algebra form of the ith camera pose, P_ijThe loss function implies the conversion of homogeneous coordinates to non-homogeneous coordinates.

Further, step S3 specifically includes the following steps:

s301: taking the central positions of four cameras in the panoramic camera array as central coordinates P_centerAnd obtaining the center coordinate by the coordinate mean value of the four cameras under the coordinate system of the front-view camera:

s302: calculating a transformation matrix T of the forward-looking camera with respect to the central coordinates_FWComprises the following steps:

s303: based on the relative pose between adjacent cameras, and a transformation matrix T of the forward looking camera with respect to the center coordinates_FWAnd obtaining pose transformation of each camera relative to the center coordinate:

T_LW＝T_LFT_FW

T_BW＝T_BLT_LFT_FW

T_RW＝T_RBT_BLT_LFT_FW

in the formula, T_LWFor a transformation matrix of the left-view camera with respect to the central coordinates, T_BWFor transformation matrix of rear view camera with respect to central coordinate, T_RWA transformation matrix for the right-view camera with respect to the center coordinates;

s304: with P_centerEstablishing a cylindrical projection curved surface with the radius r for the coordinate center, and defining a mapped z-axis scale factor h_scaleAnd mapping the pixel coordinates onto a uniform cylindrical standard curved surface, wherein the computational expression of the mapping process is as follows:

in the formula, x_wIs the x coordinate, y of the mapped pixel point_wFor mapped y coordinates, z, of pixel points_WAnd (c) the z coordinate of the mapped pixel point, h is the height of the panoramic spliced image, w is the width of the panoramic spliced image, and (u, v) are the original pixel coordinates.

Further, the construction process of the photometric alignment equation of the video image of each camera includes:

aiming at the video images of the cameras, constructing a luminosity alignment equation of the video images of the cameras according to a luminosity adjustment model containing a deviation term, wherein the expression of the luminosity adjustment model is as follows:

I′＝g·I+b

in the formula, I' is an image after luminosity adjustment, I is an image before luminosity adjustment, g is a gain, and b is a deviation term;

the expression of the mean and variance alignment equation of the overlapping areas of the adjacent images is as follows:

g_i·I_ij+b_i＝g_j·I_ji+b_j

g_i·σ_ij＝g_j·σ_ji

ij∈{FR，RB，BL，LF}

in the formula, σ_ijIs the standard deviation, σ, of the pixel intensity values of image i in the region where image i and image j overlap_jiThe standard deviation of the pixel intensity value of the image j in the overlapping area of the image i and the image j;

the solving by adopting the two-step least square method is specifically as follows:

firstly, the least square method is adopted to carry out the four equations g_i·σ_ij＝g_j·σ_jiSolving the formed equation set, and normalizing to obtain an approximate solution of the gain g of each image;

substituting the adjusted gain into equation g_i·I_ij+b_i＝g_j·I_ji+b_jObtaining a deviation term equation:

carrying out SVD on the deviation term equation to obtain an approximate solution of the deviation term b adjusted by each image;

and adjusting the images according to the obtained gain g and the deviation term b of each image to obtain an image combination with consistent brightness of the overlapped area.

Further, in step S5, the neural network is used to segment the video images of the cameras to obtain a pedestrian target, and a semantic mask is obtained, in which the two pixel values are used to distinguish the human body from the background.

Further, in step S6, the expression of the suture cost function is:

in the formula, M (i, j) is a calculation result of the suture line cost function, Sem (i, j) represents a semantic cost of selecting the pixel (i, j) as a demarcation point, Spa (i, j, k) is a suture line path cost function starting from the previous line of pixels (i-1, k) to the current pixel (i, j), and λ is a parameter for adjusting the semantic cost and the path cost weight.

Furthermore, the value range of the parameter k is j-6 < k < j + 6;

in the dynamic programming solving process, before updating the suture line each time, the pixel difference between the adjacent image frames is calculated, if the pixel difference is smaller than a preset difference threshold value, the suture line is reserved, otherwise, a new suture line is calculated.

The invention also provides a panoramic video stitching device capable of maintaining the integrity of pedestrians, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor calls the computer program to execute the steps of the method.

Compared with the prior art, the invention has the following advantages:

(1) the image luminosity alignment and semantic detection achieve a natural panoramic video splicing effect, the practicability of the system in the monitoring field is greatly enhanced by considering semantic information, the system has higher vision consistency and luminosity alignment accuracy, meanwhile, the method is higher in efficiency, and real-time video splicing can be achieved.

(2) In the camera pose calibration process of the panoramic camera array around view, the pose is optimized by adopting a light beam adjustment method, and the following are found in experiments: the reprojection errors of the cameras at all the visual angles are reduced, and the average reprojection errors of all the cameras are reduced from 0.1881 to 0.1825, so that the effectiveness of the joint optimization of the camera pose in the invention is verified.

(3) In the process of aligning the luminosity of each video image, the invention takes the mean value and the variance alignment equation of the overlapping area of the adjacent images into account, and adopts a two-step least square method to solve, and the results are found in the experiment: the luminosity alignment model used by the method has smooth overall brightness conversion, no obvious suture line and better elimination of the obvious bright and dark boundary of the sky area, so that the image is fused more naturally near the suture line.

(4) In the process of obtaining the optimal suture line, the semantic mask of the pedestrian target is adopted, the suture line cost function which simultaneously takes semantic cost and path cost into consideration is constructed, and experiments show that the optimal suture line algorithm can ensure the integrity of the body of the pedestrian to the maximum extent so as to obtain a splicing result with better visual effect; the method can ensure the lowest proportion of the broken frames in most scenes, and the generated result is most in line with the observation habit of human vision; and the invention can splice panoramic video at the speed of 12-26 frames/second by selectively updating the suture line and based on the acceleration of zooming, the splicing speed depends on the number of objects in the shooting scene and the motion frequency thereof, and the efficiency is higher.

Drawings

Fig. 1 is a schematic main flow chart of a panoramic video stitching method for maintaining pedestrian integrity according to an embodiment of the present invention;

FIG. 2 is a schematic view of a panoramic camera system according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a panoramic camera system and cylindrical coordinate mapping according to an embodiment of the present invention;

fig. 4 is a schematic diagram of estimating an initial pose of a panoramic camera system according to an embodiment of the present invention;

FIG. 5 is a schematic diagram illustrating a comparison between the effects of a photometric alignment algorithm provided in an embodiment of the present invention;

FIG. 6 is a comparison of the effect of an optimal suture algorithm provided in the embodiments of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

Example 1

The embodiment provides a panoramic video stitching method for keeping integrity of pedestrians, which aims to solve the problems of light difference, image splitting, ghosting and the like in a traditional panoramic video stitching system and provide a high-quality and high-vision-consistency panoramic stitched video, and the method comprises the following steps:

The steps are described in detail below.

One, use the panoramic camera array system of the structural all around to gather the video of four routes

The structured panoramic camera array designed by the method is used for collecting videos, and the schematic diagram of a camera system is shown in figure 2. The camera system is composed of four fisheye cameras, the four cameras shoot in four directions of front, back, left and right respectively, the position of each camera is fixed through a support structure, and the pose of each adjacent camera can be conveniently obtained through camera external parameter calibration.

There are two ways of using the surround view camera system: the floor type support structure is suitable for outdoor scenes and the like, and the surrounding environment can be shot by placing the surround camera system on the ground; the desktop type structure is suitable for indoor scenes and the like, is formed by taking down and modifying the top camera device of the floor type support, and can shoot by placing the ring-view camera system on platforms such as a desktop. Two different modes of use greatly facilitate data acquisition in different scenarios.

The chessboard calibration plate can be used for calibrating the internal reference of the fisheye camera, and the method comprises the following specific operations: the fisheye camera to be calibrated is used for shooting the picture of the checkerboard calibration plate, so that the checkerboard can cover all areas of the fisheye image at various angles, and the accuracy of internal reference estimation can be guaranteed. After shooting about dozens of fisheye checkerboard images by each camera, estimating an internal reference matrix and distortion parameters of the camera by using an internal reference calibration function of the fisheye camera for subsequent external reference calibration.

Secondly, jointly calibrating poses of a plurality of cameras based on a light beam adjustment method

S201: placing a calibration board in a common-view area of adjacent cameras, imaging feature points on the calibration board on the two cameras to form a group of feature Point matching pairs, and solving the relative poses of the adjacent cameras by utilizing PnP (Passive-n-Point), wherein the relative poses of a front view camera, a left view camera, a rear view camera, a right view camera and a front view camera are respectively marked as T_LF，T_BL，T_RB.T_FRThe pose of the forward-looking camera is marked as T_FF；

S202: according to the loop structures of four cameras in the loop-around camera system, the relative pose transformation relation of adjacent cameras is utilized to obtain the pose T 'of the loop-around forward-looking camera in an estimated state'_FF：

T′_FF＝T_FR·T_RB·T_BL·T_LF·T_FF。

S203: with five camera poses: the forward looking camera pose, the left looking camera pose, the rear looking camera pose, the right looking camera pose, the loop forward looking camera pose and three-dimensional point coordinates of all checkerboard feature points are used as variables to be optimized, and the loop forward looking pose T 'is minimized through a beam Adjustment method'_FFAnd the original forward-looking pose T_FFThe corresponding loss function is:

S204: for the above nonlinear optimization problem, g2o is used to solve based on graph optimization, resulting in the relative poses between the four cameras in the camera system.

Thirdly, geometrically aligning images shot by different cameras, and mapping pixels to a uniform cylindrical standard curved surface

s303: based on relative pose T between adjacent cameras_LF，T_BL，T_RB.T_FRAnd a transformation matrix T of the forward-looking camera with respect to the central coordinates_FWObtaining the coordinates P of the front, the back, the left and the right cameras relative to the center_centerThe pose of (c) is transformed into:

T_LW＝T_LFT_FW

T_BW＝T_BLT_LFT_FW

T_RW＝T_RBT_BLT_LFT_FW

Fourthly, performing luminosity alignment on images shot by different cameras by using a two-step least square method to eliminate brightness difference between adjacent images

A photometric adjustment model containing a bias term is established for the input images of the four cameras. Based on an image adjustment model based on gain g, introducing a deviation term b, wherein the corresponding mathematical expression form is as follows:

I′＝g·I+b

and constructing a luminosity alignment equation of the four images based on the luminosity adjustment model, and solving an adjustment parameter which enables the luminosity consistency to be optimal. Variances of overlapping portions of adjacent images are calculated and aligned according to the adjustment model, and variance alignment is used as an additional constraint for photometric adjustment. m is_ijRepresenting the mean intensity value, σ, of the pixels of image i in the region where image i and image j overlap_ijRepresenting the standard deviation of the pixel intensity values of image i in that region. m is_jiAnd σ_jiRepresenting the mean intensity value and standard deviation of the image j in that region. The mean and variance alignment equations for the overlapping regions of adjacent images are:

g_i·I_ij+b_i＝g_j·I_ji+b_j

g_i·σ_ij＝g_j·σ_ji

ij∈{FR，RB，BL，LF}

the above photometric alignment equation is solved using a two-step least squares. The above equation set contains 8 equations in total, and firstly, four equations only contain unknown g_iThe system of equations is solved, a non-zero approximate solution is solved by using a least square method, and normalization is carried out by dividing the solution by a mean value to obtain the adjustment gain of each image.

Substituting the solution result of the previous step into an equation set, and arranging to obtain:

the above equation does not have an exact solution since the left-hand coefficients are not full rank matrices. Through SVD decomposition, the adjusted deviation term b of each image can be obtained_iThe approximate solution of (c).

And correspondingly adjusting the four projection images after solving the image intensity adjustment model parameters to obtain the projection result with consistent brightness and color of the overlapped area.

Fifthly, obtaining a semantic mask based on the pedestrian target in the semantic segmentation positioning image

And performing example segmentation on the input image by using Mask R-CNN, and obtaining a semantic segmentation Mask Sem (i, j) by using human body classification in a segmentation result, namely that the value of the pixel (i, j) belonging to the human body is 1, otherwise, the value is 0. And (5) solving the optimal suture line in the step (six) by using the semantic mask to realize the visual consistency of image splicing.

Sixthly, combining the semantic mask of the step (five), using a dynamic programming to minimize a suture cost function, searching an optimal suture, and fusing the overlapped parts of the aligned images

S601: defining a suture line path cost function starting from the pixel (i-1, k) in the previous row to the current pixel (i, j), representing the difference of pixel values of two images of pixel positions at two sides where the suture line passes through, and the mathematical expression is as follows:

preferably, j-6 is more than k and less than j +6 is taken in the expression, so that the search efficiency is ensured;

s602: defining a suture cost function:

Preferably, before updating the stitching line, pixel differences between adjacent image frames are calculated, if the differences are smaller than a certain threshold, the current stitching line is retained, otherwise, a new stitching line is calculated;

s603: defining a state transfer function according to the suture cost function, solving an optimal suture by using dynamic programming, and minimizing the suture cost function;

s604: and fusing the overlapping areas of the images by using the optimal suture line to obtain a spliced panoramic image.

Seventhly, the beneficial effects of the invention are explained by combining with specific experiments:

experimental conditions and scoring criteria:

the data set used in the experiment contains pedestrian videos in 5 scenes, including two indoor scenes and three outdoor scenes, the indoor scenes include the condition of walking of multiple persons, and at most, four persons exist simultaneously. In outdoor scenes, up to ten pedestrians are present in the video at the same time. The samples of each scene contain 4 directions of fisheye video, 200 frames per direction of video, 4000(5 × 4 × 200) images in total, and the image resolution is 1920 × 1080.

In a calibration experiment of a panoramic camera system, a re-projection error is used for evaluating the calibration performance of a structured camera array, wherein the re-projection error is a difference value between an observed pixel position and a projected two-dimensional position of a corresponding three-dimensional point; in the luminosity alignment experiment, the luminosity alignment performance is evaluated through the intensity difference between the overlapped areas of the adjacent images, and if the brightness level and the color of the adjacent images are well adjusted and unified, the intensity difference of the overlapped areas is as small as possible; in the optimal suture line algorithm experiment, the image splicing effect is evaluated by counting the frames with the fracture phenomenon in the panoramic splicing result, 4 volunteers are invited in the experiment, and the fracture frames of the splicing result of each method are counted. In order to avoid the subjective preference of volunteers, all the frames of the splicing result are disordered, and the performance of the method is better when the proportion of the broken frames in the video is lower after all the frames are counted.

Experiment 1

Performing a calibration algorithm experiment on a panoramic camera system; table 1 shows the experimental results of joint pose optimization, including the re-projection errors of the panoramic camera system before and after joint optimization. It can be seen from the table that the reprojection errors of the cameras from all the viewing angles are reduced, and the average reprojection error of all the cameras is reduced from 0.1881 to 0.1825, which verifies the effectiveness of the joint optimization in the method.

TABLE 1 calibrating reprojection error for panoramic camera system

Experiment 2

Performing a luminosity alignment algorithm experiment; in order to prove the effectiveness of the robust luminosity adjustment model in the method, the experiment compares the result of the method with a reference model: the document "Automatic laboratory image filing using innovative defects" (M.Brown and D.G.Lowe, International Journal of Computer Vision, vol.74, No.1, pp.59-73,2007.). And a scheme based on histogram matching in the document of "microscopic video recording of dual camera based on spatial-temporal section optimization" (Q.Liu, X.Su, L.Zhang, and H.Huang, Multimedia Tools and Applications, vol.79, No.5, pp.3107-3124,2020) is expanded to four images, and the effect is compared with the method.

Fig. 5 shows the results of photometric alignment by different methods, (a) as a result of non-photometric alignment, it can be observed that the suture line is very distinct, and the overall brightness distribution is not uniform, unlike the image photographed under natural conditions. (b) As a result of the Brown et al method, although the difference in brightness between adjacent images is reduced, the existence of the stitch line is still perceived in the sky area. (c) As a result of the method of Liu et al, the histogram matching based method directly results in severe distortion of the ground area in the image and there are also sharp bright and dark boundaries on the columns on the right side of the image. (d) The luminosity alignment model used by the method has smooth overall brightness conversion, no obvious suture line and better elimination of the obvious bright and dark boundary of the sky area, so that the image is fused more naturally near the suture line.

Experiment 3

An optimal suture algorithm experiment; the experiment was compared to the four most advanced optimal suture algorithms, GraphCut, DP, success and Iterative, and the results of autostart without suture algorithm were used as a control. In order to ensure the fairness of the experiment, the experiment firstly adopts the same cylinder projection and luminosity alignment preprocessing flow, and then adopts different optimal suture line algorithms in the image fusion stage.

Figure 6 shows the qualitative comparison results of the different stitch line algorithms on the two sets of test images. In the fusion results of autostart without the use of the suture algorithm, a significant ghosting phenomenon was observed. The results of GraphCut and Perception showed dislocation and loss of the pedestrian's body on both sets of test images. DP and Iterative also have the phenomenon of stitching misplacement on a set of test images. These facial and body splice misalignments and deletions are visually apparent to and unacceptable for human perception. In contrast, the optimal suture line algorithm for the pedestrian, which is provided by the method, can ensure the integrity of the body of the pedestrian to the greatest extent so as to obtain a splicing result with a better visual effect.

Table 2 shows the results of the quantitative experiments with the optimal suture algorithm, and lists the percentage of broken frames of pedestrians and objects in the stitching results of each group of videos by each method. It can be seen that the method can ensure the lowest ratio of the broken frames in most scenes, and the generated result is most consistent with the observation habit of human vision. In contrast, considering only the stitching result of DP of pixel cost, the test result of the indoor video-1 in which the proportion of broken frames is very high, especially very many pedestrians, is not considered because semantic information is not considered. The broken frame ratio of the method is the lowest in the average performance of all videos, which shows that the performance of the method is superior to that of the comparison method and the method has obvious advantages in the treatment of pedestrians.

TABLE 2 optimal suture algorithm quantitative experimental results

Experiment 4

An optimal suture algorithm time performance experiment; the average stitching time per frame using the different optimal stitch line algorithms is reported in table 3, where the resolution 1 is 500 × 1200, and the resolutions 0.5 × and 0.25 × are scaled by the length and width.

TABLE 3 optimal suture algorithm time Performance

Experimental results show that the time cost of variant performance and Iterative of GC is extremely high due to a series of additional time-consuming steps such as image saliency prediction and multi-round Iterative optimization. The method can splice the panoramic video at the speed of 12-26 frames/second by selectively updating the suture line and based on the acceleration of scaling, and the splicing speed depends on the number of objects in the shot scene and the motion frequency of the objects.

The embodiment also provides a panoramic video stitching device for maintaining the integrity of pedestrians, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor calls the computer program to execute the steps of the panoramic video stitching method for maintaining the integrity of pedestrians.

The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims

1. A panoramic video stitching method for keeping integrity of pedestrians is characterized by comprising the following steps:

2. The method as claimed in claim 1, wherein the panoramic video stitching method for maintaining the integrity of pedestrians is characterized in that the structured panoramic camera array comprises four cameras, the four cameras face to the front, the back, the left and the right respectively, each camera is fixed by a bracket, and the horizontal viewing angle of each camera is within the range of 100 degrees and 200 degrees.

3. The method of claim 2, wherein the constructed relative poses of the adjacent cameras comprise a relative pose T between a forward looking camera and a left looking camera_LFLeft view and back view cameras_BLRelative pose T between rear view and right view cameras_RBRelative pose T between right-view and forward-view cameras_FRAnd pose T of forward looking camera_FF；

T′_FF＝T_FR·T_RB·T_BL·T_LF·T_FF。

4. the panoramic video stitching method for maintaining the integrity of pedestrians according to claim 3, wherein the pose optimization by using the beam adjustment method specifically comprises:

the calculation expression of the pose loss function is as follows:

5. The panoramic video stitching method for maintaining the integrity of pedestrians according to claim 3, wherein the step S3 specifically comprises the following steps:

T_LW＝T_LFT_FW

T_BW＝T_BLT_LFT_FW

T_RW＝T_RBT_BLT_LFT_FW

6. The panoramic video stitching method for maintaining the integrity of pedestrians according to claim 3, wherein the construction process of the photometric alignment equation of the video images of each camera comprises:

I′＝g·I+b

g_i·I_ij+b_i＝g_j·I_ji+b_j

g_i·σ_ij＝g_j·σ_ji

ij∈{FR，RB，BL，LF}

7. The panoramic video stitching method for maintaining the integrity of the pedestrians as claimed in claim 1, wherein in step S5, the neural network is used to segment the video images of the cameras to obtain the pedestrian target, so as to obtain the semantic mask, and in the semantic mask, two pixel values are used to distinguish the human body and the background respectively.

8. The panoramic video stitching method for maintaining the integrity of pedestrians according to claim 1, wherein in step S6, the expression of the stitching line cost function is:

9. The method for stitching the panoramic video with the integrity of the pedestrian according to claim 8, wherein the value range of the parameter k is j-6 < k < j + 6;

10. A panoramic video stitching apparatus for maintaining pedestrian integrity, comprising a memory and a processor, wherein the memory stores a computer program, and the processor calls the computer program to execute the steps of the method according to any one of claims 1 to 9.