CN112017222A - Video panorama stitching and three-dimensional fusion method and device - Google Patents


Info

Publication number
CN112017222A
Authority
CN
China
Prior art keywords
real
time video
image
video pictures
panoramic
Prior art date
Legal status
Pending
Application number
CN202010933528.XA
Other languages
Chinese (zh)
Inventor
白刚
彭靖轩
籍盖辉
李晓波
Current Assignee
Innovisgroup Co ltd
Original Assignee
Innovisgroup Co ltd
Priority date
Filing date
Publication date
Application filed by Innovisgroup Co ltd
Priority to CN202010933528.XA
Publication of CN112017222A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/18 Image warping, e.g. rearranging pixels individually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00 Details of colour television systems
    • H04N9/64 Circuits for processing colour signals
    • H04N9/646 Circuits for processing colour signals for image enhancement, e.g. vertical detail restoration, cross-colour elimination, contour correction, chrominance trapping filters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Processing Or Creating Images (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the application provides a video panorama stitching and three-dimensional fusion method and device. The method comprises the following steps: acquiring real-time video pictures from a plurality of adjacent cameras with overlapping areas, and performing image preprocessing on the real-time video pictures; performing feature extraction and feature matching on the preprocessed real-time video pictures, determining the matching feature points of two adjacent real-time video pictures, and determining the transformation relation between the two adjacent real-time video pictures from the matching feature points; performing color difference optimization on the two adjacent real-time video pictures, and warping the optimized real-time video pictures according to the transformation relation to obtain the corresponding panoramic image; and rendering the panoramic image in real time onto the corresponding position of a preset three-dimensional model for display through a three-dimensional fusion technique. The method and device can effectively solve the problem of discontinuous video pictures and improve video readability.

Description

Video panorama stitching and three-dimensional fusion method and device
Technical Field
The application relates to the field of video processing, in particular to a method and a device for video panorama stitching and three-dimensional fusion.
Background
As people's safety requirements for living and working environments continue to rise, the security system of a building becomes increasingly important. Video surveillance, being intuitive, convenient, and rich in information, has received more and more attention and has become an important component of security systems.
The shortcomings of conventional monitoring at important entrances or areas therefore urgently need to be addressed. In recent years, with the development of computer, network, image processing, computer graphics, and transmission technologies, video surveillance technology has also developed rapidly. At present, many monitoring systems still display each camera separately on a screen in a 9-grid (or denser) layout of small videos. Although the cameras cover a wide range, the pictures of the individual cameras are disconnected, details are easily lost, there are many monitoring blind spots, and geographic positions are unclear, which gives criminals an opportunity to exploit.
Disclosure of Invention
Aiming at the problems in the prior art, the application provides a video panorama stitching and three-dimensional fusion method and device, which effectively solve the problem of discontinuous video pictures and improve video readability, so that a user can grasp the whole situation in real time without missing any monitored corner; this has great practical application value.
In order to solve at least one of the above problems, the present application provides the following technical solutions:
in a first aspect, the present application provides a method for video panorama stitching and three-dimensional fusion, including:
acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas, and carrying out image preprocessing on the real-time video pictures;
performing feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determining matching feature points of two adjacent real-time video pictures, and determining the transformation relation of the two adjacent real-time video pictures according to the matching feature points;
performing color difference optimization processing on the two adjacent real-time video pictures, and deforming the real-time video pictures subjected to the color difference optimization processing according to a transformation relation to obtain corresponding panoramic images;
and rendering the panoramic image to a corresponding position of a preset three-dimensional model in real time through a three-dimensional fusion technology for displaying.
Further, the acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas and performing image preprocessing on the real-time video pictures includes:
carrying out average weighting processing on the values of all pixel points of the real-time video picture according to a preset two-dimensional Gaussian filter kernel function;
and carrying out graying processing on the real-time video picture subjected to the average weighting processing to obtain a corresponding grayscale image.
Further, the performing feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determining matching feature points of two adjacent real-time video pictures, and determining a transformation relation between the two adjacent real-time video pictures according to the matching feature points includes:
extracting the features of the real-time video picture to obtain corresponding feature points and feature descriptors;
performing rough matching on each feature point, and determining a Hamming distance between two feature points according to the feature descriptors;
performing fine matching according to the Hamming distance to obtain matching feature points of two adjacent real-time video pictures;
and determining and storing the transformation relation of the two adjacent real-time video pictures according to the matching feature points and a preset random sampling consistency algorithm.
Further, the performing color difference optimization processing on the two adjacent real-time video pictures and deforming the real-time video pictures subjected to the color difference optimization processing according to a transformation relation to obtain a corresponding panoramic image includes:
performing color correction according to color correction parameters between two adjacent real-time video pictures and preset global adjustment parameters;
establishing a panoramic image according to the number of the cameras and the resolution of the video pictures;
and performing overlapping optimization on the overlapping areas of the two adjacent real-time video pictures in the panoramic image to obtain the panoramic image subjected to the overlapping optimization processing.
Further, the rendering the panoramic image to a corresponding position of a preset three-dimensional model in real time through a three-dimensional fusion technology for displaying includes:
determining a three-dimensional model of a target area and a plurality of discrete point pairs of the panoramic image, wherein the discrete point pairs are composed of one three-dimensional model point coordinate and one rasterized coordinate of the panoramic image;
and determining the mapping relation of the panoramic image according to the discrete point pairs, carrying out coordinate interpolation according to the mapping relation, and carrying out panoramic video sampling according to the coordinate interpolation to obtain a three-dimensional fusion image of the panoramic video.
In a second aspect, the present application provides a video panorama stitching and three-dimensional fusion device, including:
the image preprocessing module is used for acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas and preprocessing the real-time video pictures;
the transformation relation determining module is used for performing feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determining matching feature points of two adjacent real-time video pictures and determining the transformation relation of the two adjacent real-time video pictures according to the matching feature points;
the panoramic image generation module is used for carrying out color difference optimization processing on the two adjacent real-time video pictures and deforming the real-time video pictures subjected to the color difference optimization processing according to a transformation relation to obtain corresponding panoramic images;
and the three-dimensional fusion module is used for rendering the panoramic image to a position corresponding to a preset three-dimensional model in real time through a three-dimensional fusion technology for displaying.
Further, the image pre-processing module comprises:
the image noise reduction unit is used for carrying out average weighting processing on the values of all pixel points of the real-time video picture according to a preset two-dimensional Gaussian filter kernel function;
and the image graying unit is used for performing graying processing on the real-time video picture subjected to the average weighting processing to obtain a corresponding grayscale image.
Further, the transformation relation determining module includes:
the feature extraction unit is used for extracting features of the real-time video picture to obtain corresponding feature points and feature descriptors;
the rough matching unit is used for performing rough matching on each feature point and determining the Hamming distance between the two feature points according to the feature descriptors;
the fine matching unit is used for performing fine matching according to the Hamming distance to obtain matching feature points of two adjacent real-time video pictures;
and the transformation relation calculation unit is used for determining and storing the transformation relation of the two adjacent real-time video pictures according to the matching feature points and a preset random sampling consistency algorithm.
Further, the panoramic image generation module includes:
the color correction unit is used for performing color correction according to color correction parameters between two adjacent real-time video pictures and preset global adjustment parameters;
the panoramic image establishing unit is used for establishing a panoramic image according to the number of the cameras and the resolution of the video images;
and the overlapping optimization unit is used for performing overlapping optimization on the overlapping areas of the two adjacent real-time video pictures in the panoramic image to obtain the panoramic image subjected to the overlapping optimization processing.
Further, the three-dimensional fusion module includes:
a discrete point pair determining unit for determining a three-dimensional model of a target area and a plurality of discrete point pairs of the panoramic image, wherein each discrete point pair is composed of one three-dimensional model point coordinate and one rasterized coordinate of the panoramic image;
and the three-dimensional fusion unit is used for determining the mapping relation of the panoramic image according to the discrete point pairs, carrying out coordinate interpolation according to the mapping relation, and carrying out panoramic video sampling according to the coordinate interpolation to obtain a three-dimensional fusion image of the panoramic video.
In a third aspect, the present application provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the video panorama stitching and three-dimensional fusion method when executing the program.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the video panorama stitching and three-dimensional fusion method.
According to the technical scheme, the video panorama stitching and three-dimensional fusion method and device use computer vision and image processing technologies to stitch the pictures of a plurality of adjacent cameras with overlapping areas into one complete picture through panorama stitching, and then use computer graphics to render the stitched picture in real time onto the corresponding position of a three-dimensional model. This combines geographic position with real-time panoramic video, so that security personnel can grasp the monitoring situation of the whole scene at any time and from any place, reducing the opportunities of criminals.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show some embodiments of the present application, and that those skilled in the art can obtain other drawings from them without inventive work.
Fig. 1 is one of the flow diagrams of a video panorama stitching and three-dimensional fusion method in the embodiment of the present application;
fig. 2 is a second schematic flowchart of a video panorama stitching and three-dimensional fusion method in the embodiment of the present application;
fig. 3 is a third schematic flow chart of a video panorama stitching and three-dimensional fusion method in the embodiment of the present application;
fig. 4 is a fourth schematic flowchart of a video panorama stitching and three-dimensional fusion method in the embodiment of the present application;
fig. 5 is a fifth flowchart illustrating a video panorama stitching and three-dimensional fusion method according to an embodiment of the present application;
fig. 6 is one of the structural diagrams of the video panorama stitching and three-dimensional fusion apparatus in the embodiment of the present application;
FIG. 7 is a second block diagram of a video panorama stitching and three-dimensional fusion apparatus according to an embodiment of the present application;
fig. 8 is a third structural diagram of a video panorama stitching and three-dimensional fusion apparatus in the embodiment of the present application;
fig. 9 is a fourth structural diagram of a video panorama stitching and three-dimensional fusion apparatus in an embodiment of the present application;
fig. 10 is a fifth structural diagram of a video panorama stitching and three-dimensional fusion apparatus in an embodiment of the present application;
FIG. 11 is a parameter diagram of a Gaussian filter 3x3 template in an embodiment of the present application;
FIG. 12 is a diagram illustrating feature matching of neighboring video frames of a scene according to an embodiment of the present application;
fig. 13 is a schematic diagram of a two-dimensional image obtained by panoramic stitching of adjacent video frames of a certain scene in an embodiment of the present application;
fig. 14 is one of schematic effects of real-time video panorama stitching and three-dimensional fusion in an embodiment of the present application;
FIG. 15 is a second exemplary illustration of the effect of real-time panoramic stitching and three-dimensional fusion of video in accordance with an embodiment of the present application;
fig. 16 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In view of the development of computer, network, image processing, computer graphics, and transmission technologies, video surveillance technology has also developed rapidly. The present application provides a video panorama stitching and three-dimensional fusion method and device: the pictures of a plurality of adjacent cameras with overlapping areas are stitched into one complete picture through panorama stitching using computer vision and image processing technologies, and the stitched picture is then rendered in real time onto the corresponding position of a three-dimensional model using computer graphics. This combines geographic position with real-time panoramic video, so that security personnel can grasp the monitoring situation of the whole scene at any time and from any place, reducing the opportunities of criminals.
In order to effectively solve the problem of discontinuous video pictures and improve video readability, so that a user can grasp the whole situation in real time without missing any monitored corner, which has great practical application value, the application provides an embodiment of a video panorama stitching and three-dimensional fusion method, which specifically includes the following contents, referring to fig. 1:
step S101: acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas, and carrying out image preprocessing on the real-time video pictures;
step S102: performing feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determining matching feature points of two adjacent real-time video pictures, and determining a transformation relation of the two adjacent real-time video pictures according to the matching feature points;
step S103: performing color difference optimization processing on the two adjacent real-time video pictures, and deforming the real-time video pictures subjected to the color difference optimization processing according to a transformation relation to obtain corresponding panoramic images;
step S104: and rendering the panoramic image to a corresponding position of a preset three-dimensional model in real time through a three-dimensional fusion technology for displaying.
As can be seen from the above description, the video panorama stitching and three-dimensional fusion method provided in the embodiment of the present application stitches the pictures of a plurality of adjacent cameras with overlapping areas into one complete picture through panorama stitching, using computer vision and image processing technologies, and then renders the stitched picture in real time onto the corresponding position of a three-dimensional model using computer graphics. This combines geographic position with real-time panoramic video, so that security personnel can grasp the monitoring situation of the whole scene at any time and from any place, reducing the opportunities of criminals.
In an embodiment of the video panorama stitching and three-dimensional fusion method of the present application, referring to fig. 2, the following may be specifically included:
step S201: carrying out average weighting processing on the values of all pixel points of the real-time video picture according to a preset two-dimensional Gaussian filter kernel function;
step S202: and carrying out graying processing on the real-time video picture subjected to the average weighting processing to obtain a corresponding grayscale image.
Optionally, because of factors such as the manufacture and installation of the front-end video capture equipment and the surrounding environment, the quality of the acquired video images varies, and image quality directly affects the subsequent feature extraction and feature matching. Image preprocessing in advance is therefore a necessary and indispensable step, specifically: image noise reduction and image graying.
The method specifically comprises the following steps: under the condition of sufficient samples, most things in nature are approximately Gaussian distributed, and the images acquired here contain thousands of pixels, each of which is independent, so the Gaussian filtering method is used in the image denoising stage. The kernel function of Gaussian filtering can be one-dimensional, two-dimensional, or even multi-dimensional; the one-dimensional and two-dimensional kernels are briefly introduced below.
The density function of the one-dimensional Gaussian filter is as follows:

G(x) = (1 / (sqrt(2π) · σ)) · exp(-(x - μ)² / (2σ²))   (1)

where μ is the mean and σ is the standard deviation.
The kernel function of the two-dimensional Gaussian filter is as follows:

G(x, y) = (1 / (2πσ²)) · exp(-(x² + y²) / (2σ²))   (2)
Since the image is two-dimensional, a two-dimensional Gaussian kernel function is used in the image denoising process, and in image processing the mean is generally taken to be 0. Generally speaking, Gaussian filtering is an average weighting of the whole image: the value of each pixel point is obtained by a weighted average of its own value and the values of the other pixel points in its neighborhood. The specific operation is as follows: a template (also called a convolution kernel or mask) scans the image pixel by pixel, and the weighted average of the pixels in the neighborhood covered by the template replaces the value of the pixel at the template center. In the present invention the convolution template used is a 3x3 square; the specific template design is shown in fig. 11;
Since a grayscale image is required in the feature extraction stage, the color image is grayed in the second stage of image preprocessing. In this step, the OpenCV function cvtColor is used directly for graying, obtaining the grayscale image of each video picture.
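A minimal sketch of this preprocessing stage in Python with OpenCV follows; the 3x3 kernel matches fig. 11, while letting OpenCV derive sigma from the kernel size is an assumption, since the patent does not publish its template weights:

    import cv2

    def preprocess(frame):
        """Denoise with a 3x3 Gaussian template, then gray the frame."""
        # Average weighting of each pixel with its neighborhood (Gaussian filtering);
        # sigmaX=0 tells OpenCV to derive the standard deviation from the kernel size.
        denoised = cv2.GaussianBlur(frame, (3, 3), sigmaX=0)
        # Graying with the built-in cvtColor, as described above.
        gray = cv2.cvtColor(denoised, cv2.COLOR_BGR2GRAY)
        return denoised, gray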
In an embodiment of the video panorama stitching and three-dimensional fusion method of the present application, referring to fig. 3, the following may be specifically included:
step S301: extracting the features of the real-time video picture to obtain corresponding feature points and feature descriptors;
step S302: performing rough matching on each feature point, and determining a Hamming distance between two feature points according to the feature descriptors;
step S303: performing fine matching according to the Hamming distance to obtain matching feature points of two adjacent real-time video pictures;
step S304: and determining and storing the transformation relation of the two adjacent real-time video pictures according to the matching feature points and a preset random sampling consistency algorithm.
Optionally, feature points of the preprocessed images are extracted using a method combining SIFT (Scale-Invariant Feature Transform) and SURF (Speeded-Up Robust Features) feature extraction, obtaining the feature points and corresponding feature descriptors of each image;
in the feature matching stage, because of the complexity of the scene and outside influences, two matching modes are provided: manual matching and automatic matching. For automatic matching, a first, coarse matching is performed on the extracted feature points of each image, using a brute-force matching algorithm to compute the Hamming distance between every two feature descriptors; a second, exact matching is then performed. In the present invention the distance threshold for exact matching is selectable, with a selection range between 0.4 and 0.8. The second, exact matching yields the exact matching points of adjacent images. For manual matching, the corresponding feature points of adjacent images are selected directly with the mouse.
The transformation relation between adjacent images is then calculated from the computed matching points using the RANSAC (Random Sample Consensus) algorithm, and the transformation-relation matrix is saved to a file for later use.
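A sketch of the automatic matching path under stated assumptions: SIFT descriptors are floating-point, so the L2 norm is used here where the text mentions Hamming distance (Hamming applies to binary descriptors), the ratio argument stands in for the selectable 0.4-0.8 exact-matching threshold, and cv2.findHomography performs the RANSAC step; all names are illustrative:

    import cv2
    import numpy as np

    def match_and_estimate(gray_a, gray_b, ratio=0.6):
        """Coarse match -> exact (ratio) match -> RANSAC transformation."""
        sift = cv2.SIFT_create()
        kp_a, desc_a = sift.detectAndCompute(gray_a, None)
        kp_b, desc_b = sift.detectAndCompute(gray_b, None)

        # Coarse matching: brute force, two nearest neighbours per descriptor.
        matcher = cv2.BFMatcher(cv2.NORM_L2)
        coarse = matcher.knnMatch(desc_a, desc_b, k=2)

        # Exact matching: keep a match only if it clearly beats the runner-up.
        good = [m for m, n in coarse if m.distance < ratio * n.distance]

        src = np.float32([kp_a[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp_b[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

        # Random sample consensus estimate of the adjacent-image transformation.
        H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        return H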
In an embodiment of the video panorama stitching and three-dimensional fusion method of the present application, referring to fig. 4, the following may be specifically included:
step S401: performing color correction according to color correction parameters between two adjacent real-time video pictures and preset global adjustment parameters;
step S402: establishing a panoramic image according to the number of the cameras and the resolution of the video pictures;
step S403: and performing overlapping optimization on the overlapping areas of the two adjacent real-time video pictures in the panoramic image to obtain the panoramic image subjected to the overlapping optimization processing.
Optionally, adjacent cameras exhibit color differences because of their orientation, their photosensitive elements, and other factors; if their pictures are stitched directly, a seam visible to the naked eye is produced, seriously affecting the stitching effect and the user experience. The invention therefore provides an adaptive adjustment method for color-difference images. First, the color correction parameters between adjacent video pictures are calculated. Specifically: suppose there are n images to be stitched, P1, P2, …, Pn, in which Pi and Pi+1 are two adjacent video frames and M is their overlapping area. The correction parameter of a color channel of the video image Pi to be stitched is calculated by the following formula:

αc,i = ( Σs∈M Si+1(s)^γ ) / ( Σs∈M Si(s)^γ )   (3)

In equation (3):
M: the overlapping area of the adjacent video pictures;
Si(s): the pixel value of pixel point s in the i-th image;
Si+1(s): the pixel value of pixel point s in the (i+1)-th image;
γ: a specified parameter, typically set to 2.2.
Meanwhile, in order to avoid oversaturating the image, the invention sets a global adjustment parameter gc, used to adjust the color values of the whole stitching sequence. Since the video images generally have the three channels R, G, and B, it is necessary to calculate a color compensation value and a color adjustment value gc for each channel (formula (4)).
Finally, the formula for performing color correction on the video picture with the color correction parameter and the global adjustment parameter is:

Sc,i(s) = (gc · αc,i)^(1/γ) · Sc,i(s)   (5)

where Sc,i(s) is the pixel value of a pixel point of the video picture Pi on channel c ∈ {R, G, B}.
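A sketch of per-channel gain correction consistent with formulas (3) and (5), assuming gamma-encoded 8-bit frames; g_c defaults to 1.0 here as a placeholder assumption, since formula (4) for the global adjustment is not reproduced above:

    import numpy as np

    GAMMA = 2.2  # the specified parameter, typically set to 2.2

    def color_correct(img_i, overlap_i, overlap_ip1, g_c=1.0):
        """Correct image i toward image i+1 using their overlap M.

        overlap_i, overlap_ip1: the overlap area M cut from each frame,
        normalised float arrays in [0, 1] with shape (h, w, 3).
        """
        corrected = img_i.astype(np.float64) / 255.0
        for c in range(3):  # per channel c in {B, G, R}
            # Formula (3): ratio of gamma-linearised sums over the overlap M.
            alpha = (overlap_ip1[..., c] ** GAMMA).sum() / \
                    ((overlap_i[..., c] ** GAMMA).sum() + 1e-12)
            # Formula (5): S' = (g_c * alpha)^(1/gamma) * S.
            corrected[..., c] *= (g_c * alpha) ** (1.0 / GAMMA)
        return np.clip(corrected * 255.0, 0, 255).astype(np.uint8)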
And S32, establishing a panoramic mapping image according to the number of panorama-stitching cameras. The panoramic image in the invention is determined by the number of cameras and the video picture resolution of each camera. Assuming there are N cameras for panorama stitching and the video frame resolution of each individual camera is W x H, the resolution of the generated panoramic image is (W x N) x H. If the number of the selected stitching cameras is odd, the video picture of the middle camera is selected as the reference picture; the mapping relation of the middle camera to the panoramic image is then the identity matrix, the x component of the translation matrix (along the video width) is a whole multiple of the video picture width, and the y component (along the video height) is 0. The panoramic image generation process is described for one specific case. Suppose there are n cameras (n odd), P1, P2, P3, …, Pn, and the transformation matrices between adjacent camera video pictures obtained by the above calculation are H1, H2, H3, …, Hn-1, where the subscript i of a transformation matrix indicates the transformation relation between the current camera i and the camera i+1 to its right. From the above, the index of the middle camera is r = (1+n)/2. Taking Pr as the reference camera, the transformation from the reference camera to the panoramic image is the identity matrix, written I, and with the translation matrix written T, the transformation formula from a reference-camera pixel point to the panoramic image is:

Pp = T * I * PR   (6)

where Pp is the position of a pixel in the panoramic image, in homogeneous coordinates;
PR is the position of that pixel in the reference image, in homogeneous coordinates;
I and T are the rotation and translation matrices, respectively, both 3x3.
Further, for the pictures to the left of the reference picture, i.e. sequence numbers i = 1, …, r-1, the transformation relation from each image's pixel points to the panorama is calculated by chaining the adjacent-camera transformations up to the reference camera:

Pp = T * Hr-1 * … * Hi+1 * Hi * Pi   (7)

In particular, if i = r-1, equation (7) becomes:

Pp = T * Hr-1 * Pi   (8)

For the images to the right of the reference image, i.e. i = r+1, …, n, the transformation relation from each image's pixel points to the panoramic image uses the inverses of the adjacent-camera transformations:

Pp = T * Hr^-1 * Hr+1^-1 * … * Hi-1^-1 * Pi   (9)

where Hk^-1 denotes the inverse of Hk. In particular, if i = r+1, equation (9) becomes:

Pp = T * Hr^-1 * Pi   (10)

Finally, each image can be mapped onto the panorama through the above equations (6) to (10).
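A sketch of equations (6) to (10) as matrix composition, under the assumption (not fixed by the text, which only says Hk relates camera k to the camera on its right) that Hk maps pixels of camera k into the frame of camera k+1, and that T shifts the reference picture by whole picture widths; 3x3 homogeneous matrices throughout:

    import numpy as np

    def panorama_transforms(H_list, W):
        """H_list[k-1] holds Hk, mapping camera k's pixels into camera k+1's
        frame; returns each camera's 3x3 transform into the panorama
        (odd camera count, middle camera as reference)."""
        n = len(H_list) + 1
        r = (1 + n) // 2                        # index of the reference camera
        T = np.array([[1.0, 0.0, W * (r - 1)],  # place the reference slot in x
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
        transforms = {}
        for i in range(1, n + 1):
            M = np.eye(3)
            if i < r:    # equations (7)/(8): chain Hi, ..., H(r-1)
                for k in range(i, r):
                    M = H_list[k - 1] @ M
            elif i > r:  # equations (9)/(10): chain H(i-1)^-1, ..., Hr^-1
                for k in range(i - 1, r - 1, -1):
                    M = np.linalg.inv(H_list[k - 1]) @ M
            transforms[i] = T @ M               # equation (6) when i == r
        return transforms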
The overlapping areas of adjacent images in the panoramic image are then located and optimized, and the panoramic image is generated. If the overlapping areas of adjacent images in the panoramic image are not optimized, the overlapping area in the panoramic image loses detail because the pixels become oversaturated. The pixel values in the overlapping area of adjacent video pictures in the panoramic image are calculated as follows:

P(m,n+0) = Pi,(m,n+0) * α + Pi+1,(m,n+0) * (1 - α)   (11)
P(m,n+1) = Pi,(m,n+1) * α + Pi+1,(m,n+1) * (1 - α)   (12)
P(m,n+2) = Pi,(m,n+2) * α + Pi+1,(m,n+2) * (1 - α)   (13)
α = ((width * ratio) - (n - s)) / (width * ratio)   (14)

where P(m,n+k) is the pixel value of the k-th channel (k = 0, 1, 2) at position (m, n) of the overlapping area in the panoramic image;
Pi,(m,n+k) and Pi+1,(m,n+k) are the pixel values of the k-th channel (k = 0, 1, 2) at position (m, n) of the overlapping area in image i and image i+1, respectively;
width is the width of the overlapping area of the adjacent video pictures;
ratio is the overlap proportion of the adjacent video pictures;
s is the start position of the overlapping area of the adjacent video pictures.
After the above steps are carried out, the panoramic image is generated and then transmitted to the client side for configuration and display.
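A sketch of the linear feathering of formulas (11) to (14), assuming frames already warped into panorama coordinates and an overlap spanning the columns from s over width * ratio pixels; names are illustrative:

    import numpy as np

    def blend_overlap(pano, img_i, img_ip1, s, width, ratio):
        """Feather image i into image i+1 across their overlap columns."""
        extent = int(width * ratio)
        for n in range(s, s + extent):
            # Formula (14): alpha falls linearly from 1 to 0 across the overlap.
            alpha = ((width * ratio) - (n - s)) / (width * ratio)
            # Formulas (11)-(13), applied to all three channels at once.
            pano[:, n, :] = (img_i[:, n, :] * alpha +
                             img_ip1[:, n, :] * (1.0 - alpha)).astype(pano.dtype)
        return pano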
In an embodiment of the video panorama stitching and three-dimensional fusion method of the present application, referring to fig. 5, the following may be specifically included:
step S501: determining a three-dimensional model of a target area and a plurality of discrete point pairs of the panoramic image, wherein the discrete point pairs are composed of one three-dimensional model point coordinate and one rasterized coordinate of the panoramic image;
step S502: and determining the mapping relation of the panoramic image according to the discrete point pairs, performing coordinate interpolation according to the mapping relation, and performing panoramic video sampling according to the coordinate interpolation to obtain a three-dimensional fusion image of the panoramic video.
Optionally, in order to accurately determine the position of the panoramic video, a static panoramic picture is selected as the reference for editing; a number of point pairs are selected from the three-dimensional scene and the corresponding panoramic picture, each pair consisting of one three-dimensional scene point coordinate and one two-dimensional rasterized panorama coordinate.
Since the mapping relation of the whole panorama must be obtained by interpolating the discrete point pairs from the above step, choosing a suitable interpolation method is crucial to the final panorama mapping result. Specifically:
because the panoramic image is two-dimensional, directly interpolating the corresponding three-dimensional selected points in three-dimensional space would cause perspective distortion in the interpolation result. The invention therefore takes the screen space as the interpolation space: the three-dimensional selected points are converted into two-dimensional screen-space points through projection transformation and are interpolated in screen space.
The usual interpolation method uses the distance between the interpolation point and the discrete points as the basis for selecting the adjacent discrete points, selects the closest points, and then calculates the value of the interpolation point from the values of those adjacent discrete points. This performs well when the discrete points are distributed fairly uniformly. When the discrete points are unevenly distributed, however, the interpolated image is usually heavily distorted: nearest-distance selection makes several interpolation points share the same adjacent discrete points, so the interpolation results pile up on one another, and it also distorts the result when the adjacent discrete points lie nearly on a line. The invention therefore subdivides the spatial domain formed by all the discrete points with Delaunay triangulation, which has the good property of maximizing the interior angles of the generated triangles and thus effectively avoids long, narrow triangles.
The invention uses triangle interpolation based on barycentric coordinates: the triangles generated in the above step serve as the interpolation domain, the barycentric coordinates are determined from the areas of the three triangles formed by the interpolation point and the three vertices of its containing triangle, and the sampling-point interpolation is performed with these barycentric coordinates.
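A sketch of this Delaunay-plus-barycentric interpolation using scipy (an assumed dependency; the patent does not name its implementation). scipy's LinearNDInterpolator does precisely this: it triangulates the control points with Delaunay and interpolates barycentrically inside each triangle:

    import numpy as np
    from scipy.interpolate import LinearNDInterpolator

    def build_panorama_mapping(screen_pts, pano_uv, out_h, out_w):
        """screen_pts: (k, 2) screen-space projections of the 3D selected points;
        pano_uv: (k, 2) matching rasterized panorama coordinates.
        Returns a dense (out_h, out_w, 2) map of panorama coordinates."""
        interp = LinearNDInterpolator(screen_pts, pano_uv)  # Delaunay + barycentric
        ys, xs = np.mgrid[0:out_h, 0:out_w]
        uv = interp(np.column_stack([xs.ravel(), ys.ravel()]))
        return uv.reshape(out_h, out_w, 2)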
Although Delaunay triangulation is relatively stable, long, narrow triangles may still appear at the edge of the interpolation region and distort the interpolated image, so the invention provides an unsupervised algorithm for automatically eliminating long, narrow triangles.
Finally, the panoramic video is sampled according to the coordinates obtained by the above interpolation, generating the final three-dimensional fusion of the panoramic video.
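The sampling step can be sketched with cv2.remap, treating the interpolated coordinate map as a per-pixel lookup into the panorama; this is a minimal CPU illustration, not the GPU path the patent describes:

    import cv2
    import numpy as np

    def sample_panorama(pano_frame, uv_map):
        """uv_map: (h, w, 2) panorama coordinates from build_panorama_mapping;
        NaN entries (outside the triangulation) are rendered as black."""
        map_x = np.nan_to_num(uv_map[..., 0]).astype(np.float32)
        map_y = np.nan_to_num(uv_map[..., 1]).astype(np.float32)
        # Bilinear lookup of the panorama at the interpolated coordinates.
        return cv2.remap(pano_frame, map_x, map_y, cv2.INTER_LINEAR)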
In order to effectively solve the problem of discontinuous video pictures and improve video readability, so that a user can grasp the whole situation in real time without missing any monitored corner, which has great practical application value, the application provides an embodiment of a video panorama stitching and three-dimensional fusion device that implements all or part of the contents of the video panorama stitching and three-dimensional fusion method; referring to fig. 6, the device specifically includes the following contents:
the image preprocessing module 10 is configured to acquire real-time video frames of a plurality of adjacent cameras having an overlapping area, and perform image preprocessing on the real-time video frames;
a transformation relation determining module 20, configured to perform feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determine matching feature points of two adjacent real-time video pictures, and determine a transformation relation between the two adjacent real-time video pictures according to the matching feature points;
the panoramic image generation module 30 is configured to perform color difference optimization processing on the two adjacent real-time video pictures, and deform the real-time video pictures subjected to the color difference optimization processing according to a transformation relation to obtain a corresponding panoramic image;
and the three-dimensional fusion module 40 is used for rendering the panoramic image to a corresponding position of a preset three-dimensional model in real time through a three-dimensional fusion technology for displaying.
As can be seen from the above description, the video panorama stitching and three-dimensional fusion device provided in the embodiment of the present application stitches the pictures of a plurality of adjacent cameras with overlapping areas into one complete picture through panorama stitching, using computer vision and image processing technologies, and then renders the stitched picture in real time onto the corresponding position of a three-dimensional model using computer graphics. This combines geographic position with real-time panoramic video, so that security personnel can grasp the monitoring situation of the whole scene at any time and from any place, reducing the opportunities of criminals.
In an embodiment of the video panorama stitching and three-dimensional fusion apparatus according to the present application, referring to fig. 7, the image preprocessing module 10 includes:
the image denoising unit 11 is configured to perform average weighting processing on values of each pixel point of the real-time video picture according to a preset two-dimensional gaussian filter kernel function;
and an image graying unit 12, configured to perform graying processing on the real-time video picture subjected to the average weighting processing, so as to obtain a corresponding grayscale image.
In an embodiment of the apparatus for stitching and three-dimensional fusing a video panorama according to the present application, referring to fig. 8, the transformation relation determining module 20 includes:
a feature extraction unit 21, configured to perform feature extraction on the real-time video picture to obtain corresponding feature points and feature descriptors;
a rough matching unit 22, configured to perform rough matching on each feature point, and determine a hamming distance between two feature points according to the feature descriptors;
the fine matching unit 23 is configured to perform fine matching according to the hamming distance to obtain matching feature points of two adjacent real-time video pictures;
and the transformation relation calculation unit 24 is used for determining and storing the transformation relation of the two adjacent real-time video pictures according to the matching feature points and a preset random sampling consistency algorithm.
In an embodiment of the apparatus for stitching and three-dimensional fusing a video panorama according to the present application, referring to fig. 9, the panoramic image generating module 30 includes:
the color correction unit 31 is used for performing color correction according to color correction parameters between two adjacent real-time video pictures and preset global adjustment parameters;
a panoramic image establishing unit 32, configured to establish a panoramic image according to the number of cameras and the resolution of the video frames;
and an overlap optimization unit 33, configured to perform overlap optimization on overlapping areas of the two adjacent real-time video frames in the panoramic image, so as to obtain the panoramic image after the overlap optimization processing.
In an embodiment of the video panorama stitching and three-dimensional fusion apparatus of the present application, referring to fig. 10, the three-dimensional fusion module 40 includes:
a discrete point pair determining unit 41 for determining a three-dimensional model of a target area and a plurality of discrete point pairs of the panoramic image, wherein the discrete point pairs are composed of one of the three-dimensional model point coordinates and one of the panoramic image rasterized coordinates;
and the three-dimensional fusion unit 42 is configured to determine a mapping relationship of the panoramic image according to the discrete point pairs, perform coordinate interpolation according to the mapping relationship, and perform panoramic video sampling according to the coordinate interpolation to obtain a three-dimensional fusion image of the panoramic video.
In order to further explain the present solution, the present application further provides a specific application example of the method for implementing video panorama stitching and three-dimensional fusion by using the above video panorama stitching and three-dimensional fusion apparatus, which specifically includes the following contents:
a video panorama stitching and three-dimensional fusion technical method and a flow thereof comprise the following steps:
S1, acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas, and performing image preprocessing on the real-time video pictures through image processing technology;
S2, obtaining the matching feature points of adjacent video pictures using feature extraction and matching, and calculating the image transformation relation of the adjacent video pictures from the matching feature points;
S3, first performing color difference optimization on the adjacent video pictures from S2, then warping the video pictures according to the transformation relation of the adjacent video pictures, and generating the corresponding panoramas;
S4, loading the three-dimensional model at the client, taking the panoramic image output in S3 as a texture, and displaying the panoramic image at the corresponding position of the model in real time through the three-dimensional fusion technique.
According to the invention, the video pictures of a plurality of cameras with overlapping areas are seamlessly stitched, and image processing is accelerated with the GPU, so that the output panoramic picture is smooth and stutter-free and meets the viewing requirements of the human eye. Finally, the panoramic video is rendered into the three-dimensional model in real time using GPU techniques, realizing seamless fusion and display of the virtual model and the video.
The features and properties of the present invention are described in further detail below with reference to examples:
S1, acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas, and performing image preprocessing on the real-time video pictures through image processing technology:
Taking a certain area of Xinjiang as an example: first the panorama generation configuration software is opened, and the real-time monitoring pictures of a plurality of adjacent cameras with overlapping areas are acquired through the equipment list; the stitching parameters are then configured (generally the defaults) and the 'start' button is clicked; the background performs the image preprocessing automatically as described above, and finally the matched features are displayed in the picture.
S2, obtaining the matching feature points of adjacent video pictures using feature extraction and matching, and calculating the image transformation relation of the adjacent video pictures from the matching feature points:
In the field of panorama stitching, the robustness of feature point detection and matching largely determines the consistency of video panorama stitching. The scene tested in this example is somewhere in Xinjiang. Using the combined SURF and SIFT feature point method and the subsequent automatic matching, the matching points of two adjacent images are finally obtained, as shown in fig. 12. Clearly the correct matches form the majority, providing a solid basis for the subsequent use of the RANSAC algorithm to calculate the transformation relation between adjacent video pictures.
S3, color difference optimization is first performed on the adjacent video pictures from S2; the video pictures are then warped according to the transformation relation of the adjacent video pictures, and the corresponding panoramic image is generated, as shown in fig. 13;
S4, the client loads the three-dimensional model, takes the panorama output in S3 as a texture, and displays it at the corresponding position of the model in real time through the aforementioned three-dimensional fusion technique; the final display effect is shown in fig. 14 and fig. 15.
As can be seen from the above, the present application can achieve at least the following technical effects:
1. The application performs image preprocessing on the acquired real-time video pictures of the plurality of cameras and provides both manual and automatic feature matching for different situations. On one hand this improves the accuracy and speed of feature matching between adjacent video pictures; on the other hand it allows the application to adapt to various complex environments, effectively overcoming the shortcomings of video stitching.
2. In the application, selecting the video picture of the middle camera as the reference effectively reduces the deformation of the video pictures at the two ends of the panoramic image. Meanwhile, the color difference between video pictures caused by the cameras and the external environment is adjusted with a local per-channel and globally adaptive method, further improving the stitching quality and the user's visual experience. Finally, the pixel calculation for the overlapping area of adjacent video pictures and the setting of the overlap proportion significantly reduce the ghosting in the stitched picture caused by the camera installation positions.
3. The application interpolates the edited points in screen space, effectively avoiding the perspective distortion that direct interpolation in three-dimensional space would introduce. Delaunay triangulation is used to determine the interpolation vertices, effectively avoiding the interpolation distortion caused by long, narrow triangles. Finally, for the long, narrow triangles that may remain at the edge of the interpolation region, an unsupervised automatic elimination algorithm is provided, so that the final interpolated image shows no obvious distortion under a variety of conditions.
4. The video panorama stitching and three-dimensional fusion method and process of the application are suitable for a variety of real environments. They allow security personnel to locate the overall situation of the monitored area and the specific position of a danger quickly and efficiently once danger occurs, realizing the combination of video and geographic position and remedying the drawbacks of scattered videos, difficult danger localization, and unclear geographic positions.
In order to effectively solve, at the hardware level, the problem of discontinuous video pictures and improve video readability, so that a user can grasp the whole situation in real time without missing any monitored corner, the application provides an embodiment of an electronic device for implementing all or part of the contents of the video panorama stitching and three-dimensional fusion method, the electronic device specifically including the following contents:
a processor, a memory, and a communication interface connected by a bus, over which the processor, the memory, and the communication interface communicate with one another; the communication interface is used to transmit information between the video panorama stitching and three-dimensional fusion device and related equipment such as a core service system, user terminals, and related databases; the electronic device may be a desktop computer, a tablet computer, a mobile terminal, or the like, but the embodiment is not limited thereto. In this embodiment, the electronic device may be implemented with reference to the embodiment of the video panorama stitching and three-dimensional fusion method and the embodiment of the video panorama stitching and three-dimensional fusion device; their contents are incorporated herein and not repeated.
It is understood that the user terminal may include a smart phone, a tablet electronic device, a network set-top box, a portable computer, a desktop computer, a personal digital assistant (PDA), an in-vehicle device, a smart wearable device, and the like. The smart wearable device may include smart glasses, a smart watch, a smart bracelet, and the like.
In practical applications, part of the video panorama stitching and three-dimensional fusion method may be executed on the electronic device side as described above, or all operations may be completed in the client device; the choice may be made according to the processing capability of the client device, the restrictions of the user's usage scenario, and so on. This application is not limited in this respect. If all operations are completed in the client device, the client device may further include a processor.
The client device may have a communication module (i.e., a communication unit) and may be communicatively connected to a remote server to implement data transmission with the server. The server may include a server on the side of the task scheduling center, and in other implementation scenarios, the server may also include a server on an intermediate platform, for example, a server on a third party server platform that has a communication link with the task scheduling center server. The server may include a single computer device, or may include a server cluster formed by a plurality of servers, or a server structure of a distributed apparatus.
Fig. 16 is a schematic block diagram of a system configuration of an electronic device 9600 according to an embodiment of the present application. As shown in fig. 16, the electronic device 9600 can include a central processor 9100 and a memory 9140, with the memory 9140 coupled to the central processor 9100. Note that fig. 16 is exemplary; other types of structures may also be used, in addition to or in place of this structure, to implement telecommunications or other functions.
In one embodiment, the video panorama stitching and three-dimensional fusion method functions may be integrated into the central processor 9100. The central processor 9100 may be configured to control as follows:
step S101: acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas, and carrying out image preprocessing on the real-time video pictures;
step S102: performing feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determining matching feature points of two adjacent real-time video pictures, and determining a transformation relation of the two adjacent real-time video pictures according to the matching feature points;
step S103: performing color difference optimization processing on the two adjacent real-time video pictures, and deforming the real-time video pictures subjected to the color difference optimization processing according to a transformation relation to obtain corresponding panoramic images;
step S104: and rendering the panoramic image to a corresponding position of a preset three-dimensional model in real time through a three-dimensional fusion technology for displaying.
As can be seen from the above description, the electronic device provided in this embodiment of the present application uses computer vision and image processing techniques to stitch the pictures of a plurality of adjacent cameras with overlapping areas into one complete image through panoramic stitching, and then uses computer graphics to render the stitched image in real time to the corresponding position on the three-dimensional model. Combining geographic position with real-time panoramic video in this way allows security personnel to keep track of the monitoring situation of the whole scene at any time and from any place, leaving fewer opportunities for criminals.
In another embodiment, the video panorama stitching and three-dimensional fusion apparatus may be configured separately from the central processor 9100; for example, it may be configured as a chip connected to the central processor 9100, with the functions of the video panorama stitching and three-dimensional fusion method realized under the control of the central processor.
As shown in Fig. 16, the electronic device 9600 may further include a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. It is worth noting that the electronic device 9600 does not necessarily include all of the components shown in Fig. 16; it may also include components not shown in Fig. 16, for which reference may be made to the related art.
As shown in Fig. 16, the central processor 9100, sometimes referred to as a controller or operation control, may include a microprocessor or other processor device and/or logic device. The central processor 9100 receives input and controls the operation of the various components of the electronic device 9600.
The memory 9140 can be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or another suitable device. It may store relevant information as well as the programs that process that information, and the central processor 9100 can execute the programs stored in the memory 9140 to realize information storage, processing, and the like.
The input unit 9120 provides input to the central processor 9100; it may be, for example, a key or a touch input device. The power supply 9170 supplies power to the electronic device 9600. The display 9160 displays objects such as images and text; it may be, for example, an LCD display, but is not limited thereto.
The memory 9140 can be a solid-state memory, e.g., a read-only memory (ROM), a random access memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when power is off, that can be selectively erased, and that can be provided with further data; such a memory is sometimes called an EPROM or the like. The memory 9140 could also be some other type of device. The memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer) and an application/function storage portion 9142, the latter being used to store application programs and function programs, as well as the procedures by which the central processor 9100 carries out the operations of the electronic device 9600.
The memory 9140 can also include a data storage portion 9143 for storing data such as contacts, digital data, pictures, sounds, and/or any other data used by the electronic device, and a driver storage portion 9144 that may hold the electronic device's various drivers for communication functions and/or for executing its other functions (e.g., a messaging application, a contact book application).
The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. The communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to supply input signals and receive output signals, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, may be provided in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and receive audio input from the microphone 9132, thereby implementing ordinary telecommunications functions. The audio processor 9130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100, thereby enabling recording locally through the microphone 9132 and enabling sounds stored locally to be played through the speaker 9131.
An embodiment of the present application further provides a computer-readable storage medium capable of implementing all the steps of the video panorama stitching and three-dimensional fusion method of the foregoing embodiments, with a server or a client as the execution subject. The computer-readable storage medium stores a computer program which, when executed by a processor, implements all the steps of that method; for example, when the processor executes the computer program, it implements the following steps:
step S101: acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas, and carrying out image preprocessing on the real-time video pictures;
step S102: performing feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determining matching feature points of two adjacent real-time video pictures, and determining a transformation relation of the two adjacent real-time video pictures according to the matching feature points;
step S103: performing color difference optimization processing on the two adjacent real-time video pictures, and deforming the real-time video pictures subjected to the color difference optimization processing according to a transformation relation to obtain corresponding panoramic images;
step S104: and rendering the panoramic image to a corresponding position of a preset three-dimensional model in real time through a three-dimensional fusion technology for displaying.
As can be seen from the above description, the computer-readable storage medium provided in this embodiment of the present application uses computer vision and image processing techniques to stitch the pictures of a plurality of adjacent cameras with overlapping areas into one complete image through panoramic stitching, and then uses computer graphics to render the stitched image in real time to the corresponding position on the three-dimensional model. Combining geographic position with real-time panoramic video in this way allows security personnel to keep track of the monitoring situation of the whole scene at any time and from any place, leaving fewer opportunities for criminals.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and implementations of the present invention are explained herein through specific embodiments; the description of the embodiments is intended only to help in understanding the method of the present invention and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present invention, vary the specific implementation and the scope of application; in summary, the contents of this specification should not be construed as limiting the present invention.

Claims (10)

1. A video panorama stitching and three-dimensional fusion method is characterized by comprising the following steps:
acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas, and carrying out image preprocessing on the real-time video pictures;
performing feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determining matching feature points of two adjacent real-time video pictures, and determining a transformation relation of the two adjacent real-time video pictures according to the matching feature points;
performing color difference optimization processing on the two adjacent real-time video pictures, and deforming the real-time video pictures subjected to the color difference optimization processing according to a transformation relation to obtain corresponding panoramic images;
and rendering the panoramic image to a corresponding position of a preset three-dimensional model in real time through a three-dimensional fusion technology for displaying.
2. The video panorama stitching and three-dimensional fusion method according to claim 1, wherein acquiring real-time video pictures of a plurality of adjacent cameras having overlapping areas and performing image preprocessing on the real-time video pictures comprises:
carrying out average weighting processing on the values of all pixel points of the real-time video picture according to a preset two-dimensional Gaussian filter kernel function;
and carrying out graying processing on the real-time video picture subjected to the average weighting processing to obtain a corresponding grayscale image.
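By way of illustration only, a minimal sketch of the preprocessing of claim 2 in Python, assuming OpenCV; the 5x5 kernel size and sigma are assumptions, since the claim only requires a preset two-dimensional Gaussian kernel.

    import cv2

    def preprocess(frame):
        # Weighted averaging of all pixel values with a preset two-dimensional
        # Gaussian kernel (noise reduction), ...
        blurred = cv2.GaussianBlur(frame, (5, 5), sigmaX=1.0)
        # ... then graying the smoothed picture to obtain the grayscale image.
        return cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)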
3. The video panorama stitching and three-dimensional fusion method according to claim 1, wherein performing feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determining matching feature points of two adjacent real-time video pictures, and determining the transformation relation of the two adjacent real-time video pictures according to the matching feature points comprises:
extracting the features of the real-time video picture to obtain corresponding feature points and feature descriptors;
performing rough matching on each feature point, and determining a Hamming distance between two feature points according to the feature descriptors;
performing fine matching according to the Hamming distance to obtain matching feature points of two adjacent real-time video pictures;
and determining and storing the transformation relation of the two adjacent real-time video pictures according to the matching feature points and a preset random sampling consistency algorithm.
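A sketch of one way to realize claim 3, assuming OpenCV in Python. The claim names no specific feature; ORB is assumed here because its binary descriptors pair naturally with Hamming-distance matching, and cv2.findHomography with cv2.RANSAC stands in for the preset random sampling consistency algorithm.

    import cv2
    import numpy as np

    def estimate_transform(gray_a, gray_b):
        # Feature extraction: feature points plus binary feature descriptors.
        orb = cv2.ORB_create(nfeatures=2000)
        kp_a, des_a = orb.detectAndCompute(gray_a, None)
        kp_b, des_b = orb.detectAndCompute(gray_b, None)
        # Rough matching by Hamming distance between descriptors.
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
        good = matches[:200]  # fine matching: keep the closest pairs
        src = np.float32([kp_a[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp_b[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        # RANSAC rejects outlier pairs; H is the stored transformation relation.
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=5.0)
        return H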
4. The video panorama stitching and three-dimensional fusion method according to claim 1, wherein performing color difference optimization processing on the two adjacent real-time video pictures and deforming the real-time video pictures subjected to the color difference optimization processing according to the transformation relation to obtain the corresponding panoramic image comprises:
performing color correction according to color correction parameters between two adjacent real-time video pictures and preset global adjustment parameters;
establishing a panoramic image according to the number of the cameras and the resolution of the video pictures;
and performing overlapping optimization on the overlapping areas of the two adjacent real-time video pictures in the panoramic image to obtain the panoramic image subjected to the overlapping optimization processing.
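A sketch of claim 4's color correction and overlap optimization, assuming a simple per-channel gain model estimated in the overlap area and an even blend in place of seam feathering; the claim itself fixes neither the correction model nor the blending weights.

    import cv2
    import numpy as np

    def color_gain(img, ref, overlap_mask):
        # Per-channel gains from mean ratios inside the overlap (assumed model);
        # a preset global adjustment parameter could rescale these gains.
        gains = ref[overlap_mask].mean(axis=0) / (img[overlap_mask].mean(axis=0) + 1e-6)
        return np.clip(img * gains, 0, 255).astype(np.uint8)

    def build_panorama(frames, transforms, size):
        # Panorama canvas sized from the camera count and frame resolution.
        pano = np.zeros((size[1], size[0], 3), np.uint8)
        for frame, H in zip(frames, transforms):
            warped = cv2.warpPerspective(frame, H, size)
            new = warped.any(axis=2)
            overlap = pano.any(axis=2) & new
            # Overlap optimization: an even blend stands in for feathering.
            pano[overlap] = pano[overlap] // 2 + warped[overlap] // 2
            pano[new & ~overlap] = warped[new & ~overlap]
        return pano

In this sketch, color_gain would be applied to each frame against its already-corrected neighbor before warping, so corrections propagate along the camera chain.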
5. The video panorama stitching and three-dimensional fusion method according to claim 1, wherein rendering the panoramic image in real time to the corresponding position of the preset three-dimensional model for display through the three-dimensional fusion technique comprises:
determining a three-dimensional model of a target area and a plurality of discrete point pairs of the panoramic image, wherein each discrete point pair consists of one three-dimensional model point coordinate and one rasterized coordinate of the panoramic image;
and determining the mapping relation of the panoramic image according to the discrete point pairs, performing coordinate interpolation according to the mapping relation, and performing panoramic video sampling according to the coordinate interpolation to obtain a three-dimensional fusion image of the panoramic video.
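An illustrative reading of claim 5: the sparse (model point, panorama pixel) pairs define a mapping that is interpolated into a dense lookup table, which then samples each panorama frame onto the model texture. The use of scipy.interpolate.griddata and cv2.remap is an assumption; any scattered-data interpolation and texture lookup would serve.

    import cv2
    import numpy as np
    from scipy.interpolate import griddata

    def build_lookup(model_pts, pano_pts, tex_size):
        # model_pts: (N, 2) rasterized model texture coordinates (x, y)
        # pano_pts:  (N, 2) matching panorama pixel coordinates (x, y)
        h, w = tex_size
        grid_y, grid_x = np.mgrid[0:h, 0:w]
        # Coordinate interpolation: densify the sparse point-pair mapping.
        map_x = griddata(model_pts, pano_pts[:, 0], (grid_x, grid_y), method='linear')
        map_y = griddata(model_pts, pano_pts[:, 1], (grid_x, grid_y), method='linear')
        # Texels outside the convex hull get -1 and sample the black border.
        return (np.nan_to_num(map_x, nan=-1).astype(np.float32),
                np.nan_to_num(map_y, nan=-1).astype(np.float32))

    def sample_panorama(pano, map_x, map_y):
        # Panoramic video sampling: per frame, look up each texel's source pixel.
        return cv2.remap(pano, map_x, map_y, interpolation=cv2.INTER_LINEAR)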
6. A video panorama stitching and three-dimensional fusion apparatus, characterized by comprising:
the image preprocessing module is used for acquiring real-time video pictures of a plurality of adjacent cameras with overlapping areas and preprocessing the real-time video pictures;
the transformation relation determining module is used for performing feature extraction and feature matching on the real-time video pictures subjected to the image preprocessing, determining matching feature points of two adjacent real-time video pictures and determining the transformation relation of the two adjacent real-time video pictures according to the matching feature points;
the panoramic image generation module is used for carrying out color difference optimization processing on the two adjacent real-time video pictures and deforming the real-time video pictures subjected to the color difference optimization processing according to a transformation relation to obtain corresponding panoramic images;
and the three-dimensional fusion module is used for rendering the panoramic image to a corresponding position of a preset three-dimensional model in real time through a three-dimensional fusion technology for displaying.
7. The video panorama stitching and three-dimensional fusion apparatus according to claim 6, wherein the image preprocessing module comprises:
the image noise reduction unit is used for carrying out average weighting processing on the values of all pixel points of the real-time video picture according to a preset two-dimensional Gaussian filter kernel function;
and the image graying unit is used for performing graying processing on the real-time video picture subjected to the average weighting processing to obtain a corresponding grayscale image.
8. The video panorama stitching and three-dimensional fusion apparatus according to claim 6, wherein the transformation relation determining module comprises:
the feature extraction unit is used for extracting features of the real-time video picture to obtain corresponding feature points and feature descriptors;
the rough matching unit is used for performing rough matching on each feature point and determining the Hamming distance between the two feature points according to the feature descriptors;
the fine matching unit is used for performing fine matching according to the Hamming distance to obtain matching feature points of two adjacent real-time video pictures;
and the transformation relation calculation unit is used for determining and storing the transformation relation of the two adjacent real-time video pictures according to the matching feature points and a preset random sampling consistency algorithm.
9. The video panorama stitching and three-dimensional fusion apparatus according to claim 6, wherein the panoramic image generation module comprises:
the color correction unit is used for performing color correction according to color correction parameters between two adjacent real-time video pictures and preset global adjustment parameters;
the panoramic image establishing unit is used for establishing a panoramic image according to the number of the cameras and the resolution of the video pictures;
and the overlapping optimization unit is used for performing overlapping optimization on the overlapping areas of the two adjacent real-time video pictures in the panoramic image to obtain the panoramic image subjected to the overlapping optimization processing.
10. The video panorama stitching and three-dimensional fusion apparatus according to claim 6, wherein the three-dimensional fusion module comprises:
a discrete point pair determining unit configured to determine a three-dimensional model of a target area and a plurality of discrete point pairs of the panoramic image, wherein each discrete point pair consists of one three-dimensional model point coordinate and one rasterized coordinate of the panoramic image;
and the three-dimensional fusion unit is used for determining the mapping relation of the panoramic image according to the discrete point pairs, carrying out coordinate interpolation according to the mapping relation, and carrying out panoramic video sampling according to the coordinate interpolation to obtain a three-dimensional fusion image of the panoramic video.
CN202010933528.XA 2020-09-08 2020-09-08 Video panorama stitching and three-dimensional fusion method and device Pending CN112017222A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010933528.XA CN112017222A (en) 2020-09-08 2020-09-08 Video panorama stitching and three-dimensional fusion method and device

Publications (1)

Publication Number Publication Date
CN112017222A true CN112017222A (en) 2020-12-01

Family

ID=73516557

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010933528.XA Pending CN112017222A (en) 2020-09-08 2020-09-08 Video panorama stitching and three-dimensional fusion method and device

Country Status (1)

Country Link
CN (1) CN112017222A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103516995A (en) * 2012-06-19 2014-01-15 中南大学 A real time panorama video splicing method based on ORB characteristics and an apparatus
CN106534780A (en) * 2016-11-11 2017-03-22 广西师范大学 Three-dimensional panoramic video monitoring device and video image processing method thereof
WO2018107910A1 (en) * 2016-12-16 2018-06-21 杭州海康威视数字技术股份有限公司 Method and device for fusing panoramic video images
WO2018121333A1 (en) * 2016-12-30 2018-07-05 艾迪普(北京)文化科技股份有限公司 Real-time generation method for 360-degree vr panoramic graphic image and video
CN108416732A (en) * 2018-02-02 2018-08-17 重庆邮电大学 A kind of Panorama Mosaic method based on image registration and multi-resolution Fusion
CN109493273A (en) * 2018-10-09 2019-03-19 江苏裕兰信息科技有限公司 A kind of color consistency adjusting method
CN109697696A (en) * 2018-12-24 2019-04-30 北京天睿空间科技股份有限公司 Benefit blind method for panoramic video
CN110443771A (en) * 2019-08-16 2019-11-12 同济大学 It is vehicle-mounted to look around panoramic view brightness and colour consistency method of adjustment in camera system
CN111383204A (en) * 2019-12-19 2020-07-07 北京航天长征飞行器研究所 Video image fusion method, fusion device, panoramic monitoring system and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAN XUANZONG: "A method for generating an airport panoramic surveillance image based on SURF features and stitching seams", Computer Knowledge and Technology, no. 25 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734904A (en) * 2020-12-29 2021-04-30 中国船舶重工集团公司第七0九研究所 Portable rapid image splicing processing system for police
CN113271434A (en) * 2021-03-24 2021-08-17 北京潞电电气设备有限公司 Monitoring system and method thereof
CN113506218A (en) * 2021-07-09 2021-10-15 江苏金海星导航科技有限公司 360-degree video splicing method for multi-compartment ultra-long vehicle type
CN113962859A (en) * 2021-10-26 2022-01-21 北京有竹居网络技术有限公司 Panorama generation method, device, equipment and medium
CN113870101A (en) * 2021-12-02 2021-12-31 交通运输部公路科学研究所 Panoramic all-around image splicing method and device for articulated vehicle
CN114845053A (en) * 2022-04-25 2022-08-02 国能寿光发电有限责任公司 Panoramic video generation method and device
CN114863375A (en) * 2022-06-10 2022-08-05 无锡雪浪数制科技有限公司 Gas station vehicle multi-view positioning method based on 3D visual recognition
CN116309081A (en) * 2023-05-15 2023-06-23 民航成都电子技术有限责任公司 Video panorama stitching method and system based on spherical camera linkage
CN116309081B (en) * 2023-05-15 2023-08-04 民航成都电子技术有限责任公司 Video panorama stitching method and system based on spherical camera linkage
CN116760963A (en) * 2023-06-13 2023-09-15 中影电影数字制作基地有限公司 Video panorama stitching and three-dimensional fusion method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination