CN111836012B - Video fusion and video linkage method based on three-dimensional scene and electronic equipment - Google Patents

Video fusion and video linkage method based on three-dimensional scene and electronic equipment

Info

Publication number
CN111836012B
Authority
CN
China
Prior art keywords
camera
video
scene
target
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010598667.1A
Other languages
Chinese (zh)
Other versions
CN111836012A (en)
Inventor
王永威
刘金龙
牟风涛
张耀华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Space Geodata Beijing Co ltd
Original Assignee
Space Geodata Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Space Geodata Beijing Co ltd filed Critical Space Geodata Beijing Co ltd
Priority to CN202010598667.1A priority Critical patent/CN111836012B/en
Publication of CN111836012A publication Critical patent/CN111836012A/en
Application granted granted Critical
Publication of CN111836012B publication Critical patent/CN111836012B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/451Execution arrangements for user interfaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/40Image enhancement or restoration using histogram techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/61Control of cameras or camera modules based on recognised objects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/67Focus control based on electronic image sensor signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/695Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/698Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/90Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/95Computational photography systems, e.g. light-field imaging systems
    • H04N23/951Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/188Capturing isolated or intermittent images triggered by the occurrence of a predetermined event, e.g. an object reaching a predetermined position

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Human Computer Interaction (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The invention relates to the technical field of comprehensive situation monitoring and commanding, and in particular to a video fusion and video linkage method based on a three-dimensional scene, and an electronic device. The method comprises three-dimensional modeling of the monitored area; planning and deployment of scene cameras and target cameras; enhancement, de-shaking and correction of the scene video; recognition and analysis of the scene video; three-dimensional matching and fusion of the scene video, its thermodynamic diagram (density heat map) and hotspot targets; three-dimensional matching linkage of the target cameras; and monitoring and correction of scene-camera position and attitude offset. It achieves accurate spatial matching and fused display of the scene camera video images, their thermodynamic diagrams and hotspot targets with the scene three-dimensional model; intelligently schedules the target cameras to obtain fine images of hotspot targets; and accurately matches and fuses other situation-awareness information, equipment and facility information, and task management and control information with the scene three-dimensional model. It thereby realizes full-dimensional situation fusion monitoring based on the three-dimensional scene, together with intelligent spatial positioning analysis and associated-information fusion display driven by hotspot patrol, planned tasks and detection alarms, and significantly improves the intelligence level and application efficiency of a comprehensive situation monitoring and commanding system.

Description

Video fusion and video linkage method based on three-dimensional scene and electronic equipment
Technical Field
The invention relates to the technical field of comprehensive situation monitoring and commanding, in particular to a video fusion and video linkage method based on a three-dimensional scene and electronic equipment.
Background
Video monitoring is a main subsystem of a regional comprehensive situation monitoring and commanding system. Cameras such as fixed bullet (box) cameras and PTZ dome cameras are selected and installed at the monitored site, and the monitoring center adopts a one-camera-one-window display mode. When a two-dimensional map or three-dimensional model of the monitored region is used for comprehensive display, icons are added at the camera installation positions, and the relevant camera windows are called up by manually clicking an icon or selecting its surroundings. Because the video images are only schematically matched with the three-dimensional model, in comprehensive situation monitoring and commanding applications for scenes such as cities, parks and stations it is difficult to efficiently locate and schedule massive videos and their related information, which restricts the technical upgrading and practical efficiency of the comprehensive situation monitoring and commanding system.
Disclosure of Invention
Aiming at the defects of the prior art, the invention discloses a video fusion and video linkage method based on a three-dimensional scene, and an electronic device. The method achieves accurate spatial matching and fused display of the scene camera video images, their thermodynamic diagrams (density heat maps) and hotspot targets with the scene three-dimensional model; intelligently schedules target cameras to acquire fine images of hotspot targets; and accurately matches and fuses other situation-awareness information, equipment and facility information, and task management and control information with the scene three-dimensional model. It thereby realizes full-dimensional situation fusion monitoring based on the three-dimensional scene, together with intelligent spatial positioning analysis and associated-information fusion display driven by hotspot patrol, planned tasks and detection alarms, and significantly improves the intelligence level and application efficiency of a comprehensive situation monitoring and commanding system.
The invention is realized by the following technical scheme:
in a first aspect, the invention discloses a video fusion and video linkage method based on a three-dimensional scene, which comprises the following steps:
S1, collecting data and producing a three-dimensional model of the monitored area, for the purpose of establishing spatial matching and fusion;
S2, planning and deploying scene cameras and target cameras, for the purpose of obtaining global situation images;
S3, enhancing, de-shaking and correcting the obtained scene video images;
S4, recognizing and analyzing the obtained scene video images to obtain people/vehicle dynamic density thermodynamic diagrams and hotspot targets in the monitored area;
S5, matching and fusing the scene video images, their thermodynamic diagrams and the hotspot targets with the three-dimensional model;
S6, matching and linking the target cameras with the three-dimensional model;
S7, monitoring and correcting scene camera position and attitude offset.
Furthermore, in S1, a three-dimensional model whose spatial accuracy, scale and texture meet the needs of comprehensive situation monitoring and commanding is produced from aerial photography, lidar scanning, on-site photography and architectural engineering design data.
Further, in S2, the principle of camera planning and deployment is as follows: a high-definition bullet (box) camera is generally selected as the scene camera and installed at a high vantage point, statically covering the monitoring area to obtain its overall situation; a PTZ dome camera is selected as the target camera, dynamically covering the monitoring area to track and acquire fine images of hotspot targets.
Further, in S3, the enhancing process includes:
S3a, histogram-equalize the original image:
s_k = T(r_k) = Σ_{j=0..k} p_r(r_j), k = 0, 1, …, L-1
S3b, histogram-equalize the target image:
v_k = G(z_k) = Σ_{i=0..k} p_z(z_i), k = 0, 1, …, L-1
S3c, determine the mapping relationship from r_k → z_m: let s_k = v_k, obtaining
z_m = G^{-1}(s_k) = G^{-1}(T(r_k));
the correspondence between r_k and z_m forms a mapping table, and the pixels of the original image are updated in turn according to the mapping table.
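As an illustration of S3a-S3c, the following is a minimal Python sketch of histogram specification (matching) for a single-channel 8-bit image; the function name, bin handling and use of NumPy are illustrative assumptions rather than part of the disclosed method.

import numpy as np

def match_histogram(src, ref):
    # S3a/S3b: cumulative distributions of the original (src) and target (ref) images
    src_hist, _ = np.histogram(src.ravel(), bins=256, range=(0, 256))
    ref_hist, _ = np.histogram(ref.ravel(), bins=256, range=(0, 256))
    s = np.cumsum(src_hist) / src.size      # s_k of the original image
    v = np.cumsum(ref_hist) / ref.size      # v_k of the target image
    # S3c: let s_k = v_k and build the r_k -> z_m mapping table
    mapping = np.clip(np.searchsorted(v, s), 0, 255).astype(np.uint8)
    # update the pixels of the original image according to the mapping table
    return mapping[src]

In practice the reference (target) histogram would be chosen so that all scene cameras converge to a common tonal appearance before three-dimensional fusion.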
the method for eliminating the jitters comprises the following steps:
S3d, determine the transformation from the previous frame to the current frame, expressed as three parameters d_x, d_y, d_α;
S3e, calculating the motion trail of the current frame through the transformation relation and the motion trail of the previous frame, wherein the solving relation is as follows:
x_cur = x_pre + d_x
y_cur = y_pre + d_y
α_cur = α_pre + d_α
s3f, smoothing the motion trail of the current frame, wherein the smoothing method comprises the following steps:
(equation image in the original: the smoothing formula yielding the smoothed trajectory (x, y, α))
s3g calculates a new frame-to-frame conversion relationship H', which is calculated by the following relationship:
d_x' = d_x + x - x_cur
d_y' = d_y + y - y_cur
d_α' = d_α + α - α_cur
the transformation matrix H' is obtained as:
H' = [ cos d_α'  -sin d_α'  d_x' ; sin d_α'  cos d_α'  d_y' ]
and S3h, transforming the video by using the transformation matrix to obtain the de-jittered video.
In S3d, the method for determining the transformation from the previous frame to the current frame and expressing it as the three parameters d_x, d_y, d_α comprises the following steps:
s3d1 extracting feature point sets of a previous frame and a current frame;
s3d2 calculates a transformation matrix H between two frames from the change of the feature point set between the two frames:
H = [ a_1  -a_2  b_1 ; a_2  a_1  b_2 ]
S3d3, extract the transformation parameters d_x, d_y, d_α between the two frames from the transformation matrix, wherein:
d_x = b_1
d_y = b_2
d_α = arctan(a_2 / a_1)
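A compact Python sketch of the de-shake pipeline of S3d-S3h (feature tracking, per-frame d_x, d_y, d_α, trajectory accumulation, smoothing and re-warping) is given below, using OpenCV; the feature-detector settings and the smoothing radius are assumptions made for illustration.

import cv2
import numpy as np

def stabilize(frames, radius=15):
    transforms = []                                    # per-frame (dx, dy, da), S3d
    prev_gray = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for frame in frames[1:]:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        p0 = cv2.goodFeaturesToTrack(prev_gray, 200, 0.01, 30)        # S3d1 feature points
        p1, st, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None)
        m, _ = cv2.estimateAffinePartial2D(p0[st == 1], p1[st == 1])  # S3d2 rigid transform
        dx, dy = m[0, 2], m[1, 2]                      # S3d3: d_x = b_1, d_y = b_2
        da = np.arctan2(m[1, 0], m[0, 0])              # d_α from the rotation part
        transforms.append((dx, dy, da))
        prev_gray = gray
    traj = np.cumsum(transforms, axis=0)               # S3e accumulated trajectory
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    smooth = np.stack([np.convolve(traj[:, i], kernel, mode='same')
                       for i in range(3)], axis=1)     # S3f sliding-window smoothing
    out = [frames[0]]
    for i, frame in enumerate(frames[1:]):
        dx, dy, da = np.array(transforms[i]) + smooth[i] - traj[i]    # S3g: d' = d + x - x_cur
        H = np.array([[np.cos(da), -np.sin(da), dx],
                      [np.sin(da),  np.cos(da), dy]])  # transformation matrix H'
        h, w = frame.shape[:2]
        out.append(cv2.warpAffine(frame, H, (w, h)))   # S3h warp with H'
    return out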
the correction treatment comprises the following steps:
s3i, detecting contour points in the image by using an edge detection algorithm;
s3j, detecting a distortion straight line in the image by a straight line detection method, and initially estimating and initializing a distortion model;
s3k iterative estimation of a parameter distortion model and a distortion center through the energy optimization process according to the distance between the correction point and the relevant straight line;
s3l performs distortion correction on the image using the calculated distortion model.
In S3j, detecting a distortion line in the image by a line detection method, and initially estimating and initializing the distortion model includes:
S3j1, since a contour that should be a straight line in the original image becomes a broken (discontinuous) curve after distortion, an additional dimension is introduced to describe the broken curve; whether broken curve segments belong to the same straight line is decided by voting with an approximation method, and the straight line is then corrected by fitting;
s3j2 estimating an initialization distortion model through the corrected straight line group, selecting N longest straight lines, calculating the reliability of the straight lines, selecting the most reliable straight line parameters as the initialization model, and adopting the formula as follows:
(equation images in the original: the reliability formula and the symbol denoting the reliability of the i-th line compared with the j-th line)
In S3k, iteratively estimating the parameter distortion model and the distortion center through the energy optimization process according to the distance between the correction point and the relevant straight line includes:
s3k1, adding a constraint parameter item to the initial distortion model, and further constraining the distortion model;
S3k2, optimize the constrained parameters by iterative estimation with an energy formula; the optimal parameters are obtained by looping over the pixels on each straight line and iterating the calculation. The energy formula is:
(equation image in the original) where nl denotes the number of straight lines, N(j) is the number of points on line j, and the remaining symbol denotes a point corrected using the distortion model d.
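The energy-minimization idea of S3i-S3k can be illustrated with the following greatly simplified Python sketch, which fits a single-parameter division distortion model with a fixed distortion center to a set of detected edge curves by minimizing their deviation from straightness. The model, the optimization bounds and the SciPy-based solver are assumptions; the patent iterates over a fuller parametric model and also estimates the distortion center.

import numpy as np
from scipy.optimize import minimize_scalar

def undistort_points(pts, lam, center):
    # one-parameter division model (assumed): p_u = c + (p_d - c) / (1 + lam * r^2)
    d = pts - center
    r2 = (d ** 2).sum(axis=1, keepdims=True)
    return center + d / (1.0 + lam * r2)

def line_energy(lam, lines, center):
    # energy: squared perpendicular scatter of corrected points about their best-fit line (cf. S3k)
    e = 0.0
    for pts in lines:                        # each 'pts' is an (N, 2) array sampled on one curve
        u = undistort_points(pts, lam, center)
        u = u - u.mean(axis=0)
        _, s, _ = np.linalg.svd(u, full_matrices=False)
        e += s[-1] ** 2                      # smallest singular value^2 = residual to the line
    return e

def estimate_distortion(lines, center):
    # iterate/optimize the distortion parameter that straightens the detected lines (S3k)
    res = minimize_scalar(line_energy, args=(lines, center),
                          bounds=(-1e-6, 1e-6), method='bounded')  # bounds are illustrative
    return res.x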
Furthermore, in S5, the spatial position of the camera in the three-dimensional model is determined through parameter analysis of the scene camera, the view cone of the projector corresponding to the camera is constructed, matching, positioning and fusion display of the scene video image and the thermodynamic diagram thereof and the hot spot target are performed, and the conflict and redundant parts in the video image are cut to improve the three-dimensional fusion visual effect.
The scene camera parameter analysis comprises the following steps:
s5a, extracting and matching feature points of the corrected image of the scene camera and the three-dimensional model of the corresponding area, selecting a plurality of feature point pairs and calculating pixel coordinates and three-dimensional coordinates of the feature point pairs;
s5b calculates camera parameters according to the correspondence between the pixel coordinates and the three-dimensional coordinates of the matching points, and the principle is as follows:
the corresponding relation between the pixel coordinate and the three-dimensional coordinate is as follows:
Z_c · [u, v, 1]^T = K · [R | T] · [X_w, Y_w, Z_w, 1]^T, where K = [ f_x 0 c_x ; 0 f_y c_y ; 0 0 1 ]
wherein R is a rotation matrix, T is a translation matrix, and K is the camera intrinsic (internal reference) matrix; substituting the corresponding points selected in step S5a into this formula and solving by least squares with repeated iteration yields the optimal solution, giving the camera's internal and external parameters and hence its position and attitude;
in S5c, the view angle of the camera is calculated from the camera reference obtained in S5b, and the formula is:
FOVx=2×tan(α)
FOVy=2×tan(β)
wherein α and β are determined from the camera intrinsic parameters obtained in S5b: (equation image in the original).
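For S5a-S5c, a minimal Python/OpenCV sketch is given below. It assumes the intrinsic matrix K is already known (the patent instead refines intrinsic and extrinsic parameters jointly by iterated least squares) and recovers the camera pose from the selected pixel/3D feature-point pairs with solvePnP, then derives view angles from the intrinsics; the principal-point assumption in the width/height estimate is illustrative.

import cv2
import numpy as np

def camera_pose_and_fov(obj_pts, img_pts, K, dist=None):
    """obj_pts: (N,3) 3D model coordinates of feature points; img_pts: (N,2) pixel coordinates (S5a)."""
    ok, rvec, tvec = cv2.solvePnP(np.asarray(obj_pts, np.float64),
                                  np.asarray(img_pts, np.float64),
                                  K, dist, flags=cv2.SOLVEPNP_ITERATIVE)
    R, _ = cv2.Rodrigues(rvec)                        # rotation matrix R, translation T (S5b)
    cam_pos = (-R.T @ tvec).ravel()                   # camera position in model space
    fx, fy = K[0, 0], K[1, 1]
    w, h = 2 * K[0, 2], 2 * K[1, 2]                   # assumes principal point at image center
    fov_x = 2 * np.degrees(np.arctan(w / (2 * fx)))   # horizontal view angle (S5c)
    fov_y = 2 * np.degrees(np.arctan(h / (2 * fy)))
    return cam_pos, R, fov_x, fov_y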
the matching and fusing of the scene video image and the three-dimensional model comprises the following steps:
s5d determining the position of the camera in the three-dimensional model space and constructing a visual cone of the projector corresponding to the camera according to the camera parameters calculated in S5b and S5 c;
s5e, carrying out frame processing on the real-time scene video stream, rendering the real-time scene video stream into dynamic video texture, and projecting and mapping the dynamic video texture to a three-dimensional model through a texture mapping technology;
s5f, fusing static textures and dynamic video textures on the surface of the three-dimensional model, and enabling the dynamic video textures to be covered as the uppermost layer of textures, so that the three-dimensional model is positioned and real-time scene video images are displayed;
s5g, according to the monitoring requirement, cutting the non-hot spot area and the part which affects the fusion visual effect in the scene video image, and cutting the overlapping area of the video images of the adjacent scene cameras, so that the multi-channel video images are seamlessly connected;
and S5h, obtaining a three-dimensional model static image under the view angle of the projector according to the parameters of the projector obtained in the step S5d by using a reverse imaging principle, and storing the three-dimensional model static image, the feature point pairs selected in the step S5a, the coordinates of the feature point pairs and the internal and external parameters calculated in the step S5b together for monitoring and correcting the state of the scene camera.
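The projector-style mapping of S5d-S5f can be sketched per point as follows; in practice the mapping runs per fragment on the GPU via projective texture mapping, so the Python helper below only illustrates the geometry, and its names and return convention are assumptions.

import numpy as np

def video_texture_uv(world_pt, K, R, t, width, height):
    """Project a 3D model point into the scene camera/projector and return
    texture coordinates in [0,1]; points outside the view cone return None (S5d-S5f)."""
    p_cam = R @ np.asarray(world_pt, float) + t.ravel()
    if p_cam[2] <= 0:                        # behind the projector
        return None
    uvw = K @ p_cam
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
    if not (0 <= u < width and 0 <= v < height):
        return None                          # outside the frustum: keep the static texture
    return u / width, v / height             # sample the dynamic video texture here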
The matching and fusing of the density thermodynamic diagram and the three-dimensional model comprises the following steps:
s5i, rendering the real-time density thermodynamic diagrams analyzed by the video of each scene camera into dynamic thermodynamic textures, and realizing three-dimensional model positioning and real-time density thermodynamic diagram display according to S5e and S5 f;
s5j, cutting non-hot-spot areas and parts influencing the fusion visual effect in the density thermodynamic diagram according to the monitoring requirement; carrying out color processing on the overlapping areas of the adjacent thermodynamic diagrams by adopting a weighted average algorithm so as to optimize the fusion visual effect;
and S5k, the people/vehicle density thermodynamic diagrams can further be converted into three-dimensional people and vehicle models with the corresponding density, direction and speed for fused display.
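The weighted-average treatment of overlapping thermodynamic-diagram regions mentioned in S5j can be illustrated by the following Python sketch; the NaN-based coverage masking is an assumption about how per-camera heatmaps aligned in model space might be represented.

import numpy as np

def blend_heatmaps(heatmaps, weights=None):
    """Weighted-average fusion of overlapping density heatmaps (cf. S5j).
    heatmaps: list of (H, W) float arrays aligned in model space, NaN where a camera has no coverage."""
    stack = np.stack(heatmaps)                       # (N, H, W)
    w = np.ones(len(heatmaps)) if weights is None else np.asarray(weights, float)
    mask = ~np.isnan(stack)                          # valid coverage per camera
    stack = np.nan_to_num(stack)
    num = (stack * w[:, None, None] * mask).sum(axis=0)
    den = (w[:, None, None] * mask).sum(axis=0)
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)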
The matching and fusing of the hotspot target with the three-dimensional model comprises the following steps:
s5l extracting the pixel coordinates of the hot spot target central point in real time;
s5m, solving the three-dimensional coordinates of the hotspot target according to the camera parameters obtained in S5 b;
s5n, positioning the target label in real time on the three-dimensional model according to the three-dimensional coordinates of the hot target.
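A minimal Python sketch of S5l-S5n follows: the hotspot target's pixel center is back-projected through the camera parameters of S5b and intersected with a ground plane to obtain its three-dimensional coordinates. Intersecting with the full three-dimensional model instead of a plane is the more faithful (and more involved) variant; the plane and the helper name are simplifying assumptions.

import numpy as np

def pixel_to_ground(u, v, K, R, t, ground_z=0.0):
    """Back-project pixel (u, v) and intersect the ray with the plane z = ground_z (cf. S5m)."""
    cam_pos = -R.T @ t.ravel()                          # camera center in model coordinates
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray in camera coordinates
    ray_world = R.T @ ray_cam
    if abs(ray_world[2]) < 1e-9:
        return None                                     # ray parallel to the ground plane
    s = (ground_z - cam_pos[2]) / ray_world[2]
    return None if s <= 0 else cam_pos + s * ray_world  # 3D coordinates of the hotspot target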
Further, in S6, based on the three-dimensional scene, the visual field of each target camera is calculated and the relay-tracking interfaces of adjacent overlapping fields are determined; according to the position coordinates of a hotspot target, a designated target camera is scheduled and its pan-tilt-zoom head (cloud mirror) is controlled by calculation, so as to obtain a fine image of the hotspot target, and a dynamic target can be tracked continuously in relay.
The step of calculating the visual field of the target camera comprises the following steps:
s6a fixing the position and posture of the target camera, and solving the position of each target camera in the three-dimensional model according to S5 a-S5 d;
s6b calculating the visible range of each target camera within 360 degrees horizontally according to the camera position obtained in the step S6a and the given visible radius and the vertical visual angle range;
and S6c, optimizing the relay tracking interfaces of adjacent overlapped visible areas according to the monitoring requirements, and determining and storing the monitoring area of each target camera.
The calculation-based control of the target camera's pan-tilt-zoom (cloud mirror) comprises the following steps:
S6d, given the position point of a hotspot target within a target camera's monitoring area, first calculate the pan-tilt attitude control: the horizontal and vertical rotation angles are calculated from the camera position and the target position;
The horizontal rotation direction and angle of the camera are determined by trigonometric calculation, with the formula:
(equation image in the original: the horizontal rotation angle obtained from d1 and d2 by a trigonometric relation)
wherein d1 is calculated from the latitude of two location points and d2 is calculated from the longitude of two location points;
The vertical rotation direction and angle of the camera are determined by trigonometric calculation, with the formula:
(equation image in the original: the vertical rotation angle obtained from the height difference d3 and the horizontal distance d4 by a trigonometric relation)
wherein the height difference d3 is calculated according to the elevations of the two position points, and the horizontal distance d4 is calculated according to the longitude and latitude of the two position points;
S6e, after the pan-tilt attitude control of S6d has been calculated, calculate the zoom factor for controlling the lens focal length, using one of two strategies: fixed spatial scale or fixed target ratio;
wherein the lens focal-length zoom factor in the fixed-spatial-scale mode is calculated as follows:
let the width of the monitored space be W, the camera installation height be H, the current field angle be θ, and the pitch angle after the pan-tilt's vertical rotation be θ′;
the field angle ω after rotation is:
(equation image in the original)
the focal-length zoom factor after the pan-tilt rotates is calculated as:
(equation image in the original)
Z_new is computed in real time from the pitch angle and used to control the camera focal length, so that the set monitoring spatial scale is maintained;
and the lens focal-length zoom factor in the fixed-target-ratio mode is calculated as follows:
in the 16:9 frame mode, let the video image height in pixels be H_img and its width in pixels be W_img; let the set frame ratio of the target be P, the recognized target's height in pixels be H and width in pixels be W, and the current focal-length zoom factor be Z;
when 16H ≥ 9W, the focal-length zoom factor after the pan-tilt rotates is calculated as:
(equation image in the original)
when 16H < 9W, the focal-length zoom factor after the pan-tilt rotates is calculated as:
(equation image in the original)
Z_new is computed in real time from the pitch angle and used to control the camera focal length, maintaining the set target frame ratio.
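An illustrative Python sketch of the pan-tilt and zoom calculation of S6d-S6e is given below. The pan/tilt part follows the trigonometric relations described around figures 10 and 11, with d1-d4 resolved into local metric offsets; the zoom expression for the fixed-target-ratio strategy is an assumed form, since the exact formulas appear only as images in the original.

import math

def ptz_command(cam_xyz, tgt_xyz):
    """Pan/tilt angles (degrees) from camera to target; inputs are local metric
    coordinates (east, north, up), i.e. d1/d2/d3/d4 of S6d already resolved."""
    d2 = tgt_xyz[0] - cam_xyz[0]          # east offset  (from longitudes)
    d1 = tgt_xyz[1] - cam_xyz[1]          # north offset (from latitudes)
    d3 = tgt_xyz[2] - cam_xyz[2]          # height difference (from elevations)
    d4 = math.hypot(d1, d2)               # horizontal distance
    pan = math.degrees(math.atan2(d2, d1))        # horizontal rotation angle
    tilt = math.degrees(math.atan2(d3, d4))       # vertical rotation angle
    return pan, tilt

def zoom_for_target_ratio(z, p, h_img, w_img, h, w):
    """Fixed-target-ratio zoom (cf. S6e); assumed form: scale the current factor z so the
    recognized target (h x w pixels) fills fraction p of the constraining image dimension."""
    if 16 * h >= 9 * w:                   # height is the constraining dimension
        return z * p * h_img / h
    return z * p * w_img / w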
Further, in S7, camera parameter changes are monitored manually or periodically; the projection parameters are automatically corrected and updated, or an alarm prompts on-site adjustment and maintenance, so that the three-dimensional fusion accuracy of the scene video images, their thermodynamic diagrams and the hotspot targets, and the established monitoring coverage, are maintained. The method comprises the steps of:
S7a, manually at any time or automatically at regular intervals, acquiring one frame from the real-time scene camera video stream and performing feature point extraction and matching against the static image of the three-dimensional model under the camera projector's view angle stored in S5h;
s7b, matching the feature point pairs extracted and matched in the step S7a with the feature point pairs stored in the step S5h, selecting a plurality of feature point pairs and analyzing pixel coordinates of the feature point pairs;
s7c, calculating the monitoring parameters of the camera by the method of the step S5b according to the characteristic points selected in the step S7 b;
s7d, comparing the camera monitoring parameters obtained in the step S7c with the storage parameters obtained in the step S5b, if the change is smaller than a set threshold value, updating the three-dimensional model positioning and displaying of the scene video images through the steps S5c to S5h according to the monitoring parameters, and updating and storing the selected feature point pairs and the coordinates thereof, the camera monitoring parameters and the three-dimensional model static images under the view angle of the projector; if the threshold value is larger than the set threshold value, an alarm is given to prompt field adjustment and maintenance.
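For S7a-S7d, the following Python/OpenCV sketch outlines the drift check: features of a live frame are matched against the stored three-dimensional-model still image, the pose is re-solved from the 3D model coordinates of the matched reference features, and the change is compared with thresholds. The ORB/BFMatcher choice, the thresholds and the model_xyz_at lookup helper are illustrative assumptions, not prescribed by the method.

import cv2
import numpy as np

def check_camera_drift(live_frame, ref_image, model_xyz_at, K, R_ref, t_ref,
                       pos_tol=0.5, ang_tol=2.0):
    """model_xyz_at(u, v) returns the 3D model coordinate seen at reference-image pixel (u, v),
    e.g. from a depth map stored with the S5h still image (an assumed helper).
    Returns (ok, R_new, t_new); ok is False when the pose change exceeds the thresholds (S7d)."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(ref_image, None)      # S7a: features of the stored still image
    k2, d2 = orb.detectAndCompute(live_frame, None)     #      and of the live frame
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    matches = sorted(matches, key=lambda m: m.distance)[:50]          # S7b: best point pairs
    obj = np.float64([model_xyz_at(*k1[m.queryIdx].pt) for m in matches])
    img = np.float64([k2[m.trainIdx].pt for m in matches])
    ok, rvec, tvec = cv2.solvePnP(obj, img, K, None)     # S7c: recompute the monitoring parameters
    R_new, _ = cv2.Rodrigues(rvec)
    d_pos = np.linalg.norm(tvec.ravel() - t_ref.ravel())               # S7d: compare with the
    d_ang = np.degrees(np.arccos(np.clip((np.trace(R_ref.T @ R_new) - 1) / 2, -1, 1)))  # stored parameters
    return (d_pos < pos_tol and d_ang < ang_tol), R_new, tvec  # False -> alarm for on-site maintenance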
Furthermore, the method is based on image depth processing of the regional monitoring video and accurate matching and fusion of the regional monitoring video and the three-dimensional model, and other situation perception information, equipment facility information and task management and control information are displayed in an accurate space matching and fusion mode.
In a second aspect, the present invention discloses an electronic device, which includes a processor and a memory storing execution instructions, wherein when the processor executes the execution instructions stored in the memory, the processor executes the method for video fusion and video linkage based on three-dimensional scene according to the first aspect.
The invention has the beneficial effects that:
according to the invention, based on the image depth processing of the regional monitoring video and the accurate matching and fusion with the three-dimensional model, the accurate spatial matching and fusion display of other situation perception information, equipment facility information and task management and control information are further realized, so that the full-dimensional situation fusion monitoring based on the three-dimensional scene, the intelligent spatial positioning analysis and the associated information fusion display based on the hot spot inspection, the planned task and the detection alarm are realized, and the intelligent level and the application efficiency of the comprehensive situation monitoring and commanding system are remarkably improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic block diagram of a method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a camera deployment according to an embodiment of the invention;
FIG. 3 is a flow chart of video rectification according to an embodiment of the present invention;
FIG. 4 is a three-dimensional matching fusion effect diagram of a scene video according to an embodiment of the present invention;
FIG. 5 is a three-dimensional matching fusion effect diagram of scene video according to an embodiment of the present invention;
FIG. 6 is a three-dimensional matching fusion effect diagram of a people flow density thermodynamic diagram according to an embodiment of the invention;
FIG. 7 is a three-dimensional matching fusion effect diagram of a traffic density thermodynamic diagram according to an embodiment of the invention;
FIG. 8 is a three-dimensional matching fusion effect diagram of a human current density thermodynamic diagram target model according to an embodiment of the invention;
FIG. 9 is a diagram illustrating the fusion effect of three-dimensional matching of objects according to an embodiment of the present invention;
fig. 10 is a schematic view of calculation of the horizontal rotation angle of the target camera according to the embodiment of the present invention;
FIG. 11 is a schematic diagram of a target camera vertical pitch angle calculation according to an embodiment of the present invention;
FIG. 12 is a diagram illustrating a relay tracking of a target camera for a dynamic target according to an embodiment of the present invention;
FIG. 13 is an architectural diagram of a city/park full-dimensional situation fusion monitoring and commanding system according to an embodiment of the present invention;
FIG. 14 is a schematic diagram of an application interface of a city/park full-dimensional situation fusion monitoring and commanding system architecture according to an embodiment of the present invention;
FIG. 15 is a diagram of a station/venue full-dimensional situation fusion monitoring and commanding system architecture according to an embodiment of the present invention;
FIG. 16 is a schematic diagram of an application interface of a station/venue full-dimensional situation fusion monitoring and commanding system architecture according to an embodiment of the present invention;
FIG. 17 is a schematic illustration of a shortcut key for a three-dimensional model according to an embodiment of the present invention;
FIG. 18 is a navigation chart illustrating a three-dimensional model of a hot spot area according to an embodiment of the present invention;
FIG. 19 is a schematic view of a three-dimensional fusion window of a hot spot region according to an embodiment of the present invention;
FIG. 20 is a schematic view of a hotspot patrol pattern in accordance with an embodiment of the present invention;
FIG. 21 is a schematic view of a manual control mode according to an embodiment of the present invention;
FIG. 22 is a schematic diagram of the positioning and linkage of tasks according to an embodiment of the invention;
FIG. 23 is a schematic diagram of alarm positioning and linkage according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
This embodiment discloses a video fusion and video linkage method based on a three-dimensional model, comprising three-dimensional modeling of the monitored area, planning and deployment of scene and target cameras, enhancement, de-shaking and correction of the scene video, recognition and analysis of the scene video, three-dimensional matching and fusion of the scene video, its thermodynamic diagram and hotspot targets, three-dimensional matching linkage of the target cameras, and monitoring and correction of scene-camera position and attitude offset. The system architecture is shown in figure 1; the method specifically comprises the following steps:
step 1, collecting and manufacturing a three-dimensional model of a monitoring area;
step 2, planning and deploying the scene camera and the target camera;
step 3, enhancing, de-shaking and correcting the scene video images;
step 4, identifying and analyzing scene videos;
step 5, matching and fusing the scene video image, the thermodynamic diagram of the scene video image, the hot spot target and the three-dimensional model;
step 6, matching and linking the target camera and the three-dimensional model;
and 7, monitoring and correcting the position and posture offset of the scene camera.
The monitoring area three-dimensional model in the step 1 is collected and manufactured, a three-dimensional model with space precision, dimension and texture meeting the requirement of comprehensive situation monitoring and commanding can be processed and manufactured through aerial photography, laser radar scanning, field photographing, building engineering design data and the like, and space matching and fusion display of monitoring videos and other situation information are supported.
The principle of the camera planning and deployment in step 2 is as follows: a high-definition bullet (box) camera is generally selected as the scene camera and installed at a high vantage point, statically covering the monitoring area to obtain its overall situation; a PTZ dome camera is selected as the target camera, dynamically covering the monitoring area to track and acquire fine images of hotspot targets. The specific deployment scenario is shown in fig. 2.
Step 3, enhancing, de-shaking and correcting the scene video images, so that the three-dimensional fused display of the multiple scene videos has uniform, pleasing color and accurate, stable imagery.
The enhancing treatment comprises the following steps:
step 3 a: histogram equalization is carried out on the original image:
s_k = T(r_k) = Σ_{j=0..k} p_r(r_j), k = 0, 1, …, L-1
and step 3 b: histogram equalization is carried out on the target image:
v_k = G(z_k) = Σ_{i=0..k} p_z(z_i), k = 0, 1, …, L-1
and step 3c: determine the mapping relationship from r_k → z_m: let s_k = v_k, obtaining:
z_m = G^{-1}(s_k) = G^{-1}(T(r_k))
The correspondence between r_k and z_m forms a mapping table, and the pixels of the original image are updated in turn according to the mapping table.
The method for eliminating the jitters comprises the following steps:
and step 3d: determine the transformation from the previous frame to the current frame, expressed as three parameters d_x, d_y, d_α;
Step 3 e: calculating the motion trail of the current frame through the transformation relation and the motion trail of the previous frame, wherein the solving relation is as follows:
x_cur = x_pre + d_x
y_cur = y_pre + d_y
α_cur = α_pre + d_α
and step 3 f: smoothing the motion trail of the current frame, wherein the smoothing method comprises the following steps:
(equation image in the original: the smoothing formula yielding the smoothed trajectory (x, y, α))
step 3 g: calculating a new frame-to-frame conversion relationship H':
by calculating the relationship:
d_x' = d_x + x - x_cur
d_y' = d_y + y - y_cur
d_α' = d_α + α - α_cur
the transformation matrix H' is obtained as:
H' = [ cos d_α'  -sin d_α'  d_x' ; sin d_α'  cos d_α'  d_y' ]
step 3 h: and transforming the video by using the transformation matrix to obtain the video after the jittering is eliminated.
The method in step 3d for determining the transformation from the previous frame to the current frame and expressing it as the three parameters d_x, d_y and d_α comprises the following steps:
step 3d 1: extracting feature point sets of a previous frame and a current frame;
step 3d 2: calculating a transformation matrix H between two frames through the change of the feature point set between the two frames:
H = [ a_1  -a_2  b_1 ; a_2  a_1  b_2 ]
step 3d3: extract the transformation parameters d_x, d_y, d_α between the two frames from the transformation matrix, wherein:
dx=b1
dy=b2
d_α = arctan(a_2 / a_1)
the flow chart of the correction process is shown in fig. 3, and the steps are as follows:
step 3 i: detecting contour points in the image using an edge detection algorithm;
step 3 j: detecting a distortion straight line in the image by a straight line detection method, and initially estimating and initializing a distortion model;
step 3k, iteratively estimating a parameter distortion model and a distortion center through an energy optimization process according to the distance between the correction point and the relevant straight line;
step 3 l: and performing distortion correction on the image by using the calculated distortion model.
The step 3j of detecting a distorted straight line in the image by a straight line detection method, and the step of preliminarily estimating and initializing the distorted model comprises the following steps:
step 3j1, because in the original image, the contour which should be a straight line is changed into a discontinuous curve after being distorted, a new dimension is needed to be added to describe the discontinuous curve, whether the discontinuous curve belongs to the same straight line or not is determined by voting by an approximate approximation method, and the straight line is subjected to fitting correction;
step 3j2, estimating an initialization distortion model through the corrected straight line group, selecting N longest straight lines, calculating the reliability of the straight lines, selecting the most reliable straight line parameters as the initialization model, and adopting the formula as follows:
(equation images in the original: the reliability formula and the symbol denoting the reliability of the i-th line compared with the j-th line)
The step of iteratively estimating the parameter distortion model and the distortion center through the distance between the correction point and the relevant straight line in the step 3k through an energy optimization process comprises the following steps:
step 3k 1: adding a constraint parameter item to the initial distortion model, and further constraining the distortion model;
step 3k 2: and optimizing the optimal parameters after iterative estimation constraint through an energy formula, and obtaining the optimal parameters through loop traversal of the pixels on the straight line and continuous iterative calculation. The energy formula is as follows:
(equation image in the original) where nl denotes the number of straight lines, N(j) is the number of points on line j, and the remaining symbol denotes a point corrected using the distortion model d.
And 4, scene video identification and analysis, which is mainly used for acquiring dynamic density thermodynamic diagrams such as people and vehicles in a monitoring area, hot spot targets and the like required by situation monitoring so as to supplement and optimize situation information and visual effects displayed by three-dimensional fusion of video images of a specific scene.
And 5, matching and fusing the scene video image, the thermodynamic diagram of the scene video image, the hot spot target and the three-dimensional model, determining the spatial position of the camera in the three-dimensional model and constructing a view cone of the projector corresponding to the camera through parameter analysis of the scene camera, performing matching positioning and fusion display on the scene video image, the thermodynamic diagram of the scene video image and the hot spot target, and cutting conflict and redundant parts in the video image to improve the three-dimensional fusion visual effect.
The method for analyzing the scene camera parameters comprises the following steps:
step 5 a: extracting and matching feature points of a corrected image of a scene camera with a three-dimensional model of a corresponding area, selecting a plurality of feature point pairs and calculating pixel coordinates and three-dimensional coordinates of the feature point pairs;
and step 5 b: calculating camera parameters according to the corresponding relation between the pixel coordinates and the three-dimensional coordinates of the matching points, wherein the principle is as follows:
the corresponding relation between the pixel coordinate and the three-dimensional coordinate is as follows:
Z_c · [u, v, 1]^T = K · [R | T] · [X_w, Y_w, Z_w, 1]^T, where K = [ f_x 0 c_x ; 0 f_y c_y ; 0 0 1 ]
wherein R is a rotation matrix, T is a translation matrix, and K is the camera intrinsic (internal reference) matrix. Substituting the corresponding points selected in step 5a into the formula and solving by least squares with repeated iteration yields the optimal solution, giving the camera's internal and external parameters and hence its position and attitude;
and step 5 c: calculating the view angle of the camera by the internal reference of the camera obtained in the step 5b, wherein the formula is as follows:
FOVx=2×tan(α)
FOVy=2×tan(β)
wherein α and β are determined from the camera intrinsic parameters obtained in step 5b: (equation image in the original).
the matching and fusing of the scene video image and the three-dimensional model comprises the following steps:
and step 5 d: determining the position of the camera in the three-dimensional model space and constructing a visual cone of the projector corresponding to the camera according to the camera parameters calculated in the steps 5b and 5 c;
step 5e, performing frame processing on the real-time scene video stream, rendering the real-time scene video stream into dynamic video textures, and projecting and mapping the dynamic video textures to the three-dimensional model through a texture mapping technology;
step 5f, fusing the static texture and the dynamic video texture on the surface of the three-dimensional model, and enabling the dynamic video texture to cover the uppermost texture, so as to realize the positioning of the three-dimensional model and the display of a real-time scene video image;
step 5 g: according to the monitoring requirements, cutting non-hot spot areas and parts influencing the fusion visual effect in the scene video images, such as trees, buildings and the like around dynamic monitoring areas such as roads, squares and the like; and cutting the overlapped area of the video images of the cameras in the adjacent scenes so as to enable the multi-channel video images to be connected seamlessly. The specific effect is shown in fig. 4 and 5.
Step 5 h: and (4) obtaining a three-dimensional model static image under the visual angle of the projector according to the parameters of the projector obtained in the step (5 d) by a reverse imaging principle, storing the static image, the feature point pairs and the coordinates thereof selected in the step (5 a) and the internal and external parameters calculated in the step (5 b) together, and monitoring and correcting the state of the scene camera.
The matching and fusing of the density thermodynamic diagram and the three-dimensional model comprises the following steps:
step 5 i: rendering the real-time density thermodynamic diagrams analyzed by the video of each scene camera into dynamic thermodynamic textures, and realizing three-dimensional model positioning and real-time density thermodynamic diagram display according to the steps 5e and 5 f;
step 5 j: according to the monitoring requirement, cutting non-hot spot areas and parts influencing the fusion visual effect in the density thermodynamic diagram; and carrying out color processing on the overlapped areas of the adjacent thermodynamic diagrams by adopting a weighted average algorithm so as to optimize the fusion visual effect. The specific effects are shown in fig. 6 and 7.
Step 5 k: the man-vehicle equal-density thermodynamic diagram can be converted into a man-vehicle three-dimensional model with corresponding density, direction and speed for fusion display. The specific effect is shown in figure 8.
The hot spot target and the three-dimensional model are matched and fused in the following steps:
step 5 l: extracting the pixel coordinates of the central point of the hot spot target in real time;
step 5 m: resolving the three-dimensional coordinates of the hot spot target according to the camera parameters obtained in the step 5 b;
and 5 n: and positioning the target label in real time on the three-dimensional model according to the three-dimensional coordinates of the hot target. The specific effect is shown in figure 9.
Step 6, matching and linking the target cameras with the three-dimensional model: through spatial matching of the target cameras with the three-dimensional model, the visual field of each target camera is calculated and the relay-tracking interfaces of adjacent overlapping fields are determined; according to the position coordinates of a hotspot target, the designated target camera is scheduled and its pan-tilt-zoom (cloud mirror) is controlled by calculation. This enables automatic scheduling and acquisition of fine images for manually designated positions, patrolled hotspot areas and task/alarm-related areas, as well as automatic scheduling and tracking of designated or recognized alarm-related people, vehicles and other dynamic targets to acquire their fine images.
The step of calculating the visual field of the target camera comprises the following steps:
step 6 a: fixing the target cameras, and solving the position of each target camera in the three-dimensional model according to the steps 5a to 5 d;
step 6 b: calculating the visible range of each target camera within 360 degrees horizontally according to the camera position obtained in the step 6a and the given visible radius and the vertical visual angle range;
step 6 c: and optimizing the relay tracking interfaces of adjacent overlapped visible areas according to the monitoring requirement, and determining and storing the monitoring area of each target camera.
The step of resolving control of the target camera cloud mirror comprises the following steps:
and 6 d: when the position point of a hot spot target in a given target camera monitoring area is determined, firstly, the attitude of a control holder is calculated, and the rotation angles in the horizontal direction and the vertical direction are calculated according to the position of the camera and the target position;
as shown in fig. 10, the trigonometric function is used to determine the horizontal rotation direction and angle of the camera, and the formula is as follows:
(equation image in the original: the horizontal rotation angle obtained from d1 and d2 by a trigonometric relation)
where d1 is calculated from the latitude of two location points and d2 is calculated from the longitude of two location points.
As shown in fig. 11, the camera vertical rotation direction and angle are determined by trigonometric function calculation, and the formula is as follows:
(equation image in the original: the vertical rotation angle obtained from the height difference d3 and the horizontal distance d4 by a trigonometric relation)
wherein the height difference d3 is calculated according to the elevations of the two position points, and the horizontal distance d4 is calculated according to the longitude and latitude of the two position points.
Step 6 e: resolving the controlled holder attitude according to the step 6d, resolving the zoom factor of the focal length of the control lens, and calculating according to two strategies of determining the space scale and the target ratio;
Calculating the lens focal-length zoom factor in the fixed-spatial-scale mode:
Let the width of the monitored space be W, the camera installation height be H, the current field angle be θ, and the pitch angle after the pan-tilt's vertical rotation be θ′;
The rotated field angle ω is:
(equation image in the original)
the formula for calculating the focal length scaling factor after the pan-tilt rotates is as follows:
(equation image in the original)
Z_new is computed in real time from the pitch angle and used to control the camera focal length, so that the set monitoring spatial scale is maintained.
Calculating the lens focal-length zoom factor in the fixed-target-ratio mode:
In the 16:9 frame mode, let the video image height in pixels be H_img and its width in pixels be W_img; let the set frame ratio of the target be P, the recognized target's height in pixels be H and width in pixels be W, and the current focal-length zoom factor be Z;
when 16H ≥ 9W, the focal-length zoom factor after the pan-tilt rotates is calculated as:
(equation image in the original)
when 16H < 9W, the focal-length zoom factor after the pan-tilt rotates is calculated as:
(equation image in the original)
Z_new is computed in real time from the pitch angle and used to control the camera focal length, maintaining the set target frame ratio.
Target camera linkage for manually designated positions, patrolled hotspot areas and task/alarm-related areas:
step 6 f: analyzing and acquiring three-dimensional coordinates of a manually-specified position, a patrol hotspot area and a task/alarm related area;
step 6 g: and 6 d-6 e, calculating the cloud mirror attitude of the control target camera to obtain a fine image of the area.
Target camera linkage for designated or intelligently recognized dynamic targets such as people and vehicles:
step 6 h: continuously analyzing and obtaining three-dimensional coordinates of appointed or intelligently identified dynamic targets such as people, vehicles and the like;
step 6 i: continuously calculating and controlling the cloud mirror attitude of the appointed initial target camera through the steps 6 d-6 e, and continuously acquiring a fine image of the target;
step 6 j: and when the dynamic target crosses the relay tracking interface of the adjacent target camera, selecting the relay target camera according to the three-dimensional coordinates of the dynamic target obtained in the step 6h, and continuously calculating and controlling the cloud mirror posture of the relay target camera through the steps 6d to 6e to continuously obtain the fine image of the target. The process is schematically shown in FIG. 12.
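Steps 6h-6j can be illustrated with the following Python sketch of relay hand-over: the target camera whose stored monitoring area contains the target's current position is selected, and its pan/tilt command is recomputed continuously as the target moves; the Shapely-based region test, the data layout and the local metric coordinates are illustrative assumptions.

import math
from shapely.geometry import Point   # region-membership test; an illustrative library choice

class RelayTracker:
    """Steers whichever target camera owns the region containing the dynamic target (steps 6h-6j)."""
    def __init__(self, cameras):
        # cameras: list of dicts {'id', 'position': (x, y, z), 'region': shapely Polygon of its monitoring area}
        self.cameras = cameras
        self.active = None

    def update(self, target_xyz):
        tx, ty, tz = target_xyz                             # step 6h: target 3D coordinates
        owner = next((c for c in self.cameras
                      if c['region'].contains(Point(tx, ty))), None)
        if owner is None:
            return None                                     # outside every monitoring area
        if self.active is None or owner['id'] != self.active['id']:
            self.active = owner                             # step 6j: relay handover at the interface
        cx, cy, cz = self.active['position']                # step 6i: pan/tilt toward the target
        pan = math.degrees(math.atan2(tx - cx, ty - cy))
        tilt = math.degrees(math.atan2(tz - cz, math.hypot(tx - cx, ty - cy)))
        return self.active['id'], pan, tilt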
Step 7, monitoring and correcting scene-camera position and attitude offset: camera parameter changes are monitored manually or periodically; the projection parameters are automatically corrected and updated, or an alarm prompts on-site adjustment and maintenance, so that the three-dimensional fusion accuracy of the scene video images and the established monitoring coverage are maintained. The steps are as follows:
step 7 a: automatically acquiring a frame of image of the video stream of the camera in the real-time scene at any time manually or periodically, and performing feature point extraction matching with the static image of the three-dimensional model under the visual angle of the projector of the camera stored in the step 5 h;
and 7 b: matching the feature point pairs extracted and matched in the step 7a with the plurality of feature point pairs stored in the step 5h, selecting the plurality of feature point pairs and analyzing pixel coordinates of the feature point pairs;
and 7 c: calculating the monitoring parameters of the camera by the method in the step 5b according to the characteristic point pairs selected in the step 7 b;
and 7 d: comparing the camera monitoring parameters obtained in the step 7c with the storage parameters obtained in the step 5b, if the change is smaller than a set threshold value, updating the three-dimensional model positioning and displaying of the scene video image through the steps 5 c-5 h according to the monitoring parameters, and updating and storing the selected feature point pairs and the coordinates thereof, the camera monitoring parameters and the three-dimensional model static image under the projector view angle; if the threshold value is larger than the set threshold value, an alarm is given to prompt field adjustment and maintenance.
Example 2
In this embodiment, the system is applied to a full-dimensional situation fusion monitoring and commanding scheme for open management environments such as cities and parks. High-definition and array scene cameras are deployed on high-rise buildings in key areas such as squares and main roads, matched with regional scene cameras and target cameras for streets and communities; information such as license-plate-recognition and face-recognition monitoring facilities and the alarm-receiving center is fused. The system architecture is shown in fig. 13 and the application interface in fig. 14.
The full-dimensional situation fusion monitoring and commanding application interface comprises comprehensive business chart split screens, three-dimensional operation situation split screens and three-dimensional hotspot video split screens.
Example 3
In this embodiment, the system is applied to a full-dimensional situation fusion monitoring and commanding scheme for closed management environments such as stations and venues. High-altitude high-definition and array scene cameras are deployed on high-rise structures in key regions such as squares, matched with the scene cameras and target cameras of the building's internal regions; information such as the equipment, facilities and operation management of barrier gates, fire fighting, lighting and the like is fused. The system architecture is shown in fig. 15 and the application interface in fig. 16.
The full-dimensional situation fusion monitoring and commanding application interface comprises comprehensive business chart split screens, three-dimensional operation situation split screens and three-dimensional hotspot video split screens.
And a shortcut key is configured on the three-dimensional model in the three-dimensional operation situation split screen, and the display of the universe/region and the display of the designated view angle are supported, as shown in fig. 17.
When the hot spot region is shown in an enlarged manner, a navigation map is automatically added to the lower right corner of the interface, the position and the view angle of the hot spot region in the whole domain are displayed, and the overall spatial orientation is kept, as shown in fig. 18.
And on the basis of the global three-dimensional model, integrating and displaying the whole operation situation information of the station, wherein the information comprises equipment facility state information, routine task state information, alarm handling state information, vehicle in-out state information, personnel in-out state information, staff on duty state information and the like.
The three-dimensional fusion window of the hot spot area takes a three-dimensional model as a background, and a scene video image occupies 2/3 frames so as to enhance the spatial scene and the orientation of the scene video image; the lower right corner automatically adds a navigation map to enhance the spatial orientation and perspective of the hotspot area, as in fig. 19.
And configuring a hot spot area of the target camera, wherein the hot spot video is in a picture-in-picture mode, and the three-dimensional fusion scene video and the fine video of the target camera are switched.
The hot video split screen is configured into a directory window and a main display window.
The control mode and the priority thereof can be configured, and the general manual control is more than alarm linkage more than task linkage more than hot spot patrol.
1. Hotspot patrol mode
The ordering, switching and dwell time of the hotspot videos displayed in turn in the directory window and the main display window can be configured, supporting patrol monitoring of all hotspot areas, as shown in fig. 20.
2. Manual control mode
Clicking any hotspot video in the directory window switches that video into the main display window.
Clicking any point of the global three-dimensional model switches the hotspot video of that region into the main display window, and the target camera automatically follows the clicked point, as shown in fig. 21.
The target camera can also be set to an automatic tour mode, sweeping 360 degrees automatically; it can pause and zoom at any position to capture a fine video of the target.
3. Task positioning and linkage
After a routine task starts, the three-dimensional situation split screen displays the task tag (yellow) of the relevant area and the activation state of the associated equipment and facilities, the hotspot video split screen automatically displays the hotspot video of the relevant area, and the hotspot video is overlaid with task information (yellow), as shown in fig. 22.
When multiple tasks run in parallel, the hotspot video split screen automatically cycles through the hotspot videos of the areas relevant to each task; when a particular task is clicked in the comprehensive chart split screen, the three-dimensional situation split screen flashes the task tag of that task's relevant area and the activation state of the associated equipment and facilities, the hotspot video split screen automatically displays the hotspot video of that area, and flashing task information is overlaid on the hotspot video.
4. Alarm positioning and linkage
After an alarm is raised by equipment/facility detection or video detection, the three-dimensional situation split screen flashes and positions the information tag (red) of the alarm, the hotspot video split screen automatically displays the hotspot video of the alarm, and the hotspot video is overlaid with the alarm information (red), as shown in fig. 23.
When multiple alarms occur in parallel, the hotspot video split screen automatically cycles through the hotspot videos of all alarms; when a particular alarm is clicked in the comprehensive chart split screen, the three-dimensional situation split screen flashes the information tag of that alarm, the hotspot video split screen automatically displays the hotspot video of that alarm, and flashing alarm information is overlaid on the hotspot video.
Example 4
This embodiment discloses an electronic device comprising a processor and a memory, the memory storing execution instructions; when the processor executes the execution instructions stored in the memory, it performs the three-dimensional scene-based video fusion and video linkage method.
In conclusion, the invention builds on deep image processing of the regional surveillance video and its precise matching and fusion with the three-dimensional model, and further realizes precise spatial matching and fused display of other situation awareness information, equipment and facility information, and task control information. It thereby achieves full-dimensional situation fusion monitoring and command based on the whole-region three-dimensional model, together with local three-dimensional model positioning and intelligent scheduling and fused display of associated information driven by planned tasks, hotspot regions and detection alarms, significantly improving the intelligence level and application efficiency of the system.
The above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A video fusion and video linkage method based on a three-dimensional scene is characterized by comprising the following steps:
S1: collecting data for and producing a three-dimensional model of the monitored area, with the aim of establishing a basis for spatial matching and fusion;
S2: planning and deploying scene cameras and target cameras, with the purpose of obtaining global situation images;
S3: enhancing, de-shaking and correcting the obtained scene video images;
S4: recognizing and analyzing the obtained scene video images to obtain a dynamic person-vehicle density thermodynamic diagram and hotspot targets in the monitored area;
S5: matching and fusing the scene video images, their thermodynamic diagrams and hotspot targets with the three-dimensional model;
S6: matching and linking the target cameras with the three-dimensional model;
S7: monitoring and correcting the pose offset of the scene cameras;
in step S5, the scene video images, their thermodynamic diagrams and the hotspot targets are matched and fused with the three-dimensional model: through parameter analysis of the scene camera, the spatial position of the camera in the three-dimensional model is determined and the view frustum of the projector corresponding to the camera is constructed; the scene video image, its thermodynamic diagram and the hotspot targets are then matched, positioned and fused for display, and conflicting and redundant parts of the video image are cropped to improve the three-dimensional fusion visual effect;
the scene camera parameter analysis comprises the following steps:
S5a: extracting and matching feature points between the corrected scene-camera image and the three-dimensional model of the corresponding area, selecting several feature point pairs and computing their pixel coordinates and three-dimensional coordinates;
S5b: computing the camera parameters from the correspondence between the pixel coordinates and the three-dimensional coordinates of the matched points, on the following principle:
the corresponding relation between the pixel coordinate and the three-dimensional coordinate is as follows:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \frac{1}{Z_c}\begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} R & T \end{bmatrix}\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}$$
wherein R is the rotation matrix, T is the translation matrix, and the middle part on the right of the equals sign is the camera intrinsic matrix; the corresponding points selected in step S5a are substituted into the formula and solved by the least-squares method with repeated iteration to obtain the optimal solution, thereby obtaining the camera's intrinsic and extrinsic parameters, i.e. its position and attitude;
S5c: the view angles of the camera are calculated from the camera intrinsics obtained in S5b, with the formulas:
FOVx=2×tan(α)
FOVy=2×tan(β)
wherein:
[Formula image defining α and β in terms of the camera intrinsic parameters]
the matching and fusing of the scene video image and the three-dimensional model comprises the following steps:
S5d: determining the position of the camera in the three-dimensional model space and constructing the view frustum of the projector corresponding to the camera, according to the camera parameters calculated in S5b and S5c;
S5e: framing the real-time scene video stream, rendering it into a dynamic video texture, and projecting and mapping the dynamic video texture onto the three-dimensional model through texture mapping;
S5f: fusing the static textures and the dynamic video texture on the surface of the three-dimensional model, with the dynamic video texture overlaid as the topmost texture layer, so that the real-time scene video image is positioned and displayed on the three-dimensional model;
S5g: according to monitoring requirements, cropping non-hotspot areas and parts of the scene video image that degrade the fusion visual effect, and cropping the overlapping areas of the video images of adjacent scene cameras so that the multiple video feeds join seamlessly;
S5h: obtaining a static image of the three-dimensional model under the projector's view angle by the inverse imaging principle, according to the projector parameters obtained in step S5d; storing this static image, the feature point pairs and their coordinates selected in step S5a, and the intrinsic and extrinsic parameters calculated in step S5b, for monitoring and correcting the state of the scene camera;
the matching and fusing of the density thermodynamic diagram and the three-dimensional model comprises the following steps:
S5i: rendering the real-time density thermodynamic diagrams derived from each scene camera's video into dynamic thermodynamic textures, and realizing three-dimensional model positioning and real-time density thermodynamic diagram display as in S5e and S5f;
S5j: cropping non-hotspot areas and parts of the density thermodynamic diagram that degrade the fusion visual effect, according to monitoring requirements; color-blending the overlapping areas of adjacent thermodynamic diagrams with a weighted-average algorithm to optimize the fusion visual effect;
S5k: the person-vehicle density thermodynamic diagram can be converted into person and vehicle three-dimensional models with corresponding density, direction and speed for fused display;
the matching and fusing of the hotspot targets and the three-dimensional model comprises the following steps:
S5l: extracting the pixel coordinates of the hotspot target's center point in real time;
S5m: solving the three-dimensional coordinates of the hotspot target from the camera parameters obtained in step S5b;
S5n: positioning the target label on the three-dimensional model in real time according to the three-dimensional coordinates of the hotspot target.
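As an illustration of the parameter analysis in S5a–S5c, the following minimal sketch (assuming OpenCV and NumPy; the matched points, image size and intrinsic values are invented placeholders) solves the extrinsics by iterative least squares and derives the view angles from the intrinsic matrix. It uses the conventional arctangent form of the view angle rather than the claim's literal FOV = 2×tan(α) expression:

```python
import cv2
import numpy as np

# Invented matched pairs standing in for S5a: pixel coordinates in the
# scene-camera image and the corresponding 3D coordinates from the model.
pixel_pts = np.array([[512., 300.], [1040., 420.], [260., 700.],
                      [880., 880.], [1500., 350.], [700., 150.]])
model_pts = np.array([[10.0, 2.0, 0.0], [22.5, 4.1, 0.2], [5.2, 14.0, 0.0],
                      [18.0, 16.3, 0.4], [30.1, 3.0, 1.5], [14.0, 0.5, 3.0]])

w, h = 1920, 1080
fx = fy = 1200.0                        # assumed intrinsics (e.g. prior calibration)
K = np.array([[fx, 0., w / 2],
              [0., fy, h / 2],
              [0., 0., 1.]])

# S5b: iterative least-squares solution of the extrinsics (R, T) from the
# pixel <-> 3D correspondences.
ok, rvec, tvec = cv2.solvePnP(model_pts, pixel_pts, K, None,
                              flags=cv2.SOLVEPNP_ITERATIVE)
R, _ = cv2.Rodrigues(rvec)
camera_pos = (-R.T @ tvec).ravel()      # camera centre in model space, used in S5d

# S5c: horizontal and vertical view angles from the intrinsics, used to
# build the projector view frustum.
fov_x = 2 * np.degrees(np.arctan(w / (2 * fx)))
fov_y = 2 * np.degrees(np.arctan(h / (2 * fy)))
print(camera_pos, fov_x, fov_y)
```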
2. The three-dimensional scene-based video fusion and video linkage method according to claim 1, wherein in step S3 the enhancement processing comprises:
S3a: histogram-equalize the original image:
s_k = T(r_k) = Σ_{j=0}^{k} p_r(r_j), k = 0, 1, …, L−1
S3b: histogram-equalize the target image:
v_k = G(z_k) = Σ_{j=0}^{k} p_z(z_j), k = 0, 1, …, L−1
S3c: determine the mapping relation from r_k to z_m: letting s_k = v_k gives:
z_m = G^{-1}(s_k) = G^{-1}(T(r_k))
the correspondence between r_k and z_m forms a mapping table, and the pixels of the original image are updated sequentially according to this mapping table;
the de-shaking processing comprises the following steps:
S3d: determine the transformation from the previous frame to the current frame, decomposed into three parameters d_x, d_y, d_α;
S3e: calculate the motion trajectory of the current frame from this transformation and the motion trajectory of the previous frame, with the relations:
x_cur = x_pre + d_x
y_cur = y_pre + d_y
α_cur = α_pre + d_α
S3f: smooth the motion trajectory of the current frame, using the following method:
[Formula image: smoothing relation producing the smoothed trajectory values x, y and α]
S3g: calculate a new frame-to-frame transformation H', by the relations:
d_x' = d_x + x - x_cur
d_y' = d_y + y - y_cur
d_α' = d_α + α - α_cur
the transformation matrix H' is obtained as:
$$H' = \begin{bmatrix} \cos d_\alpha' & -\sin d_\alpha' & d_x' \\ \sin d_\alpha' & \cos d_\alpha' & d_y' \end{bmatrix}$$
S3h: transform the video frames using the transformation matrix to obtain the de-shaken video;
the correction processing comprises the following steps:
S3i: detect contour points in the image using an edge detection algorithm;
S3j: detect the distorted straight lines in the image by a line detection method, and make an initial estimate to initialize the distortion model;
S3k: iteratively estimate the parametric distortion model and the distortion center through an energy optimization process, based on the distances between the corrected points and their associated straight lines;
S3l: apply distortion correction to the image using the computed distortion model.
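A minimal sketch of the histogram-specification enhancement in S3a–S3c, assuming 8-bit grayscale images and NumPy (an illustrative reading of the claim, not the patented code): equalize the source and target histograms, build the r_k → z_m lookup table where s_k = v_m, and remap the source pixels:

```python
import numpy as np

def match_histogram(src, ref):
    """Histogram specification per S3a-S3c: remap the grey levels of `src`
    so that its histogram approximates that of `ref` (8-bit images)."""
    # S3a: cumulative distribution (equalisation transform) of the source.
    s = np.cumsum(np.bincount(src.ravel(), minlength=256) / src.size)   # s_k = T(r_k)

    # S3b: cumulative distribution of the target (reference) image.
    v = np.cumsum(np.bincount(ref.ravel(), minlength=256) / ref.size)   # v_m = G(z_m)

    # S3c: for each r_k pick the z_m whose G(z_m) is closest to s_k
    # (the discrete version of letting s_k = v_m), giving the mapping table.
    mapping = np.array([np.argmin(np.abs(v - sk)) for sk in s], dtype=np.uint8)

    # Update the source pixels through the r_k -> z_m lookup table.
    return mapping[src]
```

Applied frame by frame with a chosen reference image, this would pull each scene camera's tonal distribution toward a common target so the fused textures look consistent.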
3. The method according to claim 2, wherein in step S3d the transformation from the previous frame to the current frame is determined and decomposed into the three parameters d_x, d_y, d_α by the following steps:
S3d1: extract the feature point sets of the previous frame and the current frame;
S3d2: calculate the transformation matrix H between the two frames from the change of the feature point set between them:
$$H = \begin{bmatrix} a_1 & -a_2 & b_1 \\ a_2 & a_1 & b_2 \end{bmatrix}$$
S3d3: extract the transformation parameters d_x, d_y, d_α between the two frames from the transformation matrix, wherein:
d_x = b_1
d_y = b_2
d_α = arctan(a_2 / a_1)
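A minimal sketch of the inter-frame motion estimation in S3d1–S3d3, assuming OpenCV feature tracking (the specific detector and tracker are choices made for the sketch, not specified by the claim): track points from the previous frame into the current one, fit a 2×3 similarity transform H, and read off d_x, d_y and d_α:

```python
import cv2
import numpy as np

def frame_to_frame_motion(prev_gray, curr_gray):
    """Estimate d_x, d_y and d_alpha between consecutive frames (S3d1-S3d3)."""
    # S3d1: feature points in the previous frame, tracked into the current one.
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=400,
                                       qualityLevel=0.01, minDistance=10)
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts_prev, None)
    good_prev = pts_prev[status.ravel() == 1]
    good_curr = pts_curr[status.ravel() == 1]

    # S3d2: 2x3 similarity transform H = [[a1, -a2, b1], [a2, a1, b2]]
    # fitted to the tracked point pairs.
    H, _ = cv2.estimateAffinePartial2D(good_prev, good_curr)

    # S3d3: d_x = b1, d_y = b2, d_alpha = arctan(a2 / a1).
    d_x, d_y = H[0, 2], H[1, 2]
    d_alpha = np.arctan2(H[1, 0], H[0, 0])
    return d_x, d_y, d_alpha
```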
4. The three-dimensional scene-based video fusion and video linkage method according to claim 2, wherein in S3j the distorted straight lines in the image are detected by a line detection method, and the initial estimation and initialization of the distortion model comprises:
S3j1: determine by an approximate voting method whether discontinuous curve segments belong to the same straight line, and fit and correct the lines;
S3j2: estimate the initial distortion model from the corrected line group: select the N longest lines, compute the reliability of each line, and take the parameters of the most reliable line as the initial model, with the formulas:
[Formula images: definition of the reliability of the i-th line relative to the j-th line]
5. The three-dimensional scene-based video fusion and video linkage method according to claim 2, wherein in step S3k the iterative estimation of the parametric distortion model and the distortion center through the energy optimization process, based on the distances between corrected points and their associated straight lines, comprises:
S3k1: add a constraint parameter term to the initial distortion model to further constrain it;
S3k2: optimize the constrained parameters iteratively through an energy formula, obtaining the optimal parameters by looping over the pixels on the lines and iterating continuously, where the energy formula is:
[Formula image: the energy function, summed over the detected lines; nl denotes the number of lines and N(j) the number of points on line j]
[Formula image: notation for the points corrected using the distortion model d]
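A minimal sketch of the iterative estimation in S3k, assuming a single-parameter division distortion model and SciPy least squares; the claim's exact constraint term and energy formula are in the formula images above and are only approximated here by point-to-line distances of the corrected points:

```python
import numpy as np
from scipy.optimize import least_squares

def undistort(points, k1, center):
    """Single-parameter division model (an assumption made for this sketch):
    p_undistorted = center + (p - center) / (1 + k1 * r^2)."""
    d = points - center
    r2 = np.sum(d * d, axis=1, keepdims=True)
    return center + d / (1.0 + k1 * r2)

def residuals(params, line_point_sets):
    """For every detected line, the signed distance of each corrected point
    to the line through the corrected endpoints (a stand-in for the energy)."""
    k1, cx, cy = params
    center = np.array([cx, cy])
    res = []
    for pts in line_point_sets:                 # pts: (N, 2) points sampled on one line
        u = undistort(pts, k1, center)
        direction = u[-1] - u[0]
        direction = direction / np.linalg.norm(direction)
        normal = np.array([-direction[1], direction[0]])
        res.extend((u - u[0]) @ normal)
    return np.asarray(res)

def estimate_distortion(line_point_sets, image_size):
    """S3k sketch: jointly estimate the distortion parameter and centre."""
    w, h = image_size
    x0 = [0.0, w / 2.0, h / 2.0]                # initial model and distortion centre
    sol = least_squares(residuals, x0, args=(line_point_sets,))
    return sol.x                                 # (k1, cx, cy)
```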
6. The three-dimensional scene-based video fusion and video linkage method according to claim 1, wherein in S6 the target cameras are matched and linked with the three-dimensional model: the visible field of each target camera is calculated on the basis of the three-dimensional scene and the relay-tracking boundaries of adjacent overlapping fields are determined; according to the position coordinates of a hotspot target, the designated target camera is scheduled and its pan-tilt (cloud mirror) is controlled by calculation, so as to obtain a fine image of the hotspot target and to continuously relay-track a dynamic target;
the calculation of the target camera's visible field comprises the following steps:
S6a: fix the position and attitude of the target cameras, and solve the position of each target camera in the three-dimensional model according to steps S5a–S5d;
S6b: calculate the visible range of each target camera over 360 degrees horizontally, according to the camera position obtained in step S6a and the given visible radius and vertical view-angle range;
S6c: optimize the relay-tracking boundaries of adjacent overlapping visible areas according to monitoring requirements, and determine and store the monitoring area of each target camera;
the calculated control of the target camera's pan-tilt comprises the following steps:
S6d: given the position of a hotspot target within the target camera's monitoring area, first calculate the pan-tilt attitude to command, i.e. the horizontal and vertical rotation angles, from the camera position and the target position;
the horizontal rotation direction and angle of the camera are determined by trigonometric calculation, with the formula:
[Formula image: horizontal rotation angle computed from d1 and d2 by trigonometric functions]
wherein d1 is calculated from the latitude of two location points and d2 is calculated from the longitude of two location points;
the vertical rotation direction and angle of the camera are determined by trigonometric calculation, with the formula:
[Formula image: vertical rotation angle computed from d3 and d4 by trigonometric functions]
wherein the height difference d3 is calculated according to the elevations of the two position points, and the horizontal distance d4 is calculated according to the longitude and latitude of the two position points;
S6e: with the pan-tilt attitude resolved as in step S6d, resolve the zoom factor of the lens focal length to command, calculated under two strategies: fixed spatial scale and fixed target ratio;
the zoom factor of the lens focal length in fixed-spatial-scale mode is calculated as follows:
let the width of the camera's monitored space be W, the installation height of the camera be H, the current angle of view be θ, and the pitch angle after vertical pan-tilt rotation be φ; the angle of view ω after rotation is:
[Formula image: the angle of view ω after rotation, in terms of W, H, θ and φ]
the calculation formula of the focal length scaling factor after the pan-tilt rotates is as follows:
[Formula image: the focal-length zoom factor Z_new after pan-tilt rotation]
Z_new is used to compute and control the camera's focal length in real time according to the pitch angle, so that the set monitoring spatial scale is maintained;
the lens focal-length zoom factor in fixed-target-ratio mode is calculated as follows:
in 16:9 frame mode, let the height of the video image in pixels be H_img and its width be W_img; let the target's frame ratio be P, the recognized target's height in pixels be H, its width be W, and the current focal-length zoom factor be Z;
when 16H ≥ 9W, the focal-length zoom factor after pan-tilt rotation is calculated as:
[Formula image: the focal-length zoom factor Z_new for the case 16H ≥ 9W]
when 16H < 9W, the focal-length zoom factor after pan-tilt rotation is calculated as:
[Formula image: the focal-length zoom factor Z_new for the case 16H < 9W]
Z_new is used to compute and control the camera's focal length in real time according to the pitch angle, keeping the set target frame ratio.
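A minimal sketch of the pan-tilt angle solution in S6d, assuming both positions are already expressed in a local metric east-north-up frame (the claim works from latitude, longitude and elevation differences d1–d4; the zoom-factor formulas are in the formula images above and are not reproduced):

```python
import numpy as np

def pan_tilt_to_target(cam_pos, target_pos):
    """S6d sketch: horizontal and vertical rotation angles that point the
    target camera at a 3D position, both given in a local (east, north, up)
    metric frame."""
    delta = np.asarray(target_pos, dtype=float) - np.asarray(cam_pos, dtype=float)
    d_east, d_north, d_up = delta

    # Horizontal rotation: bearing of the target from the camera.
    pan = np.degrees(np.arctan2(d_east, d_north))

    # Vertical rotation: height difference over horizontal ground distance
    # (the d3 / d4 relation of the claim, roughly).
    tilt = np.degrees(np.arctan2(d_up, np.hypot(d_east, d_north)))
    return pan, tilt

# Example: camera mounted 20 m up, target at ground level 42 m east, 42 m north.
print(pan_tilt_to_target((0.0, 0.0, 20.0), (42.0, 42.0, 0.0)))
```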
7. The method according to claim 1, wherein in step S7 the pose offset of the scene camera is monitored and corrected: the camera parameters are monitored manually or periodically so that the projection parameters are corrected and updated automatically, or an alarm prompts on-site adjustment and maintenance, thereby maintaining the three-dimensional fusion accuracy of the scene video image, its thermodynamic diagram and the hotspot targets, as well as the established monitoring coverage area; the steps are as follows:
S7a: manually at any time, or automatically on a set period, grab one frame from the real-time scene-camera video stream, and extract and match feature points against the static image of the three-dimensional model under the camera's projector view angle stored in step S5h;
S7b: match the feature point pairs extracted in step S7a against the feature point pairs stored in step S5h, select several feature point pairs and resolve their pixel coordinates;
S7c: calculate the camera monitoring parameters by the method of step S5b from the feature point pairs selected in step S7b;
S7d: compare the camera monitoring parameters obtained in step S7c with the parameters stored in step S5b; if the change is smaller than a set threshold, update the three-dimensional model positioning and display of the scene video image through steps S5c–S5h according to the monitoring parameters, and update and store the selected feature point pairs and their coordinates, the camera monitoring parameters and the static three-dimensional model image under the projector view angle; if the change is larger than the set threshold, raise an alarm prompting on-site adjustment and maintenance.
8. The three-dimensional scene-based video fusion and video linkage method according to claim 1, wherein the method builds on deep image processing of regional surveillance video and its precise matching and fusion with the three-dimensional model, and further realizes precise spatial matching and fused display of other situation awareness information, equipment and facility information and task management and control information, thereby realizing full-dimensional situation fusion monitoring based on the three-dimensional scene, together with intelligent spatial positioning analysis and fused display of associated information driven by hotspot patrols, planned tasks and detection alarms.
9. An electronic device comprising a processor and a memory storing execution instructions, wherein when the processor executes the execution instructions stored in the memory, the processor executes the three-dimensional scene-based video fusion and video linkage method according to any one of claims 1 to 8.
CN202010598667.1A 2020-06-28 2020-06-28 Video fusion and video linkage method based on three-dimensional scene and electronic equipment Active CN111836012B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010598667.1A CN111836012B (en) 2020-06-28 2020-06-28 Video fusion and video linkage method based on three-dimensional scene and electronic equipment

Publications (2)

Publication Number Publication Date
CN111836012A CN111836012A (en) 2020-10-27
CN111836012B true CN111836012B (en) 2022-05-13

Family

ID=72898987

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010598667.1A Active CN111836012B (en) 2020-06-28 2020-06-28 Video fusion and video linkage method based on three-dimensional scene and electronic equipment

Country Status (1)

Country Link
CN (1) CN111836012B (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112437276B (en) * 2020-11-20 2023-04-07 埃洛克航空科技(北京)有限公司 WebGL-based three-dimensional video fusion method and system
CN112445995B (en) * 2020-11-30 2024-02-13 北京邮电大学 Scene fusion display method and device under WebGL
CN112506465B (en) * 2020-12-01 2023-03-21 建信金融科技有限责任公司 Method and device for switching scenes in panoramic roaming
CN114630040B (en) * 2020-12-10 2023-07-14 中国人民解放军陆军工程大学 Self-adaptive tracking system and method for linkage of laser radar and dome camera
CN113259624A (en) * 2021-03-24 2021-08-13 北京潞电电气设备有限公司 Monitoring equipment and method thereof
CN112802208B (en) * 2021-03-30 2021-06-22 中国民用航空总局第二研究所 Three-dimensional visualization method and device in terminal building
CN113532390B (en) * 2021-06-18 2023-07-07 广州领汇信息科技有限公司 Target positioning method, device and storage medium based on artificial intelligence technology
CN113542722A (en) * 2021-06-30 2021-10-22 石家庄科林电气设备有限公司 Real-time data, three-dimensional model and video combined display system in electric power operation and maintenance
CN113452984B (en) * 2021-06-30 2023-02-03 石家庄科林电气设备有限公司 Real-time data, three-dimensional model and video combined display method in electric power operation and maintenance
CN113643788B (en) * 2021-07-15 2024-04-05 北京复数健康科技有限公司 Method and system for determining feature points based on multiple image acquisition devices
CN113703703A (en) * 2021-08-23 2021-11-26 深圳市道通智能航空技术股份有限公司 Unmanned aerial vehicle data display method, device, equipment and storage medium
CN114255285B (en) * 2021-12-23 2023-07-18 奥格科技股份有限公司 Video and urban information model three-dimensional scene fusion method, system and storage medium
CN114442805A (en) * 2022-01-06 2022-05-06 上海安维尔信息科技股份有限公司 Monitoring scene display method and system, electronic equipment and storage medium
CN115243091B (en) * 2022-06-14 2024-02-02 北京箩筐时空数据技术有限公司 Map track dynamic display method and device
CN114818992B (en) * 2022-06-23 2022-09-23 成都索贝数码科技股份有限公司 Image data analysis method, scene estimation method and 3D fusion method
CN115376313B (en) * 2022-07-26 2024-07-16 四川智慧高速科技有限公司 Image fusion and distortion correction implementation method based on monitoring camera group
CN115908706B (en) * 2022-11-15 2023-08-08 中国铁路设计集团有限公司 High-speed railway completion acceptance method with fusion of live three-dimensional model and image
CN117237438B (en) * 2023-09-18 2024-06-28 共享数据(福建)科技有限公司 Range matching method and terminal for three-dimensional model and unmanned aerial vehicle video data
CN117336459B (en) * 2023-10-10 2024-04-30 雄安雄创数字技术有限公司 Three-dimensional video fusion method and device, electronic equipment and storage medium
CN117152400B (en) * 2023-10-30 2024-03-19 武汉苍穹融新科技有限公司 Method and system for fusing multiple paths of continuous videos and three-dimensional twin scenes on traffic road
CN117237512B (en) * 2023-11-10 2024-03-12 深圳市易图资讯股份有限公司 Three-dimensional scene mapping method and system for video image
CN117278731B (en) * 2023-11-21 2024-05-28 启迪数字科技(深圳)有限公司 Multi-video and three-dimensional scene fusion method, device, equipment and storage medium
CN118411500B (en) * 2024-06-27 2024-08-27 杭州海康威视数字技术股份有限公司 Portable imaging device-based operation scene detection method and portable imaging device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9264689B2 (en) * 2011-08-04 2016-02-16 Semiconductor Components Industries, Llc Systems and methods for color compensation in multi-view video
CN104376542B (en) * 2014-10-25 2019-04-23 深圳市金立通信设备有限公司 A kind of image enchancing method
US10572982B2 (en) * 2017-10-04 2020-02-25 Intel Corporation Method and system of image distortion correction for images captured by using a wide-angle lens
CN110070624B (en) * 2019-04-26 2020-05-08 厦门大学 Urban geomorphology feature identification method based on VR combined with eye movement tracking

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8854282B1 (en) * 2011-09-06 2014-10-07 Google Inc. Measurement method
CN103716586A (en) * 2013-12-12 2014-04-09 中国科学院深圳先进技术研究院 Monitoring video fusion system and monitoring video fusion method based on three-dimension space scene
CN105096269A (en) * 2015-07-21 2015-11-25 北京交通大学 Radial image distortion rectifying method and system based on distorted linear structure detection
CN105956554A (en) * 2016-04-29 2016-09-21 广西科技大学 Face identification method
CN106204656A (en) * 2016-07-21 2016-12-07 中国科学院遥感与数字地球研究所 Target based on video and three-dimensional spatial information location and tracking system and method
CN106210643A (en) * 2016-07-29 2016-12-07 林玉峰 A kind of video camera viewing area call method
CN110737989A (en) * 2019-10-18 2020-01-31 中国科学院深圳先进技术研究院 parallel intelligent emergency cooperation method, system and electronic equipment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
High-Accuracy Angle Detection for the Ultrawide FOV Acquisition Scheme in Free-Space Optical Links; L. Liu; Lightwave Technology; 20141002; full text *
Smooth adaptive fitting of 3D models using hierarchical triangular splines; A. Yvart; International Conference on Shape Modeling and Applications 2005; 20051227; full text *
Application of augmented reality (AR) technology in expressway video surveillance; Zeng Xiangping; Highway Traffic Science and Technology (Applied Technology Edition); 20200115 (No. 1, 2020); full text *
High-quality texture mapping for complex three-dimensional scenes; Jiang Hanqing; Chinese Journal of Computers; 20151215; Vol. 38; full text *

Similar Documents

Publication Publication Date Title
CN111836012B (en) Video fusion and video linkage method based on three-dimensional scene and electronic equipment
EP3573024B1 (en) Building radar-camera surveillance system
US20210073573A1 (en) Ship identity recognition method based on fusion of ais data and video data
CN103024350B (zh) A kind of principal and subordinate's tracking of binocular PTZ vision system and the system of application the method
US7583815B2 (en) Wide-area site-based video surveillance system
CN102148965B (en) Video monitoring system for multi-target tracking close-up shooting
EP2553924B1 (en) Effortless navigation across cameras and cooperative control of cameras
CN103513295B (en) A kind of weather monitoring system based on polyphaser captured in real-time and image procossing and method
CN105516654B (en) A kind of supervision of the cities video fusion method based on scene structure analysis
CN112053391B (en) Monitoring and early warning method and system based on dynamic three-dimensional model and storage medium
US20080291278A1 (en) Wide-area site-based video surveillance system
CN114693746B (en) Intelligent monitoring system and method based on identity recognition and cross-camera target tracking
Ciampa Pictometry Digital Video Mapping
CN112449093A (en) Three-dimensional panoramic video fusion monitoring platform
CN114827570A (en) Video situation perception and information fusion method based on three-dimensional scene and electronic equipment
CN103198488A (en) PTZ surveillance camera realtime posture rapid estimation method
CN206611521U (en) A kind of vehicle environment identifying system and omni-directional visual module based on multisensor
KR20190059395A (en) Gis 3 cctv 3 cctv
CN103414872A (en) Method for driving PTZ cameras through target position
US11703820B2 (en) Monitoring management and control system based on panoramic big data
Sankaranarayanan et al. A fast linear registration framework for multi-camera GIS coordination
KR101996907B1 (en) Apparatus for tracking object
CN112637509A (en) High-altitude parabolic monitoring and early warning system and method
Chen et al. Immersive surveillance for total situational awareness
CN114241126A (en) Method for extracting object position information in monocular video based on live-action model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant