CN110245199B - Method for fusing large-dip-angle video and 2D map - Google Patents

Method for fusing large-dip-angle video and 2D map

Info

Publication number
CN110245199B
CN110245199B (application CN201910350808.5A)
Authority
CN
China
Prior art keywords
video
background image
image
dynamic target
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201910350808.5A
Other languages
Chinese (zh)
Other versions
CN110245199A (en)
Inventor
朱雪坚
刘学军
叶远智
刘洋
王晓辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Natural Resources Monitoring Center
Nanjing Normal University
Original Assignee
Zhejiang Natural Resources Monitoring Center
Nanjing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Natural Resources Monitoring Center, Nanjing Normal University filed Critical Zhejiang Natural Resources Monitoring Center
Priority to CN201910350808.5A priority Critical patent/CN110245199B/en
Publication of CN110245199A publication Critical patent/CN110245199A/en
Application granted granted Critical
Publication of CN110245199B publication Critical patent/CN110245199B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B29/00Maps; Plans; Charts; Diagrams, e.g. route diagram
    • G09B29/003Maps
    • G09B29/006Representation of non-cartographic information on maps, e.g. population distribution, wind direction, radiation levels, air and sea routes

Abstract

The invention discloses a method for fusing a large-dip-angle video and a 2D map. A static video background image is obtained through video background modeling; the foreground dynamic target is extracted by combining a background subtraction method with a three-frame difference method; the static video background image obtained in step S1 is segmented to obtain the road-surface area in the image, and the video background image is geometrically corrected with a homography-based geometric correction algorithm for large-dip-angle video images to obtain a corrected video background image; a mutual mapping model between the surveillance video and the 2D geospatial data is established based on the internal and external parameters of the camera; and, based on the mutual mapping model established in S4, the corrected static video background image and the extracted foreground dynamic target are mapped onto the two-dimensional map, completing the integrated expression of the surveillance video and the two-dimensional map.

Description

Method for fusing large-dip-angle video and 2D map
Technical Field
The invention belongs to the technical field of fusing surveillance video with 2D maps, and in particular relates to a method for fusing a large-dip-angle video and a 2D map.
Background
The integration of video and GIS is a new way of representing geographic scenes. In the field of integrating video with 2D GIS, Lippman (Lippman A. Movie maps: an application of the optical videodisc to computer graphics [J]. SIGGRAPH '80, 1980, 14(3):32-42.) first combined video with GIS in 1978 and developed dynamic, user-interactive hypermedia maps. Since then, research on the integration of video and GIS has gradually deepened, received more and more attention from researchers, and produced a large body of related work. Berry (Berry J K. Capture "Where" and "When" on video-based GIS [J]. GEO WORLD, 2000, 13:26-27.) proposed a framework for video maps and designed a corresponding conceptual scheme for field acquisition, processing and application of the data. Lewis et al. (Lewis P, Fotheringham S, Winstanley A. Spatial video and GIS. International Journal of Geographical Information Science, 2011, 25(5):697-716.) defined a pyramid data structure for a geospatial video data model under GIS constraints, applicable to two-dimensional GIS analysis and visualization, and verified its feasibility through experiments. The authors of "Design and implementation of a campus geographic video surveillance WebGIS system" (Science of Surveying and Mapping, 2012, 37(1):195-197.) and Zhang (Design and implementation of a GIS-based public-security video surveillance planning system [D]. Kaifeng: Henan University, 2011.) respectively realized a loose integration of a video surveillance system with GIS, representing each camera as a point, or its viewing range as a sector, on a two-dimensional map and statically calling video files for playback through hyperlinks. Kong Yunfeng et al. (Geographic video data model design and network video GIS implementation [J]. Geomatics and Information Science of Wuhan University, 2010, 35(2):133-) established a mapping relationship among geographic position (XY), road mileage (M), and video time or frame (T/F). Zhang Xing et al. (Zhang Xing, Liu Xuejun, Wang Sining, et al. Mutual mapping between surveillance video and 2D geospatial data [J]. Geomatics and Information Science of Wuhan University, 2015, 40(8):1130-1136.) studied the mutual mapping model between surveillance video and geospatial data and proposed a semi-automatic mutual mapping method based on feature matching.
Geometric correction of images is widely used in the field of remote sensing and is one of the important means of reducing the difference between a remote-sensing image and the true form of the ground. In photogrammetry, usually only the geometric correction of slightly tilted photographs (tilt angle within 2°) is discussed. For small-tilt images, existing algorithms fall mainly into two categories: methods that solve for correction parameters from control points according to a given mathematical model, such as polynomial methods, and collinearity-equation methods based on digital elevation models and imaging equations. Some researchers have also studied the geometric correction of large-tilt images (tilt angles between 2° and 90°). One study performed homography-based geometric correction of large-tilt images (Journal of Shanghai University, Natural Science Edition, 2005, 11(5):481-). Xu Qingyang (Research on geometric correction methods for near-space large-tilt remote-sensing images [D]. Harbin Institute of Technology, 2009.) studied the difficult problems of near-space large-tilt remote-sensing image correction in depth, proposed a piecewise polynomial correction model, optimized the automatic selection of control points by SIFT interest operators through an iterative gross-error control-point removal algorithm and a uniform-distribution algorithm, and achieved fully automatic geometric correction of the images. For large-tilt aerial photographs, a geographic coordinate assignment method that performs geometric correction with respect to a reference image, based on an improved six-parameter affine transformation model, has also been proposed (Hydrographic Surveying and Charting, 2010, 30(3):23-26.).
From the above analysis it can be seen that the integration of surveillance video and two-dimensional maps has drawn increasing attention from researchers, the study of related theories and methods has gradually become a focus in the relevant academic fields, and corresponding results have been obtained; however, the main remaining problems are:
(1) In the integration of video and GIS, existing research either treats video data as an attribute of spatial data and statically calls video files through hyperlinks, losing the spatial resolution of the video, or simply places the video on the map without fully exploiting the rich information the video contains. In addition, existing work only integrates the surveillance video of each camera's coverage area with the map; it pays little attention to inferring what happens in surveillance blind areas and cannot perceive the spatial pattern of dynamic targets in those areas.
(2) In image geometric correction, the field of photogrammetry generally discusses only the correction of small-tilt images, whereas surveillance cameras usually have large tilt angles because of the required coverage, so correction methods suited to small-tilt images cannot simply be applied to large-tilt surveillance video images. In existing geometric correction methods for large-tilt images, dynamic targets in the corrected result suffer large geometric deformation and distortion, making it difficult to meet the correction requirements of real-time surveillance video images.
Disclosure of Invention
The invention provides a method for fusing a large-inclination-angle video and a 2D map, aiming at the problems that, in the corrected image results produced by existing geometric correction methods for large-inclination images, dynamic targets suffer large geometric deformation and distortion, and that the correction algorithms are inefficient.
The invention discloses a method for fusing a large-dip-angle video and a 2D map, which comprises the following steps of:
s1: establishing a mutual mapping model of the monitoring video and the 2D map according to the parameters of the camera;
s2: and mapping the front-view static video background image of the surveillance video and the foreground dynamic target of the surveillance video onto the 2D map according to the mutual mapping model, so as to finish the integrated expression of the surveillance video and the 2D map.
Further, the acquiring of the front-view static video background image of the surveillance video comprises the following steps:
obtaining a static video background image of a monitoring video according to a video background modeling technology;
and performing geometric correction on the static video background image to obtain a corresponding front-view image.
Further, the step of extracting foreground dynamic objects in the surveillance video includes:
performing AND operation on the foreground dynamic target binary image obtained by the three-frame difference method and the foreground dynamic target binary image obtained by the background subtraction method to obtain a final foreground dynamic target;
and obtaining the position of the foreground dynamic target by analyzing the connected domain of the foreground dynamic target.
Further, the step of obtaining a foreground dynamic target binary image includes:
obtaining a static video background image of a monitoring video according to a video background modeling technology;
respectively extracting foreground dynamic targets from a static video background image according to a three-frame difference method and a background subtraction method to respectively obtain preliminary foreground dynamic targets;
each frame of the surveillance video is differenced against the preliminary foreground dynamic targets to obtain the difference values g_1 and g_2 for each pixel;
if g_1 > k_1 or g_2 > k_2, where k_1 and k_2 are adaptive thresholds derived from the average gray value of the static video background image, the pixel point is marked as 1 and all other points are marked as 0, giving the binary image of the video foreground dynamic target.
Further, before the step of geometrically correcting the background image of the still video, the method includes:
performing super-pixel segmentation on a static video background image;
based on the prior knowledge of the ground and the non-ground, constructing a decision tree which takes the image characteristics extracted from the segmented static video background image as a classification basis;
classifying the segmented static video background image into horizontal ground and non-ground using the decision tree, to obtain the ground part and the non-ground part of the static video background image;
the step of geometrically correcting the still video background image comprises: correcting the ground part in the static video background image into an orthographic image by adopting a homography matrix;
if the front-view image has the void point, acquiring a corresponding point of the void point on the static video background image, calculating a gray value of the corresponding point of the void point on the static video background image by using a bilinear interpolation method, and further acquiring the gray value of the void point to obtain the finally corrected static video background image.
Further, the step of mapping the front-view static video background image in the surveillance video onto the 2D map comprises:
acquiring a corresponding view trapezoid of the monitoring video in the geographic space according to the mutual mapping model established in the S1;
according to the coordinates (X_i, Y_i), i ∈ [1,4], of the four corner points of the view trapezoid, calculating the side length L_i, i ∈ [1,4], of each side of the view trapezoid and the corresponding distance l_i, i ∈ [1,4], of each side on the map: l_i = s × L_i, where s is the scale of the 2D map;
according to the length PL_i of each side of the front-view static video background image in the surveillance video and its actual length l_i on the map, calculating the scaling factor ε_i = l_i / PL_i;
based on the scaling factor, scaling the front-view static video background image, then rotating and translating it according to the coordinates of the camera's center point and the camera's rotation and pitch angles, so that the static video background image is mapped to the correct position on the 2D map;
the step of mapping foreground dynamic objects in the surveillance video onto the 2D map comprises:
calculating the center coordinate Centre of each foreground dynamic target in the surveillance video as Centre = (1/M) Σ_{i=1}^{M} (x_i, y_i), where M is the number of pixels of the foreground dynamic target and (x_i, y_i) are the pixel coordinates of the foreground dynamic target;
according to the scaling factor ε_i, scaling the foreground dynamic target in the same proportion;
converting the center coordinates of each foreground dynamic target into 2D geographic coordinates according to a mutual mapping model of the monitoring video and the 2D map;
and mapping the foreground dynamic target of the monitoring video to a 2D map according to the central coordinates of each foreground dynamic target and the movement direction of the foreground dynamic target, and updating the position of the foreground dynamic target in real time.
Further, the moving direction of the foreground dynamic target is determined by the rotation angle of the camera.
Further, the mutual mapping model of the monitoring video and the 2D map comprises a mapping model from a video image space to a geographic space and a model from the geographic space to the video image space;
the mapping model from the video image space to the geographic space is represented as:
(X_G, Y_G, Z_G)^T = (X_C, Y_C, Z_C)^T + λ·P·T·(f, x, y)^T
where (X_G, Y_G, Z_G) are the spatial coordinates of the target, (X_C, Y_C, Z_C) are the coordinates of the camera's optical center, (f, x, y) is the line-of-sight vector, P and T are the rotation matrices of the camera, and λ is the ray-extension parameter;
the mapping model from the geographic space to the video image space is represented as:
λ·P·T·(f, x, y)^T = (X_G - X_C, Y_G - Y_C, Z_G - Z_C)^T.
further, the video background modeling technology adopts a Vibe algorithm.
Further, the super-pixel segmentation of the static video background image is realized by adopting a SLIC super-pixel segmentation algorithm.
Beneficial effects: in view of the shortcomings of current approaches to fusing surveillance video with 2D maps, the invention provides a method for integrating large-dip-angle surveillance video with a 2D map, which mainly solves the problem of deformation and distortion of dynamic targets after geometric correction of large-dip-angle images. Meanwhile, because the background of video shot by a fixed bullet camera remains unchanged, each frame of the video does not need to be geometrically corrected, which greatly reduces the amount of computation and effectively improves the efficiency of geometric correction of surveillance video images; the integration of a single surveillance video with a two-dimensional map is thus completed, enhancing the expression of dynamic targets on the two-dimensional map.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 shows the video foreground/background separation experiment results of the present invention;
FIG. 3 is a result diagram of the segmentation of horizontal road surface areas according to the present invention, wherein 3-1 is an original video background diagram, 3-2 is a color effect diagram after classification, and 3-3 is a segmented horizontal road surface;
FIG. 4 is a diagram of the video background image correction result of the present invention, wherein 4-1 is the video background image to be corrected, and 4-2 is the result after correction;
FIG. 5 is an overall effect diagram of the present invention, wherein 5-1 is the original image of the 359th video frame, and 5-2 is the integrated effect diagram of the 359th video frame and the two-dimensional map.
Detailed Description
The invention is further illustrated below with reference to the figures and examples.
The basic idea of the invention is as follows: firstly, video background modeling is carried out on a monitoring video, automatic segmentation of a flat road surface is carried out on an obtained static video background image by adopting a decision tree, and then a homography matrix is utilized to carry out large-inclination-angle geometric correction processing on the flat road surface; extracting a video foreground dynamic target by adopting a mode of combining a background subtraction method and a three-frame difference method; and finally, mapping the corrected static video background image and the extracted foreground dynamic target to a 2D map respectively, and finally realizing the integration of the monitoring video and the two-dimensional map, thereby enhancing the expression of the two-dimensional map to the dynamic target.
Example 1:
as shown in fig. 1, the method for fusing a large-inclination-angle video and a 2D map according to the embodiment includes the following steps:
firstly, video background modeling:
the video background modeling technology is used to obtain a static video background image, namely a static or very slow moving point in the video. The background modeling technology adopts a Vibe algorithm, the Vibe algorithm is a pixel-level video background modeling algorithm, the algorithm is high in calculation efficiency and has certain robustness on noise, the Vibe algorithm is suitable for complex scenes such as camera shaking and illumination change, good real-time performance can be kept, and video background modeling quality and efficiency can be guaranteed.
Secondly, foreground dynamic target extraction:
The foreground dynamic targets are the points that move noticeably in the surveillance video, typically appearing as moving vehicles, walking pedestrians and the like. The video foreground dynamic target is extracted by combining a background subtraction method with a three-frame difference method; the specific steps are as follows:
(1) Background subtraction is carried out using the static video background image obtained in the first step to extract a preliminary foreground dynamic target, and two adaptive thresholds k_1 and k_2 are then set from the average gray value of the static video background image.
(2) Each frame of the surveillance video is read and differenced against the preliminary foreground dynamic target, giving the difference values g_1 and g_2 for each pixel.
(3) If g_1 > k_1 or g_2 > k_2, the pixel point is marked as 1 in the foreground binary image and all other points are marked as 0, giving a preliminary binary image of the video foreground dynamic target.
(4) And in order to eliminate the noise, performing AND operation on the foreground dynamic target obtained by the three-frame difference method and the foreground dynamic target obtained by the background subtraction method to obtain a final foreground dynamic target result.
(5) The position of the foreground dynamic target is determined by connected-component analysis of the foreground dynamic target, and the center coordinate Centre of the moving target is computed as
Centre = (1/M) Σ_{i=1}^{M} (x_i, y_i),
where M is the number of pixels of the foreground dynamic target and (x_i, y_i) are the pixel coordinates of the foreground dynamic target.
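A minimal sketch of steps (1)-(5) follows, assuming grayscale frames and the static background from the first step are already available; because the patent's adaptive-threshold formulas are given only as images, the thresholds below are simple placeholders derived from the background's mean gray value.

```python
import cv2
import numpy as np

def extract_foreground(prev_frame, curr_frame, next_frame, background,
                       k1=None, k2=None):
    """Combine background subtraction with a three-frame difference (grayscale inputs)."""
    mean_bg = float(background.mean())
    # Placeholder adaptive thresholds based on the background's mean gray value;
    # the exact formulas of the patent are not reproduced here.
    k1 = k1 if k1 is not None else 0.15 * mean_bg
    k2 = k2 if k2 is not None else 0.15 * mean_bg

    # Background-subtraction binary image.
    diff_bg = cv2.absdiff(curr_frame, background)
    mask_bg = (diff_bg > k1).astype(np.uint8)

    # Three-frame-difference binary image.
    d1 = cv2.absdiff(curr_frame, prev_frame)
    d2 = cv2.absdiff(next_frame, curr_frame)
    mask_3f = ((d1 > k2) & (d2 > k2)).astype(np.uint8)

    # AND the two binary images to suppress noise, as in step (4).
    mask = cv2.bitwise_and(mask_bg, mask_3f) * 255

    # Connected-component analysis gives each target's centroid, as in step (5).
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    targets = [tuple(centroids[i]) for i in range(1, n)
               if stats[i, cv2.CC_STAT_AREA] > 50]   # drop tiny blobs
    return mask, targets
```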
Thirdly, correcting the video background geometry:
The geometric correction of the video background is based on a decision tree: the static video background image obtained in the first step is segmented to obtain the flat road-surface area in the image, and the road-surface area of the static video background image is geometrically corrected using a homography-based geometric correction algorithm for large-dip-angle video images. The specific steps are as follows:
(1) Superpixel segmentation is performed on the static video background image obtained in the first step, using the SLIC superpixel segmentation algorithm, which offers fast processing, low algorithmic complexity and good adherence to segmentation boundaries.
(2) Prior knowledge of ground and non-ground is acquired through machine learning to construct a decision tree. The decision-tree classification is based on image features extracted from the segmented static video background image; this embodiment selects 10 types of image features, 55 feature values in total, as the basis for decision-tree classification.
TABLE 1 image characteristics for decision tree classification
(3) And carrying out horizontal ground and non-ground classification on the segmented video background image to obtain a ground part and a non-ground part.
(4) The ground part of the static video background image is corrected into an orthographic image through the homography matrix. Suppose the images before and after correction are I_1 and I_2, respectively. For any point (x_1, y_1) in the video background image I_1 to be corrected, the corresponding point (x_2, y_2) can be found in image I_2. Corresponding points of the two images satisfy a simple homography:
x_2 = H x_1   (6)
where H is a 3 × 3 matrix
H = [ h_11 h_12 h_13 ; h_21 h_22 h_23 ; h_31 h_32 h_33 ],
from which the coordinates of the corrected image points are obtained:
x_2 = (h_11·x_1 + h_12·y_1 + h_13) / (h_31·x_1 + h_32·y_1 + h_33)
y_2 = (h_21·x_1 + h_22·y_1 + h_23) / (h_31·x_1 + h_32·y_1 + h_33)
(5) In general, x_2 and y_2 are not integers and therefore need to be rounded; the gray value at point (x_2, y_2) is then determined by the gray value at point (x_1, y_1). Holes can appear in the image obtained in this way. To eliminate them, each hole point (x_2', y_2') is detected, its corresponding point (x_1', y_1') in image I_1 is found, and the gray value of the hole point is then calculated by bilinear interpolation, giving the finally corrected orthographic static video background image.
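A hedged sketch of this third step is given below: SLIC superpixels, a pre-trained decision tree to separate ground from non-ground, and a homography warp of the ground region. The per-superpixel features used here (mean colour plus normalised centroid) are a deliberate simplification of the 55 features of Table 1, and the homography H and the trained tree are assumed to be supplied by the caller.

```python
import cv2
import numpy as np
from skimage.segmentation import slic
from sklearn.tree import DecisionTreeClassifier

def superpixel_features(image, labels):
    """Per-superpixel mean colour and normalised centroid (a stand-in for Table 1's features)."""
    h, w = labels.shape
    feats = []
    for sp in np.unique(labels):
        mask = labels == sp
        ys, xs = np.nonzero(mask)
        feats.append(np.concatenate([image[mask].mean(axis=0),
                                     [ys.mean() / h, xs.mean() / w]]))
    return np.array(feats)

def correct_ground(background, tree: DecisionTreeClassifier, H, out_size):
    """Segment the ground region and warp it to an orthographic view."""
    labels = slic(background, n_segments=400, compactness=10, start_label=0)
    feats = superpixel_features(background, labels)
    is_ground = tree.predict(feats)            # assumed labels: 1 = ground, 0 = non-ground

    ground_ids = np.unique(labels)[is_ground == 1]
    ground_mask = np.isin(labels, ground_ids).astype(np.uint8)
    ground = cv2.bitwise_and(background, background, mask=ground_mask)

    # warpPerspective performs the inverse mapping and fills each output pixel by
    # bilinear interpolation, which also removes the hole points mentioned in step (5).
    return cv2.warpPerspective(ground, H, out_size, flags=cv2.INTER_LINEAR)
```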
Fourthly, the video moving and static target and the 2D map are mapped with each other:
the mutual mapping model refers to the realization of mutual mapping between an image space and a geographic space, and comprises the following specific steps:
(1) mapping of image space to geographic space.
The video image space is a two-dimensional space, the geographic space is a three-dimensional space, and the monitoring camera can project the ground objects in the geographic space into the image space. The monitoring camera adopts perspective projection for imaging, and the model is as follows:
(X_G, Y_G, Z_G)^T = (X_C, Y_C, Z_C)^T + λ·P·T·(f, x, y)^T   (10)
The meaning of this formula is as follows: in the camera's initial state its position is (0, 0, 0), the horizontal angle and the pitch angle are both 0°, and the line-of-sight vector is CP = (f, x, y). As the camera rotates, the rotation is expressed by the rotation matrices P and T, and multiplying the line-of-sight vector by the rotation matrices gives the rotated viewing direction. Extending the ray, i.e. multiplying by λ, gives the vector from the camera's optical center to the object point. With the optical-center coordinates (X_C, Y_C, Z_C) and the vector from the optical center to the object-space point known, the coordinates (X_G, Y_G, Z_G) of the object-space point can be obtained. On the left of the equals sign is the spatial position of the target, which is unknown; on the right are the position of the image pixel, the attitude and position of the video sensor, and λ, which are known quantities.
The image space is two-dimensional; a point in it may correspond to infinitely many points on a straight line in three-dimensional space, and the formula above is the expression of that straight line. Once λ is determined, a unique point in three-dimensional space is determined.
Let the distance from the camera's optical center to the target surface be f_D and the distance from the optical center to the object-space point be D; a geometric relationship between D, f_D and λ can then be established. The invention assumes a horizontal ground, so Z_G equals the elevation of the ground. When Z_G is known, λ can be deduced from the third (Z) component of formula 10 as
λ = (Z_G - Z_C) / [P·T·(f, x, y)^T]_Z.
Substituting λ into formula 10 gives X_G and Y_G, thereby determining the spatial position of the target appearing in the image.
(2) Mapping of geospatial to image space.
The mapping from geographic space to image space is the inverse of the mapping from image space to geographic space. Inverting and rearranging formula 10 gives:
λ·(f, x, y)^T = (P·T)^(-1) · (X_G - X_C, Y_G - Y_C, Z_G - Z_C)^T
The right-hand side of this equation contains only known quantities: the geospatial coordinates, the camera attitude and the camera position. On the left-hand side, when the focal length f is known, λ can be obtained from the first component of the vector, and substituting λ back into the relation yields the coordinates (x, y) of the spatial point in image space.
Fifthly, integrating the video moving and static targets with the 2D map:
the integration of the video target and the 2D map means that a mutual mapping model of the monitoring video and the 2D map is established, and the corrected front-view static video background image and the extracted foreground dynamic target are mapped onto the two-dimensional map, so that the integrated expression of the monitoring video and the two-dimensional map is realized.
(1) The mapping of the static video background of the monitoring video and the two-dimensional map comprises the following specific steps:
firstly, establishing a mapping model from a video image to 2D geographic space data according to internal and external parameters of a camera;
Secondly, the corresponding view trapezoid of the surveillance video image in geographic space is calculated according to the established mapping model;
Thirdly, from the coordinates (X_i, Y_i), i ∈ [1,4], of the four corner points of the view trapezoid, the side length L_i, i ∈ [1,4], of each side of the view trapezoid is calculated;
Fourthly, according to the scale s of the current 2D map, the distance l_i, i ∈ [1,4], on the map corresponding to each side length of the view trapezoid is calculated:
l_i = s × L_i, i ∈ [1,4]   (16)
Fifthly, knowing the length PL_i, i ∈ [1,4], of each side of the corrected front-view static video background image and its actual length l_i on the map, a scaling relationship exists between the images before and after mapping, with scaling factor
ε_i = l_i / PL_i.
The scaling is performed so that the surveillance-video background image overlays the map at a suitable size. After the scaling transformation, the surveillance-video background image is rotated and translated according to the coordinates of the camera's center point and the camera's rotation and pitch angles, and is thus mapped to the correct position on the two-dimensional map.
(2) Monitoring mapping of a foreground dynamic target of a video and a two-dimensional map, which comprises the following specific steps:
extracting foreground dynamic targets in the monitored video according to the second step, and calculating the center coordinates of each foreground dynamic target;
② according to the scaling factor ε_i, the dynamic target is scaled in the same proportion;
thirdly, according to the internal and external parameters of the monitoring camera, a mapping model of the monitoring video and the 2D geographic space data is constructed, and based on the mapping model, the central coordinates of all the moving objects are converted into 2D geographic coordinates;
mapping the dynamic foreground target of the monitoring video to two-dimensional geographic space data, determining the moving direction of the dynamic target such as the direction of the head of a moving vehicle according to the rotation angle of the camera, and enabling the dynamic target to move on the two-dimensional map by continuously updating the real-time position of the dynamic target.
(3) The mapping results of (1) and (2) are then combined, finally completing the integration of the surveillance video and the two-dimensional map.
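A sketch of this fifth step under the same assumptions follows, reusing the hypothetical image_to_geo helper from the previous sketch: the four image corners give the view trapezoid, the per-side factors ε_i place the corrected background at map scale (formula 16), and each dynamic-target centroid is converted to 2D geographic coordinates.

```python
import numpy as np

def view_trapezoid(width, height, f, P, T, C, ground_z):
    """Geographic footprint of the video frame: map its four corners with formula 10."""
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    return [image_to_geo(x, y, f, P, T, C, ground_z)[:2] for x, y in corners]

def scale_factors(trapezoid, side_lengths_px, s):
    """Per-side scaling factors eps_i = l_i / PL_i, with l_i = s * L_i (formula 16)."""
    eps = []
    for i in range(4):
        a = np.asarray(trapezoid[i], dtype=float)
        b = np.asarray(trapezoid[(i + 1) % 4], dtype=float)
        L_i = np.linalg.norm(b - a)               # side length in geographic units
        eps.append(s * L_i / side_lengths_px[i])  # PL_i: image side length in pixels
    return eps

def target_to_map(centre_px, f, P, T, C, ground_z):
    """Convert a dynamic-target centroid from pixel to 2D geographic coordinates."""
    X, Y, _ = image_to_geo(centre_px[0], centre_px[1], f, P, T, C, ground_z)
    return X, Y
```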
Example 2:
first step, preparation of related equipment: a portable notebook computer is prepared, and one high-definition monitoring camera is arranged.
Secondly, separating the background image of the static video from the foreground dynamic target: establishing a video background by using a Vibe algorithm, extracting a foreground dynamic target according to an established video background image, and sequentially obtaining an original image, a video background image and a video dynamic target as a result shown in figure 2; in fig. 2, the first line is the processing result of the 75 th frame of the video, and the second line is the processing result of the 359 th frame of the video.
Thirdly, geometrically correcting the background image of the static video:
(1) A decision tree is constructed using the urban road traffic images provided by the LabelMe database as the training picture set. The flat road surface of the experimental video data is segmented based on the constructed decision tree; the segmentation result is shown in FIG. 3.
(2) And performing geometric correction on the video background image after segmentation. Fig. 3-3 is the image to be corrected, and the homography matrix H of the correction transformation is solved by the internal and external parameters of the monitoring camera, and the result is shown in fig. 4 after the correction is performed on fig. 3-3.
Fourthly, integrating the video moving and static targets with a two-dimensional map:
and mapping the corrected static video background image and the extracted foreground dynamic target to a two-dimensional map.
The method uses OpenLayers to call Google map tiles as the base map, determines the mapping model between the video image and the 2D geographic data from the camera's internal and external parameters, and maps the video's moving and static targets onto the 2D geographic data; the video background image and the dynamic targets can be overlaid on the Google map by calling the ol.layer.Image() and ol.style.Icon() classes of OpenLayers. The result of the experiment integrating the video with the two-dimensional map is shown in FIG. 5.
Example 3:
The embodiment discloses a system for fusing a large-dip-angle video and a 2D map, which comprises a network interface, a memory and a processor, wherein:
the network interface is used for receiving and sending signals in the process of receiving and sending information with other external network elements;
a memory for storing computer program instructions executable on the processor;
a processor for executing the steps of the method for fusing a high-tilt-angle video with a 2D map in embodiment 1 when executing the computer program instructions.
Example 4:
the present embodiment discloses a computer storage medium storing a program of a fusion method of a high-tilt-angle video and a 2D map, which when executed by at least one processor, implements the steps of the fusion method of a high-tilt-angle video and a 2D map of embodiment 1.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only intended to illustrate the technical solution of the present invention and not to limit the same, and a person of ordinary skill in the art can make modifications or equivalents to the specific embodiments of the present invention with reference to the above embodiments, and such modifications or equivalents without departing from the spirit and scope of the present invention are within the scope of the claims of the present invention as set forth in the claims.

Claims (9)

1. A method for fusing a large-dip-angle video and a 2D map, characterized by comprising the following steps:
s1: establishing a mutual mapping model of the monitoring video and the 2D geographic space data according to the camera parameters;
s2: according to the mutual mapping model, mapping the front-view static video background image of the surveillance video and the foreground dynamic target of the surveillance video to the 2D map to complete the integrated expression of the surveillance video and the 2D map;
wherein the step of mapping the front-view static video background image in the surveillance video onto the 2D map comprises:
acquiring a corresponding view trapezoid of the surveillance video in the geographic space according to the mutual mapping model of the surveillance video and the 2D geographic space data established in the S1;
according to the coordinates (X_i, Y_i), i ∈ [1,4], of the four corner points of the view trapezoid, calculating the side length L_i, i ∈ [1,4], of each side of the view trapezoid and the corresponding distance l_i, i ∈ [1,4], of each side on the map: l_i = s × L_i, where s is the scale of the 2D map;
according to the length PL_i of each side of the front-view static video background image in the surveillance video and its actual length l_i on the map, calculating the scaling factor ε_i = l_i / PL_i;
based on the scaling factor, scaling the front-view static video background image, then rotating and translating it according to the coordinates of the camera's center point and the camera's rotation and pitch angles, so that the static video background image is mapped to the correct position on the 2D map;
the step of mapping foreground dynamic objects in the surveillance video onto the 2D map comprises:
calculating the center coordinate Centre of each foreground dynamic target in the surveillance video as Centre = (1/M) Σ_{i=1}^{M} (x_i, y_i), where M is the number of pixels of the foreground dynamic target and (x_i, y_i) are the pixel coordinates of the foreground dynamic target;
according to the scaling factor ε_i, scaling the foreground dynamic target in the same proportion;
converting the center coordinates of each foreground dynamic target into 2D geographical coordinates according to a mutual mapping model of the monitoring video and the 2D geographical space data;
and mapping the foreground dynamic target of the monitoring video to a 2D map according to the central coordinates of each foreground dynamic target and the movement direction of the foreground dynamic target, and updating the position of the foreground dynamic target in real time.
2. The method for fusing a large-inclination-angle video and a 2D map according to claim 1, wherein the acquisition of the front-view static video background image of the surveillance video comprises the following steps:
obtaining a static video background image of a monitoring video according to a video background modeling technology;
and performing geometric correction on the static video background image to obtain a corresponding front-view image.
3. The method for fusing a large-inclination-angle video and a 2D map according to claim 1, wherein the step of extracting the foreground dynamic target in the surveillance video comprises:
performing AND operation on the foreground dynamic target binary image obtained by the three-frame difference method and the foreground dynamic target binary image obtained by the background subtraction method to obtain a final foreground dynamic target;
and obtaining the position of the foreground dynamic target by analyzing the connected domain of the foreground dynamic target.
4. The method for fusing a large-inclination-angle video and a 2D map according to claim 3, wherein the foreground dynamic target binary image is obtained through the following steps:
obtaining a static video background image of a monitoring video according to a video background modeling technology;
respectively extracting foreground dynamic targets from a static video background image according to a three-frame difference method and a background subtraction method to respectively obtain preliminary foreground dynamic targets;
each frame of the surveillance video is differenced against the preliminary foreground dynamic targets to obtain the difference values g_1 and g_2 for each pixel;
if g_1 > k_1 or g_2 > k_2, where k_1 and k_2 are adaptive thresholds derived from the average gray value of the static video background image, the pixel point is marked as 1 and all other points are marked as 0, giving the binary image of the video foreground dynamic target.
5. The method for fusing a large-inclination-angle video and a 2D map according to claim 2, wherein, before the step of geometrically correcting the still video background image, the method comprises the following steps:
performing super-pixel segmentation on a static video background image;
based on the prior knowledge of the ground and the non-ground, constructing a decision tree which takes the image characteristics extracted from the segmented static video background image as a classification basis;
classifying the segmented static video background image into horizontal ground and non-ground using the decision tree, to obtain the ground part and the non-ground part of the static video background image;
the step of geometrically correcting the still video background image comprises: correcting the ground part in the static video background image into an orthographic image by adopting a homography matrix;
if the front-view image has the void point, acquiring a corresponding point of the void point on the static video background image, calculating a gray value of the corresponding point of the void point on the static video background image by using a bilinear interpolation method, and further acquiring the gray value of the void point to obtain the finally corrected static video background image.
6. The method for fusing a large-inclination-angle video and a 2D map according to claim 1, wherein the moving direction of the foreground dynamic target is determined by the rotation angle of the camera.
7. The method for fusing a large-inclination-angle video and a 2D map according to claim 1, wherein the mutual mapping model of the surveillance video and the 2D geospatial data comprises a mapping model from the video image space to the geospatial data and a model from the geospatial data to the video image space;
the mapping model of the video image space to the geospatial data is expressed as:
(X_G, Y_G, Z_G)^T = (X_C, Y_C, Z_C)^T + λ·P·T·(f, x, y)^T
where (X_G, Y_G, Z_G) are the spatial coordinates of the target, (X_C, Y_C, Z_C) are the coordinates of the camera's optical center, (f, x, y) is the line-of-sight vector, P and T are the rotation matrices of the camera, and λ is the ray-extension parameter;
the mapping model from the geospatial data to the video image space is represented as:
λ·P·T·(f, x, y)^T = (X_G - X_C, Y_G - Y_C, Z_G - Z_C)^T.
8. The method for fusing a large-inclination-angle video and a 2D map according to claim 2 or 4, wherein the video background modeling technology adopts the Vibe algorithm.
9. The method for fusing a large-inclination-angle video and a 2D map according to claim 5, wherein the superpixel segmentation of the static video background image is performed using the SLIC superpixel segmentation algorithm.
CN201910350808.5A 2019-04-28 2019-04-28 Method for fusing large-dip-angle video and 2D map Expired - Fee Related CN110245199B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910350808.5A CN110245199B (en) 2019-04-28 2019-04-28 Method for fusing large-dip-angle video and 2D map

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910350808.5A CN110245199B (en) 2019-04-28 2019-04-28 Method for fusing large-dip-angle video and 2D map

Publications (2)

Publication Number Publication Date
CN110245199A CN110245199A (en) 2019-09-17
CN110245199B true CN110245199B (en) 2021-10-08

Family

ID=67883630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910350808.5A Expired - Fee Related CN110245199B (en) 2019-04-28 2019-04-28 Method for fusing large-dip-angle video and 2D map

Country Status (1)

Country Link
CN (1) CN110245199B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110764110B (en) * 2019-11-12 2022-04-08 深圳创维数字技术有限公司 Path navigation method, device and computer readable storage medium
CN112040265B (en) * 2020-09-09 2022-08-09 河南省科学院地理研究所 Multi-camera collaborative geographic video live broadcast stream generation method
CN112967214A (en) * 2021-02-18 2021-06-15 深圳市慧鲤科技有限公司 Image display method, device, equipment and storage medium
CN113033348A (en) * 2021-03-11 2021-06-25 北京文安智能技术股份有限公司 Overlook image correction method for pedestrian re-recognition, storage medium, and electronic device
CN113297950B (en) * 2021-05-20 2023-02-17 首都师范大学 Dynamic target detection method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101277429A (en) * 2007-03-27 2008-10-01 中国科学院自动化研究所 Method and system for amalgamation process and display of multipath video information when monitoring
WO2014170886A1 (en) * 2013-04-17 2014-10-23 Digital Makeup Ltd System and method for online processing of video images in real time
CN104581018A (en) * 2013-10-21 2015-04-29 北京航天长峰科技工业集团有限公司 Video monitoring method for realizing two-dimensional map and satellite image interaction
CN106780541A (en) * 2016-12-28 2017-05-31 南京师范大学 A kind of improved background subtraction method
CN107197200A (en) * 2017-05-22 2017-09-22 北斗羲和城市空间科技(北京)有限公司 It is a kind of to realize the method and device that monitor video is shown
CN108389396A (en) * 2018-02-28 2018-08-10 北京精英智通科技股份有限公司 A kind of vehicle matching process, device and charge system based on video
CN108960566A (en) * 2018-05-29 2018-12-07 高新兴科技集团股份有限公司 A kind of traffic Visualized Monitoring System

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7730406B2 (en) * 2004-10-20 2010-06-01 Hewlett-Packard Development Company, L.P. Image processing system and method

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101277429A (en) * 2007-03-27 2008-10-01 中国科学院自动化研究所 Method and system for amalgamation process and display of multipath video information when monitoring
WO2014170886A1 (en) * 2013-04-17 2014-10-23 Digital Makeup Ltd System and method for online processing of video images in real time
CN104581018A (en) * 2013-10-21 2015-04-29 北京航天长峰科技工业集团有限公司 Video monitoring method for realizing two-dimensional map and satellite image interaction
CN106780541A (en) * 2016-12-28 2017-05-31 南京师范大学 A kind of improved background subtraction method
CN107197200A (en) * 2017-05-22 2017-09-22 北斗羲和城市空间科技(北京)有限公司 It is a kind of to realize the method and device that monitor video is shown
CN108389396A (en) * 2018-02-28 2018-08-10 北京精英智通科技股份有限公司 A kind of vehicle matching process, device and charge system based on video
CN108960566A (en) * 2018-05-29 2018-12-07 高新兴科技集团股份有限公司 A kind of traffic Visualized Monitoring System

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A moving target detection algorithm based on background subtraction and three-frame difference; Mo Lin et al.; Microcomputer Information; 2009-04-25; Vol. 25, No. 4-3; pp. 274-276 *
Mutual mapping between surveillance video and 2D geospatial data; Zhang Xing et al.; Geomatics and Information Science of Wuhan University; 2015-07-03; Vol. 40, No. 8; pp. 1130-1136 *
Research on the integration of surveillance video and two-dimensional maps; Liu Yang; China Master's Theses Full-text Database, Basic Sciences; 2020-05-15; No. 05 (2020); pp. A008-84 *

Also Published As

Publication number Publication date
CN110245199A (en) 2019-09-17

Similar Documents

Publication Publication Date Title
CN110245199B (en) Method for fusing large-dip-angle video and 2D map
US11080911B2 (en) Mosaic oblique images and systems and methods of making and using same
US10949978B2 (en) Automatic background replacement for single-image and multi-view captures
CN107292965B (en) Virtual and real shielding processing method based on depth image data stream
CN109003325B (en) Three-dimensional reconstruction method, medium, device and computing equipment
CN110288657B (en) Augmented reality three-dimensional registration method based on Kinect
CN110033475B (en) Aerial photograph moving object detection and elimination method based on high-resolution texture generation
CN107843251A (en) The position and orientation estimation method of mobile robot
CN106803275A (en) Estimated based on camera pose and the 2D panoramic videos of spatial sampling are generated
CN109712247B (en) Live-action training system based on mixed reality technology
CN110211169B (en) Reconstruction method of narrow baseline parallax based on multi-scale super-pixel and phase correlation
CN108830925B (en) Three-dimensional digital modeling method based on spherical screen video stream
CN105005964A (en) Video sequence image based method for rapidly generating panorama of geographic scene
CN110941996A (en) Target and track augmented reality method and system based on generation of countermeasure network
CN106530407A (en) Three-dimensional panoramic splicing method, device and system for virtual reality
Kuschk Large scale urban reconstruction from remote sensing imagery
CN114782628A (en) Indoor real-time three-dimensional reconstruction method based on depth camera
JP2023502793A (en) Method, device and storage medium for generating panoramic image with depth information
CN115272494B (en) Calibration method and device for camera and inertial measurement unit and computer equipment
Wang et al. Terrainfusion: Real-time digital surface model reconstruction based on monocular slam
CN103617631A (en) Tracking method based on center detection
CN114241372A (en) Target identification method applied to sector-scan splicing
CN107767393B (en) Scene flow estimation method for mobile hardware
CN110738696B (en) Driving blind area perspective video generation method and driving blind area view perspective system
CN112102504A (en) Three-dimensional scene and two-dimensional image mixing method based on mixed reality

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20211008