CN113642463B - Space-ground multi-view alignment method for video surveillance and remote sensing images - Google Patents

Space-ground multi-view alignment method for video surveillance and remote sensing images

Info

Publication number
CN113642463B
CN113642463B (application CN202110930060.3A)
Authority
CN
China
Prior art keywords
image
monitoring camera
remote sensing
picture
longitude
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110930060.3A
Other languages
Chinese (zh)
Other versions
CN113642463A (en)
Inventor
梁华
李晓威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Fu'an Digital Technology Co ltd
Original Assignee
Guangzhou Fu'an Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Fu'an Digital Technology Co., Ltd.
Priority to CN202110930060.3A
Publication of CN113642463A
Application granted
Publication of CN113642463B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a space-ground multi-view alignment method for video surveillance and remote sensing images. A low-precision mapping relation between surveillance-camera picture coordinates and longitude-latitude coordinates is first established by aligning the two coordinate systems, and the converted longitude-latitude coordinates are used to obtain the corresponding target area on the remote sensing image. Instance segmentation is then performed separately on the surveillance-camera picture and the remote sensing target area to obtain an image-block information set Φ of all objects in the camera picture and an image-block information set μ of all objects in the remote sensing target area, and the features of each image block are extracted. According to the features, categories and mapped position relations of the image blocks, the Hungarian algorithm is used to coarsely match the image blocks in the camera picture with the image blocks in the remote sensing target area, giving a set of corresponding image-block pairs. Finally, data selected from this set of image-block pairs is used to establish an accurate mapping relation between the surveillance-camera picture and the remote sensing image.

Description

Space-ground multi-view alignment method for video surveillance and remote sensing images
Technical Field
The invention relates to the field of target detection, and in particular to a space-ground multi-view alignment method for video surveillance and remote sensing images.
Background
Image registration is the process of geometrically aligning two images of the same area acquired at different times, from different viewpoints, or by different sensors. It is a precondition and basis for applications such as image fusion and change detection, and its precision has an important influence on subsequent applications.
Feature-based image registration is currently one of the most common registration approaches. Its greatest advantage is that analyses of the whole image can be converted into analyses of image feature information, namely feature points, feature curves, edges and small regions, which greatly reduces the computation in image processing. It also adapts well to gray-scale changes, image deformation and occlusion, enabling fast and accurate registration under complex imaging conditions. Because of the influence of noise, shooting conditions, seasonal changes, viewing-angle changes, platform jitter and similar factors, feature-based registration is particularly suitable for remote sensing image registration.
If the matched images contain a large number of repeated features, for example identical windows or patterns on a building facade, traditional image matching methods cannot distinguish these features effectively, and ultimately the two images cannot be matched.
In recent years, several recognition and detection methods that combine surveillance video with remote sensing satellites have been proposed, for example:
The invention patent with publication number CN 108132979A discloses a port ground-feature monitoring method and system based on remote sensing images. In that method, an original ground-feature database is established: a three-dimensional scanner scans all ground features to be monitored at a port from the same spatial height to obtain a set of three-dimensional remote sensing images, and the corresponding real photographs are recorded and stored. A ground-feature coordinate system is established by selecting the origin of a three-dimensional coordinate system in a relatively open area near the port edge and simulating the X axis along the port edge. The ground features are classified by combining their actual sizes and shapes. Ground-feature boundary points are extracted by scanning the port with the three-dimensional scanner in a 360-degree panorama to obtain a discrete point set of the ground-feature boundaries. For real-time monitoring, the scanned discrete point set is transmitted to a data processing unit for three-dimensional remote-sensing graphic simulation, the three-dimensional remote sensing image with the highest similarity is retrieved from the original database, and it is further matched with the corresponding real photograph. However, if the matched images contain a large number of repeated features, this method still cannot distinguish them effectively.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a space-ground multi-view alignment method for video surveillance and remote sensing images. An instance segmentation model is constructed to segment the surveillance-camera picture and the remote sensing target area, yielding an image-block information set Φ of all objects in the camera picture and an image-block information set μ of all objects in the remote sensing target area; separating the objects from the background by segmentation reduces the interference of the image background and improves image-matching efficiency. The Hungarian algorithm is then used to compute the optimal matching between image blocks in the camera picture and image blocks in the remote sensing target area, improving matching accuracy; the resulting block correspondences establish an accurate mapping relation between the surveillance picture and the remote sensing image, enabling fast and accurate registration under complex imaging conditions.
In order to achieve the object of the invention, the technical scheme adopted by the invention is as follows:
A space-ground multi-view alignment method for video surveillance and remote sensing images comprises the following steps:
Step S1: align the surveillance-camera picture coordinates with longitude-latitude coordinates to establish a low-precision mapping relation between them, convert the four picture coordinates of the upper-left, lower-left, upper-right and lower-right corners of the camera picture into longitude-latitude coordinates, and obtain the corresponding target area on the remote sensing image from the converted longitude-latitude coordinates;
Step S2: perform instance segmentation separately on the surveillance-camera picture and the remote sensing target area to obtain an image-block information set Φ of all objects in the camera picture and an image-block information set μ of all objects in the remote sensing target area, and extract the features of each image block;
Step S3: according to the features, categories and mapped position relations of the image blocks from step S2, use the Hungarian algorithm to coarsely match the image blocks in the camera picture with the image blocks in the remote sensing target area, obtaining a set of corresponding image-block pairs;
Step S4: select data from the set of corresponding image-block pairs of step S3, and establish an accurate mapping relation between the surveillance-camera picture and the remote sensing image from the correspondence of the several groups of image-block pairs.
Preferably, in step S1, the low-precision mapping relation between the surveillance-camera picture coordinates and the longitude-latitude coordinates is established as follows:
Step 1.1: according to the haversine formula, calculate the straight-line horizontal distance d_i (in m) between O', the vertical projection of the camera position onto the horizontal plane, and an arbitrary position A_i on the horizontal plane within the camera's visible range, as well as the longitude-direction horizontal distance s_i (in m) between O' and A_i:

a = sin²((ψ_i − ψ_0)/2) + cos ψ_0 · cos ψ_i · sin²((λ_i − λ_0)/2)

d_i = 2r · arcsin(√a)

b = cos ψ_0 · cos ψ_i · sin²((λ_i − λ_0)/2)

s_i = 2r · arcsin(√b)

where a and b are intermediate values, O'(λ_0, ψ_0) is the vertical projection of the camera position onto the horizontal plane, A_i(λ_i, ψ_i) is an arbitrary position on the horizontal plane within the camera's visible range, and r is the Earth's radius in m;

Step 1.2: from step 1.1, calculate the angle β_i between the line O'A_i and geographic true north:

β_i = arcsin(s_i / d_i)

Step 1.3: from step 1.1, calculate the angle θ_i between the line OA_i and the vertical:

θ_i = arctan(d_i / H)

where H is the height of the surveillance camera above the horizontal plane, in m;

Step 1.4: calculate the picture coordinates (x_i, y_i) of A_i in the surveillance-camera picture:

x_i = X/2 + (β_i − β) · X/ω_x

y_i = Y/2 + (θ_i − θ) · Y/ω_y

where X is the pixel width of the image and Y the pixel height, both given by the camera's image resolution X × Y; θ is the angle between the camera's center line and the vertical, β is the angle between the horizontal projection of the center line and geographic true north, ω_x is the camera's horizontal field angle, and ω_y is its vertical field angle.
Preferably, in step S2, an instance segmentation model is constructed to perform instance segmentation on the surveillance-camera picture and the remote sensing target area; all objects in both are segmented by the model to obtain an image-block information set Φ of all objects in the camera picture and an image-block information set μ of all objects in the remote sensing target area. The global features extracted for each image block include color, texture, and shape features, giving a feature set for each image block.
Preferably, the instance segmentation model is Mask R-CNN, whose backbone network is the feature pyramid network FPN; Mask R-CNN further comprises a region proposal network RPN, a RoIAlign layer (a layer specialized for object detection), and a deconvolution network Deconv. The FPN extracts features from the input surveillance-camera picture and remote-sensing target-area image to generate a feature map; the feature map is fed into the RPN; the RoIAlign layer processes the feature map together with the RPN output; and the result is fed into the deconvolution network Deconv to generate the predicted masks.
Preferably, matching between the image blocks in the surveillance-camera picture and the image blocks in the remote sensing target area is performed according to the blocks' categories, mapped-position similarity, and features, and a category label is obtained for each image block from its category. The matching process between image blocks in the camera picture and image blocks in the remote sensing target area in step S3 comprises the following steps:
Step S3.1: construct a bipartite graph G = (X, Y, E), where X represents the set of image blocks in the surveillance-camera picture, Y represents the set of image blocks in the remote sensing image, and E represents the set of edges between all nodes in X and all nodes in Y. The edge set E is constructed by the following rule: if Sim(x, y) > 0 for x ∈ X and y ∈ Y, an edge (x, y) is connected between the corresponding vertices x and y in the bipartite graph G, and the weight of the edge is set to w_xy = Sim(x, y), where Sim(x, y) represents the similarity of the features of x and y (x ∈ X, y ∈ Y);
Step S3.2: using the category label of each image block, connect image blocks of the same category in the surveillance-camera picture and the remote sensing image, which yields one-to-one and one-to-many links;
Step S3.3: using the low-precision mapping relation between the surveillance-camera picture coordinates and the longitude-latitude coordinates from step S1, and taking the center point of each image block as its coordinates, convert the image-block coordinates in the camera picture into longitude-latitude coordinates, obtaining the converted coordinate set A = {⟨λ_i, ψ_i⟩}, where ⟨λ_i, ψ_i⟩ is the converted longitude-latitude coordinate of the i-th image block in the camera picture, λ_i the longitude and ψ_i the latitude; obtain from the remote sensing image map the image-block coordinate set B = {⟨ξ_j, ε_j⟩}, where ⟨ξ_j, ε_j⟩ is the longitude-latitude coordinate of the j-th image block in the remote sensing target area, ξ_j the longitude and ε_j the latitude; set a threshold pair (lon, lat) and keep a connection if the following conditions are met, deleting it otherwise:

|λ_i − ξ_j| < lon

|ψ_i − ε_j| < lat
Screening by position similarity in this way yields a matching set between image blocks with small error;
Step S3.4: according to the matching set between image blocks obtained in step S3.3 and the corresponding feature sets, obtain the optimal matching with the Hungarian algorithm, realizing the corresponding matching between image blocks in the surveillance-camera picture and image blocks in the remote sensing target area.
Preferably, the specific process of step S3.4 is as follows:
In a subgraph P of the bipartite graph G in which no two edges of the edge set of P share a vertex, P is called a matching. The optimal matching of the bipartite graph is computed with the Hungarian algorithm: the image-block features in the surveillance-camera picture are matched with the image-block features in the remote sensing target area to obtain matched feature pairs, and the similarity between image blocks in the camera picture and image blocks in the remote sensing target area is then computed from the similarity of the matched feature pairs. This establishes the correspondence between the image blocks and yields the set of corresponding image-block pairs.
Preferably, in step S4, the accurate mapping relation between the surveillance-camera picture and the remote sensing image is established as follows:
Three groups of data are selected each time from the set of corresponding image-block pairs of step S3, a coordinate mapping relation is established from the correspondence of the several groups of image-block pairs, and several transformation matrices M_i are obtained by inverse-matrix computation:

M_i = [lon_i1 lon_i2 lon_i3; lat_i1 lat_i2 lat_i3; 1 1 1] · [x_i1 x_i2 x_i3; y_i1 y_i2 y_i3; 1 1 1]^(−1)

where (lon_i1, lat_i1), (lon_i2, lat_i2), (lon_i3, lat_i3) are the three groups of longitude-latitude coordinates of image blocks in the remote sensing target area, and (x_i1, y_i1), (x_i2, y_i2), (x_i3, y_i3) are the three corresponding groups of image-block coordinates in the surveillance-camera picture;
The average of the several transformation matrices M_i is taken:

M = (1/n) · Σ_{i=1}^{n} M_i
The accurate conversion relation between the surveillance-camera picture coordinates and the remote-sensing longitude-latitude coordinates is then:

(lon, lat, 1)^T = M · (x, y, 1)^T

where (lon, lat) are the longitude-latitude coordinates on the remote sensing image, (x, y) are the surveillance-camera picture coordinates, and M is the transformation matrix.
Compared with the prior art, the invention has the beneficial technical effects that:
the method and the device perform example segmentation on the monitoring camera picture and the remote sensing image target area by constructing an example segmentation model to obtain the image block information set phi of all objects in the monitoring camera picture and the image block information set mu of all objects in the remote sensing image target area, and separate the objects from the background by semantic segmentation, thereby reducing the interference of the image background and improving the image matching efficiency. And then the Hungarian algorithm is utilized to obtain the optimal matching, the corresponding matching between the image blocks in the picture of the monitoring camera and the image blocks in the target area of the remote sensing image is realized, the accuracy of the image matching is improved, the corresponding relation is established between the image blocks of the monitoring camera and the remote sensing image, the accurate mapping relation between the monitoring picture and the remote sensing image is realized, and the rapid and accurate registration of the image under the complex imaging condition can be realized.
Drawings
FIG. 1 is a flow chart of the space-ground multi-view alignment method for video surveillance and remote sensing images according to an embodiment of the invention;
FIG. 2 is a first schematic diagram of the calculation of the mapping relation between longitude-latitude coordinates and surveillance-camera picture coordinates in an embodiment of the invention;
FIG. 3 is a second schematic diagram of the calculation of the mapping relation between longitude-latitude coordinates and surveillance-camera picture coordinates in an embodiment of the invention;
FIG. 4 is a schematic diagram of the coordinate position of the surveillance camera in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments, but the scope of the present invention is not limited to the following embodiments.
Examples
Referring to FIG. 1, the present embodiment discloses a space-ground multi-view alignment method for video surveillance and remote sensing images, which comprises the following steps:
Step S1: align the surveillance-camera picture coordinates with longitude-latitude coordinates to establish a low-precision mapping relation between them, convert the four picture coordinates of the upper-left, lower-left, upper-right and lower-right corners of the camera picture into longitude-latitude coordinates, and obtain the corresponding target area on the remote sensing image from the converted longitude-latitude coordinates;
Step S2: perform instance segmentation separately on the surveillance-camera picture and the remote sensing target area to obtain an image-block information set Φ of all objects in the camera picture and an image-block information set μ of all objects in the remote sensing target area, and extract the features of each image block;
Step S3: according to the features, categories and mapped position relations of the image blocks from step S2, use the Hungarian algorithm to coarsely match the image blocks in the camera picture with the image blocks in the remote sensing target area, obtaining a set of corresponding image-block pairs;
Step S4: select data from the set of corresponding image-block pairs of step S3, and establish an accurate mapping relation between the surveillance-camera picture and the remote sensing image from the correspondence of the several groups of image-block pairs.
In step S1, the low-precision mapping relation between the surveillance-camera picture coordinates and the longitude-latitude coordinates is established as follows:
Step S1.0: parameter acquisition and preparation:

As shown in FIGS. 2 to 4, where N in FIG. 3 denotes geographic true north, measure the height H (in m) of the surveillance camera above the horizontal plane, the angle θ between the camera's center line and the vertical, the angle β between the horizontal projection of the center line and geographic true north, the camera's horizontal field angle ω_x and vertical field angle ω_y, and obtain the camera's image resolution X × Y, where X is the pixel width and Y the pixel height;

Assume the coordinates of the surveillance-camera center are (0, 0); the vertical projection of the camera position O onto the horizontal plane is O', with longitude-latitude (λ_0, ψ_0); for any position A_i on the horizontal plane within the camera's visible range, its longitude-latitude coordinates (λ_i, ψ_i) can then be converted into surveillance-camera picture coordinates (x_i, y_i) as follows;

Step 1.1: according to the haversine formula, calculate the straight-line horizontal distance d_i (in m) between O' and an arbitrary position A_i on the horizontal plane within the camera's visible range, as well as the longitude-direction horizontal distance s_i (in m) between O' and A_i:

a = sin²((ψ_i − ψ_0)/2) + cos ψ_0 · cos ψ_i · sin²((λ_i − λ_0)/2)

d_i = 2r · arcsin(√a)

b = cos ψ_0 · cos ψ_i · sin²((λ_i − λ_0)/2)

s_i = 2r · arcsin(√b)

where a and b are intermediate values, O'(λ_0, ψ_0) is the vertical projection of the camera position onto the horizontal plane, A_i(λ_i, ψ_i) is an arbitrary position on the horizontal plane within the camera's visible range, and r is the Earth's radius in m;

Step 1.2: from step 1.1, calculate the angle β_i between the line O'A_i and geographic true north:

β_i = arcsin(s_i / d_i)

Step 1.3: from step 1.1, calculate the angle θ_i between the line OA_i and the vertical:

θ_i = arctan(d_i / H)

where H is the height of the surveillance camera above the horizontal plane, in m;

Step 1.4: calculate the picture coordinates (x_i, y_i) of A_i in the surveillance-camera picture:

x_i = X/2 + (β_i − β) · X/ω_x

y_i = Y/2 + (θ_i − θ) · Y/ω_y

where X is the pixel width of the image and Y the pixel height, both given by the camera's image resolution X × Y; θ is the angle between the camera's center line and the vertical, β is the angle between the horizontal projection of the center line and geographic true north, ω_x is the camera's horizontal field angle, and ω_y is its vertical field angle.
With the low-precision mapping relation between camera picture coordinates and longitude-latitude coordinates established by the above method, the four picture coordinates of the upper-left, lower-left, upper-right and lower-right corners of the camera picture are converted into longitude-latitude coordinates, the converted coordinates delimit the corresponding target area on the remote sensing image, and the position information of each image block is further obtained.
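For illustration, the following Python sketch assembles steps 1.1 to 1.4 into one conversion routine. It is a minimal sketch, not the patent's implementation: it assumes angles in radians, a spherical Earth of radius r, and the linear angle-to-pixel mapping written above; the function name latlon_to_pixel is illustrative.

```python
import math

def latlon_to_pixel(lam_i, psi_i, lam_0, psi_0, H,
                    theta, beta, omega_x, omega_y, X, Y, r=6371000.0):
    """Convert a ground point (lam_i, psi_i) to camera picture coordinates.

    All angles are in radians; H and r are in metres. (lam_0, psi_0) is O',
    the vertical projection of the camera position onto the horizontal plane.
    """
    # Step 1.1: haversine great-circle distance d_i between O' and A_i,
    # and the longitude-direction component s_i (haversine with equal latitudes).
    a = (math.sin((psi_i - psi_0) / 2) ** 2
         + math.cos(psi_0) * math.cos(psi_i) * math.sin((lam_i - lam_0) / 2) ** 2)
    d_i = 2 * r * math.asin(math.sqrt(a))
    b = math.cos(psi_0) * math.cos(psi_i) * math.sin((lam_i - lam_0) / 2) ** 2
    s_i = 2 * r * math.asin(math.sqrt(b))

    # Step 1.2: bearing of A_i from O' relative to geographic true north.
    beta_i = math.asin(min(1.0, s_i / d_i)) if d_i > 0 else 0.0

    # Step 1.3: angle of the ray O-A_i relative to the vertical.
    theta_i = math.atan(d_i / H)

    # Step 1.4: linear field-of-view mapping into the X x Y picture.
    x_i = X / 2 + (beta_i - beta) * X / omega_x
    y_i = Y / 2 + (theta_i - theta) * Y / omega_y
    return x_i, y_i
```

In step S1 the inverse of this mapping would be applied to the four picture corners to delimit the target area on the remote sensing image.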
in the step S2, instance segmentation is carried out on the monitoring camera picture and the remote sensing image target area by constructing an instance segmentation model, and all objects in the monitoring camera picture and the remote sensing image target area are segmented through the instance segmentation model to obtain an image block information set phi of all objects in the monitoring camera picture and an image block information set mu of all objects in the remote sensing image target area; extracting the global features of each image block comprises: color features, texture features and shape features to obtain a feature set of each image block.
The instance segmentation model is Mask R-CNN, whose backbone network is the feature pyramid network FPN; Mask R-CNN further comprises a region proposal network RPN, a RoIAlign layer (a layer specialized for object detection), and a deconvolution network Deconv. The FPN extracts features from the input surveillance-camera picture and remote-sensing target-area image to generate a feature map; the feature map is fed into the RPN; the RoIAlign layer processes the feature map together with the RPN output; and the result is fed into the deconvolution network Deconv to generate the predicted masks.
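As a hedged illustration of this architecture, the sketch below uses torchvision's off-the-shelf Mask R-CNN (ResNet-50 + FPN backbone, COCO-pretrained) as a stand-in for the patent's trained model; the helper name segment_blocks and the thresholds are assumptions, not part of the patent.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# COCO-pretrained Mask R-CNN with an FPN backbone (stand-in model).
model = maskrcnn_resnet50_fpn(pretrained=True).eval()

def segment_blocks(image, score_thresh=0.5, mask_thresh=0.5):
    """Return (binary mask, category label, box) for each detected object.

    `image` is a float tensor of shape (3, H, W) with values in [0, 1].
    """
    with torch.no_grad():
        out = model([image])[0]          # dict: boxes, labels, scores, masks
    blocks = []
    for mask, label, score, box in zip(out["masks"], out["labels"],
                                       out["scores"], out["boxes"]):
        if score >= score_thresh:        # keep confident detections only
            blocks.append((mask[0] > mask_thresh, int(label), box))
    return blocks
```

Each returned mask delimits one image block; the label serves as the category label used in step S3.2, and the global color, texture and shape features are then extracted per block.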
Matching between the image blocks in the surveillance-camera picture and the image blocks in the remote sensing target area is performed according to the blocks' categories, mapped-position similarity, and features, and a category label is obtained for each image block from its category. The matching process between image blocks in the camera picture and image blocks in the remote sensing target area in step S3 comprises the following steps:
Step S3.1: construct a bipartite graph G = (X, Y, E), where X represents the set of image blocks in the surveillance-camera picture, Y represents the set of image blocks in the remote sensing image, and E represents the set of edges between all nodes in X and all nodes in Y. The edge set E is constructed by the following rule: if Sim(x, y) > 0 for x ∈ X and y ∈ Y, an edge (x, y) is connected between the corresponding vertices x and y in the bipartite graph G, and the weight of the edge is set to w_xy = Sim(x, y), where Sim(x, y) represents the similarity of the features of x and y (x ∈ X, y ∈ Y);
Step S3.2: using the category label of each image block, connect image blocks of the same category in the surveillance-camera picture and the remote sensing image, which yields one-to-one and one-to-many links;
Step S3.3: using the low-precision mapping relation between the surveillance-camera picture coordinates and the longitude-latitude coordinates from step S1, and taking the center point of each image block as its coordinates, convert the image-block coordinates in the camera picture into longitude-latitude coordinates, obtaining the converted coordinate set A = {⟨λ_i, ψ_i⟩}, where ⟨λ_i, ψ_i⟩ is the converted longitude-latitude coordinate of the i-th image block in the camera picture, λ_i the longitude and ψ_i the latitude; obtain from the remote sensing image map the image-block coordinate set B = {⟨ξ_j, ε_j⟩}, where ⟨ξ_j, ε_j⟩ is the longitude-latitude coordinate of the j-th image block in the remote sensing target area, ξ_j the longitude and ε_j the latitude; set a threshold pair (lon, lat) and keep a connection if the following conditions are met, deleting it otherwise:

|λ_i − ξ_j| < lon

|ψ_i − ε_j| < lat
Screening by position similarity in this way yields a matching set between image blocks with small error;
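A minimal sketch of this position screening; the list-based representation of the coordinate sets A and B and the function name are assumptions for illustration.

```python
def screen_by_position(A, B, lon_thresh, lat_thresh):
    """Keep candidate pairs (i, j) whose converted camera-block coordinates
    and remote-sensing block coordinates differ by less than the thresholds.

    A: list of (longitude, latitude) tuples from the low-precision mapping.
    B: list of (longitude, latitude) tuples read off the remote sensing map.
    """
    return [(i, j)
            for i, (lam, psi) in enumerate(A)
            for j, (xi, eps) in enumerate(B)
            if abs(lam - xi) < lon_thresh and abs(psi - eps) < lat_thresh]
```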
Step S3.4: according to the matching set between image blocks obtained in step S3.3 and the corresponding feature sets, obtain the optimal matching with the Hungarian algorithm, realizing the corresponding matching between image blocks in the surveillance-camera picture and image blocks in the remote sensing target area.
Preferably, the specific process of step S3.4 is as follows:
In a subgraph P of the bipartite graph G in which no two edges of the edge set of P share a vertex, P is called a matching. The optimal matching of the bipartite graph is computed with the Hungarian algorithm: the image-block features in the surveillance-camera picture are matched with the image-block features in the remote sensing target area to obtain matched feature pairs, and the similarity between image blocks in the camera picture and image blocks in the remote sensing target area is then computed from the similarity of the matched feature pairs. This establishes the correspondence between the image blocks and yields the set of corresponding image-block pairs.
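The patent gives no implementation of step S3.4; one conventional realization of the Hungarian algorithm, sketched below under the assumption that feature similarities are precomputed into a matrix sim of shape (camera blocks × remote blocks), is scipy.optimize.linear_sum_assignment:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def best_match(sim, candidates):
    """Optimal one-to-one matching over a similarity matrix.

    sim[i, j]  : feature similarity of camera block i and remote block j.
    candidates : (i, j) pairs that survived the category and position screens;
                 all other pairs are forbidden via a large cost.
    """
    FORBIDDEN = 1e9
    cost = np.full(sim.shape, FORBIDDEN)
    for i, j in candidates:
        cost[i, j] = -sim[i, j]   # negate: minimizing cost maximizes similarity
    rows, cols = linear_sum_assignment(cost)
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < FORBIDDEN]
```

The returned (i, j) pairs form the set of corresponding image-block pairs passed to step S4.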
In step S4, the accurate mapping relation between the surveillance-camera picture and the remote sensing image is established as follows:
Three groups of data are selected each time from the set of corresponding image-block pairs of step S3, a coordinate mapping relation is established from the correspondence of the several groups of image-block pairs, and several transformation matrices M_i are obtained by inverse-matrix computation:

M_i = [lon_i1 lon_i2 lon_i3; lat_i1 lat_i2 lat_i3; 1 1 1] · [x_i1 x_i2 x_i3; y_i1 y_i2 y_i3; 1 1 1]^(−1)

where (lon_i1, lat_i1), (lon_i2, lat_i2), (lon_i3, lat_i3) are the three groups of longitude-latitude coordinates of image blocks in the remote sensing target area, and (x_i1, y_i1), (x_i2, y_i2), (x_i3, y_i3) are the three corresponding groups of image-block coordinates in the surveillance-camera picture;
The average of the several transformation matrices M_i is taken:

M = (1/n) · Σ_{i=1}^{n} M_i
The accurate conversion relation between the surveillance-camera picture coordinates and the remote-sensing longitude-latitude coordinates is then:

(lon, lat, 1)^T = M · (x, y, 1)^T

where (lon, lat) are the longitude-latitude coordinates on the remote sensing image, (x, y) are the surveillance-camera picture coordinates, and M is the transformation matrix.
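A minimal numpy sketch of this step-S4 computation. It assumes the matched pairs arrive as ((lon, lat), (x, y)) tuples and that successive disjoint triples are used, since the patent does not specify how the three groups are chosen; at least three pairs with non-collinear picture points are required for the inverse to exist.

```python
import numpy as np

def estimate_mapping(pairs):
    """Average the affine transforms fitted to successive triples of
    matched (lon, lat) <-> (x, y) correspondences."""
    mats = []
    for k in range(0, len(pairs) - 2, 3):
        (g1, p1), (g2, p2), (g3, p3) = pairs[k:k + 3]
        G = np.array([[g1[0], g2[0], g3[0]],
                      [g1[1], g2[1], g3[1]],
                      [1.0,   1.0,   1.0]])
        P = np.array([[p1[0], p2[0], p3[0]],
                      [p1[1], p2[1], p3[1]],
                      [1.0,   1.0,   1.0]])
        mats.append(G @ np.linalg.inv(P))   # M_i: picture coords -> lon/lat
    return sum(mats) / len(mats)            # M = (1/n) * sum of M_i

def pixel_to_lonlat(M, x, y):
    """Apply (lon, lat, 1)^T = M (x, y, 1)^T to one picture coordinate."""
    lon, lat, _ = M @ np.array([x, y, 1.0])
    return lon, lat
```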
Finally, step S4 establishes the accurate mapping relation between the surveillance-camera picture and the remote sensing image, improving the accuracy of image matching and thereby achieving fast and accurate registration under complex imaging conditions.
Variations and modifications to the above-described embodiments may occur to those skilled in the art, which fall within the scope and spirit of the above description. Therefore, the present invention is not limited to the specific embodiments disclosed and described above, and some modifications and variations of the present invention should fall within the scope of the claims of the present invention. Furthermore, although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims (6)

1. A space-ground multi-view alignment method for video surveillance and remote sensing images, characterized by comprising the following steps:
Step S1: align the surveillance-camera picture coordinates with longitude-latitude coordinates to establish a low-precision mapping relation between them, convert the four picture coordinates of the upper-left, lower-left, upper-right and lower-right corners of the camera picture into longitude-latitude coordinates, and obtain the corresponding target area on the remote sensing image from the converted longitude-latitude coordinates;
Step S2: perform instance segmentation separately on the surveillance-camera picture and the remote sensing target area to obtain an image-block information set Φ of all objects in the camera picture and an image-block information set μ of all objects in the remote sensing target area, and extract the features of each image block;
Step S3: according to the features, categories and mapped position relations of the image blocks from step S2, use the Hungarian algorithm to coarsely match the image blocks in the camera picture with the image blocks in the remote sensing target area, obtaining a set of corresponding image-block pairs;
Step S4: select data from the set of corresponding image-block pairs of step S3, and establish an accurate mapping relation between the surveillance-camera picture and the remote sensing image from the correspondence of the several groups of image-block pairs;
The matching process between the image blocks in the surveillance-camera picture and the image blocks in the remote sensing target area in step S3 comprises the following steps:
step S3.1: construction bipartite graph G =(X, Y, E), wherein X represents an image block set in a picture of the monitoring camera, Y represents an image block set in a remote sensing picture, and E represents an edge set between all nodes in X and all nodes in Y; the edge set E is constructed according to the following rules: if Sim (x, y)>0,x ∈ X, Y ∈ Y, and then an edge (X, Y) is connected between the two corresponding vertices X and Y in the bipartite graph G, and the weight w of the edge is set xy Sif (X, Y), where sif (X, Y) represents the similarity of feature X to feature Y (X ∈ X, Y ∈ Y);
Step S3.2: using the category label of each image block, connect image blocks of the same category in the surveillance-camera picture and the remote sensing image, which yields one-to-one and one-to-many links;
Step S3.3: using the low-precision mapping relation between the surveillance-camera picture coordinates and the longitude-latitude coordinates from step S1, and taking the center point of each image block as its coordinates, convert the image-block coordinates in the camera picture into longitude-latitude coordinates, obtaining the converted coordinate set A = {⟨λ_i, ψ_i⟩}, wherein ⟨λ_i, ψ_i⟩ is the converted longitude-latitude coordinate of the i-th image block in the camera picture, λ_i the longitude and ψ_i the latitude; obtain from the remote sensing image map the image-block coordinate set B = {⟨ξ_j, ε_j⟩}, wherein ⟨ξ_j, ε_j⟩ is the longitude-latitude coordinate of the j-th image block in the remote sensing target area, ξ_j the longitude and ε_j the latitude; set a threshold pair (lon, lat) and keep a connection if the following conditions are met, deleting it otherwise:

|λ_i − ξ_j| < lon

|ψ_i − ε_j| < lat
Step S3.4: according to the matching set between image blocks obtained in step S3.3 and the corresponding feature sets, obtain the optimal matching with the Hungarian algorithm, realizing the corresponding matching between image blocks in the surveillance-camera picture and image blocks in the remote sensing target area.
2. The space-ground multi-view alignment method for video surveillance and remote sensing images according to claim 1, wherein in step S1 the low-precision mapping relation between the surveillance-camera picture coordinates and the longitude-latitude coordinates is established as follows:
Step 1.1: according to the haversine formula, calculate the straight-line horizontal distance d_i (in m) between O', the vertical projection of the camera position onto the horizontal plane, and an arbitrary position A_i on the horizontal plane within the camera's visible range, as well as the longitude-direction horizontal distance s_i (in m) between O' and A_i:

a = sin²((ψ_i − ψ_0)/2) + cos ψ_0 · cos ψ_i · sin²((λ_i − λ_0)/2)

d_i = 2r · arcsin(√a)

b = cos ψ_0 · cos ψ_i · sin²((λ_i − λ_0)/2)

s_i = 2r · arcsin(√b)

wherein a and b are intermediate values, O'(λ_0, ψ_0) is the vertical projection of the camera position onto the horizontal plane, A_i(λ_i, ψ_i) is an arbitrary position on the horizontal plane within the camera's visible range, and r is the Earth's radius in m;

Step 1.2: from step 1.1, calculate the angle β_i between the line O'A_i and geographic true north:

β_i = arcsin(s_i / d_i)

Step 1.3: from step 1.1, calculate the angle θ_i between the line OA_i and the vertical:

θ_i = arctan(d_i / H)

wherein H is the height of the surveillance camera above the horizontal plane, in m;

Step 1.4: calculate the picture coordinates (x_i, y_i) of A_i in the surveillance-camera picture:

x_i = X/2 + (β_i − β) · X/ω_x

y_i = Y/2 + (θ_i − θ) · Y/ω_y

wherein X is the pixel width of the image and Y the pixel height, both given by the camera's image resolution X × Y; θ is the angle between the camera's center line and the vertical, β is the angle between the horizontal projection of the center line and geographic true north, ω_x is the camera's horizontal field angle, and ω_y is its vertical field angle.
3. The space-ground multi-view alignment method for video surveillance and remote sensing images according to claim 1, wherein in step S2 an instance segmentation model is constructed to perform instance segmentation on the surveillance-camera picture and the remote sensing target area; all objects in the surveillance-camera picture and the remote sensing target area are segmented by the instance segmentation model to obtain an image-block information set Φ of all objects in the camera picture and an image-block information set μ of all objects in the remote sensing target area; the global features extracted for each image block include color, texture, and shape features, giving a feature set for each image block.
4. The space-ground multi-view alignment method for video surveillance and remote sensing images according to claim 3, wherein the instance segmentation model is Mask R-CNN, whose backbone network is the feature pyramid network FPN; Mask R-CNN further comprises a region proposal network RPN, a RoIAlign layer (a layer specialized for object detection), and a deconvolution network Deconv; the FPN extracts features from the input surveillance-camera picture and remote-sensing target-area image to generate a feature map, the feature map is fed into the RPN, the RoIAlign layer processes the feature map together with the RPN output, and the result is fed into the deconvolution network Deconv to generate the predicted masks.
5. The space-ground multi-view alignment method for video surveillance and remote sensing images according to claim 1, wherein the specific process of step S3.4 is as follows:
in a subgraph P of the bipartite graph G in which no two edges of the edge set of P share a vertex, P is called a matching; the optimal matching of the bipartite graph is computed with the Hungarian algorithm, the image-block features in the surveillance-camera picture are matched with the image-block features in the remote sensing target area to obtain matched feature pairs, the similarity between image blocks in the camera picture and image blocks in the remote sensing target area is then computed from the similarity of the matched feature pairs, and the correspondence between the image blocks in the surveillance picture and the image blocks in the remote sensing target area is realized, obtaining the set of corresponding image-block pairs.
6. The space-ground multi-view alignment method for video surveillance and remote sensing images according to claim 1, wherein in step S4 the accurate mapping relation between the surveillance-camera picture and the remote sensing image is established as follows:
three groups of data are selected each time from the set of corresponding image-block pairs of step S3, a coordinate mapping relation is established from the correspondence of the several groups of image-block pairs, and several transformation matrices M_i are obtained by inverse-matrix computation:

M_i = [lon_i1 lon_i2 lon_i3; lat_i1 lat_i2 lat_i3; 1 1 1] · [x_i1 x_i2 x_i3; y_i1 y_i2 y_i3; 1 1 1]^(−1)

wherein (lon_i1, lat_i1), (lon_i2, lat_i2), (lon_i3, lat_i3) are the three groups of longitude-latitude coordinates of image blocks in the remote sensing target area, and (x_i1, y_i1), (x_i2, y_i2), (x_i3, y_i3) are the three corresponding groups of image-block coordinates in the surveillance-camera picture;
the average of the several transformation matrices M_i is taken:

M = (1/n) · Σ_{i=1}^{n} M_i
the accurate conversion relation between the surveillance-camera picture coordinates and the remote-sensing longitude-latitude coordinates is then:

(lon, lat, 1)^T = M · (x, y, 1)^T

wherein (lon, lat) are the longitude-latitude coordinates on the remote sensing image, (x, y) are the surveillance-camera picture coordinates, and M is the transformation matrix.
CN202110930060.3A 2021-08-13 2021-08-13 Space-ground multi-view alignment method for video surveillance and remote sensing images Active CN113642463B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110930060.3A CN113642463B (en) Space-ground multi-view alignment method for video surveillance and remote sensing images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110930060.3A CN113642463B (en) Space-ground multi-view alignment method for video surveillance and remote sensing images

Publications (2)

Publication Number Publication Date
CN113642463A CN113642463A (en) 2021-11-12
CN113642463B (en) 2023-03-10

Family

ID=78421686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110930060.3A Active CN113642463B (en) 2021-08-13 2021-08-13 Space-ground multi-view alignment method for video surveillance and remote sensing images

Country Status (1)

Country Link
CN (1) CN113642463B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114187517A (en) * 2021-12-14 2022-03-15 广州赋安数字科技有限公司 Abnormal target detection method and system integrating video monitoring and remote sensing
CN115830124A (en) * 2022-12-27 2023-03-21 北京爱特拉斯信息科技有限公司 Matching-based camera pixel coordinate and geodetic coordinate conversion method and system
CN116912476B (en) * 2023-07-05 2024-05-31 农芯(南京)智慧农业研究院有限公司 Remote sensing monitoring rapid positioning method and related device for pine wood nematode disease unmanned aerial vehicle

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104992140A (en) * 2015-05-27 2015-10-21 上海海事大学 Sea surface abnormal floating object detecting method based on remote sensing image
CN109919975A (en) * 2019-02-20 2019-06-21 中国人民解放军陆军工程大学 Wide-area monitoring moving target association method based on coordinate calibration
CN112085772A (en) * 2020-08-24 2020-12-15 南京邮电大学 Remote sensing image registration method and device
CN113012047A (en) * 2021-03-26 2021-06-22 广州市赋安电子科技有限公司 Dynamic camera coordinate mapping establishing method and device and readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9147260B2 (en) * 2010-12-20 2015-09-29 International Business Machines Corporation Detection and tracking of moving objects
US9977978B2 (en) * 2011-11-14 2018-05-22 San Diego State University Research Foundation Image station matching, preprocessing, spatial registration and change detection with multi-temporal remotely-sensed imagery
CN108133028B (en) * 2017-12-28 2020-08-04 北京天睿空间科技股份有限公司 Aircraft listing method based on combination of video analysis and positioning information
CN110796691B (en) * 2018-08-03 2023-04-11 中国科学院沈阳自动化研究所 Heterogeneous image registration method based on shape context and HOG characteristics

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104992140A (en) * 2015-05-27 2015-10-21 上海海事大学 Sea surface abnormal floating object detecting method based on remote sensing image
CN109919975A (en) * 2019-02-20 2019-06-21 中国人民解放军陆军工程大学 Wide-area monitoring moving target association method based on coordinate calibration
CN112085772A (en) * 2020-08-24 2020-12-15 南京邮电大学 Remote sensing image registration method and device
CN113012047A (en) * 2021-03-26 2021-06-22 广州市赋安电子科技有限公司 Dynamic camera coordinate mapping establishing method and device and readable storage medium

Also Published As

Publication number Publication date
CN113642463A (en) 2021-11-12

Similar Documents

Publication Publication Date Title
CN113642463B (en) Space-ground multi-view alignment method for video surveillance and remote sensing images
CN105046251B (en) A kind of automatic ortho-rectification method based on environment No.1 satellite remote-sensing image
CN110866531A (en) Building feature extraction method and system based on three-dimensional modeling and storage medium
CN110334701B (en) Data acquisition method based on deep learning and multi-vision in digital twin environment
CN109461132B (en) SAR image automatic registration method based on feature point geometric topological relation
CN103353941B (en) Natural marker registration method based on viewpoint classification
CN113470090A (en) Multi-solid-state laser radar external reference calibration method based on SIFT-SHOT characteristics
CN111524168A (en) Point cloud data registration method, system and device and computer storage medium
CN113569647B (en) AIS-based ship high-precision coordinate mapping method
CN112946679B (en) Unmanned aerial vehicle mapping jelly effect detection method and system based on artificial intelligence
CN114549871A (en) Unmanned aerial vehicle aerial image and satellite image matching method
CN113345084B (en) Three-dimensional modeling system and three-dimensional modeling method
CN113034678A (en) Three-dimensional rapid modeling method for dam face of extra-high arch dam based on group intelligence
CN115451964A (en) Ship scene simultaneous mapping and positioning method based on multi-mode mixed features
CN112929626A (en) Three-dimensional information extraction method based on smartphone image
CN114998448B (en) Multi-constraint binocular fisheye camera calibration and space point positioning method
CN115201883A (en) Moving target video positioning and speed measuring system and method
CN114372992A (en) Edge corner point detection four-eye vision algorithm based on moving platform
CN113658144B (en) Method, device, equipment and medium for determining geometric information of pavement diseases
CN107941241B (en) Resolution board for aerial photogrammetry quality evaluation and use method thereof
CN112017259B (en) Indoor positioning and image building method based on depth camera and thermal imager
CN111260735B (en) External parameter calibration method for single-shot LIDAR and panoramic camera
CN116188249A (en) Remote sensing image registration method based on image block three-stage matching
CN114742876B (en) Land vision stereo measurement method
Brunken et al. Incorporating Plane-Sweep in Convolutional Neural Network Stereo Imaging for Road Surface Reconstruction.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 510000 room 1301 (Location: room 1301-1), No. 68, yueken Road, Wushan street, Tianhe District, Guangzhou City, Guangdong Province (office only)
Applicant after: Guangzhou Fu'an Digital Technology Co.,Ltd.
Address before: 510000 No. 1501, 68 yueken Road, Tianhe District, Guangzhou City, Guangdong Province
Applicant before: GUANGZHOU FUAN ELECTRONIC TECHNOLOGY Co.,Ltd.
GR01 Patent grant