CN116468609A - Super-glue-based two-stage zoom camera multi-image stitching method and system

Super-glue-based two-stage zoom camera multi-image stitching method and system

Info

Publication number
CN116468609A
Authority
CN
China
Prior art keywords
image
template base
pixel
clear
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310459405.0A
Other languages
Chinese (zh)
Inventor
刘志文
杨景翔
程思远
许根
吴佳宗
肖江剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Institute of Material Technology and Engineering of CAS
Original Assignee
Ningbo Institute of Material Technology and Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Institute of Material Technology and Engineering of CAS filed Critical Ningbo Institute of Material Technology and Engineering of CAS
Priority to CN202310459405.0A priority Critical patent/CN116468609A/en
Publication of CN116468609A publication Critical patent/CN116468609A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/337Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a SuperGlue-based multi-image stitching method and system for a two-stage zoom camera, wherein the method comprises the following steps: step one, collecting large-angle-of-view images and stitching them to obtain a template base map; step two, up-sampling and blocking the template base map to generate template base map blocks; step three, calculating the positions of the shooting points, automatically acquiring clear images, and making all acquired clear images cover the designated area; step four, selecting a clear image and a template base map block, performing feature matching, calculating a homography matrix and transforming the clear image; step five, calculating the optical flow between the covered template base map region and the transformed clear image; step six, carrying out pixel-by-pixel registration on the transformed clear image according to the optical flow so as to align it with the template base map; and step seven, transforming all the clear images to the coordinate system of the template base map and fusing the images. The image stitching method has good stitching efficiency and stitching accuracy.

Description

Super-glue-based two-stage zoom camera multi-image stitching method and system
Technical Field
The invention relates to a method and system for stitching multiple multi-view images, in particular to a SuperGlue- and optical-flow-field-based method and system for stitching multiple images from a two-stage zoom camera, and belongs to the technical field of image processing.
Background
With the rapid development of computer vision technology, cameras are widely used in fields such as security and traffic monitoring, medical imaging and industrial inspection. For some large-scale scenes, the limited field of view of a camera makes it difficult to obtain a complete picture, so image stitching is often used to obtain images covering a wide field of view. However, stitching tasks involving large viewing-angle changes and a large number of images are difficult: misalignment, ghosting and similar artifacts are easily produced, and the stitching process is complex. It is therefore significant to study an efficient stitching method for multi-view, multi-image scenarios.
The traditional image stitching method generally uses the Sift or Orb algorithm to extract image features and then obtains the image transformation matrix through feature matching to complete alignment and stitching; this is the most practical and common approach. However, for some sparsely textured scenes it may be difficult to obtain enough feature points and accurate feature matches. For multi-plane scenes, a single homography matrix can hardly align the whole image; a representative solution is the spatial-domain-transformation approach, such as the APAP algorithm, which aligns the overlapping region by dividing the image into grids and computing a large number of local homography matrices, but the computation is too complex, many parameters need to be tuned, and the non-overlapping region is prone to distortion.
For multi-image stitching, the traditional approach generally stitches the images one by one in sequence; the stitching process is complex and prone to accumulated errors, so the stitching result is noticeably distorted or deformed and not natural enough.
Disclosure of Invention
In order to solve the above problems, the invention provides an efficient multi-image stitching method based on a zoom pan-tilt camera. Image stitching is divided into two steps: the stitching of the large-angle-of-view images is completed first, and the fine stitching of the clear images is then completed on that basis; the stitching of each clear image is independent of the others, which simplifies the stitching process. Feature extraction and matching are performed with the SuperPoint and SuperGlue algorithms, which are robust and fast in sparse-texture scenes, and the introduction of optical flow field registration reduces the stitching errors that a single homography matrix inevitably produces.
In order to achieve the aim of the invention, the invention adopts the following scheme:
the invention provides a super glue-based two-stage zoom camera multi-image stitching method, which comprises the following steps of:
firstly, reducing the multiplying power of a pan-tilt camera, shooting a plurality of images with large angles of view, and performing image stitching by adopting a feature matching algorithm to generate a template base image;
step two, up-sampling and blocking the template base map to generate a template base map block;
step three, calculating and recording the position of a shooting point by combining a field geometric model of a camera and an area needing shooting and combining a field angle of a tripod head camera, automatically acquiring clear images, and enabling all acquired clear images to cover a designated area;
selecting a clear image, selecting the template base image blocks with approximately the same coverage area, performing feature extraction and matching on the clear image and the template base image blocks by adopting SuperPoint and SuperGlue algorithms, removing mismatching points, restoring pixel point positions of template base image blocks in the matching points to pixel positions in the original template base image, and calculating a homography transformation matrix;
step five, transforming the clear image to a coordinate system of an original template base image by using the homography transformation matrix, obtaining a template base image of a coverage area of the transformed clear image, and calculating an optical flow between the coverage area template base image and the transformed clear image;
step six, carrying out pixel-by-pixel registration on the transformed clear image according to the optical flow so as to align the clear image with the template base map;
and seventhly, repeating the fourth step to the sixth step until all the clear images are transformed to the coordinate system of the template base map, and finally fusing the images.
In one embodiment, the calculating the position of the shooting point in the third step specifically includes:
assume that when the camera faces straight down, the spatial coordinates of the four corners A, B, C and D of the camera field-of-view region on the imaging plane at a distance h are determined by h, the horizontal angle of view α and the vertical angle of view β; when the pan-tilt rotates left-right or up-down by an angle θ, each of the points A, B, C and D is denoted uniformly as v0, and the coordinates after rotation and translation are v′0 = T·R(z)·R(x)·v0, where R(z) and R(x) denote the rotation matrices about the z axis and the x axis respectively and T denotes the translation; the intersection points of the rays through v′0 with the ground are then the coordinates of the four corner points of the camera field-of-view footprint.
In one embodiment, the optical flow calculation in the fifth step is as follows:
I(x, y, t) = I(x+dx, y+dy, t+dt) expresses that a pixel point moves by (dx, dy) from one frame to the next in a time dt while its gray value remains unchanged; by Taylor expansion,
I(x+dx, y+dy, t+dt) ≈ I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt,
from which it is further obtained that
(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0;
let u = dx/dt and v = dy/dt denote the velocity components of the optical flow along the x and y axes respectively, so that
I_x·u + I_y·v + I_t = 0,
where I_x, I_y and I_t denote the partial derivatives of the image gray values in the X, Y and T directions respectively, which can be obtained from the image data, and (u, v) is the optical flow along the X and Y axes.
In one embodiment, in the sixth step, the pixel-by-pixel registration is performed on the transformed clear image according to the optical flow, so that the process of aligning the template base map uses the following calculation formula:
F(x+u,y+v)=P(x,y);
wherein P is a clear image coordinate pixel value before aligning the template base map, F is a clear image pixel value after aligning the template base map, X and Y represent pixel coordinates, and u and v represent optical flow magnitudes in the X and Y directions of the pixel positions.
In one embodiment, the fusing of the images in the seventh step specifically adopts the following calculation formula:
P(x,y)=w1*P1(x,y)+w2*P2(x,y)
wherein P represents the pixel value of the fused image, P1 and P2 respectively represent the pixel values of the overlapping areas of two mutually overlapped clear images, and w1 and w2 represent the weights of the pixel values of the two images.
Another aspect of the present invention provides a SuperGlue-based two-stage zoom camera multi-image stitching system, wherein the image stitching system comprises:
the image acquisition and template base image generation module is used for reducing the multiplying power of the cradle head camera, shooting a plurality of large-angle-of-view images, and performing image stitching by adopting a feature matching algorithm to generate a template base image;
the template base map block generation module is used for upsampling and blocking the template base map to generate a template base map block;
the clear image acquisition module is used for calculating and recording the position of a shooting point by combining a view field geometric model of a camera and a region needing to be shot and combining a view field angle of a cradle head camera, automatically acquiring clear images and enabling all acquired clear images to cover a designated region;
the transformation matrix calculation module is used for selecting a clear image, selecting the template base image blocks with approximately the same coverage area, carrying out feature extraction and matching on the clear image and the template base image blocks by adopting SuperPoint and SuperGlue algorithms, removing mismatching points, restoring the pixel point positions of the template base image blocks in the matching points to the pixel positions in the original template base image, and calculating a homography transformation matrix;
the optical flow calculation module is used for transforming the clear image to a coordinate system of an original template base image by using the homography transformation matrix, obtaining a template base image of a coverage area of the transformed clear image, and calculating the optical flow between the coverage area template base image and the transformed clear image;
the pixel matching module is used for carrying out pixel-by-pixel registration on the transformed clear image according to the optical flow so as to align the clear image with the template base map;
and the image fusion module is used for fusing the images after all the clear images are transformed to the coordinate system of the template base map.
In one embodiment, the clear image acquisition module calculates a position of a shooting point, and specifically includes:
assume that when the camera faces straight down, the spatial coordinates of the four corners A, B, C and D of the camera field-of-view region on the imaging plane at a distance h are determined by h, the horizontal angle of view α and the vertical angle of view β; when the pan-tilt rotates left-right or up-down by an angle θ, each of the points A, B, C and D is denoted uniformly as v0, and the coordinates after rotation and translation are v′0 = T·R(z)·R(x)·v0, where R(z) and R(x) denote the rotation matrices about the z axis and the x axis respectively and T denotes the translation; the intersection points of the rays through v′0 with the ground are then the coordinates of the four corner points of the camera field-of-view footprint.
In one embodiment, the optical flow calculation module calculates the optical flow as follows:
I(x, y, t) = I(x+dx, y+dy, t+dt) expresses that a pixel point moves by (dx, dy) from one frame to the next in a time dt while its gray value remains unchanged; by Taylor expansion,
I(x+dx, y+dy, t+dt) ≈ I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt,
from which it is further obtained that
(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0;
let u = dx/dt and v = dy/dt denote the velocity components of the optical flow along the x and y axes respectively, so that
I_x·u + I_y·v + I_t = 0,
where I_x, I_y and I_t denote the partial derivatives of the image gray values in the X, Y and T directions respectively, which can be obtained from the image data, and (u, v) is the optical flow along the X and Y axes.
In one embodiment, the pixel matching module performs pixel-by-pixel registration on the transformed clear image according to the optical flow, so that the process of aligning the transformed clear image with the template base map adopts the following calculation formula:
F(x+u,y+v)=P(x,y);
wherein P is a clear image coordinate pixel value before aligning the template base map, F is a clear image pixel value after aligning the template base map, X and Y represent pixel coordinates, and u and v represent optical flow magnitudes in the X and Y directions of the pixel positions.
In one embodiment, the image fusion module performs image fusion specifically using the following calculation formula:
P(x,y)=w1*P1(x,y)+w2*P2(x,y)
wherein P represents the pixel value of the fused image, P1 and P2 respectively represent the pixel values of the overlapping areas of two mutually overlapped clear images, and w1 and w2 represent the weights of the pixel values of the two images.
Compared with the prior art, the invention has at least the following advantages: (1) image stitching is split into two steps: the stitching of the large-angle-of-view images is completed first, and the fine stitching of the clear images is then completed on that basis; the stitching of each clear image is independent of the others, which simplifies the stitching process; (2) feature extraction and matching are performed with the SuperPoint and SuperGlue algorithms, which are robust and fast in sparse-texture scenes; (3) the introduction of optical flow field registration reduces the stitching errors that a single homography matrix inevitably produces.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a stitching method in an embodiment of the present invention;
FIG. 2 is a view field geometric model of pan/tilt head shooting in an embodiment of the present invention;
FIG. 3 is an example of a template base map in an embodiment of the invention;
FIG. 4 is a comparison of feature matching between the Sift algorithm and the SuperGlue algorithm;
FIG. 5a is a clear image in an embodiment of the present invention, and FIG. 5b is a schematic diagram of its stitching in an embodiment of the present invention;
fig. 6a and 6b are diagrams of partial alignment before and after optical flow registration, respectively, in an embodiment of the present invention.
Detailed Description
The inventor proposes the technical solution of the present invention in order to overcome the above defects and problems of the existing image stitching technology.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following detailed description of the embodiments of the present invention will be given with reference to the accompanying drawings. Examples of these preferred embodiments are illustrated in the accompanying drawings. The embodiments of the invention shown in the drawings and described in accordance with the drawings are merely exemplary and the invention is not limited to these embodiments.
It should be noted here that, in order to avoid obscuring the present invention due to unnecessary details, only structures and/or processing steps closely related to the solution according to the present invention are shown in the drawings, while other details not greatly related to the present invention are omitted.
In a more typical embodiment of the present application, the method comprises the steps of:
1) In order to achieve accurate stitching of a number of multi-view images captured by the pan-tilt camera, the magnification of the pan-tilt camera is first reduced, so that the angle of view is larger, and several low-definition images covering the specified range are captured. The magnification is an attribute of a zoom camera: the lower the magnification, the larger the angle of view. Because the viewing-angle change between the large-angle-of-view images is small and their features are rich, the homography transformation matrix is calculated with a conventional feature matching method and the images are stitched to obtain the template base map.
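By way of illustration only, the following Python/OpenCV sketch shows this kind of conventional feature-matching stitching of two large-angle-of-view images (SIFT features, Lowe ratio test, RANSAC homography). The function name, the canvas size and the omission of blending are assumptions for the sketch, not the exact procedure of the invention.

```python
import cv2
import numpy as np

def stitch_pair_sift(img_a, img_b):
    """Coarse stitching of two large-field-of-view images with SIFT features.
    img_b is warped into img_a's coordinate system on a canvas twice as wide;
    blending is omitted for brevity."""
    gray_a = cv2.cvtColor(img_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(gray_a, None)
    kp_b, des_b = sift.detectAndCompute(gray_b, None)
    matches = cv2.BFMatcher().knnMatch(des_b, des_a, k=2)
    # Lowe ratio test to keep distinctive matches only.
    good = [m[0] for m in matches if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]
    src = np.float32([kp_b[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)  # reject outlier matches
    h, w = img_a.shape[:2]
    canvas = cv2.warpPerspective(img_b, H, (2 * w, h))
    canvas[:h, :w] = img_a  # keep the reference image in the left half
    return canvas
```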
2) The template base map obtained in step 1) is up-sampled and divided into 2 x 2 or 4 x 4 blocks to obtain the template base map blocks.
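A minimal sketch of this up-sampling and blocking step is given below; the bicubic interpolation, the scale factor and the square grid size are illustrative parameters only, and recording each block's origin makes it possible to restore matched pixel coordinates to the full template later.

```python
import cv2

def make_template_blocks(template, scale=2, grid=2):
    """Upsample the stitched template base map and split it into grid x grid blocks.

    Returns a list of (block_image, (x0, y0)) where (x0, y0) is the block's
    top-left corner in the upsampled template, so that matched pixel positions
    can be restored to the full template coordinate system."""
    h, w = template.shape[:2]
    up = cv2.resize(template, (w * scale, h * scale), interpolation=cv2.INTER_CUBIC)
    bh, bw = up.shape[0] // grid, up.shape[1] // grid
    blocks = []
    for r in range(grid):
        for c in range(grid):
            y0, x0 = r * bh, c * bw
            blocks.append((up[y0:y0 + bh, x0:x0 + bw], (x0, y0)))
    return blocks
```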
3) Clear images are acquired: the positions of the shooting points are calculated and recorded by combining the field-of-view geometric model of the camera, the area to be photographed (as shown in the figure) and the angle of view of the pan-tilt camera. The calculation process is as follows:
let the spatial coordinates of the four corners A, B, C and D of the camera field-of-view region on the imaging plane at a distance h be determined by h, the horizontal angle of view α and the vertical angle of view β. When the pan-tilt rotates left-right or up-down by an angle θ, each of the points A, B, C and D is denoted uniformly as v0, and the coordinates after rotation and translation are v′0 = T·R(z)·R(x)·v0, where R(z) and R(x) denote the rotation matrices about the z axis and the x axis respectively and T denotes the translation. The intersection points of the rays through v′0 with the ground are the coordinates of the four corner points of the camera field-of-view footprint.
Therefore, the preset points within the camera field of view can be obtained from three groups of data, namely the height, the magnification and the rotation angle of the pan-tilt camera, and the preset points are set so that all captured images cover the designated area, whereby the acquisition of clear images is completed automatically.
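The sketch below illustrates one possible realisation of this footprint calculation. The corner-ray parameterisation via tan(α/2) and tan(β/2), the rotation order R(z)·R(x) and the function name fov_ground_corners are assumptions for illustration, since the exact corner coordinates are not spelled out in the text.

```python
import numpy as np

def fov_ground_corners(h, alpha, beta, pitch, yaw):
    """Project the four field-of-view corner rays of a pan-tilt camera at height h
    onto the ground plane z = 0, following v0' = T*R(z)*R(x)*v0."""
    ta, tb = np.tan(alpha / 2), np.tan(beta / 2)
    # Corner directions for a camera looking straight down (-z), at unit depth
    # (assumed parameterisation; the patent does not give the explicit values).
    corners = np.array([[ ta,  tb, -1.0],
                        [-ta,  tb, -1.0],
                        [-ta, -tb, -1.0],
                        [ ta, -tb, -1.0]])
    cx, sx = np.cos(pitch), np.sin(pitch)
    cz, sz = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])   # rotation about x axis
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])   # rotation about z axis
    cam = np.array([0.0, 0.0, h])                            # camera position (translation T)
    ground = []
    for d in corners:
        d = Rz @ Rx @ d                 # rotate the corner ray
        # Ray/ground intersection, assuming the rotated ray still points downward (d[2] < 0).
        t = -cam[2] / d[2]
        ground.append(cam + t * d)
    return np.array(ground)
```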
4) A clear image is selected, together with the template base map block covering approximately the same area, and feature extraction and matching are performed on the clear image and the template base map block with the SuperPoint and SuperGlue algorithms. After mismatched points are removed with the random sample consensus (RANSAC) method, the pixel positions of the template base map block in the matching points are restored to their pixel positions in the original template base map, and the homography transformation matrix is calculated. The homography transformation matrix is an image linear geometric transformation matrix with eight degrees of freedom; a linear system of equations is established from the coordinates of several pairs of matching points of the two images, and the best result is fitted with the least squares method.
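A sketch of this step is given below. Here match_superglue stands for a hypothetical wrapper around a pretrained SuperPoint + SuperGlue matcher and is assumed to return two (N, 2) arrays of corresponding pixel coordinates; the outlier rejection and homography estimation use OpenCV's RANSAC-based findHomography.

```python
import cv2
import numpy as np

def block_homography(clear_img, block_img, block_origin, match_superglue):
    """Estimate the homography from a clear image to the full template base map.

    block_origin is the (x0, y0) position of the template block inside the
    original template, so that block-local matches can be restored to the
    template coordinate system before the homography is fitted."""
    pts_clear, pts_block = match_superglue(clear_img, block_img)
    # Restore block-local coordinates to the original template coordinate system.
    pts_template = pts_block + np.asarray(block_origin, dtype=np.float32)
    # RANSAC removes mismatched pairs while fitting the 8-DoF homography.
    H, inliers = cv2.findHomography(pts_clear, pts_template, cv2.RANSAC, 3.0)
    return H, inliers
```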
5) The clear image is transformed using the homography transformation matrix calculated in step 4). A template base map pixel coordinate system is established with the upper-left corner of the template base map as the origin, and the clear image is transformed into the coordinate system of the template base map through operations such as rotation, scaling, translation, shearing and mirroring, giving the template base map region covered by the transformed clear image. The dense optical flow between the transformed clear image and the template base map is then obtained. Optical flow is the projection of the three-dimensional motion field in space onto the image; it represents the magnitude and direction of motion of an image pixel at a certain moment. The calculation process is as follows:
I(x, y, t) = I(x+dx, y+dy, t+dt) expresses that a pixel point moves by (dx, dy) from one frame to the next in a time dt while its gray value remains unchanged; by Taylor expansion,
I(x+dx, y+dy, t+dt) ≈ I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt,
from which it is further obtained that
(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0;
let u = dx/dt and v = dy/dt denote the velocity components of the optical flow along the x and y axes respectively, so that
I_x·u + I_y·v + I_t = 0,
where I_x, I_y and I_t denote the partial derivatives of the image gray values in the X, Y and T directions respectively, which can be obtained from the image data, and (u, v) is the optical flow along the X and Y axes.
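The following sketch combines the homography warp with a dense-flow computation. Farneback flow is used here only as one readily available dense-flow estimator, since the text does not prescribe a particular optical flow algorithm; the flow is computed from the template region to the warped image so that the registration sketch in the next step can use backward warping.

```python
import cv2
import numpy as np

def warp_and_flow(clear_img, template, H):
    """Warp the clear image into the template coordinate system and compute the
    dense optical flow between the covered template region and the warped image."""
    h, w = template.shape[:2]
    warped = cv2.warpPerspective(clear_img, H, (w, h))
    # Mask of the area actually covered by the warped clear image.
    cover = np.full(clear_img.shape[:2], 255, dtype=np.uint8)
    mask = cv2.warpPerspective(cover, H, (w, h))
    g_tpl = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
    g_wrp = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)
    # Dense flow from the template to the warped clear image.
    flow = cv2.calcOpticalFlowFarneback(g_tpl, g_wrp, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return warped, mask, flow
```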
6) The transformed clear image is registered pixel by pixel using the optical flow obtained in step 5), aligning the clear image with the template base map according to the following formula.
F(x+u,y+v)=P(x,y);
Wherein P is a clear image coordinate pixel value before aligning the template base map, F is a clear image pixel value after aligning the template base map, X and Y represent pixel coordinates, and u and v represent optical flow magnitudes in the X and Y directions of the pixel position.
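One way to realise this alignment is sketched below with backward warping via cv2.remap, assuming the flow was computed from the template region to the warped clear image as in the previous sketch; the formula above describes the forward mapping, and the direction convention used here is an implementation choice, not part of the method itself.

```python
import cv2
import numpy as np

def register_by_flow(warped, flow):
    """Pixel-by-pixel registration of the warped clear image to the template.

    flow has shape (H, W, 2) and gives, for each template pixel, the offset at
    which the corresponding warped-image pixel is found, so sampling the warped
    image at (x + u, y + v) places every pixel onto the template grid."""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(warped, map_x, map_y, cv2.INTER_LINEAR)
```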
7) Repeating the steps 4), 5) and 6) until all the clear images are transformed to the coordinate system of the template base map, and finally fusing the images by adopting a weighted average method, wherein the formula is as follows:
P(x,y)=w1*P1(x,y)+w2*P2(x,y)
where P represents the pixel value of the fused image, P1 and P2 represent the pixel values of the overlapping areas of two clear images overlapping each other, w1, w2 represent the weights of the pixel values of the two images, and the simplest method is to set the values to 0.5 and 0.5, and other suitable values can also be set.
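A direct sketch of this weighted-average fusion follows; the per-image masks marking valid (covered) pixels are assumed to come from the warping step above, and w1 = w2 = 0.5 is the simplest choice mentioned in the text.

```python
import numpy as np

def blend_overlap(p1, p2, mask1, mask2, w1=0.5, w2=0.5):
    """Weighted-average fusion of two aligned clear images in the template
    coordinate system: P = w1*P1 + w2*P2 in the overlap, and the single
    available image elsewhere. mask1/mask2 are HxW arrays, nonzero where valid."""
    out = np.zeros_like(p1, dtype=np.float32)
    overlap = (mask1 > 0) & (mask2 > 0)
    only1 = (mask1 > 0) & ~overlap
    only2 = (mask2 > 0) & ~overlap
    out[overlap] = w1 * p1[overlap].astype(np.float32) + w2 * p2[overlap].astype(np.float32)
    out[only1] = p1[only1]
    out[only2] = p2[only2]
    return out.astype(p1.dtype)
```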
Specifically, as shown in fig. 1, the method for efficiently splicing multiple images of a two-stage zoom camera based on SuperGlue has the following specific implementation process:
1) As shown in fig. 2, images are collected from a top-down view with the pan-tilt camera; images of different definition and position can be captured by adjusting the magnification and the pitch and yaw angles of the pan-tilt.
2) 2-3 large-angle-of-view, low-definition images covering the specified range are captured by adjusting the magnification of the pan-tilt camera, the template base map is obtained by stitching them with the traditional sift+cann method, and the result is up-sampled and divided into blocks; the template base map obtained by stitching 2 large-angle-of-view images is shown in fig. 3. Shooting preset positions are then calculated according to the required definition, the clear images are acquired automatically, and all the clear images cover the specified range.
3) A clear image and a template base map block are selected, feature extraction and matching are performed with the SuperPoint and SuperGlue algorithms, and mismatched points are removed with the random sample consensus (RANSAC) method to obtain accurate matching point pairs; as shown in fig. 4, the traditional Sift method produces a large number of mismatches.
4) The pixel coordinates of the template base map block in the matching point pairs are restored to their positions in the original template base map, and the homography transformation matrix is calculated; as shown in figs. 5a and 5b, the clear image is transformed onto the coordinate system of the original template base map according to the homography transformation matrix.
5) A mask is generated from the coordinates of the four vertices of the transformed clear image, giving the template base map region covered by the transformed clear image, and the optical flow between this region and the transformed clear image is calculated.
6) The calculated optical flow contains the position offset of each pixel in the x and y directions, and the transformed clear image is registered pixel by pixel according to this offset so that it is aligned with the template base map. Figs. 6a and 6b show the local alignment before and after optical flow registration; after optical flow registration, the re-projected images are aligned accurately.
It should be understood that although the present disclosure describes embodiments, not every embodiment is provided with a separate technical solution, and this description is for clarity only, and those skilled in the art should consider the disclosure as a whole, and the technical solutions of the embodiments may be combined appropriately to form other embodiments that can be understood by those skilled in the art.

Claims (10)

1. The super glue-based multi-image stitching method for the two-stage zoom camera is characterized by comprising the following steps of:
firstly, reducing the multiplying power of a pan-tilt camera, shooting a plurality of images with large angles of view, and performing image stitching by adopting a feature matching algorithm to generate a template base image;
step two, up-sampling and blocking the template base map to generate a template base map block;
step three, calculating and recording the position of a shooting point by combining a field geometric model of a camera and an area needing shooting and combining a field angle of a tripod head camera, automatically acquiring clear images, and enabling all acquired clear images to cover a designated area;
selecting a clear image, selecting the template base image blocks with approximately the same coverage area, performing feature extraction and matching on the clear image and the template base image blocks by adopting SuperPoint and SuperGlue algorithms, removing mismatching points, restoring pixel point positions of template base image blocks in the matching points to pixel positions in the original template base image, and calculating a homography transformation matrix;
step five, transforming the clear image to a coordinate system of an original template base image by using the homography transformation matrix, obtaining a template base image of a coverage area of the transformed clear image, and calculating an optical flow between the coverage area template base image and the transformed clear image;
step six, carrying out pixel-by-pixel registration on the transformed clear image according to the optical flow so as to align the clear image with the template base map;
and seventhly, repeating the fourth step to the sixth step until all the clear images are transformed to the coordinate system of the template base map, and finally fusing the images.
2. The image stitching method according to claim 1, wherein the calculating the position of the shooting point in the third step specifically includes:
assume that when the camera faces straight down, the spatial coordinates of the four corners A, B, C and D of the camera field-of-view region on the imaging plane at a distance h are determined by h, the horizontal angle of view α and the vertical angle of view β; when the pan-tilt rotates left-right or up-down by an angle θ, each of the points A, B, C and D is denoted uniformly as v0, and the coordinates after rotation and translation are v′0 = T·R(z)·R(x)·v0, where R(z) and R(x) denote the rotation matrices about the z axis and the x axis respectively and T denotes the translation; the intersection points of the rays through v′0 with the ground are then the coordinates of the four corner points of the camera field-of-view footprint.
3. The image stitching method according to claim 1, wherein the optical flow calculation in the fifth step is as follows:
I(x, y, t) = I(x+dx, y+dy, t+dt) expresses that a pixel point moves by (dx, dy) from one frame to the next in a time dt while its gray value remains unchanged; by Taylor expansion,
I(x+dx, y+dy, t+dt) ≈ I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt,
from which it is further obtained that
(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0;
let u = dx/dt and v = dy/dt denote the velocity components of the optical flow along the x and y axes respectively, so that
I_x·u + I_y·v + I_t = 0,
where I_x, I_y and I_t denote the partial derivatives of the image gray values in the X, Y and T directions respectively, which can be obtained from the image data, and (u, v) is the optical flow along the X and Y axes.
4. The image stitching method according to claim 3, wherein in the sixth step, the transformed clear image is registered pixel by pixel according to the optical flow, so that the process of aligning the template base map uses the following calculation formula:
F(x+u,y+v)=P(x,y);
wherein P is a clear image coordinate pixel value before aligning the template base map, F is a clear image pixel value after aligning the template base map, X and Y represent pixel coordinates, and u and v represent optical flow magnitudes in the X and Y directions of the pixel positions.
5. The image stitching method according to claim 1, wherein the fusing of the images in the seventh step specifically adopts the following calculation formula:
P(x,y)=w1*P1(x,y)+w2*P2(x,y)
wherein P represents the pixel value of the fused image, P1 and P2 respectively represent the pixel values of the overlapping areas of two mutually overlapped clear images, and w1 and w2 represent the weights of the pixel values of the two images.
6. A SuperGlue-based two-stage zoom camera multi-image stitching system, the image stitching system comprising:
the image acquisition and template base image generation module is used for reducing the multiplying power of the cradle head camera, shooting a plurality of large-angle-of-view images, and performing image stitching by adopting a feature matching algorithm to generate a template base image;
the template base map block generation module is used for upsampling and blocking the template base map to generate a template base map block;
the clear image acquisition module is used for calculating and recording the position of a shooting point by combining a view field geometric model of a camera and a region needing to be shot and combining a view field angle of a cradle head camera, automatically acquiring clear images and enabling all acquired clear images to cover a designated region;
the transformation matrix calculation module is used for selecting a clear image, selecting the template base image blocks with approximately the same coverage area, carrying out feature extraction and matching on the clear image and the template base image blocks by adopting SuperPoint and SuperGlue algorithms, removing mismatching points, restoring the pixel point positions of the template base image blocks in the matching points to the pixel positions in the original template base image, and calculating a homography transformation matrix;
the optical flow calculation module is used for transforming the clear image to a coordinate system of an original template base image by using the homography transformation matrix, obtaining a template base image of a coverage area of the transformed clear image, and calculating the optical flow between the coverage area template base image and the transformed clear image;
the pixel matching module is used for carrying out pixel-by-pixel registration on the transformed clear image according to the optical flow so as to align the clear image with the template base map;
and the image fusion module is used for fusing the images after all the clear images are transformed to the coordinate system of the template base map.
7. The image stitching system of claim 6 wherein the sharp image acquisition module calculates the location of the capture point, comprising:
assume that when the camera faces straight down, the spatial coordinates of the four corners A, B, C and D of the camera field-of-view region on the imaging plane at a distance h are determined by h, the horizontal angle of view α and the vertical angle of view β; when the pan-tilt rotates left-right or up-down by an angle θ, each of the points A, B, C and D is denoted uniformly as v0, and the coordinates after rotation and translation are v′0 = T·R(z)·R(x)·v0, where R(z) and R(x) denote the rotation matrices about the z axis and the x axis respectively and T denotes the translation; the intersection points of the rays through v′0 with the ground are then the coordinates of the four corner points of the camera field-of-view footprint.
8. The image stitching system of claim 6 wherein the optical flow calculation module calculates optical flow as follows:
I(x, y, t) = I(x+dx, y+dy, t+dt) expresses that a pixel point moves by (dx, dy) from one frame to the next in a time dt while its gray value remains unchanged; by Taylor expansion,
I(x+dx, y+dy, t+dt) ≈ I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt,
from which it is further obtained that
(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0;
let u = dx/dt and v = dy/dt denote the velocity components of the optical flow along the x and y axes respectively, so that
I_x·u + I_y·v + I_t = 0,
where I_x, I_y and I_t denote the partial derivatives of the image gray values in the X, Y and T directions respectively, which can be obtained from the image data, and (u, v) is the optical flow along the X and Y axes.
9. The image stitching system of claim 8 wherein the pixel matching module performs pixel-by-pixel registration of the transformed sharp image based on the optical flow to align the template base map using the following calculation formula:
F(x+u,y+v)=P(x,y);
wherein P is a clear image coordinate pixel value before aligning the template base map, F is a clear image pixel value after aligning the template base map, X and Y represent pixel coordinates, and u and v represent optical flow magnitudes in the X and Y directions of the pixel positions.
10. The image stitching system of claim 6 wherein the image fusion module performs image fusion using the following calculation formula:
P(x,y)=w1*P1(x,y)+w2*P2(x,y)
wherein P represents the pixel value of the fused image, P1 and P2 respectively represent the pixel values of the overlapping areas of two mutually overlapped clear images, and w1 and w2 represent the weights of the pixel values of the two images.
CN202310459405.0A 2023-04-23 2023-04-23 Super-glue-based two-stage zoom camera multi-image stitching method and system Pending CN116468609A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310459405.0A CN116468609A (en) 2023-04-23 2023-04-23 Super-glue-based two-stage zoom camera multi-image stitching method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310459405.0A CN116468609A (en) 2023-04-23 2023-04-23 Super-glue-based two-stage zoom camera multi-image stitching method and system

Publications (1)

Publication Number Publication Date
CN116468609A true CN116468609A (en) 2023-07-21

Family

ID=87180489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310459405.0A Pending CN116468609A (en) 2023-04-23 2023-04-23 Super-glue-based two-stage zoom camera multi-image stitching method and system

Country Status (1)

Country Link
CN (1) CN116468609A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117218244A (en) * 2023-11-07 2023-12-12 武汉博润通文化科技股份有限公司 Intelligent 3D animation model generation method based on image recognition
CN117218244B (en) * 2023-11-07 2024-02-13 武汉博润通文化科技股份有限公司 Intelligent 3D animation model generation method based on image recognition
CN117474906A (en) * 2023-12-26 2024-01-30 合肥吉麦智能装备有限公司 Spine X-ray image matching method and intraoperative X-ray machine resetting method
CN117474906B (en) * 2023-12-26 2024-03-26 合肥吉麦智能装备有限公司 Intraoperative X-ray machine resetting method based on spine X-ray image matching

Similar Documents

Publication Publication Date Title
TWI709107B (en) Image feature extraction method and saliency prediction method including the same
CN110211043B (en) Registration method based on grid optimization for panoramic image stitching
CN110782394A (en) Panoramic video rapid splicing method and system
CN116468609A (en) Super-glue-based two-stage zoom camera multi-image stitching method and system
CN107194991B (en) Three-dimensional global visual monitoring system construction method based on skeleton point local dynamic update
WO2015127847A1 (en) Super resolution processing method for depth image
WO2018235163A1 (en) Calibration device, calibration chart, chart pattern generation device, and calibration method
JP2015022510A (en) Free viewpoint image imaging device and method for the same
CN113077519B (en) Multi-phase external parameter automatic calibration method based on human skeleton extraction
CN109448105B (en) Three-dimensional human body skeleton generation method and system based on multi-depth image sensor
CN112734863A (en) Crossed binocular camera calibration method based on automatic positioning
CN110008779A (en) A kind of stereoscopic two-dimensional code processing method and processing device
CN115937288A (en) Three-dimensional scene model construction method for transformer substation
CN110580720A (en) camera pose estimation method based on panorama
CN112017302A (en) Real-time registration method of projection mark and machine vision based on CAD model
CN115330594A (en) Target rapid identification and calibration method based on unmanned aerial vehicle oblique photography 3D model
Pathak et al. Dense 3D reconstruction from two spherical images via optical flow-based equirectangular epipolar rectification
KR20190008306A (en) Method and apparatus for developing a lens image into a panoramic image
CN108830921A (en) Laser point cloud reflected intensity correcting method based on incident angle
CN113450416A (en) TCSC (thyristor controlled series) method applied to three-dimensional calibration of three-view camera
Li et al. Research on multiview stereo mapping based on satellite video images
CN116363290A (en) Texture map generation method for large-scale scene three-dimensional reconstruction
CN112132971B (en) Three-dimensional human modeling method, three-dimensional human modeling device, electronic equipment and storage medium
CN112102504A (en) Three-dimensional scene and two-dimensional image mixing method based on mixed reality
CN109272445B (en) Panoramic video stitching method based on spherical model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination