CN117011137A - Image stitching method, device and equipment based on RGB similarity feature matching - Google Patents

Image stitching method, device and equipment based on RGB similarity feature matching

Info

Publication number
CN117011137A
Authority
CN
China
Prior art keywords
image
spliced
images
feature
preliminary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310782266.5A
Other languages
Chinese (zh)
Inventor
林檎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Beinghd Electronics Co ltd
Original Assignee
Shenzhen Beinghd Electronics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Beinghd Electronics Co ltd filed Critical Shenzhen Beinghd Electronics Co ltd
Priority to CN202310782266.5A priority Critical patent/CN117011137A/en
Publication of CN117011137A publication Critical patent/CN117011137A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/60Editing figures and text; Combining figures or text
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757Matching configurations of points or features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image stitching method, device and equipment based on RGB similarity feature matching. The stitching method comprises the following steps: extracting feature information of the images to be stitched, regularizing the feature information, calculating the global feature similarity between the two images to be stitched, calculating a prediction offset, and estimating from the prediction offset a homography matrix describing the change between at least two images to be stitched; deforming the images to be stitched based on the images themselves, the homography matrix and the identity matrix to obtain transformed images, matching the feature information and determining the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images, and generating a preliminary stitched image; and concatenating the preliminary stitched image and at least two transformed images into a nine-channel output and inputting the nine-channel output together with preset real labels into a trained generative adversarial network to obtain a high-quality stitched image. The application can thereby produce a high-quality stitching result.

Description

Image stitching method, device and equipment based on RGB similarity feature matching
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image stitching method, apparatus and device based on RGB similarity feature matching.
Background
Image stitching is an important technique in the field of image processing. It aims to stitch images captured from different viewing angles into a larger image, yielding a high-resolution, wide-viewing-angle result. The technique has a very wide range of applications, including ultra-wide-angle image synthesis for cameras, space exploration, seabed surveying, medical imaging, virtual reality, geological surveying, remote sensing and video compression, so the quality of image stitching is particularly important.
Current image stitching technology is mainly divided into traditional methods and deep-learning-based methods. Traditional image stitching has two main branches. The first is region-based image registration, which aligns images using the strength of the correlation between their pixels: interest points are selected in one image, and the corresponding pixels are then located in the other image by cross-correlation. The second is feature-based image registration, which estimates the transformation matrix between images by extracting salient blocks, lines and points as features: features are first extracted from the images to be registered, the image features are matched, the transformation matrix between the images is estimated from the matched features, and finally the images are aligned using that matrix. Traditional stitching algorithms face challenges such as repeated texture, large foreground interference, night scenes, noise interference and large parallax; because such images lack obvious features, feature extraction becomes very difficult and the extracted features are of low quality, which greatly degrades the subsequent stitching process. In recent years, deep learning has played an increasingly important role in image stitching. Compared with traditional feature extraction algorithms, features extracted by a convolutional neural network carry richer information, good feature information effectively improves the efficiency and accuracy of subsequent feature registration, and deep-learning-based stitching can also accelerate the stitching process. However, current deep stitching methods remain at an early stage compared with traditional methods, showing obvious ghosting at the seam, mediocre stitching quality, and inaccurate matching when two images share little overlapping content.
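For illustration, the feature-based branch described above can be sketched with OpenCV as follows. This is a minimal sketch of the conventional pipeline (feature extraction, matching, transformation estimation, alignment), not of the method of the present application; the detector, ratio-test threshold and canvas size are illustrative choices.

```python
import cv2
import numpy as np

def classical_stitch(img1, img2):
    # 1. Extract salient point features from grayscale versions of both images
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY), None)
    kp2, des2 = sift.detectAndCompute(cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY), None)

    # 2. Match features, keeping only unambiguous matches (Lowe's ratio test)
    matches = []
    for pair in cv2.BFMatcher().knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            matches.append(pair[0])

    # 3. Estimate the transformation matrix from the matched features (RANSAC)
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # 4. Align img1 into img2's frame and overlay img2 on the shared canvas
    h, w = img2.shape[:2]
    canvas = cv2.warpPerspective(img1, H, (2 * w, h))
    canvas[:h, :w] = img2
    return canvas
```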
Disclosure of Invention
In view of the above technical problems, the application provides an image stitching method, device and equipment based on RGB similarity feature matching. The method first estimates the homography matrix relating two images, obtains transformed images from the homography matrix using an adaptive spatial transformation network, produces a preliminary stitching result by RGB fusion, and finally feeds the preliminary result and the generated real labels into a pre-trained generative adversarial network to obtain a high-quality stitching result.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the disclosure.
According to an aspect of the present application, an image stitching method based on RGB similarity feature matching is provided, where the stitching method includes:
extracting feature information of the images to be stitched based on ResNeXt grouped convolution and an attention mechanism, performing L1 regularization on the feature information, calculating the global feature similarity between the two images to be stitched, calculating a prediction offset, and estimating from the prediction offset a homography matrix describing the change between at least two images to be stitched;
deforming the images to be stitched based on an adaptive spatial transformation network SA-STN, the images themselves, the homography matrix and the identity matrix to obtain transformed images, matching the feature information and determining the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images, and generating a preliminary stitched image;
and concatenating the preliminary stitched image and at least two transformed images into a nine-channel output, the nine-channel output comprising the feature information of the images to be stitched and the preliminary stitched image, and inputting the nine-channel output together with preset real labels into a trained generative adversarial network to obtain a high-quality stitched image.
Further, before extracting the feature information, the method further includes:
and performing grayscale processing on the images to be stitched.
Further, extracting the feature information of the images to be stitched includes:
extracting the feature information of at least two images to be stitched respectively, based on ResNeXt grouped convolution;
acquiring, based on an attention mechanism, the parts of the feature information of the at least two images to be stitched whose weights are above a threshold.
Further, estimating from the prediction offset the homography matrix describing the change between at least two images to be stitched includes:
the prediction offset being the horizontal and vertical offsets of the four vertices of one image to be stitched relative to the other image to be stitched;
and performing a direct linear transformation on the horizontal and vertical offsets of the four vertices to estimate the homography matrix.
Further, the matching of the feature information and determination of the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images specifically includes:
taking one point from the feature region of each of the two transformed images and establishing a loss function over the R, G, B values of each point;
obtaining, based on the loss function, the R, G, B differences between all points of the two transformed images;
and constructing an overall loss function from the differences and determining the optimal match.
Further, generating the preliminary stitched image includes:
average-fusing the pixels of the feature intervals of the two transformed images, based on the feature information corresponding to the optimal match, to obtain the preliminary stitched image.
Further, deforming the images to be stitched by the adaptive spatial transformation network SA-STN, based on the images to be stitched, the homography matrix and the identity matrix, includes:
adaptively generating, based on the adaptive spatial transformation network SA-STN, a grid of the same size as the stitching label for at least two acquired images to be stitched, each element in the grid representing its two-dimensional spatial position, expressed as coordinates (u_i, v_i), where i = 1, 2, and the coordinates are calculated by the formula:
(u_i·s, v_i·s, s)^T = H·(x, y, z)^T
wherein H is the homography matrix representing the projective transformation from one of the images to be stitched to the other, (x, y, z) are the homogeneous coordinates of the original image to be stitched, and s is the homogeneous scale factor; the coordinates of one transformed image are determined on this basis, and replacing H with the identity matrix E yields the other transformed image.
According to a second aspect of the present disclosure, there is provided an image stitching device based on RGB similarity feature matching, comprising a homography estimation module, which includes: a feature extraction unit for extracting feature information of the images to be stitched; a regularization unit for performing L1 regularization on the feature information; a global correlation unit for calculating the global feature similarity between the two images to be stitched; a convolution and fully-connected unit for calculating the prediction offset between the two images to be stitched; and a direct linear transformation unit for estimating, from the prediction offset, the homography matrix describing the change between at least two images to be stitched;
a preliminary stitching module, which includes: a spatial transformation unit for deforming the images to be stitched into transformed images based on the adaptive spatial transformation network SA-STN, the homography matrix and the identity matrix; and an RGB fusion unit for matching the feature information and determining the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images, generating a preliminary stitched image;
an image generation module, which includes: a connection unit for concatenating the preliminary stitched image and at least two transformed images into a nine-channel output, the nine-channel output comprising the feature information of the images to be stitched and the preliminary stitched image; and a GAN network unit for inputting the nine-channel output and the preset real labels into a trained generative adversarial network to obtain a high-quality stitched image.
According to a third aspect of the present disclosure, there is provided an image stitching apparatus based on RGB similarity feature matching, comprising: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to: extract feature information of the images to be stitched, perform L1 regularization on the feature information, calculate the global feature similarity between the two images to be stitched, calculate a prediction offset, and estimate from the prediction offset a homography matrix describing the change between at least two images to be stitched;
deform the images to be stitched based on an adaptive spatial transformation network SA-STN, the images themselves, the homography matrix and the identity matrix to obtain transformed images, match the feature information and determine the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images, and generate a preliminary stitched image;
and concatenate the preliminary stitched image and at least two transformed images into a nine-channel output, the nine-channel output comprising the feature information of the images to be stitched and the preliminary stitched image, and input the nine-channel output together with preset real labels into a trained generative adversarial network to obtain a high-quality stitched image.
The technical scheme of the present disclosure has the following beneficial effects:
1. An attention mechanism is introduced so that more important information receives higher weight during feature extraction, which can improve feature extraction accuracy;
2. the feature extraction network adopts a ResNeXt structure, whose grouped convolution allows more, and more complex, feature information to be extracted from fewer samples;
3. in the image fusion stage, a new fusion mode, RGB fusion, is proposed: fusing according to the RGB similarity of the pixels in the feature regions improves the accuracy of image stitching;
4. the preliminary stitching result and the real label are input into a generative adversarial network to generate a high-quality image.
Drawings
Fig. 1 is a flowchart of an image stitching method based on RGB similarity feature matching in an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of an original image randomly selected from a COCO2014 dataset in an embodiment of the present disclosure;
FIG. 3 is a flow chart of generating a data set as used herein in an embodiment of the present description;
fig. 4 is a schematic diagram of the graying process of images to be stitched in an embodiment of the present disclosure;
FIG. 5 is a flow chart of a homography estimation phase in an embodiment of the present disclosure;
FIG. 6 is a flow chart of the preliminary stitching stage of images to be stitched in an embodiment of the present disclosure;
FIG. 7 is a flow chart of an RGB fusion process in an embodiment of the present disclosure;
FIG. 8 is a flow chart of an image generation phase in an embodiment of the present disclosure;
fig. 9 is a schematic diagram of an image stitching device based on RGB similarity feature matching in the embodiment of the present disclosure;
fig. 10 is a terminal device for implementing an image stitching method based on RGB similarity feature matching in the embodiment of the present disclosure;
fig. 11 is a computer readable storage medium for implementing an image stitching method based on RGB similarity feature matching in an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the present disclosure. One skilled in the relevant art will recognize, however, that the aspects of the disclosure may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are only schematic illustrations of the present disclosure. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
As shown in fig. 1, an embodiment of the present disclosure provides an image stitching method based on RGB similarity feature matching, where an execution body of the method may be a terminal device, where the terminal device may be a mobile phone, a tablet computer, a personal computer, or the like. The method specifically may include the following steps S200, S300, S400:
In step S200, feature information of the images to be stitched is extracted based on ResNeXt grouped convolution and an attention mechanism, L1 regularization is performed on the feature information, the global feature similarity between the two images to be stitched is calculated, a prediction offset is calculated, and a homography matrix describing the change between at least two images to be stitched is estimated from the prediction offset.
In step S300, the images to be stitched are deformed into transformed images based on the adaptive spatial transformation network SA-STN, the images themselves, the homography matrix and the identity matrix; the feature information is matched and the optimal match determined according to the pixel-level RGB similarity of the feature regions of at least two transformed images, generating a preliminary stitched image.
In step S400, the preliminary stitched image and at least two transformed images are concatenated into a nine-channel output, the nine-channel output comprising the feature information of the images to be stitched and the preliminary stitched image, and the nine-channel output and preset real labels are input into a trained generative adversarial network to obtain a high-quality stitched image.
Specifically, to assist in explaining how to stitch images in the embodiments of the present specification, as shown in fig. 2 to 4, a data set used in the embodiments is generated, including steps S101 to S105:
In step S101, the COCO2014 data set is used as the original data set; that is, the training and testing sets required for implementation are generated from COCO2014. As shown in fig. 2, a picture is randomly selected to generate the training and testing sets;
in step S102, a 128 x 128 region is randomly cropped from a picture in the COCO2014 data set, as shown by the green first frame in fig. 3, and the content of the first frame is cropped out as I_Input1 (to ensure the experimental effect, the region is cropped from the vicinity of the center of the original image);
in step S103, the first frame cropped in step S102 is randomly displaced within the range [-64, 64] to obtain the second frame in fig. 3;
in step S104, the four vertices of the second frame obtained in step S103 are randomly perturbed within the range [-25, 25] to obtain the third frame in fig. 3; the offsets f of the four vertices of the red frame relative to the green frame serve as the true offsets in the homography estimation stage, and the pixel content enclosed by the red and green frames serves as the Label in the content correction stage;
in step S105, the inverse homography matrix calculated from the green frame to the red frame is applied to the original image, warping the third frame into alignment with the second frame, and the second-frame region is cropped out as I_Input2, thereby obtaining a set of training samples (I_Input1, I_Input2, f, Label).
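The sample-generation steps S101-S105 can be sketched as follows. This is a minimal illustration assuming a COCO2014 image loaded as a numpy array; the Label generation and boundary checks are omitted for brevity, and the helper name make_sample is hypothetical.

```python
import cv2
import numpy as np

def make_sample(img, patch=128, max_shift=64, max_perturb=25):
    h, w = img.shape[:2]
    # S102: crop a 128x128 first frame near the image center -> I_Input1
    x0 = w // 2 - patch // 2 + np.random.randint(-20, 21)
    y0 = h // 2 - patch // 2 + np.random.randint(-20, 21)
    input1 = img[y0:y0 + patch, x0:x0 + patch].copy()
    corners1 = np.float32([[x0, y0], [x0 + patch, y0],
                           [x0 + patch, y0 + patch], [x0, y0 + patch]])

    # S103: randomly shift the frame within [-64, 64] -> second frame
    shift = np.random.randint(-max_shift, max_shift + 1, size=2).astype(np.float32)
    corners2 = corners1 + shift

    # S104: perturb the four vertices within [-25, 25] -> third frame;
    # f holds the per-vertex offsets of the third frame relative to the first
    perturb = np.random.randint(-max_perturb, max_perturb + 1,
                                size=(4, 2)).astype(np.float32)
    corners3 = corners2 + perturb
    f = (corners3 - corners1).flatten()  # 8 ground-truth offset parameters

    # S105: warp the original so the third frame aligns with the second,
    # then crop the second-frame position -> I_Input2
    H_inv = cv2.getPerspectiveTransform(corners3, corners2)
    warped = cv2.warpPerspective(img, H_inv, (w, h))
    xs, ys = int(corners2[0, 0]), int(corners2[0, 1])
    input2 = warped[ys:ys + patch, xs:xs + patch].copy()
    return input1, input2, f
```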
In one embodiment, after determining the training sample, as shown in fig. 4-5, homography estimation is performed on the images to be stitched to obtain a homography matrix H, and step S200 is performed, where step S200 specifically includes steps S201 to S206:
In step S201, the images to be stitched are preprocessed to obtain grayscale versions, as shown in fig. 4. Grayscale conversion reduces computational complexity, since only a single-channel image is operated on; it eliminates the interference that differing color distributions of the color pictures would cause in stitching; and it can improve stitching accuracy, since the grayscale image has higher contrast and detail information, making the stitched image more real and natural.
In step S202, features of the preprocessed images to be stitched are extracted with a weight-shared feature extractor. This step mainly adopts a ResNeXt-based feature network with an attention mechanism: the grouped convolution of ResNeXt extracts more feature information from fewer samples, while the attention mechanism assigns high weights to the important information shared by the two images (such as repeated textures, lines and corner points) and low weights to mismatched information, reducing erroneous feature information.
In step S203, the feature vectors in the extracted feature information are regularized; L1 regularization (L1 normalization) is chosen here, since L1 regularization produces a sparse weight matrix that can be used for feature selection and also helps prevent overfitting.
In step S204, the regularized output is sent to the global correlation layer, and the similarity between the two feature maps is calculated to reduce the average corner error and improve the accuracy of the homography estimation, according to the following formula:
CV(x_1, x_2) = <FA(x_1), FB(x_2)> / (||FA(x_1)|| · ||FB(x_2)||)
where x_1 and x_2 are positions in feature map 1 and feature map 2, respectively; FA(x_1) is the one-dimensional feature vector at position x_1 in feature map 1, and FB(x_2) is the one-dimensional feature vector at position x_2 in feature map 2; CV(x_1, x_2) is the normalized correlation of FA(x_1) and FB(x_2): the closer it is to 1, the higher their similarity, i.e., the better the feature match.
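Steps S203-S204 can be sketched as follows: a minimal PyTorch illustration assuming feature maps FA, FB of shape (B, C, H, W) from the weight-shared extractor, and assuming the correlation is taken as the inner product of the normalized per-position feature vectors.

```python
import torch
import torch.nn.functional as F

def global_correlation(FA: torch.Tensor, FB: torch.Tensor) -> torch.Tensor:
    b, c, h, w = FA.shape
    # S203: L1-normalize each position's feature vector along the channel axis
    FA = F.normalize(FA.flatten(2), p=1, dim=1)   # (B, C, H*W)
    FB = F.normalize(FB.flatten(2), p=1, dim=1)
    # S204: correlate every position x1 in map 1 with every position x2 in map 2
    cv = torch.bmm(FA.transpose(1, 2), FB)        # (B, H*W, H*W)
    return cv.view(b, h * w, h, w)                # per-x1 similarity maps
```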
In step S205, the global correlation layer is followed by convolutional layers and a fully-connected layer, which output 8 parameters f: the horizontal and vertical offsets of the four vertices of image I_Input2 relative to I_Input1, i.e., the vertex offsets between the two images to be stitched.
In step S206, from the 8 parameters f of step S205, the homography matrix H transforming image I_Input2 to I_Input1 can be estimated by Direct Linear Transformation (DLT), an algorithm that establishes a direct linear relationship between image-point coordinates and the object-space coordinates of the corresponding object points.
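The four-point DLT of step S206 can be sketched as follows: a minimal illustration assuming f is ordered as (dx1, dy1, ..., dx4, dy4) for the four vertices of a 128 x 128 patch; the mapping direction follows the convention above and can be inverted if the opposite convention is used.

```python
import cv2
import numpy as np

def offsets_to_homography(f: np.ndarray, patch: int = 128) -> np.ndarray:
    base = np.float32([[0, 0], [patch, 0], [patch, patch], [0, patch]])
    shifted = base + f.reshape(4, 2).astype(np.float32)
    # DLT: solve for the homography mapping I_Input2's offset vertices back
    # onto I_Input1's vertices, per the convention H: I_Input2 -> I_Input1
    return cv2.getPerspectiveTransform(shifted, base)
```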
In an embodiment, after determining the homography matrix H, as shown in fig. 6, the two images to be stitched may be initially stitched, and step S300 is performed, where step S300 specifically includes the following steps S310 and S320:
In step S310, the images to be stitched are input into the adaptive spatial transformation network SA-STN to obtain the transformed images W_Input1 and W_Input2; specifically, this comprises steps S311-S312:
In step S311, a grid of the same size as the stitching label is adaptively generated for the at least two acquired images to be stitched, based on the adaptive spatial transformation network SA-STN; each element in the grid represents its two-dimensional spatial position, expressed as coordinates (u_i, v_i), where i = 1, 2, calculated by the formula:
(u_i·s, v_i·s, s)^T = H·(x, y, z)^T
where H is the homography matrix representing the projective transformation from the image to be stitched I_Input2 to the other image to be stitched I_Input1, (x, y, z) are the homogeneous coordinates of the original image to be stitched, and s is the homogeneous scale factor; on this basis the coordinates of one transformed image, W_Input2, are determined. E is the identity matrix; replacing H with E yields the other transformed image, W_Input1.
In step S312, the smoothed transformed images W_Input1 and W_Input2 are obtained by bilinear interpolation, as shown in fig. 6. The bilinear interpolation may be calculated by reference to prior-art bilinear interpolation algorithms, which the present disclosure does not limit.
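Steps S311-S312 can be sketched as follows: a minimal illustration using OpenCV's warpPerspective with bilinear interpolation in place of a full SA-STN implementation, with the canvas size standing in for the stitching-label size.

```python
import cv2
import numpy as np

def sa_stn_warp(input1, input2, H, canvas_size):
    w, h = canvas_size
    E = np.eye(3, dtype=np.float64)  # identity: I_Input1 is only re-sampled
    # Bilinear interpolation (INTER_LINEAR) gives the smoothed transformed images
    w_input1 = cv2.warpPerspective(input1, E, (w, h), flags=cv2.INTER_LINEAR)
    w_input2 = cv2.warpPerspective(input2, H, (w, h), flags=cv2.INTER_LINEAR)
    return w_input1, w_input2
```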
In step S320, the transformed images W_Input1 and W_Input2 obtained in step S312 are preliminarily stitched, i.e., stitched by RGB fusion: a point is taken from the feature region of each of the two transformed images and a loss function is established over the R, G, B values of each point; based on the loss function, the R, G, B differences between all points of W_Input1 and W_Input2 are obtained; based on these differences, an overall loss function is constructed and the optimal match is determined, as shown in fig. 7.
In the preliminary stitching, RGB fusion is performed on the transformed images W_Input1 and W_Input2; the more specific steps are S321 to S324:
In step S321, the transformed images W_Input1 and W_Input2 are denoted A and B, and the RGB values of A and B are expressed as:
A: (R_i, G_i, B_i), i ∈ Φ_A, the feature region of A
B: (R_j, G_j, B_j), j ∈ Φ_B, the feature region of B
In step S322, feature information matching is performed, and loss functions over R, G, B are established:
Loss_1 = |R_i - R_j|, i ∈ Φ_A, j ∈ Φ_B
Loss_2 = |G_i - G_j|, i ∈ Φ_A, j ∈ Φ_B
Loss_3 = |B_i - B_j|, i ∈ Φ_A, j ∈ Φ_B
where Loss_1, Loss_2 and Loss_3 respectively represent the R, G and B differences between the points of the transformed images W_Input1 and W_Input2;
in step S323, an overall loss function is constructed and the optimal matching information is found:
Loss = λ_1·Loss_1 + λ_2·Loss_2 + λ_3·Loss_3
the smaller the Loss, the better the feature information is matched;
in step S324, according to the feature information corresponding to the minimum Loss, the pixels in the feature intervals of the two images are average-fused to obtain the preliminary stitched image.
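Steps S321-S324 can be sketched as follows: a minimal illustration assuming the feature regions are given as lists of (x, y) pixel coordinates, equal weights λ_1 = λ_2 = λ_3, and matching channel order between the two images; the exhaustive pair search is for clarity only.

```python
import numpy as np

def rgb_match(A, B, region_a, region_b, lambdas=(1.0, 1.0, 1.0)):
    best, best_pair = np.inf, None
    for (xa, ya) in region_a:
        Ra, Ga, Ba = A[ya, xa].astype(np.float64)
        for (xb, yb) in region_b:
            Rb, Gb, Bb = B[yb, xb].astype(np.float64)
            # S322-S323: per-channel losses combined into the overall loss
            loss = (lambdas[0] * abs(Ra - Rb) + lambdas[1] * abs(Ga - Gb)
                    + lambdas[2] * abs(Ba - Bb))
            if loss < best:
                best, best_pair = loss, ((xa, ya), (xb, yb))
    return best_pair  # the point pair with the highest RGB similarity

def average_fuse(A, B, overlap_mask):
    # S324: average-fuse the pixels of the matched feature interval
    out = A.copy()
    out[overlap_mask] = ((A[overlap_mask].astype(np.float64)
                          + B[overlap_mask].astype(np.float64)) / 2).astype(A.dtype)
    return out
```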
In one embodiment, after determining the preliminary stitched image, as shown in fig. 8, the quality of the preliminary stitched image may be improved, and step S400 is performed, where step S400 specifically includes the following steps S410-S420:
In step S410, the preliminary stitching result and the interpolated transformed images W_Input1 and W_Input2 are channel-concatenated to obtain a nine-channel output comprising the preliminary stitched image and all the feature information of the images to be stitched.
In step S420, the nine-channel output from step S410 and the real Label generated in step S104 are input into a pre-trained generative adversarial network (GAN), which outputs the high-quality stitching result.
The construction and training of the generative adversarial network may follow the prior art, which the present disclosure does not limit.
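Steps S410-S420 can be sketched as follows: a minimal PyTorch illustration in which generator is a placeholder for the trained GAN generator; the real Label is assumed to have been used as supervision at training time, so only the nine-channel tensor is needed at inference.

```python
import torch

def refine(prelim, w_input1, w_input2, generator):
    # Each input is a (B, 3, H, W) tensor; channel concatenation gives (B, 9, H, W)
    nine_ch = torch.cat([prelim, w_input1, w_input2], dim=1)
    with torch.no_grad():
        refined = generator(nine_ch)   # high-quality stitched result
    return refined
```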
In view of the problems in current deep-learning-based image stitching, the above embodiment provides a method of image stitching based on RGB similarity feature matching and GAN optimization. It uses the attention mechanism to focus on the features of important regions, so that feature information can be extracted effectively and stitching accuracy improved, and it introduces a pre-trained generative adversarial network to output high-quality stitched images. The method comprises three main stages: first, the homography matrix relating the two images is estimated; then transformed images are obtained from the homography matrix using the adaptive spatial transformation network, and a preliminary stitching result is obtained by RGB fusion; finally, a high-quality stitching result is obtained from the preliminary result and the generated real labels through the pre-trained generative adversarial network.
Based on the same concept, as shown in fig. 9, the exemplary embodiment of the present disclosure further provides an image stitching device 900 based on RGB similarity feature matching, comprising a homography estimation module 901, which includes: a feature extraction unit for extracting feature information of the images to be stitched; a regularization unit for performing L1 regularization on the feature information; a global correlation unit for calculating the global feature similarity between the two images to be stitched; a convolution and fully-connected unit for calculating the prediction offset between the two images to be stitched; and a direct linear transformation unit for estimating, from the prediction offset, the homography matrix describing the change between at least two images to be stitched;
a preliminary stitching module 902, which includes: a spatial transformation unit for deforming the images to be stitched into transformed images based on the adaptive spatial transformation network SA-STN, the homography matrix and the identity matrix; and an RGB fusion unit for matching the feature information and determining the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images, generating a preliminary stitched image;
an image generation module 903, which includes: a connection unit for concatenating the preliminary stitched image and at least two transformed images into a nine-channel output, the nine-channel output comprising the feature information of the images to be stitched and the preliminary stitched image; and a GAN network unit for inputting the nine-channel output and the preset real labels into a trained generative adversarial network to obtain a high-quality stitched image.
The image stitching device 900 introduces an attention mechanism that gives higher weight to more important information during feature extraction, improving feature extraction accuracy; its feature extraction network adopts a ResNeXt structure, whose grouped convolution extracts more, and more complex, feature information from fewer samples; in the image fusion stage it provides a new fusion mode, RGB fusion, improving stitching accuracy by fusing according to the RGB similarity of the pixels in the feature regions; and it inputs the preliminary stitching result and the real label into a generative adversarial network to generate a high-quality image.
The specific details of each module/unit in the above apparatus are already described in the method section embodiments, and the details not disclosed may refer to the method section embodiments, so that they will not be described in detail.
Based on the same thought, the embodiment of the present disclosure further provides an image stitching device based on RGB similarity feature matching, as shown in fig. 10.
The image stitching device may be a terminal device or a server provided in the above embodiment.
The image stitching device may vary considerably in configuration or performance, and may include one or more processors 1001 and memory 1002, where the memory 1002 may store one or more stored applications or data. The memory 1002 may include readable media in the form of volatile memory units, such as Random Access Memory (RAM) units and/or cache memory units, and may further include read only memory units. The application programs stored in memory 1002 may include one or more program modules (not shown), including but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Still further, the processor 1001 may be configured to communicate with the memory 1002 and execute a series of computer executable instructions in the memory 1002 on the image stitching device. The image stitching device may also include one or more power sources 1003, one or more wired or wireless network interfaces 1004, one or more I/O interfaces (input/output interfaces) 1005, one or more external devices 1006 (e.g., keyboard, hand-drawn pad, bluetooth device, etc.), one or more devices that enable a user to interact with the device, and/or any devices (e.g., routers, modems, etc.) that enable the device to communicate with one or more other computing devices. Such communication may occur through the I/O interface 1005. Also, the device can communicate with one or more networks (e.g., a Local Area Network (LAN)) via a wired or wireless interface 1004.
In particular, in this embodiment, the image stitching device includes a memory, and one or more programs, where the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer executable instructions for the image stitching device, and execution of the one or more programs by the one or more processors includes computer executable instructions for:
extracting feature information of the images to be stitched based on ResNeXt grouped convolution and an attention mechanism, performing L1 regularization on the feature information, calculating the global feature similarity between the two images to be stitched, calculating a prediction offset, and estimating from the prediction offset a homography matrix describing the change between at least two images to be stitched; deforming the images to be stitched based on an adaptive spatial transformation network SA-STN, the images themselves, the homography matrix and the identity matrix to obtain transformed images, matching the feature information and determining the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images, and generating a preliminary stitched image; and concatenating the preliminary stitched image and at least two transformed images into a nine-channel output, the nine-channel output comprising the feature information of the images to be stitched and the preliminary stitched image, and inputting the nine-channel output together with preset real labels into a trained generative adversarial network to obtain a high-quality stitched image.
Based on the same idea, exemplary embodiments of the present disclosure further provide a computer readable storage medium having stored thereon a program product capable of implementing the method described in the present specification. In some possible implementations, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the disclosure as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.
Referring to fig. 11, a program product 1100 for implementing the above-described method according to an exemplary embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, Random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the exemplary embodiments of the present disclosure.
Furthermore, the above-described figures are only schematic illustrations of processes included in the method according to the exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with exemplary embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any adaptations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. An image stitching method based on RGB similarity feature matching, characterized by comprising the following steps:
extracting feature information of the images to be stitched based on ResNeXt grouped convolution and an attention mechanism, performing L1 regularization on the feature information, calculating the global feature similarity between the two images to be stitched, calculating a prediction offset, and estimating from the prediction offset a homography matrix describing the change between at least two images to be stitched;
deforming the images to be stitched based on an adaptive spatial transformation network SA-STN, the images themselves, the homography matrix and the identity matrix to obtain transformed images, matching the feature information and determining the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images, and generating a preliminary stitched image;
and concatenating the preliminary stitched image and at least two transformed images into a nine-channel output, the nine-channel output comprising the feature information of the images to be stitched and the preliminary stitched image, and inputting the nine-channel output together with preset real labels into a trained generative adversarial network to obtain a high-quality stitched image.
2. The RGB similarity feature matching-based image stitching method of claim 1, further comprising, prior to extracting the feature information:
and performing grayscale processing on the images to be stitched.
3. The RGB similarity feature matching-based image stitching method according to claim 1, wherein extracting the feature information of the images to be stitched comprises:
extracting the feature information of at least two images to be stitched respectively, based on ResNeXt grouped convolution;
acquiring, based on an attention mechanism, the parts of the feature information of the at least two images to be stitched whose weights are above a threshold.
4. The image stitching method based on RGB similarity feature matching according to claim 1, wherein estimating from the prediction offset a homography matrix describing the change between at least two images to be stitched comprises:
the prediction offset being the horizontal and vertical offsets of the four vertices of one image to be stitched relative to the other image to be stitched;
and performing a direct linear transformation on the horizontal and vertical offsets of the four vertices to estimate the homography matrix.
5. The image stitching method based on RGB similarity feature matching according to claim 1, wherein the matching of the feature information and determination of the optimal match according to the pixel-level RGB similarity of the feature regions of at least two of the transformed images specifically comprises:
taking one point from the feature region of each of the two transformed images and establishing a loss function over the R, G, B values of each point;
obtaining, based on the loss function, the R, G, B differences between all points of the two transformed images;
and constructing an overall loss function from the differences and determining the optimal match.
6. The RGB similarity feature matching-based image stitching method of claim 1, wherein generating the preliminary stitched image comprises:
average-fusing the pixels of the feature intervals of the two transformed images, based on the feature information corresponding to the optimal match, to obtain the preliminary stitched image.
7. The image stitching method based on RGB similarity feature matching according to claim 1, wherein deforming the images to be stitched by the adaptive spatial transformation network SA-STN, based on the images to be stitched, the homography matrix and the identity matrix, comprises:
adaptively generating, based on the adaptive spatial transformation network SA-STN, a grid of the same size as the stitching label for at least two acquired images to be stitched, each element in the grid representing its two-dimensional spatial position, expressed as coordinates (u_i, v_i), where i = 1, 2, calculated by the formula:
(u_i·s, v_i·s, s)^T = H·(x, y, z)^T
wherein H is the homography matrix representing the projective transformation from one of the images to be stitched to the other, (x, y, z) are the homogeneous coordinates of the original image to be stitched, and s is the homogeneous scale factor; the coordinates of one transformed image are determined on this basis, and replacing H with the identity matrix E yields the other transformed image.
8. The image stitching method based on RGB similarity feature matching according to claim 1, wherein after the transformed image is obtained, bilinear interpolation processing is performed on the transformed image to obtain a smoothed transformed image.
9. An image stitching device based on RGB similarity feature matching, comprising:
a homography estimation module, comprising: a feature extraction unit for extracting feature information of the images to be stitched; a regularization unit for performing L1 regularization on the feature information; a global correlation unit for calculating the global feature similarity between the two images to be stitched; a convolution and fully-connected unit for calculating the prediction offset between the two images to be stitched; and a direct linear transformation unit for estimating, from the prediction offset, the homography matrix describing the change between at least two images to be stitched;
a preliminary stitching module, comprising: a spatial transformation unit for deforming the images to be stitched into transformed images based on the adaptive spatial transformation network SA-STN, the homography matrix and the identity matrix; and an RGB fusion unit for matching the feature information and determining the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images, generating a preliminary stitched image;
an image generation module, comprising: a connection unit for concatenating the preliminary stitched image and at least two transformed images into a nine-channel output, the nine-channel output comprising the feature information of the images to be stitched and the preliminary stitched image; and a GAN network unit for inputting the nine-channel output and the preset real labels into a trained generative adversarial network to obtain a high-quality stitched image.
10. An image stitching device based on RGB similarity feature matching, comprising:
a processor; and a memory arranged to store computer executable instructions that, when executed, cause the processor to:
extract feature information of the images to be stitched, perform L1 regularization on the feature information, calculate the global feature similarity between the two images to be stitched, calculate a prediction offset, and estimate from the prediction offset a homography matrix describing the change between at least two images to be stitched;
deform the images to be stitched based on an adaptive spatial transformation network SA-STN, the images themselves, the homography matrix and the identity matrix to obtain transformed images, match the feature information and determine the optimal match according to the pixel-level RGB similarity of the feature regions of at least two transformed images, and generate a preliminary stitched image;
and concatenate the preliminary stitched image and at least two transformed images into a nine-channel output, the nine-channel output comprising the feature information of the images to be stitched and the preliminary stitched image, and input the nine-channel output together with preset real labels into a trained generative adversarial network to obtain a high-quality stitched image.
CN202310782266.5A 2023-06-28 2023-06-28 Image stitching method, device and equipment based on RGB similarity feature matching Pending CN117011137A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310782266.5A CN117011137A (en) 2023-06-28 2023-06-28 Image stitching method, device and equipment based on RGB similarity feature matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310782266.5A CN117011137A (en) 2023-06-28 2023-06-28 Image stitching method, device and equipment based on RGB similarity feature matching

Publications (1)

Publication Number Publication Date
CN117011137A true CN117011137A (en) 2023-11-07

Family

ID=88562737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310782266.5A Pending CN117011137A (en) 2023-06-28 2023-06-28 Image stitching method, device and equipment based on RGB similarity feature matching

Country Status (1)

Country Link
CN (1) CN117011137A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117558011A (en) * 2024-01-08 2024-02-13 暨南大学 Image text tampering detection method based on self-consistency matrix and multi-scale loss


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019047284A1 (en) * 2017-09-05 2019-03-14 平安科技(深圳)有限公司 Methods for feature extraction and panoramic stitching, and apparatus thereof, device, readable storage medium
US20200327642A1 (en) * 2019-04-15 2020-10-15 Tata Consultancy Services Limited System and method for stitching images using non-linear optimization and multi-constraint cost function minimization
US20220101542A1 (en) * 2019-05-10 2022-03-31 State Grid Zheiang Electronic Power Co., Ltd. Taizhou Power Supply Company Method and apparatus for stitching dual-camera images and electronic device
US20220044356A1 (en) * 2019-12-17 2022-02-10 Dalian University Of Technology Large-field-angle image real-time stitching method based on calibration
CN112308773A (en) * 2020-08-19 2021-02-02 贵州电网有限责任公司 Unmanned aerial vehicle aerial image nondestructive amplification and splicing fusion method
CN113689331A (en) * 2021-07-20 2021-11-23 中国铁路设计集团有限公司 Panoramic image splicing method under complex background
CN115115522A (en) * 2022-08-15 2022-09-27 浙江工业大学 Goods shelf commodity image splicing method and system
CN115661015A (en) * 2022-11-14 2023-01-31 智慧互通科技股份有限公司 Binocular camera image fusion method and system based on homography transformation matrix
CN115841422A (en) * 2022-12-05 2023-03-24 杭州电子科技大学 Image splicing method based on pyramid structure super-resolution network

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AOCHENG LI ET AL: "Image Stitching Based on Semantic Planar Region Consensus", IEEE TRANSACTIONS ON IMAGE PROCESSING, vol. 30, 31 December 2021 (2021-12-31), pages 5545 - 5558, XP011860534, DOI: 10.1109/TIP.2021.3086079 *
张欣鹏; 杜伟强; 肖志涛; 张芳; 耿磊; 吴骏; 苏龙: "Color fundus image stitching method combining SIFT transform and Otsu matching", Computer Engineering and Applications, no. 18, 31 December 2017 (2017-12-31), pages 181-186 *
衡玮 et al.: "Real-time video stitching method based on dense viewpoint interpolation", Acta Optica Sinica, 10 May 2023 (2023-05-10), pages 1-20 *
迟龙云; 张海; 赵晨旭: "Research on image stitching and positioning algorithms based on local homography matrices", Navigation Positioning and Timing, no. 03, 15 May 2020 (2020-05-15), pages 68-75 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117558011A (en) * 2024-01-08 2024-02-13 暨南大学 Image text tampering detection method based on self-consistency matrix and multi-scale loss
CN117558011B (en) * 2024-01-08 2024-04-26 暨南大学 Image text tampering detection method based on self-consistency matrix and multi-scale loss

Similar Documents

Publication Publication Date Title
US11410323B2 (en) Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image
CN107330439B (en) Method for determining posture of object in image, client and server
US20220222776A1 (en) Multi-Stage Multi-Reference Bootstrapping for Video Super-Resolution
TWI739151B (en) Method, device and electronic equipment for image generation network training and image processing
GB2553782A (en) Predicting depth from image data using a statistical model
WO2022134971A1 (en) Noise reduction model training method and related apparatus
CN113066017B (en) Image enhancement method, model training method and equipment
CN111915480B (en) Method, apparatus, device and computer readable medium for generating feature extraction network
CN111460876B (en) Method and apparatus for identifying video
CN113256529B (en) Image processing method, image processing device, computer equipment and storage medium
CN117011137A (en) Image stitching method, device and equipment based on RGB similarity feature matching
CN113780326A (en) Image processing method and device, storage medium and electronic equipment
US11961266B2 (en) Multiview neural human prediction using implicit differentiable renderer for facial expression, body pose shape and clothes performance capture
EP4292059A1 (en) Multiview neural human prediction using implicit differentiable renderer for facial expression, body pose shape and clothes performance capture
CN114677422A (en) Depth information generation method, image blurring method and video blurring method
CN113766117B (en) Video de-jitter method and device
WO2022021287A1 (en) Data enhancement method and training method for instance segmentation model, and related apparatus
CN116934591A (en) Image stitching method, device and equipment for multi-scale feature extraction and storage medium
CN115272575B (en) Image generation method and device, storage medium and electronic equipment
Wang et al. Deep intensity guidance based compression artifacts reduction for depth map
CN113065585B (en) Training method and device of image synthesis model and electronic equipment
Zhang et al. End-to-end learning of self-rectification and self-supervised disparity prediction for stereo vision
CN112927200B (en) Intrinsic image decomposition method and device, readable storage medium and electronic equipment
CN112052863B (en) Image detection method and device, computer storage medium and electronic equipment
CN111582121A (en) Method for capturing facial expression features, terminal device and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination