CN112950466A - Image splicing method based on semantic object matching - Google Patents

Image splicing method based on semantic object matching

Info

Publication number
CN112950466A
Authority
CN
China
Prior art keywords
image
semantic
matching
method based
splicing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110104851.0A
Other languages
Chinese (zh)
Inventor
周忠
李萌
吕伟
杨硌
梅澜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202110104851.0A
Publication of CN112950466A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038 Image mosaicing, e.g. composing plane images from plane sub-images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image stitching method based on semantic object matching, suitable for images or video frames with overlapping regions. By introducing semantic information, the method can improve the quality of large-scene or panoramic stitching results. The stitched result also carries semantic labels, which makes it better suited to subsequent image or video analysis and supports information association and the scientific management and use of related images or videos.

Description

Image splicing method based on semantic object matching
Technical Field
The invention belongs to the technical field of image processing and relates to feature extraction, semantic matching, image matching and image stitching. In particular, it relates to an image stitching method based on semantic object matching that introduces high-level semantic information of the image into the image stitching framework.
Background
With the rapid growth in the number of videos and the fragmentation of video content, it has become difficult to extract useful information from videos quickly, and image stitching is one effective way to address video fragmentation. Image stitching combines video frames that have overlapping regions into frames with a wider field of view, or into a panorama, thereby associating the information in the videos. Image stitching methods generally fall into two categories: pixel-based methods (direct methods) and feature-based methods. Pixel-based methods work on the image itself and estimate the transformation parameters from pixel information such as depth, gradient, color and geometry in order to warp and align the images. However, their preprocessing and computation are complicated, every pixel in the overlapping region must be processed, and they are restricted by the strict conditions that the scene be approximately planar and the optical centers nearly coincide, so they are mainly used for stitching simple scenes. Feature-based methods consist of image matching, image alignment and image fusion, and use sparse feature points to estimate the geometric transformation between an image pair. In scenes with little texture, large parallax or wide baselines, however, traditional low-level feature matching is not robust enough and cannot produce uniformly distributed, high-quality matching point pairs, so the stitching result shows artifacts, ghosting or misalignment.
To address these shortcomings of current image stitching methods, the invention provides an image stitching method based on semantic object matching. Following the success of deep learning in computer vision, high-level semantic features can be extracted from deep networks. These features are more robust to differences in appearance and changes in shape; compared with low-level visual features such as depth, gradient, corners and color, they are more invariant and carry high-level semantic information of the image. They therefore improve the robustness of image matching to some extent, but they cannot provide accurate pixel-level matching and localization. Considering the respective strengths of traditional features and semantic features, the method combines the accuracy of low-level visual features with the robustness of high-level semantic features and introduces high-level semantic information into the image stitching framework. This improves the accuracy and reliability of image matching in the stitching step and, in turn, the stitching result. Stitching with semantic information improves large-scene and panoramic results, and the stitched result carries semantic labels, so it is better suited to later image or video analysis and supports information association and the scientific management and use of related images or videos.
Disclosure of Invention
The technical problem to be solved by the invention is how to improve the stitching result within the image stitching framework. To this end, an image stitching method based on semantic object matching is provided that introduces high-level semantic information of the image into the image stitching framework, improving the accuracy and reliability of image matching in the stitching step and thereby the stitching result.
To solve this technical problem, the image stitching method based on semantic object matching adopts the following technical scheme: a semantic matching algorithm is used within the image stitching framework to obtain semantic matching pairs between instance objects. This introduces high-level semantic information of the image, improves the robustness and accuracy of feature matching in image stitching, and further improves the stitching result.
Image segmentation is used to obtain information about the instance objects in each image, a spatial relationship graph is constructed for the instance objects, identical instance objects are found through the spatial relationship graphs, and pixel-level semantic matching pairs are then obtained by passing the matched objects through a semantic matching module pair by pair.
The low-level visual feature matching pairs are screened and optimized according to the segmentation result and the semantic matching pairs to obtain the final matching pairs. The final matching pairs are input into a mesh optimization framework that warps and aligns the images, and the images are then stitched using an image fusion algorithm.
Image segmentation passes the images or video frames to be stitched through a general instance object detection and segmentation framework to obtain the mask, bounding box and label of each instance object in the image.
Constructing the spatial relationship graph means determining the topological and directional relationships among all instance objects from their position information. The topological relationships considered are disjoint and overlapping, and the directional relationships follow an eight-direction cone model.
The semantic matching module uses a pre-trained VGG19 convolutional neural network to measure feature similarity under geometric consistency constraints and obtain pixel-level semantic feature matching pairs. It then uses the image segmentation information to screen and optimize the low-level visual feature matching pairs and obtain the final feature matching pairs.
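The screening of low-level matches with segmentation information could look roughly like the sketch below. The source does not spell out the screening rule, so the rule used here is an assumption: a match is kept when both endpoints lie in the background, or when they lie in instance objects that were already paired with each other. The names `screen_matches`, `instance_at` and `BACKGROUND`, and the label-map representation of the masks, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, int]          # pixel coordinate (x, y)
Match = Tuple[Point, Point]      # (point in left image, point in right image)
BACKGROUND = -1                  # assumed id for pixels outside every mask


def instance_at(label_map: Sequence[Sequence[int]], point: Point) -> int:
    """Return the instance id stored at a pixel of a per-pixel label map."""
    x, y = point
    return label_map[y][x]


def screen_matches(matches: List[Match],
                   label_map_left: Sequence[Sequence[int]],
                   label_map_right: Sequence[Sequence[int]],
                   instance_pairs: List[Tuple[int, int]]) -> List[Match]:
    """Keep matches whose endpoints agree with the instance-level pairing."""
    paired: Dict[int, int] = dict(instance_pairs)
    kept = []
    for left_pt, right_pt in matches:
        left_id = instance_at(label_map_left, left_pt)
        right_id = instance_at(label_map_right, right_pt)
        if left_id == BACKGROUND and right_id == BACKGROUND:
            kept.append((left_pt, right_pt))      # background to background
        elif left_id != BACKGROUND and paired.get(left_id) == right_id:
            kept.append((left_pt, right_pt))      # same matched instance
    return kept


# Tiny 2x2 label maps: instance 0 on the left corresponds to instance 1 on the right.
left_map = [[-1, 0], [-1, 0]]
right_map = [[1, -1], [1, -1]]
candidates = [((1, 0), (0, 0)), ((0, 0), (1, 1)), ((1, 1), (1, 0))]
final = screen_matches(candidates, left_map, right_map, [(0, 1)])
```

In this toy input the third candidate links an object pixel to a background pixel, so it is discarded while the other two are kept.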
The mesh optimization framework that achieves image alignment works as follows: the image is first divided into a uniform initial mesh; an energy function is then minimized iteratively, with the mesh updated at each iteration until the maximum number of iterations is reached; and the image is finally warped and aligned according to the resulting mesh.
The energy function includes an alignment constraint, a scale constraint, a smoothness constraint and a straight-line constraint.
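The source names the four constraints but gives no formula. One common way to combine such terms, shown here only as an assumed form, is a weighted sum over the mesh vertices $V$; the weights $\lambda_a, \lambda_s, \lambda_m, \lambda_l$ and the term names are illustrative symbols that do not appear in the original.

```latex
E(V) = \lambda_{a}\, E_{\text{align}}(V)
     + \lambda_{s}\, E_{\text{scale}}(V)
     + \lambda_{m}\, E_{\text{smooth}}(V)
     + \lambda_{l}\, E_{\text{line}}(V),
\qquad
\hat{V} = \arg\min_{V} E(V)
```

Under this reading, the iterative procedure updates $V$ until the maximum number of iterations is reached and uses the last iterate as the final mesh.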
Image stitching means that, once the aligned images are obtained, the corresponding pixels in the overlapping region of the images to be stitched are merged while the pixels in the non-overlapping regions are retained, so that brightness differences and seams visible to the naked eye are eliminated and the stitching result looks more natural.
Compared with the prior art, the invention has the following advantages: by introducing a semantic matching algorithm into the image stitching framework to obtain semantic matching pairs between instance objects, the method brings high-level semantic information of the image into stitching, which improves the robustness and accuracy of image matching in the stitching step and therefore the stitching result. Stitching with semantic information not only improves large-scene and panoramic stitching results; the stitched result also carries semantic labels and is better suited to subsequent image or video analysis, supporting information association and the scientific management and use of related images or videos.
Drawings
FIG. 1 is a flow chart of an image stitching method based on semantic object matching according to the present invention;
FIGS. 2a and 2b are exemplary diagrams of image segmentation results in an image stitching method based on semantic object matching according to the present invention;
FIGS. 3a and 3b are illustrations of spatial relationship diagrams in an image stitching method based on semantic object matching according to the present invention;
FIG. 4 is a flow chart of semantic matching modules in the image stitching method based on semantic object matching according to the present invention;
FIG. 5 is a flowchart of a mesh optimization framework for image alignment in an image stitching method based on semantic object matching according to the present invention;
FIG. 6 is an exemplary diagram of an image stitching result in the image stitching method based on semantic object matching according to the present invention.
Detailed Description
Other aspects, features and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which form a part of this specification, and which illustrate, by way of example, the principles of the invention.
The invention provides an image stitching method based on semantic object matching. The technical scheme that solves the technical problem is as follows: a semantic matching algorithm is used within the image stitching framework to obtain semantic matching pairs between instance objects, so that high-level semantic information of the image is introduced, the robustness and accuracy of feature matching in image stitching are improved, and the stitching result is further improved.
FIG. 1 shows the flow chart of the image stitching method based on semantic object matching. The method comprises the following steps:
Step 1: Image segmentation. The images or video frames to be stitched are passed through a general object instance detection and segmentation framework (Mask R-CNN) to obtain the mask, bounding box and label of each instance object in the image. Examples of the segmentation result are shown in FIG. 2a for the left image to be stitched and in FIG. 2b for the right image to be stitched. If the image contains instance objects, the following steps are executed in order; if it contains none, Step 3 is executed directly.
FIG. 2a shows the image segmentation result of the left image to be stitched, described in detail as follows. Taking the chair on the right of the figure as an example, the rectangle surrounding it is the bounding box, the word "chair" at the upper left corner of the bounding box is the recognized class of the instance object, the percentage "98%" at the upper left corner is the probability that the instance object belongs to that class, and the gray area covering the instance object inside the bounding box is its mask. The detection and segmentation framework can recognize 80 common instance object classes, such as chair, book, tv, display screen, mouse, keyboard and bottle. The annotations in FIG. 2b have the same meaning and are not described again. The bounding box, class label and mask obtained through image segmentation are then used to construct a spatial relationship graph for the instance objects in the image.
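The segmentation output described above (class label, probability, bounding box and mask per instance object) can be held in a small record type. The sketch below is only an illustration of that data and of the branch between Step 2 and Step 3; the class `Instance`, the helpers `keep_confident` and `next_step`, and the 0.5 score threshold are assumptions, not part of the source or of any Mask R-CNN API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instance:
    """One detected instance object (field names are illustrative)."""
    label: str                               # class label, e.g. "chair"
    score: float                             # class probability, e.g. 0.98
    box: Tuple[float, float, float, float]   # bounding box (x1, y1, x2, y2)
    mask: List[List[int]]                    # binary mask, 1 = object pixel


def keep_confident(instances: List[Instance], min_score: float = 0.5) -> List[Instance]:
    """Drop low-confidence detections; the 0.5 threshold is an assumption."""
    return [inst for inst in instances if inst.score >= min_score]


def next_step(instances: List[Instance]) -> int:
    """Return 2 if instance objects exist, otherwise skip straight to Step 3."""
    return 2 if instances else 3


chair = Instance("chair", 0.98, (120.0, 80.0, 200.0, 220.0), [[1]])
faint = Instance("bottle", 0.20, (10.0, 10.0, 20.0, 30.0), [[1]])
detections = keep_confident([chair, faint])
```

The one-pixel masks are placeholders; a real mask would have the same height and width as the image.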
Step 2: Constructing the spatial relationship graph. The topological and directional relationships among all instance objects are determined from their position information. The topological relationships are disjoint (dt) and overlapping (ov), and the directional relationships follow an eight-direction cone model.
A spatial relationship graph is constructed for each image according to Algorithm 1. Examples of the constructed graphs are shown in FIG. 3a for the left image to be stitched and in FIG. 3b for the right image to be stitched.
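Algorithm 1 itself is not available in this text, so the following is only a sketch of how one edge of such a graph might be labelled. It assumes that direction is measured between bounding-box centroids, with the plane around the start object split into eight 45-degree cones, and that the topological relationship is decided by bounding-box intersection; both choices, and all function names, are assumptions.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), y grows downward

# Eight cones of 45 degrees each, listed counter-clockwise starting from east.
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def centroid(box: Box) -> Tuple[float, float]:
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def direction(start: Box, end: Box) -> str:
    """Cone that contains the end object as seen from the start object."""
    sx, sy = centroid(start)
    ex, ey = centroid(end)
    # Flip the y difference so that "north" means "towards the top of the image".
    angle = math.degrees(math.atan2(sy - ey, ex - sx)) % 360.0
    return DIRECTIONS[int((angle + 22.5) // 45.0) % 8]


def topology(a: Box, b: Box) -> str:
    """'ov' if the boxes overlap with positive area, otherwise 'dt'."""
    overlap_w = min(a[2], b[2]) - max(a[0], b[0])
    overlap_h = min(a[3], b[3]) - max(a[1], b[1])
    return "ov" if overlap_w > 0 and overlap_h > 0 else "dt"


def relation(start: Box, end: Box) -> Tuple[str, str]:
    """Edge annotation from start to end, e.g. ('NE', 'dt')."""
    return (direction(start, end), topology(start, end))


chair_box = (0.0, 100.0, 50.0, 150.0)
computer_box = (200.0, 0.0, 250.0, 50.0)
edge = relation(chair_box, computer_box)
```

A graph for one image would then hold one such annotation for every ordered pair of instance objects. Coincident centroids fall into the "E" cone in this sketch, which a fuller implementation would probably treat as a special case.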
FIG. 3a shows the spatial relationship graph of the left image to be stitched, described in detail as follows. Take the arrow that starts at the leftmost chair and points to the leftmost computer as an example. The arrow is annotated "NE, dt": "NE" indicates that the end point lies to the northeast of the start point (N: north, E: east), and "dt" indicates that the topological relationship between the end-point instance object and the start-point instance object is disjoint. The spatial relationship annotations of the other arrows, each describing the end point relative to the start point, are explained in Table 1 below. The annotations in FIG. 3b have the same meaning and are not described again. By traversing the spatial relationship graphs of the image pair to be stitched, the method checks whether the class labels, topological relationships and directional relationships of two instance objects are consistent, and thereby finds the one-to-one matching pairs of identical instance objects in the image pair.
Table 1: Meanings of the spatial relationship annotations
[Table content is available only as images in the original publication: Figure BDA0002916961260000051, Figure BDA0002916961260000061, Figure BDA0002916961260000062]
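The graph traversal that pairs identical instance objects is described only in words, so the sketch below is one possible reading rather than the patented procedure. It assumes each graph is a dictionary from (start, end) index pairs to (direction, topology) annotations, scores a candidate pair of same-label objects by how many outgoing edges agree in neighbour label, direction and topology, and then picks pairs greedily so that the result is one-to-one. The greedy selection and all names are assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple

Relation = Tuple[str, str]                 # (direction, topology), e.g. ("NE", "dt")
Graph = Dict[Tuple[int, int], Relation]    # (start index, end index) -> relation


def signature(node: int, labels: List[str], graph: Graph) -> Counter:
    """Outgoing edges of a node as (neighbour label, direction, topology) counts."""
    return Counter((labels[j], d, t) for (i, j), (d, t) in graph.items() if i == node)


def match_instances(labels_l: List[str], graph_l: Graph,
                    labels_r: List[str], graph_r: Graph) -> List[Tuple[int, int]]:
    """Greedy one-to-one pairing of same-label objects with consistent relations."""
    candidates = []
    for a, label_a in enumerate(labels_l):
        sig_a = signature(a, labels_l, graph_l)
        for b, label_b in enumerate(labels_r):
            if label_a != label_b:
                continue                   # class labels must agree
            shared = sum((sig_a & signature(b, labels_r, graph_r)).values())
            candidates.append((shared, a, b))
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))   # most consistent first
    used_l, used_r, pairs = set(), set(), []
    for _, a, b in candidates:
        if a not in used_l and b not in used_r:
            pairs.append((a, b))
            used_l.add(a)
            used_r.add(b)
    return pairs


# Left image: chair(0), tv(1), chair(2). Right image: tv(0), chair(1), chair(2).
left_labels = ["chair", "tv", "chair"]
left_graph = {(0, 1): ("NE", "dt"), (2, 1): ("NW", "dt"), (1, 0): ("SW", "dt"),
              (1, 2): ("SE", "dt"), (0, 2): ("E", "dt"), (2, 0): ("W", "dt")}
right_labels = ["tv", "chair", "chair"]
right_graph = {(1, 0): ("NE", "dt"), (2, 0): ("NW", "dt"), (0, 1): ("SW", "dt"),
               (0, 2): ("SE", "dt"), (1, 2): ("E", "dt"), (2, 1): ("W", "dt")}
pairs = match_instances(left_labels, left_graph, right_labels, right_graph)
```

The two chairs share a class label, so only their relations to the tv and to each other tell them apart, which is the role the spatial relationship graph plays in the description above.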
Step 3: Image feature matching. If the image contains instance objects, the spatial relationship graphs of the images to be stitched are traversed to find the identical instance objects, that is, to match the same instance objects across the images, and the matched instance objects are then input into the semantic matching module pair by pair. If the image contains no instance objects, the whole image is input into the semantic matching module directly. FIG. 4 shows the flow chart of the semantic matching module: a pre-trained VGG19 convolutional neural network is first used to measure feature similarity under geometric consistency constraints and obtain pixel-level semantic feature matching pairs, and the low-level visual feature matching pairs are then screened and optimized using the image segmentation information to obtain the final feature matching pairs.
The geometric consistency constraints comprise an appearance constraint, a relative direction constraint and a relative distance constraint. The appearance constraint measures the degree of similarity between feature points and is computed with cosine similarity. The relative direction constraint states that the relative directions between the salient features contained in two semantic objects to be matched, which belong to the same category, are almost consistent. For example, the direction between the mouth and the nose is vertical for different people, and the direction between the centers of the front lamps and the center of the front lens is essentially fixed for different automobiles. The relative distance constraint states that the relative distances in two semantic objects to be matched, which belong to the same category, are almost consistent.
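The three constraints can be illustrated with plain vectors. In the sketch below the appearance constraint is a cosine similarity between feature vectors (which would come from VGG19 in the source, but are just lists here), and the direction and distance constraints compare the vector between two matched points in one image with the corresponding vector in the other. The mutual-best-match rule, the thresholds (0.8, 20 degrees, ratio 1.5) and all function names are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Appearance constraint: cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mutual_best_matches(feats_a: List[Sequence[float]], feats_b: List[Sequence[float]],
                        min_sim: float = 0.8) -> List[Tuple[int, int]]:
    """Keep (i, j) when i and j are each other's most similar feature."""
    if not feats_a or not feats_b:
        return []
    sims = [[cosine_similarity(fa, fb) for fb in feats_b] for fa in feats_a]
    best_for_a = [max(range(len(feats_b)), key=lambda j: sims[i][j])
                  for i in range(len(feats_a))]
    best_for_b = [max(range(len(feats_a)), key=lambda i: sims[i][j])
                  for j in range(len(feats_b))]
    return [(i, j) for i, j in enumerate(best_for_a)
            if best_for_b[j] == i and sims[i][j] >= min_sim]


def geometrically_consistent(p1: Point, p2: Point, q1: Point, q2: Point,
                             max_angle_deg: float = 20.0,
                             max_length_ratio: float = 1.5) -> bool:
    """Check relative direction and distance of matches p1->q1 and p2->q2."""
    va = (p2[0] - p1[0], p2[1] - p1[1])
    vb = (q2[0] - q1[0], q2[1] - q1[1])
    len_a, len_b = math.hypot(*va), math.hypot(*vb)
    if len_a == 0.0 or len_b == 0.0:
        return False
    cos_angle = (va[0] * vb[0] + va[1] * vb[1]) / (len_a * len_b)
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    ratio = max(len_a, len_b) / min(len_a, len_b)
    return angle <= max_angle_deg and ratio <= max_length_ratio


features_left = [[1.0, 0.0], [0.0, 1.0]]
features_right = [[0.0, 1.0], [1.0, 0.1]]
matches = mutual_best_matches(features_left, features_right)
```

A candidate set produced by `mutual_best_matches` could then be pruned by calling `geometrically_consistent` on pairs of matches, for instance a nose match and a mouth match on two faces.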
Step 4: Image alignment. FIG. 5 shows the flow chart of the mesh optimization framework for image alignment. The image is first divided into a uniform initial mesh. An energy function comprising an alignment constraint, a scale constraint, a smoothness constraint and a straight-line constraint is then designed and minimized iteratively, with the mesh updated at each iteration until the maximum number of iterations is reached to obtain the final mesh. The image is then warped and aligned according to the final mesh.
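The loop structure of this step (uniform initial grid, iterative energy minimization, fixed iteration budget) can be shown with a deliberately reduced energy. The sketch below keeps only an alignment term and a smoothness term and omits the scale and straight-line terms; it also attaches each match to its nearest grid vertex instead of interpolating within a cell. These simplifications, the Gauss-Seidel style update, and the names `uniform_grid` and `optimize_mesh` are assumptions, not the source's formulation.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Grid = List[List[Point]]


def uniform_grid(width: float, height: float, cols: int, rows: int) -> Grid:
    """Vertices of a uniform mesh with cols x rows cells."""
    return [[(c * width / cols, r * height / rows) for c in range(cols + 1)]
            for r in range(rows + 1)]


def optimize_mesh(grid: Grid, matches: List[Tuple[Point, Point]],
                  smooth_weight: float = 1.0, max_iters: int = 200) -> Grid:
    """Minimize sum |d_v - t|^2 (alignment) + w * sum |d_u - d_v|^2 (smoothness)."""
    n_rows, n_cols = len(grid), len(grid[0])
    # Alignment targets: displacement each match asks of its nearest vertex.
    targets: Dict[Tuple[int, int], List[Point]] = {}
    for (px, py), (qx, qy) in matches:
        nearest = min(((r, c) for r in range(n_rows) for c in range(n_cols)),
                      key=lambda rc: (grid[rc[0]][rc[1]][0] - px) ** 2
                      + (grid[rc[0]][rc[1]][1] - py) ** 2)
        targets.setdefault(nearest, []).append((qx - px, qy - py))
    disp = [[(0.0, 0.0)] * n_cols for _ in range(n_rows)]
    for _ in range(max_iters):                      # stop at the iteration budget
        for r in range(n_rows):
            for c in range(n_cols):
                nbrs = [disp[rr][cc]
                        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < n_rows and 0 <= cc < n_cols]
                t = targets.get((r, c), [])
                denom = len(t) + smooth_weight * len(nbrs)
                if denom == 0:
                    continue
                # Closed-form minimizer of the energy with all other vertices fixed.
                dx = (sum(d[0] for d in t) + smooth_weight * sum(d[0] for d in nbrs)) / denom
                dy = (sum(d[1] for d in t) + smooth_weight * sum(d[1] for d in nbrs)) / denom
                disp[r][c] = (dx, dy)
    return [[(grid[r][c][0] + disp[r][c][0], grid[r][c][1] + disp[r][c][1])
             for c in range(n_cols)] for r in range(n_rows)]


initial = uniform_grid(100.0, 100.0, 2, 2)
shift_matches = [((0.0, 0.0), (10.0, 0.0)), ((100.0, 100.0), (110.0, 100.0))]
final = optimize_mesh(initial, shift_matches, max_iters=500)
```

With two matches that both ask for a 10-pixel shift to the right, the smoothness term spreads that shift over the whole mesh, so every vertex ends up moved by roughly (10, 0).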
Step 5: Image fusion. After the aligned images are obtained, the corresponding pixels in the overlapping regions of the images to be stitched are merged and the pixels in the non-overlapping regions are retained, eliminating brightness differences and seams visible to the naked eye so that the stitching result looks more natural. Common image fusion algorithms include linear fusion, feathering fusion and multi-band fusion.
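Of the fusion algorithms listed, linear fusion is the simplest to show. The sketch below blends a single row of gray values from two already aligned images placed on a common canvas: outside the overlap the valid image is copied, and inside it the weight of the right image ramps up linearly. The function name, the one-dimensional setting and the exact ramp are assumptions made to keep the example short.

```python
from typing import List


def linear_blend_row(left: List[float], right: List[float],
                     overlap_start: int, overlap_end: int) -> List[float]:
    """Blend two aligned rows on one canvas.

    The left row is assumed valid on [0, overlap_end) and the right row on
    [overlap_start, len(right)); both rows have the canvas length.
    """
    width = overlap_end - overlap_start
    out = []
    for x in range(len(left)):
        if x < overlap_start:
            out.append(left[x])                 # only the left image covers x
        elif x >= overlap_end:
            out.append(right[x])                # only the right image covers x
        else:
            # Weight of the right image grows linearly across the overlap.
            w_right = (x - overlap_start + 1) / (width + 1)
            out.append((1.0 - w_right) * left[x] + w_right * right[x])
    return out


left_row = [100.0] * 6 + [0.0] * 2     # valid on columns 0..5
right_row = [0.0] * 2 + [200.0] * 6    # valid on columns 2..7
blended = linear_blend_row(left_row, right_row, overlap_start=2, overlap_end=6)
```

The blended row moves in even steps from the left image's value to the right image's value across the four overlapping columns, which is what removes a visible seam when the two exposures differ.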
Image stitching with semantic information not only produces large-scene or panoramic stitching results; the stitched result also carries semantic labels and is better suited to subsequent image or video analysis, supporting information association and the scientific management and use of related images or videos.
Parts of the invention not described in detail are well known to those skilled in the art.
Finally, it should be noted that the embodiments described above are only preferred embodiments of the present invention, and the present invention is not limited to them. Products in various other forms may be derived by anyone in light of the present invention, but any change in form or structure that results in a technical solution identical or similar to that of the present invention falls within its protection scope. The protection scope of the present invention is therefore defined by the claims.

Claims (9)

1. An image stitching method based on semantic object matching, characterized in that: a semantic matching algorithm is used within an image stitching framework to obtain semantic matching pairs between instance objects, so that high-level semantic information of the image is introduced, the robustness and accuracy of feature matching in image stitching are improved, and the image stitching result is further improved.
2. The image stitching method based on semantic object matching according to claim 1, characterized in that: image segmentation is used to obtain information about the instance objects in the image, a spatial relationship graph is constructed for the instance objects, identical instance objects are obtained through the spatial relationship graph, and pixel-level semantic matching pairs are obtained through a semantic matching module pair by pair.
3. The image stitching method based on semantic object matching according to claim 2, characterized in that: the low-level visual feature matching pairs are screened and optimized according to the segmentation result and the semantic matching pairs to obtain the final matching pairs, the final matching pairs are input into a mesh optimization framework to warp and align the images, and image stitching is then achieved using an image fusion algorithm.
4. The image stitching method based on semantic object matching according to claim 2, characterized in that: the image segmentation passes the images or video frames to be stitched through a general instance object detection and segmentation framework to obtain the mask, bounding box and label of each instance object in the image.
5. The image stitching method based on semantic object matching according to claim 2, characterized in that: constructing the spatial relationship graph means determining the topological and directional relationships among all instance objects from their position information, wherein the topological relationships are disjoint and overlapping, and the directional relationships follow an eight-direction cone model.
6. The image stitching method based on semantic object matching according to claim 2, characterized in that: the semantic matching module uses a pre-trained VGG19 convolutional neural network to measure feature similarity under geometric consistency constraints and obtain pixel-level semantic feature matching pairs, and then screens and optimizes the low-level visual feature matching pairs using the image segmentation information to obtain the final feature matching pairs.
7. The image stitching method based on semantic object matching according to claim 3, characterized in that: the mesh optimization framework that achieves image alignment comprises: first dividing the image into a uniform initial mesh, iteratively minimizing an energy function, updating the mesh at each iteration until the maximum number of iterations is reached to obtain the final mesh, and warping and aligning the image according to the final mesh.
8. The image stitching method based on semantic object matching according to claim 7, characterized in that: the energy function includes an alignment constraint, a scale constraint, a smoothness constraint and a straight-line constraint.
9. The image stitching method based on semantic object matching according to claim 1, characterized in that: the image stitching merges the corresponding pixels in the overlapping region of the images to be stitched after the aligned images are obtained and retains the pixels in the non-overlapping regions, so that brightness differences and seams visible to the naked eye are eliminated and the image stitching result is more natural.
CN202110104851.0A 2021-01-26 2021-01-26 Image splicing method based on semantic object matching Pending CN112950466A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110104851.0A CN112950466A (en) 2021-01-26 2021-01-26 Image splicing method based on semantic object matching

Publications (1)

Publication Number Publication Date
CN112950466A (en) 2021-06-11

Family

ID=76237132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110104851.0A Pending CN112950466A (en) 2021-01-26 2021-01-26 Image splicing method based on semantic object matching

Country Status (1)

Country Link
CN (1) CN112950466A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116128734A (en) * 2023-04-17 2023-05-16 湖南大学 Image stitching method, device, equipment and medium based on deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101763439A (en) * 2010-03-05 2010-06-30 中国科学院软件研究所 Hypervideo construction method based on rough drawings
US20180211393A1 (en) * 2017-01-24 2018-07-26 Beihang University Image guided video semantic object segmentation method and apparatus
CN110660023A (en) * 2019-09-12 2020-01-07 中国测绘科学研究院 Video stitching method based on image semantic segmentation
CN110909778A (en) * 2019-11-12 2020-03-24 北京航空航天大学 Image semantic feature matching method based on geometric consistency

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨毅 等: "融合图像语义的动态视频拼接方法", 《系统工程与电子技术第40卷 第12期》 *

Similar Documents

Publication Publication Date Title
Seibel et al. Eyes on the target: Super-resolution and license-plate recognition in low-quality surveillance videos
US10353271B2 (en) Depth estimation method for monocular image based on multi-scale CNN and continuous CRF
US20080232715A1 (en) Image processing apparatus
CN110956681B (en) Portrait background automatic replacement method combining convolution network and neighborhood similarity
CN110838086B (en) Outdoor image splicing method based on correlation template matching
CN111553845B (en) Quick image stitching method based on optimized three-dimensional reconstruction
CN111768415A (en) Image instance segmentation method without quantization pooling
CN103841298A (en) Video image stabilization method based on color constant and geometry invariant features
CN113159043A (en) Feature point matching method and system based on semantic information
Chen et al. Autosweep: Recovering 3d editable objects from a single photograph
CN114120163A (en) Video frame processing method and device, and related equipment and storage medium thereof
CN112734914A (en) Image stereo reconstruction method and device for augmented reality vision
CN114255197A (en) Infrared and visible light image self-adaptive fusion alignment method and system
CN115115522A (en) Goods shelf commodity image splicing method and system
CN112365518A (en) Image splicing method based on optimal suture line self-selection area gradual-in and gradual-out algorithm
Tsutsui et al. Distantly supervised road segmentation
Zhang et al. Image stitching based on human visual system and SIFT algorithm
CN112950466A (en) Image splicing method based on semantic object matching
Rui et al. Research on fast natural aerial image mosaic
Huang et al. Crack detection of masonry structure based on thermal and visible image fusion and semantic segmentation
CN112686247A (en) Identification card number detection method and device, readable storage medium and terminal
CN111105350A (en) Real-time video splicing method based on self homography transformation under large parallax scene
CN116596815A (en) Image stitching method based on multi-stage alignment network
CN116843893A (en) Three-dimensional image segmentation method and system based on attention mechanism multi-scale convolutional neural network
CN116132610A (en) Fully-mechanized mining face video stitching method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210611