CN116152073B - Improved multi-scale fundus image stitching method based on Loftr algorithm - Google Patents

Improved multi-scale fundus image stitching method based on Loftr algorithm

Info

Publication number
CN116152073B
CN116152073B (application CN202310350202.8A)
Authority
CN
China
Prior art keywords
image
algorithm
loftr
scale
fundus image
Prior art date
Legal status
Active
Application number
CN202310350202.8A
Other languages
Chinese (zh)
Other versions
CN116152073A (en)
Inventor
Zhao Zhendong (赵振栋)
Jiang Chong (姜冲)
Liu Chunyan (刘春燕)
Tang Xu (唐旭)
Current Assignee
Jiangsu Fuhan Medical Industry Development Co., Ltd.
Original Assignee
Jiangsu Fuhan Medical Industry Development Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Jiangsu Fuhan Medical Industry Development Co., Ltd.
Priority to CN202310350202.8A
Publication of CN116152073A
Application granted
Publication of CN116152073B
Status: Active


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4038Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757Matching configurations of points or features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30041Eye; Retina; Ophthalmic
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Abstract

The application provides an improved multi-scale fundus image stitching method based on the Loftr algorithm, belonging to the technical fields of computer vision and medical image processing. First, a rectangular blackout is applied to the fundus image so that the imaged eyeball is inscribed within the outer frame of the image; the image is then downsampled twice at different scales and the center point of the feature extraction region is located. A ratio parameter selects the region used for fundus image feature extraction; this region serves as the input of the Loftr algorithm, which completes feature point extraction and matching and forms connected components. Homography transformations estimated with the MAGSAC++ robust estimation method then transform every image into the same coordinate system, and finally the quasi-normal weighted fusion algorithm combined with the connected components completes the multi-image fusion, and thus the stitching. The method performs efficient feature extraction on low-texture fundus images and achieves accurate, fast fundus image stitching.

Description

Improved multi-scale fundus image stitching method based on Loftr algorithm
Technical Field
The application belongs to the technical field of medical image processing, and particularly relates to an improved multi-scale fundus image stitching method based on the Loftr algorithm.
Background
At present, fundus images are generally acquired with fundus cameras, and because of the cameras' limited field of view each acquired image covers only part of the fundus. In clinical diagnosis and treatment an ophthalmologist can therefore only inspect the images by eye and align them manually, which is inefficient and offers no guarantee of accuracy. There are two ways to solve this problem. One is to enlarge the imaging field of view of the device, but this usually requires considerably more expensive equipment and is impractical for most hospitals. The other is to stitch multiple fundus images so that the patient's entire fundus is presented in a single map, meeting the needs of clinical diagnosis and treatment.
Neonatal fundus images are typically acquired with special fundus cameras. Existing image stitching techniques are limited by the low texture of neonatal fundus images: feature points cannot be detected accurately, i.e. neither the number nor the accuracy of the detected feature points is sufficient for registering and stitching such low-texture images, so stitching failures and unstable results occur frequently. The existing fundus image stitching techniques thus have two main defects: first, the feature points are too few to register, or the matching parameters are solved incorrectly because of too many mismatched points; second, the stitching is slow and the stitching result cannot be verified.
Disclosure of Invention
To solve these problems, the application discloses an improved multi-scale fundus image stitching method based on the Loftr algorithm which, addressing the defects of existing stitching techniques, performs efficient feature extraction on low-texture fundus images and achieves accurate, fast fundus image stitching.
In order to achieve the above purpose, the technical scheme of the application is as follows:
An improved multi-scale fundus image stitching method based on the Loftr algorithm comprises the following steps:
S1, apply a rectangular blackout to the fundus image so that the imaged eyeball is inscribed within the outer frame of the image;
S2, downsample the image obtained in step S1 twice, scaling its width and height proportionally: the scale blend_scale produces graph a and the scale feature_scale produces graph b;
S3, obtain from graph b the position of the maximum inscribed rectangle of the eyeball, given by its upper-left corner coordinates (x, y), width w and height h;
S4, derive the center coordinates of the optic disc from graph b and take them as the center point of the feature extraction region; if no optic disc is detected in the image, use the coordinates (x + 0.5·w, y + 0.5·h) computed from the w and h obtained in step S3;
S5, adjust the parameter ratio and, around the center point obtained in step S4, select the region used for fundus image feature extraction, with width ratio·w and height ratio·h;
S6, take the fundus feature extraction region obtained in step S5 as the input of the Loftr algorithm; complete feature point extraction and matching with the Loftr algorithm to obtain matching pairs key_b on the multi-scale region of the fundus images, and restore the matching-pair coordinates to the fusion reference graph a according to the downsampling and multi-scale sizes, obtaining new matching pairs key_a on the reference graph;
S7, use the key_a information to verify correct and connectable matching pairs and form connected components, which contain the position information for composing each fundus panorama;
S8, estimate homographies with the MAGSAC++ robust estimation method: using the matching pairs, apply a homography transformation to every image in each group of connected components so that all images are transformed into the same coordinate system, the homography matrix H satisfying formula (1):
[p′, q′, 1]^T = H · [p, q, 1]^T (in homogeneous coordinates, up to a scale factor)   (1)
where (p′, q′) and (p, q) are paired points of the image to be registered and the reference image, and H is a 3×3 parameter matrix;
S9, using the homography matrices computed in S8 and the paired points of each group of images, complete the homography transformations in sequence;
S10, fuse the images homography-transformed in S9 by combining the quasi-normal weighted fusion algorithm with the connected components, completing the multi-image fusion and thus the stitching.
Further, the specific method of step S3 is: first convert graph b to grayscale; search for a pixel threshold with the classical OpenCV image thresholding function threshold; apply morphological operations to find the contour of the eyeball; then use the boundingRect function to obtain the minimal bounding rectangle, i.e. the position of the maximum inscribed rectangle of the eyeball, given by the upper-left corner coordinates (x, y), width w and height h.
Further, in step S4, graph b is processed by a deep-learning object detector (YOLOv5) that locates the optic disc and derives its center coordinates.
Further, in step S5, ratio takes values between 0 and 1.
Further, in step S6, the feature extraction part of the Loftr algorithm adopts a feature pyramid network. Feature point matching in the Loftr algorithm uses an attention mechanism that assigns a weight to each item of the model input so that similar feature points in the two images are highlighted. For an image pair {A, B}, the feature points of the query image A are represented as query vectors Q_i, and the feature points of image B as key vectors K_j and value vectors V_j; the similarity between the query vectors Q_i and the key vectors K_j is computed, and feature aggregation is realized by a weighted sum of the values V_j. The attention mechanism is computed as
Attention(Q, K, V) = softmax(Q·K^T)·V
where the superscript T denotes the matrix transpose;
the local Transformer module of the Loftr algorithm is based on the Transformer, wherein the Transformer encoder is a stack of sequentially connected encoder layers whose core is the multi-head attention mechanism, composed of several alternating self-attention and cross-attention layers, so that the matching converges tightly.
The beneficial effects of the application are as follows:
according to the technical scheme, the improved multi-scale fundus image splicing method based on the Loftr algorithm is provided, multi-scale feature extraction is carried out on a multi-scale area with the optic disc as the center according to the optic disc position of the neonatal fundus, feature descriptors of any two images are obtained by adopting the Loftr algorithm added with dynamic position reduction, low-texture dense matching is achieved, correct matching pairs and connecting components are generated, homography matrix transformation is carried out on the graph a based on the matching pairs, and fusion splicing of a plurality of neonatal fundus images is achieved by combining a quasi-normal weighted fusion algorithm and the connecting components. The method can solve the problems of low speed and poor splicing effect of the traditional algorithm.
Drawings
FIG. 1 is the flow chart of the application;
FIG. 2 shows an original image after the rectangular blackout;
FIG. 3 shows the deep-learning-based optic disc extraction;
FIG. 4 shows the feature extraction region used by the Loftr algorithm;
FIG. 5 shows a fundus mosaic produced by the method of the application.
Detailed Description
The present application is further illustrated in the following drawings and detailed description, which are to be understood as being merely illustrative of the application and not limiting the scope of the application.
As shown in FIG. 1, the improved multi-scale fundus image stitching method based on the Loftr algorithm comprises the following steps:
s1, performing rectangular blacking treatment on the fundus image, enabling the eyeball imaging to be internally cut into an outer frame of the image, and reducing influence of an invalid background part on feature extraction, as shown in fig. 2.
S2, downsample the image obtained in step S1 twice, scaling its width and height proportionally: the scale blend_scale produces graph a and the scale feature_scale produces graph b. Graph b is used for feature extraction and graph a serves as the fusion reference graph; the feature points and matches detected in graph b are applied to the fusion of the downsampled images of graph a. Keeping the two scale parameters separate makes it easy to tune them later according to runtime and imaging-sharpness requirements.
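As a minimal sketch of the two downsamplings in step S2 (assuming OpenCV, with illustrative scale values that the patent does not fix):

import cv2

blend_scale = 0.5     # assumed scale of graph a, the fusion reference
feature_scale = 0.25  # assumed scale of graph b, used for feature extraction

eye = cv2.imread("fundus_blackout.png")   # image after the S1 rectangular blackout
a = cv2.resize(eye, None, fx=blend_scale, fy=blend_scale, interpolation=cv2.INTER_AREA)
b = cv2.resize(eye, None, fx=feature_scale, fy=feature_scale, interpolation=cv2.INTER_AREA)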
S3, obtain from graph b the position of the maximum inscribed rectangle of the eyeball, given by the upper-left corner coordinates (x, y), width w and height h. The downsampled graph b is processed with classical OpenCV methods: it is converted to grayscale, a pixel threshold is searched with the OpenCV image thresholding function threshold, the contour of the eyeball is found with morphological operations, and the boundingRect function yields the minimal bounding rectangle, i.e. the position of the maximum inscribed rectangle of the eyeball.
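A hedged OpenCV sketch of this step; the Otsu thresholding and the kernel size are illustrative assumptions, since the patent only states that a pixel threshold is searched and morphological operations are applied:

import cv2

def maximum_internal_rectangle(b_bgr):
    # Convert graph b to grayscale and binarize; Otsu searches the pixel threshold automatically.
    gray = cv2.cvtColor(b_bgr, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Morphological closing consolidates the eyeball region before contour extraction.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Take the largest external contour as the eyeball and return its bounding rectangle.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    eyeball = max(contours, key=cv2.contourArea)
    return cv2.boundingRect(eyeball)   # (x, y, w, h)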
S4, derive the center coordinates of the optic disc from graph b and take them as the center point of the feature extraction region; if no optic disc is detected in the image, use the coordinates (x + 0.5·w, y + 0.5·h) instead. The downsampled graph b is passed to a deep-learning object detector (YOLOv5) that locates the optic disc, with the extraction result shown in FIG. 3, and the center coordinates of the optic disc are derived.
S5, adjust ratio and, around the center point obtained in step S4, select the feature extraction region of width ratio·w and height ratio·h. Ratio controls the size of the feature extraction region and takes values between 0 and 1. If it is too small, the region is small and the extracted feature information is insufficient; if it is too large, the region contains many invalid areas that interfere with the feature information and also reduce the efficiency of the algorithm.
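Steps S4 and S5 might then be combined as below; disc_center is assumed to come from a trained YOLOv5 detector (not provided here), and the clipping to the image bounds is an implementation detail the patent leaves open:

def feature_extraction_region(b, rect, ratio, disc_center=None):
    x, y, w, h = rect
    # S4: use the optic disc center when detected, otherwise the rectangle center.
    cx, cy = disc_center if disc_center is not None else (x + 0.5 * w, y + 0.5 * h)
    # S5: region of width ratio*w and height ratio*h around the center point.
    rw, rh = ratio * w, ratio * h
    x0, y0 = int(max(cx - rw / 2, 0)), int(max(cy - rh / 2, 0))
    x1, y1 = int(min(cx + rw / 2, b.shape[1])), int(min(cy + rh / 2, b.shape[0]))
    return b[y0:y1, x0:x1], (x0, y0)   # the offset is needed later to restore coordinates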
The steps above are collectively referred to as the multi-scale feature extraction area algorithm (Multi-scale feature extraction area). In pseudocode:
a = blend_scale(eye)                                      # fusion reference graph
b = feature_scale(eye)                                    # feature extraction graph
x, y, w, h = maximum_internal_rectangle(b)                # maximum inscribed rectangle of the eyeball
point = optic_disc_object_detection(b)                    # optic disc center (step S4)
output = multi_scale_feature_extraction_area(point, ratio * w, ratio * h)
S6, take the result of step S5 as the input of the Loftr algorithm; complete feature extraction and matching with the Loftr algorithm to obtain matching pairs key_b on the multi-scale region of the fundus images, and restore the matching-pair coordinates to the fusion reference graph a according to the downsampling and multi-scale sizes, obtaining new matching pairs key_a on the reference graph; the extraction result is shown in FIG. 4.
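A sketch of step S6 using the pretrained LoFTR matcher from the kornia library; the coordinate restoration (add the crop offset, then rescale by blend_scale/feature_scale) is an inference from the text rather than code from the patent:

import torch
import kornia.feature as KF

def loftr_match(crop0, crop1, off0, off1, feature_scale, blend_scale):
    matcher = KF.LoFTR(pretrained="outdoor").eval()
    to_t = lambda g: torch.from_numpy(g)[None, None].float() / 255.0   # grayscale -> 1x1xHxW
    with torch.no_grad():
        out = matcher({"image0": to_t(crop0), "image1": to_t(crop1)})
    key_b0 = out["keypoints0"].numpy()   # matching pairs on the graph-b crops
    key_b1 = out["keypoints1"].numpy()
    # Restore to the fusion reference graph a: undo the crop, rescale between the two sizes.
    s = blend_scale / feature_scale
    key_a0 = (key_b0 + off0) * s
    key_a1 = (key_b1 + off1) * s
    return key_a0, key_a1, out["confidence"].numpy()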
S7, use the key_a information to verify correct and connectable matching pairs and form connected components, which contain the position information for composing each fundus image;
S8, estimate homographies with the MAGSAC++ robust estimation method: using the matching pairs, apply a homography transformation to every image in each group of connected components so that all images are transformed into the same coordinate system, the homography matrix H satisfying formula (1):
[p′, q′, 1]^T = H · [p, q, 1]^T (in homogeneous coordinates, up to a scale factor)   (1)
where (p′, q′) and (p, q) are paired points of the image to be registered and the reference image, and H is a 3×3 parameter matrix; in this embodiment the homography matrix is 3×3, and the images to be stitched use the spatial projection relationship represented by H.
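OpenCV (version 4.5 and later) exposes MAGSAC++ through the USAC_MAGSAC flag of findHomography; the reprojection threshold and iteration budget below are illustrative assumptions:

import cv2
import numpy as np

def estimate_homography(key_a_src, key_a_dst):
    # H is the 3x3 matrix of formula (1); inlier_mask flags the matching pairs MAGSAC++ kept.
    H, inlier_mask = cv2.findHomography(
        np.asarray(key_a_src, dtype=np.float64),
        np.asarray(key_a_dst, dtype=np.float64),
        cv2.USAC_MAGSAC,
        3.0,                     # reprojection threshold in pixels (assumed)
        maxIters=10000,
        confidence=0.999,
    )
    return H, inlier_mask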
S9, using the homography transformation matrices and the paired points of each group of images, complete the homography transformations in sequence;
S10, fuse the homography-transformed images by combining the quasi-normal weighted fusion algorithm with the connected components, completing the multi-image fusion and hence the stitching; the stitching result is shown in FIG. 5.
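The patent does not spell out the quasi-normal weighted fusion; a weighted fusion in the same spirit (distance-transform feathering, an assumption on my part) can be sketched as:

import cv2
import numpy as np

def weighted_fuse(warped_images, valid_masks):
    acc = np.zeros(warped_images[0].shape, dtype=np.float64)
    wsum = np.zeros(warped_images[0].shape[:2], dtype=np.float64)
    for img, mask in zip(warped_images, valid_masks):
        # Pixels deep inside an image get high weight; pixels near its border get low weight.
        w = cv2.distanceTransform((mask > 0).astype(np.uint8), cv2.DIST_L2, 3)
        acc += img.astype(np.float64) * w[..., None]
        wsum += w
    wsum[wsum == 0] = 1.0   # avoid division by zero outside every warped image
    return (acc / wsum[..., None]).astype(np.uint8)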
In step S6, the feature extraction part of the Loftr algorithm adopts a feature pyramid network;
the image registration stage of the Loftr algorithm uses an attention mechanism that assigns a weight to each item of the model input so that similar feature points in the two images are highlighted. For an image pair {A, B}, the feature points of the query image A are represented as query vectors Q_i, and the feature points of image B as key vectors K_j and value vectors V_j; the similarity between the query vectors Q_i and the key vectors K_j is computed, and feature aggregation is realized by a weighted sum of the values V_j. The attention mechanism is computed as
Attention(Q, K, V) = softmax(Q·K^T)·V
where the superscript T denotes the matrix transpose;
the local Transformer module of the Loftr algorithm is based on the Transformer, wherein the Transformer encoder is a stack of sequentially connected encoder layers whose core is the multi-head attention mechanism, composed of several alternating self-attention and cross-attention layers, so that the matching converges tightly.
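For concreteness, the attention formula above fits in a few lines of NumPy; note that LoFTR's published implementation substitutes a linear-attention variant for efficiency, so this is the textbook form only:

import numpy as np

def attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T) V; rows of Q, K, V are the Q_i, K_j, V_j vectors.
    scores = Q @ K.T
    scores = scores - scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                                       # weighted sum of the value vectors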
It should be noted that the foregoing merely illustrates the technical idea of the present application and is not intended to limit the scope of the present application, and that a person skilled in the art may make several improvements and modifications without departing from the principles of the present application, which fall within the scope of the claims of the present application.

Claims (5)

1. An improved multi-scale fundus image stitching method based on the Loftr algorithm, characterized by comprising the following steps:
S1, apply a rectangular blackout to the fundus image so that the imaged eyeball is inscribed within the outer frame of the image;
S2, downsample the image obtained in step S1 twice, scaling its width and height proportionally: the scale blend_scale produces graph a and the scale feature_scale produces graph b;
S3, obtain from graph b the position of the maximum inscribed rectangle of the eyeball, given by its upper-left corner coordinates (x, y), width w and height h;
S4, derive the center coordinates of the optic disc from graph b and take them as the center point of the feature extraction region; if no optic disc is detected in the image, use the coordinates (x + 0.5·w, y + 0.5·h) computed from the w and h obtained in step S3;
S5, adjust the parameter ratio and, around the center point obtained in step S4, select the region used for fundus image feature extraction, with width ratio·w and height ratio·h;
S6, take the fundus feature extraction region obtained in step S5 as the input of the Loftr algorithm; complete feature point extraction and matching with the Loftr algorithm to obtain matching pairs key_b on the multi-scale region of the fundus images, and restore the matching-pair coordinates to the fusion reference graph a according to the downsampling and multi-scale sizes, obtaining new matching pairs key_a on the reference graph;
S7, use the key_a information to verify correct and connectable matching pairs and form connected components, which contain the position information for composing each fundus panorama;
S8, estimate homographies with the MAGSAC++ robust estimation method: using the matching pairs, apply a homography transformation to every image in each group of connected components so that all images are transformed into the same coordinate system, the homography matrix H satisfying formula (1):
[p′, q′, 1]^T = H · [p, q, 1]^T (in homogeneous coordinates, up to a scale factor)   (1)
where (p′, q′) and (p, q) are paired points of the image to be registered and the reference image, and H is a 3×3 parameter matrix;
S9, using the homography matrices computed in S8 and the paired points of each group of images, complete the homography transformations in sequence;
S10, fuse the images homography-transformed in S9 by combining the quasi-normal weighted fusion algorithm with the connected components, completing the multi-image fusion and thus the stitching.
2. The improved multi-scale fundus image stitching method based on the Loftr algorithm according to claim 1, wherein the specific method of step S3 is: first convert graph b to grayscale; search for a pixel threshold with the classical OpenCV image thresholding function threshold; apply morphological operations to find the contour of the eyeball; then use the boundingRect function to obtain the minimal bounding rectangle, i.e. the position of the maximum inscribed rectangle of the eyeball, given by the upper-left corner coordinates (x, y), width w and height h.
3. The improved multi-scale fundus image stitching method according to claim 1, wherein in step S4, graph b is processed by a deep-learning object detector (YOLOv5) that locates the optic disc and derives its center coordinates.
4. The improved multi-scale fundus image stitching method based on the Loftr algorithm according to claim 1, wherein in step S5 ratio takes values between 0 and 1.
5. The improved multi-scale fundus image stitching method based on the Loftr algorithm according to claim 1, wherein in step S6 the feature extraction part of the Loftr algorithm adopts a feature pyramid network; feature point matching in the Loftr algorithm uses an attention mechanism that assigns a weight to each item of the model input so that similar feature points in the two images are highlighted; for an image pair {A, B}, the feature points of the query image A are represented as query vectors Q_i, and the feature points of image B as key vectors K_j and value vectors V_j; the similarity between the query vectors Q_i and the key vectors K_j is computed, and feature aggregation is realized by a weighted sum of the values V_j, the attention mechanism being computed as
Attention(Q, K, V) = softmax(Q·K^T)·V
where the superscript T denotes the matrix transpose;
the local Transformer module of the Loftr algorithm is based on the Transformer, wherein the Transformer encoder is a stack of sequentially connected encoder layers whose core is the multi-head attention mechanism, composed of several alternating self-attention and cross-attention layers, so that the matching converges tightly.
CN202310350202.8A 2023-04-04 2023-04-04 Improved multi-scale fundus image stitching method based on Loftr algorithm Active CN116152073B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310350202.8A CN116152073B (en) 2023-04-04 2023-04-04 Improved multi-scale fundus image stitching method based on Loftr algorithm

Publications (2)

Publication Number Publication Date
CN116152073A (en) 2023-05-23
CN116152073B (en) 2023-08-22

Family

ID=86350842

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310350202.8A Active CN116152073B (en) 2023-04-04 2023-04-04 Improved multi-scale fundus image stitching method based on Loftr algorithm

Country Status (1)

Country Link
CN (1) CN116152073B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108022228A (en) * 2016-10-31 2018-05-11 天津工业大学 Based on the matched colored eye fundus image joining method of SIFT conversion and Otsu
CN113436070A (en) * 2021-06-20 2021-09-24 四川大学 Fundus image splicing method based on deep neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fundus image stitching based on SIFT block feature matching; Gao Yu et al.; Journal of Changchun University of Technology; Vol. 41, No. 4; pp. 395-398 *

Also Published As

Publication number Publication date
CN116152073A (en) 2023-05-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant