CN111815690B - Method, system and computer equipment for real-time splicing of microscopic images - Google Patents

Method, system and computer equipment for real-time splicing of microscopic images

Info

Publication number
CN111815690B
Authority
CN
China
Prior art keywords
image
images
microscopic
network
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010950938.5A
Other languages
Chinese (zh)
Other versions
CN111815690A (en)
Inventor
向北海
张建南
许会
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Guokezhitong Technology Co ltd
Original Assignee
Hunan Guokezhitong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Guokezhitong Technology Co ltd filed Critical Hunan Guokezhitong Technology Co ltd
Priority to CN202010950938.5A priority Critical patent/CN111815690B/en
Publication of CN111815690A publication Critical patent/CN111815690A/en
Application granted granted Critical
Publication of CN111815690B publication Critical patent/CN111815690B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/02Affine transformations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/32Indexing scheme for image data processing or generation, in general involving image mosaicing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, a system and computer equipment for real-time splicing of microscopic images. The method adopts a two-stage, dual-network cascaded microscopic image splicing model. A pixel-level image registration network directly outputs a pixel-level mapping table from an input image pair, realizing pixel-by-pixel mapping so that each pixel can be aligned to its ideal position; compared with the common approach of computing a single homography matrix, or one homography matrix per grid cell, the pixel-level mapping table gives higher precision and higher speed and allows real-time processing. A multi-focus image fusion network then directly outputs the fused image from the input aligned images, realizing multi-focus image fusion, so that every part of the resulting spliced image is sharp and accurate splicing is achieved even when a weak texture area exists in the overlapping region of the image pair.

Description

Method, system and computer equipment for real-time splicing of microscopic images
Technical Field
The invention relates to the technical field of microscopic image processing, in particular to a method, a system and computer equipment for real-time splicing of microscopic images.
Background
With the development of digital microscope technology, a sample slide can be scanned comprehensively in a slide-scanning mode to obtain a full-section digital microscopic image: microscopic images of each local field of view of the sample slide are shot sequentially under the magnification of a digital microscope, and a complete full-section digital microscopic image is obtained by means of image splicing technology.
Generally, microscopic image splicing technology comprises two key steps, namely image registration and image fusion. The image splicing technology widely used at present is based on feature point matching. It can obtain a good splicing effect, but it cannot meet real-time requirements; in particular, when a weak texture area exists in the overlapping region, effective feature points cannot be detected for feature matching, which can cause image splicing to fail.
Disclosure of Invention
The invention provides a method, a system and computer equipment for real-time splicing of microscopic images, which are used to overcome the defects of the prior art that real-time requirements are not met and that image splicing may even fail when a weak texture area exists in the overlapping region.
In order to achieve the above object, the present invention provides a method for real-time stitching of microscopic images, comprising:
acquiring a local microscopic image;
forming image pairs by adjacent local microscopic images in the local microscopic images, obtaining ideal labels corresponding to each group of image pairs by using an image registration algorithm, and forming a training set by all the image pairs and the ideal labels;
inputting the training set into a pre-constructed microscopic image splicing model, wherein the microscopic image splicing model comprises an image registration network and an image fusion network;
registering the input image pair by using an image registration network, obtaining a mapping table according to an ideal label corresponding to the image pair, and aligning the image pair according to the mapping table to obtain an aligned image;
extracting the mutually overlapped parts in the alignment images, fusing the mutually overlapped parts in the alignment images by using an image fusion network to obtain local fusion images, cascading the local fusion images and the non-overlapped parts in the alignment images to obtain spliced images, and obtaining a trained microscopic image splicing model;
and splicing the local microscopic images to be spliced by using the trained microscopic image splicing model to obtain a spliced image.
In order to achieve the above object, the present invention further provides a system for real-time stitching of microscopic images, comprising:
the image acquisition module is used for acquiring a local microscopic image;
the training set generation module is used for forming image pairs by adjacent local microscopic images in the local microscopic images, obtaining ideal labels corresponding to each group of image pairs by using the image registration model, and forming a training set by all the image pairs and the ideal labels;
the model training module is used for inputting the training set into a pre-constructed microscopic image splicing model, and the microscopic image splicing model comprises an image registration network and an image fusion network; registering the input image pair by using an image registration network, obtaining a mapping table according to an ideal label corresponding to the image pair, and aligning the image pair according to the mapping table to obtain an aligned image; extracting the mutually overlapped parts in the alignment images, fusing the mutually overlapped parts in the alignment images by using an image fusion network to obtain local fusion images, cascading the local fusion images and the non-overlapped parts in the alignment images to obtain spliced images, and obtaining a trained microscopic image splicing model;
and the image splicing module is used for splicing the local microscopic images to be spliced by utilizing the trained microscopic image splicing model to obtain spliced images.
To achieve the above object, the present invention further provides a computer device, which includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the method when executing the computer program.
To achieve the above object, the present invention further proposes a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method described above.
Compared with the prior art, the invention has the beneficial effects that:
the method for real-time splicing of the microscopic images adopts a two-stage double-network cascaded microscopic image splicing model, utilizes a pixel-level image registration network to directly output a pixel-level mapping table from an input image pair, realizes pixel-by-pixel mapping, can align each pixel to an ideal alignment position, and has high precision and high speed of the mapping table based on the pixel level and real-time processing compared with the common homography matrix calculation or homography matrices corresponding to a plurality of grids; and then, a multi-focus image fusion network is utilized to directly output the fused image from the input aligned image, so that multi-focus image fusion is realized, all parts of the obtained spliced image are completely clear, and accurate splicing can be realized under the condition that a weak texture area exists in an image pair overlapping area.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from the structures shown without creative effort.
FIG. 1 is a flow chart of a method for real-time stitching of microscopic images according to the present invention;
FIG. 2 is a block diagram of a microscopic image stitching model provided by the present invention;
FIG. 3 is a structural diagram of an image registration network in a microscopic image stitching model according to an embodiment of the present invention;
FIG. 4 is a structural diagram of an image fusion network in the microscopic image stitching model in the embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. It is obvious that the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
In addition, the technical solutions in the embodiments of the present invention may be combined with each other, but only insofar as such a combination can be realized by those skilled in the art; when technical solutions are contradictory or cannot be realized, the combination should be considered not to exist and is not within the protection scope of the present invention.
The invention provides a method for real-time splicing of microscopic images, which comprises the following steps of:
101: acquiring a local microscopic image;
A plurality of lesion tissue samples are collected and prepared as sample slides by pathological methods. Local microscopic images under each field of view of the sample slides are collected with a high-magnification digital microscope, ensuring that an overlapping area exists between adjacent local microscopic images.
102: forming image pairs by adjacent local microscopic images in the local microscopic images, obtaining ideal labels corresponding to each group of image pairs by using an image registration algorithm, and forming a training set by all the image pairs and the ideal labels;
the image registration model is an existing common image registration model based on feature point matching.
The ideal label is an aligned image which is registered and aligned by using an image registration model.
103: inputting a training set into a pre-constructed microscopic image splicing model, wherein the microscopic image splicing model comprises an image registration network and an image fusion network;
the microscopic image splicing model is a two-stage double-network cascading model.
104: registering the input image pair by using an image registration network, obtaining a mapping table according to an ideal label corresponding to the image pair, and aligning the image pair according to the mapping table to obtain an aligned image;
and the mapping table comprises a mapping table in the horizontal direction and a mapping table in the vertical direction.
For each pixel, the corresponding value in the mapping table represents a mapping that transforms the original position of the pixel to the aligned position in the aligned view, i.e. the corresponding position of the respective pixel in the original image in the aligned image can be determined by the mapping table.
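To make this pixel-wise remapping concrete, the following minimal Python sketch applies a per-pixel mapping table to an image with OpenCV. It is an illustration only, not the network of the invention: the array names map_x and map_y are hypothetical, and cv2.remap uses the inverse convention (for each output pixel it looks up where to sample in the source image), so a mapping table produced by a registration network may need to be expressed in that convention first.

```python
import cv2
import numpy as np

def align_with_mapping_table(image, map_x, map_y):
    """Warp `image` so that output pixel (y, x) is sampled from position
    (map_y[y, x], map_x[y, x]) of the source image. `map_x` and `map_y` are
    H x W float32 arrays: the horizontal and vertical mapping tables."""
    return cv2.remap(image, map_x.astype(np.float32), map_y.astype(np.float32),
                     interpolation=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)

# Toy example: an identity grid shifted by 5 pixels, i.e. every output pixel is
# sampled 5 pixels to the left in the source image, so the content shifts right.
h, w = 480, 640
grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                             np.arange(h, dtype=np.float32))
image = np.random.randint(0, 255, (h, w), dtype=np.uint8)
aligned = align_with_mapping_table(image, grid_x - 5, grid_y)
```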
105: extracting the overlapped parts in the aligned images, fusing the overlapped parts in the aligned images by using an image fusion network to obtain local fusion images, cascading the local fusion images and the non-overlapped parts in the aligned images to obtain spliced images, and obtaining a trained microscopic image splicing model;
106: and splicing the local microscopic images to be spliced by using the trained microscopic image splicing model to obtain a spliced image.
The method for real-time splicing of microscopic images adopts a two-stage, dual-network cascaded microscopic image splicing model. A pixel-level image registration network directly outputs a pixel-level mapping table from an input image pair, realizing pixel-by-pixel mapping so that each pixel can be aligned to its ideal position; compared with the common approach of computing a single homography matrix, or one homography matrix per grid cell, the pixel-level mapping table gives higher precision and higher speed and allows real-time processing. A multi-focus image fusion network then directly outputs the fused image from the input aligned images, realizing multi-focus image fusion, so that every part of the resulting spliced image is sharp and accurate splicing is achieved even when a weak texture area exists in the overlapping region of the image pair.
A traditional image splicing algorithm based on feature point matching requires the image content to be rich enough that many feature points can be extracted and many matched feature point pairs obtained; it therefore depends heavily on the content of the captured images. The method of the invention adopts a deep learning approach and does not depend on image content, so it has a wide application range, i.e. strong robustness.
In one embodiment, for step 102, obtaining the ideal label corresponding to each set of image pairs by using an image registration algorithm includes:
201: extracting SURF feature descriptors of each group of image pairs by using an image registration algorithm, and performing feature pre-matching according to the SURF feature descriptors;
the SURF is called Speed Up Robust Features, and is an accelerated version of Robust feature algorithm, which greatly accelerates the running time of a program by adopting the concepts of Haar Features and integral images.
202: eliminating mismatching feature points in the feature pre-matching process by using a RANSAC algorithm to obtain a matching result, and calculating a homography matrix according to the matching result;
RANSAC (RANdom SAmple Consensus) is a random sample consensus algorithm. It iteratively estimates the parameters of a mathematical model from a set of observed data containing outliers. It is a non-deterministic algorithm that yields a reasonable result only with a certain probability.
203: and transforming one image in the image pair by using the homography matrix, and aligning the transformed image with the other image in the image pair to obtain an aligned image, wherein the aligned image is an ideal label corresponding to the image pair.
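As an illustration of how such an ideal label can be produced, the following Python sketch uses OpenCV's SURF implementation, a ratio test for feature pre-matching, RANSAC-based homography estimation, and a perspective warp. It is a sketch under stated assumptions: SURF is only available in opencv-contrib builds (cv2.xfeatures2d), and the Hessian threshold, ratio-test value and RANSAC reprojection threshold below are illustrative defaults, not values taken from the patent.

```python
import cv2
import numpy as np

def ideal_label(img_a, img_b, ratio=0.75, ransac_thresh=3.0):
    """Register img_b to img_a with SURF + RANSAC and return the warped
    (aligned) image that can serve as the ideal label for this image pair."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)   # needs opencv-contrib
    kp_a, des_a = surf.detectAndCompute(img_a, None)
    kp_b, des_b = surf.detectAndCompute(img_b, None)

    # Feature pre-matching with a ratio test, then RANSAC rejects mismatched
    # feature points while estimating the homography matrix.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des_b, des_a, k=2)
    good = [m for m, n in knn if m.distance < ratio * n.distance]
    src = np.float32([kp_b[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, ransac_thresh)

    # Transform one image of the pair and align it with the other.
    h, w = img_a.shape[:2]
    return cv2.warpPerspective(img_b, H, (w, h))
```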
In the next embodiment, for step 103, the microscopic image stitching model is a two-stage, dual-network cascaded model as shown in FIG. 2. In the first stage, the input image pair is registered by a pixel-level image registration network (PWIRNet for short) and the aligned images are output; in the second stage, a multi-focus image fusion network (MFIFNet for short) performs multi-focus fusion on the overlapping part of the input aligned images and outputs a large-size spliced image.
The image registration network is essentially an encoder-decoder framework similar to U-Net: it takes an input image pair (two images, each of size H × W × 1, where H is the height, W the width and 1 the number of channels) and outputs a mapping table F, i.e. a mapping table F_x in the horizontal direction and a mapping table F_y in the vertical direction, where F_x and F_y have the same size. For each pixel, the corresponding value in the mapping table represents the mapping that transforms the original position of the pixel to its aligned position in the aligned view; that is, the position in the aligned image of each pixel of the original image can be determined from the mapping table.
The image registration network (abbreviated as PWIRNet) is a pixel-level network. As shown in FIG. 3, it includes an encoder module, a decoder module and a spatial transform module;
the encoder module is used to extract high-level features from the input image pair and sequentially comprises 8 convolutional layers (conv1–conv8); the convolution kernel size of each convolutional layer is 3 × 3, the stride is 1, and the numbers of feature maps of the layers are 32, 64, 128, 256 and 256;
the decoder module is used for decoding the high-level features extracted by the encoder module to obtain a fine mapping table
Figure 25956DEST_PATH_IMAGE006
Fine mapping table in horizontal direction
Figure 954598DEST_PATH_IMAGE007
And a fine mapping table in the vertical direction
Figure 949099DEST_PATH_IMAGE008
Figure 496755DEST_PATH_IMAGE009
And
Figure 322628DEST_PATH_IMAGE010
the encoder module comprises 7 deconvolution layers (deconv 7-deconv 1), wherein the convolution kernel size of each deconvolution layer is 3 multiplied by 3, the step length is 1) and 1 convolution layer (conv 9), each deconvolution layer corresponds to one convolution layer in the encoder module, each deconvolution layer is connected with the corresponding convolution layer in a jump connection mode, and the number of feature maps of each deconvolution layer is the same as that of the feature maps in the corresponding convolution layer;
the space transformation module is used for multiplying the affine transformation Ht and the constant matrix A to obtain a coarse mapping table
Figure 359855DEST_PATH_IMAGE011
Coarse mapping table in horizontal direction
Figure 779335DEST_PATH_IMAGE012
And coarse mapping table in vertical direction
Figure 255315DEST_PATH_IMAGE013
Figure 76641DEST_PATH_IMAGE014
And
Figure 347085DEST_PATH_IMAGE015
same size, will thin the mapping table
Figure 50599DEST_PATH_IMAGE016
And coarse mapping table
Figure 940058DEST_PATH_IMAGE017
Superposition is carried out to obtain a mapping table
Figure 6103DEST_PATH_IMAGE018
Using a mapping table
Figure 322814DEST_PATH_IMAGE019
One of the images in the pair is transformed,and aligning the transformed image with the other image of the pair of images to obtain an aligned image.
In one embodiment, the affine transformation Ht is obtained by downsampling the feature map output by the last convolutional layer (conv 8) in the encoder block through one convolutional layer (conv 10) and then through one 1 × 1 convolutional layer (conv 11).
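The following PyTorch sketch shows one possible way to wire up an encoder-decoder registration network of this kind. It is not the patented PWIRNet: it uses far fewer layers than the 8-convolution encoder and 7-deconvolution decoder described above, uses strided convolutions for downsampling, predicts the affine transformation Ht with a pooled linear layer, and combines the coarse and fine mapping tables by simple addition in normalized grid coordinates; all of these choices are assumptions made for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PWIRNetSketch(nn.Module):
    """Simplified pixel-wise registration network: an encoder-decoder with
    skip connections predicts a fine mapping table, an affine branch predicts
    a coarse mapping table, and their sum is the final mapping table."""

    def __init__(self, channels=(32, 64, 128)):
        super().__init__()
        c1, c2, c3 = channels
        # Encoder (strided convolutions for downsampling; layer count reduced).
        self.enc1 = nn.Sequential(nn.Conv2d(2, c1, 3, 2, 1), nn.ReLU(inplace=True))
        self.enc2 = nn.Sequential(nn.Conv2d(c1, c2, 3, 2, 1), nn.ReLU(inplace=True))
        self.enc3 = nn.Sequential(nn.Conv2d(c2, c3, 3, 2, 1), nn.ReLU(inplace=True))
        # Decoder with skip connections from the encoder.
        self.dec3 = nn.Sequential(nn.ConvTranspose2d(c3, c2, 4, 2, 1), nn.ReLU(inplace=True))
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(c2 * 2, c1, 4, 2, 1), nn.ReLU(inplace=True))
        self.dec1 = nn.ConvTranspose2d(c1 * 2, 2, 4, 2, 1)   # 2 channels: fine F_x and F_y
        # Affine branch: pooled features -> 2x3 affine Ht, initialized to identity.
        self.affine = nn.Linear(c3, 6)
        nn.init.zeros_(self.affine.weight)
        with torch.no_grad():
            self.affine.bias.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, pair):                 # pair: (N, 2, H, W), H and W divisible by 8
        n, _, h, w = pair.shape
        e1 = self.enc1(pair)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        d3 = self.dec3(e3)
        d2 = self.dec2(torch.cat([d3, e2], dim=1))
        fine = self.dec1(torch.cat([d2, e1], dim=1))              # fine mapping table
        theta = self.affine(e3.mean(dim=(2, 3))).view(n, 2, 3)    # affine transformation Ht
        coarse = F.affine_grid(theta, (n, 1, h, w), align_corners=False)
        coarse = coarse.permute(0, 3, 1, 2)                       # coarse mapping table
        return coarse + fine                                      # mapping table (N, 2, H, W)

def warp(image, mapping):
    """Warp `image` (N, 1, H, W) with a mapping table given in normalized
    [-1, 1] grid coordinates, as produced by PWIRNetSketch."""
    return F.grid_sample(image, mapping.permute(0, 2, 3, 1), align_corners=False)
```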
The image fusion network is a multi-focus network. As shown in FIG. 4, it comprises two parallel feature extraction sub-networks (used for feature extraction of the input images Input1 and Input2 respectively), a fusion module and a feature reconstruction module (Feature Reconstruction Module). Each feature extraction sub-network comprises 2 convolutional layers (conv1 and conv2; the convolution kernel size of conv1 is 3 × 3 and it outputs a 64-dimensional feature map; conv2 has no activation function) and 1 feature extraction module (Feature Extraction Module) located between the 2 convolutional layers;
the characteristic extraction module is used for extracting characteristics of an input image to obtain a high-dimensional nonlinear characteristic diagram, and sequentially comprises 5 convolutional layers (conv), wherein a ReLU nonlinear activation layer is connected behind each convolutional layer; the convolution kernel size of each convolution layer is 3 × 3, and the number of feature maps of each convolution layer is 64.
The fusion module is used for fusing the feature maps output by the two parallel feature extraction sub-networks and comprises 1 fusion layer (Fuse);
the characteristic reconstruction module is used for reconstructing the fusion characteristic diagram output by the fusion module to obtain a local fusion image, and cascading the local fusion image and the non-overlapping part in the alignment image to obtain a spliced image, wherein the spliced image sequentially comprises 7 convolutional layers (conv), a Leaky-ReLU nonlinear activation layer is connected behind each of the first 6 convolutional layers, and a 1 Sigmoid layer is connected behind the 7 th convolutional layer; the convolution kernel size of each convolution layer is 3 × 3, and the number of feature maps of each convolution layer is 64.
Input1 and Input2 are the two images of the mutually overlapping parts of the aligned images. They are fed into the two parallel feature extraction sub-networks respectively: a 64-dimensional feature map is obtained through convolutional layer conv1, each feature map is then passed through the feature extraction module (Feature Extraction Module) to obtain a high-dimensional nonlinear feature map, and each result passes through convolutional layer conv2, which has no activation function. The two output feature maps are fused in the fusion layer Fuse, the fused feature map is fed into the feature reconstruction module (Feature Reconstruction Module) to obtain a local fusion image, and finally the local fusion image is cascaded with the non-overlapping parts of the aligned images (the image areas not involved in fusion) to generate a large-size spliced image.
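A simplified PyTorch sketch of a fusion network with this layout is given below. It follows the layer counts described above (conv1, a 5-layer feature extraction module, conv2 without activation, and a 7-layer reconstruction module ending in a Sigmoid), but the element-wise maximum fusion rule and the use of shared weights for the two parallel branches are assumptions of the sketch, since they are not specified here.

```python
import torch
import torch.nn as nn

class FeatureExtractionModule(nn.Module):
    """5 convolutional layers, each followed by ReLU; 64 feature maps per layer."""
    def __init__(self, ch=64):
        super().__init__()
        layers = []
        for _ in range(5):
            layers += [nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

class MFIFNetSketch(nn.Module):
    """Simplified multi-focus fusion network: two feature extraction branches
    (weights shared here for brevity), an element-wise fusion layer, and a
    reconstruction module with 7 conv layers, Leaky-ReLU after the first 6
    and Sigmoid after the 7th."""
    def __init__(self, ch=64):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.fem = FeatureExtractionModule(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)   # conv2 has no activation function
        recon = []
        for _ in range(6):
            recon += [nn.Conv2d(ch, ch, 3, padding=1), nn.LeakyReLU(0.2, inplace=True)]
        recon += [nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid()]
        self.reconstruct = nn.Sequential(*recon)

    def extract(self, x):
        return self.conv2(self.fem(self.conv1(x)))

    def forward(self, input1, input2):
        f1, f2 = self.extract(input1), self.extract(input2)
        fused = torch.maximum(f1, f2)          # Fuse layer (fusion rule assumed)
        return self.reconstruct(fused)         # locally fused image
```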
In the next embodiment, for the steps 104-105, the training of the microscopic image mosaic model is divided into two stages, wherein the first stage is to train the image registration network by using a training set to obtain parameters of the image registration network; and in the second stage, the image registration network is fixed according to the parameters obtained in the first stage, the alignment images output by the image registration network are utilized, the image fusion network is trained in an unsupervised mode, and the parameters of the image fusion network are obtained.
First, the image registration network is trained separately in the first stage: all weights of the pixel-level image registration network are initialized with a normal distribution with mean 0 and standard deviation 0.02, the Adam optimization algorithm is adopted with an initial learning rate of 0.002, and the learning rate is multiplied by 0.1 every 10k iterations.
After the first-stage training is finished, the image registration network is fixed with the parameters obtained in the first stage and the image fusion network is trained in an unsupervised manner, using image structural similarity as the loss function to guide the unsupervised learning of the multi-focus image fusion network. Because the image fusion network is trained in an unsupervised manner, there is no need to synthesize a large number of training samples, and the trained model stitches well on real data sets.
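The two-stage schedule above can be set up in PyTorch roughly as follows. PWIRNetSketch and MFIFNetSketch refer to the hypothetical sketches given earlier; the stage-2 learning rate and the freezing mechanism are assumptions, while the N(0, 0.02) initialization, the Adam optimizer, the 0.002 initial learning rate and the ×0.1 decay every 10k iterations follow the description above.

```python
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR

# Stage 1 setup: normal(0, 0.02) weight initialization, Adam with an initial
# learning rate of 0.002, learning rate multiplied by 0.1 every 10k iterations
# (call sched.step() once per training iteration).
reg_net = PWIRNetSketch()                       # hypothetical sketch defined earlier
for m in reg_net.modules():
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(m.weight, mean=0.0, std=0.02)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
opt_reg = Adam(reg_net.parameters(), lr=0.002)
sched = StepLR(opt_reg, step_size=10_000, gamma=0.1)
# ... stage-1 supervised training loop over (image pair, ideal label) batches ...

# Stage 2 setup: fix the registration network with the stage-1 parameters,
# then train the fusion network unsupervised with a structural-similarity loss.
for p in reg_net.parameters():
    p.requires_grad_(False)
reg_net.eval()
fuse_net = MFIFNetSketch()                      # hypothetical sketch defined earlier
opt_fuse = Adam(fuse_net.parameters(), lr=0.002)  # stage-2 learning rate is assumed
# ... stage-2 unsupervised training loop over overlapping parts of aligned images ...
```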
In a certain embodiment, in order to ensure that the image registration network correctly predicts the mapping table of pixel level, a loss function as shown below is used to guide the training process of the pixel level image registration network:
$\mathcal{L}_{total} = \mathcal{L}_{content}(\tilde{I}, I^{*}) + \mathcal{L}_{shape}(\tilde{I}, I^{*})$
wherein $I$ represents one image of the pair; $\tilde{I}$ represents the aligned image after transforming $I$; $I^{*}$ represents the ideal label corresponding to the image pair; $\mathcal{L}_{content}$ represents the content loss function; $\mathcal{L}_{shape}$ represents the shape loss function; and $\mathcal{L}_{total}$ represents the overall loss function.
The loss function adopted by the image fusion network in the training is as follows:
$\mathcal{L}_{fuse} = \frac{1}{N}\sum_{w}\bigl(1 - S_{w}(x_1, x_2, \hat{y})\bigr)$
in the formula, $\mathcal{L}_{fuse}$ represents the total loss function; $x_1$ and $x_2$ represent the mutually overlapping portions of the input aligned images; $\hat{y}$ represents the output stitched image; $w$ represents a local window; $N$ represents the total number of sliding windows in the image; and $S_{w}(x_1, x_2, \hat{y})$ represents the similarity measure of the image blocks within the local window.
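A structural-similarity loss of this kind can be sketched in PyTorch as below. This is a minimal illustration, not the exact loss of the patent: it computes SSIM over sliding local windows with a uniform (box) kernel rather than the more usual Gaussian window, and the per-window rule of keeping the larger of the two similarities is an assumption about how the fused output is compared against the two inputs.

```python
import torch
import torch.nn.functional as F

def window_ssim_map(a, b, win=11, c1=0.01 ** 2, c2=0.03 ** 2):
    """Per-window SSIM map between images a and b, each (N, 1, H, W) in [0, 1],
    computed with a sliding uniform (box) window of size `win`."""
    mu_a = F.avg_pool2d(a, win, stride=1)
    mu_b = F.avg_pool2d(b, win, stride=1)
    var_a = F.avg_pool2d(a * a, win, stride=1) - mu_a ** 2
    var_b = F.avg_pool2d(b * b, win, stride=1) - mu_b ** 2
    cov = F.avg_pool2d(a * b, win, stride=1) - mu_a * mu_b
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

def fusion_loss(x1, x2, fused):
    """Unsupervised fusion loss: averaged over all sliding windows, one minus
    the similarity between the fused output and the better-matching input.
    The per-window max rule is an assumption of this sketch."""
    s1 = window_ssim_map(fused, x1)
    s2 = window_ssim_map(fused, x2)
    return 1.0 - torch.maximum(s1, s2).mean()
```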
The invention also provides a system for real-time splicing of microscopic images, which comprises:
the image acquisition module is used for acquiring a local microscopic image;
the training set generation module is used for forming image pairs by adjacent local microscopic images in the local microscopic images, obtaining ideal labels corresponding to each group of image pairs by using the image registration model, and forming a training set by all the image pairs and the ideal labels;
the model training module is used for inputting the training set into a pre-constructed microscopic image splicing model, and the microscopic image splicing model comprises an image registration network and an image fusion network; registering the input image pair by using an image registration network, obtaining a mapping table according to an ideal label corresponding to the image pair, and aligning the image pair according to the mapping table to obtain an aligned image; extracting the mutually overlapped parts in the alignment images, fusing the mutually overlapped parts in the alignment images by using an image fusion network to obtain local fusion images, cascading the local fusion images and the non-overlapped parts in the alignment images to obtain spliced images, and obtaining a trained microscopic image splicing model;
and the image splicing module is used for splicing the local microscopic images to be spliced by utilizing the trained microscopic image splicing model to obtain spliced images.
In one embodiment, the training set generation module further comprises:
extracting SURF feature descriptors of each group of image pairs by using an image registration algorithm, and performing feature pre-matching according to the SURF feature descriptors;
eliminating mismatching feature points in the feature pre-matching process by using a RANSAC algorithm to obtain a matching result, and calculating a homography matrix according to the matching result;
and transforming one image in the image pair by using the homography matrix, and aligning the transformed image with the other image in the image pair to obtain an aligned image, wherein the aligned image is an ideal label corresponding to the image pair.
In a next embodiment, for the model training module, the image registration network is a pixel level network comprising an encoder module, a decoder module, and a spatial transform module;
the encoder module is used for extracting high-level features from an input image pair and sequentially comprises 8 convolutional layers;
the decoder module is used for decoding the high-level features extracted by the encoder module to obtain a fine mapping table; the decoder module sequentially comprises 7 deconvolution layers and 1 convolutional layer, each deconvolution layer corresponds to one convolutional layer in the encoder module, is connected with the corresponding convolutional layer by a skip connection, and has the same number of feature maps as the corresponding convolutional layer;
the space transformation module is used for multiplying affine transformation and a constant matrix to obtain a coarse mapping table, superposing the fine mapping table and the coarse mapping table to obtain a mapping table, transforming one image in an image pair by using the mapping table, and aligning the transformed image with the other image in the image pair to obtain an aligned image.
In another embodiment, for the model training module, the affine transformation is obtained by downsampling the feature map output by the last convolutional layer in the encoder module through a convolutional layer and then through a 1 × 1 convolutional layer.
In a next embodiment, for the model training module, the image fusion network is a multi-focus network, and includes two parallel feature extraction sub-networks, a fusion module and a feature reconstruction module, where the feature extraction sub-network includes 2 convolutional layers and 1 feature extraction module located between the 2 convolutional layers;
the feature extraction module is used for extracting features from an input image, and sequentially comprises 5 convolutional layers, wherein a ReLU nonlinear activation layer is connected behind each convolutional layer;
the fusion module is used for fusing the feature maps output by the two parallel feature extraction sub-networks and comprises 1 fusion layer;
the feature reconstruction module is used for reconstructing the fused feature map output by the fusion module to obtain a local fusion image, and cascading the local fusion image and the non-overlapping parts of the aligned images to obtain a spliced image; the feature reconstruction module sequentially comprises 7 convolutional layers, a Leaky-ReLU nonlinear activation layer follows each of the first 6 convolutional layers, and a Sigmoid layer follows the 7th convolutional layer.
In the next embodiment, for the model training module, the training of the microscopic image mosaic model is divided into two stages, wherein the first stage is to train the image registration network by using a training set to obtain parameters of the image registration network; and in the second stage, the image registration network is fixed according to the parameters obtained in the first stage, the alignment images output by the image registration network are utilized, the image fusion network is trained in an unsupervised mode, and the parameters of the image fusion network are obtained.
In a certain embodiment, for the model training module, the loss function used by the image registration network in the training is:
$\mathcal{L}_{total} = \mathcal{L}_{content}(\tilde{I}, I^{*}) + \mathcal{L}_{shape}(\tilde{I}, I^{*})$
wherein $I$ represents one image of the pair; $\tilde{I}$ represents the aligned image after transforming $I$; $I^{*}$ represents the ideal label corresponding to the image pair; $\mathcal{L}_{content}$ represents the content loss function; $\mathcal{L}_{shape}$ represents the shape loss function; and $\mathcal{L}_{total}$ represents the total loss function;
the loss function adopted by the image fusion network in the training is as follows:
$\mathcal{L}_{fuse} = \frac{1}{N}\sum_{w}\bigl(1 - S_{w}(x_1, x_2, \hat{y})\bigr)$
in the formula, $\mathcal{L}_{fuse}$ represents the total loss function; $x_1$ and $x_2$ represent the mutually overlapping portions of the input aligned images; $\hat{y}$ represents the output stitched image; $w$ represents a local window; $N$ represents the total number of sliding windows in the image; and $S_{w}(x_1, x_2, \hat{y})$ represents the similarity measure of the image blocks within the local window.
The invention further provides a computer device, which includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the method when executing the computer program.
The invention also proposes a computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method described above.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all modifications and equivalents of the present invention, which are made by the contents of the present specification and the accompanying drawings, or directly/indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (9)

1. A method for real-time stitching of microscopic images, comprising:
acquiring a local microscopic image;
forming image pairs by adjacent local microscopic images in the local microscopic images, obtaining ideal labels corresponding to each group of image pairs by using an image registration algorithm, and forming a training set by all the image pairs and the ideal labels; the ideal label is an aligned image which is registered and aligned by using an image registration model;
inputting the training set into a pre-constructed microscopic image splicing model, wherein the microscopic image splicing model comprises an image registration network and an image fusion network;
registering the input image pair by using an image registration network, obtaining a mapping table according to an ideal label corresponding to the image pair, and aligning the image pair according to the mapping table to obtain an aligned image; the mapping table comprises a mapping table in the horizontal direction and a mapping table in the vertical direction; for each pixel, the corresponding value in the mapping table represents a mapping for transforming the original position of the pixel to the aligned position in the aligned view, i.e. the corresponding position of each pixel in the original image in the aligned image can be determined by the mapping table;
extracting the mutually overlapped parts in the alignment images, fusing the mutually overlapped parts in the alignment images by using an image fusion network to obtain local fusion images, cascading the local fusion images and the non-overlapped parts in the alignment images to obtain spliced images, and obtaining a trained microscopic image splicing model; the training of the microscopic image splicing model is divided into two stages, wherein the first stage is to train an image registration network by using a training set to obtain parameters of the image registration network; the second stage is to fix the image registration network according to the parameters obtained in the first stage, train the image fusion network by using the aligned images output by the image registration network and adopting an unsupervised mode to obtain the parameters of the image fusion network;
and splicing the local microscopic images to be spliced by using the trained microscopic image splicing model to obtain a spliced image.
2. The method for real-time stitching of microscopic images according to claim 1, wherein obtaining the ideal label corresponding to each set of image pairs using an image registration algorithm comprises:
extracting SURF feature descriptors of each group of image pairs by using an image registration algorithm, and performing feature pre-matching according to the SURF feature descriptors;
eliminating mismatching feature points in the feature pre-matching process by using a RANSAC algorithm to obtain a matching result, and calculating a homography matrix according to the matching result;
and transforming one image in the image pair by using the homography matrix, and aligning the transformed image with the other image in the image pair to obtain an aligned image, wherein the aligned image is an ideal label corresponding to the image pair.
3. The method for real-time stitching of microscopic images according to claim 1, wherein the image registration network is a pixel level network comprising an encoder module, a decoder module, and a spatial transform module;
the encoder module is used for extracting high-level features from an input image pair and sequentially comprises 8 convolutional layers;
the decoder module is used for decoding the high-level features extracted by the encoder module to obtain a fine mapping table; the decoder module sequentially comprises 7 deconvolution layers and 1 convolutional layer, each deconvolution layer corresponds to one convolutional layer in the encoder module, is connected with the corresponding convolutional layer by a skip connection, and has the same number of feature maps as the corresponding convolutional layer;
the space transformation module is used for multiplying affine transformation and a constant matrix to obtain a coarse mapping table, superposing the fine mapping table and the coarse mapping table to obtain a mapping table, transforming one image in an image pair by using the mapping table, and aligning the transformed image with the other image in the image pair to obtain an aligned image.
4. The method for real-time stitching of microscope images as claimed in claim 3, wherein the affine transformation is obtained by downsampling the feature map output by the last convolutional layer in the encoder block through a convolutional layer and then through a 1 x 1 convolutional layer.
5. The method for real-time stitching of microscopic images according to claim 1, wherein the image fusion network is a multi-focus network, and comprises two parallel sub-networks of feature extraction, a fusion module and a feature reconstruction module, wherein the sub-networks of feature extraction comprise 2 convolutional layers and 1 feature extraction module located between the 2 convolutional layers;
the feature extraction module is used for extracting features from an input image, and sequentially comprises 5 convolutional layers, wherein a ReLU nonlinear activation layer is connected behind each convolutional layer;
the fusion module is used for fusing the feature maps output by the two parallel feature extraction sub-networks and comprises 1 fusion layer;
the feature reconstruction module is used for reconstructing the fused feature map output by the fusion module to obtain a local fusion image, and cascading the local fusion image and the non-overlapping parts of the aligned images to obtain a spliced image; the feature reconstruction module sequentially comprises 7 convolutional layers, a Leaky-ReLU nonlinear activation layer follows each of the first 6 convolutional layers, and a Sigmoid layer follows the 7th convolutional layer.
6. The method for real-time stitching of microscopic images according to claim 1, wherein the loss function employed by the image registration network in training is:
$\mathcal{L}_{total} = \mathcal{L}_{content}(\tilde{I}, I^{*}) + \mathcal{L}_{shape}(\tilde{I}, I^{*})$
wherein $I$ represents one image of the pair; $\tilde{I}$ represents the aligned image after transforming $I$; $I^{*}$ represents the ideal label corresponding to the image pair; $\mathcal{L}_{content}$ represents a content loss function; $\mathcal{L}_{shape}$ represents a shape loss function; and $\mathcal{L}_{total}$ represents the total loss function;
the loss function adopted by the image fusion network in the training is as follows:
$\mathcal{L}_{fuse} = \frac{1}{N}\sum_{w}\bigl(1 - S_{w}(x_1, x_2, \hat{y})\bigr)$
in the formula, $\mathcal{L}_{fuse}$ represents the total loss function; $x_1$ and $x_2$ represent mutually overlapping portions of the input aligned images; $\hat{y}$ represents the output stitched image; $w$ represents a local window; $N$ represents the total number of sliding windows in the image; and $S_{w}(x_1, x_2, \hat{y})$ represents the similarity measure of the image blocks within the local window.
7. A system for real-time stitching of microscopic images, comprising:
the image acquisition module is used for acquiring a local microscopic image;
the training set generation module is used for forming image pairs by adjacent local microscopic images in the local microscopic images, obtaining ideal labels corresponding to each group of image pairs by using the image registration model, and forming a training set by all the image pairs and the ideal labels;
the model training module is used for inputting the training set into a pre-constructed microscopic image splicing model, and the microscopic image splicing model comprises an image registration network and an image fusion network; registering the input image pair by using an image registration network, obtaining a mapping table according to an ideal label corresponding to the image pair, and aligning the image pair according to the mapping table to obtain an aligned image; extracting the mutually overlapped parts in the alignment images, fusing the mutually overlapped parts in the alignment images by using an image fusion network to obtain local fusion images, cascading the local fusion images and the non-overlapped parts in the alignment images to obtain spliced images, and obtaining a trained microscopic image splicing model; the training of the microscopic image splicing model is divided into two stages, wherein the first stage is to train an image registration network by using a training set to obtain parameters of the image registration network; the second stage is to fix the image registration network according to the parameters obtained in the first stage, train the image fusion network by using the aligned images output by the image registration network and adopting an unsupervised mode to obtain the parameters of the image fusion network; for each pixel, the corresponding value in the mapping table represents a mapping for transforming the original position of the pixel to the aligned position in the aligned view, i.e. the corresponding position of each pixel in the original image in the aligned image can be determined by the mapping table;
and the image splicing module is used for splicing the local microscopic images to be spliced by utilizing the trained microscopic image splicing model to obtain spliced images.
8. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program implements the steps of the method of any of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202010950938.5A 2020-09-11 2020-09-11 Method, system and computer equipment for real-time splicing of microscopic images Active CN111815690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010950938.5A CN111815690B (en) 2020-09-11 2020-09-11 Method, system and computer equipment for real-time splicing of microscopic images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010950938.5A CN111815690B (en) 2020-09-11 2020-09-11 Method, system and computer equipment for real-time splicing of microscopic images

Publications (2)

Publication Number Publication Date
CN111815690A CN111815690A (en) 2020-10-23
CN111815690B true CN111815690B (en) 2020-12-08

Family

ID=72859295

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010950938.5A Active CN111815690B (en) 2020-09-11 2020-09-11 Method, system and computer equipment for real-time splicing of microscopic images

Country Status (1)

Country Link
CN (1) CN111815690B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102021213757B3 (en) 2021-12-03 2023-02-02 Continental Autonomous Mobility Germany GmbH Method for fusing image data in the context of an artificial neural network
WO2023206475A1 (en) * 2022-04-29 2023-11-02 北京小米移动软件有限公司 Image processing method and apparatus, electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101976436A (en) * 2010-10-14 2011-02-16 西北工业大学 Pixel-level multi-focus image fusion method based on correction of differential image
CN103679674A (en) * 2013-11-29 2014-03-26 航天恒星科技有限公司 Method and system for splicing images of unmanned aircrafts in real time
CN104618625A (en) * 2015-01-30 2015-05-13 电子科技大学 Image fusing and splicing method of CIS large breadth scanner
CN106910192A (en) * 2017-03-06 2017-06-30 长沙全度影像科技有限公司 A kind of image syncretizing effect appraisal procedure based on convolutional neural networks
CN111626936A (en) * 2020-05-22 2020-09-04 湖南国科智瞳科技有限公司 Rapid panoramic stitching method and system for microscopic images

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6943820B2 (en) * 2018-08-02 2021-10-06 古河電気工業株式会社 Method for determining the rotation angle of a fusion splicer, a fusion splicer, and an optical fiber
CN110232673B (en) * 2019-05-30 2023-06-23 电子科技大学 Rapid and steady image stitching method based on medical microscopic imaging
CN111626933B (en) * 2020-05-14 2023-10-31 湖南国科智瞳科技有限公司 Accurate and rapid microscopic image stitching method and system

Also Published As

Publication number Publication date
CN111815690A (en) 2020-10-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant