CN113570611A - A real-time mineral segmentation method based on multi-feature fusion decoder - Google Patents

A real-time mineral segmentation method based on multi-feature fusion decoder

Info

Publication number
CN113570611A
Authority
CN
China
Prior art keywords
data set
mineral
segmentation
method based
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110847545.6A
Other languages
Chinese (zh)
Inventor
牛福生
薛文强
张晋霞
郭力娜
姚珊珊
粱银英
陈稳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China University of Science and Technology
Original Assignee
North China University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China University of Science and Technology
Priority to CN202110847545.6A
Publication of CN113570611A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a real-time mineral segmentation method based on a multi-feature fusion decoder and belongs to the technical field of mineral phase segmentation. The method comprises the following steps: S1: semantically segmenting the quartz, labeled as the white region, in magnetite microscopic images; creating a plurality of data sets and training and testing on each of them; and augmenting the magnetite microscopic images with a combination of strategies such as vertical flipping, horizontal flipping, random rotation by n × 90 degrees, affine transformation, and random translation; S2: the augmentation in S1 uses a region-cloning data set augmentation method: during training, a partial region of another picture, randomly chosen from the data set, is cloned into the index image, and the same operation is applied to the label. The invention addresses the problem that segmenting quartz in magnetite microscopic images with traditional medical semantic segmentation strategies still takes a large amount of time.

Figure 202110847545

Description

Mineral real-time segmentation method based on multi-feature fusion decoder
Technical Field
The invention belongs to the technical field of mineral facies segmentation, and particularly relates to a real-time mineral segmentation method based on a multi-feature fusion decoder.
Background
Quantitative mineral analysis under the microscope demands extensive professional knowledge and practical experience from process mineralogists; the manual method is primitive and time-consuming, so using a computer to segment mineral phases quickly and obtain their composition is of great significance to process mineralogy. In recent years, deep learning has been applied increasingly, at home and abroad, to the classification of minerals and rocks, and researchers have obtained rich results in mineral identification. However, because the color and texture characteristics of mineral phases under the microscope are complex and varied, mineral phases are difficult to segment with traditional image processing methods, and researchers have rarely addressed mineral phase segmentation. The recent development of deep learning semantic segmentation has made mineral phase segmentation feasible.
Mineral microscopic image segmentation is close to medical image segmentation, so medical image segmentation methods are a valuable reference. Many effective medical image segmentation schemes have appeared in recent years. The most classic is U-Net, a U-shaped network that achieved good segmentation results in the biomedical field; U-Net++ and U-Net3+ were subsequently proposed as improvements on U-Net, and in 2019 atrous (dilated) convolution and pyramid pooling were introduced into the U-shaped network, further improving segmentation accuracy. Medical image segmentation algorithms are therefore mature in terms of accuracy. However, segmenting mineral photographs with traditional medical semantic segmentation strategies still takes a great deal of time: for example, a single polished ore section measuring 3.5 × 3.5 cm requires about ten thousand photographs under a microscope with a 50× objective to be imaged completely.
Therefore, inspired by feature reuse structures, the decoder of the U-shaped network is improved and a multi-feature fusion decoder structure is proposed to complete the task of segmenting quartz in magnetite microscopic images. To solve the above problems, the invention provides a real-time mineral segmentation method based on a multi-feature fusion decoder.
Disclosure of Invention
The invention aims to improve the decoder of the U-shaped network, to provide a multi-feature fusion decoder structure, and to complete the task of segmenting quartz in magnetite microscopic images, so as to solve the problems described in the background.
In order to achieve the purpose, the invention adopts the following technical scheme:
The real-time mineral segmentation method based on the multi-feature fusion decoder comprises the following steps:
S1: semantically segmenting the quartz, labeled as the white region, in magnetite microscopic images; creating a plurality of data sets and training and testing on each of them; and augmenting the magnetite microscopic images with a combination of strategies such as vertical flipping, horizontal flipping, random rotation by n × 90 degrees, affine transformation, and random translation;
S2: the augmentation in S1 uses a region-cloning data set augmentation method: during training, a partial region of another picture, randomly chosen from the data set, is cloned into the index image, and the same operation is applied to the label, which increases the diversity of the mineral phase data and reduces overfitting during training;
S3: constructing a network model based on the data set processed in S2, and encoding, decoding, and fusing the data: all feature maps of the same scale are fused during repeated encoding and decoding operations, the encoder feature map and the decoder feature map are obtained, and encoding is performed again after the encoder feature map and the decoder feature map are fused;
S4: for the encoding and decoding in S3, building the MA-Net network structure from a multi-feature aggregation decoder structure and a lightweight ResNet34;
S5: introducing a channel attention mechanism into the MA-Net network structure built in S4 and training to obtain the network model, which improves segmentation accuracy;
S6: introducing a residual multi-kernel pooling module at the end of the MA-Net network structure built in S4, the module relying mainly on multiple effective fields of view to detect objects of different sizes;
S7: during training on the data sets of S1, replacing batch normalization (BN) with filter response normalization (FRN) and replacing the rectified linear unit (ReLU) with the corresponding activation layer, the thresholded linear unit (TLU); and performing model performance analysis experiments on the EM challenge data set, the LUNA challenge data set, and the DRIVE data set, respectively.
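The flip and rotation strategies named in S1 can be sketched as below. This is a minimal NumPy illustration, not the patented implementation: the function name `random_flip_rotate`, the 0.5 flip probabilities, and the toy 8 × 8 data are assumptions, and the affine transformation and random translation are left out for brevity. The point it shows is that every geometric operation must be applied identically to the image and to its label.

```python
import numpy as np

def random_flip_rotate(image, label, rng):
    """Apply the same random flips and k * 90 degree rotation to image and label."""
    if rng.random() < 0.5:                      # vertical flip: reverse the rows
        image, label = image[::-1], label[::-1]
    if rng.random() < 0.5:                      # horizontal flip: reverse the columns
        image, label = image[:, ::-1], label[:, ::-1]
    k = int(rng.integers(0, 4))                 # rotate by k * 90 degrees, k in {0, 1, 2, 3}
    image = np.rot90(image, k, axes=(0, 1))
    label = np.rot90(label, k, axes=(0, 1))
    return np.ascontiguousarray(image), np.ascontiguousarray(label)

# Toy example (hypothetical data): the label is derived from the image,
# so the two must still agree pixel by pixel after augmentation.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
label = (image[..., 0] > 0.5).astype(np.uint8)
aug_image, aug_label = random_flip_rotate(image, label, rng)
```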
Preferably, a detail extraction module is added to the MA-Net network structure described in S4 to improve the ability to segment small targets in the DRIVE data set.
Preferably, in the MA-Net network structure described in S4, the first convolutional layer uses 16 channels; the number of output channels of the first convolution kernel is reduced to 1/4 of the number of input channels and serves as the number of input channels of the second convolutional layer, so that the parameter count is greatly reduced without changing the input and output channels of each decoder block.
Preferably, the channel attention mechanism introduced in S5 includes the following steps:
A1: global average pooling is performed first to maintain the maximum receptive field;
A2: a learnable weight is then assigned to the channel of each feature map, so that the network model focuses more on the main objects being classified; the channel attention mechanism uses an ARM module.
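Steps A1 and A2 can be sketched as follows. The text names an ARM module but does not spell out its internals, so the 1 × 1 convolution and the sigmoid gate used here are assumptions borrowed from common channel-attention designs, and any normalization layer inside the module is omitted. The function name and the toy tensors are hypothetical.

```python
import numpy as np

def channel_attention(feat, weight, bias):
    """feat: (C, H, W); weight: (C, C); bias: (C,). Returns the reweighted feature map."""
    pooled = feat.mean(axis=(1, 2))            # A1: global average pooling -> (C,)
    logits = weight @ pooled + bias            # a 1x1 convolution on a 1x1 map is a matrix product
    scale = 1.0 / (1.0 + np.exp(-logits))      # assumed sigmoid gate, one weight per channel
    return feat * scale[:, None, None]         # A2: rescale every channel by its weight

# Toy example: with zero weights and biases every gate equals sigmoid(0) = 0.5.
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 6, 6))
out = channel_attention(feat, np.zeros((4, 4)), np.zeros(4))
```

In a trained network `weight` and `bias` would be learned, so channels that matter for the target class receive gates close to 1.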
Preferably, introducing the residual multi-kernel pooling module in S6 includes the following steps:
B1: collecting context information with four pooling kernels of different sizes to enrich high-level semantic information;
B2: obtaining feature maps of the same size as the original feature map by bilinear interpolation, and reducing the channel dimension to 1 with a 1 × 1 convolution;
B3: concatenating the original feature map and the upsampled feature maps along the channel dimension; the residual multi-kernel pooling module can thus cope with large variations in object size within an image.
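Steps B1 to B3 can be sketched in NumPy as below. Several details are assumptions, since the text does not give them: the kernel sizes (2, 3, 5, 6), max pooling with stride equal to the kernel size, a single 1 × 1 convolution shared by all branches, and corner-aligned bilinear interpolation. The sketch applies the 1 × 1 convolution before upsampling; both operations are linear and the interpolation weights sum to one, so the result is the same as upsampling first.

```python
import numpy as np

def max_pool(x, k):
    """Non-overlapping k x k max pooling on an (H, W) map; ragged borders are cropped."""
    h, w = x.shape[0] // k, x.shape[1] // k
    return x[:h * k, :w * k].reshape(h, k, w, k).max(axis=(1, 3))

def bilinear_resize(x, out_h, out_w):
    """Corner-aligned bilinear interpolation of an (h, w) map to (out_h, out_w)."""
    h, w = x.shape
    ys, xs = np.linspace(0, h - 1, out_h), np.linspace(0, w - 1, out_w)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = (ys - y0)[:, None], (xs - x0)[None, :]
    top = x[y0][:, x0] * (1 - wx) + x[y0][:, x1] * wx
    bottom = x[y1][:, x0] * (1 - wx) + x[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def residual_multi_kernel_pooling(feat, conv_w, conv_b, kernels=(2, 3, 5, 6)):
    """feat: (C, H, W); conv_w: (C,); conv_b: scalar. Returns (C + len(kernels), H, W)."""
    _, h, w = feat.shape
    branches = []
    for k in kernels:
        pooled = np.stack([max_pool(ch, k) for ch in feat])        # B1: context at scale k
        reduced = np.einsum("c,chw->hw", conv_w, pooled) + conv_b  # B2: 1x1 conv to 1 channel
        branches.append(bilinear_resize(reduced, h, w))            # B2: back to (H, W)
    return np.concatenate([feat, np.stack(branches)], axis=0)      # B3: channel concatenation

# Toy example: a constant input stays constant in every branch.
feat = np.ones((4, 12, 12))
out = residual_multi_kernel_pooling(feat, np.full(4, 0.25), 0.0)
```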
Preferably, the FRN expression described in S7 is as follows:
Figure BDA0003181224490000041
Figure BDA0003181224490000042
where x is an N-dimensional vector with N = H × W; unlike the BN layer, which subtracts the mean and then divides by the standard deviation, FRN does not subtract the mean and instead divides by the square root of the mean squared norm; ε in the formula is a small positive constant that prevents division by 0.
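For reference, the FRN expressions survive here only as figure placeholders. The published formulation of filter response normalization, which is assumed (not confirmed by this record) to be what expressions (1) and (2) show, can be written as follows, where γ and β are learned per-channel scale and offset parameters:

```latex
\nu^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2,
\qquad
\hat{x}_i = \frac{x_i}{\sqrt{\nu^2 + \varepsilon}},
\qquad
y_i = \gamma\,\hat{x}_i + \beta
```

Because ν² is computed per sample and per channel over the N = H × W spatial positions, no batch statistics enter the normalization.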
Preferably, the TLU expression described in S7 is as follows:
$$z_i = \max(y_i, \tau) = \mathrm{ReLU}(y_i - \tau) + \tau \tag{3}$$
where τ is a learnable parameter.
Compared with the prior art, the invention provides a real-time mineral segmentation method based on a multi-feature fusion decoder, which has the following beneficial effects:
(1) The invention provides a multi-feature fusion decoder structure and builds the MA-Net network structure by combining it with a lightweight ResNet34. Residual multi-kernel pooling is added at the end of the encoder to improve the segmentation of targets of various sizes, a channel attention mechanism is introduced to improve segmentation accuracy, and FRN is used to remove the network's dependence on batch size during training. Compared with a single-path decoder structure, the network adds a downsampling process and aggregates multi-stage feature information during encoding and decoding, so that, compared with other U-shaped networks, MA-Net can greatly reduce the number of network channels and parameters while maintaining segmentation accuracy.
(2) To obtain good segmentation results from a small amount of training sample data, data augmentation is performed by randomly cloning a partial region of another picture in the data set into the index image; experimental verification and analysis show that the method is applicable to segmentation tasks in which the spatial position of the target is random, such as mineral microscopic images, and that it effectively reduces overfitting and improves segmentation accuracy.
(3) Tests and analysis on the EM, LUNA, and DRIVE data sets show that MA-Net performs outstandingly when segmenting larger targets but is weaker at segmenting tiny targets, where it still needs to be optimized and improved; when MA-Net is used to segment quartz in the magnetite phase, the Dice coefficient reaches 0.9637.
(4) To avoid the influence of batch size on the training result, filter response normalization is used in place of batch normalization, and the corresponding activation layer, the thresholded linear unit, is used in place of the rectified linear unit. The average Dice coefficients on the EM challenge data set and the LUNA challenge data set are 0.9657 and 0.9852, respectively, an improvement over the 0.9584 and 0.9758 of U-Net, while the floating-point operation count is 5.72 G, only 1/43 of that of U-Net; MA-Net likewise shows a good segmentation effect on the task of segmenting quartz in magnetite.
(5) The influence of each MA-Net module on segmentation quality and on the real-time performance of the model is verified on a mineral microscopic image data set. The multi-feature fusion decoding strategy adopted by MA-Net fully extracts the information in deep and shallow features and learns their correlation to handle isolated pixels in the segmentation result, which largely overcomes the information loss and poor fusion that occur during upsampling.
Drawings
FIG. 1 shows the magnetite mineral phase data set used by the real-time mineral segmentation method based on the multi-feature fusion decoder of the present invention;
FIG. 2 shows the region-cloning data augmentation method of the real-time mineral segmentation method based on the multi-feature fusion decoder of the present invention;
FIG. 3 compares the structure of a U-shaped network with the stage feature reuse structure of the real-time mineral segmentation method based on the multi-feature fusion decoder of the present invention;
FIG. 4 shows the MA-Net structure of the real-time mineral segmentation method based on the multi-feature fusion decoder of the present invention;
FIG. 5 shows the ARM module of the real-time mineral segmentation method based on the multi-feature fusion decoder of the present invention;
FIG. 6 shows the residual multi-kernel pooling module of the real-time mineral segmentation method based on the multi-feature fusion decoder of the present invention;
FIG. 7 compares the mineral segmentation results of the real-time mineral segmentation method based on the multi-feature fusion decoder of the present invention;
FIG. 8 shows the effect of the region-cloning data augmentation method of the real-time mineral segmentation method based on the multi-feature fusion decoder of the present invention;
FIG. 9 shows the mineral segmentation results of the real-time mineral segmentation method based on the multi-feature fusion decoder of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings; the described embodiments are evidently only some, not all, of the embodiments of the present invention.
Example 1:
Referring to FIGS. 1-6, the real-time mineral segmentation method based on the multi-feature fusion decoder includes the following steps:
S1: semantically segmenting the quartz, labeled as the white region (shown in FIG. 1), in magnetite microscopic images; creating a plurality of data sets and training and testing on each of them; and augmenting the magnetite microscopic images with a combination of strategies such as vertical flipping, horizontal flipping, random rotation by n × 90 degrees, affine transformation, and random translation;
S2: the augmentation in S1 uses a region-cloning data set augmentation method: during training, a partial region of another picture, randomly chosen from the data set, is cloned into the index image, and the same operation is applied to the label, which increases the diversity of the mineral phase data and reduces overfitting during training;
Applied to the mineral phase data set, the method is more likely to enrich the data set than to destroy information, which the experimental section verifies; the region-cloning data augmentation method is shown in FIG. 2;
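The region-cloning step of S2 can be sketched as follows. The text does not fix the shape, size, or placement of the cloned region, so this sketch assumes a single axis-aligned rectangle of at most half the image side, pasted at the same coordinates it occupied in the donor picture; the function name `region_clone` and the `max_frac` parameter are hypothetical.

```python
import numpy as np

def region_clone(image, label, donor_image, donor_label, rng, max_frac=0.5):
    """Copy one random rectangle from the donor pair into a copy of the index pair."""
    h, w = label.shape
    rh = int(rng.integers(1, max(2, int(h * max_frac) + 1)))   # rectangle height
    rw = int(rng.integers(1, max(2, int(w * max_frac) + 1)))   # rectangle width
    top = int(rng.integers(0, h - rh + 1))
    left = int(rng.integers(0, w - rw + 1))
    out_image, out_label = image.copy(), label.copy()
    # The same window is cloned in the image and in the label, as S2 requires.
    out_image[top:top + rh, left:left + rw] = donor_image[top:top + rh, left:left + rw]
    out_label[top:top + rh, left:left + rw] = donor_label[top:top + rh, left:left + rw]
    return out_image, out_label

# Toy example: an all-zero index pair receives a patch from an all-one donor pair.
rng = np.random.default_rng(0)
index_image = np.zeros((16, 16, 3))
index_label = np.zeros((16, 16), dtype=np.uint8)
donor_image = np.ones((16, 16, 3))
donor_label = np.ones((16, 16), dtype=np.uint8)
mixed_image, mixed_label = region_clone(index_image, index_label, donor_image, donor_label, rng)
```

In a training loop the donor pair would be drawn at random from the data set each time an index image is loaded.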
S3: constructing a network model based on the data set processed in S2, and encoding, decoding, and fusing the data: all feature maps of the same scale are fused during repeated encoding and decoding operations, the encoder feature map and the decoder feature map are obtained, and encoding is performed again after the encoder feature map and the decoder feature map are fused;
The network as a whole is an encoder-decoder structure. A traditional decoder uses a single upsampling path in which the feature map is progressively enriched; the upsampling stages are not closely connected, and the deep feature maps have difficulty recovering detail during decoding;
Combining the feature reuse structure and the encoder-decoder structure proposed in the published literature, the invention proposes a decoder structure that aggregates features from multiple stages. This strategy fuses all feature maps of the same scale during repeated encoding and decoding operations, and encoding again after the encoder feature map and the decoder feature map are fused further learns the correlation between the two, making the fusion more appropriate. Because the multi-stage feature maps supplement detail and spatial information in the decoding process at every depth, the number of feature map channels can be compressed substantially compared with the traditional structure, which reduces the parameter count. The U-shaped network and the stage feature reuse structure are shown in FIG. 3;
S4: for the encoding and decoding in S3, the MA-Net network structure is built from a multi-feature aggregation decoder structure and a lightweight ResNet34; the MA-Net network structure is shown in FIG. 4, where 'c' denotes channel merging and is a 1 × 1 convolution;
In the MA-Net network structure described in S4, the first convolutional layer uses 16 channels; the number of output channels of the first convolution kernel is reduced to 1/4 of the number of input channels and serves as the number of input channels of the second convolutional layer, so that the parameter count is greatly reduced without changing the input and output channels of each decoder block;
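The effect of the 1/4 channel reduction on the parameter count can be checked with a little arithmetic. The sketch below is only illustrative: the 3 × 3 kernel size and the 64-input, 32-output decoder block are hypothetical values, since the actual decoder configuration is given in Table 2, which is not reproduced in this record.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out

def plain_block(c_in, c_out, k=3):
    """Two convolutions whose intermediate width equals the input width."""
    return conv_params(c_in, c_in, k) + conv_params(c_in, c_out, k)

def reduced_block(c_in, c_out, k=3):
    """Same block, but the first convolution outputs only c_in / 4 channels."""
    mid = c_in // 4
    return conv_params(c_in, mid, k) + conv_params(mid, c_out, k)

# Hypothetical decoder block: 64 input channels, 32 output channels, 3x3 kernels.
plain = plain_block(64, 32)      # 36928 + 18464
reduced = reduced_block(64, 32)  # 9232 + 4640
ratio = reduced / plain
```

Under these assumed sizes the block keeps its input and output widths but needs roughly a quarter of the parameters.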
In a deep convolutional neural network, shallow feature maps are larger than deep ones, so the computation cost is more sensitive to their number of channels. The encoder parameters and numbers of output channels are shown in Table 1, where /2 denotes 2-fold downsampling; the decoder configuration parameters are shown in Table 2, where the factor 2 denotes 2-fold upsampling;
TABLE 1 Encoder module parameters
Figure BDA0003181224490000091
TABLE 2 Decoder module parameters
Figure DEST_PATH_IMAGE001
S5: introducing a channel attention mechanism into the MA-Net network structure built in S4 and training to obtain the network model, which improves segmentation accuracy;
The channel attention mechanism introduced in S5 includes the following steps:
A1: global average pooling is performed first to maintain the maximum receptive field;
A2: a learnable weight is then assigned to the channel of each feature map, so that the network model focuses more on the main objects being classified; the channel attention mechanism uses an ARM module, as shown in FIG. 5;
S6: introducing a residual multi-kernel pooling module at the end of the MA-Net network structure built in S4, the module relying mainly on multiple effective fields of view to detect objects of different sizes;
Introducing the residual multi-kernel pooling module in S6 includes the following steps:
B1: collecting context information with four pooling kernels of different sizes to enrich high-level semantic information;
B2: obtaining feature maps of the same size as the original feature map by bilinear interpolation, and reducing the channel dimension to 1 with a 1 × 1 convolution;
B3: concatenating the original feature map and the upsampled feature maps along the channel dimension; the residual multi-kernel pooling module can thus cope with large variations in object size within an image;
The module introduces few parameters, namely 388, which slightly increases the computation cost, but the accuracy gain outweighs it; the RMP module is shown in FIG. 6;
S7: during training on the data sets of S1, replacing batch normalization (BN) with filter response normalization (FRN) and replacing the rectified linear unit (ReLU) with the corresponding activation layer, the thresholded linear unit (TLU); and performing model performance analysis experiments on the EM challenge data set, the LUNA challenge data set, and the DRIVE data set, respectively;
A detail extraction module is added to the MA-Net network structure of S4 to improve the ability to segment tiny targets in the DRIVE data set;
To improve the ability to segment tiny targets in the DRIVE data set, a detail extraction module is added to the network: the small-stride, high-channel-count spatial information extraction path proposed in the published literature is fused with the original decoder path. This effectively improves the segmentation of tiny targets but increases the computation cost, and it does not improve accuracy on segmentation tasks with larger targets;
the FRN expression described in S7 is as follows:
Figure BDA0003181224490000121
Figure BDA0003181224490000122
where x is an N-dimensional vector with N = H × W; unlike the BN layer, which subtracts the mean and then divides by the standard deviation, FRN does not subtract the mean and instead divides by the square root of the mean squared norm; ε in the formula is a small positive constant that prevents division by 0;
To solve the problem that ReLU activation produces zero values, the published literature proposes a thresholded ReLU, the TLU, applied after FRN, which is important for training performance; the TLU expression described in S7 is as follows:
$$z_i = \max(y_i, \tau) = \mathrm{ReLU}(y_i - \tau) + \tau \tag{3}$$
where τ is a learnable parameter.
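A minimal NumPy sketch of FRN followed by TLU is given below. It assumes the published FRN formulation (division by the root of the mean squared activation over H × W, followed by a learned per-channel scale `gamma` and offset `beta`); the function name, the default `eps`, and the toy tensor shapes are assumptions made for illustration.

```python
import numpy as np

def frn_tlu(x, gamma, beta, tau, eps=1e-6):
    """x: (B, C, H, W); gamma, beta, tau: (C,) learnable per-channel parameters."""
    nu2 = np.mean(x ** 2, axis=(2, 3), keepdims=True)   # mean squared norm over H x W
    x_hat = x / np.sqrt(nu2 + eps)                       # no mean subtraction, no batch statistics
    y = gamma[None, :, None, None] * x_hat + beta[None, :, None, None]
    return np.maximum(y, tau[None, :, None, None])       # TLU: z = max(y, tau)

# Toy example: the output for one sample does not depend on the rest of the batch.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 4))
gamma, beta, tau = np.ones(3), np.zeros(3), np.full(3, -0.5)
z = frn_tlu(x, gamma, beta, tau)
z_single = frn_tlu(x[:1], gamma, beta, tau)
```

Since the statistics are computed per sample and per channel, the result is the same for any batch size, which is the property the method relies on when it replaces BN.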
The model performance analysis experiment is as follows:
(I) Experimental setup
The evaluation metric used in the experiments is the Dice coefficient, and no test-time augmentation, such as multi-scale or multi-angle prediction, is applied to the test set to raise prediction quality. The Dice coefficient is a set similarity measure generally used to compute the similarity of two samples; its value ranges from 0 to 1, where 1 is the best segmentation and 0 the worst. With TP, FP, and FN denoting the numbers of true positives, false positives, and false negatives, respectively, the Dice expression is as follows:
$$\mathrm{Dice} = \frac{2\,TP}{2\,TP + FP + FN}$$
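The Dice coefficient expressed through TP, FP, and FN can be computed from two binary masks as sketched below. The function name is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here rather than something stated in the text.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice = 2*TP / (2*TP + FP + FN) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = int(np.sum(pred & target))     # predicted foreground, truly foreground
    fp = int(np.sum(pred & ~target))    # predicted foreground, truly background
    fn = int(np.sum(~pred & target))    # predicted background, truly foreground
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2.0 * tp / denom

# Toy example: TP = 1, FP = 1, FN = 1, so Dice = 2 / 4.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])   # 0.5
```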
The experiments run on Arch Linux with the PyTorch deep learning framework; the batch size is 8, the optimizer is Adam, the loss is a Dice coefficient loss function, and the input image size is 512 × 512.
Channel settings
The number of channels in the encoder's output layers is one of the main limits on network acceleration. Using ResNet18 as the encoder's baseline network, three channel combinations were tested on the magnetite microscopic image data set, as shown in Table 3. The computation cost increases markedly with the number of channels output by each encoder layer. Channel strategy 2 segments considerably more accurately than strategy 1, whereas strategy 3 is almost unchanged from strategy 2; the likely reasons are that the segmentation task is not complex, so excess parameters only add redundancy, and that the network structure limits the ability to extract semantic information.
TABLE 3 MA-Net channel number comparison experiment
Figure BDA0003181224490000132
Encoder network selection
To further explore how the depth of the MA-Net encoder's baseline network affects network performance and to select a suitable encoder network, the experiment uses channel strategy 2 and compares the segmentation performance of ResNet18 and ResNet34 on the magnetite microscopic image data set. To examine the effect of increasing network depth and channel count at the same time, a comparison run using ResNet34 with its original parameters, denoted Resnet34-B, is added. Table 4 lists the segmentation performance and computation cost of the three encoders: deepening the network improves segmentation performance to some extent, but an excessively deep network with too many channels brings little benefit while the computation cost rises sharply, and making the encoder network lightweight has little impact on model performance. All later experiments use the lightweight ResNet34 with channel strategy 2.
TABLE 4 MA-Net baseline network comparison experiment
Figure BDA0003181224490000141
(II) Model analysis
An MA-Net ablation experiment was carried out on the magnetite mineral microscopic image data set to analyze the contribution of each module; the Dice coefficient, parameter count, and computation cost are shown in Table 5. The attention mechanism ARM has a measurable effect on segmentation accuracy; residual multi-kernel pooling (RMP) improves accuracy substantially while adding the least computation; and the FRN normalization method slightly improves segmentation accuracy while reducing the computation cost.
TABLE 5 MA-Net ablation experiments on the mineral segmentation data set
Figure BDA0003181224490000142
Figure BDA0003181224490000151
(III) Model comparison experiment
The proposed MA-Net is compared with advanced algorithms on the magnetite microscopic image data set; MA-Net's segmentation accuracy exceeds that of the other two networks, while its parameter count and computation cost are the smallest.
TABLE 6 Model comparison experiment on the magnetite microscopic image data set
Figure BDA0003181224490000152
FIG. 7 compares the segmentation results. CE-Net segments far worse than MA-Net: its output is easily disturbed by highlighted regions, and although the overall contours are segmented well, the result contains a large number of holes, which rarely occur with MA-Net. CE-Net adds atrous convolution and multi-kernel pooling at the end of the encoder to enlarge the receptive field, but it fuses the encoder and decoder feature maps by simple addition, so information loss and improper fusion inevitably occur during upsampling. The multi-feature fusion decoding strategy adopted by MA-Net fully extracts the information in deep and shallow features and learns their correlation to handle large targets in the segmentation result, which largely overcomes the information loss and poor fusion during upsampling. In addition, downsampling is performed each time after the shallow feature maps are fused, and the enlarged receptive field helps supplement spatial information for large targets.
Regional cloning data enhancement methods comparative experiments were as follows:
in order to analyze the effect of the regional clone data enhancement method, an experiment is carried out on a mineral microscopic image data set, as shown in fig. 8, the segmentation effect on a test set is compared by adopting the regional clone data enhancement method and when the regional clone data enhancement method is not adopted, the fact that the Dice value fluctuation is small when the data enhancement method is adopted can be seen from a curve, and a high segmentation effect is finally obtained; the analysis reason is as follows: the information of the two pictures is combined by adopting a region clone data enhancement method, so that the difference between the pictures can be effectively reduced, and further, the variance of the data is reduced;
Using the MA-net network and the region clone data augmentation method provided by the invention for the task of segmenting quartz in the magnetite ore phase, the Dice coefficient reaches 0.9637, which is very close to the manually annotated labels, as shown in Fig. 9. Predicting a single picture takes only 0.16 seconds on an AMD Ryzen 7 3700X CPU.
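For reference, the Dice coefficient used as the evaluation metric measures the overlap between a predicted mask and a label mask as twice the intersection divided by the sum of the two mask areas. The sketch below assumes binary masks stored as nested lists of 0 and 1; the function name `dice_coefficient` and the smoothing term `eps` are illustrative choices, not taken from the text.

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks.

    `eps` is an assumed smoothing term that keeps the ratio defined
    (and equal to 1) when both masks are empty.
    """
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    # For 0/1 masks the product is 1 only where both masks are 1.
    intersection = sum(p * t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    return (2.0 * intersection + eps) / (total + eps)


# Two predicted pixels, one labeled pixel, one pixel in common.
score = dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

A value of 1 means the prediction coincides with the label, so the reported 0.9637 corresponds to near-complete overlap with the manual annotation.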
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims (7)

1. A real-time mineral segmentation method based on a multi-feature fusion decoder, characterized by comprising the following steps:
S1: performing semantic segmentation of the quartz, labeled as white regions, in magnetite microscopic images; producing multiple data sets and training and testing on each of them; and augmenting the magnetite microscopic images with a combination of vertical flipping, horizontal flipping, random rotation by n multiples of 90 degrees, affine transformation and random translation;
S2: using a region clone data set augmentation method for the augmentation in S1: during training on the data set, a partial region of another picture is randomly cloned from the data set onto the indexed image, and the same operation is applied to the label, which increases the diversity of the ore phase data and reduces overfitting during training;
S3: building a network model based on the data set processed in S2, and performing encoding, decoding and fusion on the data set; all feature maps of the same scale are fused during the repeated encoding and decoding operations to obtain encoder feature maps and decoder feature maps, and encoding is performed once more after the encoder feature maps and decoder feature maps are fused;
S4: when encoding and decoding the data set in S3, building the MA-net network structure from a multi-feature aggregation decoder structure and a lightweight Resnet34;
S5: based on the MA-net network structure built in S4, introducing a channel attention mechanism and training to obtain the network model, to improve segmentation accuracy;
S6: introducing a residual multi-kernel pooling module at the end of the MA-net network structure built in S4, the residual multi-kernel pooling module relying mainly on multiple effective fields of view to detect objects of different sizes;
S7: during training on the data set in S1, replacing batch normalization with filter response normalization (FRN) and replacing the rectified linear unit with the corresponding activation layer, the thresholded linear unit (TLU); and performing model performance analysis experiments on the EM challenge data set, the LUNA challenge data set and the DRIVE data set, respectively.

2. The real-time mineral segmentation method based on a multi-feature fusion decoder according to claim 1, characterized in that a detail extraction module is added to the MA-net network structure of S4 to improve the ability to segment tiny targets in the DRIVE data set.

3. The real-time mineral segmentation method based on a multi-feature fusion decoder according to claim 1 or 2, characterized in that, in the MA-net network structure of S4, the first convolutional layer uses 16 channels; the number of output channels of the first convolution kernel is reduced to 1/4 of the number of input channels and is used as the number of input channels of the second convolutional layer, so that the number of parameters is reduced while the input and output channels of each decoder block remain unchanged.

4. The real-time mineral segmentation method based on a multi-feature fusion decoder according to claim 1, characterized in that introducing the channel attention mechanism in S5 comprises the following steps:
A1: first performing global average pooling to maintain the largest receptive field;
A2: then assigning learnable weights to the channels of each feature map so that the network model focuses more on the main objects to be classified; the channel attention mechanism uses an ARM module.

5. The real-time mineral segmentation method based on a multi-feature fusion decoder according to claim 1, characterized in that introducing the residual multi-kernel pooling module in S6 comprises the following steps:
B1: using four pooling kernels of different sizes to collect context information to enrich high-level semantic information;
B2: then obtaining feature maps of the same size as the original feature map by bilinear interpolation, and reducing the dimension to 1 by 1×1 convolution;
B3: concatenating the original feature map with the upsampled feature maps along the channel dimension; the residual multi-kernel pooling module is used to cope with large variations in object size in the image.

6. The real-time mineral segmentation method based on a multi-feature fusion decoder according to claim 1, characterized in that the FRN expression of S7 is as follows:
Figure FDA0003181224480000031
Figure FDA0003181224480000032
where x is an N-dimensional vector (N = H×W). Unlike the BN layer, which normalizes by subtracting the mean and then dividing by the standard deviation, FRN normalizes using the mean of the squared norm and does not subtract the mean; ε in the formula is a small positive constant that prevents division by zero.
7. The real-time mineral segmentation method based on a multi-feature fusion decoder according to claim 6, characterized in that the TLU expression of S7 is as follows:

$z_i = \max(y_i, \tau) = \mathrm{ReLU}(y_i - \tau) + \tau \quad (3)$

where τ is a learnable parameter.
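The FRN and TLU expressions referred to in claims 6 and 7 can be illustrated for a single channel as follows. Because the FRN formulas appear only as images in this text, the sketch follows the formulation in the cited non-patent reference (the Filter Response Normalization Layer paper): the mean of the squared activations ν² is computed over the H×W positions, the activations are divided by √(ν² + ε), a learned scale γ and offset β are applied, and TLU then takes max(y, τ). It is an assumption that the figures in claim 6 match this formulation; the function name `frn_tlu` and the default parameter values are hypothetical.

```python
import math


def frn_tlu(x, gamma=1.0, beta=0.0, tau=0.0, eps=1e-6):
    """Apply FRN followed by TLU to one channel given as a flat list.

    x     : the N = H*W activations of a single channel
    gamma : learned scale, beta : learned offset (per channel)
    tau   : learned TLU threshold
    eps   : small positive constant that prevents division by zero
    """
    n = len(x)
    # nu^2 is the mean of the squared activations; no mean is subtracted.
    nu2 = sum(v * v for v in x) / n
    scale = 1.0 / math.sqrt(nu2 + eps)
    # Normalize, then apply the learned affine transform.
    y = [gamma * v * scale + beta for v in x]
    # TLU: z_i = max(y_i, tau), which equals ReLU(y_i - tau) + tau.
    return [max(v, tau) for v in y]


z = frn_tlu([3.0, -4.0], tau=-0.5)
```

Because ν² is computed per sample and per channel, the result does not depend on the batch size, which is the stated motivation for replacing batch normalization.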
CN202110847545.6A 2021-07-27 2021-07-27 A real-time mineral segmentation method based on multi-feature fusion decoder Pending CN113570611A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110847545.6A CN113570611A (en) 2021-07-27 2021-07-27 A real-time mineral segmentation method based on multi-feature fusion decoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110847545.6A CN113570611A (en) 2021-07-27 2021-07-27 A real-time mineral segmentation method based on multi-feature fusion decoder

Publications (1)

Publication Number Publication Date
CN113570611A true CN113570611A (en) 2021-10-29

Family

ID=78167689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110847545.6A Pending CN113570611A (en) 2021-07-27 2021-07-27 A real-time mineral segmentation method based on multi-feature fusion decoder

Country Status (1)

Country Link
CN (1) CN113570611A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113887524A (en) * 2021-11-04 2022-01-04 华北理工大学 Magnetite microscopic image segmentation method based on semantic segmentation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10361802B1 (en) * 1999-02-01 2019-07-23 Blanding Hovenweep, Llc Adaptive pattern recognition based control system and method
CN111681252A (en) * 2020-05-30 2020-09-18 重庆邮电大学 An automatic segmentation method of medical images based on multi-path attention fusion
CN111797779A (en) * 2020-07-08 2020-10-20 兰州交通大学 Remote sensing image semantic segmentation method based on regional attention multi-scale feature fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SAURABH SINGH: "Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks" *
陈家石: "基于深度学习的医学图像分割研究" *

Similar Documents

Publication Publication Date Title
CN113221639B (en) A micro-expression recognition method based on multi-task learning for representative AU region extraction
CN113159051B (en) A Lightweight Semantic Segmentation Method for Remote Sensing Images Based on Edge Decoupling
CN108416266B (en) A Fast Video Behavior Recognition Method Using Optical Flow to Extract Moving Objects
CN114119638A (en) Medical image segmentation method integrating multi-scale features and attention mechanism
Li et al. Learning face image super-resolution through facial semantic attribute transformation and self-attentive structure enhancement
CN112541864A (en) Image restoration method based on multi-scale generation type confrontation network model
CN111738363B (en) Alzheimer disease classification method based on improved 3D CNN network
CN109685724B (en) Symmetric perception face image completion method based on deep learning
CN114972213A (en) A two-stage motherboard image defect detection and localization method based on machine vision
CN115619743A (en) Construction method and application of surface defect detection model for new OLED display devices
CN115761735B (en) A semi-supervised semantic segmentation method based on adaptive pseudo-label correction
CN114724155A (en) Scene text detection method, system and equipment based on deep convolutional neural network
CN110503113B (en) Image saliency target detection method based on low-rank matrix recovery
CN113870286B (en) Foreground segmentation method based on multi-level feature and mask fusion
CN113807176A (en) Small sample video behavior identification method based on multi-knowledge fusion
CN114387641A (en) False video detection method and system based on multi-scale convolutional network and ViT
CN113436224B (en) An intelligent image cropping method and device based on explicit composition rule modeling
Hongmeng et al. A detection method for deepfake hard compressed videos based on super-resolution reconstruction using CNN
CN114120202A (en) A semi-supervised video object segmentation method based on multi-scale object model and feature fusion
Zhou et al. Attention transfer network for nature image matting
Wu et al. Variant semiboost for improving human detection in application scenes
CN113177970A (en) Multi-scale filtering target tracking method based on self-adaptive feature fusion
Yu et al. MagConv: Mask-guided convolution for image inpainting
CN116630387A (en) Monocular Image Depth Estimation Method Based on Attention Mechanism
CN113570611A (en) A real-time mineral segmentation method based on multi-feature fusion decoder

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211029