CN113570611A - Real-time mineral segmentation method based on a multi-feature fusion decoder
- Publication number
- CN113570611A (application number CN202110847545.6A)
- Authority
- CN
- China
- Prior art keywords
- segmentation
- data set
- decoder
- mineral
- real
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
Abstract
The invention discloses a real-time mineral segmentation method based on a multi-feature fusion decoder and belongs to the technical field of mineral phase segmentation. The method comprises the following steps. S1: semantically segment quartz, which is labelled as the white region, in magnetite microscopic images; build several data sets and train and test on each of them; and augment the magnetite microscopic images with a combination of strategies such as vertical flipping, horizontal flipping, random rotation by n × 90 degrees, affine transformation and random translation. S2: the augmentation in S1 also uses a region-cloning data set augmentation method: during training, a partial region of another picture randomly chosen from the data set is cloned onto the index image, and the same operation is applied to the label. The method addresses the problem that conventional medical semantic segmentation strategies still need a great deal of time to complete the quartz segmentation task on magnetite microscopic images.
Description
Technical Field
The invention belongs to the technical field of mineral phase segmentation, and particularly relates to a real-time mineral segmentation method based on a multi-feature fusion decoder.
Background
Quantitative mineral analysis under the microscope demands considerable professional knowledge and practical experience from process mineralogists. The manual method is primitive and time-consuming, so using a computer to segment mineral phases quickly and obtain their proportions is of great significance to these workers. In recent years, a growing body of work in China and abroad has applied deep learning to the classification of minerals and rocks, and researchers have obtained rich results in mineral identification. However, because the colour and texture of mineral phases under the microscope are complex and varied, and mineral phases are difficult to segment with traditional image processing methods, few researchers have addressed mineral phase segmentation. The recent development of deep-learning semantic segmentation has made it feasible.
Mineral microscopic image segmentation is close to medical image segmentation, so medical image segmentation methods are a valuable reference. Many effective medical image segmentation schemes have appeared in recent years. The classic example is the U-shaped network (U-Net), which achieved good segmentation results in the biomedical field; U-Net++ and U-Net3+ were later proposed as improvements on U-Net, and in 2019 dilated (atrous) convolution and pyramid pooling were introduced into the U-shaped network, further improving segmentation accuracy. Medical image segmentation algorithms are therefore mature in terms of accuracy. However, segmenting these photographs with conventional medical semantic segmentation strategies still takes a great deal of time: for example, a single polished ore section measures 3.5 × 3.5 cm, and under a microscope with a 50× objective about ten thousand photographs are needed to cover it completely.
Therefore, inspired by feature-reuse structures, the decoder of the U-shaped network is improved and a multi-feature fusion decoder structure is proposed to complete the task of segmenting quartz in magnetite microscopic images. To solve the above problems, the invention provides a real-time mineral segmentation method based on a multi-feature fusion decoder.
Disclosure of Invention
The invention aims to improve the decoder of a U-shaped network, to provide a multi-feature fusion decoder structure, and to complete the task of segmenting quartz in magnetite microscopic images, so as to solve the problems described in the background.
In order to achieve the purpose, the invention adopts the following technical scheme:
the real-time mineral segmentation method based on the multi-feature fusion decoder comprises the following steps:
S1: semantically segment quartz, which is labelled as the white region, in magnetite microscopic images; build several data sets and train and test on each of them; and augment the magnetite microscopic images with a combination of strategies such as vertical flipping, horizontal flipping, random rotation by n × 90 degrees, affine transformation and random translation;
S2: the augmentation in S1 also uses a region-cloning data set augmentation method: during training, a partial region of another picture randomly chosen from the data set is cloned onto the index image, and the same operation is applied to the label, which increases the diversity of the mineral phase data and reduces overfitting during training;
S3: build the network model on the data set processed in S2, with encoding, decoding and fusion: all feature maps of the same scale are fused during repeated encoding and decoding, and after an encoder feature map and a decoder feature map are fused, the result is encoded again;
S4: for the encoding and decoding in S3, a multi-feature aggregation decoder structure and a lightweight ResNet34 are used to build the MA-Net network structure;
S5: a channel attention mechanism is introduced into the MA-Net network structure built in S4, and the network is trained to obtain the model, which improves segmentation accuracy;
S6: a residual multi-kernel pooling module is introduced at the end of the MA-Net network structure built in S4; the module relies on multiple effective fields of view to detect objects of different sizes;
S7: during training on the data set from S1, batch normalization (BN) is replaced by filter response normalization (FRN), and the rectified linear unit (ReLU) is replaced by the corresponding activation layer, the thresholded linear unit (TLU); model performance analysis experiments are carried out on the EM challenge data set, the LUNA challenge data set and the DRIVE data set.
Preferably, a detail extraction module is added to the MA-net network structure described in S4 to improve the ability to segment small objects in the DRIVE data set.
Preferably, in the MA-Net network structure described in S4, the first convolutional layer uses 16 channels; the number of output channels of the first convolution kernel is reduced to 1/4 of the number of input channels and serves as the number of input channels of the second convolutional layer, so the parameter count is greatly reduced without changing the input and output channels of each decoder block.
Preferably, the channel attention mechanism introduced in S5 includes the following steps:
a1: global average pooling is performed first to maintain a maximum receptive field;
a2: a learnable weight is then assigned to the channel of each feature map so that the network model focuses more on the main objects being classified; the channel attention mechanism uses an ARM module.
Preferably, the step of introducing the residual multi-kernel pooling module in S6 includes the following steps:
b1: collecting context information by adopting four pooling kernels with different sizes to enrich high-level semantic information;
b2: obtaining feature maps of the same size as the original feature map by bilinear interpolation, and reducing the number of channels to 1 with a 1 × 1 convolution;
b3: merging the original feature map and the upsampled feature maps along the channel dimension; the residual multi-kernel pooling module can cope with large variations in object size in the image.
Preferably, the FRN expression described in S7 is as follows:
where x is an N-dimensional vector (N = H × W); unlike the BN layer, which subtracts the mean and then divides by the standard deviation, FRN does not subtract the mean and divides by the square root of the mean squared norm; ε in the formula is a small positive constant that prevents division by 0.
Preferably, the TLU expression described in S7 is as follows:
$$z_i = \max(y_i, \tau) = \mathrm{ReLU}(y_i - \tau) + \tau \tag{3}$$
where τ is a learnable parameter.
Compared with the prior art, the invention provides a real-time mineral segmentation method based on a multi-feature fusion decoder, which has the following beneficial effects:
(1) The invention proposes a multi-feature fusion decoder structure and combines it with a lightweight ResNet34 to build the MA-Net network structure. Residual multi-kernel pooling is added at the end of the encoder to improve segmentation of targets of various sizes, a channel attention mechanism is introduced to improve segmentation accuracy, and FRN is adopted to remove the network's dependence on batch size during training. Compared with a single-path decoder structure, the network adds a downsampling process and aggregates multi-stage feature information during encoding and decoding, so that, compared with other U-shaped networks, MA-Net can greatly reduce the number of channels and parameters while maintaining segmentation accuracy.
(2) To obtain good segmentation from a small amount of training data, data augmentation is performed by randomly cloning a partial region of another picture in the data set onto the index image; experimental verification and analysis show that the method suits segmentation tasks in which the spatial position of the target is random, such as mineral microscopic images, and that it effectively reduces overfitting and improves segmentation accuracy.
(3) Tests and analysis on the EM, LUNA and DRIVE data sets show that MA-Net performs outstandingly when segmenting larger targets but is weaker on tiny targets, where further optimization is needed; when MA-Net is used to segment quartz in magnetite mineral phase images, the Dice coefficient reaches 0.9637.
(4) To avoid the influence of batch size on training results, filter response normalization replaces batch normalization, and the corresponding activation layer, the thresholded linear unit, replaces the rectified linear unit. The average Dice coefficients on the EM challenge data set and the LUNA challenge data set are 0.9657 and 0.9852 respectively, a marked improvement over the 0.9584 and 0.9758 of U-Net, while the floating-point operation count is 5.72G, only 1/43 of that of U-Net; MA-Net therefore also segments quartz in magnetite images well.
(5) The influence of each MA-Net module on segmentation quality and real-time performance is verified on the mineral microscopic image data set. The multi-feature fusion decoding strategy adopted by MA-Net fully extracts information from deep and shallow features and learns their correlation to handle isolated pixels in the segmentation result, largely overcoming the information loss and poor fusion that occur during upsampling.
Drawings
FIG. 1 shows the magnetite mineral phase data set used by the real-time mineral segmentation method based on a multi-feature fusion decoder according to the invention;
FIG. 2 shows the region-cloning data augmentation method of the real-time mineral segmentation method based on a multi-feature fusion decoder according to the invention;
FIG. 3 compares the structure of a U-shaped network with the stage feature-reuse structure of the real-time mineral segmentation method based on a multi-feature fusion decoder according to the invention;
FIG. 4 shows the MA-Net structure of the real-time mineral segmentation method based on a multi-feature fusion decoder according to the invention;
FIG. 5 shows the ARM module of the real-time mineral segmentation method based on a multi-feature fusion decoder according to the invention;
FIG. 6 shows the residual multi-kernel pooling module of the real-time mineral segmentation method based on a multi-feature fusion decoder according to the invention;
FIG. 7 compares the mineral segmentation results of the real-time mineral segmentation method based on a multi-feature fusion decoder according to the invention;
FIG. 8 shows the effect of the region-cloning data augmentation method of the real-time mineral segmentation method based on a multi-feature fusion decoder according to the invention;
FIG. 9 shows the mineral segmentation results of the real-time mineral segmentation method based on a multi-feature fusion decoder according to the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
Example 1:
Referring to FIGS. 1-6, the real-time mineral segmentation method based on a multi-feature fusion decoder includes the following steps:
S1: semantically segment quartz, which is labelled as the white region (shown in FIG. 1), in magnetite microscopic images; build several data sets and train and test on each of them; and augment the magnetite microscopic images with a combination of strategies such as vertical flipping, horizontal flipping, random rotation by n × 90 degrees, affine transformation and random translation;
S2: the augmentation in S1 also uses a region-cloning data set augmentation method: during training, a partial region of another picture randomly chosen from the data set is cloned onto the index image, and the same operation is applied to the label, which increases the diversity of the mineral phase data and reduces overfitting during training;
when the method is applied to the mineral phase data set, it is more likely to enrich the data set than to destroy information, which the experimental section verifies; the region-cloning data augmentation method is shown in FIG. 2;
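The augmentation in S1 and S2 can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the function names, the `max_frac` bound on the cloned rectangle, and the choice to paste the region at the same coordinates it was taken from are all assumptions, since the text only states that a partial region of another picture is cloned onto the index image and that the label receives the same operation.

```python
import numpy as np

def flip_rotate(image, label, rng):
    """Apply the same random flips and n x 90 degree rotation to image and label."""
    if rng.random() < 0.5:                       # vertical flip
        image, label = image[::-1], label[::-1]
    if rng.random() < 0.5:                       # horizontal flip
        image, label = image[:, ::-1], label[:, ::-1]
    n = int(rng.integers(0, 4))                  # rotate by n * 90 degrees
    return np.rot90(image, n).copy(), np.rot90(label, n).copy()

def region_clone(image, label, other_image, other_label, rng, max_frac=0.5):
    """Copy one random rectangle from another sample into the index image.

    The identical rectangle is copied in the label so the pair stays aligned.
    max_frac (hypothetical) bounds the rectangle side relative to the image side.
    """
    h, w = label.shape
    rh = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    rw = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    top = int(rng.integers(0, h - rh + 1))
    left = int(rng.integers(0, w - rw + 1))
    image, label = image.copy(), label.copy()    # leave the inputs untouched
    image[top:top + rh, left:left + rw] = other_image[top:top + rh, left:left + rw]
    label[top:top + rh, left:left + rw] = other_label[top:top + rh, left:left + rw]
    return image, label
```

In a training loop, `other_image` and `other_label` would be drawn at random from the same data set for each index image; both functions expect images shaped (H, W) or (H, W, C) and labels shaped (H, W).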
S3: build the network model on the data set processed in S2, with encoding, decoding and fusion: all feature maps of the same scale are fused during repeated encoding and decoding, and after an encoder feature map and a decoder feature map are fused, the result is encoded again;
the overall structure of the network is an encoder-decoder structure; a conventional decoder enriches the feature map along a single upsampling path, the upsampling stages are not closely connected, and it is difficult to recover detailed information from deep feature maps during decoding;
building on the feature-reuse structure and the encoder-decoder structure proposed in the published literature, the invention proposes a decoder structure that aggregates features from multiple stages; this strategy fuses all feature maps of the same scale during repeated encoding and decoding, and encoding again after the encoder and decoder feature maps are fused lets the network further learn the correlation between the two, so the fusion is more appropriate; because the multi-stage feature maps supplement detailed and spatial information at every depth of the decoding process, the number of feature-map channels can be compressed considerably compared with the conventional structure, which reduces the parameter count; the U-shaped network and the stage feature-reuse structure are shown in FIG. 3;
S4: for the encoding and decoding in S3, a multi-feature aggregation decoder structure and a lightweight ResNet34 are used to build the MA-Net network structure, which is shown in FIG. 4, where 'c' denotes channel merging and a 1 × 1 convolution;
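One way to read the fusion of same-scale feature maps is channel concatenation followed by a 1 × 1 convolution. The sketch below illustrates only that reading; the function name, the argument layout and the interpretation of the 'c' symbol are assumptions, and the actual wiring of MA-Net is the one drawn in FIG. 4.

```python
import numpy as np

def merge_same_scale(feature_maps, weight, bias):
    """Concatenate same-scale feature maps on the channel axis, then apply a 1x1 convolution.

    feature_maps: list of arrays shaped (C_i, H, W) with identical H and W.
    weight: (C_out, sum(C_i)) matrix; a 1x1 convolution is a per-pixel linear map.
    bias: (C_out,) vector.
    """
    stacked = np.concatenate(feature_maps, axis=0)              # (sum C_i, H, W)
    return np.einsum("oc,chw->ohw", weight, stacked) + bias[:, None, None]
```

Because the 1 × 1 convolution sets the output width freely, the merged map can be kept narrow even when several stages contribute features, which is consistent with the channel compression described above.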
in the MA-Net network structure described in S4, the first convolutional layer uses 16 channels; the number of output channels of the first convolution kernel is reduced to 1/4 of the number of input channels and serves as the number of input channels of the second convolutional layer, so the parameter count is greatly reduced while the input and output channels of each decoder block stay unchanged;
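The saving from the 1/4 channel reduction can be checked with simple arithmetic. The sketch below assumes two stacked 3 × 3 convolutions without bias and illustrative widths of 256 input and 128 output channels; none of these values come from the source, whose actual configuration is given in Table 2.

```python
def conv_params(c_in, c_out, k=3):
    """Weight count of a k x k convolution without bias."""
    return k * k * c_in * c_out

def two_conv_block_params(c_in, c_mid, c_out, k=3):
    """Weights of two stacked convolutions: c_in -> c_mid -> c_out."""
    return conv_params(c_in, c_mid, k) + conv_params(c_mid, c_out, k)

c_in, c_out = 256, 128                                       # hypothetical block widths
full = two_conv_block_params(c_in, c_in, c_out)              # intermediate width kept at c_in
reduced = two_conv_block_params(c_in, c_in // 4, c_out)      # intermediate width cut to 1/4
print(full, reduced, full / reduced)
```

Under these assumptions both convolutions shrink by the same factor, so the block needs a quarter of the weights while its external input and output widths are unchanged.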
in a deep convolutional neural network, shallow feature maps are larger than deep ones, so the computational cost is more sensitive to the number of channels in the shallow layers; the encoder parameters and numbers of output channels are shown in Table 1, where /2 denotes 2-fold downsampling; the decoder configuration parameters are shown in Table 2, where ×2 denotes 2-fold upsampling;
TABLE 1 encoder Module parameters
TABLE 2 decoder Module parameters
S5: a channel attention mechanism is introduced into the MA-Net network structure built in S4, and the network is trained to obtain the model, which improves segmentation accuracy;
the channel attention mechanism introduced in the step S5 includes the following steps:
a1: global average pooling is performed first to maintain a maximum receptive field;
a2: a learnable weight is then assigned to the channel of each feature map so that the network model focuses more on the main objects being classified; the channel attention mechanism uses an ARM module, as shown in FIG. 5;
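Steps a1 and a2 can be sketched as a gate computed from globally pooled features. This is a simplified NumPy illustration under stated assumptions: it takes ARM to be an attention-refinement-style module (global average pooling, a 1 × 1 convolution, a sigmoid, channel-wise multiplication), it omits any normalization layer inside the module, and the names `channel_attention`, `weight` and `bias` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(features, weight, bias):
    """Reweight the channels of a (C, H, W) feature map.

    weight: (C, C) matrix acting as a 1x1 convolution on the pooled vector.
    bias: (C,) vector.
    """
    pooled = features.mean(axis=(1, 2))           # a1: global average pooling -> (C,)
    gate = sigmoid(weight @ pooled + bias)        # a2: learnable per-channel weight in (0, 1)
    return features * gate[:, None, None]         # scale every channel by its weight
```

Because the gate is computed from an average over the whole spatial extent, every channel weight sees the full image, which is the sense in which global average pooling preserves the largest receptive field.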
S6: a residual multi-kernel pooling module is introduced at the end of the MA-Net network structure built in S4; the module relies on multiple effective fields of view to detect objects of different sizes;
the step of introducing the residual multi-kernel pooling module in the step S6 includes the following steps:
b1: collecting context information by adopting four pooling kernels with different sizes to enrich high-level semantic information;
b2: obtaining feature maps of the same size as the original feature map by bilinear interpolation, and reducing the number of channels to 1 with a 1 × 1 convolution;
b3: merging the original feature map and the upsampled feature maps along the channel dimension; the residual multi-kernel pooling module can cope with large changes in the size of objects in the image;
the module introduces few additional parameters (388), which slightly increases the computational cost, but the accuracy gain outweighs it; the RMP module is shown in FIG. 6;
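Steps b1 to b3 can be sketched in NumPy as below. Several details are assumptions rather than statements of the source: max pooling with stride equal to the kernel size, the kernel sizes (2, 3, 5, 6), a single 1 × 1 convolution shared across branches, and corner-aligned bilinear interpolation. The sketch applies the 1 × 1 convolution before the interpolation; without a bias the two operations are linear and commute, so the result matches the order listed in b2.

```python
import numpy as np

def max_pool(features, k):
    """Non-overlapping k x k max pooling on (C, H, W); ragged borders are cropped."""
    c, h, w = features.shape
    h, w = (h // k) * k, (w // k) * k
    return features[:, :h, :w].reshape(c, h // k, k, w // k, k).max(axis=(2, 4))

def bilinear_resize(x, out_h, out_w):
    """Bilinear interpolation of an (h, w) map to (out_h, out_w), corners aligned."""
    h, w = x.shape
    rows = np.linspace(0, h - 1, out_h)
    cols = np.linspace(0, w - 1, out_w)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    fr = (rows - r0)[:, None]                     # fractional row offsets
    fc = (cols - c0)[None, :]                     # fractional column offsets
    top = x[r0][:, c0] * (1 - fc) + x[r0][:, c1] * fc
    bottom = x[r1][:, c0] * (1 - fc) + x[r1][:, c1] * fc
    return top * (1 - fr) + bottom * fr

def residual_multi_kernel_pooling(features, conv_weight, kernel_sizes=(2, 3, 5, 6)):
    """features: (C, H, W); conv_weight: (C,) 1x1 convolution down to one channel."""
    _, height, width = features.shape
    branches = []
    for k in kernel_sizes:
        pooled = max_pool(features, k)                              # b1: context at scale k
        reduced = np.tensordot(conv_weight, pooled, axes=(0, 0))    # 1x1 conv -> one channel
        branches.append(bilinear_resize(reduced, height, width))    # b2: back to (H, W)
    return np.concatenate([features, np.stack(branches)], axis=0)  # b3: C + 4 channels
```

The output keeps the original C channels untouched and appends one context channel per pooling scale, so the extra cost is limited to the weights of the 1 × 1 convolution.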
S7: during training on the data set from S1, batch normalization (BN) is replaced by filter response normalization (FRN), and the rectified linear unit (ReLU) is replaced by the corresponding activation layer, the thresholded linear unit (TLU); model performance analysis experiments are carried out on the EM challenge data set, the LUNA challenge data set and the DRIVE data set;
a detail extraction module is added to the MA-Net network structure described in S4 to improve the segmentation of tiny targets in the DRIVE data set;
the detail extraction module fuses the small-stride, wide-channel spatial information extraction path proposed in the published literature with the original decoder path; this effectively improves the segmentation of tiny targets, but it increases the computational cost and does not improve accuracy on segmentation tasks with larger targets;
the FRN expression described in S7 is as follows:
where x is an N-dimensional vector (N = H × W); unlike the BN layer, which subtracts the mean and then divides by the standard deviation, FRN does not subtract the mean and divides by the square root of the mean squared norm; ε in the formula is a small positive constant that prevents division by 0;
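For reference, filter response normalization as it is usually stated in the published literature can be written as follows. This is the standard form, given on the assumption that the invention uses it unchanged; γ and β denote the learned per-channel scale and offset, and y_i is the value passed on to the TLU.

```latex
\nu^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2,
\qquad
\hat{x}_i = \frac{x_i}{\sqrt{\nu^2 + \epsilon}},
\qquad
y_i = \gamma\,\hat{x}_i + \beta
```

Since ν² is computed per sample and per channel over the N = H × W spatial positions, the statistic does not involve the other samples in the batch.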
to address the problem of ReLU activations producing zero values, the published literature proposes a thresholded ReLU, the TLU, applied after FRN, which is important for training performance; the TLU expression described in S7 is as follows:
$$z_i = \max(y_i, \tau) = \mathrm{ReLU}(y_i - \tau) + \tau \tag{3}$$
where τ is a learnable parameter.
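A minimal NumPy sketch of FRN followed by TLU for a single sample is given below. It follows the standard formulation of the two layers; the function names, the per-channel parameter shapes and the default `eps` are assumptions made for illustration.

```python
import numpy as np

def frn(x, gamma, beta, eps=1e-6):
    """Filter response normalization of one sample shaped (C, H, W).

    Each channel is divided by the root of its mean squared activation;
    no mean is subtracted and no batch statistics are used.
    """
    nu2 = np.mean(x ** 2, axis=(1, 2), keepdims=True)     # mean squared norm per channel
    x_hat = x / np.sqrt(nu2 + eps)
    return gamma[:, None, None] * x_hat + beta[:, None, None]

def tlu(y, tau):
    """Thresholded linear unit: max(y, tau) with a learnable per-channel tau."""
    return np.maximum(y, tau[:, None, None])
```

Setting τ to zero recovers the ordinary ReLU, so the TLU can be read as a ReLU whose threshold is learned per channel.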
The model performance analysis experiment is as follows:
(I) Experimental setup
The evaluation metric is the Dice coefficient; no test-time augmentation that would raise prediction quality, such as multi-scale or multi-angle inference, is applied to the test set. The Dice coefficient is a set-similarity measure generally used to compute the similarity of two samples; its value ranges from 0 to 1, with 1 for the best segmentation and 0 for the worst. In the expression, TP, FP and FN denote the numbers of true positives, false positives and false negatives, respectively. The Dice expression is as follows:
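In terms of these counts, the Dice coefficient takes its standard form, written here on the assumption that the experiments use the usual definition:

```latex
\mathrm{Dice} = \frac{2\,TP}{2\,TP + FP + FN}
```

A perfect prediction has FP = FN = 0 and gives 1, while a prediction with no overlap has TP = 0 and gives 0.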
The experiments run on Arch Linux with the PyTorch deep learning framework; the batch size is 8, the optimizer is Adam, the loss function is the Dice coefficient loss, and the input image size is 512 × 512.
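The metric and a loss derived from it can be sketched as follows. The hard Dice coefficient matches the usual definition in terms of true positives, false positives and false negatives; the soft variant with a `smooth` term is a common differentiable form and is an assumption here, since the exact loss formulation used in the experiments is not given.

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask):
    """Dice = 2TP / (2TP + FP + FN) for two binary masks of equal shape."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    tp = np.sum(pred & true)
    fp = np.sum(pred & ~true)
    fn = np.sum(~pred & true)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom          # two empty masks agree fully

def soft_dice_loss(prob, target, smooth=1.0):
    """1 - soft Dice on predicted probabilities; smooth (hypothetical) avoids 0/0."""
    intersection = np.sum(prob * target)
    return 1.0 - (2.0 * intersection + smooth) / (np.sum(prob) + np.sum(target) + smooth)
```

In a PyTorch training loop the same expression would be written with tensor operations so that gradients flow through `prob`.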
Channel settings
The number of output channels of the encoder layers is one of the main limits on network acceleration. Using ResNet18 as the encoder reference network, three channel configurations were tested on the magnetite microscopic image data set, as shown in Table 3. The computational cost rises markedly as the number of output channels per encoder layer increases. Channel strategy 2 is considerably more accurate than strategy 1, while strategy 3 is almost unchanged relative to strategy 2; the segmentation task is considered not complex, so excess parameters only create redundancy, and the network structure limits the ability to extract semantic information.
TABLE 3 MA-Net channel number comparison experiment
Encoder network selection
To further explore how the depth of the MA-Net encoder reference network affects network performance and to select a suitable encoder, channel strategy 2 is adopted and the segmentation performance of ResNet-18 and ResNet-34 is compared on the magnetite microscopic image data set. To examine the effect of increasing network depth and channel count at the same time, a comparison using ResNet34 with its original parameters, denoted ResNet34-B, is added. Table 4 shows the segmentation performance and computational cost of the three encoders: deepening the network improves segmentation performance to some extent, whereas an overly deep network with too many channels yields little gain while the computational cost rises sharply, and the lightweight encoder network has little impact on model performance. All later experiments use the lightweight ResNet34 with channel strategy 2.
TABLE 4 MA-Net reference network comparison experiment
(II) Model analysis
An MA-Net ablation experiment is carried out on the magnetite microscopic image data set to analyse the performance of each module; the Dice coefficient, parameter count and computational cost are shown in Table 5. The attention mechanism (ARM) has some effect on segmentation accuracy; residual multi-kernel pooling (RMP) improves accuracy considerably while adding the least computation; and the FRN normalization method slightly improves accuracy while reducing computation.
TABLE 5 MA-Net ablation experiments on the mineral segmentation data set
(III) model comparison experiment
Comparing the proposed MA-Net and advanced algorithms on the magnetite microscopic image data set, it can be seen that the MA-Net segmentation accuracy exceeds the other two networks, while the parameters and the calculated amount are minimal.
TABLE 6 Magnetite microscopic image data set modeling contrast experiment
Fig. 7 compares the segmentation results. The segmentation quality of CE-net is far lower than that of MA-net: the CE-net output is easily disturbed by highlighted regions, and although its overall contours are segmented well, the result contains a large number of holes, which rarely occur with MA-net. CE-net adds dilated convolution and multi-kernel pooling at the end of the encoder to enlarge the receptive field, but it fuses the encoder feature map and the decoder feature map by simple addition, so information loss and improper fusion inevitably occur during upsampling. The multi-feature fusion decoding strategy adopted by MA-net can fully extract the information of deep and shallow features and learn their correlation to handle large targets in the segmentation result, which largely overcomes the problems of information loss and poor fusion during upsampling. In addition, downsampling is performed after each fusion of the shallow feature maps, and the resulting enlargement of the receptive field helps to supplement spatial information for large targets.
The comparative experiment for the region clone data enhancement method is as follows:
To analyze the effect of the region clone data enhancement method, an experiment is carried out on the mineral microscopic image data set. As shown in Fig. 8, the segmentation performance on the test set is compared with and without the region clone data enhancement method. The curves show that, with the data enhancement method, the Dice value fluctuates less and a higher segmentation performance is finally obtained. The reason is analyzed as follows: the region clone data enhancement method combines the information of two pictures, which effectively reduces the difference between pictures and thereby reduces the variance of the data.
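The region clone operation can be sketched as follows. This is a minimal NumPy illustration and not the patented implementation: the function name `region_clone`, the rectangular shape of the cloned region, the `max_frac` parameter and the way the size and position are sampled are all assumptions, since the text only states that a partial region of another picture is cloned onto the indexed image and that the label receives the same operation. Both pictures are assumed to have the same size.

```python
import numpy as np


def region_clone(image, label, donor_image, donor_label, rng, max_frac=0.5):
    """Paste one random rectangle of the donor picture (and of its label)
    onto a copy of the indexed picture. Rectangle sampling is an assumption."""
    h, w = image.shape[:2]
    # Patch size: at least 1 pixel, at most max_frac of each side.
    ph = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    pw = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    # Top-left corner chosen so that the patch stays inside the picture.
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))

    out_image, out_label = image.copy(), label.copy()
    # The same region is copied for the picture and for its label.
    out_image[top:top + ph, left:left + pw] = donor_image[top:top + ph, left:left + pw]
    out_label[top:top + ph, left:left + pw] = donor_label[top:top + ph, left:left + pw]
    return out_image, out_label


rng = np.random.default_rng(0)
img_a, lab_a = np.zeros((8, 8)), np.zeros((8, 8), dtype=np.uint8)
img_b, lab_b = np.ones((8, 8)), np.ones((8, 8), dtype=np.uint8)
mixed_img, mixed_lab = region_clone(img_a, lab_a, img_b, lab_b, rng)
```

Because the picture and the label are edited with the same slice, the pasted quartz region and its annotation stay aligned, which is the property the method relies on.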
By using the MA-net network and the region clone data enhancement method provided by the invention to segment quartz in the magnetite phase, the Dice coefficient reaches 0.9637, which is extremely close to the manually annotated label, as shown in Fig. 9. Predicting a single picture takes only 0.16 second on an AMD Ryzen 7 3700X CPU.
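For reference, the Dice coefficient measures the overlap between a predicted mask and the manually annotated label. The sketch below assumes binary masks and the common definition 2|P ∩ G| / (|P| + |G|); the function name `dice_coefficient` and the smoothing term `eps` are illustrative and are not taken from the source.

```python
import numpy as np


def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between two binary masks (1 = quartz, 0 = background)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps avoids a division by zero when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


pred = [[1, 1], [0, 0]]
label = [[1, 0], [0, 0]]
score = dice_coefficient(pred, label)  # 2 * 1 / (2 + 1), about 0.667
```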
The above description covers only the preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.
Claims (7)
1. A mineral real-time segmentation method based on a multi-feature fusion decoder, characterized by comprising the following steps:
S1: performing semantic segmentation of quartz, which is marked as the white area of the label, in magnetite microscopic images; producing a plurality of data sets and training and testing on the obtained data sets respectively; and enhancing the magnetite microscopic images using a combination of vertical flipping, horizontal flipping, random rotation by n × 90 degrees, affine transformation and random translation;
S2: during the enhancement in S1, adopting a region clone data set enhancement method, in which, during training on the data set, a partial region of another picture randomly selected from the data set is cloned onto the indexed image and the same operation is performed on the label, thereby improving the diversity of the mineral facies data and reducing overfitting during training;
S3: constructing a network model based on the data set processed in S2, and encoding, decoding and fusing the data, wherein, during the repeated encoding and decoding operations, all feature maps of the same scale are fused, an encoder feature map and a decoder feature map are obtained, and encoding is performed again after the encoder feature map and the decoder feature map are fused;
S4: when the data set is encoded and decoded in S3, using a multi-feature aggregation decoder structure and a lightweight Resnet34 to build an MA-net network structure;
S5: based on the MA-net network structure built in S4, introducing a channel attention mechanism for training to obtain a network model and improve the segmentation accuracy;
S6: introducing a residual multi-kernel pooling module at the end of the MA-net network structure built in S4, wherein the residual multi-kernel pooling module mainly relies on multiple effective fields of view to detect objects of different sizes;
S7: in the training process of the data set in S1, using filter response normalization in place of batch normalization and using the corresponding activation layer, the thresholded linear unit, in place of the rectified linear unit; and performing model performance analysis experiments on the EM challenge data set, the LUNA challenge data set and the DRIVE data set, respectively.
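The geometric part of the enhancement in S1 can be illustrated with a short NumPy sketch. It covers only the flips and the rotation by n × 90 degrees; affine transformation and random translation are left out. The function name `flip_rotate` and the 0.5 flip probabilities are assumptions made for illustration.

```python
import numpy as np


def flip_rotate(image, label, rng):
    """Apply the same random flips and n x 90 degree rotation to image and label."""
    if rng.random() < 0.5:        # vertical flip (top <-> bottom)
        image, label = np.flipud(image), np.flipud(label)
    if rng.random() < 0.5:        # horizontal flip (left <-> right)
        image, label = np.fliplr(image), np.fliplr(label)
    n = int(rng.integers(0, 4))   # rotate by n * 90 degrees, n in {0, 1, 2, 3}
    return np.rot90(image, n), np.rot90(label, n)


rng = np.random.default_rng(0)
image = np.arange(16).reshape(4, 4)
label = image % 2                 # toy label derived from the toy image
aug_image, aug_label = flip_rotate(image, label, rng)
```

Applying every transform to the image and the label together keeps each annotated pixel attached to the pixel it describes.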
2. The real-time mineral segmentation method based on multi-feature fusion decoder as claimed in claim 1, wherein: a detail extraction module is added to the MA-net network structure in S4 to improve the capability of segmenting tiny targets in the DRIVE data set.
3. The real-time mineral segmentation method based on multi-feature fusion decoder as claimed in claim 1 or 2, wherein: in the MA-net network structure described in S4, the first convolutional layer uses 16 channels, the number of output channels of the first convolution kernel is reduced to 1/4 of the number of input channels, and this number is used as the number of input channels of the second convolutional layer, so that the number of parameters is reduced without changing the input and output channels of each decoder block.
4. The real-time mineral segmentation method based on multi-feature fusion decoder as claimed in claim 1, wherein: the channel attention mechanism introduced in S5 comprises the following steps:
A1: global average pooling is performed first to maintain a maximum receptive field;
A2: a learnable weight is then assigned to the channel of each feature map, so that the network model pays more attention to the main objects being classified; the channel attention mechanism adopts an ARM module.
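Steps A1 and A2 can be sketched in NumPy as follows. The sketch assumes that the ARM module follows the usual attention-refinement pattern (global average pooling, a 1 × 1 convolution, a sigmoid gate, channel-wise multiplication); the normalization layer inside the module is omitted, and the names `channel_attention`, `weight` and `bias` are hypothetical.

```python
import numpy as np


def channel_attention(features, weight, bias):
    """features: (C, H, W); weight: (C, C); bias: (C,). Returns reweighted features."""
    # A1: global average pooling, one value per channel.
    pooled = features.mean(axis=(1, 2))
    # On a 1 x 1 map, a 1 x 1 convolution is a matrix-vector product.
    logits = weight @ pooled + bias
    # Sigmoid turns the logits into per-channel weights in (0, 1).
    scores = 1.0 / (1.0 + np.exp(-logits))
    # A2: scale every channel by its learned weight.
    return features * scores[:, None, None]


rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 6, 6))
out = channel_attention(feats, np.zeros((4, 4)), np.zeros(4))  # all weights 0.5
```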
5. The real-time mineral segmentation method based on multi-feature fusion decoder as claimed in claim 1, wherein: introducing the residual multi-kernel pooling module in S6 comprises the following steps:
B1: collecting context information using four pooling kernels of different sizes to enrich high-level semantic information;
B2: obtaining feature maps with the same size as the original feature map by bilinear interpolation, and reducing the dimension to 1 with a 1 × 1 convolution;
B3: merging the original feature map and the upsampled feature maps along the channel dimension; the residual multi-kernel pooling module is used to cope with large changes in object size in the image.
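Steps B1 to B3 can be sketched in NumPy as follows. Several details are assumptions, because the text does not fix them: max pooling with stride equal to the kernel size, the kernel sizes (2, 3, 5, 6) used by CE-net, corner-aligned bilinear interpolation, and the names `max_pool`, `bilinear_resize`, `residual_multi_kernel_pooling` and `conv_weights`. The input must be at least as large as the largest kernel.

```python
import numpy as np


def max_pool(x, k):
    """Non-overlapping k x k max pooling on a (C, H, W) array; ragged borders are cropped."""
    c, h, w = x.shape
    hk, wk = h // k, w // k
    return x[:, :hk * k, :wk * k].reshape(c, hk, k, wk, k).max(axis=(2, 4))


def bilinear_resize(x, out_h, out_w):
    """Corner-aligned bilinear interpolation of a (C, h, w) array to (C, out_h, out_w)."""
    c, h, w = x.shape
    ys, xs = np.linspace(0, h - 1, out_h), np.linspace(0, w - 1, out_w)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = (ys - y0)[None, :, None], (xs - x0)[None, None, :]
    top = x[:, y0][:, :, x0] * (1 - wx) + x[:, y0][:, :, x1] * wx
    bottom = x[:, y1][:, :, x0] * (1 - wx) + x[:, y1][:, :, x1] * wx
    return top * (1 - wy) + bottom * wy


def residual_multi_kernel_pooling(x, conv_weights, kernel_sizes=(2, 3, 5, 6)):
    """x: (C, H, W); conv_weights: one (C,) vector per kernel size (1 x 1 conv to 1 channel)."""
    c, h, w = x.shape
    branches = [x]                                    # residual path keeps the original map
    for k, wgt in zip(kernel_sizes, conv_weights):
        pooled = max_pool(x, k)                       # B1: context at one pooling scale
        up = bilinear_resize(pooled, h, w)            # B2: back to the original size
        reduced = np.einsum("c,chw->hw", wgt, up)     # B2: 1 x 1 convolution to 1 channel
        branches.append(reduced[None])
    return np.concatenate(branches, axis=0)           # B3: merge along the channel axis


x = np.ones((4, 12, 12))
weights = [np.full(4, 0.25) for _ in range(4)]
y = residual_multi_kernel_pooling(x, weights)         # shape (4 + 4, 12, 12)
```

A 1 × 1 convolution and bilinear interpolation are both linear, so without a bias term the two operations in B2 give the same result in either order.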
6. The real-time mineral segmentation method based on multi-feature fusion decoder as claimed in claim 1, wherein: the FRN expression described in S7 is as follows:
where x is an N-dimensional vector (N = H × W); unlike batch normalization (BN), which subtracts the mean and then divides by the standard deviation, FRN divides by the square root of the mean of the squared norm and does not subtract the mean; ε in the formula is a small positive constant that prevents division by 0.
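A minimal NumPy sketch of filter response normalization is given here for illustration. It assumes the standard formulation from the cited Filter Response Normalization paper, ν² = (1/N) Σ x_i², x̂ = x / sqrt(ν² + ε), y = γ x̂ + β, where γ and β are learnable per-channel parameters; the function name `filter_response_norm` and the default value of `eps` are illustrative.

```python
import numpy as np


def filter_response_norm(x, gamma, beta, eps=1e-6):
    """x: (C, H, W) activations of one sample; gamma, beta: (C,) learnable parameters."""
    # Mean of the squared activations over the N = H * W positions of each channel.
    nu2 = np.mean(x ** 2, axis=(1, 2), keepdims=True)
    # Divide by the root mean square; no mean is subtracted and no batch statistics are used.
    x_hat = x / np.sqrt(nu2 + eps)
    return gamma[:, None, None] * x_hat + beta[:, None, None]


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4, 5))
y = filter_response_norm(x, np.ones(3), np.zeros(3))
```

With γ = 1 and β = 0, each output channel has a mean squared value close to 1, independent of the batch size.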
7. The real-time mineral segmentation method based on multi-feature fusion decoder as claimed in claim 6, wherein: the TLU expression described in S7 is as follows:
$$z_i = \max(y_i, \tau) = \mathrm{ReLU}(y_i - \tau) + \tau \tag{3}$$
where τ is a learnable parameter.
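The identity in expression (3) can be checked numerically. In this sketch `tlu` and `relu` are illustrative helper names, and τ is a fixed number here rather than a trained parameter.

```python
import numpy as np


def tlu(y, tau):
    """Thresholded linear unit: element-wise max(y, tau)."""
    return np.maximum(y, tau)


def relu(y):
    """Rectified linear unit: element-wise max(y, 0)."""
    return np.maximum(y, 0.0)


y = np.array([-2.0, -0.5, 0.0, 1.5])
tau = -1.0
direct = tlu(y, tau)                 # [-1.0, -0.5, 0.0, 1.5]
via_relu = relu(y - tau) + tau       # same values, as expression (3) states
```

A learnable threshold is useful after FRN because FRN does not center the activations, so a fixed threshold at 0 could cut off an arbitrary share of them.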
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110847545.6A CN113570611A (en) | 2021-07-27 | 2021-07-27 | Mineral real-time segmentation method based on multi-feature fusion decoder |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113570611A true CN113570611A (en) | 2021-10-29 |
Family
ID=78167689
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110847545.6A Pending CN113570611A (en) | 2021-07-27 | 2021-07-27 | Mineral real-time segmentation method based on multi-feature fusion decoder |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113570611A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113887524A (en) * | 2021-11-04 | 2022-01-04 | 华北理工大学 | Magnetite microscopic image segmentation method based on semantic segmentation |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10361802B1 (en) * | 1999-02-01 | 2019-07-23 | Blanding Hovenweep, Llc | Adaptive pattern recognition based control system and method |
CN111681252A (en) * | 2020-05-30 | 2020-09-18 | 重庆邮电大学 | Medical image automatic segmentation method based on multipath attention fusion |
CN111797779A (en) * | 2020-07-08 | 2020-10-20 | 兰州交通大学 | Remote sensing image semantic segmentation method based on regional attention multi-scale feature fusion |
Non-Patent Citations (2)
Title |
---|
SAURABH SINGH: "Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks" * |
陈家石 (Chen Jiashi): "基于深度学习的医学图像分割研究" [Research on medical image segmentation based on deep learning] * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20211029 |