CN113570611A - Mineral real-time segmentation method based on multi-feature fusion decoder - Google Patents

Mineral real-time segmentation method based on multi-feature fusion decoder

Info

Publication number
CN113570611A
Authority
CN
China
Prior art keywords
segmentation
data set
decoder
mineral
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110847545.6A
Other languages
Chinese (zh)
Inventor
牛福生
薛文强
张晋霞
郭力娜
姚珊珊
粱银英
陈稳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China University of Science and Technology
Original Assignee
North China University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China University of Science and Technology
Priority to CN202110847545.6A
Publication of CN113570611A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a real-time mineral segmentation method based on a multi-feature fusion decoder, and belongs to the technical field of mineral phase segmentation. The method comprises the following steps. S1: quartz under magnetite microscopic images is semantically segmented, with quartz labeled as the white area of the label; several data sets are built and used for training and testing; and the magnetite microscopic images are augmented with a combination of strategies such as vertical flipping, horizontal flipping, random rotation by n × 90°, affine transformation, and random translation. S2: the augmentation in S1 also uses a region-cloning data set augmentation method: during training, a partial region of another picture randomly chosen from the data set is cloned onto the index image, and the same operation is applied to the label. The method addresses the problem that completing the quartz segmentation task under magnetite microscopic images with conventional medical semantic segmentation strategies still takes a great deal of time.

Description

Mineral real-time segmentation method based on multi-feature fusion decoder
Technical Field
The invention belongs to the technical field of mineral facies segmentation, and particularly relates to a real-time mineral segmentation method based on a multi-feature fusion decoder.
Background
Quantitative mineral analysis under the microscope demands extensive professional knowledge and practical experience from process mineralogy workers, and the manual method is primitive and time-consuming. Using a computer to segment mineral phases quickly and obtain their composition is therefore of great significance to these workers. In recent years, deep learning has been applied more and more, in China and abroad, to the classification of minerals and rocks, and researchers have obtained substantial results in mineral identification. However, because the color and texture of mineral phases under the microscope are complex and varied, mineral phases are difficult to segment with traditional image processing methods, and few researchers have worked on mineral phase segmentation. The recent development of deep learning semantic segmentation has made it feasible.
Mineral microscopic image segmentation is close to medical image segmentation, so medical image segmentation methods are a valuable reference. Many effective medical image segmentation schemes have appeared in recent years. The most classic is the U-shaped network (U-Net), which achieved good segmentation results in the biomedical field; U-Net++ and U-Net3+ were later proposed as improvements on U-Net. In 2019, dilated convolution and pyramid pooling were introduced into the U-shaped network, further improving segmentation precision. Medical image segmentation algorithms are thus mature in terms of precision. However, segmenting mineral photographs with conventional medical semantic segmentation strategies still takes a great deal of time: for example, a polished ore section measures 3.5 × 3.5 cm, and a microscope with a 50× objective needs ten thousand photographs to cover it completely.
Therefore, inspired by the feature reuse structure, the decoder of the U-shaped network is improved and a multi-feature fusion decoder structure is proposed to complete the task of segmenting quartz under magnetite microscopic images. To solve the above problems, the invention provides a real-time mineral segmentation method based on a multi-feature fusion decoder.
Disclosure of Invention
The invention aims to improve the decoder of the U-shaped network, to provide a multi-feature fusion decoder structure, and to complete the task of segmenting quartz under magnetite microscopic images, so as to solve the problems described in the background.
To achieve this purpose, the invention adopts the following technical scheme:
The real-time mineral segmentation method based on the multi-feature fusion decoder comprises the following steps:
S1: semantically segment quartz under magnetite microscopic images, with quartz labeled as the white area of the label; build several data sets and use them for training and testing; and augment the magnetite microscopic images with a combination of strategies such as vertical flipping, horizontal flipping, random rotation by n × 90°, affine transformation, and random translation;
S2: the augmentation in S1 also uses a region-cloning data set augmentation method: during training, a partial region of another picture randomly chosen from the data set is cloned onto the index image, and the same operation is applied to the label, which increases the diversity of the mineral phase data and reduces overfitting during training;
S3: construct a network model based on the data set processed in S2, and encode, decode, and fuse the data: during the repeated encoding and decoding operations, all feature maps of the same scale are fused, and the encoder feature map and the decoder feature map are encoded again after they are fused;
S4: for the encoding and decoding in S3, build the MA-Net network structure from a multi-feature aggregation decoder structure and a lightweight ResNet34;
S5: introduce a channel attention mechanism into the MA-Net network structure built in S4 and train it to obtain the network model, improving segmentation precision;
S6: introduce a residual multi-kernel pooling module at the end of the MA-Net network structure built in S4; the module relies on multiple effective fields of view to detect objects of different sizes;
S7: during training on the data sets of S1, replace batch normalization (BN) with filter response normalization (FRN), and replace the rectified linear unit (ReLU) with the corresponding activation layer, the thresholded linear unit (TLU); model performance analysis experiments are performed on the EM challenge data set, the LUNA challenge data set, and the DRIVE data set, respectively.
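The augmentation in steps S1 and S2 can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the rectangular shape of the cloned region, pasting it at the same position it had in the donor picture, and the `max_frac` size limit are all assumptions, since the text does not specify them. Affine transformation and random translation are omitted for brevity.

```python
import numpy as np

def flip_rotate(image, label, rng):
    """Apply the same random flips and n x 90 degree rotation to image and label."""
    if rng.random() < 0.5:                      # vertical flip
        image, label = image[::-1], label[::-1]
    if rng.random() < 0.5:                      # horizontal flip
        image, label = image[:, ::-1], label[:, ::-1]
    k = int(rng.integers(0, 4))                 # n in {0, 1, 2, 3}
    return np.rot90(image, k), np.rot90(label, k)

def region_clone(image, label, donor_image, donor_label, rng, max_frac=0.5):
    """Copy one random rectangle of the donor pair into a copy of the index pair.

    Assumption: the rectangle keeps its position and covers at most
    max_frac of each side; the source text does not fix these details.
    """
    h, w = label.shape[:2]
    ph = int(rng.integers(1, max(2, int(h * max_frac) + 1)))   # patch height
    pw = int(rng.integers(1, max(2, int(w * max_frac) + 1)))   # patch width
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))
    out_image, out_label = image.copy(), label.copy()
    # The image and its label receive exactly the same region.
    out_image[top:top + ph, left:left + pw] = donor_image[top:top + ph, left:left + pw]
    out_label[top:top + ph, left:left + pw] = donor_label[top:top + ph, left:left + pw]
    return out_image, out_label
```

In a training loop, the donor pair would be drawn at random from the same data set each time the index image is loaded, so the inputs on disk are never modified.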
Preferably, a detail extraction module is added to the MA-Net network structure described in S4 to improve its ability to segment tiny targets in the DRIVE data set.
Preferably, in the MA-Net network structure described in S4, the first convolutional layer uses 16 channels; within each decoder block, the number of output channels of the first convolution is reduced to 1/4 of the number of input channels and serves as the number of input channels of the second convolution, so that the parameter count is greatly reduced without changing the input and output channels of the decoder block.
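The effect of the 1/4 channel reduction on the parameter count can be checked with a little arithmetic. The sketch below assumes two bias-free 3 × 3 convolutions per decoder block and uses illustrative channel counts (64 in, 32 out); neither the kernel size nor these channel counts come from the text.

```python
def conv_params(c_in, c_out, k=3):
    """Number of weights in a k x k convolution without bias."""
    return k * k * c_in * c_out

def block_params(c_in, c_out, reduction=1, k=3):
    """Two stacked convolutions: c_in -> c_in // reduction -> c_out."""
    c_mid = c_in // reduction
    return conv_params(c_in, c_mid, k) + conv_params(c_mid, c_out, k)

# Hypothetical decoder block with 64 input and 32 output channels.
full = block_params(64, 32, reduction=1)      # no channel reduction
squeezed = block_params(64, 32, reduction=4)  # first conv outputs 1/4 of its input
print(full, squeezed, full / squeezed)
```

Under these assumptions both convolutions scale with the intermediate channel count, so the block shrinks by the same factor of 4 while its external input and output channels stay unchanged.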
Preferably, the channel attention mechanism introduced in S5 comprises the following steps:
A1: perform global average pooling first, to maintain the maximum receptive field;
A2: then assign a learnable weight to the channel of each feature map, so that the network model pays more attention to the main objects being classified; the channel attention mechanism uses an ARM module.
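Steps A1 and A2 can be sketched for a single feature map as below. The internal layers of the ARM module are not given in the text, so the linear map (a 1 × 1 convolution applied to the pooled vector) and the sigmoid gate are assumptions, and `channel_attention`, `weight`, and `bias` are hypothetical names.

```python
import numpy as np

def channel_attention(x, weight, bias):
    """Reweight the channels of one feature map.

    x: (C, H, W) feature map; weight: (C, C) and bias: (C,) are the
    learnable parameters (assumed form, see the lead-in).
    """
    pooled = x.mean(axis=(1, 2))             # A1: global average pooling -> (C,)
    logits = weight @ pooled + bias          # 1x1 convolution on a 1x1 map
    scale = 1.0 / (1.0 + np.exp(-logits))    # sigmoid gate, one value per channel
    return x * scale[:, None, None]          # A2: scale every channel by its weight
```

Because the gate is computed from the spatial mean of the whole map, each channel weight depends on the entire image rather than on a local window.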
Preferably, introducing the residual multi-kernel pooling module in S6 comprises the following steps:
B1: collect context information with four pooling kernels of different sizes to enrich the high-level semantic information;
B2: obtain feature maps of the same size as the original feature map by bilinear interpolation, and reduce the channel dimension to 1 with a 1 × 1 convolution;
B3: concatenate the original feature map and the upsampled feature maps along the channel dimension; the residual multi-kernel pooling module can thus cope with large variations in object size in the image.
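Steps B1 to B3 can be sketched in NumPy as follows. The text gives neither the pooling type nor the kernel sizes, so max pooling with kernels 2, 3, 5, and 6 (the choice used in CE-Net), one separate 1 × 1 convolution per branch, and the align-corners style of the bilinear resize are all assumptions; the function names are hypothetical.

```python
import numpy as np

def max_pool(x, k):
    """Non-overlapping k x k max pooling on a (C, H, W) array (edges cropped)."""
    c, h, w = x.shape
    hk, wk = h // k, w // k
    return x[:, :hk * k, :wk * k].reshape(c, hk, k, wk, k).max(axis=(2, 4))

def bilinear_resize(a, out_h, out_w):
    """Resize a 2-D array with bilinear interpolation (align-corners style)."""
    in_h, in_w = a.shape
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = a[y0][:, x0] * (1 - wx) + a[y0][:, x1] * wx
    bottom = a[y1][:, x0] * (1 - wx) + a[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def residual_multi_kernel_pooling(x, conv_weights, kernels=(2, 3, 5, 6)):
    """x: (C, H, W); conv_weights: (4, C), one 1x1 convolution per branch."""
    c, h, w = x.shape
    branches = [x]                                                   # residual input
    for k, weights in zip(kernels, conv_weights):
        pooled = max_pool(x, k)                                      # B1: context at scale k
        up = np.stack([bilinear_resize(ch, h, w) for ch in pooled])  # B2: back to H x W
        branches.append(np.tensordot(weights, up, axes=1)[None])     # B2: 1x1 conv -> 1 channel
    return np.concatenate(branches, axis=0)                          # B3: C + 4 channels
```

The output has four more channels than the input, one per pooling scale, which is why the module adds very few parameters.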
Preferably, the FRN expression described in S7 is as follows:
[Equations (1) and (2) appear as images in the original publication and are not reproduced in this text.]
where x is an N-dimensional vector with N = H × W. Unlike the BN layer, which subtracts the mean and then divides by the standard deviation, FRN does not subtract the mean: it divides x by the square root of the mean of its squared values (the mean squared norm). ε in the formula is a small positive constant that prevents division by zero.
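For reference, the FRN layer as published in the literature is commonly written as below. That these expressions correspond exactly to equations (1) and (2) of this document is an assumption; γ and β are the learnable per-channel scale and shift of the published formulation and are not named in the text here.

```latex
\nu^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2, \qquad
\hat{x}_i = \frac{x_i}{\sqrt{\nu^2 + \epsilon}}, \qquad
y_i = \gamma\,\hat{x}_i + \beta
```

The statistics are computed per sample and per channel over the N = H × W spatial positions, which is why the result does not depend on the batch size.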
Preferably, the TLU expression described in S7 is as follows:
$$z_i = \max(y_i, \tau) = \mathrm{ReLU}(y_i - \tau) + \tau \qquad (3)$$
where τ is a learnable parameter.
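FRN followed by TLU can be sketched for one sample as below. This is a NumPy illustration under the standard published FRN formulation, which is assumed to match the equations of this document; `gamma` and `beta` (learnable scale and shift) and the default `eps` are assumptions that do not appear in the text.

```python
import numpy as np

def frn_tlu(x, gamma, beta, tau, eps=1e-6):
    """Filter response normalization followed by a thresholded linear unit.

    x: (C, H, W) activations of one sample; gamma, beta, tau: (C,) per-channel
    learnable parameters.
    """
    nu2 = np.mean(x ** 2, axis=(1, 2), keepdims=True)   # mean squared norm over N = H*W
    x_hat = x / np.sqrt(nu2 + eps)                      # no mean subtraction
    y = gamma[:, None, None] * x_hat + beta[:, None, None]
    return np.maximum(y, tau[:, None, None])            # TLU: z = max(y, tau)
```

Nothing in the function looks at other samples, so the output for an image is the same whether the batch holds one image or many.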
Compared with the prior art, the real-time mineral segmentation method based on the multi-feature fusion decoder provided by the invention has the following beneficial effects:
(1) The invention provides a multi-feature fusion decoder structure and combines it with a lightweight ResNet34 to build the MA-Net network structure. Residual multi-kernel pooling is added at the end of the encoder to improve segmentation of targets of various sizes, a channel attention mechanism is introduced to improve segmentation precision, and FRN is adopted to remove the network's dependence on batch size during training. Compared with a single-path decoder structure, the network adds a downsampling process and aggregates multi-stage feature information during encoding and decoding. As a result, compared with other U-shaped networks, MA-Net can greatly reduce the number of network channels and parameters while maintaining segmentation precision.
(2) To obtain a good segmentation result from a small amount of training data, data augmentation is performed by randomly cloning a partial region of another picture in the data set onto the index image. Experimental verification and analysis show that the method suits segmentation tasks in which the spatial position of the target is random, such as mineral microscopic images, and that it effectively reduces overfitting and improves segmentation precision.
(3) Testing and analysis on the EM, LUNA, and DRIVE data sets show that MA-Net performs outstandingly when segmenting larger targets but is weaker on tiny targets, where it still needs to be optimized and improved. When MA-Net is used to segment quartz in the magnetite phase, the Dice coefficient reaches 0.9637.
(4) To avoid the influence of batch size on the training result, filter response normalization replaces batch normalization, and the corresponding activation layer, the thresholded linear unit, replaces the rectified linear unit. The average Dice coefficients on the EM challenge data set and the LUNA challenge data set are 0.9657 and 0.9852 respectively, a marked improvement over the 0.9584 and 0.9758 of U-Net, while the floating-point operation count is 5.72 G, only 1/43 of that of U-Net. MA-Net therefore shows a good segmentation effect on the task of segmenting quartz in magnetite.
(5) The influence of each MA-Net module on segmentation quality and on the real-time performance of the model is verified on the mineral microscopic image data set. The multi-feature fusion decoding strategy adopted by MA-Net fully extracts the information of deep and shallow features and learns their correlation to handle isolated pixels in the segmentation result, which largely resolves the information loss and poor fusion that occur during upsampling.
Drawings
FIG. 1 shows the magnetite mineral phase data set used in the real-time mineral segmentation method based on the multi-feature fusion decoder according to the invention;
FIG. 2 shows the region-cloning data augmentation method of the invention;
FIG. 3 compares the structures of the U-shaped network and the stage feature reuse structure;
FIG. 4 shows the MA-Net structure of the invention;
FIG. 5 shows the ARM module of the invention;
FIG. 6 shows the residual multi-kernel pooling module of the invention;
FIG. 7 compares the mineral segmentation results of the invention with those of other networks;
FIG. 8 shows the effect of the region-cloning data augmentation method of the invention;
FIG. 9 shows the mineral segmentation results of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
Example 1:
Referring to FIGS. 1-6, the real-time mineral segmentation method based on the multi-feature fusion decoder comprises the following steps:
S1: semantically segment quartz under magnetite microscopic images, with quartz labeled as the white area of the label (shown in FIG. 1); build several data sets and use them for training and testing; and augment the magnetite microscopic images with a combination of strategies such as vertical flipping, horizontal flipping, random rotation by n × 90°, affine transformation, and random translation;
S2: the augmentation in S1 also uses a region-cloning data set augmentation method: during training, a partial region of another picture randomly chosen from the data set is cloned onto the index image, and the same operation is applied to the label, which increases the diversity of the mineral phase data and reduces overfitting during training;
when the method is applied to the mineral phase data set, it is more likely to enrich the data set than to destroy information, which the experimental section verifies; the region-cloning data augmentation method is shown in FIG. 2;
S3: construct a network model based on the data set processed in S2, and encode, decode, and fuse the data: during the repeated encoding and decoding operations, all feature maps of the same scale are fused, and the encoder feature map and the decoder feature map are encoded again after they are fused;
the overall structure of the network is an encoder-decoder structure; a traditional decoder enriches the feature map along a single upsampling path, the individual upsampling steps are not closely connected, and it is difficult for the deep feature maps to recover detailed information during decoding;
combining the feature reuse structure and the encoder-decoder structure proposed in the published literature, the invention provides a decoder structure that aggregates features from multiple stages; this strategy fuses all feature maps of the same scale during the repeated encoding and decoding operations, and encoding again after the encoder and decoder feature maps are fused lets the network further learn the correlation between the two, so that the fusion is more appropriate; because the multi-stage feature maps supplement detailed and spatial information in the decoding process at every depth, the number of feature map channels can be compressed substantially compared with the traditional structure, which reduces the parameter count; the U-shaped network and the stage feature reuse structure are shown in FIG. 3;
S4: for the encoding and decoding in S3, build the MA-Net network structure from a multi-feature aggregation decoder structure and a lightweight ResNet34; the structure is shown in FIG. 4, where 'c' denotes channel merging and is a 1 × 1 convolution;
in the MA-Net network structure described in S4, the first convolutional layer uses 16 channels; within each decoder block, the number of output channels of the first convolution is reduced to 1/4 of the number of input channels and serves as the number of input channels of the second convolution, so that the parameter count is greatly reduced without changing the input and output channels of the decoder block;
in a deep convolutional neural network, shallow feature maps are larger than deep ones, so the amount of computation is more sensitive to their number of channels; the encoder parameters and numbers of output channels are shown in Table 1, where /2 denotes 2-fold downsampling; the decoder configuration parameters are shown in Table 2, where 2 denotes 2-fold upsampling;
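Table 1. Encoder module parameters
[Table 1 appears as an image in the original publication and is not reproduced in this text.]
Table 2. Decoder module parameters
[Table 2 appears as an image in the original publication and is not reproduced in this text.]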
S5: introduce a channel attention mechanism into the MA-Net network structure built in S4 and train it to obtain the network model, improving segmentation precision;
the channel attention mechanism introduced in S5 comprises the following steps:
A1: perform global average pooling first, to maintain the maximum receptive field;
A2: then assign a learnable weight to the channel of each feature map, so that the network model pays more attention to the main objects being classified; the channel attention mechanism uses an ARM module, as shown in FIG. 5;
S6: introduce a residual multi-kernel pooling module at the end of the MA-Net network structure built in S4; the module relies on multiple effective fields of view to detect objects of different sizes;
introducing the residual multi-kernel pooling module in S6 comprises the following steps:
B1: collect context information with four pooling kernels of different sizes to enrich the high-level semantic information;
B2: obtain feature maps of the same size as the original feature map by bilinear interpolation, and reduce the channel dimension to 1 with a 1 × 1 convolution;
B3: concatenate the original feature map and the upsampled feature maps along the channel dimension; the residual multi-kernel pooling module can thus cope with large variations in object size in the image;
the module introduces few parameters, only 388, and causes a slight increase in computational cost, which is outweighed by the gain in accuracy; the RMP module is shown in FIG. 6;
S7: during training on the data sets of S1, replace batch normalization (BN) with filter response normalization (FRN), and replace the rectified linear unit (ReLU) with the corresponding activation layer, the thresholded linear unit (TLU); model performance analysis experiments are performed on the EM challenge data set, the LUNA challenge data set, and the DRIVE data set, respectively;
a detail extraction module is added to the MA-Net network structure of S4 to improve its ability to segment tiny targets in the DRIVE data set;
specifically, the spatial information extraction path with a small stride and many channels proposed in the published literature is fused with the original decoder path; this effectively improves the segmentation of tiny targets, but it increases the amount of computation and does not improve precision on segmentation tasks with larger targets;
the FRN expression described in S7 is as follows:
[Equations (1) and (2) appear as images in the original publication and are not reproduced in this text.]
where x is an N-dimensional vector with N = H × W; unlike the BN layer, which subtracts the mean and then divides by the standard deviation, FRN does not subtract the mean: it divides x by the square root of the mean of its squared values (the mean squared norm); ε in the formula is a small positive constant that prevents division by zero;
because FRN does not subtract the mean, a ReLU activation applied after it can produce many zero values; to solve this problem, the published literature proposes a thresholded ReLU applied after FRN, the TLU, which is important for improving training performance; the TLU expression described in S7 is as follows:
$$z_i = \max(y_i, \tau) = \mathrm{ReLU}(y_i - \tau) + \tau \qquad (3)$$
where τ is a learnable parameter.
The model performance analysis experiment is as follows:
(I) Experimental setup
The evaluation index used in the experiments is the Dice coefficient. No test-time augmentation, such as the multi-scale or multi-angle prediction used to raise the quality of predicted results, is applied to the test set. The Dice coefficient is a set similarity measure generally used to calculate the similarity of two samples; its value ranges from 0 to 1, where 1 is the best segmentation and 0 the worst. With TP, FP, and FN denoting the numbers of true positives, false positives, and false negatives respectively, the Dice expression is:
$$\mathrm{Dice} = \frac{2\,TP}{2\,TP + FP + FN}$$
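The Dice coefficient in its standard form, 2·TP / (2·TP + FP + FN), can be computed from two binary masks as in the sketch below. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2*TP / (2*TP + FP + FN) for two flat binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)  # false negatives
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom
```

For example, with prediction `[1, 1, 0, 0]` and label `[1, 0, 1, 0]` there is one true positive, one false positive, and one false negative, so the coefficient is 2 / 4 = 0.5.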
The experiments run on Arch Linux with the PyTorch deep learning framework; the batch size is 8, the optimizer is Adam, the loss function is the Dice coefficient loss, and the input image size is 512 × 512.
Channel settings
The number of channels in the encoder output layers is one of the main limits on network acceleration. Using ResNet18 as the encoder backbone, three channel combinations were tested on the magnetite microscopic image data set, as shown in Table 3. The amount of computation rises markedly as the number of channels output by each encoder layer increases. The segmentation precision of channel strategy 2 is much higher than that of strategy 1, whereas strategy 3 is almost unchanged from strategy 2. This suggests that the segmentation task is not complex, that excess parameters only create redundancy, and that the network structure limits the ability to extract semantic information.
Table 3. MA-Net channel-number comparison experiment
[Table 3 appears as an image in the original publication and is not reproduced in this text.]
Encoder network selection
To further explore how the depth of the MA-Net encoder backbone affects network performance and to select a suitable encoder network, channel strategy 2 was used to compare the segmentation performance of ResNet18 and ResNet34 on the magnetite microscopic image data set. To examine the effect of increasing network depth and channel count at the same time, a comparison experiment using ResNet34 with its original parameters, denoted ResNet34-B, was added. Table 4 shows the segmentation performance and amount of computation of the three encoders. Deepening the network improves segmentation performance to some extent, but an overly deep network with too many channels brings little benefit while the amount of computation rises sharply, and the lightweight encoder network has little effect on model performance. All later experiments use the lightweight ResNet34 with channel strategy 2.
Table 4. MA-Net backbone comparison experiment
[Table 4 appears as an image in the original publication and is not reproduced in this text.]
(II) Model analysis
An MA-Net ablation experiment was carried out on the magnetite microscopic image data set to analyze the contribution of each module; the Dice coefficient, parameter count, and amount of computation are shown in Table 5. The attention mechanism (ARM) has a certain effect on segmentation precision; residual multi-kernel pooling (RMP) greatly improves precision while adding the least computation; and introducing FRN normalization slightly improves segmentation precision while reducing the amount of computation.
Table 5. MA-Net ablation experiments on the mineral segmentation data set
[Table 5 appears as two images in the original publication and is not reproduced in this text.]
(III) Model comparison experiment
The proposed MA-Net is compared with advanced algorithms on the magnetite microscopic image data set. MA-Net's segmentation accuracy exceeds that of the other two networks, while its parameter count and amount of computation are the smallest.
Table 6. Model comparison experiment on the magnetite microscopic image data set
[Table 6 appears as an image in the original publication and is not reproduced in this text.]
FIG. 7 compares the segmentation results. CE-Net segments far worse than MA-Net: its output is easily disturbed by highlighted areas, and although the overall contours are segmented well, the result contains a large number of holes, which rarely occur with MA-Net. CE-Net adds dilated convolution and multi-kernel pooling at the end of the encoder to enlarge the receptive field, but it fuses the encoder and decoder feature maps by simple addition, so information loss and improper fusion inevitably occur during upsampling. The multi-feature fusion decoding strategy adopted by MA-Net fully extracts the information of deep and shallow features and learns their correlation to handle large targets in the segmentation result, which largely overcomes the information loss and poor fusion in upsampling. In addition, downsampling is performed after each fusion of the shallow feature maps, and the resulting enlargement of the receptive field helps supplement spatial information for large targets.
Regional cloning data enhancement methods comparative experiments were as follows:
in order to analyze the effect of the regional clone data enhancement method, an experiment is carried out on a mineral microscopic image data set, as shown in fig. 8, the segmentation effect on a test set is compared by adopting the regional clone data enhancement method and when the regional clone data enhancement method is not adopted, the fact that the Dice value fluctuation is small when the data enhancement method is adopted can be seen from a curve, and a high segmentation effect is finally obtained; the analysis reason is as follows: the information of the two pictures is combined by adopting a region clone data enhancement method, so that the difference between the pictures can be effectively reduced, and further, the variance of the data is reduced;
Using the MA-Net network and the region clone data enhancement method provided by the invention for the task of segmenting quartz in the magnetite phase, the Dice coefficient reaches 0.9637, which is extremely close to the manually annotated labels, as shown in Fig. 9; predicting a single picture takes only 0.16 seconds on an AMD Ryzen 7 3700X CPU.
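For reference, the Dice coefficient reported above measures the overlap between a predicted mask A and its label B as 2|A ∩ B| / (|A| + |B|). A minimal sketch for binary masks follows; the function name and the smoothing term `eps` are assumptions, as the text does not state how the metric was implemented.

```python
import numpy as np


def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps (an assumed smoothing constant) avoids 0/0 when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


# Two predicted pixels, one labeled pixel, one pixel in common: 2*1 / (2+1).
score = dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

A score of 1 means the prediction and the label coincide exactly; in the small example the score is about 0.667.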
The above description is only the preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims (7)

1. A mineral real-time segmentation method based on a multi-feature fusion decoder, characterized by comprising the following steps:
S1: annotating quartz in magnetite microscopic images for semantic segmentation, with quartz marked as the white area of the label; producing a plurality of data sets, used respectively for training and testing; and enhancing the magnetite microscopic images by a combination of vertical flipping, horizontal flipping, random rotation by n × 90 degrees, affine transformation and random translation;
S2: in the enhancement of S1, adopting a region clone data enhancement method: during training on the data set, a partial region of another picture randomly selected from the data set is cloned onto the currently indexed image, and the same operation is performed on the label, so as to increase the diversity of the mineral phase data and reduce overfitting during training;
S3: constructing a network model based on the data set processed in S2, and encoding, decoding and fusing the data: during the repeated encoding and decoding operations, all feature maps of the same scale are fused, an encoder feature map and a decoder feature map are obtained, and encoding is performed again after the encoder feature map and the decoder feature map are fused;
S4: for the encoding and decoding in S3, building an MA-net network structure from a multi-feature aggregation decoder structure and a lightweight ResNet34;
S5: based on the MA-net network structure built in S4, introducing a channel attention mechanism and training to obtain a network model, thereby improving segmentation accuracy;
S6: introducing a residual multi-kernel pooling module at the end of the MA-net network structure built in S4, the residual multi-kernel pooling module relying mainly on multiple effective receptive fields to detect objects of different sizes;
S7: during training on the data set of S1, adopting filter response normalization (FRN) in place of batch normalization (BN), together with the corresponding activation layer, the thresholded linear unit (TLU), in place of the rectified linear unit (ReLU); and performing model performance analysis experiments on the EM challenge data set, the LUNA challenge data set and the DRIVE data set, respectively.
2. The mineral real-time segmentation method based on a multi-feature fusion decoder as claimed in claim 1, wherein a detail extraction module is added to the MA-net network structure of S4 to improve the ability to segment tiny targets in the DRIVE data set.
3. The mineral real-time segmentation method based on a multi-feature fusion decoder as claimed in claim 1 or 2, wherein, in the MA-net network structure of S4, the first convolutional layer uses 16 channels; the number of output channels of the first convolution kernel is reduced to 1/4 of the number of input channels and is used as the number of input channels of the second convolutional layer, so that the parameter count is reduced without changing the input and output channels of each decoder block.
4. The mineral real-time segmentation method based on a multi-feature fusion decoder as claimed in claim 1, wherein the channel attention mechanism introduced in S5 comprises the following steps:
A1: first performing global average pooling to maintain the maximum receptive field;
A2: then assigning a learnable weight to each channel of the feature map, so that the network model pays more attention to the main objects being classified; the channel attention mechanism adopts an ARM module.
5. The mineral real-time segmentation method based on a multi-feature fusion decoder as claimed in claim 1, wherein introducing the residual multi-kernel pooling module in S6 comprises the following steps:
B1: collecting context information with four pooling kernels of different sizes to enrich high-level semantic information;
B2: obtaining feature maps of the same size as the original feature map by bilinear interpolation, and reducing the channel dimension to 1 with a 1 × 1 convolution;
B3: concatenating the original feature map and the upsampled feature maps along the channel dimension; the residual multi-kernel pooling module is used to cope with large variations in object size in the image.
6. The mineral real-time segmentation method based on a multi-feature fusion decoder as claimed in claim 1, wherein the filter response normalization (FRN) expression used in S7 is as follows:
Figure FDA0003181224480000031
Figure FDA0003181224480000032
where x is an N-dimensional vector (N = H × W); unlike the BN layer, which subtracts the mean and then divides by the standard deviation, FRN does not subtract the mean and divides by the square root of the mean squared norm; ε is a small positive constant that prevents division by zero.
7. The mineral real-time segmentation method based on a multi-feature fusion decoder as claimed in claim 6, wherein the thresholded linear unit (TLU) expression used in S7 is as follows:
$$z_i = \max(y_i, \tau) = \mathrm{ReLU}(y_i - \tau) + \tau \tag{3}$$
where τ is a learnable parameter.
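The equation images for expressions (1) and (2) are not reproduced in this text, so the sketch below follows the FRN layer as published in the filter response normalization paper listed under Non-Patent Citations: ν² is the mean of x² over the H × W positions of each channel, the input is divided by √(ν² + ε), and a per-channel scale and offset are applied before the TLU of expression (3). Whether the patent's own expressions include the scale `gamma` and offset `beta` is an assumption, as are the function name, the (B, C, H, W) layout and the default `eps`.

```python
import numpy as np


def frn_tlu(x, gamma, beta, tau, eps=1e-6):
    """Filter response normalization followed by a thresholded linear unit.

    x: activations of shape (B, C, H, W).
    gamma, beta, tau: per-channel learnable parameters of shape (C,).
    """
    # nu^2: mean squared activation over the N = H * W positions of each channel.
    nu2 = np.mean(np.square(x), axis=(2, 3), keepdims=True)
    # No mean subtraction, unlike batch normalization; eps prevents division by zero.
    x_hat = x / np.sqrt(nu2 + eps)
    per_channel = (1, -1, 1, 1)
    y = gamma.reshape(per_channel) * x_hat + beta.reshape(per_channel)
    # TLU: z = max(y, tau), which equals ReLU(y - tau) + tau.
    return np.maximum(y, tau.reshape(per_channel))
```

Because the statistics are computed per sample and per channel, the result does not depend on the batch size, which is the stated motivation for replacing batch normalization.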
CN202110847545.6A 2021-07-27 2021-07-27 Mineral real-time segmentation method based on multi-feature fusion decoder Pending CN113570611A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110847545.6A CN113570611A (en) 2021-07-27 2021-07-27 Mineral real-time segmentation method based on multi-feature fusion decoder

Publications (1)

Publication Number Publication Date
CN113570611A (en) 2021-10-29

Family

ID=78167689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110847545.6A Pending CN113570611A (en) 2021-07-27 2021-07-27 Mineral real-time segmentation method based on multi-feature fusion decoder

Country Status (1)

Country Link
CN (1) CN113570611A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113887524A (en) * 2021-11-04 2022-01-04 North China University of Science and Technology Magnetite microscopic image segmentation method based on semantic segmentation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10361802B1 (en) * 1999-02-01 2019-07-23 Blanding Hovenweep, Llc Adaptive pattern recognition based control system and method
CN111681252A (en) * 2020-05-30 2020-09-18 Chongqing University of Posts and Telecommunications Medical image automatic segmentation method based on multipath attention fusion
CN111797779A (en) * 2020-07-08 2020-10-20 Lanzhou Jiaotong University Remote sensing image semantic segmentation method based on regional attention multi-scale feature fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SAURABH SINGH: "Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks" *
CHEN JIASHI (陈家石): "Research on Medical Image Segmentation Based on Deep Learning" (基于深度学习的医学图像分割研究) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211029
