Port ore heap segmentation and reserve calculation method based on improved UNet network
Technical Field
The invention relates to a port ore heap segmentation and reserve calculation method based on an improved UNet network, and belongs to the field of optical remote sensing image processing and deep learning.
Background
Image semantic segmentation is an important area of computer vision: it identifies objects at the pixel level by predicting the class to which each pixel in an image belongs. A port ore stacking area is a designated area of a port in which ore awaiting shipment is stockpiled. Ore heaps within one stacking area are arranged in an orderly way, but they grow and shrink as ore is continuously moved, so their shapes become irregular, and many ores are close in color to bare ground. Traditional computer vision methods therefore have difficulty detecting ore heap edges in remote sensing images. Deep learning based semantic segmentation can effectively address both the irregular shape of the ore heaps and their similarity in color to bare ground. Once the ore stacking area has been located by semantic segmentation, its area can be calculated and the ore reserves of the port can be estimated, which gives the approach high application value. Although little research has so far addressed semantic segmentation of ore heaps in remote sensing images, the development of high-spatial-resolution optical remote sensing satellites has led many researchers to apply deep learning to the semantic segmentation of optical remote sensing images, producing numerous application results.
UNet is an image semantic segmentation network proposed by Ronneberger et al. in 2015. The UNet network adopts an encoder-decoder design, has a simple structure, and is suitable for training on small data sets. Researchers in China and abroad have used UNet networks to obtain many research results in building extraction and classification, mining area change detection, forest type classification and similar tasks.
Liuhao et al. proposed SE-Unet, an improved UNet network for building land classification. SE-Unet applies feature compression and activation (squeeze-and-excitation) in the encoder downsampling process and adds four feature compression-activation modules, which act respectively on the input image, the downsampling process, the upsampling and fusion process, and the final output image. Compressing and activating the features produced by each convolution improves the network's ability to use effective features.
Sunward et al. studied change detection of mining areas in remote sensing images using an improved UNet twin (Siamese) network. The pooling layers of UNet were replaced with convolutional layers with a stride of 2, and the twin network was built as a dual-channel structure with shared weights, so that the network receives images from two periods simultaneously and extracts the differences between them. The pooling layers in UNet enlarge the receptive field and let the convolutions take in more context, but they discard much information; replacing pooling with stride-2 convolutional layers enlarges the receptive field while reducing information loss.
Wang Yahui et al. used UNet to classify forest types in high-resolution multispectral remote sensing images. They first extracted the NDVI feature and combined it with the four bands of the original image, giving five feature layers in total, built a UNet model on these features to perform the classification, and then post-processed the classification results with a conditional random field (CRF). The CRF effectively refines the edges between different ground objects and improves classification accuracy.
Disclosure of Invention
The invention aims to provide a port ore heap segmentation and reserve calculation method based on an improved UNet network, so as to solve the problems in the prior art.
A port ore heap segmentation and reserve calculation method based on an improved UNet network comprises the following steps:
step 1, making a port ore heap semantic segmentation data set based on a high-resolution optical remote sensing image;
step 2, improving the UNet network algorithm: optimizing the UNet network downsampling process with dilated (hole) convolution layers;
step 3, training the improved UNet network on the ore heap segmentation data set;
step 4, performing image semantic segmentation on image test data containing ore heaps by using the trained network;
step 5, estimating the reserve of each segmented ore heap by using an ore heap volume estimation method.
Further, in step 1, the method specifically comprises the following steps:
step 1.1, selecting the data set images for ore heap segmentation: the images are taken from the Google base map, and 40 images containing port ore stacking areas are randomly selected and cropped, covering port ore heaps of different shapes, colors, sizes and stacking modes; 36 images are used for network training and 4 images are used as the test set;
step 1.2, manually labeling the selected images with the labelme tool to generate segmentation result json files, and converting the json files into label images with the labelme tool;
step 1.3, cutting each ore stacking area image and its corresponding label image into 256 × 256 tiles, padding any tile smaller than 256 × 256 with 0 values up to 256 × 256, removing tiles that contain no valid label, and dividing the resulting 346 tiles into a training set (80%) and a validation set (20%).
Further, in step 2, the method specifically comprises the following steps:
step 2.1, constructing the improved UNet network by replacing the five convolutional layers in the downsampling process of the UNet network with dilated (hole) convolutions; compared with a standard convolutional layer, a dilated convolution has intervals between its kernel elements, and the improved UNet network adopts a 5 × 5 dilated convolution with an interval number of 1;
step 2.2, padding with 0 values in the upsampling and downsampling processes.
Further, in step 3, the method specifically comprises the following steps:
step 3.1, training on the data set with the following hyper-parameters: the initial learning rate is 0.001, the batch size is 4, the number of training epochs (epochs) is 100, and the number of segmentation classes (n_classes) is 2;
step 3.2, during training, using ReduceLROnPlateau to halve the learning rate when the validation loss has not changed over two consecutive epochs;
step 3.3, saving the training result once per epoch, and taking the result with the highest validation accuracy as the final training result.
Further, in step 4, the method specifically comprises the following steps:
step 4.1, slicing the test image into 256 × 256 tiles, and padding any tile smaller than 256 × 256 with 0 values up to 256 × 256;
step 4.2, performing semantic segmentation on each slice with the trained UNet network to obtain a gray image of size 256 × 256 whose gray values lie in the range [0, n_classes], where n_classes is the number of segmentation classes; here n_classes is 2, the two classes being ore heap and bare ground;
step 4.3, stitching the segmentation result slices together in the order of the original image, and cropping off the part that extends beyond the boundary of the original image.
Further, in step 5, the method specifically comprises the following steps:
step 5.1, extracting the contours of the labeled regions in the segmentation result image with the findContours method of opencv, each extracted contour being one ore heap;
step 5.2, calculating the number of pixels occupied by each ore heap in the image, and calculating the total number of pixels of the image;
step 5.3, calculating the geographic area actually covered by the image by using the remote sensing positioning information;
step 5.4, converting the contour lengths in the ore heap image from pixels to meters according to the ratio between the total number of pixels and the geographic area;
step 5.5, calculating the circumscribed rectangle of the ore heap from the extracted contour with the opencv minAreaRect method, and calculating the length $l_{box}$ and width $w_{box}$ of the circumscribed rectangle;
step 5.6, obtaining the radius $r$ of the cone base from the height $h$ and the stacking angle $\alpha$ of the ore heap, and estimating the width $w_{top}$ and the length $l_{top}$ of the upper surface of the terrace from the length $l_{box}$ and width $w_{box}$ of the circumscribed rectangle and the radius $r$ of the cone base:
$r = h / \tan\alpha$ (1)
$w_{top} = w_{box} - 2r$ (2)
$l_{top} = l_{box} - r$ (3);
step 5.7, estimating the upper surface area $S_{top}$ of the terrace and the lower surface area $S'_{bottom}$ of the terrace for the following cases, according to whether the ore heap is complete:
step 5.8, calculating the total base area $S_{cones}$ of the one triangular prism and two 1/4 cones on one side of the ore heap:
$S_{cones} = \pi r^{2} / 2 + w_{top} r$ (6);
step 5.9, dividing the completeness of the ore heap into three cases by comparing the overall base area $S$ of the ore heap with $S_{cones}$, and correcting the error in the lower surface area $S'_{bottom}$ of the terrace to obtain the true lower surface area $S_{bottom}$ of the terrace:
step 5.10, estimating the volume $V$ of the ore heap by using the terrace and cone volume formulas:
The invention has the following main advantages:
(1) the UNet network is improved by adopting dilated convolution, which enlarges the receptive field, reduces information loss, and improves recognition accuracy;
(2) the result is optimized with a conditional random field, which refines the segmentation edges between ore heap and bare ground and reduces the probability of mis-segmentation during semantic segmentation;
(3) ore heap reserves are estimated with an ore heap reserve estimation algorithm, which can provide guidance in fields such as financial futures pricing.
Drawings
FIG. 1 is a flow chart of the port ore heap segmentation and reserve calculation method based on an improved UNet network according to the present invention;
FIG. 2 is a schematic structural diagram of the improved UNet using dilated convolution;
FIG. 3 is a diagram of the ore heap model, wherein FIG. 3(a) is a three-view drawing of the complete ore heap model, and FIG. 3(b) shows the ore heap model after excavation.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, the invention provides a port ore heap segmentation and reserve calculation method based on an improved UNet network, which comprises the following steps:
step 1, making a port ore heap semantic segmentation data set based on a high-resolution optical remote sensing image;
step 2, improving the UNet network algorithm: optimizing the UNet network downsampling process with dilated convolution layers;
step 3, training the improved UNet network on the ore heap segmentation data set;
step 4, performing image semantic segmentation on image test data containing ore heaps by using the trained network;
step 5, estimating the reserve of each segmented ore heap by using an ore heap volume estimation method.
Further, in step 1, the method specifically comprises the following steps:
step 1.1, selecting the data set images for ore heap segmentation: the images are taken from the Google base map, and 40 images containing port ore stacking areas are randomly selected and cropped, covering port ore heaps of different shapes, colors, sizes and stacking modes; 36 images are used for network training and 4 images are used as the test set;
step 1.2, manually labeling the selected images with the labelme tool to generate segmentation result json files, and converting the json files into label images with the labelme tool;
step 1.3, cutting each ore stacking area image and its corresponding label image into 256 × 256 tiles, padding any tile smaller than 256 × 256 with 0 values up to 256 × 256, removing tiles that contain no valid label, and dividing the resulting 346 tiles into a training set (80%) and a validation set (20%).
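The tiling in step 1.3 can be illustrated with a minimal NumPy sketch. It is not the inventors' code: the function names are hypothetical, padding is assumed to be applied on the right and bottom edges, and a tile is assumed to have a valid label when its label tile contains at least one non-zero pixel.

```python
import numpy as np

TILE = 256  # tile edge length in pixels, as in step 1.3


def tile_with_zero_padding(array, tile=TILE):
    """Cut an H x W (x C) array into tile x tile pieces in row-major order.

    Assumption: the right and bottom edges are padded with 0 values so that
    both dimensions become multiples of the tile size.
    """
    h, w = array.shape[:2]
    pad = [(0, (-h) % tile), (0, (-w) % tile)] + [(0, 0)] * (array.ndim - 2)
    padded = np.pad(array, pad, mode="constant", constant_values=0)
    return [
        padded[top:top + tile, left:left + tile]
        for top in range(0, padded.shape[0], tile)
        for left in range(0, padded.shape[1], tile)
    ]


def build_dataset(image, label, train_fraction=0.8, seed=0):
    """Tile an image/label pair, drop unlabeled tiles, and split 80/20."""
    pairs = [
        (img_tile, lab_tile)
        for img_tile, lab_tile in zip(tile_with_zero_padding(image),
                                      tile_with_zero_padding(label))
        if lab_tile.any()  # assumption: "valid label" means any non-zero pixel
    ]
    order = np.random.default_rng(seed).permutation(len(pairs))
    n_train = int(round(train_fraction * len(pairs)))
    train = [pairs[i] for i in order[:n_train]]
    validation = [pairs[i] for i in order[n_train:]]
    return train, validation
```

Calling `build_dataset` once per labeled image and concatenating the returned lists would give a tile collection of the kind described above; the random split shown here is one possible choice, since the text does not say how the 80%/20% division is drawn.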
Further, in step 2, the method specifically comprises the following steps:
step 2.1, constructing the improved UNet network shown in FIG. 2 by replacing the five convolutional layers in the downsampling process of the UNet network with dilated convolutions; compared with a standard convolutional layer, a dilated convolution has intervals between its kernel elements, which enlarges the receptive field while effectively reducing information loss, and the invention adopts a 5 × 5 dilated convolution with an interval number of 1;
step 2.2, padding with 0 values in the upsampling and downsampling processes.
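To make the effect of the interval concrete, the following sketch computes the span of a dilated kernel and applies one with zero padding in plain NumPy. It assumes that an "interval number of 1" means one skipped pixel between adjacent kernel elements, which corresponds to a dilation rate of 2 in most deep learning frameworks; if the text instead means a dilation rate of 1, the layer reduces to a standard convolution. All function names are hypothetical.

```python
import numpy as np


def effective_kernel_size(kernel_size, dilation_rate):
    """Span in pixels covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation_rate - 1)


def same_padding(kernel_size, dilation_rate):
    """Zeros needed on each side to keep the output size at stride 1."""
    return (effective_kernel_size(kernel_size, dilation_rate) - 1) // 2


def dilated_conv2d_same(image, kernel, dilation_rate):
    """Single-channel dilated cross-correlation with 0-value padding."""
    k = kernel.shape[0]
    pad = same_padding(k, dilation_rate)
    padded = np.pad(image, pad, mode="constant", constant_values=0)
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            dy, dx = i * dilation_rate, j * dilation_rate
            out += kernel[i, j] * padded[dy:dy + h, dx:dx + w]
    return out


# Assumed reading of step 2.1: 5 x 5 kernel, one gap between taps (rate 2).
print(effective_kernel_size(5, 2))  # 9
print(same_padding(5, 2))           # 4
```

Under this reading a 5 × 5 kernel covers a 9 × 9 window with the same 25 weights, which is how the receptive field grows without additional parameters.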
Further, in step 3, the method specifically comprises the following steps:
step 3.1, training on the data set with the following hyper-parameters: the initial learning rate is 0.001, the batch size is 4, the number of training epochs (epochs) is 100, and the number of segmentation classes (n_classes) is 2;
step 3.2, during training, using ReduceLROnPlateau to halve the learning rate when the validation loss has not changed over two consecutive epochs;
step 3.3, saving the training result once per epoch, and taking the result with the highest validation accuracy as the final training result.
Specifically, the program runs on a machine with an Intel Core i7-9700 CPU, an NVIDIA GeForce RTX 2060 GPU (Compute Capability 7.5, 1920 CUDA cores), 16 GB of memory, and the Ubuntu 18.04 operating system, and uses Python 3.5, TensorFlow 1.13.1 and Keras 2.2.4.
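The learning-rate rule of step 3.2 and the checkpoint selection of step 3.3 can be traced with a small framework-free sketch. It is a simplified stand-in for the Keras ReduceLROnPlateau callback, not a reproduction of it: it assumes a halving factor of 0.5 and a patience of 2 epochs, treats "unchanged" as "not improved", and ignores options such as a minimum change threshold or cooldown. The function names are hypothetical.

```python
def halve_on_plateau(val_losses, initial_lr=0.001, factor=0.5, patience=2):
    """Return the learning rate used in each epoch.

    The rate is multiplied by `factor` once the validation loss has failed
    to improve for `patience` consecutive epochs.
    """
    lr = initial_lr
    best = float("inf")
    stale_epochs = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)  # rate in effect while this epoch was trained
        if loss < best:
            best = loss
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                lr *= factor
                stale_epochs = 0
    return schedule


def best_epoch(val_accuracies):
    """Index of the saved epoch with the highest validation accuracy."""
    return max(range(len(val_accuracies)), key=lambda e: val_accuracies[e])
```

For example, with validation losses `[0.9, 0.8, 0.8, 0.8, 0.7]` the first four epochs run at 0.001 and the fifth at 0.0005, because the loss stalled during the third and fourth epochs.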
Further, in step 4, the method specifically comprises the following steps:
step 4.1, slicing the test image into 256 × 256 tiles, and padding any tile smaller than 256 × 256 with 0 values up to 256 × 256;
step 4.2, performing semantic segmentation on each slice with the trained UNet network to obtain a gray image of size 256 × 256 whose gray values lie in the range [0, n_classes], where n_classes is the number of segmentation classes; here n_classes is 2, the two classes being ore heap and bare ground;
step 4.3, stitching the segmentation result slices together in the order of the original image, and cropping off the part that extends beyond the boundary of the original image.
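Step 4.3 can be sketched as follows. The sketch assumes that the predicted slices arrive in row-major order (left to right, then top to bottom) and that the zero padding of step 4.1 was added on the right and bottom edges; `stitch_tiles` is a hypothetical name.

```python
import numpy as np


def stitch_tiles(tiles, original_shape, tile=256):
    """Reassemble row-major 2-D tiles and crop to the original height and width."""
    h, w = original_shape
    rows = -(-h // tile)  # ceiling division
    cols = -(-w // tile)
    canvas = np.zeros((rows * tile, cols * tile), dtype=tiles[0].dtype)
    for index, piece in enumerate(tiles):
        r, c = divmod(index, cols)
        canvas[r * tile:(r + 1) * tile, c * tile:(c + 1) * tile] = piece
    # Discard the padded margin that lies outside the original image.
    return canvas[:h, :w]
```

Because the crop only removes rows and columns beyond the original boundary, predictions made on padded pixels never reach the stitched mask.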
Further, in step 5, the method specifically comprises the following steps:
step 5.1, extracting the contours of the labeled regions in the segmentation result image with the findContours method of opencv, each extracted contour being one ore heap;
step 5.2, calculating the number of pixels occupied by each ore heap in the image, and calculating the total number of pixels of the image;
step 5.3, calculating the geographic area actually covered by the image by using the remote sensing positioning information;
step 5.4, converting the contour lengths in the ore heap image from pixels to meters according to the ratio between the total number of pixels and the geographic area;
step 5.5, as shown in FIG. 3, regarding a complete ore heap as a terrace joined to two end sections, an incomplete ore heap being a complete ore heap from one side of which part has been excavated; each of the two end sections can be regarded as one triangular prism joined to two 1/4 cones, and the slope of the terrace equals the stacking angle of the ore heap; the stacking angle and the heap height differ slightly with the mineral type and the regulations of the stacking area, and can be looked up in a table according to the actual conditions; calculating the circumscribed rectangle of the ore heap from the extracted contour with the opencv minAreaRect method, and calculating the length $l_{box}$ and width $w_{box}$ of the circumscribed rectangle;
step 5.6, obtaining the radius $r$ of the cone base from the height $h$ of the ore heap and the stacking angle $\alpha$ (in radians), and estimating the width $w_{top}$ and the length $l_{top}$ of the upper surface of the terrace from the length $l_{box}$ and width $w_{box}$ of the circumscribed rectangle and the radius $r$ of the cone base:
$r = h / \tan\alpha$ (1)
$w_{top} = w_{box} - 2r$ (2)
$l_{top} = l_{box} - r$ (3);
step 5.7, estimating the upper surface area $S_{top}$ of the terrace and the lower surface area $S'_{bottom}$ of the terrace for the following cases, according to whether the ore heap is complete:
step 5.8, calculating the total base area $S_{cones}$ of the one triangular prism and two 1/4 cones on one side of the ore heap:
$S_{cones} = \pi r^{2} / 2 + w_{top} r$ (6);
step 5.9, dividing the completeness of the ore heap into three cases by comparing the overall base area $S$ of the ore heap with $S_{cones}$, and correcting the error in the lower surface area $S'_{bottom}$ of the terrace to obtain the true lower surface area $S_{bottom}$ of the terrace:
step 5.10, estimating the volume $V$ of the ore heap by using the terrace and cone volume formulas:
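As an illustration of the earlier sub-steps only, the sketch below evaluates the pixel-to-meter conversion of step 5.4 together with Equations (1) to (3) and (6). It does not cover the case-dependent areas of steps 5.7 and 5.9 or the volume of step 5.10. It assumes square pixels, so that the ground length of one pixel is the square root of the geographic area divided by the total pixel count; the function names and the example numbers are hypothetical.

```python
import math


def meters_per_pixel(total_pixels, ground_area_m2):
    """Ground length of one pixel edge, assuming square pixels (step 5.4)."""
    return math.sqrt(ground_area_m2 / total_pixels)


def heap_geometry(l_box_px, w_box_px, scale_m_per_px, height_m, alpha_rad):
    """Evaluate Equations (1)-(3) and (6) for one circumscribed rectangle."""
    l_box = l_box_px * scale_m_per_px
    w_box = w_box_px * scale_m_per_px
    r = height_m / math.tan(alpha_rad)            # Equation (1)
    w_top = w_box - 2 * r                         # Equation (2)
    l_top = l_box - r                             # Equation (3), as given in the text
    s_cones = math.pi * r ** 2 / 2 + w_top * r    # Equation (6)
    return {"r": r, "w_top": w_top, "l_top": l_top, "s_cones": s_cones}


# Hypothetical example: a 1000 x 1000 pixel image covering 250,000 square meters,
# a 200 x 80 pixel rectangle, a 10 m heap and a 45 degree stacking angle.
scale = meters_per_pixel(1000 * 1000, 250_000.0)
geometry = heap_geometry(200, 80, scale, 10.0, math.pi / 4)
```

In Equation (6), the term $\pi r^{2}/2$ is the combined base of the two 1/4 cones and $w_{top} r$ is the base of the triangular prism between them. The height and stacking angle used here would in practice come from the lookup table mentioned in step 5.5.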