Specific embodiments
In one embodiment, as shown in Figure 1, a remote sensing image segmentation method combining complete residual connections and multi-scale feature fusion is disclosed, comprising the following steps:
S100: improving the backbone network used for segmentation, a convolutional encoder-decoder network, specifically:
S101: using a convolutional encoder-decoder network as the backbone for segmentation, the backbone comprising two components: an encoder and a decoder;
S102: adding to the backbone a feature pyramid module that aggregates multi-scale contextual information;
S103: adding residual units inside the corresponding convolutional layers of the encoder and decoder of the backbone, while fusing the encoder features into the corresponding decoder layers by pixel-wise addition;
S200: segmenting the remote sensing image with the improved image segmentation network combining complete residual connections and multi-scale feature fusion;
S300: outputting the segmentation result of the remote sensing image.
Definition of the image segmentation network combining complete residual connections and multi-scale feature fusion: on the basis of a convolutional encoder-decoder, the network adds complete residual connections between the encoder and decoder and inside their convolutional layers, while a feature pyramid module (FPM) that aggregates multi-scale features is applied to the convolutional features of the last convolution stage of the encoder.
Specifically: first, the base network is a convolutional encoder-decoder network composed of a fully symmetric encoder and decoder. Second, short-range residual connections are added inside each convolutional layer of the encoder and decoder. A residual connection operation takes an input, learns the residual of the input data through a sequence of operation units such as a convolutional layer, a batch normalization unit and a rectified linear unit, and then adds this residual to the input to produce the output. At the same time, the features of each convolution stage of the encoder are fused into the corresponding decoder layers by pixel-wise addition; by analogy with the working principle of the residual unit, the connection of this step is called a long-range residual connection, and the short-range and long-range residual connections together are called the complete residual connections. Finally, when fusing the features of each encoder convolution stage into the corresponding decoder layers, a feature pyramid module (FPM) is applied to the features of the fifth convolution stage of the encoder, thereby aggregating contextual information at different scales, and the resulting multi-scale features are then fused into the corresponding decoder layer. The complete residual connections and the feature pyramid module described above operate simultaneously and belong to the same level of operations.
The above embodiment uses the improved image segmentation network combining complete residual connections and multi-scale feature fusion, which both simplifies the training of the deep network and enhances feature fusion; moreover, the fusion of features of different scales and modes enables the network to extract rich contextual information, cope with variations of target scale, and improve segmentation performance.
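To make the data flow concrete, the following is a minimal PyTorch-style sketch of the described forward pass; the module names (FResMFDNNSketch, enc_stages, fpm, and so on) and the channel bookkeeping of the mirrored stages are illustrative assumptions, not the exact implementation of the invention.

```python
import torch.nn as nn

class FResMFDNNSketch(nn.Module):
    """Sketch only: encoder stages with short-range residual units inside,
    max pooling with saved indices, an FPM on conv5, and long-range
    (pixel-wise addition) fusion into the mirrored decoder."""
    def __init__(self, enc_stages, dec_stages, pools, unpools, fpm):
        super().__init__()
        self.enc_stages = nn.ModuleList(enc_stages)  # 5 encoder conv stages
        self.dec_stages = nn.ModuleList(dec_stages)  # 5 mirrored decoder stages
        self.pools = nn.ModuleList(pools)            # max-pool layers saving indices
        self.unpools = nn.ModuleList(unpools)        # matching unpooling layers
        self.fpm = fpm                               # feature pyramid module

    def forward(self, x):
        skips, indices = [], []
        for stage, pool in zip(self.enc_stages, self.pools):
            x = stage(x)                 # short-range residual units live inside
            skips.append(x)              # kept for the long-range residual connection
            x, idx = pool(x)
            indices.append(idx)
        skips[-1] = self.fpm(skips[-1])  # aggregate multi-scale context at conv5
        for i, (stage, unpool) in enumerate(zip(self.dec_stages, self.unpools)):
            x = unpool(x, indices[-(i + 1)])
            x = stage(x) + skips[-(i + 1)]  # long-range fusion: pixel-wise addition
        return x
```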
In another embodiment, the encoder in step S101 comprises 13 convolutional layers and 5 pooling layers; a decoder is stacked on top of the encoder, and the decoder, a complete mirror of the encoder, comprises 13 convolutional layers and 5 unpooling layers.
In this embodiment, the encoder extracts features from the input data through convolution kernels of a certain size, an implementation that achieves a good feature extraction effect.
In another embodiment, the 13 convolutional layers of the encoder are divided into five convolution stages: the first and second convolution stages each comprise two convolutional layers, and the third, fourth and fifth convolution stages each comprise three convolutional layers.
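The stated 2-2-3-3-3 division can be summarized as a small configuration table; a sketch follows, in which the channel widths are an assumption carried over from the VGG16 weights later used for initialization.

```python
# Hypothetical layout of the 13 encoder convolutions in five stages.
ENCODER_STAGES = [
    {"convs": 2, "channels": 64},   # stage 1
    {"convs": 2, "channels": 128},  # stage 2
    {"convs": 3, "channels": 256},  # stage 3
    {"convs": 3, "channels": 512},  # stage 4
    {"convs": 3, "channels": 512},  # stage 5
]
assert sum(s["convs"] for s in ENCODER_STAGES) == 13  # 13 convolutional layers
```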
In another embodiment, each convolutional layer is followed by a batch normalization unit and a rectified linear unit, where the batch normalization unit normalizes the extracted feature data and the rectified linear unit introduces non-linearity; each convolution stage is followed by a pooling layer.
In this embodiment, the batch normalization unit alleviates the problem of shifting intermediate-layer data distributions during network training, prevents vanishing gradients and accelerates training; the rectified linear unit introduces non-linearity and improves the network's ability to express the data.
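The per-layer unit just described (convolution, then batch normalization, then a rectified linear unit) can be sketched as follows; the 3x3 kernel size and padding are assumptions.

```python
import torch.nn as nn

def conv_bn_relu(in_ch: int, out_ch: int, k: int = 3) -> nn.Sequential:
    """One convolutional unit as described: conv -> BN -> ReLU (sketch)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=k, padding=k // 2),  # feature extraction
        nn.BatchNorm2d(out_ch),  # normalize the extracted feature data
        nn.ReLU(inplace=True),   # introduce the non-linear factor
    )
```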
In another embodiment, the pooling operations in the encoder use max pooling, and the index positions of the maxima are saved.
In this embodiment, saving the max pooling indices enables the unpooling layers to enlarge smaller feature maps and obtain sparse feature maps.
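A minimal sketch of max pooling with saved indices and the corresponding unpooling, using standard PyTorch operators (the 2x2 window is an assumption):

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.randn(1, 64, 256, 256)
y, idx = pool(x)    # downsample and remember where each maximum came from
z = unpool(y, idx)  # write values back at the saved positions -> sparse feature map
assert z.shape == x.shape
```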
In another embodiment, pyramid structures commonly used to extract multi-scale contextual information, such as the spatial pyramid pooling in PSPNet or the ASPP module with atrous (dilated) convolutions in the DeepLab network, aggregate multi-scale information by parallel channel concatenation. On the one hand, this can make the network parameters excessive; on the other hand, the pooling and atrous convolution operations tend to cause local information loss and gridding artifacts respectively, ultimately impairing the local consistency of the feature maps. Therefore, the feature pyramid module (FPM) in this method, whose structure is shown in Fig. 2, first uses 3x3 and 5x5 convolution kernels to extract contextual information at different scales from the original input feature map (conv5), and then integrates it step by step so as to combine contextual features of adjacent scales. A 1x1 convolution is then applied to the original input feature map (conv5) and multiplied pixel-wise with the multi-scale features. Finally, global pooling information is fused to improve the performance of the feature pyramid module. The Upsample in Fig. 2 means that the size of a feature map is enlarged to a given resolution by a deconvolution operation.
In this embodiment, the feature pyramid module reduces the computational burden and causes neither local information loss nor gridding artifacts.
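A loose sketch of such an FPM is given below; the exact wiring of Fig. 2 is not reproduced here, so the branch arrangement, channel widths and the use of nearest-neighbor interpolation in place of deconvolution are all assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class FPMSketch(nn.Module):
    """Illustrative feature pyramid module: multi-scale context from 3x3 and
    5x5 convolutions, gradual pixel-wise aggregation, pixel-wise multiplication
    with a 1x1 projection of conv5, and fused global pooling information."""
    def __init__(self, ch: int):
        super().__init__()
        self.branch3 = nn.Conv2d(ch, ch, 3, padding=1)  # 3x3 context branch
        self.branch5 = nn.Conv2d(ch, ch, 5, padding=2)  # 5x5 context branch
        self.proj = nn.Conv2d(ch, ch, 1)                # 1x1 on the original conv5
        self.gap_proj = nn.Conv2d(ch, ch, 1)            # projects global pooling info

    def forward(self, conv5):
        c3 = self.branch3(conv5)
        c5 = self.branch5(conv5)
        multi = c3 + c5                        # combine adjacent-scale context
        out = self.proj(conv5) * multi         # pixel-wise multiplication
        gap = F.adaptive_avg_pool2d(conv5, 1)  # global pooling information
        out = out + F.interpolate(self.gap_proj(gap), size=out.shape[2:])
        return out
```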
In another embodiment, the feature pyramid module is used before the features of the fifth convolution stage of the encoder are fused into the corresponding decoder layer.
In this embodiment, since the high-level feature maps of the deeper layers have smaller resolution, using larger convolution kernels does not bring an excessive computational burden, so the feature pyramid module is chosen to operate at the conv5 stage.
In another embodiment, the step-by-step integration aggregates the multi-scale information in a gradual pixel-by-pixel addition manner.
In this embodiment, aggregating the multi-scale information by gradual pixel-wise addition takes the hierarchical dependency of the features into account and maintains the local consistency of the feature information at different scales.
In another embodiment, fusing the encoder features into the corresponding decoder layers by pixel-wise addition as described in step S103 is specifically: for the first and second convolution stages of the encoder, only the feature map of the last convolutional layer is selected; for the third, fourth and fifth convolution stages of the encoder, all convolutional feature maps are selected; and the selected maps are fused by pixel-wise addition.
In this embodiment, the loss of feature map resolution is reduced; the selection rule is summarized in the sketch below.
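Simple bookkeeping of the stated selection rule; the VGG-style layer names (conv1_2, conv3_1, ...) are hypothetical labels for illustration only.

```python
# Which encoder feature maps enter the long-range pixel-wise fusion.
FUSED_MAPS = {
    1: ["conv1_2"],                        # stages 1-2: last layer only
    2: ["conv2_2"],
    3: ["conv3_1", "conv3_2", "conv3_3"],  # stages 3-5: all convolutional layers
    4: ["conv4_1", "conv4_2", "conv4_3"],
    5: ["conv5_1", "conv5_2", "conv5_3"],
}
```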
In another embodiment, residual units as shown in Fig. 3 are added inside the corresponding convolution stages of the encoder and decoder, referred to as short-range residual connections. In Fig. 3, $X_l$ and $y$ denote the input and output of a residual unit, and $F(X_l)$ denotes the residual learned by the unit, obtained through a series of operations such as convolutional layers, batch normalization (BN) units and rectified linear units (ReLU). The convolutional layers extract features, the batch normalization units normalize the extracted feature data, and the rectified linear units introduce non-linearity. The unit computes $y = F(X_l) + X_l$; in the special case where the residual $F(X_l) = 0$, the output equals the input. By analogy with the principle of the residual unit of Fig. 3, the feature fusion in step S103 can be regarded as a long-range residual connection, which together with the short-range residual connections constitutes the complete residual connections. On the one hand, this solves the vanishing-gradient problem that appears as deep networks grow deeper; on the other hand, regarding the contour information that deep networks lose through convolution operations, the complete residual connections fuse not only the multi-scale features but also the original input information of the layer, thereby compensating for the lost information to a certain extent and further enhancing feature fusion.
In this embodiment, the residual units effectively prevent gradients from vanishing.
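A sketch of the short-range residual unit of Fig. 3, computing y = F(X_l) + X_l; the depth of F (two conv-BN pairs) and the final ReLU are assumptions about details left open here.

```python
import torch.nn as nn

class ResidualUnit(nn.Module):
    """Short-range residual unit sketch: y = F(x) + x."""
    def __init__(self, ch: int):
        super().__init__()
        self.residual = nn.Sequential(       # F(x): conv/BN/ReLU operations
            nn.Conv2d(ch, ch, 3, padding=1),
            nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
            nn.BatchNorm2d(ch),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # If the learned residual is zero, the output equals the input.
        return self.relu(self.residual(x) + x)
```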
In another embodiment, a workstation running a 64-bit Ubuntu system is used, with an Intel(R) Xeon(R) E5-2690 v3 2.6 GHz processor, 256 GB of memory and a 4 TB hard disk as hardware configuration. The whole network is trained on the Caffe deep learning platform, accelerated during training with one NVIDIA Tesla K40c GPU with 12 GB of video memory. The network parameters are initialized from VGG16 weights pre-trained on the ImageNet dataset; the remaining layer parameters are initialized with the MSRA initialization method proposed by He et al. (2015), which, considering only the number of inputs n, makes the weights obey a Gaussian distribution with mean 0 and variance 2/n. During training, the learning rate is fixed at 0.0001, the batch_size is 5, gamma is 1, the weight decay is 0.0002, the momentum is 0.99, and the maximum number of iterations is 100000.
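The original training runs on Caffe; purely as an illustration of the stated solver settings, a PyTorch-style equivalent might look as follows (the function name and the use of kaiming_normal_ for the MSRA scheme are assumptions):

```python
import torch
import torch.nn as nn

def configure_training(net: nn.Module) -> torch.optim.Optimizer:
    """Mirror the stated solver settings: fixed lr 0.0001 (gamma = 1),
    momentum 0.99, weight decay 0.0002; batch size 5, up to 100000 iterations."""
    def msra_init(m):
        if isinstance(m, nn.Conv2d):
            # He et al. (2015): zero-mean Gaussian with variance 2/n
            nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
    net.apply(msra_init)  # layers not covered by the pre-trained VGG16 weights
    return torch.optim.SGD(net.parameters(), lr=0.0001,
                           momentum=0.99, weight_decay=0.0002)
```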
In the back-propagation phase of training, the error is computed by the cross-entropy loss function, and the weights of the whole network are updated by stochastic gradient descent. The cross-entropy loss function is defined as follows:

$L(p, l; \theta) = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K}\sigma(l_i = k)\,\log p_{k,i}$

where $l_i$ denotes the true label at pixel $i$, $p_{k,i}$ denotes the output probability that pixel $i$ belongs to the $k$-th class, $K$ denotes the total number of classes, $N$ denotes the total number of pixels in the batch of images, and $\sigma(\cdot)$ denotes an indicator function that equals 1 when $l_i = k$ and 0 otherwise. $l$ denotes the set of true labels, $p$ denotes the output of the last convolutional layer in the decoder, $\theta$ denotes the parameter set of the loss function, and log is taken to base 10 by default.
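A sketch of this per-pixel loss; applying a softmax to the decoder output to obtain p_{k,i}, and dividing by ln 10 to obtain the base-10 logarithm, are assumptions consistent with the definitions above.

```python
import torch
import torch.nn.functional as F

def cross_entropy_base10(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Mean over the N pixels of a batch of -log10 p_{l_i, i}.
    logits: (B, K, H, W) decoder output; labels: (B, H, W) true classes (long)."""
    log10p = F.log_softmax(logits, dim=1) / torch.log(torch.tensor(10.0))
    # sigma(l_i = k) picks out the probability of the true class at each pixel
    picked = log10p.gather(1, labels.unsqueeze(1)).squeeze(1)
    return -picked.mean()
```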
In the field of deep learning, convolutional neural networks use the back-propagation algorithm to propagate the error at the end of the network back to each layer, so that these layers can update their own weights, ultimately making each layer of the convolutional neural network better at extracting features. The standard procedure of the back-propagation (BP) algorithm comprises a forward propagation phase and a backward propagation phase. In the forward propagation phase, the features of the input image are learned from the initially given weights and a prediction is finally obtained at the end of the network; there is an error between this prediction and the actually given label value, and no weight update is involved in this phase. In order for the weights of each layer of the network to better model the distribution of image features, this error needs to be passed back layer by layer in the backward propagation phase to update the weights of each layer. After the weights have been updated by multiple forward and backward propagation passes, the prediction finally learned by the network comes closer to the true label value. Stochastic gradient descent is used when updating the weights. The aforementioned error is computed by a loss function; in this method, the cross-entropy loss function is used to compute the error between the forward propagation output and the true label value.
In another embodiment, the performance of the proposed network in segmenting remote sensing images is verified on the following two datasets, to which data augmentation is applied; they are described as follows:
(1) ISPRS Vaihingen Challenge Dataset: the benchmark dataset of the ISPRS 2D semantic labeling challenge in Vaihingen, composed of 3-band IRRG (near-infrared, red, green) image data and the corresponding digital surface model (DSM) and normalized digital surface model (NDSM) data. The dataset contains 33 images of unequal size with a ground sampling distance of 9 cm, 16 of which have label maps; each image is labeled with six classes, namely impervious surfaces (Impervious surfaces), building (Building), low vegetation (Low vegetation), tree (Tree), car (Car) and clutter or background (Clutter/Background). From the 16 labeled images, 12 are randomly selected as the training set, 2 as the validation set and 2 as the test set. This dataset is relatively small for training a deep network, so image patches of 256x256 are selected in the experiments to train the network. The training and validation sets divided as above are also relatively small for training a deep network, so we augment the data of the training and validation sets with a two-stage method. In the first stage, for a given image, a sliding window of size 256x256 with a stride of 128 is first used to crop patches from the IRRG image and its corresponding label map, and image patches are then extracted at 3 fixed positions (namely the upper right, lower left and lower right corners). In the second stage, all image patches are first rotated by 90, 180 and 270 degrees, and all rotated patches are then mirror-flipped horizontally and vertically. Finally, 15000 training samples and 2045 validation samples are obtained. Both augmentation stages are sketched below.
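A sketch of the two augmentation stages using PIL; the helper names and default sizes are illustrative.

```python
from PIL import Image

def sliding_crops(img: Image.Image, size: int = 256, stride: int = 128):
    """First stage: crop size x size patches with the given stride."""
    w, h = img.size
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            yield img.crop((left, top, left + size, top + size))

def rotate_and_mirror(patch: Image.Image):
    """Second stage: rotate by 90/180/270 degrees, then mirror each rotation
    horizontally and vertically."""
    out = []
    for angle in (90, 180, 270):
        r = patch.rotate(angle, expand=True)
        out.append(r)
        out.append(r.transpose(Image.FLIP_LEFT_RIGHT))  # horizontal mirror
        out.append(r.transpose(Image.FLIP_TOP_BOTTOM))  # vertical mirror
    return out
```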
(2) Road Detection Dataset: this dataset was collected from Google Earth by Cheng et al. (2017), together with manually labeled road segmentation reference maps and the corresponding centerline reference maps, and is currently the largest road dataset. It contains 224 high-resolution images with a spatial resolution of 1.2 m; each image has at least 600x600 pixels, and the roads are about 12~15 pixels wide. We randomly divide the 224 images into 180 training images, 14 validation images and 30 test images. In the experiments, image patches of 300x300 are selected to train the network. Likewise, the training and validation sets are augmented with a two-stage method. In the first stage, for a given image, image patches are first extracted at 4 fixed positions (namely the upper left, upper right, lower left and lower right corners), and a sliding window of size 300x300 is then used to randomly crop 25 image patches from the original image and the label map. In the second stage, all image patches are first rotated in steps of 90 degrees and then flipped in the horizontal and vertical directions. Finally, 31320 training samples and 2436 validation samples are obtained.
In another embodiment, in order to verify the effectiveness of the image segmentation network of the present invention, it is compared with the following networks, described as follows: four semantic segmentation networks, namely FCN8s (Long et al., 2015), DeconvNet (Noh et al., 2015), SegNet (Badrinarayanan et al., 2017) and U-Net (Ronneberger et al., 2015).
Among these four semantic segmentation networks, FCN8s has the simplest structure: the encoding part of the VGG16-based FCN8s network comprises 15 convolutional layers and 5 pooling layers, and its decoding part enlarges the feature maps of the third, fourth and fifth convolution stages by deconvolution and adds them successively for feature fusion before finally performing pixel-wise class prediction. The DeconvNet, SegNet and U-Net networks can all be classified as fully symmetric encoder-decoder networks of comparable structural depth; their encoders are all completed by convolution and pooling operations, the decoders of DeconvNet and SegNet are completed by unpooling and deconvolution (or convolution) operations, and the decoder of the U-Net network is completed by deconvolution operations only. Compared with FCN8s, the decoding process of such encoder-decoder networks is deeper. In terms of feature fusion, both the FCN8s and U-Net networks perform feature fusion: FCN8s successively adds and fuses the feature maps of the third, fourth and fifth stages of the encoder, while the U-Net network copies and fuses the last-layer feature map of each convolution stage of the encoder into the corresponding decoder layers, so that more feature information is fused and the fusion manner is more complex. The DeconvNet and SegNet networks do not use feature fusion in the decoding process; they only progressively enlarge the high-level features of the encoder to feature maps of the same size as the input image and finally perform pixel-wise class prediction.
The image segmentation network of the present invention can also be classified as an encoder-decoder network and is very similar in structure to the U-Net network, but differs in four respects. First, the fusion manner differs: the image segmentation network of the present invention fuses the feature maps of the encoder into the corresponding decoder layers by pixel-wise addition, whereas the U-Net network performs feature fusion by channel concatenation; compared with channel concatenation, pixel-wise addition does not add extra parameters to the network. Second, the fused content differs: since the successive convolution and pooling operations in the encoder degrade feature map resolution, the image segmentation network of the present invention selects for fusion the last convolution feature of the first and second stages and all convolution features of the third, fourth and fifth stages, whereas the U-Net network selects only the last-layer feature of each convolution stage of the encoder. Third, multi-scale features are fused: before the feature maps of the fifth stage are fused into the corresponding layer, the image segmentation network of the present invention extracts multi-scale feature information with the feature pyramid module and can thus cope with multi-scale variations of targets, whereas the U-Net network does not fuse features of different scales. Fourth, complete residual connections: the image segmentation network of the present invention adds residual connections inside the corresponding convolutional layers of the encoder and decoder, which together with the feature fusion in the network constitute the complete residual connections; these allow gradients to propagate directly to any convolutional layer and simplify the training process, whereas the U-Net network uses no residual connections.
In another embodiment, to quantitatively evaluate the performance of the segmentation networks, the following evaluation metrics are used; their explanations and definitions are as follows: F1-score, overall accuracy (OA), and intersection over union (IOU).
The F1-score is the harmonic mean of precision (P) and recall (R) and is a comprehensive evaluation metric; overall accuracy (OA) measures the percentage of correctly labeled pixels in the total number of image pixels. They are defined respectively as follows:

$P = \frac{TP}{TP + FP}$, $R = \frac{TP}{TP + FN}$, $F1 = \frac{2PR}{P + R}$, $OA = \frac{TP + TN}{TP + TN + FP + FN}$

where TP (true positive) denotes positive-class pixels judged as positive; FP (false positive) denotes negative-class pixels judged as positive; FN (false negative) denotes positive-class pixels judged as negative; and TN (true negative) denotes negative-class pixels judged as negative.
IOU is a standard metric for semantic segmentation, denoting the ratio of the number of pixels in the intersection of the predicted set and the true label set to the number of pixels in their union. It is defined as follows:

$IOU = \frac{|P_{gt} \cap P_m|}{|P_{gt} \cup P_m|}$

where $P_{gt}$ is the pixel set of the ground-truth map, $P_m$ is the pixel set of the predicted image, "∩" and "∪" denote the intersection and union operations respectively, and $|\cdot|$ denotes counting the number of pixels in the set.
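The metrics above reduce to simple confusion counts; a sketch for the binary (road/background) case:

```python
import numpy as np

def binary_metrics(pred: np.ndarray, gt: np.ndarray):
    """P, R, F1, OA and IOU from the confusion counts defined above."""
    tp = np.sum((pred == 1) & (gt == 1))
    fp = np.sum((pred == 1) & (gt == 0))
    fn = np.sum((pred == 0) & (gt == 1))
    tn = np.sum((pred == 0) & (gt == 0))
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    oa = (tp + tn) / (tp + tn + fp + fn)
    iou = tp / (tp + fp + fn)  # |Pgt ∩ Pm| / |Pgt ∪ Pm| for the positive class
    return p, r, f1, oa, iou
```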
In another embodiment, experiments are carried out on the ISPRS Vaihingen test set as follows: on ISPRS Vaihingen, the segmentation results of this method and of the advanced deep networks are shown in Fig. 4. The input image size of all networks is 256x256, all inputs are IRRG three-channel color images only, and the output is a predicted label map of the same size as the input image. From top to bottom, Fig. 4 shows the IRRG image, the label map, and the segmentation results of FCN8s, DeconvNet, SegNet, U-Net and FRes-MFDNN.
The targets in each image vary in size and shape, and there is a certain amount of shadow occlusion. For example, the low vegetation and trees in the first and fifth images are densely distributed; the heights of trees and buildings produce large shadowed areas in the original images, and some shadows occlude cars and road surfaces. As can be seen from Fig. 4, the segmentation results of the FCN8s and DeconvNet networks are poor; the DeconvNet results differ considerably from the ground-truth label maps, with blurred details at object edges and discontinuous segmentation inside single targets. Compared with FCN8s, the SegNet network, thanks to its deeper decoding process and its use of the location indices obtained during pooling, produces segmentation results closer to the ground-truth label maps, better preserves the detail information of targets, and mis-segments less than the FCN8s and DeconvNet networks. U-Net copies and fuses the features of the corresponding encoder stages into the corresponding decoder stages, so its segmentation results are closer to the ground-truth label maps and the target details are relatively clear. Because the network in this method uses complete residual connections between the corresponding encoder and decoder layers and fuses the multi-scale information of the high-level features, its segmentation results are very close to the ground-truth label maps, with clearer target details and fewer mis-segmentations; this shows that, to a certain extent, this method can cope with the diversity of target sizes and the influence of shadows in the original images and improves segmentation accuracy.
Fig. 5 gives the quantitative evaluation results corresponding to Fig. 4, where bold represents the best result and underline represents the second-best result. Precision (P) and recall (R) measure the completeness and correctness of the segmentation respectively; ideally, both precision and recall are high. The metrics of this method are the highest on every image; in addition, its average precision and average recall are about 3% and 2% higher than the second-best results respectively. Both the qualitative and quantitative results show that this method is closer to the ground-truth maps in urban remote sensing image segmentation and performs better.
The evaluation results of each deep network and of this method on the ISPRS Vaihingen test images are shown in Fig. 6(a) and Fig. 6(b). Although Fig. 6(a) and Fig. 6(b) show that some comparison algorithms achieve good results on the IOU and F1 metrics, this method is optimal in the IOU and F1 values of each class and in the average performance over the whole test set. Specifically, the average IOU of this method is about 6% higher than the second-best result (U-Net), and its average F1 value is about 4% higher than the second-best result, which fully demonstrates the effectiveness of this method in urban remote sensing image segmentation.
Fig. 7 gives the comparison between this method and methods from documents with good current segmentation performance. Paisitkriangkrai et al. (2015) proposed a CNN+RF segmentation network combining a CNN with a random forest (RF), in which the CNN is mainly used to extract features and the RF is used for classification. Volpi and Tuia (2017) proposed applying a deconvolution network to remote sensing image segmentation; their network is composed of a symmetric encoder and decoder, the encoder completed by eight convolutional layers and three pooling layers and the decoder mirroring the encoder, with a 1x1 convolutional layer connecting the encoding and decoding processes. Sherrah (2016) used atrous convolutions to segment remote sensing images and smoothed the segmentation results with a CRF; Maggiori et al. (2016) incorporated a CRF at the end of an encoder-decoder network as post-processing in the training process of the deep network. Audebert et al. (2017) used a symmetric encoder-decoder network, the encoder composed of convolutional and pooling layers and the decoder composed of deconvolution and unpooling layers. The above experimental results are taken from the original documents, and each method uses roughly the same number of training samples. The comparison results in Fig. 7 show that in the F1 value of every class and in the overall segmentation accuracy, the segmentation effect of this method is better than all compared methods.
In order to further verify the segmentation performance of this method, three images of the ISPRS Vaihingen dataset without label maps, area4, area31 and area35, are fed into each comparison network for testing; partial results are shown in Fig. 8, which shows, from left to right, the original image and the segmentation results of FRes-MFDNN, U-Net, SegNet, FCN8s and DeconvNet. With no label maps for reference, it can be seen by comparing against the original images (first column) that this method is better than the other comparison networks in the correctness, completeness and object-boundary smoothness of the segmentation.
In another embodiment, experiments are carried out on the Road Detection dataset as follows: on the Road Detection Dataset, the segmentation results of this method and of each deep network are shown in Figure 9. The input of all networks is a 300x300 RGB three-channel image, and the output is a prediction map of the same size as the input image, where black represents background and white represents road. From top to bottom, Fig. 9 shows the RGB image, the label map, and the segmentation results of FRes-MFDNN, U-Net, SegNet, DeconvNet and FCN8s.
The first row of Fig. 9 gives five images that differ in spectral information and background complexity; parts of the roads are occluded by trees and cars; in the fourth image, some residential houses are extremely close to the road in spectral information; and the fourth and fifth images also contain visibly trampled loess road surfaces. All these factors add a certain challenge to the segmentation. As can be seen from Fig. 9, the segmentation results of the FCN8s and DeconvNet networks differ considerably from the ground-truth label maps, with large mis-segmented and missed areas and poor continuity of the segmented roads. The segmentation results of the SegNet network are more similar to the ground-truth label maps, with mis-segmented areas clearly reduced compared with DeconvNet, although missed segmentations still exist. The segmentation results of the U-Net network and of this method are the most similar to the ground-truth label maps, and mis-segmentations and omissions are also greatly reduced compared with the other networks. Compared with the U-Net network, the segmentation results of this method have more complete detail information; where cars and trees occlude the road, the segmented road edges are smoother and the spatial consistency is higher.
Figure 10 gives the evaluation results corresponding to Fig. 9, where bold represents the best value and underline represents the second-best value. Similarly, although some comparison methods achieve good results in precision or recall, the metrics of this method are almost the highest on every image, and its average precision and average recall are 2% and 3% higher than those of the second-best method respectively. Both the qualitative and quantitative results show that this method is closer to the ground-truth maps in segmenting remote sensing road images and performs better.
Figure 11 gives the average IOU and average F1 value of each method on the Road Detection test set. It can be seen that the average IOU and average F1 value of this method are both clearly higher than those of the other methods; the average IOU is even 4% higher than that of the second-best U-Net method, and the average F1 value reaches 93%, fully demonstrating the good segmentation performance of this method on this dataset.
Figure 12 compares this method with existing road segmentation research methods on the Road Detection dataset, including each method's average IOU, average F1 value, training time (h) and inference time for one image (s/p denotes seconds per image) on this dataset.
In Figure 12, Zhang et al. (2018) proposed the Res-unet network comprising a three-layer encoder and a three-layer decoder, in which the encoding process is completed by convolution operations and the decoding process by bilinear interpolation; the last-layer feature map of each encoder stage is copied and fused into the corresponding decoder stage, and residual connections are introduced in the encoder and decoder. Ronneberger et al. (2015) proposed U-Net for medical image segmentation, and many studies currently apply it to remote sensing image segmentation tasks. Panboonyuen et al. (2017) proposed the ELU-SegNet structure, which replaces the ReLU activation function with the ELU activation function on the basis of the SegNet network. Cheng et al. (2017) proposed the Cascaded-net structure composed of a four-layer encoder and a four-layer decoder, where the encoder is completed by convolution and pooling operations and the decoder by deconvolution and unpooling. The Res-unet, ELU-SegNet and Cascaded-net networks were all proposed for road segmentation applications. The above results were all obtained in experiments on the Road Detection dataset, on the same Caffe deep learning platform configuration used to train the network of this method. It can be seen from Figure 12 that although this method takes slightly more training and inference time than the Res-unet and U-Net networks, the gap is small, and this method outperforms the other methods in both average IOU and average F1 value.
In order to further verify the performance of each comparison network in segmenting remote sensing road images, we collected images of city blocks over St. Louis, USA, from Google Maps; all images are three-channel RGB color images with a spatial resolution of 2.0 m. They are fed into each trained network for testing, and partial results are shown in Figure 13, which shows, from left to right, the original image and the segmentation results of the network of this method, U-Net, SegNet, DeconvNet and FCN8s.
Although the collected images differ from the Road Detection dataset used to train the networks in background complexity, spectral information and spatial resolution, it can be seen from Figure 13 that, compared with the other comparison methods, this method segments the roads better, effectively rejects most of the background interference, and overcomes the influence of the difference in spatial resolution. This also fully demonstrates the robustness of this method in segmenting remote sensing road images.
Although the embodiments of the present invention have been described above in conjunction with the accompanying drawings, the present invention is not limited to the above specific embodiments and application fields. The above specific embodiments are merely illustrative and instructive, not restrictive; those skilled in the art, under the enlightenment of this specification and without departing from the scope protected by the claims of the present invention, may also make many other forms, all of which fall within the scope of protection of the present invention.