CN110059698A - Semantic segmentation method and system based on edge dense reconstruction for street-scene understanding - Google Patents
Semantic segmentation method and system based on edge dense reconstruction for street-scene understanding
- Publication number
- CN110059698A CN110059698A CN201910359119.0A CN201910359119A CN110059698A CN 110059698 A CN110059698 A CN 110059698A CN 201910359119 A CN201910359119 A CN 201910359119A CN 110059698 A CN110059698 A CN 110059698A
- Authority
- CN
- China
- Prior art keywords
- feature
- edge
- image
- semantic segmentation
- coding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to a semantic segmentation method and system based on edge dense reconstruction for street-scene understanding. The method comprises: preprocessing the training-set input images by standardizing them and cropping them to a uniform size; extracting generic features with a convolutional network, deriving a three-level context spatial pyramid fusion feature from them, and using the cascade of these two parts as the encoding network to extract the encoding feature; enlarging the encoding feature to half the input size, computing edge features from the convolutional network's middle layers, and using a dense net that combines the half-input-size encoding feature with the edge features as the decoding network to reconstruct the image resolution and obtain the decoding feature; computing the semantic segmentation loss and an auxiliary edge-supervision loss, and training the deep neural network with the objective of minimizing their weighted sum; and performing semantic segmentation on images to be segmented with the trained deep neural network model and outputting the segmentation result. The method and system help improve the accuracy and robustness of image semantic segmentation.
Description
Technical field
The present invention relates to the technical field of computer vision, and in particular to a semantic segmentation method and system based on edge dense reconstruction for street-scene understanding.
Background art
Image semantic segmentation is an important branch of computer vision within the field of artificial intelligence, and a key part of image understanding in machine vision. Image semantic segmentation assigns every pixel of an image to its correct category, so that the prediction is consistent with the visual content of the image itself; the task is therefore also called pixel-level image classification.
Because image semantic segmentation shares some similarity with image classification, all kinds of image classification networks are commonly used as the backbone of a semantic segmentation network, usually after removing the final fully connected layer, and such backbones are often interchangeable. The pooling layers in the backbone may be removed, or atrous convolution used, to obtain larger feature maps, and a final convolutional layer with 1×1 kernels then produces the semantic segmentation result. Compared with image classification, image semantic segmentation is more difficult: it needs not only global contextual information but also fine local information to determine the category of every pixel. A backbone network is therefore usually used to extract more global features, and shallow features from the backbone are then combined to reconstruct the feature resolution back to the original image size. Since the feature maps first shrink and then grow again, the first part is commonly called the encoding network and the second the decoding network. During encoding, different receptive fields and scale information are usually combined to better capture objects of different sizes, for example with atrous spatial pyramid pooling; but that technique enlarges the spacing of the convolution kernel and thereby ignores interior pixels, while also failing to compensate for its limited expressive power with more global contextual information. Meanwhile, in existing semantic segmentation methods the decoding stage usually restores resolution simply from the previous stage's features and then combines shallow features of the corresponding size to compensate for the information lost during encoding; this neither effectively reuses the valid features produced during resolution reconstruction, nor specifically addresses the blurred object boundaries that appear after the image resolution is rebuilt.
Summary of the invention
The purpose of the present invention is to provide a semantic segmentation method and system based on edge dense reconstruction for street-scene understanding that help improve the accuracy and robustness of image semantic segmentation.
To achieve the above object, the technical scheme is a semantic segmentation method based on edge dense reconstruction for street-scene understanding, comprising the following steps:
Step A: preprocess the training-set input images: first subtract the image mean from each image to standardize it, then randomly crop the images to a uniform size to obtain preprocessed images of identical size;
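The preprocessing of Step A can be sketched as follows; the crop size, the random generator and its seed are illustrative choices, not specified by the patent.

```python
import numpy as np

def preprocess(img, mean, crop_size, rng=None):
    """Subtract the per-channel image mean to standardize, then take a
    random crop of uniform size (a minimal sketch of Step A)."""
    rng = rng or np.random.default_rng(0)
    x = img.astype(np.float32) - mean            # standardization
    h, w = x.shape[:2]
    ch, cw = crop_size
    top = rng.integers(0, h - ch + 1)            # random crop origin
    left = rng.integers(0, w - cw + 1)
    return x[top:top + ch, left:left + cw]

img = np.arange(6 * 8 * 3, dtype=np.float32).reshape(6, 8, 3)
mean = img.mean(axis=(0, 1))                     # per-channel mean
patch = preprocess(img, mean, (4, 4))
print(patch.shape)  # (4, 4, 3)
```

In practice the mean would be computed over the whole training set rather than a single image.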
Step B: extract a generic feature F_backbone with a convolutional network, then derive from F_backbone a three-level context spatial pyramid fusion feature F_tspp to capture multi-scale contextual information, and use the cascade of these two parts as the encoding network to extract the encoding feature F_encoder;
Step C: enlarge F_encoder to half the input image size to obtain the half-input-size encoding feature F_us, choose middle-layer features F_mid^os from the convolutional network to compute edge features F_edge^os, combine F_us with the edge features F_edge^os in a dense net used as the decoding network, reconstruct the image resolution, and compute the decoding feature F_decoder;
Step D: obtain the semantic segmentation probability map and the edge probability maps from the decoding feature F_decoder and the edge features F_edge^os respectively, compute the edge image annotations from the semantic image annotations in the training set, compute the semantic segmentation loss and the auxiliary edge-supervision loss separately from the probability maps and their corresponding annotations, and train the whole deep neural network with the objective of minimizing their weighted sum;
Step E: perform semantic segmentation on the image to be segmented using the trained deep neural network model, and output the segmentation result.
Further, in step B, the generic feature F_backbone is extracted with a convolutional network, the three-level context spatial pyramid fusion feature F_tspp is then derived from F_backbone, and the cascade of these two parts is used as the encoding network to extract the encoding feature F_encoder, comprising the following steps:
Step B1: extract the generic feature F_backbone from the preprocessed image using a convolutional network;
Step B2: apply a 1×1 convolution to F_backbone for feature dimension reduction, obtaining a dimension-reduced feature;
Step B3: apply average pooling over the whole of F_backbone, upsample the result back to full size with nearest-neighbor interpolation, and obtain the image-level feature F_image through a 1×1 convolution;
Step B4: apply an atrous convolution with rate r_as to F_backbone to obtain an atrous feature; then concatenate the three-level context features — the dimension-reduced feature, F_image and the atrous feature — and fuse them with a 1×1 convolution to obtain the three-level context fusion feature for rate r_as. Batch normalization keeps the distribution of the inputs identical across the convolutions, and the rectified linear unit is used as the activation function. The atrous convolution is computed as follows:
y_as[m_as] = Σ_{k_as} x_as[m_as + r_as · k_as] · w_as[k_as]
where y_as[m_as] denotes the result of the atrous convolution with rate r_as at output coordinate m_as, x_as[m_as + r_as · k_as] denotes the input reference pixel of x_as at position m_as for rate r_as and atrous kernel coordinate k_as, and w_as[k_as] denotes the weight of the atrous convolution kernel at position k_as;
Step B5: repeat the previous step with different rates until n_tspp features are obtained, then concatenate these n_tspp features with the dimension-reduced feature and F_image to obtain the three-level context spatial pyramid fusion feature F_tspp;
Step B6: apply a 1×1 convolution to F_tspp for dimension reduction, then regularize with dropout as used in deep learning to obtain the final encoding feature F_encoder.
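The atrous convolution of step B4 can be illustrated in one dimension; the kernel and rate values below are illustrative, and only positions where the dilated kernel fits entirely inside the input are computed.

```python
import numpy as np

def atrous_conv1d(x, w, rate):
    """y[m] = sum_k x[m + rate*k] * w[k], the 1-D analogue of the
    atrous convolution formula above (valid positions only)."""
    n, ksize = len(x), len(w)
    span = rate * (ksize - 1)                 # receptive field minus one
    return np.array([sum(x[m + rate * k] * w[k] for k in range(ksize))
                     for m in range(n - span)])

x = np.arange(8, dtype=float)                 # [0, 1, ..., 7]
w = np.array([1.0, 1.0, 1.0])                 # 3-tap kernel
y1 = atrous_conv1d(x, w, rate=1)              # ordinary dense convolution
y2 = atrous_conv1d(x, w, rate=2)              # dilated: wider receptive field
print(y1)  # [ 3.  6.  9. 12. 15. 18.]
print(y2)  # [ 6.  9. 12. 15.]
```

With rate 2 each output sums input samples two apart, which is exactly the kernel-spacing enlargement (and the skipped interior pixels) that the background section criticizes.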
Further, in step C, the encoding feature F_encoder is enlarged to half the input image size to obtain the half-input-size encoding feature F_us, middle-layer features F_mid^os are chosen from the convolutional network to compute edge features F_edge^os, and F_us is combined with the edge features F_edge^os in a dense net used as the decoding network to reconstruct the image resolution and compute the decoding feature F_decoder, comprising the following steps:
Step C1: define the ratio of the original input image size to a feature's size as that feature's output stride, and process the encoding feature F_encoder with nearest-neighbor interpolation to obtain the feature map F_us with output stride 2;
Step C2: choose from the convolutional network that extracts the generic features a middle-layer feature F_mid^os with output stride os; first reduce its dimension with a 1×1 convolution, then enlarge it with bilinear interpolation to obtain the edge feature F_edge^os;
Step C3: concatenate F_us and F_edge^os; after a 1×1 convolution for dimension reduction, apply a 3×3 convolution to extract the decoding feature F_decoder;
Step C4: choose an output stride os smaller than the one in step C2; if all output strides have been processed, the decoding feature extraction is complete; otherwise concatenate F_us and F_decoder as the new F_us and repeat steps C2 to C3.
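Steps C1 to C4 can be sketched at the level of feature-map shapes. The channel counts and the output-stride schedule (8, 4, 2) are illustrative assumptions, and nearest-neighbor resizing with a channel slice stands in for the bilinear interpolation and the 1×1/3×3 convolutions of the patent.

```python
import numpy as np

def nn_upsample(f, factor):
    """Nearest-neighbor upsampling of an (H, W, C) feature map."""
    return f.repeat(factor, axis=0).repeat(factor, axis=1)

size = 64                                           # input image side length
# C1: bring the encoder output (assumed stride 4 here) to output stride 2.
f_us = nn_upsample(np.zeros((size // 4, size // 4, 256)), 2)
f_dec = None
for os_ in (8, 4, 2):                               # C4: ever smaller strides
    # C2: middle-layer feature at stride os_, already 1x1-reduced to 48 ch.
    f_mid = np.zeros((size // os_, size // os_, 48))
    f_edge = nn_upsample(f_mid, os_ // 2)           # bring to stride 2
    # C3: concatenate and "convolve" (a channel slice stands in for it).
    stacked = np.concatenate([f_us, f_edge], axis=-1)
    f_dec = stacked[..., :256]
    # C4: dense reuse - the old F_us and F_decoder form the new F_us.
    f_us = np.concatenate([f_us, f_dec], axis=-1)
print(f_dec.shape)  # (32, 32, 256)
```

The point of the sketch is the dense concatenation in the last line of the loop: every earlier decoding feature stays available to later iterations instead of being discarded.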
Further, in step D, the semantic segmentation probability map and the edge probability maps are obtained from the decoding feature F_decoder and the edge features F_edge^os respectively, the edge image annotations are computed from the semantic image annotations in the training set, the semantic segmentation loss and the auxiliary edge-supervision loss are computed separately from the probability maps and their corresponding annotations, and the whole deep neural network is trained with the objective of minimizing their weighted sum, comprising the following steps:
Step D1: scale the feature F_decoder and all edge features F_edge^os to the same size as the input image with bilinear interpolation, and obtain the semantic segmentation probabilities and edge probabilities with 1×1 convolutions using softmax as the activation function. Softmax is computed as follows:
σ_c = e^{γ_c} / Σ_{k=1}^{C} e^{γ_k}
where σ_c is the probability of class c, e is the base of the natural exponential, γ_c and γ_k denote the pre-activation feature values of classes c and k respectively, and C is the total number of classes;
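The softmax of step D1 can be computed directly from the formula; subtracting the per-pixel maximum before exponentiating is a standard numerical-stability step not spelled out in the patent.

```python
import numpy as np

def softmax(gamma):
    """sigma_c = exp(gamma_c) / sum_k exp(gamma_k), applied over the
    class axis (the last axis) for every pixel."""
    g = gamma - gamma.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(g)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1]])                # one pixel, C = 3 classes
p = softmax(logits)
print(round(float(p.sum()), 6))  # 1.0
print(int(p.argmax()))           # 0
```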
Step D2: convert the semantic segmentation annotations of the training set to one-hot encoding, then compute the edge annotations as follows:
y_edge(i, j, c) = sgn( Σ_{(i_u, j_u) ∈ U_8} | y_seg(i, j, c) − y_seg(i_u, j_u, c) | )
where y_edge(i, j, c) and y_seg(i, j, c) are the edge annotation and the semantic annotation of class c at coordinate (i, j), (i_u, j_u) ranges over the 8-neighborhood U_8 of coordinate (i, j), and sgn(·) is the sign function;
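The edge-annotation formula of step D2 marks a pixel as an edge pixel of class c whenever any 8-neighbor disagrees with its one-hot value. A minimal sketch, with edge-padding at the image border as an assumption the patent does not spell out:

```python
import numpy as np

def edge_label(y_seg):
    """y_edge(i,j,c) = sgn(sum over the 8-neighborhood of
    |y_seg(i,j,c) - y_seg(iu,ju,c)|), per class channel."""
    h, w, c = y_seg.shape
    pad = np.pad(y_seg, ((1, 1), (1, 1), (0, 0)), mode="edge")
    acc = np.zeros_like(y_seg, dtype=float)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue                       # skip the center pixel
            acc += np.abs(y_seg - pad[1 + di:1 + di + h, 1 + dj:1 + dj + w])
    return np.sign(acc)

seg = np.zeros((4, 4, 2))
seg[:, :2, 0] = 1; seg[:, 2:, 1] = 1           # one-hot: left class 0, right class 1
e = edge_label(seg)
print(e[0, 1, 0], e[0, 0, 0])  # 1.0 0.0 - edge at the class border, none inside
```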
Step D3: from the probability maps of both the semantic segmentation and the edges and their corresponding annotations, compute the pixel-level cross-entropy separately to obtain the semantic segmentation loss L_s and the auxiliary edge-supervision losses L_edge^os, then compute the weighted sum loss L:
L = L_s + Σ_os α_os · L_edge^os
where L_edge^os is the loss value corresponding to the edge feature F_edge^os and α_os is its weight in the final loss;
finally, update the model parameters iteratively by back-propagation with a stochastic gradient descent optimizer, training the whole deep neural network to minimize the weighted sum loss L and obtaining the final deep neural network model.
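The weighted sum loss of step D3 can be sketched as follows; equal α_os weights summing to 1 are an illustrative choice here, and the toy probabilities and labels are made up for the example.

```python
import numpy as np

def cross_entropy(p, y, eps=1e-12):
    """Mean pixel-level cross-entropy between a probability map p and
    one-hot annotations y, both of shape (H, W, C)."""
    return float(-(y * np.log(p + eps)).sum(axis=-1).mean())

def total_loss(p_seg, y_seg, edge_probs, edge_labels, alphas):
    """L = L_s + sum_os alpha_os * L_edge^os."""
    L_s = cross_entropy(p_seg, y_seg)
    L_e = sum(a * cross_entropy(p, y)
              for a, (p, y) in zip(alphas, zip(edge_probs, edge_labels)))
    return L_s + L_e

y = np.zeros((2, 2, 3)); y[..., 0] = 1          # toy one-hot annotations
p = np.full((2, 2, 3), 1 / 3)                   # uniform prediction
L = total_loss(p, y, [p, p], [y, y], alphas=[0.5, 0.5])
print(round(L, 4))  # 2.1972, i.e. 2*ln(3)
```

In a real training loop this scalar would be minimized by stochastic gradient descent as the patent describes.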
The present invention also provides a semantic segmentation system based on edge dense reconstruction for street-scene understanding, comprising:
a preprocessing module for preprocessing the training-set input images, including subtracting the image mean from each image to standardize it and randomly cropping the images to a uniform size to obtain preprocessed images of identical size;
an encoding feature extraction module for extracting the generic feature F_backbone with a convolutional network, deriving from F_backbone the three-level context spatial pyramid fusion feature F_tspp to capture multi-scale contextual information, and extracting the encoding feature F_encoder with the cascade of these two parts as the encoding network;
a decoding feature extraction module for enlarging F_encoder to half the input image size to obtain the half-input-size encoding feature F_us, choosing middle-layer features F_mid^os from the convolutional network to compute edge features F_edge^os, and using a dense net combining F_us with the edge features F_edge^os as the decoding network to reconstruct the image resolution and extract the decoding feature F_decoder;
a neural network training module for obtaining the semantic segmentation probability map and the edge probability maps from F_decoder and the edge features respectively, computing the edge image annotations from the semantic image annotations in the training set, computing the semantic segmentation loss and the auxiliary edge-supervision loss separately from the probability maps and their corresponding annotations, and training the whole deep neural network with the objective of minimizing their weighted sum to obtain the deep neural network model; and
a semantic segmentation module for performing semantic segmentation on the image to be segmented using the trained deep neural network model and outputting the segmentation result.
Compared with the prior art, the beneficial effects of the present invention are as follows. In the encoding network, after the backbone captures multi-scale features, a three-level context spatial pyramid fusion feature is used that specifically exploits interior features and global features to improve the features of the original different receptive fields, enriching the expressive power of the encoding feature. In the decoding network, edge features derived from middle-layer features under auxiliary supervision are combined in, specifically adjusting the edge regions that tend to deviate during feature resolution reconstruction and optimizing the semantic segmentation between different objects, while the feature resolution is reconstructed in the manner of a dense net so that the reconstructed features are better reused. Compared with conventional methods, the present invention obtains stronger expressive power over contextual information during encoding, corrects the blurred boundaries between objects more effectively during decoding through joint edge supervision, and exploits features more effectively through the reuse property of the dense net structure, making the network easier to train and finally yielding more accurate semantic segmentation results.
Description of the drawings
Fig. 1 is a flow chart of the method of an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of the system of an embodiment of the present invention.
Specific embodiments
Below, the present invention is described in further detail with reference to the accompanying drawings and a specific embodiment.
The present invention provides a semantic segmentation method based on edge dense reconstruction for street-scene understanding which, as shown in Fig. 1, comprises the following steps:
Step A: preprocess the training-set input images: first subtract the image mean from each image to standardize it, then randomly crop the images to a uniform size to obtain preprocessed images of identical size.
Step B: extract a generic feature F_backbone with a general convolutional network, then derive from F_backbone a three-level context spatial pyramid fusion feature F_tspp to capture multi-scale contextual information, and use the cascade of these two parts as the encoding network to extract the encoding feature F_encoder. This specifically includes the following steps:
Step B1: extract the generic feature F_backbone from the preprocessed image using a general convolutional network (this embodiment uses the xception network provided in the deeplabv3+ network);
Step B2: apply a 1×1 convolution to F_backbone for feature dimension reduction, obtaining a dimension-reduced feature;
Step B3: apply average pooling over the whole of F_backbone, upsample the result back to full size with nearest-neighbor interpolation, and obtain the image-level feature F_image through a 1×1 convolution;
Step B4: apply an atrous convolution with rate r_as to F_backbone to obtain an atrous feature; then concatenate the three-level context features — the dimension-reduced feature, F_image and the atrous feature — and fuse them with a 1×1 convolution to obtain the three-level context fusion feature for rate r_as. Batch normalization keeps the distribution of the inputs identical across the convolutions, and the rectified linear unit is used as the activation function. The atrous convolution is computed as follows:
y_as[m_as] = Σ_{k_as} x_as[m_as + r_as · k_as] · w_as[k_as]
where y_as[m_as] denotes the result of the atrous convolution with rate r_as at output coordinate m_as, x_as[m_as + r_as · k_as] denotes the input reference pixel of x_as at position m_as for rate r_as and atrous kernel coordinate k_as, and w_as[k_as] denotes the weight of the atrous convolution kernel at position k_as;
Step B5: repeat the previous step with different rates until n_tspp features are obtained (3 features in this embodiment, with rates 6, 12 and 18 respectively), then concatenate these n_tspp features with the dimension-reduced feature and F_image to obtain the three-level context spatial pyramid fusion feature F_tspp;
Step B6: apply a 1×1 convolution to F_tspp for dimension reduction, then regularize with dropout as used in deep learning to obtain the final encoding feature F_encoder.
Step C: enlarge the encoding feature F_encoder to half the input image size to obtain the half-input-size encoding feature F_us, choose middle-layer features F_mid^os from the convolutional network to compute edge features F_edge^os, combine F_us with the edge features F_edge^os in a dense net used as the decoding network, reconstruct the image resolution, and compute the decoding feature F_decoder. This specifically includes the following steps:
Step C1: define the ratio of the original input image size to a feature's size as that feature's output stride, and process the encoding feature F_encoder with nearest-neighbor interpolation to obtain the feature map F_us with output stride 2;
Step C2: choose from the convolutional network that extracts the generic features a middle-layer feature F_mid^os with output stride os; first reduce its dimension with a 1×1 convolution, then enlarge it with bilinear interpolation to obtain the edge feature F_edge^os;
Step C3: concatenate F_us and F_edge^os; after a 1×1 convolution for dimension reduction, apply a 3×3 convolution to extract the decoding feature F_decoder;
Step C4: choose an output stride os smaller than the one in step C2; if all output strides have been processed, the decoding feature extraction is complete; otherwise concatenate F_us and F_decoder as the new F_us and repeat steps C2 to C3.
Step D: obtain the semantic segmentation probability map and the edge probability maps from the decoding feature F_decoder and the edge features F_edge^os respectively, compute the edge image annotations from the semantic image annotations in the training set, compute the semantic segmentation loss and the auxiliary edge-supervision loss separately from the probability maps and their corresponding annotations, and train the whole deep neural network with the objective of minimizing their weighted sum. This specifically includes the following steps:
Step D1: scale the feature F_decoder and all edge features F_edge^os to the same size as the input image with bilinear interpolation, and obtain the semantic segmentation probabilities and edge probabilities with 1×1 convolutions using softmax as the activation function. Softmax is computed as follows:
σ_c = e^{γ_c} / Σ_{k=1}^{C} e^{γ_k}
where σ_c is the probability of class c, e is the base of the natural exponential, γ_c and γ_k denote the pre-activation feature values of classes c and k respectively, and C is the total number of classes;
Step D2: convert the semantic segmentation annotations of the training set to one-hot encoding, then compute the edge annotations as follows:
y_edge(i, j, c) = sgn( Σ_{(i_u, j_u) ∈ U_8} | y_seg(i, j, c) − y_seg(i_u, j_u, c) | )
where y_edge(i, j, c) and y_seg(i, j, c) are the edge annotation and the semantic annotation of class c at coordinate (i, j), (i_u, j_u) ranges over the 8-neighborhood U_8 of coordinate (i, j), and sgn(·) is the sign function;
Step D3: from the probability maps of both the semantic segmentation and the edges and their corresponding annotations, compute the pixel-level cross-entropy separately to obtain the semantic segmentation loss L_s and the auxiliary edge-supervision losses L_edge^os, then compute the weighted sum loss L:
L = L_s + Σ_os α_os · L_edge^os
where L_edge^os is the loss value corresponding to the edge feature F_edge^os, and α_os is its weight in the final loss; the α_os are all equal and satisfy Σ_os α_os = 1;
finally, update the model parameters iteratively by back-propagation with a stochastic gradient descent optimizer, training the whole deep neural network to minimize the weighted sum loss L and obtaining the final deep neural network model.
Step E: perform semantic segmentation on the image to be segmented using the trained deep neural network model, and output the segmentation result.
The present invention also provides a semantic segmentation system for street-scene understanding for implementing the above method which, as shown in Fig. 2, comprises:
a preprocessing module for preprocessing the training-set input images, including subtracting the image mean from each image to standardize it and randomly cropping the images to a uniform size to obtain preprocessed images of identical size;
an encoding feature extraction module for extracting the generic feature F_backbone with a convolutional network, deriving from F_backbone the three-level context spatial pyramid fusion feature F_tspp to capture multi-scale contextual information, and extracting the encoding feature F_encoder with the cascade of these two parts as the encoding network;
a decoding feature extraction module for enlarging F_encoder to half the input image size to obtain the half-input-size encoding feature F_us, choosing middle-layer features F_mid^os from the convolutional network to compute edge features F_edge^os, and using a dense net combining F_us with the edge features F_edge^os as the decoding network to reconstruct the image resolution and extract the decoding feature F_decoder;
a neural network training module for obtaining the semantic segmentation probability map and the edge probability maps from F_decoder and the edge features respectively, computing the edge image annotations from the semantic image annotations in the training set, computing the semantic segmentation loss and the auxiliary edge-supervision loss separately from the probability maps and their corresponding annotations, and training the whole deep neural network with the objective of minimizing their weighted sum to obtain the deep neural network model; and
a semantic segmentation module for performing semantic segmentation on the image to be segmented using the trained deep neural network model and outputting the segmentation result.
The above are preferred embodiments of the present invention; all changes made according to the technical solution of the present invention whose functional effects do not exceed the scope of the technical solution of the present invention fall within the protection scope of the present invention.
Claims (5)
1. a kind of semantic segmentation method based on the dense reconstruction in edge understood for streetscape, which is characterized in that including following step
It is rapid:
Step A: pre-processing training set input picture, and allowing image to subtract its image mean value first makes its standardization, then
The shearing for carrying out uniform sizes to image at random obtains the pretreatment image of identical size;
Step B: generic features F is extracted with convolutional networkbackbone, then it is based on generic features FbackboneObtain three-level context space
Pyramid fusion feature Ftspp, for capturing multiple dimensioned contextual information, then cascaded using this two parts as coding network and mentioned
Take coding characteristic Fencoder;
Step C: expand coding characteristic FencoderSize obtains half input size coding feature to the half of input image size
Fus, middle layer feature is chosen from the convolutional networkCalculate edge featureIn conjunction with half input size coding feature
Fus, with combination of edge featureDense net be decoding network, carry out image resolution ratio reconstruction, calculate decoding feature
Fdecoder;
Step D: with decoding feature FdecoderAnd edge featureSemantic segmentation probability graph and marginal probability figure are obtained respectively, with
Semantic image mark in training set calculates edge image mark, using semantic segmentation probability graph and marginal probability figure and respectively
Corresponding mark calculates separately to obtain the edge penalty of semantic segmentation loss and back-up surveillance, to minimize the two weighted sum loss
Entire depth neural network is trained for target;
Step E: segmented image is treated using trained deep neural network model and carries out semantic segmentation, exports segmentation result.
2. the semantic segmentation method based on the dense reconstruction in edge according to claim 1 understood for streetscape, feature
It is, in the step B, extracts generic features F with convolutional networkbackbone, then it is based on generic features FbackboneIt obtains in three-level
Hereafter spatial pyramid fusion feature Ftspp, then cascaded using this two parts as coding network and extract coding characteristic Fencoder, packet
Include following steps:
Step B1: generic features F is extracted to pretreatment image using convolutional networkbackbone;
Step B2: using 1 × 1 convolution to feature FbackboneFeature Dimension Reduction is carried out, feature is obtained
Step B3: to FbackboneWhole image carries out average pond, then reuses arest neighbors demosaicing to full size, then pass through
It crosses 1 × 1 convolution and obtains image level feature Fimage;
Step B4: being r with porosityasConvolution kernel to FbackboneIt carries out convolution with holes and obtains featureThen splice in three-level
Following traitsFimageWithFusion Features are carried out using 1 × 1 convolution afterwards, obtaining porosity is rasThree-level context fusion
FeatureThe same distribution for keeping input in convolution process using batch standardization, uses line rectification function as activation primitive;
Wherein, convolutional calculation formula with holes is as follows:
Wherein,It indicates in output coordinate masThe use porosity of position is rasConvolution with holes processing result, xas[mas
+ras·kas] indicate input xasIn coordinate masOn position in porosity be rasAnd convolution kernel coordinate with holes is kasWhen it is corresponding defeated
Enter reference pixel, was[kas] indicate in convolution kernel with holes as kasThe weight of position;
Step B5: repeating previous step using different porositys, until obtaining ntsppA feature, then by this ntsppA feature withAnd FimageSpliced, obtains three-level context space pyramid fusion feature Ftspp;
Step B6: using 1 × 1 convolution to feature FtsppDimensionality reduction is carried out, then carries out canonical with the dropout in deep learning again
Change, obtains coding characteristic F to the endencoder。
3. the semantic segmentation method based on the dense reconstruction in edge according to claim 2 understood for streetscape, feature
It is, in the step C, expands coding characteristic FencoderSize obtains half input size and compiles to the half of input image size
Code feature Fus, middle layer feature is chosen from the convolutional networkCalculate edge featureIt is compiled in conjunction with half input size
Code feature Fus, with combination of edge featureDense net be decoding network, carry out image resolution ratio reconstruction, calculate decoding feature
Fdecoder, comprising the following steps:
Step C1: define the ratio of the original input image size to a feature's size as that feature's output stride; process the coding feature F_encoder with nearest-neighbor interpolation to obtain the feature map F_us whose output stride is 2;
Step C2: choose, from the convolutional network used to extract generic features, a middle-layer feature whose output stride is os; first reduce its dimensionality with a 1×1 convolution, then enlarge it with bilinear interpolation, and obtain the edge feature for that output stride;
Step C3: concatenate the feature F_us with the edge feature; after a 1×1 convolution for dimensionality reduction, use a 3×3 convolution to extract features, obtaining the decoding feature F_decoder;
Step C4: choose an output stride os smaller than the one used in step C2; if all output strides have been processed, the extraction of the decoding feature is complete; otherwise, concatenate F_us and F_decoder as the new F_us and repeat steps C2 to C3.
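The nearest-neighbor enlargement of step C1 can be sketched as integer-factor pixel repetition (a simplification, assuming the target size is an exact multiple of the source size):

```python
import numpy as np

def nearest_upsample(f, factor):
    """Nearest-neighbor upsampling of a (H, W) feature map by an integer
    factor: every pixel is repeated factor x factor times."""
    return np.repeat(np.repeat(f, factor, axis=0), factor, axis=1)
```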
4. The semantic segmentation method based on edge dense reconstruction for street view understanding according to claim 3, characterized in that, in step D, a semantic segmentation probability map and edge probability maps are obtained from the decoding feature F_decoder and the edge features, respectively; edge image labels are computed from the semantic image labels of the training set; the semantic segmentation loss and the auxiliary-supervision edge loss are computed separately from the semantic segmentation probability map, the edge probability maps, and the corresponding labels; and the entire deep neural network is trained with the objective of minimizing the weighted sum of the two losses, comprising the following steps:
Step D1: scale the feature F_decoder and all edge features to the size of the input image with bilinear interpolation, and obtain the semantic segmentation probabilities and edge probabilities through 1×1 convolutions that use softmax as the activation function; softmax is computed as follows:
σ_c = e^{γ_c} / Σ_{k=1}^{C} e^{γ_k}
wherein σ_c is the probability of class c, e is the base of the natural exponential, γ_c and γ_k denote the pre-activation feature values of classes c and k, respectively, and C is the total number of classes;
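The softmax of step D1 follows directly from the formula; the max-subtraction shift below is a standard numerical-stability addition not stated in the claim:

```python
import numpy as np

def softmax(gamma):
    """Per-class probabilities: sigma_c = exp(gamma_c) / sum_k exp(gamma_k)."""
    e = np.exp(gamma - np.max(gamma))  # shift by max for numerical stability
    return e / e.sum()
```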
Step D2: one-hot encode the semantic segmentation labels of the training set, then compute the edge labels; the edge labels are computed as follows:
y_edge(i, j, c) = sgn( Σ_{(i_u, j_u) ∈ U_8} | y(i, j, c) − y(i_u, j_u, c) | )
wherein y_edge(i, j, c) and y(i, j, c) are the edge label and one-hot semantic label of class c at coordinate (i, j), (i_u, j_u) denotes one coordinate pair in the 8-neighborhood U_8 of (i, j), and sgn(·) is the sign function;
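The edge-label rule of step D2 can be sketched directly from the formula: a pixel is an edge pixel of class c whenever any 8-neighbor differs in channel c (a minimal, unoptimized sketch; image-border pixels simply have fewer neighbors):

```python
import numpy as np

def edge_label(y):
    """Edge labels from one-hot semantic labels y of shape (H, W, C):
    y_edge(i,j,c) = sgn( sum over the 8-neighborhood of |y(i,j,c) - y(iu,ju,c)| ).
    """
    H, W, C = y.shape
    acc = np.zeros((H, W, C), dtype=float)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the center pixel itself
            for i in range(H):
                for j in range(W):
                    iu, ju = i + di, j + dj
                    if 0 <= iu < H and 0 <= ju < W:
                        acc[i, j] += np.abs(y[i, j] - y[iu, ju])
    return np.sign(acc)
```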
Step D3: using the probability maps of both the semantic segmentation and the edges together with their corresponding labels, compute the pixel-level cross-entropies separately to obtain the semantic segmentation loss L_s and the auxiliary-supervision edge losses, then compute the weighted-sum loss L:
L = L_s + Σ_os α_os · L_edge^os
wherein L_edge^os is the loss corresponding to the edge feature with output stride os, and α_os is its weight in the final loss;
finally, through the stochastic gradient descent optimization method, the model parameters are updated iteratively by backpropagation, and the entire deep neural network is trained by minimizing the weighted-sum loss L, obtaining the final deep neural network model.
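The pixel-level cross-entropy and weighted-sum loss of step D3 can be sketched as follows (hypothetical helper names; the α_os weights are free hyperparameters not fixed by the claim):

```python
import numpy as np

def pixel_cross_entropy(p, y):
    """Mean pixel-wise cross-entropy between probability maps p and one-hot
    labels y, both of shape (H, W, C)."""
    return float(-np.mean(np.sum(y * np.log(p + 1e-12), axis=-1)))

def weighted_sum_loss(L_s, edge_losses, alphas):
    """L = L_s + sum_os alpha_os * L_edge_os."""
    return L_s + sum(a * l for a, l in zip(alphas, edge_losses))
```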
5. A semantic segmentation system based on edge dense reconstruction for street view understanding, characterized by comprising:
a preprocessing module for preprocessing the input images of the training set, including subtracting the image mean from each image to standardize it, and randomly cropping the images to a uniform size to obtain preprocessed images of identical size;
a coding feature extraction module for extracting the generic feature F_backbone with a convolutional network, then obtaining, based on F_backbone, the three-level context spatial pyramid fusion feature F_tspp for capturing multi-scale contextual information, and cascading these two parts as the coding network to extract the coding feature F_encoder;
a decoding feature extraction module for enlarging the coding feature F_encoder to half of the input image size to obtain the half-input-size coding feature F_us, choosing middle-layer features from the convolutional network and computing edge features from them, and, with a dense net combining the half-input-size coding feature F_us and the edge features as the decoding network, carrying out image resolution reconstruction to extract the decoding feature F_decoder;
a neural network training module for obtaining a semantic segmentation probability map and edge probability maps from the decoding feature F_decoder and the edge features, respectively, computing edge image labels from the semantic image labels of the training set, computing the semantic segmentation loss and the auxiliary-supervision edge loss separately from the semantic segmentation probability map, the edge probability maps, and the corresponding labels, and training the entire deep neural network with the objective of minimizing the weighted sum of the two losses, obtaining the deep neural network model; and
a semantic segmentation module for performing semantic segmentation on the image to be segmented using the trained deep neural network model and outputting the segmentation result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910359119.0A CN110059698B (en) | 2019-04-30 | 2019-04-30 | Semantic segmentation method and system based on edge dense reconstruction for street view understanding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110059698A true CN110059698A (en) | 2019-07-26 |
CN110059698B CN110059698B (en) | 2022-12-23 |
Family
ID=67321810
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910359119.0A Active CN110059698B (en) | 2019-04-30 | 2019-04-30 | Semantic segmentation method and system based on edge dense reconstruction for street view understanding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110059698B (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110517278A (en) * | 2019-08-07 | 2019-11-29 | 北京旷视科技有限公司 | Image segmentation and the training method of image segmentation network, device and computer equipment |
CN110599514A (en) * | 2019-09-23 | 2019-12-20 | 北京达佳互联信息技术有限公司 | Image segmentation method and device, electronic equipment and storage medium |
CN110598846A (en) * | 2019-08-15 | 2019-12-20 | 北京航空航天大学 | Hierarchical recurrent neural network decoder and decoding method |
CN110895814A (en) * | 2019-11-30 | 2020-03-20 | 南京工业大学 | Intelligent segmentation method for aero-engine hole detection image damage based on context coding network |
CN111340047A (en) * | 2020-02-28 | 2020-06-26 | 江苏实达迪美数据处理有限公司 | Image semantic segmentation method and system based on multi-scale feature and foreground and background contrast |
CN111341438A (en) * | 2020-02-25 | 2020-06-26 | 中国科学技术大学 | Image processing apparatus, electronic device, and medium |
CN111429473A (en) * | 2020-02-27 | 2020-07-17 | 西北大学 | Chest film lung field segmentation model establishment and segmentation method based on multi-scale feature fusion |
CN112150478A (en) * | 2020-08-31 | 2020-12-29 | 温州医科大学 | Method and system for constructing semi-supervised image segmentation framework |
CN112700462A (en) * | 2020-12-31 | 2021-04-23 | 北京迈格威科技有限公司 | Image segmentation method and device, electronic equipment and storage medium |
CN113051983A (en) * | 2019-12-28 | 2021-06-29 | 中移(成都)信息通信科技有限公司 | Method for training field crop disease recognition model and field crop disease recognition |
CN113128353A (en) * | 2021-03-26 | 2021-07-16 | 安徽大学 | Emotion sensing method and system for natural human-computer interaction |
CN113706545A (en) * | 2021-08-23 | 2021-11-26 | 浙江工业大学 | Semi-supervised image segmentation method based on dual-branch nerve discrimination dimensionality reduction |
CN114627086A (en) * | 2022-03-18 | 2022-06-14 | 江苏省特种设备安全监督检验研究院 | Crane surface damage detection method based on improved feature pyramid network |
CN115953394A (en) * | 2023-03-10 | 2023-04-11 | 中国石油大学(华东) | Target segmentation-based detection method and system for mesoscale ocean vortexes |
CN116978011A (en) * | 2023-08-23 | 2023-10-31 | 广州新华学院 | Image semantic communication method and system for intelligent target recognition |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10095977B1 (en) * | 2017-10-04 | 2018-10-09 | StradVision, Inc. | Learning method and learning device for improving image segmentation and testing method and testing device using the same |
CN109241972A (en) * | 2018-08-20 | 2019-01-18 | 电子科技大学 | Image, semantic dividing method based on deep learning |
CN109509192A (en) * | 2018-10-18 | 2019-03-22 | 天津大学 | Merge the semantic segmentation network in Analysis On Multi-scale Features space and semantic space |
Non-Patent Citations (2)
Title |
---|
YUZHONG CHEN: "Pyramid Context Contrast for Semantic Segmentation", IEEE Access *
HU TAI: "Research on semantic segmentation algorithms for small objects based on deep neural networks", China Master's Theses Full-text Database *
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110517278B (en) * | 2019-08-07 | 2022-04-29 | 北京旷视科技有限公司 | Image segmentation and training method and device of image segmentation network and computer equipment |
CN110517278A (en) * | 2019-08-07 | 2019-11-29 | 北京旷视科技有限公司 | Image segmentation and the training method of image segmentation network, device and computer equipment |
CN110598846A (en) * | 2019-08-15 | 2019-12-20 | 北京航空航天大学 | Hierarchical recurrent neural network decoder and decoding method |
CN110598846B (en) * | 2019-08-15 | 2022-05-03 | 北京航空航天大学 | Hierarchical recurrent neural network decoder and decoding method |
CN110599514A (en) * | 2019-09-23 | 2019-12-20 | 北京达佳互联信息技术有限公司 | Image segmentation method and device, electronic equipment and storage medium |
CN110599514B (en) * | 2019-09-23 | 2022-10-04 | 北京达佳互联信息技术有限公司 | Image segmentation method and device, electronic equipment and storage medium |
CN110895814A (en) * | 2019-11-30 | 2020-03-20 | 南京工业大学 | Intelligent segmentation method for aero-engine hole detection image damage based on context coding network |
CN113051983B (en) * | 2019-12-28 | 2022-08-23 | 中移(成都)信息通信科技有限公司 | Method for training field crop disease recognition model and field crop disease recognition |
CN113051983A (en) * | 2019-12-28 | 2021-06-29 | 中移(成都)信息通信科技有限公司 | Method for training field crop disease recognition model and field crop disease recognition |
CN111341438B (en) * | 2020-02-25 | 2023-04-28 | 中国科学技术大学 | Image processing method, device, electronic equipment and medium |
CN111341438A (en) * | 2020-02-25 | 2020-06-26 | 中国科学技术大学 | Image processing apparatus, electronic device, and medium |
CN111429473A (en) * | 2020-02-27 | 2020-07-17 | 西北大学 | Chest film lung field segmentation model establishment and segmentation method based on multi-scale feature fusion |
CN111429473B (en) * | 2020-02-27 | 2023-04-07 | 西北大学 | Chest film lung field segmentation model establishment and segmentation method based on multi-scale feature fusion |
CN111340047A (en) * | 2020-02-28 | 2020-06-26 | 江苏实达迪美数据处理有限公司 | Image semantic segmentation method and system based on multi-scale feature and foreground and background contrast |
CN112150478A (en) * | 2020-08-31 | 2020-12-29 | 温州医科大学 | Method and system for constructing semi-supervised image segmentation framework |
CN112700462A (en) * | 2020-12-31 | 2021-04-23 | 北京迈格威科技有限公司 | Image segmentation method and device, electronic equipment and storage medium |
CN113128353B (en) * | 2021-03-26 | 2023-10-24 | 安徽大学 | Emotion perception method and system oriented to natural man-machine interaction |
CN113128353A (en) * | 2021-03-26 | 2021-07-16 | 安徽大学 | Emotion sensing method and system for natural human-computer interaction |
CN113706545A (en) * | 2021-08-23 | 2021-11-26 | 浙江工业大学 | Semi-supervised image segmentation method based on dual-branch nerve discrimination dimensionality reduction |
CN113706545B (en) * | 2021-08-23 | 2024-03-26 | 浙江工业大学 | Semi-supervised image segmentation method based on dual-branch nerve discrimination dimension reduction |
CN114627086A (en) * | 2022-03-18 | 2022-06-14 | 江苏省特种设备安全监督检验研究院 | Crane surface damage detection method based on improved feature pyramid network |
CN114627086B (en) * | 2022-03-18 | 2023-04-28 | 江苏省特种设备安全监督检验研究院 | Crane surface damage detection method based on characteristic pyramid network |
CN115953394A (en) * | 2023-03-10 | 2023-04-11 | 中国石油大学(华东) | Target segmentation-based detection method and system for mesoscale ocean vortexes |
CN116978011A (en) * | 2023-08-23 | 2023-10-31 | 广州新华学院 | Image semantic communication method and system for intelligent target recognition |
CN116978011B (en) * | 2023-08-23 | 2024-03-15 | 广州新华学院 | Image semantic communication method and system for intelligent target recognition |
Also Published As
Publication number | Publication date |
---|---|
CN110059698B (en) | 2022-12-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110059698A (en) | Semantic segmentation method and system based on edge dense reconstruction for street view understanding | |
CN110059768A (en) | Semantic segmentation method and system fusing point and region features for street view understanding | |
CN110059769A (en) | Semantic segmentation method and system based on pixel rearrangement reconstruction for street view understanding | |
CN110070091A (en) | Semantic segmentation method and system based on dynamic interpolation reconstruction for street view understanding | |
CN110033410B (en) | Image reconstruction model training method, image super-resolution reconstruction method and device | |
CN115797931A (en) | Remote sensing image semantic segmentation method based on double-branch feature fusion | |
CN110992270A (en) | Multi-scale residual attention network image super-resolution reconstruction method based on attention | |
CN104113789B (en) | On-line video abstraction generation method based on depth learning | |
CN108427920A (en) | A kind of land and sea border defense object detection method based on deep learning | |
CN109598269A (en) | A kind of semantic segmentation method based on multiresolution input with pyramid expansion convolution | |
CN108549893A (en) | A kind of end-to-end recognition methods of the scene text of arbitrary shape | |
CN109410146A (en) | A kind of image deblurring algorithm based on Bi-Skip-Net | |
CN112884073B (en) | Image rain removing method, system, terminal and storage medium | |
CN110097110A (en) | A kind of semantic image restorative procedure based on objective optimization | |
CN111179196A (en) | Multi-resolution depth network image highlight removing method based on divide-and-conquer | |
CN113762265A (en) | Pneumonia classification and segmentation method and system | |
CN111462090A (en) | Multi-scale image target detection method | |
CN111126185B (en) | Deep learning vehicle target recognition method for road gate scene | |
CN115082966A (en) | Pedestrian re-recognition model training method, pedestrian re-recognition method, device and equipment | |
CN113688715A (en) | Facial expression recognition method and system | |
Feng et al. | Coal mine image dust and fog clearing algorithm based on deep learning network | |
Wan et al. | Siamese Attentive Convolutional Network for Effective Remote Sensing Image Change Detection | |
CN110414301A (en) | It is a kind of based on double compartment crowd density estimation methods for taking the photograph head | |
CN116485689B (en) | Progressive coupling image rain removing method and system based on CNN and transducer | |
CN117934327A (en) | Reflective image removing method based on characteristic dynamic cross perception |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||