CN110059698A - Semantic segmentation method and system based on edge-based dense reconstruction for street-scene understanding - Google Patents


Info

Publication number
CN110059698A
CN110059698A
Authority
CN
China
Prior art keywords
feature
edge
image
semantic segmentation
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910359119.0A
Other languages
Chinese (zh)
Other versions
CN110059698B (English)
Inventor
陈羽中
林洋洋
柯逍
黄腾达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN201910359119.0A
Publication of CN110059698A
Application granted
Publication of CN110059698B
Active legal status
Anticipated expiration

Links

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 — Arrangements for image or video recognition or understanding
    • G06V 10/20 — Image preprocessing
    • G06V 10/26 — Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 — Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds


Abstract

The present invention relates to a semantic segmentation method and system based on edge-based dense reconstruction for street-scene understanding. The method comprises: preprocessing the training-set input images by standardizing them and cropping them to a uniform size; extracting generic features with a convolutional network and deriving from them a three-level context spatial pyramid fusion feature, the cascade of these two parts serving as the encoding network that extracts the encoding feature; enlarging the encoding feature to half the input size, computing edge features from mid-layer features of the convolutional network, and using a dense net that combines the edge features with the half-input-size encoding feature as the decoding network, which reconstructs the image resolution and yields the decoding feature; computing the semantic segmentation loss and an auxiliary edge-supervision loss, and training the deep neural network to minimize their weighted sum; and performing semantic segmentation on images to be segmented with the trained deep neural network model and outputting the segmentation result. The method and system help improve the accuracy and robustness of image semantic segmentation.

Description

Semantic segmentation method and system based on edge-based dense reconstruction for street-scene understanding
Technical field
The present invention relates to the technical field of computer vision, and in particular to a semantic segmentation method and system based on edge-based dense reconstruction for street-scene understanding.
Background art
Image semantic segmentation is an important branch of computer vision within the field of artificial intelligence, and a key part of image understanding in machine vision. Image semantic segmentation assigns each pixel of an image accurately to its category, so that the prediction is consistent with the visual content of the image itself; the task is therefore also called pixel-level image classification.
Since image semantic segmentation shares some similarity with image classification, all kinds of image classification networks are commonly used, often interchangeably, as the backbone of semantic segmentation networks after the final fully connected layer has been removed. Sometimes the pooling layers in the backbone are removed, or atrous (dilated) convolution is used, to obtain larger feature maps, and a final convolution layer with 1×1 kernels produces the semantic segmentation result. Compared with image classification, image semantic segmentation is harder: it needs not only global contextual information but also fine local information to decide the category of each pixel. A backbone is therefore usually used to extract relatively global features, which are then combined with shallow backbone features to reconstruct the feature resolution back to the original image size. Because the feature maps first shrink and then grow, the former part is usually called the encoding network and the latter the decoding network. During encoding, different receptive fields and scales are usually combined to better capture objects of different sizes, for example with atrous spatial pyramid pooling; but that technique enlarges the spacing of the convolution kernel and so ignores interior pixels, and it does not incorporate more global contextual information to make up for its limited expressive power. Meanwhile, in existing semantic segmentation methods, the decoding stage usually just restores resolution from the previous stage's features and then combines shallow features of corresponding size to recover the information lost during encoding; it neither reuses features effectively during resolution reconstruction nor specifically addresses the blurred object boundaries that arise after the image resolution is rebuilt.
Summary of the invention
The purpose of the present invention is to provide a semantic segmentation method and system based on edge-based dense reconstruction for street-scene understanding, which help improve the accuracy and robustness of image semantic segmentation.
To achieve the above object, the technical scheme of the present invention is a semantic segmentation method based on edge-based dense reconstruction for street-scene understanding, comprising the following steps:
Step A: preprocessing the training-set input images: first subtracting the image mean from each image to standardize it, then taking random crops of a uniform size to obtain preprocessed images of identical size;
Step B: extracting the generic feature F_backbone with a convolutional network, then deriving from F_backbone the three-level context spatial pyramid fusion feature F_tspp to capture multi-scale contextual information, and extracting the encoding feature F_encoder with the cascade of these two parts as the encoding network;
Step C: enlarging F_encoder to half the input image size to obtain the half-input-size encoding feature F_us; selecting mid-layer features F_mid^os from the convolutional network and computing edge features F_edge^os; and, combining F_us with the edge features, using a dense net as the decoding network to reconstruct the image resolution and compute the decoding feature F_decoder;
Step D: obtaining the semantic segmentation probability map and the edge probability maps from F_decoder and the edge features F_edge^os respectively; computing edge annotations from the semantic annotations of the training set; computing the semantic segmentation loss and the auxiliary edge-supervision loss from the probability maps and their corresponding annotations; and training the whole deep neural network to minimize the weighted sum of the two losses;
Step E: performing semantic segmentation on the image to be segmented with the trained deep neural network model and outputting the segmentation result.
Further, in step B, the generic feature F_backbone is extracted with a convolutional network, the three-level context spatial pyramid fusion feature F_tspp is derived from F_backbone, and the encoding feature F_encoder is extracted with the cascade of these two parts as the encoding network, comprising the following steps:
Step B1: extracting the generic feature F_backbone from the preprocessed image with a convolutional network;
Step B2: applying a 1×1 convolution to F_backbone for dimensionality reduction to obtain the reduced feature F_1×1;
Step B3: average-pooling the whole F_backbone feature map, upsampling it back to full size with nearest-neighbour interpolation, and applying a 1×1 convolution to obtain the image-level feature F_image;
Step B4: applying atrous convolution with rate r_as to F_backbone to obtain the feature F_as; then splicing the three context levels F_as, F_image and F_1×1 and fusing them with a 1×1 convolution to obtain the three-level context fusion feature for rate r_as; batch normalization is used to keep the input distribution unchanged through the convolutions, and the rectified linear unit is used as the activation function; the atrous convolution is computed as

y_as[m_as] = Σ_{k_as} x_as[m_as + r_as·k_as] · w_as[k_as]

where y_as[m_as] is the result of the atrous convolution with rate r_as at output coordinate m_as, x_as[m_as + r_as·k_as] is the input reference pixel of x_as corresponding to atrous kernel coordinate k_as at position m_as with rate r_as, and w_as[k_as] is the weight at position k_as of the atrous kernel;
Step B5: repeating the previous step with different rates until n_tspp features are obtained, then splicing these n_tspp features with F_1×1 and F_image to obtain the three-level context spatial pyramid fusion feature F_tspp;
Step B6: applying a 1×1 convolution to F_tspp for dimensionality reduction, then regularizing with dropout to obtain the final encoding feature F_encoder.
Further, in step C, the encoding feature F_encoder is enlarged to half the input image size to obtain the half-input-size encoding feature F_us, mid-layer features F_mid^os are selected from the convolutional network and edge features F_edge^os are computed, and, combining F_us with the edge features, the image resolution is reconstructed with a dense net as the decoding network to compute the decoding feature F_decoder, comprising the following steps:
Step C1: defining the output stride of a feature as the ratio of the original input image size to the feature size, and processing the encoding feature F_encoder with nearest-neighbour interpolation to obtain the feature map F_us with output stride 2;
Step C2: selecting from the generic-feature convolutional network the mid-layer feature F_mid^os with output stride os, first reducing its dimensionality with a 1×1 convolution and then enlarging it with bilinear interpolation to obtain the edge feature F_edge^os;
Step C3: splicing F_us and F_edge^os, and after a 1×1 convolution for dimensionality reduction, applying a 3×3 convolution to extract features and obtain the decoding feature F_decoder;
Step C4: choosing an output stride os smaller than that used in the previous pass of step C2; if every output stride has been processed, decoding-feature extraction is complete; otherwise splicing F_us and F_decoder as the new F_us and repeating steps C2 to C3.
Further, in step D, the semantic segmentation probability map and the edge probability maps are obtained from the decoding feature F_decoder and the edge features F_edge^os respectively, edge annotations are computed from the semantic annotations of the training set, the semantic segmentation loss and the auxiliary edge-supervision loss are computed from the probability maps and their corresponding annotations, and the whole deep neural network is trained to minimize the weighted sum of the two losses, comprising the following steps:
Step D1: scaling F_decoder and all edge features F_edge^os to the input image size with bilinear interpolation, and obtaining the semantic segmentation probabilities and edge probabilities with 1×1 convolutions that use softmax as the activation function, where softmax is computed as

σ_c = e^{γ_c} / Σ_{k=1..C} e^{γ_k}

where σ_c is the probability of category c, e is the natural exponent, γ_c and γ_k are the pre-activation feature values of categories c and k respectively, and C is the total number of categories;
Step D2: one-hot encoding the semantic segmentation annotations of the training set and then computing the edge annotations as

y_edge(i, j, c) = sgn( Σ_{(i_u, j_u) ∈ U_8} | y_semantic(i, j, c) − y_semantic(i_u, j_u, c) | )

where y_edge(i, j, c) and y_semantic(i, j, c) are the edge annotation and semantic annotation of class c at coordinate (i, j), (i_u, j_u) ranges over the 8-neighbourhood U_8 of (i, j), and sgn(·) is the sign function;
Step D3: using the probability maps of both semantic segmentation and edges together with their corresponding annotations, computing the pixel-level cross-entropies to obtain the semantic segmentation loss L_s and the auxiliary edge-supervision losses L_edge^os, and then computing the weighted total loss

L = L_s + Σ_os α_os · L_edge^os

where L_edge^os is the loss corresponding to the edge feature F_edge^os and α_os is its weight in the final loss;
finally, updating the model parameters by back-propagation iterations with the stochastic gradient descent optimization method, training the whole deep neural network to minimize the weighted loss L, and obtaining the final deep neural network model.
The present invention also provides a semantic segmentation system based on edge-based dense reconstruction for street-scene understanding, comprising:
a preprocessing module, for preprocessing the training-set input images, including subtracting the image mean from each image to standardize it and taking random crops of a uniform size to obtain preprocessed images of identical size;
an encoding-feature extraction module, for extracting the generic feature F_backbone with a convolutional network, deriving from F_backbone the three-level context spatial pyramid fusion feature F_tspp to capture multi-scale contextual information, and extracting the encoding feature F_encoder with the cascade of these two parts as the encoding network;
a decoding-feature extraction module, for enlarging F_encoder to half the input image size to obtain the half-input-size encoding feature F_us, selecting mid-layer features F_mid^os from the convolutional network, computing edge features F_edge^os, and, combining F_us with the edge features, reconstructing the image resolution with a dense net as the decoding network to extract the decoding feature F_decoder;
a neural network training module, for obtaining the semantic segmentation probability map and the edge probability maps from F_decoder and the edge features respectively, computing edge annotations from the semantic annotations of the training set, computing the semantic segmentation loss and the auxiliary edge-supervision loss from the probability maps and their corresponding annotations, and training the whole deep neural network to minimize the weighted sum of the two losses, obtaining the deep neural network model; and
a semantic segmentation module, for performing semantic segmentation on the image to be segmented with the trained deep neural network model and outputting the segmentation result.
Compared with the prior art, the beneficial effects of the present invention are as follows. In the encoding network, the three-level context spatial pyramid fusion feature is used for multi-scale feature capture after the backbone network, deliberately using internal features and global features to optimize the features of different receptive fields and thereby enriching the expressive power of the encoding feature. In the decoding network, mid-layer features are combined to derive edge features with auxiliary supervision, which specifically adjust the boundary regions that are prone to error during feature resolution reconstruction and so improve the segmentation between different objects; at the same time, the feature resolution is reconstructed in the manner of a dense net so that reconstructed features are better reused. Compared with conventional methods, the present invention obtains stronger contextual expressive power during encoding; the combined edge supervision corrects the boundary blur between objects more effectively during decoding; and the reuse property of the dense net structure exploits features more effectively and makes the network easier to train, so that more accurate semantic segmentation results are finally obtained.
Detailed description of the invention
Fig. 1 is a flowchart of the method according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of the system according to an embodiment of the present invention.
Specific embodiment
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
The present invention provides a semantic segmentation method based on edge-based dense reconstruction for street-scene understanding, as shown in Fig. 1, comprising the following steps:
Step A: preprocess the training-set input images: first subtract the image mean from each image to standardize it, then take random crops of a uniform size to obtain preprocessed images of identical size.
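As an illustrative sketch of the preprocessing in step A (the mean values and crop size below are placeholder assumptions for illustration, not the embodiment's actual settings):

```python
import numpy as np

def preprocess(image, mean, crop_size, rng):
    """Standardize by per-channel mean subtraction, then take a random fixed-size crop."""
    img = image.astype(np.float32) - mean          # mean subtraction standardizes the input
    h, w = img.shape[:2]
    ch, cw = crop_size
    top = rng.integers(0, h - ch + 1)              # random crop origin
    left = rng.integers(0, w - cw + 1)
    return img[top:top + ch, left:left + cw]

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(512, 1024, 3))    # stand-in for a street-scene image
patch = preprocess(img,
                   mean=np.array([123.7, 116.3, 103.5], dtype=np.float32),
                   crop_size=(321, 321), rng=rng)
```

Every training image is reduced to the same crop size, so batches of identical shape can be formed regardless of the original image dimensions.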
Step B: extract the generic feature F_backbone with a general convolutional network, then derive from F_backbone the three-level context spatial pyramid fusion feature F_tspp to capture multi-scale contextual information, and extract the encoding feature F_encoder with the cascade of these two parts as the encoding network; specifically comprising the following steps:
Step B1: extract the generic feature F_backbone from the preprocessed image with a general convolutional network (this embodiment uses the xception network provided in the deeplabv3+ network);
Step B2: apply a 1×1 convolution to F_backbone for dimensionality reduction to obtain the reduced feature F_1×1;
Step B3: average-pool the whole F_backbone feature map, upsample it back to full size with nearest-neighbour interpolation, and apply a 1×1 convolution to obtain the image-level feature F_image;
Step B4: apply atrous convolution with rate r_as to F_backbone to obtain the feature F_as; then splice the three context levels F_as, F_image and F_1×1 and fuse them with a 1×1 convolution to obtain the three-level context fusion feature for rate r_as; batch normalization is used to keep the input distribution unchanged through the convolutions, and the rectified linear unit is used as the activation function; the atrous convolution is computed as

y_as[m_as] = Σ_{k_as} x_as[m_as + r_as·k_as] · w_as[k_as]

where y_as[m_as] is the result of the atrous convolution with rate r_as at output coordinate m_as, x_as[m_as + r_as·k_as] is the input reference pixel of x_as corresponding to atrous kernel coordinate k_as at position m_as with rate r_as, and w_as[k_as] is the weight at position k_as of the atrous kernel;
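The atrous convolution formula above can be checked with a minimal one-dimensional NumPy sketch. Real implementations operate on 2-D feature maps with learned kernels; this toy version only illustrates how the rate r spreads the kernel taps apart and enlarges the receptive field:

```python
import numpy as np

def atrous_conv1d(x, w, rate):
    """y[m] = sum_k x[m + rate*k] * w[k], over valid output positions only."""
    K = len(w)
    out_len = len(x) - rate * (K - 1)
    return np.array([sum(x[m + rate * k] * w[k] for k in range(K))
                     for m in range(out_len)])

x = np.arange(10, dtype=float)
w = np.array([1.0, 1.0, 1.0])
y1 = atrous_conv1d(x, w, rate=1)   # rate 1: ordinary (correlation-form) convolution
y2 = atrous_conv1d(x, w, rate=2)   # rate 2: taps 2 apart, same kernel size, wider span
```

With rate 2 the three taps cover a span of 5 input samples instead of 3, which is exactly the receptive-field enlargement the patent exploits, at the cost of skipping the interior pixels between taps.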
Step B5: repeat the previous step with different rates until n_tspp features are obtained (3 features in this embodiment, with rates 6, 12 and 18), then splice these n_tspp features with F_1×1 and F_image to obtain the three-level context spatial pyramid fusion feature F_tspp;
Step B6: apply a 1×1 convolution to F_tspp for dimensionality reduction, then regularize with dropout to obtain the final encoding feature F_encoder.
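The splicing and 1×1 fusion of steps B4-B6 amount to channel-wise concatenation followed by a per-pixel linear map. A shape-level NumPy sketch, with random arrays standing in for the actual branch outputs and learned weights (all sizes here are illustrative assumptions):

```python
import numpy as np

def conv1x1(feat, weight):
    """A 1x1 convolution is a per-pixel linear map over channels: (H,W,Cin) -> (H,W,Cout)."""
    return feat @ weight

H, W, C = 4, 4, 8
rng = np.random.default_rng(1)
f_1x1   = rng.standard_normal((H, W, C))                       # reduced backbone feature (B2)
f_image = rng.standard_normal((H, W, C))                       # image-level global feature (B3)
branches = [rng.standard_normal((H, W, C)) for _ in range(3)]  # atrous branches, rates 6/12/18 (B4-B5)

f_tspp = np.concatenate(branches + [f_1x1, f_image], axis=-1)  # splice all five along channels
f_encoder = conv1x1(f_tspp, rng.standard_normal((f_tspp.shape[-1], 256)))  # fuse + reduce (B6)
```

The concatenation stacks 5 × 8 = 40 channels, and the final 1×1 map mixes every branch at each pixel, which is how the local, multi-rate, and global context levels get fused into one encoding feature.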
Step C: enlarge F_encoder to half the input image size to obtain the half-input-size encoding feature F_us; select mid-layer features F_mid^os from the convolutional network and compute edge features F_edge^os; and, combining F_us with the edge features, use a dense net as the decoding network to reconstruct the image resolution and compute the decoding feature F_decoder; specifically comprising the following steps:
Step C1: define the output stride of a feature as the ratio of the original input image size to the feature size, and process the encoding feature F_encoder with nearest-neighbour interpolation to obtain the feature map F_us with output stride 2;
Step C2: select from the generic-feature convolutional network the mid-layer feature F_mid^os with output stride os, first reduce its dimensionality with a 1×1 convolution, then enlarge it with bilinear interpolation to obtain the edge feature F_edge^os;
Step C3: splice F_us and F_edge^os, and after a 1×1 convolution for dimensionality reduction, apply a 3×3 convolution to extract features and obtain the decoding feature F_decoder;
Step C4: choose an output stride os smaller than that used in the previous pass of step C2; if every output stride has been processed, decoding-feature extraction is complete; otherwise splice F_us and F_decoder as the new F_us and repeat steps C2 to C3.
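The decoding loop of steps C1-C4 can be sketched at the level of tensor shapes as follows. Nearest-neighbour upsampling stands in for the bilinear expansion, random matrices stand in for learned convolutions, and the strides and channel counts are illustrative assumptions, not the embodiment's settings:

```python
import numpy as np

def upsample_nn(feat, factor):
    """Nearest-neighbour upsampling by an integer factor (stand-in for interpolation)."""
    return feat.repeat(factor, axis=0).repeat(factor, axis=1)

def conv1x1(feat, c_out, rng):
    """Per-pixel linear map standing in for a learned 1x1 (or 1x1 + 3x3) convolution."""
    return feat @ rng.standard_normal((feat.shape[-1], c_out))

rng = np.random.default_rng(2)
H = 64                                                     # input image side
f_encoder = rng.standard_normal((H // 16, H // 16, 32))    # encoder output, stride 16
f_us = upsample_nn(f_encoder, 8)                           # C1: to output stride 2

f_decoder = None
for os_ in (8, 4):                                         # C2: mid-layer features, decreasing stride
    f_mid = rng.standard_normal((H // os_, H // os_, 16))
    f_edge = upsample_nn(conv1x1(f_mid, 8, rng), os_ // 2) # reduce, then expand to stride 2
    if f_decoder is not None:                              # C4: dense reuse of previous output
        f_us = np.concatenate([f_us, f_decoder], axis=-1)
    spliced = np.concatenate([f_us, f_edge], axis=-1)      # C3: splice F_us with F_edge^os
    f_decoder = conv1x1(spliced, 16, rng)                  # stands in for 1x1 reduce + 3x3 conv
```

The key point of the dense structure is the `concatenate([f_us, f_decoder])` step: each pass keeps the earlier decoding output in the input of the next pass instead of discarding it.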
Step D: obtain the semantic segmentation probability map and the edge probability maps from F_decoder and the edge features F_edge^os respectively; compute edge annotations from the semantic annotations of the training set; compute the semantic segmentation loss and the auxiliary edge-supervision loss from the probability maps and their corresponding annotations; and train the whole deep neural network to minimize the weighted sum of the two losses; specifically comprising the following steps:
Step D1: scale F_decoder and all edge features F_edge^os to the input image size with bilinear interpolation, and obtain the semantic segmentation probabilities and edge probabilities with 1×1 convolutions that use softmax as the activation function, where softmax is computed as

σ_c = e^{γ_c} / Σ_{k=1..C} e^{γ_k}

where σ_c is the probability of category c, e is the natural exponent, γ_c and γ_k are the pre-activation feature values of categories c and k respectively, and C is the total number of categories;
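The softmax of step D1 is standard; a NumPy sketch (the max-subtraction is a common numerical-stability trick, not part of the formula):

```python
import numpy as np

def softmax(gamma):
    """sigma_c = e^{gamma_c} / sum_k e^{gamma_k}, stabilized by subtracting the max."""
    z = np.exp(gamma - gamma.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

p = softmax(np.array([2.0, 1.0, 0.0]))   # per-pixel pre-activation values for C = 3 classes
```

Applied along the channel axis of a (H, W, C) logit map, this turns the 1×1 convolution outputs into per-pixel category probabilities that sum to 1.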
Step D2: one-hot encode the semantic segmentation annotations of the training set and then compute the edge annotations as

y_edge(i, j, c) = sgn( Σ_{(i_u, j_u) ∈ U_8} | y_semantic(i, j, c) − y_semantic(i_u, j_u, c) | )

where y_edge(i, j, c) and y_semantic(i, j, c) are the edge annotation and semantic annotation of class c at coordinate (i, j), (i_u, j_u) ranges over the 8-neighbourhood U_8 of (i, j), and sgn(·) is the sign function;
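The edge-annotation formula of step D2 is directly computable from a one-hot semantic map: a pixel is an edge pixel for class c exactly when some 8-neighbour differs from it in channel c. A straightforward (unvectorized) NumPy sketch:

```python
import numpy as np

def edge_labels(y_semantic):
    """y_edge(i,j,c) = sgn( sum over the 8-neighbourhood of |y(i,j,c) - y(iu,ju,c)| ),
    for a one-hot semantic map of shape (H, W, C)."""
    H, W, C = y_semantic.shape
    y_edge = np.zeros_like(y_semantic)
    for i in range(H):
        for j in range(W):
            acc = np.zeros(C)
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue                      # U_8 excludes the centre pixel
                    iu, ju = i + di, j + dj
                    if 0 <= iu < H and 0 <= ju < W:   # neighbours outside the image are skipped
                        acc += np.abs(y_semantic[i, j] - y_semantic[iu, ju])
            y_edge[i, j] = np.sign(acc)               # sgn: any difference at all marks an edge
    return y_edge

labels = np.zeros((4, 4), dtype=int)
labels[:, 2:] = 1                                     # left half class 0, right half class 1
edges = edge_labels(np.eye(2)[labels])                # one-hot encode, then derive edges
```

On this toy map the two columns on either side of the class boundary are flagged as edges, and interior pixels are not, which is the supervision signal the auxiliary edge loss uses.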
Step D3: using the probability maps of both semantic segmentation and edges together with their corresponding annotations, compute the pixel-level cross-entropies to obtain the semantic segmentation loss L_s and the auxiliary edge-supervision losses L_edge^os, and then compute the weighted total loss

L = L_s + Σ_os α_os · L_edge^os

where L_edge^os is the loss corresponding to the edge feature F_edge^os and α_os is its weight in the final loss; the weights α_os satisfy a normalization constraint and are all equal;
finally, update the model parameters by back-propagation iterations with the stochastic gradient descent optimization method, training the whole deep neural network to minimize the weighted loss L, and obtaining the final deep neural network model.
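The weighted objective of step D3 can be sketched in NumPy as pixel-level cross-entropies combined by the weights α_os (the weight value below is an illustrative assumption):

```python
import numpy as np

def cross_entropy(p, y, eps=1e-12):
    """Mean pixel-level cross-entropy between a probability map p and one-hot labels y."""
    return -np.mean(np.sum(y * np.log(p + eps), axis=-1))

def total_loss(p_seg, y_seg, edge_terms, alphas):
    """L = L_s + sum_os alpha_os * L_edge^os.
    edge_terms: list of (edge probability map, edge label map) pairs, one per supervised scale."""
    L = cross_entropy(p_seg, y_seg)
    for (p_e, y_e), a in zip(edge_terms, alphas):
        L += a * cross_entropy(p_e, y_e)
    return L

y = np.eye(3)[np.array([[0, 1], [2, 0]])]            # 2x2 one-hot ground truth, C = 3
L0 = total_loss(y, y, [(y, y)], [0.5])               # perfect predictions -> loss ~ 0
p_bad = np.full((2, 2, 3), 1.0 / 3.0)                # uniform predictions
L1 = total_loss(p_bad, y, [], [])                    # -> log(3) per pixel
```

Minimizing L with stochastic gradient descent, as the text describes, drives both the segmentation term and every auxiliary edge term down together; the α_os control how strongly the edge supervision pulls on the shared parameters.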
Step E: perform semantic segmentation on the image to be segmented with the trained deep neural network model and output the segmentation result.
The present invention also provides a semantic segmentation system for street-scene understanding that realizes the above method, as shown in Fig. 2, comprising:
a preprocessing module, for preprocessing the training-set input images, including subtracting the image mean from each image to standardize it and taking random crops of a uniform size to obtain preprocessed images of identical size;
an encoding-feature extraction module, for extracting the generic feature F_backbone with a convolutional network, deriving from F_backbone the three-level context spatial pyramid fusion feature F_tspp to capture multi-scale contextual information, and extracting the encoding feature F_encoder with the cascade of these two parts as the encoding network;
a decoding-feature extraction module, for enlarging F_encoder to half the input image size to obtain the half-input-size encoding feature F_us, selecting mid-layer features F_mid^os from the convolutional network, computing edge features F_edge^os, and, combining F_us with the edge features, reconstructing the image resolution with a dense net as the decoding network to extract the decoding feature F_decoder;
a neural network training module, for obtaining the semantic segmentation probability map and the edge probability maps from F_decoder and the edge features respectively, computing edge annotations from the semantic annotations of the training set, computing the semantic segmentation loss and the auxiliary edge-supervision loss from the probability maps and their corresponding annotations, and training the whole deep neural network to minimize the weighted sum of the two losses, obtaining the deep neural network model; and
a semantic segmentation module, for performing semantic segmentation on the image to be segmented with the trained deep neural network model and outputting the segmentation result.
The above are preferred embodiments of the present invention; any changes made according to the technical solution of the present invention whose effects do not go beyond the scope of the technical solution of the present invention all belong to the protection scope of the present invention.

Claims (5)

1. A semantic segmentation method based on edge-based dense reconstruction for street-scene understanding, characterized by comprising the following steps:
Step A: preprocessing the training-set input images: first subtracting the image mean from each image to standardize it, then taking random crops of a uniform size to obtain preprocessed images of identical size;
Step B: extracting the generic feature F_backbone with a convolutional network, then deriving from F_backbone the three-level context spatial pyramid fusion feature F_tspp to capture multi-scale contextual information, and extracting the encoding feature F_encoder with the cascade of these two parts as the encoding network;
Step C: enlarging F_encoder to half the input image size to obtain the half-input-size encoding feature F_us, selecting mid-layer features F_mid^os from the convolutional network and computing edge features F_edge^os, and, combining F_us with the edge features, using a dense net as the decoding network to reconstruct the image resolution and compute the decoding feature F_decoder;
Step D: obtaining the semantic segmentation probability map and the edge probability maps from F_decoder and the edge features F_edge^os respectively, computing edge annotations from the semantic annotations of the training set, computing the semantic segmentation loss and the auxiliary edge-supervision loss from the probability maps and their corresponding annotations, and training the whole deep neural network to minimize the weighted sum of the two losses;
Step E: performing semantic segmentation on the image to be segmented with the trained deep neural network model and outputting the segmentation result.
2. the semantic segmentation method based on the dense reconstruction in edge according to claim 1 understood for streetscape, feature It is, in the step B, extracts generic features F with convolutional networkbackbone, then it is based on generic features FbackboneIt obtains in three-level Hereafter spatial pyramid fusion feature Ftspp, then cascaded using this two parts as coding network and extract coding characteristic Fencoder, packet Include following steps:
Step B1: generic features F is extracted to pretreatment image using convolutional networkbackbone
Step B2: using 1 × 1 convolution to feature FbackboneFeature Dimension Reduction is carried out, feature is obtained
Step B3: to FbackboneWhole image carries out average pond, then reuses arest neighbors demosaicing to full size, then pass through It crosses 1 × 1 convolution and obtains image level feature Fimage
Step B4: being r with porosityasConvolution kernel to FbackboneIt carries out convolution with holes and obtains featureThen splice in three-level Following traitsFimageWithFusion Features are carried out using 1 × 1 convolution afterwards, obtaining porosity is rasThree-level context fusion FeatureThe same distribution for keeping input in convolution process using batch standardization, uses line rectification function as activation primitive; Wherein, convolutional calculation formula with holes is as follows:
Wherein,It indicates in output coordinate masThe use porosity of position is rasConvolution with holes processing result, xas[mas +ras·kas] indicate input xasIn coordinate masOn position in porosity be rasAnd convolution kernel coordinate with holes is kasWhen it is corresponding defeated Enter reference pixel, was[kas] indicate in convolution kernel with holes as kasThe weight of position;
Step B5: repeat the previous step with different dilation rates until n_tspp features are obtained, then concatenate these n_tspp features with the dimension-reduced feature from step B2 and F_image to obtain the three-level context spatial pyramid fusion feature F_tspp;
Step B6: apply a 1 × 1 convolution to F_tspp for dimension reduction, then regularize with dropout to obtain the final coding feature F_encoder.
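The atrous (dilated) convolution described in step B4 can be illustrated with a minimal 1-D NumPy sketch; the function name and toy shapes below are assumptions for illustration, not the patent's implementation:

```python
import numpy as np

def atrous_conv1d(x, w, rate):
    """1-D atrous convolution: y[m] = sum_k x[m + rate*k] * w[k].

    Only 'valid' output positions are produced; no padding is applied.
    """
    k = len(w)
    span = rate * (k - 1)               # receptive field minus one
    out_len = len(x) - span
    y = np.empty(out_len)
    for m in range(out_len):
        # sample the input every `rate` pixels under the kernel
        y[m] = sum(x[m + rate * ki] * w[ki] for ki in range(k))
    return y

x = np.arange(8.0)                      # toy input signal
w = np.array([1.0, 1.0, 1.0])           # 3-tap kernel
print(atrous_conv1d(x, w, rate=1))      # dense convolution
print(atrous_conv1d(x, w, rate=2))      # dilation 2: wider receptive field, same kernel size
```

Increasing the rate widens the receptive field without adding kernel weights, which is what lets step B5 capture multi-scale context by varying only the dilation rate.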
3. The semantic segmentation method based on edge dense reconstruction for street view understanding according to claim 2, characterized in that, in step C, the coding feature F_encoder is enlarged to half the input image size to obtain the half-input-size coding feature F_us, middle-layer features are chosen from the convolutional network to compute edge features F_edge^os, and a dense net combining F_us with the edge features serves as the decoding network for image resolution reconstruction, computing the decoding feature F_decoder, comprising the following steps:
Step C1: define the ratio of the original input image size to a feature's size as that feature's output stride; process the coding feature F_encoder with nearest-neighbor interpolation to obtain the feature map F_us with output stride 2;
Step C2: choose the middle-layer feature with output stride os from the convolutional network used to extract the generic feature, reduce its dimension with a 1 × 1 convolution, then enlarge it with bilinear interpolation to obtain the edge feature F_edge^os;
Step C3: concatenate F_us and F_edge^os, reduce the dimension with a 1 × 1 convolution, then apply a 3 × 3 convolution to extract features and obtain the decoding feature F_decoder;
Step C4: choose an output stride os smaller than the one used in step C2; if all output strides have been processed, the decoding feature extraction is complete; otherwise concatenate F_us and F_decoder as the new F_us and repeat steps C2 to C3.
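The output-stride bookkeeping of steps C1–C4 can be illustrated with a toy nearest-neighbor resize; the sizes below (a 512 × 512 input with a stride-16 encoder feature) are assumptions for illustration:

```python
import numpy as np

def nearest_upsample(feat, factor):
    """Nearest-neighbor upsampling of an H x W feature map by an integer factor."""
    return np.repeat(np.repeat(feat, factor, axis=0), factor, axis=1)

# Suppose the input image is 512 x 512 and the encoder feature is 32 x 32:
# its output stride is 512 / 32 = 16.
f_encoder = np.ones((32, 32))
# Step C1 brings the feature to output stride 2, i.e. 256 x 256 here,
# so the upsampling factor is 16 / 2 = 8.
f_us = nearest_upsample(f_encoder, 16 // 2)
print(f_us.shape)  # (256, 256)
```

The decoder then repeatedly concatenates F_us with edge features taken at progressively smaller output strides, refining resolution step by step rather than in one jump.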
4. The semantic segmentation method based on edge dense reconstruction for street view understanding according to claim 3, characterized in that, in step D, the decoding feature F_decoder and the edge features F_edge^os are used to obtain the semantic segmentation probability map and the edge probability maps respectively, the edge image labels are computed from the semantic image labels in the training set, the semantic segmentation loss and the auxiliary-supervision edge losses are computed from the semantic segmentation probability map, the edge probability maps and the corresponding labels, and the whole deep neural network is trained with the objective of minimizing the weighted sum of the two losses, comprising the following steps:
Step D1: scale the feature F_decoder and all edge features F_edge^os to the size of the input image with bilinear interpolation, and obtain the semantic segmentation probabilities and edge probabilities through 1 × 1 convolutions with softmax as the activation function, where softmax is computed as:

σ_c = e^{γ_c} / Σ_{k=1}^{C} e^{γ_k}

where σ_c is the probability of class c, e is the base of the natural exponential, γ_c and γ_k denote the unactivated feature values of classes c and k respectively, and C is the total number of classes;
Step D2: one-hot encode the semantic segmentation labels of the training set, then compute the edge labels as:

y_edge(i, j, c) = sgn( Σ_{(i_u, j_u) ∈ U_8} | y_semantic(i, j, c) − y_semantic(i_u, j_u, c) | )

where y_edge(i, j, c) and y_semantic(i, j, c) are the edge label and the one-hot semantic label of class c at coordinate (i, j), (i_u, j_u) denotes a coordinate in the 8-neighborhood U_8 of (i, j), and sgn(·) is the sign function;
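The edge-label rule of step D2 (a pixel is an edge pixel for class c if its one-hot value for c differs from that of at least one 8-neighbor) can be sketched in NumPy; the function and variable names here are illustrative, not from the patent:

```python
import numpy as np

def edge_labels(y_semantic):
    """y_semantic: one-hot labels of shape (H, W, C).

    Returns y_edge of the same shape: 1 where the class value differs
    from at least one 8-neighbor, else 0 (sgn of the summed absolute
    differences reduces to an 'any neighbor differs' test).
    """
    h, w, _ = y_semantic.shape
    y_edge = np.zeros_like(y_semantic)
    shifts = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
              if (di, dj) != (0, 0)]                 # the 8-neighborhood U_8
    for i in range(h):
        for j in range(w):
            for di, dj in shifts:
                iu, ju = i + di, j + dj
                if 0 <= iu < h and 0 <= ju < w:
                    y_edge[i, j] |= (y_semantic[i, j] != y_semantic[iu, ju])
    return y_edge

# Two-class toy image: left half class 0, right half class 1.
labels = np.zeros((3, 4, 2), dtype=int)
labels[:, :2, 0] = 1
labels[:, 2:, 1] = 1
print(edge_labels(labels)[:, :, 0])  # edges appear along the class boundary
```

Deriving edge labels from the existing semantic labels means the auxiliary edge supervision needs no extra annotation effort.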
Step D3: use the probability maps of the semantic segmentation and the edges with their corresponding labels to compute pixel-level cross entropies, obtaining the semantic segmentation loss L_s and the auxiliary-supervision edge losses L_edge^os, then compute the weighted-sum loss L:

L = L_s + Σ_os α_os · L_edge^os

where L_edge^os is the penalty value corresponding to the edge feature F_edge^os, and α_os is its weight in the final loss.
Finally, the model parameters are updated iteratively by back-propagation with the stochastic gradient descent optimizer, training the whole deep neural network to minimize the weighted-sum loss L and obtaining the final deep neural network model.
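Steps D1–D3 amount to a softmax over class scores, pixel-level cross entropies, and a weighted sum of losses. A minimal NumPy sketch, with toy shapes and an assumed auxiliary weight (the patent does not fix these values):

```python
import numpy as np

def softmax(scores):
    """Softmax over the last (class) axis: sigma_c = e^{gamma_c} / sum_k e^{gamma_k}."""
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))  # numerically stabilized
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, onehot):
    """Mean pixel-level cross entropy between probabilities and one-hot labels."""
    return -np.mean(np.sum(onehot * np.log(probs + 1e-12), axis=-1))

# Toy maps: 2x2 pixels, 3 segmentation classes, 2 classes for one edge head.
rng = np.random.default_rng(0)
seg_scores  = rng.standard_normal((2, 2, 3))
edge_scores = rng.standard_normal((2, 2, 2))
seg_onehot  = np.eye(3)[rng.integers(3, size=(2, 2))]
edge_onehot = np.eye(2)[rng.integers(2, size=(2, 2))]

L_s    = cross_entropy(softmax(seg_scores), seg_onehot)    # segmentation loss
L_edge = cross_entropy(softmax(edge_scores), edge_onehot)  # one auxiliary edge loss
alpha  = 0.4                       # illustrative weight alpha_os
L      = L_s + alpha * L_edge      # weighted-sum training objective
```

In the full method there is one such edge term per output stride os, each with its own weight α_os, and L is minimized by SGD with back-propagation.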
5. A semantic segmentation system based on edge dense reconstruction for street view understanding, characterized by comprising:
a preprocessing module for preprocessing the training-set input images, including subtracting the image mean from each image for standardization and randomly cropping the images to a uniform size to obtain preprocessed images of identical size;
a coding feature extraction module for extracting the generic feature F_backbone with a convolutional network, obtaining the three-level context spatial pyramid fusion feature F_tspp based on F_backbone to capture multi-scale contextual information, and cascading the two parts as the coding network to extract the coding feature F_encoder;
a decoding feature extraction module for enlarging the coding feature F_encoder to half the input image size to obtain the half-input-size coding feature F_us, choosing middle-layer features from the convolutional network to compute edge features F_edge^os, and using a dense net combining F_us with the edge features as the decoding network for image resolution reconstruction, extracting the decoding feature F_decoder;
a neural network training module for obtaining the semantic segmentation probability map and the edge probability maps from the decoding feature F_decoder and the edge features F_edge^os respectively, computing the edge image labels from the semantic image labels in the training set, computing the semantic segmentation loss and the auxiliary-supervision edge losses from the probability maps and the corresponding labels, and training the whole deep neural network with the objective of minimizing their weighted-sum loss to obtain the deep neural network model; and
a semantic segmentation module for performing semantic segmentation on the image to be segmented with the trained deep neural network model and outputting the segmentation result.
CN201910359119.0A 2019-04-30 2019-04-30 Semantic segmentation method and system based on edge dense reconstruction for street view understanding Active CN110059698B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910359119.0A CN110059698B (en) 2019-04-30 2019-04-30 Semantic segmentation method and system based on edge dense reconstruction for street view understanding

Publications (2)

Publication Number Publication Date
CN110059698A true CN110059698A (en) 2019-07-26
CN110059698B CN110059698B (en) 2022-12-23

Family

ID=67321810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910359119.0A Active CN110059698B (en) 2019-04-30 2019-04-30 Semantic segmentation method and system based on edge dense reconstruction for street view understanding

Country Status (1)

Country Link
CN (1) CN110059698B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10095977B1 (en) * 2017-10-04 2018-10-09 StradVision, Inc. Learning method and learning device for improving image segmentation and testing method and testing device using the same
CN109241972A (en) * 2018-08-20 2019-01-18 电子科技大学 Image, semantic dividing method based on deep learning
CN109509192A (en) * 2018-10-18 2019-03-22 天津大学 Merge the semantic segmentation network in Analysis On Multi-scale Features space and semantic space

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YUZHONG CHEN: "Pyramid Context Contrast for Semantic Segmentation", IEEE Access *
HU, Tai: "Research on semantic segmentation algorithms for small objects based on deep neural networks", China Masters' Theses Full-text Database *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110517278B (en) * 2019-08-07 2022-04-29 北京旷视科技有限公司 Image segmentation and training method and device of image segmentation network and computer equipment
CN110517278A (en) * 2019-08-07 2019-11-29 北京旷视科技有限公司 Image segmentation and the training method of image segmentation network, device and computer equipment
CN110598846A (en) * 2019-08-15 2019-12-20 北京航空航天大学 Hierarchical recurrent neural network decoder and decoding method
CN110598846B (en) * 2019-08-15 2022-05-03 北京航空航天大学 Hierarchical recurrent neural network decoder and decoding method
CN110599514A (en) * 2019-09-23 2019-12-20 北京达佳互联信息技术有限公司 Image segmentation method and device, electronic equipment and storage medium
CN110599514B (en) * 2019-09-23 2022-10-04 北京达佳互联信息技术有限公司 Image segmentation method and device, electronic equipment and storage medium
CN110895814A (en) * 2019-11-30 2020-03-20 南京工业大学 Intelligent segmentation method for aero-engine hole detection image damage based on context coding network
CN113051983B (en) * 2019-12-28 2022-08-23 中移(成都)信息通信科技有限公司 Method for training field crop disease recognition model and field crop disease recognition
CN113051983A (en) * 2019-12-28 2021-06-29 中移(成都)信息通信科技有限公司 Method for training field crop disease recognition model and field crop disease recognition
CN111341438B (en) * 2020-02-25 2023-04-28 中国科学技术大学 Image processing method, device, electronic equipment and medium
CN111341438A (en) * 2020-02-25 2020-06-26 中国科学技术大学 Image processing apparatus, electronic device, and medium
CN111429473A (en) * 2020-02-27 2020-07-17 西北大学 Chest film lung field segmentation model establishment and segmentation method based on multi-scale feature fusion
CN111429473B (en) * 2020-02-27 2023-04-07 西北大学 Chest film lung field segmentation model establishment and segmentation method based on multi-scale feature fusion
CN111340047A (en) * 2020-02-28 2020-06-26 江苏实达迪美数据处理有限公司 Image semantic segmentation method and system based on multi-scale feature and foreground and background contrast
CN112150478A (en) * 2020-08-31 2020-12-29 温州医科大学 Method and system for constructing semi-supervised image segmentation framework
CN112700462A (en) * 2020-12-31 2021-04-23 北京迈格威科技有限公司 Image segmentation method and device, electronic equipment and storage medium
CN113128353B (en) * 2021-03-26 2023-10-24 安徽大学 Emotion perception method and system oriented to natural man-machine interaction
CN113128353A (en) * 2021-03-26 2021-07-16 安徽大学 Emotion sensing method and system for natural human-computer interaction
CN113706545A (en) * 2021-08-23 2021-11-26 浙江工业大学 Semi-supervised image segmentation method based on dual-branch nerve discrimination dimensionality reduction
CN113706545B (en) * 2021-08-23 2024-03-26 浙江工业大学 Semi-supervised image segmentation method based on dual-branch nerve discrimination dimension reduction
CN114627086A (en) * 2022-03-18 2022-06-14 江苏省特种设备安全监督检验研究院 Crane surface damage detection method based on improved feature pyramid network
CN114627086B (en) * 2022-03-18 2023-04-28 江苏省特种设备安全监督检验研究院 Crane surface damage detection method based on characteristic pyramid network
CN115953394A (en) * 2023-03-10 2023-04-11 中国石油大学(华东) Target segmentation-based detection method and system for mesoscale ocean vortexes
CN116978011A (en) * 2023-08-23 2023-10-31 广州新华学院 Image semantic communication method and system for intelligent target recognition
CN116978011B (en) * 2023-08-23 2024-03-15 广州新华学院 Image semantic communication method and system for intelligent target recognition

Also Published As

Publication number Publication date
CN110059698B (en) 2022-12-23

Similar Documents

Publication Publication Date Title
CN110059698A (en) The semantic segmentation method and system based on the dense reconstruction in edge understood for streetscape
CN110059768A (en) The semantic segmentation method and system of the merging point and provincial characteristics that understand for streetscape
CN110059769A (en) The semantic segmentation method and system rebuild are reset based on pixel for what streetscape understood
CN110070091A (en) The semantic segmentation method and system rebuild based on dynamic interpolation understood for streetscape
CN110033410B (en) Image reconstruction model training method, image super-resolution reconstruction method and device
CN115797931A (en) Remote sensing image semantic segmentation method based on double-branch feature fusion
CN110992270A (en) Multi-scale residual attention network image super-resolution reconstruction method based on attention
CN104113789B (en) On-line video abstraction generation method based on depth learning
CN108427920A (en) A kind of land and sea border defense object detection method based on deep learning
CN109598269A (en) A kind of semantic segmentation method based on multiresolution input with pyramid expansion convolution
CN108549893A (en) A kind of end-to-end recognition methods of the scene text of arbitrary shape
CN109410146A (en) A kind of image deblurring algorithm based on Bi-Skip-Net
CN112884073B (en) Image rain removing method, system, terminal and storage medium
CN110097110A (en) A kind of semantic image restorative procedure based on objective optimization
CN111179196A (en) Multi-resolution depth network image highlight removing method based on divide-and-conquer
CN113762265A (en) Pneumonia classification and segmentation method and system
CN111462090A (en) Multi-scale image target detection method
CN111126185B (en) Deep learning vehicle target recognition method for road gate scene
CN115082966A (en) Pedestrian re-recognition model training method, pedestrian re-recognition method, device and equipment
CN113688715A (en) Facial expression recognition method and system
Feng et al. Coal mine image dust and fog clearing algorithm based on deep learning network
Wan et al. Siamese Attentive Convolutional Network for Effective Remote Sensing Image Change Detection
CN110414301A (en) It is a kind of based on double compartment crowd density estimation methods for taking the photograph head
CN116485689B (en) Progressive coupling image rain removing method and system based on CNN and transducer
CN117934327A (en) Reflective image removing method based on characteristic dynamic cross perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant