CN109829929A - A hierarchical scene semantic segmentation model based on deep edge detection - Google Patents
A hierarchical scene semantic segmentation model based on deep edge detection - Download PDF
- Publication number
- CN109829929A CN109829929A CN201811649016.XA CN201811649016A CN109829929A CN 109829929 A CN109829929 A CN 109829929A CN 201811649016 A CN201811649016 A CN 201811649016A CN 109829929 A CN109829929 A CN 109829929A
- Authority
- CN
- China
- Prior art keywords
- network
- edge
- filtering
- domain
- semantic feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The present invention relates to a hierarchical scene semantic segmentation model based on deep edge detection, comprising the following steps: (1) extract semantic features using PSPNet, with the parameters of the PSPNet network set using the Adam algorithm; (2) feed the semantic features obtained by the PSPNet network into a domain-transform model as input and train the whole network, continually adjusting the network parameters by controlling the magnitude of the loss; (3) filter the regions close to edges using the Fully-CRF algorithm to obtain the final result. By applying a filtering step targeted at edge regions, the model achieves higher-precision edge detection and further improves the accuracy of scene semantic segmentation.
Description
Technical field
The present invention relates to a hierarchical scene semantic segmentation model based on deep edge detection, and belongs to the field of scene segmentation technology.
Background technique
With the rapid development of computer vision, the various algorithms built on it are constantly being improved and refined, and scene segmentation is an indispensable and widely used part of the field. Scene segmentation algorithms are frequently applied wherever the current environment must be modeled, for example in autonomous driving, where scene segmentation acts as the vehicle's eyes: it is responsible for reconstructing the environment around the vehicle as faithfully as possible so that the vehicle's subsequent decisions can be made. As segmentation accuracy continues to improve, scene segmentation will play an increasingly important role in more and more fields.
In the prior art, a classical approach to semantic segmentation is to take an image patch centered on each pixel and use the features of that patch as a training sample for a classifier. At test time, a patch is likewise taken around every pixel of the test image and classified, and the classification result serves as the predicted label for that pixel; classifying every pixel in this way achieves scene segmentation. However, this approach introduces considerable noise into the segmentation and is prone to errors in edge regions.
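The patch-based prior-art procedure described above can be sketched as follows. This is a minimal illustration rather than any particular prior system, and `classifier` stands in for whatever trained patch classifier is used:

```python
def extract_patch(image, row, col, size):
    """Extract a size x size patch centered on (row, col), clamping indices
    at the image border. `image` is an H x W list of lists of pixel values."""
    half = size // 2
    h, w = len(image), len(image[0])
    patch = []
    for r in range(row - half, row + half + 1):
        rr = min(max(r, 0), h - 1)  # clamp the row index at the border
        patch.append([image[rr][min(max(c, 0), w - 1)]
                      for c in range(col - half, col + half + 1)])
    return patch

def classify_pixelwise(image, size, classifier):
    """Prior-art style segmentation: classify the patch around every pixel
    and use the result as that pixel's predicted label."""
    return [[classifier(extract_patch(image, r, c, size))
             for c in range(len(image[0]))]
            for r in range(len(image))]
```

Because each label is decided from a small local window, predictions near object boundaries mix pixels from both sides of the edge, which is exactly the noise and edge error this invention targets.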
Summary of the invention
The purpose of the present invention is to provide a hierarchical scene semantic segmentation model based on deep edge detection that, by applying a filtering step targeted at edge regions, achieves higher-precision edge detection and further improves the accuracy of scene semantic segmentation.
To achieve the above object, the technical scheme of the present invention is realized as follows. A hierarchical scene semantic segmentation model based on deep edge detection comprises the following steps:
(1) Extract semantic features using PSPNet, with the parameters of the PSPNet network set using the Adam algorithm.
(2) Feed the semantic features obtained by the PSPNet network into the domain-transform model as input and train the whole network. By controlling the magnitude of the loss, the network parameters are continually adjusted so that the domain-transform density, i.e. the edge network's estimate of edge strength, reaches a certain accuracy. The parameters of the whole network are tuned using the SGD algorithm. Positions far from edges are thereby filtered.
To improve the detection of semantic edges, an additional convolutional layer is added to the edge network, and its number of output channels is set to 10. All convolution kernels in the edge network are of size 1. A 1x1 convolution kernel convolves each pixel of the feature map individually and thus functions like a fully connected layer: it can capture global information to some extent and performs an encoding role. Compared with a fully connected layer, however, the convolutional layer has far fewer parameters, which simplifies the network model and effectively suppresses overfitting. A convolutional layer with kernel size 1 is therefore better suited than a fully connected layer to the feature-fusion operation.
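As an illustration of why a 1x1 convolution behaves like a per-pixel fully connected layer with far fewer parameters, here is a minimal plain-Python sketch (not the patent's network code):

```python
def conv1x1(feature_map, weights, bias):
    """Apply a 1x1 convolution: the same linear map over channels is applied
    independently at every pixel, like a per-pixel fully connected layer.
    feature_map: H x W x C_in nested lists; weights: C_out x C_in; bias: C_out.
    Parameter count is C_out * C_in + C_out, independent of H and W."""
    out = []
    for row_pixels in feature_map:
        out_row = []
        for pixel in row_pixels:  # pixel is a length-C_in channel vector
            out_row.append([sum(wk * v for wk, v in zip(w_row, pixel)) + b
                            for w_row, b in zip(weights, bias)])
        out.append(out_row)
    return out
```

A fully connected layer acting on the flattened H x W x C_in map would need on the order of (H·W·C_in)·(H·W·C_out) weights, while the 1x1 convolution needs only C_out·C_in + C_out, which is the parameter saving the text describes.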
(3) Filter the regions close to edges using the Fully-CRF algorithm.
Fully-CRF combines characteristics of the maximum entropy model and the hidden Markov model. It is an undirected graphical model that builds its energy function with the image pixels as nodes. The connectivity of a Fully-CRF is global: its binary (pairwise) potential function describes the relationship between each pixel and every other pixel.
The positive effect of the present invention is to achieve higher-precision detection of edge regions, further improving the accuracy of scene semantic segmentation.
Detailed description of the invention
Fig. 1(a) is part one of the overall flowchart of the hierarchical scene semantic segmentation model based on deep edge detection according to the present invention.
Fig. 1(b) is part two of the overall flowchart of the hierarchical scene semantic segmentation model based on deep edge detection according to the present invention.
Fig. 2 is the network flowchart of the PSPNet algorithm in the present invention.
Fig. 3 is a schematic diagram of the domain transform operation in the present invention.
Fig. 4(a) shows one propagation mode of the domain transform in the present invention.
Fig. 4(b) shows another propagation mode of the domain transform in the present invention.
Fig. 5 is a schematic diagram of the filtering effect of the Fully-CRF algorithm in the present invention.
Specific embodiment
A specific embodiment of the invention is described below with reference to the accompanying drawings, so that those skilled in the art can better understand the present invention. Note that in the following description, detailed descriptions of known functions and designs are omitted where they would obscure the main content of the invention.
Fig. 1(a) and (b) show the overall flowchart of the hierarchical scene semantic segmentation model based on deep edge detection according to the present invention.
In the present embodiment, as shown in Fig. 1(a) and (b), a hierarchical scene semantic segmentation model based on deep edge detection according to the present invention comprises the following steps:
S1. Extract semantic features using PSPNet, with the parameters of the PSPNet network set using the Adam algorithm.
As shown in Fig. 2, in the final decoder PSPNet downsamples the encoder's semantic features to several different scales, then upsamples these features back to the same size as the input image, and finally fuses them. PSPNet's greatest advantage is therefore that it extracts and fuses both local and global semantic features.
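The pooling-and-fusion idea behind PSPNet can be sketched as follows. This is a simplified single-channel illustration with nearest-neighbor upsampling; the actual network pools learned multi-channel features and uses learned convolutions and bilinear interpolation:

```python
def avg_pool_to_grid(feature, bins):
    """Average-pool an H x W single-channel map onto a bins x bins grid."""
    h, w = len(feature), len(feature[0])
    grid = []
    for gr in range(bins):
        r0, r1 = gr * h // bins, (gr + 1) * h // bins
        row = []
        for gc in range(bins):
            c0, c1 = gc * w // bins, (gc + 1) * w // bins
            vals = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(vals) / len(vals))
        grid.append(row)
    return grid

def pyramid_pool(feature, bin_sizes=(1, 2)):
    """PSPNet-style pyramid pooling: pool the map onto several coarse grids,
    upsample each grid back to the input size (nearest neighbor here), and
    stack the results with the original map as extra feature channels."""
    h, w = len(feature), len(feature[0])
    channels = [feature]
    for bins in bin_sizes:
        grid = avg_pool_to_grid(feature, bins)
        channels.append([[grid[r * bins // h][c * bins // w] for c in range(w)]
                         for r in range(h)])
    return channels
```

The 1-bin channel carries purely global context (the scene-wide average), while the finer grids carry increasingly local context; concatenating them is the local-plus-global fusion the text describes.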
S2. Feed the semantic features obtained by the PSPNet network into the domain-transform model as input and train the whole network. By controlling the magnitude of the loss, the network parameters are continually adjusted so that the domain-transform density, i.e. the edge network's estimate of edge strength, reaches a certain accuracy. The parameters of the whole network are tuned using the SGD algorithm. Positions far from edges are thereby filtered.
As shown in Fig. 4(a), the filtering proceeds as follows. Assume the one-dimensional input signal x has length N, x = {x_1, x_2, x_3, ..., x_N}. Let the output satisfy y_1 = x_1; then for indices i = 2, ..., N the processing is:

y_i = (1 - w_i) x_i + w_i y_{i-1}    (1.1)

where the weight w_i is obtained from the domain-transform density d_i:

w_i = exp(-√2 d_i / σ_k)    (1.2)

However, the filtering performed by formula (1.1) is asymmetric: because the output at the current position depends only on the output at the previous position, the filtering result is biased toward one direction, and such asymmetric processing propagates poorer segmentation results downstream. To solve this problem, the domain transform applies filtering passes in four directions in turn: left to right, right to left, top to bottom, and bottom to top. As shown in Fig. 3, the domain transform processes a 2D signal in a separable manner, performing independent one-dimensional filtering along each spatial dimension: first the horizontal passes (left to right and right to left), then the vertical passes (top to bottom and bottom to top). The domain transform reduces the standard deviation of the filter kernel at each iteration and requires the total variance to equal the desired variance σ_s², that is:

σ_k = σ_s √3 · 2^(K-k) / √(4^K - 1)    (1.3)

At the k-th iteration, σ_k is used instead of σ_s to compute the weight w_i. The domain-transform density d_i is defined as:

d_i = 1 + (σ_s / σ_r) g_i    (1.4)

where the variable g_i > 0 is the output of the edge network, and σ_r denotes the standard deviation of the filter kernel over the edge-detection feature map. Note that the larger the value of g_i, the higher the probability that position i belongs to an edge. Accordingly, when g_i is large, the output of the domain transform depends mostly on the original input signal x_i (the semantic feature); when g_i is small, the output depends mostly on the previous result y_{i-1}, which realizes the filtering of semantic features in places far from edges.
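A runnable sketch of one left-to-right filtering pass follows. The forms of d_i and w_i are taken from the published domain-transform literature (Gastal and Oliveira; Chen et al.), which matches the quantities described here; the patent's own equation images did not survive extraction, so the exact constants are assumptions:

```python
import math

def dt_density(g, sigma_s, sigma_r):
    """Domain-transform density d_i = 1 + (sigma_s / sigma_r) * g_i,
    where g_i is the edge network's response at position i."""
    return [1.0 + (sigma_s / sigma_r) * gi for gi in g]

def dt_filter_pass(x, d, sigma_k):
    """One left-to-right pass of y_i = (1 - w_i) x_i + w_i y_{i-1},
    with weight w_i = exp(-sqrt(2) * d_i / sigma_k)."""
    y = [x[0]]
    for i in range(1, len(x)):
        w = math.exp(-math.sqrt(2.0) * d[i] / sigma_k)
        y.append((1.0 - w) * x[i] + w * y[i - 1])
    return y
```

Where g_i is large (an edge), d_i is large, so w_i is close to 0 and the output follows the input x_i; far from edges w_i stays large and the output follows y_{i-1}, smoothing the semantic features exactly as the text describes.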
As shown in Fig. 4(b), suppose node y_i influences not only the next node y_{i+1} but also a subsequent layer; then in the backpropagation process of the convolutional network, y_i also receives a gradient value from the current layer. The gradient propagation formulas are:

∂L/∂x_i += (1 - w_i) ∂L/∂y_i    (1.5)
∂L/∂w_i += (y_{i-1} - x_i) ∂L/∂y_i    (1.6)
∂L/∂y_{i-1} += w_i ∂L/∂y_i    (1.7)

where ∂L/∂x_i and ∂L/∂w_i are initialized to 0, and ∂L/∂y_i is initialized to the value passed down by the subsequent layer. The weight w_i is shared across all filtering passes (horizontal and vertical) and across all iterations.
Using these partial derivatives, the derivative with respect to the edge signal g_i can be obtained. Substituting formula (1.4) into formula (1.2) gives:

w_i = exp(-(√2/σ_k)(1 + (σ_s/σ_r) g_i))    (1.8)

Then, by the rules of partial differentiation, substituting formula (1.8) into (1.6) yields the derivative of the edge signal:

∂L/∂g_i = -(√2 σ_s)/(σ_k σ_r) w_i ∂L/∂w_i    (1.9)

At this point, the loss computed at the loss layer can be propagated to both the edge network and the semantic-feature extraction network.
To improve the detection of semantic edges, an additional convolutional layer is added to the edge network, and its number of output channels is set to 10. All convolution kernels in the edge network are of size 1. A 1x1 convolution kernel convolves each pixel of the feature map individually and thus functions like a fully connected layer: it can capture global information to some extent and performs an encoding role. Compared with a fully connected layer, however, the convolutional layer has far fewer parameters, which simplifies the network model and effectively suppresses overfitting. A convolutional layer with kernel size 1 is therefore better suited than a fully connected layer to the feature-fusion operation.
S3. Filter the regions close to edges using the Fully-CRF algorithm.
As shown in Fig. 5, the energy function constructed by the Fully-CRF is given by formula (1.10):

E(x) = Σ_i θ_i(x_i) + Σ_{i<j} θ_ij(x_i, x_j)    (1.10)

where x is the class prediction for the pixels. The unary term is θ_i(x_i) = -log P(x_i), where P(x_i) is the class probability computed by the convolutional network at position i. The pairwise potential function is:

θ_ij(x_i, x_j) = μ(x_i, x_j) Σ_m w_m k_m(f_i, f_j)

where μ(x_i, x_j) = 1 when x_i ≠ x_j, and μ(x_i, x_j) = 0 otherwise. As shown in Fig. 5, no matter how far apart pixels i and j are in the image, there is a connection between them, so the graphical model is fully connected. Each k_m corresponds to features extracted between pixels i and j, with weight w_m, and takes the position and color information of the pixels into account. It is expressed as:

k(f_i, f_j) = w_1 exp(-|p_i - p_j|²/(2σ_α²) - |I_i - I_j|²/(2σ_β²)) + w_2 exp(-|p_i - p_j|²/(2σ_γ²))

where the variables p and I denote the position and RGB value of a pixel, respectively. The first Gaussian kernel depends on both the position and color of the pixels, while the second depends only on position. σ_α, σ_β, and σ_γ are the parameters of the corresponding Gaussian kernels.
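The two Gaussian kernels and the label-compatibility term can be sketched as follows. This is the standard fully connected CRF form (Krähenbühl and Koltun); the parameter values in the usage below are illustrative, not the patent's settings:

```python
import math

def pairwise_kernel(p_i, p_j, I_i, I_j, w1, w2, s_alpha, s_beta, s_gamma):
    """Pairwise feature kernel of a fully connected CRF: an appearance kernel
    on position + color plus a smoothness kernel on position only.
    p_* are (row, col) positions, I_* are RGB tuples."""
    d_pos = sum((a - b) ** 2 for a, b in zip(p_i, p_j))  # squared position distance
    d_col = sum((a - b) ** 2 for a, b in zip(I_i, I_j))  # squared color distance
    appearance = w1 * math.exp(-d_pos / (2 * s_alpha ** 2)
                               - d_col / (2 * s_beta ** 2))
    smoothness = w2 * math.exp(-d_pos / (2 * s_gamma ** 2))
    return appearance + smoothness

def pairwise_potential(x_i, x_j, kernel_value):
    """theta_ij = mu(x_i, x_j) * k(f_i, f_j), with Potts compatibility:
    mu = 1 when the labels differ and 0 otherwise."""
    return kernel_value if x_i != x_j else 0.0
```

Nearby pixels with similar color thus pay a high penalty for taking different labels, while distant or dissimilar pixels pay almost nothing, which is what sharpens the segmentation along edges.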
Although an illustrative specific embodiment of the present invention has been described above so that those skilled in the art may understand the invention, it should be clear that the invention is not limited to the scope of that specific embodiment. To those of ordinary skill in the art, various changes that remain within the spirit and scope of the invention as defined and determined by the appended claims are apparent, and all innovations that make use of the inventive concept fall within the scope of protection.
Claims (2)
1. A hierarchical scene semantic segmentation model based on deep edge detection, characterized by comprising the following steps:
(1) extracting semantic features using PSPNet, with the parameters of the PSPNet network set using the Adam algorithm;
(2) feeding the semantic features obtained by the PSPNet network into a domain-transform model as input and training the whole network; by controlling the magnitude of the loss, the network parameters are continually adjusted so that the domain-transform density d_i, i.e. the edge network's estimate of edge strength, reaches a certain accuracy; the parameters of the whole network are tuned using the SGD algorithm, and positions far from edges are thereby filtered;
wherein, to improve the detection of semantic edges, an additional convolutional layer is added to the edge network, and its number of output channels is set to 10; all convolution kernels in the edge network are of size 1; a 1x1 convolution kernel convolves each pixel of the feature map individually and thus functions like a fully connected layer: it can capture global information to some extent and performs an encoding role, and compared with a fully connected layer its parameter count is far smaller, which simplifies the network model and effectively suppresses overfitting; a convolutional layer with kernel size 1 is therefore better suited than a fully connected layer to the feature-fusion operation;
(3) filtering the regions close to edges using the Fully-CRF algorithm to obtain the final result.
2. The hierarchical scene semantic segmentation model based on deep edge detection according to claim 1, characterized in that the domain-transform filtering operation is as follows:
assume the one-dimensional input signal x has length N, x = {x_1, x_2, x_3, ..., x_N}; let the output satisfy y_1 = x_1; then for indices i = 2, ..., N the processing is:

y_i = (1 - w_i) x_i + w_i y_{i-1}    (1.1)

where the weight w_i is obtained from the domain-transform density d_i:

w_i = exp(-√2 d_i / σ_k)    (1.2)

however, the filtering performed by formula (1.1) is asymmetric: because the output at the current position depends only on the output at the previous position, the filtering result is biased toward one direction, and such asymmetric processing propagates poorer segmentation results downstream; to solve this problem, the domain transform applies filtering passes in four directions in turn: left to right, right to left, top to bottom, and bottom to top; the processing of a 2D signal is performed in a separable manner, i.e. independent one-dimensional filtering is applied along each spatial dimension: first the horizontal passes, left to right and right to left, then the vertical passes, top to bottom and bottom to top; the domain transform reduces the standard deviation of the filter kernel at each iteration and requires the total variance to equal the desired variance σ_s², that is:

σ_k = σ_s √3 · 2^(K-k) / √(4^K - 1)    (1.3)

at the k-th iteration, σ_k is used instead of σ_s to compute the weight w_i; the domain-transform density d_i is defined as:

d_i = 1 + (σ_s / σ_r) g_i    (1.4)

where the variable g_i > 0 is the output of the edge network, and σ_r denotes the standard deviation of the filter kernel over the edge-detection feature map; note that the larger the value of g_i, the higher the probability that position i belongs to an edge; accordingly, when g_i is large, the output of the domain transform depends mostly on the original input signal x_i (the semantic feature), and when g_i is small it depends mostly on the previous result y_{i-1}, which realizes the filtering of semantic features in places far from edges;
suppose node y_i influences not only the next node y_{i+1} but also a subsequent layer; then in the backpropagation process of the convolutional network, y_i also receives a gradient value from the current layer; the gradient propagation formulas are:

∂L/∂x_i += (1 - w_i) ∂L/∂y_i    (1.5)
∂L/∂w_i += (y_{i-1} - x_i) ∂L/∂y_i    (1.6)
∂L/∂y_{i-1} += w_i ∂L/∂y_i    (1.7)

where ∂L/∂x_i and ∂L/∂w_i are initialized to 0, and ∂L/∂y_i is initialized to the value passed down by the subsequent layer; the weight w_i is shared across all filtering passes, horizontal and vertical, and across all iterations;
using these partial derivatives, the derivative with respect to the edge signal g_i can be obtained; substituting formula (1.4) into formula (1.2) gives:

w_i = exp(-(√2/σ_k)(1 + (σ_s/σ_r) g_i))    (1.8)

then, by the rules of partial differentiation, substituting formula (1.8) into (1.6) yields the derivative of the edge signal:

∂L/∂g_i = -(√2 σ_s)/(σ_k σ_r) w_i ∂L/∂w_i    (1.9)

at this point, the loss computed at the loss layer can be propagated to both the edge network and the semantic-feature extraction network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811649016.XA CN109829929A (en) | 2018-12-30 | 2018-12-30 | A kind of level Scene Semantics parted pattern based on depth edge detection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109829929A true CN109829929A (en) | 2019-05-31 |
Family
ID=66861471
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811649016.XA Pending CN109829929A (en) | 2018-12-30 | 2018-12-30 | A kind of level Scene Semantics parted pattern based on depth edge detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109829929A (en) |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108062756A (en) * | 2018-01-29 | 2018-05-22 | 重庆理工大学 | Image, semantic dividing method based on the full convolutional network of depth and condition random field |
Non-Patent Citations (1)
Title |
---|
GUO Zhihao: "Research on semantic scene segmentation algorithms for intelligent vehicles based on convolutional neural networks", China Master's Theses Full-text Database *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110502998A (en) * | 2019-07-23 | 2019-11-26 | 平安科技(深圳)有限公司 | Car damage identification method, device, equipment and storage medium |
CN110502998B (en) * | 2019-07-23 | 2023-01-31 | 平安科技(深圳)有限公司 | Vehicle damage assessment method, device, equipment and storage medium |
CN111666945A (en) * | 2020-05-11 | 2020-09-15 | 深圳力维智联技术有限公司 | Storefront violation identification method and device based on semantic segmentation and storage medium |
CN114882091A (en) * | 2022-04-29 | 2022-08-09 | 中国科学院上海微系统与信息技术研究所 | Depth estimation method combined with semantic edge |
CN114882091B (en) * | 2022-04-29 | 2024-02-13 | 中国科学院上海微系统与信息技术研究所 | Depth estimation method combining semantic edges |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||