CN109829929A - A hierarchical scene semantic segmentation model based on deep edge detection - Google Patents

A hierarchical scene semantic segmentation model based on deep edge detection

Info

Publication number
CN109829929A
Authority: CN (China)
Prior art keywords: network, edge, filtering, domain, semantic feature
Prior art date: 2018-12-30
Legal status: Pending
Application number: CN201811649016.XA
Other languages: Chinese (zh)
Inventors: 王祎男 (Wang Yinan), 王宇 (Wang Yu)
Current Assignee: FAW Group Corp
Original Assignee: FAW Group Corp
Application filed by FAW Group Corp
Filing date / Priority date: 2018-12-30
Publication date: 2019-05-31
Priority to CN201811649016.XA
Publication of CN109829929A


Abstract

The present invention relates to a hierarchical scene semantic segmentation model based on deep edge detection, characterized by comprising the following steps: (1) semantic features are extracted using PSPNet, and the parameters of the PSPNet network are set using the Adam algorithm; (2) the semantic features obtained by the PSPNet network are used as input to train the whole network on the basis of the domain transform model, continuously adjusting the network parameters by controlling the magnitude of the loss; (3) the parts close to edges are filtered using the Fully-CRF algorithm to obtain the final result. By applying dedicated filtering to edge regions, higher-precision detection of edges is achieved, further improving the accuracy of the scene semantic segmentation result.

Description

A hierarchical scene semantic segmentation model based on deep edge detection
Technical field
The present invention relates to a hierarchical scene semantic segmentation model based on deep edge detection, and belongs to the technical field of scene segmentation.
Background art
With the rapid development of computer vision, algorithms based on computer vision are constantly being innovated and improved, and scene segmentation algorithms are an indispensable, widely used part of them. Scene segmentation algorithms are often applied wherever the current environment needs to be simulated, for example in driverless vehicle technology, where scene segmentation acts as the eyes of the system: it is responsible for reproducing the environment around the vehicle as faithfully as possible, for use in the vehicle's subsequent decision making. As scene segmentation accuracy continues to improve, it will play an increasingly important role in ever more fields.
In the prior art, the classical approach to semantic segmentation is to take an image patch centered on each pixel and use the features of that patch as a sample to train a classifier. In the test phase, an image patch is likewise taken around each pixel of the test image and classified, and the classification result is used as the predicted value for that pixel; classification of every pixel is thus achieved, accomplishing scene segmentation. However, segmenting a scene in this way produces considerable noise, and the detection of edge regions is prone to frequent errors.
Summary of the invention
The purpose of the present invention is to provide a hierarchical scene semantic segmentation model based on deep edge detection which, by applying dedicated filtering to edge regions, achieves higher-precision detection of edges and further improves the accuracy of the scene semantic segmentation result.
To achieve the above object, the technical solution of the present invention is realized as follows: a hierarchical scene semantic segmentation model based on deep edge detection, characterized by comprising the following steps:
(1) Semantic features are extracted using PSPNet, and the parameters of the PSPNet network are set using the Adam algorithm.
(2) The semantic features obtained by the PSPNet network are used as input to train the whole network on the basis of the domain transform model. By controlling the magnitude of the loss, the network parameters are continuously adjusted so that the domain transform density in the domain transform model, i.e. the measure of edge strength in the edge network, is detected with a certain accuracy. The parameters of the whole network are adjusted using the SGD algorithm. Positions far from edges are thereby filtered.
To improve the detection of semantic edges, an additional convolutional layer is added to the edge network, and the number of output channels of this layer is set to 10. The size of all convolution kernels in the edge network is set to 1. A convolution kernel of size 1 applies a convolution to every individual pixel of the feature map and thus functions like a fully connected layer, so it can to some extent capture global information and perform an encoding role. Compared with a fully connected layer, however, this convolutional layer has far fewer parameters, which reduces the complexity of the network model and effectively suppresses overfitting. A convolutional layer with kernel size 1 is therefore better suited than a fully connected layer to performing the feature fusion operation.
(3) The parts close to edges are filtered using the Fully-CRF algorithm.
Fully-CRF combines the maximum entropy model and the hidden Markov model; it is an undirected graphical model that constructs its energy function with the image pixels as nodes. The connectivity of the Fully-CRF is global: its pairwise potential function describes the relationship between each pixel and all other pixels.
The positive effect of the present invention is that it achieves higher-precision detection of edge regions, further improving the accuracy of the scene semantic segmentation result.
Brief description of the drawings
Fig. 1(a) is an overall flow chart of the hierarchical scene semantic segmentation model based on deep edge detection according to the present invention.
Fig. 1(b) is an overall flow chart of the hierarchical scene semantic segmentation model based on deep edge detection according to the present invention.
Fig. 2 is the network flow chart of the PSPNet algorithm in the present invention.
Fig. 3 is a schematic diagram of the domain transform operation in the present invention.
Fig. 4(a) shows one propagation mode of the domain transform in the present invention.
Fig. 4(b) shows another propagation mode of the domain transform in the present invention.
Fig. 5 is a schematic diagram of the filtering effect of the Fully-CRF algorithm in the present invention.
Specific embodiments
Specific embodiments of the present invention are described below with reference to the accompanying drawings, so that those skilled in the art can better understand the present invention. It should be noted that, in the following description, detailed descriptions of well-known functions and designs are omitted where they might obscure the main content of the present invention.
Figs. 1(a) and 1(b) are overall flow charts of the hierarchical scene semantic segmentation model based on deep edge detection according to the present invention.
In this embodiment, as shown in Figs. 1(a) and 1(b), a hierarchical scene semantic segmentation model based on deep edge detection according to the present invention comprises the following steps:
S1: Semantic features are extracted using PSPNet, and the parameters of the PSPNet network are set using the Adam algorithm.
As shown in Fig. 2, in the final decoder PSPNet downsamples the encoder's semantic features to several different scales, then upsamples these semantic features back to the same size as the input image, and finally performs feature fusion. The greatest advantage of PSPNet is therefore that it extracts and fuses semantic features both locally and globally.
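For illustration, the pyramid pooling behavior described above can be sketched in PyTorch as follows. This is a minimal sketch rather than the patent's implementation; the module name and the pooling scales (1, 2, 3, 6) follow the published PSPNet design and are assumptions here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """Pool encoder features to several scales, then upsample and fuse."""
    def __init__(self, in_channels, scales=(1, 2, 3, 6)):
        super().__init__()
        branch_channels = in_channels // len(scales)
        self.stages = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(s),                     # downsample to s x s
                nn.Conv2d(in_channels, branch_channels, 1),  # reduce channels
                nn.ReLU(inplace=True),
            ) for s in scales
        ])

    def forward(self, x):
        h, w = x.shape[2:]
        feats = [x]
        for stage in self.stages:
            # upsample each pooled map back to the input feature size
            feats.append(F.interpolate(stage(x), size=(h, w),
                                       mode='bilinear', align_corners=False))
        return torch.cat(feats, dim=1)   # fuse local and global context
```

In the full PSPNet the fused features are then passed through a final convolution to produce per-pixel class scores.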
S2: The semantic features obtained by the PSPNet network are used as input to train the whole network on the basis of the domain transform model. By controlling the magnitude of the loss, the network parameters are continuously adjusted so that the domain transform density in the domain transform model, i.e. the measure of edge strength in the edge network, is detected with a certain accuracy. The parameters of the whole network are adjusted using the SGD algorithm. Positions far from edges are thereby filtered.
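The training loop implied by step S2 can be sketched as follows. This is a minimal sketch: the SGD choice comes from the text above, while the model, data loader, learning rate and momentum are placeholder assumptions.

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=50, lr=1e-3):
    # SGD adjusts the parameters of the whole network, per step S2
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:
            logits = model(images)     # PSPNet features -> domain transform -> scores
            loss = criterion(logits, labels)
            optimizer.zero_grad()
            loss.backward()            # the loss gradient reaches both the edge
            optimizer.step()           # network and the feature extraction network
```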
As shown in Fig. 4(a), the specific filtering is as follows. Assume the one-dimensional input signal $x$ has length $N$: $\{x_1, x_2, x_3, \ldots, x_N\}$. Let the output $y$ satisfy $y_1 = x_1$; then for subscripts $i = 2, \ldots, N$ the processing is:
$y_i = (1 - w_i)\,x_i + w_i\,y_{i-1}$  (1.1)
where the weight $w_i$ is obtained from the domain transform density $d_i$; in the standard domain-transform formulation,
$w_i = \exp\!\big(-\tfrac{\sqrt{2}}{\sigma_k}\, d_i\big)$  (1.2)
However, the filtering performed by formula (1.1) is asymmetric, because the output at the current position depends only on the output at the previous position. This makes the filtering result drift in one direction, and such asymmetric processing causes worse segmentation results to be passed downstream. To solve this problem, the domain transform applies filtering operations in four directions in turn: left to right, right to left, top to bottom, and bottom to top. As shown in Fig. 3, the domain transform processes a 2D signal in a separable manner, i.e. each spatial dimension is filtered as an independent one-dimensional signal: horizontal filtering is performed first (left to right and right to left), followed by vertical filtering (top to bottom and bottom to top). The domain transform reduces the standard deviation of the filtering kernel at each iteration, and requires the sum of the variances to equal the desired variance $\sigma_s^2$; in the standard domain-transform schedule,
$\sum_{k=1}^{K} \sigma_k^2 = \sigma_s^2, \qquad \sigma_k = \sigma_s \sqrt{3}\, \frac{2^{K-k}}{\sqrt{4^K - 1}}$  (1.3)
At the $k$-th iteration, $\sigma_k$ is used in place of $\sigma_s$ to compute the weight $w_i$. The domain transform density $d_i$ is defined (again in the standard formulation) as
$d_i = 1 + \frac{\sigma_s}{\sigma_r}\, g_i$  (1.4)
where the variable $g_i > 0$ is the output of the edge network, and $\sigma_r$ denotes the standard deviation of the filtering kernel over the edge-detection feature map. Note that the larger the value of $g_i$, the higher the probability that position $i$ belongs to an edge. Thus when $g_i$ is larger, the output of the domain transform depends more on the original input signal $x_i$ (the semantic feature); when $g_i$ is smaller, the output depends more on the previous result $y_{i-1}$, thereby filtering the semantic features in places far from edges.
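To make the recursion concrete, the following NumPy sketch implements a single left-to-right pass of this filter under the standard-formulation assumptions stated above (the forms of $w_i$, $d_i$ and the $\sigma_k$ schedule); it is an illustration, not the patent's reference code.

```python
import numpy as np

def domain_transform_lr(x, g, sigma_s=100.0, sigma_r=1.0, num_iter=3):
    """Left-to-right pass of the recursive edge-preserving filter.

    x : (N, C) array, one row of semantic features
    g : (N,) array, edge-network output (g > 0, large near edges)
    """
    d = 1.0 + (sigma_s / sigma_r) * g            # density (1.4): large at edges
    y = x.astype(np.float64).copy()
    K = num_iter
    for k in range(1, K + 1):
        # per-iteration kernel std dev; the sigma_k^2 values sum to sigma_s^2 (1.3)
        sigma_k = sigma_s * np.sqrt(3.0) * 2.0 ** (K - k) / np.sqrt(4.0 ** K - 1.0)
        w = np.exp(-np.sqrt(2.0) / sigma_k * d)  # weight (1.2): small at edges
        for i in range(1, y.shape[0]):
            # recursion (1.1): smooth where w_i is large (far from edges),
            # keep the input value where w_i is small (at an edge)
            y[i] = (1.0 - w[i]) * y[i] + w[i] * y[i - 1]
    return y
```

In the full model this pass is repeated right to left and in both vertical directions within each iteration, as described above.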
As shown in Fig. 4(b), assume that node $y_i$ influences not only the next node $y_{i+1}$ but also a subsequent layer; in the back-propagation process of the convolutional network it therefore also receives a gradient value directly from the current layer. Differentiating recursion (1.1), the gradient propagation formulas are:
$\frac{\partial L}{\partial x_i} \mathrel{+}= (1 - w_i)\,\frac{\partial L}{\partial y_i}, \qquad \frac{\partial L}{\partial w_i} \mathrel{+}= (y_{i-1} - x_i)\,\frac{\partial L}{\partial y_i}, \qquad \frac{\partial L}{\partial y_{i-1}} \mathrel{+}= w_i\,\frac{\partial L}{\partial y_i}$  (1.5)–(1.7)
where $\partial L/\partial x_i$ and $\partial L/\partial w_i$ are initialized to 0, and $\partial L/\partial y_i$ is initialized to the value passed down by the subsequent layer. The weight $w_i$ is shared across all filtering passes (horizontal filtering and vertical filtering) and across iterations.
Using these partial derivatives, the derivative with respect to the edge signal $g_i$ can be produced. Substituting formula (1.4) into formula (1.2) gives:
$w_i = \exp\!\Big(-\frac{\sqrt{2}}{\sigma_k}\big(1 + \frac{\sigma_s}{\sigma_r}\, g_i\big)\Big)$  (1.8)
Then, by the chain rule for partial derivatives, substituting formula (1.8) into (1.6) yields the derivative of the edge signal:
$\frac{\partial L}{\partial g_i} = -\frac{\sqrt{2}}{\sigma_k}\,\frac{\sigma_s}{\sigma_r}\, w_i\,\frac{\partial L}{\partial w_i}$  (1.9)
At this point, the loss value computed at the loss layer can be propagated back to the edge network and to the semantic feature extraction network, respectively.
To improve the detection of semantic edges, an additional convolutional layer is added to the edge network, and the number of output channels of this layer is set to 10. The size of all convolution kernels in the edge network is set to 1. A convolution kernel of size 1 applies a convolution to every individual pixel of the feature map and thus functions like a fully connected layer, so it can to some extent capture global information and perform an encoding role. Compared with a fully connected layer, however, this convolutional layer has far fewer parameters, which reduces the complexity of the network model and effectively suppresses overfitting. A convolutional layer with kernel size 1 is therefore better suited than a fully connected layer to performing the feature fusion operation.
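As an illustration of such a size-1 convolution, the following PyTorch sketch shows a fusion layer with 10 output channels as described above; the class name and the input channel count are assumptions rather than details from the patent.

```python
import torch.nn as nn

class EdgeFusionHead(nn.Module):
    """Size-1 convolution acting as a lightweight per-pixel fully connected layer."""
    def __init__(self, in_channels=256):
        super().__init__()
        # kernel size 1 and 10 output channels, as specified for the edge network
        self.fuse = nn.Conv2d(in_channels, 10, kernel_size=1)

    def forward(self, x):
        # each output pixel is a learned combination of that pixel's input channels
        return self.fuse(x)
```

Per pixel this behaves like a small fully connected layer over the channels, but its parameter count (in_channels × 10 weights plus 10 biases) is independent of the spatial size of the feature map, which is why it is far cheaper than a true fully connected layer and less prone to overfitting.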
S3: The parts close to edges are filtered using the Fully-CRF algorithm.
As shown in Fig. 5, the energy function constructed by the Fully-CRF is shown in formula (1.10); in the standard fully connected CRF formulation,
$E(x) = \sum_i \theta_i(x_i) + \sum_{i<j} \theta_{ij}(x_i, x_j)$  (1.10)
where $x$ is the class prediction for the pixels, $\theta_i(x_i) = -\log P(x_i)$, and $P(x_i)$ is the class probability computed by the convolutional network at position $i$. The pairwise potential function is:
$\theta_{ij}(x_i, x_j) = \mu(x_i, x_j) \sum_m w_m\, k_m(f_i, f_j)$  (1.11)
where $\mu(x_i, x_j) = 1$ when $x_i \neq x_j$, and $\mu(x_i, x_j) = 0$ otherwise. As shown in Fig. 5, no matter how far apart pixel $i$ and pixel $j$ are in the image, there is a connection between the two pixels, so the graphical model is fully connected. Each $k_m$ corresponds to a feature comparison between pixels $i$ and $j$ with weight $w_m$, and takes both the positions and the color information of the pixels into account; in the standard formulation it is expressed as:
$\sum_m w_m\, k_m(f_i, f_j) = w_1 \exp\!\Big(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\alpha^2} - \frac{\lVert I_i - I_j \rVert^2}{2\sigma_\beta^2}\Big) + w_2 \exp\!\Big(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\gamma^2}\Big)$  (1.12)
where the variables $p$ and $I$ denote the position and the RGB value of a pixel, respectively. The first Gaussian kernel depends on both the position and the color of the pixels, while the second Gaussian kernel depends only on the position. The parameters $\sigma_\alpha$, $\sigma_\beta$, $\sigma_\gamma$ are the parameters of the corresponding Gaussian kernels.
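For illustration, a fully connected CRF of this form can be applied as a post-processing filter using the open-source pydensecrf package, which implements this class of model with mean-field inference. This is a sketch under assumptions: the kernel parameters sxy, srgb and compat shown are common placeholder values, not values from the patent.

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image, softmax_probs, num_iters=5):
    """image: (H, W, 3) uint8 RGB; softmax_probs: (num_classes, H, W) floats."""
    num_classes, H, W = softmax_probs.shape
    d = dcrf.DenseCRF2D(W, H, num_classes)

    # unary term: theta_i(x_i) = -log P(x_i), from the network's softmax output
    d.setUnaryEnergy(unary_from_softmax(softmax_probs))

    # position-only Gaussian kernel (the sigma_gamma term in formula (1.12))
    d.addPairwiseGaussian(sxy=3, compat=3)

    # position-and-color bilateral kernel (the sigma_alpha / sigma_beta term)
    d.addPairwiseBilateral(sxy=80, srgb=13,
                           rgbim=np.ascontiguousarray(image), compat=10)

    Q = np.array(d.inference(num_iters))           # mean-field inference
    return np.argmax(Q, axis=0).reshape(H, W)      # refined label map
```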
Although illustrative specific embodiments of the present invention have been described above so that those skilled in the art can understand the present invention, it should be clear that the present invention is not limited to the scope of those specific embodiments. To those of ordinary skill in the art, provided the various changes remain within the spirit and scope of the present invention as defined and determined by the appended claims, these changes are obvious, and all innovations and creations making use of the inventive concept fall within its scope of protection.

Claims (2)

1. A hierarchical scene semantic segmentation model based on deep edge detection, characterized by comprising the following steps:
(1) semantic features are extracted using PSPNet, and the parameters of the PSPNet network are set using the Adam algorithm;
(2) the semantic features obtained by the PSPNet network are used as input to train the whole network on the basis of the domain transform model; by controlling the magnitude of the loss, the network parameters are continuously adjusted so that the domain transform density $d_i$ in the domain transform model, i.e. the measure of edge strength in the edge network, is detected with a certain accuracy; the parameters of the whole network are adjusted using the SGD algorithm, whereby positions far from edges are filtered;
wherein, to improve the detection of semantic edges, an additional convolutional layer is added to the edge network, and the number of output channels of this layer is set to 10; the size of all convolution kernels in the edge network is set to 1; a convolution kernel of size 1 applies a convolution to every individual pixel of the feature map and thus functions like a fully connected layer, so it can to some extent capture global information and perform an encoding role; compared with a fully connected layer, this convolutional layer has far fewer parameters, which reduces the complexity of the network model and effectively suppresses overfitting; a convolutional layer with kernel size 1 is therefore better suited than a fully connected layer to performing the feature fusion operation;
(3) the parts close to edges are filtered using the Fully-CRF algorithm to obtain the final result.
2. The hierarchical scene semantic segmentation model based on deep edge detection according to claim 1, characterized in that the domain transform filtering operation is as follows:
assume the one-dimensional input signal $x$ has length $N$: $\{x_1, x_2, x_3, \ldots, x_N\}$; let the output $y$ satisfy $y_1 = x_1$; then for subscripts $i = 2, \ldots, N$ the processing is:
$y_i = (1 - w_i)\,x_i + w_i\,y_{i-1}$  (1.1)
where the weight $w_i$ is obtained from the domain transform density $d_i$; in the standard domain-transform formulation, $w_i = \exp\!\big(-\tfrac{\sqrt{2}}{\sigma_k}\, d_i\big)$  (1.2);
however, the filtering performed by formula (1.1) is asymmetric, because the output at the current position depends only on the output at the previous position; this makes the filtering result drift in one direction, and such asymmetric processing causes worse segmentation results to be passed downstream; to solve this problem, the domain transform applies filtering operations in four directions in turn: left to right, right to left, top to bottom, and bottom to top; the domain transform processes a 2D signal in a separable manner, i.e. each spatial dimension is filtered as an independent one-dimensional signal: horizontal filtering is performed first, left to right and right to left, followed by vertical filtering, top to bottom and bottom to top; the domain transform reduces the standard deviation of the filtering kernel at each iteration, and requires the sum of the variances to equal the desired variance $\sigma_s^2$, i.e. $\sum_{k=1}^{K} \sigma_k^2 = \sigma_s^2$;
at the $k$-th iteration, $\sigma_k$ is used in place of $\sigma_s$ to compute the weight $w_i$; the domain transform density $d_i$ is defined as $d_i = 1 + \frac{\sigma_s}{\sigma_r}\, g_i$  (1.4);
where the variable $g_i > 0$ is the output of the edge network, and $\sigma_r$ denotes the standard deviation of the filtering kernel over the edge-detection feature map; note that the larger the value of $g_i$, the higher the probability that position $i$ belongs to an edge; thus when $g_i$ is larger, the output of the domain transform depends more on the original input signal $x_i$ (the semantic feature), and when $g_i$ is smaller, the output depends more on the previous result $y_{i-1}$, thereby filtering the semantic features in places far from edges;
assume that node $y_i$ influences not only the next node $y_{i+1}$ but also a subsequent layer; in the back-propagation process of the convolutional network it therefore also receives a gradient value directly from the current layer; differentiating recursion (1.1), the gradient propagation formulas are: $\frac{\partial L}{\partial x_i} \mathrel{+}= (1 - w_i)\frac{\partial L}{\partial y_i}$, $\frac{\partial L}{\partial w_i} \mathrel{+}= (y_{i-1} - x_i)\frac{\partial L}{\partial y_i}$, $\frac{\partial L}{\partial y_{i-1}} \mathrel{+}= w_i\frac{\partial L}{\partial y_i}$  (1.5)–(1.7);
where $\partial L/\partial x_i$ and $\partial L/\partial w_i$ are initialized to 0, and $\partial L/\partial y_i$ is initialized to the value passed down by the subsequent layer; the weight $w_i$ is shared across all filtering passes (horizontal filtering and vertical filtering) and across iterations;
using these partial derivatives, the derivative with respect to the edge signal $g_i$ can be produced; substituting formula (1.4) into formula (1.2) gives $w_i = \exp\!\big(-\frac{\sqrt{2}}{\sigma_k}\big(1 + \frac{\sigma_s}{\sigma_r}\, g_i\big)\big)$  (1.8);
then, by the chain rule for partial derivatives, substituting formula (1.8) into (1.6) yields the derivative of the edge signal: $\frac{\partial L}{\partial g_i} = -\frac{\sqrt{2}}{\sigma_k}\,\frac{\sigma_s}{\sigma_r}\, w_i\,\frac{\partial L}{\partial w_i}$  (1.9);
at this point, the loss value computed at the loss layer can be propagated back to the edge network and to the semantic feature extraction network, respectively.
CN201811649016.XA 2018-12-30 2018-12-30 A hierarchical scene semantic segmentation model based on deep edge detection Pending CN109829929A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811649016.XA CN109829929A (en) 2018-12-30 2018-12-30 A hierarchical scene semantic segmentation model based on deep edge detection


Publications (1)

Publication Number Publication Date
CN109829929A true CN109829929A (en) 2019-05-31

Family

ID=66861471

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811649016.XA Pending CN109829929A (en) 2018-12-30 2018-12-30 A kind of level Scene Semantics parted pattern based on depth edge detection

Country Status (1)

Country Link
CN (1) CN109829929A (en)



Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062756A (en) * 2018-01-29 2018-05-22 重庆理工大学 (Chongqing University of Technology) Image semantic segmentation method based on a deep fully convolutional network and conditional random field

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郭智豪 (Guo Zhihao): "Research on semantic scene segmentation algorithms for intelligent vehicles based on convolutional neural networks", China Master's Theses Full-text Database *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502998A (en) * 2019-07-23 2019-11-26 平安科技(深圳)有限公司 Car damage identification method, device, equipment and storage medium
CN110502998B (en) * 2019-07-23 2023-01-31 平安科技(深圳)有限公司 Vehicle damage assessment method, device, equipment and storage medium
CN111666945A (en) * 2020-05-11 2020-09-15 深圳力维智联技术有限公司 Storefront violation identification method and device based on semantic segmentation and storage medium
CN114882091A (en) * 2022-04-29 2022-08-09 中国科学院上海微系统与信息技术研究所 Depth estimation method combined with semantic edge
CN114882091B (en) * 2022-04-29 2024-02-13 中国科学院上海微系统与信息技术研究所 Depth estimation method combining semantic edges


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination