CN107564013A - Scene segmentation correction method and system fusing local information - Google Patents

Publication number
CN107564013A
Authority
CN
China
Legal status
Granted
Application number
CN201710650541.2A
Other languages
Chinese (zh)
Other versions
CN107564013B (en)
Inventor
唐胜
张蕊
李锦涛
Current Assignee
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS

Abstract

The present invention relates to a scene segmentation correction method. A fully convolutional residual network serves as the front-end model. The confidence map of the front-end model and the original image are concatenated along the channel dimension and fed to a local boundary correction network, which outputs local aggregation coefficients for every position of the confidence map. The local aggregation coefficients are multiplied by the corresponding positions of the confidence map and aggregated to the center point, giving the local boundary correction result of the scene segmentation. The local boundary correction network is trained on a known scene segmentation dataset. The invention also proposes connecting a global residual correction network and the local boundary correction network in series to form a cascaded framework, which applies both global correction and local correction to the segmentation result of the front-end model and thereby yields a more accurate corrected scene segmentation.

Description

Scene segmentation correction method and system fusing local information
Technical field
The method belongs to the fields of machine learning and computer vision, and in particular addresses machine learning problems in scene segmentation for computer vision.
Background art
Currently popular scene segmentation techniques are mainly based on convolutional neural networks (CNNs). Most of these methods use the fully convolutional network (FCN) framework. Many methods build further on FCNs, for example by using dilated convolution, adding multiple deconvolution layers, or capturing features from the intermediate layers of the network. However, these methods improve segmentation accuracy mainly by improving the network structure.
Unlike the methods above, other methods aim to improve an existing segmentation result. The best known of these are the "fully connected conditional random field" method and the "multi-scale dilated convolution" method. The fully connected conditional random field method can effectively refine segmentation boundaries. It optimizes an energy function and thereby automatically corrects the score maps of the segmentation classes. However, it uses only low-level information to optimize the energy function. The multi-scale dilated convolution method uses dilated convolution operators to capture multi-scale image information step by step and so corrects the segmentation result. This method relies mainly on the global information of the image.
There are also many methods that perform scene segmentation by capturing neighborhood information and spatial relationships in the image. Some of them use multi-dimensional recurrent neural networks for this purpose. Exploiting the characteristics of images, these methods arrange the recurrent neural network in different topologies, including diagonal structures, eight-neighborhood structures, and graph structures. To shorten the sequences processed by the recurrent neural network and thus lower the computational complexity, most of these methods apply the recurrent neural network to low-resolution predictions, which loses much detail. In addition, methods based on graphical models are widely used to capture spatial information between image patches. Some scene segmentation methods model the graphical model as a special layer and insert it into the neural network for end-to-end optimization. These methods mainly capture the semantic features learned by the neural network, so the spatial information they obtain about image patches is concentrated at the semantic level.
Currently popular scene segmentation techniques are mainly based on fully convolutional networks and their variants. These methods follow the idea of transfer learning: a convolutional neural network pre-trained on a large-scale image classification dataset is converted into a fully convolutional structure and retrained on a scene segmentation dataset. Such methods have two main problems: (1) the segmentation results are often inconsistent and discontinuous; (2) the segmentation boundaries of objects are often inaccurate and incoherent.
Summary of the invention
To solve the problems above, the present invention proposes a local boundary correction network that fully exploits the local content information of the image. The local boundary correction network can be used on its own, or cascaded after the front-end network to form a joint framework, thereby improving the accuracy of the corrected result.
The local boundary correction network concatenates the confidence map of the front-end network and the original image along the channel dimension and takes the result as its input. Because the RGB values of the original image are normalized to the range [0, 1] during preprocessing, the values in the confidence map must be normalized to the same range, which is done according to formula (1).
In formula (1), the value being normalized is the confidence of the front-end model at position i for class k, and K is the total number of classes in the dataset. For every position, the local boundary correction network outputs a map of local aggregation coefficients of size m × m, and these coefficients are normalized according to formula (2).
The quantity entering formula (2) is the direct output of the local boundary correction network for neighbor p of position i; after the normalization of formula (2) it becomes the local aggregation coefficient for neighbor p of position i, and m × m is the size of the neighborhood. The aggregation coefficient vector of position i is first reshaped into an m × m square, then multiplied element by element with the corresponding positions of the front-end model's confidence map, and finally aggregated to the center point to give the corrected result, as expressed by formula (3).
In formula (3), each neighbor p of position i contributes its confidence vector from the front-end network, and the aggregation coefficient of that neighbor is multiplied into every component of this vector. Notably, because the local boundary correction network outputs its own aggregation coefficients for each position, the learned coefficients are position-adaptive and can better capture the local content information at different positions.
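The aggregation step described above can be illustrated with a minimal NumPy sketch. It assumes the coefficients are already normalized, stores them as an (H, W, m·m) array in row-major window order, and zero-pads the confidence map at the image border. The function name `aggregate_local`, the array layout, and the padding choice are assumptions of this sketch and are not specified in the original.

```python
import numpy as np

def aggregate_local(confidence, coeffs, m):
    """Apply position-adaptive m x m aggregation to a confidence map.

    confidence: (H, W, K) confidence map of the front-end model.
    coeffs:     (H, W, m*m) normalized aggregation coefficients, one
                flattened m x m window per position (assumed layout).
    Returns the corrected (H, W, K) map.
    """
    H, W, K = confidence.shape
    r = m // 2
    # Zero padding at the image border is an assumption of this sketch.
    padded = np.pad(confidence, ((r, r), (r, r), (0, 0)))
    out = np.zeros((H, W, K), dtype=float)
    for p in range(m * m):
        dy, dx = divmod(p, m)  # row-major offset inside the window
        neighbour = padded[dy:dy + H, dx:dx + W, :]
        # One scalar coefficient multiplies every class component.
        out += coeffs[:, :, p:p + 1] * neighbour
    return out

# Usage: all weight on the window centre leaves the map unchanged.
rng = np.random.default_rng(0)
conf = rng.random((4, 5, 3))
m = 3
centre_only = np.zeros((4, 5, m * m))
centre_only[:, :, (m * m) // 2] = 1.0
corrected = aggregate_local(conf, centre_only, m)
```

Because each position carries its own window of coefficients, the same routine covers both smoothing inside a region and sharpening across a boundary; only the coefficient values differ.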
To learn the unknown aggregation coefficients, the present invention proposes an implicit learning method: the aggregation coefficients are multiplied with the confidence map of the front-end network, the error between the corrected result and the ground truth is computed, and the local boundary correction network is trained end to end. This avoids explicit supervised learning of the aggregation coefficients, whose optimal values cannot be obtained directly. In the specific implementation, the present invention designs a small network of seven 3 × 3 convolutional layers, because the local boundary correction network only needs to capture local content information and does not need a large receptive field. The concrete structure of the local boundary correction network is shown in Table 1. No pooling or large stride is used, so the input and output of the network have the same resolution. As the number of layers increases, the receptive field gradually expands and more content information is captured, so the number of channels of the local boundary correction network increases in order to store more information.
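To see how the receptive field grows with depth in such a small network, it can be computed directly for a stack of identical convolutions. The sketch below assumes plain, non-dilated, stride-1 convolutions, which the original does not state explicitly; `receptive_field` is a hypothetical helper.

```python
def receptive_field(num_layers, kernel=3, stride=1):
    """Receptive field, in input pixels, of a stack of identical conv layers."""
    rf, jump = 1, 1
    for _ in range(num_layers):
        rf += (kernel - 1) * jump  # each layer widens the field by (kernel - 1) * jump
        jump *= stride             # spacing between adjacent outputs, in input pixels
    return rf

# The field grows by 2 pixels per 3 x 3 layer at stride 1.
for layers in range(1, 8):
    print(layers, receptive_field(layers))

print(receptive_field(7))  # 15
```

Under these assumptions, seven layers see a 15 × 15 window, which stays local relative to the image sizes used in scene segmentation.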
The present invention proposes an initialization method suited to the characteristics of the local boundary correction network. Because the network makes only small local corrections, its input and output are close to each other. Accordingly, the present invention initializes the network according to formula (4).
In formula (4), L is the number of layers of the local boundary correction network, k_l is the initial convolution kernel parameter of layer l, b_l is the initial bias of layer l, c is the channel index in that layer, a is any position of the convolution kernel in that layer, ε ~ N(0, σ²), and σ ≪ 1. In this initialization, the convolution kernels are set to small values and the biases of all layers except the last are set to 0. In the last layer, the bias at position (m × m + 1)/2, which corresponds to the center, is set to 1 and the biases at the other positions are set to 0. With this initialization, the confidence at the center position has the largest influence on the corrected result, and the correction process approximates an identity mapping. During optimization, the local boundary correction network starts from this identity mapping and gradually captures local content information to correct the segmentation result.
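The initialization described above can be sketched in NumPy as follows. The layer widths are hypothetical, since Table 1 is not reproduced in this text, and the function name, parameter layout (3, 3, in, out), and default σ are assumptions of this sketch; only the rule itself (small Gaussian kernels, zero biases, and a bias of 1 at the center position of the last layer) comes from the description.

```python
import numpy as np

def init_local_boundary_net(in_channels, widths, m, sigma=1e-3, seed=0):
    """Return [(kernel, bias), ...] for a stack of 3 x 3 conv layers.

    widths: output channels per layer (hypothetical values); the last
            entry must be m*m, one channel per neighbourhood position.
    """
    assert widths[-1] == m * m, "last layer emits one channel per neighbour"
    rng = np.random.default_rng(seed)
    params = []
    for out_channels in widths:
        # Kernels are drawn from N(0, sigma^2) with sigma << 1.
        kernel = rng.normal(0.0, sigma, size=(3, 3, in_channels, out_channels))
        bias = np.zeros(out_channels)
        params.append((kernel, bias))
        in_channels = out_channels
    # Position (m*m + 1) / 2 in 1-based indexing is the window centre.
    centre = (m * m + 1) // 2 - 1
    params[-1][1][centre] = 1.0
    return params

# Usage: K = 150 classes plus 3 RGB channels as input, m = 3.
params = init_local_boundary_net(in_channels=150 + 3,
                                 widths=[16, 16, 32, 32, 64, 64, 9], m=3)
print(params[-1][1])  # bias of the last layer: 1 at the centre, 0 elsewhere
```

With kernels this close to zero, the last layer's output at every position is approximately its bias, so almost all of the unnormalized weight sits on the center neighbor and the initial correction leaves the confidence map nearly unchanged.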
The present invention relates to a scene segmentation correction method, characterized by comprising:
A local boundary correction network, with a fully convolutional residual network as the front-end model: the confidence map of the front-end model and the original image are concatenated along the channel dimension, the network outputs local aggregation coefficients for all positions, and the local aggregation coefficients are multiplied by the corresponding positions of the confidence map and aggregated to the center point, giving the local boundary correction result of the scene segmentation; the local boundary correction network is trained on a known scene segmentation dataset.
Step 1: the local boundary correction network is designed as a network of seven 3 × 3 convolutional layers; as the number of layers increases, the receptive field expands and the number of channels increases in order to store more information.
Step 2: the local boundary correction network is initialized according to the initialization formula so that its input and output stay approximately equal: all biases are set to 0 except the bias at the position corresponding to the center in the last layer, which is set to 1. In the formula, L is the number of layers of the local boundary correction network, k_l is the initial convolution kernel parameter of layer l, b_l is the initial bias of layer l, c is the channel index in that layer, a is any position of the convolution kernel in that layer, ε ~ N(0, σ²), and σ ≪ 1.
Step 3: according to the normalization formula, the values in the confidence map are normalized to the same order of magnitude as the RGB values of the original image; the formula maps the confidence value of the front-end model at position i for class k to the normalized confidence value, and K is the total number of classes in the dataset.
Step 4: for every position, the local boundary correction network outputs a map of local aggregation coefficients of size m × m; the coefficients are normalized by the coefficient normalization formula for p ∈ {1, 2, ..., m × m}, which maps the direct output of the local boundary correction network for neighbor p of position i to the normalized local aggregation coefficient for neighbor p of position i, and m × m is the size of the neighborhood.
Step 5: the aggregation coefficient vector of position i is reshaped into a square, multiplied with the corresponding positions of the confidence map, and aggregated to the center point; the aggregation formula gives the corrected result from the confidence vectors of the neighbors p of position i in the front-end network.
Step 6: the training images of the scene segmentation dataset are passed through the local boundary correction network to obtain training local aggregation coefficients.
Step 7: the training local aggregation coefficients are multiplied with the confidence map of the training image to obtain the corrected result; the error between this result and the ground truth of the scene segmentation dataset is computed, and the local boundary correction network is trained end to end.
A global residual correction network and the local boundary correction network can be connected in series to form a cascaded framework that applies both global correction and local correction to the scene segmentation.
The invention further relates to a scene segmentation correction system, comprising:
A local boundary correction system, which concatenates the confidence map of the front-end model and the original image along the channel dimension as its input, outputs the local aggregation coefficients for all positions of the confidence map, normalizes the local aggregation coefficients and reshapes them into a square, multiplies them with the corresponding positions of the confidence map, and aggregates the products to the center point to obtain the correction result.
The scene segmentation correction system of the present invention comprises:
An initialization module, which initializes the local boundary correction network by setting its convolution kernels;
A normalization module, which normalizes the values in the confidence map to the same order of magnitude as the RGB values of the original image;
An input module, which concatenates the confidence map and the original image along the channel dimension as the input of the local boundary correction network;
An output module, which outputs the local aggregation coefficients for all positions of the confidence map, normalizes the local aggregation coefficients and reshapes them into a square, multiplies them with the corresponding positions of the confidence map, and aggregates the products to the center point to obtain the correction result;
A training module, which trains the local boundary correction system on a scene segmentation dataset.
A global residual correction system can also be connected in series with the local boundary correction system to form a cascaded framework that applies global correction and local correction to the scene segmentation, so as to obtain a more accurate scene segmentation result.
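The normalization and input modules listed above can be sketched as follows. The normalization formula itself is not reproduced in this text, so the per-position min-max scaling over the K classes used here is only an assumed stand-in that reaches the stated [0, 1] range; `build_input` is a hypothetical name.

```python
import numpy as np

def build_input(confidence, image):
    """Stack a normalized confidence map and an RGB image along channels.

    confidence: (H, W, K) raw confidence map of the front-end model.
    image:      (H, W, 3) RGB image already scaled to [0, 1].
    Returns an (H, W, K + 3) array.
    """
    # Assumed stand-in for the patent's normalization formula:
    # min-max scaling over the class axis at every position.
    lo = confidence.min(axis=-1, keepdims=True)
    hi = confidence.max(axis=-1, keepdims=True)
    normalized = (confidence - lo) / np.maximum(hi - lo, 1e-12)
    return np.concatenate([normalized, image], axis=-1)

# Usage with K = 150 classes on a small 8 x 8 crop.
rng = np.random.default_rng(0)
x = build_input(rng.normal(size=(8, 8, 150)), rng.random((8, 8, 3)))
print(x.shape)  # (8, 8, 153)
```

Bringing both inputs to the same range keeps either one from dominating the first convolutional layer purely because of its scale.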
Brief description of the drawings
Fig. 1: Structure of the local boundary correction network
Fig. 2: Structure of the cascaded framework
Fig. 3: Comparison of the results of the local boundary correction network on the ADE20K dataset
Fig. 4: Comparison of the results of the local boundary correction network on the Cityscapes dataset
Detailed description
To make the purpose, technical solution, and advantages of the present invention clearer, the local boundary correction network proposed by the present invention is described in further detail below with reference to the accompanying drawings. It should be understood that the specific implementations described here only explain the present invention and are not intended to limit it.
To verify the local boundary correction network proposed by the present invention, a currently popular fully convolutional residual network is used as the front-end model, and the local boundary correction network corrects the segmentation result of the front-end model.
The front-end model is a currently popular fully convolutional residual network. The network reuses the parameters of a residual network pre-trained on a large-scale image classification dataset to obtain a low-resolution segmentation result, and is followed by a deconvolution layer that upsamples the segmentation result to the size of the original image. The 7 × 7 global pooling layer of the original residual network is replaced by a 3 × 3 convolutional layer with dilation 3, which keeps the receptive field of the original model unchanged while retaining more detail and using as few parameters as possible. In addition, the original residual network contains five downsampling steps with stride 2, which reduce the resolution of the segmentation result before deconvolution to 1/32. To increase the resolution of the segmentation result, the front-end model uses the hole algorithm: the last two stride-2 downsampling steps of the residual network are removed, and the hole algorithm is applied to all layers in the fourth and fifth parts of the residual network. With this modification, the resolution of the segmentation result before the deconvolution layer rises to 1/8.
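The resolution figures and the kernel replacement in this paragraph follow from simple arithmetic, which the short sketch below reproduces. The helper names are hypothetical, and the 2048 × 1024 image size is the Cityscapes resolution quoted later in this document.

```python
from fractions import Fraction

def output_scale(num_stride2_steps):
    """Resolution of the feature map relative to the input image."""
    return Fraction(1, 2 ** num_stride2_steps)

def effective_kernel(kernel, dilation):
    """Spatial extent covered by a dilated convolution kernel."""
    return (kernel - 1) * dilation + 1

print(output_scale(5))         # original residual network: 1/32
print(output_scale(5 - 2))     # last two stride-2 steps removed: 1/8
print(effective_kernel(3, 3))  # a 3 x 3 kernel with dilation 3 spans 7 x 7

# Feature-map size for a 2048 x 1024 image at both scales.
for scale in (output_scale(5), output_scale(3)):
    print(int(2048 * scale), int(1024 * scale))
```

Removing two of the five stride-2 steps therefore gives a feature map four times larger along each side, which is what preserves boundary detail before upsampling.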
To make better use of the local boundary correction network, the present invention adopts a cascaded framework to correct the segmentation result of the front-end network. The framework has three parts: (1) a currently popular fully convolutional residual network serves as the front-end model; (2) a global residual correction network corrects the result using global content information; (3) a local boundary correction network applies local correction to the segmentation boundaries. Because the global residual correction network and the local boundary correction network are complementary and cooperative, the cascaded structure can greatly improve the segmentation accuracy after correction.
In the cascaded structure, the front-end model is a currently popular fully convolutional residual network. The network reuses the parameters of a residual network pre-trained on a large-scale image classification dataset to obtain a low-resolution segmentation result, and is followed by a deconvolution layer that upsamples the segmentation result to the size of the original image. The 7 × 7 global pooling layer of the original residual network is replaced by a 3 × 3 convolutional layer with dilation 3, which keeps the receptive field of the original model unchanged while retaining more detail and using as few parameters as possible. In addition, the original residual network contains five downsampling steps with stride 2, which reduce the resolution of the segmentation result before deconvolution to 1/32. To increase the resolution of the segmentation result, the front-end model uses the hole algorithm: the last two stride-2 downsampling steps of the residual network are removed, and the hole algorithm is applied to all layers in the fourth and fifth parts of the residual network. With this modification, the resolution of the segmentation result before the deconvolution layer rises to 1/8.
The present invention relates to a scene segmentation correction method, characterized by comprising:
The local boundary correction network mainly exploits local content information to correct the segmentation result locally and adaptively.
A local boundary correction network, with a fully convolutional residual network as the front-end model: the confidence map of the front-end model and the original image are concatenated along the channel dimension, the network outputs local aggregation coefficients for all positions, and the local aggregation coefficients are multiplied by the corresponding positions of the confidence map and aggregated to the center point, giving the local boundary correction result of the scene segmentation; the local boundary correction network is trained on a known scene segmentation dataset.
The local boundary correction network concatenates the confidence map of the front-end network and the original image along the channel dimension and takes the result as its input. Because the RGB values of the original image are normalized to the range [0, 1] during preprocessing, the values in the confidence map must be normalized to the same range, which is done according to formula (1).
For every position, the local boundary correction network outputs a map of local aggregation coefficients of size m × m, and these coefficients are normalized according to formula (2).
The quantity entering formula (2) is the direct output of the local boundary correction network for neighbor p of position i; after the normalization of formula (2) it becomes the local aggregation coefficient for neighbor p of position i, and m × m is the size of the neighborhood. The aggregation coefficient vector of position i is first reshaped into an m × m square, then multiplied element by element with the corresponding positions of the front-end model's confidence map, and finally aggregated to the center point to give the corrected result, as expressed by formula (3).
In formula (3), each neighbor p of position i contributes its confidence vector from the front-end network, and the aggregation coefficient of that neighbor is multiplied into every component of this vector. Notably, because the local boundary correction network outputs its own aggregation coefficients for each position, the learned coefficients are position-adaptive and can better capture the local content information at different positions.
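The formula for the aggregation step is not reproduced in this text. Read literally, the description above is consistent with a weighted sum over the neighborhood, which can be written as below. The symbols are notation assumed here for illustration and are not taken from the original: w_i^p stands for the normalized local aggregation coefficient of neighbor p of position i, x_i^p for the K-dimensional confidence vector of that neighbor in the front-end network, and x̃_i for the corrected confidence vector at position i.

```latex
\tilde{\mathbf{x}}_i \;=\; \sum_{p=1}^{m \times m} w_i^{p}\, \mathbf{x}_i^{p},
\qquad p \in \{1, 2, \dots, m \times m\}
```

Since w_i^p is a scalar and x_i^p a vector, the product scales every class component of the neighbor's confidence by the same coefficient, matching the statement that the coefficient is multiplied into all components.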
To learn the unknown aggregation coefficients, the present invention proposes an implicit learning method: the aggregation coefficients are multiplied with the confidence map of the front-end network, the error between the corrected result and the ground truth is computed, and the local boundary correction network is trained end to end. This avoids explicit supervised learning of the aggregation coefficients, whose optimal values cannot be obtained directly. In the specific implementation, the present invention designs a small network of seven 3 × 3 convolutional layers, because the local boundary correction network only needs to capture local content information and does not need a large receptive field. The concrete structure of the local boundary correction network is shown in Table 1. No pooling or large stride is used, so the input and output of the network have the same resolution. As the number of layers increases, the receptive field gradually expands and more content information is captured, so the number of channels of the local boundary correction network increases in order to store more information.
Table 1: Structure of the local boundary correction network
The present invention proposes an initialization method suited to the characteristics of the local boundary correction network. Because the network makes only small local corrections, its input and output are close to each other. Accordingly, the present invention initializes the network according to formula (4).
In formula (4), L is the number of layers of the local boundary correction network, k_l is the initial convolution kernel parameter of layer l, b_l is the initial bias of layer l, c is the channel index in that layer, a is any position of the convolution kernel in that layer, ε ~ N(0, σ²), and σ ≪ 1. In this initialization, the convolution kernels are set to small values and the biases of all layers except the last are set to 0. In the last layer, the bias at position (m × m + 1)/2, which corresponds to the center, is set to 1 and the biases at the other positions are set to 0. With this initialization, the confidence at the center position has the largest influence on the corrected result, and the correction process approximates an identity mapping. During optimization, the local boundary correction network starts from this identity mapping and gradually captures local content information to correct the segmentation result.
The present invention relates to a scene segmentation correction method, characterized by comprising:
The local boundary correction network is designed as a network of seven 3 × 3 convolutional layers; as the number of layers increases, the receptive field expands and the number of channels increases in order to store more information;
According to the normalization formula, the values in the confidence map are normalized to the same order of magnitude as the RGB values of the original image; the formula maps the confidence value of the front-end model at position i for class k to the normalized confidence value, and K is the total number of classes in the dataset.
For every position, the local boundary correction network outputs a map of local aggregation coefficients of size m × m; the coefficients are normalized by the coefficient normalization formula, which maps the direct output of the local boundary correction network for neighbor p of position i to the normalized local aggregation coefficient for neighbor p of position i, and m × m is the size of the neighborhood.
The aggregation coefficient vector of position i is reshaped into a square, multiplied with the corresponding positions of the confidence map, and aggregated to the center point; the aggregation formula gives the corrected result from the confidence vectors of the neighbors p of position i in the front-end network.
The local boundary correction network is initialized according to the initialization formula so that its input and output stay approximately equal: all biases are set to 0 except the bias at the position corresponding to the center in the last layer, which is set to 1. In the formula, L is the number of layers of the local boundary correction network, k_l is the initial convolution kernel parameter of layer l, b_l is the initial bias of layer l, c is the channel index in that layer, a is any position of the convolution kernel in that layer, ε ~ N(0, σ²), and σ ≪ 1.
The training images of the scene segmentation dataset are passed through the local boundary correction network to obtain training local aggregation coefficients; the training local aggregation coefficients are multiplied with the confidence map of the training image to obtain the corrected result, the error between this result and the ground truth of the scene segmentation dataset is computed, and the local boundary correction network is trained end to end.
A global residual correction network and the local boundary correction network are connected in series to form a cascaded framework that applies both global correction and local correction to the scene segmentation.
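The training step above compares the corrected result with the ground truth but does not name the error function. As an illustration only, the sketch below computes a per-pixel softmax cross-entropy, a common choice for segmentation that is assumed here rather than taken from the original; `cross_entropy` is a hypothetical helper, and back-propagating this error through the network would be left to an automatic differentiation framework.

```python
import numpy as np

def cross_entropy(corrected, labels):
    """Mean per-pixel softmax cross-entropy (assumed error function).

    corrected: (H, W, K) corrected scores.
    labels:    (H, W) integer ground-truth class ids in [0, K).
    """
    # Subtract the per-pixel maximum for numerical stability.
    z = corrected - corrected.max(axis=-1, keepdims=True)
    log_prob = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the true class at every pixel.
    picked = np.take_along_axis(log_prob, labels[..., None], axis=-1)
    return float(-picked.mean())

# Usage: uninformative scores over K = 4 classes give an error of log(4).
scores = np.zeros((2, 3, 4))
truth = np.zeros((2, 3), dtype=int)
print(cross_entropy(scores, truth))
```

Because the error is measured on the corrected result only, the aggregation coefficients receive their training signal indirectly, which is the implicit learning described earlier.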
The invention further relates to a scene segmentation correction system, characterized by comprising:
A local boundary correction system, which concatenates the confidence map of the front-end model and the original image along the channel dimension as its input, outputs the local aggregation coefficients for all positions of the confidence map, normalizes the local aggregation coefficients and reshapes them into a square, multiplies them with the corresponding positions of the confidence map, and aggregates the products to the center point to obtain the correction result.
The system comprises:
An initialization module, which initializes the local boundary correction network by setting its convolution kernels;
A normalization module, which normalizes the values in the confidence map to the same order of magnitude as the RGB values of the original image;
An input module, which concatenates the confidence map and the original image along the channel dimension as the input of the local boundary correction network;
An output module, which outputs the local aggregation coefficients for all positions of the confidence map, normalizes the local aggregation coefficients and reshapes them into a square, multiplies them with the corresponding positions of the confidence map, and aggregates the products to the center point to obtain the correction result;
A training module, which trains the local boundary correction system on a scene segmentation dataset.
The system can also connect a global residual correction system in series with the local boundary correction system to form a cascaded framework that applies global correction and local correction to the scene segmentation, so as to obtain a more accurate scene segmentation result.
The following experimental results show that the local boundary correction network proposed by the present invention can correct the segmentation result of the front-end model from a local perspective, yielding more accurate segmentation boundaries and higher segmentation accuracy.
To verify the effectiveness of the method of the present invention, we ran experiments on two currently popular datasets, ADE20K and Cityscapes.
ADE20K is a large scene segmentation dataset that was used in the 2016 ImageNet Large Scale Visual Recognition Challenge. The dataset contains 150 semantic classes, 20,210 training images, 2,000 validation images, and 3,351 test images. Every image in the dataset comes with fine pixel-level annotation. The dataset uses mean intersection over union (Mean IoU) as its performance metric. The Cityscapes dataset contains 5,000 images of urban street scenes with fine pixel-level annotation for 19 classes. Of these, 2,975 images are used for training, 500 for validation, and 1,525 for testing. The images in this dataset have a high resolution of 2048 × 1024. This dataset also uses Mean IoU as its performance metric.
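For readers unfamiliar with the metric, Mean IoU can be computed from a confusion matrix as in the sketch below. This is a generic illustration and not the official evaluation code of either dataset; in particular, skipping classes that appear in neither map and ignoring any unlabeled pixels are conventions assumed here, and `mean_iou` is a hypothetical helper.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection over union between two integer label maps."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # confusion[t, p] counts pixels of true class t predicted as class p.
    confusion = np.bincount(target * num_classes + pred,
                            minlength=num_classes ** 2)
    confusion = confusion.reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both maps (assumed convention)
    return float((intersection[present] / union[present]).mean())

# Usage on a tiny 2 x 2 example with two classes.
prediction = np.array([[0, 0], [1, 1]])
ground_truth = np.array([[0, 1], [1, 1]])
print(mean_iou(prediction, ground_truth, num_classes=2))
```

In this example class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12; averaging per class rather than per pixel keeps small classes from being swamped by large ones.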
(1) Effectiveness of the method of the invention on the ADE20K dataset
We first verified the effectiveness of the proposed local boundary correction network on the ADE20K dataset. We trained the model on the ADE20K training set and measured its performance on the ADE20K validation set; the results are shown in Table 2. With Mean IoU as the evaluation metric, the front-end model based on a 101-layer residual network reaches an accuracy of 38.45%. Adding the local boundary correction network yields a gain of 1.34%. We also tested two other currently popular scene segmentation correction methods: the fully connected conditional random field method and the multi-scale dilated convolution method. The fully connected conditional random field brings a gain of only 0.33%, and multi-scale dilated convolution brings a gain of 0.98%. Both gains are smaller than that of the proposed local boundary correction network. Finally, we also applied multi-scale fusion at test time, which further raises the accuracy to 41.27%. We compared the performance obtained by the invention with the current best methods; the results are shown in Table 3. With a single model based on a 101-layer residual network as the front-end model, the invention reaches an accuracy of 41.27%. Additionally using 152-layer and 200-layer residual networks as front-end models and fusing the multiple models raises the accuracy to 43.21%, significantly exceeding the best accuracy currently reported. Example results of the local boundary correction network are shown in Fig. 2. They show that the local boundary correction network refines jittery segmentation boundaries and makes them more accurate.
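The text does not say how multi-scale fusion is carried out at test time. One common reading, sketched below purely as an assumption, is to run the model on several rescaled copies of the image, resize each confidence map back to the original size and average them. The helper names, the scale set and the nearest-neighbour resizing are all illustrative choices, not details from the patent.

```python
import numpy as np

def resize_nearest(x, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) array."""
    h, w = x.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return x[rows][:, cols]

def multiscale_predict(model, image, scales=(0.5, 1.0, 1.5)):
    """Average the confidence maps predicted at several input scales (assumed scheme)."""
    h, w = image.shape[:2]
    fused = None
    for s in scales:
        sh, sw = max(1, round(h * s)), max(1, round(w * s))
        conf = model(resize_nearest(image, sh, sw))  # (sh, sw, C) confidence map
        conf = resize_nearest(conf, h, w)            # back to the original size
        fused = conf if fused is None else fused + conf
    return fused / len(scales)

# Hypothetical stand-in for a segmentation model: two "classes" from brightness.
toy_model = lambda img: np.stack([img.mean(-1), 1.0 - img.mean(-1)], axis=-1)
image = np.full((4, 6, 3), 0.3)
fused = multiscale_predict(toy_model, image)
```

For the constant toy image every scale predicts the same confidences, so the fused map equals the single-scale prediction; on real images the average smooths out scale-dependent errors.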
Table 2: Results of the method of the invention on the ADE20K validation set
Table 3: Comparison of the method of the invention with currently popular methods on the ADE20K validation set
(2) Effectiveness of the method of the invention on the Cityscapes dataset
Next, we verified the effectiveness of the proposed local boundary correction network on the Cityscapes dataset. We first tested on the Cityscapes validation set with Mean IoU as the evaluation metric; the results are shown in Table 4. The front-end model based on a 101-layer residual network achieves 72.93%. Adding the local boundary correction network improves segmentation accuracy by 1.43%. On this basis, multi-scale fusion at test time further raises performance to 75.89%. Of the two other currently popular scene segmentation correction methods, the fully connected conditional random field brings a gain of 0.54% and multi-scale dilated convolution brings a gain of 1.03%, both still lower than the gains of the correction networks proposed by the invention. We then ran the model on the test set and submitted the results to the evaluation website of the Cityscapes dataset for comparison with other currently popular scene segmentation methods; the results are shown in Table 5. On the test set, with a single model (based on a 101-layer residual network) as the front-end network, the result after local boundary correction reaches 74.88%, and multi-model fusion further raises the segmentation accuracy to 76.02%. Example results of the local boundary correction network are shown in Fig. 3. They show that the local boundary correction network makes jittery segmentation boundaries continuous and refines them, so that they become more accurate.
Table 4: Results of the method of the invention on the Cityscapes validation set
Table 5: Comparison of the method of the invention with currently popular methods on the Cityscapes test set.

Claims (10)

  1. A scene segmentation correction method, characterized by comprising:
    using a fully residual convolutional network as the front-end model, concatenating the confidence map of the front-end model and the original image along the channel dimension as the input of a local boundary correction network, so that the local boundary correction network outputs local aggregation coefficients for all positions of the confidence map; multiplying the local aggregation coefficients with the corresponding positions of the confidence map and aggregating the products to the center point to obtain the local boundary correction result of the scene segmentation; wherein the local boundary correction network is trained on a known scene segmentation dataset.
  2. The scene segmentation correction method according to claim 1, characterized in that:
    the local boundary correction network is a network comprising K l × l convolutional layers, and is initialized by setting its convolution kernels: in the K-th layer, the bias corresponding to the center position is set to 1, and the biases of all positions other than the center position are set to 0.
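A short sketch may clarify why this initialization is useful: if the last layer starts with a bias of 1 at the window center and 0 elsewhere, the coefficients initially select the center value, so the network begins as an identity mapping on the confidence map. The claim only specifies the biases; the zero initial weights, the 1 × 1 form of the last layer and the sizes used below are assumptions made for illustration.

```python
import numpy as np

l = 3                  # side length of the local window (assumed odd)
in_channels = 8        # channels entering the last layer (hypothetical)
center = (l * l) // 2  # index of the window center after flattening

# Last layer written as a 1x1 convolution with one output channel per window position.
weights = np.zeros((in_channels, l * l))  # assumption: weights start at zero
bias = np.zeros(l * l)
bias[center] = 1.0                        # center bias 1, all other biases 0

features = np.random.default_rng(0).normal(size=(4, 5, in_channels))
coeffs = features @ weights + bias        # (4, 5, l*l) coefficients per pixel

# At any pixel, the weighted sum over a flattened l x l confidence patch
# returns the value at the patch center, whatever the input features are.
patch = np.arange(l * l, dtype=float)
aggregated = float(coeffs[2, 3] @ patch)
```

Here `aggregated` equals `patch[center]`, so training starts from the unmodified front-end prediction and only has to learn deviations from it.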
  3. The scene segmentation correction method according to claim 1, characterized by comprising:
    Step 11, normalizing the values in the confidence map so that they reach the same order of magnitude as the RGB values of the original image;
    Step 12, concatenating the confidence map and the original image along the channel dimension as the input of the local boundary correction network;
    Step 13, outputting the local aggregation coefficients for all positions of the confidence map, and normalizing the local aggregation coefficients;
    Step 14, arranging the normalized local aggregation coefficients into a square, multiplying them with the corresponding positions of the confidence map, and aggregating the products to the center point to obtain the correction result.
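Steps 11 to 14 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `coeff_net` is a hypothetical stand-in for the local boundary correction network, min-max scaling to 0..255 is assumed for Step 11, a softmax is assumed for the normalization in Step 13, and edge padding is assumed at the image border.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_boundary_correction(conf, image, coeff_net, l=3):
    """conf: (H, W, C) confidence map; image: (H, W, 3) RGB values in [0, 255]."""
    H, W, C = conf.shape
    # Step 11: rescale confidences to the magnitude of the RGB values (assumed min-max).
    span = conf.max() - conf.min()
    conf_scaled = (conf - conf.min()) / (span if span > 0 else 1.0) * 255.0
    # Step 12: concatenate along the channel axis to form the network input.
    net_input = np.concatenate([conf_scaled, image], axis=-1)
    # Step 13: l*l coefficients per pixel, normalized to sum to 1 (assumed softmax).
    coeffs = softmax(coeff_net(net_input), axis=-1).reshape(H, W, l, l)
    # Step 14: weight each l x l neighborhood and sum it into its center pixel.
    r = l // 2
    padded = np.pad(conf, ((r, r), (r, r), (0, 0)), mode="edge")
    out = np.zeros_like(conf, dtype=float)
    for dy in range(l):
        for dx in range(l):
            out += coeffs[:, :, dy, dx, None] * padded[dy:dy + H, dx:dx + W]
    return out

rng = np.random.default_rng(0)
conf = softmax(rng.normal(size=(4, 5, 3)))
image = rng.integers(0, 256, size=(4, 5, 3)).astype(float)
# Hypothetical network that returns equal logits, i.e. uniform coefficients.
uniform_net = lambda x: np.zeros(x.shape[:2] + (9,))
refined = local_boundary_correction(conf, image, uniform_net)
```

Because the coefficients at each pixel sum to 1, the refined map is a convex combination of neighboring confidence vectors and still sums to 1 over the classes. With uniform coefficients the operation reduces to a box filter; a trained network would instead predict image-dependent coefficients that sharpen the boundaries.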
  4. The scene segmentation correction method according to claim 1, characterized in that training the local boundary correction network further comprises:
    Step 21, passing the training images of the known scene segmentation dataset through the local boundary correction network to obtain training local aggregation coefficients;
    Step 22, multiplying the training local aggregation coefficients with the confidence map of the training image to obtain a training correction result, computing the error between the training correction result and the ground truth of the scene segmentation dataset, and training the local boundary correction network end to end by stochastic gradient descent.
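The claim does not name the error function used in Step 22. As a hedged illustration, the sketch below assumes a pixel-wise cross-entropy between the training correction result and the ground-truth labels, together with a plain stochastic gradient descent update; the function names are illustrative, and in practice the gradients would come from an automatic differentiation framework rather than being supplied by hand.

```python
import numpy as np

def pixel_cross_entropy(refined, labels, eps=1e-12):
    """refined: (H, W, C) per-class probabilities; labels: (H, W) integer class ids."""
    H, W, _ = refined.shape
    # Probability assigned to the ground-truth class at every pixel.
    picked = refined[np.arange(H)[:, None], np.arange(W)[None, :], labels]
    return float(-np.log(picked + eps).mean())

def sgd_step(params, grads, lr=0.01):
    """One plain stochastic gradient descent update for a list of arrays."""
    return [p - lr * g for p, g in zip(params, grads)]

# Toy 1 x 2 image with two classes.
refined = np.array([[[0.9, 0.1], [0.2, 0.8]]])
labels = np.array([[0, 1]])
loss = pixel_cross_entropy(refined, labels)

# Toy update with made-up gradients, only to show the update rule.
updated = sgd_step([np.array([1.0])], [np.array([2.0])], lr=0.5)
```

Training end to end means the same loss gradient flows through the aggregation step back into every layer of the correction network.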
  5. The scene segmentation correction method according to claim 1, characterized by further comprising:
    implementing a global residual correction network in series with the local boundary correction network to form a cascaded framework that applies global correction and local correction to the scene segmentation.
  6. A scene segmentation correction system, characterized by comprising:
    a local boundary correction system, for concatenating the confidence map of the front-end model and the original image along the channel dimension as the input of the local boundary correction system, thereby outputting the local aggregation coefficients for all positions of the confidence map, normalizing the local aggregation coefficients and arranging them into a square, multiplying them with the corresponding positions of the confidence map, and aggregating the products to the center point to obtain the scene segmentation correction result.
  7. The scene segmentation correction system according to claim 6, characterized by comprising:
    an initialization module, for initializing the local boundary correction network by setting the convolution kernels of the local boundary correction network.
  8. The scene segmentation correction system according to claim 6, characterized by comprising:
    a normalization module, for normalizing the values in the confidence map so that they reach the same order of magnitude as the RGB values of the original image;
    an input module, for concatenating the confidence map and the original image along the channel dimension as the input of the local boundary correction network;
    an output module, for outputting the local aggregation coefficients for all positions of the confidence map, normalizing the local aggregation coefficients and arranging them into a square, multiplying them with the corresponding positions of the confidence map, and aggregating the products to the center point to obtain the correction result.
  9. The scene segmentation correction system according to claim 6, characterized by further comprising:
    a training module, for training the local boundary correction system on a scene segmentation dataset.
  10. The scene segmentation correction system according to claim 6, characterized in that a global residual correction system can be implemented in series with the local boundary correction system to form a cascaded framework for applying global correction and local correction to the scene segmentation, so as to obtain a more accurate scene segmentation correction result.
CN201710650541.2A 2017-08-02 2017-08-02 Scene segmentation correction method and system fusing local information Active CN107564013B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710650541.2A CN107564013B (en) 2017-08-02 2017-08-02 Scene segmentation correction method and system fusing local information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710650541.2A CN107564013B (en) 2017-08-02 2017-08-02 Scene segmentation correction method and system fusing local information

Publications (2)

Publication Number Publication Date
CN107564013A true CN107564013A (en) 2018-01-09
CN107564013B CN107564013B (en) 2020-06-26

Family

ID=60974936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710650541.2A Active CN107564013B (en) 2017-08-02 2017-08-02 Scene segmentation correction method and system fusing local information

Country Status (1)

Country Link
CN (1) CN107564013B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102169538A (en) * 2011-04-12 2011-08-31 广州市威宝网络科技有限公司 Background modeling method based on pixel confidence
US20150206319A1 (en) * 2014-01-17 2015-07-23 Microsoft Corporation Digital image edge detection
CN104915676A (en) * 2015-05-19 2015-09-16 西安电子科技大学 Deep-level feature learning and watershed-based synthetic aperture radar (SAR) image classification method
CN106803256A (en) * 2017-01-13 2017-06-06 深圳市唯特视科技有限公司 3D shape segmentation and semantic labeling method based on a projective convolutional network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
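李岳云 (Li Yueyun) et al.: "Saliency detection with deep convolutional neural networks", 《中国图象图形学报》 (Journal of Image and Graphics) *
田彦 (Tian Yan) et al.: "Object localization with a multi-task network fusing multi-layer information", 《计算机辅助设计与图形学学报》 (Journal of Computer-Aided Design & Computer Graphics) *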

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034198A (en) * 2018-06-25 2018-12-18 中国科学院计算技术研究所 Scene segmentation method and system based on feature map recovery
CN109034198B (en) * 2018-06-25 2020-12-11 中国科学院计算技术研究所 Scene segmentation method and system based on feature map recovery
CN109657538A (en) * 2018-11-05 2019-04-19 中国科学院计算技术研究所 Scene segmentation method and system based on context information guidance
CN109670506A (en) * 2018-11-05 2019-04-23 中国科学院计算技术研究所 Scene segmentation method and system based on Kronecker convolution
CN109657538B (en) * 2018-11-05 2021-04-27 中国科学院计算技术研究所 Scene segmentation method and system based on context information guidance

Also Published As

Publication number Publication date
CN107564013B (en) 2020-06-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant