CN110472493A - Scene Segmentation and system based on consistency feature - Google Patents
- Publication number
- CN110472493A CN110472493A CN201910604601.6A CN201910604601A CN110472493A CN 110472493 A CN110472493 A CN 110472493A CN 201910604601 A CN201910604601 A CN 201910604601A CN 110472493 A CN110472493 A CN 110472493A
- Authority
- CN
- China
- Prior art keywords
- consistency
- feature
- characteristic spectrum
- transformation
- scene
- Prior art date
- Legal status (assumption, not a legal conclusion): Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/35—Categorising the entire scene, e.g. birthday party or wedding scene
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The present invention proposes a scene segmentation method and system based on consensus features (Consensus Features). Features learned by a feature extractor undergo an instance consensus transform and a category consensus transform, and the transformed features are input to a scene segmentation sub-network to obtain the segmentation result for the original image. The invention proposes an instance consensus transform unit that learns instance-level consensus features. On the other hand, since a scene image contains multiple similar instances, the present invention uses a category consensus unit to learn category-level consensus features. Together, the two units greatly improve the performance of existing fully convolutional scene segmentation models.
Description
Technical field
The present invention relates to the fields of machine learning and computer vision, and in particular to a scene segmentation method and system based on Kronecker convolution and a tree-structured feature aggregation module.
Background technique
Semantic segmentation has become an important component of scene understanding and plays a crucial role in many application fields, such as autonomous driving, autonomous navigation and virtual reality. The goal of semantic segmentation is to classify every pixel in an image. Deep convolutional neural networks have made remarkable progress in the semantic segmentation field. Most current popular semantic segmentation work is based on image classification networks. However, using an image classification network as the feature extractor of a semantic segmentation model has limitations. Image classification networks tend to learn an image-level representation of the entire input sample. Previous work shows that such image-level representations are often dominated by the most discriminative regions of foreground or salient objects, such as the head of a horse or the face of a dog. Yet the goal of semantic segmentation is to classify every pixel in an image, so pixel-level representations are required. Using an image classification network for semantic segmentation therefore directly leads to two defects, as shown in Fig. 2(d): (1) the intra-class features of a dominant object are inconsistent across its spatial positions, causing inconsistent segmentation results within a single instance; (2) the features of less distinguishable regions (such as secondary objects or stuff regions) are easily confused, causing inconsistent segmentation results for similar instances. To solve these problems, we wish to learn pixel-level consensus features. Consensus features are inspired by related work on neighbourhood consensus, which is used in the object matching field to find directly reliable dense correspondences between a pair of images. In the present invention, our goal is to learn consensus features, meaning that all features within one instance, or within instances of the same class, are indistinguishable. As shown in Fig. 2(a): (1) the features of different regions within the same instance (such as B and C) should maintain instance-level consensus; (2) the features of different regions of different instances of the same category (for example A and C) should maintain category-level consensus. To learn consensus features, the present invention proposes two consensus transform units: the Instance Consensus Transform unit (ICT unit) and the Category Consensus Transform unit (CCT unit). The instance consensus unit is expected to learn instance-level consensus features. In particular, we introduce a lightweight Local Network (abbreviated LN) that uses surrounding contextual information to generate instance-level transform parameters for each pixel. We then use these instance-level transform parameters to aggregate the features within the same instance. On the other hand, since a scene image contains multiple similar instances, the present invention uses the category consensus unit to pursue category-level consensus features. Specifically, we introduce a lightweight Global Network (abbreviated GN) to generate the category-level consensus transform parameters. Unlike the LN, the GN needs to model the interactions between any position and all other positions. Both proposed units are learned in a data-driven manner and require no additional supervision during training. The two units are used to update the features of all positions: for each position, both units adaptively enhance information from relevant positions (foreground regions) and suppress irrelevant information (background regions). Consensus features are therefore indistinguishable within foreground regions and invariant to changes in the background. Compared with the feature maps learned by the baseline model in Fig. 2(c), the features learned by the proposed method are more cohesive at both the instance level and the category level, as shown in Fig. 2(e). Meanwhile, the inconsistent segmentation results in Fig. 2(d) are corrected by the proposed method, as shown in Fig. 2(f). Based on the proposed instance consensus transform unit and category consensus transform unit, we propose a semantic segmentation framework called the Consensus Feature Network (abbreviated CFNet) that learns pixel-level consensus features and obtains consistent segmentation results. Accuracy exceeding the current best methods on four famous semantic segmentation datasets, including Cityscapes and PASCAL-Context, demonstrates the effectiveness of the proposed method.
Summary of the invention
In view of the deficiencies of the prior art, the present invention proposes a scene segmentation method based on consensus features, characterized by comprising:
Step 1: use a residual network as the feature extractor to extract a local feature map from the original image, and apply the instance consensus transform to this local feature map to obtain instance-level consensus features;
Step 2: apply the category consensus transform to the instance-level consensus features to obtain category-level consensus features;
Step 3: with the category-level consensus features as input, output the scene segmentation result of the original image through the scene segmentation sub-network.
In the scene segmentation method based on consensus features, the instance consensus transform comprises:
For a local feature map X ∈ R^(C×H×W), where C is the number of channels of the feature map and H × W is the spatial size, first reduce the dimensionality of the local feature map with a 1 × 1 convolution, obtaining a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of the feature map; after obtaining feature map P, generate the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position, and the size of parameter θ varies with the local region size r;
reshape θ to obtain θ′ ∈ R^(N×r²), where N = H × W; perform an unfold operation on feature map P to extract the sliding-window features, reshaped into a feature map P′ ∈ R^(C1×N×r²); a new feature map Q = f(P′, θ′), where the function f(x, y) is the element-wise multiplication of tensors x and y summed over the last dimension;
reshape feature map Q to obtain the instance-level consensus features.
The scene segmentation method based on consensus features, characterized by comprising: constructing a category consensus transform unit, and using the category consensus transform unit to apply the category consensus transform to the instance-level consensus features:
the category consensus transform unit comprises two bidirectional long short-term memory networks and one convolutional layer, which instantiate the global network;
the first bidirectional long short-term memory network scans the feature map of the instance-level consensus features in a bottom-up and top-down manner, following the scanning rule of the bidirectional long short-term memory network and producing an output state at each time step t;
cascade the forward and backward hidden states to obtain a mixed feature map H1, and use the second bidirectional long short-term memory network to scan this feature map H1 in the horizontal direction; cascade the forward and backward states into a mixed feature map H2; this feature map H2 is input to the convolutional layer to obtain the category-level consensus transform parameters, which are normalized by an activation function to obtain the transform parameters φ;
a new feature map F can be generated as F = g(E, φ), where the function g(x, y) is the tensor multiplication between tensors x and y and E is the feature map; reshape the new feature map F to obtain the feature map of the category-level consensus features.
In the scene segmentation method based on consensus features, the activation function is the Softmax function.
In the scene segmentation method based on consensus features, the convolutional layer uses a 1 × 1 convolution kernel.
The invention also provides a scene segmentation system based on consensus features, characterized by comprising:
Module 1: use a residual network as the feature extractor to extract a local feature map from the original image, and apply the instance consensus transform to this local feature map to obtain instance-level consensus features;
Module 2: apply the category consensus transform to the instance-level consensus features to obtain category-level consensus features;
Module 3: with the category-level consensus features as input, output the scene segmentation result of the original image through the scene segmentation sub-network.
In the scene segmentation system based on consensus features, the instance consensus transform comprises:
For a local feature map X ∈ R^(C×H×W), where C is the number of channels of the feature map and H × W is the spatial size, first reduce the dimensionality of the local feature map with a 1 × 1 convolution, obtaining a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of the feature map; after obtaining feature map P, generate the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position, and the size of parameter θ varies with the local region size r;
reshape θ to obtain θ′ ∈ R^(N×r²), where N = H × W; perform an unfold operation on feature map P to extract the sliding-window features, reshaped into a feature map P′ ∈ R^(C1×N×r²); a new feature map Q = f(P′, θ′), where the function f(x, y) is the element-wise multiplication of tensors x and y summed over the last dimension;
reshape feature map Q to obtain the instance-level consensus features.
The scene segmentation system based on consensus features, characterized by comprising: constructing a category consensus transform unit, and using the category consensus transform unit to apply the category consensus transform to the instance-level consensus features:
the category consensus transform unit comprises two bidirectional long short-term memory networks and one convolutional layer, which instantiate the global network;
the first bidirectional long short-term memory network scans the feature map of the instance-level consensus features in a bottom-up and top-down manner, following the scanning rule of the bidirectional long short-term memory network and producing an output state at each time step t;
cascade the forward and backward hidden states to obtain a mixed feature map H1, and use the second bidirectional long short-term memory network to scan this feature map H1 in the horizontal direction; cascade the forward and backward states into a mixed feature map H2; this feature map H2 is input to the convolutional layer to obtain the category-level consensus transform parameters, which are normalized by an activation function to obtain the transform parameters φ;
a new feature map F can be generated as F = g(E, φ), where the function g(x, y) is the tensor multiplication between tensors x and y and E is the feature map; reshape the new feature map F to obtain the feature map of the category-level consensus features.
In the scene segmentation system based on consensus features, the activation function is the Softmax function.
In the scene segmentation system based on consensus features, the convolutional layer uses a 1 × 1 convolution kernel.
As can be seen from the above scheme, the present invention has the following advantages:
In the scene segmentation method and system based on consensus features (Consensus Features), the features learned by the feature extractor undergo the instance consensus transform and the category consensus transform, and the transformed features are input to the scene segmentation sub-network to obtain the scene segmentation result of the original image. The invention proposes an instance consensus transform unit that learns instance-level consensus features. On the other hand, since a scene image contains multiple similar instances, the present invention uses the category consensus unit to learn category-level consensus features. The two units greatly improve the performance of existing fully convolutional scene segmentation models.
Detailed description of the invention
Fig. 1 is a structural diagram of the present invention;
Fig. 2 is a schematic diagram of prior-art semantic segmentation results obtained with an image classification network.
Specific embodiment
To make the above features and effects of the present invention clearer and easier to understand, specific embodiments are described in detail below with reference to the accompanying drawings.
Step 1: on the deep features extracted by the basic feature extraction sub-network, use the instance consensus transform unit to learn instance-level consensus features. The instance consensus transform unit uses surrounding contextual information to generate transform parameters for each spatial position. As shown in Fig. 1(b), for a local feature map X ∈ R^(C×H×W), where C is the number of channels of the feature map and H × W is the spatial size, the instance consensus transform unit first applies a 1 × 1 convolution to reduce dimensionality and save computation, obtaining a feature map P ∈ R^(C1×H×W); here C1 is the number of channels of the feature map and is typically smaller than C. After obtaining feature map P, the instance consensus transform unit employs a lightweight local network to generate the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position; the size of parameter θ varies with the local region size r. We want the local network to generate the transform parameters using the surrounding contextual information of each point. We instantiate the local network with two convolutional layers whose filter sizes are r × r and 1 × 1, respectively. The first convolutional layer (r × r) captures the surrounding contextual information, and the feature map it produces serves as the input of the second convolutional layer (1 × 1), whose output is the instance consensus transform parameters. We then use a softmax activation function to obtain the normalized transform parameters θ, reshaped to obtain θ′ ∈ R^(N×r²), where N = H × W. Meanwhile, an unfold operation on feature map P extracts the sliding-window features, reshaped into a feature map P′ ∈ R^(C1×N×r²). We define a function f(x, y) as the element-wise multiplication of tensors x and y followed by summation over the last dimension. The new feature map Q can then be calculated as Q = f(P′, θ′).
Next, we reshape Q to obtain Q′ ∈ R^(C1×H×W). In fact, the feature vector Q′_ij at any position (i, j) is the weighted sum of the neighbours P_hw on feature map P with the corresponding instance-level consensus transform parameters θ_ij, where N(i, j) is an r × r square region centered at (i, j). Therefore the mapping at each position (i, j) can be formalized as:
Q′_ij = Σ_{(h,w)∈N(i,j)} θ_ij(h, w) · P_hw
where i ∈ [1, H] and j ∈ [1, W].
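The per-position mapping above can be illustrated with a small pure-Python sketch. This is a didactic reconstruction under the stated definitions, not the patented implementation: each output vector is the softmax-normalized, θ-weighted sum of the r × r neighbourhood around its position (with out-of-bounds neighbours treated as zero).

```python
import math

def softmax(vals):
    """Numerically stable softmax over a flat list."""
    m = max(vals)
    exps = [math.exp(v - m) for v in vals]
    s = sum(exps)
    return [e / s for e in exps]

def ict_transform(P, theta_logits, r):
    """Instance consensus transform, per the formalization above.
    P: C1 x H x W feature map (nested lists).
    theta_logits: H x W x (r*r) raw parameters, one r*r window per position.
    Returns Q': C1 x H x W, where Q'[:, i, j] is the softmax(theta)-weighted
    sum over the r x r neighbourhood N(i, j), zero-padded at the borders."""
    C1, H, W = len(P), len(P[0]), len(P[0][0])
    half = r // 2
    Q = [[[0.0] * W for _ in range(H)] for _ in range(C1)]
    for i in range(H):
        for j in range(W):
            theta = softmax(theta_logits[i][j])  # normalize per position
            offsets = [(dh, dw) for dh in range(-half, half + 1)
                                for dw in range(-half, half + 1)]
            for k, (dh, dw) in enumerate(offsets):
                h, w = i + dh, j + dw
                if 0 <= h < H and 0 <= w < W:
                    for c in range(C1):
                        Q[c][i][j] += theta[k] * P[c][h][w]
    return Q
```

With uniform parameters on a constant feature map, an interior output equals the input value, while border positions shrink because part of the window falls outside the map.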
The scene segmentation method further includes:
Step 2: apply the category-level consensus transform to the learned instance-level consensus features. Step 2 includes:
Using the category consensus transform unit to learn category-level consensus features. The structure of the category consensus transform unit is shown in Fig. 1(c). We use a global network to generate the category-level consensus transform parameters φ ∈ R^(N×N), where N = H × W. We want this global network to be able to "see" the entire input feature map. One very natural solution is to use a fully connected layer, a global convolution, or a stack of large-kernel convolutions. These approaches are not very efficient, because they introduce a large number of parameters and significant memory overhead.
The category consensus transform unit introduces a recurrent neural network to model region-level dependencies. We instantiate the global network (GN) with two bidirectional long short-term memory networks (BiLSTM) and one 1 × 1 convolutional layer. We first use the first bidirectional LSTM to scan the feature map in a bottom-up and top-down manner, as shown in Fig. 2. It takes the features of one row as the input at each time step and updates its hidden state. A classical long short-term memory unit includes an input gate i_t, a forget gate f_t, an output gate o_t, an output state h_t, and an internal memory cell state C_t. The scanning rule follows the classical LSTM update, and the computation of the bidirectional LSTM applies this update in both scanning directions.
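The scanning rule referenced above follows the classical LSTM formulation; assuming that standard form (the original equation images are not reproduced in this text), the gate and state updates can be written as:

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{C}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}
```

Here x_t is the row-level feature slice input at time step t and σ is the sigmoid function; the bidirectional network applies these updates in both scanning directions and concatenates the resulting hidden states.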
After bidirectional scanning, we cascade the forward and backward hidden states to obtain a mixed feature map H1. In a similar manner, we use the second bidirectional LSTM to scan feature map H1 in the horizontal direction; it takes a column-level slice of the feature map as the input at each time step and updates its hidden state. We then cascade the forward and backward states into a mixed feature map H2, which serves, for each spatial position, as a representation of its global interaction with the other positions. Each response on feature map H2 is the joint activation of that position with the whole image. This global interaction information is then input to the 1 × 1 convolutional layer, which outputs the category-level consensus transform parameters. Finally, a Softmax activation function is used to obtain the normalized transform parameters φ. We define a function g(x, y) as the tensor multiplication between tensors x and y. The new feature map F can be generated as follows:
F = g(E, φ)
Here E represents the feature map; specifically, as shown in Fig. 1, the instance-level consensus features are passed through the Res4 network, formed by multiple serially connected residual units, to obtain X, and X undergoes convolution, unfolding and shape adjustment to obtain E. Next, we reshape F to obtain the feature map F′ ∈ R^(C1×H×W). In fact, the feature vector F′_ij at any position (i, j) is computed as:
F′_ij = Σ_{h∈[1,H], w∈[1,W]} φ_ij(h, w) · E_hw
where i ∈ [1, H], j ∈ [1, W], φ_ij(h, w) = φ_ij(hw), and E_hw ∈ R^(C1) is the feature at position (h, w) on feature map E.
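The category-level mapping just formalized can be sketched in the same didactic pure-Python style as the instance transform above (again a reconstruction under the stated definitions, not the patented code): every output position attends to all N = H × W positions of E via softmax-normalized weights φ.

```python
import math

def softmax(vals):
    """Numerically stable softmax over a flat list."""
    m = max(vals)
    exps = [math.exp(v - m) for v in vals]
    s = sum(exps)
    return [e / s for e in exps]

def cct_transform(E, phi_logits):
    """Category consensus transform, per the formalization above.
    E: list of N feature vectors (N = H*W), each of length C1.
    phi_logits: N x N raw category-level transform parameters.
    Returns F': list of N vectors with
    F'[ij] = sum_hw softmax(phi_logits[ij])[hw] * E[hw],
    i.e. each position is a global weighted sum over all positions."""
    N = len(E)
    C1 = len(E[0])
    F = []
    for ij in range(N):
        weights = softmax(phi_logits[ij])  # normalize over all N positions
        out = [0.0] * C1
        for hw in range(N):
            for c in range(C1):
                out[c] += weights[hw] * E[hw][c]
        F.append(out)
    return F
```

With uniform parameters each output is the global mean feature; a strongly peaked row of φ makes the output copy the feature at the attended position, which is the "enhance relevant, suppress irrelevant" behaviour described earlier.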
Experiments:
Dataset introduction. The Cityscapes dataset contains street scenes from 50 different cities. The dataset is divided into three subsets: a training set of 2975 pictures, a validation set of 500 pictures, and a test set of 1525 pictures. The dataset provides high-quality pixel-level annotations for 19 classes. Performance is measured by the mean Intersection-over-Union (mIoU) over all classes. The PASCAL-Context dataset includes a training set of 4998 images and a validation set of 5105 images. This dataset provides detailed semantic annotations for entire scenes. The proposed model is evaluated on the 59 most common classes plus 1 background class.
Experimental validation of effectiveness:
Table 1 validates the proposed instance consensus transform unit (ICT) and category consensus transform unit (CCT) on the Cityscapes validation set.
As can be seen from Table 1, compared with the baseline model, using the instance consensus transform unit achieves 78.8 mean IoU, a 3.8% precision improvement over the baseline. Similarly, using the category consensus transform unit brings a 2.6% performance boost over the baseline. When the instance consensus transform unit and the category consensus transform unit are integrated, segmentation precision is further improved to 79.9%, a 5.0% performance boost over the baseline (79.9 vs. 74.9). These experiments prove that integrating the two units brings a large improvement in segmentation precision.
In Table 2, TFA_S is the configuration of TFA with smaller factors (r1, r2) = {(6, 3), (10, 7), (20, 15)}.
Comparison with other methods:
In this part, we report how our method compares with other state-of-the-art methods.
Experimental results on Cityscapes:
Table 2 compares with other state-of-the-art methods on the Cityscapes test set.
Experimental results on PASCAL-Context:
Table 3 compares with other methods on the PASCAL-Context dataset.
From Table 2 and Table 3 we can see that the system we designed achieves excellent performance on both authoritative semantic segmentation datasets, which further demonstrates the effectiveness of the present invention.
The following is a system embodiment corresponding to the above method embodiment, and this embodiment can be implemented in cooperation with the above embodiment. The relevant technical details mentioned in the above embodiment remain valid in this embodiment and, to reduce repetition, are not repeated here. Correspondingly, the relevant technical details mentioned in this embodiment are also applicable to the above embodiment.
The invention also provides a scene segmentation system based on consensus features, characterized by comprising:
Module 1: use a residual network as the feature extractor to extract a local feature map from the original image, and apply the instance consensus transform to this local feature map to obtain instance-level consensus features;
Module 2: apply the category consensus transform to the instance-level consensus features to obtain category-level consensus features;
Module 3: with the category-level consensus features as input, output the scene segmentation result of the original image through the scene segmentation sub-network.
In the scene segmentation system based on consensus features, the instance consensus transform comprises:
For a local feature map X ∈ R^(C×H×W), where C is the number of channels of the feature map and H × W is the spatial size, first reduce the dimensionality of the local feature map with a 1 × 1 convolution, obtaining a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of the feature map; after obtaining feature map P, generate the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position, and the size of parameter θ varies with the local region size r;
reshape θ to obtain θ′ ∈ R^(N×r²), where N = H × W; perform an unfold operation on feature map P to extract the sliding-window features, reshaped into a feature map P′ ∈ R^(C1×N×r²); a new feature map Q = f(P′, θ′), where the function f(x, y) is the element-wise multiplication of tensors x and y summed over the last dimension;
reshape feature map Q to obtain the instance-level consensus features.
The scene segmentation system based on consensus features, characterized by comprising: constructing a category consensus transform unit, and using the category consensus transform unit to apply the category consensus transform to the instance-level consensus features:
the category consensus transform unit comprises two bidirectional long short-term memory networks and one convolutional layer, which instantiate the global network;
the first bidirectional long short-term memory network scans the feature map of the instance-level consensus features in a bottom-up and top-down manner, following the scanning rule of the bidirectional long short-term memory network and producing an output state at each time step t;
cascade the forward and backward hidden states to obtain a mixed feature map H1, and use the second bidirectional long short-term memory network to scan this feature map H1 in the horizontal direction; cascade the forward and backward states into a mixed feature map H2; this feature map H2 is input to the convolutional layer to obtain the category-level consensus transform parameters, which are normalized by an activation function to obtain the transform parameters φ;
a new feature map F can be generated as F = g(E, φ), where the function g(x, y) is the tensor multiplication between tensors x and y and E is the feature map; reshape the new feature map F to obtain the feature map of the category-level consensus features.
Claims (10)
1. A scene segmentation method based on consensus features, characterized by comprising:
Step 1: use a residual network as the feature extractor to extract a local feature map from the original image, and apply the instance consensus transform to the local feature map to obtain instance-level consensus features;
Step 2: apply the category consensus transform to the instance-level consensus features to obtain category-level consensus features;
Step 3: with the category-level consensus features as input, output the scene segmentation result of the original image through the scene segmentation sub-network.
2. The scene segmentation method based on consensus features according to claim 1, wherein the instance consensus transform comprises:
for a local feature map X ∈ R^(C×H×W), where C is the number of channels of the feature map and H × W is the spatial size, first reducing the dimensionality of the local feature map with a 1 × 1 convolution, obtaining a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of the feature map; after obtaining feature map P, generating the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position, and the size of parameter θ varies with the local region size r;
reshaping θ to obtain θ′ ∈ R^(N×r²), where N = H × W; performing an unfold operation on feature map P to extract the sliding-window features, reshaped into a feature map P′ ∈ R^(C1×N×r²); obtaining a new feature map Q = f(P′, θ′), where the function f(x, y) is the element-wise multiplication of tensors x and y summed over the last dimension;
reshaping feature map Q to obtain the instance-level consensus features.
3. The scene segmentation method based on consistency features according to claim 2, comprising: constructing a category consistency transformation unit, and using the category consistency transformation unit to perform the category consistency transformation on the instance-level consistency features:
the category consistency transformation unit comprises two bidirectional long short-term memory networks and one convolutional layer, which instantiate a global network;
the first bidirectional long short-term memory network scans the feature map of the instance-level consistency features in the top-down and bottom-up directions; the bidirectional long short-term memory network is specifically: h_t = LSTM(x_t, h_{t−1}) as the scanning rule, with h_t the output state at time t;
concatenating the hidden states of the two scanning directions to obtain a mixed feature map H1, and scanning the feature map H1 in the horizontal direction with the second bidirectional long short-term memory network, thereby concatenating the forward and backward states to obtain a mixed feature map H2; the feature map H2 is input to the convolutional layer to obtain the category-level consistency transformation parameters, which are normalized by an activation function to obtain the transformation parameters φ;
a new feature map F can be generated as follows: F = g(φ, E), where the function g(x, y) denotes the tensor multiplication between the tensors x and y and E is the feature map; reshaping the new feature map F yields the feature map of the category-level consistency features.
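The scan-then-weight data flow of the category consistency transformation unit can be sketched as follows. This is an illustrative assumption-laden sketch: the two bidirectional LSTMs of the claim are replaced by a toy exponential-moving-average recurrence purely to show the scan directions and state concatenation; only the vertical/horizontal scans, the 1 × 1 convolution, the softmax normalization (claims 4 and 5), and g(φ, E) (interpreted here as element-wise weighting) follow the claim.

```python
import numpy as np

def softmax(z, axis=0):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def category_consistency(E, rng=np.random.default_rng(0)):
    """Data-flow sketch of the category consistency transformation.

    E : (C, H, W) instance-level consistency features.
    """
    C, H, W = E.shape
    a = 0.5  # coefficient of the stand-in recurrence (replaces the LSTM cell)
    # First scan: top-down and bottom-up hidden states along the height axis.
    down = np.zeros_like(E); up = np.zeros_like(E)
    for i in range(H):
        down[:, i] = a * E[:, i] + (1 - a) * (down[:, i - 1] if i else 0)
    for i in range(H - 1, -1, -1):
        up[:, i] = a * E[:, i] + (1 - a) * (up[:, i + 1] if i < H - 1 else 0)
    H1 = np.concatenate([down, up], axis=0)              # mixed map, (2C, H, W)
    # Second scan: forward and backward states along the width axis.
    fwd = np.zeros_like(H1); bwd = np.zeros_like(H1)
    for j in range(W):
        fwd[:, :, j] = a * H1[:, :, j] + (1 - a) * (fwd[:, :, j - 1] if j else 0)
    for j in range(W - 1, -1, -1):
        bwd[:, :, j] = a * H1[:, :, j] + (1 - a) * (bwd[:, :, j + 1] if j < W - 1 else 0)
    H2 = np.concatenate([fwd, bwd], axis=0)              # mixed map, (4C, H, W)
    # 1x1 convolution producing transformation parameters, softmax-normalized.
    Wc = rng.standard_normal((C, 4 * C)) * 0.01
    phi = softmax(np.einsum('dk,khw->dhw', Wc, H2), axis=0)  # (C, H, W)
    # g(phi, E): weight the feature map E by the transformation parameters.
    return phi * E
```

The two perpendicular scans give every position a state influenced by the whole image, which is how the unit propagates category-level context globally before the per-position weighting.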
4. The scene segmentation method based on consistency features according to claim 3, wherein the activation function is a Softmax function.
5. The scene segmentation method based on consistency features according to claim 3, wherein the convolutional layer uses a 1 × 1 convolution kernel.
6. A scene segmentation system based on consistency features, comprising:
a module 1, which uses a residual network as a feature extractor to extract a local feature map from an original image, and performs an instance consistency transformation on the local feature map to obtain instance-level consistency features;
a module 2, which performs a category consistency transformation on the instance-level consistency features to obtain category-level consistency features;
a module 3, which takes the category-level consistency features as input and outputs the scene segmentation result of the original image through a scene segmentation sub-network.
7. The scene segmentation system based on consistency features according to claim 6, wherein the instance consistency transformation comprises:
for a local feature map X ∈ R^(C×H×W), where C is the number of channels of the feature map and H × W is its spatial size, first reducing the dimensionality of the local feature map with a 1 × 1 convolution to obtain a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of the feature map P; after obtaining the feature map P, generating the parameters θ of the instance consistency transformation, where r denotes the size of the local region centered at the current spatial position, and the size of the parameter θ varies with the local region size r;
reshaping θ to obtain θ ∈ R^(N×r²), with N = H × W; applying an unfold operation to the feature map P to extract the features of the sliding local blocks, and reshaping the result to obtain a feature map P′ ∈ R^(N×r²×C1); computing a new feature map Q = f(θ, P′), where the function f(x, y) denotes the element-wise multiplication of the tensors x and y followed by summation along the last dimension;
reshaping the feature map Q to obtain the instance-level consistency features.
8. The scene segmentation system based on consistency features according to claim 7, comprising: constructing a category consistency transformation unit, and using the category consistency transformation unit to perform the category consistency transformation on the instance-level consistency features:
the category consistency transformation unit comprises two bidirectional long short-term memory networks and one convolutional layer, which instantiate a global network;
the first bidirectional long short-term memory network scans the feature map of the instance-level consistency features in the top-down and bottom-up directions; the bidirectional long short-term memory network is specifically: h_t = LSTM(x_t, h_{t−1}) as the scanning rule, with h_t the output state at time t;
concatenating the hidden states of the two scanning directions to obtain a mixed feature map H1, and scanning the feature map H1 in the horizontal direction with the second bidirectional long short-term memory network, thereby concatenating the forward and backward states to obtain a mixed feature map H2; the feature map H2 is input to the convolutional layer to obtain the category-level consistency transformation parameters, which are normalized by an activation function to obtain the transformation parameters φ;
a new feature map F can be generated as follows: F = g(φ, E), where the function g(x, y) denotes the tensor multiplication between the tensors x and y and E is the feature map; reshaping the new feature map F yields the feature map of the category-level consistency features.
9. The scene segmentation system based on consistency features according to claim 8, wherein the activation function is a Softmax function.
10. The scene segmentation system based on consistency features according to claim 8, wherein the convolutional layer uses a 1 × 1 convolution kernel.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910604601.6A CN110472493B (en) | 2019-07-05 | 2019-07-05 | Scene segmentation method and system based on consistency characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110472493A true CN110472493A (en) | 2019-11-19 |
CN110472493B CN110472493B (en) | 2022-01-21 |
Family
ID=68506772
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910604601.6A Active CN110472493B (en) | 2019-07-05 | 2019-07-05 | Scene segmentation method and system based on consistency characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110472493B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170046579A1 (en) * | 2015-08-12 | 2017-02-16 | Chiman KWAN | Method and system for ugv guidance and targeting |
CN107316313A (en) * | 2016-04-15 | 2017-11-03 | 株式会社理光 | Scene Segmentation and equipment |
CN108537157A (en) * | 2018-03-30 | 2018-09-14 | 特斯联(北京)科技有限公司 | A kind of video scene judgment method and device based on artificial intelligence classification realization |
Non-Patent Citations (2)
Title |
---|
THANH MINH NGUYEN ET AL.: "A Consensus Model for Motion Segmentation in Dynamic Scenes", IEEE Transactions on Circuits and Systems for Video Technology * |
WENG, JIAN: "Research and Algorithm Implementation of Omnidirectional Scene Segmentation Based on Fully Convolutional Neural Networks", China Master's Theses Full-text Database, Information Science and Technology * |
Also Published As
Publication number | Publication date |
---|---|
CN110472493B (en) | 2022-01-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Semantic-aware grad-gan for virtual-to-real urban scene adaption | |
CN110363201B (en) | Weak supervision semantic segmentation method and system based on collaborative learning | |
CN110111236B (en) | Multi-target sketch image generation method based on progressive confrontation generation network | |
CN107844795B (en) | Convolutional neural networks feature extracting method based on principal component analysis | |
CN106981080A (en) | Night unmanned vehicle scene depth method of estimation based on infrared image and radar data | |
CN110188635A (en) | A kind of plant pest recognition methods based on attention mechanism and multi-level convolution feature | |
CN107993238A (en) | A kind of head-and-shoulder area image partition method and device based on attention model | |
CN111738908A (en) | Scene conversion method and system for generating countermeasure network by combining instance segmentation and circulation | |
CN111667005B (en) | Human interactive system adopting RGBD visual sensing | |
CN107944459A (en) | A kind of RGB D object identification methods | |
CN110689000A (en) | Vehicle license plate identification method based on vehicle license plate sample in complex environment | |
CN112884758B (en) | Defect insulator sample generation method and system based on style migration method | |
CN113724354B (en) | Gray image coloring method based on reference picture color style | |
CN109657538A (en) | Scene Segmentation and system based on contextual information guidance | |
CN110659702A (en) | Calligraphy copybook evaluation system and method based on generative confrontation network model | |
CN113449878B (en) | Data distributed incremental learning method, system, equipment and storage medium | |
DE102019112595A1 (en) | GUIDED HALLUCATION FOR MISSING PICTURE CONTENT USING A NEURONAL NETWORK | |
CN113554653A (en) | Semantic segmentation method for long-tail distribution of point cloud data based on mutual information calibration | |
CN108810319A (en) | Image processing apparatus and image processing method | |
Wang et al. | A multi-scale attentive recurrent network for image dehazing | |
CN110472493A (en) | Scene Segmentation and system based on consistency feature | |
DE102018128592A1 (en) | Generating an image using a map representing different classes of pixels | |
CN112464924A (en) | Method and device for constructing training set | |
Sanders | Neural networks, AI, phone-based VR, machine learning, computer vision and the CUNAT automated translation app–not your father’s archaeological toolkit | |
Orhei | Urban landmark detection using computer vision |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||