CN110472493A - Scene segmentation method and system based on consensus features - Google Patents

Scene segmentation method and system based on consensus features

Info

Publication number
CN110472493A
CN110472493A
Authority
CN
China
Prior art keywords
consistency
feature
characteristic spectrum
transformation
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910604601.6A
Other languages
Chinese (zh)
Other versions
CN110472493B (en)
Inventor
唐胜
伍天意
李锦涛
张勇东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN201910604601.6A priority Critical patent/CN110472493B/en
Publication of CN110472493A publication Critical patent/CN110472493A/en
Application granted granted Critical
Publication of CN110472493B publication Critical patent/CN110472493B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 - Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/35 - Categorising the entire scene, e.g. birthday party or wedding scene

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The present invention proposes a scene segmentation method and system based on consensus features (Consensus Features). The method applies an instance consensus transform and a category consensus transform to the features learned by a feature extractor, and inputs the transformed features into a scene segmentation sub-network to obtain the scene segmentation result of the original image. The invention proposes an instance consensus transform unit to learn instance-level consensus features. On the other hand, since a scene image contains multiple instances of the same class, the invention uses a category consensus transform unit to learn category-level consensus features. The two units greatly improve the performance of existing fully convolutional scene segmentation models.

Description

Scene segmentation method and system based on consensus features
Technical field
The present invention relates to the fields of machine learning and computer vision, and in particular to a scene segmentation method and system based on Kronecker convolution and a tree-structured feature aggregation module.
Background technique
Semantic segmentation has become an important component of scene understanding and plays a vital role in many application fields, such as automatic driving, autonomous navigation and virtual reality. The goal of semantic segmentation is to classify every pixel in an image. Deep convolutional neural networks have made remarkable progress in the field of semantic segmentation, and most current popular semantic segmentation works are based on image classification networks. However, using an image classification network as the feature extractor of a semantic segmentation model has certain limitations. Image classification networks tend to learn an image-level representation of the entire input sample. Previous work has shown that such image-level representations are often dominated by the most discriminative regions of foreground or salient objects, such as the head of a horse or the face of a dog. The goal of semantic segmentation, however, is to classify every pixel in an image, so pixel-level representations are required. Using an image classification network for semantic segmentation therefore directly leads to two defects, as shown in Fig. 2(d): (1) the intra-class features of all spatial positions of a dominant object are inconsistent, causing inconsistent segmentation results within the same instance; (2) the features of less distinguishable regions (such as secondary objects or stuff regions) are easily confused, causing inconsistent segmentation results among similar instances.
To solve the above problems, it is desirable to learn pixel-level consensus features. Consensus features are inspired by related work on neighbourhood consensus, which is used in the field of object matching to find reliable dense correspondences between a pair of images. In the present invention, the goal is to learn consensus features, meaning that all features within one instance, or within instances of the same class, are indistinguishable. As shown in Fig. 2(a): (1) the features of different regions within the same instance (such as B and C) should maintain instance-level consistency; (2) the features of different regions of different instances of the same category (for example, A and C) should maintain category-level consistency. To learn consensus features, the present invention proposes two consensus transform units: an Instance Consensus Transform unit (ICT unit) and a Category Consensus Transform unit (CCT unit). The instance consensus transform unit is expected to learn instance-level consensus features. In particular, a lightweight Local Network (LN) is introduced to use the surrounding contextual information to generate instance-level transform parameters for each pixel, and these parameters are then used to aggregate the features within the same instance. On the other hand, since a scene image contains multiple instances of the same class, the present invention uses a category consensus transform unit to pursue category-level consensus features. Specifically, a lightweight Global Network (GN) is introduced to generate the category-level consensus transform parameters. Unlike the LN, the GN needs to model the interactions between any position and all other positions.
The two proposed units are learned in a data-driven manner and require no additional supervision during training. The two units are used to update the features of all positions: for each position, both units adaptively enhance the information of relevant positions (foreground regions) and suppress irrelevant information (background regions). The consensus features are therefore indistinguishable within foreground regions and invariant to changes in the background. Compared with the feature map learned by the baseline model in Fig. 2(c), the features learned by the proposed method are more cohesive at the instance level and the category level, as shown in Fig. 2(e). Meanwhile, the inconsistent segmentation results in Fig. 2(d) are corrected by the proposed method, as shown in Fig. 2(f). Based on the proposed instance consensus transform unit and category consensus transform unit, a semantic segmentation framework called the Consensus Feature Network (CFNet) is proposed to learn pixel-level consensus features and obtain consistent segmentation results. Accuracies exceeding the current best methods on four well-known semantic segmentation datasets, including Cityscapes and PASCAL-Context, demonstrate the effectiveness of the proposed method.
Summary of the invention
In view of the deficiencies of the prior art, the present invention proposes a scene segmentation method based on consensus features, comprising:
Step 1: using a residual network as the feature extractor, extracting a local feature map from the original image, and applying an instance consensus transform to this local feature map to obtain instance-level consensus features;
Step 2: applying a category consensus transform to the instance-level consensus features to obtain category-level consensus features;
Step 3: taking the category-level consensus features as input and outputting the scene segmentation result of the original image through a scene segmentation sub-network.
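A minimal, illustrative sketch of this three-step pipeline is given below. The class name CFNetSketch, the choice of a ResNet-50 backbone and the 1 × 1 classifier head are our assumptions, not the patent's reference implementation; the ICT and CCT modules (sketched after the descriptions of the two transform units below) are passed in as sub-modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class CFNetSketch(nn.Module):
    def __init__(self, ict: nn.Module, cct: nn.Module, num_classes: int, feat_channels: int = 256):
        super().__init__()
        backbone = torchvision.models.resnet50()                     # residual feature extractor (step 1)
        self.stem = nn.Sequential(*list(backbone.children())[:-2])   # keep conv stages, drop avgpool/fc
        self.ict = ict                                               # instance consensus transform (step 1)
        self.cct = cct                                               # category consensus transform (step 2)
        self.head = nn.Conv2d(feat_channels, num_classes, 1)         # scene segmentation sub-network (step 3)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        x = self.stem(image)          # local feature map extracted from the original image
        x = self.ict(x)               # instance-level consensus features
        x = self.cct(x)               # category-level consensus features
        logits = self.head(x)         # per-pixel class scores
        # resize back to the input resolution to form the segmentation result
        return F.interpolate(logits, size=image.shape[-2:], mode="bilinear", align_corners=False)
```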
In the scene segmentation method based on consensus features, the instance consensus transform comprises:
For a local feature map X ∈ R^(C×H×W), where C is the number of channels and H × W is the spatial size, first reducing the dimensionality of the local feature map with a 1 × 1 convolution to obtain a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of this feature map; after obtaining the feature map P, generating the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position, and the size of the parameters θ changes with the local region size r;
Reshaping θ to obtain θ′ ∈ R^(N×r²), where N = H × W; applying an unfold (expansion) operation to the feature map P to extract the features of the r × r sliding-window block around each position and reshaping the result to obtain a feature map P′ ∈ R^(N×C1×r²); computing a new feature map Q = f(θ′, P′), where the function f(x, y) denotes the element-wise multiplication of tensors x and y followed by summation over the last dimension;
Reshaping the feature map Q to obtain the instance-level consensus features.
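As an illustration, a minimal PyTorch sketch of this instance consensus transform follows. The class name ICTUnit, the default window size r = 7 and the 'same' padding of the r × r convolution are our assumptions; only the operations named in the text (1 × 1 reduction, the two-layer local network, softmax normalization, the unfold operation and the windowed weighted sum) are taken from the description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICTUnit(nn.Module):
    """Instance Consensus Transform (ICT): a minimal sketch following the text.
    The r x r convolution of the local network is assumed to use 'same' padding
    so that theta keeps the spatial size H x W (r is assumed odd)."""

    def __init__(self, in_channels: int, mid_channels: int, r: int = 7):
        super().__init__()
        self.r = r
        self.reduce = nn.Conv2d(in_channels, mid_channels, 1)          # 1x1 dimensionality reduction -> P
        self.local_net = nn.Sequential(                                 # lightweight local network (LN)
            nn.Conv2d(mid_channels, mid_channels, r, padding=r // 2),   # r x r: captures surrounding context
            nn.Conv2d(mid_channels, r * r, 1),                          # 1x1: emits r^2 parameters per position
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        p = self.reduce(x)                                      # P in R^{B x C1 x H x W}
        theta = F.softmax(self.local_net(p), dim=1)             # normalized parameters, B x r^2 x H x W
        # unfold extracts the r x r sliding-window block around every position of P
        p_unf = F.unfold(p, kernel_size=self.r, padding=self.r // 2)    # B x (C1*r^2) x N, N = H*W
        p_unf = p_unf.view(b, -1, self.r * self.r, h * w)               # B x C1 x r^2 x N
        theta = theta.view(b, 1, self.r * self.r, h * w)                # B x 1  x r^2 x N
        q = (theta * p_unf).sum(dim=2)                          # element-wise product, summed over the window
        return q.view(b, -1, h, w)                              # instance-level consensus feature map
```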
The scene segmentation method based on consensus features further comprises: constructing a category consensus transform unit and using the category consensus transform unit to apply the category consensus transform to the instance-level consensus features:
The category consensus transform unit comprises two bidirectional long short-term memory (BiLSTM) networks and one convolutional layer, which instantiate a global network;
The first BiLSTM scans the feature map of the instance-level consensus features in a bottom-up and top-down manner; the BiLSTM is specifically:
the scanning rule is the standard LSTM recurrence, and H_t is the output state at time step t;
Concatenating the hidden states of the two scan directions to obtain a mixed feature map H1, and using the second BiLSTM to scan this feature map H1 in the horizontal direction; concatenating the forward and backward states to obtain a mixed feature map H2; inputting this feature map H2 to the convolutional layer to obtain the category-level consensus transform parameters; normalizing these parameters with an activation function to obtain the transform parameters φ;
A new feature map F can then be generated as F = g(φ, E),
where the function g(x, y) is the tensor multiplication between tensors x and y and E is the feature map; the new feature map F is reshaped to obtain the feature map of the category-level consensus features.
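A minimal sketch of this category consensus transform is given below, under two stated assumptions: the BiLSTMs keep one hidden state per pixel (each column and then each row is treated as a sequence, a ReNet-style reading of the bottom-up/top-down and horizontal scans), and the spatial size is fixed at construction time, because the 1 × 1 convolution has to output N = H × W transform parameters per position.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CCTUnit(nn.Module):
    """Category Consensus Transform (CCT): a minimal sketch under the stated assumptions."""

    def __init__(self, channels: int, spatial_size: tuple, hidden: int = 64):
        super().__init__()
        self.h, self.w = spatial_size
        n = self.h * self.w
        self.vertical = nn.LSTM(channels, hidden, bidirectional=True, batch_first=True)      # bottom-up / top-down scan
        self.horizontal = nn.LSTM(2 * hidden, hidden, bidirectional=True, batch_first=True)  # horizontal scan
        self.to_phi = nn.Conv2d(2 * hidden, n, 1)       # 1x1 conv -> category-level transform parameters

    def forward(self, e: torch.Tensor) -> torch.Tensor:  # e: B x C x H x W
        b, c, h, w = e.shape
        assert (h, w) == (self.h, self.w), "spatial size must match the one given at construction"
        # vertical scan: each column is a length-H sequence; concatenated states form H1
        seq = e.permute(0, 3, 2, 1).reshape(b * w, h, c)
        h1, _ = self.vertical(seq)
        h1 = h1.reshape(b, w, h, -1).permute(0, 2, 1, 3)          # B x H x W x 2*hidden
        # horizontal scan over H1: each row is a length-W sequence; concatenated states form H2
        seq = h1.reshape(b * h, w, -1)
        h2, _ = self.horizontal(seq)
        h2 = h2.reshape(b, h, w, -1).permute(0, 3, 1, 2)          # B x 2*hidden x H x W
        phi = F.softmax(self.to_phi(h2), dim=1)                    # B x N x H x W, normalized over positions
        # F_ij = sum over all positions (p, q) of phi_ij(p, q) * E(:, p, q)
        e_flat = e.reshape(b, c, h * w)                            # B x C x N
        phi_flat = phi.reshape(b, h * w, h * w)                    # B x N(source) x N(output)
        f = torch.bmm(e_flat, phi_flat)                            # B x C x N
        return f.reshape(b, c, h, w)                               # category-level consensus feature map
```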
In the scene segmentation method based on consensus features, the activation function is the Softmax function.
In the scene segmentation method based on consensus features, the convolutional layer uses a 1 × 1 convolution kernel.
The invention also provides a scene segmentation system based on consensus features, comprising:
Module 1: using a residual network as the feature extractor, extracting a local feature map from the original image, and applying an instance consensus transform to this local feature map to obtain instance-level consensus features;
Module 2: applying a category consensus transform to the instance-level consensus features to obtain category-level consensus features;
Module 3: taking the category-level consensus features as input and outputting the scene segmentation result of the original image through a scene segmentation sub-network.
In the scene segmentation system based on consensus features, the instance consensus transform comprises:
For a local feature map X ∈ R^(C×H×W), where C is the number of channels and H × W is the spatial size, first reducing the dimensionality of the local feature map with a 1 × 1 convolution to obtain a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of this feature map; after obtaining the feature map P, generating the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position, and the size of the parameters θ changes with the local region size r;
Reshaping θ to obtain θ′ ∈ R^(N×r²), where N = H × W; applying an unfold (expansion) operation to the feature map P to extract the features of the r × r sliding-window block around each position and reshaping the result to obtain a feature map P′ ∈ R^(N×C1×r²); computing a new feature map Q = f(θ′, P′), where the function f(x, y) denotes the element-wise multiplication of tensors x and y followed by summation over the last dimension;
Reshaping the feature map Q to obtain the instance-level consensus features.
The scene segmentation system based on consensus features further comprises: constructing a category consensus transform unit and using the category consensus transform unit to apply the category consensus transform to the instance-level consensus features:
The category consensus transform unit comprises two bidirectional long short-term memory (BiLSTM) networks and one convolutional layer, which instantiate a global network;
The first BiLSTM scans the feature map of the instance-level consensus features in a bottom-up and top-down manner; the BiLSTM is specifically:
the scanning rule is the standard LSTM recurrence, and H_t is the output state at time step t;
Concatenating the hidden states of the two scan directions to obtain a mixed feature map H1, and using the second BiLSTM to scan this feature map H1 in the horizontal direction; concatenating the forward and backward states to obtain a mixed feature map H2; inputting this feature map H2 to the convolutional layer to obtain the category-level consensus transform parameters; normalizing these parameters with an activation function to obtain the transform parameters φ;
A new feature map F can then be generated as F = g(φ, E),
where the function g(x, y) is the tensor multiplication between tensors x and y and E is the feature map; the new feature map F is reshaped to obtain the feature map of the category-level consensus features.
In the scene segmentation system based on consensus features, the activation function is the Softmax function.
In the scene segmentation system based on consensus features, the convolutional layer uses a 1 × 1 convolution kernel.
It can be seen from the above scheme that the present invention has the following advantages:
The scene segmentation method and system based on consensus features (Consensus Features) apply an instance consensus transform and a category consensus transform to the features learned by the feature extractor, and the transformed features are input to a scene segmentation sub-network to obtain the scene segmentation result of the original image. The invention proposes an instance consensus transform unit to learn instance-level consensus features. On the other hand, since a scene image contains multiple instances of the same category, the invention uses a category consensus transform unit to learn category-level consensus features. The two units greatly improve the performance of existing fully convolutional scene segmentation models.
Detailed description of the invention
Fig. 1 is a structural diagram of the present invention;
Fig. 2 is a schematic diagram of prior-art semantic segmentation results obtained with an image classification network.
Specific embodiment
To make the above features and effects of the present invention clearer and easier to understand, specific embodiments are described in detail below in conjunction with the accompanying drawings.
Step 1: an instance consensus transform unit is used to learn instance-level consensus features from the deep layers of the basic feature-extraction sub-network. The instance consensus transform unit uses the surrounding contextual information to generate transform parameters for each spatial position. As shown in Fig. 1(b), for a local feature map X ∈ R^(C×H×W), where C is the number of channels and H × W is the spatial size, the instance consensus transform unit first uses a 1 × 1 convolution to reduce the dimensionality and save computation, obtaining a feature map P ∈ R^(C1×H×W), where C1 is the number of channels, usually smaller than C. After obtaining the feature map P, the instance consensus transform unit employs a lightweight local network to generate the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position; the size of the parameters θ changes with the local region size r. The local network is expected to generate the transform parameters from the contextual information around each point. The local network is instantiated with two convolutional layers whose filter sizes are r × r and 1 × 1, respectively: the first (r × r) convolutional layer captures the surrounding contextual information, the resulting feature map is fed to the second (1 × 1) convolutional layer, and the output of the latter is the instance consensus transform parameters. A softmax activation function is then used to obtain the normalized transform parameters θ, which are reshaped to θ′ ∈ R^(N×r²), where N = H × W. Meanwhile, an unfold (expansion) operation is applied to the feature map P to extract the features of the sliding-window blocks, which are reshaped to a feature map P′ ∈ R^(N×C1×r²). Defining a function f(x, y) as the element-wise multiplication of tensors x and y followed by summation over the last dimension, the new feature map Q ∈ R^(N×C1) can be computed as:
Q = f(θ′, P′).
Next, Q is reshaped to obtain the instance-level consensus feature map Y ∈ R^(C1×H×W). In fact, the feature vector Y_ij at any position (i, j) in Y is the weighted sum of the neighbouring features P_hw on the feature map P with the corresponding instance-level consensus transform parameters θ_ij(h, w), where N(i, j) denotes the r × r rectangular region centered at (i, j). Therefore, the transform at each position (i, j) can be formalized as:
Y_ij = Σ_{(h,w) ∈ N(i,j)} θ_ij(h, w) · P_hw,
where i ∈ [1, H] and j ∈ [1, W].
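For concreteness, the following toy snippet (our own illustration, not part of the patent) shows what the unfold/expansion operation does on a small map: each of the N = H × W positions receives the flattened features of its r × r neighbourhood.

```python
import torch
import torch.nn.functional as F

p = torch.arange(16.0).view(1, 1, 4, 4)           # a 4 x 4 single-channel map
blocks = F.unfold(p, kernel_size=3, padding=1)    # every position -> its flattened 3 x 3 neighbourhood
print(blocks.shape)                               # torch.Size([1, 9, 16]); columns are the N = 16 positions
print(blocks[0, :, 5])                            # 3 x 3 neighbourhood around position (1, 1)
```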
The scene segmentation method further includes:
Step 2: the learned instance-level consensus features are further subjected to a category-level consensus transform. Step 2 includes:
A category consensus transform unit is used to learn category-level consensus features. The structure of the category consensus transform unit is shown in Fig. 1(c): a global network is used to generate the category-level consensus transform parameters φ ∈ R^(N×H×W), where N = H × W. Ideally, this global network should be able to "see" the entire input feature map. A natural solution would be to use a fully connected layer, a global convolution, or a stack of large-kernel convolutions, but these approaches are not very effective because they introduce a large number of parameters and a large memory overhead.
Instead, the category consensus transform unit introduces a recurrent neural network to model region-level dependencies: the global network (GN) is instantiated with two bidirectional long short-term memory networks (BiLSTM) and one 1 × 1 convolutional layer. The first BiLSTM scans the feature map in a bottom-up and top-down manner, as shown in Fig. 2, taking the features of one row as the input at each time step and updating its hidden state. A classic LSTM unit comprises an input gate i_t, a forget gate f_t, an output gate o_t, an output state H_t, and a memory cell state C_t. The scanning rule can be formalized as the standard LSTM recurrence:
i_t = σ(W_i x_t + U_i H_{t-1} + b_i),
f_t = σ(W_f x_t + U_f H_{t-1} + b_f),
o_t = σ(W_o x_t + U_o H_{t-1} + b_o),
C_t = f_t ⊙ C_{t-1} + i_t ⊙ tanh(W_c x_t + U_c H_{t-1} + b_c),
H_t = o_t ⊙ tanh(C_t),
where x_t is the input at time step t, σ is the sigmoid function and ⊙ denotes element-wise multiplication. Accordingly, the bidirectional LSTM applies this recurrence in both scan directions, producing a forward state H_t^fw and a backward state H_t^bw at each step.
After the bidirectional scan, the hidden states H^fw and H^bw of the two directions are concatenated to obtain a mixed feature map H1. In a similar way, the second BiLSTM scans the feature map H1 in the horizontal direction, taking a column-wise slice of the feature map as the input at each time step and updating its hidden state. The forward and backward states are then concatenated into a mixed feature map H2, which serves as a representation of the global interaction between each spatial position and all other positions: each response on H2 is the joint activation of that position and the whole image. This global interaction information is then used as the input of the 1 × 1 convolutional layer, which outputs the category-level consensus transform parameters. Finally, a Softmax activation function is used to obtain the normalized transform parameters φ. Defining the function g(x, y) as the tensor multiplication between tensors x and y, the new feature map F can be generated as follows:
F = g(φ, E),
where E represents a feature map; specifically, as shown in Fig. 1, X is obtained for the instance-level consensus features by the Res4 network formed by serially connecting multiple residual units, and E is obtained from X after convolution processing, expansion (unfold) processing and shape transformation. Next, F is reshaped to obtain the feature map of the category-level consensus features. In fact, the feature vector F_ij at any position (i, j) is calculated as follows:
F_ij = Σ_{h=1..H} Σ_{w=1..W} φ_ij(h, w) · E_hw,
where i ∈ [1, H], j ∈ [1, W], φ_ij(h, w) denotes the entry of the flattened parameter vector φ_ij corresponding to position (h, w), and E_hw is the feature at position (h, w) on the feature map E.
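The following illustrative usage ties the ICTUnit and CCTUnit sketches above together; the tensor sizes are assumptions chosen only to make the example run, not the patent's training configuration.

```python
import torch

x = torch.randn(1, 2048, 32, 32)                         # stand-in for the residual backbone's output
ict = ICTUnit(in_channels=2048, mid_channels=256, r=7)   # instance consensus transform
cct = CCTUnit(channels=256, spatial_size=(32, 32))       # category consensus transform
feat = cct(ict(x))                                       # category-level consensus features
print(feat.shape)                                        # torch.Size([1, 256, 32, 32])
```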
Experiments:
Dataset introduction. The Cityscapes dataset contains street scenes from 50 different cities. It is divided into three subsets: a training set of 2,975 images, a validation set of 500 images, and a test set of 1,525 images. The dataset provides high-quality pixel-level annotations for 19 classes. Performance is measured as the mean intersection-over-union (mIoU) over all classes. The PASCAL-Context dataset contains 4,998 training images and 5,105 validation images, and provides detailed semantic annotations for entire scenes. The proposed model is evaluated on the 59 most common classes plus 1 background class.
Effectiveness verification:
Table 1: validation of the proposed instance consensus transform unit (ICT) and category consensus transform unit (CCT) on the Cityscapes validation set.
As can be seen from Table 1, compared with the baseline model, using the instance consensus transform unit achieves 78.8 mean IoU, a 3.8% accuracy improvement over the baseline. Similarly, using the category consensus transform unit brings a 2.6% performance gain over the baseline. When the instance consensus transform unit and the category consensus transform unit are integrated, the segmentation accuracy is further improved to 79.9%, a 5.0% gain over the baseline (79.9 vs. 74.9). These experiments show that integrating the two units brings a large improvement in segmentation accuracy.
In Table 2, TFA_S denotes the TFA configuration with the smaller factors (r1, r2) = {(6, 3), (10, 7), (20, 15)}.
Comparison with other methods:
In this part, the proposed method is compared with other state-of-the-art methods.
Experimental results on Cityscapes:
Table 2: comparison with other state-of-the-art methods on the Cityscapes test set.
Experimental results on PASCAL-Context:
Table 3: comparison with other methods on the PASCAL-Context dataset.
From Tables 2 and 3 it can be seen that the designed system achieves excellent performance on both authoritative semantic segmentation datasets, which further demonstrates the effectiveness of the invention.
The following is a system embodiment corresponding to the above method embodiment; this embodiment can be implemented in cooperation with the above embodiment. The relevant technical details mentioned in the above embodiment remain valid in this embodiment and, to reduce repetition, are not repeated here. Correspondingly, the relevant technical details mentioned in this embodiment also apply to the above embodiment.
The invention also provides a scene segmentation system based on consensus features, comprising:
Module 1: using a residual network as the feature extractor, extracting a local feature map from the original image, and applying an instance consensus transform to this local feature map to obtain instance-level consensus features;
Module 2: applying a category consensus transform to the instance-level consensus features to obtain category-level consensus features;
Module 3: taking the category-level consensus features as input and outputting the scene segmentation result of the original image through a scene segmentation sub-network.
In the scene segmentation system based on consensus features, the instance consensus transform comprises:
For a local feature map X ∈ R^(C×H×W), where C is the number of channels and H × W is the spatial size, first reducing the dimensionality of the local feature map with a 1 × 1 convolution to obtain a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of this feature map; after obtaining the feature map P, generating the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position, and the size of the parameters θ changes with the local region size r;
Reshaping θ to obtain θ′ ∈ R^(N×r²), where N = H × W; applying an unfold (expansion) operation to the feature map P to extract the features of the r × r sliding-window block around each position and reshaping the result to obtain a feature map P′ ∈ R^(N×C1×r²); computing a new feature map Q = f(θ′, P′), where the function f(x, y) denotes the element-wise multiplication of tensors x and y followed by summation over the last dimension;
Reshaping the feature map Q to obtain the instance-level consensus features.
The scene segmentation system based on consensus features further comprises: constructing a category consensus transform unit and using the category consensus transform unit to apply the category consensus transform to the instance-level consensus features:
The category consensus transform unit comprises two bidirectional long short-term memory (BiLSTM) networks and one convolutional layer, which instantiate a global network;
The first BiLSTM scans the feature map of the instance-level consensus features in a bottom-up and top-down manner; the BiLSTM is specifically:
the scanning rule is the standard LSTM recurrence, and H_t is the output state at time step t;
Concatenating the hidden states of the two scan directions to obtain a mixed feature map H1, and using the second BiLSTM to scan this feature map H1 in the horizontal direction; concatenating the forward and backward states to obtain a mixed feature map H2; inputting this feature map H2 to the convolutional layer to obtain the category-level consensus transform parameters; normalizing these parameters with an activation function to obtain the transform parameters φ;
A new feature map F can then be generated as F = g(φ, E),
where the function g(x, y) is the tensor multiplication between tensors x and y and E is the feature map; the new feature map F is reshaped to obtain the feature map of the category-level consensus features.

Claims (10)

1. A scene segmentation method based on consensus features, characterized by comprising:
Step 1: using a residual network as the feature extractor, extracting a local feature map from the original image, and applying an instance consensus transform to the local feature map to obtain instance-level consensus features;
Step 2: applying a category consensus transform to the instance-level consensus features to obtain category-level consensus features;
Step 3: taking the category-level consensus features as input and outputting the scene segmentation result of the original image through a scene segmentation sub-network.
2. The scene segmentation method based on consensus features according to claim 1, wherein the instance consensus transform comprises:
For a local feature map X ∈ R^(C×H×W), where C is the number of channels and H × W is the spatial size, first reducing the dimensionality of the local feature map with a 1 × 1 convolution to obtain a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of this feature map; after obtaining the feature map P, generating the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position, and the size of the parameters θ changes with the local region size r;
Reshaping θ to obtain θ′ ∈ R^(N×r²), where N = H × W; applying an unfold (expansion) operation to the feature map P to extract the features of the r × r sliding-window block around each position and reshaping the result to obtain a feature map P′ ∈ R^(N×C1×r²); computing a new feature map Q = f(θ′, P′), where the function f(x, y) denotes the element-wise multiplication of tensors x and y followed by summation over the last dimension;
Reshaping the feature map Q to obtain the instance-level consensus features.
3. The scene segmentation method based on consensus features according to claim 2, further comprising: constructing a category consensus transform unit and using the category consensus transform unit to apply the category consensus transform to the instance-level consensus features:
The category consensus transform unit comprises two bidirectional long short-term memory (BiLSTM) networks and one convolutional layer, which instantiate a global network;
The first BiLSTM scans the feature map of the instance-level consensus features in a bottom-up and top-down manner; the BiLSTM is specifically:
the scanning rule is the standard LSTM recurrence, and H_t is the output state at time step t;
Concatenating the hidden states of the two scan directions to obtain a mixed feature map H1, and using the second BiLSTM to scan this feature map H1 in the horizontal direction; concatenating the forward and backward states to obtain a mixed feature map H2; inputting this feature map H2 to the convolutional layer to obtain the category-level consensus transform parameters; normalizing these parameters with an activation function to obtain the transform parameters φ;
A new feature map F can then be generated as F = g(φ, E),
where the function g(x, y) is the tensor multiplication between tensors x and y and E is the feature map; the new feature map F is reshaped to obtain the feature map of the category-level consensus features.
4. The scene segmentation method based on consensus features according to claim 3, wherein the activation function is the Softmax function.
5. The scene segmentation method based on consensus features according to claim 3, wherein the convolutional layer uses a 1 × 1 convolution kernel.
6. A scene segmentation system based on consensus features, characterized by comprising:
Module 1: using a residual network as the feature extractor, extracting a local feature map from the original image, and applying an instance consensus transform to the local feature map to obtain instance-level consensus features;
Module 2: applying a category consensus transform to the instance-level consensus features to obtain category-level consensus features;
Module 3: taking the category-level consensus features as input and outputting the scene segmentation result of the original image through a scene segmentation sub-network.
7. The scene segmentation system based on consensus features according to claim 6, wherein the instance consensus transform comprises:
For a local feature map X ∈ R^(C×H×W), where C is the number of channels and H × W is the spatial size, first reducing the dimensionality of the local feature map with a 1 × 1 convolution to obtain a feature map P ∈ R^(C1×H×W), where C1 is the number of channels of this feature map; after obtaining the feature map P, generating the instance consensus transform parameters θ ∈ R^(r²×H×W), where r denotes the size of the local region centered at the current spatial position, and the size of the parameters θ changes with the local region size r;
Reshaping θ to obtain θ′ ∈ R^(N×r²), where N = H × W; applying an unfold (expansion) operation to the feature map P to extract the features of the r × r sliding-window block around each position and reshaping the result to obtain a feature map P′ ∈ R^(N×C1×r²); computing a new feature map Q = f(θ′, P′), where the function f(x, y) denotes the element-wise multiplication of tensors x and y followed by summation over the last dimension;
Reshaping the feature map Q to obtain the instance-level consensus features.
8. The scene segmentation system based on consensus features according to claim 7, further comprising: constructing a category consensus transform unit and using the category consensus transform unit to apply the category consensus transform to the instance-level consensus features:
The category consensus transform unit comprises two bidirectional long short-term memory (BiLSTM) networks and one convolutional layer, which instantiate a global network;
The first BiLSTM scans the feature map of the instance-level consensus features in a bottom-up and top-down manner; the BiLSTM is specifically:
the scanning rule is the standard LSTM recurrence, and H_t is the output state at time step t;
Concatenating the hidden states of the two scan directions to obtain a mixed feature map H1, and using the second BiLSTM to scan this feature map H1 in the horizontal direction; concatenating the forward and backward states to obtain a mixed feature map H2; inputting this feature map H2 to the convolutional layer to obtain the category-level consensus transform parameters; normalizing these parameters with an activation function to obtain the transform parameters φ;
A new feature map F can then be generated as F = g(φ, E),
where the function g(x, y) is the tensor multiplication between tensors x and y and E is the feature map; the new feature map F is reshaped to obtain the feature map of the category-level consensus features.
9. The scene segmentation system based on consensus features according to claim 8, wherein the activation function is the Softmax function.
10. The scene segmentation system based on consensus features according to claim 8, wherein the convolutional layer uses a 1 × 1 convolution kernel.
CN201910604601.6A 2019-07-05 2019-07-05 Scene segmentation method and system based on consistency characteristics Active CN110472493B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910604601.6A CN110472493B (en) 2019-07-05 2019-07-05 Scene segmentation method and system based on consistency characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910604601.6A CN110472493B (en) 2019-07-05 2019-07-05 Scene segmentation method and system based on consistency characteristics

Publications (2)

Publication Number Publication Date
CN110472493A true CN110472493A (en) 2019-11-19
CN110472493B CN110472493B (en) 2022-01-21

Family

ID=68506772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910604601.6A Active CN110472493B (en) 2019-07-05 2019-07-05 Scene segmentation method and system based on consistency characteristics

Country Status (1)

Country Link
CN (1) CN110472493B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170046579A1 (en) * 2015-08-12 2017-02-16 Chiman KWAN Method and system for ugv guidance and targeting
CN107316313A (en) * 2016-04-15 2017-11-03 株式会社理光 Scene Segmentation and equipment
CN108537157A (en) * 2018-03-30 2018-09-14 特斯联(北京)科技有限公司 A kind of video scene judgment method and device based on artificial intelligence classification realization

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170046579A1 (en) * 2015-08-12 2017-02-16 Chiman KWAN Method and system for ugv guidance and targeting
CN107316313A (en) * 2016-04-15 2017-11-03 株式会社理光 Scene Segmentation and equipment
CN108537157A (en) * 2018-03-30 2018-09-14 特斯联(北京)科技有限公司 A kind of video scene judgment method and device based on artificial intelligence classification realization

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
THANH MINH NGUYEN ET AL.: "A Consensus Model for Motion Segmentation in Dynamic Scenes", 《IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY》 *
翁健 (WENG Jian): "基于全卷积神经网络的全向场景分割研究与算法实现" [Research and Algorithm Implementation of Omnidirectional Scene Segmentation Based on Fully Convolutional Neural Networks], 《中国优秀硕士学位论文全文数据库 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) *

Also Published As

Publication number Publication date
CN110472493B (en) 2022-01-21

Similar Documents

Publication Publication Date Title
Li et al. Semantic-aware grad-gan for virtual-to-real urban scene adaption
CN110363201B (en) Weak supervision semantic segmentation method and system based on collaborative learning
CN110111236B (en) Multi-target sketch image generation method based on progressive confrontation generation network
CN107844795B (en) Convolutional neural networks feature extracting method based on principal component analysis
CN106981080A (en) Night unmanned vehicle scene depth method of estimation based on infrared image and radar data
CN110188635A (en) A kind of plant pest recognition methods based on attention mechanism and multi-level convolution feature
CN107993238A (en) A kind of head-and-shoulder area image partition method and device based on attention model
CN111738908A (en) Scene conversion method and system for generating countermeasure network by combining instance segmentation and circulation
CN111667005B (en) Human interactive system adopting RGBD visual sensing
CN107944459A (en) A kind of RGB D object identification methods
CN110689000A (en) Vehicle license plate identification method based on vehicle license plate sample in complex environment
CN112884758B (en) Defect insulator sample generation method and system based on style migration method
CN113724354B (en) Gray image coloring method based on reference picture color style
CN109657538A (en) Scene Segmentation and system based on contextual information guidance
CN110659702A (en) Calligraphy copybook evaluation system and method based on generative confrontation network model
CN113449878B (en) Data distributed incremental learning method, system, equipment and storage medium
DE102019112595A1 (en) GUIDED HALLUCATION FOR MISSING PICTURE CONTENT USING A NEURONAL NETWORK
CN113554653A (en) Semantic segmentation method for long-tail distribution of point cloud data based on mutual information calibration
CN108810319A (en) Image processing apparatus and image processing method
Wang et al. A multi-scale attentive recurrent network for image dehazing
CN110472493A (en) Scene Segmentation and system based on consistency feature
DE102018128592A1 (en) Generating an image using a map representing different classes of pixels
CN112464924A (en) Method and device for constructing training set
Sanders Neural networks, AI, phone-based VR, machine learning, computer vision and the CUNAT automated translation app–not your father’s archaeological toolkit
Orhei Urban landmark detection using computer vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant