CN110210485A - Image semantic segmentation method based on attention-mechanism-guided feature fusion - Google Patents
Image semantic segmentation method based on attention-mechanism-guided feature fusion
- Publication number
- CN110210485A (application CN201910391452.XA)
- Authority
- CN
- China
- Prior art keywords
- semantic
- layer
- feature
- fusion
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
- G06F18/2193—Validation; Performance evaluation; Active pattern learning techniques based on specific statistical tests
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Abstract
The present invention discloses an image semantic segmentation method based on attention-mechanism-guided feature fusion, comprising the following steps: (10) encoder base network construction: an improved ResNet-101 is used to generate a series of features ranging from high-resolution, low-semantic to low-resolution, high-semantic; (20) decoder feature-fusion module construction: a pyramid-style module built on three convolution operations extracts high-level semantics with strong consistency constraints, which are then fused layer by layer, with weighting, into the low-level stage features to obtain a preliminary segmentation heat map; (30) auxiliary loss function construction: additional auxiliary supervision is applied to each fusion output of the decoding stage and superimposed on the main loss after the heat map is upsampled, strengthening the supervised training of the model and yielding the semantic segmentation map. The image semantic segmentation method based on attention-mechanism-guided feature fusion of the invention achieves high accuracy and clear boundary contours.
Description
Technical field
The invention belongs to the field of still-image recognition technology, and in particular concerns an image semantic segmentation method based on attention-mechanism-guided feature fusion that achieves high accuracy and clear boundary contours.
Background technique
Semantic segmentation, i.e. pixel-level image understanding, is one of the important cornerstones of computer vision and has a wide range of application scenarios. Through fine-grained partitioning it gives machines the ability to distinguish the different regions of a visual scene at the pixel level. Semantic segmentation groups together the pixel regions in an image that belong to the same object, thereby extending its field of application.
Semantic segmentation combines two problems, object classification and object localization, solving both with a single pixel-level prediction. How to balance the mutual constraints between high-level abstract object classification and accurate low-level object localization is the key problem current semantic segmentation methods face. Semantic segmentation methods fall roughly into two classes. The first generates the semantics of each object in an image from manually extracted features; it generally requires careful feature engineering, after which the features are fed to a classifier that performs pixel-level classification. The second is based on deep learning: an end-to-end system combines feature extraction and classification to assign a semantic label directly to each pixel.
Most traditional methods rely on manually extracted features combined with a machine-learning classifier, such as the Boost method of Shotton et al., the random forests of Johnson et al., and the support vector machines of Soatto et al. These methods made substantive progress by integrating rich contextual information with structured prediction techniques. However, limited by the representational power of hand-crafted features, the performance of image semantic segmentation systems based on traditional machine learning gradually saturated; they could not break through the bottleneck, and considerable room for improvement in segmentation accuracy remained.
In recent years the deep learning revolution has transformed related fields, and many computer vision problems, including semantic segmentation, are now addressed with deep architectures. The fully convolutional network method, built on deep convolutional neural networks, replaces fully connected layers with convolutional layers and applies the resulting fully convolutional network to semantic segmentation, producing dense pixel-wise label outputs and achieving higher segmentation precision. Zhao et al. proposed the pyramid scene parsing network, which uses a pyramid pooling module to aggregate context from regions of different sizes and exploit global contextual information, effectively producing high-quality segmentation results from a global prior. Li et al. first classify the easy regions in a shallow stage and let the deeper stages focus on a few difficult regions, achieving adaptive learning targeted at hard samples and ultimately improving segmentation performance. Lin et al. proposed a generic multi-path refinement network that explicitly exploits all the information available during downsampling to enable high-resolution pixel-level prediction through long-range residual connections.
However, the prior art still suffers from two main problems with respect to segmentation quality:
1. In image semantic segmentation based on deep fully convolutional networks, the repeated combination of convolution, max pooling and downsampling during feature extraction progressively reduces feature resolution and loses contextual information. Segmentation results therefore exhibit semantic inconsistencies such as misrecognition of local regions of objects with complex appearance and of small objects among multi-scale objects;
2. Much of the success of convolutional networks is attributed to their built-in invariance to local image transformations, which strengthens the network's ability to learn hierarchical abstractions; this is exactly what high-level visual tasks such as object classification require. But semantic segmentation must also resolve spatial details such as object boundary contours while solving the classification problem, and a task of simple pixel classification often yields segmentation results in which object boundary contours are blurred.
Summary of the invention
The purpose of the present invention is to provide an image semantic segmentation method based on attention-mechanism-guided feature fusion that achieves high accuracy and clear boundary contours.
The technical solution realizing the purpose of the invention is as follows:
An image semantic segmentation method based on attention-mechanism-guided feature fusion, comprising the following steps:
(10) Encoder base network construction: an improved ResNet-101 is used to generate a series of features ranging from high-resolution, low-semantic to low-resolution, high-semantic;
(20) Decoder feature-fusion module construction: a pyramid-style module built on three convolution operations extracts high-level semantics with strong consistency constraints, which are then fused layer by layer, with weighting, into the low-level stage features to obtain a preliminary segmentation heat map;
(30) Auxiliary loss function construction: additional auxiliary supervision is applied to each fusion output of the decoding stage and superimposed on the main loss after the heat map is upsampled, strengthening the supervised training of the model and yielding the semantic segmentation map.
Compared with the prior art, the present invention has the following advantages:
1. High accuracy: the method fuses features of three different scales through an end high-level semantic information extraction module resembling a pyramid structure, and additionally introduces a global pooling branch connected to the output feature for subsequent processing. The contextual information is multiplied with the original feature after a simple convolution operation, so strongly consistent semantic features can be captured without introducing much extra computation, reducing the probability of misidentifying local regions of objects;
2. Clear boundary contours: exploiting the fact that, between adjacent features, the high-level feature carries more semantic information while the low-level feature carries more spatial detail, the invention first concatenates the two hierarchy features to generate a channel attention vector, which is used as a weight to select the most discriminative information in the low-level feature. The strong semantic consistency constraint of the high-level feature guides and refines its fusion with the low-level feature, capturing rich context and ultimately refining object segmentation boundaries. Hierarchy features are thus fused more effectively to restore object edge details in the segmentation map, reducing boundary blurring.
Detailed description of the invention
Fig. 1 is the main flow chart of the image semantic segmentation method based on attention-mechanism-guided feature fusion of the present invention.
Fig. 2 is the flow chart of the encoder base network construction step in Fig. 1.
Fig. 3 is the flow chart of the decoder feature-fusion module construction step in Fig. 1.
Fig. 4 is an example of the end high-level semantic information extraction module.
Fig. 5 is an example of the attention-mechanism-guided feature fusion module.
Specific embodiment
As shown in Fig. 1, the image semantic segmentation method based on attention-mechanism-guided feature fusion of the present invention comprises the following steps:
(10) Encoder base network construction: an improved ResNet-101 is used to generate a series of features ranging from high-resolution, low-semantic to low-resolution, high-semantic;
As shown in Fig. 2, the encoder base network construction step (10) comprises:
(11) Redeploying the number of building blocks: the numbers of building blocks in stages res-2 to res-5 are redeployed, adjusting the original ResNet-101 block counts {3, 4, 23, 3} for res-2 to res-5 to {8, 8, 9, 8};
The purpose of the convolutional-network encoder is to generate a series of features ranging from high-resolution, low-semantic to low-resolution, high-semantic. The base network is usually an existing convolutional neural network model such as LeNet, AlexNet, VGG, GoogLeNet or ResNet. ResNet-101 uses a large number of residual structures, which solve the vanishing-gradient problem that comes with deeper networks; each residual structure also provides a new path for forward and backward propagation, giving the network very strong expressive power. The present invention uses ResNet-101 as the encoder base network for semantic segmentation.
In the base network, features are extracted from the tail of each encoder stage; for ResNet-101 these are the four stages res-2, res-3, res-4 and res-5, containing {3, 4, 23, 3} building blocks respectively, each building block consisting of three convolutional layers. The first two encoding stages of ResNet-101 thus contain only a few building blocks; such a shallow stack of convolutional layers cannot extract deep semantic features, so the semantics of the low-level features are of poor quality. From res-4 onward, after many deep convolutions, the output features possess strong semantics, and the semantic quality gap between the two adjacent features extracted by res-3 and res-4 is very large. To improve the semantic quality of the low-level features and bring them closer to the supervision signal, a direct approach is to redeploy the numbers of building blocks in stages res-2 to res-5, balancing the number of convolutional layers per stage and reducing the semantic gap between the features output by res-3 and res-4. In the redeployment, the original ResNet-101 block counts {3, 4, 23, 3} for res-2 to res-5 are adjusted to {8, 8, 9, 8}.
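The redeployment above can be checked with a small sketch (the code and names below are illustrative, not from the patent): it contrasts the stock ResNet-101 per-stage block counts with the redeployed counts, each bottleneck building block holding three convolutional layers as the description notes.

```python
# Illustrative sketch: per-stage bottleneck-block counts of stock ResNet-101
# versus the redeployed counts {8, 8, 9, 8}. Stage names res-2..res-5 follow
# the patent description; each building block holds three conv layers.

ORIGINAL = {"res-2": 3, "res-3": 4, "res-4": 23, "res-5": 3}
REDEPLOYED = {"res-2": 8, "res-3": 8, "res-4": 9, "res-5": 8}

def conv_layers_per_stage(blocks):
    """Three convolutional layers per bottleneck building block."""
    return {stage: n * 3 for stage, n in blocks.items()}

def total_blocks(blocks):
    return sum(blocks.values())

print(conv_layers_per_stage(ORIGINAL))    # res-4 alone holds 23 * 3 = 69 layers
print(conv_layers_per_stage(REDEPLOYED))  # depth spread nearly evenly
print(total_blocks(ORIGINAL), total_blocks(REDEPLOYED))  # both 33: same budget
```

Both configurations keep 33 blocks in total, so the redeployment redistributes depth toward the early stages rather than adding capacity; this is consistent with the later observation that redeployment does not by itself strengthen the network's classification capability.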
(12) Enlarging the receptive field: the conventional convolutions of the res-5 stage in the ResNet-101 base network are replaced by atrous (dilated) convolutions with dilation rate 2.
The output resolution of semantic segmentation should match the input image. Although semantic segmentation methods based on fully convolutional networks accept input images of arbitrary resolution, the repeated convolution and pooling operations reduce feature resolution even as they enlarge the receptive field. The shrunken feature maps can be restored to the original image size by upsampling, but this process inevitably loses information that cannot be recovered, and the feature maps recovered by upsampling lose sensitivity to image detail. Moreover, frequent upsampling operations require additional memory and time. The present invention overcomes this problem with the atrous convolution method, which originated in wavelet-transform analysis in signal processing.
The original filter is upsampled by a factor of 2 by inserting zeros between filter values. Although the size of the effective filter increases, the inserted zeros (the "holes") need not be computed, so the number of filter parameters and the amount of computation per position remain unchanged. By changing the dilation-rate parameter r, the size of the receptive field can be adapted, efficiently controlling the feature resolution in a convolutional network without learning additional parameters.
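The zero-insertion view of atrous convolution described above can be sketched directly (an illustrative helper, not the patent's implementation): dilating a k x k filter with rate r inserts r - 1 zeros between adjacent taps, giving an effective kernel of size (k - 1) * r + 1 while the number of non-zero parameters stays fixed.

```python
# Illustrative sketch: build the zero-inserted ("atrous") version of a filter.
# The effective kernel grows to (k - 1) * r + 1 per side, but the count of
# non-zero taps (and hence the per-position work) is unchanged.

def dilate(kernel, r):
    k = len(kernel)
    size = (k - 1) * r + 1
    out = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            out[i * r][j * r] = kernel[i][j]
    return out

kernel3 = [[1.0] * 3 for _ in range(3)]  # a 3 x 3 filter
eff = dilate(kernel3, 2)                 # rate-2 atrous version
# effective size 5 x 5, but still only 9 non-zero taps
```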
In a convolutional neural network, after 3 consecutive standard convolutions with 3 × 3 kernels, the receptive field sizes are 3 × 3, 5 × 5 and 7 × 7 respectively. If consecutive convolutions use a constant kernel size of (2d+1) × (2d+1), the receptive field size at layer n is:
f_n = 2dn + 1 (1)
i.e. under standard convolution the receptive field grows linearly. For the atrous convolutions shown in Fig. 2 with 3 × 3 kernels and dilation rates 1, 2 and 4, the receptive fields are 3 × 3, 7 × 7 and 15 × 15 respectively. Suppose consecutive atrous convolutions likewise use a constant kernel size of (2d+1) × (2d+1), with dilation rate r_n at layer n; then the receptive field size is:
f_n = f_{n-1} + 2d·r_n (2)
where n ≥ 2 and f_1 = 2d·r_1 + 1. By recursion:
f_n = 2d·(r_1 + r_2 + … + r_n) + 1 (3)
Letting the dilation rate be r_n = 2^(n-1), the receptive field size becomes:
f_n = 2d·(2^n − 1) + 1 (4)
Thus, by choosing appropriate dilation rates, atrous convolution makes the receptive field grow exponentially. In the base network, the res-5 stage begins using atrous convolution with dilation rate 2; since res5a and res5c in that stage use 1 × 1 kernels, in practice only res5b is rapidly enlarging the receptive field to extract dense features.
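Formulas (1)-(4) can be verified numerically; the small calculator below (illustrative code, not part of the patent) reproduces the linear growth of standard convolution and the exponential growth under dilation rates r_n = 2^(n-1).

```python
# Numeric check of formulas (1)-(4): receptive-field growth for stacks of
# (2d+1) x (2d+1) convolutions, standard versus atrous.

def rf_standard(d, n):
    """Formula (1): f_n = 2dn + 1 after n standard convolutions."""
    return 2 * d * n + 1

def rf_atrous(d, rates):
    """Recursion (2) with f_1 = 2d*r_1 + 1; equivalent to formula (3)."""
    f = 2 * d * rates[0] + 1
    for r in rates[1:]:
        f = f + 2 * d * r
    return f

d = 1  # 3 x 3 kernels
print([rf_standard(d, n) for n in (1, 2, 3)])  # [3, 5, 7] -- linear growth
print([rf_atrous(d, [2 ** i for i in range(n)]) for n in (1, 2, 3)])  # [3, 7, 15]
# Closed form (4) with r_n = 2^(n-1): f_n = 2d(2^n - 1) + 1
print([2 * d * (2 ** n - 1) + 1 for n in (1, 2, 3)])  # [3, 7, 15]
```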
(20) Decoder feature-fusion module construction: a pyramid-style module built on three convolution operations extracts high-level semantics with strong consistency constraints, which are then fused layer by layer, with weighting, into the low-level stage features to obtain a preliminary segmentation heat map;
A decoder-based structure is used to restore image resolution, refining the final prediction by fusing the features of each level along the way. The decoder architecture mainly considers how to recover the spatial information lost through repeated pooling and downsampling. The present invention designs an end module in the decoder architecture mainly to extract the high-level semantic information with the strongest consistency constraint, and fuses it with low-level features under the guidance of the attention mechanism to refine the output.
As shown in Fig. 3, the decoder feature-fusion module construction step (20) comprises:
(21) Extracting end high-level semantic information: a pyramid-like construction module built on three convolution operations, using 3 × 3, 5 × 5 and 7 × 7 convolutions respectively, obtains high-level semantics with the strongest intra-class semantic consistency by fusing context of different scales;
Most previous models execute atrous spatial pyramid pooling, or an atrous spatial pyramid module over a series of scales, at the end of the base network. In current semantic segmentation systems a pyramid structure can extract feature information at different scales and enlarge the receptive field at the pixel level, but such a structure lacks a global context prior, cannot select suitable elements by channel, and may lose important pixel-level information. For example, overly frequent atrous convolution causes loss of local information, and gridding harms the local consistency of feature maps. The pyramid pooling module proposed in PSPNet is even more prone to losing pixel locations in its pooling operations at different scales.
The present invention uses the high-level semantic information extraction module shown in Fig. 4 to extract, from the end of the base network, high-level features with strong intra-class semantic consistency.
By realizing a pyramid-like structure, the module fuses the feature information of three different scales. To extract useful context from the different scales more effectively, 3 × 3, 5 × 5 and 7 × 7 convolutions are used in the module; since the high-level features have low resolution, this does not bring too great a computational burden. By fusing feature information of different scales step by step, the module combines the contextual features of adjacent scales more precisely. The output feature from res-5, after a 1 × 1 convolution, is multiplied channel-wise with the fused feature. The module also introduces an additional global pooling branch connected to the output feature, which further improves semantic segmentation performance in subsequent processing.
Benefiting from the pyramid-like structure, the end high-level semantic information extraction module can fuse contextual information of different scales while generating powerful semantic information for the high-level features. Unlike the pyramid pooling module, which concatenates features of different scales before a channel-reducing convolutional layer, the end high-level semantic information extraction module multiplies the contextual information with the original feature after a simple 1 × 1 convolution operation, introducing little extra computation.
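A minimal numeric sketch of this multiply-based end module follows. It is an assumption-laden toy, not the patent's exact design: box mean filters of width 3, 5 and 7 stand in for the learned convolutions, a per-channel scale stands in for the 1 × 1 convolution, and the progressive fusion order is assumed.

```python
import numpy as np

# Toy sketch of the end module: multi-scale context (3/5/7 box filters as
# stand-ins for learned convs) is fused progressively, multiplied channel-wise
# with a scaled copy of the res-5 feature, and a global-average-pool branch
# is returned alongside. Names, widths and fusion order are assumptions.

rng = np.random.default_rng(0)

def box_filter(x, k):
    """Same-padded k x k mean filter applied per channel of a (C, H, W) map."""
    c, h, w = x.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)), mode="edge")
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[:, i, j] = xp[:, i:i + k, j:j + k].mean(axis=(1, 2))
    return out

def end_module(x, scale):
    fused = np.zeros_like(x)
    for k in (3, 5, 7):                        # pyramid of three scales
        fused = box_filter(fused + x, k)       # progressive fusion
    gated = (scale[:, None, None] * x) * fused # channel-wise multiply
    ctx = x.mean(axis=(1, 2))                  # global pooling branch
    return gated, ctx

x = rng.standard_normal((4, 8, 8))
gated, ctx = end_module(x, np.ones(4))
# gated keeps the input's (C, H, W) shape; ctx holds one value per channel
```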
(22) Fusing contextual features: channel attention vectors are computed by fusing the features of adjacent stages layer by layer; used as weights, they select the strongly discriminative feature information of the low-level stage, which is then blended with the adjacent high-stage feature to obtain the segmentation heat map.
In the base network, ResNet-101 comprises five stages, each generating features of a corresponding scale; the different stages possess different recognition capabilities, leading to varying degrees of consistency. In the low-level stages the network encodes fine spatial information, but the small receptive field and the lack of spatial-context guidance mean the features contain only weak semantic consistency. In the high-level stages, the large receptive field yields powerful intra-class semantic consistency, but the predicted spatial accuracy is coarse. In short, the low-level stages produce more accurate spatial predictions while the high-level stages provide more accurate semantic predictions. The respective advantages of the two can therefore be combined, using the semantic consistency of the high stage to guide its fusion with the low stage and obtain the optimal prediction. The present invention uses the attention mechanism shown in Fig. 5 to guide feature fusion.
This design fuses the features of adjacent stages to compute a channel attention vector that serves as a weight. The high-level feature provides powerful consistency guidance, while the feature provided by the low-level stage carries information of differing discriminative power. The channel attention vector is used for weighting, selecting the strongly discriminative feature information. In a semantic segmentation framework, a convolution operation outputs the final score map, giving the probability that each pixel belongs to each class. The score y_k in the final score map is obtained by summing over all channels of the feature map:
y_k = Σ_{i∈D} ω_i·x_i (5)
where x represents the feature output by the network, ω denotes the convolution kernel, and D is the set of pixel positions.
p_k = exp(y_k) / Σ_{j=1}^{N} exp(y_j) (6)
In formula (6), p is the prediction probability and N is the number of channels. Under formulas (5) and (6), the final predicted label is the class with the highest probability value. Suppose the prediction result for some class is ŷ while the true label is y; then a parameter α is introduced to change the highest-probability prediction from ŷ to y, as shown in formula (7):
y = α·ŷ, α = Sigmoid(ω, x) (7)
where y is the new prediction output, and α = Sigmoid(ω, x) is the Sigmoid output in Fig. 4.
From the above analysis, the deeper implication of the attention mechanism can be seen. Formula (5) implicitly assumes that the weights of the different channels are equal. But, as noted, the features of different stages possess different degrees of discriminative power, leading to predictions of different granularity. To obtain predictions with fine object boundaries, features with discriminative power should be extracted as much as possible while weakly discriminative features are suppressed. The α value in formula (7) is therefore applied to the feature map x, realizing the feature selection of the attention mechanism. With this module, the output can be refined stage by stage to produce the optimal prediction.
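The attention-guided fusion can be sketched numerically as follows (shapes, the concatenate-then-pool layout, and the final sum are assumptions for illustration, not the patent's exact wiring): adjacent-stage features yield a channel attention vector through a sigmoid, which reweights the low-level feature before it is combined with the high-level one.

```python
import numpy as np

# Toy sketch of attention-guided fusion in the spirit of formula (7):
# alpha = sigmoid(W . GAP(concat(low, high))) reweights the low-level feature
# channel by channel. Weight shapes and the additive merge are assumptions.

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_fuse(low, high, w):
    """low, high: (C, H, W) features of adjacent stages; w: (C, 2C) weights."""
    cat = np.concatenate([low, high], axis=0)  # (2C, H, W)
    gap = cat.mean(axis=(1, 2))                # global average pool -> (2C,)
    alpha = sigmoid(w @ gap)                   # channel attention vector (C,)
    return alpha[:, None, None] * low + high, alpha

C, H, W = 8, 16, 16
low = rng.standard_normal((C, H, W))
high = rng.standard_normal((C, H, W))
fused, alpha = attention_fuse(low, high, rng.standard_normal((C, 2 * C)))
# alpha lies in (0, 1) per channel; fused keeps the (C, H, W) shape
```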
(30) Auxiliary loss function construction: additional auxiliary supervision is applied to each fusion output of the decoding stage and superimposed on the main loss after the heat map is upsampled, strengthening the supervised training of the model and yielding the semantic segmentation map.
The present invention improves the loss function of common semantic segmentation methods by adopting a layer-by-layer label supervision strategy: additional auxiliary supervision is applied directly to the feature output by each fusion of the decoding stage, promoting the learning ability of each branch layer in the network model. To generate semantic outputs in the auxiliary branches, each fused feature is forced to learn more semantics before entering the next step as the high-level stage, in the expectation of being more helpful to subsequent fusion. Note that, as with the redeployment of building blocks in the encoder stage, layer-by-layer label supervision by itself cannot improve the classification capability of the convolutional network; only within the semantic segmentation task does this measure force the network to raise the semantic quality of the low-level stage features, thereby helping the output of the decoding stage.
During network training, auxiliary softmax losses against equal-resolution label maps are added at the tails of the feature-fusion modules corresponding to res-2, res-3 and res-4. The total classification loss of the model is the sum of the supervision of the final output and the supervision of the three auxiliary branches.
Given the 3 branches and the final output, there are T = 4 supervision signals in total, and the number of feature channels of each supervised output equals the number of classes N in the training set. The feature F_t after upsampling at the end of the t-th branch has spatial resolution W_t × H_t, and its value at coordinate position (w, h, n) is F_t^{w,h,n}. Weighted softmax cross-entropy losses are attached to the feature maps of the final output and of each branch, with respective weights λ_t, where λ_0 = 1 is the loss weight of the final output and the rest are auxiliary supervision losses. F_t is fed into the softmax function to compute the probability that each pixel in the image belongs to each class; the softmax function layer is given by:
p_t^{w,h,n} = exp(F_t^{w,h,n}) / Σ_{n'=1}^{N} exp(F_t^{w,h,n'}) (8)
The prediction p_t^{w,h,n} is mapped onto the true label P_t^{w,h,n}, and the loss function finally used for training is shown in formula (9):
L = Σ_{t=0}^{T-1} λ_t · ( −(1 / (W_t·H_t)) · Σ_{w,h} Σ_{n=1}^{N} P_t^{w,h,n} · log p_t^{w,h,n} ) (9)
The layer-by-layer label supervision strategy makes gradient optimization smoother and the model easier to train. Each supervised branch possesses strong learning ability and can acquire rich semantic features at its level. Through fusion, the precision of the final segmentation map does not depend on any individual branch.
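The weighted-sum structure of the total loss can be illustrated with a single-pixel toy example (the logits, labels and auxiliary weights below are made-up values, and λ_0 = 1 follows the description; only the structure, a weighted sum of per-branch softmax cross-entropies, is taken from the text).

```python
import math

# Toy illustration of the total loss structure: each of the T = 4 supervised
# outputs contributes a softmax cross-entropy term (formula (8) feeding
# formula (9)), and the total is their lambda-weighted sum.

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, label):
    """Per-pixel loss: -log p[label]."""
    return -math.log(softmax(logits)[label])

def total_loss(branch_logits, labels, lambdas):
    """Weighted sum over the main output (t = 0) and auxiliary branches."""
    return sum(l * cross_entropy(z, y)
               for l, z, y in zip(lambdas, branch_logits, labels))

# final output plus three auxiliary branches, one pixel, N = 3 classes
logits = [[2.0, 0.5, -1.0], [1.0, 1.0, 0.0], [0.2, 2.2, -0.3], [0.0, 0.0, 3.0]]
labels = [0, 0, 1, 2]
lambdas = [1.0, 0.4, 0.4, 0.4]  # lambda_0 = 1; auxiliary weights are assumed
print(total_loss(logits, labels, lambdas))
```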
A specific embodiment is given below to verify that the method of the present invention improves the accuracy of image semantic segmentation.
The proposed correction modules were tested on two semantic segmentation datasets, PASCAL VOC 2012 and Cityscapes. The base network is ResNet-101 pre-trained on ImageNet. The experimental hardware platform is a Core i7 processor at 3.6 GHz with 48 GB of memory and an NVIDIA GTX 1080 GPU; the code runs on the TensorFlow deep learning framework.
1. Ablation experiments
This section decomposes the proposed method step by step to verify the effectiveness of each added module. In the following experiments, the resulting data are evaluated and compared on the PASCAL VOC 2012 validation set. First, the original ResNet-101 serves as the base network, and its output is directly upsampled at the end, as shown in Table 1.
Table 1: Effect of augmenting the dataset with random scaling and flipping
Next, the base network is extended to an FCN-based encoder-decoder feature-fusion architecture, with a fusion strategy of cropping after upsampling followed by simple channel-wise summation. To examine the effectiveness of this feature fusion, a series of feature subsets is selected to list the effect of fusing the features of each stage, and the effects before and after redeploying the number of building blocks per stage are compared, as shown in Table 2.
From the 2nd column of Table 2 it is clear that fusing more hierarchy features does gradually improve the output quality of the segmentation system; however, as ever lower-level features are fused, overall performance quickly saturates. The watershed is the res-4 stage of ResNet-101, which contains 23 building blocks totalling 69 convolutional layers, leaving a huge semantic gap between it and the low-level features output by the res-2 and res-3 stages. Because of this gap, fusing the low-level feature output by the res-3 stage improves overall performance almost not at all, and continuing the fusion further yields no significant effect either.
Table 2: Effect of feature fusion before and after redeploying the numbers of building blocks
It follows that fusion between widely differing features is essentially ineffective. The 3rd column of the table shows the feature-fusion effect after the numbers of building blocks of the four stages were redeployed. Initially, the segmentation quality after upsampling the res-5 output feature is slightly worse than before redeployment, but the difference is almost negligible, confirming that redeploying the building blocks does not strengthen the classification capability of the convolutional network itself. Unlike in the 2nd column, performance rises steadily as ever lower-level features are fused; although the pace of improvement is not uniform, it does not saturate as rapidly as in the 2nd column. The redeployment mechanism changes the block counts of the original ResNet-101 stages res-2, res-3, res-4 and res-5 from {3, 4, 23, 3} to {8, 8, 9, 8}, so that the gap between the features output by the stages shrinks and feature fusion works better, ultimately exceeding the pre-redeployment performance by 0.52 percentage points.
Table 3 shows the effectiveness of each component of the entire model.
Table 3: Ablation comparison on the PASCAL VOC 2012 validation set
The semantic information extracted at the top of the network carries strong semantic consistency. Fusing it stage by stage down to the low-level stages under this strong semantic constraint yields finer image semantic features and improves model performance by 1.1%. The attention mechanism is the most important improvement of the whole model. Unlike fusion methods that simply sum features channel-wise, the channel attention vector it generates selects the most discriminative information in the low-level features and thereby sharpens object boundaries well; on top of the preceding configuration it raises model performance by a further 2.06%, the largest contribution of any component module. The final layer-by-layer label supervision refines the fused features at every level, pushing each fused feature closer to the supervision signal, and lifts overall performance by 0.43%. Besides generating the high-level semantic information, the end module has a branch that outputs a global-pooling feature; using this feature to further constrain the fused output of the res-2 low-level features strengthens the semantic consistency of the model across all pixels of a target when processing an image. The global-pooling branch improves model performance by 0.96% and is of real value.
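The channel-attention-guided fusion step can be sketched in NumPy. This is a simplified illustration under stated assumptions, not the patented implementation: the attention vector here is taken as the sigmoid of the global average pool of the high-level feature, nearest-neighbour upsampling stands in for the real upsampling, and the module's learned convolutions are omitted:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fuse(low, high):
    """Fuse a high-resolution low-level feature map with a low-resolution
    high-level one, weighting the low-level channels by a channel-attention
    vector derived from the high-level feature.

    low  : (C, H, W) low-level feature (high resolution)
    high : (C, h, w) high-level feature (low resolution); h, w divide H, W
    """
    C, H, W = low.shape
    # Nearest-neighbour upsampling of the high-level feature to (C, H, W).
    up = high.repeat(H // high.shape[1], axis=1).repeat(W // high.shape[2], axis=2)
    # Channel attention vector from the global average pool of the high level.
    attn = sigmoid(high.mean(axis=(1, 2)))          # shape (C,)
    # Select the discriminative low-level channels, then blend with the high level.
    return attn[:, None, None] * low + up

low = np.random.rand(4, 8, 8)
high = np.random.rand(4, 4, 4)
fused = attention_fuse(low, high)
print(fused.shape)  # (4, 8, 8)
```

The key difference from plain channel-wise summation is the per-channel weight `attn`, which suppresses low-level channels the high-level semantics deem uninformative.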
2. Qualitative analysis
Table 4 visualizes the image semantic segmentation results of several comparison methods.
Table 4. Visualization of segmentation results on selected images
In the third and fifth columns of the table, the FCN method misidentifies local regions of objects. The original image in the third column shows three cows, two of which are comparatively small in scale. The FCN baseline exhibits the problem of misidentifying local regions of the target object: the front legs of the larger cow are visually close to the ground and slightly complex in appearance, and although FCN segments them to some extent, it misclassifies the hooves as horse. For the two smaller cows, many pixel regions are misclassified, and the misclassified regions are again labelled as horse; presumably the cows in the training set are generally large in scale, and the model cannot handle smaller, similar targets well. The proposed method performs almost ideally here, coping well with FCN's loss of image detail, which causes the misclassification of local pixels of ground objects. The fifth-column image shows a white horse beside a white railing; visually the horse's back and legs are occluded by the railing. Because the colours are close, FCN fails to identify the parts of the horse above and below the railing, and the occluded regions come out blurry. The present invention is nearly perfect here: apart from judgment errors on a few pixels, there are no problems.
In the first, second and fourth columns, the FCN method produces ambiguous object boundaries. The first-column input image is a sheep whose light-toned parts, together with the earth background, are close to peak-brightness white in the nearly black-and-white scene. In the FCN result, part of the light-toned earth background is misidentified as part of the animal's body, and the misidentified regions are scattered. The segmentation map obtained by the present invention on the basis of the attention mechanism eliminates these scattered misidentifications well; the boundaries are very clear and the constraint on the segmentation is excellent. The cabinet with the monitor in the second column and the racehorse in the fourth column are similar cases.
3. Quantitative evaluation
The present invention ran experiments with several methods on the augmented PASCAL VOC 2012 and Cityscapes datasets and quantitatively compared the results. Test results are shown in Table 5.
Table 5. Per-class accuracy of the attention-mechanism model on the PASCAL VOC 2012 test set
Compared with DeepLab, the present invention achieves higher per-class accuracy in about half of the categories, in some of them by a wide margin, and its final overall accuracy is slightly higher. Compared with the state-of-the-art LRR method, the present invention has higher accuracy in most categories: for bicycle, boat, bottle, chair, potted plant, sofa, TV and similar classes it is more than 3% higher than LRR, in some cases even 15% to 20% higher. These are precisely the categories that are hard to segment and easy to confuse. Guided by high-level semantics, the proposed method meticulously fuses features from multiple low levels, so it has an advantage in feature extraction when handling categories rich in semantic detail, such as bicycle, chair and potted plant: the segmented targets show strong semantic consistency, and problems such as local-region misidentification rarely occur. For categories whose targets have similar appearance, such as cow, sheep and dog, it can likewise distinguish the complex semantic classes.
Finally, the method of the invention is also assessed on the Cityscapes dataset. During training, every image is cropped to 800 × 800; observation shows that for high-resolution images, large crops are very useful. The performance of the model on the test set is given in Table 6. As on PASCAL VOC 2012, the present invention achieves the best segmentation results on most object classes and outperforms the other methods in the final overall score.
Table 6. Per-class accuracy of the attention-mechanism model on the Cityscapes test set
The present invention uses a convolutional encoder to embed semantic information of different levels into feature maps, then uses a decoder to integrate and refine each feature map and produce the final segmentation result.
The encoder is a pre-trained convolutional model that extracts image features. Its topmost features are highly semantic but, lacking resolution, are insufficient for reconstructing the fine details of the segmentation map, while the features at the bottom of the encoder carry high-resolution detail but lack strong semantic information. The encoder redeploys the number of building blocks in each stage to even out the semantic differences between stage features, and uses dilated convolution with a dilation rate of 2 in the res5b block. A high-level semantic information extraction module at the end of the network generates a strong semantic-consistency constraint; in the decoding stage the attention mechanism fuses, layer by layer and top-down, the low-resolution high-level features with the high-resolution low-level features, using the strong semantic consistency of the high-level features to guide the fusion and thereby produce a high-resolution semantic result.
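The effect of the rate-2 dilated convolution in res5b on the kernel's spatial extent follows the standard relation k_eff = k + (k − 1)(d − 1). A small helper (illustrative only, not code from the patent) makes the numbers concrete:

```python
def effective_kernel(k, dilation):
    """Spatial extent of a k x k convolution kernel with the given dilation
    rate: k + (k - 1) * (dilation - 1)."""
    return k + (k - 1) * (dilation - 1)

# An ordinary 3x3 convolution covers a 3x3 window ...
print(effective_kernel(3, 1))  # 3
# ... while the rate-2 dilated 3x3 convolution used in res5b covers 5x5,
# enlarging the receptive field without adding parameters or further
# reducing the feature-map resolution.
print(effective_kernel(3, 2))  # 5
```

This is why dilation is used in the last encoder stage: the deepest features keep a larger context while their spatial resolution is preserved for the decoder.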
Claims (3)
1. An image semantic segmentation method based on attention-mechanism-guided feature fusion, characterized by comprising the following steps:
(10) Encoder base-network construction: use an improved ResNet-101 to generate a series of features ranging from high-resolution, weakly semantic to low-resolution, strongly semantic;
(20) Decoder feature-fusion module construction: use a pyramid-structure module based on a three-layer convolution operation to extract high-level semantics carrying a consistency constraint, then perform layer-by-layer weighted fusion with the low-level stage features to obtain a preliminary segmentation heat map;
(30) Auxiliary loss function construction: attach additional auxiliary supervision to each fusion output of the decoding stage, add it to the main supervision loss computed on the upsampled heat map, and strengthen the stepwise training of the model to obtain the semantic segmentation map.
2. The image semantic segmentation method according to claim 1, characterized in that
the encoder base-network construction step (10) comprises:
(11) Redeploying the building-block counts: redeploy the number of building blocks in each of stages res-2 to res-5, adjusting the original ResNet-101 block counts {3, 4, 23, 3} for res-2 to res-5 to {8, 8, 9, 8};
(12) Enlarging the receptive field: replace the conventional convolution of the res-5 stage of the ResNet-101 base network with dilated convolution with a dilation rate of 2.
3. The image semantic segmentation method according to claim 1, characterized in that
the decoder feature-fusion module construction step (20) comprises:
(21) End high-level semantic information extraction: use a pyramid-like construction module based on a three-layer convolution operation, applying 3 × 3, 5 × 5 and 7 × 7 convolutions within the module, and fuse context at different scales to obtain high-level semantics with the strongest intra-class semantic consistency;
(22) Context feature integration: fuse the features of adjacent stages layer by layer, compute a channel attention vector, use it as weights to select the strongly discriminative feature information of the low-level stage, and blend this with the adjacent higher-stage feature to obtain the preliminary segmentation heat map.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910391452.XA CN110210485A (en) | 2019-05-13 | 2019-05-13 | The image, semantic dividing method of Fusion Features is instructed based on attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910391452.XA CN110210485A (en) | 2019-05-13 | 2019-05-13 | The image, semantic dividing method of Fusion Features is instructed based on attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110210485A true CN110210485A (en) | 2019-09-06 |
Family
ID=67785851
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910391452.XA Pending CN110210485A (en) | 2019-05-13 | 2019-05-13 | The image, semantic dividing method of Fusion Features is instructed based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110210485A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109190752A (en) * | 2018-07-27 | 2019-01-11 | 国家新闻出版广电总局广播科学研究院 | The image, semantic dividing method of global characteristics and local feature based on deep learning |
CN109284670A (en) * | 2018-08-01 | 2019-01-29 | 清华大学 | A kind of pedestrian detection method and device based on multiple dimensioned attention mechanism |
CN109461157A (en) * | 2018-10-19 | 2019-03-12 | 苏州大学 | Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field |
CN109635694A (en) * | 2018-12-03 | 2019-04-16 | 广东工业大学 | A kind of pedestrian detection method, device, equipment and computer readable storage medium |
Non-Patent Citations (4)
Title |
---|
CHANGQIAN YU ET AL: "Learning a Discriminative Feature Network for Semantic Segmentation", arXiv:1804.09337v1 * |
HANCHAO LI ET AL: "Pyramid Attention Network for Semantic Segmentation", arXiv:1805.10180v3 * |
HENGSHUANG ZHAO ET AL: "Pyramid Scene Parsing Network", arXiv:1612.01105v2 * |
NING QINGQUN: "Research on Fast and Robust Image Semantic Segmentation Algorithms", China Doctoral Dissertations Full-text Database, Information Science and Technology * |
Cited By (56)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110675405A (en) * | 2019-09-12 | 2020-01-10 | 电子科技大学 | Attention mechanism-based one-shot image segmentation method |
CN110675405B (en) * | 2019-09-12 | 2022-06-03 | 电子科技大学 | Attention mechanism-based one-shot image segmentation method |
CN110689061A (en) * | 2019-09-19 | 2020-01-14 | 深动科技(北京)有限公司 | Image processing method, device and system based on alignment feature pyramid network |
CN110689061B (en) * | 2019-09-19 | 2023-04-28 | 小米汽车科技有限公司 | Image processing method, device and system based on alignment feature pyramid network |
CN110705457A (en) * | 2019-09-29 | 2020-01-17 | 核工业北京地质研究院 | Remote sensing image building change detection method |
CN110705457B (en) * | 2019-09-29 | 2024-01-19 | 核工业北京地质研究院 | Remote sensing image building change detection method |
CN111104962B (en) * | 2019-11-05 | 2023-04-18 | 北京航空航天大学青岛研究院 | Semantic segmentation method and device for image, electronic equipment and readable storage medium |
CN111104962A (en) * | 2019-11-05 | 2020-05-05 | 北京航空航天大学青岛研究院 | Semantic segmentation method and device for image, electronic equipment and readable storage medium |
CN111158068A (en) * | 2019-12-31 | 2020-05-15 | 哈尔滨工业大学(深圳) | Short-term prediction method and system based on simple convolutional recurrent neural network |
CN111222580A (en) * | 2020-01-13 | 2020-06-02 | 西南科技大学 | High-precision crack detection method |
CN111292330A (en) * | 2020-02-07 | 2020-06-16 | 北京工业大学 | Image semantic segmentation method and device based on coder and decoder |
CN111340046A (en) * | 2020-02-18 | 2020-06-26 | 上海理工大学 | Visual saliency detection method based on feature pyramid network and channel attention |
US11361534B2 (en) | 2020-02-24 | 2022-06-14 | Dalian University Of Technology | Method for glass detection in real scenes |
WO2021169049A1 (en) * | 2020-02-24 | 2021-09-02 | 大连理工大学 | Method for glass detection in real scene |
CN111508263A (en) * | 2020-04-03 | 2020-08-07 | 西安电子科技大学 | Intelligent guiding robot for parking lot and intelligent guiding method |
CN111488884A (en) * | 2020-04-28 | 2020-08-04 | 东南大学 | Real-time semantic segmentation method with low calculation amount and high feature fusion |
CN111626300A (en) * | 2020-05-07 | 2020-09-04 | 南京邮电大学 | Image semantic segmentation model and modeling method based on context perception |
CN111626300B (en) * | 2020-05-07 | 2022-08-26 | 南京邮电大学 | Image segmentation method and modeling method of image semantic segmentation model based on context perception |
CN111598174A (en) * | 2020-05-19 | 2020-08-28 | 中国科学院空天信息创新研究院 | Training method of image ground feature element classification model, image analysis method and system |
CN111767922A (en) * | 2020-05-22 | 2020-10-13 | 上海大学 | Image semantic segmentation method and network based on convolutional neural network |
CN111767922B (en) * | 2020-05-22 | 2023-06-13 | 上海大学 | Image semantic segmentation method and network based on convolutional neural network |
CN111626196A (en) * | 2020-05-27 | 2020-09-04 | 成都颜禾曦科技有限公司 | Typical bovine animal body structure intelligent analysis method based on knowledge graph |
CN111680695A (en) * | 2020-06-08 | 2020-09-18 | 河南工业大学 | Semantic segmentation method based on reverse attention model |
CN111832453B (en) * | 2020-06-30 | 2023-10-27 | 杭州电子科技大学 | Unmanned scene real-time semantic segmentation method based on two-way deep neural network |
CN111832453A (en) * | 2020-06-30 | 2020-10-27 | 杭州电子科技大学 | Unmanned scene real-time semantic segmentation method based on double-path deep neural network |
CN111932553A (en) * | 2020-07-27 | 2020-11-13 | 北京航空航天大学 | Remote sensing image semantic segmentation method based on area description self-attention mechanism |
CN111915627A (en) * | 2020-08-20 | 2020-11-10 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Semantic segmentation method, network, device and computer storage medium |
CN111915627B (en) * | 2020-08-20 | 2021-04-16 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Semantic segmentation method, network, device and computer storage medium |
CN112052783B (en) * | 2020-09-02 | 2024-04-09 | 中南大学 | High-resolution image weak supervision building extraction method combining pixel semantic association and boundary attention |
CN112101363A (en) * | 2020-09-02 | 2020-12-18 | 河海大学 | Full convolution semantic segmentation system and method based on cavity residual error and attention mechanism |
CN112052783A (en) * | 2020-09-02 | 2020-12-08 | 中南大学 | High-resolution image weak supervision building extraction method combining pixel semantic association and boundary attention |
CN111898709A (en) * | 2020-09-30 | 2020-11-06 | 中国人民解放军国防科技大学 | Image classification method and device |
CN112183448A (en) * | 2020-10-15 | 2021-01-05 | 中国农业大学 | Hulled soybean image segmentation method based on three-level classification and multi-scale FCN |
CN112183448B (en) * | 2020-10-15 | 2023-05-12 | 中国农业大学 | Method for dividing pod-removed soybean image based on three-level classification and multi-scale FCN |
CN112215235B (en) * | 2020-10-16 | 2024-04-26 | 深圳华付技术股份有限公司 | Scene text detection method aiming at large character spacing and local shielding |
CN112215235A (en) * | 2020-10-16 | 2021-01-12 | 深圳市华付信息技术有限公司 | Scene text detection method aiming at large character spacing and local shielding |
CN112241762A (en) * | 2020-10-19 | 2021-01-19 | 吉林大学 | Fine-grained identification method for pest and disease damage image classification |
CN113436127A (en) * | 2021-03-25 | 2021-09-24 | 上海志御软件信息有限公司 | Method and device for constructing automatic liver segmentation model based on deep learning, computer equipment and storage medium |
CN113255675A (en) * | 2021-04-13 | 2021-08-13 | 西安邮电大学 | Image semantic segmentation network structure and method based on expanded convolution and residual path |
CN112906829A (en) * | 2021-04-13 | 2021-06-04 | 成都四方伟业软件股份有限公司 | Digital recognition model construction method and device based on Mnist data set |
CN113255675B (en) * | 2021-04-13 | 2023-10-10 | 西安邮电大学 | Image semantic segmentation network structure and method based on expanded convolution and residual path |
CN113111848B (en) * | 2021-04-29 | 2024-07-02 | 东南大学 | Human body image analysis method based on multi-scale features |
CN113111848A (en) * | 2021-04-29 | 2021-07-13 | 东南大学 | Human body image analysis method based on multi-scale features |
CN113393521A (en) * | 2021-05-19 | 2021-09-14 | 中国科学院声学研究所南海研究站 | High-precision flame positioning method and system based on double-semantic attention mechanism |
CN113393521B (en) * | 2021-05-19 | 2023-05-05 | 中国科学院声学研究所南海研究站 | High-precision flame positioning method and system based on dual semantic attention mechanism |
CN113744279A (en) * | 2021-06-09 | 2021-12-03 | 东北大学 | Image segmentation method based on FAF-Net network |
CN113744279B (en) * | 2021-06-09 | 2023-11-14 | 东北大学 | Image segmentation method based on FAF-Net network |
CN113657388A (en) * | 2021-07-09 | 2021-11-16 | 北京科技大学 | Image semantic segmentation method fusing image super-resolution reconstruction |
CN113657388B (en) * | 2021-07-09 | 2023-10-31 | 北京科技大学 | Image semantic segmentation method for super-resolution reconstruction of fused image |
CN113837965A (en) * | 2021-09-26 | 2021-12-24 | 北京百度网讯科技有限公司 | Image definition recognition method and device, electronic equipment and storage medium |
CN114626666A (en) * | 2021-12-11 | 2022-06-14 | 国网湖北省电力有限公司经济技术研究院 | Engineering field progress identification system based on full-time-space monitoring |
CN114332723B (en) * | 2021-12-31 | 2024-03-22 | 北京工业大学 | Video behavior detection method based on semantic guidance |
CN114332723A (en) * | 2021-12-31 | 2022-04-12 | 北京工业大学 | Video behavior detection method based on semantic guidance |
CN116091363A (en) * | 2023-04-03 | 2023-05-09 | 南京信息工程大学 | Handwriting Chinese character image restoration method and system |
CN117392392A (en) * | 2023-12-13 | 2024-01-12 | 河南科技学院 | Rubber cutting line identification and generation method |
CN117392392B (en) * | 2023-12-13 | 2024-02-13 | 河南科技学院 | Rubber cutting line identification and generation method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110210485A (en) | The image, semantic dividing method of Fusion Features is instructed based on attention mechanism | |
CN111047551B (en) | Remote sensing image change detection method and system based on U-net improved algorithm | |
CN110110751A (en) | A kind of Chinese herbal medicine recognition methods of the pyramid network based on attention mechanism | |
CN108509978A (en) | The multi-class targets detection method and model of multi-stage characteristics fusion based on CNN | |
CN108399380A (en) | A kind of video actions detection method based on Three dimensional convolution and Faster RCNN | |
CN110287960A (en) | The detection recognition method of curve text in natural scene image | |
CN110689599B (en) | 3D visual saliency prediction method based on non-local enhancement generation countermeasure network | |
CN109905624A (en) | A kind of video frame interpolation method, device and equipment | |
CN107808132A (en) | A kind of scene image classification method for merging topic model | |
CN109829537B (en) | Deep learning GAN network children's garment based style transfer method and equipment | |
CN113240691A (en) | Medical image segmentation method based on U-shaped network | |
CN110490082A (en) | A kind of road scene semantic segmentation method of effective integration neural network characteristics | |
CN106778768A (en) | Image scene classification method based on multi-feature fusion | |
CN114724222B (en) | AI digital human emotion analysis method based on multiple modes | |
CN110363770A (en) | A kind of training method and device of the infrared semantic segmentation model of margin guide formula | |
CN112270366B (en) | Micro target detection method based on self-adaptive multi-feature fusion | |
CN112215847A (en) | Method for automatically segmenting overlapped chromosomes based on counterstudy multi-scale features | |
CN112528961A (en) | Video analysis method based on Jetson Nano | |
CN109447897A (en) | A kind of real scene image composition method and system | |
CN113379707A (en) | RGB-D significance detection method based on dynamic filtering decoupling convolution network | |
CN114529940A (en) | Human body image generation method based on posture guidance | |
CN109766918A (en) | Conspicuousness object detecting method based on the fusion of multi-level contextual information | |
CN114332094A (en) | Semantic segmentation method and device based on lightweight multi-scale information fusion network | |
CN113255678A (en) | Road crack automatic identification method based on semantic segmentation | |
Al-Amaren et al. | RHN: A residual holistic neural network for edge detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20190906 |