CN110210485A - Image semantic segmentation method based on attention-guided feature fusion - Google Patents


Info

Publication number
CN110210485A
Authority
CN
China
Prior art keywords
semantic
layer
feature
fusion
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910391452.XA
Other languages
Chinese (zh)
Inventor
龚声蓉
周鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changshu Institute of Technology
Original Assignee
Changshu Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changshu Institute of Technology filed Critical Changshu Institute of Technology
Priority to CN201910391452.XA priority Critical patent/CN110210485A/en
Publication of CN110210485A publication Critical patent/CN110210485A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/2193Validation; Performance evaluation; Active pattern learning techniques based on specific statistical tests
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses an image semantic segmentation method based on attention-guided feature fusion, comprising the following steps: (10) encoder base network construction: an improved ResNet-101 is used to generate a series of features that transition from high-resolution, low-semantic to low-resolution, high-semantic; (20) decoder feature fusion module construction: a pyramid-style module built from three convolution operations extracts high-level semantics with strong consistency constraints, which are then fused layer by layer with lower-stage features through weighted fusion to obtain a preliminary segmentation heat map; (30) auxiliary loss function construction: additional auxiliary supervision is attached to each fusion output of the decoding stage and superimposed with the main loss after the heat map is upsampled, reinforcing the stage-wise training of the model and yielding the semantic segmentation map. The image semantic segmentation method based on attention-guided feature fusion of the present invention achieves high accuracy and clear boundary contours.

Description

Image semantic segmentation method based on attention-guided feature fusion
Technical field
The invention belongs to the technical field of still image recognition, and in particular relates to an image semantic segmentation method based on attention-guided feature fusion with high accuracy and clear boundary contours.
Background technique
Semantic segmentation, i.e. pixel-level image understanding, is one of the important cornerstones of computer vision and has a very wide range of application scenarios. Through fine-grained partitioning, it endows a machine with the ability to distinguish the different regions of visual input at the pixel level. Semantic segmentation groups together the pixel regions that belong to the same object in an image, thereby extending its field of application.
Semantic segmentation combines two problems, object classification and object localization, and solves them jointly through pixel-level prediction. How to strike a balance between the mutually constraining goals of high-level abstract object classification and accurate low-level object localization is the key problem that current semantic segmentation methods face. Semantic segmentation methods can be roughly divided into two classes. The first generates the semantics of each object in an image from manually extracted features; such methods generally require careful feature engineering before a classifier performs pixel-level classification. The second is based on deep learning: an end-to-end system combines feature extraction with a classifier to directly assign a semantic label to each pixel.
Most traditional methods rely on manually extracted features combined with machine-learning classifiers, such as the Boost method of Shotton et al., the random forests of Johnson et al., and the support vector machines of Soatto et al. These methods achieved substantive progress by integrating rich contextual information with structured prediction techniques. However, limited by the representational power of hand-crafted features, the performance of image semantic segmentation systems based on traditional machine learning gradually saturated and could not break through the bottleneck, leaving considerable room for improvement in segmentation accuracy.
In recent years, the deep learning revolution has brought earth-shaking changes to related fields, and many computer vision problems, including semantic segmentation, have begun to be solved with deep architectures. The fully convolutional network method, built on deep convolutional neural networks, replaces fully connected layers with convolutional layers to construct a fully convolutional network for semantic segmentation, producing dense pixel-wise label outputs with higher segmentation precision. Zhao et al. proposed the pyramid scene parsing network, which exploits global context through a pyramid pooling module that aggregates context from different regions, effectively using a global prior to produce high-quality segmentation results. Li et al. first classify the easy regions in shallow stages and let the deeper stages focus on a minority of difficult regions, achieving adaptive learning and recognition of hard samples and ultimately improving segmentation performance. Lin et al. proposed a generic multi-path refinement network that explicitly uses all information available along the downsampling process to achieve high-resolution pixel-level prediction with long-range residual connections.
However, the prior art still has two main problems regarding segmentation quality:
1. In image semantic segmentation based on deep fully convolutional networks, the repeated combination of convolution, max pooling and downsampling during feature extraction gradually reduces feature resolution and loses contextual information, causing semantic inconsistencies in the segmentation results, such as misrecognized local regions of objects with complex appearance and misrecognized small objects among multi-scale objects;
2. The success of convolutional networks is partly attributed to their inherent invariance to local image transformations, which enhances the network's ability to learn hierarchical abstractions; this is exactly what high-level visual tasks such as object classification need. But semantic segmentation must also localize spatial details such as object boundary contours while solving the classification problem, and simple pixel classification in the segmentation task often yields results with blurred object boundary contours.
Summary of the invention
The purpose of the present invention is to provide an image semantic segmentation method based on attention-guided feature fusion with high accuracy and clear boundary contours.
The technical solution for achieving the purpose of the invention is as follows:
An image semantic segmentation method based on attention-guided feature fusion, comprising the following steps:
(10) Encoder base network construction: an improved ResNet-101 is used to generate a series of features that transition from high-resolution, low-semantic to low-resolution, high-semantic;
(20) Decoder feature fusion module construction: a pyramid-style module built from three convolution operations extracts high-level semantics with strong consistency constraints, which are then fused layer by layer with lower-stage features through weighted fusion to obtain a preliminary segmentation heat map;
(30) Auxiliary loss function construction: additional auxiliary supervision is attached to each fusion output of the decoding stage and superimposed with the main loss after the heat map is upsampled, reinforcing the stage-wise training of the model and yielding the semantic segmentation map.
Compared with the prior art, the present invention has the following advantages:
1. High accuracy: the method fuses features of three different scales through a pyramid-like high-level semantic extraction module at the end of the encoder, and additionally introduces a global pooling branch connected to the output feature for subsequent processing. The contextual information is multiplied with the original feature after a simple convolution operation, which captures strongly semantically consistent features without introducing much computation and reduces the probability of misidentifying local regions of objects;
2. Clear boundary contours: exploiting the property that, between adjacent features, high-level features contain more semantic information while low-level features contain more spatial detail, the invention first concatenates two hierarchical features to generate a channel attention vector, which serves as a weight to select the most discriminative information in the low-level feature. The strong semantic consistency constraint of the high-level feature guides and refines its fusion with the low-level feature, capturing rich context and finally refining the segmentation boundaries of objects. Hierarchical features are thus better fused to restore object edge details in the segmentation map, reducing blurred boundary contours.
Detailed description of the invention
Fig. 1 is the main flow chart of the image semantic segmentation method based on attention-guided feature fusion of the present invention.
Fig. 2 is the flow chart of the encoder base network construction step in Fig. 1.
Fig. 3 is the flow chart of the decoder feature fusion module construction step in Fig. 1.
Fig. 4 is an example of the end high-level semantic extraction module.
Fig. 5 is an example of the attention-guided feature fusion module.
Specific embodiment
As shown in Fig. 1, the image semantic segmentation method based on attention-guided feature fusion of the present invention comprises the following steps:
(10) Encoder base network construction: an improved ResNet-101 is used to generate a series of features that transition from high-resolution, low-semantic to low-resolution, high-semantic;
As shown in Fig. 2, the encoder base network construction step (10) comprises:
(11) Redeployment of the number of building blocks: the numbers of building blocks possessed by the res-2 to res-5 stages are redeployed; the original ResNet-101 building-block counts {3, 4, 23, 3} for res-2 to res-5 are adjusted to {8, 8, 9, 8};
The purpose of the convolutional encoder is to generate a series of features that transition from high-resolution, low-semantic to low-resolution, high-semantic. The base network usually adopts an existing convolutional neural network model, such as LeNet, AlexNet, VGG, GoogLeNet or ResNet. Among these, ResNet-101 uses a large number of residual structures, which solve the vanishing-gradient problem that comes with deeper networks; each residual structure also provides a new path for forward and backward propagation, giving the network extremely strong expressive power. The present invention uses ResNet-101 as the encoder base network for semantic segmentation.
In the base network, features are extracted from the tail of each stage of the encoder; for ResNet-101 these are the four stages res-2, res-3, res-4 and res-5, whose building-block counts are {3, 4, 23, 3} respectively, each building block consisting of three convolutional layers. The first two encoding stages of ResNet-101 thus have only a few building blocks; such a shallow stack of convolutions cannot extract deep semantic features, so the semantics of the low-level features are of poor quality. From res-4 onward, after a large number of deep convolutions, the output features possess much stronger semantics. The semantic quality gap between the adjacent features extracted by the res-3 and res-4 stages is therefore very large. To improve the semantic quality of the low-level features and bring them closer to the supervision, a direct approach is to redeploy the building-block counts of the res-2 to res-5 stages, balancing the number of convolutional layers per stage and reducing the semantic difference between the features output by res-3 and res-4. In the redeployment, the original ResNet-101 counts {3, 4, 23, 3} for res-2 to res-5 are adjusted to {8, 8, 9, 8}.
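As a concrete illustration, the redeployment above can be sketched as follows (the three-convolutions-per-block figure follows the text; this is a minimal sketch, not the patent's code):

```python
# Hedged sketch of the building-block redeployment described above,
# showing how it narrows the depth gap between res-3 and res-4.
original   = {"res-2": 3, "res-3": 4, "res-4": 23, "res-5": 3}
redeployed = {"res-2": 8, "res-3": 8, "res-4": 9, "res-5": 8}

def conv_layers(cfg):
    """Each building block consists of three convolutional layers."""
    return {stage: 3 * blocks for stage, blocks in cfg.items()}

# gap in convolution depth between the res-3 and res-4 stages
gap_before = conv_layers(original)["res-4"] - conv_layers(original)["res-3"]      # 69 - 12 = 57
gap_after  = conv_layers(redeployed)["res-4"] - conv_layers(redeployed)["res-3"]  # 27 - 24 = 3
```

Note that the total block count (33) is unchanged, so the overall depth of the network is preserved while the per-stage imbalance that separates res-3 from res-4 is largely removed.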
(12) Receptive field expansion: the standard convolutions in the res-5 stage of the ResNet-101 base network are replaced with dilated convolutions with dilation rate 2.
The output resolution of semantic segmentation should match the input image. Although semantic segmentation methods based on fully convolutional networks can accept input images of arbitrary resolution, consecutive convolution and pooling operations reduce feature resolution while enlarging the receptive field. Although upsampling can restore the shrunken feature maps to the original image size, the information lost in this process cannot be recovered, and the restored feature maps lose sensitivity to image detail. Moreover, frequent upsampling operations require additional memory and time. The present invention overcomes this problem with the dilated (atrous) convolution method originally applied to wavelet transform analysis in signal processing.
The original filter is upsampled by a factor of 2 by inserting zeros between the filter values. Although the size of the effective filter increases, the inserted zeros (the "holes") need not be considered, so the number of filter parameters and the amount of computation per position remain unchanged. By changing the dilation rate parameter r, the size of the receptive field can be adaptively modified, thereby efficiently controlling feature resolution in the convolutional network without learning additional parameters.
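The zero-insertion view of dilation can be sketched directly (a minimal illustration, not the patent's code; the function name is ours):

```python
import numpy as np

# Hedged sketch: a dilated ("atrous") kernel viewed as the original kernel
# with zeros inserted between filter taps (rate r -> r-1 zeros per gap).
def dilate_kernel(k, rate):
    """k: (kh, kw) kernel -> effective kernel with zeros between taps."""
    kh, kw = k.shape
    out = np.zeros(((kh - 1) * rate + 1, (kw - 1) * rate + 1), dtype=k.dtype)
    out[::rate, ::rate] = k  # original taps land on a strided grid; gaps stay zero
    return out
```

For a 3 × 3 kernel and rate 2 the effective kernel is 5 × 5, yet it still carries only the original 9 non-zero parameters, matching the claim that parameter count and per-position computation are unchanged.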
In a convolutional neural network, after 3 consecutive standard convolutions with 3 × 3 kernels, the receptive field sizes are 3 × 3, 5 × 5 and 7 × 7, respectively. If consecutive convolutions use a constant kernel size of (2d+1) × (2d+1), the receptive field of the n-th layer is:

f_n = 2d·n + 1 (1)

That is, under standard convolution the receptive field grows only linearly. For the dilated convolutions shown in Fig. 2 with 3 × 3 kernels and dilation rates of 1, 2 and 4, the receptive fields are 3 × 3, 7 × 7 and 15 × 15, respectively. Suppose consecutive dilated convolutions likewise use a constant kernel size of (2d+1) × (2d+1) and the dilation rate of the n-th layer is r_n; then the receptive field size is:

f_n = f_{n-1} + 2d·r_n (2)

where n ≥ 2 and f_1 = 2d·r_1 + 1. Unrolling the recursion gives:

f_n = 2d·(r_1 + r_2 + … + r_n) + 1 (3)

Setting the dilation rate r_n = 2^(n-1), the receptive field size becomes:

f_n = 2d·(2^n − 1) + 1 (4)

Thus, by choosing appropriate dilation rates, the receptive field of dilated convolution can be made to grow exponentially. In the base network structure, dilated convolution with rate 2 is used from the res-5 stage onward; since res5a and res5c use 1 × 1 kernels in this stage, in practice only res5b rapidly expands the receptive field while extracting dense features.
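Formulas (1)–(4) can be checked numerically with a short sketch (assuming d = 1 for a 3 × 3 kernel; this is an illustration, not the patent's code):

```python
# Sketch: receptive-field growth for standard vs. dilated convolutions,
# following formulas (1)-(4) in the text (d = 1 for a 3x3 kernel).
def receptive_field(dilation_rates, d=1):
    """Receptive field after stacked (2d+1)x(2d+1) convolutions
    with the given per-layer dilation rates (formula (2))."""
    f = 2 * d * dilation_rates[0] + 1          # f_1 = 2*d*r_1 + 1
    for r in dilation_rates[1:]:
        f += 2 * d * r                          # f_n = f_{n-1} + 2*d*r_n
    return f

# Standard convolution: r_n = 1, linear growth (formula (1)).
standard = [receptive_field([1] * n) for n in (1, 2, 3)]                  # 3, 5, 7
# Dilated convolution: r_n = 2^(n-1), exponential growth (formula (4)).
dilated = [receptive_field([2 ** i for i in range(n)]) for n in (1, 2, 3)]  # 3, 7, 15
```

The dilated stack matches the closed form of formula (4) exactly: for n layers the receptive field is 2·(2^n − 1) + 1.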
(20) Decoder feature fusion module construction: a pyramid-style module built from three convolution operations extracts high-level semantics with strong consistency constraints, which are then fused layer by layer with lower-stage features through weighted fusion to obtain a preliminary segmentation heat map;
A decoder-based structure is adopted to restore image resolution, refining the final prediction by fusing the features of each level along the way. The decoder architecture mainly considers how to recover the spatial information lost through consecutive pooling and downsampling operations. The present invention designs a terminal module in the decoder architecture, mainly used to extract the high-level semantic information with the strongest consistency constraint, which then guides the fusion with low-level features through an attention mechanism to refine the output.
As shown in Fig. 3, the decoder feature fusion module construction step (20) comprises:
(21) Extraction of end high-level semantic information: a pyramid-like construction module based on three convolution operations is used, with 3 × 3, 5 × 5 and 7 × 7 convolutions respectively; by fusing contexts of different scales, high-level semantics with the strongest intra-class semantic consistency are obtained;
Most previous models execute atrous spatial pyramid pooling or a series of atrous spatial pyramid modules at the end of the base network. In current semantic segmentation systems, pyramid structures can extract feature information of different scales and enlarge the receptive field at the pixel level, but such structures lack a global context prior, cannot select suitable elements by channel, and may lose important pixel-level information. For example, overly frequent dilated convolution causes loss of local information, and grid-style pooling harms the local consistency of feature maps. The pyramid pooling module proposed in PSPNet, moreover, often loses pixel positions in its pooling operations at different scales.
The present invention uses the high-level semantic extraction module shown in Fig. 4 to extract, from the end of the base network, high-level features with strong intra-class semantic consistency.
The module fuses the feature information of three different scales through a pyramid-like structure. To better extract useful context from different scales, 3 × 3, 5 × 5 and 7 × 7 convolutions are used in the module; since the high-level features have small resolution, this does not bring much computational burden. By progressively fusing the feature information of different scales, the module combines the contextual features of adjacent scales more accurately. The output feature from res-5 is passed through a 1 × 1 convolution and then multiplied channel-wise with the fused feature. The module additionally introduces a global pooling branch connected to the output feature, which further improves semantic segmentation performance in subsequent processing.
Benefiting from the pyramid-like structure, the end high-level semantic extraction module can fuse contextual information of different scales while generating powerful semantic information for the high-level features. Unlike the pyramid pooling module, which concatenates features of different scales before a channel-reducing convolutional layer, the end high-level semantic extraction module multiplies the contextual information with the original feature after a simple 1 × 1 convolution, introducing little extra computation.
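The data flow of this module can be sketched in NumPy (a hedged illustration under stated assumptions: box-filter smoothing stands in for the learned 3 × 3 / 5 × 5 / 7 × 7 convolutions, and the 1 × 1 convolution is taken as identity; this is not the patent's code):

```python
import numpy as np

# Hedged sketch of the pyramid-style semantic-extraction idea: contexts at
# three scales are progressively fused, then multiplied channel-wise with
# the res-5 feature; a global-pooling branch is added on top.
def extract_high_level_semantics(feat):           # feat: (C, H, W) res-5 output
    def smooth(x, k):                             # box filter standing in for a kxk conv
        C, H, W = x.shape
        out = np.empty_like(x)
        pad = k // 2
        xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
        for i in range(H):
            for j in range(W):
                out[:, i, j] = xp[:, i:i+k, j:j+k].mean(axis=(1, 2))
        return out
    ctx = smooth(feat, 7)                         # coarsest scale first
    ctx = ctx + smooth(feat, 5)                   # progressively fuse adjacent scales
    ctx = ctx + smooth(feat, 3)
    fused = feat * ctx                            # channel-wise multiplication, not concatenation
    global_ctx = feat.mean(axis=(1, 2), keepdims=True)  # global pooling branch
    return fused + global_ctx
```

The multiplication keeps the output the same shape as the input feature, which is why the module adds little computation compared with concatenating all scales and reducing channels afterwards.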
(22) Contextual feature fusion: a channel attention vector is computed by fusing the features of adjacent stages layer by layer; it is used as a weight to select the strongly discriminative feature information of the low-level stage, which is blended with the adjacent high-stage feature to obtain the segmentation heat map.
In the base network, ResNet-101 contains five stages, each generating features of a corresponding scale; the different recognition abilities of the different stages lead to varying degrees of consistency. In the low-level stages, the network encodes fine spatial information, but the small receptive field and the lack of spatial context guidance leave the features with only a little semantic consistency. In the high-level stages, the large receptive field provides strong intra-class semantic consistency, but the achievable spatial accuracy is coarse. In short, the low-level stages produce more accurate spatial predictions while the high-level stages provide more accurate semantic predictions. The respective advantages of the two can thus be combined: the semantic consistency of the high stage is used to guide its fusion with the low stage to obtain the optimal prediction. The present invention uses the attention mechanism shown in Fig. 5 to guide feature fusion.
This design computes a channel attention vector as a weight by fusing the features of adjacent stages. The high-level features provide powerful consistency guidance, while the features provided by the low-level stage carry information of varying discriminative power. The channel attention vector is used as a weight to select the strongly discriminative feature information. In a semantic segmentation framework, the final convolution operation outputs a score map giving, for each pixel, the probability of belonging to each class. The score in the final score map is obtained by summing over all channels of the feature map:

y_k = Σ_{i∈D} ω_{i,k} · x_i (5)

where x represents the feature output by the network, ω denotes the convolution kernel, and D is the set of pixel positions.

p_k = exp(y_k) / Σ_{j=1}^{N} exp(y_j) (6)

In formula (6), p is the predicted probability and N is the number of channels. Under formulas (5) and (6), the final predicted label is the class with the highest probability. Suppose the prediction result for some class is ŷ while the true label is y; then a parameter α is introduced to move the highest-probability value from ŷ toward y, as shown in formula (7):

ȳ = α · ŷ (7)

where ȳ is the new prediction output and α = Sigmoid(ω, x), i.e., the Sigmoid output in Fig. 5.
The above analysis reveals the deeper implication of the attention mechanism. Formula (5) implicitly assumes that all channels carry equal weight. But as noted, the features of different stages possess different degrees of discriminative power, leading to predictions of different granularity. To obtain prediction results with fine object boundaries, the most discriminative features should be extracted as much as possible while weakly discriminative features are suppressed. Therefore, the α value in formula (7) is applied to the feature map x, realizing feature selection by the attention mechanism. With this module, the features can be refined stage by stage to output the optimal prediction result.
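A minimal NumPy sketch of this channel-attention-guided fusion follows (shapes, the pooling step and the projection `w` are illustrative assumptions standing in for the convolution that produces α; this is not the patent's code):

```python
import numpy as np

# Hedged sketch of channel-attention-guided fusion of two adjacent stages.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_fuse(low, high, w):
    """Fuse a low-stage feature with an adjacent high-stage feature.
    low, high: (C, H, W) features; w: (C, 2C) projection standing in
    for the convolution that produces the channel attention vector."""
    # concatenate the two stages and global-pool to a channel descriptor
    desc = np.concatenate([low, high]).mean(axis=(1, 2))   # (2C,)
    alpha = sigmoid(w @ desc)                               # (C,) attention in (0, 1)
    # alpha re-weights the low-stage channels before fusion with the high stage
    return alpha[:, None, None] * low + high
```

Because α ∈ (0, 1) per channel, weakly discriminative low-level channels are attenuated while strong ones pass through, which is the feature-selection behavior the text attributes to formula (7).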
(30) Auxiliary loss function construction: additional auxiliary supervision is attached to each fusion output of the decoding stage and superimposed with the main loss after the heat map is upsampled, reinforcing the stage-wise training of the model and yielding the semantic segmentation map.
The present invention improves the loss function of common semantic segmentation methods by adopting a layer-by-layer label supervision strategy: additional auxiliary supervision is attached directly to the feature output by each fusion of the decoding stage, promoting the learning ability of each branch of the network model. To generate semantic outputs in the auxiliary branches, each fused feature is forced to learn more semantics before entering the next step as the higher stage, in the expectation of being more helpful to subsequent fusion. It should be noted that, like the redeployment of the building-block counts in the encoder stage, layer-by-layer label supervision by itself cannot improve the classification ability of the convolutional network; only within the semantic segmentation task does this measure force the network to improve the semantic quality of the low-stage features, thereby benefiting the output of the decoding stage.
When training the network, auxiliary softmax losses against equal-resolution label maps are added at the tails of the feature fusion modules corresponding to res-2, res-3 and res-4. The final classification loss of the entire model is equivalent to the sum of the supervision on the final output and the supervision of the three auxiliary branches.
With the 3 branches and the final output there are T = 4 supervisions in total; the number of feature channels output by each supervised object is the number of classes N in the training set. The feature F_t after upsampling at the end of the t-th branch has spatial resolution W_t × H_t, and its value at coordinate (w, h, n) is F_t^{w,h,n}. A weighted softmax cross-entropy loss is attached to the feature map of the final output and of each branch, with respective weights λ_t, where λ_0 = 1 is the loss weight of the final output and the rest are auxiliary-supervision losses. F_t is fed into a softmax function to compute, for each pixel in the image, the probability of belonging to each class; the softmax function layer is:

p_t^{w,h,n} = exp(F_t^{w,h,n}) / Σ_{n'=1}^{N} exp(F_t^{w,h,n'}) (8)

The prediction p_t^{w,h,n} is matched against the true label P_t^{w,h,n}; the loss function eventually used for training is:

L = − Σ_{t=0}^{T−1} λ_t · (1 / (W_t·H_t)) · Σ_{w,h} Σ_{n=1}^{N} P_t^{w,h,n} · log p_t^{w,h,n} (9)
The layer-by-layer label supervision strategy makes gradient optimization smoother, and the model is easier to train. Each supervised branch possesses strong learning ability and can acquire rich semantic features at every level. Through fusion, the precision of the final segmentation map does not depend on any individual branch.
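The weighted sum of per-branch losses of formulas (8) and (9) can be sketched in NumPy (an illustration under the assumption of one-hot label maps; not the patent's code):

```python
import numpy as np

# Hedged sketch of the layer-wise auxiliary loss of formulas (8)-(9).
def softmax(logits):                        # logits: (N, H, W), softmax over classes
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def total_loss(branch_logits, onehot_labels, lambdas):
    """Weighted sum of per-branch cross-entropy losses.
    branch_logits: list of (N, H, W) score maps (final output first, lambda_0 = 1);
    onehot_labels: list of matching (N, H, W) one-hot label maps."""
    loss = 0.0
    for F, P, lam in zip(branch_logits, onehot_labels, lambdas):
        p = softmax(F)                                        # formula (8)
        ce = -(P * np.log(p)).sum(axis=0).mean()              # per-pixel CE, averaged over W*H
        loss += lam * ce                                      # one term of formula (9)
    return loss
```

With uniform logits over two classes, each supervised branch contributes log 2 per pixel scaled by its λ_t, so the total loss is simply the λ-weighted sum over the final output and the three auxiliary branches.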
A specific embodiment below verifies that the method of the present invention improves the accuracy of image semantic segmentation.
The proposed correction modules are tested on two semantic segmentation datasets, PASCAL VOC 2012 and Cityscapes. The base network is ResNet-101 pre-trained on ImageNet. The experimental hardware platform is a Core i7 processor at 3.6 GHz with 48 GB memory and an NVIDIA GTX 1080 GPU; the code runs on the TensorFlow deep learning framework.
1. Ablation experiments
This section decomposes the proposed method step by step to verify the validity of each added module. In the following experiments, the resulting data are evaluated and compared on the PASCAL VOC 2012 validation set. First, the original ResNet-101 is used as the base network, with its output directly upsampled at the end, as shown in Table 1.
Table 1. Effect after augmenting the dataset with random scaling and flipping
Then, the base network is extended to an FCN-style encoder-decoder feature fusion architecture; the feature fusion strategy is simple channel-wise summation after upsampling and cropping. To examine the validity of this feature fusion, a series of feature subsets is selected to list the effect of fusing the features of each stage, and the effects before and after redeploying the building-block counts of each stage are compared, as shown in Table 2.
From the 2nd column of Table 2 it is evident that fusing more hierarchical features does gradually improve the output quality of the segmentation system; however, as more low-level features are fused later on, overall performance quickly saturates. The watershed is the res-4 stage of ResNet-101: this stage alone contains 23 building blocks, i.e. 69 convolutional layers, so that a huge semantic gap separates the low-level features output by the res-2 and res-3 stages from the deeper features. This estrangement makes the overall performance gain from fusing the low-level feature output by the res-3 stage almost zero, and the effect of continuing the fusion afterwards is likewise insignificant.
Table 2. Effect of feature fusion before and after redeploying the building-block counts
It follows that fusion between features of huge disparity is essentially ineffective. The 3rd column of the table shows the feature fusion effect after the building-block counts of the four stages are redeployed. Initially, the segmentation quality after upsampling the res-5 output feature is slightly worse than before redeployment, but the difference is almost negligible, confirming that redeploying the building-block counts does not strengthen the classification ability of the convolutional network itself. Unlike the 2nd column, as more low-level features are fused, performance improves steadily; although the pace of improvement is not constant, it does not saturate as rapidly as in the 2nd column. The redeployment mechanism changes the building-block counts of the res-2, res-3, res-4 and res-5 stages of the original ResNet-101 from {3, 4, 23, 3} to {8, 8, 9, 8}, relatively reducing the estrangement between the features output by the stages; feature fusion works better, ending up 0.52 percentage points above the performance before redeployment.
Table 3 shows the effectiveness of each component of the full model.
Table 3 Ablation comparison on the PASCAL VOC 2012 validation set
The semantic information extracted at the topmost level carries strong semantic consistency. Under this strong semantic constraint it is merged stage by stage into the lower layers, yielding finer image semantic features and improving model performance by 1.1%. The attention mechanism is the most important improvement of the whole model. Unlike the fusion method of simple channel-wise summation, the channel attention vector it produces selects the most discriminative information in the low-level features, refining object boundaries well; on top of the preceding configuration it improves performance by 2.06%, the largest contribution of any component. The final layer-by-layer label supervision refines each fused hierarchical feature, pushing it closer to the supervision signal, and improves the whole model by 0.43%. Besides producing high-level semantic information, the end module has a branch that outputs a global-pooling feature. This feature further constrains the output fused from the res-2 low-level features and strengthens the model's semantic consistency over all pixels of a target. The global-pooling branch improves performance by 0.96% and is of clear value.
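The attention-guided fusion described above can be sketched as a small PyTorch module. This is a hedged illustration of the idea, not the patent's exact layer configuration: the channel sizes, the 1 × 1 projections, and the use of sigmoid gating are assumptions. The high-level feature is globally pooled into a channel-attention vector that reweights the low-level feature before summation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    """Sketch of attention-guided fusion: the high-level feature produces a
    channel-attention vector that selects discriminative channels of the
    low-level feature before the two are summed."""
    def __init__(self, high_ch, low_ch):
        super().__init__()
        self.proj = nn.Conv2d(high_ch, low_ch, kernel_size=1)
        self.att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),               # global pooling of high-level map
            nn.Conv2d(high_ch, low_ch, kernel_size=1),
            nn.Sigmoid(),                          # per-channel weights in (0, 1)
        )

    def forward(self, high, low):
        w = self.att(high)                         # (N, low_ch, 1, 1) attention vector
        up = F.interpolate(self.proj(high), size=low.shape[2:],
                           mode='bilinear', align_corners=False)
        return up + low * w                        # weighted low-level + upsampled high-level

m = AttentionFusion(2048, 1024)
out = m(torch.randn(1, 2048, 16, 16), torch.randn(1, 1024, 32, 32))
print(out.shape)  # torch.Size([1, 1024, 32, 32])
```

Compared with plain summation, the learned gate lets the high-level semantics decide which low-level channels contribute, which is the mechanism credited with the 2.06% gain.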
2. Qualitative analysis
Table 4 visualizes the image semantic segmentation results of several compared methods.
Table 4 Visualization of segmentation results on selected images
In the third and fifth columns of Table 4, the FCN method misidentifies local regions of objects. The third-column input image contains what appear to be three cows, two of which are comparatively small. The FCN baseline misclassifies local regions of the target: the front two feet of the larger cow, visually close to the ground and slightly complex in appearance, are segmented to some extent but mislabeled as a horse's. For the two smaller cows, many pixel regions are misclassified, again as horse; presumably the cows in the training set are generally large, so the model cannot handle smaller instances of similar classes well. The proposed method performs almost ideally here, correcting FCN's loss of image detail that causes local pixels of ground objects to be misclassified. The fifth-column image shows a white horse beside a white railing, with the horse's back and legs occluded by the railing. Because the colors are similar, FCN fails to identify the horse's back above the railing and the legs below it, producing blurry results in the occluded areas. The proposed method is nearly perfect, with only a few pixels misjudged.
In the first, second and fourth columns, the FCN method produces ambiguous object boundaries. The first-column input shows a sheep against a background whose light-toned patches of earth are close in brightness to the animal's nearly white coat. FCN misidentifies part of the light background as part of the animal's body, and the misrecognized regions are scattered. The segmentation obtained with the proposed attention mechanism eliminates these scattered errors; the boundaries are very clear, and the constraint on the segmentation is excellent. The cabinet with a display in the second column and the racehorse in the fourth column behave similarly.
3. Quantitative evaluation
Several methods were run on the PASCAL VOC 2012 augmented dataset and the Cityscapes dataset, and the results were compared quantitatively. The results are shown in Table 5.
Table 5 Per-class accuracy of the attention-mechanism model on the PASCAL VOC 2012 test set
Compared with DeepLab, the proposed method achieves higher per-class accuracy in about half of the categories, is much higher in some, and its overall accuracy is slightly above DeepLab's. Compared with the state-of-the-art LRR method, it achieves higher accuracy in most categories; for bicycle, boat, bottle, chair, potted plant, sofa, TV and similar classes it exceeds LRR by more than 3%, and in some cases by 15% to 20%. These are all classes that are hard to segment and easy to confuse. By using high-level semantics to guide the careful fusion of multiple low-level features, the method gains an advantage in feature extraction for classes rich in semantic detail such as bicycle, chair and potted plant; the segmented targets show strong semantic consistency, with few local-region misrecognitions. For classes with similar appearance such as cow, sheep and dog, it can also distinguish complicated semantic categories.
Finally, the method was also evaluated on the Cityscapes dataset. During training, every image was cropped to 800 × 800; observation shows that large-scale crops are very useful for high-resolution images. The performance on the test set is given in Table 6. As with PASCAL VOC 2012, the proposed method achieves the best segmentation results on most object classes and outperforms the other methods overall.
Table 6 Per-class accuracy of the attention-mechanism model on the Cityscapes test set
The invention uses a convolutional encoder to embed semantic information of different levels into feature maps, then uses a decoder to integrate and refine each feature map to produce the final segmentation result.
The encoder is a pre-trained convolutional model for extracting image features. Its topmost features are highly semantic but, lacking resolution, are insufficient for reconstructing the fine details of the segmentation map; its bottom features have high-resolution detail but lack strong semantics. The encoder redeploys the number of building blocks per stage to balance the variation in semantic difference between features, and uses dilated convolution with a dilation rate of 2 in the res5b block. An end module extracts high-level semantic information and generates a strong semantic-consistency constraint; in the decoding stage the attention mechanism fuses, top-down and layer by layer, the low-resolution high-level features with the high-resolution low-level features, using the strong semantic consistency of the high-level features to guide the fusion and produce a high-resolution semantic result.
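The effect of the rate-2 dilated convolution mentioned for the res5b block can be demonstrated concretely. The sketch below assumes an illustrative channel count; with stride 1 and padding equal to the dilation rate, a dilated 3 × 3 convolution enlarges the effective receptive field (to 5 × 5 here) while keeping the spatial resolution of the feature map unchanged, which is why it is used in place of a striding stage.

```python
import torch
import torch.nn as nn

# A 3x3 convolution with dilation rate 2: padding=2 and stride=1 keep the
# output the same spatial size as the input while widening the receptive field.
dilated = nn.Conv2d(256, 256, kernel_size=3, dilation=2, padding=2, stride=1)
x = torch.randn(1, 256, 32, 32)
y = dilated(x)
print(y.shape)  # torch.Size([1, 256, 32, 32])
```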

Claims (3)

1. An image semantic segmentation method with attention-guided feature fusion, characterized by comprising the following steps:
(10) encoder base-network construction: using a modified ResNet-101 to generate a series of features that vary from high-resolution, low-semantic to low-resolution, high-semantic;
(20) decoder feature-fusion module construction: using a pyramid module built from three convolution operations to extract high-level semantics carrying a strong consistency constraint, then weighting and fusing them layer by layer with the lower-stage features to obtain a preliminary segmentation heat map;
(30) auxiliary loss-function construction: attaching additional auxiliary supervision to each fusion output of the decoding stage, adding these losses to the main loss of the upsampled heat map, and strengthening the staged training of the model to obtain the semantic segmentation map.
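Step (30) can be sketched as a deep-supervision loss. This is a hedged illustration, not the patent's exact formulation: the auxiliary weight of 0.4, the use of cross-entropy, and the tensor sizes (21 PASCAL VOC classes) are assumptions. Each fusion stage of the decoder emits an auxiliary prediction whose loss is added to the main segmentation loss.

```python
import torch
import torch.nn.functional as F

def total_loss(main_logits, aux_logits_list, target, aux_weight=0.4):
    """Main cross-entropy loss plus weighted auxiliary losses, one per
    decoder fusion stage; auxiliary maps are upsampled to label size."""
    loss = F.cross_entropy(main_logits, target)
    for aux in aux_logits_list:
        aux = F.interpolate(aux, size=target.shape[1:],
                            mode='bilinear', align_corners=False)
        loss = loss + aux_weight * F.cross_entropy(aux, target)
    return loss

main = torch.randn(2, 21, 64, 64)                  # main prediction, 21 classes
auxs = [torch.randn(2, 21, 32, 32),                # per-stage auxiliary outputs
        torch.randn(2, 21, 16, 16)]
target = torch.randint(0, 21, (2, 64, 64))         # per-pixel class labels
loss = total_loss(main, auxs, target)
print(loss)  # a scalar loss tensor
```

Summing the per-stage losses with the main loss is what the claim calls "strengthening the staged training of the model".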
2. The image semantic segmentation method according to claim 1, characterized in that
the encoder base-network construction step (10) comprises:
(11) redeploying the building-block counts: redeploying the numbers of building blocks in the res-2 to res-5 stages, adjusting the original ResNet-101 counts {3, 4, 23, 3} to {8, 8, 9, 8};
(12) enlarging the receptive field: replacing the conventional convolutions of the res-5 stage in the ResNet-101 base network with dilated convolutions with a dilation rate of 2.
3. The image semantic segmentation method according to claim 1, characterized in that
the decoder feature-fusion module construction step (20) comprises:
(21) end high-level semantic extraction: using a pyramid-like module built from three convolution operations, applying 3 × 3, 5 × 5 and 7 × 7 convolutions within the module, and fusing context at different scales to obtain high-level semantics with the strongest intra-class consistency;
(22) context-feature integration: fusing the features of adjacent stages layer by layer, computing a channel attention vector that selects the most discriminative information of the lower stage as weights, and blending it with the adjacent higher-stage features to obtain the preliminary segmentation heat map.
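The three-branch pyramid module of step (21) can be sketched as follows. The channel sizes and the choice to sum the branches are assumptions for illustration, not the patent's exact configuration: parallel 3 × 3, 5 × 5 and 7 × 7 convolutions capture context at different scales, and their outputs are merged into one high-level feature.

```python
import torch
import torch.nn as nn

class PyramidHead(nn.Module):
    """Sketch of a pyramid-like module: three parallel convolutions with
    kernel sizes 3, 5 and 7 see context at different scales; 'same' padding
    keeps all branches at the input resolution so they can be summed."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, kernel_size=k, padding=k // 2)
            for k in (3, 5, 7)
        ])

    def forward(self, x):
        return sum(b(x) for b in self.branches)

head = PyramidHead(2048, 256)
y = head(torch.randn(1, 2048, 16, 16))
print(y.shape)  # torch.Size([1, 256, 16, 16])
```

The merged multi-scale context is what the claim describes as high-level semantics with strong intra-class consistency.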
CN201910391452.XA 2019-05-13 2019-05-13 Image semantic segmentation method based on attention-guided feature fusion Pending CN110210485A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910391452.XA CN110210485A (en) Image semantic segmentation method based on attention-guided feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910391452.XA CN110210485A (en) Image semantic segmentation method based on attention-guided feature fusion

Publications (1)

Publication Number Publication Date
CN110210485A true CN110210485A (en) 2019-09-06

Family

ID=67785851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910391452.XA Pending CN110210485A (en) Image semantic segmentation method based on attention-guided feature fusion

Country Status (1)

Country Link
CN (1) CN110210485A (en)

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110675405A (en) * 2019-09-12 2020-01-10 电子科技大学 Attention mechanism-based one-shot image segmentation method
CN110689061A (en) * 2019-09-19 2020-01-14 深动科技(北京)有限公司 Image processing method, device and system based on alignment feature pyramid network
CN110705457A (en) * 2019-09-29 2020-01-17 核工业北京地质研究院 Remote sensing image building change detection method
CN111104962A (en) * 2019-11-05 2020-05-05 北京航空航天大学青岛研究院 Semantic segmentation method and device for image, electronic equipment and readable storage medium
CN111158068A (en) * 2019-12-31 2020-05-15 哈尔滨工业大学(深圳) Short-term prediction method and system based on simple convolutional recurrent neural network
CN111222580A (en) * 2020-01-13 2020-06-02 西南科技大学 High-precision crack detection method
CN111292330A (en) * 2020-02-07 2020-06-16 北京工业大学 Image semantic segmentation method and device based on coder and decoder
CN111340046A (en) * 2020-02-18 2020-06-26 上海理工大学 Visual saliency detection method based on feature pyramid network and channel attention
CN111488884A (en) * 2020-04-28 2020-08-04 东南大学 Real-time semantic segmentation method with low calculation amount and high feature fusion
CN111508263A (en) * 2020-04-03 2020-08-07 西安电子科技大学 Intelligent guiding robot for parking lot and intelligent guiding method
CN111598174A (en) * 2020-05-19 2020-08-28 中国科学院空天信息创新研究院 Training method of image ground feature element classification model, image analysis method and system
CN111626196A (en) * 2020-05-27 2020-09-04 成都颜禾曦科技有限公司 Typical bovine animal body structure intelligent analysis method based on knowledge graph
CN111626300A (en) * 2020-05-07 2020-09-04 南京邮电大学 Image semantic segmentation model and modeling method based on context perception
CN111680695A (en) * 2020-06-08 2020-09-18 河南工业大学 Semantic segmentation method based on reverse attention model
CN111767922A (en) * 2020-05-22 2020-10-13 上海大学 Image semantic segmentation method and network based on convolutional neural network
CN111832453A (en) * 2020-06-30 2020-10-27 杭州电子科技大学 Unmanned scene real-time semantic segmentation method based on double-path deep neural network
CN111898709A (en) * 2020-09-30 2020-11-06 中国人民解放军国防科技大学 Image classification method and device
CN111915627A (en) * 2020-08-20 2020-11-10 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Semantic segmentation method, network, device and computer storage medium
CN111932553A (en) * 2020-07-27 2020-11-13 北京航空航天大学 Remote sensing image semantic segmentation method based on area description self-attention mechanism
CN112052783A (en) * 2020-09-02 2020-12-08 中南大学 High-resolution image weak supervision building extraction method combining pixel semantic association and boundary attention
CN112101363A (en) * 2020-09-02 2020-12-18 河海大学 Full convolution semantic segmentation system and method based on cavity residual error and attention mechanism
CN112183448A (en) * 2020-10-15 2021-01-05 中国农业大学 Hulled soybean image segmentation method based on three-level classification and multi-scale FCN
CN112215235A (en) * 2020-10-16 2021-01-12 深圳市华付信息技术有限公司 Scene text detection method aiming at large character spacing and local shielding
CN112241762A (en) * 2020-10-19 2021-01-19 吉林大学 Fine-grained identification method for pest and disease damage image classification
CN112906829A (en) * 2021-04-13 2021-06-04 成都四方伟业软件股份有限公司 Digital recognition model construction method and device based on Mnist data set
CN113111848A (en) * 2021-04-29 2021-07-13 东南大学 Human body image analysis method based on multi-scale features
CN113255675A (en) * 2021-04-13 2021-08-13 西安邮电大学 Image semantic segmentation network structure and method based on expanded convolution and residual path
WO2021169049A1 (en) * 2020-02-24 2021-09-02 大连理工大学 Method for glass detection in real scene
CN113393521A (en) * 2021-05-19 2021-09-14 中国科学院声学研究所南海研究站 High-precision flame positioning method and system based on double-semantic attention mechanism
CN113436127A (en) * 2021-03-25 2021-09-24 上海志御软件信息有限公司 Method and device for constructing automatic liver segmentation model based on deep learning, computer equipment and storage medium
CN113657388A (en) * 2021-07-09 2021-11-16 北京科技大学 Image semantic segmentation method fusing image super-resolution reconstruction
CN113744279A (en) * 2021-06-09 2021-12-03 东北大学 Image segmentation method based on FAF-Net network
CN113837965A (en) * 2021-09-26 2021-12-24 北京百度网讯科技有限公司 Image definition recognition method and device, electronic equipment and storage medium
CN114332723A (en) * 2021-12-31 2022-04-12 北京工业大学 Video behavior detection method based on semantic guidance
CN114626666A (en) * 2021-12-11 2022-06-14 国网湖北省电力有限公司经济技术研究院 Engineering field progress identification system based on full-time-space monitoring
CN116091363A (en) * 2023-04-03 2023-05-09 南京信息工程大学 Handwriting Chinese character image restoration method and system
CN117392392A (en) * 2023-12-13 2024-01-12 河南科技学院 Rubber cutting line identification and generation method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109190752A (en) * 2018-07-27 2019-01-11 国家新闻出版广电总局广播科学研究院 The image, semantic dividing method of global characteristics and local feature based on deep learning
CN109284670A (en) * 2018-08-01 2019-01-29 清华大学 A kind of pedestrian detection method and device based on multiple dimensioned attention mechanism
CN109461157A (en) * 2018-10-19 2019-03-12 苏州大学 Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field
CN109635694A (en) * 2018-12-03 2019-04-16 广东工业大学 A kind of pedestrian detection method, device, equipment and computer readable storage medium


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHANGQIAN YU ET AL: "Learning a Discriminative Feature Network for Semantic Segmentation", 《ARXIV:1804.09337V1》 *
HANCHAO LI ET AL: "Pyramid Attention Network for Semantic Segmentation", 《ARXIV:1805.10180V3》 *
HENGSHUANG ZHAO ET AL: "Pyramid Scene Parsing Network", 《ARXIV:1612.01105V2》 *
宁庆群: "快速鲁棒的图像语义分割算法研究", 《中国博士学位论文全文数据库信息科技辑》 *

Cited By (56)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110675405A (en) * 2019-09-12 2020-01-10 电子科技大学 Attention mechanism-based one-shot image segmentation method
CN110675405B (en) * 2019-09-12 2022-06-03 电子科技大学 Attention mechanism-based one-shot image segmentation method
CN110689061A (en) * 2019-09-19 2020-01-14 深动科技(北京)有限公司 Image processing method, device and system based on alignment feature pyramid network
CN110689061B (en) * 2019-09-19 2023-04-28 小米汽车科技有限公司 Image processing method, device and system based on alignment feature pyramid network
CN110705457A (en) * 2019-09-29 2020-01-17 核工业北京地质研究院 Remote sensing image building change detection method
CN110705457B (en) * 2019-09-29 2024-01-19 核工业北京地质研究院 Remote sensing image building change detection method
CN111104962B (en) * 2019-11-05 2023-04-18 北京航空航天大学青岛研究院 Semantic segmentation method and device for image, electronic equipment and readable storage medium
CN111104962A (en) * 2019-11-05 2020-05-05 北京航空航天大学青岛研究院 Semantic segmentation method and device for image, electronic equipment and readable storage medium
CN111158068A (en) * 2019-12-31 2020-05-15 哈尔滨工业大学(深圳) Short-term prediction method and system based on simple convolutional recurrent neural network
CN111222580A (en) * 2020-01-13 2020-06-02 西南科技大学 High-precision crack detection method
CN111292330A (en) * 2020-02-07 2020-06-16 北京工业大学 Image semantic segmentation method and device based on coder and decoder
CN111340046A (en) * 2020-02-18 2020-06-26 上海理工大学 Visual saliency detection method based on feature pyramid network and channel attention
US11361534B2 (en) 2020-02-24 2022-06-14 Dalian University Of Technology Method for glass detection in real scenes
WO2021169049A1 (en) * 2020-02-24 2021-09-02 大连理工大学 Method for glass detection in real scene
CN111508263A (en) * 2020-04-03 2020-08-07 西安电子科技大学 Intelligent guiding robot for parking lot and intelligent guiding method
CN111488884A (en) * 2020-04-28 2020-08-04 东南大学 Real-time semantic segmentation method with low calculation amount and high feature fusion
CN111626300A (en) * 2020-05-07 2020-09-04 南京邮电大学 Image semantic segmentation model and modeling method based on context perception
CN111626300B (en) * 2020-05-07 2022-08-26 南京邮电大学 Image segmentation method and modeling method of image semantic segmentation model based on context perception
CN111598174A (en) * 2020-05-19 2020-08-28 中国科学院空天信息创新研究院 Training method of image ground feature element classification model, image analysis method and system
CN111767922A (en) * 2020-05-22 2020-10-13 上海大学 Image semantic segmentation method and network based on convolutional neural network
CN111767922B (en) * 2020-05-22 2023-06-13 上海大学 Image semantic segmentation method and network based on convolutional neural network
CN111626196A (en) * 2020-05-27 2020-09-04 成都颜禾曦科技有限公司 Typical bovine animal body structure intelligent analysis method based on knowledge graph
CN111680695A (en) * 2020-06-08 2020-09-18 河南工业大学 Semantic segmentation method based on reverse attention model
CN111832453B (en) * 2020-06-30 2023-10-27 杭州电子科技大学 Unmanned scene real-time semantic segmentation method based on two-way deep neural network
CN111832453A (en) * 2020-06-30 2020-10-27 杭州电子科技大学 Unmanned scene real-time semantic segmentation method based on double-path deep neural network
CN111932553A (en) * 2020-07-27 2020-11-13 北京航空航天大学 Remote sensing image semantic segmentation method based on area description self-attention mechanism
CN111915627A (en) * 2020-08-20 2020-11-10 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Semantic segmentation method, network, device and computer storage medium
CN111915627B (en) * 2020-08-20 2021-04-16 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Semantic segmentation method, network, device and computer storage medium
CN112052783B (en) * 2020-09-02 2024-04-09 中南大学 High-resolution image weak supervision building extraction method combining pixel semantic association and boundary attention
CN112101363A (en) * 2020-09-02 2020-12-18 河海大学 Full convolution semantic segmentation system and method based on cavity residual error and attention mechanism
CN112052783A (en) * 2020-09-02 2020-12-08 中南大学 High-resolution image weak supervision building extraction method combining pixel semantic association and boundary attention
CN111898709A (en) * 2020-09-30 2020-11-06 中国人民解放军国防科技大学 Image classification method and device
CN112183448A (en) * 2020-10-15 2021-01-05 中国农业大学 Hulled soybean image segmentation method based on three-level classification and multi-scale FCN
CN112183448B (en) * 2020-10-15 2023-05-12 中国农业大学 Method for dividing pod-removed soybean image based on three-level classification and multi-scale FCN
CN112215235B (en) * 2020-10-16 2024-04-26 深圳华付技术股份有限公司 Scene text detection method aiming at large character spacing and local shielding
CN112215235A (en) * 2020-10-16 2021-01-12 深圳市华付信息技术有限公司 Scene text detection method aiming at large character spacing and local shielding
CN112241762A (en) * 2020-10-19 2021-01-19 吉林大学 Fine-grained identification method for pest and disease damage image classification
CN113436127A (en) * 2021-03-25 2021-09-24 上海志御软件信息有限公司 Method and device for constructing automatic liver segmentation model based on deep learning, computer equipment and storage medium
CN113255675A (en) * 2021-04-13 2021-08-13 西安邮电大学 Image semantic segmentation network structure and method based on expanded convolution and residual path
CN112906829A (en) * 2021-04-13 2021-06-04 成都四方伟业软件股份有限公司 Digital recognition model construction method and device based on Mnist data set
CN113255675B (en) * 2021-04-13 2023-10-10 西安邮电大学 Image semantic segmentation network structure and method based on expanded convolution and residual path
CN113111848B (en) * 2021-04-29 2024-07-02 东南大学 Human body image analysis method based on multi-scale features
CN113111848A (en) * 2021-04-29 2021-07-13 东南大学 Human body image analysis method based on multi-scale features
CN113393521A (en) * 2021-05-19 2021-09-14 中国科学院声学研究所南海研究站 High-precision flame positioning method and system based on double-semantic attention mechanism
CN113393521B (en) * 2021-05-19 2023-05-05 中国科学院声学研究所南海研究站 High-precision flame positioning method and system based on dual semantic attention mechanism
CN113744279A (en) * 2021-06-09 2021-12-03 东北大学 Image segmentation method based on FAF-Net network
CN113744279B (en) * 2021-06-09 2023-11-14 东北大学 Image segmentation method based on FAF-Net network
CN113657388A (en) * 2021-07-09 2021-11-16 北京科技大学 Image semantic segmentation method fusing image super-resolution reconstruction
CN113657388B (en) * 2021-07-09 2023-10-31 北京科技大学 Image semantic segmentation method for super-resolution reconstruction of fused image
CN113837965A (en) * 2021-09-26 2021-12-24 北京百度网讯科技有限公司 Image definition recognition method and device, electronic equipment and storage medium
CN114626666A (en) * 2021-12-11 2022-06-14 国网湖北省电力有限公司经济技术研究院 Engineering field progress identification system based on full-time-space monitoring
CN114332723B (en) * 2021-12-31 2024-03-22 北京工业大学 Video behavior detection method based on semantic guidance
CN114332723A (en) * 2021-12-31 2022-04-12 北京工业大学 Video behavior detection method based on semantic guidance
CN116091363A (en) * 2023-04-03 2023-05-09 南京信息工程大学 Handwriting Chinese character image restoration method and system
CN117392392A (en) * 2023-12-13 2024-01-12 河南科技学院 Rubber cutting line identification and generation method
CN117392392B (en) * 2023-12-13 2024-02-13 河南科技学院 Rubber cutting line identification and generation method

Similar Documents

Publication Publication Date Title
CN110210485A (en) Image semantic segmentation method based on attention-guided feature fusion
CN111047551B (en) Remote sensing image change detection method and system based on U-net improved algorithm
CN110110751A (en) A kind of Chinese herbal medicine recognition methods of the pyramid network based on attention mechanism
CN108509978A (en) The multi-class targets detection method and model of multi-stage characteristics fusion based on CNN
CN108399380A (en) A kind of video actions detection method based on Three dimensional convolution and Faster RCNN
CN110287960A (en) The detection recognition method of curve text in natural scene image
CN110689599B (en) 3D visual saliency prediction method based on non-local enhancement generation countermeasure network
CN109905624A (en) A kind of video frame interpolation method, device and equipment
CN107808132A (en) A kind of scene image classification method for merging topic model
CN109829537B (en) Deep learning GAN network children's garment based style transfer method and equipment
CN113240691A (en) Medical image segmentation method based on U-shaped network
CN110490082A (en) A kind of road scene semantic segmentation method of effective integration neural network characteristics
CN106778768A (en) Image scene classification method based on multi-feature fusion
CN114724222B (en) AI digital human emotion analysis method based on multiple modes
CN110363770A (en) A kind of training method and device of the infrared semantic segmentation model of margin guide formula
CN112270366B (en) Micro target detection method based on self-adaptive multi-feature fusion
CN112215847A (en) Method for automatically segmenting overlapped chromosomes based on counterstudy multi-scale features
CN112528961A (en) Video analysis method based on Jetson Nano
CN109447897A (en) A kind of real scene image composition method and system
CN113379707A (en) RGB-D significance detection method based on dynamic filtering decoupling convolution network
CN114529940A (en) Human body image generation method based on posture guidance
CN109766918A (en) Conspicuousness object detecting method based on the fusion of multi-level contextual information
CN114332094A (en) Semantic segmentation method and device based on lightweight multi-scale information fusion network
CN113255678A (en) Road crack automatic identification method based on semantic segmentation
Al-Amaren et al. RHN: A residual holistic neural network for edge detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190906