CN107180430A - Deep learning network construction method and system for semantic segmentation - Google Patents

Deep learning network construction method and system for semantic segmentation

Info

Publication number
CN107180430A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710342354.8A
Other languages
Chinese (zh)
Inventor
陶文兵
张灿
李坤乾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201710342354.8A
Publication of CN107180430A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20084: Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep learning network construction method and system for semantic segmentation. Building on semantic segmentation with a deconvolution network, and exploiting the fact that conditional random fields (CRFs) refine edges well, the method formulates the CRF as a recurrent network and integrates it into the deconvolution network for end-to-end training, so that the parameters of the convolutional network and of the recurrent network are learned jointly and a better integrated network is obtained. Joint training of the deconvolution network and the CRF yields stronger detail and shape information and addresses inaccurate segmentation at image edges. Combined with multi-scale input and multi-scale pooling, it also addresses the over-segmentation of large objects and the missed segmentation of small objects that a single receptive field causes in semantic segmentation. The invention extends the classical deconvolution network with CRF joint training and multi-feature information fusion, improving the accuracy of semantic segmentation.

Description

Deep learning network construction method and system for semantic segmentation
Technical field
The invention belongs to the technical field of computer vision, and more particularly relates to a deep learning network construction method and system for semantic segmentation.
Background art
With the explosive growth of network data, image and video processing of big data has become a popular research direction, and deep learning has become an indispensable tool for it. Although deep learning has a short history and its theoretical foundations are still incomplete, new methods for building deep networks keep emerging, and their results in computer vision are notable. Deep learning models visual perception on the visual mechanism of the human brain: a multi-layer network is designed as a hierarchical information-processing system analogous to the visual system. Human vision works roughly in the following stages: the pupil captures pixels, the cerebral cortex then finds edges and orientations, shapes are abstracted from the edges, and finally the category of the object is abstracted. A deep network behaves similarly: the lower layers extract edge features, the middle layers extract shape features and abstract them further, and the higher layers obtain features of the whole object or of its behavior for classification. As a new milestone in machine learning, deep learning has attracted a growing number of image researchers. Its topics include computer vision problems such as image classification, object recognition and semantic segmentation, with applications including intelligent driver-assistance systems, face recognition and image retrieval.
Image recognition with machine learning generally proceeds in the following steps: a sensor first acquires the image data; preprocessing and feature extraction follow; features are then selected; and finally recognition and prediction are performed on the selected features. The purpose of preprocessing, feature extraction and feature selection is to find a feature representation suitable for the classifier. The effectiveness of the feature representation is often the most critical factor in the final recognition accuracy. Early feature representations were all extracted by hand; manual feature selection is complicated and laborious, and years of research brought little improvement in recognition accuracy. The most representative of these, the scale-invariant feature transform (SIFT), is invariant to rotation, scaling and brightness changes, yet it still cannot achieve good recognition on highly varied images. As an unsupervised feature learning process, deep learning learns useful features automatically; supported by large amounts of training data and powerful computing capability, it has become a focus of computer vision research. A two-dimensional image can be fed directly into the network, which avoids the manual feature extraction and data reconstruction required by traditional algorithms. Convolutional neural networks have two main advantages. First, features are learned directly through training of the convolutional layers, which avoids manual feature extraction, and the trained feature extractor is more robust. Second, the neurons of a convolutional layer share weights, so the network can learn in parallel and the number of parameters to train is reduced. Convolutional neural networks recognize two-dimensional patterns well under shift, scaling and other forms of distortion, and have become the preferred model of many computer vision researchers.
Among the many computer vision problems, semantic image segmentation is an important and complex one. It differs from image classification and detection: classification and detection understand an image at the image level, whereas semantic segmentation understands it at the pixel level. Given a picture, the goal of semantic segmentation is to classify every pixel in it. Traditional segmentation algorithms mainly address problems such as foreground-background segmentation and clustering of image content; these do not attach semantic labels to object categories, so in practical applications the segments need further processing. With a convolutional neural network, training and prediction can be carried out end to end: given a dataset for semantic segmentation and a designed network structure to train, the semantic segmentation result can be obtained.
Since convolutional neural networks were first applied to computer vision tasks, many researchers have turned to semantic image segmentation and proposed numerous convolutional neural networks for it. Compared with earlier conventional methods, deep learning frameworks segment far better. Although reasonably good networks can be designed for semantic segmentation, the results still do not hold across all kinds of images: the diversity of images means that a very large amount of training data must be prepared, and interference between categories keeps pixel-level prediction from being highly accurate.
Summary of the invention
In view of the above defects or needs for improvement of the prior art, the object of the present invention is to provide a deep learning network construction method and system for semantic segmentation, thereby solving the technical problem that existing convolutional neural networks for semantic segmentation have relatively low segmentation accuracy.
To achieve the above object, according to one aspect of the present invention, a deep learning network construction method for semantic segmentation is provided, comprising:
S1. Applying a multi-scale transform to the images in a dataset, wherein the images in the dataset have been labeled by category;
S2. Using the multi-scale-transformed images and their corresponding labels as the input of the deep learning network, and then modifying the network structure file and the solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the solver file includes the training parameter settings;
S3. In the mean-field iteration layer, iteratively optimizing the output of the deconvolution network with the mean-field iteration algorithm;
S4. According to the modified network structure file and solver file, jointly training the deconvolution network and the conditional random field to obtain a target deep learning network, the target deep learning network being able to perform semantic segmentation on multi-scale-transformed images to be tested.
Preferably, step S2 specifically comprises the following sub-steps:
S2.1. Feeding the multi-scale-transformed images and their corresponding labels as input into the program that builds the leveldb, which converts them into files that Caffe can use directly;
S2.2. Setting, in the Caffe network structure file, the types of the convolutional layers and pooling layers and their operating parameters, and applying multi-scale pooling to the last pooling layer: the input image is divided into the regions corresponding to each pooling scale, and the value obtained for each region is fed into the last pooling layer;
S2.3. Adding an implementation of the mean-field algorithm to the Caffe framework of the deep learning network;
S2.4. Updating the IDs (M, N) in caffe.proto and setting the parameters SIMPLE_FAST_MEANFIELD = M and MULTI_STAGE_MEANFIELD = N, where M and N are positive integers;
S2.5. Modifying the training file and the test file among the network structure files in the Caffe framework of the deep learning network, adding the corresponding mean-field iteration layer;
S2.6. Setting, in the training file, the network model, the base learning rate, the learning rate policy, the momentum (the weight of the previous gradient update), the maximum number of iterations and the run mode.
Preferably, step S3 specifically comprises the following sub-steps:
S3.1. Obtaining the feedback input of the mean-field iteration from
$$V_1(t) = \begin{cases} \mathrm{softmax}(U), & t = 0 \\ V_2(t-1), & 0 < t \le T \end{cases}$$
where $V_2(t) = f_\theta(U, V_1(t), I)$, $0 \le t \le T$, denotes the output of one mean-field iteration;
S3.2. Obtaining the final output from
$$Y(t) = \begin{cases} 0, & 0 \le t < T \\ V_2(t), & t = T \end{cases}$$
where softmax is the probability normalization operation, $U$ is the output of the deconvolution network, $t$ is the current iteration, $T$ is the total number of iterations, $V_1$ and $V_2$ are intermediate variables of the iteration, $I$ is the input two-dimensional image after the multi-scale transform, $f_\theta$ is the computation of the mean-field iteration algorithm, $\theta$ denotes the parameters of the conditional random field that need to be trained, specifically the weight coefficients of the Gaussian kernels and the coefficients of the pairwise relations, and $Y(t)$ is the final semantic segmentation output.
Preferably, the final mean-field iteration output $V_2(t)$ is computed as follows:
A1. Initializing the unary potential $U_i(l)$ with the coarse semantic segmentation result of the deconvolution network, and obtaining the normalized probabilities from $Q_i(l) = \frac{1}{Z_i}\exp(U_i(l))$, where $Z_i = \sum_l \exp(U_i(l))$, $l$ is the class label, and $U_i(l)$ is the probability that pixel $i$ belongs to class $l$;
A2. Passing the mutual influence between pixel class labels through the Gaussian kernels $k_m(p_i, p_j)$ and taking the weighted sum with coefficients $\omega^{(m)}$, expressed by the following formula:
$$Q_i(l) = \sum_m \omega^{(m)} \sum_{j \ne i} k_m(p_i, p_j)\, Q_j(l)$$
where $i$, $j$ denote pixels, $p_i$, $p_j$ denote the pixel values of the corresponding pixels, and $k_m(p_i, p_j)$ denotes the $m$-th Gaussian kernel;
A3. Obtaining the mutual influence between pixel class labels according to the coefficients $\mu(l, l')$ that balance the pairwise relations: $Q_i(l) = \sum_{l' \in L} \mu(l, l')\, Q_i(l')$, where $l'$ denotes a class different from $l$ and $L$ denotes the set of all classes;
A4. Adding the unary potential $U_i(l)$, specifically: $Q_i(l) = U_i(l) - Q_i(l)$;
A5. Updating $Q_i(l)$ by $Q_i(l) = \frac{1}{Z_i}\exp(Q_i(l))$, taking the output as the new input, and returning to step A2 until convergence or until the maximum number of iterations is reached, where $Z_i = \sum_l \exp(Q_i(l))$ is the normalizing constant; the $Q_i(l)$ finally obtained is the mean-field iteration output $V_2(t)$.
Preferably, step S4 specifically comprises the following sub-steps:
S4.1. Feeding the multi-scale-transformed images and their corresponding labels as input into the deep learning network;
S4.2. Extracting the features of the target regions of the image with the convolutional network, restoring the detail and shape information of the target regions with the deconvolution network, and obtaining the actual output probability of each class;
S4.3. Computing the difference between the actual output probabilities and the labels;
S4.4. According to the difference, adjusting the convolution kernel parameters, the bias vector parameters and the parameters of the conditional random field by back-propagation so as to minimize the error.
Preferably, the multi-scale transform uses 3 scales, namely 0.5, 1 and 1.5, each denoting scaling of the original image by the corresponding factor.
Preferably, the multi-scale pooling uses 3 different scales, namely 1 × 1, 2 × 2 and 4 × 4, dividing the image into 1 region, 4 regions and 16 regions respectively.
According to another aspect of the present invention, a deep learning network construction system for semantic segmentation is provided, comprising:
an image transform module, configured to apply a multi-scale transform to the images in a dataset, wherein the images in the dataset have been labeled by category;
a setup module, configured to use the multi-scale-transformed images and their corresponding labels as the input of the deep learning network, and then to modify the network structure file and the solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the solver file includes the training parameter settings;
an optimization module, configured to iteratively optimize, in the mean-field iteration layer, the output of the deconvolution network with the mean-field iteration algorithm;
a joint training module, configured to jointly train the deconvolution network and the conditional random field according to the modified network structure file and solver file to obtain a target deep learning network, the target deep learning network being able to perform semantic segmentation on multi-scale-transformed images to be tested.
In general, compared with the prior art, the method of the invention achieves the following beneficial effects:
(1) Building on the deconvolution-network approach to semantic segmentation and exploiting the fact that conditional random fields refine edges well, the invention formulates the conditional random field as a recurrent network and integrates it into the deconvolution network for end-to-end training, so that the parameters of the convolutional network and of the recurrent network are learned jointly and a better deep learning network is obtained.
(2) The invention proposes joint training of the deconvolution network and the conditional random field; the resulting parameters are more robust, stronger detail and shape information is obtained, and the problem of inaccurate segmentation at image edges is addressed.
(3) By feeding in pictures at multiple scales and using multi-scale pooling, the invention changes the receptive field of the neural network. The varied receptive field preserves the integrity of both large and small objects, so that the trained deep learning network addresses the over-segmentation of large objects and the missed segmentation of small objects that a single receptive field causes in semantic segmentation.
(4) The invention extends the classical deconvolution network with conditional random field joint training and multi-feature information fusion, so that the trained deep learning network segments images more accurately.
Brief description of the drawings
Fig. 1 is a network framework diagram of a deep learning network for semantic segmentation disclosed in an embodiment of the present invention;
Fig. 2 is a flow diagram of a deep learning network construction method for semantic segmentation disclosed in an embodiment of the present invention;
Fig. 3 is a schematic diagram of a mean-field iteration method disclosed in an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it. In addition, the technical features involved in the embodiments described below may be combined with one another as long as they do not conflict.
Fig. 1 shows the network framework of a deep learning network for semantic segmentation disclosed in an embodiment of the present invention. The framework shown in Fig. 1 comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer (i.e. the CRF-RNN layer).
Fig. 2 shows the flow of a deep learning network construction method for semantic segmentation disclosed in an embodiment of the present invention. The method mainly comprises the following steps: 1) data collection and preprocessing; 2) modification of the files related to the network framework (Convolutional Architecture for Fast Feature Embedding, Caffe); 3) the mean-field iteration process; 4) joint training of the conditional random field and the deconvolution network. The specific implementation is as follows:
S1. Applying a multi-scale transform to the images in a dataset, wherein the images in the dataset have been labeled by category;
Here, the multi-scale transform uses 3 scales, namely 0.5, 1 and 1.5, each denoting scaling of the original image by the corresponding factor.
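As a non-limiting illustration, step S1 can be sketched in Python as follows. The source does not state which interpolation is used; nearest-neighbour sampling is assumed here because it never produces class ids that were not in the label map. The names `resize_nearest` and `multi_scale_transform` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # single-channel image or label map, as rows of values

SCALES = (0.5, 1.0, 1.5)  # the three scales named in the text


def resize_nearest(image: Image, scale: float) -> Image:
    """Rescale by `scale` with nearest-neighbour sampling (an assumption)."""
    height, width = len(image), len(image[0])
    new_h = max(1, int(round(height * scale)))
    new_w = max(1, int(round(width * scale)))
    return [
        [
            # Map each output coordinate back to its source pixel.
            image[min(height - 1, int(r / scale))][min(width - 1, int(c / scale))]
            for c in range(new_w)
        ]
        for r in range(new_h)
    ]


def multi_scale_transform(image: Image, labels: Image) -> List[Tuple[Image, Image]]:
    """Return one (image, label map) pair per scale, resized identically."""
    return [(resize_nearest(image, s), resize_nearest(labels, s)) for s in SCALES]
```

Resizing the image and its label map with the same routine keeps every pixel aligned with its label at each scale.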
S2. Using the multi-scale-transformed images and their corresponding labels as the input of the deep learning network, and then modifying the network structure file and the solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the solver file includes the training parameter settings;
Step S2 specifically comprises the following sub-steps:
S2.1. Feeding the multi-scale-transformed images and their corresponding labels as input into the program that builds the leveldb, which converts them into files that Caffe can use directly;
S2.2. Setting, in the Caffe network structure file, the types of the convolutional layers and pooling layers and their operating parameters, and applying multi-scale pooling to the last pooling layer: the input image is divided into the regions corresponding to each pooling scale, and the value obtained for each region is fed into the pooling layer;
S2.3. Adding an implementation of the mean-field algorithm to the Caffe framework of the deep learning network;
S2.4. Updating the IDs (M, N) in caffe.proto and setting the parameters SIMPLE_FAST_MEANFIELD = M and MULTI_STAGE_MEANFIELD = N, where M and N are positive integers; preferably, M is 54 and N is 55;
S2.5. Modifying the training file and the test file in the Caffe framework of the deep learning network, adding the corresponding mean-field iteration layer meanfield;
The multi-scale operation is added to the network architecture part of the training file. The multi-scale pooling uses 3 different scales, namely 1 × 1, 2 × 2 and 4 × 4, dividing the image into 1 region, 4 regions and 16 regions respectively, and the value obtained for each region is fed into the pooling layer. The core training file solver.prototxt needs to be configured; the settings mainly include the network model name (train.prototxt is used during training), the base learning rate (base_lr = 0.01), the learning rate policy (lr_policy: "step"), the momentum, i.e. the weight of the previous gradient update (momentum: 0.9), the maximum number of iterations (max_iter: 20000), the run mode (GPU), and so on.
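The multi-scale pooling described above can be illustrated with the following minimal Python sketch. It assumes that "the value of each region" is the region maximum (the source does not say whether maximum or average pooling is used) and that the feature map is at least 4 × 4; `pool_grid` and `multi_scale_pool` are hypothetical names.

```python
from typing import List

FeatureMap = List[List[float]]

GRID_SIZES = (1, 2, 4)  # 1x1, 2x2 and 4x4 grids give 1, 4 and 16 regions


def pool_grid(feature: FeatureMap, n: int) -> List[float]:
    """Split the map into an n x n grid and keep the maximum of each cell."""
    height, width = len(feature), len(feature[0])
    pooled = []
    for gr in range(n):
        r0, r1 = gr * height // n, (gr + 1) * height // n
        for gc in range(n):
            c0, c1 = gc * width // n, (gc + 1) * width // n
            # Max pooling is an assumption; the text only says "value".
            pooled.append(
                max(feature[r][c] for r in range(r0, r1) for c in range(c0, c1))
            )
    return pooled


def multi_scale_pool(feature: FeatureMap) -> List[float]:
    """Concatenate the pooled values of all three grids (1 + 4 + 16 = 21)."""
    out: List[float] = []
    for n in GRID_SIZES:
        out.extend(pool_grid(feature, n))
    return out
```

Whatever the input size, the result always has 21 values per feature map, which is what lets regions of different extent contribute to the same layer.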
S2.6. Setting, in the training file, the network model, the base learning rate, the learning rate policy, the momentum (the weight of the previous gradient update), the maximum number of iterations and the run mode.
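By way of example only, the solver settings listed above can be collected and written out as the text of a solver.prototxt. The field names `net`, `base_lr`, `lr_policy`, `momentum`, `max_iter` and `solver_mode` are standard Caffe solver fields, and the values are those given in the description; the helper `render_solver` is hypothetical. Caffe's "step" policy additionally reads `gamma` and `stepsize`, which the source does not specify, so they are left out here.

```python
# Values taken from the description; gamma and stepsize are not given there.
SOLVER_SETTINGS = {
    "net": "train.prototxt",
    "base_lr": 0.01,
    "lr_policy": "step",
    "momentum": 0.9,
    "max_iter": 20000,
    "solver_mode": "GPU",
}

# String-valued fields are quoted in the prototxt text format;
# numbers and enum values such as GPU are written bare.
QUOTED_FIELDS = {"net", "lr_policy"}


def render_solver(settings: dict) -> str:
    """Render the settings as prototxt lines of the form `key: value`."""
    lines = []
    for key, value in settings.items():
        if key in QUOTED_FIELDS:
            lines.append(f'{key}: "{value}"')
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"
```

Calling `render_solver(SOLVER_SETTINGS)` yields text that can be saved as solver.prototxt once the missing step-policy fields have been chosen.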
S3. In the mean-field iteration layer, iteratively optimizing the output of the deconvolution network with the mean-field iteration algorithm;
Fig. 3 is a schematic diagram of a mean-field iteration method disclosed in an embodiment of the present invention, comprising:
S3.1. Obtaining the feedback input of the mean-field iteration from
$$V_1(t) = \begin{cases} \mathrm{softmax}(U), & t = 0 \\ V_2(t-1), & 0 < t \le T \end{cases}$$
where $V_2(t) = f_\theta(U, V_1(t), I)$, $0 \le t \le T$, denotes the output of one mean-field iteration;
S3.2. Obtaining the final output from
$$Y(t) = \begin{cases} 0, & 0 \le t < T \\ V_2(t), & t = T \end{cases}$$
where softmax is the probability normalization operation, $U$ is the output of the deconvolution network (i.e. the coarse semantic segmentation result), $t$ is the current iteration, $T$ is the total number of iterations, $V_1$ and $V_2$ are intermediate variables of the iteration, $I$ is the input two-dimensional image after the multi-scale transform, $f_\theta$ is the computation of the mean-field iteration algorithm, $\theta$ denotes the parameters of the conditional random field that need to be trained, specifically the weight coefficients of the Gaussian kernels and the coefficients of the pairwise relations, and $Y(t)$ is the final semantic segmentation output.
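The recurrence of steps S3.1 and S3.2 can be sketched as a plain loop. In this minimal sketch the mean-field step `f_theta` is passed in as a callable standing in for the trained computation; the function names are hypothetical, and the index range 0 ≤ t ≤ T is followed literally, so `f_theta` is applied T + 1 times.

```python
import math
from typing import Any, Callable, List

Scores = List[List[float]]  # one row of per-class values for each pixel


def softmax_rows(scores: Scores) -> Scores:
    """Normalise each pixel's class scores into probabilities."""
    out = []
    for row in scores:
        peak = max(row)  # subtract the maximum for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def run_recurrence(
    unary: Scores,
    image: Any,
    f_theta: Callable[[Scores, Scores, Any], Scores],
    total_iters: int,
) -> Scores:
    """Unroll V1/V2 over t = 0..T and return Y(T) = V2(T)."""
    v1 = softmax_rows(unary)            # V1(0) = softmax(U)
    v2 = v1
    for _ in range(total_iters + 1):    # t = 0, 1, ..., T
        v2 = f_theta(unary, v1, image)  # V2(t) = f_theta(U, V1(t), I)
        v1 = v2                         # V1(t + 1) = V2(t)
    return v2                           # Y is zero before T, V2(T) at t = T
```

Because every step is an ordinary function of U, V1 and I, the unrolled loop can be treated as a stack of layers and trained by back-propagation together with the preceding networks.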
The final mean-field iteration output $V_2(t)$ is computed as follows:
A1. Initializing the unary potential $U_i(l)$ with the coarse semantic segmentation result of the deconvolution network, and obtaining the normalized probabilities from $Q_i(l) = \frac{1}{Z_i}\exp(U_i(l))$, where $Z_i = \sum_l \exp(U_i(l))$, $l$ is the class label, and $U_i(l)$ is the probability that pixel $i$ belongs to class $l$;
A2. Passing the mutual influence between pixel class labels through the Gaussian kernels $k_m(p_i, p_j)$ and taking the weighted sum with coefficients $\omega^{(m)}$, expressed by the following formula:
$$Q_i(l) = \sum_m \omega^{(m)} \sum_{j \ne i} k_m(p_i, p_j)\, Q_j(l)$$
where $i$, $j$ denote pixels, $p_i$, $p_j$ denote the pixel values of the corresponding pixels, and $k_m(p_i, p_j)$ denotes the $m$-th Gaussian kernel; the number of Gaussian kernels can be chosen according to actual needs;
For example, two different Gaussian kernels are used, i.e. $m$ takes the values 1 and 2, and $\theta_\alpha$, $\theta_\beta$, $\theta_\gamma$ are set to 160, 3 and 3 respectively; in the kernels, $p_i$, $p_j$ are the color values of pixels $i$ and $j$, and $I_i$, $I_j$ are the position coordinates of pixels $i$ and $j$.
A3. Obtaining the mutual influence between pixel class labels according to the coefficients $\mu(l, l')$ that balance the pairwise relations: $Q_i(l) = \sum_{l' \in L} \mu(l, l')\, Q_i(l')$, where $l'$ denotes a class different from $l$ and $L$ denotes the set of all classes;
Here the difference between classes is the main consideration: the smaller the difference between two classes, the smaller their coefficient $\mu(l, l')$, whose value ranges from -1 to 0.
A4. Adding the unary potential $U_i(l)$, specifically: $Q_i(l) = U_i(l) - Q_i(l)$;
A5. Updating $Q_i(l)$ by $Q_i(l) = \frac{1}{Z_i}\exp(Q_i(l))$, taking the output as the new input, and returning to step A2 until convergence or until the maximum number of iterations is reached, where $Z_i = \sum_l \exp(Q_i(l))$ is the normalizing constant; the $Q_i(l)$ finally obtained is the mean-field iteration output $V_2(t)$.
Preferably, the maximum number of iterations is 10.
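Steps A1 to A5 can be sketched as follows for a handful of pixels. This is a brute-force illustration under stated assumptions, not the implementation added to Caffe. The exact kernel formulas are not reproduced in the text, so the two kernels below follow the appearance and smoothness kernels commonly used in fully connected CRFs, with θ_α and θ_γ scaling position and θ_β scaling color. The sum in A3 is taken over every class in L, and the loop runs a fixed number of iterations without a convergence test. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Each pixel is (position, color); both are tuples of floats.
Pixel = Tuple[Tuple[float, ...], Tuple[float, ...]]

THETA_ALPHA, THETA_BETA, THETA_GAMMA = 160.0, 3.0, 3.0  # values from the text


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kernels(pi: Pixel, pj: Pixel) -> Tuple[float, float]:
    """Assumed kernel forms: k_1 (position and color) and k_2 (position only)."""
    (pos_i, col_i), (pos_j, col_j) = pi, pj
    k1 = math.exp(-sq_dist(pos_i, pos_j) / (2 * THETA_ALPHA ** 2)
                  - sq_dist(col_i, col_j) / (2 * THETA_BETA ** 2))
    k2 = math.exp(-sq_dist(pos_i, pos_j) / (2 * THETA_GAMMA ** 2))
    return k1, k2


def normalise(scores: Sequence[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field(unary, pixels, weights, mu, max_iters=10):
    n, n_labels = len(unary), len(unary[0])
    q = [normalise(u) for u in unary]                        # A1
    for _ in range(max_iters):
        new_q = []
        for i in range(n):
            msg = [0.0] * n_labels
            for j in range(n):                               # A2
                if j == i:
                    continue
                w = sum(wm * km for wm, km in zip(weights, kernels(pixels[i], pixels[j])))
                for l in range(n_labels):
                    msg[l] += w * q[j][l]
            compat = [sum(mu[l][lp] * msg[lp] for lp in range(n_labels))
                      for l in range(n_labels)]              # A3
            local = [unary[i][l] - compat[l] for l in range(n_labels)]  # A4
            new_q.append(normalise(local))                   # A5
        q = new_q
    return q
```

With `mu` set to minus the identity matrix, which lies in the stated range of -1 to 0, a pixel whose own scores are ambiguous is pulled toward the label that its nearby, similarly colored neighbors agree on.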
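The softmax regression model generalizes the logistic regression model for binary classification so that it can be applied to multi-class problems. Specifically, for a training set $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(l)}, y^{(l)})\}$, $x^{(i)}$ is a training sample (here, the pixel value of each pixel) and $y^{(i)}$ is the label of each pixel, $y^{(i)} \in \{1, 2, \ldots, k\}$. Each input $x$ passes through the convolutional neural network, and its probability of belonging to each class must be obtained. Denoting simply by $\theta$ the parameters of the whole network that need to be trained, this can be expressed by the hypothesis function as follows:
$$h_\theta(x^{(i)}) = \begin{bmatrix} p(y^{(i)} = 1 \mid x^{(i)}; \theta) \\ p(y^{(i)} = 2 \mid x^{(i)}; \theta) \\ \vdots \\ p(y^{(i)} = k \mid x^{(i)}; \theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{k} e^{\theta_j^T x^{(i)}}} \begin{bmatrix} e^{\theta_1^T x^{(i)}} \\ e^{\theta_2^T x^{(i)}} \\ \vdots \\ e^{\theta_k^T x^{(i)}} \end{bmatrix}$$
where $\theta_1, \theta_2, \ldots, \theta_k$ are the model parameters to be trained, and the term $\frac{1}{\sum_{j=1}^{k} e^{\theta_j^T x^{(i)}}}$ normalizes the output probabilities so that they sum to 1.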
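The hypothesis function can be evaluated directly for a single input, as the short sketch below shows. The name `softmax_hypothesis` and the example parameter vectors are illustrative only; in the network itself the scores come from the preceding layers, not from a single linear map.

```python
import math
from typing import List, Sequence


def softmax_hypothesis(thetas: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Return p(y = j | x; theta) for j = 1..k using one theta_j per class."""
    scores = [sum(t * v for t, v in zip(theta, x)) for theta in thetas]  # theta_j^T x
    peak = max(scores)  # subtracting a constant leaves the ratios unchanged
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)   # the normalising term: probabilities sum to 1
    return [e / total for e in exps]


# Three classes, two input features (hypothetical values).
probs = softmax_hypothesis([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [2.0, 1.0])
```

Subtracting the largest score before exponentiating is a standard numerical safeguard and does not change the probabilities.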
S4. According to the modified network structure file and solver file, jointly training the deconvolution network and the conditional random field to obtain a target deep learning network, the target deep learning network being able to perform semantic segmentation on multi-scale-transformed images to be tested.
Step S4 is mainly the joint training, in which the whole network is trained as one. The collected dataset and the corresponding labels are fed into the network; the convolutional layers, pooling layers and so on extract features stage by stage; the deconvolution network restores the detail and shape information of the target and yields a probability map of each class for the region; the probability map and the original image are then fed into the mean-field iteration process, which iterates to produce the optimized output probability map. Finally, the difference between the actual output and the labels is computed, the convolution kernel parameters, bias vector parameters and conditional random field parameters are adjusted by back-propagation so as to minimize the error, and the network parameters are saved for testing. Step S4 specifically comprises the following sub-steps:
S4.1. Feeding the multi-scale-transformed images and their corresponding labels as input into the deep learning network;
S4.2. Extracting the features of the target regions of the image with the convolutional network, restoring the detail and shape information of the target regions with the deconvolution network, and obtaining the actual output probability of each class;
S4.3. Computing the difference between the actual output probabilities and the labels;
S4.4. According to the difference, adjusting the convolution kernel parameters, the bias vector parameters and the parameters of the conditional random field by back-propagation so as to minimize the error.
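The source does not name the measure used for the "difference" in step S4.3. As one common choice, assumed here purely for illustration, the sketch below uses the mean per-pixel cross-entropy, together with its gradient with respect to the pre-softmax scores, which is the quantity from which back-propagation in step S4.4 would start. Both function names are hypothetical.

```python
import math
from typing import List


def pixelwise_cross_entropy(probs: List[List[float]], labels: List[int]) -> float:
    """Mean of -log p(correct class) over all pixels (an assumed loss)."""
    eps = 1e-12  # guards against log(0)
    total = -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return total / len(labels)


def score_gradient(probs: List[List[float]], labels: List[int]) -> List[List[float]]:
    """Gradient of the mean loss w.r.t. pre-softmax scores: (p - onehot) / N.

    Valid when `probs` is the softmax of those scores.
    """
    n = len(labels)
    return [
        [(p - (1.0 if c == y else 0.0)) / n for c, p in enumerate(row)]
        for row, y in zip(probs, labels)
    ]
```

A pixel predicted with probability 1 for its true class contributes nothing to the loss or the gradient, so the update concentrates on pixels that are still misclassified or uncertain.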
After the network has been trained, a test picture is given the multi-scale transform to the three scales 0.5, 1 and 1.5, and the resulting pictures are fed in turn into the trained deep network. The probability maps of the multi-scale pictures are summed and normalized to obtain the final probability map, from which the final semantic segmentation result is obtained.
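The test-time fusion can be sketched as below. The sketch assumes that the probability maps of the three scales have already been resampled to a common resolution, a step the source does not describe, and that the final label of a pixel is its most probable class. The function names are hypothetical.

```python
from typing import List

ProbMap = List[List[float]]  # per pixel, one probability per class


def fuse_probability_maps(maps: List[ProbMap]) -> ProbMap:
    """Sum the per-scale maps pixel by pixel, then renormalise to sum to 1."""
    n_pixels, n_classes = len(maps[0]), len(maps[0][0])
    fused = []
    for i in range(n_pixels):
        summed = [sum(m[i][c] for m in maps) for c in range(n_classes)]
        total = sum(summed)
        fused.append([s / total for s in summed])
    return fused


def predict_labels(fused: ProbMap) -> List[int]:
    """Pick the most probable class for every pixel."""
    return [max(range(len(p)), key=p.__getitem__) for p in fused]
```

Summing before normalizing lets a scale that is confident about a pixel outweigh scales that are undecided about it.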
Those skilled in the art will readily understand that the foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (8)

1. A deep learning network construction method for semantic segmentation, characterized by comprising:
S1. applying a multi-scale transform to the images in a dataset, wherein the images in the dataset have been labeled by category;
S2. using the multi-scale-transformed images and their corresponding labels as the input of the deep learning network, and then modifying the network structure file and the solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the solver file includes the training parameter settings;
S3. in the mean-field iteration layer, iteratively optimizing the output of the deconvolution network with the mean-field iteration algorithm;
S4. according to the modified network structure file and solver file, jointly training the deconvolution network and the conditional random field to obtain a target deep learning network, the target deep learning network being able to perform semantic segmentation on multi-scale-transformed images to be tested.
2. The method according to claim 1, characterized in that step S2 specifically comprises the following sub-steps:
S2.1, feeding the multi-scale-transformed images and their corresponding labels as input into a program that builds leveldb, and running the program to convert them into files that Caffe can use directly;
S2.2, setting the types of the convolutional layers and pooling layers in the network structure file in Caffe and the operating parameters in the network structure file, and performing a multi-scale pooling operation on the last pooling layer, in which the input image is divided into multiple regions corresponding to the multi-scale pooling and the value of each region is obtained and inserted into the last pooling layer;
S2.3, adding an implementation of the mean field algorithm to the Caffe framework of the deep learning network;
S2.4, updating the IDs (M, N) in caffe.proto and setting the parameters SIMPLE_FAST_MEANFIELD = M and MULTI_STAGE_MEANFIELD = N, wherein M and N are positive integers;
S2.5, changing the training file and the test file among the network structure files in the Caffe framework of the deep learning network to add the corresponding mean field iteration layer;
S2.6, configuring the network model in the training file, the base learning rate, the learning rate update policy, the weight of the previous gradient update, the maximum number of iterations and the running mode.
3. The method according to claim 1, characterized in that step S3 specifically comprises the following sub-steps:
S3.1, obtaining the feedback input of the mean field iteration, where V_2(t) = f_θ(U, V_1(t), I), 0 ≤ t ≤ T, denotes the output of the mean field iteration;
S3.2, obtaining the final output result, where softmax denotes the probability normalization operation, U is the output of the deconvolution network, t denotes the current iteration, T denotes the total number of iterations, V_1 and V_2 are intermediate variables of the iteration, I is the input two-dimensional image after multi-scale transformation, f_θ is the computation performed by the mean field iterative algorithm, θ denotes the conditional random field parameters to be trained, including the weight coefficient of each Gaussian kernel function and the coefficients of the pairwise relations, and Y(t) is the final semantic segmentation output.
4. The method according to claim 3, characterized in that the final mean field iteration output V_2(t) is specifically computed as follows:
A1, initializing the unary potential function U_i(l) with the coarse semantic segmentation result of the deconvolution network, and obtaining the normalized probability values by Q_i(l) = (1/Z_i) exp(U_i(l)), where Z_i = Σ_l exp(U_i(l)), l is the class label, and U_i(l) is the probability that pixel i belongs to class l;
A2, propagating the mutual influence between pixel class labels through the Gaussian kernel functions k^(m)(p_i, p_j) and computing their weighted sum with the coefficients ω^(m), as expressed by the following formulas:
$$Q_i^{(m)}(l) = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l)$$
$$Q_i(l) = \sum_m \omega^{(m)} Q_i^{(m)}(l)$$
where i and j denote pixels, p_i and p_j denote the pixel values of the corresponding pixels (written f_i and f_j in the formulas), and k^(m)(p_i, p_j) denotes the m-th Gaussian kernel function;
A3, obtaining the mutual influence between pixel class labels according to the coefficient μ(l, l') that balances the pairwise relations: Q_i(l) = Σ_{l'∈L} μ(l, l') Q_i(l'), where l' denotes a class different from l and L denotes the set of all classes;
A4, adding the unary potential function U_i(l), specifically: Q_i(l) = U_i(l) − Q_i(l);
A5, updating Q_i(l) by normalization, taking the output as the new input and jumping to step A2 until convergence or until the maximum number of iterations is reached, where Z_i = Σ_l exp(U_i(l)); the finally obtained Q_i(l) is the mean field iteration output V_2(t).
5. The method according to claim 4, characterized in that step S4 specifically comprises the following sub-steps:
S4.1, feeding the multi-scale-transformed images and their corresponding labels as input into the deep learning network;
S4.2, extracting the features of the image target region by the convolutional network, and restoring the detail information and shape information of the target region by the deconvolution network, to obtain the actual output probability of each class;
S4.3, calculating the difference between the actual output probabilities and the labels;
S4.4, according to the difference, adjusting the convolution kernel parameters, the bias vector parameters and the optimization parameters of the conditional random field by back-propagation so as to minimize the error.
6. The method according to any one of claims 1 to 5, characterized in that the multi-scale comprises 3 scales, namely 0.5, 1 and 1.5, each denoting scaling of the original image by the corresponding factor.
7. The method according to any one of claims 1 to 5, characterized in that the multi-scale pooling uses 3 different scales, namely 1 × 1, 2 × 2 and 4 × 4, which divide the image into 1 region, 4 regions and 16 regions respectively.
8. A deep learning network establishing system suitable for semantic segmentation, characterized by comprising:
an image transformation module, configured to perform multi-scale transformation on the images in a data set, wherein the images in the data set have all been labeled by class;
a setting module, configured to take the multi-scale-transformed images and their corresponding labels as the input of a deep learning network, and then to modify the network structure file and the network solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the network solver file includes the training parameter settings;
an optimization module, configured to iteratively optimize the output of the deconvolution network in the mean field iteration layer using a mean field iterative algorithm;
a joint training module, configured to obtain, according to the modified network structure file and network solver file, a target deep learning network by jointly training the deconvolution network and the conditional random field, the target deep learning network being capable of performing semantic segmentation on a multi-scale-transformed image to be tested.
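The mean field iteration of steps A1 to A5 in claim 4 can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the Caffe layer of the invention: the function name `mean_field_refine`, the dense (N, N) kernel matrices with zero diagonal, and the fixed iteration count are hypothetical, and the final normalization in A5 is assumed to be a softmax over the updated scores.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mean_field_refine(unary, kernels, kernel_weights, compat, num_iters=5):
    """
    unary:          (N, L) array, U_i(l) from the deconvolution network
    kernels:        list of (N, N) arrays, k^(m) with zero diagonal (j != i)
    kernel_weights: list of floats, omega^(m)
    compat:         (L, L) array, mu(l, l')
    """
    q = softmax(unary, axis=1)                      # A1: normalized initial Q_i(l)
    for _ in range(num_iters):
        # A2: message passing with each kernel, then weighted sum over kernels
        message = sum(w * (k @ q) for w, k in zip(kernel_weights, kernels))
        # A3: compatibility transform, sum over l' of mu(l, l') * Q_i(l')
        pairwise = message @ compat.T
        # A4: add the unary potential, Q_i(l) = U_i(l) - Q_i(l)
        scores = unary - pairwise
        # A5: normalize (assumed softmax) and feed the output back into A2
        q = softmax(scores, axis=1)
    return q
```

With a Potts-style compatibility matrix (zero on the diagonal, one elsewhere), a pixel whose unary scores weakly favour one class can be pulled towards the class of its strongly connected neighbours, which is the smoothing effect the iteration is meant to provide.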
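The multi-scale pooling of claim 7 (grids of 1 × 1, 2 × 2 and 4 × 4, giving 1 + 4 + 16 = 21 regions) can be illustrated with a short NumPy sketch. The function name `multiscale_pool` is hypothetical, and taking the maximum as "the value of each region" is an assumption, since the claims do not state which pooling operator is used.

```python
import numpy as np

def multiscale_pool(feature_map, grid_sizes=(1, 2, 4)):
    """
    Pool a (C, H, W) feature map over n x n grids for each n in grid_sizes.
    Returns an array of shape (C, sum(n * n)), i.e. (C, 21) for (1, 2, 4).
    Assumes H and W are at least max(grid_sizes) so no region is empty.
    """
    _, h, w = feature_map.shape
    pooled = []
    for n in grid_sizes:
        # Region boundaries that split the height and width into n parts
        row_edges = np.linspace(0, h, n + 1).astype(int)
        col_edges = np.linspace(0, w, n + 1).astype(int)
        for r in range(n):
            for c in range(n):
                region = feature_map[:,
                                     row_edges[r]:row_edges[r + 1],
                                     col_edges[c]:col_edges[c + 1]]
                pooled.append(region.max(axis=(1, 2)))  # one value per channel
    return np.stack(pooled, axis=1)
```

Because the number of regions is fixed by the grid sizes, the output length per channel is 21 regardless of the input resolution, which suits inputs that have been rescaled by 0.5, 1 and 1.5.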
CN201710342354.8A 2017-05-16 2017-05-16 A kind of deep learning network establishing method and system suitable for semantic segmentation Pending CN107180430A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710342354.8A CN107180430A (en) 2017-05-16 2017-05-16 A kind of deep learning network establishing method and system suitable for semantic segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710342354.8A CN107180430A (en) 2017-05-16 2017-05-16 A kind of deep learning network establishing method and system suitable for semantic segmentation

Publications (1)

Publication Number Publication Date
CN107180430A true CN107180430A (en) 2017-09-19

Family

ID=59832220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710342354.8A Pending CN107180430A (en) 2017-05-16 2017-05-16 A kind of deep learning network establishing method and system suitable for semantic segmentation

Country Status (1)

Country Link
CN (1) CN107180430A (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729992A (en) * 2017-10-27 2018-02-23 深圳市未来媒体技术研究院 A kind of deep learning method based on backpropagation
CN107730514A (en) * 2017-09-29 2018-02-23 北京奇虎科技有限公司 Scene cut network training method, device, computing device and storage medium
CN108010049A (en) * 2017-11-09 2018-05-08 华南理工大学 Split the method in human hand region in stop-motion animation using full convolutional neural networks
CN108053376A (en) * 2017-12-08 2018-05-18 长沙全度影像科技有限公司 A kind of semantic segmentation information guiding deep learning fisheye image correcting method
CN108335313A (en) * 2018-02-26 2018-07-27 阿博茨德(北京)科技有限公司 Image partition method and device
CN108345887A (en) * 2018-01-29 2018-07-31 清华大学深圳研究生院 The training method and image, semantic dividing method of image, semantic parted pattern
CN108765431A (en) * 2018-05-25 2018-11-06 中国科学院重庆绿色智能技术研究院 A kind of dividing method of image and its application in medical domain
CN108830854A (en) * 2018-03-22 2018-11-16 广州多维魔镜高新科技有限公司 A kind of image partition method and storage medium
CN108876796A (en) * 2018-06-08 2018-11-23 长安大学 A kind of lane segmentation system and method based on full convolutional neural networks and condition random field
CN109145713A (en) * 2018-07-02 2019-01-04 南京师范大学 A kind of Small object semantic segmentation method of combining target detection
CN109145939A (en) * 2018-07-02 2019-01-04 南京师范大学 A kind of binary channels convolutional neural networks semantic segmentation method of Small object sensitivity
CN109657715A (en) * 2018-12-12 2019-04-19 广东工业大学 A kind of semantic segmentation method, apparatus, equipment and medium
CN109670577A (en) * 2018-12-14 2019-04-23 北京字节跳动网络技术有限公司 Model generating method and device
CN109801293A (en) * 2019-01-08 2019-05-24 平安科技(深圳)有限公司 Remote Sensing Image Segmentation, device and storage medium, server
CN109829885A (en) * 2018-12-24 2019-05-31 中山大学 A kind of automatic identification nasopharyngeal carcinoma primary tumo(u)r method based on deep semantic segmentation network
CN110009556A (en) * 2018-01-05 2019-07-12 广东欧珀移动通信有限公司 Image background weakening method, device, storage medium and electronic equipment
CN110009573A (en) * 2019-01-29 2019-07-12 北京奇艺世纪科技有限公司 Model training, image processing method, device, electronic equipment and computer readable storage medium
CN110047047A (en) * 2019-04-17 2019-07-23 广东工业大学 Method, apparatus, equipment and the storage medium of three-dimensional appearance image information interpretation
CN110837811A (en) * 2019-11-12 2020-02-25 腾讯科技(深圳)有限公司 Method, device and equipment for generating semantic segmentation network structure and storage medium
CN111091560A (en) * 2019-12-19 2020-05-01 广州柏视医疗科技有限公司 Nasopharyngeal carcinoma primary tumor image identification method and system
CN111178495A (en) * 2018-11-10 2020-05-19 杭州凝眸智能科技有限公司 Lightweight convolutional neural network for detecting very small objects in images
CN111340047A (en) * 2020-02-28 2020-06-26 江苏实达迪美数据处理有限公司 Image semantic segmentation method and system based on multi-scale feature and foreground and background contrast
CN111582043A (en) * 2020-04-15 2020-08-25 电子科技大学 High-resolution remote sensing image ground object change detection method based on multitask learning
CN113269093A (en) * 2021-05-26 2021-08-17 大连民族大学 Method and system for detecting visual characteristic segmentation semantics in video description
CN117211758A (en) * 2023-11-07 2023-12-12 克拉玛依市远山石油科技有限公司 Intelligent drilling control system and method for shallow hole coring

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361363A (en) * 2014-11-25 2015-02-18 中国科学院自动化研究所 Deep deconvolution feature learning network, generating method thereof and image classifying method
CN105975968A (en) * 2016-05-06 2016-09-28 西安理工大学 Caffe architecture based deep learning license plate character recognition method
CN106157307A (en) * 2016-06-27 2016-11-23 浙江工商大学 A kind of monocular image depth estimation method based on multiple dimensioned CNN and continuous CRF
CN106372390A (en) * 2016-08-25 2017-02-01 汤平 Deep convolutional neural network-based lung cancer preventing self-service health cloud service system

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JONATHAN LONG ET AL.: "Fully Convolutional Networks for Semantic Segmentation", 《IEEE IN COMPUTER VISION AND PATTERN RECOGNITION》 *
LIANG-CHIEH CHEN ET AL.: "Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs", 《ICLR 2015》 *
SHUAI ZHENG ET AL.: "Conditional Random Fields as Recurrent Neural Networks", 《2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION(ICCV)》 *
刘丹 et al.: "一种多尺度CNN的图像语义分割算法", 《遥感信息》 *
山世光 et al.: "深度学习:多层神经网络的复兴与变革", 《科技导报》 *

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107730514A (en) * 2017-09-29 2018-02-23 北京奇虎科技有限公司 Scene cut network training method, device, computing device and storage medium
CN107730514B (en) * 2017-09-29 2021-02-12 北京奇宝科技有限公司 Scene segmentation network training method and device, computing equipment and storage medium
CN107729992A (en) * 2017-10-27 2018-02-23 深圳市未来媒体技术研究院 A kind of deep learning method based on backpropagation
CN107729992B (en) * 2017-10-27 2020-12-29 深圳市未来媒体技术研究院 Deep learning method based on back propagation
CN108010049A (en) * 2017-11-09 2018-05-08 华南理工大学 Split the method in human hand region in stop-motion animation using full convolutional neural networks
CN108053376A (en) * 2017-12-08 2018-05-18 长沙全度影像科技有限公司 A kind of semantic segmentation information guiding deep learning fisheye image correcting method
US11410277B2 (en) 2018-01-05 2022-08-09 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and device for blurring image background, storage medium and electronic apparatus
CN110009556A (en) * 2018-01-05 2019-07-12 广东欧珀移动通信有限公司 Image background weakening method, device, storage medium and electronic equipment
CN108345887A (en) * 2018-01-29 2018-07-31 清华大学深圳研究生院 The training method and image, semantic dividing method of image, semantic parted pattern
CN108335313A (en) * 2018-02-26 2018-07-27 阿博茨德(北京)科技有限公司 Image partition method and device
CN108830854A (en) * 2018-03-22 2018-11-16 广州多维魔镜高新科技有限公司 A kind of image partition method and storage medium
CN108765431B (en) * 2018-05-25 2022-07-15 中国科学院重庆绿色智能技术研究院 Image segmentation method and application thereof in medical field
CN108765431A (en) * 2018-05-25 2018-11-06 中国科学院重庆绿色智能技术研究院 A kind of dividing method of image and its application in medical domain
CN108876796A (en) * 2018-06-08 2018-11-23 长安大学 A kind of lane segmentation system and method based on full convolutional neural networks and condition random field
CN109145939B (en) * 2018-07-02 2021-11-02 南京师范大学 Semantic segmentation method for small-target sensitive dual-channel convolutional neural network
CN109145939A (en) * 2018-07-02 2019-01-04 南京师范大学 A kind of binary channels convolutional neural networks semantic segmentation method of Small object sensitivity
CN109145713A (en) * 2018-07-02 2019-01-04 南京师范大学 A kind of Small object semantic segmentation method of combining target detection
CN111178495A (en) * 2018-11-10 2020-05-19 杭州凝眸智能科技有限公司 Lightweight convolutional neural network for detecting very small objects in images
CN109657715B (en) * 2018-12-12 2024-02-06 广东省机场集团物流有限公司 Semantic segmentation method, device, equipment and medium
CN109657715A (en) * 2018-12-12 2019-04-19 广东工业大学 A kind of semantic segmentation method, apparatus, equipment and medium
CN109670577A (en) * 2018-12-14 2019-04-23 北京字节跳动网络技术有限公司 Model generating method and device
CN109829885A (en) * 2018-12-24 2019-05-31 中山大学 A kind of automatic identification nasopharyngeal carcinoma primary tumo(u)r method based on deep semantic segmentation network
CN109829885B (en) * 2018-12-24 2022-07-22 广州柏视医疗科技有限公司 Method for automatically identifying primary tumor of nasopharyngeal carcinoma based on deep semantic segmentation network
CN109801293B (en) * 2019-01-08 2023-07-14 平安科技(深圳)有限公司 Remote sensing image segmentation method and device, storage medium and server
CN109801293A (en) * 2019-01-08 2019-05-24 平安科技(深圳)有限公司 Remote Sensing Image Segmentation, device and storage medium, server
CN110009573B (en) * 2019-01-29 2022-02-01 北京奇艺世纪科技有限公司 Model training method, image processing method, device, electronic equipment and storage medium
CN110009573A (en) * 2019-01-29 2019-07-12 北京奇艺世纪科技有限公司 Model training, image processing method, device, electronic equipment and computer readable storage medium
CN110047047A (en) * 2019-04-17 2019-07-23 广东工业大学 Method, apparatus, equipment and the storage medium of three-dimensional appearance image information interpretation
CN110837811A (en) * 2019-11-12 2020-02-25 腾讯科技(深圳)有限公司 Method, device and equipment for generating semantic segmentation network structure and storage medium
CN111091560A (en) * 2019-12-19 2020-05-01 广州柏视医疗科技有限公司 Nasopharyngeal carcinoma primary tumor image identification method and system
CN111340047A (en) * 2020-02-28 2020-06-26 江苏实达迪美数据处理有限公司 Image semantic segmentation method and system based on multi-scale feature and foreground and background contrast
CN111340047B (en) * 2020-02-28 2021-05-11 江苏实达迪美数据处理有限公司 Image semantic segmentation method and system based on multi-scale feature and foreground and background contrast
CN111582043B (en) * 2020-04-15 2022-03-15 电子科技大学 High-resolution remote sensing image ground object change detection method based on multitask learning
CN111582043A (en) * 2020-04-15 2020-08-25 电子科技大学 High-resolution remote sensing image ground object change detection method based on multitask learning
CN113269093A (en) * 2021-05-26 2021-08-17 大连民族大学 Method and system for detecting visual characteristic segmentation semantics in video description
CN113269093B (en) * 2021-05-26 2023-08-22 大连民族大学 Visual feature segmentation semantic detection method and system in video description
CN117211758A (en) * 2023-11-07 2023-12-12 克拉玛依市远山石油科技有限公司 Intelligent drilling control system and method for shallow hole coring
CN117211758B (en) * 2023-11-07 2024-04-02 克拉玛依市远山石油科技有限公司 Intelligent drilling control system and method for shallow hole coring

Similar Documents

Publication Publication Date Title
CN107180430A (en) A kind of deep learning network establishing method and system suitable for semantic segmentation
CN109299274B (en) Natural scene text detection method based on full convolution neural network
WO2022147965A1 (en) Arithmetic question marking system based on mixnet-yolov3 and convolutional recurrent neural network (crnn)
CN106650721B (en) A kind of industrial character identifying method based on convolutional neural networks
CN108182441B (en) Parallel multichannel convolutional neural network, construction method and image feature extraction method
Mahapatra et al. Retinal image quality classification using saliency maps and CNNs
CN107610087B (en) Tongue coating automatic segmentation method based on deep learning
CN111553837B (en) Artistic text image generation method based on neural style migration
CN107133955B (en) A kind of collaboration conspicuousness detection method combined at many levels
CN107122375A (en) The recognition methods of image subject based on characteristics of image
CN107862261A (en) Image people counting method based on multiple dimensioned convolutional neural networks
CN106709568A (en) RGB-D image object detection and semantic segmentation method based on deep convolution network
CN107169421A (en) A kind of car steering scene objects detection method based on depth convolutional neural networks
CN106372648A (en) Multi-feature-fusion-convolutional-neural-network-based plankton image classification method
CN106096538A (en) Face identification method based on sequencing neural network model and device
CN107066916A (en) Scene Semantics dividing method based on deconvolution neutral net
CN112115993B (en) Zero sample and small sample evidence photo anomaly detection method based on meta-learning
CN110516512B (en) Training method of pedestrian attribute analysis model, pedestrian attribute identification method and device
CN107423747A (en) A kind of conspicuousness object detection method based on depth convolutional network
CN112069900A (en) Bill character recognition method and system based on convolutional neural network
CN114048822A (en) Attention mechanism feature fusion segmentation method for image
CN107220655A (en) A kind of hand-written, printed text sorting technique based on deep learning
CN108710893A (en) A kind of digital image cameras source model sorting technique of feature based fusion
CN109344713A (en) A kind of face identification method of attitude robust
CN107610062A (en) The quick identification and bearing calibration of piecture geometry fault based on BP neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20170919)