CN107180430A - Method and system for constructing a deep learning network suitable for semantic segmentation - Google Patents
- Publication number: CN107180430A
- Application number: CN201710342354.8A
- Authority: CN (China)
- Prior art keywords: network, image, deep learning, file
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a method and system for constructing a deep learning network suitable for semantic segmentation. Building on semantic segmentation with a deconvolution network, and taking into account that a conditional random field optimizes edges well, the method interprets the conditional random field as a recurrent network and integrates it into the deconvolution network for end-to-end training, so that the parameters of the convolutional network and of the recurrent network are learned interactively and a better integrated network is obtained. The joint training of the deconvolution network and the conditional random field proposed by the invention yields stronger detail and shape information and solves the problem of inaccurate segmentation at image edges. Combined with multi-scale input and a multi-scale pooling strategy, it also addresses the cases in semantic segmentation where, because of a single receptive field, large objects are over-segmented or small objects are missed. The invention extends the classical deconvolution network with conditional random field joint training and a multi-feature information fusion strategy, improving the accuracy of semantic segmentation.
Description
Technical field
The invention belongs to the technical field of computer vision and, more particularly, relates to a method and system for constructing a deep learning network suitable for semantic segmentation.
Background art
With the explosive growth of data on the Internet, big-data image and video processing has become a popular research direction, and deep learning has become an indispensable research tool for big data. Although deep learning has a short history and its theoretical foundations are incomplete, new methods for constructing deep networks keep emerging, and their results in computer vision are notable. Visual perception with deep learning is modelled on the visual mechanism of the human brain: the multi-layer network is designed as a hierarchical visual information-processing system. Human visual processing has the following stages: the pupil captures pixels, the cerebral cortex then finds edges and orientations, the shape of the object is abstracted from the edges, and finally the category of the object is abstracted. A deep network is similar: the low levels extract edge features, the intermediate levels extract shape features and abstract them further, and finally higher-level features describing the whole target or its behaviour are obtained and used for classification. As a new milestone in machine learning, deep learning has attracted a growing number of image researchers. The problems addressed include image classification, object recognition and semantic segmentation, and the applications include intelligent driver-assistance systems, face recognition and image retrieval.
Image recognition with machine learning generally proceeds as follows: a sensor first acquires the image data; preprocessing and feature extraction follow; feature selection is then performed; finally, recognition and prediction are carried out on the selected features. The purpose of preprocessing, feature extraction and feature selection is to find a feature representation on which a classifier can work. The quality of the feature representation is often the decisive factor in the final recognition accuracy. Early feature representations were all hand-crafted; selecting features by hand is complicated and laborious, and years of research brought little improvement in recognition accuracy. The most representative example, the scale-invariant feature transform (SIFT), is invariant to rotation, scaling and brightness changes, yet it still cannot achieve good recognition results on highly variable images. Deep learning, as an unsupervised feature-learning process, learns useful features automatically; backed by large amounts of training data and powerful computing capacity, it has become a focus of computer vision research. A two-dimensional image can be used directly as the network input, which avoids the manual feature extraction and data reconstruction required by traditional algorithms. Convolutional neural networks have two main advantages. First, features are learned directly by training the convolutional layers, which avoids manual feature extraction, and the trained feature extractor is more robust. Second, the neurons of a convolutional layer share weights, so the network can learn in parallel and the number of parameters to be trained is reduced. Convolutional neural networks recognize two-dimensional patterns well under translation, scaling and other forms of distortion, and have become the preferred model of many computer vision researchers.
Among the many computer vision problems, semantic image segmentation is an important and complex one. Semantic segmentation differs from image classification and detection: classification and detection understand an image at the image level, whereas semantic segmentation understands it at the pixel level. Given an image, the goal of semantic segmentation is to classify every pixel in it. Traditional segmentation algorithms mainly address problems such as foreground-background segmentation and clustering of image content; these do not attach semantic labels to object categories, so the segmented blocks need further processing before they can be used in practice. With a convolutional neural network, training and prediction can be carried out end to end: given a dataset for semantic segmentation and a designed and trained network structure, the semantic segmentation result is obtained directly.
Since convolutional neural networks from deep learning were first applied to computer vision tasks, many researchers have taken an interest in semantic image segmentation and have proposed numerous convolutional neural networks for it. Compared with earlier conventional methods, deep learning frameworks perform semantic segmentation much better. Although networks that segment reasonably well can already be designed, the results still do not hold for all kinds of images: the diversity of images means that a very large amount of training data must be prepared, and interference between categories prevents pixel-level prediction from being highly accurate.
Summary of the invention
In view of the above defects of, or needs for improvement in, the prior art, the object of the present invention is to provide a method and system for constructing a deep learning network suitable for semantic segmentation, thereby solving the technical problem that existing convolutional neural networks for semantic segmentation have relatively low segmentation accuracy.
To achieve the above object, according to one aspect of the present invention, there is provided a method for constructing a deep learning network suitable for semantic segmentation, comprising:
S1, performing a multi-scale transform on the images in a dataset, wherein the images in the dataset have been labelled by category;
S2, taking the multi-scale-transformed images and their corresponding labels as the input of the deep learning network, and then modifying the network structure file and the solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the solver file includes the training parameter settings;
S3, iteratively optimizing the output of the deconvolution network in the mean-field iteration layer using the mean-field iterative algorithm;
S4, according to the modified network structure file and solver file, jointly training the deconvolution network and the conditional random field to obtain a target deep learning network, the target deep learning network being able to perform semantic segmentation on a multi-scale-transformed image to be tested.
Preferably, step S2 specifically comprises the following sub-steps:
S2.1, sending the multi-scale-transformed images and their corresponding labels as input to a program that builds a leveldb database, thereby converting them into files that Caffe can use directly;
S2.2, setting, in the network structure file in Caffe, the types of the convolutional layers and pooling layers and their operating parameters, and performing a multi-scale pooling operation on the last pooling layer, in which the input image is divided into the regions corresponding to the multi-scale pooling and the value obtained for each region is fed into the last pooling layer;
S2.3, adding an implementation of the mean-field algorithm to the Caffe framework of the deep learning network;
S2.4, updating the IDs (M, N) in caffe.proto and setting the parameters SIMPLE_FAST_MEANFIELD = M and MULTI_STAGE_MEANFIELD = N, where M and N are positive integers;
S2.5, modifying the training file and the test file among the network structure files in the Caffe framework of the deep learning network, adding the corresponding mean-field iteration layer;
S2.6, setting, in the training file, the network model, the base learning rate, the learning-rate update policy, the weight of the previous gradient update, the maximum number of iterations and the run mode.
Preferably, step S3 specifically comprises the following sub-steps:
S3.1, obtaining the feedback input of the mean-field iteration from
$V_1(t)=\begin{cases}\operatorname{softmax}(U), & t=0\\ V_2(t-1), & 0<t\le T\end{cases}$
where $V_2(t)=f_\theta(U,V_1(t),I)$, $0\le t\le T$, denotes the output of the mean-field iteration;
S3.2, obtaining the final output from
$Y(t)=\begin{cases}0, & 0\le t<T\\ V_2(t), & t=T\end{cases}$
where softmax denotes the probability normalization operation, $U$ is the output of the deconvolution network, $t$ is the current iteration, $T$ is the total number of iterations, $V_1$ and $V_2$ are intermediate variables of the iteration, $I$ is the input two-dimensional image after the multi-scale transform, $f_\theta$ is the computation performed by the mean-field iterative algorithm, $\theta$ denotes the conditional random field parameters to be trained, specifically the weight coefficient of each Gaussian kernel function and the coefficients between the pairwise relations, and $Y(t)$ is the final semantic segmentation output.
Preferably, the final mean-field iteration output $V_2(t)$ is computed as follows:
A1, initializing the unary potential function $U_i(l)$ with the coarse semantic segmentation result of the deconvolution network, and obtaining the normalized probability from $Q_i(l)=\frac{1}{Z_i}\exp(U_i(l))$, where $Z_i=\sum_l \exp(U_i(l))$, $l$ is the category label and $U_i(l)$ is the probability that pixel $i$ belongs to category $l$;
A2, propagating the mutual influence between pixel class labels through the Gaussian kernel functions $k_m(p_i,p_j)$ and taking their weighted sum with the coefficients $\omega^{(m)}$, expressed by the following formula:
$Q_i(l)=\sum_m \omega^{(m)}\sum_{j\ne i} k_m(p_i,p_j)\,Q_j(l)$
where $i$ and $j$ denote pixels, $p_i$ and $p_j$ denote the pixel values of the corresponding pixels, and $k_m(p_i,p_j)$ denotes the $m$-th Gaussian kernel function;
A3, obtaining the mutual influence between pixel class labels from the coefficients $\mu(l,l')$ that balance the pairwise relations: $Q_i(l)=\sum_{l'\in L}\mu(l,l')\,Q_i(l')$, where $l'$ denotes a category different from $l$ and $L$ denotes the set of all categories;
A4, adding the unary potential function $U_i(l)$, specifically: $Q_i(l)=U_i(l)-Q_i(l)$;
A5, updating $Q_i(l)$ by $Q_i(l)=\frac{1}{Z_i}\exp(Q_i(l))$, where $Z_i=\sum_l \exp(Q_i(l))$ is computed from the values obtained in step A4, taking the output as the new input, and returning to step A2 until convergence or until the maximum number of iterations is reached; the $Q_i(l)$ finally obtained is the mean-field iteration output $V_2(t)$.
Preferably, step S4 specifically comprises the following sub-steps:
S4.1, sending the multi-scale-transformed images and their corresponding labels as input to the deep learning network;
S4.2, extracting the features of the target regions of the image with the convolutional network, and restoring the detail information and shape information of the target regions with the deconvolution network, to obtain the actual output probability of each category;
S4.3, calculating the difference between the actual output probabilities and the labels;
S4.4, according to the difference, adjusting the convolution kernel parameters, the bias vector parameters and the optimization parameters of the conditional random field by back-propagation so as to minimize the error.
Preferably, the multi-scale transform uses 3 scales, namely 0.5, 1 and 1.5, each denoting the factor by which the original image is scaled.
Preferably, the multi-scale pooling uses 3 different scales, namely 1 × 1, 2 × 2 and 4 × 4, which divide the image into 1 region, 4 regions and 16 regions respectively.
According to another aspect of the present invention, there is provided a system for constructing a deep learning network suitable for semantic segmentation, comprising:
an image transform module, configured to perform a multi-scale transform on the images in a dataset, wherein the images in the dataset have been labelled by category;
a setup module, configured to take the multi-scale-transformed images and their corresponding labels as the input of the deep learning network, and then to modify the network structure file and the solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the solver file includes the training parameter settings;
an optimization module, configured to iteratively optimize the output of the deconvolution network in the mean-field iteration layer using the mean-field iterative algorithm;
a joint training module, configured to jointly train the deconvolution network and the conditional random field according to the modified network structure file and solver file, to obtain a target deep learning network, the target deep learning network being able to perform semantic segmentation on a multi-scale-transformed image to be tested.
In general, compared with the prior art, the method of the invention achieves the following beneficial effects:
(1) Building on the deconvolution-network semantic segmentation method, and taking into account that a conditional random field optimizes edges well, the invention interprets the conditional random field as a recurrent network and integrates it into the deconvolution network for end-to-end training, so that the parameters of the convolutional network and of the recurrent network are learned interactively and a better deep learning network is obtained.
(2) The invention proposes joint training of a deconvolution network and a conditional random field; the resulting parameters are more robust, stronger detail and shape information can be obtained, and the problem of inaccurate segmentation at image edges is solved.
(3) The invention changes the receptive field of the neural network by inputting multi-scale images and using a multi-scale pooling strategy. The varying receptive field preserves the integrity of both large and small objects, so that the trained deep learning network can handle the cases in semantic segmentation where, because of a single receptive field, large objects are over-segmented or small objects are missed.
(4) The invention extends the classical deconvolution network with conditional random field joint training and a multi-feature information fusion strategy, so that the trained deep learning network segments images with higher accuracy.
Brief description of the drawings
Fig. 1 is a network framework diagram of a deep learning network suitable for semantic segmentation disclosed in an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a method for constructing a deep learning network suitable for semantic segmentation disclosed in an embodiment of the present invention;
Fig. 3 is a schematic diagram of a mean-field iteration method disclosed in an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it. In addition, the technical features involved in the embodiments described below may be combined with one another provided they do not conflict.
Fig. 1 shows the network framework of a deep learning network suitable for semantic segmentation disclosed in an embodiment of the present invention. The framework shown in Fig. 1 comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer (i.e. a CRF-RNN layer).
Fig. 2 is a schematic flow chart of a method for constructing a deep learning network suitable for semantic segmentation disclosed in an embodiment of the present invention. The method mainly comprises the following steps: 1) data collection and preprocessing; 2) modification of the relevant files of the network framework (Convolutional Architecture for Fast Feature Embedding, Caffe); 3) the mean-field iteration process; 4) joint training of the conditional random field and the deconvolution network. It is implemented as follows:
S1, performing a multi-scale transform on the images in a dataset, wherein the images in the dataset have been labelled by category;
Here the multi-scale transform uses 3 scales, namely 0.5, 1 and 1.5, each denoting the factor by which the original image is scaled.
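Step S1 can be sketched as follows. This is a minimal illustration only: the text does not state which interpolation is used, so nearest-neighbour sampling is assumed here for both images and labels (it never creates class values that were absent from the original label map), and the helper names `resize_nearest` and `multi_scale_pairs` are hypothetical.

```python
import numpy as np

SCALES = (0.5, 1.0, 1.5)  # the three scales named in the text


def resize_nearest(arr, scale):
    """Resize an (H, W) or (H, W, C) array by nearest-neighbour sampling (assumed)."""
    h, w = arr.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Map every output row/column back to the closest source index.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return arr[rows][:, cols]


def multi_scale_pairs(image, label):
    """Return one (image, label) pair per scale, resized with the same factor."""
    return [(resize_nearest(image, s), resize_nearest(label, s)) for s in SCALES]
```

Resizing the label map with the same factor as the image keeps the per-pixel correspondence that pixel-level training relies on.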
S2, taking the multi-scale-transformed images and their corresponding labels as the input of the deep learning network, and then modifying the network structure file and the solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the solver file includes the training parameter settings;
Step S2 specifically comprises the following sub-steps:
S2.1, sending the multi-scale-transformed images and their corresponding labels as input to a program that builds a leveldb database, thereby converting them into files that Caffe can use directly;
S2.2, setting, in the network structure file in Caffe, the types of the convolutional layers and pooling layers and their operating parameters, and performing a multi-scale pooling operation on the last pooling layer, in which the input image is divided into the regions corresponding to the multi-scale pooling and the value obtained for each region is fed into the pooling layer;
S2.3, adding an implementation of the mean-field algorithm to the Caffe framework of the deep learning network;
S2.4, updating the IDs (M, N) in caffe.proto and setting the parameters SIMPLE_FAST_MEANFIELD = M and MULTI_STAGE_MEANFIELD = N, where M and N are positive integers; preferably, M is 54 and N is 55;
S2.5, modifying the training file and the test file in the Caffe framework of the deep learning network, adding the corresponding mean-field iteration layer, meanfield;
The multi-scale operation is added to the network architecture part of the training file. Multi-scale pooling uses 3 different scales, namely 1 × 1, 2 × 2 and 4 × 4, which divide the image into 1 region, 4 regions and 16 regions respectively; the value obtained for each region is fed into the pooling layer. The core training file solver.prototxt must be configured; the main settings are the network model name (train.prototxt, used during training), the base learning rate (base_lr = 0.01), the learning-rate update policy (lr_policy: "step"), the weight of the previous gradient update (momentum: 0.9), the maximum number of iterations (max_iter: 20000) and the run mode (GPU).
S2.6, setting, in the training file, the network model, the base learning rate, the learning-rate update policy, the weight of the previous gradient update, the maximum number of iterations and the run mode.
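The multi-scale pooling with 1 × 1, 2 × 2 and 4 × 4 grids described above can be illustrated with a small NumPy sketch. It is not the Caffe layer itself: the function name `pyramid_pool` is hypothetical, and max pooling within each region is an assumption, since the text only says that a value is obtained for each region.

```python
import numpy as np


def pyramid_pool(feature_map, grids=(1, 2, 4), reduce=np.max):
    """Pool a (C, H, W) feature map over 1x1, 2x2 and 4x4 grids.

    Returns an array of shape (C, 1 + 4 + 16). H and W are assumed to be
    at least as large as the finest grid so that no region is empty.
    """
    c, h, w = feature_map.shape
    pooled = []
    for g in grids:
        # Region boundaries that split the map into g x g roughly equal cells.
        row_edges = np.linspace(0, h, g + 1).astype(int)
        col_edges = np.linspace(0, w, g + 1).astype(int)
        for r in range(g):
            for q in range(g):
                region = feature_map[:,
                                     row_edges[r]:row_edges[r + 1],
                                     col_edges[q]:col_edges[q + 1]]
                pooled.append(reduce(region, axis=(1, 2)))  # one value per channel
    return np.stack(pooled, axis=1)
```

Because the coarse grid summarizes the whole map while the fine grid keeps local responses, the concatenated output mixes several receptive-field sizes, which is the effect the text attributes to multi-scale pooling.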
S3, iteratively optimizing the output of the deconvolution network in the mean-field iteration layer using the mean-field iterative algorithm;
Fig. 3 is a schematic diagram of a mean-field iteration method disclosed in an embodiment of the present invention, comprising:
S3.1, obtaining the feedback input of the mean-field iteration from
$V_1(t)=\begin{cases}\operatorname{softmax}(U), & t=0\\ V_2(t-1), & 0<t\le T\end{cases}$
where $V_2(t)=f_\theta(U,V_1(t),I)$, $0\le t\le T$, denotes the output of the mean-field iteration;
S3.2, obtaining the final output from
$Y(t)=\begin{cases}0, & 0\le t<T\\ V_2(t), & t=T\end{cases}$
where softmax denotes the probability normalization operation, $U$ is the output of the deconvolution network (i.e. the coarse semantic segmentation result), $t$ is the current iteration, $T$ is the total number of iterations, $V_1$ and $V_2$ are intermediate variables of the iteration, $I$ is the input two-dimensional image after the multi-scale transform, $f_\theta$ is the computation performed by the mean-field iterative algorithm, $\theta$ denotes the conditional random field parameters to be trained, specifically the weight coefficient of each Gaussian kernel function and the coefficients between the pairwise relations, and $Y(t)$ is the final semantic segmentation output.
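The recurrent structure of S3.1 and S3.2 can be sketched as an unrolled loop. The sketch assumes the usual reading of the recurrence, namely that the first input is softmax(U), that each later input is the previous output, and that only the output at t = T is emitted; `f_theta` is passed in as an arbitrary callable, and the function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the class axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def unroll_mean_field(U, image, f_theta, T):
    """Apply f_theta once for every t in 0..T and return V2(T) as the output Y."""
    V1 = softmax(U, axis=0)          # assumed initial input V1(0)
    V2 = V1
    for _ in range(T + 1):           # the stated range is 0 <= t <= T
        V2 = f_theta(U, V1, image)   # V2(t) = f_theta(U, V1(t), I)
        V1 = V2                      # feedback: V1(t + 1) = V2(t)
    return V2
```

Because the same `f_theta` with the same parameters is applied at every step, gradients with respect to the conditional random field parameters accumulate over all iterations during back-propagation, which is what allows joint training with the deconvolution network.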
The final mean-field iteration output $V_2(t)$ is computed as follows:
A1, initializing the unary potential function $U_i(l)$ with the coarse semantic segmentation result of the deconvolution network, and obtaining the normalized probability from $Q_i(l)=\frac{1}{Z_i}\exp(U_i(l))$, where $Z_i=\sum_l \exp(U_i(l))$, $l$ is the category label and $U_i(l)$ is the probability that pixel $i$ belongs to category $l$;
A2, propagating the mutual influence between pixel class labels through the Gaussian kernel functions $k_m(p_i,p_j)$ and taking their weighted sum with the coefficients $\omega^{(m)}$, expressed by the following formula:
$Q_i(l)=\sum_m \omega^{(m)}\sum_{j\ne i} k_m(p_i,p_j)\,Q_j(l)$
where $i$ and $j$ denote pixels, $p_i$ and $p_j$ denote the pixel values of the corresponding pixels, and $k_m(p_i,p_j)$ denotes the $m$-th Gaussian kernel function; the number of Gaussian kernel functions can be chosen according to actual needs;
For example, two different Gaussian kernel functions are used, i.e. $m$ takes the values 1 and 2, and $\theta_\alpha$, $\theta_\beta$, $\theta_\gamma$ are set to 160, 3 and 3 respectively; in the kernel functions, $p_i$ and $p_j$ are the colour values of pixels $i$ and $j$, and $I_i$ and $I_j$ are the position coordinates of pixels $i$ and $j$.
A3, obtaining the mutual influence between pixel class labels from the coefficients $\mu(l,l')$ that balance the pairwise relations: $Q_i(l)=\sum_{l'\in L}\mu(l,l')\,Q_i(l')$, where $l'$ denotes a category different from $l$ and $L$ denotes the set of all categories;
Here the differences between the categories are the main consideration: the smaller the difference between two categories, the smaller their coefficient $\mu(l,l')$, whose value ranges from -1 to 0.
A4, adding the unary potential function $U_i(l)$, specifically: $Q_i(l)=U_i(l)-Q_i(l)$;
A5, updating $Q_i(l)$ by $Q_i(l)=\frac{1}{Z_i}\exp(Q_i(l))$, where $Z_i=\sum_l \exp(Q_i(l))$ is computed from the values obtained in step A4, taking the output as the new input, and returning to step A2 until convergence or until the maximum number of iterations is reached; the $Q_i(l)$ finally obtained is the mean-field iteration output $V_2(t)$.
Preferably, the maximum number of iterations is 10.
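Steps A1 to A5 can be illustrated on a handful of pixels with dense kernel matrices. This is a sketch under stated assumptions rather than the patented layer: the helper names are hypothetical, the kernels are plain isotropic Gaussians over whatever feature vector is supplied, self-messages are excluded, and the compatibility matrix `mu` is simply an input whose values the caller chooses.

```python
import numpy as np


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def gaussian_kernel(features, theta):
    """Dense (N, N) Gaussian kernel over per-pixel feature vectors, zero diagonal."""
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    k = np.exp(-d2 / (2.0 * theta ** 2))
    np.fill_diagonal(k, 0.0)  # a pixel sends no message to itself (assumed)
    return k


def mean_field(U, kernels, weights, mu, max_iter=10, tol=1e-6):
    """U: (N, L) unary scores; kernels: list of (N, N); mu: (L, L) compatibility."""
    Q = softmax(U)                                                   # A1: normalise
    for _ in range(max_iter):
        message = sum(w * (K @ Q) for w, K in zip(weights, kernels))  # A2: weighted filtering
        pairwise = message @ mu.T                                    # A3: sum over l' of mu(l, l') * message(l')
        Q_new = softmax(U - pairwise)                                # A4 and A5: add unary, renormalise
        converged = np.abs(Q_new - Q).max() < tol
        Q = Q_new
        if converged:
            break
    return Q
```

For example, with three pixels in a row, a Potts-style `mu` (1 off the diagonal, 0 on it, chosen purely for illustration) and a single position kernel, a weakly mislabelled middle pixel is pulled toward the label of its two confident neighbours. A dense N × N kernel is only practical for tiny inputs; it is used here to keep the arithmetic of each step visible.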
The softmax regression model generalizes the logistic regression model for two-class problems so that it can be applied to multi-class problems. Specifically, for a training set $\{(x^{(1)},y^{(1)}),\ldots,(x^{(l)},y^{(l)})\}$, $x^{(i)}$ is a training sample (here, the pixel value of each pixel) and $y^{(i)}\in\{1,2,\ldots,k\}$ is the label of the corresponding pixel. For each input $x$, the convolutional neural network must produce the probability that it belongs to each class. Denoting simply by $\theta$ the parameters of the whole network to be trained, this can be expressed by the hypothesis function:
$h_\theta(x^{(i)})=\begin{bmatrix}p(y^{(i)}=1\mid x^{(i)};\theta)\\ \vdots\\ p(y^{(i)}=k\mid x^{(i)};\theta)\end{bmatrix}=\frac{1}{\sum_{j=1}^{k}e^{\theta_j^{T}x^{(i)}}}\begin{bmatrix}e^{\theta_1^{T}x^{(i)}}\\ \vdots\\ e^{\theta_k^{T}x^{(i)}}\end{bmatrix}$
where $\theta_1,\theta_2,\ldots,\theta_k$ are the model parameters to be trained, and the factor $\frac{1}{\sum_{j=1}^{k}e^{\theta_j^{T}x^{(i)}}}$ normalizes the output probabilities so that they sum to 1.
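The softmax hypothesis can be checked numerically with a few lines of NumPy. The function name and the toy parameter values are illustrative assumptions; subtracting the maximum score is a standard numerical-stability step that leaves the result unchanged.

```python
import numpy as np


def softmax_hypothesis(theta, x):
    """theta: (k, d) parameter matrix, x: (d,) sample. Returns k class probabilities."""
    scores = theta @ x               # one score theta_j^T x per class
    scores = scores - scores.max()   # stability shift; cancels in the ratio
    e = np.exp(scores)
    return e / e.sum()               # dividing by the sum makes the outputs sum to 1
```

With k = 2 the expression reduces to the logistic function of the score difference, which is the sense in which softmax regression generalizes the two-class model.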
S4, according to the modified network structure file and solver file, jointly training the deconvolution network and the conditional random field to obtain a target deep learning network, the target deep learning network being able to perform semantic segmentation on a multi-scale-transformed image to be tested.
Step S4 is mainly the joint training, in which the whole network is trained as one. The collected dataset and the corresponding labels are input to the network. The convolutional and pooling layers extract features step by step, and the deconvolution network restores the detail information and shape information of the target, yielding a probability map for each class in the region. The probability maps and the original image are then input to the mean-field iteration process, which iterates to produce optimized output probability maps. Finally, the difference between the actual output and the labels is calculated, the convolution kernel parameters, the bias vector parameters and the optimization parameters of the conditional random field are adjusted by back-propagation so as to minimize the error, and the network parameters are saved for testing. The step specifically comprises the following sub-steps:
S4.1, sending the multi-scale-transformed images and their corresponding labels as input to the deep learning network;
S4.2, extracting the features of the target regions of the image with the convolutional network, and restoring the detail information and shape information of the target regions with the deconvolution network, to obtain the actual output probability of each category;
S4.3, calculating the difference between the actual output probabilities and the labels;
S4.4, according to the difference, adjusting the convolution kernel parameters, the bias vector parameters and the optimization parameters of the conditional random field by back-propagation so as to minimize the error.
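The text does not name the measure used in S4.3 for the difference between the output probabilities and the labels. One common choice for pixel-wise classification is the mean per-pixel cross-entropy, sketched below purely as an assumption; the function name is hypothetical.

```python
import numpy as np


def pixel_cross_entropy(probs, labels, eps=1e-12):
    """Mean cross-entropy over pixels (an assumed loss, not stated in the text).

    probs:  (L, H, W) class probabilities per pixel
    labels: (H, W) integer class index per pixel
    """
    h, w = labels.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    picked = probs[labels, rows, cols]   # probability assigned to the true class
    return float(-np.log(picked + eps).mean())
```

The loss is zero when every pixel assigns probability 1 to its labelled class and grows as probability mass moves to other classes, so minimizing it by back-propagation pushes all trainable parameters, including those of the conditional random field, toward the labelled segmentation.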
Once the network has been trained, a test image can be multi-scale transformed to the three scales 0.5, 1 and 1.5 and the resulting images fed in turn into the trained deep network. The probability maps of the multi-scale images are summed and normalized to obtain the final probability map, from which the final semantic segmentation result is obtained.
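The fusion of the per-scale probability maps at test time can be sketched as follows. The sketch assumes the maps have already been resized to a common resolution, which the text does not describe, and that the final label of each pixel is the class with the highest fused probability; the function name is hypothetical.

```python
import numpy as np


def fuse_probability_maps(prob_maps):
    """Sum per-scale (L, H, W) probability maps, renormalise, and pick labels."""
    total = np.sum(prob_maps, axis=0)                    # sum over scales
    fused = total / total.sum(axis=0, keepdims=True)     # per-pixel probabilities sum to 1
    return fused, fused.argmax(axis=0)                   # final probability map and label map
```

Summing before taking the maximum lets a scale that is confident about a pixel outweigh scales that are uncertain, instead of giving every scale an equal vote.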
Those skilled in the art will readily understand that the foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (8)
1. A method for constructing a deep learning network suitable for semantic segmentation, characterized by comprising:
S1, performing a multi-scale transform on the images in a dataset, wherein the images in the dataset have been labelled by category;
S2, taking the multi-scale-transformed images and their corresponding labels as the input of the deep learning network, and then modifying the network structure file and the solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the solver file includes the training parameter settings;
S3, iteratively optimizing the output of the deconvolution network in the mean-field iteration layer using the mean-field iterative algorithm;
S4, according to the modified network structure file and solver file, jointly training the deconvolution network and the conditional random field to obtain a target deep learning network, the target deep learning network being able to perform semantic segmentation on a multi-scale-transformed image to be tested.
2. The method according to claim 1, characterized in that step S2 specifically comprises the following sub-steps:
S2.1, sending the multi-scale-transformed images and their corresponding labels as input to a program that builds a leveldb, and running it, so that they are converted into a file that Caffe can use directly;
S2.2, setting the types of the convolutional layers and pooling layers in the Caffe network structure file and the operating parameters in the network structure file, performing a multi-scale pooling operation on the last pooling layer, dividing the input image into multiple regions corresponding to the multi-scale pooling, and obtaining the value of each region and filling it into the last pooling layer;
S2.3, adding an implementation of the mean-field algorithm to the Caffe framework of the deep learning network;
S2.4, updating the IDs (M, N) in caffe.proto and setting the parameters SIMPLE_FAST_MEANFIELD=M, MULTI_STAGE_MEANFIELD=N, wherein M and N are positive integers;
S2.5, modifying the training file and the test file among the network structure files in the Caffe framework of the deep learning network, and adding the corresponding mean-field iteration layer;
S2.6, setting the network model, the base learning rate, the learning-rate update policy, the weight of the previous gradient update, the maximum number of iterations and the run mode in the training file.
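As a rough illustration of the settings listed in S2.6, the following Python sketch writes the text of a Caffe solver file. The field names `net`, `base_lr`, `lr_policy`, `momentum`, `max_iter` and `solver_mode` are standard Caffe solver fields; mapping "the weight of the previous gradient update" to `momentum`, the file name `train_val.prototxt` and every numeric value are assumptions that the claims do not state.

```python
def build_solver_text(net_path, base_lr, lr_policy, momentum, max_iter, solver_mode):
    """Return the text of a Caffe solver file covering the settings in S2.6."""
    if solver_mode not in ("CPU", "GPU"):
        raise ValueError("solver_mode must be 'CPU' or 'GPU'")
    lines = [
        'net: "%s"' % net_path,           # network model (structure file)
        "base_lr: %g" % base_lr,          # base learning rate
        'lr_policy: "%s"' % lr_policy,    # learning-rate update policy
        "momentum: %g" % momentum,        # weight of the previous gradient update (assumed mapping)
        "max_iter: %d" % max_iter,        # maximum number of iterations
        "solver_mode: %s" % solver_mode,  # run mode
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values, chosen only for illustration.
solver_text = build_solver_text(
    net_path="train_val.prototxt",
    base_lr=1e-3,
    lr_policy="fixed",
    momentum=0.9,
    max_iter=20000,
    solver_mode="GPU",
)
```

Generating the file from a function keeps the six settings in one place, which makes it easier to vary them between training runs.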
3. The method according to claim 1, characterized in that step S3 specifically comprises the following sub-steps:
S3.1, obtaining the feedback input of the mean-field iteration by [formula not reproduced], wherein $V_2(t) = f_\theta(U, V_1(t), I)$, $0 \le t \le T$, denotes the output of the mean-field iteration;
S3.2, obtaining the final output by [formula not reproduced], wherein softmax is the probability normalization operation, $U$ is the output of the deconvolution network, $t$ denotes the current iteration, $T$ denotes the total number of iterations, $V_1$ and $V_2$ are intermediate variables of the iteration, $I$ is the input two-dimensional image after multi-scale transformation, $f_\theta$ is the computation performed by the mean-field iteration algorithm, $\theta$ denotes the parameters of the conditional random field that need to be trained, including the weight coefficient of each Gaussian kernel function and the coefficients between pairwise relationships, and $Y(t)$ is the final semantic segmentation output.
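The two formulas referred to in S3.1 and S3.2 did not survive in this text. One reading that is consistent with the symbol definitions given in the claim, and with the recurrent formulation of Zheng et al., "Conditional Random Fields as Recurrent Neural Networks" (listed among the non-patent citations), is sketched below. It is an assumption and not the wording of the claim.

```latex
V_1(t) =
\begin{cases}
  \operatorname{softmax}(U), & t = 0 \\
  V_2(t-1),                  & 0 < t \le T
\end{cases}
\qquad
V_2(t) = f_\theta\bigl(U, V_1(t), I\bigr), \quad 0 \le t \le T
\qquad
Y(t) =
\begin{cases}
  0,      & 0 \le t < T \\
  V_2(T), & t = T
\end{cases}
```

Under this reading, the first iteration starts from the normalized deconvolution output, every later iteration feeds the previous result back in, and only the result of the last iteration is emitted as the segmentation.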
4. The method according to claim 3, characterized in that the final mean-field iteration output $V_2(t)$ is computed as follows:
A1, initializing the unary potential function $U_i(l)$ with the coarse semantic segmentation result of the deconvolution network, and obtaining the normalized probability by [formula not reproduced], wherein $Z_i = \sum_l \exp(U_i(l))$, $l$ is the class label, and $U_i(l)$ is the probability that pixel $i$ belongs to class $l$;
A2, passing the mutual influence between pixel class labels through the Gaussian kernel functions $k_m(p_i, p_j)$ and taking their weighted sum with the coefficients $\omega^{(m)}$, expressed by the following formulas:
$$Q_i^{(m)}(l) = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l)$$
$$Q_i(l) = \sum_m \omega^{(m)}\, Q_i^{(m)}(l)$$
wherein $i$, $j$ denote pixels, $p_i$, $p_j$ denote the pixel values of the corresponding pixels, and $k_m(p_i, p_j)$ denotes the $m$-th Gaussian kernel function;
A3, obtaining the mutual influence between pixel class labels according to the coefficients $\mu(l, l')$ that balance the pairwise relationships: $Q_i(l) = \sum_{l' \in L} \mu(l, l')\, Q_i(l')$, wherein $l'$ denotes a class different from $l$ and $L$ denotes the set of all classes;
A4, adding the unary potential function $U_i(l)$, specifically: $Q_i(l) = U_i(l) - Q_i(l)$;
A5, updating $Q_i(l)$ by [formula not reproduced], taking the output as the new input and jumping to step A2 until convergence or until the maximum number of iterations is reached, wherein $Z_i = \sum_l \exp(U_i(l))$, and the finally obtained $Q_i(l)$ is the mean-field iteration output $V_2(t)$.
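Steps A1 to A5 can be sketched in plain Python on a handful of pixels. This is a minimal sketch under stated assumptions rather than the patented implementation: the normalization in A1 and A5 is taken to be a softmax over the current scores (the corresponding formulas are missing from this text), the Gaussian kernel has the form $\exp(-\lVert f_i - f_j \rVert^2 / 2\sigma^2)$, and the names `bandwidths`, `weights`, `mu`, `max_iter` and `tol` are hypothetical.

```python
import math


def softmax(row):
    """Normalize one pixel's scores into probabilities."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kernel(fi, fj, bandwidth):
    """Gaussian kernel on two feature vectors (assumed form)."""
    sq = sum((a - b) ** 2 for a, b in zip(fi, fj))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def mean_field(unary, feats, bandwidths, weights, mu, max_iter=5, tol=1e-6):
    """Run steps A1-A5 on a small set of pixels.

    unary[i][l] -- U_i(l) from the deconvolution network
    feats[i]    -- feature vector of pixel i
    bandwidths  -- one bandwidth per Gaussian kernel (hypothetical parameter)
    weights[m]  -- omega^(m)
    mu[l][lp]   -- coefficient mu(l, l')
    """
    n, num_labels = len(unary), len(unary[0])
    q = [softmax(u) for u in unary]  # A1: initialization (assumed softmax)
    for _ in range(max_iter):
        new_q = []
        for i in range(n):
            # A2: message passing per kernel, then weighted sum over kernels
            msg = [0.0] * num_labels
            for w, bw in zip(weights, bandwidths):
                for j in range(n):
                    if j == i:
                        continue
                    kern = gaussian_kernel(feats[i], feats[j], bw)
                    for l in range(num_labels):
                        msg[l] += w * kern * q[j][l]
            # A3: apply the coefficients mu(l, l')
            pair = [sum(mu[l][lp] * msg[lp] for lp in range(num_labels))
                    for l in range(num_labels)]
            # A4: add the unary potential
            score = [unary[i][l] - pair[l] for l in range(num_labels)]
            # A5: normalize (assumed softmax over the updated scores)
            new_q.append(softmax(score))
        change = max(abs(a - b) for ra, rb in zip(new_q, q) for a, b in zip(ra, rb))
        q = new_q
        if change < tol:  # stop on convergence, otherwise at max_iter
            break
    return q


# Three pixels on a line, two classes; the middle pixel is uncertain.
unary = [[2.0, 0.0], [0.2, 0.0], [2.0, 0.0]]
feats = [[0.0], [1.0], [2.0]]
mu = [[0.0, 1.0], [1.0, 0.0]]  # penalize neighbours that carry a different label
result = mean_field(unary, feats, [1.0], [1.0], mu)
```

In this toy example the two confident neighbours pull the uncertain middle pixel toward class 0. The double loop over pixels costs quadratic time, so it is only suitable for illustration.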
5. The method according to claim 4, characterized in that step S4 specifically comprises the following sub-steps:
S4.1, sending the multi-scale-transformed images and their corresponding labels as input to the deep learning network;
S4.2, extracting the features of the target region of the image with the convolutional network, and restoring the detail information and shape information of the target region with the deconvolution network, to obtain the actual output probability of each class;
S4.3, calculating the difference between the actual output probabilities and the labels;
S4.4, according to the difference, adjusting the convolution kernel parameters and the bias vector parameters, and optimizing the parameters of the conditional random field, by back-propagation that minimizes the error.
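Steps S4.3 and S4.4 can be illustrated for a single pixel as follows. The claim does not name the measure used for "the difference"; this sketch assumes a softmax cross-entropy, which is a common choice for per-pixel classification, and the function name `pixel_loss_and_grad` is hypothetical.

```python
import math


def softmax(scores):
    """Turn class scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_loss_and_grad(scores, label):
    """Cross-entropy between the predicted class probabilities and the label of one pixel.

    Returns the loss and its gradient with respect to the scores,
    which is the quantity that back-propagation passes to earlier layers.
    """
    probs = softmax(scores)
    loss = -math.log(probs[label])
    grad = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    return loss, grad


loss, grad = pixel_loss_and_grad([2.0, 0.5, -1.0], label=0)
```

The gradient is the predicted probability minus the one-hot label, so it is negative for the correct class and positive for the others, and a gradient step raises the score of the correct class.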
6. The method according to any one of claims 1 to 5, characterized in that the multi-scale transformation uses 3 scales, namely 0.5, 1 and 1.5, each denoting a scaling of the original image by the corresponding factor.
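The three scales can be illustrated with a small pure-Python sketch that resizes an image and its label map together. Nearest-neighbour sampling is an assumption made here for brevity (it also keeps label values valid); the claims do not specify the interpolation method, and the function names are hypothetical.

```python
def resize_nearest(grid, scale):
    """Resize a 2-D list by `scale` using nearest-neighbour sampling."""
    src_h, src_w = len(grid), len(grid[0])
    dst_h = max(1, int(round(src_h * scale)))
    dst_w = max(1, int(round(src_w * scale)))
    out = []
    for y in range(dst_h):
        sy = min(src_h - 1, int(y / scale))
        out.append([grid[sy][min(src_w - 1, int(x / scale))] for x in range(dst_w)])
    return out


def multi_scale(image, labels, scales=(0.5, 1, 1.5)):
    """Return one (image, labels) pair per scale."""
    return [(resize_nearest(image, s), resize_nearest(labels, s)) for s in scales]


image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
labels = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
pairs = multi_scale(image, labels)
```

Resizing the label map with the same sampling as the image keeps every pixel aligned with its class label at each scale.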
7. The method according to any one of claims 1 to 5, characterized in that the multi-scale pooling uses 3 different scales, namely 1 × 1, 2 × 2 and 4 × 4, which divide the image into 1 region, 4 regions and 16 regions respectively.
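The region layout of claim 7 can be sketched as follows: the map is split into 1 × 1, 2 × 2 and 4 × 4 grids and one value is taken per region, giving 1 + 4 + 16 = 21 values. Taking the maximum of each region is an assumption; the claims only say that the value of each region is obtained. The function name `multi_scale_pool` is hypothetical.

```python
import math


def multi_scale_pool(fmap, grids=(1, 2, 4)):
    """Pool a 2-D feature map over 1x1, 2x2 and 4x4 grids of regions (max per region)."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for n in grids:
        for r in range(n):
            # Row range covered by grid row r
            top, bottom = (r * h) // n, math.ceil((r + 1) * h / n)
            for c in range(n):
                # Column range covered by grid column c
                left, right = (c * w) // n, math.ceil((c + 1) * w / n)
                pooled.append(max(fmap[y][x]
                                  for y in range(top, bottom)
                                  for x in range(left, right)))
    return pooled


fmap = [[y * 8 + x for x in range(8)] for y in range(8)]
features = multi_scale_pool(fmap)
```

Because the number of regions is fixed, the output always has 21 values per map regardless of the input size, which is convenient when the inputs have been rescaled.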
8. A deep learning network construction system suitable for semantic segmentation, characterized by comprising:
an image transformation module, configured to perform a multi-scale transformation on the images in a data set, wherein the images in the data set have been labeled by category;
a setup module, configured to use the multi-scale-transformed images and their corresponding labels as the input of a deep learning network, and then to modify the network structure file and the network solver file in the Caffe framework of the deep learning network, wherein the deep learning network comprises, in order, a convolutional network, a deconvolution network and a mean-field iteration layer, the modification of the network structure file includes the network settings for multi-scale pooling, and the modification of the network solver file includes the training parameter settings;
an optimization module, configured to iteratively optimize the output of the deconvolution network in the mean-field iteration layer using a mean-field iteration algorithm;
a joint training module, configured to obtain a target deep learning network, according to the modified network structure file and network solver file, by jointly training the deconvolution network and the conditional random field, wherein the target deep learning network can perform semantic segmentation on a multi-scale-transformed image to be tested.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710342354.8A CN107180430A (en) | 2017-05-16 | 2017-05-16 | A kind of deep learning network establishing method and system suitable for semantic segmentation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710342354.8A CN107180430A (en) | 2017-05-16 | 2017-05-16 | A kind of deep learning network establishing method and system suitable for semantic segmentation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107180430A (en) | 2017-09-19 |
Family
ID=59832220
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710342354.8A Pending CN107180430A (en) | 2017-05-16 | 2017-05-16 | A kind of deep learning network establishing method and system suitable for semantic segmentation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107180430A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104361363A (en) * | 2014-11-25 | 2015-02-18 | 中国科学院自动化研究所 | Deep deconvolution feature learning network, generating method thereof and image classifying method |
CN105975968A (en) * | 2016-05-06 | 2016-09-28 | 西安理工大学 | Caffe architecture based deep learning license plate character recognition method |
CN106157307A (en) * | 2016-06-27 | 2016-11-23 | 浙江工商大学 | A kind of monocular image depth estimation method based on multiple dimensioned CNN and continuous CRF |
CN106372390A (en) * | 2016-08-25 | 2017-02-01 | 汤平 | Deep convolutional neural network-based lung cancer preventing self-service health cloud service system |
Non-Patent Citations (5)
Title |
---|
JONATHAN LONG ET AL.: "Fully Convolutional Networks for Semantic Segmentation", 《IEEE IN COMPUTER VISION AND PATTERN RECOGNITION》 * |
LIANG-CHIEH CHEN ET AL.: "Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs", 《ICLR 2015》 * |
SHUAI ZHENG ET AL.: "Conditional Random Fields as Recurrent Neural Networks", 《2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION(ICCV)》 * |
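LIU DAN ET AL.: "A Multi-Scale CNN Algorithm for Image Semantic Segmentation", 《REMOTE SENSING INFORMATION》 * |
SHAN SHIGUANG ET AL.: "Deep Learning: The Revival and Transformation of Multilayer Neural Networks", 《SCIENCE & TECHNOLOGY REVIEW》 * |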
Cited By (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107730514A (en) * | 2017-09-29 | 2018-02-23 | 北京奇虎科技有限公司 | Scene cut network training method, device, computing device and storage medium |
CN107730514B (en) * | 2017-09-29 | 2021-02-12 | 北京奇宝科技有限公司 | Scene segmentation network training method and device, computing equipment and storage medium |
CN107729992A (en) * | 2017-10-27 | 2018-02-23 | 深圳市未来媒体技术研究院 | A kind of deep learning method based on backpropagation |
CN107729992B (en) * | 2017-10-27 | 2020-12-29 | 深圳市未来媒体技术研究院 | Deep learning method based on back propagation |
CN108010049A (en) * | 2017-11-09 | 2018-05-08 | 华南理工大学 | Split the method in human hand region in stop-motion animation using full convolutional neural networks |
CN108053376A (en) * | 2017-12-08 | 2018-05-18 | 长沙全度影像科技有限公司 | A kind of semantic segmentation information guiding deep learning fisheye image correcting method |
US11410277B2 (en) | 2018-01-05 | 2022-08-09 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method and device for blurring image background, storage medium and electronic apparatus |
CN110009556A (en) * | 2018-01-05 | 2019-07-12 | 广东欧珀移动通信有限公司 | Image background weakening method, device, storage medium and electronic equipment |
CN108345887A (en) * | 2018-01-29 | 2018-07-31 | 清华大学深圳研究生院 | The training method and image, semantic dividing method of image, semantic parted pattern |
CN108335313A (en) * | 2018-02-26 | 2018-07-27 | 阿博茨德(北京)科技有限公司 | Image partition method and device |
CN108830854A (en) * | 2018-03-22 | 2018-11-16 | 广州多维魔镜高新科技有限公司 | A kind of image partition method and storage medium |
CN108765431B (en) * | 2018-05-25 | 2022-07-15 | 中国科学院重庆绿色智能技术研究院 | Image segmentation method and application thereof in medical field |
CN108765431A (en) * | 2018-05-25 | 2018-11-06 | 中国科学院重庆绿色智能技术研究院 | A kind of dividing method of image and its application in medical domain |
CN108876796A (en) * | 2018-06-08 | 2018-11-23 | 长安大学 | A kind of lane segmentation system and method based on full convolutional neural networks and condition random field |
CN109145939B (en) * | 2018-07-02 | 2021-11-02 | 南京师范大学 | Semantic segmentation method for small-target sensitive dual-channel convolutional neural network |
CN109145939A (en) * | 2018-07-02 | 2019-01-04 | 南京师范大学 | A kind of binary channels convolutional neural networks semantic segmentation method of Small object sensitivity |
CN109145713A (en) * | 2018-07-02 | 2019-01-04 | 南京师范大学 | A kind of Small object semantic segmentation method of combining target detection |
CN111178495A (en) * | 2018-11-10 | 2020-05-19 | 杭州凝眸智能科技有限公司 | Lightweight convolutional neural network for detecting very small objects in images |
CN109657715B (en) * | 2018-12-12 | 2024-02-06 | 广东省机场集团物流有限公司 | Semantic segmentation method, device, equipment and medium |
CN109657715A (en) * | 2018-12-12 | 2019-04-19 | 广东工业大学 | A kind of semantic segmentation method, apparatus, equipment and medium |
CN109670577A (en) * | 2018-12-14 | 2019-04-23 | 北京字节跳动网络技术有限公司 | Model generating method and device |
CN109829885A (en) * | 2018-12-24 | 2019-05-31 | 中山大学 | A kind of automatic identification nasopharyngeal carcinoma primary tumo(u)r method based on deep semantic segmentation network |
CN109829885B (en) * | 2018-12-24 | 2022-07-22 | 广州柏视医疗科技有限公司 | Method for automatically identifying primary tumor of nasopharyngeal carcinoma based on deep semantic segmentation network |
CN109801293B (en) * | 2019-01-08 | 2023-07-14 | 平安科技(深圳)有限公司 | Remote sensing image segmentation method and device, storage medium and server |
CN109801293A (en) * | 2019-01-08 | 2019-05-24 | 平安科技(深圳)有限公司 | Remote Sensing Image Segmentation, device and storage medium, server |
CN110009573B (en) * | 2019-01-29 | 2022-02-01 | 北京奇艺世纪科技有限公司 | Model training method, image processing method, device, electronic equipment and storage medium |
CN110009573A (en) * | 2019-01-29 | 2019-07-12 | 北京奇艺世纪科技有限公司 | Model training, image processing method, device, electronic equipment and computer readable storage medium |
CN110047047A (en) * | 2019-04-17 | 2019-07-23 | 广东工业大学 | Method, apparatus, equipment and the storage medium of three-dimensional appearance image information interpretation |
CN110837811A (en) * | 2019-11-12 | 2020-02-25 | 腾讯科技(深圳)有限公司 | Method, device and equipment for generating semantic segmentation network structure and storage medium |
CN111091560A (en) * | 2019-12-19 | 2020-05-01 | 广州柏视医疗科技有限公司 | Nasopharyngeal carcinoma primary tumor image identification method and system |
CN111340047A (en) * | 2020-02-28 | 2020-06-26 | 江苏实达迪美数据处理有限公司 | Image semantic segmentation method and system based on multi-scale feature and foreground and background contrast |
CN111340047B (en) * | 2020-02-28 | 2021-05-11 | 江苏实达迪美数据处理有限公司 | Image semantic segmentation method and system based on multi-scale feature and foreground and background contrast |
CN111582043B (en) * | 2020-04-15 | 2022-03-15 | 电子科技大学 | High-resolution remote sensing image ground object change detection method based on multitask learning |
CN111582043A (en) * | 2020-04-15 | 2020-08-25 | 电子科技大学 | High-resolution remote sensing image ground object change detection method based on multitask learning |
CN113269093A (en) * | 2021-05-26 | 2021-08-17 | 大连民族大学 | Method and system for detecting visual characteristic segmentation semantics in video description |
CN113269093B (en) * | 2021-05-26 | 2023-08-22 | 大连民族大学 | Visual feature segmentation semantic detection method and system in video description |
CN117211758A (en) * | 2023-11-07 | 2023-12-12 | 克拉玛依市远山石油科技有限公司 | Intelligent drilling control system and method for shallow hole coring |
CN117211758B (en) * | 2023-11-07 | 2024-04-02 | 克拉玛依市远山石油科技有限公司 | Intelligent drilling control system and method for shallow hole coring |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107180430A (en) | A kind of deep learning network establishing method and system suitable for semantic segmentation | |
CN109299274B (en) | Natural scene text detection method based on full convolution neural network | |
WO2022147965A1 (en) | Arithmetic question marking system based on mixnet-yolov3 and convolutional recurrent neural network (crnn) | |
CN106650721B (en) | A kind of industrial character identifying method based on convolutional neural networks | |
CN108182441B (en) | Parallel multichannel convolutional neural network, construction method and image feature extraction method | |
Mahapatra et al. | Retinal image quality classification using saliency maps and CNNs | |
CN107610087B (en) | Tongue coating automatic segmentation method based on deep learning | |
CN111553837B (en) | Artistic text image generation method based on neural style migration | |
CN107133955B (en) | A kind of collaboration conspicuousness detection method combined at many levels | |
CN107122375A (en) | The recognition methods of image subject based on characteristics of image | |
CN107862261A (en) | Image people counting method based on multiple dimensioned convolutional neural networks | |
CN106709568A (en) | RGB-D image object detection and semantic segmentation method based on deep convolution network | |
CN107169421A (en) | A kind of car steering scene objects detection method based on depth convolutional neural networks | |
CN106372648A (en) | Multi-feature-fusion-convolutional-neural-network-based plankton image classification method | |
CN106096538A (en) | Face identification method based on sequencing neural network model and device | |
CN107066916A (en) | Scene Semantics dividing method based on deconvolution neutral net | |
CN112115993B (en) | Zero sample and small sample evidence photo anomaly detection method based on meta-learning | |
CN110516512B (en) | Training method of pedestrian attribute analysis model, pedestrian attribute identification method and device | |
CN107423747A (en) | A kind of conspicuousness object detection method based on depth convolutional network | |
CN112069900A (en) | Bill character recognition method and system based on convolutional neural network | |
CN114048822A (en) | Attention mechanism feature fusion segmentation method for image | |
CN107220655A (en) | A kind of hand-written, printed text sorting technique based on deep learning | |
CN108710893A (en) | A kind of digital image cameras source model sorting technique of feature based fusion | |
CN109344713A (en) | A kind of face identification method of attitude robust | |
CN107610062A (en) | The quick identification and bearing calibration of piecture geometry fault based on BP neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20170919 |