CN110458841A - A method of improving image segmentation operating rate - Google Patents
- Publication number: CN110458841A
- Application number: CN201910535642.4A
- Authority: CN (China)
- Prior art keywords: convolution, size, convolution kernel, image, kernel
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion.)
Classifications
- G06F18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06N3/045: Combinations of networks
- G06T1/20: Processor architectures; Processor configuration, e.g. pipelining
- G06T7/10: Segmentation; Edge detection
Abstract
A method of improving the operating rate of image segmentation, comprising: Step 1, designing multi-scale dilated convolution kernels; Step 2, designing a channel convolution network; Step 3, designing a fully convolutional connection and deconvolution network. Through the network of deconvolution and fully convolutional operations, the invention can be applied to images of arbitrary size and can perform semantic analysis on every pixel of the image, so that images are segmented quickly and image features are located quickly and accurately.
Description
Technical field
The present invention relates to a method for improving the operating rate of image segmentation.
Technical background
In recent years, with the rapid development of computer science and technology, computer-based image processing and image object detection have advanced at an unprecedented pace. Deep learning, which extracts key target features by learning from massive numbers of digital images, has surpassed human performance in object detection and has repeatedly surprised the industry. With the resurgence of neural networks, video image methods based on convolutional neural networks have become the mainstream technology for image segmentation and recognition, using techniques such as template matching, edge feature extraction and histograms of gradients to recognise images accurately. Although neural-network-based image feature detection can effectively recognise targets in complex scenes and far outperforms traditional methods, it still has shortcomings: (1) its resistance to noise is weak; (2) using Dropout to address over-fitting improves the convolutional neural network model and its parameters, but accuracy declines slightly; (3) introducing deformable convolution and separable convolution structures improves the generalisation of the model and strengthens its feature extraction ability, but recognition performance for targets in complex scenes remains poor; (4) a newer image segmentation approach, end-to-end segmentation, directly predicts the class of each image pixel and achieves pixel-level localisation of target objects, but the models suffer from large parameter amounts, low efficiency and coarse segmentation. In short, traditional detection methods and video image methods are cumbersome, have low recognition accuracy and low recognition efficiency, and produce coarse segmentation.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides a fully convolutional method for improving the operating rate of image segmentation. The invention uses a deep learning framework and optimises and improves the convolutional neural network: channel convolution is used to reduce the parameter amount of the model, and multi-scale dilated convolution is used to enrich the image features and to solve the problem of the small receptive field of traditional networks.
To achieve the above object, the invention adopts the following technical scheme:
A method of improving the operating rate of image segmentation, comprising the following steps:
Step 1: design multi-scale dilated convolution kernels.
Traditional networks enlarge the receptive field by means of conventional convolution and max pooling. To solve the problems of that approach, the present invention uses dilated (atrous) convolution kernels: a sampling rate, rate, is added to the traditional convolution kernel so that the original kernel becomes "fluffy", that is, its weights are spread apart. This enlarges the receptive field while keeping the original amount of computation, so that the information available for image segmentation is sufficiently accurate.
The receptive-field size of a dilated kernel is computed from its sampling rate, where F is the receptive-field size of the current layer and rate is the sampling rate of the dilated kernel, i.e. the spacing between kernel elements; a traditional kernel can be regarded as having rate = 1, and the dilated kernel here as having rate = 2. The receptive field of traditional convolution is computed layer by layer, where F_{i-1} is the receptive-field size of the previous layer, k_i is the size of the convolution or pooling kernel of layer i, n is the total number of convolution layers, and s_i is the convolution stride of layer i.
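The receptive-field definitions above can be illustrated with a short Python sketch. The formulas themselves are not reproduced in this text, so the two expressions below are the commonly used ones and are an assumption about what the patent intends: a dilated kernel of size k and sampling rate rate covers k + (k - 1)(rate - 1) positions, and the receptive field of a stack of layers grows by (effective kernel size - 1) times the product of all earlier strides. The function names are hypothetical.

```python
def effective_kernel_size(k: int, rate: int) -> int:
    """Span covered by a k-tap kernel whose taps are `rate` positions apart.

    Assumed expression (not taken from the patent text): k + (k - 1) * (rate - 1).
    rate = 1 is an ordinary kernel.
    """
    return k + (k - 1) * (rate - 1)


def receptive_field(layers) -> int:
    """Receptive field of a stack of layers.

    `layers` is a list of (kernel_size, stride, rate) tuples, ordered from the
    input to the output. Standard layer-by-layer recursion (an assumption here):
    each layer adds (effective_kernel - 1) * (product of earlier strides).
    """
    field = 1  # a single input pixel
    jump = 1   # product of the strides of all earlier layers
    for kernel_size, stride, rate in layers:
        field += (effective_kernel_size(kernel_size, rate) - 1) * jump
        jump *= stride
    return field


if __name__ == "__main__":
    # One 3x3 layer with rate 2 sees as far as two stacked ordinary 3x3 layers.
    print(receptive_field([(3, 1, 2)]))
    print(receptive_field([(3, 1, 1), (3, 1, 1)]))
```

Under these assumptions, a single 3 × 3 kernel with rate = 2 reaches the same receptive field as two stacked ordinary 3 × 3 layers while keeping nine weights, which matches the claim that the receptive field grows without extra computation.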
Drawing on the idea of multi-scale image transformation, multi-scale dilated convolution is designed by diversifying the sampling rate and the kernel size, so that feature extraction adapts to targets of different sizes. In the multi-scale dilated convolution computation, y[i] is the convolution sum at the i-th stride position; K is the convolution kernel; k is the coordinate of a parameter within the kernel, k ∈ K; w[k] is the kernel weight; and rate is the sampling rate, which can take the values 1, 2 and 3.
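A minimal one-dimensional sketch of the multi-scale dilated convolution described above is given here. It assumes the usual form y[i] = Σ_k x[i + rate·k]·w[k], where x is the input signal; x and this exact form are assumptions, since the formula is not reproduced in the text. The function names are hypothetical.

```python
def dilated_conv1d(x, w, rate=1):
    """1-D dilated convolution without padding.

    Assumed form: y[i] = sum over k of x[i + rate * k] * w[k].
    """
    span = (len(w) - 1) * rate      # distance between first and last tap
    out_len = len(x) - span
    if out_len <= 0:
        raise ValueError("input is shorter than the dilated kernel span")
    y = []
    for i in range(out_len):
        y.append(sum(x[i + rate * k] * w[k] for k in range(len(w))))
    return y


def multi_scale_dilated_conv1d(x, w, rates=(1, 2, 3)):
    """Apply the same kernel at several sampling rates (1, 2, 3 as in the text)."""
    return {rate: dilated_conv1d(x, w, rate) for rate in rates}


if __name__ == "__main__":
    signal = list(range(8))          # 0, 1, ..., 7
    kernel = [1, 1, 1]
    for rate, y in multi_scale_dilated_conv1d(signal, kernel).items():
        print(rate, y)
```

The same three weights are reused at every rate; only the spacing between the sampled inputs changes, which is why larger rates see a wider context at the same cost per output.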
Step 2: design the channel convolution network.
Because traditional convolution is always a dimension-raising operation, channel convolution is used from the outset to reduce the dimensionality of the feature convolution. First, the traditional convolution is replaced by a two-layer convolution, similar to the group operation in ResNet. Without affecting accuracy, this new structure shortens the computation time to about 1/8 of the original and reduces the parameter amount to about 1/9 of the original. It is well suited to mobile devices, enables real-time target detection, and gives a marked model compression effect.
For traditional convolution, assume that the input has M feature channels, that the kernel width and height are both D_k, and that there are N kernels. Each position the convolution slides to then involves N × M·D_k·D_k parameters, and the sliding stride is s. The size of the feature map after sliding is determined by the input size, the kernel size, the padding and the stride, where h′ and w′ are the height and width after convolution and pad is the padding added to the width and height borders. Each point of the h′ × w′ output therefore corresponds to N × M·D_k·D_k parameters, giving a total parameter amount of
$$N \cdot M \cdot D_k \cdot D_k \cdot h' \cdot w' \tag{6}$$
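The quantities in formula (6) can be computed with a small Python sketch. The output-size expression used here, (size - D_k + 2·pad) / s + 1 with integer division, is the standard one and is an assumption, because the patent's own output-size formula is not reproduced in the text. Since the count is multiplied by h′·w′, it behaves like a count of multiplications over the whole feature map. The function names are hypothetical.

```python
def conv_output_size(size: int, kernel: int, stride: int, pad: int) -> int:
    """Standard output size of a convolution along one axis (assumed form)."""
    return (size - kernel + 2 * pad) // stride + 1


def standard_conv_amount(h, w, m, n, dk, stride, pad):
    """Formula (6): N * M * Dk * Dk * h' * w' for a traditional convolution."""
    h_out = conv_output_size(h, dk, stride, pad)
    w_out = conv_output_size(w, dk, stride, pad)
    return n * m * dk * dk * h_out * w_out


if __name__ == "__main__":
    # Hypothetical layer: 8x8 input, 3 input channels, 16 kernels of size 3x3.
    print(conv_output_size(8, 3, 1, 1))
    print(standard_conv_amount(8, 8, 3, 16, 3, 1, 1))
```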
With the improved channel convolution, the convolution is split into two steps:
1) A D_k × D_k × M convolution is applied to each of the M channels separately. Sliding with the same stride s gives an output of size h′ × w′, so the parameter amount of this step is
$$D_k \cdot D_k \cdot M \cdot h' \cdot w' \tag{7}$$
2) N convolution kernels of size 1 × 1 are then applied to raise the feature dimension. With stride 1, features are extracted again from the feature map produced by the previous step: each of the original M channel features is processed by the N kernels, so the parameter amount of this step is
$$M \cdot N \cdot h' \cdot w' \cdot 1 \cdot 1 \tag{8}$$
Combining the two steps, the final parameter amount of channel convolution is
$$D_k \cdot D_k \cdot M \cdot h' \cdot w' + M \cdot N \cdot h' \cdot w' \tag{9}$$
Dividing formula (9) by formula (6), the ratio of the parameter amount of the improved channel convolution to that of the traditional convolution is
$$\frac{D_k \cdot D_k \cdot M \cdot h' \cdot w' + M \cdot N \cdot h' \cdot w'}{N \cdot M \cdot D_k \cdot D_k \cdot h' \cdot w'} = \frac{1}{N} + \frac{1}{D_k^2} \tag{10}$$
From formula (10), with a 3 × 3 kernel the channel convolution reduces the parameter amount to roughly 1/9 of the original when N is large, since the 1/N term then becomes negligible.
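The comparison between formulas (6) and (9) can be checked numerically with the sketch below. The layer sizes in the example are hypothetical and chosen only for illustration; the function names are not from the patent.

```python
from fractions import Fraction


def standard_amount(m, n, dk, h_out, w_out):
    """Formula (6): traditional convolution."""
    return n * m * dk * dk * h_out * w_out


def channel_amount(m, n, dk, h_out, w_out):
    """Formula (9): per-channel Dk x Dk step (7) plus 1x1 step (8)."""
    per_channel = dk * dk * m * h_out * w_out
    pointwise = m * n * h_out * w_out
    return per_channel + pointwise


def amount_ratio(m, n, dk, h_out, w_out):
    """Exact ratio (9) / (6); algebraically equal to 1/N + 1/Dk^2."""
    return Fraction(channel_amount(m, n, dk, h_out, w_out),
                    standard_amount(m, n, dk, h_out, w_out))


if __name__ == "__main__":
    # Hypothetical layer: 32 input channels, 64 kernels, 3x3, 56x56 output.
    ratio = amount_ratio(32, 64, 3, 56, 56)
    print(ratio, float(ratio))
```

For a 3 × 3 kernel the ratio approaches 1/9 only as N grows; with N = 64 it is 1/64 + 1/9, slightly above 1/9.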
Step 3: design the fully convolutional connection and deconvolution network.
The final layers of a traditional network structure use a fixed size, so input pictures must be converted to a fixed size in advance, which hinders obtaining the vehicle-length coordinates of logistics vehicles. In addition, the fully connected layers of a traditional network lose the spatial coordinates of specific positions, which distorts the spatial information of the image and prevents the target from being located effectively and accurately. To solve this problem of information loss, the present invention uses a fully convolutional connection to locate the position coordinates of features in the picture accurately.
The fully connected layers of a traditional network flatten the output of the preceding convolutional part, [b, c, h, w], into [b, c·h·w], i.e. [b, 4096], and then map it to [b, cls], where b is the batch size and cls is the number of classes. A fully convolutional network instead follows the convolutional part with 1 × 1 convolutions and has no fully connected layer, hence the name.
In the fully convolutional computation, 1 ≤ n ≤ N; y_n[i][j] is the value at position (i, j) after convolution with the n-th kernel; s_i is the horizontal convolution stride; s_j is the vertical convolution stride; k_n is the n-th convolution kernel; D_k is the width and height of the kernel, the kernel size corresponding to D_k·D_k in Step 2; δ_i and δ_j are positions within the kernel, with 0 ≤ δ_i, δ_j ≤ D_k; and the layer has N different kernels in total. The sliding convolution of the kernels can be converted into a multiplication of two matrices, which gives the convolution result for the corresponding image pixels.
In that product, the left matrix has dimensions [N, M·D_k·D_k], the right matrix has dimensions [M·D_k·D_k, w′·h′], and the result after convolution has dimensions [N, w′·h′]. In the right matrix, I denotes the image (img) and its subscripts are, in order, the image width and the image height, i.e. I_wh.
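The conversion of sliding convolution into a product of two matrices can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: no padding, the same stride in both directions, and a row ordering of (channel, kernel row, kernel column) for both matrices. The helper names im2col, matmul and conv_as_matmul are hypothetical and not taken from the patent.

```python
def im2col(image, dk, stride=1):
    """Unfold an image into the right-hand matrix [M*Dk*Dk, h'*w'].

    `image` is a list of M channels, each an h x w nested list.
    """
    h, w = len(image[0]), len(image[0][0])
    h_out = (h - dk) // stride + 1
    w_out = (w - dk) // stride + 1
    rows = []
    for channel in image:
        for di in range(dk):
            for dj in range(dk):
                rows.append([channel[i * stride + di][j * stride + dj]
                             for i in range(h_out) for j in range(w_out)])
    return rows, h_out, w_out


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def conv_as_matmul(image, kernels, stride=1):
    """Convolve with N kernels, each shaped [M][Dk][Dk], via one matrix product.

    Returns a matrix of shape [N, h'*w'] together with h' and w'.
    """
    dk = len(kernels[0][0])
    columns, h_out, w_out = im2col(image, dk, stride)
    # Left-hand matrix [N, M*Dk*Dk]: one flattened kernel per row,
    # flattened in the same (channel, row, column) order as im2col.
    weights = [[kernel[c][di][dj]
                for c in range(len(kernel))
                for di in range(dk) for dj in range(dk)]
               for kernel in kernels]
    return matmul(weights, columns), h_out, w_out


if __name__ == "__main__":
    image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # M = 1, 3x3
    kernels = [[[[1, 1], [1, 1]]]]                # N = 1, Dk = 2
    print(conv_as_matmul(image, kernels))
```

Each column of the right-hand matrix holds one receptive window of the input, so every output position becomes a single dot product and the whole layer reduces to one matrix multiplication.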
Finally, a deconvolution operation converts [N, w′·h′] back to the size of the input image. In this way the specific semantic information represented by each pixel can be identified accurately and the loss of spatial information is avoided. The deconvolution operation is equivalent to the inverse operation of convolution: the weights corresponding to the N convolution kernels k_1, …, k_N change from their original values to the values adjusted by training, and the adjusted weights carry the semantic information features of the image.
Therefore, the network of deconvolution and fully convolutional operations can be applied to images of arbitrary size and can perform semantic analysis on every pixel of the image, so that images are segmented quickly and image features are located quickly and accurately.
The advantages of the invention are as follows:
The invention uses a fully convolutional method to improve the operating rate of image segmentation. Its most prominent feature is lightweight processing: while segmentation accuracy is maintained, the segmentation efficiency of the model is improved and its parameter amount is reduced by means of channel convolution. Multi-scale dilated convolution kernels are also provided, which enlarge the receptive field of the model in a reasonable and simple way and enhance its generalisation. The algorithm can be widely applied to image localisation and recognition, for example vehicle recognition in logistics parks.
Brief description of the drawings
Fig. 1 is a schematic diagram of the convolution operation of an existing traditional convolution kernel;
Fig. 2 is a schematic diagram of the convolution operation of the improved dilated convolution kernel of the invention;
Fig. 3a to Fig. 3c show the multi-scale dilated convolution kernels of the invention: Fig. 3a is the dilated kernel with sampling rate 1, Fig. 3b is the dilated kernel with sampling rate 2, and Fig. 3c is the dilated kernel with sampling rate 3;
Fig. 4 shows the existing convolution mode;
Fig. 5 shows the channel convolution mode of the invention;
Fig. 6 shows the channel convolution structure of the invention;
Fig. 7 shows the design structure of the fully convolutional network of the invention;
Fig. 8 is a schematic diagram of the fully convolutional matrix calculation process of the invention.
Note: in Fig. 6, DW is the channel convolution group, a fixed combination of channel convolution kernels; BN is the batch normalization operation, which addresses the change in the data distribution of intermediate layers during training; Conv is a convolution layer operation; ReLU is the rectified linear unit, an activation function.
Note: in Fig. 8, k_1, …, k_N index the convolution kernels, and the accompanying weight symbol denotes the position weight of the n-th convolution kernel.
Specific embodiment
To overcome the above deficiencies of the prior art, the present invention provides a fully convolutional image segmentation method that uses a deep learning framework and optimises and improves the convolutional neural network: channel convolution is used to reduce the parameter amount of the model, and multi-scale dilated convolution is used to enrich the image features and to solve the problem of the small receptive field of traditional networks. The embodiment follows the three steps set out in the Summary of the invention: Step 1, designing multi-scale dilated convolution kernels; Step 2, designing the channel convolution network; Step 3, designing the fully convolutional connection and deconvolution network.
To verify the superiority of the invention, logistics-park vehicles are taken as an example, and the following network model is constructed for a controlled experiment.
First, the network is constructed. Images of four types of logistics vehicles (cargo trucks, dragon wagons, dump trucks and tank trucks) were collected from a logistics park and divided into a training set of 8,000 images, 2,000 per class, and a test set of 4,000 images, 1,000 per class. The parameter configuration of the constructed network architecture is shown in Table 1.
In Table 1: k is the kernel size; s is the stride; p is the padding size; DW is the channel convolution group, a fixed combination of channel convolution kernels; residual summation is used because it helps gradient propagation in large networks; the activation and batch normalization (BN) operations at each layer help accelerate the training of the network; ReLU is the rectified linear unit, an activation function.
Table 1. Parameter design of the network architecture
The computer used in this example is equipped with a Gigabyte NVIDIA GTX 1080 Ti graphics card with 11 GB of video memory, clocked at 1,607 MHz.
Finally, the test performance of the network of this example is compared with that of a traditional network. The results are shown in Table 2.
Table 2. Performance comparison of lightweight segmentation models
In Table 2, the evaluation index MPA is the mean pixel accuracy; MA (mean accuracy) is the ratio of the foreground area to the label area; and MIOU (mean intersection over union) is the ratio of the correctly predicted region to the union of the predicted area and the label area. The unit M·pic⁻¹ is the memory occupied when training on one picture, in megabytes (M); the unit ms·iter⁻¹ is the time needed for one iteration, in milliseconds (ms). With channel convolution, the video memory occupied is reduced by 51%, the training speed is improved by 78% and the test speed is improved by 79%. All evaluation indexes for segmentation and localisation improve substantially, with MIOU improving the most.
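The evaluation indexes above can be illustrated with a short Python sketch that computes mean pixel accuracy and mean intersection over union from a confusion matrix. It uses the usual definitions of these two metrics, which is an assumption about how the patent computes them; the toy labels are made up for illustration and are not experimental data. The function names are hypothetical.

```python
def confusion_matrix(pred, label, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, label):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """Average over classes of (correct pixels of class c) / (label pixels of class c)."""
    scores = [cm[c][c] / sum(cm[c]) for c in range(len(cm)) if sum(cm[c]) > 0]
    return sum(scores) / len(scores)


def mean_iou(cm):
    """Average over classes of intersection / union of prediction and label."""
    scores = []
    for c in range(len(cm)):
        intersection = cm[c][c]
        label_area = sum(cm[c])
        pred_area = sum(row[c] for row in cm)
        union = label_area + pred_area - intersection
        if union > 0:
            scores.append(intersection / union)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy flattened masks: 0 = background, 1 = vehicle (illustrative only).
    label = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    cm = confusion_matrix(pred, label, num_classes=2)
    print(mean_pixel_accuracy(cm), mean_iou(cm))
```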
This example demonstrates that the improved method does improve the test performance of the model, that is, it improves the operating rate of image segmentation.
The content described in the embodiments of this specification merely enumerates forms in which the inventive concept may be realised. The scope of protection of the invention should not be regarded as limited to the specific forms stated in the embodiments; it also extends to equivalent technical means that those skilled in the art can conceive of in accordance with the inventive concept.
Claims (1)
1. A method of improving the operating rate of image segmentation, comprising the following steps:
Step 1: designing multi-scale dilated convolution kernels;
to solve the problems of enlarging the receptive field by means of traditional convolution and max pooling, dilated convolution kernels are used: a sampling rate, rate, is added to the traditional convolution kernel so that the original kernel becomes "fluffy"; this enlarges the receptive field while keeping the original amount of computation, so that the information available for image segmentation is sufficiently accurate; the receptive-field size of a dilated kernel is computed from its sampling rate, where F is the receptive-field size of the current layer and rate is the sampling rate of the dilated kernel, i.e. the spacing between kernel elements; a traditional kernel can be regarded as having rate = 1 and the dilated kernel as having rate = 2; the receptive field of traditional convolution is computed layer by layer, where F_{i-1} is the receptive-field size of the previous layer, k_i is the size of the convolution or pooling kernel of layer i, n is the total number of convolution layers, and s_i is the convolution stride of layer i;
drawing on the idea of multi-scale image transformation, multi-scale dilated convolution is designed by diversifying the sampling rate and the kernel size, so that feature extraction adapts to targets of different sizes; in the multi-scale dilated convolution computation, y[i] is the convolution sum at the i-th stride position; K is the convolution kernel; k is the coordinate of a parameter within the kernel, k ∈ K; w[k] is the kernel weight; and rate is the sampling rate, which can take the values 1, 2 and 3;
Step 2: designing the channel convolution network;
because traditional convolution is always a dimension-raising operation, channel convolution is used from the outset to reduce the dimensionality of the feature convolution; first, the traditional convolution is replaced by a two-layer convolution, similar to the group operation in ResNet; without affecting accuracy, this new structure shortens the computation time to about 1/8 of the original and reduces the parameter amount to about 1/9 of the original, is well suited to mobile devices, enables real-time target detection, and gives a marked model compression effect;
for traditional convolution, assume that the input has M feature channels, that the kernel width and height are both D_k, and that there are N kernels; each position the convolution slides to then involves N × M·D_k·D_k parameters, and the sliding stride is s; the size of the feature map after sliding is determined by the input size, the kernel size, the padding and the stride, where h′ and w′ are the height and width after convolution and pad is the padding added to the width and height borders; each point of the h′ × w′ output therefore corresponds to N × M·D_k·D_k parameters, giving a total parameter amount of
$$N \cdot M \cdot D_k \cdot D_k \cdot h' \cdot w' \tag{6}$$
with the improved channel convolution, the convolution is split into two steps:
1) a D_k × D_k × M convolution is applied to each of the M channels separately; sliding with the same stride s gives an output of size h′ × w′, so the parameter amount of this step is
$$D_k \cdot D_k \cdot M \cdot h' \cdot w' \tag{7}$$
2) N convolution kernels of size 1 × 1 are then applied to raise the feature dimension; with stride 1, features are extracted again from the feature map produced by the previous step; each of the original M channel features is processed by the N kernels, so the parameter amount of this step is
$$M \cdot N \cdot h' \cdot w' \cdot 1 \cdot 1 \tag{8}$$
combining the two steps, the final parameter amount of channel convolution is
$$D_k \cdot D_k \cdot M \cdot h' \cdot w' + M \cdot N \cdot h' \cdot w' \tag{9}$$
dividing formula (9) by formula (6), the ratio of the parameter amount of the improved channel convolution to that of the traditional convolution is
$$\frac{D_k \cdot D_k \cdot M \cdot h' \cdot w' + M \cdot N \cdot h' \cdot w'}{N \cdot M \cdot D_k \cdot D_k \cdot h' \cdot w'} = \frac{1}{N} + \frac{1}{D_k^2} \tag{10}$$
analysis of formula (10) shows that the channel convolution operation reduces the parameter amount;
Step 3: design the fully convolutional connection and the deconvolution network;
The final layer of a traditional network structure has a fixed size, so input pictures must be converted to a fixed size in advance, which hinders obtaining the coordinates of the logistics vehicle length. In addition, the traditional fully connected layer loses the spatial coordinates of the feature positions, which distorts the spatial information of the image and prevents the target from being located accurately. To solve the problem of information loss, a fully convolutional connection is used to locate the position coordinates of features in the picture accurately;
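The difference between the two kinds of final layer can be illustrated with a small NumPy sketch. The names `fc_head` and `conv1x1_head` and all sizes are hypothetical; the point is only that fully connected weights are tied to one input size and discard positions, whereas 1 x 1 convolution weights are reused at every position.

```python
import numpy as np

rng = np.random.default_rng(0)
c, cls = 8, 3


def fc_head(feat, weight):
    """Fully connected head: flattens [c, h, w], so weight fits one input size only."""
    return weight @ feat.reshape(-1)                  # [cls]


def conv1x1_head(feat, weight):
    """1 x 1 convolution head: the same [cls, c] weights are applied at each position."""
    return np.einsum("kc,chw->khw", weight, feat)     # [cls, h, w]


w_fc = rng.normal(size=(cls, c * 4 * 4))    # valid only for 4 x 4 feature maps
w_conv = rng.normal(size=(cls, c))          # independent of h and w

small = rng.normal(size=(c, 4, 4))
large = rng.normal(size=(c, 6, 9))

print(fc_head(small, w_fc).shape)           # class scores only, positions are gone
print(conv1x1_head(small, w_conv).shape)    # one score map per class
print(conv1x1_head(large, w_conv).shape)    # works unchanged on a different size

try:
    fc_head(large, w_fc)
except ValueError as err:
    print("fully connected head rejects the larger input:", err)
```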
The fully connected layer of a traditional network converts the [b, c, h, w] output of the preceding convolutional part to [b, c·h·w], i.e. [b, 4096], and then to [b, cls], where b is the batch size and cls is the number of classes. A fully convolutional network instead follows the convolutional part with 1 × 1 convolutions and has no fully connected layer; hence it is called a fully convolutional network. The fully convolutional calculation is
In the formula: 1 ≤ n ≤ N; yn[i][j] is the value at position (i, j) after convolution with the n-th convolution kernel; si is the horizontal convolution stride; sj is the vertical convolution stride; kn is the n-th convolution kernel; Dk is the width and height of the convolution kernel, whose size corresponds to Dk·Dk in step 2; δi and δj are positions within the convolution kernel, with 0 ≤ δi, δj ≤ Dk; the layer has N different convolution kernels in total. The sliding convolution operation of the kernels can be converted into a multiplication of two matrices; the relation between the image pixels and the convolution result can be expressed as
Where: the dimension of the left matrix is [N, M·Dk·Dk]; the dimension of the right matrix is [M·Dk·Dk, w'·h']; the dimension after convolution is [N, w'·h']; I in the right matrix denotes img, with subscripts giving the image width and image height in turn, i.e. Iwh;
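The conversion from sliding convolution to a product of two matrices can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: `conv_direct` evaluates, for each kernel n and output position (i, j), the sum over δi and δj of the kernel weight times the input pixel at (si·i + δi, sj·j + δj), which is assumed here to be the form the text describes; `im2col` is a hypothetical helper name; no padding is used; both strides are equal.

```python
import numpy as np


def conv_direct(img, kernels, s=1):
    """Sliding-window form: img is [M, h, w], kernels is [N, M, Dk, Dk]."""
    n, m, dk, _ = kernels.shape
    h_out = (img.shape[1] - dk) // s + 1
    w_out = (img.shape[2] - dk) // s + 1
    out = np.zeros((n, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = img[:, i * s:i * s + dk, j * s:j * s + dk]
            out[:, i, j] = np.tensordot(kernels, patch, axes=3)
    return out


def im2col(img, dk, s=1):
    """Unfold every Dk x Dk patch into one column: result is [M*Dk*Dk, h'*w']."""
    m, h, w = img.shape
    h_out = (h - dk) // s + 1
    w_out = (w - dk) // s + 1
    cols = [img[:, i * s:i * s + dk, j * s:j * s + dk].reshape(-1)
            for i in range(h_out) for j in range(w_out)]
    return np.stack(cols, axis=1), (h_out, w_out)


rng = np.random.default_rng(0)
M, N, Dk = 3, 5, 3
img = rng.normal(size=(M, 7, 7))
kernels = rng.normal(size=(N, M, Dk, Dk))

left = kernels.reshape(N, M * Dk * Dk)        # [N, M*Dk*Dk]
right, (h_out, w_out) = im2col(img, Dk)       # [M*Dk*Dk, h'*w']
product = left @ right                        # [N, h'*w']

print(product.shape)
print(np.allclose(product.reshape(N, h_out, w_out), conv_direct(img, kernels)))
```

The matrix form trades memory for speed: the right matrix repeats overlapping pixels, but the whole layer becomes a single matrix multiplication.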
Finally, a deconvolution operation converts [N, w'·h'] back to the image size of the input, so that the specific semantic information represented by each pixel can be identified accurately and loss of spatial information is avoided. The deconvolution operation is equivalent to the inverse operation of the convolution, i.e.
In the formula: the weights corresponding to the convolution kernels k1, ..., kN are changed from their original form; the new weights are adjusted through training and carry the semantic feature information of the image.
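How a deconvolution restores the input size can be sketched with the transposed-matrix view of convolution. This is an assumption about the technique, not a reconstruction of the patent's formula: a single-channel convolution is written as a matrix C, and multiplying by the transpose of C maps the small feature map back to the input shape. `conv_matrix` is a hypothetical helper. The transpose restores the size only, not the original pixel values; in a trained network the deconvolution weights are learned.

```python
import numpy as np


def conv_matrix(kernel, h, w, s=1):
    """Build C of shape [h'*w', h*w] so that C @ img.ravel() is the strided convolution."""
    dk = kernel.shape[0]
    h_out = (h - dk) // s + 1
    w_out = (w - dk) // s + 1
    C = np.zeros((h_out * w_out, h * w))
    for i in range(h_out):
        for j in range(w_out):
            for di in range(dk):
                for dj in range(dk):
                    C[i * w_out + j, (i * s + di) * w + (j * s + dj)] = kernel[di, dj]
    return C, (h_out, w_out)


rng = np.random.default_rng(0)
h, w, Dk, s = 8, 8, 2, 2
img = rng.normal(size=(h, w))
kernel = rng.normal(size=(Dk, Dk))

C, (h_out, w_out) = conv_matrix(kernel, h, w, s)
feat = (C @ img.ravel()).reshape(h_out, w_out)      # convolution: 8 x 8 to 4 x 4
restored = (C.T @ feat.ravel()).reshape(h, w)       # transposed convolution: 4 x 4 to 8 x 8

print(feat.shape, restored.shape)
```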
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910535642.4A CN110458841B (en) | 2019-06-20 | 2019-06-20 | Method for improving image segmentation running speed |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910535642.4A CN110458841B (en) | 2019-06-20 | 2019-06-20 | Method for improving image segmentation running speed |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110458841A (en) | 2019-11-15 |
CN110458841B (en) | 2021-06-08 |
Family
ID=68480779
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910535642.4A Active CN110458841B (en) | 2019-06-20 | 2019-06-20 | Method for improving image segmentation running speed |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110458841B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111626267A (en) * | 2019-09-17 | 2020-09-04 | 山东科技大学 | Hyperspectral remote sensing image classification method using void convolution |
CN111967401A (en) * | 2020-08-19 | 2020-11-20 | 上海眼控科技股份有限公司 | Target detection method, device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169974A (en) * | 2017-05-26 | 2017-09-15 | 中国科学技术大学 | It is a kind of based on the image partition method for supervising full convolutional neural networks more |
US20180260956A1 (en) * | 2017-03-10 | 2018-09-13 | TuSimple | System and method for semantic segmentation using hybrid dilated convolution (hdc) |
CN108776969A (en) * | 2018-05-24 | 2018-11-09 | 复旦大学 | Breast ultrasound image lesion segmentation approach based on full convolutional network |
CN108830855A (en) * | 2018-04-02 | 2018-11-16 | 华南理工大学 | A kind of full convolutional network semantic segmentation method based on the fusion of multiple dimensioned low-level feature |
CN108921196A (en) * | 2018-06-01 | 2018-11-30 | 南京邮电大学 | A kind of semantic segmentation method for improving full convolutional neural networks |
CN109410185A (en) * | 2018-10-10 | 2019-03-01 | 腾讯科技(深圳)有限公司 | A kind of image partition method, device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110458841B (en) | 2021-06-08 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110135267A (en) | A kind of subtle object detection method of large scene SAR image | |
CN105046276B (en) | Hyperspectral image band selection method based on low-rank representation | |
CN110660052A (en) | Hot-rolled strip steel surface defect detection method based on deep learning | |
CN110378383B (en) | Picture classification method based on Keras framework and deep neural network | |
CN111832546B (en) | Lightweight natural scene text recognition method | |
CN111368769B (en) | Ship multi-target detection method based on improved anchor point frame generation model | |
CN109523520A (en) | A kind of chromosome automatic counting method based on deep learning | |
CN110096968A (en) | A kind of ultrahigh speed static gesture identification method based on depth model optimization | |
CN111986125B (en) | Method for multi-target task instance segmentation | |
CN110135430A (en) | A kind of aluminium mold ID automatic recognition system based on deep neural network | |
CN110675411A (en) | Cervical squamous intraepithelial lesion recognition algorithm based on deep learning | |
CN113283419B (en) | Convolutional neural network pointer instrument image reading identification method based on attention | |
CN110458841A (en) | A method of improving image segmentation operating rate | |
CN109447088A (en) | A kind of method and device of breast image identification | |
CN115375711A (en) | Image segmentation method of global context attention network based on multi-scale fusion | |
CN110930378A (en) | Emphysema image processing method and system based on low data demand | |
CN116883393B (en) | Metal surface defect detection method based on anchor frame-free target detection algorithm | |
CN113344110A (en) | Fuzzy image classification method based on super-resolution reconstruction | |
CN109461144A (en) | A kind of method and device of breast image identification | |
CN108846845A (en) | SAR image segmentation method based on thumbnail and hierarchical fuzzy cluster | |
CN114708589A (en) | Cervical cell classification method based on deep learning | |
Yanmin et al. | Research on ear recognition based on SSD_MobileNet_v1 network | |
CN106570882B (en) | The active contour image partition method of mixture gaussian modelling | |
CN113222944A (en) | Cell nucleus segmentation method, system and device and cancer auxiliary analysis system and device based on pathological image | |
CN105404899A (en) | Image classification method based on multi-directional context information and sparse coding model | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||