CN110458841A - A method of improving image segmentation operating rate - Google Patents

A method of improving image segmentation operating rate

Publication number
CN110458841A
Authority
CN
China
Prior art keywords
convolution
size
convolution kernel
image
kernel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910535642.4A
Other languages
Chinese (zh)
Other versions
CN110458841B (en)
Inventor
张烨
樊一超
郭艺玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201910535642.4A
Publication of CN110458841A
Application granted
Publication of CN110458841B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

A method of improving the operating rate of image segmentation, comprising: step 1, designing a multi-scale dilated convolution kernel; step 2, designing a channel convolution network; step 3, designing a fully convolutional connection and a deconvolution network. Through the network of deconvolution and full convolution operations, the invention can handle images of any size and can perform semantic analysis on every pixel of the image, which achieves fast image segmentation and allows image features to be located quickly and accurately.

Description

A method of improving image segmentation operating rate
Technical field
The present invention relates to a method for improving the operating rate of image segmentation.
Background art
In recent years, with the rapid development of computer science and technology, computer-based image processing and image object detection have also advanced at an unprecedented pace. Deep learning extracts key target features by learning from massive numbers of digital images; it has surpassed humans in object detection and has brought the industry one pleasant surprise after another. With the resurgence of neural networks, video image methods based on convolutional neural networks have become the mainstream technology for image segmentation and recognition, using means such as template matching, edge feature extraction and histograms of gradients to recognize images accurately. Although neural-network-based image feature detection can recognize the features of targets in complex scenes effectively, and performs far better than traditional methods, it still has shortcomings: (1) its robustness to noise is weak; (2) using the Dropout method solves the over-fitting problem and improves the convolutional neural network model and parameters, but precision declines slightly; (3) introducing deformable convolution and separable convolution structures improves the generalization of the model and strengthens its feature extraction ability, but recognition performance for targets in complex scenes is still not good enough; (4) a newer image segmentation approach, End-to-End, directly predicts the classification of image pixels and achieves pixel-level localization of target objects, but the model has a large parameter amount, low efficiency and coarse segmentation. In short, traditional detection methods and video image methods suffer from cumbersome operation, low recognition accuracy, low recognition efficiency and coarse segmentation.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides, for the sample problem, a fully convolutional method for improving the operating rate of image segmentation. The invention uses a deep learning framework and optimizes and improves the convolutional neural network. It reduces the parameter amount of the model with channel convolution, and it enriches the image features with multi-scale dilated convolution, which solves the problem that traditional networks have a small receptive field.
To achieve the above object, the invention adopts the following technical scheme:
A method of improving the operating rate of image segmentation, comprising the following steps:
Step 1: design the multi-scale dilated convolution kernel;
To solve the problem that arises when the receptive field is enlarged with traditional convolution and max pooling, the invention adopts a dilated convolution kernel, which adds a sampling rate, rate, to the traditional convolution kernel so that the original kernel becomes "fluffy".
In this way the receptive field is enlarged while the original amount of computation is kept, so that the image segmentation information is sufficiently accurate. The receptive field size based on the dilated convolution kernel is calculated by a formula
in which: F is the receptive field size of the current layer; rate is the sampling rate of the dilated convolution kernel, i.e. the spacing. The rate of a traditional convolution kernel can be regarded as 1, and the sampling rate of the dilated convolution as 2. The receptive field of traditional convolution is calculated by a formula
in which: Fi-1 is the receptive field size of the previous layer; ki is the size of the i-th layer convolution kernel or pooling kernel; n is the total number of convolution layers; si is the convolution stride of the i-th layer convolution kernel.
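As a rough illustration of how the sampling rate and the stride affect the receptive field, the sketch below uses the commonly cited recurrence F_i = F_(i-1) + (k_i - 1) × (product of the strides of the earlier layers), together with the usual effective size k + (k - 1)(rate - 1) of a dilated kernel. Both expressions and the function names are assumptions made for this example and may differ in form from the formulas of the publication.

```python
def effective_kernel_size(k, rate=1):
    """Extent covered by a k-tap kernel whose taps are spaced `rate` apart."""
    return k + (k - 1) * (rate - 1)


def receptive_field(layers):
    """layers: list of (kernel_size, stride, rate) tuples, first layer first."""
    field = 1   # a single input pixel
    jump = 1    # product of the strides of all earlier layers
    for k, stride, rate in layers:
        field += (effective_kernel_size(k, rate) - 1) * jump
        jump *= stride
    return field


if __name__ == "__main__":
    # Three stacked 3x3 layers with stride 1: ordinary kernels vs. rate 2.
    print(receptive_field([(3, 1, 1)] * 3))  # 7
    print(receptive_field([(3, 1, 2)] * 3))  # 13
```

Under these assumptions, the same three layers cover 13 input pixels per axis instead of 7 when the rate is 2, with no extra weights.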
Borrowing the idea of multi-scale image transformation, the multi-scale dilated convolution is designed by diversifying the sampling rate and the convolution kernel size, so that it can adapt to feature extraction for targets of different sizes. The multi-scale dilated convolution is calculated by a formula
in which: y[i] is the convolution sum at the i-th stride position; K is the convolution kernel; k is the coordinate position of a parameter within the kernel, k ∈ K; w[k] is the kernel weight; rate is the sampling rate, which can take the values 1, 2 and 3.
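The symbols defined above match the widely used form of dilated convolution, y[i] = Σ_k x[i + rate·k]·w[k]. The one-dimensional sketch below assumes that form; the input signal x, the function names and the choice to reuse one kernel at every rate are all assumptions for illustration, not details confirmed by the publication.

```python
import numpy as np


def dilated_conv1d(x, w, rate=1):
    """y[i] = sum_k x[i + rate * k] * w[k], evaluated where every tap fits."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    span = rate * (len(w) - 1)      # distance between the first and last tap
    out_len = len(x) - span
    if out_len <= 0:
        raise ValueError("input is shorter than the dilated kernel")
    y = np.zeros(out_len)
    for k, weight in enumerate(w):
        # Shifted slice of x that lines up tap k with every output position.
        y += weight * x[rate * k: rate * k + out_len]
    return y


def multi_scale_dilated_conv1d(x, w, rates=(1, 2, 3)):
    """Apply the same kernel at several sampling rates, one output per rate."""
    return {rate: dilated_conv1d(x, w, rate) for rate in rates}


if __name__ == "__main__":
    outputs = multi_scale_dilated_conv1d(np.arange(10.0), [1.0, 1.0, 1.0])
    for rate, y in outputs.items():
        print(rate, y)
```

A larger rate reads the same number of samples spread over a wider window, which is why the kernel sees more context without more weights.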
Step 2: design the channel convolution network;
Since traditional convolution is always a dimension-raising operation, channel convolution can be used from the start to achieve dimension reduction during feature convolution. First, the traditional convolution is changed into a two-layer convolution, similar to the group operation in ResNet. Without affecting accuracy, this new structure shortens the computation time to about 1/8 of the original and reduces the parameter amount to about 1/9 of the original. It can also be applied well on mobile terminals to detect targets in real time, and the model compression effect is obvious.
For traditional convolution, assume that the number of input feature channels is M, that the width and height of the convolution kernel are both Dk, and that the number of convolution kernels is N. Each time the convolution slides to a position there are N·M·Dk·Dk parameters, and the sliding stride is set to s. The picture size after sliding is then calculated by a formula
in which: h' and w' are the height and width after convolution; pad is the padding on the width and height boundaries. Each point of the h'·w' output therefore corresponds to N·M·Dk·Dk parameters, so the total parameter amount is
N·M·Dk·Dk·h'·w' (6)
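The output size and formula (6) can be sketched as follows. The size expression used here, (size + 2·pad - Dk) // s + 1, is the standard one and is assumed rather than taken from the publication; the function names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, pad=0):
    """Standard output length of a convolution along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


def standard_conv_cost(h, w, m, n, dk, stride=1, pad=0):
    """Formula (6): N * M * Dk * Dk * h' * w'."""
    h_out = conv_output_size(h, dk, stride, pad)
    w_out = conv_output_size(w, dk, stride, pad)
    return n * m * dk * dk * h_out * w_out


if __name__ == "__main__":
    # 8x8 input, M = 2 channels, N = 4 kernels of size 3x3, stride 1, pad 1.
    print(conv_output_size(8, 3, stride=1, pad=1))
    print(standard_conv_cost(8, 8, m=2, n=4, dk=3, stride=1, pad=1))
```

Because the count is taken over all h'·w' output positions, it grows with the image size as well as with M, N and Dk.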
With the improved channel convolution, the convolution is divided into two steps:
1) A Dk·Dk·M convolution is used to convolve the M channels separately. Sliding with the same stride s, the size after convolution is h', w', so the parameter amount produced by this step is
Dk·Dk·M·h'·w' (7)
2) A 1·1·N convolution kernel is set to raise the dimension and extract features. With a stride of 1, features are extracted again from the feature map obtained above: each of the original M channel features is processed with N convolution kernels, so the total parameter amount is
M·N·h'·w'·1·1 (8)
Combining the two steps, the final parameter amount of channel convolution is
Dk·Dk·M·h'·w'+M·N·h'·w' (9)
As noted above, the ratio of the improved channel convolution parameter amount to the traditional convolution parameter amount is
(Dk·Dk·M·h'·w'+M·N·h'·w') / (N·M·Dk·Dk·h'·w') = 1/N + 1/(Dk·Dk) (10)
From formula (10), with a 3 × 3 convolution kernel the ratio is 1/N + 1/9, so when the number of kernels N is large the channel convolution reduces the parameter amount to approximately 1/9 of the original.
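Dividing formula (9) by formula (6) cancels M·h'·w' and leaves 1/N + 1/(Dk·Dk). The sketch below checks that arithmetic with exact fractions; the function names and the example sizes are made up for illustration.

```python
from fractions import Fraction


def standard_cost(m, n, dk, h_out, w_out):
    """Formula (6)."""
    return n * m * dk * dk * h_out * w_out


def channel_conv_cost(m, n, dk, h_out, w_out):
    """Formula (9): the per-channel Dk x Dk step plus the 1 x 1 step."""
    return dk * dk * m * h_out * w_out + m * n * h_out * w_out


def cost_ratio(m, n, dk, h_out, w_out):
    """Formula (9) divided by formula (6), as an exact fraction."""
    return Fraction(channel_conv_cost(m, n, dk, h_out, w_out),
                    standard_cost(m, n, dk, h_out, w_out))


if __name__ == "__main__":
    ratio = cost_ratio(m=32, n=64, dk=3, h_out=56, w_out=56)
    print(ratio, float(ratio))
    print(ratio == Fraction(1, 64) + Fraction(1, 9))
```

The ratio does not depend on M, h' or w'. It approaches 1/9 for a 3 × 3 kernel only as N grows, which is why "about 1/9" is an approximation.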
Step 3: design the fully convolutional connection and the deconvolution network;
The final layer of a traditional network structure has a fixed size, so input pictures must be converted to a fixed size in advance, which hinders obtaining the vehicle length coordinates of logistics vehicles. In addition, the traditional fully connected layer loses a certain amount of spatial coordinate information, which distorts the spatial information of the image and prevents the target from being located effectively and accurately. To solve this loss of information, the invention uses a fully convolutional connection to locate the position coordinates of features in the picture accurately.
The full connection of a traditional network converts the preceding convolutional output [b, c, h, w] into [b, c·h·w], i.e. [b, 4096], and then into [b, cls], where b is the batch size and cls is the number of classes. A fully convolutional network instead appends a 1 × 1 convolution and has no fully connected layer, hence the name fully convolutional network. The full convolution is calculated by a formula
in which: 1 ≤ n ≤ N; yn[i][j] is the value at position (i, j) after convolution with the n-th convolution kernel; si is the horizontal convolution stride; sj is the vertical convolution stride; kn is the n-th convolution kernel; Dk is the width and height of the convolution kernel, corresponding to the kernel size Dk·Dk in step 2; δi and δj are positions within the convolution kernel, with 0 ≤ δi, δj ≤ Dk; the layer has N different convolution kernels in total. The sliding convolution operation can be converted into a multiplication of two matrices, and the result of convolving the image pixels can be expressed as a matrix product
in which: the left matrix has dimensions [N, M·Dk·Dk]; the right matrix has dimensions [M·Dk·Dk, w'·h']; the result after convolution has dimensions [N, w'·h']. I in the right matrix denotes the image (img), and its subscripts are, in order, the image width and the image height, i.e. Iwh.
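The matrix form can be sketched with the usual "im2col" layout: every Dk × Dk patch across the M channels becomes one column, so a weight matrix of shape [N, M·Dk·Dk] times a patch matrix of shape [M·Dk·Dk, w'·h'] gives [N, w'·h']. The helper names, the absence of padding and the row-major patch order are assumptions of this sketch, not details of the publication.

```python
import numpy as np


def im2col(image, dk, stride=1):
    """image: [M, h, w] -> columns: [M * dk * dk, h_out * w_out] (no padding)."""
    m, h, w = image.shape
    h_out = (h - dk) // stride + 1
    w_out = (w - dk) // stride + 1
    cols = np.empty((m * dk * dk, h_out * w_out), dtype=image.dtype)
    col = 0
    for i in range(h_out):
        for j in range(w_out):
            patch = image[:, i * stride:i * stride + dk,
                          j * stride:j * stride + dk]
            cols[:, col] = patch.reshape(-1)   # one patch per column
            col += 1
    return cols, h_out, w_out


def conv_as_matmul(image, kernels, stride=1):
    """kernels: [N, M, dk, dk] -> feature maps: [N, h_out, w_out]."""
    n, m, dk, _ = kernels.shape
    cols, h_out, w_out = im2col(image, dk, stride)
    weights = kernels.reshape(n, m * dk * dk)   # left matrix [N, M*Dk*Dk]
    out = weights @ cols                        # result      [N, w'*h']
    return out.reshape(n, h_out, w_out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    maps = conv_as_matmul(rng.standard_normal((2, 5, 6)),
                          rng.standard_normal((3, 2, 3, 3)))
    print(maps.shape)
```

Expressing the sliding window as one matrix product is what lets the layer run on optimized matrix multiplication routines for any input size.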
Finally, a deconvolution operation converts [N, w'·h'] back to the image size at input, so that the specific semantic information represented by each pixel can be identified accurately and the loss of spatial information is avoided. The deconvolution is equivalent to the inverse operation of convolution and is expressed by a formula
in which the weights of the convolution kernels k1, ..., kN change from their original values to new values. These are the weights after their magnitudes have been adjusted by training, and they carry the semantic information features of the image.
Therefore, a network built from deconvolution and full convolution operations can handle images of any size and can perform semantic analysis on every pixel of the image, which achieves fast image segmentation and allows image features to be located quickly and accurately.
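In segmentation networks, "deconvolution" usually means transposed convolution: it restores the spatial size of the input rather than its exact values. The sketch below assumes that reading. Each value of the [N, h', w'] feature map scatters a scaled copy of its kernel into an [M, h, w] output, which corresponds to multiplying by the transpose of the forward weight matrix. The function name and the lack of padding are assumptions.

```python
import numpy as np


def transposed_conv2d(feature, kernels, stride=1):
    """feature: [N, h', w'], kernels: [N, M, dk, dk] -> output: [M, h, w]."""
    n, h_in, w_in = feature.shape
    _, m, dk, _ = kernels.shape
    # Inverts the forward size rule h' = (h - dk) / stride + 1.
    h_out = (h_in - 1) * stride + dk
    w_out = (w_in - 1) * stride + dk
    out = np.zeros((m, h_out, w_out))
    for c in range(n):
        for i in range(h_in):
            for j in range(w_in):
                # Add this value's scaled kernel at its window position.
                out[:, i * stride:i * stride + dk,
                    j * stride:j * stride + dk] += feature[c, i, j] * kernels[c]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    restored = transposed_conv2d(rng.standard_normal((3, 3, 4)),
                                 rng.standard_normal((3, 2, 3, 3)))
    print(restored.shape)  # (2, 5, 6)
```

In a trained network the kernels used here would be learned weights, so the upsampled map carries per-pixel class evidence at the input resolution.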
The advantages of the invention are:
The invention uses a fully convolutional method, aimed at the sample problem, to improve the operating rate of image segmentation. Its most prominent feature is lightweight processing: while segmentation accuracy is guaranteed, the segmentation efficiency of the model is improved and the parameter amount of the model is reduced by channel convolution. Multi-scale dilated convolution kernels are also provided, which enlarge the receptive field of the model reasonably and simply and enhance the generalization of the model. The algorithm can be widely applied in image localization and recognition, for example vehicle recognition in logistics parks.
Brief description of the drawings
Fig. 1 is a schematic diagram of the convolution operation of an existing traditional convolution kernel;
Fig. 2 is a schematic diagram of the convolution operation of the improved dilated convolution kernel of the invention;
Fig. 3a to Fig. 3c show the multi-scale dilated convolution kernels of the invention: Fig. 3a is the dilated convolution kernel with sampling rate 1, Fig. 3b is the dilated convolution kernel with sampling rate 2, and Fig. 3c is the dilated convolution kernel with sampling rate 3;
Fig. 4 is the existing convolution mode;
Fig. 5 is the channel convolution mode of the invention;
Fig. 6 is the channel convolution structure of the invention;
Fig. 7 is the fully convolutional network design structure of the invention;
Fig. 8 is a schematic diagram of the fully convolutional matrix calculation process of the invention.
Note: in Fig. 6, DW is the channel convolution group, representing the regular arrangement of channel convolution kernels; BN is the batch normalization operation, which addresses the change in the data distribution of intermediate layers during training; Conv is the convolution layer operation; ReLU is the rectified linear unit, an activation function.
Note: in Fig. 8, k1, ..., kN index the convolution kernels; the remaining symbol denotes the position weight of the n-th convolution kernel.
Specific embodiment
To overcome the above deficiencies of the prior art, the present invention provides, for the sample problem, a fully convolutional image segmentation method. It uses a deep learning framework and optimizes and improves the convolutional neural network. It reduces the parameter amount of the model with channel convolution, and it enriches the image features with multi-scale dilated convolution, which solves the problem that traditional networks have a small receptive field.
To verify the superiority of the invention, logistics park vehicles are taken as an example; the following network model was built and a control experiment was carried out:
First, the network was built. Logistics vehicles of four types (cargo, dragon wagon, dumper, tank truck) were collected from a logistics park and divided into a training set of 8 000 images, 2 000 per class, and a test set of 4 000 images, 1 000 per class. The parameter configuration of the network architecture that was built is shown in Table 1 below.
In Table 1: k is the convolution kernel size; s is the stride; p is the padding size; DW is the channel convolution group, representing the regular arrangement of channel convolution kernels. Residual summation is used, which helps gradient propagation in a large network. The activation and batch normalization (Batch Normalization, BN) in each layer help accelerate network training. ReLU is the rectified linear unit, an activation function.
Table 1 Parameter design of the network architecture
The computer used in this example is configured with a Gigabyte NVIDIA GTX1080Ti graphics card with 11 GB of video memory at 1 607 MHz.
Finally, the test performance of the network of this example was compared with that of the traditional network. The results are shown in Table 2.
Table 2 Performance comparison of lightweight segmentation models
In Table 2, the evaluation index MPA is the mean pixel accuracy; MA is the ratio of the foreground area to the label area (mean accuracy); MIOU is the mean intersection over union, i.e. the ratio of the correctly predicted region to the union of the predicted area and the label area. The unit M·pic-1 is the memory occupied by training one picture, in megabytes (M); the unit ms·iter-1 is the time needed per iteration, in milliseconds (ms). With channel convolution, the video memory occupied is reduced by 51%, training speed increases by 78% and test speed increases by 79%. Every evaluation index for segmentation and localization improves substantially, and MIOU improves the most.
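The three accuracy indices can be computed from a confusion matrix, as in the sketch below. The exact definitions used for Table 2 are not spelled out beyond the sentences above, so the mapping chosen here is an assumption: pixel accuracy over all pixels, per-class accuracy as correct pixels over label area, and IoU as correct pixels over the union of predicted and label areas. The function names are hypothetical, and every class is assumed to occur in the labels.

```python
import numpy as np


def confusion_matrix(label, pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    label = np.asarray(label).reshape(-1)
    pred = np.asarray(pred).reshape(-1)
    index = label * num_classes + pred
    counts = np.bincount(index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_metrics(cm):
    """Return (pixel accuracy, mean per-class accuracy, mean IoU)."""
    cm = cm.astype(float)
    correct = np.diag(cm)
    label_area = cm.sum(axis=1)
    pred_area = cm.sum(axis=0)
    pixel_acc = correct.sum() / cm.sum()
    mean_acc = np.mean(correct / label_area)
    mean_iou = np.mean(correct / (label_area + pred_area - correct))
    return pixel_acc, mean_acc, mean_iou


if __name__ == "__main__":
    cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
    print(segmentation_metrics(cm))
```

IoU penalizes both missed pixels and false positives, so it is typically the strictest of the three indices.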
This example demonstrates that the improved method does improve the test performance of the model, that is, it raises the operating rate of image segmentation.
The content described in the embodiments of this specification merely enumerates forms in which the inventive concept can be realized. The protection scope of the invention should not be construed as limited to the specific forms stated in the embodiments; it also extends to equivalent technical means that those skilled in the art can conceive according to the inventive concept.

Claims (1)

1. A method of improving the operating rate of image segmentation, comprising the following steps:
Step 1: design the multi-scale dilated convolution kernel;
To solve the problem that arises when the receptive field is enlarged with traditional convolution and max pooling, a dilated convolution kernel is adopted, which adds a sampling rate, rate, to the traditional convolution kernel so that the original kernel becomes "fluffy";
In this way the receptive field is enlarged while the original amount of computation is kept, so that the image segmentation information is sufficiently accurate; the receptive field size based on the dilated convolution kernel is calculated by a formula
in which: F is the receptive field size of the current layer; rate is the sampling rate of the dilated convolution kernel, i.e. the spacing; the rate of a traditional convolution kernel can be regarded as 1, and the sampling rate of the dilated convolution as 2; the receptive field of traditional convolution is calculated by a formula
in which: Fi-1 is the receptive field size of the previous layer; ki is the size of the i-th layer convolution kernel or pooling kernel; n is the total number of convolution layers; si is the convolution stride of the i-th layer convolution kernel;
Borrowing the idea of multi-scale image transformation, the multi-scale dilated convolution is designed by diversifying the sampling rate and the convolution kernel size, so that it can adapt to feature extraction for targets of different sizes; the multi-scale dilated convolution is calculated by a formula
in which: y[i] is the convolution sum at the i-th stride position; K is the convolution kernel; k is the coordinate position of a parameter within the kernel, k ∈ K; w[k] is the kernel weight; rate is the sampling rate, which can take the values 1, 2 and 3;
Step 2 designs channel convolutional network;
Since traditional convolution mode is all a kind of liter dimension operation, reach feature by the way of the convolution of channel at the beginning The effect of convolution dimensionality reduction;Firstly, traditional convolution is changed to two layers of convolution, operated similar to the group in ResNet, it is this new Structure shortens that calculate the time be about original 1/8 under the premise of not influencing accuracy rate, reduces parameter amount is about original 1/9, And it can be good at being applied to mobile terminal, realize that the real-time detection of target, model compression effect are obvious;
For traditional convolution, it is assumed that the feature port number of input is M;The width of convolution kernel or high respectively DkOr Dk;Convolution The quantity of core is N;Then just there is N number of MD in the every once a certain position of sliding of convolutionk·DkParameter amount, the step-length of sliding is set as s;Then the picture size size calculation formula after sliding is
In formula: h', w' are the height and width after convolution;Pad is the highly filled boundary of width;Therefore, size is a certain after h'w' convolution The corresponding N number of MD of pointk·DkParameter amount, then total ginseng population size, which can be obtained, is
N·M·Dk·Dk·h'·w' (6)
And improved channel convolution mode is used, convolution step is divided into two steps:
1) D is usedk·DkThe convolution of M carries out convolution to M channel respectively;It is slided using same step-length s, after convolution Size be h', w', then the step generate parameter amount be
Dk·Dk·M·h'·w' (7)
2) N convolution kernels of size 1·1 are applied to raise the dimension and extract features; with a stride of 1, features are extracted again from the feature map obtained in step 1); each of the original M channel features is processed by the N convolution kernels, so the total parameter amount is
M·N·h'·w'·1·1 (8)
Combining the convolution structures of the two steps, the final parameter amount of channel convolution is
Dk·Dk·M·h'·w'+M·N·h'·w' (9)
As stated above, the ratio of the parameter amount of the improved channel convolution to that of the traditional convolution is
(Dk·Dk·M·h'·w' + M·N·h'·w') / (N·M·Dk·Dk·h'·w') = 1/N + 1/(Dk·Dk) (10)
Analysis of formula (10) shows that the channel convolution operation reduces the parameter amount; for Dk = 3 and a large N, the ratio is close to 1/9;
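Formulas (6) and (9) can be compared numerically. The sketch below is only an illustration under assumed sizes (M = 32, N = 64, Dk = 3, h' = w' = 112, all hypothetical); it checks that the ratio of the two amounts simplifies to 1/N + 1/(Dk·Dk), which follows algebraically from dividing (9) by (6).

```python
from fractions import Fraction

def traditional_conv_cost(M, N, Dk, h_out, w_out):
    """Formula (6): N * M * Dk * Dk * h' * w'."""
    return N * M * Dk * Dk * h_out * w_out

def channel_conv_cost(M, N, Dk, h_out, w_out):
    """Formula (9): per-channel step (7) plus 1x1 step (8)."""
    per_channel = Dk * Dk * M * h_out * w_out   # formula (7)
    pointwise = M * N * h_out * w_out           # formula (8)
    return per_channel + pointwise

# Hypothetical layer sizes, chosen only for illustration.
M, N, Dk, h_out, w_out = 32, 64, 3, 112, 112
ratio = Fraction(channel_conv_cost(M, N, Dk, h_out, w_out),
                 traditional_conv_cost(M, N, Dk, h_out, w_out))
closed_form = Fraction(1, N) + Fraction(1, Dk * Dk)
```

Exact fractions are used instead of floats so that the equality with the closed form holds without rounding.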
Step 3: design the fully convolutional connection and the deconvolution network;
The final layer of a traditional network structure has a fixed size, so input pictures must be converted to a fixed size in advance, which hinders acquisition of the vehicle-length coordinates of logistics vehicles; moreover, the fully connected layers of a traditional network lose the spatial coordinates used for positioning, which distorts the spatial information of the image and prevents accurate localization of the target; to solve this information loss, a fully convolutional connection is used to locate the position coordinates of features in the picture accurately;
The fully connected layer of a traditional network converts the preceding convolutional output [b, c, h, w] to [b, c·h·w], i.e. [b, 4096], and then to [b, cls], where b is the batch size and cls is the number of classes; a fully convolutional network instead follows the convolutional part with 1 × 1 convolutions and has no fully connected layer, hence the name fully convolutional network; the full convolution is calculated as
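The difference between the two heads can be seen from the tensor shapes alone. The following sketch uses random data and hypothetical sizes (b = 2, c = 8, h = w = 4, cls = 3); the variable names are assumptions made for illustration, not the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
b, c, h, w, cls = 2, 8, 4, 4, 3
features = rng.standard_normal((b, c, h, w))

# Fully connected head: flatten [b, c, h, w] -> [b, c*h*w], then map to [b, cls].
# The spatial layout is gone after the flatten.
fc_weight = rng.standard_normal((c * h * w, cls))
fc_scores = features.reshape(b, c * h * w) @ fc_weight

# Fully convolutional head: a 1x1 convolution mixes channels at every pixel,
# so the output keeps one score vector per spatial position: [b, cls, h, w].
conv1x1_weight = rng.standard_normal((cls, c))
fcn_scores = np.einsum("nc,bchw->bnhw", conv1x1_weight, features)
```

Because the 1 × 1 head never flattens the feature map, it also accepts inputs of any height and width.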
In the formula: 1 ≤ n ≤ N; yn[i][j] is the value at position (i, j) after convolution with the n-th convolution kernel; si is the horizontal convolution stride; sj is the vertical convolution stride; kn is the n-th convolution kernel; Dk is the width and height of the convolution kernel, so the kernel size corresponds to Dk·Dk in step 2; δi, δj are positions inside the convolution kernel, with 0 ≤ δi, δj ≤ Dk; this layer has N different convolution kernels in total; the sliding convolution of the kernels can be rewritten as a multiplication of two matrices; the pixels of the image and the result of the convolution can then be expressed as
where: the matrix on the left has dimension [N, M·Dk·Dk]; the matrix on the right has dimension [M·Dk·Dk, w'·h']; the result after convolution has dimension [N, w'·h']; I in the right-hand matrix stands for img, and its subscripts are the image width and image height in that order, i.e. Iwh;
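The matrix dimensions stated above match the familiar im2col arrangement. The sketch below is a minimal illustration under stated assumptions: no padding, a single stride for both directions, and row-major ordering of output positions (which may differ from the width-then-height subscript order used in the text). The helper names `im2col` and `conv_as_matmul` are hypothetical.

```python
import numpy as np

def im2col(x, Dk, stride):
    """Unfold x of shape [M, h, w] into a matrix of shape [M*Dk*Dk, h'*w']."""
    M, h, w = x.shape
    h_out = (h - Dk) // stride + 1
    w_out = (w - Dk) // stride + 1
    cols = np.empty((M * Dk * Dk, h_out * w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i * stride:i * stride + Dk, j * stride:j * stride + Dk]
            cols[:, i * w_out + j] = patch.reshape(-1)
    return cols, h_out, w_out

def conv_as_matmul(x, kernels, stride=1):
    """kernels has shape [N, M, Dk, Dk]; the result has shape [N, h', w']."""
    N, M, Dk, _ = kernels.shape
    cols, h_out, w_out = im2col(x, Dk, stride)
    left = kernels.reshape(N, M * Dk * Dk)          # [N, M*Dk*Dk]
    return (left @ cols).reshape(N, h_out, w_out)   # [N, h'*w'] -> [N, h', w']

# Hypothetical sizes: M = 3 channels, 6x6 image, N = 4 kernels of size 3x3.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 6, 6))
kernels = rng.standard_normal((4, 3, 3, 3))
y = conv_as_matmul(x, kernels, stride=1)
```

Each column of the right-hand matrix holds one sliding window, so a single matrix product evaluates all N kernels at all positions.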
Finally, a deconvolution operation converts [N, w'·h'] back to the size of the input image, so that the specific semantic information represented by each pixel can be identified accurately and loss of spatial information is avoided; the deconvolution is equivalent to the inverse operation of the convolution, i.e.
In the formula: the weights corresponding to the convolution kernels k1, …, kN change from their original values to the values adjusted through training; the adjusted weights carry the semantic feature information of the image.
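How a deconvolution restores the input size can be sketched for a single channel. The example assumes the common transposed-convolution reading of "deconvolution", in which each feature value scatters a weighted copy of the kernel back onto the image grid; this restores the spatial size but not the original pixel values. The function names and sizes are hypothetical.

```python
import numpy as np

def conv2d_single(x, k, stride):
    """Valid 2-D sliding-window convolution of one channel."""
    Dk = k.shape[0]
    h_out = (x.shape[0] - Dk) // stride + 1
    w_out = (x.shape[1] - Dk) // stride + 1
    y = np.zeros((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            window = x[i * stride:i * stride + Dk, j * stride:j * stride + Dk]
            y[i, j] = np.sum(window * k)
    return y

def deconv2d_single(y, k, stride):
    """Transposed convolution: every y[i, j] adds y[i, j] * k to its source window."""
    Dk = k.shape[0]
    h = (y.shape[0] - 1) * stride + Dk
    w = (y.shape[1] - 1) * stride + Dk
    x = np.zeros((h, w))
    for i in range(y.shape[0]):
        for j in range(y.shape[1]):
            x[i * stride:i * stride + Dk, j * stride:j * stride + Dk] += y[i, j] * k
    return x

# Hypothetical sizes: 7x7 image, 3x3 kernel, stride 2.
rng = np.random.default_rng(0)
image = rng.standard_normal((7, 7))
kernel = rng.standard_normal((3, 3))
feature = conv2d_single(image, kernel, stride=2)
restored = deconv2d_single(feature, kernel, stride=2)
```

In a trained network the deconvolution weights are learned, as the text describes, rather than copied from the forward kernel as in this sketch.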
CN201910535642.4A 2019-06-20 2019-06-20 Method for improving image segmentation running speed Active CN110458841B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910535642.4A CN110458841B (en) 2019-06-20 2019-06-20 Method for improving image segmentation running speed

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910535642.4A CN110458841B (en) 2019-06-20 2019-06-20 Method for improving image segmentation running speed

Publications (2)

Publication Number Publication Date
CN110458841A (en) 2019-11-15
CN110458841B (en) 2021-06-08

Family

ID=68480779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910535642.4A Active CN110458841B (en) 2019-06-20 2019-06-20 Method for improving image segmentation running speed

Country Status (1)

Country Link
CN (1) CN110458841B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626267A (en) * 2019-09-17 2020-09-04 山东科技大学 Hyperspectral remote sensing image classification method using void convolution
CN111967401A (en) * 2020-08-19 2020-11-20 上海眼控科技股份有限公司 Target detection method, device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107169974A (en) * 2017-05-26 2017-09-15 中国科学技术大学 It is a kind of based on the image partition method for supervising full convolutional neural networks more
US20180260956A1 (en) * 2017-03-10 2018-09-13 TuSimple System and method for semantic segmentation using hybrid dilated convolution (hdc)
CN108776969A (en) * 2018-05-24 2018-11-09 复旦大学 Breast ultrasound image lesion segmentation approach based on full convolutional network
CN108830855A (en) * 2018-04-02 2018-11-16 华南理工大学 A kind of full convolutional network semantic segmentation method based on the fusion of multiple dimensioned low-level feature
CN108921196A (en) * 2018-06-01 2018-11-30 南京邮电大学 A kind of semantic segmentation method for improving full convolutional neural networks
CN109410185A (en) * 2018-10-10 2019-03-01 腾讯科技(深圳)有限公司 A kind of image partition method, device and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180260956A1 (en) * 2017-03-10 2018-09-13 TuSimple System and method for semantic segmentation using hybrid dilated convolution (hdc)
CN107169974A (en) * 2017-05-26 2017-09-15 中国科学技术大学 It is a kind of based on the image partition method for supervising full convolutional neural networks more
CN108830855A (en) * 2018-04-02 2018-11-16 华南理工大学 A kind of full convolutional network semantic segmentation method based on the fusion of multiple dimensioned low-level feature
CN108776969A (en) * 2018-05-24 2018-11-09 复旦大学 Breast ultrasound image lesion segmentation approach based on full convolutional network
CN108921196A (en) * 2018-06-01 2018-11-30 南京邮电大学 A kind of semantic segmentation method for improving full convolutional neural networks
CN109410185A (en) * 2018-10-10 2019-03-01 腾讯科技(深圳)有限公司 A kind of image partition method, device and storage medium

Also Published As

Publication number Publication date
CN110458841B (en) 2021-06-08

Similar Documents

Publication Publication Date Title
CN110135267A (en) A kind of subtle object detection method of large scene SAR image
CN105046276B (en) Hyperspectral image band selection method based on low-rank representation
CN110660052A (en) Hot-rolled strip steel surface defect detection method based on deep learning
CN110378383B (en) Picture classification method based on Keras framework and deep neural network
CN111832546B (en) Lightweight natural scene text recognition method
CN111368769B (en) Ship multi-target detection method based on improved anchor point frame generation model
CN109523520A (en) A kind of chromosome automatic counting method based on deep learning
CN110096968A (en) A kind of ultrahigh speed static gesture identification method based on depth model optimization
CN111986125B (en) Method for multi-target task instance segmentation
CN110135430A (en) A kind of aluminium mold ID automatic recognition system based on deep neural network
CN110675411A (en) Cervical squamous intraepithelial lesion recognition algorithm based on deep learning
CN113283419B (en) Convolutional neural network pointer instrument image reading identification method based on attention
CN110458841A (en) A method of improving image segmentation operating rate
CN109447088A (en) A kind of method and device of breast image identification
CN115375711A (en) Image segmentation method of global context attention network based on multi-scale fusion
CN110930378A (en) Emphysema image processing method and system based on low data demand
CN116883393B (en) Metal surface defect detection method based on anchor frame-free target detection algorithm
CN113344110A (en) Fuzzy image classification method based on super-resolution reconstruction
CN109461144A (en) A kind of method and device of breast image identification
CN108846845A (en) SAR image segmentation method based on thumbnail and hierarchical fuzzy cluster
CN114708589A (en) Cervical cell classification method based on deep learning
Yanmin et al. Research on ear recognition based on SSD_MobileNet_v1 network
CN106570882B (en) The active contour image partition method of mixture gaussian modelling
CN113222944A (en) Cell nucleus segmentation method, system and device and cancer auxiliary analysis system and device based on pathological image
CN105404899A (en) Image classification method based on multi-directional context information and sparse coding model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant