CN110458841B - Method for improving image segmentation running speed - Google Patents
- Publication number: CN110458841B (application CN201910535642.4A)
- Authority
- CN
- China
- Prior art keywords
- convolution
- size
- kernel
- image
- convolution kernel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
Abstract
A method of increasing the running speed of image segmentation, comprising: designing a multi-scale hole (dilated) convolution kernel; designing a channel convolution network; and designing a full convolution connection and deconvolution network. Through the deconvolution and full convolution operations, the network accepts any image size, performs semantic analysis on every pixel of the image, segments the image rapidly, and locates image features quickly and accurately.
Description
Technical Field
The invention relates to a method for improving the running speed of image segmentation.
Background Art
In recent years, with the rapid development of computer science and technology, image processing and image target detection based on computer technology have advanced at an unprecedented pace. Deep learning, which learns from massive collections of digital images and extracts the key features of targets, can even surpass human performance in target detection and has repeatedly surprised the industry. With the renewed rise of neural networks, video-image methods based on convolutional neural networks have become the mainstream technology for image segmentation and recognition, achieving accurate recognition of images by means of template matching, edge feature extraction, gradient histograms and the like. Although feature detection based on neural networks can effectively identify targets in complex scenes, with results far better than those of traditional methods, existing approaches still have the following shortcomings: (1) weak noise immunity; (2) methods that combat overfitting with Dropout and with improved convolutional-network models and parameters suffer a slight loss of accuracy; (3) variable (deformable) and separable convolution structures improve model generalization and feature-extraction capability, but recognize targets in complex scenes poorly; (4) newer end-to-end segmentation methods directly predict per-pixel classification information and localize the target object at pixel level, but such models suffer from large parameter counts, low efficiency and coarse segmentation. In short, both traditional detection methods and video-image methods suffer from complex operation, low recognition accuracy, low recognition efficiency and coarse segmentation.
Disclosure of Invention
To overcome the above-mentioned deficiencies of the prior art, the present invention provides a full-convolution method for increasing the running speed of image segmentation. The invention adopts a deep-learning framework and optimizes the convolutional neural network: a channel convolution method reduces the parameter count of the model, and multi-scale hole (dilated) convolution enriches the image features and overcomes the small receptive field of traditional networks.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for improving the running speed of image segmentation comprises the following steps:
designing a multi-scale cavity convolution kernel;
in order to avoid enlarging the receptive field by the traditional combination of convolution and max pooling, the invention adopts hole (dilated) convolution kernels: a sampling rate is added to the traditional kernel, spreading the originally dense kernel out sparsely.
Thus, for the same amount of computation, the receptive field grows, making the image segmentation information sufficiently accurate. The receptive field of a hole convolution kernel is
F_i = F_{i-1} + rate·(k_i − 1)·∏_{j=1}^{i−1} s_j
In the formula: F_i is the size of the current layer's receptive field; rate is the sampling rate of the hole convolution kernel, i.e. the number of intervals between samples; the rate of a traditional kernel can be regarded as 1, while a hole convolution uses a rate of 2 or more. The traditional receptive-field formula is
F_i = F_{i-1} + (k_i − 1)·∏_{j=1}^{i−1} s_j
In the formula: F_{i-1} is the size of the receptive field of the previous layer; k_i is the convolution or pooling kernel size of the i-th layer; n is the total number of convolution layers; s_i is the convolution step size (stride) of the i-th layer's kernel.
The multi-scale hole convolution borrows the idea of multi-scale image representation: the sampling rate and kernel size are diversified so that features of targets of different sizes can all be extracted. The multi-scale hole convolution is computed as
y[i] = Σ_{k∈K} x[i + rate·k]·w[k]
In the formula: y[i] is the convolution sum at the i-th sliding position; K is the convolution kernel; k is the coordinate position of a parameter within the kernel, k ∈ K; w[k] is the kernel weight; rate is the sampling rate and takes the values 1, 2 and 3.
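A minimal 1-D sketch of the hole-convolution sum y[i] = Σ_k x[i + rate·k]·w[k], applied at the three sampling rates named above (the signal and kernel values are illustrative, not from the patent):

```python
import numpy as np

def dilated_conv1d(x, w, rate):
    """1-D hole convolution: taps of w are spaced `rate` samples apart."""
    taps = len(w)
    span = rate * (taps - 1)                    # effective extent minus one
    out = np.empty(len(x) - span)
    for i in range(len(out)):
        out[i] = sum(x[i + rate * k] * w[k] for k in range(taps))
    return out

x = np.arange(10, dtype=float)
w = np.array([1.0, 0.0, -1.0])                  # simple difference kernel
for rate in (1, 2, 3):                          # the three sampling rates
    print(rate, dilated_conv1d(x, w, rate))
```

The same 3-tap kernel measures differences over wider and wider spans as the rate grows, which is exactly the multi-scale behaviour the text describes.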
Designing a channel convolution network;
since the traditional convolution mode is a dimension-raising operation, a channel convolution mode can be adopted to reduce the dimensionality of the feature convolution. The traditional convolution is first split into two layers of convolution, similar to the group operation in ResNet. Without harming accuracy, the new structure shortens computation time to roughly 1/8 and reduces the parameter count to roughly 1/9, so it can readily run on mobile terminals and detect targets in real time, a pronounced model-compression effect.
For the traditional convolution, assume the number of input feature channels is M; the convolution kernel width and height are both D_k; and the number of convolution kernels is N. Then every position the convolution slides over involves N kernels of M·D_k·D_k parameters each; the sliding step size is s. The size of the feature map after sliding is
h' = (h − D_k + 2·pad)/s + 1,  w' = (w − D_k + 2·pad)/s + 1 (5)
In the formula: h', w' are the height and width after convolution; pad is the width and height of the boundary padding. Each of the h'·w' output positions therefore corresponds to N·M·D_k·D_k parameters, giving a total parameter count of
N·M·D_k·D_k·h'·w' (6)
The improved channel convolution mode splits the convolution into two steps:
1) M depthwise kernels of size D_k·D_k convolve the M channels separately. Sliding with the same step size s, the output dimensions are again h', w', and this step contributes
D_k·D_k·M·h'·w' (7)
parameters.
2) N kernels of size 1×1 then perform the dimension-raising feature extraction. The feature map from step 1) is convolved again with step size 1, the original M channel features being processed by the N kernels, for a total parameter count of
M·N·h'·w'·1·1 (8)
Combining the two steps, the final parameter count of the channel convolution is
D_k·D_k·M·h'·w' + M·N·h'·w' (9)
As previously mentioned, comparing the parameter count of the traditional convolution with that of the improved channel convolution gives the ratio
(D_k·D_k·M·h'·w' + M·N·h'·w') / (N·M·D_k·D_k·h'·w') = 1/N + 1/(D_k·D_k) (10)
Analysis of equation (10) shows that with a 3 × 3 kernel (and N large), the channel convolution operation reduces the parameter count to roughly 1/9.
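The parameter-count comparison of equations (6), (9) and (10) can be verified numerically. The common h'·w' factor cancels in the ratio, so it is omitted; M, N and D_k below are illustrative values, not the patent's network settings:

```python
def standard_params(M, N, Dk):
    return N * M * Dk * Dk              # eq (6) without the h'*w' factor

def channel_params(M, N, Dk):
    return Dk * Dk * M + M * N          # eq (9): depthwise step + 1x1 pointwise step

M, N, Dk = 64, 128, 3
ratio = channel_params(M, N, Dk) / standard_params(M, N, Dk)
# eq (10): ratio = 1/N + 1/Dk^2, close to 1/9 for a 3x3 kernel and large N
print(f"{ratio:.4f} (1/9 = {1/9:.4f})")
```

For these values the ratio is about 0.12, approaching 1/9 as N grows, matching the reduction claimed in the text.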
Designing a full convolution connection and deconvolution network;
the final layer of a traditional network structure adopts a fixed size, so the input picture must be resized to fixed dimensions in advance, which hinders obtaining the vehicle-length coordinates of a logistics vehicle. In addition, the traditional fully connected layers discard the spatial coordinates of the features, distorting the spatial information of the image so that the target cannot be located effectively and accurately. To avoid this information loss, the invention adopts full convolution connections to locate the position coordinates of features in the picture precisely.
The fully connected part of a traditional network converts the convolutional feature map [b, c, h, w] of the preceding layers into [b, c·h·w], i.e. [b, 4096], and then into [b, cls], where b is the batch size and cls the number of classes. The fully convolutional network instead appends 1 × 1 convolutions and has no fully connected layer, hence the name. The full convolution is computed as
y_n[i][j] = Σ_{δ_i} Σ_{δ_j} k_n[δ_i][δ_j] · x[s_i·i + δ_i][s_j·j + δ_j] (11)
In the formula: 1 ≤ n ≤ N; y_n[i][j] is the result of the n-th kernel at output position (i, j); s_i is the horizontal convolution stride; s_j is the vertical stride; k_n is the n-th kernel; D_k is the kernel width and height, corresponding to D_k·D_k in step 2; δ_i, δ_j index positions within the kernel, 0 ≤ δ_i, δ_j ≤ D_k; the layer has N different kernels in total. The sliding convolution of the kernels can be converted into a multiplication of two matrices, and the result of convolving the kernels with the corresponding image pixels can be expressed as their product.
Wherein: the matrix on the left has dimensions [N, M·D_k·D_k]; the matrix on the right has dimensions [M·D_k·D_k, w'·h']; the result after convolution has dimensions [N, w'·h']. In the right-hand matrix, I denotes the image (img), its subscripts being image width and image height in turn, i.e. I_wh.
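The matrix form of the convolution can be sketched with an im2col construction; the shapes here are small illustrative values, not the patent's:

```python
import numpy as np

M, Dk, N = 2, 3, 4                      # in-channels, kernel size, out-channels
h = w = 5
x = np.random.rand(M, h, w)
kernels = np.random.rand(N, M, Dk, Dk)

hp, wp = h - Dk + 1, w - Dk + 1         # output size for stride 1, no padding

# im2col: one flattened M*Dk*Dk patch per output position (row-major order)
cols = np.stack([x[:, i:i+Dk, j:j+Dk].ravel()
                 for i in range(hp) for j in range(wp)], axis=1)

K = kernels.reshape(N, M * Dk * Dk)     # left matrix  [N, M*Dk*Dk]
y = K @ cols                            # product      [N, w'*h']

# cross-check one entry against a direct sliding-window sum
direct = (kernels[0] * x[:, 1:1+Dk, 2:2+Dk]).sum()
print(np.isclose(y[0, 1 * wp + 2], direct))
```

A single matrix multiplication thus reproduces the sliding convolution, which is the conversion the text describes.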
Finally, a deconvolution operation converts [N, w'·h'] back to the size of the input image, so that the specific semantic information represented by each pixel is identified accurately and no spatial information is lost. Deconvolution is equivalent to the inverse of the convolution above, i.e. a single convolution in which the kernel matrix [k_1; …; k_N], of dimensions [N, M·D_k·D_k], is replaced by its transpose of dimensions [M·D_k·D_k, N]. The weights k_1, …, k_N are adjusted through training and carry the semantic information of the image.
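Conversely, the deconvolution step can be sketched as multiplication by the transposed kernel matrix followed by a scatter-add (col2im) back to input size; the shapes are again illustrative, and the random kernels stand in for the trained weights the text mentions:

```python
import numpy as np

M, Dk, N = 1, 3, 2                      # illustrative channel/kernel counts
h = w = 5
hp, wp = h - Dk + 1, w - Dk + 1         # feature-map size from the forward pass

K = np.random.rand(N, M * Dk * Dk)      # flattened kernel matrix [N, M*Dk*Dk]
feat = np.random.rand(N, hp * wp)       # [N, w'*h'] feature map to upsample

# transpose of the kernel matrix maps features back to patch space
patches = K.T @ feat                    # [M*Dk*Dk, hp*wp]

# col2im: scatter-add each patch column into an input-sized image
out = np.zeros((M, h, w))
for idx in range(hp * wp):
    i, j = divmod(idx, wp)
    out[:, i:i+Dk, j:j+Dk] += patches[:, idx].reshape(M, Dk, Dk)

print(out.shape)                        # input-sized result
```

The output regains the input's spatial dimensions, so each pixel can carry its own semantic label as described above.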
Therefore, the network built from these deconvolution and full convolution operations accepts any image size, performs semantic analysis on every pixel of the image, segments the image rapidly, and locates image features quickly and accurately.
The invention has the advantages that:
to address the problems described above, the invention adopts a full convolution method to improve the running speed of image segmentation. Its most prominent feature is the lightweight treatment of the model: segmentation efficiency is improved while segmentation accuracy is preserved, the parameter count is reduced through channel convolution, and the multi-scale hole convolution kernels enlarge the receptive field of the model reasonably and simply while strengthening its generalization. The algorithm can be widely applied to image localization and recognition, such as vehicle recognition in logistics parks.
Drawings
FIG. 1 is a diagram illustrating a conventional convolution kernel convolution operation;
FIG. 2 is a schematic diagram of the convolution operation of the improved hole convolution kernel of the present invention;
FIGS. 3 a-3 c are multi-scale hole convolution kernels of the present invention, with FIG. 3a being a hole convolution kernel with a sample rate of 1, FIG. 3b being a hole convolution kernel with a sample rate of 2, and FIG. 3c being a hole convolution kernel with a sample rate of 3;
FIG. 4 is a prior art convolution scheme;
FIG. 5 is a channel convolution scheme of the present invention;
FIG. 6 is a channel convolution structure of the present invention;
FIG. 7 is a full convolution network design structure of the present invention;
FIG. 8 is a schematic diagram of a full convolution matrix calculation process according to the present invention.
Note: in FIG. 6, DW is a channel convolution group, representing a fixed collocation of channel convolution kernels; BN is the batch normalization operation, which counters the shift in the data distribution of intermediate layers during training; Conv is the convolutional layer operation; ReLU is the rectified linear unit, an activation function.
Detailed Description
In order to overcome the defects in the prior art, the invention provides a full-convolution image segmentation method that adopts a deep-learning framework and optimizes the convolutional neural network: a channel convolution method reduces the parameter count of the model, and multi-scale hole (dilated) convolution enriches the image features and overcomes the small receptive field of traditional networks.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for improving the running speed of image segmentation comprises the following steps:
designing a multi-scale cavity convolution kernel;
in order to avoid enlarging the receptive field by the traditional combination of convolution and max pooling, the invention adopts hole (dilated) convolution kernels: a sampling rate is added to the traditional kernel, spreading the originally dense kernel out sparsely.
Thus, for the same amount of computation, the receptive field grows, making the image segmentation information sufficiently accurate. The receptive field of a hole convolution kernel is
F_i = F_{i-1} + rate·(k_i − 1)·∏_{j=1}^{i−1} s_j
In the formula: F_i is the size of the current layer's receptive field; rate is the sampling rate of the hole convolution kernel, i.e. the number of intervals between samples; the rate of a traditional kernel can be regarded as 1, while a hole convolution uses a rate of 2 or more. The traditional receptive-field formula is
F_i = F_{i-1} + (k_i − 1)·∏_{j=1}^{i−1} s_j
In the formula: F_{i-1} is the size of the receptive field of the previous layer; k_i is the convolution or pooling kernel size of the i-th layer; n is the total number of convolution layers; s_i is the convolution step size (stride) of the i-th layer's kernel.
The multi-scale hole convolution borrows the idea of multi-scale image representation: the sampling rate and kernel size are diversified so that features of targets of different sizes can all be extracted. The multi-scale hole convolution is computed as
y[i] = Σ_{k∈K} x[i + rate·k]·w[k]
In the formula: y[i] is the convolution sum at the i-th sliding position; K is the convolution kernel; k is the coordinate position of a parameter within the kernel, k ∈ K; w[k] is the kernel weight; rate is the sampling rate and takes the values 1, 2 and 3.
Designing a channel convolution network;
since the traditional convolution mode is a dimension-raising operation, a channel convolution mode can be adopted to reduce the dimensionality of the feature convolution. The traditional convolution is first split into two layers of convolution, similar to the group operation in ResNet. Without harming accuracy, the new structure shortens computation time to roughly 1/8 and reduces the parameter count to roughly 1/9, so it can readily run on mobile terminals and detect targets in real time, a pronounced model-compression effect.
For the traditional convolution, assume the number of input feature channels is M; the convolution kernel width and height are both D_k; and the number of convolution kernels is N. Then every position the convolution slides over involves N kernels of M·D_k·D_k parameters each; the sliding step size is s. The size of the feature map after sliding is
h' = (h − D_k + 2·pad)/s + 1,  w' = (w − D_k + 2·pad)/s + 1 (5)
In the formula: h', w' are the height and width after convolution; pad is the width and height of the boundary padding. Each of the h'·w' output positions therefore corresponds to N·M·D_k·D_k parameters, giving a total parameter count of
N·M·D_k·D_k·h'·w' (6)
The improved channel convolution mode splits the convolution into two steps:
1) M depthwise kernels of size D_k·D_k convolve the M channels separately. Sliding with the same step size s, the output dimensions are again h', w', and this step contributes
D_k·D_k·M·h'·w' (7)
parameters.
2) N kernels of size 1×1 then perform the dimension-raising feature extraction. The feature map from step 1) is convolved again with step size 1, the original M channel features being processed by the N kernels, for a total parameter count of
M·N·h'·w'·1·1 (8)
Combining the two steps, the final parameter count of the channel convolution is
D_k·D_k·M·h'·w' + M·N·h'·w' (9)
As previously mentioned, comparing the parameter count of the traditional convolution with that of the improved channel convolution gives the ratio
(D_k·D_k·M·h'·w' + M·N·h'·w') / (N·M·D_k·D_k·h'·w') = 1/N + 1/(D_k·D_k) (10)
Analysis of equation (10) shows that with a 3 × 3 kernel (and N large), the channel convolution operation reduces the parameter count to roughly 1/9.
Designing a full convolution connection and deconvolution network;
the final layer of a traditional network structure adopts a fixed size, so the input picture must be resized to fixed dimensions in advance, which hinders obtaining the vehicle-length coordinates of a logistics vehicle. In addition, the traditional fully connected layers discard the spatial coordinates of the features, distorting the spatial information of the image so that the target cannot be located effectively and accurately. To avoid this information loss, the invention adopts full convolution connections to locate the position coordinates of features in the picture precisely.
The fully connected part of a traditional network converts the convolutional feature map [b, c, h, w] of the preceding layers into [b, c·h·w], i.e. [b, 4096], and then into [b, cls], where b is the batch size and cls the number of classes. The fully convolutional network instead appends 1 × 1 convolutions and has no fully connected layer, hence the name. The full convolution is computed as
y_n[i][j] = Σ_{δ_i} Σ_{δ_j} k_n[δ_i][δ_j] · x[s_i·i + δ_i][s_j·j + δ_j] (11)
In the formula: 1 ≤ n ≤ N; y_n[i][j] is the result of the n-th kernel at output position (i, j); s_i is the horizontal convolution stride; s_j is the vertical stride; k_n is the n-th kernel; D_k is the kernel width and height, corresponding to D_k·D_k in step 2; δ_i, δ_j index positions within the kernel, 0 ≤ δ_i, δ_j ≤ D_k; the layer has N different kernels in total. The sliding convolution of the kernels can be converted into a multiplication of two matrices, and the result of convolving the kernels with the corresponding image pixels can be expressed as their product.
Wherein: the matrix on the left has dimensions [N, M·D_k·D_k]; the matrix on the right has dimensions [M·D_k·D_k, w'·h']; the result after convolution has dimensions [N, w'·h']. In the right-hand matrix, I denotes the image (img), its subscripts being image width and image height in turn, i.e. I_wh.
Finally, a deconvolution operation converts [N, w'·h'] back to the size of the input image, so that the specific semantic information represented by each pixel is identified accurately and no spatial information is lost. Deconvolution is equivalent to the inverse of the convolution above, i.e. a single convolution in which the kernel matrix [k_1; …; k_N], of dimensions [N, M·D_k·D_k], is replaced by its transpose of dimensions [M·D_k·D_k, N]. The weights k_1, …, k_N are adjusted through training and carry the semantic information of the image.
Therefore, the network built from these deconvolution and full convolution operations accepts any image size, performs semantic analysis on every pixel of the image, segments the image rapidly, and locates image features quickly and accurately.
In order to verify the superiority of the invention, a logistics park vehicle is taken as an example, the following network model is constructed, and a comparison experiment is carried out:
firstly, the network is constructed: images of four types of logistics vehicles (van trucks, tractor trucks, dump trucks and tank trucks) were collected from a logistics park and divided into a training set of 8000 images, a test set of 2000 images and a validation set of 1000 images. The parameter configuration of the constructed network model structure is shown in Table 1 below.
In Table 1: K is the convolution kernel size; S is the stride; P is the padding size; DW is a channel convolution group, the fixed collocation formed by channel convolution kernels; residual summation eases gradient propagation in the large network; the per-layer activations and batch normalization (BN) accelerate network training; ReLU is the rectified linear unit, an activation function.
TABLE 1 design of parameters of network model architecture
The computer used in this example was equipped with an NVIDIA GTX 1080 Ti graphics card (11 GB of video memory, 1607 MHz).
Finally, the model test performance of the example network and the conventional network were compared, and the results are shown in table 2.
TABLE 2 comparison of lightweight segmentation model Performance
The evaluation index MPA in Table 2 is the mean pixel accuracy; MA is the ratio of foreground area to labelled area (mean accuracy); MIOU is the mean intersection over union, i.e. the ratio of the correctly predicted region to the union of the predicted and labelled areas. The unit M·pic⁻¹ is the memory occupied by training on one picture, in megabytes (M); the unit ms·iter⁻¹ is the time required per iteration, in milliseconds (ms). After channel convolution is adopted, the occupied video memory falls by 51%, training speed rises by 78% and testing speed by 79%; every evaluation index of segmentation and localization improves markedly, with MIOU improving most of all.
By the embodiment, the improved method is verified to be capable of improving the performance of the model test, namely the running speed of image segmentation.
The scheme has the advantages that:
to address the problems described above, the invention adopts a full convolution method to improve the running speed of image segmentation. Its most prominent feature is the lightweight treatment of the model: segmentation efficiency is improved while segmentation accuracy is preserved, the parameter count is reduced through channel convolution, and the multi-scale hole convolution kernels enlarge the receptive field of the model reasonably and simply while strengthening its generalization. The algorithm can be widely applied to image localization and recognition, such as vehicle recognition in logistics parks.
The embodiments described in this specification merely illustrate implementations of the inventive concept; the scope of the present invention is not limited to the specific forms set forth in the embodiments, but extends to the equivalents that will occur to those skilled in the art on the basis of the inventive concept.
Claims (1)
1. A method for improving the running speed of image segmentation comprises the following steps:
designing a multi-scale cavity convolution kernel;
in order to avoid enlarging the receptive field by the traditional combination of convolution and max pooling, a hole (dilated) convolution kernel is adopted: a sampling rate is added to the traditional kernel, spreading the originally dense kernel out sparsely;
thus, for the same amount of computation, the receptive field grows, making the image segmentation information sufficiently accurate; the receptive field of a hole convolution kernel is
F_i = F_{i-1} + rate·(k_i − 1)·∏_{j=1}^{i−1} s_j
in the formula: F_i is the size of the current layer's receptive field; rate is the sampling rate of the hole convolution kernel, i.e. the number of intervals between samples, the rate of a traditional kernel being regarded as 1 and that of a hole convolution being 2 or more; the traditional receptive-field formula is
F_i = F_{i-1} + (k_i − 1)·∏_{j=1}^{i−1} s_j
in the formula: F_{i-1} is the size of the receptive field of the previous layer; k_i is the convolution or pooling kernel size of the i-th layer; n is the total number of convolution layers; s_i is the convolution step size (stride) of the i-th layer's kernel;
the multi-scale hole convolution borrows the idea of multi-scale image representation: the sampling rate and kernel size are diversified so that features of targets of different sizes can all be extracted; the multi-scale hole convolution is computed as
y[i] = Σ_{k∈K} x[i + rate·k]·w[k]
in the formula: y[i] is the convolution sum at the i-th sliding position; K is the convolution kernel; k is the coordinate position of a parameter within the kernel, k ∈ K; w[k] is the kernel weight; rate is the sampling rate and takes the values 1, 2 and 3;
designing a channel convolution network;
because the traditional convolution mode is a dimension-raising operation, a channel convolution mode is adopted to reduce the dimensionality of the feature convolution; the traditional convolution is first split into two layers of convolution, similar to the group operation in ResNet; without harming accuracy, the new structure shortens computation time to roughly 1/8 and reduces the parameter count to roughly 1/9, so it can readily run on mobile terminals and detect targets in real time, with a pronounced model-compression effect;
for the traditional convolution, assume the number of input feature channels is M; the convolution kernel width and height are both D_k; the number of convolution kernels is N; then every position the convolution slides over involves N kernels of M·D_k·D_k parameters each, the sliding step size being s; the size of the feature map after sliding is
h' = (h − D_k + 2·pad)/s + 1,  w' = (w − D_k + 2·pad)/s + 1 (5)
in the formula: h', w' are the height and width after convolution; pad is the width and height of the boundary padding; each of the h'·w' output positions therefore corresponds to N·M·D_k·D_k parameters, giving a total parameter count of
N·M·D_k·D_k·h'·w' (6)
And the convolution step is divided into two steps by adopting an improved channel convolution mode:
1) M depthwise kernels of size D_k·D_k convolve the M channels separately; sliding with the same step size s, the output dimensions are again h', w', and this step contributes the parameter count
D_k·D_k·M·h'·w' (7)
2) N kernels of size 1×1 are set to perform the dimension-raising feature extraction; with step size 1, the feature map obtained in step 1) is convolved again, the original M channel features being processed by the N kernels, for a total parameter count of
M·N·h'·w'·1·1 (8)
combining the two steps, the final parameter count of the channel convolution is
D_k·D_k·M·h'·w' + M·N·h'·w' (9)
as previously mentioned, comparing the parameter count of the traditional convolution with that of the improved channel convolution gives the ratio
(D_k·D_k·M·h'·w' + M·N·h'·w') / (N·M·D_k·D_k·h'·w') = 1/N + 1/(D_k·D_k) (10)
analysis of equation (10) shows that the channel convolution operation reduces the parameter count;
designing a full convolution connection and deconvolution network;
the final layer of a traditional network structure adopts a fixed size, so the input picture must be resized to fixed dimensions in advance, which hinders obtaining the vehicle-length coordinates of a logistics vehicle; in addition, the traditional fully connected layers discard the spatial coordinates of the features, distorting the spatial information of the image so that the target cannot be located effectively and accurately; to avoid this information loss, full convolution connections are adopted to locate the position coordinates of features in the picture precisely;
the full connection of a traditional network converts the front-end convolutional feature map [b, c, h, w] into [b, c·h·w], i.e. [b, 4096], and then into [b, cls], where b is the batch size and cls is the number of classes; the full convolution network adopted here instead connects a corresponding 1×1 convolutional layer and has no fully connected layer, hence the term full convolutional network; the full convolution is calculated as
y_n[i][j] = f_{k_n, s}(x[s_i·i + δ_i][s_j·j + δ_j]) (11)
In the formula: 1 ≤ n ≤ N; y_n[i][j] is the convolution result of the n-th convolution kernel at position (i, j); s_i is the convolution stride in the horizontal direction; s_j is the convolution stride in the vertical direction; k_n is the n-th convolution kernel; Dk is the width and height of the convolution kernel, the kernel size corresponding to Dk·Dk in the second step; δ_i, δ_j are the positions within the convolution kernel, with 0 ≤ δ_i, δ_j ≤ Dk; the full convolution connection network layer has N different types of convolution kernels in total, and the sliding convolution operation of the kernels can be converted into two matrix multiplication operations; the result of convolving the corresponding pixels of an image can be expressed as
wherein the matrix on the left has dimension [N, M·Dk·Dk]; the matrix on the right has dimension [M·Dk·Dk, w'·h']; the dimension after convolution is [N, w'·h']; in the matrix on the right, I denotes the image (img), and its subscripts are the image width and image height in turn, i.e. I_wh;
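The two matrix multiplications described above are commonly realized with an im2col layout; a minimal sketch (stride 1, no padding, NumPy, illustrative shapes) of the [N, M·Dk·Dk] × [M·Dk·Dk, w'·h'] product:

```python
import numpy as np

def im2col(x, Dk):
    """x: [M, h, w] -> patch matrix [M*Dk*Dk, h_out*w_out] (stride 1)."""
    M, h, w = x.shape
    h_out, w_out = h - Dk + 1, w - Dk + 1
    cols = np.empty((M * Dk * Dk, h_out * w_out))
    for i in range(h_out):
        for j in range(w_out):
            # Each sliding window becomes one column of the right-hand matrix.
            cols[:, i * w_out + j] = x[:, i:i + Dk, j:j + Dk].ravel()
    return cols

M, N, Dk, h, w = 3, 4, 3, 6, 6
kernels = np.random.rand(N, M * Dk * Dk)    # left matrix:  [N, M*Dk*Dk]
cols = im2col(np.random.rand(M, h, w), Dk)  # right matrix: [M*Dk*Dk, w'*h']
out = kernels @ cols                        # result:       [N, w'*h']
assert out.shape == (4, 16)                 # w' = h' = 4, so w'*h' = 16
```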
Finally, a deconvolution operation converts [N, w'·h'] back to the size of the input image, so the specific semantic information represented by each pixel is identified accurately and the loss of spatial information is avoided; the specific operation of deconvolution is equivalent to the inverse operation of convolution, i.e. a convolution is used as the inverse of the forward convolution
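A deconvolution (transposed convolution) that restores the input spatial size can be sketched as follows (NumPy; the single-channel feature map, the all-ones kernel, and stride 2 are illustrative assumptions, not the patent's parameters):

```python
import numpy as np

def transposed_conv2d(x, kernel, stride):
    """x: [h', w']; kernel: [Dk, Dk]; returns [(h'-1)*s+Dk, (w'-1)*s+Dk]."""
    h, w = x.shape
    Dk = kernel.shape[0]
    out = np.zeros(((h - 1) * stride + Dk, (w - 1) * stride + Dk))
    # Each input value scatters a scaled copy of the kernel into the output,
    # the reverse of how a strided convolution gathers input patches.
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + Dk,
                j * stride:j * stride + Dk] += x[i, j] * kernel
    return out

feat = np.random.rand(4, 4)                 # a 4x4 feature map
up = transposed_conv2d(feat, np.ones((2, 2)), stride=2)
assert up.shape == (8, 8)                   # upsampled back to 8x8
```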
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910535642.4A CN110458841B (en) | 2019-06-20 | 2019-06-20 | Method for improving image segmentation running speed |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110458841A CN110458841A (en) | 2019-11-15 |
CN110458841B true CN110458841B (en) | 2021-06-08 |
Family
ID=68480779
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910535642.4A Active CN110458841B (en) | 2019-06-20 | 2019-06-20 | Method for improving image segmentation running speed |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110458841B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111626267B (en) * | 2019-09-17 | 2022-02-15 | 山东科技大学 | Hyperspectral remote sensing image classification method using void convolution |
CN111967401A (en) * | 2020-08-19 | 2020-11-20 | 上海眼控科技股份有限公司 | Target detection method, device and storage medium |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10147193B2 (en) * | 2017-03-10 | 2018-12-04 | TuSimple | System and method for semantic segmentation using hybrid dilated convolution (HDC) |
CN107169974A (en) * | 2017-05-26 | 2017-09-15 | 中国科学技术大学 | An image segmentation method based on multi-supervised fully convolutional neural networks
CN108830855B (en) * | 2018-04-02 | 2022-03-25 | 华南理工大学 | Full convolution network semantic segmentation method based on multi-scale low-level feature fusion |
CN108776969B (en) * | 2018-05-24 | 2021-06-22 | 复旦大学 | Breast ultrasound image tumor segmentation method based on full convolution network |
CN108921196A (en) * | 2018-06-01 | 2018-11-30 | 南京邮电大学 | A semantic segmentation method based on an improved fully convolutional neural network
CN109410185B (en) * | 2018-10-10 | 2019-10-25 | 腾讯科技(深圳)有限公司 | An image segmentation method, apparatus and storage medium
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107016681B (en) | Brain MRI tumor segmentation method based on full convolution network | |
CN108154102B (en) | Road traffic sign identification method | |
CN111310666B (en) | High-resolution image ground feature identification and segmentation method based on texture features | |
CN107358258B (en) | SAR image target classification based on NSCT double CNN channels and selective attention mechanism | |
Yang et al. | Fast vehicle logo detection in complex scenes | |
CN111951288B (en) | Skin cancer lesion segmentation method based on deep learning | |
CN111783782A (en) | Remote sensing image semantic segmentation method fusing and improving UNet and SegNet | |
Zhu et al. | SAR target classification based on radar image luminance analysis by deep learning | |
CN109410195B (en) | Magnetic resonance imaging brain partition method and system | |
CN111191583A (en) | Space target identification system and method based on convolutional neural network | |
CN111008632B (en) | License plate character segmentation method based on deep learning | |
Kashyap | Evolution of histopathological breast cancer images classification using stochasticdilated residual ghost model | |
CN108230330B (en) | Method for quickly segmenting highway pavement and positioning camera | |
CN112836671B (en) | Data dimension reduction method based on maximized ratio and linear discriminant analysis | |
CN110930378B (en) | Emphysema image processing method and system based on low data demand | |
CN112446891A (en) | Medical image segmentation method based on U-Net network brain glioma | |
CN110458841B (en) | Method for improving image segmentation running speed | |
CN111915583B (en) | Vehicle and pedestrian detection method based on vehicle-mounted thermal infrared imager in complex scene | |
CN111986126B (en) | Multi-target detection method based on improved VGG16 network | |
CN111062381B (en) | License plate position detection method based on deep learning | |
CN111325750A (en) | Medical image segmentation method based on multi-scale fusion U-shaped chain neural network | |
CN116630971B (en) | Wheat scab spore segmentation method based on CRF_Resunate++ network | |
CN113298032A (en) | Unmanned aerial vehicle visual angle image vehicle target detection method based on deep learning | |
CN116824239A (en) | Image recognition method and system based on transfer learning and ResNet50 neural network | |
CN113486894B (en) | Semantic segmentation method for satellite image feature parts |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||