AU2021105161A4 - Image segmentation based on feature attenuator - Google Patents

Image segmentation based on feature attenuator

Info

Publication number
AU2021105161A4
Authority
AU
Australia
Prior art keywords
feature
convolution
layer
edges
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2021105161A
Inventor
Zhenya Yue
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunshigao Technology Co Ltd
Original Assignee
Yunshigao Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunshigao Technology Co Ltd filed Critical Yunshigao Technology Co Ltd
Priority to AU2021105161A priority Critical patent/AU2021105161A4/en
Application granted granted Critical
Publication of AU2021105161A4 publication Critical patent/AU2021105161A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 - Image coding
    • G06T9/20 - Contour coding, e.g. using detection of edges

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Image segmentation is one of the most popular techniques in vision applications, i.e., retrieving object information from an image. Segmentation provides not only the class and location of an object but its associated contour as well. Recent advances in segmentation deploy encoder-decoder architectures with skip connections and also utilize edge information to obtain the best contour. This patent focuses on maintaining the information flow from the skip connections and on identifying suitable features from the bottom layer. The feature selector can be applied to edge information to yield the segmentation contour, and our method can also maintain the deep learning feature maps through information filtering. The filtering module outperforms former schemes with skip connections. Moreover, the proposed method can attenuate the edge information so that the most important edges are selected for reconstructing each class.

Description

Fig. 1: Sets of filtering layers and factors (feature map, filtering layer, edge detector, edges of feature map, weighted edges for contour reconstruction).
Fig. 2: Panels (a) and (b).
1. Background and Purpose
Image segmentation aims to automatically label objects within images. The nature and characteristics of an object can then be identified from the segmented parts. Moreover, segmentation provides the location and contours of the object, which is far more informative than simple object detection. Image segmentation has become increasingly popular in applications that require both accuracy and the precise location of an object. However, designing fast and accurate image segmentation for real-time conditions is highly challenging due to the limitations of embedded systems and hardware.
Recently, image segmentation has been significantly improved with the aid of deep learning. In this approach, segmentation is performed pixel-wise, and each pixel is assigned to a specific segment. A Convolutional Neural Network (CNN) is commonly used as the feature extractor. Semantic segmentation is then executed, and objects are categorized into different classes, but without distinguishing instances. The most popular schemes for semantic segmentation deploy autoencoders such as ENet, ResNet, and DenseNet. These models have skip (layer-jump) connections, a structure that reduces the loss of information during the convolution process. However, not all skip-connection features are important for the subsequent reconstruction, and the unnecessary computation significantly reduces running speed. Another issue concerns contour detection: the detected edges do not contribute equally to segmentation contours in the classification problem.
This patent focuses on maintaining the information flow from the skip connections and on identifying suitable features from the bottom layer. The feature selector can be applied to edge information to yield the segmentation contour, and our method can also maintain the deep learning feature maps through information filtering. The filtering module outperforms former schemes with skip connections. Moreover, the proposed method can attenuate the edge information so that the most important edges are selected for reconstructing each class.
2. Image Segmentation Method Description
2.1 Efficient Network (ENet)
We use ENet as the basic structure for image segmentation. ENet consists of several kinds of layers: initial, bottleneck, and full convolution. The initial layer is composed of 13 convolution filters, concatenated with a max-pooling branch for the remaining channels to form the initial feature. Among the bottleneck layers, ENet has several different types: regular, dilated, and asymmetric. The regular convolution in ENet is a naive 3x3 convolution, and the dilated convolution is used to capture a wide input area at cheap computational cost. The asymmetric convolution is a separable convolution; in ENet its size is 5x5, separated into two convolutions, a 1x5 followed by a 5x1. The full convolution layer, also known as the transposed convolution, is used as the last upsampling layer. The other operations are max pooling for downsampling and max unpooling for upsampling. Max pooling uses a 2x2 kernel with stride 2, and the indices of the pooling process are saved for use in the upsampling process. During unpooling, the values from the lower spatial resolution are copied into the positions given by the saved indices, and the other values of the upsampled feature are set to zero.
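For concreteness, a minimal PyTorch sketch of such an initial block is given below. It is an illustration only, assuming the standard ENet configuration of a 3x3 stride-2 convolution with 13 filters concatenated with a 2x2 stride-2 max pooling of the 3 input channels; the class and variable names are our own.

```python
import torch
import torch.nn as nn

class InitialBlock(nn.Module):
    """ENet-style initial block: a 3x3/stride-2 convolution with 13 filters,
    concatenated with a 2x2/stride-2 max pooling of the input channels,
    giving a 16-channel initial feature map for an RGB input."""
    def __init__(self, in_channels: int = 3, conv_filters: int = 13):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, conv_filters,
                              kernel_size=3, stride=2, padding=1, bias=False)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.bn = nn.BatchNorm2d(conv_filters + in_channels)
        self.act = nn.PReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Concatenate the learned filters with the pooled input channels.
        out = torch.cat([self.conv(x), self.pool(x)], dim=1)
        return self.act(self.bn(out))

# Example: a 512x512 RGB image yields a 16-channel, 256x256 initial feature.
feat = InitialBlock()(torch.randn(1, 3, 512, 512))
print(feat.shape)  # torch.Size([1, 16, 256, 256])
```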
2.2 Model Structure
Step 1: Attenuator layers
This layer filters the information before passing it to the top layer. By filtering the features based on their importance, the decoder is not overwhelmed, and the reconstruction can be performed better. There are two terms in the attenuator layers: the factor $f$ and the value $\theta$. The factor is a feature in the range 0 to 1 that acts as a weighting, indicating the importance of the feature. The value is the feature that is attenuated by the factor to form the final feature. The factor is the result of filtering the feature map $\theta$ by a sequence of convolutional layers $\phi$, followed by the sigmoid activation function $\sigma$:

$$f = \sigma(\phi(\theta)) \tag{1}$$

$$\theta' = \theta \odot f(\theta) \tag{2}$$

where $\odot$ denotes element-wise multiplication. The filtering layers analyze the importance of the feature, and a bottleneck layer is used for this process. The dimension of the feature is first reduced with a 1x1 convolution. Subsequently, a 3x3 separable convolution is applied, in which the 1x3 convolution is conducted before the 3x1. The feature is then expanded back to its original depth through a 1x1 convolution. A batch normalization layer and the PReLU activation function are deployed inside the filtering layer. Finally, the sigmoid activation function normalizes the factor to values between 0 and 1.
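Below is a minimal PyTorch sketch of one such attenuator layer under our reading of equations (1) and (2); the class name and the depth-reduction ratio are illustrative assumptions, not specified in the text.

```python
import torch
import torch.nn as nn

class Attenuator(nn.Module):
    """Filtering (attenuator) layer: computes a factor f = sigmoid(phi(theta))
    via a bottleneck of 1x1 -> separable 3x3 (1x3 then 3x1) -> 1x1
    convolutions, then attenuates the value: theta' = theta * f."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        mid = max(channels // reduction, 1)
        self.phi = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1, bias=False),  # reduce depth
            nn.BatchNorm2d(mid), nn.PReLU(),
            nn.Conv2d(mid, mid, kernel_size=(1, 3), padding=(0, 1), bias=False),
            nn.Conv2d(mid, mid, kernel_size=(3, 1), padding=(1, 0), bias=False),
            nn.BatchNorm2d(mid), nn.PReLU(),
            nn.Conv2d(mid, channels, kernel_size=1, bias=False),  # expand depth
            nn.BatchNorm2d(channels),
        )

    def forward(self, theta: torch.Tensor) -> torch.Tensor:
        f = torch.sigmoid(self.phi(theta))  # factor in (0, 1), eq. (1)
        return theta * f                    # attenuated value, eq. (2)
```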
Step 2: Edge selector method
In this method, every class has a different attenuator. After the attenuator produces the weights and weights the values, the results are averaged and summed for the respective class. The edge features of the object then need to be extracted from this information. In our method, a CNN layer is used to determine which information is edge information:

$$G_x = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} * \left( \begin{bmatrix} -1 & 0 & 1 \end{bmatrix} * \theta_{ib} \right) \tag{3}$$

$$G_y = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} * \left( \begin{bmatrix} 1 & 2 & 1 \end{bmatrix} * \theta_{ib} \right) \tag{4}$$

$$G = G_x + G_y \tag{5}$$

The input feature maps are fed into the Sobel operator to detect the edges of the feature map. The $G_x$ operator computes the gradient of the feature maps in the x direction, as shown in (3), where $*$ is the convolution operator. The separable Sobel operator is used to reduce the computational complexity compared with the original 3x3 kernel. The same procedure is applied in the y direction using the $G_y$ operator in (4). Finally, (5) is applied to the feature maps of the initial block in ENet, $\theta_{ib}$. The result then passes through an activation layer to filter for better edge features:

$$f_E = \sigma(\phi(E(\theta_{ib}))) \tag{6}$$
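The separable Sobel operator of equations (3)-(5) can be sketched in PyTorch as follows; applying it depthwise (one filter per channel) and the function name are our own assumptions.

```python
import torch
import torch.nn.functional as F

def sobel_edges(theta_ib: torch.Tensor) -> torch.Tensor:
    """Separable Sobel: G_x = [1,2,1]^T * ([-1,0,1] * x) and
    G_y = [-1,0,1]^T * ([1,2,1] * x), then G = G_x + G_y (eqs. 3-5).
    Applied depthwise so each feature channel is filtered independently."""
    c = theta_ib.shape[1]
    row = lambda k: torch.tensor(k, dtype=theta_ib.dtype).view(1, 1, 1, 3).repeat(c, 1, 1, 1)
    col = lambda k: torch.tensor(k, dtype=theta_ib.dtype).view(1, 1, 3, 1).repeat(c, 1, 1, 1)

    # G_x: horizontal derivative [-1,0,1], then vertical smoothing [1,2,1]^T
    gx = F.conv2d(theta_ib, row([-1.0, 0.0, 1.0]), padding=(0, 1), groups=c)
    gx = F.conv2d(gx, col([1.0, 2.0, 1.0]), padding=(1, 0), groups=c)

    # G_y: horizontal smoothing [1,2,1], then vertical derivative [-1,0,1]^T
    gy = F.conv2d(theta_ib, row([1.0, 2.0, 1.0]), padding=(0, 1), groups=c)
    gy = F.conv2d(gy, col([-1.0, 0.0, 1.0]), padding=(1, 0), groups=c)

    return gx + gy  # eq. (5)
```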
Step 3: Feature fusion
The attenuated edge filters are averaged to obtain the single-class dedicated edge feature $CE$ using (7):

$$CE = \frac{1}{d} \sum_{m=1}^{d} f_{E_m} \odot E(\theta_{ib}) \tag{7}$$

The $CE$ features are later concatenated to form the final attenuated edge feature $f_{CE}$ using (8):

$$f_{CE} = \{CE_1, CE_2, \ldots, CE_d\} \tag{8}$$

The $f_{CE}$ feature represents the most important edges across the feature maps that are beneficial for the specific class feature. This feature is later combined, using element-wise summation, with the class feature from the bottom layer $\theta_b$ to give the final feature map $\theta_f$ that is fed to the top layer, using (9):

$$\theta_f = f_{CE} + \theta_b \tag{9}$$
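A minimal sketch, under one reading of equations (7)-(9), of how the attenuated edge features might be averaged, stacked per class, and fused with the bottom-layer features; all tensor shapes and names here are illustrative assumptions.

```python
import torch

def fuse_features(edge_maps: torch.Tensor,
                  edge_factors: torch.Tensor,
                  theta_b: torch.Tensor) -> torch.Tensor:
    """edge_maps:    E(theta_ib), shape (N, d, H, W)
    edge_factors: per-class edge factors f_E, shape (N, num_classes, d, H, W)
    theta_b:      bottom-layer class features, shape (N, num_classes, H, W)
    Returns theta_f = f_CE + theta_b (eq. 9)."""
    # Eq. (7): average the attenuated edges over the d feature maps
    # to get one dedicated edge feature CE per class.
    ce = (edge_factors * edge_maps.unsqueeze(1)).mean(dim=2)  # (N, classes, H, W)
    # Eq. (8): the stacked CEs form the final attenuated edge feature f_CE.
    f_ce = ce
    # Eq. (9): element-wise summation with the bottom-layer class features.
    return f_ce + theta_b

# Example with 2 classes and d = 16 edge feature maps.
theta_f = fuse_features(torch.randn(1, 16, 64, 64),
                        torch.rand(1, 2, 16, 64, 64),
                        torch.randn(1, 2, 64, 64))
print(theta_f.shape)  # torch.Size([1, 2, 64, 64])
```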
3. Brief Description of the Drawings
Fig. 1: Selecting features based on their importance.
Fig. 2: Results of the filtered edge information. (a) The edge features for the sky class. (b) The edge features for the tree class.

Claims (2)

The claims defining the invention are as follows:
Image segmentation based on feature attenuator
1. Image Segmentation Method Description
1.1 Efficient Network (ENet)
We use ENet as the basic structure for image segmentation. ENet consists of several kinds of layers: initial, bottleneck, and full convolution. The initial layer is composed of 13 convolution filters, concatenated with a max-pooling branch for the remaining channels to form the initial feature. Among the bottleneck layers, ENet has several different types: regular, dilated, and asymmetric. The regular convolution in ENet is a naive 3x3 convolution, and the dilated convolution is used to capture a wide input area at cheap computational cost. The asymmetric convolution is a separable convolution; in ENet its size is 5x5, separated into two convolutions, a 1x5 followed by a 5x1. The full convolution layer, also known as the transposed convolution, is used as the last upsampling layer. The other operations are max pooling for downsampling and max unpooling for upsampling. Max pooling uses a 2x2 kernel with stride 2, and the indices of the pooling process are saved for use in the upsampling process. During unpooling, the values from the lower spatial resolution are copied into the positions given by the saved indices, and the other values of the upsampled feature are set to zero.
1.2 Model Structure
Step 1: Attenuator layers
This layer filters the information before passing it to the top layer. By filtering the features based on their importance, the decoder is not overwhelmed, and the reconstruction can be performed better. There are two terms in the attenuator layers: the factor $f$ and the value $\theta$. The factor is a feature in the range 0 to 1 that acts as a weighting, indicating the importance of the feature. The value is the feature that is attenuated by the factor to form the final feature. The factor is the result of filtering the feature map $\theta$ by a sequence of convolutional layers $\phi$, followed by the sigmoid activation function $\sigma$:

$$f = \sigma(\phi(\theta)) \tag{1}$$

$$\theta' = \theta \odot f(\theta) \tag{2}$$

where $\odot$ denotes element-wise multiplication. The filtering layers analyze the importance of the feature, and a bottleneck layer is used for this process. The dimension of the feature is first reduced with a 1x1 convolution. Subsequently, a 3x3 separable convolution is applied, in which the 1x3 convolution is conducted before the 3x1. The feature is then expanded back to its original depth through a 1x1 convolution. A batch normalization layer and the PReLU activation function are deployed inside the filtering layer. Finally, the sigmoid activation function normalizes the factor to values between 0 and 1.
Step 2: Edge selector method
In this method, every class has a different attenuator. After the attenuator produces the weights and weights the values, the results are averaged and summed for the respective class. The edge features of the object then need to be extracted from this information. In our method, a CNN layer is used to determine which information is edge information:

$$G_x = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} * \left( \begin{bmatrix} -1 & 0 & 1 \end{bmatrix} * \theta_{ib} \right) \tag{3}$$

$$G_y = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} * \left( \begin{bmatrix} 1 & 2 & 1 \end{bmatrix} * \theta_{ib} \right) \tag{4}$$

$$G = G_x + G_y \tag{5}$$

The input feature maps are fed into the Sobel operator to detect the edges of the feature map. The $G_x$ operator computes the gradient of the feature maps in the x direction, as shown in (3), where $*$ is the convolution operator. The separable Sobel operator is used to reduce the computational complexity compared with the original 3x3 kernel. The same procedure is applied in the y direction using the $G_y$ operator in (4). Finally, (5) is applied to the feature maps of the initial block in ENet, $\theta_{ib}$. The result then passes through an activation layer to filter for better edge features:

$$f_E = \sigma(\phi(E(\theta_{ib}))) \tag{6}$$
Step 3: Feature fusion
The attenuated edge filters are averaged to obtain the single-class dedicated edge feature $CE$ using (7):

$$CE = \frac{1}{d} \sum_{m=1}^{d} f_{E_m} \odot E(\theta_{ib}) \tag{7}$$

The $CE$ features are later concatenated to form the final attenuated edge feature $f_{CE}$ using (8):

$$f_{CE} = \{CE_1, CE_2, \ldots, CE_d\} \tag{8}$$

The $f_{CE}$ feature represents the most important edges across the feature maps that are beneficial for the specific class feature. This feature is later combined, using element-wise summation, with the class feature from the bottom layer $\theta_b$ to give the final feature map $\theta_f$ that is fed to the top layer, using (9):

$$\theta_f = f_{CE} + \theta_b \tag{9}$$
AU2021105161A 2021-08-09 2021-08-09 Image segmentation based on feature attenuator Ceased AU2021105161A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2021105161A AU2021105161A4 (en) 2021-08-09 2021-08-09 Image segmentation based on feature attenuator

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2021105161A AU2021105161A4 (en) 2021-08-09 2021-08-09 Image segmentation based on feature attenuator

Publications (1)

Publication Number Publication Date
AU2021105161A4 true AU2021105161A4 (en) 2021-10-07

Family

ID=77922952

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2021105161A Ceased AU2021105161A4 (en) 2021-08-09 2021-08-09 Image segmentation based on feature attenuator

Country Status (1)

Country Link
AU (1) AU2021105161A4 (en)


Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry