CN115147703B - Garbage segmentation method and system based on GinTrans network

Info

Publication number: CN115147703B
Application number: CN202210901322.8A
Authority: CN
Inventors: 唐军
Original assignee: Guangdong Xiaobailong Environmental Protection Technology Co., Ltd.
Current assignee: Guangdong Xiaobailong Environmental Protection Technology Co., Ltd.
Priority date: 2022-07-28
Filing date: 2022-07-28
Publication date: 2023-11-03
Anticipated expiration: 2042-07-28
Also published as: CN115147703A

Abstract

The invention discloses a garbage segmentation method based on a GinTrans network, comprising the following steps: (1) image segmentation system setup; (2) image acquisition input; (3) feature map extraction; (4) cutting identification recombination; (5) remodelling and fusing; (6) jump aggregation; (7) segmentation output. The invention also discloses an image segmentation system. The feature maps required for feature mining are generated through linear transformations, which reduces the network parameters and complexity, safeguards segmentation efficiency and improves response speed. The Bi-Frequency Transformer (BiFTrans block) module processes high-frequency and low-frequency features separately, capturing high-frequency detail together with low-frequency global information; this keeps image processing fast and stable, improves the precision and accuracy of garbage target segmentation, and improves classification efficiency.

Description

Garbage segmentation method and system based on GinTrans network

Technical Field

The invention relates to the technical field of vision processing, in particular to a garbage segmentation method and system based on a GinTrans network.

Background

With rapid economic development and rising living standards, the garbage people generate has become increasingly varied and complex, and indiscriminate landfilling can no longer satisfy the goals of protecting nature and green development. Sorting garbage allows resources to be recycled, reduces harm to soil and prevents air pollution, so garbage classification is an important step toward an environmentally friendly society and resource recycling. When classification starts with each household and each individual, a large amount of manpower and material resources is saved in subsequent work: recyclable garbage is recycled, non-recyclable matter is deposited correctly, and the remaining garbage is disposed of properly. At present, in the 46 key cities that were the first in China to pilot garbage classification, household garbage classification covers 86.6% of residential districts, the average household garbage recycling rate is 30.4%, and kitchen garbage treatment capacity rose from 34,700 tons per day in 2019 to 62,800 tons per day in 2020.

However, although China has made substantial progress in household garbage classification and treatment in recent years, classification remains a difficult problem in some cities and communities. Most garbage sorting centers still rely on manual sorting, which is inefficient and exposes workers to injury from sharp objects. Segmenting garbage with artificial intelligence, using computer vision and image processing, can greatly raise sorting efficiency and classify accurately, screening out recyclable garbage mixed into other garbage so that resources are recycled efficiently. Existing vision-based garbage classification systems nevertheless have the following problems: 1. garbage segmentation methods based on traditional image processing have low segmentation precision; 2. segmentation methods based on conventional deep learning networks have many model parameters and high computational complexity, which slows subsequent image processing; 3. garbage targets are highly varied, severely deformed and displaced, and the boundaries between targets are very unclear, so existing methods cannot capture detail and contour information well; segmentation is therefore difficult, the mis-segmentation rate is high, and subsequent classification accuracy suffers.

Disclosure of Invention

The invention provides a garbage segmentation method based on a GinTrans network, aiming at the defects of the prior art.

The invention also discloses an image segmentation system.

The technical scheme adopted by the invention for achieving the purpose is as follows:

A garbage segmentation method based on a GinTrans network comprises the following steps:

(1) Image segmentation system settings: the image segmentation system comprises an image acquisition module, an extraction module, a cutting identification recombination module, a remodelling fusion output module, a jump connection aggregation module and a segmentation output module, wherein the extraction module is provided with a GhostNet network, the remodelling fusion output module is provided with a Bi-Frequency Transformer (BiFTrans block) module, the jump connection aggregation module is provided with an up-sampling recovery module and a GhostNet down-sampling feature extraction module, and the segmentation output module is provided with a segmentation head module;

(2) Image acquisition input: the image acquisition module is connected to an image acquisition camera; the camera captures images of the garbage on the intelligent garbage collection and sorting line and transmits them to the image acquisition module, and the acquired images are passed as image input data to the extraction module for processing;

(3) Feature map extraction: the extraction module processes the image input data, using the GhostNet network to extract low-level features from it, and outputs a feature map for the next step;

(4) Cutting identification recombination: the cutting identification recombination module cuts the extracted output feature map into a grid; after gridding, marker features (tokens) are generated by linear mapping, the marker features form a marker feature sequence, and this sequence serves as the input of the Bi-Frequency Transformer (BiFTrans block) module;

(5) Remodelling and fusing: the remodelling fusion output module reshapes the input marker feature sequence and uses the dual-frequency mixer in the Bi-Frequency Transformer (BiFTrans block) module to fuse the features from the different frequencies, yielding the fused feature output;

(6) Jump aggregation: the up-sampling recovery module restores the fused feature output to the same size as the image input data; the GhostNet down-sampling feature extraction module extracts low-level features from the feature map of the corresponding resolution level; the up-sampled result is then joined by a jump (skip) connection to the same-resolution feature map from the GhostNet down-sampling feature extraction module, so that feature maps from different resolution levels are aggregated;

(7) Segmentation output: the segmentation head module of the segmentation output module segments the aggregated feature map to obtain the target image.
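
The jump aggregation of step (6) can be illustrated with a minimal sketch. The source names neither the up-sampling method nor the aggregation operator, so nearest-neighbour up-sampling by a factor of 2 and channel-wise concatenation are assumptions here, and the function names `upsample_nearest` and `skip_aggregate` are hypothetical.

```python
# Hypothetical sketch of step (6): up-sample decoder features and aggregate
# them with same-resolution encoder features through a skip connection.
# Feature maps are nested lists indexed as [channel][row][col].

def upsample_nearest(fmap, factor=2):
    """Nearest-neighbour up-sampling (an assumption; the source names no method)."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(factor)]   # repeat columns
            rows.extend([list(wide) for _ in range(factor)])  # repeat rows
        out.append(rows)
    return out

def skip_aggregate(decoder_fmap, encoder_fmap, factor=2):
    """Up-sample the decoder map, then concatenate the encoder channels onto it."""
    up = upsample_nearest(decoder_fmap, factor)
    same_h = len(up[0]) == len(encoder_fmap[0])
    same_w = len(up[0][0]) == len(encoder_fmap[0][0])
    if not (same_h and same_w):
        raise ValueError("skip connection needs matching spatial sizes")
    return up + encoder_fmap  # channel-wise concatenation (assumed operator)

decoder = [[[1, 2], [3, 4]]]              # 1 channel, 2x2
encoder = [[[0] * 4 for _ in range(4)]]   # 1 channel, 4x4
merged = skip_aggregate(decoder, encoder)
print(len(merged), len(merged[0]), len(merged[0][0]))  # 2 4 4
```

Concatenation keeps the encoder detail and the decoder context side by side for the segmentation head; an element-wise sum would be another plausible choice that the text does not rule out.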

As a further improvement, the GhostNet network is provided with a Ghost module, designed as follows:

(3.1) the garbage image to be identified from step (2) is taken as input data, and the GhostNet network generates original feature maps $f_i'$ of different resolutions by convolution;

(3.2) the extraction module contains linear operations $\Phi$ and uses them to generate Ghost feature maps as the output feature maps, according to $f_{ij} = \Phi_{i,j}(f_i')$, where $f_i'$ is the $i$-th original feature map obtained by convolution of the garbage image to be identified, $\Phi_{i,j}$ is the $j$-th linear operation applied to the $i$-th original feature map, and $f_{ij}$ is the resulting $j$-th Ghost feature map.
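
A minimal sketch of the Ghost feature map generation in step (3.2) follows. The text only says that each linear operation is cheap; modelling it as a 3x3 per-map filter with zero padding is an assumption borrowed from common GhostNet practice, and the names `depthwise_3x3`, `ghost_module`, `IDENTITY` and `BOX_BLUR` are hypothetical.

```python
# Hypothetical sketch of the Ghost module: each original feature map f'_i
# yields several Ghost feature maps f_ij = Phi_ij(f'_i).
# Assumption: Phi_ij is a 3x3 per-map filter with zero padding and stride 1.

def depthwise_3x3(fmap, kernel):
    """Apply one 3x3 kernel to one 2-D feature map."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:  # skip zero-padded border
                        acc += kernel[dr + 1][dc + 1] * fmap[rr][cc]
            out[r][c] = acc
    return out

def ghost_module(original_maps, kernels):
    """Return the Ghost maps of every original map; kernels[i][j] stands in for Phi_ij."""
    ghosts = []
    for f_i, kernels_i in zip(original_maps, kernels):
        ghosts.extend(depthwise_3x3(f_i, k) for k in kernels_i)
    return ghosts

IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
BOX_BLUR = [[1 / 9] * 3 for _ in range(3)]

f_prime = [[[1.0, 2.0], [3.0, 4.0]]]   # one original 2x2 feature map
ghost_maps = ghost_module(f_prime, [[IDENTITY, BOX_BLUR]])
print(len(ghost_maps))   # 2
print(ghost_maps[0])     # [[1.0, 2.0], [3.0, 4.0]]
```

The point of the construction is that only the original maps need a full convolution over all input channels; every additional map costs one small per-map filter.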

As a further improvement, step (4) further comprises the following steps:

(4.1) each Ghost feature map output in step (3) is cut into a grid; each feature map has size $H \times W$, where $H$ and $W$ are its length and width, and each grid sub-feature map has size $s \times s$, where $s$ is the side length of a grid cell; $s$ divides both $H$ and $W$, so the map can be cut exactly into grid sub-feature maps;

(4.2) the grid sub-feature maps are linearly mapped, each one to a marker feature, and together they form the marker feature sequence $X^i$, $i = 1, 2, \dots, N$, where $N$ is the number of marker features.
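
Steps (4.1) and (4.2) can be sketched as follows. The projection weights and the output dimension are not given in the text, so the toy matrix `weight` and the helper names `grid_cut` and `linear_map` are hypothetical.

```python
# Hypothetical sketch of step (4): grid-cut an H x W feature map into s x s
# cells, then linearly map each flattened cell to one marker feature (token).

def grid_cut(fmap, s):
    """Split a 2-D map into non-overlapping s x s cells, in row-major order."""
    h, w = len(fmap), len(fmap[0])
    if h % s or w % s:
        raise ValueError("s must divide both H and W")
    cells = []
    for top in range(0, h, s):
        for left in range(0, w, s):
            cells.append([fmap[top + r][left + c]
                          for r in range(s) for c in range(s)])
    return cells

def linear_map(cells, weight):
    """Project each flattened cell (length s*s) with an (s*s x d) weight matrix."""
    d = len(weight[0])
    return [[sum(v * weight[k][j] for k, v in enumerate(cell)) for j in range(d)]
            for cell in cells]

fmap = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 map, values 0..15
cells = grid_cut(fmap, s=2)                # N = (4 / 2) * (4 / 2) = 4 cells
weight = [[1, 0], [0, 1], [0, 0], [0, 0]]  # toy 4 -> 2 projection (assumed)
tokens = linear_map(cells, weight)
print(len(tokens), tokens[0])  # 4 [0, 1]
```

With an $H \times W$ map and cell side $s$, the sequence length is $N = HW / s^2$, which is why $s$ must divide both dimensions.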

As a further improvement, step (5) further comprises up-sampling aggregation and skip linking, comprising the following steps:

(5.1) layer normalization is first applied to the marker feature sequence obtained in step (4), giving the normalized marker feature sequence $X$;

(5.2) each marker feature in the sequence from the previous step is split by frequency decomposition into a high-frequency marker feature and a low-frequency marker feature;

(5.3) the high-frequency features first undergo a max pooling operation, then pass through a linear layer, and then through a depthwise separable convolution layer; the pooling layer, the linear layer and the depthwise separable convolution layer form the high-frequency feature processor, whose output is $Y_{high} = \mathrm{DConv}(\mathrm{FC}(\mathrm{MaxPool}(X_{high})))$, where $Y_{high}$ is the feature output of the high-frequency feature processor, DConv is depthwise separable convolution, FC is the linear fully connected layer, MaxPool is max pooling, and $X_{high}$ is the high-frequency marker feature obtained by the decomposition in step (5.2);

(5.4) the low-frequency marker features first undergo an average pooling operation, then pass through a self-attention layer MSA, and are then up-sampled to compensate for the size reduction caused by the average pooling; the average pooling layer, the self-attention layer and the up-sampling layer form the low-frequency feature processor, whose output is $Y_{low} = \mathrm{Upsample}(\mathrm{MSA}(\mathrm{AvePool}(X_{low})))$, where $Y_{low}$ is the feature output of the low-frequency feature processor, Upsample is up-sampling, MSA is the self-attention mechanism, AvePool is average pooling, and $X_{low}$ is the low-frequency marker feature obtained by the decomposition in step (5.2);

(5.5) high- and low-frequency feature fusion: the high-frequency feature output and the low-frequency feature output are fused to give the fused output $Y_o = \mathrm{Concat}(Y_{high}, Y_{low})$, where $Y_o$ is the fused feature output and Concat is the feature concatenation function;

(5.6) layer normalization is applied again to the fused feature output;

(5.7) after normalization, the result passes through a feed-forward network FFN to give the final output of the BiFTrans block module, which is expressed in terms of the marker feature sequence of step (4), the fused high- and low-frequency feature output $Y_o$, the feed-forward network processing FFN and the layer normalization processing LN.
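
The dual-frequency mixer of steps (5.2) to (5.5) can be sketched in a few lines. Several details are not stated in the text and are assumptions here: the frequency decomposition is modelled as a channel split, pooling runs along the marker (token) axis, FC and DConv use fixed toy weights with DConv reduced to its depthwise part, and MSA is single-head attention with identity projections. All function names are hypothetical, and the layer normalization and FFN of steps (5.1), (5.6) and (5.7) are left out.

```python
import math

# Hypothetical sketch of the dual-frequency mixer, steps (5.2)-(5.5).
# Marker features form a list of N vectors with C channels each.

def max_pool(tokens):
    """Window 3, stride 1 max pooling along the token axis (keeps N)."""
    n = len(tokens)
    return [[max(tokens[j][c] for j in range(max(0, i - 1), min(n, i + 2)))
             for c in range(len(tokens[0]))] for i in range(n)]

def fc(tokens, weight):
    """Per-token linear layer; weight has shape (C_in x C_out)."""
    return [[sum(t[k] * weight[k][j] for k in range(len(t)))
             for j in range(len(weight[0]))] for t in tokens]

def dconv(tokens, kernel):
    """Depthwise 1-D filter (size 3, zero padding); kernel[c] acts on channel c only."""
    n = len(tokens)
    return [[sum(kernel[c][d + 1] * tokens[i + d][c]
                 for d in (-1, 0, 1) if 0 <= i + d < n)
             for c in range(len(tokens[0]))] for i in range(n)]

def ave_pool(tokens):
    """Window 2, stride 2 average pooling along the token axis (N must be even)."""
    return [[(a + b) / 2 for a, b in zip(tokens[i], tokens[i + 1])]
            for i in range(0, len(tokens), 2)]

def msa(tokens):
    """Single-head scaled dot-product self-attention with Q = K = V = tokens."""
    scale = math.sqrt(len(tokens[0]))
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        peak = max(scores)                      # subtract max for stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        out.append([sum(e / total * v[c] for e, v in zip(exps, tokens))
                    for c in range(len(q))])
    return out

def upsample(tokens):
    """Nearest-neighbour x2 up-sampling, undoing the average pooling stride."""
    return [list(t) for t in tokens for _ in range(2)]

def dual_frequency_mixer(x, fc_weight, dconv_kernel):
    half = len(x[0]) // 2
    x_high = [t[:half] for t in x]   # assumed high-frequency part
    x_low = [t[half:] for t in x]    # assumed low-frequency part
    y_high = dconv(fc(max_pool(x_high), fc_weight), dconv_kernel)
    y_low = upsample(msa(ave_pool(x_low)))
    return [h + l for h, l in zip(y_high, y_low)]  # Concat along channels

x = [[1.0, 0.0, 2.0, 2.0], [0.0, 3.0, 4.0, 4.0],
     [2.0, 1.0, 6.0, 6.0], [0.0, 0.0, 8.0, 8.0]]   # N = 4 markers, C = 4
identity_fc = [[1.0, 0.0], [0.0, 1.0]]
identity_dconv = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
y_o = dual_frequency_mixer(x, identity_fc, identity_dconv)
print(len(y_o), len(y_o[0]))  # 4 4
```

The max-pooling branch keeps sharp local responses, while attention on the pooled sequence mixes information globally at a quarter of the pairwise cost; up-sampling restores the sequence length so that the two outputs can be concatenated.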

The invention has the following beneficial effects. Through GhostNet, the feature maps required for feature mining are generated with simple linear transformations, which reduces the network parameters and complexity, safeguards the efficiency of subsequent segmentation and improves response speed. The dual-frequency mixer in the Bi-Frequency Transformer (BiFTrans block) module processes high-frequency and low-frequency features separately, capturing high-frequency detail while also taking low-frequency global information into account, which keeps image processing fast and stable. Because the high-frequency and low-frequency features are fused after processing, the GhostNet-based segmentation network learns the combined high- and low-frequency characteristics of garbage image data more effectively; this improves the precision and accuracy of segmenting garbage targets that are severely deformed, displaced and blurred at their boundaries, and improves subsequent classification efficiency.
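
The parameter saving claimed for the Ghost feature maps can be made concrete with a short count. The layer sizes below are illustrative only and do not come from the source; the sketch assumes $k \times k$ primary kernels, $d \times d$ cheap per-map kernels, no bias terms, and a fixed ratio of output maps per original map. The function names `conv_params` and `ghost_params` are hypothetical.

```python
# Hypothetical parameter-count comparison between an ordinary convolution
# and a Ghost module producing the same number of output feature maps.

def conv_params(c_in, n_out, k):
    """Weights of an ordinary k x k convolution without bias."""
    return c_in * n_out * k * k

def ghost_params(c_in, n_out, k, ratio, d):
    """Weights of a Ghost module: a smaller convolution plus cheap per-map filters."""
    original = n_out // ratio                 # maps produced by real convolution
    primary = c_in * original * k * k
    cheap = original * (ratio - 1) * d * d    # one d x d kernel per extra Ghost map
    return primary + cheap

plain = conv_params(c_in=64, n_out=128, k=3)
ghost = ghost_params(c_in=64, n_out=128, k=3, ratio=2, d=3)
print(plain, ghost)  # 73728 37440
```

With a ratio of 2 the layer needs roughly half the weights, because the cheap filters contribute only a small fraction of the total.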

The invention will be further described with reference to the drawings and the detailed description.

Drawings

Fig. 1 is a flow chart of the garbage segmentation method based on the GinTrans network in the present embodiment;

Fig. 2 is a schematic diagram of the GinTrans network structure in the present embodiment;

Fig. 3 is a schematic structural diagram of the Ghost module in the present embodiment;

Fig. 4 is a schematic structural diagram of the BiFTrans block module in the present embodiment;

Fig. 5 is a schematic structural diagram of the dual-frequency mixer in the present embodiment.

Detailed Description

The following description is of the preferred embodiments of the invention, and is not intended to limit the scope of the invention.

Referring to Figs. 1 to 5, the embodiment of the garbage segmentation method based on the GinTrans network carries out steps (1) to (7), with sub-steps (3.1) to (3.2), (4.1) to (4.2) and (5.1) to (5.7), as set out in the Disclosure of Invention above, and obtains the beneficial effects described there.

The present invention is not limited to the above embodiments; other garbage segmentation methods and systems based on a GinTrans network that are obtained using structures, devices, processes or methods identical or similar to those of the above embodiments all fall within the scope of protection of the present invention.

Claims

1. A garbage segmentation method based on a GinTrans network is characterized by comprising the following steps:

2. The garbage segmentation method based on the GinTrans network according to claim 1, wherein the GhostNet network is provided with a Ghost module, and the Ghost module is designed as follows:

(3.2) the extraction module contains linear operations $\Phi$ and uses them to generate Ghost feature maps as the output feature maps, according to $f_{ij} = \Phi_{i,j}(f_i')$, where $f_i'$ is the $i$-th original feature map obtained by convolution of the garbage image to be identified, $\Phi_{i,j}$ is the $j$-th linear operation applied to the $i$-th original feature map, and $f_{ij}$ is the resulting $j$-th Ghost feature map.

3. The GinTrans network-based garbage segmentation method according to claim 2, wherein the step (4) further comprises the steps of:

4. The GinTrans network-based garbage segmentation method according to claim 3, wherein the step (5) further comprises up-sampling aggregation and skip linking, which comprises the steps of:

(5.3) the high-frequency features first undergo a max pooling operation, then pass through a linear layer, and then through a depthwise separable convolution layer; the pooling layer, the linear layer and the depthwise separable convolution layer form the high-frequency feature processor, whose output is $Y_{high} = \mathrm{DConv}(\mathrm{FC}(\mathrm{MaxPool}(X_{high})))$, where $Y_{high}$ is the feature output of the high-frequency feature processor, DConv is depthwise separable convolution, FC is the linear fully connected layer, MaxPool is max pooling, and $X_{high}$ is the high-frequency marker feature obtained by the decomposition in step (5.2);