CN115147703A - GinTrans network-based garbage segmentation method and system - Google Patents
- Publication number
- CN115147703A (application number CN202210901322.8A)
- Authority
- CN
- China
- Prior art keywords
- feature
- module
- output
- frequency
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V 20/00 — Scenes; scene-specific elements
- G06N 3/04 — Neural networks; architecture, e.g. interconnection topology
- G06N 3/08 — Neural networks; learning methods
- G06V 10/12 — Details of acquisition arrangements; constructional details thereof
- G06V 10/26 — Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
- G06V 10/40 — Extraction of image or video features
- G06V 10/806 — Fusion of extracted features at the sensor, preprocessing, feature-extraction or classification level
- G06V 10/82 — Image or video recognition or understanding using neural networks
- G06V 2201/07 — Target detection
Abstract
The invention discloses a garbage segmentation method based on a GinTrans network, which comprises the following steps: (1) image segmentation system setup; (2) image acquisition and input; (3) feature map extraction; (4) cutting, marking and recombination; (5) reshaping and fusion; (6) skip aggregation; and (7) segmentation output. The invention also discloses an image segmentation system. The feature maps required for feature mining are generated by linear transformation, which reduces network parameters and complexity, preserves segmentation efficiency, and increases response speed. A Bi-Frequency Transformer (BiFTrans block) module processes high and low frequencies separately, capturing high-frequency detail while taking low-frequency global information into account; this ensures the speed and stability of image processing, improves the precision and accuracy of garbage target segmentation, and improves classification efficiency.
Description
Technical Field
The invention relates to the technical field of visual processing, in particular to a method and a system for garbage segmentation based on a GinTrans network.
Background
With rapid economic development and rising living standards, the types of garbage generated have become increasingly complex, and uniform landfilling no longer meets the goals of environmental protection and green development. Classified treatment of garbage enables the recycling of resources, reduces harm to soil, and prevents air pollution. Garbage classification is an important step toward an environment-friendly society and the recycling of resources. When garbage classification starts with every household and every person, a large amount of manpower and material resources can be saved in subsequent work: recyclable garbage is recovered, non-recyclable items are disposed of correctly, and garbage treatment proceeds efficiently. At present, in the 46 key cities in China piloting garbage classification, the coverage rate of household garbage classification in residential communities has reached 86.6%, the average recycling rate of household garbage is 30.4%, and kitchen garbage treatment capacity increased from 34,700 tons per day in 2019 to 62,800 tons per day in 2020.
However, although domestic garbage classification and treatment have performed well in recent years, garbage classification in some cities and communities remains difficult. Most garbage classification centers still rely on manual sorting, which is inefficient and exposes workers to injury from sharp garbage. Using artificial intelligence, computer vision and image processing to segment garbage can greatly improve sorting efficiency and accurately screen recyclable garbage out of mixed garbage, realizing the efficient recycling of resources. However, existing vision-based garbage classification systems still have the following problems: 1. existing garbage segmentation methods based on traditional image processing have low segmentation precision; 2. existing segmentation methods based on conventional deep learning networks have many model parameters and high computational complexity, which affects subsequent image processing speed and efficiency; 3. garbage targets are of many types, often severely deformed and contaminated, and the boundaries between targets are unclear; existing methods cannot capture detail and contour information well, so segmentation is difficult, the mis-segmentation rate is high, and subsequent classification precision suffers.
Disclosure of Invention
To address the defects of the prior art, the invention provides a GinTrans network-based garbage segmentation method.
The invention also discloses an image segmentation system.
The technical scheme adopted by the invention for realizing the purpose is as follows:
A garbage segmentation method based on a GinTrans network comprises the following steps:
(1) Image segmentation system setup: the image segmentation system comprises an image acquisition module, an extraction module, a cutting-marking-recombination module, a reshaping fusion output module, a skip-connection aggregation module and a segmentation output module; the extraction module is provided with a GhostNet network, the reshaping fusion output module is provided with a Bi-Frequency Transformer (BiFTrans block) module, the skip-connection aggregation module is provided with an up-sampling recovery module and a GhostNet down-sampling feature extraction module, and the segmentation output module is provided with a segmentation head module;
(2) Image acquisition and input: the image acquisition module is connected with an image acquisition camera; the camera captures images of the garbage on the intelligent garbage recycling and sorting line and transmits them to the image acquisition module, and the acquired images are passed as image input data to the extraction module for processing;
(3) Feature map extraction: the extraction module processes the image input data, using the GhostNet network to extract the bottom-layer features of the image input data and produce an output feature map for the next step;
(4) Cutting, marking and recombination: the cutting-marking-recombination module cuts the extracted output feature map into a grid; after gridding, marker features are generated by linear mapping, the marker features form a marker feature sequence, and the marker feature sequence is taken as the input of the Bi-Frequency Transformer (BiFTrans block) module;
(5) Reshaping and fusion: the reshaping fusion output module reshapes the input marker feature sequence and uses the dual-frequency mixer inside the Bi-Frequency Transformer (BiFTrans block) module to fuse the features from the different high and low frequencies, yielding a fused feature output;
(6) Skip aggregation: the fused feature output is restored by the up-sampling recovery module to an image of the same size as the image input data; the GhostNet down-sampling feature extraction module extracts the down-sampled features of the corresponding same-resolution feature map; the image obtained by the up-sampling recovery module and the same-resolution feature map obtained by the GhostNet down-sampling feature extraction module are joined by a skip link so as to aggregate feature images at different resolution levels;
(7) Segmentation output: the segmentation head module of the segmentation output module generates and outputs the segmented feature image to obtain the target image.
In a further refinement, the GhostNet network is provided with a Ghost module, designed as follows:
(3.1) the garbage image input data to be identified from step (2) is convolved by the GhostNet network to generate original feature maps f_i' at different resolutions;
(3.2) the extraction module contains a linear operation Φ, used to generate Ghost feature maps as output feature maps according to f_ij = Φ_i,j(f_i'), where f_i' is the i-th original feature map obtained by convolving the garbage image to be identified, Φ_i,j is the j-th linear operation applied to the i-th original feature map, and f_ij is the resulting j-th Ghost feature map.
In a further refinement, step (4) comprises the following steps:
(4.1) each Ghost feature map output in step (3) is cut into a grid; each feature map has size H × W, where H and W are the length and width of the feature map; each grid sub-feature map has size s × s, where s is the side length of a grid cell and divides both H and W exactly;
(4.2) the gridded sub-feature maps are linearly mapped, each grid sub-feature map being mapped to one marker feature; all grid sub-feature maps together form the marker feature sequence X̂ = {X_1, X_2, ..., X_N}, where X_i denotes the i-th marker feature and i = 1, 2, ..., N is the serial number of the marker feature.
In a further refinement, step (5) comprises the following steps:
(5.1) first, layer normalization is applied to the marker feature sequence, X = Norm(X̂), where X is the marker feature sequence after layer normalization and X̂ is the marker feature sequence obtained in step (4);
(5.2) each marker feature in the normalized sequence is frequency-decomposed into a high-frequency marker feature and a low-frequency marker feature;
(5.3) for the high-frequency features, a max pooling operation is performed first, followed by a linear layer and then a depthwise separable convolutional layer; the pooling layer, linear layer and depthwise separable convolutional layer form a high-frequency feature processor whose output is Y_high = DConv(FC(MaxPool(X_high))), where Y_high is the feature output after processing by the high-frequency feature processor, DConv is the depthwise separable convolution, FC is the linear fully connected layer, MaxPool is the max pooling operation, and X_high is the high-frequency marker feature obtained from the decomposition in step (5.2);
(5.4) for the low-frequency marker features, an average pooling operation is performed first, followed by a self-attention mechanism layer MSA and then up-sampling to compensate for the dimension reduction caused by average pooling; the average pooling layer, self-attention layer and up-sampling layer form a low-frequency feature processor whose output is Y_low = Upsample(MSA(AvePool(X_low))), where Y_low is the feature output after processing by the low-frequency feature processor, Upsample is the up-sampling operation, MSA is the self-attention mechanism, AvePool is the average pooling operation, and X_low is the low-frequency marker feature obtained from the decomposition in step (5.2);
(5.5) high- and low-frequency feature fusion: the high-frequency and low-frequency feature outputs are fused to obtain Y_o = Concat(Y_high, Y_low), where Y_o is the fused feature output and Concat is the feature concatenation function;
(5.6) layer normalization is applied again to the fused feature output;
(5.7) after the normalization, the final output of the BiFTrans block module is produced through a feed-forward network FFN as X_out = FFN(Norm(X̂ + Y_o)), where X̂ is the marker feature sequence of step (4), Y_o is the fused high-/low-frequency feature output, FFN is the feed-forward network, and Norm is layer normalization.
The beneficial effects of the invention are as follows: through the linear transformation of GhostNet, the feature maps required for feature mining are generated by simple linear operations, which reduces network parameters and complexity, preserves the efficiency of subsequent segmentation processing, and improves response speed. The dual-frequency mixer inside the Bi-Frequency Transformer (BiFTrans block) module processes high and low frequencies separately, capturing high-frequency detail while taking low-frequency global information into account, which ensures the speed and stability of image processing. Fusing the high- and low-frequency features into the GhostNet-based segmentation network lets the network effectively learn the combined high- and low-frequency information in the garbage image data, improving the accuracy and precision of segmenting garbage targets that are severely deformed, contaminated or have blurred boundaries, and improving subsequent classification efficiency.
The invention is further described with reference to the following detailed description and accompanying drawings.
Drawings
Fig. 1 is a schematic flow chart of a garbage segmentation method based on a GinTrans network according to this embodiment;
FIG. 2 is a schematic diagram of the GinTrans network structure of the present embodiment;
fig. 3 is a schematic structural diagram of the Ghost module in this embodiment;
fig. 4 is a schematic structural diagram of the BiFTrans block module in this embodiment;
fig. 5 is a schematic structural diagram of the dual-frequency mixer of the present embodiment.
Detailed Description
The following description is only a preferred embodiment of the present invention, and does not limit the scope of the present invention.
In an embodiment, referring to figs. 1 to 5, a method for garbage segmentation based on a GinTrans network includes the following steps:
(1) Image segmentation system setup: the image segmentation system comprises an image acquisition module, an extraction module, a cutting-marking-recombination module, a reshaping fusion output module, a skip-connection aggregation module and a segmentation output module; the extraction module is provided with a GhostNet network, the reshaping fusion output module is provided with a Bi-Frequency Transformer (BiFTrans block) module, the skip-connection aggregation module is provided with an up-sampling recovery module and a GhostNet down-sampling feature extraction module, and the segmentation output module is provided with a segmentation head module;
(2) Image acquisition and input: the image acquisition module is connected with an image acquisition camera; the camera captures images of the garbage on the intelligent garbage recycling and sorting line and transmits them to the image acquisition module, and the acquired images are passed as image input data to the extraction module for processing;
(3) Feature map extraction: the extraction module processes the image input data, using the GhostNet network to extract the bottom-layer features of the image input data and produce an output feature map for the next step;
(4) Cutting, marking and recombination: the cutting-marking-recombination module cuts the extracted output feature map into a grid; after gridding, marker features are generated by linear mapping, the marker features form a marker feature sequence, and the marker feature sequence is taken as the input of the Bi-Frequency Transformer (BiFTrans block) module;
(5) Reshaping and fusion: the reshaping fusion output module reshapes the input marker feature sequence and uses the dual-frequency mixer inside the Bi-Frequency Transformer (BiFTrans block) module to fuse the features from the different high and low frequencies, yielding a fused feature output;
(6) Skip aggregation: the fused feature output is restored by the up-sampling recovery module to an image of the same size as the image input data; the GhostNet down-sampling feature extraction module extracts the down-sampled features of the corresponding same-resolution feature map; the image obtained by the up-sampling recovery module and the same-resolution feature map obtained by the GhostNet down-sampling feature extraction module are joined by a skip link so as to aggregate feature images at different resolution levels;
(7) Segmentation output: the segmentation head module of the segmentation output module generates and outputs the segmented feature image to obtain the target image.
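The skip aggregation of step (6) can be sketched as follows. This is a minimal NumPy sketch under assumptions the patent leaves open: nearest-neighbour up-sampling and channel-wise stacking as the aggregation operation; all function names are illustrative.

```python
import numpy as np

def upsample2x(fmap):
    # Nearest-neighbour up-sampling: each pixel becomes a 2x2 block.
    return np.repeat(np.repeat(fmap, 2, axis=0), 2, axis=1)

def skip_aggregate(decoder_map, encoder_map):
    # Step (6): restore the decoder output to the encoder resolution and
    # aggregate it with the same-layer GhostNet feature map via a skip
    # link (stacking here; the patent does not fix the merge operation).
    up = upsample2x(decoder_map)
    assert up.shape == encoder_map.shape
    return np.stack([up, encoder_map])

dec = np.arange(4.0).reshape(2, 2)   # coarse decoder features
enc = np.ones((4, 4))                # same-layer encoder features
agg = skip_aggregate(dec, enc)
print(agg.shape)  # (2, 4, 4)
```

In a real network the stacked maps would then be merged by a convolution; the sketch only shows how the two resolution paths are aligned and joined.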
In a further refinement, the GhostNet network is provided with a Ghost module, designed as follows:
(3.1) the garbage image input data to be identified from step (2) is convolved by the GhostNet network to generate original feature maps f_i' at different resolutions;
(3.2) the extraction module contains a linear operation Φ, used to generate Ghost feature maps as output feature maps according to f_ij = Φ_i,j(f_i'), where f_i' is the i-th original feature map obtained by convolving the garbage image to be identified, Φ_i,j is the j-th linear operation applied to the i-th original feature map, and f_ij is the resulting j-th Ghost feature map.
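Steps (3.1)-(3.2) can be sketched as follows: a minimal NumPy illustration in which the cheap linear operation Φ is taken to be a per-map 3×3 convolution (a common choice for Ghost modules, but an assumption here), and all function names are illustrative.

```python
import numpy as np

def cheap_linear_op(feature_map, kernel):
    # One linear operation Phi_{i,j}: a per-map 3x3 convolution
    # (stride 1, zero padding) on an intrinsic feature map f_i'.
    h, w = feature_map.shape
    padded = np.pad(feature_map, 1)
    out = np.zeros_like(feature_map)
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(padded[y:y + 3, x:x + 3] * kernel)
    return out

def ghost_module(intrinsic_maps, kernels):
    # Each intrinsic map f_i' yields Ghost maps f_ij = Phi_{i,j}(f_i');
    # the output keeps the intrinsic maps and appends the Ghost maps.
    ghost_maps = [cheap_linear_op(f, k) for f in intrinsic_maps for k in kernels]
    return intrinsic_maps + ghost_maps

rng = np.random.default_rng(0)
intrinsic = [rng.standard_normal((8, 8)) for _ in range(2)]
kernels = [np.eye(3) / 3.0]          # one cheap op per intrinsic map
out = ghost_module(intrinsic, kernels)
print(len(out))  # 4: two intrinsic maps plus two Ghost maps
```

The point of the design is that most output channels come from the cheap Φ operations rather than full convolutions, which is where the parameter savings claimed for GhostNet come from.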
In a further refinement, step (4) comprises the following steps:
(4.1) each Ghost feature map output in step (3) is cut into a grid; each feature map has size H × W, where H and W are the length and width of the feature map; each grid sub-feature map has size s × s, where s is the side length of a grid cell and divides both H and W exactly;
(4.2) the gridded sub-feature maps are linearly mapped, each grid sub-feature map being mapped to one marker feature; all grid sub-feature maps together form the marker feature sequence X̂ = {X_1, X_2, ..., X_N}, where X_i denotes the i-th marker feature and i = 1, 2, ..., N is the serial number of the marker feature.
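The gridding and linear mapping of steps (4.1)-(4.2) can be sketched as follows; a minimal NumPy sketch in which the shared projection matrix W_embed and the function name tokenize are illustrative assumptions.

```python
import numpy as np

def tokenize(feature_map, s, W_embed):
    # Steps (4.1)-(4.2): cut an H x W map into s x s grid cells (s must
    # divide H and W) and map each flattened cell to a marker feature X_i
    # with a shared linear projection W_embed.
    H, W = feature_map.shape
    assert H % s == 0 and W % s == 0
    tokens = []
    for r in range(0, H, s):
        for c in range(0, W, s):
            cell = feature_map[r:r + s, c:c + s].reshape(-1)
            tokens.append(cell @ W_embed)
    return np.stack(tokens)  # (N, d) with N = (H // s) * (W // s)

rng = np.random.default_rng(1)
fmap = rng.standard_normal((8, 8))
W_embed = rng.standard_normal((16, 4))  # s*s = 16 inputs -> d = 4
X_hat = tokenize(fmap, s=4, W_embed=W_embed)
print(X_hat.shape)  # (4, 4): N = 4 marker features of dimension 4
```

The resulting sequence X̂ is exactly what the BiFTrans block of step (5) consumes.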
In a further refinement, step (5) comprises the following steps:
(5.1) first, layer normalization is applied to the marker feature sequence, X = Norm(X̂), where X is the marker feature sequence after layer normalization and X̂ is the marker feature sequence obtained in step (4);
(5.2) each marker feature in the normalized sequence is frequency-decomposed into a high-frequency marker feature and a low-frequency marker feature;
(5.3) for the high-frequency features, a max pooling operation is performed first, followed by a linear layer and then a depthwise separable convolutional layer; the pooling layer, linear layer and depthwise separable convolutional layer form a high-frequency feature processor whose output is Y_high = DConv(FC(MaxPool(X_high))), where Y_high is the feature output after processing by the high-frequency feature processor, DConv is the depthwise separable convolution, FC is the linear fully connected layer, MaxPool is the max pooling operation, and X_high is the high-frequency marker feature obtained from the decomposition in step (5.2);
(5.4) for the low-frequency marker features, an average pooling operation is performed first, followed by a self-attention mechanism layer MSA and then up-sampling to compensate for the dimension reduction caused by average pooling; the average pooling layer, self-attention layer and up-sampling layer form a low-frequency feature processor whose output is Y_low = Upsample(MSA(AvePool(X_low))), where Y_low is the feature output after processing by the low-frequency feature processor, Upsample is the up-sampling operation, MSA is the self-attention mechanism, AvePool is the average pooling operation, and X_low is the low-frequency marker feature obtained from the decomposition in step (5.2);
(5.5) high- and low-frequency feature fusion: the high-frequency and low-frequency feature outputs are fused to obtain Y_o = Concat(Y_high, Y_low), where Y_o is the fused feature output and Concat is the feature concatenation function;
(5.6) layer normalization is applied again to the fused feature output;
(5.7) after the normalization, the final output of the BiFTrans block module is produced through a feed-forward network FFN as X_out = FFN(Norm(X̂ + Y_o)), where X̂ is the marker feature sequence of step (4), Y_o is the fused high-/low-frequency feature output, FFN is the feed-forward network, and Norm is layer normalization.
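Steps (5.1)-(5.7) can be sketched as follows. This minimal NumPy sketch makes several assumptions the patent does not fix: the frequency decomposition is taken as a channel split, pooling acts on pairs of marker features, the FC and FFN weights are identities, and the depthwise separable convolution is stood in for by a per-channel difference filter. It shows only the data flow MaxPool → FC → DConv, AvePool → MSA → Upsample, Concat, and the final normalization.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Norm: per-feature layer normalization.
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def msa(x):
    # Single-head self-attention with identity Q/K/V projections.
    d = x.shape[-1]
    s = x @ x.T / np.sqrt(d)
    s = np.exp(s - s.max(-1, keepdims=True))
    return (s / s.sum(-1, keepdims=True)) @ x

def bif_trans_block(X):
    Xn = layer_norm(X)                               # (5.1) X = Norm(X_hat)
    X_high, X_low = np.split(Xn, 2, axis=-1)         # (5.2) assumed channel split

    # (5.3) high path: MaxPool -> FC -> DConv; pooling pairs of marker
    # features, identity FC, and a difference filter standing in for DConv.
    pooled = np.maximum(X_high[0::2], X_high[1::2])
    Y_high = np.repeat(np.diff(pooled, axis=0, prepend=pooled[:1]), 2, axis=0)

    # (5.4) low path: AvePool -> MSA -> Upsample back to N marker features.
    Y_low = np.repeat(msa((X_low[0::2] + X_low[1::2]) / 2.0), 2, axis=0)

    Y_o = np.concatenate([Y_high, Y_low], axis=-1)   # (5.5) Concat fusion
    return layer_norm(X + Y_o)                       # (5.6)-(5.7), FFN as identity

rng = np.random.default_rng(2)
X = rng.standard_normal((8, 6))                      # N = 8 marker features, d = 6
out = bif_trans_block(X)
print(out.shape)  # (8, 6)
```

The sketch keeps the two frequency paths cheap in different ways — the high path uses local convolution-like operations, the low path uses global attention on a pooled sequence — which is the stated rationale of the dual-frequency mixer.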
According to the invention, through the linear transformation of GhostNet, the feature maps required for feature mining are generated by simple linear operations, which reduces network parameters and complexity, preserves the efficiency of subsequent segmentation processing, and improves response speed. The dual-frequency mixer inside the Bi-Frequency Transformer (BiFTrans block) module processes high and low frequencies separately, capturing high-frequency detail while taking low-frequency global information into account, which ensures the speed and stability of image processing. Fusing the high- and low-frequency features into the GhostNet-based segmentation network lets the network effectively learn the combined high- and low-frequency information in the garbage image data, improving the accuracy and precision of segmenting garbage targets that are severely deformed, contaminated or have blurred boundaries, and improving subsequent classification efficiency.
The present invention is not limited to the above embodiments; other GinTrans network-based garbage segmentation methods and systems obtained using the same or similar structures, devices, processes or methods as the above embodiments also fall within the protection scope of the present invention.
Claims (4)
1. A garbage segmentation method based on a GinTrans network is characterized by comprising the following steps:
(1) The image segmentation system is set up as follows: the image segmentation system comprises an image acquisition module, an extraction module, a cutting identifier recombination module, a remodeling fusion output module, a jump connection aggregation module and a segmentation output module, wherein the extraction module is provided with a GhostNet network, the remodeling fusion output module is provided with a Bi-Frequency Transformer (BiFTrans block) module, the jump connection aggregation module is provided with an upper sampling recovery module and a GhostNet down-sampling feature extraction module, and the segmentation output module is provided with a segmentation head module;
(2) Image acquisition and input: the image acquisition module is connected with an image acquisition camera, the image acquisition camera acquires images of the garbage on the intelligent garbage recycling and sorting line and transmits the images to the image acquisition module, and the images acquired by the image acquisition module are used as image input data and are transmitted to the extraction module for processing;
(3) Extracting a characteristic diagram: the extraction module processes the image input data, and the extraction module adopts a GhostNet network to extract bottom layer features of the image input data, extracts an output feature map and carries out the next step of processing;
(4) Cutting identification and recombination: the cutting identification recombination module is used for cutting the extracted output characteristic diagram, the output characteristic diagram is gridded through the cutting operation, marking characteristics are generated through linear mapping after the gridding is finished, the marking characteristics form a marking characteristic sequence, and the marking characteristic sequence is used as the input of a Bi-Frequency Transformer (BiFTrans block) module;
(5) Remodeling and fusion: the remodeling fusion output module remodels the input marked feature sequence, and utilizes a dual-Frequency mixer arranged in a Bi-Frequency transform (BiFTrans block) module to perform fusion action on the features from different high and low frequencies, and the fusion action obtains fusion feature output;
(6) Jump aggregation: the fused feature output is restored by the up-sampling recovery module into an input image of the same size as the image input data, the GhostNet down-sampling feature extraction module extracts the bottom-layer features of the corresponding same-level resolution feature map, and the input image obtained by the up-sampling recovery module is skip-linked with the same-level resolution feature map obtained by the GhostNet down-sampling feature extraction module, so that feature images presented at different resolution levels are aggregated;
(7) And (3) segmentation output: and the segmentation head module of the segmentation output module generates segmentation output for the characteristic image to obtain a target image.
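The skip aggregation of step (6) can be sketched in NumPy. Nearest-neighbour up-sampling and channel concatenation are simplifying stand-ins here; the actual up-sampling recovery module and the channel counts are illustrative assumptions, not details fixed by the claim:

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour up-sampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def skip_aggregate(decoder_feat, encoder_feat):
    """Up-sample the decoder feature to the encoder's resolution and
    concatenate along the channel axis (the skip link of step (6))."""
    factor = encoder_feat.shape[1] // decoder_feat.shape[1]
    up = upsample_nearest(decoder_feat, factor)
    return np.concatenate([up, encoder_feat], axis=0)

decoder = np.random.rand(8, 16, 16)   # fused feature from the BiFTrans stage
encoder = np.random.rand(8, 32, 32)   # same-level GhostNet down-sampling feature
agg = skip_aggregate(decoder, encoder)
print(agg.shape)                      # (16, 32, 32)
```

The aggregated map keeps both the recovered decoder detail and the encoder's same-resolution features, which is what lets the segmentation head in step (7) see multiple resolution levels at once.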
2. The GinTrans network-based garbage segmentation method according to claim 1, wherein the GhostNet network is provided with a Ghost module, and the Ghost module is designed as follows:
(3.1) carrying out convolution on the garbage image input data to be identified in the step (2) through the GhostNet network to generate original feature maps f'_i with different resolutions;
(3.2) the extraction module contains a linear operation Φ, and the linear operation Φ is used to generate Ghost feature maps as the output feature maps according to the formula f_ij = Φ_{i,j}(f'_i), where f'_i is the i-th original feature map obtained by the convolution operation on the garbage image to be identified, Φ_{i,j} is the j-th linear operation applied to the i-th original feature map, and f_ij is the j-th Ghost feature map obtained from that operation.
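The Ghost feature generation f_ij = Φ_{i,j}(f'_i) can be illustrated with a toy NumPy sketch. The 3 × 3 mean filter used as the cheap linear operation Φ is an illustrative assumption, not the patent's specific choice of kernel:

```python
import numpy as np

def cheap_linear_op(f_prime, kernel):
    """One linear operation Φ: a small depthwise filter applied to a
    single original feature map f'_i (zero padding, stride 1)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(f_prime, ((ph, ph), (pw, pw)))
    out = np.zeros_like(f_prime)
    H, W = f_prime.shape
    for y in range(H):
        for x in range(W):
            out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * kernel)
    return out

def ghost_module(primary_maps, kernels):
    """f_ij = Φ_{i,j}(f'_i): each original map spawns one ghost map per
    cheap kernel; primary and ghost maps together form the output."""
    ghosts = [cheap_linear_op(f, k) for f in primary_maps for k in kernels]
    return primary_maps + ghosts

primary = [np.random.rand(8, 8) for _ in range(2)]   # f'_1, f'_2 from convolution
kernels = [np.ones((3, 3)) / 9.0]                    # one cheap 3x3 mean filter Φ
maps = ghost_module(primary, kernels)
print(len(maps))   # 4 feature maps: 2 primary + 2 ghost
```

The point of the design is that the ghost maps come from cheap linear operations rather than full convolutions, which is why GhostNet keeps the extraction stage lightweight.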
3. The GinTrans network-based garbage segmentation method according to claim 2, wherein the step (4) further comprises the steps of:
(4.1) performing gridding cutting on each Ghost feature map output in the step (3), wherein each feature map has size H × W, H and W being the length and width of the feature map, and each grid sub-feature map has size s × s, s being the side length of a grid cell; s divides both H and W exactly, so that the feature map is cut into (H × W)/(s × s) grid sub-feature maps;
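The gridding cut of step (4.1) amounts to a reshape-and-transpose; a minimal NumPy sketch, with illustrative sizes H = W = 8 and s = 4:

```python
import numpy as np

def grid_cut(feature_map, s):
    """Cut an H x W feature map into (H/s)*(W/s) non-overlapping s x s
    grid sub-feature maps, as in step (4.1); s must divide H and W."""
    H, W = feature_map.shape
    assert H % s == 0 and W % s == 0, "s must divide both H and W"
    patches = feature_map.reshape(H // s, s, W // s, s).transpose(0, 2, 1, 3)
    return patches.reshape(-1, s, s)

fmap = np.arange(64.0).reshape(8, 8)   # toy 8 x 8 Ghost feature map
patches = grid_cut(fmap, 4)
print(patches.shape)                   # (4, 4, 4): four 4 x 4 grid cells
```

Each s × s cell would then be flattened and passed through the linear mapping of step (4) in claim 1 to become one marking feature in the sequence fed to the BiFTrans block.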
4. The GinTrans network-based garbage segmentation method according to claim 3, wherein the step (5) further comprises up-sampling aggregation and skip linking, comprising the steps of:
(5.1) the marked feature sequence is first subjected to layer normalization processing: X = Norm(X̂), where X is the marked feature sequence after the layer normalization processing and X̂ is the marked feature sequence obtained in the step (4);
(5.2) firstly carrying out frequency decomposition on each marking feature in the marking feature sequence in the previous step to decompose the marking features into high-frequency marking features and low-frequency marking features;
(5.3) for the high-frequency marking features, a maximum pooling operation is performed first, followed by a linear layer and then a depth-separable convolutional layer; the pooling layer, the linear layer and the depth-separable convolutional layer form a high-frequency feature processor whose output is Y_high = DConv(FC(MaxPool(X_high))), where Y_high is the feature output after processing by the high-frequency feature processor, DConv is the depth-separable convolution processing, FC is the linear fully-connected processing, MaxPool is the maximum pooling processing, and X_high is the high-frequency marking feature obtained from the marking feature decomposition in the step (5.2);
(5.4) for the low-frequency marking features, an average pooling operation is performed first, followed by a self-attention mechanism layer MSA and then an up-sampling operation that compensates for the dimension reduction caused by the average pooling; the average pooling layer, the self-attention layer and the up-sampling layer form a low-frequency feature processor whose output is Y_low = Upsample(MSA(AvePool(X_low))), where Y_low is the feature output after processing by the low-frequency feature processor, Upsample is the up-sampling processing, MSA is the self-attention mechanism processing, AvePool is the average pooling processing, and X_low is the low-frequency marking feature obtained from the marking feature decomposition in the step (5.2);
(5.5) high-low frequency feature fusion: the high-frequency feature output and the low-frequency feature output are fused to obtain the fused output Y_O = Concat(Y_high, Y_low), where Y_O is the fused feature output and Concat is the feature concatenation function;
(5.6) carrying out layer normalization operation on the fused feature output again;
(5.7) after the normalization operation, the final output of the BiFTrans block module is produced through a feed-forward network FFN: Y = FFN(Norm(Y_O + X̂)), where X̂ is the marked feature sequence of the step (4), Y_O is the feature output after the high-low frequency fusion, FFN is the feed-forward network processing, and Norm is the layer normalization processing.
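Steps (5.2) through (5.5) can be sketched end-to-end in NumPy under heavy simplifications: the learned FC, DConv and attention projection weights are omitted, single-head unprojected attention stands in for MSA, and the high-frequency path is up-sampled here only so the two outputs can be concatenated at equal spatial size. None of these simplifications is prescribed by the claim; the sketch only shows the data flow of the dual-frequency mixer:

```python
import numpy as np

def max_pool(x, k=2):
    C, H, W = x.shape
    return x.reshape(C, H // k, k, W // k, k).max(axis=(2, 4))

def avg_pool(x, k=2):
    C, H, W = x.shape
    return x.reshape(C, H // k, k, W // k, k).mean(axis=(2, 4))

def upsample(x, k=2):
    return x.repeat(k, axis=1).repeat(k, axis=2)

def self_attention(tokens):
    """Single-head self-attention over (N, C) tokens, standing in for MSA
    (learned projection weights omitted for brevity)."""
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ tokens

def bifreq_mixer(x_high, x_low):
    """Y_O = Concat(Y_high, Y_low) with simplified stand-ins: the high path
    skips the FC/DConv weights, the low path uses unprojected attention."""
    y_high = upsample(max_pool(x_high))              # pool, (FC/DConv omitted), restore size
    c, h, w = x_low.shape
    pooled = avg_pool(x_low)
    tokens = pooled.reshape(c, -1).T                 # (N, C) tokens for attention
    attended = self_attention(tokens).T.reshape(c, h // 2, w // 2)
    y_low = upsample(attended)                       # up-sample to undo the pooling
    return np.concatenate([y_high, y_low], axis=0)   # channel-wise Concat

x_high = np.random.rand(4, 8, 8)
x_low = np.random.rand(4, 8, 8)
y = bifreq_mixer(x_high, x_low)
print(y.shape)   # (8, 8, 8)
```

The design intent readable from the claim is that local detail (high frequency) is handled by cheap pooling and convolution while global context (low frequency) gets the expensive attention, applied at reduced resolution to keep the cost down.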
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210901322.8A CN115147703B (en) | 2022-07-28 | 2022-07-28 | Garbage segmentation method and system based on GinTrans network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115147703A true CN115147703A (en) | 2022-10-04 |
CN115147703B CN115147703B (en) | 2023-11-03 |
Family
ID=83413245
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210901322.8A Active CN115147703B (en) | 2022-07-28 | 2022-07-28 | Garbage segmentation method and system based on GinTrans network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115147703B (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6381363B1 (en) * | 1999-03-15 | 2002-04-30 | Grass Valley (U.S.), Inc. | Histogram-based segmentation of images and video via color moments |
CN112017191A (en) * | 2020-08-12 | 2020-12-01 | 西北大学 | Method for establishing and segmenting liver pathology image segmentation model based on attention mechanism |
CN112541503A (en) * | 2020-12-11 | 2021-03-23 | 南京邮电大学 | Real-time semantic segmentation method based on context attention mechanism and information fusion |
WO2021115483A1 (en) * | 2019-12-13 | 2021-06-17 | Huawei Technologies Co., Ltd. | Image processing method and related apparatus |
CN113051430A (en) * | 2021-03-26 | 2021-06-29 | 北京达佳互联信息技术有限公司 | Model training method, device, electronic equipment, medium and product |
CN113102266A (en) * | 2021-03-16 | 2021-07-13 | 四川九通智路科技有限公司 | Multi-dimensional garbage recognition and classification system |
US20210224512A1 (en) * | 2020-01-17 | 2021-07-22 | Wuyi University | Danet-based drone patrol and inspection system for coastline floating garbage |
CN113159051A (en) * | 2021-04-27 | 2021-07-23 | 长春理工大学 | Remote sensing image lightweight semantic segmentation method based on edge decoupling |
CN113591939A (en) * | 2021-07-09 | 2021-11-02 | 上海智臻智能网络科技股份有限公司 | Layer classification method and device |
CN113744292A (en) * | 2021-09-16 | 2021-12-03 | 安徽世绿环保科技有限公司 | Garbage classification station garbage throwing scanning system |
US11222217B1 (en) * | 2020-08-14 | 2022-01-11 | Tsinghua University | Detection method using fusion network based on attention mechanism, and terminal device |
CN114708464A (en) * | 2022-06-01 | 2022-07-05 | 广东艺林绿化工程有限公司 | Municipal sanitation cleaning garbage truck cleaning method based on road garbage classification |
WO2022141723A1 (en) * | 2020-12-29 | 2022-07-07 | 江苏大学 | Image classification and segmentation apparatus and method based on feature guided network, and device and medium |
Non-Patent Citations (2)
Title |
---|
ZHU X ET AL: "Deformable transformers for end-to-end object detection", arXiv *
LIU Xiaoyu et al: "Recognition of construction waste dumps based on DeeplabV3+", Bulletin of Surveying and Mapping, no. 04 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||