CN110852351A - Image-based garbage classification method and device, terminal equipment and storage medium - Google Patents


Publication number
CN110852351A
CN110852351A (application CN201911003601.7A)
Authority
CN
China
Prior art keywords
image
garbage
information
module
classification
Prior art date
Legal status
Pending
Application number
CN201911003601.7A
Other languages
Chinese (zh)
Inventor
唐蔚然
谢洪涛
Current Assignee
Suzhou Magic Island Information Technology Co Ltd
Original Assignee
Suzhou Magic Island Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Magic Island Information Technology Co Ltd filed Critical Suzhou Magic Island Information Technology Co Ltd
Priority to CN201911003601.7A priority Critical patent/CN110852351A/en
Publication of CN110852351A publication Critical patent/CN110852351A/en
Pending legal-status Critical Current

Classifications

    • G06F18/2411 Pattern recognition; classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/25 Pattern recognition; fusion techniques
    • G06N3/045 Neural networks; combinations of networks
    • G06N3/08 Neural networks; learning methods
    • G06Q50/26 Services; government or public services
    • G06V10/30 Image preprocessing; noise filtering
    • G06V10/40 Extraction of image or video features

Abstract

The invention discloses an image-based garbage classification method and apparatus, a terminal device, and a storage medium, wherein the garbage classification method comprises the following steps: performing feature extraction on an input image by using a basic feature network to obtain garbage image features; respectively performing noise suppression on the garbage image features by using at least three attention mechanism modules to obtain at least three corresponding pieces of deep feature information; performing bilinear aggregation on the at least three pieces of deep feature information pairwise to obtain at least three pieces of enhancement information; fusing the at least three pieces of enhancement information to obtain garbage image representation information; and classifying the garbage image representation information by using a classifier, so as to realize classification of the input image. The technical scheme of the embodiment of the invention achieves state-of-the-art results on three widely used benchmark datasets.

Description

Image-based garbage classification method and device, terminal equipment and storage medium
Technical Field
The present invention relates to a garbage classification method, and in particular, to a garbage classification method and apparatus based on an image, a terminal device, and a storage medium.
Background
Garbage classification aims at sorting different garbage according to material and purpose, so as to effectively reduce environmental pollution and resource waste; automatic classification of garbage images currently has broad application prospects.
Because garbage image data exhibit large intra-class differences and inter-class similarities, traditional image classification algorithms cannot solve this problem well. At present, bilinear aggregation algorithms perform high-order mapping on image features to capture the differences between fine-grained images. However, this bilinear aggregation approach ignores a problem: high-order mapping not only introduces more image detail information, but also amplifies noise information, which impairs the feature representation capability. In addition, because noise takes diverse forms in image representations, it is difficult to remove the noise effectively with a single noise suppression mechanism.
Disclosure of Invention
In order to solve the technical problem, embodiments of the present invention provide a method and an apparatus for image-based garbage classification, a terminal device, and a storage medium.
To achieve this purpose, the technical scheme of the invention is realized as follows:
an embodiment of a first aspect of the present invention provides an image-based garbage classification method, where the garbage classification method includes:
performing feature extraction on an input image by using a basic feature network to obtain garbage image features;
respectively performing noise suppression on the garbage image features by using at least three attention mechanism modules to obtain at least three corresponding pieces of deep feature information;
performing bilinear aggregation on the at least three pieces of deep feature information pairwise to obtain at least three pieces of enhancement information;
fusing the at least three pieces of enhancement information to obtain garbage image representation information;
and classifying the garbage image representation information by using a classifier, so as to realize classification of the input image.
Further, the attention mechanism module comprises: the system comprises a spatial attention module based on the feature information of different areas of the garbage image, a channel attention module based on the feature channel information of the garbage image, and an area relation attention module based on the relation between different areas of the garbage image.
Further, the expression of the spatial attention module is:
f_s(X_i) = diag(ω_i)X_i
wherein diag(·) is a diagonalization operation that generates a matrix with the elements of the input vector as diagonal elements; the vector ω_i is obtained as follows: the feature X_i is mapped to a single-channel feature through two layers of 1×1 convolution and a ReLU activation function, and feature normalization is then performed with a softmax operation to obtain ω_i.
Further, the expression of the channel attention module is:
f_c(X_i) = X_i diag(c_i) + X_i
wherein diag(·) is a diagonalization operation and c_i is obtained as follows: the feature X_i is spatially averaged with a global average pooling operation to obtain a global vector feature; information is extracted through two fully connected layers, with dimension changes c → c/16 and c/16 → c; and c_i is obtained by normalization with a Sigmoid activation function.
Further, the expression of the region relationship attention module is:
f_r(X_i) = softmax(θ(X_i)φ(X_i)^T)X_i
wherein θ(X_i) and φ(X_i) are both obtained from X_i by convolution and pooling operations.
Further, classifying the garbage image representation information by using a classifier comprises: using a cross entropy loss function as the optimization objective.
Further, classifying the garbage image representation information by using a classifier comprises:
performing data augmentation on the training data set;
and shuffling the augmented training data, performing batch training in a preset quantity, and, at the same time, randomly cropping a region of a preset size from the original images of the training data set as input to the classifier.
An embodiment of a second aspect of the present invention provides an image-based garbage classification apparatus, including:
a feature extraction module, configured to perform feature extraction on an input image by using a basic feature network to obtain garbage image features;
a noise suppression module, configured to perform noise suppression on the garbage image features by using at least three attention mechanism modules to obtain at least three corresponding pieces of deep feature information;
an enhancement module, configured to perform bilinear aggregation on the at least three pieces of deep feature information pairwise to obtain at least three pieces of enhancement information;
a fusion module, configured to fuse the at least three pieces of enhancement information to obtain garbage image representation information;
and a classification module, configured to classify the garbage image representation information by using a classifier, so as to realize classification of the input image.
In a third aspect, the present invention provides a terminal device, where the terminal device includes a processor and a memory for storing processor-executable instructions, and the processor executes the steps of any one of the above garbage classification methods.
A fourth aspect of the present invention provides a computer-readable storage medium, storing computer instructions, which when executed, implement the steps of any one of the above-mentioned garbage classification methods.
The embodiments of the present invention provide an image-based garbage classification method and apparatus, a terminal device, and a storage medium. On the one hand, multiple complementary denoised image features can be extracted from garbage images to capture distinctive image features; on the other hand, a hierarchical fusion method based on high-order mapping is provided, in which multiple noise-suppressed image features are high-order mapped and effectively fused to obtain a more robust global image representation for classification. The technical scheme of the embodiments of the invention achieves state-of-the-art results on three widely used benchmark datasets.
Drawings
FIG. 1 is an alternative flow chart of a garbage classification method according to an embodiment of the present invention;
FIG. 2 is another alternative flow chart of the garbage classification method provided by the embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a spatial attention module according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a channel attention module according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a region relationship attention module according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention, belong to the scope of protection of the invention.
As shown in fig. 1 and fig. 2, an embodiment of the first aspect of the present invention provides an image-based garbage classification method, where the method includes the following steps:
and S10, extracting the features of the input image by using the basic feature network to obtain a garbage image.
Specifically, the basic feature network may be any general convolutional neural network; that is, the basic feature network includes a plurality of convolution modules, and each convolution module includes a plurality of convolutional layers and activation functions. The convolution modules are connected by average pooling or max pooling layers, and the number and size of convolution kernels within each convolution module are basically unchanged; the number of convolution kernels increases from module to module as the network deepens. Here, we take the output of the last layer of the basic feature network as the final image representation feature. Taking VGG-16 as an example, the basic feature network comprises five convolution modules, each containing a different number of convolution and activation operations. The numbers of feature channels output by the five modules are, in order: 64, 128, 256, 512, 512. As the modules progress, the resolution of the output features decreases in turn, and the semantic level of the extracted features rises in turn. Finally, the feature output by the last layer of the basic feature network is defined as X_i ∈ R^{N×C}, where N is the number of spatial pixels, C is the number of feature channels, and i is the sample index.
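As a rough illustration, the shape progression above can be checked with a short sketch (illustrative only; the function name and the assumption that every module ends in a 2×2 pooling that halves resolution are ours, not the patent's):

```python
def vgg16_feature_shape(height, width):
    # returns (N, C) of the final feature X_i in R^{N x C}
    channels = [64, 128, 256, 512, 512]  # channels output by the five modules
    h, w = height, width
    for _ in channels:
        h, w = h // 2, w // 2  # assume each module ends in 2x2 pooling
    return h * w, channels[-1]
```

For a 448 × 448 input this gives N = 14 × 14 = 196 spatial pixels and C = 512 channels.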
And S20, respectively performing noise suppression on the garbage image features 10 by using at least three attention mechanism modules to obtain at least three corresponding pieces of deep feature information 20.
Further, the attention mechanism module comprises: the system comprises a spatial attention module based on the feature information of different areas of the garbage image, a channel attention module based on the feature channel information of the garbage image, and an area relation attention module based on the relation between different areas of the garbage image.
In one specific example of the present invention, as shown in fig. 2 to 5, the three attention mechanism modules are used to refine, and eliminate redundancy in, the spatial information of the image, the channel information of the features, and the region relationship information, respectively. The image spatial information includes important spatial information, which indicates which regions' texture information in the garbage image is more important for fine-grained garbage classification, such as the pop can region in fig. 2. The feature channel information includes important channel information, which indicates which channels in the image feature representation are discriminative for garbage classification, such as channels related to texture and shape. The region relationship information includes important region relationship information, which indicates the relationships between which regions of the garbage image are valuable for garbage classification; for example, the pull ring region and the can body region in the figure are related to each other, indicating that the can is probably an empty pop can.
In particular, the spatial attention module is denoted by f_s, the channel attention module by f_c, and the region relationship attention module by f_r.
Further, f_s first maps the feature X_i to a single-channel feature through two layers of 1×1 convolution (Conv) and a ReLU activation function. Then a vector ω_i of dimension N is obtained after feature normalization with a softmax operation. The weights in ω_i correspond to the importance of each pixel of the feature X_i: regions with high weights carry more important information, and regions with low weights carry more noise. Finally, the vector ω_i is used as weights to weight the spatial pixels in X_i. The specific expression of the spatial attention module is:
f_s(X_i) = diag(ω_i)X_i
wherein diag(·) is a diagonalization operation that generates a matrix with the elements of the input vector as diagonal elements. Using softmax has the following advantages: softmax keeps the values of ω_i in (0, 1), so that large weights concentrate on important regions of the image; and softmax suppresses the gradient explosion problem in deep networks.
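The spatial weighting f_s(X_i) = diag(ω_i)X_i can be sketched in plain Python as follows (a minimal sketch; the `logits` argument stands in for the single-channel map produced by the 1×1 convolutions, which are not implemented here):

```python
import math

def softmax(v):
    # numerically stable softmax over a list of scores
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def spatial_attention(X, logits):
    # f_s(X) = diag(omega) X: weight each of the N spatial rows of X by a
    # softmax-normalized scalar omega_n derived from `logits`
    omega = softmax(logits)
    return [[w * x for x in row] for w, row in zip(omega, X)]
```

With uniform logits every spatial position receives the same weight 1/N, so the feature is simply scaled.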
f_c first spatially averages X_i with a global average pooling operation to obtain a global vector feature. Information is then extracted through two fully connected layers, with dimension changes c → c/16 and c/16 → c. Finally, c_i is obtained by normalization with a Sigmoid activation function. The expression of the channel attention module is:
f_c(X_i) = X_i diag(c_i) + X_i
preferably, a residual attention mechanism is used here to make the training more stable.
Compared with the spatial attention module, the region relationship attention module f_r performs one additional region interaction operation. The relationship between spatial positions is obtained in the form of an outer product, and finally the spatial relationship weights are obtained by normalization with a softmax operation. The expression of the region relationship attention module is:
f_r(X_i) = softmax(θ(X_i)φ(X_i)^T)X_i
where θ(X_i) and φ(X_i) are both obtained from X_i by convolution and pooling operations, and the softmax here operates along the matrix row vectors.
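The region interaction can be sketched as follows (the projections Q and K stand in for the convolved-and-pooled θ(X_i) and φ(X_i); those names, and passing the projections in directly, are our assumptions):

```python
import math

def softmax(v):
    # numerically stable softmax over a list of scores
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def region_relation_attention(Q, K, X):
    # relation matrix R[n][m] = <Q_n, K_m> via the outer-product interaction
    R = [[sum(q * k for q, k in zip(qrow, krow)) for krow in K] for qrow in Q]
    A = [softmax(row) for row in R]  # normalize along each matrix row
    # re-weight X: output row n is the A[n]-weighted sum of the rows of X
    n_regions, n_channels = len(X), len(X[0])
    return [[sum(A[n][m] * X[m][c] for m in range(n_regions))
             for c in range(n_channels)] for n in range(len(A))]
```

With all-zero projections every region attends uniformly to all regions, so each output row is the mean of the rows of X.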
And S30, performing bilinear aggregation on the at least three pieces of deep feature information pairwise to obtain at least three pieces of enhancement information.
Bilinear aggregation is a widely used feature aggregation method. Through an outer product operation, it performs channel-to-channel interaction between two feature representations of an image, and finally performs spatial average pooling to obtain the final representation. Because of the outer product operation, the method can map image features to a high-order semantic space to obtain a richer feature representation, and therefore it also has stronger discriminative power. Further, bilinear aggregation is performed pairwise between the three denoised feature representations obtained above, using the stepwise method in FIG. 2.
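The pairwise aggregation B(x_1, x_2) = x_1^T x_2 with spatial average pooling can be sketched as (illustrative, list-based representation; the function name is an assumption):

```python
def bilinear_aggregation(x1, x2):
    # B(x1, x2) = x1^T x2 averaged over the N spatial positions:
    # channel-channel interactions giving a C x C second-order representation
    n, c = len(x1), len(x1[0])
    return [[sum(x1[k][i] * x2[k][j] for k in range(n)) / n
             for j in range(c)] for i in range(c)]
```

Note that the output is C × C regardless of N, which is why the result captures channel interactions rather than spatial layout.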
and S40, fusing at least three types of enhancement information to obtain junk image expression information.
Specifically, the final representation is obtained in a cascade manner. Compared with direct concatenation, this method can additionally explore the hidden association information among the three features, and therefore has stronger discriminative power:
Y_i = [B(f_s(X_i), f_c(X_i)); B(f_s(X_i), f_r(X_i)); B(f_c(X_i), f_r(X_i))]
where B(·, ·) is the bilinear aggregation function
B(x_1, x_2) = x_1^T x_2, with x_1, x_2 ∈ R^{N×C},
and [·; ·] is the concatenation (cascade) operation.
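A sketch of the cascade fusion under the same simplified, list-based representation (hypothetical helper names; not the patent's implementation):

```python
def bilinear_aggregation(x1, x2):
    # B(x1, x2) = x1^T x2 averaged over the N spatial positions
    n, c = len(x1), len(x1[0])
    return [[sum(x1[k][i] * x2[k][j] for k in range(n)) / n
             for j in range(c)] for i in range(c)]

def fuse(fs, fc, fr):
    # cascade fusion: concatenate the flattened pairwise bilinear aggregations
    # of the three denoised features into one global representation
    def flat(m):
        return [v for row in m for v in row]
    return (flat(bilinear_aggregation(fs, fc))
            + flat(bilinear_aggregation(fs, fr))
            + flat(bilinear_aggregation(fc, fr)))
```

For C-channel inputs the fused vector has length 3C², here 12 for C = 2.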
And S50, classifying the garbage image representation information by using a classifier to realize classification of the input image.
Further, classifying the garbage image representation information by using a classifier comprises: using a cross entropy loss function as the optimization objective.
Since the classification of garbage images is essentially a fine-grained classification problem, a cross entropy loss function is adopted as the optimization objective, with the expression:
L = -Σ_i log( exp(a_{y_i}) / Σ_j exp(a_j) )
where y_i denotes the true classification result, i.e. the label, of sample i, and a_j denotes the score assigned by the classifier to class j.
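For a single sample, the loss above can be sketched as follows (illustrative; computed with the log-sum-exp trick for numerical stability):

```python
import math

def cross_entropy(scores, label):
    # loss = -log(exp(a_y) / sum_j exp(a_j)), computed via log-sum-exp
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[label]
```

The loss approaches zero as the score of the true class dominates the others.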
Further, in order to reduce the risk of model overfitting, classifying the garbage image representation information by using a classifier comprises:
performing data augmentation (such as flipping, stretching, and the like) on the training data set;
shuffling the augmented training data and training in batches of a preset size (e.g., batch size 8), while randomly cropping regions of a preset size (e.g., 448 × 448) from the original images of the training data set as input to the classifier.
When training the network, stochastic gradient descent is used as the optimizer, and the learning rate decay strategy is set to exponential decay; the initial learning rate is 0.01. Meanwhile, the layer before the classifier uses Dropout with a ratio of 0.5, and the coefficient of the L2 penalty term is set to 0.0005. The network is initialized with the MSRA method, with the Gaussian parameters set to a normal distribution N(0, 2/n), where n is the number of parameters.
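The exponential learning-rate decay can be sketched as follows (the decay rate and step granularity are assumptions; the patent fixes only the initial rate of 0.01):

```python
def exponential_decay(initial_lr, decay_rate, step, decay_steps):
    # lr = lr0 * rate^(step / decay_steps)
    return initial_lr * decay_rate ** (step / decay_steps)
```

For example, with a hypothetical decay rate of 0.1 per 100 steps, the rate falls from 0.01 to 0.001 after 100 steps.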
To validate the effectiveness of the present invention, we evaluated on three widely used fine-grained object classification benchmark datasets: a fine-grained bird dataset (CUB-200), a fine-grained vehicle dataset (Car-196), and an action recognition dataset (MPII). The specific dataset information is as follows:
1. the CUB-200 dataset consists of 11788 bird pictures of 200 categories. The training/testing of the data set is divided into: 5994 training pictures and 5794 test pictures;
2. the Car-196 data set consists of a total of 196 categories of 16185 Car class pictures. The training/testing of the data set is divided into: 8114 training pictures and 8041 test pictures;
3. the MPII data set consists of 15205 pictures of 393 behavior classes. The training/testing of the data set is divided into: 8218 training pictures and 6987 test pictures.
Experiments show that the garbage classification method of the embodiment of the invention achieves the best experimental results on the three benchmark datasets. The recognition accuracies on the CUB-200 and Car-196 datasets are 86.2% and 91.5%, respectively, and the mAP (mean Average Precision) on the MPII dataset is 32.7%.
An embodiment of a second aspect of the present invention provides an image-based garbage classification apparatus, including:
a feature extraction module, configured to perform feature extraction on an input image by using a basic feature network to obtain garbage image features;
a noise suppression module, configured to perform noise suppression on the garbage image features by using at least three attention mechanism modules to obtain at least three corresponding pieces of deep feature information;
an enhancement module, configured to perform bilinear aggregation on the at least three pieces of deep feature information pairwise to obtain at least three pieces of enhancement information;
a fusion module, configured to fuse the at least three pieces of enhancement information to obtain garbage image representation information;
and a classification module, configured to classify the garbage image representation information by using a classifier, so as to realize classification of the input image.
In a third aspect, the present invention provides a terminal device, where the terminal device includes a processor and a memory for storing processor-executable instructions, and the processor executes the steps of any one of the above garbage classification methods.
The terminal device may further include a network interface, an input device, a hard disk, and a display device.
The various interfaces and devices described above may be interconnected by a bus architecture. A bus architecture may be any architecture that may include any number of interconnected buses and bridges. One or more Central Processing Units (CPUs), represented in particular by a processor, and one or more memories, represented by a memory, are connected together by various circuits. The bus architecture may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like. It will be appreciated that a bus architecture is used to enable communications among the components. The bus architecture includes a power bus, a control bus, and a status signal bus, in addition to a data bus, all of which are well known in the art and therefore will not be described in detail herein.
The network interface can be connected to a network (such as the internet, a local area network, etc.), and can acquire relevant data from the network and store the relevant data in the hard disk.
The input device can receive various instructions input by an operator and send the instructions to the processor for execution. The input device may include a keyboard or a pointing device (e.g., a mouse, trackball, touch pad, touch screen, or the like).
The display device can display the result obtained by the processor executing the instruction.
The memory is used for storing programs and data necessary for operating the operating system, intermediate results in the calculation process of the processor and the like.
It will be appreciated that the memory in embodiments of the invention may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), or a flash memory. Volatile memory can be Random Access Memory (RAM), which acts as external cache memory. The memories of the apparatus and methods described herein are intended to comprise, without being limited to, these and any other suitable types of memory.
In some embodiments, the memory stores elements, executable modules or data structures, or a subset thereof, or an expanded set thereof as follows: an operating system and an application program.
The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, for implementing various basic services and processing hardware-based tasks. The application programs include various application programs, such as a browser, for realizing various application services. The program implementing the method of the embodiment of the present invention may be included in the application programs.
The processor acquires the panoramic image when calling and executing the application program and the data stored in the memory, specifically, the application program and the data can be a program or an instruction stored in the application program; preprocessing the panoramic image to obtain a subimage to be processed; inputting the sub-image to be processed into a multi-path convolution neural network to obtain a deep characteristic map of the sub-image to be processed; performing pooling treatment on the deep layer characteristic diagram; and inputting the deep characteristic map subjected to pooling into a full-connected model, and taking the output of the full-connected model as the position information after relocation.
The method disclosed by the above embodiments of the invention can be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated hardware logic circuits in the processor or by instructions in the form of software. The processor may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, configured to implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method in combination with its hardware.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units designed to perform the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
A fourth aspect of the present invention provides a computer-readable storage medium, storing computer instructions, which when executed, implement the steps of any one of the above-mentioned garbage classification methods.
It is understood that storage media include, but are not limited to: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Other steps of garbage classification according to embodiments of the present invention are understood and readily implemented by those skilled in the art and therefore will not be described in detail.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (10)

1. An image-based garbage classification method, characterized in that the garbage classification method comprises:
performing feature extraction on an input image by using a basic feature network to obtain a garbage image;
respectively performing noise suppression on the garbage image by using at least three attention mechanism modules to obtain at least three corresponding kinds of depth feature information;
performing bilinear aggregation on the at least three kinds of depth feature information pairwise to obtain at least three kinds of enhancement information;
fusing the at least three kinds of enhancement information to obtain garbage image expression information;
and classifying the garbage image expression information by using a classifier, so as to realize classification of the input image.
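Purely as an illustration (not part of the claims), the data flow of claim 1 can be sketched in NumPy. Everything here is an assumption made for the sketch: the backbone is replaced by random features, the three attention branches by a simple reweighting stand-in, and the classifier by a randomly initialised linear map; only the structure (three branches → pairwise bilinear aggregation → fused expression vector → classifier) mirrors the claimed steps.

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_stub(x, seed):
    # Stand-in for one attention module: reweights the (c, n) feature map.
    w = np.abs(np.random.default_rng(seed).standard_normal(x.shape[0]))
    w /= w.sum()
    return (x.T * w).T

def bilinear_aggregate(a, b):
    # Pairwise bilinear aggregation of two (c, n) maps -> a (c*c,) vector.
    return (a @ b.T).ravel() / a.shape[1]

# Hypothetical backbone output: c channels over n spatial positions.
c, n, num_classes = 8, 16, 4
features = rng.standard_normal((c, n))

# Three attention branches (spatial / channel / region-relation stand-ins).
branches = [attention_stub(features, s) for s in (1, 2, 3)]

# Pairwise bilinear aggregation of the three branches -> three enhanced vectors.
enhanced = [bilinear_aggregate(branches[i], branches[j])
            for i, j in [(0, 1), (0, 2), (1, 2)]]

# Fuse by concatenation, then classify with a random linear classifier.
expression = np.concatenate(enhanced)            # garbage-image expression information
W = rng.standard_normal((num_classes, expression.size)) * 0.01
logits = W @ expression
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

The fused expression vector has dimension 3·c², which is why bilinear methods typically follow fusion with a compact classifier rather than further convolutions.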
2. The garbage classification method of claim 1, wherein the attention mechanism modules comprise: a spatial attention module based on feature information of different regions of the garbage image, a channel attention module based on feature channel information of the garbage image, and a region relation attention module based on the relations between different regions of the garbage image.
3. The garbage classification method of claim 2, wherein the expression of the spatial attention module is:
f_s(X_i) = diag(ω_i)X_i
wherein diag(·) is a diagonalization operation that uses the elements of the input vector as diagonal elements to generate a matrix, and the vector ω_i is obtained as follows: the feature X_i is mapped into a feature with a channel number of 1 through two layers of 1×1 convolution and a ReLU activation function; the result is then normalized by a softmax operation to obtain ω_i.
4. The garbage classification method of claim 2, wherein the expression of the channel attention module is:
f_c(X_i) = X_i diag(c_i) + X_i
wherein diag(·) is a diagonalization operation, and c_i is obtained as follows: the feature X_i is spatially averaged by a global average pooling operation to obtain a global vector feature; information is then extracted through two fully connected layers, with dimension changes c → c/16 and c/16 → c respectively; finally, normalization through a Sigmoid activation function yields c_i.
5. The garbage classification method of claim 2, wherein the expression of the region relation attention module is given by a formula that appears in the original filing as an image (FDA0002242061610000011), in which the two operand terms (images FDA0002242061610000012 and FDA0002242061610000013) are each obtained from X_i by convolution and pooling operations.
6. The garbage classification method of claim 1, wherein classifying the garbage image expression information by using a classifier comprises: using a cross-entropy loss function as the optimization objective.
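The cross-entropy objective named in claim 6, written out in a few lines of NumPy; the logits are hypothetical class scores invented for the example:

```python
import numpy as np

def cross_entropy(logits, label):
    # Softmax cross-entropy: negative log-probability of the true class,
    # computed in a numerically stable way (max-shift before exponentiating).
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

logits = np.array([2.0, 0.5, -1.0, 0.1])   # hypothetical scores for 4 classes
loss_correct = cross_entropy(logits, 0)    # true class has the highest score
loss_wrong = cross_entropy(logits, 2)      # true class has the lowest score
```

The loss is small when the classifier assigns high probability to the true class and grows without bound as that probability approaches zero, which is what makes it a suitable optimization target here.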
7. The garbage classification method of claim 1, wherein classifying the garbage image expression information by using a classifier comprises:
performing data augmentation on a training data set;
and shuffling the augmented training data, performing training in batches of a preset size, and meanwhile randomly cropping a region of a preset size from each original image of the training data set and inputting the region into the classifier.
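A toy sketch of the training-data handling in claim 7 (shuffle, fixed-size batches, random crop of a preset size); all array shapes and sizes below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop(img, size):
    # Randomly intercept a size x size region of the original image.
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

# Toy "training set": 10 single-channel 32x32 images with labels.
images = rng.standard_normal((10, 32, 32))
labels = np.arange(10)

# Shuffle, then iterate in batches of a preset size, cropping each image.
order = rng.permutation(len(images))
batch_size, crop = 4, 24
batches = []
for start in range(0, len(order), batch_size):
    idx = order[start:start + batch_size]
    batch = np.stack([random_crop(images[i], crop) for i in idx])
    batches.append((batch, labels[idx]))
```

Random cropping acts as the augmentation step: the classifier sees a different sub-region of each image every epoch, which discourages it from memorising fixed pixel positions.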
8. An image-based garbage classification device, characterized in that the garbage classification device comprises:
the feature extraction module is used for performing feature extraction on an input image by using a basic feature network to obtain a garbage image;
the noise suppression module is used for performing noise suppression on the garbage image by using at least three attention mechanism modules to obtain at least three corresponding kinds of depth feature information;
the enhancement module is used for performing bilinear aggregation on the at least three kinds of depth feature information pairwise to obtain at least three kinds of enhancement information;
the fusion module is used for fusing the at least three kinds of enhancement information to obtain garbage image expression information;
and the classification module is used for classifying the garbage image expression information by using a classifier, so as to realize classification of the input image.
9. A terminal device, characterized in that the terminal device comprises a processor and a memory for storing processor-executable instructions, the processor performing the steps of the garbage classification method of any one of claims 1 to 7.
10. A computer-readable storage medium storing computer instructions which, when executed, implement the steps of the garbage classification method of any one of claims 1 to 7.
CN201911003601.7A 2019-10-22 2019-10-22 Image-based garbage classification method and device, terminal equipment and storage medium Pending CN110852351A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911003601.7A CN110852351A (en) 2019-10-22 2019-10-22 Image-based garbage classification method and device, terminal equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911003601.7A CN110852351A (en) 2019-10-22 2019-10-22 Image-based garbage classification method and device, terminal equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110852351A true CN110852351A (en) 2020-02-28

Family

ID=69596970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911003601.7A Pending CN110852351A (en) 2019-10-22 2019-10-22 Image-based garbage classification method and device, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110852351A (en)


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368942A (en) * 2020-05-27 2020-07-03 深圳创新奇智科技有限公司 Commodity classification identification method and device, electronic equipment and storage medium
CN111368942B (en) * 2020-05-27 2020-08-25 深圳创新奇智科技有限公司 Commodity classification identification method and device, electronic equipment and storage medium
CN116152115A (en) * 2023-04-04 2023-05-23 湖南融城环保科技有限公司 Garbage image denoising processing method based on computer vision
CN117522388A (en) * 2023-11-08 2024-02-06 永昊环境科技(集团)有限公司 Intelligent sanitation processing method for urban environment
CN117522388B (en) * 2023-11-08 2024-04-12 永昊环境科技(集团)有限公司 Intelligent sanitation processing method for urban environment

Similar Documents

Publication Publication Date Title
Gao et al. Multiscale residual network with mixed depthwise convolution for hyperspectral image classification
He et al. Supercnn: A superpixelwise convolutional neural network for salient object detection
CN109685819B (en) Three-dimensional medical image segmentation method based on feature enhancement
Kao et al. Visual aesthetic quality assessment with a regression model
CN109344618B (en) Malicious code classification method based on deep forest
Li et al. HEp-2 specimen image segmentation and classification using very deep fully convolutional network
DE102017100609A1 (en) Online capture and classification of dynamic gestures with recurrent folding neural networks
CN107683469A (en) A kind of product classification method and device based on deep learning
CN109063719B (en) Image classification method combining structure similarity and class information
Oloyede et al. Improving face recognition systems using a new image enhancement technique, hybrid features and the convolutional neural network
Tursun et al. MTRNet++: One-stage mask-based scene text eraser
Shen et al. Deep cross residual network for HEp-2 cell staining pattern classification
Cao et al. Learning crisp boundaries using deep refinement network and adaptive weighting loss
CN110852351A (en) Image-based garbage classification method and device, terminal equipment and storage medium
WO2016170965A1 (en) Object detection method and image search system
Panella et al. Semantic segmentation of cracks: Data challenges and architecture
Zhang et al. Feature pyramid network for diffusion-based image inpainting detection
CN111783514A (en) Face analysis method, face analysis device and computer-readable storage medium
Qi et al. Hep-2 cell classification: The role of gaussian scale space theory as a pre-processing approach
CN114723010B (en) Automatic learning enhancement method and system for asynchronous event data
Song et al. Towards genetic programming for texture classification
WO2020119624A1 (en) Class-sensitive edge detection method based on deep learning
Liu et al. Image classification method on class imbalance datasets using multi-scale CNN and two-stage transfer learning
Bacea et al. Single stage architecture for improved accuracy real-time object detection on mobile devices
Karsh et al. mIV3Net: modified inception V3 network for hand gesture recognition

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200228