CN114202690A - Multi-scale network analysis method based on mixed multilayer perceptron - Google Patents

Publication number
CN114202690A
Authority
CN
China
Prior art keywords
pixel
steps
network analysis
scale network
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111498048.6A
Other languages
Chinese (zh)
Other versions
CN114202690B (en)
Inventor
李林辉
林谋乐
景维鹏
陈广胜
刘鹏
李子游
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeast Forestry University
Original Assignee
Northeast Forestry University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeast Forestry University
Priority to CN202111498048.6A
Publication of CN114202690A
Application granted
Publication of CN114202690B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G06F8/315 Object-oriented languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-scale network analysis method based on a mixed multilayer perceptron, comprising an MSC block and a UMLP block. The invention provides an efficient hyperspectral classification method that outperforms previous methods with a model size of only 0.185M, so it is well suited to industrial requirements. It can be applied in the hyperspectral field to monitor forest transition and to give timely early warning of forest disasters such as fire.

Description

Multi-scale network analysis method based on mixed multilayer perceptron
Technical Field
The invention relates to the field of hyperspectral images, in particular to a multiscale network analysis method based on a mixed multilayer perceptron.
Background
Huertas et al. were the first to apply hyperspectral images to urban analysis, extracting the boundaries of urban buildings using geometric topology theory. Adams et al. analyzed spectral bands using mathematical theory and combined other factors to analyze hyperspectral data. In general, early hyperspectral image studies mainly used theoretical knowledge from disciplines such as geometry to analyze shallow information. The most advanced methods now rely on deep learning, represented by convolutional neural networks (CNNs), which have achieved outstanding results in semantic segmentation, classification, object detection, and other tasks. One new framework, a contextual CNN, explores local contextual interactions by jointly exploiting the spatial-spectral relationships of neighboring pixels.
Undeniably, CNNs have become the de facto standard in deep learning. However, their parameters grow exponentially as convolutional layers are added, and model size grows with expressive capability, typically exceeding 20M before the expressive capability is acceptable. In addition, because of the large number of multiply-add operations, computational cost is a bottleneck for industrial applications and cannot meet the industry's real-time requirements.
Prior art 2
Recently, Vision Transformers (ViT), which are based on self-attention layers, have achieved state-of-the-art performance in computer vision and attracted the attention of many researchers. However, the self-attention mechanism requires computing three matrices, and deeply stacked models consume substantial computing resources. This leads to disadvantages such as high computational cost and large model size. Tolstikhin et al. demonstrated that CNNs are not required for deep learning. Their MLP-Mixer framework employs two types of MLP to mix the feature information and the spatial information of each location separately, and it has become a significant research topic. By sacrificing a small amount of accuracy, it significantly increases model speed and compresses model size. Although simple, it performs well across many fields and disciplines, and much research in remote sensing builds on the MLP-Mixer to obtain more meaningful analyses.
Sildir et al. proposed a superstructure-based mixed-integer nonlinear programming method for the optimal structural design of MLPs, covering neuron number selection, pruning, and input selection; it achieved state-of-the-art performance in experiments on two public hyperspectral image data sets. In addition, the graph convolutional network (GCN) can embed non-Euclidean features from neighboring nodes, capturing the relationships between hyperspectral pixels. Lin used a GCN to convert features from a chaotic state to a highly cohesive state while reducing redundant information in the data. Noise in hyperspectral images is also a challenging problem. UnDIP uses a geometric approach to extract endmembers and employs deep learning to estimate abundances, thereby addressing the noise problem of hyperspectral images. HyMiNoR is also an efficient denoising method, based on a novel sparse noise framework.
Furthermore, U-Net is a classical encoder-decoder architecture in which the encoder embeds spatial and semantic information and the decoder mixes it with positional features.
Defects of prior art 2
Although these methods work well in hyperspectral classification, their models are computationally expensive and their run times are long. The convolution operation is the root of the problem: it brings excellent performance in various fields, but also heavy computational cost. The MLP-Mixer is a landmark study that uses only MLP operations, mixing all features, such as spatial information, by stacking layers. However, its simple structure weakens its expressive ability, mainly because semantic information between adjacent structures is ignored, so semantic relationships between features cannot be captured well.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a multi-scale network analysis method based on a mixed multilayer perceptron.
The technical scheme of the invention is as follows: the multi-scale network analysis method based on the mixed multilayer perceptron comprises an MSC block and a UMLP block.
Preferably, the MSC block comprises the following steps:
step one: to provide adapted data for the UMLP, the channel dimension is converted to 2n (n is an integer) by a mixed MLP layer;
step two: channel information is mixed using convolutional layers with convolution kernels of size 1 × 1;
step three: for each pixel of the hyperspectral image, the mixed channel information is output as an image-like patch;
step four: each row of the patch is a different feature representation of the pixel generated by the convolution, namely Pixel-C;
step five: each column of the patch is a summary of the original pixel channel values, namely Gen-C.
Preferably, the UMLP block comprises the following steps:
step one: MSC layers are stacked to obtain a larger receptive field, so that the input (U, j, x) has global feature information in both the Pixel-C dimension and the Gen-C dimension;
step two: the MixerBlock module mixes semantic information in the two directions through two MLP layers, and then extracts Pixel-C dimension features through one MLP (dimension reduction).
The multi-scale network analysis method based on the mixed multilayer perceptron has the following beneficial effects:
1. The invention realizes an efficient hyperspectral classification method that outperforms previous methods and therefore has commercial value.
2. The model size of the invention is only 0.185M, which is well suited to industrial requirements.
3. The method can be applied in the hyperspectral field, for example to field transition analysis and urban evolution.
4. The invention can monitor forest transition and provide timely early warning of forest disasters such as fire.
Drawings
FIG. 1 is a diagram of the multi-scale U-shaped multilayer perceptron of the present invention.
FIG. 2 is a data flow diagram of the present invention.
FIG. 3 is a visualization of the results of the present invention.
Detailed Description
The following description of the embodiments is provided to help those skilled in the art understand the present invention. The invention is not limited to the scope of these embodiments: it will be apparent to those skilled in the art that various changes may be made without departing from the spirit and scope of the invention as defined in the appended claims, and everything produced using the inventive concept is protected.
The invention is implemented with the existing deep learning framework PyTorch and corresponding programming libraries, mainly NumPy, Pandas, Tensor, and the like. PyTorch is mainly used to build the deep learning model, including linear modules, convolution modules, parameter penalty modules, and the like.
The specific scheme is realized according to the following principle:
In the proposed method, we convert segmentation into a classification task, performing pixel-level classification instead of patch segmentation. The method includes an MSC (multi-scale channel) block and a UMLP (U-shaped multilayer perceptron) block. To provide adapted data for the UMLP, the MSC block converts the channel dimension to 2n (n is an integer) through a mixed MLP layer. The channel information is then mixed using convolutional layers with convolution kernels of size 1 × 1. For each pixel of the hyperspectral image, the block outputs an image-like patch. Each row of the patch is a different feature representation of the pixel generated by the convolution, referred to below as Pixel-C. Each column is a summary of the original pixel channel values, referred to below as Gen-C.
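The shape of the patch produced by the MSC block can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function name `msc_block`, the parameters `width` and `n_features`, the random untrained weights, and the choice to treat each unified spectrum as a single-channel sequence are all assumptions made for illustration.

```python
import numpy as np

def msc_block(pixels, width, n_features, rng):
    """Turn each pixel spectrum into an image-like patch (illustrative only).

    pixels:     array of shape (batch, C), one spectrum per sampled pixel.
    width:      hypothetical target channel dimension after the linear layer.
    n_features: hypothetical number of 1 x 1 convolution output channels.
    """
    batch, c = pixels.shape
    # Linear layer: bring spectra with any number of bands C to a common width.
    w_in = rng.standard_normal((c, width)) * 0.1
    unified = pixels @ w_in                                # (batch, width)
    # A 1 x 1 convolution over a single-channel sequence is a per-position
    # scale and shift: output channel k computes conv_w[k] * x + conv_b[k].
    conv_w = rng.standard_normal(n_features) * 0.1
    conv_b = np.zeros(n_features)
    patch = unified[:, None, :] * conv_w[None, :, None] + conv_b[None, :, None]
    # Rows (axis 1) play the role of Pixel-C, columns (axis 2) of Gen-C.
    return patch                                           # (batch, n_features, width)
```

With 48 input bands, `width=64`, and `n_features=8`, each pixel becomes an 8 × 64 patch, which is the kind of image-like input the later mixing layers operate on.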
The MSC mainly comprises two parts: first, it extracts Pixel-C features, similar to a pooling operation; second, it mixes Gen-C information. A hyperspectral image has size H × W × C, where C is the number of spectral bands and H and W are the height and width of the image, respectively. Sampling shuffles all pixels, excludes the background, and draws each pixel at random.
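The sampling step can be sketched in plain Python as follows. This is only one plausible reading of the description: the function name `sample_pixels`, the use of label 0 for background, and the fixed seed are assumptions, not details given in the source.

```python
import random

def sample_pixels(labels, n_samples, background=0, seed=0):
    """Return (row, col) positions of randomly chosen non-background pixels.

    labels: H x W nested list of integer class ids (hypothetical layout).
    """
    # Collect every labelled pixel, skipping the background class.
    positions = [
        (r, c)
        for r, row in enumerate(labels)
        for c, label in enumerate(row)
        if label != background
    ]
    # Shuffle with a fixed seed so that the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(positions)
    return positions[:n_samples]
```

Each returned position indexes one spectrum of length C in the H × W × C cube, so the classifier is trained on individual pixels rather than on spatial patches.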
The Pixel-C dimension embeds only one feature of a pixel. Therefore, we extract more semantic features using convolution operations with a 1 × 1 kernel, embedding varied information that summarizes each pixel channel; this summary is named Gen-C.
The UMLP block consists of MixerBlock, U-shaped interpret, and skip-connect modules. MSC layers are stacked to obtain a larger receptive field, so that the input (U, j, x) has global feature information in both the Pixel-C dimension and the Gen-C dimension. The MixerBlock module mixes semantic information in the two directions through two MLP layers, and then extracts Pixel-C dimension features through one MLP (dimension reduction), as shown in FIG. 1.
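The two-direction mixing followed by dimension reduction can be sketched with NumPy as below. The sketch borrows the general MLP-Mixer pattern and is not the patent's exact module: the ReLU activation, the residual additions, the single linear layer used for the reduction, and all names and sizes (`mixer_block`, `init_params`, `hidden`, `reduced`) are assumptions, and normalization layers are omitted.

```python
import numpy as np

def mlp(x, w1, w2):
    """Two-layer perceptron applied along the last axis, ReLU in between."""
    return np.maximum(x @ w1, 0.0) @ w2

def mixer_block(patch, params):
    """patch: (pixel_c, gen_c) array for one pixel."""
    # Mix along the Gen-C direction (last axis), with a residual connection.
    patch = patch + mlp(patch, params["gen_w1"], params["gen_w2"])
    # Mix along the Pixel-C direction by transposing it onto the last axis.
    patch = patch + mlp(patch.T, params["pix_w1"], params["pix_w2"]).T
    # Dimension reduction along Pixel-C: (pixel_c, gen_c) -> (reduced, gen_c).
    return (patch.T @ params["reduce_w"]).T

def init_params(pixel_c, gen_c, hidden, reduced, seed=0):
    """Random, untrained weights; all sizes are hypothetical."""
    rng = np.random.default_rng(seed)

    def w(n_in, n_out):
        return rng.standard_normal((n_in, n_out)) * 0.1

    return {
        "gen_w1": w(gen_c, hidden), "gen_w2": w(hidden, gen_c),
        "pix_w1": w(pixel_c, hidden), "pix_w2": w(hidden, pixel_c),
        "reduce_w": w(pixel_c, reduced),
    }
```

Transposing before the second MLP is what lets the same last-axis operation mix the other direction, so no convolution is needed for either step.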
To address the huge computational cost of existing models, the invention provides a multi-scale network based on a mixed multilayer perceptron. The MLP model consumes very few computing resources: it needs only multiplication and accumulation at corresponding positions, and its parameters do not grow exponentially as network layers are stacked, so it overcomes the defects of existing models.
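The cost argument can be made concrete with a small counting sketch. The helper names and the layer widths are hypothetical, and the numbers are not intended to reproduce the 0.185M figure; the sketch only shows that the parameter count of stacked linear layers is a sum over layers.

```python
def linear_params(n_in, n_out, bias=True):
    """Number of weights (plus optional biases) in one fully connected layer."""
    return n_in * n_out + (n_out if bias else 0)

def linear_macs(n_in, n_out, positions=1):
    """Multiply-accumulate operations when the layer is applied at `positions` rows."""
    return n_in * n_out * positions

def stack_params(widths):
    """Total parameters of a stack of linear layers with the given widths."""
    return sum(linear_params(a, b) for a, b in zip(widths, widths[1:]))
```

For example, `stack_params([64, 64, 64])` is exactly twice `linear_params(64, 64)`: adding a layer adds its own parameters and nothing more.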
To address the poor expressive ability of the MLP-Mixer model, the invention provides a multi-scale, multi-channel U-shaped network. It converts the channel dimension to 2n (n is an integer) to unify the channel data of different data sets. The core of the module is a 1 × 1 convolution operation, which expands the representation of each pixel into multiple features to obtain a multidimensional distribution. It extracts multiple fragments from a single pixel to build expressive capacity. Finally, the representational ability of the model is further enhanced by stacking a hierarchical structure similar to U-Net.
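A U-Net-like hierarchy built only from linear layers might look like the following NumPy sketch. It is an assumption-laden illustration rather than the network of FIG. 1: the function name `u_shaped_mlp`, the width schedule, the ReLU activation, the random untrained weights, and the use of addition (rather than concatenation) for the skip connections are all choices made here.

```python
import numpy as np

def u_shaped_mlp(x, widths, seed=0):
    """x: (batch, widths[0]); widths: decreasing list such as [64, 32, 16]."""
    rng = np.random.default_rng(seed)
    skips = []
    # Encoder: shrink the feature width step by step, remembering each input.
    for n_in, n_out in zip(widths, widths[1:]):
        skips.append(x)
        w = rng.standard_normal((n_in, n_out)) * 0.1
        x = np.maximum(x @ w, 0.0)
    # Decoder: expand back and add the encoder activation of matching width.
    for n_out, n_in in zip(reversed(widths[:-1]), reversed(widths[1:])):
        w = rng.standard_normal((n_in, n_out)) * 0.1
        x = np.maximum(x @ w, 0.0) + skips.pop()
    return x
```

The output has the same width as the input, and each decoder level sees both the compressed features and the finer features saved on the way down.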
The data flow of the invention is shown in FIG. 2.
The invention has been evaluated in extensive experiments on widely adopted public data sets. MUMLP outperforms the most advanced methods across the board: on the Houston 2018 data set, its average accuracy is 6.61% higher than that of CAGU and 5.47% higher than that of MLP-Mixer, and improvements of 14.17% and 14.64% are reported over OTVCA.
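The source does not define "average accuracy". Assuming it means the mean of the per-class accuracies, as is common in hyperspectral classification reports, it could be computed as in this sketch; the function name is hypothetical.

```python
def average_accuracy(y_true, y_pred):
    """Mean of per-class accuracies over the classes present in y_true."""
    per_class = {}
    for t, p in zip(y_true, y_pred):
        correct, total = per_class.get(t, (0, 0))
        per_class[t] = (correct + (t == p), total + 1)
    return sum(c / n for c, n in per_class.values()) / len(per_class)
```

Unlike overall accuracy, this measure weights every class equally, so a rare class that is misclassified lowers the score as much as a common one.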
The results of the method are visualized in FIG. 3.

Claims (3)

1. A multi-scale network analysis method based on a mixed multilayer perceptron, characterized by comprising an MSC block and a UMLP block.
2. The multi-scale network analysis method based on a mixed multilayer perceptron according to claim 1, characterized in that the MSC block comprises the following steps:
step one: to provide adapted data for the UMLP, the channel dimension is converted to 2n (n is an integer) by a mixed MLP layer;
step two: channel information is mixed using convolutional layers with convolution kernels of size 1 × 1;
step three: for each pixel of the hyperspectral image, the mixed channel information is output as an image-like patch;
step four: each row of the patch is a different feature representation of the pixel generated by the convolution, namely Pixel-C;
step five: each column of the patch is a summary of the original pixel channel values, namely Gen-C.
3. The multi-scale network analysis method based on a mixed multilayer perceptron according to claim 1, characterized in that the UMLP block comprises the following steps:
step one: MSC layers are stacked to obtain a larger receptive field, so that the input (U, j, x) has global feature information in both the Pixel-C dimension and the Gen-C dimension;
step two: the MixerBlock module mixes semantic information in the two directions through two MLP layers, and then extracts Pixel-C dimension features through one MLP (dimension reduction).
CN202111498048.6A 2021-12-09 2021-12-09 Multi-scale network analysis method based on hybrid multi-layer perceptron Active CN114202690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111498048.6A CN114202690B (en) 2021-12-09 2021-12-09 Multi-scale network analysis method based on hybrid multi-layer perceptron

Publications (2)

Publication Number Publication Date
CN114202690A (en) 2022-03-18
CN114202690B (en) 2024-04-12

Family

ID=80651566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111498048.6A Active CN114202690B (en) 2021-12-09 2021-12-09 Multi-scale network analysis method based on hybrid multi-layer perceptron

Country Status (1)

Country Link
CN (1) CN114202690B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200065968A1 (en) * 2018-08-24 2020-02-27 Ordnance Survey Limited Joint Deep Learning for Land Cover and Land Use Classification
CN112580670A (en) * 2020-12-31 2021-03-30 中国人民解放军国防科技大学 Hyperspectral-spatial-spectral combined feature extraction method based on transfer learning
US20210117729A1 (en) * 2018-03-16 2021-04-22 The United States Of America, As Represented By The Secretary, Department Of Health & Human Services Using machine learning and/or neural networks to validate stem cells and their derivatives (2-d cells and 3-d tissues) for use in cell therapy and tissue engineered products
US20210272018A1 (en) * 2020-03-02 2021-09-02 Uatc, Llc Systems and Methods for Training Probabilistic Object Motion Prediction Models Using Non-Differentiable Prior Knowledge
CN113516019A (en) * 2021-04-23 2021-10-19 深圳大学 Hyperspectral image unmixing method and device and electronic equipment
CN113642445A (en) * 2021-08-06 2021-11-12 中国人民解放军战略支援部队信息工程大学 Hyperspectral image classification method based on full convolution neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHE MENG等: ""SS-MLP: A Novel Spectral-Spatial MLP Architecture for Hyperspectral Image classification"", 《REMOTE SENSING》, 11 October 2021 (2021-10-11), pages 1 - 25 *
池涛等: ""多层局部感知卷积神经网络的高光谱图像分类"", 《四川大学学报(自然科学版)》, 31 January 2020 (2020-01-31), pages 103 - 112 *

Also Published As

Publication number Publication date
CN114202690B (en) 2024-04-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant