CN114202690A - Multi-scale network analysis method based on mixed multilayer perceptron - Google Patents
Multi-scale network analysis method based on mixed multilayer perceptron
- Publication number
- CN114202690A (application CN202111498048.6A)
- Authority
- CN
- China
- Prior art keywords
- pixel
- steps
- network analysis
- scale network
- dimension
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/31—Programming languages or programming paradigms
- G06F8/315—Object-oriented languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a multi-scale network analysis method based on a mixed multilayer perceptron, comprising an MSC block and a UMLP block. The invention provides an efficient hyperspectral classification method that outperforms previous methods. The model size is only 0.185M, which suits industrial requirements. The method applies well to the hyperspectral field: it can monitor forest transition and give timely early warning of forest disasters such as fire.
Description
Technical Field
The invention relates to the field of hyperspectral images, in particular to a multiscale network analysis method based on a mixed multilayer perceptron.
Background
Huertas et al. were the first to apply hyperspectral images to urban analysis, extracting the boundaries of urban buildings using geometric topological theory. Adams et al. analyzed spectral bands with mathematical theory and combined other factors to analyze hyperspectral data. In general, early hyperspectral image studies mainly used theoretical knowledge from disciplines such as geometry to analyze shallow information. The most advanced methods now rely on deep learning, represented by convolutional neural networks (CNNs), which have achieved outstanding results in semantic segmentation, classification, object detection, and similar tasks. A new framework with a contextual CNN has been described that explores local contextual interactions by jointly exploiting neighboring local spatial-spectral relationships.
Undeniably, the CNN has become the de facto standard in deep learning. However, its parameters grow exponentially as convolutional layers are added, its size increases along with its computational cost, and it needs more than 20M to reach acceptable expressive capability. In addition, because of the sustained multiplication and addition operations, computational cost is a bottleneck for industrial applications and cannot meet the real-time requirements of industry.
Recently, vision transformers (ViT) based on self-attention layers have achieved state-of-the-art performance in computer vision and attracted the attention of many researchers. However, the self-attention mechanism requires computing three matrices, and deeply stacked models consume many computing resources. This leads to drawbacks such as heavy computation and a large model size. Tolstikhin et al. demonstrated that CNNs are not required for deep learning. The MLP-Mixer framework employs two types of MLPs to mix the feature information and the spatial information of each location separately, and it is a significant research topic. By sacrificing a small amount of accuracy, model speed is significantly increased and model size is compressed. Although simple, it performs excellently in various fields and disciplines.
Much research in remote sensing builds on the MLP-Mixer to mine more meaningful analysis. Sildir et al. proposed a superstructure-based mixed-integer nonlinear programming method for the optimal structural design of MLPs, covering neuron number selection, pruning, and input selection, and achieved state-of-the-art performance in experiments on two public hyperspectral image data sets. In addition, the graph convolutional network (GCN) can embed non-Euclidean features from neighboring nodes, capturing the relationships between hyperspectral pixels well. Lin used a GCN to convert features from a chaotic state to a highly cohesive state while reducing redundant information in the data.
Noise in hyperspectral images is also a challenging problem. UnDIP is an excellent method that extracts endmembers with a geometric approach and employs deep learning to estimate abundances, thereby addressing the noise problem of hyperspectral images. HyMiNoR is also an efficient denoising method that uses a novel sparse-noise framework.
Furthermore, U-Net is a classical encoder-decoder architecture, where the encoder embeds spatial and semantic information, which the decoder mixes with the position features.
Defects of the prior art
Although these methods work well in hyperspectral classification, their models are computationally very expensive and their run times are always lengthy. Convolution operations are the root of the problem: they bring excellent performance in various fields, but also heavy computational cost. The MLP-Mixer is a landmark study that involves only MLP operations, mixing all features, such as spatial information, by stacking layers. However, because the model has a simple structure, its expressive ability is weakened. The main reason is that semantic information between adjacent structures is ignored, so semantic relations between features cannot be captured well.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a multi-scale network analysis method based on a hybrid multi-layer perceptron.
The technical scheme of the invention is as follows: the multi-scale network analysis method based on the mixed multi-layer perceptron comprises an MSC block and a UMLP block.
Preferably, the MSC block comprises the steps of:
step one: to provide adapted data for the UMLP, converting the channel dimension to 2n (n is an integer) through a mixed MLP layer;
step two: mixing channel information using convolutional layers with convolution kernels of size 1 × 1;
step three: outputting, from the mixed channel information, an image-like patch for each pixel of the hyperspectral image;
step four: each row represents a different feature representation of each pixel generated by the convolution, namely Pixel-C;
step five: each column represents a summary of the original pixel channel values, namely Gen-C.
Preferably, the UMLP block comprises the steps of:
step one: stacking MSC layers to obtain a larger receptive field, the input (U, j, x) having global feature information in both the Pixel-C dimension and the Gen-C dimension;
step two: the MixerBlock module mixes semantic information in the two directions through two MLP layers, and then extracts Pixel-C features through one further MLP (dimension reduction).
The multi-scale network analysis method based on the mixed multilayer perceptron has the following beneficial effects:
1. The invention provides an efficient hyperspectral classification method that outperforms previous methods and therefore has commercial value.
2. The model size of the invention is only 0.185M, which suits industrial requirements.
3. The method applies well to the hyperspectral field, for example field transition analysis and city evolution.
4. The invention can monitor forest transition and give timely early warning of forest disasters such as fire.
Drawings
FIG. 1 is a diagram of the multi-scale U-shaped multi-layer perceptron according to the present invention.
FIG. 2 is a data flow diagram of the present invention.
Fig. 3 is a visualization of the results of the present invention.
Detailed Description
The following description of embodiments of the invention is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments. To those skilled in the art, various changes are apparent so long as they fall within the spirit and scope of the invention as defined in the appended claims, and everything produced using the inventive concept is protected.
The invention is implemented with the existing deep learning framework PyTorch and corresponding programming libraries, mainly NumPy, Pandas, Tensor, and the like. PyTorch is mainly used to build the deep learning model, including linear modules, convolution modules, parameter penalty modules, and so on.
The specific scheme is realized according to the following principle:
In the proposed method, we convert segmentation into a classification task, performing pixel-level classification instead of patch segmentation. The method includes an MSC (multi-scale channel) block and a UMLP (U-shaped multi-layer perceptron) block. To provide adapted data for the UMLP, the MSC block converts the channel dimension to 2n (n is an integer) through a mixed MLP layer. The channel information is then mixed using convolutional layers with 1 × 1 kernels. For each pixel of the hyperspectral image, the block outputs an image-like patch. Each row is a different feature representation of the pixel generated by the convolution, referred to below as Pixel-C. Each column is a summary of the original pixel channel values, referred to below as Gen-C.
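The MSC steps can be illustrated with a minimal NumPy sketch. It is one possible reading of the description, not the patented implementation: the function name `msc_block`, the sizes `C`, `D`, and `K`, the random weights, and the choice to treat the 1 × 1 convolution as one input channel with `K` output channels are all assumptions made for illustration.

```python
import numpy as np


def msc_block(spectrum, w_proj, b_proj, conv_w, conv_b):
    """Turn one pixel's spectrum into an image-like patch (illustrative only).

    spectrum: (C,) raw band values of a single pixel
    w_proj:   (C, D) weights of the channel-projection layer, with D = 2n
    b_proj:   (D,) bias of the projection layer
    conv_w:   (K,) weights of a 1 x 1 convolution, 1 input and K output channels
    conv_b:   (K,) bias of that convolution
    Returns a (K, D) patch: rows play the role of Pixel-C, columns of Gen-C.
    """
    # Project the C spectral bands onto a fixed, even channel count D.
    projected = spectrum @ w_proj + b_proj                          # (D,)
    # A 1 x 1 convolution over a single-channel signal scales and shifts
    # every position independently, once per output channel.
    patch = conv_w[:, None] * projected[None, :] + conv_b[:, None]  # (K, D)
    return patch


rng = np.random.default_rng(0)
C, D, K = 48, 64, 16          # hypothetical sizes; D = 2n with n = 32
patch = msc_block(
    rng.random(C),
    rng.normal(size=(C, D)), np.zeros(D),
    rng.normal(size=K), np.zeros(K),
)
print(patch.shape)  # (16, 64)
```

Fixing `D` regardless of the number of bands `C` is what lets data sets with different band counts share the same downstream layers.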
The MSC serves two purposes: first, it extracts Pixel-C features, similar to a pooling operation; second, it mixes Gen-C information. A hyperspectral image has size H × W × C, where C is the number of spectral bands and H and W are the height and width of the image, respectively. All pixels are shuffled, the background is excluded, and pixels are drawn by random sampling.
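The pixel sampling described above can be sketched as follows. The helper `sample_pixels`, the convention that label 0 marks the background, and the toy array sizes are assumptions for illustration; the text does not specify them.

```python
import numpy as np


def sample_pixels(cube, labels, num_samples, background=0, seed=0):
    """Draw labeled pixels at random from a hyperspectral cube.

    cube:   (H, W, C) hyperspectral image
    labels: (H, W) integer class map; `background` marks unlabeled pixels
            (assumed convention)
    Returns (spectra, classes) with at most `num_samples` rows.
    """
    h, w, c = cube.shape
    spectra = cube.reshape(h * w, c)                  # one row per pixel
    flat_labels = labels.reshape(h * w)
    keep = np.flatnonzero(flat_labels != background)  # exclude the background
    rng = np.random.default_rng(seed)
    rng.shuffle(keep)                                 # randomize pixel order
    chosen = keep[:num_samples]
    return spectra[chosen], flat_labels[chosen]


rng = np.random.default_rng(1)
cube = rng.random((4, 5, 3))                # toy H x W x C image
labels = rng.integers(0, 3, size=(4, 5))    # 0 = background in this sketch
X, y = sample_pixels(cube, labels, num_samples=6)
print(X.shape[1], bool((y != 0).all()))
```

Each returned row is one pixel's spectrum, which matches the pixel-level classification setting of the method.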
The Pixel-C dimension embeds only one feature of a pixel. Therefore, we extract more semantic features using convolution operations with a 1 × 1 kernel, embedding various information to summarize each pixel channel; this dimension is named Gen-C.
The UMLP block consists of MixerBlock, U-shaped interpret, and skip-connect modules. MSC layers are stacked to obtain a larger receptive field, and the input (U, j, x) has global feature information in both the Pixel-C dimension and the Gen-C dimension. The MixerBlock module mixes semantic information in the two directions through two MLP layers and then extracts Pixel-C features through one further MLP (dimension reduction), as shown in FIG. 1.
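A minimal NumPy sketch of such a MixerBlock follows. The two mixing MLPs and the reducing MLP come from the description; the GELU activation and residual additions are borrowed from the published MLP-Mixer design and are assumptions here, layer normalization is omitted for brevity, and all names and sizes are hypothetical.

```python
import numpy as np


def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def mlp(x, w1, w2):
    """Two-layer perceptron applied along the last axis of x."""
    return gelu(x @ w1) @ w2


def mixer_block(patch, params):
    """patch: (K, D) with K rows (Pixel-C) and D columns (Gen-C)."""
    # Mix along the Pixel-C axis: transpose so that K is the last axis.
    y = patch + mlp(patch.T, params["pix_w1"], params["pix_w2"]).T
    # Mix along the Gen-C axis.
    y = y + mlp(y, params["gen_w1"], params["gen_w2"])
    # One more MLP reduces the Pixel-C dimension from K to K_out.
    return mlp(y.T, params["red_w1"], params["red_w2"]).T   # (K_out, D)


rng = np.random.default_rng(0)
K, D, hidden, K_out = 16, 64, 32, 8     # hypothetical sizes
params = {
    "pix_w1": rng.normal(scale=0.1, size=(K, hidden)),
    "pix_w2": rng.normal(scale=0.1, size=(hidden, K)),
    "gen_w1": rng.normal(scale=0.1, size=(D, hidden)),
    "gen_w2": rng.normal(scale=0.1, size=(hidden, D)),
    "red_w1": rng.normal(scale=0.1, size=(K, hidden)),
    "red_w2": rng.normal(scale=0.1, size=(hidden, K_out)),
}
out = mixer_block(rng.normal(size=(K, D)), params)
print(out.shape)  # (8, 64)
```

Every operation is a matrix product plus an element-wise activation, which is the source of the low computational cost the description attributes to MLP models.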
To address the heavy computational cost of existing models, the invention provides a multi-scale network based on a hybrid multi-layer perceptron. An MLP model consumes very few computing resources: it needs only multiplication and accumulation at corresponding positions, and it does not suffer from parameters growing exponentially as network layers are stacked, so it overcomes the defects of existing models.
To address the poor expressive capability of the MLP-Mixer model, the invention provides a multi-scale, multi-channel U-shaped network. It converts the channel dimension to 2n (n is an integer) to unify the channel data of different data sets. The core of the module is a 1 × 1 convolution, which expands the representation of each pixel's features into a multidimensional distribution. It extracts multiple fragments from a single pixel to build expressive capacity. Finally, the representational capability of the model is further enhanced by stacking a hierarchical structure similar to U-Net.
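The U-Net-like hierarchy can be sketched in NumPy as an encoder that shrinks the Pixel-C dimension, a decoder that restores it, and skip connections between matching levels. This is only an illustration: the halving schedule, the ReLU activation, and additive skips (U-Net itself concatenates) are assumptions, and `u_shaped_mlp` is a hypothetical name.

```python
import numpy as np


def linear(x, w):
    """Linear layer along the first (Pixel-C) axis: (K_in, D) -> (K_out, D)."""
    return w @ x


def u_shaped_mlp(patch, down_ws, up_ws):
    """Encoder-decoder over the Pixel-C axis with additive skip connections."""
    skips = []
    x = patch
    for w in down_ws:                              # encoder: shrink Pixel-C
        skips.append(x)
        x = np.maximum(linear(x, w), 0.0)          # ReLU (assumed activation)
    for w, skip in zip(up_ws, reversed(skips)):    # decoder: restore Pixel-C
        x = np.maximum(linear(x, w), 0.0) + skip   # skip connection
    return x


rng = np.random.default_rng(0)
D = 64
patch = rng.normal(size=(16, D))
down_ws = [rng.normal(scale=0.1, size=(8, 16)), rng.normal(scale=0.1, size=(4, 8))]
up_ws = [rng.normal(scale=0.1, size=(8, 4)), rng.normal(scale=0.1, size=(16, 8))]
result = u_shaped_mlp(patch, down_ws, up_ws)
print(result.shape)  # (16, 64)
```

The skip connections let fine-grained features from the early levels bypass the bottleneck, which is the usual motivation for a U-shaped layout.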
The data flow of the present invention is shown in FIG. 2.
The invention has been evaluated in extensive experiments on a widely adopted public data set. MUMLP is comprehensively superior to the most advanced methods: on the Houston 2018 data set, its average accuracy is 6.61% higher than that of CAGU, 5.47% higher than that of MLP-Mixer, 14.17% higher than that of OTVCA, and 14.64% higher than that of OTVCA.
The results of the method are visualized in FIG. 3.
Claims (3)
1. A multi-scale network analysis method based on a mixed multilayer perceptron, characterized by comprising an MSC block and a UMLP block.
2. The multi-scale network analysis method based on a mixed multilayer perceptron according to claim 1, characterized in that the MSC block comprises the following steps:
step one: to provide adapted data for the UMLP, converting the channel dimension to 2n (n is an integer) through a mixed MLP layer;
step two: mixing channel information using convolutional layers with convolution kernels of size 1 × 1;
step three: outputting, from the mixed channel information, an image-like patch for each pixel of the hyperspectral image;
step four: each row represents a different feature representation of each pixel generated by the convolution, namely Pixel-C;
step five: each column represents a summary of the original pixel channel values, namely Gen-C.
3. The multi-scale network analysis method based on a mixed multilayer perceptron according to claim 1, characterized in that the UMLP block comprises the following steps:
step one: stacking MSC layers to obtain a larger receptive field, the input (U, j, x) having global feature information in both the Pixel-C dimension and the Gen-C dimension;
step two: the MixerBlock module mixes semantic information in the two directions through two MLP layers, and then extracts Pixel-C features through one further MLP (dimension reduction).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111498048.6A CN114202690B (en) | 2021-12-09 | 2021-12-09 | Multi-scale network analysis method based on hybrid multi-layer perceptron |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111498048.6A CN114202690B (en) | 2021-12-09 | 2021-12-09 | Multi-scale network analysis method based on hybrid multi-layer perceptron |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114202690A (en) | 2022-03-18 |
CN114202690B (en) | 2024-04-12 |
Family
ID=80651566
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111498048.6A Active CN114202690B (en) | 2021-12-09 | 2021-12-09 | Multi-scale network analysis method based on hybrid multi-layer perceptron |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114202690B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200065968A1 (en) * | 2018-08-24 | 2020-02-27 | Ordnance Survey Limited | Joint Deep Learning for Land Cover and Land Use Classification |
CN112580670A (en) * | 2020-12-31 | 2021-03-30 | 中国人民解放军国防科技大学 | Hyperspectral-spatial-spectral combined feature extraction method based on transfer learning |
US20210117729A1 (en) * | 2018-03-16 | 2021-04-22 | The United States Of America, As Represented By The Secretary, Department Of Health & Human Services | Using machine learning and/or neural networks to validate stem cells and their derivatives (2-d cells and 3-d tissues) for use in cell therapy and tissue engineered products |
US20210272018A1 (en) * | 2020-03-02 | 2021-09-02 | Uatc, Llc | Systems and Methods for Training Probabilistic Object Motion Prediction Models Using Non-Differentiable Prior Knowledge |
CN113516019A (en) * | 2021-04-23 | 2021-10-19 | 深圳大学 | Hyperspectral image unmixing method and device and electronic equipment |
CN113642445A (en) * | 2021-08-06 | 2021-11-12 | 中国人民解放军战略支援部队信息工程大学 | Hyperspectral image classification method based on full convolution neural network |
Non-Patent Citations (2)
Title |
---|
ZHE MENG et al.: "SS-MLP: A Novel Spectral-Spatial MLP Architecture for Hyperspectral Image Classification", Remote Sensing, 11 October 2021 (2021-10-11), pages 1 - 25 * |
Chi Tao et al.: "Hyperspectral image classification with a multi-layer local-perception convolutional neural network", Journal of Sichuan University (Natural Science Edition), 31 January 2020 (2020-01-31), pages 103 - 112 * |
Also Published As
Publication number | Publication date |
---|---|
CN114202690B (en) | 2024-04-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Sony et al. | A systematic review of convolutional neural network-based structural condition assessment techniques | |
CN112634276B (en) | Lightweight semantic segmentation method based on multi-scale visual feature extraction | |
CN112163449B (en) | Lightweight multi-branch feature cross-layer fusion image semantic segmentation method | |
CN109635662B (en) | Road scene semantic segmentation method based on convolutional neural network | |
CN106934374B (en) | Method and system for identifying traffic signboard in haze scene | |
CN111898439A (en) | Deep learning-based traffic scene joint target detection and semantic segmentation method | |
CN114842085B (en) | Full-scene vehicle attitude estimation method | |
CN113436210B (en) | Road image segmentation method fusing context progressive sampling | |
CN112733693B (en) | Multi-scale residual error road extraction method for global perception high-resolution remote sensing image | |
CN112819000A (en) | Streetscape image semantic segmentation system, streetscape image semantic segmentation method, electronic equipment and computer readable medium | |
CN114358246A (en) | Graph convolution neural network module of attention mechanism of three-dimensional point cloud scene | |
CN114202690B (en) | Multi-scale network analysis method based on hybrid multi-layer perceptron | |
CN117475150A (en) | Efficient semantic segmentation method based on SAC-UNet | |
Cao et al. | An Improved YOLOv4 Lightweight Traffic Sign Detection Algorithm | |
CN113449656B (en) | Driver state identification method based on improved convolutional neural network | |
Fan et al. | CM-YOLOv8: Lightweight YOLO for Coal Mine Fully Mechanized Mining Face | |
CN111191674A (en) | Primary feature extractor based on densely-connected porous convolution network and extraction method | |
Ma et al. | Rtsnet: Real-time semantic segmentation network for outdoor scenes | |
CN116434039B (en) | Target detection method based on multiscale split attention mechanism | |
CN116188774B (en) | Hyperspectral image instance segmentation method and building instance segmentation method | |
Yao et al. | Semantic information processing for interoperability in the Industrial Internet of Things | |
CN115205637B (en) | Intelligent identification method for mine car materials | |
CN114067116B (en) | Real-time semantic segmentation system and method based on deep learning and weight distribution | |
CN111798461B (en) | Pixel-level remote sensing image cloud area detection method for guiding deep learning by coarse-grained label | |
CN117994655A (en) | Bridge disease detection system and method based on improved Yolov s model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||