CN116630709A - Hyperspectral image classification device and method capable of configuring mixed convolutional neural network - Google Patents
- Publication number
- CN116630709A CN202310598079.1A
- Authority
- CN
- China
- Prior art keywords
- data
- module
- neural network
- dimensional
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/58—Extraction of image or video features relating to hyperspectral data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/94—Hardware or software architectures specially adapted for image or video understanding
- G06V10/955—Hardware or software architectures specially adapted for image or video understanding using specific electronic processors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/194—Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention provides a hyperspectral image classification device and method based on a configurable hybrid convolutional neural network. A 2D-3D hybrid convolutional neural network fully extracts feature information in the spatial and spectral dimensions and effectively reduces the scale and complexity of the model while maintaining classification accuracy. A software-hardware co-design is developed on a ZYNQ heterogeneous multi-core processor: inference of the hybrid convolutional neural network model is accelerated with programmable logic resources, and both the parallel dimensions and the convolution kernel are flexible and configurable. The method supports rapid deployment and inference acceleration of 2D and 3D convolutional neural networks on FPGA processors, applies well to different FPGA processors and convolutional neural network models, and supports low-power, high-speed hyperspectral image classification applications.
Description
Technical Field
The invention relates to the technical field of real-time remote sensing data processing, and in particular to a hyperspectral image classification device and method based on a configurable hybrid convolutional neural network.
Background
Hyperspectral remote sensing uses an imaging spectrometer to image the spatial extent of ground targets while acquiring tens to hundreds of contiguous narrow bands at nanometer-scale spectral resolution. The resulting hyperspectral remote sensing images contain rich spatial and spectral information and play an important role in precision agriculture, geological survey, and other fields. A hyperspectral image contains information that characterizes different ground features, and how to use this information more effectively to classify unknown regions has long been a research hotspot.
In recent years, several deep learning models have been applied to hyperspectral image classification, such as stacked autoencoders (Stacked Autoencoder, SAE), deep belief networks (Deep Belief Networks, DBN), and convolutional neural networks (Convolutional Neural Networks, CNN). Classification methods based on convolutional neural networks are the most popular. Experiments show that the feature extraction and learning capability of convolutional neural networks is superior to that of traditional feature extraction methods, which greatly improves classification accuracy.
However, how to fully extract the rich spatial and spectral information in a hyperspectral image with a convolutional neural network is a key problem that determines classification accuracy. Wei L et al. extract the spectral features of the hyperspectral image with a one-dimensional convolutional neural network (1D-CNN). The network model is simple, but it can only extract spectral vectors and ignores spatial features. Because hyperspectral images exhibit the phenomena of "same object, different spectra" and "different objects, same spectrum", classification based on spectral features alone rarely achieves good results. Li S et al. use PCA to reduce the dimensionality of the hyperspectral image data and then use a two-dimensional convolutional neural network (2D-CNN) to extract spatial information in the neighborhood of the input pixel, but this model can only extract spatial feature information and cannot capture the relationships between spectral channels. Researchers have further proposed that extracting joint spatial-spectral features with a three-dimensional convolutional neural network (3D-CNN) can improve classification accuracy. However, a 3D-CNN model needs deep convolution layers to extract the joint spatial-spectral features effectively, which significantly increases model complexity and the number of training samples required. In 2020, Swalpa Kumar Roy et al. proposed the HybridSN model, which combines 3D-CNN and 2D-CNN to fully extract spatial and spectral feature information and reduces the computational complexity compared with classifying hyperspectral images with 3D-CNN alone.
It is worth noting that, to give the model stronger spatial feature extraction capability, the HybridSN model uses as many as three 3D convolution layers for feature extraction, and stacking multiple 3D convolution layers to extract deep features significantly increases the complexity and computational cost of the model. In addition, a hyperspectral image inherently carries spectral information in hundreds of bands along the spectral dimension, so such high-density data poses a significant challenge to both computation and cache throughput during processing.
In summary, the huge data volume of hyperspectral images and the complex computation required by deep learning pose a twofold challenge to the development of real-time hyperspectral image processing devices, especially devices based on FPGA processors, whose resources are more limited. Meanwhile, to reduce the dependence of FPGA design on the designer's expertise, it is also important to study flexible, configurable processing architectures that adapt to different network models and thereby greatly speed up development.
Applying deep learning to hyperspectral image classification is a relatively cutting-edge research field, and existing patents concentrate on the deep learning model itself. For example, Chinese patent application CN202210279207.1 (a hyperspectral classification method of a lightweight mixed convolution model based on global reasoning) adds a global reasoning module that effectively extracts global and deep feature information of the hyperspectral image by reasoning about the contextual relationships among different regions. That application is limited to the hyperspectral classification method and offers no guidance on designing a real-time processing device based on an embedded processor. Chinese patent application CN201910766635.5 (hyperspectral image classification method based on FPGA) designs a depth edge filter to classify hyperspectral images and uses a field programmable gate array (FPGA) with the OpenCL heterogeneous computing framework for the device design.
Disclosure of Invention
To solve the above technical problems, the invention provides a hyperspectral image classification device and method based on a configurable hybrid convolutional neural network. It adopts a real-time hyperspectral image classification system based on a ZYNQ heterogeneous multi-core processor and uses the hybrid convolutional neural network to fully extract feature information in the spatial and spectral dimensions while effectively reducing the scale and complexity of the model. This supports deployment and fast operation on an embedded processor and enables low-power, high-speed hyperspectral image processing applications.
To achieve the above purpose, the invention adopts the following technical scheme:
A hyperspectral image classification device based on a configurable hybrid convolutional neural network comprises a processing system side, a programmable logic side, and a DDR4 memory. The processing system side integrates a quad-core ARM Cortex-A53 processor and is suited to task control and floating-point operations. The programmable logic side integrates a programmable logic array, provides powerful parallel processing capability for large amounts of data, and performs the parallel accelerated computation of the convolutional neural network. The processing system side comprises a data and task scheduling module, an external interface controller module, a PCA module, and a Softmax module. The programmable logic side comprises a configuration register module, a convolution operation module, a pooling operation module, and a data transmission module.
Further, the data and task scheduling module handles external data reception, intermediate data caching, data preprocessing, and scheduling control among the convolutional neural network layers; the external interface controller module comprises a CameraLink interface controller and a UART interface controller, which handle data communication with the hyperspectral camera and an external memory card, respectively; the PCA module performs data dimension reduction by principal component analysis; and the Softmax module applies softmax to the class confidence values, normalizing the confidence values of all classes into probabilities, and finally outputs the class and its probability.
Further, the configuration register module receives instruction information from the processing system side and writes it to the corresponding registers of the convolution operation module and the pooling operation module; the convolution operation module accelerates convolution computation in parallel, supports flexible configuration of two-dimensional and three-dimensional convolutional neural network computation with different convolution kernel sizes, and supports multi-dimensional parallel acceleration over the output feature channel, the convolution kernel channel, and the convolution kernel depth; the pooling operation module accelerates two-dimensional pooling computation and supports parallel acceleration along the input channel dimension; the data transmission module transfers data between the convolution and pooling operation modules and the DDR4 memory, with a custom data read-write bus between the operation modules and the data transmission module and an AXI4 bus between the data transmission module and the DDR4 memory.
The invention also provides a hyperspectral image classification method based on a configurable hybrid convolutional neural network, which comprises the following steps:
Step 1: obtain an original hyperspectral image, denoted X, whose data cube has dimensions M×N×D, where M and N are the width and height of the spatial dimensions and D is the number of spectral bands. Apply principal component analysis along the spectral dimension of the original image to reduce the number of spectral channels from D to B, so that the reduced data cube has dimensions M×N×B.
Step 2: cut the dimension-reduced data cube into k three-dimensional patches of size S×S×B, then input the patches one by one into the hybrid convolutional neural network model.
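Steps 1 and 2 can be sketched in NumPy as follows. This is a minimal illustration, not the patented implementation: the function names, the toy cube size, and the choice of one zero-padded patch centered on every pixel are assumptions, since the text does not state how the k patches are positioned.

```python
import numpy as np

def pca_reduce(cube, n_bands):
    """Project an (M, N, D) cube onto its top n_bands principal components."""
    m, n, d = cube.shape
    pixels = cube.reshape(-1, d).astype(np.float64)
    pixels -= pixels.mean(axis=0)             # center each spectral band
    cov = np.cov(pixels, rowvar=False)        # (D, D) band covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_bands]       # keep the n_bands largest components
    return (pixels @ top).reshape(m, n, n_bands)

def extract_patches(cube, size):
    """Cut one size x size x B patch centered on every pixel (assumed zero padding)."""
    m, n, b = cube.shape
    r = size // 2
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)))
    patches = np.empty((m * n, size, size, b), dtype=cube.dtype)
    for i in range(m):
        for j in range(n):
            patches[i * n + j] = padded[i:i + size, j:j + size, :]
    return patches

rng = np.random.default_rng(0)
X = rng.random((12, 10, 40))                  # toy M x N x D cube
reduced = pca_reduce(X, n_bands=6)            # M x N x B
patches = extract_patches(reduced, size=5)    # k patches of S x S x B
print(reduced.shape, patches.shape)
```

With this patching choice k equals M×N; a scheme that only cuts patches around labeled pixels would give a smaller k.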
Further, the hybrid convolutional neural network consists of several three-dimensional convolution layers, two-dimensional convolution layers, and fully connected layers connected in series. The three-dimensional convolution layers extract three-dimensional features from the image, the two-dimensional convolution layers extract further two-dimensional features from the extracted three-dimensional features, and the fully connected layers map the feature space computed by all the convolution layers to the sample label space, which improves the robustness of the whole network.
The beneficial effects are as follows:
1. Based on the ZYNQ heterogeneous multi-core processor, the invention first uses the ARM cores to perform PCA dimension reduction on the hyperspectral image cube and then uses programmable logic resources to accelerate the convolutional neural network in parallel. This exploits the respective strengths of software and hardware, improves the energy efficiency ratio, and meets the application requirements of real-time hyperspectral image classification.
2. The convolutional neural network architecture designed by the invention supports both 2D-CNN and 3D-CNN computation with flexibly adjustable parallel dimensions, adapts to different convolutional neural network structures, and can greatly speed up the path from algorithm to embedded design.
3. The invention classifies hyperspectral images with a hybrid 2D-CNN + 3D-CNN convolutional neural network, which fully extracts spatial and spectral feature information while keeping model complexity under control, making real-time, fast hyperspectral image classification on an embedded platform possible.
Drawings
Fig. 1 is a schematic structural diagram of the hyperspectral image classification device based on a configurable hybrid convolutional neural network.
Fig. 2 shows the loop-optimization pseudocode of the three-dimensional convolution operation of the invention.
Fig. 3 is a schematic diagram of data loading with weight reuse according to the invention.
Fig. 4 is a schematic diagram of the four-dimensional feature rearrangement of the invention.
Fig. 5 is a schematic diagram of the five-dimensional weight arrangement of the invention.
Detailed Description
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here are for illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments described below may be combined with each other as long as they do not conflict.
The invention provides a hyperspectral image classification device and method based on a configurable hybrid convolutional neural network, developed through software-hardware co-design. The invention is further illustrated below with specific examples.
In an exemplary embodiment of the invention, a real-time hyperspectral image classification system is built on a ZYNQ MPSoC heterogeneous multi-core processor, and the HybridSN hybrid convolutional neural network model is used to classify hyperspectral images in real time with good classification results. Notably, the invention uses the large amount of programmable logic in the ZYNQ MPSoC processor to accelerate 2D-CNN and 3D-CNN computation in parallel. Because programmable logic is flexible to design with, the approach applies well to different hybrid convolutional neural network models, and a design can be adapted flexibly and quickly through parameter configuration.
The structure of the HybridSN network model is shown in Table 1. It comprises 3 three-dimensional convolution layers, 1 two-dimensional convolution layer, and 2 fully connected layers. Combining 3D-CNN and 2D-CNN fully extracts spatial and spectral feature information while reducing network complexity.
After PCA dimension reduction of the original hyperspectral image data, 30 bands are retained. The data cube is then cut into three-dimensional patches of size 25×25×30 (width 25, height 25, depth 30). These patches are the input of the convolutional neural network; the convolution kernel size, cache, and computation characteristics of each layer in the network are shown in Table 1.
TABLE 1
Further, the HybridSN hybrid convolutional neural network model is deployed on the ZYNQ MPSoC processor to classify hyperspectral images in real time. In an exemplary embodiment of the invention, the hyperspectral image classification device based on the configurable hybrid convolutional neural network consists mainly of a Xilinx ZYNQ UltraScale+ MPSoC chip and DDR4 memory. The internal resources of the MPSoC chip are divided into a processing system (Processing System, PS) side and a programmable logic (Programmable Logic, PL) side. The PS side integrates a quad-core ARM Cortex-A53 processor and is suited to task control, floating-point operations, and the like; the PL side integrates a programmable logic array, provides powerful parallel processing capability, and can process large amounts of data in parallel.
Taking into account the characteristics and complexity of the algorithm and the structure of the ZYNQ chip, the invention partitions the design between software and hardware: the PS side handles data and task scheduling, external interface control, PCA, Softmax, and other data processing, and the PL side performs the parallel accelerated computation of the convolutional neural network.
As shown in Fig. 1, the PS side of the hyperspectral image classification device based on the configurable hybrid convolutional neural network mainly includes:
(1) Data and task scheduling module: responsible for external data reception, intermediate data caching, data preprocessing, scheduling control among CNN layers, and so on.
(2) External interface controller module: mainly comprises a CameraLink interface controller, a UART controller, and a DDR4 controller, which handle data communication with the hyperspectral camera, the external memory card, and the DDR4 memory, respectively.
(3) PCA module: performs data dimension reduction with the PCA method.
(4) Softmax module: applies softmax to the class confidence values, normalizing the confidence values of all classes into probabilities, and finally outputs the class and its probability.
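The Softmax step on the PS side can be illustrated with a short NumPy sketch. The function name and the sample confidence values are hypothetical; subtracting the maximum before exponentiation is a standard numerical-stability measure and is not stated in the text.

```python
import numpy as np

def softmax(scores):
    """Turn raw class confidence values into probabilities that sum to 1."""
    scores = np.asarray(scores, dtype=np.float64)
    shifted = scores - scores.max()           # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

confidences = [2.0, 0.5, -1.0, 3.5]           # hypothetical per-class network outputs
probs = softmax(confidences)
label = int(np.argmax(probs))                 # predicted class index
print(label, probs[label])
```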
The PL side of the hyperspectral image classification device based on the configurable hybrid convolutional neural network mainly includes:
(1) Configuration register module: receives instruction information from the PS side and writes it to the corresponding registers of the operation modules.
(2) Convolution operation module: accelerates convolution computation in parallel, supports flexible configuration of 2D-CNN and 3D-CNN computation with different convolution kernel sizes, and supports multi-dimensional parallel acceleration over the output feature channel, convolution kernel depth, and so on. The module consists of 6 sub-modules: data read, weight buffer, feature buffer, data loading, convolution operation, and data write.
(3) Pooling operation module: accelerates two-dimensional pooling computation and supports parallel acceleration along the input channel dimension. The module consists of 4 sub-modules: data read, width-direction pooling, height-direction pooling, and data write.
(4) Data transmission module: transfers data between the operation modules and the DDR4 memory. A custom data read-write bus connects the operation modules to the data transmission module, and an AXI4 bus connects the data transmission module to the external DDR4 memory.
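The pooling operation module in item (3) splits two-dimensional pooling into a width-direction pass followed by a height-direction pass. The sketch below shows why this split gives the same result as pooling over the full 2D window. It assumes max pooling with a stride equal to the window size, which the text does not specify, and the function names are illustrative only.

```python
import numpy as np

def pool_width(x, k):
    """Max over non-overlapping windows of k columns. x: (C, H, W)."""
    c, h, w = x.shape
    x = x[:, :, : (w // k) * k]               # drop columns that do not fill a window
    return x.reshape(c, h, w // k, k).max(axis=3)

def pool_height(x, k):
    """Max over non-overlapping windows of k rows. x: (C, H, W)."""
    c, h, w = x.shape
    x = x[:, : (h // k) * k, :]               # drop rows that do not fill a window
    return x.reshape(c, h // k, k, w).max(axis=2)

def pool2d(x, k):
    """Width pass then height pass; every channel is processed at once."""
    return pool_height(pool_width(x, k), k)

rng = np.random.default_rng(1)
feat = rng.standard_normal((4, 6, 8))         # toy (channels, height, width) feature map
out = pool2d(feat, 2)
direct = feat.reshape(4, 3, 2, 4, 2).max(axis=(2, 4))   # one-shot 2x2 max pooling
print(out.shape, np.allclose(out, direct))
```

The vectorized channel axis stands in for the parallelism along the input channel dimension; in hardware each channel lane would have its own comparator chain.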
About 90% of the computation in the neural network is concentrated in convolution operations, which take the form of multiple nested loops over large amounts of repeatedly used data. Optimizing these nested loops is therefore one focus of the acceleration design in the invention.
Fig. 2 shows the nested loop structure of the three-dimensional convolution operation. In its direct form, a three-dimensional convolution comprises 8 nested loops, and the invention is described using the loop optimization method for the three-dimensional convolution operation.
As shown in Fig. 2, the invention applies loop tiling to three dimensions: the output feature channel, the convolution kernel channel, and the convolution kernel depth. The tiles are processed in parallel, and the tile sizes of the three dimensions are denoted P_n, P_c, and P_d; this parallelism can be configured flexibly according to the specific network structure and hardware resources. Within each tile, the loops are unrolled and mapped onto several groups of parallel multiply-add arrays. In addition, because the height and width of the output feature are equivalent, the invention merges the loops over these two dimensions and tiles the merged loop. Compared with the structure before optimization, the optimized nested loop structure computes in parallel over the output feature channel, convolution kernel channel, and convolution kernel depth and pipelines the other dimensions, which greatly simplifies the subsequent design of the convolution operation module and supports flexible configuration of the parallel dimensions and convolution type.
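The loop restructuring can be sketched in Python by comparing the direct 8-loop form with a tiled form. This is an illustration under assumptions: Fig. 2 is not reproduced here, so the loop order, the unit stride, the absence of padding, and the function names are all guesses. The einsum over a (P_n, P_c, P_d) block stands in for the unrolled multiply-add array.

```python
import numpy as np

def conv3d_direct(x, w):
    """Direct 8-loop 3D convolution. x: (C, D, H, W), w: (N, C, Kd, Kh, Kw)."""
    C, D, H, W = x.shape
    N, _, Kd, Kh, Kw = w.shape
    Do, Ho, Wo = D - Kd + 1, H - Kh + 1, W - Kw + 1
    y = np.zeros((N, Do, Ho, Wo))
    for n in range(N):                              # output feature channel
        for od in range(Do):                        # output depth
            for oh in range(Ho):                    # output height
                for ow in range(Wo):                # output width
                    for c in range(C):              # kernel channel
                        for kd in range(Kd):        # kernel depth
                            for kh in range(Kh):    # kernel height
                                for kw in range(Kw):  # kernel width
                                    y[n, od, oh, ow] += (
                                        w[n, c, kd, kh, kw]
                                        * x[c, od + kd, oh + kh, ow + kw])
    return y

def conv3d_tiled(x, w, Pn, Pc, Pd):
    """Same result with tiles of size Pn, Pc, Pd and a merged height/width loop."""
    C, D, H, W = x.shape
    N, _, Kd, Kh, Kw = w.shape
    Do, Ho, Wo = D - Kd + 1, H - Kh + 1, W - Kw + 1
    y = np.zeros((N, Do, Ho, Wo))
    for n0 in range(0, N, Pn):                      # tile over output channels
        for c0 in range(0, C, Pc):                  # tile over kernel channels
            for d0 in range(0, Kd, Pd):             # tile over kernel depth
                for od in range(Do):
                    for hw in range(Ho * Wo):       # merged height/width loop
                        oh, ow = divmod(hw, Wo)
                        for kh in range(Kh):
                            for kw in range(Kw):
                                # One block of weights and matching features;
                                # edge tiles are simply smaller.
                                wb = w[n0:n0 + Pn, c0:c0 + Pc, d0:d0 + Pd, kh, kw]
                                pd = wb.shape[2]
                                xb = x[c0:c0 + Pc, od + d0:od + d0 + pd,
                                       oh + kh, ow + kw]
                                # Stand-in for the parallel multiply-add array.
                                y[n0:n0 + Pn, od, oh, ow] += np.einsum(
                                    "ncd,cd->n", wb, xb)
    return y

rng = np.random.default_rng(2)
x = rng.standard_normal((3, 5, 4, 4))
w = rng.standard_normal((4, 3, 3, 2, 2))
y_ref = conv3d_direct(x, w)
y_tiled = conv3d_tiled(x, w, Pn=2, Pc=2, Pd=2)
print(np.allclose(y_ref, y_tiled))
```

Because the kernel height and width remain ordinary loops, the same structure handles any kernel size, and setting the depth extents to 1 reduces it to a 2D convolution.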
Notably, the nested loop structure also contains the two loops over convolution kernel height and convolution kernel width, so different convolution kernel sizes can be supported flexibly with the help of the data loading sub-module in the convolution operation module.
In practice, when the invention is applied to processor platforms with different resource configurations, it supports two configuration strategies. On a platform with ample computing resources (mainly DSP resources), the PL side can instantiate several convolution operation modules, each supporting convolution of a specific convolution type and kernel size. On a platform with limited computing resources but relatively ample storage resources, the PL side instantiates only one convolution operation module; the convolution computation is configured through the configuration register module, and the data loading sub-module supplies the matching feature data and weight data, so that convolution of multiple convolution types and kernel sizes is realized.
Besides computing capability, the design of a real-time hyperspectral image classification system must also consider data transmission bandwidth: caches should be designed as reasonably as possible to reduce the time spent on data transmission and thereby improve the overall processing performance of the system. In a convolution operation, a large amount of data participates in the computation across multiple loops, and part of that data is reused. Designing the data reuse scheme around this property of convolution reduces the memory access bandwidth requirement and contributes substantially to acceleration performance. Taking the characteristics of the hyperspectral image classification model into account, the invention adopts weight reuse to avoid repeatedly reading and writing weight data. FIG. 3 illustrates the feature and weight loading flow based on weight reuse in the three-dimensional convolution operation. During the convolution, one weight datum and the multiple feature data corresponding to it are loaded each time, and the multiply-add operations are completed one by one.
To match the parallel computing method and the weight reuse scheme, the invention designs a feature data storage scheme and a weight data storage scheme adapted to them.
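The weight reuse idea can be illustrated with a small software model. This is a sketch under assumptions (stride 1, no padding, a hypothetical function name, and a load counter added only for illustration): each weight is fetched once, and every feature value it multiplies is then processed one by one.

```python
import numpy as np

def conv3d_weight_reuse(x, w):
    """3D convolution organised around weight reuse.
    x: (C, D, H, W) features, w: (N, C, Kd, Kh, Kw) weights.
    Returns the output and the number of weight loads performed."""
    C, D, H, W = x.shape
    N, _, Kd, Kh, Kw = w.shape
    Do, Ho, Wo = D - Kd + 1, H - Kh + 1, W - Kw + 1
    y = np.zeros((N, Do, Ho, Wo))
    weight_loads = 0
    for n, c, kd, kh, kw in np.ndindex(N, C, Kd, Kh, Kw):
        weight = w[n, c, kd, kh, kw]      # one weight, loaded exactly once
        weight_loads += 1
        # All feature values that this weight is multiplied with.
        feats = x[c, kd:kd + Do, kh:kh + Ho, kw:kw + Wo]
        for od, oh, ow in np.ndindex(Do, Ho, Wo):
            y[n, od, oh, ow] += weight * feats[od, oh, ow]   # multiply-add, one by one
    return y, weight_loads
```

In this model the number of weight loads equals the number of weights, whereas a window-by-window formulation would reread every weight once per output position.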
FIG. 4 shows the storage scheme for feature data. In the three-dimensional convolution operation, the input feature data is four-dimensional and can be expressed as (feature channel, feature depth, feature height, feature width), abbreviated as (C_d, D_d, H_d, W_d). The original four-dimensional feature data is rearranged into three dimensions: the first-layer depth data of all channels is arranged in sequence, followed by the second-layer depth data, and so on up to the last-layer depth data, and the results are spliced together along the depth direction to form the new three-dimensional feature data. The tile size of the input feature channel is P_c. In the new three-dimensional feature data, every P_c data with width 1 and height 1 are treated as one sub-block, so the new three-dimensional feature composed of these sub-blocks can be expressed as (D_d * ceil(C_d / P_c), H_d, W_d). The feature data is stored in the external DDR4 memory as follows: the data in the first sub-block at the first row, first column of the new three-dimensional feature is stored first, then the data in the first sub-block at the first row, second column, and so on across each row and then on to the next row, until the data in the first sub-block at row H_d, column W_d has been stored; then the data in the second sub-block at the first row, first column is stored, and so on until all feature data has been stored.
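The feature storage order can be expressed as a reshape and transpose. The sketch below is one reading of the description, with assumptions stated plainly: the function name is hypothetical, channels are zero-padded up to a multiple of P_c (the text only gives the count ceil(C_d / P_c)), and positions are visited across each row before moving to the next row.

```python
import numpy as np

def feature_to_ddr_order(x, p_c):
    """Flatten features x of shape (C_d, D_d, H_d, W_d) into storage order.

    Order, outermost to innermost: sub-block index (depth layer, then
    channel tile), row, column, and the P_c values inside the sub-block."""
    C, D, H, W = x.shape
    tiles = -(-C // p_c)                       # ceil(C / p_c)
    padded = np.zeros((tiles * p_c, D, H, W), dtype=x.dtype)
    padded[:C] = x                             # zero padding is an assumption
    # (tiles, p_c, D, H, W) -> (D, tiles, H, W, p_c)
    ordered = padded.reshape(tiles, p_c, D, H, W).transpose(2, 0, 3, 4, 1)
    return ordered.reshape(-1)

# Small example: 3 channels, depth 2, 2x2 spatial size, P_c = 2.
x = np.arange(3 * 2 * 2 * 2).reshape(3, 2, 2, 2)
flat = feature_to_ddr_order(x, p_c=2)
```

With this layout, the P_c channel values consumed in one parallel step sit at consecutive addresses, which is what makes burst reads from DDR4 effective.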
FIG. 5 shows the storage scheme for weight data. In the three-dimensional convolution operation, the input weight is five-dimensional and can be expressed as (convolution kernel output channel, convolution kernel input channel, convolution kernel depth, convolution kernel height, convolution kernel width), abbreviated as (N_w, C_w, D_w, H_w, W_w). The original five-dimensional weight data is rearranged into four dimensions: on each output channel, the first-layer depth data of all input channels is arranged in sequence, followed by the second-layer depth data, and so on up to the last-layer depth data, and the data on each output channel is then spliced together along the depth direction to form the new four-dimensional weight data. On each output channel, every C_w data with width 1 and height 1 are treated as one sub-block, so the new four-dimensional weight composed of these sub-blocks can be expressed as (N_w, D_w, H_w, W_w).
The computational parallelism associated with the weights is P_n, P_c, and P_d. The weights are stored in the external DDR4 memory as follows. For the first P_n output channels of the new four-dimensional weight, the first P_c data of each of the first P_d sub-blocks at the first row, first column of each output channel are stored in sequence, followed by the first P_c data of each of the first P_d sub-blocks at the first row, second column, and so on across each row and then on to the next row, until the first P_c data of each of the first P_d sub-blocks at row H_w, column W_w have been stored. Next, the subsequent P_c data of each of the first P_d sub-blocks at the first row, first column of the first P_n output channels are stored in the same way, and so on, until all data of the first P_d sub-blocks of the first P_n output channels have been stored. Finally, the above operations are repeated, advancing first along the depth and then along the output channels, until all weight data has been stored.
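The weight storage order can be modelled in the same way. The sketch below is an interpretation, not the patented layout itself: the function name is hypothetical, the dimensions are assumed to divide evenly by the tile sizes, and the order inside each innermost P_n x P_d x P_c group (output channel, then depth, then input channel) is an assumption that the text does not fix.

```python
import numpy as np

def weight_to_ddr_order(w, p_n, p_c, p_d):
    """Flatten weights w of shape (N_w, C_w, D_w, H_w, W_w) into storage order.

    Order, outermost to innermost: output-channel tile, depth tile,
    input-channel tile, kernel row, kernel column, then the
    P_n x P_d x P_c values used in one parallel step."""
    N, C, D, H, W = w.shape
    # Even divisibility is assumed here to keep the sketch short.
    assert N % p_n == 0 and C % p_c == 0 and D % p_d == 0
    t = w.reshape(N // p_n, p_n, C // p_c, p_c, D // p_d, p_d, H, W)
    # axes: 0 n-tile, 1 n-in, 2 c-tile, 3 c-in, 4 d-tile, 5 d-in, 6 row, 7 col
    ordered = t.transpose(0, 4, 2, 6, 7, 1, 5, 3)
    return ordered.reshape(-1)

# Small example: 2 output channels, 2 input channels, depth 2, 1x2 kernel.
w = np.arange(2 * 2 * 2 * 1 * 2).reshape(2, 2, 2, 1, 2)
flat_w = weight_to_ddr_order(w, p_n=2, p_c=1, p_d=1)
```

In the example, the first two stored values are the weights of output channels 0 and 1 at the same input channel, depth, and kernel position, so one read supplies all P_n parallel lanes.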
Those skilled in the art will readily appreciate that the foregoing is merely a preferred embodiment of the invention and is not intended to limit it; any modifications, equivalents, improvements, or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (5)
1. A hyperspectral image classification device of a configurable hybrid convolutional neural network, characterized in that: the system comprises a processing system end, a programmable logic end and a DDR4 memory; the processing system end integrates a Cortex-A53 ARM four-core processor and is used for executing task control and floating point operation; the programmable logic end integrates a programmable logic array, provides powerful parallel processing capability, realizes parallel processing of a large amount of data, and completes parallel acceleration calculation of the convolutional neural network; the processing system end comprises a data and task scheduling module, an external interface controller module, a PCA module and a Softmax module; the programmable logic end comprises a configuration register module, a convolution operation module, a pooling operation module and a data transmission module.
2. A hyperspectral image classification apparatus as claimed in claim 1 wherein: the data and task scheduling module is used for external data receiving, intermediate data caching, data preprocessing and scheduling control among convolutional neural network layers; the external interface controller module comprises a CameraLink interface controller and a UART interface controller, and respectively realizes data communication with the hyperspectral camera and an external memory card; the PCA module performs data dimension reduction processing by using a principal component analysis method; and the Softmax module applies Softmax processing to the category confidence values, normalizing the confidence values of all categories into probabilities, and finally obtains the category and the category probability value.
3. A hyperspectral image classification apparatus as claimed in claim 1 wherein: the configuration register module is used for receiving instruction information from the processing system end and configuring it into the corresponding registers of the convolution operation module and the pooling operation module; the convolution operation module realizes parallel acceleration of convolution calculation, supports flexible configuration of two-dimensional and three-dimensional convolutional neural network calculation with different convolution kernel sizes, and supports multi-dimensional parallel acceleration over the output feature channel, the convolution kernel channel and the convolution kernel depth; the pooling operation module realizes the operation acceleration of two-dimensional pooling calculation and supports parallel acceleration in the dimension of the input channel; the data transmission module realizes the data transmission among the convolution operation module, the pooling operation module and the DDR4 memory, wherein the convolution operation module and the pooling operation module exchange data with the data transmission module over custom data read-write buses, and the data transmission module and the DDR4 memory use the AXI4 bus.
4. A hyperspectral image classification method of a hyperspectral image classification apparatus of a configurable hybrid convolutional neural network as recited in any one of claims 1-3, comprising the steps of:
step 1, an original hyperspectral image is obtained and marked as X, the dimension of a data cube is marked as MxNxD, wherein M and N are the width and the height of a space dimension, and D is the number of spectrum bands; carrying out data dimension reduction on the spectrum of the original hyperspectral image by adopting a principal component analysis method, wherein the number of spectrum channels of the hyperspectral remote sensing image is reduced from the original D to B, and the dimension of a data cube after dimension reduction is changed into MxNxB;
step 2, cutting the dimension-reduced data cube into k three-dimensional small blocks with the size of S multiplied by B; the three-dimensional patches are then input one by one into a model of the hybrid convolutional neural network.
5. The hyperspectral image classification method as claimed in claim 4, wherein the mixed convolutional neural network is composed of a plurality of three-dimensional convolutional layers, two-dimensional convolutional layers and fully-connected layers in series; the three-dimensional convolution layer is used for extracting three-dimensional features in the image, the two-dimensional convolution layer further extracts two-dimensional features on the basis of the extracted three-dimensional features, and the full-connection layer maps the feature space calculated by all the convolution layers to a sample marking space, so that the robustness of the whole network is improved.
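As an illustration of steps 1 and 2 of claim 4, the preprocessing can be sketched in NumPy as follows. The function names are hypothetical, and several details are assumptions that the claim does not fix: PCA is computed with an SVD of the mean-centered pixel spectra, each patch is taken to cover an S x S spatial window across all B bands, and one patch is cut around every pixel with zero padding at the borders (so k = M * N).

```python
import numpy as np

def pca_reduce(cube, b):
    """Reduce the spectral dimension of an (M, N, D) cube to b components."""
    m, n, d = cube.shape
    flat = cube.reshape(-1, d).astype(np.float64)   # one spectrum per row (copy)
    flat -= flat.mean(axis=0)                       # center each band
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return (flat @ vt[:b].T).reshape(m, n, b)

def extract_patches(cube, s):
    """Cut one (s, s, B) patch around every pixel of an (M, N, B) cube."""
    m, n, b = cube.shape
    r = s // 2
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)))  # zero padding (assumption)
    patches = np.empty((m * n, s, s, b), dtype=cube.dtype)
    for i in range(m):
        for j in range(n):
            patches[i * n + j] = padded[i:i + s, j:j + s]
    return patches

# Synthetic stand-in for a hyperspectral image: M=6, N=5, D=10 bands.
rng = np.random.default_rng(2)
cube = rng.standard_normal((6, 5, 10))
reduced = pca_reduce(cube, b=3)          # shape (6, 5, 3)
patches = extract_patches(reduced, s=3)  # shape (30, 3, 3, 3)
```

Each element of `patches` would then be fed to the hybrid convolutional neural network one at a time, as step 2 describes.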
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310598079.1A CN116630709B (en) | 2023-05-25 | 2023-05-25 | Hyperspectral image classification device and method capable of configuring mixed convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116630709A (en) | 2023-08-22 |
CN116630709B (en) | 2024-01-09 |
Family
ID=87620891
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310598079.1A Active CN116630709B (en) | 2023-05-25 | 2023-05-25 | Hyperspectral image classification device and method capable of configuring mixed convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116630709B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118171049A (en) * | 2024-05-13 | 2024-06-11 | 西南交通大学 | Big data-based battery management method and system for edge calculation |
CN118350982A (en) * | 2024-06-18 | 2024-07-16 | 北京理工大学 | Hyperspectral perception processing system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106228238A (en) * | 2016-07-27 | 2016-12-14 | 中国科学技术大学苏州研究院 | The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform |
CN106940815A (en) * | 2017-02-13 | 2017-07-11 | 西安交通大学 | A kind of programmable convolutional neural networks Crypto Coprocessor IP Core |
CN110414401A (en) * | 2019-07-22 | 2019-11-05 | 杭州电子科技大学 | A kind of intelligent monitor system and monitoring method based on PYNQ |
CN112329545A (en) * | 2020-10-13 | 2021-02-05 | 江苏大学 | ZCU104 platform-based convolutional neural network implementation and processing method for application of convolutional neural network implementation in fruit identification |
CN113362292A (en) * | 2021-05-27 | 2021-09-07 | 重庆邮电大学 | Bone age assessment method and system based on programmable logic gate array |
CN114841985A (en) * | 2022-05-24 | 2022-08-02 | 苏州鑫康成医疗科技有限公司 | High-precision processing and neural network hardware acceleration method based on target detection |
Non-Patent Citations (1)
Title |
---|
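Liu Cuilian et al.: "Hyperspectral image classification method based on a hybrid convolutional neural network", Laser Technology (《激光技术》), pages 355 - 361 * |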
Also Published As
Publication number | Publication date |
---|---|
CN116630709B (en) | 2024-01-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108765247B (en) | Image processing method, device, storage medium and equipment | |
CN116630709B (en) | Hyperspectral image classification device and method capable of configuring mixed convolutional neural network | |
CN111967468B (en) | Implementation method of lightweight target detection neural network based on FPGA | |
US20220012593A1 (en) | Neural network accelerator and neural network acceleration method based on structured pruning and low-bit quantization | |
CN108596248B (en) | Remote sensing image classification method based on improved deep convolutional neural network | |
CN114937151B (en) | Lightweight target detection method based on multiple receptive fields and attention feature pyramid | |
US10394929B2 (en) | Adaptive execution engine for convolution computing systems | |
CN111897579B (en) | Image data processing method, device, computer equipment and storage medium | |
CN107463990A (en) | A kind of FPGA parallel acceleration methods of convolutional neural networks | |
CN113051216B (en) | MobileNet-SSD target detection device and method based on FPGA acceleration | |
CN111210019B (en) | Neural network inference method based on software and hardware cooperative acceleration | |
CN113516236A (en) | VGG16 network parallel acceleration processing method based on ZYNQ platform | |
Xie et al. | High throughput CNN accelerator design based on FPGA | |
Niu et al. | SPEC2: Spectral sparse CNN accelerator on FPGAs | |
Poostchi et al. | Efficient GPU implementation of the integral histogram | |
CN114003201A (en) | Matrix transformation method and device and convolutional neural network accelerator | |
Chinchanikar et al. | Design of binary neural network soft system for pattern detection using HDL tool | |
Zhou et al. | Efficient convolutional neural networks and network compression methods for object detection: A survey | |
Laban et al. | Enhanced pixel based urban area classification of satellite images using convolutional neural network | |
CN110765413B (en) | Matrix summation structure and neural network computing platform | |
CN111428787A (en) | Hyperspectral image parallel classification method based on GPU | |
CN117035028A (en) | FPGA-based convolution accelerator efficient calculation method | |
CN115457363B (en) | Image target detection method and system | |
CN114118415B (en) | Deep learning method of lightweight bottleneck attention mechanism | |
CN115170381A (en) | Visual SLAM acceleration system and method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||