CN116630709B - Hyperspectral image classification device and method capable of configuring mixed convolutional neural network - Google Patents

Publication number
CN116630709B
CN116630709B CN202310598079.1A CN202310598079A CN116630709B CN 116630709 B CN116630709 B CN 116630709B CN 202310598079 A CN202310598079 A CN 202310598079A CN 116630709 B CN116630709 B CN 116630709B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310598079.1A
Other languages
Chinese (zh)
Other versions
CN116630709A (en)
Inventor
贺文静
杨岳松
胡坚
陈玖英
王宁
汪琪
徐婉秋
吴昊昊
黎荆梅
欧阳光洲
李子扬
李传荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Application filed by Aerospace Information Research Institute of CAS
Priority to CN202310598079.1A
Publication of CN116630709A
Application granted
Publication of CN116630709B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/58 Extraction of image or video features relating to hyperspectral data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/94 Hardware or software architectures specially adapted for image or video understanding
    • G06V10/955 Hardware or software architectures specially adapted for image or video understanding using specific electronic processors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/194 Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Remote Sensing (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a hyperspectral image classification device and method based on a configurable hybrid convolutional neural network. A 2D-3D hybrid convolutional neural network fully extracts feature information in the spatial and spectral dimensions and effectively reduces the scale and complexity of the model while maintaining classification accuracy. A software and hardware co-design is developed on a ZYNQ heterogeneous multi-core processor: inference of the hybrid convolutional neural network model is accelerated with programmable logic resources, and both the parallel dimensions and the convolution kernel are flexible and configurable. The method supports rapid deployment and inference acceleration of 2D and 3D convolutional neural networks on FPGA processors, is applicable to different FPGA processors and convolutional neural network models, and supports low-power, high-speed hyperspectral image classification applications.

Description

Hyperspectral image classification device and method capable of configuring mixed convolutional neural network
Technical Field
The invention relates to the technical field of real-time remote sensing data processing, and in particular to a hyperspectral image classification device and method based on a configurable hybrid convolutional neural network.
Background
Hyperspectral remote sensing uses an imaging spectrometer to image the spatial extent of a ground target while acquiring tens to hundreds of contiguous narrow bands at nanometer-scale spectral resolution. The resulting hyperspectral remote sensing images contain rich spatial and spectral information and play an important role in precision agriculture, geological survey and other fields. A hyperspectral image contains information that characterizes different ground objects, and how to use this information more effectively to classify unknown regions has long been a research hotspot.
In recent years, deep learning models have been introduced into hyperspectral image classification, such as stacked autoencoders (SAE), deep belief networks (DBN) and convolutional neural networks (CNN). Classification methods based on convolutional neural networks are the most popular; experiments show that the feature extraction and learning capability of convolutional neural networks is superior to that of traditional feature extraction methods, which greatly improves classification accuracy.
However, how to fully extract the rich spatial and spectral information in a hyperspectral image with a convolutional neural network is a key problem that determines classification accuracy. Wei L et al. extracted the spectral features of hyperspectral images with a one-dimensional convolutional neural network (1D-CNN). The network model is simple, but it can only extract spectral vectors and ignores spatial features. Because hyperspectral images exhibit the phenomena of "same object, different spectra" and "different objects, same spectrum", classification based on spectral features alone rarely achieves good results. Li S et al. used PCA to reduce the dimensionality of hyperspectral image data and then used a two-dimensional convolutional neural network (2D-CNN) to extract spatial information in the neighborhood of the input pixel; however, this model can only extract spatial feature information and cannot capture the relationships between spectral channels. Researchers further proposed that classification accuracy can be improved by extracting joint spatial-spectral features with a three-dimensional convolutional neural network (3D-CNN). However, a 3D-CNN model requires deep convolution layers to extract the joint spatial-spectral features effectively, which significantly increases model complexity and the number of training samples required. In 2020, Swalpa Kumar Roy et al. proposed the HybridSN model, which combines 3D-CNN and 2D-CNN to fully extract spatial and spectral feature information while reducing computational complexity compared with hyperspectral image classification using 3D-CNN alone.
It should be noted that, to give the model stronger spatial feature extraction capability, the HybridSN model includes as many as three 3D convolution layers for feature extraction, and stacking multiple 3D convolution layers to extract deep features significantly increases the complexity and computational cost of the model. On the other hand, the inherent advantage of hyperspectral images, namely spectral information in hundreds of bands along the spectral dimension, also means that such high-density data poses a significant challenge to both computation and buffer throughput during processing.
In summary, the huge data volume of hyperspectral images and the complex computational requirements of deep learning pose a double challenge to the development of real-time hyperspectral image processing devices, especially those based on FPGA processors, whose resources are more limited. Meanwhile, to reduce the dependence of FPGA design on the designer's expertise, a flexible and configurable processing architecture that can adapt to different network models would greatly speed up development, and is also an important problem worthy of attention.
Introducing deep learning into hyperspectral image classification is a relatively cutting-edge research field, and existing patents concentrate on the deep learning model itself. For example, Chinese patent application CN202210279207.1 (a hyperspectral classification method of a lightweight hybrid convolution model based on global reasoning) adds a global reasoning module that effectively extracts global and deep feature information of the hyperspectral image by reasoning about the contextual relationships among different regions. That patent application is limited to the hyperspectral classification method and cannot guide the design of a real-time processing device based on an embedded processor. Chinese patent application CN201910766635.5 (hyperspectral image classification method based on FPGA) designs a depth edge filter to classify hyperspectral images and uses a field programmable gate array (FPGA) with the OpenCL heterogeneous computing framework for the device design.
Disclosure of Invention
To solve the above technical problems, the invention provides a hyperspectral image classification device and method based on a configurable hybrid convolutional neural network. It adopts a real-time hyperspectral image classification system based on a ZYNQ heterogeneous multi-core processor, uses the hybrid convolutional neural network to fully extract feature information in the spatial and spectral dimensions, and effectively reduces the scale and complexity of the model, thereby supporting deployment and fast operation on an embedded processor and supporting low-power, high-speed hyperspectral image processing applications.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
A hyperspectral image classification device based on a configurable hybrid convolutional neural network comprises a processing system end, a programmable logic end and a DDR4 memory. The processing system end integrates a quad-core ARM Cortex-A53 processor and is suited to task control and floating-point operations; the programmable logic end integrates a programmable logic array, provides strong parallel processing capability for large amounts of data, and performs the parallel accelerated computation of the convolutional neural network. The processing system end comprises a data and task scheduling module, an external interface controller module, a PCA module and a Softmax module; the programmable logic end comprises a configuration register module, a convolution operation module, a pooling operation module and a data transmission module.
Further, the data and task scheduling module is used for external data reception, intermediate data caching, data preprocessing and scheduling control among the convolutional neural network layers; the external interface controller module comprises a CameraLink interface controller and a UART interface controller, which respectively implement data communication with the hyperspectral camera and an external memory card; the PCA module performs data dimension reduction using principal component analysis; and the Softmax module applies softmax to the class confidence values, normalizing the confidence values of all classes into probabilities, and finally obtains the class and its probability value.
Further, the configuration register module receives instruction information from the processing system end and writes it to the corresponding registers of the convolution operation module and the pooling operation module. The convolution operation module implements parallel acceleration of convolution, supports flexible configuration of two-dimensional and three-dimensional convolutional neural network computation with different convolution kernel sizes, and supports multi-dimensional parallel acceleration across the output feature channel, convolution kernel channel and convolution kernel depth dimensions. The pooling operation module accelerates two-dimensional pooling and supports parallel acceleration along the input channel dimension. The data transmission module transfers data between the convolution and pooling operation modules and the DDR4 memory: a custom data read-write bus is used between the convolution and pooling operation modules and the data transmission module, and an AXI4 bus is used between the data transmission module and the DDR4 memory.
The invention also provides a hyperspectral image classification method of the configurable mixed convolution neural network, which comprises the following steps:
Step 1: an original hyperspectral image is acquired and denoted X, and the dimensions of its data cube are denoted M×N×D, where M and N are the width and height of the spatial dimensions and D is the number of spectral bands. Principal component analysis is applied along the spectral dimension of the original hyperspectral image to reduce the number of spectral channels from D to B, so that the dimensions of the reduced data cube become M×N×B;
Step 2: the dimension-reduced data cube is cut into k three-dimensional patches of size S×S×B, and the patches are then input one by one into the hybrid convolutional neural network model.
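The patch-cutting step can be illustrated with a minimal Python sketch. It assumes S×S spatial windows that keep all B bands, an odd S, and zero padding at the image border so that one patch is produced per pixel (k = M·N); the source does not state how borders are handled, and the function name `cut_patches` is hypothetical.

```python
def cut_patches(cube, S):
    """Cut a data cube into one S x S x B patch centred on every pixel.

    cube[i][j] is the list of B band values at spatial position (i, j).
    Assumption (not stated in the source): S is odd and the border is
    zero-padded by S // 2, so the number of patches equals the pixel count.
    """
    if S % 2 == 0:
        raise ValueError("this sketch assumes an odd patch size S")
    rows, cols, bands = len(cube), len(cube[0]), len(cube[0][0])
    r = S // 2
    zero = [0.0] * bands

    def pixel(i, j):
        # Positions outside the image read as zero spectra (padding).
        return cube[i][j] if 0 <= i < rows and 0 <= j < cols else zero

    patches = []
    for i in range(rows):
        for j in range(cols):
            patch = [[pixel(i + di, j + dj) for dj in range(-r, r + 1)]
                     for di in range(-r, r + 1)]
            patches.append(patch)
    return patches
```

Each returned patch is a nested list of shape S×S×B whose centre element is the spectrum of the pixel being classified.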
Further, the hybrid convolutional neural network is formed by connecting several three-dimensional convolution layers, two-dimensional convolution layers and fully-connected layers in series. The three-dimensional convolution layers extract three-dimensional features from the image, the two-dimensional convolution layers extract further two-dimensional features from the extracted three-dimensional features, and the fully-connected layers map the feature space computed by all the convolution layers to the sample label space, which improves the robustness of the whole network.
The beneficial effects are that:
1. Based on the ZYNQ heterogeneous multi-core processor, the invention first uses the ARM cores to perform PCA dimension reduction on the hyperspectral image cube and then uses programmable logic resources to accelerate the convolutional neural network in parallel. This makes full use of the respective advantages of software and hardware, improves the energy efficiency ratio, and meets the application requirements of real-time hyperspectral image classification.
2. The convolutional neural network architecture designed by the invention supports 2D-CNN and 3D-CNN calculation at the same time, has flexible and adjustable parallel dimension, can flexibly adapt to different convolutional neural network structures, and can greatly accelerate development progress from algorithm to embedded design.
3. According to the invention, the hyperspectral image classification is performed by adopting the 2D CNN+3D CNN mixed convolutional neural network, and the complexity of the model is effectively controlled while the space characteristic information and the spectrum characteristic information are fully extracted, so that the real-time and rapid processing of the hyperspectral image classification on an embedded platform is possible.
Drawings
Fig. 1 is a schematic structural diagram of a hyperspectral image classification device based on a configurable hybrid convolutional neural network.
Fig. 2 shows the loop-optimization pseudocode of the three-dimensional convolution operation of the present invention.
Fig. 3 is a schematic diagram of data loading with weight reuse according to the present invention.
Fig. 4 is a schematic diagram of the four-dimensional feature rearrangement of the present invention.
Fig. 5 is a schematic diagram of the five-dimensional weight rearrangement of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
The invention provides a hyperspectral image classification device and a hyperspectral image classification method based on a configurable mixed convolutional neural network, which adopt software and hardware collaborative design. The invention will be further illustrated with reference to specific examples.
In an exemplary embodiment of the invention, a real-time hyperspectral image classification system is built on a ZYNQ MPSoC heterogeneous multi-core processor, and the HybridSN hybrid convolutional neural network model is used to classify hyperspectral images in real time with good classification results. Notably, the invention uses the abundant programmable logic resources in the ZYNQ MPSoC processor to accelerate 2D CNN and 3D CNN computation in parallel. Because programmable logic is flexible to design with, the device is applicable to different hybrid convolutional neural network models, and a new design can be produced flexibly and quickly through parameter configuration.
The structure of the HybridSN network model is shown in Table 1 and comprises 3 three-dimensional convolution layers, 1 two-dimensional convolution layer and 2 fully-connected layers. Combining 3D CNN and 2D CNN fully extracts spatial and spectral feature information while reducing network complexity.
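To make the layer sequence concrete, the following Python sketch traces feature-map shapes through three 3D convolutions and one 2D convolution. The kernel counts and sizes are assumptions taken from the published HybridSN paper, not from Table 1 of this document, and the sketch assumes stride-1 convolutions without padding and a 25×25×30 input patch; `trace_shapes` is a hypothetical helper.

```python
def conv_out(size, kernel):
    """Output length of a stride-1, unpadded convolution along one axis."""
    return size - kernel + 1

# Assumed layer parameters (from the published HybridSN paper, NOT from
# Table 1 of this document): (number of kernels, (height, width, depth)).
CONV3D = [(8, (3, 3, 7)), (16, (3, 3, 5)), (32, (3, 3, 3))]
CONV2D = (64, (3, 3))

def trace_shapes(h, w, d):
    """Return the feature shape after each stage for an h x w x d input patch."""
    shapes = []
    channels = 1
    for n, (kh, kw, kd) in CONV3D:
        h, w, d = conv_out(h, kh), conv_out(w, kw), conv_out(d, kd)
        channels = n
        shapes.append((h, w, d, channels))
    # Fold the spectral depth into the channel axis before the 2D convolution.
    shapes.append((h, w, d * channels))
    n, (kh, kw) = CONV2D
    h, w = conv_out(h, kh), conv_out(w, kw)
    shapes.append((h, w, n))
    # Flatten for the fully-connected layers.
    shapes.append((h * w * n,))
    return shapes

for shape in trace_shapes(25, 25, 30):
    print(shape)
```

Under these assumptions the depth axis shrinks from 30 to 18 across the three 3D layers before being folded into channels, which is why only a single 2D layer is needed afterwards.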
After PCA dimension reduction of the original hyperspectral image data, 30 bands are retained; the data cube is then cut into three-dimensional patches of size 25×25×30 (width 25, height 25, depth 30). The three-dimensional patches are used as the input of the convolutional neural network; the convolution kernel size, cache and computation characteristics of each layer in the network are shown in Table 1.
TABLE 1
Further, the HybridSN hybrid convolutional neural network model is deployed on the ZYNQ MPSoC processor to realize real-time hyperspectral image classification. In an exemplary embodiment of the present invention, the hyperspectral image classification device based on the configurable hybrid convolutional neural network consists mainly of a Xilinx ZYNQ UltraScale+ MPSoC chip and a DDR4 memory. The internal resources of the MPSoC chip are divided into a processing system (PS) end and a programmable logic (PL) end. The PS end integrates a quad-core ARM Cortex-A53 processor and is suited to task control, floating-point operations and the like; the PL end integrates a programmable logic array and provides strong parallel processing capability for large amounts of data.
The invention partitions the design between software and hardware according to the characteristics and complexity of the algorithm and the structure of the ZYNQ chip: the PS end handles data and task scheduling, external interface control, PCA, softmax and other data processing, and the PL end performs the parallel accelerated computation of the convolutional neural network.
As shown in fig. 1, the PS end of the hyperspectral image classification device of the configurable hybrid convolutional neural network of the present invention mainly includes:
(1) Data and task scheduling module: responsible for external data reception, intermediate data caching, data preprocessing, scheduling control among CNN layers, and so on.
(2) External interface controller module: mainly comprises a CameraLink interface controller, a UART controller and a DDR4 controller, which respectively implement data communication with the hyperspectral camera, the external memory card and the DDR4 memory.
(3) PCA module: performs data dimension reduction using the PCA method.
(4) Softmax module: applies softmax to the class confidence values, normalizing the confidence values of all classes into probabilities, and finally obtains the class and its probability value.
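The Softmax module's behaviour can be sketched in a few lines of Python. This is an illustrative software version only; the function names are hypothetical, and subtracting the maximum score before exponentiation is a standard numerical-stability measure that the source does not mention.

```python
import math

def softmax(scores):
    """Normalize raw class confidence values into probabilities."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Return (class index, class probability) for the most likely class."""
    probs = softmax(scores)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]
```

For example, `classify([1.0, 3.0, 2.0])` selects class index 1, the class with the largest confidence value.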
The PL end of the hyperspectral image classification device of the configurable mixed convolution neural network mainly comprises:
(1) Configuration register module: receives instruction information from the PS end and writes it to the corresponding registers of the operation modules.
(2) Convolution operation module: implements parallel acceleration of convolution, supports flexible configuration of 2D CNN and 3D CNN computation with different convolution kernel sizes, and supports multi-dimensional parallel acceleration over the output feature channel, the convolution kernel depth and so on. The module consists of 6 sub-modules: data reading, weight buffer, feature buffer, data loading, convolution operation and data writing.
(3) Pooling operation module: accelerates two-dimensional pooling and supports parallel acceleration along the input channel dimension. The module consists of 4 sub-modules: data reading, width-direction pooling, height-direction pooling and data writing.
(4) Data transmission module: transfers data between the operation modules and the DDR4 memory. A custom data read-write bus is used between the operation modules and the data transmission module, and an AXI4 bus is used between the data transmission module and the external DDR4 memory.
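The split of the pooling operation module into a width-direction stage followed by a height-direction stage can be illustrated in Python for a single channel. The sketch assumes max pooling with a stride equal to the window size; the source does not state the pooling type or stride, and the function names are hypothetical.

```python
def pool_width(fmap, k):
    """Max over non-overlapping windows of k columns within every row."""
    return [[max(row[j:j + k]) for j in range(0, len(row) - k + 1, k)]
            for row in fmap]

def pool_height(fmap, k):
    """Max over non-overlapping windows of k rows within every column."""
    cols = len(fmap[0])
    return [[max(fmap[i + t][j] for t in range(k)) for j in range(cols)]
            for i in range(0, len(fmap) - k + 1, k)]

def max_pool2d(fmap, k):
    """Two-dimensional max pooling as two one-dimensional passes.

    Assumption: max pooling with stride k. Taking the maximum along the
    width first and then along the height gives the same result as a
    direct k x k window, but each stage only compares k values.
    """
    return pool_height(pool_width(fmap, k), k)
```

Because each channel is pooled independently, several channels can be processed by identical copies of these two stages at once, which matches the parallelism along the input channel dimension described above.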
About 90% of the computation in the neural network is concentrated in convolution operations, which consist mainly of multiple nested loops over a large amount of repeatedly used data. One focus of the acceleration design in the present invention is therefore the optimization of these nested loops.
Fig. 2 shows the nested loop structure of the three-dimensional convolution operation. The direct form of the three-dimensional convolution comprises 8 nested loops, and the invention is described below through the loop optimization method applied to the three-dimensional convolution operation.
As shown in FIG. 2, the invention applies loop tiling to three dimensions: the output feature channel, the convolution kernel channel and the convolution kernel depth. The tiles are processed in parallel, the tile sizes of the three dimensions are denoted P_n, P_c and P_d respectively, and the parallelism can be flexibly configured according to the specific network structure and hardware resources. Loop unrolling is performed inside each tile, and the unrolled loops are mapped onto several groups of parallel multiply-add arrays. In addition, since the height and width of the output feature are equivalent, the invention merges the loops over these two dimensions and tiles the merged loop. Compared with the structure before optimization, the optimized nested loop structure computes in parallel over the output feature channel, convolution kernel channel and convolution kernel depth dimensions and pipelines the other dimensions, which greatly simplifies the subsequent design of the convolution operation module and supports flexible configuration of parallel dimensions and convolution types.
Notably, the nested loop structure also contains the two loops over convolution kernel height and convolution kernel width, so that, with the support of the data loading sub-module in the convolution operation module, different convolution kernel sizes can be configured flexibly.
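The loop restructuring described above can be sketched in Python. This is a software model under stated assumptions, not the pseudocode of Fig. 2: it assumes stride 1 and no padding, the function names are hypothetical, and the three innermost loops of the tiled version stand in for the multiply-add arrays that the hardware would evaluate in parallel.

```python
def dims(x, wt):
    """x[c][d][h][w] is the input, wt[n][c][kd][kh][kw] the kernels."""
    N, C = len(wt), len(wt[0])
    KD, KH, KW = len(wt[0][0]), len(wt[0][0][0]), len(wt[0][0][0][0])
    OD = len(x[0]) - KD + 1
    OH = len(x[0][0]) - KH + 1
    OW = len(x[0][0][0]) - KW + 1
    return N, C, KD, KH, KW, OD, OH, OW

def zeros(N, OD, OH, OW):
    return [[[[0] * OW for _ in range(OH)] for _ in range(OD)] for _ in range(N)]

def conv3d_direct(x, wt):
    """Direct form: 8 nested loops."""
    N, C, KD, KH, KW, OD, OH, OW = dims(x, wt)
    y = zeros(N, OD, OH, OW)
    for n in range(N):
        for c in range(C):
            for od in range(OD):
                for oh in range(OH):
                    for ow in range(OW):
                        for kd in range(KD):
                            for kh in range(KH):
                                for kw in range(KW):
                                    y[n][od][oh][ow] += (x[c][od + kd][oh + kh][ow + kw]
                                                         * wt[n][c][kd][kh][kw])
    return y

def conv3d_tiled(x, wt, p_n, p_c, p_d):
    """Tiled form: blocks of size p_n, p_c, p_d; output height/width merged."""
    N, C, KD, KH, KW, OD, OH, OW = dims(x, wt)
    y = zeros(N, OD, OH, OW)
    for n0 in range(0, N, p_n):              # tile over output channels
        for c0 in range(0, C, p_c):          # tile over kernel channels
            for kd0 in range(0, KD, p_d):    # tile over kernel depth
                for od in range(OD):
                    for hw in range(OH * OW):        # merged height*width loop
                        oh, ow = divmod(hw, OW)
                        for kh in range(KH):
                            for kw in range(KW):
                                # In hardware these three loops would be
                                # unrolled into p_n * p_c * p_d parallel MACs.
                                for n in range(n0, min(n0 + p_n, N)):
                                    for c in range(c0, min(c0 + p_c, C)):
                                        for kd in range(kd0, min(kd0 + p_d, KD)):
                                            y[n][od][oh][ow] += (x[c][od + kd][oh + kh][ow + kw]
                                                                 * wt[n][c][kd][kh][kw])
    return y
```

Both functions accumulate the same set of products, only in a different order, so for integer data they return identical results for any tile sizes. Setting the kernel depth to 1 reduces the same loop nest to a 2D convolution.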
In practical applications, the invention supports two configuration strategies for processor platforms with different resource budgets. On a platform with ample computing resources (mainly DSP resources), the PL end can instantiate several convolution operation modules, each supporting a specific convolution type and kernel size. On a platform with limited computing resources but relatively ample storage resources, the PL end instantiates only one convolution operation module; the convolution calculation is configured through the configuration register module, and the data loading sub-module delivers the matching feature data and weight data, so that convolutions of various types and kernel sizes are computed on the same module.
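For the second strategy, the per-layer settings that the PS end would hand to a single shared convolution module can be pictured as a small configuration record. The following Python sketch is purely illustrative: the class `ConvConfig`, its fields and the method `tile_passes` are hypothetical and do not describe the actual register layout, which the source does not give.

```python
from dataclasses import dataclass
from math import ceil

@dataclass(frozen=True)
class ConvConfig:
    """Hypothetical per-layer settings written before a layer is run."""
    is_3d: bool
    out_channels: int
    in_channels: int
    kernel: tuple  # (depth, height, width); depth is 1 for a 2D convolution
    p_n: int       # parallelism over output feature channels
    p_c: int       # parallelism over kernel channels
    p_d: int       # parallelism over kernel depth

    def __post_init__(self):
        if not self.is_3d and self.kernel[0] != 1:
            raise ValueError("a 2D convolution must use kernel depth 1")

    def tile_passes(self):
        """Number of (output channel, kernel channel, kernel depth) tiles."""
        return (ceil(self.out_channels / self.p_n)
                * ceil(self.in_channels / self.p_c)
                * ceil(self.kernel[0] / self.p_d))

# One shared module reconfigured layer by layer (example values are made up).
layers = [
    ConvConfig(True, 16, 8, (5, 3, 3), p_n=8, p_c=8, p_d=4),
    ConvConfig(False, 64, 576, (1, 3, 3), p_n=8, p_c=8, p_d=4),
]
for cfg in layers:
    print(cfg.is_3d, cfg.kernel, cfg.tile_passes())
```

The tile count shows the cost of a poor match between layer shape and parallelism: a 2D layer with kernel depth 1 leaves most of a depth parallelism of 4 unused, which is one reason the parallel dimensions are made configurable.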
Besides the calculation capability, the data transmission bandwidth is considered in the design of the hyperspectral image classification real-time processing system, the cache is designed reasonably as much as possible, the time consumption of data transmission is reduced, and therefore the overall processing performance of the system is improved. In the convolution operation, a large amount of data participates in the operation on a plurality of cycles, and part of the data is reused. Therefore, by utilizing the characteristic of convolution calculation, the data multiplexing mode is reasonably designed, the access bandwidth requirement is reduced, and the method is also greatly helpful for improving the acceleration performance. According to the method, the characteristic of the hyperspectral image classification model is combined, and the repeated reading and writing of weight data are avoided by adopting a weight multiplexing mode. As shown in fig. 3, the feature and weight loading flow based on weight multiplexing in the three-dimensional convolution operation is illustrated. In the convolution calculation process, one weight data and a plurality of characteristic data corresponding to the weight data are loaded each time, and multiplication and addition calculation is completed one by one.
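The weight reuse idea can be shown with a small Python model. For brevity the sketch uses a single-channel 2D convolution rather than the 3D case of Fig. 3, assumes stride 1 and no padding, and counts weight loads in software; the function name is hypothetical.

```python
def conv2d_weight_reuse(x, k):
    """Single-channel 2D convolution with the weight loops outermost.

    Each weight is loaded once and then applied to every feature value it
    is paired with, so the number of weight loads equals the kernel size
    instead of kernel size times output size.
    """
    KH, KW = len(k), len(k[0])
    OH, OW = len(x) - KH + 1, len(x[0]) - KW + 1
    y = [[0] * OW for _ in range(OH)]
    weight_loads = 0
    for kh in range(KH):
        for kw in range(KW):
            w = k[kh][kw]          # load one weight ...
            weight_loads += 1
            for oh in range(OH):   # ... and reuse it for all its features
                for ow in range(OW):
                    y[oh][ow] += w * x[oh + kh][ow + kw]
    return y, weight_loads
```

With a 3×3 input and a 2×2 kernel the function performs 4 weight loads, whereas a loop nest that iterates over outputs first would fetch a weight for each of the 16 multiply-add operations.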
To match the parallel computing method and the weight reuse mode, the invention designs a feature data storage scheme and a weight data storage scheme adapted to them.
Fig. 4 shows the storage scheme for feature data. In the three-dimensional convolution operation, the input feature data is four-dimensional and can be expressed as (channels of the feature, depth of the feature, height of the feature, width of the feature), abbreviated as (C_d, D_d, H_d, W_d). The original four-dimensional feature data is rearranged into three dimensions: the first-depth-layer data of all channels is arranged in sequence, then the second-depth-layer data, and so on up to the last depth layer, and the results are concatenated along the depth direction to form new three-dimensional feature data. The block size of the input feature channel is P_c. In the new three-dimensional feature data, every P_c values at a single position (width 1, height 1) are treated as one sub-block, so the new three-dimensional feature formed by these sub-blocks can be expressed as (D_d*ceil(C_d/P_c), H_d, W_d). The feature data is stored in the external DDR4 memory as follows: the data in the first sub-block at row 1, column 1 of the new three-dimensional feature is stored first, then the data in the first sub-block at row 1, column 2, and so on, advancing through columns first and then rows, until the data in the first sub-block at row H_d, column W_d has been stored; then the data in the second sub-block at row 1, column 1 is stored, and so on until all feature data has been stored.
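The feature storage order can be sketched as a NumPy rearrangement plus an address formula. The function names `pack_features` and `feature_address` are hypothetical, and zero-padding the channel dimension up to a multiple of P_c is an assumption; the text only implies partial blocks through ceil(C_d/P_c) and does not say how they are filled.

```python
import numpy as np

def pack_features(feat, p_c):
    """Flatten (C, D, H, W) features into the sub-block storage order.

    Order, outermost to innermost: sub-block index (depth layer, then
    channel block), row, column, and the P_c values inside the sub-block.
    """
    C, D, H, W = feat.shape
    n_blk = -(-C // p_c)  # ceil(C / p_c)
    # Assumption: pad missing channels of the last block with zeros.
    padded = np.zeros((n_blk * p_c, D, H, W), dtype=feat.dtype)
    padded[:C] = feat
    # (blk, p_c, D, H, W) -> (D, blk, H, W, p_c)
    blocks = padded.reshape(n_blk, p_c, D, H, W).transpose(2, 0, 3, 4, 1)
    # Sub-block index s = d * n_blk + blk, giving D * ceil(C / p_c) sub-blocks.
    return blocks.reshape(D * n_blk, H, W, p_c).ravel()

def feature_address(c, d, h, w, shape, p_c):
    """Linear offset of element (c, d, h, w) in the packed buffer."""
    C, D, H, W = shape
    n_blk = -(-C // p_c)
    sub_block = d * n_blk + c // p_c
    return ((sub_block * H + h) * W + w) * p_c + c % p_c
```

With this layout, the P_c channel values consumed together by the parallel multiply-add array are contiguous in memory, so one burst read can fetch a whole sub-block.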
Fig. 5 shows the storage scheme for weight data. In the three-dimensional convolution operation, the input weight is five-dimensional and can be expressed as (output channels of the convolution kernel, input channels of the convolution kernel, depth of the convolution kernel, height of the convolution kernel, width of the convolution kernel), abbreviated as (N_w, C_w, D_w, H_w, W_w). The original five-dimensional weight data is rearranged into four dimensions: on each output channel, the first-depth-layer data of all input channels is arranged in sequence, then the second-depth-layer data, and so on up to the last depth layer, and the data on each output channel is then concatenated along the depth direction to form new four-dimensional weight data. On each output channel, every C_w values at a single position (width 1, height 1) are treated as one sub-block, so the new four-dimensional weight formed by these sub-blocks can be expressed as (N_w, D_w, H_w, W_w).
The computational parallelism associated with the weights is P_n, P_c and P_d. The weights are stored in the external DDR4 memory as follows. For the first P_n output channels of the new four-dimensional weight, the first P_c values of each of the first P_d sub-blocks at row 1, column 1 of each output channel are stored in sequence; then the first P_c values of each of the first P_d sub-blocks at row 1, column 2 of each output channel; and so on, advancing through columns first and then rows, until the first P_c values of each of the first P_d sub-blocks at row H_w, column W_w of each output channel have been stored. Next, for the same first P_n output channels, the following P_c values of each of the first P_d sub-blocks at row 1, column 1 are stored, and so on, until all data in the first P_d sub-blocks of the first P_n output channels has been stored. Finally, the above operations are repeated, advancing through depth first and then output channels, until all weight data has been stored.
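One reading of this weight storage order can be sketched as a single reshape and transpose. The function name `pack_weights` is hypothetical, and two points are assumptions that the text does not settle: the dimensions are taken to be exact multiples of P_n, P_c and P_d, and inside one (row, column) position the values are ordered by output channel, then depth sub-block, then input channel.

```python
import numpy as np

def pack_weights(weight, p_n, p_c, p_d):
    """Flatten (N, C, Kd, Kh, Kw) weights into the blocked storage order.

    Loop order, outermost to innermost:
      output-channel block, depth block, input-channel block,
      kernel row, kernel column,
      then P_n output channels x P_d depth sub-blocks x P_c values.
    """
    N, C, Kd, Kh, Kw = weight.shape
    # Assumption: no partial blocks.
    assert N % p_n == 0 and C % p_c == 0 and Kd % p_d == 0
    w = weight.reshape(N // p_n, p_n, C // p_c, p_c, Kd // p_d, p_d, Kh, Kw)
    # Axes: 0 n_blk, 1 n_in, 2 c_blk, 3 c_in, 4 d_blk, 5 d_in, 6 row, 7 col
    w = w.transpose(0, 4, 2, 6, 7, 1, 5, 3)
    return w.ravel()
```

Each group of P_n * P_d * P_c consecutive values is then exactly the set of weights that the three-way parallel multiply-add arrays consume for one kernel position.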
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention; any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (2)

1. A hyperspectral image classification device of a configurable hybrid convolutional neural network, characterized in that: the device comprises a processing system end, a programmable logic end and a DDR4 memory; the processing system end integrates a Cortex-A53 ARM four-core processor and is used for executing task control and floating point operation; the programmable logic end integrates a programmable logic array, provides powerful parallel processing capability, realizes parallel processing of a large amount of data, and completes parallel accelerated calculation of the convolutional neural network; the processing system end comprises a data and task scheduling module, an external interface controller module, a PCA module and a Softmax module; the programmable logic end comprises a configuration register module, a convolution operation module, a pooling operation module and a data transmission module;
the data and task scheduling module is used for external data receiving, intermediate data caching, data preprocessing and scheduling control among convolutional neural network layers; the external interface controller module comprises a CameraLink interface controller and a UART interface controller, which respectively realize data communication with the hyperspectral camera and with an external memory card; the PCA module performs data dimension reduction using principal component analysis; the Softmax module applies Softmax processing to the category confidence values, normalizing the confidence values of all categories into probabilities, and finally obtains the category and the category probability value;
the configuration register module is used for receiving instruction information from the processing system end and writing it to the corresponding registers of the convolution operation module and the pooling operation module; the convolution operation module realizes parallel acceleration of convolution calculation, supports flexible configuration of two-dimensional and three-dimensional convolutional neural network calculation with different convolution kernel sizes, and supports multi-dimensional parallel acceleration over the output feature channel, the convolution kernel channel and the convolution kernel depth;
loop blocking is performed in the three dimensions of output feature channel, convolution kernel channel and convolution kernel depth; the loop blocks are executed in parallel, and the block sizes in the three dimensions are denoted P_n, P_c and P_d respectively; the parallelism is flexibly configured according to the specific network structure and hardware resources; meanwhile, loop unrolling is performed within each loop block, and the unrolled loops are mapped onto multiple groups of parallel multiply-add arrays; the multi-level loop structure computes in parallel in the three dimensions of output feature channel, convolution kernel channel and convolution kernel depth, and is pipelined in the other dimensions;
taking the characteristics of the hyperspectral image classification model into account, weight reuse (weight multiplexing) is adopted to avoid repeated reading and writing of weight data; during the convolution calculation, each step loads one weight value and the several feature values corresponding to it, and the multiply-add calculations are completed one by one; to match the parallel computing method and the weight reuse mode, a feature data storage scheme and a weight data storage scheme adapted to them are designed;
the pooling operation module realizes accelerated two-dimensional pooling calculation and supports parallel acceleration in the input channel dimension; the data transmission module realizes data transmission among the convolution operation module, the pooling operation module and the DDR4 memory; the convolution operation module, the pooling operation module and the data transmission module exchange data over a custom data read-write bus, and the data transmission module and the DDR4 memory exchange data over an AXI4 bus.
2. A hyperspectral image classification method using the hyperspectral image classification device of a configurable hybrid convolutional neural network as recited in claim 1, comprising the steps of:
step 1, an original hyperspectral image is obtained and denoted X, and the dimensions of its data cube are denoted M×N×D, where M and N are the width and height of the spatial dimensions and D is the number of spectral bands; principal component analysis is applied to the spectral dimension of the original hyperspectral image for data dimension reduction, so that the number of spectral channels of the hyperspectral remote sensing image is reduced from the original D to B, and the dimensions of the data cube after dimension reduction become M×N×B;
step 2, the dimension-reduced data cube is cut into k three-dimensional patches of size S×B; the three-dimensional patches are then input one by one into the model of the hybrid convolutional neural network;
the hybrid convolutional neural network is formed by connecting several three-dimensional convolution layers, two-dimensional convolution layers and fully connected layers in series; the three-dimensional convolution layers extract three-dimensional features from the image, the two-dimensional convolution layers further extract two-dimensional features on the basis of the extracted three-dimensional features, and the fully connected layers map the feature space computed by all the convolution layers to the sample label space, thereby improving the robustness of the whole network.
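The preprocessing steps of claim 2 and the Softmax stage of claim 1 can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the function names are hypothetical, PCA is computed here through an eigendecomposition of the band covariance matrix, and the patches are assumed to be S×S×B windows centered on each pixel with zero padding at the borders (so k = M×N), which the claim text does not spell out.

```python
import numpy as np

def pca_reduce(cube, b):
    """Reduce an (M, N, D) cube to (M, N, b) along the spectral axis."""
    m, n, d = cube.shape
    flat = cube.reshape(-1, d).astype(np.float64)  # astype returns a copy
    flat -= flat.mean(axis=0)                      # center each band
    cov = flat.T @ flat / (flat.shape[0] - 1)      # (D, D) band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues ascending
    top = eigvecs[:, ::-1][:, :b]                  # b leading components
    return (flat @ top).reshape(m, n, b)

def extract_patches(cube, s):
    """Cut an (M, N, B) cube into M*N patches of shape (s, s, B).

    Assumption: s is odd and borders are zero-padded.
    """
    assert s % 2 == 1
    m, n, b = cube.shape
    r = s // 2
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)))
    patches = np.empty((m * n, s, s, b), dtype=cube.dtype)
    for i in range(m):
        for j in range(n):
            patches[i * n + j] = padded[i:i + s, j:j + s, :]
    return patches

def softmax(scores):
    """Normalize class confidence values into probabilities."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()
```

In use, each patch returned by `extract_patches` would be passed through the hybrid network one at a time, and `softmax` would be applied to the resulting class confidence vector to obtain the predicted class and its probability.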
CN202310598079.1A 2023-05-25 2023-05-25 Hyperspectral image classification device and method capable of configuring mixed convolutional neural network Active CN116630709B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310598079.1A CN116630709B (en) 2023-05-25 2023-05-25 Hyperspectral image classification device and method capable of configuring mixed convolutional neural network

Publications (2)

Publication Number Publication Date
CN116630709A CN116630709A (en) 2023-08-22
CN116630709B true CN116630709B (en) 2024-01-09

Family

ID=87620891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310598079.1A Active CN116630709B (en) 2023-05-25 2023-05-25 Hyperspectral image classification device and method capable of configuring mixed convolutional neural network

Country Status (1)

Country Link
CN (1) CN116630709B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228238A (en) * 2016-07-27 2016-12-14 中国科学技术大学苏州研究院 The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform
CN106940815A (en) * 2017-02-13 2017-07-11 西安交通大学 A kind of programmable convolutional neural networks Crypto Coprocessor IP Core
CN110414401A (en) * 2019-07-22 2019-11-05 杭州电子科技大学 A kind of intelligent monitor system and monitoring method based on PYNQ
CN112329545A (en) * 2020-10-13 2021-02-05 江苏大学 ZCU104 platform-based convolutional neural network implementation and processing method for application of convolutional neural network implementation in fruit identification
CN113362292A (en) * 2021-05-27 2021-09-07 重庆邮电大学 Bone age assessment method and system based on programmable logic gate array
CN114841985A (en) * 2022-05-24 2022-08-02 苏州鑫康成医疗科技有限公司 High-precision processing and neural network hardware acceleration method based on target detection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
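"Hyperspectral image classification method based on hybrid convolutional neural network"; Liu Cuilian et al.; Laser Technology (《激光技术》); pp. 355-361 *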

Also Published As

Publication number Publication date
CN116630709A (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN108765247B (en) Image processing method, device, storage medium and equipment
CN111967468B (en) Implementation method of lightweight target detection neural network based on FPGA
US20220012593A1 (en) Neural network accelerator and neural network acceleration method based on structured pruning and low-bit quantization
US10394929B2 (en) Adaptive execution engine for convolution computing systems
CN108596248B (en) Remote sensing image classification method based on improved deep convolutional neural network
US10474464B2 (en) Deep vision processor
CN111897579B (en) Image data processing method, device, computer equipment and storage medium
CN107463990A (en) A kind of FPGA parallel acceleration methods of convolutional neural networks
CN113051216B (en) MobileNet-SSD target detection device and method based on FPGA acceleration
CN111210019B (en) Neural network inference method based on software and hardware cooperative acceleration
CN113516236A (en) VGG16 network parallel acceleration processing method based on ZYNQ platform
Xie et al. High throughput CNN accelerator design based on FPGA
Pichel et al. A new approach for sparse matrix classification based on deep learning techniques
Dong et al. GCN: GPU-based cube CNN framework for hyperspectral image classification
Niu et al. SPEC2: Spectral sparse CNN accelerator on FPGAs
Li et al. A novel hardware-oriented ultra-high-speed object detection algorithm based on convolutional neural network
CN114003201A (en) Matrix transformation method and device and convolutional neural network accelerator
Chinchanikar et al. Design of binary neural network soft system for pattern detection using HDL tool
CN116630709B (en) Hyperspectral image classification device and method capable of configuring mixed convolutional neural network
Laban et al. Enhanced pixel based urban area classification of satellite images using convolutional neural network
CN111428787A (en) Hyperspectral image parallel classification method based on GPU
CN115457363B (en) Image target detection method and system
Zhou et al. Efficient convolutional neural networks and network compression methods for object detection: A survey
CN114118415B (en) Deep learning method of lightweight bottleneck attention mechanism
CN115170381A (en) Visual SLAM acceleration system and method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant