CN116543291A - Method for realizing CNN (convolutional neural network) by using FPGA (field programmable gate array) with flexible resource configuration - Google Patents

Method for realizing CNN (convolutional neural network) by using FPGA (field programmable gate array) with flexible resource configuration

Info

Publication number
CN116543291A
CN116543291A
Authority
CN
China
Prior art keywords
vertex
layer
convolution
representing
fpga
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310748451.2A
Other languages
Chinese (zh)
Inventor
李强
赵峰
庄莉
王秋琳
伍臣周
宋立华
邱镇
黄晓光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Information and Telecommunication Co Ltd
Fujian Yirong Information Technology Co Ltd
Original Assignee
State Grid Information and Telecommunication Co Ltd
Fujian Yirong Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Information and Telecommunication Co Ltd, Fujian Yirong Information Technology Co Ltd filed Critical State Grid Information and Telecommunication Co Ltd
Priority to CN202310748451.2A priority Critical patent/CN116543291A/en
Publication of CN116543291A publication Critical patent/CN116543291A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/94Hardware or software architectures specially adapted for image or video understanding
    • G06V10/955Hardware or software architectures specially adapted for image or video understanding using specific electronic processors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/188Vegetation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of neural network models and discloses a method for realizing a CNN by using an FPGA with flexible resource configuration, comprising the following steps: step 101, before the convolution layer is entered, selecting a serial-parallel combination configuration, the configuration including a value K; step 102, generating the serial step count, which equals ⌈N/K⌉, where N is the number of feature maps in the previous layer, M is the number of feature maps in the next layer, and the current convolution layer has N×M convolution kernels; step 103, the K-to-1 convolution computation structure computes K feature maps of the previous layer in parallel, and the serial merging structure accumulates its outputs over ⌈N/K⌉ serial steps to obtain the final result of a next-layer feature map, completing the convolution computation over all N input feature maps of the previous layer. Performing convolution in this way improves the operating efficiency of the whole neural network, so that the degree of plant insect pest can be computed more efficiently.

Description

Method for realizing CNN (convolutional neural network) by using FPGA (field programmable gate array) with flexible resource configuration
Technical Field
The invention relates to the technical field of neural network models, in particular to a method for realizing CNN by using an FPGA with flexible resource configuration.
Background
A neural network model can be used to evaluate the degree of insect pest on plants that have a simple three-dimensional structure dominated by leaf surfaces. The existing obstacle is that training an ordinary neural network requires a large number of training samples annotated with pest counts; manually counting the pests in a sample takes a long time, and because pests are active and mobile, the time span of the counting distorts the statistics. Sufficient training samples therefore cannot be obtained to train the neural network, and the pest-count results it outputs are consequently inaccurate.
Disclosure of Invention
The invention provides a method for realizing a CNN by using an FPGA with flexible resource configuration, which solves the technical problem in the related art that training an ordinary neural network to count pests requires a large number of training samples annotated with pest counts.
The invention provides a CNN neural network implemented by an FPGA with flexible resource configuration, comprising:
H convolution layers sequentially connected in series, wherein the first convolution layer inputs a unit image, and a plurality of unit images are input in sequence; the feature map output by the last convolution layer is linearly transformed to generate a first feature vector;
generating an original vertex graph, wherein each vertex of the original vertex graph corresponds to one unit image; generating a vertex network graph for each vertex, and vectorizing the vertices of the vertex network graph to obtain vertex vectors;
first hidden layers, wherein a first hidden layer inputs the first feature vectors and the vertex vectors of the vertices of the vertex network graph; the h-th first hidden layer comprises C channels; the vertices in layer h of the vertex network graph are randomly sampled to generate random subsets of equal size, and the i-th random subset, together with the first feature vector and vertex vector of the vertex at the center of the vertex network graph, is input into the i-th channel; the computation comprises:
$$p_h^i = \sigma\left(W_h\, m_v^{h,i} + b_h\right)$$
where $p_h^i$ denotes the i-th propagation vector of the h-th first hidden layer, $W_h$ and $b_h$ respectively denote the weight parameter and bias parameter of the h-th first hidden layer, $\sigma$ denotes an activation function, and $m_v^{h,i}$ denotes the i-th propagation information of vertex v in layer h of the vertex network graph, vertex v being the vertex at the center of the vertex network graph;
$$m_v^{h,i} = \sum_{e \in S_h^i(v)} \alpha_{ve}\, x_e$$
where $S_h^i(v)$ denotes the i-th random subset of $N_h(v)$, $N_h(v)$ denotes the set of vertices in layer h of the vertex network graph connected to vertex v, vertex e belongs to $S_h^i(v)$, $\alpha_{ve}$ denotes the attention parameter between the first feature vector corresponding to vertex v and the first feature vector corresponding to vertex e, and $x_e$ denotes the vertex vector of vertex e.
a second hidden layer, which receives the preprocessed output of the first hidden layers;
the computation of the second hidden layer comprises:
$$g_v = \sigma\left(W_g\left(\sum_{y=1}^{C'} c_y + x_v\right) + b_g\right)$$
where $g_v$ denotes the graph encoding vector of vertex v, $c_y$ denotes the vector of the center of the y-th cluster, obtained from the preprocessing of the output of the first hidden layers, $C'$ denotes the number of cluster centers, $W_g$ and $b_g$ respectively denote the weight parameter and bias parameter of the second hidden layer, $\sigma$ denotes an activation function, and $x_v$ denotes the vertex vector corresponding to vertex v;
a result output layer, which inputs the graph encoding vectors of the vertices and searches for the local maxima of each graph encoding vector; the number of local maxima is the number of pests in the region of the unit image corresponding to the vertex;
during training, the second hidden layer is connected to a fully connected layer, which inputs the graph encoding vectors of the vertices and outputs a mapping into a classification space; the classification labels of the classification space represent the degree of insect pest.
Further, connecting edges exist between the vertices corresponding to adjacent unit images on the plant image.
Further, the attention parameter between the first feature vector corresponding to vertex v and the first feature vector corresponding to vertex e is calculated as:
$$\alpha_{ve} = \frac{\exp(a_{ve})}{\sum_{k \in S(e)} \exp(a_{vk})}, \qquad a_{ve} = \lambda\, u_v^{\top} u_e$$
where $a_{ve}$ denotes the original attention parameter, $S(e)$ denotes the random subset to which vertex e belongs in the vertex network graph of vertex v, $\exp$ denotes the exponential function with the natural constant as its base, $u_v$ and $u_e$ respectively denote the first feature vectors corresponding to vertex v and vertex e, and $\lambda$ denotes the scaling coefficient.
Further, the preprocessing of the output of the first hidden layers comprises: taking the C propagation vectors output by one first hidden layer as cluster centers, clustering the propagation vectors output by the other first hidden layers to generate C clusters, and computing the vector of the cluster center for each cluster.
Further, a random walk is performed on the original vertex graph to generate a vertex network graph for each vertex, wherein the number of layers of the vertex network graph is the same as the number of layers walked by the random walk.
Further, the method for generating a vertex network graph for each vertex by the random walk includes:
step 201, selecting a vertex and starting a random walk centered on the selected vertex until the number of layers walked reaches A, where A is the number of first hidden layers;
step 202, adding the walked vertex sequence to the vertex network graph; if the number of walks is less than B, incrementing the walk count by one and returning to step 201; otherwise, ending.
Further, the classification labels of the classification space respectively represent no pest, general pest, and serious pest.
Further, the unit images are generated by uniformly dividing the plant image.
The invention provides a method for realizing a CNN by using an FPGA with flexible resource configuration, used to perform the convolution operation of a convolution layer of the above CNN neural network, comprising the following steps:
step 101, before the convolution layer is entered, selecting a serial-parallel combination configuration, the configuration including a value K;
step 102, generating the serial step count, which equals ⌈N/K⌉, where N is the number of feature maps in the previous layer, M is the number of feature maps in the next layer, and the current convolution layer has N×M convolution kernels;
and step 103, the K-to-1 convolution computation structure computes K feature maps of the previous layer in parallel, and the serial merging structure accumulates its outputs over ⌈N/K⌉ serial steps to obtain the final result of a next-layer feature map, completing the convolution computation over all N input feature maps of the previous layer.
Further, the K-to-1 convolution computation structure performs the convolution of K convolution kernels with their corresponding feature maps and sums the results point-wise to obtain values of 1 output feature map; the convolutions of the K feature maps with the K convolution kernels are executed in parallel, and the K output convolution values are summed every clock cycle;
the serial merging structure accumulates the results output by the K-to-1 convolution computation structure each time; after each K-to-1 convolution computation, the K input feature maps are changed and the computation continues; after ⌈N/K⌉ times, the convolution computation over all N input feature maps of the previous layer is completed; a bias is added to the accumulated value, and the final result of the feature map is obtained through the operation of an activation function f.
The invention has the beneficial effects that: the training samples of the neural network model take little time to annotate; information is propagated and integrated across the multiple unit images to generate the encoding vectors corresponding to the unit images, and the pest count is output indirectly by exploiting what the neural network learns during training about the vector features associated with pests, which guarantees accuracy while reducing workload; at the same time, the FPGA with flexible resource configuration performs efficient convolution processing for the convolution layers, guaranteeing the operating speed of the neural network model.
Drawings
FIG. 1 is a flow chart of the method of the present invention for implementing a CNN by an FPGA with flexible resource configuration;
FIG. 2 is a flow chart of the method of the present invention for generating a vertex network graph for each vertex by random walk.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It is to be understood that these embodiments are merely discussed so that those skilled in the art may better understand and implement the subject matter described herein and that changes may be made in the function and arrangement of the elements discussed without departing from the scope of the disclosure herein. Various examples may omit, replace, or add various procedures or components as desired. In addition, features described with respect to some examples may be combined in other examples as well.
An FPGA-implemented CNN neural network with flexible resource configuration, comprising:
H convolution layers, wherein the first convolution layer inputs a unit image, and a plurality of unit images are input in sequence;
the H convolution layers are sequentially connected in series; the feature map output by the last convolution layer is defined as the final feature map, and each unit image corresponds to one final feature map;
a first linear layer, which inputs the final feature map and performs a linear transformation on it to generate a first feature vector;
generating an original vertex graph, wherein each vertex of the original vertex graph corresponds to one unit image, and connecting edges exist between the vertices corresponding to adjacent unit images on the plant image;
generating a vertex network graph for each vertex by performing random walks on the original vertex graph, wherein the starting point of each random walk is the vertex at the center of the vertex network graph, the number of layers of the vertex network graph is the same as the number of layers walked, and the vertices of the vertex network graph are vectorized to obtain vertex vectors;
first hidden layers connected in parallel, wherein a first hidden layer inputs the first feature vectors and the vertex vectors of a vertex network graph;
the h-th first hidden layer comprises C channels; the vertices in layer h of the vertex network graph are randomly sampled to generate random subsets of equal size, and the i-th random subset, together with the first feature vector and vertex vector of the vertex at the center of the vertex network graph, is input into the i-th channel (the channels share weight parameters and bias parameters); the computation comprises:
$$p_h^i = \sigma\left(W_h\, m_v^{h,i} + b_h\right)$$
where $p_h^i$ denotes the i-th propagation vector of the h-th first hidden layer, $W_h$ and $b_h$ respectively denote the weight parameter and bias parameter of the h-th first hidden layer, $\sigma$ denotes an activation function, and $m_v^{h,i}$ denotes the i-th propagation information of vertex v (the vertex at the center of the vertex network graph) in layer h of the vertex network graph;
$$m_v^{h,i} = \sum_{e \in S_h^i(v)} \alpha_{ve}\, x_e$$
where $S_h^i(v)$ denotes the i-th random subset of $N_h(v)$, $N_h(v)$ denotes the set of vertices in layer h of the vertex network graph connected to vertex v, $\alpha_{ve}$ denotes the attention parameter between the first feature vector corresponding to vertex v and the first feature vector corresponding to vertex e, and $x_e$ denotes the vertex vector of vertex e.
$$\alpha_{ve} = \frac{\exp(a_{ve})}{\sum_{k \in S(e)} \exp(a_{vk})}, \qquad a_{ve} = \lambda\, u_v^{\top} u_e$$
where $a_{ve}$ denotes the original attention parameter, $S(e)$ denotes the random subset to which vertex e belongs in the vertex network graph of vertex v, $\exp$ denotes the exponential function with the natural constant as its base, $u_v$ and $u_e$ respectively denote the first feature vectors corresponding to vertex v and vertex e, and $\lambda$ denotes the scaling coefficient, an adjustable parameter that defaults to 0.2.
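For illustration, the computation of one channel can be rendered as a minimal NumPy sketch, under the assumptions made in the reconstructed formulas above (scaled dot-product attention with coefficient λ, Sigmoid activation as chosen later in this description); the function names, shapes, and the exact attention form are illustrative assumptions rather than limitations of the method described here.

```python
import numpy as np

def attention_weights(u_v, U_subset, lam=0.2):
    """Softmax-normalized attention of center vertex v over one random subset.
    u_v: first feature vector of center vertex v, shape (d,)
    U_subset: first feature vectors of the subset's vertices, shape (s, d)
    lam: scaling coefficient (the adjustable parameter, default 0.2)"""
    a = lam * (U_subset @ u_v)      # original attention parameters a_ve (assumed dot-product form)
    e = np.exp(a - a.max())         # exponential with max-shift for numerical stability
    return e / e.sum()              # alpha_ve, normalized over the subset

def channel(u_v, U_subset, X_subset, W, b):
    """i-th channel of the h-th first hidden layer: attention-weighted sum of the
    subset's vertex vectors (propagation information m_v), then affine map + sigmoid."""
    m = attention_weights(u_v, U_subset) @ X_subset    # m_v^{h,i}
    return 1.0 / (1.0 + np.exp(-(W @ m + b)))          # propagation vector p_h^i
```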
A second hidden layer receives the preprocessed output of the first hidden layers; the preprocessing comprises:
taking the C propagation vectors output by one first hidden layer as cluster centers, clustering the propagation vectors output by the other first hidden layers to generate C clusters, and computing the vector of the cluster center for each cluster;
the computation of the second hidden layer comprises:
$$g_v = \sigma\left(W_g\left(\sum_{y=1}^{C'} c_y + x_v\right) + b_g\right)$$
where $g_v$ denotes the graph encoding vector of vertex v, $c_y$ denotes the vector of the center of the y-th cluster, $C'$ denotes the number of cluster centers, $W_g$ and $b_g$ respectively denote the weight parameter and bias parameter of the second hidden layer, $\sigma$ denotes an activation function, and $x_v$ denotes the vertex vector corresponding to vertex v;
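For illustration, the interface between the two hidden layers can be sketched as follows; a single k-means-style assignment step stands in for the clustering (the description above does not fix a clustering algorithm), and the summation form follows the reconstructed formula, so all names and shapes are illustrative assumptions.

```python
import numpy as np

def cluster_centers(P_seed, P_others):
    """Preprocessing between the hidden layers: the C propagation vectors of one
    first hidden layer (P_seed) seed the clusters; the other layers' propagation
    vectors are assigned to the nearest seed and each cluster's center vector is computed."""
    d2 = ((P_others[:, None, :] - P_seed[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)                       # nearest-seed assignment
    return np.stack([
        np.vstack([P_seed[y:y + 1], P_others[assign == y]]).mean(axis=0)
        for y in range(P_seed.shape[0])              # vector c_y of the y-th cluster center
    ])

def second_hidden_layer(centers, x_v, W, b):
    """Graph encoding vector g_v of vertex v, under the reconstructed summation form."""
    z = W @ (centers.sum(axis=0) + x_v) + b
    return 1.0 / (1.0 + np.exp(-z))                  # sigmoid activation
```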
and the result output layer, which inputs the graph encoding vectors of the vertices and searches for the local maxima of each graph encoding vector; the number of local maxima is the number of pests in the region of the unit image corresponding to the vertex.
In an embodiment of the present invention, the second hidden layer of the CNN neural network implemented by the FPGA with flexible resource configuration is connected to a fully connected layer during training; the fully connected layer inputs the graph encoding vectors of the vertices and outputs a mapping into a classification space, whose classification labels respectively represent no pest, general pest, and serious pest. Specifically, since each vertex traces back to a unit image, a classification label in fact describes the degree of insect pest of the region corresponding to that unit image; for the training set, the labels can be annotated by manually inspecting the pictures based on experience, without needing to obtain specific information such as the number of pests in each unit image.
The algorithm the result output layer applies to search for local maxima is a conventional technique, and the result output layer can be attached after the other parts have finished training.
In one embodiment of the invention, each vector component of the graph encoding vector is mapped into a two-dimensional coordinate system: the value of a component gives its Y-axis coordinate and its position in the vector gives its X-axis coordinate; a curve is fitted, and the peaks of the curve are taken as the local maxima.
In one embodiment of the invention, the dimension of the graph encoding vector of a vertex is the same as the number of elements of the matrix of the final feature map; the graph encoding vector is cut into equal-length segments that are spliced in order into an intermediate matrix of the same size as the matrix of the final feature map, and the local maxima of the intermediate matrix are searched for and taken as the local maxima of the graph encoding vector.
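For illustration, the intermediate-matrix variant can be sketched as below; the strict 4-neighborhood comparison is one conventional choice of local-maximum search, which the description leaves open.

```python
import numpy as np

def count_local_maxima(g, rows, cols):
    """Fold graph encoding vector g into a (rows, cols) intermediate matrix and
    count entries strictly greater than their 4 neighbors; the count is taken as
    the pest number for the vertex's unit image."""
    m = np.asarray(g).reshape(rows, cols)            # requires len(g) == rows * cols
    p = np.pad(m, 1, constant_values=-np.inf)        # -inf border so edge cells can win
    c = p[1:-1, 1:-1]
    is_max = ((c > p[:-2, 1:-1]) & (c > p[2:, 1:-1]) &
              (c > p[1:-1, :-2]) & (c > p[1:-1, 2:]))
    return int(is_max.sum())
```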
As shown in fig. 2, in one embodiment of the present invention, a method for generating a vertex network graph for each vertex by random walk includes:
step 201, selecting a vertex and starting a random walk centered on the selected vertex until the number of layers walked reaches A, where A is the number of first hidden layers;
step 202, adding the walked vertex sequence to the vertex network graph; if the number of walks is less than B, incrementing the walk count by one and returning to step 201; otherwise, ending;
the vertex sequences are one-hot encoded and input into a continuous bag-of-words model, which outputs the vectorized representations of the vertices, recorded as the vertex vectors. A sketch of this procedure follows.
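For illustration, the walk procedure can be sketched as follows; the adjacency dictionary and the use of gensim's Word2Vec with sg=0 (i.e. a continuous bag-of-words model) for the vectorization step are illustrative assumptions.

```python
import random

def vertex_network_walks(adj, v, A, B):
    """Random walks that build the vertex network graph of center vertex v.
    adj: dict mapping each vertex of the original vertex graph to its neighbor list
         (vertices of adjacent unit images share an edge)
    A:   walk depth, equal to the number of first hidden layers
    B:   number of walks"""
    sequences = []
    for _ in range(B):
        walk, cur = [v], v
        for _ in range(A):                 # walk until the layer count reaches A
            if not adj[cur]:
                break
            cur = random.choice(adj[cur])
            walk.append(cur)
        sequences.append(walk)             # add the walked sequence to the graph
    return sequences
```

The vertices reached at step h of any walk form layer h of the vertex network graph; the sequences would then be fed, with vertices treated as tokens, to a CBOW model such as gensim's `Word2Vec(sequences, sg=0)`, and the learned embeddings recorded as the vertex vectors.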
In one embodiment of the invention, the unit images are generated by uniformly dividing the plant image.
In one embodiment of the present invention, the plant image is a top-view image of a plant cultivation area, the plant being tobacco or the like; the plant image is an RGB image, so each unit image is also an RGB image, and the image feature inputs of the three channels are generated separately.
In one embodiment of the invention, as in a general CNN, pooling layers are arranged between the convolution layers.
In one embodiment of the invention, the activation function $\sigma$ is the Sigmoid activation function.
In one embodiment of the invention, generating the original vertex graph, generating a vertex network graph for each vertex, and vectorizing the vertices of the vertex network graph to obtain vertex vectors are performed outside the CNN neural network implemented by the FPGA with flexible resource configuration.
As shown in fig. 1, a method for implementing a CNN by using an FPGA with flexible resource configuration includes the following steps:
step 101, before the convolution layer is entered, selecting a serial-parallel combination configuration, the configuration including a value K;
here 1 ≤ K ≤ N; the size of K determines the resource cost of the parallel operation, and when the FPGA has fewer resources, K can be set to a smaller value.
step 102, generating the serial step count, which equals ⌈N/K⌉, where N is the number of feature maps in the previous layer, M is the number of feature maps in the next layer, and the current convolution layer has N×M convolution kernels;
and step 103, the K-to-1 convolution computation structure computes K feature maps of the previous layer in parallel, and the serial merging structure accumulates its outputs over ⌈N/K⌉ serial steps to obtain the final result of a next-layer feature map, completing the convolution computation over all N input feature maps of the previous layer; here ⌈·⌉ denotes rounding up to an integer.
the K-1 convolution calculation structure is a convolution calculation structure which calculates K feature images of the previous layer in parallel, calculates to obtain 1 feature image of the next layer, executes convolution calculation of K convolution kernels and corresponding feature images, sums corresponding points to obtain values in the 1 feature images, the convolution calculation of the K feature images and the K convolution kernels is executed in parallel, and each clock sums the output K convolution values;
the serial merging structure is to accumulate the results output by each time of K-1 convolution calculation structure, change K input feature graphs to continue calculation after each time of K-1 convolution calculation, complete the convolution calculation of the total N input feature graphs of the previous layer after N/K times, add bias to the accumulated value, and obtain the final result of the feature graphs through the operation of an activation function f (such as Sigmoid).
The feature maps input to the first convolution layer are the unit images.
The feature map information computed each time is cached off-chip or stored on-chip: when the number of feature maps is large and the feature dimension is high, storage resource occupation is large, and the result needs to be cached off-chip after each single feature map is computed; otherwise it is stored directly on-chip, which reduces transmission time and the final operation latency.
Computations such as the activation function, pooling, and full connection are run in the normal way; the method of the invention focuses on optimizing the inter-layer convolution operation, which is the core computation that consumes the most resources and most affects performance.
Layer L-1 has N feature maps and layer L has M feature maps, so this layer has N×M convolution kernels. Each output feature map of layer L is obtained by convolving all the layer-(L-1) feature maps with the corresponding convolution kernels, summing, adding the bias, and applying the excitation function. The calculation formula is:
$$X_j^{L} = f\left(\sum_{i=1}^{N} X_i^{L-1} * W_{ij}^{L} + b_j^{L}\right)$$
where $X_j^{L}$ denotes the information of the j-th feature map of layer L, $X_i^{L-1}$ denotes the information of the i-th feature map of layer L-1, $W_{ij}^{L}$ denotes the convolution kernel between the i-th input and the j-th output, $*$ denotes convolution, $b_j^{L}$ denotes the bias of the j-th output feature map of layer L, and f denotes the activation function. For convenience of the following explanation, let N=4 and M=3, so this layer has 12 convolution kernels, each of size 3×3;
according to the serial-parallel structure, a K-1 convolution calculation structure is firstly executed, K L-1 layer input features are taken, convolution is carried out on the K L-1 layer input features and K convolution kernels, then the K L-1 layer input features and the K L-1 layer input features are added, and a temporary value of an L-1 layer output feature diagram is obtained, wherein K is not less than 1 and not more than N, and K is assumed to be 2.
Corresponding to convolution operation between the L-1 layer and the L layer, firstly taking the 1 st characteristic diagram information of the L-1 layerAnd 2 nd feature map information->Performing the convolution calculation structure of K-1 to obtain the first temporary feature map of the L-th layer +.>Performing on-chip caching; then take the 3 rd feature map information of L-1 layer +.>And 3 rd feature map information->Performing the convolution calculation structure of K-1 to obtain the second temporary feature map of the L-th layer +.>Will->And->After accumulation, adding bias, and activating function f operation to obtain the result of the first output characteristic diagram of the L layers.
The size of K determines the resource cost of the parallel operation. When the FPGA has fewer resources, K can be set smaller, down to K = 1, executing a fully serial operation; the convolution between the N feature maps of layer L-1 and the M feature maps of layer L then takes N serial steps per output feature map. When the FPGA has more resources, K can be set larger, up to N, and the computation between the N feature maps of layer L-1 and the M feature maps of layer L is completed in only M executions, improving the timeliness of the operation.
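To make the scheduling concrete, the following is a behavioral software model of the serial-parallel scheme, not HDL: on the FPGA the K-to-1 structure would be K parallel multiply-accumulate pipelines, while here SciPy's correlate2d stands in for the hardware convolution primitive and plain loops stand in for the serial steps; all names are illustrative, and the shapes follow the worked example above (N=4, M=3, K=2, 3×3 kernels).

```python
import numpy as np
from math import ceil
from scipy.signal import correlate2d      # stands in for the hardware convolution unit

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def k_to_1(maps, kernels):
    """K-to-1 convolution computation structure: convolve up to K input feature maps
    with their kernels in parallel and sum point-wise into 1 temporary map."""
    return sum(correlate2d(m, k, mode="valid") for m, k in zip(maps, kernels))

def conv_layer(inputs, kernels, biases, K):
    """Serial-parallel convolution between layer L-1 (N maps) and layer L (M maps).
    inputs:  list of N input feature maps
    kernels: kernels[i][j] is the kernel between input i and output j
    biases:  length-M list of biases"""
    N, M = len(inputs), len(biases)
    steps = ceil(N / K)                    # serial step count, ceil(N/K)
    outputs = []
    for j in range(M):
        acc = 0.0
        for s in range(steps):             # serial merging structure
            lo, hi = s * K, min((s + 1) * K, N)
            acc = acc + k_to_1(inputs[lo:hi],
                               [kernels[i][j] for i in range(lo, hi)])
        outputs.append(sigmoid(acc + biases[j]))   # add bias, apply activation f
    return outputs

# worked example: N=4, M=3, K=2, 3x3 kernels on 8x8 inputs
rng = np.random.default_rng(0)
inputs = [rng.standard_normal((8, 8)) for _ in range(4)]
kernels = [[rng.standard_normal((3, 3)) for _ in range(3)] for _ in range(4)]
outs = conv_layer(inputs, kernels, [0.1, 0.2, 0.3], K=2)
assert len(outs) == 3 and outs[0].shape == (6, 6)
```

Setting K = 1 reduces the inner loop to fully serial operation, while K = N computes each output map in a single step, matching the resource trade-off described above.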
For the CNN neural network implemented by the FPGA with flexible resource configuration, since independent unit images are input, the number of convolution operations multiplies as the number of unit images grows; performing convolution with the method described above therefore improves the operating efficiency of the whole neural network.
The embodiments have been described above with reference to examples, but the invention is not limited to the specific implementations described, which are merely illustrative and not restrictive; those of ordinary skill in the art, enlightened by this disclosure, may devise many other forms without departing from the scope of the invention, and all such forms fall within its protection.

Claims (10)

1. A CNN neural network implemented by an FPGA with flexible resource configuration, comprising:
H convolution layers sequentially connected in series, wherein the first convolution layer inputs a unit image, and a plurality of unit images are input in sequence; the feature map output by the last convolution layer is linearly transformed to generate a first feature vector;
generating an original vertex graph, wherein each vertex of the original vertex graph corresponds to one unit image; generating a vertex network graph for each vertex, and vectorizing the vertices of the vertex network graph to obtain vertex vectors;
first hidden layers, wherein a first hidden layer inputs the first feature vectors and the vertex vectors of the vertices of the vertex network graph; the h-th first hidden layer comprises C channels; the vertices in layer h of the vertex network graph are randomly sampled to generate random subsets of equal size, and the i-th random subset, together with the first feature vector and vertex vector of the vertex at the center of the vertex network graph, is input into the i-th channel; the computation comprises:
$$p_h^i = \sigma\left(W_h\, m_v^{h,i} + b_h\right)$$
where $p_h^i$ denotes the i-th propagation vector of the h-th first hidden layer, $W_h$ and $b_h$ respectively denote the weight parameter and bias parameter of the h-th first hidden layer, $\sigma$ denotes an activation function, and $m_v^{h,i}$ denotes the i-th propagation information of vertex v in layer h of the vertex network graph, vertex v being the vertex at the center of the vertex network graph;
$$m_v^{h,i} = \sum_{e \in S_h^i(v)} \alpha_{ve}\, x_e$$
where $S_h^i(v)$ denotes the i-th random subset of $N_h(v)$, $N_h(v)$ denotes the set of vertices in layer h of the vertex network graph connected to vertex v, vertex e belongs to $S_h^i(v)$, $\alpha_{ve}$ denotes the attention parameter between the first feature vector corresponding to vertex v and the first feature vector corresponding to vertex e, and $x_e$ denotes the vertex vector of vertex e;
a second hidden layer, which receives the preprocessed output of the first hidden layers;
the computation of the second hidden layer comprises:
$$g_v = \sigma\left(W_g\left(\sum_{y=1}^{C'} c_y + x_v\right) + b_g\right)$$
where $g_v$ denotes the graph encoding vector of vertex v, $c_y$ denotes the vector of the center of the y-th cluster, obtained by preprocessing the output of the first hidden layers, $C'$ denotes the number of cluster centers, $W_g$ and $b_g$ respectively denote the weight parameter and bias parameter of the second hidden layer, $\sigma$ denotes an activation function, and $x_v$ denotes the vertex vector corresponding to vertex v;
a result output layer, which inputs the graph encoding vectors of the vertices and searches for the local maxima of each graph encoding vector; the number of local maxima is the number of pests in the region of the unit image corresponding to the vertex;
during training, the second hidden layer is connected to a fully connected layer, which inputs the graph encoding vectors of the vertices and outputs a mapping into a classification space; the classification labels of the classification space represent the degree of insect pest.
2. The CNN neural network implemented by an FPGA with flexible resource configuration according to claim 1, wherein connecting edges exist between the vertices corresponding to adjacent unit images on the plant image.
3. The CNN neural network implemented by an FPGA with flexible resource configuration according to claim 1, wherein the attention parameter between the first feature vector corresponding to vertex v and the first feature vector corresponding to vertex e is calculated as:
$$\alpha_{ve} = \frac{\exp(a_{ve})}{\sum_{k \in S(e)} \exp(a_{vk})}, \qquad a_{ve} = \lambda\, u_v^{\top} u_e$$
where $a_{ve}$ denotes the original attention parameter, $S(e)$ denotes the random subset to which vertex e belongs in the vertex network graph of vertex v, $\exp$ denotes the exponential function with the natural constant as its base, $u_v$ and $u_e$ respectively denote the first feature vectors corresponding to vertex v and vertex e, and $\lambda$ denotes the scaling coefficient.
4. The FPGA-implemented CNN neural network of claim 1, wherein the preprocessing of the output of the first hidden layers comprises: taking the C propagation vectors output by one first hidden layer as cluster centers, clustering the propagation vectors output by the other first hidden layers to generate C clusters, and computing the vector of the cluster center for each cluster.
5. The CNN neural network implemented by an FPGA with flexible resource configuration according to claim 1, wherein a random walk is performed on the original vertex graph to generate the vertex network graph for each vertex, and the number of layers of the vertex network graph is the same as the number of layers walked by the random walk.
6. The FPGA-implemented CNN neural network of claim 5, wherein the method of generating a vertex network graph for each vertex by random walk comprises:
step 201, selecting a vertex and starting a random walk centered on the selected vertex until the number of layers walked reaches A, where A is the number of first hidden layers;
step 202, adding the walked vertex sequence to the vertex network graph; if the number of walks is less than B, incrementing the walk count by one and returning to step 201; otherwise, ending.
7. The FPGA-implemented CNN neural network of claim 1, wherein the classification labels of the classification space respectively represent no pest, general pest, and serious pest.
8. The FPGA-implemented CNN neural network of claim 1, wherein the unit images are generated by uniformly dividing the plant image.
9. A method for implementing a CNN by using an FPGA with flexible resource configuration, used to perform the convolution operation of a convolution layer of the CNN neural network implemented by an FPGA with flexible resource configuration according to any one of claims 1 to 8, comprising the following steps:
step 101, before the convolution layer is entered, selecting a serial-parallel combination configuration, the configuration including a value K;
step 102, generating the serial step count, which equals ⌈N/K⌉, where N is the number of feature maps in the previous layer, M is the number of feature maps in the next layer, and the current convolution layer has N×M convolution kernels;
and step 103, the K-to-1 convolution computation structure computes K feature maps of the previous layer in parallel, and the serial merging structure accumulates its outputs over ⌈N/K⌉ serial steps to obtain the final result of a next-layer feature map, completing the convolution computation over all N input feature maps of the previous layer.
10. The method for implementing a CNN by using an FPGA with flexible resource configuration according to claim 9, wherein the K-to-1 convolution computation structure performs the convolution of K convolution kernels with their corresponding feature maps and sums the results point-wise to obtain values of 1 output feature map; the convolutions of the K feature maps with the K convolution kernels are executed in parallel, and the K output convolution values are summed every clock cycle;
the serial merging structure accumulates the results output by the K-to-1 convolution computation structure each time; after each K-to-1 convolution computation, the K input feature maps are changed and the computation continues; after ⌈N/K⌉ times, the convolution computation over all N input feature maps of the previous layer is completed; a bias is added to the accumulated value, and the final result of the feature map is obtained through the operation of an activation function f.
CN202310748451.2A 2023-06-25 2023-06-25 Method for realizing CNN (convolutional neural network) by using FPGA (field programmable gate array) with flexible resource configuration Pending CN116543291A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310748451.2A CN116543291A (en) Method for realizing CNN (convolutional neural network) by using FPGA (field programmable gate array) with flexible resource configuration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310748451.2A CN116543291A (en) Method for realizing CNN (convolutional neural network) by using FPGA (field programmable gate array) with flexible resource configuration

Publications (1)

Publication Number Publication Date
CN116543291A true CN116543291A (en) 2023-08-04

Family

ID=87445549

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310748451.2A CN116543291A (en) 2023-06-25 2023-06-25 Method for realizing CNN (convolutional neural network) by using FPGA (field programmable gate array) with flexible resource configuration

Country Status (1)

Country Link
CN (1) CN116543291A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117348949A (en) * 2023-12-05 2024-01-05 成都玖锦科技有限公司 Multi-channel measurement method and system based on vector network analyzer
CN117348949B (en) * 2023-12-05 2024-03-12 成都玖锦科技有限公司 Multi-channel measurement method and system based on vector network analyzer

Similar Documents

Publication Publication Date Title
Zhang et al. A review of deep learning-based semantic segmentation for point cloud
Chen et al. Efficient approximation of deep relu networks for functions on low dimensional manifolds
Xie et al. Grnet: Gridding residual network for dense point cloud completion
CN108510012B (en) Target rapid detection method based on multi-scale feature map
Park et al. Analysis on the dropout effect in convolutional neural networks
CN109685819B (en) Three-dimensional medical image segmentation method based on feature enhancement
US20200234172A1 (en) Systems and methods for hybrid algorithms using cluster contraction
US20180260709A1 (en) Calculating device and method for a sparsely connected artificial neural network
JP6962263B2 (en) 3D point cloud label learning device, 3D point cloud label estimation device, 3D point cloud label learning method, 3D point cloud label estimation method, and program
CN111507521B (en) Method and device for predicting power load of transformer area
CN112288011B (en) Image matching method based on self-attention deep neural network
CN111401436B (en) Streetscape image segmentation method fusing network and two-channel attention mechanism
CN113822209B (en) Hyperspectral image recognition method and device, electronic equipment and readable storage medium
CN113449736B (en) Photogrammetry point cloud semantic segmentation method based on deep learning
CN112163601A (en) Image classification method, system, computer device and storage medium
CN115908908B (en) Remote sensing image aggregation type target recognition method and device based on graph attention network
CN116543291A (en) Method for realizing CNN (convolutional neural network) by using FPGA (field programmable gate array) with flexible resource configuration
Chen et al. A 68-mw 2.2 tops/w low bit width and multiplierless DCNN object detection processor for visually impaired people
CN114677548A (en) Neural network image classification system and method based on resistive random access memory
Zhang et al. Segmentation model based on convolutional neural networks for extracting vegetation from Gaofen-2 images
CN113642716A (en) Depth variation autoencoder model training method, device, equipment and storage medium
CN113627440A (en) Large-scale point cloud semantic segmentation method based on lightweight neural network
CN115830596A (en) Remote sensing image semantic segmentation method based on fusion pyramid attention
JP2021039758A (en) Similar region emphasis method and system using similarity among images
Xiao et al. A point selection method in map generalization using graph convolutional network model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination