CN115082738A - Method and system for realizing picture classification based on full fusion and pruning technology of binary neural network - Google Patents


Info

Publication number
CN115082738A
CN115082738A
Authority
CN
China
Prior art keywords
layer
pruning
convolution
full
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210773759.8A
Other languages
Chinese (zh)
Inventor
王永
张瑞
栾存阳
亓海凤
仲祖霆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202210773759.8A priority Critical patent/CN115082738A/en
Publication of CN115082738A publication Critical patent/CN115082738A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Abstract

The invention relates to a method and a system for realizing picture classification based on the full fusion and pruning technology of a binarization neural network. By combining the full fusion technology with the pruning method, the invention effectively reduces operation time and operation count without lowering picture classification precision, and achieves high hardware acceleration efficiency for the binarization neural network.

Description

Method and system for realizing picture classification based on full fusion and pruning technology of binary neural network
Technical Field
The invention relates to a method and a system for realizing picture classification based on a full fusion and pruning technology of a binarization neural network, belonging to the field of Artificial Intelligence (AI) integrated circuit hardware design.
Background
Convolutional Neural Networks (CNN) are one of the AI classical structures and are important components of the deep learning field. CNN generally adopts floating point calculation, and requires a large storage space and a large amount of calculation, so that the hardware implementation thereof requires a large resource overhead. In recent years, deep convolutional neural networks have been widely applied in the fields of computer vision, speech recognition, natural language processing and the like. However, in the application scenario of the mobile terminal with low power consumption and low cost, the application of the convolutional neural network is limited by many factors: on one hand, the high-performance convolutional neural network model occupies a large storage space and has high requirements on hardware resources of the mobile terminal. On the other hand, the convolutional neural network has high computational complexity, and the power consumption for realizing dot product operation is large when a deeper convolutional neural network model is operated. In order to make CNNs more suitable for edge computation, Binary Neural Networks (BNNs) are proposed.
BNNs binarize the parameters in the network to 1 and -1 (or to 1 and 0), which sharply reduces the neural network's demand for hardware resources. The memory requirement of a BNN is reduced to one tenth of that of a CNN, and a BNN replaces the multiply-add operations of a CNN with bitwise AND/OR operations, markedly reducing computation power consumption.
With the development of artificial intelligence, the binarization neural network has been applied to mobile-end hardware acceleration optimization to improve model operation speed. At present, however, hardware implementations of binarization neural network acceleration still follow a software design mindset, proceeding sequentially, completely, and layer by layer. Such hardware implementations are inefficient, do not exploit the hardware acceleration advantage of the binarization neural network to the fullest, and cannot meet the low-power requirements of practical embedded scenarios. Aiming at this hardware design problem, a technology for fusing and pruning the layers of the binarization neural network is needed, realizing efficient intelligent detection through a hardware implementation that is faster, consumes less power, and occupies fewer hardware resources.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a method for realizing image classification based on the full fusion and pruning technology of a binarization neural network.
The invention fuses the convolution layer, the pooling layer, and the first fully-connected layer of the binarization neural network, and prunes based on the particularity of binarization, thereby effectively reducing operation time and operation count while improving the hardware acceleration efficiency of the binarization neural network. The resulting hardware acceleration method achieves shorter operation time with higher acceleration efficiency.
The invention also provides a system for realizing image classification based on the full fusion and pruning technology of the binarization neural network.
Interpretation of terms:
1. CNN: abbreviation for Convolutional Neural Networks.
2. BNN: abbreviation for Binarized Neural Networks.
3. Pruning, which means removing part of parameters from the existing network, improving network efficiency and simultaneously ensuring network accuracy.
The technical scheme of the invention is as follows:
a method for realizing image classification based on full fusion and pruning technology of a binarization neural network comprises the following steps:
step 1: constructing a binarization neural network architecture for realizing a picture two-classification task;
step 2: acquiring a picture data set, and carrying out graying and 1,0 binarization preprocessing on original picture data;
step 3: constructing a sliding window, i.e., a sliding window of size R_k * R_k with step length m and sliding count c, where m is the product of the convolution step length and the pooling step length;
step 4: reconstructing the convolution kernel, i.e., reconstructing the size and content of the 1,0 binarized convolution kernel obtained by training the binarization neural network; the reconstructed kernel is named the reconstruction convolution kernel and has size R_k * R_k;
step 5: pruning the reconstruction convolution kernel, i.e., removing the 0 part of the reconstruction convolution kernel;
step 6: selecting the first-layer fully-connected-layer weights of size c * q, i.e., selecting the first-layer fully-connected-layer weights obtained by training the binarization neural network, where c is the sliding count of the sliding window and q is the output size of the first fully-connected layer;
step 7: loading the binary picture preprocessed in step 2, the pruned reconstruction convolution kernel of step 5, and the first-layer fully-connected-layer weights selected in step 6 into the hardware memory;
step 8: sliding the sliding window over the binary picture preprocessed in step 2 with step length m, from left to right and from top to bottom;
step 9: performing the convolution-pooling fusion pruning operation on the binary picture data intercepted by the sliding window, i.e., performing a bitwise AND between the intercepted data and the pruned reconstruction convolution kernel, while removing the bit-AND operations that do not influence classification precision;
step 10: performing the fusion pruning operation on the convolution-pooling fusion pruning result of step 9 and the corresponding part of the first-layer fully-connected-layer weights selected in step 6, i.e., determining the output of the first fully-connected layer directly from the convolution-pooling fusion pruning result of the currently intercepted sliding-window data, thereby removing the bitwise XNOR operation between that result and the selected part of the first-layer fully-connected-layer weights;
step 11: judging whether the sliding window completes traversal of the binary image data or not, if so, entering a step 12, and ending the full-fusion pruning operation of the step 9-10; otherwise, repeating the steps 8-10;
step 12: accumulating and comparing the fused pruning operation results output in the step 10 in the traversal process;
step 13: finishing the operation of the second layer full-link layer, namely performing the exclusive-nor operation on the accumulation and comparison result obtained in the step 12 and the weight value of the second layer full-link layer obtained by the training of the binarization neural network;
step 14: and taking the category corresponding to the maximum value of the result in the step 13 as a final classification label of the input picture.
Further, the network architecture of the binarization neural network comprises an input layer, a convolution layer, a pooling layer, two full-connection layers and an output layer; the input layer preprocesses the original picture into a binary picture; the convolution layer performs feature extraction on input picture data through convolution kernel, and activates the convolution layer output by a Sign activation function; the pooling layer reduces network parameters by down-sampling; the full-connection layer integrates the image characteristics extracted by the convolution pooling layer, and activates the output of the full-connection layer through a Sign activation function; the output layer may be considered a classifier for predicting the class of the input picture; the two full-connecting layers comprise a first full-connecting layer and a second full-connecting layer.
Further, in the step 1, in the forward propagation process of the binarization neural network, binarization processing is performed on a convolution kernel based on a binary function Binarized, and binarization processing is performed on the convolution layer and the output of the full connection layer based on a Sign function Sign; wherein the Binarized function is shown as (I), and the Sign function is shown as (II):
Binarized(x) = 1, if x >= 0; 0, if x < 0    (I)

Sign(x) = 1, if x >= 0; 0, if x < 0    (II)
in the back propagation process, the gradient is calculated by adopting a Htanh function, wherein the Htanh function is shown as (III):
Htanh(x)=max(-1,min(1,x)) (Ⅲ)
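For illustration only (not part of the claimed method), the three functions above can be sketched in Python; the exact threshold behavior of Binarized and Sign at x = 0 is an assumption, since the original formula images for (I) and (II) are not reproduced in this text:

```python
import numpy as np

def binarized(w):
    """Assumed form of the Binarized function (I): maps real-valued
    kernel weights to {1, 0} by thresholding at zero."""
    return (w >= 0).astype(np.uint8)

def sign_act(x):
    """Assumed form of the Sign activation (II) under the patent's
    1,0 binarization scheme: thresholds layer outputs at zero."""
    return (x >= 0).astype(np.uint8)

def htanh(x):
    """Htanh(x) = max(-1, min(1, x)), formula (III), used as the
    surrogate for gradient computation in back propagation."""
    return np.maximum(-1.0, np.minimum(1.0, x))
```

In training, binarized/sign_act would be applied in the forward pass while htanh supplies the straight-through gradient in the backward pass.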
further, in step 3, the calculation formula of the sliding times c is as follows:
c = ((I - R_k) / (S * P) + 1)^2    (IV)
in formula (IV), I is the size of the input binary picture, R k To reconstruct the single-sided size of the convolution kernel, S is the convolution step size, and P is the single-sided size of the pooling layer.
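The sliding count can be checked numerically with a minimal sketch, assuming a square I x I input, integer window positions, and step m = S * P (the concrete values below are hypothetical):

```python
def sliding_count(I, R_k, S, P):
    """Sliding count c per formula (IV): a window of side R_k slides
    over an I x I input with step m = S * P in both directions."""
    m = S * P
    per_side = (I - R_k) // m + 1  # window positions along one side
    return per_side ** 2

# e.g. a 28 x 28 input, reconstructed kernel side 4, conv step 1, pool side 2
```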
Further, in step 4, the size reconfiguration means: the convolution kernel size obtained by the training of the binary neural network is changed into R k *R k Wherein R is k The formula (c) is shown in formula (v):
R k =K+(S*(P-1)) (Ⅴ)
in formula (V), K represents the one-sided size of the convolutional layer convolution kernel; s represents convolution step length, (P-1) represents that the size of a single side of a pooling layer is reduced by 1; s (P-1) represents the size of the remaining convolution kernels excluding the first convolution kernel.
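Formula (V) is easy to verify numerically; for instance, a 3 x 3 kernel with convolution step 1 and a 2 x 2 pooling layer gives a reconstructed side of 4 (example values chosen for illustration, not taken from the patent):

```python
def reconstructed_kernel_size(K, S, P):
    """Single-side size R_k of the reconstruction convolution kernel,
    formula (V): R_k = K + S * (P - 1)."""
    return K + S * (P - 1)
```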
Further, in step 4, content reconstruction means: P * P identical convolution kernels are placed at the coordinates C_F, with the convolution step S as the minimum placement unit;
the coordinate C_F is given by formula (VI):

C_F = (S * ((i - 1) mod P), S * floor((i - 1) / P)), i = 1, 2, ..., P^2    (VI)
in the formula (VI), P represents the unilateral size of the pooling layer, and i represents the ith convolution kernel;
Placing the P * P identical convolution kernels at coordinates C_F with the convolution step S as the minimum unit specifically means: the first convolution kernel is placed at coordinate (0, 0), the second at (S, 0), the P-th at ((P - 1) * S, 0), the (P + 1)-th at (0, S), the (P + 2)-th at (S, S), and the P^2-th at ((P - 1) * S, (P - 1) * S). The overlapped parts between the kernels used for reconstruction are merged according to OR logic: if any overlapping kernel holds the value 1 at a position, the reconstructed value at that position is 1; otherwise it is 0.
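The placement-with-OR-merge procedure can be sketched in NumPy; the coordinate pattern used here follows the placement rule described above (first kernel at (0, 0), then steps of S across and down), which is my reading of the text rather than a verbatim reproduction of the patent's figures:

```python
import numpy as np

def reconstruct_kernel(kernel, S, P):
    """Content reconstruction sketch: place P*P copies of the binary
    K x K kernel at offsets (S*(i % P), S*(i // P)) inside an
    R_k x R_k grid and merge overlaps with OR logic."""
    K = kernel.shape[0]
    R_k = K + S * (P - 1)                      # formula (V)
    out = np.zeros((R_k, R_k), dtype=np.uint8)
    for i in range(P * P):
        x, y = S * (i % P), S * (i // P)       # placement coordinate
        out[y:y + K, x:x + K] |= kernel        # OR merge of overlaps
    return out
```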
Further, in the step 5, performing pruning operation on the reconstructed convolution kernel refers to: splitting a reconstruction convolution kernel containing 1,0 into a reconstruction convolution kernel containing only 1 and pruning a part of 0; pruning the part of 0 refers to emptying the part corresponding to 0 in the reconstructed convolution kernel, and not taking the part as a part of the reconstructed convolution kernel.
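The pruning of step 5 amounts to discarding the 0 positions entirely and keeping only the coordinates that hold 1; a minimal sketch (illustrative only):

```python
import numpy as np

def prune_kernel(recon_kernel):
    """Pruning sketch for step 5: the reconstructed 1,0 kernel is
    reduced to the list of coordinates holding 1; the 0 part is
    emptied and no longer treated as part of the kernel."""
    ys, xs = np.nonzero(recon_kernel)
    return list(zip(ys.tolist(), xs.tolist()))
```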
Further, in step 9, performing the convolution-pooling fusion pruning operation means performing a bitwise AND between the input binary picture data intercepted by the sliding window and the reconstruction convolution kernel pruned in step 5 to obtain the convolution-pooling fusion result; during the bitwise AND, once any bit of the result is 1, the remaining bit-AND operations are skipped and the output is 1.
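The early-exit behavior described above can be sketched as a short-circuited scan over the pruned kernel's 1-positions (an illustrative model of the hardware behavior, not the hardware itself):

```python
def conv_pool_fused(window, one_positions):
    """Convolution-pooling fusion pruning sketch (step 9): AND the
    window bits against the pruned kernel's 1-positions only, and
    stop as soon as any bit yields 1."""
    for (y, x) in one_positions:
        if window[y][x] == 1:   # early exit: remaining ANDs are skipped
            return 1
    return 0
```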
Further, in step 10, performing the fusion pruning operation on the convolution-pooling fusion pruning result of step 9 and the selected part of the first-layer fully-connected-layer weights of step 6 means: after the binary picture data intercepted by the sliding window for the j-th time has completed the convolution-pooling fusion pruning operation of step 9, the first-layer fully-connected-layer operation is performed on that data; the selected part of the first-layer fully-connected-layer weights of step 6 are the weights of bits [(j - 1) * q, j * q - 1], where j denotes the j-th slide of the sliding window and q is the output size of the first fully-connected layer.
Further, in step 10, the convolution-pooling fusion pruning result of the j-th intercepted sliding-window data determines the output of the first fully-connected layer for that data: if the result is 1, the first fully-connected layer outputs the selected weight data unchanged; if the result is 0, it outputs the bitwise negation of the selected weight data. Both the first-layer fully-connected-layer weights and their bitwise negation are pre-stored in memory, so the bitwise XNOR between the convolution-pooling fusion pruning result and the first-layer fully-connected-layer weights is removed.
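This works because, for 1,0 data, XNOR with a 1 bit returns the weight unchanged while XNOR with a 0 bit returns its negation; selecting between the two pre-stored forms therefore replaces the XNOR entirely. A sketch (illustrative, not the claimed hardware):

```python
import numpy as np

def first_fc_output(fused_bit, fc_weights_j):
    """Step 10 sketch: if the fused conv-pool result for the j-th
    window is 1, output the selected q weight bits unchanged;
    if it is 0, output their bitwise negation (1,0 data)."""
    if fused_bit == 1:
        return fc_weights_j
    return 1 - fc_weights_j   # bitwise NOT for 1,0-valued weights
```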
Further, in step 12, accumulating and comparing the fusion pruning operation results output in step 10 during the traversal means: the results output in step 10 are added bit by bit over the traversal, and each accumulated bit is compared against the threshold

c / 2

where c is the sliding count of the sliding window; if the accumulated value is larger than the threshold, the output bit is 1; otherwise, the output bit is 0.
A system for realizing picture classification based on the full fusion and pruning technology of a binarization neural network, comprising:
a picture dataset acquisition and pre-processing module configured to: acquiring a picture data set, and preprocessing the picture data by adopting a 1,0 binarization method;
a reconstructed convolution kernel and pruning module configured to: reconstruct the size and content of the convolution kernel obtained by training the binarization neural network, the reconstructed kernel being named the reconstruction convolution kernel with size R_k * R_k; and prune the reconstruction convolution kernel, i.e., remove the 0 part of the reconstruction convolution kernel;
a first layer full link layer weight selection module configured to: selecting a first layer of fully-connected layer weight with the size of c x q, namely selecting a first layer of fully-connected layer weight obtained by training a binarization neural network;
a convolution-pooling fusion pruning operation module configured to: perform the convolution-pooling fusion pruning operation on the binary picture data intercepted by the sliding window, i.e., perform a bitwise AND between the intercepted data and the pruned reconstruction convolution kernel, removing the bit-AND operations that do not influence classification precision;
a fully-connected fusion pruning operation module configured to: perform the fusion pruning operation on the convolution-pooling fusion pruning result and the selected first-layer fully-connected-layer weights, i.e., determine the output of the first fully-connected layer from the convolution-pooling fusion pruning result of the currently intercepted sliding-window data, removing the bitwise XNOR between that result and the selected part of the first-layer fully-connected-layer weights;
a sliding window sliding and traversal determination module configured to: sliding the sliding window on the preprocessed binary image by step length m according to rules from left to right and from top to bottom; judging whether the sliding window completes traversal of the binary image data or not;
an accumulation and comparison operation module configured to: after the sliding window has traversed the binary picture data, perform the accumulation and comparison operation on all fusion pruning operation results;
a subsequent fully-connected-layer operation module configured to: finish the second-layer fully-connected-layer operation, i.e., perform the bitwise XNOR between the obtained accumulation and comparison result and the second-layer fully-connected-layer weights obtained by training the binarization neural network;
a picture classification result output module configured to: and taking the category corresponding to the maximum result value of the subsequent full-connection layer operation module as a final classification label of the input picture.
A computer readable storage medium, wherein a memory stores a plurality of instructions, the instructions are suitable for being loaded by a processor of a terminal device and executing the method for realizing the picture classification based on the full fusion and pruning technology of the binary neural network.
A terminal device comprises a processor and a computer readable storage medium, wherein the processor is used for realizing instructions, and the computer readable storage medium is used for storing a plurality of instructions, and the instructions are suitable for being loaded by the processor and executing the method for realizing the image classification based on the full fusion and pruning technology of the binarization neural network.
The beneficial effects of the invention are as follows:
1. The invention provides a method for realizing picture classification based on the full fusion and pruning technology of a binarization neural network, which fuses the convolution layer, the pooling layer, and the first fully-connected layer of the binarization neural network, removes via pruning the redundant operations in the fusion process that have no influence on the final precision, and realizes hardware acceleration of the binarization neural network with higher efficiency, smaller operation delay, and lower hardware resource requirements.
2. The method combines the inter-layer fusion operations of the binarization neural network with the pruning method, reducing the hardware resources required for hardware acceleration while preserving picture classification precision.
3. When the input binary picture data is large, the method realizes hardware acceleration of the binarization neural network without executing a large number of operations, saving substantial hardware resources and improving the hardware acceleration efficiency.
4. The method supports hardware acceleration of binarization neural networks whose convolution layer, pooling layer, and first fully-connected layer have various sizes.
Drawings
FIG. 1 is a schematic flow chart of a method for realizing image classification based on a full fusion and pruning technique of a binarization-based neural network according to the present invention;
FIG. 2 is a schematic diagram of a reconstructed convolution kernel in a method for realizing image classification based on full fusion and pruning technology of a binarization-based neural network;
FIG. 3 is a diagram of a binary neural network architecture of a method for implementing image classification based on full fusion and pruning techniques of a binary neural network according to the present invention;
FIG. 4 is a schematic diagram of a specific implementation process of a method for implementing image classification based on a full fusion and pruning technique of a binarization-based neural network according to the present invention;
Detailed Description
The invention is further described below, but not limited thereto, with reference to the figures and examples;
example 1
A method for realizing image classification based on full fusion and pruning technique of binary neural network features that the convolution layer, pooling layer and first full-connection layer of binary neural network are fused together, and a special pruning method is used to reduce the number of operations in fusion. The method can effectively reduce the operation time and the operation number on the premise of not reducing the picture classification precision. Because a large amount of redundant operations exist in the fusion process of the convolutional layer, the pooling layer and the first layer full-connection layer, and the redundant operations do not cause loss of the classification precision of the final picture, the fusion process can carry out pruning. As shown in fig. 1, the method comprises the following steps:
step 1: constructing a binary neural network architecture for realizing a two-classification task of the picture;
step 2: acquiring a picture data set, and carrying out graying and 1,0 binarization preprocessing on original picture data;
step 3: constructing a sliding window, i.e., a sliding window of size R_k * R_k with step length m and sliding count c, where m is the product of the convolution step length and the pooling step length;
step 4: reconstructing the convolution kernel, i.e., reconstructing the size and content of the 1,0 binarized convolution kernel obtained by training the binarization neural network; the reconstructed kernel is named the reconstruction convolution kernel and has size R_k * R_k;
step 5: pruning the reconstruction convolution kernel, i.e., removing the 0 part of the reconstruction convolution kernel;
step 6: selecting the first-layer fully-connected-layer weights of size c * q, i.e., selecting the first-layer fully-connected-layer weights obtained by training the binarization neural network, where c is the sliding count of the sliding window and q is the output size of the first fully-connected layer;
step 7: loading the binary picture preprocessed in step 2, the pruned reconstruction convolution kernel of step 5, and the first-layer fully-connected-layer weights selected in step 6 into the hardware memory;
step 8: sliding the sliding window over the binary picture preprocessed in step 2 with step length m, from left to right and from top to bottom;
step 9: performing the convolution-pooling fusion pruning operation on the binary picture data intercepted by the sliding window, i.e., performing a bitwise AND between the intercepted data and the pruned reconstruction convolution kernel, while removing the bit-AND operations that do not influence classification precision;
step 10: performing the fusion pruning operation on the convolution-pooling fusion pruning result of step 9 and the corresponding part of the first-layer fully-connected-layer weights selected in step 6, i.e., determining the output of the first fully-connected layer directly from the convolution-pooling fusion pruning result of the currently intercepted sliding-window data, thereby removing the bitwise XNOR operation between that result and the selected part of the first-layer fully-connected-layer weights;
step 11: judging whether the sliding window completes traversal of the binary image data or not, if so, entering a step 12, and ending the full-fusion pruning operation of the step 9-10; otherwise, repeating the steps 8-10;
step 12: accumulating and comparing the fused pruning operation results output in the step 10 in the traversal process;
step 13: finishing the operation of the second layer full-link layer, namely performing the exclusive-nor operation on the accumulation and comparison result obtained in the step 12 and the weight value of the second layer full-link layer obtained by the training of the binarization neural network;
step 14: and taking the category corresponding to the maximum value of the result in the step 13 as a final classification label of the input picture.
Example 2
The method for realizing image classification based on the full fusion and pruning technology of the binarization neural network in the embodiment 1 is characterized in that:
as shown in fig. 3, the network architecture of the binarization neural network comprises an input layer, a convolution layer, a pooling layer, two full-link layers, and an output layer; the input layer preprocesses the original picture into a binary picture; the convolution layer performs feature extraction on input picture data through convolution kernel, and activates the convolution layer output by a Sign activation function; the pooling layer reduces network parameters by down-sampling; the full-connection layer integrates the image characteristics extracted by the convolution pooling layer, and activates the output of the full-connection layer through a Sign activation function; the output layer may be considered a classifier for predicting the class of the input picture; the two full-connecting layers comprise a first full-connecting layer and a second full-connecting layer.
In the step 1, carrying out binarization processing on a convolution kernel based on a binary function Binarized in the forward propagation process of a binarization neural network, and carrying out binarization processing on the convolution layer and the output of a full connection layer based on a Sign function Sign; wherein the Binarized function is shown as (I), the Sign function is shown as (II):
Binarized(x) = 1, if x >= 0; 0, if x < 0    (I)

Sign(x) = 1, if x >= 0; 0, if x < 0    (II)
in the back propagation process, the gradient is calculated by adopting a Htanh function, wherein the Htanh function is shown as (III):
Htanh(x)=max(-1,min(1,x)) (Ⅲ)
in step 2, a picture data set is obtained, namely the picture data set used for the training of the binarization neural network is obtained, and the original picture data is preprocessed. In order to perform subsequent fusion and pruning operations between layers, 1,0 binarization processing is performed on input picture data, the input picture is processed into a gray image according to a gray processing principle, and the gray image is constructed into a binary image according to a selected binarization threshold value and is used as a binary picture data set of a binarization neural network. The preprocessing comprises graying and 1,0 binarization operation, wherein the graying operation refers to unifying the three-channel RGB value of each pixel point of the original picture into a single-channel value. The binarization operation is to set picture pixel values larger than a selected threshold value to 1 and picture pixel values smaller than the selected threshold value to 0.
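The preprocessing described above can be sketched as follows; the equal-weight channel averaging and the threshold value 128 are assumptions for illustration, as the patent leaves the graying principle and threshold selection open:

```python
import numpy as np

def preprocess(rgb, threshold=128):
    """Step 2 preprocessing sketch: unify the three RGB channels of
    each pixel into one gray value, then binarize to 1,0 against a
    chosen threshold (values above the threshold become 1)."""
    gray = rgb.mean(axis=2)                    # graying: single channel
    return (gray > threshold).astype(np.uint8) # 1,0 binarization
```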
In step 3, the calculation formula of the sliding times c is as follows:
c = ((I - R_k)/(S*P) + 1)^2 (IV)
in formula (IV), I is the single-side size of the input binary picture, R_k is the single-side size of the reconstructed convolution kernel, S is the convolution step size, and P is the single-side size of the pooling layer.
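Assuming the sliding step is the product m = S*P defined in step 3, the sliding count c can be computed as a short sketch (the integer division assumes the sizes divide evenly):

```python
def sliding_count(I, R_k, S, P):
    # Number of window positions per side when a window of single-side
    # size R_k slides over an I x I picture with step S * P;
    # c is the square of that count.
    per_side = (I - R_k) // (S * P) + 1
    return per_side * per_side
```

For example, a 28*28 input with R_k = 4, S = 1, P = 2 gives 13*13 = 169 window positions.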
In step 4, in order to save the data registers between the convolution layer and the pooling layer and to complete the convolution-layer and pooling-layer operations in the same step, the size and content of the K*K convolution kernel obtained by training the binarization neural network are reconstructed; the reconstructed kernel is named the reconstructed convolution kernel. A reconstructed convolution kernel of size R_k*R_k is obtained from the input binary data, the binary convolution kernel, and the convolution-layer activation function.
Size reconstruction means changing the convolution-kernel size obtained by training the binarization neural network to R_k*R_k, where R_k is given by formula (V):
R k =K+(S*(P-1)) (Ⅴ)
in formula (V), K is the single-side size of the convolution-layer convolution kernel, S is the convolution step size, and (P-1) is the single-side size of the pooling layer reduced by 1; S*(P-1) is the additional size contributed by the convolution kernels other than the first one.
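Formula (V) in code form; with the Example 3 values K = 3, S = 1, P = 2 it yields the 4*4 reconstructed size:

```python
def reconstructed_size(K, S, P):
    # R_k = K + S * (P - 1): the kernel grows by the span of the
    # extra (P - 1) shifted copies placed along each side.
    return K + S * (P - 1)
```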
In step 4, content reconstruction means: P*P identical convolution kernels are placed according to coordinate C_F, with the convolution step S as the minimum unit; coordinate C_F is given by formula (VI):
C_F = (((i-1) mod P)*S, floor((i-1)/P)*S) (VI)
in the formula (VI), P represents the unilateral size of the pooling layer, and i represents the ith convolution kernel;
Placing the P*P identical convolution kernels according to coordinate C_F, with the convolution step S as the minimum unit, specifically means: the first convolution kernel is placed at coordinate (0,0), the second at (S, 0), the P-th at ((P-1)*S, 0), the (P+1)-th at (0, S), the (P+2)-th at (S, S), and the (P*P)-th at ((P-1)*S, (P-1)*S). The overlapping parts between the kernels used for reconstruction are computed by OR logic: if a value of 1 exists in the overlapping part, the value of that part after reconstruction is 1; if no value of 1 exists, the value of that part after reconstruction is 0.
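The placement and OR-merge rule above can be sketched as follows; this is an illustrative reconstruction following the listed placements, not the patented hardware implementation:

```python
import numpy as np

def reconstruct_kernel(kernel, S, P):
    # Place P*P copies of the K x K binary kernel at offsets
    # ((i mod P)*S, (i // P)*S) for i = 0..P*P-1 (0-based form of
    # formula (VI)) and merge overlapping positions with OR logic.
    K = kernel.shape[0]
    R = K + S * (P - 1)           # formula (V)
    out = np.zeros((R, R), dtype=np.uint8)
    for i in range(P * P):
        x, y = (i % P) * S, (i // P) * S
        out[y:y + K, x:x + K] |= kernel
    return out
```

With a 3*3 all-ones kernel, S = 1 and P = 2 (the Example 3 setting), the four shifted copies cover the whole 4*4 reconstructed kernel.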
In step 5, pruning the reconstructed convolution kernel means: splitting a reconstructed kernel containing 1s and 0s into a reconstructed kernel containing only 1s and pruning away the 0 part; pruning the 0 part means the positions corresponding to 0 in the reconstructed kernel are left empty and no longer treated as part of the reconstructed convolution kernel.
In step 9, the convolution-pooling fusion and pruning operation performs a bitwise AND between the input binary picture data intercepted by the sliding window and the pruned reconstructed convolution kernel of step 5 to obtain the convolution-pooling fusion result. During the bitwise AND, as soon as the result of some bit is 1, the AND operations on the remaining bits are skipped and the output result is 1.
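A sketch of the fused operation with the early exit described above; `pruned_positions` is a hypothetical name for the list of kernel positions that remain after pruning away the 0 part:

```python
def fused_conv_pool(window, pruned_positions):
    # Bitwise AND of the window bits against the retained 1-valued
    # kernel positions: as soon as one AND yields 1, the remaining
    # bits are skipped and the fused result is 1.
    for r, c in pruned_positions:
        if window[r][c] == 1:
            return 1
    return 0
```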
In step 10, the fusion pruning operation is performed on the convolution-pooling fusion pruning result of step 9 and the part of the first-layer fully-connected-layer weights selected in step 6; that is, after the convolution-pooling fusion and pruning operation of step 7 on the binary picture data intercepted at the j-th slide of the sliding window is completed, the first-layer fully-connected-layer operation for the currently intercepted data is performed. The selected part of the step-6 weights are the weights at bits [(j-1)*q, j*q-1], where j denotes the j-th slide of the sliding window and q is the output size of the first fully-connected layer.
In step 10, the output of the first fully-connected layer for the intercepted data is determined from the convolution-pooling fusion pruning result of the data intercepted at the j-th slide of the sliding window: if that result is 1, the first-layer fully-connected output is the selected first-layer weight data; if it is 0, the output is the bitwise inversion of the selected first-layer weight data. The first-layer fully-connected-layer weights and their bitwise inversions are pre-stored in memory, which removes the bitwise XNOR operation between the convolution-pooling fusion pruning result and the first-layer weights: the first-layer output for the j-th intercepted data is read directly from memory, without completing the bit-by-bit XNOR of the fusion pruning result with the selected weights.
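The lookup trick above replaces a bit-by-bit XNOR with a table select; a minimal sketch, where the weight slice `w` is a hypothetical example value:

```python
def xnor(a, b):
    # 1-bit XNOR: 1 when the bits agree, 0 otherwise.
    return 1 - (a ^ b)

def fc1_slice(fusion_bit, w_slice, w_slice_inv):
    # The weight slice and its bitwise inversion are both pre-stored;
    # the fusion bit simply selects one of them, so no per-bit XNOR runs.
    return w_slice if fusion_bit == 1 else w_slice_inv

w = [1, 0, 1]               # hypothetical first-layer weight slice
w_inv = [1 - b for b in w]  # pre-stored bitwise inversion
```

Selecting `w` for a fusion bit of 1 and `w_inv` for a fusion bit of 0 reproduces the XNOR result bit for bit.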
In step 12, accumulating and comparing the fusion pruning operation results output by step 10 during the traversal means: the convolution-pooling fusion pruning results output by step 10 during the traversal are added bit by bit, and each bit of the sum is compared with c/2; if the sum is greater than c/2, the output result is 1, otherwise the output result is 0, where c is the number of slides of the sliding window.
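Assuming the comparison threshold is c/2 (a majority vote over the c window positions, which matches the binarized outputs), the accumulate-and-compare step can be sketched as:

```python
import numpy as np

def accumulate_compare(fc1_outputs, c):
    # Bit-by-bit accumulation of the c first-layer outputs, then a
    # per-bit comparison against c/2: a majority of 1s -> output 1, else 0.
    sums = np.sum(np.asarray(fc1_outputs), axis=0)
    return (sums > c / 2).astype(np.uint8)
```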
Example 3
As shown in fig. 4, the method for realizing picture classification based on the full fusion and pruning technique of the binarization neural network according to embodiment 2 is characterized in that:
the reconstructed convolution kernel is user-defined. A schematic diagram of the reconstructed convolution kernel of the full fusion technique and pruning method based on the binarization neural network is shown in fig. 2; the convolution-kernel size of the binarization neural network is 3*3, the pooling-layer size is 2*2, and the convolution step is 1;
and performing size reconstruction on the convolution kernel with the size of 3 x 3 obtained by training the binary neural network according to a size calculation formula of the reconstructed convolution kernel to obtain a reconstructed convolution kernel with the size of 4 x 4.
Content reconstruction of the convolution kernel: identical convolution kernels, equal in number to the pooling-layer size 2*2, are placed according to coordinate C_F with the convolution step 1 as the minimum unit. That is, the lower-left starting coordinate of the first convolution kernel is (0,0), of the second (1,0), of the third (0,1), and of the fourth (1,1). The overlapping parts between the kernels used for reconstruction are computed by OR logic: if a value of 1 exists in the overlapping part, the value of that part after reconstruction is 1; if no value of 1 exists, it is 0. Thus overlapping part A gives result a: [1 1], part B gives result b: [1 1], part C gives result c: [1 0], part D gives result d: [0 0], and overlapping part E gives the result shown in fig. 2 (rendered as an image in the original).
Example 4
The method for realizing picture classification based on the full fusion and pruning technique of the binarization neural network according to embodiment 2 is characterized in that:
the size of the convolution kernel of the selected binarization neural network is 3 × 3, the size of the pooling layer is 2 × 2, the convolution step is 1, and the pooling step is 2.
A sliding window is constructed: a sliding window of size 4*4 with step M, where M is the product of the convolution step (1) and the pooling step (2), i.e., M = 2;
the specific steps of content reconstruction for the convolution kernel include: identical convolution kernels, equal in number to the pooling-layer size 2*2, are placed according to coordinate C_F with the convolution step 1 as the minimum unit; that is, the first convolution kernel is placed at (0,0), the second at (1,0), the third at (0,1), and the fourth at (1,1), and the overlapping parts between the kernels used for reconstruction are computed by OR logic: if a value of 1 exists in the overlapping part, the value of that part after reconstruction is 1; if no value of 1 exists, it is 0.
This embodiment uses the MIT-BIH dataset (from the Massachusetts Institute of Technology and Beth Israel Hospital), one of the standard ECG datasets. Based on the recognition of two classes of ECG signals, the traditional hardware acceleration method for binarization neural networks is compared with the full fusion technique and pruning method based on the binarization neural network. Table 1 lists the voltage, clock frequency, network size, and recognition accuracy used in this example, and compares the number of lookup tables, number of registers, cycles required for a single recognition, overall latency, and number of arithmetic operations of the full fusion and pruning method against those of the traditional hardware acceleration method, together with the percentage reductions achieved.
TABLE 1
(Table 1 is rendered as images in the original publication.)
As can be seen from Table 1, the full fusion technique and pruning method based on the binarization neural network outperform the traditional hardware acceleration method for binarization neural networks in hardware resource usage, cycles per recognition, inference latency, and operation count: the number of lookup tables is reduced by about 40%, the number of registers by about 54%, the cycles required for a single recognition by about 92%, the latency by about 92% (5.88 us), and the overall number of operations by about 94.3%, greatly improving efficiency.
Example 5
A picture classification system based on the full fusion and pruning technique of a binarization neural network, comprising:
a picture dataset acquisition and pre-processing module configured to: acquiring a picture data set, and preprocessing the picture data by adopting a binarization method of 1, 0;
a reconstructed convolution kernel and pruning module configured to: perform size and content reconstruction on the convolution kernel obtained by training the binarization neural network, the reconstructed kernel being named the reconstructed convolution kernel and having size R_k*R_k; and prune the reconstructed convolution kernel, i.e., remove the 0 part of the reconstructed convolution kernel;
a first layer full link layer weight selection module configured to: selecting a first layer of fully-connected layer weight with the size of c x q, namely selecting a first layer of fully-connected layer weight obtained by training a binarization neural network;
a convolution-pooling fusion pruning operation module configured to: perform the convolution-pooling fusion pruning operation on the binary picture data intercepted by the sliding window, i.e., perform a bitwise AND between the intercepted binary picture data and the pruned reconstructed convolution kernel, removing the bit AND operations that do not affect classification accuracy during the process;
a fully-connected fusion pruning operation module configured to: perform the fusion pruning operation on the convolution-pooling fusion pruning result and the selected first-layer fully-connected-layer weights, i.e., determine the output of the first fully-connected layer from the convolution-pooling fusion pruning result of the data currently intercepted by the sliding window, removing the bitwise XNOR operation between the convolution-pooling fusion pruning result of the intercepted data and part of the first-layer fully-connected-layer weights;
a sliding window sliding and traversal determination module configured to: sliding the sliding window on the preprocessed binary image by step length m according to the rule from left to right and from top to bottom; judging whether the sliding window completes traversal of the binary image data or not;
an accumulate-and-compare operation module configured to: perform the accumulation and comparison operation on all fusion pruning operation results after the sliding window has traversed the binary picture data;
a subsequent fully-connected layer operation module configured to: complete the second-layer fully-connected-layer operation, i.e., perform the XNOR operation between the obtained accumulation and comparison result and the second-layer fully-connected-layer weights obtained by training the binarization neural network;
a picture classification result output module configured to: and taking the category corresponding to the maximum result value of the subsequent full-connection layer operation module as the final classification label of the input picture.
Example 6
A computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor of a terminal device to execute the method for realizing picture classification based on the full fusion and pruning technique of the binarization neural network according to any one of embodiments 1-4.
Example 7
A terminal device, comprising a processor and a computer-readable storage medium, wherein the processor is configured to implement instructions, and the computer-readable storage medium is configured to store a plurality of instructions, and the instructions are adapted to be loaded by the processor and execute a method for implementing image classification based on the full fusion and pruning technique of the binarized neural network according to any one of embodiments 1 to 4.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, they do not limit the scope of the present invention; those skilled in the art should understand that various modifications and variations made without inventive effort on the basis of the technical solution of the present invention still fall within its protection scope.

Claims (10)

1. A method for realizing image classification based on full fusion and pruning technology of a binary neural network is characterized by comprising the following steps:
step 1: constructing a binarization neural network architecture for realizing a picture two-classification task;
step 2: acquiring a picture data set, and carrying out graying and 1,0 binarization preprocessing on original picture data;
step 3: constructing a sliding window, i.e., a sliding window of size R_k*R_k with step m and sliding count c, where m is the product of the convolution step and the pooling step;
step 4: reconstructing the convolution kernel, i.e., reconstructing the size and content of the 1,0 binarized convolution kernel obtained by training the binarization neural network; the reconstructed kernel is named the reconstructed convolution kernel and has size R_k*R_k;
step 5: pruning the reconstructed convolution kernel, i.e., removing the 0 part of the reconstructed convolution kernel;
step 6: selecting the first-layer fully-connected-layer weights of size c*q, i.e., selecting the first-layer fully-connected-layer weights obtained by training the binarization neural network, where c is the sliding count of the sliding window and q is the output size of the first fully-connected layer;
step 7: loading the binary picture preprocessed in step 2, the reconstructed convolution kernel pruned in step 5, and the first-layer fully-connected-layer weights selected in step 6 into the hardware memory;
step 8: sliding the sliding window over the binary picture preprocessed in step 2 with step m, from left to right and from top to bottom;
step 9: performing the convolution-pooling fusion pruning operation on the binary picture data intercepted by the sliding window, i.e., performing a bitwise AND between the intercepted binary picture data and the pruned reconstructed convolution kernel, removing the bit AND operations that do not affect classification accuracy during the process;
step 10: performing the fusion pruning operation on the convolution-pooling fusion pruning result obtained in step 9 and the part of the first-layer fully-connected-layer weights selected in step 6, i.e., determining the output of the first fully-connected layer from the convolution-pooling fusion pruning result of the data currently intercepted by the sliding window, and removing the bitwise XNOR operation between the convolution-pooling fusion pruning result of the intercepted data and part of the first-layer fully-connected-layer weights;
step 11: judging whether the sliding window completes traversal of the binary image data or not, if so, entering a step 12, and ending the full-fusion pruning operation of the step 9-10; otherwise, repeating the step 8-10;
step 12: accumulating and comparing the fused pruning operation results output in the step 10 in the traversal process;
step 13: finishing the operation of the second layer full-link layer, namely performing the exclusive-nor operation on the accumulation and comparison result obtained in the step 12 and the weight value of the second layer full-link layer obtained by the training of the binarization neural network;
step 14: and taking the category corresponding to the maximum value of the result in the step 13 as a final classification label of the input picture.
2. The method for realizing image classification based on the full fusion and pruning technique of the binarization neural network according to claim 1, wherein the network architecture of the binarization neural network comprises an input layer, a convolution layer, a pooling layer, two fully-connected layers, and an output layer; the input layer preprocesses the original picture into a binary picture; the convolution layer extracts features from the input picture data through convolution kernels and activates the convolution-layer output with a Sign activation function; the pooling layer reduces the network parameters by down-sampling; the fully-connected layers integrate the image features extracted by the convolution and pooling layers and activate their outputs with the Sign activation function; the output layer can be regarded as a classifier that predicts the class of the input picture; the two fully-connected layers comprise a first fully-connected layer and a second fully-connected layer.
3. The method for realizing image classification based on the full fusion and pruning technology of the binarization neural network as claimed in claim 1, wherein in the step 1, in the forward propagation process of the binarization neural network, the binarization processing is performed on the convolution kernel based on a binary function Binarized, and the binarization processing is performed on the convolution layer and the full connection layer output based on a Sign function Sign; wherein the Binarized function is shown as (I), the Sign function is shown as (II):
Binarized(W) = {1, W ≥ 0; 0, W < 0} (I)

Sign(x) = {1, x ≥ 0; 0, x < 0} (II)
in the back propagation process, the gradient is calculated by using a Htanh function, wherein the Htanh function is shown as (III):
Htanh(x) = max(-1, min(1, x)) (III).
4. the method for realizing image classification based on the full fusion and pruning technology of the binarization neural network as claimed in claim 1, wherein in the step 3, a calculation formula of the sliding times c is as follows:
c = ((I - R_k)/(S*P) + 1)^2 (IV)
in formula (IV), I is the single-side size of the input binary picture, R_k is the single-side size of the reconstructed convolution kernel, S is the convolution step size, and P is the single-side size of the pooling layer;
in step 4, size reconstruction means: the convolution-kernel size obtained by training the binarization neural network is changed to R_k*R_k, where R_k is given by formula (V):
R k =K+(S*(P-1)) (V)
in formula (V), K is the single-side size of the convolution-layer convolution kernel, S is the convolution step size, and (P-1) is the single-side size of the pooling layer reduced by 1; S*(P-1) is the additional size contributed by the convolution kernels other than the first one;
content reconstruction means: P*P identical convolution kernels are placed according to coordinate C_F, with the convolution step S as the minimum unit; coordinate C_F is given by formula (VI):
C_F = (((i-1) mod P)*S, floor((i-1)/P)*S) (VI)
in the formula (VI), P represents the unilateral size of the pooling layer, and i represents the ith convolution kernel;
placing the P*P identical convolution kernels according to coordinate C_F with the convolution step S as the minimum unit specifically means: the first convolution kernel is placed at coordinate (0,0), the second at (S, 0), the P-th at ((P-1)*S, 0), the (P+1)-th at (0, S), the (P+2)-th at (S, S), and the (P*P)-th at ((P-1)*S, (P-1)*S); the overlapping parts between the kernels used for reconstruction are computed by OR logic: if a value of 1 exists in the overlapping part, the value of that part after reconstruction is 1; if no value of 1 exists, the value of that part after reconstruction is 0.
5. The method for realizing image classification based on the full fusion and pruning technique of the binarization neural network according to claim 1, wherein in step 5, pruning the reconstructed convolution kernel means: splitting a reconstructed kernel containing 1s and 0s into a reconstructed kernel containing only 1s and pruning away the 0 part; pruning the 0 part means the positions corresponding to 0 in the reconstructed kernel are left empty and no longer treated as part of the reconstructed convolution kernel;
in step 9, the convolution-pooling fusion and pruning operation performs a bitwise AND between the input binary picture data intercepted by the sliding window and the pruned reconstructed convolution kernel of step 5 to obtain the convolution-pooling fusion result; during the bitwise AND, as soon as the result of some bit is 1, the AND operations on the remaining bits are skipped and the output result is 1.
6. The method for realizing image classification based on the full fusion and pruning technique of the binarization neural network according to claim 1, wherein in step 10, the fusion pruning operation is performed on the convolution-pooling fusion pruning result of step 9 and the part of the first-layer fully-connected-layer weights selected in step 6; that is, after the convolution-pooling fusion and pruning operation of step 7 on the binary picture data intercepted at the j-th slide of the sliding window is completed, the first-layer fully-connected-layer operation for the currently intercepted data is performed; the selected part of the step-6 weights are the weights at bits [(j-1)*q, j*q-1], where j denotes the j-th slide of the sliding window and q is the output size of the first fully-connected layer;
the output of the first fully-connected layer for the intercepted data is determined from the convolution-pooling fusion pruning result of the data intercepted at the j-th slide of the sliding window: if that result is 1, the first-layer fully-connected output is the selected first-layer weight data; if it is 0, the output is the bitwise inversion of the selected first-layer weight data; the first-layer fully-connected-layer weights and their bitwise inversions are pre-stored in memory, removing the bitwise XNOR operation between the convolution-pooling fusion pruning result and the first-layer weights.
7. The method for realizing image classification based on the full fusion and pruning technique of the binarization neural network according to any one of claims 1-6, wherein in step 12, accumulating and comparing the fusion pruning operation results output by step 10 during the traversal means: the convolution-pooling fusion pruning results output by step 10 during the traversal are added bit by bit, and each bit of the sum is compared with c/2; if the sum is greater than c/2, the output result is 1, otherwise the output result is 0, where c is the number of slides of the sliding window.
8. A picture classification system based on the full fusion and pruning technique of a binarization neural network, characterized by comprising:
a picture dataset acquisition and pre-processing module configured to: acquiring a picture data set, and preprocessing the picture data by adopting a binarization method of 1, 0;
a reconstructed convolution kernel and pruning module configured to: perform size and content reconstruction on the convolution kernel obtained by training the binarization neural network, the reconstructed kernel being named the reconstructed convolution kernel and having size R_k*R_k; and prune the reconstructed convolution kernel, i.e., remove the 0 part of the reconstructed convolution kernel;
a first layer full link layer weight selection module configured to: selecting a first layer of fully-connected layer weight with the size of c x q, namely selecting a first layer of fully-connected layer weight obtained by training a binarization neural network;
a convolution-pooling fusion pruning operation module configured to: perform the convolution-pooling fusion pruning operation on the binary picture data intercepted by the sliding window, i.e., perform a bitwise AND between the intercepted binary picture data and the pruned reconstructed convolution kernel, removing the bit AND operations that do not affect classification accuracy during the process;
a full-connectivity fusion pruning operation module configured to: performing fusion pruning operation on the convolution-pooling fusion pruning result and the selected first layer full-connection layer weight, namely determining the output result of the first layer full-connection layer according to the convolution-pooling fusion pruning result of the current intercepted data of the sliding window, and removing the bit identity or operation of the convolution-pooling fusion pruning result of the intercepted data of the sliding window and part of the first layer full-connection layer weight;
a sliding window sliding and traversal determination module configured to: sliding the sliding window on the preprocessed binary image by step length m according to rules from left to right and from top to bottom; judging whether the sliding window completes traversal of the binary image data or not;
a accumulate and compare operation module configured to: after traversing the binary image data, the sliding window performs accumulation and comparison operation on all fusion pruning operation results;
a subsequent fully-connected layer operation module configured to: finishing the operation of the second layer full-link layer, namely performing exclusive OR operation on the obtained accumulation and comparison result and the second layer full-link layer weight obtained by the training of the binarization neural network;
a picture classification result output module configured to: and taking the category corresponding to the maximum result value of the subsequent full-connection layer operation module as the final classification label of the input picture.
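The module pipeline of claim 8 can be sketched end to end in Python. This is a minimal illustrative sketch, not the patented implementation: the kernel, the stand-in OR reduction for pooling, the binarization threshold of half the input length, and all sizes and weights below are invented for demonstration.

```python
def bitwise_and_conv(window, kernel):
    """Convolution-pooling fusion on {0,1} data: bitwise AND of the
    intercepted window with the pruned kernel, reduced with OR
    (an assumed stand-in for max pooling over the window)."""
    anded = [w & k for w, k in zip(window, kernel)]
    return 1 if any(anded) else 0

def xnor_layer(bits, weight_rows):
    """Fully-connected layer on {0,1} data: XNOR with each weight row,
    popcount, then binarize against half the input length (assumed)."""
    out = []
    for row in weight_rows:
        popcount = sum(1 - (b ^ w) for b, w in zip(bits, row))
        out.append(1 if popcount > len(bits) / 2 else 0)
    return out

def classify(windows, kernel, fc1, fc2):
    """Full pipeline: conv-pool fusion per window, first FC layer,
    second FC layer, then argmax for the class label."""
    feats = [bitwise_and_conv(w, kernel) for w in windows]
    hidden = xnor_layer(feats, fc1)
    scores = [sum(1 - (h ^ w) for h, w in zip(hidden, row)) for row in fc2]
    return scores.index(max(scores))  # category of the maximum value
```

A usage example with three 4-bit windows, two hidden units, and two output classes: `classify([[1,0,1,1],[0,0,1,0],[1,1,1,1]], [1,0,1,0], [[1,1,0],[0,0,0]], [[1,0],[0,1]])`.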
9. A computer-readable storage medium, characterized in that a plurality of instructions are stored therein, the instructions being adapted to be loaded by a processor of a terminal device to execute the method for realizing picture classification based on the full fusion and pruning technology of the binarized neural network as claimed in any one of claims 1-7.
10. A terminal device, comprising a processor and a computer-readable storage medium, the processor being configured to implement instructions and the computer-readable storage medium being configured to store a plurality of instructions, the instructions being adapted to be loaded by the processor to execute the method for realizing picture classification based on the full fusion and pruning technology of the binarized neural network as claimed in any one of claims 1-7.
CN202210773759.8A 2022-07-01 2022-07-01 Method and system for realizing picture classification based on full fusion and pruning technology of binary neural network Pending CN115082738A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210773759.8A CN115082738A (en) 2022-07-01 2022-07-01 Method and system for realizing picture classification based on full fusion and pruning technology of binary neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210773759.8A CN115082738A (en) 2022-07-01 2022-07-01 Method and system for realizing picture classification based on full fusion and pruning technology of binary neural network

Publications (1)

Publication Number Publication Date
CN115082738A true CN115082738A (en) 2022-09-20

Family

ID=83257927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210773759.8A Pending CN115082738A (en) 2022-07-01 2022-07-01 Method and system for realizing picture classification based on full fusion and pruning technology of binary neural network

Country Status (1)

Country Link
CN (1) CN115082738A (en)

Similar Documents

Publication Publication Date Title
CN109740534B (en) Image processing method, device and processing equipment
US9424493B2 (en) Generic object detection in images
CN108764317B (en) Residual convolutional neural network image classification method based on multipath feature weighting
CN113033537B (en) Method, apparatus, device, medium and program product for training a model
CN109934285B (en) Deep learning-based image classification neural network compression model system
CN113822209B (en) Hyperspectral image recognition method and device, electronic equipment and readable storage medium
KR20180048930A (en) Enforced scarcity for classification
US11531873B2 (en) Convolution acceleration with embedded vector decompression
CN110097084B (en) Knowledge fusion method for training multitask student network through projection characteristics
CN110490203A (en) Image partition method and device, electronic equipment and computer readable storage medium
Fujii et al. A threshold neuron pruning for a binarized deep neural network on an FPGA
CN116129289A (en) Attention edge interaction optical remote sensing image saliency target detection method
Ujiie et al. Approximated prediction strategy for reducing power consumption of convolutional neural network processor
CN116721460A (en) Gesture recognition method, gesture recognition device, electronic equipment and storage medium
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
WO2024060839A1 (en) Object operation method and apparatus, computer device, and computer storage medium
US20220066776A1 (en) Accelerating processor based artificial neural network computation
CN110033443B (en) Display panel defect detection method
CN111753736A (en) Human body posture recognition method, device, equipment and medium based on packet convolution
CN115082738A (en) Method and system for realizing picture classification based on full fusion and pruning technology of binary neural network
CN107247944B (en) Face detection speed optimization method and device based on deep learning
Liu et al. Tcp-net: Minimizing operation counts of binarized neural network inference
Wang et al. An FPGA-based online reconfigurable CNN edge computing device for object detection
CN113837062A (en) Classification method and device, storage medium and electronic equipment
KR20220068357A (en) Deep learning object detection processing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination