CN113516171A - Image classification method based on Bayesian neural network random addition decomposition structure - Google Patents

Image classification method based on Bayesian neural network random addition decomposition structure

Info

Publication number
CN113516171A
CN113516171A (application CN202110544590.4A)
Authority
CN
China
Prior art keywords
scaling factor
neural network
inner product
bayesian neural
bit stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110544590.4A
Other languages
Chinese (zh)
Other versions
CN113516171B (en)
Inventor
姜书艳 (Jiang Shuyan)
孙召曦 (Sun Zhaoxi)
许怡楠 (Xu Yinan)
黄乐天 (Huang Letian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202110544590.4A priority Critical patent/CN113516171B/en
Publication of CN113516171A publication Critical patent/CN113516171A/en
Application granted granted Critical
Publication of CN113516171B publication Critical patent/CN113516171B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155Bayesian classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/29Graphical models, e.g. Bayesian networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/042Knowledge-based neural networks; Logical representations of neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • G06N5/046Forward inferencing; Production systems
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Neurology (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image classification method based on a Bayesian neural network random addition decomposition structure. The method scales the attribute parameters of a Bayesian neural network to obtain scaled attribute parameters and converts them into random bit stream data; determines the input number of a reference multiplexer based on the network structure of the Bayesian neural network; determines the input number and the number of intermediate multiplexers according to the input number of the reference multiplexer; determines the inner product scaling factor of an inner product operation unit based on the parameter scaling factors and the input number and number of the intermediate multiplexers; determines the inner product operation output result of the inner product operation unit according to the random bit stream data and the intermediate multiplexers; determines an output scale factor based on the inner product scaling factor and the parameter scaling factors; and outputs the final output result of the Bayesian neural network to complete image classification. Classifying images with this method greatly reduces the hardware implementation cost of classification.

Description

Image classification method based on Bayesian neural network random addition decomposition structure
Technical Field
The invention belongs to the technical field of image classification, and particularly relates to an image classification method based on a Bayesian neural network random addition decomposition structure.
Background
Image classification is currently among the most active research directions in artificial intelligence and computer vision, and using neural networks for image classification has become the mainstream approach in recent years.
In the prior art, a Bayesian neural network is generally adopted for image classification. Such an approach requires a Gaussian random number generator, multipliers, and adders to work together, so the hardware implementation cost of classification is excessively high.
Therefore, how to reduce the hardware implementation cost of image classification with a Bayesian neural network while improving the user experience is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to solve the technical problem that the hardware implementation cost is too high when a Bayesian neural network is used for image classification in the prior art, and provides an image classification method based on a Bayesian neural network random addition decomposition structure.
The technical scheme of the invention is as follows: the image classification method based on the Bayesian neural network random addition decomposition structure comprises the following steps:
S1, scaling the attribute parameters of the Bayesian neural network according to parameter scaling factors to obtain scaled attribute parameters, and converting the scaled attribute parameters into random bit stream data, wherein the attribute parameters comprise input data, weight parameters and bias parameters, and the input data is specifically the first image data to be classified in an image data set to be classified;
S2, determining the input number of the reference multiplexer based on the network structure of the Bayesian neural network;
S3, decomposing the reference multiplexer into a plurality of intermediate multiplexers, and determining the input number and the number of the intermediate multiplexers according to the input number of the reference multiplexer;
S4, determining an inner product scaling factor of an inner product operation unit based on the parameter scaling factor and the input number and the number of the intermediate multiplexers, and determining an inner product operation output result of the inner product operation unit according to the random bit stream data and the intermediate multiplexers;
S5, determining an output scale factor of the Bayesian neural network based on the inner product scaling factor and the parameter scaling factor, and outputting a final output result of the Bayesian neural network according to the inner product operation output result and the output scale factor, so as to complete image classification of the first image data to be classified;
and S6, performing image classification on the remaining image data to be classified in the image data set to be classified with the Bayesian neural network having the determined number of the intermediate multiplexers.
Further, the parameter scaling factors specifically include an input data scaling factor, a weight parameter scaling factor, and a bias parameter scaling factor; the scaled attribute parameters specifically include scaled floating-point input data, a scaled floating-point weight parameter, and a scaled floating-point bias parameter; and the random bit stream data includes an input data bit stream, a weight parameter bit stream, and a bias parameter bit stream.
Further, step S2 specifically determines the number of neurons in each layer of the Bayesian neural network according to the network structure of the Bayesian neural network, where the input number of the reference multiplexer is the number of neurons in the previous layer, and each neuron in every layer except the input layer of the Bayesian neural network has one reference multiplexer.
Further, in the step S3, the number of the intermediate multiplexers is smaller than the number of the inputs of the reference multiplexer, and a value obtained by dividing the number of the inputs of the reference multiplexer by the number of the intermediate multiplexers is a positive integer and equal to the number of the inputs of the intermediate multiplexers.
Further, in step S4, the weight parameter scaling factor, the input data parameter scaling factor, the number of inputs of the intermediate multiplexers, and the number of intermediate multiplexers are multiplied to obtain the inner product scaling factor of the inner product operation unit.
Further, the determining the output scale factor of the bayesian neural network in the step S5 specifically includes the following sub-steps:
s51, taking the least common multiple of the inner product scaling factor and the bias parameter scaling factor as a common scaling factor;
s52, determining a neuron scaling factor of a single neuron according to the common scaling factor;
s53, determining the neuron scaling factors corresponding to all neurons in different layers of the Bayesian neural network;
and S54, taking the largest neuron scaling factor in different layers of the Bayesian neural network as the layer scaling factor of the corresponding layer, and taking the layer scaling factor of the last layer of the Bayesian neural network as the output scaling factor.
Further, the step S5 of outputting the final output result of the bayesian neural network specifically includes the following sub-steps:
s55, inputting the inner product operation output result and the bias parameter bit stream into a bias addition unit for operation, and determining the output bit stream of the bias addition unit;
s56, inputting the output bit stream of the bias addition unit into an activation function for activation to obtain a neuron output result of the neuron;
s57, performing the Bayesian neural network forward reasoning based on the neuron output result and the layer scaling factor, thereby obtaining output bit stream data of the Bayesian neural network;
and S58, converting the output bit stream data of the Bayesian neural network into floating point result data represented by a floating point through a counter, and amplifying the floating point result data according to the output scale factor to obtain the final output result.
Further, between step S51 and step S52, the method further includes comparing the magnitudes of the inner product scaling factor and the bias parameter scaling factor: if the inner product scaling factor is smaller than the bias parameter scaling factor, the inner product scaling factor is updated according to the common scaling factor and the inner product operation result is rescaled according to the updated inner product scaling factor; if the inner product scaling factor is larger than the bias parameter scaling factor, the bias parameter scaling factor is updated according to the common scaling factor and the bias parameter bit stream is rescaled according to the updated bias parameter scaling factor.
Further, in step S57, the forward inference of the Bayesian neural network is performed as follows: the neuron scaling factor is updated based on the layer scaling factor, the neuron output result is rescaled according to the updated neuron scaling factor, and forward reasoning of the Bayesian neural network is then performed based on the rescaled neuron output result.
Further, the determination of the inner product operation output result of the inner product operation unit in step S4 is specifically as follows: the random bit stream data are input into all the intermediate multiplexers, the output result of each intermediate multiplexer is determined, and the output results corresponding to all the intermediate multiplexers are accumulated by the accumulation multiplexer to obtain the inner product operation output result, where the inner product operation output result is a bit stream reduced by the inner product scaling factor.
Further, between step S5 and step S6, the method further includes:
A1, performing forward reasoning of the Bayesian neural network a preset number of times in the random computation mode, and obtaining by simulation a first neuron output value distribution produced by random computation;
A2, performing forward reasoning of the Bayesian neural network a preset number of times in the floating point operation mode, and obtaining by simulation a second neuron output value distribution computed from random sampling values of the weight and bias distributions;
A3, judging whether the second prediction accuracy corresponding to the second neuron output value distribution minus the first prediction accuracy corresponding to the first neuron output value distribution is smaller than a preset threshold; if so, keeping the number of the intermediate multiplexers unchanged; if not, sequentially increasing or decreasing the number of the intermediate multiplexers until the difference is smaller than the preset threshold.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention classifies images in a Bayesian neural network random computation mode; compared with traditional image classification using a Bayesian neural network, this greatly reduces the hardware implementation cost of image classification and improves the user experience.
(2) The invention sets a reference multiplexer in the Bayesian neural network, determines the input number and the number of the intermediate multiplexers according to the reference multiplexer, determines the inner product scaling factor of the inner product operation unit through the intermediate multiplexers and the parameter scaling factors, and determines the inner product operation output result of the inner product operation unit according to the intermediate multiplexers and the random bit stream data. Using a larger number of intermediate multiplexers for the input data reduces the error introduced by direct sampling in the reference multiplexer, effectively reduces the error of the Bayesian neural network, and improves the accuracy of its random addition operation.
(3) By adjusting the number and the input number of the intermediate multiplexers, the invention improves the accuracy of the Bayesian neural network's random addition without introducing additional complex structures, maintains the simplified structure of the Bayesian neural network, avoids heavy computation, and preserves the working efficiency of the Bayesian neural network.
(4) The invention obtains the first neuron output value distribution in the random computation mode and the second neuron output value distribution in the floating point operation mode, and adjusts the number of the intermediate multiplexers by comparing the two distributions, so that the error level introduced into the Bayesian neural network can be controlled.
Drawings
Fig. 1 is a schematic flowchart of an image classification method based on a bayesian neural network random additive decomposition structure according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an inner product operation unit according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As described in the background art, when a Bayesian neural network is used for image classification in the prior art, a Gaussian random number generator, multipliers, and adders need to work together, and the hardware implementation cost of image classification is high.
Therefore, the application provides an image classification method based on a Bayesian neural network random addition decomposition structure, which is used for solving the technical problem that hardware implementation cost is high when the Bayesian neural network is used for image classification in the prior art.
Fig. 1 is a schematic flow chart of an image classification method based on a Bayesian neural network random addition decomposition structure according to an embodiment of the present invention; the method includes the following steps.
Step S1, scaling the attribute parameters of the Bayesian neural network according to parameter scaling factors to obtain scaled attribute parameters, and converting the scaled attribute parameters into random bit stream data, where the attribute parameters include input data, weight parameters, and bias parameters, and the input data is specifically the first image data to be classified in the image data set to be classified.
The attribute parameters specifically include the input data, weight parameters, and bias parameters in floating point form; the parameter scaling factors specifically include an input data scaling factor, a weight parameter scaling factor, and a bias parameter scaling factor; the scaled attribute parameters specifically include the scaled floating-point input data, weight parameter, and bias parameter; and the random bit stream data include an input data bit stream, a weight parameter bit stream, and a bias parameter bit stream.
In practical applications, the data that is usually required to be classified is a data set, that is, an image data set to be classified, and the data set includes a plurality of image data to be classified.
Specifically, the attribute parameters may first be reduced using an integer power of 2 as the scaling factor, limiting each attribute parameter to the range representable by random computation. Let the data range of the input data x be [m_x, n_x], the data range of the weight parameter be [m_w, n_w], and the data range of the bias parameter be [m_b, n_b]. The parameter scaling factors may then be determined by the following formulas:
input data scaling factor:
s_x = 2^⌈log2(max(|m_x|, |n_x|))⌉
weight parameter scaling factor:
s_w = 2^⌈log2(max(|m_w|, |n_w|))⌉
bias parameter scaling factor:
s_b = 2^⌈log2(max(|m_b|, |n_b|))⌉
The scaled floating-point input data x_f, floating-point weight parameter w_f, and floating-point bias parameter b_f are given respectively by:
x_f = x / s_x
w_f = w / s_w
b_f = b / s_b
where x denotes the input data, w the weight parameter, and b the bias parameter; x_f, w_f, and b_f denote the corresponding scaled floating-point values; s_x, s_w, and s_b denote the scaling factors of x, w, and b; and m and n denote the left and right limits of the respective data ranges.
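For illustration only (this sketch is not part of the patent; the helper name and the example data ranges are assumptions), the scaling step can be modeled in a few lines of Python:

    import math

    def power_of_two_scaling_factor(lo: float, hi: float) -> float:
        """Smallest integer power of 2 that maps [lo, hi] into [-1, 1]."""
        bound = max(abs(lo), abs(hi))
        return 2.0 ** math.ceil(math.log2(bound)) if bound > 1.0 else 1.0

    # Hypothetical data ranges for the input data x, weight w and bias b:
    s_x = power_of_two_scaling_factor(-3.2, 5.7)   # 8.0
    s_w = power_of_two_scaling_factor(-1.8, 1.5)   # 2.0
    s_b = power_of_two_scaling_factor(-0.6, 0.9)   # 1.0

    x, w, b = 5.7, -1.8, 0.9
    x_f, w_f, b_f = x / s_x, w / s_w, b / s_b      # all now lie in [-1, 1]
    print(x_f, w_f, b_f)                           # 0.7125 -0.9 0.9

Because each factor is a power of 2, the final rescaling in step S5 reduces to inexpensive shifts in hardware.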
The scaled attribute parameters can be converted into random bit stream form by using a linear feedback shift register as the random number generator together with a comparator, yielding an input data bit stream x', a weight parameter bit stream w', and a bias parameter bit stream b'.
It should be noted that the above-mentioned scaling of the attribute parameter and the conversion of the scaled attribute parameter into the random bit stream are only one specific implementation manner in the present application, and those skilled in the art may flexibly select other manners according to actual situations, which does not affect the protection scope of the present application.
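As one possible software model of such an LFSR-plus-comparator conversion (the tap positions, stream length, and seed below are assumptions, not values specified by the patent):

    def lfsr16(seed: int):
        """16-bit Fibonacci LFSR (taps 16, 14, 13, 11 - a maximal-length choice)."""
        state = seed & 0xFFFF
        while True:
            bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (bit << 15)
            yield state

    def to_bitstream(value: float, length: int = 4096, seed: int = 0xACE1):
        """Comparator stage of a bipolar encoder: P(bit = 1) = (value + 1) / 2."""
        threshold = int((value + 1.0) / 2.0 * 0xFFFF)
        rng = lfsr16(seed)
        return [1 if next(rng) < threshold else 0 for _ in range(length)]

    def from_bitstream(bits) -> float:
        """Counter-style back conversion: value = 2 * P(1) - 1."""
        return 2.0 * sum(bits) / len(bits) - 1.0

    x_stream = to_bitstream(0.7125)        # e.g. the scaled input x_f from above
    print(from_bitstream(x_stream))        # approximately 0.7125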
Step S2, determining the input number of the reference multiplexer based on the network structure of the Bayesian neural network.
Specifically, the number of neurons in each layer of the Bayesian neural network is first determined according to the network structure of the Bayesian neural network; the input number of the reference multiplexer is the number of neurons in the previous layer, and one reference multiplexer is arranged in each neuron of every layer except the input layer of the Bayesian neural network.
Step S3, decomposing the reference multiplexer into a plurality of intermediate multiplexers, and determining the input number and the number of the intermediate multiplexers according to the input number of the reference multiplexer.
In step S3, the number of the intermediate multiplexers is smaller than the input number of the reference multiplexer, and the value obtained by dividing the input number of the reference multiplexer by the number of the intermediate multiplexers is a positive integer equal to the input number of each intermediate multiplexer.
It should be noted that in the present application the reference multiplexer is decomposed into a plurality of intermediate multiplexers, and the number and input number of the intermediate multiplexers are determined according to the input number of the reference multiplexer. This reduces the information loss that the multiplexer suffers during sampling in the conventional technique and greatly reduces the operation error of the Bayesian neural network during random computation.
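To make the constraint concrete, a small illustrative helper (the 512-input example is a hypothetical layer size) enumerates the admissible decompositions:

    def admissible_decompositions(n: int):
        """All (K, N/K) pairs with K < N and N/K a positive integer (step S3)."""
        return [(k, n // k) for k in range(1, n) if n % k == 0]

    # A hypothetical 512-input reference multiplexer (512-neuron previous layer):
    for k, fan_in in admissible_decompositions(512)[:4]:
        print(f"K = {k:3d} intermediate multiplexers, {fan_in} inputs each")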
Step S4, determining an inner product scaling factor of an inner product operation unit based on the parameter scaling factor and the input number and number of the intermediate multiplexer, and determining an inner product operation output result of the inner product operation unit according to the random bit stream data and the intermediate multiplexer.
The weight parameter scaling factor, the input data scaling factor, the input number of the intermediate multiplexers, and the number of the intermediate multiplexers are multiplied to obtain the inner product scaling factor of the inner product operation unit.
Specifically, multiplying the weight parameter scaling factor, the input data scaling factor, and the input number of the intermediate multiplexers gives the scaling factor s_dot1 of a single intermediate multiplexer:
s_dot1 = (N/K)·s_w·s_x
Multiplying the scaling factor of the intermediate multiplexer by the number of the intermediate multiplexers gives the inner product scaling factor s_dot of the inner product operation unit:
s_dot = K·s_dot1
where N is the input number of the reference multiplexer, K is the number of the intermediate multiplexers, s_w is the weight parameter scaling factor, and s_x is the input data scaling factor.
The determination of the inner product operation output result of the inner product operation unit in step S4 is specifically as follows: the random bit stream data are input into all the intermediate multiplexers, the output result of each intermediate multiplexer is determined, and the output results corresponding to all the intermediate multiplexers are accumulated by the accumulation multiplexer to obtain the inner product operation output result, which is a bit stream reduced by the inner product scaling factor.
The product of the input data bit stream and the weight parameter bit stream can be realized by an exclusive-OR gate circuit in the Bayesian neural network, and the product results are accumulated as equally weighted inputs by the accumulation multiplexer to serve as the inner product output result of the inner product operation unit.
Before the random bit stream data are fed into the intermediate multiplexers, the input data bit stream and the network weight parameter bit stream are thus multiplied by the exclusive-OR gate circuit.
In this embodiment, fig. 2 is a schematic structural diagram of the inner product operation unit. First, the input number N of the reference multiplexer is determined according to the neuron counts of the different layers; then the number K and the input number N/K of the intermediate multiplexers are determined, and the intermediate output result y′_j of each intermediate multiplexer is calculated as:
y′_j = (K/N) · Σ_i w′_{j,i}·x′_{j,i}, the sum running over the N/K inputs i of the j-th intermediate multiplexer.
The output results y′ of all the intermediate multiplexers are then accumulated by the accumulation multiplexer to obtain the inner product operation output bit stream data y:
y = (1/K) · Σ_j y′_j, the sum running over the K intermediate multiplexers,
where j indexes the intermediate multiplexers, i indexes the inputs of an intermediate multiplexer, w′ denotes the weight parameter bit stream, x′ denotes the input data bit stream, w′·x′ denotes the product of the input data bit stream and the weight parameter bit stream computed by the exclusive-OR gate circuit, K is the number of the intermediate multiplexers, and N/K is the input number of each intermediate multiplexer.
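The decomposed inner product can be simulated in software as follows. This is a sketch under stated assumptions: bit streams are generated by plain Bernoulli sampling rather than an LFSR, and bipolar multiplication is written as XNOR (the complement of the XOR gate named above) so that the decoded value matches the product without a sign flip:

    import random

    def encode(v: float, length: int):
        """Bipolar stochastic stream with P(1) = (v + 1) / 2."""
        return [1 if random.random() < (v + 1) / 2 else 0 for _ in range(length)]

    def decode(bits) -> float:
        return 2.0 * sum(bits) / len(bits) - 1.0

    def mux_select(streams):
        """Stochastic MUX addition: each clock cycle outputs one uniformly
        selected input bit, so the output value is the mean of the inputs."""
        return [streams[random.randrange(len(streams))][t]
                for t in range(len(streams[0]))]

    def stochastic_inner_product(x_streams, w_streams, K: int):
        """K intermediate MUXes over N/K product streams each, then one
        K-input accumulation MUX; the result carries the value (1/N)*sum(w*x)."""
        N = len(x_streams)
        assert K < N and N % K == 0
        group = N // K
        # Bipolar multiplication: XNOR (output 1 when the two bits agree).
        products = [[1 - (xb ^ wb) for xb, wb in zip(xs, ws)]
                    for xs, ws in zip(x_streams, w_streams)]
        partials = [mux_select(products[j * group:(j + 1) * group])
                    for j in range(K)]
        return mux_select(partials)

    random.seed(1)
    L = 1 << 14
    x_f = [0.5, -0.25, 0.75, 0.1]    # scaled inputs, N = 4
    w_f = [0.8, 0.4, -0.6, -0.2]     # scaled weights
    y = stochastic_inner_product([encode(v, L) for v in x_f],
                                 [encode(v, L) for v in w_f], K=2)
    print(decode(y))                                   # stochastic estimate
    print(sum(a * b for a, b in zip(x_f, w_f)) / 4)    # exact value: -0.0425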
Step S5, determining an output scale factor of the Bayesian neural network based on the inner product scaling factor and the parameter scaling factor, and outputting a final output result of the Bayesian neural network according to the inner product operation output result and the output scale factor, thereby completing image classification of the first image data to be classified.
The determination of the output scale factor of the Bayesian neural network in step S5 specifically includes the following sub-steps:
s51, taking the least common multiple of the inner product scaling factor and the bias parameter scaling factor as a common scaling factor;
s52, determining a neuron scaling factor of a single neuron according to the common scaling factor;
s53, determining the neuron scaling factors corresponding to all neurons in different layers of the Bayesian neural network;
and S54, taking the largest neuron scaling factor in different layers of the Bayesian neural network as the layer scaling factor of the corresponding layer, and taking the layer scaling factor of the last layer of the Bayesian neural network as the output scaling factor.
Specifically, the common scaling factor s′ is given by:
s′ = max{s_dot, s_b}
(since both factors are integer powers of 2, their least common multiple is simply their maximum), and the neuron scaling factor s_z is given by:
s_z = 2·s′
where s_dot is the inner product scaling factor and s_b is the bias parameter scaling factor.
The layer scaling factor s_L of the i-th layer is given by:
s_L = max{s_z,1, s_z,2, …, s_z,n}
where s_z,k is the neuron scaling factor of the k-th neuron, i denotes the network layer in which the neurons are located, and n denotes the number of neurons in the i-th layer.
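This bookkeeping can be traced end to end in a few lines; all numeric values below are hypothetical and simply continue the earlier example:

    s_w, s_x, s_b = 2.0, 8.0, 1.0
    N, K = 512, 8                       # reference MUX inputs, intermediate MUXes

    s_dot1 = (N / K) * s_w * s_x        # intermediate multiplexer scaling factor
    s_dot = K * s_dot1                  # inner product scaling factor = N*s_w*s_x

    s_common = max(s_dot, s_b)          # LCM of two powers of 2 is their maximum
    s_z = 2.0 * s_common                # neuron scaling factor (S52)

    # S53-S54: the layer scaling factor is the largest neuron factor in the
    # layer; the last layer's value becomes the network's output scale factor.
    neuron_factors = [s_z, s_z / 2.0, s_z]     # hypothetical per-neuron factors
    s_L = max(neuron_factors)
    print(s_dot, s_common, s_z, s_L)           # 8192.0 8192.0 16384.0 16384.0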
In this embodiment of the application, outputting the final output result of the Bayesian neural network in step S5 specifically includes the following sub-steps:
s55, inputting the inner product operation output result and the bias parameter bit stream into a bias addition unit for operation, and determining the output bit stream of the bias addition unit;
s56, inputting the output bit stream of the bias addition unit into an activation function for activation to obtain a neuron output result of the neuron;
s57, performing the Bayesian neural network forward reasoning based on the neuron output result and the layer scaling factor, thereby obtaining output bit stream data of the Bayesian neural network;
and S58, converting the output bit stream data of the Bayesian neural network into floating point result data represented by a floating point through a counter, and amplifying the floating point result data according to the output scale factor to obtain the final output result.
Specifically, the determined inner product operation output result and the bias parameter bit stream are input into the bias addition unit to obtain the bias operation result, i.e., the output bit stream of the bias addition unit. The bias operation result is input into an activation function to obtain the neuron output result of the corresponding neuron. Forward reasoning of the Bayesian neural network is then performed based on the neuron output results and the layer scaling factors to determine the output bit stream data of the Bayesian neural network. In a specific application scenario, a counter can be used as the backward conversion circuit to convert the output bit stream data into floating point result data, which is amplified according to the output scale factor to obtain the final output result.
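A value-domain sketch of this S55-S58 dataflow follows. Two points are assumptions rather than patent specifics: the bias addition is modeled as a 2-input MUX (one reading of the factor 2 in s_z = 2·s′), and tanh stands in for the unspecified activation function:

    import math

    def neuron_forward(dot_value, bias_value, s_dot, s_b):
        """Value-domain model of S55-S56: align both operands to the common
        scaling factor, add them with a 2-input MUX (which halves the sum -
        hence the factor 2 in s_z = 2*s'), then activate."""
        s_common = max(s_dot, s_b)
        z = (dot_value * (s_dot / s_common) + bias_value * (s_b / s_common)) / 2.0
        return math.tanh(z), 2.0 * s_common   # (activated value, its factor s_z)

    y, s_z = neuron_forward(dot_value=-0.0425, bias_value=0.9,
                            s_dot=8192.0, s_b=1.0)
    # S58 at the output layer: the counter's floating point estimate is
    # amplified by the output scale factor to restore the magnitude.
    print(y * s_z)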
In this embodiment, between step S51 and step S52, the method further includes comparing the magnitudes of the inner product scaling factor and the bias parameter scaling factor: if the inner product scaling factor is smaller than the bias parameter scaling factor, the inner product scaling factor is updated according to the common scaling factor and the inner product operation result is rescaled according to the updated inner product scaling factor; if the inner product scaling factor is larger than the bias parameter scaling factor, the bias parameter scaling factor is updated according to the common scaling factor and the bias parameter bit stream is rescaled according to the updated bias parameter scaling factor.
Specifically, when the inner product operation output result is input to the bias addition unit, the magnitudes of the inner product scaling factor and the bias parameter scaling factor are compared first; the smaller of the two is then updated and the corresponding bit stream is rescaled. The update formula of the inner product scaling factor is:
s_dot′ = s′
and the update formula of the bias parameter scaling factor is:
s_b′ = s′
where s_dot′ is the updated inner product scaling factor, s_b′ is the updated bias parameter scaling factor, and s′ is the common scaling factor.
In addition, when the Bayesian neural network performs forward reasoning in step S57, the neuron scaling factor must be updated based on the layer scaling factor, and the neuron output result rescaled according to the updated neuron scaling factor. The update formula of the neuron scaling factor is:
s_z′ = s_L
where s_z′ is the updated neuron scaling factor, s_z is the neuron scaling factor before the update, and s_L is the layer scaling factor.
The purpose of this step is to ensure that the network receives input data with a uniform scaling factor when it performs inference for the next layer: the output of the current layer's neurons is the input of the next layer's neurons, so rescaling the neuron outputs gives all inputs of the next layer the same scaling.
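A minimal sketch of this rescaling rule (the per-neuron values are hypothetical):

    def rescale_to_layer(y_value, s_z, s_L):
        """Update a neuron's factor to the layer factor (s_z' = s_L) and shrink
        its output value by s_z / s_L, so that every neuron feeding the next
        layer carries the same scaling factor."""
        assert s_L >= s_z
        return y_value * (s_z / s_L), s_L

    # Hypothetical layer of three neurons with unequal factors:
    outputs = [(0.42, 8192.0), (-0.10, 16384.0), (0.07, 16384.0)]
    s_L = max(s for _, s in outputs)
    print([rescale_to_layer(v, s, s_L) for v, s in outputs])

Note that the recoverable true value y_value·s_z is unchanged by the update, since y_value·(s_z/s_L)·s_L = y_value·s_z.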
In the embodiment of the present application, between step S5 and step S6, the method further includes:
A1, performing forward reasoning of the Bayesian neural network a preset number of times in the random computation mode, and obtaining by simulation a first neuron output value distribution produced by random computation;
A2, performing forward reasoning of the Bayesian neural network a preset number of times in the floating point operation mode, and obtaining by simulation a second neuron output value distribution computed from random sampling values of the weight and bias distributions;
A3, judging whether the second prediction accuracy corresponding to the second neuron output value distribution minus the first prediction accuracy corresponding to the first neuron output value distribution is smaller than a preset threshold; if so, keeping the number of the intermediate multiplexers unchanged; if not, sequentially increasing or decreasing the number of the intermediate multiplexers until the difference is smaller than the preset threshold.
Specifically, the first neuron output value distribution and the second neuron output value distribution are obtained through the random computation mode and the floating point operation mode, respectively, and the prediction accuracies corresponding to the two distributions are compared. If their difference is higher than the set threshold, then, taking the current number K of intermediate multiplexers used to compute intermediate results as the reference and subject to the constraints K < N and N/K ∈ N+, the value of K is sequentially increased or decreased, further adjusting the number of intermediate multiplexers so as to balance the scaling error introduced by the random addition against the error introduced by the sampling operation.
After the K value is finally selected, the prediction process of the Bayesian neural network is run multiple times, so that the errors introduced over the whole random computation process conform to the random distribution expected by the Bayesian neural network.
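A sketch of this calibration loop with stand-in evaluator callbacks (the function names, the grow-only search direction, and the toy accuracy curves are assumptions; the patent also permits decreasing K):

    def calibrate_K(N, K_init, threshold, accuracy_stochastic, accuracy_float):
        """A1-A3 sketch: step K through the admissible values (K < N and
        N % K == 0) until the floating point accuracy exceeds the stochastic
        accuracy by less than `threshold`."""
        candidates = [k for k in range(K_init, N) if N % k == 0]
        assert candidates, "K_init must itself be admissible"
        acc_ref = accuracy_float()
        for K in candidates:
            if acc_ref - accuracy_stochastic(K) < threshold:
                break
        return K

    # Toy stand-in evaluators; a real run would perform the preset number of
    # forward inferences in each mode and measure prediction accuracy.
    best = calibrate_K(N=512, K_init=8, threshold=0.01,
                       accuracy_stochastic=lambda k: min(0.99, 0.90 + 0.005 * k),
                       accuracy_float=lambda: 0.97)
    print(best)   # 16 with these toy curves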
Step S6, performing image classification on the remaining image data to be classified in the image data set to be classified with the Bayesian neural network having the determined number of the intermediate multiplexers.
Specifically, the intermediate multiplexer structure in the Bayesian neural network is determined using the first image data to be classified; the remaining image data to be classified in the image data set can then be classified with the Bayesian neural network thus determined, greatly reducing the cost of image classification with the Bayesian neural network.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims (10)

1. An image classification method based on a Bayesian neural network random addition decomposition structure, characterized by comprising the following steps:
S1, scaling the attribute parameters of the Bayesian neural network according to parameter scaling factors to obtain scaled attribute parameters, and converting the scaled attribute parameters into random bit stream data, wherein the attribute parameters comprise input data, weight parameters and bias parameters, and the input data is specifically the first image data to be classified in an image data set to be classified;
S2, determining the input number of the reference multiplexer based on the network structure of the Bayesian neural network;
S3, decomposing the reference multiplexer into a plurality of intermediate multiplexers, and determining the input number and the number of the intermediate multiplexers according to the input number of the reference multiplexer;
S4, determining an inner product scaling factor of an inner product operation unit based on the parameter scaling factor and the input number and the number of the intermediate multiplexers, and determining an inner product operation output result of the inner product operation unit according to the random bit stream data and the intermediate multiplexers;
S5, determining an output scale factor of the Bayesian neural network based on the inner product scaling factor and the parameter scaling factor, and outputting a final output result of the Bayesian neural network according to the inner product operation output result and the output scale factor, so as to complete image classification of the first image data to be classified;
and S6, performing image classification on the remaining image data to be classified in the image data set to be classified with the Bayesian neural network having the determined number of the intermediate multiplexers.
2. The image classification method based on the Bayesian neural network random addition decomposition structure as recited in claim 1, wherein the parameter scaling factors specifically include an input data scaling factor, a weight parameter scaling factor, and a bias parameter scaling factor; the scaled attribute parameters specifically include scaled floating-point input data, a scaled floating-point weight parameter, and a scaled floating-point bias parameter; and the random bit stream data includes an input data bit stream, a weight parameter bit stream, and a bias parameter bit stream.
3. The image classification method based on the Bayesian neural network random addition decomposition structure as recited in claim 2, wherein step S2 specifically determines the number of neurons in each layer of the Bayesian neural network according to the network structure of the Bayesian neural network, the input number of the reference multiplexer being the number of neurons in the previous layer, with one reference multiplexer in each neuron of every layer except the input layer of the Bayesian neural network.
4. The image classification method based on the Bayesian neural network random addition decomposition structure as recited in claim 2, wherein in step S3, the number of the intermediate multiplexers is smaller than the input number of the reference multiplexer, and the value obtained by dividing the input number of the reference multiplexer by the number of the intermediate multiplexers is a positive integer equal to the input number of each intermediate multiplexer.
5. The image classification method based on the Bayesian neural network random addition decomposition structure as recited in claim 2, wherein in step S4, the weight parameter scaling factor, the input data scaling factor, the input number of the intermediate multiplexers, and the number of the intermediate multiplexers are multiplied to obtain the inner product scaling factor of the inner product operation unit.
6. The image classification method based on the Bayesian neural network random addition decomposition structure as recited in claim 2, wherein the determination of the output scale factor of the Bayesian neural network in step S5 specifically comprises the following sub-steps:
s51, taking the least common multiple of the inner product scaling factor and the bias parameter scaling factor as a common scaling factor;
s52, determining a neuron scaling factor of a single neuron according to the common scaling factor;
s53, determining the neuron scaling factors corresponding to all neurons in different layers of the Bayesian neural network;
and S54, taking the largest neuron scaling factor in different layers of the Bayesian neural network as the layer scaling factor of the corresponding layer, and taking the layer scaling factor of the last layer of the Bayesian neural network as the output scaling factor.
7. The image classification method based on the Bayesian neural network random addition decomposition structure as recited in claim 6, wherein outputting the final output result of the Bayesian neural network in step S5 specifically comprises the following sub-steps:
s55, inputting the inner product operation output result and the bias parameter bit stream into a bias addition unit for operation, and determining the output bit stream of the bias addition unit;
s56, inputting the output bit stream of the bias addition unit into an activation function for activation to obtain a neuron output result of the neuron;
s57, performing the Bayesian neural network forward reasoning based on the neuron output result and the layer scaling factor, thereby obtaining output bit stream data of the Bayesian neural network;
and S58, converting the output bit stream data of the Bayesian neural network into floating point result data represented by a floating point through a counter, and amplifying the floating point result data according to the output scale factor to obtain the final output result.
8. The image classification method based on the Bayesian neural network random addition decomposition structure as recited in claim 6, further comprising, between step S51 and step S52, comparing the magnitudes of the inner product scaling factor and the bias parameter scaling factor: if the inner product scaling factor is smaller than the bias parameter scaling factor, updating the inner product scaling factor according to the common scaling factor and rescaling the inner product operation result according to the updated inner product scaling factor; if the inner product scaling factor is larger than the bias parameter scaling factor, updating the bias parameter scaling factor according to the common scaling factor and rescaling the bias parameter bit stream according to the updated bias parameter scaling factor.
9. The image classification method based on the Bayesian neural network random addition decomposition structure as recited in claim 1, wherein the determination of the inner product operation output result of the inner product operation unit in step S4 is specifically: inputting the random bit stream data into all the intermediate multiplexers, determining the output result of each intermediate multiplexer, and accumulating the output results corresponding to all the intermediate multiplexers by the accumulation multiplexer to obtain the inner product operation output result, the inner product operation output result being a bit stream reduced by the inner product scaling factor.
10. The image classification method based on the Bayesian neural network random addition decomposition structure as recited in claim 1, further comprising, between step S5 and step S6:
A1, performing forward reasoning of the Bayesian neural network a preset number of times in the random computation mode, and obtaining by simulation a first neuron output value distribution produced by random computation;
A2, performing forward reasoning of the Bayesian neural network a preset number of times in the floating point operation mode, and obtaining by simulation a second neuron output value distribution computed from random sampling values of the weight and bias distributions;
A3, judging whether the second prediction accuracy corresponding to the second neuron output value distribution minus the first prediction accuracy corresponding to the first neuron output value distribution is smaller than a preset threshold; if so, keeping the number of the intermediate multiplexers unchanged; if not, sequentially increasing or decreasing the number of the intermediate multiplexers until the difference is smaller than the preset threshold.
CN202110544590.4A 2021-05-19 2021-05-19 Image classification method based on Bayes neural network random addition decomposition structure Active CN113516171B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110544590.4A CN113516171B (en) 2021-05-19 2021-05-19 Image classification method based on Bayes neural network random addition decomposition structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110544590.4A CN113516171B (en) 2021-05-19 2021-05-19 Image classification method based on Bayes neural network random addition decomposition structure

Publications (2)

Publication Number Publication Date
CN113516171A true CN113516171A (en) 2021-10-19
CN113516171B CN113516171B (en) 2023-04-07

Family

ID=78064500

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110544590.4A Active CN113516171B (en) 2021-05-19 2021-05-19 Image classification method based on Bayes neural network random addition decomposition structure

Country Status (1)

Country Link
CN (1) CN113516171B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113240025A (en) * 2021-05-19 2021-08-10 电子科技大学 Image classification method based on Bayesian neural network weight constraint

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368797A (en) * 2017-07-06 2017-11-21 湖南中云飞华信息技术有限公司 The parallel method for detecting human face of multi-angle, device and terminal device
CN110442980A (en) * 2019-08-09 2019-11-12 重庆大学 Certainty damnification recognition method based on similar bayes method
US20200097762A1 (en) * 2018-09-26 2020-03-26 Samsung Electronics Co., Ltd. Method and apparatus for multi-category image recognition
CN112698811A (en) * 2021-01-11 2021-04-23 湖北大学 Neural network random number generator sharing circuit, sharing method and processor chip

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368797A (en) * 2017-07-06 2017-11-21 湖南中云飞华信息技术有限公司 The parallel method for detecting human face of multi-angle, device and terminal device
US20200097762A1 (en) * 2018-09-26 2020-03-26 Samsung Electronics Co., Ltd. Method and apparatus for multi-category image recognition
CN110442980A (en) * 2019-08-09 2019-11-12 重庆大学 Certainty damnification recognition method based on similar bayes method
CN112698811A (en) * 2021-01-11 2021-04-23 湖北大学 Neural network random number generator sharing circuit, sharing method and processor chip

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
RUIZHE CAI et al.: "Hardware Acceleration of Bayesian Neural Networks Using RAM Based Linear Feedback Gaussian Random Number Generators", 2017 IEEE International Conference on Computer Design (ICCD) *
SUN ZHAOXI (孙召曦): "Research on Low-Overhead Bayesian Neural Network Inference Accelerators" (低开销贝叶斯神经网络推理加速器研究), Wanfang (万方) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113240025A (en) * 2021-05-19 2021-08-10 电子科技大学 Image classification method based on Bayesian neural network weight constraint
CN113240025B (en) * 2021-05-19 2022-08-12 电子科技大学 Image classification method based on Bayesian neural network weight constraint

Also Published As

Publication number Publication date
CN113516171B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
Ma et al. A hybrid attention-based deep learning approach for wind power prediction
CN111797122B (en) Method and device for predicting change trend of high-dimensional reappearance concept drift stream data
Kung et al. A power-aware digital feedforward neural network platform with backpropagation driven approximate synapses
CN111353578A (en) Information processing apparatus, neural network program, and processing method for neural network
WO2019006976A1 (en) Neural network weight discretizing method, system and device, and readable storage medium
CN114283320B (en) Branch-free structure target detection method based on full convolution
CN113516171B (en) Image classification method based on Bayes neural network random addition decomposition structure
CN111353534B (en) Graph data category prediction method based on adaptive fractional order gradient
CN112149896A (en) Attention mechanism-based mechanical equipment multi-working-condition fault prediction method
US10410140B1 (en) Categorical to numeric conversion of features for machine learning models
Qiao et al. A framework for multi-prototype based federated learning: Towards the edge intelligence
US20210256389A1 (en) Method and system for training a neural network
CN114492631A (en) Spatial attention calculation method based on channel attention
Cheng et al. Minimizing power for neural network training with logarithm-approximate floating-point multiplier
Kalyanam et al. Range based hardware optimization of multilayer perceptrons with RELUs
CN113128666A (en) Mo-S-LSTMs model-based time series multi-step prediction method
Panchariya et al. Nonlinear system identification using Takagi-Sugeno type neuro-fuzzy model
CN115619563A (en) Stock price analysis method based on neural network
Temenos et al. A stochastic computing sigma-delta adder architecture for efficient neural network design
CN113516172A (en) Image classification method based on random computation Bayesian neural network error injection
CN113516170B (en) Image classification method based on Bayesian neural network random addition saturated structure
CN115660038A (en) Multi-stage integrated short-term load prediction based on error factors and improved MOEA/D-SAS
Yu et al. Analysis and Application of the Spatio-Temporal Feature in Wind Power Prediction.
CN117348837A (en) Quantization method and device for floating point precision model, electronic equipment and storage medium
Zhang et al. Adaptive truncation technique for constrained multi-objective optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant