CN116205274A - Control method, device, equipment and storage medium of impulse neural network

Info

Publication number
CN116205274A
Authority
CN
China
Prior art keywords: convolution result, neural network, target, value, convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310467454.9A
Other languages
Chinese (zh)
Other versions
CN116205274B (en)
Inventor
Jiang Dongdong (蒋东东)
Wang Binqiang (王斌强)
Dong Gang (董刚)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310467454.9A
Publication of CN116205274A
Application granted
Publication of CN116205274B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a control method, a device, equipment and a storage medium of an impulse neural network, which belong to the field of deep learning and are used for controlling the impulse neural network. For an impulse neural network in which each bit of the feature data takes the value 1 or 0, the number of possible value distributions over the positions of each group of feature data is limited. Various convolution results can therefore be pre-stored in a preset convolution result library; for each piece of target feature data, the target convolution result corresponding to the value distribution of its positions is obtained from the preset convolution result library, and subsequent processing is carried out according to that target convolution result. Looking data up in the preset convolution result library is faster than performing the convolution operation in real time, so the user experience is improved.

Description

Control method, device, equipment and storage medium of impulse neural network
Technical Field
The invention relates to the field of deep learning, and in particular to a control method of an impulse neural network; it further relates to a control device, equipment and a storage medium of the impulse neural network.
Background
The impulse neural network is a new-generation artificial neural network model derived from biological inspiration. It belongs to the deep learning family, has an efficient brain-like computing structure, and can process feature data (in which the value of each bit is 1 or 0) from a CPU (Central Processing Unit) or an impulse camera. However, the related art lacks a mature control method for the impulse neural network, so the computing efficiency of the impulse neural network is low and the user experience is affected.
Therefore, how to solve the above technical problem is an issue that persons skilled in the art currently need to address.
Disclosure of Invention
The aim of the invention is to provide a control method of an impulse neural network that can obtain, from a preset convolution result library, the target convolution result corresponding to the numerical distribution of each position in target feature data and then perform subsequent processing according to that target convolution result; looking data up in the preset convolution result library is faster than performing the convolution operation in real time, so the user experience is improved. The invention further aims to provide a control device, equipment and a storage medium of an impulse neural network that offer the same advantages.
In order to solve the technical problems, the invention provides a control method of a pulse neural network, comprising the following steps:
acquiring target characteristic data of a pulse neural network to be subjected to convolution operation at present, wherein the target characteristic data is a numerical matrix;
obtaining target convolution results corresponding to the numerical distribution of each position in the target characteristic data from a preset convolution result library, wherein the preset convolution result library is pre-stored with convolution results corresponding to a plurality of numerical distributions of each position in the characteristic data;
accumulating the target convolution result to a current convolution result accumulated value to obtain the latest convolution result accumulated value;
judging whether the latest convolution result accumulated value meets a pulse generation condition or not;
if yes, resetting the convolution result accumulated value and generating a pulse so as to analyze the target characteristic data;
and if not, executing the step of acquiring the target characteristic data to be subjected to convolution operation currently of the impulse neural network.
On the other hand, the control method of the impulse neural network further comprises the following steps:
caching data generated in real time by a data source of the impulse neural network through at least N paths of buffers;
the obtaining the target characteristic data of the impulse neural network to be subjected to convolution operation currently comprises the following steps:
Acquiring target characteristic data to be subjected to convolution operation currently of the impulse neural network from the buffer;
wherein N is a size value of a convolution kernel of the impulse neural network.
In another aspect, applied to a field programmable gate array, buffering the data generated in real time by the data source of the impulse neural network through at least N paths of buffers comprises:
and caching data generated in real time by a data source of the impulse neural network through at least N paths of buffers inside the field programmable gate array.
In another aspect, buffering the data generated in real time by the data source of the impulse neural network through at least N paths of buffers inside the field programmable gate array includes:
caching data generated in real time by a data source of the impulse neural network through N+1 paths of buffers in the field programmable gate array;
the depth of each buffer is equal to the total number of bits of single-row data in the data source of the impulse neural network.
In another aspect, accumulating the target convolution result to a current convolution result accumulation value includes:
and accumulating the target convolution result to a current convolution result accumulated value through a serial bit adder in the field programmable gate array.
In another aspect, the accumulating the target convolution result to a current convolution result accumulated value by a serial bit adder internal to the field programmable gate array includes:
the target convolution result is accumulated to a current convolution result accumulation value by a serial bit adder built from a single 6-in 1-out look-up table within the field programmable gate array.
On the other hand, the obtaining, from a preset convolution result library, a target convolution result corresponding to the numerical distribution of each position in the target feature data includes:
determining a target query address corresponding to numerical distribution of each position in the target feature data;
and acquiring a target convolution result corresponding to the target query address from a preset convolution result library.
In another aspect, determining the target query address corresponding to the numerical distribution of each position in the target feature data includes:
and encoding the numerical distribution of each position in the target feature data through a one-hot code to obtain a target query address corresponding to the numerical distribution of each position in the target feature data.
On the other hand, the control method of the impulse neural network further comprises the following steps:
And acquiring the preset convolution result library in advance, and storing the preset convolution result library in a block random access memory in the field programmable gate array.
In another aspect, the preset convolution result library includes:
convolution results determined, based on a preset weight matrix, in one-to-one correspondence with each possible numerical distribution of the positions in a feature data matrix whose size equals the convolution kernel size of the impulse neural network.
On the other hand, the control method of the impulse neural network further comprises the following steps:
every one clock period, determining an attenuation value corresponding to the current clock count value of the timer through a nonlinear attenuation function;
taking the product of the attenuation value and the convolution result accumulated value as a new convolution result accumulated value;
the timer is cleared when the pulse is generated.
In another aspect, said taking the product of said attenuation value and said convolution result accumulation value as a new said convolution result accumulation value comprises:
determining the product of the attenuation value and the convolution result accumulated value through a multiplication unit constructed based on a serial bit adder in the field programmable gate array;
and taking the product as a new convolution result accumulated value.
In another aspect, the determining the product of the attenuation value and the convolution result accumulated value through a multiplication unit constructed based on a serial bit adder in the field programmable gate array includes:
determining one of the attenuation value and the convolution result accumulated value as a multiplier in performing a multiplication operation before inputting the attenuation value and the convolution result accumulated value to a multiplication unit constructed based on a serial bit adder in the field programmable gate array;
determining a target serial bit adder serving as a product output end in the multiplication unit according to zero value distribution in the multiplier;
after the attenuation value and the convolution result accumulated value are input to the multiplication unit, the output value of the target serial bit adder is taken as the product of the attenuation value and the convolution result accumulated value.
In another aspect, the determining the target serial bit adder in the multiplication unit as the product output end according to the zero value distribution in the multiplier includes:
judging whether zero values of at least M continuous bits exist from any end point of the multiplier;
if so, determining a target serial bit adder serving as a product output end in the multiplication unit according to the total number of bits of the longest zero sequence in the multiplier, the endpoint attribute of an endpoint occupied by the longest zero sequence in the multiplier and the logic topology of the multiplication unit;
Wherein M is half of the total number of bits of the multiplier, and the endpoint attribute is a head end or a tail end.
In another aspect, the multiplication unit constructed based on serial bit adders in the field programmable gate array includes:
X shift registers connected in series, each of which shifts the multiplicand received at its input end by one bit toward the high-order direction;
a plurality of zero value judging units, each of which receives, through its second input end, the to-be-judged value of the corresponding bit of the multiplier and, when that value is not zero, passes the output data of the shift register corresponding to that zero value judging unit to the corresponding serial bit adder;
a plurality of serial bit adders, whose input ends are connected to the output ends of the corresponding zero value judging units or of other serial bit adders, and which add the received values so as to obtain the sum of the output values of the shift registers whose corresponding multiplier bits are not zero;
wherein X is the total number of bits of the multiplicand, the multiplicand being whichever of the attenuation value and the convolution result accumulated value is not selected as the multiplier.
In order to solve the technical problem, the invention also provides a control device of the impulse neural network, which comprises:
the first acquisition module is used for acquiring target characteristic data of the pulse neural network to be subjected to convolution operation at present, wherein the target characteristic data is a numerical matrix;
the second acquisition module is used for acquiring target convolution results corresponding to the numerical distribution of each position in the target feature data from a preset convolution result library, wherein the preset convolution result library is pre-stored with convolution results corresponding to a plurality of numerical distributions of each position in the feature data;
the accumulation module is used for accumulating the target convolution result to a current convolution result accumulated value to obtain a latest convolution result accumulated value;
the judging module is used for judging whether the latest convolution result accumulated value meets the pulse generation condition, if so, the generating module is triggered, and if not, the first acquisition module is triggered;
the generating module is used for resetting the convolution result accumulated value and generating pulses so as to analyze the target characteristic data.
In order to solve the technical problem, the invention also provides a control device of the impulse neural network, which is used for realizing the steps of the control method of the impulse neural network.
In another aspect, the control device is a field programmable gate array.
In another aspect, the control apparatus includes:
a memory for storing a computer program;
and a processor for implementing the steps of the control method of the impulse neural network as described above when executing the computer program.
To solve the above technical problem, the present invention also provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the control method of a impulse neural network as described above.
The invention provides a control method of an impulse neural network. Computing the convolution result of the feature data in real time each time the impulse neural network performs a convolution operation is inefficient and time-consuming, and for an impulse neural network in which each bit of the feature data takes the value 1 or 0 the number of possible value distributions over the positions of each group of feature data is limited. Various convolution results can therefore be pre-stored in a preset convolution result library; for each piece of target feature data, the target convolution result corresponding to the value distribution of its positions is obtained from the preset convolution result library, and subsequent processing is carried out according to that target convolution result. Looking data up in the preset convolution result library is faster than performing the convolution operation in real time, so the user experience is improved.
The invention also provides a control device, equipment, FPGA and storage medium of the impulse neural network, which have the same beneficial effects as the control method of the impulse neural network.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required in the description of the embodiments and the related art are briefly introduced below. It is apparent that the drawings in the following description are only some embodiments of the present invention, and a person of ordinary skill in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a control method of a pulse neural network according to the present invention;
FIG. 2 is a schematic diagram of a pipelined data processing architecture according to the present invention;
FIG. 3 is a schematic diagram of a convolution algorithm of a pulse neural network according to the present invention;
FIG. 4 is a schematic diagram of a serial bit adder according to the present invention;
FIG. 5 is a schematic diagram of a multiplication unit based on a serial bit adder according to the present invention;
FIG. 6 is a schematic diagram of a method for optimizing the use of a multiplication unit according to the present invention;
Fig. 7 is a schematic flow chart of another control method of the impulse neural network provided by the invention;
FIG. 8 is a schematic flow chart of yet another control method of an impulse neural network according to the present invention;
fig. 9 is a schematic structural diagram of a control device of a pulse neural network according to the present invention;
fig. 10 is a schematic structural diagram of control equipment of an impulse neural network according to the present invention;
fig. 11 is a schematic structural diagram of a storage medium according to the present invention.
Detailed Description
The core of the invention is to provide a control method of an impulse neural network that can obtain, from a preset convolution result library, the target convolution result corresponding to the numerical distribution of each position in target feature data and then perform subsequent processing according to that target convolution result; looking data up in the preset convolution result library is faster than performing the convolution operation in real time, so the user experience is improved. The invention further provides a control device, equipment and a storage medium of an impulse neural network with the same advantages.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, fig. 1 is a flow chart of a control method of a pulse neural network according to the present invention, where the control method of the pulse neural network includes:
s101: acquiring target characteristic data of a pulse neural network to be subjected to convolution operation currently, wherein the target characteristic data is a numerical matrix;
specifically, considering the technical problems in the background art and considering the problems of low efficiency and long time consumption corresponding to the convolution result of the feature data in real time when the pulse neural network carries out convolution operation on the feature data each time, and considering that the probability value distribution of each position in each group of feature data is not more for the pulse neural network with the value of 1 or 0 of each bit of data in the feature data, the method is used for firstly acquiring the convolution result corresponding to the probability value distribution of each position in the target feature data, and then constructing the corresponding relation between the data combination of the feature data and the convolution result.
The data source of the impulse neural network may be of various types, for example a CPU or an impulse camera (outputting image data), which is not limited herein.
Specifically, it should be noted that the target feature data is in the form of an N×N matrix, where N is the size value of the convolution kernel of the impulse neural network, for example 3; embodiments of the present invention are not limited in this respect.
S102: obtaining a target convolution result corresponding to the numerical distribution of each position in the target feature data from a preset convolution result library, wherein the preset convolution result library pre-stores convolution results corresponding to a plurality of numerical distributions of each position in the feature data;
Specifically, after the target feature data is obtained, the target convolution result corresponding to the value distribution of its positions can be obtained from the preset convolution result library. Because the total number of possible value distributions of the positions in the target feature data is not large, searching among the convolution results corresponding to all possible value distributions does not take long and can usually be completed within one cycle, which improves working efficiency and user experience.
S103: accumulating the target convolution result to the current convolution result accumulated value to obtain the latest convolution result accumulated value;
Specifically, after the convolution result corresponding to the target feature data is obtained, the target convolution result can be accumulated to the current convolution result accumulated value so as to perform the overall flow of the impulse neural network.
S104: judging whether the latest convolution result accumulated value meets the pulse generation condition, if so, executing S105, and if not, executing S101;
specifically, after updating the convolution result accumulated value each time, it may be determined whether the latest convolution result accumulated value meets the pulse generation condition, if so, a subsequent pulse generation step may be performed, and if not, the next acquisition of the target feature data may be performed.
Specifically, determining whether the latest convolution result accumulated value satisfies the pulse generation condition may be: judging whether the latest convolution result accumulated value is larger than a preset threshold value, if so, judging that the latest convolution result accumulated value meets the pulse generation condition, and if not, judging that the latest convolution result accumulated value does not meet the pulse generation condition, wherein the judging mode is simple and efficient.
S105: the convolution result accumulated value is reset and a pulse is generated to analyze the target feature data.
Specifically, after the pulse generation condition is met, the pulse can be generated, so that the pulse neural network executes a subsequent flow, target characteristic data is analyzed, and the convolution result accumulated value can be reset (for example, can be cleared) while the pulse is generated, so that the next pulse generation process is performed.
The invention provides a control method of an impulse neural network. Computing the convolution result of the feature data in real time each time the impulse neural network performs a convolution operation is inefficient and time-consuming, and for an impulse neural network in which each bit of the feature data takes the value 1 or 0 the number of possible value distributions over the positions of each group of feature data is limited. Various convolution results can therefore be pre-stored in a preset convolution result library; for each piece of target feature data, the target convolution result corresponding to the value distribution of its positions is obtained from the preset convolution result library, and subsequent processing is carried out according to that target convolution result. Looking data up in the preset convolution result library is faster than performing the convolution operation in real time, so the user experience is improved.
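To make the above flow concrete, the following is a minimal Python sketch of steps S101-S105 under stated assumptions: the feature data are 3×3 binary patches, the preset convolution result library is a dictionary keyed by the value distribution, and the pulse generation condition is a simple threshold comparison. The names and the threshold value are illustrative and are not taken from the patent.

```python
# Minimal sketch (not the patent's FPGA implementation) of the S101-S105 loop:
# look up the precomputed convolution result, accumulate it, and emit a pulse
# when the accumulated value exceeds an assumed threshold.
from itertools import product

import numpy as np

def control_loop(patches, weights, threshold=4.0):
    n = weights.shape[0]
    # Preset convolution result library: one entry per possible 0/1 value distribution.
    library = {bits: float(np.sum(np.array(bits).reshape(n, n) * weights))
               for bits in product((0, 1), repeat=n * n)}
    acc, pulses = 0.0, []
    for patch in patches:                              # S101: acquire target feature data
        key = tuple(int(v) for v in patch.flatten())   # value distribution of its positions
        acc += library[key]                            # S102-S103: look up and accumulate
        if acc > threshold:                            # S104: pulse generation condition
            pulses.append(1)                           # S105: generate a pulse ...
            acc = 0.0                                  # ... and reset the accumulated value
        else:
            pulses.append(0)
    return pulses

weights = np.arange(1, 10, dtype=float).reshape(3, 3)          # illustrative weight matrix
patches = [np.random.randint(0, 2, (3, 3)) for _ in range(5)]  # illustrative binary feature data
print(control_loop(patches, weights))
```

For a 3×3 convolution kernel the library holds only 2^9 = 512 entries, so each lookup replaces nine multiply-accumulate operations with a single table read, which is the efficiency gain the method relies on.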
Based on the above embodiments:
as an embodiment, the control method of the impulse neural network further includes:
buffering data generated in real time by a data source of the impulse neural network through at least N paths of buffers;
the obtaining of the target feature data to be subjected to convolution operation currently comprises the following steps:
acquiring target characteristic data to be subjected to convolution operation currently from a buffer;
where N is the size value of the convolution kernel of the impulse neural network.
In particular, if all the data to be processed from the data source were stored in advance and parts of the stored data were then extracted for processing, an extra stage of storage would be introduced and additional storage hardware would be required, which reduces working efficiency and raises cost. In the embodiment of the invention, the data generated in real time by the data source of the impulse neural network is therefore buffered through at least N paths of buffers, and the target feature data currently to be convolved is read directly from the buffers, so the feature data can be processed in a streaming fashion.
As an embodiment applied to an FPGA (Field Programmable Gate Array), buffering the data generated in real time by the data source of the impulse neural network through at least N paths of buffers includes:
caching data generated in real time by a data source of the impulse neural network through at least N paths of buffers inside the FPGA.
Specifically, running the impulse neural network on an FPGA improves the data processing speed, so in the embodiment of the invention the operation of the impulse neural network can be executed by the FPGA. The data generated in real time by the data source of the impulse neural network can then be buffered through at least N paths of buffers inside the FPGA, so no memory outside the FPGA is needed to pre-store the feature data, which saves cost and improves working efficiency.
Of course, the impulse neural network may be implemented to operate in other manners besides FPGA, and the embodiments of the present invention are not limited herein.
As one embodiment, buffering the data generated in real time by the data source of the impulse neural network through at least N paths of buffers inside the FPGA comprises:
caching data generated in real time by a data source of the impulse neural network through N+1 paths of buffers in the FPGA;
Wherein the depth of each buffer is equal to the total number of bits of a single row of data in the data source of the impulse neural network.
For better explanation of the embodiments of the present invention, please refer to fig. 2 and fig. 3. Fig. 2 is a schematic structural diagram of a pipelined data processing architecture provided by the present invention, and fig. 3 is a schematic structural diagram of a convolution algorithm of an impulse neural network provided by the present invention. The off-chip memory in fig. 2 may be of various types, for example DDR (Double Data Rate) memory or HBM (High Bandwidth Memory), which is not limited herein.
Specifically, the acceleration calculation unit in fig. 2 refers to the acceleration calculation unit inside the FPGA, and the acceleration card refers to the FPGA. Each buffer can buffer one row of data, and as can be seen from the right-hand diagram in fig. 2, the use of off-chip memory is omitted in the present invention. The PE (Processing Engine) in fig. 3 is the part that calculates the convolution result in the related art, and the matrix composed of the nine weight values w1-w9 in fig. 3 is the weight matrix.
Specifically, in order to realize streaming processing of the feature data with as little memory space inside the FPGA as possible, in the embodiment of the invention the data generated in real time by the data source of the impulse neural network can be buffered through N+1 paths of buffers inside the FPGA, where the depth of each path of buffer is equal to the total number of bits of a single row of data in the data source of the impulse neural network.
Of course, the buffer inside the FPGA for implementing the pipelined processing of the feature data may be other forms besides this specific form, and the embodiment of the present invention is not limited herein.
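As an illustration of the row-buffer arrangement described above, the following sketch models N+1 row buffers in software and extracts N×N windows from the most recent N rows. The class and method names are assumptions, and the hardware FIFO details of the FPGA implementation are not modeled.

```python
# Software sketch (assumed structure) of N + 1 row buffers, each as deep as one row of the
# data source: N buffered rows feed the convolution window while the extra buffer absorbs
# the row currently arriving from the data source.
from collections import deque

class RowBuffers:
    def __init__(self, kernel_size: int, row_width: int):
        self.n = kernel_size
        self.row_width = row_width
        self.rows = deque(maxlen=kernel_size + 1)   # N + 1 paths of buffers

    def push_row(self, row):
        assert len(row) == self.row_width           # buffer depth = bits per row
        self.rows.append(list(row))

    def windows(self):
        """Yield every N x N window once at least N rows have been buffered."""
        if len(self.rows) < self.n:
            return
        latest = list(self.rows)[-self.n:]          # the N most recent complete rows
        for col in range(self.row_width - self.n + 1):
            yield [r[col:col + self.n] for r in latest]

buffers = RowBuffers(kernel_size=3, row_width=8)
for row_index in range(4):
    buffers.push_row([(row_index + c) % 2 for c in range(8)])   # illustrative binary rows
print(sum(1 for _ in buffers.windows()))                        # 6 windows across the row
```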
As one embodiment, accumulating the target convolution result to the current convolution result accumulation value includes:
accumulating the target convolution result to the current convolution result accumulated value through a serial bit adder in the FPGA.
Specifically, the DSP (Digital Signal Processing) resources (multiplier-adders) inside the FPGA are limited. In order to leave as many DSP resources as possible for other purposes, in the embodiment of the present invention the target convolution result is accumulated to the current convolution result accumulated value through a serial bit adder inside the FPGA; serial bit adders belong to the more abundant resources inside the FPGA, so the efficiency of the FPGA is improved.
As one embodiment, accumulating the target convolution result to the current convolution result accumulated value by a serial bit adder inside the FPGA includes:
and accumulating the target convolution result to a current convolution result accumulated value through a serial bit adder constructed by a single 6-in 1-out lookup table in the FPGA.
For better illustrating the embodiments of the present invention, please refer to fig. 4, fig. 4 is a schematic diagram of a serial bit adder according to the present invention.
Considering that the 6-in 1-out lookup table is the most abundant hardware resource in the FPGA and through which a serial bit adder can be constructed, the serial bit adder can be constructed by a single 6-in 1-out lookup table inside the FPGA in embodiments of the present invention.
Of course, in addition to this form, the serial bit adder may also be configured by other hardware in the FPGA, and embodiments of the present invention are not limited herein.
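The following is a functional software model of a bit-serial (serial bit) adder of the kind described above: on each clock it consumes one bit of each operand, least significant bit first, and keeps the carry in a register. How this stage is packed into a single 6-in 1-out lookup table is a hardware mapping detail of the patent and is not modeled here.

```python
# Functional model (assumed behavior) of a serial bit adder: one sum bit per clock, LSB first,
# with the carry held between clocks.
class SerialBitAdder:
    def __init__(self):
        self.carry = 0

    def step(self, a_bit: int, b_bit: int) -> int:
        total = a_bit + b_bit + self.carry
        self.carry = total >> 1        # carry saved for the next bit position
        return total & 1               # sum bit emitted this cycle

def serial_add(a: int, b: int, width: int = 16) -> int:
    adder, result = SerialBitAdder(), 0
    for i in range(width):             # one bit of each operand per clock
        result |= adder.step((a >> i) & 1, (b >> i) & 1) << i
    return result

assert serial_add(23, 42) == 65
```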
As one embodiment, obtaining a target convolution result corresponding to the numerical distribution of each position in the target feature data from a preset convolution result library includes:
determining a target query address corresponding to the numerical distribution of each position in the target feature data;
and obtaining the target convolution result corresponding to the target query address from the preset convolution result library.
Specifically, searching the preset convolution result library by query address further improves the search efficiency; different numerical distributions generate different query addresses, which also facilitates constructing the correspondence between distributions and convolution results.
Of course, other specific forms of obtaining the target convolution result corresponding to the numerical distribution of each position in the target feature data from the preset convolution result library may be adopted, and the embodiment of the present invention is not limited herein.
As one embodiment, determining a target query address corresponding to the numerical distribution of each position in the target feature data includes:
and encoding the numerical distribution of each position in the target feature data through a one-hot code to obtain a target query address corresponding to the numerical distribution of each position in the target feature data.
Specifically, one-hot encoding is fast and efficient and can accurately generate a query address corresponding to a numerical distribution, so in the embodiment of the invention the numerical distribution of each position in the target feature data can be encoded with a one-hot code to obtain the target query address corresponding to that numerical distribution.
Of course, besides the one-hot code, the target query address corresponding to the numerical distribution of each position in the target feature data may also be determined in other ways, and embodiments of the present invention are not limited herein.
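The sketch below shows one way of turning the value distribution of a 3×3 binary patch into a query address by concatenating its bits. This is an illustrative assumption: the patent specifies one-hot encoding and does not fix the exact address layout, so the mapping here only stands in for whatever encoding the implementation uses.

```python
# Assumed address scheme: flatten the 0/1 value distribution row by row and read it as a
# binary number, so every distinct distribution yields a distinct query address.
import numpy as np

def patch_to_address(patch: np.ndarray) -> int:
    addr = 0
    for bit in patch.flatten():        # first position becomes the most significant bit
        addr = (addr << 1) | int(bit)
    return addr

patch = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 0, 1]])
print(patch_to_address(patch))         # 0b100010001 = 273
```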
As an embodiment, the control method of the impulse neural network further includes:
The method comprises the steps of obtaining the preset convolution result library in advance, and storing the preset convolution result library in a block random access memory BRAM in the FPGA.
Specifically, in order to utilize the internal cache resources of the FPGA as much as possible and reduce the use of external cache hardware, in the embodiment of the present invention the preset convolution result library obtained in advance may be stored in a block random access memory BRAM (Block Random Access Memory) in the FPGA, which offers high data transmission efficiency.
Of course, the preset convolution result library may be stored in other locations besides BRAM, and embodiments of the present invention are not limited herein.
As one embodiment, the preset convolution result library includes:
convolution results determined, based on a preset weight matrix, in one-to-one correspondence with each possible numerical distribution of the positions in a feature data matrix whose size equals the convolution kernel size of the impulse neural network.
Specifically, when the convolution operation is performed, the weight in the weight matrix corresponding to each non-zero value in the target feature data can be determined, and the sum of the products of each non-zero value and its corresponding weight is used as the convolution result of the target feature data. Since each bit of the feature data is 1 or 0, this reduces to the sum of the weights at the positions whose value is 1, and such a result can be computed in advance for every possible numerical distribution.
Of course, the generating manner of the preset convolution result library may be other various types besides this specific manner, and the embodiment of the present invention is not limited herein.
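Continuing the address convention assumed in the earlier encoding sketch, the following shows how such a preset convolution result library could be generated offline from a preset weight matrix as a flat table with one entry per possible value distribution, resembling the contents later loaded into BRAM. The weight values are purely illustrative.

```python
# Assumed offline construction of the preset convolution result library: for every possible
# value distribution (address), store the sum of the weights at the positions whose bit is 1.
import numpy as np

def build_library(weights: np.ndarray) -> list:
    n = weights.shape[0]
    flat_w = weights.flatten()
    table = [0.0] * (1 << (n * n))                    # 512 entries for a 3x3 kernel
    for addr in range(len(table)):
        bits = [(addr >> (n * n - 1 - i)) & 1 for i in range(n * n)]
        table[addr] = float(np.dot(bits, flat_w))     # convolution result for this pattern
    return table

weights = np.arange(1, 10, dtype=float).reshape(3, 3)  # w1..w9 as labeled in fig. 3, illustrative values
table = build_library(weights)
print(table[273])                                      # diagonal-ones patch -> w1 + w5 + w9 = 15.0
```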
As an embodiment, the control method of the impulse neural network further includes:
every clock period, determining an attenuation value corresponding to the current clock count value of the timer through a nonlinear attenuation function;
taking the product of the attenuation value and the convolution result accumulated value as the new convolution result accumulated value;
and clearing the timer when the pulse is generated.
Specifically, the convolution result accumulated value that assists pulse generation in an impulse neural network decays nonlinearly over time. In the embodiment of the invention, every clock period the attenuation value corresponding to the current clock count value of the timer is therefore determined through a nonlinear attenuation function, the product of the attenuation value and the convolution result accumulated value is taken as the new convolution result accumulated value, and the timer is cleared when a pulse is generated so that the attenuation value is controlled more accurately. In this way the convolution result accumulated value undergoes periodic nonlinear attenuation according to the attenuation value, which improves the accuracy of the impulse neural network's data processing.
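A minimal sketch of this periodic nonlinear attenuation is given below. The exponential form of the attenuation function, the time constant and the threshold are assumptions, since the patent only requires some nonlinear attenuation function driven by the timer's clock count.

```python
# Assumed decay model: each clock period the accumulated value is multiplied by an attenuation
# value derived from the timer's count, and the timer is cleared when a pulse is generated.
import math

def attenuation(clock_count: int, tau: float = 20.0) -> float:
    return math.exp(-clock_count / tau)          # illustrative nonlinear attenuation function

acc, timer, threshold = 0.0, 0, 4.0
for contribution in [1.5, 2.0, 0.0, 3.0, 0.5]:   # looked-up convolution results per period
    timer += 1
    acc = attenuation(timer) * acc               # new accumulated value = attenuation x old value
    acc += contribution
    if acc > threshold:
        print("pulse at timer =", timer)
        acc, timer = 0.0, 0                      # reset the accumulated value and clear the timer
```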
As one embodiment, taking the product of the attenuation value and the convolution result accumulation value as the new convolution result accumulation value comprises:
determining the product of the attenuation value and the convolution result accumulated value through a multiplication unit constructed based on a serial bit adder in the FPGA;
and taking the product as the new convolution result accumulated value.
Specifically, in order to reduce the use of multiplier-adders in the FPGA as much as possible, in the embodiment of the present invention the multiplication is not implemented by a multiplier-adder; instead, a multiplication unit constructed from serial bit adders in the FPGA is used to determine the product of the attenuation value and the convolution result accumulated value, and that product is then taken as the new convolution result accumulated value, which helps to further improve the working efficiency of the FPGA.
As one embodiment, determining the product of the attenuation value and the convolution result accumulation value by a multiplication unit constructed based on a serial bit adder in the FPGA comprises:
before inputting the attenuation value and the convolution result accumulated value into a multiplication unit constructed based on a serial bit adder in the FPGA, determining one of the attenuation value and the convolution result accumulated value as a multiplier in multiplication operation;
determining a target serial bit adder serving as a product output end in a multiplication unit according to zero value distribution in the multiplier;
After the attenuation value and the convolution result accumulated value are input to the multiplication unit, the output value of the target serial bit adder is taken as the product of the attenuation value and the convolution result accumulated value.
Specifically, when the multiplier contains many zero values, the output of an intermediate stage can be selected as the product, which shortens the time needed to obtain the result. Therefore, in the embodiment of the invention, before the attenuation value and the convolution result accumulated value are input to the multiplication unit constructed based on serial bit adders in the FPGA, one of them is determined as the multiplier of the multiplication operation; then, according to the distribution of zero values in the multiplier, the target serial bit adder serving as the product output end of the multiplication unit is determined; finally, after the attenuation value and the convolution result accumulated value are input to the multiplication unit, the output value of the target serial bit adder is taken as their product. This helps to improve working efficiency.
As one embodiment, determining a target serial bit adder as a product output in a multiplication unit based on a distribution of zeros in a multiplier includes:
Judging whether zero values of at least M bits exist continuously from any end point of the multiplier;
if so, determining a target serial bit adder serving as a product output end in the multiplication unit according to the total number of bits of the longest zero sequence in the multiplier, the endpoint attribute of an endpoint occupied by the longest zero sequence in the multiplier and the logic topology of the multiplication unit;
where M is half of the total number of bits of the multiplier, and the endpoint attribute is head or tail.
Specifically, the specific mode provided by the embodiment of the invention can efficiently and accurately determine the target serial bit adder serving as the product output end in the multiplication unit according to zero value distribution in the multiplier.
Of course, other specific implementations of determining the target serial bit adder in the multiplier unit as the product output according to the zero value distribution in the multiplier are also possible, and embodiments of the present invention are not limited herein.
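The sketch below only identifies the qualifying zero run at either end of the multiplier, from which the target serial bit adder would then be chosen; mapping that run onto a concrete adder stage depends on the logic topology of the multiplication unit and is therefore not modeled. The bit ordering and the 8-bit width are assumptions.

```python
# Assumed helper: find the longest run of zero bits touching either end of the multiplier and
# report it only if it spans at least M = width // 2 bits, as in the judgment described above.
def longest_end_zero_run(multiplier: int, width: int = 8):
    bits = [(multiplier >> i) & 1 for i in range(width)]                    # index 0 = tail (LSB)
    tail_run = next((i for i, b in enumerate(bits) if b), width)            # zeros at the tail
    head_run = next((i for i, b in enumerate(reversed(bits)) if b), width)  # zeros at the head
    best = max((tail_run, "tail"), (head_run, "head"))
    return best if best[0] >= width // 2 else None

# 0b00001101 has a 4-bit zero run at its head, so only the low-order stages of the
# multiplication unit ever receive partial products and an intermediate adder can be tapped.
print(longest_end_zero_run(0b00001101))    # (4, 'head')
```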
As one embodiment, a multiplication unit constructed based on serial bit adders in an FPGA includes:
X shift registers connected in series, each of which shifts the multiplicand received at its input end by one bit toward the high-order direction;
a plurality of zero value judging units, each of which receives, through its second input end, the to-be-judged value of the corresponding bit of the multiplier and, when that value is not zero, passes the output data of the corresponding shift register to the corresponding serial bit adder;
a plurality of serial bit adders, whose input ends are connected to the output ends of the corresponding zero value judging units or of other serial bit adders, and which add the received values so as to obtain the sum of the output values of the shift registers whose corresponding multiplier bits are not zero;
where X is the total number of bits of the multiplicand, the multiplicand being whichever of the attenuation value and the convolution result accumulated value is not selected as the multiplier.
For better explanation of the embodiment of the present invention, please refer to table 1 below and to fig. 5 and fig. 6. Table 1 is a two-stage multiplication principle table, fig. 5 is a schematic structural diagram of the multiplication unit based on serial bit adders provided by the present invention, and fig. 6 is a schematic diagram of an optimized usage method of the multiplication unit provided by the present invention. For convenience of demonstration, the illustration multiplies two 4-bit data values a and b, where a serves as the multiplicand, b serves as the multiplier, and b0 denotes bit 0 of b. The zero value judgment is made on the corresponding bit of b: if that bit is 0, the unit outputs 0; otherwise it outputs the value of the shift register corresponding to a. The multiplication of the data is finally completed by three serial bit adders.
In order to accelerate the multiplication unit and reduce the number of adder stages, if some bits of the value in the lookup table are detected to be 0, those stages are skipped directly; this optimizes performance when data in the nonlinear lookup table contains many 0 bits. As shown in fig. 6, the output of the serial bit adder connected to the fast-calculation judging unit (the unit that performs this judgment and determines the target serial bit adder) can be used directly as the product, which saves the calculation delay of the first-stage adder.
TABLE 1 (two-stage multiplication principle table; provided as an image in the original publication)
Of course, the multiplication unit constructed based on the serial bit adder may have other specific structures besides this specific structure, and the embodiment of the present invention is not limited herein.
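To summarize the behavior of the structure above in software, the following sketch forwards a shifted copy of the multiplicand only for multiplier bits that are 1 and sums the surviving partial products, mirroring the zero value judgment and the serial bit adder chain. It is an algorithmic model only, not a description of the hardware.

```python
# Algorithmic model (assumed) of the shift-and-add multiplication unit: the multiplicand is
# shifted one bit per stage, the zero value judgment forwards a partial product only when the
# corresponding multiplier bit is 1, and a chain of adders sums the partial products.
def shift_add_multiply(multiplicand: int, multiplier: int, width: int = 4) -> int:
    product = 0
    for bit_index in range(width):               # one shift register stage per multiplier bit
        shifted = multiplicand << bit_index      # output of the bit_index-th shift register
        if (multiplier >> bit_index) & 1:        # zero value judgment on this multiplier bit
            product += shifted                   # forwarded to the serial bit adder chain
    return product

assert shift_add_multiply(13, 11) == 143         # 4-bit example, like a and b in the description
```

Stages whose multiplier bits are zero contribute nothing, which is why, when a long zero run occupies one end of the multiplier, the output of an intermediate adder can be tapped as the product as illustrated by fig. 6.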
For better explanation of the embodiments of the present invention, please refer to fig. 7 and fig. 8. Fig. 7 is a flow chart of another control method of an impulse neural network provided by the present invention, and fig. 8 is a flow chart of yet another control method of an impulse neural network provided by the present invention. In fig. 8, the data received row by row is the feature data received from the data source; it is buffered row by row into the corresponding buffers by a row data divider, target feature data matching the convolution kernel size is then read out and one-hot encoded, the target convolution result is looked up in the preset convolution result library according to the query address and accumulated into the convolution result accumulated value, and whether the pulse generation condition is satisfied is then judged according to the updated convolution result accumulated value, with the subsequent step selected according to the judgment result.
In addition, the sum buffer in fig. 7 and the accumulated value in fig. 8 each refer to a convolution result accumulated value.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a control device of a pulse neural network according to the present invention, where the control device of the pulse neural network includes:
a first obtaining module 91, configured to obtain target feature data to be subjected to convolution operation currently, where the target feature data is a numerical matrix;
a second obtaining module 92, configured to obtain a target convolution result corresponding to the numerical distribution of each position in the target feature data from a preset convolution result library;
an accumulation module 93, configured to accumulate the target convolution result to a current convolution result accumulated value, to obtain a latest convolution result accumulated value;
a judging module 94, configured to judge whether the latest convolution result accumulated value meets the pulse generating condition, if yes, trigger the generating module 95, and if not, trigger the first acquiring module 91;
a generating module 95, configured to reset the convolution result accumulated value and generate a pulse so as to analyze the target feature data.
For the description of the control device for the impulse neural network provided in the embodiment of the present invention, reference is made to the foregoing embodiment of the control method for the impulse neural network, and the embodiment of the present invention is not repeated herein.
The invention also provides a control device of the impulse neural network, which is used for realizing the steps of the control method of the impulse neural network in the embodiment.
As an embodiment, the control device is a field programmable gate array.
As one embodiment, a control apparatus includes:
a memory 101 for storing a computer program;
a processor 102 for implementing the steps of the control method of the impulse neural network in the previous embodiment when executing the computer program.
For the description of the control device for the impulse neural network provided in the embodiment of the present invention, reference is made to the foregoing embodiment of the control method for the impulse neural network, and the embodiment of the present invention is not repeated herein.
Referring to fig. 11, fig. 11 is a schematic structural diagram of a storage medium according to the present invention, a computer program 111 is stored on the storage medium 110, and the computer program 111 implements the steps of the control method of the impulse neural network according to the foregoing embodiment when executed by the processor 102.
For the description of the storage medium 110 provided in the embodiment of the present invention, reference is made to the foregoing embodiments of the control method of the impulse neural network, and the embodiments of the present invention are not repeated herein.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and identical or similar parts among the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method. It should also be noted that in this specification relational terms such as first and second are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises that element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (20)

1. A control method of an impulse neural network, comprising:
acquiring target characteristic data of a pulse neural network to be subjected to convolution operation at present, wherein the target characteristic data is a numerical matrix;
obtaining target convolution results corresponding to the numerical distribution of each position in the target characteristic data from a preset convolution result library, wherein the preset convolution result library is pre-stored with convolution results corresponding to a plurality of numerical distributions of each position in the characteristic data;
accumulating the target convolution result to a current convolution result accumulated value to obtain the latest convolution result accumulated value;
judging whether the latest convolution result accumulated value meets a pulse generation condition or not;
If yes, resetting the convolution result accumulated value and generating a pulse so as to analyze the target characteristic data;
and if not, executing the step of acquiring the target characteristic data to be subjected to convolution operation currently of the impulse neural network.
2. The control method of an impulse neural network according to claim 1, characterized in that the control method of the impulse neural network further comprises:
caching data generated in real time by a data source of the impulse neural network through at least N paths of buffers;
the obtaining the target characteristic data of the impulse neural network to be subjected to convolution operation currently comprises the following steps:
acquiring target characteristic data to be subjected to convolution operation currently of the impulse neural network from the buffer;
wherein N is a size value of a convolution kernel of the impulse neural network.
3. The method for controlling a pulsed neural network of claim 2, applied to a field programmable gate array, wherein buffering data generated in real time by a data source of the pulsed neural network through at least N-way buffers comprises:
and caching data generated in real time by a data source of the impulse neural network through at least N paths of buffers inside the field programmable gate array.
4. The method of claim 3, wherein buffering the data generated in real time by the data source of the impulse neural network through at least N-way buffers inside the field programmable gate array comprises:
caching data generated in real time by a data source of the impulse neural network through N+1 paths of buffers in the field programmable gate array;
the depth of each buffer is equal to the total number of bits of single-row data in the data source of the impulse neural network.
5. The control method of an impulse neural network according to claim 3, wherein accumulating the target convolution result to a current convolution result accumulation value comprises:
and accumulating the target convolution result to a current convolution result accumulated value through a serial bit adder in the field programmable gate array.
6. The method according to claim 5, wherein accumulating the target convolution result to a current convolution result accumulated value by a serial bit adder inside the field programmable gate array comprises:
The target convolution result is accumulated to a current convolution result accumulation value by a serial bit adder built from a single 6-in 1-out look-up table within the field programmable gate array.
7. The method for controlling a pulsed neural network according to claim 3, wherein the obtaining, from a preset convolution result library, a target convolution result corresponding to a numerical distribution of each position in the target feature data comprises:
determining a target query address corresponding to numerical distribution of each position in the target feature data;
and acquiring a target convolution result corresponding to the target query address from a preset convolution result library.
8. The control method of an impulse neural network according to claim 7, wherein the determining a target query address corresponding to the numerical distribution of each position in the target feature data comprises:
encoding the numerical distribution of each position in the target feature data through one-hot coding to obtain the target query address corresponding to the numerical distribution of each position in the target feature data.
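One plausible reading of claims 7 and 8, for binary (spike) feature data, is that each position contributes one bit of the query address, so the patch itself indexes the result library; the sketch below illustrates that reading and its bit ordering is an assumption.

```python
# Sketch of address formation: concatenate the 0/1 value at each position
# of the feature patch (MSB-first) to form the lookup address.
def patch_to_address(patch):
    """patch: N x N nested list of 0/1 spike values."""
    addr = 0
    for row in patch:
        for bit in row:
            addr = (addr << 1) | (bit & 1)
    return addr
```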
9. The control method of an impulse neural network according to claim 7, characterized in that the control method of the impulse neural network further comprises:
acquiring the preset convolution result library in advance, and storing the preset convolution result library in a block random access memory in the field programmable gate array.
10. The control method of an impulse neural network according to claim 3, wherein the preset convolution result library comprises:
convolution results, determined based on a preset weight matrix, in one-to-one correspondence with each possible numerical distribution of each position in a feature data matrix whose size equals the convolution kernel size of the impulse neural network.
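Assuming binary spike inputs, the library of claim 10 can be precomputed by enumerating every 0/1 distribution over an N x N window and storing the sum of the weights at the positions that are 1; the addressing scheme below matches the hypothetical `patch_to_address` above and is an assumption.

```python
from itertools import product

# Sketch of building the preset convolution result library: one entry per
# possible binary distribution of the kernel-sized window (2**(N*N) entries).
def build_result_library(weights):
    """weights: N x N nested list; returns dict address -> convolution result."""
    flat_w = [w for row in weights for w in row]
    library = {}
    for bits in product((0, 1), repeat=len(flat_w)):
        addr = int("".join(map(str, bits)), 2)            # MSB-first concatenation
        library[addr] = sum(w for w, b in zip(flat_w, bits) if b)  # dot product with 0/1 patch
    return library
```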
11. The control method of an impulse neural network according to any one of claims 3 to 10, characterized in that the control method of the impulse neural network further comprises:
determining, every clock period, an attenuation value corresponding to a current clock count value of a timer through a nonlinear attenuation function;
taking the product of the attenuation value and the convolution result accumulated value as a new convolution result accumulated value;
and clearing the timer when the pulse is generated.
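Claim 11 does not fix the form of the nonlinear attenuation function; the sketch below assumes an exponential decay purely for illustration, with the timer counting clock periods since the last pulse.

```python
import math

# Sketch of the leak mechanism of claim 11 (exponential decay is an assumption).
def apply_decay(accumulated, timer_count, tau=8.0):
    decay = math.exp(-timer_count / tau)   # attenuation value from the timer's clock count
    return accumulated * decay             # new convolution result accumulated value

# The timer is cleared to 0 whenever a pulse is generated.
```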
12. The control method of an impulse neural network according to claim 11, wherein the taking the product of the attenuation value and the convolution result accumulated value as a new convolution result accumulated value comprises:
determining the product of the attenuation value and the convolution result accumulated value through a multiplication unit constructed based on serial bit adders in the field programmable gate array;
and taking the product as the new convolution result accumulated value.
13. The control method of an impulse neural network according to claim 12, wherein the determining the product of the attenuation value and the convolution result accumulated value through a multiplication unit constructed based on serial bit adders in the field programmable gate array comprises:
before inputting the attenuation value and the convolution result accumulated value into the multiplication unit constructed based on serial bit adders in the field programmable gate array, determining one of the attenuation value and the convolution result accumulated value as a multiplier of the multiplication operation;
determining a target serial bit adder serving as the product output end in the multiplication unit according to the zero value distribution in the multiplier;
after the attenuation value and the convolution result accumulated value are input to the multiplication unit, taking the output value of the target serial bit adder as the product of the attenuation value and the convolution result accumulated value.
14. The control method of an impulse neural network according to claim 13, wherein the determining a target serial bit adder serving as the product output end in the multiplication unit according to the zero value distribution in the multiplier comprises:
judging whether a run of at least M consecutive zero-valued bits exists starting from either endpoint of the multiplier;
if so, determining the target serial bit adder serving as the product output end in the multiplication unit according to the total number of bits of the longest zero sequence in the multiplier, the endpoint attribute of the endpoint occupied by the longest zero sequence in the multiplier, and the logic topology of the multiplication unit;
wherein M is half of the total number of bits of the multiplier, and the endpoint attribute is a head end or a tail end.
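The zero-run test of claim 14 can be sketched as below; which serial bit adder then serves as the product output depends on the concrete adder topology of claim 15 and is not modelled here, and the function name and return convention are assumptions.

```python
# Look for a run of at least M = width/2 consecutive zero bits starting
# at either endpoint of the multiplier.
def longest_edge_zero_run(mult_bits):
    """mult_bits: list of 0/1, index 0 = head end. Returns (length, endpoint) or None."""
    m = len(mult_bits) // 2
    head = next((i for i, b in enumerate(mult_bits) if b), len(mult_bits))          # leading zeros
    tail = next((i for i, b in enumerate(reversed(mult_bits)) if b), len(mult_bits)) # trailing zeros
    best = max((head, "head"), (tail, "tail"))
    return best if best[0] >= m else None
```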
15. The control method of an impulse neural network according to claim 12, wherein the multiplication unit constructed based on serial bit adders in the field programmable gate array comprises:
X shift registers connected in series, each used for shifting the multiplicand input from its input end by one bit towards the high-order direction;
zero-value judging units, each used for receiving, through a second input end thereof, the value to be judged of the corresponding bit in the multiplier, and for transmitting the output data of the shift register corresponding to the zero-value judging unit to the corresponding serial bit adder when the value to be judged is not zero;
serial bit adders, each having an input end connected with the output end of the corresponding zero-value judging unit or the output end of another serial bit adder, and used for performing an addition operation on the received values so as to obtain the sum of the output values of those shift registers whose corresponding bits in the multiplier are not zero;
wherein X is the total number of bits of the multiplicand, and the multiplicand is the one of the attenuation value and the convolution result accumulated value that is not used as the multiplier.
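Behaviourally, the multiplication unit of claim 15 performs shift-and-add multiplication in which zero multiplier bits contribute nothing; the sketch below models that behaviour in ordinary integer arithmetic, with the function name chosen for illustration.

```python
# Each loop iteration corresponds to one shift-register stage: the multiplicand
# is shifted one bit towards the high order, the zero-value check gates the
# stage on the corresponding multiplier bit, and gated partial products are
# summed (in hardware, by the chain of serial bit adders).
def shift_and_add_multiply(multiplicand, multiplier_bits):
    """multiplier_bits: LSB-first list of 0/1 (one per zero-value judging unit)."""
    product, shifted = 0, multiplicand
    for bit in multiplier_bits:
        if bit:                    # non-zero bit: pass the shifted value onward
            product += shifted     # serial bit adder accumulates the partial product
        shifted <<= 1              # next stage: shift one bit towards the high order
    return product
```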
16. A control device of an impulse neural network, comprising:
a first acquisition module, used for acquiring target feature data of the impulse neural network currently to be subjected to a convolution operation, wherein the target feature data is a numerical matrix;
a second acquisition module, used for acquiring, from a preset convolution result library, a target convolution result corresponding to the numerical distribution of each position in the target feature data, wherein the preset convolution result library pre-stores convolution results corresponding to a plurality of numerical distributions of each position in feature data;
an accumulation module, used for accumulating the target convolution result into a current convolution result accumulated value to obtain a latest convolution result accumulated value;
a judging module, used for judging whether the latest convolution result accumulated value meets a pulse generation condition, triggering the generating module if so, and triggering the first acquisition module if not;
and the generating module, used for resetting the convolution result accumulated value and generating a pulse so as to analyze the target feature data.
17. A control device of an impulse neural network, characterized in that the control device is arranged to implement the steps of the control method of an impulse neural network according to any one of claims 1 to 15.
18. The control device of claim 17, wherein the control device is a field programmable gate array.
19. The control device according to claim 17, characterized in that the control device comprises:
a memory for storing a computer program;
a processor for implementing the steps of the control method of an impulse neural network according to any one of claims 1 to 15 when executing the computer program.
20. A storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the control method of an impulse neural network according to any one of claims 1 to 15.
CN202310467454.9A 2023-04-27 2023-04-27 Control method, device, equipment and storage medium of impulse neural network Active CN116205274B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310467454.9A CN116205274B (en) 2023-04-27 2023-04-27 Control method, device, equipment and storage medium of impulse neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310467454.9A CN116205274B (en) 2023-04-27 2023-04-27 Control method, device, equipment and storage medium of impulse neural network

Publications (2)

Publication Number Publication Date
CN116205274A true CN116205274A (en) 2023-06-02
CN116205274B CN116205274B (en) 2023-07-21

Family

ID=86511470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310467454.9A Active CN116205274B (en) 2023-04-27 2023-04-27 Control method, device, equipment and storage medium of impulse neural network

Country Status (1)

Country Link
CN (1) CN116205274B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200341758A1 (en) * 2017-12-29 2020-10-29 Nationz Technologies Inc. Convolutional Neural Network Hardware Acceleration Device, Convolutional Calculation Method, and Storage Medium
CN109816026A (en) * 2019-01-29 2019-05-28 清华大学 The fusion structure and method of convolutional neural networks and impulsive neural networks
CN110109646A (en) * 2019-03-28 2019-08-09 北京迈格威科技有限公司 Data processing method, device and adder and multiplier and storage medium
CN110070178A (en) * 2019-04-25 2019-07-30 北京交通大学 A kind of convolutional neural networks computing device and method
WO2021115262A1 (en) * 2019-12-09 2021-06-17 南京惟心光电系统有限公司 Pulse convolutional neural network algorithm, integrated circuit, computing apparatus, and storage medium
CN113033759A (en) * 2019-12-09 2021-06-25 南京惟心光电系统有限公司 Pulse convolution neural network algorithm, integrated circuit, arithmetic device, and storage medium
CN113128675A (en) * 2021-04-21 2021-07-16 南京大学 Multiplication-free convolution scheduler based on impulse neural network and hardware implementation method thereof
CN114595802A (en) * 2022-01-20 2022-06-07 鹏城实验室 Data compression-based impulse neural network acceleration method and device
CN114580613A (en) * 2022-03-01 2022-06-03 珠海亿智电子科技有限公司 Data processing method, device, terminal and storage medium
CN114819114A (en) * 2022-07-04 2022-07-29 南京大学 Pulse neural network hardware accelerator and optimization method thereof in convolution operation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王雅迪: "Algorithm Design Based on Spiking Neural Network and Its Hardware Design", China Excellent Master's Theses Full-text Database (Electronic Journal), pages 1-71 *

Also Published As

Publication number Publication date
CN116205274B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN106875011B (en) Hardware architecture of binary weight convolution neural network accelerator and calculation flow thereof
CN110070178A (en) A kind of convolutional neural networks computing device and method
CN112292816A (en) Processing core data compression and storage system
CN111626403B (en) Convolutional neural network accelerator based on CPU-FPGA memory sharing
CN111768458A (en) Sparse image processing method based on convolutional neural network
CN108881254B (en) Intrusion detection system based on neural network
CN110163338A (en) Chip operation method, device, terminal and chip with operation array
CN109359729B (en) System and method for realizing data caching on FPGA
Shahshahani et al. Memory optimization techniques for fpga based cnn implementations
CN113313244B (en) Near-storage neural network accelerator for addition network and acceleration method thereof
CN116205274B (en) Control method, device, equipment and storage medium of impulse neural network
CN113222129B (en) Convolution operation processing unit and system based on multi-level cache cyclic utilization
CN106681830A (en) Task cache space monitoring method and device
CN101635001A (en) Method and apparatus for extracting information from a database
Tao et al. Hima: A fast and scalable history-based memory access engine for differentiable neural computer
CN110569970B (en) Data transmission method applied to hardware accelerator in convolutional neural network
CN116305747A (en) Workflow multi-target scheduling method based on improved whale optimization algorithm
CN116227599A (en) Inference model optimization method and device, electronic equipment and storage medium
CN102722448A (en) Method and device for managing high speed memories
CN113128688B (en) General AI parallel reasoning acceleration structure and reasoning equipment
Haikala Cache hit ratios with geometric task switch intervals
CN114707649A (en) General convolution arithmetic device
CN109343826B (en) Reconfigurable processor operation unit for deep learning
CN111767204B (en) Spill risk detection method, device and equipment
CN111881327A (en) Big data processing capacity testing method based on vertex reordering and priority caching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant