CN111507465B - Configurable convolutional neural network processor circuit - Google Patents
- Publication number: CN111507465B
- Application number: CN202010545278.2A
- Authority
- CN
- China
- Prior art keywords
- module
- interval
- bit
- function
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation using electronic means
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/048—Activation functions
- G06N3/08—Learning methods
Abstract
The invention provides a configurable convolutional neural network processor circuit comprising an FIR (finite impulse response) filtering module, a windowing processing module and a neural network operation module. The neural network operation module comprises a convolutional layer, a pooling layer, a configurable activation function layer and a fully-connected layer; the configurable activation function layer comprises an absolute-value module, an interval judgment module, a first multiplexer, a configuration module, an address generation module, a RAM (random access memory), an interval expansion module and a second multiplexer, and can be configured with a sigmoid or tanh function and with an error bound, greatly improving the universality and flexibility of the processor. By combining layered quantization with saturation truncation, a configurable quantization standard for each layer of the neural network is realized and the risk of overflow is reduced. The FIR filtering function is realized by multiplexing the multiply-accumulate units of the fully-connected layer, and data are transmitted in a two-stage data transmission mode, further reducing power consumption.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to a configurable convolutional neural network processor circuit.
Background
Artificial Intelligence (AI) is a strategic industry that will lead the future. The AI chip is a key technical link across the whole field of artificial intelligence, the foundation of China's artificial intelligence industry, and an important lever for achieving breakthroughs in artificial intelligence. Deep learning is an important route toward artificial intelligence; its greatest difference from traditional computing is that it requires massive parallel computation rather than large-scale logic programming, and the strong demands of this new computing paradigm have driven the emergence of new special-purpose computing chips. The maturing of deep learning algorithms, rising computing power and big data have jointly enabled the leap-forward development of artificial intelligence, and the ever-growing range of AI applications further drives the demand for computing power.
The deep convolutional neural network is one of the typical deep learning algorithms. It is currently implemented and applied on software platforms: thanks to the many available deep learning frameworks, it can be implemented simply and conveniently on a Central Processing Unit (CPU) or Graphics Processing Unit (GPU) through software programming. However, the CPU cannot exploit the parallelism of convolutional neural network algorithms well, and therefore cannot meet the low-latency and low-power requirements of most applications. A convolutional neural network implemented on a GPU can exploit this parallelism well and thus achieve good performance, but its excessive power consumption cannot meet the requirements of portable devices. Conventional Application Specific Integrated Circuit (ASIC) accelerators for artificial intelligence computation implement a particular algorithm with a dedicated circuit structure, but their poor configurability cannot keep pace with the rapid development of artificial intelligence algorithms.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a configurable convolutional neural network processor circuit. By making the activation function layer structure and the quantization standard of each neural network layer configurable, and by improving the Finite Impulse Response (FIR) filtering module and the windowing processing module, it greatly improves the universality and accuracy of the neural network processor while reducing its chip area and power consumption.
The specific technical scheme of the invention is as follows:
a configurable convolutional neural network processor circuit comprises an FIR filtering module, a windowing processing module and a neural network operation module, and is characterized in that the neural network operation module comprises a convolutional layer, a pooling layer, a configurable activation function layer and a full-connection layer, wherein the configurable activation function layer is configured with a sigmoid function or a tanh function and is also configured with an error;
the sigmoid function or tanh function fitting formula configured by the configurable activation function layer is obtained by the following method:

firstly, the input x ∈ [0, +∞) is divided into different intervals; the required error is less than ε; the sigmoid or tanh function is y(x); the activation function is y;

when x ∈ [0, x1), y(x) is subjected to first-order Taylor expansion at 0 to obtain the fitting formula y = a0·x + b0; x1 is the abscissa at which a0·x + b0 − y(x) = ε, which gives the first input interval [0, x1);

when the function y(x) = 1 − ε, the abscissa is xK+1, which gives the last input interval [xK+1, +∞); the fitting formula corresponding to the interval [xK+1, +∞) is y = 1;

from the first input interval [0, x1) and the last input interval [xK+1, +∞), the middle input interval [x1, xK+1) is obtained. The middle input interval [x1, xK+1) is divided into K segment intervals [xi, xi+1), i = 1, …, K, where K is determined according to the available logic and storage resources: the larger K is, the more comparison logic is required to judge the segment interval, but the fewer storage resources are required to store the mapping values. Each segment interval [xi, xi+1) is divided into intra-segment cells of equal length Δi = (xi+1 − xi)/Li, where Li is the number of intra-segment cells of the segment interval [xi, xi+1). The intra-segment cell length Δi is determined by the error ε: in the segment interval [xi, xi+1) the slope of y(x) is largest at xi, so Δi is taken small enough (as a power of two, so that the RAM address can later be formed by a right shift) that the approximation error within one cell does not exceed ε; the intra-segment cell length of the different segment intervals therefore increases as x increases. The intra-segment cells adopt direct mapping, i.e. all inputs falling into the same intra-segment cell are mapped to the same output value;

secondly, according to the point symmetry properties of the sigmoid function and the tanh function,

sigmoid(−x) = 1 − sigmoid(x),  tanh(−x) = −tanh(x)   (2)

the fitting formula of the sigmoid or tanh function on x ∈ (−∞, 0) is obtained, and finally the fitting formula of the sigmoid or tanh function over the whole argument interval is obtained;
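The fitting procedure described above can be sketched in software. The following Python model is an illustrative reconstruction rather than part of the patent: the error bound `eps`, the segment count `K = 4` and the number of cells per segment are assumed values, and each cell stores the sigmoid value at its midpoint as the direct-mapped output.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

eps = 0.01  # required error bound (illustrative)

# First interval [0, x1): first-order Taylor expansion at 0 gives y = x/4 + 1/2.
# x1 is the abscissa where the Taylor line deviates from sigmoid by eps.
x = 0.0
while (x / 4 + 0.5) - sigmoid(x) < eps:
    x += 1e-4
x1 = x

# Last interval [xK1, +inf): y = 1.  xK1 solves sigmoid(x) = 1 - eps.
xK1 = math.log((1 - eps) / eps)

# Middle interval [x1, xK1): K segment intervals, each split into equal cells;
# every cell stores one mapped output value (direct mapping).
K = 4
seg = [x1 + i * (xK1 - x1) / K for i in range(K + 1)]
cells_per_seg = 16                              # illustrative; chosen so cell error <= eps
table, offsets = [], []
for i in range(K):
    offsets.append(len(table))                  # offset number b(i)
    d = (seg[i + 1] - seg[i]) / cells_per_seg   # intra-segment cell length
    for j in range(cells_per_seg):
        table.append(sigmoid(seg[i] + (j + 0.5) * d))  # cell -> midpoint value

def fit(x):
    ax = abs(x)
    if ax < x1:
        y = ax / 4 + 0.5                        # Taylor segment
    elif ax >= xK1:
        y = 1.0                                 # saturated segment
    else:
        i = next(k for k in range(K) if ax < seg[k + 1])
        j = int((ax - seg[i]) / ((seg[i + 1] - seg[i]) / cells_per_seg))
        y = table[offsets[i] + j]               # direct mapping lookup
    return y if x >= 0 else 1.0 - y             # point symmetry of sigmoid
```

Evaluating `fit` anywhere on the real line then stays within the error bound of the true sigmoid, while only the middle interval consumes table storage.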
further, the configurable activation function layer includes an absolute value taking module, an interval judging module, a first multiplexer, a configuration module, an address generating module, a RAM (Random Access Memory), an interval expanding module, and a second multiplexer;
the configuration process of the configurable activation function layer comprises the following steps:
firstly, the mapping values yj corresponding to the sequence numbers j of all intra-segment cells of the middle input interval [x1, xK+1) are stored sequentially in the RAM; the sequence number is the RAM address. According to whether the activation function y to be configured is a sigmoid function or a tanh function, the segment points xi (i = 1, …, K+1) of the segment intervals, the truncation number n(i) and the offset number b(i) of each segment interval, the fixed-point number obtained by quantizing "1", and a 1-bit function switching bit are loaded into the configuration module. In the truncation number n(i), F is the quantization coefficient, i.e. the number of bits occupied by the fractional part of the N-bit fixed-point number; the offset number b(i) is the sequence number of the first intra-segment cell of the segment interval [xi, xi+1) among the sequence numbers of all intra-segment cells of the middle input interval [x1, xK+1); in the 1-bit function switching bit, 1 represents the tanh function and 0 represents the sigmoid function;
secondly, the input x passes through the absolute-value module to obtain the absolute value |x| of the input x and its sign bit, where x is a signed N-bit fixed-point number. The absolute value |x| is input to the interval judgment module; combined with the segment points xi output by the configuration module to the interval judgment module, the interval in which the absolute value |x| lies is judged in the interval judgment module, and the judgment result controls the output y(|x|) of the first multiplexer, specifically:

if the interval judgment result is |x| ∈ [0, x1), the first multiplexer outputs y(|x|) = a0·|x| + b0, the first-order Taylor expansion at 0 of the sigmoid or tanh function selected by the 1-bit function switching bit;

if the interval judgment result is |x| ∈ [xK+1, +∞), the first multiplexer outputs y(|x|) = 1, where 1 is the fixed-point number obtained by quantizing "1" output by the configuration module;

if the interval judgment result is |x| ∈ [x1, xK+1), the address generation module is started, and the RAM address of the mapping value corresponding to the absolute value |x| is calculated from the truncation number n(i) and the offset number b(i) output to the address generation module by the configuration module; the RAM receives the RAM address output by the address generation module and outputs the mapping value y(|x|) through the first multiplexer.

Then the output y(|x|) of the first multiplexer is input to the second multiplexer, and the sign bit output by the absolute-value module controls whether the second multiplexer performs interval expansion on y(|x|): if the sign bit of the input x is positive, the output is y(x) = y(|x|); if the sign bit of the input x is negative, the output y(x) is obtained from y(|x|) through the interval expansion module; thus the fitting value y(x) of the sigmoid or tanh function is obtained.
The 1-bit function switching bit output by the configuration module controls the operation of the interval expansion module, which forms its result according to the point symmetry properties of the sigmoid and tanh functions shown in formula (2): if the 1-bit function switching bit is 1, the interval expansion module outputs −y(|x|) through the second multiplexer; if the 1-bit function switching bit is 0, the interval expansion module outputs 1 − y(|x|) through the second multiplexer; thus the fitting value of the sigmoid or tanh function on x ∈ (−∞, 0) is obtained, where 1 is the fixed-point number obtained by quantizing "1" output by the configuration module.
Furthermore, the configurable activation function layer works according to the bit truncation numberOffset number ofCalculating absolute valuesThe steps corresponding to the RAM address where the mapping value is located are as follows: suppose thatFall intoThen the RAM address isI.e. absolute valuesMinus saidLeft boundary of intervalAfter, right shift by the number of truncationsPlus the offset numberGet inputAnd the sequence number between cells in the located segment is the RAM address.
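As an illustration of this subtract–shift–add address calculation, the sketch below models one segment with assumed parameter values (the fractional-bit count F, truncation number, offset number and segment boundary are hypothetical, not taken from the patent):

```python
# Fixed-point RAM-address sketch: the input code has F fractional bits, and the
# cell length of this segment is 2**(n_i - F) in real terms, so shifting the
# integer code of (|x| - x_i) right by n_i gives the cell index.
F = 8                       # quantization coefficient: fractional bits (assumed)
x_i = int(1.0 * 2**F)       # left boundary of the segment, as a fixed-point code
n_i = 4                     # truncation number: cell length = 2**(n_i - F) = 1/16
b_i = 32                    # offset number: index of this segment's first cell

def ram_address(abs_x_code):
    """RAM address = ((|x| - x_i) >> n(i)) + b(i)."""
    return ((abs_x_code - x_i) >> n_i) + b_i

# |x| = 1.5 lies 0.5 above the boundary; 0.5 / (1/16) = 8 cells in,
# so the address is 32 + 8 = 40.
addr = ram_address(int(1.5 * 2**F))
```

Because the cell length is a power of two, no divider is needed in hardware; the whole address generator is a subtractor, a shifter and an adder.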
Furthermore, the neural network operation module also comprises a layered quantization configuration module. The quantization standard of each neural network layer is configured by combining layered quantization with saturation truncation, so that the hardware calculation result of each layer does not overflow as far as possible; the quantization standard of each layer is determined in advance by software testing and is then applied to each layer through the layered quantization configuration module. The configuration process of the layered quantization configuration module is as follows:
the input of the convolutional layer is a signed N-bit fixed-point number whose quantization standard is that of the previous layer, a P-bit fraction; the intermediate value of the current convolutional layer's multiply-accumulate operation is represented as a signed 2N-bit fixed-point number whose binary point lies between the 2P-th and (2P+1)-th bits counted from the least significant bit. If the quantization standard of the current layer is set to a Q-bit fraction, the Q bits after the binary point and the N−Q bits before it are extracted from the signed 2N-bit intermediate value as the signed N-bit fixed-point result of the current convolutional layer. If the extracted signed N-bit result overflows, saturation truncation is applied to the overflowing value: on positive overflow the signed N-bit result judged to overflow is set to the positive maximum value, and on negative overflow it is set to the negative minimum value.
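A behavioural sketch of this truncate-and-saturate step follows; the widths N = 8, P = 4, Q = 5 are illustrative choices, and Python integers stand in for the hardware's fixed-point codes.

```python
def requantize(acc_2n, N=8, Q=5, P=4):
    """Truncate a signed 2N-bit accumulator (2P fractional bits) to a signed
    N-bit result with Q fractional bits, saturating on overflow."""
    assert -(1 << (2 * N - 1)) <= acc_2n < (1 << (2 * N - 1))
    # The accumulator carries 2P fractional bits; the result keeps Q of them,
    # so drop the lowest 2P - Q bits (keeping Q fractional + N-Q integer bits).
    shifted = acc_2n >> (2 * P - Q)
    hi = (1 << (N - 1)) - 1      # positive maximum of a signed N-bit number
    lo = -(1 << (N - 1))         # negative minimum
    if shifted > hi:
        return hi                # positive overflow -> positive maximum
    if shifted < lo:
        return lo                # negative overflow -> negative minimum
    return shifted
```

For example, an in-range accumulator code simply loses its lowest bits, while an out-of-range one clamps to the signed 8-bit extremes instead of wrapping around.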
Furthermore, the fully-connected layer comprises a multiply-accumulate (product accumulation) operation unit, and the FIR filtering module multiplexes this unit. The operation mode of the neural network operation module is divided into a neural network calculation mode and an FIR filtering calculation mode. When the system is in the neural network calculation mode, the inputs of the multiplier in the multiply-accumulate unit of the fully-connected layer are selected to be the fully-connected layer input feature map and the fully-connected layer weights; when the system is in the FIR filtering calculation mode, the inputs are selected to be the input signal to be filtered and the FIR coefficients, and the FIR filtering module performs noise reduction on the input signal through the filter formed by the FIR coefficients and the multiply-accumulate unit.
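The mode multiplexing can be pictured with a small Python model (illustrative only; the function names and toy data are assumptions, not the patent's signals): one MAC routine plays the role of the shared datapath, and the two modes differ only in what is fed to its multiplier.

```python
def mac(a_vec, b_vec):
    """One shared multiply-accumulate datapath."""
    acc = 0
    for a, b in zip(a_vec, b_vec):
        acc += a * b
    return acc

def fc_neuron(features, weights):
    # Neural network calculation mode: the multiplier inputs are the
    # fully-connected layer's input feature map and its weights.
    return mac(features, weights)

def fir_filter(signal, coeffs):
    # FIR filtering calculation mode: the same MAC unit is fed the signal
    # samples to be filtered and the FIR coefficients:
    # y[k] = sum_j h[j] * x[k - j].
    T = len(coeffs)
    return [mac(coeffs, signal[k - T + 1:k + 1][::-1])
            for k in range(T - 1, len(signal))]
```

Because both modes reduce to the same `mac` call, the hardware needs only one multiplier array plus input multiplexers, which is where the chip-area saving comes from.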
Furthermore, the windowing processing module comprises a cache module, a windowing information calculating module and a data control module, and data are input to the neural network operation module by adopting a two-stage data transmission mode;
in the first stage of data transmission, the denoised signal output by the FIR filtering module is input simultaneously to the cache module and the windowing information calculation module. The windowing information calculation module calculates the mark position of a window from the input signal as the windowing information and outputs it to the data control module, and the data control module reads the data before the mark position from the cache module in one pass and outputs it to the neural network operation module. In the second stage of data transmission, after receiving the windowing information sent by the windowing information calculation module, the data control module determines in real time how much data the window still requires, receives the data after the mark position directly from the FIR filtering module in real time, and outputs it to the neural network operation module; finally all the data in the window have been output to the neural network operation module.
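The two-stage transfer can be modelled as follows. This is an illustrative sketch: the threshold-based mark detector and the window length are invented stand-ins for the patent's windowing information calculation.

```python
def transfer(samples, is_mark, window_len):
    """Two-stage data transmission to the NN module: cache until the mark
    position is found, flush the cached pre-mark data once, then forward
    post-mark data in real time without touching the cache."""
    cache, out, mark_found = [], [], False
    for s in samples:
        if not mark_found:
            cache.append(s)             # stage 1: buffer while mark unknown
            if is_mark(s):
                mark_found = True
                out.extend(cache)       # one-shot read of the pre-mark data
        else:
            out.append(s)               # stage 2: real-time pass-through
            if len(out) == window_len:
                break                   # whole window delivered
    return out

# Mark = first sample reaching the (assumed) threshold 5.
window = transfer([1, 2, 5, 3, 4, 1, 0], lambda s: s >= 5, window_len=5)
```

Only the pre-mark prefix ever enters the cache, so the post-mark portion of each window avoids one RAM write and one RAM read compared with buffering the whole signal.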
The invention has the beneficial effects that:
1. by adopting the configurable activation function layer, the invention can configure a sigmoid or tanh function and an error bound, and can therefore adapt to a whole family of derived neural network algorithms and apply different quantization strategies to different neural networks in different application scenarios; compared with conventional processors that support only a single network, universality and flexibility are greatly improved;
2. by combining layered quantization with saturation truncation, the invention realizes a configurable quantization standard for each neural network layer, reduces the risk of overflow and improves the accuracy of the neural network processor;
3. the invention realizes the FIR filtering function by multiplexing the multiply-accumulate operation unit of the fully-connected layer in the neural network operation module, reducing the chip area of the processor while reducing the computational complexity and power consumption of the neural network;
4. compared with caching all input signals first and then reading them into the network, the two-stage data transmission mode saves more power.
Drawings
FIG. 1 is a block diagram of a configurable convolutional neural network processor circuit of embodiment 1 of the present invention;
FIG. 2 is a block diagram of a configurable activation function layer in a configurable convolutional neural network processor circuit according to embodiment 1 of the present invention;
FIG. 3 shows the mapping relation of the activation function y in the configurable convolutional neural network processor circuit according to embodiment 1 of the present invention;
fig. 4 is a block diagram of a windowing processing module in the configurable convolutional neural network processor circuit according to embodiment 1 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described with reference to the following embodiments and the accompanying drawings.
Example 1
The embodiment provides a configurable convolutional neural network processor circuit, as shown in fig. 1, which includes an FIR filtering module, a windowing processing module, and a neural network operation module, and is characterized in that the neural network operation module includes a convolutional layer, a pooling layer, a configurable activation function layer, and a full connection layer, and the configurable activation function layer configures a sigmoid function or a tanh function and also configures an error;
the sigmoid function or tanh function fitting formula configured by the configurable activation function layer is obtained by the following method:
firstly, the input x ∈ [0, +∞) is divided into different intervals; the required error is less than ε; the sigmoid or tanh function is y(x); the activation function is y;

when x ∈ [0, x1), y(x) is subjected to first-order Taylor expansion at 0 to obtain the fitting formula y = a0·x + b0; x1 is the abscissa at which a0·x + b0 − y(x) = ε, which gives the first input interval [0, x1); the fitting formula of the sigmoid function is y = x/4 + 1/2, and the fitting formula of the tanh function is y = x;

when the function y(x) = 1 − ε, the abscissa is x5, which gives the last input interval [x5, +∞); the fitting formula corresponding to the interval [x5, +∞) is y = 1;

from the first input interval [0, x1) and the last input interval [x5, +∞), the middle input interval [x1, x5) is obtained. The middle input interval [x1, x5) is divided into 4 segment intervals [xi, xi+1), i = 1, …, 4. Each segment interval [xi, xi+1) is divided into intra-segment cells of equal length Δi = (xi+1 − xi)/Li, where Li is the number of intra-segment cells of the segment interval [xi, xi+1). The intra-segment cell length Δi is determined by the error ε: in the segment interval [xi, xi+1) the slope of y(x) is largest at xi, so Δi is taken small enough that the approximation error within one cell does not exceed ε; the intra-segment cell length of the different segment intervals therefore increases as x increases. The intra-segment cells adopt direct mapping, i.e. all inputs falling into the same intra-segment cell are mapped to the same output value.

The mapping relation of the activation function y is shown in FIG. 3; it is divided into the first input interval [0, x1), the four segment intervals [x1, x2), [x2, x3), [x3, x4), [x4, x5) of the middle input interval, and the last input interval [x5, +∞).

According to the point symmetry properties of the sigmoid function and the tanh function,

sigmoid(−x) = 1 − sigmoid(x),  tanh(−x) = −tanh(x)   (2)

the fitting formula of the sigmoid or tanh function on x ∈ (−∞, 0) is obtained, and finally the fitting formula of the sigmoid or tanh function over the whole argument interval is obtained.
Further, the configurable activation function layer is shown in fig. 2 and includes an absolute value taking module, an interval judging module, a first multiplexer, a configuration module, an address generating module, a RAM, an interval expanding module, and a second multiplexer;
the configuration process of the configurable activation function layer comprises the following steps:
firstly, the mapping values yj corresponding to the sequence numbers j of all intra-segment cells of the middle input interval [x1, x5) are stored sequentially in the RAM; the sequence number is the RAM address. According to whether the activation function y to be configured is a sigmoid function or a tanh function, the segment points xi of the segment intervals, the truncation number n(i) and the offset number b(i) of each segment interval, the fixed-point number obtained by quantizing "1", and a 1-bit function switching bit are loaded into the configuration module. The truncation number is n(i) = F − |log2 Δi|, where the quantization coefficient F determines the number of bits occupied by the fractional part of the N-bit fixed-point number and Δi is the intra-segment cell length of the segment interval [xi, xi+1); the offset number b(i) is the sequence number of the first intra-segment cell of the segment interval [xi, xi+1) among the sequence numbers of all intra-segment cells of the middle input interval [x1, x5); in the 1-bit function switching bit, 1 represents the tanh function and 0 represents the sigmoid function;
secondly, the input x passes through the absolute-value module to obtain the absolute value |x| of the input x and its sign bit, where x is a signed N-bit fixed-point number. The absolute value |x| is input to the interval judgment module; combined with the segment points xi output by the configuration module to the interval judgment module, the interval in which the absolute value |x| lies is judged in the interval judgment module, and the judgment result controls the output y(|x|) of the first multiplexer, specifically:

if the interval judgment result is |x| ∈ [0, x1), the first multiplexer outputs y(|x|) = a0·|x| + b0, where the fitting formula of the sigmoid function is y = |x|/4 + 1/2 and the fitting formula of the tanh function is y = |x|;

if the interval judgment result is |x| ∈ [x5, +∞), the first multiplexer outputs y(|x|) = 1, where 1 is the fixed-point number obtained by quantizing "1" output by the configuration module;

if the interval judgment result is |x| ∈ [x1, x5), the address generation module is started, and the RAM address of the mapping value corresponding to |x| is calculated from the truncation number n(i) and the offset number b(i) output to the address generation module by the configuration module: suppose |x| falls into the segment interval [xi, xi+1); then the RAM address is ((|x| − xi) >> n(i)) + b(i), i.e. the left boundary xi of the interval is subtracted from the absolute value |x|, the result is shifted right by the truncation number n(i), and the offset number b(i) is added, which yields the sequence number of the intra-segment cell in which the input lies, i.e. the RAM address; the RAM receives the RAM address output by the address generation module and outputs the mapping value y(|x|) through the first multiplexer.

Then the output y(|x|) of the first multiplexer is input to the second multiplexer, and the sign bit output by the absolute-value module controls whether the second multiplexer performs interval expansion on y(|x|): if the sign bit of the input x is positive, the output is y(x) = y(|x|); if the sign bit of the input x is negative, the output y(x) is obtained from y(|x|) through the interval expansion module; thus the fitting value y(x) of the sigmoid or tanh function is obtained.

The 1-bit function switching bit output by the configuration module controls the operation of the interval expansion module, which forms its result according to the point symmetry properties of the sigmoid and tanh functions shown in formula (2): if the 1-bit function switching bit is 1, the interval expansion module outputs −y(|x|) through the second multiplexer; if the 1-bit function switching bit is 0, the interval expansion module outputs 1 − y(|x|) through the second multiplexer; thus the fitting value of the sigmoid or tanh function on x ∈ (−∞, 0) and finally the fitting value y(x) at the input x are obtained, where 1 is the fixed-point number obtained by quantizing "1" output by the configuration module.
Furthermore, the neural network operation module also comprises a layered quantization configuration module. The quantization standard of each neural network layer is configured by combining layered quantization with saturation truncation, so that the hardware calculation result of each layer does not overflow as far as possible; the quantization standard of each layer is determined in advance by software testing and is then applied to each layer through the layered quantization configuration module. The configuration process of the layered quantization configuration module is as follows:
the input of the convolutional layer is set as an N-bit fixed-point number whose quantization standard is that of the previous layer, a P-bit fraction; the intermediate value of the current convolutional layer's multiply-accumulate operation is represented as a 2N-bit fixed-point number whose binary point lies between the 2P-th and (2P+1)-th bits counted from the least significant bit. If the quantization standard of the current layer is a Q-bit fraction, the Q bits after the binary point and the N−Q bits before it are extracted from the 2N-bit intermediate value as the N-bit fixed-point result of the current convolutional layer. If the extracted N-bit result still overflows, saturation truncation is applied to the overflowing value: on positive overflow the N-bit result judged to overflow is set to the positive maximum value, and on negative overflow it is set to the negative minimum value.
Further, the fully-connected layer comprises a multiply-accumulate (product accumulation) operation unit, and the FIR filtering module multiplexes this unit. The operation mode of the neural network operation module is divided into a neural network calculation mode and an FIR filtering calculation mode. When the system is in the neural network calculation mode, the inputs of the multiplier in the multiply-accumulate unit of the fully-connected layer are selected to be the fully-connected layer input feature map and the fully-connected layer weights; when the system is in the FIR filtering calculation mode, the inputs are selected to be the input signal to be filtered and the FIR coefficients, and the FIR filtering module performs noise reduction on the input signal through the filter formed by the FIR coefficients and the multiply-accumulate unit, thereby effectively reducing the influence of noise on recognition accuracy while reducing the computational complexity and power consumption of the neural network.
Further, the windowing processing module, as shown in fig. 4, includes a cache module, a windowing information calculating module, and a data control module, and is configured to input data to the neural network operation module in a two-stage data transmission mode;
first-stage data transmission: the denoised signal data output by the FIR filtering module are input simultaneously to the cache module and the windowing information calculation module; the windowing information calculation module calculates the mark position of a window, such as a peak position, from the input signal as the windowing information and outputs it to the data control module; the data control module reads the data before the mark position from the cache module in one pass and outputs it to the neural network operation module;
second-stage data transmission: after receiving the windowing information sent by the windowing information calculation module, the data control module determines in real time how much data the window still requires, receives the data after the mark position directly from the FIR filtering module in real time, and outputs it to the neural network operation module.
Finally, all the data in the window have been output to the neural network operation module, saving the power that would otherwise be consumed by writing the data after the mark position into the cache module and reading it back out to the neural network operation module.
Claims (6)
1. A configurable convolutional neural network processor circuit comprises an FIR filtering module, a windowing processing module and a neural network operation module, and is characterized in that the neural network operation module comprises a convolutional layer, a pooling layer, a configurable activation function layer and a full-connection layer, wherein the configurable activation function layer is configured with a sigmoid function or a tanh function and is also configured with an error;
the sigmoid function or tanh function of the configurable activation function layer configuration is (x), and the fitting formula in x ∈ [0, + ∞) is as follows:
wherein y is an activation function;
for the first segment input interval [0, x1) And (x) is subjected to first-order Taylor expansion at 0 to obtain a fitting formula y as a0x+b0,x1Is when yAbscissa of the case- (x), where, as error, a in the fitting equation of the sigmoid function0Is composed ofb0Is composed ofFitting formula of tanh function0Is 1, b0Is 0;
for the last segment input interval [x_(K+1), +∞), the fitting formula is y = 1, where x_(K+1) is the abscissa at which y(x) = 1 − ε;
for the middle input interval [x_1, x_(K+1)), the interval is first partitioned into K segment intervals [x_i, x_(i+1)), i = 1, …, K, and each segment interval [x_i, x_(i+1)) is then divided into L_i intra-segment cells of equal length (x_(i+1) − x_i)/L_i, where L_i is the number of intra-segment cells in the segment interval [x_i, x_(i+1)); the intra-segment cells use direct mapping, so that all inputs falling into the same intra-segment cell are mapped to one and the same output value;
according to the point-symmetry properties of the sigmoid function, sigmoid(−x) = 1 − sigmoid(x), and of the tanh function, tanh(−x) = −tanh(x), the fitting formula of the sigmoid function or the tanh function on x ∈ (−∞, 0) is obtained, and with it the fitting formula of the sigmoid function or the tanh function over the whole argument interval.
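As a rough software model of this three-region fit (first-order Taylor segment near 0, direct-mapped cells in the middle, saturation at 1, and point symmetry for negative inputs), the sketch below uses illustrative constants x_1 = 0.5, x_(K+1) = 8, and a single segment of 64 equal cells whose mapping values are the sigmoid evaluated at each cell midpoint; none of these constants come from the patent.

```python
import math

def fit_sigmoid(x, x1=0.5, xk1=8.0, cells=64):
    """Piecewise fit of sigmoid(x) in the spirit of claim 1
    (illustrative breakpoints and cell count)."""
    if x < 0:                       # point symmetry: sigmoid(-x) = 1 - sigmoid(x)
        return 1.0 - fit_sigmoid(-x, x1, xk1, cells)
    if x < x1:                      # first segment: first-order Taylor at 0
        return 0.5 + 0.25 * x       # a_0 = 1/4, b_0 = 1/2 for sigmoid
    if x >= xk1:                    # last segment: saturate to 1
        return 1.0
    # middle interval: direct mapping -- every input in a cell shares one value
    step = (xk1 - x1) / cells
    j = int((x - x1) / step)        # index of the intra-segment cell
    mid = x1 + (j + 0.5) * step     # map the whole cell to its midpoint value
    return 1.0 / (1.0 + math.exp(-mid))
```

The hedge here is that the patent leaves L_i and the segment boundaries configurable; the sketch fixes them only so the mapping can be demonstrated end to end.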
2. The configurable convolutional neural network processor circuit as claimed in claim 1, wherein the configurable activation function layer comprises an absolute-value module, an interval judgment module, a first multiplexer, a configuration module, an address generation module, a RAM, an interval expansion module and a second multiplexer; the configuration process of the configurable activation function layer comprises the following steps:
firstly, the mapping values corresponding to all the intra-segment cells of the middle input interval [x_1, x_(K+1)) are stored sequentially in the RAM by serial number, the serial number being the RAM address; according to whether the activation function y to be configured is the sigmoid function or the tanh function, the configuration module is loaded with the segmentation points x_i of the segment intervals, the truncation bit number n(i) of each segment interval, the offset number b(i), the fixed-point number obtained by quantizing "1", and a 1-bit function switching bit; for the segmentation points x_i, i = 1, …, K+1, and for the truncation number n(i) and the offset number b(i), i = 1, …, K; the offset number b(i) is the serial number of the first intra-segment cell of segment interval i among the intra-segment cells of the whole middle input interval [x_1, x_(K+1)); in the 1-bit function switching bit, 1 denotes the tanh function and 0 denotes the sigmoid function;
secondly, the input x passes through the absolute-value module, which outputs its absolute value |x| and its sign bit; the absolute value |x| is fed to the interval judgment module, which, using the segmentation points x_i supplied by the configuration module, determines which segment interval |x| falls into; the first multiplexer is then controlled to output y_1 according to the interval judgment result, specifically as follows:
if the interval judgment result is |x| < x_1, the first multiplexer outputs y_1 = a_0|x| + b_0, where a_0 and b_0 come from the first-order Taylor expansion at 0 of the sigmoid or tanh function selected by the 1-bit function switching bit;
if the interval judgment result is |x| ≥ x_(K+1), the first multiplexer outputs y_1 = 1, where 1 is the quantized fixed-point number of "1" output by the configuration module;
if the interval judgment result is x_1 ≤ |x| < x_(K+1), the address generation module calculates, from the truncation number n(i) and the offset number b(i) output to it by the configuration module, the RAM address at which the mapping value for the absolute value |x| is stored; the RAM receives the RAM address output by the address generation module and outputs the mapping value ram_out, which is output as y_1 through the first multiplexer;
then, the sign bit output by the absolute-value module controls, via the second multiplexer, whether the y_1 output by the first multiplexer undergoes interval expansion: if the sign bit of the input x is positive, the output is y = y_1, giving the fitting value of the sigmoid or tanh function on x ∈ [0, +∞);
finally, if the sign bit of the input x is negative, the 1-bit function switching bit output by the configuration module controls the operation of the interval expansion module, which produces its result according to the point-symmetry properties of the sigmoid and tanh functions: if the 1-bit function switching bit is 1, the interval expansion module outputs −y_1 and the second multiplexer outputs y = −y_1; if the 1-bit function switching bit is 0, the interval expansion module outputs 1 − y_1 and the second multiplexer outputs y = 1 − y_1; this gives the fitting value of the sigmoid or tanh function on x ∈ (−∞, 0), where 1 is the quantized fixed-point number of "1" output by the configuration module.
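A bit-level software model of this datapath might look as follows. The configuration dictionary, the Q8 fixed-point format, and the RAM contents are all assumptions chosen for illustration, not values taken from the patent.

```python
def activation_datapath(x, cfg, ram):
    """Sketch of the configurable activation layer of claim 2.
    cfg holds the breakpoints xi, truncation numbers n(i), offsets b(i),
    the quantized "1", Taylor coefficients a0/b0, the fractional bit
    count, and the 1-bit function switch (1 = tanh, 0 = sigmoid)."""
    sign_neg = x < 0                          # absolute-value module
    ax = -x if sign_neg else x
    xi, n, b = cfg["xi"], cfg["n"], cfg["b"]
    ONE = cfg["one"]                          # quantized fixed-point "1"
    if ax < xi[0]:                            # first multiplexer, case 1
        y1 = (cfg["a0"] * ax >> cfg["frac"]) + cfg["b0"]
    elif ax >= xi[-1]:                        # case 2: saturate to "1"
        y1 = ONE
    else:                                     # case 3: RAM lookup
        i = max(k for k in range(len(xi) - 1) if ax >= xi[k])
        addr = ((ax - xi[i]) >> n[i]) + b[i]  # address generation module
        y1 = ram[addr]
    if not sign_neg:                          # second multiplexer: x >= 0
        return y1
    # interval expansion module: tanh -> -y1, sigmoid -> ONE - y1
    return -y1 if cfg["switch"] else ONE - y1
```

For example, with a Q8 sigmoid configuration (`a0 = 64`, `b0 = 128`, `one = 256`, one segment `[128, 512)` with `n = [6]`), an input of 0 returns 128 (i.e. 0.5) and a large negative input returns 0, matching the point-symmetry behavior of the claim.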
3. The configurable convolutional neural network processor circuit of claim 2, wherein the RAM address holding the mapping value for the absolute value |x| is calculated as follows: suppose |x| falls into x_i ≤ |x| < x_(i+1); then the RAM address is ((|x| − x_i) >> n(i)) + b(i).
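This shift-based addressing works because the cell length within segment i is a power of two, 2^n(i), so the right shift is the integer divide that indexes the cell inside the segment, and b(i) offsets to the segment's first cell in the whole middle interval. A quick numeric check with made-up segment parameters:

```python
def ram_address(ax, xi_i, n_i, b_i):
    """RAM address of the cell containing |x|, per claim 3:
    ((|x| - x_i) >> n(i)) + b(i).  All arguments are fixed-point
    integers; xi_i is the segment's left endpoint."""
    return ((ax - xi_i) >> n_i) + b_i

# Segment starting at 128 with cell length 2**6 = 64 and offset b(i) = 10:
# |x| = 300 lies in the segment's third cell, so the address is 10 + 2 = 12.
```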
4. The configurable convolutional neural network processor circuit as claimed in claim 1, wherein the neural network operation module further comprises a hierarchical quantization configuration module, which combines layer-wise quantization with saturation truncation to set the quantization standard of each neural network layer so that no layer's calculation result overflows; the hierarchical quantization configuration module distributes the per-layer quantization standards to each neural network layer and to the fully-connected layer; the configuration process of the hierarchical quantization configuration module is as follows:
the input of the convolutional layer is a signed N-bit fixed-point number whose quantization standard is that of the previous convolutional layer, a P-bit fraction; the intermediate value of the current convolutional layer's multiply-accumulate operation is represented as a signed 2N-bit fixed-point number whose binary point lies between bit 2P and bit 2P+1, counting from the least significant bit; if the quantization standard of the current layer is set to a Q-bit fraction, the Q bits after the binary point and the N−Q bits before it are cut out of the signed 2N-bit intermediate value as the current convolutional layer's signed N-bit fixed-point result; if the truncated signed N-bit result overflows, the overflowing value is saturated: on positive overflow the result is reset to the positive maximum, and on negative overflow it is reset to the negative minimum.
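A minimal software sketch of this requantization with saturation, assuming 2P ≥ Q so the truncation is a plain right shift; the bit widths in the example (N = 8, P = Q = 4) are illustrative, not from the patent:

```python
def requantize(acc_2n, n_bits, p_frac, q_frac):
    """Cut a signed 2N-bit accumulator with 2P fractional bits down to a
    signed N-bit result with Q fractional bits, saturating on overflow
    instead of wrapping around (claim 4's saturation truncation)."""
    shifted = acc_2n >> (2 * p_frac - q_frac)   # keep Q fractional bits
    hi = (1 << (n_bits - 1)) - 1                # positive maximum
    lo = -(1 << (n_bits - 1))                   # negative minimum
    return max(lo, min(hi, shifted))            # saturation truncation
```

Note that Python's `>>` on negative integers floors toward negative infinity, matching the arithmetic right shift a hardware implementation would use.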
5. The configurable convolutional neural network processor circuit of claim 1, wherein the fully-connected layer comprises multiply-accumulate units, and the FIR filtering module multiplexes the multiply-accumulate units of the fully-connected layer.
6. The configurable convolutional neural network processor circuit as claimed in claim 1, wherein the windowing processing module comprises a cache module, a windowing information calculation module and a data control module, and inputs data to the neural network operation module in a two-stage data transmission mode;
in the first stage of data transmission, the denoised signal output by the FIR filtering module is fed simultaneously to the cache module and the windowing information calculation module; the windowing information calculation module locates the mark position of the window from the input signal, takes it as the windowing information, and outputs it to the data control module; the data control module then reads the data preceding the mark position from the cache module and outputs it to the neural network operation module; in the second stage of data transmission, upon receiving the windowing information from the windowing information calculation module, the data control module determines in real time how much data the window still requires, receives the data after the mark position directly from the FIR filtering module in real time, and outputs it to the neural network operation module, so that finally all data in the window are output to the neural network operation module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010545278.2A CN111507465B (en) | 2020-06-16 | 2020-06-16 | Configurable convolutional neural network processor circuit |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111507465A CN111507465A (en) | 2020-08-07 |
CN111507465B true CN111507465B (en) | 2020-10-23 |
Family
ID=71877126
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010545278.2A Active CN111507465B (en) | 2020-06-16 | 2020-06-16 | Configurable convolutional neural network processor circuit |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111507465B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111738427B (en) * | 2020-08-14 | 2020-12-29 | 电子科技大学 | Operation circuit of neural network |
CN112651497A (en) * | 2020-12-30 | 2021-04-13 | 深圳大普微电子科技有限公司 | Hardware chip-based activation function processing method and device and integrated circuit |
CN115601692A (en) * | 2021-07-08 | 2023-01-13 | 华为技术有限公司(Cn) | Data processing method, training method and device of neural network model |
CN113705776B (en) * | 2021-08-06 | 2023-08-08 | 山东云海国创云计算装备产业创新中心有限公司 | Method, system, equipment and storage medium for realizing activation function based on ASIC |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107729984A (en) * | 2017-10-27 | 2018-02-23 | 中国科学院计算技术研究所 | A kind of computing device and method suitable for neural network activation function |
CN107886166A (en) * | 2016-09-29 | 2018-04-06 | 北京中科寒武纪科技有限公司 | A kind of apparatus and method for performing artificial neural network computing |
CN108154224A (en) * | 2018-01-17 | 2018-06-12 | 北京中星微电子有限公司 | For the method, apparatus and non-transitory computer-readable medium of data processing |
CN108898216A (en) * | 2018-05-04 | 2018-11-27 | 中国科学院计算技术研究所 | Activation processing unit applied to neural network |
CN110751280A (en) * | 2019-09-19 | 2020-02-04 | 华中科技大学 | Configurable convolution accelerator applied to convolutional neural network |
CN110852416A (en) * | 2019-09-30 | 2020-02-28 | 成都恒创新星科技有限公司 | CNN accelerated computing method and system based on low-precision floating-point data expression form |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10949736B2 (en) * | 2016-11-03 | 2021-03-16 | Intel Corporation | Flexible neural network accelerator and methods therefor |
CN110163338B (en) * | 2019-01-31 | 2024-02-02 | 腾讯科技(深圳)有限公司 | Chip operation method and device with operation array, terminal and chip |
CN110738311A (en) * | 2019-10-14 | 2020-01-31 | 哈尔滨工业大学 | LSTM network acceleration method based on high-level synthesis |
Non-Patent Citations (1)
Title |
---|
Su Chaoyang et al. "Design of a Configurable Activation Function Module for Neural Networks." Microcontrollers & Embedded Systems, 2020, Vol. 20, No. 4. *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||