CN111507465B - Configurable convolutional neural network processor circuit - Google Patents


Info

Publication number: CN111507465B
Authority: CN (China)
Prior art keywords: module, interval, bit, function, neural network
Legal status: Active (granted)
Application number: CN202010545278.2A
Other languages: Chinese (zh)
Other versions: CN111507465A
Inventors: 周军, 周勇, 刘嘉豪, 刘青松
Assignee (current and original): University of Electronic Science and Technology of China
Priority and filing date: 2020-06-16
Publication date of CN111507465A (application publication): 2020-08-07
Publication date of CN111507465B (grant): 2020-10-23

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/048 Activation functions
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation using electronic means
    • G06N 3/08 Learning methods

Abstract

The invention provides a configurable convolutional neural network processor circuit comprising an FIR (finite impulse response) filtering module, a windowing processing module and a neural network operation module. The neural network operation module comprises a convolutional layer, a pooling layer, a configurable activation function layer and a fully-connected layer. The configurable activation function layer comprises an absolute-value module, an interval judgment module, a first multiplexer, a configuration module, an address generation module, a RAM (random access memory), an interval expansion module and a second multiplexer, and can be configured with either a sigmoid or a tanh activation function together with a fitting error, which greatly improves the universality and flexibility of the processor. By combining layer-wise quantization with saturation truncation, the quantization standard of each neural network layer is made configurable and the risk of overflow is reduced. The FIR filtering function is realized by multiplexing the multiply-accumulate units of the fully-connected layer, and data are transferred in a two-stage data transmission mode, further reducing power consumption.

Description

Configurable convolutional neural network processor circuit
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to a configurable convolutional neural network processor circuit.
Background
Artificial intelligence (AI) is a strategic industry that will shape the future, and AI chips are a key technical link across the whole field: they form the foundation of China's AI industry and an important lever for achieving breakthroughs in AI. Deep learning is an important path toward artificial intelligence. Its greatest difference from traditional computing is that it requires massive parallel computation rather than large-scale logic programming, and the strong demand for this new computing paradigm in the AI era has driven the emergence of new special-purpose computing chips. The maturation of deep learning algorithms, rising computing capability and big data have together enabled the leap-forward development of artificial intelligence, and the ever-growing range of AI applications further increases the demand for computing power.
Deep convolutional neural networks are among the typical deep learning algorithms. They are currently implemented and applied mainly on software platforms: thanks to the many available deep learning frameworks, they can be implemented conveniently on a Central Processing Unit (CPU) or a Graphics Processing Unit (GPU) through software programming. However, a CPU cannot exploit the parallelism of the convolutional neural network algorithm well, and therefore cannot meet the low-latency and low-power requirements of most applications. A convolutional neural network implemented on a GPU can exploit this parallelism and achieve good performance, but its excessive power consumption cannot satisfy portable devices. Conventional Application Specific Integrated Circuit (ASIC) accelerators for artificial intelligence implement the computation of one particular algorithm with a dedicated circuit structure, but their configurability is poor and they cannot keep pace with the rapid development of artificial intelligence algorithms.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a configurable convolutional neural network processor circuit. By making the activation function layer structure and the quantization standard of each neural network layer configurable, it greatly improves the universality and accuracy of the neural network processor; by improving the Finite Impulse Response (FIR) filtering module and the windowing processing module, it reduces the chip area and power consumption of the neural network processor.
The specific technical scheme of the invention is as follows:
A configurable convolutional neural network processor circuit comprises an FIR filtering module, a windowing processing module and a neural network operation module, and is characterized in that the neural network operation module comprises a convolutional layer, a pooling layer, a configurable activation function layer and a fully-connected layer, wherein the configurable activation function layer is configured with either a sigmoid function or a tanh function and is also configured with a fitting error ε;
the sigmoid function or tanh function fitting formula configured by the configurable activation function layer is obtained by the following method:
For the input x, different intervals are divided such that the required error is less than ε. Let the sigmoid or tanh function be f(x) and the activation function (the fitted output) be y.

First, the sigmoid or tanh function is fitted over x ∈ [0, +∞) as follows:

Near x = 0, f(x) is expanded in a first-order Taylor series at 0, giving the fitting formula y = a0·x + b0; when y − f(x) = ε, the abscissa is x1, which gives the first-segment input interval [0, x1).

When the function f(x) = 1 − ε, the abscissa is xK+1, which gives the last input interval [xK+1, +∞); the fitting formula corresponding to the interval [xK+1, +∞) is y = 1.

From the first-segment input interval [0, x1) and the last input interval [xK+1, +∞), the middle input interval [x1, xK+1) is obtained. The middle input interval [x1, xK+1) is divided into K segmentation intervals [xi, xi+1), i = 1, …, K, where K is determined by the available logic and storage resources: the larger K is, the more comparison logic is required to judge the segmentation interval, and the fewer storage resources are required to store the mapped values of y.

Each segmentation interval [xi, xi+1) is divided into intra-segment cells of equal length [xi,j, xi,j+1), j = 1, …, Li, where Li is the number of intra-segment cells in the segmentation interval [xi, xi+1), xi,1 = xi and xi,Li+1 = xi+1. The intra-segment cell length is determined by the error ε: within the segmentation interval [xi, xi+1), each cell boundary xi,j corresponds to the function value yi,j = f(xi,j); a cell [xi,j, xi,j+1) corresponds to the argument interval of length xi,j+1 − xi,j, and the intra-segment cell length of the segmentation interval [xi, xi+1) is taken as the largest length (a power of two, so that the cell index can be obtained by a bit shift) for which the variation of f(x) over a single cell does not exceed ε. The intra-segment cell length of different segmentation intervals increases as x increases. The intra-segment cells use direct mapping, i.e. all inputs falling into a cell [xi,j, xi,j+1) are mapped to the same output value yi,j.

In summary, for x ∈ [0, +∞) the fitting formula of the sigmoid or tanh function is:

y = a0·x + b0,  for x ∈ [0, x1)
y = yi,j,       for x ∈ [xi,j, xi,j+1), i = 1, …, K, j = 1, …, Li        (1)
y = 1,          for x ∈ [xK+1, +∞)

Secondly, according to the point-symmetry properties of the sigmoid and tanh functions:

sigmoid(−x) = 1 − sigmoid(x),  tanh(−x) = −tanh(x)        (2)

the fitting formula of the sigmoid or tanh function in x ∈ (−∞, 0) is obtained, and finally the fitting formula of the sigmoid or tanh function over the whole argument interval is obtained.
further, the configurable activation function layer includes an absolute value taking module, an interval judging module, a first multiplexer, a configuration module, an address generating module, a RAM (Random Access Memory), an interval expanding module, and a second multiplexer;
the configuration process of the configurable activation function layer comprises the following steps:
firstly, the mapped values yi,j corresponding to the sequence numbers of all intra-segment cells [xi,j, xi,j+1) of the middle input interval [x1, xK+1) are stored in the RAM in order, the sequence number being the RAM address; and, according to whether the activation function y to be configured is a sigmoid function or a tanh function, the segmentation points xi of the segmentation intervals, the truncation number n(i) of each segmentation interval, the offset numbers b(i), the quantized fixed-point number of "1" and a 1-bit function switching bit are loaded into the configuration module; in the truncation number n(i), M is the quantization coefficient, i.e. the number of bits occupied by the fractional part of the N-bit fixed-point number; the offset number b(i) is the sequence number of the first intra-segment cell [xi,1, xi,2) of the segmentation interval [xi, xi+1) within the set of all intra-segment cells of the middle input interval [x1, xK+1); in the 1-bit function switching bit, 1 represents the tanh function and 0 represents the sigmoid function;

secondly, the input x passes through the absolute-value module, which outputs the absolute value |x| and the sign bit of the input x; the input x is a signed N-bit fixed-point number; the absolute value |x| is input to the interval judgment module and, combined with the segmentation points xi output by the configuration module to the interval judgment module, the interval in which the absolute value |x| lies is determined in the interval judgment module; the output y1 of the first multiplexer is controlled according to the interval judgment result, specifically:

if the interval judgment result is |x| < x1, the first multiplexer outputs y1 = a0·|x| + b0, where a0 and b0 come from the first-order Taylor expansion at 0 of the sigmoid or tanh function selected by the 1-bit function switching bit;

if the interval judgment result is |x| ≥ xK+1, the first multiplexer outputs y1 = 1, where 1 is the quantized fixed-point number of "1" output by the configuration module;

if the interval judgment result is x1 ≤ |x| < xK+1, the address generation module starts and calculates, from the truncation number n(i) and the offset number b(i) output to it by the configuration module, the RAM address at which the mapped value corresponding to the absolute value |x| is located; the RAM receives the RAM address output by the address generation module and outputs the mapped value RAM_out, and the first multiplexer outputs y1 = RAM_out;

the output y1 of the first multiplexer is then input to the second multiplexer, and the sign bit output by the absolute-value module controls whether the second multiplexer performs interval expansion on y1: if the sign bit of the input x is positive, the output is y = y1, i.e. the fitting value y of the sigmoid or tanh function for x ∈ [0, +∞); if the sign bit of the input x is negative, the output y is obtained from y1 through the interval expansion module. The 1-bit function switching bit output by the configuration module controls the operation of the interval expansion module, which produces its result according to the point-symmetry properties of the sigmoid and tanh functions shown in formula (2): if the 1-bit function switching bit is 1, the interval expansion module outputs −y1 and the second multiplexer outputs y = −y1; if the 1-bit function switching bit is 0, the interval expansion module outputs 1 − y1 and the second multiplexer outputs y = 1 − y1, giving the fitting value y of the sigmoid or tanh function for x ∈ (−∞, 0); here 1 is the quantized fixed-point number of "1" output by the configuration module.
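As a rough software model of the datapath just described (illustrative only; the cfg fields and the Python control flow stand in for the configuration-module registers and the hardware multiplexers, and the RAM-address rule it uses is the one detailed in the following paragraph):

```python
def activation_lookup(x_fix, cfg, frac_bits=8):
    """Software model of the configurable activation function layer (sketch).

    cfg is assumed to hold: seg (segmentation points x1..xK+1 as fixed-point ints),
    n (truncation numbers), b (offset numbers), ram (mapped values as ints),
    one (fixed-point "1"), is_tanh (1-bit function switching bit), and
    a0/b0 (Taylor coefficients of the selected function, fixed point).
    """
    sign_neg = x_fix < 0
    ax = -x_fix if sign_neg else x_fix               # absolute-value module

    if ax < cfg.seg[0]:                              # |x| < x1: Taylor segment
        y1 = (cfg.a0 * ax >> frac_bits) + cfg.b0
    elif ax >= cfg.seg[-1]:                          # |x| >= xK+1: saturated segment
        y1 = cfg.one
    else:                                            # middle input interval
        i = max(k for k in range(len(cfg.seg) - 1) if cfg.seg[k] <= ax)
        addr = ((ax - cfg.seg[i]) >> cfg.n[i]) + cfg.b[i]   # RAM address rule
        y1 = cfg.ram[addr]

    if not sign_neg:                                 # second multiplexer
        return y1
    return -y1 if cfg.is_tanh else cfg.one - y1      # interval expansion by symmetry
```

A cfg object built from the tables of the previous sketch, once quantized to fixed point, would drive this lookup.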
Furthermore, in the configurable activation function layer, the steps of calculating, from the truncation number n(i) and the offset number b(i), the RAM address at which the mapped value corresponding to the absolute value |x| is located are as follows: suppose |x| falls into xi ≤ |x| < xi+1; then the RAM address is ((|x| − xi) >> n(i)) + b(i), i.e. the absolute value |x| minus the left boundary xi of the interval [xi, xi+1), shifted right by the truncation number n(i) and added to the offset number b(i), gives the sequence number of the intra-segment cell in which the input x lies, which is the RAM address.
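For instance, as a purely illustrative numeric check (the values are not taken from the patent): with M = 8 fractional bits, a segmentation interval whose left boundary is xi = 1.0 (fixed-point value 256), an intra-segment cell length of 1/16 (so n(i) = 8 − 4 = 4) and an offset number b(i) = 32, an input with |x| = 1.40625 (fixed-point value 360) yields the RAM address ((360 − 256) >> 4) + 32 = 6 + 32 = 38.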
Furthermore, the neural network operation module also comprises a layered quantization configuration module. The quantization standard of each neural network layer is configured by combining layer-wise quantization with saturation truncation, so that the hardware calculation result of each layer does not overflow as far as possible; the quantization standard of each layer is determined in advance by software testing and is then configured to each layer through the layered quantization configuration module. The configuration process of the layered quantization configuration module is as follows:

The input of the convolutional layer is a signed N-bit fixed-point number whose quantization standard is that of the previous convolutional layer, namely a P-bit fraction. The intermediate value of the multiply-accumulate operation of the current convolutional layer is represented as a signed 2N-bit fixed-point number whose binary point lies between the 2P-th and (2P+1)-th bits counted from the least significant bit. If the quantization standard of the current layer neural network is set to a Q-bit fraction, the Q bits after the binary point and the N−Q bits before the binary point of the signed 2N-bit intermediate value are extracted as the signed N-bit fixed-point operation result of the current convolutional layer. If the extracted signed N-bit fixed-point operation result overflows, saturation truncation is applied to the overflowing value: on positive overflow, the operation result judged to overflow is reset to the maximum positive value; on negative overflow, it is reset to the minimum negative value.
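A minimal software sketch of this requantization step (assuming N = 16 and P = Q = 8; the function name and default values are illustrative, not taken from the patent):

```python
def requantize(acc_2n, n_bits=16, p=8, q=8):
    """Illustrative model of layer-wise quantization with saturation truncation.

    acc_2n: signed 2N-bit multiply-accumulate result carrying 2P fractional bits
            (inputs and weights each carried P fractional bits).
    Returns a signed N-bit value with Q fractional bits, saturated on overflow.
    """
    # Align the binary point: keep Q fractional bits out of the 2P available.
    shifted = acc_2n >> (2 * p - q)

    # Saturate to the signed N-bit range instead of letting the value wrap around.
    max_pos = (1 << (n_bits - 1)) - 1
    min_neg = -(1 << (n_bits - 1))
    if shifted > max_pos:
        return max_pos
    if shifted < min_neg:
        return min_neg
    return shifted
```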
Furthermore, the fully-connected layer comprises multiply-accumulate units, and the FIR filtering module multiplexes the multiply-accumulate units of the fully-connected layer. The operation of the neural network operation module is divided into a neural network calculation mode and an FIR filtering calculation mode. When the system is in the neural network calculation mode, the inputs of the multipliers in the multiply-accumulate units of the fully-connected layer are selected to be the fully-connected layer input feature map and the fully-connected layer weights; when the system is in the FIR filtering calculation mode, the inputs of the multipliers are selected to be the input signal to be filtered and the FIR coefficients, and the FIR filtering module performs noise reduction on the input signal through the filter formed by the FIR coefficients and the multiply-accumulate units.
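The mode multiplexing can be pictured with the following sketch (the mode strings and dictionary keys are assumptions for illustration; the real circuit selects the multiplier operands with multiplexers rather than Python branches):

```python
def mac_array(mode, mac_inputs):
    """Illustrative model of the shared multiply-accumulate unit of the fully-connected layer.

    In "nn" mode the operands are the FC input feature map and FC weights;
    in "fir" mode they are the signal samples in the filter window and the FIR taps.
    """
    if mode == "nn":
        a, b = mac_inputs["feature_map"], mac_inputs["weights"]
    else:  # "fir"
        a, b = mac_inputs["signal_window"], mac_inputs["fir_coeffs"]
    acc = 0
    for x, w in zip(a, b):       # the same multiplier/adder serves both modes
        acc += x * w
    return acc
```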
Furthermore, the windowing processing module comprises a cache module, a windowing information calculation module and a data control module, and inputs data to the neural network operation module in a two-stage data transmission mode;
In the first stage of data transmission, the denoised signal output by the FIR filtering module is fed simultaneously into the cache module and the windowing information calculation module. The windowing information calculation module computes, from the input signal, the mark position of a window, which serves as the windowing information and is output to the data control module; the data control module then reads the data preceding the mark position from the cache module in one pass and outputs them to the neural network operation module. The second stage of data transmission is then performed: after receiving the windowing information from the windowing information calculation module, the data control module determines how much windowed data still needs to be received, receives the data after the mark position directly from the FIR filtering module in real time and outputs them to the neural network operation module, so that finally all data in the window have been delivered to the neural network operation module.
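A behavioural sketch of the two-stage transfer (illustrative; mark_of, pre_len, post_len and send are assumed stand-ins for the windowing information calculation, the window geometry and the downstream interface):

```python
def stream_window(samples, mark_of, pre_len, post_len, send):
    """Illustrative model of the two-stage data transmission.

    Stage 1: samples are buffered until the mark position is found, then the pre_len
    samples before the mark are read from the buffer and sent in one pass.
    Stage 2: the post_len samples after the mark are forwarded directly as they
    arrive, without being buffered, saving the corresponding cache accesses.
    """
    cache = []
    mark = None
    for idx, s in enumerate(samples):
        if mark is None:
            cache.append(s)                       # stage 1: buffer while searching
            if mark_of(cache):                    # windowing information found
                mark = idx
                send(cache[max(0, mark - pre_len):mark + 1])
        else:
            send([s])                             # stage 2: forward in real time
            if idx - mark >= post_len:
                break
```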
The invention has the beneficial effects that:
1. By adopting the configurable activation function layer, the invention can configure either a sigmoid or a tanh function together with the fitting error, so it can adapt to a family of derived neural network algorithms and apply different quantization strategies to different neural networks in different application scenarios; compared with traditional processors that support only a single network, universality and flexibility are greatly improved.
2. By combining layer-wise quantization with saturation truncation, the invention makes the quantization standard of each neural network layer configurable, reduces the risk of overflow and improves the accuracy of the neural network processor.
3. The invention realizes the FIR filtering function by multiplexing the multiply-accumulate units of the fully-connected layer in the neural network operation module, reducing the chip area of the processor while also reducing the computational complexity and power consumption of the neural network.
4. Compared with caching all input signals first and then reading them into the network, the two-stage data transmission mode saves additional power.
Drawings
FIG. 1 is a block diagram of a configurable convolutional neural network processor circuit of embodiment 1 of the present invention;
FIG. 2 is a block diagram of a configurable activation function layer in a configurable convolutional neural network processor circuit according to embodiment 1 of the present invention;
FIG. 3 shows the mapping relationship of the activation function y in the configurable convolutional neural network processor circuit according to embodiment 1 of the present invention;
fig. 4 is a block diagram of a windowing processing module in the configurable convolutional neural network processor circuit according to embodiment 1 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described with reference to the following embodiments and the accompanying drawings.
Example 1
The embodiment provides a configurable convolutional neural network processor circuit, as shown in fig. 1, which includes an FIR filtering module, a windowing processing module and a neural network operation module. The neural network operation module includes a convolutional layer, a pooling layer, a configurable activation function layer and a fully-connected layer, and the configurable activation function layer is configured with either a sigmoid function or a tanh function and with a fitting error ε;
the sigmoid function or tanh function fitting formula configured by the configurable activation function layer is obtained by the following method:
for input
Figure 987270DEST_PATH_IMAGE067
Dividing into different intervals with required error less than
Figure 726556DEST_PATH_IMAGE002
Sigmoid function or tanh function of
Figure 898911DEST_PATH_IMAGE068
The activation function is
Figure 850687DEST_PATH_IMAGE004
First, for
Figure 119994DEST_PATH_IMAGE059
The fitting procedure for the sigmoid function or tanh function is as follows:
when in use
Figure 967864DEST_PATH_IMAGE069
When, to
Figure 689833DEST_PATH_IMAGE068
Performing first-order Taylor expansion at 0 to obtain a fitting formula
Figure 648561DEST_PATH_IMAGE070
When is coming into contact with
Figure 506796DEST_PATH_IMAGE071
When the abscissa is
Figure 322305DEST_PATH_IMAGE009
To obtainFirst segment input interval
Figure 469253DEST_PATH_IMAGE072
(ii) a Wherein the fitting formula of the sigmoid function is
Figure 293989DEST_PATH_IMAGE073
The fitting formula of the tanh function is
Figure 209993DEST_PATH_IMAGE074
When function
Figure 930824DEST_PATH_IMAGE075
When the abscissa is
Figure 627385DEST_PATH_IMAGE076
Obtaining the last input interval
Figure 927916DEST_PATH_IMAGE077
Said
Figure 760743DEST_PATH_IMAGE077
The fitting formula of interval correspondence is
Figure 918055DEST_PATH_IMAGE078
According to the first segment input interval
Figure 39594DEST_PATH_IMAGE072
And end input interval
Figure 206134DEST_PATH_IMAGE077
Obtaining the middle section input interval
Figure 627888DEST_PATH_IMAGE079
Inputting the middle section into the interval
Figure 159363DEST_PATH_IMAGE079
Division into 4 segmented intervals
Figure 564937DEST_PATH_IMAGE080
(ii) a Will segment into intervals
Figure 472850DEST_PATH_IMAGE081
Divided into intra-segment cells of equal length
Figure 280269DEST_PATH_IMAGE082
Wherein
Figure 982646DEST_PATH_IMAGE020
For segmenting intervals
Figure 612866DEST_PATH_IMAGE081
The number of cells in the inner segment, and
Figure 121207DEST_PATH_IMAGE083
(ii) a Length of inter-cell within segment is determined by error
Figure 720816DEST_PATH_IMAGE002
Determining, in the segment interval
Figure 390832DEST_PATH_IMAGE081
In (1),
Figure 770998DEST_PATH_IMAGE022
correspond to
Figure 755134DEST_PATH_IMAGE084
Is provided with
Figure 537145DEST_PATH_IMAGE085
And is
Figure 581325DEST_PATH_IMAGE086
Figure 448787DEST_PATH_IMAGE087
Corresponding to an independent variable interval of
Figure 33352DEST_PATH_IMAGE088
If, if
Figure 607552DEST_PATH_IMAGE089
Then take the segment interval
Figure 884950DEST_PATH_IMAGE081
Has an intra-segment inter-cell length of
Figure 911812DEST_PATH_IMAGE090
The length of the intra-segment cells between different segment intervals follows
Figure 565647DEST_PATH_IMAGE030
Is increased by an increase in; the intra-segment cells adopt a direct mapping mode, namely, the intra-segment cells fall into
Figure 791092DEST_PATH_IMAGE037
All inputs within are mapped to the same output value
Figure 177074DEST_PATH_IMAGE091
In summary, in
Figure 19128DEST_PATH_IMAGE059
The fitting formula of the sigmoid function or the tanh function is as follows:
Figure 148758DEST_PATH_IMAGE033
(1)
the activation function
Figure 228710DEST_PATH_IMAGE004
The mapping relation diagram is shown in FIG. 3, and is divided into a first segment of input intervals
Figure 113489DEST_PATH_IMAGE072
Four-segment subsection interval of middle segment input interval
Figure 114943DEST_PATH_IMAGE080
And end input interval
Figure 110581DEST_PATH_IMAGE077
According to the point symmetry properties of the sigmoid function and the tanh function:
Figure 248301DEST_PATH_IMAGE092
(2)
obtaining sigmoid function or tanh function in
Figure 38403DEST_PATH_IMAGE035
Finally obtaining the fitting formula of the sigmoid function or the tanh function in the whole independent variable interval.
Further, the configurable activation function layer is shown in fig. 2 and includes an absolute value taking module, an interval judging module, a first multiplexer, a configuration module, an address generating module, a RAM, an interval expanding module, and a second multiplexer;
the configuration process of the configurable activation function layer comprises the following steps:
firstly, the mapped values yi,j corresponding to the sequence numbers of all intra-segment cells [xi,j, xi,j+1) of the middle input interval [x1, x5) are stored in the RAM in order, the sequence number being the RAM address; and, according to whether the activation function y to be configured is a sigmoid function or a tanh function, the segmentation points xi of the segmentation intervals, the truncation numbers n(i) of the segmentation intervals, the offset numbers b(i), the quantized fixed-point number of "1" and a 1-bit function switching bit are loaded into the configuration module. The truncation number n(i) is the quantization coefficient M minus the absolute value of the base-2 logarithm of the intra-segment cell length of the segmentation interval [xi, xi+1), where the quantization coefficient M is the number of bits occupied by the fractional part of the N-bit fixed-point number. The offset number b(i) is the sequence number of the first intra-segment cell [xi,1, xi,2) of the segmentation interval [xi, xi+1) within the set of all intra-segment cells of the middle input interval [x1, x5). In the 1-bit function switching bit, 1 represents the tanh function and 0 represents the sigmoid function;

secondly, the input x passes through the absolute-value module, which outputs the absolute value |x| and the sign bit of the input x; the input x is a signed N-bit fixed-point number; the absolute value |x| is input to the interval judgment module and, combined with the segmentation points xi output by the configuration module to the interval judgment module, the interval in which the absolute value |x| lies is determined in the interval judgment module; the output y1 of the first multiplexer is controlled according to the interval judgment result, specifically:

if the interval judgment result is |x| < x1, the first multiplexer outputs y1 = a0·|x| + b0, where the fitting formula of the sigmoid function is y = x/4 + 1/2 and the fitting formula of the tanh function is y = x;

if the interval judgment result is |x| ≥ x5, the first multiplexer outputs y1 = 1, where 1 is the quantized fixed-point number of "1" output by the configuration module;

if the interval judgment result is x1 ≤ |x| < x5, the address generation module starts and calculates, from the truncation numbers n(i) and offset numbers b(i) output to it by the configuration module, the RAM address at which the mapped value corresponding to |x| is located: suppose |x| falls into xi ≤ |x| < xi+1; then the RAM address is ((|x| − xi) >> n(i)) + b(i), i.e. the absolute value |x| of the input x minus the left boundary xi of the interval [xi, xi+1), shifted right by the truncation number n(i) and added to the offset number b(i), gives the sequence number of the intra-segment cell in which the input x lies, which is the RAM address. The RAM receives the RAM address output by the address generation module and outputs the mapped value RAM_out, and the first multiplexer outputs y1 = RAM_out;

the output y1 of the first multiplexer is then input to the second multiplexer, and the sign bit output by the absolute-value module controls whether the second multiplexer performs interval expansion on y1: if the sign bit of the input x is positive, the output is y = y1, i.e. the fitting value y of the sigmoid or tanh function for x ∈ [0, +∞); if the sign bit of the input x is negative, the output y is obtained from y1 through the interval expansion module. The 1-bit function switching bit output by the configuration module controls the operation of the interval expansion module, which produces its result according to the point-symmetry properties of the sigmoid and tanh functions shown in formula (2): if the 1-bit function switching bit is 1, the interval expansion module outputs −y1 and the second multiplexer outputs y = −y1; if the 1-bit function switching bit is 0, the interval expansion module outputs 1 − y1 and the second multiplexer outputs y = 1 − y1, giving the fitting value y of the sigmoid or tanh function for x ∈ (−∞, 0). The fitting value y of the sigmoid or tanh function is thereby finally obtained over the whole argument interval of the input x; here 1 is the quantized fixed-point number of "1" output by the configuration module.
Furthermore, the neural network operation module also comprises a layered quantization configuration module. The quantization standard of each neural network layer is configured by combining layer-wise quantization with saturation truncation, so that the hardware calculation result of each layer does not overflow as far as possible; the quantization standard of each layer is determined in advance by software testing and is then configured to each layer through the layered quantization configuration module. The configuration process of the layered quantization configuration module is as follows:

The input of the convolutional layer is an N-bit fixed-point number whose quantization standard is the P-bit fraction of the previous convolutional layer. The intermediate value of the multiply-accumulate operation of the current convolutional layer is represented as a 2N-bit fixed-point number whose binary point lies between the 2P-th and (2P+1)-th bits counted from the least significant bit. If the quantization standard of the current layer neural network is a Q-bit fraction, the Q bits after the binary point and the N−Q bits before the binary point of the 2N-bit intermediate value are extracted as the N-bit fixed-point operation result of the current convolutional layer. If the extracted N-bit fixed-point result still overflows, saturation truncation is applied to the overflowing value: on positive overflow, the N-bit fixed-point operation result judged to overflow is reset to the maximum positive value; on negative overflow, it is reset to the minimum negative value.
Further, the fully-connected layer comprises multiply-accumulate units, and the FIR filtering module multiplexes the multiply-accumulate units of the fully-connected layer. The operation of the neural network operation module is divided into a neural network calculation mode and an FIR filtering calculation mode. When the system is in the neural network calculation mode, the inputs of the multipliers in the multiply-accumulate units of the fully-connected layer are selected to be the fully-connected layer input feature map and the fully-connected layer weights; when the system is in the FIR filtering calculation mode, the inputs of the multipliers are selected to be the input signal to be filtered and the FIR coefficients, and the FIR filtering module performs noise reduction on the input signal through the filter formed by the FIR coefficients and the multiply-accumulate units, which effectively reduces the influence of noise on recognition accuracy while reducing the computational complexity and power consumption of the neural network.
Further, the windowing processing module, as shown in fig. 4, includes a cache module, a windowing information calculating module, and a data control module, and is configured to input data to the neural network operation module in a two-stage data transmission mode;
First-stage data transmission: the denoised signal output by the FIR filtering module is fed simultaneously into the cache module and the windowing information calculation module. The windowing information calculation module computes, from the input signal, the mark position of a window, for example the peak position, which serves as the windowing information and is output to the data control module; the data control module then reads the data preceding the mark position from the cache module in one pass and outputs them to the neural network operation module.
Second-stage data transmission: after receiving the windowing information from the windowing information calculation module, the data control module determines how much windowed data still needs to be received, receives the data after the mark position directly from the FIR filtering module in real time and outputs them to the neural network operation module.
Finally, all data in the window have been output to the neural network operation module, which saves the power that would otherwise be spent writing the data after the mark position into the cache module and reading them back out to the neural network operation module.

Claims (6)

1. A configurable convolutional neural network processor circuit comprising an FIR filtering module, a windowing processing module and a neural network operation module, characterized in that the neural network operation module comprises a convolutional layer, a pooling layer, a configurable activation function layer and a fully-connected layer, wherein the configurable activation function layer is configured with either a sigmoid function or a tanh function and is also configured with an error ε;
the sigmoid function or tanh function of the configurable activation function layer configuration is (x), and the fitting formula in x ∈ [0, + ∞) is as follows:
Figure FDA0002639520820000011
wherein y is an activation function;
for the first segment input interval [0, x1) And (x) is subjected to first-order Taylor expansion at 0 to obtain a fitting formula y as a0x+b0,x1Is when yAbscissa of the case- (x), where, as error, a in the fitting equation of the sigmoid function0Is composed of
Figure FDA0002639520820000012
b0Is composed of
Figure FDA0002639520820000013
Fitting formula of tanh function0Is 1, b0Is 0;
for the last segment input interval [ xK+1,+∞),xK+1Is the abscissa when (x) ═ 1-;
for the middle input interval [ x1,xK+1) Inputting the middle section into the interval [ x ]1,xK+1) Partitioning into K segmentsi,xi+1) K, and then segment interval [ x ═ 1i,xi+1) Divided into intra-segment cells of equal length
Figure FDA0002639520820000014
Wherein L isiIs a segment interval [ xi,xi+1) The number of cells in the inner segment; the intra-segment cells adopt a direct mapping mode and fall into the intra-segment cells
Figure FDA0002639520820000015
All inputs within are mapped to the same output value
Figure FDA0002639520820000016
According to the point symmetry properties of the sigmoid function and the tanh function:
Figure FDA0002639520820000017
and obtaining a fitting formula of the sigmoid function or the tanh function in x ∈ (— ∞,0), and finally obtaining the fitting formula of the sigmoid function or the tanh function in the whole independent variable interval.
2. The configurable convolutional neural network processor circuit as claimed in claim 1, wherein the configurable activation function layer comprises an absolute value taking module, an interval judging module, a first multiplexer, a configuration module, an address generating module, a RAM, an interval expanding module and a second multiplexer; the configuration process of the configurable activation function layer comprises the following steps:
firstly, the mapped values yi,j corresponding to the sequence numbers of all intra-segment cells [xi,j, xi,j+1) of the middle input interval [x1, xK+1) are stored in the RAM in order, the sequence number being the RAM address; according to whether the activation function y to be configured is a sigmoid function or a tanh function, the segmentation points xi of the segmentation intervals, the truncation number n(i) of each segmentation interval, the offset numbers b(i), the quantized fixed-point number of "1" and a 1-bit function switching bit are loaded into the configuration module; wherein, for the segmentation points xi, i = 1, …, K+1; for the truncation numbers n(i) and the offset numbers b(i), i = 1, …, K; M in the truncation number n(i) is the quantization coefficient; the offset number b(i) is the sequence number of the first intra-segment cell [xi,1, xi,2) of the segmentation interval within the set of all intra-segment cells of the middle input interval [x1, xK+1); in the 1-bit function switching bit, 1 represents the tanh function and 0 represents the sigmoid function;

secondly, the input x passes through the absolute-value module to obtain the absolute value |x| and the sign bit of the input x; the absolute value |x| is input to the interval judgment module and, combined with the segmentation points xi output by the configuration module to the interval judgment module, the interval in which the absolute value |x| lies is determined in the interval judgment module, and the output y1 of the first multiplexer is controlled according to the interval judgment result, specifically:

if the interval judgment result is |x| < x1, the first multiplexer outputs y1 = a0·|x| + b0, wherein a0 and b0 come from the first-order Taylor expansion at 0 of the sigmoid function or tanh function selected by the 1-bit function switching bit;

if the interval judgment result is |x| ≥ xK+1, the first multiplexer outputs y1 = 1, wherein 1 is the quantized fixed-point number of "1" output by the configuration module;

if the interval judgment result is x1 ≤ |x| < xK+1, the address generation module starts and calculates, according to the truncation number n(i) and the offset number b(i) output to the address generation module by the configuration module, the RAM address at which the mapped value corresponding to the absolute value |x| is located; the RAM receives the RAM address output by the address generation module, outputs the mapped value RAM_out, and the first multiplexer outputs y1 = RAM_out;

then the output y1 of the first multiplexer is input to the second multiplexer, and the sign bit output by the absolute-value module controls whether the second multiplexer performs interval expansion on y1; if the sign bit of the input x is positive, the output is y = y1, and the fitting value y of the sigmoid function or tanh function in x ∈ [0, +∞) is obtained; if the sign bit of the input x is negative, y is obtained through the output of the interval expansion module;

finally, the 1-bit function switching bit output by the configuration module controls the operation of the interval expansion module, and the interval expansion module outputs its result according to the point-symmetry properties of the sigmoid and tanh functions; if the 1-bit function switching bit is 1, the interval expansion module outputs −y1 and the second multiplexer outputs y = −y1; if the 1-bit function switching bit is 0, the interval expansion module outputs 1 − y1 and the second multiplexer outputs y = 1 − y1, so that the fitting value y of the sigmoid function or tanh function in x ∈ (−∞, 0) is obtained; wherein 1 is the quantized fixed-point number of "1" output by the configuration module.
3. The configurable convolutional neural network processor circuit of claim 2, wherein the step of calculating the RAM address at which the mapped value corresponding to the absolute value |x| is located is: suppose |x| falls into xi ≤ |x| < xi+1; then the RAM address is ((|x| − xi) >> n(i)) + b(i).
4. The configurable convolutional neural network processor circuit as claimed in claim 1, wherein the neural network operation module further comprises a layered quantization configuration module, which configures the quantization standard of each neural network layer by combining layer-wise quantization with saturation truncation so as to avoid overflow of the calculation result of each layer; the quantization standard of each neural network layer is configured to each convolutional layer and the fully-connected layer through the layered quantization configuration module; the configuration process of the layered quantization configuration module is as follows:
the input of the convolutional layer is a signed N-bit fixed-point number whose quantization standard is that of the previous convolutional layer, namely a P-bit fraction; the intermediate value of the multiply-accumulate operation of the current convolutional layer is represented as a signed 2N-bit fixed-point number whose binary point lies between the 2P-th and (2P+1)-th bits counted from the least significant bit; if the quantization standard of the current layer neural network is set to a Q-bit fraction, the Q bits after the binary point and the N−Q bits before the binary point of the intermediate value represented by the signed 2N-bit fixed-point number are extracted as the signed N-bit fixed-point operation result of the current convolutional layer; if the extracted signed N-bit fixed-point operation result overflows, saturation truncation is applied to the overflowing value: on positive overflow, the signed N-bit fixed-point operation result judged to overflow is reset to the maximum positive value; on negative overflow, it is reset to the minimum negative value.
5. The configurable convolutional neural network processor circuit of claim 1 wherein the fully-connected layers comprise multiply-accumulate units and the FIR filtering module multiplexes the multiply-accumulate units of the fully-connected layers.
6. The configurable convolutional neural network processor circuit as claimed in claim 1, wherein the windowing processing module comprises a buffer module, a windowing information calculation module and a data control module, and the two-stage data transmission mode is adopted to input data to the neural network operation module;
in the first stage of data transmission, the denoised signal output by the FIR filtering module is fed simultaneously into the cache module and the windowing information calculation module; the windowing information calculation module calculates, from the input signal, the mark position of a window as the windowing information and outputs it to the data control module, and the data control module reads the data before the mark position stored in the cache module and outputs them to the neural network operation module; the second stage of data transmission is then performed: after receiving the windowing information sent by the windowing information calculation module, the data control module determines how much windowed data still needs to be received, receives the data after the mark position directly from the FIR filtering module in real time and outputs them to the neural network operation module, and finally all data in the window are output to the neural network operation module.
Application CN202010545278.2A, priority date 2020-06-16, filing date 2020-06-16: Configurable convolutional neural network processor circuit. Status: Active. Granted as CN111507465B.

Priority Application / Application Claiming Priority (1)

CN202010545278.2A, priority and filing date 2020-06-16: Configurable convolutional neural network processor circuit

Publications (2)

CN111507465A, published 2020-08-07 (application publication)
CN111507465B, published 2020-10-23 (granted patent)

Family

ID=71877126

Family Applications (1)

CN202010545278.2A (Active), filed 2020-06-16: Configurable convolutional neural network processor circuit

Country Status (1)

CN: CN111507465B (en)

Families Citing this family (4)

(Publication number, priority date, publication date, assignee, title; * cited by examiner)
CN111738427B *, 2020-08-14, 2020-12-29, 电子科技大学: Operation circuit of neural network
CN112651497A, 2020-12-30, 2021-04-13, 深圳大普微电子科技有限公司: Hardware chip-based activation function processing method and device and integrated circuit
CN115601692A, 2021-07-08, 2023-01-13, 华为技术有限公司: Data processing method, training method and device of neural network model
CN113705776B *, 2021-08-06, 2023-08-08, 山东云海国创云计算装备产业创新中心有限公司: Method, system, equipment and storage medium for realizing activation function based on ASIC

Citations (6)

(Publication number, priority date, publication date, assignee, title; * cited by examiner)
CN107729984A *, 2017-10-27, 2018-02-23, 中国科学院计算技术研究所: Computing device and method suitable for neural network activation functions
CN107886166A *, 2016-09-29, 2018-04-06, 北京中科寒武纪科技有限公司: Apparatus and method for performing artificial neural network operations
CN108154224A *, 2018-01-17, 2018-06-12, 北京中星微电子有限公司: Method, apparatus and non-transitory computer-readable medium for data processing
CN108898216A *, 2018-05-04, 2018-11-27, 中国科学院计算技术研究所: Activation processing unit applied to neural networks
CN110751280A *, 2019-09-19, 2020-02-04, 华中科技大学: Configurable convolution accelerator applied to convolutional neural networks
CN110852416A *, 2019-09-30, 2020-02-28, 成都恒创新星科技有限公司: CNN accelerated computing method and system based on a low-precision floating-point data representation

Family Cites Families (3)

(Publication number, priority date, publication date, assignee, title; * cited by examiner)
US10949736B2 *, 2016-11-03, 2021-03-16, Intel Corporation: Flexible neural network accelerator and methods therefor
CN110163338B *, 2019-01-31, 2024-02-02, 腾讯科技(深圳)有限公司: Chip operation method and device with operation array, terminal and chip
CN110738311A *, 2019-10-14, 2020-01-31, 哈尔滨工业大学: LSTM network acceleration method based on high-level synthesis


Non-Patent Citations (1)

苏潮阳 et al., "一种神经网络的可配置激活函数模块设计" (A configurable activation function module design for neural networks), 《单片机与嵌入式系统应用》 (Microcontrollers & Embedded Systems), 2020, Vol. 20, No. 4. * Cited by examiner

Also Published As

CN111507465A, published 2020-08-07


Legal Events

Code  Title
PB01  Publication
SE01  Entry into force of request for substantive examination
GR01  Patent grant