CN111105015A - General CNN inference accelerator, control method thereof and readable storage medium - Google Patents

General CNN inference accelerator, control method thereof and readable storage medium

Info

Publication number
CN111105015A
CN111105015A (application CN201911243224.4A)
Authority
CN
China
Prior art keywords
module
target data
pooling
calculation
convolution
Prior art date
Legal status
Withdrawn
Application number
CN201911243224.4A
Other languages
Chinese (zh)
Inventor
徐天赐
景璐
Current Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date
Filing date
Publication date
Application filed by Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN201911243224.4A
Publication of CN111105015A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Advance Control (AREA)

Abstract

The application discloses a general CNN inference accelerator, comprising: a preprocessing module for acquiring target data, convolution kernel data and a module timing; a convolution and activation module for performing convolution calculation and activation calculation on the target data; and a pooling module for performing pooling calculation or data structure conversion on the target data. The module timing comprises the order in which the target data enters the convolution and activation module and/or the pooling module. In this application, the pooling module can perform pooling calculation or data structure conversion on the target data and output target data of the required size, removing the restriction on the computation size of the target data; meanwhile, the order in which the target data passes through the convolution and activation module or the pooling module can be flexibly adjusted through the module timing, so that the target data can enter or bypass a given module multiple times, removing the restriction on the operator order. The application also discloses a control method of the general CNN inference accelerator and a readable storage medium with the same beneficial effects.

Description

General CNN inference accelerator, control method thereof and readable storage medium
Technical Field
The invention relates to the field of CNN inference acceleration, and in particular to a general CNN inference accelerator, a control method thereof and a readable storage medium.
Background
In recent years, with the growth of computing power and the development of CNN (Convolutional Neural Network) structures, the recognition accuracy of CNN networks has improved greatly; at the same time, CNNs keep getting deeper, their network structures more complex, and their computational load ever larger. Heterogeneous computing devices such as GPUs (Graphics Processing Units), FPGAs (Field Programmable Gate Arrays) and ASICs (Application Specific Integrated Circuits) are therefore needed to accelerate CNN inference computation.
An FPGA-based general CNN inference accelerator has two main implementation methods: multi-layer instance implementation and single-layer instance implementation. A multi-layer instance implementation maps the inference calculation of every hidden layer of the CNN model into hardware inside the FPGA; the data stream flows in at the first layer and out at the last layer, and the CNN inference calculation is completed in a single pass through the FPGA hardware. A single-layer instance implementation abstracts the inference calculation of one hidden layer of the CNN model into hardware inside the FPGA, and the data stream passes through that hardware cyclically and repeatedly to complete the CNN inference calculation.
A multi-layer instance implementation is limited by the number of CNN layers: the more layers the CNN has, the greater the hardware resource pressure of implementing the CNN inference model in the FPGA. A single-layer instance implementation is limited by the relatively fixed operator order of single-layer neural network calculation: when different CNN models use different operator orders, the FPGA hardware in the general CNN inference accelerator struggles to accommodate a flexible operator order, and the hardware implementation sometimes has to be changed again to meet the requirements of a CNN model.
Therefore, how to solve the above technical problems is a problem to be addressed by those skilled in the art.
Disclosure of Invention
In view of the above, the present invention provides a general CNN inference accelerator that supports target data of various sizes and a flexibly configurable module timing, a control method thereof, and a readable storage medium. The specific scheme is as follows:
A general CNN inference accelerator, comprising:
the preprocessing module is used for acquiring target data, convolution kernel data and a module timing;
the convolution and activation module is used for performing convolution calculation and activation calculation on the target data;
the pooling module is used for performing pooling calculation or data structure conversion on the target data;
the module timing comprises an order in which the target data enters the convolution and activation module and/or the pooling module.
Preferably, the pooling module comprises:
the general pooling module is used for performing pooling calculation with a preset size, or data structure conversion, on the target data;
the full-size pooling module is used for performing full-size pooling calculation on the target data;
the module timing specifically includes an order of the target data through the convolution and activation module and/or the general pooling module and/or the full-size pooling module.
Preferably, the pooling calculation is specifically: a maximum pooling calculation or an average pooling calculation.
Preferably, the general CNN inference accelerator is an inference accelerator implemented by a single-layer instance.
Preferably, the general CNN inference accelerator is an inference accelerator implemented by a multi-layer instance.
Preferably, the general CNN inference accelerator further comprises:
a data organization module for organizing a data stream of the target data;
and the storage access module is used for calculating a storage address corresponding to the target data.
Preferably, the general CNN inference accelerator is an ASIC- or FPGA-based inference accelerator.
Correspondingly, the invention also discloses a control method of the general CNN inference accelerator, applied to the general CNN inference accelerator described above and comprising the following steps:
acquiring target data, convolution kernel data and a module timing through the preprocessing module, the module timing comprising the order in which the target data enters the convolution and activation module and/or the pooling module;
and, according to the module timing, performing convolution calculation and activation calculation on the target data through the convolution and activation module and/or performing pooling calculation or data structure conversion on the target data through the pooling module.
Accordingly, the present invention also discloses a readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the control method of the general CNN inference accelerator described above.
The application discloses a general CNN inference accelerator, comprising: a preprocessing module for acquiring target data, convolution kernel data and a module timing; a convolution and activation module for performing convolution calculation and activation calculation on the target data; and a pooling module for performing pooling calculation or data structure conversion on the target data. The module timing comprises the order in which the target data enters the convolution and activation module and/or the pooling module. In this application, the pooling module can perform pooling calculation or data structure conversion on the target data and output target data of the required size, removing the restriction on the computation size of the target data; meanwhile, the order in which the target data passes through the convolution and activation module or the pooling module can be flexibly adjusted through the module timing, so that the target data can enter or bypass a given module multiple times, removing the restriction on the operator order.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only embodiments of the present invention; for those skilled in the art, other drawings can be obtained from the provided drawings without creative effort.
Fig. 1 is a structural distribution diagram of a general CNN inference accelerator according to an embodiment of the present invention;
Fig. 2 is a logic diagram of a general CNN inference accelerator according to an embodiment of the present invention;
Fig. 3 is a flowchart illustrating the steps of a control method of a general CNN inference accelerator according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
The multi-layer instance implementation of an FPGA-based general CNN inference accelerator is limited by the number of CNN layers: a CNN inference model with more layers puts greater hardware resource pressure on the FPGA. The single-layer instance implementation is limited by the relatively fixed operator order of single-layer neural network calculation: when different CNN models use different operator orders, the FPGA hardware in the general CNN inference accelerator struggles to accommodate a flexible operator order, and the hardware implementation sometimes has to be changed again to meet the requirements of a CNN model.
In this application, the pooling module can perform pooling calculation or data structure conversion on the target data, outputting target data of the required size and removing the restriction on the computation size of the target data; meanwhile, the order in which the target data passes through the convolution and activation module or the pooling module can be flexibly adjusted through the module timing, so that the target data can enter or bypass a given module multiple times, removing the restriction on the operator order.
The embodiment of the invention discloses a general CNN inference accelerator which, as shown in Fig. 1, comprises the following components:
the preprocessing module 1 is used for acquiring target data, convolution kernel data and a module timing;
the convolution and activation module 2 is used for performing convolution calculation and activation calculation on the target data;
the pooling module 3 is used for performing pooling calculation or data structure conversion on the target data;
the module timing comprises the order in which the target data enters the convolution and activation module 2 and/or the pooling module 3.
Wherein the pooling calculation includes, but is not limited to, a maximum pooling calculation or an average pooling calculation.
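As a software illustration only (the patent describes hardware modules; the function below is an assumption made for exposition, not the claimed implementation), maximum and average pooling over a 2-D feature map can be sketched as:

    import numpy as np

    def pool2d(x, size=2, stride=2, mode="max"):
        """Minimal max/average pooling over a 2-D feature map (illustrative only)."""
        h_out = (x.shape[0] - size) // stride + 1
        w_out = (x.shape[1] - size) // stride + 1
        out = np.empty((h_out, w_out))
        for i in range(h_out):
            for j in range(w_out):
                win = x[i * stride:i * stride + size, j * stride:j * stride + size]
                out[i, j] = win.max() if mode == "max" else win.mean()
        return out

For example, pool2d(np.arange(16.0).reshape(4, 4)) reduces a 4x4 map to the 2x2 map [[5, 7], [13, 15]].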
It is understood that the general CNN inference accelerator can be implemented either as a single-layer instance or as a multi-layer instance.
It is understood that, in this embodiment, there may be one or more convolution and activation modules 2 and pooling modules 3; whether the target data passes through a given convolution and activation module 2 or pooling module 3, and in what order, is determined by the module timing.
It can be understood that the module timing is determined by the external system design and then delivered to the general CNN inference accelerator. Because the general CNN inference accelerator contains the pooling module 3 internally, and the module timing determines the calculation order of the internal modules, this embodiment can be applied to target data of various sizes, types and positions without any change to the internal hardware of the general CNN inference accelerator, removing the restrictions on the size and calculation order of the target data.
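For exposition only, the module timing can be modeled in software as an ordered schedule supplied by the external system; the module names and the run_schedule helper below are assumptions of this sketch, not the patent's actual interface:

    # Hypothetical model of a module timing: an ordered list of module names that
    # the target data visits. A module absent from the list is bypassed, and a
    # module may appear more than once.
    MODULE_TIMING = ["conv_act", "general_pool", "conv_act", "full_size_pool"]

    def run_schedule(data, modules, timing):
        """Push the data through the named modules in the externally supplied order."""
        for name in timing:
            data = modules[name](data)
        return data

Because the schedule rather than the hardware fixes the order, supporting a different CNN model only requires a different timing, not a new hardware implementation.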
Further, the pooling module 3 may include:
the general pooling module 31 is used for performing pooling calculation with a preset size, or data structure conversion, on the target data;
a full-size pooling module 32 for performing full-size pooling calculation on the target data;
it will be appreciated that the module timing in this case specifically includes the order of the target data through the convolution and activation module 2 and/or the general pooling module 31 and/or the full-size pooling module 32.
The numbers of convolution and activation modules 2, general pooling modules 31 and full-size pooling modules 32 are not limited and are set according to actual requirements.
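As an illustrative sketch (assuming a channels-first (C, H, W) feature-map layout, which the patent does not specify), full-size pooling reduces each channel of the entire feature map to a single value:

    import numpy as np

    def full_size_pool(fmap, mode="average"):
        """Global pooling over a whole (C, H, W) feature map: one value per channel."""
        return fmap.max(axis=(1, 2)) if mode == "max" else fmap.mean(axis=(1, 2))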
Further, the general CNN inference accelerator typically further comprises:
a data organization module 4 for organizing the data stream of the target data;
and the storage access module 5 is used for calculating a storage address corresponding to the target data.
Specifically, when the general CNN inference accelerator performs inference calculation on target data according to the module timing, the target data, convolution kernel data and module timing are first obtained through the preprocessing module 1. The target data then enters the data organization module 4, and the module timing determines whether it enters the convolution and activation module 2 and/or the general pooling module 31. After all calculations of the convolution and activation module 2 and the general pooling module 31 are finished, the storage access module 5 receives the hidden-layer feature map data corresponding to the target data and calculates its storage address, so that the data organization module 4 writes the received hidden-layer feature map data into the on-chip memory according to that address. At this point, the target data from the storage access module 5 may enter the full-size pooling module 32 and, after pooling calculation, enter the data organization module 4, or it may bypass the full-size pooling module 32 and enter the data organization module 4 directly. The entire data calculation path of the target data follows the module timing: whether the target data passes through each convolution and activation module 2, each general pooling module 31 and the full-size pooling module 32, and in what order, can be configured by setting the module timing.
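Purely as an assumption for illustration (the patent fixes neither a memory layout nor an element width), the storage-address calculation performed by the storage access module 5 might look like a row-major mapping from feature-map coordinates to an on-chip address:

    def storage_address(base, c, y, x, height, width, elem_bytes=2):
        """Row-major address of element (c, y, x) of a hidden-layer feature map."""
        return base + ((c * height + y) * width + x) * elem_bytes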
Specifically, in the logic diagram of Fig. 2, "pooling" and "convolution" are short for pooling calculation and for convolution and activation calculation, respectively. When the general pooling module 31 is located after the convolution and activation module 2 (as with pooling 11 in Fig. 2), its pooling calculation can be merged into the hidden-layer inference calculation to which the preceding convolution belongs. When the general pooling module 31 is located before the convolution and activation module 2 (as with pooling 12 in Fig. 2), it can serve as an independent pooling hidden layer: after the previous layer's inference calculation is completed, the data organization module 4 performs data organization dedicated to the independent pooling hidden layer, the target data then enters the general pooling module 31 directly through an independent pooling shortcut for pooling calculation, and finally enters the subsequent links to complete the current layer's calculation. When no pooling calculation is required before or after the convolution and activation module 2, the target data passes through the general pooling module 31 without pooling, performing only data structure conversion, that is, a change of the data storage structure. In this way, the embodiment achieves flexible configuration of the calculation order of each module through the module timing.
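The behaviors of the general pooling link described above can be summarized in an illustrative sketch (the function and parameter names are assumptions): whether the pooling is merged after a convolution or acts as an independent pooling hidden layer, the link pools when the timing schedules pooling and otherwise only converts the data structure:

    def general_pool_link(data, pooling_scheduled, pool, convert):
        """General pooling module 31 as a sketch: perform the scheduled pooling
        calculation, or pass the data through with only a storage-structure
        conversion when no pooling is required."""
        return pool(data) if pooling_scheduled else convert(data)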
Similarly, the full-size pooling module 32, which is generally located at the end of the CNN model, is usually merged into the hidden layer to which the preceding convolution calculation belongs. The storage access module 5 receives the hidden-layer feature map data corresponding to the target data after that calculation and computes its storage address, so that the data organization module 4 writes the received data into the on-chip memory according to that address. The target data from the storage access module 5 may then enter the full-size pooling module 32 and, after pooling calculation, enter the data organization module 4, or it may bypass the full-size pooling module 32 and enter the data organization module 4 directly.
Furthermore, the general CNN inference accelerator is an ASIC- or FPGA-based inference accelerator, which greatly improves its universality: popular CNN models such as ResNet50, GoogLeNet, SqueezeNet and VGG can be supported flexibly, improving the overall performance of the general CNN inference accelerator.
The application discloses a general CNN inference accelerator, comprising: a preprocessing module for acquiring target data, convolution kernel data and a module timing; a convolution and activation module for performing convolution calculation and activation calculation on the target data; and a pooling module for performing pooling calculation or data structure conversion on the target data. The module timing comprises the order in which the target data enters the convolution and activation module and/or the pooling module. In this application, the pooling module can perform pooling calculation or data structure conversion on the target data and output target data of the required size, removing the restriction on the computation size of the target data; meanwhile, the order in which the target data passes through the convolution and activation module or the pooling module can be flexibly adjusted through the module timing, so that the target data can enter or bypass a given module multiple times, removing the restriction on the operator order.
Correspondingly, the present invention also discloses a control method of the general CNN inference accelerator, applied to the general CNN inference accelerator described above. As shown in Fig. 3, the method comprises:
s1: acquiring target data, convolution kernel data and a module time sequence through a preprocessing module; the module timing sequence comprises the sequence of the target data entering the convolution and activation module and/or the pooling module;
s2: and according to the module time sequence, performing convolution calculation and activation calculation on the target data through the convolution and activation module and/or performing pooling calculation or data structure conversion on the target data through the pooling module.
For the general CNN inference accelerator in this embodiment, reference may be made to the detailed description in the above embodiments, which is not repeated here.
The control method of the general CNN inference accelerator in this embodiment has the same beneficial effects as the general CNN inference accelerator in the above embodiments, which are not repeated here.
Correspondingly, the present invention also discloses a readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the control method of the general CNN inference accelerator, specifically including:
acquiring target data, convolution kernel data and a module timing through the preprocessing module, the module timing comprising the order in which the target data enters the convolution and activation module and/or the pooling module;
and, according to the module timing, performing convolution calculation and activation calculation on the target data through the convolution and activation module and/or performing pooling calculation or data structure conversion on the target data through the pooling module.
Specifically, for the control method of the general CNN inference accelerator in this embodiment, reference may be made to the detailed description in the above embodiments, which is not repeated here.
The readable storage medium in this embodiment has the same beneficial effects as the control method of the general CNN inference accelerator in the above embodiments, and is not described herein again.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element.
The general CNN inference accelerator, the control method thereof, and the readable storage medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method and its core idea. Meanwhile, for a person skilled in the art, there may be variations in the specific implementation and application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (9)

1. A generic CNN inference accelerator, comprising:
the preprocessing module is used for acquiring target data, convolution kernel data and a module timing;
the convolution and activation module is used for performing convolution calculation and activation calculation on the target data;
the pooling module is used for performing pooling calculation or data structure conversion on the target data;
the module timing comprises an order in which the target data enters the convolution and activation module and/or the pooling module.
2. The generic CNN inference accelerator of claim 1, wherein the pooling module comprises:
the general pooling module is used for performing pooling calculation with a preset size, or data structure conversion, on the target data;
the full-size pooling module is used for performing full-size pooling calculation on the target data;
the module timing specifically includes an order of the target data through the convolution and activation module and/or the general pooling module and/or the full-size pooling module.
3. The generic CNN inference accelerator of claim 1, wherein the pooling calculation is specifically: a maximum pooling calculation or an average pooling calculation.
4. The generic CNN inference accelerator of claim 3, wherein the generic CNN inference accelerator is a single-layer instance-implemented inference accelerator.
5. The generic CNN inference accelerator of claim 3, wherein the generic CNN inference accelerator is a multi-layer instance-implemented inference accelerator.
6. The generic CNN inference accelerator of any of claims 1-5, further comprising:
a data organization module for organizing a data stream of the target data;
and the storage access module is used for calculating a storage address corresponding to the target data.
7. The generic CNN inference accelerator of claim 6, wherein the generic CNN inference accelerator is an ASIC or FPGA based inference accelerator.
8. A control method of a generic CNN inference accelerator, applied to the generic CNN inference accelerator of any one of claims 1 to 7, comprising:
acquiring target data, convolution kernel data and a module timing through a preprocessing module; the module timing comprises the order in which the target data enters the convolution and activation module and/or the pooling module;
and according to the module timing, performing convolution calculation and activation calculation on the target data through the convolution and activation module and/or performing pooling calculation or data structure conversion on the target data through the pooling module.
9. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the method of controlling a generic CNN inference accelerator as claimed in claim 8.
CN201911243224.4A 2019-12-06 2019-12-06 General CNN inference accelerator, control method thereof and readable storage medium Withdrawn CN111105015A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911243224.4A CN111105015A (en) 2019-12-06 2019-12-06 General CNN inference accelerator, control method thereof and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911243224.4A CN111105015A (en) 2019-12-06 2019-12-06 General CNN inference accelerator, control method thereof and readable storage medium

Publications (1)

Publication Number Publication Date
CN111105015A (en) 2020-05-05

Family

ID=70421801

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911243224.4A Withdrawn CN111105015A (en) 2019-12-06 2019-12-06 General CNN reasoning accelerator, control method thereof and readable storage medium

Country Status (1)

Country Link
CN (1) CN111105015A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170046616A1 (en) * 2015-08-15 2017-02-16 Salesforce.Com, Inc. Three-dimensional (3d) convolution with 3d batch normalization
CN107704922A (en) * 2017-04-19 2018-02-16 北京深鉴科技有限公司 Artificial neural network processing unit
CN108205704A (en) * 2017-09-27 2018-06-26 深圳市商汤科技有限公司 A kind of neural network chip
CN109086867A (en) * 2018-07-02 2018-12-25 武汉魅瞳科技有限公司 A kind of convolutional neural networks acceleration system based on FPGA
CN110276444A (en) * 2019-06-04 2019-09-24 北京清微智能科技有限公司 Image processing method and device based on convolutional neural networks

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931911A (en) * 2020-07-30 2020-11-13 山东云海国创云计算装备产业创新中心有限公司 CNN accelerator configuration method, system and device
CN111931911B (en) * 2020-07-30 2022-07-08 山东云海国创云计算装备产业创新中心有限公司 CNN accelerator configuration method, system and device

Similar Documents

Publication Publication Date Title
CN109062611B (en) Neural network processing device and method for executing vector scaling instruction
CN111176727B (en) Computing device and computing method
CN112214726B (en) Operation accelerator
CN109543832B (en) Computing device and board card
EP3451157B1 (en) Device and method for performing forward operation of convolutional neural network
CN109522052B (en) Computing device and board card
EP3660706A1 (en) Convolutional operation device and method
CN107832844A (en) A kind of information processing method and Related product
US20230026006A1 (en) Convolution computation engine, artificial intelligence chip, and data processing method
CN112633490B (en) Data processing device, method and related product for executing neural network model
US20190354156A1 (en) Dynamic voltage frequency scaling device and method
CN109947573A (en) Intelligence suitable for electric system edge calculations accelerates chip
CN111210005A (en) Equipment operation method and device, storage medium and electronic equipment
CN112799599A (en) Data storage method, computing core, chip and electronic equipment
CN109711540B (en) Computing device and board card
CN111105015A (en) General CNN reasoning accelerator, control method thereof and readable storage medium
KR20220028899A (en) Accelerator, method for operating the same, and electronic device including the same
CN111047005A (en) Operation method, operation device, computer equipment and storage medium
US20220230069A1 (en) Neural network sparsification device and method, and related product
CN111143208B (en) Verification method for assisting FPGA to realize AI algorithm based on processor technology
CN114723024A (en) Linear programming-based neural network mapping method for storage and calculation integrated chip
CN111832714B (en) Operation method and device
CN111260070B (en) Operation method, device and related product
Zhang et al. Research of Heterogeneous Acceleration Optimization of Convolutional Neural Network Algorithm for Unmanned Vehicle Based on FPGA
CN113283593B (en) Convolution operation coprocessor and rapid convolution method based on processor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20200505)