CN110414663B - Convolution implementation method of neural network and related product

Info

Publication number: CN110414663B
Application number: CN201810402644.1A
Authority: CN (China)
Other version: CN110414663A (published 2019-11-05)
Priority and filing date: 2018-04-28
Grant publication date: 2022-03-25
Legal status: Active (granted)
Inventors: 曹庆新, 黎立煌, 李炜
Original and current assignee: Shenzhen Intellifusion Technologies Co Ltd
Other language: Chinese (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means


Abstract

The invention provides a convolution implementation method of a neural network and a related product, where the method includes the following steps: acquiring input data and weight data; cutting the weight data into a plurality of data blocks with the kernel size [n][m], and fitting each data block into X convolution kernels with the kernel size [A][B]; and performing a convolution operation on each convolution kernel and the input data to obtain an intermediate result, and processing all the intermediate results to obtain the convolution result. The technical scheme provided by the application has the advantage of a high calculation speed.

Description

Convolution implementation method of neural network and related product
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a convolution implementation method of a neural network and a related product.
Background
With the increasing maturity of artificial intelligence technology, application scenarios and product requirements across industries are growing explosively, and artificial intelligence algorithms iterate very quickly. A hardware computing platform therefore needs to be flexible enough to meet changing algorithm requirements, and its development cycle needs to be as short as possible to withstand product competition. For the algorithm models of artificial intelligence computation, especially neural network models, the convolution operation is a basic operation. The kernel size (KERNEL SIZE) used in the convolution operation is not fixed across different neural network models, and any size may be applied; an existing hardware platform cannot support the operations and variations of all kernel sizes, which affects the convolution speed and, in turn, the user experience.
Disclosure of Invention
The embodiments of the application provide a convolution implementation method of a neural network and related products. By fitting different kernel sizes into a standard kernel size, convolution operations are always performed with the standard kernel size, which improves the convolution speed and the user experience.
In a first aspect, an embodiment of the present application provides a convolution implementation method for a neural network, where the method includes the following steps:
acquiring input data and weight data;
cutting the weight data into a plurality of data blocks with the kernel size [n][m], and fitting each data block into X convolution kernels with the kernel size [A][B]; performing a convolution operation on each convolution kernel and the input data to obtain an intermediate result, and processing all the intermediate results to obtain the convolution result;
wherein A is less than or equal to n, B is less than or equal to m, and A, B, m, n and X are integers which are greater than or equal to 1.
Optionally, fitting each data block into X convolution kernels with the kernel size [A][B] includes:
if a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], zero-filling the edges of the data block so that its kernel size becomes [n+b][m+c], and then cutting the data block with the kernel size [n+b][m+c] into X convolution kernels with the kernel size [A][B], where b and c are both integers greater than or equal to 0.
Optionally, fitting each data block into X convolution kernels with the kernel size [A][B] includes:
if a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], cutting the data block into E convolution kernels with the kernel size equal to [A][B] and F convolution kernels with the kernel size smaller than [A][B], and zero-filling the edges of the F smaller convolution kernels so that their kernel size becomes [A][B], where E + F = X, and E and F are both integers greater than or equal to zero.
Optionally, the kernel size [A][B] specifically includes: kernel size [2][2], kernel size [3][3], or kernel size [5][5].
In a second aspect, a neural network chip is provided;
the neural network chip is used for acquiring input data and weight data;
the neural network chip is used for cutting the weight data into a plurality of data blocks with the kernel size [n][m], and fitting each data block into X convolution kernels with the kernel size [A][B]; performing a convolution operation on each convolution kernel and the input data to obtain an intermediate result, and processing all the intermediate results to obtain the convolution result;
wherein A is less than or equal to n, B is less than or equal to m, and A, B, m, n and X are integers greater than or equal to 1.
Optionally, the neural network chip is further configured to, when a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], zero-fill the edges of the data block so that its kernel size becomes [n+b][m+c], and then cut the data block with the kernel size [n+b][m+c] into X convolution kernels with the kernel size [A][B], where b and c are both integers greater than or equal to 0.
Optionally, the neural network chip is further configured to, when a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], cut the data block into E convolution kernels with the kernel size equal to [A][B] and F convolution kernels with the kernel size smaller than [A][B], and zero-fill the edges of the F convolution kernels so that their kernel size becomes [A][B], where E + F = X, and E and F are both integers greater than or equal to zero.
Optionally, the kernel size [A][B] specifically includes: kernel size [2][2], kernel size [3][3], or kernel size [5][5].
In a third aspect, an electronic device is provided, which may include the neural network chip of the second aspect.
In a fourth aspect, a computer-readable storage medium is provided, storing a computer program for electronic data exchange, wherein the computer program causes a computer to perform the method as provided in the first aspect.
In a fifth aspect, there is provided a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform the method provided by the first aspect.
The embodiment of the application has the following beneficial effects:
It can be seen that, no matter what the sizes n and m of the kernel are, the technical scheme fits the kernel size [n][m] into X set kernel sizes [A][B], so that subsequent convolution operations always use the kernel size [A][B] as the basic unit. The hardware therefore only needs to support convolution operations matched to the kernel size [A][B], which improves the convolution speed and the user experience.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present application; those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1 is a schematic structural diagram of an electronic device.
Fig. 2 is a schematic flow chart of a convolution implementation method of a neural network.
Fig. 3a is a schematic diagram of one way of cutting a data block provided in the present application.
Fig. 3b is a schematic diagram of another way of cutting a data block provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "comprising" and "having," and any variations thereof, in the description and claims of this application and the drawings described herein are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The electronic device in the present application may take various forms. By way of example and not limitation, and for convenience of description, the electronic device is referred to in the following embodiments as User Equipment (UE), a terminal, or an electronic device. Of course, in practical applications, the user equipment is not limited to the above presentation forms and may also include an intelligent vehicle-mounted terminal, computer equipment, and the like.
The structure of the electronic device is shown in Fig. 1. Specifically, the electronic device may include: a processor 101, a memory 102, and a neural network chip 103, where the processor 101 is connected with the memory 102 and the neural network chip 103. In an optional embodiment, the neural network chip 103 may be integrated in the processor 101. The memory 102 may include: a flash memory disk, a read-only memory (ROM), a random access memory (RAM), and the like. The technical solution of the present invention is not limited by whether the neural network chip 103 is provided separately or integrated in the processor 101.
In one embodiment, the neural network chip 103 is used for obtaining input data and weight data.
The neural network chip 103 is further used for cutting the weight data into a plurality of data blocks with the kernel size [n][m], and fitting each data block into X convolution kernels with the kernel size [A][B]; and performing a convolution operation on each convolution kernel and the input data to obtain an intermediate result, and processing all the intermediate results to obtain the convolution result.
Wherein A is less than or equal to n, B is less than or equal to m, and A, B, m, n and X are integers which are greater than or equal to 1.
Optionally, the neural network chip 103 is specifically configured to, if a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], zero-fill the edges of the data block so that its kernel size becomes [n+b][m+c], and then cut the data block with the kernel size [n+b][m+c] into X convolution kernels with the kernel size [A][B], where b and c are both integers greater than or equal to 0.
Optionally, the neural network chip 103 is specifically configured to, if a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], cut the data block into E convolution kernels with the kernel size equal to [A][B] and F convolution kernels with the kernel size smaller than [A][B], and zero-fill the edges of the F convolution kernels so that their kernel size becomes [A][B], where E + F = X, and E and F are both integers greater than or equal to zero.
Optionally, the kernel size [A][B] specifically includes: kernel size [2][2], kernel size [3][3], or kernel size [5][5].
Referring to Fig. 2, a convolution implementation method of a neural network is provided. The method is implemented by an electronic device, whose specific structure may be the electronic device shown in Fig. 1. As shown in Fig. 2, the method includes the following steps:
Step S201: the neural network chip 103 obtains input data [CI][H][W] and weight data [CP][CO][n][m];
wherein CI is the depth value of the input data, H is the height value of the input data, W is the width value of the input data, CP is the depth value of the weight data, CO is the number value of the weight data (the number of convolution kernels), [n][m] is the convolution kernel size (KERNEL SIZE) of the weight data, CI = CP, and CI, H, W, CP, CO, n, and m are integers greater than or equal to 1.
Step S202: the neural network chip 103 cuts the weight data into a plurality of data blocks with the kernel size [n][m], and fits each data block into X convolution kernels with the kernel size [A][B];
Step S203: performing a convolution operation on each convolution kernel and the input data to obtain an intermediate result, and processing all the intermediate results to obtain the convolution result.
A is less than or equal to n, B is less than or equal to m, and A, B, m, n and X are integers which are greater than or equal to 1.
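To make the data layout of step S201 and the cutting of step S202 concrete, the following minimal Python sketch illustrates the tensors involved; the concrete shape values (CI = CP = 16, CO = 32, H = W = 24, n = m = 5) are hypothetical values chosen only for this example, not values from the embodiment:

```python
import numpy as np

# Hypothetical example shapes (illustration only, not from the embodiment).
CI, H, W = 16, 24, 24        # input data  [CI][H][W]
CP, CO, n, m = 16, 32, 5, 5  # weight data [CP][CO][n][m]
assert CI == CP              # required by the method: CI = CP

inputs  = np.random.randn(CI, H, W)
weights = np.random.randn(CP, CO, n, m)

# Step S202: cutting the weight data yields CP * CO data blocks,
# each with the kernel size [n][m].
blocks = weights.reshape(-1, n, m)
print(blocks.shape)          # (512, 5, 5)
```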
The technical scheme of the application has the advantage that, no matter what the sizes n and m of a data block's kernel are, the neural network chip 103 fits the kernel size [n][m] into X set kernel sizes [A][B], so that subsequent convolution operations always use the kernel size [A][B] as the basic unit. A set kernel size [A][B] can be matched well to the corresponding hardware computation, so one large convolution kernel is split into several convolution kernels of the set size without changing the total amount of computation. The hardware only needs to be adapted to convolution kernels of the set size, which improves the convolution speed and the user experience.
Optionally, the fitting of each data block into X convolution kernels with the kernel size [A][B] may specifically include:
if a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], zero-filling the edges of the data block so that its kernel size becomes [n+b][m+c], and then cutting the data block with the kernel size [n+b][m+c] into X convolution kernels with the kernel size [A][B], where b and c are both integers greater than or equal to 0.
To illustrate with a practical example, assume the kernel size [n][m] of a data block is [5][5] and the set kernel size [A][B] of the convolution kernel is [3][3]. A row and a column of zero elements are added to the data block, giving the kernel size [5+1][5+1], and the [6][6] data block is then cut into 4 convolution kernels with the kernel size [3][3], as shown schematically in Fig. 3a, where each dashed box represents one [3][3] kernel. In this way, a data block with the kernel size 5×5 is fitted into 4 convolution kernels with the kernel size 3×3. Similarly, a data block with any kernel size can be fitted into X convolution kernels with the kernel size [A][B], so that a hardware structure adapted to convolution kernels with the kernel size [A][B] is compatible with all kernel sizes, which improves the calculation speed and efficiency of the hardware structure.
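The whole scheme, zero-filling, cutting, per-sub-kernel convolution, and combining of the intermediate results, can be sketched end to end as follows. This is a minimal single-channel NumPy/SciPy illustration under stated assumptions, not the chip's implementation: convolution is taken as the usual valid cross-correlation, and the helper names split_kernel and conv_by_blocks are hypothetical.

```python
import numpy as np
from scipy.signal import correlate2d

def split_kernel(w, A, B):
    """Zero-fill the bottom/right edges of an [n][m] kernel until its size
    is a multiple of [A][B], then cut it into [A][B] sub-kernels, keeping
    each sub-kernel's (row, column) offset inside the padded kernel."""
    n, m = w.shape
    b, c = (-n) % A, (-m) % B                 # zero filling to [n+b][m+c]
    w_pad = np.pad(w, ((0, b), (0, c)))
    return [(u, v, w_pad[u:u + A, v:v + B])
            for u in range(0, n + b, A)
            for v in range(0, m + c, B)]

def conv_by_blocks(x, w, A, B):
    """Valid correlation of x with an [n][m] kernel, computed as the sum of
    shifted correlations with the fitted [A][B] sub-kernels (the
    'intermediate results' of the method)."""
    n, m = w.shape
    b, c = (-n) % A, (-m) % B
    x = np.pad(x, ((0, b), (0, c)))           # absorb the kernel padding
    H, W = x.shape
    oh, ow = H - (n + b) + 1, W - (m + c) + 1
    out = np.zeros((oh, ow))
    for u, v, blk in split_kernel(w, A, B):
        part = correlate2d(x, blk, mode='valid')  # one [A][B] convolution
        out += part[u:u + oh, v:v + ow]           # shift by block offset
    return out

# The combined result matches a direct 5x5 correlation.
x = np.random.randn(8, 8)
w = np.random.randn(5, 5)
assert np.allclose(conv_by_blocks(x, w, 3, 3),
                   correlate2d(x, w, mode='valid'))
```

The final assertion checks the key property: summing the offset-shifted [3][3] outputs reproduces the direct [5][5] result, which is exactly why hardware supporting only the set kernel size can serve all kernel sizes.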
Optionally, if a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], the data block is cut into E convolution kernels with the kernel size equal to [A][B] and F convolution kernels with the kernel size smaller than [A][B], and the edges of the F smaller convolution kernels are zero-filled so that their kernel size becomes [A][B], where E + F = X, and E and F are both integers greater than or equal to zero.
To illustrate with a practical example, assume the kernel size [n][m] of a data block is [5][5] and the set kernel size [A][B] of the convolution kernel is [3][3]. The [5][5] data block is cut into 4 kernels: one with the kernel size [3][3], one with [3][2], one with [2][3], and one with [2][2], as shown schematically in Fig. 3b, where each dashed box represents one cut kernel. The edges of the [3][2], [2][3], and [2][2] kernels are then zero-filled to the kernel size [3][3]. In this way, a data block with the kernel size 5×5 is likewise fitted into 4 convolution kernels with the kernel size 3×3, so that a hardware structure adapted to convolution kernels with the kernel size [A][B] is compatible with convolution kernels of all kernel sizes, which improves the calculation speed and efficiency of the hardware structure.
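This second scheme can be sketched in the same hypothetical Python setting. Cutting first and zero-filling only the F smaller edge kernels yields the same set of [A][B] sub-kernels at the same offsets, so the helper below (cut_then_pad is a hypothetical name) can replace split_kernel in the conv_by_blocks sketch above without any other change:

```python
import numpy as np

def cut_then_pad(w, A, B):
    """Cut an [n][m] kernel into E full [A][B] kernels plus F smaller
    edge kernels, then zero-fill only the F smaller ones to [A][B]."""
    n, m = w.shape
    blocks = []
    for u in range(0, n, A):
        for v in range(0, m, B):
            blk = w[u:u + A, v:v + B]   # may be smaller than [A][B] at edges
            pad = ((0, A - blk.shape[0]), (0, B - blk.shape[1]))
            blocks.append((u, v, np.pad(blk, pad)))
    return blocks

# For a 5x5 kernel with A = B = 3: E = 1 full kernel, F = 3 edge kernels
# (3x2, 2x3 and 2x2 before zero filling), X = E + F = 4 in total.
w = np.random.randn(5, 5)
print([blk.shape for _, _, blk in cut_then_pad(w, 3, 3)])
# -> [(3, 3), (3, 3), (3, 3), (3, 3)]
```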
The above kernel sizes are merely examples. In practical applications, the kernel size [n][m] may take other values, such as kernel size [5][7], kernel size [6][6], kernel size [9][9], and so on, and the two values A and B in the kernel size [A][B] may be different; the present application does not require A and B to be the same.
Embodiments of the present application also provide a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program enables a computer to execute part or all of the steps of any one of the convolution implementation methods of a neural network as described in the above method embodiments.
Embodiments of the present application also provide a computer program product, which includes a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform part or all of the steps of any one of the convolution implementation methods of a neural network as set forth in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative.
In addition, the processor and the chip in the embodiments of the present application may be integrated into one processing unit, may exist alone physically, or two or more of them may be integrated into one unit. Based on such understanding, the technical solution of the present application, in substance or in the part contributing to the prior art, may be embodied in whole or in part in the form of a software product stored in a memory, which includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash disk, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, and a magnetic or optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing associated hardware, and the program may be stored in a computer-readable memory, which may include: a flash memory disk, a read-only memory (ROM), a random access memory (RAM), a magnetic or optical disk, and the like.
The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the method and the core concept of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (6)

1. A convolution implementation method of a neural network is characterized by comprising the following steps:
acquiring input data and weight data;
cutting the weight data into a plurality of data blocks with the kernel size [n][m], and fitting each data block into X convolution kernels with the kernel size [A][B]; performing a convolution operation on each convolution kernel and the input data to obtain an intermediate result, and processing all the intermediate results to obtain the convolution result;
wherein A is less than or equal to n, B is less than or equal to m, and A, B, m, n and X are integers greater than or equal to 1;
fitting each data block into X convolution kernels with the kernel size [A][B] includes:
if a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], zero-filling the edges of the data block so that its kernel size becomes [n+b][m+c], and then cutting the data block with the kernel size [n+b][m+c] into X convolution kernels with the kernel size [A][B], where b and c are both integers greater than or equal to 0; or,
if a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], cutting the data block into E convolution kernels with the kernel size equal to [A][B] and F convolution kernels with the kernel size smaller than [A][B], and zero-filling the edges of the F smaller convolution kernels so that their kernel size becomes [A][B], where E + F = X, and E and F are both integers greater than or equal to zero.
2. The method of claim 1,
the kernel size [A][B] specifically includes: kernel size [2][2], kernel size [3][3], or kernel size [5][5].
3. A neural network chip, characterized in that,
the neural network chip is used for acquiring input data and weight data;
the neural network chip is used for cutting the weight data into a plurality of data blocks with the kernel size [n][m], and fitting each data block into X convolution kernels with the kernel size [A][B]; performing a convolution operation on each convolution kernel and the input data to obtain an intermediate result, and processing all the intermediate results to obtain the convolution result;
wherein A is less than or equal to n, B is less than or equal to m, and A, B, m, n and X are integers which are greater than or equal to 1;
the neural network chip is further used for, when a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], zero-filling the edges of the data block so that its kernel size becomes [n+b][m+c], and then cutting the data block with the kernel size [n+b][m+c] into X convolution kernels with the kernel size [A][B], where b and c are both integers greater than or equal to 0;
the neural network chip is further used for, when a data block cannot be cut into an integer number of convolution kernels with the kernel size [A][B], cutting the data block into E convolution kernels with the kernel size equal to [A][B] and F convolution kernels with the kernel size smaller than [A][B], and zero-filling the edges of the F convolution kernels so that their kernel size becomes [A][B], where E + F = X, and E and F are both integers greater than or equal to zero.
4. The neural network chip of claim 3,
the kernel size [A][B] specifically includes: kernel size [2][2], kernel size [3][3], or kernel size [5][5].
5. An electronic device, characterized in that it comprises the neural network chip according to claim 3 or 4.
6. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for electronic data exchange, wherein the computer program causes a computer to perform the method according to claim 1 or 2.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant