CN110390392B - Convolution parameter accelerating device based on FPGA and data reading and writing method - Google Patents

Convolution parameter accelerating device based on FPGA and data reading and writing method

Info

Publication number
CN110390392B
CN110390392B (granted from application CN201910708612.9A)
Authority
CN
China
Prior art keywords
convolution
group
read
parameter
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910708612.9A
Other languages
Chinese (zh)
Other versions
CN110390392A (en)
Inventor
马向华
马成森
边立剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Anlu Information Technology Co.,Ltd.
Original Assignee
Shanghai Anlogic Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Anlogic Information Technology Co ltd filed Critical Shanghai Anlogic Information Technology Co ltd
Priority to CN201910708612.9A priority Critical patent/CN110390392B/en
Publication of CN110390392A publication Critical patent/CN110390392A/en
Priority to PCT/CN2019/126433 priority patent/WO2021017378A1/en
Application granted granted Critical
Publication of CN110390392B publication Critical patent/CN110390392B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means


Abstract

The application discloses an FPGA-based convolution parameter acceleration device and a data read-write method. The method includes: judging whether a convolution parameter is the last of an input group of convolution parameters; if not, incrementing a write control counter and allocating an address in a first random access memory to each convolution parameter of the group; judging whether a convolution parameter is the last of an output group of convolution parameters; if not, the first random access memory outputs one convolution parameter of the group according to its address, and a first read control counter is incremented; and judging whether output of the group for a predetermined number of times is complete, and if so, clearing the first and second read control counters.

Description

Convolution parameter accelerating device based on FPGA and data reading and writing method
Technical Field
The invention relates to the technical field of integrated circuits, and in particular to an FPGA (Field-Programmable Gate Array)-based convolution parameter acceleration device and data read-write method.
Background
The Convolutional Neural Network (CNN) is a mature technique in the field of artificial intelligence; it has feature-learning capability and can perform shift-invariant classification of input information according to its hierarchical structure. With the advance of deep learning theory and the improvement of numerical computing equipment, CNNs have developed rapidly and are widely applied in computer vision, natural language processing, and related directions, mainly for target classification; they hold overwhelming advantages in applications such as image recognition and speech recognition.
At present, CNN neural networks are mainly implemented on computer platforms: a CNN architecture is deployed on a development-side computer, weights are trained on massive data, and suitable weight coefficients are finally generated. When the product side must consider portability and practicality, a CNN is generally not hosted on high-end servers or workstations, and embedded development becomes the first choice under the requirements of reduced cost and size.
Research on embedded implementations of CNN neural networks has advanced in recent years. Digital Signal Processors (DSPs) and ARM (Advanced RISC Machines) processors are ruled out by their long computation times, so designing CNN convolution accelerators on FPGAs has become a hot research direction; see, for example, "FPGA Parallel Structure Design of the Convolutional Neural Network (CNN) Algorithm" (Wang Wei et al., Microelectronics and Computer, April 2019) and "Design of a Convolutional Neural Network Accelerator Based on the ZYNQ Platform and Its Application" (Dungshai, Beijing University of Technology, May 2018). The latter describes only a theoretical process and gives neither an actual design model nor a performance analysis. The former proposes a concrete neural network convolution accelerator model whose performance, according to the paper's analysis, is greatly improved, but its on-chip data throughput is insufficient for product-level realization and the design is difficult to put into application: for the YOLOv2 image classification algorithm, about 17.4 G operations are required per frame, and under that design a processing speed of only 1.15 frames/s can be achieved even with seamless data transfer.
As shown in fig. 1, conventional CNN neural network accelerator technology is not yet mature; its main problems are high cost, low data throughput, and excessive computation latency, so it cannot meet the demands of real-time, low-cost applications.
Disclosure of Invention
The invention aims to provide an FPGA-based convolution parameter acceleration device and data read-write method, solving the prior-art technical problems of slow data processing and insufficient data throughput.
In order to solve the above problems, the present application discloses an FPGA-based convolution parameter data read-write method, including:
judging whether a convolution parameter is the last of an input group of convolution parameters; if not, incrementing a write control counter and allocating an address in a first random access memory to each convolution parameter of the group;
judging whether a convolution parameter is the last of an output group of convolution parameters; if not, the first random access memory outputs one convolution parameter of the group according to its address, and a first read control counter is incremented; and judging whether output of the group for a predetermined number of times is complete, and if so, clearing the first and second read control counters.
In a preferred embodiment, if the convolution parameter is the last of the input group, the write control counter is cleared.
In a preferred embodiment, if the last convolution parameter of the output group has been output but output of the group for the predetermined number of times is not yet complete, the first read control counter is cleared and the second read control counter is incremented by 1.
In a preferred embodiment, while a group of convolution parameters is written into the first random access memory, a second random access memory outputs another group of convolution parameters; or, while the first random access memory outputs a group of convolution parameters, another group is written into the second random access memory.
In a preferred embodiment, after input of the group of convolution parameters is complete, the method further includes: judging whether the current parameter is the last of another input group of convolution parameters; if not, incrementing the write control counter and allocating an address in the second random access memory to each convolution parameter of the other group.
In a preferred embodiment, after output of the group of convolution parameters is complete, the method further includes: judging whether the output is the last convolution parameter of another output group; if not, the second random access memory outputs one convolution parameter of the other group, and the first read control counter is incremented.
The application also discloses an FPGA-based convolution parameter acceleration device, including:
at least one random access memory configured to store convolution parameters;
a write address control unit configured to judge whether a convolution parameter is the last of an input group; if not, a write control counter is incremented and an address in the first random access memory is allocated to each convolution parameter of the group;
a read address control unit configured to judge whether an output convolution parameter is the last of its group; if not, the first random access memory outputs one convolution parameter of the group according to its address, and a first read control counter is incremented; the unit also judges whether output of the group for a predetermined number of times is complete, and if so, clears the first and second read control counters.
In a preferred embodiment, the device comprises a first random access memory and a second random access memory, wherein the second random access memory outputs another group of convolution parameters while a group of convolution parameters is written into the first random access memory; or, the first random access memory outputs a group of convolution parameters while another group is written into the second random access memory.
In a preferred embodiment, the device comprises a first random access memory and a second random access memory; the write address control unit is further configured to: judge whether the current convolution parameter is the last of another input group; if not, allocate an address in the second random access memory to each convolution parameter of the other group, and increment the write address control counter.
In a preferred embodiment, the device comprises a first random access memory and a second random access memory; the read address control unit is further configured to: judge whether the output is the last convolution parameter of another output group; if not, the second random access memory outputs one convolution parameter of the other group, and the first read control counter is incremented.
Compared with the prior art, the invention has the following beneficial effects:
The FPGA-based convolution parameter acceleration device realizes minimal convolution parameter management with minimal logic resources; its interface is simple and easy to use, it occupies few resources, is easy to port, and has short input and output paths. Because two random access memories are used inside the device, data can be read and written simultaneously and output continuously, keeping the device at its peak state for long periods, which greatly improves parallelism and achieves high data throughput.
Drawings
FIG. 1 illustrates a process diagram of a convolution technique in a CNN neural network model in the prior art;
FIG. 2 is a schematic diagram of an acceleration device in accordance with an embodiment of the present invention;
FIG. 3 shows a schematic diagram of an acceleration device according to another embodiment of the invention;
FIG. 4 is a process diagram illustrating the writing of data in one embodiment of the invention;
FIG. 5 shows a process diagram for data output in one embodiment of the invention.
Detailed Description
In the following description, numerous technical details are set forth in order to provide a better understanding of the present application. However, it will be understood by those skilled in the art that the technical solutions claimed in the present application can be implemented without these technical details and with various changes and modifications based on the following embodiments.
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Description of some concepts:
CNN: Convolutional Neural Network
Convolution parameters: convolution kernel parameters in a CNN
FPGA: Field-Programmable Gate Array
RAM: Random Access Memory
Referring to fig. 2, the present application further discloses an FPGA-based convolution parameter acceleration apparatus, where the acceleration apparatus 100 includes:
at least one Random Access Memory (RAM), shown in fig. 2 as including a first RAM 101 configured to store convolution parameters;
a write address control unit 201 configured to determine whether a convolution parameter is the last of an input group; if not, a write control counter (not shown in the figure) in the write address control unit 201 is incremented by 1, and an address in the first random access memory 101 is allocated to each convolution parameter of the group;
a read address control unit 202 configured to determine whether an output convolution parameter is the last of its group; if not, the first random access memory 101 outputs one convolution parameter of the group according to its address, and a first read control counter in the read address control unit 202 is incremented by 1; the unit also determines whether output of the group for a predetermined number of times is complete, and if so, a second read control counter (not shown in the figure) in the read address control unit 202 is cleared. In this embodiment, minimal logic resources form a minimal acceleration unit for managing convolution parameters; the interface is simple and easy to use, resource occupation is small, the design is easy to port, and the input and output paths are short.
In a preferred example, referring to fig. 3, the acceleration apparatus of the present application includes a first random access memory 101 and a second random access memory 102, where the second random access memory 102 outputs another set of convolution parameters while writing a set of convolution parameters into the first random access memory 101; or, while the first random access memory 101 outputs a set of convolution parameters, another set of convolution parameters is written into the second random access memory 102.
In a preferred embodiment, the device comprises a first random access memory 101 and a second random access memory 102; the write address control unit 201 is further configured to: judge whether the current data is the last of another input group of convolution parameters; if not, allocate an address in the second random access memory 102 to each convolution parameter of the other group, and increment the write address control counter by 1.
In a preferred embodiment, the device comprises a first random access memory 101 and a second random access memory 102; the read address control unit 202 is further configured to: judge whether the output is the last convolution parameter of another output group; if not, the second random access memory 102 outputs one convolution parameter of the other group, and the first read control counter in the read address control unit 202 is incremented.
Because the acceleration device uses two RAMs, data can be read and written simultaneously: one RAM writes data while the other reads, realizing parallel data processing, continuous output, and long-term operation at the peak state.
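The two-RAM ping-pong operation described above can be sketched with a small Python model. This is purely illustrative software (the function name, list-based RAMs, and sequential loop are assumptions of this sketch; in the hardware, the write into one RAM and the read from the other occur in the same clock cycles):

```python
# Illustrative ping-pong model: while one RAM is being written with group i,
# the other RAM outputs group i-1, so reads and writes overlap.
def ping_pong(groups):
    """Alternate two RAM models so that writing and reading proceed in parallel."""
    rams = [None, None]
    outputs = []
    for i, group in enumerate(groups):
        write_ram = i % 2            # RAM being filled this round
        read_ram = 1 - write_ram     # RAM drained in parallel
        if rams[read_ram] is not None:
            outputs.append(rams[read_ram])   # output the previously written group
        rams[write_ram] = list(group)        # write the current group
    outputs.append(rams[(len(groups) - 1) % 2])  # drain the final group
    return outputs
```

With three groups, `ping_pong([[1, 2], [3, 4], [5, 6]])` returns the groups in order while every group after the first is written during another group's readout.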
In another embodiment of the present application, a data read-write method based on an FPGA is further disclosed, including:
referring to fig. 4, the FPGA-based data writing method includes:
In step 11, it is judged whether data is input; if not, the process proceeds to step 15, and the write control counter of the write control unit is cleared;
if data is input, the process proceeds to step 12 to determine whether the data is the last of an input group of convolution parameters; if not, the process proceeds to step 13, where the write control counter is incremented by 1, and then to step 14, where an address in the first random access memory is assigned to each convolution parameter of the input group;
if it is the last of the group, the process proceeds to step 15, and the write control counter of the write control unit is cleared.
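The write flow of steps 11-15 can be modeled in a few lines of Python. This is an illustrative software sketch only (the dict-based RAM and the function name are assumptions; the patent describes a hardware counter):

```python
# Illustrative model of the write-address control flow (steps 11-15):
# each parameter is stored at the address given by the write control counter,
# which increments until the last parameter of the group, when it is cleared.
def write_group(params, ram):
    """Write one group of convolution parameters into a RAM model."""
    write_counter = 0
    for i, p in enumerate(params):
        ram[write_counter] = p          # step 14: address = counter value
        if i == len(params) - 1:        # step 12: last parameter of the group?
            write_counter = 0           # step 15: clear the write counter
        else:
            write_counter += 1          # step 13: increment the counter
    return ram
```

For a 3x3 convolution kernel, `write_group(list(range(9)), {})` fills addresses 0-8 and leaves the counter cleared, ready for the next group.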
Referring to fig. 5, the FPGA-based data reading method includes:
First, in step 21, it is determined whether data is output; if not, the process proceeds to step 26, where the first read control counter and the second read control counter are cleared.
If there is data output, the process proceeds to step 22 to determine whether it is the last convolution parameter of the output group; if so, the process proceeds to step 27, where the first read control counter is cleared and the second read control counter is incremented by 1.
If not, the process proceeds to step 23, where the first random access memory 101 outputs one of the convolution parameters according to its address, and then to step 24, where the first read control counter is incremented by 1; the process then returns to step 21.
In a preferred embodiment, if the last convolution parameter of the group has been output but output of the group for the predetermined number of times is not yet complete, then, corresponding to step 27, the first read control counter is cleared and the second read control counter is incremented by 1, indicating that one pass of the group output is complete.
The process then proceeds to step 25 to determine whether output of the group for the predetermined number of times is complete; if not, the process returns to step 21 to again check for data output.
If output of the group for the predetermined number of times is complete, the process proceeds to step 26, and the first read control counter and the second read control counter are cleared.
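The read flow of steps 21-27, with its two counters, can likewise be modeled in Python. Again this is an illustrative sketch (counter and function names are assumptions of this model, not the patent's):

```python
# Illustrative model of the read-address control flow (steps 21-27):
# the first read control counter addresses parameters within the group;
# the second counts completed passes over the group.
def read_group(ram, group_size, repeat_count):
    """Output one group of convolution parameters repeat_count times."""
    first_read = 0    # address within the group
    second_read = 0   # number of completed group outputs
    out = []
    while second_read < repeat_count:         # step 25: predetermined count done?
        out.append(ram[first_read])           # step 23: output by address
        if first_read == group_size - 1:      # step 22: last of the group?
            first_read = 0                    # step 27: clear first counter
            second_read += 1                  #          and count one pass
        else:
            first_read += 1                   # step 24: increment first counter
    first_read = second_read = 0              # step 26: clear both counters
    return out
```

For example, `read_group({0: 'a', 1: 'b'}, 2, 3)` replays the two-parameter group three times, matching the repeated readout of one kernel group against successive input windows.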
In a preferred embodiment, while a group of convolution parameters is written into the first random access memory 101, the second random access memory 102 outputs another group of convolution parameters; or, while the first random access memory 101 outputs a group of convolution parameters, another group is written into the second random access memory 102.
In a preferred embodiment, after input of a group of convolution parameters is complete, the first random access memory 101 holds that group, and the method further includes: judging whether the data is the last of another input group of convolution parameters; if not, the write control counter is incremented, and an address in the second random access memory is allocated to each convolution parameter of the other group, so that the other group is written into the second random access memory while the first random access memory outputs data.
In a preferred embodiment, after output of the group of convolution parameters in the first random access memory 101 is complete, the method further includes: judging whether the output is the last convolution parameter of another output group; if not, the second random access memory outputs one convolution parameter of the other group, and the first read control counter is incremented, so that the second random access memory outputs data while the first random access memory can simultaneously write data.
Because the acceleration device uses two RAMs, data can be read and written simultaneously: one RAM writes data while the other reads, realizing parallel data processing, continuous output, and long-term operation at the peak state.
The foregoing method embodiment corresponds to the present device embodiment; technical details of the method embodiment may be applied to this embodiment, and technical details of this embodiment may likewise be applied to the method embodiment.
It should be noted that, as will be understood by those skilled in the art, the implementation functions of the modules shown in the above embodiments of the acceleration apparatus can be understood by referring to the description of the data reading and writing method. The functions of the respective modules shown in the embodiment of the acceleration apparatus may be realized by a program (executable instructions) running on a processor, and may also be realized by a specific logic circuit. The acceleration device of the embodiment of the present application, if implemented in the form of a software functional module and sold or used as a standalone product, may also be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof contributing to the prior art may be embodied in the form of a software product stored in a storage medium, and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
Accordingly, another embodiment of the present application is implemented by a configuration file in an FPGA-readable storage medium. FPGA-readable storage media, including persistent and non-persistent, removable and non-removable media, can implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for FPGA profiles include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, FPGA-readable storage media does not include transitory computer-readable media (transient media), such as modulated data signals and carrier waves.
It is noted that, in the present patent application, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the use of the verb "comprise a" to define an element does not exclude the presence of another, same element in a process, method, article, or apparatus that comprises the element. In the present patent application, if it is mentioned that a certain action is executed according to a certain element, it means that the action is executed according to at least the element, and two cases are included: performing the action based only on the element, and performing the action based on the element and other elements. The expression of a plurality of, a plurality of and the like includes 2, 2 and more than 2, more than 2 and more than 2.
All documents mentioned in this specification are to be considered as being incorporated in their entirety into the disclosure of the present application so as to be subject to modification as necessary. It should be understood that the above description is only a preferred embodiment of the present disclosure, and is not intended to limit the scope of the present disclosure. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of one or more embodiments of the present disclosure should be included in the scope of protection of one or more embodiments of the present disclosure.
In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

Claims (6)

1. An FPGA-based convolution parameter data read-write method, comprising:
judging whether data is input; if not, clearing a write control counter; if so, judging whether the convolution parameter is the last of an input group of convolution parameters; if not, incrementing the write control counter and allocating an address in a first random access memory to each convolution parameter of the group; and if it is the last, clearing the write control counter;
judging whether data is output; if so, judging whether it is the last convolution parameter of an output group; if not, the first random access memory outputs one convolution parameter of the group according to its address, and a first read control counter is incremented; if it is the last of the output group, judging whether output of the group for a predetermined number of times is complete; if not, clearing the first read control counter and incrementing a second read control counter by 1; and if complete, clearing the first and second read control counters;
further comprising: while a group of convolution parameters is written into the first random access memory, a second random access memory outputs another group of convolution parameters; or, while the first random access memory outputs a group of convolution parameters, another group is written into the second random access memory.
2. The method of claim 1, wherein the write control counter is cleared if the convolution parameter is the last of the input group.
3. The method of claim 1, wherein after output of the group of convolution parameters is complete, the method further comprises: judging whether the output is the last convolution parameter of another output group; if not, the second random access memory outputs one convolution parameter of the other group, and the first read control counter is incremented.
4. A convolution parameter accelerating device based on FPGA is characterized by comprising:
at least one random access memory configured to store convolution parameters;
a write address control unit configured to: judging whether data is input or not, and if not, resetting the write controller; if yes, judging whether the convolution parameter is the last convolution parameter of the input group, if not, automatically increasing a write control counter, allocating an address to each convolution parameter of the group in a first random read-write memory, and if the convolution parameter is the last convolution parameter, resetting a write controller;
a read address control unit configured to: judging whether data is output or not, if so, judging whether the data is the last convolution parameter of a group of output parameters, if not, outputting one convolution parameter of the group of convolution parameters by the first random read-write memory according to an address, and automatically increasing a first read control counter; if the convolution parameter is the last of the output group of convolution parameters, judging whether the output of the group of convolution parameters for a preset number of times is finished, if not, resetting the first read control counter and self-increasing the second read control counter by 1, and if so, resetting the first read control counter and the second read control counter;
the device comprises a first random read-write memory and a second random read-write memory, wherein the second random read-write memory outputs another group of convolution parameters while writing a group of convolution parameters into the first random read-write memory; or, the first random read-write memory outputs a group of convolution parameters and writes another group of convolution parameters into the second random read-write memory at the same time.
5. The apparatus of claim 4, comprising first and second random access memories; the write address control unit is further configured to: judge whether the current convolution parameter is the last convolution parameter of another input group, and if not, allocate an address in the second random access memory to each convolution parameter of the other group and auto-increment the write address control counter.
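The write-address control of claims 4 and 5 reduces to a counter that hands out sequential RAM addresses within a group and resets at the group boundary (or on loss of input). A hedged sketch, with assumed names (`WriteAddressControl`, `on_input`) chosen for illustration:

```python
class WriteAddressControl:
    """Allocates a sequential address for each convolution parameter
    of a group; resets when input stops or the group completes."""

    def __init__(self, group_size):
        self.group_size = group_size  # e.g. 9 for a 3x3 kernel (assumed)
        self.counter = 0              # the write control counter

    def on_input(self, data_valid):
        """Return the RAM address for the incoming parameter,
        or None when there is no input (controller resets)."""
        if not data_valid:
            self.counter = 0          # no data input: reset write controller
            return None
        addr = self.counter           # address allocated for this parameter
        if self.counter == self.group_size - 1:
            self.counter = 0          # last parameter of the group: reset
        else:
            self.counter += 1         # otherwise auto-increment
        return addr
```

For the two-RAM device of claim 5, the same counter sequence would drive whichever RAM is currently on the write side of the ping-pong.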
6. The apparatus of claim 4, comprising first and second random access memories; the read address control unit is further configured to: judge whether the current output is the last convolution parameter of another output group, and if not, the second random access memory outputs one convolution parameter of the other output group and the first read control counter is auto-incremented.
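The read-address control described in the claims uses two nested counters: the first walks the addresses within a group, the second counts how many times the group has been replayed (the "preset number of times", useful when one kernel is reused across many input positions). A behavioral sketch under assumed names (`ReadAddressControl`, `repeat`):

```python
class ReadAddressControl:
    """First counter = address within the group; second counter =
    number of completed replays of the group. Both reset after the
    group has been output the preset number of times."""

    def __init__(self, group_size, repeat):
        self.group_size = group_size
        self.repeat = repeat   # preset number of output repetitions
        self.first = 0         # first read control counter (address)
        self.second = 0        # second read control counter (replay count)

    def on_output(self):
        """Return (address, group_done): the next read address, and
        whether all repetitions of the group are now complete."""
        addr = self.first
        if self.first < self.group_size - 1:
            self.first += 1            # not last of group: auto-increment
            return addr, False
        # last convolution parameter of the group
        if self.second < self.repeat - 1:
            self.first = 0             # replay not exhausted: reset first,
            self.second += 1           # auto-increment second by 1
            return addr, False
        self.first = 0                 # output done the preset number of
        self.second = 0                # times: reset both counters
        return addr, True
```

`group_done` is the point at which the ping-pong roles of the two RAMs would swap, so the next group (already written to the other RAM) can be read without a stall.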
CN201910708612.9A 2019-08-01 2019-08-01 Convolution parameter accelerating device based on FPGA and data reading and writing method Active CN110390392B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910708612.9A CN110390392B (en) 2019-08-01 2019-08-01 Convolution parameter accelerating device based on FPGA and data reading and writing method
PCT/CN2019/126433 WO2021017378A1 (en) 2019-08-01 2019-12-18 Fpga-based convolution parameter acceleration device and data read-write method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910708612.9A CN110390392B (en) 2019-08-01 2019-08-01 Convolution parameter accelerating device based on FPGA and data reading and writing method

Publications (2)

Publication Number Publication Date
CN110390392A CN110390392A (en) 2019-10-29
CN110390392B true CN110390392B (en) 2021-02-19

Family

ID=68288406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910708612.9A Active CN110390392B (en) 2019-08-01 2019-08-01 Convolution parameter accelerating device based on FPGA and data reading and writing method

Country Status (2)

Country Link
CN (1) CN110390392B (en)
WO (1) WO2021017378A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110390392B (en) * 2019-08-01 2021-02-19 上海安路信息科技有限公司 Convolution parameter accelerating device based on FPGA and data reading and writing method

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5764374A (en) * 1996-02-05 1998-06-09 Hewlett-Packard Company System and method for lossless image compression having improved sequential determination of golomb parameter
EP1089475A1 (en) * 1999-09-28 2001-04-04 TELEFONAKTIEBOLAGET L M ERICSSON (publ) Converter and method for converting an input packet stream containing data with plural transmission rates into an output data symbol stream
CN100466601C (en) * 2005-04-28 2009-03-04 华为技术有限公司 Data read/write device and method
CN101257313B (en) * 2007-04-10 2010-05-26 深圳市同洲电子股份有限公司 Deconvolution interweave machine and method realized based on FPGA
CN104461934B (en) * 2014-11-07 2017-06-30 北京海尔集成电路设计有限公司 A kind of time solution convolutional interleave device and method of suitable DDR memory
CN106940815B (en) * 2017-02-13 2020-07-28 西安交通大学 Programmable convolutional neural network coprocessor IP core
US11775313B2 (en) * 2017-05-26 2023-10-03 Purdue Research Foundation Hardware accelerator for convolutional neural networks and method of operation thereof
CN108169727B (en) * 2018-01-03 2019-12-27 电子科技大学 Moving target radar scattering cross section measuring method based on FPGA
CN108154229B (en) * 2018-01-10 2022-04-08 西安电子科技大学 Image processing method based on FPGA (field programmable Gate array) accelerated convolutional neural network framework
CN109086867B (en) * 2018-07-02 2021-06-08 武汉魅瞳科技有限公司 Convolutional neural network acceleration system based on FPGA
CN109032781A (en) * 2018-07-13 2018-12-18 重庆邮电大学 A kind of FPGA parallel system of convolutional neural networks algorithm
CN109214281A (en) * 2018-07-30 2019-01-15 苏州神指微电子有限公司 A kind of CNN hardware accelerator for AI chip recognition of face
CN109359729B (en) * 2018-09-13 2022-02-22 深思考人工智能机器人科技(北京)有限公司 System and method for realizing data caching on FPGA
CN109711533B (en) * 2018-12-20 2023-04-28 西安电子科技大学 Convolutional neural network acceleration system based on FPGA
CN109409509A (en) * 2018-12-24 2019-03-01 济南浪潮高新科技投资发展有限公司 A kind of data structure and accelerated method for the convolutional neural networks accelerator based on FPGA
CN109784489B (en) * 2019-01-16 2021-07-30 北京大学软件与微电子学院 Convolutional neural network IP core based on FPGA
CN110390392B (en) * 2019-08-01 2021-02-19 上海安路信息科技有限公司 Convolution parameter accelerating device based on FPGA and data reading and writing method

Also Published As

Publication number Publication date
CN110390392A (en) 2019-10-29
WO2021017378A1 (en) 2021-02-04

Similar Documents

Publication Publication Date Title
US11836610B2 (en) Concurrent training of functional subnetworks of a neural network
US10417555B2 (en) Data-optimized neural network traversal
CN111325664B (en) Style migration method and device, storage medium and electronic equipment
CN104615594B (en) A kind of data-updating method and device
CN112329680A (en) Semi-supervised remote sensing image target detection and segmentation method based on class activation graph
US11928580B2 (en) Interleaving memory requests to accelerate memory accesses
US20210326702A1 (en) Processing device for executing convolutional neural network computation and operation method thereof
DE102021126634A1 (en) Storage device, storage system and method of operation
CN112734106A (en) Method and device for predicting energy load
CN110390392B (en) Convolution parameter accelerating device based on FPGA and data reading and writing method
CN111009034B (en) Three-dimensional model monomer method, system, storage medium and equipment
CN110019784A (en) A kind of file classification method and device
CN109597982A (en) Summary texts recognition methods and device
TWI751931B (en) Processing device and processing method for executing convolution neural network computation
US11436486B2 (en) Neural network internal data fast access memory buffer
CN113641872B (en) Hashing method, hashing device, hashing equipment and hashing medium
CN114758191A (en) Image identification method and device, electronic equipment and storage medium
CN113052292B (en) Convolutional neural network technique method, device and computer readable storage medium
Wu et al. Hetero layer fusion based architecture design and implementation of deep learning accelerator
CN112308762A (en) Data processing method and device
CN112905239B (en) Point cloud preprocessing acceleration method based on FPGA, accelerator and electronic equipment
Ali et al. A New Merging Numerous Small Files Approach for Hadoop Distributed File System
CN110858121B (en) Background operation scheduling method and device
CN118012631B (en) Operator execution method, processing device, storage medium and program product
US20230010180A1 (en) Parafinitary neural learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 200434 Room 202, building 5, No. 500, Memorial Road, Hongkou District, Shanghai

Patentee after: Shanghai Anlu Information Technology Co.,Ltd.

Address before: Floor 4, no.391-393, dongdaming Road, Hongkou District, Shanghai 200080 (centralized registration place)

Patentee before: SHANGHAI ANLOGIC INFORMATION TECHNOLOGY Co.,Ltd.