CN107656899A - A kind of mask convolution method and system based on FPGA - Google Patents

A kind of mask convolution method and system based on FPGA

Publication number
CN107656899A
Authority
CN
China
Prior art keywords
convolution
data
register group
group
coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710888288.4A
Other languages
Chinese (zh)
Inventor
李东
敖晟
田劲东
田勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University
Priority to CN201710888288.4A
Publication of CN107656899A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/15 Correlation function computation including computation of convolution operations
    • G06F17/153 Multidimensional correlation or convolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses an FPGA-based mask convolution method and system. The method comprises: obtaining the data bit width of the image data and selecting a register group of corresponding depth based on that bit width; obtaining the image data and storing it in the register group, and obtaining the convolution coefficients and storing them in ROM; obtaining a selection parameter that associates the register group with the convolution coefficients; and reading the data stored in the register group together with the corresponding convolution coefficients, multiplying them, and adding the products with an adder group to realize the convolution operation. The system is used to perform the corresponding method. The invention selects a register group matched to the image data to store the data and a ROM to store the convolution coefficients, multiplies each register value by its corresponding coefficient, and sums the products with the adder group to realize the convolution operation, which can improve the convolution result and the processing efficiency.

Description

A kind of mask convolution method and system based on FPGA
Technical field
The present invention relates to the technical field of image processing, and more particularly to an FPGA-based mask convolution implementation method and system.
Background art
In digital image processing, processing an image in the spatial domain is an important approach. Many common spatial filtering operations, both linear and nonlinear, rely on image convolution. Because convolution requires a very large number of multiply-add operations, processing high-resolution images takes too long. Conventional implementations use a general-purpose CPU or DSP processor and carry out the mask convolution in a pipelined manner. Because of the speed limits of CPUs and DSPs, such conventional methods can no longer meet the requirements of high-speed, real-time designs.
Summary of the invention
To solve the above problems, the present invention provides an FPGA-based mask convolution method and system.
In one aspect, the technical solution adopted by the present invention is an FPGA-based mask convolution implementation method, comprising the steps of: obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width; obtaining the image data and storing it in the register group, and obtaining the convolution coefficients and storing them in ROM; obtaining a selection parameter for associating the register group with the convolution coefficients; and reading the data stored in the register group and the corresponding convolution coefficients and multiplying them, the products being added by an adder group to realize the convolution operation.
Preferably, a shift register group of corresponding depth is selected based on the data bit width; the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
Preferably, a data volume parameter and a window size parameter of the image data to be processed are obtained; registers of corresponding number and arrangement are selected from the register group based on the data volume parameter, and the image data is stored in them; the data stored in the registers of the corresponding arrangement is read based on the window size parameter and multiplied by the corresponding convolution coefficients, and the adder group adds the products to realize the convolution operation.
Preferably, the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
Preferably, the adder group obtains the multiplication results and adds them using a tree structure.
In another aspect, the technical solution adopted by the present invention is an FPGA-based mask convolution implementation system, comprising: a parameter input module for obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width; a data input module for obtaining the image data and storing it in the register group, and for obtaining the convolution coefficients and storing them in ROM; and a computing module for obtaining a selection parameter for associating the register group with the convolution coefficients, the computing module being further used to read the data stored in the register group and the corresponding convolution coefficients and multiply them, the products being added by an adder group to realize the convolution operation.
Preferably, the parameter input module is further used to select a shift register group of corresponding depth based on the data bit width; the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
Preferably, the system further comprises a window module for obtaining a data volume parameter and a window size parameter of the image data to be processed; the data input module selects registers of corresponding number and arrangement from the register group based on the data volume parameter and stores the image data in them; the computing module reads the data stored in the registers of the corresponding arrangement based on the window size parameter and multiplies it by the corresponding convolution coefficients, and the adder group adds the products to realize the convolution operation.
Preferably, the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
Preferably, the adder group obtains the multiplication results and adds them using a tree structure.
The beneficial effects of the present invention are as follows: a register group matched to the image data is selected to store the data, a ROM is selected to store the convolution coefficients, each register value is multiplied by its corresponding convolution coefficient, and the products are summed by the adder group to realize the convolution operation, which can improve the convolution result and the processing efficiency.
Brief description of the drawings
Fig. 1 is a schematic diagram of the basic FPGA structure according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the convolution basic module according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the multiplication according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the adder tree according to an embodiment of the present invention.
Detailed description
The present invention is described below with reference to embodiments.
Embodiment 1 of the invention is an FPGA-based mask convolution implementation method, comprising the steps of: obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width; obtaining the image data and storing it in the register group, and obtaining the convolution coefficients and storing them in ROM; obtaining a selection parameter for associating the register group with the convolution coefficients; and reading the data stored in the register group and the corresponding convolution coefficients and multiplying them, the products being added by an adder group to realize the convolution operation.
The method of this embodiment further comprises: selecting a shift register group of corresponding depth based on the data bit width, the shift register group being used to obtain the image data, and the register group obtaining the image data from the shift register group and storing it.
The method of this embodiment further comprises: obtaining a data volume parameter and a window size parameter of the image data to be processed; selecting registers of corresponding number and arrangement from the register group based on the data volume parameter and storing the image data in them; and reading the data stored in the registers of the corresponding arrangement based on the window size parameter and multiplying it by the corresponding convolution coefficients, the adder group adding the products to realize the convolution operation.
In the method of this embodiment, the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
In the method of this embodiment, the adder group obtains the multiplication results and adds them using a tree structure.
With advances in integrated circuit technology, FPGA performance has improved markedly, giving users more resources and higher processing speed. The scheme is implemented on an FPGA platform. First, the externally supplied parameters are obtained, such as the size of the image data used in the convolution, the data bit width, and the size of the convolution window (i.e., the mask, also called the template). A register group of corresponding depth is then selected according to the data bit width: because an FPGA platform may provide registers of many specifications, choosing registers that suit the data bit width raises the utilization of the whole system, and the selected registers are together designated the register group. The image data to be processed is obtained and stored in the register group, and the convolution coefficients (i.e., the multiplication coefficients) are obtained and stored in ROM. The convolution coefficients are kept in a file stored in the ROM, and a correspondence is set up between register addresses and addresses in that file, so that when the data held in a register is read, the value at the corresponding address in the file (i.e., the convolution coefficient) can be read at the same time. The data and the coefficient are multiplied, and the products of all registers (i.e., the register group) with their corresponding convolution coefficients are added together to realize the convolution operation.
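As an illustration only, the multiply-and-add step described in this paragraph can be modelled in software. The sketch below is in Python rather than a hardware description language, and the names `window_mac`, `window`, and `rom` are hypothetical. It assumes that register address i is paired with ROM address i, and it multiplies each value by the coefficient at the same address without flipping the mask.

```python
def window_mac(window, coeffs):
    """Multiply each register value by the coefficient at the same address
    and return the sum of the products."""
    if len(window) != len(coeffs):
        raise ValueError("register group and coefficient ROM must have equal length")
    # One product per register, as the multipliers would produce in parallel.
    products = [pixel * coeff for pixel, coeff in zip(window, coeffs)]
    # The adder group reduces the products to a single convolution result.
    return sum(products)


# Hypothetical 3x3 example, stored row by row.
window = [10, 20, 30,
          40, 50, 60,
          70, 80, 90]
rom = [0, -1, 0,
       -1, 4, -1,
       0, -1, 0]
result = window_mac(window, rom)
```

In hardware all nine products would be formed in the same clock; the list comprehension only models the arithmetic, not the timing.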
As a further improvement on Embodiment 1: because of factors such as limited computing capability, a convolution window is normally used to process the data. In this case a parameter describing the window size (the window size parameter) is obtained first. For an N*N window, for example, the convolution window consists of N*N registers whose depth matches the bit width of the stored image data. The data of these N*N registers is read, the multiplications described above are performed (one per register), and the products are added to realize the convolution operation. Limiting the window size makes it possible to control the processing load of the whole FPGA.
As a further improvement on Embodiment 1: shift registers are provided (or a certain number of them are selected) to serve as buffer units. In the example above, where the convolution window is N*N, N-1 shift registers are provided accordingly. The shift registers are connected end to end to form rows, and each row of registers in the convolution window corresponds one-to-one with a shift register for the purpose of data transfer: the shift registers receive the external data (i.e., the image data), and the convolution window obtains the image data from the shift registers.
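The buffering scheme above can be illustrated with a small software model. This is a sketch under stated assumptions, not the circuit of the embodiment: pixels are assumed to arrive one per step in row-major order, and the N-1 line buffers together with the window registers are modelled as a single shift chain of (N-1)*width + N entries. The function name `stream_windows` and its parameters are hypothetical.

```python
from collections import deque


def stream_windows(pixels, width, n):
    """Yield (row, col, window) for each full n x n window of a row-major
    pixel stream, where (row, col) is the newest pixel in the window."""
    # Shift chain long enough to span n-1 complete rows plus n pixels.
    depth = (n - 1) * width + n
    chain = deque(maxlen=depth)
    for index, pixel in enumerate(pixels):
        chain.append(pixel)  # shift in one pixel per step
        row, col = divmod(index, width)
        # A window is valid once the chain is full and it does not wrap
        # around the end of an image row.
        if len(chain) == depth and col >= n - 1:
            window = [[chain[r * width + c] for c in range(n)]
                      for r in range(n)]
            yield row, col, window


# Hypothetical 4x4 image holding the values 0..15, scanned with a 3x3 window.
windows = list(stream_windows(range(16), width=4, n=3))
```

For the 4x4 example the generator produces four windows, one for each position where a complete 3x3 neighbourhood is available.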
As a further improvement on Embodiment 1: the convolution window can be changed to any shape and size. For an M*N rectangular window, the convolution window is adjusted to consist of M*N registers and the number of shift registers is adjusted to N-1. For a circular window of radius R, the convolution window is adjusted to consist of (2R+1)*(2R+1) registers and the number of shift registers is adjusted to 2R+1; the circular window array of radius R is then marked out among the (2R+1)*(2R+1) window registers. Windows of other shapes and sizes can be obtained by cropping the corresponding rectangular window.
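One way to picture marking a circular window inside the (2R+1)*(2R+1) register array is the sketch below. The text does not state which registers count as inside the circle, so the rule used here (squared distance from the centre no greater than R squared) is an assumption, and the names `circular_window_mask` and `active_register_count` are hypothetical.

```python
def circular_window_mask(radius):
    """Return a (2R+1) x (2R+1) grid of booleans; True marks a register
    that belongs to the circular window (assumed rule: dx^2 + dy^2 <= R^2)."""
    size = 2 * radius + 1
    return [[(r - radius) ** 2 + (c - radius) ** 2 <= radius ** 2
             for c in range(size)]
            for r in range(size)]


def active_register_count(mask):
    """Count the marked registers, i.e. the number of coefficients needed."""
    return sum(cell for row in mask for cell in row)


mask = circular_window_mask(2)
count = active_register_count(mask)
```

Under this rule a radius of 2 marks 13 of the 25 registers, so only 13 coefficients would need to be addressed.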
Regarding Embodiment 1: Hex files are easy to modify in Quartus II, and signed decimal input can be selected, which is convenient for users, so a Hex file is well suited as the carrier of the convolution coefficients. After the coefficients are written and saved, the Hex file is imported into the FPGA's on-chip ROM, and the range of ROM addresses to be read is adjusted automatically according to the window size.
For an N × N convolution window, the ROM addressing range is 0 to $N^2 - 1$. The $N^2$ convolution coefficients are read out, and the ROM output is connected to an $N^2$-stage pipeline. Adjusting the values and the number of coefficients in this way is very convenient and allows arbitrary coefficients to be extracted and applied. For a convolution window of arbitrary shape and size, the ROM addressing range equals the number of window registers.
Regarding Embodiment 1: the ordinary addition procedure fetches the data one item at a time and accumulates it, which lengthens the addition time when there is a lot of data. On an FPGA, each adder in the adder group can instead be connected to two adjacent registers and receive their corresponding multiplication results at the same time; the outputs of two such adders are then added again, and so on. Several data items are thus processed in a single clock, which greatly reduces the number of clocks and adders required, improving computational efficiency while reducing resource usage.
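The pairwise addition described above can be sketched as a software reduction. This is an illustrative model only: each pass of the loop stands for one clock in which all adders of a stage work in parallel, and an unpaired value is assumed to be carried forward unchanged. The name `adder_tree_sum` is hypothetical.

```python
def adder_tree_sum(products):
    """Reduce the products pairwise, as an adder tree would.

    Returns (total, stages), where stages is the number of tree levels used.
    """
    level = list(products)
    if not level:
        raise ValueError("at least one product is required")
    stages = 0
    while len(level) > 1:
        # Add adjacent pairs; in hardware these additions happen in parallel.
        next_level = [level[i] + level[i + 1]
                      for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An odd value out is carried to the next stage unchanged.
            next_level.append(level[-1])
        level = next_level
        stages += 1
    return level[0], stages


total, stages = adder_tree_sum(range(1, 10))  # nine products of a 3x3 window
```

Nine products are reduced in four stages (9, 5, 3, 2, 1 values), compared with eight sequential additions when the values are accumulated one by one.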
Embodiment 2 of the invention is an FPGA-based mask convolution implementation system, comprising: a parameter input module for obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width, and further for obtaining the shape and size of the convolution window and automatically generating the corresponding convolution operation unit based on these parameters; a data input module for obtaining the image data and storing it in the register group, and for obtaining the convolution coefficients and importing them into ROM; and a computing module for obtaining a selection parameter for associating the register group with the convolution coefficients, the computing module being further used to read the data stored in the register group and the corresponding convolution coefficients and multiply them, the products being added by an adder group to realize the convolution operation.
In the system of this embodiment, the parameter input module is further used to select a shift register group of corresponding depth based on the data bit width; the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
The system of this embodiment further comprises a window module for obtaining a data volume parameter and a window size parameter of the image data to be processed; the data input module selects registers of corresponding number and arrangement from the register group based on the data volume parameter and stores the image data in them; the computing module reads the data stored in the registers of the corresponding arrangement based on the window size parameter and multiplies it by the corresponding convolution coefficients, and the adder group adds the products to realize the convolution operation.
In the system of this embodiment, the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
In the system of this embodiment, the adder group obtains the multiplication results and adds them using a tree structure.
Embodiment 3 of the invention describes the process of implementing convolution on the FPGA:
The basic FPGA structure shown in Fig. 1 comprises a central control unit, an input unit (i.e., the data interface), a line buffer unit (made up of shift registers), a convolution window unit (the register group), a convolution operation unit (which obtains the data of the register group and performs the multiplications), an adder tree unit (the adder group), a template parameter interface (i.e., a data interface or data input), a convolution coefficient interface (i.e., the ROM, which stores the multiplication coefficients), and an output unit. The input unit is connected to the line buffer unit, the line buffer unit to the convolution window unit, the convolution window unit to the convolution operation unit, the convolution operation unit to the adder tree unit, and the adder tree unit to the output unit. The template parameter interface is connected to both the line buffer unit and the convolution window unit (to define the numbers of shift registers and registers). The convolution coefficient interface is connected to the convolution operation unit to supply the multiplication coefficients.
Step 1: build the convolution basic module shown in Fig. 2 (comprising the external auxiliary units, the convolution window unit, the line buffer unit, and the convolution operation unit). In the figure, Line_buffer1 to Line_buffer(N-1) are the N-1 row register groups (made up of shift registers), Din and Dout are respectively the input and output of each row register group, and the window registers (denoted P) are the $N^2$ registers. The shift registers are connected end to end; the number of shift registers in each row register group equals the number of pixels in one row of image data, and the depth of each shift register equals the bit width of the received image data. N registers are connected outside each shift register row, and these N registers are also connected end to end.
The data of the $N^2$ registers in the convolution window unit is multiplied by the externally supplied weight coefficients (Weights, stored in ROM), the products are then added pairwise by the adder tree module, and the convolution result is finally obtained.
As can be seen from Fig. 2, apart from Din0, which is connected to the data input Pix_in, the output Dout(M) of the M-th shift register is connected both to the input Din(M+1) of the (M+1)-th shift register and to the window register $P_{MN}$; at the same time each window register $P_M$ forms a pipeline with the preceding register $P_{M-1}$. All of these structures are generated by instantiating modules repeatedly in a loop.
The external auxiliary units comprise the ROM and the template parameter interface (used to define parameters, such as "N*N Shift Registers" in the figure); in the figure, "Multipliers" is the convolution operation unit and "Add Tree" is the adder tree unit.
Step 2: as shown in the multiplication diagram of Fig. 3, the convolution coefficients are input and the convolution window (i.e., the convolution window unit) is multiplied by the corresponding convolution coefficients. The convolution coefficients are written into a Hex file externally, the Hex file is imported into the FPGA's on-chip ROM, and the range of ROM addresses to be read is adjusted automatically according to the window size. Taking a 3 × 3 window as an example, with a data bit width of 8 bits, the coefficients $Q_0$ to $Q_8$ take values in the range -128 to +127. The arrangement matrix of the mask coefficients $Q_0$ to $Q_8$ is: (matrix given as an image in the original; not reproduced in this text)
The order in which the coefficients are written to the Hex file is shown in Fig. 3. After the coefficients have been imported, address locations 0 to 8 of the ROM are addressed in turn and the output is connected to a 9-stage pipeline. For an N × N convolution window, the ROM addressing range is 0 to $N^2 - 1$; after the $N^2$ coefficients $Q_0$ to $Q_{N^2-1}$ have been read out, they are multiplied with the array of convolution window units.
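To make the 8-bit coefficient range and the ROM address range concrete, the following sketch encodes signed coefficients as 8-bit two's-complement words and lists the addresses read for an N × N window. It does not reproduce the Hex file format used by Quartus II; the function names and the example coefficients are hypothetical, and two's-complement storage is an assumption consistent with the stated range of -128 to +127.

```python
def to_twos_complement(value, bits=8):
    """Encode a signed integer as an unsigned word of the given width."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    return value & ((1 << bits) - 1)


def from_twos_complement(word, bits=8):
    """Decode an unsigned word of the given width back to a signed integer."""
    return word - (1 << bits) if word & (1 << (bits - 1)) else word


def build_rom(coeffs, bits=8):
    """Return the ROM contents, one encoded word per coefficient address."""
    return [to_twos_complement(q, bits) for q in coeffs]


def rom_address_range(n):
    """Addresses read for an n x n window: 0 to n*n - 1."""
    return range(n * n)


# Hypothetical 3x3 coefficients Q0..Q8, stored at ROM addresses 0..8.
rom_words = build_rom([-1, -1, -1, -1, 8, -1, -1, -1, -1])
decoded = [from_twos_complement(rom_words[a]) for a in rom_address_range(3)]
```

Decoding the words in address order returns the original signed coefficients, which is the property the multipliers rely on.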
Step 3: as shown in the adder tree diagram of Fig. 4, the multiplication results of the convolution operation ($D_0$ to $D_N$, where D denotes a product and N its index; in the figure, Reg is a register and the remaining symbol is an adder) are fed into the adder tree module, and the complete convolution result is output once accumulation has finished. Adjacent data items are added in pairs. If $N^2$ is even, the first stage needs $N^2/2$ adders and $N^2/2$ registers, and the second stage needs $N^2/4$ adders and $N^2/4$ registers. If $N^2$ is odd, the first stage needs $N^2/2$ adders and $N^2/2 + 1$ registers (with integer division), and the second stage needs $N^2/4$ adders and $N^2/4 + (N^2/2)\,\%\,2$ registers. By analogy, the number of values to be added at each stage is:
$$a_n = \lfloor a_{n-1}/2 \rfloor + (a_{n-1} \bmod 2), \qquad a_0 = N \times N,$$
where $a_n$ is the number of data items to be added in the n-th stage, $a_0$ is the initial value, and N*N is the convolution window size; each stage takes one clock.
The number of stages n needed to complete the adder tree satisfies $2^{n-1} < N \times N \le 2^n$, and the number of clocks required is also n. Hence, for N data items, the number of stages and clock cycles n needed to finish the accumulation satisfies $\log_2 N \le n < \log_2 N + 1$.
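The stage count can be checked numerically. The sketch below assumes that each stage halves the number of values and carries an unpaired value forward, i.e. a stage with a values leaves a // 2 + a % 2 values; this halving rule is an assumption made for illustration, and the function names are hypothetical.

```python
def stage_sizes(count):
    """Return the number of values present before each stage, ending at 1."""
    if count < 1:
        raise ValueError("count must be at least 1")
    sizes = [count]
    while sizes[-1] > 1:
        a = sizes[-1]
        # Pairs are added; an odd value out is carried to the next stage.
        sizes.append(a // 2 + a % 2)
    return sizes


def stages_needed(count):
    """Number of adder-tree stages (and clocks) to reduce count values to one."""
    return len(stage_sizes(count)) - 1


sizes_3x3 = stage_sizes(3 * 3)      # 3x3 window: nine products
stages_5x5 = stages_needed(5 * 5)   # 5x5 window: twenty-five products
```

For a 3 × 3 window the sizes run 9, 5, 3, 2, 1, giving n = 4, which agrees with $2^3 < 9 \le 2^4$; a 5 × 5 window gives n = 5, which agrees with $2^4 < 25 \le 2^5$.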
The convolution window unit is a flexibly adjustable hardware structure: windows of different shapes and sizes are generated automatically from the input parameters, so no resources are wasted.
The above is merely a preferred embodiment of the present invention, and the invention is not limited to the embodiments described; any implementation that achieves the technical effect of the invention by the same means falls within its scope of protection. Within that scope, the technical solution and/or its embodiments may be modified and varied in many ways.

Claims (10)

1. An FPGA-based mask convolution implementation method, characterized by comprising the steps of:
obtaining the data bit width of the image data, and selecting a register group of corresponding depth based on the data bit width;
obtaining the image data and storing it in the register group, and obtaining the convolution coefficients and storing them in ROM;
obtaining a selection parameter for associating the register group with the convolution coefficients;
reading the data stored in the register group and the corresponding convolution coefficients and multiplying them, and adding the products by an adder group to realize the convolution operation.
2. The FPGA-based mask convolution implementation method according to claim 1, characterized by further comprising:
selecting a shift register group of corresponding depth based on the data bit width, the shift register group being used to obtain the image data, and the register group obtaining the image data from the shift register group and storing it.
3. The FPGA-based mask convolution implementation method according to claim 2, characterized by further comprising:
obtaining a data volume parameter and a window size parameter of the image data to be processed;
selecting registers of corresponding number and arrangement from the register group based on the data volume parameter, and storing the image data in them;
reading the data stored in the registers of the corresponding arrangement based on the window size parameter and multiplying it by the corresponding convolution coefficients, the adder group adding the products to realize the convolution operation.
4. The FPGA-based mask convolution implementation method according to any one of claims 1 to 3, characterized in that the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
5. The FPGA-based mask convolution implementation method according to claim 4, characterized in that the adder group obtains the multiplication results and adds them using a tree structure.
6. An FPGA-based mask convolution implementation system, characterized by comprising:
a parameter input module for obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width;
a data input module for obtaining the image data and storing it in the register group, and for obtaining the convolution coefficients and storing them in ROM;
a computing module for obtaining a selection parameter for associating the register group with the convolution coefficients;
the computing module being further used to read the data stored in the register group and the corresponding convolution coefficients and multiply them, the products being added by an adder group to realize the convolution operation.
7. The FPGA-based mask convolution implementation system according to claim 6, characterized in that the parameter input module is further used to select a shift register group of corresponding depth based on the data bit width;
the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
8. The FPGA-based mask convolution implementation system according to claim 7, characterized by further comprising a window module for obtaining a data volume parameter and a window size parameter of the image data to be processed;
the data input module selects registers of corresponding number and arrangement from the register group based on the data volume parameter and stores the image data in them;
the computing module reads the data stored in the registers of the corresponding arrangement based on the window size parameter and multiplies it by the corresponding convolution coefficients, and the adder group adds the products to realize the convolution operation.
9. The FPGA-based mask convolution implementation system according to any one of claims 6 to 8, characterized in that the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
10. The FPGA-based mask convolution implementation system according to claim 9, characterized in that the adder group obtains the multiplication results and adds them using a tree structure.
CN201710888288.4A 2017-09-27 2017-09-27 A kind of mask convolution method and system based on FPGA Pending CN107656899A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710888288.4A CN107656899A (en) 2017-09-27 2017-09-27 A kind of mask convolution method and system based on FPGA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710888288.4A CN107656899A (en) 2017-09-27 2017-09-27 A kind of mask convolution method and system based on FPGA

Publications (1)

Publication Number Publication Date
CN107656899A true CN107656899A (en) 2018-02-02

Family

ID=61116930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710888288.4A Pending CN107656899A (en) 2017-09-27 2017-09-27 A kind of mask convolution method and system based on FPGA

Country Status (1)

Country Link
CN (1) CN107656899A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491929A (en) * 2018-03-20 2018-09-04 南开大学 A kind of structure of the configurable parallel fast convolution core based on FPGA
CN108681984A (en) * 2018-07-26 2018-10-19 珠海市微半导体有限公司 A kind of accelerating circuit of 3*3 convolution algorithms
CN109284475A (en) * 2018-09-20 2019-01-29 郑州云海信息技术有限公司 A kind of matrix convolution computing module and matrix convolution calculation method
CN109472734A (en) * 2018-10-18 2019-03-15 江苏第二师范学院(江苏省教育科学研究院) A kind of target detection network and its implementation based on FPGA
CN110543939A (en) * 2019-06-12 2019-12-06 电子科技大学 hardware acceleration implementation framework for convolutional neural network backward training based on FPGA
CN110647978A (en) * 2019-09-05 2020-01-03 北京三快在线科技有限公司 System and method for extracting convolution window in convolution neural network
CN111260536A (en) * 2018-12-03 2020-06-09 中国科学院沈阳自动化研究所 Digital image multi-scale convolution processor with variable parameters and implementation method thereof
CN111382861A (en) * 2018-12-31 2020-07-07 爱思开海力士有限公司 Processing system
CN111488983A (en) * 2020-03-24 2020-08-04 哈尔滨工业大学 Lightweight CNN model calculation accelerator based on FPGA
CN113328998A (en) * 2021-05-14 2021-08-31 维沃移动通信有限公司 Image data transmission method and electronic equipment
CN114022366A (en) * 2022-01-06 2022-02-08 深圳鲲云信息科技有限公司 Image size adjusting structure based on data stream architecture, image size adjusting method based on data stream architecture and image size adjusting equipment based on data stream architecture
CN114120082A (en) * 2021-11-23 2022-03-01 西南交通大学 Image acceleration convolution calculation method, system, equipment and readable storage medium
CN114862654A (en) * 2022-04-15 2022-08-05 山东浪潮科学研究院有限公司 Method and system for realizing real-time template convolution on FPGA

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1577438A (en) * 2003-07-07 2005-02-09 日本先锋公司 Panel display apparatus
CN102208005A (en) * 2011-05-30 2011-10-05 华中科技大学 2-dimensional (2-D) convolver
CN102681815A (en) * 2012-05-11 2012-09-19 深圳市清友能源技术有限公司 Signed multiply-accumulate algorithm method using adder tree structure
CN106228240A (en) * 2016-07-30 2016-12-14 复旦大学 Deep convolutional neural network implementation method based on FPGA
CN106779060A (en) * 2017-02-09 2017-05-31 武汉魅瞳科技有限公司 Computation method for deep convolutional neural networks suited to hardware implementation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
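Xu Chu et al.: "Facial feature point localization based on an optimized SIFT feature descriptor" (基于优化的SIFT特征描述子的人脸特征点定位), Journal of Nankai University (Natural Science Edition) *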

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491929A (en) * 2018-03-20 2018-09-04 南开大学 Structure of a configurable parallel fast convolution core based on FPGA
CN108681984A (en) * 2018-07-26 2018-10-19 珠海市微半导体有限公司 Acceleration circuit of 3*3 convolution algorithm
CN108681984B (en) * 2018-07-26 2023-08-15 珠海一微半导体股份有限公司 Acceleration circuit of 3*3 convolution algorithm
CN109284475A (en) * 2018-09-20 2019-01-29 郑州云海信息技术有限公司 Matrix convolution computing module and matrix convolution calculation method
CN109284475B (en) * 2018-09-20 2021-10-29 郑州云海信息技术有限公司 Matrix convolution calculating device and matrix convolution calculating method
CN109472734A (en) * 2018-10-18 2019-03-15 江苏第二师范学院(江苏省教育科学研究院) Target detection network based on FPGA and implementation method thereof
CN109472734B (en) * 2018-10-18 2022-12-27 江苏第二师范学院(江苏省教育科学研究院) Target detection network based on FPGA and implementation method thereof
CN111260536B (en) * 2018-12-03 2022-03-08 中国科学院沈阳自动化研究所 Digital image multi-scale convolution processor with variable parameters and implementation method thereof
CN111260536A (en) * 2018-12-03 2020-06-09 中国科学院沈阳自动化研究所 Digital image multi-scale convolution processor with variable parameters and implementation method thereof
CN111382861B (en) * 2018-12-31 2023-11-10 爱思开海力士有限公司 Processing system
CN111382861A (en) * 2018-12-31 2020-07-07 爱思开海力士有限公司 Processing system
CN110543939B (en) * 2019-06-12 2022-05-03 电子科技大学 Hardware acceleration realization device for convolutional neural network backward training based on FPGA
CN110543939A (en) * 2019-06-12 2019-12-06 电子科技大学 Hardware acceleration implementation framework for convolutional neural network backward training based on FPGA
CN110647978A (en) * 2019-09-05 2020-01-03 北京三快在线科技有限公司 System and method for extracting convolution window in convolutional neural network
CN111488983A (en) * 2020-03-24 2020-08-04 哈尔滨工业大学 Lightweight CNN model calculation accelerator based on FPGA
CN111488983B (en) * 2020-03-24 2023-04-28 哈尔滨工业大学 Lightweight CNN model calculation accelerator based on FPGA
CN113328998A (en) * 2021-05-14 2021-08-31 维沃移动通信有限公司 Image data transmission method and electronic equipment
CN114120082A (en) * 2021-11-23 2022-03-01 西南交通大学 Image acceleration convolution calculation method, system, equipment and readable storage medium
CN114022366A (en) * 2022-01-06 2022-02-08 深圳鲲云信息科技有限公司 Image size adjusting structure based on data stream architecture, image size adjusting method based on data stream architecture and image size adjusting equipment based on data stream architecture
CN114022366B (en) * 2022-01-06 2022-03-18 深圳鲲云信息科技有限公司 Image size adjusting device, adjusting method and equipment based on data stream architecture
CN114862654A (en) * 2022-04-15 2022-08-05 山东浪潮科学研究院有限公司 Method and system for realizing real-time template convolution on FPGA

Similar Documents

Publication Publication Date Title
CN107656899A (en) A kind of mask convolution method and system based on FPGA
CN106445471A (en) Processor and method for executing matrix multiplication on processor
CN107729989A (en) Device and method for performing artificial neural network forward operation
EP3553673A1 (en) Convolution operation chip and communication device
CN109117186A (en) Neural network processing device and method for executing a vector outer product instruction
CN109767000A (en) Neural network convolution method and device based on Winograd algorithm
CN108154229A (en) Image processing method for accelerating a convolutional neural network framework based on FPGA
CN109993273B (en) Convolution implementation method of convolution neural network and related product
WO2021232843A1 (en) Image data storage method, image data processing method and system, and related apparatus
CN108777612A (en) Optimization method and circuit for the core computing unit of a proof-of-work operation chip
CN107612523A (en) FIR filter implementation method based on a software look-up table method
CN109240644A (en) Local search method and circuit for an Ising chip
EP4102354B1 (en) Method, circuit, and soc for performing matrix multiplication operation
CN114996638A (en) Configurable fast Fourier transform circuit with sequential architecture
CN107766503A (en) Fast data query method and device based on Redis
CN110414672A (en) Convolution algorithm method, apparatus and system
CN110135563A (en) Convolutional neural network binarization method and computing circuit
CN110414663A (en) Convolution implementation method of neural network and related product
CN109146060A (en) Method and device for processing data based on convolutional neural networks
CN109902821A (en) Data processing method, device and related components
CN113222129A (en) Convolution operation processing unit and system based on multi-level cache cyclic utilization
CN111178513B (en) Convolution implementation method and device of neural network and terminal equipment
WO2023284130A1 (en) Chip and control method for convolution calculation, and electronic device
CN114722048B (en) Data processing method and device, electronic equipment and storage medium
CN106843819A (en) Method and device for object serialization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180202
