CN107656899A - A kind of mask convolution method and system based on FPGA - Google Patents
- Publication number
- CN107656899A (application CN201710888288.4A)
- Authority
- CN
- China
- Prior art keywords
- convolution
- data
- register group
- group
- coefficient
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
- G06F17/153—Multidimensional correlation or convolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Computational Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Complex Calculations (AREA)
Abstract
The invention discloses an FPGA-based mask convolution method and system. The method comprises: obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width; obtaining the image data and storing it in the register group, and obtaining the convolution coefficients and storing them in ROM; obtaining a selection parameter for associating the register group with the convolution coefficients; and extracting the data stored in the register group and the corresponding convolution coefficients, multiplying them, and adding the multiplication results with an adder group to realize the convolution operation. The system is used to perform the corresponding method. The invention selects a register group matched to the image data for data storage, uses ROM to store the convolution coefficients, multiplies according to the correspondence between registers and convolution coefficients, and adds the products with an adder group to realize the convolution operation, which can improve the convolution result and the processing efficiency.
Description
Technical field
The present invention relates to the technical field of image processing, and in particular to an FPGA-based mask convolution implementation method and system.
Background
In digital image processing, processing an image in the spatial domain is an important approach. Among the common spatial filtering operations, both linear and nonlinear, image convolution is one of the most frequently used. Because convolution requires a very large number of multiply-add operations, processing high-resolution images takes too long. Conventional systems are implemented on a general-purpose CPU or DSP and carry out the mask convolution in a pipelined manner. Because of the speed limits of CPUs and DSPs, the conventional approach can no longer meet the requirements of high-speed, real-time designs.
Summary of the invention
To solve the above problems, the present invention provides an FPGA-based mask convolution method and system.
In one aspect, the technical solution adopted by the present invention is an FPGA-based mask convolution implementation method, comprising the steps of: obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width; obtaining the image data and storing it in the register group, and obtaining the convolution coefficients and storing them in ROM; obtaining a selection parameter for associating the register group with the convolution coefficients; and extracting the data stored in the register group and the corresponding convolution coefficients, multiplying them, and adding the multiplication results with an adder group to realize the convolution operation.
Preferably, a shift register group of corresponding depth is selected based on the data bit width; the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
Preferably, a data volume parameter and a window size parameter of the image data to be processed are obtained; registers of corresponding number and arrangement are selected from the register group based on the data volume parameter and the image data is stored in them; the data stored in the correspondingly arranged registers is extracted based on the window size parameter and multiplied by the corresponding convolution coefficients, and the adder group adds the multiplication results to realize the convolution operation.
Preferably, the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
Preferably, the adder group obtains the multiplication results and performs the addition based on a tree structure.
In another aspect, the technical solution adopted by the present invention is an FPGA-based mask convolution implementation system, comprising: a parameter input module, for obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width; a data input module, for obtaining the image data and storing it in the register group, and for obtaining the convolution coefficients and storing them in ROM; and a computing module, for obtaining a selection parameter for associating the register group with the convolution coefficients, the computing module being further used to extract the data stored in the register group and the corresponding convolution coefficients, multiply them, and add the multiplication results with an adder group to realize the convolution operation.
Preferably, the parameter input module is further used to select a shift register group of corresponding depth based on the data bit width; the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
Preferably, the system further comprises a window module, for obtaining a data volume parameter and a window size parameter of the image data to be processed; the data input module selects registers of corresponding number and arrangement from the register group based on the data volume parameter and stores the image data in them; the computing module extracts the data stored in the correspondingly arranged registers based on the window size parameter and multiplies it by the corresponding convolution coefficients, and the adder group adds the multiplication results to realize the convolution operation.
Preferably, the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
Preferably, the adder group obtains the multiplication results and performs the addition based on a tree structure.
The beneficial effects of the present invention are as follows: a register group matched to the image data is selected for data storage, ROM is used to store the convolution coefficients, the multiplication is carried out according to the correspondence between registers and convolution coefficients, and the products are added by an adder group to realize the convolution operation, which can improve the convolution result and the processing efficiency.
Brief description of the drawings
Fig. 1 is a schematic diagram of the basic FPGA structure according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the basic convolution module according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the multiplication according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the adder tree according to an embodiment of the present invention.
Detailed description
The present invention is described below with reference to the embodiments.
Embodiment 1 of the invention is an FPGA-based mask convolution implementation method, comprising the steps of: obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width; obtaining the image data and storing it in the register group, and obtaining the convolution coefficients and storing them in ROM; obtaining a selection parameter for associating the register group with the convolution coefficients; and extracting the data stored in the register group and the corresponding convolution coefficients, multiplying them, and adding the multiplication results with an adder group to realize the convolution operation.
The method of this embodiment further comprises: selecting a shift register group of corresponding depth based on the data bit width, wherein the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
The method of this embodiment further comprises: obtaining a data volume parameter and a window size parameter of the image data to be processed; selecting registers of corresponding number and arrangement from the register group based on the data volume parameter and storing the image data in them; and extracting the data stored in the correspondingly arranged registers based on the window size parameter and multiplying it by the corresponding convolution coefficients, the adder group adding the multiplication results to realize the convolution operation.
In the method of this embodiment, the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
In the method of this embodiment, the adder group obtains the multiplication results and performs the addition based on a tree structure.
With advances in integrated circuit technology, FPGA performance has improved markedly, giving users more processing resources and higher speed. The scheme is implemented on an FPGA platform. First, the externally supplied parameters are obtained, such as the size of the image data used in the convolution, the data bit width, and the size of the convolution window (i.e., the mask). A register group of corresponding depth is selected according to the data bit width: an FPGA platform may contain registers of many specifications, so selecting registers whose specification suits the data bit width increases the utilization of the whole system, and the selected registers are marked as the register group. The image data to be processed is obtained and stored in the register group, and the convolution coefficients (i.e., the multiplication coefficients) are obtained and stored in ROM. The convolution coefficients are kept in a file stored in the ROM, and a correspondence is set between register addresses and addresses in the file, so that when the data held in a register is extracted, the value at the corresponding address in the file (i.e., the convolution coefficient) can be extracted at the same time. The data and the value are multiplied, and the products of all registers (i.e., the register group) with their corresponding convolution coefficients are added to realize the convolution operation.
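The multiply-and-sum step described above can be modelled in a few lines of software. This is only a behavioural sketch, not the patent's hardware description: the function name `mask_convolve`, the `address_of` mapping and the sample values are all hypothetical, chosen to illustrate how a register-to-ROM address correspondence selects the coefficient for each register.

```python
def mask_convolve(registers, rom, address_of):
    """Behavioural model of the register-group / ROM multiply-accumulate.

    registers  -- pixel values held in the register group
    rom        -- convolution coefficients, indexed by ROM address
    address_of -- address_of[i] is the ROM address paired with register i
                  (a stand-in for the patent's "selection parameter")
    """
    # One product per register: register value times its mapped coefficient.
    products = [value * rom[address_of[i]] for i, value in enumerate(registers)]
    # The adder group reduces all products to a single convolution result.
    return sum(products)


# Hypothetical 3x3 example: nine pixel registers and nine coefficients,
# with register i mapped to ROM address i.
registers = [10, 20, 30, 40, 50, 60, 70, 80, 90]
rom = [1, 0, -1, 2, 0, -2, 1, 0, -1]
address_of = list(range(9))
result = mask_convolve(registers, rom, address_of)
print(result)  # -80
```

Keeping the address mapping as a separate argument mirrors the idea that coefficients can be changed or reordered in ROM without touching the register group.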
As a further improvement on Embodiment 1, because of factors such as limited computing capability, a convolution window is usually used to process the data. In this case, a parameter describing the window size (i.e., the window size parameter) is obtained first. For example, for an N*N window, the convolution window consists of N*N registers whose depth matches the bit width of the stored image data. The data of these N*N registers is read, and the multiplication described above (performed separately for each of the N*N registers) and the addition (summing the multiplication results) are carried out to realize the convolution operation. Limiting the window size makes it possible to control the processing capacity that the FPGA devotes to the data.
As a further improvement on Embodiment 1, shift registers (or a selected number of them) are set up as a buffer unit. In the above example, where the convolution window is N*N, N-1 shift registers are provided accordingly. The shift registers are connected end to end to form rows, and the rows of registers of the convolution window correspond one-to-one with the shift registers, so as to transfer the data: the shift registers obtain the external data (i.e., the image data), and the convolution window obtains the image data from the shift registers.
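The line-buffer arrangement can be illustrated with a small software model. The sketch below is an assumption-laden simulation rather than the patent's circuit: the name `stream_windows` is hypothetical, buffers start out zero-filled, one pixel is consumed per simulated clock, and only fully valid windows are reported (top image row first).

```python
from collections import deque


def stream_windows(pixels, width, n):
    """Yield (row, col, window) for each fully valid n x n window.

    Models n-1 chained line buffers, each `width` pixels deep, feeding
    n window rows of n registers each.
    """
    line_buffers = [deque([0] * width, maxlen=width) for _ in range(n - 1)]
    window = [deque([0] * n, maxlen=n) for _ in range(n)]
    for index, pix in enumerate(pixels):
        # taps[0] is the incoming pixel; taps[k] is the output of line
        # buffer k, i.e. the pixel that arrived k * width clocks earlier.
        taps = [pix] + [buf[0] for buf in line_buffers]
        # Shift every line buffer: buffer k takes the tap that feeds it.
        for k, buf in enumerate(line_buffers):
            buf.append(taps[k])
        # Shift every window row: row k takes tap k.
        for k in range(n):
            window[k].append(taps[k])
        row, col = divmod(index, width)
        if row >= n - 1 and col >= n - 1:
            # window[0] holds the newest image row, so reverse for top-first.
            yield row, col, [list(window[k]) for k in reversed(range(n))]


# A 4x4 test image with pixel values 0..15 and a 3x3 window.
for row, col, win in stream_windows(range(16), width=4, n=3):
    print(row, col, win)
```

Because every tap is read before any buffer is shifted, the model behaves like registers clocked on the same edge.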
As a further improvement on Embodiment 1, the convolution window can be modified to an arbitrary shape and size. For an M*N rectangular window, the convolution window is adjusted to consist of M*N registers and the number of shift registers is adjusted to N-1. For a circular window of radius R, the convolution window is adjusted to consist of (2R+1)*(2R+1) registers, the number of shift registers is adjusted to 2R+1, and the circular window array of radius R is then marked out from the (2R+1)*(2R+1) window registers. Windows of other arbitrary shapes and sizes can be obtained by cutting them out of the corresponding rectangular window.
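Marking a circular window out of a square register array can be sketched as a boolean mask. The source does not say which registers count as inside the circle, so the Euclidean test used here (squared distance from the centre no greater than R squared) and the name `circular_mask` are assumptions made for illustration.

```python
def circular_mask(radius):
    """Return a (2R+1) x (2R+1) grid of booleans marking the circular window.

    Assumption: a register belongs to the window when its squared distance
    from the centre register is at most radius squared.
    """
    size = 2 * radius + 1
    return [
        [(r - radius) ** 2 + (c - radius) ** 2 <= radius ** 2 for c in range(size)]
        for r in range(size)
    ]


mask = circular_mask(2)
for row in mask:
    print("".join("#" if inside else "." for inside in row))
# Number of registers that take part in the multiplication.
print(sum(map(sum, mask)))  # 13
```

Under this assumption a radius-2 window uses 13 of the 25 registers, which is also the number of coefficients that would need to be addressed in ROM.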
Regarding Embodiment 1, a Hex file is easy to modify in Quartus II, and signed decimal input can be selected, which is convenient for the user, so it is well suited as the carrier of the convolution coefficients. After the coefficients are written and saved, the Hex file is imported into the FPGA on-chip memory ROM, and the ROM address range to be read is adjusted automatically according to the window size.
For an N × N convolution window, the ROM addressing range is 0 to $N^2 - 1$; the $N^2$ convolution coefficients are read out, and the ROM output is connected to an $N^2$-stage pipeline. Adjusting the values and number of coefficients in this way is very convenient and allows arbitrary coefficients to be extracted and applied. For a convolution window of arbitrary shape and size, the ROM addressing range equals the number of window registers.
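To make the coefficient file concrete, the sketch below writes signed 8-bit coefficients as Intel HEX data records, one byte per ROM address, followed by the end-of-file record. It assumes the Hex file is in Intel HEX format with one 8-bit word per record; the helper names are hypothetical, and the exact layout a given Quartus II version expects should be checked against the tool's own documentation.

```python
def hex_record(address, data, record_type=0x00):
    """Format one Intel HEX record: count, 16-bit address, type, data, checksum."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type])
    body += bytes(data)
    # The checksum is the two's complement of the low byte of the byte sum.
    checksum = (-sum(body)) & 0xFF
    return ":" + body.hex().upper() + f"{checksum:02X}"


def coefficients_to_intel_hex(coefficients):
    """Emit one data record per signed 8-bit coefficient, then the EOF record."""
    lines = []
    for address, coeff in enumerate(coefficients):
        if not -128 <= coeff <= 127:
            raise ValueError(f"coefficient {coeff} does not fit in 8 signed bits")
        # Two's-complement encoding of the signed value into one byte.
        lines.append(hex_record(address, [coeff & 0xFF]))
    lines.append(hex_record(0, [], record_type=0x01))
    return lines


# Hypothetical 3x3 mask written to ROM addresses 0..8.
for line in coefficients_to_intel_hex([1, 0, -1, 2, 0, -2, 1, 0, -1]):
    print(line)
```

With this layout the ROM addressing range 0 to 8 for a 3 × 3 window corresponds directly to the nine data records.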
Regarding Embodiment 1, the ordinary addition process extracts the data one by one and adds them, which increases the addition time when there is a lot of data. On an FPGA, each adder in the adder group can instead be connected to two adjacent registers at the same time to obtain the corresponding multiplication results, and the results of two such adders are then added again. In this way multiple data items are processed in one clock, the number of clocks and adders required is greatly reduced, and operation efficiency is improved while resource usage is reduced.
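The pairwise addition scheme can be modelled as a level-by-level reduction. This is a software sketch under the assumption that each level takes one clock and that an unpaired value is simply passed to the next level; the name `adder_tree` is hypothetical.

```python
def adder_tree(values):
    """Sum `values` pairwise, level by level; return (total, clocks)."""
    stage = list(values)
    if not stage:
        raise ValueError("adder tree needs at least one input")
    clocks = 0
    while len(stage) > 1:
        # Each adder takes two adjacent inputs from the current level.
        next_stage = [stage[i] + stage[i + 1] for i in range(0, len(stage) - 1, 2)]
        if len(stage) % 2:
            # An odd leftover is registered and forwarded unchanged.
            next_stage.append(stage[-1])
        stage = next_stage
        clocks += 1
    return stage[0], clocks


# Nine products from a 3x3 window are reduced in four levels (9 -> 5 -> 3 -> 2 -> 1).
print(adder_tree(range(1, 10)))  # (45, 4)
```

A sequential accumulator would need eight additions in a row for the same nine products, which is the delay the tree structure avoids.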
Embodiment 2 of the invention is an FPGA-based mask convolution implementation system, comprising: a parameter input module, for obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width, and also for obtaining the shape and size of the convolution window, a corresponding convolution operation unit being generated automatically based on these parameters; a data input module, for obtaining the image data and storing it in the register group, and for obtaining the convolution coefficients and importing them into ROM; and a computing module, for obtaining a selection parameter for associating the register group with the convolution coefficients, the computing module being further used to extract the data stored in the register group and the corresponding convolution coefficients, multiply them, and add the multiplication results with an adder group to realize the convolution operation.
In the system of this embodiment, the parameter input module is further used to select a shift register group of corresponding depth based on the data bit width; the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
In the system of this embodiment, a window module is further included, for obtaining a data volume parameter and a window size parameter of the image data to be processed; the data input module selects registers of corresponding number and arrangement from the register group based on the data volume parameter and stores the image data in them; the computing module extracts the data stored in the correspondingly arranged registers based on the window size parameter and multiplies it by the corresponding convolution coefficients, and the adder group adds the multiplication results to realize the convolution operation.
In the system of this embodiment, the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
In the system of this embodiment, the adder group obtains the multiplication results and performs the addition based on a tree structure.
Embodiment 3 of the invention describes the process of implementing the convolution on an FPGA:
The basic FPGA structure shown in Fig. 1 includes a central control unit, an input unit (i.e., a data interface), a line buffer unit (composed of shift registers), a convolution window unit (the register group), a convolution operation unit (which obtains the data of the register group and performs the multiplication), an adder tree unit (the adder group), a template parameter interface (i.e., a data interface or data input), a convolution coefficient interface (i.e., the ROM, used to store the multiplication coefficients) and an output unit. The input unit is connected to the line buffer unit, the line buffer unit to the convolution window unit, the convolution window unit to the convolution operation unit, the convolution operation unit to the adder tree unit, and the adder tree unit to the output unit. The template parameter interface is connected to both the line buffer unit and the convolution window unit (to define the number of shift registers and registers). The convolution coefficient interface is connected to the convolution operation unit to supply the multiplication coefficients.
In the first step, the basic convolution module shown in Fig. 2 is established (comprising the external auxiliary units, the convolution window unit, the line buffer unit and the convolution operation unit), where Line_buffer1~Line_buffer(N-1) are N-1 row register groups (composed of shift registers), Din and Dout are respectively the input and output of each row register group, and the window contains $N^2$ registers $P$. The shift registers are connected end to end; the number of shift registers in each row register group equals the number of pixels in one row of the image, and the depth of each shift register equals the bit width of the received image data. N registers are connected externally to each shift register row, and these N registers are also connected end to end.
In the convolution window unit, the data of the $N^2$ registers is multiplied by the externally supplied weight coefficients (i.e., Weights, stored in ROM), the products are then added pairwise by the adder tree module, and the convolution result is finally obtained.
As can be seen from Fig. 2, Din0 is connected to the data input Pix_in; for every other shift register, the output Dout(M) of the M-th shift register is connected both to the input Din(M+1) of the (M+1)-th shift register and to the window register $P_{MN}$, while each window register $P_M$ forms a pipeline with the preceding $P_{M-1}$. The whole structure is implemented by instantiating modules repeatedly in a loop.
The external auxiliary units include the ROM and the template parameter interface (used to define parameters, such as N*N Shift Registers in the figure); in the figure, Multipliers is the convolution operation unit and Add Tree is the adder tree unit.
In the second step, as shown in the multiplication schematic of Fig. 3, the convolution coefficients are input and the convolution window (i.e., the convolution window unit) is multiplied by the corresponding convolution coefficients. The convolution coefficients are written into a Hex file from outside, the Hex file is imported into the FPGA on-chip memory ROM, and the ROM address range to be read is adjusted automatically according to the window size. Taking a 3 × 3 window as an example, the data bit width is 8 bits and the coefficients $Q_0 \sim Q_8$ range from -128 to +127. The mask coefficients $Q_0 \sim Q_8$ are arranged as a 3 × 3 matrix (the matrix itself is not preserved in this text), and the order in which they are written to the Hex file is shown in Fig. 3. After the coefficients are imported, address locations 0 to 8 of the ROM are addressed and the output is connected to a 9-stage pipeline. For an N × N convolution window, the ROM addressing range is 0 to $N^2 - 1$; after the $N^2$ coefficients are read out, they are multiplied with the convolution window register array.
In the third step, as shown in the adder tree schematic of Fig. 4, the multiplication results from the convolution operation ($D_0 \sim D_N$, where D is a product and N is the ID of the corresponding register; the figure also marks the registers, Reg, and the adders) are fed into the adder tree module, and the complete convolution result is output once accumulation finishes. Adjacent data are added in pairs, with all divisions below taken as integer divisions. If $N^2$ is even, the first operation requires $N^2/2$ adders and $N^2/2$ registers, and the second requires $N^2/4$ adders and $N^2/4$ registers. If $N^2$ is odd, the first operation requires $N^2/2$ adders and $N^2/2 + 1$ registers, and the second requires $N^2/4$ adders and $N^2/4 + (N^2/2)\,\%\,2$ registers. By analogy, the number of data items to be added at each operation is:
$a_n = \lfloor a_{n-1}/2 \rfloor + (a_{n-1} \bmod 2)$, with $a_0 = N \times N$,
where $a_n$ is the number of data items to be added at the n-th operation, $a_0$ is the initial value, N*N is the convolution window size, and each operation takes one clock.
The number of operations n needed to complete the adder tree satisfies $2^{n-1} < N \times N \le 2^{n}$, and the number of clocks required is also n. Therefore, for N data items, the number of operations and clock cycles n needed to finish the accumulation satisfies $\log_2 N \le n < \log_2 N + 1$.
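As a worked illustration of the stage count for a 3 × 3 window, the following assumes that each operation adds the data in pairs and carries any unpaired item forward unchanged, so that $a_n = \lfloor a_{n-1}/2 \rfloor + (a_{n-1} \bmod 2)$; this recurrence is inferred from the description rather than quoted from it.

```latex
\begin{aligned}
a_0 &= 3 \times 3 = 9 \\
a_1 &= \lfloor 9/2 \rfloor + (9 \bmod 2) = 4 + 1 = 5 \\
a_2 &= \lfloor 5/2 \rfloor + (5 \bmod 2) = 2 + 1 = 3 \\
a_3 &= \lfloor 3/2 \rfloor + (3 \bmod 2) = 1 + 1 = 2 \\
a_4 &= \lfloor 2/2 \rfloor + (2 \bmod 2) = 1 + 0 = 1 \\
n &= 4, \qquad 2^{3} = 8 < 9 \le 16 = 2^{4}
\end{aligned}
```

Four operations, and therefore four clocks, reduce the nine products to a single result, which agrees with the bound on n stated above.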
The convolution window unit is a flexibly adjustable hardware structure: windows of different shapes and sizes are generated automatically from the input parameters, so no resources are wasted.
The above are merely preferred embodiments of the present invention, and the invention is not limited to the above embodiments; anything that achieves the technical effect of the present invention by the same means shall fall within its scope of protection. Within the scope of protection of the present invention, its technical solutions and/or embodiments may be modified and varied in many ways.
Claims (10)
1. An FPGA-based mask convolution implementation method, characterized by comprising the steps of:
obtaining the data bit width of the image data, and selecting a register group of corresponding depth based on the data bit width;
obtaining the image data and storing it in the register group, and obtaining the convolution coefficients and storing them in ROM;
obtaining a selection parameter for associating the register group with the convolution coefficients;
extracting the data stored in the register group and the corresponding convolution coefficients and multiplying them, and adding the multiplication results with an adder group to realize the convolution operation.
2. The FPGA-based mask convolution implementation method according to claim 1, characterized by further comprising:
selecting a shift register group of corresponding depth based on the data bit width, wherein the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
3. The FPGA-based mask convolution implementation method according to claim 2, characterized by further comprising:
obtaining a data volume parameter and a window size parameter of the image data to be processed;
selecting registers of corresponding number and arrangement from the register group based on the data volume parameter and storing the image data in them;
extracting the data stored in the correspondingly arranged registers based on the window size parameter and multiplying it by the corresponding convolution coefficients, the adder group adding the multiplication results to realize the convolution operation.
4. The FPGA-based mask convolution implementation method according to any one of claims 1 to 3, characterized in that the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
5. The FPGA-based mask convolution implementation method according to claim 4, characterized in that the adder group obtains the multiplication results and performs the addition based on a tree structure.
6. An FPGA-based mask convolution implementation system, characterized by comprising:
a parameter input module, for obtaining the data bit width of the image data and selecting a register group of corresponding depth based on the data bit width;
a data input module, for obtaining the image data and storing it in the register group, and for obtaining the convolution coefficients and storing them in ROM;
a computing module, for obtaining a selection parameter for associating the register group with the convolution coefficients;
the computing module being further used to extract the data stored in the register group and the corresponding convolution coefficients and multiply them, and to add the multiplication results with an adder group to realize the convolution operation.
7. The FPGA-based mask convolution implementation system according to claim 6, characterized in that the parameter input module is further used to select a shift register group of corresponding depth based on the data bit width;
the shift register group is used to obtain the image data, and the register group obtains the image data from the shift register group and stores it.
8. The FPGA-based mask convolution implementation system according to claim 7, characterized by further comprising a window module, for obtaining a data volume parameter and a window size parameter of the image data to be processed;
the data input module selects registers of corresponding number and arrangement from the register group based on the data volume parameter and stores the image data in them;
the computing module extracts the data stored in the correspondingly arranged registers based on the window size parameter and multiplies it by the corresponding convolution coefficients, and the adder group adds the multiplication results to realize the convolution operation.
9. The FPGA-based mask convolution implementation system according to any one of claims 6 to 8, characterized in that the convolution coefficients are stored in a Hex file, and the Hex file is stored in the ROM.
10. The FPGA-based mask convolution implementation system according to claim 9, characterized in that the adder group obtains the multiplication results and performs the addition based on a tree structure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710888288.4A CN107656899A (en) | 2017-09-27 | 2017-09-27 | A kind of mask convolution method and system based on FPGA |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107656899A true CN107656899A (en) | 2018-02-02 |
Family
ID=61116930
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710888288.4A Pending CN107656899A (en) | 2017-09-27 | 2017-09-27 | A kind of mask convolution method and system based on FPGA |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107656899A (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108491929A (en) * | 2018-03-20 | 2018-09-04 | 南开大学 | A kind of structure of the configurable parallel fast convolution core based on FPGA |
CN108681984A (en) * | 2018-07-26 | 2018-10-19 | 珠海市微半导体有限公司 | A kind of accelerating circuit of 3*3 convolution algorithms |
CN109284475A (en) * | 2018-09-20 | 2019-01-29 | 郑州云海信息技术有限公司 | A kind of matrix convolution computing module and matrix convolution calculation method |
CN109472734A (en) * | 2018-10-18 | 2019-03-15 | 江苏第二师范学院(江苏省教育科学研究院) | A kind of target detection network and its implementation based on FPGA |
CN110543939A (en) * | 2019-06-12 | 2019-12-06 | 电子科技大学 | hardware acceleration implementation framework for convolutional neural network backward training based on FPGA |
CN110647978A (en) * | 2019-09-05 | 2020-01-03 | 北京三快在线科技有限公司 | System and method for extracting convolution window in convolution neural network |
CN111260536A (en) * | 2018-12-03 | 2020-06-09 | 中国科学院沈阳自动化研究所 | Digital image multi-scale convolution processor with variable parameters and implementation method thereof |
CN111382861A (en) * | 2018-12-31 | 2020-07-07 | 爱思开海力士有限公司 | Processing system |
CN111488983A (en) * | 2020-03-24 | 2020-08-04 | 哈尔滨工业大学 | Lightweight CNN model calculation accelerator based on FPGA |
CN113328998A (en) * | 2021-05-14 | 2021-08-31 | 维沃移动通信有限公司 | Image data transmission method and electronic equipment |
CN114022366A (en) * | 2022-01-06 | 2022-02-08 | 深圳鲲云信息科技有限公司 | Image size adjusting structure based on data stream architecture, image size adjusting method based on data stream architecture and image size adjusting equipment based on data stream architecture |
CN114120082A (en) * | 2021-11-23 | 2022-03-01 | 西南交通大学 | Image acceleration convolution calculation method, system, equipment and readable storage medium |
CN114862654A (en) * | 2022-04-15 | 2022-08-05 | 山东浪潮科学研究院有限公司 | Method and system for realizing real-time template convolution on FPGA |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1577438A (en) * | 2003-07-07 | 2005-02-09 | 日本先锋公司 | Panel display apparatus |
CN102208005A (en) * | 2011-05-30 | 2011-10-05 | 华中科技大学 | 2-dimensional (2-D) convolver |
CN102681815A (en) * | 2012-05-11 | 2012-09-19 | 深圳市清友能源技术有限公司 | Signed multiply-accumulate algorithm method using adder tree structure |
CN106228240A (en) * | 2016-07-30 | 2016-12-14 | 复旦大学 | Deep convolutional neural network implementation method based on FPGA |
CN106779060A (en) * | 2017-02-09 | 2017-05-31 | 武汉魅瞳科技有限公司 | A computation method for deep convolutional neural networks suitable for hardware implementation |
- 2017-09-27: application CN201710888288.4A filed in CN, published as CN107656899A (status: Pending)
Non-Patent Citations (1)
Title |
---|
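Xu Chu (徐楚) et al.: "Facial feature point localization based on an optimized SIFT feature descriptor", Journal of Nankai University (Natural Science Edition) (《南开大学学报(自然科学版)》) * |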
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108491929A (en) * | 2018-03-20 | 2018-09-04 | 南开大学 | A kind of structure of the configurable parallel fast convolution core based on FPGA |
CN108681984A (en) * | 2018-07-26 | 2018-10-19 | 珠海市微半导体有限公司 | A kind of accelerating circuit of 3*3 convolution algorithms |
CN108681984B (en) * | 2018-07-26 | 2023-08-15 | 珠海一微半导体股份有限公司 | Acceleration circuit of 3*3 convolution algorithm |
CN109284475A (en) * | 2018-09-20 | 2019-01-29 | 郑州云海信息技术有限公司 | A kind of matrix convolution computing module and matrix convolution calculation method |
CN109284475B (en) * | 2018-09-20 | 2021-10-29 | 郑州云海信息技术有限公司 | Matrix convolution calculating device and matrix convolution calculating method |
CN109472734A (en) * | 2018-10-18 | 2019-03-15 | 江苏第二师范学院(江苏省教育科学研究院) | A kind of target detection network and its implementation based on FPGA |
CN109472734B (en) * | 2018-10-18 | 2022-12-27 | 江苏第二师范学院(江苏省教育科学研究院) | Target detection network based on FPGA and implementation method thereof |
CN111260536B (en) * | 2018-12-03 | 2022-03-08 | 中国科学院沈阳自动化研究所 | Digital image multi-scale convolution processor with variable parameters and implementation method thereof |
CN111260536A (en) * | 2018-12-03 | 2020-06-09 | 中国科学院沈阳自动化研究所 | Digital image multi-scale convolution processor with variable parameters and implementation method thereof |
CN111382861B (en) * | 2018-12-31 | 2023-11-10 | 爱思开海力士有限公司 | Processing system |
CN111382861A (en) * | 2018-12-31 | 2020-07-07 | 爱思开海力士有限公司 | Processing system |
CN110543939B (en) * | 2019-06-12 | 2022-05-03 | 电子科技大学 | Hardware acceleration realization device for convolutional neural network backward training based on FPGA |
CN110543939A (en) * | 2019-06-12 | 2019-12-06 | 电子科技大学 | Hardware acceleration implementation framework for convolutional neural network backward training based on FPGA |
CN110647978A (en) * | 2019-09-05 | 2020-01-03 | 北京三快在线科技有限公司 | System and method for extracting convolution window in convolution neural network |
CN111488983A (en) * | 2020-03-24 | 2020-08-04 | 哈尔滨工业大学 | Lightweight CNN model calculation accelerator based on FPGA |
CN111488983B (en) * | 2020-03-24 | 2023-04-28 | 哈尔滨工业大学 | Lightweight CNN model calculation accelerator based on FPGA |
CN113328998A (en) * | 2021-05-14 | 2021-08-31 | 维沃移动通信有限公司 | Image data transmission method and electronic equipment |
CN114120082A (en) * | 2021-11-23 | 2022-03-01 | 西南交通大学 | Image acceleration convolution calculation method, system, equipment and readable storage medium |
CN114022366A (en) * | 2022-01-06 | 2022-02-08 | 深圳鲲云信息科技有限公司 | Image size adjusting structure based on data stream architecture, image size adjusting method based on data stream architecture and image size adjusting equipment based on data stream architecture |
CN114022366B (en) * | 2022-01-06 | 2022-03-18 | 深圳鲲云信息科技有限公司 | Image size adjusting device, adjusting method and equipment based on data stream architecture |
CN114862654A (en) * | 2022-04-15 | 2022-08-05 | 山东浪潮科学研究院有限公司 | Method and system for realizing real-time template convolution on FPGA |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN107656899A (en) | A kind of mask convolution method and system based on FPGA | |
CN106445471A (en) | Processor and method for executing matrix multiplication on processor | |
CN107729989A (en) | A kind of device and method for being used to perform artificial neural network forward operation | |
EP3553673A1 (en) | Convolution operation chip and communication device | |
CN109117186A (en) | Processing with Neural Network device and its method for executing Outer Product of Vectors instruction | |
CN109767000A (en) | Neural network convolution method and device based on Winograd algorithm | |
CN108154229A (en) | Accelerate the image processing method of convolutional neural networks frame based on FPGA | |
CN109993273B (en) | Convolution implementation method of convolution neural network and related product | |
WO2021232843A1 (en) | Image data storage method, image data processing method and system, and related apparatus | |
CN108777612A (en) | A kind of optimization method and circuit of proof of work operation chip core calculating unit | |
CN107612523A (en) | A kind of FIR filter implementation method based on software look-up table method | |
CN109240644A (en) | A kind of local search method and circuit for Ising chip | |
EP4102354B1 (en) | Method, circuit, and soc for performing matrix multiplication operation | |
CN114996638A (en) | Configurable fast Fourier transform circuit with sequential architecture | |
CN107766503A (en) | Data method for quickly querying and device based on redis | |
CN110414672A (en) | Convolution algorithm method, apparatus and system | |
CN110135563A (en) | A kind of convolutional neural networks binarization method and computing circuit | |
CN110414663A (en) | The convolution implementation method and Related product of neural network | |
CN109146060A (en) | A kind of method and device based on convolutional neural networks processing data | |
CN109902821A (en) | A kind of data processing method, device and associated component | |
CN113222129A (en) | Convolution operation processing unit and system based on multi-level cache cyclic utilization | |
CN111178513B (en) | Convolution implementation method and device of neural network and terminal equipment | |
WO2023284130A1 (en) | Chip and control method for convolution calculation, and electronic device | |
CN114722048B (en) | Data processing method and device, electronic equipment and storage medium | |
CN106843819A (en) | The method and device of object serialization |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180202 |