CN105681628B - Convolutional network arithmetic unit, reconfigurable convolutional neural network processor, and method for implementing image denoising - Google Patents

Convolutional network arithmetic unit, reconfigurable convolutional neural network processor, and method for implementing image denoising

Info

Publication number
CN105681628B
CN105681628B (application CN201610003960.2A)
Authority
CN
China
Prior art keywords
input
reconfigurable
convolution
output
neural networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610003960.2A
Other languages
Chinese (zh)
Other versions
CN105681628A (en)
Inventor
张斌
饶磊
李艳婷
杨宏伟
赵季中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN201610003960.2A priority Critical patent/CN105681628B/en
Publication of CN105681628A publication Critical patent/CN105681628A/en
Application granted granted Critical
Publication of CN105681628B publication Critical patent/CN105681628B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/14 Picture signal circuitry for video frequency region
    • H04N 5/21 Circuitry for suppressing or minimising disturbance, e.g. moiré or halo
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/14 Picture signal circuitry for video frequency region
    • H04N 5/21 Circuitry for suppressing or minimising disturbance, e.g. moiré or halo
    • H04N 5/213 Circuitry for suppressing or minimising impulsive noise
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 9/00 Details of colour television systems
    • H04N 9/64 Circuits for processing colour signals
    • H04N 9/73 Colour balance circuits, e.g. white balance circuits or colour temperature control

Abstract

The invention discloses a convolutional network arithmetic unit, a reconfigurable convolutional neural network processor, and a method for implementing image denoising. The disclosed processor comprises a bus interface, a pre-processing unit, a reconfigurable hardware controller, an SRAM, an SRAM control unit, an input buffer module, an output buffer module, a memory, a data storage controller, and convolutional network arithmetic units. It consumes few resources, runs fast, and is applicable to common convolutional neural network architectures. The invention implements convolutional neural networks with fast processing, easy porting, and low resource consumption; it can restore images or video contaminated by raindrops or dust, and can also serve as a pre-processing step that assists subsequent image recognition or classification.

Description

Convolutional network arithmetic unit, reconfigurable convolutional neural network processor, and method for implementing image denoising
Technical field
The present invention relates to the field of image processing, and in particular to a convolutional network arithmetic unit, a reconfigurable convolutional neural network processor, and a method for implementing image denoising.
Background technique
Removing raindrops and dust from images matters for image processing applications, especially video surveillance and navigation systems. Such removal can restore images or video contaminated by raindrops or dust, and can also serve as a pre-processing step that assists subsequent image recognition or classification.
Most current methods for removing image noise rely on Gaussian filtering, median filtering, bilateral filtering, and the like. Their results are often poor and fail to meet the demands of specific image processing applications. A more effective method is therefore needed, and convolutional neural networks are a good candidate.
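For contrast with the learned approach, the classical filters named above are simple local statistics. A minimal sketch of median filtering, illustrative only (the patent does not specify any such implementation; the clamped border handling is our choice):

```python
def median_filter(img, k=3):
    """Classical median denoising: replace each pixel with the median of
    its k x k neighbourhood; borders are handled by clamping coordinates."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
            window.sort()
            out[y][x] = window[len(window) // 2]
    return out
```

Such a filter removes isolated impulse noise but, as the background notes, cannot model structured contamination like raindrops, which motivates the convolutional approach.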
Most current deep learning networks run on GPUs, but GPUs are expensive and power-hungry, which hinders large-scale deployment. On CPUs the running speed is slow and large deep learning networks execute inefficiently, failing to meet performance requirements.
In short, applying convolutional neural networks with current technology suffers from large processor area, high cost, high power consumption, and poor performance. A reconfigurable convolutional neural network processor with low power consumption, small area, and good processing quality is therefore needed.
Summary of the invention
The purpose of the present invention is to provide a convolutional network arithmetic unit, a reconfigurable convolutional neural network processor, and a method for implementing image denoising, with low hardware resource consumption and small area, capable of restoring images or video contaminated by raindrops or dust.
To achieve the above purpose, the present invention adopts the following technical scheme:
A convolutional network arithmetic unit comprises two reconfigurable separable convolution modules, a nonlinear activation function unit, and a multiply-accumulator unit.
The output of the first reconfigurable separable convolution module is the input of the nonlinear activation function unit; the output of the nonlinear activation function unit is the input of the multiply-accumulator unit; and the output of the multiply-accumulator unit is the input of the second reconfigurable separable convolution module.
The image signal and the network configuration parameter signal are input to the first reconfigurable separable convolution module, which performs a 16×16 convolution. The nonlinear activation function unit computes the activation function of the convolutional neural network. The multiply-accumulator unit computes the connection layer of the convolutional neural network. The second reconfigurable separable convolution module performs four 8×8 convolutions simultaneously.
The multiply-accumulator unit comprises several multiply-accumulators and several registers. Each multiply-accumulator computes the sum of products of the previous convolutional layer's output values and the weight parameters; the registers feed the previous layer's results into the multiply-accumulators.
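Behaviourally, each multiply-accumulator forms a running sum of products between the previous layer's outputs and the connection-layer weights, with the registers staging those outputs. A sketch of one neuron's accumulation (function name and the optional bias are illustrative, not taken from the patent):

```python
def mac(inputs, weights, bias=0.0):
    """One connection-layer neuron: accumulate the products of the
    previous convolutional layer's outputs with their weights, as a
    hardware multiply-accumulator does over successive cycles."""
    acc = bias
    for x, w in zip(inputs, weights):  # one product accumulated per cycle
        acc += x * w
    return acc
```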
Further, each reconfigurable separable convolution module comprises sixteen 4×4 reconfigurable one-dimensional convolution modules and a first register group. The first register group feeds the image signal and the convolutional network parameters into the one-dimensional convolution modules. A reconfigurable separable convolution module can perform one 16×16 convolution or four 8×8 convolutions simultaneously. Each 4×4 reconfigurable one-dimensional convolution module comprises four first selectors, four first 2-input multipliers, a first 4-input adder, four second 2-input multipliers, and a second 4-input adder. The output of each first selector connects to one input of the corresponding first 2-input multiplier, whose other input is a neural network weight. The outputs of the four first 2-input multipliers connect to the inputs of the first 4-input adder. The inputs of the four second 2-input multipliers are the output of the first 4-input adder and neural network weights. The inputs of the second 4-input adder are the outputs of the four second 2-input multipliers.
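One reading of this two-stage datapath is a separable (rank-1) 4×4 convolution: the first multipliers and first adder reduce each group of four pixels with a row-weight vector, and the second multipliers and second adder weight those partial sums with a column-weight vector. A behavioural sketch under that assumption (the vectors u and v and the per-row staging are our interpretation, not stated in the patent):

```python
def separable_conv4x4(patch, v, u):
    """4x4 convolution with a rank-1 kernel K[i][j] = u[i] * v[j],
    computed as two one-dimensional passes that mirror the module's
    first-stage and second-stage multiplier/adder pairs."""
    # Stage 1: four 2-input multipliers + first 4-input adder reduce a row.
    row_sums = [sum(patch[i][j] * v[j] for j in range(4)) for i in range(4)]
    # Stage 2: second 2-input multipliers weight the row sums by u;
    # the second 4-input adder produces the window response.
    return sum(row_sums[i] * u[i] for i in range(4))
```

Under this reading the module needs only 8 weights per window instead of 16, which is consistent with the patent's emphasis on low resource consumption.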
Further, the nonlinear activation function unit comprises a QD generator and an arithmetic unit group. The input of the QD generator is the output of the reconfigurable separable convolution module, and the input of the arithmetic unit group is the output of the QD generator. The QD generator produces the parameters required by the activation function; the arithmetic unit group computes the activation function's final value.
The QD generator comprises a first divider: the input signal enters the first divider, which outputs a quotient Q and a remainder D. The arithmetic unit group comprises a shift register, two first adders, and a second divider. The shift register's output feeds the two first adders; the outputs of the two first adders are the inputs of the second divider. The shift register, first adders, and second divider are connected in sequence.
A reconfigurable convolutional neural network processor comprises a bus interface, a pre-processing unit, a reconfigurable hardware controller, an SRAM, an SRAM control unit, an input buffer module, an output buffer module, a memory, a data storage controller, and several convolutional network arithmetic units as described above. The bus interface connects the pre-processing unit, the data storage controller, the reconfigurable hardware controller, and the input and output buffers. The memory connects to the data storage controller. The input buffer connects to the reconfigurable hardware controller and the SRAM control unit. Each convolutional network arithmetic unit connects to the input buffer module and the output buffer module.
The input of the pre-processing unit is an image or video signal; it performs pre-processing operations such as white balance and noise filtering.
The input buffer module and output buffer module cache the inputs and outputs of the convolutional network arithmetic units, respectively.
The reconfigurable hardware controller configures the convolutional network arithmetic units and controls their computation; during or at the end of computation it issues interrupt requests to interact with the external system.
The SRAM control unit controls the transfer of the convolutional network weight parameters.
Further, the processor includes 512 convolutional network arithmetic units and implements image denoising based on convolutional neural networks.
Further, the reconfigurable convolutional neural network processor implements a 3-layer convolutional neural network for removing raindrops and dust adhering to images or video. The first layer of the network consists of 512 16×16 convolutions, the second layer is the neural network connection layer, and the third layer consists of 512 8×8 convolutions.
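To make the data flow of this 3-layer network concrete, the following sketch walks through the tensor shapes only. Both the "valid" (no-padding) convention and the single-channel input are our assumptions; the patent does not state how borders are handled:

```python
def denoise_network_shapes(h, w):
    """Shape walk-through of the described 3-layer network, assuming
    'valid' (no-padding) convolutions and a single-channel h x w input;
    both assumptions are ours, not the patent's."""
    layer1 = (512, h - 15, w - 15)   # 512 feature maps from 16x16 kernels
    layer2 = (512, h - 15, w - 15)   # connection layer mixes the 512 maps per pixel
    layer3 = (1, h - 22, w - 22)     # 8x8 kernels trim a further 7 per side
    return [layer1, layer2, layer3]
```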
A method by which the reconfigurable convolutional neural network processor implements image denoising comprises:
randomly reducing the number of convolutions during image denoising, thereby reducing hardware resource consumption and increasing processing speed;
or, during image denoising, splitting the 16×16 and 8×8 convolution units into sixteen and four 4×4 convolution templates respectively, and applying one-dimensional convolution to each 4×4 convolution.
Compared with the prior art, the invention has the following advantages. Using reconfiguration techniques, the convolutional network arithmetic unit can perform one 16×16 convolution or four 8×8 convolutions simultaneously, improving hardware performance and flexibility. Using deep learning, the invention removes raindrops and dust from images with results that meet application demands. Without degrading the results, the invention randomly reduces the number of convolution templates and additionally applies blockwise one-dimensional convolution, greatly reducing hardware resource consumption and greatly increasing processing speed. The processor implements a 3-layer convolutional neural network and can supply features for higher-level image recognition and classification. GPUs are expensive, power-hungry, and large, while CPUs are slow and run large deep learning networks inefficiently; by combining reconfiguration with template reduction and blockwise one-dimensional convolution, the reconfigurable convolutional neural network processor of the invention consumes few resources, is easy to realize in hardware, and can restore images or video contaminated by raindrops or dust.
Detailed description of the invention
Fig. 1 is a structural schematic diagram of the convolutional network arithmetic unit;
Fig. 2 is a structural schematic diagram of the nonlinear activation function unit;
Fig. 3 is a structural schematic diagram of a 4×4 reconfigurable one-dimensional convolution module;
Fig. 4 is a structural schematic diagram of the reconfigurable separable convolution module;
Fig. 5 is a structural schematic diagram of the reconfigurable convolutional neural network processor.
Specific embodiment
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, the convolutional network arithmetic unit used in the reconfigurable convolutional neural network processor of the invention comprises two reconfigurable separable convolution modules, a nonlinear activation function unit, and a multiply-accumulator unit. The output of the first reconfigurable separable convolution module is the input of the nonlinear activation function unit; the output of the nonlinear activation function unit is the input of the multiply-accumulator unit; and the output of the multiply-accumulator unit is the input of the second reconfigurable separable convolution module.
The image signal and the network configuration parameter signal are input to the first reconfigurable separable convolution module, which performs a 16×16 convolution. The nonlinear activation function unit computes the activation function of the convolutional neural network. The multiply-accumulator unit computes the connection layer. The second reconfigurable separable convolution module performs four 8×8 convolutions simultaneously.
Referring to Fig. 2, the nonlinear activation function unit comprises a QD generator and an arithmetic unit group. The input of the QD generator is the output of the reconfigurable separable convolution module, and the input of the arithmetic unit group is the output of the QD generator. The QD generator produces the parameters required by the activation function; the arithmetic unit group computes its final value.
The activation function of the neural network of the invention is the hyperbolic tangent
tanh(x) = (e^(2x) − 1) / (e^(2x) + 1).
By splitting the domain as x = Q·ln 2 + D and expanding e^(2D) in a Taylor series, one obtains
tanh(x) = (2^(2Q)·e^(2D) − 1) / (2^(2Q)·e^(2D) + 1),
where |D| < ln 2.
The QD generator comprises a first divider: the input signal enters the first divider, which divides by the fixed value 0.69 and outputs a quotient Q and a remainder D. The arithmetic unit group comprises a shift register, two first adders, and a second divider. The shift register's output feeds the two first adders; the outputs of the two first adders are the inputs of the second divider. The shift register, first adders, and second divider are connected in sequence.
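Numerically, the unit computes tanh by range reduction: the first divider splits its input by the constant 0.69 (approximately ln 2) into quotient Q and remainder D, the shift register supplies the power of two, and the adders and second divider form (p − 1)/(p + 1). A floating-point sketch of that scheme (the Taylor-series depth and the handling of negative inputs are our assumptions; the hardware works in fixed point):

```python
import math

LN2_CONST = 0.69  # fixed divisor of the first divider, approximately ln 2

def tanh_qd(x):
    """tanh(x) = (e^(2x) - 1) / (e^(2x) + 1); with x = Q*ln2 + D this
    becomes (2^(2Q) * e^(2D) - 1) / (2^(2Q) * e^(2D) + 1), so only a
    short Taylor series for e^(2D) and a scaling by 2^(2Q) are needed."""
    q = math.floor(x / LN2_CONST)          # quotient Q from the first divider
    d = x - q * LN2_CONST                  # remainder D, 0 <= D < ln 2
    e2d = sum((2 * d) ** n / math.factorial(n) for n in range(8))  # Taylor series
    p = (2.0 ** (2 * q)) * e2d             # shift register: scale by 2^(2Q)
    return (p - 1) / (p + 1)               # adders and second divider
```

Because |D| < ln 2, a few Taylor terms already give good accuracy, which is what makes the scheme cheap in hardware.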
Referring to Fig. 3, each 4×4 reconfigurable one-dimensional convolution module comprises four first selectors (MUX), four first 2-input multipliers, a first 4-input adder, four second 2-input multipliers, and a second 4-input adder. The two inputs of each first selector are the image signal and the previous stage's result. The output of each first selector connects to one input of the corresponding first 2-input multiplier, whose other input is a neural network weight. The outputs of the four first 2-input multipliers connect to the inputs of the first 4-input adder. The inputs of the four second 2-input multipliers are the output of the first 4-input adder and neural network weights. The inputs of the second 4-input adder are the outputs of the four second 2-input multipliers.
Referring to Fig. 4, the reconfigurable separable convolution module comprises a first register group, sixteen 4×4 reconfigurable one-dimensional convolution modules, four 4-input first adders, and one 4-input second adder. Using reconfiguration, the module can perform one 16×16 convolution or four 8×8 convolutions simultaneously. The image signal and the configuration signal are input to the first register group. The input of 4×4 convolution module 1 is rows 1-4 of the image signal, and the input of 4×4 convolution module 5 is rows 5-8.
When the convolution template is 16×16, the input of 4×4 convolution module 3 is the output of module 2, the input of module 7 is the output of module 6, the input of module 11 is the output of module 10, and the input of module 15 is the output of module 14. The input of module 9 is rows 9-12 of the image signal, and the input of module 13 is rows 13-16. The output of the reconfigurable separable convolution module is the result of the second adder.
When the convolution template is 8×8, the inputs of 4×4 convolution modules 3, 7, 11, and 15 are rows 1-4 of the image signal, the input of module 9 is rows 1-4, and the input of module 13 is rows 5-8. The outputs of the reconfigurable separable convolution module are the results of the four first adders; one reconfigurable separable convolution module completes four 8×8 convolutions simultaneously.
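Seen abstractly, the reconfiguration only re-routes the sixteen 4×4 tile responses: in one mode the second adder sums all sixteen into a single 16×16 result, in the other the four first adders each sum four tiles into four independent 8×8 results. A sketch of that routing (the mode names are ours):

```python
def combine_tiles(tile_outputs, mode):
    """Route 16 tile responses: in '16x16' mode the second adder sums
    all of them into one result; in '8x8' mode the four first adders
    each sum four tiles, yielding four independent 8x8 results."""
    assert len(tile_outputs) == 16
    if mode == "16x16":
        return [sum(tile_outputs)]
    return [sum(tile_outputs[i:i + 4]) for i in range(0, 16, 4)]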
Referring to Fig. 5, a reconfigurable convolutional neural network processor of the invention comprises a bus interface, a pre-processing unit, a reconfigurable hardware controller, an SRAM, an SRAM control unit, an input buffer, an output buffer, a memory, a data storage controller, and several convolutional network arithmetic units. The bus interface connects the pre-processing unit, the data storage controller, the reconfigurable hardware controller, and the input and output buffers. The memory connects to the data storage controller. The input buffer connects to the reconfigurable hardware controller and the SRAM control unit. Each convolutional network arithmetic unit connects to the input buffer module and the output buffer module.
The input of the pre-processing unit is an image or video signal; it performs pre-processing such as white balance and noise filtering. The input and output buffers cache the inputs and outputs of the convolutional network arithmetic units, respectively. The reconfigurable hardware controller configures the convolutional network arithmetic units and controls their computation; during or at the end of computation it issues interrupt requests to interact with the external system. The SRAM control unit controls the transfer of the convolutional network weight parameters.
A convolutional neural network for removing raindrops and dust from images, containing 512 convolutional network arithmetic units, was implemented. To reduce resources and increase processing speed, the invention uses two methods in the specific implementation: (1) randomly reducing the number of convolutions, i.e., reducing the number of convolutional network arithmetic units without degrading the results, which lowers hardware resource consumption and raises processing speed; (2) blockwise one-dimensional convolution, i.e., splitting the 16×16 and 8×8 convolution templates into sixteen and four 4×4 templates respectively, and computing each 4×4 convolution one-dimensionally.
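The block decomposition in method (2) is exact: an N×N template response equals the sum of its tile responses, because the dot product splits over disjoint tiles. A sketch of that accumulation (the function name is illustrative):

```python
def conv_tiled(patch, kernel, tile=4):
    """Compute one N x N template response as the accumulated partial
    sums of (N/tile)^2 tile x tile sub-templates: sixteen tiles for a
    16x16 template, four tiles for an 8x8 template."""
    n = len(kernel)
    acc = 0
    for by in range(0, n, tile):        # tile row offset
        for bx in range(0, n, tile):    # tile column offset
            acc += sum(patch[by + i][bx + j] * kernel[by + i][bx + j]
                       for i in range(tile) for j in range(tile))
    return acc
```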
Referring to Fig. 5, the reconfigurable 16×16 convolution unit comprises sixteen 4×4 reconfigurable one-dimensional convolution modules (1, 2, 3, ..., 16), a row store module, and registers. The input of the row store module is the image or video signal; the input of the register group is the output of the row store module; and the inputs of the 4×4 reconfigurable one-dimensional convolution modules are the outputs of the register group. The row store module holds image rows; the registers hold the serially received image data from the row store and feed it to the 4×4 reconfigurable one-dimensional convolution modules.
The reconfigurable 8×8 convolution unit comprises four 4×4 reconfigurable one-dimensional convolution modules (1, 2, 3, 4), a row store module, and registers. The input of the row store module is the output of the multiply-accumulators; the input of the register group is the output of the row store module; and the inputs of the 4×4 reconfigurable one-dimensional convolution modules are the outputs of the register group.
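The row store plus register group behaves like a line buffer feeding a sliding window: pixels arrive serially row by row, and once enough rows are buffered, every new horizontal position completes a fresh window for the convolution modules. A sketch of that streaming behaviour (the generator formulation is ours, not the patent's):

```python
from collections import deque

def sliding_windows(rows, k=4):
    """Model of the row store plus register group: rows arrive serially;
    the buffer retains the most recent k rows, so each horizontal
    position yields a complete k x k window for the convolution modules."""
    buf = deque(maxlen=k)
    for row in rows:
        buf.append(row)
        if len(buf) == k:
            for x in range(len(row) - k + 1):
                yield [list(r[x:x + k]) for r in buf]
```

This is why only k image rows need to be stored on chip at any time, rather than the whole frame.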

Claims (6)

1. A convolutional network arithmetic unit, characterized by: comprising two reconfigurable separable convolution modules, a nonlinear activation function unit, and a multiply-accumulator unit;
the output of the first reconfigurable separable convolution module being the input of the nonlinear activation function unit, the output of the nonlinear activation function unit being the input of the multiply-accumulator unit, and the output of the multiply-accumulator unit being the input of the second reconfigurable separable convolution module;
the image signal and the network configuration parameter signal being input to the first reconfigurable separable convolution module; the first reconfigurable separable convolution module performing a 16×16 convolution; the nonlinear activation function unit computing the activation function of the convolutional neural network; the multiply-accumulator unit computing the connection layer of the convolutional neural network; and the second reconfigurable separable convolution module performing four 8×8 convolutions simultaneously;
the multiply-accumulator unit comprising several multiply-accumulators and several registers, each multiply-accumulator computing the sum of products of the previous convolutional layer's output values and the weight parameters, and the registers feeding the previous layer's results into the multiply-accumulators;
each reconfigurable separable convolution module comprising sixteen 4×4 reconfigurable one-dimensional convolution modules and a first register group, the first register group feeding the image signal or the previous stage's output, together with the convolutional network parameters, into the one-dimensional convolution modules, and the reconfigurable separable convolution module performing one 16×16 convolution or four 8×8 convolutions simultaneously;
each 4×4 reconfigurable one-dimensional convolution module comprising four first selectors, four first 2-input multipliers, a first 4-input adder, four second 2-input multipliers, and a second 4-input adder; the output of each first selector connecting to one input of the corresponding first 2-input multiplier, the other input of which is a neural network weight; the outputs of the four first 2-input multipliers connecting to the inputs of the first 4-input adder; the inputs of the four second 2-input multipliers being the output of the first 4-input adder and neural network weights; and the inputs of the second 4-input adder being the outputs of the four second 2-input multipliers.
2. The convolutional network arithmetic unit according to claim 1, characterized in that: the nonlinear activation function unit comprises a QD generator and an arithmetic unit group; the input of the QD generator is the output of the reconfigurable separable convolution module, and the input of the arithmetic unit group is the output of the QD generator; the QD generator produces the parameters required by the activation function; and the arithmetic unit group computes the activation function's final value;
the QD generator comprises a first divider, the input signal being input to the first divider, which outputs a quotient Q and a remainder D; the arithmetic unit group comprises a shift register, two first adders, and a second divider; the shift register's output feeds the two first adders; the outputs of the two first adders are the inputs of the second divider; and the shift register, first adders, and second divider are connected in sequence.
3. A reconfigurable convolutional neural network processor, characterized by: comprising a bus interface, a pre-processing unit, a reconfigurable hardware controller, an SRAM, an SRAM control unit, an input buffer module, an output buffer module, a memory, a data storage controller, and several convolutional network arithmetic units according to either of claims 1 and 2; the bus interface connecting the pre-processing unit, the data storage controller, the reconfigurable hardware controller, and the input and output buffers; the memory connecting to the data storage controller; the input buffer connecting to the reconfigurable hardware controller and the SRAM control unit; and the convolutional network arithmetic units connecting to the input buffer module and the output buffer module;
the input of the pre-processing unit being an image or video signal, with pre-processing such as white balance and noise filtering performed;
the input buffer module and output buffer module caching the inputs and outputs of the convolutional network arithmetic units, respectively;
the reconfigurable hardware controller configuring the convolutional network arithmetic units and controlling their computation, and issuing interrupt requests during or at the end of computation to interact with the external system;
the SRAM control unit controlling the transfer of the convolutional network weight parameters.
4. The reconfigurable convolutional neural network processor according to claim 3, characterized by: comprising 512 convolutional network arithmetic units and implementing image denoising based on convolutional neural networks.
5. The reconfigurable convolutional neural network processor according to claim 3, characterized in that: the processor implements a 3-layer convolutional neural network for removing raindrops and dust adhering to images or video; the first layer of the network consists of 512 16×16 convolutions, the second layer is the neural network connection layer, and the third layer consists of 512 8×8 convolutions.
6. A method by which the reconfigurable convolutional neural network processor of claim 3 implements image denoising, characterized by comprising:
randomly reducing the number of convolutions during image denoising, thereby reducing hardware resource consumption and increasing processing speed;
or, during image denoising, splitting the 16×16 and 8×8 convolution units into sixteen and four 4×4 convolution templates respectively, and applying one-dimensional convolution to each 4×4 convolution.
CN201610003960.2A 2016-01-05 2016-01-05 Convolutional network arithmetic unit, reconfigurable convolutional neural network processor, and method for implementing image denoising Active CN105681628B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610003960.2A CN105681628B (en) 2016-01-05 2016-01-05 Convolutional network arithmetic unit, reconfigurable convolutional neural network processor, and method for implementing image denoising

Publications (2)

Publication Number Publication Date
CN105681628A CN105681628A (en) 2016-06-15
CN105681628B true CN105681628B (en) 2018-12-07

Family

ID=56298840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610003960.2A Active CN105681628B (en) 2016-01-05 2016-01-05 Convolutional network arithmetic unit, reconfigurable convolutional neural network processor, and method for implementing image denoising

Country Status (1)

Country Link
CN (1) CN105681628B (en)

Families Citing this family (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203617B (en) * 2016-06-27 2018-08-21 哈尔滨工业大学深圳研究生院 A kind of acceleration processing unit and array structure based on convolutional neural networks
CN106203621B (en) * 2016-07-11 2019-04-30 北京深鉴智能科技有限公司 The processor calculated for convolutional neural networks
WO2018018470A1 (en) * 2016-07-27 2018-02-01 华为技术有限公司 Method, apparatus and device for eliminating image noise and convolutional neural network
CN106250103A (en) * 2016-08-04 2016-12-21 东南大学 A kind of convolutional neural networks cyclic convolution calculates the system of data reusing
US10832123B2 (en) 2016-08-12 2020-11-10 Xilinx Technology Beijing Limited Compression of deep neural networks with proper use of mask
US10936941B2 (en) 2016-08-12 2021-03-02 Xilinx, Inc. Efficient data access control device for neural network hardware acceleration system
US10802992B2 (en) 2016-08-12 2020-10-13 Xilinx Technology Beijing Limited Combining CPU and special accelerator for implementing an artificial neural network
US10810484B2 (en) 2016-08-12 2020-10-20 Xilinx, Inc. Hardware accelerator for compressed GRU on FPGA
US10762426B2 (en) 2016-08-12 2020-09-01 Beijing Deephi Intelligent Technology Co., Ltd. Multi-iteration compression for deep neural networks
US10621486B2 (en) 2016-08-12 2020-04-14 Beijing Deephi Intelligent Technology Co., Ltd. Method for optimizing an artificial neural network (ANN)
US10698657B2 (en) 2016-08-12 2020-06-30 Xilinx, Inc. Hardware accelerator for compressed RNN on FPGA
US10643124B2 (en) 2016-08-12 2020-05-05 Beijing Deephi Intelligent Technology Co., Ltd. Method and device for quantizing complex artificial neural network
CN107229967B (en) * 2016-08-22 2021-06-15 赛灵思公司 Hardware accelerator and method for realizing sparse GRU neural network based on FPGA
US10984308B2 (en) 2016-08-12 2021-04-20 Xilinx Technology Beijing Limited Compression method for deep neural networks with load balance
CN106331433B (en) * 2016-08-25 2020-04-24 上海交通大学 Video denoising method based on deep recurrent neural network
KR20180034853A (en) 2016-09-28 에스케이하이닉스 주식회사 Apparatus and method for test operation of a convolutional neural network
IE87469B1 (en) * 2016-10-06 2024-01-03 Google Llc Image processing neural networks with separable convolutional layers
JP2018067154A (en) * 2016-10-19 2018-04-26 ソニーセミコンダクタソリューションズ株式会社 Arithmetic processing circuit and recognition system
CN106529669A (en) 2016-11-10 2017-03-22 北京百度网讯科技有限公司 Method and apparatus for processing data sequences
US10733505B2 (en) 2016-11-10 2020-08-04 Google Llc Performing kernel striding in hardware
CN108073977A (en) * 2016-11-14 2018-05-25 耐能股份有限公司 Convolution operation device and convolution operation method
CN108073550A (en) * 2016-11-14 2018-05-25 耐能股份有限公司 Buffer device, and convolution operation apparatus and method
US10438115B2 (en) * 2016-12-01 2019-10-08 Via Alliance Semiconductor Co., Ltd. Neural network unit with memory layout to perform efficient 3-dimensional convolutions
US10417560B2 (en) * 2016-12-01 2019-09-17 Via Alliance Semiconductor Co., Ltd. Neural network unit that performs efficient 3-dimensional convolutions
CN108241484B (en) * 2016-12-26 2021-10-15 上海寒武纪信息科技有限公司 Neural network computing device and method based on high-bandwidth memory
US10140574B2 (en) * 2016-12-31 2018-11-27 Via Alliance Semiconductor Co., Ltd Neural network unit with segmentable array width rotator and re-shapeable weight memory to match segment width to provide common weights to multiple rotator segments
CN106909970B (en) * 2017-01-12 2020-04-21 南京风兴科技有限公司 Binary-weight convolutional neural network hardware accelerator computing device based on approximate computation
CN106843809B (en) * 2017-01-25 2019-04-30 北京大学 Convolution operation method based on a NOR FLASH array
CN106940815B (en) * 2017-02-13 2020-07-28 西安交通大学 Programmable convolutional neural network coprocessor IP core
CN108629406B (en) * 2017-03-24 2020-12-18 展讯通信(上海)有限公司 Arithmetic device for convolutional neural network
CN107248144B (en) * 2017-04-27 2019-12-10 东南大学 Image denoising method based on compression type convolutional neural network
CN108804973B (en) * 2017-04-27 2021-11-09 深圳鲲云信息科技有限公司 Hardware architecture of target detection algorithm based on deep learning and execution method thereof
CN108804974B (en) * 2017-04-27 2021-07-02 深圳鲲云信息科技有限公司 Method and system for estimating and configuring resources of hardware architecture of target detection algorithm
CN107169563B (en) 2017-05-08 2018-11-30 中国科学院计算技术研究所 Processing system and method for binary-weight convolutional networks
CN107256424B (en) * 2017-05-08 2020-03-31 中国科学院计算技术研究所 Ternary-weight convolutional network processing system and method
CN109117945B (en) * 2017-06-22 2021-01-26 上海寒武纪信息科技有限公司 Processor and processing method thereof, chip packaging structure and electronic device
CN107480782B (en) * 2017-08-14 2020-11-10 电子科技大学 On-chip learning neural network processor
CN107609641B (en) * 2017-08-30 2020-07-03 清华大学 Sparse neural network architecture and implementation method thereof
CN107844826B (en) * 2017-10-30 2020-07-31 中国科学院计算技术研究所 Neural network processing unit and processing system comprising same
CN107862374B (en) * 2017-10-30 2020-07-31 中国科学院计算技术研究所 Neural network processing system and processing method based on assembly line
CN108304923B (en) * 2017-12-06 2022-01-18 腾讯科技(深圳)有限公司 Convolution operation processing method and related product
CN107909148B (en) * 2017-12-12 2020-10-20 南京地平线机器人技术有限公司 Apparatus for performing convolution operations in a convolutional neural network
CN108038815B (en) * 2017-12-20 2019-12-17 深圳云天励飞技术有限公司 Integrated circuit with a plurality of transistors
CN108256628B (en) * 2018-01-15 2020-05-22 合肥工业大学 Convolutional neural network hardware accelerator based on multicast network-on-chip and working method thereof
CN108154194B (en) * 2018-01-18 2021-04-30 北京工业大学 Method for extracting high-dimensional features by using tensor-based convolutional network
CN110147872B (en) * 2018-05-18 2020-07-17 中科寒武纪科技股份有限公司 Code storage device and method, processor and training method
CN108846420B (en) * 2018-05-28 2021-04-30 北京陌上花科技有限公司 Network structure and client
CN108764336A (en) * 2018-05-28 2018-11-06 北京陌上花科技有限公司 Deep learning method and device for image recognition, client, and server
CN109343826B (en) * 2018-08-14 2021-07-13 西安交通大学 Reconfigurable processor operation unit for deep learning
CN110874632A (en) * 2018-08-31 2020-03-10 北京嘉楠捷思信息技术有限公司 Image recognition processing method and device
CN109409512B (en) * 2018-09-27 2021-02-19 西安交通大学 Flexibly configurable neural network computing unit, computing array and construction method thereof
TWI766193B (en) * 2018-12-06 2022-06-01 神盾股份有限公司 Convolutional neural network processor and data processing method thereof
CN109711533B (en) * 2018-12-20 2023-04-28 西安电子科技大学 Convolutional neural network acceleration system based on FPGA
CN109784483B (en) * 2019-01-24 2022-09-09 电子科技大学 FD-SOI (field-programmable gate array-silicon on insulator) process-based binary convolution neural network in-memory computing accelerator
CN111626399B (en) * 2019-02-27 2023-07-28 中国科学院半导体研究所 Convolutional neural network computing device and data computing method
CN110070178B (en) * 2019-04-25 2021-05-14 北京交通大学 Convolutional neural network computing device and method
CN111008697B (en) * 2019-11-06 2022-08-09 北京中科胜芯科技有限公司 Convolutional neural network accelerator implementation architecture
TWI734598B (en) * 2020-08-26 2021-07-21 元智大學 Method for removing rain streaks from images
RU2764395C1 (en) 2020-11-23 2022-01-17 Самсунг Электроникс Ко., Лтд. Method and apparatus for joint debayering and image noise elimination using a neural network
CN113591025A (en) * 2021-08-03 2021-11-02 深圳思谋信息科技有限公司 Feature map processing method and device, convolutional neural network accelerator and medium
CN115841416B (en) * 2022-11-29 2024-03-19 白盒子(上海)微电子科技有限公司 Reconfigurable intelligent image processor architecture for the autonomous driving field

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4644488A (en) * 1983-10-12 1987-02-17 California Institute Of Technology Pipeline active filter utilizing a booth type multiplier
US4937774A (en) * 1988-11-03 1990-06-26 Harris Corporation East image processing accelerator for real time image processing applications

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US8442927B2 (en) * 2009-07-30 2013-05-14 Nec Laboratories America, Inc. Dynamically configurable, multi-ported co-processor for convolutional neural networks


Non-Patent Citations (4)

Title
A Deep Convolutional Neural Network Based on Nested Residue Number System; Hiroki Nakahara et al.; Field Programmable Logic and Applications (FPL); 2015-09-04; full text *
A Massively Parallel Coprocessor for Convolutional Neural Networks; Murugan Sankaradas et al.; 2009 20th IEEE International Conference on Application-specific Systems, Architectures and Processors; 2009-07-09; full text *
A reconfigurable interconnected filter for face recognition based on convolution neural network; Shefa A. Dawwd; Design and Test Workshop (IDT); 2009-11-17; full text *
Design of an FPGA-based parallel acceleration scheme for convolutional neural networks; Fang Rui et al.; Computer Engineering and Applications; 2015-04-15 (Issue 8); Chapters 2-4, Figures 1-4 *

Also Published As

Publication number Publication date
CN105681628A (en) 2016-06-15

Similar Documents

Publication Publication Date Title
CN105681628B (en) Convolutional network arithmetic unit, reconfigurable convolutional neural network processor, and method for implementing image denoising
CN111684473B (en) Improving performance of neural network arrays
US10943167B1 (en) Restructuring a multi-dimensional array
CN106529670B (en) Neural network processor based on weight compression, design method, and chip
CN105930902B (en) Neural network processing method and system
CN106951926A (en) Deep learning system method and device with a hybrid architecture
CN110533164B (en) Winograd convolution splitting method for convolution neural network accelerator
CN107463990A (en) FPGA parallel acceleration method for convolutional neural networks
CN108733348B (en) Fused vector multiplier and method for performing operation using the same
CN106203617A (en) Acceleration processing unit and array structure based on convolutional neural networks
CN111626403B (en) Convolutional neural network accelerator based on CPU-FPGA memory sharing
CN113033794B (en) Light weight neural network hardware accelerator based on deep separable convolution
CN110163362A (en) Computing device and method
CN109284824A (en) Device for accelerating convolution and pooling operations based on reconfigurable technology
CN110276447A (en) Computing device and method
Duan et al. Energy-efficient architecture for FPGA-based deep convolutional neural networks with binary weights
Xiao et al. FPGA-based scalable and highly concurrent convolutional neural network acceleration
CN109472734B (en) Target detection network based on FPGA and implementation method thereof
CN102970545A (en) Static image compression method based on two-dimensional discrete wavelet transform algorithm
CN105955896A (en) Reconfigurable DBF algorithm hardware accelerator and control method
CN112988229B (en) Convolutional neural network resource optimization configuration method based on heterogeneous computation
Yin et al. FPGA-based high-performance CNN accelerator architecture with high DSP utilization and efficient scheduling mode
CN111886605B (en) Processing for multiple input data sets
Jiang et al. Hardware implementation of depthwise separable convolution neural network
CN114519425A (en) Convolution neural network acceleration system with expandable scale

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant