CN109993283A - Method for accelerating a deep convolutional generative adversarial network based on a photoelectric computing array - Google Patents

Method for accelerating a deep convolutional generative adversarial network based on a photoelectric computing array

Info

Publication number
CN109993283A
Authority
CN
China
Prior art keywords
matrix
computing array
input
result
array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910291718.3A
Other languages
Chinese (zh)
Other versions
CN109993283B (en)
Inventor
王瑶
娄胜
王宇宣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Jixiang Sensing And Imaging Technology Research Institute Co Ltd
Original Assignee
Nanjing Jixiang Sensing And Imaging Technology Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Jixiang Sensing And Imaging Technology Research Institute Co Ltd
Priority to CN201910291718.3A
Publication of CN109993283A
Application granted
Publication of CN109993283B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a method for accelerating a deep convolutional generative adversarial network based on a photoelectric computing array. The photoelectric computing array comprises a light-emitting array formed by a periodic arrangement of multiple light-emitting units and a computing array formed by a periodic arrangement of multiple computing units. The acceleration method comprises the following steps: the light-emitting array inputs the weights of the deep convolutional generative adversarial network into the computing array by optical input, as the optical input quantity, and the weights are stored in the computing array; image data or noise data are quantized into single-bit data and then input into the computing array as the electrical input quantity; the computing array is used to compute the convolutional-layer operations in the generator network and in the discriminator network of the deep convolutional generative adversarial network, respectively; digital signals are output once the computation completes, and a shift-and-accumulate operation is then applied to the digital signals to obtain the final result. The acceleration method of the invention substantially increases computing speed while achieving high energy efficiency.

Description

Method for accelerating a deep convolutional generative adversarial network based on a photoelectric computing array
Technical field
The present invention relates to a method for accelerating a deep convolutional generative adversarial network using a photoelectric computing array, and belongs to the fields of computer vision and deep learning.
Background art
Most conventional computers adopt the von Neumann architecture. Because the storage unit and the arithmetic unit are separate in this architecture, data transfer consumes a great deal of energy and limits computing speed.
The deep convolutional generative adversarial network (DCGAN) is the most widely used derivative model of the GAN and performs well in applications such as high-quality image generation, style transfer, and image inpainting. DCGAN constrains the GAN with a CNN architecture so that the GAN can be trained stably.
The conventional way of accelerating a deep residual network with digital circuits is to unroll the convolution into matrix-vector multiplications and to carry out the matrix multiplication with corresponding multiply-accumulate units. However, a single multiplier requires considerable resources (area) and brings high power consumption, and the memory wall further limits any increase in computing speed.
Summary of the invention
The present invention proposes a method for accelerating a deep convolutional generative adversarial network based on a photoelectric computing array. The method substantially increases computing speed while achieving high energy efficiency.
The technical solution adopted by the invention is as follows:
A method for accelerating a deep convolutional generative adversarial network based on a photoelectric computing array, wherein the photoelectric computing array comprises a light-emitting array formed by a periodic arrangement of multiple light-emitting units and a computing array formed by a periodic arrangement of multiple computing units. The acceleration method comprises the following steps: the light-emitting array inputs the weights of the deep convolutional generative adversarial network into the computing array by optical input, as the optical input quantity, and the weights are stored in the computing array; image data or noise data are quantized into single-bit data and then input into the computing array as the electrical input quantity; the computing array is used to compute the convolutional-layer operations in the generator network and in the discriminator network, respectively; digital signals are output once the computation completes, and a shift-and-accumulate operation is then applied to the digital signals to obtain the final result.
Further, the computing unit comprises a carrier control region, a coupling region, and a photo-generated carrier collection and readout region. The carrier control region is used to control and modulate the carriers in the photo-generated carrier collection and readout region. The collection region within the photo-generated carrier collection and readout region absorbs the photons emitted by the light-emitting unit and collects the photo-generated carriers they produce. The carrier control region, or the readout region within the photo-generated carrier collection and readout region, is connected to an electrical signal, and the readout region outputs the carriers after they have been acted on by the photo-generated carriers and the electrical signal. The coupling region connects the collection region and the readout region.
Further, the computing array is used to compute the fractionally-strided convolutions in the convolutional layers of the generator network. Each convolutional layer is computed as follows:
① For the m convolution kernels of the current convolutional layer, each kernel is unrolled column by column and spliced into a column vector, and the m column vectors corresponding to the m kernels are then spliced into a matrix. For the n-channel input image feature maps, the n matrices are spliced top to bottom into a new large matrix. A computing array of the same size as the large matrix is used, and the optical input quantities of the computing array are the corresponding values in the large matrix.
② Batch normalization is applied to the n input image feature maps.
③ The input of the current convolutional layer is the image feature maps of n channels. Each feature map is processed as follows, yielding n matrices: several all-zero rows or columns are inserted between every two rows or columns, and padding is then added around the border, giving a new feature map; a region of the same size as the convolution kernel is then selected and moved p times over the new feature map at the specified stride; at each move the corresponding values in the feature map are taken out, unrolled in column order, and spliced into a row vector; once the moves are complete, p row vectors have been obtained, and they are spliced top to bottom, in order, into a matrix.
④ The n matrices are spliced left to right to obtain the final electrical input matrix. The rows of the electrical input matrix are fed into the computing array one after another, from top to bottom, with each element of the row vector corresponding to one row of the computing array.
⑤ Each row vector is fed into the computing array bit by bit, that is, one bit at a time. Once the computing array has finished, the result of each column is converted into a digital signal by an analog-to-digital converter; the digital signals are shifted according to their bit positions and then accumulated to obtain the result, which is a vector of length m. This is the result of feeding one row vector of the electrical input matrix through the computing array, and it corresponds to the m convolution kernels each performing one convolution on the same region of the n image feature maps, followed by summation.
⑥ Following steps ④ and ⑤, the p row vectors of the electrical input matrix are processed in order, giving p results in vector form, which are spliced top to bottom into a matrix. Each column of this matrix is arranged into a feature map according to the order in which the values were taken from the zero-padded image feature map; this is the convolution result of one kernel, so m feature maps are obtained.
⑦ A bias is added to the m feature maps and an activation is applied, which gives the final result of the current convolutional layer.
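The matrix form of steps ①, ③, ④ and ⑥ can be sketched in NumPy as follows. This is an illustrative sketch only, not the patented hardware flow: the function names are hypothetical, square feature maps and kernels are assumed, the kernel is applied without flipping, and batch normalization, bit-serial input, bias and activation (steps ②, ⑤ and ⑦) are left out.

```python
import numpy as np

def insert_zeros(x, zeros_between, pad):
    """Insert zero rows/columns between neighbours, then zero-pad the border."""
    h, w = x.shape
    step = zeros_between + 1
    out = np.zeros(((h - 1) * step + 1, (w - 1) * step + 1), dtype=x.dtype)
    out[::step, ::step] = x
    return np.pad(out, pad)

def im2col(x, k, stride):
    """Unroll every k x k window in column order; one window per row (p rows)."""
    h, w = x.shape
    rows = []
    for i in range(0, h - k + 1, stride):
        for j in range(0, w - k + 1, stride):
            rows.append(x[i:i + k, j:j + k].flatten(order="F"))
    return np.stack(rows)

def weight_matrix(kernels):
    """kernels: (m, n, k, k) -> (n*k*k, m), channel blocks stacked top to bottom."""
    m, n, _, _ = kernels.shape
    cols = [
        np.concatenate([kernels[a, c].flatten(order="F") for c in range(n)])
        for a in range(m)
    ]
    return np.stack(cols, axis=1)

def fractionally_strided_conv(features, kernels, zeros_between, pad, stride):
    """features: (n, h, h), kernels: (m, n, k, k) -> (m, side, side)."""
    k = kernels.shape[-1]
    expanded = [insert_zeros(f, zeros_between, pad) for f in features]
    # Per-channel matrices spliced left to right: shape (p, n*k*k).
    electrical_input = np.concatenate([im2col(e, k, stride) for e in expanded], axis=1)
    result = electrical_input @ weight_matrix(kernels)  # shape (p, m)
    side = (expanded[0].shape[0] - k) // stride + 1
    # Each column of the result becomes one output feature map.
    return result.T.reshape(kernels.shape[0], side, side)
```

Each column of `weight_matrix(kernels)` plays the role of one column of the computing array, so a single pass of one row of `electrical_input` yields the m sums described in step ⑤.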
Further, the computing array is used to compute the convolutions in the convolutional layers of the discriminator network. Each convolutional layer is computed as follows:
① For the m convolution kernels of the current convolutional layer, each kernel is unrolled column by column and spliced into a column vector, and the m column vectors corresponding to the m kernels are then spliced into a matrix. For the n-channel input image feature maps, the n matrices are spliced top to bottom into a new large matrix. A computing array of the same size as the large matrix is used, and the optical input quantities of the computing array are the corresponding values in the large matrix.
② Batch normalization is applied to the n input image feature maps.
③ The input of the current convolutional layer is the image feature maps of n channels. Each feature map is processed as follows, yielding n matrices: a region of the same size as the convolution kernel is selected and moved p times over the feature map at the specified stride; at each move the corresponding values in the feature map are taken out, unrolled in column order, and spliced into a row vector; once the moves are complete, p row vectors have been obtained, and they are spliced top to bottom, in order, into a matrix.
④ The n matrices are spliced left to right to obtain the final electrical input matrix. The rows of the electrical input matrix are fed into the computing array one after another, from top to bottom, with each element of the row vector corresponding to one row of the computing array.
⑤ Each row vector is fed into the computing array bit by bit, that is, one bit at a time. Once the computing array has finished, the result of each column is converted into a digital signal by an analog-to-digital converter; the digital signals are shifted according to their bit positions and then accumulated to obtain the result, which is a vector of length m. This is the result of feeding one row vector of the electrical input matrix through the computing array, and it corresponds to the m convolution kernels each performing one convolution on the same region of the n image feature maps, followed by summation.
⑥ Following steps ④ and ⑤, the p row vectors of the electrical input matrix are processed in order, giving p results in vector form, which are spliced top to bottom into a matrix. Each column of this matrix is arranged into a feature map according to the order in which the values were taken from the zero-padded image feature map; this is the convolution result of one kernel, so m feature maps are obtained.
⑦ A bias is added to the m feature maps and an activation is applied, which gives the final result of the current convolutional layer.
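For the discriminator, the row-by-row feeding of steps ④ to ⑦ might look like the following NumPy sketch. The function names are hypothetical, the leaky ReLU slope of 0.2 is an assumption (the text does not give a value), and batch normalization and the bit-serial input of step ⑤ are omitted.

```python
import numpy as np

def unfold(features, k, stride):
    """features: (n, h, w) -> electrical input matrix of shape (p, n*k*k)."""
    n, h, w = features.shape
    rows = []
    for i in range(0, h - k + 1, stride):
        for j in range(0, w - k + 1, stride):
            patch = features[:, i:i + k, j:j + k]
            # Column-order unrolling per channel, channels placed side by side.
            rows.append(np.concatenate([patch[c].flatten(order="F") for c in range(n)]))
    return np.stack(rows)

def leaky_relu(x, slope=0.2):
    # The slope value is an assumption made for this sketch.
    return np.where(x > 0, x, slope * x)

def conv_layer(features, kernels, bias, stride):
    """features: (n, h, w), kernels: (m, n, k, k), bias: (m,) -> (m, out_h, out_w)."""
    m, n, k, _ = kernels.shape
    weights = np.stack(
        [np.concatenate([kernels[a, c].flatten(order="F") for c in range(n)])
         for a in range(m)],
        axis=1,
    )
    electrical_input = unfold(features, k, stride)
    # One row vector enters the array at a time; each pass yields m sums.
    results = np.stack([row @ weights for row in electrical_input])
    out_h = (features.shape[1] - k) // stride + 1
    out_w = (features.shape[2] - k) // stride + 1
    maps = results.T.reshape(m, out_h, out_w)
    return leaky_relu(maps + bias[:, None, None])
```

Looping over the rows rather than issuing one large matrix product mirrors the top-to-bottom feeding order of step ④; numerically the two are equivalent.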
The invention proposes a method for accelerating a deep convolutional generative adversarial network based on a photoelectric computing array. The photoelectric computing device used by the method provides a high-precision integrated storage-and-compute function: a single device can store the optical signal from the optical input and retain it for a long time after the light is cut off, and a single device can carry out a multiplication. This makes it well suited to accelerating algorithms that require large numbers of matrix-vector multiplications, typified by neural network algorithms and CT algorithms. By virtue of the area and power advantages of the computing array, and compared with acceleration by conventional digital circuits, the invention does not need to access off-chip storage repeatedly and can substantially reduce product volume and energy consumption.
Description of the drawings
Fig. 1 is a block diagram of the multifunctional regions of the computing unit.
Fig. 2 is a schematic structural diagram of the photoelectric computing array.
Fig. 3 shows the computing unit structure of Embodiment 1: (a) cross-sectional view and (b) perspective view.
Fig. 4 shows the computing unit structure of Embodiment 2: (a) cross-sectional view and (b) perspective view.
Fig. 5 shows the computing unit of Embodiment 3: (a) structural schematic and (b) schematic of the multifunctional regions.
Fig. 6 is a block diagram of the hardware accelerator for the deep convolutional generative adversarial network.
Fig. 7 is a schematic structural diagram of the generator network of Embodiment 4.
Fig. 8 is a schematic diagram of unrolling a convolution into a matrix multiplication: (a) n convolution kernels convolve one feature map and output results in n channels; (b) the region of the feature map corresponding to the kernel is unrolled in the column direction as the input, the n kernels are unrolled in the column direction and spliced into a kernel matrix, and the product is a matrix with n columns, each column corresponding to the result of one channel in (a).
Fig. 9 is a schematic structural diagram of the discriminator network of Embodiment 5.
In the figures: 1 - light-emitting array, 2 - computing array.
Detailed description of the embodiments
The object of the invention is to build a DCGAN accelerator model (including the generator and the discriminator) using a photoelectric computing array, so as to obtain a smaller area and higher energy efficiency.
The DCGAN generator network starts from noise that follows a normal distribution and, through layer-by-layer training, finally obtains data that imitate the true distribution. Its structure comprises several convolutional layers (which mainly perform fractionally-strided convolution). Batch normalization is used in all layers except the output layer; the output layer uses the tanh activation function, and the other layers use ReLU as the activation function.
The DCGAN discriminator network takes an image as input and, through layer-by-layer training, finally obtains the probability that the input is a real image. Its structure comprises several convolutional layers (which mainly perform convolution). Batch normalization is used in all layers except the input layer; the output layer is a softmax classifier, and the other layers use leaky ReLU as the activation function.
A photoelectric computing array can compute large-scale matrix multiplications at very low cost, and a convolution can be expressed as a matrix multiplication by unrolling it. The present embodiments use the photoelectric computing array to implement the convolutions and fractionally-strided convolutions in the DCGAN generator and discriminator networks. The generator takes its input from noise that follows a given distribution and, after processing by each of its convolutional layers, produces an image, which serves the purpose of generating high-quality images. That image is fed to the discriminator as input and, after processing by each convolutional layer of the discriminator, the probability that the image is a real image is obtained.
As shown in Fig. 1, the computing unit in the photoelectric computing array is a multifunctional-region structure comprising three functional regions: the carrier control region, the coupling region, and the photo-generated carrier collection and readout region. Their functions are as follows:
Carrier control region: controls and modulates the carriers in the photoelectric computing unit and, as the electrical input port of the photoelectric computing unit, receives one of the operands as the electrical input quantity; or it only controls and modulates the carriers in the photoelectric computing unit, and the electrical input quantity is supplied through another region.
Coupling region: connects the collection region and the readout region, so that the photo-generated carriers produced by incident photons act on the carriers in the photoelectric computing unit and an operational relationship is formed.
Photo-generated carrier collection and readout region: the collection region absorbs incident photons and collects the photo-generated carriers they produce and, as the optical input port of the photoelectric computing unit, receives one of the operands as the optical input quantity. The readout region can serve as the electrical input port of the photoelectric computing unit, receiving one of the operands as the electrical input quantity, and also as the output port of the photoelectric computing unit, outputting the carriers acted on by the optical and electrical input quantities as the unit output quantity; or the electrical input quantity is supplied through another region and the readout region serves only as the output port, outputting the carriers acted on by the optical and electrical input quantities as the unit output quantity.
The light emitted by the light-emitting unit provides the photons that are incident on the photo-generated carrier collection and readout region of the computing unit and take part in the operation. The photoelectric computing array comprises a light-emitting array 1 and a computing array 2, with the structure shown in Fig. 2. The light-emitting array 1 is formed by a periodic arrangement of multiple light-emitting units, and the computing array 2 is formed by a periodic arrangement of multiple computing units.
Embodiment 1
As shown in Fig. 3, the computing unit of this embodiment comprises a control gate as the carrier control region, a charge coupling layer as the coupling region, and a P-type substrate as the photo-generated carrier collection and readout region. The P-type substrate is divided into a collection region on the left and a readout region on the right; the readout region on the right contains a shallow trench isolation and an N-type source and N-type drain formed by ion implantation. The shallow trench isolation is located in the middle of the semiconductor substrate, between the collection region and the readout region; it is formed by etching and filling with silicon dioxide and isolates the electrical signals of the collection region from those of the readout region. The N-type source is located in the readout region on the side adjacent to the bottom dielectric layer and is formed by doping via ion implantation. The N-type drain is located in the semiconductor substrate adjacent to the bottom dielectric layer, on the side opposite the N-type source, and is likewise formed by doping via ion implantation. It should be understood that the terms left, right, above, and below used herein denote only relative positions in the views shown, change with the viewing angle, and are not to be understood as limiting the specific structure.
A pulse in the negative voltage range is applied to the substrate of the collection region, or a pulse in the positive voltage range is applied to the control gate, so that a depletion layer for collecting photoelectrons is generated in the collection region substrate; the number of photoelectrons collected is read out through the readout region on the right as the input quantity of the optical input. During readout, a positive voltage is applied to the control gate so that a conductive channel forms between the N-type source and the N-type drain, and a bias pulse voltage is then applied between the N-type source and the N-type drain so that the electrons in the conductive channel are accelerated and form a source-drain current. The carriers that form the current in the channel are acted on jointly by the control gate voltage, the source-drain voltage, and the number of photoelectrons collected in the collection region; as electrons acted on jointly by the optical and electrical input quantities, they are output in the form of a current. The control gate voltage and the source-drain voltage can serve as the electrical input quantities of the device, and the number of photoelectrons is the optical input quantity of the device.
The charge coupling layer of the coupling region connects the collection region and the readout region. Once the depletion region in the collection region substrate begins to collect photoelectrons, the surface potential of the collection region substrate is affected by the number of photoelectrons collected. Through the charge coupling layer, the surface potential of the readout region semiconductor substrate is in turn affected by the surface potential of the collection region semiconductor substrate, which changes the magnitude of the source-drain current in the readout region. The number of photoelectrons collected in the collection region is thus read out by measuring the source-drain current of the readout region.
The control gate of the carrier control region is used to apply a pulse voltage so that a depletion region for exciting photoelectrons is generated in the readout region of the P-type semiconductor substrate; it can also serve as the electrical input and receive one of the operands.
In addition, a bottom dielectric layer for isolation lies between the P-type semiconductor substrate and the charge coupling layer, and a top dielectric layer for isolation lies between the charge coupling layer and the control gate.
Embodiment 2
As shown in Fig. 4, the computing unit of this embodiment comprises a control gate as the carrier control region, a charge coupling layer as the coupling region, and a P-type semiconductor substrate as the photo-generated carrier collection and readout region, the P-type substrate containing an N-type source and drain formed by ion implantation. The P-type semiconductor substrate performs both light sensing and readout. The N-type source is located in the readout region on the side adjacent to the bottom dielectric layer and is formed by doping via ion implantation. The N-type drain is located in the semiconductor substrate adjacent to the bottom dielectric layer, on the side opposite the N-type source, and is likewise formed by doping via ion implantation.
During light sensing, a pulse in the negative voltage range is applied to the P-type semiconductor substrate while a pulse in the positive voltage range is applied to the control gate serving as the carrier control region, so that a depletion layer for collecting photoelectrons is generated in the P-type substrate. Electrons generated in the depletion region are accelerated by the electric field between the control gate and the P-type substrate and, once they have gained sufficient energy, cross the barrier of the bottom dielectric layer between the P-type substrate and the charge coupling layer, enter the charge coupling layer, and are stored there. The amount of charge in the charge coupling layer affects the threshold at which the device turns on, and hence the magnitude of the source-drain current during readout. During readout, a pulse voltage is applied to the control gate so that a conductive channel forms between the N-type source and the N-type drain, and a pulse voltage is then applied between the N-type source and the N-type drain so that the electrons in the conductive channel are accelerated and form a source-drain current. The source-drain current is acted on jointly by the control gate pulse voltage, the source-drain voltage, and the number of electrons stored in the charge coupling layer; as electrons acted on jointly by the optical and electrical input quantities, they are output in the form of a current. The control gate voltage and the source-drain voltage can serve as the electrical input quantities of the device, and the number of photoelectrons stored in the charge coupling layer is the optical input quantity of the device.
The charge coupling layer of the coupling region stores the photoelectrons that enter it and changes the device threshold during readout, which in turn affects the source-drain current of the readout region. The number of photoelectrons that were generated during light sensing and entered the charge coupling layer is thus read out by measuring the source-drain current of the readout region.
The control gate of the carrier control region is used to apply a pulse voltage so that a depletion region for exciting photoelectrons is generated in the readout region of the P-type semiconductor substrate; it can also serve as the electrical input and receive one of the operands.
In addition, a bottom dielectric layer for isolation lies between the P-type semiconductor substrate and the charge coupling layer, and a top dielectric layer for isolation lies between the charge coupling layer and the control gate.
Embodiment 3
As shown in Fig. 5, the computing unit of this embodiment comprises a photodiode and a readout transistor serving as the photo-generated carrier collection and readout region. The photodiode is formed by ion doping and is responsible for light sensing. The N region of the photodiode is connected, through the photoelectron coupling lead that serves as the coupling region, to the control gate of the readout transistor and to the source of the reset transistor. A positive voltage pulse is applied to the drain of the readout transistor as the drive voltage for the readout current. Before exposure, the reset transistor is turned on and its drain voltage is applied to the photodiode, which places the photodiode serving as the collection region under reverse bias and generates a depletion layer. During exposure, the reset transistor is turned off and the photodiode is electrically isolated; photons incident on the depletion region of the photodiode generate photoelectrons, which accumulate in the diode. The potential of the N region of the diode, and of the readout transistor control gate that is electrically connected to the N region through the photoelectron coupling lead, begins to fall, which affects the electron concentration in the readout transistor channel. The readout transistor is responsible for readout: a positive pulse voltage is applied to its drain, and its source is connected to the drain of the addressing transistor. During readout, the addressing transistor is turned on and a current is generated in the readout transistor. The magnitude of this current is jointly affected by the reset transistor drain voltage, the readout transistor drain voltage, and the number of incident photons. The electrons in the readout transistor channel, as electrons acted on jointly by the optical and electrical input quantities, are output in the form of a current. The reset transistor drain voltage and the readout transistor drain voltage can serve as the electrical input quantities of the device, and the number of incident photons is the optical input quantity of the device.
The photoelectron coupling lead of the coupling region connects the photodiode, which serves as the collection region in the photo-generated carrier collection and readout region, to the readout transistor, which serves as the readout region, and applies the potential of the photodiode N region to the control gate of the readout transistor.
The reset transistor serves as the carrier control region. A positive voltage is supplied through its drain to act on the photodiode: when the reset transistor is turned on, the positive voltage acts on the photodiode so that the photodiode generates a depletion region and senses light. The reset transistor can also serve as the electrical input and receive one of the operands.
In addition, the addressing transistor controls the output of the output current, which is the output quantity of the whole computing device, and can be used for row and column addressing when the photoelectric computing units are formed into an array.
Embodiment 4
This embodiment presents a specific implementation for accelerating the DCGAN generator network with a photoelectric computing array. The generator network is accelerated by a photoelectric computing array built from the computing units; the hardware block diagram is shown in Fig. 6.
Input data enter the accelerator through an interface and are quantized by an auxiliary circuit into digital bit data that can be fed into the computing array. Meanwhile, the network weights are loaded into the computing array optically by an LED array. After the operation completes, the results are converted to digital signals by an AD array, and an auxiliary logic circuit then performs shift-and-accumulate operations to obtain the result.
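The bit-serial data flow described above can be illustrated numerically. The following is a minimal NumPy sketch, not a model of the hardware: it assumes unsigned 8-bit inputs, treats the analog array together with the AD conversion as an exact integer matrix product, and the function name `bit_serial_matvec` is hypothetical.

```python
import numpy as np

def bit_serial_matvec(x, W, n_bits=8):
    """Multiply row vector x (unsigned ints) by W, one input bit-plane per pass."""
    x = np.asarray(x, dtype=np.int64)
    W = np.asarray(W, dtype=np.int64)
    acc = np.zeros(W.shape[1], dtype=np.int64)
    for b in range(n_bits):
        bit_plane = (x >> b) & 1      # single-bit electrical input for this pass
        partial = bit_plane @ W       # per-column sums (idealised array + AD output)
        acc += partial << b           # shift by the bit position, then accumulate
    return acc

rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=25)        # one 8-bit input row (illustrative size)
W = rng.integers(0, 16, size=(25, 4))    # weights held in the array (illustrative)
result = bit_serial_matvec(x, W)
```

Because each pass only sees 0/1 inputs, the array never has to multiply by a multi-bit value; the weighting by powers of two is moved into the digital shift-and-accumulate stage.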
Take the first convolutional layer conv1 in Fig. 7 as an example; the output of the preceding layer has size 4*4*1024.
(1) Using basic digital circuits, perform batch normalization on the output of the preceding layer;
(2) In each 4*4 image, insert 3 zeros between every two adjacent rows and columns (both rows and columns are filled), and add a border padding of 3 on every side. The convolution kernels are of size 5*5 (512 in total). This is therefore equivalent to convolving an image of size 19 (4+3*3+2*3=19) with kernels of size 5*5. Following the method of Fig. 8, the 5*5 patches of the input image are unrolled in turn as the window moves.
[Equations not reproduced: input feature map (n-th channel); feature map after zero insertion and padding; convolution kernel (n-th); matrix multiplication obtained by unrolling the convolution according to the rule of Fig. 8.]
The fractionally-strided convolution (deconvolution) of CONV1 can be expressed as above, where W (the convolution kernels, of size (25*1024)*512) is the weight of the photoelectric computing array, and a (the output of the preceding layer, of size 64*(25*1024)) is the input to the computing array.
(3) The computing array takes one row of the electrical input at a time (64 rows in total). Each input vector is fed into the photoelectric computing array bit by bit, i.e., one bit per pass. Assuming each element is 8 bits wide, the vector is input in 8 passes. After each pass completes in the computing array, the result of every column is converted to a digital signal by AD conversion. Basic digital logic circuits then shift each of the 8 outputs according to its bit position and accumulate them to obtain the result.
(4) Following the method of step (3), the 64 row vectors of the electrical input matrix are processed in order, giving 64 result vectors (each with 512 elements). These 64 row vectors are stacked vertically into a matrix. The first column of the matrix is rearranged into a feature map in the order in which the values were taken from the feature map; this is the result of the convolution with the first kernel. The feature map assembled from the second column is the result of the convolution with the second kernel, and so on, giving 512 feature maps (each of size 8*8).
(5) Using basic digital circuits, add the bias to the 512 feature maps obtained and apply the ReLU activation function. The result is the output of this convolutional layer.
Proceeding in this way, the four convolutional layers can be built in turn. The last layer does not use BN; the last layer uses the tanh activation function, and all other layers use ReLU. This completes the construction of the DCGAN generator network.
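Steps (2) to (4) of the conv1 example can be checked with a small NumPy sketch. It is an illustration under stated assumptions rather than the patented implementation: the channel and kernel counts are reduced from 1024 and 512 to 2 and 3 so that it runs quickly, a stride of 2 is assumed because it is the value that yields the 8*8 = 64 window positions stated in the text, kernels are applied without flipping, and the helper names `zero_insert_and_pad` and `unroll_patches` are hypothetical.

```python
import numpy as np

def zero_insert_and_pad(fmap, n_zeros=3, pad=3):
    """Insert n_zeros zero rows/columns between neighbours, then pad the border."""
    h, w = fmap.shape
    step = n_zeros + 1
    span_h, span_w = (h - 1) * step + 1, (w - 1) * step + 1
    out = np.zeros((span_h + 2 * pad, span_w + 2 * pad), dtype=fmap.dtype)
    out[pad:pad + span_h:step, pad:pad + span_w:step] = fmap
    return out

def unroll_patches(fmap, k=5, stride=2):
    """Slide a k x k window and flatten each patch column-wise into one row."""
    h, w = fmap.shape
    rows = [fmap[i:i + k, j:j + k].flatten(order="F")
            for i in range(0, h - k + 1, stride)
            for j in range(0, w - k + 1, stride)]
    return np.stack(rows)

n_channels, n_kernels = 2, 3            # the embodiment uses 1024 and 512
rng = np.random.default_rng(0)
x = rng.standard_normal((n_channels, 4, 4))
kernels = rng.standard_normal((n_kernels, n_channels, 5, 5))

# Electrical input matrix a: per-channel patch matrices joined left to right.
a = np.hstack([unroll_patches(zero_insert_and_pad(x[c])) for c in range(n_channels)])
# Weight matrix W: kernels unrolled column-wise, channel blocks stacked vertically.
W = np.vstack([
    np.stack([kernels[m, c].flatten(order="F") for m in range(n_kernels)], axis=1)
    for c in range(n_channels)
])
y = a @ W                                     # shape (64, n_kernels)
feature_maps = y.T.reshape(n_kernels, 8, 8)   # column m becomes the m-th 8x8 map
```

With the full layer sizes the same construction gives a of size 64*(25*1024) and W of size (25*1024)*512, matching the dimensions quoted for CONV1.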
Embodiment 5
This embodiment presents a specific implementation for accelerating the DCGAN discriminator network with a photoelectric computing array. The discriminator network is accelerated by a photoelectric computing array built from the computing units; the hardware block diagram is shown in Fig. 6.
Take the third convolutional layer conv3 in Fig. 9 as an example; the output of the preceding layer has size 8*8*256.
(1) Using basic digital circuits, perform batch normalization on the output of the preceding layer;
(2) Add a border padding of 2 around the periphery. There are 512 convolution kernels of size 5*5, and the stride is 2.
[Equations not reproduced: input feature map (n-th channel); feature map after border padding; convolution kernel (n-th); matrix multiplication obtained by unrolling the convolution according to the method of Fig. 8.]
The convolution of this layer can be expressed as above, where W (the convolution kernels, of size (25*256)*512) is the weight of the computing array, and a (the output of the preceding layer, of size 16*(25*256)) is the input to the computing array.
(3) The computing array takes one row of the electrical input at a time (16 rows in total). Each input vector is fed into the computing array bit by bit, i.e., one bit per pass. Assuming each element is 8 bits wide, the vector is input in 8 passes. After each pass completes in the computing array, the result of every column is converted to a digital signal by AD conversion. Basic digital logic circuits then shift each of the 8 outputs according to its bit position and accumulate them to obtain the result.
(4) Following the method of step (3), the 16 row vectors of the electrical input matrix are processed in order, giving 16 result vectors (each with 512 elements). These 16 row vectors are stacked vertically into a matrix. The first column of the matrix is rearranged into a feature map in the order in which the values were taken from the feature map; this is the result of the convolution with the first kernel. The feature map assembled from the second column is the result of the convolution with the second kernel, and so on, giving 512 feature maps (each of size 4*4).
(5) Using basic digital circuits, add the bias to the 512 feature maps obtained and apply the LeakyReLU activation function. The result is the output of this convolutional layer.
Proceeding in this way, the four convolutional layers can be built in turn. BN is not applied to the image input layer; the last layer is softmax, and the remaining convolutional layers use LeakyReLU as the activation function. This completes the construction of the DCGAN discriminator network.
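The discriminator layer in steps (2) to (5) follows the same unrolling, with border padding in place of zero insertion. The sketch below is illustrative only: the channel and kernel counts are reduced from 256 and 512 to 2 and 3, the LeakyReLU slope of 0.2 is an assumption (the text does not give a value), kernels are applied without flipping, and the helper names `unroll_strided` and `leaky_relu` are hypothetical.

```python
import numpy as np

def unroll_strided(fmap, k=5, stride=2, pad=2):
    """Zero-pad a square map, then flatten each k x k window column-wise into a row."""
    padded = np.pad(fmap, pad)
    size = (padded.shape[0] - k) // stride + 1
    rows = [padded[i * stride:i * stride + k, j * stride:j * stride + k].flatten(order="F")
            for i in range(size) for j in range(size)]
    return np.stack(rows), size

def leaky_relu(z, slope=0.2):
    """LeakyReLU with an assumed negative slope."""
    return np.where(z > 0, z, slope * z)

n_channels, n_kernels = 2, 3            # the embodiment uses 256 and 512
rng = np.random.default_rng(1)
x = rng.standard_normal((n_channels, 8, 8))
kernels = rng.standard_normal((n_kernels, n_channels, 5, 5))
bias = rng.standard_normal(n_kernels)

patches = [unroll_strided(x[c]) for c in range(n_channels)]
size = patches[0][1]                                  # 4 for an 8x8 input
a = np.hstack([p for p, _ in patches])                # (16, 25 * n_channels)
W = np.vstack([
    np.stack([kernels[m, c].flatten(order="F") for m in range(n_kernels)], axis=1)
    for c in range(n_channels)
])
y = a @ W + bias                                      # one bias per kernel (column)
out = leaky_relu(y.T.reshape(n_kernels, size, size))  # n_kernels maps of 4x4
```

The 8*8 input padded by 2 on each side is 12*12, and a 5*5 window with stride 2 fits at 4 positions per axis, which is where the 16 input rows and the 4*4 output maps come from.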
The DCGAN acceleration method based on a photoelectric computing array provided by the present invention has been described in detail above to aid understanding of the invention and its core idea. Those of ordinary skill in the art may, in specific implementations, make various modifications and derivations based on the core idea of the invention. In summary, this specification should not be construed as limiting the present invention.

Claims (4)

1. An acceleration method for a deep convolutional generative adversarial network based on a photoelectric computing array, characterized in that the photoelectric computing array comprises a light-emitting array formed by a periodic arrangement of multiple light-emitting units and a computing array formed by a periodic arrangement of multiple computing units; the acceleration method comprises the following steps:
The weights of the deep convolutional generative adversarial network are input into the computing array optically by the light-emitting array, as the optical input, and are stored in the computing array; image data or noise data are quantized into single-bit data and then input into the computing array, as the electrical input; the computing array is used to compute the convolutional layer operations in the generator network and in the discriminator network of the deep convolutional generative adversarial network respectively; digital signals are output after the computation completes, and shift-and-accumulate operations are then performed on the digital signals to obtain the final result.
2. The acceleration method for a deep convolutional generative adversarial network based on a photoelectric computing array according to claim 1, characterized in that the computing unit comprises a carrier control region, a coupling region, and a photo-generated carrier collection region and readout region; the carrier control region is used to control and modulate the carriers in the photo-generated carrier collection region and readout region; the collection region in the photo-generated carrier collection region and readout region is used to absorb the photons emitted by the light-emitting unit and to collect the photo-generated carriers produced; the carrier control region, or the readout region in the photo-generated carrier collection region and readout region, is connected to an electrical signal, and the readout region is used to output the carriers after they have been acted on by the photo-generated carriers and the electrical signal; the coupling region connects the collection region and the readout region.
3. The acceleration method for a deep convolutional generative adversarial network based on a photoelectric computing array according to claim 1, characterized in that the computing array is used to compute the fractionally-strided convolution operations in the convolutional layers of the generator network, and the computation for each convolutional layer proceeds as follows:
(1) For the m convolution kernels of the current convolutional layer, each kernel is unrolled column-wise and concatenated into one column vector, so that the m kernels correspond to m column vectors, which are then concatenated into a matrix; for input image feature maps of n channels, the n matrices are stacked vertically into a new large matrix; a computing array of the same size as the large matrix is used, and the optical input of the computing array is the corresponding values of the large matrix;
(2) Batch normalization is performed on the n input image feature maps;
(3) The input of the current convolutional layer is the image feature maps of n channels, and each of these feature maps is processed as follows to obtain n matrices: several all-zero rows or columns are inserted between every two rows or two columns, and padding rows are then added around the border, giving a new feature map; a region of the same size as the convolution kernel is then selected and moved p times over the new feature map according to the specified stride; at each move the corresponding values in the feature map are taken out, unrolled column-wise, and concatenated into one row vector; after the moves are complete, p row vectors have been obtained, which are stacked vertically in order into a matrix;
(4) The n matrices are concatenated left to right to obtain the final electrical input matrix; the rows of the electrical input matrix are input into the computing array one by one from top to bottom, each column element of the row vector corresponding to one row of the computing array;
(5) Each row vector is input into the computing array bit by bit, i.e., one bit at a time; after the computing array completes the computation, the result of each column is converted to a digital signal by an analog-to-digital converter; the digital signals are then shifted according to their bit positions and accumulated to obtain the result, which takes the form of a vector of length m; this result is the outcome of one row vector of the electrical input matrix passing through the computing array, and corresponds to each of the m convolution kernels performing one convolution on the same region of the n image feature maps and summing the results;
(6) Following the method of steps (4) and (5), the p row vectors of the electrical input matrix are processed in order, giving p result vectors, which are stacked vertically into a matrix; each column of this matrix is rearranged into a feature map in the order in which the values were taken from the zero-padded image feature map, and is the result of the convolution with the corresponding kernel, so that m feature maps are obtained;
(7) A bias is added to the m feature maps and the activation operation is applied; on completion, the final result of the current convolutional layer is obtained.
4. The acceleration method for a deep convolutional generative adversarial network based on a photoelectric computing array according to claim 1, characterized in that the computing array is used to compute the convolution operations in the convolutional layers of the discriminator network, and the computation for each convolutional layer proceeds as follows:
(1) For the m convolution kernels of the current convolutional layer, each kernel is unrolled column-wise and concatenated into one column vector, so that the m kernels correspond to m column vectors, which are then concatenated into a matrix; for input image feature maps of n channels, the n matrices are stacked vertically into a new large matrix; a computing array of the same size as the large matrix is used, and the optical input of the computing array is the corresponding values of the large matrix;
(2) Batch normalization is performed on the n input image feature maps;
(3) The input of the current convolutional layer is the image feature maps of n channels, and each of these feature maps is processed as follows to obtain n matrices: a region of the same size as the convolution kernel is selected and moved p times over the feature map according to the specified stride; at each move the corresponding values in the feature map are taken out, unrolled column-wise, and concatenated into one row vector; after the moves are complete, p row vectors have been obtained, which are stacked vertically in order into a matrix;
(4) The n matrices are concatenated left to right to obtain the final electrical input matrix; the rows of the electrical input matrix are input into the computing array one by one from top to bottom, each column element of the row vector corresponding to one row of the computing array;
(5) Each row vector is input into the computing array bit by bit, i.e., one bit at a time; after the computing array completes the computation, the result of each column is converted to a digital signal by an analog-to-digital converter; the digital signals are then shifted according to their bit positions and accumulated to obtain the result, which takes the form of a vector of length m; this result is the outcome of one row vector of the electrical input matrix passing through the computing array, and corresponds to each of the m convolution kernels performing one convolution on the same region of the n image feature maps and summing the results;
(6) Following the method of steps (4) and (5), the p row vectors of the electrical input matrix are processed in order, giving p result vectors, which are stacked vertically into a matrix; each column of this matrix is rearranged into a feature map in the order in which the values were taken from the zero-padded image feature map, and is the result of the convolution with the corresponding kernel, so that m feature maps are obtained;
(7) A bias is added to the m feature maps and the activation operation is applied; on completion, the final result of the current convolutional layer is obtained.
CN201910291718.3A 2019-04-12 2019-04-12 Deep convolution generation type countermeasure network acceleration method based on photoelectric calculation array Active CN109993283B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910291718.3A CN109993283B (en) 2019-04-12 2019-04-12 Deep convolution generation type countermeasure network acceleration method based on photoelectric calculation array

Publications (2)

Publication Number Publication Date
CN109993283A true CN109993283A (en) 2019-07-09
CN109993283B CN109993283B (en) 2023-02-28
