CN100510836C - Pulsation array processing circuit for adaptive optical system wavefront control operation - Google Patents
- Publication number
- CN100510836C (application CNB2007100991061A / CN200710099106A)
- Authority
- CN
- China
- Prior art keywords
- processing unit
- array
- multiply accumulating
- pin
- accumulating processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention relates to a systolic array processing circuit for the wavefront control computation of an adaptive optics system. Multiple processing units are arranged into two linear structures that perform the convolution and recursion operations respectively, and an adder processing unit links the two arrays to carry out the wavefront control computation. The n voltages are computed serially in one array by time-division multiplexing, and data pass between adjacent processing units through shift registers, which avoids memory read and write operations. The array features localized data communication and simple data and control flows; it saves resources and is convenient to implement in hardware.
Description
Technical field
The present invention relates to a systolic array processing circuit, in particular one suited to the wavefront control computation of the wavefront processor in an adaptive optics system.
Background technology
In an adaptive optics system, effectively correcting the dynamic wavefront error caused by atmospheric turbulence places very high demands on the computing power and real-time performance of the wavefront processor. A general-purpose microcomputer cannot meet these demands, so a dedicated high-speed wavefront processor must be developed around the characteristics of adaptive optics wavefront processing. The workflow of the wavefront processor comprises five modules: image processing, slope calculation, wavefront reconstruction, control computation and D/A conversion. The wavefront control computation takes the wavefront error vector E output by the reconstruction module and, through an iterative control computation, obtains the control voltages required by the tilt mirror and the deformable mirror drivers. The computing formula is:
[Formula (1): image not reproduced in this text.]
where $b_0, b_1, b_2, b_3, a_1, a_2, a_3$ are the control parameters of the system. For an adaptive optics system with m effective sub-apertures and n units, V and E are n × 1 vectors. If e denotes one component of the wavefront error vector E and v the corresponding component of the control voltage V, the time-domain formula for the control voltage of a single channel is:
[Formula (2): image not reproduced in this text.]
where the two symbols (also images in the original) denote the error value of channel i in frame k and the voltage value of channel i in frame k (1 ≤ i ≤ n). The control computation is thus iterative: it depends not only on the data of the current frame but also on the data and results of the preceding frames.
The n channels of this iteration can be executed in parallel by a multiprocessor system built from several digital signal processors (DSPs). One example is the paper "A high-speed wavefront processor with a frame rate of 2900 Hz" by Wang Chunhong, published in Opto-Electronic Engineering in September 1998, which used 4 TMS320C31 devices for the parallel control iteration. Because each DSP needs a large amount of control circuitry, device integration is low and large-scale integration is difficult. The method is also essentially a software computation, which limits its speed. In addition, the computation can only begin after the reconstruction calculation of a whole frame has finished, that is, reconstruction and control computation run serially, so the computation delay is relatively large.
Summary of the invention
Technical problem solved by the invention: to overcome the deficiencies of the prior art by providing a systolic array processing circuit that performs the wavefront control computation of an adaptive optics wavefront processor with small computation delay, high integration and high operating speed.
Technical solution of the invention: a systolic array processing circuit for the wavefront control computation of an adaptive optics system, characterized in that it consists of 7 multiply-accumulate processing units PE1-PE7, 1 adder processing unit PE8 and 12 shift registers M1-M12. The 4 multiply-accumulate processing units PE1-PE4 are arranged linearly to form the convolution section, in which two data streams flow in opposite directions: the components of the error vector E of each frame enter the array in sequence at the first multiply-accumulate processing unit PE1 and leave the array after passing through the second, third and fourth units PE2, PE3 and PE4; the convolution result enters the array at PE4 with initial value 0 and leaves the array after passing through PE3, PE2 and PE1. The 3 multiply-accumulate processing units PE5-PE7 are arranged linearly to form the recursion section, which also carries two opposing data streams: the recursive partial result enters the array at the seventh unit PE7 with initial value 0, passes through the sixth and fifth units PE6 and PE5, and is added in the adder processing unit PE8 to the output of the convolution section; the sum is output as the computed control voltage and is also fed back, passing through PE5, PE6 and PE7 before leaving the array. The processing units are interconnected through shift registers, which carry data between adjacent units: the data output port of one processing unit is connected to the data input of a shift register, and the data output of that shift register is connected to the data input of the next processing unit.
Principle of the invention: the circuit is divided into a convolution section and a recursion section, formed by linear arrays of 4 and 3 multiply-accumulate processing units respectively, which perform the convolution and the recursion of the adaptive optics wavefront control computation. All processing units work synchronously, driven by the clock, and an adder unit links the two arrays: it adds the outputs of the two systolic arrays and outputs the resulting voltage value.
As shown in Fig. 1, the array consists of 8 processing units PE1-PE8 and 12 shift registers M1-M12 of depth n/2 (if n is not even, the depth is (n+1)/2). The array is divided into two parts, the convolution section 101 and the recursion section 102, which perform the convolution and recursion of formulas (3) and (4) below; an adder unit then links the two parts to realize the control computation of formula (5).
$$y_i = b_0 e_i + b_1 e_{i-1} + b_2 e_{i-2} + b_3 e_{i-3} \tag{3}$$
Then we have
$$v_i = a_1 v_{i-1} + a_2 v_{i-2} + a_3 v_{i-3} \tag{4}$$
$$v_i = (a_1 v_{i-1} + a_2 v_{i-2} + a_3 v_{i-3}) + (b_0 e_i + b_1 e_{i-1} + b_2 e_{i-2} + b_3 e_{i-3}) \tag{5}$$
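As a rough illustration of how formulas (3) to (5) fit together, the following Python sketch evaluates them for a single channel. The function names `control_step` and `run_channel` are hypothetical and not part of the patent, and the zero initial history is an assumption taken from the statement that the shift registers are initialized to zero.

```python
from collections import deque


def control_step(b, a, e_hist, v_hist):
    """One update of formula (5) for a single channel.

    b = (b0, b1, b2, b3), a = (a1, a2, a3)
    e_hist = (e_i, e_{i-1}, e_{i-2}, e_{i-3})
    v_hist = (v_{i-1}, v_{i-2}, v_{i-3})
    """
    y = sum(bj * ej for bj, ej in zip(b, e_hist))  # convolution part, formula (3)
    r = sum(aj * vj for aj, vj in zip(a, v_hist))  # recursion part, formula (4)
    return r + y                                   # sum of both parts, formula (5)


def run_channel(b, a, errors):
    """Apply the update to a sequence of error values, starting from zero history."""
    e_hist = deque([0.0] * 4, maxlen=4)  # newest error value on the left
    v_hist = deque([0.0] * 3, maxlen=3)  # newest voltage on the left
    voltages = []
    for e in errors:
        e_hist.appendleft(e)             # the oldest value falls off the right end
        v = control_step(b, a, tuple(e_hist), tuple(v_hist))
        v_hist.appendleft(v)
        voltages.append(v)
    return voltages


# Example with illustrative coefficients (not values from the patent):
print(run_channel((1, 0, 0, 0), (0.5, 0, 0), [1, 0, 0]))
```

With these illustrative coefficients the recursion reduces to a first-order filter, so a unit impulse decays by half on every step.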
Here PE1-PE7 are multiply-accumulate units with local storage; each consists of a register REG, a multiplier MUL and an adder ADD.
Fig. 2 is a structural diagram of the multiply-accumulate processing unit. Ports of the multiply-accumulate processing unit:
MUL_in, MUL_out: input and output ports for the error value or voltage value;
MAD_in, MAD_out: input and output ports for the accumulated value.
Logic function of the multiply-accumulate processing unit:
The multiplier multiplies the data at port MUL_in by the value prestored in register REG; the adder adds the multiplier output to the data at port MAD_in and drives the result onto port MAD_out. The unit thus performs one multiply-accumulate operation.
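The port behaviour just described can be sketched as a small Python class. This is a behavioural model under stated assumptions, not the patented circuit: the class name `MacPE` and the method `step` are hypothetical, timing is ignored, and the pass-through from MUL_in to MUL_out follows the embodiment's statement that the two pins are connected.

```python
class MacPE:
    """Behavioural model of one multiply-accumulate processing unit."""

    def __init__(self, coeff):
        self.reg = coeff  # control parameter prestored in register REG

    def step(self, mul_in, mad_in):
        """Return (MUL_out, MAD_out) for the given MUL_in and MAD_in."""
        mad_out = self.reg * mul_in + mad_in  # multiplier followed by adder
        mul_out = mul_in                      # MUL_in is wired straight to MUL_out
        return mul_out, mad_out


pe = MacPE(2.0)
print(pe.step(3.0, 1.0))  # (3.0, 7.0)
```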
PE8 is an adder unit with feedback; as shown in Fig. 3, it consists of one adder ADD. Ports:
Y_in: input port for the result $y_i$ of the convolution array 101;
MAD_in: input port for the accumulated value;
V_feedback: voltage feedback port;
V_out: external output port for the voltage value.
Logic function of the adder unit:
The adder adds the data at ports Y_in and MAD_in and drives the result onto V_feedback and V_out. It links the two parts of the systolic array, outputting the result externally and feeding it back to the right to take part in further computation.
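For comparison with the multiply-accumulate unit, the adder unit can be modelled in a few lines of Python. The function name `adder_pe` is hypothetical; the only behaviour assumed is what the text states, namely that one sum drives both output ports.

```python
def adder_pe(y_in, mad_in):
    """Behavioural model of PE8: return (V_out, V_feedback)."""
    v = y_in + mad_in  # convolution result plus recursive partial sum
    return v, v        # the same value is output externally and fed back


v_out, v_feedback = adder_pe(1.5, 0.25)
print(v_out, v_feedback)  # 1.75 1.75
```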
Compared with the prior art, the invention has the following advantages:
(1) Processing units pass data between neighbours through input queue registers and output queue registers. The multiplier of each unit takes the datum at the head of the unit's input queue directly, and the adder writes its result directly to the tail of the unit's output queue. Using queues in this way avoids memory reads and writes: input and output data not needed in the current computation do not have to be stored in a unit's memory, so one array achieves very high efficiency when computing many voltage values.
(2) When a systolic array computes one channel of voltage, half of its processing units are idle. Two independent voltage computations can therefore share the processing units of one systolic array and run at the same time, raising processing unit utilization to 100% and saving hardware resources. This is why the input and output queues between processing units have depth n/2 (or (n+1)/2 if n is odd).
(3) Each PE has a simple structure and only local data communication, which makes hardware implementation convenient.
(4) The circuit runs in parallel with the reconstruction module; the computation delay is one flow beat, giving good real-time performance.
(5) The invention maximizes the use of each processing element, which favours miniaturization and low power consumption of the device. The workflow and operating characteristics of the wavefront processor are as follows: the CCD outputs data pixel by pixel, line by line, and a frame of data passes through five modules (image processing, slope calculation, wavefront reconstruction, control computation and D/A conversion) before the voltages are output to drive the deformable mirror. The interval between the arrival of the frame k-1 error vector $E_{k-1}$ and the frame k error vector $E_k$ is therefore long (the sum of the CCD camera's one-frame pixel output delay, the slope calculation delay and the wavefront reconstruction delay). The input wavefront error vector E of the control computation is the output of the wavefront reconstruction module, whose error components are output in sequence at short intervals. This guarantees that the n voltage values can be computed in sequence before the error vector of the next frame enters the array. Based on these characteristics, the invention organizes a suitable data flow and designs the corresponding circuit: a single systolic array computes the n independent control voltages serially by time-division multiplexing, which improves the utilization of the array's processing units, reduces their number and lowers resource usage.
Description of drawings
Fig. 1 is a block diagram of the invention;
Fig. 2 is a structural diagram of the multiply-accumulate processing unit PE of the invention;
Fig. 3 is a structural diagram of the adder unit of the invention.
Embodiment
The embodiment is described in detail below with reference to Figs. 1 to 3.
As shown in Fig. 1, the circuit consists of 7 multiply-accumulate processing units PE1-PE7, one adder processing unit PE8, and 12 shift registers M1-M12 of depth n/2 (or (n+1)/2 if n is odd). Processing units PE1-PE4 with shift registers M1-M6, and processing units PE5-PE8 with shift registers M7-M12, are arranged in two linear array structures. Ports of adjacent processing units are interconnected through shift registers: pin MUL_out of the left-hand processing unit is connected to the data input port of a shift register, whose data output port is connected to pin MUL_in of the right-hand processing unit; pin MAD_in of the left-hand processing unit is connected to the data output port of a shift register, whose data input port is connected to pin MAD_out of the right-hand processing unit. Pin MUL_in of PE1 is connected to the error data input port E of the array. Pin Y_in of PE8 is connected to pin MAD_out of PE1; pin V_out is connected to the voltage output port V of the array; pin V_feedback is connected to the input port of shift register M7, whose output port is connected to pin MUL_in of PE5; pin MAD_in of PE8 is connected to the output port of shift register M10, whose input port is connected to pin MAD_out of PE5.
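The shift registers that link neighbouring units, and the depth rule stated above, can be sketched in Python as follows. This is an illustrative model only: `register_depth` and `ShiftRegister` are hypothetical names, and the model assumes that each shift moves every stored value by exactly one cell.

```python
from collections import deque


def register_depth(n):
    """Depth of each shift register for an n-channel system."""
    return n // 2 if n % 2 == 0 else (n + 1) // 2


class ShiftRegister:
    """Fixed-depth shift register with all cells initialized to zero."""

    def __init__(self, depth):
        self.cells = deque([0.0] * depth, maxlen=depth)

    def shift(self, value):
        """Push a value in at one end and return the value leaving the other end."""
        out = self.cells[-1]           # oldest cell, about to leave the register
        self.cells.appendleft(value)   # maxlen drops the oldest cell automatically
        return out


print(register_depth(8), register_depth(7))      # 4 4
sr = ShiftRegister(2)
print([sr.shift(x) for x in (1.0, 2.0, 3.0)])    # [0.0, 0.0, 1.0]
```

A value written into a register of depth d reappears at its output d shifts later, which is what lets several channels share the same processing units.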
As shown in Figs. 1 and 2, each multiply-accumulate processing unit PE1-PE7 consists of a register REG, a multiplier MUL and an adder ADD. Register REG is connected to one input of multiplier MUL, and the other input of MUL is connected to pin MUL_in of the processing unit. The output of MUL is connected to one input of adder ADD, and the other input of ADD is connected to pin MAD_in of the processing unit. The output of ADD is connected to pin MAD_out of the processing unit, and pin MUL_in of the processing unit is connected to pin MUL_out.
As shown in Figs. 1 and 3, the adder processing unit PE8 consists of one adder ADD. One input of ADD is connected to pin Y_in of PE8, the other input is connected to pin MAD_in of PE8, and the output of ADD is connected to pins V_out and V_feedback of PE8.
The operating principle of the circuit is described below with reference to Fig. 1:
(1) Before the circuit starts, the system control parameters $b_0, b_1, b_2, b_3, a_1, a_2, a_3$ are prestored in the registers REG of PE1-PE7 respectively, and every storage cell of every shift register is initialized to zero.
(2) The n components of an error vector enter the array in sequence. The moment an error value $e_i$ enters at port MUL_in of PE1 starts the circuit and defines time beat zero. Processing units PE1-PE7 work synchronously, driven by the clock. In beat 1, each multiplier takes from port MUL_in the datum in the rightmost cell of shift registers M1-M6 and multiplies it by the value prestored in register REG. In beat 2, each adder takes from port MAD_in the datum in the leftmost cell of shift registers M7-M12 and adds it to the multiplication result, driving the sum onto port MAD_out. In beat 3, the adder of PE8 adds the output of port MAD_out of PE1, taken from port Y_in, to the datum in the leftmost cell of M10, taken from port MAD_in, and drives the result onto V_out and V_feedback. In beat 4, the data in shift registers M1-M6 move one storage cell from left to right and the data in shift registers M7-M12 move one storage cell from right to left. Each processing unit has then completed one operation and $v_i$ is available at the output port V_out of the array; this sequence is called one flow beat.
(3) Each time an error value enters the array, one flow beat is started: the data move one storage cell to the left or right and the voltage value of that channel is obtained. In every flow beat, each processing unit and shift register repeats the same operations. In this way the error values flow from left to right through the convolution section 101 while the convolution results flow from right to left. In the recursion section 102 the partial result starts from an initial value of zero and flows from right to left; once the recursion is complete it is added in PE8 to the result of the convolution section, and the resulting voltage value is output.
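The time-division scheme in steps (1) to (3) can be checked against a plain software reference. The Python sketch below is a behavioural model only, not a cycle-accurate simulation of the array: it does not model the individual processing units, the shift registers or the sharing of units between two channels. The function name `run_frames` is hypothetical, and the sketch assumes that the history of each channel is carried from frame to frame, as the Background section describes.

```python
def run_frames(b, a, frames):
    """Compute control voltages for a sequence of error vectors.

    b = (b0, b1, b2, b3), a = (a1, a2, a3)
    frames: list of error vectors, each of length n
    Returns one voltage vector per frame.
    """
    n = len(frames[0])
    e_hist = [[0.0] * 3 for _ in range(n)]  # three previous error values per channel
    v_hist = [[0.0] * 3 for _ in range(n)]  # three previous voltages per channel
    result = []
    for errors in frames:
        voltages = []
        for i, e in enumerate(errors):      # one flow beat per incoming error value
            y = b[0] * e + sum(bj * ej for bj, ej in zip(b[1:], e_hist[i]))
            v = y + sum(aj * vj for aj, vj in zip(a, v_hist[i]))
            e_hist[i] = [e] + e_hist[i][:2]  # shift this channel's history by one
            v_hist[i] = [v] + v_hist[i][:2]
            voltages.append(v)
        result.append(voltages)
    return result


# Two channels, two frames, illustrative coefficients (not values from the patent):
print(run_frames((1, 0, 0, 0), (0.5, 0, 0), [[1.0, 2.0], [0.0, 0.0]]))
```

The point of the sketch is that the n channels are served one after another by the same arithmetic, while each channel keeps its own short history; in the circuit that history lives in the shift registers rather than in lists.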
Claims (4)
1. A systolic array processing circuit for the wavefront control computation of an adaptive optics system, characterized in that: it consists of 7 multiply-accumulate processing units PE1-PE7, 1 adder processing unit PE8 and 12 shift registers M1-M12; the 4 multiply-accumulate processing units PE1-PE4 are arranged linearly to form a convolution section, in which two data streams flow in opposite directions: the components of the error vector E of each frame enter the array in sequence at the first multiply-accumulate processing unit PE1 and leave the array after passing through the second, third and fourth multiply-accumulate processing units PE2, PE3 and PE4, and the convolution result enters the array at PE4 with initial value 0 and leaves the array after passing through PE3, PE2 and PE1; the 3 multiply-accumulate processing units PE5-PE7 are arranged linearly to form a recursion section, in which two data streams flow in opposite directions: the recursive partial result enters the array at the seventh multiply-accumulate processing unit PE7 with initial value 0, passes through the sixth and fifth multiply-accumulate processing units PE6 and PE5, and is added in the adder processing unit PE8 to the output of the convolution section, the sum being output as the computed control voltage and also fed back, passing through PE5, PE6 and PE7 before leaving the array; the processing units are interconnected through shift registers, which carry data between adjacent units, that is, the data output port of one processing unit is connected to the data input of a shift register, and the data output of that shift register is connected to the data input of the next processing unit.
2. The systolic array processing circuit for the wavefront control computation of an adaptive optics system according to claim 1, characterized in that: each of the multiply-accumulate processing units PE1-PE7 consists of a register REG, a multiplier MUL and an adder ADD; register REG is connected to one input of multiplier MUL, the other input of MUL is connected to pin MUL_in of the multiply-accumulate processing unit, the output of MUL is connected to one input of adder ADD, the other input of ADD is connected to pin MAD_in of the multiply-accumulate processing unit, the output of ADD is connected to pin MAD_out of the multiply-accumulate processing unit, and pin MUL_in of the multiply-accumulate processing unit is connected to pin MUL_out.
3. The systolic array processing circuit for the wavefront control computation of an adaptive optics system according to claim 1, characterized in that: the adder processing unit PE8 consists of one adder ADD; one input of ADD is connected to pin Y_in of PE8, the other input is connected to pin MAD_in of PE8, and the output of ADD is connected to pins V_out and V_feedback of PE8.
4. The systolic array processing circuit for the wavefront control computation of an adaptive optics system according to claim 1, characterized in that: when n is even, the depth of the 12 shift registers M1-M12 is n/2; when n is odd, the depth of the 12 shift registers M1-M12 is (n+1)/2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2007100991061A CN100510836C (en) | 2007-05-11 | 2007-05-11 | Pulsation array processing circuit for adaptive optical system wavefront control operation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2007100991061A CN100510836C (en) | 2007-05-11 | 2007-05-11 | Pulsation array processing circuit for adaptive optical system wavefront control operation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101067681A | 2007-11-07 |
CN100510836C | 2009-07-08 |
Family
ID=38880284
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2007100991061A (CN100510836C, Expired - Fee Related) | 2007-05-11 | 2007-05-11 | Pulsation array processing circuit for adaptive optical system wavefront control operation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100510836C (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102221842B (en) * | 2011-05-18 | 2013-01-09 | 中国科学院长春光学精密机械与物理研究所 | Thousands-of-unit extensible adaptive optical system wave-front processor |
US10838910B2 (en) * | 2017-04-27 | 2020-11-17 | Falcon Computing | Systems and methods for systolic array design from a high-level program |
CN108182471B (en) * | 2018-01-24 | 2022-02-15 | 上海岳芯电子科技有限公司 | Convolutional neural network reasoning accelerator and method |
CN110647975B (en) * | 2018-06-27 | 2022-09-13 | 龙芯中科技术股份有限公司 | Data processing method, device, equipment and medium |
CN111208865B (en) * | 2018-11-22 | 2021-10-08 | 南京大学 | Photoelectric calculation unit, photoelectric calculation array and photoelectric calculation method |
CN109902835A (en) * | 2019-02-01 | 2019-06-18 | 京微齐力(北京)科技有限公司 | Processing unit is provided with the artificial intelligence module and System on Chip/SoC of general-purpose algorithm unit |
CN114237551B (en) * | 2021-11-26 | 2022-11-11 | 南方科技大学 | Multi-precision accelerator based on pulse array and data processing method thereof |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6052770A (en) * | 1993-06-08 | 2000-04-18 | Theseus Logic, Inc. | Asynchronous register |
US6791362B1 (en) * | 2003-12-09 | 2004-09-14 | Honeywell International Inc. | System level hardening of asynchronous combinational logic |
CN1664650A (en) * | 2005-03-14 | 2005-09-07 | 中国科学院光电技术研究所 | Double wave front calibrator self-adaptive optical system |
CN1690763A (en) * | 2004-04-26 | 2005-11-02 | 中国科学院光电技术研究所 | Deformation mirror overvoltage protection adjustment method with central driver voltages approaching to zero |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6052770A (en) * | 1993-06-08 | 2000-04-18 | Theseus Logic, Inc. | Asynchronous register |
US6791362B1 (en) * | 2003-12-09 | 2004-09-14 | Honeywell International Inc. | System level hardening of asynchronous combinational logic |
CN1690763A (en) * | 2004-04-26 | 2005-11-02 | 中国科学院光电技术研究所 | Deformation mirror overvoltage protection adjustment method with central driver voltages approaching to zero |
CN1664650A (en) * | 2005-03-14 | 2005-09-07 | 中国科学院光电技术研究所 | Double wave front calibrator self-adaptive optical system |
Also Published As
Publication number | Publication date |
---|---|
CN101067681A (en) | 2007-11-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN100510836C (en) | Pulsation array processing circuit for adaptive optical system wavefront control operation | |
CN100449522C (en) | Matrix multiplication parallel computing system based on multi-FPGA | |
CN104899182A (en) | Matrix multiplication acceleration method for supporting variable blocks | |
CN108647773B (en) | Hardware interconnection system capable of reconstructing convolutional neural network | |
CN103677739A (en) | Configurable multiply accumulation cell and multiply accumulation array consisting of same | |
CN107341133B (en) | Scheduling method of reconfigurable computing structure based on LU decomposition of arbitrary dimension matrix | |
Ahmedsaid et al. | Improved SVD systolic array and implementation on FPGA | |
CN101038582B (en) | Systolic array processing method and circuit used for self-adaptive optical wave front restoration calculation | |
WO2021108559A1 (en) | Loading operands and outputting results from a multi-dimensional array using only a single side | |
CN109144469A (en) | Pipeline organization neural network matrix operation framework and method | |
CN111506343A (en) | Deep learning convolution operation implementation method based on pulse array hardware architecture | |
CN112905530A (en) | On-chip architecture, pooled computational accelerator array, unit and control method | |
CN116710912A (en) | Matrix multiplier and control method thereof | |
US20060153321A1 (en) | Device for implementing a sum of products expression | |
CN116090530A (en) | Systolic array structure and method capable of configuring convolution kernel size and parallel calculation number | |
CN112346704B (en) | Full-streamline type multiply-add unit array circuit for convolutional neural network | |
US11423292B2 (en) | Convolutional neural-network calculating apparatus and operation methods thereof | |
CN211577939U (en) | Special calculation array for neural network | |
CN1553310A (en) | Symmetric cutting algorithm for high-speed low loss multiplier and circuit strucure thereof | |
CN114912596A (en) | Sparse convolution neural network-oriented multi-chip system and method thereof | |
Saldana et al. | FPGA-based customizable systolic architecture for image processing applications | |
US20230252600A1 (en) | Image size adjustment structure, adjustment method, and image scaling method and device based on streaming architecture | |
CN114518861A (en) | Operation unit compatible with SIMD (Single instruction multiple data) calculation and floating-point matrix multiplication and application method thereof | |
CN111832717B (en) | Chip and processing device for convolution calculation | |
CN116738135A (en) | Matrix multiplier for training of transducer type model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20090708; Termination date: 20140511 |