EP0422964A2 - Second nearest-neighbor communication network for synchronous vector processor systems and methods - Google Patents
Second nearest-neighbor communication network for synchronous vector processor systems and methods
- Publication number
- EP0422964A2 (application number EP90311267A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- data
- data processing
- output
- svp
- register
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G06F15/8015—One dimensional arrays, e.g. rings, linear arrays, buses
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F02—COMBUSTION ENGINES; HOT-GAS OR COMBUSTION-PRODUCT ENGINE PLANTS
- F02B—INTERNAL-COMBUSTION PISTON ENGINES; COMBUSTION ENGINES IN GENERAL
- F02B75/00—Other engines
- F02B75/02—Engines characterised by their cycles, e.g. six-stroke
- F02B2075/022—Engines characterised by their cycles, e.g. six-stroke having less than six strokes per cycle
- F02B2075/027—Engines characterised by their cycles, e.g. six-stroke having less than six strokes per cycle four
Definitions
- the present invention relates generally to single instruction, multiple data processors. More particularly, the invention relates to processors having a one dimensional array of processing elements, that finds particular application in digital signal processing such as Improved Definition Television (IDTV). Additionally, the invention relates to improvements to the processors, television and video systems and other systems improvements and methods of their operation and control.
- Video signal processing requires the use of Finite Impulse Response (FIR) digital filters for many of the data processing applications. If the sampling frequency is carefully selected, the coefficients of the filters can be small ratios of powers of two or at least simple combinations of powers of two.
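As a minimal sketch of why power-of-two coefficients are attractive, the following Python models a 3-tap FIR filter using only shifts and adds. The coefficient set (1/4, 1/2, 1/4), the function name and the edge handling (samples outside the line read as zero) are illustrative assumptions, not taken from the patent.

```python
def fir_shift_add(samples):
    """3-tap FIR y[n] = x[n-1]/4 + x[n]/2 + x[n+1]/4 using shifts only.

    The coefficients are an assumed example; out-of-range neighbours
    are treated as zero.
    """
    n = len(samples)
    out = []
    for i in range(n):
        left = samples[i - 1] if i > 0 else 0
        right = samples[i + 1] if i < n - 1 else 0
        # (left + 2*centre + right) / 4, done with integer shifts
        out.append((left + (samples[i] << 1) + right) >> 2)
    return out
```

Each output sample needs the values held by the two adjacent elements, which is the kind of neighbor access the description goes on to discuss.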
- Real time video signal processing requires that the operating processors receive and process the video signal and the data necessary to emulate digital filters at extremely fast rates. In the prior art a substantial portion of the processing time is consumed in obtaining the sample data from adjacent processors in the array. For example, each processor in the array would have to execute a series of instructions to address, read and transfer data located in its next adjacent processor until the data reaches the desired location in the array. In a large array, this sequence of transferring the data from one processor to the next until it reaches a desired location is time consuming. If only a finite time exists to receive and process the data, a large data retrieval time will of course leave less time for data processing. Therefore a technique for reducing the data retrieval time in a synchronous vector processor is desired in the art.
- the present invention comprises processor circuits connected in a serial chain, each of said processor circuits including: a data processing unit having a digital input connected in common with the digital input of each of the data processing units of the other processor circuits for entry of said control and address signals, the data processing unit including an arithmetic logic unit, a plurality of data storage registers connected to said arithmetic logic unit, and data multiplexers connected to said data storage registers; a first register interface including a first set of bit registers for parallel entry of said first digital data signal and including a second set of bit registers, said first and second set of bit registers individually accessible by said data processing unit; a second register interface including a third set of bit registers and also having a fourth set of bit registers having a parallel digital output for producing the processed digital data signal, said third and fourth set of bit registers individually accessible by said data processing unit; a first sequencer circuit connected by a first common line to the first register interface in each of the processor circuits and responsive to the clock pulses for selectively sequentially activating operations of each of said first register interface; and a second sequencer circuit connected by a second common line to the second register interface in each of the processor circuits and responsive to the clock pulses for selectively sequentially activating operations of each of said second register interface; said data processing units thereby operable by said controller independently of and cooperatively with said first and
- A Synchronous Vector Processor (SVP) of a preferred embodiment is a general purpose mask-programmable single instruction, multiple data, reduced instruction set computing (SIMD-RISC) device capable of executing in real-time the 3-D algorithms useful in Improved and Extended Definition Television (IDTV and EDTV) systems.
- the Input and Output layers operate in synchronism with the data source (such as video camera, VCR, receiver, etc.) and the data sink respectively (such as the raster display).
- the Computation layer performs the desired transformation by the application of programmable functions simultaneously to all the elements of a packet (commonly referred to as a VECTOR: within the TV/Video environment all the samples comprising a single horizontal display line).
- a TV or video system 100 includes synchronous vector processor device 102.
- System 100 comprises a CRT 104 of the raster-scan type receiving an analog video signal at input 106 from standard analog video circuits 108 as used in a conventional TV receiver.
- a video signal from an antenna 110 is amplified, filtered and heterodyned in the usual manner through RF and IF stages 112 including tuner, IF strip and sync separator circuitry therein, producing an analog composite or component video signal at line 114.
- Detection of a frequency modulated (FM) audio component is separately performed and not further discussed here.
- the horizontal sync, vertical sync, and color burst are used by controller 128 to provide timing to SVP 102 and thus are not part of SVP's data path.
- the analog video signal on line 114 is converted to digital by analog-to-digital converter 116.
- the digitized video signal is provided at line 118 for input to synchronous vector processor 102.
- Processor 102 processes the digital video signal present on line 118 and provides a processed digital signal on lines 170.
- the processed video signal is then converted to analog by digital-to-analog converter 124 before being provided via line 126 to standard analog video circuits 108.
- Video signals can be provided to analog-to-digital converter 116 from a recorded or other non standard signal source such as video tape recorder 134.
- the VCR signal is provided on line 136 and bypasses tuner 112.
- Processor 102 can store one (or more) video frames in a field memory 120, which is, illustratively, a Texas Instruments Model TMS4C1060 field memory device.
- Field memory 120 receives control and clocking on lines 138 and 140 142 and 144, 2 test lines 146, 7 DOR/RF1 address lines 133, 24 data output lines 170 and a 1-bit global output 178 (GO) line.
- the I/O system of the SVP comprises the Data Input Register 154 (DIR) and the Data Output Register 168 (DOR).
- DIR and DOR are sequentially addressed dual-ported memories and operate as high speed shift registers. Both DIR and DOR are dynamic memories in the preferred embodiment.
- Because DIR and DOR are asynchronous to the PEs 150 in the general case, some type of synchronization must occur before data is transferred between DIR/DOR and the PEs 150. This usually occurs during the horizontal blanking period in video applications. In some applications the DIR, DOR, and PEs may operate synchronously, but in any case it is not recommended to read or write to both ports of one of the registers simultaneously.
- the DIR of processor 102 is a 40960 bit dynamic dual-ported memory.
- One port 119 is organized as 1024 words of 40 bits each and functionally emulates the write port of a 1024 word line memory.
- Figure 4 depicts a timing diagram for a DIR write.
- the 40 Data Inputs 118 (DI0 through DI39) are used in conjunction with timing signals Write Enable 190 (WE), Reset Write 192 (RSTWH), and Write Clock 194 (SWCK).
- WE 190 controls both the write function and the address pointer 148 (commutator) increment function synchronously with SWCK 194.
- the RSTWH 192 line resets the address pointer 148 to the first word in the 1024 word buffer on the next rising edge of SWCK.
- SWCK 194 is a continuous clock input. After an initial two clock delay, one 40 bit word of data 198 is written on each subsequent rising edge of SWCK 194. If data words 0 to N are to be written, WE remains high for N+4 rising edges of SWCK.
- the address pointer 148 may generally comprise a 1-of-1024 commutator, sequencer or ring counter triggered to begin at the end of a horizontal blanking period and continue for 1024 cycles synchronized with the sampling frequency of the A-to-D converter 116.
- the input commutator 148 is clocked at above 1024 times the horizontal scan rate.
- the output commutator 174 can be, but is not necessarily, clocked at the same rate as the input.
- Although processor 102 is depicted as having 1024 processor elements, it can have more or fewer.
- the actual number is related to the television signal transmission standard employed, namely NTSC, PAL or SECAM, or the desired system or functions in non television applications.
- the second port 121 of data input register 154 is organized as 40 words of 1024 bits each; each bit corresponding to a processor element 150.
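The two organizations of the DIR (1024 words of 40 bits on the input side, 40 words of 1024 bits on the processor side) amount to viewing the same 40960 bits along the other axis. The sketch below illustrates that view in Python; the list-of-integers representation, the names and the dummy data are assumptions made for illustration only.

```python
WORDS = 1024   # one word per processor element
BITS = 40      # data inputs DI0..DI39


def bit_plane(words, bit):
    """Return the 1024-bit 'word' seen on the processor side for one bit index.

    `words` is the list of 40-bit integers written on the input side; element
    n of the result is bit `bit` of word n, i.e. the bit belonging to PE n.
    """
    return [(w >> bit) & 1 for w in words]


words = [n & ((1 << BITS) - 1) for n in range(WORDS)]  # dummy line of data
plane0 = bit_plane(words, 0)
```

Here `plane0` holds one bit per processor element, which matches the description of each bit of a 1024-bit word corresponding to one processor element 150.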
- Port 121 is physically a part of, and is mapped into the absolute address space of, RF0; therefore, the DIR and RF0 are mutually exclusive circuits.
- the DIR 154 works independently of the DOR 168; therefore it has its own address lines 131 and some of its own control lines 135.
- the exact function of DIR 154 is determined by many lines: C21, C8, C2, C1, C0, the contents of WRM 234, and by addresses RF0A6 through RF0A0 (see Fig. 5).
- Control line C2 = 1 selects DIR 154.
- the seven address lines RF0A6 - RF0A0 select 1-of-40 bits to be read or written to while C1 and C0 select the write source (for a read C0 and C1 don't matter).
- With certain combinations of lines C1 and C0 the write source for DIR 154 depends on the state of C21 and C8 and the contents of Working Register M 234. These form instructions called M-dependent instructions which allow more processor 102 flexibility.
- Table 1 sets forth the control line function for DIR 154.
- DOR 168 is a 24576 bit dynamic dual-ported memory.
- One port 169 is organized as 1024 words of 24 bits each and functionally emulates the read port of a 1024 word line memory.
- the Data Outputs (DO0 through DO23) 170 are used in conjunction with the signals Read Enable (RE), Reset Read (RSTRH), and serial Read Clock (SRCK) of Fig. 6.
- SRCK 496 is a continuous clock input. RE 490 enables and disables both the read function and the address pointer increment function synchronously with SRCK 496. When high, the RSTRH line 494 resets the address pointer (commutator) to the first word in the 1024 word buffer on the next rising edge 498 of SRCK 496.
- one 24 bit word of data is output an access time after each subsequent rising edge of SRCK. If data words 0 to N are to be read, then RE must remain high for N+3 rising edges of SRCK.
- the address pointer 174 can similarly comprise a 1-of-1024 commutator or ring counter.
- the second port 167 of data output register 168 is organized as 24 words of 1024 bits each; each bit corresponding to a Processor Element 150.
- Port 167 of DOR 168 is physically a part of, and is mapped into the absolute address space of, RF1 166; therefore, the DOR 168 and RF1 166 are mutually exclusive circuits. When one is addressed by an operand on a given Assembly line, the other cannot be. An Assembly line which contains references to both will generate an assembly-time error. This is discussed in more detail hereinafter.
- DOR 168 works independently of DIR 154; therefore it has its own address lines 133 and some of its own control lines 137.
- the exact function of DOR 168 is determined by many lines: C21, C5, C4, C3, the contents of WRM 234, and by addresses RF1A6 through RF1A0 (see Fig. 5).
- Control line C5 = 1 selects DOR 168.
- the seven address lines 133 select 1-of-24 bits to be read or written to while C4 and C3 select the write source. With certain combinations of control lines C4 and C3, the write source for DOR 168 depends on the state of C21 and the contents of Working Register M 234. These form instructions called M-dependent instructions which allow more processor 102 flexibility.
- Table 3 sets forth the control line 130 function for DOR 168.
- the logical diagram of Figure 5 details the interconnect of RF1 and the DOR.
- C21, C5, C4, C3, and RF1A6 through RF1A0 are control/address/data lines common to all 1024 PEs.
- Signal C 280 and M 250 are from WRC 248 and WRM 234 respectively.
- SM 262 and CY 264 are from ALU 260.
- the memory map of Table 4 below requires an eight bit address. This address is made up of Control line C5 (RF1A7) as the MSB and Address lines RF1A6 through RF1A0 (133) as the less significant bits. C5 is not considered an address because the selection of the DOR 168 versus RF1 166 is implicit in the instruction mnemonic by bit C5.
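The address composition described here can be sketched as follows. Only the bit positions come from the text (C5 as the MSB, seven address lines below it); the function name and the range checks are illustrative, and nothing about the contents of Table 4 is assumed.

```python
def rf1_dor_address(c5, rf1a):
    """Combine control line C5 (as RF1A7) with RF1A6..RF1A0 into 8 bits."""
    if c5 not in (0, 1):
        raise ValueError("C5 is a single bit")
    if not 0 <= rf1a <= 0x7F:
        raise ValueError("RF1A6..RF1A0 is a 7-bit field")
    return (c5 << 7) | rf1a
```

With this composition, addresses with the top bit clear fall in the RF1 half of the eight-bit space and addresses with the top bit set fall in the DOR half, consistent with C5 selecting between the two.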
- the Register Files, Data Input Register, and Data Output Register are dynamic read/write memories and are periodically refreshed unless implicitly refreshed by the running program.
- the program will keep the RFs refreshed if the software loop is repeated more frequently than the refresh period. This keeps any memory locations which are being used by the program refreshed, while unused bits are allowed to remain un-refreshed.
- a program can explicitly refresh both RFs by simply reading all locations of interest within the refresh period.
- RF0 158 works independently of RF1 166; therefore it has its own address lines 131 and some of its own control lines. The exact function of RF0 158 is determined by many lines: C21, C8, C2, C1, C0, the contents of WRM 234, and by addresses RF0A6 through RF0A0 (see Fig. 5).
- Control line 448 C2 = 0 selects RF0 158.
- the seven address lines 131 select 1-of-128 bits to be read or written to while C1 and C0 select the write source. With certain combinations of control lines C1 and C0, the write source for RF0 158 depends on the state of C21 and C8 and the contents of Working Register M 234. These form instructions called M-dependent instructions which allow more processor 102 flexibility. Table 6 sets forth the control line function for register file 0 158.
- the logical diagram of Figure 5 details the interconnect of RF0 158 and the DIR 154.
- C21, C8, C2, C1, C0, and RF0A6 through RF0A0 are control/address lines common to all 1024 PEs.
- Signal C 280 and M 250 are from WRC 248 and WRM 234 respectively.
- SM 262 is from ALU 260.
- R 322, 2R 324, L 310, and 2L 312 are signals from this PE's four nearest neighbors.
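To illustrate what the first and second nearest-neighbor signals offer, the sketch below models the four neighbor inputs of one element in a linear array and counts the transfers needed to move a value a given distance. The function names, the zero value at the array ends and the left/right orientation are assumptions for illustration.

```python
def neighbor_inputs(values, n):
    """Return the (2L, L, R, 2R) values seen by element n.

    Indices past either end read as 0 (an assumed edge condition).
    Which physical side counts as "left" is also an assumption here.
    """
    def at(i):
        return values[i] if 0 <= i < len(values) else 0
    return at(n - 2), at(n - 1), at(n + 1), at(n + 2)


def transfer_steps(distance, use_second_neighbor=True):
    """Transfers needed to move a value `distance` elements along the array."""
    if not use_second_neighbor:
        return distance
    return distance // 2 + distance % 2
```

Under these assumptions a move of five elements takes three transfers with second-neighbor links instead of five, which is the kind of reduction in data retrieval time the background section asks for.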
- RF1 166 works independently of RF0 158; therefore it has its own address lines 133 and some of its connect forty data lines 118 (from the parallel inputs DI0 - DI39) to dynamic memory cells 518. These cells are dual-port, and are also written to or read from through access transistors 520 and folded bit lines 522 and 524 connected to sense amplifier 156, when addressed by word lines 526. There are forty of the word lines 526 for the DIR part and 128 of the word lines 160 for the RF0 part of this 168-bit dynamic random access memory (DRAM) column.
- the DIR is a 2-transistor dual port cell. Reading and writing can be performed for each port.
- the DIR operates as a high speed dynamic shift register.
- the dual port nature allows asynchronous communication of data into and out of the DIR.
- By using dynamic cells the shift register layout is greatly reduced. Although a dummy cell can be used, it is not a requirement for cell operation.
- the data output register utilizes a 3-transistor dual port gain cell. In most applications reading and writing is allowed at port 167, but only reading is performed from the second port. DOR 168 also operates as a high speed dynamic shift register. The DOR with gain transistor circuit allows reading of capacitor 519 without destroying the stored charge. In operation, if a logical "1" on cell 519 is greater than 1 VT of transistor 1640, then when select line 172 is turned on, line 1642 will eventually be pulled to a logical "0", or zero volts. If the charge on cell 519 is less than 1 VT (i.e., a logical "0" or low) the charge on line 1642 will remain at a precharge value. Transistor 1642 is the cell read select transistor.
- All twenty four data out lines 560 are sensed simultaneously by transistor 1642 (i.e., transistor 1642 selects the processor element cells). As shown, node 1650 is isolated. This connection reduces the possibility of data loss in the cell from noise generated by reading other processor element cells.
- Each 128 cell section has a comparator 1634 on the output line to sense the signals. A reference voltage is applied to comparator input 1636.
- Source 1638 of transistor 1630 is connected to VDD. This is not a requirement, however, and source 1638 may be connected to another voltage level.
- Figures 8a-d illustrate voltage levels at several lines and nodes of the DOR circuit.
- Figure 9 illustrates an alternative DOR cell.
- PEs 150 for video applications utilize a 40-bit wide input data bus 118 and a 24 bit wide output data bus 170.
- These bus widths in combination with high clocking speeds of 8fsc (35ns) result in a large power drain and noise on the bus lines if the entire bus width for the 1024 DIR 154 or DOR 168 must be powered up for the entire clocking period.
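The 35 ns figure can be checked with a short calculation, assuming fsc refers to the NTSC color subcarrier frequency of 3.579545 MHz (the text does not say which standard the figure is based on).

```python
FSC_NTSC_HZ = 3_579_545          # NTSC color subcarrier frequency (assumed standard)
clock_hz = 8 * FSC_NTSC_HZ       # 8fsc clock rate
period_ns = 1e9 / clock_hz       # clock period in nanoseconds
```

Under that assumption the period comes out just under 35 ns, in line with the rounded value quoted above.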
- FIG. 10 depicts an SVP 102 input bus line 118 power drain and noise reduction control circuit 580.
- Circuit 580 reduces noise and power requirements of SVP 102 during a DIR 154 write.
- the 1024 by 40 DIR array 154 is segmented into eight segments or portions 586a-h, each including 128 PEs 150. Data is clocked into memory locations of each 128 DIR segment 586 by a segment of commutator 148 operating under control of a corresponding control unit 602.
- Control unit 1 (602a) has a segment of clock inputs 608 timed to be in sync with the horizontal scanning rate of the input video data signals on line 118.
- Each of the eight control units 602 is connected to receive a reset signal 610.
- Control unit 602 output signals include a commutator enable signal 151 for enabling the commutator 588 for operation as previously described.
- the individual control unit 602 output signals also include a power up output signal 606 for powering up the next adjacent control unit for operation when data signal write to the presently operating section is near completion. For example, once data read from line 118 to the DIR section 586a is near completion, the next adjacent control unit 602b enables its commutator segment 588b to be ready for a data write.
- a signal on line 604a powers down previous control unit 602a since it has completed writing data to segment 586a.
- This power up/power down control sequence is repeated for each section until all 1024 DIRs have been loaded. In this fashion only the commutator for the group of DIRs being written to is powered up during a portion of the clock cycle.
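A rough model of this segment-by-segment power sequencing is sketched below. The eight segments of 128 elements come from the text; the `lead` parameter, which decides how early the next segment is woken, is hypothetical, since the description only says "near completion".

```python
SEGMENTS = 8
PES_PER_SEGMENT = 128


def active_segment(word_index):
    """Index (0..7) of the DIR segment whose commutator handles this word."""
    if not 0 <= word_index < SEGMENTS * PES_PER_SEGMENT:
        raise ValueError("word index outside the 1024-word line")
    return word_index // PES_PER_SEGMENT


def powered_segments(word_index, lead=4):
    """Segments powered while writing `word_index`.

    `lead` (words before a boundary at which the next segment is woken)
    is a hypothetical parameter, not a value from the patent.
    """
    seg = active_segment(word_index)
    powered = {seg}
    words_left = PES_PER_SEGMENT - 1 - (word_index % PES_PER_SEGMENT)
    if words_left < lead and seg + 1 < SEGMENTS:
        powered.add(seg + 1)
    return powered
```

At most two of the eight segments are powered at any time in this model, which is the source of the power and noise saving the passage describes.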
- the DIR data in all sections 586a-h is clocked into RF0 while the controller reset signal is made active and a new scan line is ready for input.
- control circuit 580 is shown comprising subcircuits including flip-flops 614, 620 and 622.
- a reset signal at input 610 triggers the S or set input of flip-flops 614 and 620a.
- the same reset signal 610 triggers the clear inputs to flip-flops 620b-620g and triggers the reset input to flip-flop 622.
- When the set input of flip-flop 620a is triggered, its Q output is activated to enable drivers 628.
- Control circuit 658 includes DIR select transistors, such as transistors 670 and 672 for the left half and 674 and 676 for the right half.
- Select transistor 670 has its source and drain connected between the DIR and the processor element sense amp 678.
- the gate of transistor 670 is connected to the output of AND gate 682.
- Input lead 692 of AND gate 682 receives a XFERLEFT or XFERRIGHT signal.
- Transistor 672 is connected in a similar manner between DIR 650 and sense amp 678. Similarly connected are transistors 674 and 676 of segment 652. Each DIR of each segment control circuit also includes a two transistor network which forces the sense amps to a known state as desired during operation. These are transistors 662 and 664 for the left half DIRs and transistors 666 and 668 for the right half DIRs.
- Transistor 662 has its source connected to the source of transistor 670 and its drain is grounded. Similarly, the source of transistor 664 is connected to the source of transistor 672. The drain of transistor 664, however, is connected to VDD. The gates of transistors 662 and 664 are connected to the output of AND gate 684. AND gate 684 has two inputs. Input 688 is connected to the output of inverter 686, the input of which is connected to the XFERLEFT/XFERRIGHT signal. Input 690 of AND gate 684 is connected to control bit C2.
- the control output from AND gate 684 is cross coupled from segment half 650 to 652 such that the output controls transistor 662 and 664 on the left side and transistors 674 and 676 on the right side.
- the output of AND gate 682 is similarly cross coupled between the left and right halves of processor 102. On the left side gate 682 output controls transistors 670 and 672. On the right side gate 682 controls transistors 666 and 668.
- a high level on the XFERLEFT and C2 signals results in a low signal output from AND gate 684 and a high signal output from AND gate 682.
- This selects the contents of left side DIRs for transfer to RF0 and activates the right side DIRs for loading.
- a low or XFERRIGHT signal on lead 692, while C2 is 1, selects the left side DIRs for loading and the right side DIRs for transfer of data to RF0. This sequence is repeated so that the DIR scan continually receives and transfers data alternately in a piston-like manner.
- Figure 15 shows an alternative scheme for recovering the originally transferred data. Instead of recovering the even and odd addresses separately, the drains of transistors 664 and 668 in Fig. 13 can be tied to ground and odd and even addresses can be treated equally. The following would occur.
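The alternating selection can be summarized as a small truth-table model. Gate 684 is described as the AND of the inverted transfer signal and C2; the second input of gate 682 is not stated in this text, so gating it with C2 is an assumption, as is reporting nothing selected when C2 is 0.

```python
def dir_select(xfer_left, c2):
    """Outputs of AND gates 682 and 684 mapped to left/right DIR roles.

    Gate 684 is AND(NOT xfer, C2) per the description. Gate 682's second
    input is assumed to be C2; the C2 = 0 result is a simplification.
    """
    gate_682 = bool(xfer_left and c2)
    gate_684 = bool((not xfer_left) and c2)
    if gate_682:
        return {"transfer": "left", "load": "right"}
    if gate_684:
        return {"transfer": "right", "load": "left"}
    return {"transfer": None, "load": None}
```

Toggling the transfer signal while C2 stays at 1 swaps the roles of the two halves, which is the back-and-forth behavior the passage describes.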
- Fig. 16 depicts the DIR control circuit of Fig. 13 in greater and slightly different detail.
- Fig. 17 depicts the DOR control circuit of Fig. 13 in greater and slightly different detail.
- Register Files are comprised of dynamic cells which are suitably refreshed in successive refresh periods to maintain their contents. Only those addresses which are used by the software need be refreshed. All remaining addresses may go without refresh since their data is not needed.
- a refresh operation is simply a read to each address requiring data retention; therefore, in many applications, the software program will keep the RFs refreshed if the software loop is repeated more frequently than the refresh period.
- Refreshing all 256K bits in SVP 102 requires only 64 cycles. This is because each RF actually reads and refreshes 2 bits at a time (for a total of 4 bits per PE). To perform a complete refresh to all of SVP 102, read each RF into any Working Register, increment the address by two each time and repeat 64 times. The following program illustrates a refresh operation.
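The patent's own listing is not reproduced in this text. As a rough stand-in, the arithmetic behind the 64-cycle figure can be sketched in Python; the constant and function names are illustrative, and the sketch only enumerates the addresses to read rather than modeling SVP assembly.

```python
RF_BITS = 128        # addressable bits per register file, per PE
BITS_PER_READ = 2    # each read refreshes two bits of each register file
PES = 1024


def refresh_addresses():
    """Addresses to read (in both register files) for one full refresh."""
    return list(range(0, RF_BITS, BITS_PER_READ))


addresses = refresh_addresses()
total_bits = PES * 2 * RF_BITS   # two register files per PE
```

Reading both register files at each of these addresses covers every bit once, and the length of the list matches the 64 repetitions stated above.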
- each WR comprises a data selector or multiplexer and a Flip/Flop. All four registers are clocked at the same time by internal SVP timing circuits shortly after valid data arrives from the RFs.
- Table 12 shows illustrative sources of data for each of the four Working Registers.
- WRA 238, the Addend/Minuend Register is a general purpose working register, and is used in most operations involving ALU 164.
- WRA is the second 256 of two inputs to multiplier block 258 in the ALU 164, and is the positive term entering adder/subtractor block 260.
- WRA is also an input to C MUX 244.
- Data selector 236 (n-to-1 multiplexer) chooses one of ten possible sources of data for WRA 238 as a function of Control lines C17, C16, C15, and C8 as shown in Table 14. Additionally, the data taken from lines R, R2, L, and L2 can be from 1 of 4 sources within the selected near-neighbor 160.
- WRB 242, the Addend/Subtrahend Register, is a general purpose working register, and is used in most operations involving ALU 164. In a subtraction operation, WRB 242 is always subtracted from WRA 238. WRB is also an input to the UR MUX 305.
- Data selector 240 chooses one of ten possible sources of data for WRB as a function of Control lines C14, C13, C12, and C8 as shown in table 15. Additionally, the data taken from lines R, R2, L, and L2 can be from 1 of 4 sources within the selected near-neighbor 160.
- WRC 248, the Carry/Borrow register is the Carry (or Borrow) input to ALU 164.
- In multi-bit additions, WRC 248 holds the CY 264 from the previous addition between bits, while in multi-bit subtractions, WRC 248 holds the BW 266 bit.
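The role of a carry register in a multi-bit addition performed one bit at a time can be sketched as below. The little-endian bit lists and the function name are assumptions for illustration; the variable `carry` stands in for WRC.

```python
def bit_serial_add(a_bits, b_bits):
    """Add two little-endian bit lists one bit per step.

    `carry` plays the role of WRC: it holds the carry produced by the
    previous bit position. Bit ordering is an assumption.
    """
    carry = 0
    total = []
    for a, b in zip(a_bits, b_bits):
        s = a ^ b ^ carry                    # sum bit (SM)
        carry = (a & b) | (carry & (a ^ b))  # carry out (CY)
        total.append(s)
    total.append(carry)
    return total
```

For example, adding 5 and 3 as three-bit values passes a carry through every bit position before the final carry becomes the top bit of the result.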
- WRC output goes to A, B and M registers and to RF0 MUX1.
- Global output signal 824 is equivalent to the logical OR 852 of all 1024 UR lines 178 exiting the PEs. That is, if one or more PEs 103 in Processor Array 102 outputs a logical ONE level on its UR line 178 the GO signal 824 will also output a logical ONE. The GO signal is active high.
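The global output behavior is a wide OR, sketched here for a list of per-element UR values. The function name and list representation are illustrative.

```python
def global_output(ur_lines):
    """GO is the logical OR of every PE's UR line (active high)."""
    return int(any(ur_lines))
```

A single element raising its UR line is enough to raise GO, which makes the signal usable as an array-wide flag.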
- Fig. 19 also shows the generation of the UR signal exiting PE(n) and its relation to the global flag signal, GO (Global Output).
- the near-neighbor communications lines are brought out to the outside so that multiple SVP's may be cascaded if a processing width of more than 1024 bits is required.
- On the left of SVP 102 are L and 2L outputs and L and 2L inputs. To the right there are R and 2R outputs and R and 2R inputs. To avoid confusion with the interconnect, these pins are named CC0L 792, CC1L 794, CC2L 796, CC3L 798 and CC0R 800, CC1R 802, CC2R 804, CC3R 806, so it is only necessary to connect CC0L to CC0R, etc.
- Figure 20 depicts cascading interconnection for 2 or more SVPs.
- the inputs at the extremes should be grounded in most cases as in the figure, but this depends on the particular application.
- An alternative interconnection of SVPs is depicted in Fig. 21.
- the interconnect of Figure 21 allows the image in a video processing system to be wrapped around a cylinder by providing the wrap around connection. When using these lines, a wait-stated cycle must be used with instructions which involve R/L/2R/2L transfers to allow sufficient propagation time between SVP chips.
- An internal bus timing diagram for a wait-stated single instruction is depicted in Fig. 24.
- All instructions require only one clock cycle to complete, but the duration of that clock cycle varies depending on the type of cycle.
- the two cycle lengths are 'normal' and 'extended.'
- the length of an 'extended' cycle is approximately 1.5 times the length of a 'normal' cycle.
- the 'extended' time allows for the wait portion of the Wait-stated Single Instruction, or for the additional ALU operations performed during the Double Instruction.
- the Idle Instruction is extended only to further reduce power.
- There are two control bits that set the mode of the instruction for the current cycle.
- the four modes are shown in Table 18 as a function of Control bits C23 and C22.
- the SVP Assembler and hardware are capable of automatically generating and executing an instruction which is the equivalent of two single instructions but requires an extended cycle for execution. An overall throughput advantage is gained from this ability.
- a <READ>-<REGISTER>-<ALU>-<REGISTER>-<ALU>-<WRITE> sequence is performed.
- the additional time in the extended cycle is used for a second ALU and Register operation. This is possible because extended cycles work from a 2-bit Cache for each Register File during read/write operations.
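The throughput gain can be estimated from the stated cycle lengths: an extended cycle is approximately 1.5 times a normal cycle, and a double instruction does the work of two single instructions in one extended cycle. The sketch below uses arbitrary time units; the names and the instruction counts are illustrative.

```python
NORMAL = 1.0      # length of a normal cycle, arbitrary units
EXTENDED = 1.5    # approximate length of an extended cycle


def program_time(singles, doubles):
    """Time for `singles` normal-cycle instructions plus `doubles` double
    instructions, each double replacing two singles in one extended cycle."""
    return singles * NORMAL + doubles * EXTENDED


unpaired = program_time(singles=100, doubles=0)   # 100 single instructions
paired = program_time(singles=0, doubles=50)      # same work, all paired
```

In the best case, where every instruction can be paired, the same work takes about three quarters of the time under these figures.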
- the SVP Assembler determines how to make the best use of these Caches by converting Single Instructions to Double Instructions whenever possible. This operation may be turned on and off by two assembler directives, DRI and ERI respectively.
- the double instruction will be used if the patterns of two sequential instructions are as in Table 21a, b.
- the register file addresses only need to be as indicated if they are being read or written.
- the assembler will optionally assemble these four types of instruction patterns into double instructions and their respective opcodes become as shown in Table 22.
- the External Bus 130 operation for the SVP chip is simple, as the only requirement is to present the device with a 38 bit microcode instruction (24 control, 14 address) and strobe PCK with the proper setup and hold times. Since the Data Input 154 and Data Output 168 Registers are asynchronous to the Processor Array 105, some form of synchronization is required prior to the Processor Array 105 transferring data to/from the DIR or DOR. In video applications, this is possibly handled by transferring during horizontal blanking time.
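The 38-bit instruction width follows from 24 control bits plus 14 address bits, the latter presumably being the two seven-bit register file address fields described earlier. The sketch below packs such a word; the field order is a hypothetical layout chosen for illustration, not the device's documented bit assignment.

```python
def pack_microcode(control, rf0_addr, rf1_addr):
    """Pack 24 control bits and two 7-bit addresses into one 38-bit word.

    The field order (control in the high bits, then the RF1 address, then
    the RF0 address) is a hypothetical layout.
    """
    if not 0 <= control < (1 << 24):
        raise ValueError("control field is 24 bits (C23..C0)")
    if not (0 <= rf0_addr < 128 and 0 <= rf1_addr < 128):
        raise ValueError("each address field is 7 bits")
    return (control << 14) | (rf1_addr << 7) | rf0_addr
```

A controller model would present one such word per PCK strobe; setup and hold timing is outside the scope of this sketch.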
- Fig. 22 shows the sequence of events on the internal buses 171 of the SVP 102 for Single Instruction Mode.
- the SVP Assembler automatically generates what is called a Double Instruction from two single instructions providing they are identical except for address fields.
- Fig. 23 shows the sequence of events for the double instruction cycle.
- the Idle cycle allows the PA 105 to be mostly powered down until needed. This is shown in Fig. 25.
- the SVP is programmed at the microcode level. These microcode 'sub-instructions' combine to make the instruction portion of an instruction line in the SVP Assembly language. This section explains how to construct these instructions and how the assembler checks for conflicts. Some of the major topics in this section are:
- the SVP Assembly source is similar to that of other assemblers; each line contains an instruction, an assembler directive, comment, or macro directive.
- the SVP assembly line differs in that a single line containing one instruction comprises several sub-instructions. These sub-instructions combine to generate a single opcode when assembled.
- An 'instruction line' is made up of an optional label, one or more sub-instructions plus an optional comment field.
- a valid 'instruction' is made up of one or more sub-instructions such that no sub-instruction conflicts with another.
- a 'sub-instruction' comprises three parts: a destination operand, an assignment operator (the SVP Assembler recognizes the ' _' sign), and a source operand, in that order.
- a source operand may be specified more than once in an instruction line:
- a destination operand may be specified in an instruction line:
- Each Register File may be specified more than once as a source operand if the address is the same for each sub-instruction:
- Only one of RF0, RF1, DIR and DOR may be specified as a destination operand in an assembly line:
- If R0, R1, INP, or OUT is specified as both a source operand and a destination operand, the source and destination addresses must be the same:
- any rule demonstrated above for Register Files R0 and R1 applies to the INP (DIR) and OUT (DOR) instructions as well, with the exception that the address range of 'n' and 'p' is 0 to 127 while 'm' is 0 to 39, and 'q' is 0 to 23, respectively.
- Figure 26 shows an alternative embodiment of a processor element 150.
- processor element 151 of Fig. 26 includes four sense amps per processor element: two for DIR/RF0 write and read operations and two for DOR/RF1 write and read operations. With the Fig. 26 embodiment, register file 0 and register file 1 each read two bits of data in each memory cycle (total of four bits per cycle). However, only
- the control portion of the opcode is made up of eight octal digits. Each of the digits corresponds to one of the circuit blocks of Fig. 5 so a little familiarity with the opcode format permits the user to read the opcode directly. Table 26 indicates which bits correspond with which blocks. 'CIC' is Conditional Instruction Control.
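The octal reading of the control field described above can be sketched in Python. This is only an illustration: the function names are hypothetical, the most significant digit is assumed to come first, and the mapping of each digit to a circuit block (Table 26) is not reproduced here.

```python
def control_octal_digits(control: int) -> list[int]:
    """Split a 24-bit control field into eight octal digits.

    Assumption: the most significant digit is listed first. Which digit
    drives which circuit block is defined by Table 26 of the source and
    is deliberately not modeled here.
    """
    if not 0 <= control < (1 << 24):
        raise ValueError("control field must fit in 24 bits")
    # Walk from the top 3-bit group (bits 23..21) down to bits 2..0.
    return [(control >> shift) & 0o7 for shift in range(21, -1, -3)]


def control_octal_string(control: int) -> str:
    """Render the control field as a zero-padded 8-digit octal string."""
    if not 0 <= control < (1 << 24):
        raise ValueError("control field must fit in 24 bits")
    return format(control, "08o")
```

Because 24 bits divide evenly into eight 3-bit groups, each printed digit can be read on its own without any bit arithmetic.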
- a controller 128 is shown connected to SVP 102 and to a software program development and television operation emulation system 900.
- Development system 900 includes a host computer system 912, a host computer interface logic 914, a pattern generator 916 and a data selector 918.
- Host computer system 912 can take a variety of forms in development system 900. Such forms include a personal computer, a remote control unit, a text editor or other means for developing a control algorithm.
- Host computer interface logic 914 includes circuitry for emulating a television set's main micro-controller. In development system 900 host computer interface logic 914 cooperatively works with pattern generator 916 to interface host computer system 912 and local communication bus 930. Pattern generator 916 generates timing and other patterns to test program algorithms for algebraic accuracy. Pattern generator 916 also provides real-time test video data for SVP algorithm and hardware debugging.
- a data pattern programmer (or selector) 918 is used to select data for input to SVP from among the forty input lines 920 or from data patterns generated by data pattern generator 916. As depicted data selector 918 is inserted in series between the forty data input lines 920 and the forty SVP input pins 118.
- a capture (or field) memory 121 is provided to capture processed data from 8 of the 24 output lines 170. The desired 8 of the 24 output lines are selected by a 3-to-1 octal multiplexer 171. In this manner a field of processed video data can be captured (or stored) and provided back to host interface 914 and/or host computer system 912 for real time analysis of the SVP's operations.
- Hardware interface 932 between host computer interface logic 914 and host computer 912 is achieved in development system 900 by conventional parallel interface connections.
- a conventional EIA RS-232C cable can be used when interface speed is not a primary concern.
- An IIC bus manufactured by PHILLIPS ELECTRONICS CORPORATION can be used as interface line 930 between host computer interface logic 914 and controller 128.
- In video signal processing applications, controller 128 generates control signals for the SVP processor device 102 which are synchronized with the vertical synchronization component and horizontal synchronization component of the incoming television signal on line 110 of Fig. 1.
- FIG. 30 depicts a television micro-controller 1700.
- Micro-controller 1700 presets internal television circuitry upon initialization (system power-up).
- Micro-controller 1700 receives external signals, such as those from a personal computer key pad 1702, a remote control unit 1704 or a video signal decoder 1712, from register 962.
- Mask enable logic output on line 982 controls whether master controller address counter 984 will continue addressing in sequence or perform a jump.
- the output of multiplexer 968 is also provided via line 978 as an input to multiplexer 980.
- Multiplexer 980 has nine data output lines 986 providing inputs to master controller address program counter 984.
- the address on lines 988 from master controller address counter 984 address memory locations in master controller program memory 990.
- the address signal is also provided to return register 994 via lines 992 for subroutine call operations.
- the output of register 994 is provided via line 996 as another input to multiplexer 980.
- Master controller program memory 990 has 14 output lines 998.
- the microcode output includes address and operational mode instructions for the vertical timing generator 904 and the horizontal timing generator 906. These signals are provided to HTG and VTG via lines 936 and 932. Some of the microcode output bits on lines 998 are also provided to and decoded by instruction decoder 1002 which in turn provides operation control signals via lines 1004 to multiplexer 980 and master controller program address counter 984. Additionally, microcode output bits from lines 998 are provided via lines 1008 as another input to multiplexer 980 and as control for multiplexer 968.
- Master controller 902 also includes auxiliary register control logic 1012.
- Nine signal lines 1198 from asynchronous-to-synchronous conversion logic 958 are connected as an input to auxiliary register control logic 1012. Operation of auxiliary registers is discussed hereinafter with reference to Fig. 40.
- vertical timing generator 904 of drawing Fig. 31, is depicted in greater detail.
- Vertical Timing Generator (VTG) 904 generates control codes on outputs 944, 940 and 942 for Horizontal Timing Generator 906, Constant Generator 908, and Instruction Generator 910 respectively.
- constant generator 908 also provides timing to circuits requiring a resolution of one horizontal line via external control line 952.
- Vertical timing generator 904 includes a vertical sequence counter (VSC) 1020.
- Vertical sequence counter 1020 is an up counter.
- Counter 1020 receives a control mode signal from master controller 902 via lines 932. The mode signal designates, among other things, whether, for example, a picture-in-picture operation is desired.
- the mode signal is essentially a starting address for vertical sequence counter 1020.
- VSC 1020 provides an address for vertical sequence memory 1024.
- Vertical sequence memory 1024 stores timing and other signals for initializing and synchronizing operations of horizontal timing generator 906, instruction generator 910 and constant generator 908.
- the information sequences stored in vertical sequence memory 1024 are repeated during a typical operation.
- Memory 1024 in addition to storing the information sequences stores the number of times the stored sequences are repeated.
- Sequence memory 1024 can comprise Random Access Memory (RAM), Read Only Memory (ROM) or other forms of Programmable Logic Arrays (PLA).
- the repeat number is provided via line 1027 to repeat counter 1028.
- Repeat counter 1028 is a down counter which counts down from the repeat sequence number.
- a control signal is provided via line 1032 to counter control logic 1034.
- Counter control logic 1034 provides a signal on line 1036 to signal vertical sequence counter 1020 to step to next address location. Another signal is provided via line 1040 to increment vertical loop counter 1030.
- Initialization of counter control logic 1034 is controlled by the vertical and horizontal synchronizing signal of the incoming television signal.
- the synchronizing signals are provided via lines 1038.
- the control component of the signal on line 1026 is provided to vertical loop counter 1030 to start the loop counter at a desired location.
- the vertical loop counter output provided on lines 1042 addresses memory locations in vertical loop memory 1044.
- Memory 1044 can also be RAM, ROM or PLA.
- Memory 1044 stores loop patterns (programs), starting addresses and labels for HTG, VTG and instruction generator (IG).
- Control data bits from vertical loop memory 1044 are provided to repeat counter 1028 to indicate that a looping sequence is complete and to increment. Bits are also provided to register load sequencer 1054.
- Register load sequencer 1054 includes a decoded clock to control latches 1048, 1050 or 1052.
- Register load sequencer 1054 also provides an increment signal for incrementing vertical loop counter 1030. Data is clocked from latches 1048, 1050 and 1052 at a rate up to once every horizontal line time.
- vertical loop counter 1030 provides an output signal 1042 to vertical loop memory 1044 which in turn fans out mode control signals which are latched by horizontal timing generator mode latch 1048, constant generator mode latch 1050, instruction generator mode latch 1052, register load sequencer 1054 and repeat counter 1028.
- Register load sequencer 1054 provides an output to vertical loop counter 1030 and to latches 1048, 1050 and 1052.
- Each of the mode latches provides its respective signal to the horizontal timing generator, the constant generator and the instruction generator on output lines 944, 940, and 942 when triggered.
- Vertical timing generator 904 functions also include changing the horizontal timing to a different mode and changing operational instructions to process television signals in zoom or with a different filter algorithm. Jump flag arbitration control logic 1224 receives a horizontal synchronization signal 1218, a mode control signal 1220 from vertical timing generator 904, and flag signals 1222.
- Jump flag arbitration logic 1224 provides 5 of eleven vectored jump address bits to input 1226 of instruction program register multiplexer (IPRX) 1230. The five bits on lines 1226 are the least significant of the eleven total.
- IPRX instruction program register multiplexer
- Jump flag arbitration logic 1224 also provides a jump signal 1228 to instruction decoder 1234.
- Instruction decoder 1234 provides multiple output signals.
- a line 1236 carries one of the output signals back to an input of jump flag arbitration logic 1224.
- Lines 1238 carry a 4-bit decoded multiplexer output control signal 1238 to IPRX 1230.
- Lines 1240 carry control signals to increment control logic 1242 and to a global rotation address generator (RF1) 1244 and to a global rotation address generator (RF0) 1246.
- the 4-bit control signal provided on lines 1240 instructs the global rotation address generator 1244 and 1246 to load or shift data for their respective register files.
- the signal provided to increment control logic 1242 sets the increment of address counters 1290 and 1292 to + 1 if single instruction operation is implemented and to + increment if double instruction operation is implemented.
- IPRX 1230 provides an 11-bit instruction address on lines 1248 to instruction program register 1250.
- Output signal 1252 from instruction program register 1250 is an address for instruction program memory 1258.
- Address 1252 is also provided back to the HOLD input 1254 of IPRX 1230.
- the hold input holds the output memory address for a readdress if desired.
- Address 1252 is also provided to a + 1 increment control logic 1256.
- Increment logic 1256 increments return register 1264 or instructs the IPRX 1230 to step to the next address. Return register is latched by a CALL input signal.
- Instruction program memory (IPM) 1258 stores the SVP system array instruction set in microcode.
- the array instruction set is presented earlier herein. A complete description of the 44 bits is provided therein.
- the 44 instruction bits from instruction program memory 1258 are branched to various locations as set forth in the array instruction set. For example, bit number forty-three is a break point flag. This bit is provided via line 1270 to break point controller 1274.
- Other control bits are provided to the VECTOR, JUMP and CALL inputs of IPRX 1230, and to input 1282 of instruction decoder 1234.
- a mask value bit for selecting a flag is provided via line 1223 to jump flag arbitration logic 1224.
- If breakpoint controller 1274 is enabled during a break point bit read, a break signal is provided on lines 1280 and 1284 to stop operation to provide a test. Breakpoint controller 1274 also receives a breakpoint line (BPline) input signal 1276 and a reset signal input 1278. Instruction bits 0 through 23 are branched from Instruction program memory (IPM) 1258 to control code latch 1288. Bits 25 through 31 are branched to RF0 address counter 1290. Bits 32 through 38 are branched to RF1 address counter 1292. Bits 39 through 42 are branched to repeat counter 1294 and to increment control logic 1242.
- IPM Instruction program memory
- Increment control logic 1242 also receives inputs 1240 from the instruction decoder, which also provides a 4-bit control input to global rotation address generators (RF1) 1244 and (RF0) 1246.
- the latched instruction output 1194 from control code latch 1288 is provided to auxiliary register and controller logic 1196 which also receives global variables signal on line 1198.
- Output 1194 is also provided directly as microcode bits 0 through 23 on line 1200.
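The way the 44 instruction bits fan out from instruction program memory 1258 can be sketched as a field decoder. This is a minimal illustration under stated assumptions: bit 0 is taken to be the least significant bit, the names `InstructionFields`, `bits` and `decode_instruction` are hypothetical, and bit 24 is left undecoded because this passage does not say where it is routed.

```python
from typing import NamedTuple


class InstructionFields(NamedTuple):
    control: int       # bits 0..23, to control code latch 1288
    rf0_address: int   # bits 25..31, to RF0 address counter 1290
    rf1_address: int   # bits 32..38, to RF1 address counter 1292
    repeat: int        # bits 39..42, to repeat counter 1294
    breakpoint: bool   # bit 43, break point flag


def bits(word: int, low: int, high: int) -> int:
    """Extract bits low..high (inclusive) of word, bit 0 being the LSB."""
    width = high - low + 1
    return (word >> low) & ((1 << width) - 1)


def decode_instruction(word: int) -> InstructionFields:
    """Split a 44-bit microcode word into the fields named in the text."""
    if not 0 <= word < (1 << 44):
        raise ValueError("instruction word must fit in 44 bits")
    return InstructionFields(
        control=bits(word, 0, 23),
        rf0_address=bits(word, 25, 31),
        rf1_address=bits(word, 32, 38),
        repeat=bits(word, 39, 42),
        breakpoint=bool(bits(word, 43, 43)),
    )
```

The two 7-bit address fields match the 0 to 127 range of register file bit locations mentioned elsewhere in the description.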
- Instruction Generator 910 feeds the SVP processor array with a stream of data, instructions, addresses, and control signals at a desired clock rate.
- the generated microcode manipulates and instructs the processor element arithmetic logic units, multiplexer, registers, etc. of SVP 102 of Fig. 1.
- Instruction generator 910 can, in addition to the core instructions, generate instructions which allow the SVP core processor to operate in the manner of a simple microprocessor. In this mode, instructions such as unconditional jump, call, and jump on certain flag test instructions flag 0, 1, etc., will be performed. The flags can be externally tested.
- Instruction generator 910 can receive internal control codes from Vertical Timing Generator 904 or Master Controller 902, and receive flags from Horizontal Timing Generator 906.
- Instruction microcode stored in Instruction Program Memory (IPM) 1258 is fetched, interpreted and executed by Instruction Decoder 1234. Some of the decoded signals can be used as the address selection of Instruction Program Register Multiplexer (IPRX) 1230 to change the address latched in the Instruction Program Register (IPR) 1250.
- the instruction codes control the various types of Instruction Sets, for instance, conditional or unconditional jump, subroutine call or return, vector addressing with updated mode value, single or double instruction, auxiliary register control for the distribution of global variables, and the global rotation for RAM FILE(0 and 1) addresses, etc.
- break point controller 1274 sets the content of IPR 1250 with a pre-determined value to move the flow of the program into specific subroutines in order to test the data processed by the SVP operations.
- This break function can be controlled by the maskable input of BPLINE 1276 Horizontal line within a given frame of the video signal.
- Repeat counter 1294 reduces the required amount of memory locations in IPM 1258 by representing a number of successive, identical instructions as a combination of this instruction code and the number of "1" end of loop signal is encountered.
- the sequence then proceeds to step 1214 and control logic 1170 resets loop counter 1182 and decrements repeat counter 1128 via signals on lines 1186 and 1192 respectively.
- At step 1216, if repeat counter 1128 has not reached zero the sequence returns to step 1207. If repeat counter 1128 has reached zero the sequence proceeds to step 1221 and control logic 1132 increments sequence counter + 1 and the sequence returns to step 1206 and the steps are repeated. If at step 1223 the sequence counter count is greater than the number of sequences the operation stops at step 1227.
- In Fig. 42 there is depicted a five pole finite impulse response (FIR) filter 792 of N-bit resolution which can be implemented in the present SVP device 102.
- FIR finite impulse response
- 2N instructions can be saved over single near-neighbor architecture.
- processor 102 requires N instructions to move N bits from 2L to 1L to perform an add.
- N instructions are required to move N bits from 2R to 1R.
- 2N instructions are saved over a single near-neighbor communication network. If for example, a 12-bit FIR is implemented the second-nearest-neighbor arrangement would require less than 68% of the execution time of a single-near-neighbor network.
- SVP is a software programmable device
- filters and other functions can be implemented in addition to the FIR of Fig. 42 (horizontal filter). These include for example, vertical and temporal FIR filters and IIR filters (vertical and temporal).
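The data access pattern behind the five pole horizontal FIR can be sketched as follows. This is an illustration only, not the SVP microcode: each list index stands for one processor element holding one pixel, and every element reads its own value plus the 2L, 1L, 1R and 2R neighbors directly. The function name `fir5`, the coefficient ordering and the zero padding at the ends of the line are assumptions.

```python
def fir5(line: list[int], coeffs: list[int]) -> list[int]:
    """Five-tap horizontal FIR over one video line.

    coeffs is ordered (2L, 1L, center, 1R, 2R). Samples beyond either
    end of the line are treated as zero (an assumption, the source does
    not describe edge handling).
    """
    if len(coeffs) != 5:
        raise ValueError("exactly five coefficients are required")
    n = len(line)

    def pixel(i: int) -> int:
        return line[i] if 0 <= i < n else 0

    offsets = (-2, -1, 0, 1, 2)
    return [
        sum(c * pixel(i + off) for c, off in zip(coeffs, offsets))
        for i in range(n)
    ]
```

In this model the 2L and 2R reads are direct, which corresponds to the second nearest-neighbor connections; on a single near-neighbor network those two operands would first have to be shifted one element inward, which is the source of the 2N instruction saving stated above.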
- In Fig. 43 four line memories are illustrated: an eight bit line memory 824; a six bit line memory 826; and two four bit line memories 828 and 830. These line memories can be emulated in the present SVP device 102.
- Fig. 44a represents a register file, such as RF0 of processor element n, having bit locations 00 through 7F (0 through 127).
- the Fig. 44a register file can be broken into multiple pieces. In this example the register file is broken into two pieces, lower and upper (not necessarily equal).
- the upper part comprises bit locations 00 through 3F.
- The lower part comprises bit locations 40 through 7F. If the upper part is designated the global rotation memory, the lower part can be used as the normal operating register file.
- the global rotation part can be, for example, reorganized as "P" words of "Q" bits where PxQ is less than or equal to the total global rotation space.
- Fig. 44b is an exploded view of the upper part of Fig. 44a.
- Each line of the Fig. 44b global rotation area comprises 8-bits of the register file transposed in a stacked horizontal fashion.
- Q modulus the total global rotation space.
- the global rotation instruction is executed once each horizontal line time.
- the SVP hardware allows the setting of the value of Q and the maximum value of the global rotation space.
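The global rotation addressing can be modeled as a circular offset applied to relative addresses. This is a hedged sketch: the class name `GlobalRotation`, the offset-based formulation and the direction of rotation are assumptions; only the organization as P words of Q bits, the step of Q, and the wrap at the size of the rotation space come from the text.

```python
class GlobalRotation:
    """Circular addressing over a rotation area of P words of Q bits."""

    def __init__(self, words: int, bits_per_word: int) -> None:
        if words <= 0 or bits_per_word <= 0:
            raise ValueError("P and Q must be positive")
        self.q = bits_per_word
        self.size = words * bits_per_word  # total global rotation space
        self.offset = 0

    def absolute(self, relative: int) -> int:
        """Map a relative bit address to its current absolute location."""
        if not 0 <= relative < self.size:
            raise ValueError("relative address outside the rotation area")
        return (relative + self.offset) % self.size

    def rotate(self) -> None:
        """Advance by one word (Q bits), e.g. once per horizontal line."""
        self.offset = (self.offset + self.q) % self.size
```

With this model no data is physically moved between lines: a program always uses the same relative addresses, and one call to `rotate()` per horizontal line makes each stored word appear one position older, which is how a bank of line memories can be emulated.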
- Figure 45 is a logic diagram of global rotation address generator for register file 0 (RF0) 1246 of Fig. 36.
- Global rotation address generator for register file 1 1244 of Fig. 36 is identical and accordingly the following discussion applies to both generators.
- Global rotation address generator 1246 receives a relative register address from register file 0 address counter via lines 1291. This relative address is provided to address register locations in register file 0 via lines 948.
- Microcode bits 32 through 37 are six of the eleven bits provided via lines 1374 and 1382 from instruction program memory 1258. The six bits provided via lines 1374 define the amount of registers in the total register area to be rotated during a rotation step. This is the word length P in the previous example. For engineering design purposes the value defined by bits 32 through 37 is scaled by a factor of 2 in this example.
- the scaled P value is provided to registers 1370.
- the scaled Q value is provided to registers 1380.
- Figure 46a and 46b are parts of a flow diagram for a global rotation.
- In Fig. 47, example circuitry for pipelining of address, data, control and other signals received from controller 128 is depicted.
- the illustrative circuit comprises an address buffer 1436 providing an input 1438 to factor generator 1440, the output of which is provided to address factor decoder 1448 by driver 1444.
- the output 1450 of decoder 1448 is provided to latch 1452 which is clocked at the sample frequency provided on line 1454.
- Latch 1452 can be reset between clocking by an active low input on line 1458.
- the output of latch 1452 is provided to the control line input of the section under control, such as word line 1462 of a data input register, input register file, output register file or data output register.
- A Fig. 47 type circuit can be used on the DOR side also.
- Fig. 48 is a table of various inputs and outputs for a pipeline circuit.
- a timing diagram is provided to illustrate the improved speed of the device resulting from the ability to continuously provide signals to the SVP without requiring that the outcome of previously executed instructions be determined.
- Signal 1431 is a valid memory address signal being provided to SVP device 102 core via external contact pad 1432.
- Signal 1450 is the decoded signal output of address decoder 1448.
- Signal 1462 illustrates the signal output of driver 1456 being provided to, for example, the DIR word line. If at time t0 a valid address signal is provided, the signal is decoded and provided to the latch 1452 at time t1, whereat it is latched in at time t3. Upon sampling, the decoded address is provided to selected word lines.
- the latch holds the state of the current operation's address while the new address (for the next operation) is pipelining through the input buffer, factor generator/driver, wiring and address decoder.
- the present pipelining technique applies to data signals, control signals, instructions, constants and practically all other signals that are provided in a predeterminable sequence.
- Fig. 50 illustrates how to further pipeline the signals by configuring the input buffer as a latch. These latches can then be reset and clocked by some derivation of the reset 1482 and/or sample signals 1484.
- Contact pad 1486 receives a master clock input signal which is eventually provided throughout the pipelining system.
- clock generator 1496 generates the latching and reset signals for the system.
- a device of this type can be provided for all control and address signals from the controller.
- Figure 51 depicts a controller circuit suitable for controlling distribution of global variables. Controller as previously discussed provides addressing and control and data signals to the SVP processing elements. To load variables into the SVP and distribute those variables globally the controller hardware of Fig. 51 can be used.
- the controller can be modified to include a set of auxiliary registers 1570 and an addressing structure which modulates the M registers of the SVP processing elements to distribute the variables.
- the auxiliary registers and modulation section 1196 comprises an auxiliary storage register 1570 such as a RAM memory and a 2-to-1 multiplexer (MUX) 1574.
- Auxiliary registers 1570 have an 8-bit load data input 1562, a data write input 1564 and a register address or read port 1568 organized as 5-bit by 1.
- the auxiliary register write port is organized as 2-bit by 8.
- Auxiliary register output 1572 is provided to trigger the High input of MUX 1574.
- the Low input to MUX 1574 is bit C18 of the opcode output.
- Line 1576 provides an auxiliary register instruction enable signal to MUX 1574.
- the auxiliary registers 1570 are discussed in greater detail hereinafter.
- a memory map of the register file 1 (RF1) and data output register (DOR) of a processor element is depicted.
- the auxiliary register address in the memory map is part of the unused addresses for RF1/DOR.
- the act of addressing the area "above" the DOR address in the memory selects the auxiliary registers.
- Data stored in the auxiliary registers are written as 4 words of 8-bits each, but read as 32 words of 1-bit each.
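The asymmetric port organization of the auxiliary registers (a 2-bit write address selecting one of four 8-bit words, a 5-bit read address selecting one of 32 single bits) can be sketched as below. The class name `AuxRegisters` is hypothetical, and the bit order within each word (least significant bit at the lowest read address) is an assumption that the text does not confirm.

```python
class AuxRegisters:
    """Auxiliary register model: written as 4 x 8 bits, read as 32 x 1 bit."""

    def __init__(self) -> None:
        self.words = [0, 0, 0, 0]

    def write(self, word_address: int, value: int) -> None:
        """Write one 8-bit word through the 2-bit-addressed write port."""
        if not 0 <= word_address < 4:
            raise ValueError("write address is 2 bits (0..3)")
        if not 0 <= value < 256:
            raise ValueError("value must fit in 8 bits")
        self.words[word_address] = value

    def read_bit(self, bit_address: int) -> int:
        """Read one bit through the 5-bit-addressed read port.

        Assumption: read address = 8 * word + bit, with bit 0 the LSB.
        """
        if not 0 <= bit_address < 32:
            raise ValueError("read address is 5 bits (0..31)")
        word, bit = divmod(bit_address, 8)
        return (self.words[word] >> bit) & 1
```

Reading one bit at a time fits the bit-serial processor elements: a global variable loaded as a byte by the controller can be stepped out bit by bit as the instruction sequence walks through successive read addresses.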
- Figure 54 depicts an alternative embodiment of the present synchronous vector processor/controller chip.
- the instruction generator and auxiliary registers are included on chip with the SVP processor core array.
- controller 1626 and SVP device 1628 can be manufactured on one silicon chip forming device 1630.
- Clock Oscillator 1632 is phase locked to the transmitted television signal and provides clocking signals to the controller section.
- Clock oscillator 1634 is generally clocked to match the SVP operating speed.
- Block 1630 may contain one or more SVP devices for system 1629.
- System 1630 includes conventional tuner circuitry 1644 for tuning reception of composite or S-VHS video signals.
- Color separation and demodulation circuitry 1642 processes the tuned signal and the output is provided to SVP system 1630 in the manner previously discussed.
- a processed signal output is color modulated by circuitry 1640 and either a composite video signal or a S-VHS video signal is output from modulator 1640.
- the composite video signal is RF modulated by circuitry 1638 and provided to a television antenna input or monitor input for display.
- the processed video signal is phase and FM modulated by circuitry 1634 and recorded by head logic 1636 in the conventional manner.
- the recorded signal is read from the tape and transmitted to phase and FM demodulation circuitry 1632. Thereafter the signal can again be processed by SVP system 1630 and provided as an output.
- One or more field memories 120 may be used to capture data in the manner previously discussed with respect to Fig. 1.
- the synchronous vector processor device and controller system disclosed and described herein is not limited to video applications.
- the SVP's unique real-time performance offers flexible design approaches to a number of signal processing applications. Some of the applications are listed in Table 27.
- Figure 58 depicts a vision inspection system incorporating a SVP system.
- the system includes a video camera for viewing objects to be inspected, or otherwise analyzed.
- the camera outputs a video signal to the inputs of an A-to-D converter which digitizes the analog video signal and provides a digital input to SVP system.
- the SVP system may also be provided with stored images from a memory or mask storage source such as an optical disk.
- the SVP can provide an output to a display or other indicator means and also to a host computer.
- the host computer may be used to control a timing and control circuit which also provides signals to the analog-to-digital converter, the memory and the SVP device system.
- the visual inspection system of Fig. 58 can perform inspection of devices by comparing them to stored master images.
- the output can be an image showing differences, a simple pass/fail indicator, or a more complex report.
- the system can automatically determine which device is being inspected.
- Other types of sensors could be used as well, such as infrared, x-ray, etc.
- Pre- and post-processing of the images could be performed to further enhance the output.
- Figure 59 depicts a pattern recognition system incorporating a SVP system.
- the SVP device receives digitized input signals from the output of an analog-to-digital convertor. Stored patterns may also be provided to the SVP for processing from an external memory. The input data is processed and a pattern number is output from the SVP.
- the analog-to-digital convertor, stored pattern memory and SVP may operate under control of output signals from a control and timing circuit.
- the pattern recognition system compares input data with stored data. This system goes beyond the visual inspection system and classifies the input data. Due to the SVP's speed many comparisons can be made in real-time. Long sequences of data can be classified.
- An example speech recognition application is illustrated in Fig. 60.
- Fig. 60 depicts a speech data sample having a frequency of 8 kilohertz. Since speech is digitized at relatively low rates, 8 kilohertz, the SVP has plenty of time to perform many calculations on the transmitted speech data. An input of 1024 samples long would give approximately one-eighth second to process data, which corresponds to around 1.4 million instructions. In addition, the SVP can store many lines of data and thus recognize words, phrases, even sentences.
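The timing figures in the speech example can be checked with a short calculation. The function names are hypothetical, and the implied instruction rate is simply derived from the two numbers quoted in the text; it is not a rate stated by the source.

```python
def window_seconds(samples: int, sample_rate_hz: float) -> float:
    """Time taken to collect the given number of samples."""
    return samples / sample_rate_hz


def implied_instruction_rate(instructions: float, seconds: float) -> float:
    """Instruction rate needed to run that many instructions in that time."""
    return instructions / seconds


if __name__ == "__main__":
    t = window_seconds(1024, 8000.0)          # 1024 samples at 8 kHz
    rate = implied_instruction_rate(1.4e6, t)
    print(f"window: {t:.3f} s, implied rate: {rate / 1e6:.1f} M instructions/s")
```

A 1024-sample window at 8 kilohertz lasts 0.128 second, which is the "approximately one-eighth second" of the text, and 1.4 million instructions in that window implies a rate of roughly 11 million instructions per second.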
- Figure 61 depicts a typical radar processing system utilizing an SVP.
- Detected radar signals are transmitted from the antenna to an RF/IF circuit and the FM/AM outputs are provided to an analog-to-digital converter.
- the digitized output signal is processed by the SVP and the output is provided to a display or stored in memory.
- This system processes pulse radar data and either stores or displays the results.
- Figure 62 is a picture phone system utilizing a Synchronous Vector Device.
- Fig. 62 depicts the transmission and reception side.
- the video camera views the subject and the analog signal is digitized by an analog-to-digital convertor.
- the digitized output is provided as an input to the SVP device.
- Other inputs include tables and the output of a frame memory.
- the SVP DTMF output is filtered in the filter circuit and provided to the phone lines.
- the phone lines carry the transmitted data to an analog-to-digital convertor where the digitized signal is processed by a Synchronous Vector Device.
- the input signal may be processed along with stored data in a frame memory.
- the SVP output is converted to analog by a digital-to-analog convertor and placed in a matrix and displayed by a display.
- the picture phone system compresses input images, then encodes them as DTMF values and sends them over phone lines to a receiver. Sine tables are used to generate the tones directly in the SVP. On the receiving end the DTMF tones are digitized then detected and decompressed in the SVP.
- Figure 63a and 63b depict a facsimile system utilizing a Synchronous Vector Processor.
- Fig. 63 depicts the transmitting or sending end.
- a document scanner would scan the document to be transmitted and the scanned binary data is provided as an input to the SVP.
- Sine tables can be used to generate tones directly in the SVP.
- the SVP performs encoding and tone generation.
- the tones are output to a filter and then provided to the phone lines.
- the received data from the phone line is converted to digital by an analog-to-digital convertor and provided to the SVP for tone detection and decoding.
- the decoded SVP output is then printed by a printer.
- Figure 64 is a SVP based document scanner system which converts scanned documents to ASCII files.
- the scanner output is provided to the SVP where it is processed along with character tables and the processed output is stored in memory.
- the document scanner system digitizes data like a FAX machine, but performs pattern recognition on the data and converts it to ASCII format.
- the SVP can be used for secure video transmission.
- This system is shown in Fig. 65.
- the system includes a video signal source which provides an output to an input buffer.
- the buffered signal is provided to the SVP for processing.
- the SVP and input buffer can operate under control of a controller.
- the encoded signal from the SVP is provided to a transmitter where it is transmitted to a receiver and is again input buffered and decoded by an SVP on the receiving end.
- the SVP in the above system can encrypt a video signal by multiplying the pixel in each processor element by an arbitrary constant.
- the mapping of encryption constants to processor elements is defined by ROM coded pattern in the encoding and
- <Destination_operand> <Source Operand>
- Abbreviations are used to reduce typing and some synonyms are used to reduce confusion when entering mnemonics
- Sub-instructions whose data source depends on the value of WRM show three lines.
- '(WRM)' is the contents of working Register WRM.
- <adr2> is an 11-bit address with the 5 LSBs equal to 00000.
- the absolute address is: (<adr2> AND 07E0h) + <(mode register)>
- the table at <adr2> will most likely contain JMP instructions to subroutines within the main program; however, any instruction may be used in the table.
- the table must be located on a 5-bit boundary.
- the table must be constructed with up to 16 "OUT" instructions.
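- The address formula above can be sketched in a few lines of Python. This is only an illustration of the masking arithmetic: the function name and the sample values are hypothetical, and the range of values the mode register may take is not stated in the text.

```python
# Mask from the text: 07E0h keeps the upper 6 bits of the 11-bit address
# and clears the 5 LSBs, so the table base sits on a 32-entry boundary.
ADR2_MASK = 0x07E0

def table_entry_address(adr2, mode_register):
    """Return (adr2 AND 07E0h) + mode_register (hypothetical helper name)."""
    return (adr2 & ADR2_MASK) + mode_register

# Hypothetical values chosen only to show the arithmetic.
aligned_entry = table_entry_address(0x0120, 3)
# A misaligned <adr2> is forced back onto the table boundary by the mask.
masked_entry = table_entry_address(0x013F, 0)
```

With these sample values the first call selects entry 3 of a table based at 0120h, and the second shows that the low five bits of <adr2> are ignored.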
- TCMA Test COMA: if COMA is equal to <c>, then jump to <label>; if COMA is not equal to <c>, then execute the next instruction.
Abstract
Description
- The present invention relates generally to single instruction, multiple data processors. More particularly, the invention relates to processors having a one dimensional array of processing elements, that finds particular application in digital signal processing such as Improved Definition Television (IDTV). Additionally, the invention relates to improvements to the processors, television and video systems and other systems improvements and methods of their operation and control.
- Fast and accurate real-time processing of data signals is desirable in general purpose digital signal processing, consumer electronics, industrial electronics, graphics and imaging, instrumentation, medical electronics, military electronics, communications and automotive electronics applications among others, to name a few broad technological areas. In general, video signal processing, such as real-time image processing of video signals, requires massive data handling and processing in a short time interval. Image processing is discussed by Davis et al. in Electronic Design, October 31, 1984, pp. 207-218, and in the issues of Electronic Design for November 15, 1984, pp. 289-300, November 29, 1984, pp. 257-266, December 13, 1984, pp. 217-226, and January 10, 1985, pp. 349-356.
- Video signal processing requires the use of Finite Impulse Response (FIR) digital filters for many of the data processing applications. If the sampling frequency is carefully selected, the coefficients of the filters can be small ratios of powers of two or at least simple combinations of powers of two. Real time video signal processing requires that the operating processors receive and process the video signal and the data necessary to emulate digital filters at extremely fast rates. In the prior art a substantial portion of the processing time is consumed in obtaining the sample data from adjacent processors in the array. For example, each processor in the array would have to execute a series of instructions to address, read and transfer data located in its next adjacent processor until the data reaches the desired location in the array. In a large array, this sequence of transferring the data from one processor to the next until it reaches a desired location is time consuming. If only a finite time exists to receive and process the data, a large data retrieval time will of course leave less time for data processing. Therefore a technique for reducing the data retrieval time in a synchronous vector processor is desired in the art.
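- To make the remark about power-of-two coefficients concrete, the following Python sketch applies an FIR filter using only shifts and adds. It is an illustration under stated assumptions, not the filter of any particular embodiment: the function name, the 1/4, 1/2, 1/4 tap set and the sample line are all hypothetical.

```python
def fir_shift_add(samples, taps):
    """Apply an FIR filter whose coefficients are sums of signed powers of two.

    `taps` holds one list per coefficient; each entry is (sign, shift),
    meaning sign * 2**(-shift). Only shifts and adds are used.
    """
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, terms in enumerate(taps):
            # Samples before the start of the line are treated as zero.
            x = samples[n - k] if n - k >= 0 else 0
            for sign, shift in terms:
                acc += sign * (x >> shift)
        out.append(acc)
    return out

# Hypothetical 3-tap low-pass filter with coefficients 1/4, 1/2, 1/4.
QUARTER_HALF_QUARTER = [[(1, 2)], [(1, 1)], [(1, 2)]]
line = [0, 0, 8, 8, 8, 8, 0, 0]
filtered = fir_shift_add(line, QUARTER_HALF_QUARTER)
```

Each output sample needs the current sample and its two predecessors, which is exactly the kind of neighboring data whose retrieval time the paragraph above is concerned with.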
- Briefly, in one embodiment, the present invention comprises processor circuits connected in a serial chain, each of said processor circuits including: a data processing unit having a digital input connected in common with the digital input of each of the data processing units of the other processor circuits for entry of said control and address signals, the data processing unit including an arithmetic logic unit, a plurality of data storage registers connected to said arithmetic logic unit, and data multiplexers connected to said data storage registers; a first register interface including a first set of bit registers for parallel entry of said first digital data signal and including a second set of bit registers, said first and second set of bit registers individually accessible by said data processing unit; a second register interface including a third set of bit registers and also having a fourth set of bit registers having a parallel digital output for producing the processed digital data signal, said third and fourth set of bit registers individually accessible by said data processing unit; a first sequencer circuit connected by a first common line to the first register interface in each of the processor circuits and responsive to the clock pulses for selectively sequentially activating operations of each of said first register interface; and a second sequencer circuit connected by a second common line to the second register interface in each of the processor circuits and responsive to the clock pulses for selectively sequentially activating operations of each of said second register interface; said data processing units thereby operable by said controller independently of and cooperatively with said first and
- Figure 47 shows signal pipelining circuitry;
- Figure 48 shows the various signal inputs and outputs for a Fig. 47 type circuit;
- Figure 49 shows a timing diagram for signal flow using a Fig. 47 pipeline circuit;
- Figure 50 shows an alternative pipeline circuit;
- Figure 51 shows a global variable distribution controller circuit;
- Figure 52 shows an auxiliary register set and control circuit;
- Figure 53 shows memory reduction control circuitry;
- Figure 54 shows an alternative SVP controller/processor system;
- Figure 55 shows an SVP video tape recorder system;
- Figure 56 shows an SVP based general purpose digital signal processing system;
- Figure 57 shows an SVP based graphics/image processing system;
- Figure 58 shows an SVP based visual inspection system;
- Figure 59 shows an SVP based pattern recognition system;
- Figure 60 shows an illustrative speech signal;
- Figure 61 shows an SVP based radar processing system;
- Figure 62 shows an SVP based picture phone system;
- Figures 63a and 63b show an SVP based facsimile system;
- Figure 64 shows an SVP based document scanner;
- Figure 65 shows an SVP based secure video transmission system;
- Figure 66 shows an illustrative video signal for the Fig. 65 system; and
- Figure 67 is an illustration of a pin grid array package suitable for SVP packaging.
- In the following discussion of the preferred embodiments of the invention, reference is made to drawing figures. Like reference numerals used throughout the several figures refer to like or corresponding parts.
- An SVP, Synchronous Vector Processor of a preferred embodiment, is a general purpose mask-programmable single instruction, multiple data, reduced instruction set computing (SIMD-RISC) device capable of executing in real-time the 3-D algorithms useful in Improved and Extended Definition Television (IDTV and EDTV) systems. Although the SVP of the invention is disclosed for video signal processing in the preferred embodiment, the hardware of the SVP works well in many different applications so no particular filters or functions are implied in the architecture. Generally, the SVP can be used in any situation in which large numbers of incoming data are to be processed in parallel.
- In a typical application, such as video signal processing, the Input and Output layers operate in synchronism with the data source (such as video camera, VCR, receiver, etc.) and the data sink respectively (such as the raster display). Concurrently, the Computation layer performs the desired transformation by the application of programmable functions simultaneously to all the elements of a packet (commonly referred to as a VECTOR: within the TV/Video environment all the samples comprising a single horizontal display line). Thus the SVP is architecturally streamlined for Synchronous Vector Processing.
- In Fig. 1, a TV or video system 100 includes synchronous vector processor device 102. System 100 comprises a CRT 104 of the raster-scan type receiving an analog video signal at input 106 from standard analog video circuits 108 as used in a conventional TV receiver. A video signal from an antenna 110 is amplified, filtered and heterodyned in the usual manner through RF and IF stages 112 including tuner, IF strip and sync separator circuitry therein, producing an analog composite or component video signal at line 114. Detection of a frequency modulated (FM) audio component is separately performed and not further discussed here. The horizontal sync, vertical sync, and color burst are used by controller 128 to provide timing to SVP 102 and thus are not part of SVP's data path. The analog video signal on line 114 is converted to digital by analog-to-digital converter 116. The digitized video signal is provided at line 118 for input to synchronous vector processor 102.
- Processor 102 processes the digital video signal present on line 118 and provides a processed digital signal on lines 170. The processed video signal is then converted to analog by digital-to-analog converter 124 before being provided via line 126 to standard analog video circuits 108. Video signals can be provided to analog-to-digital converter 116 from a recorded or other non-standard signal source such as video tape recorder 134. The VCR signal is provided on line 136 and bypasses tuner 112. Processor 102 can store one (or more) video frames in a field memory 120, which is, illustratively, a Texas Instruments Model TMS4C1060 field memory device. Field memory 120 receives control and clocking on lines test lines RF1 address lines data output lines 170 and a 1-bit global output 178 (GO) line.
- The I/O system of the SVP comprises the Data Input Register 154 (DIR) and the Data Output Register 168 (DOR). DIR and DOR are sequentially addressed dual-ported memories and operate as high speed shift registers. Both DIR and DOR are dynamic memories in the preferred embodiment.
- Since the DIR and DOR are asynchronous to the PEs 150 in the general case, some type of synchronization must occur before data is transferred between DIR/DOR and the PEs 150. This usually occurs during the horizontal blanking period in video applications. In some applications the DIR, DOR, and PEs may operate synchronously, but in any case it is not recommended to read or write to both ports of one of the registers simultaneously.
- With reference again to Fig. 2, the DIR of processor 102 is a 40960 bit dynamic dual-ported memory. One port 119 is organized as 1024 words of 40 bits each and functionally emulates the write port of a 1024 word line memory. Figure 4 depicts a timing diagram for a DIR write. The 40 Data Inputs 118 (DI0 through DI39) are used in conjunction with timing signals Write Enable 190 (WE), Reset Write 192 (RSTWH), and Write Clock 194 (SWCK). WE 190 controls both the write function and the address pointer 148 (commutator) increment function synchronously with SWCK 194. When high, the RSTWH 192 line resets the address pointer 148 to the first word in the 1024 word buffer on the next rising edge of SWCK. SWCK 194 is a continuous clock input. After an initial two clock delay, one 40 bit word of data 198 is written on each subsequent rising edge of SWCK 194. If data words 0 to N are to be written, WE remains high for N+4 rising edges of SWCK. The address pointer 148 may generally comprise a 1-of-1024 commutator, sequencer or ring counter triggered to begin at the end of a horizontal blanking period and continue for 1024 cycles synchronized with the sampling frequency of the A-to-D converter 116. The input commutator 148 is clocked at above 1024 times the horizontal scan rate. The output commutator 174 can be, but is not necessarily, clocked at the same rate as the input.
- It should be noted at this time that although, for purposes of discussion, processor 102 is depicted as having 1024 processor elements, it can have more or fewer. The actual number is related to the television signal transmission standard employed, namely NTSC, PAL or SECAM, or the desired system or functions in non-television applications.
- The second port 121 of data input register 154 is organized as 40 words of 1024 bits each, each bit corresponding to a processor element 150. Port 121 is physically a part of, and is mapped into the absolute address space of, RF0; therefore, the DIR and RF0 are mutually exclusive circuits. When one is addressed by an operand on a given line of assembly language code, the other cannot be. An assembly language line which contains references to both will generate an error at assembly time. This is discussed in more detail hereinafter.
- The DIR 154 works independently of the DOR 168; therefore it has its own address lines 131 and some of its own control lines 135. The exact function of DIR 154 is determined by many lines: C21, C8, C2, C1, C0, the contents of WRM 234, and by addresses RF0A6 through RF0A0 (see Fig. 5). Control line C2 = 1 selects DIR 154. The seven address lines RF0A6 - RF0A0 select 1-of-40 bits to be read or written to while C1 and C0 select the write source (for a read C0 and C1 don't matter). With certain combinations of lines C1 and C0 the write source for DIR 154 depends on the state of C21 and C8 and the contents of Working Register M 234. These form instructions called M-dependent instructions which allow more processor 102 flexibility. Table 1 sets forth the control line function for DIR 154.
- With reference again to Fig. 3, DOR 168 is a 24576 bit dynamic dual-ported memory. One port 169 is organized as 1024 words of 24 bits each and functionally emulates the read port of a 1024 word line memory. The Data Outputs (DO0 through DO23) 170 are used in conjunction with the signals Read Enable (RE), Reset Read (RSTRH), and serial Read Clock (SRCK) of Fig. 6. SRCK 496 is a continuous clock input. RE 490 enables and disables both the read function and the address pointer increment function synchronously with SRCK 496. When high, the RSTRH line 494 resets the address pointer (commutator) to the first word in the 1024 word buffer on the next rising edge 498 of SRCK 496. After an initial two clock delay, one 24 bit word of data is output an access time after each subsequent rising edge of SRCK. If data words 0 to N are to be read, then RE must remain high for N+3 rising edges of SRCK. As discussed hereinabove with reference to DIR 154, the address pointer 174 can similarly comprise a 1-of-1024 commutator or ring counter.
- The second port 167 of data output register 168 is organized as 24 words of 1024 bits each, each bit corresponding to a Processor Element 150. Port 167 of DOR 168 is physically a part of, and is mapped into the absolute address space of, RF1 166; therefore, the DOR 168 and RF1 166 are mutually exclusive circuits. When one is addressed by an operand on a given Assembly line, the other cannot be. An Assembly line which contains references to both will generate an assembly-time error. This is discussed in more detail hereinafter.
- DOR 168 works independently of DIR 154; therefore it has its own address lines 133 and some of its own control lines 137. The exact function of DOR 168 is determined by many lines: C21, C5, C4, C3, the contents of WRM 234, and by addresses RF1A6 through RF1A0 (see Fig. 5). Control line C5 = 1 selects DOR 168. The seven address lines 133 select 1-of-24 bits to be read or written to while C4 and C3 select the write source. With certain combinations of control lines C4 and C3, the write source for DOR 168 depends on the state of C21 and the contents of Working Register M 234. These form instructions called M-dependent instructions which allow more processor 102 flexibility. Table 3 sets forth the control line 130 function for DOR 168.
- The logical diagram of Figure 5 details the interconnect of RF1 and the DOR. C21, C5, C4, C3, and RF1A6 through RF1A0 are control/address/data lines common to all 1024 PEs. Signal C 280 and M 250 are from WRC 248 and WRM 234 respectively. SM 262 and CY 264 are from ALU 260.
- In order to make the hardware more efficient, the same address lines 133 and much of the same hardware are shared between DOR 168 and RF1 166.
- The memory map of Table 4 below requires an eight bit address. This address is made up of Control line C5 (RF1A7) as the MSB and Address lines RF1A6 through RF1A0 (133) as the lesser significant bits. C5 is not considered an address because the selection of the DOR 168 versus RF1 166 is implicit in the instruction mnemonic by bit C5.
- M = the value contained in WRM (Working Register M)
- R0(n) = the value contained in RF0 at address n
- R1 (p) = the value contained in RF1 at address p
- R1(p) = the value to be written back into RF1 at address p
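- The eight-bit address described for the DOR/RF1 memory map (control line C5 as the MSB, address lines RF1A6 through RF1A0 as the remaining bits) can be sketched as follows. This is a minimal Python illustration; the function names are hypothetical, and because Table 4 is not reproduced here, addresses above bit 23 with C5 = 1 are simply labeled as not belonging to the DOR rather than being decoded further.

```python
DOR_BITS = 24    # DOR bits visible to each PE, per the text
RF1_BITS = 128   # seven address lines select 1-of-128 bits in a register file

def rf1_dor_address(c5, addr7):
    """Compose the map address: C5 is the MSB, RF1A6..RF1A0 the low bits."""
    if c5 not in (0, 1):
        raise ValueError("C5 is a single control bit")
    if not 0 <= addr7 < RF1_BITS:
        raise ValueError("RF1A6..RF1A0 carry a 7-bit address")
    return (c5 << 7) | addr7

def decode(address8):
    """Split an eight-bit map address into (target, bit index)."""
    c5, addr7 = address8 >> 7, address8 & 0x7F
    if c5 == 1:
        # C5 = 1 selects the DOR; what lies above bit 23 is not decoded here.
        return ("DOR", addr7) if addr7 < DOR_BITS else ("not DOR", addr7)
    return ("RF1", addr7)

dor_bit_5 = rf1_dor_address(1, 5)
rf1_bit_127 = rf1_dor_address(0, 127)
```

Because the select bit and the seven address lines form a single address, an instruction line can name a DOR bit or an RF1 bit but not both, which matches the mutual exclusion noted in the text.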
- In a preferred embodiment the Register Files, Data Input Register, and Data Output Register are dynamic random access memories and are periodically refreshed unless implicitly refreshed by the running program. In many applications (such as digital TV) the program will keep the RFs refreshed if the software loop is repeated more frequently than the refresh period. This keeps any memory locations which are being used by the program refreshed, while unused bits are allowed to remain un-refreshed. Also, a program can explicitly refresh both RFs by simply reading all locations of interest within the refresh period.
- RF0 158 works independently of RF1 166; therefore it has its own address lines 131 and some of its own control lines. The exact function of RF0 158 is determined by many lines: C21, C8, C2, C1, C0, the contents of WRM 234, and by addresses RF0A6 through RF0A0 (see Fig. 5). Control line 448 C2 = 0 selects RF0 158. The seven address lines 131 select 1-of-128 bits to be read or written to while C1 and C0 select the write source. With certain combinations of control lines C1 and C0, the write source for RF0 158 depends on the state of C21 and C8 and the contents of Working Register M 234. These form instructions called M-dependent instructions which allow more processor 102 flexibility. Table 6 sets forth the control line function for register file 0 158.
- The logical diagram of Figure 5 details the interconnect of RF0 158 and the DIR 154. C21, C8, C2, C1, C0, and RF0A6 through RF0A0 are control/address lines common to all 1024 PEs. Signal C 280 and M 250 are from WRC 248 and WRM 234 respectively. SM 262 is from ALU 260. R, 2R 324, L and 2L 312 are signals from this PE's four nearest neighbors.
- In order to make the hardware more efficient, the same address lines 131 and much of the same hardware are shared between DIR 154 and RF0 158. The memory map of Table 2 requires an eight bit address. This address is made up of Control line C2 as the MSB. Address lines RF0A6 through RF0A0 are the lesser significant bits. C2 is not considered an address because the selection of the DIR versus RF0 is implicit in the instruction mnemonic. Other registers are mapped into the memory space so all undefined memory space in the memory map of Table 2 is reserved.
- RF1 166 works independently of RF0 158; therefore it has its own address lines 133 and some of its connect forty data lines 118 (from the parallel inputs DI0 - DI39) to dynamic memory cells 518. These cells are dual-port, and are also written to or read from through access transistors 520 and folded bit lines 522 and 524 connected to sense amplifier 156, when addressed by word lines 526. There are forty of the word lines 526 for the DIR part and 128 of the word lines 160 for the RF0 part of this 168-bit dynamic random access (DRAM) column.
- As stated earlier hereinabove, the DIR is a 2-transistor dual port cell. Reading and writing can be performed for each port. The DIR operates as a high speed dynamic shift register. The dual port nature allows asynchronous communication of data into and out of the DIR. By using dynamic cells the shift register layout is greatly reduced. Although a dummy cell can be used, it is not a requirement for cell operation.
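- The two views of the DIR described earlier (1024 words of 40 bits on the input side, 40 words of 1024 bits on the processor side) amount to a transposition of the data. The Python sketch below models only that organization; the class and method names are hypothetical, and the write timing (WE, RSTWH, the two clock delay) is deliberately not modeled.

```python
NUM_PES = 1024   # processor elements, one per sample on a line
WORD_BITS = 40   # bits written per input word

class DataInputRegister:
    """Toy model of the DIR's dual-port organization (timing not modeled)."""

    def __init__(self):
        self.cells = [[0] * WORD_BITS for _ in range(NUM_PES)]
        self.pointer = 0  # plays the role of the commutator / address pointer

    def reset_write(self):
        """Return the pointer to the first word, as the reset-write line does."""
        self.pointer = 0

    def write_word(self, word):
        """Input port: store one 40-bit word, then advance the pointer."""
        for bit in range(WORD_BITS):
            self.cells[self.pointer][bit] = (word >> bit) & 1
        self.pointer = (self.pointer + 1) % NUM_PES

    def read_bit_plane(self, bit):
        """Processor port: bit `bit` of every word, one bit per PE."""
        return [self.cells[pe][bit] for pe in range(NUM_PES)]

dir_model = DataInputRegister()
for sample in range(NUM_PES):
    dir_model.write_word(sample)      # hypothetical input: sample n holds value n
plane0 = dir_model.read_bit_plane(0)  # LSB of every sample, seen by all PEs at once
```

Words arrive one at a time through the sequentially addressed port, while each processor-side access delivers the same bit position of all 1024 samples in parallel.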
- The data output register utilizes a 3-transistor dual port gain cell. In most applications reading and writing is allowed at port 167, but only reading is performed from the second port. DOR 168 also operates as a high speed dynamic shift register. The DOR with gain transistor circuit allows reading of capacitor 519 without destroying the stored charge. In operation, if a logical "1" on cell 519 is greater than 1 VT of transistor 1640, then when select line 172 is turned on, line 1642 will eventually be pulled to a logical "0", or zero volts. If the charge on cell 519 is less than 1 VT (i.e., a logical "0" or low) the charge on line 1642 will remain at a precharge value. Transistor 1642 is the cell read select transistor. All twenty four data out lines 560 are sensed simultaneously by transistor 1642 (i.e., transistor 1642 selects the processor element cells). As shown, node 1650 is isolated. This connection reduces the possibility of data loss in a cell from noise generated by reading other processor element cells. Each 128 cell section has a comparator 1634 on the output line to sense the signals. A reference voltage is applied to comparator input 1636. Source 1638 of transistor 1630 is connected to VDD. This is not a requirement, however, and source 1638 may be connected to another voltage level.
- Figures 8a-d illustrate voltage levels at several lines and nodes of the DOR circuit.
- Figure 9 illustrates an alternative DOR cell.
- As previously indicated hereinabove, a preferred embodiment of PEs 150 for video applications utilizes a 40-bit wide input data bus 118 and a 24 bit wide output data bus 170. These bus widths in combination with high clocking speeds of 8fsc (35ns) result in a large power drain and noise on the bus lines if the entire bus width for the 1024 DIR 154 or DOR 168 must be powered up for the entire clocking period. However, because only an individual DIR (or DOR) is being read from or written to at any particular portion of the clocking period, it is possible to power up only the DIR 154 being written to, or a portion of the DIR serial array including the DIR being written to, at any given time.
- Figure 10 depicts an SVP 102 input bus line 118 power drain and noise reduction control circuit 580. Circuit 580 reduces noise and power requirements of SVP 102 during a DIR 154 write. For purposes of discussion and illustration the 1024 by 40 DIR array 154 is segmented into eight segments or portions 586a-h, each including 128 PEs 150. Data is clocked into memory locations of each 128 DIR segment 586 by a segment of commutator 148 operating under control of a corresponding control unit 602. Control unit 1 (602a) has a segment of clock inputs 608 timed to be in sync with the horizontal scanning rate of the input video data signals on line 118. Each of the eight control units 602 is connected to receive a reset signal 610. The reset signal causes the first control unit 602a to power up and powers down the remaining units 602b-h. Control unit 602 output signals include a commutator enable signal 151 for enabling the commutator 588 for operation as previously described. The individual control unit 602 output signals also include a power up output signal 606 for powering up the next adjacent control unit for operation when the data signal write to the presently operating section is near completion. For example, once the data read from line 118 to the DIR section 586a is near completion, the next adjacent control unit 602b enables its commutator segment 588b to be ready for a data write. Once segment 602b enables commutator section 588b, a signal on line 604a powers down previous control unit 602a since it has completed writing data to segment 586a. This power up/power down control sequence is repeated for each section until all 1024 DIRs have been loaded. In this fashion only the commutator for the group of DIRs being written to is powered up during a portion of the clock cycle. In accordance with the previously described SVP 102 operation, during the video data signal scan line horizontal blanking period the DIR data in all sections 586a-h is clocked into RF0 while the controller reset signal is made active and a new scan line is ready for input.
- Referring now to drawing Fig. 11, a logical block diagram of a preferred embodiment of the power drain and noise reduction control circuit 580 depicted in drawing Fig. 10 is depicted in greater detail. In Fig. 11, control circuit 580 is shown comprising subcircuits including flip-flops
- In operation a reset signal at input 610 triggers the S or set input of flip-flops 614 and 620a. The same reset signal 610 triggers the clear inputs to flip-flops 620b-620g and triggers the reset input to flip-flop 622. When the set input of flip-flop 620a is triggered its Q output is activated to enable drivers 628. When drivers
- An example control circuit 658 for controlling segment selection and operation is also depicted in Fig. 13. Control circuit 658 includes DIR select transistors, such as transistors Select transistor 670 has its source and drain connected between the DIR and the processor element sense amp 678. The gate of transistor 670 is connected to the output of AND gate 682. Input lead 692 of AND gate 682 receives a XFERLEFT or XFERRIGHT signal. Input lead 690 receives microcode control bit C2. When C2 = 1 DIR is selected; when C2 = 0 RF0 is selected.
- Transistor 672 is connected in a similar manner between DIR 650 and sense amp 678. Similarly connected are transistors segment 652. Each DIR of each segment control circuit also includes a two transistor network which forces the sense amps to a known state as desired during operation. These are transistors transistors
- Transistor 662 has its source connected to the source of transistor 670 and its drain is grounded. Similarly, the source of transistor 664 is connected to the source of transistor 672. The drain of transistor 664, however, is connected to VDD. The gates of transistors Input 688 is connected to the output of inverter 686, the input of which is connected to the XFERLEFT/XFERRIGHT signal. Input 690 of AND gate 684 is connected to control bit C2.
- The control output from AND gate 684 is cross coupled from segment half 650 to 652 such that the output controls transistor transistors gate 682 is similarly cross coupled between the left and right halves of processor 102. On the left side gate 682 output controls transistors right side gate 682 controls transistors
- In operation a high level on the XFERLEFT and C2 signals results in a low signal output from AND gate 684 and a high signal output from AND gate 682. This selects the contents of left side DIRs for transfer to RF0 and activates the right side DIRs for loading. A low or XFERRIGHT signal on lead 692 while C2 is 1 selects the left side DIRs for loading and the right side DIRs for transfer of data to RF0. This sequence is repeated so that the DIR scan continually receives and transfers data alternately in a piston-like manner.
- After a full scan line has been loaded into the DIRs and transferred into the register files, a software program executed by processor 102 logically ORs the transferred even address data with zeroes to recover the original data. The transferred odd address data is logically ANDed with ones to recover the original data. This is illustrated in drawing Fig. 14. After the data received from data line 118 has been recovered from the two segments, processing as previously discussed can begin.
- Figure 15 shows an alternative scheme for recovering the originally transferred data. Instead of recovering the even and odd addresses separately, the drains of transistors M 1, A = INP(j), B = 0, C = 0, R1(n) = SM. Then OR first data with results of first part: (XFERLEFT = 0); M = 1, A = R1(n), B = INP(j), C = 1, R1(n) = CY.
- Drawing Fig. 16 depicts the DIR control circuit of Fig. 13 in greater and slightly different detail. Fig. 17 depicts the DOR control circuit of Fig. 13 in greater and slightly different detail.
- As discussed hereinabove the Register Files are comprised of dynamic cells which are suitably refreshed in successive refresh periods to maintain their contents. Only those addresses which are used by the software need be refreshed. All remaining addresses may go without refresh since their data is not needed.
- A refresh operation is simply a read to each address requiring data retention; therefore, in many applications, the software program will keep the RFs refreshed if the software loop is repeated more frequently than the refresh period.
- Refreshing all 256K bits in SVP 102 requires only 64 cycles. This is because each RF actually reads and refreshes 2 bits at a time (for a total of 4 bits per PE). To perform a complete refresh of all of SVP 102, read each RF into any Working Register, increment the address by two each time and repeat 64 times. The following program illustrates a refresh operation.
- disabled for all instructions on the entire line.
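- The refresh arithmetic stated above (64 cycles, 2 bits per RF per read, 256K bits in total) can be checked with a short Python sketch. It is not the SVP assembly program referred to in the text; the constant and function names are hypothetical, and the figure of 128 bits per register file is taken from the seven-bit register file address described earlier.

```python
NUM_PES = 1024        # processor elements
NUM_RFS = 2           # RF0 and RF1
BITS_PER_RF = 128     # bits per register file in each PE
BITS_PER_READ = 2     # each RF read refreshes two bits

def refresh_schedule():
    """Addresses to read from each RF in one full refresh pass (step of two)."""
    return list(range(0, BITS_PER_RF, BITS_PER_READ))

schedule = refresh_schedule()
cycles = len(schedule)                           # both RFs are read in each cycle
bits_per_pe_per_cycle = NUM_RFS * BITS_PER_READ  # 4 bits per PE per cycle
total_bits = cycles * bits_per_pe_per_cycle * NUM_PES
```

Stepping the address by two over 128 locations gives the 64 reads, and 64 cycles of 4 bits across 1024 PEs accounts for the 256K bits.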
- In the Fig. 2 embodiment, there are four working registers 162 (WR) per processor element 150 (PE): WRM, WRA, WRB, and WRC. All four registers can be the same except their data sources and destinations differ. As further depicted in Fig. 5, each WR comprises a data selector or multiplexer and a Flip/Flop. All four registers are clocked at the same time by internal SVP timing circuits shortly after valid data arrives from the RFs.
- Table 12 shows illustrative sources of data for each of the four Working Registers.
- WRA 238, the Addend/Minuend Register, is a general purpose working register, and is used in most operations involving ALU 164. WRA is the second 256 of two inputs to multiplier block 258 in the ALU 164, and is the positive term entering adder/subtractor block 260. WRA is also an input to C MUX 244.
- Data selector 236 (n-to-1 multiplexer) chooses one of ten possible sources of data for WRA 238 as a function of Control lines C17, C16, C15, and C8 as shown in Table 14. Additionally, the data taken from lines R, 2R, L, and 2L can be from 1 of 4 sources within the selected near-neighbor 160.
- WRB 242, the Addend/Subtrahend Register, is a general purpose working register, and is used in most operations involving ALU 164. In a subtraction operation, WRB 242 is always subtracted from WRA 238. WRB is also an input to the UR MUX 305.
- WRC 248, the Carry/Borrow register, is the Carry (or Borrow) input to ALU 164. In multi-bit additions, WRC 248 holds the CY 264 from the previous addition between bits, while in multi-bit subtractions, WRC 248 holds the BW 266 bit. WRC output goes to A, B and M registers and to RF0 MUX1.
- Referring now to drawing Figs. 18 and 19, Global output signal 824 is equivalent to the logical OR 852 of all 1024 UR lines 178 exiting the PEs. That is, if one or more PEs 103 in Processor Array 102 outputs a logical ONE level on its UR line 178, the GO signal 824 will also output a logical ONE. The GO signal is active high. Fig. 19 also shows the generation of the UR signal exiting PE(n) and its relation to the global flag signal, GO (Global Output).
- Care should be taken when using near-neighbor communications instructions on the same Assembly line with GO instructions since both share the same hardware; their use is therefore generally mutually exclusive. In any case, the SVP Assembler will flag any conflicts which may occur.
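- The behavior of the global output described above, a logical OR over the UR lines of all processor elements, can be expressed as a short Python sketch. The function name and the choice of which PE raises its line are hypothetical and serve only to illustrate the reduction.

```python
from functools import reduce
from operator import or_

NUM_PES = 1024

def global_output(ur_lines):
    """Return 1 if any PE drives a logical ONE on its UR line, else 0."""
    return reduce(or_, ur_lines, 0)

ur_lines = [0] * NUM_PES
go_idle = global_output(ur_lines)    # no PE asserts its line

ur_lines[517] = 1                    # hypothetical: a single PE raises its flag
go_flagged = global_output(ur_lines)
```

A single asserting PE anywhere in the array is enough to raise GO, which is what makes the signal usable as an array-wide flag.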
- At the chip level depicted in Fig. 20, the near-neighbor communications lines are brought out to the outside so that multiple SVPs may be cascaded if a processing width of more than 1024 bits is required. On the left of SVP 102 are L and 2L outputs and L and 2L inputs. To the right there are R and 2R outputs and R and 2R inputs. To avoid confusion with the interconnect, these pins are named CC0L 792, CC1L 794, CC2L 796, CC3L 798 and CC0R 800, CC1R 802, CC2R 804, CC3R 806 so it is only necessary to connect CC0L to CC0R, etc.
- Figure 20 depicts cascading interconnection for 2 or more SVPs. The inputs at the extremes should be grounded in most cases as in the figure, but this depends on the particular application. An alternative interconnection of SVPs is depicted in Fig. 21. The interconnect of Figure 21 allows the image in a video processing system to be wrapped around a cylinder by providing the wrap around connection. When using these lines, a wait-stated cycle must be used with instructions which involve R/L/2R/2L transfers to allow sufficient propagation time between SVP chips. An internal bus timing diagram for a wait-stated single instruction is depicted in Fig. 24.
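- The effect of the two interconnection styles on first and second nearest-neighbor reads can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the chips are shrunk to four PEs each for readability, and the convention that L and 2L refer to the PEs one and two positions to the left is an assumption.

```python
def neighbor(values, pe, offset, wrap=False):
    """Value seen by PE `pe` from the PE `offset` positions away.

    offset: -2 (2L), -1 (L), +1 (R), +2 (2R).
    wrap=False models grounded end inputs (reads beyond the ends return 0);
    wrap=True models the wrap-around interconnection.
    """
    if offset not in (-2, -1, 1, 2):
        raise ValueError("only first and second nearest neighbors are connected")
    idx = pe + offset
    if wrap:
        return values[idx % len(values)]
    return values[idx] if 0 <= idx < len(values) else 0

def cascade(*chips):
    """Treat the PE values of several chips as one wider array, left to right."""
    return [v for chip in chips for v in chip]

chip_a = [0, 1, 2, 3]        # hypothetical 4-PE chips instead of 1024
chip_b = [10, 11, 12, 13]
wide = cascade(chip_a, chip_b)

across_boundary = neighbor(wide, 3, +2)          # last PE of chip A reads 2R
grounded_edge = neighbor(wide, 0, -1)            # leftmost PE, grounded input
wrapped_edge = neighbor(wide, 0, -1, wrap=True)  # leftmost PE, wrap-around
```

The second nearest-neighbor read across the chip boundary is the case that needs the extra propagation time mentioned above, since the value must pass between packages.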
- There are four instruction modes in the SVP: Single, Double, Wait-stated Single, and Idle. The first two of the modes will work in combination with any valid assembly instruction line, the third works with instructions which communicate data to the left and right of an immediate processor element, while the fourth is an Idle mode in which the PEs are not clocked in order to conserve power.
- All instructions require only one clock cycle to complete, but the duration of that clock cycle varies depending on the type of cycle. The two cycle lengths are 'normal' and 'extended.' The length of an 'extended' cycle is approximately 1.5 times the length of a 'normal' cycle. The 'extended' time allows for the wait portion of the Wait-stated Single Instruction, or for the additional ALU operations performed during the Double Instruction. The Idle Instruction is extended only to further reduce power.
-
- During an assembly, the default is Single Instruction Mode. When appropriate Single Instruction pairs appear in the assembly sequence, each pair will be automatically replaced with one Double Instruction.
- The SVP Assembler and hardware are capable of automatically generating and executing an instruction which is the equivalent of two single instructions but requires an extended cycle for execution. An overall throughput advantage is gained from this ability. During this extended cycle, a <READ>-<REGISTER>-<ALU>-<REGISTER>-<ALU>-<WRITE> sequence is performed. The additional time in the extended cycle is used for a second ALU and Register operation. This is possible because extended cycles work from a 2-bit Cache for each Register File during read/write operations. The SVP Assembler determines how to make the best use of these Caches by converting Single Instructions to Double Instructions whenever possible. This operation may be turned on and off by two assembler directives, DRI and ERI respectively.
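The throughput advantage can be estimated with simple arithmetic. The sketch below assumes the extended cycle is exactly 1.5 normal cycles (the text says approximately 1.5) and measures time in units of one normal cycle. The function `run_time` is a hypothetical helper, not part of the SVP tools.

```python
NORMAL_CYCLE = 1.0    # time unit: one normal cycle
EXTENDED_CYCLE = 1.5  # assumed exact; the text gives "approximately 1.5"


def run_time(singles: int, doubles: int = 0, wait_stated: int = 0) -> float:
    """Total time for a mix of Single, Double and Wait-stated Single instructions."""
    return singles * NORMAL_CYCLE + (doubles + wait_stated) * EXTENDED_CYCLE


# Thirty single instructions versus the same work packed into fifteen doubles.
time_as_singles = run_time(singles=30)
time_as_doubles = run_time(singles=0, doubles=15)
speed_ratio = time_as_doubles / time_as_singles
```

Under this assumption a fully paired sequence takes about three quarters of the time of the unpaired sequence.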
-
- The assembler will optionally assemble these four types of instruction patterns into double instructions and their respective opcodes become as shown in Table 22.
- The
External Bus 130 operation for the SVP chip is simple, as the only requirement is to present the device with a 38-bit microcode instruction (24 control, 14 address) and strobe PCK with the proper setup and hold times. Since the Data Input 154 and Data Output 168 Registers are asynchronous to the Processor Array 105, some form of synchronization is required prior to the Processor Array 105 transferring data to/from the DIR or DOR. In video applications, this is possibly handled by transferring during horizontal blanking time. - The rising edge of the external Processor Clock (PCK) triggers a series of internal clocks which create the timing for the internal bus 171. Fig. 22 shows the sequence of events on the internal buses 171 of the
SVP 102 for Single Instruction Mode. - The SVP Assembler automatically generates what is called a Double Instruction from two single instructions, provided they are identical except for address fields.
- The Double instruction created by the Assembler requires a corresponding hardware mode. Fig. 23 shows the sequence of events for the double instruction cycle.
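The pairing rule stated above (two single instructions that are identical except for their address fields) can be sketched as a greedy pass over adjacent instructions. Everything in this sketch is an assumption for illustration: the `Instr` record, the split of the 14 address bits into two 7-bit fields, and the greedy adjacent-pair strategy. The actual assembler restricts pairing to specific instruction patterns that this sketch does not model.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    control: int   # 24 control bits (hypothetical packing)
    rf0_addr: int  # assumed 7-bit address field
    rf1_addr: int  # assumed 7-bit address field


def pair_doubles(program: List[Instr]) -> List[Tuple[Instr, ...]]:
    """Group adjacent instructions with equal control bits into doubles.

    Returns one tuple per issued cycle: length 2 for a Double
    Instruction, length 1 for a Single Instruction.
    """
    slots: List[Tuple[Instr, ...]] = []
    i = 0
    while i < len(program):
        nxt = i + 1
        if nxt < len(program) and program[i].control == program[nxt].control:
            slots.append((program[i], program[nxt]))
            i += 2
        else:
            slots.append((program[i],))
            i += 1
    return slots


program = [
    Instr(0o1234, 0, 0),
    Instr(0o1234, 1, 1),  # same control bits as the previous line: pairs up
    Instr(0o4321, 2, 2),
    Instr(0o1234, 3, 3),
]
slots = pair_doubles(program)
```

Here the first two lines share their control bits and collapse into one extended cycle, while the remaining two are issued as singles.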
- When cascading SVPs (Figs. 20 and 21), a slow propagation path between chips requires extra time when using the Near-neighbor communications. Slow cycles are accommodated by having a Wait-stated Single cycle. This cycle performs the operation of a single instruction cycle but requires the time of a double instruction cycle as shown in Fig. 24.
- The Idle cycle allows the PA 105 to be mostly powered down until needed. This is shown in Fig. 25.
- The SVP is programmed at the microcode level. These microcode 'sub-instructions' combine to make the instruction portion of an instruction line in the SVP Assembly language. This section explains how to construct these instructions and how the assembler checks for conflicts. Some of the major topics in this section are:
- * Rules for Forming Instruction Lines
- - Operand Destination/Source Names
- - Rules for Combining Sub-instructions
- - The Opcode Field
- * The Instruction Conflict Mask
- The SVP Assembly source is similar to that of other assemblers; each line contains an instruction, an assembler directive, comment, or macro directive. The SVP assembly line, however, differs in that a single line containing one instruction comprises several sub-instructions. These sub-instructions combine to generate a single opcode when assembled.
- An 'instruction line' is made up of an optional label, one or more sub-instructions plus an optional comment field.
- A valid 'instruction' is made up of one or more sub- instructions such that no sub-instruction conflicts with another.
- A 'sub-instruction' comprises three parts: a destination operand, an assignment operator (the SVP Assembler recognizes the '=' sign), and a source operand, in that order.
- A source operand may be specified more than once in an instruction line:
- B = A , C = A is legal
- A destination operand may be specified only once in an instruction line:
- B = A , C = B is legal
- C = A , C = B is not legal
- Each Register File may be specified more than once as a source operand if the address is the same for each sub-instruction:
- A = R0(13) , B = R0(13) is legal (same address)
- A = R0(13) , B = R0(100) is not legal (same RF, different address)
- A = R0(13) , B = R1(100) is legal (different RF)
- Only one of RF0, RF1, DIR and DOR may be specified as a destination operand in an assembly line:
- C = BW, R0(10) = SM is legal (single memory write)
- R0(13) = A, R1(13) = B is not legal (simultaneous write to two memory banks)
- If R0, R1, INP, or OUT is specified as a source operand and a destination operand, the source and destination address must be the same:
- B = R0(22) , R0(22) = SM is legal (read/modify/write)
- C = R0(22) , R1(123) = C is legal (different RF)
- C = R0(22) , R0(123) = C is not legal (same RF, different address)
- B = R1(25) , INP(10) = SM is legal (different RF's)
- B = R0(25) , INP(10) = SM is not legal (R0 & INP are
- In general, any rule demonstrated above for Register Files R0 and R1 applies to the INP (DIR) and OUT (DOR) instructions as well, with the exception that the address range of 'n' and 'p' is 0 to 127 while 'm' is 0 to 39, and 'q' is 0 to 23, respectively.
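The combination rules listed above can be expressed as a small checker. The following Python sketch is not the SVP Assembler's conflict mask. The function names, the parsing with '=' and ',' separators, and the error strings are all assumptions. It covers only the rules stated in full above and does not model the R0/INP restriction of the final example, whose explanation is incomplete in the text.

```python
import re
from typing import List, Optional, Tuple

# Memory operands take an address in parentheses, e.g. R0(13) or INP(10).
MEMORY_OPERAND = re.compile(r"^(R0|R1|INP|OUT)\((\d+)\)$")


def mem_ref(operand: str) -> Optional[Tuple[str, int]]:
    """Return (name, address) for a memory operand, or None for a register."""
    m = MEMORY_OPERAND.match(operand)
    return (m.group(1), int(m.group(2))) if m else None


def conflicts(line: str) -> List[str]:
    """Return the rule violations found in one instruction line."""
    subs = []
    for part in line.split(","):
        dest, src = (s.strip() for s in part.split("="))
        subs.append((dest, src))

    problems: List[str] = []

    # Rule: a destination operand may be specified only once.
    seen = set()
    for dest, _ in subs:
        ref = mem_ref(dest)
        key = ref[0] if ref else dest
        if key in seen:
            problems.append("destination %s specified twice" % key)
        seen.add(key)

    # Rule: only one memory destination per line.
    mem_dests = [r for r in (mem_ref(d) for d, _ in subs) if r]
    if len(mem_dests) > 1:
        problems.append("more than one memory destination")

    # Rule: a memory read more than once must use one address.
    src_addr = {}
    for _, src in subs:
        ref = mem_ref(src)
        if ref:
            name, addr = ref
            if src_addr.setdefault(name, addr) != addr:
                problems.append("%s read at two addresses" % name)

    # Rule: read and write of the same memory must share the address.
    for name, addr in mem_dests:
        if name in src_addr and src_addr[name] != addr:
            problems.append("%s source/destination address mismatch" % name)

    return problems
```

For example, `conflicts("B = R0(22) , R0(22) = SM")` returns an empty list, while `conflicts("C = A , C = B")` reports the duplicated destination.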
-
- Figure 26 shows an alternative embodiment of a
processor element 150. As depicted, processor element 151 of Fig. 26 includes four sense amps per processor element: two for DIR/RF0 write and read operations, and two for DOR/RF1 write and read operations. With the Fig. 26 embodiment, register file 0 and register file 1 each read two bits of data in each memory cycle (total of four bits per cycle). However, only - The control portion of the opcode is made up of eight octal digits. Each of the digits corresponds to one of the circuit blocks of Fig. 5, so a little familiarity with the opcode format permits the user to read the opcode directly. Table 26 indicates which bits correspond with which blocks. 'CIC' is Conditional Instruction Control.
- In Fig. 29, a
controller 128 is shown connected to SVP 102 and to a software program development and television operation emulation system 900. Development system 900 includes a host computer system 912, a host computer interface logic 914, a pattern generator 916 and a data selector 918. -
Host computer system 912 can take a variety of forms in development system 900. Such forms include a personal computer, a remote control unit, a text editor or other means for developing a control algorithm. Host computer interface logic 914 includes circuitry for emulating a television set's main micro-controller. In development system 900, host computer interface logic 914 works cooperatively with pattern generator 916 to interface host computer system 912 and local communication bus 930. Pattern generator 916 generates timing and other patterns to test program algorithms for algebraic accuracy. Pattern generator 916 also provides real-time test video data for SVP algorithm and hardware debugging. A data pattern programmer (or selector) 918 is used to select data for input to the SVP from among the forty input lines 920 or from data patterns generated by data pattern generator 916. As depicted, data selector 918 is inserted in series between the forty data input lines 920 and the forty SVP input pins 118. In development system 900, a capture (or field) memory 121 is provided to capture processed data from 8 of the 24 output lines 170. The desired 8 of the 24 output lines is selected by a 3---> octal multiplexer 171. In this manner a field of processed video data can be captured (or stored) and provided back to host interface 914 and/or host computer system 912 for real time analysis of the SVP's operations. -
Hardware interface 932 between host computer interface logic 914 and host computer 912 is achieved in development system 900 by conventional parallel interface connections. In an alternative embodiment a conventional EIA RS-232C cable can be used when interface speed is not a primary concern. An IIC bus manufactured by PHILLIPS ELECTRONICS CORPORATION can be used as interface line 930 between host computer interface logic 914 and controller 128. - In video signal processing applications,
controller 128 generates control signals for the SVP processor device 102 which are synchronized with the vertical synchronization component and horizontal synchronization component of the incoming television signal on line 110 of Fig. 1. - Figure 30 depicts a
television micro-controller 1700. Micro-controller 1700 presets internal television circuitry upon initialization (system power-up). Micro-controller 1700 receives external signals, such as those from a personal computer key pad 1702, a remote control unit 1704 or a video signal decoder 1712, from register 962. Mask enable logic output on line 982 controls whether master controller address counter 984 will continue addressing in sequence or perform a jump. The output of multiplexer 968 is also provided via line 978 as an input to multiplexer 980. Multiplexer 980 has nine data output lines 986 providing inputs to master controller address program counter 984. The address on lines 988 from master controller address counter 984 addresses memory locations in master controller program memory 990. The address signal is also provided to return register 994 via lines 992 for subroutine call operations. The output of register 994 is provided via line 996 as another input to multiplexer 980. - Master controller program memory 990 has 14
output lines 998. The microcode output includes address and operational mode instructions for the vertical timing generator 904 and the horizontal timing generator 906. These signals are provided to HTG and VTG via lines lines 998 are also provided to and decoded by instruction decoder 1002, which in turn provides operation control signals via lines 1004 to multiplexer 980 and master controller program address counter 984. Additionally, microcode output bits from lines 998 are provided via lines 1008 as another input to multiplexer 980 and as control for multiplexer 968. -
Master controller 902 also includes auxiliary register control logic 1012. Nine signal lines 1198 from asynchronous-to-synchronous conversion logic 958 are connected as an input to auxiliary register control logic 1012. Operation of auxiliary registers is discussed hereinafter with reference to Fig. 40. - Referring now to drawing Fig. 33,
vertical timing generator 904 of drawing Fig. 31 is depicted in greater detail. Vertical Timing Generator (VTG) 904 generates control codes on outputs to Horizontal Timing Generator 906, Constant Generator 908, and Instruction Generator 910 respectively. In development system 900, constant generator 908 also provides timing to circuits requiring a resolution of one horizontal line via external control line 952. Vertical timing generator 904 includes a vertical sequence counter (VSC) 1020. Vertical sequence counter 1020 is an up counter. Counter 1020 receives a control mode signal from master controller 902 via lines 932. The mode signal designates, among other things, whether, for example, a picture-in-picture operation is desired. The mode signal is essentially a starting address for vertical sequence counter 1020. VSC 1020 provides an address for vertical sequence memory 1024. Vertical sequence memory 1024 stores timing and other signals for initializing and synchronizing operations of horizontal timing generator 906, instruction generator 910 and constant generator 908. The information sequences stored in vertical sequence memory 1024 are repeated during a typical operation. Memory 1024, in addition to storing the information sequences, stores the number of times the stored sequences are repeated. Sequence memory 1024 can comprise Random Access Memory (RAM), Read Only Memory (ROM) or other forms of Programmable Logic Arrays (PLA). - The repeat number is provided via line 1027 to repeat counter 1028. Repeat counter 1028 is a down counter which counts down from the repeat sequence number. When an end of repeat bit is encountered by counter 1028, a control signal is provided via line 1032 to counter control logic 1034. Counter control logic 1034 provides a signal on
line 1036 to signal vertical sequence counter 1020 to step to the next address location. Another signal is provided via line 1040 to increment vertical loop counter 1030. Initialization of counter control logic 1034 is controlled by the vertical and horizontal synchronizing signal of the incoming television signal. The synchronizing signals are provided via lines 1038. - Referring again to
vertical sequence memory 1024, the control component of the signal on line 1026 is provided to vertical loop counter 1030 to start the loop counter at a desired location. The vertical loop counter output provided on lines 1042 addresses memory locations in vertical loop memory 1044. Memory 1044 can also be RAM, ROM or PLA. Memory 1044 stores loop patterns (programs), starting addresses and labels for HTG, VTG and instruction generator (IG). Control data bits from vertical loop memory 1044 are provided to repeat counter 1028 to indicate that a looping sequence is complete and to increment. Bits are also provided to register load sequencer 1054. Register load sequencer 1054 includes a decoded clock to control latches 1048, 1050 or 1054. Register load sequencer 1054 also provides an increment signal for incrementing vertical loop counter 1044. Data is clocked from latches 1048, 1050 and 1052 at a rate up to once every horizontal line time. - In operation
vertical loop counter 1030 provides an output signal 1042 to vertical loop memory 1044 which in turn fans out mode control signals which are latched by horizontal timing generator mode latch 1048, constant generator mode latch 1050, instruction generator mode latch 1052, register load sequencer 1054 and repeat counter 1028. Register load sequencer 1054 provides an output to vertical loop counter 1030 and to latches 1048, 1050 and 1052. Each of the mode latches provides its respective signal to the horizontal timing, the constant generator and the instruction generator on output lines -
Vertical timing generator 904 functions also include changing the horizontal timing to a different mode, changing operational instructions to process television signals in zoom or with a different filter algorithm and includes jump flag arbitration control logic 1224 which receives a horizontal synchronization signal 1218, a mode control signal 1220 from vertical timing generator 904, and flag signals 1222. Jump flag arbitration logic 1224 provides 5 of eleven vectored jump address bits to input 1226 of instruction program register multiplexer (IPRX) 1230. The five bits on lines 1226 are the least significant of the eleven total. - Jump
flag arbitration logic 1224 also provides a jump signal 1228 to instruction decoder 1234. Instruction decoder 1234 provides multiple output signals. A line 1236 carries one of the output signals back to an input of jump flag arbitration logic 1224. Lines 1238 carry a 4-bit decoded multiplexer output control signal 1238 to IPRX 1230. Lines 1240 carry control signals to increment control logic 1242 and to a global rotation address generator (RF1) 1244 and to a global rotation address generator (RF0) 1246. The 4-bit control signal provided on lines 1240 instructs the global rotation address generators 1244 and 1246 to load or shift data for their respective register files. The signal provided to increment control logic 1242 sets the address counter -
IPRX 1230 provides an 11-bit instruction address on lines 1248 to instruction program register 1250. Output signal 1252 from instruction program register 1250 is an address for instruction program memory 1258. Address 1252 is also provided back to the HOLD input 1254 of IPRX 1230. The hold input holds the output memory address for a readdress if desired. Address 1252 is also provided to a +1 increment control logic 1256. Increment logic 1256 increments return register 1264 or instructs the IPRX 1230 to step to the next address. Return register is latched by a CALL input signal. - Instruction program memory (IPM) 1258 stores the SVP system array instruction set in microcode. The array instruction set is presented earlier herein. A complete description of the 44 bits is provided therein. The 44 instruction bits from
instruction program memory 1258 are branched to various locations as set forth in the array instruction set. For example, bit number forty-three is a break point flag. This bit is provided via line 1270 to breakpoint controller 1274. Other control bits are provided to the VECTOR, JUMP and CALL inputs of IPRX 1230, and to input 1282 of instruction decoder 1234. A mask value bit for selecting a flag is provided via line 1223 to jump flag arbitration logic 1224. If breakpoint controller 1274 is enabled during a break point bit read, a break signal is provided on lines 1280 and 1284 to stop operation to provide a test. Breakpoint controller 1274 also receives a breakpoint line (BPline) input signal 1276 and a reset signal input 1278. Instruction bits 0 through 23 are branched from Instruction program memory (IPM) 1258 to control code latch 1288. Bits 25 through 31 are branched to RF0 address counter 1290. Bits 32 through 38 are branched to RF1 address counter 1292. Bits 39 through 42 are branched to repeat counter 1294 and to increment control logic 1242. Increment control counter 1242 also receives inputs 1240 from the instruction decoder, which also provides a 4-bit control input to global rotation address generators (RF1) 1244 and (RF0) 1246. The latched instruction output 1194 from control code latch 1288 is provided to auxiliary register and controller logic 1196 which also receives a global variables signal on line 1198. Output 1194 is also provided directly as microcode bits 0 through 23 on line 1200. Outputs 948 and are provided to the SVP processor device. - In operation,
Instruction Generator 910 feeds the SVP processor array with a stream of data, instructions, addresses, and control signals at a desired clock rate. The generated microcode manipulates and instructs the processor element arithmetic logic units, multiplexers, registers, etc. of SVP 102 of Fig. 1. Instruction generator 910 can, in addition to the core instructions, generate instructions which allow the SVP core processor to operate in the manner of a simple microprocessor. In this mode, instructions such as unconditional jump, call, and jump on certain flag test instructions flag Instruction generator 910 can receive internal control codes from Vertical Timing Generator 904 or Master Controller 902, and receive flags from Horizontal Timing Generator 906. - During operation, instruction microcode stored in instruction Program Memory (IPM) 1258 is fetched, interpreted and executed by
Instruction Decoder 1234. Some of the decoded signals can be used as the address selection of Instruction Program Register Multiplexer (IPRX) 1230 to change the address latched in the instruction Program Register (IPR) 1250. The instruction codes control the various types of Instruction Sets, for instance, conditional or unconditional jump, subroutine call or return, vector addressing with updated mode value, single or double instruction, auxiliary register control for the distribution of global variables, and the global rotation for RAM FILE(0 and 1) addresses, etc. - When the break point signal is asserted during the debugging stage,
break point controller 1274 sets the content of IPR 1250 with a pre-determined value to move the flow of the program into specific subroutines in order to test the data processed by the SVP operations. This break function can be controlled by the maskable input of BPLINE 1276 Horizontal line within a given frame of the video signal. -
Repeat counter 1294 reduces the required amount of memory locations in IPM 1258 by representing a number of successive, identical instructions as a combination of this instruction code and the number of "1" end of loop signal is encountered. The sequence then proceeds to step 1214 and control logic 1170 resets loop counter 1182 and decrements repeat counter 1128 via signals on lines 1186 and 1192 respectively. Next the sequence proceeds to step 1216. At step 1216, if repeat counter 1128 has not reached zero the sequence returns to step 1207. If repeat counter 1128 has reached zero the sequence proceeds to step 1221 and control logic 1132 increments sequence counter + 1 and the sequence returns to step 1206 and the steps are repeated. If at step 1223 the sequence counter count is greater than the number of sequences the operation stops at step 1227. - In Fig. 42, there is depicted a five pole finite impulse response (FIR)
filter 792 of N-bit resolution which can be implemented in the present SVP device 102. By using the second nearest neighbor architecture of Fig. 18, 2N instructions can be saved over single near-neighbor architecture. For example, referring to the instruction set included hereinafter it is shown that processor 102 requires N instructions to move N bits from 2L to 1L to perform an add. Similarly, N instructions are required to move N bits from 2R to 1R. By having second nearest-neighbor connections, 2N instructions are saved over a single near-neighbor communication network. If, for example, a 12-bit FIR is implemented, the second-nearest-neighbor arrangement would require less than 68% of the execution time of a single-near-neighbor network. - As the SVP is a software programmable device, a variety of filters and other functions can be implemented in addition to the FIR of Fig. 42 (horizontal filter). These include, for example, vertical and temporal FIR filters and IIR filters (vertical and temporal).
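The data flow of such a five-tap horizontal filter can be sketched in Python, with each list position standing for one processor element. This is a functional illustration only. The function names, the coefficient ordering (2L, L, center, R, 2R) and the grounded edges are assumptions, and whole-number arithmetic replaces the bit-serial operation of the device, so the sketch shows which neighbors contribute to each output rather than the instruction count.

```python
from typing import List, Sequence


def neighbor(x: Sequence[int], offset: int) -> List[int]:
    """Value each PE reads from the PE at position n + offset (0 past the edges)."""
    n = len(x)
    return [x[i + offset] if 0 <= i + offset < n else 0 for i in range(n)]


def fir5(x: Sequence[int], coeffs: Sequence[int]) -> List[int]:
    """Five-tap FIR: every PE combines its 2L, L, own, R and 2R samples.

    With second nearest-neighbor links the 2L and 2R samples are read
    directly instead of being relayed through the L and R neighbors.
    """
    taps = [neighbor(x, -2), neighbor(x, -1), list(x), neighbor(x, 1), neighbor(x, 2)]
    return [sum(c * t[i] for c, t in zip(coeffs, taps)) for i in range(len(x))]


# Unit impulse at the center PE; coefficients chosen arbitrarily.
impulse_response = fir5([0, 0, 1, 0, 0], [1, 2, 3, 4, 5])
```

Feeding a unit impulse shows each coefficient landing on the PE that sees the impulse through the corresponding neighbor link.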
- In Fig. 43 four line memories are illustrated: an eight
bit line memory 824; a six bit line memory 826; and two four bit line memories present SVP device 102. To illustrate the technique, assume that Fig. 44a represents a register file, such as RF0 of processor element n, having bit locations 00 through 7F (0 through 127). The Fig. 44a register file can be broken into multiple pieces. In this example the register file is broken into two pieces, lower and upper (not necessarily equal). The upper part comprises bit locations 00 through 3F. The lower part comprises bit locations 40 through 7F. If the upper part is designated the global rotation memory, the lower part can be used as the normal operating register file. For ease of understanding the global rotation part can be, for example, reorganized as "P" words of "Q" bits where PxQ is less than or equal to the total global rotation space. This is illustrated in Fig. 44b, which is an exploded view of the upper part of Fig. 44a. Each line of the Fig. 44b global rotation area comprises 8 bits of the register file transposed in a stacked horizontal fashion. When an address in this memory area is specified, it is offset by a "rotation value" = Q modulus the total global rotation space. Thus instead of requiring that the data be shifted throughout the memory bank, the individual line memory subsets of the register file are circularly rotated. This is illustrated by the following example. - If the four example line memories of Fig. 43 are stored in the global rotation area of Fig. 44b, and a global rotation instruction is performed, the apparent effect is for the data to move as follows: B-->C; C-->D; D-->E; E-->G; G-->H; H-->M and J; M-->N; J-->K; N and K-->B. At first glance the movements E-->G, H-->M and J, and N and K-->B would appear to be an error since the old data existing prior to a global rotation appears to have been merely shifted. 
This is not the case however since immediately after the global rotation the new data values A, F, I and L are written into those locations and thus the old values E, H, K and N are lost--as would be expected in a line memory. To emulate the 1-horizontal delays, the global rotation instruction is executed once each horizontal line time. The SVP hardware allows the setting of the value of Q and the maximum value of the global rotation space.
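The address-offset idea behind global rotation can be sketched in Python as a circular buffer whose contents never move; only an offset changes. The class `RotatingArea` and its method names are hypothetical, the rotation direction is an assumption chosen so that data appears to advance by one word per rotation, and distinct integer markers are stored in place of single bits so that the apparent movement is visible.

```python
from typing import List


class RotatingArea:
    """Global rotation area of P words of Q bits, addressed through an offset."""

    def __init__(self, words: int, bits_per_word: int) -> None:
        self.q = bits_per_word
        self.size = words * bits_per_word  # total global rotation space
        self.cells: List[int] = [0] * self.size
        self.offset = 0

    def _physical(self, relative: int) -> int:
        # The relative address is offset by the rotation value, modulo the space.
        return (relative + self.offset) % self.size

    def read(self, relative: int) -> int:
        return self.cells[self._physical(relative)]

    def write(self, relative: int, value: int) -> None:
        self.cells[self._physical(relative)] = value

    def rotate(self) -> None:
        """Advance every word by one position without copying any data."""
        self.offset = (self.offset - self.q) % self.size


area = RotatingArea(words=4, bits_per_word=2)
for address in range(area.size):
    area.write(address, address)  # marker value equals its original address

area.rotate()
after_rotation = [area.read(a) for a in range(area.size)]
```

After one rotation the oldest word shows up at word 0, where new data would overwrite it, which matches the line-memory behavior described above.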
- Figure 45 is a logic diagram of global rotation address generator for register file 0 (RFO) 1246 of Fig. 36. Global rotation address generator for
register file 1 1244 of Fig. 36 is identical and accordingly the following discussion applies to both generators. Global rotation address generator 1246 receives a relative register address from register file 0 address counter via lines 1291. This relative address is provided to address register locations in register file 0 via lines 948. Microcode bits 32 through 37 are six of the eleven bits provided via lines instruction program memory 1258. The six bits provided via lines 1374 define the amount of registers in the total register area to be rotated during a rotation step. This is the word length P in the previous example. For engineering design purposes the value defined by bits 32 through 37 is scaled by a factor of 2 in this example. The scaled P value is provided to registers 1370. Microcode bits C48 through 42, provided from instruction program memory 1258 via lines 1382, define the total global rotation area, or Q in the previous example. For engineering design purposes the rotation area is scaled by a factor of 8. The scaled Q value is provided to registers 1380. When a global rotation is to begin, instruction decoder 1234 of Fig. 36 provides a signal LMRx (x = 0 for RF0 and x = 1 for RF1) via lines - The value of [ROT VAL REG] is <rotation value>/2, and for the above example is any number between 0 and [MOD REG]*4
-
- Figures 46a and 46b are parts of a flow diagram for a global rotation.
- In Fig. 47, example circuitry for pipelining of address, data, control and other signals received from
controller 128 is depicted. The illustrative circuit comprises an address buffer 1436 providing an input 1438 to factor generator 1440, the output of which is provided to address factor decoder 1448 by driver 1444. The output 1450 of decoder 1448 is provided to latch 1452 which is clocked at the sample frequency provided on line 1454. Latch 1452 can be reset between clocking by an active low input on line 1458. The output of latch 1452 is provided to the control line input of the section under control, such as word line 1462 of a data input register, input register file, output register file or data output register. If an external controller is used, chip pad contact 1432 is provided to input the control signal to the SVP core 102. The Fig. 47 type circuit can be used on the DOR side also. Fig. 48 is a table of various inputs and outputs for a pipeline circuit. - In Fig. 49, a timing diagram is provided to illustrate the improved speed of the device resulting from the ability to continuously provide signals to the SVP without requiring that the outcome of previously executed instructions be determined.
Signal 1431 is a valid memory address signal being provided to SVP device 102 core via external contact pad 1432. Signal 1450 is the decoded signal output of address decoder 1448. Signal 1462 illustrates the signal output of driver 1456 being provided to, for example, the DIR word line. If at time t0 a valid address signal is provided, the signal is decoded and provided to the latch 1452 at time t1, whereat it is latched in at time t3. Upon sampling, the decoded address is provided to selected word lines. Speed of operation is substantially improved by being able to continuously provide the subsequent signals to the address buffer before the previous signal has been executed. In the present circuit, the latch holds the state of the current operation's address while the new address (for the next operation) is pipelining through the input buffer, factor generator/driver, wiring and address decoder. As previously mentioned hereinabove, the present pipelining technique applies to data signals, control signals, instructions, constants and practically all other signals that are provided in a predeterminable sequence. - In Fig. 50, it is illustrated how to further pipeline the signals by configuring the input buffer as a latch. These latches can then be reset and clocked by some derivation of the reset 1482 and/or sample signals 1484. Contact pad 1486 receives a master clock input signal which is eventually provided throughout the pipelining system. Similarly, clock generator 1496 generates the latching and reset signals for the system. A device of this type can be provided for all control and address signals from the controller.
- Figure 51 depicts a controller circuit suitable for controlling distribution of global variables. The controller, as previously discussed, provides addressing, control and data signals to the SVP processing elements. To load variables into the SVP and distribute those variables globally, the controller hardware of Fig. 51 can be used.
- As depicted the controller can be modified to include a set of
auxiliary registers 1570 and an addressing structure which modulates the M registers of the SVP processing elements to distribute the variables. The auxiliary registers and modulation section 1196 comprises an auxiliary storage register 1510 such as a RAM memory and a 2-->1 multiplexer (MUX) 1574. Auxiliary registers 1570 has an 8-bit load data input 1562, a data write input 1564 and a register address or read port 1568 organized as 5-bit by 1. The auxiliary register write port is organized as 2-bit by 8. Auxiliary register output 1572 is provided to trigger the High input of MUX 1574. The Low input to MUX 1574 is bit C18 of the opcode output. Line 1576 provides an auxiliary register instruction enable signal to MUX 1574. The auxiliary registers 1570 are discussed in greater detail hereinafter. - Referring to Fig. 51, a memory map of the register file 1 (RF1) and data output register (DOR) of a processor element is depicted. As mentioned, the auxiliary register address in the memory map is part of the unused addresses for RF1/DOR. In operation the act of addressing the area "above" the DOR address in the memory selects the auxiliary registers. Data stored in the auxiliary registers are written as 4 words of 8 bits each, but read as 32 words of 1 bit each. When the state of an auxiliary register bit is read, either the auxiliary register output or original opcode bit C18 is passed directly to the M register data selector MUX.
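The write-wide, read-narrow organization of the auxiliary registers and the 2-to-1 selection feeding the M register can be sketched as follows. The class and function names are hypothetical. The least-significant-bit-first numbering within each 8-bit word is an assumption, since the text does not state the bit order, and the enable input is modeled simply as a boolean select.

```python
from typing import List


class AuxRegisters:
    """Written as 4 words of 8 bits, read back as 32 words of 1 bit."""

    def __init__(self) -> None:
        self.words: List[int] = [0, 0, 0, 0]

    def write_word(self, word_address: int, value: int) -> None:
        """2-bit word address, 8-bit data."""
        self.words[word_address] = value & 0xFF

    def read_bit(self, bit_address: int) -> int:
        """5-bit address selecting one stored bit (bit 0 of a word first: assumed)."""
        word, bit = divmod(bit_address, 8)
        return (self.words[word] >> bit) & 1


def m_register_select(aux_enable: bool, aux_bit: int, opcode_c18: int) -> int:
    """2-to-1 MUX: pass the auxiliary bit when enabled, else opcode bit C18."""
    return aux_bit if aux_enable else opcode_c18


aux = AuxRegisters()
aux.write_word(1, 0b00000101)  # a hypothetical global variable byte

# Stream the byte out one bit at a time, as a bit-serial PE would consume it.
streamed = [aux.read_bit(8 + i) for i in range(8)]
selected = m_register_select(True, aux.read_bit(8), opcode_c18=0)
```

Reading one bit per instruction in this way lets a byte loaded once by the controller be presented bit by bit to all processor elements.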
- If considered in conjunction with the two 4-bit word addition example discussed earlier (Table 25) it is clear that
instructions 2 through 31 of the instruction set can be compressed into 15 double instructions. By then implementing the repeat counter mode, the 15 double instructions can be assembled as a single instruction repeated 15 times by the included hardware. Thus, an addition of two 32-bit words is reduced from 33 to 4 instructions. When the repeat counter is engaged, the program counter stops and the two address counters auto-increment by 1 for single instructions or by 2 for double instructions. It should be apparent from the above discussion of operation that controller memory reduction as described in accordance with the present invention may be implemented with or without concurrent use with double instructions. If, for example, the above 32-bit add example is implemented without double instructions, the repeat count bit value can be increased to allow for a larger repeat count or the first repeat can be performed twice. - Figure 54 depicts an alternative embodiment of the present synchronous vector processor/controller chip. In Fig. 54 the instruction generator and auxiliary registers are included on chip with the SVP processor core array. As previously mentioned hereinabove, controller 1626 and SVP device 1628 can be manufactured on one silicon
chip forming device 1630. Clock oscillator 1632 is phase locked to the transmitted television signal and provides clocking signals to the controller section. Clock oscillator 1634 is generally clocked to match the SVP operating speed. - Figure 1 and the descriptions relating thereto detail how the SVP device and controller are incorporated into a television system. Also included is a description of how a video cassette/
tape recorder 134 can have its output 136 provided to the SVP processor in place of the transmitted video signal. Alternatively an SVP device/controller system can be incorporated directly within a video tape recorder. An example of how this can be done is depicted in Fig. 55. Block 1630 may contain one or more SVP devices for system 1629. System 1630 includes conventional tuner circuitry 1644 for tuning reception of composite or S-VHS video signals. Color separation and demodulation circuitry 1642 processes the tuned signal and the output is provided to SVP system 1630 in the manner previously discussed. A processed signal output is color modulated by circuitry 1640 and either a composite video signal or an S-VHS video signal is output from modulator 1640. The composite video signal is RF modulated by circuitry 1638 and provided to a television antenna input or monitor input for display. - During a record mode the processed video signal is phase and FM modulated by
circuitry 1634 and recorded by head logic 1636 in the conventional manner. During playback the recorded signal is read from the tape and transmitted to phase and FM demodulation circuitry 1632. Thereafter the signal can again be processed by SVP system 1630 and provided as an output. One or more field memories 120 may be used to capture data in the manner previously discussed with respect to Fig. 1. - The synchronous vector processor device and controller system disclosed and described herein is not limited to video applications. The SVP's unique real-time performance offers flexible design approaches to a number of signal processing applications. Some of the applications are listed in Table 27.
- Figure 58 depicts a vision inspection system incorporating an SVP system. The system includes a video camera for viewing objects to be inspected or otherwise analyzed. The camera outputs a video signal to the inputs of an A-to-D converter, which digitizes the analog video signal and provides a digital input to the SVP system. The SVP system may also be provided with stored images from a memory or mask storage source such as an optical disk. The SVP can provide an output to a display or other indicator means and also to a host computer. The host computer may be used to control a timing and control circuit which also provides signals to the analog-to-digital converter, the memory and the SVP device system. The visual inspection system of Fig. 58 can perform inspection of devices by comparing them to stored master images. The output can be an image showing differences, a simple pass/fail indicator, or a more complex report. The system can automatically determine which device is being inspected. Other types of sensors could be used as well, such as infrared, x-ray, etc. Pre- and post-processing of the images could be performed to further enhance the output.
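The comparison against a stored master image, with either a difference image or a pass/fail indicator as output, can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: the function name `inspect`, the per-pixel tolerance `pixel_tol`, and the allowed defect count `max_bad` are hypothetical and do not appear in the disclosure, and the images are plain lists of grey levels rather than SVP register contents.

```python
def inspect(captured, master, pixel_tol=8, max_bad=0):
    """Compare a captured frame with a stored master image, pixel by pixel.

    Returns the difference image and a pass/fail flag. Both thresholds are
    illustrative assumptions.
    """
    diff = [[abs(c - m) for c, m in zip(c_row, m_row)]
            for c_row, m_row in zip(captured, master)]
    # Count pixels whose deviation from the master exceeds the tolerance.
    bad = sum(d > pixel_tol for row in diff for d in row)
    return diff, bad <= max_bad


master = [[10, 10], [10, 10]]
captured = [[10, 12], [10, 200]]
difference_image, passed = inspect(captured, master)
```

Each row of the images is handled independently, which is in the spirit of a processor that operates on a whole line of pixels at once.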
- Figure 59 depicts a pattern recognition system incorporating an SVP system. The SVP device receives digitized input signals from the output of an analog-to-digital converter. Stored patterns may also be provided to the SVP for processing from an external memory. The input data is processed and a pattern number is output from the SVP. The analog-to-digital converter, stored pattern memory and SVP may operate under control of output signals from a control and timing circuit. The pattern recognition system compares input data with stored data. This system goes beyond the visual inspection system and classifies the input data. Due to the SVP's speed, many comparisons can be made in real time. Long sequences of data can be classified. An example speech recognition application is illustrated in Fig. 60. Fig. 60 depicts speech data sampled at a frequency of 8 kilohertz. Since speech is digitized at relatively low rates, such as 8 kilohertz, the SVP has plenty of time to perform many calculations on the transmitted speech data. An input 1024 samples long would give approximately one-eighth second to process the data, which corresponds to around 1.4 million instructions. In addition, the SVP can store many lines of data and thus recognize words, phrases, even sentences.
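The timing figures in the speech example can be checked with a short calculation. The sample rate, window length and instruction count are taken from the text; the instruction rate computed at the end is only what those figures imply when divided out, not a value stated in the description, and the function name is hypothetical.

```python
SAMPLE_RATE_HZ = 8_000                # speech digitized at 8 kilohertz
WINDOW_SAMPLES = 1_024                # input length from the example
INSTRUCTIONS_PER_WINDOW = 1_400_000   # "around 1.4 million" in the text


def speech_budget():
    """Return (window duration in seconds, implied instructions per second)."""
    window_s = WINDOW_SAMPLES / SAMPLE_RATE_HZ           # 0.128 s, about one-eighth second
    implied_rate = INSTRUCTIONS_PER_WINDOW / window_s    # derived, not stated in the text
    return window_s, implied_rate


window_s, implied_rate = speech_budget()
print(f"window: {window_s:.3f} s, implied rate: {implied_rate / 1e6:.1f} M instructions/s")
```

The window of 1024 samples lasts 0.128 second, which is the "approximately one-eighth second" quoted above.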
- Figure 61 depicts a typical radar processing system that utilizes an SVP. Detected radar signals are transmitted from the antenna to an RF/IF circuit and the FM/AM outputs are provided to an analog-to-digital converter. The digitized output signal is processed by the SVP and the output is provided to a display or stored in memory. This system processes pulse radar data and either stores or displays the results.
- Figure 62 is a picture phone system utilizing a Synchronous Vector Device. Fig. 62 depicts the transmission and reception sides. The video camera views the subject and the analog signal is digitized by an analog-to-digital converter. The digitized output is provided as an input to the SVP device. Other inputs include tables and the output of a frame memory. The SVP DTMF output is filtered in the filter circuit and provided to the phone lines. On the reception end, the phone lines deliver the transmitted data to an analog-to-digital converter, and the digitized signal is processed by a Synchronous Vector Device. The input signal may be processed along with stored data in a frame memory. The SVP output is converted to analog by a digital-to-analog converter, placed in a matrix and displayed by a display. The picture phone system compresses input images, then encodes them as DTMF values and sends them over phone lines to a receiver. Sine tables are used to generate the tones directly in the SVP. On the receiving end the DTMF tones are digitized, then detected and decompressed in the SVP.
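Table-driven tone generation of the kind mentioned for the picture phone can be sketched as follows. The row and column frequencies are the standard DTMF pairs; everything else is an assumption made for illustration: the table size, the 8 kHz output rate, the function names, and the mapping of a 4-bit value onto a (row, column) pair are not specified in the description.

```python
import math

TABLE_SIZE = 256                       # assumed sine-table length
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

ROW_HZ = (697, 770, 852, 941)          # standard DTMF low-group frequencies
COL_HZ = (1209, 1336, 1477, 1633)      # standard DTMF high-group frequencies


def symbol_for(value):
    """Hypothetical mapping of a 4-bit value to a (row, column) DTMF pair."""
    return divmod(value & 0xF, 4)


def dtmf_samples(row, col, n, sample_rate=8000):
    """Generate n samples of one DTMF symbol by stepping two phases through the table."""
    step_lo = ROW_HZ[row] * TABLE_SIZE / sample_rate
    step_hi = COL_HZ[col] * TABLE_SIZE / sample_rate
    phase_lo = phase_hi = 0.0
    out = []
    for _ in range(n):
        # Sum of the two table lookups, scaled to stay within [-1, 1].
        out.append(0.5 * (SINE_TABLE[int(phase_lo)] + SINE_TABLE[int(phase_hi)]))
        phase_lo = (phase_lo + step_lo) % TABLE_SIZE
        phase_hi = (phase_hi + step_hi) % TABLE_SIZE
    return out


row, col = symbol_for(5)
tone = dtmf_samples(row, col, n=80)    # 10 ms of tone at the assumed 8 kHz rate
```

A table lookup replaces the per-sample sine evaluation, which is why a stored table lets simple processor elements produce tones directly.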
- Figures 63a and 63b depict a facsimile system utilizing a Synchronous Vector Processor. Fig. 63 depicts the transmitting or sending end. A document scanner scans the document to be transmitted and the scanned binary data is provided as an input to the SVP. Time tables can be used to generate tones directly in the SVP. The SVP performs encoding and tone generation. The tones are output to a filter and then provided to the phone lines. On the receiving end, the received data from the phone line is converted to digital by an analog-to-digital converter and provided to the SVP for tone detection and decoding. The decoded SVP output is then printed by a printer.
- Figure 64 is an SVP-based document scanner system which converts scanned documents to ASCII files. The scanner output is provided to the SVP, where it is processed along with character tables, and the processed output is stored in memory. The document scanner system digitizes data like a FAX machine, but performs pattern recognition on the data and converts it to ASCII format.
- The SVP can be used for secure video transmission. This system is shown in Fig. 65. The system includes a video signal source which provides an output to an input buffer. The buffered signal is provided to the SVP for processing. The SVP and input buffer can operate under control of a controller. The encoded signal from the SVP is provided to a transmitter, from which it is transmitted to a receiver, where it is again input buffered and then decoded by an SVP on the receiving end. The SVP in the above system can encrypt a video signal by multiplying the pixel in each processor element by an arbitrary constant. The mapping of encryption constants to processor elements is defined by ROM coded pattern in the encoding and
- The following sections list some legal sub-instruction mnemonics. Higher level instructions may be created from these primitives. The value to the left of the assignment operator '=' in the listing is the Destination operand, while the value to the right is the Source operand:
- Sub-instructions whose data source depends on the value of WRM (that is, M-dependent sub-instructions) show three lines. The first line shows the sub-instruction entered into the program, while the second and third lines show the operational result depending on whether (WRM)=0 or (WRM)=1, respectively. '(WRM)' is the contents of Working Register WRM.
- The following table lists all of the legal instruction mnemonics and their opcodes for the Instruction Generator, plus the variations on the array instructions for single, wait-stated single, and double instructions.
- x - don't care
- b - break point bit
- rrrr - 4-bit repeat count value in 2's complement form
- ppppppp - 7-bit memory address for RF1 or DOR or AUX
- nnnnnnn - 7-bit memory address for RF0 or DIR
- ii..i iii iii - Array instruction opcode from Appendix B.
- 00..0 - all bits in the field are zero
- vvvv - 5-bit value from the IG Mode Input Pins
- Single wait stated Single Double Idle
- JMP <adr1> Unconditional jump to address <adr1>.
- JME <val>,<adr1> JUMP on MODE EQUAL. Jump to <adr1> if <val> = <(mode register)>, else go to next statement.
- JMT <adr2> JUMP to MODE TABLE. Jump to mode table at <adr2> with relative table entry point of <(mode register)>. <adr2> is an 11-bit address with the 5 LSBs equal to 00000. The absolute address is: (<adr2> AND 07E0h) + <(mode register)>
- The table at <adr2> will most likely contain JMP instructions to subroutines within the main program; however, any instruction may be used in the table. The table must be located on a 5-bit boundary.
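The JMT target address is the table base with its five low bits cleared, plus the mode register contents. A minimal sketch of that arithmetic follows; the function name is hypothetical, and masking the mode value to 5 bits is an assumption based on the 5-bit mode input described in the opcode legend.

```python
def jmt_target(adr2, mode):
    """Absolute address for JMT: (<adr2> AND 07E0h) + (mode register).

    The mask 07E0h keeps bits 5..10 of the 11-bit address, so the table
    always starts on a 32-word boundary. The mode value is limited to
    5 bits here (assumption), so it indexes one of 32 table entries.
    """
    return (adr2 & 0x07E0) + (mode & 0x1F)


entry = jmt_target(0x120, 3)   # table at 0x120, mode register holds 3
```

Because the low five bits of the base are forced to zero, adding the mode value can never carry into the base address, which is why the table must sit on a 5-bit boundary.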
- OUT Output control signal.
- The MC pauses its execution after an "OUT" instruction and restarts its execution when "FSYNC" arrives.
- The table must be constructed with up to 16 "OUT" instructions.
- One of the "OUT" instructions is chosen by the contents of "COMB".
- TCMA Test COMA. If COMA is equal to <c>, jump to <label>. If COMA is not equal to <c>, execute the next instruction.
- Object file
- Listing file
- Instruction format; Label Fields, Instruction Fields, Mnemonic Fields, Operand Fields, Comment Fields
- Constants; Binary Integers, Octal Integers, Decimal Integers, Hexadecimal Integers
- Symbols
- Directives; .PAGE, .TITLE "string", .WIDTH <width>, .COPY <file name>, .END, .SET <value>, .ASECT
Claims (9)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US42148889A | 1989-10-13 | 1989-10-13 | |
US421488 | 1989-10-13 | ||
US07/421,499 US5163120A (en) | 1989-10-13 | 1989-10-13 | Second nearest-neighbor communication network for synchronous vector processor, systems and methods |
US421499 | 1995-04-12 |
Publications (3)
Publication Number | Publication Date |
---|---|
EP0422964A2 true EP0422964A2 (en) | 1991-04-17 |
EP0422964A3 EP0422964A3 (en) | 1993-06-23 |
EP0422964B1 EP0422964B1 (en) | 1997-05-14 |
Family
ID=27025268
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP90311267A Expired - Lifetime EP0422964B1 (en) | 1989-10-13 | 1990-10-15 | Second nearest-neighbor communication network for synchronous vector processor systems and methods |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP0422964B1 (en) |
JP (1) | JP3187823B2 (en) |
KR (1) | KR0184865B1 (en) |
CN (1) | CN1042282C (en) |
DE (1) | DE69030705T2 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0676096A1 (en) * | 1993-10-28 | 1995-10-11 | Motorola, Inc. | Demodulator logic unit adaptable to multiple data protocols |
WO1996003700A1 (en) * | 1994-07-22 | 1996-02-08 | Ivp Integrated Vision Products Ab | Arrangement at an image processor |
EP0701219A1 (en) * | 1994-08-31 | 1996-03-13 | Sony Corporation | Parallel processor apparatus |
EP0726532A2 (en) * | 1995-02-10 | 1996-08-14 | International Business Machines Corporation | Array processor communication architecture with broadcast instuctions |
CN106559076A (en) * | 2015-09-24 | 2017-04-05 | 半导体元件工业有限责任公司 | The calibration of spread-spectrum clock generator and its method |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100306769B1 (en) * | 1999-03-13 | 2001-09-13 | 손재익 | Method for preparing mixed fuel using sewage sludge and use thereof |
KR100598702B1 (en) * | 2000-03-22 | 2006-07-11 | 넥스원퓨처 주식회사 | Measure system of receiving sensibility for receiving data |
US7017064B2 (en) * | 2001-05-09 | 2006-03-21 | Mosaid Technologies, Inc. | Calculating apparatus having a plurality of stages |
US8281395B2 (en) * | 2009-01-07 | 2012-10-02 | Micron Technology, Inc. | Pattern-recognition processor with matching-data reporting module |
CN107037262B (en) * | 2017-04-25 | 2020-02-11 | 成都玖锦科技有限公司 | Big data spectrum analysis system and method thereof |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0317218A2 (en) * | 1987-11-13 | 1989-05-24 | Texas Instruments Incorporated | Serial video processor and fault-tolerant serial video processor device |
EP0444368A1 (en) * | 1990-02-28 | 1991-09-04 | Texas Instruments France | Data input system for single-instruction multiple-data processor |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4380046A (en) * | 1979-05-21 | 1983-04-12 | Nasa | Massively parallel processor computer |
US4860248A (en) * | 1985-04-30 | 1989-08-22 | Ibm Corporation | Pixel slice processor with frame buffers grouped according to pixel bit width |
-
1990
- 1990-10-13 CN CN90108413A patent/CN1042282C/en not_active Expired - Fee Related
- 1990-10-13 KR KR1019900016356A patent/KR0184865B1/en not_active IP Right Cessation
- 1990-10-15 JP JP27612490A patent/JP3187823B2/en not_active Expired - Fee Related
- 1990-10-15 DE DE69030705T patent/DE69030705T2/en not_active Expired - Fee Related
- 1990-10-15 EP EP90311267A patent/EP0422964B1/en not_active Expired - Lifetime
Non-Patent Citations (4)
Title |
---|
IEEE 1988 INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS 8 June 1988, ROSEMONT, USA pages 144 - 145 D. CHIN ET AL 'The Princeton engine: a real-time video system simulator' * |
IEEE 1990 CUSTOM INTEGRATED CIRCUITS CONFERENCE 13 May 1990, BOSTON, USA pages 1731 - 1734 J. CHILDERS 'SVP : serial video processor' * |
IEEE TRANSACTIONS ON CONSUMER ELECTRONICS vol. 36, no. 3, August 1990, NEW YORK US pages 318 - 325 H. MIYAGUCHI ET AL 'Digital TV with serial video processor' * |
OPTICAL ENGINEERING vol. 28, no. 9, September 1989, BELLINGHAM US pages 943 - 948 B. K. LIEN AND G. Y. TANG 'Pipelined microcomputers for digital signal and image processing' * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0676096A1 (en) * | 1993-10-28 | 1995-10-11 | Motorola, Inc. | Demodulator logic unit adaptable to multiple data protocols |
EP0676096A4 (en) * | 1993-10-28 | 1999-08-25 | Motorola Inc | Demodulator logic unit adaptable to multiple data protocols. |
WO1996003700A1 (en) * | 1994-07-22 | 1996-02-08 | Ivp Integrated Vision Products Ab | Arrangement at an image processor |
CN1098494C (en) * | 1994-07-22 | 2003-01-08 | Ivp集成图象产品公司 | Arrangement at an image processor |
US5982393A (en) * | 1994-07-22 | 1999-11-09 | Ivp Integrated Vision Products Ab | Arrangement at an image processor |
AU685311B2 (en) * | 1994-07-22 | 1998-01-15 | Ivp Integrated Vision Products Ab | Arrangement at an image processor |
US5666169A (en) * | 1994-08-31 | 1997-09-09 | Sony Corporation | Parallel processor apparatus having means for processing signals of different lengths |
EP0973101A1 (en) * | 1994-08-31 | 2000-01-19 | Sony Corporation | Parallel processor apparatus |
EP0973100A1 (en) * | 1994-08-31 | 2000-01-19 | Sony Corporation | Parallel processor apparatus |
EP0701219A1 (en) * | 1994-08-31 | 1996-03-13 | Sony Corporation | Parallel processor apparatus |
US5659785A (en) * | 1995-02-10 | 1997-08-19 | International Business Machines Corporation | Array processor communication architecture with broadcast processor instructions |
EP0726532A3 (en) * | 1995-02-10 | 1997-03-19 | Ibm | Array processor communication architecture with broadcast instuctions |
EP0726532A2 (en) * | 1995-02-10 | 1996-08-14 | International Business Machines Corporation | Array processor communication architecture with broadcast instuctions |
CN106559076A (en) * | 2015-09-24 | 2017-04-05 | 半导体元件工业有限责任公司 | The calibration of spread-spectrum clock generator and its method |
CN106559076B (en) * | 2015-09-24 | 2021-12-24 | 半导体元件工业有限责任公司 | Calibration of spread spectrum clock generator and method thereof |
Also Published As
Publication number | Publication date |
---|---|
EP0422964A3 (en) | 1993-06-23 |
EP0422964B1 (en) | 1997-05-14 |
CN1042282C (en) | 1999-02-24 |
DE69030705T2 (en) | 1997-12-11 |
DE69030705D1 (en) | 1997-06-19 |
KR910008566A (en) | 1991-05-31 |
JP3187823B2 (en) | 2001-07-16 |
JPH0436856A (en) | 1992-02-06 |
KR0184865B1 (en) | 1999-05-15 |
CN1054871A (en) | 1991-09-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5598545A (en) | Circuitry and method for performing two operating instructions during a single clock in a processing device | |
US5628025A (en) | Timing and control circuit and method for a synchronous vector processor | |
US5210836A (en) | Instruction generator architecture for a video signal processor controller | |
US5163120A (en) | Second nearest-neighbor communication network for synchronous vector processor, systems and methods | |
US5539891A (en) | Data transfer control circuit with a sequencer circuit and control subcircuits and data control method for successively entering data into a memory | |
US5408673A (en) | Circuit for continuous processing of video signals in a synchronous vector processor and method of operating same | |
US5680600A (en) | Electronic circuit for reducing controller memory requirements | |
US5105387A (en) | Three transistor dual port dynamic random access memory gain cell | |
KR100283161B1 (en) | Motion evaluation coprocessor | |
US5327541A (en) | Global rotation of data in synchronous vector processor | |
EP0422964A2 (en) | Second nearest-neighbor communication network for synchronous vector processor systems and methods | |
US5452425A (en) | Sequential constant generator system for indicating the last data word by using the end of loop bit having opposite digital state than other data words | |
EP0422965A2 (en) | Circuit for continuous processing of video signals in a synchronous vector processor | |
Harney et al. | The i750 video processor: A total multimedia solution | |
EP0393125B1 (en) | Stored program controller with a conditional branch facility as for a video signal processor | |
US5293637A (en) | Distribution of global variables in synchronous vector processor | |
JP2774115B2 (en) | Sequential video processor system | |
US5499375A (en) | Feedback register configuration for a synchronous vector processor employing delayed and non-delayed algorithms | |
EP0422963A2 (en) | Signal pipelining in synchronous vector processor | |
EP0428269A2 (en) | Instruction generator architecture for a video signal processor controller | |
US5239628A (en) | System for asynchronously generating data block processing start signal upon the occurrence of processing end signal block start signal | |
US5860130A (en) | Memory interface apparatus including an address modification unit having an offset table for prestoring a plurality of offsets | |
JP2960328B2 (en) | Apparatus for providing operands to "n + 1" operators located in a systolic architecture | |
JPH10304356A (en) | Parallel picture compression processor | |
JPH11353470A (en) | Image drawing parallelizing device and parallelized image drawing system |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
AK | Designated contracting states | Kind code of ref document: A2 Designated state(s): DE FR GB IT NL |
PUAL | Search report despatched | Free format text: ORIGINAL CODE: 0009013 |
AK | Designated contracting states | Kind code of ref document: A3 Designated state(s): DE FR GB IT NL |
17P | Request for examination filed | Effective date: 19931221 |
17Q | First examination report despatched | Effective date: 19941110 |
GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): DE FR GB IT NL |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED. Effective date: 19970514 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 19970514 |
REF | Corresponds to: | Ref document number: 69030705 Country of ref document: DE Date of ref document: 19970619 |
ET | Fr: translation filed | |
NLV1 | Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act | |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | |
REG | Reference to a national code | Ref country code: GB Ref legal event code: IF02 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB Payment date: 20060915 Year of fee payment: 17 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE Payment date: 20061031 Year of fee payment: 17 |
GBPC | Gb: european patent ceased through non-payment of renewal fee | Effective date: 20071015 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20080501 |
REG | Reference to a national code | Ref country code: FR Ref legal event code: ST Effective date: 20080630 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR Payment date: 20061003 Year of fee payment: 17 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20071015 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20071031 |