WO2023007258A1 - Hybrid 3-dimensional optical computing accelerator engine apparatus and method - Google Patents

Hybrid 3-dimensional optical computing accelerator engine apparatus and method

Info

Publication number
WO2023007258A1
WO2023007258A1 PCT/IB2022/054268 IB2022054268W WO2023007258A1 WO 2023007258 A1 WO2023007258 A1 WO 2023007258A1 IB 2022054268 W IB2022054268 W IB 2022054268W WO 2023007258 A1 WO2023007258 A1 WO 2023007258A1
Authority
WO
WIPO (PCT)
Prior art keywords
optical
hoca
liquid crystal
unit
lcu
Prior art date
Application number
PCT/IB2022/054268
Other languages
French (fr)
Inventor
Raghavendra SWAMY H
Iven Jose
Original Assignee
Swamy H Raghavendra
Iven Jose
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Swamy H Raghavendra, Iven Jose
Publication of WO2023007258A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06E OPTICAL COMPUTING DEVICES; COMPUTING DEVICES USING OTHER RADIATIONS WITH SIMILAR PROPERTIES
    • G06E3/00 Devices not provided for in group G06E1/00, e.g. for processing analogue or hybrid data
    • G06E3/008 Matrix or vector computation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Definitions

  • the present invention relates generally to an architectural model of a computation accelerator or processor accelerator, and more particularly to a multi-dimensional hybrid optical computing accelerator.
  • US5510665 describes the unique construction of an optoelectronic circuit element unit that can be used as a basic circuit building block, like a transistor or diode.
  • the unit optoelectronic element is a sandwich of layers in the order photodetector, light modulator, light source, light modulator, photodetector.
  • the source is placed at the center of the device and the two photodetectors are placed at the outer layers.
  • the two light modulator layers sit between the light source and the photodetector on either side. Each light modulator layer modulates the light passing through it.
  • US6804412 describes an optical correlator method that measures the correlation of images using a collimated light source, with the Fourier transform as the mathematical model. It targets specialized computing, not general-purpose computation.
  • US7747102B2 also describes an optical correlator method that measures the correlation of images using a collimated light source, with the Fourier transform as the mathematical model. It places the image production and image capture devices in the same plane to reduce the size of the computing block. It targets specialized computing, not general-purpose computation.
  • US8610839B2 describes an optical processing system that uses a twisted collimated spatial light modulator, with the Fourier full and partial derivatives transformation as the mathematical model. It targets specialized computing, not general-purpose computation.
  • WO2014087126-PAMPH-041 describes an optical processing system that uses a multilayered twisted collimated spatial light modulator, with the Fourier transform as the mathematical model.
  • US20170248734 describes methods and systems that use an optical receiver and electro-optic methods to transmit data from integrated computational elements. It uses Time Division Multiplexing and Wavelength Division Multiplexing in the O-band, S-band, C-band, L-band, U-band, and near-infrared wavelength regions.
  • the optical computing target is specific to remote monitoring of fluids in the oil and gas industry.
  • WO2021083348 describes the optical computing device construction based on the passive optical waveguides, which are used to transmit the monochromatic optical signals.
  • phase modulation and amplitude modulation have been incorporated to realize multiplication and addition operations.
  • the optical switching elements in an array of controllable optical switching elements are controlled in accordance with an adaptive computation to be performed on inputs to produce outputs.
  • a light that carries the inputs and outputs is emitted to and collected from the optical switching elements.
  • each of the pixels on the film can be considered a light modulating or optical switching element and the film can be considered an SLM array of the pixels.
  • the film together with the lenses, the optical fibers, and the controller can be considered an optical switch.
  • An optical neural network is constructed based on photonic integrated circuits to perform neuromorphic computing.
  • matrix multiplication is implemented using one or more optical interference units, which can apply an arbitrary weighting matrix multiplication to an array of input optical signals.
  • Nonlinear activation is realized by an optical nonlinearity unit, which can be based on nonlinear optical effects, such as saturable absorption.
  • the need of the hour is a method and apparatus for designing a computing accelerator that operates on light rather than purely on silicon transistors. It is also desirable to harness the inherently parallel properties of light to perform parallel computation, and to provide a custom parallel programming software interface so that the optical computing accelerator hardware can be used for user-level programming.
  • the liquid crystal unit is used as the basic compute element unit in the design of the hybrid optical computing accelerator engine.
  • This computing accelerator engine works based on the mathematical model of light behavior.
  • This mathematical algebra is the hybrid model of electronics and optical computing systems.
  • different architectural designs of the hybrid compute engines are proposed.
  • the software stack of application programming interface OxAPI can be used to harness the capability of the hybrid optical computing accelerator engine.
  • Figure 1 is an existing silicon-based computing block
  • Figure 2 is an existing silicon-based Non-hUMA APU [Non-heterogeneous Unified Memory Access Accelerated Processing Unit] computing block;
  • Figure 3 is an existing silicon-based hUMA APU [heterogeneous Unified Memory Access Accelerated Processing Unit] computing block;
  • Figure 4 is an existing artifact Liquid Crystal Unit showing the on and off light control behavior using voltage support
  • Figure 5 is an existing artifact Liquid Crystal Unit with the color filter block
  • Figure 6 is an existing artifact of single-pixel using RGB filter [RED, GREEN, BLUE] block liquid crystal units;
  • Figure 7 is a proposed artifact generic mathematical model of a single Liquid Crystal Unit with electronic gate control
  • Figure 8 is a proposed Model-1 generic artifact mathematical model of a single Liquid Crystal Unit with electronic gate control, horizontal and vertical polarizers blocks;
  • Figure 8A is a proposed artifact Model-1 Subtype-1 of one-dimensional LCU compute Blocks with electronic gate control, horizontal and vertical polarizers blocks;
  • Figure 8B is a proposed artifact Model-1 Subtype-2 of two-dimensional LCU compute Blocks with electronic gate control, horizontal and vertical polarizers blocks;
  • Figure 8C is a proposed artifact Model-1 Subtype-3 of three-dimensional LCU compute Blocks with electronic gate control, horizontal and vertical polarizers blocks;
  • Figure 9 is a proposed Model-2 generic artifact mathematical model of a single Liquid Crystal Unit with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
  • Figure 9A is a proposed artifact Model-2 Subtype-1 of one-dimensional LCU compute Blocks with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
  • Figure 9B is a proposed artifact Model-2 Subtype-2 of two-dimensional LCU compute Blocks with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
  • Figure 9C is a proposed artifact Model-2 Subtype-3 of three-dimensional LCU compute Blocks with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
  • Figure 10 Generic model of 3D optical compute accelerator unit integration to CPU-based system using high-speed bus interface;
  • Figure 11 Generic model of 3D optical compute accelerator unit integration to GPU using high-speed bus interface;
  • Figure 12 Model A Subtype 1, 3D-OCE as co-processor or co-accelerator to CPU-based system using high-speed bus interface with electronic gate control, and horizontal, vertical polarizers blocks;
  • Figure 13 Model A Subtype 2, 3D-OCE as co-processor or co-accelerator to GPU using high-speed bus interface with electronic gate control, and horizontal, vertical polarizers blocks;
  • Figure 14 Model B Subtype 1, 3D-OCE as co-processor or co-accelerator to CPU-based system using high-speed bus interface with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
  • Figure 15 Model B Subtype 2, 3D-OCE as co-processor or co-accelerator to GPU using high-speed bus interface with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
  • Figure 16 Software stack block diagram for the 3D-HOCA;
  • Figure 17 Compilation model of the software stack block diagram for the 3D-HOCA;
  • Figure 18 OxAPI master program dispatch on CPU and co-processor program/slave program dispatch on the 3D-HOCA;
  • Figure 19 CPU with 4 ALU (Arithmetic Logic Unit) computational blocks;
  • Figure 20 GPU with multi-ALU SIMD (Single Instruction Multiple Data) computational blocks;
  • Figure 21 3D-HOCA as MIMD (Multiple Instruction Multiple Data) computational blocks;
  • Figure 22 3D-HOCA with high-speed RAM hosting the Instruction and Data panel pipeline;
  • Figure 23 The output of 3D LCU optical computing unit #1 is split into two parallel inputs using a 50:50 beam splitter for parallel computation;
  • Figure 27 LCU implementation of hybrid optical AND gate;
  • Figure 29 LCU implementation of hybrid optical OR gate.
  • FIG. 1 shows a system that can include a processor such as a single central processing unit (CPU), a multi-core processor, a network processor (NP), an audio accelerator (AA), a digital signal processor (DSP), a graphics processing unit (GPU), or an Accelerated Processing Unit (APU), which integrates both a CPU and a GPU. All of these are based on silicon technology and target general-purpose or specialized computation.
  • NP network processor
  • AA audio accelerator
  • DSP digital signal processor
  • GPU graphics processing unit
  • APU Accelerated Processing Unit
  • RAM Random Access Memory
  • DRAM Dynamic RAM
  • SRAM Static RAM
  • EEPROM electrically erasable programmable ROM
  • the power source provides the electric power to the circuitry of the system.
  • FIG. 2 shows a block diagram of a non-heterogeneous unified memory access (non-hUMA) accelerated processing unit (APU) computing block. It has the GPU and CPU fabricated on the same silicon die. Here the GPU can be programmed to perform general-purpose computations, resulting in a general-purpose graphics processing unit (GPGPU) accelerator that speeds up the computations. Because the GPU and CPU have disjoint RAM address spaces, accessing the other processor's RAM block requires an additional hop via the CPU or GPU block.
  • non-hUMA non-heterogeneous unified memory
  • APU accelerated processing unit
  • FIG. 3 shows a block diagram of heterogeneous unified memory (hUMA) APU computing block.
  • hUMA heterogeneous unified memory
  • the GPU and CPU are fabricated on the same silicon die and share a common RAM address space, which makes GPGPU coprocessor acceleration efficient for general-purpose computing.
  • FIG. 4 and FIG. 5 show the existing prior art of a liquid crystal unit (LCU): the liquid crystal is sandwiched between two glass plates coated with a transparent conducting electrode material such as indium tin oxide (ITO).
  • ITO indium tin oxide
  • the horizontal polarizing filter is layered below as the base of this sandwich construction and the vertical polarizing filter is layered above it.
  • the transparent electrodes are connected to the voltage source.
  • An unpolarized white light source unit is placed below the horizontal polarizer filter.
  • the working principle of the LCU is as follows: when the switch S1 is open and the light source is on, the horizontal polarizer passes only the horizontal planar waves of the light.
  • as shown in FIG. 4, with no voltage applied, the rods in the liquid crystal rotate the horizontally polarized light to vertical polarization, so it passes through the vertical polarizer filter at the top.
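The on/off behavior described above can be illustrated with a minimal sketch. It assumes an idealized cell: an ideal polarizer that passes half of the unpolarized light, a liquid crystal that rotates the polarization plane by a given twist angle, and a vertical analyzer obeying Malus's law. The function name `lcu_transmission` and the "fully driven, zero twist" state are illustrative assumptions, not taken from the patent text.

```python
import math

def lcu_transmission(unpolarized_intensity: float, twist_deg: float) -> float:
    """Idealized intensity leaving the vertical analyzer of one LCU (assumed model)."""
    # An ideal horizontal polarizer passes half of the unpolarized light.
    after_polarizer = unpolarized_intensity / 2.0
    # The liquid crystal rotates the polarization plane by twist_deg, so the
    # angle between the light and the vertical analyzer is (90 - twist_deg).
    angle_to_analyzer = math.radians(90.0 - twist_deg)
    # Malus's law: transmitted intensity scales with cos^2 of that angle.
    return after_polarizer * math.cos(angle_to_analyzer) ** 2

# Switch S1 open (no voltage): full 90-degree twist, so the light passes.
print(lcu_transmission(1.0, 90.0))
# Assumed fully driven state: no twist, so the crossed polarizers block the light.
print(lcu_transmission(1.0, 0.0))
```

Intermediate twist angles give intermediate intensities, which is one way to picture the multi-step gate values used later in the text.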
  • the unpolarized white source can be a white light-emitting diode (LED), a Red Green Blue (RGB) LED, or any known light-generating device or unit.
  • FIG. 5 shows the LCU with the Red (R), Green (G), Blue (B), Cyan (C), Magenta (M), Yellow (Y) filter blocks, which can be used to control and allow the specific wavelength as the LCU output.
  • FIG. 6 shows three LCUs with RGB filters combined into a single unit, with optical input (RoGoBo), electronic gate inputs (ReGeBe) controlling the three LCUs, and optical output (RoGoBo).
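A minimal sketch of the three-LCU pixel follows. It assumes that each electronic gate value linearly scales its optical channel and that all values lie in 0..255; the linear rule and the names `gate_channel` and `rgb_pixel` are hypothetical and only illustrate the per-channel control, since the text does not give the transfer function.

```python
def gate_channel(optical_in: int, gate: int) -> int:
    """Scale one 0-255 optical channel by a 0-255 electronic gate value (assumed rule)."""
    if not (0 <= optical_in <= 255 and 0 <= gate <= 255):
        raise ValueError("channel and gate values must be in 0..255")
    return (optical_in * gate) // 255

def rgb_pixel(optical_in, gates):
    """Apply the (Re, Ge, Be) gates to the (Ro, Go, Bo) inputs channel by channel."""
    return tuple(gate_channel(o, e) for o, e in zip(optical_in, gates))

# Red fully open, green closed, blue roughly half open.
print(rgb_pixel((255, 128, 64), (255, 0, 128)))
```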
  • FIG. 7 and FIG. 8 show the proposed generic mathematical model of a single liquid crystal unit, where Zo is the optical output of the LCU, Io is the optical input of the LCU, and Qe is the electronic gate control, which acts as the light or color filter transformation function, using equation 1
  • HCBEA Hybrid Checker Board Eclipse Algebra
  • MCA Multi-level Color Algebra
  • R RED wavelength with step values from 1 to 255
  • G GREEN wavelength with step values from 1 to 255
  • B BLUE wavelength with step values from 1 to 255
  • M MAGENTA with step values from 1 to 255
  • Y YELLOW with step values from 1 to 255
  • C CYAN with step values from 1 to 255
  • Xo Optical output value ∈ {Wo, Ro, Go, Bo, Co, Mo, Yo, Ko, IRo}
  • ao ∈ {R, G, B, C, M, Y}
  • the HCBEA algebra is a multi-level logic color algebra executed based on the behavior of the LCU with the electronic color filter transformation control and the optical input. It uses both electronic control and optical discrete step values to support multi-level compute logic.
  • FIG. 8 Model-1 and FIG. 9 Model-2 support the HCBEA algebra model.
  • the FIG. 8 Model-1 LCU has the horizontal and vertical polarizers but no RGB filter blocks.
  • the FIG. 9 Model-2 LCU has RGB filters along with the horizontal and vertical polarizers, which increases the number of step values compared with FIG. 8 Model-1.
  • in FIG. 8 Model-1, a single LCU has the RGB LED source and electronic step gate control; hence the following step combinations are supported,
  • in FIG. 9 Model-2, a single LCU has the RGB LED source, and the electronic step gate control acts as the color filter transformation function on each optical input; hence the following step combinations are supported,
  • Re CFTF (Ko(1) + GRAYo(254) + Wo(1) + Ro(255) + Go(255) + Bo(255) + Co(255) + Mo(255) + Yo(
  • existing transistors operating on binary logic would require 1,821,720 transistors to simulate the steps of a single FIG. 9 Model-2 LCU.
  • two parallel side-by-side LCUs can simulate 13.27465503 tera step combinations, and a 4 X 4 X 4 compute cube of LCUs can simulate 9.642417942847 X 10^104 step combinations.
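As a quick arithmetic check of the quoted figures, the sketch below takes the single-LCU count as 1,821,720 (written 18,21,720 in the source's digit grouping) and reads the cube figure as a power of ten, 10^104. The relationships it tests are observations about the quoted numbers only; the text does not state the formula behind them, so they should not be read as the authors' derivation.

```python
single_lcu_steps = 1_821_720  # quoted single-LCU figure

# Observation (not stated in the source): doubling the single-LCU count and
# squaring it reproduces the quoted two-LCU figure in units of 10^12 ("tera").
two_lcu = (2 * single_lcu_steps) ** 2
print(two_lcu / 1e12)

# The same base raised to the 16th power has the order of magnitude
# quoted for the 4 x 4 x 4 cube.
cube = (2 * single_lcu_steps) ** 16
print(f"{cube:.6e}")
```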
  • FIG. 8A Model-1, Subtype-1 describes the one-dimensional hybrid optical computing block with 4 rows X 1 column of vertically stacked LCUs, in which the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them.
  • the single RGB light source unit below drives input source Io1, and the RGB color sensor unit above the Zo4 output detects the optical result Zo4.
  • FIG. 8B Model-1, Subtype-2 describes the two-dimensional hybrid optical computing block with 4 rows X 4 columns of vertically stacked LCUs, in which the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them.
  • the 4 individual RGB light sources below the first layer of LCUs drive input sources Io1, Io2, Io3, Io4, and the 4 individual RGB color sensors above the Zo14, Zo24, Zo34, Zo44 outputs detect the computed optical results.
  • FIG. 8C Model-1, Subtype-3 describes the three-dimensional hybrid optical computing block with 4 X 4 X 4 stacked LCUs, in which the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them.
  • 33, Zo230, Zo231, Zo232, Zo233, Zo330, Zo331, Zo332, and Zo333 outputs are used to detect computed optical results. Furthermore, by using FIG. 8 Model-1 as the basic building block, it can be further scaled up to a multi-dimensional grid hybrid optical computational unit.
  • FIG. 9A Model-2, Subtype-1 describes the one-dimensional hybrid optical computing block with 4 rows X 1 column of vertically stacked FIG. 9 Model-2 LCUs, with the RGB filter block included.
  • the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them.
  • This creates 4 X 1 computing LCU blocks, each individually electronically controlled through the Qe1, Qe2, Qe3, and Qe4 lines, to produce the collective optical computational sub-outputs Zo1, Zo2, Zo3 and the final result Zo4 from the optical input Io1, based on the HCBEA mathematical model.
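The 4 X 1 stack can be sketched as a simple cascade in which each layer's output is the next layer's input. The HCBEA transfer function is not given in the text, so the per-LCU rule below (linear attenuation of a 0..255 level by a 0..255 gate) and the names `lcu` and `stack_1d` are hypothetical stand-ins.

```python
def lcu(optical_in: int, gate: int) -> int:
    """Hypothetical single-LCU transfer function: attenuate a 0-255 level."""
    return (optical_in * gate) // 255

def stack_1d(i_o1: int, gates: list[int]) -> list[int]:
    """Return the sub-outputs [Zo1, Zo2, ...] for LCUs stacked on one light source."""
    outputs = []
    level = i_o1
    for q_e in gates:
        level = lcu(level, q_e)  # each layer's output feeds the next layer
        outputs.append(level)
    return outputs

# Io1 = 255 with gate lines Qe1..Qe4; the last element plays the role of Zo4.
z = stack_1d(255, [255, 204, 255, 128])
print(z)
```

Changing any one gate value changes every sub-output above it, which is the sense in which the stack computes a collective result.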
  • FIG. 9B Model-2, Subtype-2 describes the two-dimensional hybrid optical computing block with 4 rows X 4 columns of vertically stacked FIG. 9 Model-2 LCUs, with the RGB filter block included.
  • the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned so as to let light pass through them.
  • the 4 individual RGB light sources below the first layer of LCUs drive input sources Io1, Io2, Io3, Io4, and the 4 individual RGB color sensors above the Zo14, Zo24, Zo34, Zo44 outputs detect the computed optical results.
  • FIG. 9C Model-2, Subtype-3 describes the three-dimensional hybrid optical computing block with 4 X 4 X 4 vertically stacked FIG. 9 Model-2 LCUs, with the RGB filter block included.
  • the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them.
  • the 4 X 4 individual RGB light sources below the first layer of LCUs drive input sources Io000, Io001, Io002,
  • the 4 X 4 individual RGB color sensors above the Zo030, Zo031, Zo032, Zo033, Zo130, Zo131, Zo132, Zo133, Zo230, Zo231, Zo232, Zo233, Zo330, Zo331, Zo332, and Zo333 outputs detect the computed optical results.
  • Furthermore, by using FIG. 9 Model-2 as the basic building block, it can be further scaled up to a multi-dimensional grid hybrid optical computational unit.
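A minimal sketch of the 4 X 4 X 4 arrangement treats the cube as a 4 X 4 grid of independent columns, each driven by its own source and read by its own sensor. The per-LCU attenuation rule, the data layout (`sources[x][y]`, `gates[x][y]` listed bottom layer first), and the function names are assumptions made for illustration; the patent's own index convention for the Zo outputs is not reproduced here.

```python
from functools import reduce

def lcu(optical_in: int, gate: int) -> int:
    """Hypothetical LCU transfer function: attenuate a 0-255 level by a 0-255 gate."""
    return (optical_in * gate) // 255

def column_output(source_level: int, layer_gates: list[int]) -> int:
    """Pass one source through the LCUs stacked above it, bottom layer first."""
    return reduce(lcu, layer_gates, source_level)

def cube_outputs(sources, gates):
    """sources[x][y] feeds the column at (x, y); gates[x][y] lists that column's
    gate values from the bottom layer to the top layer."""
    return [[column_output(sources[x][y], gates[x][y])
             for y in range(len(sources[x]))]
            for x in range(len(sources))]

N = 4
sources = [[255] * N for _ in range(N)]
gates = [[[255] * N for _ in range(N)] for _ in range(N)]
gates[1][2][3] = 0  # close the top-layer LCU of column (1, 2)
outputs = cube_outputs(sources, gates)
print(outputs)
```

Because the columns do not interact in this sketch, the 16 sensor readings can be evaluated in parallel, which is the property the 3D block is meant to exploit.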
  • the FIG. 8 Model-1 and FIG. 9 Model-2 LCUs, which work based on the HCBEA mathematical model, can be used to design basic logical blocks such as AND, OR, NOT, NOR, NAND, EXOR, and EX-NOR, which in turn can be used to construct complex computational blocks. These can then be used to build complex integer and floating-point arithmetic operators +, -, *, and /.
  • a computational processor designed with FIG. 8 Model-1 and FIG. 9 Model-2 type LCUs can be easily reconfigured to perform different types of operation by changing the color filter transformation line Qe(x,x,x). This yields a reconfigurable processor, a unique feature compared with existing silicon-based processors.
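One plausible way to picture basic logical blocks such as AND and OR is sketched below, using a hypothetical binary LCU that either passes or blocks light. Placing two LCUs in series for AND and in parallel paths for OR is an illustrative assumption; the gate constructions of the patent's own figures are not described in this text.

```python
LIGHT_ON = 255  # assumed source level

def lcu(optical_in: int, gate_open: bool) -> int:
    """Hypothetical binary LCU: pass the light when the gate is open, else block it."""
    return optical_in if gate_open else 0

def optical_and(a: bool, b: bool) -> bool:
    # Series stack: light must pass both LCUs to reach the sensor.
    return lcu(lcu(LIGHT_ON, a), b) > 0

def optical_or(a: bool, b: bool) -> bool:
    # Parallel paths: the sensor fires if either path delivers light.
    return (lcu(LIGHT_ON, a) + lcu(LIGHT_ON, b)) > 0

for a in (False, True):
    for b in (False, True):
        print(a, b, optical_and(a, b), optical_or(a, b))
```

The electronic gate lines carry the operands here, so the same optical hardware computes a different function when the gates are wired or driven differently.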
  • FIG. 10 shows the generic model of 3D hybrid optical Compute Accelerator Unit (3D-HOCA) integration to a CPU-based system using a high-speed bus interface.
  • the silicon processor acts as the master and the 3D-HOCA unit acts as the co-accelerator.
  • the silicon processor can offload arithmetic, logical, relational, and string-processing workloads to a reconfigurable multi-dimensional hybrid optical computing LCU or 3D-HOCA unit to increase the speed of computation.
  • the high-speed bus interface unit can be integrated with industry-standard bus architectures such as HyperTransport, InfiniBand, and Compute Express Link for high-speed transfer of data and instructions between the silicon processor and the 3D-HOCA unit.
  • the 3D-HOCA unit has three blocks. The first block is the 3D hybrid optical compute block, the heart of the optical computation engine, which is based on FIG. 8 Model-1 LCUs, FIG. 9 Model-2 LCUs, or a mix of both.
  • the RGB LED array acts as the optical input source, and the color sensor array acts as the interface that converts the computed optical result into electronic signals.
  • the second block is a hybrid optical processor control unit. It is a silicon-based block responsible for complete control of the functioning of the 3D-HOCA by using command instructions.
  • the third block is the hybrid optical processor glue chip, which enables seamless conversion of optical signals to electronic signals and vice versa. It is also an edge-triggered module driving three optoelectronic sub-components of the 3D-HOCA.
  • the three sub-components are the 2D LED data/control grid panel, the 3D LCD data/control grid panel, and a 2D color sensor array, such as a CCD (charge-coupled device), that converts light into electrical signals.
  • the glue chip also incorporates a lookup table or a base-2 to base-256 and base-256 to base-2 converter for seamless integration.
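The base-2 to base-256 conversion can be sketched in software as below: a binary word is split into digits in the range 0..255, each of which could be carried as one multi-level step value, and then reassembled. The function names and the most-significant-digit-first ordering are assumptions; the glue chip's actual lookup-table layout is not described in the text.

```python
def to_base256(value: int) -> list[int]:
    """Split a non-negative integer into base-256 digits, most significant first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    length = max(1, (value.bit_length() + 7) // 8)
    return list(value.to_bytes(length, "big"))

def from_base256(digits: list[int]) -> int:
    """Reassemble an integer from base-256 digits, most significant first."""
    return int.from_bytes(bytes(digits), "big")

bits = "1111111100000001"          # a 16-bit binary word
digits = to_base256(int(bits, 2))  # two base-256 digits
print(digits)
print(format(from_base256(digits), "016b"))
```

Each base-256 digit replaces eight binary digits, which is why a multi-level optical element needs far fewer steps than a binary one to carry the same word.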
  • the other units are the industry standard subunits as explained in FIG.1.
  • 3D- HOCA 3D hybrid optical Compute Accelerator Unit
  • the silicon-based Graphics Processing Unit GPU
  • the 3D-HOCA unit acts as the co-accelerator.
  • the GPU can offload the computation work to a 3D-HOCA unit to increase the speed of computation.
  • the high-speed bus interface unit can be integrated with industry-standard bus architectures such as HyperTransport, InfiniBand, and Compute Express Link for high-speed transfer of data and instructions between the GPU and the 3D-HOCA unit.
  • the 3D-HOCA unit has three blocks. The first block is the 3D hybrid optical compute block, the heart of the optical computation engine, which is based on FIG. 8 Model-1 LCUs, FIG. 9 Model-2 LCUs, or a mix of both.
  • the RGB LED array acts as the optical input source, and the color sensor array acts as the interface that converts the computed optical result into electronic signals.
  • the second block is a hybrid optical processor control unit. It is a silicon-based block responsible for complete control of the functioning of the 3D-HOCA by using command instructions.
  • the third block is the hybrid optical processor glue chip, which enables seamless conversion of optical signals to electronic signals and vice versa. It is also a silicon-based block.
  • the interesting aspect of the GPU interface is its use of GPU MEMORY, which stores the frame buffer block that will be displayed on a visual display unit such as a monitor, the
  • the FIG. 12 Model A subtype-1 uses the FIG. 8 Model-1 LCUs, which are the heart of the optical computation engine.
  • the RGB LED array acts as the optical input source, and the color sensor array acts as the interface that converts the computed optical result into electronic signals.
  • the second block is a hybrid optical processor control unit. It is a silicon-based block responsible for complete control of the functioning of the 3D-HOCA by using command instructions.
  • the third block is the hybrid optical processor glue chip, which enables seamless conversion of optical signals to electronic signals and vice versa. It is also a silicon-based block.
  • the FIG. 13 Model A subtype-2 uses the FIG. 8 Model-1 type LCUs, which are the heart of the optical computation engine.
  • the RGB LED array acts as the optical input source, and the color sensor array acts as the interface that converts the computed optical result into electronic signals.
  • the second block is a hybrid optical processor control unit. It is a silicon-based block responsible for complete control of the functioning of the 3D-HOCA by using command instructions.
  • the third block is the hybrid optical processor glue chip, which enables seamless conversion of optical signals to electronic signals and vice versa. It is also a silicon-based block.
  • the FIG. 14 Model B subtype-1 uses the FIG. 9 Model-2 type LCUs, which are the heart of the optical computation engine.
  • the RGB LED array acts as the optical input source, and the color sensor array acts as the interface that converts the computed optical result into electronic signals.
  • the second block is a hybrid optical processor control unit. It is a silicon-based block responsible for complete control of the functioning of the 3D-HOCA by using command instructions.
  • the third block is the hybrid optical processor glue chip, which enables seamless conversion of optical signals to electronic signals and vice versa. It is also a silicon-based block.
  • the FIG. 15 Model B subtype-2 uses the FIG. 9 Model-2 type LCUs, which are the heart of the optical computation engine.
  • the RGB LED array acts as the optical input source, and the color sensor array acts as the interface that converts the computed optical result into electronic signals.
  • the second block is a hybrid optical processor control unit. It is a silicon-based block responsible for complete control of the functioning of the 3D-HOCA by using command instructions.
  • the third block is the hybrid optical processor glue chip, which enables seamless integration of optical and electronic signals in both directions. It is also a silicon-based block.
  • Programming the 3D-HOCA model through a uniform interface is a challenge for existing developers who want to use the new hardware features.
  • new software API extensions are added to the existing programming model of the software stack for programming.
  • the software development kits (SDKs) and tools can be enhanced with additional Ox software API support for ease of programming, similar to APU acceleration using Open Computing Language (OpenCL) work items or Heterogeneous System Architecture (HSA) work items, which use the master and slave programming model.
  • OpenCL Open Computing language
  • HSA Heterogeneous System Architecture
  • the master program runs on the CPU to control the co-accelerator and the slave program is generally a data-intensive program run on the co-accelerator initiated via the master program.
  • FIG. 16 explains the current model and the software programming stack together with the newly proposed model of Optical extensions APIs [OxAPIs].
  • An OxAPIs-based application program runs on the CPU and is called the “master program”. It invokes the OxAPIs support library to dispatch control commands and to send and receive data to and from the 3D-HOCA co-processor through the software driver for the 3D-HOCA hardware.
  • These OxAPIs library entry points are written in a generic programming
  • a computing accelerator architecture is proposed that uses the generic LCU as the basic switching element, the Hybrid Checker Board Eclipse Algebra mathematical model, and a custom reconfigurable switching compute block in 1-dimensional, 2-dimensional, 3-dimensional, and multi-dimensional designs, and that can interface to an existing CPU or GPU over a high-speed bus.
  • Different models based on the type of LCU are proposed, and the accelerator can be programmed by using the software application programming interface 'OxAPI'.
  • a reconfigurable multi-dimensional hybrid optical computing accelerator consisting of a set of liquid crystal units whose vertical polarizer plane slits and horizontal polarizer plane slits are aligned so as to let light pass through, and a set of Red, Green, Blue (RGB) filter panels for each liquid crystal unit.
  • a set of hybrid optical processor glue chips may be used.
  • the liquid crystal unit is incorporated with a set of random access memory (RAM) to store the data for the optical input unit (Io) and the electronic color transformation function control line (Qe), to provide an optical output signal (Zo) converted and stored
  • RAM random access memory
  • the liquid crystal unit is characterized by combinational computation execution of the optical input (Io) and the electronic color transformation function control line (Qe) to provide an optical output Zo
  • the function of the liquid crystal unit can be reconfigured by changing the stepping value of the electronic color transformation function control line (Qe), so its functional behavior can be changed without redesigning the existing circuitry.
  • the hybrid optical processor glue chip is integrated to perform conversion of optical signals to electronic signals and vice versa.
  • the reconfigurable multi-dimensional hybrid optical computing accelerator is supported by Optical extensions software APIs [OxAPIs], wherein the glue chip is activated by a common clock signal to synchronize the conversion of optical signals to electronic signals and vice versa during the execution of an instruction.
  • OxAPIs Optical extensions software APIs
  • reconfigurable multi-dimensional hybrid optical computing accelerator may be integrated with a beam splitter to split the single optical signal into multiple optical signals.
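The beam-splitting step can be sketched as follows, assuming an ideal lossless splitter. The 50:50 default ratio matches the splitter named in the figure list; the function names and the cascaded fan-out are illustrative assumptions.

```python
def beam_split(intensity: float, ratio: float = 0.5) -> tuple[float, float]:
    """Ideal lossless splitter: divide one beam into two by the given ratio."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    return intensity * ratio, intensity * (1.0 - ratio)

def fan_out(intensity: float, stages: int) -> list[float]:
    """Cascade 50:50 splitters so one beam feeds 2**stages parallel inputs."""
    beams = [intensity]
    for _ in range(stages):
        beams = [part for beam in beams for part in beam_split(beam)]
    return beams

print(beam_split(200.0))
print(fan_out(1.0, 2))
```

Each stage halves the intensity per branch, so a real design would have to budget the source level against the number of parallel branches.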
  • This API is used to discover the System Buffer available in the machine.
  • HardwareFeaturePtr Gets the hardware feature details of the System buffer, else returns NULL.
  • This API is used to reserve the SystemBuffer available in the machine.
  • HardwareFeaturePtr-pointer which holds the device capabilities of “Device”.
  • SystemBufferPtr Pointer to System buffer.
  • HardwareFeaturePtr Gets the hardware feature details of the System buffer, else returns NULL.
  • This API is used to discover all the [Ox] devices that can be mapped to the System buffer.
  • This API is used to map the discovered 3D-HOCA devices to System buffer.
  • This API is used to map the selected 3D-HOCA devices to the System buffer.
  • DeviceNumber Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
  • This API is used to get the selected 3D-HOCA device's info into the System buffer.
  • status OxGetDeviceUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z)
  • SystemBufferPtr Pointer to System buffer.
  • DeviceNumber Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
  • This API is used to set the info packet into Systembuffer and it will be transferred to the 3D-HOCA device.
  • the info packet consists of the control and the data register file.
  • status OxSetDeviceUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z, Infopacket)
  • Info packet The device status, control, and the data register file of the target device.
  • OxGetDeviceControlWordUsingSystemBuffer() This API is used to get the control word info packet into Systembuffer. It will be transferred from the 3D-HOCA device to Systembuffer.
  • ControlWord The control word of the target device.
  • This API is used to get the data word info packet into Systembuffer. It will be transferred from the 3D-HOCA device to Systembuffer.
  • DeviceNumber- Corresponds to direct cell address memory location in System buffer, it is of integer type.
  • This API is used to get the status word info packet into Systembuffer. It will be transferred from the 3D-HOCA device to Systembuffer.
  • DeviceNumber- Corresponds to direct cell address memory location in System buffer, it is of integer type.
  • [ 0..N] StatusWord The status word of the target device.
  • This API is used to set the control word info packet into Systembuffer. It will be transferred from Systembuffer into the 3D-HOCA device.
  • ControlWord - The Control word package or “eop” slave program code file of the target device.
  • This API is used to set the data word info packet into Systembuffer. It will be transferred from Systembuffer into the 3D-HOCA device.
  • DeviceNumber- Corresponds to direct cell address memory location in System buffer, it is of integer type.
  • This API is used to reset the target device. This is a subset of the “OxSetDeviceControlWordUsingSystemBuffer” API.
  • ControlWord The control word of the target device.
  • This API is used to stop the target device. This is a subset of the “OxSetDeviceControlWordUsingSystemBuffer” API.
  • ControlWord The control word of the target device.
  • This API is used to unmap the OxDevices from the Systembuffer.
  • DeviceNumber- Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N] X, Y, Z - X, Y and Z location mapping to System buffer to map a corresponding device to System buffer.
  • This API is used to unmap all the OxDevices from the Systembuffer.
  • OxUnmapDeviceRangeFromSystemBuffer() This API is used to unmap the range OxDevices from the Systembuffer.
  • This API is used to release the Systembuffer.
  • FIG. 17 describes the proposed software architecture for the compilation of the slave program or co-processor program which runs on the 3D-HOCA accelerator. It is a two-step approach: the source program, written in a generic programming language like C or built from 3D-HOCA GUI components, is translated by the “e-Opto” compiler into an intermediate representation (IR) such as LLVM IR, HSA (Heterogeneous System Architecture) IR, GCC’s GIMPLE IR or 3D-HOCA IR. At runtime, these IRs are translated to 3D-HOCA native “e-op” Electronic-Optical hybrid instructions and dispatched to the 3D-HOCA for computation.
  • the e-opto compiler translates high-level sources to IRs, and the e-opto assembler assembles the “eop instructions” into machine instructions to be deployed on the 3D-HOCA.
  • This eop code is used to reset the target 3D-HOCA device with the given device ID. It will reinitialize the LCU
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • This eop code is to change the target 3D-HOCA with the given device ID to the stop state.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • This eop code is to perform arithmetic addition on the target 3D-HOCA with the given device ID.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • This eop code is to perform arithmetic subtraction on the target 3D-HOCA with the given device ID.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • This eop code is to perform arithmetic multiplication on the target 3D-HOCA with the given device ID.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • This eop code is to perform arithmetic division on the target 3D-HOCA with the given device ID.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • This eop code is to perform logical AND on the target 3D-HOCA with the given device ID.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • This eop code is to perform logical OR on the target 3D-HOCA with the given device ID.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • <Data Unit #2> -
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • eop-NOT <3D-HOCA-ID> <Data Unit#1>
  • This eop code is to perform logical NOT on the target 3D-HOCA with the given device ID.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • This eop code is to perform logical EXOR on the target 3D-HOCA with the given device ID.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • the data unit can be a single operand or 1-dimensional, 2-dimensional or 3-dimensional operands.
  • This eop code is to perform LOADING of data from memory into the target 3D-HOCA with the given device ID <3D-HOCA-ID : Address>.
  • <3D-HOCA-ID> - Is the target 3D-HOCA device ID.
  • the data unit can be single data or 1-dimensional, 2-dimensional or 3-dimensional data.
  • This eop code is to perform STORING of data from the 3D-HOCA into the target memory marked <3D-HOCA-ID : Address>.
  • the data unit can be single data or 1-dimensional, 2-dimensional or 3-dimensional data.
  • status OxSetDeviceControlWordUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z, program.lst)
  • HOCA for data computation operations is described as follows,
  • FIG. 18 shows the deployment of the master program running on the CPU as master device and the co-processor program running on the 3D-HOCA accelerator.
  • FIG. 19 shows the CPU architecture with ALU blocks. CPU workloads are generally weighted towards control-flow-intensive programs, i.e., task-parallel programs.
  • the CPU also supports Advanced Vector engines for vector operations. But when the data set is huge, a CPU with few ALU blocks becomes overloaded and experiences lag during computations.
  • FIG.20 shows the GPU architecture with SIMD (Single Instruction Multiple Data) blocks.
  • SIMD Single Instruction Multiple Data
  • a single operation is performed on the multiple data items.
  • GPU workloads are more towards the data-parallel intensive programs.
  • where for a single clock cycle the GPU can drive out more data-parallel compute results than the CPU for the same input dataset.
  • general-purpose GPU compute can increase performance by 2X to 300X, mainly because of the architectural difference.
  • FIG.21 shows the 3D-HOCA architecture with a stack of LCU blocks.
  • multiple operations are performed on the multiple data items in parallel, where for a single clock cycle the 3D-HOCA can drive out more data-parallel compute results when compared to CPU and GPU compute for the same input dataset.
  • the main reason is the architectural difference, with a massive MIMD ( Multiple Instruction Multiple Data ) block
  • each LCU unit in the 3D-HOCA engine acts as Compute Element ( COXEL ) cell and can independently perform computation.
  • COXEL Compute Element
  • silicon chip is driving the clock. Due to this architecture, it is well suited to both task-parallel and data-parallel intensive workloads.
  • FIG.22 shows the 3D-HOCA architecture with high-speed RAM hosting the Instruction and Data panel pipelines.
  • the [Qe]1 - [Qe]n is the array of electronic pipeline input with N
  • Each cell in the panel can host parallel data and/or instruction intended for the target computation within the 3D-HOCA is referred to as “voxel” (Compute Element).
  • the [Xo]1 - [Xo]n is the array of optical pipeline input with the N x N grid size of each panel.
  • Each cell in the panel can host in parallel data and/or instruction intended for the target computation within the 3D-HOCA
  • the individual dedicated high-speed bus lanes #1, #2 and #3 are proposed directly linking the 3D-HOCA driver blocks so that memory bottleneck is mitigated.
  • This high-speed RAM interfaces the 3D-HOCA by using a high-speed LED driver unit, high speed LCD driver unit, and High-speed CCD driver unit.
  • This high-speed bus interface is also between the glue chip and the master unit. Since all these components are opt-
  • FIG.23 shows the parallel extension of a reconfigurable multi-dimensional hybrid optical computing accelerator or a 3D-HOCA architecture model for high-speed parallel computing.
  • the reconfigurable multi-dimensional LCU or 3D LCU optical computing unit #1 does not have a CCD sensor array to capture the computed optical output result; instead, the output is fed into an X:X or X:Y based beam splitter, or a 50:50 beam splitter, to split it into two parallel inputs for the 3D LCU optical computing unit #2 and 3D LCU optical computing unit #3.
  • the 3D LCU optical computing unit #2 and 3D LCU optical computing unit #3 do not have the 2D LED array unit, but they directly use the two parallel inputs from the 50:50 beam splitter to compute the respective outputs.
  • both the 3D LCU optical computing unit #2 and 3D LCU optical computing unit #3 have their own unique set of data/instruction panel inputs for further computation in parallel.
  • This model can be extended by using an array of beam splitters to map the outputs to an array of 3D LCU optical computing unit #N for computation in parallel.
  • periodic optical repeaters and optical amplifier models are added.
  • This design acts as a data/instruction LCU pipeline. Stripped of the CCD sensor array and the 2D LED array, the LCU panels now act as true optical register files and optical memory files.
  • FIG. 24 represents the symbol of hybrid optical NOT gate.
  • FIG. 25 describes the implementation of a hybrid optical NOT gate using LCU.
  • the optical output Zo is given by the HCBEA algebra,
  • FIG. 26 represents the symbol of hybrid optical AND gate.
  • FIG. 27 describes the implementation of a hybrid optical AND gate using LCU.
  • the optical output Zo is given by the HCBEA algebra
  • A, B, X are HCBEA Variables
  • FIG. 28 represents the symbol of a hybrid optical OR gate.
  • FIG. 29 describes the implementation of a hybrid optical OR gate using LCU.
  • the optical output Zo is given by the HCBEA algebra,
  • A, B, X are HCBEA Variables
  • the application program which runs on the 3D-HOCA accelerator uses the e-opto compiler, assembler or 4th generation GUI objects for e-opto code generation.
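The textual eop instruction form listed above, for example `eop-NOT <3D-HOCA-ID> <Data Unit#1>`, can be illustrated with a small parsing sketch. This is only an assumption about how an assembler front end might tokenize such a line: the names `EopInstruction` and `parse_eop` are hypothetical and are not part of the e-opto tool chain or of OxAPI, and the numeric device ID in the example is made up.

```python
import re
from typing import List, NamedTuple


class EopInstruction(NamedTuple):
    """Hypothetical in-memory form of one textual eop line."""
    mnemonic: str        # e.g. "eop-NOT"
    device_id: str       # contents of the first <...> field (the 3D-HOCA ID)
    operands: List[str]  # contents of any remaining <...> fields (data units)


# One angle-bracketed field, with optional padding inside the brackets.
_FIELD = re.compile(r"<\s*([^<>]*?)\s*>")


def parse_eop(line: str) -> EopInstruction:
    """Split an eop line into mnemonic, device ID and data-unit operands."""
    head, _, rest = line.strip().partition(" ")
    if not head.startswith("eop-"):
        raise ValueError(f"not an eop instruction: {line!r}")
    fields = _FIELD.findall(rest)
    if not fields:
        raise ValueError("missing <3D-HOCA-ID> field")
    return EopInstruction(head, fields[0], fields[1:])


# Illustrative device ID "3"; the real ID encoding is not given in the text.
print(parse_eop("eop-NOT <3> <Data Unit#1>"))
```

Only the field layout is taken from the listing; the encoding of device IDs, addresses and data units is left open because the text does not specify it.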

Abstract

The present invention relates to a reconfigurable multi-dimensional hybrid optical computing accelerator apparatus consisting of liquid crystal units with vertical or horizontal polarizer planes facing each other between the liquid crystal units and aligned to let light pass through, and/or in which each liquid crystal unit additionally incorporates a color filter unit. The function of the liquid crystal unit is a combinational computation on the optical input (Io) and the electronic color transformation function control line (Qe) that provides an optical output (Zo). Further, the function of the liquid crystal unit is reconfigurable by changing the stepping value of the electronic color transformation function control line (Qe), so that the functional behavior of the LCU can be changed or reconfigured without redesigning the existing circuitry.

Description

FIELD OF INVENTION
The present invention is generally related to an architectural model of a computation accelerator or processor accelerator, and more particularly to a multi-dimensional hybrid optical computing accelerator.
BACKGROUND
The current generation of computing chips designed and fabricated by the computing industry uses silicon-based electronic circuits, which have inherent limitations: a limit on the number of transistors that can be packed per unit area, increased power and heat dissipation, CPU frequency scaling limits, and so on. These limitations have led to the need for other computing architecture designs, based on different technologies, that break free of them and meet the ever-growing need for computation speed in both scientific and business applications.
Presently, scientific and business-class applications depend heavily on processor computing capabilities. But any increase in frequency to speed up computation would lead to a thermal trip, which would require an expensive cooling mechanism. If instead the number of cores is increased to speed up computation, the physical real estate grows, with a larger die area resulting in a larger form factor and a higher cost of production.
Furthermore, the current generation silicon-based chips which have inherent limitations would ideally be replaced by the next-generation hybrid optical-based processor chips which work at the speed of light.
US5510665 describes the unique construction of an optoelectronic circuit element unit that can be used as a basic circuit building block like a transistor or diode. The unit optoelectronic element has sandwich layers of photodetector - light modulator - light source - light modulator - photodetector, in that order of construction. Here the light source is placed at the center of the device and the two photodetectors are placed at the outer layers. The two light modulator layers are housed between the photodetector and the light source on either side. This light modulator layer is used for modulating the light.
US6804412 describes the method of optical correlator to measure the correlation of images with the collimated light source and here the Fourier transformation is used as a mathematical model. It is targeted for specialized computing and not general-purpose computation. US7747102B2 also describes the method of optical correlator to measure the correlation of images with the collimated light source and here the Fourier transformation is used as a mathematical model. It uses image production and image capture devices in the same plane to reduce the size of the computing block. It is targeted for specialized computing and not general-purpose computation. US8610839B2 describes the Optical processing system using Twisted collimated spatial light modulator and uses the Fourier full and partial derivatives transformation as a mathematical model. It is targeted for specialized computing and not general-purpose computation.
WO2014087126-PAMPH-041 describes an optical processing system using a multilayered twisted collimated spatial light modulator and uses the Fourier transform as a mathematical model.
US20170248734 describes methods and systems using an optical receiver and electro-optic methods to transmit data from integrated computational elements. It uses Time Division Multiplexing and Wavelength Division Multiplexing in the wavelength regions of the O-band, S-band, C-band, L-band, U-band, and near-infrared. The optical computing target is specific to remote monitoring of fluids in the oil and gas industry.
WO2021083348 describes an optical computing device construction based on passive optical waveguides, which are used to transmit monochromatic optical signals. Here phase modulation and amplitude modulation are incorporated to realize multiplication and addition operations.
In WO 02/25395 A2, the optical switching elements in an array of controllable optical switching elements are controlled in accordance with an adaptive computation to be performed on inputs to produce outputs. Light that carries the inputs and outputs is emitted to and collected from the optical switching elements. Here each of the pixels on the film can be considered a light modulating or optical switching element, and the film can be considered an SLM array of the pixels. The film together with the lenses, the optical fibers, and the controller can be considered an optical switch.
In US10768659B2, an optical neural network is constructed based on photonic integrated circuits to perform neuromorphic computing. In the optical neural network, matrix multiplication is implemented using one or more optical interference units, which can apply an arbitrary weighting matrix multiplication to an array of input optical signals. Nonlinear activation is realized by an optical nonlinearity unit, which can be based on nonlinear optical effects, such as saturable absorption. These calculations are implemented optically, thereby resulting in high calculation speeds and low power consumption in the optical neural network.
The article “On-chip CMOS-compatible optical signal processor” by Lin Yang et al. describes an optical signal processor performing matrix-vector multiplication, which is composed of a laser modulator array, multiplexer, splitter, micro-ring modulator matrix, and photodetector array. 8 × 10^7 multiplications and accumulations (MACs) per second are implemented at a clock frequency of 10 MHz. All functional units can ultimately be monolithically integrated on a chip with the development of silicon photonics, and an efficient high-performance computing system is expected in the future.
There is therefore a need for a method and apparatus for designing a computing accelerator that works based on light instead of purely silicon transistors. It is also desirable to harness the inherently parallel properties of light to perform parallel computation, and to have custom parallel programming software interface support so that optical computing accelerated hardware can be used for user-level programming.
SUMMARY
A method and apparatus are provided for designing a computing accelerator that works based on light instead of purely silicon transistors. Here the liquid crystal unit is used as the basic compute element in the design of the hybrid optical computing accelerator engine. This computing accelerator engine works based on a mathematical model of light behavior. This mathematical algebra is a hybrid model of electronic and optical computing systems. Different architectural designs of the hybrid compute engines are proposed, together with their integration with existing silicon CPU and GPU processors through a glue chip interface. Finally, the software stack of the application programming interface OxAPI can be used to harness the capability of the hybrid optical computing accelerator engine.
DETAILED DESCRIPTION OF THE DRAWINGS
The detailed explanation can be found under the detailed explanation section, wherein:
Figure 1 is an existing silicon-based computing block;
Figure 2 is an existing silicon-based Non-hUMA APU [Non-heterogeneous Unified Memory Access Accelerated Processing Unit] computing block;
Figure 3 is an existing silicon-based hUMA APU [heterogeneous Unified Memory Access Accelerated Processing Unit] computing block;
Figure 4 is an existing artifact Liquid Crystal Unit showing the on and off light control behavior using voltage support;
Figure 5 is an existing artifact Liquid Crystal Unit with the color filter block;
Figure 6 is an existing artifact of single-pixel using RGB filter [RED, GREEN, BLUE] block liquid crystal units;
Figure 7 is a proposed artifact generic mathematical model of a single Liquid Crystal Unit with electronic gate control;
Figure 8 is a proposed Model-1 generic artifact mathematical model of a single Liquid Crystal Unit with electronic gate control, horizontal and vertical polarizers blocks;
Figure 8A is a proposed artifact Model-1 Subtype-1 of one-dimensional LCU compute Blocks with electronic gate control, horizontal and vertical polarizers blocks;
Figure 8B is a proposed artifact Model-1 Subtype-2 of two-dimensional LCU compute Blocks with electronic gate control, horizontal and vertical polarizers blocks;
Figure 8C is a proposed artifact Model-1 Subtype-3 of three-dimensional LCU compute Blocks with electronic gate control, horizontal and vertical polarizers blocks;
Figure 9 is a proposed Model-2 generic artifact mathematical model of a single Liquid Crystal Unit with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
Figure 9A is a proposed artifact Model-2 Subtype-1 of one-dimensional LCU compute Blocks with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
Figure 9B is a proposed artifact Model-2 Subtype-2 of two-dimensional LCU compute Blocks with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
Figure 9C is a proposed artifact Model-2 Subtype-3 of three-dimensional LCU compute Blocks with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
Figure 10: Generic model of 3D optical compute accelerator unit integration to CPU based system using high-speed bus interface;
Figure 11: Generic model of 3D optical compute accelerator unit integration to GPU using high-speed bus interface;
Figure 12: Model A: Subtype 1, 3D-OCE as co-processor or co-accelerator to CPU based system using high-speed bus interface with electronic gate control, and horizontal, vertical polarizers blocks;
Figure 13: Model A: Subtype 2, 3D-OCE as co-processor or co-accelerator to GPU using high-speed bus interface with electronic gate control, and horizontal, vertical polarizers blocks;
Figure 14: Model B: Subtype 1, 3D-OCE as co-processor or co-accelerator to CPU based system using high-speed bus interface with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks;
Figure 15: Model B: Subtype 2, 3D-OCE as co-processor or co-accelerator to GPU using high-speed bus interface with electronic gate control, horizontal, vertical polarizers, and RGB color filters blocks
Figure 16: software stack block diagram for the 3D-HOCA;
Figure 17: Compilation model of software stack block diagram for the 3D-HOCA.
Figure 18: OxAPI Master program dispatch on CPU and Co-processor program/Slave program dispatch on the 3D-HOCA
Figure 19: CPU with 4 ALU ( Arithmetic Logic Unit ) computational block
Figure 20: GPU with multi-ALU SIMD ( Single Instruction Multiple Data ) computational blocks
Figure 21: 3D-HOCA as MIMD ( Multiple Instruction Multiple Data ) computational blocks
Figure 22: 3D-HOCA with high-speed RAM hosting the Instruction and Data panel pipeline.
Figure 23: The output of 3D LCU optical computing unit #1 is split into two parallel inputs using a 50:50 beam splitter for parallel computation.
Figure 24: Hybrid optical NOT gate Symbol
Figure 25: LCU implementation of hybrid optical NOT gate.
Figure 26: Hybrid optical AND gate Symbol
Figure 27: LCU implementation of hybrid optical AND gate.
Figure 28: Hybrid optical OR gate Symbol
Figure 29: LCU implementation of hybrid optical OR gate.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
FIG. 1 shows a system that could include a processor such as a single central processing unit (CPU), a multi-core processor, a network processor (NP), an audio accelerator (AA), a digital signal processor (DSP), a graphics processing unit (GPU), or an Accelerated Processing Unit (APU) which has both CPU and GPU integrated, all based on silicon technology and targeting general-purpose or specialized computation. These devices are connected to input/output (I/O) console units such as a keyboard, monitor, scanner, joystick, network connection, etc. using the input/output (I/O) bus, and they also connect to memory devices such as Random Access Memory (RAM), Dynamic RAM (DRAM), Static RAM (SRAM), or electrically erasable programmable ROM (EEPROM) using the system bus. The power source provides electric power to the circuitry of the system.
FIG. 2 shows a block diagram of a non-heterogeneous unified memory access (non-hUMA) accelerated processing unit (APU) computing block. It has the GPU and CPU fabricated on the same silicon die. Here the GPU can be programmed to perform general-purpose computations, resulting in a general-purpose graphics processing unit (GPGPU) accelerator that speeds up the computations. The GPU and CPU have disjoint RAM address spaces, and hence accessing the other block's RAM results in an additional hop via the CPU or GPU block.
FIG. 3 shows a block diagram of a heterogeneous unified memory access (hUMA) APU computing block. Here too the GPU and CPU are fabricated on the same silicon die, but the GPU and CPU share a common RAM address space, making for efficient GPGPU coprocessor acceleration of general-purpose computing.
FIG. 4 and FIG. 5 show the existing prior art of a liquid crystal unit (LCU). The liquid crystal is sandwiched between two glass plates coated with a transparent conducting electrode material such as indium tin oxide (ITO). The horizontal polarizing filter is layered below as the base of this sandwich construction and the vertical polarizing filter is layered above it. The transparent electrodes are connected to the voltage source. An unpolarized white light source unit is placed below the horizontal polarizer filter. The working principle of the LCU is as follows. When the switch S1 is open and the light source is illuminated, the horizontal polarizer passes only the horizontal planar waves of the light (FIG. 4). Without an applied voltage, the rods in the liquid crystal rotate the horizontally polarized light into vertically polarized light, which passes through the vertical polarizer filter at the top. But when the switch S1 is closed, the voltage is applied to the conducting electrodes, making the rods in the liquid crystal change their alignment so that the horizontally polarized light passes through unchanged. This horizontally polarized light is blocked by the vertical polarizer filter at the top. This simple construction of the LCU, with the application of voltage, depicts the on and off light control behavior. The white light source can be a white light-emitting diode (LED), a Red Green Blue (RGB) LED, a phosphor-coated light unit such as a CFL, or any known light generating device or unit.
FIG. 5 shows the LCU with the Red (R), Green (G), Blue (B), Cyan (C), Magenta (M), Yellow (Y) filter blocks, which can be used to control and pass a specific wavelength as the LCU output.
FIG. 6 shows the 3 LCU units with the RGB filters as a single unit with optical input (RoGoBo), electronic gate RGB inputs (ReGeBe) controlling the 3 LCUs, and optical output (RoGoBo).
FIG. 7 and FIG. 8 show the proposed generic mathematical model of a single liquid crystal unit, where Zo is the optical output of the LCU, Io is the optical input of the LCU, and Qe is the electronic gate control, which is the light or color filter transformation function, using equation 1:
Zo = Io * Qe (1)
Furthermore, the mathematical model of the LCU is expressed using the “Hybrid Checker Board Eclipse Algebra” (HCBEA) or “Multi-level Color Algebra” (MCA). The conventions used in the HCBEA are listed below.
Constants of HCBEA algebra
Suffix “o”: Optical [ light flow / photons ]
Suffix “e”: Electronic [ electron flow ]
R = RED wavelength with step values from 1 to 255
G = GREEN wavelength with step values from 1 to 255
B = BLUE wavelength with step values from 1 to 255
M = MAGENTA with step values from 1 to 255
Y = YELLOW with step values from 1 to 255
C = CYAN with step values from 1 to 255
W = WHITE with value ( R255 + G255 + B255 )
K = BLACK / NULL wavelength with value 0
IR = INFRA RED wavelength 1 to 255
A, B, X are HCBEA Variables
1e = Electronic control filter “OFF” signal, allowing the light to pass through the LCU
0e = Electronic control filter “ON” signal, blocking the light from passing through the LCU
Xe = Electronic control filter input value { We, Re, Ge, Be, Ce, Me, Ye, Ke }
Xo = Optical output value { Wo, Ro, Go, Bo, Co, Mo, Yo, Ko, IRo }
X = { R, G, B, C, M, Y, IR }
Ao = { R, G, B, C, M, Y }
Be = { R, G, B, C, M, Y }
Operators of HCBEA algebra
+ = CBEA Addition operator
* = CBEA Multiplication operator
bar = Complementary operation ( example : X̄ : complementary of X )
Identities of HCBEA algebra
Figure imgf000012_0001
Figure imgf000013_0001
Black “+” addition Identity
Xo = Xo + Ko For all X = {R, G, B, C, M, Y, K, IR}
Note 1: When an optical data wave Xo is added to the NULL wavelength / Black / Ko, the output is Xo.
Note 2: Corner case: when Xo = Ko, the equation reduces to
Xo = Xo + Ko
= Ko + Ko
Xo = Ko (which is a NULL wavelength)
Black “*” Identity
Xo * Ke = Ke For all X = {R, G, B, C, M, Y, K, IR}
White Identity
Wo = Ro + Go + Bo
Figure imgf000014_0001
The sample color algebra or HCBEA algebra truth table for 1e and 0e identity.
It is an important embodiment of the invention that the HCBEA algebra is a multi-level logic color algebra executed based on the behavior of the LCU with the electronic color filter transformation control and the optical input. It uses both electronic control and optical discrete step values to support multi-level compute logic.
TABLE 1
Figure imgf000014_0002
Figure imgf000015_0001
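The 1e / 0e behavior defined in the constants can be sketched as a small truth-table generator. This is a minimal illustration under a stated assumption: it reads "1e" as "filter OFF, the optical input passes unchanged" and "0e" as "filter ON, the output is the NULL wavelength Ko", following the definitions of 1e and 0e in the text. The function name `lcu_gate` is hypothetical, and the sketch does not attempt to model the multi-level color steps of the full HCBEA algebra.

```python
from itertools import product

# Optical input symbols taken from the constants list in the text.
OPTICAL_INPUTS = ["Wo", "Ro", "Go", "Bo", "Co", "Mo", "Yo", "Ko"]


def lcu_gate(optical_in: str, control: str) -> str:
    """Binary pass/block model of one LCU (assumed reading of 1e and 0e)."""
    if control == "1e":   # filter OFF: light passes through the LCU
        return optical_in
    if control == "0e":   # filter ON: light is blocked, output is NULL/black
        return "Ko"
    raise ValueError(f"unknown control value: {control!r}")


# Print one row per (optical input, control) combination.
for x_o, q_e in product(OPTICAL_INPUTS, ["1e", "0e"]):
    print(f"{x_o} * {q_e} = {lcu_gate(x_o, q_e)}")
```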
Example 1
Both FIG. 8 Model-I and FIG. 9 Model-II support the HCBEA algebra model. In FIG. 8, the Model-I LCU has the horizontal and vertical polarizers but no RGB filter blocks. In FIG. 9, the Model-II LCU has RGB filters along with the horizontal and vertical polarizers, which increases the number of step values compared to FIG. 8 Model-I.
In FIG. 8 Model-I, a single LCU has the RGB LED source and electronic step gate control; hence the following step combinations are supported:
Ko = 1 step value
GRAYo = 254 step values
Wo = 1 step value
Ro = Go = Bo = Co = Mo = Yo = each 255 step values
Total step values = Ko (1) + GRAYo (254) + Wo (1) + Ro (255) + Go (255) + Bo (255) + Co (255) + Mo (255) + Yo (255) + IRo (255). A single FIG. 8 Model-I LCU supports 1786 step values without infrared and 2041 step values with infrared support included.
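The Model-I tally can be checked with a few lines of arithmetic. The dictionary below simply restates the per-symbol step counts given in the text; the variable names are illustrative only.

```python
# Per-symbol optical step counts for a single Model-I LCU, as listed in the text.
MODEL_I_STEPS = {
    "Ko": 1, "GRAYo": 254, "Wo": 1,
    "Ro": 255, "Go": 255, "Bo": 255,
    "Co": 255, "Mo": 255, "Yo": 255,
}
IR_STEPS = 255  # additional infrared steps (IRo)

steps_without_ir = sum(MODEL_I_STEPS.values())
steps_with_ir = steps_without_ir + IR_STEPS

print(steps_without_ir)  # 1786
print(steps_with_ir)     # 2041
```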
In FIG. 9 Model-II, a single LCU has the RGB LED source, and the electronic step gate control acts as the color filter transformation function on each optical input; hence the following step combinations are supported:
Ko = 1 step value; Wo = 1 step value;
GRAYo = 254 step values;
Ro = Go = Bo = Co = Mo = Yo = IRo each 255 step values
Re = Ge = Be = Ce = Me = Ye = IRe each 255 step values on the application of the color filter transformation function (CFTF);
Step values using Re CFTF = ( Ko (1) + GRAYo (254) + Wo (1) + Ro (255) + Go (255) + Bo (255) + Co (255) + Mo (255) + Yo (255) ) * Re (255) = ( 1786 * 255 ) = 455430 step values. Hence the total step values for the FIG. 9 Model-II single LCU = Re CFTF + Ge CFTF + Be CFTF + Ce CFTF + Me CFTF + Ye CFTF + Re CFTF + GRAYe CFTF = 3643440 step combinations without IRe, and with IRe it supports 4098870.
The existing transistor operating based on the binary logic would require 18,21,720 transistors to simulate FIG.9 Model II single LCU steps. The two-parallel side by side LCU can simulate 13.27465503 Tera step combination and compute cube of 4 X 4 X 4 LCU can simulate 9.642417942847 X 10104 Step combinations.
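The Model-II figures above can be reproduced with a few lines of integer arithmetic. This is a sketch of the calculation only; the variable names are illustrative, and the comments note which powers and factors reproduce the quoted numbers.

```python
BASE_STEPS = 1786   # Model-I optical step values without infrared
CFTF_STEPS = 255    # step values of one electronic CFTF channel

per_cftf_channel = BASE_STEPS * CFTF_STEPS   # step values using one CFTF
single_lcu = 8 * per_cftf_channel            # the total above sums eight CFTF terms
single_lcu_with_ir = 9 * per_cftf_channel    # one further CFTF term for IRe

binary_transistors = single_lcu // 2         # equals the quoted transistor count
two_parallel_lcus = single_lcu ** 2          # about 13.27 Tera combinations

# Raising the single-LCU count to the 16th power gives a 105-digit number,
# which reproduces the order of magnitude quoted for the compute cube.
cube_combinations = single_lcu ** 16

print(per_cftf_channel, single_lcu, single_lcu_with_ir)
print(binary_transistors, two_parallel_lcus)
```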
FIG. 8A Model-1, Subtype-1, describes the one-dimensional hybrid optical computing block with 4 rows X 1 column of vertically stacked LCUs, in which the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them. This creates 4 X 1 computing LCU blocks which can be electronically controlled individually by using the Qe1, Qe2, Qe3, and Qe4 lines to result in the collective optical computational sub-outputs Zo1, Zo2, Zo3, and the final result Zo4 from the optical input Io1 based on the mathematical model HCBEA. The single RGB light source unit below drives the input source Io1, and the RGB color sensor unit at the top of the Zo4 output is used to detect the optical result Zo4.
FIG. 8B Model-1, Subtype-2, describes the two-dimensional hybrid optical computing block with 4 rows X 4 columns of vertically stacked LCUs, in which the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them. This creates 4 X 4 computing LCU grid blocks which can be electronically controlled individually by using the Qe11, Qe12, Qe13, Qe14, Qe21, Qe22, Qe23, Qe24, Qe31, Qe32, Qe33, Qe34, Qe41, Qe42, Qe43, and Qe44 lines to result in the collective optical computational outputs Zo14, Zo24, Zo34, and Zo44 from the optical inputs Io1, Io2, Io3, and Io4 based on the mathematical model HCBEA. The 4 individual RGB light sources below the first layer of LCUs drive the input sources Io1, Io2, Io3, and Io4, and the 4 individual RGB color sensors above the Zo14, Zo24, Zo34, and Zo44 outputs are used to detect the computed optical results.
FIG. 8C Model-1, Subtype-3, describes the three-dimensional hybrid optical computing block with 4 X 4 X 4 stacked LCUs, in which the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them. This creates 4 X 4 X 4 computing LCU grid blocks which can be electronically controlled individually by using the Qe000, Qe001, Qe002, Qe003, Qe100, Qe101, Qe102, Qe103, Qe200, Qe201, Qe202, Qe203, Qe300, Qe301, Qe302, Qe303, Qe010, Qe011, Qe012, Qe013, Qe110, Qe111, Qe112, Qe113, Qe210, Qe211, Qe212, Qe213, Qe310, Qe311, Qe312, Qe313, Qe020, Qe021, Qe022, Qe023, Qe120, Qe121, Qe122, Qe123, Qe220, Qe221, Qe222, Qe223, Qe320, Qe321, Qe322, Qe323, Qe030, Qe031, Qe032, Qe033, Qe130, Qe131, Qe132, Qe133, Qe230, Qe231, Qe232, Qe233, Qe330, Qe331, Qe332, and Qe333 lines to result in the collective optical computational outputs Zo030, Zo031, Zo032, Zo033, Zo130, Zo131, Zo132, Zo133, Zo230, Zo231, Zo232, Zo233, Zo330, Zo331, Zo332, and Zo333 from the optical inputs Io000, Io001, Io002, Io003, Io100, Io101, Io102, Io103, Io200, Io201, Io202, Io203, Io300, Io301, Io302, and Io303 based on the mathematical model HCBEA. The 4 X 4 individual RGB light sources below the first layer of LCUs drive the input sources Io000, Io001, Io002, Io003, Io100, Io101, Io102, Io103, Io200, Io201, Io202, Io203, Io300, Io301, Io302, and Io303. The 4 X 4 individual RGB color sensors at the top of the Zo030, Zo031, Zo032, Zo033, Zo130, Zo131, Zo132, Zo133, Zo230, Zo231, Zo232, Zo233, Zo330, Zo331, Zo332, and Zo333 outputs are used to detect the computed optical results. Furthermore, by using FIG. 8 Model-1 as the basic building block, it can be further scaled up to a multi-dimensional grid hybrid optical computational unit.
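The naming of the control lines, inputs, and outputs in the 4 X 4 X 4 block follows a regular pattern that can be generated programmatically. The sketch below reproduces the listing order used above; reading the middle digit as the layer index (0 for the input layer, 3 for the output layer) is an inference from the listed names, and the function names are illustrative.

```python
from itertools import product

SIZE = 4  # LCUs per axis in the 4 x 4 x 4 block


def control_lines(size=SIZE):
    """Qe names in the listed order: the last digit varies fastest,
    then the first digit, then the middle (assumed layer) digit."""
    return [f"Qe{a}{layer}{c}" for layer, a, c in product(range(size), repeat=3)]


def input_lines(size=SIZE):
    """Optical inputs enter at the assumed layer 0."""
    return [f"Io{a}0{c}" for a, c in product(range(size), repeat=2)]


def output_lines(size=SIZE):
    """Optical outputs leave at the assumed last layer, size - 1."""
    return [f"Zo{a}{size - 1}{c}" for a, c in product(range(size), repeat=2)]


print(len(control_lines()), input_lines()[:4], output_lines()[:4])
```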
FIG. 9A Model-2, Subtype-1, describes the one-dimensional hybrid optical computing block with 4 rows X 1 column vertically stacked using FIG. 9 Model-2 LCUs, with the RGB filter block included. Here the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them. This creates 4 X 1 computing LCU blocks which can be electronically controlled individually by using the Qe1, Qe2, Qe3, and Qe4 lines to result in the collective optical computational sub-outputs Zo1, Zo2, Zo3, and the final result Zo4 from the optical input Io1 based on the mathematical model HCBEA. The single RGB light source below drives the input source Io1, and the RGB color sensor above the Zo4 output is used to detect the optical result Zo4.
FIG. 9B Model-2, Subtype-2, describes the two-dimensional hybrid optical computing block with 4 rows X 4 columns vertically stacked using FIG. 9 Model-2 LCUs, with the RGB filter block included. Here the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned so as to let light pass through them. This creates 4 X 4 computing LCU grid blocks which can be electronically controlled individually by using the Qe11, Qe12, Qe13, Qe14, Qe21, Qe22, Qe23, Qe24, Qe31, Qe32, Qe33, Qe34, Qe41, Qe42, Qe43, and Qe44 lines to result in the collective optical computational outputs Zo14, Zo24, Zo34, and Zo44 from the optical inputs Io1, Io2, Io3, and Io4 based on the mathematical model HCBEA. The 4 individual RGB light sources below the first layer of LCUs drive the input sources Io1, Io2, Io3, and Io4, and the 4 individual RGB color sensors above the Zo14, Zo24, Zo34, and Zo44 outputs are used to detect the computed optical results.
FIG. 9C Model-2, Subtype-3, describes the three-dimensional hybrid optical computing block with 4 X 4 X 4 vertically stacked FIG. 9 Model-2 LCUs, with the RGB filter block included. Here the vertical polarizer plane slits of two adjacent layers are aligned so as to let light pass through them, and similarly the horizontal polarizer plane slits of two adjacent planes are aligned to let light pass through them. This creates 4 X 4 X 4 computing LCU grid blocks which can be electronically controlled individually by using the Qe000, Qe001, Qe002, Qe003, Qe100, Qe101, Qe102, Qe103, Qe200, Qe201, Qe202, Qe203, Qe300, Qe301, Qe302, Qe303, Qe010, Qe011, Qe012, Qe013, Qe110, Qe111, Qe112, Qe113, Qe210, Qe211, Qe212, Qe213, Qe310, Qe311, Qe312, Qe313, Qe020, Qe021, Qe022, Qe023, Qe120, Qe121, Qe122, Qe123, Qe220, Qe221, Qe222, Qe223, Qe320, Qe321, Qe322, Qe323, Qe030, Qe031, Qe032, Qe033, Qe130, Qe131, Qe132, Qe133, Qe230, Qe231, Qe232, Qe233, Qe330, Qe331, Qe332, and Qe333 lines to result in the collective optical computational outputs Zo030, Zo031, Zo032, Zo033, Zo130, Zo131, Zo132, Zo133, Zo230, Zo231, Zo232, Zo233, Zo330, Zo331, Zo332, and Zo333 from the optical inputs Io000, Io001, Io002, Io003, Io100, Io101, Io102, Io103, Io200, Io201, Io202, Io203, Io300, Io301, Io302, and Io303 based on the mathematical model HCBEA. The 4 X 4 individual RGB light sources below the first layer of LCUs drive the input sources Io000, Io001, Io002, Io003, Io100, Io101, Io102, Io103, Io200, Io201, Io202, Io203, Io300, Io301, Io302, and Io303. The 4 X 4 individual RGB color sensors above the Zo030, Zo031, Zo032, Zo033, Zo130, Zo131, Zo132, Zo133, Zo230, Zo231, Zo232, Zo233, Zo330, Zo331, Zo332, and Zo333 outputs are used to detect the computed optical results.
Furthermore, by using FIG. 9 Model-2 as the basic building block, it can be further scaled up to a multi-dimensional grid hybrid optical computational unit. The FIG. 8 Model-1 and FIG. 9 Model-2 LCUs, which work based on the HCBEA mathematical model, can be used to design basic logical blocks like AND, OR, NOT, NOR, NAND, EXOR, and EX-NOR, which in turn can be used to construct complex computational blocks. These in turn can be used to build complex integer and floating-point arithmetic operators +, -, *, and /.
The main advantage of a computational processor designed by using FIG. 8 Model-1 and FIG. 9 Model-2 type LCUs is that it can be easily reconfigured by changing the color filter transformation line Qe(x,x,x) to perform different types of operation. This leads to a reconfigurable processor, which is a unique feature when compared to the existing silicon-based processors.
FIG. 10 shows the generic model of integrating the 3D hybrid optical Compute Accelerator Unit (3D-HOCA) into a CPU-based system using a high-speed bus interface. The silicon processor acts as the master and the 3D-HOCA unit acts as the co-accelerator. Here the silicon processor can offload the computation workload of arithmetic, logical, relational, and string processing to a reconfigurable multi-dimensional hybrid optical computing LCU or 3D-HOCA unit to increase the speed of computation. The high-speed bus interface unit can be integrated with industry-standard bus architectures like HyperTransport, InfiniBand, and Compute Express Link for high-speed communication between the silicon processor and the 3D-HOCA unit for the data and instruction transfer. The 3D-HOCA unit has three blocks. The first block is the 3D hybrid optical compute block, which is the heart of the optical computation engine; it can be based on FIG. 8 Model-1 LCUs, FIG. 9 Model-2 LCUs, or a mix of both. The RGB LED array acts as the optical input source, and the color sensor array converts the computed optical result into electronic signals. The second block is the hybrid optical processor control unit, a silicon-based block responsible for the complete control of the functioning of the 3D-HOCA by using command instructions. The third block is the hybrid optical processor glue chip, which helps in the seamless conversion of optical to electronic signals and vice versa; it is also an edge-triggered module driving three optoelectronic sub-components of the 3D-HOCA. The three sub-components are the 2-dimensional LED data/control grid panel, the 3-dimensional LCD data/control grid panel, and finally a 2D color sensor array, such as a CCD (charge-coupled device), which can convert light into electrical signals.
The glue chip also incorporates a lookup table or a base (2) to base (256) and base (256) to base (2) converter for seamless integration. The other units are the industry-standard subunits as explained in FIG. 1.
FIG. 11 shows a generic model of integrating the 3D hybrid optical Compute Accelerator Unit (3D-HOCA) with a GPU using a high-speed bus interface. The silicon-based Graphics Processing Unit (GPU) acts as the master and the 3D-HOCA unit acts as the co-accelerator. Here the GPU can offload the computation work to a 3D-HOCA unit to increase the speed of computation. The high-speed bus interface unit can be integrated with industry-standard bus architectures like HyperTransport, InfiniBand, and Compute Express Link for high-speed communication between the GPU and the 3D-HOCA unit for the data and instruction transfer. The 3D-HOCA unit has three blocks. The first block is the 3D hybrid optical compute block, which is the heart of the optical computation engine; it can be based on FIG. 8 Model-1 LCUs, FIG. 9 Model-2 LCUs, or a mix of both. The RGB LED array acts as the optical input source, and the color sensor array converts the computed optical result into electronic signals. The second block is the hybrid optical processor control unit, a silicon-based block responsible for the complete control of the functioning of the 3D-HOCA by using command instructions. The third block is the hybrid optical processor glue chip, which helps in the seamless conversion of optical to electronic signals and vice versa; it is also a silicon-based block. The interesting aspect of the GPU interface is its use of GPU memory: this memory stores the frame buffer block that will be displayed on a visual display unit such as a monitor, and the same RAM content in the form of a frame buffer can be used as the data input source for computations on the 3D-HOCA.
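The base (2) to base (256) conversion attributed to the glue chip can be sketched as follows. The sketch assumes that each base-256 digit corresponds to one 0..255 step value and that digits are ordered most significant first; both choices, and the function names, are illustrative assumptions.

```python
def base2_to_base256(bits):
    """Pack a binary string into base-256 digits, most significant first."""
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("expected a non-empty string of 0s and 1s")
    value = int(bits, 2)
    digit_count = (len(bits) + 7) // 8      # eight bits per base-256 digit
    return list(value.to_bytes(digit_count, "big"))


def base256_to_base2(digits, width=None):
    """Unpack base-256 digits into a binary string, optionally zero-padded."""
    value = int.from_bytes(bytes(digits), "big")
    bits = format(value, "b")
    return bits.zfill(width) if width else bits


print(base2_to_base256("1111111100000001"))
```

Each 8-bit group of the binary input maps to a single step value, so a 16-bit word occupies two base-256 digits.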
FIG. 12 Model A subtype-1 uses the FIG. 8 Model-1 LCUs, which are the heart of the optical computation engine. The RGB LED array acts as the optical input source, and the color sensor array converts the computed optical result into electronic signals. The second block is the hybrid optical processor control unit, a silicon-based block responsible for the complete control of the functioning of the 3D-HOCA by using command instructions. The third block is the hybrid optical processor glue chip, which helps in the seamless conversion of optical to electronic signals and vice versa; it is also a silicon-based block.
FIG. 13 Model A subtype-2 uses the FIG. 8 Model-1 type LCUs, which are the heart of the optical computation engine. The RGB LED array acts as the optical input source, and the color sensor array converts the computed optical result into electronic signals. The second block is the hybrid optical processor control unit, a silicon-based block responsible for the complete control of the functioning of the 3D-HOCA by using command instructions. The third block is the hybrid optical processor glue chip, which helps in the seamless conversion of optical to electronic signals and vice versa; it is also a silicon-based block.
FIG. 14 Model B subtype-1 uses the FIG. 9 Model-2 type LCUs, which are the heart of the optical computation engine. The RGB LED array acts as the optical input source, and the color sensor array converts the computed optical result into electronic signals. The second block is the hybrid optical processor control unit, a silicon-based block responsible for the complete control of the functioning of the 3D-HOCA by using command instructions. The third block is the hybrid optical processor glue chip, which helps in the seamless conversion of optical to electronic signals and vice versa; it is also a silicon-based block.
Furthermore, a mix of both FIG. 8 Model-1 and FIG. 9 Model-2 type LCUs can be used as the heart of the optical computation engine.
FIG. 15 Model B subtype-2 uses the FIG. 9 Model-2 type LCUs, which are the heart of the optical computation engine. The RGB LED array acts as the optical input source, and the color sensor array converts the computed optical result into electronic signals. The second block is the hybrid optical processor control unit, a silicon-based block responsible for the complete control of the functioning of the 3D-HOCA by using command instructions. The third block is the hybrid optical processor glue chip, which helps in the seamless integration of optical to electronic signals and vice versa; it is also a silicon-based block.
Furthermore, a mix of both FIG. 8 Model-1 and FIG. 9 Model-2 type LCUs can be used as the heart of the optical computation engine.
Programming the 3D-HOCA model is a challenge for existing developers, who need to program and utilize the new hardware feature through a uniform interface. Hence new software API extensions are added to the existing programming model of the software stack. The software development kits [SDKs] and tools can be enhanced with the additional Ox software API support for ease of programming. This is similar to APU acceleration by using Open Computing Language (OpenCL) work items or Heterogeneous System Architecture (HSA) work items, which use the master and slave programming model. The master program runs on the CPU to control the co-accelerator, and the slave program is generally a data-intensive program run on the co-accelerator and initiated via the master program. FIG. 16 explains the current model and the software programming stack with the new proposed model of Optical extensions APIs [OxAPIs]. An OxAPIs-based application program runs on the CPU and is called a “master program”; it invokes the OxAPIs support library to dispatch control commands and to send and receive data to and from the 3D-HOCA co-processor by using the software driver for the 3D-HOCA hardware. These OxAPIs library entry points are written in a generic programming language like C/C++, which in turn invokes low-level system calls to control the hardware driver functionalities.
Software APIs for Ox Programming: the following are the proposed software Optical Programming extensions for Application Programmer Interfaces [OxAPI] that can be used for programming the 3D-HOCA unit using the master program.
In short, line numbers 175 to 555 discuss the following. A computing accelerator architecture is designed by using a generic LCU as the basic switching element, which uses the Hybrid Checker Board Eclipse Algebra mathematical model. A custom switching reconfigurable compute block design with 1 dimension, 2 dimensions, 3 dimensions, and multiple dimensions is proposed, which can interface to an existing CPU and GPU over the high-speed bus. Different models based on the type of LCU are proposed, and the accelerator can be programmed by using the software application programmer interface 'OxAPI'.
From the above descriptions, it is clear that the following embodiments are addressed. It is an important embodiment that a reconfigurable multi-dimensional hybrid optical computing accelerator consists of a set of liquid crystal units, with vertical polarizer plane slits and horizontal polarizer plane slits aligned so as to let light pass through, and a set of Red, Green, Blue (RGB) filter panels for each liquid crystal unit. Here a set of hybrid optical processor glue chips may be used. The liquid crystal unit is incorporated with a set of random access memory (RAM) to store the data for the optical input unit (Io) and the electronic color transformation function control line (Qe), and to provide an optical output signal (Zo) that is converted and stored.
The liquid crystal unit is characterized by combinational computation execution on the optical input (Io) and the electronic color transformation function control line (Qe) to provide an optical output (Zo). The function of the liquid crystal unit is reconfigurable by changing the stepping value of the electronic color transformation function control line (Qe); thereby its functional behavior can be changed or reconfigured without redesigning the existing circuitry.
It is another important embodiment in the invention that the hybrid optical processor glue chip is integrated to perform conversion of optical signals to electronic signals and vice versa.
Yet another embodiment in the invention is that the reconfigurable multi-dimensional hybrid optical computing accelerator is supported by Optical extensions software APIs [OxAPIs], wherein the glue chip is activated by a common clock signal for synchronization in the conversion of optical signals to electronic signals and vice versa during the execution of an instruction.
The reconfigurable multi-dimensional hybrid optical computing accelerator may also be integrated with a beam splitter to split a single optical signal into multiple optical signals.
Using the hybrid optical logic gates based on the HCBEA algebra, complex combinatorial circuits such as adders and subtractors can be formed. Such circuits require periodic repeaters or optical amplifiers to overcome the LCU light attenuation.
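How an adder is composed from basic gates can be sketched as follows. The gate functions here are ordinary Boolean operations standing in for the HCBEA-based optical gates, whose truth tables are not reproduced in this sketch; the names `full_adder` and `ripple_add` are illustrative.

```python
def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def XOR(a, b):
    return a ^ b


def full_adder(a, b, carry_in):
    """One-bit full adder built only from AND, OR, and XOR gates."""
    partial = XOR(a, b)
    total = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return total, carry_out


def ripple_add(x, y, width=8):
    """Chain full adders bit by bit; returns (sum modulo 2**width, final carry)."""
    carry, result = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


print(ripple_add(5, 3))
```

A subtractor follows the same pattern by adding the two's complement of the second operand.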
The OxAPI software API list,
OxDiscoverSystemBuffer()
OxMarkReserveSystemBuffer()
OxInitializeSystemBuffer()
OxDiscoverDevices()
OxMapDiscoveredDevicesToSystemBuffer()
OxMapOnlySpecificDevicesToSystemBuffer()
OxGetDeviceUsingSystemBuffer()
OxSetDeviceUsingSystemBuffer()
OxGetDeviceControlWordUsingSystemBuffer()
OxGetDeviceDataWordUsingSystemBuffer()
OxGetDeviceStatusWordUsingSystemBuffer()
OxSetDeviceControlWordUsingSystemBuffer()
OxSetDeviceDataWordUsingSystemBuffer()
OxResetDevice()
OxStopDevice()
OxUnmapDeviceFromSystemBuffer()
OxUnmapAllDeviceFromSystemBuffer()
OxUnmapDeviceRangeFromSystemBuffer()
OxReleaseSystemBuffer()
The details of the API description are listed below,
OxDiscoverSystemBuffer()
This API is used to discover the System Buffer available in the machine.
General Syntax: status=OxDiscoverSystemBuffer(HardwareFeaturePtr)
Where,
Status - API status
HardwareFeaturePtr - Gets hardware features details of System buffer else returns NULL.
OxMarkReserveSystemBuffer():
This API is used to reserve the SystemBuffer available in the machine.
General Syntax: status=OxMarkReserveSystemBuffer(Device, Size, HardwareFeaturePtr, OxEnable, SystemBufferPtr)
Where,
Status - API status
Device - 3D-HOCA
Size - SystemBufferSize [integer]
HardwareFeaturePtr - Pointer which holds the device capabilities of “Device”; gets hardware features details of System buffer else returns NULL.
OxEnable - Enable / Disable 3D-HOCA feature [Boolean type]
SystemBufferPtr - Pointer to System buffer.
OxInitializeSystemBuffer():
This API is used to initialize the SystemBuffer available in the machine.
General Syntax: status=OxInitializeSystemBuffer(SystemBufferPtr, Value)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
Value - Value used to initialize the System buffer.
OxDiscoverDevices()
This API is used to discover all the [Ox] devices that can be mapped to the System buffer.
General Syntax: status= OxDiscoverDevices(*list)
Where,
Status - API status
*list - All the Ox enabled devices discovered.
OxMapDiscoveredDevicesToSystemBuffer()
This API is used to map the discovered 3D-HOCA devices to System buffer.
General Syntax: status=OxMapDiscoveredDevicesToSystemBuffer(*list, SystemBufferPtr)
Where,
Status - API status
*list - All the Ox enabled devices discovered.
SystemBufferPtr - Pointer to System buffer.
OxMapOnlySpecificDevicesToSystemBuffer()
This API is used to map the selected 3D-HOCA devices to the System buffer.
General Syntax: status=OxMapOnlySpecificDevicesToSystemBuffer(*list, SystemBufferPtr, DeviceNumber, X, Y, Z)
Where,
Status - API status
*list - All the Ox enabled devices discovered.
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
X, Y, Z -X, Y, and Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, and Z are provided, the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxGetDeviceUsingSystemBuffer()
This API is used to get the selected 3D-HOCA device's info into the System buffer.
General Syntax: status=OxGetDeviceUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
X, Y, Z - X, Y, and Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, and Z are provided, the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxSetDeviceUsingSystemBuffer()
This API is used to set the info packet into Systembuffer and it will be transferred to the 3D-HOCA device. The info packet consists of the control and the data register file.
General Syntax: status=OxSetDeviceUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z, Infopacket)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
Infopacket - The device status, control, and the data register file of the target device.
X, Y, Z - X, Y, and Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, and Z are provided, the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxGetDeviceControlWordUsingSystemBuffer()
This API is used to get the control word info packet into Systembuffer. It will be transferred from the 3D-HOCA device to Systembuffer.
General Syntax: status=OxGetDeviceControlWordUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z, ControlWord)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
ControlWord - The control word of the target device.
X, Y, Z - X, Y, and Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, and Z are provided, the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxGetDeviceDataWordUsingSystemBuffer()
This API is used to get the data word info packet into Systembuffer. It will be transferred from the 3D-HOCA device to Systembuffer.
General Syntax: status=OxGetDeviceDataWordUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z, DataWord)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
DataWord - The data of the target device.
X, Y, Z - X, Y, and Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, and Z are provided, the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxGetDeviceStatusWordUsingSystemBuffer()
This API is used to get the status word info packet into Systembuffer. It will be transferred from the 3D-HOCA device to Systembuffer.
General Syntax: status=OxGetDeviceStatusWordUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z, StatusWord)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
StatusWord - The status word of the target device.
X, Y, Z - X, Y, and Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, and Z are provided, the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxSetDeviceControlWordUsingSystemBuffer()
This API is used to set the control word info packet into Systembuffer. It will be transferred from Systembuffer into the 3D-HOCA device.
General Syntax: status=OxSetDeviceControlWordUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z, ControlWord)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
ControlWord - The Control word package or “eop” slave program code file of the target device.
X, Y, Z - X, Y, and Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, Z are provided, the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxSetDeviceDataWordUsingSystemBuffer()
This API is used to set the data word info packet into Systembuffer. It will be transferred from Systembuffer into the 3D-HOCA device.
General Syntax: status=OxSetDeviceDataWordUsingSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z, DataWord)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
DataWord - The data word of the target device.
X, Y, Z - X, Y, and Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, Z are provided, the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxResetDevice()
This API is used to reset the target device. This is a subset of the “OxSetDeviceControlWordUsingSystemBuffer” API.
General Syntax: status=OxResetDevice(SystemBufferPtr, DeviceNumber, X, Y, Z, ControlWord)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber- Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
ControlWord - The control word of the target device.
X, Y, Z - X, Y, Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, Z are provided the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxStopDevice()
This API is used to stop the target device. This is a subset of the “OxSetDeviceControlWordUsingSystemBuffer” API.
General Syntax: status=OxStopDevice(SystemBufferPtr, DeviceNumber, X, Y, Z, ControlWord)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
ControlWord - The control word of the target device.
X, Y, Z - X, Y, Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, Z are provided the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxUnmapDeviceFromSystemBuffer()
This API is used to unmap the OxDevices from the Systembuffer.
General Syntax: status=OxUnmapDeviceFromSystemBuffer(SystemBufferPtr, DeviceNumber, X, Y, Z)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceNumber - Corresponds to direct cell address memory location in System buffer, it is of integer type. [ 0..N]
X, Y, Z - X, Y, and Z location mapping to System buffer to map a corresponding device to System buffer.
Note :
If X, Y, Z are provided the device number is marked [1]
If device number is provided then X, Y, Z is [-1]
OxUnmapAllDeviceFromSystemBuffer()
This API is used to unmap all the OxDevices from the Systembuffer.
General Syntax: status=OxUnmapAllDeviceFromSystemBuffer(SystemBufferPtr)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
OxUnmapDeviceRangeFromSystemBuffer()
This API is used to unmap a range of OxDevices from the Systembuffer.
General Syntax: status=OxUnmapDeviceRangeFromSystemBuffer(SystemBufferPtr, DeviceListRange, X, Y, Z, TotalDevices)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
DeviceListRange - Range of devices [StartRange to EndRange]
X[], Y[], Z[] - Integer arrays of X, Y, Z cell addresses.
TotalDevices - Count of devices.
OxReleaseSystemBuffer()
This API is used to release the Systembuffer.
General Syntax: status=OxReleaseSystemBuffer(SystemBufferPtr)
Where,
Status - API status
SystemBufferPtr - Pointer to System buffer.
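The order in which a master program might call these APIs can be sketched with a small in-memory stand-in. Everything below is hypothetical: the function names follow the OxAPI list, but the simplified Python signatures, the `SystemBuffer` class, the status codes `OX_OK` and `OX_ERROR`, and the one-word-per-cell buffer layout are assumptions made for illustration, not the actual library or driver behavior.

```python
# Hypothetical in-memory stand-in for the OxAPI library (not a real driver binding).
OX_OK, OX_ERROR = 0, -1  # assumed status codes


class SystemBuffer:
    """Assumed layout: one data word per device cell."""

    def __init__(self, size):
        self.cells = [0] * size
        self.mapped = set()  # device numbers currently mapped


def OxMarkReserveSystemBuffer(size):
    return OX_OK, SystemBuffer(size)


def OxInitializeSystemBuffer(buf, value):
    buf.cells = [value] * len(buf.cells)
    return OX_OK


def OxDiscoverDevices(device_count):
    # Pretend discovery: devices are numbered 0..device_count-1.
    return OX_OK, list(range(device_count))


def OxMapDiscoveredDevicesToSystemBuffer(devices, buf):
    if any(d >= len(buf.cells) for d in devices):
        return OX_ERROR
    buf.mapped.update(devices)
    return OX_OK


def OxSetDeviceDataWordUsingSystemBuffer(buf, device_number, data_word):
    if device_number not in buf.mapped:
        return OX_ERROR
    buf.cells[device_number] = data_word
    return OX_OK


def OxGetDeviceDataWordUsingSystemBuffer(buf, device_number):
    if device_number not in buf.mapped:
        return OX_ERROR, None
    return OX_OK, buf.cells[device_number]


def OxUnmapAllDeviceFromSystemBuffer(buf):
    buf.mapped.clear()
    return OX_OK


def OxReleaseSystemBuffer(buf):
    buf.cells = []
    return OX_OK


def run_master_program(data_word=200):
    """Reserve, initialize, discover, map, write, read back, unmap, release."""
    status, buf = OxMarkReserveSystemBuffer(size=4)
    assert status == OX_OK
    OxInitializeSystemBuffer(buf, 0)
    status, devices = OxDiscoverDevices(device_count=4)
    assert OxMapDiscoveredDevicesToSystemBuffer(devices, buf) == OX_OK
    assert OxSetDeviceDataWordUsingSystemBuffer(buf, 2, data_word) == OX_OK
    status, read_back = OxGetDeviceDataWordUsingSystemBuffer(buf, 2)
    OxUnmapAllDeviceFromSystemBuffer(buf)
    OxReleaseSystemBuffer(buf)
    return read_back
```

The sketch addresses devices by device number only; the X, Y, Z addressing mode described above is omitted to keep the example short.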
Example 2
FIG. 17 describes the proposed software architecture for the compilation of the slave program or co-processor program which runs on the 3D-HOCA accelerator. It is a two-step approach: the source program, written in a generic programming language like C or built from 3D-HOCA GUI components, is translated into an intermediate representation (IR) such as LLVM IR, Heterogeneous System Architecture (HSA) IR, GCC’s GIMPLE IR, or 3D-HOCA IR by using the “e-Opto” compiler. At runtime, these IRs are translated to 3D-HOCA native “e-op” Electronic-Optical hybrid instructions and dispatched to the 3D-HOCA for computation. Here the e-opto compiler translates high-level sources to IRs, and the e-opto assembler assembles the “eop instructions” into machine instructions to be deployed on the 3D-HOCA.
An Electronic-Optical “eop” Instruction Set Architecture [ISA] is proposed for computation on the 3D-HOCA unit. The “eop” instruction list is as follows.
Control Operation “eop” instructions:
1. eop-RESET < 3D-HOCA-ID >
2. eop-STOP < 3D-HOCA-ID >
Arithmetic Operation “eop” instructions:
1. eop-ADD < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
2. eop-SUB < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
3. eop-MUL < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
4. eop-DIV < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
Logical Operation “eop” instructions:
1. eop-AND < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
2. eop-OR < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
3. eop-NOT < 3D-HOCA-ID > <Data Unit #1>
4. eop-EXOR < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
Data Transfer Operation “eop” instructions:
1. eop-LOAD < 3D-HOCA-ID : Address > <Data Unit #1>
2. eop-STORE < 3D-HOCA-ID : Address > <Data Unit #1>
The details of the eop instructions are described below.
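The operand signatures in the eop list above can be captured in a small lookup table and used to check a line of eop assembly before it is dispatched. The following is a minimal Python sketch under stated assumptions: the table layout, the function name `check_eop_line` and the textual form (one `<...>` target followed by comma-separated data units) are illustrative and are not defined by the proposed ISA. Only the opcode names and the number of data units per opcode are taken from the list.

```python
import re

# Number of data-unit operands per opcode, copied from the eop list.
# Every opcode additionally takes one <3D-HOCA-ID> or <3D-HOCA-ID : Address> target.
EOP_DATA_UNITS = {
    "eop-RESET": 0, "eop-STOP": 0,
    "eop-ADD": 2, "eop-SUB": 2, "eop-MUL": 2, "eop-DIV": 2,
    "eop-AND": 2, "eop-OR": 2, "eop-NOT": 1, "eop-EXOR": 2,
    "eop-LOAD": 1, "eop-STORE": 1,
}

# Assumed textual form: opcode, one <...> target, then comma-separated data units.
_LINE = re.compile(r"^(eop-[A-Z]+)\s*<([^<>]*)>\s*(.*)$")


def check_eop_line(line):
    """Return (opcode, target, data_units) or raise ValueError if malformed."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not an eop instruction: {line!r}")
    opcode, target, rest = match.groups()
    if opcode not in EOP_DATA_UNITS:
        raise ValueError(f"unknown opcode: {opcode}")
    data_units = [tok.strip() for tok in rest.split(",") if tok.strip()]
    expected = EOP_DATA_UNITS[opcode]
    if len(data_units) != expected:
        raise ValueError(f"{opcode} expects {expected} data unit(s), got {len(data_units)}")
    return opcode, target.strip(), data_units
```

For instance, `check_eop_line("eop-ADD <0> A, B")` accepts the line, while an `eop-NOT` line with two data units is rejected because the list gives it a single operand.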
Control Operation “eop” instructions:
1. eop-RESET < 3D-HOCA-ID >
This eop code resets the target 3D-HOCA device with the given device ID. It reinitializes the LCU.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
2. eop-STOP < 3D-HOCA-ID >
This eop code changes the target 3D-HOCA with the given device ID to the stop state.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
Arithmetic Operation “eop” instructions:
eop-ADD < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
This eop code performs arithmetic addition on the target 3D-HOCA with the given device ID.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
< Data Unit #1 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
< Data Unit #2 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
eop-SUB < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
This eop code performs arithmetic subtraction on the target 3D-HOCA with the given device ID.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
< Data Unit #1 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
< Data Unit #2 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
eop-MUL < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
This eop code performs arithmetic multiplication on the target 3D-HOCA with the given device ID.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
< Data Unit #1 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
< Data Unit #2 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
eop-DIV < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
This eop code performs arithmetic division on the target 3D-HOCA with the given device ID.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
< Data Unit #1 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
< Data Unit #2 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
Logical Operation “eop” instructions:
eop-AND < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
This eop code performs logical AND on the target 3D-HOCA with the given device ID.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
< Data Unit #1 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
< Data Unit #2 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
eop-OR < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
This eop code performs logical OR on the target 3D-HOCA with the given device ID.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
< Data Unit #1 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
< Data Unit #2 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
eop-NOT < 3D-HOCA-ID > <Data Unit #1>
This eop code performs logical NOT on the target 3D-HOCA with the given device ID.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
< Data Unit #1 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
eop-EXOR < 3D-HOCA-ID > <Data Unit #1> <Data Unit #2>
This eop code performs logical EXOR on the target 3D-HOCA with the given device ID.
Where,
< 3D-HOCA-ID > - Is the target 3D-HOCA device ID.
< Data Unit #1 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
< Data Unit #2 > - The data unit can be a single operand or 1-Dimension, 2-Dimension or 3-Dimension operands.
Data Transfer Operation “eop” instructions:
eop-LOAD < 3D-HOCA-ID : Address > <Data Unit #1>
This eop code loads data from memory into the target 3D-HOCA identified by < 3D-HOCA-ID : Address >.
Where,
< 3D-HOCA-ID : Address > - Is the target 3D-HOCA device ID with the address.
< Data Unit #1 > - The data unit can be single data or 1-Dimension, 2-Dimension or 3-Dimension data.
eop-STORE < 3D-HOCA-ID : Address > <Data Unit #1>
This eop code stores data from the 3D-HOCA identified by < 3D-HOCA-ID : Address > into the target memory.
Where,
< 3D-HOCA-ID : Address > - Is the target 3D-HOCA device ID with the address.
< Data Unit #1 > - The data unit can be single data or 1-Dimension, 2-Dimension or 3-Dimension data.
As an example, the master program and the associated slave program for a simple addition are described below.
The software template of the master program using OxAPIs
Pseudo Code Program: The master program running on the CPU, which controls the 3D-HOCA and issues commands for data manipulation operations and data transfer, is described as follows,
#include <OxCompute.h>
main ()
{
// Identify system memory.
status=OxDiscoverSystemBuffer(HardwareFeaturePtr);
// Setup system memory.
status=OxMarkReserveSystemBuffer(device,size,HardwareFeaturePtr,Oxenable,SystemBufferPtr);
status=OxInitializeSystemBuffer(SystemBufferPtr,Value);
// Identify 3D-HOCA.
status=OxDiscoverDevices(*list);
// Setup 3D-HOCA.
status=OxMapDiscoveredDevicesToSystemBuffer(*list,SystemBufferPtr);
status=OxMapOnlySpecificDevicesToSystemBuffer(*list,SystemBufferPtr,DeviceNumber,X,Y,Z);
// Issue the following
// Store the Data#1 package : A ( can be unit data, 1-Dimension, 2-Dimension and 3-Dimension ) in to 3D-HOCA
status=OxSetDeviceDataWordUsingSystemBuffer(SystemBufferPtr,DeviceNumber,X,Y,Z,A);
// Store the Data#2 package : B ( can be unit data, 1-Dimension, 2-Dimension and 3-Dimension ) in to 3D-HOCA
status=OxSetDeviceDataWordUsingSystemBuffer(SystemBufferPtr,DeviceNumber,X,Y,Z,B);
// program list contains Result = A + B
status=OxSetDeviceControlWordUsingSystemBuffer(SystemBufferPtr,DeviceNumber,X,Y,Z,program.lst);
// Getting the computed operation status.
status=OxGetDeviceStatusWordUsingSystemBuffer(SystemBufferPtr,DeviceNumber,X,Y,Z,StatusWord);
// Getting the computed result in DataWord back to CPU RAM.
status=OxGetDeviceDataWordUsingSystemBuffer(SystemBufferPtr,DeviceNumber,X,Y,Z,Result);
// Free 3D-HOCA
status=OxUnmapDeviceFromSystemBuffer(SystemBufferPtr,DeviceNumber,X,Y,Z);
// Free ALL 3D-HOCA
status=OxUnmapDeviceRangeFromSystemBuffer(SystemBufferPtr,DeviceListRange,X,Y,Z,TotalDevices);
// Free Memory
status=OxReleaseSystemBuffer(SystemBufferPtr);
}
Pseudo Code Program for Slave or Co-Accelerator: The program running on the 3D-HOCA for data computation operations is described as follows,
//Pseudo Code of Program : program.lst
add ()
{
result = A + B; //A, B and Result can be a single data item, 1-Dimension, 2-Dimension or 3-Dimension. “+” is the “eop-ADD”
}
eop Assembly code of the ADDER Program : program.lst
eop-RESET < 3D-HOCA-ID >
eop-LOAD < 3D-HOCA-ID : Address > A
eop-LOAD < 3D-HOCA-ID : Address > B
eop-ADD < 3D-HOCA-ID : Address > A, B
eop-STORE < 3D-HOCA-ID : Address > RESULT
eop-STOP < 3D-HOCA-ID >
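The behaviour of the ADDER listing can be illustrated with a small software simulation. The Python sketch below is only a model under stated assumptions: the parsing rules, the names `parse`, `add_units` and `run`, the single result slot written by eop-ADD, and the use of nested lists for 1-, 2- and 3-dimensional data units are all hypothetical and are not specified by the description. The target field is kept as the placeholder text from the listing and is not interpreted.

```python
import re

# The ADDER listing, with the target placeholders kept verbatim.
ADDER_LISTING = """
eop-RESET <3D-HOCA-ID>
eop-LOAD <3D-HOCA-ID : Address> A
eop-LOAD <3D-HOCA-ID : Address> B
eop-ADD <3D-HOCA-ID : Address> A, B
eop-STORE <3D-HOCA-ID : Address> RESULT
eop-STOP <3D-HOCA-ID>
"""

_LINE = re.compile(r"^(eop-[A-Z]+)\s*<([^<>]*)>\s*(.*)$")


def parse(listing):
    """Turn the listing into (opcode, operand-names) pairs; the target is ignored."""
    program = []
    for raw in listing.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        match = _LINE.match(raw)
        if match is None:
            raise ValueError(f"cannot parse: {raw!r}")
        opcode, _target, rest = match.groups()
        program.append((opcode, [t.strip() for t in rest.split(",") if t.strip()]))
    return program


def add_units(a, b):
    """Element-wise sum of two data units: scalars or equally shaped nested lists."""
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        return [add_units(x, y) for x, y in zip(a, b)]
    if isinstance(a, list) or isinstance(b, list):
        raise ValueError("data units must have the same shape")
    return a + b


def run(program, host_memory):
    """Execute the program against host_memory (a dict) and return the final state."""
    device = {}     # data units currently loaded on the simulated device
    result = None   # assumed single result slot written by eop-ADD
    state = "IDLE"
    for opcode, operands in program:
        if opcode == "eop-RESET":
            device.clear()
            result = None
            state = "RUNNING"
        elif opcode == "eop-LOAD":
            device[operands[0]] = host_memory[operands[0]]
        elif opcode == "eop-ADD":
            result = add_units(device[operands[0]], device[operands[1]])
        elif opcode == "eop-STORE":
            host_memory[operands[0]] = result
        elif opcode == "eop-STOP":
            state = "STOPPED"
            break
        else:
            raise ValueError(f"opcode not modelled in this sketch: {opcode}")
    return state
```

Calling `run(parse(ADDER_LISTING), memory)` with `memory` holding equally shaped `A` and `B` leaves their element-wise sum under `memory["RESULT"]`, mirroring the load, add, store sequence of the listing.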
FIG. 18 shows the deployment of the master program running on the CPU as the master device and the co-processor program running on the 3D-HOCA accelerator.
FIG. 19 shows the CPU architecture with ALU blocks. CPU workloads generally lean towards control-flow-intensive, i.e., task-parallel, programs. The CPU also supports advanced vector engines for vector operations. However, when the data set is huge, a CPU with only a few ALU blocks becomes a bottleneck and lags during computation.
FIG. 20 shows the GPU architecture with SIMD (Single Instruction Multiple Data) blocks, in which a single operation is performed on multiple data items. Because of this architecture, GPU workloads generally lean towards data-parallel-intensive programs: in a single clock cycle, a GPU can produce more data-parallel compute results than a CPU for the same input dataset. General-purpose GPU compute can increase performance by 2X to 300X, mainly because of this architectural difference.
FIG. 21 shows the 3D-HOCA architecture with a stack of LCU blocks. Here multiple operations are performed on multiple data items in parallel, so that in a single clock cycle the 3D-HOCA can produce more data-parallel compute results than CPU or GPU compute for the same input dataset. The main reason is the architectural difference: a massive MIMD (Multiple Instruction Multiple Data) block design in which each LCU unit in the 3D-HOCA engine acts as a Compute Element (COXEL) cell and can perform computation independently. In the proposed MIMD-based model, each LCU cell performs 1 unit of independent operation, so a 3D-HOCA engine of N x N x N LCU compute units can perform N x N x N operations in a single clock cycle, even though a silicon chip is driving the clock. This architecture is therefore suited to both task-parallel and data-parallel intensive workloads.
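The N x N x N claim can be made concrete with a small software model in which every cell of a cubic grid carries its own instruction and its own operands. This Python sketch is illustrative only: the function name `mimd_cycle`, the dictionary layout keyed by (x, y, z) and the three sample operations are assumptions, and the sequential loop merely stands in for cells that the description says operate in parallel.

```python
import operator
from itertools import product

# A few sample per-cell operations; the real engine would use eop instructions.
OPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}


def mimd_cycle(n, instr, a, b):
    """Model one clock cycle of an n x n x n grid.

    instr, a and b are dicts keyed by (x, y, z): each cell applies its own
    instruction to its own pair of operands, independently of every other cell.
    """
    return {
        cell: OPS[instr[cell]](a[cell], b[cell])
        for cell in product(range(n), repeat=3)
    }


# Example: a 2 x 2 x 2 grid in which neighbouring cells run different operations.
N = 2
names = ["ADD", "SUB", "MUL"]
cells = list(product(range(N), repeat=3))
instr = {c: names[sum(c) % 3] for c in cells}
a = {c: 10 for c in cells}
b = {c: sum(c) + 1 for c in cells}
results = mimd_cycle(N, instr, a, b)
```

In this model one call produces N x N x N results, one per cell, which is the contrast drawn with SIMD, where all lanes would share a single instruction.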
Example: 3
FIG. 22 shows the 3D-HOCA architecture with high-speed RAM hosting the instruction and data panel pipelines. [Qe]1 - [Qe]n is the array of electronic pipeline inputs, with an N x N grid size for each panel. Each cell in the panel can host, in parallel, data and/or an instruction intended for the target computation within the 3D-HOCA; such a cell is referred to as a “voxel” (Compute Element). Along the same lines, [Xo]1 - [Xo]n is the array of optical pipeline inputs, with an N x N grid size for each panel. Each cell in the panel can host, in parallel, data and/or an instruction intended for the target computation within the 3D-HOCA voxel. Individual dedicated high-speed bus lanes #1, #2 and #3 are proposed that link directly to the 3D-HOCA driver blocks so that the memory bottleneck is mitigated. The high-speed RAM interfaces with the 3D-HOCA through a high-speed LED driver unit, a high-speed LCD driver unit and a high-speed CCD driver unit. The same high-speed bus interface also connects the glue chip and the master unit. Since all these components are opto-electronic units, they are connected and synchronized by a common clock; to mitigate the differences in switching time between them, appropriate buffers are added along with wait cycles to maintain synchronization with the high-speed bus architecture. This kind of pipelined architecture in the MIMD model would accelerate the speed of computation.
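One way to picture the wait cycles is to pad every driver up to the slowest one, measured in whole periods of the common clock. The Python sketch below assumes exactly that policy; the function name `wait_cycles` and, in particular, the switching times and clock period in the example are hypothetical placeholder values chosen for illustration, not figures from the description.

```python
def wait_cycles(switching_ns, clock_period_ns):
    """Return the wait cycles each unit needs so that all finish together.

    switching_ns maps a unit name to its switching time in whole nanoseconds.
    Each unit needs ceil(time / period) clock cycles; faster units are padded
    with wait cycles up to the slowest unit.
    """
    if clock_period_ns <= 0:
        raise ValueError("clock period must be positive")
    # Integer ceiling division, avoiding floating-point rounding.
    needed = {name: -(-t // clock_period_ns) for name, t in switching_ns.items()}
    slowest = max(needed.values())
    return {name: slowest - cycles for name, cycles in needed.items()}


# Hypothetical numbers, for illustration only.
example = wait_cycles(
    {"LED driver": 10, "LCD driver": 95, "CCD driver": 40},
    clock_period_ns=20,
)
```

With these placeholder values the slowest unit receives no padding and the faster units are held for the difference, which is the role the buffers and wait cycles play in the paragraph above.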
FIG. 23 shows the parallel extension of a reconfigurable multi-dimensional hybrid optical computing accelerator, or 3D-HOCA architecture model, for high-speed parallel computing. The reconfigurable multi-dimensional LCU (3D LCU) optical computing unit #1 does not have a CCD sensor array to capture the computed optical output. Instead, its output is fed into an X:X or X:Y based beam splitter, or a 50:50 beam splitter, which splits it into two parallel inputs for 3D LCU optical computing unit #2 and 3D LCU optical computing unit #3. Units #2 and #3 do not have the 2D LED array unit; they directly use the two parallel inputs from the 50:50 beam splitter to compute their respective outputs. Both units #2 and #3 also have their own unique set of data/instruction panel inputs for further computation in parallel. This model can be extended by using an array of beam splitters to map the outputs to an array of 3D LCU optical computing units #N for computation in parallel. To address signal attenuation, periodic optical repeaters and optical amplifier models are added. This design acts as a data/instruction LCU pipeline. Stripped of the CCD sensor array and the 2D LED array, the LCU panels now act as true optical register files and optical memory files.
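The need for repeaters and amplifiers follows from simple power bookkeeping: an ideal, lossless 50:50 splitter halves the optical power on each output, so a tree of splitters divides the power by two at every stage. The Python sketch below assumes ideal splitters with no insertion loss; the function names `split`, `fan_out` and `loss_db` are illustrative.

```python
import math


def split(power, ratio=0.5):
    """Ideal lossless beam splitter: returns the two output powers."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie between 0 and 1")
    return power * ratio, power * (1.0 - ratio)


def fan_out(power, depth, ratio=0.5):
    """Feed one beam through `depth` stages of splitters arranged as a binary tree."""
    beams = [power]
    for _ in range(depth):
        beams = [out for beam in beams for out in split(beam, ratio)]
    return beams


def loss_db(power_in, power_out):
    """Attenuation in decibels between an input and an output power."""
    return 10.0 * math.log10(power_in / power_out)


# Three stages of 50:50 splitters feed eight downstream units.
leaves = fan_out(1.0, depth=3)
```

Each of the eight outputs carries one eighth of the input power, about 9 dB down, which is the gain an amplifier stage would have to restore in this idealised model.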
FIG. 24 represents the symbol of the hybrid optical NOT gate. FIG. 25 describes the implementation of a hybrid optical NOT gate using LCU. The optical output Zo is given by the HCBEA algebra,
A, B, X are HCBEA variables
1e = Electronic control filter “OFF” signal, allowing the light to pass through the LCU
0e = Electronic control filter “ON” signal, blocking the light from passing through the LCU
The LCU implementation of the hybrid optical NOT gate is given by the equation,
Zo = (Xo Xe) = (X̄)
Where, Xe = { ON / OFF }
If Xe is “OFF”, then Zo = Xo (light passes through the LCU; 1o)
If Xe is “ON”, then Zo = Ko (light is blocked; 0o)
Xo = { Ro, Go, Bo, Co, Mo, Yo, Ko }
EXAMPLE
Truth Table: Hybrid Optical NOT gate
Figure imgf000046_0001
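The two cases stated for the NOT gate (control filter OFF passes the light, control filter ON blocks it) can be written as a one-line function. This Python sketch is a behavioural model only: the function name `lcu_not`, the single-letter colour codes and the boolean encoding of the control signal are assumptions made for illustration.

```python
# Optical colour values Xo, written as single letters; "K" is black (no light).
COLOURS = {"R", "G", "B", "C", "M", "Y", "K"}
BLACK = "K"


def lcu_not(x_o, control_on):
    """Hybrid optical NOT gate as described by its two cases.

    control_on=False models the electronic control filter being OFF:
    the light passes and the output equals the optical input Xo.
    control_on=True models the filter being ON: the light is blocked
    and the output is black (Ko).
    """
    if x_o not in COLOURS:
        raise ValueError(f"unknown optical value: {x_o!r}")
    return BLACK if control_on else x_o
```

Reading the control signal as the logical input, ON yields a dark output and OFF yields a lit output, which is the inversion the gate is named for.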
EXAMPLE: 4
FIG. 26 represents the symbol of the hybrid optical AND gate. FIG. 27 describes the implementation of a hybrid optical AND gate using LCU. The optical output Zo is given by the HCBEA algebra,
A, B, X are HCBEA variables
1e = Electronic control filter “OFF” signal, allowing the light to pass through the LCU
0e = Electronic control filter “ON” signal, blocking the light from passing through the LCU
The LCU implementation of the hybrid optical AND gate is given by the equation,
Zo = (Xo Xe1) (Xe2)
Where,
Xo = { Ro, Go, Bo }
Xe = { Re, Ge, Be }
Ke = { Re, Ge, Be } for all Xo ≠ Xe
Truth Table: Hybrid Optical AND gate
Figure imgf000047_0001
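One reading of the AND equation is that the light passes through two LCU stages in series, and each stage transmits the beam only when its electronic colour filter matches the colour of the light. The Python sketch below models that reading; it is an interpretation rather than the implementation in FIG. 27, and the names `lcu_filter` and `lcu_and` and the single-letter colour codes are assumptions.

```python
BLACK = "K"
PRIMARIES = {"R", "G", "B"}


def lcu_filter(x_o, x_e):
    """One LCU stage: pass light of colour x_o only if the filter x_e matches.

    A mismatched filter gives black, following the rule that the result is Ke
    for all Xo != Xe. A beam that is already black stays black.
    """
    if x_e not in PRIMARIES:
        raise ValueError(f"unknown filter colour: {x_e!r}")
    if x_o == BLACK:
        return BLACK
    if x_o not in PRIMARIES:
        raise ValueError(f"unknown light colour: {x_o!r}")
    return x_o if x_o == x_e else BLACK


def lcu_and(x_o, x_e1, x_e2):
    """Two stages in series: Zo = (Xo Xe1)(Xe2)."""
    return lcu_filter(lcu_filter(x_o, x_e1), x_e2)
```

The output is lit only when both filters match the input colour, which gives the AND behaviour: a single mismatched stage is enough to darken the beam.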
EXAMPLE: 5
FIG. 28 represents the symbol of a hybrid optical OR gate. FIG. 29 describes the implementation of a hybrid optical OR gate using LCU. The optical output Zo is given by the HCBEA algebra,
A, B, X are HCBEA variables
1e = Electronic control filter “OFF” signal, allowing the light to pass through the LCU
0e = Electronic control filter “ON” signal, blocking the light from passing through the LCU
The LCU implementation of the hybrid optical OR gate is given by the equation,
Zo = Ao + Bo
Where,
Ae = Pass through LCU
Be = Pass through LCU
Ce = Pass through LCU
Ao = Bo = Xo = { Ro, Go, Bo, Wo, Co, Mo, Yo, Ko }
Ao = Bo = Xo = Should be of the same filter type
Truth Table: Hybrid Optical OR gate
Figure imgf000047_0002
Figure imgf000048_0001
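The OR equation Zo = Ao + Bo can be modelled as two beams of the same colour being combined after passing through their LCUs unchanged. The Python sketch below is a behavioural interpretation under that assumption: the function name `lcu_or`, the single-letter colour codes and the use of "K" for a dark beam are illustrative, and the same-colour check reflects the requirement that both inputs be of the same filter type.

```python
BLACK = "K"


def lcu_or(a_o, b_o, colour):
    """Hybrid optical OR of two beams that share one colour.

    Each input is either `colour` (lit) or "K" (dark). The combined output
    is lit when at least one input is lit.
    """
    for beam in (a_o, b_o):
        if beam not in (colour, BLACK):
            raise ValueError("both inputs must be of the same filter type")
    return colour if colour in (a_o, b_o) else BLACK
```

Enumerating the four input combinations reproduces the usual OR pattern: only the case with both beams dark gives a dark output.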
The foregoing description, from Example 2 (line numbers 840 to 1150), sets out the best method of performing the invention:
- The software framework, which uses a master and slave architecture, and the OxAPIs used to program the 3D-HOCA accelerator are described in detail.
- The application program that runs on the 3D-HOCA accelerator uses the e-opto compiler, assembler or 4th generation GUI objects for e-opto code generation.
- Examples: 1. Hybrid optical NOT gate
2. Hybrid optical AND gate
3. Hybrid optical OR gate
- The comparative architecture of existing multicore CPUs, GPU SIMD blocks and the proposed 3D-HOCA MIMD pipelined architecture is summarized.
While the present subject matter has been described in detail with respect to specific embodiments thereof, it will be appreciated that those skilled in the art, upon attaining an understanding of the foregoing, may readily produce alterations to, variations of, and equivalents to such embodiments. Accordingly, the scope of the present disclosure is by way of example rather than by way of limitation, and the subject disclosure does not preclude inclusion of such modifications, variations and/or additions to the present subject matter as would be readily apparent to one of ordinary skill in the art.

Claims

1. A reconfigurable multi-dimensional hybrid optical computing accelerating apparatus consisting of
-a set of liquid crystal units with vertical polarizer plane slits and horizontal polarizer plane slits are aligned to light pass through and
- a set of silicon-based hybrid optical processor glue chip units activated by a common clock signal to drive LED driver, LCD driver, COLOR sensor driver, and buffer units to provide appropriate synchronization.
-a set of silicon-based hybrid optical control units for controlling the hybrid optical computing accelerating apparatus
-a set of RGB LED sources /White light sources/infrared source
-a set of CCD sensors/Infra-red sensors
-a set of color sensor array or Infra-red array or both used to convert the computed light output result into the electronic signal
- wherein the said liquid crystal unit is incorporated with a vertical polarizer or horizontal polarizer plane facing each other between the Liquid crystal units aligned to light pass through or/and
- wherein each liquid crystal unit is incorporated with a set of the color filter unit and vertical polarizer or horizontal polarizer plane facing each other between the Liquid crystal units aligned to light pass-through,
-wherein the liquid crystal unit is incorporated with a set of random access memory (RAM) to store the data for optical input unit (Io), electronic color transformation function control line (Qe) to provide an optical output signal (Zo) converted and stored,
-wherein, a function of the liquid crystal unit is characterized with combinational computation execution of the optical Input (Io), and the electronic color transformation function control line (Qe) to provide an optical output (Zo)
-wherein the function of the liquid crystal unit is capable of being reconfigurable by changing the stepping value of the electronic color transformation function control line (Qe) thereby functional behavior of the LCU can be changed or reconfigured without redesigning the existing circuitry.
2. The reconfigurable multi-dimensional hybrid optical computing accelerating apparatus as claimed in claim 1, wherein the said silicon-based hybrid optical processor glue chip is integrated to perform conversion of optical signals to electronic signals and vice versa
3. The reconfigurable multi-dimensional hybrid optical computing accelerating apparatus as claimed in claim 1, wherein the said reconfigurable multi-dimensional hybrid optical computing accelerating apparatus is further characterized to interface Optical extensions API platform for driving the said accelerator [OxAPIs ]
4. The reconfigurable multi-dimensional hybrid optical computing accelerator as claimed in claim 1, wherein the said reconfigurable multi-dimensional hybrid optical computing accelerator is integrated with an X: X or X: Y based beam splitter to split the single input optical signal into multiple optical signals.
PCT/IB2022/054268 2021-07-29 2022-05-09 Hybrid 3-dimensional optical computing accelerator engine apparatus and method WO2023007258A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202141034072 2021-07-29
IN202141034072 2021-07-29

Publications (1)

Publication Number Publication Date
WO2023007258A1 true WO2023007258A1 (en) 2023-02-02

Family

ID=85087531

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2022/054268 WO2023007258A1 (en) 2021-07-29 2022-05-09 Hybrid 3-dimensional optical computing accelerator engine apparatus and method

Country Status (1)

Country Link
WO (1) WO2023007258A1 (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9798183B2 (en) * 2012-12-27 2017-10-24 Toppan Printing Co., Ltd. Liquid crystal display device, color filter substrate, and method for producing color filter substrate

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SAWCHUK ALEXANDER A., TIMOTHY C. STRAND: "Digital Optical Computing", PROCEEDINGS OF THE IEEE, vol. 72, no. 7, 31 July 1984 (1984-07-31), pages 758 - 779, XP093030713, DOI: 10.1109/PROC.1984.12937 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22848760

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE